Support PPL queries when having trailing pipes and/or empty pipes#5161
Support PPL queries when having trailing pipes and/or empty pipes#5161anasalkouz wants to merge 1 commit intoopensearch-project:mainfrom
Conversation
…ensearch-project#4032) Signed-off-by: Anas Alkouz <aalkouz@amazon.com>
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis change adds support for trailing pipes in OpenSearch PPL queries by relaxing grammar requirements to make pipe-command segments optional, with comprehensive test coverage across unit and integration tests validating parser behavior with trailing and empty pipes. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java (1)
950-989: Missing negative test case for the new grammar behaviour.All new tests verify that parse trees are produced (positive cases). Per coding guidelines, grammar tests should include negative cases. Consider adding a test verifying that an invalid command token following a pipe still raises a
SyntaxCheckException, so regressions in error detection are caught:`@Test`(expected = SyntaxCheckException.class) public void testPipeWithInvalidCommandShouldFail() { new PPLSyntaxParser().parse("source=t | | 123invalidcommand"); }Based on learnings: "Applies to
**/{grammar,parser}/**/*Test.java: Include edge cases and boundary conditions in parser and grammar tests" and "Applies to**/{grammar,parser}/**/*.java: Test new grammar rules with positive and negative cases for parser development".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java` around lines 950 - 989, Add a negative parser test to ensure invalid tokens after a pipe throw SyntaxCheckException: create a new test method (e.g., testPipeWithInvalidCommandShouldFail) that calls new PPLSyntaxParser().parse("source=t | | 123invalidcommand") and expects SyntaxCheckException; place it alongside the existing positive tests (those invoking PPLSyntaxParser.parse in PPLSyntaxParserTest) so grammar regressions are caught.ppl/src/main/antlr/OpenSearchPPLParser.g4 (1)
568-574:appendcolCommandandappendCommandnot updated — inconsistent trailing-pipe behaviour.After this change,
appendpipe [ stats count() | ]succeeds (viasubPipeline), but the equivalentappendcol [ stats count() | ]still fails becauseappendcolCommandandappendCommandstill use the old(PIPE commands)*pattern (command required after pipe).🔧 Suggested fix to align `appendcolCommand` with the PR's intent
appendcolCommand - : APPENDCOL (OVERRIDE EQUAL override = booleanLiteral)? LT_SQR_PRTHS commands (PIPE commands)* RT_SQR_PRTHS + : APPENDCOL (OVERRIDE EQUAL override = booleanLiteral)? LT_SQR_PRTHS commands (PIPE commands?)* RT_SQR_PRTHS ; appendCommand - : APPEND LT_SQR_PRTHS searchCommand? (PIPE commands)* RT_SQR_PRTHS + : APPEND LT_SQR_PRTHS searchCommand? (PIPE commands?)* RT_SQR_PRTHS ;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/main/antlr/OpenSearchPPLParser.g4` around lines 568 - 574, The grammar rules appendcolCommand and appendCommand still require a commands after every PIPE (using (PIPE commands)*), causing appendcol [...] with a trailing pipe to fail; update both rules to allow a PIPE with optional commands by changing the repetition from (PIPE commands)* to (PIPE commands?)* (i.e., make commands optional after PIPE) so trailing pipes are accepted; reference rules: appendcolCommand, appendCommand, token PIPE and rule commands.ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java (1)
1657-1724: Add at least one negative test for completeness.The new tests cover positive cases well. Per guidelines, parser/grammar tests should include both positive and negative cases. Consider adding a test verifying that a genuinely malformed pipe sequence (e.g., pipe with an invalid command token still present) produces a
SyntaxCheckException.Additionally, there is no coverage for the
subPipelinecontext (i.e., trailing pipe insideappendpipe [...]). SinceappendPipeCommandusessubPipeline, which was also updated, a test like:`@Test` public void testTrailingPipeInAppendPipe() { assertEqual( "source=t | appendpipe [ stats count() | ]", /* expected AST for appendpipe */ ... ); }would close the coverage gap.
Based on learnings: "Applies to
**/{grammar,parser}/**/*Test.java: Include edge cases and boundary conditions in parser and grammar tests" and "Applies to**/{grammar,parser}/**/*.java: Test new grammar rules with positive and negative cases for parser development".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java` around lines 1657 - 1724, Add negative and subPipeline coverage in AstBuilderTest: add a test method (e.g., testMalformedPipeProducesSyntaxError) that parses a deliberately malformed pipeline like "source=t | invalidCmd |" and asserts a SyntaxCheckException is thrown (use the test harness/assertThrows pattern used elsewhere and reference SyntaxCheckException). Also add a positive test (e.g., testTrailingPipeInAppendPipe) for the appendpipe/subPipeline case using the existing AST helper builders (relation(...), agg(...), projectWithArg(...), etc.) to assert that "source=t | appendpipe [ stats count() | ]" produces the expected AST (reference the grammar rule appendPipeCommand and subPipeline to locate the parser behavior to exercise). Ensure both tests are added to AstBuilderTest with clear names so future diffs cover negative and subPipeline boundary cases.integ-test/src/test/java/org/opensearch/sql/ppl/TrailingPipeIT.java (1)
23-31:toString()JSON comparison is fragile — prefer a dedicated equality helper.
JSONObject.toString()ordering is not guaranteed to be stable across library versions or platforms. If the response JSON keys reorder between calls, the assertion would produce a false failure even when the data is identical.Other integration tests in this codebase use structured comparison helpers. Consider comparing the
totalcount and a sample of hits, or use a deep-equality helper available in the base class, rather than relying on string serialization order.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integ-test/src/test/java/org/opensearch/sql/ppl/TrailingPipeIT.java` around lines 23 - 31, The testTrailingPipeAfterSource relies on JSONObject.toString() ordering; instead assert structural equality: call executeQuery(...) into resultWithout/resultWith and compare resultWithout.getInt("total") with resultWith.getInt("total") and compare the hits arrays element-wise (e.g., getJSONArray("hits") and compare lengths and each JSONObject entry using JSONObject.equals or a deep-equality helper from the test base). Replace the assertEquals(resultWithout.toString(), resultWith.toString()) with these targeted assertions so ordering of serialized keys won't cause false failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java`:
- Around line 950-956: Update the misleading test comment in
testQueryWithMultipleTrailingPipesShouldPass: the test string passed to
PPLSyntaxParser.parse ("search source=t a=1 b=2 | fields a,b | |") contains two
consecutive empty pipe tokens (an empty pipeline segment plus a trailing empty
pipe), so change the comment to accurately state that the test verifies handling
of two consecutive trailing pipes (an empty middle pipe and a trailing empty
pipe) rather than "one optional trailing pipe"; keep the assertion and parse
call unchanged.
---
Nitpick comments:
In `@integ-test/src/test/java/org/opensearch/sql/ppl/TrailingPipeIT.java`:
- Around line 23-31: The testTrailingPipeAfterSource relies on
JSONObject.toString() ordering; instead assert structural equality: call
executeQuery(...) into resultWithout/resultWith and compare
resultWithout.getInt("total") with resultWith.getInt("total") and compare the
hits arrays element-wise (e.g., getJSONArray("hits") and compare lengths and
each JSONObject entry using JSONObject.equals or a deep-equality helper from the
test base). Replace the assertEquals(resultWithout.toString(),
resultWith.toString()) with these targeted assertions so ordering of serialized
keys won't cause false failures.
In `@ppl/src/main/antlr/OpenSearchPPLParser.g4`:
- Around line 568-574: The grammar rules appendcolCommand and appendCommand
still require a commands after every PIPE (using (PIPE commands)*), causing
appendcol [...] with a trailing pipe to fail; update both rules to allow a PIPE
with optional commands by changing the repetition from (PIPE commands)* to (PIPE
commands?)* (i.e., make commands optional after PIPE) so trailing pipes are
accepted; reference rules: appendcolCommand, appendCommand, token PIPE and rule
commands.
In `@ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java`:
- Around line 950-989: Add a negative parser test to ensure invalid tokens after
a pipe throw SyntaxCheckException: create a new test method (e.g.,
testPipeWithInvalidCommandShouldFail) that calls new
PPLSyntaxParser().parse("source=t | | 123invalidcommand") and expects
SyntaxCheckException; place it alongside the existing positive tests (those
invoking PPLSyntaxParser.parse in PPLSyntaxParserTest) so grammar regressions
are caught.
In `@ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.java`:
- Around line 1657-1724: Add negative and subPipeline coverage in
AstBuilderTest: add a test method (e.g., testMalformedPipeProducesSyntaxError)
that parses a deliberately malformed pipeline like "source=t | invalidCmd |" and
asserts a SyntaxCheckException is thrown (use the test harness/assertThrows
pattern used elsewhere and reference SyntaxCheckException). Also add a positive
test (e.g., testTrailingPipeInAppendPipe) for the appendpipe/subPipeline case
using the existing AST helper builders (relation(...), agg(...),
projectWithArg(...), etc.) to assert that "source=t | appendpipe [ stats count()
| ]" produces the expected AST (reference the grammar rule appendPipeCommand and
subPipeline to locate the parser behavior to exercise). Ensure both tests are
added to AstBuilderTest with clear names so future diffs cover negative and
subPipeline boundary cases.
| @Test | ||
| public void testQueryWithMultipleTrailingPipesShouldPass() { | ||
| // Multiple consecutive trailing pipes should be handled gracefully | ||
| // by allowing one optional trailing pipe | ||
| ParseTree tree = new PPLSyntaxParser().parse("search source=t a=1 b=2 | fields a,b | |"); | ||
| assertNotEquals(null, tree); | ||
| } |
There was a problem hiding this comment.
Misleading comment — the test exercises two pipes, not one.
The comment at Line 952-953 says "allowing one optional trailing pipe" but the query | fields a,b | | has an empty middle pipe AND a trailing empty pipe (two pipe tokens without intervening commands).
✏️ Suggested comment fix
- // Multiple consecutive trailing pipes should be handled gracefully
- // by allowing one optional trailing pipe
+ // Multiple consecutive pipes (empty middle pipe + trailing pipe) should parse successfully🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.java`
around lines 950 - 956, Update the misleading test comment in
testQueryWithMultipleTrailingPipesShouldPass: the test string passed to
PPLSyntaxParser.parse ("search source=t a=1 b=2 | fields a,b | |") contains two
consecutive empty pipe tokens (an empty pipeline segment plus a trailing empty
pipe), so change the comment to accurately state that the test verifies handling
of two consecutive trailing pipes (an empty middle pipe and a trailing empty
pipe) rather than "one optional trailing pipe"; keep the assertion and parse
call unchanged.
Signed-off-by: Anas Alkouz aalkouz@amazon.com
Description
This PR enhances the PPL (Piped Processing Language) grammar to support trailing pipes at the end of queries and empty pipes in the middle of command sequences. This improvement makes the PPL syntax more forgiving and user-friendly.
Motivation
Users often accidentally add trailing pipes when writing queries, especially when:
Previously, these queries would fail with syntax errors. This change allows such queries to execute successfully, improving the overall user experience.
Changes
Grammar Modifications
File:
ppl/src/main/antlr/OpenSearchPPLParser.g4Modified two key grammar rules to support optional trailing and empty pipes:
The pattern
(PIPE commands?)*elegantly handles:Test Coverage
1. Syntax Parser Tests
File:
ppl/src/test/java/org/opensearch/sql/ppl/antlr/PPLSyntaxParserTest.javaAdded tests for:
2. AST Builder Tests
File:
ppl/src/test/java/org/opensearch/sql/ppl/parser/AstBuilderTest.javaAdded tests to verify correct AST generation for:
3. Integration Tests
File:
integ-test/src/test/java/org/opensearch/sql/ppl/TrailingPipeIT.javaCreated integration tests that validate:
Examples
Before (Would Fail)
After (Now Works)
Empty Pipes (Also Supported)
Testing
./gradlew spotlessApplyRelated Issues
Resolves #4032
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.