-
Notifications
You must be signed in to change notification settings - Fork 180
Add Frequently Used Big5 PPL Queries #4976
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Frequently Used Big5 PPL Queries #4976
Conversation
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Louis Chu <lingzhichu.clz@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds three Big5 integration tests and corresponding PPL query resources, updates Big5 test data and expected Calcite YAML for one test, and adjusts expected rows in two existing aggregation/condition integration tests. One new test asserts an explained YAML plan. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
🧹 Nitpick comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl (1)
5-5: Index name inconsistency across test resources.This query uses
custom-big5as the default index, while other queries in this PR usebig5(e.g., script_engine_like_pattern_with_sort.ppl, rex_regex_transformation.ppl). Consider standardizing the index name across all test resources unless there's a specific reason for the difference.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java(2 hunks)integ-test/src/test/resources/big5/data/big5.json(1 hunks)integ-test/src/test/resources/big5/queries/dedup_metrics_size_field.ppl(1 hunks)integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl(1 hunks)integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl(1 hunks)integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml(1 hunks)integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yaml(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test new grammar rules with positive and negative cases for PPL parser changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.javainteg-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/parse_regex_with_cast_transformation.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🪛 Biome (2.1.2)
integ-test/src/test/resources/big5/data/big5.json
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, integration)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (6)
integ-test/src/test/resources/big5/data/big5.json (1)
1-6: LGTM! Static analysis warnings are false positives.The file uses NDJSON (newline-delimited JSON) format, which is the standard format for OpenSearch bulk indexing operations. The Biome linter doesn't recognize this format and reports false positives. The new log entries (sshd and kernel) provide appropriate test data for the like pattern queries.
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)
35-38: LGTM! Query correctly demonstrates LIKE pattern with sorting.The PPL query properly filters messages containing "sshd" and sorts results by timestamp, which aligns with the new test data added to big5.json.
integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl (1)
25-29: LGTM! Rex query correctly demonstrates regex extraction.The query properly extracts
logTypeandfilenamefromlog.file.pathusing a regex pattern and computes the filename length. The transformation logic is sound and aligns with the test objectives.integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl (1)
45-47: LGTM! Query correctly demonstrates LIKE pattern with aggregation.The PPL query properly combines pattern matching with aggregation to count occurrences grouped by metrics.size.
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)
8-9: LGTM! Appropriate static import added.The import of
assertYamlEqualsIgnoreIdis correctly added to support YAML plan assertions in the new tests.integ-test/src/test/resources/expectedOutput/calcite/dedup_metrics_size_field.yaml (1)
1-16: LGTM! Expected Calcite plan correctly models deduplication.The plan properly shows:
- Logical plan with ROW_NUMBER window function for deduplication by metrics.size
- Physical plan with filter condition (<=($45, 1)) to keep only the first occurrence
- Pushdown of EXISTS filter and field projections to OpenSearch
This validates the correct translation of PPL dedup operation to Calcite execution plan.
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Show resolved
Hide resolved
integ-test/src/test/resources/big5/queries/dedup_metrics_size_field.ppl
Outdated
Show resolved
Hide resolved
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
Show resolved
Hide resolved
…tePPLBig5IT.java Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
…tePPLBig5IT.java Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
…tePPLBig5IT.java Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
…tePPLBig5IT.java Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com> Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java(2 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (4)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (4)
8-9: LGTM!The static import for
assertYamlEqualsIgnoreIdis properly used in the new test methods for plan validation.
60-70: Test resource files exist and are properly configured.The test correctly references
rex_regex_transformation.pplandrex_regex_transformation.yamlfrom the integ-test resources directory. Both files are present in the repository, and the test follows guidelines for using valid test data from resources.
48-58: Test resource files verified and present.The test correctly follows the pattern with proper JavaDoc, timing measurement, and Calcite plan assertion. Both required resource files exist:
dedup_metrics_size_field.pplanddedup_metrics_size_field.yaml.
72-80: The proposed fix cannot be applied as written - the yaml plan file doesn't exist.This test correctly only performs timing without assertions, which differs from
dedup_metrics_size_fieldandrex_regex_transformation. However, those tests have corresponding yaml plan files (dedup_metrics_size_field.yamlandrex_regex_transformation.yamlinexpectedOutput/calcite/), whereasscript_engine_like_pattern_with_aggregation.yamlis missing.Before adding plan validation assertions, create the expected plan YAML file. Alternatively, if timing-only is intentional (e.g., for performance benchmarking without full plan validation), document that decision in a comment.
Likely an incorrect or invalid review comment.
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (2)
60-68: Consider adding plan assertions for consistency.For consistency with
rex_regex_transformation()and to properly test Calcite's SQL generation and optimization paths, consider adding plan validation assertions. This would verify that the generated Calcite plan matches expected behavior for LIKE pattern matching with aggregation.🔎 Example refactor to add assertions
@Test public void script_engine_like_pattern_with_aggregation() throws IOException { String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_aggregation.ppl")); timing(summary, "script_engine_like_pattern_with_aggregation", ppl); + String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_aggregation.yaml"); + assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl)); }Note: You would need to create the corresponding expected output YAML file at
integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_aggregation.yaml.Based on learnings, Calcite integration changes should test SQL generation and optimization paths.
70-78: Consider adding plan assertions for consistency.Similar to the comment above, consider adding plan validation assertions to verify Calcite's handling of LIKE pattern matching with sorting. This would provide better test coverage and consistency with the
rex_regex_transformation()test.🔎 Example refactor to add assertions
@Test public void script_engine_like_pattern_with_sort() throws IOException { String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_sort.ppl")); timing(summary, "script_engine_like_pattern_with_sort", ppl); + String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_sort.yaml"); + assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl)); }Note: You would need to create the corresponding expected output YAML file at
integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_sort.yaml.Based on learnings, Calcite integration changes should test SQL generation and optimization paths.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test new grammar rules with positive and negative cases for PPL parser changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.javainteg-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yamlinteg-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: build-linux (21, doc)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
🔇 Additional comments (4)
integ-test/src/test/resources/expectedOutput/calcite/big5/rex_regex_transformation.yaml (1)
1-14: LGTM! Expected plan structure is correct.The YAML properly defines both logical and physical Calcite plans for the rex regex transformation query. The regex pattern correctly extracts
logTypeandfilenamefrom the log path, and the physical plan includes appropriate pushdown optimizations (PROJECT, SORT, LIMIT).integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)
48-58: LGTM! Test follows best practices.The test properly validates Calcite's handling of rex regex transformations by loading the expected plan and asserting the actual generated plan matches. The JavaDoc clearly documents the test's purpose.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLAggregationIT.java (1)
738-745: The test expectations are correct and match the big5 test data. The big5.json file contains exactly three documents with @timestamp values (2023-01-02, 2023-03-01, 2023-05-01) that normalize to the expected 1-day spans. No action required.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalcitePPLConditionBuiltinFunctionIT.java (1)
96-100: Row count of 3 is correct for theisnotnull(aws)filter.The big5.json test data contains exactly 3 documents with a non-null
awsobject field (which contains nestedaws.cloudwatchproperties). All 3 will be returned by the query, confirming the expectedverifyNumOfRows(actual, 3)assertion is accurate.
|
@Swiddis the SQL CLI Integration tests are failing, any recommendations on how to fix this? |
LantaoJin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you resolve the conflicts
Signed-off-by: Aaron Alvarez <900908alvarezaaron@gmail.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)
70-78: Add plan assertions as previously requested.The JavaDoc has been added, but the main issue from the previous review remains unaddressed: this test still only performs timing without validating the generated Calcite plan. As noted in the previous review comment (lines 70-78), add plan validation:
@Test public void script_engine_like_pattern_with_sort() throws IOException { String ppl = sanitize(loadExpectedQuery("script_engine_like_pattern_with_sort.ppl")); timing(summary, "script_engine_like_pattern_with_sort", ppl); + String expected = loadExpectedPlan("big5/script_engine_like_pattern_with_sort.yaml"); + assertYamlEqualsIgnoreId(expected, explainQueryYaml(ppl)); }Create
integ-test/src/test/resources/expectedOutput/calcite/big5/script_engine_like_pattern_with_sort.yamlwith the expected Calcite query plan.Based on learnings, test SQL generation and optimization paths for Calcite integration changes.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.javainteg-test/src/test/resources/big5/queries/rex_regex_transformation.pplinteg-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.pplinteg-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🚧 Files skipped from review as they are similar to previous changes (3)
- integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
- integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
- integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Include edge cases and boundary conditions in PPL parser tests
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, unit)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (21, integration)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: test-sql-cli-integration (21)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java (1)
48-58: Good implementation - JavaDoc and assertions added.The JavaDoc has been added per previous review feedback, and the test now includes proper plan validation via
assertYamlEqualsIgnoreId, which tests Calcite's SQL generation and optimization paths as expected for Calcite integration tests.
integ-test/src/test/java/org/opensearch/sql/calcite/big5/CalcitePPLBig5IT.java
Show resolved
Hide resolved
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)
1-35: Consider clarifying the comment block's purpose.The comment block describes an OpenSearch DSL query structure with script engine details, while the PPL query below (lines 36-39) is much simpler. Consider adding a note at the beginning indicating this represents the expected compiled/translated OpenSearch query format for documentation/testing purposes, to avoid confusion about why both representations exist.
💡 Example clarification
-/* Filter messages containing 'sshd', sort by timestamp, and return top 10 results */ +/* Filter messages containing 'sshd', sort by timestamp, and return top 10 results + * + * Expected compiled OpenSearch DSL query structure (for documentation/validation): + */ /*
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
integ-test/src/test/resources/big5/queries/rex_regex_transformation.pplinteg-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.pplinteg-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🚧 Files skipped from review as they are similar to previous changes (2)
- integ-test/src/test/resources/big5/queries/rex_regex_transformation.ppl
- integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_aggregation.ppl
🧰 Additional context used
📓 Path-based instructions (1)
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
- GitHub Check: build-linux (21, integration)
- GitHub Check: build-linux (25, unit)
- GitHub Check: build-linux (21, doc)
- GitHub Check: build-linux (25, integration)
- GitHub Check: build-linux (25, doc)
- GitHub Check: build-linux (21, unit)
- GitHub Check: bwc-tests-full-restart (25)
- GitHub Check: bwc-tests-rolling-upgrade (25)
- GitHub Check: bwc-tests-full-restart (21)
- GitHub Check: bwc-tests-rolling-upgrade (21)
- GitHub Check: security-it-linux (25)
- GitHub Check: security-it-linux (21)
- GitHub Check: security-it-windows-macos (macos-14, 21)
- GitHub Check: security-it-windows-macos (macos-14, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 25)
- GitHub Check: security-it-windows-macos (windows-latest, 21)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
- GitHub Check: build-windows-macos (macos-14, 21, doc)
- GitHub Check: build-windows-macos (macos-14, 21, unit)
- GitHub Check: build-windows-macos (macos-14, 25, integration)
- GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
- GitHub Check: build-windows-macos (macos-14, 25, doc)
- GitHub Check: build-windows-macos (macos-14, 25, unit)
- GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
- GitHub Check: CodeQL-Scan (java)
- GitHub Check: test-sql-cli-integration (21)
🔇 Additional comments (1)
integ-test/src/test/resources/big5/queries/script_engine_like_pattern_with_sort.ppl (1)
36-39: LGTM! The PPL query syntax and logic are correct:
- Filters messages containing 'sshd' using the LIKE predicate
- Sorts by timestamp in descending order
- Returns top 10 results
Test data verification confirms the big5 dataset contains a message record with "sshd" in both the message text and process.name field. The data format is valid NDJSON with proper Elasticsearch bulk indexing structure, and includes realistic log metadata (timestamps, agent info, cloud region). The query will execute successfully and return meaningful test results.
Signed-off-by: Aaron Alvarez <aaarone@amazon.com>
6f1d81c to
3ed2ba6
Compare
Description
This PR continues the work done in PR #4816 to add frequent used queries to the big5 workload based on gap analysis between existing benchmarks and frequent used query patterns.
dedupquery is added here: #4991Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.