Skip to content

Conversation

@srikanthpadakanti
Copy link

Description

This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.

Key features introduced:

Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.

Related Issues

Resolves #4439
#4439

Check List

  • [ X] New functionality includes testing.
  • [ X] New functionality has been documented.
  • [ X] New functionality has javadoc added.
  • [ X] New functionality has a user manual doc added.
  • [ X] New PPL command checklist all confirmed.
  • [ X] API changes companion pull request created.
  • [ X] Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Srikanth Padakanti added 30 commits November 22, 2025 14:42
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
…it supposed to

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 11, 2025

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Added MV-EXPAND command to expand multivalue/array fields into separate rows with an optional per-document limit.
  • Documentation

    • New user guide and examples for MV-EXPAND; updated PPL command index and example outputs.
  • Tests

    • Added comprehensive unit and integration tests, mappings, and doctest data covering MV-EXPAND edge cases.
  • Chores

    • Parser, lexer, anonymizer, and planner plumbing updated to support MV-EXPAND.

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Adds native MVEXPAND support across PPL/SQL: new AST node, parser/lexer grammar, DSL, anonymizer, Calcite planning (with per-document limit), analyzer visitor registration, function planning tweaks, documentation, doctest fixtures, and unit/integration tests.

Changes

Cohort / File(s) Summary
AST Node
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
New MvExpand UnresolvedPlan node with Field field, optional Integer limit, attach(child), getChild(), and visitor accept.
AST Visitor API
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
Added visitMvExpand delegating to visitChildren; consolidated tree imports to wildcard. Removed two AddTotals visitor methods.
Analysis Layer
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
Registered visitMvExpand that throws a Calcite-only exception (marks node as handled only via Calcite path).
Calcite Planning
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Added visitMvExpand, extractFieldName(Field) helper, and extended buildExpandRelNode(...) to accept @Nullable Integer perDocLimit; propagate per-doc limit into right-side plan and validate field types (missing -> empty, non-array -> semantic error).
DSL / Parser
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java, ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Added mvexpand(...) DSL helper; parser visitor visitMvexpandCommand constructs MvExpand from field + optional limit.
PPL Grammar & Lexer
ppl/src/main/antlr/OpenSearchPPLLexer.g4, ppl/src/main/antlr/OpenSearchPPLParser.g4
Added MVEXPAND token and mvexpandCommand rule supporting optional limit=<int>; integrated into commands.
PPL Anonymizer
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
Added visitMvExpand to mask target field and to mask optional limit in anonymized PPL strings.
Function Planning Helpers
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
Registered an additional INTERNAL_ITEM signature with permissive first-arg type (IGNORE) to allow planning when element type is undefined.
Docs & Indexing
docs/user/ppl/cmd/mvexpand.md, docs/user/ppl/index.md, docs/category.json, docs/user/dql/metadata.rst
New mvexpand docs and index entry; category registration; updated example metadata.
Doctest Resources
doctest/test_data/mvexpand_logs.json, doctest/test_mapping/mvexpand_logs.json, doctest/test_docs.py
Added newline-delimited test data, mapping, and registered it in TEST_DATA.
Integration Tests & Resources
integ-test/src/test/java/.../CalciteMvExpandCommandIT.java, integ-test/.../CalciteExplainIT.java, integ-test/.../CalciteNoPushdownIT.java, integ-test/src/test/resources/*, integ-test/src/test/resources/mvexpand_edge_cases.json, mvexpand_edge_cases_mapping.json
New comprehensive integration test class with many test cases; added explain YAMLs, edge-case dataset and mapping; registered index enum constant.
PPL Planner Tests
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java, ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
New Calcite-based planner tests and updated anonymizer tests covering mvexpand behaviors and limit masking.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant PPLParser
    participant AstBuilder
    participant Analyzer
    participant CalcitePlanner
    participant Execution

    Client->>PPLParser: "source | mvexpand field [limit=n]"
    PPLParser->>AstBuilder: parse mvexpandCommand
    AstBuilder->>AstBuilder: construct MvExpand(field, optional limit)
    AstBuilder->>Analyzer: provide UnresolvedPlan with MvExpand
    Analyzer->>CalcitePlanner: visitMvExpand (throws only-for-calcite -> Calcite path used)
    CalcitePlanner->>CalcitePlanner: resolve field (extractFieldName), validate type
    alt field missing
        CalcitePlanner->>CalcitePlanner: produce EMPTY 0-row result while preserving schema
    else field non-array
        CalcitePlanner->>Client: throw SemanticCheckException
    else array OK
        CalcitePlanner->>CalcitePlanner: buildExpandRelNode(..., perDocLimit)
        CalcitePlanner->>Execution: produce physical plan (with optional right-side limit)
    end
    Execution->>Client: return expanded rows
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Pay extra attention to:
    • CalciteRelNodeVisitor: per-document limit propagation and correct insertion of the limit on the expansion (right) branch.
    • Field resolution and type validation (missing vs null vs non-array) and associated SemanticCheckException behavior.
    • Parser/grammar changes for optional limit parsing and AST construction in AstBuilder.
    • PPLFuncImpTable operator registration to ensure it doesn't unintentionally relax typing across other paths.
    • Integration tests and test resources for correctness and potential flakiness (bulk insert/setup/teardown).

Suggested labels

feature

Suggested reviewers

  • LantaoJin
  • penghuo
  • anirudha
  • kavithacm
  • derek-ho
  • joshuali925
  • GumpacG
  • Swiddis

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.04% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Mvexpand feature' is vague and generic, using non-descriptive terminology that doesn't convey specific implementation details about the mvexpand command being added. Consider a more descriptive title like 'Add native mvexpand command to PPL for expanding multivalue fields' to clearly communicate the main change.
✅ Passed checks (3 passed)
Check name Status Explanation
Description check ✅ Passed The description comprehensively explains the mvexpand feature being added, including key features, implementation details, edge case handling, and relates it to the resolved issue #4439.
Linked Issues check ✅ Passed The PR successfully implements all primary coding objectives from issue #4439: mvexpand command with field expansion, optional limit parameter, edge case handling for empty/null/large arrays, and comprehensive documentation and testing.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing the mvexpand feature: AST nodes, parsing, Calcite integration, tests, test data, and documentation. No unrelated modifications detected.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34db739 and 602358e.

📒 Files selected for processing (1)
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java

Comment @coderabbitai help to get the list of available commands and usage tips.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Nitpick comments (8)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

655-664: Consider adding JavaDoc for consistency.

The method follows the established visitor pattern and correctly anonymizes both the field name and optional limit. However, other visitor methods in this class lack JavaDoc as well, so this is consistent with the existing style.

The implementation correctly:

  • Masks the field name with MASK_COLUMN
  • Masks the limit value with MASK_LITERAL when present
  • Processes the child node following the same pattern as visitExpand
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java (2)

17-29: LGTM - Well-structured AST node following established patterns.

The MvExpand AST node correctly extends UnresolvedPlan and follows the same patterns as other AST nodes like Expand. The use of @Nullable for the optional limit parameter is appropriate, and the Lombok annotations provide consistent behavior.

Consider adding a null-check for field in the constructor to fail fast if an invalid AST is constructed:

 public MvExpand(Field field, @Nullable Integer limit) {
+   if (field == null) {
+     throw new IllegalArgumentException("Field cannot be null for mvexpand");
+   }
   this.field = field;
   this.limit = limit;
 }

31-45: Methods follow existing AST patterns.

The attach(), getChild(), and accept() methods correctly implement the UnresolvedPlan contract. The use of ImmutableList for getChild() is consistent with the codebase.

As per coding guidelines, consider adding brief JavaDoc for public methods, particularly attach() and getChild() to document their behavior.

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (2)

43-82: TableWithArray duplicates existing code.

The comment on line 42 acknowledges this mirrors CalcitePPLExpandTest.TableWithArray. Consider extracting this to a shared test utility class to reduce duplication and ensure consistency.


97-163: Tests are smoke tests only - consider adding assertions.

Several tests (lines 124-163) only verify that planning succeeds without asserting expected logical plans or behavior. While smoke tests are valuable for regression detection, consider adding at least one assertion per test to verify correct behavior.

For example, testMvExpandEmptyOrNullArray (line 131-135) could verify the plan handles the null filter appropriately.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)

61-70: Exception handling in cleanup may mask issues.

The cleanupAfterEach method catches and ignores all exceptions, which could hide legitimate cleanup failures or resource leaks during test debugging.

 @AfterEach
 public void cleanupAfterEach() throws Exception {
-  try {
-    deleteIndexIfExists(INDEX + "_not_array");
-    deleteIndexIfExists(INDEX + "_missing_field");
-    deleteIndexIfExists(INDEX + "_limit_test");
-    deleteIndexIfExists(INDEX + "_int_field");
-  } catch (Exception ignored) {
-  }
+  deleteIndexIfExists(INDEX + "_not_array");
+  deleteIndexIfExists(INDEX + "_missing_field");
+  deleteIndexIfExists(INDEX + "_limit_test");
+  deleteIndexIfExists(INDEX + "_int_field");
 }

The deleteIndexIfExists method already handles 404 errors gracefully, so the outer try-catch is unnecessary.


340-360: bulkInsert has unused ID parsing logic.

The bulkInsert method includes logic to parse custom IDs using | as a delimiter (lines 346-353), but none of the test calls use this format. All documents are inserted with auto-generated IDs. Consider simplifying if this feature isn't needed:

 private static void bulkInsert(String index, String... docs) throws IOException {
   StringBuilder bulk = new StringBuilder();
   int nextAutoId = 1;
   for (String doc : docs) {
-    String id;
-    String json;
-    if (doc.contains("|")) {
-      String[] parts = doc.split("\\|", 2);
-      id = parts[0];
-      json = parts[1];
-    } else {
-      id = String.valueOf(nextAutoId++);
-      json = doc;
-    }
+    String id = String.valueOf(nextAutoId++);
+    String json = doc;
     bulk.append("{\"index\":{\"_id\":").append(id).append("}}\n");
     bulk.append(json).append("\n");
   }

Alternatively, if this is for future use, add a comment explaining the intended use case.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

2858-2874: Missing field handling creates schema with NULL literal of untyped nature.

When a field is missing, the code adds relBuilder.literal(null) (line 2865) which creates a NULL literal without explicit type. This could cause type inference issues downstream. Consider using a typed null:

-      // Add NULL for missing field
-      projects.add(relBuilder.literal(null));
+      // Add typed NULL for missing field (VARCHAR as a reasonable default for unknown fields)
+      projects.add(
+          context.rexBuilder.makeNullLiteral(
+              context.rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR)));
       names.add(fieldName);

This ensures the NULL has a consistent type in the resulting schema.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90ee47c and 125cf3b.

📒 Files selected for processing (27)
  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java (1 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (6 hunks)
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1 hunks)
  • docs/category.json (1 hunks)
  • docs/user/dql/metadata.rst (3 hunks)
  • docs/user/ppl/cmd/mvexpand.rst (1 hunks)
  • docs/user/ppl/index.rst (1 hunks)
  • doctest/test_data/mvexpand_logs.json (1 hunks)
  • doctest/test_docs.py (1 hunks)
  • doctest/test_mapping/mvexpand_logs.json (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml (1 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml (1 hunks)
  • integ-test/src/test/resources/mvexpand_edge_cases.json (1 hunks)
  • integ-test/src/test/resources/mvexpand_edge_cases_mapping.json (1 hunks)
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1 hunks)
  • ppl/src/main/antlr/OpenSearchPPLParser.g4 (3 hunks)
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2 hunks)
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (2 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
  • core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
  • core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)

  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

  • Verify SQL generation and optimization paths
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
🧠 Learnings (8)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
📚 Learning: 2025-12-11T05:27:39.831Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.831Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml
  • integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

  • docs/user/ppl/index.rst
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
🪛 Biome (2.1.2)
docs/category.json

[error] 77-77: unexpected character <

(parse)


[error] 77-77: unexpected character <

(parse)


[error] 77-77: unexpected character <

(parse)


[error] 77-77: unexpected character <

(parse)


[error] 77-77: unexpected character <

(parse)


[error] 77-77: unexpected character <

(parse)


[error] 77-77: String values must be double quoted.

(parse)


[error] 77-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 78-78: unexpected character =

(parse)


[error] 79-79: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 79-79: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 79-128: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 128-129: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 128-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: unexpected character >

(parse)


[error] 129-129: Minus must be followed by a digit

(parse)

integ-test/src/test/resources/mvexpand_edge_cases.json

[error] 1-2: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 2-3: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 6-7: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 7-8: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 8-9: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 9-10: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 10-11: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 11-12: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 12-13: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 13-14: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 14-15: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 15-16: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 16-17: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 17-18: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

doctest/test_data/mvexpand_logs.json

[error] 1-2: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 2-3: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)


[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

🔇 Additional comments (29)
docs/user/dql/metadata.rst (1)

38-38: LGTM! Documentation properly reflects the new test index.

The row count update and the addition of the mvexpand_logs table entry are consistent with the mvexpand feature addition.

Also applies to: 51-51

docs/user/ppl/index.rst (1)

67-67: LGTM! Proper documentation for the new mvexpand command.

The command entry correctly identifies it as experimental (since 3.4) and provides an accurate description that aligns with the PR objectives.

doctest/test_docs.py (1)

60-61: LGTM! Proper test data registration.

The mvexpand_logs test data entry is correctly added to the TEST_DATA dictionary with appropriate syntax.

ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1)

53-53: LGTM! Lexer token properly added.

The MVEXPAND token is correctly defined and positioned alphabetically after the EXPAND token, following the established pattern for command keywords.

integ-test/src/test/resources/mvexpand_edge_cases_mapping.json (1)

1-8: LGTM! Test mapping is properly defined.

The mapping defines username as a keyword field and skills as a nested field, which are appropriate for testing mvexpand edge cases.

integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1)

35-35: LGTM! Test suite properly integrated.

CalciteMvExpandCommandIT.class is correctly added to the no-pushdown test suite in alphabetical order, following the established pattern.

integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml (1)

1-7: LGTM! Expected explain output is well-structured.

The YAML defines the expected logical and physical plans for mvexpand, showing proper projection of the VALUE field alongside the original fields.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

46-46: LGTM! Test data properly initialized.

The MVEXPAND_EDGE_CASES index is correctly loaded in the init method, following the established pattern for test setup.

doctest/test_mapping/mvexpand_logs.json (1)

1-24: LGTM!

The mapping structure is well-defined for testing mvexpand behavior with nested fields. The combination of keyword and integer nested types provides good coverage for type inference testing.

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml (1)

1-7: LGTM!

The explain output structure correctly represents a Calcite plan for mvexpand operations, including the VALUE projection that represents the expanded field.

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)

456-458: LGTM!

The visitor method follows the established pattern in AbstractNodeVisitor, correctly delegating to visitChildren for default traversal behavior.

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)

848-858: LGTM!

The test coverage for mvexpand anonymization is appropriate, covering both basic usage and the limit parameter. The tests follow the existing pattern and naming conventions.

integ-test/src/test/resources/mvexpand_edge_cases.json (1)

1-18: LGTM - Comprehensive edge case coverage.

The test data covers a good range of edge cases for mvexpand validation:

  • Normal multivalue expansion
  • Single values
  • Empty and null arrays
  • Missing fields
  • Heterogeneous nested objects
  • Duplicates
  • Large arrays

Note: The static analysis errors from Biome are false positives. This is valid newline-delimited JSON (NDJSON) format used for bulk operations in OpenSearch, not a standard JSON array.

ppl/src/main/antlr/OpenSearchPPLParser.g4 (3)

81-81: LGTM!

The mvexpandCommand is properly integrated into the commands alternatives, following the established pattern in the grammar.


120-120: LGTM!

The MVEXPAND token is correctly added to commandName, maintaining consistency with other command definitions.


532-534: LGTM!

The grammar rule for mvexpandCommand is well-defined with:

  • Required field expression to expand
  • Optional limit parameter for restricting expanded values per document

The syntax follows existing patterns in the grammar and correctly uses tokens that should already be defined in the lexer.

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1)

702-706: LGTM!

The new MVEXPAND_EDGE_CASES enum constant follows the established pattern used by other test indices in this file (e.g., TIME_TEST_DATA2, TELEMETRY). The naming conventions and constructor usage are consistent.

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

82-82: LGTM!

Import correctly placed among the tree imports.

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2)

81-81: LGTM!

Import correctly placed alphabetically among the tree imports.


705-708: LGTM!

The implementation correctly follows the established pattern for Calcite-only commands (e.g., visitExpand, visitBin). Throwing getOnlyForCalciteException("MvExpand") properly routes MvExpand processing to the Calcite engine as intended by this feature.

docs/user/ppl/cmd/mvexpand.rst (1)

1-29: LGTM!

The documentation is well-structured with clear description, syntax definition, and helpful notes about doctest stability. The command syntax and parameter descriptions are accurate and match the implementation.

doctest/test_data/mvexpand_logs.json (1)

1-6: LGTM!

The test data file correctly uses NDJSON (Newline-Delimited JSON) format, which is the expected format for OpenSearch bulk data loading. The static analysis errors from Biome are false positives—Biome expects standard JSON but NDJSON is intentionally one JSON object per line.

The test cases comprehensively cover the edge cases documented in mvexpand.rst:

  • basic: Multi-value array with 3 elements
  • empty: Empty array
  • null: Null field value
  • single: Single-element array
  • ids: Numeric array for limit testing
  • missing: Document without the target field
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)

23-59: Comprehensive test setup with good edge case coverage.

The test initialization creates a well-structured nested mapping and populates diverse test documents covering various edge cases. The test data design is thoughtful, including documents for single elements, empty arrays, null values, partial fields, mixed shapes, duplicates, and heterogeneous types.


227-251: Well-designed limit parameter test.

The testMvexpandLimitParameter test properly validates the limit=3 syntax with a document containing 5 elements, verifying both the row count and specific row values.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (4)

2836-2899: Well-implemented visitMvExpand with proper edge case handling.

The implementation correctly handles three cases:

  1. Missing field → Returns empty result while preserving schema (lines 2852-2874)
  2. Non-array type → Throws SemanticCheckException with clear message (lines 2877-2885)
  3. Valid array → Delegates to buildExpandRelNode with optional limit (lines 2887-2898)

The limit validation (lines 2892-2895) correctly rejects non-positive values.


2901-2913: extractFieldName helper is well-implemented.

The method correctly handles QualifiedName expressions by joining parts with dots, with a sensible fallback that removes backticks for other expression types.


3158-3188: buildExpandRelNode correctly applies per-document limit.

The updated method signature accepts an optional perDocLimit parameter and applies it at the right point in the pipeline—after uncollect but before building the right node for the correlate join (lines 3185-3187). This correctly limits expanded values per source document.


847-852: Consistent update to buildExpandRelNode calls.

The call in visitPatterns is correctly updated to pass null for the new perDocLimit parameter, maintaining backward compatibility.

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)

1101-1108: LGTM. The fallback registration for INTERNAL_ITEM properly supports MVEXPAND with undefined element types.

The design is sound:

  • Operator resolution tries the specific type checker first (ARRAY|INTEGER or MAP|ANY), then the permissive fallback (IGNORE|CHARACTER)
  • The SqlTypeFamily.IGNORE family explicitly skips type validation at planning time (PPLTypeChecker.java:81-83), allowing planning to succeed when element types are unknown
  • Execution defers to Calcite's ITEM operator, which handles the actual runtime validation
  • The CHARACTER-only constraint for the second argument correctly targets the string-key access pattern used by MVEXPAND; integer-index access on undefined types is intentionally excluded

@dai-chen dai-chen added enhancement New feature or request PPL Piped processing language backport 2.19-dev labels Dec 11, 2025
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1)

131-170: Consider adding assertions to verify correctness beyond planning success.

While smoke tests are valuable for ensuring planning doesn't throw exceptions, most tests here (lines 131-170) only verify that getRelNode(ppl) succeeds without validating the generated plan or SQL output. In contrast, testMvExpandBasic (lines 97-115) demonstrates stronger validation by asserting on expected logical plans and Spark SQL translation.

Consider adding similar assertions to at least 2-3 of these edge-case tests (e.g., testMvExpandEmptyOrNullArray, testMvExpandProjectNested) to ensure they not only plan successfully but also generate correct plans.

Example for testMvExpandEmptyOrNullArray:

@Test
public void testMvExpandEmptyOrNullArray() {
  String ppl = "source=DEPT | where isnull(EMPNOS) | mvexpand EMPNOS";
  RelNode root = getRelNode(ppl);
  // Verify filter is present in logical plan
  String logicalPlan = root.explain();
  assertTrue(logicalPlan.contains("LogicalFilter"), 
      "Expected filter for isnull(EMPNOS) in plan");
  assertTrue(logicalPlan.contains("LogicalCorrelate"), 
      "Expected correlation for mvexpand in plan");
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00c990f and 44814ab.

📒 Files selected for processing (4)
  • docs/category.json (2 hunks)
  • docs/user/ppl/cmd/mvexpand.rst (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/user/ppl/cmd/mvexpand.rst
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure methods are under 20 lines with single responsibility
  • Verify proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)

  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

  • Check test naming follows conventions (*Test.java for unit, *IT.java for integration)
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

  • Verify SQL generation and optimization paths
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (3)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: All tests must pass before merge; fix or ask for guidance if blocked

Applied to files:

  • docs/category.json
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧬 Code graph analysis (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (2)
legacy/src/main/java/org/opensearch/sql/legacy/executor/format/Schema.java (1)
  • Schema (17-170)
legacy/src/main/java/org/opensearch/sql/legacy/parser/SqlParser.java (1)
  • SqlParser (58-628)
🔇 Additional comments (6)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)

46-46: LGTM!

The index loading for MVEXPAND_EDGE_CASES follows the established pattern and properly sets up test data for the new mvexpand test.


321-327: Good fix on the comment; test implementation looks solid.

The misleading comment from the previous review has been corrected. The test properly validates the explain plan for the mvexpand command and follows the established testing pattern in this file.

Note: The limit parameter for mvexpand is covered in integration tests in CalciteMvExpandCommandIT.java (e.g., mvexpand skills limit=3), so no additional coverage is needed in this explain plan validation test.

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)

43-82: LGTM!

The TableWithArray implementation correctly provides a Calcite Table with an INTEGER array field (EMPNOS) for testing. JavaDoc clearly explains its purpose as a test fixture.


84-95: LGTM!

The config method properly sets up the test schema with the DEPT table containing array fields, and configures heuristic join order planning as mentioned in the PR objectives.


97-115: LGTM!

Excellent test that validates both the Calcite logical plan structure and Spark SQL translation for basic mvexpand. The expected plan correctly shows the correlation and uncollect pattern.


117-129: Previous review comment addressed - tests now correctly named.

The test previously flagged as misleading has been properly addressed:

  • Renamed to testMvExpandWithHeadCommand to reflect it tests the head command
  • Added new testMvExpandWithLimitParameter to test the actual mvexpand limit syntax (limit=2)

Both tests validate the planning succeeds for their respective scenarios.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/user/ppl/cmd/mvexpand.md (1)

7-9: Add language specifier to syntax code block for consistency.

The fenced code block at line 7 is missing a language identifier, unlike the other example code blocks in the file (lines 24, 40, 54, 64, 77) which use text. This creates a minor formatting inconsistency.

 ## Syntax
-```
+```text
 mvexpand <field> [limit=<int>]

</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 44814ab0922099584138430aab7a0e704d0c65a9 and f9dd692b38579ba69040a53247c6de3600cc2b6d.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `docs/category.json` (2 hunks)
* `docs/user/ppl/cmd/mvexpand.md` (1 hunks)
* `docs/user/ppl/index.rst` (1 hunks)

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (1)</summary>

* docs/user/ppl/index.rst

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧠 Learnings (2)</summary>

<details>
<summary>📓 Common learnings</summary>

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor for Calcite integration


</details>
<details>
<summary>📚 Learning: 2025-12-02T17:27:55.938Z</summary>

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: All tests must pass before merge; fix or ask for guidance if blocked


**Applied to files:**
- `docs/category.json`

</details>

</details><details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>

<details>
<summary>docs/user/ppl/cmd/mvexpand.md</summary>

7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

</details>

<details>
<summary>🔇 Additional comments (2)</summary><blockquote>

<details>
<summary>docs/category.json (1)</summary><blockquote>

`30-30`: **✅ mvexpand entry is correctly placed and formatted.**

The mvexpand documentation entry is now properly located in the `ppl_cli_calcite` section with the `.md` extension, maintaining alphabetical ordering between `multisearch.md` and `replace.md`. Previous merge conflict issues have been resolved, and the JSON is syntactically valid.

</blockquote></details>
<details>
<summary>docs/user/ppl/cmd/mvexpand.md (1)</summary><blockquote>

`1-98`: **✅ Documentation is comprehensive and covers edge cases well.**

The mvexpand documentation clearly describes the command purpose, provides syntax with required and optional parameters, and includes five well-structured examples covering key scenarios: basic expansion, limit parameter, empty arrays, null arrays, single-value arrays, and missing fields. This aligns with the PR objectives and provides users with clear guidance on command behavior.

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

srikanthpadakanti and others added 3 commits December 11, 2025 12:12
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9dd692 and 2464675.

📒 Files selected for processing (1)
  • docs/user/ppl/cmd/mvexpand.md (1 hunks)
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
docs/user/ppl/cmd/mvexpand.md

7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
docs/user/ppl/cmd/mvexpand.md (1)

1-125: Comprehensive documentation with good edge-case coverage.

The documentation is well-structured and covers the key mvexpand requirements from the PR objectives: basic expansion, optional limit parameter, and important edge cases (empty arrays, null arrays, single-element arrays, missing fields). The examples use clear tabular output formatting and stable doctest methodology (filtering by case=<name>).

// Some datasets (or Calcite's type inference) may give the element an UNDEFINED type.
// Accept a "ignore" first-argument family so INTERNAL_ITEM(elem, 'key') can still be planned
// and resolved at runtime (fallback semantics handled at execution side). - Used in MVEXPAND
registerOperator(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if missed some context. What is this used for? Please point me to comment related if any.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code defines a fallback type-checker for INTERNAL_ITEM to handle cases where the element type is undefined at planning time. It ensures the operator accepts composite types (IGNORE, CHARACTER) for resilient type validation during query execution.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some datasets (or Calcite's type inference) may give the element an UNDEFINED type

Could you clarify a little more which query/test dataset you have this problem? If any previous comment related, please let me know. Thanks!

Copy link
Author

@srikanthpadakanti srikanthpadakanti Dec 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without that code, tests fail because the INTERNAL_ITEM operator requires specific argument types ({ARRAY, INTEGER} or {STRUCT, ANY}) for type validation during query execution. When Calcite assigns an UNDEFINED type at planning time, the absence of fallback handling in registerOperator causes validation to fail, triggering ExpressionEvaluationException.

With that code, the fallback type-checker accepts UNDEFINED types as valid arguments (IGNORE, CHARACTER), allowing queries to pass type validation, execute correctly, and produce expected results in the tests.

Example - I commented that code and ran my IT below is the failure

testMvexpandDuplicate

org.opensearch.client.ResponseException: method [POST], host [http://127.0.0.1:65246], URI [/_plugins/_ppl], status line [HTTP/1.1 400 Bad Request]
{
"error": {
"reason": "Invalid Query",
"details": "INTERNAL_ITEM function expects {[ARRAY,INTEGER]|[STRUCT,ANY]}, but got [UNDEFINED,STRING]",
"type": "ExpressionEvaluationException"
},
"status": 400
}

Signed-off-by: Srikanth Padakanti <srikanth29.9@gmail.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)

94-94: Add JavaDoc for visitor method.

Per coding guidelines, public methods must have JavaDoc documentation with @param and @return tags. A previous review already flagged this same issue.

Based on coding guidelines, all public classes and methods must have proper JavaDoc.

Also applies to: 872-878

🧹 Nitpick comments (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (2)

3107-3182: Consider extracting the missing field handling logic to improve readability.

The visitMvExpand method is ~75 lines, exceeding the 50-line guideline for this file. The missing field handling (lines 3135-3156) is a self-contained 22-line block that could be extracted into a helper method like buildEmptyRelationWithField(RelBuilder, RelDataType, String). This would improve readability and maintainability.

The core logic is sound:

  • Missing field correctly returns an EMPTY relation (0 rows) while preserving schema
  • Non-array types properly throw SemanticCheckException
  • Limit validation and array expansion are handled correctly
🔎 Suggested refactoring to extract complex logic

Extract the missing field handling into a helper method:

/**
 * Builds an empty relation (0 rows) while preserving schema and adding the missing field as NULL.
 */
private void buildEmptyRelationWithField(
    RelBuilder relBuilder, RelDataType rowType, String missingFieldName) {
  List<RelDataTypeField> fields = rowType.getFieldList();
  List<RexNode> projects = new ArrayList<>();
  List<String> names = new ArrayList<>();

  // Keep existing fields
  for (RelDataTypeField f : fields) {
    projects.add(relBuilder.field(f.getIndex()));
    names.add(f.getName());
  }

  // Add NULL for missing field
  projects.add(relBuilder.literal(null));
  names.add(missingFieldName);

  relBuilder.project(projects, names);
  // Return 0 rows
  relBuilder.filter(relBuilder.literal(false));
}

Then simplify visitMvExpand:

     // 2A. Missing field → true EMPTY relation (no schema, no rows)
     if (matched == null) {
-      // Schema must include the missing field, even if no rows returned.
-      List<RelDataTypeField> fields = rowType.getFieldList();
-      List<RexNode> projects = new ArrayList<>();
-      List<String> names = new ArrayList<>();
-
-      // Keep existing fields
-      for (RelDataTypeField f : fields) {
-        projects.add(relBuilder.field(f.getIndex()));
-        names.add(f.getName());
-      }
-
-      // Add NULL for missing field
-      projects.add(relBuilder.literal(null));
-      names.add(fieldName);
-
-      relBuilder.project(projects, names);
-
-      // Now return 0 rows
-      relBuilder.filter(relBuilder.literal(false));
-
+      buildEmptyRelationWithField(relBuilder, rowType, fieldName);
       return relBuilder.peek();
     }

3184-3196: Consider making the fallback logic more explicit.

The extractFieldName method correctly handles QualifiedName by joining parts. However, the fallback toString().replace("", "")` is somewhat fragile as it:

  • Assumes backticks are the only wrapper characters
  • Depends on UnresolvedExpression.toString() implementation details
  • Could incorrectly strip backticks that are part of the actual field name

Consider adding a comment explaining the fallback behavior or handling other known UnresolvedExpression subtypes explicitly.

🔎 Example: More explicit fallback handling
 private String extractFieldName(Field f) {
   UnresolvedExpression inner = f.getField();

   if (inner instanceof QualifiedName) {
     List<String> parts = ((QualifiedName) inner).getParts();
     if (!parts.isEmpty()) {
       return String.join(".", parts);
     }
   }

-  // Fallback - return clean string
+  // Fallback: Remove backtick delimiters that may be added by the parser.
+  // This assumes field names don't contain backticks as part of their actual name.
   return inner.toString().replace("`", "");
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f86c52 and 32d3867.

📒 Files selected for processing (12)
  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (6 hunks)
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1 hunks)
  • docs/category.json (2 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1 hunks)
  • ppl/src/main/antlr/OpenSearchPPLParser.g4 (3 hunks)
  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2 hunks)
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (2 hunks)
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/category.json
  • ppl/src/main/antlr/OpenSearchPPLParser.g4
  • integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
  • ppl/src/main/antlr/OpenSearchPPLLexer.g4
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧰 Additional context used
📓 Path-based instructions (9)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

  • Public methods MUST have JavaDoc with @param, @return, and @throws
  • Follow existing function implementation patterns in the same package
  • New expression functions should follow ExpressionFunction interface patterns
  • Validate function naming follows project conventions (camelCase)

Files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

  • Suggest extracting complex logic into helper methods
  • Check for code organization and logical grouping
  • Validate all RelNode types are handled

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible

  • Follow visitor pattern for AST traversal
  • Ensure proper toString() implementation for debugging

Files:

  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
core/src/main/java/org/opensearch/sql/expression/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/expression/**/*.java: - New expression implementations must follow existing patterns

  • Type handling must be consistent with project type system
  • Error handling must use appropriate exception types
  • Null handling must be explicit and documented

Files:

  • core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
🧠 Learnings (5)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧬 Code graph analysis (1)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)
legacy/src/main/java/org/opensearch/sql/legacy/utils/StringUtils.java (1)
  • StringUtils (17-117)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: WhiteSource Security Check
🔇 Additional comments (9)
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)

71-71: LGTM! MvExpand visitor follows established pattern.

The new visitMvExpand method correctly follows the visitor pattern established in this class, delegating to visitChildren as expected.

Also applies to: 458-460

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

85-85: LGTM! MvExpand anonymization correctly implemented.

The visitMvExpand method properly anonymizes the mvexpand command by masking the field and limit values, following the established anonymization patterns in this class.

Also applies to: 658-667

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (1)

83-83: LGTM! MvExpand correctly designated as Calcite-only.

The visitMvExpand override properly throws a Calcite-only exception, consistent with other commands like Expand, Flatten, and Reverse that follow the same pattern in this file.

Also applies to: 717-720

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)

866-875: LGTM! Good test coverage for mvexpand anonymization.

The new tests cover both the basic mvexpand command and the variant with a limit parameter, following the established test patterns in this file.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (4)

94-94: LGTM! Imports are necessary for MVExpand support.

The new imports for QualifiedName and MvExpand are correctly added to support the MVExpand command implementation.

Also applies to: 126-126


849-854: LGTM! Backward compatible integration.

The visitPatterns method correctly passes null for perDocLimit, maintaining the existing expand behavior without per-document limits.


3102-3102: LGTM! Backward compatible integration.

The visitExpand method correctly passes null for perDocLimit, maintaining the existing expand behavior without per-document limits.


3441-3494: LGTM! Clean integration of per-document limit support.

The buildExpandRelNode changes are well-designed:

  • The @Nullable Integer perDocLimit parameter clearly indicates optional behavior
  • The limit is correctly applied after uncollect but before building the right node (lines 3468-3470), ensuring it limits values per document
  • Backward compatibility is maintained for existing Expand and Patterns operations
  • The integration follows existing patterns in the codebase

Based on learnings: This follows existing patterns in CalciteRelNodeVisitor for Calcite integration.

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)

1111-1118: The IGNORE type family fallback is an established design pattern, not a risk.

The second registration for INTERNAL_ITEM correctly implements a designed fallback mechanism:

  1. Intentional design: The SqlTypeFamily.IGNORE pattern is explicitly handled in PPLTypeChecker.java (line 81) and already used for other operators (IS_NULL, IS_NOT_NULL, IS_PRESENT), confirming it's a well-established, deliberate pattern—not a permissive edge case.

  2. Proper registration order: The first registration (lines 1103-1110) validates specific types (ARRAY+INTEGER, MAP+ANY), while the fallback (lines 1115-1118) only activates when the first doesn't match, enabling runtime resolution for UNDEFINED element types as required by MVEXPAND.

  3. Test coverage exists: Integration tests (CalciteMvExpandCommandIT.java) and planning tests (CalcitePPLMvExpandTest.java) cover array expansion scenarios including edge cases (empty arrays, null values, primitives). The testInternalItemAccessOnStructs test in CalciteExplainIT.java validates INTERNAL_ITEM access patterns on dynamic structures.

The implementation is sound and follows established patterns for deferred type resolution.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32d3867 and 07509ae.

📒 Files selected for processing (1)
  • docs/user/ppl/index.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

  • docs/user/ppl/index.md
🔇 Additional comments (2)
docs/user/ppl/index.md (2)

81-83: No duplicate entries found—all three commands appear once with unique entries.

The addtotals, addcoltotals, and mvexpand commands each appear only once in the documentation table at their respective lines (81–83). The mvexpand.md documentation file exists and is properly formatted. The table entries are correctly formatted with appropriate version and status information.

Likely an incorrect or invalid review comment.


83-83: mvexpand.md documentation file exists and is appropriately structured.

The mvexpand.md file is present at docs/user/ppl/cmd/mvexpand.md with comprehensive documentation including clear syntax definition, five detailed examples covering normal cases and edge cases (empty arrays, null values, missing fields, limit parameter), and explanatory notes. The documentation follows the established pattern of other PPL command docs.

Comment on lines +81 to +83
| [addtotals command](cmd/addtotals.md) | 3.4 | stable (since 3.4) | Adds row and column values and appends a totals column and row. |
| [addcoltotals command](cmd/addcoltotals.md) | 3.4 | stable (since 3.4) | Adds column values and appends a totals row. |
| [mvexpand command](cmd/mvexpand.md) | 3.4 | experimental (since 3.4) | Expand a multi-valued field into separate documents (one per value). |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for PPL command checklist and mvexpand implementation files

# Check for PPL command checklist
if [ -f docs/dev/ppl-commands.md ]; then
  echo "=== PPL Commands Checklist ==="
  cat docs/dev/ppl-commands.md
else
  echo "docs/dev/ppl-commands.md not found"
fi

# Search for mvexpand related files
echo ""
echo "=== mvexpand implementation files ==="
echo "Grammar/Parser/AST files:"
fd -H 'mvexpand|Mvexpand|MVExpand' --type f | head -30

echo ""
echo "=== Test files ==="
fd -H 'mvexpand' --type f | grep -i test

echo ""
echo "=== Documentation files ==="
fd -H 'mvexpand' --type f | grep -i 'doc\|md'

echo ""
echo "=== Check directory structure ==="
git ls-files | grep -E 'mvexpand|ppl' | head -50

Repository: opensearch-project/sql

Length of output: 5306


🏁 Script executed:

#!/bin/bash
# Verify detailed checklist compliance for mvexpand

echo "=== 1. Lexer/Parser Grammar Files ==="
echo "Checking for OpenSearchPPLLexer.g4:"
find . -name "*Lexer.g4" -type f
echo ""
echo "Checking for OpenSearchPPLParser.g4:"
find . -name "*Parser.g4" -type f
echo ""
echo "Searching for 'mvexpand' in grammar files:"
fd -H 'Lexer.g4|Parser.g4' --type f -x grep -l 'mvexpand' {} \;

echo ""
echo "=== 2. AST Implementation ==="
echo "Checking for mvexpand AST tree node:"
fd -H 'Mvexpand|MVExpand' ppl/src/main/java --type f | grep -i ast
echo ""
echo "All AST tree classes:"
fd -H --type f ppl/src/main/java/org/opensearch/sql/ast/tree | grep -E '\.java$' | grep -i mvexpand

echo ""
echo "=== 3. Visitor Pattern Implementation ==="
echo "Checking AbstractNodeVisitor:"
fd -H 'AbstractNodeVisitor.java' --type f
echo ""
echo "Searching for mvexpand visit methods:"
rg 'visit.*[Mm]vexpand|[Mm]vexpand.*visitor' --type java

echo ""
echo "=== 4. Test Files Verification ==="
echo "Unit tests:"
fd -H 'CalcitePPLMvExpandTest.java' --type f
echo ""
echo "Integration tests (pushdown):"
fd -H 'CalciteMvExpandCommandIT.java' --type f
echo ""
echo "Integration tests (non-pushdown):"
rg 'CalciteNoPushdownIT.*mvexpand|mvexpand.*CalciteNoPushdownIT' --type java
echo ""
echo "Anonymizer tests:"
rg 'PPLQueryDataAnonymizerTest.*mvexpand|mvexpand.*PPLQueryDataAnonymizerTest' --type java

echo ""
echo "=== 5. Documentation ==="
echo "User documentation exists:"
ls -lh docs/user/ppl/cmd/mvexpand.md

echo ""
echo "=== 6. Index linked ==="
echo "Checking if mvexpand is linked in index.md:"
grep -n 'mvexpand' docs/user/ppl/index.md

Repository: opensearch-project/sql

Length of output: 2294


🏁 Script executed:

#!/bin/bash
# Deep dive verification for remaining checklist items

echo "=== 1. Lexer Grammar File - mvexpand keyword ==="
rg 'mvexpand|MVEXPAND' ppl/src/main/antlr/OpenSearchPPLLexer.g4

echo ""
echo "=== 2. AST Implementation Details ==="
echo "Searching for Mvexpand class in ast package:"
find . -path '*/ast/tree*' -name '*.java' -type f | xargs grep -l 'class.*Mvexpand\|class.*MVExpand' 2>/dev/null
echo ""
echo "Checking ppl ast files:"
fd -H --type f 'ppl/src/main/java.*ast' | head -30

echo ""
echo "=== 3. Visitor Pattern - Analyzer ==="
echo "Searching for visitMvexpand in Analyzer:"
rg 'visitMvexpand|visit.*Mvexpand' ppl/src/main/java --type java

echo ""
echo "=== 4. Visitor Pattern - CalciteRelNodeVisitor ==="
echo "Searching for visitMvexpand in CalciteRelNodeVisitor:"
rg 'visitMvexpand|visit.*Mvexpand' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type java | head -20

echo ""
echo "=== 5. Non-pushdown Integration Tests ==="
echo "Searching for mvexpand in CalciteNoPushdownIT and similar:"
fd -H 'NoPushdown.*\.java' ppl/src integ-test/src --type f | xargs grep -l 'mvexpand' 2>/dev/null
echo ""
echo "Checking CalciteNoPushdownIT classes:"
fd -H 'CalciteNoPushdown' --type f -x grep -l 'mvexpand' {} \;

echo ""
echo "=== 6. Anonymizer Tests ==="
echo "Checking PPLQueryDataAnonymizerTest:"
fd -H 'PPLQueryDataAnonymizer' --type f
echo ""
echo "Searching for anonymizer test file:"
rg 'class.*PPLQueryDataAnonymizerTest' --type java
echo ""
echo "Checking if mvexpand is tested in anonymizer:"
fd -H 'PPLQueryDataAnonymizerTest' --type f -x grep -i 'mvexpand' {} \;

echo ""
echo "=== 7. Explain Tests ==="
echo "Checking ExplainIT classes:"
fd -H 'ExplainIT|CalciteExplainIT' --type f | head -10
echo ""
echo "Searching for mvexpand in explain tests:"
rg 'mvexpand' ppl/src integ-test/src --type java | grep -i explain

Repository: opensearch-project/sql

Length of output: 2375


🏁 Script executed:

#!/bin/bash
# Final verification for remaining checklist items

echo "=== 1. AST Implementation Details ==="
echo "Checking AstBuilder visitMvexpandCommand implementation:"
rg -A 20 'visitMvexpandCommand' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java | head -25

echo ""
echo "=== 2. Visitor Pattern - Analyzer ==="
echo "Checking for Analyzer class with visit methods:"
fd -H 'Analyzer.java' ppl/src/main/java --type f
echo ""
echo "Searching for visitor implementations in core/analyzer:"
rg 'visitMvexpand|class.*Analyzer' core/src/main/java/org/opensearch/sql/analyzer --type java | head -10

echo ""
echo "=== 3. Visitor Pattern - CalciteRelNodeVisitor or similar ==="
echo "Searching for all visitor classes in ppl:"
fd -H 'Visitor.java' ppl/src/main/java --type f
echo ""
echo "Checking PPLNodeVisitor:"
rg 'PPLNodeVisitor|class.*Visitor' ppl/src/main/java --type java | grep -i 'class' | head -10

echo ""
echo "=== 4. Non-pushdown Integration Tests ==="
echo "All *IT classes in ppl integ tests:"
fd -H '.*IT\.java' ppl/src/test integ-test/src/test --type f | xargs grep -l 'mvexpand' 2>/dev/null
echo ""
echo "All *IT classes that might test PPL commands:"
fd -H '.*CalciteNo.*IT\.java' integ-test/src/test --type f | head -5

echo ""
echo "=== 5. NewAddedCommandsIT ==="
echo "Searching for NewAddedCommandsIT:"
fd -H 'NewAddedCommandsIT' --type f
echo ""
echo "Checking for mvexpand in that test:"
fd -H 'NewAddedCommandsIT' --type f -x grep -i 'mvexpand' {} \;

echo ""
echo "=== 6. Summary - Check AST and Visitor integration ==="
echo "Looking for UnresolvedPlan or command class usage:"
rg -B 5 -A 15 'visitMvexpandCommand.*{' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java | head -30

Repository: opensearch-project/sql

Length of output: 3291


🏁 Script executed:

#!/bin/bash
# Find MvExpand class and verify visitor implementations

echo "=== 1. MvExpand AST Class ==="
echo "Searching for MvExpand class definition:"
fd -H 'MvExpand.java' --type f
echo ""
echo "Checking class content:"
fd -H 'MvExpand.java' --type f -x cat {} \;

echo ""
echo "=== 2. Visitor Pattern in Calcite ==="
echo "Searching for visitor implementations in calcite package:"
fd -H 'Visitor.java' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type f | head -10
echo ""
echo "Checking for ppllocalqueryplanner or related:"
fd -H 'PPLLocalQueryPlanner\|LogicalPlanVisitor\|CalciteRelNodeVisitor' --type f | head -10
echo ""
echo "Checking for mvexpand visitor support in calcite:"
rg 'visitMvexpand\|MvExpand' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type java | head -15

echo ""
echo "=== 3. Non-pushdown Tests ==="
echo "Content of CalciteNoPushdownIT (checking if mvexpand should be there):"
fd -H 'CalciteNoPushdownIT.java' --type f -x head -50 {} \;
echo ""
echo "Checking all test methods in CalciteNoPushdownIT:"
fd -H 'CalciteNoPushdownIT.java' --type f -x rg 'testMv|mvexpand|void test' {} \;

echo ""
echo "=== 4. NewAddedCommandsIT Tests ==="
echo "Content check for NewAddedCommandsIT (ppl version):"
fd -H 'ppl/NewAddedCommandsIT.java' --type f
echo ""
echo "Checking CalciteNewAddedCommandsIT:"
fd -H 'CalciteNewAddedCommandsIT.java' --type f -x head -50 {} \;
echo ""
echo "Checking if mvexpand should be in NewAddedCommandsIT:"
fd -H 'NewAddedCommandsIT.java' --type f -x rg 'mvexpand\|testMv' {} \;

echo ""
echo "=== 5. Analyzer Visitor ==="
echo "Searching for Analyzer or resolver in ppl:"
fd -H 'Analyzer\|Resolver' ppl/src/main/java --type f | grep -E '\.java$' | head -10
echo ""
echo "Checking if mvexpand needs analyzer visitor:"
rg 'class.*MvExpand\|MvExpand' ppl/src/main/java/org/opensearch/sql/ppl --type java | head -5

Repository: opensearch-project/sql

Length of output: 3979


🏁 Script executed:

#!/bin/bash
# Verify visitor implementations and NewAddedCommandsIT

echo "=== 1. AbstractNodeVisitor visitMvExpand ==="
echo "Checking for visitMvExpand in AbstractNodeVisitor:"
fd -H 'AbstractNodeVisitor.java' --type f -x rg 'visitMvExpand' {} \;

echo ""
echo "=== 2. Find all visitor implementations ==="
echo "Searching for all visitMvExpand implementations across codebase:"
rg 'visitMvExpand|visitMvexpand' --type java

echo ""
echo "=== 3. PPL NewAddedCommandsIT content ==="
echo "Content of ppl/NewAddedCommandsIT.java:"
fd -H 'ppl/NewAddedCommandsIT.java' --type f -x cat {} \; | head -100

echo ""
echo "=== 4. Verify CalciteNoPushdownIT includes mvexpand ==="
echo "Checking if CalciteMvExpandCommandIT is in CalciteNoPushdownIT suite:"
fd -H 'CalciteNoPushdownIT.java' --type f -x grep 'CalciteMvExpandCommandIT' {} \;

echo ""
echo "=== 5. Summary - All key checklist items ==="
echo "Lexer: MVEXPAND keyword"
rg "MVEXPAND.*'MVEXPAND'" ppl/src/main/antlr/OpenSearchPPLLexer.g4
echo ""
echo "Parser: mvexpand rule in grammar"
rg 'mvexpandCommand' ppl/src/main/antlr/OpenSearchPPLParser.g4 | head -3
echo ""
echo "AST: MvExpand class exists"
ls -lh core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
echo ""
echo "Parser Builder: visitMvexpandCommand"
rg 'visitMvexpandCommand' ppl/src/main/java --type java | head -1
echo ""
echo "Unit Tests:"
ls -lh ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
echo ""
echo "Integration Tests (pushdown):"
ls -lh integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
echo ""
echo "Integration Tests (non-pushdown via suite):"
grep 'CalciteMvExpandCommandIT' integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
echo ""
echo "Explain Tests:"
rg 'mvexpand.*explain|explain.*mvexpand' integ-test/src/test/java --type java | head -1
echo ""
echo "Anonymizer Tests:"
rg 'testMvexpand' ppl/src/test/java --type java | head -2
echo ""
echo "User Documentation:"
ls -lh docs/user/ppl/cmd/mvexpand.md

Repository: opensearch-project/sql

Length of output: 2799


🏁 Script executed:

#!/bin/bash
# Final verification for NewAddedCommandsIT

echo "=== NewAddedCommandsIT - Full content ==="
fd -H 'ppl/NewAddedCommandsIT.java' --type f -x wc -l {} \;
echo ""
fd -H 'ppl/NewAddedCommandsIT.java' --type f -x cat {} \;

echo ""
echo "=== Check for test methods in NewAddedCommandsIT ==="
fd -H 'NewAddedCommandsIT.java' ppl/src/test --type f -x rg 'void test|public void' {} \;

echo ""
echo "=== Cross-cluster tests (optional) ==="
echo "Checking for CrossClusterSearchIT:"
fd -H 'CrossClusterSearchIT' --type f
echo ""
echo "Checking if mvexpand is tested there:"
rg 'mvexpand' integ-test/src/test --type java | grep -i 'cross\|cluster'

Repository: opensearch-project/sql

Length of output: 441


🏁 Script executed:

#!/bin/bash
# Check if ppl/NewAddedCommandsIT.java should exist and verify its purpose

echo "=== Checking ppl test directory structure ==="
fd -H '\.java$' ppl/src/test/java/org/opensearch/sql/ppl --type f | grep -i 'test\|it' | head -20

echo ""
echo "=== Looking for NewAddedCommandsIT pattern ==="
echo "All NewAddedCommandsIT files:"
fd -H 'NewAddedCommandsIT' --type f

echo ""
echo "=== Content of base NewAddedCommandsIT ==="
fd -H 'ppl/NewAddedCommandsIT' --type f
echo ""
fd -H 'integ-test.*NewAddedCommandsIT' --type f -x cat {} \; | head -50

echo ""
echo "=== Checklist item analysis: Should mvexpand be in ppl/NewAddedCommandsIT? ==="
echo "Checking if other commands have tests in both locations:"
echo "Commands in CalciteNewAddedCommandsIT (extends ppl/NewAddedCommandsIT):"
fd -H 'CalciteNewAddedCommandsIT' --type f -x cat {} \;
echo ""
echo "Checking ppl test directory for NewAddedCommandsIT:"
ls -la ppl/src/test/java/org/opensearch/sql/ppl/ | grep -i 'new\|added'

Repository: opensearch-project/sql

Length of output: 2578


🏁 Script executed:

#!/bin/bash
# Direct verification of NewAddedCommandsIT file

echo "=== Finding NewAddedCommandsIT.java ==="
find . -name 'NewAddedCommandsIT.java' -type f

echo ""
echo "=== Content of integ-test ppl NewAddedCommandsIT.java ==="
cat integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java

echo ""
echo "=== Verify file exists and size ==="
ls -lah integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java

echo ""
echo "=== Line count ==="
wc -l integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java

echo ""
echo "=== Final checklist summary ==="
echo "1. Lexer/Parser Updates: ✓ MVEXPAND keyword and grammar present"
echo "2. AST Implementation: ✓ MvExpand class present in core/ast/tree"
echo "3. Visitor Pattern:"
echo "   - AbstractNodeVisitor: ✓"
echo "   - Analyzer: ✓"
echo "   - CalciteRelNodeVisitor: ✓"
echo "   - PPLQueryDataAnonymizer: ✓"
echo "4. Unit Tests: ✓ CalcitePPLMvExpandTest.java"
echo "5. Integration Tests (pushdown): ✓ CalciteMvExpandCommandIT.java"
echo "6. Integration Tests (non-pushdown): ✓ CalciteMvExpandCommandIT.class in CalciteNoPushdownIT suite"
echo "7. Explain Tests: ✓ CalciteExplainIT.java"
echo "8. Anonymizer Tests: ✓ PPLQueryDataAnonymizerTest"
echo "9. NewAddedCommandsIT: ? Need to verify"
echo "10. User Documentation: ✓ docs/user/ppl/cmd/mvexpand.md"
echo "11. Cross-cluster Tests: Optional (not found, but optional in checklist)"

Repository: opensearch-project/sql

Length of output: 8153


Add test for mvexpand in NewAddedCommandsIT.

Most checklist items are addressed (lexer/parser, AST, all visitor patterns, unit tests, integration tests for pushdown/non-pushdown, explain tests, anonymizer tests, user documentation). However, the "Unsupported in v2 test" item from docs/dev/ppl-commands.md is missing: add a testMvexpand() method to integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java following the pattern of existing tests like testJoin() and testAddTotalCommand().

🤖 Prompt for AI Agents
In docs/user/ppl/index.md around lines 81-83: add the missing integration test
by creating a method testMvexpand() in
integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java
following the pattern of testJoin() and testAddTotalCommand(): add a
@Test-annotated method named testMvexpand that runs a representative mvexpand
query, verifies expected behavior, and includes the "Unsupported in v2"
assertion (matching the other NewAddedCommandsIT tests) so the checklist item is
satisfied; ensure the test uses the same setup/teardown, visitor/pushdown
assertions, and any explain/anonymizer checks used by neighboring tests.

Srikanth Padakanti added 2 commits December 18, 2025 17:18
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>
@srikanthpadakanti
Copy link
Author

Hello @penghuo @ykmr1224 Can you please look into this PR. Thanks

import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Field;

/** AST node representing an {@code mvexpand <field> [limit N]} operation. */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[limit N] -> [limit=N]
It might be better referring as mvexpand PPL command to avoid inconsistency.


PPL query:
```ppl
source=mvexpand_logs | where case='basic' | mvexpand tags | fields tags.value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we output value to field.value? Are we intentionally making it different from Splunk command spec or due to some reason?


## Notes about these doctests
- The tests below target a single, deterministic document by using `where case='<name>'` so the doctests are stable.
- The test index name used in these examples is `mvexpand_logs`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the examples, it would be easier to understand if we use array function to make and multivalue field, and expand it by mvexpand command. And then, we don't need to add new index mvexpand_logs for this doc.

Refer here for examples: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/collection.md

Comment on lines +1 to +7
calcite:
logical: |
LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
LogicalValues(tuples=[[]])
physical: |
EnumerableValues(tuples=[[]]) No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks it does not do anything related to mvexpand.

Comment on lines +1 to +7
calcite:
logical: |
LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
LogicalValues(tuples=[[]])
physical: |
EnumerableValues(tuples=[[]]) No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

RelDataTypeField matched = rowType.getField(fieldName, false, false);

// 2A. Missing field → true EMPTY relation (no schema, no rows)
if (matched == null) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we raise error when the field is not found?

String fieldName = extractFieldName(field);

// 2. Lookup field
RelDataTypeField matched = rowType.getField(fieldName, false, false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should rely on rexVisitor.analyze to resolve field node.

* @return RelNode representing records with the expanded multi-value field
*/
@Override
public RelNode visitMvExpand(MvExpand mvExpand, CalcitePlanContext context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it have lots of difference with visitExpand? Should it be mostly same other than limit parameter?

"Expected SemanticCheckException about non-array integer field, got: " + msg);
}

private static String createTempIndexWithMapping(String baseName, String mappingJson)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we already have these helper methods in SQLIntegTestCase. Please use existing ones whenever possible.

public void testMvexpandSingleElement() throws Exception {
String query =
String.format(
"source=%s | mvexpand skills | where username='single' | fields username, skills.name",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it bring skills.name although it expanded skills?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add native support for mvexpand command in PPL

3 participants