Mvexpand feature #4944

srikanthpadakanti · 2025-12-11T17:26:46Z

Description

This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.

Key features introduced:

Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.

Related Issues

Resolves #4439
#4439

Check List

[ X] New functionality includes testing.
[ X] New functionality has been documented.
[ X] New functionality has javadoc added.
[ X] New functionality has a user manual doc added.
[ X] New PPL command checklist all confirmed.
[ X] API changes companion pull request created.
[ X] Commits are signed per the DCO using --signoff or -s.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

…it supposed to Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai · 2025-12-11T17:27:03Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added MV-EXPAND command to expand multivalue/array fields into separate rows with an optional per-document limit.
Documentation
- New user guide and examples for MV-EXPAND; updated PPL command index and example outputs.
Tests
- Added comprehensive unit and integration tests, mappings, and doctest data covering MV-EXPAND edge cases.
Chores
- Parser, lexer, anonymizer, and planner plumbing updated to support MV-EXPAND.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Walkthrough

Adds native MVEXPAND support across PPL/SQL: new AST node, parser/lexer grammar, DSL, anonymizer, Calcite planning (with per-document limit), analyzer visitor registration, function planning tweaks, documentation, doctest fixtures, and unit/integration tests.

Changes

Cohort / File(s)	Summary
AST Node `core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java`	New `MvExpand` UnresolvedPlan node with `Field field`, optional `Integer limit`, `attach(child)`, `getChild()`, and visitor `accept`.
AST Visitor API `core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java`	Added `visitMvExpand` delegating to `visitChildren`; consolidated tree imports to wildcard. Removed two AddTotals visitor methods.
Analysis Layer `core/src/main/java/org/opensearch/sql/analysis/Analyzer.java`	Registered `visitMvExpand` that throws a Calcite-only exception (marks node as handled only via Calcite path).
Calcite Planning `core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`	Added `visitMvExpand`, `extractFieldName(Field)` helper, and extended `buildExpandRelNode(...)` to accept `@Nullable Integer perDocLimit`; propagate per-doc limit into right-side plan and validate field types (missing -> empty, non-array -> semantic error).
DSL / Parser `core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java`, `ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java`	Added `mvexpand(...)` DSL helper; parser visitor `visitMvexpandCommand` constructs `MvExpand` from field + optional limit.
PPL Grammar & Lexer `ppl/src/main/antlr/OpenSearchPPLLexer.g4`, `ppl/src/main/antlr/OpenSearchPPLParser.g4`	Added `MVEXPAND` token and `mvexpandCommand` rule supporting optional `limit=<int>`; integrated into `commands`.
PPL Anonymizer `ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java`	Added `visitMvExpand` to mask target field and to mask optional limit in anonymized PPL strings.
Function Planning Helpers `core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java`	Registered an additional `INTERNAL_ITEM` signature with permissive first-arg type (IGNORE) to allow planning when element type is undefined.
Docs & Indexing `docs/user/ppl/cmd/mvexpand.md`, `docs/user/ppl/index.md`, `docs/category.json`, `docs/user/dql/metadata.rst`	New mvexpand docs and index entry; category registration; updated example metadata.
Doctest Resources `doctest/test_data/mvexpand_logs.json`, `doctest/test_mapping/mvexpand_logs.json`, `doctest/test_docs.py`	Added newline-delimited test data, mapping, and registered it in `TEST_DATA`.
Integration Tests & Resources `integ-test/src/test/java/.../CalciteMvExpandCommandIT.java`, `integ-test/.../CalciteExplainIT.java`, `integ-test/.../CalciteNoPushdownIT.java`, `integ-test/src/test/resources/*`, `integ-test/src/test/resources/mvexpand_edge_cases.json`, `mvexpand_edge_cases_mapping.json`	New comprehensive integration test class with many test cases; added explain YAMLs, edge-case dataset and mapping; registered index enum constant.
PPL Planner Tests `ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java`, `ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java`	New Calcite-based planner tests and updated anonymizer tests covering mvexpand behaviors and limit masking.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant PPLParser
    participant AstBuilder
    participant Analyzer
    participant CalcitePlanner
    participant Execution

    Client->>PPLParser: "source | mvexpand field [limit=n]"
    PPLParser->>AstBuilder: parse mvexpandCommand
    AstBuilder->>AstBuilder: construct MvExpand(field, optional limit)
    AstBuilder->>Analyzer: provide UnresolvedPlan with MvExpand
    Analyzer->>CalcitePlanner: visitMvExpand (throws only-for-calcite -> Calcite path used)
    CalcitePlanner->>CalcitePlanner: resolve field (extractFieldName), validate type
    alt field missing
        CalcitePlanner->>CalcitePlanner: produce EMPTY 0-row result while preserving schema
    else field non-array
        CalcitePlanner->>Client: throw SemanticCheckException
    else array OK
        CalcitePlanner->>CalcitePlanner: buildExpandRelNode(..., perDocLimit)
        CalcitePlanner->>Execution: produce physical plan (with optional right-side limit)
    end
    Execution->>Client: return expanded rows

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pay extra attention to:
- CalciteRelNodeVisitor: per-document limit propagation and correct insertion of the limit on the expansion (right) branch.
- Field resolution and type validation (missing vs null vs non-array) and associated SemanticCheckException behavior.
- Parser/grammar changes for optional limit parsing and AST construction in AstBuilder.
- PPLFuncImpTable operator registration to ensure it doesn't unintentionally relax typing across other paths.
- Integration tests and test resources for correctness and potential flakiness (bulk insert/setup/teardown).

Suggested labels

feature

Suggested reviewers

LantaoJin
penghuo
anirudha
kavithacm
derek-ho
joshuali925
GumpacG
Swiddis

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.04% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Title check	❓ Inconclusive	The title 'Mvexpand feature' is vague and generic, using non-descriptive terminology that doesn't convey specific implementation details about the mvexpand command being added.	Consider a more descriptive title like 'Add native mvexpand command to PPL for expanding multivalue fields' to clearly communicate the main change.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description check	✅ Passed	The description comprehensively explains the mvexpand feature being added, including key features, implementation details, edge case handling, and relates it to the resolved issue #4439.
Linked Issues check	✅ Passed	The PR successfully implements all primary coding objectives from issue #4439: mvexpand command with field expansion, optional limit parameter, edge case handling for empty/null/large arrays, and comprehensive documentation and testing.
Out of Scope Changes check	✅ Passed	All changes are directly related to implementing the mvexpand feature: AST nodes, parsing, Calcite integration, tests, test data, and documentation. No unrelated modifications detected.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 34db739 and 602358e.

📒 Files selected for processing (1)

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (8)

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

655-664: Consider adding JavaDoc for consistency.

The method follows the established visitor pattern and correctly anonymizes both the field name and optional limit. However, other visitor methods in this class lack JavaDoc as well, so this is consistent with the existing style.

The implementation correctly:

Masks the field name with MASK_COLUMN

Masks the limit value with MASK_LITERAL when present

Processes the child node following the same pattern as visitExpand
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java (2)
17-29: LGTM - Well-structured AST node following established patterns.

The MvExpand AST node correctly extends UnresolvedPlan and follows the same patterns as other AST nodes like Expand. The use of @Nullable for the optional limit parameter is appropriate, and the Lombok annotations provide consistent behavior.

Consider adding a null-check for field in the constructor to fail fast if an invalid AST is constructed:
 public MvExpand(Field field, @Nullable Integer limit) {
+   if (field == null) {
+     throw new IllegalArgumentException("Field cannot be null for mvexpand");
+   }
   this.field = field;
   this.limit = limit;
 }
31-45: Methods follow existing AST patterns.

The attach(), getChild(), and accept() methods correctly implement the UnresolvedPlan contract. The use of ImmutableList for getChild() is consistent with the codebase.

As per coding guidelines, consider adding brief JavaDoc for public methods, particularly attach() and getChild() to document their behavior.
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (2)

43-82: TableWithArray duplicates existing code.

The comment on line 42 acknowledges this mirrors CalcitePPLExpandTest.TableWithArray. Consider extracting this to a shared test utility class to reduce duplication and ensure consistency.

97-163: Tests are smoke tests only - consider adding assertions.

Several tests (lines 124-163) only verify that planning succeeds without asserting expected logical plans or behavior. While smoke tests are valuable for regression detection, consider adding at least one assertion per test to verify correct behavior.

For example, testMvExpandEmptyOrNullArray (line 131-135) could verify the plan handles the null filter appropriately.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
61-70: Exception handling in cleanup may mask issues.

The cleanupAfterEach method catches and ignores all exceptions, which could hide legitimate cleanup failures or resource leaks during test debugging.
 @AfterEach
 public void cleanupAfterEach() throws Exception {
-  try {
-    deleteIndexIfExists(INDEX + "_not_array");
-    deleteIndexIfExists(INDEX + "_missing_field");
-    deleteIndexIfExists(INDEX + "_limit_test");
-    deleteIndexIfExists(INDEX + "_int_field");
-  } catch (Exception ignored) {
-  }
+  deleteIndexIfExists(INDEX + "_not_array");
+  deleteIndexIfExists(INDEX + "_missing_field");
+  deleteIndexIfExists(INDEX + "_limit_test");
+  deleteIndexIfExists(INDEX + "_int_field");
 }
The deleteIndexIfExists method already handles 404 errors gracefully, so the outer try-catch is unnecessary.

340-360: bulkInsert has unused ID parsing logic.

The bulkInsert method includes logic to parse custom IDs using | as a delimiter (lines 346-353), but none of the test calls use this format. All documents are inserted with auto-generated IDs. Consider simplifying if this feature isn't needed:
 private static void bulkInsert(String index, String... docs) throws IOException {
   StringBuilder bulk = new StringBuilder();
   int nextAutoId = 1;
   for (String doc : docs) {
-    String id;
-    String json;
-    if (doc.contains("|")) {
-      String[] parts = doc.split("\\|", 2);
-      id = parts[0];
-      json = parts[1];
-    } else {
-      id = String.valueOf(nextAutoId++);
-      json = doc;
-    }
+    String id = String.valueOf(nextAutoId++);
+    String json = doc;
     bulk.append("{\"index\":{\"_id\":").append(id).append("}}\n");
     bulk.append(json).append("\n");
   }
Alternatively, if this is for future use, add a comment explaining the intended use case.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
2858-2874: Missing field handling creates schema with NULL literal of untyped nature.

When a field is missing, the code adds relBuilder.literal(null) (line 2865) which creates a NULL literal without explicit type. This could cause type inference issues downstream. Consider using a typed null:
-      // Add NULL for missing field
-      projects.add(relBuilder.literal(null));
+      // Add typed NULL for missing field (VARCHAR as a reasonable default for unknown fields)
+      projects.add(
+          context.rexBuilder.makeNullLiteral(
+              context.rexBuilder.getTypeFactory().createSqlType(SqlTypeName.VARCHAR)));
       names.add(fieldName);
This ensures the NULL has a consistent type in the resulting schema.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 90ee47c and 125cf3b.

📒 Files selected for processing (27)

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2 hunks)
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java (2 hunks)
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java (1 hunks)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (6 hunks)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1 hunks)
docs/category.json (1 hunks)
docs/user/dql/metadata.rst (3 hunks)
docs/user/ppl/cmd/mvexpand.rst (1 hunks)
docs/user/ppl/index.rst (1 hunks)
doctest/test_data/mvexpand_logs.json (1 hunks)
doctest/test_docs.py (1 hunks)
doctest/test_mapping/mvexpand_logs.json (1 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1 hunks)
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml (1 hunks)
integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml (1 hunks)
integ-test/src/test/resources/mvexpand_edge_cases.json (1 hunks)
integ-test/src/test/resources/mvexpand_edge_cases_mapping.json (1 hunks)
ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1 hunks)
ppl/src/main/antlr/OpenSearchPPLParser.g4 (3 hunks)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2 hunks)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (2 hunks)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1 hunks)
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)

🧰 Additional context used

📓 Path-based instructions (7)

**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

Check for proper JavaDoc on public classes and methods

Flag redundant comments that restate obvious code

Ensure methods are under 20 lines with single responsibility

Verify proper error handling with specific exception types

Check for Optional usage instead of null returns

Validate proper use of try-with-resources for resource management

Files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

Check test naming follows conventions (*Test.java for unit, *IT.java for integration)

Ensure tests are independent and don't rely on execution order

Validate meaningful test data that reflects real-world scenarios

Check for proper cleanup of test resources

Files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

Check AST generation for new syntax

Ensure corresponding AST builder classes are updated

Validate edge cases and boundary conditions

Files:

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)

Check tests can be run with ./gradlew :integ-test:integTest

Ensure proper test data setup and teardown

Validate end-to-end scenario coverage

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

Verify SQL generation and optimization paths

Document any Calcite-specific workarounds

Test compatibility with Calcite version constraints

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

🧠 Learnings (8)

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

📚 Learning: 2025-12-11T05:27:39.831Z

Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.831Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml
integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

docs/user/ppl/index.rst

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes

Applied to files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java

🧬 Code graph analysis (1)

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)

integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)

PPLIntegTestCase (36-415)

integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)

Index (218-262)

🪛 Biome (2.1.2)

docs/category.json

[error] 77-77: unexpected character <

(parse)

[error] 77-77: unexpected character <

(parse)

[error] 77-77: unexpected character <

(parse)

[error] 77-77: unexpected character <

(parse)

[error] 77-77: unexpected character <

(parse)

[error] 77-77: unexpected character <

(parse)

[error] 77-77: String values must be double quoted.

(parse)

[error] 77-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 78-78: unexpected character =

(parse)

[error] 79-79: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 79-79: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 79-128: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 128-129: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 128-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: unexpected character >

(parse)

[error] 129-129: Minus must be followed by a digit

(parse)

integ-test/src/test/resources/mvexpand_edge_cases.json

[error] 1-2: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 2-3: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 6-7: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 7-8: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 8-9: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 9-10: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 10-11: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 11-12: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 12-13: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 13-14: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 14-15: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 15-16: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 16-17: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 17-18: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

doctest/test_data/mvexpand_logs.json

[error] 1-2: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 2-3: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 3-4: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 4-5: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

[error] 5-6: End of file expected

Use an array for a sequence of values: [1, 2]

(parse)

🔇 Additional comments (29)

docs/user/dql/metadata.rst (1)

38-38: LGTM! Documentation properly reflects the new test index.

The row count update and the addition of the mvexpand_logs table entry are consistent with the mvexpand feature addition.

Also applies to: 51-51

docs/user/ppl/index.rst (1)

67-67: LGTM! Proper documentation for the new mvexpand command.

The command entry correctly identifies it as experimental (since 3.4) and provides an accurate description that aligns with the PR objectives.

doctest/test_docs.py (1)

60-61: LGTM! Proper test data registration.

The mvexpand_logs test data entry is correctly added to the TEST_DATA dictionary with appropriate syntax.

ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1)

53-53: LGTM! Lexer token properly added.

The MVEXPAND token is correctly defined and positioned alphabetically after the EXPAND token, following the established pattern for command keywords.

integ-test/src/test/resources/mvexpand_edge_cases_mapping.json (1)

1-8: LGTM! Test mapping is properly defined.

The mapping defines username as a keyword field and skills as a nested field, which are appropriate for testing mvexpand edge cases.

integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1)

35-35: LGTM! Test suite properly integrated.

CalciteMvExpandCommandIT.class is correctly added to the no-pushdown test suite in alphabetical order, following the established pattern.

integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml (1)

1-7: LGTM! Expected explain output is well-structured.

The YAML defines the expected logical and physical plans for mvexpand, showing proper projection of the VALUE field alongside the original fields.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

46-46: LGTM! Test data properly initialized.

The MVEXPAND_EDGE_CASES index is correctly loaded in the init method, following the established pattern for test setup.

doctest/test_mapping/mvexpand_logs.json (1)

1-24: LGTM!

The mapping structure is well-defined for testing mvexpand behavior with nested fields. The combination of keyword and integer nested types provides good coverage for type inference testing.

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml (1)

1-7: LGTM!

The explain output structure correctly represents a Calcite plan for mvexpand operations, including the VALUE projection that represents the expanded field.

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)

456-458: LGTM!

The visitor method follows the established pattern in AbstractNodeVisitor, correctly delegating to visitChildren for default traversal behavior.

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)

848-858: LGTM!

The test coverage for mvexpand anonymization is appropriate, covering both basic usage and the limit parameter. The tests follow the existing pattern and naming conventions.

integ-test/src/test/resources/mvexpand_edge_cases.json (1)

1-18: LGTM - Comprehensive edge case coverage.

The test data covers a good range of edge cases for mvexpand validation:

Normal multivalue expansion

Single values

Empty and null arrays

Missing fields

Heterogeneous nested objects

Duplicates

Large arrays

Note: The static analysis errors from Biome are false positives. This is valid newline-delimited JSON (NDJSON) format used for bulk operations in OpenSearch, not a standard JSON array.

ppl/src/main/antlr/OpenSearchPPLParser.g4 (3)

81-81: LGTM!

The mvexpandCommand is properly integrated into the commands alternatives, following the established pattern in the grammar.

120-120: LGTM!

The MVEXPAND token is correctly added to commandName, maintaining consistency with other command definitions.

532-534: LGTM!

The grammar rule for mvexpandCommand is well-defined with:

Required field expression to expand

Optional limit parameter for restricting expanded values per document

The syntax follows existing patterns in the grammar and correctly uses tokens that should already be defined in the lexer.

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1)

702-706: LGTM!

The new MVEXPAND_EDGE_CASES enum constant follows the established pattern used by other test indices in this file (e.g., TIME_TEST_DATA2, TELEMETRY). The naming conventions and constructor usage are consistent.

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

82-82: LGTM!

Import correctly placed among the tree imports.

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2)

81-81: LGTM!

Import correctly placed alphabetically among the tree imports.

705-708: LGTM!

The implementation correctly follows the established pattern for Calcite-only commands (e.g., visitExpand, visitBin). Throwing getOnlyForCalciteException("MvExpand") properly routes MvExpand processing to the Calcite engine as intended by this feature.

docs/user/ppl/cmd/mvexpand.rst (1)

1-29: LGTM!

The documentation is well-structured with clear description, syntax definition, and helpful notes about doctest stability. The command syntax and parameter descriptions are accurate and match the implementation.

doctest/test_data/mvexpand_logs.json (1)

1-6: LGTM!

The test data file correctly uses NDJSON (Newline-Delimited JSON) format, which is the expected format for OpenSearch bulk data loading. The static analysis errors from Biome are false positives—Biome expects standard JSON but NDJSON is intentionally one JSON object per line.

The test cases comprehensively cover the edge cases documented in mvexpand.rst:

basic: Multi-value array with 3 elements

empty: Empty array

null: Null field value

single: Single-element array

ids: Numeric array for limit testing

missing: Document without the target field

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)

23-59: Comprehensive test setup with good edge case coverage.

The test initialization creates a well-structured nested mapping and populates diverse test documents covering various edge cases. The test data design is thoughtful, including documents for single elements, empty arrays, null values, partial fields, mixed shapes, duplicates, and heterogeneous types.

227-251: Well-designed limit parameter test.

The testMvexpandLimitParameter test properly validates the limit=3 syntax with a document containing 5 elements, verifying both the row count and specific row values.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (4)

2836-2899: Well-implemented visitMvExpand with proper edge case handling.

The implementation correctly handles three cases:

Missing field → Returns empty result while preserving schema (lines 2852-2874)

Non-array type → Throws SemanticCheckException with clear message (lines 2877-2885)

Valid array → Delegates to buildExpandRelNode with optional limit (lines 2887-2898)

The limit validation (lines 2892-2895) correctly rejects non-positive values.

2901-2913: extractFieldName helper is well-implemented.

The method correctly handles QualifiedName expressions by joining parts with dots, with a sensible fallback that removes backticks for other expression types.

3158-3188: buildExpandRelNode correctly applies per-document limit.

The updated method signature accepts an optional perDocLimit parameter and applies it at the right point in the pipeline—after uncollect but before building the right node for the correlate join (lines 3185-3187). This correctly limits expanded values per source document.

847-852: Consistent update to buildExpandRelNode calls.

The call in visitPatterns is correctly updated to pass null for the new perDocLimit parameter, maintaining backward compatibility.

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)

1101-1108: LGTM. The fallback registration for INTERNAL_ITEM properly supports MVEXPAND with undefined element types.

The design is sound:

Operator resolution tries the specific type checker first (ARRAY|INTEGER or MAP|ANY), then the permissive fallback (IGNORE|CHARACTER)

The SqlTypeFamily.IGNORE family explicitly skips type validation at planning time (PPLTypeChecker.java:81-83), allowing planning to succeed when element types are unknown

Execution defers to Calcite's ITEM operator, which handles the actual runtime validation

The CHARACTER-only constraint for the second argument correctly targets the string-key access pattern used by MVEXPAND; integer-index access on undefined types is intentionally excluded

core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java

docs/category.json

docs/user/ppl/cmd/mvexpand.rst

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1)
131-170: Consider adding assertions to verify correctness beyond planning success.

While smoke tests are valuable for ensuring planning doesn't throw exceptions, most tests here (lines 131-170) only verify that getRelNode(ppl) succeeds without validating the generated plan or SQL output. In contrast, testMvExpandBasic (lines 97-115) demonstrates stronger validation by asserting on expected logical plans and Spark SQL translation.

Consider adding similar assertions to at least 2-3 of these edge-case tests (e.g., testMvExpandEmptyOrNullArray, testMvExpandProjectNested) to ensure they not only plan successfully but also generate correct plans.

Example for testMvExpandEmptyOrNullArray:
@Test
public void testMvExpandEmptyOrNullArray() {
  String ppl = "source=DEPT | where isnull(EMPNOS) | mvexpand EMPNOS";
  RelNode root = getRelNode(ppl);
  // Verify filter is present in logical plan
  String logicalPlan = root.explain();
  assertTrue(logicalPlan.contains("LogicalFilter"), 
      "Expected filter for isnull(EMPNOS) in plan");
  assertTrue(logicalPlan.contains("LogicalCorrelate"), 
      "Expected correlation for mvexpand in plan");
}

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00c990f and 44814ab.

📒 Files selected for processing (4)

docs/category.json (2 hunks)
docs/user/ppl/cmd/mvexpand.rst (1 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

docs/user/ppl/cmd/mvexpand.rst

🧰 Additional context used

📓 Path-based instructions (7)

**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

Check for proper JavaDoc on public classes and methods

Flag redundant comments that restate obvious code

Ensure methods are under 20 lines with single responsibility

Verify proper error handling with specific exception types

Check for Optional usage instead of null returns

Validate proper use of try-with-resources for resource management

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Verify integration tests are in correct module (integ-test/)

Check tests can be run with ./gradlew :integ-test:integTest

Ensure proper test data setup and teardown

Validate end-to-end scenario coverage

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify test coverage for new business logic

Check test naming follows conventions (*Test.java for unit, *IT.java for integration)

Ensure tests are independent and don't rely on execution order

Validate meaningful test data that reflects real-world scenarios

Check for proper cleanup of test resources

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

Verify SQL generation and optimization paths

Document any Calcite-specific workarounds

Test compatibility with Calcite version constraints

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

Check AST generation for new syntax

Ensure corresponding AST builder classes are updated

Validate edge cases and boundary conditions

Files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

🧠 Learnings (3)

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: All tests must pass before merge; fix or ask for guidance if blocked

Applied to files:

docs/category.json

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

🧬 Code graph analysis (2)

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)

Index (218-262)

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (2)

legacy/src/main/java/org/opensearch/sql/legacy/executor/format/Schema.java (1)

Schema (17-170)

legacy/src/main/java/org/opensearch/sql/legacy/parser/SqlParser.java (1)

SqlParser (58-628)

🔇 Additional comments (6)

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2)

46-46: LGTM!

The index loading for MVEXPAND_EDGE_CASES follows the established pattern and properly sets up test data for the new mvexpand test.

321-327: Good fix on the comment; test implementation looks solid.

The misleading comment from the previous review has been corrected. The test properly validates the explain plan for the mvexpand command and follows the established testing pattern in this file.

Note: The limit parameter for mvexpand is covered in integration tests in CalciteMvExpandCommandIT.java (e.g., mvexpand skills limit=3), so no additional coverage is needed in this explain plan validation test.

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)

43-82: LGTM!

The TableWithArray implementation correctly provides a Calcite Table with an INTEGER array field (EMPNOS) for testing. JavaDoc clearly explains its purpose as a test fixture.

84-95: LGTM!

The config method properly sets up the test schema with the DEPT table containing array fields, and configures heuristic join order planning as mentioned in the PR objectives.

97-115: LGTM!

Excellent test that validates both the Calcite logical plan structure and Spark SQL translation for basic mvexpand. The expected plan correctly shows the correlation and uncollect pattern.

117-129: Previous review comment addressed - tests now correctly named.

The test previously flagged as misleading has been properly addressed:

Renamed to testMvExpandWithHeadCommand to reflect it tests the head command

Added new testMvExpandWithLimitParameter to test the actual mvexpand limit syntax (limit=2)

Both tests validate the planning succeeds for their respective scenarios.

docs/category.json

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

docs/user/ppl/cmd/mvexpand.md (1)

7-9: Add language specifier to syntax code block for consistency.

The fenced code block at line 7 is missing a language identifier, unlike the other example code blocks in the file (lines 24, 40, 54, 64, 77) which use text. This creates a minor formatting inconsistency.

 ## Syntax
-```
+```text
 mvexpand <field> [limit=<int>]


</blockquote></details>

</blockquote></details>

<details>
<summary>📜 Review details</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: CHILL

**Plan**: Pro

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 44814ab0922099584138430aab7a0e704d0c65a9 and f9dd692b38579ba69040a53247c6de3600cc2b6d.

</details>

<details>
<summary>📒 Files selected for processing (3)</summary>

* `docs/category.json` (2 hunks)
* `docs/user/ppl/cmd/mvexpand.md` (1 hunks)
* `docs/user/ppl/index.rst` (1 hunks)

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (1)</summary>

* docs/user/ppl/index.rst

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧠 Learnings (2)</summary>

<details>
<summary>📓 Common learnings</summary>

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor for Calcite integration


</details>
<details>
<summary>📚 Learning: 2025-12-02T17:27:55.938Z</summary>

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: All tests must pass before merge; fix or ask for guidance if blocked


**Applied to files:**
- `docs/category.json`

</details>

</details><details>
<summary>🪛 markdownlint-cli2 (0.18.1)</summary>

<details>
<summary>docs/user/ppl/cmd/mvexpand.md</summary>

7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

</details>

<details>
<summary>🔇 Additional comments (2)</summary><blockquote>

<details>
<summary>docs/category.json (1)</summary><blockquote>

`30-30`: **✅ mvexpand entry is correctly placed and formatted.**

The mvexpand documentation entry is now properly located in the `ppl_cli_calcite` section with the `.md` extension, maintaining alphabetical ordering between `multisearch.md` and `replace.md`. Previous merge conflict issues have been resolved, and the JSON is syntactically valid.

</blockquote></details>
<details>
<summary>docs/user/ppl/cmd/mvexpand.md (1)</summary><blockquote>

`1-98`: **✅ Documentation is comprehensive and covers edge cases well.**

The mvexpand documentation clearly describes the command purpose, provides syntax with required and optional parameters, and includes five well-structured examples covering key scenarios: basic expansion, limit parameter, empty arrays, null arrays, single-value arrays, and missing fields. This aligns with the PR objectives and provides users with clear guidance on command behavior.

</blockquote></details>

</blockquote></details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9dd692 and 2464675.

📒 Files selected for processing (1)

docs/user/ppl/cmd/mvexpand.md (1 hunks)

🧰 Additional context used

🪛 markdownlint-cli2 (0.18.1)

docs/user/ppl/cmd/mvexpand.md

7-7: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)

docs/user/ppl/cmd/mvexpand.md (1)

1-125: Comprehensive documentation with good edge-case coverage.

The documentation is well-structured and covers the key mvexpand requirements from the PR objectives: basic expansion, optional limit parameter, and important edge cases (empty arrays, null arrays, single-element arrays, missing fields). The examples use clear tabular output formatting and stable doctest methodology (filtering by case=<name>).

docs/user/ppl/cmd/mvexpand.md

dai-chen · 2025-12-11T20:08:55Z

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java

+      // Some datasets (or Calcite's type inference) may give the element an UNDEFINED type.
+      // Accept a "ignore" first-argument family so INTERNAL_ITEM(elem, 'key') can still be planned
+      // and resolved at runtime (fallback semantics handled at execution side). - Used in MVEXPAND
+      registerOperator(


Not sure if missed some context. What is this used for? Please point me to comment related if any.

This code defines a fallback type-checker for INTERNAL_ITEM to handle cases where the element type is undefined at planning time. It ensures the operator accepts composite types (IGNORE, CHARACTER) for resilient type validation during query execution.

Some datasets (or Calcite's type inference) may give the element an UNDEFINED type

Could you clarify a little more which query/test dataset you have this problem? If any previous comment related, please let me know. Thanks!

Without that code, tests fail because the INTERNAL_ITEM operator requires specific argument types ({ARRAY, INTEGER} or {STRUCT, ANY}) for type validation during query execution. When Calcite assigns an UNDEFINED type at planning time, the absence of fallback handling in registerOperator causes validation to fail, triggering ExpressionEvaluationException.

With that code, the fallback type-checker accepts UNDEFINED types as valid arguments (IGNORE, CHARACTER), allowing queries to pass type validation, execute correctly, and produce expected results in the tests.

Example - I commented that code and ran my IT below is the failure

testMvexpandDuplicate

org.opensearch.client.ResponseException: method [POST], host [http://127.0.0.1:65246], URI [/_plugins/_ppl], status line [HTTP/1.1 400 Bad Request]
{
"error": {
"reason": "Invalid Query",
"details": "INTERNAL_ITEM function expects {[ARRAY,INTEGER]|[STRUCT,ANY]}, but got [UNDEFINED,STRING]",
"type": "ExpressionEvaluationException"
},
"status": 400
}

docs/user/ppl/index.rst

Signed-off-by: Srikanth Padakanti <srikanth29.9@gmail.com>

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)

94-94: Add JavaDoc for visitor method.

Per coding guidelines, public methods must have JavaDoc documentation with @param and @return tags. A previous review already flagged this same issue.

Based on coding guidelines, all public classes and methods must have proper JavaDoc.

Also applies to: 872-878

🧹 Nitpick comments (2)

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (2)
3107-3182: Consider extracting the missing field handling logic to improve readability.

The visitMvExpand method is ~75 lines, exceeding the 50-line guideline for this file. The missing field handling (lines 3135-3156) is a self-contained 22-line block that could be extracted into a helper method like buildEmptyRelationWithField(RelBuilder, RelDataType, String). This would improve readability and maintainability.

The core logic is sound:

Missing field correctly returns an EMPTY relation (0 rows) while preserving schema

Non-array types properly throw SemanticCheckException

Limit validation and array expansion are handled correctly
🔎 Suggested refactoring to extract complex logic

Extract the missing field handling into a helper method:
/**
 * Builds an empty relation (0 rows) while preserving schema and adding the missing field as NULL.
 */
private void buildEmptyRelationWithField(
    RelBuilder relBuilder, RelDataType rowType, String missingFieldName) {
  List<RelDataTypeField> fields = rowType.getFieldList();
  List<RexNode> projects = new ArrayList<>();
  List<String> names = new ArrayList<>();

  // Keep existing fields
  for (RelDataTypeField f : fields) {
    projects.add(relBuilder.field(f.getIndex()));
    names.add(f.getName());
  }

  // Add NULL for missing field
  projects.add(relBuilder.literal(null));
  names.add(missingFieldName);

  relBuilder.project(projects, names);
  // Return 0 rows
  relBuilder.filter(relBuilder.literal(false));
}
Then simplify visitMvExpand:
     // 2A. Missing field → true EMPTY relation (no schema, no rows)
     if (matched == null) {
-      // Schema must include the missing field, even if no rows returned.
-      List<RelDataTypeField> fields = rowType.getFieldList();
-      List<RexNode> projects = new ArrayList<>();
-      List<String> names = new ArrayList<>();
-
-      // Keep existing fields
-      for (RelDataTypeField f : fields) {
-        projects.add(relBuilder.field(f.getIndex()));
-        names.add(f.getName());
-      }
-
-      // Add NULL for missing field
-      projects.add(relBuilder.literal(null));
-      names.add(fieldName);
-
-      relBuilder.project(projects, names);
-
-      // Now return 0 rows
-      relBuilder.filter(relBuilder.literal(false));
-
+      buildEmptyRelationWithField(relBuilder, rowType, fieldName);
       return relBuilder.peek();
     }
3184-3196: Consider making the fallback logic more explicit.

The extractFieldName method correctly handles QualifiedName by joining parts. However, the fallback toString().replace("", "")` is somewhat fragile as it:

Assumes backticks are the only wrapper characters

Depends on UnresolvedExpression.toString() implementation details

Could incorrectly strip backticks that are part of the actual field name

Consider adding a comment explaining the fallback behavior or handling other known UnresolvedExpression subtypes explicitly.
🔎 Example: More explicit fallback handling
 private String extractFieldName(Field f) {
   UnresolvedExpression inner = f.getField();

   if (inner instanceof QualifiedName) {
     List<String> parts = ((QualifiedName) inner).getParts();
     if (!parts.isEmpty()) {
       return String.join(".", parts);
     }
   }

-  // Fallback - return clean string
+  // Fallback: Remove backtick delimiters that may be added by the parser.
+  // This assumes field names don't contain backticks as part of their actual name.
   return inner.toString().replace("`", "");
 }

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0f86c52 and 32d3867.

📒 Files selected for processing (12)

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (2 hunks)
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (2 hunks)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (6 hunks)
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1 hunks)
docs/category.json (2 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java (1 hunks)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (2 hunks)
ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1 hunks)
ppl/src/main/antlr/OpenSearchPPLParser.g4 (3 hunks)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (2 hunks)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (2 hunks)
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (5)

docs/category.json
ppl/src/main/antlr/OpenSearchPPLParser.g4
integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java
ppl/src/main/antlr/OpenSearchPPLLexer.g4
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

🧰 Additional context used

📓 Path-based instructions (9)

**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

Flag classes >500 lines as needing organization review

Check for dead code, unused imports, and unused variables

Identify code reuse opportunities across similar implementations

Assess holistic maintainability - is code easy to understand and modify?

Flag code that appears AI-generated without sufficient human review

Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

Check for proper JavaDoc on public classes and methods

Flag redundant comments that restate obvious code

Ensure proper error handling with specific exception types

Check for Optional usage instead of null returns

Validate proper use of try-with-resources for resource management

Files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java

**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

Check AST generation for new syntax

Ensure corresponding AST builder classes are updated

Validate edge cases and boundary conditions

Files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java

core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

Public methods MUST have JavaDoc with @param, @return, and @throws

Follow existing function implementation patterns in the same package

New expression functions should follow ExpressionFunction interface patterns

Validate function naming follows project conventions (camelCase)

Files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java

**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

Verify RelNode visitor implementations are complete

Check RexNode handling follows project conventions

Validate SQL generation is correct and optimized

Ensure Calcite version compatibility

Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

Document any Calcite-specific workarounds

Test compatibility with Calcite version constraints

Files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

Suggest extracting complex logic into helper methods

Check for code organization and logical grouping

Validate all RelNode types are handled

Files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

core/src/main/java/org/opensearch/sql/ast/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible

Follow visitor pattern for AST traversal

Ensure proper toString() implementation for debugging

Files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java

**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java

**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

Check boundary condition tests (min/max values, empty inputs)

Validate error condition tests (invalid inputs, exceptions)

Ensure multi-document tests for per-document operations

Flag smoke tests without meaningful assertions

Check test naming follows pattern: test

Verify test data is realistic and covers edge cases

Verify test coverage for new business logic

Ensure tests are independent and don't rely on execution order

Validate meaningful test data that reflects real-world scenarios

Check for proper cleanup of test resources

Files:

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java

core/src/main/java/org/opensearch/sql/expression/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/expression/**/*.java: - New expression implementations must follow existing patterns

Type handling must be consistent with project type system

Error handling must use appropriate exception types

Null handling must be explicit and documented

Files:

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java

🧠 Learnings (5)

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes

Applied to files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax

Applied to files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java

📚 Learning: 2025-12-11T05:27:39.856Z

Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

🧬 Code graph analysis (1)

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

legacy/src/main/java/org/opensearch/sql/legacy/utils/StringUtils.java (1)

StringUtils (17-117)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: WhiteSource Security Check

🔇 Additional comments (9)

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)

71-71: LGTM! MvExpand visitor follows established pattern.

The new visitMvExpand method correctly follows the visitor pattern established in this class, delegating to visitChildren as expected.

Also applies to: 458-460

ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

85-85: LGTM! MvExpand anonymization correctly implemented.

The visitMvExpand method properly anonymizes the mvexpand command by masking the field and limit values, following the established anonymization patterns in this class.

Also applies to: 658-667

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (1)

83-83: LGTM! MvExpand correctly designated as Calcite-only.

The visitMvExpand override properly throws a Calcite-only exception, consistent with other commands like Expand, Flatten, and Reverse that follow the same pattern in this file.

Also applies to: 717-720

ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java (1)

866-875: LGTM! Good test coverage for mvexpand anonymization.

The new tests cover both the basic mvexpand command and the variant with a limit parameter, following the established test patterns in this file.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (4)

94-94: LGTM! Imports are necessary for MVExpand support.

The new imports for QualifiedName and MvExpand are correctly added to support the MVExpand command implementation.

Also applies to: 126-126

849-854: LGTM! Backward compatible integration.

The visitPatterns method correctly passes null for perDocLimit, maintaining the existing expand behavior without per-document limits.

3102-3102: LGTM! Backward compatible integration.

The visitExpand method correctly passes null for perDocLimit, maintaining the existing expand behavior without per-document limits.

3441-3494: LGTM! Clean integration of per-document limit support.

The buildExpandRelNode changes are well-designed:

The @Nullable Integer perDocLimit parameter clearly indicates optional behavior

The limit is correctly applied after uncollect but before building the right node (lines 3468-3470), ensuring it limits values per document

Backward compatibility is maintained for existing Expand and Patterns operations

The integration follows existing patterns in the codebase

Based on learnings: This follows existing patterns in CalciteRelNodeVisitor for Calcite integration.

core/src/main/java/org/opensearch/sql/expression/function/PPLFuncImpTable.java (1)

1111-1118: The IGNORE type family fallback is an established design pattern, not a risk.

The second registration for INTERNAL_ITEM correctly implements a designed fallback mechanism:

Intentional design: The SqlTypeFamily.IGNORE pattern is explicitly handled in PPLTypeChecker.java (line 81) and already used for other operators (IS_NULL, IS_NOT_NULL, IS_PRESENT), confirming it's a well-established, deliberate pattern—not a permissive edge case.

Proper registration order: The first registration (lines 1103-1110) validates specific types (ARRAY+INTEGER, MAP+ANY), while the fallback (lines 1115-1118) only activates when the first doesn't match, enabling runtime resolution for UNDEFINED element types as required by MVEXPAND.

Test coverage exists: Integration tests (CalciteMvExpandCommandIT.java) and planning tests (CalcitePPLMvExpandTest.java) cover array expansion scenarios including edge cases (empty arrays, null values, primitives). The testInternalItemAccessOnStructs test in CalciteExplainIT.java validates INTERNAL_ITEM access patterns on dynamic structures.

The implementation is sound and follows established patterns for deferred type resolution.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32d3867 and 07509ae.

📒 Files selected for processing (1)

docs/user/ppl/index.md (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

docs/user/ppl/index.md

🔇 Additional comments (2)

docs/user/ppl/index.md (2)

81-83: No duplicate entries found—all three commands appear once with unique entries.

The addtotals, addcoltotals, and mvexpand commands each appear only once in the documentation table at their respective lines (81–83). The mvexpand.md documentation file exists and is properly formatted. The table entries are correctly formatted with appropriate version and status information.

Likely an incorrect or invalid review comment.

83-83: mvexpand.md documentation file exists and is appropriately structured.

The mvexpand.md file is present at docs/user/ppl/cmd/mvexpand.md with comprehensive documentation including clear syntax definition, five detailed examples covering normal cases and edge cases (empty arrays, null values, missing fields, limit parameter), and explanatory notes. The documentation follows the established pattern of other PPL command docs.

coderabbitai · 2025-12-18T23:14:07Z

docs/user/ppl/index.md

+| [addtotals command](cmd/addtotals.md) | 3.4 | stable (since 3.4) | Adds row and column values and appends a totals column and row. | 
+| [addcoltotals command](cmd/addcoltotals.md) | 3.4 | stable (since 3.4) | Adds column values and appends a totals row. |
+| [mvexpand command](cmd/mvexpand.md) | 3.4 | experimental (since 3.4) | Expand a multi-valued field into separate documents (one per value). |


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for PPL command checklist and mvexpand implementation files # Check for PPL command checklist if [ -f docs/dev/ppl-commands.md ]; then echo "=== PPL Commands Checklist ===" cat docs/dev/ppl-commands.md else echo "docs/dev/ppl-commands.md not found" fi # Search for mvexpand related files echo "" echo "=== mvexpand implementation files ===" echo "Grammar/Parser/AST files:" fd -H 'mvexpand|Mvexpand|MVExpand' --type f | head -30 echo "" echo "=== Test files ===" fd -H 'mvexpand' --type f | grep -i test echo "" echo "=== Documentation files ===" fd -H 'mvexpand' --type f | grep -i 'doc\|md' echo "" echo "=== Check directory structure ===" git ls-files | grep -E 'mvexpand|ppl' | head -50

Repository: opensearch-project/sql

Length of output: 5306

🏁 Script executed:

#!/bin/bash # Verify detailed checklist compliance for mvexpand echo "=== 1. Lexer/Parser Grammar Files ===" echo "Checking for OpenSearchPPLLexer.g4:" find . -name "*Lexer.g4" -type f echo "" echo "Checking for OpenSearchPPLParser.g4:" find . -name "*Parser.g4" -type f echo "" echo "Searching for 'mvexpand' in grammar files:" fd -H 'Lexer.g4|Parser.g4' --type f -x grep -l 'mvexpand' {} \; echo "" echo "=== 2. AST Implementation ===" echo "Checking for mvexpand AST tree node:" fd -H 'Mvexpand|MVExpand' ppl/src/main/java --type f | grep -i ast echo "" echo "All AST tree classes:" fd -H --type f ppl/src/main/java/org/opensearch/sql/ast/tree | grep -E '\.java$' | grep -i mvexpand echo "" echo "=== 3. Visitor Pattern Implementation ===" echo "Checking AbstractNodeVisitor:" fd -H 'AbstractNodeVisitor.java' --type f echo "" echo "Searching for mvexpand visit methods:" rg 'visit.*[Mm]vexpand|[Mm]vexpand.*visitor' --type java echo "" echo "=== 4. Test Files Verification ===" echo "Unit tests:" fd -H 'CalcitePPLMvExpandTest.java' --type f echo "" echo "Integration tests (pushdown):" fd -H 'CalciteMvExpandCommandIT.java' --type f echo "" echo "Integration tests (non-pushdown):" rg 'CalciteNoPushdownIT.*mvexpand|mvexpand.*CalciteNoPushdownIT' --type java echo "" echo "Anonymizer tests:" rg 'PPLQueryDataAnonymizerTest.*mvexpand|mvexpand.*PPLQueryDataAnonymizerTest' --type java echo "" echo "=== 5. Documentation ===" echo "User documentation exists:" ls -lh docs/user/ppl/cmd/mvexpand.md echo "" echo "=== 6. Index linked ===" echo "Checking if mvexpand is linked in index.md:" grep -n 'mvexpand' docs/user/ppl/index.md

Repository: opensearch-project/sql

Length of output: 2294

🏁 Script executed:

#!/bin/bash # Deep dive verification for remaining checklist items echo "=== 1. Lexer Grammar File - mvexpand keyword ===" rg 'mvexpand|MVEXPAND' ppl/src/main/antlr/OpenSearchPPLLexer.g4 echo "" echo "=== 2. AST Implementation Details ===" echo "Searching for Mvexpand class in ast package:" find . -path '*/ast/tree*' -name '*.java' -type f | xargs grep -l 'class.*Mvexpand\|class.*MVExpand' 2>/dev/null echo "" echo "Checking ppl ast files:" fd -H --type f 'ppl/src/main/java.*ast' | head -30 echo "" echo "=== 3. Visitor Pattern - Analyzer ===" echo "Searching for visitMvexpand in Analyzer:" rg 'visitMvexpand|visit.*Mvexpand' ppl/src/main/java --type java echo "" echo "=== 4. Visitor Pattern - CalciteRelNodeVisitor ===" echo "Searching for visitMvexpand in CalciteRelNodeVisitor:" rg 'visitMvexpand|visit.*Mvexpand' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type java | head -20 echo "" echo "=== 5. Non-pushdown Integration Tests ===" echo "Searching for mvexpand in CalciteNoPushdownIT and similar:" fd -H 'NoPushdown.*\.java' ppl/src integ-test/src --type f | xargs grep -l 'mvexpand' 2>/dev/null echo "" echo "Checking CalciteNoPushdownIT classes:" fd -H 'CalciteNoPushdown' --type f -x grep -l 'mvexpand' {} \; echo "" echo "=== 6. Anonymizer Tests ===" echo "Checking PPLQueryDataAnonymizerTest:" fd -H 'PPLQueryDataAnonymizer' --type f echo "" echo "Searching for anonymizer test file:" rg 'class.*PPLQueryDataAnonymizerTest' --type java echo "" echo "Checking if mvexpand is tested in anonymizer:" fd -H 'PPLQueryDataAnonymizerTest' --type f -x grep -i 'mvexpand' {} \; echo "" echo "=== 7. Explain Tests ===" echo "Checking ExplainIT classes:" fd -H 'ExplainIT|CalciteExplainIT' --type f | head -10 echo "" echo "Searching for mvexpand in explain tests:" rg 'mvexpand' ppl/src integ-test/src --type java | grep -i explain

Repository: opensearch-project/sql

Length of output: 2375

🏁 Script executed:

#!/bin/bash # Final verification for remaining checklist items echo "=== 1. AST Implementation Details ===" echo "Checking AstBuilder visitMvexpandCommand implementation:" rg -A 20 'visitMvexpandCommand' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java | head -25 echo "" echo "=== 2. Visitor Pattern - Analyzer ===" echo "Checking for Analyzer class with visit methods:" fd -H 'Analyzer.java' ppl/src/main/java --type f echo "" echo "Searching for visitor implementations in core/analyzer:" rg 'visitMvexpand|class.*Analyzer' core/src/main/java/org/opensearch/sql/analyzer --type java | head -10 echo "" echo "=== 3. Visitor Pattern - CalciteRelNodeVisitor or similar ===" echo "Searching for all visitor classes in ppl:" fd -H 'Visitor.java' ppl/src/main/java --type f echo "" echo "Checking PPLNodeVisitor:" rg 'PPLNodeVisitor|class.*Visitor' ppl/src/main/java --type java | grep -i 'class' | head -10 echo "" echo "=== 4. Non-pushdown Integration Tests ===" echo "All *IT classes in ppl integ tests:" fd -H '.*IT\.java' ppl/src/test integ-test/src/test --type f | xargs grep -l 'mvexpand' 2>/dev/null echo "" echo "All *IT classes that might test PPL commands:" fd -H '.*CalciteNo.*IT\.java' integ-test/src/test --type f | head -5 echo "" echo "=== 5. NewAddedCommandsIT ===" echo "Searching for NewAddedCommandsIT:" fd -H 'NewAddedCommandsIT' --type f echo "" echo "Checking for mvexpand in that test:" fd -H 'NewAddedCommandsIT' --type f -x grep -i 'mvexpand' {} \; echo "" echo "=== 6. Summary - Check AST and Visitor integration ===" echo "Looking for UnresolvedPlan or command class usage:" rg -B 5 -A 15 'visitMvexpandCommand.*{' ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java | head -30

Repository: opensearch-project/sql

Length of output: 3291

🏁 Script executed:

#!/bin/bash # Find MvExpand class and verify visitor implementations echo "=== 1. MvExpand AST Class ===" echo "Searching for MvExpand class definition:" fd -H 'MvExpand.java' --type f echo "" echo "Checking class content:" fd -H 'MvExpand.java' --type f -x cat {} \; echo "" echo "=== 2. Visitor Pattern in Calcite ===" echo "Searching for visitor implementations in calcite package:" fd -H 'Visitor.java' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type f | head -10 echo "" echo "Checking for ppllocalqueryplanner or related:" fd -H 'PPLLocalQueryPlanner\|LogicalPlanVisitor\|CalciteRelNodeVisitor' --type f | head -10 echo "" echo "Checking for mvexpand visitor support in calcite:" rg 'visitMvexpand\|MvExpand' ppl/src/main/java/org/opensearch/sql/ppl/calcite --type java | head -15 echo "" echo "=== 3. Non-pushdown Tests ===" echo "Content of CalciteNoPushdownIT (checking if mvexpand should be there):" fd -H 'CalciteNoPushdownIT.java' --type f -x head -50 {} \; echo "" echo "Checking all test methods in CalciteNoPushdownIT:" fd -H 'CalciteNoPushdownIT.java' --type f -x rg 'testMv|mvexpand|void test' {} \; echo "" echo "=== 4. NewAddedCommandsIT Tests ===" echo "Content check for NewAddedCommandsIT (ppl version):" fd -H 'ppl/NewAddedCommandsIT.java' --type f echo "" echo "Checking CalciteNewAddedCommandsIT:" fd -H 'CalciteNewAddedCommandsIT.java' --type f -x head -50 {} \; echo "" echo "Checking if mvexpand should be in NewAddedCommandsIT:" fd -H 'NewAddedCommandsIT.java' --type f -x rg 'mvexpand\|testMv' {} \; echo "" echo "=== 5. Analyzer Visitor ===" echo "Searching for Analyzer or resolver in ppl:" fd -H 'Analyzer\|Resolver' ppl/src/main/java --type f | grep -E '\.java$' | head -10 echo "" echo "Checking if mvexpand needs analyzer visitor:" rg 'class.*MvExpand\|MvExpand' ppl/src/main/java/org/opensearch/sql/ppl --type java | head -5

Repository: opensearch-project/sql

Length of output: 3979

🏁 Script executed:

#!/bin/bash # Verify visitor implementations and NewAddedCommandsIT echo "=== 1. AbstractNodeVisitor visitMvExpand ===" echo "Checking for visitMvExpand in AbstractNodeVisitor:" fd -H 'AbstractNodeVisitor.java' --type f -x rg 'visitMvExpand' {} \; echo "" echo "=== 2. Find all visitor implementations ===" echo "Searching for all visitMvExpand implementations across codebase:" rg 'visitMvExpand|visitMvexpand' --type java echo "" echo "=== 3. PPL NewAddedCommandsIT content ===" echo "Content of ppl/NewAddedCommandsIT.java:" fd -H 'ppl/NewAddedCommandsIT.java' --type f -x cat {} \; | head -100 echo "" echo "=== 4. Verify CalciteNoPushdownIT includes mvexpand ===" echo "Checking if CalciteMvExpandCommandIT is in CalciteNoPushdownIT suite:" fd -H 'CalciteNoPushdownIT.java' --type f -x grep 'CalciteMvExpandCommandIT' {} \; echo "" echo "=== 5. Summary - All key checklist items ===" echo "Lexer: MVEXPAND keyword" rg "MVEXPAND.*'MVEXPAND'" ppl/src/main/antlr/OpenSearchPPLLexer.g4 echo "" echo "Parser: mvexpand rule in grammar" rg 'mvexpandCommand' ppl/src/main/antlr/OpenSearchPPLParser.g4 | head -3 echo "" echo "AST: MvExpand class exists" ls -lh core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java echo "" echo "Parser Builder: visitMvexpandCommand" rg 'visitMvexpandCommand' ppl/src/main/java --type java | head -1 echo "" echo "Unit Tests:" ls -lh ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java echo "" echo "Integration Tests (pushdown):" ls -lh integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java echo "" echo "Integration Tests (non-pushdown via suite):" grep 'CalciteMvExpandCommandIT' integ-test/src/test/java/org/opensearch/sql/calcite/CalciteNoPushdownIT.java echo "" echo "Explain Tests:" rg 'mvexpand.*explain|explain.*mvexpand' integ-test/src/test/java --type java | head -1 echo "" echo "Anonymizer Tests:" rg 'testMvexpand' ppl/src/test/java --type java | head -2 echo "" echo "User Documentation:" ls -lh docs/user/ppl/cmd/mvexpand.md

Repository: opensearch-project/sql

Length of output: 2799

🏁 Script executed:

#!/bin/bash # Final verification for NewAddedCommandsIT echo "=== NewAddedCommandsIT - Full content ===" fd -H 'ppl/NewAddedCommandsIT.java' --type f -x wc -l {} \; echo "" fd -H 'ppl/NewAddedCommandsIT.java' --type f -x cat {} \; echo "" echo "=== Check for test methods in NewAddedCommandsIT ===" fd -H 'NewAddedCommandsIT.java' ppl/src/test --type f -x rg 'void test|public void' {} \; echo "" echo "=== Cross-cluster tests (optional) ===" echo "Checking for CrossClusterSearchIT:" fd -H 'CrossClusterSearchIT' --type f echo "" echo "Checking if mvexpand is tested there:" rg 'mvexpand' integ-test/src/test --type java | grep -i 'cross\|cluster'

Repository: opensearch-project/sql

Length of output: 441

🏁 Script executed:

#!/bin/bash # Check if ppl/NewAddedCommandsIT.java should exist and verify its purpose echo "=== Checking ppl test directory structure ===" fd -H '\.java$' ppl/src/test/java/org/opensearch/sql/ppl --type f | grep -i 'test\|it' | head -20 echo "" echo "=== Looking for NewAddedCommandsIT pattern ===" echo "All NewAddedCommandsIT files:" fd -H 'NewAddedCommandsIT' --type f echo "" echo "=== Content of base NewAddedCommandsIT ===" fd -H 'ppl/NewAddedCommandsIT' --type f echo "" fd -H 'integ-test.*NewAddedCommandsIT' --type f -x cat {} \; | head -50 echo "" echo "=== Checklist item analysis: Should mvexpand be in ppl/NewAddedCommandsIT? ===" echo "Checking if other commands have tests in both locations:" echo "Commands in CalciteNewAddedCommandsIT (extends ppl/NewAddedCommandsIT):" fd -H 'CalciteNewAddedCommandsIT' --type f -x cat {} \; echo "" echo "Checking ppl test directory for NewAddedCommandsIT:" ls -la ppl/src/test/java/org/opensearch/sql/ppl/ | grep -i 'new\|added'

Repository: opensearch-project/sql

Length of output: 2578

🏁 Script executed:

#!/bin/bash # Direct verification of NewAddedCommandsIT file echo "=== Finding NewAddedCommandsIT.java ===" find . -name 'NewAddedCommandsIT.java' -type f echo "" echo "=== Content of integ-test ppl NewAddedCommandsIT.java ===" cat integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java echo "" echo "=== Verify file exists and size ===" ls -lah integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java echo "" echo "=== Line count ===" wc -l integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java echo "" echo "=== Final checklist summary ===" echo "1. Lexer/Parser Updates: ✓ MVEXPAND keyword and grammar present" echo "2. AST Implementation: ✓ MvExpand class present in core/ast/tree" echo "3. Visitor Pattern:" echo " - AbstractNodeVisitor: ✓" echo " - Analyzer: ✓" echo " - CalciteRelNodeVisitor: ✓" echo " - PPLQueryDataAnonymizer: ✓" echo "4. Unit Tests: ✓ CalcitePPLMvExpandTest.java" echo "5. Integration Tests (pushdown): ✓ CalciteMvExpandCommandIT.java" echo "6. Integration Tests (non-pushdown): ✓ CalciteMvExpandCommandIT.class in CalciteNoPushdownIT suite" echo "7. Explain Tests: ✓ CalciteExplainIT.java" echo "8. Anonymizer Tests: ✓ PPLQueryDataAnonymizerTest" echo "9. NewAddedCommandsIT: ? Need to verify" echo "10. User Documentation: ✓ docs/user/ppl/cmd/mvexpand.md" echo "11. Cross-cluster Tests: Optional (not found, but optional in checklist)"

Repository: opensearch-project/sql

Length of output: 8153

Add test for mvexpand in NewAddedCommandsIT.

Most checklist items are addressed (lexer/parser, AST, all visitor patterns, unit tests, integration tests for pushdown/non-pushdown, explain tests, anonymizer tests, user documentation). However, the "Unsupported in v2 test" item from docs/dev/ppl-commands.md is missing: add a testMvexpand() method to integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java following the pattern of existing tests like testJoin() and testAddTotalCommand().

🤖 Prompt for AI Agents

In docs/user/ppl/index.md around lines 81-83: add the missing integration test by creating a method testMvexpand() in integ-test/src/test/java/org/opensearch/sql/ppl/NewAddedCommandsIT.java following the pattern of testJoin() and testAddTotalCommand(): add a @Test-annotated method named testMvexpand that runs a representative mvexpand query, verifies expected behavior, and includes the "Unsupported in v2" assertion (matching the other NewAddedCommandsIT tests) so the checklist item is satisfied; ensure the test uses the same setup/teardown, visitor/pushdown assertions, and any explain/anonymizer checks used by neighboring tests.

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

srikanthpadakanti · 2025-12-22T19:15:59Z

Hello @penghuo @ykmr1224 Can you please look into this PR. Thanks

ykmr1224 · 2025-12-22T21:41:20Z

core/src/main/java/org/opensearch/sql/ast/tree/MvExpand.java

+import org.opensearch.sql.ast.AbstractNodeVisitor;
+import org.opensearch.sql.ast.expression.Field;
+
+/** AST node representing an {@code mvexpand <field> [limit N]} operation. */


[limit N] -> [limit=N]
It might be better referring as mvexpand PPL command to avoid inconsistency.

ykmr1224 · 2025-12-22T21:49:54Z

docs/user/ppl/cmd/mvexpand.md

+
+PPL query:
+```ppl
+source=mvexpand_logs | where case='basic' | mvexpand tags | fields tags.value


Why do we output value to field.value? Are we intentionally making it different from Splunk command spec or due to some reason?

ykmr1224 · 2025-12-22T21:53:41Z

docs/user/ppl/cmd/mvexpand.md

+
+## Notes about these doctests
+- The tests below target a single, deterministic document by using `where case='<name>'` so the doctests are stable.
+- The test index name used in these examples is `mvexpand_logs`.


Looking at the examples, it would be easier to understand if we use array function to make and multivalue field, and expand it by mvexpand command. And then, we don't need to add new index mvexpand_logs for this doc.

Refer here for examples: https://github.com/opensearch-project/sql/blob/main/docs/user/ppl/functions/collection.md

ykmr1224 · 2025-12-22T21:58:26Z

integ-test/src/test/resources/expectedOutput/calcite/explain_mvexpand.yaml

+calcite:
+  logical: |
+    LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
+      LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
+        LogicalValues(tuples=[[]])
+  physical: |
+    EnumerableValues(tuples=[[]])


Looks it does not do anything related to mvexpand.

ykmr1224 · 2025-12-22T21:59:37Z

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_mvexpand.yaml

+calcite:
+  logical: |
+    LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
+      LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
+        LogicalValues(tuples=[[]])
+  physical: |
+    EnumerableValues(tuples=[[]])


Same as above

ykmr1224 · 2025-12-22T22:18:44Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+    RelDataTypeField matched = rowType.getField(fieldName, false, false);
+
+    // 2A. Missing field → true EMPTY relation (no schema, no rows)
+    if (matched == null) {


Why don't we raise error when the field is not found?

ykmr1224 · 2025-12-22T22:24:39Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+    String fieldName = extractFieldName(field);
+
+    // 2. Lookup field
+    RelDataTypeField matched = rowType.getField(fieldName, false, false);


We should rely on rexVisitor.analyze to resolve field node.

ykmr1224 · 2025-12-22T22:27:03Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+   * @return RelNode representing records with the expanded multi-value field
+   */
+  @Override
+  public RelNode visitMvExpand(MvExpand mvExpand, CalcitePlanContext context) {


Why does it have lots of difference with visitExpand? Should it be mostly same other than limit parameter?

ykmr1224 · 2025-12-22T22:40:06Z

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

+        "Expected SemanticCheckException about non-array integer field, got: " + msg);
+  }
+
+  private static String createTempIndexWithMapping(String baseName, String mappingJson)


I think we already have these helper methods in SQLIntegTestCase. Please use existing ones whenever possible.

ykmr1224 · 2025-12-22T22:45:46Z

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

+  public void testMvexpandSingleElement() throws Exception {
+    String query =
+        String.format(
+            "source=%s | mvexpand skills | where username='single' | fields username, skills.name",


Why does it bring skills.name although it expanded skills?

Srikanth Padakanti added 30 commits November 22, 2025 14:42

Initial checkpoint - following calcite way and commented legacy way

8362fc2

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Removed the build.gradle dependency opensearch-common

384ba15

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Ready to submit this PR

7f382f9

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Ready to submit this PR

44c8124

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Ready to submit this PR

3cad64e

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Add mvexpand.rst

8e4a2c5

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Add Tests

474617d

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Add the mvexpand.rst to the index.rst

d502b03

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Remove the unwanted code

c62defe

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the failing test

a3799b2

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments and fix the tests accordingly

d90be9f

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments and fix the tests accordingly

da16288

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments and fix the tests accordingly

1301e06

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Add comment lines for buildUnnestForLeft

beb31de

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the mvexpand.rst

627ef8f

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the failing test

58facf8

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the failing test

63cdbf7

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the failing test

bdc3aa1

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Fix the failing test

fc8e345

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments

c830356

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments

e9b6f27

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments

fa9436e

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments

ea091d2

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the PR comments

4d9b24d

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Address the issue as the happy path scenario was not working the way …

b9d3164

…it supposed to Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

MvExpand as its own implementation - not aliasing

26a59a4

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Refactoring EXPAND and MVEXPAND

43c806e

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Refactor EXPAND and MVEXPAND and fix its unittest

a07dff2

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

Convert mvexpand.rst examples to doctest

7be7473

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

metadata.rst was missing the mvexpand_logs entry

2c0ea2c

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

srikanthpadakanti requested review from LantaoJin, RyanL1997, noCharger, qianheng-aws and yuancu as code owners December 11, 2025 17:26

address merge issue

00c990f

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai bot reviewed Dec 11, 2025

View reviewed changes

dai-chen added enhancement New feature or request PPL Piped processing language backport 2.19-dev labels Dec 11, 2025

address merge issue

44814ab

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai bot reviewed Dec 11, 2025

View reviewed changes

docs/category.json Outdated Show resolved Hide resolved

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java Show resolved Hide resolved

change rst to md

f9dd692

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai bot reviewed Dec 11, 2025

View reviewed changes

srikanthpadakanti and others added 3 commits December 11, 2025 12:12

Merge branch 'opensearch-project:main' into main

69d6a5a

change rst to md

2464675

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

delete unnecessary test

0f86c52

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai bot reviewed Dec 11, 2025

View reviewed changes

docs/user/ppl/cmd/mvexpand.md Show resolved Hide resolved

dai-chen reviewed Dec 11, 2025

View reviewed changes

Merge branch 'main' into main

32d3867

Signed-off-by: Srikanth Padakanti <srikanth29.9@gmail.com>

coderabbitai bot reviewed Dec 18, 2025

View reviewed changes

remove index.rst and add mvexpand entry in index.md

07509ae

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

coderabbitai bot reviewed Dec 18, 2025

View reviewed changes

Srikanth Padakanti added 2 commits December 18, 2025 17:18

spotless apply

34db739

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

merge issues.

602358e

Signed-off-by: Srikanth Padakanti <srikanth_padakanti@apple.com>

srikanthpadakanti requested a review from dai-chen December 19, 2025 21:06

dai-chen approved these changes Dec 19, 2025

View reviewed changes

ykmr1224 reviewed Dec 22, 2025

View reviewed changes

Mvexpand feature #4944

Are you sure you want to change the base?

Mvexpand feature #4944

Uh oh!

Conversation

srikanthpadakanti commented Dec 11, 2025

Description

Related Issues

Check List

Uh oh!

coderabbitai bot commented Dec 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested labels

Suggested reviewers

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

srikanthpadakanti Dec 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

srikanthpadakanti commented Dec 22, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot commented Dec 11, 2025 •

edited

Loading

srikanthpadakanti Dec 19, 2025 •

edited

Loading