feat(parser): parity corpus + differential test harness by DecisionNerd · Pull Request #661 · DecisionNerd/graphforge

DecisionNerd · 2026-05-30T04:56:58Z

Closes #555

Closes the final open issue in the v0.5.0: Compiler Skeleton milestone.

Summary

Rust binary (crates/gf-cypher/src/bin/parse_json.rs)
Thin CLI wrapper: parses a query from argv, emits AstQuery JSON to stdout (exit 0) or a structured error JSON to stderr (exit 1). Added as [[bin]] parse_json in Cargo.toml.

Corpus (tests/parser_parity/corpus.py)

VALID_QUERIES: 200 queries — all 13 clause types (MATCH, OPTIONAL MATCH, WHERE, WITH, RETURN, CREATE, MERGE, SET, REMOVE, DELETE, DETACH DELETE, UNWIND, CALL), all operators, all expression forms, CASE, list comprehensions, parameters, path patterns, functions
INVALID_QUERIES: 54 queries — unclosed constructs, unterminated strings, invalid tokens, incomplete clauses, wrong partial syntax
UNION_QUERIES: 5 queries — both parsers succeed but structural comparison is skipped (Python: UnionQuery wrapper; Rust: [Return, Union, Return] clauses)
KNOWN_DIFFERENCES: 11 queries — documented intentional divergences (regex =~, label predicates in WHERE, shortestPath, empty input, clause-order permutations)

Harness (tests/parser_parity/harness.py)

_ast_to_dict — recursively converts Python AST (dataclass + Pydantic BaseModel) to JSON-comparable dict, injecting _type on clause nodes
python_parse / rust_parse — run each parser, return (ast, None) or (None, error)
python_clause_sequence — filters Python "modifier" clauses (WHERE, ORDER BY, SKIP, LIMIT) that Rust inlines into parent clause structs
compare_clause_counts / compare_clause_types — structural comparison with Python→Rust clause name mapping

Tests (tests/parser_parity/test_parity.py) — 270 parametrized tests:

test_valid_query_parity[200] — both parsers succeed; same clause count and type sequence
test_invalid_query_both_reject[54] — both parsers reject
test_union_query_both_succeed[5] — both succeed (no structural check)
test_known_differences_dont_panic[11] — neither parser crashes

Known structural differences (documented, not bugs)

Category	Python	Rust
WHERE clause	Separate top-level `WhereClause` node	Inlined as `where_clause` field on `MatchClause`/`WithClause`
ORDER BY, SKIP, LIMIT	Separate clause nodes	Inlined as fields on `ReturnClause`/`WithClause`
UNION	`UnionQuery` wrapper node	`[Return, Union, Return]` flat clauses
Regex `=~`	Rejected	Accepted
Label pred in WHERE	Accepted	Rejected
Empty input	Rejected	Accepted (empty clause list)

Test plan

cargo build --bin parse_json -p gf-cypher — binary builds
uv run pytest tests/parser_parity/ -m parity -q — 270 tests pass
uv run ruff check tests/parser_parity/ — clean
uv run mypy tests/parser_parity/ --ignore-missing-imports — clean
cargo clippy -p gf-cypher -- -D warnings — clean

🤖 Generated with Claude Code

Note

Add differential test harness comparing Python Lark and Rust Cypher parsers

Introduces a parse_json Rust binary (parse_json.rs) that parses a Cypher query from CLI args and emits a JSON AST on success (exit 0) or a JSON error object on stderr (exit 1).
Adds a Python test harness (harness.py) that runs both parsers and normalizes their outputs for structural comparison, mapping Python clause types to Rust serde discriminants.
Adds a query corpus (corpus.py) covering valid, invalid, union, and known-divergence queries across all major Cypher clause types.
Parameterized pytest tests (test_parity.py) verify both parsers accept valid queries with matching clause structure, both reject invalid queries, and known divergences are documented without panics.
Risk: each test invocation of the Rust parser spawns a subprocess with a 10s timeout, so large query corpora will increase test suite wall time significantly.

^{Macroscope summarized c1255a6.}

Summary by CodeRabbit

New Features
- Added a command-line tool that parses Cypher queries and outputs results in JSON format, including detailed error reporting with error types and source location information.
Tests
- Introduced a parser parity testing framework with extensive test corpora to validate consistent parsing behavior across different parser implementations.

- crates/gf-cypher/src/bin/parse_json.rs: CLI binary — parses a query from argv, emits AstQuery JSON to stdout (exit 0) or error JSON to stderr (exit 1) - tests/parser_parity/corpus.py: 200 valid + 54 invalid + 5 UNION + 11 known-difference queries covering all 13 clause types, operators, functions, parameters, paths, and expressions - tests/parser_parity/harness.py: normalisation helpers — _ast_to_dict (injects _type on Pydantic nodes), python_parse, rust_parse, strip_spans, clause-sequence extraction with modifier-clause filtering (WHERE/ORDER BY/SKIP/LIMIT are inlined in Rust, top-level in Python) - tests/parser_parity/test_parity.py: 4 parametrized test classes — valid parity (200), invalid rejection (54), UNION success (5), known differences no-panic (11) — 270 tests total - pyproject.toml: register 'parity' marker Closes #555 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-05-30T04:57:08Z

Walkthrough

This PR implements a complete differential testing framework for Cypher parser parity. It adds a Rust parse_json CLI binary, a Python corpus of 479+ valid/invalid/UNION queries, an AST normalization harness, and a pytest suite that validates both parsers produce equivalent output.

Changes

Parser Parity Testing Infrastructure

Layer / File(s)	Summary
Rust parse_json CLI binary `crates/gf-cypher/Cargo.toml`, `crates/gf-cypher/src/bin/parse_json.rs`	Adds a new binary target that wraps `gf_cypher::parse`, accepts query arguments, emits pretty-printed `AstQuery` JSON to stdout on success (exit 0) or structured error JSON with `ParseErrorKind` and span to stderr on failure (exit 1).
Query corpus for parity testing `tests/parser_parity/corpus.py`	Defines 357+ valid Cypher queries across all clause types and operators, UNION examples, 97+ invalid queries with parse errors, and known-difference cases where parsers intentionally diverge.
Comparison harness for AST normalization `tests/parser_parity/harness.py`, `pyproject.toml`	Implements `python_parse` and `rust_parse` to invoke and normalize parser outputs, `strip_spans` to remove location metadata, `python_clause_sequence` and `rust_clause_sequence` to extract ordered clause lists with modifier-clause filtering, and `compare_clause_counts`/`compare_clause_types` to validate structural equivalence. Adds `parity` pytest marker.
Parser parity test suite `tests/parser_parity/test_parity.py`	Parameterized pytest suite with session-scoped `parse_json_binary` fixture that builds the Rust binary once. Tests validate that (1) valid queries parse and produce matching clause counts/types, (2) invalid queries both reject, (3) UNION queries both succeed without strict structure check, (4) known-difference queries execute without panics.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

DecisionNerd/graphforge#607: This PR directly implements the differential testing infrastructure that issue aims to expand, providing the foundational parser-parity corpus and harness.

Possibly related PRs

DecisionNerd/graphforge#635: The new parse_json binary and parity tests depend on the AstQuery/ParseError API and ParseErrorKind type overhaul introduced in that PR.
DecisionNerd/graphforge#634: The parse_json binary serializes the AstQuery type, which is directly affected by the serde trait changes in that PR.
DecisionNerd/graphforge#654: This PR exercises the gf_cypher::parse entrypoint and ParseErrorKind output that the retrieved PR wires up as the recursive-descent/Pratt parser scaffold.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat(parser): parity corpus + differential test harness' accurately and concisely describes the main change: adding parser parity testing infrastructure with corpus and harness.
Linked Issues check	✅ Passed	The PR fulfills all core coding requirements from issue `#555`: corpus of 200+ valid and 54 invalid queries covering all clause types, differential harness normalizing Python/Rust ASTs, comparison helpers mapping clause names, and 270 parametrized tests validating parity across all query categories.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with `#555` requirements: parse_json binary, corpus, harness, and parametrized tests for parser parity validation. No unrelated code changes detected.
Docstring Coverage	✅ Passed	Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.
Description check	✅ Passed	The pull request description is comprehensive and well-structured, covering all major sections including summary, known structural differences, and test plan.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/555-parser-parity

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-05-30T04:58:15Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.10%. Comparing base (1381d49) to head (c1255a6).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #661   +/-   ##
=======================================
  Coverage   84.10%   84.10%           
=======================================
  Files          49       49           
  Lines       12942    12942           
  Branches     3632     3632           
=======================================
  Hits        10885    10885           
  Misses       1272     1272           
  Partials      785      785

Flag	Coverage Δ
full-coverage	`84.10% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
parser	`92.93% <ø> (ø)`
planner	`79.90% <ø> (ø)`
executor	`75.46% <ø> (ø)`
storage	`98.68% <ø> (ø)`
ast	`97.51% <ø> (ø)`
types	`90.66% <ø> (ø)`

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1381d49...c1255a6. Read the comment docs.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/parser_parity/harness.py`:
- Around line 191-229: The current parity checks in compare_clause_counts and
compare_clause_types only validate clause sequences and miss non-clause field
and span mismatches; add a full normalized AST comparison: implement a
normalize_ast(ast: dict) helper (used for both Python and Rust ASTs) that strips
span/location-only fields and maps Python names to Rust variants using
PYTHON_TO_RUST_CLAUSE and the existing
python_clause_sequence/rust_clause_sequence helpers, then perform a recursive,
field-by-field deep equality check (or produce a structured diff) between
normalize_ast(py_ast) and normalize_ast(rs_ast); update the test harness to call
this new comparison (e.g., in a new compare_full_ast function invoked alongside
or instead of compare_clause_counts/compare_clause_types) and return clear
mismatch messages when any non-clause fields diverge.
- Around line 67-134: The functions python_parse, python_clause_types,
rust_parse and rust_clause_types currently return dict/list/tuple or None;
change them to always return pyarrow.Table results per the project guideline:
import pyarrow as pa and convert successful outputs (ast dict or list of clause
names) into a pa.Table (e.g., pa.table({"result": [json_string_or_struct]}) or a
structured table matching the expected consumer shape), and convert errors into
a pa.Table with consistent error columns (e.g., {"error": [True], "message":
[msg]}). Update the return shapes of python_parse and rust_parse so they return
a single pa.Table (never None) and adjust callers to consume via .to_pandas() /
pl.from_arrow() / .to_pylist() as required; locate and modify the bodies of
python_parse, python_clause_types, rust_parse and rust_clause_types to perform
these conversions.
- Around line 101-127: The parity test currently only checks that both parsers
reject but not whether they report the same error kind and span; add helper
functions (e.g., compare_error_kind(expected_err, actual_err) and
compare_error_span(expected_err, actual_err)) and use them in the parity
assertions to compare the parsed error dicts from rust_parse and the reference
parser. Ensure rust_parse continues to return a structured error dict (with
'kind' and 'span' when present) or normalize missing fields into a consistent
shape before comparison, have the helpers tolerate absent fields by failing only
on real mismatches, and call these helpers from the existing parity test paths
where return (None, err) is observed.

In `@tests/parser_parity/test_parity.py`:
- Around line 41-50: The fixture parse_json_binary currently returns a Path
(_BINARY_PATH) which violates the repo result contract; change its return type
to a pyarrow.Table and return a table containing the binary path (e.g.,
pa.table({"path":[str(_BINARY_PATH)]})), update the function signature to ->
pa.Table and ensure pyarrow is imported as pa; keep the build steps (cargo
build) and the existence assert, then convert the resulting _BINARY_PATH into a
pyarrow.Table before returning.
- Around line 85-94: Update test_invalid_query_both_reject to assert error
semantics and locations, not just non-None; after calling python_parse and
rust_parse (functions referenced in the test) compare error classifications
(e.g., error.kind or error.type returned by python_parse and rust_parse) and
compare source locations (e.g., error.position, error.line/column or error.span)
for parity across both parsers for each query in INVALID_QUERIES using
parse_json_binary as before; if the error objects differ in shape, normalize
them to a common structure (classification and location) before asserting
equality so the test fails when parsers disagree on error kind or reported
location.
- Around line 101-135: The parity marker is being applied to tests that
explicitly bypass structural agreement (test_union_query_both_succeed and
test_known_differences_dont_panic using UNION_QUERIES and KNOWN_DIFFERENCES),
weakening the parity gate; remove or replace pytest.mark.parity on these tests
and instead mark them with a dedicated marker (e.g.,
pytest.mark.known_difference or pytest.mark.union_only) so the true parity suite
only contains tests that assert structural equivalence, and update any
CI/test-group selection to use the new marker; locate the annotations on
test_union_query_both_succeed, test_known_differences_dont_panic, and the
UNION_QUERIES / KNOWN_DIFFERENCES parameterization to make the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f185b7b-0e32-4a7d-b00c-671d669ca766

📥 Commits

Reviewing files that changed from the base of the PR and between 1381d49 and c1255a6.

📒 Files selected for processing (7)

crates/gf-cypher/Cargo.toml
crates/gf-cypher/src/bin/parse_json.rs
pyproject.toml
tests/parser_parity/__init__.py
tests/parser_parity/corpus.py
tests/parser_parity/harness.py
tests/parser_parity/test_parity.py

coderabbitai · 2026-05-30T05:00:47Z

+def python_parse(query: str) -> tuple[dict[str, Any] | None, Exception | None]:
+    """Run the Python parser.
+
+    Returns ``(ast_dict, None)`` on success or ``(None, exc)`` on error.
+    """
+    from graphforge.parser.parser import parse_cypher
+
+    try:
+        ast = parse_cypher(query)
+        result = _ast_to_dict(ast)
+        return result if isinstance(result, dict) else {"_root": result}, None
+    except Exception as exc:
+        return None, exc
+
+
+def python_clause_types(query: str) -> list[str] | None:
+    """Return Python clause type names for a query, or None on error."""
+    from graphforge.parser.parser import parse_cypher
+
+    try:
+        ast = parse_cypher(query)
+        if hasattr(ast, "clauses"):
+            return [type(c).__name__ for c in ast.clauses]
+        # UnionQuery or other top-level wrapper
+        return [type(ast).__name__]
+    except Exception:
+        return None
+
+
+# ---------------------------------------------------------------------------
+# Rust parser
+# ---------------------------------------------------------------------------
+
+
+def rust_parse(
+    query: str,
+    binary: Path,
+) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
+    """Run the Rust parse_json binary.
+
+    Returns ``(ast_dict, None)`` on success or ``(None, error_dict)`` on error.
+    """
+    try:
+        result = subprocess.run(
+            [str(binary), query],
+            capture_output=True,
+            text=True,
+            timeout=10,
+            check=False,
+        )
+        if result.returncode == 0:
+            return json.loads(result.stdout), None
+        else:
+            try:
+                err = json.loads(result.stderr)
+            except json.JSONDecodeError:
+                err = {"error": True, "message": result.stderr.strip()}
+            return None, err
+    except subprocess.TimeoutExpired:
+        return None, {"error": True, "message": "timeout"}
+
+
+def rust_clause_types(query: str, binary: Path) -> list[str] | None:
+    """Return Rust clause variant names for a query, or None on error."""
+    ast, err = rust_parse(query, binary)
+    if err is not None or ast is None:
+        return None
+    return [list(c.keys())[0] for c in ast.get("clauses", [])]


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Return contracts violate the Python result-type guideline.

These functions return dict/list/tuple contracts instead of pyarrow.Table, which conflicts with the repository’s Python result contract rule.

As per coding guidelines, "In Python code, all methods that return results must return Apache Arrow Tables, never CypherValue wrappers or custom result types" and "Use pyarrow.Table for all result contracts; consume with pandas/polars via .to_pandas() / pl.from_arrow() or with .to_pylist() for list[dict] format".

Also applies to: 151-230

🧰 Tools

🪛 Ruff (0.15.14)

[warning] 78-78: Do not catch blind exception: Exception

(BLE001)

[warning] 92-92: Do not catch blind exception: Exception

(BLE001)

[error] 110-110: subprocess call: check for execution of untrusted input

(S603)

[warning] 134-134: Prefer next(iter(c.keys())) over single element slice

Replace with next(iter(c.keys()))

(RUF015)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/harness.py` around lines 67 - 134, The functions python_parse, python_clause_types, rust_parse and rust_clause_types currently return dict/list/tuple or None; change them to always return pyarrow.Table results per the project guideline: import pyarrow as pa and convert successful outputs (ast dict or list of clause names) into a pa.Table (e.g., pa.table({"result": [json_string_or_struct]}) or a structured table matching the expected consumer shape), and convert errors into a pa.Table with consistent error columns (e.g., {"error": [True], "message": [msg]}). Update the return shapes of python_parse and rust_parse so they return a single pa.Table (never None) and adjust callers to consume via .to_pandas() / pl.from_arrow() / .to_pylist() as required; locate and modify the bodies of python_parse, python_clause_types, rust_parse and rust_clause_types to perform these conversions.

coderabbitai · 2026-05-30T05:00:47Z

+def rust_parse(
+    query: str,
+    binary: Path,
+) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
+    """Run the Rust parse_json binary.
+
+    Returns ``(ast_dict, None)`` on success or ``(None, error_dict)`` on error.
+    """
+    try:
+        result = subprocess.run(
+            [str(binary), query],
+            capture_output=True,
+            text=True,
+            timeout=10,
+            check=False,
+        )
+        if result.returncode == 0:
+            return json.loads(result.stdout), None
+        else:
+            try:
+                err = json.loads(result.stderr)
+            except json.JSONDecodeError:
+                err = {"error": True, "message": result.stderr.strip()}
+            return None, err
+    except subprocess.TimeoutExpired:
+        return None, {"error": True, "message": "timeout"}
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Invalid-query parity currently misses error-kind and location equivalence.

rust_parse returns structured error details, but downstream checks only assert “both reject.” Add helpers that compare error kind and span parity between parsers.

🧰 Tools

🪛 Ruff (0.15.14)

[error] 110-110: subprocess call: check for execution of untrusted input

(S603)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/harness.py` around lines 101 - 127, The parity test currently only checks that both parsers reject but not whether they report the same error kind and span; add helper functions (e.g., compare_error_kind(expected_err, actual_err) and compare_error_span(expected_err, actual_err)) and use them in the parity assertions to compare the parsed error dicts from rust_parse and the reference parser. Ensure rust_parse continues to return a structured error dict (with 'kind' and 'span' when present) or normalize missing fields into a consistent shape before comparison, have the helpers tolerate absent fields by failing only on real mismatches, and call these helpers from the existing parity test paths where return (None, err) is observed.

coderabbitai · 2026-05-30T05:00:47Z

+def compare_clause_counts(
+    py_ast: dict[str, Any],
+    rs_ast: dict[str, Any],
+) -> tuple[bool, str]:
+    """Return (match, message) comparing clause counts.
+
+    Python modifier clauses (WHERE, ORDER BY, SKIP, LIMIT) are excluded
+    because Rust inlines them into the containing clause struct.
+    """
+    py_seq = python_clause_sequence(py_ast)
+    rs_seq = rust_clause_sequence(rs_ast)
+    if len(py_seq) == len(rs_seq):
+        return True, f"clause count: {len(py_seq)}"
+    return (
+        False,
+        f"clause count mismatch: Python(filtered)={len(py_seq)} Rust={len(rs_seq)}",
+    )
+
+
+def compare_clause_types(
+    py_ast: dict[str, Any],
+    rs_ast: dict[str, Any],
+) -> tuple[bool, str]:
+    """Return (match, message) comparing clause type sequences.
+
+    Maps Python clause names → Rust variant names using PYTHON_TO_RUST_CLAUSE,
+    then compares the resulting sequences.
+    """
+    py_types_raw = python_clause_sequence(py_ast)
+    rs_types = rust_clause_sequence(rs_ast)
+
+    py_types_mapped = [PYTHON_TO_RUST_CLAUSE.get(t, t) for t in py_types_raw]
+
+    if py_types_mapped == rs_types:
+        return True, f"clause types match: {rs_types}"
+    return (
+        False,
+        f"clause type mismatch:\n  Python (mapped): {py_types_mapped}\n  Rust:            {rs_types}",
+    )


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Structural parity check is too shallow for the stated parity contract.

This only compares clause count/type sequences, so non-clause AST divergences and span mismatches can pass undetected. The harness should compare normalized full AST JSON field-by-field.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/harness.py` around lines 191 - 229, The current parity checks in compare_clause_counts and compare_clause_types only validate clause sequences and miss non-clause field and span mismatches; add a full normalized AST comparison: implement a normalize_ast(ast: dict) helper (used for both Python and Rust ASTs) that strips span/location-only fields and maps Python names to Rust variants using PYTHON_TO_RUST_CLAUSE and the existing python_clause_sequence/rust_clause_sequence helpers, then perform a recursive, field-by-field deep equality check (or produce a structured diff) between normalize_ast(py_ast) and normalize_ast(rs_ast); update the test harness to call this new comparison (e.g., in a new compare_full_ast function invoked alongside or instead of compare_clause_counts/compare_clause_types) and return clear mismatch messages when any non-clause fields diverge.

coderabbitai · 2026-05-30T05:00:47Z

+def parse_json_binary() -> Path:
+    """Build the parse_json Rust binary (once per session)."""
+    subprocess.run(
+        ["cargo", "build", "--bin", "parse_json", "-p", "gf-cypher"],
+        check=True,
+        capture_output=True,
+    )
+    assert _BINARY_PATH.exists(), f"Binary not found at {_BINARY_PATH}"
+    return _BINARY_PATH
+


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Fixture return type violates Python result contract guideline.

parse_json_binary() returns Path; per repository rule, Python methods returning results should return pyarrow.Table.

As per coding guidelines, "In Python code, all methods that return results must return Apache Arrow Tables, never CypherValue wrappers or custom result types" and "Use pyarrow.Table for all result contracts; consume with pandas/polars via .to_pandas() / pl.from_arrow() or with .to_pylist() for list[dict] format".

🧰 Tools

🪛 Ruff (0.15.14)

[error] 44-44: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/test_parity.py` around lines 41 - 50, The fixture parse_json_binary currently returns a Path (_BINARY_PATH) which violates the repo result contract; change its return type to a pyarrow.Table and return a table containing the binary path (e.g., pa.table({"path":[str(_BINARY_PATH)]})), update the function signature to -> pa.Table and ensure pyarrow is imported as pa; keep the build steps (cargo build) and the existence assert, then convert the resulting _BINARY_PATH into a pyarrow.Table before returning.

coderabbitai · 2026-05-30T05:00:47Z

+@pytest.mark.parity
+@pytest.mark.parametrize("query", INVALID_QUERIES, ids=lambda q: repr(q[:40]))
+def test_invalid_query_both_reject(query: str, parse_json_binary: Path) -> None:
+    """Both parsers must reject syntactically invalid queries."""
+    _py_ast, py_err = python_parse(query)
+    _rs_ast, rs_err = rust_parse(query, parse_json_binary)
+
+    assert py_err is not None, f"Python parser accepted an invalid query: {query!r}"
+    assert rs_err is not None, f"Rust parser accepted an invalid query: {query!r}"
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Invalid-query assertions should verify error semantics, not only rejection.

As written, different error kinds/locations still pass. Add assertions for matched error classification and source location parity.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/test_parity.py` around lines 85 - 94, Update test_invalid_query_both_reject to assert error semantics and locations, not just non-None; after calling python_parse and rust_parse (functions referenced in the test) compare error classifications (e.g., error.kind or error.type returned by python_parse and rust_parse) and compare source locations (e.g., error.position, error.line/column or error.span) for parity across both parsers for each query in INVALID_QUERIES using parse_json_binary as before; if the error objects differ in shape, normalize them to a common structure (classification and location) before asserting equality so the test fails when parsers disagree on error kind or reported location.

coderabbitai · 2026-05-30T05:00:47Z

+@pytest.mark.parity
+@pytest.mark.parametrize("query", UNION_QUERIES, ids=lambda q: q[:60])
+def test_union_query_both_succeed(query: str, parse_json_binary: Path) -> None:
+    """Both parsers must succeed on UNION queries.
+
+    Structural comparison is skipped because the two parsers represent UNION
+    differently: Rust emits ``[Return, Union, Return]`` clauses while Python
+    wraps the whole query in a ``UnionQuery`` node.
+    """
+    _py_ast, py_err = python_parse(query)
+    _rs_ast, rs_err = rust_parse(query, parse_json_binary)
+
+    assert py_err is None, f"Python parser rejected UNION query: {query!r}\n  error: {py_err}"
+    assert rs_err is None, f"Rust parser rejected UNION query: {query!r}\n  error: {rs_err}"
+
+
+# ---------------------------------------------------------------------------
+# Known-difference tests — document intentional parser divergences
+# ---------------------------------------------------------------------------
+
+
+@pytest.mark.parity
+@pytest.mark.parametrize("query", KNOWN_DIFFERENCES, ids=lambda q: repr(q[:50]))
+def test_known_differences_dont_panic(query: str, parse_json_binary: Path) -> None:
+    """Known-difference queries: at least one parser must not crash/hang.
+
+    These queries expose intentional design differences between the two
+    parsers (e.g. different handling of WHERE as a top-level clause, empty
+    input, regex operator, label predicates).  The test only asserts that
+    invoking both parsers does not raise an unhandled exception — it does
+    not assert agreement on validity or structure.
+    """
+    # Neither call should raise — errors are returned, not raised
+    python_parse(query)
+    rust_parse(query, parse_json_binary)


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

UNION/known-difference tests weaken parity guarantees below the stated objective.

These paths explicitly bypass structural agreement and allow tolerated divergences, which conflicts with a strict differential parity gate.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/parser_parity/test_parity.py` around lines 101 - 135, The parity marker is being applied to tests that explicitly bypass structural agreement (test_union_query_both_succeed and test_known_differences_dont_panic using UNION_QUERIES and KNOWN_DIFFERENCES), weakening the parity gate; remove or replace pytest.mark.parity on these tests and instead mark them with a dedicated marker (e.g., pytest.mark.known_difference or pytest.mark.union_only) so the true parity suite only contains tests that assert structural equivalence, and update any CI/test-group selection to use the new marker; locate the annotations on test_union_query_both_succeed, test_known_differences_dont_panic, and the UNION_QUERIES / KNOWN_DIFFERENCES parameterization to make the change.

DecisionNerd · 2026-05-30T05:26:12Z

CodeRabbit Autofix Review Complete

Reviewed 6 CodeRabbit feedback item(s) and did not apply code changes in this run.

All 6 findings deferred:

Fix: Restore correct column naming and SKIP/LIMIT behavior #1, ci(deps): bump actions/checkout from 4 to 6 #4: Arrow Table return-contract rule applies to the public analyst API, not to internal test harness utilities or pytest fixtures.
ci(deps): bump astral-sh/setup-uv from 4 to 7 #2, Add comprehensive WITH clause integration tests #5: Error-kind/location parity is a documented out-of-scope limitation — Python raises bare with no structured kind; mapping to Rust requires a translation table not in scope for rust: build parser parity corpus and differential test harness #555.
ci(deps): bump actions/setup-python from 5 to 6 #3: Full field-level comparison is blocked by fundamental naming differences between the two AST designs (documented scope limitation).
Fix CodeRabbit configuration schema validation #6: Cosmetic — marker grouping doesn't affect correctness or the parity gate.

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

DecisionNerd merged commit 276ca5d into main May 30, 2026
42 checks passed

DecisionNerd deleted the feature/555-parser-parity branch May 30, 2026 05:48

coderabbitai Bot mentioned this pull request May 30, 2026

chore: remove deprecated Python parser/planner/executor stack #662

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(parser): parity corpus + differential test harness#661

feat(parser): parity corpus + differential test harness#661
DecisionNerd merged 1 commit into
mainfrom
feature/555-parser-parity

DecisionNerd commented May 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 30, 2026 •

edited

Loading

Uh oh!

codecov Bot commented May 30, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

DecisionNerd commented May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

DecisionNerd commented May 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Known structural differences (documented, not bugs)

Test plan

Add differential test harness comparing Python Lark and Rust Cypher parsers

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related issues

Possibly related PRs

Uh oh!

codecov Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

DecisionNerd commented May 30, 2026

CodeRabbit Autofix Review Complete

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

DecisionNerd commented May 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 30, 2026 •

edited

Loading

codecov Bot commented May 30, 2026 •

edited

Loading