Skip to content

feat(parser): parity corpus + differential test harness#661

Merged
DecisionNerd merged 1 commit into
mainfrom
feature/555-parser-parity
May 30, 2026
Merged

feat(parser): parity corpus + differential test harness#661
DecisionNerd merged 1 commit into
mainfrom
feature/555-parser-parity

Conversation

@DecisionNerd
Copy link
Copy Markdown
Owner

@DecisionNerd DecisionNerd commented May 30, 2026

Closes #555

Closes the final open issue in the v0.5.0: Compiler Skeleton milestone.

Summary

Rust binary (crates/gf-cypher/src/bin/parse_json.rs)
Thin CLI wrapper: parses a query from argv, emits AstQuery JSON to stdout (exit 0) or a structured error JSON to stderr (exit 1). Added as [[bin]] parse_json in Cargo.toml.

Corpus (tests/parser_parity/corpus.py)

  • VALID_QUERIES: 200 queries — all 13 clause types (MATCH, OPTIONAL MATCH, WHERE, WITH, RETURN, CREATE, MERGE, SET, REMOVE, DELETE, DETACH DELETE, UNWIND, CALL), all operators, all expression forms, CASE, list comprehensions, parameters, path patterns, functions
  • INVALID_QUERIES: 54 queries — unclosed constructs, unterminated strings, invalid tokens, incomplete clauses, wrong partial syntax
  • UNION_QUERIES: 5 queries — both parsers succeed but structural comparison is skipped (Python: UnionQuery wrapper; Rust: [Return, Union, Return] clauses)
  • KNOWN_DIFFERENCES: 11 queries — documented intentional divergences (regex =~, label predicates in WHERE, shortestPath, empty input, clause-order permutations)

Harness (tests/parser_parity/harness.py)

  • _ast_to_dict — recursively converts Python AST (dataclass + Pydantic BaseModel) to JSON-comparable dict, injecting _type on clause nodes
  • python_parse / rust_parse — run each parser, return (ast, None) or (None, error)
  • python_clause_sequence — filters Python "modifier" clauses (WHERE, ORDER BY, SKIP, LIMIT) that Rust inlines into parent clause structs
  • compare_clause_counts / compare_clause_types — structural comparison with Python→Rust clause name mapping

Tests (tests/parser_parity/test_parity.py) — 270 parametrized tests:

  • test_valid_query_parity[200] — both parsers succeed; same clause count and type sequence
  • test_invalid_query_both_reject[54] — both parsers reject
  • test_union_query_both_succeed[5] — both succeed (no structural check)
  • test_known_differences_dont_panic[11] — neither parser crashes

Known structural differences (documented, not bugs)

Category Python Rust
WHERE clause Separate top-level WhereClause node Inlined as where_clause field on MatchClause/WithClause
ORDER BY, SKIP, LIMIT Separate clause nodes Inlined as fields on ReturnClause/WithClause
UNION UnionQuery wrapper node [Return, Union, Return] flat clauses
Regex =~ Rejected Accepted
Label pred in WHERE Accepted Rejected
Empty input Rejected Accepted (empty clause list)

Test plan

  • cargo build --bin parse_json -p gf-cypher — binary builds
  • uv run pytest tests/parser_parity/ -m parity -q — 270 tests pass
  • uv run ruff check tests/parser_parity/ — clean
  • uv run mypy tests/parser_parity/ --ignore-missing-imports — clean
  • cargo clippy -p gf-cypher -- -D warnings — clean

🤖 Generated with Claude Code

Note

Add differential test harness comparing Python Lark and Rust Cypher parsers

  • Introduces a parse_json Rust binary (parse_json.rs) that parses a Cypher query from CLI args and emits a JSON AST on success (exit 0) or a JSON error object on stderr (exit 1).
  • Adds a Python test harness (harness.py) that runs both parsers and normalizes their outputs for structural comparison, mapping Python clause types to Rust serde discriminants.
  • Adds a query corpus (corpus.py) covering valid, invalid, union, and known-divergence queries across all major Cypher clause types.
  • Parameterized pytest tests (test_parity.py) verify both parsers accept valid queries with matching clause structure, both reject invalid queries, and known divergences are documented without panics.
  • Risk: each test invocation of the Rust parser spawns a subprocess with a 10s timeout, so large query corpora will increase test suite wall time significantly.

Macroscope summarized c1255a6.

Summary by CodeRabbit

  • New Features

    • Added a command-line tool that parses Cypher queries and outputs results in JSON format, including detailed error reporting with error types and source location information.
  • Tests

    • Introduced a parser parity testing framework with extensive test corpora to validate consistent parsing behavior across different parser implementations.

Review Change Stack

- crates/gf-cypher/src/bin/parse_json.rs: CLI binary — parses a query
  from argv, emits AstQuery JSON to stdout (exit 0) or error JSON to
  stderr (exit 1)
- tests/parser_parity/corpus.py: 200 valid + 54 invalid + 5 UNION +
  11 known-difference queries covering all 13 clause types, operators,
  functions, parameters, paths, and expressions
- tests/parser_parity/harness.py: normalisation helpers — _ast_to_dict
  (injects _type on Pydantic nodes), python_parse, rust_parse,
  strip_spans, clause-sequence extraction with modifier-clause filtering
  (WHERE/ORDER BY/SKIP/LIMIT are inlined in Rust, top-level in Python)
- tests/parser_parity/test_parity.py: 4 parametrized test classes —
  valid parity (200), invalid rejection (54), UNION success (5), known
  differences no-panic (11) — 270 tests total
- pyproject.toml: register 'parity' marker

Closes #555

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 30, 2026

Walkthrough

This PR implements a complete differential testing framework for Cypher parser parity. It adds a Rust parse_json CLI binary, a Python corpus of 479+ valid/invalid/UNION queries, an AST normalization harness, and a pytest suite that validates both parsers produce equivalent output.

Changes

Parser Parity Testing Infrastructure

Layer / File(s) Summary
Rust parse_json CLI binary
crates/gf-cypher/Cargo.toml, crates/gf-cypher/src/bin/parse_json.rs
Adds a new binary target that wraps gf_cypher::parse, accepts query arguments, emits pretty-printed AstQuery JSON to stdout on success (exit 0) or structured error JSON with ParseErrorKind and span to stderr on failure (exit 1).
Query corpus for parity testing
tests/parser_parity/corpus.py
Defines 357+ valid Cypher queries across all clause types and operators, UNION examples, 97+ invalid queries with parse errors, and known-difference cases where parsers intentionally diverge.
Comparison harness for AST normalization
tests/parser_parity/harness.py, pyproject.toml
Implements python_parse and rust_parse to invoke and normalize parser outputs, strip_spans to remove location metadata, python_clause_sequence and rust_clause_sequence to extract ordered clause lists with modifier-clause filtering, and compare_clause_counts/compare_clause_types to validate structural equivalence. Adds parity pytest marker.
Parser parity test suite
tests/parser_parity/test_parity.py
Parameterized pytest suite with session-scoped parse_json_binary fixture that builds the Rust binary once. Tests validate that (1) valid queries parse and produce matching clause counts/types, (2) invalid queries both reject, (3) UNION queries both succeed without strict structure check, (4) known-difference queries execute without panics.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • DecisionNerd/graphforge#607: This PR directly implements the differential testing infrastructure that issue aims to expand, providing the foundational parser-parity corpus and harness.

Possibly related PRs

  • DecisionNerd/graphforge#635: The new parse_json binary and parity tests depend on the AstQuery/ParseError API and ParseErrorKind type overhaul introduced in that PR.
  • DecisionNerd/graphforge#634: The parse_json binary serializes the AstQuery type, which is directly affected by the serde trait changes in that PR.
  • DecisionNerd/graphforge#654: This PR exercises the gf_cypher::parse entrypoint and ParseErrorKind output that the retrieved PR wires up as the recursive-descent/Pratt parser scaffold.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(parser): parity corpus + differential test harness' accurately and concisely describes the main change: adding parser parity testing infrastructure with corpus and harness.
Linked Issues check ✅ Passed The PR fulfills all core coding requirements from issue #555: corpus of 200+ valid and 54 invalid queries covering all clause types, differential harness normalizing Python/Rust ASTs, comparison helpers mapping clause names, and 270 parametrized tests validating parity across all query categories.
Out of Scope Changes check ✅ Passed All changes are directly aligned with #555 requirements: parse_json binary, corpus, harness, and parametrized tests for parser parity validation. No unrelated code changes detected.
Docstring Coverage ✅ Passed Docstring coverage is 93.75% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The pull request description is comprehensive and well-structured, covering all major sections including summary, known structural differences, and test plan.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/555-parser-parity

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.10%. Comparing base (1381d49) to head (c1255a6).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #661   +/-   ##
=======================================
  Coverage   84.10%   84.10%           
=======================================
  Files          49       49           
  Lines       12942    12942           
  Branches     3632     3632           
=======================================
  Hits        10885    10885           
  Misses       1272     1272           
  Partials      785      785           
Flag Coverage Δ
full-coverage 84.10% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
parser 92.93% <ø> (ø)
planner 79.90% <ø> (ø)
executor 75.46% <ø> (ø)
storage 98.68% <ø> (ø)
ast 97.51% <ø> (ø)
types 90.66% <ø> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1381d49...c1255a6. Read the comment docs.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/parser_parity/harness.py`:
- Around line 191-229: The current parity checks in compare_clause_counts and
compare_clause_types only validate clause sequences and miss non-clause field
and span mismatches; add a full normalized AST comparison: implement a
normalize_ast(ast: dict) helper (used for both Python and Rust ASTs) that strips
span/location-only fields and maps Python names to Rust variants using
PYTHON_TO_RUST_CLAUSE and the existing
python_clause_sequence/rust_clause_sequence helpers, then perform a recursive,
field-by-field deep equality check (or produce a structured diff) between
normalize_ast(py_ast) and normalize_ast(rs_ast); update the test harness to call
this new comparison (e.g., in a new compare_full_ast function invoked alongside
or instead of compare_clause_counts/compare_clause_types) and return clear
mismatch messages when any non-clause fields diverge.
- Around line 67-134: The functions python_parse, python_clause_types,
rust_parse and rust_clause_types currently return dict/list/tuple or None;
change them to always return pyarrow.Table results per the project guideline:
import pyarrow as pa and convert successful outputs (ast dict or list of clause
names) into a pa.Table (e.g., pa.table({"result": [json_string_or_struct]}) or a
structured table matching the expected consumer shape), and convert errors into
a pa.Table with consistent error columns (e.g., {"error": [True], "message":
[msg]}). Update the return shapes of python_parse and rust_parse so they return
a single pa.Table (never None) and adjust callers to consume via .to_pandas() /
pl.from_arrow() / .to_pylist() as required; locate and modify the bodies of
python_parse, python_clause_types, rust_parse and rust_clause_types to perform
these conversions.
- Around line 101-127: The parity test currently only checks that both parsers
reject but not whether they report the same error kind and span; add helper
functions (e.g., compare_error_kind(expected_err, actual_err) and
compare_error_span(expected_err, actual_err)) and use them in the parity
assertions to compare the parsed error dicts from rust_parse and the reference
parser. Ensure rust_parse continues to return a structured error dict (with
'kind' and 'span' when present) or normalize missing fields into a consistent
shape before comparison, have the helpers tolerate absent fields by failing only
on real mismatches, and call these helpers from the existing parity test paths
where return (None, err) is observed.

In `@tests/parser_parity/test_parity.py`:
- Around line 41-50: The fixture parse_json_binary currently returns a Path
(_BINARY_PATH) which violates the repo result contract; change its return type
to a pyarrow.Table and return a table containing the binary path (e.g.,
pa.table({"path":[str(_BINARY_PATH)]})), update the function signature to ->
pa.Table and ensure pyarrow is imported as pa; keep the build steps (cargo
build) and the existence assert, then convert the resulting _BINARY_PATH into a
pyarrow.Table before returning.
- Around line 85-94: Update test_invalid_query_both_reject to assert error
semantics and locations, not just non-None; after calling python_parse and
rust_parse (functions referenced in the test) compare error classifications
(e.g., error.kind or error.type returned by python_parse and rust_parse) and
compare source locations (e.g., error.position, error.line/column or error.span)
for parity across both parsers for each query in INVALID_QUERIES using
parse_json_binary as before; if the error objects differ in shape, normalize
them to a common structure (classification and location) before asserting
equality so the test fails when parsers disagree on error kind or reported
location.
- Around line 101-135: The parity marker is being applied to tests that
explicitly bypass structural agreement (test_union_query_both_succeed and
test_known_differences_dont_panic using UNION_QUERIES and KNOWN_DIFFERENCES),
weakening the parity gate; remove or replace pytest.mark.parity on these tests
and instead mark them with a dedicated marker (e.g.,
pytest.mark.known_difference or pytest.mark.union_only) so the true parity suite
only contains tests that assert structural equivalence, and update any
CI/test-group selection to use the new marker; locate the annotations on
test_union_query_both_succeed, test_known_differences_dont_panic, and the
UNION_QUERIES / KNOWN_DIFFERENCES parameterization to make the change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1f185b7b-0e32-4a7d-b00c-671d669ca766

📥 Commits

Reviewing files that changed from the base of the PR and between 1381d49 and c1255a6.

📒 Files selected for processing (7)
  • crates/gf-cypher/Cargo.toml
  • crates/gf-cypher/src/bin/parse_json.rs
  • pyproject.toml
  • tests/parser_parity/__init__.py
  • tests/parser_parity/corpus.py
  • tests/parser_parity/harness.py
  • tests/parser_parity/test_parity.py

Comment on lines +67 to +134
def python_parse(query: str) -> tuple[dict[str, Any] | None, Exception | None]:
"""Run the Python parser.

Returns ``(ast_dict, None)`` on success or ``(None, exc)`` on error.
"""
from graphforge.parser.parser import parse_cypher

try:
ast = parse_cypher(query)
result = _ast_to_dict(ast)
return result if isinstance(result, dict) else {"_root": result}, None
except Exception as exc:
return None, exc


def python_clause_types(query: str) -> list[str] | None:
"""Return Python clause type names for a query, or None on error."""
from graphforge.parser.parser import parse_cypher

try:
ast = parse_cypher(query)
if hasattr(ast, "clauses"):
return [type(c).__name__ for c in ast.clauses]
# UnionQuery or other top-level wrapper
return [type(ast).__name__]
except Exception:
return None


# ---------------------------------------------------------------------------
# Rust parser
# ---------------------------------------------------------------------------


def rust_parse(
query: str,
binary: Path,
) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
"""Run the Rust parse_json binary.

Returns ``(ast_dict, None)`` on success or ``(None, error_dict)`` on error.
"""
try:
result = subprocess.run(
[str(binary), query],
capture_output=True,
text=True,
timeout=10,
check=False,
)
if result.returncode == 0:
return json.loads(result.stdout), None
else:
try:
err = json.loads(result.stderr)
except json.JSONDecodeError:
err = {"error": True, "message": result.stderr.strip()}
return None, err
except subprocess.TimeoutExpired:
return None, {"error": True, "message": "timeout"}


def rust_clause_types(query: str, binary: Path) -> list[str] | None:
"""Return Rust clause variant names for a query, or None on error."""
ast, err = rust_parse(query, binary)
if err is not None or ast is None:
return None
return [list(c.keys())[0] for c in ast.get("clauses", [])]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Return contracts violate the Python result-type guideline.

These functions return dict/list/tuple contracts instead of pyarrow.Table, which conflicts with the repository’s Python result contract rule.

As per coding guidelines, "In Python code, all methods that return results must return Apache Arrow Tables, never CypherValue wrappers or custom result types" and "Use pyarrow.Table for all result contracts; consume with pandas/polars via .to_pandas() / pl.from_arrow() or with .to_pylist() for list[dict] format".

Also applies to: 151-230

🧰 Tools
🪛 Ruff (0.15.14)

[warning] 78-78: Do not catch blind exception: Exception

(BLE001)


[warning] 92-92: Do not catch blind exception: Exception

(BLE001)


[error] 110-110: subprocess call: check for execution of untrusted input

(S603)


[warning] 134-134: Prefer next(iter(c.keys())) over single element slice

Replace with next(iter(c.keys()))

(RUF015)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/harness.py` around lines 67 - 134, The functions
python_parse, python_clause_types, rust_parse and rust_clause_types currently
return dict/list/tuple or None; change them to always return pyarrow.Table
results per the project guideline: import pyarrow as pa and convert successful
outputs (ast dict or list of clause names) into a pa.Table (e.g.,
pa.table({"result": [json_string_or_struct]}) or a structured table matching the
expected consumer shape), and convert errors into a pa.Table with consistent
error columns (e.g., {"error": [True], "message": [msg]}). Update the return
shapes of python_parse and rust_parse so they return a single pa.Table (never
None) and adjust callers to consume via .to_pandas() / pl.from_arrow() /
.to_pylist() as required; locate and modify the bodies of python_parse,
python_clause_types, rust_parse and rust_clause_types to perform these
conversions.

Comment on lines +101 to +127
def rust_parse(
query: str,
binary: Path,
) -> tuple[dict[str, Any] | None, dict[str, Any] | None]:
"""Run the Rust parse_json binary.

Returns ``(ast_dict, None)`` on success or ``(None, error_dict)`` on error.
"""
try:
result = subprocess.run(
[str(binary), query],
capture_output=True,
text=True,
timeout=10,
check=False,
)
if result.returncode == 0:
return json.loads(result.stdout), None
else:
try:
err = json.loads(result.stderr)
except json.JSONDecodeError:
err = {"error": True, "message": result.stderr.strip()}
return None, err
except subprocess.TimeoutExpired:
return None, {"error": True, "message": "timeout"}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Invalid-query parity currently misses error-kind and location equivalence.

rust_parse returns structured error details, but downstream checks only assert “both reject.” Add helpers that compare error kind and span parity between parsers.

🧰 Tools
🪛 Ruff (0.15.14)

[error] 110-110: subprocess call: check for execution of untrusted input

(S603)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/harness.py` around lines 101 - 127, The parity test
currently only checks that both parsers reject but not whether they report the
same error kind and span; add helper functions (e.g.,
compare_error_kind(expected_err, actual_err) and
compare_error_span(expected_err, actual_err)) and use them in the parity
assertions to compare the parsed error dicts from rust_parse and the reference
parser. Ensure rust_parse continues to return a structured error dict (with
'kind' and 'span' when present) or normalize missing fields into a consistent
shape before comparison, have the helpers tolerate absent fields by failing only
on real mismatches, and call these helpers from the existing parity test paths
where return (None, err) is observed.

Comment on lines +191 to +229
def compare_clause_counts(
py_ast: dict[str, Any],
rs_ast: dict[str, Any],
) -> tuple[bool, str]:
"""Return (match, message) comparing clause counts.

Python modifier clauses (WHERE, ORDER BY, SKIP, LIMIT) are excluded
because Rust inlines them into the containing clause struct.
"""
py_seq = python_clause_sequence(py_ast)
rs_seq = rust_clause_sequence(rs_ast)
if len(py_seq) == len(rs_seq):
return True, f"clause count: {len(py_seq)}"
return (
False,
f"clause count mismatch: Python(filtered)={len(py_seq)} Rust={len(rs_seq)}",
)


def compare_clause_types(
py_ast: dict[str, Any],
rs_ast: dict[str, Any],
) -> tuple[bool, str]:
"""Return (match, message) comparing clause type sequences.

Maps Python clause names → Rust variant names using PYTHON_TO_RUST_CLAUSE,
then compares the resulting sequences.
"""
py_types_raw = python_clause_sequence(py_ast)
rs_types = rust_clause_sequence(rs_ast)

py_types_mapped = [PYTHON_TO_RUST_CLAUSE.get(t, t) for t in py_types_raw]

if py_types_mapped == rs_types:
return True, f"clause types match: {rs_types}"
return (
False,
f"clause type mismatch:\n Python (mapped): {py_types_mapped}\n Rust: {rs_types}",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Structural parity check is too shallow for the stated parity contract.

This only compares clause count/type sequences, so non-clause AST divergences and span mismatches can pass undetected. The harness should compare normalized full AST JSON field-by-field.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/harness.py` around lines 191 - 229, The current parity
checks in compare_clause_counts and compare_clause_types only validate clause
sequences and miss non-clause field and span mismatches; add a full normalized
AST comparison: implement a normalize_ast(ast: dict) helper (used for both
Python and Rust ASTs) that strips span/location-only fields and maps Python
names to Rust variants using PYTHON_TO_RUST_CLAUSE and the existing
python_clause_sequence/rust_clause_sequence helpers, then perform a recursive,
field-by-field deep equality check (or produce a structured diff) between
normalize_ast(py_ast) and normalize_ast(rs_ast); update the test harness to call
this new comparison (e.g., in a new compare_full_ast function invoked alongside
or instead of compare_clause_counts/compare_clause_types) and return clear
mismatch messages when any non-clause fields diverge.

Comment on lines +41 to +50
def parse_json_binary() -> Path:
"""Build the parse_json Rust binary (once per session)."""
subprocess.run(
["cargo", "build", "--bin", "parse_json", "-p", "gf-cypher"],
check=True,
capture_output=True,
)
assert _BINARY_PATH.exists(), f"Binary not found at {_BINARY_PATH}"
return _BINARY_PATH

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Fixture return type violates Python result contract guideline.

parse_json_binary() returns Path; per repository rule, Python methods returning results should return pyarrow.Table.

As per coding guidelines, "In Python code, all methods that return results must return Apache Arrow Tables, never CypherValue wrappers or custom result types" and "Use pyarrow.Table for all result contracts; consume with pandas/polars via .to_pandas() / pl.from_arrow() or with .to_pylist() for list[dict] format".

🧰 Tools
🪛 Ruff (0.15.14)

[error] 44-44: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/test_parity.py` around lines 41 - 50, The fixture
parse_json_binary currently returns a Path (_BINARY_PATH) which violates the
repo result contract; change its return type to a pyarrow.Table and return a
table containing the binary path (e.g., pa.table({"path":[str(_BINARY_PATH)]})),
update the function signature to -> pa.Table and ensure pyarrow is imported as
pa; keep the build steps (cargo build) and the existence assert, then convert
the resulting _BINARY_PATH into a pyarrow.Table before returning.

Comment on lines +85 to +94
@pytest.mark.parity
@pytest.mark.parametrize("query", INVALID_QUERIES, ids=lambda q: repr(q[:40]))
def test_invalid_query_both_reject(query: str, parse_json_binary: Path) -> None:
"""Both parsers must reject syntactically invalid queries."""
_py_ast, py_err = python_parse(query)
_rs_ast, rs_err = rust_parse(query, parse_json_binary)

assert py_err is not None, f"Python parser accepted an invalid query: {query!r}"
assert rs_err is not None, f"Rust parser accepted an invalid query: {query!r}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Invalid-query assertions should verify error semantics, not only rejection.

As written, different error kinds/locations still pass. Add assertions for matched error classification and source location parity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/test_parity.py` around lines 85 - 94, Update
test_invalid_query_both_reject to assert error semantics and locations, not just
non-None; after calling python_parse and rust_parse (functions referenced in the
test) compare error classifications (e.g., error.kind or error.type returned by
python_parse and rust_parse) and compare source locations (e.g., error.position,
error.line/column or error.span) for parity across both parsers for each query
in INVALID_QUERIES using parse_json_binary as before; if the error objects
differ in shape, normalize them to a common structure (classification and
location) before asserting equality so the test fails when parsers disagree on
error kind or reported location.

Comment on lines +101 to +135
@pytest.mark.parity
@pytest.mark.parametrize("query", UNION_QUERIES, ids=lambda q: q[:60])
def test_union_query_both_succeed(query: str, parse_json_binary: Path) -> None:
"""Both parsers must succeed on UNION queries.

Structural comparison is skipped because the two parsers represent UNION
differently: Rust emits ``[Return, Union, Return]`` clauses while Python
wraps the whole query in a ``UnionQuery`` node.
"""
_py_ast, py_err = python_parse(query)
_rs_ast, rs_err = rust_parse(query, parse_json_binary)

assert py_err is None, f"Python parser rejected UNION query: {query!r}\n error: {py_err}"
assert rs_err is None, f"Rust parser rejected UNION query: {query!r}\n error: {rs_err}"


# ---------------------------------------------------------------------------
# Known-difference tests — document intentional parser divergences
# ---------------------------------------------------------------------------


@pytest.mark.parity
@pytest.mark.parametrize("query", KNOWN_DIFFERENCES, ids=lambda q: repr(q[:50]))
def test_known_differences_dont_panic(query: str, parse_json_binary: Path) -> None:
"""Known-difference queries: at least one parser must not crash/hang.

These queries expose intentional design differences between the two
parsers (e.g. different handling of WHERE as a top-level clause, empty
input, regex operator, label predicates). The test only asserts that
invoking both parsers does not raise an unhandled exception — it does
not assert agreement on validity or structure.
"""
# Neither call should raise — errors are returned, not raised
python_parse(query)
rust_parse(query, parse_json_binary)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

UNION/known-difference tests weaken parity guarantees below the stated objective.

These paths explicitly bypass structural agreement and allow tolerated divergences, which conflicts with a strict differential parity gate.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/parser_parity/test_parity.py` around lines 101 - 135, The parity marker
is being applied to tests that explicitly bypass structural agreement
(test_union_query_both_succeed and test_known_differences_dont_panic using
UNION_QUERIES and KNOWN_DIFFERENCES), weakening the parity gate; remove or
replace pytest.mark.parity on these tests and instead mark them with a dedicated
marker (e.g., pytest.mark.known_difference or pytest.mark.union_only) so the
true parity suite only contains tests that assert structural equivalence, and
update any CI/test-group selection to use the new marker; locate the annotations
on test_union_query_both_succeed, test_known_differences_dont_panic, and the
UNION_QUERIES / KNOWN_DIFFERENCES parameterization to make the change.

@DecisionNerd
Copy link
Copy Markdown
Owner Author

CodeRabbit Autofix Review Complete

Reviewed 6 CodeRabbit feedback item(s) and did not apply code changes in this run.

All 6 findings deferred:

@DecisionNerd DecisionNerd merged commit 276ca5d into main May 30, 2026
42 checks passed
@DecisionNerd DecisionNerd deleted the feature/555-parser-parity branch May 30, 2026 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

rust: build parser parity corpus and differential test harness

1 participant