feat(explain): implement explain API for AST stage#660
Conversation
- gf-cypher: add `explain(cypher)` free function — parses input and
returns "AST\n---\n{json}\n\nGraphIR\n-------\n(stub)" so all three
BDD scenarios pass (non-empty, contains "AST", contains "GraphIR")
- gf-cypher: promote serde_json from dev-dep to regular dep
- gf-core: add ExplainStage enum (Ast, BoundAst, GraphIr, LogicalPlan,
PhysicalPlan) for future multi-stage API; GraphForge::explain() stub
directs callers to gf_cypher::explain until engine facade is wired
- gf-core: add gf-cypher as dev-dep for BDD tests; update when_explain
step to call gf_cypher::explain; implement then_nonempty_string and
then_result_contains_text assertion steps
- src/graphforge/api.py: add explain(query, *, stage="ast") to main
Python GraphForge class; stage default "ast" returns parsed AST as JSON
- 6 new unit tests in gf-cypher covering all explain exit criteria
Closes #556
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an AST-stage explain feature: gf-cypher::explain() returns pretty JSON AST; gf-core documents ExplainStage and redirects the facade; BDD steps updated to call the Rust explain; Python GraphForge.explain() exposes the AST output and unit tests validate behavior. ChangesExplain API for AST stage
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #660 +/- ##
==========================================
+ Coverage 84.09% 84.10% +0.01%
==========================================
Files 49 49
Lines 12921 12942 +21
Branches 3628 3632 +4
==========================================
+ Hits 10866 10885 +19
- Misses 1270 1272 +2
Partials 785 785
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-core/src/lib.rs`:
- Around line 375-388: The doc comment for the method explain() is inaccurate:
it claims the method "always returns the AST stage as pretty-printed JSON" and
that gf_cypher's ExplainStage API is re-exported, but the implementation returns
Err(GfError::NotImplemented(...)) and gf_cypher is only a dev-dependency. Update
the doc comment on pub fn explain(&self, _cypher: &str) to state that this is a
stub that currently returns GfError::NotImplemented, remove or correct the claim
about re-exporting ExplainStage/gf_cypher (note gf_cypher is a dev-dependency
and not available to normal consumers), and instruct callers to call
gf_cypher::explain directly until the facade is implemented.
In `@crates/gf-core/tests/bdd/api_steps.rs`:
- Around line 743-758: The generic assertion step then_nonempty_string (and the
other then_* steps around lines 761-779) currently panics when world.last_result
is None; instead, if world.last_error indicates a pending step (e.g., equals or
contains "not implemented") return early without panicking. Update
then_nonempty_string and the sibling then_* helper functions to check
world.last_result first and, if None, inspect world.last_error for the "not
implemented" marker and simply return; only assert/fail when last_result is
missing and last_error is not the pending sentinel. Use the existing symbols
world.last_result and world.last_error and apply the same change to the other
then_* assertion functions mentioned.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e48b0a4a-f7fe-46bb-866a-dbaaa5de7a82
📒 Files selected for processing (6)
crates/gf-core/Cargo.tomlcrates/gf-core/src/lib.rscrates/gf-core/tests/bdd/api_steps.rscrates/gf-cypher/Cargo.tomlcrates/gf-cypher/src/lib.rssrc/graphforge/api.py
- api_steps.rs: then_nonempty_string / then_result_contains_text skip gracefully when last_error contains 'not implemented' (pending step) - gf-core lib.rs: correct doc comment on explain() stub — it returns NotImplemented, not JSON; gf-cypher is dev-dep not re-exported - api.py: restructure explain() try/except so repr() fallback is not double-encoded by json.dumps Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes Applied SuccessfullyFixed 3 file(s) based on 3 CodeRabbit feedback item(s). Files modified:
Commit: The latest autofix changes are on the |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/graphforge/api.py`:
- Around line 1580-1584: The current AST serialization in explain(stage="ast")
fails for lists and nested Pydantic models because parser.parse(query) can
return list[CypherQuery] and CypherQuery.clauses can contain Pydantic BaseModel
instances; instead of calling dataclasses.asdict(ast) directly, normalize the
AST to only built-in types by: detect if ast is a list and map each element
through a converter; for each element, recursively convert dataclasses (use
dataclasses.is_dataclass and dataclasses.asdict) and Pydantic models (detect
pydantic.BaseModel and call .dict() or .model_dump() depending on version), and
leave primitives/containers intact; finally call json.dumps on the normalized
structure (with indent and default=str as fallback) and return that string.
Ensure you reference parser.parse, CypherQuery, dataclasses.asdict, and
pydantic.BaseModel when locating the code to modify.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c01bbc62-316c-44d8-b9f9-9f4a9a4f826f
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock,!**/*.lock
📒 Files selected for processing (3)
crates/gf-core/src/lib.rscrates/gf-core/tests/bdd/api_steps.rssrc/graphforge/api.py
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/gf-core/tests/bdd/api_steps.rs
- crates/gf-core/src/lib.rs
- api.py: narrow CypherQuery | list[CypherQuery] union before calling dataclasses.asdict — fixes mypy arg-type error on line 1581 - test_api_coverage.py: add TestExplainAst with 4 tests covering explain() happy path, unknown stage, and post-close guard Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/unit/api/test_api_coverage.py (1)
287-295: ⚡ Quick winTest contains dead code and doesn't validate JSON as its name/docstring claim.
import jsonis unused andjson_partis computed but never referenced. Alsolstrip("AST\n-")strips a character set, not the"AST\n---\n"prefix, so it wouldn't do what it looks like. The only effective check is"{" in result, which is much weaker than "contains valid JSON". Parse the body and assert structure instead.♻️ Proposed fix to actually validate the JSON payload
def test_explain_output_contains_json(self): """explain() output contains valid JSON after the header.""" import json gf = GraphForge() result = gf.explain("RETURN 1") - json_part = result.split("\n\n")[0].lstrip("AST\n-") - # just verify JSON is somewhere in there - assert "{" in result + # Strip the "AST\n---\n" header, then assert the remainder is valid JSON. + header, _, body = result.partition("\n---\n") + assert header == "AST" + json.loads(body) # raises if the body is not valid JSON🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/api/test_api_coverage.py` around lines 287 - 295, The test_explain_output_contains_json has dead code and doesn't actually validate JSON; update the test to remove the unused import and instead extract the JSON payload produced by GraphForge.explain by splitting off the header (e.g. take the content after the header separator rather than using lstrip) then parse that substring with json.loads and assert the resulting object shape (e.g. is a dict or contains expected keys); update references to json_part and the assertion so the test truly verifies valid JSON rather than just checking for "{".
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/unit/api/test_api_coverage.py`:
- Around line 303-310: The test test_explain_after_close_raises currently
expects LifecycleError but GraphForge.explain calls _ensure_open which raises
RuntimeError("GraphForge instance has been closed"); update the test to expect
pytest.raises(RuntimeError) instead of LifecycleError and remove the unused
import of LifecycleError so the assertion matches
GraphForge.explain/_ensure_open behavior.
---
Nitpick comments:
In `@tests/unit/api/test_api_coverage.py`:
- Around line 287-295: The test_explain_output_contains_json has dead code and
doesn't actually validate JSON; update the test to remove the unused import and
instead extract the JSON payload produced by GraphForge.explain by splitting off
the header (e.g. take the content after the header separator rather than using
lstrip) then parse that substring with json.loads and assert the resulting
object shape (e.g. is a dict or contains expected keys); update references to
json_part and the assertion so the test truly verifies valid JSON rather than
just checking for "{".
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5f3ab4a4-5a85-4b53-a8c9-e30c213d7b92
📒 Files selected for processing (2)
src/graphforge/api.pytests/unit/api/test_api_coverage.py
🚧 Files skipped from review as they are similar to previous changes (1)
- src/graphforge/api.py
- api.py: add _ast_to_dict() helper that recursively handles @DataClass and Pydantic BaseModel nodes; replace dataclasses.asdict()+isinstance split with a single _ast_to_dict(ast) call so nested clause BaseModel instances are serialized via .model_dump() rather than deep-copied - test_api_coverage.py: fix test_explain_after_close_raises to expect RuntimeError (what _ensure_open actually raises) not LifecycleError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fixes Applied SuccessfullyFixed 2 file(s) based on 2 CodeRabbit feedback item(s). Files modified:
Commit: The latest autofix changes are on the |
- api.py: remove # type: ignore[arg-type] on dataclasses.asdict — mypy now accepts the narrowed DataclassInstance type without the suppress - test_api_coverage.py: remove unused `import json` and `json_part` local from test_explain_output_contains_json Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/graphforge/api.py (1)
116-137: ⚡ Quick winPrefer
mode="json"inmodel_dump()for explicit JSON serialization.Line 131 uses
model_dump(mode="python"), which returns Python objects (e.g.,datetimeinstances,Enummembers). Since the final step isjson.dumps()on line 1604, usingmode="json"would ensure all values are already JSON-serializable and avoid reliance on thedefault=strfallback for complex Python types.📝 Suggested change
if isinstance(node, BaseModel): - return {k: _ast_to_dict(v) for k, v in node.model_dump(mode="python").items()} + return {k: _ast_to_dict(v) for k, v in node.model_dump(mode="json").items()}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/graphforge/api.py` around lines 116 - 137, In _ast_to_dict, when handling Pydantic BaseModel instances (the isinstance(node, BaseModel) branch), change the call model_dump(mode="python") to model_dump(mode="json") so the Pydantic values are pre-converted to JSON-serialisable types before later json.dumps; update any assumptions that rely on Python-mode types (e.g., datetimes or Enums) being stringified later and ensure _ast_to_dict continues to recursively process the returned structure from model_dump("json").
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/graphforge/api.py`:
- Around line 116-137: In _ast_to_dict, when handling Pydantic BaseModel
instances (the isinstance(node, BaseModel) branch), change the call
model_dump(mode="python") to model_dump(mode="json") so the Pydantic values are
pre-converted to JSON-serialisable types before later json.dumps; update any
assumptions that rely on Python-mode types (e.g., datetimes or Enums) being
stringified later and ensure _ast_to_dict continues to recursively process the
returned structure from model_dump("json").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7e12fd38-ad57-4f36-acde-2c5c085798dd
📒 Files selected for processing (2)
src/graphforge/api.pytests/unit/api/test_api_coverage.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/unit/api/test_api_coverage.py
Closes #556
Part of v0.5.0: Compiler Skeleton milestone.
Summary
gf-cypher:gf_cypher::explain(cypher)— parses input and returns a multi-section string with AST JSON + GraphIR stub section.serde_jsonpromoted from dev-dep to regular dep.gf-core:ExplainStageenum (Ast,BoundAst,GraphIr,LogicalPlan,PhysicalPlan) added for future multi-stage API.GraphForge::explain()stub updated with correct documentation pointing togf_cypher::explain.when_explainstep updated to callgf_cypher::explain;then_nonempty_stringandthen_result_contains_textassertion steps implemented (were pending stubs).explain(query, *, stage="ast")added toGraphForgeinapi.py; returns"AST\n---\n{json}"using the Python parser's AST.Why
gf-cyphernotgf-coregf-corecannot depend ongf-cypher— that creates a cycle (gf-ast→gf-core→gf-cypher→gf-ast). Theexplainfunction lives ingf-cypherwhere it has access to both the parser andgf-asttypes. Thegf-corefacade'sexplain()stub is documented to point callers togf_cypher::explainuntil the engine facade is fully wired in a later milestone.Test plan
tests/features/api/explain.featurenow passgf-cyphercovering: non-empty output,"AST"header, valid JSON body,"clauses"key present, parse error propagation, write-query safetycargo test --workspace— all tests passcargo clippy --workspace -- -D warnings— clean🤖 Generated with Claude Code
Note
Add
explainAPI for the AST stage to the Cypher parser and Python facadeexplain(cypher: &str)togf-cypher(lib.rs) that parses a Cypher query, pretty-prints the AST as JSON, and returns a multi-section string withASTand placeholderGraphIRsections.GraphForge.explain(query, stage='ast')to the Python API (api.py), converting the parsed AST to indented JSON prefixed with anASTheader; raisesNotImplementedErrorfor unsupported stages.gf_cypher::explaindirectly rather than the engine facade, which still returnsNotImplemented.ExplainStageenum ingf-core(lib.rs) with variants for future pipeline stages (Ast,BoundAst,GraphIr,LogicalPlan,PhysicalPlan).Macroscope summarized 84f692a.
Summary by CodeRabbit
New Features
Tests