feat(ast): Token enum, ParseError, and full AST node types#635
Conversation
gf-ast now contains: - Token enum with all openCypher token classes (keywords, literals, operators, punctuation, trivia) — frozen lexer ABI - Keyword enum covering all reserved words - ParseError / ParseErrorKind with structured error kinds and span info - Complete AST: AstQuery, AstClause (12 variants), NodePattern, RelPattern, PathPattern, Expr (15 variants), plus all supporting structs (BinaryOp, FunctionCall, CaseExpr, ListComprehension, etc.) - All types are Clone + Debug + PartialEq + Serialize + Deserialize - 47 unit tests covering span extraction, JSON round-trips, and all major node types gf-cypher updated: parse() now returns ParseError (not GfError) to align with the frozen ABI; stub returns UnexpectedEof until #554 lands. Also fixes pre-existing CI failures on main: - api_v05.py: add directory check in __init__; fix whitespace-only query validation in find(); fix mypy arg-type error in add_edges - tests/features/steps/api_steps.py: fix RUF012 mutable class defaults - Reformat conftest.py, api_steps.py to satisfy ruff format Closes #552, #553 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughThis PR introduces a complete OpenCypher AST module (gf-ast) with span-rich, JSON-serializable data structures for Cypher parsing and interop. It defines AstQuery (dialect + ordered clauses), 12 AstClause variants, detailed pattern and expression trees, along with Token/Keyword and ParseError types for lexer/parser ABI contracts. Comprehensive round-trip tests validate all node types and clauses. Updates to gf-cypher integrate the new error contract; formatting refactors improve test infrastructure and Python API logic. ChangesAST & Parser Contract Implementation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #635 +/- ##
=======================================
Coverage 84.08% 84.09%
=======================================
Files 49 49
Lines 12915 12921 +6
Branches 3627 3628 +1
=======================================
+ Hits 10860 10866 +6
Misses 1270 1270
Partials 785 785
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report in Codecov by Sentry.
|
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/gf-ast/src/tests.rs (1)
7-7: 💤 Low valueOptional: Remove redundant crate import.
In Rust 2018+, you can use
serde_json::to_stringdirectly withoutuse serde_json;. The import is harmless but unnecessary.♻️ Proposed cleanup
use crate::{ ast::*, parse_error::*, token::*, Span, }; -use serde_json;Then use
serde_json::to_stringandserde_json::from_strthroughout.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-ast/src/tests.rs` at line 7, The tests include an unnecessary crate import "use serde_json;"—remove that redundant use statement and update call sites (e.g., any existing to_string/from_str usage) to call the functions with the full path serde_json::to_string and serde_json::from_str where needed; this cleans up imports while keeping behavior the same.crates/gf-cypher/src/lib.rs (1)
15-23: 💤 Low valueRemove redundant import alias.
Spanis already re-exported fromgf_coreat line 7. The aliasSpan as AstSpanon line 16 is unnecessary sincegf_ast::Spanis the same type.♻️ Proposed simplification
pub fn parse(_input: &str) -> Result<AstQuery, ParseError> { - use gf_ast::{ParseErrorKind, Span as AstSpan}; + use gf_ast::ParseErrorKind; Err(ParseError::new( ParseErrorKind::UnexpectedEof { expected: vec!["statement".to_owned()], }, - AstSpan::new(0, 0), + Span::new(0, 0), "parser not yet implemented — lands in M9 issue `#554`", ))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gf-cypher/src/lib.rs` around lines 15 - 23, Remove the unnecessary alias for Span in the parse function import and update its usage: replace the import gf_ast::{ParseErrorKind, Span as AstSpan} with gf_ast::{ParseErrorKind, Span} and change the call AstSpan::new(0, 0) to Span::new(0, 0) so the code uses the re-exported Span type directly in function parse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gf-ast/src/parse_error.rs`:
- Around line 33-42: The file contains a duplicated doc comment for the
ParseError type (the two identical blocks mentioning `ParseError`,
`gf_cypher::parse`, and the `span` and `kind` fields); remove the second
duplicated comment block so only one doc comment describing
`ParseError`/`gf_cypher::parse`/`span` and `kind` remains (locate the duplicate
by searching for the repeated text mentioning `ParseError` and
`gf_cypher::parse` and delete the latter occurrence).
---
Nitpick comments:
In `@crates/gf-ast/src/tests.rs`:
- Line 7: The tests include an unnecessary crate import "use serde_json;"—remove
that redundant use statement and update call sites (e.g., any existing
to_string/from_str usage) to call the functions with the full path
serde_json::to_string and serde_json::from_str where needed; this cleans up
imports while keeping behavior the same.
In `@crates/gf-cypher/src/lib.rs`:
- Around line 15-23: Remove the unnecessary alias for Span in the parse function
import and update its usage: replace the import gf_ast::{ParseErrorKind, Span as
AstSpan} with gf_ast::{ParseErrorKind, Span} and change the call AstSpan::new(0,
0) to Span::new(0, 0) so the code uses the re-exported Span type directly in
function parse.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: abc6ca59-b5b5-447d-873b-06319790515a
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock,!**/*.lock
📒 Files selected for processing (14)
crates/gf-ast/Cargo.tomlcrates/gf-ast/src/ast.rscrates/gf-ast/src/lib.rscrates/gf-ast/src/parse_error.rscrates/gf-ast/src/tests.rscrates/gf-ast/src/token.rscrates/gf-core/src/lib.rscrates/gf-core/tests/bdd/api_steps.rscrates/gf-cypher/src/lib.rscrates/gf-exec/src/lib.rssrc/graphforge/api_v05.pytests/features/conftest.pytests/features/steps/api_steps.pytests/features/test_api_bdd.py
💤 Files with no reviewable changes (1)
- tests/features/test_api_bdd.py
| /// A structured parse error produced by `gf-cypher`. | ||
| /// | ||
| /// `ParseError` is the error type in the `Result` returned by | ||
| /// `gf_cypher::parse`. The differential test harness uses the `span` and | ||
| /// `kind` fields to assert that both parsers flag the same source location. | ||
| /// A structured parse error produced by `gf-cypher`. | ||
| /// | ||
| /// `ParseError` is the error type in the `Result` returned by | ||
| /// `gf_cypher::parse`. The differential test harness uses the `span` and | ||
| /// `kind` fields to assert that both parsers flag the same source location. |
There was a problem hiding this comment.
Remove duplicate doc comment.
Lines 33–37 and 38–42 are identical. Remove the second block (lines 38–42).
📝 Proposed fix
/// A structured parse error produced by `gf-cypher`.
///
/// `ParseError` is the error type in the `Result` returned by
/// `gf_cypher::parse`. The differential test harness uses the `span` and
/// `kind` fields to assert that both parsers flag the same source location.
-/// A structured parse error produced by `gf-cypher`.
-///
-/// `ParseError` is the error type in the `Result` returned by
-/// `gf_cypher::parse`. The differential test harness uses the `span` and
-/// `kind` fields to assert that both parsers flag the same source location.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /// A structured parse error produced by `gf-cypher`. | |
| /// | |
| /// `ParseError` is the error type in the `Result` returned by | |
| /// `gf_cypher::parse`. The differential test harness uses the `span` and | |
| /// `kind` fields to assert that both parsers flag the same source location. | |
| /// A structured parse error produced by `gf-cypher`. | |
| /// | |
| /// `ParseError` is the error type in the `Result` returned by | |
| /// `gf_cypher::parse`. The differential test harness uses the `span` and | |
| /// `kind` fields to assert that both parsers flag the same source location. | |
| /// A structured parse error produced by `gf-cypher`. | |
| /// | |
| /// `ParseError` is the error type in the `Result` returned by | |
| /// `gf_cypher::parse`. The differential test harness uses the `span` and | |
| /// `kind` fields to assert that both parsers flag the same source location. | |
| #[derive(Debug, Clone, PartialEq, Serialize, Deserialize)] |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gf-ast/src/parse_error.rs` around lines 33 - 42, The file contains a
duplicated doc comment for the ParseError type (the two identical blocks
mentioning `ParseError`, `gf_cypher::parse`, and the `span` and `kind` fields);
remove the second duplicated comment block so only one doc comment describing
`ParseError`/`gf_cypher::parse`/`span` and `kind` remains (locate the duplicate
by searching for the repeated text mentioning `ParseError` and
`gf_cypher::parse` and delete the latter occurrence).
Summary
Tokenenum with all openCypher token classes,Keywordenum,ParseError/ParseErrorKindwith structured span info — allClone + Debug + PartialEq + Serialize + DeserializeAstQuery,AstClause(12 variants),NodePattern,RelPattern,PathPattern,Expr(15 variants),BinaryOp,FunctionCall,CaseExpr,ListComprehension,PatternComprehension, and all supporting typesgf-cypher::parse()updated to returnParseError(notGfError) aligned with the frozen ABIapi_v05.py: missing directory check, whitespace-only query validation, mypy arg-type errorapi_steps.py: RUF012 mutable class defaultsconftest.py: ruff formatTest plan
cargo test --package gf-ast— 47 tests passcargo clippy --workspace -- -D warnings— cleanuv run ruff check .— cleanuv run mypy src/graphforge— cleanuv run pytest tests/unit— 3672 passCloses #552, #553
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Bug Fixes / Behavior
Tests
Note
Add full AST node types, Token enum, and ParseError to gf-ast crate
AstQuery, all clause types (MatchClause,ReturnClause,MergeClause, etc.), expression types (Expr,Literal,BinaryOp, etc.), and path/pattern nodes.Tokenenum covering keywords, literals, punctuation, trivia, and EOF, plusKeywordand helper methodsspan()/is_trivia().ParseErrorandParseErrorKind, implementingDisplayandstd::error::Error.AstQuerystruct.gf-cypher'sparse()to returnResult<AstQuery, ParseError>(stub that always returnsUnexpectedEof) instead ofGfError.AstQuery { source, span }struct is removed from the public API, which is a breaking change for any existing consumers.Macroscope summarized 6944056.