Support pattern expressions as boolean expressions#2360
Conversation
There was a problem hiding this comment.
Pull request overview
Adds openCypher-compatible “pattern expressions” as boolean predicates in WHERE by resolving the (a) ambiguity (parenthesized expression vs node pattern) using Bison GLR parsing.
Changes:
- Switch
cypher_gram.yto%glr-parser, add%dprecannotations, and introducemake_exists_pattern_sublink()to shareEXISTS(pattern)wrapping logic. - Allow
anonymous_pathto appear as anexpr_atom, translating bare patterns into anEXISTSsublink boolean. - Add regression coverage for pattern expressions (
regress/sql+regress/expected) and register the new test inREGRESS.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/backend/parser/cypher_gram.y | Enables bare pattern parsing in expression context via GLR + %dprec, refactors EXISTS wrapping into a helper. |
| regress/sql/pattern_expression.sql | New regression test cases covering pattern expressions in WHERE and boolean combinations. |
| regress/expected/pattern_expression.out | Expected output for the new regression test. |
| Makefile | Adds pattern_expression to the regression test run list. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@gregfelice Please see the above comments by Copilot |
|
Addressed all 3 Copilot suggestions:
Regression test passes ( |
|
@gregfelice Did you push them? |
|
@jrgemignani Pushed now — sorry about that! All 3 Copilot suggestions addressed in commit 5d11080. And noted on the workflow — I'll reply directly to Copilot's comments going forward instead of posting standalone. Thanks for the heads up. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@gregfelice Please see Copilot above |
|
I'm on holiday this week - will get to next week - Thanks
…On Mon, Apr 6, 2026 at 6:17 PM John Gemignani ***@***.***> wrote:
*jrgemignani* left a comment (apache/age#2360)
<#2360 (comment)>
@gregfelice <https://github.com/gregfelice> Please see Copilot above
—
Reply to this email directly, view it on GitHub
<#2360 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABBXGAANBXB5W733GNLJAT4UPKCZAVCNFSM6AAAAACW6JESACVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHM2DCOJTGQZDCMRZHA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
--
Greg Felice
415.343.5227
***@***.***
linkedin.com/in/gregfelice
calendly.com/gregfelice/30min
|
|
@jrgemignani — both Copilot items addressed and threads resolved. The |
d05092d to
908adc6
Compare
|
Rebased onto current Local PG18-cassert CI on the new SHA is in |
…che#1577) Enable bare graph patterns as boolean expressions in WHERE clauses: MATCH (a:Person), (b:Person) WHERE (a)-[:KNOWS]->(b) -- now valid, equivalent to EXISTS(...) RETURN a.name, b.name Previously, this required wrapping in EXISTS(): WHERE EXISTS((a)-[:KNOWS]->(b)) The bare pattern syntax is standard openCypher and is used extensively in Neo4j. Its absence was the most frequently cited migration blocker. Implementation approach: - Switch the Cypher parser from LALR(1) to Bison GLR mode. GLR handles the inherent ambiguity between parenthesized expressions '(' expr ')' and graph path nodes '(' var_name label_opt props ')' by forking the parse stack and discarding the failing path. - Add anonymous_path as an expr_atom alternative with %dprec 1 (lower priority than expression path at %dprec 2). The action wraps the pattern in a cypher_sub_pattern + EXISTS SubLink, reusing the same transform_cypher_sub_pattern() machinery as explicit EXISTS(). - Extract make_exists_pattern_sublink() helper shared by both EXISTS(pattern) and bare pattern rules. - Fix YYLLOC_DEFAULT to use YYRHSLOC() for GLR compatibility. - %dprec annotations on expr_var/var_name_opt resolve the reduce/reduce conflict between expression variables and pattern node variables. Conflict budget: 7 shift/reduce (path extension vs arithmetic on -/<), 3 reduce/reduce (expr_var vs var_name_opt on )/}/=). All are expected and handled correctly by GLR forking + %dprec disambiguation. All 32 regression tests pass (31 existing + 1 new). New pattern_expression test covers: bare patterns, NOT patterns, labeled nodes, AND/OR combinations, left-directed patterns, anonymous nodes, multi-hop patterns, EXISTS() backward compatibility, and non-pattern expression regression checks. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Move "Helper function to create an ExplainStmt node" comment from above make_exists_pattern_sublink() to above make_explain_stmt() where it belongs. 2. Add block comment documenting the %expect/%expect-rr conflict budget: 7 S/R from path vs arithmetic on - and <, 3 R/R from expr_var vs var_name_opt on ) } =. 3. Clarify test comment: "Regular expressions" -> "Regular (non-pattern) expressions" to avoid confusion with regex. Regression test: pattern_expression OK. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Pattern expressions are now accepted anywhere an expr is valid (RETURN, WITH, SET, CASE, boolean combinations), not only WHERE. This matches openCypher semantics and documents the broader surface area that was already implicitly enabled by adding anonymous_path to expr_atom. Added regression tests for each new context: RETURN projection (bare and AS-aliased), mixed with other projections, CASE WHEN, boolean AND/OR combinators, SET to persist a computed boolean property, and WITH ... WHERE pipeline. - Remove the hardcoded `%expect 7` / `%expect-rr 3` conflict budget from cypher_gram.y. The exact conflict counts can drift across Bison versions and distros, which would break builds even though the grammar is correct (GLR handles the conflicts at runtime via fork + %dprec). Instead, pass -Wno-conflicts-sr / -Wno-conflicts-rr via BISONFLAGS in the Makefile so the build stays clean without binding us to a specific Bison release. Kept a block comment in the grammar explaining why GLR conflicts are expected and how they resolve. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
908adc6 to
61426cf
Compare
|
Rebased onto current Local PG18-cassert CI on the new SHA will sit in |
|
From Opus 4.7 - Deep review: PR #2360 — Pattern expressions as boolean expressionsI read the PR description and the actual diff ( What it does (correctly)The change is fundamentally sound:
Concerns worth raising1. Removing
|
jrgemignani
left a comment
There was a problem hiding this comment.
Please see the Opus 4.7 review
|
@gregfelice I'm being extra careful due to this modifying the grammar. |
Reverts the broad `-Werror` drop and the no-%expect approach from the prior round on jrgemignani's request. The earlier framing — that conflict counts drift across Bison versions, so %expect is fragile — overcorrected: it removed the only build-time alarm bell for unintended new conflicts. Makefile: keep -Werror so any unexpected Bison warning (unused rules, undeclared types, etc.) still fails the build; downgrade only the two conflict categories to plain warnings via -Wno-error=conflicts-sr -Wno-error=conflicts-rr. pgxs auto-adds -Wno-deprecated, so existing %name-prefix= / %pure-parser directives remain non-erroring. cypher_gram.y: add `%expect 7` and `%expect-rr 3` matching the Bison 3.8.2 totals. Bison treats %expect as exact-match, not as a ceiling — any deviation fails the build and forces an audit of the new conflicts. Comment updated to reflect that future Bison versions reporting different counts should bump the numbers explicitly with a version note in the commit message, rather than removing the directive. No grammar or runtime change. Cassert installcheck 34/34 AGE tests green.
|
@jrgemignani — thanks for the careful review. Addressed all three blocking asks in 1. Makefile 2. 3. Parse-time microbench — Phase 3.5 release-build perf preflight — built a user-owned PG18 release cluster (
Both deltas well below the 5% gate, and W01's LALR-side range (5.6%) is itself wider than the LALR-vs-GLR delta. No measurable parse-time regression on representative read-only or write Cypher. Cassert installcheck: 34/34 AGE tests green (pgvector skipped — I also updated the PR description to (a) note the Non-blocking follow-ups from the review — your call on bundling:
|
|
@gregfelice Ty. Just me - I prefer the builds to fail on anything, warnings included :) And grammars are tricky. |
|
@gregfelice I would add the non-blocking follow ups that you listed. Follow ups sometimes get forgotten. Otherwise, everything else looks good. |
Summary
Adds support for openCypher pattern expressions as general boolean expressions. Pattern expressions are now accepted anywhere a regular expression is valid — in
WHERE,RETURN,WITH,SET,CASE, and as operands of boolean combinators — not just inWHERE.Implementation
Grammar (
cypher_gram.y)%glr-parser) to handle the inherent ambiguity between parenthesized expressions and graph patterns. Both start with(, so a single-token lookahead cannot decide which production applies until several tokens later. GLR forks at the conflict point and discards the failing alternative.anonymous_pathtoexpr_atomwith a%dprec 1annotation. A bare(a)prefers the expression-variable interpretation (%dprec 2on'(' expr ')'), so single-node pattern expressions still resolve as plain variable references.make_exists_pattern_sublink()to wrap the pattern in anEXISTSsubquery — a bare pattern(a)-[:KNOWS]->(b)in expression context is semantically equivalent toEXISTS((a)-[:KNOWS]->(b)).make_explain_stmt()and placed it immediately above its docstring comment so the grammar file stays readable.(rhs)[1]withYYRHSLOC(rhs, 1)in theYYLLOC_DEFAULTmacro — a required GLR-correctness fix, not stylistic. GLR's rhs stack has a different layout than LALR(1)'s; the raw index lands on the wrong slot under GLR and produces incorrect source locations.%expect/%expect-rr. GLR produces 7 shift/reduce and 3 reduce/reduce conflicts (all arising from the(ambiguity between path_node and parenthesized expr, plus expr_var/var_name_opt overlap on)/}/=). These are handled correctly at runtime by GLR +%dprec. The grammar declares%expect 7and%expect-rr 3so any deviation up or down fails the build and forces an audit of the new conflicts. The Makefile keeps-Werrorand scopes the relaxation to-Wno-error=conflicts-sr -Wno-error=conflicts-rr, so other Bison warning categories (unused rules, undeclared types, etc.) still fail the build. Earlier rounds of this PR tried%expect-less variants for "Bison version portability," but Bison's%expectis exact-match — so the right answer is to pin the totals and update them deliberately if a future Bison reports different counts, not to remove the alarm.Parser (
cypher_analyze.c,cypher_clause.c)Files changed
src/backend/parser/cypher_gram.y%glr-parser,%dprecannotations,anonymous_pathinexpr_atom,make_exists_pattern_sublink(), GLR comment block explaining why conflicts are expectedMakefileBISONFLAGS += -Werror -Wno-error=conflicts-sr -Wno-error=conflicts-rr— keep-Werrorfor other warning categories, scope relaxation to the two known-noise conflict categories; registerpattern_expressionregression testregress/sql/pattern_expression.sqlregress/expected/pattern_expression.outTest plan
WHERE (a)-[:KNOWS]->(b)WHERE NOT (a)-[:KNOWS]->(:Person)WHERE (a)-[:KNOWS]->(:Person)ASalias → boolean columnCASE WHEN ... THEN ... ELSE ... ENDANDandORin RETURNSET a.flag = (a)-[:R]->(:L)WITHprojection + subsequentWHEREfilterRETURN (1 + 2),RETURN (n.name)— parenthesized expressions still work-Werrorwith%expect 7 %expect-rr 3pinned (Bison 3.8.2); any conflict-count drift now fails the build deliberately