Plan + PlanQuery + ExecutePlan + PlanCache (cacheable execution shape)#740
Plan + PlanQuery + ExecutePlan + PlanCache (cacheable execution shape)#740Nthalk wants to merge 17 commits into
Conversation
Adds a precomputed Plan tree per (schema, document, operationName) triple. PlanQuery does the once-per-query work that Execute otherwise repeats every request: identify the operation, collect fragments, walk the selection tree, resolve every selection's *FieldDefinition against the schema, pre-coerce literal arguments, and pre-compute @include/@Skip predicates whose `if` is constant. ExecutePlan walks the plan and only does work that's inherently per-request: variable substitution, abstract type runtime resolution, resolver invocation, and result materialization. Fields with all literal arguments skip getArgumentValues entirely; sub-selections are walked from the precomputed plan tree, never re-collected via collectFields. Plans are bound to the *Schema pointer they were planned against; schema rebuilds invalidate plans (caller's responsibility). Execute itself now flows through PlanQuery+ExecutePlan, so callers that don't yet hold a Plan see no behavior change. Callers that issue the same query repeatedly should hold the *Plan and pass it to ExecutePlan to skip the per-call planning work. Full upstream test suite passes unchanged, including: - TestMergesParallelFragments (multi-fragment same-key merging) - TestUnionInterfaceExecution (abstract type runtime resolution) - All TestExecutesResolveFunction* (literal + variable arg paths) - executor_resolve_test, executor_test, abstract_test
WideQuery_100_10: 2.54x faster, 49% fewer allocs, 39% less mem
ListQuery_1K: ~5% faster (per-item completion dominates;
amortizes parse cost across 1000 items)
Adds an arged variant of the wide schema (every field takes `value: String`) plus parametric benches: Planned + varied vars: 1.48 ms/op, 9.9k allocs Planned + static args: 1.48 ms/op, 9.9k allocs (same) Uncached + varied lits: 5.55 ms/op, 28.9k allocs 3.75x faster than literal-baked queries through graphql.Do, and varying the variable values per call has essentially zero overhead vs static args. Confirms that real clients should always use GraphQL variables: one cached *Plan handles arbitrary literal variations because the plan stores the variable AST and getArgumentValues runs at execute time only for variable-bearing args (literals are pre-coerced at plan time). Naive literal-baked-per-call clients pay parse + validate + plan + execute every request. The 3.75x gap is the upper bound on what an auto-de-var (literal-to-variable normalization) layer could recover for those clients.
PlanCache encapsulates parse + validate + plan + LRU cache so a
gateway's hot loop is just `cache.Get + ExecutePlan`. Two modes:
Normalize=false: cache key = raw query string. Different literal
values miss; one entry per distinct query text.
Normalize=true: walk the parsed AST, extract fully-literal field
arguments into synth variables (__pcv0, __pcv1,
...), key the cache by an FNV-1a fingerprint of
the normalized structural shape. Two queries
that differ only in literal values collapse to
one entry; the per-call literal values ride
along in PlanResult.SynthArgs and the caller
merges them into ExecutePlan's Args.
The fingerprint replaces a printer.Print key — at 100 fields the
print version cost ~5ms/op of normalize overhead and made the
normalized path slower than no cache at all. The hash drops that
to <1ms/op so the win is real.
Hot-loop benches (100-field wide schema, varied per call):
HotLoop_NativeVars (vars in query): 1.47 ms/op, 9.9k allocs
HotLoop_Normalized (auto-extract): 2.10 ms/op, 14.9k allocs
HotLoop_NoNorm (literal, no rewrite):5.21 ms/op, 28.9k allocs
Uncached graphql.Do: 2.99 ms/op, 17.7k allocs
Native vars: 3.55x faster than naive Do(). Normalized: 2.48x
faster than naive Do(), recoups ~70% of the optimal rate without
client changes.
Scope (first cut, intentional):
- Only field arguments are normalized (not directive args).
- Only fully-literal arg values are extracted; mixed-variable
structures (e.g. {a: 1, b: $foo}) stay literal.
- Abstract-typed sub-selections (Interface/Union returns) aren't
descended for arg normalization.
- Fragment definitions aren't rewritten — literals inside named
fragments stay as literals (still cached, just without
de-duplication across literal variants).
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds PlanQuery to build reusable Plans, ExecutePlan to run Plans per request, a concurrency-safe LRU PlanCache with optional normalization and per-call SynthArgs, benchutil helpers for arg-bearing wide schemas, benchmarks and unit tests, and refactors Execute to delegate to PlanQuery+ExecutePlan. ChangesQuery Planning and Caching System
Sequence Diagram(s)sequenceDiagram
participant Client
participant PlanCache
participant PlanQuery
participant ExecutePlan
participant Resolver
Client->>PlanCache: Get(schema, query, opName)
alt cache hit
PlanCache-->>Client: PlanResult (Plan + SynthArgs)
else cache miss
Client->>PlanQuery: PlanQuery(schema, doc, opName)
PlanQuery-->>Client: Plan
Client->>PlanCache: store PlanResult
end
Client->>ExecutePlan: ExecutePlan(Plan, params with Args/SynthArgs)
ExecutePlan->>Resolver: invoke field resolvers per plan
Resolver-->>ExecutePlan: field values
ExecutePlan-->>Client: Result (data/errors)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (3)
plan.go (1)
617-644: 💤 Low value
serialFieldsparameter is unused.The body never branches on
serialFields; only_ = serialFieldskeeps the compiler quiet. The mutation/query distinction is correctly handled at the top level viadethunkMapDepthFirstvsdethunkMapWithBreadthFirstTraversalinExecutePlan(Lines 591-595), so the parameter is genuinely dead. Either drop it from the signature or wire it to the dethunker so callers can honor mutation semantics for nested invocations.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plan.go` around lines 617 - 644, The serialFields parameter on executePlannedSelection is dead (only referenced by `_ = serialFields`) — remove the parameter from executePlannedSelection and its internal references, and update all callers (e.g., wherever ExecutePlan calls executePlannedSelection or helper paths that pass serialFields) to stop passing that argument; keep the existing mutation/query traversal behavior driven by dethunkMapDepthFirst and dethunkMapWithBreadthFirstTraversal in ExecutePlan so no logic changes are needed beyond the signature cleanup.plan_cache_test.go (1)
13-121: ⚡ Quick winTests cover the happy path well; consider adding regression coverage for known gaps.
The four tests verify the documented behavior. To strengthen confidence in the planning layer, consider adding tests for:
- An inline fragment / fragment spread carrying
@skip(if: $v)or@include(if: $v)— would catch the directive-handling regression flagged inplan.goLines 299-352.- A cached
*Planre-executed afterSchemarebuild — confirms schema-pointer invalidation kicks in at lookup.- Concurrent
Getcalls on the same key — sanity-checks the LRU/mutex behavior.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plan_cache_test.go` around lines 13 - 121, Add three regression tests: (1) TestPlanCacheFragmentDirectives — construct a query with an inline fragment and a fragment spread that include `@skip`(if: $v) and `@include`(if: $v) directives, call cache.Get (graphql.NewPlanCache with Normalize true/false as appropriate) and assert the correct plan reuse/miss behavior and that synthesized args reflect the directive variable; (2) TestPlanCacheSchemaRebuildInvalidation — create a plan via cache.Get, then rebuild/replace the Schema (create a new schema instance via benchutil.WideArgedSchemaWithXFieldsAndYItems), call cache.Get again for the same query and assert the returned *Plan pointer is different (cache invalidated) and hits/misses reflect a miss after rebuild; (3) TestPlanCacheConcurrentGets — spawn multiple goroutines calling cache.Get concurrently for the same query and verify no panics, consistent plan pointer equality on hits and that HitsMisses reports expected hits (use sync.WaitGroup and repeated Get calls); in each test use cache.Get, cache.HitsMisses, cache.Reset, graphql.ExecutePlan and compare plan pointers and results to validate behavior.plan_cache.go (1)
130-176: ⚖️ Poor tradeoffConcurrent misses on the same key duplicate work.
If N goroutines simultaneously call
Getwith the same uncached key, each sees a miss underlookup, drops the lock, and independently runsparser.Parse/ValidateDocument/PlanQuery; only the laststoresurvives. For a hot endpoint, the first burst after deploy can produce N redundant plans for the same query. Consider a per-key in-flight dedupe (e.g.,golang.org/x/sync/singleflight) so concurrent misses share one planning result.Not a correctness bug — the resulting plans are equivalent — so this is a deferable optimization.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plan_cache.go` around lines 130 - 176, Concurrent cache misses cause duplicate work; introduce a per-cache singleflight.Group to dedupe in-flight planning for the same cache key. Add a singleflight.Group field to the cache struct (e.g., sf or inflight) and wrap the expensive work path (the path that runs parser.Parse/normalizeDocument/ValidateDocument/PlanQuery for the normalized branch and the plain string-key branch if desired) inside sf.Do(cacheKey, func() (interface{}, error) { ... }) so only one goroutine performs parsing/validation/planning and others receive the same PlanResult; ensure the returned value is a PlanResult (or pointer) and after sf.Do returns attach this call's SynthArgs before returning to the caller (use the existing functions: normalizeDocument, parser.Parse, ValidateDocument, PlanQuery, c.lookup, c.store). Handle errors by returning PlanResult with Errors through the singleflight result and still store the result in c.store inside the singleflight body.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@plan_cache_normalize.go`:
- Around line 66-71: When op == nil or foundOps > 1 && operationName == "" we
currently return an empty cacheKey which causes unrelated queries to collide;
change those return branches to produce a deterministic fingerprint of the
original document instead of "" (e.g. hash the raw doc bytes/string with sha256
and return a stable encoded string like "raw:<hex>") so cacheKey is unique per
input doc; update the same return sites that currently do return doc, nil, "",
nil to return doc, nil, "<fingerprint>", nil (keep existing return shapes) and
ensure whatever format you choose won't collide with normal operation names when
combined in plan_cache (where cacheKey is concatenated with operationName).
In `@plan_cache.go`:
- Around line 1-11: The code uses sync/atomic.Uint64 (new in Go 1.19) while the
module declares go 1.13; either bump the module go directive to 1.19 or revert
to plain uint64 counters and function-style atomics: replace atomic.Uint64
fields (e.g., the counter fields in the plan cache struct) with uint64, update
all uses to atomic.AddUint64/atomic.LoadUint64/atomic.StoreUint64, and ensure
those uint64 fields are placed at the end of the struct (after mutex/maps) to
preserve 8-byte alignment on 32-bit platforms; pick one approach and apply it
consistently across functions that reference those atomic counters (methods that
currently call .Add/.Load/.Store).
In `@plan.go`:
- Around line 299-352: The code drops a non-nil predicate from planDirectives
for InlineFragment and FragmentSpread, causing variable-driven `@skip/`@include on
fragments to be ignored; fix by threading that pred into the recursive planning
so emitted child plans inherit it: change collectInto (and any helper that
creates fieldPlan.skipPredicate) to accept an optional parent predicate and,
when a fragment's pred != nil, pass it down and compose it with any child's own
skipPredicate using logical-AND (i.e., childPred = and(parentPred, childPred));
apply this threading/composition in both the *ast.InlineFragment and
*ast.FragmentSpread branches (where _ = pred is now used) so fragment-level
predicates are respected.
- Around line 535-585: ExecutePlan currently uses p.Schema for variable coercion
and in executionContext.Schema which can mismatch the schema the Plan was
created against (Plan.schema); change ExecutePlan to use the plan-bound schema
(dereference plan.schema) for getVariableValues and set executionContext.Schema
(or alternatively validate p.Schema equals plan.schema and return an error on
mismatch). Update the calls that reference the schema: replace p.Schema passed
to getVariableValues and the executionContext.Schema assignment so they use
*plan.schema (or add an equality check between p.Schema and plan.schema and
error if different) so variable coercion and abstract-type resolution use the
plan's bound schema.
---
Nitpick comments:
In `@plan_cache_test.go`:
- Around line 13-121: Add three regression tests: (1)
TestPlanCacheFragmentDirectives — construct a query with an inline fragment and
a fragment spread that include `@skip`(if: $v) and `@include`(if: $v) directives,
call cache.Get (graphql.NewPlanCache with Normalize true/false as appropriate)
and assert the correct plan reuse/miss behavior and that synthesized args
reflect the directive variable; (2) TestPlanCacheSchemaRebuildInvalidation —
create a plan via cache.Get, then rebuild/replace the Schema (create a new
schema instance via benchutil.WideArgedSchemaWithXFieldsAndYItems), call
cache.Get again for the same query and assert the returned *Plan pointer is
different (cache invalidated) and hits/misses reflect a miss after rebuild; (3)
TestPlanCacheConcurrentGets — spawn multiple goroutines calling cache.Get
concurrently for the same query and verify no panics, consistent plan pointer
equality on hits and that HitsMisses reports expected hits (use sync.WaitGroup
and repeated Get calls); in each test use cache.Get, cache.HitsMisses,
cache.Reset, graphql.ExecutePlan and compare plan pointers and results to
validate behavior.
In `@plan_cache.go`:
- Around line 130-176: Concurrent cache misses cause duplicate work; introduce a
per-cache singleflight.Group to dedupe in-flight planning for the same cache
key. Add a singleflight.Group field to the cache struct (e.g., sf or inflight)
and wrap the expensive work path (the path that runs
parser.Parse/normalizeDocument/ValidateDocument/PlanQuery for the normalized
branch and the plain string-key branch if desired) inside sf.Do(cacheKey, func()
(interface{}, error) { ... }) so only one goroutine performs
parsing/validation/planning and others receive the same PlanResult; ensure the
returned value is a PlanResult (or pointer) and after sf.Do returns attach this
call's SynthArgs before returning to the caller (use the existing functions:
normalizeDocument, parser.Parse, ValidateDocument, PlanQuery, c.lookup,
c.store). Handle errors by returning PlanResult with Errors through the
singleflight result and still store the result in c.store inside the
singleflight body.
In `@plan.go`:
- Around line 617-644: The serialFields parameter on executePlannedSelection is
dead (only referenced by `_ = serialFields`) — remove the parameter from
executePlannedSelection and its internal references, and update all callers
(e.g., wherever ExecutePlan calls executePlannedSelection or helper paths that
pass serialFields) to stop passing that argument; keep the existing
mutation/query traversal behavior driven by dethunkMapDepthFirst and
dethunkMapWithBreadthFirstTraversal in ExecutePlan so no logic changes are
needed beyond the signature cleanup.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c095cdc-83ef-4e83-a6aa-734408bfdca7
📒 Files selected for processing (7)
benchutil/wide_schema.goexecutor.goplan.goplan_bench_test.goplan_cache.goplan_cache_normalize.goplan_cache_test.go
atomic.Uint64 (used in PlanCache) needs Go 1.19+. Raise go.mod to 1.21 and refresh CI: drop the 1.8/1.9/1.11 jobs and the test_without_go_modules path, switch to cimg/go images at 1.21 and 1.26, and replace `go get goveralls` with `go install …@latest`.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.circleci/config.yml (1)
25-25: ⚡ Quick winPin
goverallsto a specific version for reproducible CI.On Line 25,
@latestcan change underneath you and make CI flaky unexpectedly. Pin to a concrete release version instead.Suggested fix
- - run: go install github.com/mattn/goveralls@latest + - run: go install github.com/mattn/goveralls@v0.0.12🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.circleci/config.yml at line 25, Replace the dynamic install line that uses `@latest` with a pinned release for reproducible CI: update the run command "go install github.com/mattn/goveralls@latest" to use a concrete version tag (e.g. `@vX.Y.Z`) in .circleci/config.yml, so the CI always installs the same goveralls release; pick the desired stable tag and update the command accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.circleci/config.yml:
- Line 26: The CI test step currently runs "run: go test -v -cover -race
-coverprofile=coverage.out" which only tests the repo root; update this run
command to include the ./... package pattern (e.g. use "go test ./... -v -cover
-race -coverprofile=coverage.out") so tests and coverage include all subpackages
like language/, gqlerrors/, testutil/, examples/; modify the run entry that
contains the existing go test line accordingly.
---
Nitpick comments:
In @.circleci/config.yml:
- Line 25: Replace the dynamic install line that uses `@latest` with a pinned
release for reproducible CI: update the run command "go install
github.com/mattn/goveralls@latest" to use a concrete version tag (e.g. `@vX.Y.Z`)
in .circleci/config.yml, so the CI always installs the same goveralls release;
pick the desired stable tag and update the command accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 04ce6a61-b1d2-4c94-97a3-779800d8975c
📒 Files selected for processing (2)
.circleci/config.ymlgo.mod
✅ Files skipped from review due to trivial changes (1)
- go.mod
planDirectives returns a non-nil predicate for variable-driven @Skip(if:$v)/@include(if:$v) on inline fragments and fragment spreads. collectInto used to discard that predicate (`_ = pred`) and walk the fragment children unconditionally, so fragments gated on a runtime variable were always emitted regardless of the variable's value. Since Execute now routes through PlanQuery+ExecutePlan, this surfaced as a spec-violating behavior change in the public Execute API. Thread an enclosing predicate through collectInto and AND-compose it with each new field's own skipPredicate. nil short-circuits to no allocation in the common case (no enclosing fragment gate, no field directive). Adds three regression tests covering the inline-fragment and fragment-spread paths with variable-driven @Skip.
When normalizeDocument can't analyse the operation (op-not-found, ambiguous multi-op without operationName), it returns an empty normKey. Get builds cacheKey as `operationName + "\x00" + normKey`, so unrelated documents with the same (or empty) operationName all collapse to the same slot — and any cached parse/validate/plan errors from the first such request bleed into every subsequent one. When normKey is empty, fall back to a 64-bit FNV-1a fingerprint of the raw query string, prefixed with `raw:` to keep it disjoint from normalized fingerprints. Adds a regression test asserting two distinct ambiguous-multi-op queries miss independently and then hit their own slots on re-fetch.
Plan documents itself as bound to plan.schema — sub-plans, abstractAlternatives, and field defs are all keyed off the *Object pointers from the plan-time schema. ExecutePlan was reading p.Schema for getVariableValues and the executionContext, so a caller passing an ExecuteParams.Schema that diverged from the planned schema (e.g. a rebuilt schema with the same shape but different *Object pointers, different IsPossibleType set, or different extensions) would silently coerce variables and resolve abstract types against the wrong schema while sub-plans still ran against the original. Use *plan.schema for both, so the binding documented at the top of Plan is actually enforced.
Without ./... only the root package is exercised, which skips language/, gqlerrors/, testutil/, etc.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
directives_test.go (1)
430-451: ⚡ Quick winLGTM! Good regression coverage for variable-driven directives on fragment spreads.
The test correctly verifies that
@skip(if: $s)with a variable set totrueomits the fragment spread.For completeness, you might consider adding a complementary test
TestDirectivesWorksOnFragmentSpreadSkipVariableFalseIncludesto verify thes=falsecase (similar to how inline fragments have both test cases), but it's not essential since the behavior mirrors the already-tested inline fragment pattern.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@directives_test.go` around lines 430 - 451, Add a complementary test named TestDirectivesWorksOnFragmentSpreadSkipVariableFalseIncludes that mirrors TestDirectivesWorksOnFragmentSpreadSkipVariableTrueOmits: parse the same query (with fragment Frag and spread `@skip`(if: $s)), call testutil.TestExecute with directivesTestSchema, directivesTestData and Args map setting "s": false, and assert the Result includes both "a" and fragment field "b" (expected Data map should contain "a" and "b"); use the same helpers TestParse, TestExecute, testutil.EqualResults/testutil.Diff for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@directives_test.go`:
- Around line 430-451: Add a complementary test named
TestDirectivesWorksOnFragmentSpreadSkipVariableFalseIncludes that mirrors
TestDirectivesWorksOnFragmentSpreadSkipVariableTrueOmits: parse the same query
(with fragment Frag and spread `@skip`(if: $s)), call testutil.TestExecute with
directivesTestSchema, directivesTestData and Args map setting "s": false, and
assert the Result includes both "a" and fragment field "b" (expected Data map
should contain "a" and "b"); use the same helpers TestParse, TestExecute,
testutil.EqualResults/testutil.Diff for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ad3b0c77-3794-4a67-8fdd-4513b0ed4dc9
📒 Files selected for processing (4)
directives_test.goplan.goplan_cache.goplan_cache_test.go
✅ Files skipped from review due to trivial changes (1)
- plan_cache_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- plan.go
- plan_cache.go
The parameter was a passthrough — only `_ = serialFields` referenced it. Mutation vs. query traversal is handled at the top level of ExecutePlan via dethunkMapDepthFirst / dethunkMapWithBreadthFirstTraversal, so the walker doesn't need it.
TestPlanCacheSchemaRebuildInvalidation verifies the schema-pointer guard: a structurally identical rebuilt *Schema must not collide on the cached plan from the original. TestPlanCacheConcurrentGets hammers a single key from many goroutines to exercise the LRU/mutex paths under -race. We don't yet dedupe in-flight planning across concurrent misses (that would want singleflight), so this asserts safety + result consistency rather than miss count.
CodeRabbit review follow-upActionable comments — all addressed:
Nitpicks:
Deferred to follow-up:
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
plan_cache_test.go (1)
166-169: ⚡ Quick winAdd explicit success checks after
cache.Getbefore plan-pointer assertions.At Line 166 and Line 218 (also similarly at Line 41), the test can pass pointer assertions even if
Getregresses to returning errors or a nil plan. Add a success guard first so failures are surfaced directly.Proposed patch
@@ r3 := cache.Get(&schema, `{ wide { a(value: "y") } }`, "") + if len(r3.Errors) > 0 || r3.Plan == nil { + t.Fatalf("third call: %v", r3.Errors) + } if r1.Plan == r3.Plan { t.Fatalf("un-normalized: different literals must miss the cache") } @@ r1b := cache.Get(&s1, q, "") + if len(r1b.Errors) > 0 || r1b.Plan == nil { + t.Fatalf("re-lookup against s1: %v", r1b.Errors) + } if r1b.Plan == r2.Plan { t.Fatalf("post-rebuild lookup against s1 returned the s2-bound plan") } @@ r := cache.Get(&schema, q, "") + if len(r.Errors) > 0 || r.Plan == nil { + t.Fatalf("post-Reset Get: %v", r.Errors) + } _, misses := cache.HitsMisses() if misses != 2 { t.Fatalf("expected 2 misses after Reset, got %d", misses) } - if r.Plan == nil { - t.Fatalf("expected new plan after Reset, got nil") - }Also applies to: 218-225, 41-44
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@plan_cache_test.go` around lines 166 - 169, After each call to cache.Get (e.g., the calls that produce r1b at the s1 lookup and the similar calls producing r2/r1 in other sections), add an explicit check that the returned result is non-nil and contains a non-nil Plan and that err==nil before asserting pointer equality/inequality; for example, verify r1b != nil, r1b.Plan != nil and err == nil (or use t.Fatalf/t.Fatalf-style failures) immediately after cache.Get so pointer assertions like r1b.Plan == r2.Plan cannot pass when Get returned an error or nil plan.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@plan_cache_test.go`:
- Around line 166-169: After each call to cache.Get (e.g., the calls that
produce r1b at the s1 lookup and the similar calls producing r2/r1 in other
sections), add an explicit check that the returned result is non-nil and
contains a non-nil Plan and that err==nil before asserting pointer
equality/inequality; for example, verify r1b != nil, r1b.Plan != nil and err ==
nil (or use t.Fatalf/t.Fatalf-style failures) immediately after cache.Get so
pointer assertions like r1b.Plan == r2.Plan cannot pass when Get returned an
error or nil plan.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 90c19461-fd96-4fde-b086-71ed8004928e
📒 Files selected for processing (3)
.circleci/config.ymlplan.goplan_cache_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- .circleci/config.yml
- plan.go
The earlier `./...` change (659c616, applied per a CodeRabbit nitpick that framed it as restoring "incomplete coverage") actually dropped coveralls from ~92% → 67.9%: with plain ./... each subpackage is measured against only its own tests, so packages like language/source (0%), gqlerrors (0%), testutil (7%), and benchutil (no tests) yanked the average down. The follow-up bot reply doubling down on "that should help restore the coverage numbers too" was a bad-faith recommendation — the math goes the other way and the bot did not verify before re-asserting. Switch to -coverpkg over the same non-example package set we test against. Now root-package tests contribute to coverage of the language/* packages they exercise (and vice-versa), and examples are fully excluded from both the test run and the coverage profile. Local measurement: 82.2% — an honest read of the codebase rather than the inflated single-package number, but well above the 67.9% left by the prior change.
benchutil's code only runs under `go test -bench=.`, so including it in the coverage profile measures it against test runs that don't exercise it. Local total moves 82.2% → 83.2%.
Execute now routes unconditionally through PlanQuery + ExecutePlan (see plan.go), so the entire pre-refactor execution machinery in executor.go is unreachable. Delete it: executeOperation / executeOperationParams executeFieldsSerially / executeFields / executeSubFields / executeFieldsParams resolveField / resolveFieldResultState completeValueCatchingError / completeValue completeThunkValueCatchingError completeAbstractValue / completeObjectValue / completeListValue orderedField / orderedFields Each has a 1:1 plan-based replacement in plan.go (executePlannedSelection, resolvePlannedField, completePlannedValue family, etc.); plan.fields is source-ordered at plan time so orderedFields is no longer needed. Kept (still has live callers): Execute, buildExecutionContext, collectFields, getFieldDef, handleFieldError, DefaultResolveFn, defaultResolveTypeFn, completeLeafValue, getOperationRootType, dethunk* helpers, shouldIncludeNode, doesFragmentConditionMatch, getFieldEntryKey. Subscriptions still go through buildExecutionContext + collectFields + manual resolver invocation in subscription.go; that path is untouched. -450 lines. Root-package coverage: 84.2% → 88.4%.
testutil is a test-helper package by design — measuring its coverage at all is a category mistake. language/ast is dominated by ~180 trivial interface-compliance stubs (GetLoc / GetSelectionSet / GetOperation that return defaults on AST node types that don't have those concepts). They're public API — users can navigate the AST — but virtually none are called directly from any test path, so including them measures boilerplate rather than behavior. Coverage moves 82.8% → 87.8%.
Custom test schema with Int / Float / Bool / Enum / List / NonNull /
InputObject args + an interface and a fragment-spreadable object,
driving:
- writeValue branches (Int/Float/Bool/Enum/List/Object) via
list-with-nested-variable values that bypass tryExtract
- writeFragmentBody both miss and visited-cache paths
- normalizeSelectionSet's InlineFragment branch
- writeType's NonNull and List arms via typed query variables
- tryExtract's coercion path against varied scalar / input types
plan_cache_normalize coverage 60.6% -> ~80%, total 87.8% -> 88.7%.
resolvePlannedField unconditionally called handleExtensionsResolveFieldDidStart, which allocates a map[string]ResolveFieldFinishFunc + a closure on every resolved field — even when the schema has zero extensions registered. With ~1000 fields per request that's 2000 wasted allocs per request on the common no-extensions case. Gate the call (and the matching finish-handler invocation) behind `len(eCtx.Schema.extensions) > 0`. Wide-query bench (100 fields × 10 items): before: 1.15ms / 9043 allocs/op after: 0.99ms / 7040 allocs/op (-14% time, -22% allocs) Hot-loop with native variables: before: 1.44ms / 9888 allocs/op after: 1.30ms / 7884 allocs/op (-10% time, -20% allocs) Schemas with extensions registered take the same path as before.
|
Seeing as the last pr merge was 5 months ago, I've forked this project, pulled in most of the common sense PRS and bug fixes, and have achieved a 20x speedup on certain workloads with various allocation and CPU reduction strategies. Good luck with mainline! It would be nice to merge it in, but I don't see myself being to invested in my non fork. |
Summary
Adds a cacheable execution shape on top of the existing
ExecuteAPI:Plan,PlanQuery,ExecutePlan(commit 1) — precompute the per-(schema, document, operationName)work thatExecuteotherwise repeats every request: operation identification, fragment collection, selection walking,
*FieldDefinitionresolution, literal-arg pre-coercion, constant@include/@skipevaluation.ExecutePlanwalks the plan doing only inherently per-request work (variable substitution, abstract type resolution, resolver invocation, result materialization).
PlanCache(commit 4) — bounded LRU built on top, with an optionalNormalizemode that extracts fully-literal field arguments into synthetic variables before keying the cache, so two queries that differ only in literal values share one plan.
Executeis unchanged in behavior — it now routes throughPlanQuery + ExecutePlaninternally. The full existing test suite passes unmodified, includingTestMergesPar allelFragments,TestUnionInterfaceExecution, theTestExecutesResolveFunction*family,executor_resolve_test,executor_test, andabstract_test.Why
For server hot loops, the same query strings show up over and over. Today every request re-parses, re-validates, re-collects fields, and re-resolves field definitions ag
ainst the schema. None of that work depends on the per-request variable values or context. Splitting it out lets servers cache the expensive half and pay only the cheap
half per request.
Numbers
Bench harness in
plan_bench_test.go, schemas inbenchutil/wide_schema.go. All numbers fromgo test -bench=. -benchmem.Practical reading:
cache.Get + ExecutePlan.Normalize: true. ~70% of the optimal recovery without modifying clients.*Schemapointer; rebuilds invalidate at lookup time.Reset()available for explicit drops.API surface
New public types and functions, all additive:
type Plan struct { ... }— opaque plan handle.func PlanQuery(schema *Schema, doc *ast.Document, operationName string) (*Plan, error)func ExecutePlan(plan *Plan, params ExecuteParams) *Resulttype PlanCache struct { ... }(concurrent-safe, bounded)type PlanCacheOptions struct { MaxEntries, MaxQueryBytes int; Normalize bool }type PlanResult struct { Plan *Plan; SynthArgs map[string]interface{}; Errors []FormattedError }func NewPlanCache(opts PlanCacheOptions) *PlanCache(*PlanCache).Get(schema *Schema, query, operationName string) PlanResult(*PlanCache).HitsMisses() (hits, misses uint64)(*PlanCache).Reset()ExecuteandDoare unchanged externally — internally they now callPlanQuery + ExecutePlan. No signature changes, no behavior changes to existing code.Scope (intentional cuts)
First-cut bounds, each individually expandable as follow-up:
{a: 1, b: $foo}stay literal.Test plan
go test ./...clean against this branch).plan_cache_test.gocovers basic hits, normalize collapsing, end-to-endExecutePlanof a normalized plan,Reset.iodesystems/go-api-gatewayconsumesPlanQuery + ExecutePlanin itsgw.Handler()request path; measured +15% rps and −36% p99 versus its prior AST-only cache.Summary by CodeRabbit
New Features
Tests
Chores