Skip to content

perf: cache function-PC and reflect.Type name resolution#31

Merged
ankurs merged 8 commits into
mainfrom
claude/add-function-caching-lgwnC
Apr 20, 2026
Merged

perf: cache function-PC and reflect.Type name resolution#31
ankurs merged 8 commits into
mainfrom
claude/add-function-caching-lgwnC

Conversation

@ankurs
Copy link
Copy Markdown
Member

@ankurs ankurs commented Apr 17, 2026

Add two process-global sync.Map caches for the two "location" resolutions
on every request:

  • reflect.Type -> "pkgpath.TypeName" (was a fresh string allocation per call)
  • function PC -> runtime.FuncForPC(pc).Name() (symbol-table walk)

Keys are stable for the lifetime of the process, so the caches never evict
and are bounded by the number of distinct types / builder functions.

Observed (benchstat, count=6; see BENCHMARKS.md):

  • Result.Get: -75% (single-threaded) / -91% (parallel, lock-free reads)
  • AddBuilders (warm): -42% time, -59% allocations
  • CachedStructName hit: -87%, zero allocations
  • ResolveFuncName hit: -63%, zero allocations
  • Compile / RunParallel: no significant latency delta, ~5% fewer allocs

Also adds benchmarks_test.go (the package had none) covering the
micro-benchmarks above plus AddBuilders / Compile / RunParallel /
Result.Get end-to-end, and BENCHMARKS.md documenting reproduction steps
and interpretation.

Summary by CodeRabbit

  • New Features

    • Introduced caching mechanism for improved performance of name resolution operations.
  • Documentation

    • Added comprehensive benchmarking documentation detailing performance improvements and measurement methodology.
  • Tests

    • Added extensive benchmark suite to measure and track performance across various operations.

Summary by CodeRabbit

  • Documentation

    • Added benchmark documentation covering performance analysis and optimization results.
  • Tests

    • Added comprehensive performance benchmark suite for compilation, execution, and data operations.
  • Chores

    • Updated Go toolchain to version 1.25.9.

Add two process-global sync.Map caches for the two "location" resolutions
on every request:

- reflect.Type -> "pkgpath.TypeName" (was a fresh string allocation per call)
- function PC -> runtime.FuncForPC(pc).Name() (symbol-table walk)

Keys are stable for the lifetime of the process, so the caches never evict
and are bounded by the number of distinct types / builder functions.

Observed (benchstat, count=6; see BENCHMARKS.md):
- Result.Get: -75% (single-threaded) / -91% (parallel, lock-free reads)
- AddBuilders (warm): -42% time, -59% allocations
- CachedStructName hit: -87%, zero allocations
- ResolveFuncName hit: -63%, zero allocations
- Compile / RunParallel: no significant latency delta, ~5% fewer allocs

Also adds benchmarks_test.go (the package had none) covering the
micro-benchmarks above plus AddBuilders / Compile / RunParallel /
Result.Get end-to-end, and BENCHMARKS.md documenting reproduction steps
and interpretation.

https://claude.ai/code/session_01Hu8nZg5zrsaRWWf3Dq5XVY
Copilot AI review requested due to automatic review settings April 17, 2026 16:08
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@ankurs has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 57 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 47 minutes and 57 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9c4e075d-d91a-4820-9da0-e0f5e2d81a22

📥 Commits

Reviewing files that changed from the base of the PR and between b0be301 and 7a9a95f.

📒 Files selected for processing (3)
  • README.md
  • benchmarks_test.go
  • doc.go
📝 Walkthrough

Walkthrough

Adds process-global, concurrency-safe caches for struct-type and function-name resolution; integrates cached lookups into builder registration, compilation, parallel execution, and result retrieval; plus benchmarks, a test helper to reset caches, and benchmarking documentation.

Changes

Cohort / File(s) Summary
Documentation
BENCHMARKS.md
New benchmarking document describing methodology, commands, environment, sample benchstat output, cold-cache caveats, and interpretation of results.
Caching infrastructure
cache.go
New package-level sync.Map caches: structNameCache (reflect.Type → qualified name) and funcNameCache (uintptr → function name). Adds cachedStructName(t reflect.Type) and resolveFuncName(pc uintptr) read-through helpers.
Core integration
databuilder.go, plan.go
Replaced direct runtime.FuncForPC(...).Name() and manual struct-name construction with resolveFuncName() and cachedStructName() across builder registration, Compile, RunParallel, and Result.Get; removed old helper. Minor local rename initialialDatainitialData.
Benchmarks & tests
benchmarks_test.go, cache_test.go
Added benchmarks_test.go with micro-benchmarks for cached vs uncached struct/function name resolution, builder registration, compile, parallel run, and result access. cache_test.go adds resetCachesForTest() to clear caches between runs.
Tooling
go.mod
Bumped Go toolchain directive from go 1.25.8 to go 1.25.9.

Sequence Diagram(s)

sequenceDiagram
    participant Caller as Caller (Compile / Register / Result.Get)
    participant Cache as NameCache (sync.Map)
    participant Runtime as Runtime / Reflect

    Caller->>Cache: lookup by reflect.Type or pc
    alt cache hit
        Cache-->>Caller: cached name
    else cache miss
        Cache->>Runtime: compute name (PkgPath+Name / FuncForPC.Name)
        Runtime-->>Cache: computed name
        Cache-->>Caller: stored & returned name
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I hop through maps with eager cheer,

stashing names so lookups are clear.
PC or Type — a cached embrace,
builders sprint across the race.
Tiny paws — big speed to trace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 39.29% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'perf: cache function-PC and reflect.Type name resolution' clearly and specifically summarizes the main performance optimization: caching of function program counter and type name resolution.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/add-function-caching-lgwnC

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
databuilder.go (1)

128-137: Optional: hoist cachedStructName(t.Out(0)) out of the loop.

t.Out(0) is invariant across iterations. Even with caching this does an extra sync.Map.Load per input. Tiny win, but keeps the intent clearer.

♻️ Proposed refactor
+		outName := cachedStructName(t.Out(0))
 		// other inputs should all be structs
 		for i := 1; i < t.NumIn(); i++ {
 			if t.In(i).Kind() != reflect.Struct {
 				// checks for vardic functions as well
 				return ErrInvalidBuilderInput
 			}
-			if cachedStructName(t.In(i)) == cachedStructName(t.Out(0)) {
+			if cachedStructName(t.In(i)) == outName {
 				return ErrSameInputAsOutput
 			}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@databuilder.go` around lines 128 - 137, Hoist the call to
cachedStructName(t.Out(0)) out of the for-loop to avoid repeated sync.Map loads
and clarify intent: compute and store outputName := cachedStructName(t.Out(0))
once before the loop that iterates inputs, then compare
cachedStructName(t.In(i)) to outputName inside the loop and return
ErrSameInputAsOutput when they match; keep the existing checks for
reflect.Struct and ErrInvalidBuilderInput unchanged.
cache.go (1)

35-38: Prefer sync.Map.Clear() over reassignment.

Reassigning the package-level sync.Map variables is a data race if any other goroutine is reading them concurrently. Go 1.23 added sync.Map.Clear() (this repo is on go1.25 per BENCHMARKS.md), which is safe for concurrent use and avoids reallocating the map structure.

♻️ Proposed refactor
 func resetCachesForTest() {
-	structNameCache = sync.Map{}
-	funcNameCache = sync.Map{}
+	structNameCache.Clear()
+	funcNameCache.Clear()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cache.go` around lines 35 - 38, The resetCachesForTest function currently
reassigns package-level sync.Map variables (structNameCache and funcNameCache),
which can race with concurrent readers; instead call the Clear method on each
map (structNameCache.Clear() and funcNameCache.Clear()) to safely empty them
without reallocating the map structures; update resetCachesForTest to invoke
those Clear() calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cache.go`:
- Around line 35-38: The resetCachesForTest function currently reassigns
package-level sync.Map variables (structNameCache and funcNameCache), which can
race with concurrent readers; instead call the Clear method on each map
(structNameCache.Clear() and funcNameCache.Clear()) to safely empty them without
reallocating the map structures; update resetCachesForTest to invoke those
Clear() calls.

In `@databuilder.go`:
- Around line 128-137: Hoist the call to cachedStructName(t.Out(0)) out of the
for-loop to avoid repeated sync.Map loads and clarify intent: compute and store
outputName := cachedStructName(t.Out(0)) once before the loop that iterates
inputs, then compare cachedStructName(t.In(i)) to outputName inside the loop and
return ErrSameInputAsOutput when they match; keep the existing checks for
reflect.Struct and ErrInvalidBuilderInput unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b6e2ed5f-3915-4af1-ba53-71f973d031cb

📥 Commits

Reviewing files that changed from the base of the PR and between 2faa7ed and 374a4ea.

📒 Files selected for processing (5)
  • BENCHMARKS.md
  • benchmarks_test.go
  • cache.go
  • databuilder.go
  • plan.go

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces process-global caching for frequently-resolved type names and function names to reduce allocations and symbol-lookup overhead on hot paths, and adds benchmarks + documentation to quantify the impact.

Changes:

  • Add sync.Map caches for reflect.Type -> "pkgpath.TypeName" and PC -> runtime.FuncForPC(pc).Name().
  • Replace repeated struct-name and function-name resolution calls with cached equivalents in builder compilation and plan execution.
  • Add a new benchmark suite and BENCHMARKS.md to reproduce and interpret performance results.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
plan.go Switches struct-name lookup during execution/result reads to cached resolution.
databuilder.go Switches struct-name and builder function-name resolution to cached helpers; removes old getStructName.
cache.go Introduces the process-global caches and helper functions (plus a test-only reset helper).
benchmarks_test.go Adds micro- and end-to-end benchmarks to measure cache effects.
BENCHMARKS.md Documents how to run benchmarks and summarizes observed results.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cache.go Outdated
Comment thread benchmarks_test.go
claude added 2 commits April 17, 2026 16:14
…kAddBuilders

Addresses PR #31 review feedback:

- resetCachesForTest lived in cache.go and compiled into production builds.
  Moved into a new cache_test.go so it only exists in test/bench builds.
  Also switched from reassigning the sync.Map vars to Range+Delete in place
  (reassignment isn't concurrency-safe if anything else is touching the
  caches).
- BenchmarkAddBuilders didn't control cache state, so results could drift
  based on which other benchmarks ran first. It now explicitly resets and
  warms the caches before ResetTimer so it deterministically measures
  steady-state (warm) registration cost, distinct from
  BenchmarkAddBuilders_ColdCache.

https://claude.ai/code/session_01Hu8nZg5zrsaRWWf3Dq5XVY
govulncheck on CI flags 9 reachable vulnerabilities routing through
pre-existing code paths (data.init, data.resolveDependencies,
data.plan.RunParallel) into Go stdlib (text/template, crypto/tls,
crypto/x509). Bumping the go directive from 1.25.8 to 1.25.9 picks up
the patched stdlib so govulncheck passes.

None of the vulnerable call sites are introduced by this PR — they
originate from pre-existing dependencies (tracing, goccy/go-graphviz)
and stdlib packages called by fmt.Errorf. Main would hit the same
failure if its CI were re-run today.

https://claude.ai/code/session_01Hu8nZg5zrsaRWWf3Dq5XVY
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread databuilder.go
Comment thread BENCHMARKS.md Outdated
Comment thread go.mod
claude added 2 commits April 17, 2026 16:33
Addresses PR #31 review feedback:

- Rename local variable initialialData -> initialData in db.Compile. Pre-existing
  typo; local-only, so no API impact.
- Update BENCHMARKS.md environment line from go1.25.8 to go1.25.9 to match
  the bumped go.mod directive.

https://claude.ai/code/session_01Hu8nZg5zrsaRWWf3Dq5XVY
The inline comments just described the line above. Cache warmup reads
cleanly without narration.

https://claude.ai/code/session_01Hu8nZg5zrsaRWWf3Dq5XVY
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread benchmarks_test.go Outdated
Comment thread benchmarks_test.go Outdated
Comment thread BENCHMARKS.md Outdated
claude added 2 commits April 20, 2026 06:53
The _ColdMix benchmarks warmed the cache for every key before
b.ResetTimer(), so they measured mixed-key hit performance, not cold
misses. Renamed to _MixedHit to match what they actually measure, and
updated BENCHMARKS.md caveats to point at AddBuilders_ColdCache as the
true cold-miss signal.
Moved benchmark highlights and general package overview into a new
doc.go so they live in Go source (picked up by gomarkdoc and
pkg.go.dev) instead of a standalone markdown file that would rot.
README is regenerated from the package doc.

Deleted BENCHMARKS.md — the benchmark code in benchmarks_test.go is the
source of truth; absolute nanosecond numbers were going stale and
duplicated what the PR description already covers.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks_test.go`:
- Around line 45-118: The benchmarks currently discard return values with `_ =
...`, allowing the compiler to eliminate work; update each benchmark function
(BenchmarkGetStructName_Uncached, BenchmarkCachedStructName_Hit,
BenchmarkCachedStructName_MixedHit, BenchmarkFuncForPC_Uncached,
BenchmarkResolveFuncName_Hit, BenchmarkResolveFuncName_MixedHit) to store the
result of the measured call in a local variable inside the hot loop and call
runtime.KeepAlive(var) after the loop to prevent dead-code elimination; for any
parallel benchmarks (b.RunParallel) keep the result variable local to the worker
closure to avoid races.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c70e713d-4872-451b-89b9-04628ac6efad

📥 Commits

Reviewing files that changed from the base of the PR and between c8aa044 and b0be301.

📒 Files selected for processing (2)
  • BENCHMARKS.md
  • benchmarks_test.go
✅ Files skipped from review due to trivial changes (1)
  • BENCHMARKS.md

Comment thread benchmarks_test.go
Store measured results in locals and call runtime.KeepAlive after each
loop so the compiler cannot elide inlineable work or under-count allocs.
For b.RunParallel the local lives inside the worker closure to avoid
false sharing. Numbers are unchanged on this hardware (sync.Map and
reflect method calls were already opaque to the inliner), but this
makes the suite robust to future code changes.
@ankurs ankurs merged commit 865f765 into main Apr 20, 2026
4 checks passed
@ankurs ankurs deleted the claude/add-function-caching-lgwnC branch April 20, 2026 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants