Background
During the Phase 8 review (2026-05-14) and the Phase 8.5 discuss-phase (2026-05-17), the security-reviewer agent flagged that DefaultScorer.Score() allocates approximately 5 MB of heap per call across its 6 algorithms:
- DamerauLevenshteinOSA: ~2.4 MB (three-row int DP)
- JaroWinkler: 2 × [256]bool (stack, negligible)
- TokenJaccard: 2 tokenisations (rune-count maps)
- QGramJaccard: 2 × map[string]int with capacity (len(s)-n+1)*5/4
- SorensenDice: same as QGramJaccard
- DoubleMetaphone: 2 × [4]byte (negligible post-Phase 8.5 optimisation)
Total: ~5 MB heap per Score() call.
The dispatch-table abstraction in dispatch_*.go means each algorithm allocates its own tokenisation independently — there is no cross-algorithm input reuse within a single Score() invocation.
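To check a per-call figure like the one above against a real workload, one option is to read the runtime's cumulative allocation counter before and after a batch of calls. The following is a minimal sketch, not the library's own benchmark: scoreStandIn is a hypothetical placeholder for DefaultScorer.Score() that only builds byte-based q-gram maps with the capacity hint from the list, and bytesPerCall is an illustrative helper name.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// scoreStandIn is a hypothetical placeholder for DefaultScorer.Score().
// It builds one q-gram count map per input, using the capacity hint
// (len(s)-n+1)*5/4 quoted in the list above.
func scoreStandIn(a, b string) int {
	const n = 2
	distinct := 0
	for _, s := range []string{a, b} {
		hint := 0
		if len(s) >= n {
			hint = (len(s) - n + 1) * 5 / 4
		}
		grams := make(map[string]int, hint)
		for i := 0; i+n <= len(s); i++ {
			grams[s[i:i+n]]++
		}
		distinct += len(grams)
	}
	return distinct
}

// bytesPerCall returns the average number of heap bytes allocated per call
// to fn, derived from the change in runtime.MemStats.TotalAlloc.
// TotalAlloc is process-wide, so other goroutines add noise to the result.
func bytesPerCall(runs int, fn func()) uint64 {
	if runs <= 0 {
		return 0
	}
	var before, after runtime.MemStats
	runtime.GC() // settle outstanding work before taking the baseline
	runtime.ReadMemStats(&before)
	for i := 0; i < runs; i++ {
		fn()
	}
	runtime.ReadMemStats(&after)
	return (after.TotalAlloc - before.TotalAlloc) / uint64(runs)
}

func main() {
	a := strings.Repeat("kitten ", 200)
	b := strings.Repeat("sitting ", 200)
	perCall := bytesPerCall(50, func() { scoreStandIn(a, b) })
	fmt.Printf("approx. %d bytes allocated per call\n", perCall)
}
```

Inside a _test.go file the same measurement is usually done with b.ReportAllocs() and the B/op column of `go test -bench`; the helper above is only meant for quick checks outside the test runner.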
Decision (deferred from v1.0)
The Phase 8.5 discuss-phase session decided to defer this optimisation work from v1.0 for the following reasons:
- Not a real-world DoS today. Go's GC handles ~5 MB/call fine at expected library throughput. No consumer has reported GC pressure.
- Both implementation options need real design work that does not fit the v1.0 schedule:
  - WithMaxScoreAllocBytes(n int) ScorerOption: alloc-budget enforcement has unclear semantics if hit mid-Score() (truncate? error? best-effort?).
  - Cross-algorithm tokenisation cache scoped to a single Score() call: needs careful lifetime/safety design to avoid use-after-free across goroutines if the Scorer is shared.
- v1.0 documents the 5 MB/call as expected behaviour in docs/algorithms.md#performance-characteristics, allowing consumers to profile their own workloads.
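For discussion, here is a rough sketch of what the second option (a tokenisation cache scoped to one call) could look like. Everything in it is an assumption made for illustration: callCache, scoreBoth and the byte-based multiset q-gram definitions are hypothetical and not taken from the library. The point it tries to show is that a cache created inside the call and never stored on the shared Scorer is not visible to other goroutines, and that QGramJaccard and SorensenDice could then share two tokenisations instead of building four.

```go
package main

import "fmt"

// callCache is a hypothetical per-call cache. It is created inside one
// scoring call and never stored on the shared scorer, so it is never
// shared between goroutines.
type callCache struct {
	n      int
	grams  map[string]map[string]int // input -> q-gram counts
	builds int                       // tokenisations actually performed
}

func newCallCache(n int) *callCache {
	return &callCache{n: n, grams: make(map[string]map[string]int, 2)}
}

// qgrams returns byte-based q-gram counts for s, tokenising each distinct
// input at most once per cache.
func (c *callCache) qgrams(s string) map[string]int {
	if g, ok := c.grams[s]; ok {
		return g
	}
	hint := 0
	if len(s) >= c.n {
		hint = (len(s) - c.n + 1) * 5 / 4
	}
	g := make(map[string]int, hint)
	for i := 0; i+c.n <= len(s); i++ {
		g[s[i:i+c.n]]++
	}
	c.grams[s] = g
	c.builds++
	return g
}

// overlap returns the multiset intersection size and both multiset sizes.
func overlap(a, b map[string]int) (inter, sizeA, sizeB int) {
	for k, ca := range a {
		sizeA += ca
		if cb := b[k]; cb < ca {
			inter += cb
		} else {
			inter += ca
		}
	}
	for _, cb := range b {
		sizeB += cb
	}
	return inter, sizeA, sizeB
}

func qgramJaccard(c *callCache, a, b string) float64 {
	inter, sa, sb := overlap(c.qgrams(a), c.qgrams(b))
	union := sa + sb - inter
	if union == 0 {
		return 1
	}
	return float64(inter) / float64(union)
}

func sorensenDice(c *callCache, a, b string) float64 {
	inter, sa, sb := overlap(c.qgrams(a), c.qgrams(b))
	if sa+sb == 0 {
		return 1
	}
	return 2 * float64(inter) / float64(sa+sb)
}

// scoreBoth runs both q-gram algorithms against one per-call cache and
// reports how many tokenisations were needed.
func scoreBoth(a, b string) (jaccard, dice float64, builds int) {
	c := newCallCache(2)
	jaccard = qgramJaccard(c, a, b)
	dice = sorensenDice(c, a, b)
	return jaccard, dice, c.builds
}

func main() {
	j, d, builds := scoreBoth("night", "nacht")
	fmt.Printf("jaccard=%.4f dice=%.4f tokenisations=%d\n", j, d, builds)
}
```

This sketch sidesteps the lifetime question by letting the garbage collector reclaim the cache when the call returns. A design that pools or reuses the maps across calls would reintroduce the sharing concerns noted above and would need explicit ownership rules.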
Triggers for picking this up
Re-open this work when any of the following hold:
- A real-world consumer reports measurable GC pressure under their workload.
- Benchmarks show a GC-bound throughput regression that per-algorithm optimisations cannot address.
- A use case emerges (e.g. streaming match against millions of candidates) where amortising tokenisation across algorithms would deliver a 2x+ throughput improvement.
Acceptance criteria when picked up
- Design doc covering both options (allocation budget vs tokenisation cache) with explicit semantics for each.
- Benchmark suite demonstrating measurable GC reduction and per-Score latency impact.
- API addition that is backward-compatible — no breaking change to the existing Scorer surface.
- Documentation update in docs/algorithms.md#performance-characteristics.
- New BDD scenarios covering the chosen option.
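On the backward-compatibility criterion, a functional option is one way an allocation budget could be added without touching existing call sites. The sketch below is illustrative only: the Scorer struct, the NewScorer constructor and the withinBudget helper are hypothetical names, and the shape of ScorerOption is assumed because its real definition is not shown in this issue. It deliberately leaves open what Score() does when the budget is exceeded, since that is the undecided part.

```go
package main

import "fmt"

// Scorer is a hypothetical stand-in for the library's scorer type.
type Scorer struct {
	maxAllocBytes int // 0 means "no budget", which preserves current behaviour
}

// ScorerOption is assumed here to be a functional option; the library's
// actual definition may differ.
type ScorerOption func(*Scorer)

// WithMaxScoreAllocBytes records an allocation budget. Non-positive values
// are ignored in this sketch, leaving the scorer unlimited.
func WithMaxScoreAllocBytes(n int) ScorerOption {
	return func(s *Scorer) {
		if n > 0 {
			s.maxAllocBytes = n
		}
	}
}

// NewScorer is a hypothetical constructor. Callers that pass no options
// get exactly the behaviour they have today.
func NewScorer(opts ...ScorerOption) *Scorer {
	s := &Scorer{}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

// withinBudget reports whether an estimated allocation fits the budget.
// What the scorer should do when this returns false (truncate, error,
// best-effort) is the open design question and is not decided here.
func (s *Scorer) withinBudget(estimateBytes int) bool {
	return s.maxAllocBytes == 0 || estimateBytes <= s.maxAllocBytes
}

func main() {
	unlimited := NewScorer()
	limited := NewScorer(WithMaxScoreAllocBytes(1 << 20))
	fmt.Println(unlimited.withinBudget(5<<20), limited.withinBudget(5<<20))
}
```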
References
- L4237 in REVIEW-FINDINGS.md (Phase 8 review, 2026-05-14)
- Phase 8.5 discuss-phase decision Q14a (2026-05-17)