perf: capture parse Position without boxing the offset Int #876

Open: He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/parser-pos-no-boxing

He-Pin (Contributor) commented May 30, 2026

Motivation

`Parser.Pos` is invoked for nearly every AST node to capture the source offset. It was previously implemented as `Index.map(off => new Position(...))`: fastparse's `Index` stores the offset as an Int in its `successValue: Any` field (boxing it), and the `.map` then unboxes it and allocates a closure, all per node. `boxToInteger` via `SharedPackageDefs.Index` was a top self-frame in the parse flamegraph on kube-prometheus.

Modification

  • Write the Position object straight into successValue via ctx.freshSuccess(new Position(fileScope, ctx.index)), skipping the Int box/unbox and the map closure. Parse output (positions / error offsets) is unchanged.
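
To illustrate the difference, here is a minimal, dependency-free sketch of the two shapes. It is a toy model under stated assumptions: `Run`, `FileScope`, `Position`, `posViaMap` and `posDirect` are simplified stand-ins invented for this example, not the real fastparse or sjsonnet classes, and the actual change in this PR may differ in detail.

```scala
// Toy model only: Run, FileScope and Position are hypothetical stand-ins,
// not the real fastparse / sjsonnet types.
final class FileScope(val name: String)
final class Position(val fileScope: FileScope, val offset: Int)

final class Run(var index: Int) {
  // Typed as Any, so storing an Int here forces boxing to java.lang.Integer.
  var successValue: Any = null
  def freshSuccess(value: Any): Run = { successValue = value; this }
  def map(f: Any => Any): Run = { successValue = f(successValue); this }
}

object PosSketch {
  // Old shape: box the offset into successValue, then unbox it inside a
  // closure that captures `fs` (so the closure itself is allocated per call).
  def posViaMap(run: Run, fs: FileScope): Run =
    run.freshSuccess(run.index).map(off => new Position(fs, off.asInstanceOf[Int]))

  // New shape: read the primitive index and store the Position directly,
  // with no intermediate boxed Int and no closure.
  def posDirect(run: Run, fs: FileScope): Run =
    run.freshSuccess(new Position(fs, run.index))

  def main(args: Array[String]): Unit = {
    val fs = new FileScope("example.jsonnet")
    val a = posViaMap(new Run(42), fs).successValue.asInstanceOf[Position]
    val b = posDirect(new Run(42), fs).successValue.asInstanceOf[Position]
    // Both shapes should yield the same offset.
    println(a.offset == b.offset)
  }
}
```

In this model both functions leave an equivalent `Position` in `successValue`; the direct version simply skips the boxed intermediate and the per-call closure.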

Result

Scala Native --debug-stats parse_time on kube-prometheus (148 files), interleaved and cooled:

|         | min     | mean     |
|---------|---------|----------|
| master  | 92.6 ms | 104.2 ms |
| this PR | 86.9 ms | 96.0 ms  |

→ ~6–8% faster parse; output byte-identical. (JMH ParserBenchmark over the test suite showed +5.4% on JVM.)

Test plan

  • ./mill __.reformat
  • ./mill 'sjsonnet.jvm[3.3.7]'.test — 518/518 pass

Motivation:
Parser.Pos is invoked for nearly every AST node. It was `Index.map(off => new
Position(...))`: fastparse's `Index` stores the offset as an Int in its
`successValue: Any` field (boxing it), and the `.map` then unboxes it and
allocates a closure — per node. boxToInteger via SharedPackageDefs.Index was a
top self-frame in the parse flamegraph on kube-prometheus.

Modification:
- Rewrite Pos to write the Position object straight into successValue via
  ctx.freshSuccess(new Position(fileScope, ctx.index)), skipping the Int
  box/unbox and the map closure. Parse output (positions/errors) is unchanged.

Result:
JMH ParserBenchmark (parse-only, all test-suite files): 1.669 -> 1.579 ms/op
(+5.4%, non-overlapping bands). Native parse_time on kube-prometheus:
~105.6 -> ~100.9 ms (+4.5%, consistent). Output byte-identical. 450/450 tests pass.

He-Pin force-pushed the perf/parser-pos-no-boxing branch from b8a525f to c231f70 on May 30, 2026 12:42