perf: lazily build platform stdlib#846

Closed
He-Pin wants to merge 2 commits into databricks:master from He-Pin:perf/lazy-platform-stdlib

He-Pin commented May 12, 2026

Motivation:

Platform CLI entry points eagerly build augmented stdlib modules (JVM: xz/gzip/regex; Native: gzip/regex) before parsing, even for programs that never reference std. This adds avoidable work on no-stdlib workloads such as large formatting/rendering templates.

Key Design Decision:

Keep public strict APIs strict. Interpreter, StaticOptimizer, SjsonnetMainBase.main0, and mainConfigured continue to accept Val.Obj and preserve the legacy protected createOptimizer(ev, std: Val.Obj, ...) extension path. The lazy provider is only exposed through private[sjsonnet] platform entry points used by JVM/Native mains.

Modification:

  • Add internal stdProvider: () => Val.Obj constructor/entry-point paths for platform mains.
  • Force the provider only when static optimization resolves an unbound std or $std.
  • Route public strict main0 / mainConfigured through the original strict interpreter construction path.
  • Add regression coverage for provider laziness and legacy optimizer-hook preservation.
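
The provider-forcing behaviour in the list above can be sketched roughly as follows. This is a minimal illustration, not sjsonnet's actual code: `StdObj` and `LazyStdResolver` are hypothetical stand-ins for `Val.Obj` and the static optimizer, and the `builds` counter exists only to make laziness observable.

```scala
// Hypothetical stand-in for Val.Obj; not sjsonnet's real class.
final case class StdObj(members: Map[String, String])

// Hypothetical stand-in for the part of static optimization that
// resolves free identifiers.
final class LazyStdResolver(stdProvider: () => StdObj) {
  // Counts provider invocations so laziness can be observed in a test.
  var builds: Int = 0

  // A lazy val runs its initializer at most once, on first access.
  private lazy val std: StdObj = {
    builds += 1
    stdProvider()
  }

  // Only an unbound `std` or `$std` forces the provider.
  def resolveFree(name: String): Option[StdObj] =
    if (name == "std" || name == "$std") Some(std) else None
}

object LazyStdDemo {
  def main(args: Array[String]): Unit = {
    val resolver = new LazyStdResolver(() => StdObj(Map("length" -> "builtin")))
    resolver.resolveFree("x")
    println(resolver.builds) // 0: the stdlib was never built
    resolver.resolveFree("std")
    resolver.resolveFree("$std")
    println(resolver.builds) // 1: built once, then reused
  }
}
```

A regression test for laziness can then assert that the counter stays at zero for a program that never mentions `std`.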

Benchmark Results:

JVM JMH (bench.runRegressions, single thread, same machine, #845 baseline vs this branch):

| Case | #845 baseline | This branch | Result |
| --- | --- | --- | --- |
| cpp_suite/large_string_template.jsonnet | 0.629 ms/op | 0.599 ms/op | +4.8% |
| bug_suite/assertions.jsonnet | 0.207 ms/op | 0.201 ms/op | neutral / +2.9% |

Scala Native hyperfine (-N, single benchmark process at a time):

| Case | This branch | #845 baseline | jrsonnet | Result |
| --- | --- | --- | --- | --- |
| minimal 1 startup | 5.86 +/- 0.62 ms | 5.68 +/- 0.67 ms | 4.61 +/- 1.98 ms | noisy, not claimed |
| cpp_suite/large_string_template.jsonnet | 10.98 +/- 1.05 ms | 11.28 +/- 1.34 ms | 5.61 +/- 2.28 ms | +2.7%, jrsonnet still ~1.96x faster |
| bug_suite/assertions.jsonnet | 5.93 +/- 0.60 ms | 6.20 +/- 0.57 ms | 4.94 +/- 0.76 ms | neutral / +4.3% |
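
For readers checking the Result columns: the percentages appear to be the relative reduction in mean time against the #845 baseline. That reading is an assumption, but it reproduces the large_string_template figures from the printed means. The helper names below are illustrative only.

```scala
object Improvement {
  // Percent reduction in time relative to the baseline (positive = faster).
  def pct(baseline: Double, branch: Double): Double =
    (baseline - branch) / baseline * 100.0

  // Round to one decimal place for display.
  def round1(x: Double): Double = math.round(x * 10).toDouble / 10.0

  def main(args: Array[String]): Unit = {
    println(round1(pct(0.629, 0.599))) // JVM large_string_template: 4.8
    println(round1(pct(11.28, 10.98))) // Native large_string_template: 2.7
    // Ratio of this branch to jrsonnet on the same Native case: 1.96
    println(math.round(10.98 / 5.61 * 100).toDouble / 100.0)
  }
}
```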

Analysis:

The stable gain is on no-stdlib CLI paths: the platform-augmented stdlib is not constructed unless std is actually referenced. Startup-only measurements are very noisy on this machine and are not used as the primary claim. The remaining confirmed gap is still large_string_template vs jrsonnet; this change removes a small part of the platform CLI overhead without changing Jsonnet semantics.

References:

  • Builds on #845 (perf: render escaped byte strings in one pass).
  • Local validation:
    • ./mill --no-server --ticker false --color false -j 1 __.checkFormat
    • ./mill --no-server --ticker false --color false -j 1 __.test
    • ./mill --no-server --ticker false --color false -j 1 bench.runRegressions bench/resources/cpp_suite/large_string_template.jsonnet bench/resources/bug_suite/assertions.jsonnet
    • JDK_JAVA_OPTIONS='--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC' ./mill --no-server --ticker false --color false -j 1 'sjsonnet.native[3.3.7]'.nativeLink

Result:

Draft stacked follow-up. Keep if CI stays green and the reviewer accepts the modest no-stdlib CLI improvement; otherwise it can be dropped independently from #845.

He-Pin added 2 commits May 12, 2026 15:29
Motivation:
Long JSON strings that contain escape characters used to scan the UTF-8 byte array twice in ByteRenderer: once to find escapes and once to pre-compute the exact escaped length. The large_string_template benchmark spends visible time in this path.

Modification:
Render escaped long strings with one copy/escape pass, growing ByteBuilder incrementally and refreshing its backing array after capacity checks. Add a regression test that compares ByteRenderer output with the char Renderer for long escaped strings including two-byte escapes, six-byte control escapes, and a trailing plain tail.
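
The single-pass idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual ByteRenderer or ByteBuilder code: `OnePassEscape` is a hypothetical helper, it escapes only a subset of short forms, and it grows a plain array instead of a builder.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical helper; not sjsonnet's ByteRenderer.
object OnePassEscape {
  private val Hex: Array[Byte] = "0123456789abcdef".getBytes(UTF_8)

  def escape(s: String): Array[Byte] = {
    val in = s.getBytes(UTF_8)
    // Sized for the unescaped case plus quotes; grown on demand, so no
    // separate pass is needed to pre-compute the escaped length.
    var out = new Array[Byte](in.length + 2)
    var n = 0

    // Make room for `extra` bytes; `out` is refreshed if reallocated.
    def ensure(extra: Int): Unit =
      if (n + extra > out.length)
        out = java.util.Arrays.copyOf(out, math.max(out.length * 2, n + extra))

    def put(b: Int): Unit = { out(n) = b.toByte; n += 1 }

    ensure(1); put('"')
    var i = 0
    while (i < in.length) {
      val c = in(i) & 0xff
      ensure(6) // worst case for one input byte is a six-byte \u00XX escape
      if (c == 0x22 || c == 0x5c) { put('\\'); put(c) } // quote, backslash
      else if (c == 0x0a) { put('\\'); put('n') }
      else if (c == 0x0d) { put('\\'); put('r') }
      else if (c == 0x09) { put('\\'); put('t') }
      else if (c < 0x20) { // other control bytes use the six-byte form
        put('\\'); put('u'); put('0'); put('0')
        put(Hex(c >> 4)); put(Hex(c & 0xf))
      } else put(c) // plain ASCII and UTF-8 continuation bytes copy through
      i += 1
    }
    ensure(1); put('"')
    java.util.Arrays.copyOf(out, n)
  }

  def main(args: Array[String]): Unit =
    println(new String(escape("a\"b\n"), UTF_8)) // prints "a\"b\n"
}
```

Checking capacity once per input byte against the worst-case escape width keeps the inner loop free of per-output-byte bounds logic, at the cost of occasionally growing slightly earlier than strictly necessary.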

Result:
The large_string_template JMH target improves from roughly 0.70-0.72 ms/op to roughly 0.58-0.65 ms/op in local runs, while full tests and formatting checks remain green.

References:
bench/resources/cpp_suite/large_string_template.jsonnet

Motivation:
Platform CLI entry points eagerly constructed augmented stdlib modules even for Jsonnet programs that never reference std. This added startup and no-stdlib template overhead on the Native/JVM CLI path.

Modification:
Add internal std-provider entry points for platform mains while keeping public strict Val.Obj APIs on the legacy optimizer-hook path. Static optimization now forces the provider only for unresolved std/$std identifiers, and regression tests cover provider laziness plus legacy createOptimizer override preservation.

Result:
Focused JMH and Native hyperfine runs are modest-positive on the no-stdlib template path, std-heavy guards remain neutral/noisy, and full ./mill --no-server --ticker false --color false -j 1 __.test passes.
He-Pin closed this May 12, 2026