perf: lazily build platform stdlib #846
Closed
He-Pin wants to merge 2 commits into
Motivation:
Long JSON strings that contain escape characters used to scan the UTF-8 byte array twice in ByteRenderer: once to find escapes and once to pre-compute the exact escaped length. The large_string_template benchmark spends visible time in this path.
Modification:
Render escaped long strings with one copy/escape pass, growing ByteBuilder incrementally and refreshing its backing array after capacity checks. Add a regression test that compares ByteRenderer output with the char Renderer for long escaped strings including two-byte escapes, six-byte control escapes, and a trailing plain tail.
Result:
The large_string_template JMH target improves from roughly 0.70-0.72 ms/op to roughly 0.58-0.65 ms/op in local runs, while full tests and formatting checks remain green.
References:
bench/resources/cpp_suite/large_string_template.jsonnet
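The one-pass idea described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual ByteRenderer or ByteBuilder code: the object name `OnePassEscapeSketch`, the `ensure`/`put` helpers, and the doubling growth policy are all hypothetical. It shows the key points only: no pre-scan for the escaped length, capacity checks as bytes are written, and always writing through the current backing array after a possible resize.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.util.Arrays

// Hypothetical sketch; names and growth policy are assumptions, not sjsonnet's API.
object OnePassEscapeSketch {
  private val Hex: Array[Byte] = "0123456789abcdef".getBytes(UTF_8)

  /** Writes `in` as a quoted JSON string, escaping in a single pass over the bytes. */
  def escape(in: Array[Byte]): Array[Byte] = {
    var out = new Array[Byte](in.length + 2) // optimistic size: quotes only, no escapes
    var n = 0

    // Capacity check. `out` may be replaced here, so every write below goes
    // through the `out` variable rather than a cached reference to the old array.
    def ensure(extra: Int): Unit =
      if (n + extra > out.length)
        out = Arrays.copyOf(out, math.max(out.length * 2, n + extra))

    def put(b: Int): Unit = { out(n) = b.toByte; n += 1 }

    // Two-byte escape such as \n or \".
    def putShort(c: Char): Unit = { ensure(2); put('\\'); put(c) }

    ensure(1); put('"')
    var i = 0
    while (i < in.length) {
      val c = in(i) & 0xff
      if (c == '"') putShort('"')
      else if (c == '\\') putShort('\\')
      else if (c == '\n') putShort('n')
      else if (c == '\r') putShort('r')
      else if (c == '\t') putShort('t')
      else if (c == '\b') putShort('b')
      else if (c == '\f') putShort('f')
      else if (c < 0x20) {
        // Six-byte control escape: \u00XX
        ensure(6)
        put('\\'); put('u'); put('0'); put('0')
        put(Hex(c >> 4)); put(Hex(c & 0xf))
      } else {
        // Plain ASCII and UTF-8 continuation bytes are copied unchanged.
        ensure(1); put(c)
      }
      i += 1
    }
    ensure(1); put('"')
    Arrays.copyOf(out, n)
  }
}
```

A string without escapes never triggers a resize in this sketch, so the common case stays a straight copy; only strings that actually contain escapes pay for growth.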
Motivation:
Platform CLI entry points eagerly constructed augmented stdlib modules even for Jsonnet programs that never reference std. This added startup and no-stdlib template overhead on the Native/JVM CLI path.
Modification:
Add internal std-provider entry points for platform mains while keeping public strict Val.Obj APIs on the legacy optimizer-hook path. Static optimization now forces the provider only for unresolved std/$std identifiers, and regression tests cover provider laziness plus legacy createOptimizer override preservation.
Result:
Focused JMH and Native hyperfine runs are modest-positive on the no-stdlib template path, std-heavy guards remain neutral/noisy, and full ./mill --no-server --ticker false --color false -j 1 __.test passes.
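The provider laziness described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: `StdObj` stands in for Val.Obj, `resolve` is a toy substitute for static optimization, and `runLazy` plays the role of an internal platform entry point. None of these names come from the sjsonnet code base.

```scala
// Hypothetical sketch; all names below are illustrative stand-ins.
object LazyStdSketch {
  // Stand-in for the augmented stdlib object (Val.Obj in the real code).
  final case class StdObj(members: Map[String, String])

  // Counts stdlib constructions so that laziness is observable.
  var builds = 0

  def buildPlatformStd(): StdObj = {
    builds += 1
    StdObj(Map("length" -> "<builtin>"))
  }

  // Memoizes a provider: the stdlib is built at most once, and only on first use.
  final class Lazy[A](provider: () => A) {
    lazy val value: A = provider()
  }

  // Toy "static optimization": only std / $std identifiers force the provider.
  def resolve(identifiers: Seq[String], std: Lazy[StdObj]): Seq[String] =
    identifiers.map {
      case "std" | "$std" => "std(" + std.value.members.size + " members)"
      case other          => other
    }

  // Internal lazy entry point, taking a provider instead of a built object.
  def runLazy(identifiers: Seq[String], stdProvider: () => StdObj): Seq[String] =
    resolve(identifiers, new Lazy(stdProvider))
}
```

In this sketch, `LazyStdSketch.runLazy(Seq("x", "y"), () => LazyStdSketch.buildPlatformStd())` never calls the provider, while any identifier list containing `std` or `$std` builds the stdlib exactly once.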
Motivation:
Platform CLI entry points eagerly build augmented stdlib modules (JVM: xz/gzip/regex; Native: gzip/regex) before parsing, even for programs that never reference std. This adds avoidable work on no-stdlib workloads such as large formatting/rendering templates.
Key Design Decision:
Keep public strict APIs strict.
Interpreter, StaticOptimizer, SjsonnetMainBase.main0, and mainConfigured continue to accept Val.Obj and preserve the legacy protected createOptimizer(ev, std: Val.Obj, ...) extension path. The lazy provider is only exposed through private[sjsonnet] platform entry points used by JVM/Native mains.
Modification:
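- Add stdProvider: () => Val.Obj constructor/entry-point paths for platform mains.
- During static optimization, force the provider only for unresolved std or $std identifiers.
- Keep public main0/mainConfigured going through the original strict interpreter construction path.
Benchmark Results: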
JVM JMH (bench.runRegressions, single thread, same machine, #845 baseline vs this branch):
- cpp_suite/large_string_template.jsonnet
- bug_suite/assertions.jsonnet
Scala Native hyperfine (-N, single benchmark process at a time):
- 1 startup
- cpp_suite/large_string_template.jsonnet
- bug_suite/assertions.jsonnet
Analysis:
The stable value is on no-stdlib CLI paths: platform augmented stdlib construction is skipped unless std is actually referenced. Startup-only measurements are very noisy on this machine and are not used as the primary claim. The remaining confirmed gap is still large_string_template vs jrsonnet, but this reduces a small part of the platform CLI overhead without changing Jsonnet semantics.
References:
- perf: render escaped byte strings in one pass
- ./mill --no-server --ticker false --color false -j 1 __.checkFormat
- ./mill --no-server --ticker false --color false -j 1 __.test
- ./mill --no-server --ticker false --color false -j 1 bench.runRegressions bench/resources/cpp_suite/large_string_template.jsonnet bench/resources/bug_suite/assertions.jsonnet
- JDK_JAVA_OPTIONS='--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC' ./mill --no-server --ticker false --color false -j 1 'sjsonnet.native[3.3.7]'.nativeLink
Result:
Draft stacked follow-up. Keep if CI stays green and the reviewer accepts the modest no-stdlib CLI improvement; otherwise it can be dropped independently from #845.