
perf: preserve ASCII-safe simple format results #856

Draft

He-Pin wants to merge 4 commits into databricks:master from He-Pin:perf/simple-format-ascii-safe

Conversation

He-Pin (Contributor) commented May 13, 2026

Motivation:
large_string_template still spent time re-encoding and re-scanning the huge string produced by simple named format interpolation, even when the static format literals and dynamic values made the final string ASCII-safe as a JSON string. This targets the largest remaining local gap identified by #666-style benchmarking.

Key Design Decision:
Track ASCII safety as metadata on the compiled simple named format path instead of adding another renderer scan. The optimization is deliberately limited to all-simple %(key)s formats, where every emitted dynamic value goes through simpleStringValue and can be conservatively classified.

Modification:

  • Add staticAsciiSafe metadata to compiled format strings.
  • Return Val.Str.asciiSafe from Format.PartialApplyFmt when static literals and all simple named dynamic values are JSON-string ASCII-safe.
  • Keep unsafe strings, unsafe static literals, and mixed-key unsafe values conservative.
  • Add focused regression tests for safe numeric values, unsafe dynamic strings, unsafe static literals, and mixed-key safety.
  • Update the gap and sync ledgers.
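
The classification step behind this can be sketched roughly as follows. This is an illustrative simplification, not the PR's code: the helper name and the exact accepted character set are assumptions. A string counts as JSON-string ASCII-safe when rendering it inside a JSON string needs neither escapes nor multi-byte UTF-8:

```scala
// Hypothetical sketch: a string is JSON-string ASCII-safe when every character
// is printable ASCII and needs no JSON escaping. Control characters need
// \uXXXX escapes, '"' and '\\' need backslash escapes, and anything above
// 0x7E is not guaranteed to be a single ASCII byte. Conservative by design.
object AsciiSafe {
  def isJsonStringAsciiSafe(s: String): Boolean = {
    var i = 0
    while (i < s.length) {
      val c = s.charAt(i)
      if (c < 0x20 || c > 0x7e || c == '"' || c == '\\') return false
      i += 1
    }
    true
  }
}
```

Anything that fails this predicate simply skips the fast-path marker, so a false negative only costs the pre-existing scan, never correctness.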

Benchmark Results:
JVM JMH (bench.runRegressions bench/resources/cpp_suite/large_string_template.jsonnet):

  • Before: 0.683 ms/op
  • After: 0.677 ms/op

Scala Native hyperfine, large_string_template, before vs after:

  • Forward: 8.64 +/- 0.75 ms -> 8.01 +/- 0.52 ms
  • Reverse: 8.65 +/- 0.50 ms -> 8.17 +/- 0.48 ms

Scala Native hyperfine, source-built jrsonnet comparison:

  • sjsonnet after: 8.01-8.17 ms
  • jrsonnet: 6.0 +/- 1.2 ms
  • Remaining gap: about 1.34x

Guard benchmark, kube-prometheus:

  • Forward: 131.74 ms -> 129.14 ms
  • Reverse: 129.11 ms baseline vs 130.94 ms candidate
  • Interpreted as neutral/noisy, not a target regression.

Analysis:
The existing renderer already has an ASCII-safe direct byte path, but formatted strings lost that metadata and fell back to UTF-8 encoding plus escape scanning. This change preserves the metadata at the producer where the safety condition is known, avoiding an extra whole-string encoding/scan on large simple format outputs. The safety predicate is conservative: unknown complex values, unsafe strings, or unsafe literal text do not get the fast-path marker.
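A minimal sketch of that producer-side decision, using a hypothetical stand-in for sjsonnet's Val.Str (the PR itself returns Val.Str.asciiSafe from Format.PartialApplyFmt; the type and function here are simplified illustrations):

```scala
// Hypothetical stand-in for sjsonnet's Val.Str; only the safety flag matters here.
final case class Str(value: String, asciiSafe: Boolean)

// Conservative producer-side marking: the fast-path flag is set only when both
// the compiled static literals and every simple named dynamic value are known
// safe; otherwise the renderer scans the string exactly as before.
def markFormatted(rendered: String, staticAsciiSafe: Boolean, dynamicSafe: Boolean): Str =
  Str(rendered, asciiSafe = staticAsciiSafe && dynamicSafe)
```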

References:

Result:
large_string_template improves in both Native command orders, JVM JMH does not regress, output equality holds for large_string_template and kube-prometheus, and ./mill --no-server --ticker false --color false -j 1 __.test plus __.checkFormat pass.

He-Pin and others added 4 commits May 13, 2026 15:43
Motivation:
PR databricks#840 introduced a strict JSON fast path for .json imports but still
forces a full UTF-8 string decode for every cached file before handing
the text to ujson.StringParser. Real-world workloads (e.g. kube-prometheus)
import many .json files; decoding each one twice (once into String for
parsing, again as cache content) is pure overhead.

Key Design Decision:
ujson 4.4.3 ships ByteArrayParser, which parses UTF-8 JSON directly from
a byte array without an intermediate String. Cache small resolved files
as raw bytes (already what we read from disk) and lazily decode text
only when the importstr/parser-input path actually needs it. Preserve
parse-cache content identity by hashing the cached bytes with SHA-256
(length + hex digest) so external ParseCache implementations keep the
same collision resistance as the old full-string key.

Modification:
* Importer.scala: CachedResolver.parseJsonImport now calls
  ujson.ByteArrayParser.transform(content.readRawBytes(), visitor)
  instead of decoding the whole file to String first.
* CachedResolvedFile.scala (JVM/Native): small files are cached as
  Array[Byte]; getParserInput / readString materialize the String
  lazily; readRawBytes returns the cached bytes directly; contentHash
  is length + SHA-256 over the cached bytes; binary imports still use
  StaticBinaryResolvedFile.
* PreloaderTests.scala: tighten the strict-JSON fast-path coverage so
  it fails if the fast path ever falls back to readString().
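
The byte-level path and the cache key can be sketched like this. ujson.ByteArrayParser and ujson.Value are the real ujson API named in the commit; the surrounding function names are illustrative assumptions:

```scala
import java.security.MessageDigest

// Parse UTF-8 JSON straight from the cached bytes, skipping the intermediate
// String decode that StringParser would have required.
def parseJsonFromBytes(bytes: Array[Byte]): ujson.Value =
  ujson.ByteArrayParser.transform(bytes, ujson.Value)

// Cache key: byte length plus SHA-256 hex digest, keeping the collision
// resistance of the old full-string key without materializing the String.
def contentHash(bytes: Array[Byte]): String = {
  val digest = MessageDigest.getInstance("SHA-256").digest(bytes)
  bytes.length.toString + "-" + digest.map(b => "%02x".format(b & 0xff)).mkString
}
```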

Result:
* Output equality vs upstream sjsonnet and jrsonnet preserved on
  kube-prometheus and large_string_template.
* Native kube-prometheus hyperfine A/B (forward & reverse):
  clean 139.4 +/- 2.8 ms -> candidate 132.7 +/- 1.9 ms (forward)
  candidate 132.1 +/- 1.9 ms vs clean 140.3 +/- 2.6 ms (reverse)
* Full ./mill __.test green.

References:
Follow-up to databricks#840

Motivation:
Large inline objects produced by strict JSON imports can exceed the small-object shape that computeSortedInlineOrder was originally tuned for. Native sampling on kube-prometheus showed sorted inline-order computation as a materialization hotspot, and insertion sort becomes quadratic on those wider objects.

Modification:
Keep insertion sort for small inline objects, and use an in-place quicksort with insertion-sort cleanup for larger visible field sets. Record the accepted benchmark result and rejected parser/key-render micro-routes in the performance ledgers.
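
The hybrid strategy can be sketched as below; the cutoff and pivot choice are illustrative placeholders, not the tuned values from the PR:

```scala
// Sketch of a hybrid sort: insertion sort on small ranges (cheap, cache-friendly,
// no recursion), in-place quicksort partitioning on larger ones so wide visible
// field sets avoid insertion sort's quadratic worst case.
def hybridSort(a: Array[String], lo: Int, hi: Int, cutoff: Int = 16): Unit = {
  if (hi - lo <= cutoff) {
    // Insertion sort over a(lo..hi).
    var i = lo + 1
    while (i <= hi) {
      val v = a(i); var j = i - 1
      while (j >= lo && a(j).compareTo(v) > 0) { a(j + 1) = a(j); j -= 1 }
      a(j + 1) = v; i += 1
    }
  } else {
    // Hoare-style partition around the middle element, then recurse.
    val pivot = a((lo + hi) >>> 1)
    var i = lo; var j = hi
    while (i <= j) {
      while (a(i).compareTo(pivot) < 0) i += 1
      while (a(j).compareTo(pivot) > 0) j -= 1
      if (i <= j) { val t = a(i); a(i) = a(j); a(j) = t; i += 1; j -= 1 }
    }
    if (lo < j) hybridSort(a, lo, j, cutoff)
    if (i < hi) hybridSort(a, i, hi, cutoff)
  }
}
```

Calling hybridSort(keys, 0, keys.length - 1) sorts the visible field names in place; small objects never leave the insertion-sort branch, so the previously tuned fast case is preserved.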

Result:
Kube-prometheus Native A/B improved on top of strict JSON byte imports, with forward mean 145.3 ms -> 140.0 ms and reverse mean 151.6 ms -> 148.9 ms. Formatting and the full test suite pass.

References:
Upstream-base: databricks/sjsonnet@cedc083
Prior optimization: 883fca5 perf: parse strict JSON imports from bytes

Motivation:
Keep the performance exploration ledger current so future optimization work does not repeat Native-negative or build-invalid routes.

Modification:
Record rejected short-string, ASCII-safe, inline sort-cache, path-only parse-cache, and Native GC configuration probes with the validation evidence that ruled them out.

Result:
No runtime code changes are retained; the branch documents the failed hypotheses and preserves the current accepted optimization stack.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Motivation:
large_string_template still spent time re-encoding and re-scanning the huge string produced by simple named format interpolation, even when the final result was known to be JSON-string ASCII-safe.

Modification:
Track whether compiled format literals are ASCII-safe and return Val.Str.asciiSafe from PartialApplyFmt when every simple named dynamic value is also safe. Add regression coverage for safe numeric values, unsafe string values, unsafe static literals, and mixed-key safety.

Result:
Native large_string_template improved in both command orders (8.64 -> 8.01 ms forward, 8.65 -> 8.17 ms reverse); JVM JMH stayed neutral-positive (0.683 -> 0.677 ms/op); full __.test and checkFormat pass.

References:
bench/reports/sjsonnet-vs-jrsonnet-gaps.md

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>