
perf: chunk long string byte escaping #809

Merged
stephenamar-db merged 2 commits into databricks:master from He-Pin:split/pr776-byte-chunked-escape
May 7, 2026

@He-Pin He-Pin commented Apr 30, 2026

Motivation:

Split the JMH-positive, JDK17/JIT/GC-friendly long-string rendering piece out of #776. Keep this PR focused on byte rendering for long strings that contain JSON escapes; this does not include the broader format, stdlib, compareStrings, or Scala Native experiments from #776.

Modification:

  • Add CharSWAR.findFirstEscapeChar(byte[], from, to) on JVM, Scala.js, and Scala Native.
  • In BaseByteRenderer, keep the existing UTF-8 byte array for long strings, locate escape bytes, bulk-copy clean chunks with System.arraycopy, and escape only matching bytes inline.
  • Precompute the exact escaped output length, reserve ByteBuilder once, then write directly to the backing byte array. This removes repeated ensureLength/appendUnsafeC calls from the dirty long-string loop.
  • Use a static byte hex table for \u00XX control escapes.
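
The steps in this list can be sketched as follows. This is a minimal illustration in Java, not the code from this PR: the class and method names (`ChunkedEscapeSketch`, `render`, `shortEscape`) are hypothetical, the escape set is assumed to be the standard JSON one (`"`, `\`, and control bytes below 0x20), a plain `byte[]` stands in for `ByteBuilder`, and the scan is a simple scalar loop rather than the `CharSWAR` helper.

```java
import java.nio.charset.StandardCharsets;

final class ChunkedEscapeSketch {
    // Static hex table for the two low digits of a 6-byte control escape.
    private static final byte[] HEX = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);

    // Assumed escape set: quote, backslash, and control bytes below 0x20.
    // UTF-8 lead and continuation bytes are negative in Java and never match.
    static boolean needsEscape(byte b) {
        return b == '"' || b == '\\' || (b >= 0 && b < 0x20);
    }

    // Scalar stand-in for the scan helper: index of the first escape byte, or -1.
    static int findFirstEscape(byte[] a, int from, int to) {
        int i = from;
        while (i < to) {
            if (needsEscape(a[i])) return i;
            i++;
        }
        return -1;
    }

    // Second character of a two-byte escape, or -1 when the 6-byte hex form is needed.
    static int shortEscape(int c) {
        switch (c) {
            case '"': return '"';
            case '\\': return '\\';
            case '\b': return 'b';
            case '\f': return 'f';
            case '\n': return 'n';
            case '\r': return 'r';
            case '\t': return 't';
            default: return -1;
        }
    }

    static byte[] render(String s) {
        byte[] in = s.getBytes(StandardCharsets.UTF_8);
        int n = in.length;
        // Pass 1: exact output length, so the output is sized once.
        int outLen = n + 2; // surrounding quotes
        for (int i = 0; i < n; i++) {
            if (needsEscape(in[i])) outLen += (shortEscape(in[i]) >= 0) ? 1 : 5;
        }
        byte[] out = new byte[outLen];
        int o = 0;
        out[o++] = '"';
        // Pass 2: bulk-copy clean chunks, escape only the matching byte.
        int pos = 0;
        while (pos < n) {
            int hit = findFirstEscape(in, pos, n);
            int end = hit < 0 ? n : hit;
            System.arraycopy(in, pos, out, o, end - pos);
            o += end - pos;
            if (hit < 0) break;
            byte b = in[hit];
            int c = shortEscape(b);
            out[o++] = '\\';
            if (c >= 0) {
                out[o++] = (byte) c;
            } else {
                out[o++] = 'u';
                out[o++] = '0';
                out[o++] = '0';
                out[o++] = HEX[(b >> 4) & 0xF];
                out[o++] = HEX[b & 0xF];
            }
            pos = hit + 1;
        }
        out[o++] = '"';
        return out;
    }

    static String renderToString(String s) {
        return new String(render(s), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(renderToString("line one\nline \"two\""));
    }
}
```

Because the first pass computes the exact length, the second pass never checks capacity: every write is a direct array store or a single `System.arraycopy` per clean chunk.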

JIT / GC shape:

  • Hot code stays in simple while loops, System.arraycopy, and small private helpers.
  • No reflection, no internal JDK APIs, no closures/iterators in the rendering loop.
  • No per-chunk or per-escape objects are allocated by this follow-up; the existing per-long-string UTF-8 byte array remains the only temporary for this path.
  • I tested a no-allocation ASCII scalar path, but rejected it because it regressed large_string_template and large_string_join JMH.

Notable results only:

JMH target run, same machine, same command shape on upstream/master and this branch:

./mill -i bench.runRegressions bench/resources/cpp_suite/large_string_template.jsonnet bench/resources/cpp_suite/large_string_join.jsonnet

| Benchmark | upstream/master | PR | Delta |
| --- | --- | --- | --- |
| large_string_template | 1.552 ms/op | 1.154 ms/op | -25.6% / 1.34x faster |

Scala Native hyperfine, release-full native binary, 20 runs:

| Benchmark | upstream/master | PR | Delta |
| --- | --- | --- | --- |
| large_string_template | 10.5 +/- 0.2 ms | 9.6 +/- 0.3 ms | -8.6% / 1.09x faster |
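
For reference, the Delta columns in both tables follow directly from the two timings in each row. A small sketch of that arithmetic (the class and method names are hypothetical, not part of the benchmark tooling):

```java
final class BenchDeltaSketch {
    // Relative change in time per op; negative means the PR is faster.
    static double deltaPercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    // How many times faster the PR is than the baseline.
    static double speedup(double before, double after) {
        return before / after;
    }

    public static void main(String[] args) {
        // JMH row: 1.552 ms/op on upstream/master, 1.154 ms/op on the PR.
        System.out.printf("JMH:    %.1f%% / %.2fx%n", deltaPercent(1.552, 1.154), speedup(1.552, 1.154));
        // Scala Native hyperfine row: mean of 10.5 ms vs. 9.6 ms.
        System.out.printf("Native: %.1f%% / %.2fx%n", deltaPercent(10.5, 9.6), speedup(10.5, 9.6));
    }
}
```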

large_string_join was rechecked as a guardrail and stayed neutral, so it is intentionally omitted from the result tables.

Verification:

  • ./mill -i 'sjsonnet.jvm[3.3.7].compile'
  • ./mill -i 'sjsonnet.jvm[3.3.7].test'
  • ./mill -i 'sjsonnet.js[3.3.7].compile' 'sjsonnet.native[3.3.7].compile'
  • ./mill -i 'sjsonnet.native[3.3.7].nativeLink'
  • ./mill -i __.checkFormat
  • git diff --check
  • Focused JMH and Native hyperfine commands above

References:

Motivation:
Split the JMH-positive long-string rendering piece out of databricks#776 without carrying over the broader Scala Native render-pipeline experiment.

Modification:
- Add CharSWAR.findFirstEscapeChar for byte arrays on JVM, JS, and Native.
- Keep the existing UTF-8 byte array for long strings, but locate escape bytes and copy clean chunks with System.arraycopy.
- Escape only the matching bytes inline.
- Precompute the exact escaped output length before writing dirty strings so ByteBuilder does not grow repeatedly.
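
The byte-array scan in the first item is named after SWAR (SIMD within a register). One common way to build such a scan is sketched below in Java; it is an assumption about the general technique, not the `CharSWAR` code from this PR. The class name, the use of a little-endian `VarHandle` for the 8-byte load, and the escape set (`"`, `\`, bytes below 0x20) are all illustrative choices.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;

final class SwarEscapeScanSketch {
    // Reads 8 bytes of a byte[] as one little-endian long (public JDK API, Java 9+).
    private static final VarHandle LONG_LE =
        MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.LITTLE_ENDIAN);

    private static final long ONES = 0x0101010101010101L;
    private static final long HIGHS = 0x8080808080808080L;

    // Sets the high bit of each lane whose byte is zero. Lanes above the first
    // hit can be flagged spuriously by borrow, so only the lowest flag is used.
    private static long zeroLanes(long v) {
        return (v - ONES) & ~v & HIGHS;
    }

    // Flags lanes holding a quote, a backslash, or a byte below 0x20.
    private static long escapeLanes(long w) {
        long quote = zeroLanes(w ^ (ONES * '"'));
        long backslash = zeroLanes(w ^ (ONES * '\\'));
        // "less than 0x20" test; bytes >= 0x80 are masked out by ~w.
        long control = (w - ONES * 0x20) & ~w & HIGHS;
        return quote | backslash | control;
    }

    // Index of the first byte in [from, to) that needs escaping, or -1.
    static int findFirstEscape(byte[] a, int from, int to) {
        int i = from;
        // Word loop: test 8 bytes per iteration.
        while (i + 8 <= to) {
            long hits = escapeLanes((long) LONG_LE.get(a, i));
            if (hits != 0) {
                // Little-endian: the lowest flagged lane is the earliest byte.
                return i + (Long.numberOfTrailingZeros(hits) >>> 3);
            }
            i += 8;
        }
        // Scalar tail for the remaining 0..7 bytes.
        while (i < to) {
            byte b = a[i];
            if (b == '"' || b == '\\' || (b >= 0 && b < 0x20)) return i;
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "a long clean prefix, then a \"quote\"".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(findFirstEscape(data, 0, data.length));
    }
}
```

A caller would use the returned index as the end of the clean chunk to copy, then resume scanning one byte past the escape.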

Result:
This keeps the change JDK17/JIT/GC friendly: straight byte-array loops, no internal JDK APIs, no extra temporary arrays beyond the existing UTF-8 encoding, and no regression on clean long strings.
@He-Pin He-Pin marked this pull request as ready for review May 7, 2026 17:23
@He-Pin He-Pin marked this pull request as draft May 7, 2026 17:26
@He-Pin He-Pin marked this pull request as ready for review May 7, 2026 17:52
@stephenamar-db stephenamar-db merged commit 6898d2f into databricks:master May 7, 2026
5 checks passed