Skip to content

perf: scan lf text-block lines with indexOf#841

Open
He-Pin wants to merge 1 commit intodatabricks:masterfrom
He-Pin:perf/lf-text-block-indexof
Open

perf: scan lf text-block lines with indexOf#841
He-Pin wants to merge 1 commit intodatabricks:masterfrom
He-Pin:perf/lf-text-block-indexof

Conversation

@He-Pin
Copy link
Copy Markdown
Contributor

@He-Pin He-Pin commented May 11, 2026

Motivation:
Large text blocks spend visible parser time finding line boundaries with a Scala character loop.

Key Design Decision:
Only single-character LF separators use String.indexOf('\n'); CRLF and other multi-character separator paths keep the previous scanner to preserve behavior.

Modification:
In the bulk ||| text-block body scanner, use String.indexOf to find LF line endings after indentation.

Benchmark Results:

Benchmark master PR #841 Delta Notes
Scala Native hyperfine bench/resources/cpp_suite/large_string_template.jsonnet 10.855 +/- 0.773 ms 10.300 +/- 0.638 ms -5.12% Output matched master.
Scala Native exploration A/B 11.191 ms 10.373 ms -7.3% Reverse-order A/B against frozen clean binary.
Source-built jrsonnet comparison sjsonnet 10.552 +/- 0.656 ms; jrsonnet 5.611 +/- 0.826 ms gap 1.88x Same workload, output matched jrsonnet.

Analysis:
This is a narrow parser optimization with explicit CRLF fallback. Other larger text-block scanner rewrites were tested and rejected, so this PR keeps only the stable LF indexOf win.

References:
Source exploration commit: He-Pin/sjsonnet 8f8ed59a.

Result:
Local ./mill --no-server -j 1 __.reformat and ./mill --no-server -j 1 __.test passed on this split branch (2066/2066).

Motivation:
Large text blocks spend visible parser time finding line boundaries with a Scala character loop.

Modification:
Use String.indexOf for single-character LF separators in the bulk text-block scanner while preserving the existing CRLF and multi-character separator path.

Result:
LF text-block parsing uses the platform string search fast path without changing text-block semantics.
@He-Pin He-Pin marked this pull request as ready for review May 11, 2026 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant