perf: speed up base64 makeArray byte patterns#827
Open
He-Pin wants to merge 1 commit intodatabricks:masterfrom
Open
perf: speed up base64 makeArray byte patterns#827He-Pin wants to merge 1 commit intodatabricks:masterfrom
He-Pin wants to merge 1 commit intodatabricks:masterfrom
Conversation
Motivation:\nstd.base64 over std.makeArray byte streams currently evaluates each element through the generic lazy function path, creating Val.Num wrappers and doing per-element validation before feeding the encoder. The go_suite base64_stress benchmark is dominated by this overhead.\n\nModification:\nRecognize pure non-negative linear makeArray bodies of the form (... i ... ) % 256 and represent them with a lazy byte-pattern array. std.base64 can request raw bytes directly, while ordinary array access still computes elements lazily and falls back to the generic path for unrecognized or non-exact expressions.\n\nResult:\nbench/resources/go_suite/base64_stress.jsonnet: upstream/master 0.174 ms/op, this change 0.061 ms/op (-65.0%).\n\nVerification:\n./mill -i 'sjsonnet.jvm[3.3.7].reformat'\n./mill -i __.checkFormat\ngit diff --check\n./mill -i 'sjsonnet.jvm[3.3.7].test'\n./mill -i 'sjsonnet.js[3.3.7].test'\nJDK_JAVA_OPTIONS="--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC" ./mill -i 'sjsonnet.native[3.3.7].test'\n./mill -i bench.runRegressions bench/resources/go_suite/base64_stress.jsonnet
Contributor
Author
|
Closing as low priority for the current jrsonnet/docs gap work. The standout result is from base64_stress, which is not a jrsonnet docs benchmark; current docs base64 cases do not show an attributable standout delta. |
Contributor
Author
|
Reopened. This is not negative; keep it as draft for the generated std.makeArray byte-stream path. It should not be treated as a current docs/jrsonnet-gap ready PR because the standout benchmark is base64_stress rather than a jrsonnet docs case. |
Contributor
Author
|
Marking ready under the updated review bar: a PR can be ready when it has a clear positive result for at least one benchmark suite, even if it is not the primary docs/jrsonnet-gap case. Keep normal/noise-only benchmark movement out of the description. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
std.base64can encode an array of byte values. Generated byte arrays from commonstd.makeArray(..., function(i) (... i ... ) % 256)patterns currently allocate oneVal.Numper element before base64 can copy them into abyte[].This PR targets that generated-array path. The jrsonnet docs base64 byte-array benchmark is a large literal array, so it is tracked separately and is not treated as the same workload.
Modification
std.makeArraybodies ending in% 256.std.base64request raw bytes directly when the array can provide them.Result
Rebased/current against latest
upstream/masterat8b67cb1e.Targeted JMH for the generated byte-array path, lower is better:
go_suite/base64_stress.jsonnetDocs-aligned base64 context against latest source-built jrsonnet master
5b43fa8(jrsonnet 0.5.0-pre98), Scala Native hyperfine:std.base64std.base64 (byte array)The docs cases do not show an attributable PR delta, so they are listed only as external context. The ready signal for this PR is the generated
std.makeArraybyte-stream suite above.Verification
./mill -i __.checkFormatgit diff --check./mill -i 'sjsonnet.jvm[3.3.7].test'./mill -i 'sjsonnet.js[3.3.7].test'JDK_JAVA_OPTIONS="--enable-native-access=ALL-UNNAMED -Xmx8G -XX:+UseG1GC" ./mill -i 'sjsonnet.native[3.3.7].test'./mill -i bench.runRegressions bench/resources/go_suite/base64_stress.jsonnet./mill -i bench.runRegressions bench/resources/go_suite/base64.jsonnet bench/resources/go_suite/base64Decode.jsonnet bench/resources/go_suite/base64DecodeBytes.jsonnet bench/resources/go_suite/base64_byte_array.jsonnet./mill --no-server 'sjsonnet.native[3.3.7]'.nativeLinkhyperfine --warmup 10 --min-runs 50Boundary Checks
base64_stressis not a jrsonnet docs benchmark; it covers the generated-byte-array path this PR targets.std.base64 (byte array)case is a literal array, not astd.makeArraypattern.+,*, and final% 256.Doublefall back to the generic evaluator.std.base64requests raw bytes.References
d8b4db85fcbf7b4936b5d8ec0b5c4ba5428092035b43fa88b8c43856dd5a2daa9c5c251153c5e14d