solidity: byte-swap + single mstore for uint64/uint128 (de)serialization #95
Draft
deuszx wants to merge 1 commit into
Replace the byte-by-byte little-endian construction loop with a constant-time byte-swap chain followed by a single mstore/mload. The swap chain reverses byte order (BCS is little-endian, EVM mstore is big-endian); the assembly block then deposits the swapped value at the top of a 32-byte word so mstore writes the BCS bytes at result[0..N] in one operation. The deserialize path mirrors this with mload + shr + the same swap chain.

The deserializers reintroduce the bounds check that the old byte-by-byte path got for free from Solidity's `input[pos + i]` indexing:

    require(pos + 8 <= input.length, "uint64 deserialize: out of bounds");
    require(pos + 16 <= input.length, "uint128 deserialize: out of bounds");

Without these, mload would silently read up to 24 (resp. 16) bytes past the end of `input` for short payloads and return a garbage value. With them, the assembly read is also legal under the `memory-safe` contract, since `bytes memory` data slots are allocated rounded up to 32 bytes.

Each unrolled swap chain carries a comment noting that EVM has no native byte-swap and describing what each term does, to make future edits less error-prone.

Coverage:

* `test_uint64_endian_boundaries` / `test_uint128_endian_boundaries` round-trip byte-distinct values (0, 1, 0xff, 0x100, 0x0102…, max) to catch endian bugs in the swap formula.
* `test_uint_deserialize_truncated_input_reverts` calls `bcs_deserialize_offset_uint64` with 7 bytes and `bcs_deserialize_offset_uint128` with 15 bytes and asserts that each reverts rather than returning a garbage value past the input.
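As a rough illustration of the technique described above, here is a minimal sketch for the `uint64` case. It is not the code from this PR: the library name `BcsSwapSketch`, the function names, and the `(pos, input)` parameter order are assumptions made for the example, and the particular three-round mask-and-shift swap is just one common way to reverse eight bytes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Illustrative sketch only; all names here are hypothetical.
library BcsSwapSketch {
    /// Reverse the 8 bytes of `v`. The EVM has no native byte-swap opcode,
    /// so the reversal is done with three mask-and-shift rounds.
    function swap64(uint64 v) internal pure returns (uint64) {
        // Round 1: swap adjacent bytes.
        v = ((v & 0xFF00FF00FF00FF00) >> 8) | ((v & 0x00FF00FF00FF00FF) << 8);
        // Round 2: swap adjacent 16-bit pairs.
        v = ((v & 0xFFFF0000FFFF0000) >> 16) | ((v & 0x0000FFFF0000FFFF) << 16);
        // Round 3: swap the two 32-bit halves (shifts truncate to uint64).
        v = (v >> 32) | (v << 32);
        return v;
    }

    /// Little-endian (BCS-style) encoding of a uint64 using one mstore.
    function serializeUint64(uint64 input) internal pure returns (bytes memory result) {
        result = new bytes(8);
        // Place the swapped value in the top 8 bytes of a 32-byte word.
        uint256 word = uint256(swap64(input)) << 192;
        assembly ("memory-safe") {
            // mstore is big-endian, so the top 8 bytes land at result[0..8];
            // the remaining 24 bytes written are zero padding.
            mstore(add(result, 0x20), word)
        }
    }

    /// Decode a little-endian uint64 at `pos`; reverts on truncated input.
    function deserializeUint64(uint256 pos, bytes memory input)
        internal
        pure
        returns (uint256 newPos, uint64 value)
    {
        require(pos + 8 <= input.length, "uint64 deserialize: out of bounds");
        uint256 word;
        assembly ("memory-safe") {
            // Loads 32 bytes starting at input[pos]; only the top 8 are used.
            word := mload(add(add(input, 0x20), pos))
        }
        // Drop the 24 trailing bytes, then undo the byte order.
        value = swap64(uint64(word >> 192));
        newPos = pos + 8;
    }
}
```

A `uint128` variant would presumably follow the same pattern with one extra swap round, a shift of 128 instead of 192, and the `pos + 16` bounds check.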
Summary
Replace the byte-by-byte little-endian construction loop in `bcs_serialize_uint64` / `bcs_serialize_uint128` and the mirror construction in `bcs_deserialize_offset_uint64` / `bcs_deserialize_offset_uint128` with a constant-time byte-swap chain followed by a single `mstore`/`mload`.

The swap chain reverses byte order (BCS is little-endian, EVM `mstore` is big-endian); the assembly block then deposits the swapped value at the top of a 32-byte word so `mstore` writes the BCS bytes at `result[0..N]` in one operation. The deserialize path mirrors this with `mload` + `shr` + the same swap chain.

Benchmarks
Measured with `forge test --gas-report`, `via_ir = true`, `optimizer_runs = 200`, solc 0.8.33.

[Gas table for `ser64`, `deser64`, `ser128`, `deser128`; values not preserved.]

The Yul optimizer does not collapse the old byte-by-byte loop to anything close to the unrolled form, so these savings are real on hot paths (e.g. light-client certificate verification, which decodes many `u64` fields per call).

Deployed bytecode (harness contract that inlines the library, both forms):

[Bytecode-size table not preserved.]

Break-even: ~45 mixed `uint64`/`uint128` (de)serialize calls. A single certificate verification hits orders of magnitude more.
Reproduce the benchmark

Save the two libraries below as `Old.sol` and `New.sol`. Each file exports a small harness contract whose `ser*`/`deser*` methods delegate to the library so Foundry can measure them. `Old.sol` is the form before this PR (the byte-by-byte loop). `New.sol` is the form after this PR.

Create `foundry.toml`:

Save the test harness as `test/Bench.t.sol`:

Run `forge test --gas-report` and inspect the per-function gas table for `OldHarness`/`NewHarness`. Runtime-bytecode size of each harness comes from `solc --via-ir --optimize --bin-runtime` (or `forge inspect <name> deployedBytecode`).

Test Plan
`cargo test -p serde-generate --features solidity --test integration_tests solidity`: full solidity test suite, including:

* `test_uint64_endian_boundaries` / `test_uint128_endian_boundaries` round-trip byte-distinct values (`0`, `1`, `0xff`, `0x100`, `0x0102…`, `type(uintN).max`) to catch endian bugs in the swap formula.
* `test_uint_deserialize_truncated_input_reverts` calls `bcs_deserialize_offset_uint64` with 7 bytes and `bcs_deserialize_offset_uint128` with 15 bytes and asserts that each reverts rather than returning a garbage value past the input.
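For readers who want to see what such checks might look like in Foundry, the sketch below shows the general shape of an endian-boundary round-trip test and a truncated-input revert test. It is not the test code from this PR: `RefHarness` is a hypothetical stand-in that uses the byte-by-byte logic so the file is self-contained, its method names and `(pos, input)` parameter order are assumptions, and `0x0102030405060708` is an illustrative byte-distinct value chosen for this example. It assumes `forge-std` is installed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

/// Hypothetical stand-in using byte-by-byte logic; swap in the real library.
contract RefHarness {
    function ser64(uint64 v) external pure returns (bytes memory out) {
        out = new bytes(8);
        for (uint256 i = 0; i < 8; i++) {
            out[i] = bytes1(uint8(v >> (8 * i))); // least significant byte first
        }
    }

    function deser64(uint256 pos, bytes memory input)
        external
        pure
        returns (uint256 newPos, uint64 v)
    {
        for (uint256 i = 0; i < 8; i++) {
            // Checked indexing reverts if pos + i is past the end of input.
            v |= uint64(uint8(input[pos + i])) << (8 * i);
        }
        newPos = pos + 8;
    }
}

contract EndianSketchTest is Test {
    RefHarness h;

    function setUp() public {
        h = new RefHarness();
    }

    function test_uint64_endian_boundaries() public {
        uint64[6] memory vals =
            [uint64(0), 1, 0xff, 0x100, 0x0102030405060708, type(uint64).max];
        for (uint256 i = 0; i < vals.length; i++) {
            bytes memory enc = h.ser64(vals[i]);
            assertEq(enc.length, 8);
            // Little-endian: the first encoded byte is the low byte of the value.
            assertEq(uint256(uint8(enc[0])), uint256(uint8(vals[i])));
            (uint256 next, uint64 back) = h.deser64(0, enc);
            assertEq(next, 8);
            assertEq(uint256(back), uint256(vals[i]));
        }
    }

    function test_uint64_deserialize_truncated_input_reverts() public {
        bytes memory truncated = new bytes(7); // one byte short of a uint64
        vm.expectRevert(); // any revert is accepted in this sketch
        h.deser64(0, truncated);
    }
}
```

The revert check goes through an external call on a deployed harness because `vm.expectRevert` applies to the next call rather than to an internal function invoked inline.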