feat(v2.1-rv64): switch to u16 limbs in deferral airs #2808
PR 5 of the memory-bus-u16 split, stacked on PR 1. Two independent parts.

Part 1: internal u16 reshape, OutputKey byte layout unchanged:
- Call core: input_commit, output_commit, output_len trace columns reshape to u16 cells.
- Output core: output_commit, output_len reshape to u16 cells.
- Canonicity sub-AIR over u16 cells.
- Deferral rd_val / rs_val register-pointer columns become u16-shaped.
- CPU + CUDA tracegen updates.
- Reuses PR 1's byte-memory and field-memory chunk helpers; no new generic split_memory_ops-style code is introduced.

Part 2: output commitment-format change (breaks vkey compatibility):
- SPONGE_BYTES_PER_ROW = 2 * DIGEST_SIZE = 16; sponge_inputs cells hold u16 values on data rows and receive 16-bit range checks. Init row stays `[deferral_idx, output_len, 0, ...]` and is exempt from u16 checks.
- Two memory-bus writes per output row instead of one.
- def_fn::hash_output_raw packs byte pairs into each sponge cell.
- Bitwise-lookup wiring removed from the output chip only, since it no longer needs byte-level range checks. The call chip still uses bitwise.
- CUDA + tests updated; expected output commits regenerated.

Output length contract:
- output_len remains a byte count.
- Raw deferral outputs must be a multiple of 16 bytes; non-16-byte aligned raw outputs are invalid (no implicit padding).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
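To make the byte-pair packing and the 16-byte length contract concrete, here is a minimal Rust sketch. It is not the code from this PR: `pack_output_rows` is a hypothetical helper, and the little-endian ordering of each byte pair is an assumption, since the commit message does not state the byte order.

```rust
/// Digest width in cells, as quoted in the commit message.
const DIGEST_SIZE: usize = 8;
/// Bytes absorbed per sponge row after this change: 2 * DIGEST_SIZE = 16.
const SPONGE_BYTES_PER_ROW: usize = 2 * DIGEST_SIZE;

/// Hypothetical helper (not the PR's API): splits a raw output into rows of
/// DIGEST_SIZE u16 cells, two bytes per cell.
/// Assumption: each pair is packed little-endian (low byte first).
fn pack_output_rows(raw: &[u8]) -> Result<Vec<[u16; DIGEST_SIZE]>, String> {
    // The contract rejects unaligned outputs instead of padding them.
    if raw.len() % SPONGE_BYTES_PER_ROW != 0 {
        return Err(format!(
            "raw output length {} is not a multiple of {}",
            raw.len(),
            SPONGE_BYTES_PER_ROW
        ));
    }
    Ok(raw
        .chunks_exact(SPONGE_BYTES_PER_ROW)
        .map(|row| {
            let mut cells = [0u16; DIGEST_SIZE];
            for (cell, pair) in cells.iter_mut().zip(row.chunks_exact(2)) {
                *cell = u16::from_le_bytes([pair[0], pair[1]]);
            }
            cells
        })
        .collect())
}
```

Under these assumptions a 32-byte output yields two rows of eight cells, while a 20-byte output is rejected rather than padded.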
Claude finished @shuklaayush's task in 2m 29s

Code Review: PR #2808: Switch to u16 limbs in deferral AIRs

Constraint Soundness
I independently verified the constraint logic across all changed AIRs.
Canonicity sub-AIR (byte → u16): BabyBear
Call core range-check coverage (37 per valid row):
Output AIR range-check coverage:
Memory bus address strides: With
Sponge absorption rate: Each u16 sponge cell carries 2 bytes;
Timestamp arithmetic:

Constant Arithmetic
All constants verified consistent between Rust and CUDA:

Note on CUDA/Rust Struct Layout Parity
All CUDA structs match their Rust counterparts field-for-field:

Extension Wiring

verify-stark Output Circuit
Both the AIR (
A previous automated review incorrectly flagged these as bugs (wrong
The new
Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights. Commit: eef7099
PR 5 of the memory-bus-u16 split, stacked on #2794. Shrinks deferral trace columns whose constraints only operate on packed u16 values, and doubles the sponge absorption rate of the deferral output chip from one digest per row to two.
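The effect of doubling the absorption rate can be checked with simple arithmetic. The sketch below is illustrative only: `data_rows` is a hypothetical function, it counts data rows and ignores the init row, and it assumes the byte stream must be aligned to the row size at either rate.

```rust
/// Digest width in cells, as used throughout this PR.
const DIGEST_SIZE: usize = 8;

/// Hypothetical helper: number of sponge data rows needed to absorb
/// `output_len` bytes at `bytes_per_row`. Returns None when the length
/// is not row-aligned (alignment at the old rate is an assumption).
fn data_rows(output_len: usize, bytes_per_row: usize) -> Option<usize> {
    (output_len % bytes_per_row == 0).then(|| output_len / bytes_per_row)
}
```

For a 64-byte output, `data_rows(64, DIGEST_SIZE)` gives 8 rows at the old rate and `data_rows(64, 2 * DIGEST_SIZE)` gives 4 rows at the new one, so the data-row count halves for the same byte stream.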
Why
#2794 still stores deferral commit / output-key columns byte-shaped while packing the memory-bus payloads. Shrinking those columns to u16 cells halves the affected trace width and lets the canonicity sub-AIR use per-u16-cell 16-bit range checks instead of byte-pair `BitwiseOperationLookup` checks.

Separately, the deferral output chip absorbs only `DIGEST_SIZE = 8` bytes per Poseidon row even though the sponge state can hold `2 * DIGEST_SIZE = 16` bytes. Doubling the absorption rate halves the number of output rows for the same byte stream.

What changes
Trace columns to u16 cells
Files: `extensions/deferral/circuit/src/{call,output,canonicity}/`, `extensions/deferral/circuit/src/utils.rs`, `extensions/deferral/circuit/cuda/`.

- Call core: `input_commit`, `output_commit`, `output_len` are u16 cells.
- Output core: `output_commit`, `output_len` are u16 cells.
- Deferral `rd_val` / `rs_val` register-pointer columns are u16-shaped.
- Reuses the `split_byte_memory_ops` / `split_f_memory_ops` and `byte_memory_op_chunk` / `f_memory_op_chunk` helpers; no new `split_memory_ops`-style helper is introduced.

Sponge absorption rate
Files: `extensions/deferral/circuit/src/output/{air,trace,execution,tests}.rs`, `extensions/deferral/circuit/src/def_fn.rs`, `extensions/deferral/circuit/src/extension/mod.rs`, `extensions/deferral/circuit/cuda/src/output.cu`.

- `SPONGE_BYTES_PER_ROW = 2 * DIGEST_SIZE = 16` (was `DIGEST_SIZE = 8`).
- `sponge_inputs` cells hold u16 values on data rows, each carrying two output bytes; data-row `sponge_inputs` receive 16-bit range checks. The init row remains `[deferral_idx, output_len, 0, ...]` and is exempt.
- `def_fn::hash_output_raw` packs byte pairs into each sponge cell before absorbing.

Output length contract
- `output_len` remains a byte count.

Migration notes

- `bitwise_lu` should drop the argument.

resolves int-7834
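As a rough illustration of what a canonicity check over u16 cells has to establish, the sketch below tests whether a pair of u16 limbs encodes a canonical BabyBear element. It is a host-side analogue, not the sub-AIR's constraint system: the two-limb little-endian layout and the name `is_canonical` are assumptions made for the example. The modulus value 2^31 - 2^27 + 1 is the standard BabyBear prime.

```rust
/// BabyBear modulus: 2^31 - 2^27 + 1 = 0x7800_0001.
const BABY_BEAR_P: u32 = 0x7800_0001;

/// Hypothetical host-side check (not the AIR constraints): returns true
/// when the limbs encode a value in [0, p).
/// Assumption: limbs[0] is the low 16 bits and limbs[1] the high 16 bits.
fn is_canonical(limbs: [u16; 2]) -> bool {
    let value = (limbs[0] as u32) | ((limbs[1] as u32) << 16);
    value < BABY_BEAR_P
}
```

With this layout, `[0x0000, 0x7800]` (the value p - 1) is canonical, while `[0x0001, 0x7800]` (the value p itself) is not, even though both limbs individually pass a 16-bit range check.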