Skip to content

feat(v2.1-rv64): switch memory bus blocks to u16 cells#2794

Open
shuklaayush wants to merge 1 commit into
develop-v2.1.0-rv64from
feat/memory-bus-u16-pr1
Open

feat(v2.1-rv64): switch memory bus blocks to u16 cells#2794
shuklaayush wants to merge 1 commit into
develop-v2.1.0-rv64from
feat/memory-bus-u16-pr1

Conversation

@shuklaayush
Copy link
Copy Markdown
Collaborator

@shuklaayush shuklaayush commented May 19, 2026

Switch memory bus blocks to u16 cells

PR 1 of the memory-bus-u16 split. Atomic flip of the memory-bus block from 8 u8 cells to 4 u16 cells. The bus interaction is now fixed-shape (BLOCK_FE_WIDTH = 4), and byte-shaped chips pack their u8 columns into bus cells via a single helper.

Why

MemoryBus interactions previously carried 8 field-element-encoded u8s per message. RV64 cell-typed memory (registers, RV memory, public values) is naturally u16-aligned, and packing two bytes per bus cell halves the bus payload width without losing constraint power. This PR makes the bus shape a single hard constant (BLOCK_FE_WIDTH) so the bridge API and downstream chips no longer need a const N generic for it.

Follow-up PRs will use the wider cells to narrow trace columns, prune redundant range checks, reshape SHA-2 / Keccak state, and revisit deferral.

What changes

Memory layout constants

Files: crates/vm/src/arch/config.rs, crates/vm/cuda/include/system/memory/params.cuh.

  • Single source of truth: U16_CELL_SIZE = size_of::<u16>() (intrinsic) drives BUS_PTR_SCALE; BLOCK_FE_WIDTH = 4 is the design choice; MEMORY_BLOCK_BYTES, BUS_BLOCK_STRIDE, BUS_LEAF_STRIDE, DIGEST_WIDTH all derive.
  • Cross-constant equality asserts removed where the derivation makes them true by construction.
  • One divisibility guard kept on DIGEST_WIDTH.is_multiple_of(BLOCK_FE_WIDTH).
  • CUDA-side params.cuh mirrors the constants and includes ASCII diagrams for the u16-celled AS layout (RV64) and the F-celled AS layout (DEFERRAL_AS), showing how byte_ptr, cell_idx, and bus_ptr relate.

Layout reference

Terminology:

  • Cell — one storage word in an address space (u16 for RV64 ASes, F for DEFERRAL_AS).
  • Block — one memory-bus message: BLOCK_FE_WIDTH cells = MEMORY_BLOCK_BYTES bytes.
  • Digest — output of one Poseidon2 compression, DIGEST_WIDTH cells; also one merkle leaf.
  • Leaf — one merkle-tree leaf = one Poseidon2 half = DIGEST_WIDTH cells = BLOCKS_PER_LEAF blocks.
u16-celled AS layout (RV64_REGISTER_AS, RV64_MEMORY_AS, PUBLIC_VALUES_AS)

One merkle leaf = 16 bytes = 8 u16 cells = 2 bus blocks:

  byte_ptr:     0   1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
               ┌───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┬───┐
  u8 storage:  │b0 │b1 │b2 │b3 │b4 │b5 │b6 │b7 │b8 │b9 │b10│b11│b12│b13│b14│b15│
               └───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┴───┘
               ╰──────╯╰──────╯╰──────╯╰──────╯╰──────╯╰──────╯╰──────╯╰──────╯   u16 cells (LE)
  cell_idx:        0       1       2       3       4       5       6       7
  bus_ptr:         0       2       4       6       8      10      12      14
               ╰─────────── block 0 ──────────╯╰─────────── block 1 ──────────╯
               ╰────────────────── one merkle leaf / digest ──────────────────╯
  • byte_ptr = U16_CELL_SIZE * cell_idx (coincides with bus_ptr for u16 ASes)
  • bus_ptr = BUS_PTR_SCALE * cell_idx
  • Block stride on bus: BUS_BLOCK_STRIDE = 8 = BUS_PTR_SCALE * BLOCK_FE_WIDTH
  • Leaf stride on bus: BUS_LEAF_STRIDE = 16 = BUS_PTR_SCALE * DIGEST_WIDTH
F-celled AS layout (DEFERRAL_AS)

Each cell holds one F element (size_of::<F>() bytes on host; 4 for BabyBear). The bus addresses cells directly, so cell / block / leaf counts and bus_ptr / cell_idx mappings match the u16 AS.

  byte_ptr:       0       4       8       12      16      20      24      28
               ┌───────┬───────┬───────┬───────┬───────┬───────┬───────┬───────┐
  F storage:   │  F0   │  F1   │  F2   │  F3   │  F4   │  F5   │  F6   │  F7   │
               └───────┴───────┴───────┴───────┴───────┴───────┴───────┴───────┘
  cell_idx:        0       1       2       3       4       5       6       7
  bus_ptr:         0       2       4       6       8      10      12      14
               ╰─────────── block 0 ──────────╯╰─────────── block 1 ──────────╯
               ╰────────────────── one merkle leaf / digest ──────────────────╯
  • byte_ptr = size_of::<F>() * cell_idx (host-memory stride; ≠ bus_ptr)
  • bus_ptr = BUS_PTR_SCALE * cell_idx (same formula as u16 ASes)
  • BUS_BLOCK_STRIDE and BUS_LEAF_STRIDE unchanged — the bus addresses cells, not bytes.

Address-space cell types

  • RV64_REGISTER_AS, RV64_MEMORY_AS, PUBLIC_VALUES_AS switch to u16-celled.
  • DEFERRAL_AS stays F-celled (moves to u16 in a follow-up).

Persistent memory and merkle

  • Boundary AIR, merkle tree, merkle public-values, and CUDA inventory rewritten for 4-cell-per-block u16 leaves (BLOCKS_PER_LEAF = DIGEST_WIDTH / BLOCK_FE_WIDTH = 2).
  • assert_public_values_shape::<DIGEST_WIDTH>(num_public_values_bytes) and SystemConfig::public_values_cell_count() formalize the public-values byte-count → cell-count contract at construction and at every read site.

Offline-checker bridge

File: crates/vm/src/system/memory/offline_checker/bridge.rs.

  • MemoryBridge::read / write drop the const N generic; bus payload is always [T; BLOCK_FE_WIDTH].
  • Unified MemoryWriteAuxInput lets write accept either an &MemoryWriteAuxCols<V, BLOCK_FE_WIDTH> (standard case) or from_prev_data_exprs(&base_aux, prev_data_expr) for chips that materialize prev_data in their own columns (Keccak, SHA-2, LoadStore).
  • Added pack_u8_block (AB::Expr) and pack_u8_block_value (F-value form) so byte-shaped chips and testing helpers produce bus messages identical to what u16 chips emit natively.
  • Removed read_or_immediate / MemoryReadOrImmediateOperation (an RV32 pattern superseded by RV64_IMM_AS).

TracingMemory / GuestMemory byte view

  • read_bytes / write_bytes (cell-type-aware, BLOCK-aligned) for raw byte access against u16-celled storage.
  • Deferral-specific helpers (timed_read_deferral, tracing_read_deferral, …) for the F-celled AS.

Chip call-sites

~50 sites across riscv, riscv-adapters, keccak256, sha2, deferral.

  • Every memory_bridge.read / .write now hands the bridge a BLOCK_FE_WIDTH-shaped payload.
  • Byte-shaped chips wrap their [T; 8] u8 columns with pack_u8_block.
  • Register-read paths still zero-extend the low 32 bits to 8 bytes, then pack.
  • Per-AS metered leaf-bit shift in execution_mode/metered/memory_ctx.rs (U16_AS_LEAF_PTR_BITS = 4 for byte-pointer ASes, CELL_AS_LEAF_PTR_BITS = 3 for cell-pointer ASes), so RV64 byte pointers produce the right leaf label.

Algebra / ECC adapter cleanup

No-op semantic; finishes upstream removal.

  • Dropped phantom BLOCK_SIZE const generic from ModularAir/ModularChip, Fp2Air/Fp2Chip, WeierstrassAir/WeierstrassChip and their get_*_air / get_*_chip helpers.
  • The generic became unused after Rv64VecHeapAdapter* dropped its BLOCK_SIZE parameter; keeping it was a phantom that broke inference at call sites.

Testing infrastructure

  • MemoryTester / DeviceMemoryTester accept either N = BLOCK_FE_WIDTH (cell index) or N = MEMORY_BLOCK_BYTES (byte pointer in a u16 AS) and route accordingly; the byte path packs the result so the dummy chip's bus message matches a byte-shaped chip's exactly.
  • pack_u8_block_value is shared between the testing helper and the CUDA boundary-chip test that previously hand-rolled the same packing.

CUDA

  • params.cuh constants + static_asserts kept in sync.
  • Kernel memory-aux fill paths updated for the new block shape.

Out of scope (follow-up PRs)

  • Trace column narrowing now that u16-celled storage matches column width.
  • Redundant range-check removal on byte-shaped chips.
  • Dead bitwise-lookup cleanup.
  • SHA-2 / Keccak state reshaping to natively-u16 layouts.
  • Moving DEFERRAL_AS to u16.
  • Deferral commit / output-hash changes.

Migration notes

  • MemoryBridge::read::<N> / ::write::<N> calls: drop <N>; pass a BLOCK_FE_WIDTH-shaped payload. Byte-shaped data goes through pack_u8_block (constraint side) and pack_u8_block_value (witness side).
  • read_or_immediate callers: switch to RV64_IMM_AS-based addressing.
  • MemoryWriteAuxCols<V, N> with N != BLOCK_FE_WIDTH: use MemoryWriteAuxInput::from_prev_data_exprs(&base_aux, prev_data) to supply prev_data inline.
  • ModularAir, ModularChip, Fp2Air, Fp2Chip, WeierstrassAir, WeierstrassChip now take one fewer const generic (BLOCK_SIZE dropped).

Resolves INT-7829

@github-actions

This comment was marked as outdated.

@github-actions

This comment has been minimized.

@shuklaayush shuklaayush force-pushed the feat/memory-bus-u16-pr1 branch from 8cf1d05 to cfca264 Compare May 19, 2026 20:47
@github-actions

This comment was marked as outdated.

Atomic flip of the memory bus from 8 u8 cells per block to 4 u16 cells
per block. The bus payload shape is now a single hard constant
(BLOCK_FE_WIDTH = 4); byte-shaped chips pack their u8 columns into bus
cells via pack_u8_block.

PR 1 of the memory-bus-u16 split — see pr.md for the full description and
migration notes. Trace column narrowing, redundant range-check removal,
SHA-2 / Keccak state reshaping, and deferral changes are deferred to
follow-up PRs.

- Layout constants (arch::config, cuda/params.cuh): single source of
  truth — U16_CELL_SIZE drives BUS_PTR_SCALE; BLOCK_FE_WIDTH is the
  design choice; MEMORY_BLOCK_BYTES, BUS_BLOCK_STRIDE, BUS_LEAF_STRIDE,
  DIGEST_WIDTH derive. Cross-constant equality asserts removed where
  derivation makes them tautological.
- Address-space cell types: RV64_REGISTER_AS, RV64_MEMORY_AS,
  PUBLIC_VALUES_AS switch to u16-celled; DEFERRAL_AS stays F-celled.
- Persistent boundary AIR, merkle tree, merkle public-values, and the
  CUDA inventory rewritten for 4-cell-per-block u16 leaves
  (BLOCKS_PER_LEAF = 2). assert_public_values_shape +
  SystemConfig::public_values_cell_count() formalize the public-values
  byte→cell contract.
- offline_checker::bridge: read/write drop the const N generic; payload
  is always [T; BLOCK_FE_WIDTH]. write unifies the standard and
  prev-data-from-columns paths via MemoryWriteAuxInput. New
  pack_u8_block (AB::Expr) and pack_u8_block_value (F-value) helpers
  for byte-shaped chips and testing. Dropped read_or_immediate / the
  ReadOrImmediateOperation flavor (RV32 vestige superseded by
  RV64_IMM_AS).
- TracingMemory / GuestMemory: read_bytes / write_bytes for raw byte
  access against u16-celled storage; tracing_read_deferral et al.
  for DEFERRAL_AS.
- ~50 chip call-sites across riscv, riscv-adapters, keccak256, sha2,
  deferral updated so memory-bus sends/receives stay shape-matched.
  Byte-shaped chips wrap their columns with pack_u8_block; register-
  read paths still zero-extend the low 32 bits to 8 bytes before
  packing. Per-AS metered leaf-bit shift in execution_mode/metered/
  memory_ctx.rs so RV64 byte pointers produce the right leaf label.
- Algebra / ECC adapters: dropped phantom BLOCK_SIZE const generic
  from ModularAir/Chip, Fp2Air/Chip, WeierstrassAir/Chip and the
  matching get_*_air / get_*_chip helpers. The parameter became
  phantom after Rv64VecHeapAdapter* dropped BLOCK_SIZE and was
  breaking inference at call sites.
- Testing: MemoryTester / DeviceMemoryTester accept either N =
  BLOCK_FE_WIDTH (cell index) or N = MEMORY_BLOCK_BYTES (byte pointer
  in a u16 AS); pack_u8_block_value shared between testing and the
  CUDA boundary-chip test that previously hand-rolled the packing.
- CUDA parity: params.cuh constants + static_asserts + ASCII diagrams
  for both u16 and F-celled AS layouts; kernel memory-aux fill paths
  updated for the new block shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@shuklaayush shuklaayush force-pushed the feat/memory-bus-u16-pr1 branch from cfca264 to d30973f Compare May 19, 2026 20:55
@github-actions

This comment was marked as outdated.

@shuklaayush shuklaayush marked this pull request as ready for review May 19, 2026 21:08
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 19, 2026

Claude finished @shuklaayush's task in 3m 29s —— View job


Code Review: Switch memory bus blocks to u16 cells

  • Review core memory layout constants and config
  • Review offline checker bridge API changes
  • Review TracingMemory / GuestMemory (online.rs) changes
  • Review merkle tree and persistent memory changes
  • Review boundary AIR and CUDA changes
  • Review chip call-sites (riscv, keccak, sha2, deferral)
  • Review algebra / ECC adapter cleanup
  • Review testing infrastructure changes
  • Review metered execution changes
  • Final summary and feedback

Overall Assessment

This is a well-executed, large-scope refactor (100 files, ~4000 additions/deletions) that cleanly switches the memory bus message shape from 8 u8-valued cells to 4 u16-valued cells. The design is sound, the constant derivation is rigorous, and the migration across ~50 chip call-sites is impressively thorough with no correctness issues found.

Verdict: Approve — no blocking issues identified.


Constants & Layout (config.rs, params.cuh)

The constant hierarchy is well-derived and documented:

U16_CELL_SIZE = 2 (intrinsic)
    → BUS_PTR_SCALE = 2 (byte pointers coincide with bus pointers for u16 ASes)
    → BUS_PTR_SCALE_BITS = 1
BLOCK_FE_WIDTH = 4 (design choice)
    → MEMORY_BLOCK_BYTES = 8 (= BLOCK_FE_WIDTH × U16_CELL_SIZE)
    → BUS_BLOCK_STRIDE = 8 (= BUS_PTR_SCALE × BLOCK_FE_WIDTH)
DIGEST_WIDTH = 8 (= POSEIDON2_WIDTH / 2)
    → BLOCKS_PER_LEAF = 2 (= DIGEST_WIDTH / BLOCK_FE_WIDTH)
    → BUS_LEAF_STRIDE = 16 (= BUS_PTR_SCALE × DIGEST_WIDTH)

The static assert at bridge.rs:24-27 (MEMORY_BLOCK_BYTES == 2 * BLOCK_FE_WIDTH) guards the packing assumption. The CUDA params.cuh mirrors this with constexpr constants and static_asserts. The ASCII layout diagrams in params.cuh are excellent documentation.

The address_height computation in dimensions.rs:51-53 correctly accounts for the new u16 cell size: cell_index_bits = pointer_max_bits - BUS_PTR_SCALE_BITS, then address_height = cell_index_bits - DIGEST_WIDTH_BITS. This reduces address_height by 1 compared to the u8 layout (consistent with leaves now covering 16 bytes instead of 8).


Bridge API Design (bridge.rs)

The removal of the const N generic from MemoryBridge::read/write is clean — the bus payload is now always [T; BLOCK_FE_WIDTH].

The MemoryWriteAuxInput abstraction (bridge.rs:104-136) elegantly handles two cases:

  1. Standard: &MemoryWriteAuxCols<V, BLOCK_FE_WIDTH> where prev_data comes from aux columns (via the From impl)
  2. Inline: from_prev_data_exprs(&base_aux, prev_data) for chips that compute prev_data in their own columns (Keccak, SHA-2, LoadStore)

The three packing helpers (pack_u8_block, pack_u8_block_value, pack_u8_block_bytes) are correct: each produces out[i] = data[2i] + 256 × data[2i+1] (LE u16 packing).


Byte Range Check Soundness (RISCV chips)

The most critical correctness concern: chips operating on individual bytes no longer get byte-range guarantees from the memory bus (which now only guarantees u16 range). All 7 affected RISCV chips properly add byte range checks:

Chip File Range Check Approach
BranchEqualCoreAir branch_eq/core.rs:108-116 bitwise_lookup_bus.send_range for a,b byte pairs
MultiplicationCoreAir mul/core.rs:97-102 send_range for b,c limbs
DivRemCoreAir divrem/core.rs:250-268 MSB sign-aware range checks + non-MSB byte checks
JalrCoreAir jalr/core.rs:118-133 send_range for rd_data and rs1 bytes
BaseAluCoreAir base_alu/core.rs:142-148 send_range for ADD/SUB b,c limbs (bitwise ops already constrained via XOR lookup)
LessThanCoreAir less_than/core.rs:122-135 MSB sign-adjusted check + remaining byte checks
ShiftCoreAir shift/core.rs:204-218 Output limbs and input b,c byte pairs

All chips include a consistent comment pattern noting the change. The coverage is complete and correct.


Persistent Memory & Boundary AIR (persistent.rs)

The BLOCKS_PER_LEAF = 2 split is handled correctly. Each leaf now carries per-block timestamps ([T; BLOCKS_PER_LEAF]), and the boundary AIR loops over blocks to emit bus messages at the right stride:

for block_idx in 0..BLOCKS_PER_LEAF {
    let offset = block_stride_f * AB::F::from_usize(block_idx);
    self.memory_bus.send(
        MemoryAddress::new(local.address_space,
            local.leaf_label * leaf_stride_f + offset),
        local.values[block_idx * BLOCK_FE_WIDTH..(block_idx + 1) * BLOCK_FE_WIDTH].to_vec(),
        local.timestamps[block_idx],
    ).eval(builder, local.expand_direction);
}

The group_touched_memory_by_leaf function correctly regroups BLOCK_FE_WIDTH-granularity touched memory into DIGEST_WIDTH-sized leaves with per-block timestamps. Untouched blocks within a touched leaf get timestamp 0, which correctly balances the boundary interaction (boundary sends at t=0 init, receives at t=0 final).

The initial-row constraint (when_initial.assert_zero(local.timestamps[i]) for all blocks) ensures initial timestamps are zero.


Metered Execution (memory_ctx.rs)

The leaf-pointer-span differentiation at lines 166-170 is correct:

let leaf_ptr_span = if address_space == DEFERRAL_AS {
    DIGEST_WIDTH as u32          // cell-pointer AS: 8 cells per leaf
} else {
    (U16_CELL_SIZE * DIGEST_WIDTH) as u32  // byte-pointer AS: 16 bytes per leaf
};

This ensures byte pointers (RV64 ASes) and cell pointers (DEFERRAL_AS) both produce the correct leaf labels when right-shifted by leaf_ptr_span.ilog2().


TracingMemory / GuestMemory (online.rs)

The new read_bytes/write_bytes raw byte API on both GuestMemory and TracingMemory is well-designed. The validation in assert_valid_byte_access ensures N exactly equals the AS memory block size and byte_ptr is block-aligned. The byte_prev_access_time method correctly computes the block index from byte pointers using memory_block_bytes().

The touched_blocks_to_equipartition finalize path correctly reads cells via the AS-specific to_field method, producing field-element values from the underlying byte storage.


Algebra/ECC Cleanup

The phantom BLOCK_SIZE const generic removal from ModularAir/ModularChip, Fp2Air/Fp2Chip, WeierstrassAir/WeierstrassChip is consistent and complete. The generic became unused after Rv64VecHeapAdapter dropped its BLOCK_SIZE parameter. No remnant references remain.


Deferral Circuit (call/air.rs, output/air.rs, utils.rs)

The split_byte_memory_ops / split_f_memory_ops distinction in utils.rs cleanly separates byte-addressed heap storage from cell-addressed DEFERRAL_AS storage at the type level. The num_byte_memory_ops and num_f_memory_ops const functions enforce divisibility at compile time.


Testing Infrastructure (testing/memory/mod.rs)

The MemoryTester routing correctly handles:

  • N = BLOCK_FE_WIDTH → cell-index path with typed u16 access
  • N = MEMORY_BLOCK_BYTES → byte-pointer path with pack_u8_block_value for bus message packing

This ensures test bus messages match what real chips emit.


CUDA Changes

The CUDA boundary (boundary.cu) correctly branches on DEFERRAL_AS vs non-deferral for cell-to-byte conversion. The new FpArray::from_u16_array helper and u16_from_bytes_le utility are used correctly. The params.cuh static_asserts provide compile-time cross-constant validation.


Minor Observations

  1. The divrem/core.rs:540 change from F::from_u32(256) to F::from_u32(1 << LIMB_BITS) is a nice robustness improvement.

  2. The public_values.rs test at line 302 (write_bytes::<4>(PUBLIC_VALUES_AS, 12, [0, 0, 0, 1]) expecting F::from_u16(0x0100) at cell 7) is a good regression test for u16 LE packing.

  3. The assert_public_values_shape validation at SystemConfig construction time (checking alignment to U16_CELL_SIZE, DIGEST_WIDTH, and power-of-two leaf count) prevents misconfiguration early.

No correctness bugs, soundness gaps, or API misuse were found. The PR description, layout diagrams, and migration notes are thorough.

@github-actions
Copy link
Copy Markdown

group app.proof_time_ms app.cycles leaf.proof_time_ms
fibonacci 1,578 4,000,051 436
keccak 14,040 14,365,133 2,394
sha2_bench 9,363 11,167,961 1,449
regex 1,478 4,090,656 357
ecrecover 476 112,210 258
pairing 598 592,827 260
kitchen_sink 1,823 1,979,971 407

Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights.

Commit: d30973f

Benchmark Workflow

@shuklaayush shuklaayush changed the title feat(memory): switch memory bus blocks to u16 cells feat(v2.1-rv64): switch memory bus blocks to u16 cells May 19, 2026
Comment thread crates/vm/src/arch/config.rs
pub const OPENVM_DEFAULT_INIT_FILE_NAME: &str = "openvm_init.rs";
/// Default block size (in bytes) for memory bus interactions. Memory is always read/written a
/// full block at a time even when the instruction accesses fewer bytes (e.g. RISC-V `lb`/`lh`).
pub const DEFAULT_BLOCK_SIZE: usize = 8;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did we rename this constant once more T_T

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i got ai to write a base set of constants that make all this easier to understand. it might have gone a bit overboard but i feel it's much cleaner now

) -> [AB::Expr; BLOCK_FE_WIDTH] {
let mut out: [AB::Expr; BLOCK_FE_WIDTH] = std::array::from_fn(|_| AB::Expr::ZERO);
for i in 0..BLOCK_FE_WIDTH {
out[i] = data[i * 2].clone() + AB::Expr::from_u64(256) * data[i * 2 + 1].clone();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 is hard coded which doesn't really make this tied to the BLOCK_FE_WIDTH constant

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for below

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh the 2 comes from the fact that this is packing two u8 elements into one u16. it's independent of BLOCK_FE_WIDTH

Copy link
Copy Markdown
Contributor

@stephenh-axiom-xyz stephenh-axiom-xyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am good on the deferral extension changes

// field
// - T is the exact memory cell type for this address space, satisfying the type requirement
unsafe { self.memory.read(addr_space, ptr) }
if size_of::<T>() == 1 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the possible values of T here ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

u8 for the first path and u16 or F for the else

}

/// Returns `[pointer:BLOCK_SIZE]_{address_space}`
/// Reads `BLOCK_SIZE` AS-native cells starting at `cell_idx`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does AS-native cells mean ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

address space native i.e. either u16 or F

Comment thread extensions/riscv/circuit/src/adapters/alu.rs
@@ -205,11 +207,19 @@ where
addr_space: u32,
ptr: u32,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this also be renamed as cell_idx too or not ?

Copy link
Copy Markdown
Collaborator Author

@shuklaayush shuklaayush May 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, because it can also be bytes. maybe it can be ptr_or_idx but let's do it in a follow up and not here since four other prs are stacked on top of this

Comment on lines -91 to +93
// Range check for postimage bytes (pairs)
for (size_t i = 0; i < KECCAK_WIDTH_BYTES; i += 2) {
bitwise_lookup.add_range(state.bytes[i], state.bytes[i + 1]);
}
// AIR bounds the pointer bytes before composing them.
bitwise_lookup.add_range(buffer_ptr_limbs[0], buffer_ptr_limbs[1]);
bitwise_lookup.add_range(buffer_ptr_limbs[2], buffer_ptr_limbs[3]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what was this range check for postage bytes and why is it getting removed now ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you needed to decompose the u16 postimage into bytes before sending it to the memory bus earlier. and to show that the decomposition is canonical, you needed to range check that the two limbs are indeed bytes. now that the memory bus is u16 you don't need this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants