test(DON'T MERGE): develop-v2.1.0-rvr#2790
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment has been minimized.
This comment has been minimized.
f08f0fe to
0110850
Compare
makes the basic framework for using rvr extensions in openvm - adds an rvr feature flag - defines `RvrExtensionCtx` struct to provide mappings between opcode/executor/air indices - defines `VmRvrExtension` trait that extensions can implement to be registered - updated macro so that rvr `ExtensionRegistry` can be auto-generated in `SdkVmConfig` closes INT-7474, INT-7475, INT-7479
#2730) Moves the rvr files related to compiling and execution into openvm-circuit. Those rvr files previously depended on openvm-circuit and in order to enable rvr execution through the openvm pipeline, they had to be made a part of openvm-circuit to prevent circular dependencies. closes INT-7537
- Vm execution instance is made to use rvr execution, depending on the feature. Helper functions to convert between the existing `VmState` and the rvr state are also added. - The `VmConfig` macro now has a `create_rvr_extensions` method implementation, but instead of defining a new `VmRvrConfig` trait, the `create_rvr_extensions` method piggybacks on the existing `VmExecutionConfig`. This is to avoid complex feature-gated trait bounds. closes INT-6810, INT-7476
Enables running benchmarks through rvr extension. Benchmark tests do not check execution correctness and currently execution involving extensions other than RV32IM fail because `VmRvrExtension` trait implementation is not properly wired. closes INT-7480
Previously rvr execution had to use `executor_idx_to_air_idx` information in order to construct `ExtensionRegistry`. This was a problem for pure execution which didn't need air indices so the interface diverged between rvr and aot/interpreted. Now for rvr pure execution, dummy index values of `NO_CHIP` are used instead to keep the interface consistent. towards INT-7611
Removes the rvr tests and instead adds rvr comparison steps in existing openvm tests in a similar way to aot. Unlike aot, metered cost execution is also run and compared for rvr and interpreted modes. closes INT-7627
- Introduces a new `Rv32IoExtension` in rvr that handles the rv32io instructions (hint_storew, hint_buffer, reveal). This is mainly to have a struct managing the hint_store chip index. - Adds rvr tests to the CI file in the same way as aot. closes INT-7466
Implements the rvr feature for the keccak256 extension and also adds rvr tests to CI. Now extensions don't take a `staticlib_path` argument manually and instead uses the auto-built staticlib made by a build.rs file. closes INT-7468
Implements the rvr feature for the Algebra extension. The rvr side of the Algebra extension is now also split into `ModularRvrExtension` and `Fp2RvrExtension`. A notable change is to have the C code for the Algebra extension which uses `libsecp256k1` to also unconditionally contain the C code needed in the ECC extension, since they are closely related and doing so would avoid configuration dependencies. closes INT-7470, INT-7704
Implements the rvr feature for all extensions that are left - BigInt, Sha2, ECC, Pairing, Deferral. Code for tests and CI are also updated. Changes for the Deferral extension includes additions to the VM state used in rvr execution. closes INT-7465
closes INT-7821
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Changed rvr execution to use the existing openvm `VmState` instead of defining a new state struct and copying data between the two forms. The references and pointers to the fields in `VmState` are passed to the rvr execution functions so they can be used in C code. - The Deferral extension now uses a callback registration system to expose the Deferral related data to C code instead of piggybacking on the same mechanism of `OpenVmHostCallbacks`. This is enabled for each extension so that they can have per-extension data and state that is maintained separately. closes INT-7572
db2dea9 to
c8a2092
Compare
|
Claude finished @shuklaayush's task in 10m 14s —— View job Code Review: PR #2790 — RVR Integration (Re-review on sync)Reviewed ~15.7k additions across 175 files introducing RVR (RISC-V Runtime) compiled execution support.
Critical / High Severity1. Out-of-bounds dispatch table access for dynamic jumps
Additionally, 2. Missing memory bounds checks in IO callbacks Guest programs can trigger out-of-bounds host memory access through several callbacks:
The 3.
4. NULL function pointer dereference in C IO layer
5. Global mutable state for host callbacks is not thread-safe
The same pattern appears in 6. The new Medium Severity7. let elapsed = self.config().segment_check_insns - remaining_counter as u64;After 8. When 9.
10. The alignment assertion is 11.
12. The norm computation 13. Fragile chip index Multiple extensions derive a secondary chip index as 14.
Low Severity / Code Quality15. Redundant let block_starts: HashMap<u32, u32> = blocks.iter().map(|b| (b.start_pc, b.start_pc)).collect();This is just a 16. Duplicate 17. Any 18. Recompilation on every metered execution (known TODO) Both 19. Near-identical Multiple 20. Feature-gated fields on public
21.
22. C23 features limit portability Uses Positive Patterns
SummaryThe architecture is well-designed — the adapter/core split, extension trait pattern, and C codegen pipeline are solid. The main concerns are:
Priority fixes:
|
This comment has been minimized.
This comment has been minimized.
- Makes rvr metered execution use the existing `SegmentationCtx` of openvm instead of its own new structs and code. This fixes the discrepancy between rvr and interpreted/aot segmentation logic and resolves the issue of rvr making too many segments. (https://github.com/axiom-crypto/openvm-eth/actions/runs/26112543464) - Fixes the calculation of `num_insns` in segments that are used as segment boundaries. The instruction counts were recorded as multiples of `segment_check_insns` (1000) that didn't map to the actual basic block boundaries. Addresses the problem of overflowing GPU memory. (https://github.com/axiom-crypto/openvm-eth/actions/runs/26127167098) closes INT-7835
|
Claude finished @gdmlcjs's task in 9m 59s —— View job Code Review: PR #2790 — RVR Integration (Re-review on sync)Reviewed ~15.5k additions across 176 files. The latest commit (0ea17b6) is a significant improvement — it replaces the duplicated
Changes in Latest Commit (0ea17b6)The refactoring is clean and correct:
The instret accounting math is correct in release mode (verified algebraically: total instret = Σ(block costs) via wrapping arithmetic), but there's a remaining debug-mode issue (see #1 below). Critical / High Severity1. self.instret += self.segment_check_insns - self.instrets_until_check;When execution terminates shortly after a periodic check callback, In release mode, wrapping arithmetic gives the correct total (verified), but in debug mode this panics. Since 2. Out-of-bounds dispatch table access for dynamic jumps (unchanged from previous review)
3. Missing memory bounds checks in IO callbacks (unchanged from previous review) Guest programs can trigger out-of-bounds host memory access:
The 16 KB guard page catches small overflows, but larger overflows bypass it entirely. 4. Panics in Multiple
On Rust 1.91.1 these abort rather than cause UB, but aborting the entire host process on bad guest input is a harsh failure mode. Consider 5. NULL function pointer dereference in C IO layer (unchanged from previous review) All forwarding stubs ( 6. The Medium Severity7. Global mutable state prevents concurrent RVR instances (unchanged)
8. The alignment assertion is 9.
10. Both 11.
12. C tracer After 13. CFG worklist iteration limit reached silently When the worklist exceeds 14. The Low Severity / Code Quality15. 16. Redundant 17. Duplicate 18. 19. No What Improved Since Last Review
Positive Patterns
SummaryThe latest refactoring significantly improves the codebase by eliminating duplicated segmentation logic. The architecture remains solid. Priority fixes:
|
This comment has been minimized.
This comment has been minimized.
moves the rvr compilation stage for metered and metered cost out of execution and into instance construction closes INT-7626
|
Claude finished @gdmlcjs's task in 4m 23s —— View job Code Review: PR #2790 — RVR Integration (Re-review after compile-stage refactor)Reviewed ~15.4k additions across 176 files. The latest commit (43e0f72) moves RVR compilation into instance construction, removing the
What Improved Since Last Review
Critical / High Severity1. Out-of-bounds dispatch table access for dynamic jumps (unchanged)
Additionally, 2. Missing memory bounds checks in IO callbacks (unchanged) Guest programs can trigger out-of-bounds host memory access through several callbacks:
The 16 KB guard page catches small overflows, but overflows beyond 16 KB bypass it entirely. Consider storing 3.
4. NULL function pointer dereference in C IO layer (unchanged) All forwarding stubs ( 5. The Medium Severity6. Global mutable state prevents concurrent RVR instances (unchanged)
7. self.instret += self.segment_check_insns - self.instrets_until_check;The subtraction is safe only if 8.
9.
10. Both 11. C tracer After 12.
13. Fragile chip index Multiple extensions derive secondary chip indices as Low Severity / Code Quality14. Redundant 15. 16. Duplicate 17. 18. No Changes in Latest Commit (43e0f72) — Detailed ReviewThe refactor is clean and correct:
Positive Patterns
SummaryThe latest refactor (43e0f72) is a solid improvement — moving compilation to instance construction eliminates per-execution overhead and simplifies types. The overall RVR architecture (adapter/core split, extension trait pattern, C codegen pipeline) is well-designed. Priority fixes (unchanged from prior reviews):
|
closes INT-7648
|
Claude finished @gdmlcjs's task in 6m 55s —— View job Code Review: PR #2790 — RVR Integration (Re-review on sync)Reviewed ~15.4k additions across 176 files. The latest commit (36c26d5) simplifies the benchmark code by introducing a
Changes in Latest Commits36c26d5 — Simplify benchmarks: Introduces 43e0f72 — Compile at construction: 0ea17b6 — SegmentationCtx reuse: Replaces duplicated Critical / High Severity1. Out-of-bounds dispatch table access for dynamic jumps (unchanged from prior reviews)
Additionally, Impact: Guest-controlled register values ( 2. Missing memory bounds checks in IO callbacks (unchanged) Guest programs can trigger out-of-bounds host memory access:
The 16 KB guard page catches small overflows, but overflows beyond 16 KB bypass it entirely. Consider storing 3. Panics in
Other
On Rust 1.91+ these abort rather than cause UB, but aborting the entire host on bad guest input is a harsh failure mode. 4. NULL function pointer dereference in C IO layer (unchanged) All forwarding stubs ( 5. The Medium Severity6. self.instret += self.segment_check_insns - self.instrets_until_check;This subtraction relies on 7. Global mutable state prevents concurrent RVR instances (unchanged)
8.
9. The alignment assertion is 10. Both 11. Missing
12.
13. C tracer After 14.
Low Severity / Code Quality15. Redundant 16. 17. Duplicate 18. 19. 20. Positive Patterns
SummaryThe latest three commits (SegmentationCtx reuse → compile-at-construction → benchmark simplification) are good improvements that eliminate duplication and simplify types. The Priority fixes (unchanged from prior reviews):
|
This comment has been minimized.
This comment has been minimized.
Air indices are now represented as an enum in rvr code. The `AirIndex` enum has `Uninitialized` and `NoChip` variants that replace the previous `NO_CHIP = u32::MAX`. `AirIndex::Uninitialized` is only used in pure execution where air indices don't matter and causes a panic in rvr metered and metered cost execution. closes INT-7611
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Code reviewChecked for bugs and CLAUDE.md compliance against AGENTS.md. Two memory-safety issues stand out in the new RVR backend. 1. Out-of-bounds dispatch-table read on dynamic jumps (memory safety / UB)
openvm/crates/rvr/rvr-openvm/src/emit/codegen.rs Lines 141 to 146 in 163d10c
openvm/crates/rvr/rvr-openvm/src/emit/project.rs Lines 334 to 341 in 163d10c …and the table is sized to openvm/crates/rvr/rvr-openvm/src/emit/project.rs Lines 627 to 651 in 163d10c The only PC range guard runs once in the `rv_execute` prologue: openvm/crates/rvr/rvr-openvm/src/emit/project.rs Lines 658 to 671 in 163d10c All subsequent inter-block transitions are `[[clang::musttail]] return dispatch_tabledispatch_index(next_pc);` and bypass that guard. For `JumpDyn`, `next_pc = (rs1 + imm) & ~1` is fully guest-controlled; if the guest computes a target `pc < text_start` the unsigned subtraction in `dispatch_index` wraps to a huge `uint32_t`, and if `pc > max_pc` the index exceeds `table_size`. The result is a load of an arbitrary 8-byte value adjacent to `dispatch_table` and a `musttail` call through it — UB at minimum, and a potential RCE vector when the input ELF is adversarial (the stated use case for OpenVM). In-range misaligned/non-block-start jumps land on `rv_trap` slots, so only the out-of-range case is unsafe — the `Branch`/`FallThrough`/`emit_tail_call` sites use compile-time constant targets and are safe in practice (the dispatch-table fallback there is dead code unless the lifter misclassifies a block, which would also be a lift bug). Suggested fix: add a bounded fallback in `dispatch_index` (or guard each indexed tail-call) so that `pc < text_start || pc > max_pc` resolves to `rv_trap` instead of indexing the table. E.g. inline a range check before the shift, or emit the check in the `JumpDyn` codegen path specifically (static targets don't need it). 2. Heap buffer overflow when `segment_check_insns` is raised above the defaultThe C tracer's `record_mem_page*` / `record_pv_page*` / `record_deferral_page*` deliberately skip bounds checks: openvm/crates/rvr/rvr-openvm/c/openvm_tracer_metered.h Lines 55 to 107 in 163d10c …relying on the `_Static_assert` pair at the top of the header to prove the heap-allocated buffers are large enough between flushes: openvm/crates/rvr/rvr-openvm/c/openvm_tracer_metered.h Lines 36 to 46 in 163d10c But `TRACER_SEGMENT_CHECK_INSNS` is baked in from `DEFAULT_SEGMENT_CHECK_INSNS = 1000` at codegen time, while the runtime check threshold is taken from `MeteredCtx::segment_check_insns`: openvm/crates/vm/src/arch/rvr/execute.rs Lines 247 to 254 in 163d10c …which has a public setter that accepts any `u64` and resizes only the unrelated `page_indices_since_checkpoint` buffer: openvm/crates/vm/src/arch/execution_mode/metered/ctx.rs Lines 107 to 121 in 163d10c The C buffers stay sized to `MEM_PAGE_BUF_CAP` / `PV_PAGE_BUF_CAP` (compile-time constants used directly for the Rust `vec![0u32; ...]` allocations): openvm/crates/vm/src/arch/rvr/metered.rs Lines 127 to 140 in 163d10c With `MEM_PAGE_BUF_CAP = 65536` and `MAX_MEM_PAGES_PER_INSN = 10`, the assert holds only for `segment_check_insns <= 65536/(2*10) = 3276`; for `PV_PAGE_BUF_CAP = 4096` and `MAX_PV_PAGES_PER_INSN = 1`, the safe ceiling is `4096/2 = 2048`. Any caller that does `MeteredCtx::with_segment_check_insns(n)` with `n > 2048` and runs through the RVR backend can overflow the heap buffer at the unchecked `mem_page_buf[mem_page_buf_len++]` / `pv_page_buf[...]` stores. `MAX_MEM_PAGES_PER_INSN = 10` is a worst case (e.g. HINT_BUFFER / ECC), so typical programs won't hit it, but chosen-input programs can. Suggested fix: assert in `execute.rs` (and the other RVR metered entry points) that `seg_state.segmentation_ctx.segment_check_insns <= DEFAULT_SEGMENT_CHECK_INSNS`, or rebuild/regenerate the C with the runtime value baked in. At minimum a `debug_assert!` with a clear error message is better than silent UB. No CLAUDE.md violations: the file-naming rule under "File Structure" in AGENTS.md is chip-scoped (the surrounding sentences are about `air.rs`/`trace.rs`), and pre-existing `crates/vm/src/arch/aot/` already uses names like `pure.rs` / `metered_execute.rs`, so the new `crates/vm/src/arch/rvr/{execute,pure,metered,metered_cost}.rs` layout matches established practice. No issues found in the workflow/CI changes or the broader diff-only scan. |
…#2807) ## Summary - **RVR metered execution can now suspend at segment boundaries.** Previously only the interpreter and AOT backends supported segment-by-segment metered runs; RVR ran metered execution straight to termination. This branch adds a parallel `RvrMeteredSegmentInstance` (`RvrMeteredInstanceWith<F, SegmentBoundary>`) whose `execute_metered_until_segment_boundary` returns after the metered segmentation callback creates a segment, mirroring the suspend/resume shape the other two backends already expose. The tracer countdown is carried across calls by checkpointing `tracer.check_counter` into `segmentation_ctx.instrets_until_check` on suspend and restoring it on entry; both values are `try_from`-validated against u32 at the entry point (new `ExecuteError::InvalidMeteredContext`) and the hot C-callback's matching cast is guarded by `debug_assert_eq!`. Mid-segment suspension is out of scope: `initialize_segment_memory` resets the per-segment page-indices checkpoint buffer assuming the page buffers have already been flushed at a segment boundary. - **Generated-C surface reorganized by policy.** Block-begin and suspender helpers move into `c/block/{instret,metered,metered_segment}.h` and `c/suspender/{none,instret_limit,segment_boundary}.h`; tracer headers move under `c/tracer/`. A new `SuspendPolicy` enum drives which pair is included, with `compile_impl` rejecting incoherent combinations (`Metered` × `InstretLimit`, `Pure|MeteredCost` × `SegmentBoundary`) at compile time. Compile-time selection without preprocessor directives in the generated C, per the AGENTS.md guidance. - **`MeteredCtx` round-trip via `MeteredCtxParts`.** `SegmentationState` now carries the full `MemoryCtx` and `suspend_on_segment` flag, so a suspended metered run can be converted back to a `MeteredCtx` (`into_metered_ctx`) and resumed without losing page-tracking or segmentation state. A new test exercises the field-by-field round-trip. - **All RVR codegen inputs embedded at compile time.** Removes every `CARGO_MANIFEST_DIR` runtime dependency from the RVR project-emit pipeline so binaries (Docker images, etc.) no longer need the source tree to invoke `compile_impl`. Core C files (`openvm_io.{c,h}`, `rvr_ext_wrappers.c`) switch from `fs::copy` to `fs::write(include_str!(…))`. Extension `.a` staticlibs migrate from `staticlib_path() -> &Path` to `staticlib_file() -> (&'static str, &'static [u8])` via `include_bytes!(env!("RVR_*_FFI_STATICLIB"))`, with a new `write_extension_staticlibs` helper writing them to the temp project for `make` to link. Modular's libsecp256k1 amalgamation include (~85 `.c`/`.h` files, with test/bench/ctime/valgrind files filtered out) is collected by `extensions/algebra/rvr/build.rs` into a generated `SECP256K1_C_FILES` const and returned via the new `RvrExtension::extra_c_include_files()` hook (for files written but not compiled as their own TUs); `extra_cflags` switches to relative `-Isecp256k1/src` / `-Isecp256k1` against the temp project root. Trait return types are tightened from `&str` to `&'static str`. - **Up-front toolchain detection.** `compile_impl` probes the C compiler, linker, and `make` in PATH before building and reports all missing tools at once via `RuntimeToolchainError`. Adds `RVR_MAKE` override, forwards `HOST_OS` to the Makefile (replacing its `uname -s` shell-out), and threads path context into `CompileError` I/O variants. - **Metrics consolidation.** The four near-identical `Instant::now() … counter!().absolute() … gauge!().set()` blocks across interpreter / AOT / RVR are replaced by a single `ExecutionMetricTimer` helper in `arch::execution_metrics` (guarded against div-by-zero on sub-microsecond runs). A complementary `CompilationTimer` (`arch::compilation_metrics`) wraps every `*_instance` constructor and emits a `compile_{pure,metered,metered_cost,metered_segment}_ms` gauge labeled by backend (`interpreter` / `aot` / `rvr`). - **E1/E2/E3 jargon dropped.** `execute_e1` span/metric names become `execute_pure`; `terminate_execute_e12_*` → `terminate_execute_*`; const generic `E1` → `PURE_EXECUTION`. Comment references to "(E1)/(E2)/(E3)" are removed in favor of "pure/metered/preflight". ## Breaking changes - **Metric names.** `execute_e1_insns` → `execute_pure_insns`, `execute_e1_insn_mi/s` → `execute_pure_insn_mi/s`. Dashboards or alerting keyed on the old names need to be updated. - **`RvrExtension` trait surface.** `extra_c_source_paths() -> Vec<PathBuf>` → `extra_c_sources() -> Vec<(&'static str, &'static str)>`; `staticlib_path/paths()` → `staticlib_file/files()` returning embedded bytes; new optional `extra_c_include_files()` for files written but not compiled as TUs. Existing impls need a one-time conversion to `include_str!` / `include_bytes!`. - **`ExecutorInventory` generic param renames** (`E1`/`E2`/`E3` → `CombinedE`/`NewE`/`TargetE`) are visible in error messages but compatible. - **`CompileError` shape.** `CProject(io::Error)` → `CProject { path, source }`; `Toolchain(String)` → `Toolchain(#[from] RuntimeToolchainError)`; new `ToolchainCommand { command, source }`. Callers matching on these variants need to update. - **Binary size.** Embedded `.a` staticlibs and libsecp256k1 sources grow the binary by roughly 5–30 MB depending on enabled extensions. resolves int-7917 --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code reviewChecked for bugs and CLAUDE.md compliance against AGENTS.md. One new memory-safety issue in the keccak RVR extension; the two issues from the prior sync (dispatch-table OOB on Guest-controlled stack buffer overflow in
openvm/extensions/keccak256/rvr/src/lib.rs Lines 62 to 72 in 8d74cbf
openvm/crates/rvr/rvr-openvm/c/openvm_state.h Lines 16 to 21 in 8d74cbf …and openvm/crates/rvr/rvr-openvm/c/openvm_state.h Lines 138 to 144 in 8d74cbf A guest that sets This is distinct from the host-callback bounds issues called out in the prior reviews ( Suggested fix: validate if (unlikely(num_words == 0 || num_words > KECCAK_WIDTH_WORDS)) { rv_trap(state); return; }Or push the check up to No additional new CLAUDE.md violations found in this sync: the file-layout rule under "File Structure" in AGENTS.md is chip-scoped (the surrounding context is about (Note: this update edits the previously-test placeholder in place; the original sticky review remains at #issuecomment-4521872268 and should be considered superseded.) |
Note: cells_used metrics omitted because CUDA tracegen does not expose unpadded trace heights. Commit: b7b23b6 |
No description provided.