harm runtime#50

Open
monoid wants to merge 19 commits into master from
harm/relocations-runtime

Conversation

@monoid
Owner

@monoid monoid commented Feb 2, 2026

Memory buffers for collecting instructions, applying relocations, and executing the generated code.

Ref #31.

Summary by CodeRabbit

  • New Features

    • Added a pluggable memory system with foreign buffers and optional memory-mapped executable support.
    • Introduced an assembler that emits executable code with automatic label resolution and relocation handling.
  • Refactor

    • Standardized 64-bit address/offset handling across relocation and memory APIs.
  • Chores

    • Made memory-mapping dependency optional via feature flags and updated default feature set; added a dev utility dependency.

@monoid monoid self-assigned this Feb 2, 2026
@monoid monoid added the enhancement New feature or request label Feb 2, 2026
@coderabbitai

coderabbitai Bot commented Feb 2, 2026

Warning

Rate limit exceeded

@monoid has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 53 minutes and 1 second before requesting another review.


⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cb12ffff-625f-4d9d-8899-bd8b3fa3c84d

📥 Commits

Reviewing files that changed from the base of the PR and between be073cf and 7f10680.

📒 Files selected for processing (2)
  • harm-runtime/Cargo.toml
  • harm-runtime/src/memory.rs
📝 Walkthrough

Walkthrough

This PR introduces a runtime infrastructure for the harm crate, adding label management, memory abstractions, and an assembler for incremental machine-code construction. It refactors relocation APIs across both harm and harm-runtime crates to use 64-bit address/offset type aliases, making memmap2 an optional feature while adding trait-based memory backing implementations.

Changes

Runtime Infrastructure and Relocation Types

Type System & Dependencies (harm-runtime/Cargo.toml, harm/src/reloc.rs): memmap2 becomes optional; thiserror added. Type aliases Offset64 = i64 and Addr64 = u64 introduced; LabelRef::addend and relocation signatures updated to use Addr64.

Memory Trait Hierarchy (harm-runtime/src/memory.rs, harm-runtime/src/memory/foreign_memory.rs, harm-runtime/src/memory/memmap2.rs): Core Memory trait with capacity/extend/align; PositionedMemory and IntoExecutableMemory conversions. Two implementations: ForeignMemoryBuffer (in-memory allocation) and MmapBuffer/MmapPositionedMemory (executable mmap-backed).

Label Management (harm-runtime/src/labels.rs): LabelRegistry tracks named/forward labels via HashMap, providing allocation, definition, naming, and iteration APIs.

Relocation Application (harm-runtime/src/builder.rs): Builder<'mem> applies static relocations by resolving named labels and iterating Rel64 entries, computing addresses and patching memory. BuilderError wraps relocation/overflow failures.

Assembler Integration (harm-runtime/src/runtime.rs): Assembler<Mem> orchestrates label registration, incremental instruction emission, and conversion to positioned/executable memory via build and compile APIs. Exposes label-binding helpers and internal relocation tracking.

Relocation Function Updates (harm/src/reloc/addr.rs, harm/src/reloc/control.rs, harm/src/reloc/data.rs, harm/src/reloc/movs.rs): All relocation helpers updated to accept Addr64 instead of Addr parameters; core computation logic unchanged.

Module Exports & Tests (harm-runtime/src/lib.rs, module test blocks): Conditional feature-gating of alloc/memmap2; new builder, memory, runtime modules exported. Unit tests for alignment, relocation edge cases, and AArch64 code execution.

Sequence Diagram

sequenceDiagram
    participant User
    participant Assembler
    participant LabelRegistry
    participant Memory
    participant Builder
    participant PositionedMemory
    participant ExecutableMemory

    User->>Assembler: new(ForeignMemoryBuffer)
    Assembler->>LabelRegistry: new()
    LabelRegistry-->>Assembler: registry
    
    User->>Assembler: append(InstructionSeq)
    Assembler->>Memory: try_extend(bytes)
    Memory-->>Assembler: ok
    
    User->>Assembler: current_named_label("target")
    Assembler->>LabelRegistry: get_forward_named_label("target")
    LabelRegistry-->>Assembler: LabelId
    Assembler->>Memory: current position
    Assembler->>LabelRegistry: define_label(id, offset)
    LabelRegistry-->>Assembler: done
    
    User->>Assembler: compile()
    Assembler->>Assembler: build()
    Assembler->>Memory: into_positioned_memory()
    Memory->>PositionedMemory: convert
    PositionedMemory-->>Assembler: PM
    
    Assembler->>Builder: new(PM, base_addr)
    Assembler->>Builder: build(named_labels, defined_labels, relocations)
    Builder->>Builder: resolve named→u64
    Builder->>Builder: compute label addresses
    Builder->>Builder: apply each Rel64
    Builder-->>Assembler: (PM, label_map)
    
    Assembler->>PositionedMemory: into_executable_memory()
    PositionedMemory->>ExecutableMemory: convert (via memmap2/make_exec)
    ExecutableMemory-->>Assembler: ExecMem
    
    Assembler-->>User: (ExecMem, label_map)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

  • monoid/harm#47: Both PRs make the same code-level changes to the relocation subsystem (introducing Rel64/Addr64/Offset64, LabelRef-based relocations and updating instruction APIs to return Rel64).
  • monoid/harm#52: Directly overlaps on relocation type/signature changes across Rel64 and the addr/control/data/movs modules.

Suggested labels

harm

🐰 I stitched Addr64 with gentle paws,
Mapped memory and fixed the flaws.
Labels hop, relocations click,
Assembler hums — a tiny trick.
🥕 code compiled — a carrot snack quick!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Title check: ❓ Inconclusive. The title 'harm runtime' is vague and generic, failing to convey the specific changes made in the PR (memory buffers, relocation application, runtime infrastructure). Resolution: use a more descriptive title that captures the main deliverable, e.g., 'Add runtime infrastructure for memory buffers and relocation application' or 'Implement Assembler and memory management for harm runtime'.
✅ Passed checks (4 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
Docstring Coverage: ✅ Passed. Docstring coverage is 83.16%, which meets the required threshold of 80.00%.
Linked Issues check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check: ✅ Passed. Check skipped because no linked issues were found for this pull request.





@monoid monoid added the runtime The `harm`'s runtime: memory mappings, applying relocations, etc. label Feb 2, 2026
@monoid monoid force-pushed the harm/relocations-runtime branch from ee98019 to c0daa2c Compare February 2, 2026 11:51

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@harm-runtime/src/labels.rs`:
- Around line 49-83: The current label APIs (define_label, define_named_label,
name_label) panic or silently overwrite; change them to return Result<LabelId,
LabelRegistryError> (or Result<(), ...> for define_label/name_label) and
introduce a LabelRegistryError enum with variants Unregistered(LabelId),
AlreadyDefined(LabelId), and NameConflict(String). Implement define_label to
check self.labels: return Err(Unregistered) if the id isn't registered,
Err(AlreadyDefined) if it's already an Offset, otherwise set LabelInfo::Offset
and return Ok. Change define_named_label to refuse to overwrite an existing
Offset or rebind an existing name: if name exists map to id and that id is
AlreadyDefined return Err(AlreadyDefined); if name absent but id exists check
and insert only if not already defined, else return Err(Unregistered) or
NameConflict accordingly; return Ok(id) on success. Change name_label to return
Result and only insert name if id is registered and the name isn't already bound
(return NameConflict if it is), otherwise return Unregistered; update all
Assembler call sites to propagate or handle these Results.
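The fallible API this comment proposes can be sketched as below. LabelId, LabelInfo, and the registry's maps are hypothetical stand-ins mirroring the review text, not the crate's real types, and define_named_label is omitted for brevity.

```rust
use std::collections::HashMap;

type LabelId = usize;

#[derive(Debug, PartialEq)]
enum LabelRegistryError {
    Unregistered(LabelId),
    AlreadyDefined(LabelId),
    NameConflict(String),
}

enum LabelInfo {
    Forward,
    Offset(usize),
}

#[derive(Default)]
struct LabelRegistry {
    labels: HashMap<LabelId, LabelInfo>,
    names: HashMap<String, LabelId>,
}

impl LabelRegistry {
    /// Allocate a fresh, not-yet-defined label.
    fn register(&mut self) -> LabelId {
        let id = self.labels.len();
        self.labels.insert(id, LabelInfo::Forward);
        id
    }

    /// Bind a registered label to an offset, exactly once.
    fn define_label(&mut self, id: LabelId, offset: usize) -> Result<(), LabelRegistryError> {
        match self.labels.get(&id) {
            None => return Err(LabelRegistryError::Unregistered(id)),
            Some(LabelInfo::Offset(_)) => return Err(LabelRegistryError::AlreadyDefined(id)),
            Some(LabelInfo::Forward) => {}
        }
        self.labels.insert(id, LabelInfo::Offset(offset));
        Ok(())
    }

    /// Attach a name to a registered label; names are bind-once.
    fn name_label(&mut self, id: LabelId, name: &str) -> Result<(), LabelRegistryError> {
        if !self.labels.contains_key(&id) {
            return Err(LabelRegistryError::Unregistered(id));
        }
        if self.names.contains_key(name) {
            return Err(LabelRegistryError::NameConflict(name.to_string()));
        }
        self.names.insert(name.to_string(), id);
        Ok(())
    }
}
```

With this shape, double definition and name rebinding surface as matchable errors instead of panics or silent overwrites.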

In `@harm-runtime/src/memory.rs`:
- Around line 30-37: The align method can panic when alignment == 0 due to pos %
alignment; modify align (the fn align(&mut self, alignment: usize) -> Result<(),
Self::ExtendError>) to explicitly guard against alignment == 0 (and optionally
treat alignment <= 1 as a no-op) before computing pos % alignment; if alignment
<= 1 simply return Ok(()) (or return an appropriate error) and otherwise proceed
to compute remn from self.pos() and call self.try_extend(... ) as before to
perform padding.
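The guard can be sketched against a toy Vec-backed buffer; Buf and its infallible error type are hypothetical stand-ins for the crate's Memory implementations, kept minimal to show only the zero-alignment fix.

```rust
use std::convert::Infallible;

// Hypothetical stand-in for a Memory implementation.
struct Buf {
    data: Vec<u8>,
}

impl Buf {
    fn pos(&self) -> usize {
        self.data.len()
    }

    fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Infallible> {
        self.data.extend(bytes);
        Ok(())
    }

    fn align(&mut self, alignment: usize) -> Result<(), Infallible> {
        // Guard: alignment == 0 would panic on `pos % alignment`,
        // and alignment == 1 is always satisfied, so both are no-ops.
        if alignment <= 1 {
            return Ok(());
        }
        let rem = self.pos() % alignment;
        if rem != 0 {
            // Pad with zero bytes up to the next aligned position.
            self.try_extend(std::iter::repeat(0u8).take(alignment - rem))?;
        }
        Ok(())
    }
}
```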

In `@harm-runtime/src/memory/memmap2.rs`:
- Around line 132-166: The into_fixed_memory implementations for Vec<u8> and for
&mut Vec<u8> must explicitly reject zero-length vectors to avoid
platform-specific failures from memmap2::MmapMut::map_anon(0); update both
Memory::<Mmap2FixedMemory>::into_fixed_memory implementations to check if
self.len() == 0 and return an Err(std::io::Error) (e.g.,
std::io::ErrorKind::InvalidInput or EINVAL) instead of attempting
allocation/copy when length is zero.
- Around line 175-190: The AArch64 test_mmap writes JIT code into memory (via
Mmap2Buffer::allocate -> into_fixed_memory -> into_executable_memory) but
doesn't flush the I-cache, so exec.as_ptr() may point to stale instructions; add
an I-cache flush before transmuting exec.as_ptr() to a function pointer by
adding the clear-cache crate to Cargo.toml and calling
clear_cache::clear_cache(exec.as_ptr(), exec.as_ptr().add(exec.len())) (or, if
you prefer FFI, declare and call libc __clear_cache with the same begin/end)
immediately before the unsafe transmute/func() call in test_mmap.

In `@harm-runtime/src/runtime.rs`:
- Around line 26-28: The public build() currently panics via todo!(); change its
API to avoid panics by returning a Result and introduce a BuildError enum (e.g.,
BuildError::Unimplemented) so callers receive a structured error until
implementation is added; update the function signature pub fn build(self) ->
Result<(), BuildError> (or gate behind a feature flag) and replace todo!() with
an Err(BuildError::Unimplemented) so no downstream caller is crashed by a panic.
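A minimal sketch of the suggested change, with Builder reduced to a unit struct for illustration; the crate's real build pipeline carries much more state.

```rust
// Placeholder error so callers get a Result instead of a todo!() panic.
#[derive(Debug, PartialEq)]
enum BuildError {
    /// Returned until the real build logic lands.
    Unimplemented,
}

struct Builder;

impl Builder {
    fn build(self) -> Result<(), BuildError> {
        // A structured error instead of todo!(): callers can match on
        // it rather than being crashed by a panic.
        Err(BuildError::Unimplemented)
    }
}
```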

Comment thread harm-runtime/src/labels.rs Outdated
Comment thread harm-runtime/src/memory.rs
Comment thread harm-runtime/src/memory/memmap2.rs Outdated
Comment on lines +132 to +166
    #[inline]
    fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
        let mut mem = Mmap2FixedMemory::allocate(self.len())?;
        mem.as_mut().copy_from_slice(&self);
        Ok(mem)
    }
}

impl Memory<Mmap2FixedMemory> for &mut Vec<u8> {
    type ExtendError = Infallible;
    type FixedMemoryError = std::io::Error;

    #[inline]
    fn pos(&self) -> usize {
        self.len()
    }

    #[inline]
    fn capacity(&self) -> Option<usize> {
        None
    }

    #[inline]
    fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> {
        self.extend(bytes);
        Ok(())
    }

    #[inline]
    fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
        let mut mem = Mmap2FixedMemory::allocate(self.len())?;
        // The memmap2 spec doesn't say that the length can be different...
        mem.as_mut().copy_from_slice(self);
        Ok(mem)
    }


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

find . -type f -name "*.rs" | head -20

Repository: monoid/harm

Length of output: 1026


🏁 Script executed:

rg "Mmap2FixedMemory" -t rust -A 5

Repository: monoid/harm

Length of output: 4285


🌐 Web query:

memmap2 map_anon(0) behavior zero-length platform-dependent

💡 Result:

memmap2::MmapMut::map_anon(0) (or MmapOptions::new().len(0).map_anon()) has no stable, cross-platform meaning: memmap2 does not guarantee that a zero-length anonymous mapping is supported; it just forwards the request to the OS, so the result is platform/OS-kernel dependent.

  • memmap2 explicitly notes that if you don’t set a length, you’ll create a zero-length anonymous mapping attempt. Whether that succeeds is then up to the underlying system call. [1]
  • On Linux/POSIX-style mmap, the length must be greater than 0; Linux documents this requirement, and mmap(…, length=0, …) fails (EINVAL on modern Linux). [2]

Practical takeaway: if you need “empty buffer” semantics, special-case len == 0 and use an empty Vec<u8> / slice instead of trying to create a 0-byte memmap.

Sources: [1] [2]


🏁 Script executed:

rg "fn allocate" harm-runtime/src/memory/memmap2.rs -A 10

Repository: monoid/harm

Length of output: 616


🏁 Script executed:

cat -n harm-runtime/src/memory/memmap2.rs | head -100

Repository: monoid/harm

Length of output: 2897


Reject zero-length vector conversions explicitly.

memmap2::MmapMut::map_anon(0) behavior is platform-dependent and fails on Linux/POSIX with EINVAL. Explicitly checking for zero-length in both Vec<u8> and &mut Vec<u8> conversions avoids silent platform-specific failures.

Suggested fix
     fn into_fixed_memory(self) -> Result<Mmap2FixedMemory, Self::FixedMemoryError> {
+        if self.is_empty() {
+            return Err(std::io::Error::new(
+                std::io::ErrorKind::InvalidInput,
+                "cannot convert empty vector to fixed memory",
+            ));
+        }
         let mut mem = Mmap2FixedMemory::allocate(self.len())?;
         mem.as_mut().copy_from_slice(&self);
         Ok(mem)
     }

Apply the same check to the &mut Vec<u8> implementation.


Comment thread harm-runtime/src/memory/memmap2.rs
Comment thread harm-runtime/src/runtime.rs Outdated
@monoid monoid force-pushed the harm/relocations-runtime branch from c0daa2c to 686ad9a Compare May 2, 2026 18:36
@monoid monoid changed the title WIP: harm runtime harm runtime May 2, 2026
@monoid monoid marked this pull request as ready for review May 2, 2026 18:36
@monoid
Owner Author

monoid commented May 2, 2026

@coderabbitai, please review.

@coderabbitai

coderabbitai Bot commented May 2, 2026

Tip

For best results, initiate chat on the files or code changes.

@monoid: I'll review the changes in this PR right away!



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
harm/src/reloc.rs (1)

463-472: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Reject out-of-range PC-relative deltas instead of wrapping them.

wrapping_sub(...).cast_signed() turns any unsigned under/overflow into a plausible i64. For PREL64 and the page-based relocations, that silently encodes the wrong target when S - P falls outside the signed 64-bit range instead of surfacing overflow.

Possible fix
 pub enum Rel64Error {
     NotEnoughMemory { offset: usize },
     InvalidOffset { offset: usize },
+    Overflow,
     InvalidValue(::core::num::TryFromIntError),
     InvalidBits(BitError),
 }
 pub fn calc_delta(base: u64, value: u64, offset: usize) -> Result<Offset64, Rel64Error> {
     let offset64 = offset
         .try_into()
         .map_err(|_e| Rel64Error::InvalidOffset { offset })?;

     let instruction_addr = base
         .checked_add(offset64)
         .ok_or(Rel64Error::InvalidOffset { offset })?;

-    Ok(value.wrapping_sub(instruction_addr).cast_signed())
+    let delta = i128::from(value) - i128::from(instruction_addr);
+    delta.try_into().map_err(|_| Rel64Error::Overflow)
 }
 pub fn calc_page_offset(base: u64, value: u64, offset: usize) -> Result<Offset64, Rel64Error> {
     const PAGE_MASK: u64 = !0xfff;
     let offset64 = offset
         .try_into()
         .map_err(|_e| Rel64Error::InvalidOffset { offset })?;

     let instruction_addr = base
         .checked_add(offset64)
         .ok_or(Rel64Error::InvalidOffset { offset })?;

-    Ok((value & PAGE_MASK)
-        .wrapping_sub(instruction_addr & PAGE_MASK)
-        .cast_signed())
+    let delta = i128::from(value & PAGE_MASK) - i128::from(instruction_addr & PAGE_MASK);
+    delta.try_into().map_err(|_| Rel64Error::Overflow)
 }

Also applies to: 479-491

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@harm/src/reloc.rs` around lines 463 - 472, The current calc_delta function
uses value.wrapping_sub(instruction_addr).cast_signed(), which hides overflow by
wrapping; change it to detect unsigned subtraction overflow and reject results
outside the signed 64-bit range: compute value.checked_sub(instruction_addr) (or
detect via value >= instruction_addr), return Rel64Error::Overflow (or a
suitable Rel64Error variant) if subtraction would underflow/overflow, and only
then cast the u64 difference into i64 via a checked conversion; apply the same
fix to the other similar helper(s) referenced around lines 479-491 (the other
calc_delta-like function) so PREL64 and page-based relocations return an error
instead of silently wrapping.
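One way to implement the checked computation is to widen to i128 so the S - P delta can be range-checked rather than wrapped. Rel64Error is trimmed here to the two variants used; the crate's enum carries more cases.

```rust
// Trimmed stand-in for the crate's error enum.
#[derive(Debug, PartialEq)]
enum Rel64Error {
    InvalidOffset { offset: usize },
    Overflow,
}

fn calc_delta(base: u64, value: u64, offset: usize) -> Result<i64, Rel64Error> {
    let offset64: u64 = offset
        .try_into()
        .map_err(|_| Rel64Error::InvalidOffset { offset })?;

    let instruction_addr = base
        .checked_add(offset64)
        .ok_or(Rel64Error::InvalidOffset { offset })?;

    // Widen to i128 so S - P is range-checked instead of wrapped: any
    // delta outside i64 surfaces as Overflow rather than a bogus target.
    let delta = i128::from(value) - i128::from(instruction_addr);
    i64::try_from(delta).map_err(|_| Rel64Error::Overflow)
}
```

The page-offset variant follows the same pattern after masking both operands with the page mask.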
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@harm-runtime/src/lib.rs`:
- Around line 7-13: The crate advertises an "alloc" feature but never enables
no_std and modules builder, labels, and runtime unconditionally import
std::collections::HashMap; fix by adding a crate-level conditional no_std
attribute (e.g. #![cfg_attr(not(feature = "std"), no_std)]) and then either (A)
change those modules (builder, labels, runtime) to conditionally use
std::collections::HashMap only when the "std" feature is enabled and use alloc
equivalents (alloc::collections or hashbrown behind the "alloc" feature)
otherwise, or (B) gate the entire modules builder, labels, runtime behind
cfg(feature = "std") so they are not compiled in no_std builds; ensure the
feature flags in Cargo.toml reflect this (do not advertise alloc as enabling
no_std unless the code paths compile without std).

In `@harm-runtime/src/memory.rs`:
- Around line 30-31: The trait method try_extend currently allows partial
writes; update the contract and API so callers can rely on atomic add semantics:
change try_extend in the Memory trait to explicitly guarantee all-or-nothing
behavior (no bytes are appended on Err) and document that ExtendError implies no
mutation, or alternatively add a new capacity-checked method (e.g.
try_reserve_and_extend or try_extend_exact) that first checks capacity and only
performs the write when it can complete fully; update implementations to honor
the new guarantee and adjust callers like Assembler::append to use the
all-or-nothing try_extend (or the new capacity-checked API) so buffer and
relocation bookkeeping remain in sync.

In `@harm-runtime/src/memory/foreign_memory.rs`:
- Line 18: Remove the unused lifetime parameter on the impl block for
ForeignMemoryBuffer: change the impl<'mem> ForeignMemoryBuffer { ... }
declaration to impl ForeignMemoryBuffer { ... } so the extraneous 'mem lifetime
is not declared; ensure no methods or associated items rely on a lifetime
parameter named 'mem before removing it.

In `@harm-runtime/src/runtime.rs`:
- Around line 58-68: Builder::build currently uses wrapping_add_signed when
computing label addresses and relocation addends, causing labels near u64::MAX
to wrap instead of surfacing an overflow; update the arithmetic in
Builder::build (and any helper in builder.rs that computes label_address or
relocation addend) to use checked/overflow-aware addition (e.g.,
checked_add/checked_sub or explicit overflow checks) and return
BuilderError::AddressOverflow on overflow instead of allowing wrapping; ensure
the same change is applied to all places in builder.rs that call
wrapping_add_signed so label computations and relocation addends propagate the
AddressOverflow error.
- Line 6: runtime.rs (and the similar modules builder.rs and labels.rs)
currently import std::collections::HashMap which breaks no_std+alloc builds;
replace the unconditional import with an alloc-compatible alternative or gate
it: either change uses of HashMap to an alloc-friendly collection (e.g., use
alloc::collections::BTreeMap and update type names/usages accordingly in
runtime.rs, builder.rs, labels.rs) or wrap the existing
std::collections::HashMap imports and any functions/types that depend on them
with #[cfg(any(feature = "alloc", feature = "std"))] and provide a
no_std-friendly implementation or stub for the alloc case. Target the unique
symbol "HashMap" in those files and update all type references and
constructors/usages to match the chosen collection or add the cfg attributes
consistently across runtime.rs, builder.rs, and labels.rs.
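The BTreeMap option can be sketched as below. In a no_std + alloc build the import would be alloc::collections::BTreeMap; std re-exports the same type, so the sketch runs unchanged on std. The name-to-address map shape is a hypothetical stand-in for the crate's label maps.

```rust
use std::collections::BTreeMap;

// BTreeMap works under no_std + alloc, unlike std::collections::HashMap,
// which needs std's default RandomState hasher.
fn resolve_label(labels: &BTreeMap<String, u64>, name: &str) -> Option<u64> {
    labels.get(name).copied()
}
```

The trade-off is O(log n) lookups instead of O(1); hashbrown behind the alloc feature recovers hashing if that matters.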


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c8390c73-17d7-41ee-ad13-26656c1b6036

📥 Commits

Reviewing files that changed from the base of the PR and between 4dc8045 and 686ad9a.

⛔ Files ignored due to path filters (1)
  • .DS_Store is excluded by !**/.DS_Store
📒 Files selected for processing (13)
  • harm-runtime/Cargo.toml
  • harm-runtime/src/builder.rs
  • harm-runtime/src/labels.rs
  • harm-runtime/src/lib.rs
  • harm-runtime/src/memory.rs
  • harm-runtime/src/memory/foreign_memory.rs
  • harm-runtime/src/memory/memmap2.rs
  • harm-runtime/src/runtime.rs
  • harm/src/reloc.rs
  • harm/src/reloc/addr.rs
  • harm/src/reloc/control.rs
  • harm/src/reloc/data.rs
  • harm/src/reloc/movs.rs

Comment thread harm-runtime/src/builder.rs Outdated
Comment thread harm-runtime/src/lib.rs Outdated
Comment on lines +30 to +31
/// Append data to the memory. Should fail when it reaches memory's capacity.
fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError>;

@coderabbitai coderabbitai Bot May 2, 2026


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make try_extend failure semantics explicit.

Assembler::append in harm-runtime/src/runtime.rs:97-103 assumes a failed extend leaves the buffer unchanged. This trait does not promise that, so a fixed-capacity backend can partially write an instruction before returning Err, leaving the byte stream and relocation bookkeeping out of sync. Please require all-or-nothing behavior here, or add a capacity-checked fixed-write API that callers can use safely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@harm-runtime/src/memory.rs` around lines 30 - 31, The trait method try_extend
currently allows partial writes; update the contract and API so callers can rely
on atomic add semantics: change try_extend in the Memory trait to explicitly
guarantee all-or-nothing behavior (no bytes are appended on Err) and document
that ExtendError implies no mutation, or alternatively add a new
capacity-checked method (e.g. try_reserve_and_extend or try_extend_exact) that
first checks capacity and only performs the write when it can complete fully;
update implementations to honor the new guarantee and adjust callers like
Assembler::append to use the all-or-nothing try_extend (or the new
capacity-checked API) so buffer and relocation bookkeeping remain in sync.

Owner Author


Implementing atomicity without either additional memory or changing the bytes to be overwritten is not possible. Though changing memory might be OK.



@monoid, that's a fair point. True atomicity with an arbitrary Iterator<Item = u8> is indeed impossible without buffering — you don't know the length upfront to pre-check capacity.

The simplest fix that avoids extra memory and enables all-or-nothing semantics is to change the signature to accept a slice instead of an iterator:

fn try_extend(&mut self, bytes: &[u8]) -> Result<(), Self::ExtendError>;

With a concrete &[u8]:

  1. The implementation knows bytes.len() upfront.
  2. Fixed-capacity backends (e.g. MmapBuffer) can check self.pos() + bytes.len() <= capacity before touching memory — returning Err without any write.
  3. Dynamic backends (e.g. Vec-backed ForeignMemoryBuffer) can call try_reserve and then write — also all-or-nothing.

If you'd prefer to keep the iterator form, the "change memory is OK" path works too — document that on Err, implementations must reset pos to its pre-call value (bytes beyond pos are treated as uninitialized). That's effectively free for cursor-tracked buffers.

Either way, the contract should be documented explicitly on the trait:

/// Append data to the memory.
///
/// # Atomicity
///
/// This method MUST be all-or-nothing: if it returns `Err`, no bytes are
/// appended and `self.pos()` is unchanged.
fn try_extend(&mut self, bytes: &[u8]) -> Result<(), Self::ExtendError>;

The &[u8] approach is probably cleaner overall — it's more idiomatic Rust, avoids iterator overhead, and makes the pre-check trivial in every backend.

( ˶ᵔ ᵕ ᵔ˶ )

Owner Author


Unfortunately, using a slice wouldn't work: it has to allocate memory somewhere too. The iterator abstraction is used because instructions in the harm package generate their bytes together with relocations, and virtual instructions like mov can generate a different number of bytes (it may expand to orr, or to a series of movk/movz/movn, etc.).

Owner Author

@monoid monoid May 10, 2026


Using a scratch buffer in the builder would work, though.



@monoid, that context makes the design clear — thank you. Since the iterator is necessary, the "change memory is OK" path is the right one. The all-or-nothing contract can be fulfilled purely by cursor bookkeeping, with no allocation at all:

  1. Save the cursor before writing: let checkpoint = self.pos;
  2. Write bytes one by one from the iterator.
  3. On any overflow, restore self.pos = checkpoint. The bytes already written into the backing buffer beyond checkpoint become dead/garbage — that's harmless because pos is the authoritative end of valid data.

For MmapBuffer this is trivially free. For the Vec-backed ForeignMemoryBuffer it's not needed (a Vec can't overflow), but the contract still holds vacuously.

The minimal change is just:

  • Document the contract explicitly on the trait:
/// Append data to the memory.
///
/// # Atomicity
///
/// On `Err`, no bytes are logically appended: `self.pos()` MUST equal its
/// pre-call value after the call returns.  Bytes written past `pos` into the
/// backing buffer are considered uninitialized and may be overwritten.
fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError>;
  • Update MmapBuffer::try_extend to save and restore the cursor on overflow:
fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError> {
    let checkpoint = self.pos;
    for byte in bytes {
        if self.pos >= self.mmap_mut.len() {
            self.pos = checkpoint;           // restore — no logical mutation
            return Err(MapBufferError::Overflow);
        }
        self.mmap_mut[self.pos] = byte;
        self.pos += 1;
    }
    Ok(())
}

This keeps the iterator abstraction intact, requires zero extra allocation, and gives Assembler::append the safety guarantee it needs.


✏️ Learnings added
Learnt from: monoid
Repo: monoid/harm PR: 50
File: harm-runtime/src/memory.rs:28-29
Timestamp: 2026-05-10T19:22:00.631Z
Learning: In `harm-runtime/src/memory.rs`, `Memory::try_extend` uses `Iterator<Item = u8>` (not `&[u8]`) because instructions in the `harm` crate emit bytes and relocations together, and virtual instructions like `mov` expand to a variable number of bytes at emit time. Collecting into a slice first would require an allocation, defeating the purpose. The atomicity contract should be expressed as "on Err, implementations MUST reset pos() to its pre-call value; bytes written past pos into the backing buffer are treated as uninitialized."



@monoid, that's a great synthesis — a scratch buffer at the Assembler::append level elegantly resolves both constraints:

  1. Encoding step — InstructionSeq::encode writes variable-length bytes + relocations into a temporary Vec<u8> (unbounded, always succeeds).
  2. Extension step — the scratch Vec gives a &[u8] of known length, so try_extend can pre-check capacity and guarantee all-or-nothing behavior.

Concretely, try_extend can be changed to:

fn try_extend(&mut self, bytes: &[u8]) -> Result<(), Self::ExtendError>;

And Assembler::append becomes:

pub fn append<InstSeq: InstructionSeq>(&mut self, s: InstSeq) -> Result<(), Mem::ExtendError> {
    let base_pos = self.mem.pos();

    // Encode into scratch — always succeeds (Vec is unbounded)
    let mut scratch_bytes: Vec<u8> = Vec::new();
    let mut scratch_relocs: Vec<(usize, Rel64)> = Vec::new();
    s.encode(&mut scratch_bytes, &mut scratch_relocs);  // or however InstructionSeq works

    // Atomic write — fixed-capacity backends pre-check len before touching memory
    self.mem.try_extend(&scratch_bytes)?;

    // Only reached if write succeeded → relocations stay in sync
    for (rel_offset, rel) in scratch_relocs {
        self.relocations.push((base_pos + rel_offset, rel));
    }

    Ok(())
}

Each Memory implementation can then honor the contract trivially:

  • MmapBuffer — check self.pos + bytes.len() <= capacity before writing; return Err with no mutation otherwise.
  • ForeignMemoryBuffer (Vec-backed) — try_reserve then extend_from_slice; no partial write possible.
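
Both backend strategies above can be sketched as follows; MmapLike and VecBuf are hypothetical stand-ins for MmapBuffer and ForeignMemoryBuffer (the real types live in harm-runtime/src/memory/), and ExtendError is illustrative:

```rust
// Sketch of the slice-based, all-or-nothing contract for both backends.
// All type names here are illustrative stand-ins, not the crate's API.
#[derive(Debug, PartialEq)]
enum ExtendError {
    Overflow,
}

struct MmapLike {
    data: Vec<u8>, // stands in for the fixed-size mmap region
    pos: usize,
}

impl MmapLike {
    /// Pre-check capacity, then copy: all-or-nothing by construction.
    fn try_extend(&mut self, bytes: &[u8]) -> Result<(), ExtendError> {
        if self.pos + bytes.len() > self.data.len() {
            return Err(ExtendError::Overflow); // nothing was mutated
        }
        self.data[self.pos..self.pos + bytes.len()].copy_from_slice(bytes);
        self.pos += bytes.len();
        Ok(())
    }
}

struct VecBuf {
    data: Vec<u8>,
}

impl VecBuf {
    /// Vec-backed: reserve first so a failed allocation leaves `data` untouched.
    fn try_extend(&mut self, bytes: &[u8]) -> Result<(), ExtendError> {
        self.data.try_reserve(bytes.len()).map_err(|_| ExtendError::Overflow)?;
        self.data.extend_from_slice(bytes);
        Ok(())
    }
}

fn main() {
    let mut m = MmapLike { data: vec![0; 4], pos: 0 };
    assert!(m.try_extend(&[1, 2, 3]).is_ok());
    assert_eq!(m.try_extend(&[4, 5]), Err(ExtendError::Overflow));
    assert_eq!(m.pos, 3); // position unchanged after the failed write
}
```

Because the slice's length is known up front, the fixed-capacity backend never touches memory on the failure path, so no checkpoint/rollback is needed.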

And the trait doc should make the guarantee explicit:

/// Append data to the memory.
///
/// # Atomicity
///
/// Implementations MUST be all-or-nothing: on `Err`, no bytes are appended
/// and `self.pos()` is unchanged.
fn try_extend(&mut self, bytes: &[u8]) -> Result<(), Self::ExtendError>;

The scratch Vec allocation is a small, bounded overhead per append call (one allocation per instruction sequence), which is acceptable for a JIT/assembler context.

( •ᴗ• )੭


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
harm-runtime/src/memory.rs (1)

28-29: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Make try_extend atomic-on-error in the trait contract.

At Line 28, the docs describe capacity failure but still don’t guarantee rollback/no mutation on Err. This leaves callers unable to safely assume append atomicity.

Proposed contract tightening
 pub trait Memory {
     type ExtendError;
@@
-    /// Append data to the memory.  Should fail when it reaches memory's capacity.
+    /// Append data to memory.
+    ///
+    /// Implementations must be all-or-nothing:
+    /// - `Ok(())`: all bytes were appended.
+    /// - `Err(_)`: no bytes were appended and position is unchanged.
+    ///
+    /// Should fail when the write would exceed fixed capacity.
     fn try_extend<I: Iterator<Item = u8>>(&mut self, bytes: I) -> Result<(), Self::ExtendError>;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@harm-runtime/src/memory.rs` around lines 28 - 29, The try_extend trait method
currently documents capacity failure but doesn't guarantee rollback on error;
update the Memory trait's try_extend contract to require atomicity on error (no
mutation of memory/state when Err is returned) and specify that implementations
must either pre-check capacity or buffer the incoming bytes and only commit on
success, and clarify behavior of ExtendError when partial data would have been
added; update implementations of try_extend to follow this (e.g., in methods
named try_extend) and add unit tests asserting no state change on Err.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@harm-runtime/src/memory.rs`:
- Around line 71-99: The tests call Memory::align(...) but ignore its Result;
change each call in the test functions (e.g., test_align and
test_align_corner_case) to assert success by unwrapping the Result with
.expect("Memory::align failed") (or similar) so failures are explicit and the
compiler warning about unused Results is resolved; update every occurrence of
Memory::align(&mut data, ...) in these tests to use .expect(...) instead of
dropping the Result.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 860f9cae-0159-4ae6-86e6-75a7704c1d6c

📥 Commits

Reviewing files that changed from the base of the PR and between 686ad9a and be073cf.

📒 Files selected for processing (6)
  • harm-runtime/src/builder.rs
  • harm-runtime/src/labels.rs
  • harm-runtime/src/lib.rs
  • harm-runtime/src/memory.rs
  • harm-runtime/src/memory/foreign_memory.rs
  • harm-runtime/src/memory/memmap2.rs
🚧 Files skipped from review as they are similar to previous changes (4)
  • harm-runtime/src/memory/foreign_memory.rs
  • harm-runtime/src/builder.rs
  • harm-runtime/src/memory/memmap2.rs
  • harm-runtime/src/labels.rs


Labels

enhancement New feature or request runtime The `harm`'s runtime: memory mappings, applying relocations, etc.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant