feat(storage): publish primitive storage API by ukint-vs · Pull Request #1366 · gear-tech/sails

ukint-vs · 2026-05-20T12:25:01Z

Summary

This PR narrows sails-storage to the first publishable public surface: reusable storage primitives and build helpers, not benchmark harnesses or token-contract-specific storage.

Storage primitives

Adds the sails-storage crate.
Publishes root-level FixedOpenAddressMap, StaticOpenAddressTable, StaticLayout, StaticActorIdU256Map, and StaticActorPairU256Map as the stable consumer-facing surface.
Documents the unusual static-memory and actor-map semantics in crate-level rustdoc with compiling examples.
Keeps token/VFT-specific and research-only storage shapes out of the stable public namespace.

Build integration

Adds sails_rs::build::StaticMemoryLayout and static-memory build helpers.
Exposes reserve_actor_u256_map and reserve_actor_pair_u256_map for generated static-memory constants.

Explicitly Not Included

No docs/ files.
No benchmark crates or generated benchmark clients.
No benchmark IDL files.
No example app changes.
No workflow or script changes.

Verification

cargo test -p sails-storage — 24 passed
cargo test -p sails-storage --doc — 3 passed
cargo test -p sails-rs --no-default-features --features wasm-builder build::tests — 9 passed, 14 filtered
git diff --check — passed for the current worktree diff
live PR diff checked for docs/, benchmarks/, examples/, scripts/, .github/, and .idl paths

Test plan

Storage public API tests cover root fixed/static map usage, actor map operations, pair map operations, and zero actor rejection.
Crate rustdoc examples cover fixed maps, generic static tables, and actor/U256 static maps.
Build helper tests cover generated static-memory layout constants.

…age-profiling

gemini-code-assist

Code Review

This pull request introduces the sails-storage crate, which provides allocator-light storage primitives, and adds extensive benchmarking support for storage and VFT operations, including new benchmark crates and analysis tools. It also refactors message handling in rs/gstd/macros.rs to use with_read_on_stack_or_heap for improved memory management. The reviewer identified a potential DoS vulnerability in the message tracker within the aggregator example, recommending a bounded-state pattern with explicit eviction to prevent gas exhaustion.

gemini-code-assist · 2026-05-20T12:30:45Z

+    #[export]
+    pub fn prepare_tracker(&mut self, len: u32) -> TrackerBenchResult {
+        let mut tracker = msg_tracker().borrow_mut();
+        tracker.clear();


To prevent gas exhaustion and DoS, avoid patterns that can lead to panics when collections reach their limit. Instead of an assertion, ensure the message tracker uses a bounded-state pattern with explicit eviction of the oldest entries when the capacity (FIXED_TRACKER_CAPACITY) is reached. Additionally, include regression tests to verify the cap is enforced.

References

Avoid storing unbounded collections in service state to prevent gas exhaustion and DoS. Use a bounded-state pattern, such as a fixed-size cap with explicit eviction of the oldest entries, and include regression tests to verify the cap is enforced.

ukint-vs · 2026-05-20T13:23:59Z

/gemini review

gemini-code-assist

Code Review

This pull request introduces a new sails-storage crate providing allocator-light storage primitives, including fixed-capacity and static-memory open-addressed maps. The review identified critical issues regarding unbounded collection growth in the static actor maps, which could lead to denial-of-service vulnerabilities, as well as platform-dependent hash and read operations that could cause cross-architecture inconsistencies. The reviewer provided actionable suggestions to improve hash distribution and ensure correct endianness handling.

gemini-code-assist · 2026-05-20T13:30:19Z

+        &self,
+        actor: ActorId,
+        value: U256,
+    ) -> Result<Option<U256>, TableError> {
+        let key = actor_key(actor);
+        reject_zero_key(&key)?;
+        if value.is_zero() {
+            return self
+                .remove_key(&key)
+                .map(|previous| previous.map(u256_from_value));
+        }


The StaticActorIdU256Map implementation has a potential capacity leak. When a zero value is written to 'remove' an entry, the value is zeroed but the key remains in the slot. Because this map uses linear probing and identifies vacant slots only by an all-zero key, a slot that was once occupied can never be reused for a different key. This leads to unbounded growth of unique keys in the state, which can cause gas exhaustion or DoS. Per repository rules, use a bounded-state pattern (e.g., a fixed-size cap with explicit eviction) and include regression tests to verify the cap is enforced.

References

Avoid storing unbounded collections in service state to prevent gas exhaustion and DoS. Use a bounded-state pattern, such as a fixed-size cap with explicit eviction of the oldest entries, and include regression tests to verify the cap is enforced.

gemini-code-assist · 2026-05-20T13:30:19Z

+        if value.is_zero() {
+            return self
+                .remove_key(&left, &right)
+                .map(|previous| previous.map(u256_from_value));
+        }


Similar to StaticActorIdU256Map, StaticActorPairU256Map does not free slots when a zero value is written. This prevents slot reuse and leads to an unbounded collection of unique actor pairs in the service state, risking gas exhaustion and DoS. This should be addressed by implementing a bounded-state pattern with explicit eviction and regression tests, as required by repository rules.

References

Avoid storing unbounded collections in service state to prevent gas exhaustion and DoS. Use a bounded-state pattern, such as a fixed-size cap with explicit eviction of the oldest entries, and include regression tests to verify the cap is enforced.

gemini-code-assist · 2026-05-20T13:30:19Z

+    let mut hash = 0x811c_9dc5u32;
+
+    for byte in bytes {
+        hash ^= u32::from(*byte);
+        hash = hash.wrapping_mul(0x0100_0193);
+    }
+
+    hash as usize


The hash_bytes function implements a 32-bit FNV-1a hash but returns a usize. On 64-bit platforms, the resulting hash will only occupy the lower 32 bits. If the table size (slots) exceeds 2^32, the hash will only ever map to the first 4GB of the table, leading to poor distribution and potential performance issues. For a general-purpose storage library, it is better to use the full width of usize for the hash calculation by using platform-appropriate constants.

let mut hash = if core::mem::size_of::<usize>() == 8 { 0xcbf2_9ce4_8422_2325usize } else { 0x811c_9dc5usize }; let prime = if core::mem::size_of::<usize>() == 8 { 0x0000_0100_0000_01b3usize } else { 0x0100_0193usize }; for byte in bytes { hash ^= usize::from(*byte); hash = hash.wrapping_mul(prime); } hash

gemini-code-assist · 2026-05-20T13:30:19Z

+unsafe fn read_words_32(ptr: *const u8) -> [u64; 4] {
+    unsafe {
+        [
+            ptr::read_unaligned(ptr.cast::<u64>()),
+            ptr::read_unaligned(ptr.add(8).cast::<u64>()),
+            ptr::read_unaligned(ptr.add(16).cast::<u64>()),
+            ptr::read_unaligned(ptr.add(24).cast::<u64>()),
+        ]
+    }


The read_words_32 function uses ptr::read_unaligned to read u64 values directly from memory, which is platform-dependent as it uses the native endianness of the CPU. However, words_32 (used for comparison) uses u64::from_le_bytes. On a big-endian system, read_words_32 will return big-endian words while words_32 returns little-endian words, causing key comparisons to fail. To ensure platform-independent behavior and correctness on all architectures, these values should be explicitly converted from little-endian.

Suggested change

unsafe fn read_words_32(ptr: *const u8) -> [u64; 4] {

unsafe {

[

ptr::read_unaligned(ptr.cast::<u64>()),

ptr::read_unaligned(ptr.add(8).cast::<u64>()),

ptr::read_unaligned(ptr.add(16).cast::<u64>()),

ptr::read_unaligned(ptr.add(24).cast::<u64>()),

]

}

unsafe fn read_words_32(ptr: *const u8) -> [u64; 4] {

unsafe {

[

u64::from_le(ptr::read_unaligned(ptr.cast::<u64>())),

u64::from_le(ptr::read_unaligned(ptr.add(8).cast::<u64>())),

u64::from_le(ptr::read_unaligned(ptr.add(16).cast::<u64>())),

u64::from_le(ptr::read_unaligned(ptr.add(24).cast::<u64>())),

]

}

}

gemini-code-assist · 2026-05-20T13:30:19Z

+fn fold_words(bytes: &[u8; 32]) -> u32 {
+    let ptr = bytes.as_ptr();
+    unsafe {
+        ptr::read_unaligned(ptr.cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(4).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(8).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(12).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(16).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(20).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(24).cast::<u32>())
+            ^ ptr::read_unaligned(ptr.add(28).cast::<u32>())
+    }
+}


The fold_words function uses ptr::read_unaligned to read u32 values in native endianness. This makes the hash calculation platform-dependent. If program state is analyzed or shared across different architectures (e.g., a native tool reading WASM state), the hash distribution will differ, leading to incorrect lookups. It is safer to explicitly read these values as little-endian.

Suggested change

fn fold_words(bytes: &[u8; 32]) -> u32 {

let ptr = bytes.as_ptr();

unsafe {

ptr::read_unaligned(ptr.cast::<u32>())

^ ptr::read_unaligned(ptr.add(4).cast::<u32>())

^ ptr::read_unaligned(ptr.add(8).cast::<u32>())

^ ptr::read_unaligned(ptr.add(12).cast::<u32>())

^ ptr::read_unaligned(ptr.add(16).cast::<u32>())

^ ptr::read_unaligned(ptr.add(20).cast::<u32>())

^ ptr::read_unaligned(ptr.add(24).cast::<u32>())

^ ptr::read_unaligned(ptr.add(28).cast::<u32>())

}

}

fn fold_words(bytes: &[u8; 32]) -> u32 {

let ptr = bytes.as_ptr();

unsafe {

u32::from_le(ptr::read_unaligned(ptr.cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(4).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(8).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(12).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(16).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(20).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(24).cast::<u32>()))

^ u32::from_le(ptr::read_unaligned(ptr.add(28).cast::<u32>()))

}

}

gemini-code-assist

Code Review

This pull request introduces the sails-storage crate, which provides allocator-light storage primitives such as fixed and static open-addressed maps, alongside build-script utilities for managing static memory layouts in WASM. Feedback identifies a critical resource exhaustion vulnerability in the static map's lookup logic where tombstone slots are not reused, potentially leading to capacity leaks. Additionally, improvements were suggested to ensure the Rust keyword validation list is comprehensive to prevent compilation errors in generated code and to optimize utility functions by reusing existing word-wise logic.

gemini-code-assist · 2026-05-20T13:31:46Z

+    fn lookup(&self, key: &[u8; 32]) -> StaticLookup {
+        let key_words = words_32(key);
+        let mut index = static_map_index(hash_actor_key(key), LOG2_SLOTS);
+
+        for _ in 0..self.slots {
+            let stored_key = unsafe { read_words_32(self.key_ptr(index)) };
+            if stored_key == key_words {
+                return StaticLookup::Found(index);
+            }
+            if words_are_zero(stored_key) {
+                return StaticLookup::Vacant(index);
+            }
+
+            index = (index + 1) & self.mask;
+        }
+
+        StaticLookup::Full
+    }


The current lookup implementation doesn't correctly handle tombstones (slots with a non-zero key but a zero value) for insertion. When a key is "removed" by setting its value to zero, the slot is not made available for new keys. This can lead to a situation where the map runs out of capacity for new insertions, even if many entries have been removed. This is a form of capacity leak and could lead to DoS via resource exhaustion. To fix this, the lookup function should be modified to identify and allow reuse of these tombstone slots for new insertions. Additionally, please include a regression test to verify that the capacity is correctly managed and the cap is enforced, as per the repository's bounded-state pattern rules.

fn lookup(&self, key: &[u8; 32]) -> StaticLookup { let key_words = words_32(key); let mut index = static_map_index(hash_actor_key(key), LOG2_SLOTS); let mut first_tombstone = None; for _ in 0..self.slots { let stored_key_ptr = self.key_ptr(index); let stored_key = unsafe { read_words_32(stored_key_ptr) }; if stored_key == key_words { return StaticLookup::Found(index); } if words_are_zero(stored_key) { return StaticLookup::Vacant(first_tombstone.unwrap_or(index)); } if first_tombstone.is_none() { let value_ptr = self.value_ptr(index); if unsafe { words_at_zero(value_ptr) } { first_tombstone = Some(index); } } index = (index + 1) & self.mask; } first_tombstone.map(StaticLookup::Vacant).unwrap_or(StaticLookup::Full) }

References

Avoid storing unbounded collections in service state to prevent gas exhaustion and DoS. Use a bounded-state pattern, such as a fixed-size cap with explicit eviction of the oldest entries, and include regression tests to verify the cap is enforced.

gemini-code-assist · 2026-05-20T13:31:46Z

+fn is_zero_32(bytes: &[u8; 32]) -> bool {
+    bytes.iter().all(|byte| *byte == 0)
+}


The is_zero_32 function can be optimized by reusing the existing words_32 and words_are_zero functions. This would also improve consistency with other parts of the code that perform similar checks using word-wise operations instead of byte-wise iteration.

Suggested change

fn is_zero_32(bytes: &[u8; 32]) -> bool {

bytes.iter().all(|byte| *byte == 0)

}

fn is_zero_32(bytes: &[u8; 32]) -> bool {

words_are_zero(words_32(bytes))

}

gemini-code-assist

Code Review

This pull request introduces the sails-storage crate, which provides allocator-light storage primitives such as fixed-capacity and static-memory open-addressed maps, alongside build-script helpers for managing static memory layouts and patching WASM binaries. Feedback highlights a critical security concern regarding capacity leaks in the actor map remove_key implementation, which could lead to Denial of Service if not addressed with tombstones or cluster re-hashing. Additionally, suggestions were made to optimize the find_slot function for power-of-two capacities and to improve the robustness of the WASM patching logic to avoid fragility with future parser versions.

gemini-code-assist · 2026-05-20T13:40:58Z

+    fn remove_key(&self, key: &[u8; 32]) -> Result<Option<[u8; 32]>, TableError> {
+        let StaticLookup::Found(index) = self.lookup(key) else {
+            return Ok(None);
+        };
+
+        let previous = self.visible_value(index);
+        if previous.is_some() {
+            unsafe {
+                self.write_value(index, &[0; 32]);
+            }
+        }
+        Ok(previous)
+    }


The remove_key implementation for StaticActorIdU256Map (and similarly for StaticActorPairU256Map) zeros the value but leaves the key in the slot. In an open-addressed map using linear probing without tombstones, this results in a capacity leak. Slots occupied by 'removed' keys will never be considered Vacant for new keys, eventually causing CapacityOverflow and potential DoS. To prevent gas exhaustion and ensure correct state management, implement a bounded-state pattern (e.g., using tombstones or cluster re-hashing). Additionally, include regression tests to verify that the capacity is correctly managed and the cap is enforced.

References

Avoid storing unbounded collections in service state to prevent gas exhaustion and DoS. Use a bounded-state pattern, such as a fixed-size cap with explicit eviction of the oldest entries, and include regression tests to verify the cap is enforced.

gemini-code-assist · 2026-05-20T13:40:58Z

+                if let Some((id, range)) = other.as_section() {
+                    module.section(&wasm_encoder::RawSection {
+                        id,
+                        data: &wasm[range],
+                    });
+                } else if !matches!(
+                    other,
+                    wasmparser::Payload::CodeSectionStart { .. }
+                        | wasmparser::Payload::CodeSectionEntry { .. }
+                ) {
+                    return Err(Error::UnsupportedWasmPayload);
+                }


The patch_imported_memory function uses an aggressive UnsupportedWasmPayload error for any payload that isn't a section or a code section entry. While this works for simple modules, it might fail on valid WebAssembly modules containing other non-section payloads yielded by wasmparser (e.g., if future versions yield data segments item-by-item similar to code entries). More importantly, the current logic relies on as_section() returning the full range for CodeSectionStart. While this is true for current wasmparser versions, it's a fragile pattern for round-tripping. A more robust approach would be to explicitly handle all core WASM section types or use a higher-level re-encoding library if possible.

ukint-vs added 7 commits May 19, 2026 00:38

feat(rs): add static storage maps and benchmarks

4f9cdfc

feat(storage): add experimental VFT account storage

de572e4

feat: reduce generated sails vft overhead

2e08d25

bench: add noop sails hot lower bound

0f03eed

feat(storage): publish primitive storage API

d8f875c

chore: exclude docs from storage branch

5e41b8b

Merge remote-tracking branch 'origin/master' into work/sails-vft-stor…

694ada1

…age-profiling

gemini-code-assist Bot reviewed May 20, 2026

View reviewed changes

ukint-vs added 6 commits May 20, 2026 16:31

chore: narrow storage PR scope

59c360a

chore: remove benchmark artifacts from storage PR

5d183f4

chore: drop experimental VFT account feature

bebd0a1

chore: reduce storage PR to stable primitives

2548042

chore: simplify storage primitives

f2490eb

docs: document sails storage primitives

de0c7c7

gemini-code-assist Bot reviewed May 20, 2026

View reviewed changes

ukint-vs added 2 commits May 20, 2026 17:46

chore: refresh lockfile for storage crate

da6febf

chore: trim storage API to generic primitives

16f1e21

ukint-vs force-pushed the work/sails-vft-storage-profiling branch from b570541 to 16f1e21 Compare May 20, 2026 17:19

ukint-vs marked this pull request as draft May 22, 2026 12:59

Conversation

ukint-vs commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Explicitly Not Included

Verification

Test plan

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

ukint-vs commented May 20, 2026

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist Bot May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

ukint-vs commented May 20, 2026 •

edited

Loading