starknet_committer: support both layouts in fetch_patricia_paths#13958
Conversation
1ace527 to
61f0e7e
Compare
61f0e7e to
b6f331e
Compare
PR SummaryMedium Risk Overview Refactors traversal to explicitly handle layout-dependent child traversal by queuing unmodified subtrees for hash-only reads and flushing pending binary/edge preimages once child hashes are available, and updates call sites (e.g., Reviewed by Cursor Bugbot for commit b4d157b. Bugbot is set up for automated code reviews on this repo. Configure here. |
yoavGrs
left a comment
There was a problem hiding this comment.
@yoavGrs made 1 comment.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on ArielElp).
a discussion (no related file):
Please separate moving the file from the changes made inside it. It’ll make it easier for me to review the changes.
yoavGrs
left a comment
There was a problem hiding this comment.
@yoavGrs reviewed 1 file, made 6 comments, and resolved 1 discussion.
Reviewable status: 1 of 4 files reviewed, 6 unresolved discussions (waiting on ArielElp).
crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):
} // TODO(Rotem): Match the python logic of fetching the nodes.
What about this TODO?
On this point, is there a note in the test that we are collecting too many nodes?
Code quote:
// TODO(Rotem): Match the python logic of fetching the nodes.crates/starknet_committer/src/db/trie_traversal.rs line 57 at r1 (raw file):
#[cfg(test)] #[path = "facts_db/traversal_test.rs"]
The test should be beside the corresponding module.
Code quote:
"facts_db/traversal_test.rs"crates/starknet_committer/src/db/trie_traversal.rs line 61 at r1 (raw file):
/// Logs out a warning of a trivial modification. macro_rules! log_trivial_modification {
Move it down, please.
(Top-down Ordering)
Code quote:
macro_rules! log_trivial_modification crates/starknet_committer/src/db/trie_traversal.rs line 67 at r1 (raw file):
} /// Returns the Patricia inner nodes ([PreimageMap]) in the paths to the given `leaf_indices` in the
With the siblings?
Code quote:
in the paths to the given `leaf_indices`crates/starknet_committer/src/db/trie_traversal.rs line 101 at r1 (raw file):
/// Required for `patricia_update` function in Cairo. /// Extra preimages (more than the data required to verify merkle paths) are required to verify /// correctness of final tree topology; for more details see 'traverse_edge' in 'patricia.cairo'.
Not in this repo
Code quote:
see 'traverse_edge' in 'patricia.cairo'crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):
while !current_traversal.is_empty() || !current_hash_only.is_empty() { read_hashes::<L, Layout>(¤t_hash_only, storage, key_context, &mut hash_by_index) .await?;
Can you read all the hashes for the waiting hash-only vector outside the loop?
Code quote:
read_hashes::<L, Layout>(¤t_hash_only, storage, key_context, &mut hash_by_index)
.await?;22de1c4 to
7cb0ac4
Compare
ArielElp
left a comment
There was a problem hiding this comment.
@ArielElp partially reviewed 2 files and made 6 comments.
Reviewable status: 2 of 5 files reviewed, 6 unresolved discussions (waiting on yoavGrs).
crates/starknet_committer/src/db/trie_traversal.rs line 57 at r1 (raw file):
Previously, yoavGrs wrote…
The test should be beside the corresponding module.
Tests are moved to their own module in the 3'rd PR in this stack
crates/starknet_committer/src/db/trie_traversal.rs line 61 at r1 (raw file):
Previously, yoavGrs wrote…
Move it down, please.
(Top-down Ordering)
Move above first usage
crates/starknet_committer/src/db/trie_traversal.rs line 67 at r1 (raw file):
Previously, yoavGrs wrote…
With the siblings?
Yes, added to the comment
crates/starknet_committer/src/db/trie_traversal.rs line 101 at r1 (raw file):
Previously, yoavGrs wrote…
Not in this repo
It's part of cairo zero's common library, what do you suggest?
crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):
Previously, yoavGrs wrote…
Can you read all the hashes for the waiting hash-only vector outside the loop?
Indices are learned dynamically, so it doesn't sound possible
crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):
Previously, yoavGrs wrote…
What about this TODO?
On this point, is there a note in the test that we are collecting too many nodes?
Asked Cursor to compare the old (pre this PR) and the pythonic implementation, it sounds like if anything we're collecting less unneccessary nodes:
-
Binary children with mixed “modified / sibling” subtrees
Python always enqueues both left and right subtrees when the parent has non-emptyleaf_indices(unlessheight - 1 == 0, then it usesset_leaffor both). The sibling child is enqueued withleaf_indices == []; the next layer still loads that node from DB, then treats it as a binary sibling (hash only, no children).Your Rust only enqueues children where
!is_unmodified(). Ifis_unmodifiedmeans “no leaf indices in this subtree,” that skips enqueueing and can skip the extramgetfor that child’s preimage compared to Python, which always does that read for binary siblings. That matches the spirit of the Rust TODO (“Match the python logic of fetching the nodes”) if the goal is identical witness sets and round-trips. -
Edge siblings
Python: edge withleaf_indices == []still reads and stores the edge inprefetched_nodes, thencontinue(no bottom traversal). Rust: analogous subtree handling depends on howFactsSubTree/get_bottom_subtreetreat “sibling” vs “modified.” -
Default / empty leaves on edges
Rust explicitly fillsempty_leaves_indiceswithL::default(). Python’sfetch_witnessespre-fills every requested index withempty_leafand only overwrites aftermgeton modified leaf keys—so “empty under edge but not bottom” may be represented differently unless your Python callers rely on the same invariants. -
Root wrapper
Python’s public entry isPatriciaTree.fetch_witnesses(tree height, empty root check, full leaf indices). Your outerfetch_patricia_pathsmatches that shape (check empty indices + empty root, build main subtree, call inner).
IMO we should stop treating the python implementation as the source of truth, we rewrote the hints, makes sense for the algorithms to slightly change.
yoavGrs
left a comment
There was a problem hiding this comment.
@yoavGrs reviewed 3 files and all commit messages, made 3 comments, and resolved 3 discussions.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on ArielElp).
crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):
Previously, ArielElp wrote…
Indices are learned dynamically, so it doesn't sound possible
This is my suggestion:
while !current_traversal.is_empty() {
let filled_roots = ...;
for (filled_root, subtree) in filled_roots {
...
}
}
read_hashes::<L, Layout>(¤t_hash_only, storage);
crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):
} else { pending_binary.push((filled_root.hash, left_index, right_index)); }
I would match here on Layout::SubTree::should_traverse_unmodified_child
Code quote:
let left_hash = schedule_binary_child::<L, Layout>(
left_data,
left_subtree,
&mut hash_by_index,
&mut next_traversal,
&mut next_hash_only,
);
let right_hash = schedule_binary_child::<L, Layout>(
right_data,
right_subtree,
&mut hash_by_index,
&mut next_traversal,
&mut next_hash_only,
);
if let (Some(left_hash), Some(right_hash)) = (left_hash, right_hash) {
witnesses.insert(
filled_root.hash,
Preimage::Binary(BinaryData {
left_data: left_hash,
right_data: right_hash,
}),
);
} else {
pending_binary.push((filled_root.hash, left_index, right_index));
}crates/starknet_committer/src/db/trie_traversal.rs line 191 at r3 (raw file):
} None => { pending_edge.push((filled_root.hash, path_to_bottom, bottom_index));
Same suggestion here.
Code quote:
let bottom_hash = schedule_edge_child::<L, Layout>(
bottom_data,
bottom_subtree,
&mut hash_by_index,
&mut next_traversal,
);
match bottom_hash {
Some(hash) => {
witnesses.insert(
filled_root.hash,
Preimage::Edge(EdgeData { bottom_data: hash, path_to_bottom }),
);
}
None => {
pending_edge.push((filled_root.hash, path_to_bottom, bottom_index));7cb0ac4 to
1d0e779
Compare
ArielElp
left a comment
There was a problem hiding this comment.
@ArielElp partially reviewed 1 file and made 3 comments.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on yoavGrs).
crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):
Previously, yoavGrs wrote…
This is my suggestion:
while !current_traversal.is_empty() { let filled_roots = ...; for (filled_root, subtree) in filled_roots { ... } } read_hashes::<L, Layout>(¤t_hash_only, storage);
Not sure if this was what you meant, but I moved read_hashes into the first step of clear_pending, so the main loop is more straightforward now, as you suggested
crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):
Previously, yoavGrs wrote…
I would match here on
Layout::SubTree::should_traverse_unmodified_child
This is what happens in schedule_binary_child though, it's nicer if all these considerations are kept inside IMO
crates/starknet_committer/src/db/trie_traversal.rs line 191 at r3 (raw file):
Previously, yoavGrs wrote…
Same suggestion here.
Same as above
yoavGrs
left a comment
There was a problem hiding this comment.
@yoavGrs reviewed 1 file and all commit messages, made 3 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on ArielElp).
crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):
Previously, ArielElp wrote…
Not sure if this was what you meant, but I moved read_hashes into the first step of clear_pending, so the main loop is more straightforward now, as you suggested
Yours is better.
crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):
Previously, ArielElp wrote…
This is what happens in schedule_binary_child though, it's nicer if all these considerations are kept inside IMO
OK, non-blocking anyway
crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):
Previously, ArielElp wrote…
Asked Cursor to compare the old (pre this PR) and the pythonic implementation, it sounds like if anything we're collecting less unneccessary nodes:
Binary children with mixed “modified / sibling” subtrees
Python always enqueues both left and right subtrees when the parent has non-emptyleaf_indices(unlessheight - 1 == 0, then it usesset_leaffor both). The sibling child is enqueued withleaf_indices == []; the next layer still loads that node from DB, then treats it as a binary sibling (hash only, no children).Your Rust only enqueues children where
!is_unmodified(). Ifis_unmodifiedmeans “no leaf indices in this subtree,” that skips enqueueing and can skip the extramgetfor that child’s preimage compared to Python, which always does that read for binary siblings. That matches the spirit of the Rust TODO (“Match the python logic of fetching the nodes”) if the goal is identical witness sets and round-trips.Edge siblings
Python: edge withleaf_indices == []still reads and stores the edge inprefetched_nodes, thencontinue(no bottom traversal). Rust: analogous subtree handling depends on howFactsSubTree/get_bottom_subtreetreat “sibling” vs “modified.”Default / empty leaves on edges
Rust explicitly fillsempty_leaves_indiceswithL::default(). Python’sfetch_witnessespre-fills every requested index withempty_leafand only overwrites aftermgeton modified leaf keys—so “empty under edge but not bottom” may be represented differently unless your Python callers rely on the same invariants.Root wrapper
Python’s public entry isPatriciaTree.fetch_witnesses(tree height, empty root check, full leaf indices). Your outerfetch_patricia_pathsmatches that shape (check empty indices + empty root, build main subtree, call inner).
IMO we should stop treating the python implementation as the source of truth, we rewrote the hints, makes sense for the algorithms to slightly change.
Agreed. We need to evaluate the efficiency of this code on its own, and stop comparing it to the python version.
ArielElp
left a comment
There was a problem hiding this comment.
@ArielElp resolved 1 discussion.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on yoavGrs).
dorimedini-starkware
left a comment
There was a problem hiding this comment.
@dorimedini-starkware reviewed 5 files and all commit messages, and made 6 comments.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on ArielElp and yoavGrs).
crates/starknet_committer/src/db/trie_traversal.rs line 111 at r4 (raw file):
let mut pending_binary: Vec<(HashOutput, NodeIndex, NodeIndex)> = Vec::new(); // Pending edge preimage entries: (node_hash, path, bottom_index). let mut pending_edge: Vec<(HashOutput, PathToBottom, NodeIndex)> = Vec::new();
why do you need the path to bottom as well as the bottom index? to be able to compute the top index?
Code quote:
let mut pending_edge: Vec<(HashOutput, PathToBottom, NodeIndex)> = Vec::new();crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):
leaves_map.insert(*index, L::default()); } }
here, and everywhere, it would be nice if you keep the original inline comments that existed in the old traversal function
Code quote:
if let Some(ref mut leaves_map) = leaves {
for index in empty_leaves_indices {
leaves_map.insert(*index, L::default());
}
}crates/starknet_committer/src/db/trie_traversal.rs line 233 at r4 (raw file):
hash_by_index.insert(subtree.get_root_index(), filled_root.hash); } Ok(())
isn't this equivalent?
Suggestion:
let filled_roots =
get_roots_from_storage::<L, Layout>(hash_only_subtrees, storage, key_context).await?;
for (filled_root, subtree) in filled_roots.into_iter().zip(hash_only_subtrees.iter()) {
hash_by_index.insert(subtree.get_root_index(), filled_root.hash);
}
Ok(())crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):
/// Loads hashes for `hash_only_subtrees`, then flushes [`PreimageMap`] entries that were waiting on /// child hashes from storage.
Suggestion:
/// Loads hashes for `hash_only_subtrees` into `hash_by_index`, then flushes [`PreimageMap`] entries that were waiting on
/// child hashes from storage.crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):
false } _ => true,
is it possible to reach a situation where one child is Some and the other is None?
if not, consider panicking in this case explicitly, instead of just silently keeping the node in pending state
Code quote:
_ => true,crates/starknet_committer/src/patricia_merkle_tree/tree.rs line 57 at r4 (raw file):
/// Fetch the leaves in the contracts trie only, to be able to get the storage root hashes. /// Assumption: `contract_sorted_leaf_indices` contains all `contract_storage_sorted_leaf_indices` /// keys.
hmm... we currently also assume the storage is in fact layout, right? why isn't this documented?
Code quote:
/// Fetch all tries patricia paths given the modified leaves.
/// Fetch the leaves in the contracts trie only, to be able to get the storage root hashes.
/// Assumption: `contract_sorted_leaf_indices` contains all `contract_storage_sorted_leaf_indices`
/// keys.1d0e779 to
da2b238
Compare
ArielElp
left a comment
There was a problem hiding this comment.
@ArielElp reviewed 2 files, made 6 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on dorimedini-starkware).
crates/starknet_committer/src/db/trie_traversal.rs line 111 at r4 (raw file):
Previously, dorimedini-starkware wrote…
why do you need the path to bottom as well as the bottom index? to be able to compute the top index?
bottom index to find the hash of the bottom node in hash_by_index in the next iteration (when clearing the pending list)
path to bottom because it is part of the preimage
crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):
Previously, dorimedini-starkware wrote…
here, and everywhere, it would be nice if you keep the original inline comments that existed in the old traversal function
Done.
crates/starknet_committer/src/db/trie_traversal.rs line 233 at r4 (raw file):
Previously, dorimedini-starkware wrote…
isn't this equivalent?
Done.
crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):
Previously, dorimedini-starkware wrote…
is it possible to reach a situation where one child is
Someand the other isNone?
if not, consider panicking in this case explicitly, instead of just silently keeping the node in pending state
It is possible in index layout if one child is modified and the other isn't. The unmodified one is added to next_hash_only, which is read in the same iteration in the beginning of clear_pending_nodes. The modified one will be read in the next iteration via the regular traversal path.
crates/starknet_committer/src/patricia_merkle_tree/tree.rs line 57 at r4 (raw file):
Previously, dorimedini-starkware wrote…
hmm... we currently also assume the storage is in fact layout, right? why isn't this documented?
This function is abstracted too a bit higher up the stack
crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):
/// Loads hashes for `hash_only_subtrees`, then flushes [`PreimageMap`] entries that were waiting on /// child hashes from storage.
Done.
dorimedini-starkware
left a comment
There was a problem hiding this comment.
@dorimedini-starkware reviewed 1 file and all commit messages, made 3 comments, and resolved 4 discussions.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on ArielElp).
crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):
Previously, ArielElp wrote…
Done.
no, there was a comment just above the for index in empty_leaves_indices line
crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):
Previously, ArielElp wrote…
Done.
in read_hashes yes, but not here
crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):
Previously, ArielElp wrote…
It is possible in index layout if one child is modified and the other isn't. The unmodified one is added to next_hash_only, which is read in the same iteration in the beginning of clear_pending_nodes. The modified one will be read in the next iteration via the regular traversal path.
can you add a comment explaining each of the three cases in which we return true?
ArielElp
left a comment
There was a problem hiding this comment.
@ArielElp reviewed 1 file and all commit messages, and made 3 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on dorimedini-starkware).
crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):
Previously, dorimedini-starkware wrote…
no, there was a comment just above the
for index in empty_leaves_indicesline
Done
crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):
Previously, dorimedini-starkware wrote…
in
read_hashesyes, but not here
Done.
crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):
Previously, dorimedini-starkware wrote…
can you add a comment explaining each of the three cases in which we return
true?
Done.
da2b238 to
b4d157b
Compare
dorimedini-starkware
left a comment
There was a problem hiding this comment.
@dorimedini-starkware reviewed 1 file and all commit messages, and resolved 2 discussions.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on ArielElp).

No description provided.