Skip to content

starknet_committer: support both layouts in fetch_patricia_paths#13958

Merged
ArielElp merged 1 commit into
mainfrom
ariel/refactor_fetch_patricia_paths
May 25, 2026
Merged

starknet_committer: support both layouts in fetch_patricia_paths#13958
ArielElp merged 1 commit into
mainfrom
ariel/refactor_fetch_patricia_paths

Conversation

@ArielElp
Copy link
Copy Markdown
Contributor

@ArielElp ArielElp commented May 5, 2026

No description provided.

@reviewable-StarkWare
Copy link
Copy Markdown

This change is Reviewable

Copy link
Copy Markdown
Contributor Author

ArielElp commented May 5, 2026

@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch from 1ace527 to 61f0e7e Compare May 5, 2026 08:51
@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch from 61f0e7e to b6f331e Compare May 7, 2026 12:02
@ArielElp ArielElp marked this pull request as ready for review May 7, 2026 14:31
@cursor
Copy link
Copy Markdown

cursor Bot commented May 7, 2026

PR Summary

Medium Risk
Changes the core Patricia witness-fetching traversal logic and how child hashes/preimages are assembled, which can affect proof correctness across different DB layouts. While covered by tests, subtle traversal or hashing-order bugs could break state proof generation.

Overview
Moves fetch_patricia_paths/fetch_patricia_paths_inner out of facts_db into db/trie_traversal.rs and makes them generic over NodeLayout, enabling use with both facts and index layouts.

Refactors traversal to explicitly handle layout-dependent child traversal by queuing unmodified subtrees for hash-only reads and flushing pending binary/edge preimages once child hashes are available, and updates call sites (e.g., patricia_merkle_tree/tree.rs) and tests accordingly.

Reviewed by Cursor Bugbot for commit b4d157b. Bugbot is set up for automated code reviews on this repo. Configure here.

Copy link
Copy Markdown
Contributor

@yoavGrs yoavGrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoavGrs made 1 comment.
Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on ArielElp).


a discussion (no related file):
Please separate moving the file from the changes made inside it. It’ll make it easier for me to review the changes.

Copy link
Copy Markdown
Contributor

@yoavGrs yoavGrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoavGrs reviewed 1 file, made 6 comments, and resolved 1 discussion.
Reviewable status: 1 of 4 files reviewed, 6 unresolved discussions (waiting on ArielElp).


crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):

}

// TODO(Rotem): Match the python logic of fetching the nodes.

What about this TODO?
On this point, is there a note in the test that we are collecting too many nodes?

Code quote:

// TODO(Rotem): Match the python logic of fetching the nodes.

crates/starknet_committer/src/db/trie_traversal.rs line 57 at r1 (raw file):

#[cfg(test)]
#[path = "facts_db/traversal_test.rs"]

The test should be beside the corresponding module.

Code quote:

"facts_db/traversal_test.rs"

crates/starknet_committer/src/db/trie_traversal.rs line 61 at r1 (raw file):

/// Logs out a warning of a trivial modification.
macro_rules! log_trivial_modification {

Move it down, please.
(Top-down Ordering)

Code quote:

macro_rules! log_trivial_modification 

crates/starknet_committer/src/db/trie_traversal.rs line 67 at r1 (raw file):

}

/// Returns the Patricia inner nodes ([PreimageMap]) in the paths to the given `leaf_indices` in the

With the siblings?

Code quote:

in the paths to the given `leaf_indices`

crates/starknet_committer/src/db/trie_traversal.rs line 101 at r1 (raw file):

/// Required for `patricia_update` function in Cairo.
/// Extra preimages (more than the data required to verify merkle paths) are required to verify
/// correctness of final tree topology; for more details see 'traverse_edge' in 'patricia.cairo'.

Not in this repo

Code quote:

 see 'traverse_edge' in 'patricia.cairo'

crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):

    while !current_traversal.is_empty() || !current_hash_only.is_empty() {
        read_hashes::<L, Layout>(&current_hash_only, storage, key_context, &mut hash_by_index)
            .await?;

Can you read all the hashes for the waiting hash-only vector outside the loop?

Code quote:

        read_hashes::<L, Layout>(&current_hash_only, storage, key_context, &mut hash_by_index)
            .await?;

@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch 2 times, most recently from 22de1c4 to 7cb0ac4 Compare May 17, 2026 07:54
Copy link
Copy Markdown
Contributor Author

@ArielElp ArielElp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArielElp partially reviewed 2 files and made 6 comments.
Reviewable status: 2 of 5 files reviewed, 6 unresolved discussions (waiting on yoavGrs).


crates/starknet_committer/src/db/trie_traversal.rs line 57 at r1 (raw file):

Previously, yoavGrs wrote…

The test should be beside the corresponding module.

Tests are moved to their own module in the 3'rd PR in this stack


crates/starknet_committer/src/db/trie_traversal.rs line 61 at r1 (raw file):

Previously, yoavGrs wrote…

Move it down, please.
(Top-down Ordering)

Move above first usage


crates/starknet_committer/src/db/trie_traversal.rs line 67 at r1 (raw file):

Previously, yoavGrs wrote…

With the siblings?

Yes, added to the comment


crates/starknet_committer/src/db/trie_traversal.rs line 101 at r1 (raw file):

Previously, yoavGrs wrote…

Not in this repo

It's part of cairo zero's common library, what do you suggest?


crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):

Previously, yoavGrs wrote…

Can you read all the hashes for the waiting hash-only vector outside the loop?

Indices are learned dynamically, so it doesn't sound possible


crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):

Previously, yoavGrs wrote…

What about this TODO?
On this point, is there a note in the test that we are collecting too many nodes?

Asked Cursor to compare the old (pre this PR) and the pythonic implementation, it sounds like if anything we're collecting less unneccessary nodes:

  1. Binary children with mixed “modified / sibling” subtrees
    Python always enqueues both left and right subtrees when the parent has non-empty leaf_indices (unless height - 1 == 0, then it uses set_leaf for both). The sibling child is enqueued with leaf_indices == []; the next layer still loads that node from DB, then treats it as a binary sibling (hash only, no children).

    Your Rust only enqueues children where !is_unmodified(). If is_unmodified means “no leaf indices in this subtree,” that skips enqueueing and can skip the extra mget for that child’s preimage compared to Python, which always does that read for binary siblings. That matches the spirit of the Rust TODO (“Match the python logic of fetching the nodes”) if the goal is identical witness sets and round-trips.

  2. Edge siblings
    Python: edge with leaf_indices == [] still reads and stores the edge in prefetched_nodes, then continue (no bottom traversal). Rust: analogous subtree handling depends on how FactsSubTree / get_bottom_subtree treat “sibling” vs “modified.”

  3. Default / empty leaves on edges
    Rust explicitly fills empty_leaves_indices with L::default(). Python’s fetch_witnesses pre-fills every requested index with empty_leaf and only overwrites after mget on modified leaf keys—so “empty under edge but not bottom” may be represented differently unless your Python callers rely on the same invariants.

  4. Root wrapper
    Python’s public entry is PatriciaTree.fetch_witnesses (tree height, empty root check, full leaf indices). Your outer fetch_patricia_paths matches that shape (check empty indices + empty root, build main subtree, call inner).


IMO we should stop treating the python implementation as the source of truth, we rewrote the hints, makes sense for the algorithms to slightly change.

Copy link
Copy Markdown
Contributor

@yoavGrs yoavGrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoavGrs reviewed 3 files and all commit messages, made 3 comments, and resolved 3 discussions.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on ArielElp).


crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):

Previously, ArielElp wrote…

Indices are learned dynamically, so it doesn't sound possible

This is my suggestion:

while !current_traversal.is_empty() {
  let filled_roots = ...;
  for (filled_root, subtree) in filled_roots {
    ...
  }
}
read_hashes::<L, Layout>(&current_hash_only, storage);

crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):

                        } else {
                            pending_binary.push((filled_root.hash, left_index, right_index));
                        }

I would match here on Layout::SubTree::should_traverse_unmodified_child

Code quote:

                        let left_hash = schedule_binary_child::<L, Layout>(
                            left_data,
                            left_subtree,
                            &mut hash_by_index,
                            &mut next_traversal,
                            &mut next_hash_only,
                        );
                        let right_hash = schedule_binary_child::<L, Layout>(
                            right_data,
                            right_subtree,
                            &mut hash_by_index,
                            &mut next_traversal,
                            &mut next_hash_only,
                        );

                        if let (Some(left_hash), Some(right_hash)) = (left_hash, right_hash) {
                            witnesses.insert(
                                filled_root.hash,
                                Preimage::Binary(BinaryData {
                                    left_data: left_hash,
                                    right_data: right_hash,
                                }),
                            );
                        } else {
                            pending_binary.push((filled_root.hash, left_index, right_index));
                        }

crates/starknet_committer/src/db/trie_traversal.rs line 191 at r3 (raw file):

                            }
                            None => {
                                pending_edge.push((filled_root.hash, path_to_bottom, bottom_index));

Same suggestion here.

Code quote:

                        let bottom_hash = schedule_edge_child::<L, Layout>(
                            bottom_data,
                            bottom_subtree,
                            &mut hash_by_index,
                            &mut next_traversal,
                        );

                        match bottom_hash {
                            Some(hash) => {
                                witnesses.insert(
                                    filled_root.hash,
                                    Preimage::Edge(EdgeData { bottom_data: hash, path_to_bottom }),
                                );
                            }
                            None => {
                                pending_edge.push((filled_root.hash, path_to_bottom, bottom_index));

@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch from 7cb0ac4 to 1d0e779 Compare May 18, 2026 10:16
Copy link
Copy Markdown
Contributor Author

@ArielElp ArielElp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArielElp partially reviewed 1 file and made 3 comments.
Reviewable status: 4 of 5 files reviewed, 5 unresolved discussions (waiting on yoavGrs).


crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):

Previously, yoavGrs wrote…

This is my suggestion:

while !current_traversal.is_empty() {
  let filled_roots = ...;
  for (filled_root, subtree) in filled_roots {
    ...
  }
}
read_hashes::<L, Layout>(&current_hash_only, storage);

Not sure if this was what you meant, but I moved read_hashes into the first step of clear_pending, so the main loop is more straightforward now, as you suggested


crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):

Previously, yoavGrs wrote…

I would match here on Layout::SubTree::should_traverse_unmodified_child

This is what happens in schedule_binary_child though, it's nicer if all these considerations are kept inside IMO


crates/starknet_committer/src/db/trie_traversal.rs line 191 at r3 (raw file):

Previously, yoavGrs wrote…

Same suggestion here.

Same as above

Copy link
Copy Markdown
Contributor

@yoavGrs yoavGrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yoavGrs reviewed 1 file and all commit messages, made 3 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on ArielElp).


crates/starknet_committer/src/db/trie_traversal.rs line 129 at r1 (raw file):

Previously, ArielElp wrote…

Not sure if this was what you meant, but I moved read_hashes into the first step of clear_pending, so the main loop is more straightforward now, as you suggested

Yours is better.


crates/starknet_committer/src/db/trie_traversal.rs line 163 at r3 (raw file):

Previously, ArielElp wrote…

This is what happens in schedule_binary_child though, it's nicer if all these considerations are kept inside IMO

OK, non-blocking anyway


crates/starknet_committer/src/db/facts_db/traversal.rs line 53 at r1 (raw file):

Previously, ArielElp wrote…

Asked Cursor to compare the old (pre this PR) and the pythonic implementation, it sounds like if anything we're collecting less unneccessary nodes:

  1. Binary children with mixed “modified / sibling” subtrees
    Python always enqueues both left and right subtrees when the parent has non-empty leaf_indices (unless height - 1 == 0, then it uses set_leaf for both). The sibling child is enqueued with leaf_indices == []; the next layer still loads that node from DB, then treats it as a binary sibling (hash only, no children).

    Your Rust only enqueues children where !is_unmodified(). If is_unmodified means “no leaf indices in this subtree,” that skips enqueueing and can skip the extra mget for that child’s preimage compared to Python, which always does that read for binary siblings. That matches the spirit of the Rust TODO (“Match the python logic of fetching the nodes”) if the goal is identical witness sets and round-trips.

  2. Edge siblings
    Python: edge with leaf_indices == [] still reads and stores the edge in prefetched_nodes, then continue (no bottom traversal). Rust: analogous subtree handling depends on how FactsSubTree / get_bottom_subtree treat “sibling” vs “modified.”

  3. Default / empty leaves on edges
    Rust explicitly fills empty_leaves_indices with L::default(). Python’s fetch_witnesses pre-fills every requested index with empty_leaf and only overwrites after mget on modified leaf keys—so “empty under edge but not bottom” may be represented differently unless your Python callers rely on the same invariants.

  4. Root wrapper
    Python’s public entry is PatriciaTree.fetch_witnesses (tree height, empty root check, full leaf indices). Your outer fetch_patricia_paths matches that shape (check empty indices + empty root, build main subtree, call inner).


IMO we should stop treating the python implementation as the source of truth, we rewrote the hints, makes sense for the algorithms to slightly change.

Agreed. We need to evaluate the efficiency of this code on its own, and stop comparing it to the python version.

Copy link
Copy Markdown
Contributor Author

@ArielElp ArielElp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArielElp resolved 1 discussion.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on yoavGrs).

Copy link
Copy Markdown
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorimedini-starkware reviewed 5 files and all commit messages, and made 6 comments.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on ArielElp and yoavGrs).


crates/starknet_committer/src/db/trie_traversal.rs line 111 at r4 (raw file):

    let mut pending_binary: Vec<(HashOutput, NodeIndex, NodeIndex)> = Vec::new();
    // Pending edge preimage entries: (node_hash, path, bottom_index).
    let mut pending_edge: Vec<(HashOutput, PathToBottom, NodeIndex)> = Vec::new();

why do you need the path to bottom as well as the bottom index? to be able to compute the top index?

Code quote:

let mut pending_edge: Vec<(HashOutput, PathToBottom, NodeIndex)> = Vec::new();

crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):

                            leaves_map.insert(*index, L::default());
                        }
                    }

here, and everywhere, it would be nice if you keep the original inline comments that existed in the old traversal function

Code quote:

                    if let Some(ref mut leaves_map) = leaves {
                        for index in empty_leaves_indices {
                            leaves_map.insert(*index, L::default());
                        }
                    }

crates/starknet_committer/src/db/trie_traversal.rs line 233 at r4 (raw file):

        hash_by_index.insert(subtree.get_root_index(), filled_root.hash);
    }
    Ok(())

isn't this equivalent?

Suggestion:

    let filled_roots =
        get_roots_from_storage::<L, Layout>(hash_only_subtrees, storage, key_context).await?;
    for (filled_root, subtree) in filled_roots.into_iter().zip(hash_only_subtrees.iter()) {
        hash_by_index.insert(subtree.get_root_index(), filled_root.hash);
    }
    Ok(())

crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):

/// Loads hashes for `hash_only_subtrees`, then flushes [`PreimageMap`] entries that were waiting on
/// child hashes from storage.

Suggestion:

/// Loads hashes for `hash_only_subtrees` into `hash_by_index`, then flushes [`PreimageMap`] entries that were waiting on
/// child hashes from storage.

crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):

                false
            }
            _ => true,

is it possible to reach a situation where one child is Some and the other is None?
if not, consider panicking in this case explicitly, instead of just silently keeping the node in pending state

Code quote:

 _ => true,

crates/starknet_committer/src/patricia_merkle_tree/tree.rs line 57 at r4 (raw file):

/// Fetch the leaves in the contracts trie only, to be able to get the storage root hashes.
/// Assumption: `contract_sorted_leaf_indices` contains all `contract_storage_sorted_leaf_indices`
/// keys.

hmm... we currently also assume the storage is in fact layout, right? why isn't this documented?

Code quote:

/// Fetch all tries patricia paths given the modified leaves.
/// Fetch the leaves in the contracts trie only, to be able to get the storage root hashes.
/// Assumption: `contract_sorted_leaf_indices` contains all `contract_storage_sorted_leaf_indices`
/// keys.

@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch from 1d0e779 to da2b238 Compare May 19, 2026 12:37
Copy link
Copy Markdown
Contributor Author

@ArielElp ArielElp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArielElp reviewed 2 files, made 6 comments, and resolved 2 discussions.
Reviewable status: all files reviewed, 6 unresolved discussions (waiting on dorimedini-starkware).


crates/starknet_committer/src/db/trie_traversal.rs line 111 at r4 (raw file):

Previously, dorimedini-starkware wrote…

why do you need the path to bottom as well as the bottom index? to be able to compute the top index?

bottom index to find the hash of the bottom node in hash_by_index in the next iteration (when clearing the pending list)

path to bottom because it is part of the preimage


crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):

Previously, dorimedini-starkware wrote…

here, and everywhere, it would be nice if you keep the original inline comments that existed in the old traversal function

Done.


crates/starknet_committer/src/db/trie_traversal.rs line 233 at r4 (raw file):

Previously, dorimedini-starkware wrote…

isn't this equivalent?

Done.


crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):

Previously, dorimedini-starkware wrote…

is it possible to reach a situation where one child is Some and the other is None?
if not, consider panicking in this case explicitly, instead of just silently keeping the node in pending state

It is possible in index layout if one child is modified and the other isn't. The unmodified one is added to next_hash_only, which is read in the same iteration in the beginning of clear_pending_nodes. The modified one will be read in the next iteration via the regular traversal path.


crates/starknet_committer/src/patricia_merkle_tree/tree.rs line 57 at r4 (raw file):

Previously, dorimedini-starkware wrote…

hmm... we currently also assume the storage is in fact layout, right? why isn't this documented?

This function is abstracted too a bit higher up the stack


crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):

/// Loads hashes for `hash_only_subtrees`, then flushes [`PreimageMap`] entries that were waiting on
/// child hashes from storage.

Done.

Copy link
Copy Markdown
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorimedini-starkware reviewed 1 file and all commit messages, made 3 comments, and resolved 4 discussions.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on ArielElp).


crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):

Previously, ArielElp wrote…

Done.

no, there was a comment just above the for index in empty_leaves_indices line


crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):

Previously, ArielElp wrote…

Done.

in read_hashes yes, but not here


crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):

Previously, ArielElp wrote…

It is possible in index layout if one child is modified and the other isn't. The unmodified one is added to next_hash_only, which is read in the same iteration in the beginning of clear_pending_nodes. The modified one will be read in the next iteration via the regular traversal path.

can you add a comment explaining each of the three cases in which we return true?

Copy link
Copy Markdown
Contributor Author

@ArielElp ArielElp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArielElp reviewed 1 file and all commit messages, and made 3 comments.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on dorimedini-starkware).


crates/starknet_committer/src/db/trie_traversal.rs line 168 at r4 (raw file):

Previously, dorimedini-starkware wrote…

no, there was a comment just above the for index in empty_leaves_indices line

Done


crates/starknet_committer/src/db/trie_traversal.rs line 237 at r4 (raw file):

Previously, dorimedini-starkware wrote…

in read_hashes yes, but not here

Done.


crates/starknet_committer/src/db/trie_traversal.rs line 258 at r4 (raw file):

Previously, dorimedini-starkware wrote…

can you add a comment explaining each of the three cases in which we return true?

Done.

@ArielElp ArielElp force-pushed the ariel/refactor_fetch_patricia_paths branch from da2b238 to b4d157b Compare May 20, 2026 09:03
Copy link
Copy Markdown
Collaborator

@dorimedini-starkware dorimedini-starkware left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dorimedini-starkware reviewed 1 file and all commit messages, and resolved 2 discussions.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on ArielElp).

@ArielElp ArielElp added this pull request to the merge queue May 25, 2026
Merged via the queue into main with commit 0b057bd May 25, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants