Tutorial step to lower usage of accumulation storage #1
Conversation
alxmirap left a comment
Thanks for putting this together, Emeric. I think this probably combines a few things that could benefit from being in different PRs, like 1) adding new functionality and 2) separating code out from previous versions.
I think it would be good to make a clear distinction between the service's v1 and v2. v1 is instructive, showing the basic capability of Jam to hold its own state and use it Ethereum-style, but the point made in the tutorial that this is not efficient is very good: it justifies why we have a v2.
v1 should be much more accessible, and good as an introduction for newcomers to understand the basic service design and the difference between refine and accumulation, while v2 would be much more geared towards better-engineered solutions and our favoured design approach.
To this end, it should introduce the notion of a builder, using Jam as a rollup with a commitment to raw state kept in the builder, and using refine for the verification of operations, not for calculating their outcome.
Also, it should demonstrate how to leverage the D3L by importing and exporting segments.
At this stage, pre-images might be a complication that we could introduce later, as there is not a very good use-case for them yet.
As for the code itself, I think we do well by having a fully working solution, but I don't think we should highlight, or even refer to, all of it in the tutorial itself. For example, I think the State implementation with a Merkle tree is probably more involved and complex than it needs to be in a tutorial.
We also have a service, a builder, a builder state, and a service state that shares a lot with the builder state. We could simplify this, or at least conveniently bound it in our discussion.
For instruction, it would probably be enough to refer to some abstract properties of this state:
- it is a commitment to external state kept somewhere else
- it evolves by simply checking the transition's starting point is our current state, and updating it to the transition's next
- we verify in refinement that the state transition matches the operations we declared with it
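Those abstract properties are compact enough to sketch. The following is an illustrative model only; `StateRoot`, `Transition`, and `apply` are hypothetical names, not the tutorial's actual types:

```rust
// Hypothetical sketch of the abstract state described above: the service
// keeps only a commitment (a root hash) to external state held elsewhere.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct StateRoot([u8; 32]);

struct Transition {
    prev: StateRoot, // the root this transition starts from
    next: StateRoot, // the root it commits to after its operations
}

// The state evolves by checking that the transition's starting point is
// our current root, then adopting the transition's declared next root.
fn apply(current: &mut StateRoot, t: &Transition) -> Result<(), ()> {
    if *current != t.prev {
        return Err(()); // transition is not anchored at our current head
    }
    *current = t.next;
    Ok(())
}
```

Verifying in refinement that the transition matches its declared operations would sit on top of this, but the accumulation-side check is just this anchor comparison.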
```rust
        //)
        .get_matches();

    let Some(input_path) = matches.get_one::<PathBuf>("input") else {
```
Does this allow us to have several "input" args in the command? What will the program do in that case? Shouldn't we enforce only one of each kind?
Since there is no parameter name, the argument is positional (this is arg[1] and output is arg[2]); trying to pass an arg[3] emits an error, so I guess there can only be one (I am not too familiar with clap).
Checking on the head parameter, it also only allows a single one (I think clap infers multiplicity from the type, e.g. a Vec<V> implies multiple V values).
```rust
    )
    .arg(
        arg!(
            --head <String> "Overload root hash for this state transition (rather than using db head)"
```
I don't quite understand what you mean by "overload" here. Could you make it a little clearer, please?
```rust
if op.exported_segments.len() > 0 || op.processed_segments.len() > 0 {
    // We store hash of segment content with a reference count.
    let mut buffed_segments: BTreeMap<Hash, u64> =
        get("buffed_segments").unwrap_or(BTreeMap::new());
```
Maybe cached_segments instead? 'buffed' has a connotation of somehow 'improved' or 'polished'.
```rust
        buffed_segments.remove(p);
    }
}
for p in op.exported_segments {
```
I am confused as to why we have reference counting here.
Should it be possible for a segment to be exported by more than one item?
Perhaps we want something simpler in a tutorial, and ignore this low-level management?
I was thinking more in terms of a package exporting some segments, then another package, which makes the first a prerequisite, importing segments from it.
> Should it be possible for a segment to be exported by more than one item?

The hash is the hash of the segment content, so if we export two segments with the same content we produce the same hash; that is why we need the reference count.
Actually, given what we put in the segments, this may never happen, but being safe here makes things simpler, I think.
> Perhaps we want something simpler in a tutorial, and ignore this low-level management?

I added a note about it at the end of the markdown description, but I find it a bit more complex to leave out, as one needs to fully understand the import/export trust model.
At first I wanted to store the origin of the segment (either the segment Merkle root and index, or the work-item hash and index), but I found no way to access it from refinement or accumulation; then I thought about it a bit and realized this kind of tracking is actually simpler.
The main point of the tracking is to avoid double imports (though in the markdown I mention it is not useful for this particular use case).
I am not too sure here; I will give it some thought.
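The reference counting under discussion can be sketched roughly as follows. This is an illustrative model only; `export_segment` and `import_segment` are assumed names, not the PR's actual functions:

```rust
use std::collections::BTreeMap;

type Hash = [u8; 32];

// Segments are keyed by the hash of their content, so two exports with
// identical content collide on the same key; a count keeps track of both.
fn export_segment(buffed: &mut BTreeMap<Hash, u64>, h: Hash) {
    *buffed.entry(h).or_insert(0) += 1;
}

// Importing decrements the count and rejects a hash we no longer hold,
// which is what prevents a double import.
fn import_segment(buffed: &mut BTreeMap<Hash, u64>, h: &Hash) -> bool {
    let Some(c) = buffed.get_mut(h) else { return false };
    *c -= 1;
    if *c == 0 {
        buffed.remove(h);
    }
    true
}
```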
```rust
fn set_balance(&mut self, account: AccountId, token_id: TokenId, balance: u64) {
    let to_key = token_ledger::api::balance_key(token_id, &account);
    if !self.balances.set(to_key.to_vec(), balance) {
        unimplemented!("error on key collision");
```
Why would we have a key collision here? Can't we replace an existing balance?
Or do we want to ensure that is done exclusively through transitions?
We store over a u15 index in a binary tree; I think I describe this point in another comment.
I have to give it some thought: initially I thought some custom small tree would be fun, but I am not too sure anymore.
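To illustrate why a collision is possible at all when keys are addressed through a small index, here is a toy model (not the PR's tree; `short_index` and `SmallTree` are made-up names, and the index derivation is invented for the example):

```rust
use std::collections::BTreeMap;

// Keys are addressed by a short index derived from the full key, so two
// distinct keys can land on the same slot. Replacing the value for the
// same key is fine; a different key on an occupied slot must fail.
fn short_index(key: &[u8]) -> u16 {
    // toy derivation: fold the key bytes into 15 bits
    key.iter()
        .fold(0u16, |acc, b| acc.wrapping_mul(31).wrapping_add(*b as u16))
        & 0x7FFF
}

struct SmallTree {
    slots: BTreeMap<u16, (Vec<u8>, u64)>, // index -> (full key, value)
}

impl SmallTree {
    fn set(&mut self, key: Vec<u8>, value: u64) -> bool {
        let idx = short_index(&key);
        if let Some((existing, _)) = self.slots.get(&idx) {
            if *existing != key {
                return false; // distinct key hit an occupied slot: collision
            }
        }
        self.slots.insert(idx, (key, value)); // insert, or replace same key
        true
    }
}
```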
```rust
}

// fail on key collision by returning false
pub fn set(&mut self, k: Vec<u8>, v: V) -> bool {
```
We seem to have this function both in here and in state.rs, with almost no differences.
There is probably more repeated code, and it would be better to avoid this repetition.
At first I did not have this repeated code (a single crate for builder and state, with a feature flag skipping the witness record). Then I switched to this to keep things simple. I think with my latest change in the builder I may be able to do something about it (I record the witness in a simpler way and may just be able to call into state afterwards).
```rust
    );
}

fn process_transfer<S: StateOps>(
```
Is this still relevant? I was under the impression this should now be handled exclusively in the builder. The advantage of using the root state transition in the Jam service is that we can operate as a roll-up, and all transfers and state computation are done in the L2 (the builder).
Refinement and accumulation should be the L1, and worry only about correctness. So we might have a verify balance in transition.rs, but probably not a set_balance or process_transfer.
set_balance and process_transfer are needed so we can run other operations afterwards; indeed, we could skip writing the last one.
We also want to calculate the resulting state root to store it during accumulation, and this calculation is done while updating the values (really not optimal, but simple). We cannot trust the builder to provide us the resulting root (we only trust the starting partial state, since we can validate it during accumulation against the previously known state root).
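The point about recomputing the root ourselves can be illustrated with a toy model. A flat hash of the entries stands in for the Merkle root, and all names here are hypothetical, not the tutorial's actual code:

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Toy stand-in for the Merkle root: hash all (key, value) pairs in order.
fn state_root(state: &BTreeMap<Vec<u8>, u64>) -> u64 {
    let mut h = DefaultHasher::new();
    for (k, v) in state {
        k.hash(&mut h);
        v.hash(&mut h);
    }
    h.finish()
}

// The service applies the builder's operations to the verified starting
// state and derives the resulting root itself, instead of trusting a
// root the builder claims.
fn apply_transfers(
    state: &mut BTreeMap<Vec<u8>, u64>,
    ops: &[(Vec<u8>, Vec<u8>, u64)], // (from, to, amount)
) -> Option<u64> {
    for (from, to, amount) in ops {
        let src = state.get_mut(from)?;
        *src = src.checked_sub(*amount)?; // fail on insufficient balance
        *state.entry(to.clone()).or_insert(0) += amount;
    }
    Some(state_root(state)) // root computed by us, not by the builder
}
```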
Co-authored-by: alxmirap <alexandre.pinto@parity.io>
…c into cheme/externalstate
```rust
}
let witness_root = *result.root();
for (key, value) in witness_key_values.into_iter() {
    result.set(key, value);
```
Related to my last comment (about why we record accessed hashes): here `result.set` will update the state root, so when we init from the witness, if the values are not correct we will get a different root.
When inserting a value, the new root is calculated against every sibling previously inserted from the witness.
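A toy model of that check (a flat hash of the entries stands in for the incremental Merkle root; `root_of` and `init_from_witness` are hypothetical names):

```rust
use std::collections::BTreeMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Flat stand-in for the Merkle root over the reconstructed entries.
fn root_of(entries: &BTreeMap<Vec<u8>, u64>) -> u64 {
    let mut h = DefaultHasher::new();
    for (k, v) in entries {
        k.hash(&mut h);
        v.hash(&mut h);
    }
    h.finish()
}

// Rebuild state from the witness and compare roots: any tampered witness
// value produces a mismatching root and the init is rejected.
fn init_from_witness(
    expected_root: u64,
    witness: &[(Vec<u8>, u64)],
) -> Option<BTreeMap<Vec<u8>, u64>> {
    let mut result = BTreeMap::new();
    for (k, v) in witness {
        result.insert(k.clone(), *v);
    }
    if root_of(&result) == expected_root {
        Some(result)
    } else {
        None
    }
}
```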
* Creates a binary to start converting a user-friendly list of unsigned operations into a list of signed operations
* Small fixes to the construction of the signed list of operations.
* Various refactorings and some cleanup:
  - renamed `token-ledger-v1` to `token-ledger-service-v1`, to maintain the parallel with v2
  - in the Tutorial, updated an example command since `cargo run` no longer requires explicit `-i` or `-o` options. However, there are some more changes in this command, and the documentation will all have to be reviewed later.
  - added a paragraph describing the purpose of a new binary to convert user-friendly JSON operations into fully-specified ones.
  - some changes to the justfile to enable it to be invoked from any location, not only from that of the `justfile`. The commands build-service, create-service, query-service and submit-file, and the functions to get and save the last service id, have been successfully tested
  - added a command to connect to an RPC node, for the moment still unhandled. Ensures that we must have either this or an output specified.
  - extracts some functions in main.rs to make the code more readable.
* Formatting
* Adds options for connecting to an RPC node, and to build and submit a WorkPackage directly to it. WiP: still needs some refactoring to reduce the size of long functions.
* Minor code adjustments
* Formatting and clippy
* Fixes the calculation of max core
* Moves crates to v1 and v2 folders. Also, the builder has new features:
  - ability to receive a user-friendly JSON file without signatures and valid AccountIds (i.e. public keys)
  - ability to connect to an RPC node and submit a Work Package directly, without having to encode it first.
…, adds an introductory section on PVM debugging (#3)
* Add a fix to require 32K min stack size
* Reviews and extends logging for easier debugging.
* Fixes the signing and verification code to use the same admin key. Moves the basic structs and functions to token-ledger-common.
* A couple more payloads useful for simple tests
* Some notes related to debugging
* Formatting
* Fixes some typos
* Fix duplicate code added in the merge
…ity (#4)
* Moves the basic structs and functions to token-ledger-common.
* Adds an Extrinsic mode, to deliver the witness via extrinsic and outside the payload. This also includes a deep refactoring of the code, and commenting out everything related to pre-images.
* Extracts the concept of Execution Mode, and refactors it to recognise three types of packages:
  - Immediate: the logic of the package's work item is executed immediately, and both refinement and accumulation are executed.
  - Deferring: the logic of the package is verified to be valid, but accumulation is not executed. Instead, the operations and state are exported to the D3L.
  - Deferred: the package does not carry any operation or witness. These are retrieved from the D3L during refinement, and then the accumulation logic is executed, completing the operation started in the Deferring package.
* The builder has been expanded to create and submit the two related packages at the same time.
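The three package types described above could be modelled roughly like this (a sketch with assumed names, not the actual code):

```rust
// Sketch of the Execution Mode concept (names assumed for illustration).
#[derive(Debug, PartialEq)]
enum ExecutionMode {
    Immediate, // refine and accumulate in the same package
    Deferring, // verify only; export operations and state to the D3L
    Deferred,  // re-import from the D3L during refine, then accumulate
}

// Whether a package of this mode runs the accumulation logic.
fn runs_accumulation(mode: &ExecutionMode) -> bool {
    matches!(mode, ExecutionMode::Immediate | ExecutionMode::Deferred)
}
```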
… to follow.
* Reviews the tutorial write-up on handling segments
* Removes the --extrinsic option. Now, we create the extrinsic only in the --connect-rpc mode.
* Small rewrites in the section about WorkPackages and WorkItems
This PR moves the processing to an external state, making things more suitable.
This is currently in Draft, as it is very rough and the text is really not finished. (I would really appreciate help with the write-up.)
There are multiple 'Mode's:
@alxmirap I did not remove the preimage as I told you before; I will likely do so, but since it has a lot in common with the segment version I wonder if it can be OK (likely not, I will remove it later).