Implement TVM emulator#85
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds a native TVM emulator wrapper (FFI) and test suite, and updates existing emulator tests to use JSON-provided block parameters.
Changes:
- Introduce
tvm_emulator_*FFI APIs backed byton_vm::Engine(create/configure/run get-methods/send messages). - Add Rust tests covering TVM emulator lifecycle/configuration and basic method/message execution.
- Adjust JSON test data format for
rand_seedand update transaction test to readnow/lt/rand_seedfrom JSON.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| src/emulator/src/lib.rs | Implements the TVM emulator struct plus the exported C ABI functions and helpers (address parsing, stack (de)serialization, error responses). |
| src/emulator/src/tests/test_emulator.rs | Updates the transaction test to use JSON block parameters; adds a comprehensive set of new TVM emulator tests. |
| src/emulator/src/tests/4C90C139A5736F34EA3EEF62F0B06431719913835EA5A1B9173F20B2EF711583_66815326000001.json | Changes rand_seed from a JSON number to a JSON string to match consumers expecting a string. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| let (new_code, new_data, actions) = if let Some((c4, c5)) = vm.get_committed_state() { | ||
| ( | ||
| base64_encode(write_boc(&c4)?), | ||
| base64_encode(write_boc(&c5)?), | ||
| // c5 contains actions | ||
| base64_encode(write_boc(&c5)?), | ||
| ) | ||
| } else { | ||
| (String::new(), String::new(), String::new()) | ||
| }; | ||
|
|
||
| // Re-read committed state properly: c4=data, c5=actions | ||
| // get_committed_state returns (c4, c5) where c4 is new data and c5 is actions | ||
| let result = json!({ | ||
| "success": true, | ||
| "new_code": new_code, // Note: code doesn't change via TVM, but API expects it | ||
| "new_data": new_data, |
There was a problem hiding this comment.
get_committed_state() is documented (in your own comment) as returning (c4=new_data, c5=actions), but the current code encodes c5 into new_data and also encodes c4 into new_code. This makes new_data and new_code incorrect and duplicates actions. Fix by mapping new_data to c4, actions to c5, and set new_code to the original code (or an empty string) consistently with the intended API semantics.
| "actions": actions, | ||
| }); |
There was a problem hiding this comment.
get_committed_state() is documented (in your own comment) as returning (c4=new_data, c5=actions), but the current code encodes c5 into new_data and also encodes c4 into new_code. This makes new_data and new_code incorrect and duplicates actions. Fix by mapping new_data to c4, actions to c5, and set new_code to the original code (or an empty string) consistently with the intended API semantics.
| let gas_used = vm.gas_used(); | ||
| let result_stack = vm.withdraw_stack(); | ||
| let stack_boc = self.serialize_stack(&result_stack)?; |
There was a problem hiding this comment.
The code claims to serialize a TL-B VmStack, but the implementation writes a custom (depth, [refs...]) format and then attempts to decode each referenced cell with read_stack_item, which expects canonical TL-B encoding for a stack value (not your custom tagged format). This will produce stack outputs that external consumers can’t decode as VmStack, and tvm_emulator_run_get_method will return a stack field that is not what the API comment promises. Use the canonical VmStack TL-B encoding (or existing ton_vm/ton_block stack serialization helpers) for both serialization and deserialization, so the returned/accepted stack_boc matches the documented format.
| // VmStack TL-B: vm_stk_cons#_ {n:#} rest:^(VmStackList n) tos:VmStackValue = VmStack (n + 1); | ||
| // vm_stk_nil#_ = VmStackList 0; | ||
| // For simplicity, serialize depth + items as references | ||
| let mut builder = BuilderData::new(); | ||
| let depth = stack.storage.len() as u32; | ||
| builder.append_u32(depth)?; | ||
| for item in &stack.storage { | ||
| let cell = self.stack_item_to_cell(item)?; | ||
| builder.checked_append_reference(cell)?; | ||
| } | ||
| let cell = builder.into_cell()?; | ||
| Ok(base64_encode(write_boc(&cell)?)) | ||
| } | ||
|
|
||
| fn stack_item_to_cell(&self, item: &StackItem) -> Result<Cell> { | ||
| let mut builder = BuilderData::new(); | ||
| match item { | ||
| StackItem::None => { | ||
| builder.append_u8(0x00)?; | ||
| } | ||
| StackItem::Integer(int_data) => { | ||
| builder.append_u8(0x01)?; | ||
| // Store as i64 for simplicity | ||
| let val = int_data.as_integer_value(i64::MIN..=i64::MAX).unwrap_or(0); | ||
| builder.append_i64(val)?; | ||
| } | ||
| StackItem::Cell(cell) => { | ||
| builder.append_u8(0x03)?; | ||
| builder.checked_append_reference(cell.clone())?; | ||
| } | ||
| StackItem::Slice(slice) => { | ||
| builder.append_u8(0x04)?; | ||
| let cell = slice.cell_opt().cloned().unwrap_or_default(); | ||
| builder.checked_append_reference(cell)?; | ||
| } | ||
| _ => { | ||
| builder.append_u8(0x00)?; | ||
| } | ||
| } |
There was a problem hiding this comment.
The code claims to serialize a TL-B VmStack, but the implementation writes a custom (depth, [refs...]) format and then attempts to decode each referenced cell with read_stack_item, which expects canonical TL-B encoding for a stack value (not your custom tagged format). This will produce stack outputs that external consumers can’t decode as VmStack, and tvm_emulator_run_get_method will return a stack field that is not what the API comment promises. Use the canonical VmStack TL-B encoding (or existing ton_vm/ton_block stack serialization helpers) for both serialization and deserialization, so the returned/accepted stack_boc matches the documented format.
| // VmStack TL-B: vm_stk_cons#_ {n:#} rest:^(VmStackList n) tos:VmStackValue = VmStack (n + 1); | |
| // vm_stk_nil#_ = VmStackList 0; | |
| // For simplicity, serialize depth + items as references | |
| let mut builder = BuilderData::new(); | |
| let depth = stack.storage.len() as u32; | |
| builder.append_u32(depth)?; | |
| for item in &stack.storage { | |
| let cell = self.stack_item_to_cell(item)?; | |
| builder.checked_append_reference(cell)?; | |
| } | |
| let cell = builder.into_cell()?; | |
| Ok(base64_encode(write_boc(&cell)?)) | |
| } | |
| fn stack_item_to_cell(&self, item: &StackItem) -> Result<Cell> { | |
| let mut builder = BuilderData::new(); | |
| match item { | |
| StackItem::None => { | |
| builder.append_u8(0x00)?; | |
| } | |
| StackItem::Integer(int_data) => { | |
| builder.append_u8(0x01)?; | |
| // Store as i64 for simplicity | |
| let val = int_data.as_integer_value(i64::MIN..=i64::MAX).unwrap_or(0); | |
| builder.append_i64(val)?; | |
| } | |
| StackItem::Cell(cell) => { | |
| builder.append_u8(0x03)?; | |
| builder.checked_append_reference(cell.clone())?; | |
| } | |
| StackItem::Slice(slice) => { | |
| builder.append_u8(0x04)?; | |
| let cell = slice.cell_opt().cloned().unwrap_or_default(); | |
| builder.checked_append_reference(cell)?; | |
| } | |
| _ => { | |
| builder.append_u8(0x00)?; | |
| } | |
| } | |
| // Serialize using the canonical TL-B VmStack encoding so the resulting | |
| // BOC is interoperable with external consumers and matches the API | |
| // contract for `stack_boc`. | |
| let mut builder = BuilderData::new(); | |
| stack.write_to(&mut builder)?; | |
| let cell = builder.into_cell()?; | |
| Ok(base64_encode(write_boc(&cell)?)) | |
| } | |
| fn stack_item_to_cell(&self, item: &StackItem) -> Result<Cell> { | |
| // Preserve canonical TL-B encoding for individual stack values as well. | |
| let mut builder = BuilderData::new(); | |
| item.write_to(&mut builder)?; |
| let depth = slice.get_next_u32()? as usize; | ||
| let mut storage = Vec::with_capacity(depth); | ||
| for _ in 0..depth { | ||
| let item_cell = slice.checked_drain_reference()?; | ||
| let mut item_slice = SliceData::load_cell(item_cell)?; | ||
| let item = read_stack_item(&mut item_slice)?; | ||
| storage.push(item); | ||
| } | ||
| Ok(Stack::with_storage(storage)) |
There was a problem hiding this comment.
The code claims to serialize a TL-B VmStack, but the implementation writes a custom (depth, [refs...]) format and then attempts to decode each referenced cell with read_stack_item, which expects canonical TL-B encoding for a stack value (not your custom tagged format). This will produce stack outputs that external consumers can’t decode as VmStack, and tvm_emulator_run_get_method will return a stack field that is not what the API comment promises. Use the canonical VmStack TL-B encoding (or existing ton_vm/ton_block stack serialization helpers) for both serialization and deserialization, so the returned/accepted stack_boc matches the documented format.
| let depth = slice.get_next_u32()? as usize; | |
| let mut storage = Vec::with_capacity(depth); | |
| for _ in 0..depth { | |
| let item_cell = slice.checked_drain_reference()?; | |
| let mut item_slice = SliceData::load_cell(item_cell)?; | |
| let item = read_stack_item(&mut item_slice)?; | |
| storage.push(item); | |
| } | |
| Ok(Stack::with_storage(storage)) | |
| Stack::construct_from(&mut slice) |
| match tvm_emulator.run_get_method(method_id, stack) { | ||
| Ok(result) => { | ||
| let c_string = CString::new(result).unwrap(); | ||
| c_string.into_raw() | ||
| } | ||
| Err(err) => { | ||
| log::error!("Failed to run get method: {err}"); | ||
| tvm_error_response(err) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
The FFI returns heap-allocated C strings via CString::into_raw() but there is no corresponding exported “free” function for C callers to release them (tests free them using CString::from_raw, but typical C consumers cannot). This will cause leaks in real integrations. Add an exported function like tvm_emulator_string_free(*mut c_char) (or reuse an existing project-wide free function if present) and document that callers must free returned strings with that function.
| // Helper: error response for TVM emulator | ||
| fn tvm_error_response(err: impl ToString) -> *const c_char { | ||
| let result = json!({ | ||
| "success": false, | ||
| "error": err.to_string(), | ||
| }); | ||
| let c_string = CString::new(format!("{result:#}")).unwrap(); | ||
| c_string.into_raw() | ||
| } |
There was a problem hiding this comment.
The FFI returns heap-allocated C strings via CString::into_raw() but there is no corresponding exported “free” function for C callers to release them (tests free them using CString::from_raw, but typical C consumers cannot). This will cause leaks in real integrations. Add an exported function like tvm_emulator_string_free(*mut c_char) (or reuse an existing project-wide free function if present) and document that callers must free returned strings with that function.
| tvm_emulator_set_c7( | ||
| p, | ||
| address_str.as_ptr(), | ||
| json["now"].as_u64().unwrap_or(1700000000) as u32, | ||
| 1_000_000_000, | ||
| rand_seed.as_ptr(), | ||
| config_params_boc.as_ptr(), | ||
| ); |
There was a problem hiding this comment.
The test ignores the boolean return value from tvm_emulator_set_c7. If rand_seed parsing or config parsing fails, the test will still proceed and may produce misleading failures later. Capture and assert the return value (and consider printing the parsed rand_seed format expectations) so failures point to the real cause.
| @@ -10,7 +10,7 @@ | |||
| 109286709553761167786868452598623799483077907703140966234259861627898293463574, | |||
| 18296641377820784520646603013706397400899109453284289688122548468517220829435 | |||
| ], | |||
| "rand_seed": 44576371744839254860532476329668446971656461546251214697440441538577567691036, | |||
| "rand_seed": "44576371744839254860532476329668446971656461546251214697440441538577567691036", | |||
There was a problem hiding this comment.
tvm_emulator_set_c7 documents rand_seed_hex as a 64-char hex string and parses via UInt256::from_str. This JSON value looks like a decimal string, not hex, which can cause tvm_emulator_set_c7 to fail when tests pass it through. Either update the JSON to provide a 64-hex-character seed (recommended, consistent with the API contract), or adjust tvm_emulator_set_c7 to accept both decimal and hex representations intentionally.
| } | ||
| ctrls.put(4, StackItem::Cell(self.data.clone()))?; | ||
|
|
||
| let gas = Gas::new(self.gas_limit, 0, self.gas_limit, 1000); |
There was a problem hiding this comment.
1000 is an unexplained magic constant in Gas::new(...). Define a named constant (or derive it from config/capabilities) so it’s clear what this parameter represents and how it should be tuned.
No description provided.