29 changes: 27 additions & 2 deletions docs/PYTHON_PYODIDE.md
@@ -6,7 +6,7 @@ dStruct runs Python code directly in the browser using [Pyodide](https://pyodide

1. When a user opens a Python problem page, the `usePythonCodeRunner` hook eagerly calls `pythonRunner.init()` to warm up a dedicated **Web Worker**.
2. The worker downloads the Pyodide runtime (~30 MB, cached by the browser after first load) from the jsDelivr CDN. It then writes the dStruct Python harness files (`exec.py`, `array_tracker.py`, etc.) into Pyodide's virtual Emscripten filesystem and pre-imports `safe_exec`.
3. When the user clicks **Run**, the code string and optional test-case arguments (e.g. binary tree root) are posted to the worker. The existing `safe_exec(code, args)` harness performs AST transformation (list tracking via `TrackedList`), reconstructs arguments (TreeNode, ListNode, etc.) from the serialized payload, executes the code in a sandboxed namespace, and returns a structured `ExecutionResult` as JSON back to the main thread.
3. When the user clicks **Run**, the code string and optional test-case arguments (e.g. binary tree root) are posted to the worker. The existing `safe_exec(code, args)` harness performs AST transformation (tracked list/dict/set literals and `frozenset(...)` calls), reconstructs arguments (TreeNode, ListNode, nested JSON collections, graph edge lists, etc.) from the serialized payload, executes the code in a sandboxed namespace, and returns a structured `ExecutionResult` as JSON back to the main thread.
4. If execution exceeds the timeout (default 30 s), the main-thread runner terminates the worker via `Worker.terminate()` and automatically recreates a fresh one for subsequent runs.
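
The execute-and-report part of step 3 can be sketched in a few lines. This is a simplified illustration, not the actual `safe_exec`: the function name `run_user_code`, the `answer` variable, and the result fields are hypothetical, and the real harness also performs the AST transformation and argument reconstruction described above.

```python
import json


def run_user_code(code: str, args: dict) -> str:
    """Execute `code` in a fresh namespace and return a JSON result string."""
    # A fresh dict keeps user globals separate from the harness. This is
    # namespace isolation only, not a security boundary.
    namespace = {"__name__": "__user__", "args": args}
    result = {"result": None, "error": None}
    try:
        exec(compile(code, "<user-code>", "exec"), namespace)
        # Hypothetical convention: user code leaves its output in `answer`.
        result["result"] = namespace.get("answer")
    except Exception as exc:  # report any Python exception as data
        result["error"] = {"type": type(exc).__name__, "message": str(exc)}
    return json.dumps(result)


ok = json.loads(run_user_code("answer = sum(args['nums'])", {"nums": [1, 2, 3]}))
bad = json.loads(run_user_code("answer = 1 / 0", {}))
```

Returning errors as part of the JSON payload, rather than letting them propagate, is what lets the main thread treat a failed run like any other result.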

```
@@ -124,7 +124,8 @@ To avoid the CDN dependency (e.g. for air-gapped deployments):
| `src/packages/dstruct-runner/python/exec.py` | Python harness: AST transform + sandboxed exec, receives `safe_exec(code, args)` |
| `src/packages/dstruct-runner/python/tree_utils.py` | `TreeNode`, `ListNode`, `build_tree`, `build_list` for argument reconstruction |
| `src/packages/dstruct-runner/python/array_tracker.py` | `TrackedList` implementation for callstack frame generation |
| `src/packages/dstruct-runner/python/array_tracker_transformer.py` | AST transformer: rewrites list literals to `TrackedList(...)` |
| `src/packages/dstruct-runner/python/collection_tracker.py` | `TrackedDict`, `TrackedSet`, `TrackedFrozenSet` for map/set-style frames |
| `src/packages/dstruct-runner/python/array_tracker_transformer.py` | AST transformer: rewrites list/dict/set literals, comprehensions, and `frozenset(...)` calls |
| `src/packages/dstruct-runner/python/output.py` | `tracked_print`: captures print output into `__stdout__` global |
| `src/packages/dstruct-runner/python/shared_types.py` | Python TypedDicts for `ExecutionResult`, `CallFrame`, etc. |
| `next.config.mjs` | Webpack + Turbopack `*.py` raw-loader rules for embedding Python as strings |
@@ -174,6 +175,30 @@ When the worker is bundled by webpack/Next.js, its script URL is something like
- On worker crash (uncaught error): same recovery -- terminate, reset, recreate on next use.
- The `settle()` helper inside `run()` ensures event listeners and the timeout timer are always cleaned up, preventing memory leaks.

## Collection and graph tracking (harness contract)

User code is parsed and transformed **before** execution:

- **List literals** `[a, b, …]` become `TrackedList(..., name, __callstack__)`.
- **Dict literals** `{…}` become `TrackedDict(dict({…}), name=…, callstack=__callstack__)`.
- **Set literals** `{a, b}` become `TrackedSet(..., name=…, callstack=__callstack__)`. `{}` parses as an empty dict, so it falls under the dict rule above.
- **List / dict / set comprehensions** are rewritten so the comprehension result is built inside the matching tracked type, and that wrapper is then passed to `list()` / `dict()` / `set()` so user code receives a normal value.
- **Calls** `frozenset(x)` (no keywords, at most one positional) become `TrackedFrozenSet(x, name=…, callstack=__callstack__)` for membership reads. Other `frozenset` forms are left unchanged.
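
The dict-literal rule above can be reproduced with a small `ast.NodeTransformer`. This is a minimal sketch: `TrackedDict` here is a stand-in that only stores its name and callstack, and `DictLiteralTransformer` and `run_transformed` are hypothetical names used for illustration. `ast.unparse` requires Python 3.9+.

```python
import ast


class TrackedDict(dict):
    """Stand-in for the harness class; records its name and callstack only."""

    def __init__(self, data=(), *, name="", callstack=None):
        super().__init__(data)
        self.name = name
        self.callstack = callstack


class DictLiteralTransformer(ast.NodeTransformer):
    """Rewrite `{...}` into `TrackedDict(dict({...}), name=..., callstack=__callstack__)`."""

    def __init__(self):
        super().__init__()
        self.counter = 0

    def visit_Dict(self, node):
        self.generic_visit(node)  # transform nested literals first
        name = f"auto_dict_{self.counter}"
        self.counter += 1
        return ast.Call(
            func=ast.Name(id="TrackedDict", ctx=ast.Load()),
            args=[ast.Call(func=ast.Name(id="dict", ctx=ast.Load()), args=[node], keywords=[])],
            keywords=[
                ast.keyword(arg="name", value=ast.Constant(value=name)),
                ast.keyword(arg="callstack", value=ast.Name(id="__callstack__", ctx=ast.Load())),
            ],
        )


def run_transformed(source):
    """Transform, compile, and execute `source`; return the rewritten code and namespace."""
    tree = DictLiteralTransformer().visit(ast.parse(source))
    ast.fix_missing_locations(tree)  # new nodes need line/column info to compile
    namespace = {"TrackedDict": TrackedDict, "__callstack__": []}
    exec(compile(tree, "<user>", "exec"), namespace)
    return ast.unparse(tree), namespace


rewritten, ns = run_transformed("counts = {'a': 1}")
```

After the call, `ns["counts"]` is a `TrackedDict` that still behaves like a plain dict, and `rewritten` holds the transformed source for inspection.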

Case arguments from JSON (`createPythonRuntimeArgs` → worker) are converted in `_convert_arg_to_python`:

| `type` field | Expected JSON `value` | Wrapped as |
| -------------- | ---------------------- | ---------- |
| `array`, `matrix` | JSON array (or `null` → empty tracked list) | Nested lists → `TrackedList`; nested objects → `TrackedDict` |
| `set` | JSON array of elements (or `null` → empty `TrackedSet`) | Elements deep-wrapped |
| `map`, `object` | JSON object (or `null` → empty `TrackedDict`) | Values deep-wrapped |
| `graph` | JSON array (edge list / adjacency-style rows) | Same deep wrap as array (nested lists + dicts) |
| `binaryTree`, `linkedList` | Tracked builders when payload includes ids | Unchanged contract |

Malformed shapes raise **`TypeError`** with a clear message; `safe_exec` returns them in `ExecutionResult.error` (same as other Python exceptions).
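
The conversion table can be approximated as follows. This is a sketch under stated assumptions, not the real `_convert_arg_to_python`: the tracked classes are empty stand-ins, `deep_wrap` and `convert_arg` are hypothetical names, the error messages are invented, and rejecting `null` for `graph` is an assumption because the table does not document that case.

```python
class TrackedList(list):
    """Stand-in for the harness class of the same name."""


class TrackedDict(dict):
    """Stand-in for the harness class of the same name."""


class TrackedSet(set):
    """Stand-in for the harness class of the same name."""


def deep_wrap(value):
    """Recursively wrap nested JSON arrays/objects; scalars pass through."""
    if isinstance(value, list):
        return TrackedList(deep_wrap(item) for item in value)
    if isinstance(value, dict):
        return TrackedDict({key: deep_wrap(item) for key, item in value.items()})
    return value


def convert_arg(arg: dict):
    """Convert one serialized case argument according to its `type` field."""
    kind, value = arg.get("type"), arg.get("value")
    if kind in ("array", "matrix", "graph"):
        # Assumption: null is accepted for array/matrix but rejected for graph.
        if value is None and kind != "graph":
            return TrackedList()
        if not isinstance(value, list):
            raise TypeError(f"{kind} expects a JSON array, got {type(value).__name__}")
        return deep_wrap(value)
    if kind == "set":
        if value is None:
            return TrackedSet()
        if not isinstance(value, list):
            raise TypeError(f"set expects a JSON array, got {type(value).__name__}")
        # Elements must be hashable, so nested arrays/objects would fail here.
        return TrackedSet(deep_wrap(item) for item in value)
    if kind in ("map", "object"):
        if value is None:
            return TrackedDict()
        if not isinstance(value, dict):
            raise TypeError(f"{kind} expects a JSON object, got {type(value).__name__}")
        return deep_wrap(value)
    return value  # binaryTree, linkedList, etc. are out of scope for this sketch


matrix = convert_arg({"type": "matrix", "value": [[1, 2], [3, {"k": [4]}]]})
```

Validating the shape before wrapping keeps the failure at the argument boundary, so the user sees a `TypeError` about the test case rather than an obscure error from inside their own code.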

**`TrackedDict` / `TrackedSet` methods:** common mutators are tracked (`clear` emits `clearAppearance`; `pop` / `popitem` / `setdefault` / `update` route through tracked paths where reads or writes apply). **`defaultdict`**, **`Counter`**, and other dict subclasses built without going through literals are **not** auto-wrapped.
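
The reason `defaultdict` and `Counter` escape wrapping is visible in the syntax tree: a literal parses to an `ast.Dict` node, while a constructor call parses to `ast.Call`, which the literal rules never match. The sketch below only inspects node types; `value_node_kinds` is a hypothetical helper, not part of the harness.

```python
import ast

SOURCE = '''
a = {"x": 1}
b = defaultdict(list)
c = Counter("abc")
'''


def value_node_kinds(source: str) -> dict:
    """Map each simple assignment target to the AST node type of its value."""
    kinds = {}
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and isinstance(node.targets[0], ast.Name):
            kinds[node.targets[0].id] = type(node.value).__name__
    return kinds


kinds = value_node_kinds(SOURCE)
```

Only `a` maps to `Dict`; `b` and `c` map to `Call`, so a transformer keyed on literal nodes leaves them untouched.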

## Known Limitations

- **Standard library only.** `micropip` / third-party packages are not installed. Code that imports `numpy`, `pandas`, etc. will fail with `ModuleNotFoundError`.
2 changes: 2 additions & 0 deletions src/features/codeRunner/lib/workers/pythonExec.worker.ts
@@ -2,6 +2,7 @@ import { loadPyodide, type PyodideInterface, version } from "pyodide";

import arrayTrackerSrc from "#/packages/dstruct-runner/python/array_tracker.py";
import arrayTrackerTransformerSrc from "#/packages/dstruct-runner/python/array_tracker_transformer.py";
import collectionTrackerSrc from "#/packages/dstruct-runner/python/collection_tracker.py";
import execPySrc from "#/packages/dstruct-runner/python/exec.py";
import executionLocationSrc from "#/packages/dstruct-runner/python/execution_location.py";
import lineTrackingTransformerSrc from "#/packages/dstruct-runner/python/line_tracking_transformer.py";
@@ -29,6 +30,7 @@ const HARNESS_FILES: Record<string, string> = {
"shared_types.py": sharedTypesSrc,
"output.py": outputSrc,
"array_tracker.py": arrayTrackerSrc,
"collection_tracker.py": collectionTrackerSrc,
"array_tracker_transformer.py": arrayTrackerTransformerSrc,
"execution_location.py": executionLocationSrc,
"line_tracking_transformer.py": lineTrackingTransformerSrc,
124 changes: 120 additions & 4 deletions src/packages/dstruct-runner/python/array_tracker_transformer.py
@@ -17,7 +17,7 @@ class ListOptions(TypedDict):

class ListTrackingTransformer(ast.NodeTransformer):
"""AST transformer that replaces list operations with tracked list operations."""

def __init__(self) -> None:
super().__init__()
self.counter = 0
@@ -49,10 +49,126 @@ def visit_ListComp(self, node: ast.ListComp) -> ast.Call:
args=[
ast.ListComp(elt=elt, generators=generators),
ast.Constant(value="comprehension"),
ast.Name(id="__callstack__", ctx=ast.Load())
ast.Name(id="__callstack__", ctx=ast.Load()),
],
keywords=[]
keywords=[],
)
],
keywords=[]
keywords=[],
)

    def visit_Dict(self, node: ast.Dict) -> ast.Call:
        """Rewrite a dict literal to TrackedDict(dict({...}), name=..., callstack=...)."""
        self.generic_visit(node)
        dict_name = f"auto_dict_{self.counter}"
        self.counter += 1
        return ast.Call(
            func=ast.Name(id="TrackedDict", ctx=ast.Load()),
            args=[
                ast.Call(func=ast.Name(id="dict", ctx=ast.Load()), args=[node], keywords=[]),
            ],
            # name/callstack are keyword arguments; ast.keyword nodes are only
            # valid in `keywords`, and compile() rejects them inside `args`.
            keywords=[
                ast.keyword(arg="name", value=ast.Constant(value=dict_name)),
                ast.keyword(
                    arg="callstack", value=ast.Name(id="__callstack__", ctx=ast.Load())
                ),
            ],
        )

    def visit_Set(self, node: ast.Set) -> ast.Call:
        """Rewrite a set literal to TrackedSet({...}, name=..., callstack=...)."""
        self.generic_visit(node)
        set_name = f"auto_set_{self.counter}"
        self.counter += 1
        return ast.Call(
            func=ast.Name(id="TrackedSet", ctx=ast.Load()),
            args=[node],
            # Keyword arguments go in `keywords`; ast.keyword is not an expression
            # node and cannot appear in `args`.
            keywords=[
                ast.keyword(arg="name", value=ast.Constant(value=set_name)),
                ast.keyword(
                    arg="callstack", value=ast.Name(id="__callstack__", ctx=ast.Load())
                ),
            ],
        )

    def visit_DictComp(self, node: ast.DictComp) -> ast.Call:
        key = self.visit(node.key)
        value = self.visit(node.value)
        generators = [self.visit(g) for g in node.generators]
        return ast.Call(
            func=ast.Name(id="dict", ctx=ast.Load()),
            args=[
                ast.Call(
                    func=ast.Name(id="TrackedDict", ctx=ast.Load()),
                    args=[
                        ast.DictComp(key=key, value=value, generators=generators),
                    ],
                    keywords=[
                        ast.keyword(arg="name", value=ast.Constant(value="dict_comp")),
                        ast.keyword(
                            arg="callstack",
                            value=ast.Name(id="__callstack__", ctx=ast.Load()),
                        ),
                    ],
                )
            ],
            keywords=[],
        )

    def visit_SetComp(self, node: ast.SetComp) -> ast.Call:
        elt = self.visit(node.elt)
        generators = [self.visit(g) for g in node.generators]
        return ast.Call(
            func=ast.Name(id="set", ctx=ast.Load()),
            args=[
                ast.Call(
                    func=ast.Name(id="TrackedSet", ctx=ast.Load()),
                    args=[
                        ast.SetComp(elt=elt, generators=generators),
                    ],
                    keywords=[
                        ast.keyword(arg="name", value=ast.Constant(value="set_comp")),
                        ast.keyword(
                            arg="callstack",
                            value=ast.Name(id="__callstack__", ctx=ast.Load()),
                        ),
                    ],
                )
            ],
            keywords=[],
        )

    def visit_Call(self, node: ast.Call) -> ast.Call:
        """Rewrite frozenset() / frozenset(iterable) to TrackedFrozenSet for read tracking."""
        self.generic_visit(node)
        if not isinstance(node.func, ast.Name) or node.func.id != "frozenset":
            return node
        # Only the zero- and one-positional-argument forms are rewritten; calls
        # with keywords or extra positional arguments are left unchanged.
        if node.keywords or len(node.args) > 1:
            return node
        # Allocate a name only for calls that are actually rewritten.
        frozen_name = f"auto_frozen_{self.counter}"
        self.counter += 1
        return ast.Call(
            func=ast.Name(id="TrackedFrozenSet", ctx=ast.Load()),
            args=list(node.args),
            keywords=[
                ast.keyword(arg="name", value=ast.Constant(value=frozen_name)),
                ast.keyword(
                    arg="callstack",
                    value=ast.Name(id="__callstack__", ctx=ast.Load()),
                ),
            ],
        )