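Summary
#844 added selective tensor dump (enable_dump_tensor_selective() + Arg::dump(...)), resolving #838. Two usability gaps remain:
No "dump all tensors of this task" shortcut. In selective mode, dumping a whole task means enumerating every tensor argument by hand, e.g. args.dump(x, y, z). Arg::dump(...) even static_asserts on sizeof...(Args) >= 1, so there is no terse "all args of this task" form.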
enable_dump_tensor_selective() is a redundant mode toggle. Selective mode can be inferred from whether any Arg::dump(...) marker was placed: if at least one task marks tensors → selective; if none → full dump. The explicit enable call is an extra step users must remember, and forgetting it silently falls back to full dump even when Arg::dump(...) markers are present (current documented behavior).
Motivation / Use Case
The current flow forces two decisions on the user where one suffices:
enable_dump_tensor_selective(); // (1) remember to flip the mode
...
args.dump(x, y, z); // (2) then enumerate every tensor by hand
Forgetting (1) makes every args.dump(...) a silent no-op — the run dumps everything, which is exactly what selective mode was meant to avoid.
For "dump this entire task, nothing else", the user must list all tensor args, which is verbose and drifts out of sync as the task signature changes.
Removing the toggle and adding a dump-all shortcut reduces the API to a single intuitive call site and removes a silent-fallback footgun.
Proposed API / Behavior
Infer selective mode from markers — remove enable_dump_tensor_selective():
If any Arg::dump(...) marker is present in the orchestration, AICPU collection runs in selective mode (only marked tasks / args dumped).
If no Arg::dump(...) marker is present anywhere, behavior is the legacy full dump (every task, every tensor) — unchanged default.
--dump-tensor remains the top-level host enable switch; nothing here changes that.
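The inference rule above can be sketched roughly as follows. This is only an illustration under stated assumptions: DumpMode, TaskRecord, dump_mask, infer_dump_mode and should_dump are hypothetical names invented for the sketch, not the runtime's actual types or the way AICPU collection is implemented.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins: the real runtime types are not shown in this issue.
enum class DumpMode { Off, Full, Selective };

struct TaskRecord {
    // Assumption: bit i is set when tensor arg i was marked via Arg::dump(...).
    std::uint32_t dump_mask = 0;
};

// Decide the collection mode from the host switch and the markers alone,
// with no separate enable_dump_tensor_selective() call.
inline DumpMode infer_dump_mode(bool dump_tensor_enabled,
                                const std::vector<TaskRecord>& tasks) {
    if (!dump_tensor_enabled) {
        return DumpMode::Off;  // --dump-tensor stays the top-level switch
    }
    const bool any_marked = std::any_of(
        tasks.begin(), tasks.end(),
        [](const TaskRecord& t) { return t.dump_mask != 0; });
    // At least one marker anywhere: selective. No markers: legacy full dump.
    return any_marked ? DumpMode::Selective : DumpMode::Full;
}

// Whether tensor arg `arg_index` (assumed < 32) of `task` is dumped under `mode`.
inline bool should_dump(DumpMode mode, const TaskRecord& task, unsigned arg_index) {
    switch (mode) {
        case DumpMode::Off:       return false;
        case DumpMode::Full:      return true;
        case DumpMode::Selective: return ((task.dump_mask >> arg_index) & 1u) != 0;
    }
    return false;
}
```

With this shape the mode is a pure function of the markers, so a marker can never be silently ignored because a toggle was forgotten.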
Add a per-task dump-all shortcut:
Arg args;
args.add_input(x);
args.add_input(y);
args.add_output(z);
args.dump(); // dump-all: mark every tensor arg on this Arg
rt_submit_aiv_task(FUNC_ADD, args);
i.e. relax dump() so a no-argument call (or an explicit dump_all()) marks all tensor args currently on the Arg, instead of static_assert-ing on ≥1 argument.
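One possible shape for the relaxed dump() is an overload set, sketched below with mock types. Tensor, the id-based matching, the Slot bookkeeping, is_marked and marked_count are assumptions made for this illustration; the real Arg in pto_types.h may track tensors quite differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the runtime's tensor handle.
struct Tensor {
    int id;
};

// Mock Arg: only the dump-marking behavior is modeled here.
class Arg {
public:
    void add_input(const Tensor& t)  { add(t); }
    void add_output(const Tensor& t) { add(t); }

    // Selective form: mark only the listed tensors. Requiring a first
    // parameter enforces "at least one argument" in the signature itself,
    // so no static_assert on sizeof...(Args) is needed.
    template <typename T, typename... Ts>
    void dump(const T& first, const Ts&... rest) {
        mark(first);
        int expand[] = {0, (mark(rest), 0)...};  // apply mark() to each remaining tensor
        (void)expand;
    }

    // Proposed shortcut: a no-argument call marks every tensor added so far.
    void dump() { dump_all(); }

    void dump_all() {
        for (std::size_t i = 0; i < slots_.size(); ++i) slots_[i].marked = true;
    }

    bool is_marked(std::size_t i) const { return slots_.at(i).marked; }

    std::size_t marked_count() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < slots_.size(); ++i) n += slots_[i].marked ? 1 : 0;
        return n;
    }

private:
    struct Slot {
        int tensor_id;
        bool marked;
    };

    void add(const Tensor& t) { slots_.push_back(Slot{t.id, false}); }

    // Assumption: tensors are matched to slots by id.
    void mark(const Tensor& t) {
        for (std::size_t i = 0; i < slots_.size(); ++i) {
            if (slots_[i].tensor_id == t.id) slots_[i].marked = true;
        }
    }

    std::vector<Slot> slots_;
};
```

Because dump-all marks the tensor args currently on the Arg, in this sketch it has to be called after the add_input / add_output calls; a tensor added afterwards stays unmarked.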
Alternatives Considered
Keep enable_dump_tensor_selective() — current state; redundant call and silent-fallback footgun remain.
[...] Arg marker mechanism #844 already established on the device side.
Additional Context
#844 (feat: support selective tensor dump by tensor argument), fixes #838 ([Feature] Support partial task selection for tensor dump).
enable_dump_tensor_selective(): src/{a2a3,a5}/runtime/tensormap_and_ringbuffer/orchestration/pto_orchestration_api.h:141
Arg::dump(...) (static_assert sizeof...(Args) >= 1): src/{a2a3,a5}/runtime/tensormap_and_ringbuffer/runtime/pto_types.h:206
Documented behavior: docs/dfx/tensor-dump.md §3.2 (the doc notes that without enable_dump_tensor_selective(), dump(...) markers are ignored; this is the silent fallback this issue removes).
Follow-up to #838 / #844.