[Feature] Add per-kernel dispatch args dump for Insight Trace #837

@vegetabledoww

Summary

Follow-up to PR #792. Today, --dump-args exports only orchestrator-level arguments, written to tensor_dump/args_dump.json.

Downstream Insight Trace needs the actual per-dispatch kernel_entry(args) layout for individual incore kernels so it can replay a single kernel dispatch directly.

Motivation / Use Case

The current args_dump.json is useful for orchestration-level inspection, but it is not sufficient to reconstruct one real kernel dispatch such as QK / SF / PV / UP.

Insight Trace needs the finalized args after scheduler payload construction, including the real slot ordering and per-dispatch metadata. Without that, downstream tooling cannot reliably replay one incore kernel from dump artifacts.

Proposed API / Behavior

Add a separate kernel-level dump artifact, for example:

tensor_dump/kernel_args_dump.json

This new dump should:

  • keep existing tensor_dump/args_dump.json unchanged for compatibility
  • capture records after scheduler payload construction, using the actual kernel_entry(args) layout
  • include per-dispatch identifiers such as:
    • dispatch_id
    • func_id
    • task_id
    • subtask_id
    • core_type
    • core_id
    • block_idx
  • mark the capture stage as before_dispatch
  • preserve the real arg_index ordering seen by the kernel
  • include tensor arg metadata:
    • dtype
    • ndims
    • shape
    • pointer value if needed
  • include raw scalar arg values with enough information to tell whether each one should be read as a typed value or as a raw bit pattern
  • include context pointer args separately from normal tensor/scalar args
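To make the list above concrete, here is a rough sketch of what one dispatch record could look like. It is only for discussion: the key names `stage`, `args`, `context_args`, `kind`, `value`, `bits`, and `ptr`, the 64-bit little-endian slot width, and all sample values are assumptions, not an existing format.

```python
import struct


def scalar_arg(arg_index, value, fmt):
    """Hypothetical scalar record: keeps the typed value and the raw slot bits.

    `fmt` is a struct format char (e.g. "f", "i"); the 64-bit little-endian
    slot is an assumption made for this sketch.
    """
    raw = struct.pack("<" + fmt, value).ljust(8, b"\x00")
    bits = int.from_bytes(raw, "little")
    return {"arg_index": arg_index, "kind": "scalar",
            "value": value, "bits": f"0x{bits:016x}"}


def tensor_arg(arg_index, dtype, shape, ptr=None):
    """Hypothetical tensor record; the pointer is emitted only if provided."""
    rec = {"arg_index": arg_index, "kind": "tensor", "dtype": dtype,
           "ndims": len(shape), "shape": list(shape)}
    if ptr is not None:
        rec["ptr"] = f"0x{ptr:x}"
    return rec


# One record, captured after scheduler payload construction.
# Every value below is made up for illustration.
record = {
    "dispatch_id": 0,
    "func_id": 3,
    "task_id": 12,
    "subtask_id": 0,
    "core_type": "<core_type>",
    "core_id": 5,
    "block_idx": 1,
    "stage": "before_dispatch",
    # Tensor/scalar args, in the arg_index order the kernel sees.
    "args": [
        tensor_arg(0, "float16", (128, 64), ptr=0x1000),
        tensor_arg(1, "float16", (64, 128)),
        scalar_arg(2, 1.0, "f"),
    ],
    # Context pointers kept apart from normal args, still carrying arg_index.
    "context_args": [{"arg_index": 3, "kind": "context_ptr", "ptr": "0x2000"}],
}
```

Storing both `value` and `bits` is one way to cover the value/bits distinction: a consumer can use `value` when the type is known and fall back to `bits` for exact replay.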

A possible top-level schema would group args by dispatch and include:

  • schema_version
  • total_dispatches
  • total_args
  • dispatches[]
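A minimal sketch of how the top-level grouping could be assembled and consumed, assuming each dispatch carries an `args` list whose entries have an `arg_index`. The helper names `build_dump`, `find_dispatch`, and `args_in_kernel_order` are hypothetical, and counting only `args` toward `total_args` is a choice made for this sketch that would need to be settled in the real schema.

```python
import json


def build_dump(dispatches, schema_version=1):
    """Wrap per-dispatch records in the proposed top-level fields."""
    return {
        "schema_version": schema_version,
        "total_dispatches": len(dispatches),
        "total_args": sum(len(d.get("args", [])) for d in dispatches),
        "dispatches": dispatches,
    }


def find_dispatch(dump, dispatch_id):
    """Return the record for one dispatch, e.g. to replay a single kernel."""
    for d in dump["dispatches"]:
        if d["dispatch_id"] == dispatch_id:
            return d
    raise KeyError(dispatch_id)


def args_in_kernel_order(dispatch):
    """Sort args by arg_index so they match the kernel_entry(args) layout."""
    return sorted(dispatch["args"], key=lambda a: a["arg_index"])


# Made-up records, deliberately stored out of order in the second dispatch.
dispatches = [
    {"dispatch_id": 0, "func_id": 3, "args": [{"arg_index": 0}, {"arg_index": 1}]},
    {"dispatch_id": 1, "func_id": 4, "args": [{"arg_index": 1}, {"arg_index": 0}]},
]

dump = build_dump(dispatches)
# Round-trip through JSON, as a consumer of kernel_args_dump.json would.
loaded = json.loads(json.dumps(dump))
ordered = args_in_kernel_order(find_dispatch(loaded, 1))
```

Keeping `total_dispatches` and `total_args` alongside `dispatches[]` gives downstream tools a cheap consistency check before they attempt a replay.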

Alternatives Considered

  • Reusing only args_dump.json: insufficient, because it reflects orchestration-level arguments rather than real per-kernel dispatch payload layout.
  • Reconstructing dispatch args offline from existing dump artifacts: possible only heuristically, and too fragile for downstream replay tooling.
