L2 swimlane: regression test for aicore_rotate failure paths + perf check on cache-line sharing #943

@hw-native-sys-bot

Follow-up from #939 (ActiveHead cache-line refactor).

Pending: regression test for aicore_rotate failure-path accounting

PR #939 fixed a pre-existing over-counting bug. In aicore_rotate's two failure branches (empty free queue, full ready queue), the pre-emptive dropped_record_count += BUFFER_SIZE double-counted records that the flush retry path would still deliver. This broke the collected + dropped == total reconcile invariant whenever the run ended before the slot guard had actually overflowed by the projected BUFFER_SIZE additional records.
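To make the arithmetic concrete, here is a minimal sketch of the accounting for a single failed rotation. It is a toy model, not the real aicore_rotate code: the `BUFFER_SIZE` value, the `reconcile` helper, and the assumption that the old branch added a flat `BUFFER_SIZE` in place of counting real drops are all illustrative.

```python
BUFFER_SIZE = 8  # illustrative value, not the real constant


def reconcile(overflowed: int, preemptive: bool) -> tuple[int, int, int]:
    """Return (collected, dropped, total) for one failed rotation.

    overflowed: records the slot guard actually discarded before the run ended.
    preemptive: True models the pre-#939 accounting (assumed: a flat
                BUFFER_SIZE bump), False counts only real drops.
    """
    total = BUFFER_SIZE + overflowed  # the full buffer plus later arrivals
    collected = BUFFER_SIZE           # flush retry still delivers the full buffer
    dropped = BUFFER_SIZE if preemptive else overflowed
    return collected, dropped, total


if __name__ == "__main__":
    for overflowed in (0, 3, BUFFER_SIZE):
        for preemptive in (True, False):
            c, d, t = reconcile(overflowed, preemptive)
            print(f"overflowed={overflowed} preemptive={preemptive} "
                  f"invariant_holds={c + d == t}")
```

In this model the pre-emptive accounting only reconciles when exactly `BUFFER_SIZE` records overflow before the run ends, while counting real drops reconciles in every case.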

We need a regression test that exercises both failure paths and asserts the reconcile invariant. Triggers are hard to set up in the existing 5-task vector example:

  • Empty free queue at rotation: requires driving enough rotations to exhaust the free pool (PLATFORM_AICORE_BUFFERS_PER_CORE per core). This needs a long stress run with many tasks per core.
  • Ready queue full at rotation: requires the host drain thread to be slow / paused.
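The two triggers can be illustrated with a small queue model. This is a hypothetical sketch: `make_state`, `rotate`, and `drain`, the order of the two checks, and the queue capacities are assumptions for illustration and are not taken from the real implementation.

```python
from collections import deque


def make_state(free_buffers: int, ready_capacity: int) -> dict:
    """One core with an active buffer (id 0) plus a pool of free buffers."""
    return {
        "active": 0,
        "free": deque(range(1, free_buffers + 1)),
        "ready": deque(),
        "ready_capacity": ready_capacity,
    }


def rotate(state: dict) -> str:
    """Hand the active buffer to the host and take a fresh one, if possible."""
    if not state["free"]:
        return "free_empty"   # failure path 1: no buffer left to rotate into
    if len(state["ready"]) >= state["ready_capacity"]:
        return "ready_full"   # failure path 2: host has not drained yet
    state["ready"].append(state["active"])
    state["active"] = state["free"].popleft()
    return "ok"


def drain(state: dict) -> None:
    """Host drain step: consume one ready buffer and return it to the free pool."""
    if state["ready"]:
        state["free"].append(state["ready"].popleft())


if __name__ == "__main__":
    exhausted = make_state(free_buffers=2, ready_capacity=8)
    print([rotate(exhausted) for _ in range(3)])

    blocked = make_state(free_buffers=4, ready_capacity=1)
    print([rotate(blocked) for _ in range(2)])
```

In this model both failures only appear when the drain step does not keep up, which is why neither shows up in a short run with a healthy host drain thread.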

Approach options:

  • Add a stress test that runs N×PLATFORM_AICORE_BUFFERS_PER_CORE tasks per core and asserts the reconcile invariant in the captured JSON.
  • Add a sim-only knob to artificially block the host drain for a window.
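Whichever option is chosen, the assertion itself could look roughly like the sketch below. The JSON layout (`cores`, `core_id`, `collected`, `dropped`, `total`) and the helper name `reconcile_violations` are hypothetical; the real captured JSON schema may differ and the field lookups would need adjusting.

```python
import json


def reconcile_violations(captured_json: str) -> list[str]:
    """Return one message per core that breaks collected + dropped == total.

    Assumes a hypothetical schema: {"cores": [{"core_id", "collected",
    "dropped", "total"}, ...]}.
    """
    data = json.loads(captured_json)
    violations = []
    for core in data["cores"]:
        accounted = core["collected"] + core["dropped"]
        if accounted != core["total"]:
            violations.append(
                f"core {core['core_id']}: collected + dropped = {accounted}, "
                f"total = {core['total']}"
            )
    return violations


def test_reconcile_invariant_example():
    # Stand-in payload; a real test would load the JSON captured from the run.
    captured = json.dumps({"cores": [
        {"core_id": 0, "collected": 96, "dropped": 32, "total": 128},
        {"core_id": 1, "collected": 128, "dropped": 0, "total": 128},
    ]})
    assert reconcile_violations(captured) == []
```

Returning the list of violations rather than a bare boolean keeps the pytest failure message specific about which core failed to reconcile.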

Pending: perf measurement on paged_attention_unroll (RESOLVED)

Measured on a2a3 onboard, paged_attention_unroll Case1 with --enable-l2-swimlane 4, 3 iters each via task-submit:

| Configuration | pytest body | wall (incl. import) |
|---|---|---|
| Baseline (upstream/main pre-#939) | 15.46–15.67s | 22.85–23.01s |
| B alone (#939) | 15.19–15.34s | 22.99–23.28s |

Within noise (<2%); no measurable regression from packing head + counters into the same cache line. Design choice validated — counters can stay co-located with head.
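One way to read the noise figure is to compare the midpoints of the reported ranges. The sketch below does that; the midpoint comparison is an assumption about how the percentage could be derived, not necessarily how it was computed for this issue.

```python
def midpoint(low: float, high: float) -> float:
    return (low + high) / 2


def relative_delta(baseline: tuple[float, float],
                   candidate: tuple[float, float]) -> float:
    """Signed change of the candidate's range midpoint relative to the baseline's."""
    base = midpoint(*baseline)
    return (midpoint(*candidate) - base) / base


# Ranges in seconds, copied from the measurements above.
body = relative_delta((15.46, 15.67), (15.19, 15.34))
wall = relative_delta((22.85, 23.01), (22.99, 23.28))

if __name__ == "__main__":
    print(f"pytest body: {body:+.2%}, wall: {wall:+.2%}")
```

By this reading, pytest body time is slightly lower with #939 and wall time slightly higher, and both shifts stay under 2% in magnitude.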

Priority

Regression test is non-blocking; the fix in #939 is correct by code review and validated by reconcile math. Add when test-infra can model the trigger.
