Skip to content

Refactor: holistic L2 perf type naming + drop dead fields#916

Open
hw-native-sys-bot wants to merge 1 commit into
hw-native-sys:mainfrom
hw-native-sys-bot:refactor/aicpu-buffer-state-rename
Open

Refactor: holistic L2 perf type naming + drop dead fields#916
hw-native-sys-bot wants to merge 1 commit into
hw-native-sys:mainfrom
hw-native-sys-bot:refactor/aicpu-buffer-state-rename

Conversation

@hw-native-sys-bot
Copy link
Copy Markdown
Collaborator

@hw-native-sys-bot hw-native-sys-bot commented May 30, 2026

Summary

A holistic pass over the L2 perf type names in src/{a2a3,a5}/platform/.
Renames consolidate the misleading parts and drop two fields that have
been dead since #878 / #921. No on-device behaviour change beyond the
field removals.

Replaces the earlier single-name L2PerfBufferStateAicpuBufferState
rename — that landing as the final name had less explanatory power than
the holistic L2PerfAicpuPool chosen here.

Step 1: buffer name symmetry

Old New
L2PerfBuffer L2PerfAicpuBuffer
PhaseBuffer AicpuPhaseBuffer

Producer is now in the name: AICPU-written buffers carry Aicpu,
mirroring the existing L2PerfAicoreBuffer for the AICore-written one.
On a2a3, AicpuPhaseBuffer also switches from a hand-written struct to
the existing TypedBuffer<R, N> template — matches the other two.

Step 4: pool struct rename

Old New
AicpuBufferState L2PerfAicpuPool
L2PerfAicoreBufferState L2PerfAicorePool
PhaseBufferState AicpuPhasePool (alias of L2PerfAicpuPool)

Replaces the BufferState suffix (it was a pool state holding
free_queue + counters, not a buffer) and gives the Phase usage its
own explicit type name even though it shares layout with
L2PerfAicpuPool.

Step 2: BufferKind enum

ReadyQueueEntry::is_phase (uint32_t) → kind (BufferKind, uint32_t
underlying):

enum class BufferKind : uint32_t {
    L2PerfAicpu  = 0,
    Phase        = 1,
    L2PerfAicore = 2,
};

The field was already a 3-value discriminator (despite the bool-shaped
name); enqueue sites and the host's switch site now read in terms of
the named enum instead of magic numbers. uint32_t underlying preserves
the byte layout — no ABI break.

Step 3: drop dead fields (a2a3 only)

  • L2PerfAicpuPool::aicore_ring_ptr — removed
  • L2PerfAicpuPool::mismatch_record_count — removed

Both fields have been dead since:

Pad is adjusted to keep sizeof(L2PerfAicpuPool) == 192 (static_assert
remains green). Host reconcile() loop simplifies to total/dropped
only.

a5 keeps both fields — its AICore pipeline still uses the legacy
staging-ring design where the host's per-core aicore_ring_ptr and the
mismatch counter are both live.

Step 5: deferred

Collapsing AicpuPhaseHeader into the root L2PerfDataHeader is a
shared-memory layout change and is left for a focused follow-up PR.

Test plan

  • All four platform variants (a2a3, a2a3sim, a5, a5sim) build clean
  • tests/st/.../l2_swimlane pass on a2a3sim
  • Same ST passes on a2a3 device 1
  • a5sim smoke (spmd_basic) passes

Scope

16 files, ~700-line diff (mostly mechanical renames). No behaviour
change beyond the dead-field removal on a2a3.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Warning

Review limit reached

@hw-native-sys-bot, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 10 minutes and 46 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a3b3e2a2-24c4-4519-adb8-0808fde43ba1

📥 Commits

Reviewing files that changed from the base of the PR and between c830d3a and 3e368b4.

📒 Files selected for processing (16)
  • docs/dfx/l2-swimlane-profiling.md
  • docs/profiling-framework.md
  • src/a2a3/platform/include/aicpu/l2_perf_collector_aicpu.h
  • src/a2a3/platform/include/common/l2_perf_profiling.h
  • src/a2a3/platform/include/common/platform_config.h
  • src/a2a3/platform/include/host/l2_perf_collector.h
  • src/a2a3/platform/src/aicpu/l2_perf_collector_aicpu.cpp
  • src/a2a3/platform/src/host/l2_perf_collector.cpp
  • src/a5/platform/include/aicore/l2_perf_collector_aicore.h
  • src/a5/platform/include/aicpu/l2_perf_collector_aicpu.h
  • src/a5/platform/include/common/l2_perf_profiling.h
  • src/a5/platform/include/common/platform_config.h
  • src/a5/platform/include/host/l2_perf_collector.h
  • src/a5/platform/include/host/profiling_common/buffer_pool_manager.h
  • src/a5/platform/src/aicpu/l2_perf_collector_aicpu.cpp
  • src/a5/platform/src/host/l2_perf_collector.cpp

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request performs a comprehensive renaming of the L2PerfBufferState structure to AicpuBufferState across the codebase, including documentation, headers, and source files for both the a2a3 and a5 platforms. This refactoring improves semantic clarity by explicitly associating the buffer state with the AICPU. No review comments were provided, and there is no additional feedback to offer on these changes.

A holistic pass over the L2 perf type names in src/{a2a3,a5}/platform/.
Renames consolidate the misleading parts and drop two fields that have
been dead since hw-native-sys#878 / hw-native-sys#921. No on-device behaviour change beyond the
field removals.

== Step 1: buffer name symmetry ==

  L2PerfBuffer        -> L2PerfAicpuBuffer
  PhaseBuffer         -> AicpuPhaseBuffer

The producer is now in the name: AICPU-written buffers carry `Aicpu`,
mirroring the existing `L2PerfAicoreBuffer` for the AICore-written one.
On a2a3, AicpuPhaseBuffer also moves from a hand-written struct to the
existing `TypedBuffer<R, N>` template -- matches L2PerfAicpuBuffer and
L2PerfAicoreBuffer.

== Step 4: pool struct rename ==

  AicpuBufferState        -> L2PerfAicpuPool
  L2PerfAicoreBufferState -> L2PerfAicorePool
  PhaseBufferState        -> AicpuPhasePool   (alias of L2PerfAicpuPool)

Replaces the misleading `BufferState` suffix (it was a *pool* state
holding free_queue + counters, not a buffer) and gives the Phase usage
its own explicit type name even though it shares layout with
L2PerfAicpuPool.

== Step 2: BufferKind enum ==

  ReadyQueueEntry::is_phase (uint32_t) -> kind (BufferKind, uint32_t underlying)

  enum class BufferKind : uint32_t {
      L2PerfAicpu  = 0,
      Phase        = 1,
      L2PerfAicore = 2,
  };

The field was already a 3-value discriminator (despite the bool-shaped
name); enqueue sites and the host's switch site now read in terms of
the named enum instead of magic numbers. uint32_t underlying preserves
the byte layout -- no ABI break.

== Step 3: drop dead fields (a2a3 only) ==

  L2PerfAicpuPool::aicore_ring_ptr        -- removed
  L2PerfAicpuPool::mismatch_record_count  -- removed

Both fields have been dead since:
  - hw-native-sys#878 moved AICore writes to a per-core L2PerfAicoreBuffer pool with
    its own state (L2PerfAicorePool), so `aicore_ring_ptr` no longer
    points anywhere live.
  - hw-native-sys#921 moved the rotation-table fill from host to AICPU, removing the
    last code path that even read `aicore_ring_ptr` (host's flush join
    now indexes the AICore pool directly).
  - `mismatch_record_count` was last written before hw-native-sys#878 (legacy ring/
    task_id invariant violation) and is no longer written by anyone;
    the host reconcile() loop's reading-and-warning on a counter that
    can never advance is removed.

Pad is adjusted to keep `sizeof(L2PerfAicpuPool) == 192` (static_assert
remains green). Host reconcile() loop simplifies to `total/dropped` only.

a5 keeps both fields -- its AICore pipeline still uses the legacy
staging-ring design where the host's per-core `aicore_ring_ptr` and the
mismatch counter are both live.

== Step 5: deferred ==

Collapsing AicpuPhaseHeader into the root L2PerfDataHeader is a
shared-memory layout change and is left for a focused follow-up PR.

== Test plan ==

- All four platform variants (a2a3, a2a3sim, a5, a5sim) build clean
- tests/st/.../l2_swimlane pass on a2a3sim
- Same ST passes on a2a3 device 1
- a5sim smoke (spmd_basic) passes
@hw-native-sys-bot hw-native-sys-bot force-pushed the refactor/aicpu-buffer-state-rename branch from ea8b37e to 3e368b4 Compare May 30, 2026 12:52
@hw-native-sys-bot hw-native-sys-bot changed the title Refactor: rename L2PerfBufferState to AicpuBufferState Refactor: holistic L2 perf type naming + drop dead fields May 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants