
Refactor: RAII rollback for profiling collector init paths (+125 lines) #948

Merged: ChaoWao merged 1 commit into hw-native-sys:main from hw-native-sys-bot:collectors-init-rollback on Jun 1, 2026
Conversation

@hw-native-sys-bot
Collaborator

Summary

Plugs the latent leak that CodeRabbit flagged on #944 and that I deferred to a follow-up: a5 PmuCollector::init, ScopeStatsCollector::init, and TensorDumpCollector::init register paired device + host buffers in BufferPoolManager via alloc_paired_buffer BEFORE flipping initialized_. If a later allocation fails, init() returns -1; finalize() then early-exits (it is gated on initialized_ / shm_host_), and every registered device buffer + framework-malloc'd host shadow leaks.

The pattern existed pre-#944 (old alloc_single_buffer + manual register_mapping had the same shape), so this is not a regression — just the cleanup #944's malloc_shadows_ tracking enables.
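The failure mode in the summary can be reproduced in miniature. The sketch below is a toy, not the project's code: `ToyPool`, `ToyCollector`, and the `fail_second` failure-injection flag are hypothetical names, and integer ids stand in for device buffers. It only shows why a `finalize()` gated on `initialized_` cannot clean up after an `init()` that failed partway.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for a pool that tracks registered device buffers.
struct ToyPool {
    std::set<int> live;  // ids of buffers currently registered
    int next_id = 0;
    int alloc() { live.insert(next_id); return next_id++; }
    void release_all() { live.clear(); }
};

// Hypothetical collector with the same init/finalize shape as in the summary.
struct ToyCollector {
    ToyPool& pool;
    bool initialized_ = false;

    explicit ToyCollector(ToyPool& p) : pool(p) {}

    // fail_second simulates a later allocation failing.
    int init(bool fail_second) {
        pool.alloc();                // first buffer is registered
        if (fail_second) return -1;  // early return: nothing undoes the first alloc
        pool.alloc();
        initialized_ = true;         // flag flips only after every allocation succeeded
        return 0;
    }

    void finalize() {
        if (!initialized_) return;   // gate: a failed init() skips cleanup entirely
        pool.release_all();
        initialized_ = false;
    }
};

// Returns how many buffers are still registered after a failed init() + finalize().
std::size_t leaked_after_failed_init() {
    ToyPool pool;
    ToyCollector collector(pool);
    int rc = collector.init(/*fail_second=*/true);
    assert(rc == -1);
    (void)rc;
    collector.finalize();
    return pool.live.size();
}
```

In this toy, the one buffer registered before the failure is still live after `finalize()`, which is the shape of leak the PR sets out to close.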

Framework changes

  • BufferPoolManager::release_all_owned(release_fn) — new method. Abort-path cleanup: releases EVERY framework-tracked dev_ptr (via release_fn) and every framework-malloc'd host shadow (via std::free), then clears all internal containers. Distinct from release_owned_buffers() because this also catches buffers parked in callers' SPSC free_queues (tracked via register_mapping but not framework-owned via a queue). Drains recycled/done/ready first (just clears — release goes via dev_to_host_ to avoid double-free), then walks the full mapping table.

  • profiling_common::InitRollbackGuard<Manager> — new RAII scope guard in profiler_base.h. Holds a manager reference + release_fn + a vector of "extra direct dev_ptrs" the collector owns outside the framework (e.g. PMU per-core PmuAicoreRings on a5 — plain alloc_cb allocations with no host shadow). On destruction without commit(), calls manager.release_all_owned(release_fn) and then release_fn on each direct ptr. Move-only.
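A rough sketch of the abort-path cleanup described for release_all_owned follows. The container names `dev_to_host` and `malloc_shadows` echo identifiers mentioned in this PR, but the layout of `ToyManager`, the `register_mapping` signature, and the use of static chars as fake device addresses are all assumptions made for illustration. The point it tries to show is the ordering: manager-owned queues are only cleared, and the release itself happens once per entry while walking the mapping table.

```cpp
#include <cassert>
#include <cstdlib>
#include <deque>
#include <functional>
#include <map>
#include <unordered_map>
#include <unordered_set>

using ReleaseFn = std::function<void(void*)>;

// Toy manager; the member layout is assumed, not taken from buffer_pool_manager.h.
struct ToyManager {
    std::unordered_map<void*, void*> dev_to_host;  // every tracked dev_ptr -> host shadow
    std::unordered_set<void*> malloc_shadows;      // shadows the manager malloc'd itself
    std::deque<void*> recycled, done, ready;       // manager-owned queues of dev_ptrs

    void register_mapping(void* dev, void* host, bool manager_malloced) {
        dev_to_host[dev] = host;
        if (manager_malloced) malloc_shadows.insert(host);
    }

    void release_all_owned(const ReleaseFn& release_fn) {
        // Queues are only cleared: each entry is also in dev_to_host, so
        // releasing here as well would release the same dev_ptr twice.
        recycled.clear();
        done.clear();
        ready.clear();
        // Walk the full mapping table and release each entry exactly once.
        for (const auto& entry : dev_to_host) {
            if (release_fn) release_fn(entry.first);
            // erase() reports whether this shadow was manager-owned.
            if (malloc_shadows.erase(entry.second) != 0) std::free(entry.second);
        }
        dev_to_host.clear();
        malloc_shadows.clear();
    }
};

// Demo: one buffer sits in the manager's recycled queue, another is parked in a
// caller-side free queue (known to the manager only through register_mapping).
// Returns how many times each device pointer was released.
std::map<void*, int> release_counts() {
    static char dev_in_queue, dev_in_caller_queue;  // fake device addresses
    ToyManager mgr;
    mgr.register_mapping(&dev_in_queue, std::malloc(8), true);
    mgr.register_mapping(&dev_in_caller_queue, std::malloc(8), true);
    mgr.recycled.push_back(&dev_in_queue);

    std::map<void*, int> counts;
    mgr.release_all_owned([&counts](void* dev) { ++counts[dev]; });
    assert(mgr.dev_to_host.empty() && mgr.malloc_shadows.empty());
    return counts;
}
```

Both pointers should be released once each, including the one the manager never held in a queue of its own.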

Collector wiring

  • common/scope_stats_collector.cpp::init() — construct guard right after set_memory_context, guard.commit() right before return 0. Catches the shm region + ScopeStatsBuffer entries (free_queue and recycled pool).

  • common/tensor_dump_collector.cpp::init() — same pattern. Catches the shm region + per-thread arenas + DumpMetaBuffers.

  • a5/pmu_collector.cpp::init() — same pattern + guard.add_direct_ptr(ring) for each per-core PmuAicoreRing (those don't go through alloc_paired_buffer so the framework doesn't track them — register them with the guard explicitly).
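The wiring pattern in the bullets above can be sketched as follows. This is an approximation written from the PR description, not the contents of profiler_base.h: `StubManager`, `toy_init`, the `fail_at_ring` parameter, and the fixed count of three rings are hypothetical, and the guard here deletes move assignment as one possible reading of "move-only".

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using ReleaseFn = std::function<void(void*)>;

// Minimal manager stub exposing only what the guard needs.
struct StubManager {
    std::vector<void*> tracked;
    void release_all_owned(const ReleaseFn& fn) {
        for (void* p : tracked) if (fn) fn(p);
        tracked.clear();
    }
};

template <typename Manager>
class InitRollbackGuard {
public:
    InitRollbackGuard(Manager& m, ReleaseFn fn) : manager_(&m), release_fn_(std::move(fn)) {}
    InitRollbackGuard(const InitRollbackGuard&) = delete;
    InitRollbackGuard& operator=(const InitRollbackGuard&) = delete;
    InitRollbackGuard(InitRollbackGuard&& o) noexcept
        : manager_(o.manager_), release_fn_(std::move(o.release_fn_)),
          direct_ptrs_(std::move(o.direct_ptrs_)), committed_(o.committed_) {
        o.committed_ = true;  // the moved-from guard must not roll back
    }
    InitRollbackGuard& operator=(InitRollbackGuard&&) = delete;

    ~InitRollbackGuard() {
        if (committed_) return;  // success path: a single bool check
        manager_->release_all_owned(release_fn_);
        if (release_fn_) for (void* p : direct_ptrs_) release_fn_(p);
    }
    void add_direct_ptr(void* p) { direct_ptrs_.push_back(p); }
    void commit() { committed_ = true; }

private:
    Manager* manager_;
    ReleaseFn release_fn_;
    std::vector<void*> direct_ptrs_;  // allocations the manager does not track
    bool committed_ = false;
};

// Hypothetical init(): fail_at_ring is the core index whose ring allocation
// fails (negative means no failure); released counts release_fn calls.
int toy_init(StubManager& mgr, std::vector<void*>& rings_out, int fail_at_ring, int& released) {
    static char shm, ring_storage[3];  // fake device addresses
    InitRollbackGuard<StubManager> guard(mgr, [&released](void*) { ++released; });
    mgr.tracked.push_back(&shm);       // stands in for alloc_paired_buffer
    std::vector<void*> rings;
    for (int core = 0; core < 3; ++core) {
        if (core == fail_at_ring) return -1;        // guard rolls back everything so far
        rings.push_back(&ring_storage[core]);       // stands in for a plain alloc_cb allocation
        guard.add_direct_ptr(&ring_storage[core]);  // manager does not track this one
    }
    guard.commit();                    // right before return 0
    rings_out = std::move(rings);
    return 0;
}
```

With a failure injected at the third ring, the guard's destructor releases the tracked buffer plus the two rings registered so far; on the success path nothing is released and the manager keeps its entry.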

Success-path cost

Constructing the guard is a std::function move + std::vector default-construct + bool init. commit() flips one bool. The destructor short-circuits on committed_. Success-path overhead at scope exit is therefore a single boolean check per init() call.

What this does NOT touch

a2a3 collectors (dep_gen, pmu, scope_stats, tensor_dump legacy, l2_swimlane) — they predate alloc_paired_buffer and manage their own buffer lifecycle outside the framework's dev_to_host_ map. They're left as-is; the existing manual release_one_buffer + clear_mappings loop in their finalize() handles cleanup.

Test plan

  • a2a3sim ST L1+L2: pass (rollback path inert on success — guard.commit short-circuits the destructor)
  • a5sim ST L1+L2: pass (same)
  • Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim): clean
  • CI green

Net: +125 lines (5 files touched).

@coderabbitai

coderabbitai Bot commented May 31, 2026

Warning

Review limit reached

@ChaoWao, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 36 minutes and 10 seconds. Learn how PR review limits work.


📥 Commits

Reviewing files that changed from the base of the PR and between 0e2e7cd and 6583f39.

📒 Files selected for processing (5)
  • src/a5/platform/src/host/pmu_collector.cpp
  • src/common/platform/include/host/profiling_common/buffer_pool_manager.h
  • src/common/platform/include/host/profiling_common/profiler_base.h
  • src/common/platform/src/host/scope_stats_collector.cpp
  • src/common/platform/src/host/tensor_dump_collector.cpp



@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an RAII rollback mechanism (InitRollbackGuard) to handle initialization failures across various collectors (PmuCollector, ScopeStatsCollector, and TensorDumpCollector), alongside a new release_all_owned method in BufferPoolManager for abort-path cleanup. The review feedback highlights two key improvement opportunities: first, using malloc_shadows_.erase instead of count to atomically check and remove host shadows to prevent potential double-free issues; second, calling manager_.release_all_owned unconditionally in the guard's destructor to ensure host shadows are cleaned up even if the device release callback is empty.
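The first piece of review feedback, preferring erase over count, can be illustrated with a small sketch. It assumes the shadows live in a `std::unordered_set<void*>`; the real type of malloc_shadows_ is not shown on this page, and `free_if_owned`, `ShadowSet`, and the counting callback are hypothetical. `erase(key)` returns the number of elements removed, so the ownership check and the removal happen in one call, and a repeated call for the same pointer finds nothing and skips the free. This is single-call convenience, not thread-level atomicity.

```cpp
#include <cassert>
#include <unordered_set>

// Assumed container type for framework-malloc'd host shadows.
using ShadowSet = std::unordered_set<void*>;

// Frees host via free_fn only if it is still recorded as owned.
// In real code free_fn would typically be std::free; it is a parameter
// here so the demo can count calls without touching the heap.
template <typename FreeFn>
bool free_if_owned(ShadowSet& shadows, void* host, FreeFn free_fn) {
    if (shadows.erase(host) == 0) return false;  // not owned, or already freed
    free_fn(host);
    return true;
}

// Demo: attempts to free the same shadow twice and returns the number of
// times the free callback actually ran.
int frees_for_double_release() {
    static char fake_shadow;  // fake host shadow address
    ShadowSet shadows;
    shadows.insert(&fake_shadow);
    int frees = 0;
    auto counting_free = [&frees](void*) { ++frees; };
    bool first = free_if_owned(shadows, &fake_shadow, counting_free);
    bool second = free_if_owned(shadows, &fake_shadow, counting_free);
    assert(first && !second);
    (void)first;
    (void)second;
    return frees;
}
```

A `count()` followed by a separate free leaves the entry in the set, so a second pass over the same pointer would free it again; folding the check into `erase()` removes that window.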

Comment thread src/common/platform/include/host/profiling_common/buffer_pool_manager.h Outdated
Comment thread src/common/platform/include/host/profiling_common/profiler_base.h Outdated
Plug the latent leak CodeRabbit flagged on hw-native-sys#944: a5 PmuCollector::init,
ScopeStatsCollector::init, TensorDumpCollector::init register paired
device+host buffers in BufferPoolManager via alloc_paired_buffer BEFORE
flipping the initialized_ flag. If a subsequent allocation fails, init()
returns -1; finalize() then early-exits (gated on initialized_ / shm_host_)
and every registered device buffer + framework-malloc'd host shadow leaks.

The pattern existed pre-hw-native-sys#944 (old alloc_single_buffer + manual
register_mapping had the same shape), so this is not a regression of the
unification work — just the cleanup it enables.

Framework changes
-----------------

- BufferPoolManager::release_all_owned(release_fn) [new]: abort-path
  cleanup that releases EVERY framework-tracked dev_ptr (via release_fn)
  and every framework-malloc'd host shadow (via std::free), then clears
  all internal containers. Distinct from release_owned_buffers() because
  this also catches buffers parked in callers' SPSC free_queues (tracked
  via register_mapping but not framework-owned via a queue). Drains
  recycled/done/ready first (just clears — release goes via dev_to_host_
  to avoid double-free) then walks the full mapping table.

- profiling_common::InitRollbackGuard<Manager> [new, profiler_base.h]:
  RAII scope guard for collector init() rollback. Holds a manager
  reference + release_fn + a vector of "extra direct dev_ptrs" the
  collector owns outside the framework (e.g. PMU per-core PmuAicoreRings
  on a5 — plain alloc_cb allocations with no host shadow). On destruction
  without commit(), calls manager.release_all_owned + free_cb on each
  direct ptr. Move-only.

Collector wiring
----------------

- common/scope_stats_collector.cpp init(): construct guard after
  set_memory_context, commit() right before return 0. Catches the shm
  region + ScopeStatsBuffer entries (free_queue and recycled pool).

- common/tensor_dump_collector.cpp init(): same pattern. Catches the
  shm region + per-thread arenas + DumpMetaBuffers (free_queue and
  recycled pool).

- a5/pmu_collector.cpp init(): same pattern + guard.add_direct_ptr(ring)
  for each per-core PmuAicoreRing (those don't go through
  alloc_paired_buffer so the framework doesn't track them — register
  them with the guard explicitly).

Test plan
---------

- a2a3sim ST L1+L2: pass (rollback path inert on success — guard.commit
  short-circuits the destructor).
- a5sim ST L1+L2: pass (same).
- Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim): clean.

Net: +122 lines (5 files touched).
@ChaoWao ChaoWao force-pushed the collectors-init-rollback branch from fbca3d8 to 6583f39 Compare May 31, 2026 12:35
hw-native-sys-bot pushed a commit to hw-native-sys-bot/simpler that referenced this pull request May 31, 2026
…n, consolidate common/, rename platform/src→shared)

Mechanical post-hw-native-sys#944/hw-native-sys#945/hw-native-sys#948 layout cleanup, motivated by an audit of
duplicated and oddly-placed files that the recent unification work left
behind. Four orthogonal changes bundled here because each touches the
same set of CMakeLists; splitting would mean three more rebuild rounds
and three more cmake-include-path edits across the same files.

1. Extract identical aicpu sources to common/platform/{onboard,sim}/aicpu/
------------------------------------------------------------------------

Seven files per backend (14 total) were byte-identical (or differed only
in a one-line @brief arch qualifier) between a2a3 and a5:

  cache_ops.cpp, device_log.cpp, device_time.cpp, device_malloc.cpp,
  orch_so_file.cpp, platform_aicpu_affinity.cpp, spin_hint.h

Moved a2a3's copy to common/, deleted a5's duplicate, and extended each
arch's onboard/aicpu and sim/aicpu CMakeLists COMMON_SOURCES glob to
pick them up from common/platform/{onboard,sim}/aicpu/. The
device_malloc.cpp arch tag in its @brief was the only real content
diff; generalized to "Real Hardware" / "Simulation" without the arch
qualifier. Backfilled a copyright header that was missing on
device_time.cpp (caught by the check-headers hook).

The remaining files in per-arch aicpu/ (kernel.cpp, inner_platform_regs.cpp)
have real arch-specific divergence (register addresses, kernel protocols)
and stay where they are.

2. Flatten profiling_common/ subdir
------------------------------------

src/common/platform/include/host/profiling_common/{buffer_pool_manager,
profiler_base}.h → src/common/platform/include/host/{buffer_pool_manager,
profiler_base}.h. Updated 10 #include sites and the 2 header guards. The
profiling_common:: C++ namespace stays — file path and namespace don't
have to match.

3. Consolidate small src/common subdirs
----------------------------------------

- src/common/device_comm/device_arena.h → src/common/utils/device_arena.h.
  The file is a generic bump-arena utility, not a comm primitive; the
  enclosing dir name was misleading. Updated 10 #include sites
  "device_arena.h" → "utils/device_arena.h" and dropped the
  common/device_comm entry from 8 CMakeLists (replaced with common
  since utils/ resolves there).

- src/common/sim_context/ → src/common/platform/sim/sim_context/. The
  dir is sim-only infrastructure (CPU sim context for CANN intrinsic
  emulation), so it belongs next to the other common/platform/sim/
  shared sim infrastructure. Updated:
    * the dir's own CMakeLists relative path to log/include;
    * simpler_setup/runtime_compiler.py::compile_sim_context source
      path;
    * 4 sim-host CMakeLists references;
    * a small handful of docs that named the old path.

4. Rename platform/src → platform/shared
------------------------------------------

Per-arch src/{arch}/platform/src/ was confusingly nested inside the
top-level src/ directory and read as "src/src" in many paths. Renamed
to shared/ across all 3 trees (a2a3, a5, common), matching its actual
semantic ("shared between onboard and sim within one arch"). Updated 21
files that referenced the old path: CMakeLists, host headers, docs, one
test file, and the src/{arch}/docs/platform.md map.

Test plan
---------

- Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim)
  + libcpu_sim_context.so + aicpu and aicore artifacts: clean.
- CI will run the full ST + UT suite.

Net: ~30 file renames, ~14 files extracted to common, +0 / -0
behavioral changes (pure layout).
@ChaoWao ChaoWao merged commit 293e88a into hw-native-sys:main Jun 1, 2026
16 checks passed
@ChaoWao ChaoWao deleted the collectors-init-rollback branch June 1, 2026 00:50

2 participants