Skip to content

Refactor: src/ layout cleanup (extract aicpu + rename platform/src→shared + consolidate common/)#950

Open
hw-native-sys-bot wants to merge 1 commit into
hw-native-sys:mainfrom
hw-native-sys-bot:src-layout-cleanup
Open

Refactor: src/ layout cleanup (extract aicpu + rename platform/src→shared + consolidate common/)#950
hw-native-sys-bot wants to merge 1 commit into
hw-native-sys:mainfrom
hw-native-sys-bot:src-layout-cleanup

Conversation

@hw-native-sys-bot
Copy link
Copy Markdown
Collaborator

Summary

Post-#944/#945/#948 layout cleanup. Audit of the post-unification tree turned up several rough edges; this PR addresses them in one pass because each change touches the same set of CMakeLists — splitting would mean three more cmake-include-path edit rounds.

Four orthogonal cleanups

1. Extract identical aicpu sources to common/platform/{onboard,sim}/aicpu/

Seven files per backend (14 total) were byte-identical (or differed only in a one-line @brief arch qualifier) between a2a3 and a5:

  • cache_ops.cpp
  • device_log.cpp
  • device_time.cpp
  • device_malloc.cpp (only diff was (a2a3) vs (a5) in the @brief)
  • orch_so_file.cpp
  • platform_aicpu_affinity.cpp
  • spin_hint.h

Moved a2a3's copy to common/, deleted a5's duplicate. Extended each arch's onboard/aicpu and sim/aicpu CMakeLists COMMON_SOURCES glob to pick the new dir up. Generalized the device_malloc.cpp @brief to "Real Hardware" / "Simulation" without the arch qualifier. Backfilled a missing copyright header on device_time.cpp (caught by the check-headers hook on the move).

The remaining files in per-arch aicpu/ (kernel.cpp, inner_platform_regs.cpp) have real arch-specific divergence (register addresses, kernel protocols) and stay per-arch.

2. Flatten profiling_common/ subdir

src/common/platform/include/host/profiling_common/{buffer_pool_manager,profiler_base}.hsrc/common/platform/include/host/{buffer_pool_manager,profiler_base}.h. Updated 10 #include sites and the 2 header guards. The profiling_common:: C++ namespace stays — file path and namespace don't have to match.

3. Consolidate small src/common/ subdirs

  • src/common/device_comm/device_arena.hsrc/common/utils/device_arena.h. The file is a generic bump-arena utility, not a comm primitive; the enclosing dir name was misleading. Updated 10 #include sites "device_arena.h""utils/device_arena.h" and dropped the common/device_comm entry from 8 CMakeLists (replaced with common, since utils/ resolves there).
  • src/common/sim_context/src/common/platform/sim/sim_context/. The dir is sim-only infrastructure (CPU sim context for CANN intrinsic emulation); it belongs next to the other common/platform/sim/ shared sim infrastructure. Updated the dir's own CMakeLists, simpler_setup/runtime_compiler.py::compile_sim_context source path, 4 sim-host CMakeLists references, and a small handful of docs that named the old path.

4. Rename platform/src/platform/shared/

Per-arch src/{arch}/platform/src/ was confusingly nested inside the top-level src/ directory and read as "src/src" in many paths. Renamed to shared/ across all 3 trees (a2a3, a5, common), matching its actual semantic ("shared between onboard and sim within one arch"). Updated 21 files that referenced the old path: CMakeLists, host headers, docs, one test file, and the src/{arch}/docs/platform.md map. The developer-guide directory diagram is updated to match.

What's NOT changed

  • common/platform/include/aicore/aicore.h (a single file in its own dir) stays put — its include form "aicore/aicore.h" is consistent with "aicpu/..." and "host/...", and removing the namespacing would force a rename across 10+ files for no real gain.
  • Files where the per-arch divergence is real (tensor_dump.h pad sizes, platform_regs.h reg addresses, inner_platform_regs.cpp, kernel.cpp, inner_kernel.h) stay per-arch.

Test plan

  • Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim) + libcpu_sim_context.so + aicpu and aicore artifacts: clean.
  • CI green

Net: ~30 file renames, ~14 files extracted to common, +0 / -0 behavioral changes (pure layout).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 31, 2026

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 607dc752-c75b-4f1a-b50e-65466da47dd5

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR reorganizes the platform source code structure by consolidating shared implementations into a unified platform/shared/ directory across architectures, unifies profiler framework headers into a common location, updates build configurations to reflect the new layout, and removes redundant arch-specific device implementations.

Changes

Platform refactoring and header consolidation

Layer / File(s) Summary
Profiler framework header reorganization
src/common/platform/include/host/buffer_pool_manager.h, src/common/platform/include/host/profiler_base.h, src/common/platform/include/host/scope_stats_collector.h, src/common/platform/include/host/tensor_dump_collector.h, src/common/platform/include/common/scope_stats.h
Header guard macros in buffer_pool_manager.h and profiler_base.h are renamed to reflect the new unified path (removing PROFILING_COMMON from macro names). profiler_base.h updates its internal include from host/profiling_common/buffer_pool_manager.h to host/buffer_pool_manager.h. Collector headers update internal comments/references for profiler_base.h sources.
Update collector headers to use new profiler paths
src/a2a3/platform/include/host/dep_gen_collector.h, src/a2a3/platform/include/host/l2_swimlane_collector.h, src/a2a3/platform/include/host/pmu_collector.h, src/a5/platform/include/host/l2_swimlane_collector.h, src/a5/platform/include/host/pmu_collector.h, src/a2a3/platform/include/common/dep_gen.h
All collector headers and wiring headers (PmuCollector, L2SwimlaneCollector, ScopeStatsCollector, TensorDumpCollector, DepGenCollector) update their #include directives from host/profiling_common/profiler_base.h to host/profiler_base.h and documentation references from old profiling_common paths to the new shared host paths.
CMakeLists.txt refactoring for shared platform sources
src/a2a3/platform/onboard/aicpu/CMakeLists.txt, src/a2a3/platform/onboard/host/CMakeLists.txt, src/a2a3/platform/sim/aicpu/CMakeLists.txt, src/a2a3/platform/sim/host/CMakeLists.txt, src/a5/platform/onboard/aicpu/CMakeLists.txt, src/a5/platform/onboard/host/CMakeLists.txt, src/a5/platform/sim/aicpu/CMakeLists.txt, src/a5/platform/sim/host/CMakeLists.txt, src/common/platform/sim/sim_context/CMakeLists.txt, tests/ut/cpp/CMakeLists.txt, simpler_setup/runtime_compiler.py
Build configurations update source glob patterns from platform/src/aicpu to platform/shared/aicpu and common/platform/src/ to common/platform/shared/ locations. Include directories are adjusted to point to shared/host and new common paths. AICPU kernel includes add cross-arch header paths (e.g., common/platform/sim/aicpu for spin_hint.h). Sim context CMakeLists is moved to src/common/platform/sim/sim_context/CMakeLists.txt and its logging include path corrected.
Documentation updates for new paths and header locations
docs/developer-guide.md, docs/dfx/l2-swimlane-profiling.md, docs/dfx/pmu-profiling.md, docs/dfx/scope-stats.md, docs/dfx/tensor-dump.md, docs/hardware/cache-coherency.md, docs/logging.md, docs/profiling-framework.md, src/a2a3/docs/platform.md, src/a2a3/platform/docs/tpush-tpop-sim.md, src/a5/docs/platform.md, src/a5/platform/docs/tpush-tpop-sim.md, tests/st/a2a3/tensormap_and_ringbuffer/dfx/pmu/test_pmu.py, simpler_setup/tools/sched_overhead_analysis.py
All documentation and docstrings are updated to reflect the reorganized source paths. Shared source locations change from platform/src/ to platform/shared/. Profiler framework header references change from per-arch profiling_common/ paths to the unified common/platform/include/host/ location. Sim context paths update from src/common/sim_context/ to src/common/platform/sim/sim_context/.
Runtime header include path updates for device_arena
src/a2a3/platform/onboard/host/device_runner.h, src/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.h, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h, src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.h, src/a5/platform/onboard/host/device_runner.h, src/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp, src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h, src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h, src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.h, src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h, src/a5/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.h, src/common/platform/onboard/host/device_runner_base.h, src/common/platform/sim/host/device_runner_base.h
All runtime and device runner headers consolidate their device_arena includes from device_arena.h or local includes to the standardized utils/device_arena.h path, ensuring consistent import patterns across the codebase.
Device implementation consolidation and removal
src/common/platform/onboard/aicpu/device_malloc.cpp, src/common/platform/onboard/aicpu/device_time.cpp, src/common/platform/sim/aicpu/device_malloc.cpp
Shared device_time.cpp adds get_sys_cnt_aicpu() function reading ARM cntvct_el0 timer; its documentation string is updated. Shared device_malloc.cpp doc comments remove arch-specific qualifiers. Arch-specific device implementations (device_malloc, device_time, device_log, cache_ops, orch_so_file, platform_aicpu_affinity, spin_hint) are removed from a2a3 onboard/sim and a5 onboard/sim as they are now consolidated into shared or have been superseded.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • hw-native-sys/simpler#945: Main PR's updates to scope_stats/tensor_dump collector paths and profiler framework header locations align with the retrieved PR's refactor that moved these collectors into the common directory and removed the old profiling_common per-arch headers.
  • hw-native-sys/simpler#944: Main PR consolidates profiler framework headers into a unified src/common/platform/include/host/ location, directly complementing the retrieved PR's profiling-framework unification effort for ProfilerBase and BufferPoolManager classes.

Poem

🐰 Platform paths scattered wide, now gathered in one place,
Profiler headers unified, a cleaner shared-source space,
Device functions consolidated, arch walls tumbling down,
Refactored with precision, the codebase wears new gown.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main refactoring work: extracting duplicated aicpu sources to common, renaming platform/src to platform/shared, and consolidating small src/common/ subdirectories.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, detailing all four orthogonal cleanup operations, affected files, testing approach, and what remains unchanged.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request consolidates and reorganizes platform-specific code and documentation, moving shared profiling framework headers, device memory allocation, logging, and other utilities from architecture-specific directories (such as src/a2a3/ and src/a5/) into a unified src/common/ directory structure. The review feedback points out several outdated file paths and links in both code comments and documentation files (including dep_gen.h, scope_stats.h, and various Markdown files under docs/) that still reference the old architecture-specific paths instead of the newly consolidated shared locations. Additionally, a typo in docs/dfx/scope-stats.md was identified where the device-side scope stats collector file is incorrectly named scope_stats_collector.cpp instead of scope_stats_collector_aicpu.cpp.

*
* Streaming buffer design mirrors PMU / L2Swimlane / TensorDump (single source of
* algorithmic truth in src/a2a3/platform/include/host/profiling_common/profiler_base.h):
* algorithmic truth in src/a2a3/platform/include/host/profiler_base.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The profiler_base.h header is located in the shared src/common/platform/include/host/ directory, not under src/a2a3/. The comment path should be updated to reflect the correct location.

 * algorithmic truth in src/common/platform/include/host/profiler_base.h):

* fixed-capacity buffers, mirroring PMU / dep_gen / tensor_dump / l2_swimlane (the
* single source of mgmt-loop truth is
* src/a2a3/platform/include/host/profiling_common/profiler_base.h):
* src/a2a3/platform/include/host/profiler_base.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The profiler_base.h header is located in the shared src/common/platform/include/host/ directory, not under src/a2a3/. The comment path should be updated to reflect the correct location.

 * src/common/platform/include/host/profiler_base.h):

Comment on lines +5 to +7
framework headers under `src/common/platform/include/host/`
([a2a3](../src/common/platform/include/host/),
[a5](../src/common/platform/include/host/)) — the two copies
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since the profiling framework headers have been consolidated into a single shared directory (src/common/platform/include/host/), they are no longer maintained as separate per-architecture copies. The duplicate links for a2a3 and a5 should be replaced with a single link to common, and the surrounding text (lines 4 and 8-9) should be updated to remove references to 'two copies' and 'each arch is free to evolve independently'.

space so the host can read device buffers directly.
`L2SwimlaneCollector` runs two background threads on top of a
[`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h):
[`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The buffer_pool_manager.h header is located in the shared src/common/platform/include/host/ directory, not under src/a2a3/. The link should be updated to point to the correct path.

[`L2SwimlaneCollector`](../src/a2a3/platform/include/host/l2_swimlane_collector.h)
on a2a3 inherits from
[`profiling_common::ProfilerBase<L2SwimlaneCollector, L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h):
[`profiling_common::ProfilerBase<L2SwimlaneCollector, L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiler_base.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The profiler_base.h header is located in the shared src/common/platform/include/host/ directory, not under src/a2a3/. The link should be updated to point to the correct path.

Comment thread docs/dfx/tensor-dump.md
[`TensorDumpCollector`](../src/a2a3/platform/include/host/tensor_dump_collector.h)
on a2a3 inherits from
[`profiling_common::ProfilerBase<TensorDumpCollector, DumpModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h):
[`profiling_common::ProfilerBase<TensorDumpCollector, DumpModule>`](../src/a2a3/platform/include/host/profiler_base.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The profiler_base.h header is located in the shared src/common/platform/include/host/ directory, not under src/a2a3/. The link should be updated to point to the correct path.

Comment thread docs/dfx/tensor-dump.md
Comment on lines +531 to +533
([`profiling_common::ProfilerBase`](../src/a5/platform/include/host/profiler_base.h))
as a2a3 and parameterizes
[`BufferPoolManager`](../src/a5/platform/include/host/profiling_common/buffer_pool_manager.h)
[`BufferPoolManager`](../src/a5/platform/include/host/buffer_pool_manager.h)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The profiler_base.h and buffer_pool_manager.h headers are located in the shared src/common/platform/include/host/ directory, not under src/a5/. The links should be updated to point to the correct path.

Comment thread docs/logging.md
| ------------ | ------------------- | ------- |
| `libsimpler_log.so` (host) | `src/common/log/unified_log_host.cpp` | `HostLogger` → stderr |
| AICPU binary (device) | `src/{arch}/platform/src/aicpu/unified_log_device.cpp` | `dev_vlog_*` → backend |
| AICPU binary (device) | `src/{arch}/platform/shared/aicpu/unified_log_device.cpp` | `dev_vlog_*` → backend |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The unified_log_device.cpp file is located in the shared src/common/platform/shared/aicpu/ directory, not under the per-architecture src/{arch}/ directory. The path should be updated to point to the correct location.

Comment thread docs/logging.md
| Change the sim AICPU output format | `src/{arch}/platform/sim/aicpu/device_log.cpp::dev_vlog_*` |
| Change the onboard AICPU CANN dlog tagging | `src/{arch}/platform/onboard/aicpu/device_log.cpp::dev_vlog_*` |
| Add a new C ABI entry point (e.g. dynamic config push) | `src/common/log/include/common/unified_log.h` + `unified_log_host.cpp` + `src/{arch}/platform/src/aicpu/unified_log_device.cpp` |
| Add a new C ABI entry point (e.g. dynamic config push) | `src/common/log/include/common/unified_log.h` + `unified_log_host.cpp` + `src/{arch}/platform/shared/aicpu/unified_log_device.cpp` |
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The unified_log_device.cpp file is located in the shared src/common/platform/shared/aicpu/ directory, not under the per-architecture src/{arch}/ directory. The path should be updated to point to the correct location.

Comment thread docs/dfx/scope-stats.md
Pure-value API declarations. No runtime types cross this boundary.

platform/src/aicpu/scope_stats_collector.cpp
platform/shared/aicpu/scope_stats_collector.cpp
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The device-side scope stats collector file is named scope_stats_collector_aicpu.cpp, not scope_stats_collector.cpp. The path should be updated to reflect the correct file name.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/ut/cpp/CMakeLists.txt (2)

254-259: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Fix stale src/common/device_comm include directories (point to src/common/utils).

tests/ut/cpp/CMakeLists.txt still adds ${CMAKE_SOURCE_DIR}/../../../src/common/device_comm (e.g., add_common_utils_test, and also in a2a3_rt_objs / a5_rt_objs include dirs), but src/common/device_comm no longer exists while device_arena.h is now at src/common/utils/device_arena.h. The UTs include it as #include "device_arena.h", so update the include dirs to src/common/utils (or update the includes to match the new utils/... prefix).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/ut/cpp/CMakeLists.txt` around lines 254 - 259, The CMake include path
for device_arena.h is stale: update occurrences that add
"${CMAKE_SOURCE_DIR}/../../../src/common/device_comm" (seen in targets like
add_common_utils_test, a2a3_rt_objs, a5_rt_objs) to point to
"${CMAKE_SOURCE_DIR}/../../../src/common/utils" so the tests can find
device_arena.h (or alternatively change the test source includes to use
"utils/device_arena.h"); modify the include_directories /
target_include_directories entries that reference device_comm to use the utils
directory instead.

315-318: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

test_orch_so_file references a stale deleted a2a3 source path

tests/ut/cpp/CMakeLists.txt still builds test_orch_so_file from ${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp, but that file no longer exists; orch_so_file.cpp now lives under src/common/platform/onboard/aicpu/, so CMake/configure/build will fail.

🐛 Proposed fix (switch to consolidated common path)
 add_executable(test_orch_so_file
     common/test_orch_so_file.cpp
-    ${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp
+    ${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp
 )
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/ut/cpp/CMakeLists.txt` around lines 315 - 318, The CMake target
test_orch_so_file is referencing a deleted source path under src/a2a3; update
the add_executable source list so that test_orch_so_file uses the relocated file
at src/common/platform/onboard/aicpu/orch_so_file.cpp (replace the old
${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp
entry with the consolidated
${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp
path) so CMake can find orch_so_file.cpp when configuring and building.
🧹 Nitpick comments (1)
src/a5/platform/onboard/host/CMakeLists.txt (1)

24-26: 💤 Low value

Duplicate common include path.

Line 26 re-adds ${CMAKE_CURRENT_SOURCE_DIR}/../../../../common, which is already appended on line 24. The a2a3 onboard-host sibling lists it only once. Harmless (CMake de-duplicates include dirs) but redundant — drop one for clarity.

♻️ Remove the redundant entry
 list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common")
 list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include")
-list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common")
 list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/host")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/a5/platform/onboard/host/CMakeLists.txt` around lines 24 - 26, The
CMakeLists uses CMAKE_CUSTOM_INCLUDE_DIRS and CMAKE_CURRENT_SOURCE_DIR to append
include paths but duplicates "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common"
(first and third list(APPEND) calls); remove the redundant third list(APPEND
CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") so
the common include is only appended once and leave the
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include" entry intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/dfx/l2-swimlane-profiling.md`:
- Line 362: Several documentation links still point to arch-specific profiling
headers (e.g., the link referencing BufferPoolManager<L2SwimlaneModule>) and
must be retargeted to the consolidated common header location; find the markdown
occurrences that link to architecture-specific include headers (the
BufferPoolManager<L2SwimlaneModule> link and the other similar header links) and
update their URLs to point to the consolidated common platform include directory
(the centralized host profiling headers) so the docs reference the shared header
location instead of the old arch-specific paths.

In `@docs/dfx/pmu-profiling.md`:
- Line 296: Update the stale documentation links that reference arch-specific
host headers so they point to the consolidated common headers; specifically
change the link target for profiling_common::ProfilerBase<PmuCollector,
PmuModule> and the other two links mentioned (the occurrences around lines
referencing the PMU framework) to use the new path prefix
src/common/platform/include/host/... instead of the old arch-specific host
header paths, ensuring each markdown link target is replaced with the
consolidated header location so the docs resolve correctly.

In `@docs/dfx/tensor-dump.md`:
- Line 364: Update the TensorDump framework include links that currently point
to arch-specific host headers to instead use the consolidated common headers;
specifically replace occurrences like ../src/a2a3/platform/include/host/... with
../src/common/platform/include/host/... for the referenced symbols/links such as
BufferPoolManager<DumpModule> and the other TensorDump host-header links in this
document so all four occurrences use src/common/platform/include/host/...

In `@docs/profiling-framework.md`:
- Around line 5-8: Update the paragraph that currently claims per-arch header
copies and uses two identical links so it consistently states that headers are
consolidated under the common host include directory and not duplicated
per-arch; replace the two identical links that point to the common location with
a single clear reference to the consolidated host headers and change every stale
occurrence of the legacy per-arch token (e.g. any 'src/a2a3/' or similar
per-arch paths) at the noted locations to the consolidated
'src/common/platform/include/host/' token, and reword the sentence that mentions
"the two copies are kept byte-for-byte" to reflect a single authoritative copy
so all references and links are consistent.

---

Outside diff comments:
In `@tests/ut/cpp/CMakeLists.txt`:
- Around line 254-259: The CMake include path for device_arena.h is stale:
update occurrences that add
"${CMAKE_SOURCE_DIR}/../../../src/common/device_comm" (seen in targets like
add_common_utils_test, a2a3_rt_objs, a5_rt_objs) to point to
"${CMAKE_SOURCE_DIR}/../../../src/common/utils" so the tests can find
device_arena.h (or alternatively change the test source includes to use
"utils/device_arena.h"); modify the include_directories /
target_include_directories entries that reference device_comm to use the utils
directory instead.
- Around line 315-318: The CMake target test_orch_so_file is referencing a
deleted source path under src/a2a3; update the add_executable source list so
that test_orch_so_file uses the relocated file at
src/common/platform/onboard/aicpu/orch_so_file.cpp (replace the old
${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp
entry with the consolidated
${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp
path) so CMake can find orch_so_file.cpp when configuring and building.

---

Nitpick comments:
In `@src/a5/platform/onboard/host/CMakeLists.txt`:
- Around line 24-26: The CMakeLists uses CMAKE_CUSTOM_INCLUDE_DIRS and
CMAKE_CURRENT_SOURCE_DIR to append include paths but duplicates
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common" (first and third list(APPEND)
calls); remove the redundant third list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") so the common include is only
appended once and leave the
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include" entry intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a49dc42e-8a7b-4661-8534-c74a4b47483f

📥 Commits

Reviewing files that changed from the base of the PR and between 0e2e7cd and a4a2473.

📒 Files selected for processing (102)
  • docs/developer-guide.md
  • docs/dfx/l2-swimlane-profiling.md
  • docs/dfx/pmu-profiling.md
  • docs/dfx/scope-stats.md
  • docs/dfx/tensor-dump.md
  • docs/hardware/cache-coherency.md
  • docs/logging.md
  • docs/profiling-framework.md
  • simpler_setup/runtime_compiler.py
  • simpler_setup/tools/sched_overhead_analysis.py
  • src/a2a3/docs/platform.md
  • src/a2a3/platform/docs/tpush-tpop-sim.md
  • src/a2a3/platform/include/aicpu/platform_regs.h
  • src/a2a3/platform/include/common/dep_gen.h
  • src/a2a3/platform/include/host/dep_gen_collector.h
  • src/a2a3/platform/include/host/l2_swimlane_collector.h
  • src/a2a3/platform/include/host/pmu_collector.h
  • src/a2a3/platform/onboard/aicpu/CMakeLists.txt
  • src/a2a3/platform/onboard/aicpu/device_malloc.cpp
  • src/a2a3/platform/onboard/aicpu/device_time.cpp
  • src/a2a3/platform/onboard/host/CMakeLists.txt
  • src/a2a3/platform/onboard/host/device_runner.h
  • src/a2a3/platform/shared/aicpu/dep_gen_collector_aicpu.cpp
  • src/a2a3/platform/shared/aicpu/l2_swimlane_collector_aicpu.cpp
  • src/a2a3/platform/shared/aicpu/platform_regs.cpp
  • src/a2a3/platform/shared/aicpu/pmu_collector_aicpu.cpp
  • src/a2a3/platform/shared/host/dep_gen_collector.cpp
  • src/a2a3/platform/shared/host/l2_swimlane_collector.cpp
  • src/a2a3/platform/shared/host/pmu_collector.cpp
  • src/a2a3/platform/sim/aicpu/CMakeLists.txt
  • src/a2a3/platform/sim/aicpu/device_malloc.cpp
  • src/a2a3/platform/sim/host/CMakeLists.txt
  • src/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.h
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.h
  • src/a5/docs/platform.md
  • src/a5/platform/docs/tpush-tpop-sim.md
  • src/a5/platform/include/host/l2_swimlane_collector.h
  • src/a5/platform/include/host/pmu_collector.h
  • src/a5/platform/onboard/aicpu/CMakeLists.txt
  • src/a5/platform/onboard/aicpu/cache_ops.cpp
  • src/a5/platform/onboard/aicpu/device_log.cpp
  • src/a5/platform/onboard/aicpu/device_time.cpp
  • src/a5/platform/onboard/aicpu/orch_so_file.cpp
  • src/a5/platform/onboard/aicpu/platform_aicpu_affinity.cpp
  • src/a5/platform/onboard/aicpu/spin_hint.h
  • src/a5/platform/onboard/host/CMakeLists.txt
  • src/a5/platform/onboard/host/device_runner.h
  • src/a5/platform/shared/aicpu/l2_swimlane_collector_aicpu.cpp
  • src/a5/platform/shared/aicpu/platform_regs.cpp
  • src/a5/platform/shared/aicpu/pmu_collector_aicpu.cpp
  • src/a5/platform/shared/host/l2_swimlane_collector.cpp
  • src/a5/platform/shared/host/pmu_collector.cpp
  • src/a5/platform/sim/aicpu/CMakeLists.txt
  • src/a5/platform/sim/aicpu/device_log.cpp
  • src/a5/platform/sim/aicpu/device_time.cpp
  • src/a5/platform/sim/aicpu/orch_so_file.cpp
  • src/a5/platform/sim/aicpu/platform_aicpu_affinity.cpp
  • src/a5/platform/sim/aicpu/spin_hint.h
  • src/a5/platform/sim/host/CMakeLists.txt
  • src/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cpp
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.h
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.h
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.h
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.h
  • src/common/platform/include/common/scope_stats.h
  • src/common/platform/include/host/buffer_pool_manager.h
  • src/common/platform/include/host/profiler_base.h
  • src/common/platform/include/host/scope_stats_collector.h
  • src/common/platform/include/host/tensor_dump_collector.h
  • src/common/platform/onboard/aicpu/cache_ops.cpp
  • src/common/platform/onboard/aicpu/device_log.cpp
  • src/common/platform/onboard/aicpu/device_malloc.cpp
  • src/common/platform/onboard/aicpu/device_time.cpp
  • src/common/platform/onboard/aicpu/orch_so_file.cpp
  • src/common/platform/onboard/aicpu/platform_aicpu_affinity.cpp
  • src/common/platform/onboard/aicpu/spin_hint.h
  • src/common/platform/onboard/host/device_runner_base.h
  • src/common/platform/shared/aicpu/scope_stats_collector_aicpu.cpp
  • src/common/platform/shared/aicpu/tensor_dump_aicpu.cpp
  • src/common/platform/shared/aicpu/unified_log_device.cpp
  • src/common/platform/shared/host/platform_compile_info.cpp
  • src/common/platform/shared/host/scope_stats_collector.cpp
  • src/common/platform/shared/host/tensor_dump_collector.cpp
  • src/common/platform/sim/aicpu/cache_ops.cpp
  • src/common/platform/sim/aicpu/device_log.cpp
  • src/common/platform/sim/aicpu/device_malloc.cpp
  • src/common/platform/sim/aicpu/device_time.cpp
  • src/common/platform/sim/aicpu/orch_so_file.cpp
  • src/common/platform/sim/aicpu/platform_aicpu_affinity.cpp
  • src/common/platform/sim/aicpu/spin_hint.h
  • src/common/platform/sim/host/device_runner_base.h
  • src/common/platform/sim/sim_context/CMakeLists.txt
  • src/common/platform/sim/sim_context/cpu_sim_context.cpp
  • src/common/platform/sim/sim_context/cpu_sim_context.h
  • src/common/utils/device_arena.h
  • tests/st/a2a3/tensormap_and_ringbuffer/dfx/pmu/test_pmu.py
  • tests/ut/cpp/CMakeLists.txt
💤 Files with no reviewable changes (14)
  • src/a5/platform/sim/aicpu/device_time.cpp
  • src/a2a3/platform/sim/aicpu/device_malloc.cpp
  • src/a5/platform/onboard/aicpu/cache_ops.cpp
  • src/a2a3/platform/onboard/aicpu/device_malloc.cpp
  • src/a5/platform/sim/aicpu/orch_so_file.cpp
  • src/a5/platform/onboard/aicpu/spin_hint.h
  • src/a2a3/platform/onboard/aicpu/device_time.cpp
  • src/a5/platform/sim/aicpu/spin_hint.h
  • src/a5/platform/onboard/aicpu/device_time.cpp
  • src/a5/platform/onboard/aicpu/device_log.cpp
  • src/a5/platform/onboard/aicpu/orch_so_file.cpp
  • src/a5/platform/onboard/aicpu/platform_aicpu_affinity.cpp
  • src/a5/platform/sim/aicpu/device_log.cpp
  • src/a5/platform/sim/aicpu/platform_aicpu_affinity.cpp

space so the host can read device buffers directly.
`L2SwimlaneCollector` runs two background threads on top of a
[`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h):
[`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update these links to the consolidated common header path.

Line 362, Line 435, Line 563, and Line 565 still reference arch-specific header locations, but this PR consolidates profiling headers under src/common/platform/include/host/. Please retarget these links to avoid stale/broken documentation paths.

Also applies to: 435-435, 563-563, 565-565

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dfx/l2-swimlane-profiling.md` at line 362, Several documentation links
still point to arch-specific profiling headers (e.g., the link referencing
BufferPoolManager<L2SwimlaneModule>) and must be retargeted to the consolidated
common header location; find the markdown occurrences that link to
architecture-specific include headers (the BufferPoolManager<L2SwimlaneModule>
link and the other similar header links) and update their URLs to point to the
consolidated common platform include directory (the centralized host profiling
headers) so the docs reference the shared header location instead of the old
arch-specific paths.

Comment thread docs/dfx/pmu-profiling.md
[`PmuCollector`](../src/a2a3/platform/include/host/pmu_collector.h)
inherits from
[`profiling_common::ProfilerBase<PmuCollector, PmuModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h):
[`profiling_common::ProfilerBase<PmuCollector, PmuModule>`](../src/a2a3/platform/include/host/profiler_base.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale PMU framework links after header consolidation.

Line 296, Line 464, and Line 466 still point to arch-specific host header paths. These should link to src/common/platform/include/host/... to match the refactor and keep docs navigable.

Also applies to: 464-464, 466-466

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dfx/pmu-profiling.md` at line 296, Update the stale documentation links
that reference arch-specific host headers so they point to the consolidated
common headers; specifically change the link target for
profiling_common::ProfilerBase<PmuCollector, PmuModule> and the other two links
mentioned (the occurrences around lines referencing the PMU framework) to use
the new path prefix src/common/platform/include/host/... instead of the old
arch-specific host header paths, ensuring each markdown link target is replaced
with the consolidated header location so the docs resolve correctly.

Comment thread docs/dfx/tensor-dump.md
space so the host can read device buffers directly.
`TensorDumpCollector` runs two background threads on top of a
[`BufferPoolManager<DumpModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h):
[`BufferPoolManager<DumpModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Retarget TensorDump framework links to common include headers.

Line 364, Line 426, Line 531, and Line 533 still use arch-specific host-header paths. Update these links to src/common/platform/include/host/... so they reflect the consolidated framework location.

Also applies to: 426-426, 531-531, 533-533

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dfx/tensor-dump.md` at line 364, Update the TensorDump framework include
links that currently point to arch-specific host headers to instead use the
consolidated common headers; specifically replace occurrences like
../src/a2a3/platform/include/host/... with
../src/common/platform/include/host/... for the referenced symbols/links such as
BufferPoolManager<DumpModule> and the other TensorDump host-header links in this
document so all four occurrences use src/common/platform/include/host/...

Comment on lines +5 to 8
framework headers under `src/common/platform/include/host/`
([a2a3](../src/common/platform/include/host/),
[a5](../src/common/platform/include/host/)) — the two copies
are kept byte-for-byte structurally aligned so reviewers can diff them, but
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Resolve contradiction and stale links in framework header location docs.

Line 5–Line 8 says there are per-arch header copies, but both links point to the same common directory. Also, Line 68, Line 79, Line 100, Line 115, and Line 135 still reference src/a2a3/... paths. Please make this section consistently describe and link to the consolidated src/common/platform/include/host/ location.

Also applies to: 68-68, 79-79, 100-100, 115-115, 135-135

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/profiling-framework.md` around lines 5 - 8, Update the paragraph that
currently claims per-arch header copies and uses two identical links so it
consistently states that headers are consolidated under the common host include
directory and not duplicated per-arch; replace the two identical links that
point to the common location with a single clear reference to the consolidated
host headers and change every stale occurrence of the legacy per-arch token
(e.g. any 'src/a2a3/' or similar per-arch paths) at the noted locations to the
consolidated 'src/common/platform/include/host/' token, and reword the sentence
that mentions "the two copies are kept byte-for-byte" to reflect a single
authoritative copy so all references and links are consistent.

…n, consolidate common/, rename platform/src→shared)

Mechanical post-hw-native-sys#944/hw-native-sys#945/hw-native-sys#948 layout cleanup, motivated by an audit of
duplicated and oddly-placed files that the recent unification work left
behind. Four orthogonal changes bundled here because each touches the
same set of CMakeLists; splitting would mean three more rebuild rounds
and three more cmake-include-path edits across the same files.

1. Extract identical aicpu sources to common/platform/{onboard,sim}/aicpu/
------------------------------------------------------------------------

Seven files per backend (14 total) were byte-identical (or differed only
in a one-line @brief arch qualifier) between a2a3 and a5:

  cache_ops.cpp, device_log.cpp, device_time.cpp, device_malloc.cpp,
  orch_so_file.cpp, platform_aicpu_affinity.cpp, spin_hint.h

Moved a2a3's copy to common/, deleted a5's duplicate, and extended each
arch's onboard/aicpu and sim/aicpu CMakeLists COMMON_SOURCES glob to
pick them up from common/platform/{onboard,sim}/aicpu/. The
device_malloc.cpp arch tag in its @brief was the only real content
diff; generalized to "Real Hardware" / "Simulation" without the arch
qualifier. Backfilled a copyright header that was missing on
device_time.cpp (caught by the check-headers hook).

The remaining files in per-arch aicpu/ (kernel.cpp, inner_platform_regs.cpp)
have real arch-specific divergence (register addresses, kernel protocols)
and stay where they are.

2. Flatten profiling_common/ subdir
------------------------------------

src/common/platform/include/host/profiling_common/{buffer_pool_manager,
profiler_base}.h → src/common/platform/include/host/{buffer_pool_manager,
profiler_base}.h. Updated 10 #include sites and the 2 header guards. The
profiling_common:: C++ namespace stays — file path and namespace don't
have to match.

3. Consolidate small src/common subdirs
----------------------------------------

- src/common/device_comm/device_arena.h → src/common/utils/device_arena.h.
  The file is a generic bump-arena utility, not a comm primitive; the
  enclosing dir name was misleading. Updated 10 #include sites
  "device_arena.h" → "utils/device_arena.h" and dropped the
  common/device_comm entry from 8 CMakeLists (replaced with common
  since utils/ resolves there).

- src/common/sim_context/ → src/common/platform/sim/sim_context/. The
  dir is sim-only infrastructure (CPU sim context for CANN intrinsic
  emulation), so it belongs next to the other common/platform/sim/
  shared sim infrastructure. Updated:
    * the dir's own CMakeLists relative path to log/include;
    * simpler_setup/runtime_compiler.py::compile_sim_context source
      path;
    * 4 sim-host CMakeLists references;
    * a small handful of docs that named the old path.

4. Rename platform/src → platform/shared
------------------------------------------

Per-arch src/{arch}/platform/src/ was confusingly nested inside the
top-level src/ directory and read as "src/src" in many paths. Renamed
to shared/ across all 3 trees (a2a3, a5, common), matching its actual
semantic ("shared between onboard and sim within one arch"). Updated 21
files that referenced the old path: CMakeLists, host headers, docs, one
test file, and the src/{arch}/docs/platform.md map.

Test plan
---------

- Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim)
  + libcpu_sim_context.so + aicpu and aicore artifacts: clean.
- CI will run the full ST + UT suite.

Net: ~30 file renames, ~14 files extracted to common, +0 / -0
behavioral changes (pure layout).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants