Refactor: src/ layout cleanup (extract aicpu + rename platform/src→shared + consolidate common/)#950
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThis PR reorganizes the platform source code structure by consolidating shared implementations into a unified ChangesPlatform refactoring and header consolidation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request consolidates and reorganizes platform-specific code and documentation, moving shared profiling framework headers, device memory allocation, logging, and other utilities from architecture-specific directories (such as src/a2a3/ and src/a5/) into a unified src/common/ directory structure. The review feedback points out several outdated file paths and links in both code comments and documentation files (including dep_gen.h, scope_stats.h, and various Markdown files under docs/) that still reference the old architecture-specific paths instead of the newly consolidated shared locations. Additionally, a typo in docs/dfx/scope-stats.md was identified where the device-side scope stats collector file is incorrectly named scope_stats_collector.cpp instead of scope_stats_collector_aicpu.cpp.
| * | ||
| * Streaming buffer design mirrors PMU / L2Swimlane / TensorDump (single source of | ||
| * algorithmic truth in src/a2a3/platform/include/host/profiling_common/profiler_base.h): | ||
| * algorithmic truth in src/a2a3/platform/include/host/profiler_base.h): |
There was a problem hiding this comment.
| * fixed-capacity buffers, mirroring PMU / dep_gen / tensor_dump / l2_swimlane (the | ||
| * single source of mgmt-loop truth is | ||
| * src/a2a3/platform/include/host/profiling_common/profiler_base.h): | ||
| * src/a2a3/platform/include/host/profiler_base.h): |
| framework headers under `src/common/platform/include/host/` | ||
| ([a2a3](../src/common/platform/include/host/), | ||
| [a5](../src/common/platform/include/host/)) — the two copies |
There was a problem hiding this comment.
Since the profiling framework headers have been consolidated into a single shared directory (src/common/platform/include/host/), they are no longer maintained as separate per-architecture copies. The duplicate links for a2a3 and a5 should be replaced with a single link to common, and the surrounding text (lines 4 and 8-9) should be updated to remove references to 'two copies' and 'each arch is free to evolve independently'.
| space so the host can read device buffers directly. | ||
| `L2SwimlaneCollector` runs two background threads on top of a | ||
| [`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h): | ||
| [`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h): |
| [`L2SwimlaneCollector`](../src/a2a3/platform/include/host/l2_swimlane_collector.h) | ||
| on a2a3 inherits from | ||
| [`profiling_common::ProfilerBase<L2SwimlaneCollector, L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h): | ||
| [`profiling_common::ProfilerBase<L2SwimlaneCollector, L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiler_base.h): |
| [`TensorDumpCollector`](../src/a2a3/platform/include/host/tensor_dump_collector.h) | ||
| on a2a3 inherits from | ||
| [`profiling_common::ProfilerBase<TensorDumpCollector, DumpModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h): | ||
| [`profiling_common::ProfilerBase<TensorDumpCollector, DumpModule>`](../src/a2a3/platform/include/host/profiler_base.h): |
| ([`profiling_common::ProfilerBase`](../src/a5/platform/include/host/profiler_base.h)) | ||
| as a2a3 and parameterizes | ||
| [`BufferPoolManager`](../src/a5/platform/include/host/profiling_common/buffer_pool_manager.h) | ||
| [`BufferPoolManager`](../src/a5/platform/include/host/buffer_pool_manager.h) |
| | ------------ | ------------------- | ------- | | ||
| | `libsimpler_log.so` (host) | `src/common/log/unified_log_host.cpp` | `HostLogger` → stderr | | ||
| | AICPU binary (device) | `src/{arch}/platform/src/aicpu/unified_log_device.cpp` | `dev_vlog_*` → backend | | ||
| | AICPU binary (device) | `src/{arch}/platform/shared/aicpu/unified_log_device.cpp` | `dev_vlog_*` → backend | |
| | Change the sim AICPU output format | `src/{arch}/platform/sim/aicpu/device_log.cpp::dev_vlog_*` | | ||
| | Change the onboard AICPU CANN dlog tagging | `src/{arch}/platform/onboard/aicpu/device_log.cpp::dev_vlog_*` | | ||
| | Add a new C ABI entry point (e.g. dynamic config push) | `src/common/log/include/common/unified_log.h` + `unified_log_host.cpp` + `src/{arch}/platform/src/aicpu/unified_log_device.cpp` | | ||
| | Add a new C ABI entry point (e.g. dynamic config push) | `src/common/log/include/common/unified_log.h` + `unified_log_host.cpp` + `src/{arch}/platform/shared/aicpu/unified_log_device.cpp` | |
| Pure-value API declarations. No runtime types cross this boundary. | ||
|
|
||
| platform/src/aicpu/scope_stats_collector.cpp | ||
| platform/shared/aicpu/scope_stats_collector.cpp |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tests/ut/cpp/CMakeLists.txt (2)
254-259:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winFix stale
src/common/device_comminclude directories (point tosrc/common/utils).
tests/ut/cpp/CMakeLists.txtstill adds${CMAKE_SOURCE_DIR}/../../../src/common/device_comm(e.g.,add_common_utils_test, and also ina2a3_rt_objs/a5_rt_objsinclude dirs), butsrc/common/device_commno longer exists whiledevice_arena.his now atsrc/common/utils/device_arena.h. The UTs include it as#include "device_arena.h", so update the include dirs tosrc/common/utils(or update the includes to match the newutils/...prefix).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ut/cpp/CMakeLists.txt` around lines 254 - 259, The CMake include path for device_arena.h is stale: update occurrences that add "${CMAKE_SOURCE_DIR}/../../../src/common/device_comm" (seen in targets like add_common_utils_test, a2a3_rt_objs, a5_rt_objs) to point to "${CMAKE_SOURCE_DIR}/../../../src/common/utils" so the tests can find device_arena.h (or alternatively change the test source includes to use "utils/device_arena.h"); modify the include_directories / target_include_directories entries that reference device_comm to use the utils directory instead.
315-318:⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
test_orch_so_filereferences a stale deleteda2a3source path
tests/ut/cpp/CMakeLists.txtstill buildstest_orch_so_filefrom${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp, but that file no longer exists;orch_so_file.cppnow lives undersrc/common/platform/onboard/aicpu/, so CMake/configure/build will fail.🐛 Proposed fix (switch to consolidated common path)
add_executable(test_orch_so_file common/test_orch_so_file.cpp - ${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp + ${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/ut/cpp/CMakeLists.txt` around lines 315 - 318, The CMake target test_orch_so_file is referencing a deleted source path under src/a2a3; update the add_executable source list so that test_orch_so_file uses the relocated file at src/common/platform/onboard/aicpu/orch_so_file.cpp (replace the old ${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp entry with the consolidated ${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp path) so CMake can find orch_so_file.cpp when configuring and building.
🧹 Nitpick comments (1)
src/a5/platform/onboard/host/CMakeLists.txt (1)
24-26: 💤 Low valueDuplicate
commoninclude path.Line 26 re-adds
${CMAKE_CURRENT_SOURCE_DIR}/../../../../common, which is already appended on line 24. The a2a3 onboard-host sibling lists it only once. Harmless (CMake de-duplicates include dirs) but redundant — drop one for clarity.♻️ Remove the redundant entry
list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include") -list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/host")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/a5/platform/onboard/host/CMakeLists.txt` around lines 24 - 26, The CMakeLists uses CMAKE_CUSTOM_INCLUDE_DIRS and CMAKE_CURRENT_SOURCE_DIR to append include paths but duplicates "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common" (first and third list(APPEND) calls); remove the redundant third list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") so the common include is only appended once and leave the "${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include" entry intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/dfx/l2-swimlane-profiling.md`:
- Line 362: Several documentation links still point to arch-specific profiling
headers (e.g., the link referencing BufferPoolManager<L2SwimlaneModule>) and
must be retargeted to the consolidated common header location; find the markdown
occurrences that link to architecture-specific include headers (the
BufferPoolManager<L2SwimlaneModule> link and the other similar header links) and
update their URLs to point to the consolidated common platform include directory
(the centralized host profiling headers) so the docs reference the shared header
location instead of the old arch-specific paths.
In `@docs/dfx/pmu-profiling.md`:
- Line 296: Update the stale documentation links that reference arch-specific
host headers so they point to the consolidated common headers; specifically
change the link target for profiling_common::ProfilerBase<PmuCollector,
PmuModule> and the other two links mentioned (the occurrences around lines
referencing the PMU framework) to use the new path prefix
src/common/platform/include/host/... instead of the old arch-specific host
header paths, ensuring each markdown link target is replaced with the
consolidated header location so the docs resolve correctly.
In `@docs/dfx/tensor-dump.md`:
- Line 364: Update the TensorDump framework include links that currently point
to arch-specific host headers to instead use the consolidated common headers;
specifically replace occurrences like ../src/a2a3/platform/include/host/... with
../src/common/platform/include/host/... for the referenced symbols/links such as
BufferPoolManager<DumpModule> and the other TensorDump host-header links in this
document so all four occurrences use src/common/platform/include/host/...
In `@docs/profiling-framework.md`:
- Around line 5-8: Update the paragraph that currently claims per-arch header
copies and uses two identical links so it consistently states that headers are
consolidated under the common host include directory and not duplicated
per-arch; replace the two identical links that point to the common location with
a single clear reference to the consolidated host headers and change every stale
occurrence of the legacy per-arch token (e.g. any 'src/a2a3/' or similar
per-arch paths) at the noted locations to the consolidated
'src/common/platform/include/host/' token, and reword the sentence that mentions
"the two copies are kept byte-for-byte" to reflect a single authoritative copy
so all references and links are consistent.
---
Outside diff comments:
In `@tests/ut/cpp/CMakeLists.txt`:
- Around line 254-259: The CMake include path for device_arena.h is stale:
update occurrences that add
"${CMAKE_SOURCE_DIR}/../../../src/common/device_comm" (seen in targets like
add_common_utils_test, a2a3_rt_objs, a5_rt_objs) to point to
"${CMAKE_SOURCE_DIR}/../../../src/common/utils" so the tests can find
device_arena.h (or alternatively change the test source includes to use
"utils/device_arena.h"); modify the include_directories /
target_include_directories entries that reference device_comm to use the utils
directory instead.
- Around line 315-318: The CMake target test_orch_so_file is referencing a
deleted source path under src/a2a3; update the add_executable source list so
that test_orch_so_file uses the relocated file at
src/common/platform/onboard/aicpu/orch_so_file.cpp (replace the old
${CMAKE_SOURCE_DIR}/../../../src/a2a3/platform/onboard/aicpu/orch_so_file.cpp
entry with the consolidated
${CMAKE_SOURCE_DIR}/../../../src/common/platform/onboard/aicpu/orch_so_file.cpp
path) so CMake can find orch_so_file.cpp when configuring and building.
---
Nitpick comments:
In `@src/a5/platform/onboard/host/CMakeLists.txt`:
- Around line 24-26: The CMakeLists uses CMAKE_CUSTOM_INCLUDE_DIRS and
CMAKE_CURRENT_SOURCE_DIR to append include paths but duplicates
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common" (first and third list(APPEND)
calls); remove the redundant third list(APPEND CMAKE_CUSTOM_INCLUDE_DIRS
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common") so the common include is only
appended once and leave the
"${CMAKE_CURRENT_SOURCE_DIR}/../../../../common/log/include" entry intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a49dc42e-8a7b-4661-8534-c74a4b47483f
📒 Files selected for processing (102)
docs/developer-guide.mddocs/dfx/l2-swimlane-profiling.mddocs/dfx/pmu-profiling.mddocs/dfx/scope-stats.mddocs/dfx/tensor-dump.mddocs/hardware/cache-coherency.mddocs/logging.mddocs/profiling-framework.mdsimpler_setup/runtime_compiler.pysimpler_setup/tools/sched_overhead_analysis.pysrc/a2a3/docs/platform.mdsrc/a2a3/platform/docs/tpush-tpop-sim.mdsrc/a2a3/platform/include/aicpu/platform_regs.hsrc/a2a3/platform/include/common/dep_gen.hsrc/a2a3/platform/include/host/dep_gen_collector.hsrc/a2a3/platform/include/host/l2_swimlane_collector.hsrc/a2a3/platform/include/host/pmu_collector.hsrc/a2a3/platform/onboard/aicpu/CMakeLists.txtsrc/a2a3/platform/onboard/aicpu/device_malloc.cppsrc/a2a3/platform/onboard/aicpu/device_time.cppsrc/a2a3/platform/onboard/host/CMakeLists.txtsrc/a2a3/platform/onboard/host/device_runner.hsrc/a2a3/platform/shared/aicpu/dep_gen_collector_aicpu.cppsrc/a2a3/platform/shared/aicpu/l2_swimlane_collector_aicpu.cppsrc/a2a3/platform/shared/aicpu/platform_regs.cppsrc/a2a3/platform/shared/aicpu/pmu_collector_aicpu.cppsrc/a2a3/platform/shared/host/dep_gen_collector.cppsrc/a2a3/platform/shared/host/l2_swimlane_collector.cppsrc/a2a3/platform/shared/host/pmu_collector.cppsrc/a2a3/platform/sim/aicpu/CMakeLists.txtsrc/a2a3/platform/sim/aicpu/device_malloc.cppsrc/a2a3/platform/sim/host/CMakeLists.txtsrc/a2a3/runtime/tensormap_and_ringbuffer/host/runtime_maker.cppsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.hsrc/a2a3/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.hsrc/a5/docs/platform.mdsrc/a5/platform/docs/tpush-tpop-sim.mdsrc/a5/platform/include/host/l2_swimlane_collector.hsrc/a5/platform/include/host/pmu_collector.hsrc/a5/platform/onboard/aicpu/CMakeLists.txtsrc/a5/platform/onboard/aicpu/cache_ops.cppsrc/a5/platform/onboard/aicpu/device_log.cppsrc/a5/platform/onboard/aicpu/device_time.cppsrc/a5/platform/onboard/aicpu/orch_so_file.cppsrc/a5/platform/onboard/aicpu/platform_aicpu_affinity.cppsrc/a5/platform/onboard/aicpu/spin_hint.hsrc/a5/platform/onboard/host/CMakeLists.txtsrc/a5/platform/onboard/host/device_runner.hsrc/a5/platform/shared/aicpu/l2_swimlane_collector_aicpu.cppsrc/a5/platform/shared/aicpu/platform_regs.cppsrc/a5/platform/shared/aicpu/pmu_collector_aicpu.cppsrc/a5/platform/shared/host/l2_swimlane_collector.cppsrc/a5/platform/shared/host/pmu_collector.cppsrc/a5/platform/sim/aicpu/CMakeLists.txtsrc/a5/platform/sim/aicpu/device_log.cppsrc/a5/platform/sim/aicpu/device_time.cppsrc/a5/platform/sim/aicpu/orch_so_file.cppsrc/a5/platform/sim/aicpu/platform_aicpu_affinity.cppsrc/a5/platform/sim/aicpu/spin_hint.hsrc/a5/platform/sim/host/CMakeLists.txtsrc/a5/runtime/tensormap_and_ringbuffer/host/runtime_maker.cppsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_orchestrator.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_shared_memory.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/pto_tensormap.hsrc/a5/runtime/tensormap_and_ringbuffer/runtime/scheduler/pto_scheduler.hsrc/common/platform/include/common/scope_stats.hsrc/common/platform/include/host/buffer_pool_manager.hsrc/common/platform/include/host/profiler_base.hsrc/common/platform/include/host/scope_stats_collector.hsrc/common/platform/include/host/tensor_dump_collector.hsrc/common/platform/onboard/aicpu/cache_ops.cppsrc/common/platform/onboard/aicpu/device_log.cppsrc/common/platform/onboard/aicpu/device_malloc.cppsrc/common/platform/onboard/aicpu/device_time.cppsrc/common/platform/onboard/aicpu/orch_so_file.cppsrc/common/platform/onboard/aicpu/platform_aicpu_affinity.cppsrc/common/platform/onboard/aicpu/spin_hint.hsrc/common/platform/onboard/host/device_runner_base.hsrc/common/platform/shared/aicpu/scope_stats_collector_aicpu.cppsrc/common/platform/shared/aicpu/tensor_dump_aicpu.cppsrc/common/platform/shared/aicpu/unified_log_device.cppsrc/common/platform/shared/host/platform_compile_info.cppsrc/common/platform/shared/host/scope_stats_collector.cppsrc/common/platform/shared/host/tensor_dump_collector.cppsrc/common/platform/sim/aicpu/cache_ops.cppsrc/common/platform/sim/aicpu/device_log.cppsrc/common/platform/sim/aicpu/device_malloc.cppsrc/common/platform/sim/aicpu/device_time.cppsrc/common/platform/sim/aicpu/orch_so_file.cppsrc/common/platform/sim/aicpu/platform_aicpu_affinity.cppsrc/common/platform/sim/aicpu/spin_hint.hsrc/common/platform/sim/host/device_runner_base.hsrc/common/platform/sim/sim_context/CMakeLists.txtsrc/common/platform/sim/sim_context/cpu_sim_context.cppsrc/common/platform/sim/sim_context/cpu_sim_context.hsrc/common/utils/device_arena.htests/st/a2a3/tensormap_and_ringbuffer/dfx/pmu/test_pmu.pytests/ut/cpp/CMakeLists.txt
💤 Files with no reviewable changes (14)
- src/a5/platform/sim/aicpu/device_time.cpp
- src/a2a3/platform/sim/aicpu/device_malloc.cpp
- src/a5/platform/onboard/aicpu/cache_ops.cpp
- src/a2a3/platform/onboard/aicpu/device_malloc.cpp
- src/a5/platform/sim/aicpu/orch_so_file.cpp
- src/a5/platform/onboard/aicpu/spin_hint.h
- src/a2a3/platform/onboard/aicpu/device_time.cpp
- src/a5/platform/sim/aicpu/spin_hint.h
- src/a5/platform/onboard/aicpu/device_time.cpp
- src/a5/platform/onboard/aicpu/device_log.cpp
- src/a5/platform/onboard/aicpu/orch_so_file.cpp
- src/a5/platform/onboard/aicpu/platform_aicpu_affinity.cpp
- src/a5/platform/sim/aicpu/device_log.cpp
- src/a5/platform/sim/aicpu/platform_aicpu_affinity.cpp
| space so the host can read device buffers directly. | ||
| `L2SwimlaneCollector` runs two background threads on top of a | ||
| [`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h): | ||
| [`BufferPoolManager<L2SwimlaneModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h): |
There was a problem hiding this comment.
Update these links to the consolidated common header path.
Line 362, Line 435, Line 563, and Line 565 still reference arch-specific header locations, but this PR consolidates profiling headers under src/common/platform/include/host/. Please retarget these links to avoid stale/broken documentation paths.
Also applies to: 435-435, 563-563, 565-565
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/dfx/l2-swimlane-profiling.md` at line 362, Several documentation links
still point to arch-specific profiling headers (e.g., the link referencing
BufferPoolManager<L2SwimlaneModule>) and must be retargeted to the consolidated
common header location; find the markdown occurrences that link to
architecture-specific include headers (the BufferPoolManager<L2SwimlaneModule>
link and the other similar header links) and update their URLs to point to the
consolidated common platform include directory (the centralized host profiling
headers) so the docs reference the shared header location instead of the old
arch-specific paths.
| [`PmuCollector`](../src/a2a3/platform/include/host/pmu_collector.h) | ||
| inherits from | ||
| [`profiling_common::ProfilerBase<PmuCollector, PmuModule>`](../src/a2a3/platform/include/host/profiling_common/profiler_base.h): | ||
| [`profiling_common::ProfilerBase<PmuCollector, PmuModule>`](../src/a2a3/platform/include/host/profiler_base.h): |
There was a problem hiding this comment.
Fix stale PMU framework links after header consolidation.
Line 296, Line 464, and Line 466 still point to arch-specific host header paths. These should link to src/common/platform/include/host/... to match the refactor and keep docs navigable.
Also applies to: 464-464, 466-466
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/dfx/pmu-profiling.md` at line 296, Update the stale documentation links
that reference arch-specific host headers so they point to the consolidated
common headers; specifically change the link target for
profiling_common::ProfilerBase<PmuCollector, PmuModule> and the other two links
mentioned (the occurrences around lines referencing the PMU framework) to use
the new path prefix src/common/platform/include/host/... instead of the old
arch-specific host header paths, ensuring each markdown link target is replaced
with the consolidated header location so the docs resolve correctly.
| space so the host can read device buffers directly. | ||
| `TensorDumpCollector` runs two background threads on top of a | ||
| [`BufferPoolManager<DumpModule>`](../src/a2a3/platform/include/host/profiling_common/buffer_pool_manager.h): | ||
| [`BufferPoolManager<DumpModule>`](../src/a2a3/platform/include/host/buffer_pool_manager.h): |
There was a problem hiding this comment.
Retarget TensorDump framework links to common include headers.
Line 364, Line 426, Line 531, and Line 533 still use arch-specific host-header paths. Update these links to src/common/platform/include/host/... so they reflect the consolidated framework location.
Also applies to: 426-426, 531-531, 533-533
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/dfx/tensor-dump.md` at line 364, Update the TensorDump framework include
links that currently point to arch-specific host headers to instead use the
consolidated common headers; specifically replace occurrences like
../src/a2a3/platform/include/host/... with
../src/common/platform/include/host/... for the referenced symbols/links such as
BufferPoolManager<DumpModule> and the other TensorDump host-header links in this
document so all four occurrences use src/common/platform/include/host/...
| framework headers under `src/common/platform/include/host/` | ||
| ([a2a3](../src/common/platform/include/host/), | ||
| [a5](../src/common/platform/include/host/)) — the two copies | ||
| are kept byte-for-byte structurally aligned so reviewers can diff them, but |
There was a problem hiding this comment.
Resolve contradiction and stale links in framework header location docs.
Line 5–Line 8 says there are per-arch header copies, but both links point to the same common directory. Also, Line 68, Line 79, Line 100, Line 115, and Line 135 still reference src/a2a3/... paths. Please make this section consistently describe and link to the consolidated src/common/platform/include/host/ location.
Also applies to: 68-68, 79-79, 100-100, 115-115, 135-135
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/profiling-framework.md` around lines 5 - 8, Update the paragraph that
currently claims per-arch header copies and uses two identical links so it
consistently states that headers are consolidated under the common host include
directory and not duplicated per-arch; replace the two identical links that
point to the common location with a single clear reference to the consolidated
host headers and change every stale occurrence of the legacy per-arch token
(e.g. any 'src/a2a3/' or similar per-arch paths) at the noted locations to the
consolidated 'src/common/platform/include/host/' token, and reword the sentence
that mentions "the two copies are kept byte-for-byte" to reflect a single
authoritative copy so all references and links are consistent.
a4a2473 to
deedf3f
Compare
…n, consolidate common/, rename platform/src→shared) Mechanical post-hw-native-sys#944/hw-native-sys#945/hw-native-sys#948 layout cleanup, motivated by an audit of duplicated and oddly-placed files that the recent unification work left behind. Four orthogonal changes bundled here because each touches the same set of CMakeLists; splitting would mean three more rebuild rounds and three more cmake-include-path edits across the same files. 1. Extract identical aicpu sources to common/platform/{onboard,sim}/aicpu/ ------------------------------------------------------------------------ Seven files per backend (14 total) were byte-identical (or differed only in a one-line @brief arch qualifier) between a2a3 and a5: cache_ops.cpp, device_log.cpp, device_time.cpp, device_malloc.cpp, orch_so_file.cpp, platform_aicpu_affinity.cpp, spin_hint.h Moved a2a3's copy to common/, deleted a5's duplicate, and extended each arch's onboard/aicpu and sim/aicpu CMakeLists COMMON_SOURCES glob to pick them up from common/platform/{onboard,sim}/aicpu/. The device_malloc.cpp arch tag in its @brief was the only real content diff; generalized to "Real Hardware" / "Simulation" without the arch qualifier. Backfilled a copyright header that was missing on device_time.cpp (caught by the check-headers hook). The remaining files in per-arch aicpu/ (kernel.cpp, inner_platform_regs.cpp) have real arch-specific divergence (register addresses, kernel protocols) and stay where they are. 2. Flatten profiling_common/ subdir ------------------------------------ src/common/platform/include/host/profiling_common/{buffer_pool_manager, profiler_base}.h → src/common/platform/include/host/{buffer_pool_manager, profiler_base}.h. Updated 10 #include sites and the 2 header guards. The profiling_common:: C++ namespace stays — file path and namespace don't have to match. 3. Consolidate small src/common subdirs ---------------------------------------- - src/common/device_comm/device_arena.h → src/common/utils/device_arena.h. The file is a generic bump-arena utility, not a comm primitive; the enclosing dir name was misleading. Updated 10 #include sites "device_arena.h" → "utils/device_arena.h" and dropped the common/device_comm entry from 8 CMakeLists (replaced with common since utils/ resolves there). - src/common/sim_context/ → src/common/platform/sim/sim_context/. The dir is sim-only infrastructure (CPU sim context for CANN intrinsic emulation), so it belongs next to the other common/platform/sim/ shared sim infrastructure. Updated: * the dir's own CMakeLists relative path to log/include; * simpler_setup/runtime_compiler.py::compile_sim_context source path; * 4 sim-host CMakeLists references; * a small handful of docs that named the old path. 4. Rename platform/src → platform/shared ------------------------------------------ Per-arch src/{arch}/platform/src/ was confusingly nested inside the top-level src/ directory and read as "src/src" in many paths. Renamed to shared/ across all 3 trees (a2a3, a5, common), matching its actual semantic ("shared between onboard and sim within one arch"). Updated 21 files that referenced the old path: CMakeLists, host headers, docs, one test file, and the src/{arch}/docs/platform.md map. Test plan --------- - Build all four libhost_runtime.so (a2a3 onboard/sim, a5 onboard/sim) + libcpu_sim_context.so + aicpu and aicore artifacts: clean. - CI will run the full ST + UT suite. Net: ~30 file renames, ~14 files extracted to common, +0 / -0 behavioral changes (pure layout).
Summary
Post-#944/#945/#948 layout cleanup. Audit of the post-unification tree turned up several rough edges; this PR addresses them in one pass because each change touches the same set of CMakeLists — splitting would mean three more cmake-include-path edit rounds.
Four orthogonal cleanups
1. Extract identical aicpu sources to
common/platform/{onboard,sim}/aicpu/Seven files per backend (14 total) were byte-identical (or differed only in a one-line
@briefarch qualifier) between a2a3 and a5:cache_ops.cppdevice_log.cppdevice_time.cppdevice_malloc.cpp(only diff was(a2a3)vs(a5)in the@brief)orch_so_file.cppplatform_aicpu_affinity.cppspin_hint.hMoved a2a3's copy to
common/, deleted a5's duplicate. Extended each arch'sonboard/aicpuandsim/aicpuCMakeListsCOMMON_SOURCESglob to pick the new dir up. Generalized thedevice_malloc.cpp@briefto "Real Hardware" / "Simulation" without the arch qualifier. Backfilled a missing copyright header ondevice_time.cpp(caught by the check-headers hook on the move).The remaining files in per-arch
aicpu/(kernel.cpp,inner_platform_regs.cpp) have real arch-specific divergence (register addresses, kernel protocols) and stay per-arch.2. Flatten
profiling_common/subdirsrc/common/platform/include/host/profiling_common/{buffer_pool_manager,profiler_base}.h→src/common/platform/include/host/{buffer_pool_manager,profiler_base}.h. Updated 10#includesites and the 2 header guards. Theprofiling_common::C++ namespace stays — file path and namespace don't have to match.3. Consolidate small
src/common/subdirssrc/common/device_comm/device_arena.h→src/common/utils/device_arena.h. The file is a generic bump-arena utility, not a comm primitive; the enclosing dir name was misleading. Updated 10#includesites"device_arena.h"→"utils/device_arena.h"and dropped thecommon/device_commentry from 8 CMakeLists (replaced withcommon, sinceutils/resolves there).src/common/sim_context/→src/common/platform/sim/sim_context/. The dir is sim-only infrastructure (CPU sim context for CANN intrinsic emulation); it belongs next to the othercommon/platform/sim/shared sim infrastructure. Updated the dir's own CMakeLists,simpler_setup/runtime_compiler.py::compile_sim_contextsource path, 4 sim-host CMakeLists references, and a small handful of docs that named the old path.4. Rename
platform/src/→platform/shared/Per-arch
src/{arch}/platform/src/was confusingly nested inside the top-levelsrc/directory and read as "src/src" in many paths. Renamed toshared/across all 3 trees (a2a3, a5, common), matching its actual semantic ("shared between onboard and sim within one arch"). Updated 21 files that referenced the old path: CMakeLists, host headers, docs, one test file, and thesrc/{arch}/docs/platform.mdmap. The developer-guide directory diagram is updated to match.What's NOT changed
common/platform/include/aicore/aicore.h(a single file in its own dir) stays put — its include form"aicore/aicore.h"is consistent with"aicpu/..."and"host/...", and removing the namespacing would force a rename across 10+ files for no real gain.tensor_dump.hpad sizes,platform_regs.hreg addresses,inner_platform_regs.cpp,kernel.cpp,inner_kernel.h) stay per-arch.Test plan
libhost_runtime.so(a2a3 onboard/sim, a5 onboard/sim) +libcpu_sim_context.so+ aicpu and aicore artifacts: clean.Net: ~30 file renames, ~14 files extracted to common, +0 / -0 behavioral changes (pure layout).