Refactor: extract near-identical platform files (-1100 lines)#937
Conversation
|
Warning Review limit reached
More reviews will be available in 42 minutes and 25 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (17)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request consolidates and refactors the platform-specific code by extracting shared platform headers and source files (such as device timestamping, orchestration file creation, tensor dumping, memory barriers, and scope statistics) from the architecture-specific directories (a2a3 and a5) into a unified src/common/platform/ directory. The build configurations (CMakeLists.txt) and the Python kernel compiler have been updated to include these shared paths, and the corresponding header includes have been adjusted accordingly. No review comments were provided, so there is no additional feedback to address.
f111949 to
c303fe0
Compare
Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments — same pattern as hw-native-sys#937: - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save: ~195 lines deduped (one copy each in common; both arches' copies were equivalent code modulo doc text). Comment cleanup carried with the moves: - aicore.h: "a2a3: Real Ascend hardware" → "onboard: …" and "src/a2a3/platform/onboard/aicore/inner_kernel.h" → "src/{a2a3,a5}/platform/onboard/aicore/inner_kernel.h" - device_malloc.h: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" (function descriptions, not behavior) - memory_allocator.h: "a2a3: Wraps CANN runtime" / etc. → "onboard: Wraps CANN runtime" (4 occurrences) - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim host_runtime.so clean - Smoke ST passes on both arches Stacked on hw-native-sys#937 (which is stacked on hw-native-sys#934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
c303fe0 to
05d854a
Compare
Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments — same pattern as hw-native-sys#937: - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save: ~195 lines deduped (one copy each in common; both arches' copies were equivalent code modulo doc text). Comment cleanup carried with the moves: - aicore.h: "a2a3: Real Ascend hardware" → "onboard: …" and "src/a2a3/platform/onboard/aicore/inner_kernel.h" → "src/{a2a3,a5}/platform/onboard/aicore/inner_kernel.h" - device_malloc.h: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" (function descriptions, not behavior) - memory_allocator.h: "a2a3: Wraps CANN runtime" / etc. → "onboard: Wraps CANN runtime" (4 occurrences) - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim host_runtime.so clean - Smoke ST passes on both arches Stacked on hw-native-sys#937 (which is stacked on hw-native-sys#934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
05d854a to
796249b
Compare
Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments — same pattern as hw-native-sys#937: - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save: ~195 lines deduped (one copy each in common; both arches' copies were equivalent code modulo doc text). Comment cleanup carried with the moves: - aicore.h: "a2a3: Real Ascend hardware" → "onboard: …" and "src/a2a3/platform/onboard/aicore/inner_kernel.h" → "src/{a2a3,a5}/platform/onboard/aicore/inner_kernel.h" - device_malloc.h: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" (function descriptions, not behavior) - memory_allocator.h: "a2a3: Wraps CANN runtime" / etc. → "onboard: Wraps CANN runtime" (4 occurrences) - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim host_runtime.so clean - Smoke ST passes on both arches Stacked on hw-native-sys#937 (which is stacked on hw-native-sys#934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
….cpp (-250 lines) Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments (same pattern as hw-native-sys#937): - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save from header moves: ~195 lines. Also collapses the 4 per-arch `platform_compile_info.cpp` files (byte-identical modulo the platform-name string literal) into a single shared impl at `src/common/platform/src/host/platform_compile_info.cpp`. The string is supplied via `target_compile_definitions(host_runtime PRIVATE SIMPLER_PLATFORM_NAME=\"…\")` on each per-arch host CMakeLists, matching what `get_platform()` returned before. - 4 files deleted (a2a3/{onboard,sim}/host/, a5/{onboard,sim}/host/) - 1 file added (src/common/platform/src/host/platform_compile_info.cpp) - 4 host CMakeLists: source path moved to the common location + `SIMPLER_PLATFORM_NAME` define added with the appropriate string ("a2a3" / "a2a3sim" / "a5" / "a5sim") Header comment cleanups (same shape as hw-native-sys#937): - `aicore.h`: "a2a3: Real Ascend hardware" → "onboard: …" and doc reference generalized to `src/{a2a3,a5}/platform/...` - `device_malloc.h`: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" - `memory_allocator.h`: "a2a3: Wraps CANN runtime" → "onboard: …" - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim + a2a3 + a5 host_runtime.so clean - `get_platform()` returns the right string per variant (verified via compile_commands.json + ST smoke run) - Smoke ST passes on both arches Stacked on hw-native-sys#937 (which is stacked on hw-native-sys#934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
Move 6 headers + 1 source whose per-arch differences were ONLY:
- Header guards (SRC_A2A3_… vs SRC_A5_…)
- Comments naming the arch ("a2a3" / "a5" — generalized to the
shared "onboard" / "sim" axis)
- Two log-string tweaks in tensor_dump_aicpu.cpp
into src/common/platform/include/ + src/common/platform/src/aicpu/.
No code-level semantic changes; the per-arch copies were equivalent
modulo doc strings.
Files moved (canonical copy now in common):
- include/common/memory_barrier.h (63)
- include/aicpu/tensor_dump_aicpu.h (305)
- include/aicpu/device_time.h (28)
- include/aicpu/orch_so_file.h (57)
- include/host/function_cache.h (122)
- include/host/platform_compile_info.h (37)
- src/aicpu/tensor_dump_aicpu.cpp (598)
Per-arch saved: 1210
Total dedup'd: ~1100
Comment cleanups carried with the move:
- device_time.h: "a2a3: Real Ascend hardware" → "onboard: …"
- orch_so_file.h: "a2a3 (onboard): pid-based naming" → "onboard: …"
- function_cache.h: "a2a3: Real hardware" → "onboard: …"
- platform_compile_info.h: "Each platform (a2a3, a2a3sim, ...)" →
"Each platform (one of a2a3, a2a3sim, a5, a5sim)"
- Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_…
Include-path follow-up:
- l2_perf_collector_aicpu.h (still per-arch — DIFF) was doing
`#include "device_time.h"` (no subdirectory prefix), which only
worked because device_time.h lived alongside it in the same
`aicpu/` directory. Now that device_time.h moved up to
common/platform/include/aicpu/, qualify the include as
`#include "aicpu/device_time.h"` to match the shared search-path
shape. Both arches' l2_perf_collector_aicpu.h updated.
Builds:
- a2a3sim + a5sim host_runtime.so clean
- a2a3sim ST L1+L2 pass (38/38, devices 0,1)
- a5sim ST L1+L2 pass (22/22, devices 0,1)
Stacked on hw-native-sys#934 (platform-shared-headers).
Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
796249b to
db3442c
Compare
….cpp (-250 lines) Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments (same pattern as hw-native-sys#937): - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save from header moves: ~195 lines. Also collapses the 4 per-arch `platform_compile_info.cpp` files (byte-identical modulo the platform-name string literal) into a single shared impl at `src/common/platform/src/host/platform_compile_info.cpp`. The string is supplied via `target_compile_definitions(host_runtime PRIVATE SIMPLER_PLATFORM_NAME=\"…\")` on each per-arch host CMakeLists, matching what `get_platform()` returned before. - 4 files deleted (a2a3/{onboard,sim}/host/, a5/{onboard,sim}/host/) - 1 file added (src/common/platform/src/host/platform_compile_info.cpp) - 4 host CMakeLists: source path moved to the common location + `SIMPLER_PLATFORM_NAME` define added with the appropriate string ("a2a3" / "a2a3sim" / "a5" / "a5sim") Header comment cleanups (same shape as hw-native-sys#937): - `aicore.h`: "a2a3: Real Ascend hardware" → "onboard: …" and doc reference generalized to `src/{a2a3,a5}/platform/...` - `device_malloc.h`: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" - `memory_allocator.h`: "a2a3: Wraps CANN runtime" → "onboard: …" - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim + a2a3 + a5 host_runtime.so clean - `get_platform()` returns the right string per variant (verified via compile_commands.json + ST smoke run) - Smoke ST passes on both arches Stacked on hw-native-sys#937 (which is stacked on hw-native-sys#934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
….cpp (-250 lines) (#938) Move 3 more headers whose per-arch differences were ONLY arch-name strings in doc comments (same pattern as #937): - include/aicore/aicore.h (33 lines) - include/aicpu/device_malloc.h (53 lines) - include/host/memory_allocator.h (109 lines) Per-arch save from header moves: ~195 lines. Also collapses the 4 per-arch `platform_compile_info.cpp` files (byte-identical modulo the platform-name string literal) into a single shared impl at `src/common/platform/src/host/platform_compile_info.cpp`. The string is supplied via `target_compile_definitions(host_runtime PRIVATE SIMPLER_PLATFORM_NAME=\"…\")` on each per-arch host CMakeLists, matching what `get_platform()` returned before. - 4 files deleted (a2a3/{onboard,sim}/host/, a5/{onboard,sim}/host/) - 1 file added (src/common/platform/src/host/platform_compile_info.cpp) - 4 host CMakeLists: source path moved to the common location + `SIMPLER_PLATFORM_NAME` define added with the appropriate string ("a2a3" / "a2a3sim" / "a5" / "a5sim") Header comment cleanups (same shape as #937): - `aicore.h`: "a2a3: Real Ascend hardware" → "onboard: …" and doc reference generalized to `src/{a2a3,a5}/platform/...` - `device_malloc.h`: "On a2a3:" / "On a2a3sim:" → "On onboard:" / "On sim:" - `memory_allocator.h`: "a2a3: Wraps CANN runtime" → "onboard: …" - Header guards normalized to SRC_COMMON_PLATFORM_INCLUDE_… Builds: - a2a3sim + a5sim + a2a3 + a5 host_runtime.so clean - `get_platform()` returns the right string per variant (verified via compile_commands.json + ST smoke run) - Smoke ST passes on both arches Stacked on #937 (which is stacked on #934). Co-authored-by: Chao Wang <26245345+ChaoWao@users.noreply.github.com>
Summary
Stacked on #934. Move 6 headers + 1 source whose per-arch differences were ONLY:
SRC_A2A3_…vsSRC_A5_…)tensor_dump_aicpu.cppspecifically: 1 log-string tweak (drops the a5"(memcpy-based)"suffix from the init log — the kept copy is a2a3's) and 2 comment-only tweaks (drops the a5"Mirrors l2_swimlane_collector_aicpu.cpp patterns:"header line; replaces a2a3's"silent leftover"wording with a5's"silent wip-mismatch")into
src/common/platform/include/+src/common/platform/src/aicpu/. No semantic code changes; the per-arch copies were equivalent modulo doc strings.Files moved (canonical copy now in common)
include/common/memory_barrier.hinclude/aicpu/tensor_dump_aicpu.hinclude/aicpu/device_time.hinclude/aicpu/orch_so_file.hinclude/host/function_cache.hinclude/host/platform_compile_info.hsrc/aicpu/tensor_dump_aicpu.cppComment cleanups carried with the move
device_time.h: "a2a3: Real Ascend hardware" → "onboard: …"orch_so_file.h: "a2a3 (onboard): pid-based naming" → "onboard: …"function_cache.h: "a2a3: Real hardware" → "onboard: …"platform_compile_info.h: "Each platform (a2a3, a2a3sim, ...)" → "Each platform (one of a2a3, a2a3sim, a5, a5sim)"SRC_COMMON_PLATFORM_INCLUDE_…Include-path follow-up
l2_swimlane_collector_aicpu.h(still per-arch — DIFF; this file was renamed froml2_perf_collector_aicpu.hby #916'sL2Perf → L2Swimlanerename) was doing#include "device_time.h"(no subdirectory prefix), which only worked becausedevice_time.hlived alongside it in the sameaicpu/directory. Now thatdevice_time.hmoved up tocommon/platform/include/aicpu/, qualify the include as#include "aicpu/device_time.h"to match the shared search-path shape. Both arches'l2_swimlane_collector_aicpu.hupdated.Test plan
libhost_runtime.socleanStacking note: depends on #934 (
platform-shared-headers). The base CMakeLists changes there are needed for the common include path to be on the search path of every arch's build target.