
Refactor: WorkerType CHIP→NEXT_LEVEL, consolidate add_worker API, level hygiene#516

Open
hw-native-sys-bot wants to merge 1 commit into hw-native-sys:main from hw-native-sys-bot:refactor/worker-level-hygiene

hw-native-sys-bot (Collaborator) commented Apr 10, 2026

Summary

  • Rename WorkerType enum: CHIP→NEXT_LEVEL, remove DIST (only two kinds of sub-worker: NEXT_LEVEL and SUB)
  • Rename scheduler pools: chip_workers→next_level_workers, chip_threads_→next_level_threads_
  • Simplify DistWorker::add_worker — clean NEXT_LEVEL/SUB binary split instead of the old CHIP+DIST grouping hack
  • Consolidate nanobind API: replace add_chip_worker/add_chip_worker_native/add_chip_process with overloaded add_next_level_worker
  • Worker.register() now raises RuntimeError at L2 (was silently useless)
  • Extract L2 WorkerPayload unpacking into _run_l2_from_payload() helper
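
The NEXT_LEVEL/SUB split described above could look roughly like the sketch below. It is an illustration in Python, not the PR's C++ code: `SchedulerPools` and the `sub_workers` pool name are hypothetical, and only `WorkerType`, `NEXT_LEVEL`, `SUB`, and `next_level_workers` come from the summary.

```python
from enum import Enum, auto


class WorkerType(Enum):
    """The two sub-worker kinds left after the rename (DIST is gone)."""
    NEXT_LEVEL = auto()
    SUB = auto()


class SchedulerPools:
    """Hypothetical stand-in for the pools a DistWorker might keep."""

    def __init__(self):
        self.next_level_workers = []  # name taken from the PR summary
        self.sub_workers = []         # assumed name for the SUB pool

    def add_worker(self, kind, worker):
        # Binary split: every worker lands in exactly one of two pools.
        if kind is WorkerType.NEXT_LEVEL:
            self.next_level_workers.append(worker)
        elif kind is WorkerType.SUB:
            self.sub_workers.append(worker)
        else:
            raise ValueError(f"unknown worker type: {kind!r}")


pools = SchedulerPools()
pools.add_worker(WorkerType.NEXT_LEVEL, "chip-0")
pools.add_worker(WorkerType.SUB, "helper-0")
```

With only two enum members, the dispatch needs no grouping logic across kinds, which is presumably what the summary means by removing the old CHIP+DIST hack.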

Part of hierarchical runtime refactor (Steps 2-4). See .claude/plans/HIERARCHICAL_RUNTIME_REFACTOR.md.

Testing

  • pytest tests/ut/py/test_chip_worker.py — 11 passed
  • pytest tests/ut/py/test_dist_worker/ — 17 passed
  • ctest (C++ unit tests) — 5 passed
  • All pre-commit hooks pass (clang-format, cpplint, ruff, pyright)

gemini-code-assist bot left a comment

Code Review

This pull request introduces a validation check in the register method to ensure it is only used at level 3 or higher and refactors the L2 payload execution logic into a dedicated helper method _run_l2_from_payload. I have included a critical review comment regarding the use of run_raw instead of run to prevent potential type errors when handling raw pointers from WorkerPayload.
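
A minimal sketch of the level guard mentioned above, assuming a `level` attribute and a `register(name, fn)` signature. Both are hypothetical; the PR only states that `register()` raises `RuntimeError` below level 3.

```python
class Worker:
    """Hypothetical minimal Worker; only the level check mirrors the PR."""

    def __init__(self, level):
        self.level = level
        self._registry = {}  # assumed storage for registered callables

    def register(self, name, fn):
        # Fail loudly at L2 instead of accepting a registration that
        # would never be used.
        if self.level < 3:
            raise RuntimeError(
                f"register() requires level >= 3, got L{self.level}"
            )
        self._registry[name] = fn


l3 = Worker(level=3)
l3.register("noop", lambda: None)  # accepted at L3

try:
    Worker(level=2).register("noop", lambda: None)
except RuntimeError as exc:
    l2_error = str(exc)
```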

Comment on lines +378 to +389
        from .task_interface import ChipCallConfig  # noqa: PLC0415

        assert self._chip_worker is not None
        config = ChipCallConfig()
        config.block_dim = payload.block_dim
        config.aicpu_thread_num = payload.aicpu_thread_num
        config.enable_profiling = payload.enable_profiling
        self._chip_worker.run(
            payload.callable,  # type: ignore[arg-type]
            payload.args,
            config,
        )

high

The _run_l2_from_payload helper currently passes raw pointers (integers) from WorkerPayload to self._chip_worker.run(). However, ChipWorker.run() expects higher-level objects, not raw memory addresses. This will lead to a TypeError at runtime. Since WorkerPayload is designed to hold raw pointers for cross-process dispatch, you should use the run_raw method on the underlying _impl instance, which is designed to handle these pointers directly.

        assert self._chip_worker is not None
        self._chip_worker._impl.run_raw(
            payload.callable,
            payload.args,
            payload.block_dim,
            payload.aicpu_thread_num,
            payload.enable_profiling,
        )
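
The distinction the reviewer draws can be illustrated with stand-in classes. Everything below is a sketch under assumptions: `FakeImpl` and `FakeChipWorker` are hypothetical, the integer field types are assumed, and the `TypeError` in `run()` only models the behavior the review comment predicts rather than the real binding.

```python
from dataclasses import dataclass


@dataclass
class WorkerPayload:
    # Field names follow the diff; integer pointer types are assumed.
    callable: int
    args: int
    block_dim: int
    aicpu_thread_num: int
    enable_profiling: bool


class FakeImpl:
    """Hypothetical stand-in for the native object behind ChipWorker."""

    def __init__(self):
        self.calls = []

    def run_raw(self, callable_ptr, args_ptr, block_dim,
                aicpu_thread_num, enable_profiling):
        # Accepts raw addresses directly, as the review comment describes.
        self.calls.append((callable_ptr, args_ptr, block_dim,
                           aicpu_thread_num, enable_profiling))


class FakeChipWorker:
    def __init__(self):
        self._impl = FakeImpl()

    def run(self, callable_obj, args, config):
        # Assumed behavior: the high-level entry point rejects raw pointers.
        if isinstance(callable_obj, int) or isinstance(args, int):
            raise TypeError("run() expects objects, not raw pointers")
        self._impl.calls.append(("run", callable_obj, args, config))


def run_l2_from_payload(chip_worker, payload):
    # Forward the payload fields unchanged to the raw entry point.
    chip_worker._impl.run_raw(
        payload.callable,
        payload.args,
        payload.block_dim,
        payload.aicpu_thread_num,
        payload.enable_profiling,
    )


payload = WorkerPayload(0x1000, 0x2000, 8, 4, False)
worker = FakeChipWorker()
run_l2_from_payload(worker, payload)
```

Whether the real `ChipWorker.run()` rejects integers depends on its binding signature, so the claim is worth checking against the actual class before adopting the suggestion.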

…el hygiene

- Rename WorkerType enum: CHIP→NEXT_LEVEL, remove DIST (only two kinds
  of sub-worker: NEXT_LEVEL and SUB)
- Rename scheduler pools: chip_workers→next_level_workers,
  chip_threads_→next_level_threads_
- Simplify DistWorker::add_worker — clean NEXT_LEVEL/SUB binary split
  instead of the old CHIP+DIST grouping hack
- Consolidate nanobind API: replace add_chip_worker/add_chip_worker_native/
  add_chip_process with overloaded add_next_level_worker
- Worker.register() now raises RuntimeError at L2 (was silently useless)
- Extract L2 WorkerPayload unpacking into _run_l2_from_payload() helper

Part of hierarchical runtime refactor (Steps 2-4).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hw-native-sys-bot force-pushed the refactor/worker-level-hygiene branch from e5e8800 to b144082 April 10, 2026 11:26
hw-native-sys-bot changed the title from "Refactor: Worker level hygiene — guard register() and extract L2 helper" to "Refactor: WorkerType CHIP→NEXT_LEVEL, consolidate add_worker API, level hygiene" Apr 10, 2026