Skip to content

fix(ir): accept TileType input in tensor.gather conversion#1609

Merged
lyfne123 merged 5 commits into
hw-native-sys:mainfrom
YunjiQin:fix/gather_convert
Jun 1, 2026
Merged

fix(ir): accept TileType input in tensor.gather conversion#1609
lyfne123 merged 5 commits into
hw-native-sys:mainfrom
YunjiQin:fix/gather_convert

Conversation

@YunjiQin
Copy link
Copy Markdown
Contributor

@YunjiQin YunjiQin commented May 30, 2026

Summary

tensor.gather previously crashed in convert_tensor_to_tile_ops with CHECK(input_tensor_type) when an upstream conversion had already lowered its input/index to TileType (e.g. when a preceding tensor.create had been rewritten to tile.create).

This PR makes the converter accept either TensorType or TileType for both input and index:

  • Extract shape/dtype from whichever type the arg carries.
  • Route the per-row materialization through a new emit_load_or_slice helper: it emits tile.load when the source is still a TensorType (original behavior, valid at any rank) and tile.slice when an upstream pass has already produced a TileType (only reached on rank-2 paths today, because FlattenTileNdTo2D rejects tile.slice on >2D tiles).
  • The rank-3 paths therefore keep emitting tile.load, so the rank-3 lowering is bit-identical to before.

Test plan

  • tests/ut/ir/transforms/test_convert_tensor_to_tile_ops.py::TestConvertGatherOp — UTs pass
  • tests/st/runtime/ops/test_gather.py::TestGatherIndex rank-3 cases — pre-compile path no longer trips FlattenTileNdTo2D: tile.slice is not supported on >2D tiles; system-tests / dist-system-tests CI lanes green
  • CI: clang-tidy / pre-commit / codegen-tests / pypto-lib-model / system-tests / system-tests-a5sim / unit-tests (ubuntu+macos) / dist-system-tests — all green

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 30, 2026

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c1fd0f09-846b-4921-8fff-22ac198e0f99

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR refactors the tensor.gather lowering strategy from per-row tile loads to preloaded Vec tiles with per-iteration slicing. Input and index operands are now bridged upfront to Vec tile memory, then sliced per loop iteration instead of loaded individually. All four gather cases and their corresponding test expectations are updated accordingly.

Changes

Gather Operation Lowering Refactoring

Layer / File(s) Summary
Gather input space bridging and conversion refactoring
src/ir/transforms/op_conversion_registry.cpp
gather_input_reqs map bridges both input and index to Vec tiles. The conversion lambda now type-checks for TileType instead of TensorType, derives shapes and dtype from bridged tiles, and removes the previous load_kwargs initialization. Registration passes std::move(gather_input_reqs) to apply bridging.
Gather lowering cases refactored to tile.slice
src/ir/transforms/op_conversion_registry.cpp
All four rank/dim cases (rank==2 norm_dim==1, rank==3 norm_dim==2, rank==3 norm_dim==0, rank==3 norm_dim==1) replace per-iteration tile.load with tile.slice on the preloaded Vec tiles. Reshapes are applied where tile.slice produces higher-rank results before calling the existing gather math.
Test expectations for new gather lowering strategy
tests/ut/ir/transforms/test_convert_tensor_to_tile_ops.py
Test comments updated to describe the new structure: single upfront pl.load of full input/index tensors into Vec buffers, then per-iteration pl.tile.slice to extract row/index tiles before pl.tile.gather. Expected IR assertions match the new lowering flow.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

A gathered wish on tiled ground,
Where slices dance in circles 'round,
The bunny loaded, bridged with care,
Each gather hop through tile-sliced air! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title accurately summarizes the main change: the tensor.gather conversion now accepts TileType input in addition to TensorType, fixing a CHECK failure.
Description check ✅ Passed The description clearly explains the problem (CHECK crash when input/index are already TileType), the solution (accept both types and route through emit_load_or_slice), and provides comprehensive test coverage details.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the tensor.gather operation conversion to bridge input and index tensors to MemorySpace::Vec tiles, replacing row-by-row tile.load operations with tile.slice on the pre-loaded tiles. Unit tests have been updated to reflect these changes. Feedback is provided to use INTERNAL_CHECK_SPAN instead of CHECK for verifying internal invariants when a Span is available, ensuring source location information is included in error messages.

Comment thread src/ir/transforms/op_conversion_registry.cpp Outdated
@YunjiQin YunjiQin changed the title fix(ir): bridge tensor.gather inputs to Vec tiles before lowering fix(ir): accept TileType input in tensor.gather conversion Jun 1, 2026
YunjiQin added 4 commits June 1, 2026 14:12
Register Vec input_reqs on tensor.gather so the framework auto-loads
the input/index tensors into Vec tiles before the converter runs,
mirroring tensor.gather_compare and tensor.scatter. The per-row body
now slices from the bridged tiles instead of issuing tile.load against
the original tensors, fixing the CHECK(input_tensor_type) crash when
upstream conversions have already remapped the args to TileType.
The post-bridge TileType checks guard a framework-guaranteed invariant
(input_reqs auto-bridging runs before the converter), not user input,
so they belong as INTERNAL_CHECK_SPAN. Per .claude/rules/error-checking.md
the span source is the lambda's value parameter and stays valid on
failure, so the SPAN variant is safe and attaches IR source location
to the failure message.
The input_reqs bridge broke rank-3 paths: it created a 3D Vec tile
that the per-iteration tile.slice could no longer reduce, since
FlattenTileNdTo2D rejects tile.slice on >2D tiles. Replace the bridge
with an emit_load_or_slice helper that emits tile.load when the
source is still a tensor (any rank — original behavior) and tile.slice
when an upstream conversion has already lowered it to a tile (only
reached on rank-2 paths today). Restore the test Expected to the
per-row tile.load form for the TensorType-input case.
C++17 doesn't allow capturing structured-binding identifiers inside
nested lambdas (fixed in C++20). CI's clang-14 rejects this even
though local clang-18 accepts. Unpack get_shape_dtype's pair into
plain locals so the inner make_loop lambdas can reach input_shape /
input_dtype / index_shape.
@YunjiQin YunjiQin force-pushed the fix/gather_convert branch from 85e343b to 93ee249 Compare June 1, 2026 06:22
@YunjiQin YunjiQin force-pushed the fix/gather_convert branch from 93ee249 to 35e90de Compare June 1, 2026 06:32
@lyfne123 lyfne123 merged commit b5341e8 into hw-native-sys:main Jun 1, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants