fix(ir): accept TileType input in tensor.gather conversion#1609
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe PR refactors the ChangesGather Operation Lowering Refactoring
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request updates the tensor.gather operation conversion to bridge input and index tensors to MemorySpace::Vec tiles, replacing row-by-row tile.load operations with tile.slice on the pre-loaded tiles. Unit tests have been updated to reflect these changes. Feedback is provided to use INTERNAL_CHECK_SPAN instead of CHECK for verifying internal invariants when a Span is available, ensuring source location information is included in error messages.
Register Vec input_reqs on tensor.gather so the framework auto-loads the input/index tensors into Vec tiles before the converter runs, mirroring tensor.gather_compare and tensor.scatter. The per-row body now slices from the bridged tiles instead of issuing tile.load against the original tensors, fixing the CHECK(input_tensor_type) crash when upstream conversions have already remapped the args to TileType.
The post-bridge TileType checks guard a framework-guaranteed invariant (input_reqs auto-bridging runs before the converter), not user input, so they belong as INTERNAL_CHECK_SPAN. Per .claude/rules/error-checking.md the span source is the lambda's value parameter and stays valid on failure, so the SPAN variant is safe and attaches IR source location to the failure message.
The input_reqs bridge broke rank-3 paths: it created a 3D Vec tile that the per-iteration tile.slice could no longer reduce, since FlattenTileNdTo2D rejects tile.slice on >2D tiles. Replace the bridge with an emit_load_or_slice helper that emits tile.load when the source is still a tensor (any rank — original behavior) and tile.slice when an upstream conversion has already lowered it to a tile (only reached on rank-2 paths today). Restore the test Expected to the per-row tile.load form for the TensorType-input case.
C++17 doesn't allow capturing structured-binding identifiers inside nested lambdas (fixed in C++20). CI's clang-14 rejects this even though local clang-18 accepts. Unpack get_shape_dtype's pair into plain locals so the inner make_loop lambdas can reach input_shape / input_dtype / index_shape.
85e343b to
93ee249
Compare
93ee249 to
35e90de
Compare
Summary
tensor.gatherpreviously crashed inconvert_tensor_to_tile_opswithCHECK(input_tensor_type)when an upstream conversion had already lowered its input/index toTileType(e.g. when a precedingtensor.createhad been rewritten totile.create).This PR makes the converter accept either
TensorTypeorTileTypefor bothinputandindex:shape/dtypefrom whichever type the arg carries.emit_load_or_slicehelper: it emitstile.loadwhen the source is still aTensorType(original behavior, valid at any rank) andtile.slicewhen an upstream pass has already produced aTileType(only reached on rank-2 paths today, becauseFlattenTileNdTo2Drejectstile.sliceon >2D tiles).tile.load, so the rank-3 lowering is bit-identical to before.Test plan
tests/ut/ir/transforms/test_convert_tensor_to_tile_ops.py::TestConvertGatherOp— UTs passtests/st/runtime/ops/test_gather.py::TestGatherIndexrank-3 cases — pre-compile path no longer tripsFlattenTileNdTo2D: tile.slice is not supported on >2D tiles; system-tests / dist-system-tests CI lanes green