fix(verifier): reject unwritten Out/InOut params, not just tensor-creating reassigns (#1525)#1607
fix(verifier): reject unwritten Out/InOut params, not just tensor-creating reassigns (#1525)#1607lyfne123 wants to merge 3 commits into
Conversation
…ating reassigns (#1525) A bare reassignment of a `pl.Out`/`pl.InOut` parameter to a detached value (e.g. `y = pl.matmul(...)`) rebinds the name and leaves the external buffer unwritten, so the kernel silently produces its init value (all-zero). This was the real root cause of #1525 — not a device/pto-isa bug. Generalize the `OutParamNotShadowed` verifier from a tensor-creating-op blacklist (`tensor.create`/`tensor.full`) to the precise invariant: a call-valued reassignment of an Out/InOut param is invalid unless the call threads the param through its arguments. Threading reassigns (`out = pl.assemble(out, ...)`, `out = pl.store(..., out)`) stay valid; detached ones (`out = pl.matmul(...)`) now fail compilation with a clear diagnostic pointing at the idiomatic fix. Make the check actually fire by adding `OutParamNotShadowed` to `GetVerifiedProperties()`. It was already structural but absent from the verified set, so the inline pipeline-input check skipped it and the footgun compiled silently. It now runs once at pipeline input (pre-SSA, where the bare reassignment's AssignStmt still targets the param Var). - Regression tests for the matmul and InOut cases. - Update the error_handling example and the #1525 device test to the new behavior / the supported idiom. - Update verifier docs (en + zh).
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe PR refactors the ChangesOutParamNotShadowed Verifier Refactor
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request improves the OutParamNotShadowed verifier to detect when Out or InOut parameters are reassigned to a detached value (i.e., a call that does not thread the parameter, such as out = pl.matmul(...)), which previously caused the external buffer to remain unwritten and silently produce initial values (issue #1525). The verifier now checks if the parameter is referenced anywhere in the argument subtree of the reassignment call. Documentation, examples, and tests have been updated accordingly. Feedback on the changes suggests generalizing the threading check from CallPtr to ExprPtr to prevent potential loopholes where assigning to an intermediate variable first could bypass the verifier.
| /// True when `call` references `param` anywhere in its argument subtree — i.e. | ||
| /// the reassignment threads the Out/InOut parameter through the call, keeping | ||
| /// the external buffer connected (e.g. `out = tensor.assemble(out, tile, off)`, | ||
| /// `out = tile.store(value, idx, out)`). | ||
| bool CallThreadsParam(const CallPtr& call, const Var* param) { | ||
| var_collectors::VarDefUseCollector uses; | ||
| uses.VisitExpr(call); | ||
| return uses.var_uses.count(param) > 0; | ||
| } |
There was a problem hiding this comment.
To prevent a loophole where assigning to an intermediate variable first bypasses the verifier (e.g., temp = pl.matmul(a, b); out = temp), we should generalize the threading check from Call to any Expr. This helper function can be generalized to accept ExprPtr instead of CallPtr.
| /// True when `call` references `param` anywhere in its argument subtree — i.e. | |
| /// the reassignment threads the Out/InOut parameter through the call, keeping | |
| /// the external buffer connected (e.g. `out = tensor.assemble(out, tile, off)`, | |
| /// `out = tile.store(value, idx, out)`). | |
| bool CallThreadsParam(const CallPtr& call, const Var* param) { | |
| var_collectors::VarDefUseCollector uses; | |
| uses.VisitExpr(call); | |
| return uses.var_uses.count(param) > 0; | |
| } | |
| /// True when expr references param anywhere in its subtree - i.e. | |
| /// the reassignment threads the Out/InOut parameter through the expression, keeping | |
| /// the external buffer connected (e.g. out = tensor.assemble(out, tile, off), | |
| /// out = tile.store(value, idx, out)). | |
| bool ExprThreadsParam(const ExprPtr& expr, const Var* param) { | |
| var_collectors::VarDefUseCollector uses; | |
| uses.VisitExpr(expr); | |
| return uses.var_uses.count(param) > 0; | |
| } |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@examples/utils/error_handling.py`:
- Around line 43-48: The current except block treats any Exception as a
successful compile-time rejection; change it so only exceptions whose message
contains "OutParamNotShadowed" or "issue `#1525`" are treated as OK, and for any
other exception re-raise the exception (raise) or exit non-zero (sys.exit(1)) so
unrelated import/runtime errors fail the demo; update the except block that
binds Exception as e (and uses msg = str(e)) to perform this conditional and
either print the OK message for the expected verifier diagnostic or
propagate/exit on all other errors.
In `@src/ir/verifier/verify_out_param_pass.cpp`:
- Around line 49-56: CallThreadsParam currently treats any read of the Out/InOut
parameter as a write-through because it uses
var_collectors::VarDefUseCollector::VisitExpr and checks uses.var_uses; change
it to detect real write-through semantics: walk the Call AST (and nested Calls)
and only return true if the param is used in an argument position that the call
propagates to its result (e.g., the explicit output buffer arg such as the
“out”/destination arg or known in-place positions) or if the callee is a known
write-through primitive; update CallThreadsParam to inspect Call nodes (their op
and arg indices) instead of any var use, and apply the same logic to the
analogous predicate at lines 95-97 so only true write-through patterns (e.g.,
tensor.assemble/out-store/inplace primitives) count as threading the param.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 15b7cf4b-d96a-4044-a68f-7badccc35a49
📒 Files selected for processing (8)
docs/en/dev/passes/99-verifier.mddocs/zh-cn/dev/passes/99-verifier.mdexamples/utils/error_handling.pyinclude/pypto/ir/transforms/ir_property.hsrc/ir/transforms/ir_property.cppsrc/ir/verifier/verify_out_param_pass.cpptests/st/runtime/ops/test_issue1525_chained_matmul.pytests/ut/ir/transforms/test_verifier.py
| except Exception as e: # noqa: BLE001 -- demo: any compile-time rejection is the success path | ||
| msg = str(e) | ||
| if "OutParamNotShadowed" in msg or "issue #1525" in msg: | ||
| print("OK -- rejected at compile time by the OutParamNotShadowed verifier") | ||
| else: | ||
| print(f"OK -- rejected at compile time: {type(e).__name__}") |
There was a problem hiding this comment.
Unexpected exceptions should still fail this demo.
This now treats any exception as success. If the script hits an unrelated import/runtime/config error, it will still print OK and exit 0, so the example no longer proves that OutParamNotShadowed was the rejection path. Please re-raise or exit non-zero when the exception does not match the expected verifier diagnostic.
Suggested fix
except Exception as e: # noqa: BLE001 -- demo: any compile-time rejection is the success path
msg = str(e)
if "OutParamNotShadowed" in msg or "issue `#1525`" in msg:
print("OK -- rejected at compile time by the OutParamNotShadowed verifier")
else:
- print(f"OK -- rejected at compile time: {type(e).__name__}")
+ print(f"ERROR: unexpected exception {type(e).__name__}: {e}")
+ sys.exit(1)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@examples/utils/error_handling.py` around lines 43 - 48, The current except
block treats any Exception as a successful compile-time rejection; change it so
only exceptions whose message contains "OutParamNotShadowed" or "issue `#1525`"
are treated as OK, and for any other exception re-raise the exception (raise) or
exit non-zero (sys.exit(1)) so unrelated import/runtime errors fail the demo;
update the except block that binds Exception as e (and uses msg = str(e)) to
perform this conditional and either print the OK message for the expected
verifier diagnostic or propagate/exit on all other errors.
| /// True when `call` references `param` anywhere in its argument subtree — i.e. | ||
| /// the reassignment threads the Out/InOut parameter through the call, keeping | ||
| /// the external buffer connected (e.g. `out = tensor.assemble(out, tile, off)`, | ||
| /// `out = tile.store(value, idx, out)`). | ||
| bool CallThreadsParam(const CallPtr& call, const Var* param) { | ||
| var_collectors::VarDefUseCollector uses; | ||
| uses.VisitExpr(call); | ||
| return uses.var_uses.count(param) > 0; |
There was a problem hiding this comment.
Don't treat every read of the parameter as a write-through.
CallThreadsParam() only checks whether the Out/InOut param appears anywhere under the RHS call, so out = tensor.add(out, x) / acc = tensor.mul(acc, w) will now pass even though those ops still produce detached results and leave the external buffer unwritten. That reopens the silent-init-value bug for any pure op that merely reads the accumulator. This predicate needs to recognize actual write-through calls, not generic parameter usage.
Also applies to: 95-97
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/ir/verifier/verify_out_param_pass.cpp` around lines 49 - 56,
CallThreadsParam currently treats any read of the Out/InOut parameter as a
write-through because it uses var_collectors::VarDefUseCollector::VisitExpr and
checks uses.var_uses; change it to detect real write-through semantics: walk the
Call AST (and nested Calls) and only return true if the param is used in an
argument position that the call propagates to its result (e.g., the explicit
output buffer arg such as the “out”/destination arg or known in-place positions)
or if the callee is a known write-through primitive; update CallThreadsParam to
inspect Call nodes (their op and arg indices) instead of any var use, and apply
the same logic to the analogous predicate at lines 95-97 so only true
write-through patterns (e.g., tensor.assemble/out-store/inplace primitives)
count as threading the param.
…harden example (#1525) Address review feedback on the OutParamNotShadowed generalization. The previous check treated the Out/InOut param appearing *anywhere* in the reassigning call's args as "threading". That misses a real footgun: a call that only reads the param still returns a fresh, detached value, e.g. `out = pl.add(out, x)` — the external buffer is never written (silent all-zero), same class as #1525. Refine CallThreadsParam to mirror the orchestration codegen's TraceReturnedToParam, so the verifier flags exactly the reassignments codegen would fail to bind: - builtin output-side ops alias only their target arg (tensor.assemble[0], tile.store[2], tensor.set_validshape[0]); - any other builtin op (tensor.add, tensor.matmul, ...) produces a fresh value and never threads the param, even when it appears as an input operand; - user-function calls may write the param through an Out/InOut argument, so the param appearing in the args still counts as threading (keeps `out = self.kernel(a, out)` valid). Verified against the full unit suite: zero false positives (the legit `param = pl.store(...)` / `pl.assemble(...)` accumulator idioms and user-call threading all still pass), and `out = pl.add(out, x)` is now caught. Also harden the error_handling example to re-raise unexpected exceptions instead of treating any failure as the expected rejection, and update verifier docs.
|
Thanks @coderabbitai and @gemini-code-assist — addressed in 18264aa. coderabbit (write-through vs any read): valid and now fixed. I confirmed on a board-equivalent codegen dump that
I did not apply a pure "write-target position" rule globally, because that would false-positive on the legitimate user-function threading idiom gemini (CallPtr → ExprPtr / intermediate-variable loophole): acknowledged. A non- coderabbit (example except too broad): hardened — |
|
Thanks for the detailed update, @lyfne123. Your approach to the |
|
Tip For best results, initiate chat on the files or code changes.
🐇 ✨ ✏️ Learnings added
|
Summary
Fixes the real root cause of #1525: a bare reassignment of a
pl.Out/pl.InOutparameter to a detached value (e.g.y = pl.matmul(...); return y) rebinds the local name and leaves the external output buffer unwritten, so the kernel silently produces its init value (all-zero). This is a front-end output-binding footgun — not a device/pto-isa bug (the earlier device-side diagnosis was walked back; pto-isa#149 was closed as not-reproduced).What changed
Generalize the
OutParamNotShadowedverifier (src/ir/verifier/verify_out_param_pass.cpp) from a tensor-creating-op blacklist (tensor.create/tensor.fullonly) to the precise invariant:out = pl.assemble(out, tile, off),out = pl.store(value, off, out), indexed writeout[...] = value.out = pl.matmul(...),out = tensor.create(...).Make the check actually fire: add
OutParamNotShadowedtoGetVerifiedProperties()(src/ir/transforms/ir_property.cpp). It was already in the structural set but absent from the verified set, so the inline pipeline-input check skipped it and the footgun compiled silently. It now runs once at pipeline input (pre-SSA) — where the bare reassignment'sAssignStmt.var_still targets the param Var (post-SSA it detaches into a dead value).Why not a broader "no
=to parameters" ruleRefuted empirically: an AST scan of
examples/+tests/found 889 bare-name=assignments to parameters, 741 of which are the blessed accumulator idiomparam = pl.store(t, off, param)/pl.assemble. The distinguishing factor is whether the param is threaded through the rhs, not whether=is used.Validation
VerificationErrorat compile time with an actionable message; the idiomatic form compiles clean (no false positive).tests/utsuite green (5346 passed), no false positives.tests/ut/ir/transforms/test_verifier.py(matmul + InOut cases).examples/utils/error_handling.pyand the [DX] pypto silently emits all-zero when a pl.Out is reassigned to a detached value in a mix/split CORE_GROUP scope — should be a compile-time diagnostic #1525 device test updated to the new behavior / supported idiom.Closes #1525.