[minor] Refactor TE fused-norm handling in GPTModelExporter #1061
yueshen2016 wants to merge 1 commit into main from
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main    #1061     +/-  ##
========================================
+ Coverage   70.30%   70.31%   +0.01%
========================================
  Files         227      227
  Lines       25847    25847
========================================
+ Hits        18172    18175       +3
+ Misses       7675     7672       -3
Extract a `_get_fused_norm_weight` helper to consolidate the repeated three-condition check (`"fused_norm" in self.rules`, attribute navigation, `layer_norm_weight is not None`) that appeared in `_get_transformer_layer_state_dict` (attention and MLP paths) and `_get_mamba_layer_state_dict`.

- Replace double `hasattr` chains with `getattr(..., None)` chaining
- Remove the redundant `isinstance(layer.norm, IdentityOp)` in the Mamba `elif` (already guaranteed by being in the `elif` branch)
- Use the walrus operator to capture `norm_weight` without repeating the attribute access at the call site

No behavior change.

Signed-off-by: James Shen <yueshen@nvidia.com>
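The `hasattr`-to-`getattr` swap described in the commit message can be illustrated with a small standalone sketch. The classes and function names here are hypothetical stand-ins, not the real Megatron modules; only the attribute name `layer_norm_weight` comes from the commit message:

```python
# Hypothetical sketch: replacing a double hasattr chain with
# getattr(..., None) chaining. Norm/Layer are stand-in classes.

class Norm:
    layer_norm_weight = [0.5]

class Layer:
    def __init__(self, norm=None):
        if norm is not None:
            self.norm = norm  # the attribute may be absent entirely

def fused_weight_before(layer):
    # Before: double hasattr chain
    if hasattr(layer, "norm") and hasattr(layer.norm, "layer_norm_weight"):
        return layer.norm.layer_norm_weight
    return None

def fused_weight_after(layer):
    # After: getattr already returns the default for missing attributes,
    # and getattr(None, ..., None) simply yields the default again
    return getattr(getattr(layer, "norm", None), "layer_norm_weight", None)

assert fused_weight_before(Layer(Norm())) == fused_weight_after(Layer(Norm())) == [0.5]
assert fused_weight_before(Layer()) is fused_weight_after(Layer()) is None
```

The `after` form is a single expression, which is what makes it usable inside an `elif` condition without pre-computing intermediate variables.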
6ac5a43 to 8783a4f
What does this PR do?
Type of change: Refactor (no behavior change)
Extract a `_get_fused_norm_weight` helper method in `GPTModelExporter` to consolidate the repeated TE fused-norm detection logic that previously appeared as three separate multi-condition `elif` blocks in `_get_transformer_layer_state_dict` (attention and MLP paths) and `_get_mamba_layer_state_dict`.

Changes:

- Add `_get_fused_norm_weight(module)` that checks `"fused_norm" in self.rules` and returns `getattr(module, "layer_norm_weight", None)`
- Replace double `hasattr` chains with `getattr(..., None)` chaining, since `getattr` already returns `None` for missing attributes
- Remove the redundant `isinstance(layer.norm, IdentityOp)` in the Mamba `elif` (guaranteed by being an `elif` branch)
- Use the walrus operator (`:=`) to capture `norm_weight` without repeating the attribute traversal on the call line

Before / After
Before (one of three nearly-identical blocks):
After:
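The inline before/after snippets did not survive this scrape. As a runnable sketch of the refactor pattern described above, under the assumption that `self.rules` is a dict-like rule table and `layer_norm_weight` lives directly on the module (the `Exporter` and `FakeModule` classes here are hypothetical stand-ins, not the real `GPTModelExporter` code):

```python
# Illustrative sketch only: `rules`, `layer_norm_weight`, and
# `_get_fused_norm_weight` come from the PR description; the
# surrounding classes are stand-ins.

class FakeModule:
    """Stand-in for a TE fused linear module that may carry a fused norm weight."""
    def __init__(self, layer_norm_weight=None):
        if layer_norm_weight is not None:
            self.layer_norm_weight = layer_norm_weight

class Exporter:
    def __init__(self, rules):
        self.rules = rules

    # Before (roughly): each of the three call sites repeated
    #   elif ("fused_norm" in self.rules
    #         and hasattr(module, "layer_norm_weight")
    #         and module.layer_norm_weight is not None):
    #       norm_weight = module.layer_norm_weight

    # After: one helper holds the whole three-condition check.
    def _get_fused_norm_weight(self, module):
        if "fused_norm" not in self.rules:
            return None
        return getattr(module, "layer_norm_weight", None)

    def layer_state_dict(self, module):
        # Walrus operator: capture the weight without repeating the lookup.
        if (norm_weight := self._get_fused_norm_weight(module)) is not None:
            return {"fused_norm.weight": norm_weight}
        return {}

exp = Exporter(rules={"fused_norm": {}})
assert exp.layer_state_dict(FakeModule([1.0])) == {"fused_norm.weight": [1.0]}
assert exp.layer_state_dict(FakeModule()) == {}
assert Exporter(rules={}).layer_state_dict(FakeModule([1.0])) == {}
```

Because the helper returns `None` both when the rule is absent and when the attribute is missing, every call site collapses to a single walrus-guarded `if`/`elif`.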
Testing
No behavior change — existing tests cover all paths.
🤖 Generated with Claude Code