Fused QKV add node issue for GQA graph surgery #1057
Fused QKV add node issue for GQA graph surgery #1057hthadicherla wants to merge 1 commit intomainfrom
Conversation
… before qkv matmul was fused causing the script to remove this node assuming is was a dead node Signed-off-by: Hrishith Thadicherla <hthadicherla@nvidia.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughA single-file change modifies the timing of when a newly created Add node (qkv_proj/Add) is appended during bias fusion in GQA replacement. The node is now collected in qkv_matmul_nodes for later insertion rather than appended directly to the graph immediately. Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Comment Tip CodeRabbit can use TruffleHog to scan for secrets in your code with verification capabilities.Add a TruffleHog config file (e.g. trufflehog-config.yml, trufflehog.yml) to your project to customize detectors and scanning behavior. The tool runs only when a config file is present. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1057 +/- ##
==========================================
+ Coverage 70.30% 70.31% +0.01%
==========================================
Files 227 227
Lines 25847 25847
==========================================
+ Hits 18172 18175 +3
+ Misses 7675 7672 -3 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
What does this PR do?
Type of change: Bug fix
There was a small issue where for models like qwen which have bias add nodes, while fusing the q,k,v matmul and q,k,v add nodes , the fused qkv bias add node was added to the graph before the fused qkv matmul node, causing the removal script to assume that the fused matmul and the add node were part of dead subgraph hence removing them. I just changed the order in which they are added. Now there are no issues.
Summary by CodeRabbit