Conversation
Signed-off-by: Kai Xu <kaix@nvidia.com>
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
📝 Walkthrough

This PR introduces Video Sparse Attention (VSA), a new sparse attention method for video processing. It adds configuration classes (VSAAttributeConfig, VSAConfig), a VSA implementation with compression and sparse branches operating on 3D video blocks, tile-based utility functions, method registration, dynamic configuration selection, and comprehensive unit tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Forward Call
    participant VSA as VSA.forward_attention()
    participant Meta as Metadata Computation
    participant Tile as Tiling Utilities
    participant Kernel as Triton Kernel<br/>(video_sparse_attn)
    participant Untile as Untiling Utilities
    Client->>VSA: query, key, value, gate_compress, video_shape
    VSA->>Meta: Compute/retrieve metadata (tiles, block sizes)
    Meta-->>VSA: tile_indices, variable_block_sizes, non_pad_indices
    VSA->>Tile: Tile Q, K, V, gate_compress
    Tile-->>VSA: tiled tensors (padded layout)
    VSA->>Kernel: Invoke video_sparse_attn(Q, K, V, block_size_3d, top_k_ratio)
    Kernel-->>VSA: attended output (tiled, padded)
    VSA->>Untile: Untile output back to original sequence
    Untile-->>VSA: output (original sequence order)
    VSA-->>Client: final_output, stats dict (sparsity, block_stats)
```
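The tile → kernel → untile round trip above can be sketched in plain Python. This is a simplified 1D stand-in — the actual PR tiles 3D (T, H, W) video blocks and runs a Triton kernel in between — and the helper names here are illustrative, not the modelopt API:

```python
import math

def tile_1d(seq, block, pad=0):
    """Split a flat token sequence into fixed-size blocks, padding the tail.

    Returns the padded blocks plus the original length so the inverse
    transform can drop the padding again (the role of non_pad_indices above).
    """
    n = len(seq)
    n_blocks = math.ceil(n / block)
    padded = seq + [pad] * (n_blocks * block - n)
    blocks = [padded[i * block:(i + 1) * block] for i in range(n_blocks)]
    return blocks, n

def untile_1d(blocks, orig_len):
    """Inverse of tile_1d: concatenate blocks and strip the padding."""
    flat = [x for b in blocks for x in b]
    return flat[:orig_len]

# Round trip: 10 tokens with block size 4 -> 3 blocks, last one padded.
tokens = list(range(10))
blocks, n = tile_1d(tokens, block=4)
assert len(blocks) == 3 and blocks[-1] == [8, 9, 0, 0]
assert untile_1d(blocks, n) == tokens
```

The padded layout is what lets the kernel index blocks uniformly; the untile step restores the original sequence order before the output is returned.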
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ❌ 2 failed (1 warning, 1 inconclusive) | ✅ 2 passed
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #1053      +/-   ##
==========================================
+ Coverage   70.18%   70.41%   +0.22%
==========================================
  Files         223      229       +6
  Lines       25712    26039     +327
==========================================
+ Hits        18047    18336     +289
- Misses       7665     7703      +38
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/unit/torch/sparsity/attention_sparsity/test_vsa.py (1)
**269-299:** Consider adding an edge case test for `top_k_ratio=1.0`.

The tests cover invalid values (0.0, 1.5) but don't explicitly test that `top_k_ratio=1.0` (keep all blocks) is valid. This boundary is allowed by the validator (0.0 < v <= 1.0).

💡 Suggested addition

```diff
 def test_video_shape_none_allowed(self):
     cfg = VSAAttributeConfig(video_shape=None)
     assert cfg.video_shape is None

+def test_top_k_ratio_boundary_valid(self):
+    """top_k_ratio=1.0 (keep all blocks) should be valid."""
+    cfg = VSAAttributeConfig(top_k_ratio=1.0)
+    assert cfg.top_k_ratio == 1.0
+
 def test_vsa_config_defaults(self):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/torch/sparsity/attention_sparsity/test_vsa.py` around lines 269 - 299, Add a test ensuring top_k_ratio == 1.0 is accepted: inside TestVSAAttributeConfig add a new test (e.g., test_top_k_ratio_one_valid) that constructs VSAAttributeConfig(top_k_ratio=1.0) and asserts cfg.top_k_ratio == 1.0 (and/or no ValidationError is raised); reference VSAAttributeConfig and the TestVSAAttributeConfig class so the new test sits with the existing tests for top_k_ratio edge cases.
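The boundary the reviewer wants covered can be illustrated with a minimal stand-in validator. This is plain Python mirroring the stated constraint (0.0 < v <= 1.0), not the actual Pydantic model from the PR:

```python
def validate_top_k_ratio(v: float) -> float:
    """Stand-in for the config validator: accept only 0.0 < v <= 1.0."""
    if not (0.0 < v <= 1.0):
        raise ValueError(f"top_k_ratio must be in (0.0, 1.0], got {v}")
    return v

# The invalid values already covered by the existing tests are rejected...
for bad in (0.0, 1.5):
    try:
        validate_top_k_ratio(bad)
        raise AssertionError("should have raised ValueError")
    except ValueError:
        pass

# ...and the boundary the review asks to cover, 1.0, is accepted.
assert validate_top_k_ratio(1.0) == 1.0
```

Because 1.0 sits exactly on the inclusive edge of the range, it is the value most likely to regress if the comparison is ever tightened to a strict `<`, which is why an explicit test is worthwhile.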
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modelopt/torch/sparsity/attention_sparsity/sparse_attention.py`:
- Around line 73-92: The code can lose config data when attribute_cfg is a
config object of the wrong class because config_dict is only populated for dict
inputs; fix by populating config_dict from the object before re-instantiation:
if attribute_cfg is not a dict but is an object (e.g., a Pydantic model),
extract its fields into config_dict (use attribute_cfg.dict() if available, else
attribute_cfg.__dict__ or vars(attribute_cfg)) and then instantiate
config_class(**config_dict); update the logic around attribute_cfg, config_dict,
config_class, VSAAttributeConfig and SparseAttentionAttributeConfig so the
original object's values are preserved when converting between classes.
---
Nitpick comments:
In `@tests/unit/torch/sparsity/attention_sparsity/test_vsa.py`:
- Around line 269-299: Add a test ensuring top_k_ratio == 1.0 is accepted:
inside TestVSAAttributeConfig add a new test (e.g., test_top_k_ratio_one_valid)
that constructs VSAAttributeConfig(top_k_ratio=1.0) and asserts cfg.top_k_ratio
== 1.0 (and/or no ValidationError is raised); reference VSAAttributeConfig and
the TestVSAAttributeConfig class so the new test sits with the existing tests
for top_k_ratio edge cases.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 855aa644-273c-466e-b11c-4b3b2e9d9e99
📒 Files selected for processing (8)
- modelopt/torch/sparsity/attention_sparsity/config.py
- modelopt/torch/sparsity/attention_sparsity/methods/__init__.py
- modelopt/torch/sparsity/attention_sparsity/methods/registry.py
- modelopt/torch/sparsity/attention_sparsity/methods/vsa.py
- modelopt/torch/sparsity/attention_sparsity/methods/vsa_utils.py
- modelopt/torch/sparsity/attention_sparsity/sparse_attention.py
- tests/examples/llm_eval/test_llm_eval.py
- tests/unit/torch/sparsity/attention_sparsity/test_vsa.py
💤 Files with no reviewable changes (1)
- tests/examples/llm_eval/test_llm_eval.py
```diff
 from .config import VSAAttributeConfig

 # Determine which config class to use based on method
 config_dict = attribute_cfg or {}
 if isinstance(attribute_cfg, dict):
     method = config_dict.get("method", "flash_skip_softmax")
 elif attribute_cfg is not None and hasattr(attribute_cfg, "method"):
     method = attribute_cfg.method
 else:
     method = "flash_skip_softmax"

 # Select appropriate config class based on method
 if method == "vsa":
     config_class = VSAAttributeConfig
 else:
     config_class = SparseAttentionAttributeConfig

 # Ensure config is validated through Pydantic
-if not isinstance(attribute_cfg, SparseAttentionAttributeConfig):
-    attribute_cfg = SparseAttentionAttributeConfig(**(attribute_cfg or {}))
+if not isinstance(attribute_cfg, (SparseAttentionAttributeConfig, VSAAttributeConfig)):
+    attribute_cfg = config_class(**(config_dict))
```
Potential data loss when config object type doesn't match selected method.
If attribute_cfg is a config object (not a dict) with method="vsa", but is of type SparseAttentionAttributeConfig, the code selects VSAAttributeConfig as the config_class. Then the isinstance check at line 91 fails, and line 92 creates a new config from config_dict which is empty {} (since line 76-82 only populates it for dict inputs). This silently discards the original config values.
🐛 Proposed fix to preserve config values

```diff
 # Determine which config class to use based on method
 config_dict = attribute_cfg or {}
 if isinstance(attribute_cfg, dict):
     method = config_dict.get("method", "flash_skip_softmax")
 elif attribute_cfg is not None and hasattr(attribute_cfg, "method"):
     method = attribute_cfg.method
+    config_dict = attribute_cfg.model_dump()
 else:
     method = "flash_skip_softmax"

 # Select appropriate config class based on method
 if method == "vsa":
     config_class = VSAAttributeConfig
 else:
     config_class = SparseAttentionAttributeConfig

 # Ensure config is validated through Pydantic
 if not isinstance(attribute_cfg, (SparseAttentionAttributeConfig, VSAAttributeConfig)):
     attribute_cfg = config_class(**config_dict)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modelopt/torch/sparsity/attention_sparsity/sparse_attention.py` around lines
73 - 92, The code can lose config data when attribute_cfg is a config object of
the wrong class because config_dict is only populated for dict inputs; fix by
populating config_dict from the object before re-instantiation: if attribute_cfg
is not a dict but is an object (e.g., a Pydantic model), extract its fields into
config_dict (use attribute_cfg.dict() if available, else attribute_cfg.__dict__
or vars(attribute_cfg)) and then instantiate config_class(**config_dict); update
the logic around attribute_cfg, config_dict, config_class, VSAAttributeConfig
and SparseAttentionAttributeConfig so the original object's values are preserved
when converting between classes.
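The failure mode and the fix can be reproduced with plain dataclasses standing in for the Pydantic models. The class and field names mirror the PR, but the bodies are a simplified sketch (e.g. `asdict` plays the role Pydantic's `model_dump()` plays in the proposed fix, and `threshold` / `top_k_ratio` here are illustrative fields):

```python
from dataclasses import dataclass, asdict

@dataclass
class SparseAttentionAttributeConfig:
    method: str = "flash_skip_softmax"
    threshold: float = 1e-4

@dataclass
class VSAAttributeConfig(SparseAttentionAttributeConfig):
    method: str = "vsa"
    top_k_ratio: float = 0.5

def resolve_config(attribute_cfg):
    """Pick the config class by `method`, keeping the original field values.

    Without the asdict() line below, a SparseAttentionAttributeConfig object
    carrying method="vsa" would be re-instantiated from an empty dict and its
    values silently lost -- the bug described in the review comment.
    """
    config_dict = attribute_cfg if isinstance(attribute_cfg, dict) else {}
    if isinstance(attribute_cfg, dict):
        method = config_dict.get("method", "flash_skip_softmax")
    elif attribute_cfg is not None:
        method = attribute_cfg.method
        # The fix: dump the object's fields so they survive re-instantiation.
        config_dict = asdict(attribute_cfg)
    else:
        method = "flash_skip_softmax"

    config_class = VSAAttributeConfig if method == "vsa" else SparseAttentionAttributeConfig
    if not isinstance(attribute_cfg, config_class):
        attribute_cfg = config_class(**config_dict)
    return attribute_cfg

# A base-class object requesting the "vsa" method is converted without loss.
base = SparseAttentionAttributeConfig(method="vsa", threshold=5e-3)
resolved = resolve_config(base)
assert isinstance(resolved, VSAAttributeConfig)
assert resolved.threshold == 5e-3  # original value preserved
```

The conversion works because every field of the base class is also a field of the subclass, so the dumped dict is always a valid constructor argument set for the selected class.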
What does this PR do?

Type of change: ?

Usage

```python
# Add a code snippet demonstrating how to use this
```

Testing

Before your PR is "Ready for review"

- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: ✅ / ❌ / N/A

Additional Information
Summary by CodeRabbit
New Features
Tests