Add Sortformer CUDA export and Linux/Windows CUDA CI coverage (pytorch#17865)
## Summary
This PR adds CUDA coverage for Sortformer in both Linux and Windows CI,
and updates the Sortformer example/export path so CUDA artifacts are
exportable and runnable end-to-end.
## What Changed
- Added Sortformer to CUDA export/e2e matrices in:
  - `.github/workflows/cuda.yml` (Linux CUDA)
  - `.github/workflows/cuda-windows.yml` (Windows CUDA runtime, Linux export)
- Extended CI export/test scripts for Sortformer:
  - `.ci/scripts/export_model_artifact.sh`
    - Added `nvidia/diar_streaming_sortformer_4spk-v2` support
    - Added Sortformer-specific export path
    - Enforced non-quantized Sortformer export
  - `.ci/scripts/test_model_e2e.sh`
    - Added Sortformer model routing, test audio download, and runner invocation
  - `.ci/scripts/test_model_e2e_windows.ps1`
    - Added Sortformer runner path/args and expected-output validation
- Enabled Sortformer CUDA build targets:
  - `examples/models/sortformer/CMakePresets.json`
    - Added `sortformer-cuda` configure/build/workflow presets
  - `Makefile`
    - Added `sortformer-cuda` target and help text
- Updated Sortformer runner to accept CUDA named-data blob:
  - `examples/models/sortformer/main.cpp`
    - Added `--data_path`
  - `examples/models/sortformer/sortformer_runner.h/.cpp`
    - Added constructor overload/path handling for optional `.ptd`
- Updated Sortformer exporter for CUDA backends:
  - `examples/models/sortformer/export_sortformer.py`
    - Added backend choices: `cuda`, `cuda-windows`
    - Added CUDA/CUDA-Windows lowering path
    - Writes external tensor data via `write_tensor_data_to_file(output_dir)`
    - Verifies `aoti_cuda_blob.ptd` exists in output dir
    - Added explicit print for blob write location
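The exporter's post-write blob check could look roughly like the sketch below. This is an illustration only, not the code in `export_sortformer.py`: the helper `verify_cuda_blob` and the two constants are hypothetical names, and only the blob file name and the two backend names are taken from the change list above.

```python
from pathlib import Path
from typing import Optional

# Backend names and blob file name come from the change list;
# the constant names themselves are hypothetical.
CUDA_BACKENDS = ("cuda", "cuda-windows")
CUDA_BLOB_NAME = "aoti_cuda_blob.ptd"


def verify_cuda_blob(output_dir: str, backend: str) -> Optional[Path]:
    """Return the blob path for CUDA backends, or None for any other backend.

    Raises FileNotFoundError if a CUDA backend was requested but the
    blob is missing from output_dir.
    """
    if backend not in CUDA_BACKENDS:
        # Assumption: non-CUDA exports produce no external tensor data blob.
        return None

    blob_path = Path(output_dir) / CUDA_BLOB_NAME
    if not blob_path.is_file():
        raise FileNotFoundError(
            f"Expected {CUDA_BLOB_NAME} in {output_dir} after export"
        )

    # Mirrors the "explicit print for blob write location" item.
    print(f"CUDA tensor data blob written to {blob_path}")
    return blob_path
```

In this sketch the check would run right after the call to `write_tensor_data_to_file(output_dir)`, so a missing blob fails the export step instead of surfacing later when the runner is given `--data_path`.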
## Validation
- `python -m py_compile examples/models/sortformer/export_sortformer.py`
- CI coverage is now wired for:
  - Linux CUDA export + e2e Sortformer
  - Windows CUDA e2e Sortformer (using exported artifact)
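For reference, the `py_compile` check listed above has a programmatic equivalent in the standard library. The sketch below assumes it is run from the repository root, and the wrapper `compiles_cleanly` is a hypothetical helper, not part of this PR. Like the command-line form, it only confirms that the file compiles to bytecode; it does not import or execute the exporter.

```python
import py_compile


def compiles_cleanly(path: str) -> bool:
    """Return True if the file at `path` byte-compiles without errors."""
    try:
        # doraise=True turns compile errors into PyCompileError
        # instead of printing them to stderr.
        py_compile.compile(path, doraise=True)
    except py_compile.PyCompileError as err:
        print(f"Syntax check failed: {err}")
        return False
    return True


if __name__ == "__main__":
    # Path taken from the Validation list above.
    compiles_cleanly("examples/models/sortformer/export_sortformer.py")
```

A missing file raises `FileNotFoundError` here and is not reported as a syntax failure, which keeps a wrong path from being mistaken for a broken script.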