Commit f1a61fc

update voxtral-realtime build flag and readme for cuda-windows support (pytorch#18417)
Differential Revision: D97788666
1 parent 0ffca95 commit f1a61fc

4 files changed

Lines changed: 18 additions & 2 deletions

File tree

CMakePresets.json

Lines changed: 2 additions & 1 deletion
@@ -152,7 +152,8 @@
  "llm-release"
  ],
  "cacheVariables": {
- "EXECUTORCH_BUILD_CUDA": "ON"
+ "EXECUTORCH_BUILD_CUDA": "ON",
+ "CMAKE_CUDA_ARCHITECTURES": "native"
  },
  "condition": {
  "type": "inList",

examples/models/voxtral_realtime/CMakePresets.json

Lines changed: 2 additions & 0 deletions
@@ -57,6 +57,7 @@
  "name": "voxtral-realtime-cpu",
  "displayName": "Build Voxtral Realtime runner (CPU)",
  "configurePreset": "voxtral-realtime-cpu",
+ "configuration": "Release",
  "targets": [
  "voxtral_realtime_runner"
  ]
@@ -73,6 +74,7 @@
  {
  "name": "voxtral-realtime-cuda",
  "displayName": "Build Voxtral Realtime runner (CUDA)",
+ "configuration": "Release",
  "configurePreset": "voxtral-realtime-cuda",
  "targets": [
  "voxtral_realtime_runner"

examples/models/voxtral_realtime/README.md

Lines changed: 0 additions & 1 deletion
@@ -198,7 +198,6 @@ capability to avoid "invalid device function" errors (the `int4mm` kernels
  require SM 80+).
  
  ```powershell
- $env:CMAKE_CUDA_ARCHITECTURES="80;86;89;90;120"
  cmake --workflow --preset llm-release-cuda
  Push-Location examples/models/voxtral_realtime
  cmake --workflow --preset voxtral-realtime-cuda

tools/cmake/preset/README.md

Lines changed: 14 additions & 0 deletions
@@ -65,6 +65,20 @@ $ cmake --workflow --preset llm-debug-cuda
  $ cmake --workflow --preset llm-debug-metal
  ```
  
+ > [!NOTE]
+ > **CUDA architecture selection:** The `llm-release-cuda` (and `llm-debug-cuda`)
+ > preset sets `CMAKE_CUDA_ARCHITECTURES=native`, which auto-detects the GPU
+ > on the build machine at configure time. To target a different architecture,
+ > override it with `-D` on the configure step:
+ > ```bash
+ > cmake --preset llm-release-cuda -DCMAKE_CUDA_ARCHITECTURES="80;86;89;90;120"
+ > cmake --build --preset llm-release-cuda --config Release
+ > ```
+ > Note that `cmake --workflow` does not accept `-D` flags, so you must run
+ > configure and build as separate steps when overriding. Also note that on
+ > Windows, setting `CMAKE_CUDA_ARCHITECTURES` via environment variable does
+ > **not** work with CMake presets — you must use the `-D` flag.
+ 
  #### Understanding workflow components
  
  A workflow preset typically consists of:
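
For Windows users who relied on the removed `$env:CMAKE_CUDA_ARCHITECTURES` line, the override described in the new note might look like the following in PowerShell. This is a sketch, not part of the commit: it reuses the two commands from the note and the architecture list from the deleted README line, and it assumes PowerShell passes the quoted, semicolon-separated list to `cmake` as a single argument. Trim the list to the GPUs you actually target.

```powershell
# Configure with an explicit architecture list instead of the preset's "native" default.
# The list below is the one removed from the voxtral_realtime README; adjust as needed.
cmake --preset llm-release-cuda -DCMAKE_CUDA_ARCHITECTURES="80;86;89;90;120"

# `cmake --workflow` does not accept -D, so the build runs as a separate step.
cmake --build --preset llm-release-cuda --config Release
```

When the build machine has the same GPU as the deployment target, no override should be needed, since the preset's `native` setting detects the architecture at configure time.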
