Commit e56ccd9

ci(docker): build CUDA images on CUDA 13 for Blackwell / GB10 (DGX Spark)

CUDA 12.6 tops out at sm_90, so the CUDA images would not run on GB10 / Grace-Blackwell. The vendored ggml's CUDA CMake adds 120a-real at CUDA >= 12.8 and 121a-real (GB10 / DGX Spark / Thor) at CUDA >= 12.9, all under our GGML_NATIVE=OFF default. Bumping both arches to nvidia/cuda:13.0.1 therefore compiles Turing through Blackwell with no manual arch list: amd64 picks up Hopper / Ada / RTX 50, arm64 picks up GH200 (sm_90 PTX) and GB10 (sm_121).

Assisted-by: Claude:claude-opus-4-8 [Claude Code]
1 parent 63e1339 commit e56ccd9
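
The version gates described in the commit message can be sketched as a small shell function. This is only an illustration of the stated thresholds (120a-real at CUDA >= 12.8, 121a-real at CUDA >= 12.9); `blackwell_arches` is a hypothetical helper, not part of the repository, and it does not reproduce ggml's actual CMake logic or the rest of its architecture list.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): prints the Blackwell CUDA
# architectures that, per the commit message, ggml adds for a given CUDA
# toolkit version under GGML_NATIVE=OFF.
blackwell_arches() {
  local version="$1"              # e.g. "12.6.2", "12.9", "13.0.1"
  local major rest minor
  major="${version%%.*}"          # text before the first dot
  rest="${version#*.}"            # text after the first dot
  minor="${rest%%.*}"             # text before the next dot, if any
  [[ "$version" == *.* ]] || minor=0   # bare major such as "13"

  local arches=()
  # 120a-real is added at CUDA >= 12.8
  if (( major > 12 || (major == 12 && minor >= 8) )); then
    arches+=("120a-real")
  fi
  # 121a-real (GB10 / DGX Spark / Thor) is added at CUDA >= 12.9
  if (( major > 12 || (major == 12 && minor >= 9) )); then
    arches+=("121a-real")
  fi
  echo "${arches[*]}"
}

blackwell_arches 12.6.2   # prints an empty line
blackwell_arches 12.8     # prints: 120a-real
blackwell_arches 13.0.1   # prints: 120a-real 121a-real
```

Under these assumptions the old 12.6.2 base yields neither Blackwell target, which is why the base image bump alone is enough and no manual arch list is needed.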

3 files changed

Lines changed: 12 additions & 6 deletions

.github/workflows/docker.yml

Lines changed: 8 additions & 4 deletions
@@ -8,6 +8,10 @@ name: docker
 # ubuntu-24.04, arm64 on ubuntu-24.04-arm. The per-arch images are pushed by
 # digest, then a merge job assembles one multi-arch manifest per variant.
 #
+# The CUDA images use the CUDA 13 base so ggml compiles the Blackwell
+# architectures (sm_120 + sm_121); that is what makes the arm64 CUDA image run
+# on GB10 / Grace-Blackwell (DGX Spark). CUDA 12.6 tops out at sm_90.
+#
 # On pull_request the images are built but NOT pushed, so the Dockerfile stays
 # a merge gate. On push to the default branch, tags, and manual dispatch the
 # images are pushed.
@@ -53,14 +57,14 @@ jobs:
 - variant: cuda
 arch: amd64
 runner: ubuntu-24.04
-build_base: nvidia/cuda:12.6.2-devel-ubuntu24.04
-runtime_base: nvidia/cuda:12.6.2-runtime-ubuntu24.04
+build_base: nvidia/cuda:13.0.1-devel-ubuntu24.04
+runtime_base: nvidia/cuda:13.0.1-runtime-ubuntu24.04
 cmake_args: "-DPARAKEET_GGML_CUDA=ON"
 - variant: cuda
 arch: arm64
 runner: ubuntu-24.04-arm
-build_base: nvidia/cuda:12.6.2-devel-ubuntu24.04
-runtime_base: nvidia/cuda:12.6.2-runtime-ubuntu24.04
+build_base: nvidia/cuda:13.0.1-devel-ubuntu24.04
+runtime_base: nvidia/cuda:13.0.1-runtime-ubuntu24.04
 cmake_args: "-DPARAKEET_GGML_CUDA=ON"
 steps:
 - name: Checkout (with submodules)

Dockerfile

Lines changed: 2 additions & 2 deletions
@@ -12,8 +12,8 @@
 #
 # CUDA:
 # docker build -t parakeet.cpp:cuda \
-# --build-arg BUILD_BASE=nvidia/cuda:12.6.2-devel-ubuntu24.04 \
-# --build-arg RUNTIME_BASE=nvidia/cuda:12.6.2-runtime-ubuntu24.04 \
+# --build-arg BUILD_BASE=nvidia/cuda:13.0.1-devel-ubuntu24.04 \
+# --build-arg RUNTIME_BASE=nvidia/cuda:13.0.1-runtime-ubuntu24.04 \
 # --build-arg CMAKE_EXTRA_ARGS=-DPARAKEET_GGML_CUDA=ON .
 #
 # The build context must be a checkout with the ggml submodule populated

README.md

Lines changed: 2 additions & 0 deletions
@@ -111,6 +111,8 @@ docker run --rm --gpus all \
 transcribe --model /models/parakeet-tdt_ctc-110m-q5_k.gguf --input /audio/speech.wav --decoder tdt
 ```
 
+The CUDA image is built on CUDA 13, so it covers everything from Turing up through Blackwell, including GB10 / Grace-Blackwell (DGX Spark) on arm64.
+
 To build the image yourself, see the build args at the top of the [`Dockerfile`](Dockerfile). The CPU image is the portable `GGML_NATIVE=OFF` build, so it runs on any amd64 or arm64 host.
 
 ---
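
As a rough way to apply the README's "Turing up through Blackwell" statement to a given host, the check below compares a GPU's compute capability against Turing (7.5). It is a sketch under stated assumptions: `at_least_turing` is a hypothetical helper, not part of the repository, and it tests only the lower bound, so passing it does not prove that an architecture newer than those named in this commit is covered.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the repository): succeeds when a compute
# capability such as "7.5" or "12.1" is at or above Turing (7.5).
at_least_turing() {
  local cap="$1"
  local major="${cap%%.*}"        # text before the first dot
  local minor="${cap#*.}"         # text after the first dot
  [[ "$cap" == *.* ]] || minor=0  # bare major such as "8"
  (( major > 7 || (major == 7 && minor >= 5) ))
}

# Example usage on a host with an NVIDIA driver (assumes the installed
# nvidia-smi supports the compute_cap query field):
#   cap="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n1)"
#   at_least_turing "$cap" && echo "covered" || echo "older than Turing"
```

GB10 reports as sm_121 per the commit message, which corresponds to a compute capability of 12.1 and passes this check.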
