ci(docker): build CUDA images on CUDA 13 for Blackwell / GB10 (DGX Spark)

mudler · mudler · commit e56ccd925d3d · 2026-06-02T21:10:17.000Z
CUDA 12.6 tops out at sm_90, so the CUDA images would not run on GB10 /
Grace-Blackwell. The vendored ggml's CUDA CMake adds 120a-real at CUDA >= 12.8
and 121a-real (GB10 / DGX Spark / Thor) at CUDA >= 12.9, all under our
GGML_NATIVE=OFF default. Bumping the base image to nvidia/cuda:13.0.1 on both
amd64 and arm64 therefore compiles Turing through Blackwell with no manual
arch list: amd64 picks up Hopper / Ada / RTX 50, arm64 picks up GH200
(sm_90 PTX) and GB10 (sm_121).
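
The version gating described above can be sketched as a small shell check. This
is an illustration only: it models just the two thresholds quoted in this
message (12.8 and 12.9), not the rest of ggml's arch list. The helper names
version_ge and blackwell_arches are hypothetical, and the comparison assumes
GNU `sort -V` is available.

```shell
#!/bin/sh
# Succeeds when version $1 >= version $2 (assumes GNU `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: prints the Blackwell entries that, per the commit
# message, the vendored ggml CMake appends for a given CUDA toolkit version.
blackwell_arches() {
    cuda_version=$1
    arches=""
    if version_ge "$cuda_version" 12.8; then arches="120a-real"; fi
    if version_ge "$cuda_version" 12.9; then arches="$arches 121a-real"; fi
    printf '%s\n' "$arches"
}

blackwell_arches 12.6.2   # old base image: no Blackwell entries
blackwell_arches 13.0.1   # new base image: both Blackwell entries
```

With the old 12.6.2 base neither threshold is met, which is why the image had
to move to a newer toolkit rather than pass an arch list by hand.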

Assisted-by: Claude:claude-opus-4-8 [Claude Code]
diff --git a/.github/workflows/docker.yml b/.github/workflows/docker.yml
@@ -8,6 +8,10 @@ name: docker
 # ubuntu-24.04, arm64 on ubuntu-24.04-arm. The per-arch images are pushed by
 # digest, then a merge job assembles one multi-arch manifest per variant.
 #
+# The CUDA images use the CUDA 13 base so ggml compiles the Blackwell
+# architectures (sm_120 + sm_121); that is what makes the arm64 CUDA image run
+# on GB10 / Grace-Blackwell (DGX Spark). CUDA 12.6 tops out at sm_90.
+#
 # On pull_request the images are built but NOT pushed, so the Dockerfile stays
 # a merge gate. On push to the default branch, tags, and manual dispatch the
 # images are pushed.
@@ -53,14 +57,14 @@ jobs:
           - variant: cuda
             arch: amd64
             runner: ubuntu-24.04
-            build_base: nvidia/cuda:12.6.2-devel-ubuntu24.04
-            runtime_base: nvidia/cuda:12.6.2-runtime-ubuntu24.04
+            build_base: nvidia/cuda:13.0.1-devel-ubuntu24.04
+            runtime_base: nvidia/cuda:13.0.1-runtime-ubuntu24.04
             cmake_args: "-DPARAKEET_GGML_CUDA=ON"
           - variant: cuda
             arch: arm64
             runner: ubuntu-24.04-arm
-            build_base: nvidia/cuda:12.6.2-devel-ubuntu24.04
-            runtime_base: nvidia/cuda:12.6.2-runtime-ubuntu24.04
+            build_base: nvidia/cuda:13.0.1-devel-ubuntu24.04
+            runtime_base: nvidia/cuda:13.0.1-runtime-ubuntu24.04
             cmake_args: "-DPARAKEET_GGML_CUDA=ON"
     steps:
       - name: Checkout (with submodules)
diff --git a/Dockerfile b/Dockerfile
@@ -12,8 +12,8 @@
 #
 #   CUDA:
 #     docker build -t parakeet.cpp:cuda \
-#       --build-arg BUILD_BASE=nvidia/cuda:12.6.2-devel-ubuntu24.04 \
-#       --build-arg RUNTIME_BASE=nvidia/cuda:12.6.2-runtime-ubuntu24.04 \
+#       --build-arg BUILD_BASE=nvidia/cuda:13.0.1-devel-ubuntu24.04 \
+#       --build-arg RUNTIME_BASE=nvidia/cuda:13.0.1-runtime-ubuntu24.04 \
 #       --build-arg CMAKE_EXTRA_ARGS=-DPARAKEET_GGML_CUDA=ON .
 #
 # The build context must be a checkout with the ggml submodule populated
diff --git a/README.md b/README.md
@@ -111,6 +111,8 @@ docker run --rm --gpus all \
   transcribe --model /models/parakeet-tdt_ctc-110m-q5_k.gguf --input /audio/speech.wav --decoder tdt
 ```
 
+The CUDA image is built on CUDA 13, so it covers everything from Turing up through Blackwell, including GB10 / Grace-Blackwell (DGX Spark) on arm64.
+
 To build the image yourself, see the build args at the top of the [`Dockerfile`](Dockerfile). The CPU image is the portable `GGML_NATIVE=OFF` build, so it runs on any amd64 or arm64 host.
 
 ---
