Is CUDA 13.2 working in WSL2? #14503

@chocwaffles

Description

Windows Version

Microsoft Windows [Version 10.0.26200.8037]

WSL Version

2.7.0.0

Are you using WSL 1 or WSL 2?

  • WSL 2
  • WSL 1

Kernel Version

6.6.114.1-1

Distro Version

Ubuntu 24.04

Other Software

I want to compile llama.cpp with CUDA support. I have an RTX 5090 GPU, and nvidia-smi reports CUDA Version 13.2 and Driver Version 595.71. I have also installed the NVIDIA CUDA Toolkit (Cuda compilation tools, release 13.2, V13.2.51).

llama.cpp compiles without errors, but it does not appear to load the model into the GPU's memory, and as a result my large language model is extremely slow: about 0.8 tokens per second. I previously raised this in a llama.cpp issue, ggml-org/llama.cpp#20631 (comment).
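
One way to check whether the model actually reaches the GPU is to compare the memory usage that nvidia-smi reports before and after llama.cpp loads it. The following is a minimal Python sketch of that check, not part of the original report: the helper names, the 50% threshold, and the example model size are hypothetical, and it assumes that `nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits` prints one `used, total` pair in MiB per GPU.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' lines (MiB) from nvidia-smi's CSV query output.

    Raises ValueError if a field is not an integer (for example "[N/A]").
    """
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return a list of (used_mib, total_mib), one per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def model_probably_on_gpu(before_mib, after_mib, model_size_mib, fraction=0.5):
    """Heuristic (assumption): VRAM use grew by at least `fraction` of the model size."""
    return (after_mib - before_mib) >= fraction * model_size_mib


if __name__ == "__main__":
    # Idle reading taken from the nvidia-smi output in this report.
    idle_used, total = parse_gpu_memory("1227, 32607")[0]
    # Hypothetical 18000 MiB model; a second reading of 1300 MiB would mean
    # almost nothing was placed in VRAM.
    print(model_probably_on_gpu(idle_used, 1300, 18000))
```

In practice, `query_gpu_memory()` would be called once before starting llama.cpp and once after the model has finished loading. If the second reading stays close to the idle value, the weights are most likely being held in host memory.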

They suggested that this is likely a WSL issue. Is this something you can help with? I'm on the latest GPU with the Blackwell architecture, and I need this build to work under WSL2.
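
To narrow down whether the WSL side is involved, it may help to confirm that the GPU plumbing WSL2 normally exposes is present inside the distro. This is a rough Python sketch under stated assumptions: the paths listed (`/dev/dxg` and the driver files under `/usr/lib/wsl/lib`) are where I understand WSL2 places its GPU device and NVIDIA user-mode libraries, and the function names are hypothetical.

```python
import os

# Assumed locations of the WSL2 GPU device node and NVIDIA driver files.
WSL_GPU_PATHS = (
    "dev/dxg",
    "usr/lib/wsl/lib/libcuda.so.1",
    "usr/lib/wsl/lib/nvidia-smi",
)


def check_wsl_gpu_paths(root="/", exists=os.path.exists):
    """Return {relative_path: bool} telling which expected paths exist under root.

    `exists` is injectable so the check can be exercised without a WSL system.
    """
    return {rel: exists(os.path.join(root, rel)) for rel in WSL_GPU_PATHS}


def missing_paths(report):
    """Return the sorted list of paths that were not found."""
    return sorted(rel for rel, present in report.items() if not present)


if __name__ == "__main__":
    report = check_wsl_gpu_paths()
    for rel, present in report.items():
        print(f"/{rel}: {'found' if present else 'MISSING'}")
```

A missing entry would point toward the Windows driver or WSL installation rather than the llama.cpp build. All entries being present does not rule out a WSL problem; it only excludes the most basic one.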

I appreciate your thoughts and help.

Thank you.

Repro Steps

(base) user@JT-Ryzen-5090:~$ nvidia-smi
Sun Mar 22 16:05:54 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 595.45.03              Driver Version: 595.71         CUDA Version: 13.2     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 5090        On  |   00000000:01:00.0  On |                  N/A |
|  0%   36C    P8             23W /  575W |    1227MiB /  32607MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
(base) user@JT-Ryzen-5090:~$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2026 NVIDIA Corporation
Built on Mon_Mar_02_09:52:23_PM_PST_2026
Cuda compilation tools, release 13.2, V13.2.51
Build cuda_13.2.r13.2/compiler.37434383_0
(base) user@JT-Ryzen-5090:~$ cd repos/llama.cpp/
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ rm build -rf
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ git pull
Already up to date.
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ cmake . -B build \
    -DBUILD_SHARED_LIBS=OFF \
    -DGGML_CUDA=ON \
    -DCMAKE_BUILD_TYPE=Release
-- The C compiler identification is GNU 13.3.0
-- The CXX compiler identification is GNU 13.3.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMAKE_BUILD_TYPE=Release
-- Found Git: /usr/bin/git (found version "2.43.0")
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/cc
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Warning: ccache not found - consider installing it for faster compilation or disable this warning with GGML_CCACHE=OFF
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- GGML_SYSTEM_ARCH: x86
-- Including CPU backend
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Found CUDAToolkit: /usr/local/cuda-13.2/targets/x86_64-linux/include (found version "13.2.51")
-- CUDA Toolkit found
-- The CUDA compiler identification is NVIDIA 13.2.51
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-13.2/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Replacing 120-real in CMAKE_CUDA_ARCHITECTURES_NATIVE with 120a-real
-- Using CMAKE_CUDA_ARCHITECTURES=120a-real CMAKE_CUDA_ARCHITECTURES_NATIVE=120a-real
-- CUDA host compiler is GNU 13.3.0
-- Including CUDA backend
-- ggml version: 0.9.8
-- ggml commit: 3306dbaef
-- Found OpenSSL: /usr/lib/x86_64-linux-gnu/libcrypto.so (found version "3.0.13")
-- Performing Test OPENSSL_VERSION_SUPPORTED
-- Performing Test OPENSSL_VERSION_SUPPORTED - Success
-- OpenSSL found: 3.0.13
-- Generating embedded license file for target: common
-- Configuring done (8.7s)
-- Generating done (0.1s)
-- Build files have been written to: /home/user/repos/llama.cpp/build
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ cmake --build build --config Release -j$(nproc)
[ 1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[ 1%] Building CXX object vendor/cpp-httplib/CMakeFiles/cpp-httplib.dir/httplib.cpp.o
[ 1%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[ 1%] Building C object examples/gguf-hash/CMakeFiles/sha256.dir/deps/sha256/sha256.c.o
[ 1%] Building CXX object tools/mtmd/CMakeFiles/llama-llava-cli.dir/deprecation-warning.cpp.o
[ 2%] Building C object examples/gguf-hash/CMakeFiles/sha1.dir/deps/sha1/sha1.c.o
[ 2%] Building CXX object tools/mtmd/CMakeFiles/llama-minicpmv-cli.dir/deprecation-warning.cpp.o
[ 2%] Building CXX object tools/mtmd/CMakeFiles/llama-gemma3-cli.dir/deprecation-warning.cpp.o
[ 3%] Building CXX object tools/mtmd/CMakeFiles/llama-qwen2vl-cli.dir/deprecation-warning.cpp.o
[ 4%] Building C object examples/gguf-hash/CMakeFiles/xxhash.dir/deps/xxhash/xxhash.c.o
[ 4%] Built target build_info
[ 4%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[ 4%] Built target sha1
[ 4%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[ 4%] Built target sha256
[ 4%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[ 4%] Linking CXX executable ../../bin/llama-llava-cli
[ 5%] Linking CXX executable ../../bin/llama-gemma3-cli
[ 5%] Linking CXX executable ../../bin/llama-minicpmv-cli
[ 5%] Linking CXX executable ../../bin/llama-qwen2vl-cli
[ 6%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
[ 6%] Built target llama-llava-cli
[ 6%] Built target llama-minicpmv-cli
[ 6%] Built target llama-qwen2vl-cli
[ 6%] Built target llama-gemma3-cli
[ 6%] Built target xxhash
[ 6%] Linking CXX static library libggml-base.a
[ 6%] Built target ggml-base
[ 6%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
[ 7%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/hbm.cpp.o
...
[ 71%] Built target test-log
[ 71%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/simple-tokenize.cpp.o
[ 71%] Built target test-llama-archs
[ 72%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-basic.cpp.o
[ 72%] Building CXX object tests/CMakeFiles/test-regex-partial.dir/test-regex-partial.cpp.o
[ 73%] Linking CXX executable ../bin/test-json-partial
[ 73%] Building CXX object tests/CMakeFiles/test-regex-partial.dir/get-model.cpp.o
[ 73%] Linking CXX executable ../bin/test-regex-partial
[ 73%] Linking CXX executable ../bin/test-quantize-stats
[ 73%] Built target test-json-partial
[ 73%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-gbnf-generation.cpp.o
[ 73%] Built target test-regex-partial
[ 73%] Linking CXX executable ../bin/test-grammar-integration
[ 73%] Building CXX object tests/CMakeFiles/test-thread-safety.dir/test-thread-safety.cpp.o
[ 73%] Built target test-quantize-stats
[ 73%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-json-parser.cpp.o
[ 73%] Built target test-grammar-integration
[ 73%] Building CXX object tests/CMakeFiles/test-thread-safety.dir/get-model.cpp.o
[ 73%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-json-serialization.cpp.o
[ 73%] Linking CXX executable ../bin/test-thread-safety
[ 73%] Built target test-thread-safety
[ 73%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-python-dict-parser.cpp.o
[ 73%] Linking CXX executable ../bin/test-json-schema-to-grammar
[ 73%] Built target test-json-schema-to-grammar
[ 74%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/peg-parser/test-unicode.cpp.o
[ 74%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/test-arg-parser.cpp.o
[ 74%] Linking CXX executable ../bin/test-chat-template
[ 74%] Built target test-chat-template
[ 75%] Building CXX object tests/CMakeFiles/test-arg-parser.dir/get-model.cpp.o
[ 75%] Building CXX object tests/CMakeFiles/test-peg-parser.dir/get-model.cpp.o
[ 75%] Building CXX object tests/CMakeFiles/test-opt.dir/test-opt.cpp.o
[ 75%] Linking CXX executable ../bin/test-arg-parser
[ 75%] Building CXX object tests/CMakeFiles/test-gguf.dir/test-gguf.cpp.o
[ 75%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/test-backend-ops.cpp.o
[ 75%] Built target test-arg-parser
[ 75%] Building CXX object tests/CMakeFiles/test-opt.dir/get-model.cpp.o
[ 76%] Building CXX object tests/CMakeFiles/test-backend-ops.dir/get-model.cpp.o
[ 76%] Building CXX object tests/CMakeFiles/test-gguf.dir/get-model.cpp.o
[ 76%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/test-model-load-cancel.cpp.o
[ 76%] Building CXX object tests/CMakeFiles/test-model-load-cancel.dir/get-model.cpp.o
[ 76%] Linking CXX executable ../bin/test-model-load-cancel
[ 76%] Built target test-model-load-cancel
[ 76%] Building CXX object tests/CMakeFiles/test-autorelease.dir/test-autorelease.cpp.o
[ 76%] Linking CXX executable ../bin/test-opt
[ 76%] Building CXX object tests/CMakeFiles/test-autorelease.dir/get-model.cpp.o
[ 76%] Linking CXX executable ../bin/test-autorelease
[ 76%] Linking CXX executable ../bin/test-gguf
[ 76%] Built target test-opt
[ 76%] Building CXX object tests/CMakeFiles/test-backend-sampler.dir/test-backend-sampler.cpp.o
[ 76%] Built target test-autorelease
[ 76%] Building CXX object tests/CMakeFiles/test-state-restore-fragmented.dir/test-state-restore-fragmented.cpp.o
[ 76%] Built target test-gguf
[ 77%] Building CXX object tests/CMakeFiles/test-state-restore-fragmented.dir/get-model.cpp.o
[ 77%] Building CXX object tests/CMakeFiles/test-barrier.dir/test-barrier.cpp.o
[ 78%] Building CXX object tests/CMakeFiles/test-barrier.dir/get-model.cpp.o
[ 78%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/test-quantize-fns.cpp.o
[ 78%] Linking CXX executable ../bin/test-barrier
[ 78%] Built target test-barrier
[ 78%] Building CXX object tests/CMakeFiles/test-backend-sampler.dir/get-model.cpp.o
[ 78%] Building CXX object tests/CMakeFiles/test-quantize-fns.dir/get-model.cpp.o
[ 79%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/test-quantize-perf.cpp.o
[ 79%] Linking CXX executable ../bin/test-quantize-fns
[ 79%] Building CXX object tests/CMakeFiles/test-rope.dir/test-rope.cpp.o
[ 79%] Built target test-quantize-fns
[ 80%] Building CXX object tests/CMakeFiles/test-rope.dir/get-model.cpp.o
[ 80%] Building CXX object tests/CMakeFiles/test-quantize-perf.dir/get-model.cpp.o
[ 80%] Building C object tests/CMakeFiles/test-mtmd-c-api.dir/test-mtmd-c-api.c.o
[ 80%] Building CXX object tests/CMakeFiles/gguf-model-data.dir/gguf-model-data.cpp.o
[ 80%] Building CXX object tests/CMakeFiles/test-mtmd-c-api.dir/get-model.cpp.o
[ 81%] Linking CXX executable ../bin/test-mtmd-c-api
[ 81%] Linking CXX executable ../bin/test-rope
[ 81%] Linking CXX executable ../bin/test-state-restore-fragmented
[ 81%] Built target test-rope
[ 81%] Building CXX object tests/CMakeFiles/test-alloc.dir/test-alloc.cpp.o
[ 81%] Built target test-mtmd-c-api
[ 81%] Building CXX object tests/CMakeFiles/test-alloc.dir/get-model.cpp.o
[ 82%] Building CXX object tests/CMakeFiles/export-graph-ops.dir/export-graph-ops.cpp.o
[ 82%] Built target test-state-restore-fragmented
[ 82%] Building CXX object examples/batched/CMakeFiles/llama-batched.dir/batched.cpp.o
[ 82%] Linking CXX executable ../bin/test-peg-parser
[ 82%] Linking CXX executable ../bin/test-quantize-perf
[ 82%] Linking CXX executable ../bin/test-backend-sampler
[ 82%] Linking CXX executable ../bin/test-alloc
[ 82%] Built target test-quantize-perf
[ 82%] Building CXX object examples/debug/CMakeFiles/llama-debug.dir/debug.cpp.o
[ 82%] Built target test-alloc
[ 82%] Building CXX object examples/embedding/CMakeFiles/llama-embedding.dir/embedding.cpp.o
[ 82%] Built target test-peg-parser
[ 82%] Building CXX object examples/eval-callback/CMakeFiles/llama-eval-callback.dir/eval-callback.cpp.o
[ 82%] Built target test-backend-sampler
[ 82%] Building CXX object examples/idle/CMakeFiles/llama-idle.dir/idle.cpp.o
[ 82%] Linking CXX executable ../../bin/llama-batched
[ 82%] Linking CXX executable ../bin/export-graph-ops
[ 83%] Linking CXX executable ../bin/test-chat
[ 83%] Built target llama-batched
[ 83%] Building CXX object examples/lookahead/CMakeFiles/llama-lookahead.dir/lookahead.cpp.o
[ 83%] Linking CXX executable ../../bin/llama-idle
[ 83%] Linking CXX executable ../../bin/llama-embedding
[ 83%] Built target export-graph-ops
[ 83%] Linking CXX executable ../../bin/llama-eval-callback
[ 83%] Building CXX object examples/lookup/CMakeFiles/llama-lookup.dir/lookup.cpp.o
[ 83%] Built target test-chat
[ 83%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-create.dir/lookup-create.cpp.o
[ 83%] Built target llama-embedding
[ 83%] Built target llama-idle
[ 83%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-merge.dir/lookup-merge.cpp.o
[ 83%] Built target llama-eval-callback
[ 84%] Building CXX object examples/lookup/CMakeFiles/llama-lookup-stats.dir/lookup-stats.cpp.o
[ 84%] Building CXX object examples/parallel/CMakeFiles/llama-parallel.dir/parallel.cpp.o
[ 84%] Linking CXX executable ../bin/test-chat-auto-parser
[ 84%] Linking CXX executable ../bin/test-chat-peg-parser
[ 84%] Linking CXX executable ../../bin/llama-lookup-merge
[ 84%] Linking CXX executable ../../bin/llama-lookup-create
[ 84%] Built target test-chat-auto-parser
[ 84%] Building CXX object examples/passkey/CMakeFiles/llama-passkey.dir/passkey.cpp.o
[ 84%] Linking CXX executable ../../bin/llama-lookahead
[ 84%] Built target test-chat-peg-parser
[ 84%] Building CXX object examples/retrieval/CMakeFiles/llama-retrieval.dir/retrieval.cpp.o
[ 85%] Linking CXX executable ../../bin/llama-lookup
[ 85%] Linking CXX executable ../bin/test-jinja
[ 85%] Built target llama-lookup-merge
[ 86%] Building CXX object examples/save-load-state/CMakeFiles/llama-save-load-state.dir/save-load-state.cpp.o
[ 86%] Linking CXX executable ../../bin/llama-lookup-stats
[ 86%] Built target llama-lookup-create
[ 86%] Built target llama-lookahead
[ 86%] Building CXX object examples/speculative-simple/CMakeFiles/llama-speculative-simple.dir/speculative-simple.cpp.o
[ 87%] Building CXX object examples/speculative/CMakeFiles/llama-speculative.dir/speculative.cpp.o
[ 87%] Built target llama-lookup
[ 87%] Building CXX object examples/gen-docs/CMakeFiles/llama-gen-docs.dir/gen-docs.cpp.o
[ 87%] Linking CXX executable ../../bin/llama-parallel
[ 87%] Built target test-jinja
[ 87%] Building CXX object examples/training/CMakeFiles/llama-finetune.dir/finetune.cpp.o
[ 87%] Built target llama-lookup-stats
[ 87%] Building CXX object examples/diffusion/CMakeFiles/llama-diffusion-cli.dir/diffusion-cli.cpp.o
[ 87%] Linking CXX static library libgguf-model-data.a
[ 87%] Built target gguf-model-data
[ 87%] Linking CXX executable ../../bin/llama-passkey
[ 87%] Building CXX object examples/convert-llama2c-to-ggml/CMakeFiles/llama-convert-llama2c-to-ggml.dir/convert-llama2c-to-ggml.cpp.o
[ 87%] Built target llama-parallel
[ 87%] Building CXX object pocs/vdot/CMakeFiles/llama-vdot.dir/vdot.cpp.o
[ 87%] Linking CXX executable ../../bin/llama-debug
[ 87%] Linking CXX executable ../../bin/llama-save-load-state
[ 87%] Built target llama-passkey
[ 87%] Building CXX object pocs/vdot/CMakeFiles/llama-q8dot.dir/q8dot.cpp.o
[ 87%] Linking CXX executable ../../bin/llama-retrieval
[ 88%] Linking CXX executable ../../bin/llama-vdot
[ 88%] Linking CXX executable ../../bin/llama-finetune
[ 88%] Built target llama-vdot
[ 89%] Building CXX object tools/batched-bench/CMakeFiles/llama-batched-bench.dir/batched-bench.cpp.o
[ 89%] Built target llama-debug
[ 89%] Linking CXX executable ../../bin/llama-gen-docs
[ 89%] Building CXX object tools/gguf-split/CMakeFiles/llama-gguf-split.dir/gguf-split.cpp.o
[ 89%] Built target llama-save-load-state
[ 89%] Building CXX object tools/imatrix/CMakeFiles/llama-imatrix.dir/imatrix.cpp.o
[ 89%] Built target llama-retrieval
[ 89%] Linking CXX executable ../../bin/llama-q8dot
[ 89%] Building CXX object tools/llama-bench/CMakeFiles/llama-bench.dir/llama-bench.cpp.o
[ 89%] Built target llama-finetune
[ 89%] Built target llama-q8dot
[ 90%] Building CXX object tools/completion/CMakeFiles/llama-completion.dir/completion.cpp.o
[ 91%] Building CXX object tools/perplexity/CMakeFiles/llama-perplexity.dir/perplexity.cpp.o
[ 91%] Linking CXX executable ../../bin/llama-speculative-simple
[ 91%] Built target llama-gen-docs
[ 91%] Linking CXX executable ../../bin/llama-convert-llama2c-to-ggml
[ 91%] Building CXX object tools/quantize/CMakeFiles/llama-quantize.dir/quantize.cpp.o
[ 91%] Linking CXX executable ../../bin/llama-gguf-split
[ 91%] Built target llama-speculative-simple
[ 92%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-task.cpp.o
[ 92%] Built target llama-convert-llama2c-to-ggml
[ 92%] Building CXX object tools/tokenize/CMakeFiles/llama-tokenize.dir/tokenize.cpp.o
[ 92%] Linking CXX executable ../../bin/llama-speculative
[ 92%] Linking CXX executable ../../bin/llama-batched-bench
[ 92%] Built target llama-gguf-split
[ 92%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-queue.cpp.o
[ 92%] Built target llama-speculative
[ 92%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-common.cpp.o
[ 92%] Linking CXX executable ../../bin/llama-tokenize
[ 92%] Built target llama-batched-bench
[ 92%] Building CXX object tools/parser/CMakeFiles/llama-debug-template-parser.dir/debug-template-parser.cpp.o
[ 92%] Built target llama-tokenize
[ 92%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-context.cpp.o
[ 92%] Linking CXX executable ../../bin/llama-quantize
[ 92%] Built target llama-quantize
[ 92%] Building CXX object tools/parser/CMakeFiles/llama-template-analysis.dir/template-analysis.cpp.o
[ 93%] Linking CXX executable ../../bin/llama-diffusion-cli
[ 93%] Built target llama-diffusion-cli
[ 93%] Building CXX object tools/tts/CMakeFiles/llama-tts.dir/tts.cpp.o
[ 93%] Linking CXX executable ../../bin/llama-perplexity
[ 93%] Linking CXX executable ../../bin/llama-completion
[ 93%] Built target llama-perplexity
[ 93%] Building CXX object tools/mtmd/CMakeFiles/llama-mtmd-cli.dir/mtmd-cli.cpp.o
[ 93%] Built target llama-completion
[ 94%] Building CXX object tools/mtmd/CMakeFiles/llama-mtmd-debug.dir/debug/mtmd-debug.cpp.o
[ 94%] Building CXX object tools/cvector-generator/CMakeFiles/llama-cvector-generator.dir/cvector-generator.cpp.o
[ 94%] Linking CXX executable ../../bin/llama-mtmd-debug
[ 94%] Linking CXX executable ../../bin/llama-debug-template-parser
[ 95%] Linking CXX executable ../../bin/llama-imatrix
[ 95%] Built target llama-debug-template-parser
[ 95%] Building CXX object tools/export-lora/CMakeFiles/llama-export-lora.dir/export-lora.cpp.o
[ 95%] Built target llama-mtmd-debug
[ 95%] Building CXX object tools/fit-params/CMakeFiles/llama-fit-params.dir/fit-params.cpp.o
[ 95%] Built target llama-imatrix
[ 95%] Building CXX object tools/results/CMakeFiles/llama-results.dir/results.cpp.o
[ 96%] Linking CXX executable ../../bin/llama-cvector-generator
[ 96%] Linking CXX executable ../../bin/llama-fit-params
[ 97%] Linking CXX executable ../../bin/llama-template-analysis
[ 97%] Linking CXX executable ../../bin/llama-bench
[ 97%] Built target llama-cvector-generator
[ 97%] Building CXX object tests/CMakeFiles/test-gguf-model-data.dir/test-gguf-model-data.cpp.o
[ 97%] Linking CXX executable ../../bin/llama-results
[ 97%] Linking CXX executable ../../bin/llama-mtmd-cli
[ 97%] Built target llama-fit-params
[ 98%] Linking CXX executable ../bin/test-gguf-model-data
[ 98%] Built target llama-template-analysis
[ 98%] Built target llama-bench
[ 99%] Linking CXX executable ../../bin/llama-export-lora
[ 99%] Built target llama-results
[ 99%] Linking CXX executable ../bin/test-backend-ops
[ 99%] Built target llama-mtmd-cli
[ 99%] Built target test-gguf-model-data
[ 99%] Built target test-backend-ops
[ 99%] Built target llama-export-lora
[ 99%] Linking CXX executable ../../bin/llama-tts
[ 99%] Built target llama-tts
[ 99%] Linking CXX static library libserver-context.a
[ 99%] Built target server-context
[ 99%] Generating loading.html.hpp
[ 99%] Generating index.html.gz.hpp
[ 99%] Building CXX object tools/cli/CMakeFiles/llama-cli.dir/cli.cpp.o
[ 99%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server.cpp.o
[ 99%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server-http.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server-models.cpp.o
[100%] Linking CXX executable ../../bin/llama-cli
[100%] Built target llama-cli
[100%] Linking CXX executable ../../bin/llama-server
[100%] Built target llama-server
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ cmake --build ./build --config Release -j --clean-first --target llama-cli llama-mtmd-cli llama-server llama-gguf-split
[ 0%] Building CXX object common/CMakeFiles/build_info.dir/build-info.cpp.o
[ 0%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml.cpp.o
[ 0%] Building CXX object vendor/cpp-httplib/CMakeFiles/cpp-httplib.dir/httplib.cpp.o
[ 1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[ 3%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[ 3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[ 3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
[ 3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[ 3%] Built target build_info
[ 3%] Linking CXX static library libggml-base.a
[ 3%] Built target ggml-base
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/hbm.cpp.o
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/repack.cpp.o
[ 5%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.c.o
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ggml-cpu.cpp.o
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/traits.cpp.o
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/amx.cpp.o
[ 5%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/amx/mmq.cpp.o
[ 5%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/quants.c.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/unary-ops.cpp.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/binary-ops.cpp.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/acc.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/add-id.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argmax.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/arange.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/argsort.cu.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/vec.cpp.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/llamafile/sgemm.cpp.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/repack.cpp.o
[ 6%] Building C object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/arch/x86/quants.c.o
[ 6%] Building CXX object ggml/src/CMakeFiles/ggml-cpu.dir/ggml-cpu/ops.cpp.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/clamp.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/concat.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv-transpose-1d.cu.o
[ 6%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-transpose.cu.o
[ 8%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/binbcast.cu.o
[ 8%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d-dw.cu.o
[ 10%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/convert.cu.o
[ 10%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/conv2d.cu.o
[ 10%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cross-entropy-loss.cu.o
[ 10%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diag.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cumsum.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/count-equal.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/diagmask.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/cpy.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-tile.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn-wmma-f16.cu.o
[ 11%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fattn.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/fill.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gated_delta_net.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ggml-cuda.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/gla.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/im2col.cu.o
[ 13%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/getrows.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mean.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmf.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvf.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmvq.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmid.cu.o
[ 15%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/mmq.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/norm.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-adamw.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/opt-step-sgd.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad_reflect_1d.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/rope.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/roll.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/quantize.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/out-prod.cu.o
[ 16%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pad.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/pool2d.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/scale.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set-rows.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softmax.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/solve_tri.cu.o
[ 18%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/softcap.cu.o
[ 20%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/set.cu.o
[ 20%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-conv.cu.o
[ 21%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/ssm-scan.cu.o
[ 21%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sum.cu.o
[ 21%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/top-k.cu.o
[ 21%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/sumrows.cu.o
[ 21%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/topk-moe.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tsembd.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/tri.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/unary.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/upscale.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq112-dv112.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/wkv.cu.o
[ 23%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq128-dv128.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq256-dv256.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq40-dv40.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq576-dv512.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq80-dv80.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq72-dv72.cu.o
[ 25%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq64-dv64.cu.o
[ 26%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-tile-instance-dkq96-dv96.cu.o
[ 26%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_16.cu.o
[ 26%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_32.cu.o
[ 26%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_1-ncols2_8.cu.o
[ 26%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_2.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_4.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_16-ncols2_1.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_16.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_8.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_4.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_1.cu.o
[ 28%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_2-ncols2_32.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_32-ncols2_2.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_4.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_2.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_16.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_4-ncols2_8.cu.o
[ 30%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_64-ncols2_1.cu.o
[ 31%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_2.cu.o
[ 31%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_1.cu.o
[ 31%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_8.cu.o
[ 31%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-mma-f16-instance-ncols1_8-ncols2_4.cu.o
[ 31%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq1_s.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_s.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xs.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq2_xxs.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_s.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq3_xxs.cu.o
[ 33%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_nl.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-iq4_xs.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-mxfp4.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q2_k.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_0.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q3_k.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_1.cu.o
[ 35%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_0.cu.o
[ 36%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q4_k.cu.o
[ 36%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_1.cu.o
[ 36%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q5_k.cu.o
[ 36%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q6_k.cu.o
[ 36%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmq-instance-q8_0.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_1.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_10.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_11.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_12.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_13.cu.o
[ 38%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_14.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_15.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_16.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_2.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_3.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_4.cu.o
[ 40%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_5.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_6.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_7.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_8.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/mmf-instance-ncols_9.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-instance-q4_0-q4_0.cu.o
[ 41%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-instance-q8_0-q8_0.cu.o
[ 43%] Building CUDA object ggml/src/ggml-cuda/CMakeFiles/ggml-cuda.dir/template-instances/fattn-vec-instance-f16-f16.cu.o
[ 45%] Linking CXX static library libggml-cpu.a
[ 45%] Built target ggml-cpu
[ 45%] Linking CXX static library libcpp-httplib.a
[ 45%] Built target cpp-httplib
[ 45%] Linking CXX static library libggml-cuda.a
[ 45%] Built target ggml-cuda
[ 45%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend-dl.cpp.o
[ 45%] Building CXX object ggml/src/CMakeFiles/ggml.dir/ggml-backend-reg.cpp.o
[ 45%] Linking CXX static library libggml.a
[ 45%] Built target ggml
[ 45%] Building CXX object src/CMakeFiles/llama.dir/llama.cpp.o
[ 45%] Building CXX object src/CMakeFiles/llama.dir/llama-batch.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-arch.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-adapter.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-chat.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-context.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-cparams.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-graph.cpp.o
[ 46%] Building CXX object src/CMakeFiles/llama.dir/llama-hparams.cpp.o
[ 48%] Building CXX object src/CMakeFiles/llama.dir/llama-io.cpp.o
[ 48%] Building CXX object src/CMakeFiles/llama.dir/llama-grammar.cpp.o
[ 48%] Building CXX object src/CMakeFiles/llama.dir/llama-impl.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-kv-cache-iswa.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-kv-cache.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-memory.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-memory-hybrid.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-memory-recurrent.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-memory-hybrid-iswa.cpp.o
[ 50%] Building CXX object src/CMakeFiles/llama.dir/llama-mmap.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-model-loader.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-model.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-model-saver.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-sampler.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-quant.cpp.o
[ 51%] Building CXX object src/CMakeFiles/llama.dir/llama-vocab.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/unicode-data.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/unicode.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/models/afmoe.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/models/apertus.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/models/arcee.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/models/arctic.cpp.o
[ 53%] Building CXX object src/CMakeFiles/llama.dir/models/bailingmoe.cpp.o
[ 55%] Building CXX object src/CMakeFiles/llama.dir/models/arwkv7.cpp.o
[ 55%] Building CXX object src/CMakeFiles/llama.dir/models/baichuan.cpp.o
[ 55%] Building CXX object src/CMakeFiles/llama.dir/models/bailingmoe2.cpp.o
[ 55%] Building CXX object src/CMakeFiles/llama.dir/models/bert.cpp.o
[ 56%] Building CXX object src/CMakeFiles/llama.dir/models/bloom.cpp.o
[ 56%] Building CXX object src/CMakeFiles/llama.dir/models/bitnet.cpp.o
[ 56%] Building CXX object src/CMakeFiles/llama.dir/models/chatglm.cpp.o
[ 56%] Building CXX object src/CMakeFiles/llama.dir/models/chameleon.cpp.o
[ 56%] Building CXX object src/CMakeFiles/llama.dir/models/codeshell.cpp.o
[ 58%] Building CXX object src/CMakeFiles/llama.dir/models/cohere2-iswa.cpp.o
[ 58%] Building CXX object src/CMakeFiles/llama.dir/models/command-r.cpp.o
[ 58%] Building CXX object src/CMakeFiles/llama.dir/models/cogvlm.cpp.o
[ 58%] Building CXX object src/CMakeFiles/llama.dir/models/dbrx.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/delta-net-base.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/deci.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/dots1.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/dream.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/deepseek.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/deepseek2.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/ernie4-5-moe.cpp.o
[ 60%] Building CXX object src/CMakeFiles/llama.dir/models/ernie4-5.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/exaone-moe.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/exaone.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/falcon-h1.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/eurobert.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/falcon.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/exaone4.cpp.o
[ 61%] Building CXX object src/CMakeFiles/llama.dir/models/gemma-embedding.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/gemma.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/gemma2-iswa.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/gemma3.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/gemma3n-iswa.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/glm4.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/glm4-moe.cpp.o
[ 63%] Building CXX object src/CMakeFiles/llama.dir/models/gptneox.cpp.o
[ 65%] Building CXX object src/CMakeFiles/llama.dir/models/gpt2.cpp.o
[ 65%] Building CXX object src/CMakeFiles/llama.dir/models/granite-hybrid.cpp.o
[ 65%] Building CXX object src/CMakeFiles/llama.dir/models/granite.cpp.o
[ 65%] Building CXX object src/CMakeFiles/llama.dir/models/grovemoe.cpp.o
[ 65%] Building CXX object src/CMakeFiles/llama.dir/models/grok.cpp.o
[ 66%] Building CXX object src/CMakeFiles/llama.dir/models/hunyuan-dense.cpp.o
[ 66%] Building CXX object src/CMakeFiles/llama.dir/models/hunyuan-moe.cpp.o
[ 66%] Building CXX object src/CMakeFiles/llama.dir/models/internlm2.cpp.o
[ 66%] Building CXX object src/CMakeFiles/llama.dir/models/jais.cpp.o
[ 66%] Building CXX object src/CMakeFiles/llama.dir/models/jais2.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/jamba.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/lfm2.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/kimi-linear.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/llada.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/llada-moe.cpp.o
[ 68%] Building CXX object src/CMakeFiles/llama.dir/models/llama-iswa.cpp.o
[ 70%] Building CXX object src/CMakeFiles/llama.dir/models/llama.cpp.o
[ 70%] Building CXX object src/CMakeFiles/llama.dir/models/mamba-base.cpp.o
[ 70%] Building CXX object src/CMakeFiles/llama.dir/models/maincoder.cpp.o
[ 70%] Building CXX object src/CMakeFiles/llama.dir/models/mamba.cpp.o
[ 70%] Building CXX object src/CMakeFiles/llama.dir/models/mimo2-iswa.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/minimax-m2.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/minicpm3.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/modern-bert.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/mistral3.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/mpt.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/nemotron-h.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/olmo.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/nemotron.cpp.o
[ 71%] Building CXX object src/CMakeFiles/llama.dir/models/olmoe.cpp.o
[ 73%] Building CXX object src/CMakeFiles/llama.dir/models/neo-bert.cpp.o
[ 73%] Building CXX object src/CMakeFiles/llama.dir/models/olmo2.cpp.o
[ 73%] Building CXX object src/CMakeFiles/llama.dir/models/openelm.cpp.o
[ 73%] Building CXX object src/CMakeFiles/llama.dir/models/openai-moe-iswa.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/orion.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/paddleocr.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/phi2.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/pangu-embedded.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/phi3.cpp.o
[ 75%] Building CXX object src/CMakeFiles/llama.dir/models/plamo.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/plamo2.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/plamo3.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/plm.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/qwen.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/qwen2.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/qwen2moe.cpp.o
[ 76%] Building CXX object src/CMakeFiles/llama.dir/models/qwen3.cpp.o
[ 78%] Building CXX object src/CMakeFiles/llama.dir/models/qwen2vl.cpp.o
[ 78%] Building CXX object src/CMakeFiles/llama.dir/models/qwen35.cpp.o
[ 78%] Building CXX object src/CMakeFiles/llama.dir/models/qwen35moe.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/qwen3next.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/qwen3moe.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/qwen3vl-moe.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/qwen3vl.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/refact.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/rnd1.cpp.o
[ 80%] Building CXX object src/CMakeFiles/llama.dir/models/rwkv6-base.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/rwkv6.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/rwkv6qwen2.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/rwkv7-base.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/rwkv7.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/seed-oss.cpp.o
[ 81%] Building CXX object src/CMakeFiles/llama.dir/models/smallthinker.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/smollm3.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/stablelm.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/starcoder.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/starcoder2.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/step35-iswa.cpp.o
[ 83%] Building CXX object src/CMakeFiles/llama.dir/models/t5-dec.cpp.o
[ 85%] Building CXX object src/CMakeFiles/llama.dir/models/t5-enc.cpp.o
[ 85%] Building CXX object src/CMakeFiles/llama.dir/models/wavtokenizer-dec.cpp.o
[ 85%] Building CXX object src/CMakeFiles/llama.dir/models/xverse.cpp.o
[ 85%] Linking CXX static library libllama.a
[ 85%] Built target llama
[ 85%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/mtmd.cpp.o
[ 85%] Building CXX object common/CMakeFiles/common.dir/arg.cpp.o
[ 85%] Building CXX object common/CMakeFiles/common.dir/chat-diff-analyzer.cpp.o
[ 85%] Building CXX object common/CMakeFiles/common.dir/chat-auto-parser-helpers.cpp.o
[ 86%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/conformer.cpp.o
[ 86%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/cogvlm.cpp.o
[ 86%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/mtmd-helper.cpp.o
[ 86%] Building CXX object common/CMakeFiles/common.dir/chat-auto-parser-generator.cpp.o
[ 86%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/clip.cpp.o
[ 86%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/mtmd-audio.cpp.o
[ 86%] Building CXX object common/CMakeFiles/common.dir/chat.cpp.o
[ 88%] Building CXX object common/CMakeFiles/common.dir/chat-peg-parser.cpp.o
[ 88%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/glm4v.cpp.o
[ 88%] Building CXX object common/CMakeFiles/common.dir/common.cpp.o
[ 88%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/internvl.cpp.o
[ 88%] Building CXX object common/CMakeFiles/common.dir/console.cpp.o
[ 88%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/kimivl.cpp.o
[ 88%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/kimik25.cpp.o
[ 88%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/nemotron-v2-vl.cpp.o
[ 88%] Building CXX object common/CMakeFiles/common.dir/download.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/json-partial.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/json-schema-to-grammar.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/log.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/debug.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/llguidance.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/ngram-cache.cpp.o
[ 90%] Building CXX object common/CMakeFiles/common.dir/ngram-map.cpp.o
[ 90%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/minicpmv.cpp.o
[ 90%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/paddleocr.cpp.o
[ 90%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/llava.cpp.o
[ 91%] Building CXX object common/CMakeFiles/common.dir/ngram-mod.cpp.o
[ 93%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/llama4.cpp.o
[ 93%] Building CXX object common/CMakeFiles/common.dir/preset.cpp.o
[ 93%] Building CXX object common/CMakeFiles/common.dir/reasoning-budget.cpp.o
[ 93%] Building CXX object common/CMakeFiles/common.dir/regex-partial.cpp.o
[ 93%] Building CXX object common/CMakeFiles/common.dir/peg-parser.cpp.o
[ 93%] Building CXX object common/CMakeFiles/common.dir/sampling.cpp.o
[ 93%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/qwen2vl.cpp.o
[ 93%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/pixtral.cpp.o
[ 93%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/siglip.cpp.o
[ 95%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/qwen3vl.cpp.o
[ 95%] Building CXX object common/CMakeFiles/common.dir/jinja/lexer.cpp.o
[ 96%] Building CXX object common/CMakeFiles/common.dir/speculative.cpp.o
[ 96%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/mobilenetv5.cpp.o
[ 96%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/whisper-enc.cpp.o
[ 96%] Building CXX object tools/mtmd/CMakeFiles/mtmd.dir/models/youtuvl.cpp.o
[ 96%] Building CXX object common/CMakeFiles/common.dir/jinja/parser.cpp.o
[ 96%] Building CXX object common/CMakeFiles/common.dir/unicode.cpp.o
[ 96%] Building CXX object common/CMakeFiles/common.dir/jinja/runtime.cpp.o
[ 96%] Building CXX object common/CMakeFiles/common.dir/jinja/value.cpp.o
[ 98%] Building CXX object common/CMakeFiles/common.dir/jinja/string.cpp.o
[ 98%] Building CXX object common/CMakeFiles/common.dir/__/license.cpp.o
[ 98%] Building CXX object common/CMakeFiles/common.dir/jinja/caps.cpp.o
[ 98%] Linking CXX static library libmtmd.a
[ 98%] Built target mtmd
[ 98%] Linking CXX static library libcommon.a
[ 98%] Built target common
[100%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-task.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-common.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-context.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/server-context.dir/server-queue.cpp.o
[100%] Linking CXX static library libserver-context.a
[100%] Built target server-context
[100%] Building CXX object tools/cli/CMakeFiles/llama-cli.dir/cli.cpp.o
[100%] Linking CXX executable ../../bin/llama-cli
[100%] Built target llama-cli
[ 0%] Built target build_info
[ 0%] Built target cpp-httplib
[ 3%] Built target ggml-base
[ 8%] Built target ggml-cpu
[ 45%] Built target ggml-cuda
[ 45%] Built target ggml
[ 86%] Built target llama
[ 91%] Built target mtmd
[100%] Built target common
[100%] Building CXX object tools/mtmd/CMakeFiles/llama-mtmd-cli.dir/mtmd-cli.cpp.o
[100%] Linking CXX executable ../../bin/llama-mtmd-cli
[100%] Built target llama-mtmd-cli
[ 0%] Built target build_info
[ 3%] Built target ggml-base
[ 3%] Built target cpp-httplib
[ 8%] Built target ggml-cpu
[ 44%] Built target ggml-cuda
[ 44%] Built target ggml
[ 83%] Built target llama
[ 88%] Built target mtmd
[ 96%] Built target common
[ 98%] Built target server-context
[ 98%] Generating loading.html.hpp
[ 98%] Generating index.html.gz.hpp
[100%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server-http.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server.cpp.o
[100%] Building CXX object tools/server/CMakeFiles/llama-server.dir/server-models.cpp.o
[100%] Linking CXX executable ../../bin/llama-server
[100%] Built target llama-server
[ 0%] Built target build_info
[ 0%] Built target cpp-httplib
[ 3%] Built target ggml-base
[ 8%] Built target ggml-cpu
[ 48%] Built target ggml-cuda
[ 48%] Built target ggml
[ 91%] Built target llama
[100%] Built target common
[100%] Building CXX object tools/gguf-split/CMakeFiles/llama-gguf-split.dir/gguf-split.cpp.o
[100%] Linking CXX executable ../../bin/llama-gguf-split
[100%] Built target llama-gguf-split
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ cp ./build/bin/llama-* llama.cpp
cp: target 'llama.cpp': No such file or directory
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ cp ./build/bin/llama-* .
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ ./llama-cli -m /mnt/e/user/artifacts/qwen35_27b_unsloth/Qwen3.5-27B-UD-Q5_K_XL.gguf \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  --flash-attn on --fit on \
  -ngl 99 -p "Write me a short story" -n 200 2>&1 | tail -5

Loading model...

build : b8468-3306dbaef
model : Qwen3.5-27B-UD-Q5_K_XL.gguf
modalities : text

available commands:
/exit or Ctrl+C stop or exit
/regen regenerate the last response
/clear clear the chat history
/read add a text file

Write me a short story

[Start thinking]

Here's a thinking process that leads to the story above:

  1. Analyze the Request:
    • Task: Write a short story.
    • Constraints: None specified (

[ Prompt: 4.6 t/s | Generation: 0.6 t/s ]

/exit

Exiting...

Expected Behavior

The model should load into direct GPU memory.

Actual Behavior

The model is loaded into indirect memory.
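
To put a number on where the model ends up, GPU memory use can be sampled while llama-cli is running. Below is a minimal sketch, assuming Python 3 is available in the distro and that `nvidia-smi` accepts the `--query-gpu` CSV query on this setup. The helper names `parse_memory_csv` and `query_gpu_memory` are hypothetical and not part of llama.cpp or WSL.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' rows (MiB) from nvidia-smi's CSV query output.

    Assumes every field is a plain integer; a value such as '[N/A]'
    would raise ValueError here.
    """
    rows = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return a list of (used_mib, total_mib) per GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_csv(result.stdout)
```

Calling `query_gpu_memory()` once before starting llama-cli and again after "Loading model..." finishes would show whether the used figure rises by roughly the size of the GGUF file. If it stays near the idle value, the weights are probably not in GPU memory.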

Diagnostic Logs

ldd /home/user/repos/llama.cpp/build/bin/llama-cli
linux-vdso.so.1 (0x00007ffd781aa000)
libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007e9169d56000)
libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x00007e9169800000)
libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x00007e916cbb5000)
libcudart.so.13 => /usr/local/cuda-13.2/lib64/libcudart.so.13 (0x00007e9169400000)
libcublas.so.13 => /usr/local/cuda-13.2/lib64/libcublas.so.13 (0x00007e9165e00000)
libcuda.so.1 => /usr/lib/wsl/lib/libcuda.so.1 (0x00007e9169d28000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007e9165a00000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007e9169717000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007e91696e9000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007e9165600000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007e916cbae000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007e9169d21000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007e9169d1c000)
/lib64/ld-linux-x86-64.so.2 (0x00007e916cc14000)
libcublasLt.so.13 => /usr/local/cuda-13.2/lib64/libcublasLt.so.13 (0x00007e9141400000)
(base) user@JT-Ryzen-5090:~/repos/llama.cpp$ ls /usr/lib/wsl/lib/ -lh
total 370M
-r-xr-xr-x 4 root root 180K Feb 28 10:32 libcuda.so
-r-xr-xr-x 4 root root 180K Feb 28 10:32 libcuda.so.1
-r-xr-xr-x 4 root root 180K Feb 28 10:32 libcuda.so.1.1
-r-xr-xr-x 2 root root 12M Feb 28 10:32 libcudadebugger.so.1
-r-xr-xr-x 1 root root 784K Oct 20 2023 libd3d12.so
-r-xr-xr-x 1 root root 6.6M Oct 20 2023 libd3d12core.so
-r-xr-xr-x 1 root root 920K Mar 31 2024 libdxcore.so
-r-xr-xr-x 3 root root 24M Feb 28 10:32 libnvcuvid.so
-r-xr-xr-x 3 root root 24M Feb 28 10:32 libnvcuvid.so.1
-r-xr-xr-x 1 root root 148M Jan 21 01:42 libnvdxdlkernels.so
-r-xr-xr-x 3 root root 267K Feb 28 10:32 libnvidia-encode.so
-r-xr-xr-x 3 root root 267K Feb 28 10:32 libnvidia-encode.so.1
-r-xr-xr-x 2 root root 90M Feb 28 10:32 libnvidia-gpucomp.so
lrwxrwxrwx 1 root root 20 Mar 22 16:05 libnvidia-gpucomp.so.595.45.03 -> libnvidia-gpucomp.so
-r-xr-xr-x 2 root root 279K Feb 28 10:32 libnvidia-ml.so.1
-r-xr-xr-x 2 root root 4.4M Feb 28 10:32 libnvidia-ngx.so.1
-r-xr-xr-x 2 root root 67K Jan 21 01:42 libnvidia-opticalflow.so
-r-xr-xr-x 2 root root 67K Jan 21 01:42 libnvidia-opticalflow.so.1
-r-xr-xr-x 1 root root 9.9K Sep 22 22:52 libnvoptix.so.1
lrwxrwxrwx 1 root root 15 Mar 22 16:05 libnvoptix_loader.so.1 -> libnvoptix.so.1
-r-xr-xr-x 2 root root 54M Feb 28 10:32 libnvwgf2umx.so
-r-xr-xr-x 2 root root 4.9M Feb 28 10:32 nvidia-ngx-updater
-r-xr-xr-x 2 root root 809K Feb 28 10:32 nvidia-smi
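
The `ldd` output above can also be checked mechanically. On WSL2 the NVIDIA driver library is expected to resolve from `/usr/lib/wsl/lib`, which is what the listing shows for `libcuda.so.1`. Below is a minimal sketch, assuming Python 3, that pulls the resolved path for one library out of `ldd` text. The function name `resolve_library` is hypothetical.

```python
def resolve_library(ldd_output, name):
    """Return the path ldd reports for `name`, or None if absent or not found."""
    for line in ldd_output.splitlines():
        parts = line.split()
        # Expected shape: "libcuda.so.1 => /usr/lib/wsl/lib/libcuda.so.1 (0x...)"
        if len(parts) >= 3 and parts[0] == name and parts[1] == "=>":
            if parts[2] == "not":  # ldd prints "=> not found" for missing libraries
                return None
            return parts[2]
    return None


def uses_wsl_driver(ldd_output):
    """True when libcuda.so.1 resolves from the WSL driver directory."""
    path = resolve_library(ldd_output, "libcuda.so.1")
    return path is not None and path.startswith("/usr/lib/wsl/lib/")
```

Feeding it the text printed by `ldd build/bin/llama-cli` returns `True` for the output shown here, so the binary is at least linked against the WSL driver stub and not a distro-packaged `libcuda`.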
