
CUDA/Vulkan "detected but using failed" in v3.15.0 (works in v3.14.5) on Windows 11 + CUDA 13.1 #545

@AmzGrainRain


Issue description

When using node-llama-cpp@3.15.0 on Windows 11 with CUDA 13.1 (RTX 4060 Laptop GPU), the inspect gpu command shows "CUDA is detected, but using it failed" and "Vulkan is detected, but using it failed". However, downgrading to node-llama-cpp@3.14.5 resolves the issue immediately (CUDA/Vulkan are marked as "available" and GPU details are displayed correctly). This issue is reproducible in a clean project (no other dependencies), indicating it's a compatibility issue with v3.15.0's prebuilt binaries (b7698) rather than an environment configuration problem.

Expected Behavior

The npx node-llama-cpp inspect gpu command should:

  1. Mark CUDA and Vulkan as "available" (not "detected but using failed") in v3.15.0;
  2. Display detailed GPU information (e.g., CUDA device name, VRAM usage, Vulkan devices) for NVIDIA GeForce RTX 4060 Laptop GPU;
  3. Be consistent with the behavior of v3.14.5 (which works correctly).

Actual Behavior

For node-llama-cpp@3.15.0:

  • inspect gpu outputs "CUDA is detected, but using it failed" and "Vulkan is detected, but using it failed";
  • No CUDA/Vulkan GPU details (e.g., VRAM usage, device name) are displayed;
  • Prebuilt binaries version: b7698.

Full output of npx --yes node-llama-cpp inspect gpu (v3.15.0):

OS: Windows 10.0.26100 (x64)
Node: 25.3.0 (x64)

node-llama-cpp: 3.15.0
Prebuilt binaries: b7698

CUDA: CUDA is detected, but using it failed
To resolve errors related to CUDA, see the CUDA guide: https://node-llama-cpp.withcat.ai/guide/CUDA
Vulkan: Vulkan is detected, but using it failed
To resolve errors related to Vulkan, see the Vulkan guide: https://node-llama-cpp.withcat.ai/guide/vulkan

CPU model: AMD Ryzen 7 8845H w/ Radeon 780M Graphics
Math cores: 3.4873849e-316
Used RAM: 49.47% (11.52GB/23.29GB)
Free RAM: 50.52% (11.77GB/23.29GB)
Used swap: 40.82% (17.27GB/42.29GB)
Max swap size: 42.29GB
mmap: supported

Steps to reproduce

  1. Install node-llama-cpp@3.15.0 in a clean project.
  2. Run npx --yes node-llama-cpp inspect gpu.

My Environment

Dependency               Version
Operating System         Windows 11 Enterprise LTSC 26100.7623
CPU                      AMD R7 8845HS
Node.js version          25.3.0
TypeScript version       5.9.3
node-llama-cpp version   3.15.0

Full output of npx --yes node-llama-cpp inspect gpu after downgrading to v3.14.5 (for comparison):

OS: Windows 10.0.26100 (x64)
Node: 25.3.0 (x64)

node-llama-cpp: 3.14.5
Prebuilt binaries: b7347

CUDA: available
Vulkan: available

CUDA device: NVIDIA GeForce RTX 4060 Laptop GPU
CUDA used VRAM: 13.33% (1.07GB/8GB)
CUDA free VRAM: 86.66% (6.93GB/8GB)

Vulkan devices: NVIDIA GeForce RTX 4060 Laptop GPU, AMD Radeon 780M Graphics
Vulkan used VRAM: 7.29% (1.43GB/19.67GB)
Vulkan free VRAM: 92.7% (18.23GB/19.67GB)
Vulkan unified memory: 7.93GB (40.32%)

CPU model: AMD Ryzen 7 8845H w/ Radeon 780M Graphics
Math cores: 5.39976352e-315
Used RAM: 49.05% (11.42GB/23.29GB)
Free RAM: 50.94% (11.87GB/23.29GB)
Used swap: 40.2% (17GB/42.29GB)
Max swap size: 42.29GB
mmap: supported
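
The two outputs differ mainly in the "CUDA:" and "Vulkan:" status lines and in the prebuilt binaries tag. As a rough sketch for anyone who wants to compare versions automatically, the helper below pulls those fields out of pasted inspect gpu text. The names parseBackendStatus and summarizeInspectOutput are hypothetical and are not part of node-llama-cpp; the sketch assumes the line layout shown in the outputs above and may need adjusting if that layout changes.

```typescript
// Hypothetical helpers (not part of node-llama-cpp): parse the text printed
// by `npx --yes node-llama-cpp inspect gpu`, assuming the layout shown above.
type BackendStatus = "available" | "failed" | "unknown";

function parseBackendStatus(output: string, backend: "CUDA" | "Vulkan"): BackendStatus {
    for (const rawLine of output.split(/\r?\n/)) {
        const line = rawLine.trim();
        // Match only the status line ("CUDA: ..."), not "CUDA device: ...".
        if (!line.startsWith(`${backend}:`)) continue;
        const value = line.slice(backend.length + 1).trim();
        if (value === "available") return "available";
        if (value.includes("detected, but using it failed")) return "failed";
        return "unknown";
    }
    return "unknown"; // no status line found for this backend
}

interface InspectSummary {
    version: string | null;   // e.g. "3.15.0"
    prebuilt: string | null;  // e.g. "b7698"
    cuda: BackendStatus;
    vulkan: BackendStatus;
}

function summarizeInspectOutput(output: string): InspectSummary {
    const version = /^node-llama-cpp:\s*(\S+)/m.exec(output)?.[1] ?? null;
    const prebuilt = /^Prebuilt binaries:\s*(\S+)/m.exec(output)?.[1] ?? null;
    return {
        version,
        prebuilt,
        cuda: parseBackendStatus(output, "CUDA"),
        vulkan: parseBackendStatus(output, "Vulkan"),
    };
}
```

Running summarizeInspectOutput on each of the two pasted outputs and comparing the results makes the regression between the two releases easy to spot without reading the full text.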

Additional Context

nvcc

PS C:\Users\khlee> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Tue_Dec_16_19:27:18_Pacific_Standard_Time_2025
Cuda compilation tools, release 13.1, V13.1.115
Build cuda_13.1.r13.1/compiler.37061995_0

nvidia-smi

PS C:\Users\khlee> nvidia-smi
Wed Jan 21 15:12:31 2026
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 591.74                 Driver Version: 591.74         CUDA Version: 13.1     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4060 ...  WDDM  |   00000000:01:00.0  On |                  N/A |
| N/A   39C    P8              3W /  115W |    1549MiB /   8188MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
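
Both tools report CUDA 13.1 here. For completeness, a small sketch of the same cross-check in code: it extracts the toolkit release from nvcc --version text and the "CUDA Version" field from nvidia-smi text (the highest CUDA version the installed driver supports), then checks that the toolkit is not newer than the driver allows. The function names are hypothetical, and passing this check says nothing about why the v3.15.0 prebuilt binaries fail; it only rules out an obvious toolkit/driver mismatch.

```typescript
// Hypothetical helpers: compare the CUDA toolkit release reported by nvcc
// with the maximum CUDA version reported by the driver via nvidia-smi.
function nvccRelease(nvccOutput: string): string | null {
    // Matches e.g. "Cuda compilation tools, release 13.1, V13.1.115"
    return /release\s+(\d+\.\d+)/.exec(nvccOutput)?.[1] ?? null;
}

function driverCudaVersion(smiOutput: string): string | null {
    // Matches e.g. "CUDA Version: 13.1" in the nvidia-smi header row
    return /CUDA Version:\s*(\d+\.\d+)/.exec(smiOutput)?.[1] ?? null;
}

function toolkitSupportedByDriver(nvccOutput: string, smiOutput: string): boolean | null {
    const toolkit = nvccRelease(nvccOutput);
    const driver = driverCudaVersion(smiOutput);
    if (toolkit === null || driver === null) return null; // could not parse

    const [tMajor, tMinor] = toolkit.split(".").map(Number);
    const [dMajor, dMinor] = driver.split(".").map(Number);
    // The toolkit must not be newer than what the driver reports.
    if (tMajor !== dMajor) return tMajor < dMajor;
    return tMinor <= dMinor;
}
```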

Other notes

  • Environment variables (CUDA_PATH) are configured correctly: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.1
  • No compilation errors during pnpm install for either version (postinstall script completes successfully)
  • This issue is not related to environment configuration – it's specific to v3.15.0's prebuilt binaries (b7698)
  • I am happy to run debugging commands as requested (e.g., collect logs, verify environment variables, test specific commands), but I cannot help with CUDA/node-gyp-specific compilation, build script modifications, or native module development for this package, because I am not familiar with CUDA or node-gyp.

Relevant Features Used

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.


Labels: bug, requires triage
