
bug: loadBackends(backendsPath) skipped when buildGpu === false, silently drops custom GPU backends compiled via NODE_LLAMA_CPP_CMAKE_OPTION_* #599

@Zighy

Issue description

RELATED: #479 (feat: builtin ROCm support)

Expected Behavior

loadBackends(backendsPath) is called regardless of buildGpu, so that custom GPU
backends compiled via NODE_LLAMA_CPP_CMAKE_OPTION_* are loaded and used at runtime.
If no backend initialises, getGpuType() returns false and the existing fallback
path proceeds unchanged.
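
Concretely, with a custom GGML_HIP=ON build in place, the probe would then surface the backend through the public API. A minimal sketch of the intended observable result (mirroring step 4 of the reproduction below; the file name is arbitrary):

// expected-gpu.mjs: sketch of the expected post-fix result, not library code
import {getLlama} from "node-llama-cpp";

const llama = await getLlama({gpu: false});  // selects the --gpu false custom build
console.log("gpu:", llama.gpu);              // expected: "cuda" (ROCm reports its devices as "cuda")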

Actual Behavior

In src/bindings/Llama.ts, the call to loadBackends(backendsPath) is guarded by
buildGpu !== false. When a binary is built with --gpu false, buildGpu is false
and the guard makes the entire block a no-op. Any backend .so placed in the binary's
Release/ directory by a custom cmake build (e.g. libggml-hip.so via GGML_HIP=ON)
is never loaded. Inference silently falls back to CPU with no warning or error.

// src/bindings/Llama.ts — v3.18.1 (affected code)
let loadedGpu = bindings.getGpuType();
// for a --gpu false build, buildGpu === false, so once getGpuType() returns false
// the whole condition is false and the block below is skipped
if (loadedGpu == null || (loadedGpu === false && buildGpu !== false)) {
    const backendsPath = path.dirname(bindingPath);
    const fallbackBackendsDir = path.join(extBackendsPath ?? backendsPath, "fallback");
    bindings.loadBackends(backendsPath);          // ← never reached when buildGpu === false
    loadedGpu = bindings.getGpuType();
    if (loadedGpu == null || (loadedGpu === false && buildGpu !== false))
        bindings.loadBackends(fallbackBackendsDir);
}

loadBackends(backendsPath) is never called when the binary was built with --gpu false.
llama.gpu is false even when a valid GPU backend (e.g. libggml-hip.so) was
compiled into the binary directory. Inference runs on CPU.
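
To spell out why the block is skipped, the guard can be extracted and evaluated for the two relevant cases (a standalone sketch for illustration; the real BuildGpu union is simplified to string | false here):

// The condition from Llama.ts, isolated for illustration.
type GpuType = string | false | null;

const shouldProbe = (loadedGpu: GpuType, buildGpu: string | false) =>
    loadedGpu == null || (loadedGpu === false && buildGpu !== false);

console.log(shouldProbe(false, "cuda"));  // true:  GPU build, the probe runs
console.log(shouldProbe(false, false));   // false: --gpu false build, loadBackends() never runs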

Steps to reproduce

# 1. Set cmake options to compile a custom GPU backend
export NODE_LLAMA_CPP_CMAKE_OPTION_GGML_HIP=ON
export NODE_LLAMA_CPP_CMAKE_OPTION_AMDGPU_TARGETS=gfx1200

# 2. Build with --gpu false
node node-llama-cpp/dist/cli/cli.js source download --gpu false --noUsageExample

# 3. Confirm the backend .so was compiled
find ~/.cache/node-llama-cpp -name "libggml-hip.so"
# → file exists in Release/

# 4. Check llama.gpu at runtime
node -e "
const { getLlama } = require('node-llama-cpp');
getLlama({ gpu: false }).then(l => console.log('gpu:', l.gpu));
"
# Expected: gpu: cuda   (ROCm maps its device names to "cuda" internally)
# Actual:   gpu: false  (libggml-hip.so was never loaded)
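
node-llama-cpp ships as an ES module, so if require(esm) is unavailable in a given Node.js version, the same check can be run as a standalone module instead (sketch; the file name is arbitrary):

// check-gpu.mjs
import {getLlama} from "node-llama-cpp";

const llama = await getLlama({gpu: false});
console.log("gpu:", llama.gpu);  // prints "gpu: false" while the bug is present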

My Environment

node-llama-cpp: 3.18.1
llama.cpp: release b8390
Node.js: 22.22.2
OS: Ubuntu 24.04.4 LTS (Docker, rocm/dev-ubuntu-24.04:latest)
GPU: AMD RX 9060 XT — gfx1200 (RDNA 4)
ROCm: 7.2.2

Additional Context

The buildGpu !== false guard is redundant: loadBackends(backendsPath) already has
no effect if no backend is found — getGpuType() simply returns false again and the
fallback path proceeds. The guard only prevents the probe from being attempted.

Proposed fix — remove buildGpu !== false from both checks:

let loadedGpu = bindings.getGpuType();
if (loadedGpu == null || loadedGpu === false) {
    const backendsPath = path.dirname(bindingPath);
    const fallbackBackendsDir = path.join(extBackendsPath ?? backendsPath, "fallback");
    bindings.loadBackends(backendsPath);
    loadedGpu = bindings.getGpuType();
    if (loadedGpu == null || loadedGpu === false)
        bindings.loadBackends(fallbackBackendsDir);
}

This fix is a prerequisite for any --gpu false + cmake workaround for ROCm/HIP while
native support is pending (#479). It also affects any other custom GPU backend injected
via NODE_LLAMA_CPP_CMAKE_OPTION_* on non-NVIDIA/non-Apple hardware.

Relevant Features Used

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

No, I don’t have the time and I’m okay to wait for the community / maintainers to resolve this issue.

Labels

bug (Something isn't working), requires triage (Requires triaging)
