cmake: skip cvector-generator and export-lora when CPU backend is disabled by arichiardi · Pull Request #24053 · ggml-org/llama.cpp

arichiardi · 2026-06-02T21:46:41Z

Overview

Both cvector-generator and export-lora link against CPU backend symbols (ggml_backend_cpu_init,
ggml_get_f32_nd, etc.) unconditionally. When building with -DGGML_CPU=OFF, these tools fail to
link, causing cmake --install to error on a missing binary:

CMake Error at build/tools/cvector-generator/cmake_install.cmake:52 (file):
  file INSTALL cannot find
  "<repo-path>/llama.cpp/build/bin/llama-cvector-generator": No such
  file or directory.
Call Stack (most recent call first):
  build/tools/cmake_install.cmake:122 (include)
  build/cmake_install.cmake:67 (include)

Both tools are guarded by NOT GGML_BACKEND_DL but not by GGML_CPU. This PR adds the missing GGML_CPU check, so both tools are skipped when the CPU backend is disabled.
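
For illustration, the combined guard could look roughly like the sketch below. The diff itself is not reproduced here, so the surrounding layout of tools/CMakeLists.txt and the exact placement of the add_subdirectory calls are assumptions. Only the two conditions (GGML_BACKEND_DL off, GGML_CPU on) and the two tool names come from the description above.

```cmake
# Sketch only: the real tools/CMakeLists.txt may order or nest this differently.
# Both tools call CPU backend symbols directly (e.g. ggml_backend_cpu_init),
# so they are built only when backends are not loaded dynamically
# (GGML_BACKEND_DL is OFF) and the CPU backend is enabled (GGML_CPU is ON).
if (NOT GGML_BACKEND_DL AND GGML_CPU)
    add_subdirectory(cvector-generator)
    add_subdirectory(export-lora)
endif()
```

With a guard of this shape, a -DGGML_CPU=OFF configuration would never define the two targets, so no install rule is generated for the binaries that previously went missing.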

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES - AI assisted in identifying the issue and proposing the fix

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Tightens the build conditions for certain tools/examples so they are only included when the CPU backend is enabled (in addition to requiring non-dynamic backend loading).

Changes:

Restrict cvector-generator and export-lora subdirectories to build only when GGML_CPU is enabled and GGML_BACKEND_DL is disabled.

* StepFun 3.5 MTP
* Simplify to single layer
* Rollback core changes
* fix flake8 errors
* Remove scripts
* modify to convention
* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* dos2unix

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

…te-embedding-{97,311}m-multilingual-r2) (ggml-org#22716)

* Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:
* Added a version of the gpt4o tokenizer that has a fixed regex (better handling of marks), and different token merging setting for the 97m model
* Reused gemma4 tokenizer for the 311m model
* granite-embedding-*-multilingual-r2 : add support SwiGLU FFN for Granite Embedding Multilingual R2
* added new GGUF key <arch>.hidden_activation (LLM_KV_HIDDEN_ACT) + writer
* added a forward declaration of llm_ffn_op_type to llama-hparams.h
* added llm_ffn_op in hparams
* added LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (value-initialization), modern-bert: explicitly assigns LLM_FFN_GEGLU before reading GGUF (unchanged).
* centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()
* modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph
* Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code
* Added the hashes for the granite embedding multilingual R2 models
* Set the hidden_activation in the GGUF if the field is present in config.json (such as for the granite embedding models)

* model: support for Mellum architecture
* model: improve mellum.py formatting
* model: improve mellum.py formatting once again
* deps: downgrade transformers to 4.57.6 (to fix CI)
* deps: remove huggingface_hub dependency
* deps: remove huggingface_hub from test requirements

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

cmake: skip cvector-generator and export-lora when CPU backend is dis…

c6ea41b

…abled

Copilot AI review requested due to automatic review settings June 2, 2026 21:46

github-actions Bot added the examples label Jun 2, 2026

Copilot AI reviewed Jun 2, 2026

Comment thread tools/CMakeLists.txt

pwilkin and others added 6 commits June 2, 2026 21:57

hexagon: profiler output fix and script updates (ggml-org#24042)

e8dff9c

* hex-ops: fix profiler output (ie remove the redundant NONEs)
* hex-prof: update profiling script to support tot.usec column

opencl: use flat variants of q4_K and q6_K gemv for very large M (ggm…

fc487d9

…l-org#24006)

cmake: clarify comment for cvector-generator and export-lora build guard

0b22e90

arichiardi requested review from a team, CISC, JohannesGaessler and ggerganov as code owners June 2, 2026 21:57

github-actions Bot added the model, script, testing, python, server, ggml, OpenCL and Hexagon labels Jun 2, 2026

arichiardi wants to merge 7 commits into ggml-org:master from arichiardi:fix-cvector-cpu-guard

7 participants
