cmake: skip cvector-generator and export-lora when CPU backend is disabled#24053

Open
arichiardi wants to merge 7 commits into ggml-org:master from arichiardi:fix-cvector-cpu-guard

arichiardi commented Jun 2, 2026

Overview

Both cvector-generator and export-lora link against CPU backend symbols (ggml_backend_cpu_init,
ggml_get_f32_nd, etc.) unconditionally. When building with -DGGML_CPU=OFF, these tools fail to
link, causing cmake --install to error on a missing binary:

CMake Error at build/tools/cvector-generator/cmake_install.cmake:52 (file):
  file INSTALL cannot find
  "<repo-path>/llama.cpp/build/bin/llama-cvector-generator": No such
  file or directory.
Call Stack (most recent call first):
  build/tools/cmake_install.cmake:122 (include)
  build/cmake_install.cmake:67 (include)

Both tools are guarded by NOT GGML_BACKEND_DL but not by GGML_CPU. This PR adds the missing GGML_CPU check to that guard.
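
The guard change can be sketched roughly as follows. This is an illustration only: it assumes both tools are added under a single `if()` block in tools/CMakeLists.txt, and the actual diff is not reproduced in this conversation, so the real layout may differ.

```cmake
# Sketch only: assumes both tools share one guard in tools/CMakeLists.txt.
# Assumed previous condition: if (NOT GGML_BACKEND_DL)
if (NOT GGML_BACKEND_DL AND GGML_CPU)
    # These tools reference CPU backend symbols directly
    # (e.g. ggml_backend_cpu_init), so they are skipped when the CPU
    # backend is disabled or when backends are loaded dynamically.
    add_subdirectory(cvector-generator)
    add_subdirectory(export-lora)
endif()
```

With a guard along these lines, a -DGGML_CPU=OFF configuration should not add the two subdirectories at all, so no install rule is generated for binaries that cannot be linked.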

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES - AI assisted in identifying the issue and proposing the fix

Copilot AI review requested due to automatic review settings June 2, 2026 21:46

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Tightens the build conditions for certain tools/examples so they are only included when the CPU backend is enabled (in addition to requiring non-dynamic backend loading).

Changes:

  • Restrict cvector-generator and export-lora subdirectories to build only when GGML_CPU is enabled and GGML_BACKEND_DL is disabled.

Comment thread tools/CMakeLists.txt
pwilkin and others added 6 commits June 2, 2026 21:57
* StepFun 3.5 MTP

* Simplify to single layer

* Rollback core changes

* fix flake8 errors

* Remove scripts

* modify to convention

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* dos2unix

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
…te-embedding-{97,311}m-multilingual-r2) (ggml-org#22716)

* Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:

* Added a version of the gpt4o tokenizer that has a fixed regex (better handling of marks), and different token merging setting for the 97m model
* Reused gemma4 tokenizer for the 311m model

* granite-embedding-*-multilingual-r2 : add SwiGLU FFN support for Granite Embedding Multilingual R2

* added new GGUF key <arch>.hidden_activation (LLM_KV_HIDDEN_ACT) + writer
* added a forward declaration of llm_ffn_op_type to llama-hparams.h
* added llm_ffn_op in hparams
* added LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (value-initialization), modern-bert: explicitly assigns LLM_FFN_GEGLU before reading GGUF (unchanged).
* centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()
* modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph

* Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code

* Added the hashes for the granite embedding multilingual R2 models
* Set the hidden_activation in the GGUF if the field is present in config.json (such as for the granite embedding models)
* model: support for Mellum architecture

* model: improve mellum.py formatting

* model: improve mellum.py formatting once again

* deps: downgrade transformers to 4.57.6 (to fix CI)

* deps: remove huggingface_hub dependency

* deps: remove huggingface_hub from test requirements

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* hex-ops: fix profiler output (i.e. remove the redundant NONEs)

* hex-prof: update profiling script to support tot.usec column
github-actions bot added the model, script, testing, python, server, ggml, OpenCL, and Hexagon labels Jun 2, 2026

Labels

examples, ggml, Hexagon, model, OpenCL, python, script, server, testing

7 participants