Fix #94: vendor llama.cpp symlinks so SPM consumers can build by gsdali · Pull Request #96 · tattn/LocalLLMClient

gsdali · 2026-04-25T02:47:52Z

Closes #94.

Problem

Sources/LocalLLMClientLlamaC/ contains 25 symlinks pointing into the exclude/llama.cpp/ git submodule. Because SwiftPM does not recurse submodules during package resolution (and there is no Package.swift hook for downstream to request it), every consumer using .package(url:..., from:) / branch: / revision: ends up with dangling symlinks and a build failure:

warning: ignoring broken symlink .../Sources/LocalLLMClientLlamaC/clip.cpp
…
.build/checkouts/LocalLLMClient/Sources/LocalLLMClientLlamaC/include/utils.h:3:10:
fatal error: '../common/chat.h' file not found

(<memory> not found, originally reported, is the same root cause hit at a slightly later point in the include chain.)

Fix

This PR implements option 1 from the issue — vendoring the symlinked content directly into the package — picked because it gives the cleanest end-state for SPM consumers (zero runtime/build-time dependence on submodule state).

Concretely, against submodule pin 9a3ea685:

22 file symlinks → real file copies (clip*, mtmd*, common/*, …)
2 directory symlinks → vendored directories
- vendor/minja/ → minja/ (chat-template.hpp, minja.hpp ≈ 150 KB)
- vendor/nlohmann/ → nlohmann/ (json.hpp 953 KB, json_fwd.hpp 6.6 KB)
1 nested symlink → miniaudio/miniaudio.h
common/json-partial.h was a forwarding header reaching into ../exclude/. Inlined the upstream content + the existing using json = nlohmann::ordered_json; line.
stb/stb_image.h had a Linux branch including ../exclude/. Vendored upstream stb_image.h as stb/stb_image_full.h and pointed the Linux branch at it. Mac/iOS path (stb_image.swift) unchanged.
.gitmodules removed (only entry was for exclude/llama.cpp).
exclude/ directory removed.

The seven include/ forwarding headers (ggml.h, ggml-alloc.h, ggml-backend.h, ggml-cpu.h, ggml-opt.h, gguf.h, llama.h) are left alone: each uses #if __has_include(<llama/...>) and the LocalLLMClientLlamaFramework xcframework satisfies the primary path. The #else branches still mention exclude/ but never compile when the xcframework is on the include path.

Total vendored: ~1.7 MB.

Standalone regression test

Added a new CI job build-spm-consumer-no-submodules that does actions/checkout@v4 without submodules: recursive and runs swift build --target LocalLLMClientLlama (and LocalLLMClientMLX). Today's main would fail this check; with this PR it should pass. The test is approach-agnostic — even if you'd rather adopt a different fix (binary target, build-tool plugin, etc.), the CI job is worth keeping so this regression can't recur silently.

Local verification

$ git clone --no-recurse-submodules https://github.com/gsdali/LocalLLMClient.git
$ cd LocalLLMClient && git checkout fix/94-vendor-llama-symlinks
$ swift build --target LocalLLMClientLlama
…
[170/170] Compiling LocalLLMClientLlama resource_bundle_accessor.swift
Build of target: 'LocalLLMClientLlama' complete! (44.89s)

And via a downstream consumer package (closer to the real reproduction in #94):

// Package.swift
.package(
    url: "https://github.com/gsdali/LocalLLMClient.git",
    revision: "67388722c8621dfcb7c7ec3dd01cf1bfac5e5784"
),
// target deps:
.product(name: "LocalLLMClient",      package: "LocalLLMClient"),
.product(name: "LocalLLMClientLlama", package: "LocalLLMClient"),

$ swift package resolve     # no submodules fetched
$ swift build
…
Build complete! (26.96s)

Trade-offs (please weigh before merging)

Repo size: grows ~1.7 MB. Bulk of that is nlohmann/json.hpp (953 KB) — small relative to the existing binary xcframework artefact pulled at resolve time, but it is a real diff.
scripts/update_dependencies.sh + update-dependencies.yml workflow: today they git submodule update --remote to bump llama.cpp. After this PR that step is a no-op — bumping llama.cpp now requires re-running a (to-be-added) vendoring script that overwrites the vendored files from a target llama.cpp SHA. Out of scope for this PR; flagging here as a follow-up. Sketch:
```
# bump_llama.sh <new-sha>
git clone https://github.com/ggml-org/llama.cpp /tmp/llama && cd /tmp/llama && git checkout $1
# for each vendored path, copy from /tmp/llama back into Sources/LocalLLMClientLlamaC/
```
Happy to follow up with this script in a separate PR if you'd like to merge this one first.
Alternative architectures considered (full enumeration in LocalLLMClientLlama: 'memory' file not found — C wrapper needs C++ stdlib #94 comment):
- Vendor (this PR)
- SPM build-tool plugin that materialises symlinks at build time → blocked: plugins can't fetch git content
- Binary target wrapping LocalLLMClientLlamaC pre-compiled → cleanest consumer experience but heavy release infrastructure
- Document workaround → poor DX, doesn't help SPM-via-Xcode users
If you'd prefer a different shape, this branch can be closed and the work redone — the CI job here is still worth keeping in either case.
PR Support Gemma 4 #92 (Gemma 4) is open and uses the submodule pattern. Rebasing that on top of this PR would require running the vendoring step against the new llama.cpp SHA Gemma 4 needs.

Cc / context: filed by a downstream of OCCTSwiftPartsAgent, where this is blocking the in-process LLM path on iPad. Pairs with #95 (the MLX backend Tokenizers fix); together both backends now build cleanly under SPM consumption.

The Sources/LocalLLMClientLlamaC/ directory contained 25 symlinks pointing into the exclude/llama.cpp/ git submodule. Because SwiftPM does not recurse submodules when resolving package dependencies, every downstream consumer using `.package(url:..., from:)` (or `branch:` / `revision:`) ended up with dangling symlinks and a build failure. There is no SPM hook for downstream Package.swift to request submodule recursion. This change replaces every symlink with a regular file copy of its target content from llama.cpp at submodule-pinned commit 9a3ea685, and drops the submodule entirely: - 22 file symlinks → vendored copies (clip*, mtmd*, common/* …) - 2 directory symlinks → vendored directories (vendor/minja → minja/, vendor/nlohmann → nlohmann/) - 1 nested symlink → vendored copy (miniaudio/miniaudio.h) - common/json-partial.h was a forwarding header reaching into ../exclude/. Inlined the upstream content plus the existing `using json = nlohmann::ordered_json;` line. - stb/stb_image.h had a Linux branch including ../exclude/. Vendored upstream stb_image.h as stb/stb_image_full.h and pointed the Linux branch at it. Mac/iOS path (stb_image.swift) unchanged. - .gitmodules removed (the only entry was for exclude/llama.cpp). - exclude/ directory removed. The seven include/ forwarding headers (ggml.h, ggml-alloc.h, ggml-backend.h, ggml-cpu.h, ggml-opt.h, gguf.h, llama.h) are left alone: they use `#if __has_include(<llama/...>)` and the LocalLLMClientLlamaFramework xcframework satisfies the primary path. The dead `#else` branches still mention exclude/ but never compile with the xcframework on the include path. CI: adds a new build-spm-consumer-no-submodules job that does actions/checkout@v4 without submodules:recursive and runs `swift build --target LocalLLMClientLlama` (and MLX). This is the standalone regression test that today's main would fail. Local verification: `swift build --target LocalLLMClientLlama` from a working tree with the submodule removed builds clean (170/170 compiled). Trade-offs (maintainer to weigh): - Repo grows by ~1.7 MB of vendored source (vendor/nlohmann/ json.hpp alone is 953 KB). - scripts/update_dependencies.sh and the update-dependencies workflow currently `git submodule update --remote`. After this PR that step is a no-op; bumping llama.cpp now means re-running a (to-be-added) vendoring script. Out of scope for this PR — flagged for follow-up. - This is one of four options outlined in tattn#94 (vendor / build-tool plugin / binary target / document workaround). The PR demonstrates the vendor option; if you prefer another shape, this branch can be closed and the fix taken from a different angle. The CI job in this PR is approach-agnostic and worth keeping in any case. Closes tattn#94. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request integrates core llama.cpp components directly into the local source tree, replacing the submodule dependency with local implementations for chat parsing, multimodal support, and audio preprocessing. The review identifies several technical improvements: replacing std::rand() with modern for better thread safety and distribution, substituting std::localtime with thread-safe variants (localtime_r or localtime_s) to avoid race conditions in multi-threaded environments, and moving away from deprecated std::wstring_convert and std::codecvt_utf8 utilities to ensure long-term maintainability.

I am having trouble creating individual review comments. Click here to see my feedback.

Sources/LocalLLMClientLlamaC/common/chat-parser.cpp (22)

The use of std::rand() without a prior call to std::srand() results in a deterministic sequence of numbers, which may lead to predictable healing markers across different runs of the application. Additionally, std::rand() is not thread-safe. Consider using the modern C++ <random> header, specifically std::random_device and std::mt19937, for better distribution and thread safety.

Sources/LocalLLMClientLlamaC/common/common.cpp (428)

std::localtime is not thread-safe as it returns a pointer to a static internal buffer that is shared across all threads. In a multi-threaded environment, this can lead to race conditions where the returned data is overwritten by another thread. Use localtime_r on POSIX systems or localtime_s on Windows to ensure thread safety.

Sources/LocalLLMClientLlamaC/common/common.cpp (716)

std::wstring_convert and std::codecvt_utf8 are deprecated since C++17. While they are still functional in many compilers, they are slated for removal in future standards. Consider using a more modern library or platform-specific APIs for UTF-8/UTF-16 conversions to ensure long-term maintainability.

Sources/LocalLLMClientLlamaC/minja/chat-template.hpp (509)

Similar to the issue in common.cpp, std::localtime is not thread-safe. Since this is a header-only component that might be used in various contexts, using the thread-safe variants (localtime_r or localtime_s) is highly recommended to avoid potential race conditions.

tattn · 2026-04-25T03:18:02Z

In this library, I use the XCFramework provided by llama.cpp whenever possible.
When the interface is insufficient, I add only the minimum necessary symbolic links or header implementations.

The reason is that once I start copying files, I would eventually end up maintaining a separate version of llama.cpp in this repository.

Could you avoid copying files as much as possible?

gsdali · 2026-04-25T09:26:06Z

Thanks @tattn — that's a fair call, and a well-stated rationale. Avoiding the "separate fork of llama.cpp" maintenance burden is exactly the right priority for this library.

Closing this PR. Capturing the architectural decision here so anyone hitting #94 in the future has context:

Vendoring is off the table by design — keeps llama.cpp as a tracked submodule, not a forked copy.
Symlink + xcframework is the chosen architecture — the xcframework provides the bulk of the surface; symlinks fill the gaps.
SPM-consumer breakage (this PR's reason for existing) remains a real downstream problem when consumers pull via .package(url:..., from:) without --recurse-submodules. Two remaining viable shapes for a future fix:
1. Binary target wrapping LocalLLMClientLlamaC — pre-build the C wrapper into its own xcframework alongside the existing one, ship as .binaryTarget. Cleanest consumer experience but heavier release infrastructure.
2. A SwiftPM build-tool plugin that materialises the symlinks at build time — currently blocked by SPM not exposing a hook to fetch git submodules during package resolution; could become viable if SPM grows that capability.
Downstream workaround for now: consumers who need the Llama backend over SPM can pin a fork revision that vendors the symlinks (this PR's branch is one such snapshot, though it won't track future llama.cpp bumps), or use the local-development path of cloning the repo with --recurse-submodules and depending on a path package.

For OCCTSwiftPartsAgent we're consuming LocalLLMClient + LocalLLMClientMLX only, so the symlink issue doesn't affect us — happy that #95 unblocks the MLX path. Will revisit Llama integration in a future phase if/when the binary-target shape (or another approach) lands.

Thanks again for the quick review and the clear feedback.

gemini-code-assist Bot reviewed Apr 25, 2026

View reviewed changes

gsdali closed this Apr 25, 2026

gsdali deleted the fix/94-vendor-llama-symlinks branch April 25, 2026 09:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix #94: vendor llama.cpp symlinks so SPM consumers can build#96

Fix #94: vendor llama.cpp symlinks so SPM consumers can build#96
gsdali wants to merge 1 commit into
tattn:mainfrom
gsdali:fix/94-vendor-llama-symlinks

gsdali commented Apr 25, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

tattn commented Apr 25, 2026

Uh oh!

gsdali commented Apr 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

gsdali commented Apr 25, 2026

Problem

Fix

Standalone regression test

Local verification

Trade-offs (please weigh before merging)

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Sources/LocalLLMClientLlamaC/common/chat-parser.cpp (22)

Sources/LocalLLMClientLlamaC/common/common.cpp (428)

Sources/LocalLLMClientLlamaC/common/common.cpp (716)

Sources/LocalLLMClientLlamaC/minja/chat-template.hpp (509)

Uh oh!

tattn commented Apr 25, 2026

Uh oh!

gsdali commented Apr 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants