Pull requests: ggml-org/llama.cpp


Pull requests list

model: qwen3next: no concat-in-loop
Labels: model
#18759 opened Jan 11, 2026 by ngxson · Draft

vocab: add tokenizer support for jina-embeddings-v2-base-zh
Labels: python
#18756 opened Jan 11, 2026 by o7si

Kimi-Linear support (backend agnostic + MLA KV cache)
Labels: ggml, model, python
#18755 opened Jan 11, 2026 by ymcki

security: make it clear about subtopics in server
#18754 opened Jan 11, 2026 by ngxson

tests : refactor test-backend-sampler
Labels: devops, testing
#18753 opened Jan 11, 2026 by ggerganov

Vulkan: Optimize Matmul parameters for AMD GPUs with Coopmat support
Labels: ggml, Vulkan
#18749 opened Jan 11, 2026 by 0cc4m

ggml, llama : add CPU paged attention for memory-efficient KV cache
Labels: examples, ggml, model, testing
#18747 opened Jan 11, 2026 by pestopoppa · 5 tasks done

server: add missing rerank and chat presets (#10932)
#18742 opened Jan 10, 2026 by ingyukoh

POC: group gate_exps and up_exps + fix mxfp4 alignment for PP boost
Labels: model, python
#18740 opened Jan 10, 2026 by am17an · Draft

llama: add canaries to Markdown files
#18735 opened Jan 10, 2026 by JohannesGaessler

feat: add support for WeDLM architecture
Labels: python
#18731 opened Jan 10, 2026 by feedseawave · 5 tasks done

model: Add VAETKI support
Labels: examples, model, python
#18719 opened Jan 9, 2026 by dororodoroddo · 5 tasks done

ggml: new backend for Virglrenderer API Remoting acceleration (v2)
Labels: build, ggml, python
#18718 opened Jan 9, 2026 by kpouget

vulkan: Check maxStorageBufferRange in supports_op
Labels: ggml, Vulkan
#18709 opened Jan 9, 2026 by jeffbolznv

fix text spacing in print_info
#18708 opened Jan 9, 2026 by ddh0

ggml-metal: Clean up files used for embedded build
Labels: Apple Metal, ggml
#18705 opened Jan 9, 2026 by DaAwesomeP

[WIP] ggml-opencl: op args init refactoring
Labels: ggml, OpenCL
#18701 opened Jan 8, 2026 by chraac · Draft

Improving inference speed for the repack buffer type on NUMA architectures
Labels: ggml
#18698 opened Jan 8, 2026 by zzjianhui

ggml-cuda: extend concat support for more types
Labels: ggml, Nvidia GPU
#18690 opened Jan 8, 2026 by Lourdle