Fix Metal model view overlap for q4 by Dango233 · Pull Request #58 · antirez/ds4

Dango233 · 2026-05-10T11:15:19Z

Summary

Fixes q4 startup on machines where the model is larger than Metal's per-buffer maxBufferLength and is therefore mapped as several overlapping views. Also documents q4 usage and benchmark results for a Mac Studio M2 Ultra with 192 GB RAM.

The original implementation used a fixed 672 MiB model-view overlap. The q4 GGUF has 1152 MiB expert tensors, which are larger than that overlap, so on this machine one tensor can straddle two mapped Metal views without fitting entirely inside either, and startup fails with:

ds4: Metal model range 106.89..108.01 GiB is not covered by mapped model views
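
To illustrate the failure mode, here is a minimal Python sketch, not the ds4 code. It assumes views of a fixed length whose start offsets advance by `view_len - overlap`; the helper names `build_views` and `covering_view`, the 4096 MiB view length, and the 10000 MiB file size are hypothetical values picked for the example, not the real maxBufferLength or model size. All sizes are in MiB.

```python
def build_views(file_size, view_len, overlap):
    """Return (start, end) ranges covering the file.

    Assumed layout: each view is at most view_len long and the next
    view starts view_len - overlap after the previous one.
    """
    assert 0 <= overlap < view_len
    stride = view_len - overlap
    views = []
    start = 0
    while True:
        end = min(start + view_len, file_size)
        views.append((start, end))
        if end >= file_size:
            break
        start += stride
    return views


def covering_view(views, t_start, t_size):
    """Index of the first view that fully contains the tensor, else None."""
    t_end = t_start + t_size
    for i, (v_start, v_end) in enumerate(views):
        if v_start <= t_start and t_end <= v_end:
            return i
    return None


# Hypothetical numbers: a 1152 MiB tensor placed at offset 3000 MiB.
fixed = build_views(file_size=10000, view_len=4096, overlap=672)
derived = build_views(file_size=10000, view_len=4096, overlap=1152)

uncovered = covering_view(fixed, 3000, 1152)    # no single view contains it
covered = covering_view(derived, 3000, 1152)    # contained in the second view
print(uncovered, covered)
```

Under this assumed layout, an overlap at least as large as the largest tensor guarantees that every tensor lies entirely inside some view, which is the property the fixed 672 MiB value could not provide for 1152 MiB tensors.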

This PR fixes that by deriving the Metal model-view overlap from the largest tensor described by the GGUF metadata, with a CLI/server override kept as a diagnostic lower-bound escape hatch.
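
The derivation described above can be sketched as follows. This is an illustration under assumptions, not the code in this PR: `derive_overlap` is a hypothetical name, the tensor sizes are passed in as a plain list rather than read from GGUF metadata, and the override is interpreted as a minimum, so it can raise the overlap but never lower it below the largest tensor.

```python
MIB = 1024 * 1024


def derive_overlap(tensor_sizes, override=None):
    """Pick a view overlap (bytes) large enough for the largest tensor.

    tensor_sizes: tensor byte sizes, as would be described by the GGUF metadata.
    override: optional user-supplied value, treated here as a lower bound
              (an assumption about the CLI/server option's semantics).
    """
    largest = max(tensor_sizes, default=0)
    if override is None:
        return largest
    return max(largest, override)


# Sizes mentioned in this PR: 672 MiB was the old fixed overlap,
# 1152 MiB is the size of the q4 expert tensors.
auto = derive_overlap([672 * MIB, 1152 * MIB])
too_small = derive_overlap([1152 * MIB], override=672 * MIB)
raised = derive_overlap([1152 * MIB], override=2048 * MIB)
print(auto // MIB, too_small // MIB, raised // MIB)
```

Treating the override as a lower bound means a too-small value cannot reintroduce the uncovered-range failure, while a larger value remains available for diagnosing mapping problems.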

192 GB q4 setup

On the tested Mac Studio M2 Ultra with 192 GB RAM, q4 runs at --ctx 32768 after raising the iGPU wired-memory limit:

sudo sysctl iogpu.wired_limit_mb=188000
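
For a sense of scale, the short calculation below shows how much memory this limit leaves outside GPU-wired allocations. It assumes the machine's 192 GB means 192 GiB and that the sysctl value is counted in MiB; both are assumptions, not statements from the sysctl documentation.

```python
MIB_PER_GIB = 1024

# Assumption: "192 GB" of RAM is 192 GiB, expressed in MiB.
total_ram_mib = 192 * MIB_PER_GIB

# Value passed to iogpu.wired_limit_mb in this PR.
wired_limit_mib = 188000

# Memory left for everything that is not GPU-wired.
headroom_mib = total_ram_mib - wired_limit_mib
wired_fraction = wired_limit_mib / total_ram_mib

print(f"wired limit: {wired_limit_mib / MIB_PER_GIB:.2f} GiB")
print(f"headroom:    {headroom_mib / MIB_PER_GIB:.2f} GiB")
print(f"fraction:    {wired_fraction:.1%}")
```

Under those assumptions the limit allows roughly 95.6% of RAM to be wired for the GPU, leaving about 8.4 GiB for the rest of the system.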

Benchmark

Single-run Metal CLI numbers using:

--ctx 32768
--nothink
-sys ""
--temp 0
-n 256

Machine	Quant	Prompt	Prefill	Generation
Mac Studio M2 Ultra, 192 GB	q4	short	50.42 t/s	29.88 t/s
Mac Studio M2 Ultra, 192 GB	q4	3844 tokens	394.66 t/s	21.69 t/s

The long benchmark uses tests/test-vectors/prompts/long_code_audit.txt.
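
The throughput figures can be converted into approximate wall-clock times. The sketch below assumes each reported rate is an average over the whole phase and that all 256 requested tokens were generated; the short prompt's token count is not given, so its prefill time is not computed.

```python
def seconds(tokens, tokens_per_second):
    """Elapsed time implied by a token count and an average rate."""
    return tokens / tokens_per_second


# Long prompt: 3844 tokens prefilled at 394.66 t/s, then 256 tokens at 21.69 t/s.
long_prefill_s = seconds(3844, 394.66)
long_gen_s = seconds(256, 21.69)

# Short prompt: only generation time is derivable (256 tokens at 29.88 t/s).
short_gen_s = seconds(256, 29.88)

print(f"long prefill:     {long_prefill_s:.2f} s")
print(f"long generation:  {long_gen_s:.2f} s")
print(f"short generation: {short_gen_s:.2f} s")
```

With these assumptions the long prompt spends about 9.7 s in prefill and about 11.8 s in generation, while the short run generates its 256 tokens in about 8.6 s.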

Test plan

make ds4 ds4-server ds4_test
./ds4 --inspect -m ds4flash.gguf
./ds4_test --server
q4 short benchmark command above
q4 long benchmark command above

Dango233 added 2 commits May 10, 2026 19:01

Infer Metal model view overlap from tensors

743f95b

Document M2 Ultra q4 benchmark

a4cbf50

Dango233 wants to merge 2 commits into antirez:main from Dango233:codex/metal-auto-model-overlap
