Add Mac Studio M2 Ultra q2 benchmark results #52

Open
Dango233 wants to merge 2 commits into antirez:main from Dango233:codex/readme-q2-bench-m2-ultra

@Dango233

Summary

Adds q2 benchmark results for a Mac Studio M2 Ultra with 192 GB RAM.

The new M2 Ultra rows are 5-run averages using:

  • --ctx 32768
  • --nothink
  • -sys ""
  • --temp 0
  • -n 256

The long benchmark uses tests/test-vectors/prompts/long_code_audit.txt, which the run reports as 3844 prompt tokens and which triggers chunked Metal prefill.

Results

| Machine | Quant | Prompt | Prefill | Generation |
| --- | --- | --- | --- | --- |
| Mac Studio M2 Ultra, 192 GB | q2 | short, 5-run avg | 51.86 t/s | 30.72 t/s |
| Mac Studio M2 Ultra, 192 GB | q2 | 3844 tokens, 5-run avg | 401.58 t/s | 22.56 t/s |
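To put the long-prompt throughput in wall-clock terms, the token counts can be divided by the reported rates. This is only a rough sketch: it assumes steady throughput, ignores model load time, and assumes all 256 requested tokens are actually generated. `seconds_for` is an illustrative helper, not something from the repository.

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Approximate wall-clock seconds to process `tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Figures taken from the M2 Ultra long-prompt row: 3844 prompt tokens, -n 256.
prefill_s = seconds_for(3844, 401.58)
generation_s = seconds_for(256, 22.56)
print(f"prefill ~{prefill_s:.1f} s, generation ~{generation_s:.1f} s")
```

Under those assumptions, prefill and generation take a similar amount of time for this prompt, roughly ten seconds each.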

Benchmark Runs

Short prompt:

  • Prefill avg: 51.86 t/s, sample sd: 0.66
  • Generation avg: 30.72 t/s, sample sd: 0.04

Long prompt:

  • Prefill avg: 401.58 t/s, sample sd: 0.62
  • Generation avg: 22.56 t/s, sample sd: 0.07
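For anyone reproducing these summaries, the average and sample sd can be computed from per-run readings roughly as follows. This sketch assumes "sample sd" means the standard deviation with the n-1 denominator, which is what Python's `statistics.stdev` computes. The readings in the example are placeholders, not measurements from this PR, and `summarize` is a hypothetical helper name.

```python
import statistics


def summarize(samples: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) for a list of t/s readings."""
    if len(samples) < 2:
        raise ValueError("need at least two runs for a sample sd")
    return statistics.mean(samples), statistics.stdev(samples)


# Placeholder readings for illustration only, not data from this PR.
runs = [10.0, 11.0, 12.0, 13.0, 14.0]
avg, sd = summarize(runs)
print(f"avg: {avg:.2f} t/s, sample sd: {sd:.2f}")
```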

@Dango233 force-pushed the codex/readme-q2-bench-m2-ultra branch from b690d25 to 59b50b5 on May 10, 2026 03:49
@gmontana

FWIW, here are M4 Max (40-core GPU), 128 GB numbers under the same protocol as this PR. The laptop chassis throttles by ~30% over 5 back-to-back long runs, so I added 2 warmup passes and 60 s cooldowns between measured runs to stay out of the throttling regime. No other deviations.
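The warmup-and-cooldown schedule described here could be scripted along these lines. This is a sketch under assumptions: `run_once` is a hypothetical callable that performs one benchmark run and returns a t/s reading, the cooldown is applied only between measured runs (not after the warmups), and the injectable `sleep` parameter exists just so the demo finishes instantly.

```python
import time
from typing import Callable


def measure_with_cooldown(
    run_once: Callable[[], float],
    warmups: int = 2,
    runs: int = 5,
    cooldown_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[float]:
    """Discard warmup passes, then collect `runs` readings with pauses between them."""
    for _ in range(warmups):
        run_once()  # warmup result is intentionally discarded
    readings = []
    for i in range(runs):
        if i > 0:
            sleep(cooldown_s)  # cool down between measured runs only
        readings.append(run_once())
    return readings


# Demo with a stand-in benchmark and a no-op sleep; the values are placeholders.
fake_readings = iter([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
print(measure_with_cooldown(lambda: next(fake_readings), sleep=lambda s: None))
```

In the demo the first two values are consumed by the warmup passes, so only the last five are returned as measurements.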

| Machine | Quant | Prompt | Prefill | Generation |
| --- | --- | --- | --- | --- |
| MacBook Pro M4 Max (40-core GPU), 128 GB | q2 | short, 5-run avg | 63.16 t/s | 32.23 t/s |
| MacBook Pro M4 Max (40-core GPU), 128 GB | q2 | 3844 tokens, 5-run avg | 302.33 t/s | 24.84 t/s |

Benchmark Runs

Short prompt:

  • Prefill avg: 63.16 t/s, sample sd: 1.02
  • Generation avg: 32.23 t/s, sample sd: 0.23

Long prompt:

  • Prefill avg: 302.33 t/s, sample sd: 3.73
  • Generation avg: 24.84 t/s, sample sd: 0.59

Happy to send a separate PR with the M4 Max row once the protocol is settled.
