Pull requests: danveloper/flash-moe
Pull requests list
fix: 8-bit dequant for MLX mixed-precision gate quantization
#14 opened Mar 23, 2026 by userFRM, 4 tasks
feat: cache-aware routing + co-activation expert clustering
#12 opened Mar 23, 2026 by userFRM, 6 tasks
perf: 5 pure-win optimizations — zero quality tradeoff
#11 opened Mar 22, 2026 by userFRM, 4 tasks
feat: CUDA/NVIDIA port — Qwen3.5-397B on single GPU at 5.35 tok/s (5.86 peak)
#7 opened Mar 22, 2026 by ssubbotin, 10 tasks done
feat: live dashboard monitor + serve loop improvements
#5 opened Mar 21, 2026 by msitarzewski, 5 tasks
feat: runtime model config from HuggingFace config.json
#3 opened Mar 20, 2026 by Alexintosh, 6 tasks
Fix portability: runtime paths, missing setup scripts, vocab format bug
#1 opened Mar 19, 2026 by msitarzewski, 5 tasks