Commit f74cb67
docs: add HIP/AMD NaN warning for q8_0/turbo3 on large K-norm models
Adds a prominent WARNING block to turboquant-recommendations.md documenting
the observed NaN divergence when using q8_0 or turbo3 compression on models
with large K-vector norms (e.g. Qwen2.5-7B) on AMD/ROCm (HIP) backends.
The root cause is the int8 overflow path that differs between HIP and CUDA.
Recommended mitigations: switch to turbo2/turbo4 or add pre-quantization
K-norm clipping.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 46efe26 commit f74cb67
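The commit message names pre-quantization K-norm clipping as a mitigation but does not show it. A minimal sketch of the idea follows, assuming clipping means rescaling a K vector so its L2 norm stays under a threshold before it reaches the quantizer. The function name `clip_k_norm` and the `max_norm` parameter are hypothetical and do not come from the commit or the project.

```python
import math


def clip_k_norm(k, max_norm):
    """Return a copy of k whose L2 norm is at most max_norm.

    Hypothetical helper: max_norm is an assumed, model-specific threshold,
    not a value taken from the commit.
    """
    norm = math.sqrt(sum(x * x for x in k))
    # Vectors already within the bound (including the zero vector) pass through.
    if norm <= max_norm:
        return list(k)
    # Scale every component by the same factor so the direction is preserved.
    scale = max_norm / norm
    return [x * scale for x in k]


# Usage: a vector of norm 5.0 clipped to norm 2.5.
clipped = clip_k_norm([3.0, 4.0], 2.5)
```

Rescaling a K vector also rescales its attention logits, so any threshold would need to be checked against model quality before use.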
1 file changed
Lines changed: 4 additions & 0 deletions
Diff summary: new lines 49-52 added after original line 48 (line contents not captured); original lines 46-48 and 49-51 unchanged as context.