Commit 5e6529e

docs: add audio processing recommendation to Gemma4ChatHandler
- Recommend BF16 mmproj for Gemma4 E2B and E4B models.
- Note known degraded audio performance with other quantizations.
- Add reference link to the relevant llama.cpp PR/issue comment.

Signed-off-by: JamePeng <jame_peng@sina.com>

1 parent: 4ec15ac

1 file changed

Lines changed: 5 additions & 0 deletions

File tree

llama_cpp/llama_chat_format.py

@@ -4342,6 +4342,11 @@ class Gemma4ChatHandler(MTMDChatHandler):
         Note on `enable_thinking`:
             The `enable_thinking` toggle is currently ONLY supported by Gemma4 31B and 26BA4B models.
             It is NOT supported by Gemma4 E2B and E4B models.
+
+        [Important Note for Audio Processing!]
+            It is recommended to use a BF16 mmproj for Gemma4 E2B and E4B models;
+            other quantizations are known to have degraded audio performance.
+            ref comment: https://github.com/ggml-org/llama.cpp/pull/21421#issuecomment-4230306463
         """

     # The special token in Gemma 4
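The recommendation in the added docstring can be sketched as a small helper. This is a hypothetical function for illustration only (`recommended_mmproj_suffix` is not part of llama-cpp-python, and the non-E2B/E4B default is an assumption); it encodes the note that Gemma4 E2B and E4B audio processing should use a BF16 mmproj:

```python
def recommended_mmproj_suffix(variant: str) -> str:
    """Suggest an mmproj quantization suffix for a Gemma4 variant.

    Per the docstring note: E2B and E4B need BF16 for audio processing;
    other quantizations are reported to degrade audio quality.
    """
    if variant.upper() in ("E2B", "E4B"):
        return "bf16"
    # Assumption: other variants are not covered by the audio note,
    # so a plain F16 mmproj is used as a placeholder default here.
    return "f16"


print(recommended_mmproj_suffix("E2B"))  # prints "bf16"
```

A caller would then pick the mmproj file (e.g. `mmproj-gemma4-E2B-bf16.gguf`, filename hypothetical) based on this suffix before constructing the chat handler.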
