Skip to content

Commit e7976f4

Browse files
committed
feat(mtmd): improve fallback chat template for multimodal models
- Add BOS/EOS token handling to the default MTMD chat format. - Use a clearer role-based template with explicit USER and ASSISTANT prefixes. - Append a newline after each message to keep generated prompts readable. - Treat EOS as the end marker for the serialized conversation history before the optional generation prompt. - Improve fallback behavior for multimodal GGUF models that do not provide a chat template, such as OCR-oriented models like DeepSeek-OCR 1/2. - Make the default system prompt a single normalized string while preserving its original meaning. - Clean up minor formatting around MTMD context parameter initialization. This improves prompt compatibility for multimodal models that either lack a GGUF chat template or are not yet covered by a complete custom chat handler. Signed-off-by: JamePeng <jame_peng@sina.com>
1 parent 69e740c commit e7976f4

1 file changed

Lines changed: 15 additions & 12 deletions

File tree

llama_cpp/llama_chat_format.py

Lines changed: 15 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -2811,21 +2811,20 @@ def generate_streaming(tools, functions, function_call, prompt):
28112811

28122812
class MTMDChatHandler:
28132813
DEFAULT_SYSTEM_MESSAGE: Optional[str] = (
2814-
"""You are an exceptionally capable, precise, and helpful multimodal AI assistant that excels at deeply understanding and richly describing images, charts, diagrams, text in images, scenes, and any visual content,
2815-
while also answering every question accurately, clearly, and step-by-step when appropriate — always responding in the same language as the user's question, remaining polite, professional, and maximally helpful."""
2814+
"You are an exceptionally capable, precise, and helpful multimodal AI assistant that excels at deeply understanding and richly describing images, charts, diagrams, text in images, scenes, and any visual content, "
2815+
"while also answering every question accurately, clearly, and step-by-step when appropriate — always responding in the same language as the user's question, remaining polite, professional, and maximally helpful."
28162816
)
28172817

28182818
CHAT_FORMAT = (
2819+
"{{ bos_token if bos_token is defined else '' }}"
28192820
"{% for message in messages %}"
28202821
"{% if message.role == 'system' %}"
28212822
"{{ message.content }}"
2822-
"{% endif %}"
2823-
2824-
"{% if message.role == 'user' %}"
2823+
"{% elif message.role == 'user' %}"
2824+
"USER: "
28252825
"{% if message.content is string %}"
2826-
"\nUSER: {{ message.content }}"
2826+
"{{ message.content }}"
28272827
"{% elif message.content is iterable %}"
2828-
"\nUSER: "
28292828
"{% for content in message.content %}"
28302829
"{% if content.type == 'image_url' %}"
28312830
"{{ content.image_url if content.image_url is string else content.image_url.url }}"
@@ -2842,15 +2841,19 @@ class MTMDChatHandler:
28422841
"{% endif %}"
28432842
"{% endfor %}"
28442843
"{% endif %}"
2845-
"{% endif %}"
28462844

2847-
"{% if message.role == 'assistant' and message.content is not none %}"
2848-
"\nASSISTANT: {{ message.content }}"
2845+
"{% elif message.role == 'assistant' and message.content is not none %}"
2846+
"ASSISTANT: {{ message.content }}"
28492847
"{% endif %}"
2848+
"{{ \"\n\" }}"
28502849
"{% endfor %}"
28512850

2851+
"{% if eos_token is defined %}"
2852+
"{{ eos_token }}"
2853+
"{% endif %}"
2854+
28522855
"{% if add_generation_prompt %}"
2853-
"\nASSISTANT: "
2856+
"ASSISTANT: "
28542857
"{% endif %}"
28552858
)
28562859

@@ -2906,7 +2909,7 @@ def _init_mtmd_context(self, llama_model: llama_core.Llama):
29062909
self.mctx_params.use_gpu = self.use_gpu
29072910
self.mctx_params.print_timings = self.verbose
29082911
self.mctx_params.n_threads = llama_model.n_threads
2909-
self.mctx_params.flash_attn_type = self._mtmd_cpp.clip_flash_attn_type.CLIP_FLASH_ATTN_TYPE_AUTO
2912+
self.mctx_params.flash_attn_type = self._mtmd_cpp.clip_flash_attn_type.CLIP_FLASH_ATTN_TYPE_AUTO
29102913
self.mctx_params.warmup = True
29112914
if self.image_min_tokens > 0:
29122915
self.mctx_params.image_min_tokens = self.image_min_tokens

0 commit comments

Comments
 (0)