NAME          ID              SIZE      PROCESSOR    UNTIL
qwen3:0.6b    7df6b6e09427    2.3 GB    100% GPU     4 minutes from now
```
Using this model in Rigging is as simple as using the `ollama/` or `ollama_chat/` prefixes:
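For example, a minimal sketch (the model tag matches the `ollama ps` output above; the async chat pipeline follows Rigging's usual pattern, so treat the exact calls as illustrative):

```python
import asyncio

import rigging as rg


async def main() -> None:
    # The "ollama/" prefix routes the request to a local Ollama server.
    generator = rg.get_generator("ollama/qwen3:0.6b")
    chat = await generator.chat("Say hello in one short sentence.").run()
    print(chat.last.content)


asyncio.run(main())
```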
<Warning>
Ollama enforces a maximum context length on the server, 4096 tokens by default. This limit does not change per model and can only be raised through server configuration.
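As a sketch of raising the limit (assumptions: a recent Ollama release that honors the `OLLAMA_CONTEXT_LENGTH` environment variable, and the official `ollama` Python client for the per-request option):

```python
import ollama

# Server-wide: start the server with a larger default context window, e.g.
#   OLLAMA_CONTEXT_LENGTH=8192 ollama serve
# Per-request: pass num_ctx in the options dict when calling the API directly.
response = ollama.chat(
    model="qwen3:0.6b",
    messages=[{"role": "user", "content": "Summarize this long document ..."}],
    options={"num_ctx": 8192},  # raise the context window for this request
)
print(response["message"]["content"])
```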

If the input messages sent to the API would exceed this length, Ollama silently truncates them to fit the context window. This can cause unexpected generation results due to the missing context and, because the response gives no explicit signal, is very difficult to detect in Rigging.

We make a best effort to catch this by monitoring model responses and checking whether the reported input token count is far lower than the size of the input messages we just sent. If this is observed, the following warning is emitted:

```
GeneratorWarning: Input messages may have been truncated ...
```
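The check is a heuristic; here is a rough sketch of the idea (the helper name, the characters-per-token estimate, and the threshold are illustrative, not Rigging's actual implementation):

```python
import warnings


def warn_if_truncated(sent_text: str, reported_input_tokens: int) -> None:
    # Crude size estimate: ~4 characters per token for English text (assumption).
    estimated_tokens = len(sent_text) // 4
    # If the server reports processing far fewer tokens than we believe we
    # sent, the prompt was likely truncated to fit the context window.
    if reported_input_tokens < estimated_tokens // 2:
        warnings.warn(
            "Input messages may have been truncated "
            f"(~{estimated_tokens} tokens sent, {reported_input_tokens} reported)"
        )
```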
When in doubt, monitor the Ollama server logs for the following: