This repository was archived by the owner on May 5, 2026. It is now read-only.
Merged
40 commits
d0fd92c
fix(security): stop implicit tool grants from config sections (#47487…
timeleft-- May 1, 2026
c79a6d7
fix(gateway): align sessions abort wait semantics (#74751) thanks @Bu…
BunsDev Apr 30, 2026
c1592bb
fix(cron): preserve model overrides for text payloads (#73946)
openclaw-clownfish[bot] Apr 29, 2026
3d6dbcc
fix(exec): preserve turnSourceChannel as messageProvider in approval …
hclsys Apr 30, 2026
ef6c4f7
fix(feishu): skip empty-text messages with no media to prevent blank …
hclsys Apr 30, 2026
603b113
fix(security): bound bootstrap handoff scopes (#72919)
timeleft-- May 1, 2026
244bb15
fix(security): remediate CodeQL alerts (#7c5bf1c675)
timeleft-- May 1, 2026
a58c4d8
fix(device-pair): reject invalid remote setup URLs (#7c51cd2baf)
timeleft-- May 1, 2026
7bec804
fix: gate startup context for sandboxed spawned sessions (#73611)
timeleft-- May 1, 2026
3c3da1c
fix(gateway): preserve rpc abort terminal snapshots (#0459206c40)
timeleft-- May 1, 2026
500aa6e
fix: environment edge case launcher regression (#74696)
timeleft-- May 1, 2026
86fcceb
fix(agents): finalize embedded lifecycle backstop (#ebff12e84f)
timeleft-- May 1, 2026
63841d5
fix(agents): preserve string user content when merging turns (#9061d1…
timeleft-- May 1, 2026
d405c0c
fix: derive dynamic context-window guard thresholds (#13e917e292)
timeleft-- May 1, 2026
f4e3f90
fix: reject invalid cron edits on disabled jobs (#74720)
timeleft-- May 1, 2026
037371f
fix(cron): catch croner parse errors in add/update handlers (#74193)
timeleft-- May 1, 2026
0a6d845
fix: accept previously documented WhatsApp exposeErrorText key (#74667)
timeleft-- May 1, 2026
515218c
fix: interpolate heartbeat response prefix templates (#73996)
timeleft-- May 1, 2026
0977380
fix(acp): fall through to thread-bound resolution on unresolvable tok…
timeleft-- May 1, 2026
caaf910
fix(mattermost): add WebSocket ping/pong keepalive (#73979)
timeleft-- May 1, 2026
1cfe22e
chore(release): bump version to 1.0.1-rc.1
timeleft-- May 1, 2026
77e1ece
docs(changelog): rewrite v1.0.1-rc.1 entries to RC contents only
timeleft-- May 2, 2026
b07769b
fix(cron): accept threaded delivery in gateway schema (b6be422306)
timeleft-- May 2, 2026
eee00c3
test(cron): mock loadConfig in cron.validation.test.ts for baseline
timeleft-- May 2, 2026
183ac16
test(cron): add missing loadCronStore import to service.issue-regress…
timeleft-- May 2, 2026
0e512f0
fix(outbound): hold active-delivery claim so reconnect drain skips li…
timeleft-- May 2, 2026
7155928
fix: isolate cron context-engine session keys (#72292) (a3c51f91c5)
timeleft-- May 2, 2026
1a15ec9
fix(cron): preserve current delivery target context (e309fd485e)
timeleft-- May 2, 2026
33f63ed
test(mattermost): skip unrelated post-baseline routing test
timeleft-- May 2, 2026
39ae761
test(qr-cli): skip URL-validation test that needs stricter parser
timeleft-- May 2, 2026
64a51d0
test(whatsapp): trim post-baseline systemPrompt tests from cherry-pick
timeleft-- May 2, 2026
52cf3e4
test(pairing): trim setup-code tests that need stricter URL parser
timeleft-- May 2, 2026
0a4e4f9
test(acp): skip 5 post-baseline ACP feature tests from cherry-pick
timeleft-- May 2, 2026
17bae54
test(device-pair): skip 9 URL-validation tests needing stricter parser
timeleft-- May 2, 2026
82a5cdc
test(agents): skip 16 post-baseline sanitize-history tests from cherr…
timeleft-- May 2, 2026
be4cdc3
test(auto-reply): skip 21 post-baseline tests in agent-runner-execution
timeleft-- May 2, 2026
86133f7
docs(changelog): add MK-51 / MK-52 fix entries to v1.0.1-rc.1
timeleft-- May 2, 2026
d316f10
fix(gateway): import isAbortError in agent.ts (PR #4 review fix)
timeleft-- May 2, 2026
570d4bb
test(gateway): skip 41 post-baseline tests across agent/abort/dedupe
timeleft-- May 2, 2026
c930fbb
fix(gateway): remove dead refs to upstream task-tracking helpers
timeleft-- May 2, 2026
85 changes: 84 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,90 @@

Docs: https://docs.openclaw.ai

## Unreleased
## ProdClaw 1.0.1-rc.1

24 cherry-picked fixes from upstream OpenClaw onto the v2026.4.20 baseline.
Each fix targets code present at the baseline and is self-contained (no
dependency on newer feature-train code). Upstream PR/issue numbers are noted
where available; commit references are upstream OpenClaw SHAs preserved in
this fork. The MK-51 / MK-52 fixes resolve the production incidents
documented in the linked Iris tickets.

### Security

- Stop implicit tool grants from config sections (upstream #47487, #75055).
- Bound bootstrap handoff scopes in device pairing (upstream #72919).
- Reject invalid remote setup URLs in device pairing (commit `7c51cd2baf`).
- CodeQL: iterative HTML tag stripping prevents nested-tag bypass; timing-safe
secret comparison (commit `7c5bf1c675`).
- Gate startup context for sandboxed spawned sessions (upstream #73611).

### Crash / Hang

- Gateway: align sessions abort wait semantics so abort no longer hangs
(upstream #74751). Thanks @BunsDev.
- Embedded agent runs: lifecycle backstop ensures runs finalize when the
embedded path fails to terminate cleanly (commit `ebff12e84f`).
- Launcher: handle empty-string `NODE_COMPILE_CACHE` env so the launcher does
not crash on edge-case environments (upstream #74696).

### Correctness — Gateway / Sessions

- **Outbound: hold active-delivery claim so reconnect drain skips live sends**
(commit `c94a8702c7`). Part of the MK-51 fix train. Prevents reconnect
drain from re-driving an entry that the live send is still writing to
the adapter.
- Gateway: preserve RPC abort terminal snapshots so wait-for-completion
clients receive the final state on aborted runs (commit `0459206c40`).
- Agents: preserve string user content when merging turns; normalize
string-form content to content-part arrays before merge (commit
`9061d1e4c3`).
- Derive dynamic context-window guard thresholds from model capabilities
instead of hardcoded values (commit `13e917e292`).

### Correctness — Cron / Commands / Config

- **Cron: preserve current delivery target context** (commit `e309fd485e`).
Resolves the WOD/Fajr scheduler incident (Iris MK-52). Cron announce jobs
created from a Telegram (or other channel) context now persist the
current delivery target metadata so unattended runs deliver to the
originating chat instead of failing with
"Delivering to <channel> requires target <chatId>".
- **Cron: isolate cron context-engine session keys** (upstream #72292,
commit `a3c51f91c5`). Resolves the stale-WOD-context incident
(Iris MK-51). Threads `runSessionKey` through the isolated-agent
execution context so cron-emitted system events no longer accumulate
against the main session and bleed into the next user message.
- Cron: preserve model overrides for text-mode cron payloads (upstream
#73946).
- Cron: reject invalid cron edits on disabled jobs to prevent silent state
corruption (upstream #74720).
- Cron: catch croner parse errors in `cron.add` and `cron.update` handlers
so bad expressions return a structured error instead of crashing the
gateway (upstream #74193).
- Cron: accept `delivery.threadId` (string or number) in the gateway schema
for threaded announce delivery, e.g. Telegram forum topics (commit
`b6be422306`).
- Config: accept the previously documented WhatsApp `exposeErrorText` key
to prevent validation failures on existing configs (upstream #74667).

### Correctness — Channels / Delivery

- Exec: preserve `turnSourceChannel` as `messageProvider` in approval
followup runs (upstream #74666).
- Feishu: skip empty-text messages with no media to prevent blank session
turns (upstream #74634, #74661).
- Heartbeat: interpolate response prefix templates so variables like
`{model}` render instead of appearing literally (upstream #73996).
Thanks @yweiii and @JunJD.
- ACP: fall through to thread-bound resolution when an ACP token is
unresolvable, instead of failing auto-reply silently (upstream #66299,
#74641).

### Correctness — Extensions

- Mattermost: WebSocket ping/pong keepalive prevents idle connection drops
on servers with aggressive timeout policies (upstream #73979).

## 2026.4.20

2 changes: 1 addition & 1 deletion docs/automation/cron-jobs.md
@@ -129,7 +129,7 @@ retries, cron aborts instead of looping forever.
| `webhook` | POST finished event payload to a URL |
| `none` | Internal only, no delivery |

Use `--announce --channel telegram --to "-1001234567890"` for channel delivery. For Telegram forum topics, use `-1001234567890:topic:123`. Slack/Discord/Mattermost targets should use explicit prefixes (`channel:<id>`, `user:<id>`).
Use `--announce --channel telegram --to "-1001234567890"` for channel delivery. For Telegram forum topics, use `-1001234567890:topic:123`; direct RPC/config callers may also pass `delivery.threadId` as a string or number. Slack/Discord/Mattermost targets should use explicit prefixes (`channel:<id>`, `user:<id>`).
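
For direct RPC/config callers, a threaded announce target might look like the sketch below (surrounding job fields are omitted, and the exact schema shape is an assumption; verify against your gateway version):

```json5
{
  delivery: {
    channel: "telegram",
    to: "-1001234567890",
    threadId: 123, // forum topic id; the string form "123" is also accepted
  },
}
```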

For cron-owned isolated jobs, the runner owns the final delivery path. The
agent is prompted to return a plain-text summary, and that summary is then sent
199 changes: 176 additions & 23 deletions docs/gateway/local-models.md
@@ -4,15 +4,17 @@ read_when:
- You want to serve models from your own GPU box
- You are wiring LM Studio or an OpenAI-compatible proxy
- You need the safest local model guidance
title: "Local Models"
title: "Local models"
---

# Local models

Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

If you want the lowest-friction local setup, start with [LM Studio](/providers/lmstudio) or [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.

<Warning>
**WSL2 + Ollama + NVIDIA/CUDA users:** The official Ollama Linux installer enables a systemd service with `Restart=always`. On WSL2 GPU setups, autostart can reload the last model during boot and pin host memory. If your WSL2 VM repeatedly restarts after enabling Ollama, see [WSL2 crash loop](/providers/ollama#wsl2-crash-loop-repeated-reboots).
</Warning>

## Recommended: LM Studio + large local model (Responses API)

Best current local stack. Load a large model in LM Studio (for example, a full-size Qwen, DeepSeek, or Llama build), enable the local server (default `http://127.0.0.1:1234`), and use Responses API to keep reasoning separate from final text.
@@ -21,26 +23,26 @@ Best current local stack. Load a large model in LM Studio (for example, a full-s
{
agents: {
defaults: {
model: { primary: lmstudio/my-local-model },
model: { primary: "lmstudio/my-local-model" },
models: {
anthropic/claude-opus-4-6: { alias: Opus },
lmstudio/my-local-model: { alias: Local },
"anthropic/claude-opus-4-6": { alias: "Opus" },
"lmstudio/my-local-model": { alias: "Local" },
},
},
},
models: {
mode: merge,
mode: "merge",
providers: {
lmstudio: {
baseUrl: http://127.0.0.1:1234/v1,
apiKey: lmstudio,
api: openai-responses,
baseUrl: "http://127.0.0.1:1234/v1",
apiKey: "lmstudio",
api: "openai-responses",
models: [
{
id: my-local-model,
name: Local Model,
id: "my-local-model",
name: "Local Model",
reasoning: false,
input: [text],
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 196608,
maxTokens: 8192,
@@ -115,17 +117,27 @@ Swap the primary and fallback order; keep the same providers block and `models.m

## Other OpenAI-compatible local proxies

vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID:
MLX (`mlx_lm.server`), vLLM, SGLang, LiteLLM, OAI-proxy, or custom
gateways work if they expose an OpenAI-style `/v1/chat/completions`
endpoint. Use the Chat Completions adapter unless the backend explicitly
documents `/v1/responses` support. Replace the provider block above with your
endpoint and model ID:

```json5
{
agents: {
defaults: {
model: { primary: "local/my-local-model" },
},
},
models: {
mode: "merge",
providers: {
local: {
baseUrl: "http://127.0.0.1:8000/v1",
apiKey: "sk-local",
api: "openai-responses",
api: "openai-completions",
timeoutSeconds: 300,
models: [
{
id: "my-local-model",
Expand All @@ -143,7 +155,35 @@ vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style
}
```

If `api` is omitted on a custom provider with a `baseUrl`, OpenClaw defaults to
`openai-completions`. Loopback endpoints such as `127.0.0.1` are trusted
automatically; LAN, tailnet, and private DNS endpoints still need
`request.allowPrivateNetwork: true`.
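
For instance, a LAN-hosted server might be declared as in the sketch below. This assumes `request.allowPrivateNetwork` sits at the config root, as the dotted path above suggests; confirm its exact location in the configuration reference.

```json5
{
  request: {
    allowPrivateNetwork: true, // opt-in for non-loopback private endpoints
  },
  models: {
    mode: "merge",
    providers: {
      local: {
        baseUrl: "http://192.168.1.50:8000/v1", // LAN endpoint, not loopback
        apiKey: "sk-local",
        api: "openai-completions",
        models: [/* same model entries as above */],
      },
    },
  },
}
```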

The `models.providers.<id>.models[].id` value is provider-local. Do not
include the provider prefix there. For example, an MLX server started with
`mlx_lm.server --model mlx-community/Qwen3-30B-A3B-6bit` should use this
catalog id and model ref:

- `models.providers.mlx.models[].id: "mlx-community/Qwen3-30B-A3B-6bit"`
- `agents.defaults.model.primary: "mlx/mlx-community/Qwen3-30B-A3B-6bit"`
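
Putting that together, a minimal MLX provider block might look like this sketch (the port, context window, and token limits are assumptions; match them to your `mlx_lm.server` flags):

```json5
{
  models: {
    mode: "merge",
    providers: {
      mlx: {
        baseUrl: "http://127.0.0.1:8080/v1", // assumed mlx_lm.server port; adjust to yours
        apiKey: "mlx-local", // non-secret local marker (see the Note below)
        api: "openai-completions",
        models: [
          {
            id: "mlx-community/Qwen3-30B-A3B-6bit", // provider-local id, no "mlx/" prefix
            name: "Qwen3 30B A3B (MLX)",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 32768, // assumption; match your server's configured window
            maxTokens: 8192,
          },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: { primary: "mlx/mlx-community/Qwen3-30B-A3B-6bit" },
    },
  },
}
```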

Set `input: ["text", "image"]` on local or proxied vision models so image
attachments are injected into agent turns. Interactive custom-provider
onboarding infers common vision model IDs and asks only for unknown names.
Non-interactive onboarding uses the same inference; use `--custom-image-input`
for unknown vision IDs or `--custom-text-input` when a known-looking model is
text-only behind your endpoint.
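
A vision-capable catalog entry differs from the text-only entries above only in `input`; the id and limits in this sketch are placeholders:

```json5
{
  id: "my-local-vlm", // placeholder id
  name: "Local vision model",
  reasoning: false,
  input: ["text", "image"], // "image" lets agent turns carry image attachments
  cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  contextWindow: 32768,
  maxTokens: 8192,
}
```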

Keep `models.mode: "merge"` so hosted models stay available as fallbacks.
Use `models.providers.<id>.timeoutSeconds` for slow local or remote model
servers before raising `agents.defaults.timeoutSeconds`. The provider timeout
applies only to model HTTP requests; it governs the connect, header, and
body-streaming phases as well as the overall guarded-fetch abort.

<Note>
For custom OpenAI-compatible providers, persisting a non-secret local marker such as `apiKey: "ollama-local"` is accepted when `baseUrl` resolves to loopback, a private LAN, `.local`, or a bare hostname. OpenClaw treats it as a valid local credential instead of reporting a missing key. Use a real value for any provider that accepts a public hostname.
</Note>

Behavior note for local/proxied `/v1` backends:

@@ -161,15 +201,111 @@ Compatibility notes for stricter OpenAI-compatible backends:
structured content-part arrays. Set
`models.providers.<provider>.models[].compat.requiresStringContent: true` for
those endpoints.
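
A minimal sketch of that flag on a catalog entry (provider and model id are placeholders):

```json5
{
  models: {
    providers: {
      local: {
        // ...baseUrl, apiKey, api as in the provider blocks above...
        models: [
          {
            id: "my-local-model",
            compat: {
              requiresStringContent: true, // send plain strings, not content-part arrays
            },
          },
        ],
      },
    },
  },
}
```
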
- Some local models emit standalone bracketed tool requests as text, such as
`[tool_name]` followed by JSON and `[END_TOOL_REQUEST]`. OpenClaw promotes
those into real tool calls only when the name exactly matches a registered
tool for the turn; otherwise the block is treated as unsupported text and is
hidden from user-visible replies.
- If a model emits JSON, XML, or ReAct-style text that looks like a tool call
but the provider did not emit a structured invocation, OpenClaw leaves it as
text and logs a warning with the run id, provider/model, detected pattern, and
tool name when available. Treat that as provider/model tool-call
incompatibility, not a completed tool run.
- If tools appear as assistant text instead of running, for example raw JSON,
XML, ReAct syntax, or an empty `tool_calls` array in the provider response,
first verify the server is using a tool-call-capable chat template/parser. For
OpenAI-compatible Chat Completions backends whose parser works only when tool
use is forced, set a per-model request override instead of relying on text
parsing:

```json5
{
agents: {
defaults: {
models: {
"local/my-local-model": {
params: {
extra_body: {
tool_choice: "required",
},
},
},
},
},
},
}
```

Use this only for models/sessions where every normal turn should call a tool.
It overrides OpenClaw's default proxy value of `tool_choice: "auto"`.
Replace `local/my-local-model` with the exact provider/model ref shown by
`openclaw models list`.

```bash
openclaw config set agents.defaults.models '{"local/my-local-model":{"params":{"extra_body":{"tool_choice":"required"}}}}' --strict-json --merge
```

- If a custom OpenAI-compatible model accepts OpenAI reasoning efforts beyond
the built-in profile, declare them on the model compat block. Adding `"xhigh"`
here makes `/think xhigh`, session pickers, Gateway validation, and `llm-task`
validation expose the level for that configured provider/model ref:

```json5
{
models: {
providers: {
local: {
baseUrl: "http://127.0.0.1:8000/v1",
apiKey: "sk-local",
api: "openai-responses",
models: [
{
id: "gpt-5.4",
name: "GPT 5.4 via local proxy",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 196608,
maxTokens: 8192,
compat: {
supportedReasoningEfforts: ["low", "medium", "high", "xhigh"],
reasoningEffortMap: { xhigh: "xhigh" },
},
},
],
},
},
},
}
```

- Some smaller or stricter local backends are unstable with OpenClaw's full
agent-runtime prompt shape, especially when tool schemas are included. If the
backend works for tiny direct `/v1/chat/completions` calls but fails on normal
OpenClaw agent turns, first try
agent-runtime prompt shape, especially when tool schemas are included. First
verify the provider path with the lean local probe:

```bash
openclaw infer model run --local --model <provider/model> --prompt "Reply with exactly: pong" --json
```

To verify the Gateway route without the full agent prompt shape, use the
Gateway model probe instead:

```bash
openclaw infer model run --gateway --model <provider/model> --prompt "Reply with exactly: pong" --json
```

Both local and Gateway model probes send only the supplied prompt. The
Gateway probe still validates Gateway routing, auth, and provider selection,
but it intentionally skips prior session transcript, AGENTS/bootstrap context,
context-engine assembly, tools, and bundled MCP servers.

If that succeeds but normal OpenClaw agent turns fail, first try
`agents.defaults.experimental.localModelLean: true` to drop heavyweight
default tools like `browser`, `cron`, and `message`; this is an experimental
flag, not a stable default-mode setting. See
[Experimental Features](/concepts/experimental-features). If that still fails, try
`models.providers.<provider>.models[].compat.supportsTools: false`.
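
Both knobs as a config sketch, with the paths taken verbatim from the text above (the model id is a placeholder):

```json5
{
  agents: {
    defaults: {
      experimental: {
        localModelLean: true, // experimental: drops heavyweight default tools
      },
    },
  },
  models: {
    providers: {
      local: {
        models: [
          {
            id: "my-local-model",
            compat: { supportsTools: false }, // last resort: disable tool schemas entirely
          },
        ],
      },
    },
  },
}
```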

- If the backend still fails only on larger OpenClaw runs, the remaining issue
is usually upstream model/server capacity or a backend bug, not OpenClaw's
transport layer.
Expand All @@ -178,12 +314,29 @@ Compatibility notes for stricter OpenAI-compatible backends:

- Gateway can reach the proxy? `curl http://127.0.0.1:1234/v1/models`.
- LM Studio model unloaded? Reload; cold start is a common “hanging” cause.
- OpenClaw warns when the detected context window is below **32k** and blocks below **16k**. If you hit that preflight, raise the server/model context limit or choose a larger model.
- Local server says `terminated`, `ECONNRESET`, or closes the stream mid-turn?
OpenClaw records a low-cardinality `model.call.error.failureKind` plus the
OpenClaw process RSS/heap snapshot in diagnostics. For LM Studio/Ollama
memory pressure, match that timestamp against the server log or macOS crash /
jetsam log to confirm whether the model server was killed.
- OpenClaw derives context-window preflight thresholds from the detected model window, or from the uncapped model window when `agents.defaults.contextTokens` lowers the effective window. It warns below 20% with an **8k** floor. Hard blocks use the 10% threshold with a **4k** floor, capped to the effective context window so oversized model metadata cannot reject an otherwise valid user cap. If you hit that preflight, raise the server/model context limit or choose a larger model.
- Context errors? Lower `contextWindow` or raise your server limit.
- OpenAI-compatible server returns `messages[].content ... expected a string`?
Add `compat.requiresStringContent: true` on that model entry.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run`
fails on Gemma or another local model? Disable tool schemas first with
`compat.supportsTools: false`, then retest. If the server still crashes only
on larger OpenClaw prompts, treat it as an upstream server/model limitation.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run --local`
fails on Gemma or another local model? Check the provider URL, model ref, auth
marker, and server logs first; local `model run` does not include agent tools.
If local `model run` succeeds but larger agent turns fail, reduce the agent
tool surface with `localModelLean` or `compat.supportsTools: false`.
- Tool calls show up as raw JSON/XML/ReAct text, or the provider returns an
empty `tool_calls` array? Do not add a proxy that blindly converts assistant
text into tool execution. Fix the server chat template/parser first. If the
model only works when tool use is forced, add the per-model
`params.extra_body.tool_choice: "required"` override above and use that model
entry only for sessions where a tool call is expected on every turn.
- Safety: local models skip provider-side filters; keep agents narrow and compaction on to limit prompt injection blast radius.

## Related

- [Configuration reference](/gateway/configuration-reference)
- [Model failover](/concepts/model-failover)