This repository was archived by the owner on May 5, 2026. It is now read-only.
Merged
40 commits
d0fd92c
fix(security): stop implicit tool grants from config sections (#47487…
timeleft-- May 1, 2026
c79a6d7
fix(gateway): align sessions abort wait semantics (#74751) thanks @Bu…
BunsDev Apr 30, 2026
c1592bb
fix(cron): preserve model overrides for text payloads (#73946)
openclaw-clownfish[bot] Apr 29, 2026
3d6dbcc
fix(exec): preserve turnSourceChannel as messageProvider in approval …
hclsys Apr 30, 2026
ef6c4f7
fix(feishu): skip empty-text messages with no media to prevent blank …
hclsys Apr 30, 2026
603b113
fix(security): bound bootstrap handoff scopes (#72919)
timeleft-- May 1, 2026
244bb15
fix(security): remediate CodeQL alerts (#7c5bf1c675)
timeleft-- May 1, 2026
a58c4d8
fix(device-pair): reject invalid remote setup URLs (#7c51cd2baf)
timeleft-- May 1, 2026
7bec804
fix: gate startup context for sandboxed spawned sessions (#73611)
timeleft-- May 1, 2026
3c3da1c
fix(gateway): preserve rpc abort terminal snapshots (#0459206c40)
timeleft-- May 1, 2026
500aa6e
fix: environment edge case launcher regression (#74696)
timeleft-- May 1, 2026
86fcceb
fix(agents): finalize embedded lifecycle backstop (#ebff12e84f)
timeleft-- May 1, 2026
63841d5
fix(agents): preserve string user content when merging turns (#9061d1…
timeleft-- May 1, 2026
d405c0c
fix: derive dynamic context-window guard thresholds (#13e917e292)
timeleft-- May 1, 2026
f4e3f90
fix: reject invalid cron edits on disabled jobs (#74720)
timeleft-- May 1, 2026
037371f
fix(cron): catch croner parse errors in add/update handlers (#74193)
timeleft-- May 1, 2026
0a6d845
fix: accept previously documented WhatsApp exposeErrorText key (#74667)
timeleft-- May 1, 2026
515218c
fix: interpolate heartbeat response prefix templates (#73996)
timeleft-- May 1, 2026
0977380
fix(acp): fall through to thread-bound resolution on unresolvable tok…
timeleft-- May 1, 2026
caaf910
fix(mattermost): add WebSocket ping/pong keepalive (#73979)
timeleft-- May 1, 2026
1cfe22e
chore(release): bump version to 1.0.1-rc.1
timeleft-- May 1, 2026
77e1ece
docs(changelog): rewrite v1.0.1-rc.1 entries to RC contents only
timeleft-- May 2, 2026
b07769b
fix(cron): accept threaded delivery in gateway schema (b6be422306)
timeleft-- May 2, 2026
eee00c3
test(cron): mock loadConfig in cron.validation.test.ts for baseline
timeleft-- May 2, 2026
183ac16
test(cron): add missing loadCronStore import to service.issue-regress…
timeleft-- May 2, 2026
0e512f0
fix(outbound): hold active-delivery claim so reconnect drain skips li…
timeleft-- May 2, 2026
7155928
fix: isolate cron context-engine session keys (#72292) (a3c51f91c5)
timeleft-- May 2, 2026
1a15ec9
fix(cron): preserve current delivery target context (e309fd485e)
timeleft-- May 2, 2026
33f63ed
test(mattermost): skip unrelated post-baseline routing test
timeleft-- May 2, 2026
39ae761
test(qr-cli): skip URL-validation test that needs stricter parser
timeleft-- May 2, 2026
64a51d0
test(whatsapp): trim post-baseline systemPrompt tests from cherry-pick
timeleft-- May 2, 2026
52cf3e4
test(pairing): trim setup-code tests that need stricter URL parser
timeleft-- May 2, 2026
0a4e4f9
test(acp): skip 5 post-baseline ACP feature tests from cherry-pick
timeleft-- May 2, 2026
17bae54
test(device-pair): skip 9 URL-validation tests needing stricter parser
timeleft-- May 2, 2026
82a5cdc
test(agents): skip 16 post-baseline sanitize-history tests from cherr…
timeleft-- May 2, 2026
be4cdc3
test(auto-reply): skip 21 post-baseline tests in agent-runner-execution
timeleft-- May 2, 2026
86133f7
docs(changelog): add MK-51 / MK-52 fix entries to v1.0.1-rc.1
timeleft-- May 2, 2026
d316f10
fix(gateway): import isAbortError in agent.ts (PR #4 review fix)
timeleft-- May 2, 2026
570d4bb
test(gateway): skip 41 post-baseline tests across agent/abort/dedupe
timeleft-- May 2, 2026
c930fbb
fix(gateway): remove dead refs to upstream task-tracking helpers
timeleft-- May 2, 2026
85 changes: 84 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,90 @@

Docs: https://docs.openclaw.ai

## Unreleased
## ProdClaw 1.0.1-rc.1

24 cherry-picked fixes from upstream OpenClaw onto the v2026.4.20 baseline.
Each fix targets code present at the baseline and is self-contained (no
dependency on newer feature-train code). Upstream PR/issue numbers are noted
where available; commit references are upstream OpenClaw SHAs preserved in
this fork. The MK-51 / MK-52 fixes resolve the production incidents
documented in the linked Iris tickets.

### Security

- Stop implicit tool grants from config sections (upstream #47487, #75055).
- Bound bootstrap handoff scopes in device pairing (upstream #72919).
- Reject invalid remote setup URLs in device pairing (commit `7c51cd2baf`).
- CodeQL: iterative HTML tag stripping prevents nested-tag bypass; timing-safe
secret comparison (commit `7c5bf1c675`).
- Gate startup context for sandboxed spawned sessions (upstream #73611).

### Crash / Hang

- Gateway: align sessions abort wait semantics so abort no longer hangs
(upstream #74751). Thanks @BunsDev.
- Embedded agent runs: lifecycle backstop ensures runs finalize when the
embedded path fails to terminate cleanly (commit `ebff12e84f`).
- Launcher: handle empty-string `NODE_COMPILE_CACHE` env so the launcher does
not crash on edge-case environments (upstream #74696).

### Correctness — Gateway / Sessions

- **Outbound: hold active-delivery claim so reconnect drain skips live sends**
(commit `c94a8702c7`). Part of the MK-51 fix train. Prevents reconnect
drain from re-driving an entry that the live send is still writing to
the adapter.
- Gateway: preserve RPC abort terminal snapshots so wait-for-completion
clients receive the final state on aborted runs (commit `0459206c40`).
- Agents: preserve string user content when merging turns; normalize
string-form content to content-part arrays before merge (commit
`9061d1e4c3`).
- Derive dynamic context-window guard thresholds from model capabilities
instead of hardcoded values (commit `13e917e292`).

### Correctness — Cron / Commands / Config

- **Cron: preserve current delivery target context** (commit `e309fd485e`).
Resolves the WOD/Fajr scheduler incident (Iris MK-52). Cron announce jobs
created from a Telegram (or other channel) context now persist the
current delivery target metadata so unattended runs deliver to the
originating chat instead of failing with
"Delivering to <channel> requires target <chatId>".
- **Cron: isolate cron context-engine session keys** (upstream #72292,
commit `a3c51f91c5`). Resolves the stale-WOD-context incident
(Iris MK-51). Threads `runSessionKey` through the isolated-agent
execution context so cron-emitted system events no longer accumulate
against the main session and bleed into the next user message.
- Cron: preserve model overrides for text-mode cron payloads (upstream
#73946).
- Cron: reject invalid cron edits on disabled jobs to prevent silent state
corruption (upstream #74720).
- Cron: catch croner parse errors in `cron.add` and `cron.update` handlers
so bad expressions return a structured error instead of crashing the
gateway (upstream #74193).
- Cron: accept `delivery.threadId` (string or number) in the gateway schema
for threaded announce delivery, e.g. Telegram forum topics (commit
`b6be422306`).
- Config: accept the previously documented WhatsApp `exposeErrorText` key
to prevent validation failures on existing configs (upstream #74667).

### Correctness — Channels / Delivery

- Exec: preserve `turnSourceChannel` as `messageProvider` in approval
followup runs (upstream #74666).
- Feishu: skip empty-text messages with no media to prevent blank session
turns (upstream #74634, #74661).
- Heartbeat: interpolate response prefix templates so variables like
`{model}` render instead of appearing literally (upstream #73996).
Thanks @yweiii and @JunJD.
- ACP: fall through to thread-bound resolution when an ACP token is
unresolvable, instead of failing auto-reply silently (upstream #66299,
#74641).

### Correctness — Extensions

- Mattermost: WebSocket ping/pong keepalive prevents idle connection drops
on servers with aggressive timeout policies (upstream #73979).

## 2026.4.20

2 changes: 1 addition & 1 deletion docs/automation/cron-jobs.md
@@ -129,7 +129,7 @@ retries, cron aborts instead of looping forever.
| `webhook` | POST finished event payload to a URL |
| `none` | Internal only, no delivery |

Use `--announce --channel telegram --to "-1001234567890"` for channel delivery. For Telegram forum topics, use `-1001234567890:topic:123`. Slack/Discord/Mattermost targets should use explicit prefixes (`channel:<id>`, `user:<id>`).
Use `--announce --channel telegram --to "-1001234567890"` for channel delivery. For Telegram forum topics, use `-1001234567890:topic:123`; direct RPC/config callers may also pass `delivery.threadId` as a string or number. Slack/Discord/Mattermost targets should use explicit prefixes (`channel:<id>`, `user:<id>`).
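
For direct RPC/config callers, a threaded announce target might look like the sketch below (surrounding job fields are omitted, and the exact schema shape is an assumption; verify against your gateway version):

```json5
{
  delivery: {
    channel: "telegram",
    to: "-1001234567890",
    threadId: 123, // forum topic id; the string form "123" is also accepted
  },
}
```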

For cron-owned isolated jobs, the runner owns the final delivery path. The
agent is prompted to return a plain-text summary, and that summary is then sent
199 changes: 176 additions & 23 deletions docs/gateway/local-models.md
@@ -4,15 +4,17 @@ read_when:
- You want to serve models from your own GPU box
- You are wiring LM Studio or an OpenAI-compatible proxy
- You need the safest local model guidance
title: "Local Models"
title: "Local models"
---

# Local models

Local is doable, but OpenClaw expects large context + strong defenses against prompt injection. Small cards truncate context and leak safety. Aim high: **≥2 maxed-out Mac Studios or equivalent GPU rig (~$30k+)**. A single **24 GB** GPU works only for lighter prompts with higher latency. Use the **largest / full-size model variant you can run**; aggressively quantized or “small” checkpoints raise prompt-injection risk (see [Security](/gateway/security)).

If you want the lowest-friction local setup, start with [LM Studio](/providers/lmstudio) or [Ollama](/providers/ollama) and `openclaw onboard`. This page is the opinionated guide for higher-end local stacks and custom OpenAI-compatible local servers.

<Warning>
**WSL2 + Ollama + NVIDIA/CUDA users:** The official Ollama Linux installer enables a systemd service with `Restart=always`. On WSL2 GPU setups, autostart can reload the last model during boot and pin host memory. If your WSL2 VM repeatedly restarts after enabling Ollama, see [WSL2 crash loop](/providers/ollama#wsl2-crash-loop-repeated-reboots).
</Warning>

## Recommended: LM Studio + large local model (Responses API)

Best current local stack. Load a large model in LM Studio (for example, a full-size Qwen, DeepSeek, or Llama build), enable the local server (default `http://127.0.0.1:1234`), and use Responses API to keep reasoning separate from final text.
@@ -21,26 +23,26 @@ Best current local stack. Load a large model in LM Studio (for example, a full-s
{
agents: {
defaults: {
model: { primary: lmstudio/my-local-model },
model: { primary: "lmstudio/my-local-model" },
models: {
anthropic/claude-opus-4-6: { alias: Opus },
lmstudio/my-local-model: { alias: Local },
"anthropic/claude-opus-4-6": { alias: "Opus" },
"lmstudio/my-local-model": { alias: "Local" },
},
},
},
models: {
mode: merge,
mode: "merge",
providers: {
lmstudio: {
baseUrl: http://127.0.0.1:1234/v1,
apiKey: lmstudio,
api: openai-responses,
baseUrl: "http://127.0.0.1:1234/v1",
apiKey: "lmstudio",
api: "openai-responses",
models: [
{
id: my-local-model,
name: Local Model,
id: "my-local-model",
name: "Local Model",
reasoning: false,
input: [text],
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 196608,
maxTokens: 8192,
@@ -115,17 +117,27 @@ Swap the primary and fallback order; keep the same providers block and `models.m

## Other OpenAI-compatible local proxies

vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style `/v1` endpoint. Replace the provider block above with your endpoint and model ID:
MLX (`mlx_lm.server`), vLLM, SGLang, LiteLLM, OAI-proxy, or custom
gateways work if they expose an OpenAI-style `/v1/chat/completions`
endpoint. Use the Chat Completions adapter unless the backend explicitly
documents `/v1/responses` support. Replace the provider block above with your
endpoint and model ID:

```json5
{
agents: {
defaults: {
model: { primary: "local/my-local-model" },
},
},
models: {
mode: "merge",
providers: {
local: {
baseUrl: "http://127.0.0.1:8000/v1",
apiKey: "sk-local",
api: "openai-responses",
api: "openai-completions",
timeoutSeconds: 300,
models: [
{
id: "my-local-model",
Expand All @@ -143,7 +155,35 @@ vLLM, LiteLLM, OAI-proxy, or custom gateways work if they expose an OpenAI-style
}
```

If `api` is omitted on a custom provider with a `baseUrl`, OpenClaw defaults to
`openai-completions`. Loopback endpoints such as `127.0.0.1` are trusted
automatically; LAN, tailnet, and private DNS endpoints still need
`request.allowPrivateNetwork: true`.
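
For instance, a LAN-hosted server might be declared as in the sketch below. This assumes `request.allowPrivateNetwork` sits at the config root, as the dotted path above suggests; confirm its exact location in the configuration reference.

```json5
{
  request: {
    allowPrivateNetwork: true, // opt-in for non-loopback private endpoints
  },
  models: {
    mode: "merge",
    providers: {
      local: {
        baseUrl: "http://192.168.1.50:8000/v1", // LAN endpoint, not loopback
        apiKey: "sk-local",
        api: "openai-completions",
        models: [/* same model entries as above */],
      },
    },
  },
}
```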

The `models.providers.<id>.models[].id` value is provider-local. Do not
include the provider prefix there. For example, an MLX server started with
`mlx_lm.server --model mlx-community/Qwen3-30B-A3B-6bit` should use this
catalog id and model ref:

- `models.providers.mlx.models[].id: "mlx-community/Qwen3-30B-A3B-6bit"`
- `agents.defaults.model.primary: "mlx/mlx-community/Qwen3-30B-A3B-6bit"`
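
Putting that together, a minimal MLX provider block might look like this sketch (the port, context window, and token limits are assumptions; match them to your `mlx_lm.server` flags):

```json5
{
  models: {
    mode: "merge",
    providers: {
      mlx: {
        baseUrl: "http://127.0.0.1:8080/v1", // assumed mlx_lm.server port; adjust to yours
        apiKey: "mlx-local", // non-secret local marker (see the Note below)
        api: "openai-completions",
        models: [
          {
            id: "mlx-community/Qwen3-30B-A3B-6bit", // provider-local id, no "mlx/" prefix
            name: "Qwen3 30B A3B (MLX)",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 32768, // assumption; match your server's configured window
            maxTokens: 8192,
          },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: { primary: "mlx/mlx-community/Qwen3-30B-A3B-6bit" },
    },
  },
}
```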

Set `input: ["text", "image"]` on local or proxied vision models so image
attachments are injected into agent turns. Interactive custom-provider
onboarding infers common vision model IDs and asks only for unknown names.
Non-interactive onboarding uses the same inference; use `--custom-image-input`
for unknown vision IDs or `--custom-text-input` when a known-looking model is
text-only behind your endpoint.
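
A vision-capable catalog entry differs from the text-only entries above only in `input`; the id and limits in this sketch are placeholders:

```json5
{
  id: "my-local-vlm", // placeholder id
  name: "Local vision model",
  reasoning: false,
  input: ["text", "image"], // "image" lets agent turns carry image attachments
  cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
  contextWindow: 32768,
  maxTokens: 8192,
}
```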

Keep `models.mode: "merge"` so hosted models stay available as fallbacks.
Use `models.providers.<id>.timeoutSeconds` for slow local or remote model
servers before raising `agents.defaults.timeoutSeconds`. The provider timeout
applies only to model HTTP requests; it governs the connect, header, and
body-streaming phases as well as the overall guarded-fetch abort.

<Note>
For custom OpenAI-compatible providers, persisting a non-secret local marker such as `apiKey: "ollama-local"` is accepted when `baseUrl` resolves to loopback, a private LAN, `.local`, or a bare hostname. OpenClaw treats it as a valid local credential instead of reporting a missing key. Use a real value for any provider that accepts a public hostname.
</Note>

Behavior note for local/proxied `/v1` backends:

@@ -161,15 +201,111 @@ Compatibility notes for stricter OpenAI-compatible backends:
structured content-part arrays. Set
`models.providers.<provider>.models[].compat.requiresStringContent: true` for
those endpoints.
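
A minimal sketch of that flag on a catalog entry (provider and model id are placeholders):

```json5
{
  models: {
    providers: {
      local: {
        // ...baseUrl, apiKey, api as in the provider blocks above...
        models: [
          {
            id: "my-local-model",
            compat: {
              requiresStringContent: true, // send plain strings, not content-part arrays
            },
          },
        ],
      },
    },
  },
}
```
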
- Some local models emit standalone bracketed tool requests as text, such as
`[tool_name]` followed by JSON and `[END_TOOL_REQUEST]`. OpenClaw promotes
those into real tool calls only when the name exactly matches a registered
tool for the turn; otherwise the block is treated as unsupported text and is
hidden from user-visible replies.
- If a model emits JSON, XML, or ReAct-style text that looks like a tool call
but the provider did not emit a structured invocation, OpenClaw leaves it as
text and logs a warning with the run id, provider/model, detected pattern, and
tool name when available. Treat that as provider/model tool-call
incompatibility, not a completed tool run.
- If tools appear as assistant text instead of running, for example raw JSON,
XML, ReAct syntax, or an empty `tool_calls` array in the provider response,
first verify the server is using a tool-call-capable chat template/parser. For
OpenAI-compatible Chat Completions backends whose parser works only when tool
use is forced, set a per-model request override instead of relying on text
parsing:

```json5
{
agents: {
defaults: {
models: {
"local/my-local-model": {
params: {
extra_body: {
tool_choice: "required",
},
},
},
},
},
},
}
```

Use this only for models/sessions where every normal turn should call a tool.
It overrides OpenClaw's default proxy value of `tool_choice: "auto"`.
Replace `local/my-local-model` with the exact provider/model ref shown by
`openclaw models list`.

```bash
openclaw config set agents.defaults.models '{"local/my-local-model":{"params":{"extra_body":{"tool_choice":"required"}}}}' --strict-json --merge
```

- If a custom OpenAI-compatible model accepts OpenAI reasoning efforts beyond
the built-in profile, declare them on the model compat block. Adding `"xhigh"`
here makes `/think xhigh`, session pickers, Gateway validation, and `llm-task`
validation expose the level for that configured provider/model ref:

```json5
{
models: {
providers: {
local: {
baseUrl: "http://127.0.0.1:8000/v1",
apiKey: "sk-local",
api: "openai-responses",
models: [
{
id: "gpt-5.4",
name: "GPT 5.4 via local proxy",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 196608,
maxTokens: 8192,
compat: {
supportedReasoningEfforts: ["low", "medium", "high", "xhigh"],
reasoningEffortMap: { xhigh: "xhigh" },
},
},
],
},
},
},
}
```

- Some smaller or stricter local backends are unstable with OpenClaw's full
agent-runtime prompt shape, especially when tool schemas are included. If the
backend works for tiny direct `/v1/chat/completions` calls but fails on normal
OpenClaw agent turns, first try
agent-runtime prompt shape, especially when tool schemas are included. First
verify the provider path with the lean local probe:

```bash
openclaw infer model run --local --model <provider/model> --prompt "Reply with exactly: pong" --json
```

To verify the Gateway route without the full agent prompt shape, use the
Gateway model probe instead:

```bash
openclaw infer model run --gateway --model <provider/model> --prompt "Reply with exactly: pong" --json
```

Both local and Gateway model probes send only the supplied prompt. The
Gateway probe still validates Gateway routing, auth, and provider selection,
but it intentionally skips prior session transcript, AGENTS/bootstrap context,
context-engine assembly, tools, and bundled MCP servers.

If that succeeds but normal OpenClaw agent turns fail, first try
`agents.defaults.experimental.localModelLean: true` to drop heavyweight
default tools like `browser`, `cron`, and `message`; this is an experimental
flag, not a stable default-mode setting. See
[Experimental Features](/concepts/experimental-features). If that still fails, try
`models.providers.<provider>.models[].compat.supportsTools: false`.
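
Both knobs as a config sketch, with the paths taken verbatim from the text above (the model id is a placeholder):

```json5
{
  agents: {
    defaults: {
      experimental: {
        localModelLean: true, // experimental: drops heavyweight default tools
      },
    },
  },
  models: {
    providers: {
      local: {
        models: [
          {
            id: "my-local-model",
            compat: { supportsTools: false }, // last resort: disable tool schemas entirely
          },
        ],
      },
    },
  },
}
```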

- If the backend still fails only on larger OpenClaw runs, the remaining issue
is usually upstream model/server capacity or a backend bug, not OpenClaw's
transport layer.
Expand All @@ -178,12 +314,29 @@ Compatibility notes for stricter OpenAI-compatible backends:

- Gateway can reach the proxy? `curl http://127.0.0.1:1234/v1/models`.
- LM Studio model unloaded? Reload; cold start is a common “hanging” cause.
- OpenClaw warns when the detected context window is below **32k** and blocks below **16k**. If you hit that preflight, raise the server/model context limit or choose a larger model.
- Local server says `terminated`, `ECONNRESET`, or closes the stream mid-turn?
OpenClaw records a low-cardinality `model.call.error.failureKind` plus the
OpenClaw process RSS/heap snapshot in diagnostics. For LM Studio/Ollama
memory pressure, match that timestamp against the server log or macOS crash /
jetsam log to confirm whether the model server was killed.
- OpenClaw derives context-window preflight thresholds from the detected model window, or from the uncapped model window when `agents.defaults.contextTokens` lowers the effective window. It warns below 20% with an **8k** floor. Hard blocks use the 10% threshold with a **4k** floor, capped to the effective context window so oversized model metadata cannot reject an otherwise valid user cap. If you hit that preflight, raise the server/model context limit or choose a larger model.
- Context errors? Lower `contextWindow` or raise your server limit.
- OpenAI-compatible server returns `messages[].content ... expected a string`?
Add `compat.requiresStringContent: true` on that model entry.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run`
fails on Gemma or another local model? Disable tool schemas first with
`compat.supportsTools: false`, then retest. If the server still crashes only
on larger OpenClaw prompts, treat it as an upstream server/model limitation.
- Direct tiny `/v1/chat/completions` calls work, but `openclaw infer model run --local`
fails on Gemma or another local model? Check the provider URL, model ref, auth
marker, and server logs first; local `model run` does not include agent tools.
If local `model run` succeeds but larger agent turns fail, reduce the agent
tool surface with `localModelLean` or `compat.supportsTools: false`.
- Tool calls show up as raw JSON/XML/ReAct text, or the provider returns an
empty `tool_calls` array? Do not add a proxy that blindly converts assistant
text into tool execution. Fix the server chat template/parser first. If the
model only works when tool use is forced, add the per-model
`params.extra_body.tool_choice: "required"` override above and use that model
entry only for sessions where a tool call is expected on every turn.
- Safety: local models skip provider-side filters; keep agents narrow and compaction on to limit prompt injection blast radius.

## Related

- [Configuration reference](/gateway/configuration-reference)
- [Model failover](/concepts/model-failover)