8 changes: 8 additions & 0 deletions CHANGELOG.md
@@ -3,11 +3,19 @@
## v1.0.4-beta.7 (2026-05-11)
- Added Mistral as a built-in provider, including model icons, provider catalog support, and deeplink handling
- Agent runs now budget tool schemas and tool output more defensively, reducing oversized context failures and follow-up stalls
- Context-window overflow errors now include budget diagnostics, and context-pressure recovery is attempted before oversized requests fail
- Feishu remote control replies now use optimized Markdown posts for headings, tables, lists, code blocks, and streamed updates
- Markdown streaming now stays smooth while messages are still loading, without leaving completed content stuck in a streaming state
- Upgraded markstream-vue to 0.0.14-beta.8 for improved Markdown rendering behavior
- Agent terminal execution is steadier when shells, working directories, or context compaction need fallback handling
- Disabled providers no longer trigger verification requests from settings screens
- Plugin MCP servers now keep their lifecycle more isolated, improving start/stop behavior and built-in plugin visibility
11 changes: 11 additions & 0 deletions docs/issues/agent-tool-context-budget/plan.md
@@ -57,6 +57,17 @@ consumes more context and fails on small formatting deviations.
- Judge tool-output continuation fitting against the next preflight-fitted request shape, so older
history that would be trimmed before the next provider call does not falsely fail the tool result.

## Retry Overflow Hardening

- Route unfittable provider-call preflight results through the existing context-pressure recovery
path once before failing.
- Keep the latest user/system/tool payload protected; recovery may compact persisted history,
replace stale summary-bearing system prompt text, trim older in-memory history, and reduce only
the per-call output cap.
- When recovery still cannot fit, fail before rate-limit wait/provider streaming with a budget
diagnostic that includes usable context, estimated input, tool-schema reserve, requested/effective
output, and remaining output room.
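The budget arithmetic behind these diagnostics can be sketched as pure functions. This is an illustrative model only — field names mirror the diagnostics added in this PR, but the functions themselves are hypothetical, not DeepChat's actual API:

```typescript
// Illustrative sketch of the preflight budget arithmetic (not the real implementation).
interface BudgetInputs {
  usableContextLength: number // context window minus the safety margin
  inputTokens: number // estimated tokens for the request messages
  toolReserveTokens: number // reserve for tool schemas
  requestedMaxTokens: number // user-configured max output tokens
}

// Output room is whatever the usable window leaves after input and tool schemas.
function remainingOutputTokens(b: BudgetInputs): number {
  return b.usableContextLength - b.inputTokens - b.toolReserveTokens
}

// The effective output cap can never exceed the requested cap or the remaining room.
function effectiveMaxTokens(b: BudgetInputs): number {
  return Math.max(0, Math.min(b.requestedMaxTokens, remainingOutputTokens(b)))
}
```

When `remainingOutputTokens` is non-positive the request is unfittable; under the plan above, the recovery pass then compacts history once before the call is abandoned with the diagnostic.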

## Manual Validation Notes

For a MiniMax-M2.7 agent session, inspect trace/log output rather than running automated test
13 changes: 13 additions & 0 deletions docs/issues/agent-tool-context-budget/spec.md
@@ -72,3 +72,16 @@ Additional acceptance criteria:
instead of a positive budget.
- Context-pressure recovery updates the in-memory request history used by later tool-continuation
loops.

## Retry Overflow Hardening

Additional acceptance criteria:

- Retry and resume provider calls that still cannot fit after request fitting attempt the same
internal recovery pass before failing, even when the user configured fewer than 4000 output
tokens.
- If recovery cannot make the latest user/system/tool payload fit, DeepChat fails before sending a
provider request and stores a clear budget error on the assistant message.
- The error explains that no provider request was sent and suggests shortening current input or
attachments, reducing active tools/skills/system prompt content, lowering max output tokens, or
increasing context length.
2 changes: 2 additions & 0 deletions docs/issues/agent-tool-context-budget/tasks.md
@@ -13,6 +13,8 @@
- [x] Drop orphaned tool results and invalid provider options before AI SDK requests.
- [x] Report zero effective output tokens for unfittable preflight results.
- [x] Harden preflight for unknown context windows and refitted tool continuations.
- [x] Recover unfittable retry/resume preflights before provider calls.
- [x] Add actionable budget diagnostics for irreducible retry/resume overflow.
- [ ] Add request budget telemetry.
- [ ] Add reasoning retention budget.
- [ ] Add compact legacy tool schema mode.
19 changes: 19 additions & 0 deletions docs/issues/markdown-smooth-streaming-control/plan.md
@@ -0,0 +1,19 @@
# Markdown Smooth Streaming Control Plan

## Approach

- Add a `smoothStreaming` prop to `MarkdownRenderer`, defaulting to `false`.
- Forward that prop to `markstream-vue`'s `NodeRenderer`.
- In chat message text rendering, enable the prop only when the assistant content block status is `pending` or `loading`.
- Do not derive this from parsed part loading state, because that state reflects artifact/tag parsing rather than message generation.

## Compatibility

- Existing non-chat markdown surfaces inherit the default `false`.
- Existing `fade=false` behavior remains unchanged.

## Validation

- Cover the markdown renderer default and explicit prop behavior.
- Cover completed versus generating message block behavior.
- Run targeted renderer tests plus formatting, i18n, and lint checks.
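Under the approach above, the gating reduces to a predicate on the assistant block status. A minimal sketch, with illustrative names (the status union and function are assumptions; the real wiring lives in the chat message component):

```typescript
// Hypothetical status union; only 'pending' and 'loading' mean "still generating".
type AssistantBlockStatus = 'pending' | 'loading' | 'success' | 'error'

// Enable smooth streaming only while the assistant block is still generating.
// This deliberately ignores parsed-part loading state, which tracks
// artifact/tag parsing rather than message generation.
function shouldSmoothStream(status: AssistantBlockStatus): boolean {
  return status === 'pending' || status === 'loading'
}
```

Because the prop defaults to `false`, every surface that does not compute this predicate (artifact previews, workspace previews) keeps non-streaming rendering for free.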
17 changes: 17 additions & 0 deletions docs/issues/markdown-smooth-streaming-control/spec.md
@@ -0,0 +1,17 @@
# Markdown Smooth Streaming Control

## User Story

As a chat reader, I want completed markdown messages to render without streaming animation so that history and already-generated responses feel stable.

## Acceptance Criteria

- Assistant text blocks that are actively generating use `smoothStreaming`.
- Completed assistant text blocks do not use `smoothStreaming`.
- Markdown artifact previews and workspace markdown previews keep non-streaming behavior by default.
- The change does not alter markdown content parsing, references, code previews, or artifact syncing.

## Non-Goals

- No new user setting is added.
- No IPC, database, or shared message schema changes are needed.
8 changes: 8 additions & 0 deletions docs/issues/markdown-smooth-streaming-control/tasks.md
@@ -0,0 +1,8 @@
# Markdown Smooth Streaming Control Tasks

- [x] Document the intended behavior and implementation approach.
- [x] Add the `smoothStreaming` prop to `MarkdownRenderer`.
- [x] Enable smooth streaming only for pending/loading assistant content blocks.
- [x] Add renderer tests for default, explicit, completed, and generating states.
- [x] Run targeted renderer tests.
- [x] Run final format, i18n, and lint checks.
2 changes: 1 addition & 1 deletion package.json
@@ -179,7 +179,7 @@
"katex": "^0.16.27",
"lint-staged": "^16.4.0",
"lucide-vue-next": "^0.544.0",
"markstream-vue": "0.0.14-beta.2",
"markstream-vue": "0.0.14-beta.8",
"mermaid": "^11.13.0",
"minimatch": "^10.2.4",
"monaco-editor": "^0.55.1",
38 changes: 38 additions & 0 deletions src/main/presenter/agentRuntimePresenter/contextBudget.ts
@@ -32,6 +32,16 @@ export type RequestContextPreflightResult = {
requiresContextPressureRecovery: boolean
}

export type RequestContextBudgetDiagnostics = {
usableContextLength: number
inputTokens: number
toolReserveTokens: number
requestedMaxTokens: number
effectiveMaxTokens: number
remainingOutputTokens: number
totalRequestTokens: number
}

export function estimateToolReserveTokens(tools: MCPToolDefinition[]): number {
return estimateToolDefinitionTokens(tools)
}
@@ -195,6 +205,34 @@ export function preflightRequestContext(params: {
}
}

export function buildRequestContextBudgetDiagnostics(
preflight: RequestContextPreflightResult
): RequestContextBudgetDiagnostics {
return {
usableContextLength: preflight.usableContextLength,
inputTokens: preflight.inputTokens,
toolReserveTokens: preflight.toolReserveTokens,
requestedMaxTokens: preflight.requestedMaxTokens,
effectiveMaxTokens: preflight.effectiveMaxTokens,
remainingOutputTokens: preflight.remainingOutputTokens,
totalRequestTokens: preflight.totalRequestTokens
}
}

export function buildRequestContextOverflowErrorMessage(
preflight: RequestContextPreflightResult
): string {
const diagnostics = buildRequestContextBudgetDiagnostics(preflight)
const formatTokenCount = (value: number): string =>
Number.isFinite(value) ? String(Math.floor(value)) : 'unknown'

return [
'Request was not sent because it cannot fit within the model context window after applying the safety margin.',
`Budget: usable context ${formatTokenCount(diagnostics.usableContextLength)} tokens, estimated input ${formatTokenCount(diagnostics.inputTokens)} tokens, tool schemas ${formatTokenCount(diagnostics.toolReserveTokens)} tokens, requested output ${formatTokenCount(diagnostics.requestedMaxTokens)} tokens, effective output ${formatTokenCount(diagnostics.effectiveMaxTokens)} tokens, remaining output room ${formatTokenCount(diagnostics.remainingOutputTokens)} tokens.`,
'Try shortening the latest input or attachments, reducing active tools, skills, or system prompt content, lowering max output tokens, or increasing context length.'
].join(' ')
}

function resolveProtectedRequestTailCount(messages: ChatMessage[]): number {
if (messages.length === 0) {
return 0
10 changes: 6 additions & 4 deletions src/main/presenter/agentRuntimePresenter/index.ts
@@ -75,6 +75,7 @@ import {
import {
capAgentDefaultMaxTokens,
capAgentRequestMaxTokens,
buildRequestContextOverflowErrorMessage,
estimateToolReserveTokens,
fitRequestMessagesToContextWindow,
preflightRequestContext
@@ -1874,7 +1875,10 @@ export class AgentRuntimePresenter implements IAgentImplementation {
requestedMaxTokens: requestMaxTokens,
minimumProtectedTailCount: protectedSteerTailCount
})
if (requestPreflight.requiresContextPressureRecovery) {
if (
requestPreflight.requiresContextPressureRecovery ||
!requestPreflight.fitsWithinContext
) {
const recovered = await recoverContextPressure({
sessionId,
providerId: state.providerId,
@@ -1903,9 +1907,7 @@
requestMessages.splice(0, requestMessages.length, ...requestPreflight.messages)
}
if (!requestPreflight.fitsWithinContext) {
throw new Error(
'Request cannot fit within the model context window after applying the safety margin.'
)
throw new Error(buildRequestContextOverflowErrorMessage(requestPreflight))
}
await llmProviderPresenter.executeWithRateLimit(state.providerId, {
signal: abortController.signal,
58 changes: 58 additions & 0 deletions src/main/presenter/remoteControlPresenter/feishu/feishuClient.ts
@@ -29,6 +29,15 @@ const createTextPayload = (text: string): string =>
text
})

const createMarkdownPayload = (text: string): string =>
JSON.stringify({
post: {
zh_cn: {
content: [[{ tag: 'md', text }]]
}
}
})

const createCardPayload = (card: FeishuInteractiveCardPayload): string => JSON.stringify(card)

const readHeaderValue = (headers: unknown, name: string): string | undefined => {
@@ -232,6 +241,43 @@ export class FeishuClient {
return messageId
}

async sendMarkdown(target: FeishuTransportTarget, text: string): Promise<string | null> {
let messageId: string | null = null

for (const chunk of chunkFeishuText(text)) {
const content = createMarkdownPayload(chunk)

if (target.replyToMessageId) {
const response = (await this.sdk.im.message.reply({
path: {
message_id: target.replyToMessageId
},
data: {
content,
msg_type: 'post',
reply_in_thread: Boolean(target.threadId)
}
})) as FeishuMessageResponse
messageId = response.data?.message_id?.trim() || messageId
continue
}

const response = (await this.sdk.im.message.create({
params: {
receive_id_type: 'chat_id'
},
data: {
receive_id: target.chatId,
msg_type: 'post',
content
}
})) as FeishuMessageResponse
messageId = response.data?.message_id?.trim() || messageId
}

return messageId
}

async downloadMessageResource(params: {
messageId: string
fileKey: string
@@ -336,6 +382,18 @@
})
}

async updateMarkdown(messageId: string, text: string): Promise<void> {
await this.sdk.im.message.update({
path: {
message_id: messageId
},
data: {
msg_type: 'post',
content: createMarkdownPayload(text)
}
})
}

async deleteMessage(messageId: string): Promise<void> {
await this.sdk.im.message.delete({
path: {
102 changes: 102 additions & 0 deletions src/main/presenter/remoteControlPresenter/feishu/feishuMarkdown.ts
@@ -0,0 +1,102 @@
/**
* Markdown style optimization for Feishu post messages.
*
* Ported from openclaw-lark (src/card/markdown-style.ts) and adapted
* to DeepChat coding conventions.
*
* Optimizations:
* - Heading demotion: H1 -> H4, H2~H6 -> H5
* - Paragraph spacing around tables
* - Ordered list: ensure single space after number
* - Unordered list: normalize "- " format (skip horizontal rules ---)
* - Table: pad cells, normalize separator rows, add blank lines
* - Code blocks are preserved as-is
*/

const IMAGE_RE = /!\[([^\]]*)\]\(([^)\s]+)\)/g

/**
* Strip `![alt](value)` where value is not a valid Feishu image key (`img_xxx`).
*/
function stripInvalidImageKeys(text: string): string {
if (!text.includes('![')) return text
return text.replace(IMAGE_RE, (fullMatch, _alt: string, value: string) => {
if (value.startsWith('img_')) return fullMatch
return ''
})
}

function optimizeMarkdownStyleCore(text: string, cardVersion = 2): string {
const MARK = '___CB_'
const codeBlocks: string[] = []

// 1. Extract code blocks and protect with placeholders
let r = text.replace(/(^|\n)(`{3,})([^\n]*)\n[\s\S]*?\n\2(?=\n|$)/g, (m, prefix = '') => {
const block = m.slice(String(prefix).length)
return `${prefix}${MARK}${codeBlocks.push(block) - 1}___`
})

// 2. Heading demotion: only when original text contains h1~h3
const hasH1toH3 = /^#{1,3} /m.test(text)
if (hasH1toH3) {
r = r.replace(/^#{2,6} (.+)$/gm, '##### $1') // H2~H6 -> H5
r = r.replace(/^# (.+)$/gm, '#### $1') // H1 -> H4
}

if (cardVersion >= 2) {
// 3. Add paragraph spacing between consecutive headings
r = r.replace(/^(#{4,5} .+)\n{1,2}(#{4,5} )/gm, '$1\n\n$2')

// 4. Add paragraph spacing around tables
// 4a. Non-table line directly followed by table row
r = r.replace(/^([^|\n].*)\n(\|.+\|)/gm, '$1\n\n$2')
// 4b. Before table: add blank line
r = r.replace(/\n\n((?:\|.+\|[^\S\n]*\n?)+)/g, '\n\n\n$1')
// 4c. After table: append blank line at end of table block
r = r.replace(/((?:^\|.+\|[^\S\n]*\n?)+)/gm, (m, _table: string, offset: number) => {
const after = r.slice(offset + m.length).replace(/^\n+/, '')
if (!after || /^(---|#{4,5} |\*\*)/.test(after)) return m
return m + '\n\n'
})
// 4d. Table preceded by plain text: remove extra blank line
r = r.replace(/^((?!#{4,5} )(?!\*\*).+)\n\n\n(\|)/gm, '$1\n\n$2')
// 4d2. Table preceded by bold line
r = r.replace(/^(\*\*.+)\n\n\n(\|)/gm, '$1\n\n$2')
// 4e. Table followed by plain text: remove extra blank line
r = r.replace(/(\|[^\n]*\n)\n\n((?!#{4,5} )(?!\*\*))/gm, '$1\n$2')

// 5. Restore code blocks with blank line before and after
codeBlocks.forEach((block, i) => {
r = r.replace(`${MARK}${i}___`, `\n\n${block}\n\n`)
})
} else {
// 5. Restore code blocks
codeBlocks.forEach((block, i) => {
r = r.replace(`${MARK}${i}___`, block)
})
}

// 6. Compress excessive blank lines (3+ consecutive newlines -> 2)
r = r.replace(/\n{3,}/g, '\n\n')

return r
}

/**
* Optimize Markdown text for Feishu post rendering.
*
* - Heading demotion (H1->H4, H2~H6->H5) to avoid oversized headings in cards
* - Table spacing fixes for proper rendering
* - List format normalization
* - Code block preservation
* - Strip invalid image references
*/
export function optimizeMarkdownForFeishu(text: string, cardVersion = 2): string {
try {
let r = optimizeMarkdownStyleCore(text, cardVersion)
r = stripInvalidImageKeys(r)
return r
} catch {
return text
}
}
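The heading-demotion step in `optimizeMarkdownStyleCore` above is the easiest piece to see in isolation. A standalone sketch of just that step (a reimplementation for illustration, not the exported function):

```typescript
// Standalone illustration of the heading-demotion step from
// optimizeMarkdownStyleCore: demote only when the source contains an
// H1–H3 heading, mapping H2–H6 -> H5 and H1 -> H4.
function demoteHeadings(text: string): string {
  if (!/^#{1,3} /m.test(text)) return text
  let r = text.replace(/^#{2,6} (.+)$/gm, '##### $1')
  r = r.replace(/^# (.+)$/gm, '#### $1')
  return r
}
```

The H1–H3 guard means documents that already use only deep headings pass through untouched, so re-running the optimizer on its own output does not demote further.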