39 changes: 39 additions & 0 deletions docs/issues/agent-loop-input-exec-responsiveness/plan.md
@@ -0,0 +1,39 @@
# Agent Loop Input And Exec Responsiveness Plan

## Runtime Input Flow

- Keep `chat.steerActiveTurn` as the active-turn entry point.
- Remove hidden steer injection from provider request construction.
- Store active steer input as a priority pending row while the current loop turn continues, so steer
never aborts the in-flight provider request.
- At the process loop boundary after tool calls have returned, yield before continuing to the next
provider request when a pending steer exists; the outer runtime then drains steer through
`processMessage()` as a normal user message.
- Drain pending steer rows before pending queue rows by claiming the row and passing its payload to
`processMessage()` with visible user-message persistence.
- Keep steer rows locked and non-editable, but show not-yet-entered steer rows in the pending input
rail.
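
The drain order above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `PendingRow`, `claimNextPendingRow`, `drainPendingRows`, and the `createdAt`/`claimed` fields are hypothetical names, and the `handle` callback stands in for `processMessage()`. Only the `queue | steer` modes and the steer-before-queue ordering come from the plan.

```typescript
// Hypothetical row shape; only the `mode` values come from the plan.
type PendingSessionInputMode = 'queue' | 'steer'

interface PendingRow {
  id: string
  mode: PendingSessionInputMode
  payload: string
  createdAt: number
  claimed: boolean
}

// Pick the next row to drain: unclaimed steer rows first, then queue rows,
// oldest first within each mode. The chosen row is marked as claimed so a
// later drain pass cannot process it again.
function claimNextPendingRow(rows: PendingRow[]): PendingRow | undefined {
  const candidates = rows
    .filter((row) => !row.claimed)
    .sort((a, b) => {
      if (a.mode !== b.mode) return a.mode === 'steer' ? -1 : 1
      return a.createdAt - b.createdAt
    })
  const next = candidates[0]
  if (next) next.claimed = true
  return next
}

// Drain every pending row, handing each payload to a caller-supplied handler
// (a stand-in for `processMessage()` with visible user-message persistence).
function drainPendingRows(rows: PendingRow[], handle: (payload: string) => void): void {
  let row = claimNextPendingRow(rows)
  while (row) {
    handle(row.payload)
    row = claimNextPendingRow(rows)
  }
}
```

In this sketch a steer row that arrives after two queued rows is still handled first, which matches the priority described in the bullets.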

## Exec Isolation

- Keep the existing background exec core manager as the utility host implementation.
- Replace the exported singleton with a main-process RPC proxy that starts an Electron
`utilityProcess` from the existing main bundle using a dedicated host flag.
- Route `start`, `waitForCompletionOrYield`, `poll`, `log`, `write`, `kill`, `clear`, `remove`,
`cleanupConversation`, and `shutdown` through JSON-serializable messages.
- Track started sessions in the proxy so an unexpected utility exit can return diagnostic error
snapshots for affected sessions.
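
One way to picture the session tracking in the proxy is the sketch below. It deliberately leaves Electron out: the `post` callback stands in for whatever sends a message to the utility process, and `ExecProxySketch`, `ExecRpcRequest`, `ExecErrorSnapshot`, and `handleUtilityExit` are hypothetical names chosen for illustration, not identifiers from the PR.

```typescript
// Hypothetical request and snapshot shapes, kept JSON-serializable.
interface ExecRpcRequest {
  id: number
  method: 'start' | 'kill' | 'remove'
  params: Record<string, unknown>
}

interface ExecErrorSnapshot {
  sessionId: string
  status: 'error'
  error: string
}

class ExecProxySketch {
  private nextRequestId = 1
  private readonly startedSessions = new Set<string>()
  private readonly post: (message: string) => void

  // `post` is an assumed transport hook, not a real Electron API.
  constructor(post: (message: string) => void) {
    this.post = post
  }

  start(sessionId: string, command: string): void {
    // Remember the session so an unexpected exit can be reported for it.
    this.startedSessions.add(sessionId)
    this.send('start', { sessionId, command })
  }

  kill(sessionId: string): void {
    this.send('kill', { sessionId })
  }

  remove(sessionId: string): void {
    this.startedSessions.delete(sessionId)
    this.send('remove', { sessionId })
  }

  // Called when the utility process exits unexpectedly: every tracked session
  // gets a diagnostic error snapshot, then tracking is reset.
  handleUtilityExit(exitCode: number): ExecErrorSnapshot[] {
    const snapshots = Array.from(this.startedSessions).map(
      (sessionId): ExecErrorSnapshot => ({
        sessionId,
        status: 'error',
        error: `Exec utility process exited unexpectedly (code ${exitCode})`
      })
    )
    this.startedSessions.clear()
    return snapshots
  }

  private send(method: ExecRpcRequest['method'], params: Record<string, unknown>): void {
    const request: ExecRpcRequest = { id: this.nextRequestId++, method, params }
    this.post(JSON.stringify(request))
  }
}
```

Serializing each request with `JSON.stringify` keeps the sketch honest about the "JSON-serializable messages" constraint: anything that cannot survive that round trip would not cross the process boundary either.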

## Compatibility

- `PendingSessionInputMode` remains `queue | steer`.
- Existing `sessions.convertPendingInputToSteer` route remains available for stored and older UI
flows.
- `AgentBashHandler` keeps its current public return shape for completed and yielded commands.

## Validation

- Update agent runtime/session integration tests for visible steer turns.
- Update pending input rail tests to assert pending steer rows render as locked items.
- Preserve existing background exec core tests and add coverage around the utility proxy behavior
where practical.
30 changes: 30 additions & 0 deletions docs/issues/agent-loop-input-exec-responsiveness/spec.md
@@ -0,0 +1,30 @@
# Agent Loop Input And Exec Responsiveness

## User Stories

- As a user steering an active agent turn, I want my steering input to appear as a normal user
message so the conversation transcript matches what the agent saw.
- As a user running long shell commands, I want `exec` to yield quickly and keep DeepChat's main
process responsive while the command continues in a managed background session.

## Acceptance Criteria

- Active steer does not interrupt the current provider request; it records a priority steer input,
lets the current loop iteration finish including tool results, then yields before the next
provider loop so the steer payload is inserted as a normal visible user turn.
- Pending rows with `mode: "steer"` remain readable for compatibility, but drain before ordinary
queued rows as visible user turns instead of hidden request injections.
- Pending input UI shows not-yet-entered steer rows in the waiting lane as locked items, and keeps
ordinary queued follow-ups editable.
- Foreground `exec` returns a normal result if it finishes inside `yieldMs`; otherwise it returns a
running `sessionId`.
- Shell process spawning, output decoding, output offload, timeout, and process-tree termination
run in an Electron utility process rather than the main event loop.
- If the utility process exits unexpectedly, affected sessions surface an error snapshot instead of
blocking the main process.
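
The `yieldMs` criterion amounts to racing command completion against a timer. The sketch below shows that idea only; `waitForCompletionOrYieldSketch`, `ExecOutcome`, and the shape of the completion result are assumptions made for illustration and may not match the real `waitForCompletionOrYield` signature.

```typescript
// Hypothetical outcome shape: either a finished command or a running session.
type ExecOutcome =
  | { status: 'completed'; exitCode: number; output: string }
  | { status: 'running'; sessionId: string }

async function waitForCompletionOrYieldSketch(
  sessionId: string,
  completion: Promise<{ exitCode: number; output: string }>,
  yieldMs: number
): Promise<ExecOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined

  // Resolves with a "running" outcome once the yield window has elapsed.
  const yielded = new Promise<ExecOutcome>((resolve) => {
    timer = setTimeout(() => resolve({ status: 'running', sessionId }), yieldMs)
  })

  // Resolves with a "completed" outcome when the command finishes.
  const completed = completion.then(
    (result): ExecOutcome => ({
      status: 'completed',
      exitCode: result.exitCode,
      output: result.output
    })
  )

  try {
    // Whichever settles first decides what the foreground call returns.
    return await Promise.race([completed, yielded])
  } finally {
    // Avoid leaving a stray timer behind when the command finished first.
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

The command itself is not cancelled when the timer wins the race; in the design described here it keeps running in the managed background session and is reached later through its `sessionId`.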

## Non-Goals

- Do not change the public `exec` tool schema or permission semantics.
- Do not add renderer settings for exec isolation.
- Do not refactor the full agent runtime or provider loop.
12 changes: 12 additions & 0 deletions docs/issues/agent-loop-input-exec-responsiveness/tasks.md
@@ -0,0 +1,12 @@
# Tasks

- [x] Add SDD artifacts for the combined responsiveness issue.
- [x] Queue active steer until the current loop iteration finishes without aborting the stream.
- [x] Yield the agent loop after completed tool calls when a pending steer should enter next.
- [x] Convert pending steer drain into visible user turns.
- [x] Remove hidden steer request injection.
- [x] Show not-yet-entered steer rows in the renderer pending rail.
- [x] Add utility-process RPC host for background exec.
- [x] Replace the production background exec singleton with a proxy.
- [x] Update and run targeted tests.
- [x] Run repository formatting, i18n, lint, and typecheck checks.
4 changes: 3 additions & 1 deletion docs/issues/telegram-message-markdown-render/plan.md
@@ -5,15 +5,17 @@
- Add `src/main/presenter/remoteControlPresenter/telegram/telegramMarkdown.ts` exposing `convertMarkdownToTelegramHtml(text: string): string`, mirroring the Feishu-side `feishuMarkdown.ts` module location and shape.
- The converter:
  - Escapes `&`, `<`, `>` first to make raw text safe for `parse_mode: 'HTML'`.
  - Converts common GFM pipe tables into fenced fixed-width text before code-block extraction.
  - Handles fenced code blocks (` ``` `) by emitting `<pre><code class="language-...">...</code></pre>` and protecting the body from further Markdown processing.
  - Handles inline code (` `…` `), bold (`**`/`__`), italic (`*`/`_`), strikethrough (`~~`), links, headings (`#…######`), unordered/ordered lists, and blockquotes (`>`).
  - Auto-closes a dangling fenced block when called on a chunk that ends mid-block, so each chunk produces valid HTML for Telegram.
- Extend `TelegramClient.sendMessage`, `editMessageText`, and `sendPhoto` with an optional `parseMode` ('HTML' | 'MarkdownV2'). Default remains undefined for backward compatibility.
- In `TelegramPoller`:
  - Convert chunk text via `convertMarkdownToTelegramHtml` before `sendMessage`/`editMessageText` calls in `syncDeliverySegment`, `sendChunkedMessage`, `dispatchOutboundActions`, and `editMessageText`. Pass `parseMode: 'HTML'`.
  - Apply conversion to the interaction prompt text as well so callback prompts render formatting consistently.
  - Retry the original plain-text chunk when Telegram returns a 400 entity-parse error for converted HTML.
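
The escaping and fenced-block rules can be illustrated with a reduced sketch. It covers only HTML escaping, fenced code blocks, and auto-closing a dangling fence; it is not the `convertMarkdownToTelegramHtml` implementation, and `escapeTelegramHtml`, `renderCodeBlock`, and `convertFencedBlocksSketch` are hypothetical helper names.

```typescript
// Built with repeat() so this example contains no literal triple backtick.
const FENCE = '`'.repeat(3)
const FENCE_LINE = new RegExp('^' + FENCE + '\\s*([\\w+-]*)\\s*$')

// Escape the three characters that matter for Telegram's HTML parse mode.
function escapeTelegramHtml(text: string): string {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
}

function renderCodeBlock(language: string, lines: string[]): string {
  const classAttr = language ? ` class="language-${language}"` : ''
  return `<pre><code${classAttr}>${escapeTelegramHtml(lines.join('\n'))}</code></pre>`
}

// Convert fenced blocks to <pre><code> and escape everything else.
// A fence still open at the end of the chunk is closed automatically.
function convertFencedBlocksSketch(markdown: string): string {
  const output: string[] = []
  let codeLines: string[] | null = null
  let language = ''

  for (const line of markdown.split('\n')) {
    const fence = FENCE_LINE.exec(line)
    // Inside a block, only a bare fence (no language tag) closes it.
    if (fence && (codeLines === null || fence[1] === '')) {
      if (codeLines === null) {
        codeLines = []
        language = fence[1]
      } else {
        output.push(renderCodeBlock(language, codeLines))
        codeLines = null
      }
      continue
    }
    if (codeLines !== null) codeLines.push(line)
    else output.push(escapeTelegramHtml(line))
  }

  // Dangling fence at a chunk boundary: emit a balanced block anyway.
  if (codeLines !== null) output.push(renderCodeBlock(language, codeLines))
  return output.join('\n')
}
```

Because the code body is collected before any other processing and only escaped, inline Markdown inside a fenced block is left untouched, which is the "protecting the body" behavior described above.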

## Validation

- Run `pnpm test test/main/presenter/remoteControlPresenter/telegramClient.test.ts` (extended) and a new `telegramMarkdown.test.ts` covering core conversion rules and chunk-boundary behavior.
- Run `pnpm test test/main/presenter/remoteControlPresenter/telegramClient.test.ts` (extended) and a new `telegramMarkdown.test.ts` covering core conversion rules, table fallback, and chunk-boundary behavior.
- Run `pnpm run typecheck:node` to confirm no signature break in callers (Poller, Adapter).
2 changes: 2 additions & 0 deletions docs/issues/telegram-message-markdown-render/spec.md
@@ -8,8 +8,10 @@ When DeepChat's Telegram remote control bot delivers AI replies, command output,

- `telegramClient.sendMessage` and `telegramClient.editMessageText` call the Telegram Bot API with `parse_mode: 'HTML'` when the outbound text contains formatted content.
- AI answer / process delivery segments routed through `TelegramPoller.syncDeliverySegment` and outbound actions dispatched via `dispatchOutboundActions` go through a Markdown → Telegram-HTML converter that handles bold, italic, strikethrough, inline code, fenced code blocks, headings, links, ordered/unordered lists, blockquotes, and horizontal rules.
- Common GFM pipe tables render as fixed-width preformatted text because Telegram does not support native table entities.
- Plain text (system replies, error messages, command echoes) is HTML-escaped and accepted by Telegram without parse-mode errors.
- Chunked streaming (4096 char limit) keeps each chunk independently renderable — partial Markdown left at a chunk boundary (e.g. an unclosed code fence) renders as text or a safely balanced block instead of breaking the Telegram parse.
- If Telegram rejects converted HTML with an entity-parse error, DeepChat retries the same outbound chunk as plain text.
- Existing Telegram client tests pass; a new test covers the converter and parse-mode wiring.
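
The table fallback can be pictured as padding each column to a common width. The sketch below is an assumption-laden illustration: `renderPipeTableAsFixedWidth` is a hypothetical helper, it measures width by string length (so wide characters such as CJK would not align perfectly), and wrapping the result in a preformatted block is left to the caller.

```typescript
// Turn the lines of a GFM pipe table into aligned fixed-width text.
function renderPipeTableAsFixedWidth(tableLines: string[]): string {
  const rows = tableLines
    .map((line) =>
      line
        .trim()
        .replace(/^\|/, '')
        .replace(/\|$/, '')
        .split('|')
        .map((cell) => cell.trim())
    )
    // Drop the delimiter row such as "| --- | ---: |".
    .filter((cells) => !cells.every((cell) => /^:?-+:?$/.test(cell)))

  if (rows.length === 0) return ''

  // Each column is as wide as its longest cell.
  const columnCount = Math.max(...rows.map((cells) => cells.length))
  const widths: number[] = []
  for (let i = 0; i < columnCount; i++) {
    widths.push(Math.max(...rows.map((cells) => (cells[i] || '').length)))
  }

  return rows
    .map((cells) =>
      widths
        .map((width, i) => {
          const cell = cells[i] || ''
          return cell + ' '.repeat(width - cell.length)
        })
        .join(' | ')
        .replace(/\s+$/, '')
    )
    .join('\n')
}
```

For a two-column table with header `Name`/`Qty` and one row `apple`/`3`, this produces two lines in which the `|` separators line up, which is what a fixed-width rendering needs when no native table entity is available.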

## Constraints
8 changes: 4 additions & 4 deletions docs/issues/telegram-message-markdown-render/tasks.md
@@ -2,8 +2,8 @@

- [x] Capture the reproduction from issue #1665 and confirm `sendMessage`/`editMessageText` ship raw Markdown without `parse_mode`.
- [x] Draft SDD spec, plan, tasks documents.
- [ ] Implement `telegram/telegramMarkdown.ts` with `convertMarkdownToTelegramHtml`.
- [ ] Thread an optional `parseMode` through `TelegramClient.sendMessage`, `editMessageText`, and `sendPhoto`.
- [ ] Update `TelegramPoller` to apply the converter and pass `parse_mode: 'HTML'` on all generated text paths.
- [ ] Add focused tests for the converter and parse-mode wiring; keep existing telegram tests green.
- [x] Implement `telegram/telegramMarkdown.ts` with `convertMarkdownToTelegramHtml`.
- [x] Thread an optional `parseMode` through `TelegramClient.sendMessage`, `editMessageText`, and `sendPhoto`.
- [x] Update `TelegramPoller` to apply the converter and pass `parse_mode: 'HTML'` on all generated text paths.
- [x] Add focused tests for the converter, table fallback, parse-mode wiring, and plain-text retry.
- [ ] Run `pnpm run format`, `pnpm run lint`, `pnpm run typecheck:node`, and the focused test suites.
185 changes: 185 additions & 0 deletions src/main/appMain.ts
@@ -0,0 +1,185 @@
import { app, dialog } from 'electron'
import { LifecycleManager, registerCoreHooks } from './presenter/lifecyclePresenter'
import { getInstance, Presenter } from './presenter'
import { electronApp } from '@electron-toolkit/utils'
import log from 'electron-log'
import { eventBus, SendTarget } from './eventbus'
import { NOTIFICATION_EVENTS } from './events'
import { registerWorkspacePreviewSchemes } from './presenter/workspacePresenter/workspacePreviewProtocol'
import {
  findDeepLinkArg,
  findStartupDeepLink,
  isDeepLinkUrl,
  storeStartupDeepLink
} from './lib/startupDeepLink'
import { isInsecureTlsAllowed } from './lib/insecureTls'

registerWorkspacePreviewSchemes()

// Handle uncaught exceptions to prevent app crashes or error dialogs
process.on('uncaughtException', (error) => {
  log.error('Uncaught Exception:', error)

  const msg = error.message || 'Unknown error'
  const isNetworkError = [
    'net::ERR',
    'ECONNRESET',
    'ETIMEDOUT',
    'ENOTFOUND',
    'Network Error',
    'fetch failed'
  ].some((k) => msg.includes(k))

  if (isNetworkError) {
    // Send the error to the renderer so it can show a non-blocking toast notification
    eventBus.sendToRenderer(NOTIFICATION_EVENTS.SHOW_ERROR, SendTarget.ALL_WINDOWS, {
      id: Date.now().toString(),
      title: 'Network Error',
      message: msg,
      type: 'error'
    })
  }
})

process.on('unhandledRejection', (reason) => {
  log.error('Unhandled Rejection:', reason)
})

// Set application command line arguments
app.commandLine.appendSwitch('autoplay-policy', 'no-user-gesture-required') // Allow video autoplay
app.commandLine.appendSwitch('webrtc-max-cpu-consumption-percentage', '100') // Set WebRTC max CPU usage
app.commandLine.appendSwitch('js-flags', '--max-old-space-size=4096') // Set V8 heap memory size
if (isInsecureTlsAllowed()) {
  // This disables certificate validation app-wide, so keep it limited to local debugging.
  app.commandLine.appendSwitch('ignore-certificate-errors')
}

// Set platform-specific command line arguments
if (process.platform === 'win32') {
  // Windows platform specific parameters (currently commented out)
  // app.commandLine.appendSwitch('in-process-gpu')
  // app.commandLine.appendSwitch('wm-window-animations-disabled')
}
if (process.platform === 'darwin') {
  // macOS platform specific parameters
  app.commandLine.appendSwitch('disable-features', 'DesktopCaptureMacV2,IOSurfaceCapturer')
}

const gotSingleInstanceLock = app.requestSingleInstanceLock()
if (!gotSingleInstanceLock) {
  console.log('Another DeepChat instance is already running. Exiting current process.')
  app.quit()
}

// Initialize presenter after ready
let presenter: Presenter | undefined

console.log('Main process starting, checking for deeplink...')
console.log('Full command line arguments:', process.argv)
const startupDeepLink = findStartupDeepLink(process.argv, process.env)
if (startupDeepLink) {
  console.log('Found startup deeplink during initialization:', startupDeepLink)
  storeStartupDeepLink(startupDeepLink)
} else {
  console.log('No startup deeplink detected during initialization')
}

const focusExistingAppWindow = () => {
  const targetWindow = presenter?.windowPresenter.getAllWindows()[0]
  if (!targetWindow || targetWindow.isDestroyed()) {
    return
  }

  if (targetWindow.isMinimized()) {
    targetWindow.restore()
  }
  targetWindow.show()
  targetWindow.focus()
}

const routeIncomingDeeplink = (url: string, source: string) => {
  if (!isDeepLinkUrl(url)) {
    return
  }

  console.log(`${source}:`, url)
  const normalizedUrl = storeStartupDeepLink(url)
  if (!normalizedUrl) {
    return
  }

  if (presenter && app.isReady()) {
    void presenter.deeplinkPresenter.handleDeepLink(normalizedUrl)
  }
}

// Listen for open-url events that might occur during startup
// This must be set before app.whenReady() because open-url events can fire before that
app.on('open-url', (event, url) => {
  event.preventDefault()
  routeIncomingDeeplink(url, 'Received open-url event')
})

// Also listen for second-instance events (Windows/Linux)
if (gotSingleInstanceLock) {
  app.on('second-instance', (_event, commandLine) => {
    console.log('Received second-instance event with command line:', commandLine)
    focusExistingAppWindow()

    const deepLinkUrl = findDeepLinkArg(commandLine)
    if (deepLinkUrl) {
      routeIncomingDeeplink(deepLinkUrl, 'Received second-instance deeplink')
    }
  })
}

// Initialize lifecycle manager and register core hooks
const lifecycleManager = new LifecycleManager()
registerCoreHooks(lifecycleManager)

function clearPresenterPermissionCaches(activePresenter?: Presenter): void {
  if (!activePresenter) return

  activePresenter.commandPermissionService.clearAll()
  activePresenter.filePermissionService.clearAll()
  activePresenter.settingsPermissionService.clearAll()
}

// Start the lifecycle management system once Electron is ready
app.whenReady().then(async () => {
  // Set the app user model id for Windows
  electronApp.setAppUserModelId('com.wefonk.deepchat')
  try {
    console.log('main: Application lifecycle startup')
    await lifecycleManager.start()
    presenter = getInstance(lifecycleManager)
    console.log('main: Application lifecycle startup completed successfully')
  } catch (error) {
    console.error('main: Application lifecycle startup failed:', error)
    dialog.showErrorBox(
      'Application startup failed',
      error instanceof Error ? error.message : String(error)
    )
    app.quit() // Serious error, exit the program
  }
})

app.on('before-quit', () => {
  clearPresenterPermissionCaches(presenter)
})

// Handle window-all-closed event
app.on('window-all-closed', () => {
  clearPresenterPermissionCaches(presenter)
  if (!presenter) return

  // Check if there are any non-floating-button windows
  const mainWindows = presenter.windowPresenter.getAllWindows()

  if (mainWindows.length === 0) {
    // When only floating button windows exist, quit app on non-macOS platforms
    console.log('main: All main windows closed, requesting shutdown')
    app.quit() // Keep this event to avoid unexpected situations
  }
})