Merged
8 changes: 8 additions & 0 deletions .env.example
@@ -77,3 +77,11 @@ OPENCODE_MODEL_ID=big-pickle
# STT_API_KEY=
# STT_MODEL=
# STT_LANGUAGE=

# Text-to-Speech credentials (optional)
# TTS reply behavior is controlled globally with /tts and persisted in settings.json.
# Set both TTS_API_URL and TTS_API_KEY to enable audio replies.
# TTS_API_URL=
# TTS_API_KEY=
# TTS_MODEL=gpt-4o-mini-tts
# TTS_VOICE=alloy
5 changes: 4 additions & 1 deletion PRODUCT.md
@@ -87,6 +87,7 @@ No public inbound ports are required for normal usage.
- Configurable visibility for service messages (thinking/tool calls)
- Configurable max code file size in KB (default: 100)
- Optional STT settings for voice transcription (`STT_API_URL`, `STT_API_KEY`, `STT_MODEL`, `STT_LANGUAGE`)
- Optional TTS settings for global audio replies (`TTS_API_URL`, `TTS_API_KEY`, `TTS_MODEL`, `TTS_VOICE`)

## Current Product Scope

@@ -99,6 +100,7 @@ Current command set:
- `/abort` - stop the current task
- `/sessions` - show and switch recent sessions
- `/projects` - show and switch projects
- `/tts` - toggle global audio replies
- `/task` - create a scheduled task
- `/tasklist` - browse and delete scheduled tasks
- `/rename` - rename current session
@@ -109,7 +111,7 @@ Current command set:

Model, agent, variant, and context actions are available from the persistent bottom keyboard.

Text messages (non-commands) are treated as prompts for OpenCode only when no blocking interaction is active. Voice/audio messages are transcribed and then sent as prompts when STT is configured.
Text messages (non-commands) are treated as prompts for OpenCode only when no blocking interaction is active. Voice/audio messages are transcribed and then sent as prompts when STT is configured. When `/tts` is enabled globally, completed assistant replies also include a generated audio file if TTS is configured.

Interaction routing rules:

@@ -148,6 +150,7 @@ Model picker behavior:
- [x] PDF attachments support (send documents from Telegram to OpenCode)
- [x] Text file attachments support (send code/config/log files from Telegram to OpenCode)
- [x] Voice/audio transcription via Whisper-compatible APIs (OpenAI/Groq/Together and compatible providers)
- [x] Optional global audio replies with `/tts` via OpenAI-compatible APIs

## Current Task List

69 changes: 43 additions & 26 deletions README.md
@@ -31,7 +31,7 @@ Languages: English (`en`), Deutsch (`de`), Español (`es`), Français (`fr`), Р
- **Subagent activity** — watch live subagent progress in chat, including the current task, agent, model, and active tool step
- **Custom Commands** — run OpenCode custom commands (and built-ins like `init`/`review`) from an inline menu with confirmation
- **Interactive Q&A** — answer agent questions and approve permissions via inline buttons
- **Voice prompts** — send voice/audio messages, transcribe them via a Whisper-compatible API, then forward recognized text to OpenCode
- **Voice prompts** — send voice/audio messages, transcribe them via a Whisper-compatible API, and optionally enable spoken replies with `/tts`
- **File attachments** — send images, PDF documents, and any text-based files to OpenCode (code, logs, configs, etc.)
- **Scheduled tasks** — schedule prompts to run later or on a recurring interval; see [Scheduled Tasks](#scheduled-tasks)
- **Context control** — compact context when it gets too large, right from the chat
@@ -109,6 +109,7 @@ opencode-telegram config
| `/abort` | Abort the current task |
| `/sessions` | Browse and switch between recent sessions |
| `/projects` | Switch between OpenCode projects |
| `/tts` | Toggle audio replies |
| `/rename` | Rename the current session |
| `/commands` | Browse and run custom commands |
| `/task` | Create a scheduled task |
@@ -147,31 +148,36 @@ When installed via npm, the configuration wizard handles the initial setup. The
- **Windows:** `%APPDATA%\opencode-telegram-bot\.env`
- **Linux:** `~/.config/opencode-telegram-bot/.env`

| Variable | Description | Required | Default |
| ----------------------------- | -------------------------------------------------------------------------------- | :------: | ------------------------ |
| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather | Yes | — |
| `TELEGRAM_ALLOWED_USER_ID` | Your numeric Telegram user ID | Yes | — |
| `TELEGRAM_PROXY_URL` | Proxy URL for Telegram API (SOCKS5/HTTP) | No | — |
| `OPENCODE_API_URL` | OpenCode server URL | No | `http://localhost:4096` |
| `OPENCODE_SERVER_USERNAME` | Server auth username | No | `opencode` |
| `OPENCODE_SERVER_PASSWORD` | Server auth password | No | — |
| `OPENCODE_MODEL_PROVIDER` | Default model provider | Yes | `opencode` |
| `OPENCODE_MODEL_ID` | Default model ID | Yes | `big-pickle` |
| `BOT_LOCALE` | Bot UI language (supported locale code, e.g. `en`, `de`, `es`, `fr`, `ru`, `zh`) | No | `en` |
| `SESSIONS_LIST_LIMIT` | Sessions per page in `/sessions` | No | `10` |
| `PROJECTS_LIST_LIMIT` | Projects per page in `/projects` | No | `10` |
| `COMMANDS_LIST_LIMIT` | Commands per page in `/commands` | No | `10` |
| `TASK_LIMIT` | Maximum number of scheduled tasks that can exist at once | No | `10` |
| `RESPONSE_STREAM_THROTTLE_MS` | Stream edit throttle (ms) for assistant and tool updates | No | `500` |
| `HIDE_THINKING_MESSAGES` | Hide `💭 Thinking...` service messages | No | `false` |
| `HIDE_TOOL_CALL_MESSAGES` | Hide tool-call service messages (`💻 bash ...`, `📖 read ...`, etc.) | No | `false` |
| `MESSAGE_FORMAT_MODE` | Assistant reply formatting mode: `markdown` (Telegram MarkdownV2) or `raw` | No | `markdown` |
| `CODE_FILE_MAX_SIZE_KB` | Max file size (KB) to send as document | No | `100` |
| `STT_API_URL` | Whisper-compatible API base URL (enables voice/audio transcription) | No | — |
| `STT_API_KEY` | API key for your STT provider | No | — |
| `STT_MODEL` | STT model name passed to `/audio/transcriptions` | No | `whisper-large-v3-turbo` |
| `STT_LANGUAGE` | Optional language hint (empty = provider auto-detect) | No | — |
| `LOG_LEVEL` | Log level (`debug`, `info`, `warn`, `error`) | No | `info` |
| Variable | Description | Required | Default |
| ------------------------------- | ------------------------------------------------------------------------------------------------------------ | :------: | ------------------------ |
| `TELEGRAM_BOT_TOKEN` | Bot token from @BotFather | Yes | — |
| `TELEGRAM_ALLOWED_USER_ID` | Your numeric Telegram user ID | Yes | — |
| `TELEGRAM_PROXY_URL` | Proxy URL for Telegram API (SOCKS5/HTTP) | No | — |
| `OPENCODE_API_URL` | OpenCode server URL | No | `http://localhost:4096` |
| `OPENCODE_SERVER_USERNAME` | Server auth username | No | `opencode` |
| `OPENCODE_SERVER_PASSWORD` | Server auth password | No | — |
| `OPENCODE_MODEL_PROVIDER` | Default model provider | Yes | `opencode` |
| `OPENCODE_MODEL_ID` | Default model ID | Yes | `big-pickle` |
| `BOT_LOCALE` | Bot UI language (supported locale code, e.g. `en`, `de`, `es`, `fr`, `ru`, `zh`) | No | `en` |
| `SESSIONS_LIST_LIMIT` | Sessions per page in `/sessions` | No | `10` |
| `PROJECTS_LIST_LIMIT` | Projects per page in `/projects` | No | `10` |
| `COMMANDS_LIST_LIMIT` | Commands per page in `/commands` | No | `10` |
| `TASK_LIMIT` | Maximum number of scheduled tasks that can exist at once | No | `10` |
| `SERVICE_MESSAGES_INTERVAL_SEC` | Service messages interval (thinking + tool calls); keep `>=2` to avoid Telegram rate limits, `0` = immediate | No | `5` |
| `HIDE_THINKING_MESSAGES` | Hide `💭 Thinking...` service messages | No | `false` |
| `HIDE_TOOL_CALL_MESSAGES` | Hide tool-call service messages (`💻 bash ...`, `📖 read ...`, etc.) | No | `false` |
| `RESPONSE_STREAMING` | Stream assistant replies while they are generated across one or more Telegram messages | No | `true` |
| `MESSAGE_FORMAT_MODE` | Assistant reply formatting mode: `markdown` (Telegram MarkdownV2) or `raw` | No | `markdown` |
| `CODE_FILE_MAX_SIZE_KB` | Max file size (KB) to send as document | No | `100` |
| `STT_API_URL` | Whisper-compatible API base URL (enables voice/audio transcription) | No | — |
| `STT_API_KEY` | API key for your STT provider | No | — |
| `STT_MODEL` | STT model name passed to `/audio/transcriptions` | No | `whisper-large-v3-turbo` |
| `STT_LANGUAGE` | Optional language hint (empty = provider auto-detect) | No | — |
| `TTS_API_URL` | TTS API base URL | No | — |
| `TTS_API_KEY` | TTS API key | No | — |
| `TTS_MODEL` | TTS model name passed to `/audio/speech` | No | `gpt-4o-mini-tts` |
| `TTS_VOICE` | OpenAI-compatible TTS voice name | No | `alloy` |
| `LOG_LEVEL` | Log level (`debug`, `info`, `warn`, `error`) | No | `info` |

> **Keep your `.env` file private.** It contains your bot token. Never commit it to version control.

@@ -184,6 +190,17 @@ If `STT_API_URL` and `STT_API_KEY` are set, the bot will:
3. Show recognized text in chat
4. Send the recognized text to OpenCode as a normal prompt
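The steps above boil down to a single Whisper-compatible request. A minimal sketch, not the bot's actual STT client: `sttEndpoint` and `transcribe` are illustrative names, and only the endpoint shape (`POST {base}/audio/transcriptions` with multipart form data) is assumed from the variables documented here.

```typescript
// Illustrative sketch of a Whisper-compatible transcription call.
// `sttEndpoint` and `transcribe` are hypothetical names, not the bot's real API.

export function sttEndpoint(baseUrl: string): string {
  // Normalize a trailing slash so both ".../v1" and ".../v1/" work.
  return `${baseUrl.replace(/\/+$/, "")}/audio/transcriptions`;
}

export async function transcribe(
  baseUrl: string,  // STT_API_URL
  apiKey: string,   // STT_API_KEY
  model: string,    // STT_MODEL, e.g. whisper-large-v3-turbo
  audio: Blob,      // downloaded Telegram voice note
  language?: string, // STT_LANGUAGE; empty = provider auto-detect
): Promise<string> {
  const form = new FormData();
  form.append("file", audio, "voice.ogg");
  form.append("model", model);
  if (language) form.append("language", language);

  const res = await fetch(sttEndpoint(baseUrl), {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!res.ok) throw new Error(`STT request failed: ${res.status}`);
  const data = (await res.json()) as { text: string };
  return data.text;
}
```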

If TTS credentials are configured, you can toggle spoken replies globally with `/tts`. The preference is stored in `settings.json` and persists across restarts.

TTS configuration example:

```env
TTS_API_URL=https://api.openai.com/v1
TTS_API_KEY=your-tts-api-key
TTS_MODEL=gpt-4o-mini-tts
TTS_VOICE=alloy
```
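With those variables set, an OpenAI-compatible speech request is a plain JSON POST that returns raw audio bytes. A minimal sketch under the assumption of the standard `/audio/speech` endpoint shape; `ttsEndpoint` and `synthesize` are illustrative names, not the bot's actual TTS client:

```typescript
// Illustrative sketch of an OpenAI-compatible text-to-speech call.
// `ttsEndpoint` and `synthesize` are hypothetical names.

export function ttsEndpoint(baseUrl: string): string {
  // Normalize a trailing slash so both ".../v1" and ".../v1/" work.
  return `${baseUrl.replace(/\/+$/, "")}/audio/speech`;
}

export async function synthesize(
  baseUrl: string, // TTS_API_URL
  apiKey: string,  // TTS_API_KEY
  model: string,   // TTS_MODEL, e.g. gpt-4o-mini-tts
  voice: string,   // TTS_VOICE, e.g. alloy
  input: string,   // the assistant reply to speak
): Promise<ArrayBuffer> {
  const res = await fetch(ttsEndpoint(baseUrl), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model, voice, input }),
  });
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  // The response body is audio bytes, ready to send back as a Telegram audio reply.
  return res.arrayBuffer();
}
```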

Supported provider examples (Whisper-compatible):

- **OpenAI**
1 change: 1 addition & 0 deletions src/bot/commands/definitions.ts
@@ -25,6 +25,7 @@ const COMMAND_DEFINITIONS: BotCommandI18nDefinition[] = [
{ command: "new", descriptionKey: "cmd.description.new" },
{ command: "abort", descriptionKey: "cmd.description.stop" },
{ command: "sessions", descriptionKey: "cmd.description.sessions" },
{ command: "tts", descriptionKey: "cmd.description.tts" },
{ command: "projects", descriptionKey: "cmd.description.projects" },
{ command: "task", descriptionKey: "cmd.description.task" },
{ command: "tasklist", descriptionKey: "cmd.description.tasklist" },
5 changes: 4 additions & 1 deletion src/bot/commands/status.ts
@@ -1,7 +1,7 @@
import { CommandContext, Context } from "grammy";
import { opencodeClient } from "../../opencode/client.js";
import { getCurrentSession } from "../../session/manager.js";
import { getCurrentProject } from "../../settings/manager.js";
import { getCurrentProject, isTtsEnabled } from "../../settings/manager.js";
import { fetchCurrentAgent } from "../../agent/manager.js";
import { getAgentDisplayName } from "../../agent/types.js";
import { fetchCurrentModel } from "../../model/manager.js";
@@ -26,6 +26,9 @@ export async function statusCommand(ctx: CommandContext<Context>) {
if (data.version) {
message += `${t("status.line.version", { version: data.version })}\n`;
}
message += `${t("status.line.tts", {
tts: isTtsEnabled() ? t("status.tts.on") : t("status.tts.off"),
})}\n`;

// Add process management information
if (processManager.isRunning()) {
19 changes: 19 additions & 0 deletions src/bot/commands/tts.ts
@@ -0,0 +1,19 @@
import { CommandContext, Context } from "grammy";
import { isTtsConfigured } from "../../tts/client.js";
import { isTtsEnabled, setTtsEnabled } from "../../settings/manager.js";
import { t } from "../../i18n/index.js";

export async function ttsCommand(ctx: CommandContext<Context>): Promise<void> {
const enabled = !isTtsEnabled();

if (enabled && !isTtsConfigured()) {
await ctx.reply(t("tts.not_configured"));
return;
}

setTtsEnabled(enabled);

const message = enabled ? t("tts.enabled") : t("tts.disabled");

await ctx.reply(message);
}
28 changes: 27 additions & 1 deletion src/bot/handlers/prompt.ts
@@ -3,7 +3,7 @@ import type { FilePartInput, TextPartInput } from "@opencode-ai/sdk/v2";
import { opencodeClient } from "../../opencode/client.js";
import { clearSession, getCurrentSession, setCurrentSession } from "../../session/manager.js";
import { ingestSessionInfoForCache } from "../../session/cache-manager.js";
import { getCurrentProject } from "../../settings/manager.js";
import { getCurrentProject, isTtsEnabled } from "../../settings/manager.js";
import { getStoredAgent } from "../../agent/manager.js";
import { getStoredModel } from "../../model/manager.js";
import { formatVariantForButton } from "../../variant/manager.js";
@@ -23,6 +23,13 @@ import { foregroundSessionState } from "../../scheduled-task/foreground-state.js
/** Module-level references for async callbacks that don't have ctx. */
let botInstance: Bot<Context> | null = null;
let chatIdInstance: number | null = null;
const promptResponseModes = new Map<string, PromptResponseMode>();

export type PromptResponseMode = "text_only" | "text_and_tts";

type ProcessPromptOptions = {
responseMode?: PromptResponseMode;
};

export function getPromptBotInstance(): Bot<Context> | null {
return botInstance;
@@ -32,6 +39,20 @@ export function getPromptChatId(): number | null {
return chatIdInstance;
}

export function setPromptResponseMode(sessionId: string, responseMode: PromptResponseMode): void {
promptResponseModes.set(sessionId, responseMode);
}

export function clearPromptResponseMode(sessionId: string): void {
promptResponseModes.delete(sessionId);
}

export function consumePromptResponseMode(sessionId: string): PromptResponseMode | null {
const responseMode = promptResponseModes.get(sessionId) ?? null;
promptResponseModes.delete(sessionId);
return responseMode;
}

async function isSessionBusy(sessionId: string, directory: string): Promise<boolean> {
try {
const { data, error } = await opencodeClient.session.status({ directory });
@@ -93,8 +114,10 @@ export async function processUserPrompt(
text: string,
deps: ProcessPromptDeps,
fileParts: FilePartInput[] = [],
options: ProcessPromptOptions = {},
): Promise<boolean> {
const { bot, ensureEventSubscription } = deps;
const responseMode = options.responseMode ?? (isTtsEnabled() ? "text_and_tts" : "text_only");

const currentProject = getCurrentProject();
if (!currentProject) {
@@ -263,6 +286,7 @@
);

foregroundSessionState.markBusy(currentSession.id);
setPromptResponseMode(currentSession.id, responseMode);

// CRITICAL: DO NOT wait for session.prompt to complete.
// If we wait, the handler will not finish and grammY will not call getUpdates,
@@ -274,6 +298,7 @@
onSuccess: ({ error }) => {
if (error) {
foregroundSessionState.markIdle(currentSession.id);
clearPromptResponseMode(currentSession.id);
const details = formatErrorDetails(error, 6000);
logger.error(
"[Bot] OpenCode API returned an error for session.prompt",
@@ -291,6 +316,7 @@
},
onError: (error) => {
foregroundSessionState.markIdle(currentSession.id);
clearPromptResponseMode(currentSession.id);
const details = formatErrorDetails(error, 6000);
logger.error("[Bot] session.prompt background task failed", promptErrorLogContext);
logger.error("[Bot] session.prompt background failure details:", details);