Merged
6 changes: 5 additions & 1 deletion .github/skills/coc-knowledge/references/sdk-wrapper.md
@@ -32,7 +32,7 @@ Location: `packages/coc-agent-sdk/src/`
| `model-reasoning.ts` | Metadata-aware model/reasoning resolver; variant IDs with `capabilities.family` sent as base model + reasoning effort |
| `mcp-config-loader.ts` | Loads/merges MCP config from `~/.copilot/mcp-config.json`, workspace `.vscode/mcp.json`, and explicit request options |
| `trusted-folder.ts` | Pre-registers working directories in `~/.copilot/config.json` |
| `image-converter.ts` | Image file data-URL conversion |
| `image-converter.ts` | Image file detection plus data-URL/base64 conversion helpers |
| `tool-call.ts` | `ToolCall`, `ToolCallStatus`, `ToolCallPermissionRequest`, serialization types |
| `model-info.ts` | `ModelInfo` type (id, name, description, tier, …) |
| `logger.ts` | `initSDKLogger` / `resetSDKLogger` / `getSDKLogger` — pino logger lifecycle |
@@ -95,6 +95,8 @@ Codex SDK thread options do not expose Copilot's native `skillDirectories` or `d

Codex permission mode is mapped at the provider boundary with `approvalPolicy: 'never'` for every CoC mode. Interactive/ask mode and omitted mode use `sandboxMode: 'read-only'` with network access disabled. Plan mode uses the same full-access Codex sandbox as autopilot (`sandboxMode: 'danger-full-access'`, network access enabled) and relies on CoC's read-only/plan system prompt rather than Codex sandbox enforcement.
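
The mapping described above can be sketched as a small pure function. This is a minimal illustration, not the actual `CodexSDKService` code: the `CocPermissionMode` type, the `CodexSandboxSettings` shape, and the `mapCodexPermissionMode` helper are hypothetical names, and `networkAccessEnabled` is assumed as the field that carries the network setting.

```typescript
// Hypothetical names for illustration only; the mode strings are assumed from the prose above.
type CocPermissionMode = 'interactive' | 'ask' | 'plan' | 'autopilot' | undefined;

interface CodexSandboxSettings {
  approvalPolicy: 'never';
  sandboxMode: 'read-only' | 'danger-full-access';
  networkAccessEnabled: boolean; // assumed field name
}

function mapCodexPermissionMode(mode: CocPermissionMode): CodexSandboxSettings {
  // Plan shares autopilot's full-access sandbox; read-only behaviour in plan mode
  // is left to the system prompt rather than the sandbox.
  const fullAccess = mode === 'plan' || mode === 'autopilot';
  return {
    approvalPolicy: 'never', // the same for every CoC mode
    sandboxMode: fullAccess ? 'danger-full-access' : 'read-only',
    networkAccessEnabled: fullAccess,
  };
}
```

Under these assumptions, `mapCodexPermissionMode('ask')` and `mapCodexPermissionMode(undefined)` both yield a read-only sandbox with network access disabled.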

Codex image attachments are passed at the provider boundary as `@openai/codex-sdk` structured `local_image` inputs. When `SendMessageOptions.attachments` includes file attachments with supported raster image extensions (`png`, `jpg`/`jpeg`, `gif`, `webp`), `CodexSDKService` sends an input array containing the prompt text plus `{ type: 'local_image', path }` entries in attachment order. Directories, non-images, and SVGs are ignored so text-only behavior is preserved.
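
As a worked example of the filtering and ordering rules above, the following sketch builds the input for a mixed attachment list. It is a standalone approximation: `Attachment`, `extensionOf`, and `buildInput` are hypothetical stand-ins, and the `'directory'` attachment type is an assumption made for illustration.

```typescript
// Hypothetical stand-ins that mirror the behaviour described above.
type Attachment = { type: 'file' | 'directory'; path: string };
type CodexUserInput = { type: 'text'; text: string } | { type: 'local_image'; path: string };

const RASTER_EXTENSIONS = new Set(['png', 'jpg', 'jpeg', 'gif', 'webp']);

// Lower-cased extension without the dot; '' when the file name has none.
function extensionOf(filePath: string): string {
  const base = filePath.slice(Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\')) + 1);
  const dot = base.lastIndexOf('.');
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : '';
}

function buildInput(prompt: string, attachments: Attachment[]): string | CodexUserInput[] {
  const imagePaths = attachments
    .filter(a => a.type === 'file' && RASTER_EXTENSIONS.has(extensionOf(a.path)))
    .map(a => a.path);

  // No usable images: fall back to the plain string prompt.
  if (imagePaths.length === 0) return prompt;

  return [
    ...(prompt ? [{ type: 'text' as const, text: prompt }] : []),
    ...imagePaths.map(p => ({ type: 'local_image' as const, path: p })),
  ];
}

const input = buildInput('Describe these', [
  { type: 'file', path: '/tmp/a.PNG' },
  { type: 'file', path: '/tmp/logo.svg' },   // SVG: ignored
  { type: 'directory', path: '/tmp/shots' }, // directory: ignored
  { type: 'file', path: '/tmp/b.webp' },
]);
// input holds the text entry followed by local_image entries for a.PNG and b.webp, in that order.
```

Returning the bare string when nothing qualifies keeps text-only requests on the same code path they used before image support.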

**Thread ↔ session mapping:** Every CoC session ID maps to exactly one Codex thread. The mapping is created on the first `sendMessage()` call for a session and removed on abort or dispose.

**Authentication:** CoC does not own a Codex auth store or `/api/codex-auth/*` routes. Codex authentication is handled by the Codex SDK/CLI; hosts may still inject an optional `CodexAuthChecker` if they need a preflight gate before loading the SDK.
@@ -132,6 +134,8 @@ Claude model catalog discovery spawns the Claude Code CLI in `stream-json` proto

Claude Code permission mode is mapped at the provider boundary: CoC `autopilot` sends `permissionMode: 'bypassPermissions'` plus `allowDangerouslySkipPermissions: true`, CoC `plan` sends `permissionMode: 'plan'`, and all other modes (interactive/ask/undefined) send `permissionMode: 'acceptEdits'`. This ensures ask-mode sessions can create directories and write files within allowed working directories without blocking on permission prompts.
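
A minimal sketch of that three-way mapping follows. The `ClaudePermissionOptions` shape and the `mapClaudePermissionMode` helper are hypothetical names; only the option keys and values come from the paragraph above.

```typescript
// Hypothetical helper for illustration; not the actual ClaudeSDKService code.
interface ClaudePermissionOptions {
  permissionMode: 'bypassPermissions' | 'plan' | 'acceptEdits';
  allowDangerouslySkipPermissions?: true;
}

function mapClaudePermissionMode(mode?: string): ClaudePermissionOptions {
  switch (mode) {
    case 'autopilot':
      return { permissionMode: 'bypassPermissions', allowDangerouslySkipPermissions: true };
    case 'plan':
      return { permissionMode: 'plan' };
    default:
      // interactive, ask, and undefined all accept edits without prompting
      return { permissionMode: 'acceptEdits' };
  }
}
```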

Claude image attachments are converted at the provider boundary. When `SendMessageOptions.attachments` includes readable file images with extensions Claude supports as base64 blocks (`png`, `jpg`/`jpeg`, `gif`, `webp`), `ClaudeSDKService` sends a one-shot streaming user message containing the prompt text plus image blocks. Unsupported files, directories, SVGs, missing files, and files over the shared image conversion limit are ignored so text-only behavior is preserved.

`ClaudeSDKService` widens the agent's filesystem permission scope via the SDK's `additionalDirectories` option (`resolveAdditionalDirectories`). It always grants access to `~/.coc` (CoC data/skills dir) and the system temp directory (`os.tmpdir()`) so out-of-repo skill files and temp artifacts remain readable beyond the per-request `workingDirectory`/`cwd`. Any caller-supplied `SendMessageOptions.additionalDirectories` are merged in; all entries are resolved to absolute paths and de-duplicated (case-insensitively on Windows).
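
The merge, resolve, and de-duplicate steps can be approximated as below. This is a sketch under stated assumptions rather than the real `resolveAdditionalDirectories`: the function name `resolveDirsSketch` and its `platform` parameter are hypothetical, and lower-casing is used as a simple stand-in for case-insensitive comparison on Windows.

```typescript
import * as os from 'os';
import * as path from 'path';

// Hypothetical sketch; the real helper may differ in ordering and details.
function resolveDirsSketch(extra: string[] = [], platform: string = process.platform): string[] {
  // Always-granted directories come first, then caller-supplied ones.
  const candidates = [path.join(os.homedir(), '.coc'), os.tmpdir(), ...extra];
  const seen = new Set<string>();
  const result: string[] = [];

  for (const dir of candidates) {
    const abs = path.resolve(dir); // relative entries resolve against the current working directory
    const key = platform === 'win32' ? abs.toLowerCase() : abs;
    if (!seen.has(key)) {
      seen.add(key);
      result.push(abs);
    }
  }
  return result;
}
```

Passing `os.tmpdir()` again as an extra entry leaves the result unchanged, because the second copy resolves to a key that is already in the set.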

`ClaudeSDKService` wires CoC LLM tools and any caller-provided `mcpServers` into `query({ options: { mcpServers } })`; CoC tools ride a stdio bridge entry (`coc_llm_tools`, `alwaysLoad: true`), are pre-approved via `options.allowedTools` (`mcp__coc_llm_tools__<tool>`) so Claude Code never prompts for them, and bridged `tool_use` names are de-namespaced (see *CoC LLM Tools over MCP*).
56 changes: 54 additions & 2 deletions packages/coc-agent-sdk/src/claude-sdk-service.ts
@@ -33,12 +33,14 @@ import type { ToolEvent } from './types';
import type { ISDKService, IAvailabilityResult, IModelInfo, IInvocationResult } from './sdk-service-interface';
import type { IAccountQuotaResult, IAccountQuotaSnapshot } from './copilot-sdk-service';
import type { ToolCall } from './tool-call';
import type { ClaudeImageSource } from './image-converter';
import { sdkServiceRegistry, CLAUDE_PROVIDER } from './sdk-service-registry';
import { dynamicImportModule } from './sdk-esm-loader';
import { getSDKLogger } from './logger';
import { CocToolRuntime } from './llm-tools/coc-tool-runtime';
import { cocToolBridgeServer } from './llm-tools/bridge-server';
import { buildCocLlmToolsMcpConfig, COC_LLM_TOOLS_MCP_SERVER_NAME } from './llm-tools/mcp-config';
import { tryReadImageAsBase64 } from './image-converter';
import { spawn } from 'child_process';
import { createRequire } from 'module';
import * as crypto from 'crypto';
@@ -58,6 +60,15 @@ interface ClaudeTextBlock {
text: string;
}

interface ClaudeImageBlock {
type: 'image';
source: {
type: 'base64';
media_type: ClaudeImageSource['media_type'];
data: string;
};
}

interface ClaudeToolUseBlock {
type: 'tool_use';
id: string;
@@ -116,6 +127,16 @@ interface ClaudeUserMessage {
session_id?: string;
}

interface ClaudeStreamingUserMessage {
type: 'user';
message: {
role: 'user';
content: Array<ClaudeTextBlock | ClaudeImageBlock>;
};
parent_tool_use_id: null;
session_id?: string;
}

export interface ClaudeRateLimitInfo {
status: 'allowed' | 'allowed_warning' | 'rejected';
resetsAt?: number;
@@ -148,7 +169,7 @@ type ClaudeMcpServerConfig =
| { type: 'sse'; url: string; headers?: Record<string, string> };

interface ClaudeQueryOptions {
prompt: string;
prompt: string | AsyncIterable<ClaudeStreamingUserMessage>;
abortController?: AbortController;
options?: {
cwd?: string;
@@ -542,7 +563,7 @@ export class ClaudeSDKService implements ISDKService {
const { servers: mcpServers, allowedTools, cleanup } = await this.buildClaudeMcpServers(options);
mcpCleanup = cleanup;
const queryOptions: ClaudeQueryOptions = {
prompt: options.prompt ?? '',
prompt: this.buildClaudePrompt(options),
abortController,
options: {
...(options.workingDirectory ? { cwd: options.workingDirectory } : {}),
@@ -620,6 +641,37 @@ export class ClaudeSDKService implements ISDKService {
}
}

private buildClaudePrompt(options: SendMessageOptions): string | AsyncIterable<ClaudeStreamingUserMessage> {
const text = options.prompt ?? '';
const images = (options.attachments ?? [])
.filter(attachment => attachment.type === 'file')
.map(attachment => tryReadImageAsBase64(attachment.path))
.filter((image): image is ClaudeImageSource => image !== null);

if (images.length === 0) return text;

const content: Array<ClaudeTextBlock | ClaudeImageBlock> = [
...(text ? [{ type: 'text' as const, text }] : []),
...images.map(image => ({
type: 'image' as const,
source: {
type: 'base64' as const,
media_type: image.media_type,
data: image.data,
},
})),
];
const message: ClaudeStreamingUserMessage = {
type: 'user',
message: { role: 'user', content },
parent_tool_use_id: null,
};

return (async function* () {
yield message;
})();
}

/**
* Build the `mcpServers` map for the Claude Code session.
*
24 changes: 21 additions & 3 deletions packages/coc-agent-sdk/src/codex-sdk-service.ts
@@ -28,6 +28,7 @@ import { execFileAsync } from './internal/exec-utils';
import { CocToolRuntime } from './llm-tools/coc-tool-runtime';
import { cocToolBridgeServer } from './llm-tools/bridge-server';
import { buildCocLlmToolsMcpConfig, COC_LLM_TOOLS_MCP_SERVER_NAME } from './llm-tools/mcp-config';
import { isSupportedCodexImagePath } from './image-converter';
import { spawn } from 'child_process';
import { createRequire } from 'module';
import * as readline from 'readline';
@@ -83,17 +84,20 @@ export type CodexAuthChecker = () => CodexAuthCheckResult;
// ============================================================================

/** A running Codex thread that can be used to send messages and stream output. */
type CodexUserInput = { type: 'text'; text: string } | { type: 'local_image'; path: string };
type CodexInput = string | CodexUserInput[];

interface CodexThread {
/** Unique ID assigned by the Codex service after the first turn starts. */
readonly id: string | null;
/**
* Run the thread with a prompt, resolving with the completed turn.
*/
run(input: string, options?: CodexTurnOptions): Promise<CodexThreadResult>;
run(input: CodexInput, options?: CodexTurnOptions): Promise<CodexThreadResult>;
/**
* Run the thread with a prompt and stream structured events.
*/
runStreamed(input: string, options?: CodexTurnOptions): Promise<{ events: AsyncGenerator<CodexThreadEvent> }>;
runStreamed(input: CodexInput, options?: CodexTurnOptions): Promise<{ events: AsyncGenerator<CodexThreadEvent> }>;
}

interface CodexTurnOptions {
@@ -519,6 +523,20 @@ export class CodexSDKService implements ISDKService {
return mapCodexRateLimitsToQuota(result);
}

private buildCodexInput(options: SendMessageOptions): CodexInput {
const text = options.prompt ?? '';
const imagePaths = (options.attachments ?? [])
.filter(attachment => attachment.type === 'file' && isSupportedCodexImagePath(attachment.path))
.map(attachment => attachment.path);

if (imagePaths.length === 0) return text;

return [
...(text ? [{ type: 'text' as const, text }] : []),
...imagePaths.map(imagePath => ({ type: 'local_image' as const, path: imagePath })),
];
}

// ── Message dispatch ──────────────────────────────────────────────────────

public async sendMessage(options: SendMessageOptions): Promise<IInvocationResult> {
@@ -595,7 +613,7 @@ export class CodexSDKService implements ISDKService {
if (thread.id) notifySessionCreated(thread.id);

const chunks: string[] = [];
const streamed = await thread.runStreamed(options.prompt ?? '', { signal: abortController.signal });
const streamed = await thread.runStreamed(this.buildCodexInput(options), { signal: abortController.signal });

for await (const event of streamed.events) {
if (event.type === 'thread.started') {
2 changes: 1 addition & 1 deletion packages/coc-agent-sdk/src/copilot-sdk-service.ts
@@ -57,7 +57,7 @@ export {
denyAllPermissions,
} from './types';

export { tryConvertImageFileToDataUrl } from './image-converter';
export { tryConvertImageFileToDataUrl, tryReadImageAsBase64 } from './image-converter';

export interface IAccountQuotaSnapshot {
isUnlimitedEntitlement: boolean;
62 changes: 53 additions & 9 deletions packages/coc-agent-sdk/src/image-converter.ts
@@ -17,23 +17,67 @@ const IMAGE_EXTENSIONS: Record<string, string> = {
/** Max image file size we'll convert to a data URL (10 MB). */
const MAX_IMAGE_FILE_SIZE = 10 * 1024 * 1024;

/**
* If `filePath` points to a readable image file (by extension), read it and
* return a `data:image/<mime>;base64,…` string. Returns `null` when the file
* is not an image, doesn't exist, is too large, or any other error occurs.
*/
export function tryConvertImageFileToDataUrl(filePath: string): string | null {
export interface ClaudeImageSource {
media_type: 'image/jpeg' | 'image/png' | 'image/gif' | 'image/webp';
data: string;
}

const CLAUDE_IMAGE_MEDIA_TYPES: Record<string, ClaudeImageSource['media_type']> = {
png: 'image/png',
jpg: 'image/jpeg',
jpeg: 'image/jpeg',
gif: 'image/gif',
webp: 'image/webp',
};

const CODEX_IMAGE_EXTENSIONS = new Set(Object.keys(CLAUDE_IMAGE_MEDIA_TYPES));

export function isImageFilePath(filePath: string): boolean {
const ext = path.extname(filePath).replace(/^\./, '').toLowerCase();
return ext in IMAGE_EXTENSIONS;
}

export function isSupportedCodexImagePath(filePath: string): boolean {
const ext = path.extname(filePath).replace(/^\./, '').toLowerCase();
return CODEX_IMAGE_EXTENSIONS.has(ext);
}

function readImageFile(filePath: string, mimeByExtension: Record<string, string>): { mime: string; data: Buffer } | null {
try {
const ext = path.extname(filePath).replace(/^\./, '').toLowerCase();
const mime = IMAGE_EXTENSIONS[ext];
const mime = mimeByExtension[ext];
if (!mime) return null;

const stat = fs.statSync(filePath);
if (!stat.isFile() || stat.size > MAX_IMAGE_FILE_SIZE) return null;

const data = fs.readFileSync(filePath);
return `data:${mime};base64,${data.toString('base64')}`;
return { mime, data: fs.readFileSync(filePath) };
} catch {
return null;
}
}

/**
* If `filePath` points to a readable image file (by extension), read it and
* return a `data:image/<mime>;base64,…` string. Returns `null` when the file
* is not an image, doesn't exist, is too large, or any other error occurs.
*/
export function tryConvertImageFileToDataUrl(filePath: string): string | null {
const image = readImageFile(filePath, IMAGE_EXTENSIONS);
if (!image) return null;
return `data:${image.mime};base64,${image.data.toString('base64')}`;
}

/**
* Return a Claude-compatible base64 source for supported raster image files.
* SVG is intentionally excluded because Claude's base64 image source does not
* accept `image/svg+xml`.
*/
export function tryReadImageAsBase64(filePath: string): ClaudeImageSource | null {
const image = readImageFile(filePath, CLAUDE_IMAGE_MEDIA_TYPES);
if (!image) return null;
return {
media_type: image.mime as ClaudeImageSource['media_type'],
data: image.data.toString('base64'),
};
}
6 changes: 6 additions & 0 deletions packages/coc-agent-sdk/src/index.ts
@@ -71,8 +71,14 @@ export {
CopilotSDKService,
resetCopilotSDKService,
tryConvertImageFileToDataUrl,
tryReadImageAsBase64,
} from './copilot-sdk-service';

export {
isImageFilePath,
isSupportedCodexImagePath,
} from './image-converter';

export type { BackgroundTasksInfo, IAccountQuotaSnapshot, IAccountQuotaResult } from './copilot-sdk-service';

export type {