Add image generation to chat UI#2708

Open
vibegui wants to merge 7 commits into main from vibegui/image-gen-chat

@vibegui vibegui commented Mar 15, 2026

What is this contribution about?

Implements image generation as a first-class feature in the chat UI. Adds an "Image" toggle button near the model selector that switches the chat to image generation mode. When the mode is active, the model selector shows only image-capable models (such as OpenRouter's Nano Banana 2), aspect ratio controls appear (1:1, 16:9, 9:16), and the user's text becomes the image prompt. Images are generated with the AI SDK's generateImage() using OpenRouter's image models and rendered inline in chat with rounded corners, a fade-in animation, and a download button on hover.

How to Test

  1. Start dev: bun run dev
  2. Click the image button (Image01 icon) in the chat input action row
  3. Verify the model selector filters to image-capable models only
  4. Select an aspect ratio from the inline picker
  5. Type a prompt like "a cat wearing a top hat" and send
  6. Verify image appears inline with rounded corners and fade-in animation
  7. Hover over the image and click the download button to save
  8. Toggle image mode off and verify controls return to normal
  9. Refresh the page and navigate back to the thread to verify images persist

Migration Notes

No database migrations required. Images are stored as base64 in message threads, same as existing file attachments.

Review Checklist

  • PR title is clear and descriptive
  • Changes are tested and working
  • bun run fmt and bun run check pass
  • No breaking changes
  • Image generation works with OpenRouter models including Nano Banana 2

🤖 Generated with Claude Code


Summary by cubic

Adds inline image generation to chat with a dedicated image model picker and a generate_image tool. Images stream as file parts and, when object storage is available, are stored and served via /api/files with a download button; falls back to base64 if not.

  • New Features

    • Image model picker next to the model control; filters to image-generation models with a new capability icon; works with OpenRouter.
    • Aspect ratio chips (1:1, 16:9, 9:16); hides file upload while active; clear to return to text-only; resets on new/switch thread and isn’t persisted.
    • Server/agents: generate_image calls generateImage() and writes a file part; StreamRequestSchema adds imageModel { id, aspectRatio }; stream-core passes image config and adds an image-generation hint.
    • Persistence: store generated images in object storage and serve via /api/files; fall back to inline base64 when storage isn’t configured.
  • Bug Fixes

    • Server hardening: guard for image capability, validate aspectRatio enum (incl. 4:3, 3:4), allowlist mediaType, add monitorLlmCall, prevent double FINISH, return friendly errors.
    • UI polish: disable image button when no image models, fix Firefox downloads/extension parsing, trim parenthetical suffix from compact model names.

Written for commit 4bb39de. Summary will update on new commits.
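
For readers following the server hardening items above, here is a minimal sketch of the aspect ratio and mediaType checks. It is not the PR's code: the function names are hypothetical, and the media-type list is an assumption, since the summary does not say which types are allowlisted. The ratio values are the ones named in the summary.

```typescript
// Sketch only: names and the media-type list are assumptions, not the PR's code.
const ASPECT_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4"] as const;
type AspectRatio = (typeof ASPECT_RATIOS)[number];

// Assumed allowlist; the PR does not enumerate the accepted media types.
const ALLOWED_MEDIA_TYPES = ["image/png", "image/jpeg", "image/webp"] as const;
type AllowedMediaType = (typeof ALLOWED_MEDIA_TYPES)[number];

/** True only for one of the supported ratio strings. */
function isAspectRatio(value: unknown): value is AspectRatio {
  return (
    typeof value === "string" &&
    (ASPECT_RATIOS as readonly string[]).indexOf(value) !== -1
  );
}

/** Returns the normalized media type, or null when it is not allowlisted. */
function parseMediaType(value: unknown): AllowedMediaType | null {
  if (typeof value !== "string") return null;
  const normalized = value.trim().toLowerCase();
  return (ALLOWED_MEDIA_TYPES as readonly string[]).indexOf(normalized) !== -1
    ? (normalized as AllowedMediaType)
    : null;
}
```

Returning null for an unknown media type lets the caller reject the response rather than relabel the bytes as some default type.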

@github-actions

🧪 Benchmark

Should we run the Virtual MCP strategy benchmark for this PR?

React with 👍 to run the benchmark.

Reaction Action
👍 Run quick benchmark (10 & 128 tools)

Benchmark will run on the next push after you react.

github-actions bot commented Mar 15, 2026

Release Options

Suggested: Patch (2.205.3) — default (no conventional commit prefix detected)

React with an emoji to override the release type:

Reaction Type Next Version
👍 Prerelease 2.205.3-alpha.1
🎉 Patch 2.205.3
❤️ Minor 2.206.0
🚀 Major 3.0.0

Current version: 2.205.2

Note: If multiple reactions exist, the smallest bump wins. If no reactions, the suggested bump is used (default: patch).

@cubic-dev-ai cubic-dev-ai bot left a comment

5 issues found across 11 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/api/routes/decopilot/schemas.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/schemas.ts:91">
P2: Restrict `imageMode.aspectRatio` to the supported ratio values instead of accepting any string.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/input.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/input.tsx:527">
P2: Hiding the upload button in image mode does not disable drag-and-drop uploads, because the editor still mounts `FileUploader`. That leaves image mode accepting files that the backend ignores or rejects.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/select-model.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/select-model.tsx:783">
P1: Filtering the selector in image mode does not enforce an image-capable selected model, so image requests can still be sent with the previously selected text model.</violation>
</file>

<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:241">
P1: Validate image-model support on the server before calling `imageModel()`. Otherwise invalid image-mode requests fail at runtime after the message has already been saved.</violation>

<violation number="2" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:318">
P2: Handle aborted image requests before marking the run failed. As written, cancelling image generation is recorded as a failed thread.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
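
One way to address the aborted-request finding in stream-core.ts is to classify the error before recording the run status. The sketch below is an illustration under assumptions: classifyRunError and the status names are hypothetical, and it relies only on the standard behavior that an aborted signal has `aborted === true` and that abort errors carry the name "AbortError".

```typescript
// Sketch only: hypothetical helper, not taken from stream-core.ts.
type FailureStatus = "cancelled" | "failed";

/**
 * Decides how a thrown error should be recorded on the thread.
 * A user-initiated abort is reported as "cancelled", anything else as "failed".
 */
function classifyRunError(
  error: unknown,
  signal?: { aborted: boolean },
): FailureStatus {
  // The request's abort signal is the most reliable indicator of cancellation.
  if (signal !== undefined && signal.aborted) return "cancelled";
  // Abort errors raised by fetch-style APIs are named "AbortError".
  if (error instanceof Error && error.name === "AbortError") return "cancelled";
  return "failed";
}
```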

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/image-mode-toggle.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/image-mode-toggle.tsx:51">
P1: Guard enabling image mode until an image-capable model is available; otherwise image requests can be sent with the current text model and fail at runtime.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/store/chat-store.ts">

<violation number="1" location="apps/mesh/src/web/components/chat/store/chat-store.ts:432">
P1: Preserve the current `credentialId` when auto-selecting an image model; otherwise the stored model can lose its connection and later be sent with the wrong key.

(Based on your team's feedback about treating the chat model and credential as an atomic pair.) [FEEDBACK_USED]</violation>

<violation number="2" location="apps/mesh/src/web/components/chat/store/chat-store.ts:438">
P2: Don’t enter image mode unless an image-capable model is available; right now the store can keep the old text model selected and send an invalid image request.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/select-model.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/select-model.tsx:788">
P1: Guard the image-mode model list against connections that have no image-generation models. As written, image mode can be enabled with an empty selector while requests still use the previous non-image model and fail.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
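
The chat-store findings in this review share one idea: only enter image mode when an image-capable model exists, and keep the current credentialId paired with whichever model is selected. A minimal sketch, assuming hypothetical ModelOption and SelectedModel shapes (the real store types are not shown in this thread) and the "image-generation" capability name used in the commits:

```typescript
// Sketch only: these shapes are assumptions, not the actual chat-store types.
interface ModelOption {
  id: string;
  capabilities: string[];
}

interface SelectedModel {
  id: string;
  credentialId: string;
}

/**
 * Returns the model to use for image mode, or null when none is available.
 * The current credentialId is carried over so model and credential stay paired.
 */
function selectImageModel(
  available: ModelOption[],
  current: SelectedModel,
): SelectedModel | null {
  for (const model of available) {
    if (model.capabilities.indexOf("image-generation") !== -1) {
      return { id: model.id, credentialId: current.credentialId };
    }
  }
  // No image-capable model: the caller should not enable image mode.
  return null;
}
```

A null result gives the UI a single signal for both disabling the toggle and refusing to switch modes.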

@cubic-dev-ai cubic-dev-ai bot left a comment

4 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/web/components/chat/store/chat-store.ts">

<violation number="1" location="apps/mesh/src/web/components/chat/store/chat-store.ts:217">
P2: Switching threads while image mode is active loses the previously selected text model. `setActiveThread()` clears `_previousModel` and disables `imageMode` directly, so the temporary image model stays selected in normal chat instead of being restored.</violation>
</file>

<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:272">
P2: Move the new `monitorLlmCall` success/error reporting so only the `generateImage()` result determines model success; otherwise write failures can produce contradictory monitoring events.</violation>

<violation number="2" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:291">
P1: Do not relabel unsupported image bytes as `image/png`; reject the type or preserve the original MIME type.</violation>
</file>

<file name="apps/mesh/src/web/components/chat/image-mode-toggle.tsx">

<violation number="1" location="apps/mesh/src/web/components/chat/image-mode-toggle.tsx:50">
P3: The new availability check disables the button without applying the disabled styles, so it still looks clickable when no image-capable models are available.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

await saveMessagesToThread(requestMessage);

// ================================================================
// Image generation mode — skip MCP/tool setup, call generateImage
Collaborator

Image generation should be a built-in tool, not an `if` block that duplicates existing code.

Contributor Author

Refactored — image generation is now a generate_image built-in tool (like subtask or sandbox). The ~195-line if-block is gone. The image model is selected via a separate picker and passed to the tool, while the language model handles the agentic loop.

vibegui force-pushed the vibegui/image-gen-chat branch from 44b1872 to c2abc8c on March 19, 2026 21:15
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 12 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/mesh/src/api/routes/decopilot/stream-core.ts">

<violation number="1" location="apps/mesh/src/api/routes/decopilot/stream-core.ts:330">
P2: Guard the image-generation prompt the same way the tool registration is guarded; otherwise the model can be told to call `generate_image` when the tool is not available for the selected provider.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
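
The finding above amounts to deriving the tool registration and the prompt hint from the same condition. A minimal sketch, assuming hypothetical names (buildImageSetup, AgentSetup, and the hint wording are illustrative; only the generate_image tool name and the `{ id, aspectRatio }` shape come from this PR):

```typescript
// Sketch only: hypothetical helper showing one shared guard for tool and hint.
interface ImageModelConfig {
  id: string;
  aspectRatio: string;
}

interface AgentSetup {
  toolNames: string[];
  systemHints: string[];
}

/**
 * Registers generate_image and its prompt hint together, or neither.
 * This keeps the model from being told about a tool it cannot call.
 */
function buildImageSetup(
  imageModel: ImageModelConfig | undefined,
  providerSupportsImages: boolean,
): AgentSetup {
  if (imageModel === undefined || !providerSupportsImages) {
    return { toolNames: [], systemHints: [] };
  }
  return {
    toolNames: ["generate_image"],
    systemHints: [
      `Use the generate_image tool (aspect ratio ${imageModel.aspectRatio}) when the user asks for an image.`,
    ],
  };
}
```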

vibegui force-pushed the vibegui/image-gen-chat branch 2 times, most recently from 3430cc5 to 1b0bcbb on March 20, 2026 01:08
vibegui and others added 7 commits March 25, 2026 14:21
Implement image generation in the chat UI using OpenRouter's image models through the AI SDK.
Add an "Image" toggle button that filters the model selector to image-capable models, shows an
aspect ratio picker, and generates images inline in chat. Generated images are stored
as base64 in message threads and render with download-on-hover functionality.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Move Image button to right side near model picker for stable positioning,
add "image-generation" capability to distinguish output from input modalities,
auto-select Gemini model when entering image mode, and filter model picker
to only show image generation models.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add server-side capability guard before calling imageModel()
- Validate aspectRatio as enum instead of free-form string
- Allowlist mediaType from provider response (prevent injection)
- Add monitorLlmCall to image path for observability parity
- Add streamFinished guard to prevent double FINISH dispatch
- Reset imageMode on thread switch, clear _previousModel on reset
- Disable Image toggle when no image models available
- Fix Firefox download (append anchor to DOM before click)
- Clean up mediaType extension parsing for downloads
- Revert unrelated conductor.json change
- Add IMAGE-GEN-FOLLOWUPS.md tracking deferred items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… chat

- Never persist image model to localStorage so refresh always restores text model
- Reset image mode and restore text model on createThread and setActiveThread
- Strip parenthetical suffix from model names in compact trigger display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of raw "No image generated" error, show a clear message
explaining that image mode is for generating images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the ~195-line `if (input.imageMode)` block in stream-core.ts with a
`generate_image` built-in tool that runs inside the normal streamText agentic
loop. The image model is now selected via a dedicated picker (separate from
the language model selector), and the tool handles generateImage() calls,
metrics, and error handling internally.

Key changes:
- New `generate_image` built-in tool following subtask/sandbox pattern
- New `ImageModelSelector` component replaces `ImageModeToggle`
- Selecting an image model enables image mode; clearing exits it
- Language model stays selected for streamText; image model is separate
- Remove model save/restore (_previousModel) logic from chat store
- Remove imageMode filtering from text model selector

Addresses PR review feedback from @pedrofrxncx.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Store generated images in object storage when available, serving them
via /api/files instead of embedding large base64 data URLs in messages.
Falls back to inline base64 when object storage is not configured.

Also fixes tool error handling: return error strings instead of throwing
from tool execute, which crashed the entire SSE stream.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
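
The last commit describes two behaviors that can be sketched as small pure helpers: choosing between an /api/files URL and an inline base64 data URL, and turning a thrown error into a string the tool can return. This is an illustration under assumptions: the helper names, the GeneratedImage shape, the `/api/files/<key>` path layout, and the error wording are hypothetical and not taken from the PR.

```typescript
// Sketch only: names, URL layout, and message wording are assumptions.
interface GeneratedImage {
  base64: string;
  mediaType: string;
}

/**
 * Picks the URL to place in the message's file part.
 * storedKey is the object-storage key, or null when storage is not configured.
 */
function toImageUrl(image: GeneratedImage, storedKey: string | null): string {
  if (storedKey !== null) {
    // Assumed path layout for the /api/files route.
    return `/api/files/${encodeURIComponent(storedKey)}`;
  }
  // Fallback: embed the image inline as a base64 data URL.
  return `data:${image.mediaType};base64,${image.base64}`;
}

/**
 * Converts any thrown value into a message the tool can return,
 * so the tool's execute function never throws into the SSE stream.
 */
function toToolError(error: unknown): string {
  const detail = error instanceof Error ? error.message : String(error);
  return `Image generation failed: ${detail}`;
}
```

Inside a tool's execute function, a try/catch would return `toToolError(err)` from the catch branch instead of rethrowing.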
vibegui force-pushed the vibegui/image-gen-chat branch from 1b0bcbb to 4bb39de on March 25, 2026 18:49