Support multimodal image upload on Web and fix image token budget estimation #705

Open
Yumiue wants to merge 2 commits into 1024XEngineer:main from Yumiue:codex/gateway-plan-approval-rpc
@Yumiue (Collaborator) commented May 31, 2026

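Summary

  • Support uploading multimodal images from the Web client via session assets.
  • Images are passed to the backend through input_parts.media.asset_id; image data is no longer inlined into chat messages.
  • Session history retains image attachment references, so the Web client can redisplay thumbnails.
  • Fix the provider budget estimate so that the base64 transfer payload of an image is no longer counted as prompt tokens.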
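
As a rough illustration of passing an image by reference, the following Go sketch models a chat request whose image part carries only an asset ID. Only `input_parts`, `media`, and `asset_id` come from the description above; the `type`, `text`, `mime_type`, and `session_id` fields and all Go type names are assumptions made for the example, not the PR's actual protocol types.

```go
package main

import "encoding/json"

// MediaRef points at a previously uploaded session asset instead of
// carrying the image bytes. MimeType is a hypothetical extra field.
type MediaRef struct {
	AssetID  string `json:"asset_id"`
	MimeType string `json:"mime_type,omitempty"`
}

// InputPart is one element of input_parts. Type and Text are assumed fields.
type InputPart struct {
	Type  string    `json:"type"`
	Text  string    `json:"text,omitempty"`
	Media *MediaRef `json:"media,omitempty"`
}

// ChatRequest is a hypothetical envelope for a chat turn.
type ChatRequest struct {
	SessionID  string      `json:"session_id"`
	InputParts []InputPart `json:"input_parts"`
}

// encodeRequest serializes the request; no image data is ever inlined,
// only the asset reference travels with the message.
func encodeRequest(req ChatRequest) (string, error) {
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Because the message holds only a short reference, the stored session history stays small and the thumbnail can be fetched separately when the session is reloaded.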

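Main changes

  • Add the /api/session-assets upload/read path and wire attachment references through Gateway, Runtime, and Session.
  • Extend the Web chat input box, message rendering, session loading, and protocol types to support image attachments.
  • Add token estimation for the multimodal projection: text and tool schemas are estimated as usual, and each image is estimated at a fixed per-image budget.
  • Keep the real provider send path unchanged: a session asset is read and encoded only temporarily, at request send time.
  • OpenAI-compatible, DeepSeek, Qwen, GLM, MiMo, MiniMax, Anthropic, and Gemini all use the image projection estimation path.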
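
The estimation rule described above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `Part` type, the characters-per-token heuristic, and the value of `perImageTokenBudget` are all assumptions, since the description does not state the real constants.

```go
package main

import "unicode/utf8"

// Assumed constants; the PR does not state the actual values.
const (
	charsPerToken       = 4    // rough text heuristic (assumption)
	perImageTokenBudget = 1024 // hypothetical fixed budget per image
)

// Part is a hypothetical projected message part: either text or an
// image identified by its session asset ID.
type Part struct {
	Text    string
	AssetID string // non-empty for image parts
}

// estimateTextTokens rounds up the rune count divided by charsPerToken.
func estimateTextTokens(s string) int {
	n := utf8.RuneCountInString(s)
	return (n + charsPerToken - 1) / charsPerToken
}

// estimatePromptTokens charges a flat budget per image and never looks
// at the image bytes, so the base64 payload size cannot inflate the total.
func estimatePromptTokens(parts []Part) int {
	total := 0
	for _, p := range parts {
		if p.AssetID != "" {
			total += perImageTokenBudget
			continue
		}
		total += estimateTextTokens(p.Text)
	}
	return total
}
```

With this shape, a multi-megabyte image and a tiny one contribute the same amount to the budget, which is the behavior the fix is after.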

Testing

  • go test ./internal/provider/... ./internal/gateway/... ./internal/runtime -count=1

Closes #700

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@fennoai (bot) left a comment

Review found two issues: one ACL bypass risk on the new asset HTTP endpoints, and one browser object URL lifecycle leak in the image send flow.

http.Error(writer, "method not allowed", http.StatusMethodNotAllowed)
return
}
subjectID, ok := s.authenticatedHTTPSubjectID(request)

The new session-asset HTTP endpoints authenticate the bearer token but never consult s.acl; handleSessionAssetRead has the same pattern. If an operator configures the control-plane ACL to restrict HTTP access, a valid token can still upload/read session images through these endpoints. Please route these endpoints through the same ACL decision path (or add explicit asset methods) before calling the runtime port.

})))

setText('')
clearAttachments(false)

clearAttachments(false) drops the composer’s ownership without revoking the object URLs created for previews; those URLs are copied into chat messages and are never revoked when messages are cleared or removed. Repeated image sends can keep blobs alive for the lifetime of the tab, so please add a cleanup owner/lifecycle for sent attachment preview URLs.

Development

Successfully merging this pull request may close these issues.

Support uploading images in the Web session area and adding them to the session