feat: implement Rust proxy gateway for vLLM Responses API #24

Draft
leseb wants to merge 1 commit into
vllm-project:main from
leseb:building-the-agentic-api-in-rust-after-project-mig
@leseb (Collaborator) commented May 11, 2026

Summary

  • Implements the full Rust proxy gateway for the vLLM Responses API, replacing the Python gateway stub
  • Supports both streaming (SSE) and non-streaming proxy modes with proper hop-by-hop header filtering
  • Includes API key injection from environment, comprehensive error mapping (502 for connection errors, 504 for timeouts), and CORS support
  • Adds CLI with two modes: standalone (--llm-api-base) and integrated (serve <model> spawning vLLM as a subprocess)
  • Modular architecture: config, proxy, app, server modules with a clean separation of concerns
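
For reviewers unfamiliar with hop-by-hop filtering, the idea can be sketched with the standard library alone. This is an illustration, not the code in this PR: the names `HOP_BY_HOP`, `is_hop_by_hop`, and `strip_hop_by_hop` are hypothetical, and the header list is the commonly used one derived from RFC 2616 section 13.5.1. Headers named inside a `Connection` value are treated as hop-by-hop too.

```rust
/// Commonly used hop-by-hop header names (illustrative list, lowercase).
const HOP_BY_HOP: [&str; 8] = [
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
];

/// Returns true when `name` should not be forwarded by a proxy.
pub fn is_hop_by_hop(name: &str) -> bool {
    HOP_BY_HOP.iter().any(|h| h.eq_ignore_ascii_case(name))
}

/// Drops hop-by-hop headers, plus any header named in a `Connection` value.
pub fn strip_hop_by_hop(headers: &[(String, String)]) -> Vec<(String, String)> {
    // Headers listed in `Connection: a, b` are hop-by-hop as well.
    let extra: Vec<String> = headers
        .iter()
        .filter(|(name, _)| name.eq_ignore_ascii_case("connection"))
        .flat_map(|(_, value)| value.split(','))
        .map(|token| token.trim().to_ascii_lowercase())
        .filter(|token| !token.is_empty())
        .collect();

    headers
        .iter()
        .filter(|(name, _)| {
            !is_hop_by_hop(name) && !extra.contains(&name.to_ascii_lowercase())
        })
        .cloned()
        .collect()
}
```

The same filter would apply in both directions: to request headers before forwarding upstream and to response headers before returning them to the client.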

Test Plan

  • 11 tests covering: non-stream passthrough, stream passthrough, hop-by-hop header stripping, auth injection, client auth precedence, upstream HTTP error passthrough, mid-stream failure handling, connection error → 502, timeout → 504
  • All tests pass: cargo test (11/11 green)
  • Clippy clean: cargo clippy --all-targets -- -D warnings
  • Formatting clean: cargo fmt -- --check
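
The "auth injection" and "client auth precedence" cases above can be read as the following decision, shown here as a minimal sketch. It assumes that a client-supplied `Authorization` header wins and that the environment key is sent as a bearer token; the function name `upstream_authorization` and both parameters are hypothetical and not taken from this PR.

```rust
/// Picks the `Authorization` value to send upstream (illustrative only).
/// `client_auth` is the header from the incoming request, if any;
/// `env_api_key` is the key read from the environment, if set.
pub fn upstream_authorization(
    client_auth: Option<&str>,
    env_api_key: Option<&str>,
) -> Option<String> {
    match (client_auth, env_api_key) {
        // Assumption: a non-empty client header is forwarded untouched.
        (Some(value), _) if !value.trim().is_empty() => Some(value.to_string()),
        // Otherwise fall back to the configured key as a bearer token.
        (_, Some(key)) if !key.is_empty() => Some(format!("Bearer {key}")),
        // No client header and no configured key: send nothing.
        _ => None,
    }
}
```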

🤖 Generated with Claude Code

Replaces the Python gateway stub with a full Rust implementation
using axum, reqwest, and tokio. Supports both streaming (SSE) and
non-streaming proxy modes, hop-by-hop header filtering, API key
injection, and comprehensive error mapping (502/504).
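
As a rough sketch of the 502/504 mapping described above: the enum `UpstreamFailure` and the function `gateway_status` are hypothetical names used for illustration, not the types in this change. With reqwest, the two cases would plausibly be distinguished through `Error::is_connect()` and `Error::is_timeout()`.

```rust
/// Ways the upstream request can fail before any response arrives
/// (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UpstreamFailure {
    /// The gateway could not connect to the upstream server.
    Connect,
    /// The upstream server did not answer in time.
    Timeout,
}

/// Maps a transport failure to the HTTP status returned to the client.
pub fn gateway_status(failure: UpstreamFailure) -> u16 {
    match failure {
        UpstreamFailure::Connect => 502, // Bad Gateway
        UpstreamFailure::Timeout => 504, // Gateway Timeout
    }
}
```

Upstream HTTP errors (a 4xx or 5xx response that was actually received) are not part of this mapping, since the summary says they are passed through unchanged.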

Includes 11 tests covering passthrough, auth, streaming, error
propagation, mid-stream failure, connection errors, and timeouts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sébastien Han <seb@redhat.com>