Migrate from Flask/connexion to FastAPI #73

@adambalogh

Motivation

The current Flask + connexion stack is causing friction that's disproportionate to what the gateway actually does (a handful of OpenAI-shaped JSON endpoints + one binary OHTTP endpoint). Concrete pain points showing up today:

  • Connexion enforces an OpenAPI security scheme we don't actually use. x402 is the real access control, but connexion validates ApiKeyAuth before our handlers run. The workaround in ohttp_controller._wsgi_subrequest sends a sentinel Authorization: Bearer ohttp header: a fake credential masquerading as a real one.
  • Connexion rejects non-JSON bodies on JSON-declared routes, so /v1/ohttp (raw HPKE bytes) has to be mounted via add_url_rule outside the OpenAPI spec — an asymmetric registration path.
  • Chat handlers are not callable as functions. They read from connexion.request.get_json() instead of taking a typed argument, so the OHTTP handler can't just call chat_completion(body). It has to re-enter the app through a hand-built WSGI sub-request (ohttp_controller._wsgi_subrequest) so the x402 middleware fires. This works, but it's ~50 lines of dense environ-dict plumbing that everyone reviewing the file flinches at.
  • We don't need the OpenAPI spec. The surface is small (chat completions, completions, health, signing-key, keys, ohttp). The cost of keeping openapi.yaml + connexion in sync with the Pydantic models is larger than the value of the generated validation.
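
For readers unfamiliar with the pattern, here is a minimal stdlib-only sketch of what re-entering a WSGI app through a hand-built environ looks like. It is not the code in ohttp_controller._wsgi_subrequest; echo_app and wsgi_subrequest are hypothetical names, and the auth check is a stand-in for connexion's ApiKeyAuth validation.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def echo_app(environ, start_response):
    """Hypothetical stand-in app: requires an Authorization header, then echoes the body."""
    if "HTTP_AUTHORIZATION" not in environ:
        start_response("401 Unauthorized", [("Content-Type", "application/json")])
        return [b'{"error": "unauthorized"}']
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    start_response("200 OK", [("Content-Type", "application/json")])
    return [body]


def wsgi_subrequest(app, path, payload, headers):
    """Call a WSGI app in-process by building the environ dict by hand."""
    raw = json.dumps(payload).encode()
    environ = {}
    setup_testing_defaults(environ)  # fills in wsgi.* keys, SERVER_NAME, and so on
    environ.update({
        "REQUEST_METHOD": "POST",
        "PATH_INFO": path,
        "CONTENT_TYPE": "application/json",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    })
    # Request headers become HTTP_* keys, per the WSGI spec (PEP 3333).
    for name, value in headers.items():
        environ["HTTP_" + name.upper().replace("-", "_")] = value

    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    chunks = app(environ, start_response)
    try:
        body = b"".join(chunks)
    finally:
        close = getattr(chunks, "close", None)
        if close is not None:
            close()
    return captured["status"], body


# The sentinel header exists only to satisfy the framework-level check.
status, body = wsgi_subrequest(
    echo_app, "/v1/chat/completions", {"model": "m"}, {"Authorization": "Bearer ohttp"}
)
```

Even in this reduced form, the caller has to know the environ key conventions, manage the body stream, and capture start_response by hand, which is the plumbing the bullets above object to.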

Proposed direction

Migrate the HTTP layer to FastAPI (or bare Starlette if FastAPI feels too heavy):

  • Typed async def handlers — OHTTP can await chat_completion(body) directly; _wsgi_subrequest deleted.
  • Per-route dependencies for auth instead of a global OpenAPI security scheme — no sentinel bearer token.
  • Native raw-body support for /v1/ohttp.
  • Drop openapi.yaml and the generated Pydantic models; keep hand-written Pydantic request/response models alongside the handlers.

Out of scope / things to watch

  • x402 middleware is WSGI-shaped (x402.http.middleware.flask). Either wait for / contribute an ASGI variant, or run x402 in a thin WSGI shim wrapped in Starlette's WSGIMiddleware (starlette.middleware.wsgi; note that Starlette has deprecated it and points to a2wsgi as the replacement). Needs investigation before committing to FastAPI.
  • Nitriding integration is unchanged — it proxies HTTP, doesn't care which framework is behind it.
  • PCR stability: dependency churn affects the enclave image hash. Plan this as a single deliberate bump, not piecemeal.
  • Heartbeat, TEE key injection, pricing precheck, response signing are all framework-agnostic logic — should port straight across.
  • Test suite (tee_gateway/test/) currently uses the Flask test client; it will need rewriting against httpx.AsyncClient or FastAPI's TestClient.

Not urgent

The current stack works. This is a quality-of-life / maintainability migration, not a bug fix. File it for when there's appetite for an HTTP-layer refresh.
