feat: Async Journey: LiteLLM Removal from Async Engine #310

Open
eric-tramel wants to merge 1 commit into async/async-facade from async/litellm-removal

Conversation

@eric-tramel
Contributor

Summary

Adds a comprehensive analysis of removing the litellm dependency from Data Designer. This is a planning document — no code changes.

Key findings

  • LiteLLM is well-contained (12 production files, all in engine/models/ and engine/models_v2/)
  • DD underuses it — each ModelFacade creates a Router with a single deployment (no load balancing, no failover)
  • OpenAI and Anthropic SDKs handle retry/backoff natively; Bedrock does not (manual retry needed for throttling)
  • Anthropic adapter is HIGH risk due to structurally different response format (content blocks vs strings)
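The Anthropic risk above comes down to response shape: Anthropic returns a list of typed content blocks where OpenAI returns a plain string. A minimal sketch of the required translation, assuming a hypothetical helper name (the block shapes mirror the public SDKs' formats, but nothing here is from the DD codebase):

```python
# Hypothetical sketch: flattening an Anthropic-style response into the
# OpenAI-style string content that Data Designer treats as canonical.
def anthropic_content_to_text(content_blocks: list[dict]) -> str:
    """Join Anthropic "text" content blocks into a single string,
    skipping non-text blocks (e.g. tool_use)."""
    return "".join(
        block["text"] for block in content_blocks if block.get("type") == "text"
    )

# Anthropic returns a list of typed blocks where OpenAI returns a string:
anthropic_response = {
    "content": [
        {"type": "text", "text": "Hello, "},
        {"type": "text", "text": "world."},
    ]
}
print(anthropic_content_to_text(anthropic_response["content"]))  # Hello, world.
```

Tool-use responses interleave `text` and `tool_use` blocks, which is why the adapter is rated HIGH risk rather than a simple string join.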

Implementation plan (4 phases)

  1. Replace Router with ModelClient in models_v2/ — OpenAI SDK adapter, keep OpenAI response format as canonical type. models/ untouched as fallback.
  2. Validate — Benchmark, test suite, real inference with env var enabled.
  3. Additional provider adapters — Anthropic + Bedrock. models/ fallback still available.
  4. Consolidate and drop dependency — Delete models/, remove litellm from deps. Only after all adapters are proven.
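Phase 1's Router-to-ModelClient swap can be sketched as a small interface that ModelFacade depends on. The names ModelClient and CompletionResponse come from the plan; the exact fields and signatures are assumptions, and EchoModelClient is a dummy adapter standing in for the real OpenAI SDK adapter:

```python
# Illustrative sketch, not the DD implementation: a minimal async client
# protocol that each provider adapter (OpenAI, Anthropic, Bedrock) would satisfy.
import asyncio
from dataclasses import dataclass
from typing import Any, Protocol

@dataclass
class CompletionResponse:
    content: str
    model: str

class ModelClient(Protocol):
    async def completion(
        self, messages: list[dict[str, Any]], **kwargs: Any
    ) -> CompletionResponse: ...

class EchoModelClient:
    """Dummy adapter: echoes the last message back, for interface demonstration."""
    async def completion(self, messages, **kwargs) -> CompletionResponse:
        return CompletionResponse(content=messages[-1]["content"], model="echo")

resp = asyncio.run(EchoModelClient().completion([{"role": "user", "content": "hi"}]))
print(resp.content)  # hi
```

Keeping the response type OpenAI-shaped (per the plan's canonical-format decision) means only the adapters change per provider; the facade and everything above it stay untouched.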

Reviewed by

10 independent code reviewers examined the report against the actual codebase. Corrections incorporated: expanded test blast radius (4 files, ~56 functions), upgraded Anthropic risk to HIGH, added MCP facade cross-layer caveat, corrected dependency impact analysis.

Comprehensive analysis of removing the litellm dependency from Data
Designer. Covers blast radius (per-phase), provider SDK research
(OpenAI, Anthropic, Bedrock), risk assessment, and a 4-phase
implementation plan using the models_v2/ parallel stack approach.

Co-Authored-By: Remi <noreply@anthropic.com>
@eric-tramel eric-tramel requested a review from a team as a code owner February 7, 2026 03:02
@eric-tramel eric-tramel changed the title from "docs: LiteLLM removal impact analysis and implementation plan" to "feat: Async Journey: LiteLLM Removal from Async Engine" on Feb 7, 2026

greptile-apps bot commented Feb 7, 2026

Greptile Overview

Greptile Summary

Adds comprehensive planning document for removing LiteLLM dependency from Data Designer. The document is thorough, well-structured, and demonstrates deep understanding of the codebase through 10 independent reviewer validations.

Key strengths:

  • Well-contained blast radius (12 production files, all in engine/models/ and engine/models_v2/)
  • Pragmatic 4-phase approach with parallel implementation in models_v2/ to maintain fallback
  • Accurate technical analysis confirmed against actual codebase (dependency version, file counts, test impact)
  • Honest risk assessment (Anthropic and Bedrock adapters marked HIGH risk due to response format incompatibilities)
  • Comprehensive test migration plan (~56 test functions identified)
  • Clear decision on response format (keep OpenAI structure as canonical to minimize cross-layer changes)

Implementation strategy:

  1. Phase 1: OpenAI adapter in models_v2/ (low risk)
  2. Phase 2: Validation via benchmarks and tests
  3. Phase 3: Anthropic + Bedrock adapters (high risk, structural differences)
  4. Phase 4: Remove models/ and drop dependency (only after validation)

The document correctly identifies that Data Designer underuses LiteLLM (a single-deployment Router instead of load balancing), making removal feasible. The parallel stack approach via the DATA_DESIGNER_ASYNC_ENGINE env var provides a safe rollback mechanism.
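The env-var rollback switch can be sketched as a one-branch factory. The DATA_DESIGNER_ASYNC_ENGINE name comes from the PR text; the registry factories below are hypothetical stand-ins for the real models/ and models_v2/ entry points:

```python
# Sketch under assumptions: _create_registry_legacy / _create_registry_v2
# are placeholders, not functions from the DD codebase.
import os

def _create_registry_legacy(model_configs):
    return ("legacy", model_configs)   # models/: litellm Router path

def _create_registry_v2(model_configs):
    return ("v2", model_configs)       # models_v2/: provider SDK path

def create_model_registry(model_configs):
    # Flip one env var to route through models_v2/; unset it to roll back.
    if os.environ.get("DATA_DESIGNER_ASYNC_ENGINE", "").lower() in ("1", "true"):
        return _create_registry_v2(model_configs)
    return _create_registry_legacy(model_configs)
```

Because both stacks coexist until Phase 4, a regression found in production only requires unsetting the variable, not reverting code.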

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it only adds planning documentation with no code changes
  • Documentation-only change with comprehensive technical analysis that has been validated by 10 independent reviewers. No production code modified, no breaking changes, no runtime impact. The planning document demonstrates thorough understanding of the codebase and provides clear implementation roadmap with appropriate risk assessment.
  • No files require special attention

Important Files Changed

Filename: LITELLM_REMOVAL_ANALYSIS.md
Overview: New comprehensive planning document analyzing LiteLLM removal strategy with 4-phase implementation plan

Sequence Diagram

sequenceDiagram
    participant Config as Config Layer
    participant Factory as models_v2/factory.py
    participant Facade as models_v2/facade.py
    participant Client as ModelClient (OpenAI/Anthropic/Bedrock)
    participant SDK as Provider SDK
    participant API as Provider API

    Note over Config,Factory: Phase 1: OpenAI Adapter
    Config->>Factory: create_model_registry(model_configs)
    Factory->>Factory: Construct OpenAIModelClient
    Factory->>Client: Initialize with api_key, base_url
    Factory->>Facade: ModelFacade(client=OpenAIModelClient)
    
    Note over Facade,API: Inference Request Flow
    Facade->>Facade: completion(messages, **params)
    Facade->>Client: client.completion(messages, **kwargs)
    Client->>Client: Translate DD params → SDK params
    Client->>SDK: await sdk.chat.completions.create(...)
    SDK->>SDK: Built-in retry/backoff
    SDK->>API: HTTPS POST /v1/chat/completions
    API-->>SDK: 200 OK with response
    SDK-->>Client: OpenAI response object
    Client->>Client: Extract content, tool_calls, usage
    Client-->>Facade: CompletionResponse
    Facade-->>Config: Generated text

    Note over Factory,Client: Phase 3: Multi-Provider
    Factory->>Factory: match provider_type
    alt provider_type == "openai"
        Factory->>Client: OpenAIModelClient
    else provider_type == "anthropic"
        Factory->>Client: AnthropicModelClient
        Note over Client: Translates content blocks → string
    else provider_type == "bedrock"
        Factory->>Client: BedrockModelClient
        Note over Client: Manual retry for throttling
    end

    Note over Facade,SDK: Error Handling
    SDK-->>Client: SDK-specific exception (e.g., RateLimitError)
    Client->>Client: Map to DD error types
    Client-->>Facade: ModelRateLimitError
    Facade-->>Config: Propagate with FormattedLLMErrorMessage
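The error-handling branch at the bottom of the diagram can be sketched as a mapping layer in each adapter. ModelRateLimitError appears in the diagram; matching on the exception's class name here is an illustrative shortcut that lets the sketch run without the openai package installed:

```python
# Hedged sketch: adapters map SDK-specific exceptions onto DD error
# types before they reach the facade. The real adapter would catch
# openai.RateLimitError (and provider equivalents) directly.
class ModelRateLimitError(Exception):
    """Provider-agnostic DD error raised on throttling."""

def map_provider_error(exc: Exception) -> Exception:
    if type(exc).__name__ == "RateLimitError":  # e.g. openai.RateLimitError
        return ModelRateLimitError(str(exc))
    return exc
```

Centralizing this mapping keeps the facade and config layers free of any per-SDK exception imports, which is what lets Phase 4 delete the litellm-based models/ stack cleanly.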

