feat(engine): env-var switch for async-first models experiment #280
Draft
eric-tramel wants to merge 6 commits into main
Adds an opt-in async execution path (DATA_DESIGNER_ASYNC_ENGINE=1) for the cell-by-cell generation pipeline. Replaces thread-pool concurrency with native asyncio TaskGroup + Semaphore for bounded concurrent LLM calls, while keeping the sync path as the default.

Key changes:
- ModelFacade: acompletion(), agenerate_text_embeddings(), agenerate()
- acatch_llm_exceptions decorator (async mirror of catch_llm_exceptions)
- AsyncConcurrentExecutor with persistent background event loop
- ColumnWiseBuilder branches on env var to fan out via async or threads
- Benchmark updated with async mock support

Co-Authored-By: Remi <noreply@anthropic.com>

Resolved conflicts:
- llm_completion.py: kept agenerate() async method + main's new _extract_reasoning_content(), TraceType handling, and extract_reasoning_content config. Updated agenerate() to match main's trace handling patterns.
- column_wise_builder.py: kept DATA_DESIGNER_ASYNC_ENGINE env var + adopted main's get_library_version() replacing importlib.metadata.

Co-Authored-By: Remi <noreply@anthropic.com>
Summary
Opt-in async execution path for the cell-by-cell generation pipeline, gated behind DATA_DESIGNER_ASYNC_ENGINE=1. Replaces thread-pool concurrency with native asyncio for LLM calls while keeping the sync path as the unchanged default.

What changed
- acompletion(), agenerate(), agenerate_text_embeddings() on ModelFacade, using LiteLLM Router's native async APIs. MCP tool calls offloaded via asyncio.to_thread().
- acatch_llm_exceptions: async mirror of the existing sync decorator, sharing the same error-mapping logic.
- AsyncConcurrentExecutor: drop-in async replacement for ConcurrentThreadExecutor. Uses TaskGroup + Semaphore for bounded concurrency, asyncio.Event for early shutdown. Runs on a persistent background event loop (same pattern as mcp/io.py) to avoid breaking libraries that maintain internal async state across calls.
- ColumnWiseBuilder: branches on env var: _fan_out_with_async() vs existing _fan_out_with_threads().
- agenerate() on generators: async path in ColumnGeneratorWithModelChatCompletion.
- --mode compare.

Benchmark results
With mock LLMs (zero-latency stubs), sync and async produce identical output (SHA256 content hash match across all iterations). No measurable speedup is expected here: the async path's advantage comes from real network-bound LLM calls, where await yields the event loop during I/O wait. With mocks there's no I/O to overlap.

Validated end-to-end with the pdf_qa recipe using real inference (Opus 4.5 + MCP tool calls).

How to test
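
As an illustration of the content-hash comparison mentioned under Benchmark results, a minimal sketch might look like the following. `content_hash` is a hypothetical helper, not the benchmark's actual code, and it assumes the generated rows are JSON-serializable dictionaries.

```python
import hashlib
import json


def content_hash(rows: list[dict]) -> str:
    # Serialize with sorted keys so dict key order does not affect the digest.
    payload = json.dumps(rows, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Hypothetical outputs from a sync run and an async run of the same job.
sync_rows = [{"id": 1, "text": "hello"}, {"id": 2, "text": "world"}]
async_rows = [{"text": "hello", "id": 1}, {"text": "world", "id": 2}]

hashes_match = content_hash(sync_rows) == content_hash(async_rows)
```

Row order still matters in this sketch: two runs that produce the same rows in a different order yield different digests, so an async executor would need to preserve input order (or the rows would need sorting first) for the hashes to agree.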
Test plan