feat(backend): add OpenAI chat provider fallback#7322
Conversation
Greptile SummaryThis PR adds a
Confidence Score: 3/5The Anthropic path and client changes are safe. The new OpenAI path is functional but ships without the web-search tool that the Anthropic path always provides, meaning users on CHAT_PROVIDER=openai silently lose live-search capability. The core lazy-init change in clients.py is straightforward and low risk. The 240-line OpenAI streaming path closely mirrors the Anthropic path but has a meaningful feature gap: WEB_SEARCH_TOOL and TOOL_SEARCH_TOOL are injected unconditionally in the Anthropic code path but are entirely absent in the OpenAI path. Any question requiring a live web lookup will silently receive no answer on the new provider. This gap, combined with per-request LLM client allocation that differs from the singleton pattern used elsewhere, warrants attention before this routes production traffic. backend/utils/retrieval/agentic.py — the OpenAI streaming path and its tool list construction Important Files Changed
Sequence DiagramsequenceDiagram
participant Caller
participant execute_agentic_chat_stream
participant _execute_openai as _execute_agentic_chat_stream_openai
participant _run_openai as _run_openai_agent_stream
participant ReactAgent as create_react_agent (LangGraph)
participant Callback as AsyncStreamingCallback
Caller->>execute_agentic_chat_stream: uid, messages, app, ...
alt "CHAT_PROVIDER == 'openai'"
execute_agentic_chat_stream->>_execute_openai: delegate
_execute_openai->>ReactAgent: create_react_agent(model, tools) [per-request]
_execute_openai->>Callback: new AsyncStreamingCallback()
_execute_openai->>_run_openai: asyncio.create_task(...)
loop streaming events
ReactAgent-->>_run_openai: on_chat_model_stream
_run_openai->>Callback: put_data(token)
ReactAgent-->>_run_openai: on_tool_start
_run_openai->>Callback: put_thought(display_name)
ReactAgent-->>_run_openai: on_tool_end
end
_run_openai->>Callback: end() → queue.put(None)
_execute_openai-->>execute_agentic_chat_stream: yield chunks + yield None
execute_agentic_chat_stream-->>Caller: yield chunks + yield None
else "CHAT_PROVIDER == 'anthropic' (default)"
execute_agentic_chat_stream->>anthropic_client: messages.stream(...) [singleton]
loop streaming
anthropic_client-->>execute_agentic_chat_stream: tokens, tool events
end
execute_agentic_chat_stream-->>Caller: yield chunks + yield None
end
Reviews (1): Last reviewed commit: "feat(backend): add OpenAI chat provider ..." | Re-trigger Greptile |
| tools = list(CORE_TOOLS) | ||
| try: | ||
| app_tools = load_app_tools(uid) | ||
| if app_tools: | ||
| tools.extend(app_tools) | ||
| logger.info(f"Added {len(app_tools)} app tools to OpenAI chat") | ||
| system_prompt = _append_openai_app_tool_prompt(system_prompt, app_tools) |
There was a problem hiding this comment.
Missing web-search and tool-search capabilities in OpenAI path
The Anthropic path unconditionally injects WEB_SEARCH_TOOL (and TOOL_SEARCH_TOOL when app tools exist) via _convert_tools, giving Claude server-side web search on every request. The OpenAI path builds its tool list from CORE_TOOLS only — there is no equivalent web-search tool appended before passing to create_react_agent. Any user on CHAT_PROVIDER=openai silently loses the ability to answer questions that require a live web lookup. If parity isn't intended, the disparity should at least be documented via a warning log at startup or in the function's docstring.
| callback = AsyncStreamingCallback() | ||
| full_response = [] | ||
| tool_usage_count = 0 | ||
| agent = create_react_agent(model=get_openai_agent_llm(streaming=True), tools=tools) |
There was a problem hiding this comment.
Per-request model client and agent graph instantiation
get_openai_agent_llm(streaming=True) calls get_llm('chat_graph', ...) which constructs a new ChatOpenAI instance (including its connection-pool state) on every chat request. create_react_agent then wraps it in a new compiled graph. The Anthropic path reuses a module-level anthropic_client singleton. Under load, this allocates a new HTTP client object per request rather than sharing a pooled connection.
| tool_input = event.get("data", {}).get("input", {}) | ||
| if not isinstance(tool_input, dict): | ||
| tool_input = {} | ||
|
|
There was a problem hiding this comment.
tool_input normalised to {} silently voids safety-guard argument validation
When LangGraph's on_tool_start event delivers a non-dict input (e.g. a ToolCall object in some LangChain versions), the fallback replaces it with {}. safety_guard.validate_tool_call(tool_name, {}) is then called with no arguments, which may allow calls that would otherwise be blocked based on argument content to pass unexamined.
| def test_agentic_chat_provider_openai_path_is_declared(): | ||
| source = (Path(__file__).resolve().parents[2] / 'utils/retrieval/agentic.py').read_text() | ||
|
|
||
| assert "CHAT_PROVIDER = os.getenv('CHAT_PROVIDER'" in source | ||
| assert 'create_react_agent' in source | ||
| assert 'get_openai_agent_llm(streaming=True)' in source |
There was a problem hiding this comment.
This test verifies feature presence by reading the source file as raw text and asserting specific string literals exist. It passes even if the strings appear in comments, and breaks on trivial refactors (variable rename, line split). Consider replacing it with a behavioral test — e.g. patch
CHAT_PROVIDER to 'openai' and assert that _execute_agentic_chat_stream_openai is called, or that create_react_agent is invoked.
| def test_agentic_chat_provider_openai_path_is_declared(): | |
| source = (Path(__file__).resolve().parents[2] / 'utils/retrieval/agentic.py').read_text() | |
| assert "CHAT_PROVIDER = os.getenv('CHAT_PROVIDER'" in source | |
| assert 'create_react_agent' in source | |
| assert 'get_openai_agent_llm(streaming=True)' in source | |
| def test_agentic_chat_provider_openai_path_dispatches(): | |
| """Verify that CHAT_PROVIDER=openai routes to the OpenAI execution path.""" | |
| import utils.retrieval.agentic as agentic_mod | |
| called_with = [] | |
| async def _fake_openai_stream(*args, **kwargs): | |
| called_with.append((args, kwargs)) | |
| yield "data: hello" | |
| yield None | |
| with patch.object(agentic_mod, 'CHAT_PROVIDER', 'openai'), \ | |
| patch.object(agentic_mod, '_execute_agentic_chat_stream_openai', side_effect=_fake_openai_stream): | |
| import asyncio | |
| chunks = asyncio.run( | |
| _collect(agentic_mod.execute_agentic_chat_stream('uid', [], None)) | |
| ) | |
| assert called_with, "_execute_agentic_chat_stream_openai was not called" | |
| async def _collect(gen): | |
| result = [] | |
| async for chunk in gen: | |
| result.append(chunk) | |
| return result |
Summary
Closes #7238
Verification