fix(mistralai): respect max_retries during streaming failures #34209
Summary
This PR fixes an issue in the `langchain-mistralai` chat integration where `max_retries` was not respected when a failure occurred during streaming (e.g., an `httpx.StreamError` raised after the first chunk).

Yesterday, Mistral was experiencing service failures (30% of requests returned no chunks). I lowered the retry timeout to 5 seconds, since in practice every call that returned no chunks within 5 seconds had hit the bug, and in doing so discovered that the retry mechanism was not correctly implemented. This behavior was inconsistent with the ChatOpenAI integration, for example.
Changes:
- `ChatMistralAI._astream` now wraps the async streaming loop in a retry policy based on `tenacity.AsyncRetrying`.
- `httpx.RequestError` and `httpx.StreamError` raised while consuming the SSE stream trigger new attempts, instead of failing after a single try.
- Added `test_streaming_retry_on_stream_failure`, which uses `httpx.MockTransport` to simulate a mid-stream network failure and asserts that multiple HTTP requests are made when retries are enabled.

Rationale
The current Mistral integration uses raw `httpx` + `httpx-sse` instead of the official Mistral client, so it must implement its own retry logic. Previously, the retry decorator only wrapped the initial connection. Once the SSE iterator was returned, any `StreamError` occurring during consumption bypassed the retry logic, resulting in only one failed attempt regardless of the configured `max_retries`.

This PR preserves the existing public API and behavior while making the streaming path consistent with the `max_retries` setting.

Testing
Local tests run:
uv run pytest libs/partners/mistralai/tests/unit_tests/test_streaming_retry.py
Future work
The current integration is built directly on top of `httpx` and `httpx-sse`. In the long term, it would likely be more robust and easier to maintain to layer `ChatMistralAI` on top of the official Mistral Python SDK instead of reimplementing low-level HTTP, streaming, and retry behavior by hand. That refactor would be a larger, separate change; in the meantime, this PR moves the current `httpx`-based implementation to a more correct and resilient state.

Note: this PR description was written with AI assistance.