[FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings #1848
pk-zipstack wants to merge 13 commits into main
litellm's cohere embed handler (1.80.0) receives a timeout parameter but doesn't pass it to client.post(), causing "Connection timed out after None seconds" on large Bedrock embedding requests. This adds a monkey-patch that replaces the affected functions with versions that correctly forward timeout. Includes version guard, source comments, and unit tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
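The failure mode and the fix described in this commit can be sketched with a toy handler. Everything in the block is hypothetical: `FakeClient`, `buggy_embedding`, `patched_embedding`, and the `handler` namespace are stand-ins rather than litellm's real code. It only illustrates rebinding a module attribute so that `timeout` reaches `client.post()`.

```python
import types


class FakeClient:
    """Stand-in HTTP client that records the timeout it was given."""

    def __init__(self):
        self.seen_timeout = "not called"

    def post(self, url, data=None, timeout=None):
        self.seen_timeout = timeout
        return {"url": url, "timeout": timeout}


def buggy_embedding(client, url, payload, timeout=None):
    # Bug being illustrated: `timeout` is accepted but never forwarded.
    return client.post(url, data=payload)


def patched_embedding(client, url, payload, timeout=None):
    # Same body as the original, except that `timeout` is forwarded.
    return client.post(url, data=payload, timeout=timeout)


# Hypothetical module-like namespace standing in for the third-party handler.
handler = types.SimpleNamespace(embedding=buggy_embedding)

before = FakeClient()
handler.embedding(before, "https://example.invalid/embed", "{}", timeout=600)

# The monkey-patch: rebind the attribute that callers look up at call time.
handler.embedding = patched_embedding

after = FakeClient()
handler.embedding(after, "https://example.invalid/embed", "{}", timeout=600)

print(before.seen_timeout, after.seen_timeout)
```

Rebinding only affects callers that resolve the function through the patched namespace at call time. A module that imported the function by name beforehand keeps its own reference, so it has to be rebound separately.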
…ions
- Version guard now skips patch entirely when litellm > 1.80.0 instead of just warning
- Test assertions now check exact timeout value received by client.post(), not just that it was called
- Inline comments at client.post() calls marked with ONLY CHANGE
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Walkthrough
Adds a new patches package and a litellm Cohere timeout monkey-patch (sync + async) that forwards timeout arguments to underlying HTTP client calls; the patch is applied by a side-effect import in the SDK embedding module and unit tests verify timeout propagation and wiring.
Sequence Diagram(s)
sequenceDiagram
participant User
participant SDK as "SDK embedding\n(unstract.sdk1.embedding)"
participant Patch as "Patch module\n(litellm_cohere_timeout)"
participant Litellm as "litellm embed\nhandler (wrapped)"
participant HTTP as "HTTP client\n(e.g., httpx)"
User->>SDK: call embed(inputs, timeout=...)
SDK->>Patch: import (side-effect) / use patched handler
Patch->>Litellm: invoke wrapped handler
Litellm->>HTTP: client.post(..., timeout=passed_timeout)
HTTP-->>Litellm: response
Litellm-->>Patch: embedding result
Patch-->>SDK: return embeddings
SDK-->>User: return embeddings
for more information, see https://pre-commit.ci
- Remove useless self-assignment `model = model`
- Use int timeout values in tests to avoid float equality checks
- Use `is` identity checks instead of `==` for timeout assertions
- Replace async mock_post with AsyncMock to properly use async features
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
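A minimal sketch of the testing style this commit describes, using `unittest.mock.AsyncMock`, an int timeout, and an `is` identity assertion. The `async_embedding` function is a hypothetical stand-in for the handler under test, not the PR's actual patched function.

```python
import asyncio
from unittest.mock import AsyncMock


async def async_embedding(client, url, payload, timeout=None):
    # Hypothetical handler under test: forwards `timeout` to the client.
    return await client.post(url, data=payload, timeout=timeout)


def check_timeout_forwarded():
    timeout = 600  # int, so no float-equality concerns
    client = AsyncMock()
    client.post.return_value = {"ok": True}

    result = asyncio.run(
        async_embedding(client, "https://example.invalid/embed", "{}", timeout=timeout)
    )

    client.post.assert_awaited_once()
    # Identity check: the exact object passed in must reach client.post().
    assert client.post.await_args.kwargs["timeout"] is timeout
    return result


print(check_timeout_forwarded())
```

Asserting on `await_args` rather than only on "was called" is what catches a handler that awaits `client.post()` but drops the `timeout` keyword.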
Actionable comments posted: 1
🧹 Nitpick comments (1)
unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py (1)
7-10: Exercise the production import hook in one test.
Because this file imports the patch module directly, the wiring checks only prove direct-import behavior. Add one integration test that imports `unstract.sdk1.embedding` and asserts the handler is patched, so Line 7 in `unstract/sdk1/src/unstract/sdk1/embedding.py` cannot regress silently.
Also applies to: 194-203
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-43: The patch currently imports private litellm.llms.* modules
before checking _SKIP_PATCH and uses a loose version check, so move all
private/internal imports (e.g., validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse and any litellm.llms.* import lines) to after the guard that
computes _SKIP_PATCH, change the version gate from a ">" check to an exact
equality check against the known compatible LiteLLM version, and when skipping
emit a visible warning via warnings.warn (not just DeprecationWarning swallowed
by default) so callers know the patch was skipped; ensure functions/classes
referenced in the patch (validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) are only imported after the guard.
---
Nitpick comments:
In `@unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py`:
- Around line 7-10: Add an integration test that imports unstract.sdk1.embedding
(instead of importing the patch module directly) and asserts the production
import hook applied the patch: after importing unstract.sdk1.embedding, verify
the module's embedding handler uses the patched implementations by checking that
its async and sync embedding callables resolve to the symbols
_patched_async_embedding and _patched_embedding (or that their
identities/reference equality match those functions imported from
unstract.sdk1.patches.litellm_cohere_timeout); update or add this assertion into
test_litellm_cohere_timeout.py alongside the existing direct-import checks so
the production import wiring (line referencing unstract.sdk1.embedding) is
exercised and cannot regress.
📒 Files selected for processing (5)
- unstract/sdk1/src/unstract/sdk1/embedding.py
- unstract/sdk1/src/unstract/sdk1/patches/__init__.py
- unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
- unstract/sdk1/tests/patches/__init__.py
- unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
26-64: ⚠️ Potential issue | 🟠 Major
Move the private LiteLLM imports behind the guard and gate on the exact tested version.
Lines 26-43 still import private `litellm.llms.*` modules before `_SKIP_PATCH` is evaluated, so a future LiteLLM reorg can fail at import time instead of cleanly skipping. Also, Line 56 only skips `> 1.80.0`; for a copied private implementation, `!=` is safer. The skip warning on Lines 58-64 uses `DeprecationWarning`, which is usually hidden for imported library code, so operators may never notice that the patch was not applied. With the provided dependency context showing LiteLLM 1.81.7, this path would currently skip.
🔧 Suggested fix

     import importlib.metadata
     import json
     import logging
     import warnings
     from collections.abc import Callable

     import httpx
     import litellm
    -import litellm.llms.bedrock.embed.embedding as _bedrock_embed
    -import litellm.llms.cohere.embed.handler as _cohere_handler
     from litellm.litellm_core_utils.litellm_logging import (
         Logging as LiteLLMLoggingObj,
     )
    -from litellm.llms.cohere.embed.handler import (
    -    validate_environment,
    -)
    -from litellm.llms.cohere.embed.v1_transformation import (
    -    CohereEmbeddingConfig,
    -)
    -from litellm.llms.custom_httpx.http_handler import (
    -    AsyncHTTPHandler,
    -    HTTPHandler,
    -    get_async_httpx_client,
    -)
    -from litellm.types.llms.bedrock import CohereEmbeddingRequest
    -from litellm.types.utils import EmbeddingResponse
     from packaging.version import Version

     logger = logging.getLogger(__name__)

     _DEFAULT_TIMEOUT = httpx.Timeout(None)
     _PATCHED_LITELLM_VERSION = "1.80.0"
     _litellm_version = importlib.metadata.version("litellm")
    -_SKIP_PATCH = Version(_litellm_version) > Version(_PATCHED_LITELLM_VERSION)
    +_SKIP_PATCH = Version(_litellm_version) != Version(_PATCHED_LITELLM_VERSION)
     if _SKIP_PATCH:
         warnings.warn(
             "litellm_cohere_timeout patch was SKIPPED — not applied. "
             f"Current litellm version: {_litellm_version}. "
             f"Patch was written for: {_PATCHED_LITELLM_VERSION}. "
             "Please verify the upstream fix and remove this module.",
    -        DeprecationWarning,
    +        RuntimeWarning,
             stacklevel=2,
         )
    +else:
    +    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
    +    import litellm.llms.cohere.embed.handler as _cohere_handler
    +    from litellm.llms.cohere.embed.handler import validate_environment
    +    from litellm.llms.cohere.embed.v1_transformation import CohereEmbeddingConfig
    +    from litellm.llms.custom_httpx.http_handler import (
    +        AsyncHTTPHandler,
    +        HTTPHandler,
    +        get_async_httpx_client,
    +    )
    +    from litellm.types.llms.bedrock import CohereEmbeddingRequest
    +    from litellm.types.utils import EmbeddingResponse
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-64: The current file imports private litellm modules before
computing _SKIP_PATCH, which can cause import-time failures; move all imports of
litellm.llms.* and litellm.litellm_core_utils.* (symbols: _bedrock_embed,
_cohere_handler, LiteLLMLoggingObj, validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) so they occur only after computing _litellm_version and
_SKIP_PATCH; change the version check to an exact match against
_PATCHED_LITELLM_VERSION (i.e. skip whenever Version(_litellm_version) !=
Version(_PATCHED_LITELLM_VERSION)) and replace the
DeprecationWarning with a visible warning class such as UserWarning or
RuntimeWarning so operators see the warning when the patch is skipped.
📒 Files selected for processing (2)
- unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
- unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
🚧 Files skipped from review as they are similar to previous changes (1)
- unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
- Move all litellm.llms.* imports inside the `if not _SKIP_PATCH` block so they're only loaded when the patch will be applied
- Change version gate from `>` to an exact-version check (skip when `!=`), so the patch only applies on the exact known-affected version (1.80.0)
- Use RuntimeWarning instead of DeprecationWarning for skip message so it's visible by default (DeprecationWarning is filtered)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
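The exact-version gate from this commit can be sketched as follows. This is an illustration under stated assumptions: `parse_version` and `should_skip_patch` are hypothetical helpers, and the tuple-based parser is a simplification of the `packaging.version.Version` comparison the patch module uses.

```python
import warnings

PATCHED_VERSION = "1.80.0"  # version the copied implementation was taken from


def parse_version(text):
    # Simplified parser for plain "X.Y.Z" strings; real code would use
    # packaging.version.Version, which also handles pre/post releases.
    return tuple(int(part) for part in text.split("."))


def should_skip_patch(installed, patched=PATCHED_VERSION):
    """Skip unless the installed version is exactly the one that was patched."""
    skip = parse_version(installed) != parse_version(patched)
    if skip:
        warnings.warn(
            f"patch skipped: installed {installed}, written for {patched}",
            RuntimeWarning,
            stacklevel=2,
        )
    return skip


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    results = {v: should_skip_patch(v) for v in ("1.79.3", "1.80.0", "1.81.7")}

print(results)
print([str(w.message) for w in caught])
```

With the earlier `>` gate, an older release such as 1.79.3 would still have been patched with code copied from 1.80.0; the `!=` gate skips it as well.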
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 30-41: The version guard (_PATCHED_LITELLM_VERSION,
_litellm_version, _SKIP_PATCH) causes the patch to be skipped because sdk1 uses
LiteLLM 1.81.7; update the constant _PATCHED_LITELLM_VERSION to "1.81.7" (and
run tests/verify that the Cohere timeout fix is effective in that release) or,
alternatively, revert sdk1's litellm dependency to "1.80.0" so the existing
patch applies—ensure the chosen approach makes _SKIP_PATCH False so the patch
code executes.
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
@pk-zipstack can these tests be made to run with the existing tox tests? Otherwise I don't see much value in this unless devs run the tests themselves
@pk-zipstack Agree, please check if these are tox compatible.
If not, remove them and roll out equivalent versions that are compatible with tox instead.
embeddings without going through unstract.sdk1.embedding will NOT
have this patch active.

Remove this patch when litellm ships a fix upstream.
@pk-zipstack Also the tests import _patched_embedding and _patched_async_embedding directly. There's no test that imports unstract.sdk1.embedding instead and verifies the patch was applied through the production code path. This would catch regressions in the side-effect import line.
hari-kuriakose left a comment
@pk-zipstack LGTM overall.
However, I am requesting a few critical changes.
Co-authored-by: Hari John Kuriakose <hari@zipstack.com> Signed-off-by: Praveen Kumar <praveen@zipstack.com>
Greptile Summary
This PR introduces a monkey-patch module to fix a timeout-forwarding bug in litellm's Cohere embed handler (…
Key issues (several carried over from previous review rounds):
Confidence Score: 1/5
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Module import:\nunstract.sdk1.embedding] --> B[Side-effect import:\nlitellm_cohere_timeout]
B --> C{_SKIP_PATCH?\nlitellm version ≠ 1.81.7}
C -- "True (always, since\npinned version is 1.80.0)" --> D[warnings.warn RuntimeWarning\nPatch NOT applied\nTimeout bug remains]
C -- "False (never reached)" --> E[Define _patched_embedding\n& _patched_async_embedding]
E --> F[_cohere_handler.embedding =\n_patched_embedding]
E --> G[_cohere_handler.async_embedding =\n_patched_async_embedding]
E --> H[_bedrock_embed.cohere_embedding =\n_patched_embedding]
F & G & H --> I[litellm.embedding call\nwith timeout forwarded\nto client.post]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Line: 39
Comment:
**`RuntimeWarning` contradicts stated intent — should be `DeprecationWarning`**
The PR description explicitly states the module emits a `DeprecationWarning` to prompt engineers to verify compatibility after a litellm upgrade. However, the code uses `RuntimeWarning`.
The distinction matters in practice:
- `DeprecationWarning` is ignored by Python's default warning filters unless it is triggered by code in `__main__`, so in practice it surfaces mainly in test runs and development sessions that enable it (for example with `-W default`), which fits a "please remember to verify and remove this patch" reminder.
- `RuntimeWarning` is not filtered by default, meaning every production deployment that doesn't have litellm pinned at exactly `1.81.7` will emit a warning to stderr on every cold start.
```suggestion
DeprecationWarning,
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
Line: 198-206
Comment:
**`importlib.import_module` is a no-op for cached modules — test is vacuous**
`importlib.import_module("unstract.sdk1.embedding")` only returns the module from `sys.modules` if it has already been imported; it does **not** re-execute the module body or re-run side effects. Since importing `litellm_cohere_timeout` at the top of the test file (lines 7–10) places the patch module in `sys.modules` before this test runs, the `import_module` call here is a no-op. The assertions then pass because the patch was applied by the top-level import, not because of anything `unstract.sdk1.embedding` did.
To actually verify that `unstract.sdk1.embedding`'s side-effect import wires the patch, you would need to test in an isolated process or use `importlib.reload()` — but that has its own ordering hazards. A simpler and accurate approach is to assert that `"unstract.sdk1.patches.litellm_cohere_timeout"` appears in `sys.modules` after importing `unstract.sdk1.embedding`, and document that the binding test (handler assertions) indirectly validates the wiring.
As written, this test gives false confidence: it would pass even if the `import unstract.sdk1.patches.litellm_cohere_timeout` line were deleted from `embedding.py`.
How can I resolve this? If you propose a fix, please make it concise.
Last reviewed commit: 7b4c442
    from unstract.sdk1.patches.litellm_cohere_timeout import (
        _patched_async_embedding,
        _patched_embedding,
    )
Tests will fail with ImportError when litellm version ≠ 1.80.0
_patched_embedding and _patched_async_embedding are only defined inside the else branch of the version guard in litellm_cohere_timeout.py. If the installed litellm version is anything other than 1.80.0, _SKIP_PATCH is True, that else block is skipped entirely, and these names are never bound at module level.
This unconditional top-level import will therefore raise an ImportError in any environment where litellm has been upgraded (or hasn't been pinned yet), causing all six tests to fail immediately without a meaningful error message.
Consider guarding the import and tests with a version check:

    import pytest

    from unstract.sdk1.patches.litellm_cohere_timeout import _SKIP_PATCH

    pytestmark = pytest.mark.skipif(
        _SKIP_PATCH,
        reason="Patch not applied for this litellm version",
    )

    if not _SKIP_PATCH:
        from unstract.sdk1.patches.litellm_cohere_timeout import (
            _patched_async_embedding,
            _patched_embedding,
        )
Actionable comments posted: 2
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
15-15: Consider adding a TODO marker for discoverability.
Adding `TODO:` makes this line searchable via grep/IDE tooling, helping ensure timely removal once upstream fixes the bug.

    -Remove this patch when litellm ships a fix upstream.
    +TODO: Remove this patch when litellm ships a fix upstream.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 42-65: The code references module aliases `_cohere_handler` and
`_bedrock_embed` later but only imports individual symbols (e.g.,
validate_environment, CohereEmbeddingConfig, AsyncHTTPHandler,
get_async_httpx_client, LiteLLMLoggingObj, CohereEmbeddingRequest,
EmbeddingResponse); add explicit module imports for the cohere and bedrock embed
modules (import the modules as `_cohere_handler` and `_bedrock_embed`) inside
the same else block so the later references to `_cohere_handler` and
`_bedrock_embed` resolve at runtime.
- Around line 30-32: The version guard currently pins the patch to
_PATCHED_LITELLM_VERSION = "1.80.0" causing _SKIP_PATCH to be true at runtime
for the dependency _litellm_version (which is 1.81.7), so the timeout patch is
skipped; either update _PATCHED_LITELLM_VERSION to "1.81.7" if the timeout bug
still exists in that release, or remove this patch block entirely if the
upstream bug is fixed—modify the constant _PATCHED_LITELLM_VERSION accordingly
and verify the behavior of _SKIP_PATCH (which compares Version(_litellm_version)
!= Version(_PATCHED_LITELLM_VERSION)) so the patch will run only when intended.
---
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Line 15: Update the top-of-file note in litellm_cohere_timeout.py so it
includes a searchable TODO marker; specifically replace or edit the comment
"Remove this patch when litellm ships a fix upstream." to start with "TODO:"
(e.g., "TODO: Remove this patch when litellm ships a fix upstream.") so it is
discoverable via grep/IDE and clearly signals removal once the upstream bug in
litellm is fixed.
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)
48-65: ⚠️ Potential issue | 🔴 Critical
Import the modules you rebind later.
This block imports symbols from the Cohere and Bedrock modules, but lines 213-215 assign through `_cohere_handler` and `_bedrock_embed`, which are never defined. On LiteLLM `1.81.7`, importing this patch will raise `NameError` before the monkey-patch is applied, breaking the side-effect activation path from `unstract.sdk1.embedding`.
🐛 Proposed fix

         import httpx
         import litellm
    +    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
    +    import litellm.llms.cohere.embed.handler as _cohere_handler
         from litellm.litellm_core_utils.litellm_logging import (
             Logging as LiteLLMLoggingObj,
         )
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 48-65: The patch imports Cohere/Bedrock symbols but later rebinds
globals like _cohere_handler and _bedrock_embed without first defining or
importing them, causing a NameError on import; to fix, import or define the
original symbols before rebinding (e.g., import the existing cohere handler and
bedrock embed functions used in lines that assign to _cohere_handler and
_bedrock_embed), then perform the monkey-patch assignments; locate the rebinding
logic targeting _cohere_handler and _bedrock_embed and ensure the original
symbols are referenced (via imports or safe getattr fallbacks) so the module can
be imported without raising NameError.
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
- Restore `_bedrock_embed` and `_cohere_handler` imports that were silently removed by ruff auto-fix (marked with noqa: F811)
- Add test verifying patch activation through the production import path (unstract.sdk1.embedding) per reviewer feedback
- Tests are tox-compatible: already covered by [testenv:sdk1] in tox.ini
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
            f"Current litellm version: {_litellm_version}. "
            f"Patch was written for: {_PATCHED_LITELLM_VERSION}. "
            "Please verify the upstream fix and remove this module.",
            RuntimeWarning,
RuntimeWarning contradicts stated intent; should be DeprecationWarning
The PR description explicitly states the module emits a `DeprecationWarning` to prompt engineers to verify compatibility after a litellm upgrade. However, the code uses `RuntimeWarning`.
The distinction matters in practice:
- `DeprecationWarning` is ignored by Python's default warning filters unless it is triggered by code in `__main__`, so in practice it surfaces mainly in test runs and development sessions that enable it (for example with `-W default`), which fits a "please remember to verify and remove this patch" reminder.
- `RuntimeWarning` is not filtered by default, meaning every production deployment that doesn't have litellm pinned at exactly `1.81.7` will emit a warning to stderr on every cold start.
Suggested change: replace `RuntimeWarning,` with `DeprecationWarning,`.
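The visibility difference discussed in this comment can be checked with a small experiment. This is a sketch under stated assumptions: `noisy_mod` is a hypothetical throwaway module, and the result depends on the child interpreter running with its default warning filters (a debug build, dev mode, or custom filters would change it).

```python
import os
import subprocess
import sys
import tempfile
import textwrap

MODULE_SOURCE = textwrap.dedent(
    """
    import warnings

    warnings.warn("dep: patch skipped", DeprecationWarning)
    warnings.warn("run: patch skipped", RuntimeWarning)
    """
)


def stderr_of_fresh_import():
    """Import a throwaway module in a new interpreter and return its stderr."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "noisy_mod.py"), "w") as fh:
            fh.write(MODULE_SOURCE)
        # Drop PYTHONWARNINGS so the child uses the interpreter's default filters.
        env = {k: v for k, v in os.environ.items() if k != "PYTHONWARNINGS"}
        env["PYTHONPATH"] = tmp  # make noisy_mod importable in the child
        proc = subprocess.run(
            [sys.executable, "-c", "import noisy_mod"],
            cwd=tmp,
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
    return proc.stderr


output = stderr_of_fresh_import()
print("RuntimeWarning shown:", "RuntimeWarning" in output)
print("DeprecationWarning shown:", "DeprecationWarning" in output)
```

Because both warnings are raised from an imported module rather than from `__main__`, this mirrors how a skip warning emitted by a library module would behave in a deployed service.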
        def test_patch_applied_via_production_import(self) -> None:
            """Verify side-effect import in unstract.sdk1.embedding applies the patch."""
            import importlib

            import litellm.llms.bedrock.embed.embedding as bedrock
            import litellm.llms.cohere.embed.handler as handler

            # Re-import to ensure the side-effect runs
            importlib.import_module("unstract.sdk1.embedding")
importlib.import_module is a no-op for cached modules — test is vacuous
importlib.import_module("unstract.sdk1.embedding") only returns the module from sys.modules if it has already been imported; it does not re-execute the module body or re-run side effects. Since importing litellm_cohere_timeout at the top of the test file (lines 7–10) places the patch module in sys.modules before this test runs, the import_module call here is a no-op. The assertions then pass because the patch was applied by the top-level import, not because of anything unstract.sdk1.embedding did.
To actually verify that unstract.sdk1.embedding's side-effect import wires the patch, you would need to test in an isolated process or use importlib.reload() — but that has its own ordering hazards. A simpler and accurate approach is to assert that "unstract.sdk1.patches.litellm_cohere_timeout" appears in sys.modules after importing unstract.sdk1.embedding, and document that the binding test (handler assertions) indirectly validates the wiring.
As written, this test gives false confidence: it would pass even if the import unstract.sdk1.patches.litellm_cohere_timeout line were deleted from embedding.py.
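The caching behaviour behind this comment can be reproduced in isolation. This is a minimal sketch: `side_effect_mod` and the `sys._demo_import_log` scratch attribute are hypothetical names used only for the demonstration.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Throwaway module whose import-time side effect appends to a list on `sys`.
    with open(os.path.join(tmp, "side_effect_mod.py"), "w") as fh:
        fh.write("import sys\nsys._demo_import_log.append('executed')\n")

    sys._demo_import_log = []  # hypothetical scratch attribute for this demo
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        first = importlib.import_module("side_effect_mod")
        second = importlib.import_module("side_effect_mod")  # served from sys.modules
        runs_after_two_imports = len(sys._demo_import_log)

        importlib.reload(first)  # re-executes the module body
        runs_after_reload = len(sys._demo_import_log)
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("side_effect_mod", None)

print(first is second, runs_after_two_imports, runs_after_reload)
```

The second `import_module` call returns the cached module object without running its body again, which is why a test relying on "re-importing" cannot prove that a side-effect import line exists. Running the check in a fresh interpreter avoids the ordering hazards of `reload()`.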


What
Monkey-patch litellm's cohere embed handler to correctly forward the `timeout` parameter to `client.post()` calls, fixing "Connection timed out after None seconds" errors when indexing large documents with AWS Bedrock embedding models.

Why
litellm (v1.81.7) has a bug in `litellm/llms/cohere/embed/handler.py` where both `embedding()` and `async_embedding()` receive a `timeout` parameter but never forward it to `client.post()`. This causes the timeout to default to `None`, which surfaces as a "Connection timed out after None seconds" error.
This affects all Bedrock Cohere embedding operations (e.g. `cohere.embed-multilingual-v3`) and is especially visible with large documents. The bug is present on litellm's latest `main` branch as well; no upstream fix exists.

How
- … `unstract/sdk1/patches/litellm_cohere_timeout.py`) that replaces the affected functions with versions that correctly pass `timeout=timeout` to `client.post()`
- … `cohere_embedding`)
- … `DeprecationWarning` to prompt verification
- … `# ONLY CHANGE`)
- … `unstract.sdk1.embedding`

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
No. The patched functions are exact copies of litellm 1.81.7's originals with only `timeout=timeout` added to `client.post()` calls. litellm is pinned at `1.81.7` in sdk1, so the source won't change. If litellm is later upgraded, the version guard skips the patch entirely and emits a warning.

Database Migrations

Env Config

Relevant Docs

Related Issues or PRs
- … `Large_1040.pdf`)

Dependencies Versions
- `litellm`: 1.81.7 (pinned, bug present)
- `httpx`: 0.28.1

Notes on Testing
- `uv run pytest tests/patches/test_litellm_cohere_timeout.py -v`

Screenshots
N/A

Checklist
I have read and understood the Contribution Guidelines.
🤖 Generated with Claude Code