
[FIX] Monkey-patch litellm cohere embed timeout for Bedrock embeddings#1848

Open
pk-zipstack wants to merge 13 commits into main from fix/litellm-cohere-embed-timeout

Conversation

@pk-zipstack pk-zipstack commented Mar 11, 2026

What

Monkey-patch litellm's cohere embed handler to correctly forward the timeout parameter to client.post() calls, fixing "Connection timed out after None seconds" errors when indexing large documents with AWS Bedrock embedding models.

Why

litellm (v1.81.7) has a bug in litellm/llms/cohere/embed/handler.py where both embedding() and async_embedding() receive a timeout parameter but never forward it to client.post(). This causes the timeout to default to None, which surfaces as:

Connection timed out after None seconds.

This affects all Bedrock Cohere embedding operations (e.g. cohere.embed-multilingual-v3) and is especially visible with large documents. The bug is present on litellm's latest main branch as well — no upstream fix exists.

How

  • Added a monkey-patch module (unstract/sdk1/patches/litellm_cohere_timeout.py) that replaces the affected functions with versions that correctly pass timeout=timeout to client.post()
  • Patches three targets: the cohere handler module, the async variant, and bedrock's direct import binding (cohere_embedding)
  • Includes a version guard that skips the patch entirely if litellm is upgraded past 1.81.7, with a DeprecationWarning to prompt verification
  • Each patched function has inline comments marking the single line changed (# ONLY CHANGE)
  • Patch is activated via side-effect import from unstract.sdk1.embedding

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)

No. The patched functions are exact copies of litellm 1.81.7's originals with only timeout=timeout added to client.post() calls. litellm is pinned at 1.81.7 in sdk1, so the source won't change. If litellm is later upgraded, the version guard skips the patch entirely and emits a warning.

Database Migrations

  • None

Env Config

  • None

Relevant Docs

  • N/A

Related Issues or PRs

  • Fixes Bedrock embedding timeout issue reported on staging with large documents (e.g. Large_1040.pdf)

Dependencies Versions

  • litellm: 1.81.7 (pinned, bug present)
  • httpx: 0.28.1

Notes on Testing

  • 6 unit tests added covering:
    • Sync path: timeout value (600.0), None timeout, httpx.Timeout object all forwarded correctly
    • Async path: timeout value forwarded correctly
    • Monkey-patch wiring: cohere handler and bedrock handler both point to patched functions
  • All tests pass: uv run pytest tests/patches/test_litellm_cohere_timeout.py -v
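
The style of assertion described in the testing notes can be sketched with `unittest.mock`. This is an assumption-laden illustration: `embed_with_timeout` is a hypothetical function under test, and a plain `object()` stands in for an `httpx.Timeout` instance so the sketch has no third-party dependencies.

```python
from unittest.mock import MagicMock


def embed_with_timeout(client, payload, timeout=None):
    """Hypothetical function under test; forwards timeout to client.post()."""
    return client.post("https://example.invalid/embed", data=payload, timeout=timeout)


def check_timeout_forwarded(timeout):
    client = MagicMock()
    embed_with_timeout(client, "{}", timeout=timeout)
    client.post.assert_called_once()
    # Identity check: the exact object passed in must reach client.post().
    return client.post.call_args.kwargs["timeout"] is timeout


sentinel_timeout = object()  # stands in for an httpx.Timeout instance
results = [check_timeout_forwarded(t) for t in (600, None, sentinel_timeout)]
print(results)
```

Checking the keyword argument received by `client.post()` is stricter than asserting that the call happened, which would also pass for the unpatched function.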

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

🤖 Generated with Claude Code

pk-zipstack and others added 3 commits March 10, 2026 14:44
litellm's cohere embed handler (1.80.0) receives a timeout parameter
but doesn't pass it to client.post(), causing "Connection timed out
after None seconds" on large Bedrock embedding requests.

This adds a monkey-patch that replaces the affected functions with
versions that correctly forward timeout. Includes version guard,
source comments, and unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ions

- Version guard now skips patch entirely when litellm > 1.80.0
  instead of just warning
- Test assertions now check exact timeout value received by
  client.post(), not just that it was called
- Inline comments at client.post() calls marked with ONLY CHANGE

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new patches package and a litellm Cohere timeout monkey-patch (sync + async) that forwards timeout arguments to underlying HTTP client calls; the patch is applied by a side-effect import in the SDK embedding module and unit tests verify timeout propagation and wiring.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Patch package initializer: `unstract/sdk1/src/unstract/sdk1/patches/__init__.py` | New package initializer documenting that patches are applied via side-effect imports and noting activation from the SDK embedding module. |
| Cohere timeout monkey-patch: `unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` | New monkey-patch adding _patched_async_embedding and _patched_embedding that forward timeout to underlying HTTP client post() calls, install into litellm handlers under a version guard, and emit a RuntimeWarning when not applied. |
| Activation via embedding module: `unstract/sdk1/src/unstract/sdk1/embedding.py` | Single-line side-effect import added to load and apply the patch when the embedding module is imported. |
| Tests for the patch: `unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` | New tests covering sync and async embedding paths, asserting timeout propagation (numeric, None, httpx.Timeout) to HTTP client post() and verifying litellm handler replacement. |
| Project metadata: `requirements.txt`, `pyproject.toml` | Manifest files present in the diff set (no functional edits reported). |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant SDK as "SDK embedding\n(unstract.sdk1.embedding)"
    participant Patch as "Patch module\n(litellm_cohere_timeout)"
    participant Litellm as "litellm embed\nhandler (wrapped)"
    participant HTTP as "HTTP client\n(e.g., httpx)"

    User->>SDK: call embed(inputs, timeout=...)
    SDK->>Patch: import (side-effect) / use patched handler
    Patch->>Litellm: invoke wrapped handler
    Litellm->>HTTP: client.post(..., timeout=passed_timeout)
    HTTP-->>Litellm: response
    Litellm-->>Patch: embedding result
    Patch-->>SDK: return embeddings
    SDK-->>User: return embeddings

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The pull request title accurately describes the main change: adding a monkey-patch for litellm's cohere embed timeout issue affecting Bedrock embeddings. |
| Description check | ✅ Passed | The pull request description comprehensively covers all required template sections with clear, detailed information about the changes, rationale, implementation, risk assessment, and testing. |

pre-commit-ci bot and others added 3 commits March 11, 2026 10:45
- Remove useless self-assignment `model = model`
- Use int timeout values in tests to avoid float equality checks
- Use `is` identity checks instead of `==` for timeout assertions
- Replace async mock_post with AsyncMock to properly use async features

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py (1)

7-10: Exercise the production import hook in one test.

Because this file imports the patch module directly, the wiring checks only prove direct-import behavior. Add one integration test that imports unstract.sdk1.embedding and asserts the handler is patched, so Line 7 in unstract/sdk1/src/unstract/sdk1/embedding.py cannot regress silently.

Also applies to: 194-203

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` around lines 7 -
10, Add an integration test that imports unstract.sdk1.embedding (instead of
importing the patch module directly) and asserts the production import hook
applied the patch: after importing unstract.sdk1.embedding, verify the module's
embedding handler uses the patched implementations by checking that its async
and sync embedding callables resolve to the symbols _patched_async_embedding and
_patched_embedding (or that their identities/reference equality match those
functions imported from unstract.sdk1.patches.litellm_cohere_timeout); update or
add this assertion into test_litellm_cohere_timeout.py alongside the existing
direct-import checks so the production import wiring (line referencing
unstract.sdk1.embedding) is exercised and cannot regress.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 26-43: The patch currently imports private litellm.llms.* modules
before checking _SKIP_PATCH and uses a loose version check, so move all
private/internal imports (e.g., validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse and any litellm.llms.* import lines) to after the guard that
computes _SKIP_PATCH, change the version gate from a ">" check to an exact
equality check against the known compatible LiteLLM version, and when skipping
emit a visible warning via warnings.warn (not just DeprecationWarning swallowed
by default) so callers know the patch was skipped; ensure functions/classes
referenced in the patch (validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) are only imported after the guard.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f70cb6f9-69dd-431a-97bc-906d92c597ff

📥 Commits

Reviewing files that changed from the base of the PR and between c41e05e and dcd0663.

📒 Files selected for processing (5)
  • unstract/sdk1/src/unstract/sdk1/embedding.py
  • unstract/sdk1/src/unstract/sdk1/patches/__init__.py
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
  • unstract/sdk1/tests/patches/__init__.py
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py


@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

26-64: ⚠️ Potential issue | 🟠 Major

Move the private LiteLLM imports behind the guard and gate on the exact tested version.

Lines 26-43 still import private litellm.llms.* modules before _SKIP_PATCH is evaluated, so a future LiteLLM reorg can fail at import time instead of cleanly skipping. Also, Line 56 only skips > 1.80.0; for a copied private implementation, != is safer. The skip warning on Lines 58-64 uses DeprecationWarning, which is usually hidden for imported library code, so operators may never notice that the patch was not applied. With the provided dependency context showing LiteLLM 1.81.7, this path would currently skip.

🔧 Suggested fix
 import importlib.metadata
 import json
 import logging
 import warnings
 from collections.abc import Callable

 import httpx
 import litellm
-import litellm.llms.bedrock.embed.embedding as _bedrock_embed
-import litellm.llms.cohere.embed.handler as _cohere_handler
 from litellm.litellm_core_utils.litellm_logging import (
     Logging as LiteLLMLoggingObj,
 )
-from litellm.llms.cohere.embed.handler import (
-    validate_environment,
-)
-from litellm.llms.cohere.embed.v1_transformation import (
-    CohereEmbeddingConfig,
-)
-from litellm.llms.custom_httpx.http_handler import (
-    AsyncHTTPHandler,
-    HTTPHandler,
-    get_async_httpx_client,
-)
-from litellm.types.llms.bedrock import CohereEmbeddingRequest
-from litellm.types.utils import EmbeddingResponse
 from packaging.version import Version

 logger = logging.getLogger(__name__)

 _DEFAULT_TIMEOUT = httpx.Timeout(None)

 _PATCHED_LITELLM_VERSION = "1.80.0"
 _litellm_version = importlib.metadata.version("litellm")
-_SKIP_PATCH = Version(_litellm_version) > Version(_PATCHED_LITELLM_VERSION)
+_SKIP_PATCH = Version(_litellm_version) != Version(_PATCHED_LITELLM_VERSION)
 if _SKIP_PATCH:
     warnings.warn(
         "litellm_cohere_timeout patch was SKIPPED — not applied. "
         f"Current litellm version: {_litellm_version}. "
         f"Patch was written for: {_PATCHED_LITELLM_VERSION}. "
         "Please verify the upstream fix and remove this module.",
-        DeprecationWarning,
+        RuntimeWarning,
         stacklevel=2,
     )
+else:
+    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
+    import litellm.llms.cohere.embed.handler as _cohere_handler
+    from litellm.llms.cohere.embed.handler import validate_environment
+    from litellm.llms.cohere.embed.v1_transformation import CohereEmbeddingConfig
+    from litellm.llms.custom_httpx.http_handler import (
+        AsyncHTTPHandler,
+        HTTPHandler,
+        get_async_httpx_client,
+    )
+    from litellm.types.llms.bedrock import CohereEmbeddingRequest
+    from litellm.types.utils import EmbeddingResponse
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around
lines 26 - 64, The current file imports private litellm modules before computing
_SKIP_PATCH, which can cause import-time failures; move all imports of
litellm.llms.* and litellm.litellm_core_utils.* (symbols: _bedrock_embed,
_cohere_handler, LiteLLMLoggingObj, validate_environment, CohereEmbeddingConfig,
AsyncHTTPHandler, HTTPHandler, get_async_httpx_client, CohereEmbeddingRequest,
EmbeddingResponse) so they occur only after computing _litellm_version and
_SKIP_PATCH; change the version check to use equality against
_PATCHED_LITELLM_VERSION (i.e. skip only when Version(_litellm_version) ==
Version(_PATCHED_LITELLM_VERSION) or gate with != as requested) and replace the
DeprecationWarning with a visible warning class such as UserWarning or
RuntimeWarning so operators see the warning when the patch is skipped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 651cf418-be2b-4238-bfe1-13ad72694bba

📥 Commits

Reviewing files that changed from the base of the PR and between dcd0663 and b93ac45.

📒 Files selected for processing (2)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py

pk-zipstack and others added 2 commits March 12, 2026 09:29
- Move all litellm.llms.* imports inside the `if not _SKIP_PATCH`
  block so they're only loaded when the patch will be applied
- Change version gate from `>` to exact equality (`!=`) — patch
  only applies on the exact known-affected version (1.80.0)
- Use RuntimeWarning instead of DeprecationWarning for skip message
  so it's visible by default (DeprecationWarning is filtered)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
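
The guard described in this commit message can be sketched as follows. This is a simplified illustration under stated assumptions: `should_skip_patch` and `_parse` are hypothetical helpers, and the naive tuple parser only handles plain `X.Y.Z` strings, whereas the suggested diff in the review uses `packaging.version.Version`.

```python
import warnings

_PATCHED_VERSION = "1.80.0"  # version the copied code was taken from


def _parse(version):
    """Naive parser for plain X.Y.Z strings (no pre-release handling)."""
    return tuple(int(part) for part in version.split("."))


def should_skip_patch(installed, patched=_PATCHED_VERSION):
    """Return True (and warn) unless the installed version matches exactly."""
    skip = _parse(installed) != _parse(patched)
    if skip:
        warnings.warn(
            f"patch skipped: installed {installed}, written for {patched}",
            RuntimeWarning,
            stacklevel=2,
        )
    return skip


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    decisions = {v: should_skip_patch(v) for v in ("1.80.0", "1.81.7", "1.79.9")}

print(decisions)
print([w.category.__name__ for w in caught])
```

With an exact-equality gate, both newer and older versions skip the patch, so the constant has to match the pinned dependency for the patch to run at all.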

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 30-41: The version guard (_PATCHED_LITELLM_VERSION,
_litellm_version, _SKIP_PATCH) causes the patch to be skipped because sdk1 uses
LiteLLM 1.81.7; update the constant _PATCHED_LITELLM_VERSION to "1.81.7" (and
run tests/verify that the Cohere timeout fix is effective in that release) or,
alternatively, revert sdk1's litellm dependency to "1.80.0" so the existing
patch applies—ensure the chosen approach makes _SKIP_PATCH False so the patch
code executes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0422bc8a-b039-47cf-aeeb-e54e2c5640e2

📥 Commits

Reviewing files that changed from the base of the PR and between b93ac45 and f6181a6.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py

@pk-zipstack can these tests be made to run with the existing tox tests? Otherwise I don't see much value in this unless devs run the tests themselves

@pk-zipstack Agree, please check if these are tox compatible.
If not, remove them and roll out equivalent versions that are compatible with tox instead.

embeddings without going through unstract.sdk1.embedding will NOT
have this patch active.

Remove this patch when litellm ships a fix upstream.

@pk-zipstack Add a TODO marker here.

@pk-zipstack Also the tests import _patched_embedding and _patched_async_embedding directly. There's no test that imports unstract.sdk1.embedding instead and verifies the patch was applied through the production code path. This would catch regressions in the side-effect import line.
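
One way to exercise a side-effect import through the production path is to run the check in a fresh interpreter. The sketch below is only an illustration: `demo_pkg`, `demo_pkg.patch`, and `demo_pkg.embedding` are hypothetical stand-ins for the real package layout, created in a temporary directory.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Build a tiny throwaway package: demo_pkg.embedding imports demo_pkg.patch
# purely for its side effect, like the production wiring described above.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "demo_pkg")
os.makedirs(pkg)
files = {
    "__init__.py": "",
    "patch.py": "APPLIED = True\n",
    "embedding.py": "import demo_pkg.patch  # side-effect import\n",
}
for name, body in files.items():
    with open(os.path.join(pkg, name), "w") as fh:
        fh.write(body)

# Run the probe in a new process so no earlier import in the test session
# can mask a missing side-effect import in embedding.py.
probe = textwrap.dedent(
    """
    import sys
    import demo_pkg.embedding
    print("demo_pkg.patch" in sys.modules)
    """
)
env = dict(os.environ, PYTHONPATH=root)
result = subprocess.run(
    [sys.executable, "-c", probe],
    env=env, capture_output=True, text=True, check=True,
)
patch_loaded = result.stdout.strip() == "True"
print(patch_loaded)
```

If the import line were deleted from `embedding.py`, the probe would print `False`, which is the regression this kind of test is meant to catch.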

@hari-kuriakose hari-kuriakose left a comment

@pk-zipstack LGTM overall.

However, I am requesting a few critical changes.

pk-zipstack and others added 2 commits March 12, 2026 21:31
Co-authored-by: Hari John Kuriakose <hari@zipstack.com>
Signed-off-by: Praveen Kumar <praveen@zipstack.com>

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR introduces a monkey-patch module to fix a timeout-forwarding bug in litellm's Cohere embed handler (cohere/embed/handler.py), where the timeout parameter received by embedding() and async_embedding() was never passed through to client.post(), causing Bedrock Cohere embedding calls to time out with "Connection timed out after None seconds." The patch replaces both functions with corrected copies and wires them into the handler and bedrock modules via a side-effect import in unstract.sdk1.embedding.

Key issues (several carried over from previous review rounds):

  • Patch is never applied: _PATCHED_LITELLM_VERSION = "1.81.7" but litellm is pinned at 1.80.0, so _SKIP_PATCH is always True and the timeout bug remains unfixed in all current deployments.
  • Tests fail with ImportError: _patched_embedding / _patched_async_embedding are only defined in the else block; the unconditional top-level import in the test file crashes when _SKIP_PATCH is True.
  • RuntimeWarning instead of DeprecationWarning: The PR description promises a DeprecationWarning; the code emits a RuntimeWarning, which is not filtered by default and will be noisy in all production environments.
  • Vacuous test: test_patch_applied_via_production_import calls importlib.import_module(), which returns the already-cached module and never re-runs the side effect — the test passes regardless of whether embedding.py contains the side-effect import.

Confidence Score: 1/5

  • Not safe to merge — the patch is never applied due to a version constant mismatch, meaning the original timeout bug remains active in production.
  • The central fix in this PR (forwarding timeout to client.post()) is logically correct, but the version guard constant _PATCHED_LITELLM_VERSION = "1.81.7" does not match the pinned litellm version 1.80.0, so _SKIP_PATCH is always True and the patch is never applied. Additionally, the test suite will fail at collection time due to the unconditional import of symbols defined only inside the skipped else block. These issues (noted in prior review rounds) remain unresolved in the current commit.
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py — version constant must be corrected to "1.80.0" before the patch does anything. unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py — imports must be guarded by a _SKIP_PATCH check.

Important Files Changed

| Filename | Overview |
|---|---|
| `unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` | Core patch module; contains three critical bugs flagged in previous review threads: version constant set to "1.81.7" instead of the pinned "1.80.0" (patch never applied), _cohere_handler/_bedrock_embed imports now present but previously missing. Additionally, uses RuntimeWarning instead of the DeprecationWarning described in the PR. The combination of the version mismatch and RuntimeWarning means the patch is both never applied AND noisily warns on every startup. |
| `unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py` | Test suite with 6 cases covering sync/async timeout forwarding and patch wiring; top-level unconditional import of _patched_embedding/_patched_async_embedding will raise ImportError when _SKIP_PATCH is True (already flagged). test_patch_applied_via_production_import is vacuous because importlib.import_module returns a cached module without re-running side effects. |
| `unstract/sdk1/src/unstract/sdk1/embedding.py` | Adds a single side-effect import of the patch module; no functional changes to the Embedding or EmbeddingCompat classes. |
| `unstract/sdk1/src/unstract/sdk1/patches/__init__.py` | New package init with documentation docstring only; no functional code. |
| `unstract/sdk1/tests/patches/__init__.py` | Empty test package init file; no issues. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Module import:\nunstract.sdk1.embedding] --> B[Side-effect import:\nlitellm_cohere_timeout]
    B --> C{_SKIP_PATCH?\nlitellm version ≠ 1.81.7}
    C -- "True (always, since\npinned version is 1.80.0)" --> D[warnings.warn RuntimeWarning\nPatch NOT applied\nTimeout bug remains]
    C -- "False (never reached)" --> E[Define _patched_embedding\n& _patched_async_embedding]
    E --> F[_cohere_handler.embedding =\n_patched_embedding]
    E --> G[_cohere_handler.async_embedding =\n_patched_async_embedding]
    E --> H[_bedrock_embed.cohere_embedding =\n_patched_embedding]
    F & G & H --> I[litellm.embedding call\nwith timeout forwarded\nto client.post]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Line: 39

Comment:
**`RuntimeWarning` contradicts stated intent — should be `DeprecationWarning`**

The PR description explicitly states the module emits a `DeprecationWarning` to prompt engineers to verify compatibility after a litellm upgrade. However, the code uses `RuntimeWarning`.

The distinction matters in practice:
- `DeprecationWarning` is ignored by Python's default warning filters unless it is triggered directly by code in `__main__`, so in practice it surfaces during development and in test runs (pytest displays it by default), which suits a reminder to verify and remove this patch.
- `RuntimeWarning` is always shown (not filtered by default), meaning every production deployment that doesn't have litellm pinned at exactly `1.81.7` will emit a warning to stderr on every cold start.

```suggestion
        DeprecationWarning,
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
Line: 198-206

Comment:
**`importlib.import_module` is a no-op for cached modules — test is vacuous**

`importlib.import_module("unstract.sdk1.embedding")` simply returns the cached module from `sys.modules` when that module has already been imported; it does **not** re-execute the module body or re-run side effects. Since importing `litellm_cohere_timeout` at the top of the test file (lines 7–10) places the patch module in `sys.modules` before this test runs, the `import_module` call here is a no-op. The assertions then pass because the patch was applied by the top-level import, not because of anything `unstract.sdk1.embedding` did.

To actually verify that `unstract.sdk1.embedding`'s side-effect import wires the patch, you would need to test in an isolated process or use `importlib.reload()` — but that has its own ordering hazards. A simpler and accurate approach is to assert that `"unstract.sdk1.patches.litellm_cohere_timeout"` appears in `sys.modules` after importing `unstract.sdk1.embedding`, and document that the binding test (handler assertions) indirectly validates the wiring.

As written, this test gives false confidence: it would pass even if the `import unstract.sdk1.patches.litellm_cohere_timeout` line were deleted from `embedding.py`.

How can I resolve this? If you propose a fix, please make it concise.
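
The caching behaviour described in this comment can be reproduced with a throwaway module. This is a minimal sketch: `demo_side_effect` and the `DEMO_RUNS` environment variable are hypothetical names used only to make the module body's side effect observable.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module whose body increments a counter each time it runs.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_side_effect.py"), "w") as fh:
    fh.write(
        "import os\n"
        "os.environ['DEMO_RUNS'] = str(int(os.environ.get('DEMO_RUNS', '0')) + 1)\n"
    )

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

first = importlib.import_module("demo_side_effect")
second = importlib.import_module("demo_side_effect")  # served from sys.modules

runs = os.environ["DEMO_RUNS"]
same_object = first is second
print(runs, same_object, "demo_side_effect" in sys.modules)
```

The second `import_module` call returns the same module object without running the body again, which is why a test that relies on it cannot tell whether the side-effect import is still present.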

Last reviewed commit: 7b4c442

Comment on lines +7 to +10
from unstract.sdk1.patches.litellm_cohere_timeout import (
    _patched_async_embedding,
    _patched_embedding,
)

Tests will fail with ImportError when litellm version ≠ 1.80.0

_patched_embedding and _patched_async_embedding are only defined inside the else branch of the version guard in litellm_cohere_timeout.py. If the installed litellm version is anything other than 1.80.0, _SKIP_PATCH is True, that else block is skipped entirely, and these names are never bound at module level.

This unconditional top-level import will therefore raise an ImportError in any environment where litellm has been upgraded (or hasn't been pinned yet), causing all six tests to fail immediately without a meaningful error message.

Consider guarding the import and tests with a version check:

import pytest
from unstract.sdk1.patches.litellm_cohere_timeout import _SKIP_PATCH

pytestmark = pytest.mark.skipif(
    _SKIP_PATCH,
    reason="Patch not applied for this litellm version",
)

if not _SKIP_PATCH:
    from unstract.sdk1.patches.litellm_cohere_timeout import (
        _patched_async_embedding,
        _patched_embedding,
    )


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

15-15: Consider adding a TODO marker for discoverability.

Adding TODO: makes this line searchable via grep/IDE tooling, helping ensure timely removal once upstream fixes the bug.

-Remove this patch when litellm ships a fix upstream.
+TODO: Remove this patch when litellm ships a fix upstream.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` at line
15, Update the top-of-file note in litellm_cohere_timeout.py so it includes a
searchable TODO marker; specifically replace or edit the comment "Remove this
patch when litellm ships a fix upstream." to start with "TODO:" (e.g., "TODO:
Remove this patch when litellm ships a fix upstream.") so it is discoverable via
grep/IDE and clearly signals removal once the upstream bug in litellm is fixed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py`:
- Around line 42-65: The code references module aliases `_cohere_handler` and
`_bedrock_embed` later but only imports individual symbols (e.g.,
validate_environment, CohereEmbeddingConfig, AsyncHTTPHandler,
get_async_httpx_client, LiteLLMLoggingObj, CohereEmbeddingRequest,
EmbeddingResponse); add explicit module imports for the cohere and bedrock embed
modules (import the modules as `_cohere_handler` and `_bedrock_embed`) inside
the same else block so the later references to `_cohere_handler` and
`_bedrock_embed` resolve at runtime.
- Around line 30-32: The version guard currently pins the patch to
_PATCHED_LITELLM_VERSION = "1.80.0" causing _SKIP_PATCH to be true at runtime
for the dependency _litellm_version (which is 1.81.7), so the timeout patch is
skipped; either update _PATCHED_LITELLM_VERSION to "1.81.7" if the timeout bug
still exists in that release, or remove this patch block entirely if the
upstream bug is fixed—modify the constant _PATCHED_LITELLM_VERSION accordingly
and verify the behavior of _SKIP_PATCH (which compares Version(_litellm_version)
!= Version(_PATCHED_LITELLM_VERSION)) so the patch will run only when intended.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 42ebcad6-b1fa-4a6e-91c5-b75e12a511b5

📥 Commits

Reviewing files that changed from the base of the PR and between f6181a6 and 644da84.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py


@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py (1)

48-65: ⚠️ Potential issue | 🔴 Critical

Import the modules you rebind later.

This block imports symbols from the Cohere and Bedrock modules, but lines 213-215 assign through _cohere_handler and _bedrock_embed, which are never defined. On LiteLLM 1.81.7, importing this patch will raise NameError before the monkey-patch is applied, breaking the side-effect activation path from unstract.sdk1.embedding.

🐛 Proposed fix
     import httpx
     import litellm
+    import litellm.llms.bedrock.embed.embedding as _bedrock_embed
+    import litellm.llms.cohere.embed.handler as _cohere_handler
     from litellm.litellm_core_utils.litellm_logging import (
         Logging as LiteLLMLoggingObj,
     )
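
The reason both modules must be imported and rebound can be sketched without litellm. The example below uses two hypothetical stand-in modules (`fake_cohere_handler` and `fake_bedrock_embed` are illustrative names, not litellm's) and assumes the Bedrock module holds its own reference to the handler function, as a `from ... import embedding as cohere_embedding` statement would create.

```python
import types

# Hypothetical stand-in for litellm.llms.cohere.embed.handler.
handler = types.ModuleType("fake_cohere_handler")


def _original_embedding(timeout=None):
    return f"original(timeout={timeout})"


handler.embedding = _original_embedding

# Hypothetical stand-in for litellm.llms.bedrock.embed.embedding. It keeps its
# own reference to the function, as a `from handler import embedding as
# cohere_embedding` statement would.
bedrock_embed = types.ModuleType("fake_bedrock_embed")
bedrock_embed.cohere_embedding = handler.embedding


def _patched_embedding(timeout=None):
    return f"patched(timeout={timeout})"


# Rebinding the attribute on the handler module alone...
handler.embedding = _patched_embedding
after_handler_patch = bedrock_embed.cohere_embedding(timeout=30)

# ...leaves the earlier binding untouched, so it has to be rebound separately.
bedrock_embed.cohere_embedding = _patched_embedding
after_both_patches = bedrock_embed.cohere_embedding(timeout=30)

print(after_handler_patch)
print(after_both_patches)
```

Both assignments go through a module object, which is why the patch needs module-level imports such as `_cohere_handler` and `_bedrock_embed` rather than only the individual symbols.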
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py` around
lines 48 - 65, The patch imports Cohere/Bedrock symbols but later rebinds
globals like _cohere_handler and _bedrock_embed without first defining or
importing them, causing a NameError on import; to fix, import or define the
original symbols before rebinding (e.g., import the existing cohere handler and
bedrock embed functions used in lines that assign to _cohere_handler and
_bedrock_embed), then perform the monkey-patch assignments; locate the rebinding
logic targeting _cohere_handler and _bedrock_embed and ensure the original
symbols are referenced (via imports or safe getattr fallbacks) so the module can
be imported without raising NameError.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e8692340-9a40-4b25-ac09-56dbde0e3fb8

📥 Commits

Reviewing files that changed from the base of the PR and between 644da84 and 419d5e9.

📒 Files selected for processing (1)
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py

pk-zipstack and others added 2 commits March 12, 2026 23:12
- Restore `_bedrock_embed` and `_cohere_handler` imports that were
  silently removed by ruff auto-fix (marked with noqa: F811)
- Add test verifying patch activation through the production import
  path (unstract.sdk1.embedding) per reviewer feedback
- Tests are tox-compatible — already covered by [testenv:sdk1] in
  tox.ini

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

Test Results

Summary
  • Runner Tests: 11 passed, 0 failed (11 total)

Runner Tests - Full Report
| filepath | function | passed | SUBTOTAL |
| --- | --- | --- | --- |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_logs` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_cleanup` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_cleanup_skip` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_client_init` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image_exists` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_container_run_config` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_container_run_config_without_mount` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_run_container` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_get_image_for_sidecar` | 1 | 1 |
| `runner/src/unstract/runner/clients/test_docker.py` | `test_sidecar_container` | 1 | 1 |
| TOTAL | | 11 | 11 |

@sonarqubecloud

f"Current litellm version: {_litellm_version}. "
f"Patch was written for: {_PATCHED_LITELLM_VERSION}. "
"Please verify the upstream fix and remove this module.",
RuntimeWarning,
RuntimeWarning contradicts stated intent — should be DeprecationWarning

The PR description explicitly states the module emits a DeprecationWarning to prompt engineers to verify compatibility after a litellm upgrade. However, the code uses RuntimeWarning.

The distinction matters in practice:

  • DeprecationWarning is ignored by Python's default warning filters unless it is triggered by code running in __main__, while test runners such as pytest display it. It therefore surfaces during development and in test runs, which is exactly right for "please remember to verify and remove this patch".
  • RuntimeWarning is not ignored by default, meaning every production deployment that doesn't have litellm pinned at exactly 1.81.7 will emit a warning to stderr on every cold start.
Suggested change
RuntimeWarning,
DeprecationWarning,
Path: unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Line: 39

Comment on lines +198 to +206
def test_patch_applied_via_production_import(self) -> None:
"""Verify side-effect import in unstract.sdk1.embedding applies the patch."""
import importlib

import litellm.llms.bedrock.embed.embedding as bedrock
import litellm.llms.cohere.embed.handler as handler

# Re-import to ensure the side-effect runs
importlib.import_module("unstract.sdk1.embedding")
importlib.import_module is a no-op for cached modules — test is vacuous

importlib.import_module("unstract.sdk1.embedding") returns the cached module from sys.modules if it has already been imported; it does not re-execute the module body or re-run side effects. Importing litellm_cohere_timeout at the top of the test file (lines 7–10) applies the patch and places the patch module in sys.modules before this test runs, so the import_module call here cannot re-apply it. The assertions then pass because the patch was applied by the top-level import, not because of anything unstract.sdk1.embedding did.

To actually verify that unstract.sdk1.embedding's side-effect import wires the patch, you would need to test in an isolated process or use importlib.reload() — but that has its own ordering hazards. A simpler and accurate approach is to assert that "unstract.sdk1.patches.litellm_cohere_timeout" appears in sys.modules after importing unstract.sdk1.embedding, and document that the binding test (handler assertions) indirectly validates the wiring.

As written, this test gives false confidence: it would pass even if the import unstract.sdk1.patches.litellm_cohere_timeout line were deleted from embedding.py.
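
The caching behaviour can be reproduced in isolation. The sketch below writes a throwaway module (the name `demo_side_effect_module` and the `runs.log` file are hypothetical) whose body appends one line to a log file each time it executes, then counts executions after a first import, a repeated `import_module`, and a `reload`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "demo_side_effect_module"  # hypothetical throwaway module
MODULE_SOURCE = (
    "from pathlib import Path\n"
    "# Side effect: append one line per execution of the module body.\n"
    "with open(Path(__file__).with_name('runs.log'), 'a') as fh:\n"
    "    fh.write('ran\\n')\n"
)


def count_runs() -> tuple:
    """Return body-execution counts after import, re-import, and reload."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{MODULE_NAME}.py").write_text(MODULE_SOURCE)
        log = Path(tmp, "runs.log")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(MODULE_NAME)
            after_first_import = len(log.read_text().splitlines())

            # Already in sys.modules: the cached module is returned as-is.
            importlib.import_module(MODULE_NAME)
            after_second_import = len(log.read_text().splitlines())

            # reload() is what actually re-executes the module body.
            importlib.reload(module)
            after_reload = len(log.read_text().splitlines())
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    return after_first_import, after_second_import, after_reload


print(count_runs())  # (1, 1, 2)
```

The second `import_module` call leaves the count unchanged, which is why a test that relies on it to re-run a side effect passes or fails for reasons unrelated to the module under test.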

Path: unstract/sdk1/tests/patches/test_litellm_cohere_timeout.py
Line: 198-206


3 participants