UN-2105 [FEAT] Include adapter name in error messages#1825
UN-2105 [FEAT] Include adapter name in error messages#1825pk-zipstack wants to merge 10 commits intomainfrom
Conversation
Preserve the user-facing adapter name (e.g., "Unstract Trial LLM") from the platform service response and include it in error messages across SDK1 adapter consumers (LLM, Embedding, VectorDB, X2Text). Previously, adapter names were discarded during config retrieval and errors only showed provider names or instance IDs, making it hard to identify which adapter caused the issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
Summary by CodeRabbit
WalkthroughIntroduce Common.ADAPTER_NAME and a Utils.strip_adapter_name helper; platform stores adapter-name metadata; adapter classes capture/stash adapter names for error context; indexing routines fetch and sanitize adapter configs to exclude adapter-name metadata from generated index keys. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
prompt-service/src/unstract/prompt_service/core/index_v2.py (1)
100-109:⚠️ Potential issue | 🟠 MajorPre-existing issue:
file_hashmay be undefined.Note: This is not introduced by this PR, but
file_hashis only assigned on line 81 whenfile_info.file_hashis falsy. Whenfile_info.file_hashis truthy, the variablefile_hashused on line 101 will be undefined, causing aNameError.This should likely be:
file_hash = file_info.file_hash if not file_hash: file_hash = fs.get_hash_from_file(path=file_info.file_path)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@prompt-service/src/unstract/prompt_service/core/index_v2.py` around lines 100 - 109, The variable file_hash can be undefined when file_info.file_hash is truthy; update the assignment logic in the code that populates index_key so that file_hash is always set from file_info.file_hash first (e.g., file_hash = file_info.file_hash) and only call fs.get_hash_from_file(path=file_info.file_path) when that value is falsy, then use that ensured file_hash when building index_key (references: file_hash, file_info, fs.get_hash_from_file, index_key, self.chunking_config.chunk_size/ chunk_overlap).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/embedding.py`:
- Around line 100-105: The async embedding error paths still call
parse_litellm_err with only provider values; update those calls to include the
adapter display string like the sync path does. Compute adapter_info using
self._adapter_name and self.adapter.get_name() (same logic as the shown
adapter_info variable) and pass adapter_info into parse_litellm_err instead of
the provider-only argument in all async error raises (search for
parse_litellm_err in this module, e.g., the async embedding functions around the
current raise and the other occurrences referenced). Ensure the raise statements
use "raise parse_litellm_err(e, adapter_info) from e" so adapter context is
included.
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 266-273: The acomplete error path is building messages that only
include the provider name and drops the adapter instance name; update the error
construction in the acomplete handler to use the same adapter_info logic as
elsewhere (use self._adapter_name when present, formatted as
"'{self._adapter_name}' ({self.adapter.get_provider()})", else fall back to
"'{self.adapter.get_provider()}'") and use that adapter_info when composing
error_msg (same change should be applied to the other occurrence around lines
346-353) so async callers receive the adapter instance name in the error text.
---
Outside diff comments:
In `@prompt-service/src/unstract/prompt_service/core/index_v2.py`:
- Around line 100-109: The variable file_hash can be undefined when
file_info.file_hash is truthy; update the assignment logic in the code that
populates index_key so that file_hash is always set from file_info.file_hash
first (e.g., file_hash = file_info.file_hash) and only call
fs.get_hash_from_file(path=file_info.file_path) when that value is falsy, then
use that ensured file_hash when building index_key (references: file_hash,
file_info, fs.get_hash_from_file, index_key, self.chunking_config.chunk_size/
chunk_overlap).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ff0f61a0-7b62-4b96-ae66-eecf6097d69f
📒 Files selected for processing (9)
prompt-service/src/unstract/prompt_service/core/index_v2.pyunstract/sdk1/src/unstract/sdk1/constants.pyunstract/sdk1/src/unstract/sdk1/embedding.pyunstract/sdk1/src/unstract/sdk1/index.pyunstract/sdk1/src/unstract/sdk1/llm.pyunstract/sdk1/src/unstract/sdk1/platform.pyunstract/sdk1/src/unstract/sdk1/utils/indexing.pyunstract/sdk1/src/unstract/sdk1/vector_db.pyunstract/sdk1/src/unstract/sdk1/x2txt.py
…er-name-in-errors
Simplify adapter_info construction using or-chaining instead of ternary conditionals to bring cognitive complexity back within the SonarCloud limit of 15. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
♻️ Duplicate comments (1)
unstract/sdk1/src/unstract/sdk1/embedding.py (1)
141-156:⚠️ Potential issue | 🟠 MajorAsync embedding errors still drop the adapter instance name.
Line 142 and Line 155 still pass provider-only values to
parse_litellm_err(), so async callers miss the adapter-name context that the sync paths now include.Proposed fix
except Exception as e: - provider_name = f"{self.adapter.get_name()}" - raise parse_litellm_err(e, provider_name) from e + adapter_info = self._adapter_name or self.adapter.get_name() + raise parse_litellm_err(e, adapter_info) from e @@ except Exception as e: - provider_name = f"{self.adapter.get_name()}" - raise parse_litellm_err(e, provider_name) from e + adapter_info = self._adapter_name or self.adapter.get_name() + raise parse_litellm_err(e, adapter_info) from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/embedding.py` around lines 141 - 156, The async method get_aembeddings constructs and passes only a provider string to parse_litellm_err, losing the adapter-instance context; change its exception handler to mirror the sync path by building the same adapter-aware identifier used elsewhere (use the same expression the sync code uses to include the adapter instance name) and pass that identifier into parse_litellm_err(e, <adapter-aware-identifier>) instead of the current provider_name so async callers receive the adapter name too.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@unstract/sdk1/src/unstract/sdk1/embedding.py`:
- Around line 141-156: The async method get_aembeddings constructs and passes
only a provider string to parse_litellm_err, losing the adapter-instance
context; change its exception handler to mirror the sync path by building the
same adapter-aware identifier used elsewhere (use the same expression the sync
code uses to include the adapter instance name) and pass that identifier into
parse_litellm_err(e, <adapter-aware-identifier>) instead of the current
provider_name so async callers receive the adapter name too.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6d1355c0-09a2-4691-9f8b-1307219df5a3
📒 Files selected for processing (3)
unstract/sdk1/src/unstract/sdk1/embedding.pyunstract/sdk1/src/unstract/sdk1/index.pyunstract/sdk1/src/unstract/sdk1/llm.py
🚧 Files skipped from review as they are similar to previous changes (1)
- unstract/sdk1/src/unstract/sdk1/llm.py
Apply same or-chaining simplification to LLM complete() and stream_complete() error handlers to stay within SonarCloud's cognitive complexity limit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)
271-275: Consider extracting_adapter_info()helper and including provider context.The error format shows either adapter name or provider, but the PR objectives suggest showing both:
"Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...". This gives users the friendly name while retaining technical context for debugging.Additionally, this logic is duplicated at lines 348 and 408. A helper method would improve consistency and maintainability.
♻️ Proposed refactor
Add a helper method to the class:
def _adapter_info(self) -> str: """Return formatted adapter identifier for error messages.""" if self._adapter_name: return f"'{self._adapter_name}' ({self.adapter.get_provider()})" return f"'{self.adapter.get_provider()}'"Then replace all three occurrences (lines 271-274, 348-351, 408-411):
- adapter_info = self._adapter_name or self.adapter.get_provider() - error_msg = ( - f"Error from LLM adapter '{adapter_info}': " - f"{strip_litellm_prefix(str(e))}" - ) + error_msg = ( + f"Error from LLM adapter {self._adapter_info()}: " + f"{strip_litellm_prefix(str(e))}" + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 271 - 275, Extract a new instance helper method named _adapter_info(self) that returns a formatted identifier combining the friendly adapter name and provider, e.g. when self._adapter_name exists return f"'{self._adapter_name}' ({self.adapter.get_provider()})" else return f"'{self.adapter.get_provider()}'"; then replace the duplicated logic that builds adapter_info (the blocks using self._adapter_name and self.adapter.get_provider()) with calls to this._adapter_info() in the LLM class so all error messages use the same formatted string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 271-275: Extract a new instance helper method named
_adapter_info(self) that returns a formatted identifier combining the friendly
adapter name and provider, e.g. when self._adapter_name exists return
f"'{self._adapter_name}' ({self.adapter.get_provider()})" else return
f"'{self.adapter.get_provider()}'"; then replace the duplicated logic that
builds adapter_info (the blocks using self._adapter_name and
self.adapter.get_provider()) with calls to this._adapter_info() in the LLM class
so all error messages use the same formatted string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c4f01a13-8eee-448d-92e6-286ea4f3abf8
📒 Files selected for processing (1)
unstract/sdk1/src/unstract/sdk1/llm.py
chandrasekharan-zipstack
left a comment
There was a problem hiding this comment.
LGTM for the most part, address the comment to reduce code duplication
Move the repeated adapter-name stripping logic into Utils.strip_adapter_name() in sdk1/utils/common.py, replacing inline for-loops in index key generation across sdk1/index.py, sdk1/utils/indexing.py, and prompt-service/core/index_v2.py. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
for more information, see https://pre-commit.ci
Test ResultsSummary
Runner Tests - Full Report
SDK1 Tests - Full Report
|
|
Greptile SummaryThis PR improves the user-facing error messages for all four SDK1 adapter types (LLM, Embedding, VectorDB, X2Text) by preserving the human-readable adapter instance name (e.g. "Unstract Trial LLM") from the platform service response and embedding it in error messages, replacing raw UUIDs or cryptic provider identifiers. Key changes:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Tool
participant Platform as PlatformHelper
participant Consumer as Adapter Consumer
participant IdxGen as Index Key Generator
Tool->>Platform: get_adapter_config(instance_id)
Platform-->>Platform: pop adapter_name & adapter_type from response
Platform-->>Platform: store under _adapter_name key
Platform-->>Consumer: return config dict (includes _adapter_name)
Consumer->>Consumer: pop _adapter_name → self._adapter_name
Consumer->>Consumer: initialise adapter (clean config)
alt Error during adapter call
Consumer-->>Tool: raise Error with adapter name in message
end
Tool->>IdxGen: generate_index_key(...)
IdxGen->>Platform: get_adapter_config (x3 for vdb, emb, x2text)
Platform-->>IdxGen: configs (each includes _adapter_name)
IdxGen->>IdxGen: Utils.strip_adapter_name(configs)
IdxGen->>IdxGen: json.dumps + hash
IdxGen-->>Tool: hashed index key (backward compatible)
|
| adapter_info = self._adapter_name or self.adapter.get_provider() | ||
| error_msg = ( | ||
| f"Error from LLM provider '{self.adapter.get_provider()}': " | ||
| f"Error from LLM adapter '{adapter_info}': " | ||
| f"{strip_litellm_prefix(str(e))}" |
There was a problem hiding this comment.
Error message loses provider context when adapter name is present
The PR description states the intended "after" format as Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ..., preserving the provider name alongside the human-readable adapter name for easier debugging. However, the current implementation uses adapter_info = self._adapter_name or self.adapter.get_provider(), which means when _adapter_name is set the provider name is completely absent from the error message.
Consider including both to match the stated goal:
| adapter_info = self._adapter_name or self.adapter.get_provider() | |
| error_msg = ( | |
| f"Error from LLM provider '{self.adapter.get_provider()}': " | |
| f"Error from LLM adapter '{adapter_info}': " | |
| f"{strip_litellm_prefix(str(e))}" | |
| adapter_name = self._adapter_name | |
| provider = self.adapter.get_provider() | |
| adapter_info = f"{adapter_name} ({provider})" if adapter_name else provider | |
| error_msg = ( | |
| f"Error from LLM adapter '{adapter_info}': " | |
| f"{strip_litellm_prefix(str(e))}" | |
| ) |
The same pattern applies to stream_complete() (line ~349) and acomplete() (line ~408).
Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/llm.py
Line: 271-274
Comment:
**Error message loses provider context when adapter name is present**
The PR description states the intended "after" format as `Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...`, preserving the provider name alongside the human-readable adapter name for easier debugging. However, the current implementation uses `adapter_info = self._adapter_name or self.adapter.get_provider()`, which means when `_adapter_name` is set the provider name is completely absent from the error message.
Consider including both to match the stated goal:
```suggestion
adapter_name = self._adapter_name
provider = self.adapter.get_provider()
adapter_info = f"{adapter_name} ({provider})" if adapter_name else provider
error_msg = (
f"Error from LLM adapter '{adapter_info}': "
f"{strip_litellm_prefix(str(e))}"
)
```
The same pattern applies to `stream_complete()` (line ~349) and `acomplete()` (line ~408).
How can I resolve this? If you propose a fix, please make it concise.| except Exception as e: | ||
| adapter_name = getattr(self, "_adapter_name", "") | ||
| adapter_info = ( | ||
| f"'{adapter_name}'" if adapter_name else self._adapter_instance_id | ||
| ) | ||
| self._tool.stream_log( | ||
| log=f"Unable to get vector_db {self._adapter_instance_id}: {e}", | ||
| log=f"Unable to get vector_db {adapter_info}: {e}", | ||
| level=LogLevel.ERROR, | ||
| ) | ||
| raise VectorDBError(f"Error getting vectorDB instance: {e}") from e | ||
| raise VectorDBError( | ||
| f"Error getting vectorDB instance {adapter_info}: {e}" | ||
| ) from e |
There was a problem hiding this comment.
Inconsistent quote handling in error message identifier
VectorDB (and X2Text at x2txt.py lines 89–97) embed the quotes inside adapter_info (f"'{adapter_name}'") only when a name exists, leaving the raw UUID unquoted as a fallback. By contrast, the LLM and Embedding adapters always place the identifier inside quotes via the template string (f"Error from LLM adapter '{adapter_info}': ..."). This results in different message styles:
- With a name:
Unable to get vector_db 'My VectorDB': ...← quoted - Without a name:
Unable to get vector_db some-uuid-here: ...← unquoted
For consistency, move the quotes into the template string and let adapter_info stay as the plain string (either the name or the UUID):
| except Exception as e: | |
| adapter_name = getattr(self, "_adapter_name", "") | |
| adapter_info = ( | |
| f"'{adapter_name}'" if adapter_name else self._adapter_instance_id | |
| ) | |
| self._tool.stream_log( | |
| log=f"Unable to get vector_db {self._adapter_instance_id}: {e}", | |
| log=f"Unable to get vector_db {adapter_info}: {e}", | |
| level=LogLevel.ERROR, | |
| ) | |
| raise VectorDBError(f"Error getting vectorDB instance: {e}") from e | |
| raise VectorDBError( | |
| f"Error getting vectorDB instance {adapter_info}: {e}" | |
| ) from e | |
| adapter_name = getattr(self, "_adapter_name", "") | |
| adapter_info = adapter_name if adapter_name else self._adapter_instance_id | |
| self._tool.stream_log( | |
| log=f"Unable to get vector_db '{adapter_info}': {e}", | |
| level=LogLevel.ERROR, | |
| ) | |
| raise VectorDBError( | |
| f"Error getting vectorDB instance '{adapter_info}': {e}" | |
| ) from e |
Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/vector_db.py
Line: 112-123
Comment:
**Inconsistent quote handling in error message identifier**
`VectorDB` (and `X2Text` at `x2txt.py` lines 89–97) embed the quotes inside `adapter_info` (`f"'{adapter_name}'"`) only when a name exists, leaving the raw UUID unquoted as a fallback. By contrast, the LLM and Embedding adapters always place the identifier inside quotes via the template string (`f"Error from LLM adapter '{adapter_info}': ..."`). This results in different message styles:
- With a name: `Unable to get vector_db 'My VectorDB': ...` ← quoted
- Without a name: `Unable to get vector_db some-uuid-here: ...` ← unquoted
For consistency, move the quotes into the template string and let `adapter_info` stay as the plain string (either the name or the UUID):
```suggestion
adapter_name = getattr(self, "_adapter_name", "")
adapter_info = adapter_name if adapter_name else self._adapter_instance_id
self._tool.stream_log(
log=f"Unable to get vector_db '{adapter_info}': {e}",
level=LogLevel.ERROR,
)
raise VectorDBError(
f"Error getting vectorDB instance '{adapter_info}': {e}"
) from e
```
How can I resolve this? If you propose a fix, please make it concise.


What
Why
How
PlatformHelper._get_adapter_configuration(), the adapter name was already fetched from the platform service but immediately discarded viapop(). Now it is stored back in the config dict under a private key_adapter_name.Error from LLM provider 'azureopenai': ...Error from LLM adapter 'Unstract Trial LLM' (azureopenai): ...index.py,utils/indexing.py, prompt-serviceindex_v2.py) strip the_adapter_namekey before hashing to maintain backward compatibility with existing document index keys.Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
_adapter_namekey is stripped before index key hashing, so existing indexed documents are unaffected. The only user-visible change is richer error messages that now include the adapter instance name alongside the provider name.Database Migrations
Env Config
Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.