
Conversation


@major major commented Jan 7, 2026

Description

Switch the /v1/infer endpoint from using the AsyncAgent abstraction to the direct chat.completions.create() API.

The AsyncAgent.create_turn() method internally calls self.initialize(), which does not exist in llama-stack-client 0.3.5, causing an AttributeError at runtime.

The chat completions API is a better fit for the stateless /infer endpoint because:

  • Simpler for single-turn inference (no session management)
  • Avoids the broken Agent abstraction
  • Fewer moving parts (a rough sketch of the new call follows below)
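
A rough, hedged sketch of the new flow (names such as infer_once, client, model_id, and system_prompt are placeholders; the real endpoint builds typed message params and resolves the model from configuration):

async def infer_once(client, model_id: str, system_prompt: str, question: str) -> str:
    """Sketch only: single-turn, stateless inference via the direct chat completions API."""
    response = await client.chat.completions.create(
        model=model_id,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": question},
        ],
    )
    # Defensive extraction mirroring the PR: return "" when choices or content are missing.
    if not response.choices:
        return ""
    content = getattr(response.choices[0].message, "content", None)
    return str(content) if content is not None else ""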

Type of change

  • Bug fix

Tools used to create PR

  • Assisted-by: Claude
  • Generated by: N/A

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  1. Start lightspeed-stack with llama-stack in library mode (Vertex AI provider)
  2. Send a request to /v1/infer with rh-identity auth header
  3. Verify response is returned successfully without AttributeError

Example test:

cd tmp/test-rlsapi-vertexai
podman compose up --build -d
python3 test_infer.py -v -q "How do I list files on RHEL?"

Summary by CodeRabbit

  • Refactor

    • Switched the internal inference path to a direct chat completions flow for simpler, more reliable request handling.
  • Bug Fixes

    • Added defensive checks to avoid failures on incomplete LLM responses; returns empty results gracefully when content is missing.
  • Tests

    • Updated tests and centralized mocking to target the new chat completions behavior; fixtures simplified and mocks now produce well-formed responses including a non-null request_id.



openshift-ci bot commented Jan 7, 2026

Hi @major. Thanks for your PR.

I'm waiting for a lightspeed-core member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


coderabbitai bot commented Jan 7, 2026

Walkthrough

Replaces the temporary-agent/stateless inference path in the RLS API v1 endpoint with a direct client.chat.completions.create call using constructed system and user messages; removes session/tool handling; adds defensive checks on the LLM response; and updates tests to mock chat.completions interactions.

Changes

Changes by cohort / file(s):

  • Endpoint — inference refactor (src/app/endpoints/rlsapi_v1.py): Replaces the get_temp_agent/session-based flow with direct client.chat.completions.create using MessageOpenAISystemMessageParam and MessageOpenAIUserMessageParam; removes session/tool usage; adds defensive checks for response.choices, message, and content; returns the raw content string.
  • Tests — LLM mocking updated (tests/unit/app/endpoints/test_rlsapi_v1.py): Removes agent-based fixtures/mocks and adds _setup_chat_completions_mock; updates mock_llm_response_fixture and mock_empty_llm_response_fixture signatures (no prepare_agent_mocks); mocks chat.completions.create() for success, empty choices, and APIConnectionError; asserts a non-null request_id.

Sequence Diagram(s)

sequenceDiagram
  participant Client as HTTP Client
  participant Endpoint as RLSAPI v1 Endpoint
  participant SDK as LLM SDK Client
  participant LLM as Chat Completions Service

  Client->>Endpoint: POST /infer (payload)
  Endpoint->>Endpoint: build sys_msg, user_msg
  Endpoint->>SDK: client.chat.completions.create(messages=[sys_msg,user_msg])
  SDK->>LLM: relay chat completion request
  alt success
    LLM-->>SDK: response (choices -> message -> content)
    SDK-->>Endpoint: response
    Endpoint->>Client: 200 { content: "<raw string>", request_id, ... }
  else error or empty
    LLM-->>SDK: error or empty choices
    SDK-->>Endpoint: error/empty
    Endpoint->>Client: 200 { content: "", request_id, ... } or error handling
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
🚥 Pre-merge checks: ✅ Passed checks (3 passed)

  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title directly and accurately describes the main change: replacing the AsyncAgent abstraction with direct chat.completions API calls in the rlsapi endpoint.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which meets the required threshold of 80.00%.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/app/endpoints/rlsapi_v1.py (2)

82-128: Refactoring successfully avoids the Agent API compatibility issues, but integration tests are outdated.

The migration from the agent-based API to client.chat.completions.create() is an excellent solution that sidesteps the initialize() issues mentioned in the PR motivation. The chat.completions API is available in llama-stack-client 0.3.5 (confirmed by imports from llama_stack_client.types.chat.completion_create_params), and unit tests validate the new implementation.

However, the integration tests in tests/integration/endpoints/test_rlsapi_v1_integration.py are now outdated. They still mock get_temp_agent and the agent-based flow, while the actual endpoint code no longer uses the Agent API. These tests should be updated to mock client.chat.completions.create() instead to reflect the current implementation.
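
As a concrete illustration of the imports mentioned above, constructing the typed messages might look roughly like this (hedged sketch: the prompt strings are placeholders and only the message construction is shown):

from llama_stack_client.types.chat.completion_create_params import (
    MessageOpenAISystemMessageParam,
    MessageOpenAIUserMessageParam,
)

# Placeholder prompt text; the endpoint's real system prompt is not reproduced here.
sys_msg = MessageOpenAISystemMessageParam(role="system", content="You are a helpful assistant.")
user_msg = MessageOpenAIUserMessageParam(role="user", content="How do I list files on RHEL?")
# These are then passed as messages=[sys_msg, user_msg] to client.chat.completions.create().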


1-191: Update test fixtures to match the new chat.completions API refactoring.

The rlsapi_v1 endpoint was refactored to use the Llama Stack chat completions API directly instead of the agent-based approach. The test fixtures must be updated to reflect this:

  1. Remove the patch for get_temp_agent (not used in the refactored code)
  2. Configure mock_client.chat.completions.create to return a properly structured response with:
    • response.choices[0].message.content containing the LLM response text
  3. Update mock_empty_llm_response fixture similarly to test empty/None content handling

The test structure should mock the chat completions call directly rather than the agent turn-based API.

🧹 Nitpick comments (1)
src/app/endpoints/rlsapi_v1.py (1)

116-128: Consider simplifying defensive null checks.

The defensive pattern using getattr() is safe and explicit, which is appropriate given the library compatibility concerns mentioned in the PR. However, if you prefer more concise code, you could rely on exception handling instead:

try:
    return str(response.choices[0].message.content)
except (IndexError, AttributeError):
    return ""

This is purely stylistic—the current implementation is clear and correct.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2adb747 and 21480cd.

📒 Files selected for processing (1)
  • src/app/endpoints/rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and module logger pattern with standard log levels: debug, info, warning, error

Files:

  • src/app/endpoints/rlsapi_v1.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
src/**/{client,app/endpoints/**}.py

📄 CodeRabbit inference engine (CLAUDE.md)

Handle APIConnectionError from Llama Stack in integration code

Files:

  • src/app/endpoints/rlsapi_v1.py
🪛 GitHub Actions: Integration tests
src/app/endpoints/rlsapi_v1.py

[error] 1-1: AttributeError: module 'app.endpoints.rlsapi_v1' does not have the attribute 'get_temp_agent'. Tests patch this symbol during setup but it is missing in the module. Command: 'uv run pytest tests/integration --cov=src --cov=runner --cov-report term-missing'.

🪛 GitHub Actions: Unit tests
src/app/endpoints/rlsapi_v1.py

[error] 1-1: AttributeError: module app.endpoints.rlsapi_v1 does not have the attribute 'get_temp_agent'.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: build-pr
🔇 Additional comments (3)
src/app/endpoints/rlsapi_v1.py (3)

8-15: LGTM! Import changes support the new chat completions approach.

The addition of MessageOpenAISystemMessageParam and MessageOpenAIUserMessageParam types is appropriate for constructing properly-typed chat messages.


49-79: LGTM! Clean helper function with proper error handling.

The function correctly constructs the model identifier and handles missing configuration gracefully with appropriate error messages and status codes.
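
The helper itself is outside this excerpt; as a rough idea only, such a helper often looks like the sketch below (hypothetical: select_model_id and the configuration attributes are made-up names, and the status code is chosen for illustration):

from fastapi import HTTPException, status

def select_model_id(config) -> str:
    """Hypothetical sketch: build a provider/model identifier, failing clearly when config is missing."""
    if config is None or not getattr(config, "default_model", None):
        raise HTTPException(
            status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
            detail="No default model is configured for inference",
        )
    return f"{config.default_provider}/{config.default_model}"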


131-191: LGTM! Endpoint properly integrates the new chat completions flow.

The endpoint correctly handles authentication, authorization, error cases, and empty responses. The error handling for APIConnectionError follows the coding guidelines, converting it to an appropriate 503 ServiceUnavailableResponse.
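
For readers unfamiliar with that guideline, the pattern being praised is roughly the following (hedged sketch: the APIConnectionError import location is assumed, and HTTPException(503) stands in for the PR's ServiceUnavailableResponse):

from fastapi import HTTPException
from llama_stack_client import APIConnectionError  # import path assumed

async def call_llm(client, model_id: str, sys_msg, user_msg) -> str:
    """Sketch: translate backend connection failures into a 503 for API callers."""
    try:
        response = await client.chat.completions.create(
            model=model_id, messages=[sys_msg, user_msg]
        )
    except APIConnectionError as exc:
        raise HTTPException(
            status_code=503, detail="Unable to reach the inference backend"
        ) from exc
    return str(response.choices[0].message.content) if response.choices else ""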


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)

45-78: Consider adding spec to mock objects for better test reliability.

The mock objects are created without spec or spec_set parameters. While this provides flexibility, it means attribute access typos won't be caught during testing. Consider using spec with the actual llama-stack-client types to ensure the mocks accurately reflect the real API surface.

Example: Adding spec to mocks
from llama_stack_client.types import CompletionMessage, ChatCompletionMessageParam

# Create mock message with spec
mock_message = mocker.Mock(spec=CompletionMessage)
mock_message.content = "This is a test LLM response."

Note: You may need to import the appropriate types from llama-stack-client to use as specs.


81-123: Reduce duplication across mock fixtures.

The three mock fixtures (mock_llm_response, mock_empty_llm_response, mock_api_connection_error) share substantial setup code. Consider extracting the common client/chat/completions mock structure into a helper function that accepts the desired create() behavior as a parameter.

Example: Helper function to reduce duplication
def _setup_chat_completions_mock(mocker: MockerFixture, create_behavior):
    """Helper to set up chat.completions mock with custom create() behavior."""
    mock_completions = mocker.Mock()
    mock_completions.create = create_behavior
    
    mock_chat = mocker.Mock()
    mock_chat.completions = mock_completions
    
    mock_client = mocker.Mock()
    mock_client.chat = mock_chat
    
    mock_client_holder = mocker.Mock()
    mock_client_holder.get_client.return_value = mock_client
    mocker.patch(
        "app.endpoints.rlsapi_v1.AsyncLlamaStackClientHolder",
        return_value=mock_client_holder,
    )
    return mock_client

@pytest.fixture(name="mock_llm_response")
def mock_llm_response_fixture(mocker: MockerFixture) -> None:
    mock_message = mocker.Mock()
    mock_message.content = "This is a test LLM response."
    mock_choice = mocker.Mock()
    mock_choice.message = mock_message
    mock_response = mocker.Mock()
    mock_response.choices = [mock_choice]
    
    _setup_chat_completions_mock(
        mocker, 
        mocker.AsyncMock(return_value=mock_response)
    )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 21480cd and fbdbcd7.

📒 Files selected for processing (1)
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (2)
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests; do not use unittest framework
Unit tests must achieve 60% code coverage; integration tests must achieve 10% coverage

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest-mock with AsyncMock objects for mocking in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to src/**/{client,app/endpoints/**}.py : Handle `APIConnectionError` from Llama Stack in integration code

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` with AsyncMock objects for mocking in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
src/client.py (1)
  • get_client (70-84)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: build-pr
🔇 Additional comments (2)
tests/unit/app/endpoints/test_rlsapi_v1.py (2)

218-219: Good defensive assertions for request_id validation.

The added assertions ensure that request_id is always present and properly formatted using check_suid. This improves test coverage and validates the stateless request tracking behavior.
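
For reference, the assertions in question are presumably along these lines (sketch: the response object and field access are assumed):

# Assumed shape; the actual test may read request_id from a parsed response model or dict.
assert response.request_id is not None
assert check_suid(response.request_id)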


108-110: The APIConnectionError(request=mocker.Mock()) instantiation is correct and matches the expected constructor signature for llama-stack-client 0.3.5. The request parameter is accepted by the constructor, and this pattern is used consistently across the test suite (e.g., in test_info.py, test_query_v2.py, test_shields.py, and others). Both mocker.Mock() and None are valid values for the request parameter in this codebase.
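
In fixture form, that constructor call typically ends up as the side effect of the mocked create(), roughly like this (sketch reusing the _setup_chat_completions_mock helper proposed above; the APIConnectionError import path is assumed):

import pytest
from pytest_mock import MockerFixture
from llama_stack_client import APIConnectionError  # import path assumed

@pytest.fixture(name="mock_api_connection_error")
def mock_api_connection_error_fixture(mocker: MockerFixture) -> None:
    """Sketch: make the mocked chat.completions.create() raise APIConnectionError."""
    _setup_chat_completions_mock(
        mocker,
        mocker.AsyncMock(side_effect=APIConnectionError(request=mocker.Mock())),
    )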

Switch the /v1/infer endpoint from using the AsyncAgent abstraction to
the direct chat.completions.create() API.

The AsyncAgent.create_turn() method internally calls self.initialize()
which does not exist in llama-stack-client 0.3.5, causing AttributeError
at runtime.

The chat completions API is a better fit for the stateless /infer endpoint:
- Simpler for single-turn inference (no session management)
- Avoids the broken Agent abstraction
- Fewer moving parts

Test fixtures updated with:
- spec=OpenAIChatCompletion/OpenAIChatCompletionChoice for type safety
- Extracted common mock setup into helper to reduce duplication

Signed-off-by: Major Hayden <major@redhat.com>
@major major force-pushed the fix/rlsapi-use-chat-completions branch from fbdbcd7 to 4c46cea on January 7, 2026 at 20:37
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/app/endpoints/rlsapi_v1.py (1)

1-191: Critical: Integration tests failing – must be updated to match new implementation.

The integration tests in tests/integration/endpoints/test_rlsapi_v1_integration.py attempt to patch get_temp_agent at lines 106 and 241, but this function is not used in the refactored infer_endpoint. The endpoint now uses AsyncLlamaStackClientHolder().get_client() with direct chat.completions.create() calls instead of the agent-based approach.

The test fixtures must be updated to mock the chat completions client and response structure rather than the agent and its turn output.

🧹 Nitpick comments (1)
src/app/endpoints/rlsapi_v1.py (1)

102-128: Consider using direct attribute access instead of getattr() pattern.

The defensive getattr() checks (lines 120, 124) and str() conversion (line 128) suggest uncertainty about the response structure. Consider:

  1. Import OpenAIChatCompletion type (available in llama_stack_client.types.chat.completion_create_response)
  2. Add type hint: response: OpenAIChatCompletion = await client.chat.completions.create(...)
  3. Replace getattr() with direct attribute access: choice.message.content
  4. Let typed access raise AttributeError if structure is invalid, or handle specific exceptions explicitly

This improves type safety and makes the code more maintainable. The current pattern obscures whether we're guarding against None values, missing attributes, or API inconsistencies.

♻️ Proposed refactoring with proper types

Add import at the top of the file:

from llama_stack_client.types.chat.completion_create_response import (
    OpenAIChatCompletion,
)

Then refactor the response handling:

-    response = await client.chat.completions.create(
+    response: OpenAIChatCompletion = await client.chat.completions.create(
         model=model_id,
         messages=[sys_msg, user_msg],
     )
 
     if not response.choices:
         return ""
 
     choice = response.choices[0]
-    message = getattr(choice, "message", None)
-    if message is None:
+    if not hasattr(choice, "message") or choice.message is None:
         return ""
 
-    content = getattr(message, "content", None)
-    if content is None:
+    if not hasattr(choice.message, "content") or choice.message.content is None:
         return ""
 
-    return str(content)
+    return choice.message.content
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbdbcd7 and 4c46cea.

📒 Files selected for processing (2)
  • src/app/endpoints/rlsapi_v1.py
  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧰 Additional context used
📓 Path-based instructions (5)
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and module logger pattern with standard log levels: debug, info, warning, error

Files:

  • src/app/endpoints/rlsapi_v1.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/rlsapi_v1.py
src/**/{client,app/endpoints/**}.py

📄 CodeRabbit inference engine (CLAUDE.md)

Handle APIConnectionError from Llama Stack in integration code

Files:

  • src/app/endpoints/rlsapi_v1.py
tests/{unit,integration}/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/{unit,integration}/**/*.py: Use pytest for all unit and integration tests; do not use unittest framework
Unit tests must achieve 60% code coverage; integration tests must achieve 10% coverage

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest-mock with AsyncMock objects for mocking in tests

Files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧠 Learnings (2)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to src/**/{client,app/endpoints/**}.py : Handle `APIConnectionError` from Llama Stack in integration code

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/**/*.py : Use `pytest-mock` with AsyncMock objects for mocking in tests

Applied to files:

  • tests/unit/app/endpoints/test_rlsapi_v1.py
🧬 Code graph analysis (1)
tests/unit/app/endpoints/test_rlsapi_v1.py (1)
src/client.py (1)
  • get_client (70-84)
🪛 GitHub Actions: Integration tests
src/app/endpoints/rlsapi_v1.py

[error] 1-1: AttributeError during tests: module 'app.endpoints.rlsapi_v1' does not have the attribute 'get_temp_agent'. This caused multiple integration tests (pytest) to fail when patching get_temp_agent. Command: 'uv run pytest tests/integration --cov=src --cov=runner --cov-report term-missing'. Ensure get_temp_agent is defined or correctly imported in rlsapi_v1.py.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (9)
src/app/endpoints/rlsapi_v1.py (2)

8-8: LGTM! Missing import now properly added.

The Any type is used in the existing infer_responses dictionary (line 38: dict[int | str, dict[str, Any]]) but was not previously imported.
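
Concretely, the annotation in question needs that import, roughly as sketched here (the dictionary contents are omitted):

from typing import Any

# Module-level response catalog whose annotation requires Any (entries omitted in this sketch).
infer_responses: dict[int | str, dict[str, Any]] = {}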


12-15: LGTM! Proper type hints for message parameters.

The imports provide type safety for the system and user message construction used in the direct chat completions API call.

tests/unit/app/endpoints/test_rlsapi_v1.py (7)

6-14: LGTM! Proper imports for the new mocking strategy.

The imports support the refactored mocking approach:

  • Any is used for the flexible create_behavior parameter (line 40)
  • OpenAIChatCompletion and OpenAIChatCompletionChoice provide proper specs for mock objects

40-62: Return type annotation already present; no change needed.

The helper function works correctly and properly mocks the entire chain. The coding guidelines require complete type annotations, and _setup_chat_completions_mock already declares an explicit -> None return type, so nothing further is required here.


40-62: LGTM! Well-structured mocking helper.

The function correctly mocks the entire chain from AsyncLlamaStackClientHolder down to chat.completions.create(), with proper type hints and documentation. The use of regular Mock() for synchronous intermediate objects and accepting flexible create_behavior for the async create() method is appropriate.


76-87: LGTM! Fixture properly updated for chat completions API.

The mock correctly constructs the response structure (response.choices[0].message.content) with appropriate specs for type safety. The removal of the prepare_agent_mocks parameter aligns with the elimination of the agent-based approach.


91-96: LGTM! Properly tests empty response handling.

The fixture correctly simulates an empty choices list to test the defensive check in retrieve_simple_response.


100-105: LGTM! Properly tests API connection error handling.

The fixture correctly simulates APIConnectionError through the chat completions mock, testing the error handling path that returns a 503 response.


200-200: LGTM! Good defensive assertion.

Adding an explicit check for request_id is not None before validating the SUID format improves test clarity and error messages.
