Refine optional provider handling and expand CI coverage #6

CaliLuke · 2025-10-12T19:05:36Z

Summary

allow direct LLMSettings(provider=...) construction by enabling name population
guard optional embedder/reranker imports and raise informative errors when extras are missing
move heavy test-only dependencies to the dev extra and adjust build artefacts accordingly
add an AddEpisodePayload entry point, new unit coverage, and tighten Pydantic AI fallback behaviour
update CI workflow to run on all branches and tolerate skipped integration suites

Testing

uv run pytest tests/unit/providers/test_factory.py
uv run pytest tests/unit/settings/test_llm_settings.py
uv run pytest tests/unit/llm_client/test_litellm_client.py::test_litellm_client_validates_response_model
ty check .

Summary by CodeRabbit

New Features
- App-wide Graphium client lifecycle for improved stability and performance.
- Add episodes via a structured payload model (legacy method preserved).
Improvements
- Optional AI providers now load gracefully with clearer guidance when missing.
- Smarter provider-specific API key and base-URL resolution.
- Consolidated search behavior routed through a unified service.
Bug Fixes
- Clear endpoint now returns a success message.
Documentation
- Added engineering plan, code-review guidance, and complexity report.
Tests
- Conditional skips and guards for optional SDKs; improved integration test handling.
Chores
- CI workflow update, pre-commit formatting/linting, and dependency upgrades.

coderabbitai · 2025-10-12T19:05:57Z

Walkthrough

Refactors search orchestration to use SearchService, adds AddEpisodePayload and payload-based add flow, introduces FastAPI app-scoped GraphiumClient lifecycle and request-based dependency, improves optional-provider handling with missing-provider shims and clearer import errors, enhances LLM settings resolution, adds pre-commit (ruff), broadens CI triggers and test guards, and updates many tests to skip when optional SDKs are absent.

Changes

Cohort / File(s)	Summary
CI & Tooling `.github/workflows/ci.yml`, `.pre-commit-config.yaml`, `pyproject.toml`	CI now triggers on all branches; targeted-test step adds guarded exit-code handling to allow skipping/unavailable integration tests. Adds pre-commit config with ruff-format and ruff (--fix, exit-on-fix). Bumps deps and Python/tooling settings.
Docs & Plans `CODE_REVIEW.md`, `ENGINEERING_PLAN.md`, `docs/reports/code-complexity.md`	Adds engineering plan, code-review notes, and complexity report; documents intended refactors and lifecycle changes (no functional code).
Search Orchestration `graphium_core/orchestration/search_orchestrator.py` (removed), `graphium_core/orchestration/__init__.py`, `graphium_core/graphium.py`	Removes SearchOrchestrator and its export; Graphium now uses SearchService and exposes add_episode_payload plus payload-driven add_episode.
Payload Models `graphium_core/orchestration/payloads.py`, `tests/unit/graphium/test_add_episode_payload.py`	Adds AddEpisodePayload Pydantic model, updates EpisodePayload config, and tests for payload delegation and field propagation.
Optional Provider Handling / Shims `graphium_core/utils/optional_dependencies.py`, `graphium_core/utils/__init__.py`, `graphium_core/embedder/providers/__init__.py`, `graphium_core/cross_encoder/rerankers/__init__.py`, `graphium_core/cross_encoder/rerankers/gemini.py`, `graphium_core/providers/factory.py`	Introduces missing_provider shim and re-exports it; converts eager optional imports to try/except with shims or clearer Install guidance; TYPE_CHECKING-aware genai typing in gemini. Factory raises descriptive ImportError with install hints when optional deps missing.
Search Gathering Shared Utility `graphium_core/search/shared.py`, `graphium_core/search/nodes.py`, `graphium_core/search/edges.py`	Adds gather_search_results utility; nodes/edges refactored to use it and remove local gather* helpers and semaphore_gather usage.
LLM Settings `graphium_core/settings.py`, `tests/unit/settings/test_llm_settings.py`	Adds provider-aware helpers _provider_api_keys and _provider_base_urls; extends resolved_api_key/base_url resolution logic and Settings config; tests added for key/URL resolution scenarios.
Pydantic AI guard `graphium_core/llm_client/pydantic_ai_adapter.py`	Skip Pydantic AI invocation early when provider API key absent (debug log + return None).
FastAPI Graphium Client Lifecycle `server/graph_service/graphium_client.py`, `server/graph_service/main.py`, `server/graph_service/routers/ingest.py`, `server/graph_service/routers/retrieve.py`	Introduces app-scoped GraphiumClient stored on app.state, lifespan-managed init/close, request-based dependency retrieval returning HTTP 503 if absent. Renames ZepGraphium → GraphiumClient and dependency alias; updates route signatures and return in clear endpoint.
MCP Server types & enqueue usage `mcp_server/graphium_mcp/state.py`, `mcp_server/graphium_mcp/tools.py`	Replaces EpisodeProcessor Callable with runtime-checkable Protocol including queue_metadata attribute; attach metadata to cast task and enqueue that task.
Test infra: guards & conftest `conftest.py`, `tests/integration/__init__.py`, `tests/integration/conftest.py`, `tests/integration/`, `mcp_server/tests/`, `tests/unit/*`	Adds pytest.importorskip and module-level RUN_ guards across many tests to skip when optional SDKs or flags are missing; conftest enhancements for plugin wiring and teardown (session/client restoration). Removes some integration conftest module-level declarations.
Unit tests for providers/embedder/rerankers `tests/unit/cross_encoder/test_`, `tests/unit/embedder/test_`, `tests/unit/providers/test_factory.py`	Add pytest.importorskip guards to skip tests when optional SDKs (sentence-transformers, google.genai, voyageai, openai, litellm, fastmcp) are not present.
Misc `mcp_server/tests/test_health_endpoint.py`, `mcp_server/tests/test_server_smoke.py`, `mcp_server/tests/test_streamable_http.py`	Add early pytest.importorskip for fastmcp and module-level skip guard for RUN_MCP_SMOKE_TESTS; adjust test import-time behavior.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant Client as HTTP Client
  participant FastAPI as FastAPI App
  participant Lifespan as Lifespan Context
  participant State as app.state
  participant Graphium as GraphiumClient

  rect rgb(235, 245, 255)
    note over FastAPI,Lifespan: App startup
    FastAPI->>Lifespan: enter lifespan(app)
    Lifespan->>Graphium: construct GraphiumClient(settings...)
    Graphium-->>Lifespan: instance
    Lifespan->>State: state.graphium_client = Graphium
    Lifespan-->>FastAPI: yield
  end

  rect rgb(240, 255, 240)
    note over Client,FastAPI: Request handling
    Client->>FastAPI: Route call
    FastAPI->>State: get graphium_client
    State-->>FastAPI: GraphiumClient
    FastAPI->>Graphium: perform operation (search / add_episode / ingest)
    Graphium-->>FastAPI: result
    FastAPI-->>Client: response
  end

  rect rgb(255, 240, 240)
    note over FastAPI,Lifespan: App shutdown
    FastAPI->>Lifespan: exit lifespan
    Lifespan->>State: get graphium_client
    Lifespan->>Graphium: close()
    Graphium-->>Lifespan: closed
  end

sequenceDiagram
  autonumber
  participant Caller as API/Service
  participant Graphium as Graphium
  participant Payload as AddEpisodePayload
  participant Orchestrator as EpisodeOrchestrator

  Caller->>Graphium: add_episode(args...)
  Graphium->>Payload: construct AddEpisodePayload from args
  Graphium->>Graphium: add_episode_payload(payload)
  Graphium->>Orchestrator: orchestrator.add_episode(payload fields...)
  Orchestrator-->>Graphium: AddEpisodeResults
  Graphium-->>Caller: AddEpisodeResults

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas to focus review on:

Graphium ↔ SearchService refactor and removal of SearchOrchestrator usages (graphium_core/graphium.py).
New AddEpisodePayload model and its uses and tests (payloads.py, tests/unit/graphium/test_add_episode_payload.py).
Optional-dependency shims and factory import-guard messages (utils/optional_dependencies.py, embedder/providers/init.py, providers/factory.py).
FastAPI lifespan/client lifecycle and request dependency changes (server/graph_service/*).
gather_search_results behavior and BFS seed integration (search/shared.py, search/nodes.py, search/edges.py).
Tests that now rely on pytest.importorskip—ensure CI environments have expected extras or guards behave as intended.

Possibly related PRs

Refactor: typed orchestration payloads and MCP fixes #2 — Mirrors the orchestration refactor: replaces SearchOrchestrator with SearchService and introduces AddEpisodePayload (strong code overlap).
Restructure tests into unit, integration, and e2e tiers #5 — Restructures tests and CI targeting; related to test guards and CI changes present here.

Poem

I nibble code with careful paws and cheer,
Payloads hop in — the paths are clear.
Clients snug in app-state burrows deep,
Optional friends wake when installs seep.
Ruff brushes fur; tests skip where needed—hop! 🐇✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main changes: refining optional provider handling (wrapping imports, adding error guidance) and expanding CI coverage (running on all branches, handling integration test skips).

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch refactor/core-hardening

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

graphium_core/embedder/providers/__init__.py (1)

8-17: Extract _missing_provider to eliminate duplication.

This is identical to the _missing_provider helper in graphium_core/cross_encoder/rerankers/__init__.py (lines 9-18). Please see my comment on that file for the suggested refactor to a shared utility module.

🧹 Nitpick comments (4)

conftest.py (1)

58-63: LGTM! The unused config parameter is expected for pytest hooks.

The static analysis hint (ARG001) is a false positive. The config parameter is part of the standard pytest_collection_modifyitems hook signature and must be present even if unused. Pytest uses introspection to call hooks, so the signature must match the expected interface.

The hook logic correctly provides a fallback for unit tests with async behavior when pytest-asyncio is unavailable, complementing the stricter requirement in tests/integration/__init__.py.

Optional refinement: If you prefer to eliminate the linter warning, you could omit config from the signature entirely—pytest's hook system tolerates a subset of parameters. However, including it documents the full hook interface, so the current approach is perfectly acceptable.
graphium_core/cross_encoder/rerankers/gemini.py (1)
14-22: Simplify the typing.cast pattern for clarity.

The pattern typing.cast(typing.Any, None) is unusual and potentially confusing. When the import fails, these variables are simply None at runtime, and casting None to Any doesn't provide meaningful type information.

Consider these alternatives for cleaner type hints:

Option 1: Use direct None assignment
 if TYPE_CHECKING:  # pragma: no cover - typing only
     try:
         from google import genai as genai  # type: ignore[import-not-found]
         from google.genai import types as types  # type: ignore[import-not-found]
     except ImportError:  # pragma: no cover - optional dependency missing at type time
-        genai = typing.cast(typing.Any, None)
-        types = typing.cast(typing.Any, None)
+        genai = None  # type: ignore[assignment]
+        types = None  # type: ignore[assignment]
 else:  # pragma: no cover - runtime placeholders
-    genai = typing.cast(typing.Any, None)
-    types = typing.cast(typing.Any, None)
+    genai = None
+    types = None
Option 2: Use a proper typing.Any annotation
 if TYPE_CHECKING:  # pragma: no cover - typing only
     try:
         from google import genai as genai  # type: ignore[import-not-found]
         from google.genai import types as types  # type: ignore[import-not-found]
     except ImportError:  # pragma: no cover - optional dependency missing at type time
-        genai = typing.cast(typing.Any, None)
-        types = typing.cast(typing.Any, None)
+        genai: typing.Any = None
+        types: typing.Any = None
 else:  # pragma: no cover - runtime placeholders
-    genai = typing.cast(typing.Any, None)
-    types = typing.cast(typing.Any, None)
+    genai: typing.Any = None
+    types: typing.Any = None
Both options are clearer than using cast(Any, None), which conflates the concepts of casting and type annotation.
server/graph_service/main.py (1)

14-18: Consider error handling for client initialization.

If GraphiumClient construction or build_indices_and_constraints fails during startup, the application will fail to start. While this is appropriate for critical initialization, consider whether explicit error logging or a more informative startup failure message would help operators diagnose issues.

ENGINEERING_PLAN.md (1)

1-150: Well-structured engineering plan with clear milestones.

The document provides an excellent roadmap for the refactoring efforts, with:

Clear goals and acceptance criteria for each milestone

Concrete code examples for implementation

Progress tracking via task checkboxes

Validation steps to ensure quality

The plan aligns well with the actual code changes in this PR (Milestones 2-5) and provides valuable context for future work.

Consider adding a note about the required extras (e.g., [openai], [gemini]) needed to run the full test suite, as mentioned in Milestone 4's validation section.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f43ef4b and 9028dc3.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (37)

.github/workflows/ci.yml (2 hunks)
.pre-commit-config.yaml (1 hunks)
CODE_REVIEW.md (1 hunks)
ENGINEERING_PLAN.md (1 hunks)
conftest.py (2 hunks)
docs/reports/code-complexity.md (1 hunks)
graphium_core/cross_encoder/rerankers/__init__.py (1 hunks)
graphium_core/cross_encoder/rerankers/gemini.py (1 hunks)
graphium_core/embedder/providers/__init__.py (1 hunks)
graphium_core/graphium.py (6 hunks)
graphium_core/llm_client/pydantic_ai_adapter.py (1 hunks)
graphium_core/orchestration/__init__.py (0 hunks)
graphium_core/orchestration/payloads.py (1 hunks)
graphium_core/orchestration/search_orchestrator.py (0 hunks)
graphium_core/providers/factory.py (3 hunks)
graphium_core/settings.py (2 hunks)
mcp_server/graphium_mcp/state.py (1 hunks)
mcp_server/graphium_mcp/tools.py (1 hunks)
mcp_server/tests/test_health_endpoint.py (1 hunks)
mcp_server/tests/test_server_smoke.py (1 hunks)
mcp_server/tests/test_streamable_http.py (1 hunks)
pyproject.toml (3 hunks)
server/graph_service/graphium_client.py (3 hunks)
server/graph_service/main.py (1 hunks)
server/graph_service/routers/ingest.py (4 hunks)
server/graph_service/routers/retrieve.py (3 hunks)
tests/integration/__init__.py (1 hunks)
tests/integration/conftest.py (0 hunks)
tests/unit/cross_encoder/test_bge_reranker_client.py (1 hunks)
tests/unit/cross_encoder/test_gemini_reranker_client.py (1 hunks)
tests/unit/embedder/test_embeddinggemma.py (1 hunks)
tests/unit/embedder/test_openai.py (1 hunks)
tests/unit/embedder/test_voyage.py (1 hunks)
tests/unit/graphium/test_add_episode_payload.py (1 hunks)
tests/unit/mcp/test_episode_queue.py (1 hunks)
tests/unit/providers/test_factory.py (1 hunks)
tests/unit/settings/test_llm_settings.py (1 hunks)

💤 Files with no reviewable changes (3)

tests/integration/conftest.py
graphium_core/orchestration/search_orchestrator.py
graphium_core/orchestration/init.py

🧰 Additional context used

📓 Path-based instructions (15)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indentation, 100-character line length, and prefer single quotes as configured in pyproject.toml
Modules, files, and functions should use snake_case names
Only use # type: ignore as a last resort and include a justification comment

Files:

tests/unit/embedder/test_voyage.py
tests/unit/graphium/test_add_episode_payload.py
graphium_core/providers/factory.py
mcp_server/graphium_mcp/state.py
tests/unit/settings/test_llm_settings.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/cross_encoder/rerankers/gemini.py
server/graph_service/main.py
tests/unit/cross_encoder/test_bge_reranker_client.py
server/graph_service/graphium_client.py
tests/unit/mcp/test_episode_queue.py
tests/unit/providers/test_factory.py
tests/unit/embedder/test_embeddinggemma.py
tests/integration/__init__.py
graphium_core/llm_client/pydantic_ai_adapter.py
graphium_core/graphium.py
server/graph_service/routers/retrieve.py
mcp_server/tests/test_streamable_http.py
graphium_core/embedder/providers/__init__.py
tests/unit/cross_encoder/test_gemini_reranker_client.py
mcp_server/graphium_mcp/tools.py
mcp_server/tests/test_health_endpoint.py
graphium_core/settings.py
tests/unit/embedder/test_openai.py
server/graph_service/routers/ingest.py
mcp_server/tests/test_server_smoke.py
graphium_core/orchestration/payloads.py
conftest.py

tests/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/*.py: Name test functions as test_
Use @pytest.mark.integration for database-reliant tests so CI can gate them
Prefer @pytest.mark.parametrize over manual loops for clearer, isolated assertions
Seed random fixtures so test runs are deterministic
Reuse shared constants (e.g., GROUP_ID) instead of hard-coding values in tests
Register fixtures via pytest_plugins = [...] rather than importing solely for side effects
Remove unnecessary noqa directives once unused imports are gone
For provider-specific integration fixtures, parameterize with sentinels and skip inside the fixture so CI reports a skip when services are disabled

Files:

tests/unit/embedder/test_voyage.py
tests/unit/graphium/test_add_episode_payload.py
tests/unit/settings/test_llm_settings.py
tests/unit/cross_encoder/test_bge_reranker_client.py
tests/unit/mcp/test_episode_queue.py
tests/unit/providers/test_factory.py
tests/unit/embedder/test_embeddinggemma.py
tests/integration/__init__.py
tests/unit/cross_encoder/test_gemini_reranker_client.py
tests/unit/embedder/test_openai.py

{server/**/*.py,graphium_core/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use ruff for formatting/lint and pyright for type checks (make lint), with pyright standard mode in server and basic in core

Files:

graphium_core/providers/factory.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/cross_encoder/rerankers/gemini.py
server/graph_service/main.py
server/graph_service/graphium_client.py
graphium_core/llm_client/pydantic_ai_adapter.py
graphium_core/graphium.py
server/graph_service/routers/retrieve.py
graphium_core/embedder/providers/__init__.py
graphium_core/settings.py
server/graph_service/routers/ingest.py
graphium_core/orchestration/payloads.py

graphium_core/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

graphium_core/**/*.py: In graphium_core use absolute imports (e.g., from graphium_core.prompts.models import Message) instead of deep relative imports
Check for None (e.g., driver is None) before use to satisfy type-narrowing and avoid possibly-unbound warnings

Files:

graphium_core/providers/factory.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/cross_encoder/rerankers/gemini.py
graphium_core/llm_client/pydantic_ai_adapter.py
graphium_core/graphium.py
graphium_core/embedder/providers/__init__.py
graphium_core/settings.py
graphium_core/orchestration/payloads.py

{graphium_core/**/*.py,server/**/*.py,mcp_server/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Import the public facade via from graphium_core.graphium import Graphium instead of importing from graphium_core.init

Files:

graphium_core/providers/factory.py
mcp_server/graphium_mcp/state.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/cross_encoder/rerankers/gemini.py
server/graph_service/main.py
server/graph_service/graphium_client.py
graphium_core/llm_client/pydantic_ai_adapter.py
graphium_core/graphium.py
server/graph_service/routers/retrieve.py
mcp_server/tests/test_streamable_http.py
graphium_core/embedder/providers/__init__.py
mcp_server/graphium_mcp/tools.py
mcp_server/tests/test_health_endpoint.py
graphium_core/settings.py
server/graph_service/routers/ingest.py
mcp_server/tests/test_server_smoke.py
graphium_core/orchestration/payloads.py

**/*.md

📄 CodeRabbit inference engine (AGENTS.md)

Tag fenced Markdown code blocks with an explicit language (text, ini, etc.) to satisfy linting

Files:

docs/reports/code-complexity.md
ENGINEERING_PLAN.md
CODE_REVIEW.md

mcp_server/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

mcp_server/**/*.py: Avoid static state bindings in the MCP server; import runtime state dynamically
Use Literal types for transport strings ("streamable-http"|"http"|"stdio") in MCP-related code
MCP server code should live under mcp_server/ with Docker compose service defined there
For MCP server state, import as from . import state and access attributes (e.g., state.graphium_client) rather than binding names from state

Files:

mcp_server/graphium_mcp/state.py
mcp_server/tests/test_streamable_http.py
mcp_server/graphium_mcp/tools.py
mcp_server/tests/test_health_endpoint.py
mcp_server/tests/test_server_smoke.py

{graphium_core/settings.py,server/**/*.py,mcp_server/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Document defaulted environment flags and let caller-provided values take precedence

Files:

mcp_server/graphium_mcp/state.py
server/graph_service/main.py
server/graph_service/graphium_client.py
server/graph_service/routers/retrieve.py
mcp_server/tests/test_streamable_http.py
mcp_server/graphium_mcp/tools.py
mcp_server/tests/test_health_endpoint.py
graphium_core/settings.py
server/graph_service/routers/ingest.py
mcp_server/tests/test_server_smoke.py

{server/**/*.py,mcp_server/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Mirror core environment requirements in services: set LLM_PROVIDER and database credentials for server and MCP

Files:

mcp_server/graphium_mcp/state.py
server/graph_service/main.py
server/graph_service/graphium_client.py
server/graph_service/routers/retrieve.py
mcp_server/tests/test_streamable_http.py
mcp_server/graphium_mcp/tools.py
mcp_server/tests/test_health_endpoint.py
server/graph_service/routers/ingest.py
mcp_server/tests/test_server_smoke.py

server/graph_service/main.py

📄 CodeRabbit inference engine (AGENTS.md)

FastAPI service entry point is server/graph_service/main.py

Files:

server/graph_service/main.py

graphium_core/{llm_client,embedder}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place LLM clients under graphium_core/llm_client/ and embedders under graphium_core/embedder/

Files:

graphium_core/llm_client/pydantic_ai_adapter.py
graphium_core/embedder/providers/__init__.py

graphium_core/graphium.py

📄 CodeRabbit inference engine (AGENTS.md)

Treat graphium_core/graphium.py as the core library facade and delegate to orchestrators

Files:

graphium_core/graphium.py

mcp_server/tests/test_health_endpoint.py

📄 CodeRabbit inference engine (AGENTS.md)

Health endpoint must return 503 until Graphium initialization completes, then 200 with a success message

Files:

mcp_server/tests/test_health_endpoint.py

graphium_core/orchestration/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Bulk ingest, maintenance, and search orchestration should be under graphium_core/orchestration/

Files:

graphium_core/orchestration/payloads.py

{pyproject.toml,Makefile,pytest.ini,conftest.py}

📄 CodeRabbit inference engine (AGENTS.md)

Maintain tooling manifests and test configuration at their canonical locations (repo root for pyproject.toml/Makefile/pytest.ini; conftest.py under tests as needed)

Files:

pyproject.toml
conftest.py

🧠 Learnings (2)

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
PR: CaliLuke/graphium#0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to server/graph_service/main.py : FastAPI service entry point is server/graph_service/main.py

Applied to files:

server/graph_service/main.py

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
PR: CaliLuke/graphium#0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to graphium_core/orchestration/**/*.py : Bulk ingest, maintenance, and search orchestration should be under graphium_core/orchestration/

Applied to files:

graphium_core/graphium.py

🧬 Code graph analysis (13)

tests/unit/graphium/test_add_episode_payload.py (3)

graphium_core/graphium.py (2)

Graphium (41-256)

add_episode_payload (123-139)

graphium_core/nodes.py (1)

EpisodeType (26-59)

graphium_core/orchestration/payloads.py (1)

AddEpisodePayload (16-33)

graphium_core/providers/factory.py (4)

graphium_core/embedder/providers/gemini.py (2)

GeminiEmbedder (39-131)

GeminiEmbedderConfig (34-36)

graphium_core/embedder/providers/embeddinggemma.py (2)

EmbeddingGemmaConfig (18-21)

EmbeddingGemmaEmbedder (24-46)

graphium_core/cross_encoder/rerankers/gemini.py (1)

GeminiReranker (29-141)

graphium_core/cross_encoder/rerankers/bge.py (1)

BGEReranker (20-39)

tests/unit/settings/test_llm_settings.py (1)

graphium_core/settings.py (6)

LLMProvider (45-51)

LLMSettings (64-185)

resolved_api_key (140-156)

resolved_api_key (237-268)

resolved_base_url (117-138)

resolved_base_url (270-287)

graphium_core/cross_encoder/rerankers/__init__.py (3)

graphium_core/embedder/providers/__init__.py (2)

_missing_provider (8-17)

_MissingProvider (9-14)

graphium_core/cross_encoder/rerankers/gemini.py (1)

GeminiReranker (29-141)

graphium_core/cross_encoder/rerankers/bge.py (1)

BGEReranker (20-39)

server/graph_service/main.py (3)

server/graph_service/graphium_client.py (1)

GraphiumClient (16-70)

server/graph_service/routers/ingest.py (1)

lifespan (42-45)

server/graph_service/config.py (1)

get_settings (22-23)

server/graph_service/graphium_client.py (2)

graphium_core/graphium.py (1)

Graphium (41-256)

server/graph_service/dto/retrieve.py (1)

FactResult (16-26)

graphium_core/llm_client/pydantic_ai_adapter.py (1)

graphium_core/settings.py (2)

resolved_api_key (140-156)

resolved_api_key (237-268)

graphium_core/graphium.py (4)

graphium_core/orchestration/payloads.py (1)

AddEpisodePayload (16-33)

graphium_core/orchestration/services/search_service.py (2)

SearchService (21-92)

search_with_config (60-77)

graphium_core/orchestration/models.py (1)

AddEpisodeResults (9-15)

graphium_core/orchestration/episode_orchestrator.py (1)

add_episode (132-270)

server/graph_service/routers/retrieve.py (3)

server/graph_service/graphium_client.py (2)

get_fact_result_from_edge (80-89)

get_entity_edge (31-36)

graphium_core/graphium.py (1)

search (203-214)

server/graph_service/dto/retrieve.py (1)

SearchQuery (8-13)

graphium_core/embedder/providers/__init__.py (4)

graphium_core/cross_encoder/rerankers/__init__.py (2)

_missing_provider (9-18)

_MissingProvider (10-15)

graphium_core/embedder/providers/gemini.py (2)

GeminiEmbedder (39-131)

GeminiEmbedderConfig (34-36)

graphium_core/embedder/providers/voyage.py (2)

VoyageAIEmbedder (33-64)

VoyageAIEmbedderConfig (28-30)

graphium_core/embedder/providers/embeddinggemma.py (2)

EmbeddingGemmaConfig (18-21)

EmbeddingGemmaEmbedder (24-46)

mcp_server/graphium_mcp/tools.py (2)

mcp_server/graphium_mcp/state.py (1)

EpisodeProcessor (16-20)

mcp_server/graphium_mcp/queues.py (1)

enqueue_episode (107-120)

server/graph_service/routers/ingest.py (2)

server/graph_service/graphium_client.py (3)

delete_entity_edge (58-63)

delete_group (38-56)

delete_episodic_node (65-70)

server/graph_service/dto/common.py (1)

Result (8-10)

graphium_core/orchestration/payloads.py (1)

graphium_core/nodes.py (1)

EpisodeType (26-59)

🪛 Ruff (0.13.3)

graphium_core/providers/factory.py

146-149: Avoid specifying long messages outside the exception class

(TRY003)

165-168: Avoid specifying long messages outside the exception class

(TRY003)

214-217: Avoid specifying long messages outside the exception class

(TRY003)

227-230: Avoid specifying long messages outside the exception class

(TRY003)

graphium_core/cross_encoder/rerankers/__init__.py

11-11: Unused method argument: args

(ARG002)

11-11: Unused method argument: kwargs

(ARG002)

12-15: Avoid specifying long messages outside the exception class

(TRY003)

graphium_core/embedder/providers/__init__.py

10-10: Unused method argument: args

(ARG002)

10-10: Unused method argument: kwargs

(ARG002)

11-14: Avoid specifying long messages outside the exception class

(TRY003)

42-42: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

graphium_core/settings.py

133-135: Avoid specifying long messages outside the exception class

(TRY003)

151-153: Avoid specifying long messages outside the exception class

(TRY003)

conftest.py

58-58: Unused function argument: config

(ARG001)

🔇 Additional comments (52)

mcp_server/tests/test_streamable_http.py (1)

10-14: LGTM! Module-level skip guard follows pytest best practices.

The skip guard correctly prevents smoke tests from running unless explicitly enabled via RUN_MCP_SMOKE_TESTS=1. The implementation follows pytest conventions with allow_module_level=True and includes an appropriate pragma comment to prevent coverage false negatives.

tests/unit/embedder/test_openai.py (1)

10-11: Approve import guard
pytest.importorskip in test_openai.py correctly mirrors other provider tests, cleanly skipping the module when the OpenAI SDK is missing.

tests/unit/embedder/test_embeddinggemma.py (1)

5-6: LGTM! Correct placement for optional dependency guard.

The importorskip is correctly placed before the graphium_core imports, ensuring the entire test module is skipped at collection time when litellm is unavailable. This prevents import errors and aligns with the PR's objective to guard optional provider dependencies.

tests/unit/mcp/test_episode_queue.py (1)

7-8: LGTM! Correct optional dependency handling.

The import guard is correctly placed before any imports that require fastmcp, allowing the test suite to gracefully skip when the optional MCP extra is not installed. This aligns with the PR objective to tolerate skipped integration suites in CI.

mcp_server/tests/test_server_smoke.py (1)

4-6: LGTM! Consistent optional dependency handling.

The import guard follows the same pattern as other MCP tests and correctly prevents import failures when fastmcp is unavailable. The reason string appropriately describes the context.

mcp_server/tests/test_health_endpoint.py (1)

4-6: LGTM! Dual-guard pattern is intentional and correct.

The new importorskip guard for fastmcp complements the existing try/except guard for fastapi (lines 7-12). This dual-guard approach is intentional: fastmcp is required by the mcp_server.graphium_mcp imports at lines 15-17, while fastapi is required by TestClient at line 8. The ordering ensures early detection of missing dependencies before subsequent imports fail.

tests/integration/__init__.py (1)

1-13: LGTM! Robust integration test gating.

The two-level guard is well-designed:

Early exit if RUN_INTEGRATION_TESTS != '1' (lines 7-11)

Dependency check for pytest-asyncio only if tests are enabled (line 13)

This prevents unnecessary import checks when integration tests are disabled and provides clear skip messages in both scenarios.

conftest.py (3)

7-9: LGTM! Appropriate test environment defaults.

Disabling optional graph database services by default is a sensible choice for the test environment, allowing tests to run without external dependencies unless explicitly enabled.

14-14: LGTM! Correct use of pytest_plugins.

Registering integration fixtures as a plugin is the recommended approach per the coding guidelines.

16-20: LGTM! Clean optional dependency guard.

The try/except pattern with appropriate pragma comments correctly establishes whether pytest-asyncio is available for use in the collection hook.

mcp_server/graphium_mcp/state.py (2)

15-20: LGTM: Protocol definition enhances type safety.

The migration from a simple type alias to a runtime_checkable Protocol is a solid improvement:

Explicitly documents the queue_metadata attribute requirement

Enables runtime isinstance() checks for validation

Provides better IDE support and type checking

The Protocol contract clearly communicates that implementations need both the callable signature and the metadata attribute, which aligns well with the dynamic attribute assignment pattern used in tools.py.

6-7: LGTM: Import updates support the Protocol definition.

The import changes correctly add Awaitable from collections.abc and Protocol, runtime_checkable from typing, which are necessary for the new Protocol-based definition. The removal of the Callable import (previously used for the type alias) is appropriate.

mcp_server/graphium_mcp/tools.py (1)

109-116: Keep metadata assignment – it’s required
The manual queue_metadata set at lines 110–114 seeds values that _metadata() reads and augments with attempts, so it’s not overwritten and must be retained.

Likely an incorrect or invalid review comment.

pyproject.toml (2)

109-110: LGTM! Type-checking mode upgrade improves code quality.

The upgrade from basic to standard mode and Python 3.12 target aligns with the PR's goal of stricter type enforcement and is consistent with the updated requires-python = ">=3.12".

38-38: Verify dependency versions align with latest stable releases

Upgrade voyageai to >=0.3.5

Upgrade sentence-transformers to >=5.1.1

fastmcp>=2.12.4 is up-to-date ✓

Bump pytest-asyncio to >=1.2.0; no references to the removed event_loop fixture found in tests

Ensure these match the latest PyPI releases.

tests/unit/embedder/test_voyage.py (1)

10-11: LGTM! Proper optional dependency guard.

The pytest.importorskip guard ensures tests are skipped gracefully when the voyageai SDK is not installed, preventing import-time failures and aligning with the PR's optional dependency handling strategy.

tests/unit/cross_encoder/test_gemini_reranker_client.py (1)

10-11: LGTM! Consistent optional dependency guard.

The pytest.importorskip guard for google.genai follows the same pattern as other optional provider tests, ensuring graceful skips when the dependency is unavailable.

.pre-commit-config.yaml (1)

1-7: LGTM! Pre-commit configuration properly set up.

The configuration uses Ruff v0.7.1 (matching pyproject.toml dev dependencies) with appropriate hooks:

ruff-format for consistent code formatting

ruff --fix --exit-non-zero-on-fix to apply fixes and ensure review

This aligns with the PR's goal of stricter tooling and CI hygiene.

graphium_core/llm_client/pydantic_ai_adapter.py (1)

120-127: LGTM! Sensible early exit when API key unavailable.

The early API key check prevents unnecessary Pydantic AI model creation and execution when credentials are missing, with appropriate debug logging. The getattr pattern safely handles both enum and string provider values.

This optimization aligns with the provider-aware key resolution introduced in settings.py and ensures graceful fallback to the legacy flow.

tests/unit/cross_encoder/test_bge_reranker_client.py (1)

6-9: LGTM! Consistent optional dependency guard.

The pytest.importorskip guard for sentence_transformers follows the established pattern for optional dependencies, ensuring tests skip gracefully when the package is unavailable.
.github/workflows/ci.yml (2)
6-6: LGTM! Broader CI coverage on all branches.

Changing the trigger from main to ** ensures CI runs on all branches, catching issues earlier in development. This aligns with the PR's goal of expanding CI coverage.

44-55: Verify exit code 4 handling is intentional.

The script allows exit codes 4 and 5 to be treated as skip/unavailable. However:

Exit code 5: "No tests were collected" (expected for optional integration tests)

Exit code 4: "pytest command line usage error" (indicates a configuration problem)

Allowing exit code 4 might mask actual pytest usage errors.

Consider restricting to only exit code 5:
-          if [ "$status" -eq 5 ] || [ "$status" -eq 4 ]; then
+          if [ "$status" -eq 5 ]; then
             echo "Integration test suite skipped or unavailable (status $status)."
           elif [ "$status" -ne 0 ]; then
             exit "$status"
           fi
If exit code 4 is intentional (e.g., handling specific pytest configuration scenarios), please document why it's allowed.
CODE_REVIEW.md (1)

1-144: LGTM! Well-structured architectural guidance.

This documentation provides clear, actionable recommendations for improving maintainability and performance. The examples are concrete and the prioritization (Critical/High/Medium severity) helps guide implementation.

A few observations:

The singleton pattern recommendation (Issue 1) aligns with the PR's introduction of app-scoped lifecycle management.

The payload-based API pattern (Issue 2) matches the AddEpisodePayload addition referenced in the PR objectives.

The tooling recommendations (pre-commit hooks, stricter type checking) are solid best practices.

graphium_core/cross_encoder/rerankers/__init__.py (1)

21-33: LGTM! Clean optional dependency handling.

The try/except pattern correctly handles optional dependencies, providing shim types when imports fail and only exposing real implementations in __all__ when they succeed. This aligns with the PR objective of guarding optional embedder/reranker imports.

graphium_core/embedder/providers/__init__.py (2)

20-42: LGTM! Optional dependency handling is correctly implemented.

The pattern correctly handles optional embedder providers, with proper fallback to shim types when dependencies are missing.

Regarding Ruff RUF022 (unsorted __all__): This is a false positive. Since __all__ is built dynamically via extend() calls based on successful imports, sorting it statically would break the conditional export logic. This warning can be safely ignored.

36-40: Verify EmbeddingGemma extra name in pyproject.toml
The fallback uses 'sentence-transformers', but embeddinggemma.py doesn’t import that package. Ensure the extra defined under [tool.poetry.extras] matches this name; update the dependency string if it differs.

docs/reports/code-complexity.md (1)

83-83: LGTM! Complexity metrics look healthy.

The server/graph_service/graphium_client.py file shows good maintainability metrics (MI: 60.5/C, CC: 2.2/A), which aligns with the PR's goal of improving code quality.

tests/unit/providers/test_factory.py (1)

4-5: LGTM! Proper test guards for optional dependencies.

The pytest.importorskip guards correctly prevent test failures when optional dependencies are missing, aligning with the PR's objective of handling optional providers gracefully.

Minor note: Line 5's reason mentions 'google-generativeai' but the import check is for 'google.genai'. While these refer to the same package (the import name vs. the PyPI package name), consider being consistent. Either use:

reason='google-genai SDK required...' (matches the extra name), or

Keep as-is if referencing the PyPI package name is preferred.

graphium_core/providers/factory.py (4)

143-149: LGTM: Clear guidance for missing optional dependency.

The import guard provides actionable installation instructions for users attempting to use the Gemini embedder without the required dependency.

162-168: LGTM: Clear guidance for missing optional dependency.

The import guard provides actionable installation instructions for users attempting to use the EmbeddingGemma embedder without the required dependency.

211-217: LGTM: Clear guidance for missing optional dependency.

The import guard provides actionable installation instructions for users attempting to use the Gemini reranker without the required dependency.

224-230: LGTM: Clear guidance for missing optional dependency.

The import guard provides actionable installation instructions for users attempting to use the BGE reranker without the required dependency.

Note: Static analysis flags TRY003 (long messages outside exception class) for all four import guards in this file. For user-facing error messages with installation instructions, inline strings are acceptable despite the style warning.

server/graph_service/graphium_client.py (1)

73-77: LGTM: Appropriate app-scoped client lifecycle.

The runtime lookup pattern correctly retrieves the app-scoped GraphiumClient and provides appropriate error handling when the client is not initialized. This aligns with the lifespan-based initialization in server/graph_service/main.py.

tests/unit/settings/test_llm_settings.py (3)

8-30: LGTM: Comprehensive test isolation.

The autouse fixture ensures clean environment state for each test, preventing environment variable pollution between tests.

33-41: LGTM: Well-designed test helper.

The make_settings helper properly constructs test instances with defaults while allowing overrides, making test cases more readable and maintainable.

44-109: LGTM: Comprehensive test coverage.

The test suite thoroughly exercises LLMSettings resolution logic, including edge cases like ambiguous provider configurations. Test names clearly describe the behavior under test.

server/graph_service/routers/retrieve.py (1)

12-12: LGTM: Consistent dependency updates.

All route handlers have been updated to use GraphiumClientDep, aligning with the client lifecycle changes. The refactor is complete and consistent across the file.

Also applies to: 18-18, 31-31, 37-37, 47-47

server/graph_service/main.py (1)

19-24: Ensure LLM client config is safely mutable at runtime
Directly mutating client.llm_client.config.base_url/api_key and setting client.llm_client.model assumes the LLM client is always initialized with a mutable config; verify GraphiumClient (and its Graphium base) supports runtime updates to these properties or expose a dedicated configuration method.

tests/unit/graphium/test_add_episode_payload.py (1)

1-143: LGTM! Comprehensive test coverage for payload-based episode addition.

The test suite thoroughly validates the AddEpisodePayload integration:

Minimal payload flow with default values

Full payload flow with all optional fields populated

Delegation from legacy add_episode to add_episode_payload

The use of Graphium.__new__ to bypass initialization and inject mock dependencies is an appropriate unit testing pattern that isolates the code under test.

graphium_core/graphium.py (5)

25-26: LGTM! Clean import additions for payload-based API.

The imports of AddEpisodePayload and SearchService align with the architectural changes documented in the engineering plan and enable the new payload-based episode addition flow.

90-93: LGTM! SearchService instantiation follows existing patterns.

The replacement of SearchOrchestrator with SearchService maintains the same initialization pattern (clients and max_coroutines) and successfully simplifies the search layer as documented in Milestone 4 of the engineering plan.

123-139: LGTM! Clean payload delegation to orchestrator.

The add_episode_payload method correctly unpacks all payload fields and forwards them to the existing episode_orchestrator.add_episode method, maintaining consistency with the orchestrator's signature.

158-174: LGTM! Backward-compatible delegation to payload method.

The refactored add_episode maintains the existing signature and delegates to the new payload-based API, ensuring backward compatibility while adopting the cleaner payload pattern internally.

212-246: LGTM! Search methods correctly delegate to SearchService.

All three search methods (search, search_, and get_nodes_and_edges_by_episode) have been correctly updated to delegate to the new search_service, maintaining their existing signatures and behavior.

graphium_core/settings.py (4)

38-42: LGTM! Sensible model config update.

Adding populate_by_name=True enables field access via both the Python attribute name and the environment variable alias, improving ergonomics for programmatic settings construction while maintaining backward compatibility.

102-115: LGTM! Clean helper methods for provider-specific configurations.

The _provider_api_keys() and _provider_base_urls() methods centralize the mapping of provider-specific environment variables, making the resolution logic more maintainable and reducing repetition.

Note: Currently only OpenAI has a provider-specific base URL. This is intentional and appropriate given most providers use their default endpoints.

117-138: LGTM! Explicit base URL resolution with ambiguity detection.

The refactored logic correctly handles:

Direct base_url or litellm_proxy_base_url overrides (highest priority)

Provider-specific URLs when provider is explicitly set

Single provider auto-detection

Clear error for ambiguous multi-provider configurations

Fallback to provider defaults

This significantly improves configuration clarity and reduces the risk of silent misconfigurations.

140-156: LGTM! Explicit API key resolution with ambiguity detection.

The refactored logic mirrors the base URL resolution approach and correctly handles:

Direct api_key override (highest priority)

Provider-specific keys when provider is explicitly set

Single provider auto-detection

Clear error for ambiguous multi-provider configurations

This ensures users receive helpful guidance when their configuration is ambiguous rather than experiencing silent fallback behavior.

graphium_core/orchestration/payloads.py (3)

1-1: LGTM! Updated docstring reflects module purpose.

The docstring accurately describes that this module contains typed payload models shared across orchestration components.

6-13: LGTM! Clean imports and exports.

The import of EpisodeType supports the source field default, and the updated __all__ list correctly exports the new AddEpisodePayload model alongside existing payload types.

16-34: LGTM! Well-structured payload model with appropriate defaults.

The AddEpisodePayload model correctly captures all parameters from the Graphium.add_episode method:

Required fields: name, episode_body, source_description, reference_time

Sensible default: source=EpisodeType.message

Optional fields: group_id, uuid, update_communities, type configurations, edge configurations

The ConfigDict(extra='allow', arbitrary_types_allowed=True) is appropriate given:

entity_types and edge_types contain type[BaseModel] values

edge_type_map uses tuple keys which may not be directly serializable

The payload is used for internal API bridging rather than external serialization

server/graph_service/routers/ingest.py (1)

10-10: LGTM—GraphiumClientDep provides all invoked methods.

graphium_core/cross_encoder/rerankers/__init__.py

… function

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

graphium_core/search/nodes.py (1)
401-416: Restore the semaphore_gather import.

_gather_episode_search_results and _gather_community_search_results still call semaphore_gather, but this module no longer imports it. At runtime those code paths will raise NameError. Please reintroduce the import.
-from graphium_core.search.shared import gather_search_results
+from graphium_core.search.shared import gather_search_results
+from graphium_core.utils.async_utils import semaphore_gather

🧹 Nitpick comments (2)

graphium_core/search/edges.py (1)
312-312: Sort __all__ to appease Ruff.

Ruff’s RUF022 still fires here; swapping to an alphabetic order (e.g. ['_build_edge_search_tasks', 'edge_search']) will keep lint green.
-__all__ = ['edge_search', '_build_edge_search_tasks']
+__all__ = ['_build_edge_search_tasks', 'edge_search']
graphium_core/embedder/providers/__init__.py (1)
26-33: Keep __all__ sorted after optional extends.

Ruff’s RUF022 complains once we append optional exports; calling __all__.sort() (or reassigning the sorted list) after the try/except blocks keeps the dynamic additions lint-compliant.
 else:  # pragma: no cover
     __all__.extend(['EmbeddingGemmaEmbedder', 'EmbeddingGemmaConfig'])
 
+__all__.sort()

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9028dc3 and 0da92d7.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

graphium_core/cross_encoder/rerankers/__init__.py (1 hunks)
graphium_core/embedder/providers/__init__.py (1 hunks)
graphium_core/search/edges.py (3 hunks)
graphium_core/search/nodes.py (2 hunks)
graphium_core/search/shared.py (1 hunks)
graphium_core/utils/__init__.py (1 hunks)
graphium_core/utils/optional_dependencies.py (1 hunks)
pyproject.toml (3 hunks)

🧰 Additional context used

📓 Path-based instructions (8)

**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indentation, 100-character line length, and prefer single quotes as configured in pyproject.toml
Modules, files, and functions should use snake_case names
Only use # type: ignore as a last resort and include a justification comment

Files:

graphium_core/utils/optional_dependencies.py
graphium_core/utils/__init__.py
graphium_core/search/nodes.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/search/shared.py
graphium_core/embedder/providers/__init__.py
graphium_core/search/edges.py

{graphium_core/driver/**,graphium_core/utils/**}

📄 CodeRabbit inference engine (AGENTS.md)

Place side-effectful code only in drivers or adapters (graphium_core/driver, graphium_core/utils); keep helpers elsewhere pure

Files:

graphium_core/utils/optional_dependencies.py
graphium_core/utils/__init__.py

{server/**/*.py,graphium_core/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Use ruff for formatting/lint and pyright for type checks (make lint), with pyright standard mode in server and basic in core

Files:

graphium_core/utils/optional_dependencies.py
graphium_core/utils/__init__.py
graphium_core/search/nodes.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/search/shared.py
graphium_core/embedder/providers/__init__.py
graphium_core/search/edges.py

graphium_core/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

graphium_core/**/*.py: In graphium_core use absolute imports (e.g., from graphium_core.prompts.models import Message) instead of deep relative imports
Check for None (e.g., driver is None) before use to satisfy type-narrowing and avoid possibly-unbound warnings

Files:

graphium_core/utils/optional_dependencies.py
graphium_core/utils/__init__.py
graphium_core/search/nodes.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/search/shared.py
graphium_core/embedder/providers/__init__.py
graphium_core/search/edges.py

{graphium_core/**/*.py,server/**/*.py,mcp_server/**/*.py}

📄 CodeRabbit inference engine (AGENTS.md)

Import the public facade via from graphium_core.graphium import Graphium instead of importing from graphium_core.init

Files:

graphium_core/utils/optional_dependencies.py
graphium_core/utils/__init__.py
graphium_core/search/nodes.py
graphium_core/cross_encoder/rerankers/__init__.py
graphium_core/search/shared.py
graphium_core/embedder/providers/__init__.py
graphium_core/search/edges.py

{pyproject.toml,Makefile,pytest.ini,conftest.py}

📄 CodeRabbit inference engine (AGENTS.md)

Maintain tooling manifests and test configuration at their canonical locations (repo root for pyproject.toml/Makefile/pytest.ini; conftest.py under tests as needed)

Files:

pyproject.toml

graphium_core/search/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Search pipeline components belong under graphium_core/search/

Files:

graphium_core/search/nodes.py
graphium_core/search/shared.py
graphium_core/search/edges.py

graphium_core/{llm_client,embedder}/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Place LLM clients under graphium_core/llm_client/ and embedders under graphium_core/embedder/

Files:

graphium_core/embedder/providers/__init__.py

🧠 Learnings (5)

📓 Common learnings

Learnt from: CR
Repo: CaliLuke/graphium PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to graphium_core/orchestration/**/*.py : Bulk ingest, maintenance, and search orchestration should be under graphium_core/orchestration/

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
Repo: CaliLuke/graphium PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to {graphium_core/**/*.py,server/**/*.py,mcp_server/**/*.py} : Import the public facade via `from graphium_core.graphium import Graphium` instead of importing from graphium_core.__init__

Applied to files:

graphium_core/utils/__init__.py
graphium_core/embedder/providers/__init__.py

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
Repo: CaliLuke/graphium PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to {server/**/*.py,graphium_core/**/*.py} : Use ruff for formatting/lint and pyright for type checks (`make lint`), with pyright standard mode in server and basic in core

Applied to files:

pyproject.toml

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
Repo: CaliLuke/graphium PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to graphium_core/models/**/*.py : Pydantic models in graphium_core/models must use PascalCase with explicit type hints

Applied to files:

pyproject.toml

📚 Learning: 2025-10-12T05:55:20.133Z

Learnt from: CR
Repo: CaliLuke/graphium PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-10-12T05:55:20.133Z
Learning: Applies to graphium_core/driver/**/*.py : If Ty reports unresolved imports for optional extras (Kuzu/FalkorDB/Neptune), install the extra or gate imports behind runtime-only paths

Applied to files:

graphium_core/cross_encoder/rerankers/__init__.py

🧬 Code graph analysis (6)

graphium_core/utils/__init__.py (1)

graphium_core/utils/optional_dependencies.py (1)

missing_provider (8-32)

graphium_core/search/nodes.py (2)

graphium_core/search/shared.py (1)

gather_search_results (11-29)

graphium_core/search/search_utils.py (1)

node_bfs_search (873-959)

graphium_core/cross_encoder/rerankers/__init__.py (1)

graphium_core/utils/optional_dependencies.py (1)

missing_provider (8-32)

graphium_core/search/shared.py (1)

graphium_core/utils/async_utils.py (1)

semaphore_gather (22-52)

graphium_core/embedder/providers/__init__.py (1)

graphium_core/utils/optional_dependencies.py (1)

missing_provider (8-32)

graphium_core/search/edges.py (2)

graphium_core/search/shared.py (1)

gather_search_results (11-29)

graphium_core/search/search_utils.py (1)

edge_bfs_search (529-651)

🪛 Ruff (0.14.3)

graphium_core/utils/optional_dependencies.py

28-28: Unused method argument: args

(ARG002)

28-28: Unused method argument: kwargs

(ARG002)

graphium_core/embedder/providers/__init__.py

32-32: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

graphium_core/search/edges.py

312-312: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

🔇 Additional comments (5)

pyproject.toml (3)
64-66: Verify intentionality: optional providers now duplicated in dev extra.

Lines 64 and 66 add sentence-transformers and voyageai to the dev extra, though they're already listed in [project.optional-dependencies] (lines 40 and 38). This redundancy aligns with the PR objective to move heavy test-only dependencies into dev, but confirm this dual-listing is intentional rather than an oversight.

110-111: Pyright typeCheckingMode set to "standard" for core; verify against coding guidelines.

Lines 110–111 configure Pyright for core with pythonVersion = "3.12" and typeCheckingMode = "standard". Per learnings, pyright should use standard mode in server and basic mode in core. Confirm whether this change to standard mode is intentional to improve type checking rigor, or if it should revert to basic to align with guidelines.

If this is intentional to tighten type checking, add a note to ADRs or commit message explaining the rationale. If it should remain basic, apply this diff:
  [tool.pyright]
  include = ["graphium_core"]
  pythonVersion = "3.12"
- typeCheckingMode = "standard"
+ typeCheckingMode = "basic"
69-69: No breaking changes found—upgrade to pytest-asyncio 1.0.0 is safe.

Verification shows the codebase uses no deprecated APIs. The major v1.0.0 breaking change is removal of the event_loop fixture, which is not used anywhere in this codebase. All async tests use @pytest.mark.asyncio (compatible with v1.0.0) and async fixtures use @pytest_asyncio.fixture (also compatible). The pytest_plugins declarations in integration tests are redundant but harmless in v1.0.0 since the plugin is auto-discovered. No migration work is needed.
graphium_core/search/edges.py (1)

283-295: Shared gather flow looks good.

Centralizing the BFS fallback via gather_search_results keeps this path consistent with node search and drops the bespoke semaphore plumbing—nice tightening.

graphium_core/embedder/providers/__init__.py (1)

10-31: Optional provider shims read cleanly.

Thanks for routing the optional backends through missing_provider; the shims should keep import-time UX consistent when extras are absent.

coderabbitai · 2025-11-08T23:07:20Z

graphium_core/utils/optional_dependencies.py

+    class _MissingProvider:  # pragma: no cover - shim for optional extras
+        def __init__(self, *args, **kwargs):
+            raise ImportError(message)
+
+    _MissingProvider.__name__ = f'Missing{name}'


⚠️ Potential issue | 🟠 Major

Silence Ruff ARG002 on the shim constructor.

The new shim trips Ruff’s ARG002 because *args/**kwargs are unused. That will fail ruff check. Rename the parameters to throwaway identifiers so lint passes.

- class _MissingProvider: # pragma: no cover - shim for optional extras - def __init__(self, *args, **kwargs): + class _MissingProvider: # pragma: no cover - shim for optional extras + def __init__(self, *_args, **_kwargs): raise ImportError(message)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class _MissingProvider: # pragma: no cover - shim for optional extras

def __init__(self, *args, **kwargs):

raise ImportError(message)

_MissingProvider.__name__ = f'Missing{name}'

class _MissingProvider: # pragma: no cover - shim for optional extras

def __init__(self, *_args, **_kwargs):

raise ImportError(message)

_MissingProvider.__name__ = f'Missing{name}'

🧰 Tools

🪛 Ruff (0.14.3)

28-28: Unused method argument: args

(ARG002)

28-28: Unused method argument: kwargs

(ARG002)

🤖 Prompt for AI Agents

In graphium_core/utils/optional_dependencies.py around lines 27 to 31, the shim class _MissingProvider defines __init__(self, *args, **kwargs) which triggers Ruff ARG002 because those parameters are unused; rename them to throwaway identifiers (e.g., *_, **__) so the linter ignores them and keep the body unchanged (raise ImportError(message)) and retain the __name__ assignment.

Luca Candela added 19 commits October 12, 2025 00:00

Use app lifespan to manage ZepGraphium

e03b22a

Return singleton Graphium from dependency

4c9528a

Record router dependency verification

807b7cf

Log milestone 1 validation attempts

bfdd68b

Rename server Graphium wrapper to GraphiumClient

a7325d4

Add AddEpisodePayload model

72fde29

Delegate Graphium add_episode to payload API

9bbc3ac

Mark add_episode delegation complete

48ee491

Add Graphium add_episode payload tests

01c1eb4

Document make test attempt for payload milestone

5de0b06

Set Pyright to Python 3.12 standard mode

01753cb

Tighten LLM settings provider resolution

9901e02

Remove SearchOrchestrator wrapper

6323c46

Add Ruff pre-commit hooks

dbe8fa9

Guard pytest discovery from optional dependencies

bef62ee

Skip asyncio tests when pytest-asyncio missing

ff41c1d

tighten optional providers and expand ci triggers

8823c8c

ci: tolerate skipped integration tests

27638da

ci: treat skipped integration file as non-fatal

9028dc3

coderabbitai bot reviewed Oct 12, 2025

View reviewed changes

graphium_core/cross_encoder/rerankers/__init__.py Outdated Show resolved Hide resolved

refactor: replace custom missing provider implementation with utility…

0da92d7

… function

coderabbitai bot reviewed Nov 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Refine optional provider handling and expand CI coverage #6

Refine optional provider handling and expand CI coverage #6

Uh oh!

CaliLuke commented Oct 12, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 12, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Refine optional provider handling and expand CI coverage #6

Are you sure you want to change the base?

Refine optional provider handling and expand CI coverage #6

Uh oh!

Conversation

CaliLuke commented Oct 12, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 8, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

CaliLuke commented Oct 12, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 12, 2025 •

edited

Loading