Conversation

Contributor

@are-ces are-ces commented Dec 18, 2025

Description

  • Added a small dummy RAG DB for e2e tests.
  • Created e2e tests for FAISS RAG (making sure the retrieved chunks come from the docs in the DB).
  • Added the new required dependencies (pyproject.toml).
  • Updated docs/providers.md; revoked support for shields, as they are currently unusable.
  • Updated the configs in the examples folder and removed old, unsupported configs (llama-stack 0.2.x format).

Test:
Ask the agent "What is the title of the article from Paul?"; the expected answer is "great work".

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

NA

Related Tickets & Documents

  • Related Issue # LCORE-603
  • Closes # LCORE-603

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added RAG storage volume support for Docker deployments
    • Introduced FAISS vector store integration with new test coverage
    • Enhanced provider configurations for Azure, Vertex AI, and other services
  • Improvements

    • Reorganized configuration structure for better modularity and clarity
    • Updated default embeddings and vector store providers
    • Simplified workflow triggers for automated testing
  • Documentation

    • Updated provider status in documentation
  • Chores

    • Added einops dependency


Contributor

coderabbitai bot commented Dec 18, 2025

Walkthrough

Adds FAISS E2E coverage and RAG storage mount. Reorganizes run/example configs to a centralized providers/storage layout with env-driven SQLite paths. Removes multiple legacy example manifests. Updates docs provider statuses, adds einops dependency, expands registered resources/vector store defaults, and adjusts CI workflow trigger.

Changes

Cohort / File(s) Summary
Docker Compose mounts
docker-compose.yaml, docker-compose-library.yaml
Add volume ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z (sketched in context after this table).
Documentation
docs/providers.md
Update provider statuses: llama-guard set to ❌; rag-runtime set to ✅.
Dependency
pyproject.toml
Add dependency: einops>=0.8.1.
Root run config
run.yaml
Introduce telemetry; centralize providers; add FAISS vector_io, rag-runtime, safety/scoring; env-based storage backends and vector_stores defaults.
Example configs — major rewrites
examples/run.yaml, examples/azure-run.yaml, examples/vertexai-run.yaml, examples/vllm-rhaiis-run.yaml, examples/vllm-rhelai-run.yaml, examples/vllm-rhoai-run.yaml
Migrate to providers/storage-centric schema; add env-templated stores; set default FAISS vector store and nomic embedding; enable telemetry; update models/providers (Azure, Vertex, OpenAI, sentence-transformers).
Example configs — removals
examples/gemini-run.yaml, examples/openai-faiss-run.yaml, examples/openai-pgvector-run.yaml, examples/vllm-granite-run.yaml, examples/vllm-llama-faiss-run.yaml, examples/vllm-qwen3-run.yaml
Delete legacy example manifests.
E2E configs
tests/e2e/configs/run-azure.yaml, .../run-ci.yaml, .../run-rhaiis.yaml, .../run-rhelai.yaml, .../run-vertexai.yaml, tests/e2e-prow/rhoai/configs/run.yaml
Align with new schema: add providers blocks, env-driven SQLite paths, FAISS vector_io, rag-runtime tool group, telemetry; update namespaces and defaults.
E2E tests
tests/e2e/features/faiss.feature, tests/e2e/test_list.txt
Add FAISS feature verifying vector store registration and file_search query; register test in list.
CI workflow
.github/workflows/e2e_tests_providers.yaml
Change triggers to run on push only; remove schedule and manual dispatch.
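
For context, a minimal docker-compose sketch of where the new mount sits; the service name and surrounding keys here are assumptions, and only the volume mapping itself comes from the diff:

services:
  llama-stack:  # hypothetical service name, not taken from the diff
    volumes:
      # Mount the dummy RAG DB fixture into the container's RAG storage path; :Z relabels it for SELinux.
      - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z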

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

Suggested reviewers

  • asimurka
  • tisnik
  • radofuchs

🚥 Pre-merge checks: ✅ 3 passed
  • Description Check: ✅ Passed. Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The title directly and accurately describes the primary change: adding full support for the FAISS VectorIO provider, which aligns with the main objectives of adding FAISS e2e tests, configurations, and related dependencies.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (6)
examples/vllm-rhaiis-run.yaml (1)

33-37: Hardcoded port 8000 in URL.

The URL hardcodes port 8000 (http://${env.RHAIIS_URL}:8000/v1/), unlike the RHEL AI config, which uses ${env.RHEL_AI_PORT}. Consider using an environment variable for the port to maintain deployment flexibility:

url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
tests/e2e/features/faiss.feature (1)

8-27: Consider adding a note about the port difference.

This scenario correctly uses @skip-in-library-mode since it directly queries the llama-stack service on port 8321. The second scenario (lines 29-39) omits the hostname/port override, so it will use defaults to test through lightspeed-stack. This asymmetry is intentional but worth a brief inline comment for maintainability.

tests/e2e/configs/run-azure.yaml (1)

34-36: Consider parameterizing the Azure API base URL.

The api_base is hardcoded to a specific test endpoint. For flexibility across environments, consider using an environment variable with a default:

api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
examples/azure-run.yaml (2)

34-36: Consider using an environment variable for the API base URL.

The api_base is hardcoded while api_key uses an environment variable. For consistency and deployment flexibility across environments, consider parameterizing the base URL:

   config: 
     api_key: ${env.AZURE_API_KEY}
-    api_base: https://ols-test.openai.azure.com/
+    api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
     api_version: 2024-02-15-preview

111-116: Inconsistent storage path structure between backends.

kv_default uses ~/.llama/storage/rag/kv_store.db (includes /rag/ subfolder) while sql_default uses ~/.llama/storage/sql_store.db (no subfolder). This asymmetry may cause confusion. Consider aligning the path structure or documenting the rationale.

run.yaml (1)

119-125: Same storage path inconsistency as azure-run.yaml.

The kv_default backend uses ~/.llama/storage/rag/kv_store.db while sql_default uses ~/.llama/storage/sql_store.db. The comment on line 120 explains that registry and RAG are unified in kv_default, which provides context. However, consider whether the /rag/ subdirectory name is still appropriate given it stores more than just RAG data.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96888ca and 85c9528.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/providers.md (2 hunks)
  • examples/azure-run.yaml (1 hunks)
  • examples/gemini-run.yaml (0 hunks)
  • examples/openai-faiss-run.yaml (0 hunks)
  • examples/openai-pgvector-run.yaml (0 hunks)
  • examples/run.yaml (1 hunks)
  • examples/vertexai-run.yaml (4 hunks)
  • examples/vllm-granite-run.yaml (0 hunks)
  • examples/vllm-llama-faiss-run.yaml (0 hunks)
  • examples/vllm-qwen3-run.yaml (0 hunks)
  • examples/vllm-rhaiis-run.yaml (1 hunks)
  • examples/vllm-rhelai-run.yaml (1 hunks)
  • examples/vllm-rhoai-run.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • run.yaml (3 hunks)
  • tests/e2e-prow/rhoai/configs/run.yaml (4 hunks)
  • tests/e2e/configs/run-azure.yaml (4 hunks)
  • tests/e2e/configs/run-ci.yaml (3 hunks)
  • tests/e2e/configs/run-rhaiis.yaml (4 hunks)
  • tests/e2e/configs/run-rhelai.yaml (4 hunks)
  • tests/e2e/configs/run-vertexai.yaml (4 hunks)
  • tests/e2e/features/faiss.feature (1 hunks)
  • tests/e2e/test_list.txt (1 hunks)
💤 Files with no reviewable changes (6)
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
🧰 Additional context used
📓 Path-based instructions (3)
tests/e2e/test_list.txt

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain test list in tests/e2e/test_list.txt for end-to-end tests

Files:

  • tests/e2e/test_list.txt
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Configure pylint with source-roots = "src"
Exclude src/auth/k8s.py from pyright type checking

Files:

  • pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/test_list.txt
🪛 YAMLlint (1.37.1)
tests/e2e/configs/run-rhaiis.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

tests/e2e/configs/run-rhelai.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (30)
pyproject.toml (1)

55-55: Consider using a flexible version constraint for urllib3 or explain the exact version pin.

urllib3 2.6.1 is the latest non-vulnerable version, but 2.6.2 is available as a bug fix. The exact version pin (==2.6.1) is unnecessarily restrictive. If there's no specific compatibility reason for avoiding 2.6.2, consider using >=2.6.1,<2.7 instead. If an exact pin is necessary, document the reason in a comment above the dependency.

The einops dependency is appropriate: einops 0.8.1 is the latest version with no known vulnerabilities.

⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Always check `pyproject.toml` for existing dependencies and versions before adding new ones
docs/providers.md (2)

103-103: Documentation accurately reflects the disabled llama-guard shield.

The change from ✅ to ❌ for llama-guard aligns with the configuration changes across the example files where shields are disabled with the comment about "infinite loop issue with LLM calls".


160-160: rag-runtime is now correctly marked as supported.

The rag-runtime provider is properly documented as supported (✅), which aligns with its configuration in all the new example YAML files.

examples/vllm-rhelai-run.yaml (1)

1-151: New RHEL AI configuration file is well-structured.

The configuration follows the consolidated provider pattern with:

  • Environment-driven paths and credentials
  • FAISS vector store with sentence-transformers embeddings
  • Properly disabled shields with documented reasoning
  • Consistent storage backend configuration
examples/vertexai-run.yaml (2)

30-67: Provider configuration is well-organized.

The inference providers are properly configured with:

  • Google Vertex AI remote provider with project/location from environment variables
  • Sentence-transformers for embeddings
  • FAISS vector store with persistence

132-141: Empty models array may need attention.

The registered_resources.models array is empty while the Vertex AI provider specifies allowed_models: ["google/gemini-2.5-flash"]. Verify if a model registration is needed here, or if the allowed_models configuration in the provider is sufficient for Vertex AI.

examples/vllm-rhoai-run.yaml (2)

134-138: Verify the hardcoded model configuration and null provider_model_id.

Unlike other example configs that use environment variables (e.g., ${env.RHEL_AI_MODEL}), this file hardcodes model_id: meta-llama/Llama-3.2-1B-Instruct and sets provider_model_id: null.

Consider using environment variables for consistency:

- model_id: ${env.RHOAI_MODEL}
  provider_id: vllm
  model_type: llm
  provider_model_id: ${env.RHOAI_MODEL}

37-37: Different max_tokens value than other vLLM configs.

This config uses max_tokens: 1024 while vllm-rhelai-run.yaml and vllm-rhaiis-run.yaml use max_tokens: 2048. Verify if this is intentional for RHOAI deployments.

examples/vllm-rhaiis-run.yaml (1)

1-151: RHAIIS configuration follows the established pattern.

The configuration is well-structured and consistent with other example configurations in this PR.

tests/e2e/configs/run-rhelai.yaml (1)

1-27: Configuration header and store definitions look correct.

The version, image_name, APIs, and store configurations follow the established pattern with proper environment variable substitution.

tests/e2e-prow/rhoai/configs/run.yaml (1)

1-150: Configuration structure looks well-organized.

The consolidated provider configuration with environment-variable-backed paths is consistent. The explicit warnings about disabled shields (lines 49, 138) document the known issue clearly.

tests/e2e/configs/run-vertexai.yaml (2)

30-67: Provider configuration looks correct.

The Vertex AI inference provider is properly configured with project/location environment variables, and the FAISS vector_io provider follows the correct structure with persistence settings.


132-146: Registered resources and vector stores configuration is consistent.

Empty models list is appropriate here since allowed_models is specified in the provider configuration (line 36). The vector_stores defaults align with the FAISS provider setup.

tests/e2e/configs/run-ci.yaml (2)

53-74: Scoring and vector_io providers are well-configured.

The scoring providers include useful documentation comments (e.g., "Enable the RAG tool", "Define the storage backend for RAG"). The placeholder '********' for openai_api_key is appropriate for example configurations.


114-155: Telemetry and storage configuration additions look good.

The inline documentation (line 120, 153) explaining the storage backend purpose improves maintainability. The vector_stores configuration with FAISS defaults is consistent with the PR's objective.

examples/run.yaml (3)

1-6: Good documentation header.

The introductory comments clearly explain the configuration purpose and prerequisites (OpenAI API key, rag-content tool reference). This helps users understand how to use this example.


74-80: FAISS vector_io configuration is correct.

The persistence configuration with namespace: vector_io::faiss and backend: kv_default properly integrates with the storage backends defined below. This aligns with the PR's goal of fully supporting the FAISS VectorIO provider.


147-161: Registered resources and vector stores complete the FAISS integration.

The vector_stores section properly configures:

  • default_provider_id: faiss matching the vector_io provider
  • default_embedding_model using sentence-transformers with the nomic model

This provides a complete working example for FAISS-based RAG.

tests/e2e/test_list.txt (1)

1-1: LGTM!

The new faiss.feature is correctly added to the test list, following the coding guidelines for maintaining end-to-end tests in this file.

docker-compose-library.yaml (1)

16-16: LGTM!

The volume mount correctly exposes the test RAG database to the container, enabling FAISS-based e2e tests.

docker-compose.yaml (1)

14-14: LGTM!

Consistent with the corresponding mount in docker-compose-library.yaml, ensuring FAISS e2e tests work in both library and non-library modes.

tests/e2e/features/faiss.feature (1)

1-39: Feature file follows BDD conventions.

The Gherkin structure is correct and scenarios appropriately test FAISS vector store registration and retrieval. As per coding guidelines, this uses the behave framework with proper Given/When/Then steps.

tests/e2e/configs/run-azure.yaml (2)

48-52: Safety providers disabled with documented reason.

The commented-out safety configuration includes a clear warning about the infinite loop issue. This is good practice for documenting why a feature is disabled.


61-67: FAISS vector_io provider correctly configured.

The FAISS provider configuration with persistence to kv_default backend aligns with the storage configuration and the mounted RAG volume path.

examples/azure-run.yaml (3)

4-14: Missing telemetry API compared to run.yaml.

The run.yaml includes telemetry in the APIs list and has a telemetry: enabled: true section, but this Azure configuration omits both. If telemetry should be available in Azure deployments, add it for consistency.


48-52: Shields disabled with documented warning — acceptable for known issue.

The safety providers array is empty with a clear warning comment about the infinite loop issue. The corresponding shields: [] in registered_resources is consistent. This is appropriately documented.

Also applies to: 138-138


132-150: Registered resources and vector_stores configuration looks correct.

The configuration properly:

  • Registers the gpt-4o-mini model with the Azure provider
  • Sets up the RAG toolgroup with rag-runtime
  • Configures FAISS as the default vector store provider with nomic-embed-text-v1.5 as the embedding model
run.yaml (3)

31-39: Well-structured inference providers with environment-driven configuration.

The OpenAI provider correctly uses environment variables for both the API key and allowed models with sensible defaults. The inline sentence-transformers provider is appropriately configured for embedding support.


68-74: FAISS vector_io provider correctly configured for persistence.

The FAISS provider is properly set up with a persistence backend pointing to kv_default, which aligns with the storage configuration. This enables the RAG functionality as intended by the PR.


148-155: RAG toolgroup and vector_stores configuration complete.

The configuration properly:

  • Registers builtin::rag toolgroup with the rag-runtime provider
  • Sets FAISS as the default vector store provider
  • Configures nomic-ai/nomic-embed-text-v1.5 as the default embedding model via sentence-transformers

This completes the FAISS VectorIO provider setup required by the PR objectives.
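
Pulling these pieces together, the FAISS wiring under review looks roughly like the sketch below. Key names such as toolgroup_id and the exact nesting under providers/registered_resources are assumptions reconstructed from the snippets quoted in this review, not a verbatim copy of run.yaml:

providers:
  vector_io:
  - provider_id: faiss
    provider_type: inline::faiss
    config:
      persistence:
        namespace: vector_io::faiss   # namespace/backend values as described above
        backend: kv_default
registered_resources:
  tool_groups:
  - toolgroup_id: builtin::rag        # RAG toolgroup served by the rag-runtime provider
    provider_id: rag-runtime
vector_stores:
  default_provider_id: faiss          # FAISS as the default vector store provider
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5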

run.yaml Outdated
Comment on lines 114 to 115
telemetry:
  enabled: true
Contributor

@coderabbitai coderabbitai bot Dec 18, 2025

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find run.yaml file
fd -n "run.yaml" --type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 306


🏁 Script executed:

#!/bin/bash
# Find run.yaml and check its structure
fd -type f -name "*.yaml" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 246


🏁 Script executed:

find . -name "run.yaml" -type f 2>/dev/null | head -5

Repository: lightspeed-core/lightspeed-stack

Length of output: 177


🏁 Script executed:

#!/bin/bash
# Search for run.yaml files
find . -type f -name "run.yaml" 2>/dev/null

# Also search for telemetry configuration patterns
rg -l "telemetry:" --type yaml 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 340


🏁 Script executed:

#!/bin/bash
# Locate run.yaml and examine structure around telemetry config
if [ -f "./run.yaml" ]; then
  echo "=== Found run.yaml ==="
  wc -l ./run.yaml
  echo "=== Lines 100-130 ==="
  sed -n '100,130p' ./run.yaml
  echo "=== Searching for providers section ==="
  grep -n "providers:" ./run.yaml || echo "No explicit providers section found"
elif [ -f "run.yaml" ]; then
  wc -l run.yaml
  sed -n '100,130p' run.yaml
  grep -n "providers:" run.yaml || echo "No explicit providers section found"
else
  find . -name "run.yaml" -type f 2>/dev/null -exec sh -c 'echo "=== Found: $1 ==="; wc -l "$1"; sed -n "100,130p" "$1"; grep -n "providers:" "$1" || echo "No providers section"' _ {} \;
fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 1099


🏁 Script executed:

#!/bin/bash
# Read the providers section
sed -n '1,80p' ./run.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 2155


🏁 Script executed:

#!/bin/bash
# Check entire telemetry section and nearby context
sed -n '110,125p' ./run.yaml
# Search for any backend/exporter configurations
grep -i "backend\|exporter\|endpoint" ./run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 1284


🏁 Script executed:

#!/bin/bash
# Check if telemetry is listed in apis and look for telemetry-related documentation
sed -n '1,20p' ./run.yaml
# Search for any telemetry configuration or documentation
rg -i "telemetry" --type yaml -B2 -A2 ./run.yaml | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 504


🏁 Script executed:

#!/bin/bash
# Search for telemetry-related documentation, READMEs, or code comments
find . -type f \( -name "README*" -o -name "*.md" -o -name "telemetry*" \) | head -20
rg -i "telemetry.*provider|telemetry.*backend|telemetry.*config" --type yaml --type python -B1 -A1 | head -30

Repository: lightspeed-core/lightspeed-stack

Length of output: 617


🏁 Script executed:

#!/bin/bash
# Search for telemetry documentation
rg -i "telemetry" docs/ -l 2>/dev/null | head -10
# Also check config documentation
grep -i "telemetry" README.md docs/config.md 2>/dev/null

Repository: lightspeed-core/lightspeed-stack

Length of output: 319


🏁 Script executed:

#!/bin/bash
# Check providers documentation for telemetry config
grep -i "telemetry" docs/providers.md -B2 -A5 | head -40
# Also check deployment guide
grep -i "telemetry" docs/deployment_guide.md -B2 -A5 | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 2149


🏁 Script executed:

#!/bin/bash
# Get more context from deployment_guide.md about telemetry configuration
sed -n '1,2000p' docs/deployment_guide.md | grep -i "telemetry" -B5 -A10 | head -60

Repository: lightspeed-core/lightspeed-stack

Length of output: 2874


🏁 Script executed:

#!/bin/bash
# Check other run.yaml examples to see how telemetry should be configured
echo "=== Checking examples/run.yaml ==="
grep -A5 "telemetry:" examples/run.yaml | head -20
echo ""
echo "=== Checking tests/configuration/run.yaml ==="
grep -A5 "telemetry:" tests/configuration/run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 391


🏁 Script executed:

#!/bin/bash
# Check if there are any code comments or documentation about when telemetry provider config is required
rg -i "telemetry.*provider|telemetry.*config|enabled.*true" docs/ -B2 -A2 | head -40
# Also check the main config documentation
head -100 docs/config.md

Repository: lightspeed-core/lightspeed-stack

Length of output: 4089


Add telemetry provider configuration to match test and documentation patterns.

Telemetry is enabled but lacks an explicit provider configuration under the providers section, unlike other APIs (inference, safety, scoring, etc.). The test configuration at tests/configuration/run.yaml shows the proper pattern with providers.telemetry using the meta-reference provider. Either add explicit telemetry provider config or document why using a default backend is acceptable.

🤖 Prompt for AI Agents
In run.yaml around lines 114 to 115, telemetry is enabled but missing an
explicit providers.telemetry entry; add a providers.telemetry block mirroring
the pattern used by inference/safety/scoring (e.g., include a telemetry provider
key such as meta-reference with its required connection/config fields used in
tests) so telemetry has an explicit provider configuration, or update docs to
justify relying on a default backend—ensure the YAML structure follows the
existing providers.<provider> pattern used elsewhere.

Contributor Author

provider config for telemetry is not supported in 0.3.x

Contributor

@are-ces, thank you for the clarification! I've noted that telemetry provider configuration is not supported in version 0.3.x, which explains why the configuration is valid as-is with just the enabled: true flag.


✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.


Comment on lines 146 to 150
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
Contributor

⚠️ Potential issue | 🟡 Minor

Vector stores configuration uses v1.5, though v2 is now available as a newer alternative.

The FAISS and sentence-transformers configuration is properly set. However, Nomic Embed Text V2 is now available for production workloads as a successor to v1.5, offering multilingual support if needed. Additionally, ensure that when embeddings are generated at runtime, the text prompt includes a task instruction prefix (e.g., search_document: or search_query: for RAG applications) to align with the model's requirements.

🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 146 to 150, the vector store
config pins the embedding model to nomic-ai/nomic-embed-text-v1.5; update
model_id to the v2 release (e.g., nomic-ai/nomic-embed-text-v2) to use the newer
multilingual production model and adjust any downstream compatibility settings
if required, and ensure that runtime embedding calls prepend the appropriate
task instruction prefix (for example "search_document:" or "search_query:") to
the text passed to the embedding API so the model receives the required task
context for RAG/search use cases.

Comment on lines +31 to +27
And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
When I use "query" to ask question with authorization header
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n tests/e2e/features/faiss.feature | sed -n '30,35p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 542


Complete the JWT token with all three segments.

A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots. The token on line 31 is missing the signature segment and the payload appears truncated. Provide a complete, valid JWT token with all three parts (header.payload.signature).

🤖 Prompt for AI Agents
In tests/e2e/features/faiss.feature around lines 31 to 32, the Authorization
Bearer token is incomplete (missing the third signature segment and has a
truncated payload); replace that string with a well-formed JWT consisting of
three Base64url-encoded segments separated by dots (header.payload.signature).
Ensure the token used in the test is complete and valid for your test
environment (either a real signed token or a locally generated test token with
matching server validation), and update the line to include the full token
string.

@are-ces are-ces force-pushed the faiss-support-final branch from 160b644 to ca83927 on December 18, 2025 at 10:32
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @tests/e2e/configs/run-ci.yaml:
- Around line 60-63: The Braintrust provider config currently hardcodes
openai_api_key: '********' which will fail at runtime; update the provider
config for provider_type inline::braintrust to read the API key from an
environment variable (e.g., using the config expression or secret interpolation
used elsewhere in CI configs) instead of the literal '********', ensuring the
provider_id/provider_type block uses the env var name so the runtime picks up
the real key.
- Around line 160-164: registered_resources has malformed YAML: move the model
entries currently under the top-level list into a nested models: mapping and
ensure shields: [] remains a key with a models: list beneath it; specifically,
under registered_resources->shields keep "shields: []" as an empty sequence or
mapping and create a "models:" key containing entries with "provider_id:" and
"model_id:" (e.g., add an entry for provider_id: sentence-transformers and
model_id: nomic-ai/nomic-embed-text-v1.5) so the default_embedding_model
referenced in vector_stores is registered; validate indentation and YAML
structure so the models are children of the models: key, not sibling keys at the
same level as shields.
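
A minimal sketch of the structure that prompt asks for, assuming the model_type and metadata fields used by the other configs in this PR (the embedding dimension still needs verification, as a later comment notes):

registered_resources:
  shields: []                          # shields stay disabled (known infinite-loop issue)
  models:
  - model_id: nomic-ai/nomic-embed-text-v1.5
    provider_id: sentence-transformers
    model_type: embedding
    provider_model_id: nomic-ai/nomic-embed-text-v1.5
    metadata:
      embedding_dimension: 768         # verify the nomic model's actual dimension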
🧹 Nitpick comments (2)
tests/e2e/configs/run-ci.yaml (2)

48-52: Track the shields/safety infinite loop issue.

Safety providers and shields are disabled due to an "infinite loop issue with LLM calls". While this may be acceptable for E2E testing, ensure this known issue is tracked and prioritized for resolution before production deployment.

Is there an open issue tracking the shields infinite loop problem? If not, would you like me to help create one?


120-120: Consider breaking the long comment into multiple lines.

The comment explaining the unified storage backend is informative but spans over 100 characters, which impacts readability.

♻️ Suggested refactor
-    kv_default: # Define the storage backend type for RAG, in this case registry and RAG are unified i.e. information on registered resources (e.g. models, vector_stores) are saved together with the RAG chunks
+    kv_default:
+      # Define the storage backend type for RAG. In this case, registry and RAG are unified,
+      # i.e., information on registered resources (e.g., models, vector_stores) are saved
+      # together with the RAG chunks.
       type: kv_sqlite
       db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca83927 and c80a78d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • tests/e2e/configs/run-ci.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: server mode / vertexai
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: library mode / vertexai
🔇 Additional comments (2)
tests/e2e/configs/run-ci.yaml (2)

31-36: LGTM: Inference provider properly configured.

The OpenAI inference provider configuration correctly uses environment variables for the API key and model selection, with a sensible default for E2E testing.


68-74: LGTM: FAISS vector_io provider properly configured.

The FAISS provider configuration correctly references the kv_default backend for persistence, which aligns with the PR objective to fully support FAISS.

Contributor

@tisnik tisnik left a comment

I'm not sure if the deletion of "old" examples should be part of this PR. Otherwise looks OK, thank you very much.

And REST API service prefix is /v1

@skip-in-library-mode
Scenario: Verify vector store is registered
Contributor

This scenario tests llama-stack behavior. It was useful for debugging, and the check could be added as a single step to make sure the vector_store exists, but the entire test should not be there.

@are-ces are-ces force-pushed the faiss-support-final branch from c80a78d to 2993407 on January 13, 2026 at 12:31
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/e2e/configs/run-ci.yaml (1)

145-150: Embedding model mismatch between registration and default.

The registered_resources.models includes sentence-transformers/all-mpnet-base-v2 (768 dimensions), but vector_stores.default_embedding_model references nomic-ai/nomic-embed-text-v1.5. Using different embedding models for registration versus default could cause dimension mismatches or inconsistent vector retrieval.

Consider aligning these to use the same model:

Option A: Use the registered model as default
 vector_stores:
   default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2
Option B: Register the nomic model instead
   - model_id: sentence-transformers/all-mpnet-base-v2
+  - model_id: nomic-ai/nomic-embed-text-v1.5
     model_type: embedding
     provider_id: sentence-transformers
-    provider_model_id: sentence-transformers/all-mpnet-base-v2
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
     metadata:
-      embedding_dimension: 768
+      embedding_dimension: 768  # Verify nomic model dimension

Also applies to: 161-165

🤖 Fix all issues with AI agents
In @run.yaml:
- Around line 60-63: The braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) currently contains a hardcoded placeholder
openai_api_key: '********'; replace this with an environment variable reference
(e.g., set openai_api_key from process/env such as ${OPENAI_API_KEY} or a
project-specific name like ${BRAINTRUST_OPENAI_API_KEY}) to match other
credential handling so the actual key is not committed in the repo.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config declares an empty safety providers list
(`safety: []`) but still registers a shield named `llama-guard` in
`registered_resources.shields`, causing an inconsistency; either remove the
`llama-guard` entry from `registered_resources.shields` or re-enable the
`llama-guard` provider in the `safety` list (e.g., uncomment and include the
provider config) so that every registered shield has a corresponding provider;
update whichever of `registered_resources.shields` or the `safety` block you
choose to keep so they match.
🧹 Nitpick comments (6)
examples/vertexai-run.yaml (1)

132-138: Empty models array may require documentation.

Unlike the vllm configs which pre-register models, this VertexAI config has an empty models array in registered_resources. Users will need to register models separately. Consider adding a comment indicating that models should be registered via API or by adding entries here with the appropriate provider_model_id from the allowed_models list (e.g., google/gemini-2.5-flash).
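
If a pre-registered model is preferred, one possible entry is sketched below; the provider_id value is an assumption and should match whatever id the Vertex AI inference provider uses in this file:

registered_resources:
  models:
  - model_id: google/gemini-2.5-flash        # taken from the provider's allowed_models list
    provider_id: vertexai                    # hypothetical id; align with the inference provider block
    model_type: llm
    provider_model_id: google/gemini-2.5-flash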

examples/vllm-rhoai-run.yaml (1)

34-37: TLS verification disabled - security consideration.

tls_verify: false bypasses certificate validation. While this may be acceptable for development or internal environments with self-signed certificates, it introduces security risks in production. Consider:

  1. Adding a comment explaining when this is appropriate
  2. Using an environment variable to control this setting: tls_verify: ${env.VLLM_TLS_VERIFY:=true}
♻️ Suggested approach
   config:
     url: ${env.KSVC_URL}/v1/
     api_token: ${env.VLLM_API_KEY}
-    tls_verify: false
+    tls_verify: ${env.VLLM_TLS_VERIFY:=true}  # Set to false only for self-signed certs in dev/test
     max_tokens: 1024
examples/vllm-rhaiis-run.yaml (1)

34-37: URL construction reduces flexibility.

The URL is constructed with hardcoded protocol (http://) and port (:8000). This limits deployment flexibility:

  • Cannot switch to HTTPS without changing the config
  • Cannot use a different port without modifying the file

Consider making the entire URL configurable via environment variable, similar to the RHOAI config.

♻️ Suggested approach
   config:
-    url: http://${env.RHAIIS_URL}:8000/v1/
+    url: ${env.RHAIIS_URL}/v1/  # e.g., http://localhost:8000 or https://rhaiis.example.com
     api_token: ${env.RHAIIS_API_KEY}
-    tls_verify: false
+    tls_verify: ${env.RHAIIS_TLS_VERIFY:=true}
     max_tokens: 2048
examples/azure-run.yaml (1)

30-39: Hardcoded test endpoint and preview API version.

Two concerns with the Azure configuration:

  1. Hardcoded test endpoint (line 35): api_base: https://ols-test.openai.azure.com/ appears to be an internal test endpoint. Example configurations should use environment variables to avoid confusion and accidental usage of test infrastructure.

  2. Preview API version (line 36): api_version: 2024-02-15-preview is a preview version that may be deprecated. Consider documenting the minimum required API version or using a stable release.

♻️ Suggested fix
   inference:
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
-      api_version: 2024-02-15-preview
+      api_base: ${env.AZURE_API_BASE}  # e.g., https://your-resource.openai.azure.com/
+      api_version: ${env.AZURE_API_VERSION:=2024-02-15-preview}
examples/run.yaml (1)

66-69: Use environment variable for API key consistency.

The braintrust provider uses a hardcoded placeholder '********' for openai_api_key. For consistency with other sensitive values in this config (e.g., ${env.OPENAI_API_KEY}), consider using an environment variable.

Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
tests/e2e/configs/run-ci.yaml (1)

60-63: Use environment variable for API key consistency.

Same as in examples/run.yaml - the braintrust provider uses a hardcoded placeholder. Use an environment variable for consistency with other secrets management in this config.

Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c80a78d and 2993407.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/vllm-rhelai-run.yaml
  • examples/vllm-rhoai-run.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • docker-compose.yaml
  • pyproject.toml
  • docker-compose-library.yaml
  • examples/vllm-rhelai-run.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai-run.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/azure-run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (24)
docs/providers.md (2)

103-103: LGTM - Provider status update aligns with PR objectives.

The llama-guard provider status change from ✅ to ❌ is consistent with the "infinite loop issue with LLM calls" mentioned in the configuration files. Consider adding a note in the documentation explaining why this provider is currently unsupported, so users understand it's a known issue rather than a missing feature.


160-160: LGTM - RAG runtime support documented.

The rag-runtime provider is now marked as supported, which aligns with the FAISS VectorIO provider support being added in this PR.

examples/vertexai-run.yaml (3)

30-39: LGTM - VertexAI inference configuration.

The dual-provider setup with Google Vertex for LLM inference and sentence-transformers for embeddings is well-structured.


48-52: Safety provider disabled - ensure this is tracked.

The llama-guard provider is commented out with a warning about an "infinite loop issue with LLM calls." This aligns with the documentation update marking llama-guard as unsupported. Ensure there's an issue tracking this bug for future resolution.


142-146: LGTM - Vector stores configuration.

The default FAISS provider and sentence-transformers embedding model configuration is consistent with the PR's goal of supporting FAISS VectorIO.

examples/vllm-rhaiis-run.yaml (1)

134-138: LGTM - Environment-driven model configuration.

Using environment variables for both model_id and provider_model_id is more flexible than hardcoding, allowing the same config to work across different RHAIIS deployments with different models.

examples/azure-run.yaml (2)

132-145: LGTM - Resource configuration is complete.

The registered resources section properly links the gpt-4o-mini model to the Azure provider and includes the RAG tool group configuration.


146-150: LGTM - Vector stores configuration consistent across examples.

The FAISS vector store with sentence-transformers embedding model is consistently configured across all example files, which supports maintainability.

examples/vllm-rhoai-run.yaml (1)

134-138: No changes needed. Setting provider_model_id: null is intentional and supported behavior in Llama Stack. When set to null (or omitted), the provider defaults to using the model_id value directly. This is documented behavior and is used consistently in the codebase.

Likely an incorrect or invalid review comment.

tests/e2e/configs/run-azure.yaml (1)

1-160: Configuration structure looks correct for Azure provider integration.

The file follows the new centralized providers/storage configuration pattern correctly. Telemetry is properly configured with just enabled: true as expected for Llama Stack 0.3.x. All sensitive credentials use environment variables appropriately.

Based on learnings, the telemetry configuration aligns with the documented pattern.

tests/e2e/configs/run-rhaiis.yaml (1)

1-161: Configuration correctly implements RHAIIS provider setup.

The file properly configures the vLLM-based RHAIIS inference provider with environment-variable-driven URLs and credentials. The structure aligns with the unified providers/storage schema introduced in this PR.

tests/e2e-prow/rhoai/configs/run.yaml (2)

139-142: Verify intentional use of null for provider_model_id.

The provider_model_id is explicitly set to null while model_id is set to meta-llama/Llama-3.2-1B-Instruct. Typically, provider_model_id maps to the model identifier expected by the provider. If the vLLM provider uses the model_id directly or infers it from the endpoint, this may be intentional. Otherwise, consider setting it to match the deployed model.


143-146: Potential mismatch: llama-guard shield references openai/gpt-4o-mini.

The shield is configured with provider_id: llama-guard but provider_shield_id: openai/gpt-4o-mini. This pattern is consistent across other configs, but verify that the llama-guard safety provider correctly routes to the OpenAI provider for the underlying model inference.
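
For readers less familiar with the shield schema, the registration being questioned looks roughly like this; the shield_id key and the exact nesting are assumptions, while the two provider fields come from the comment above:

registered_resources:
  shields:
  - shield_id: llama-guard                   # safety provider handling the shield
    provider_id: llama-guard
    provider_shield_id: openai/gpt-4o-mini   # underlying model the guard inference is routed to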

tests/e2e/configs/run-vertexai.yaml (2)

30-36: Vertex AI provider configuration looks good.

The Google Vertex AI provider is correctly configured with environment variables for project and location. The allowed_models array explicitly lists the permitted model.


136-137: Verify empty models array is intentional.

The registered_resources.models is an empty array, while other configuration files (e.g., run-azure.yaml, run-rhaiis.yaml) explicitly register models. If models should be dynamically discovered via allowed_models in the provider config, this is fine. Otherwise, consider registering the Vertex AI model explicitly for consistency.

run.yaml (3)

12-12: Verify telemetry in the apis list.

This file includes telemetry in the apis list (line 12), while other configuration files in this PR do not. Verify if this is intentional for the root/starter configuration or if it should be removed for consistency with the other configs.


117-120: Helpful inline documentation.

The comment explaining that registry and RAG storage are unified in kv_default is valuable for understanding the configuration design.


156-164: Vector stores and telemetry configuration looks correct.

The vector_stores defaults, safety configuration, and telemetry enablement follow the expected pattern for Llama Stack 0.3.x. Based on learnings, the telemetry configuration correctly uses enabled: true without a provider block.

tests/e2e/configs/run-rhelai.yaml (2)

155-157: Verify embedding model availability.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 via sentence-transformers provider, but no embedding model is explicitly registered under registered_resources.models. Confirm that the sentence-transformers provider auto-loads this model or add an explicit registration.


160-161: LGTM!

Telemetry configuration correctly uses enabled: true without a provider block, which aligns with Llama Stack 0.3.x requirements.

examples/run.yaml (2)

157-161: LGTM!

The vector_stores configuration correctly defines the default FAISS provider and embedding model. The use of sentence-transformers with nomic model is consistent with other configuration files.


1-7: Good documentation header.

The header comments clearly explain the purpose, requirements, and external tooling reference. This is helpful for users setting up OpenAI + FAISS configurations.

tests/e2e/configs/run-ci.yaml (2)

48-52: LGTM!

Safety provider is properly configured with llama-guard and correctly referenced in registered_resources.shields and safety.default_shield_id. This differs from other configs where safety is disabled, which is appropriate for CI testing.


168-169: LGTM!

Telemetry is correctly enabled and listed in the APIs. Configuration follows Llama Stack 0.3.x conventions.

@are-ces are-ces force-pushed the faiss-support-final branch from 2993407 to 3dc2804 on January 13, 2026 at 13:06
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

🤖 Fix all issues with AI agents
In @examples/azure-run.yaml:
- Line 36: Update the api_version value for Azure OpenAI in the YAML: replace
the outdated "2024-02-15-preview" with the stable GA "2024-06-01" (or a newer
preview like "2025-04-01-preview" if preview features are required) by editing
the api_version key in the Azure config block so the example uses a current
supported API version.

In @examples/run.yaml:
- Around line 66-69: The YAML snippet uses a hardcoded placeholder API key
("********") for the braintrust provider which will fail at runtime; update the
provider config (provider_id/provider_type/config.openai_api_key) to read the
real key from a secure source (e.g., environment variable or secrets manager)
and validate presence before use (fail fast with a clear error if the env
var/secret is missing) so the provider doesn't attempt to start with a
placeholder value.
- Around line 157-161: The examples run.yaml is missing the telemetry
configuration block referenced by the telemetry API; add a telemetry section at
the end of the file with telemetry: enabled: true (matching the run.yaml
pattern) so telemetry is explicitly enabled and consistent with the API usage.

In @pyproject.toml:
- Line 63: The dependency line "einops>=0.8.1" lacks an explanatory comment; add
a brief single-line comment immediately above that line describing why einops is
required (e.g., "Used for tensor rearrangements and shape transformations in
model preprocessing/visual modules") so it matches the style of other
dependencies and improves maintainability.

In @run.yaml:
- Around line 60-63: The braintrust provider config currently uses a hardcoded
placeholder openai_api_key: '********' which will fail at runtime; update the
inline::braintrust provider block (refer to provider_type: inline::braintrust
and provider_id: braintrust) to read the OpenAI key from an environment variable
(e.g., reference a secret like OPENAI_API_KEY used elsewhere) instead of the
literal ********, and ensure any runtime code that loads config supports env
interpolation or use your secrets manager so the provider's openai_api_key is
populated securely at runtime.

In @tests/e2e/configs/run-ci.yaml:
- Around line 116-123: The two storage backends (kv_default and sql_default)
both use the same env var pattern ${env.SQLITE_STORE_DIR:=...}, causing
collisions when the env var is set; change them to distinct env vars and/or
paths so they cannot override each other (e.g., use
${env.SQLITE_KV_STORE_DIR:=~/.llama/storage/rag/kv_store.db} for kv_default and
${env.SQLITE_SQL_STORE_DIR:=~/.llama/storage/sql_store.db} for sql_default) and
ensure the db_path values for kv_default and sql_default remain unique and
consistent with the other store definitions referenced elsewhere.
- Around line 161-165: The default_embedding_model value (model_id
nomic-ai/nomic-embed-text-v1.5) is not present in registered_resources.models
which will break FAISS usage; either change
vector_stores.default_embedding_model to the already-registered model_id
(sentence-transformers/all-mpnet-base-v2) or add an entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models with model_id
nomic-ai/nomic-embed-text-v1.5, model_type: embedding, provider_id:
sentence-transformers, provider_model_id: nomic-ai/nomic-embed-text-v1.5 and
correct embedding_dimension in metadata so the default and registry match.
- Around line 17-28: The three store entries (conversations_store,
inference_store, metadata_store) currently all use the same env pattern
`${env.SQLITE_STORE_DIR:=...}` which makes them point to the same file when
SQLITE_STORE_DIR is set; update these keys so they either use distinct env vars
per store (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or
treat SQLITE_STORE_DIR as a directory and append filenames (e.g., use
SQLITE_STORE_DIR with `/conversations.db`, `/inference-store.db`,
`/registry.db`), and ensure this matches the existing storage_dir expectation
for a directory (a directory-style sketch follows at the end of this list).
- Around line 60-63: The Braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) contains a hardcoded/placeholder
openai_api_key value; replace it with an environment variable reference (e.g.,
use a CI/secret env var like BRAINTRUST_OPENAI_API_KEY) so the key is not
committed, or if this provider is intentionally disabled remove the provider
block or add a clear comment explaining it's disabled; ensure the config key
name openai_api_key stays the same and that your CI/concourse/github-actions
reads the chosen env var before tests run.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config sets safety: [] but still registers shields and
sets safety.default_shield_id to "llama-guard", causing an inconsistency; fix by
removing the shield registration block (the entries that reference provider_id:
llama-guard / provider_type: inline::llama-guard) and remove or clear
safety.default_shield_id (the safety.default_shield_id: llama-guard setting) so
the config consistently disables safety providers, or alternatively re-enable
the safety provider entry and its config under safety: if you intend to keep
shields.
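
For reference, a minimal sketch of the directory-style option suggested above, where SQLITE_STORE_DIR names a directory and each store appends its own filename. The store and backend names follow the ones quoted in these comments; the key layout and defaults are assumptions, not copied from the repository:

conversations_store:
  # One directory env var, distinct filenames per store
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
  type: sqlite
inference_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
  type: sqlite
metadata_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
  type: sqlite
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
    sql_default:
      type: sql_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db

With this shape, setting the single env var relocates every database without any two stores colliding on the same file.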
🧹 Nitpick comments (2)
examples/vllm-rhelai-run.yaml (1)

36-36: Consider documenting TLS verification in production guidance.

tls_verify: false is acceptable for examples/development, but users copying this config should be reminded to enable TLS verification in production environments. A brief comment would help.
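
One way to make that explicit in the example itself is an inline warning next to the flag. A small sketch, assuming the usual remote vLLM keys (url, api_token, tls_verify); the env var names are illustrative and not copied from the file:

  - provider_id: vllm
    provider_type: remote::vllm
    config:
      url: ${env.VLLM_URL}
      api_token: ${env.VLLM_API_KEY}
      # WARNING: TLS verification is disabled here for local development only.
      # Keep tls_verify: true in any production deployment.
      tls_verify: false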

tests/e2e/configs/run-azure.yaml (1)

35-35: Hardcoded Azure endpoint in test configuration.

The api_base is hardcoded to https://ols-test.openai.azure.com/. This is acceptable for e2e tests but consider using an environment variable for flexibility across test environments.

♻️ Suggested change
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2993407 and 3dc2804.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/vllm-rhelai-run.yaml
  • examples/vllm-rhoai-run.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-granite-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (7)
  • docs/providers.md
  • examples/vllm-rhaiis-run.yaml
  • docker-compose.yaml
  • tests/e2e/test_list.txt
  • tests/e2e/features/faiss.feature
  • examples/vertexai-run.yaml
  • tests/e2e/configs/run-vertexai.yaml
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Check pyproject.toml for supported Python versions before development
Always check pyproject.toml for existing dependencies before adding new ones
Always verify current library versions in pyproject.toml rather than assuming versions
Use pylint with source-roots = "src" configuration

Files:

  • pyproject.toml
🧠 Learnings (5)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai-run.yaml
  • examples/azure-run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • examples/run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
  • run.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (25)
tests/e2e/configs/run-rhaiis.yaml (6)

17-27: LGTM!

The store configurations use consistent environment variable patterns with sensible defaults and separate database files for each store type.


66-72: LGTM!

The FAISS vector_io provider configuration correctly sets up persistence with the kv_default backend and an appropriate namespace. This aligns with the PR objective of fully supporting the FAISS VectorIO provider.


73-110: LGTM!

The agents, batches, datasetio, and eval provider configurations follow consistent patterns with appropriate namespace isolation and backend references.


114-136: LGTM!

The storage configuration properly defines both kv_sqlite and sql_sqlite backends with appropriate store definitions. The inference store includes sensible write queue settings for performance.


143-146: Verify shield configuration consistency.

The shield is registered with provider_id: llama-guard (which maps to inline::llama-guard provider), but provider_shield_id references openai/gpt-4o-mini. This appears inconsistent—typically the provider_shield_id should match the provider's expected model identifier.

Given the PR notes that shield support was revoked due to unusability, please confirm whether this configuration is intentionally a placeholder or if it should be corrected/removed.
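
If the intent is a real Llama Guard shield rather than a placeholder, a consistent registration would look roughly like the sketch below; the Llama Guard model id is an assumption used for illustration, not taken from this config:

  shields:
  - shield_id: llama-guard
    provider_id: llama-guard
    # provider_shield_id should name the model the inline::llama-guard provider actually runs
    provider_shield_id: meta-llama/Llama-Guard-3-8B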


153-161: LGTM!

The vector stores configuration correctly sets FAISS as the default provider with sentence-transformers embedding model. The telemetry configuration follows the Llama Stack 0.3.x pattern of using simple enabled: true without a provider block. Based on learnings, this is the correct approach.

tests/e2e/configs/run-ci.yaml (3)

12-12: LGTM!

The telemetry API addition and the telemetry.enabled: true configuration align with the Llama Stack 0.3.x pattern where telemetry can be enabled without requiring an explicit provider block. Based on learnings.

Also applies to: 168-169


31-74: LGTM!

The providers block is well-structured with appropriate comments explaining the RAG tool and FAISS storage backend configuration. The inference provider correctly uses environment variables for the API key and allowed models.
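
As an illustration of that pattern, here is a sketch of an inference provider entry with an env-sourced key and a restricted model list; the provider type and exact key names are assumptions, not copied from the file:

  inference:
  - provider_id: openai
    provider_type: remote::openai
    config:
      # Secret comes from the environment, never from the committed config
      api_key: ${env.OPENAI_API_KEY}
      # Limit which models the CI run may call
      allowed_models:
      - gpt-4o-mini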


158-167: LGTM!

The RAG tool group registration properly references the rag-runtime provider, and the safety shield configuration correctly references the registered llama-guard shield.

docker-compose-library.yaml (1)

16-16: LGTM!

The new volume mount correctly exposes the e2e RAG test database to the container at the expected storage path. The :Z flag is appropriate for SELinux relabeling.

examples/vllm-rhelai-run.yaml (1)

1-151: LGTM - Well-structured vLLM RHEL AI configuration.

Configuration properly uses environment variables for sensitive values and follows the centralized storage-backed architecture pattern. The shields disabled warning (line 49, 139) is appropriately documented.

examples/vllm-rhoai-run.yaml (1)

1-151: LGTM - Consistent RHOAI configuration.

Configuration mirrors the RHEL AI variant appropriately with RHOAI-specific environment variables (KSVC_URL, VLLM_API_KEY). The hardcoded model ID with provider_model_id: null is valid for vLLM deployments.

tests/e2e/configs/run-azure.yaml (1)

1-160: LGTM - Comprehensive Azure e2e test configuration.

Configuration correctly enables safety provider (llama-guard) for e2e testing, unlike examples where shields are disabled. The telemetry configuration at root level aligns with Llama Stack 0.3.x requirements. Based on learnings.

examples/azure-run.yaml (1)

1-150: LGTM - Azure example configuration follows established patterns.

Configuration properly disables shields with documented warnings and uses the centralized storage-backed architecture.

tests/e2e-prow/rhoai/configs/run.yaml (2)

1-161: LGTM - Well-configured RHOAI prow test configuration.

Configuration appropriately enables safety (llama-guard) and telemetry for CI/prow testing. The dual inference providers (vllm + openai) provide flexibility for the test environment.


144-146: Shield configuration depends on OpenAI provider.

The provider_shield_id: openai/gpt-4o-mini indicates llama-guard uses OpenAI as its backing model. Ensure OPENAI_API_KEY is available in the prow test environment, as the shield will fail without it even though vllm is the primary inference provider.

run.yaml (3)

12-14: LGTM!

The telemetry API is correctly added to the apis list, and the telemetry configuration at lines 163-164 follows the correct Llama Stack 0.3.x pattern with telemetry.enabled: true without a provider block. Based on learnings, this is the expected approach.


68-74: LGTM!

The FAISS vector_io provider configuration is well-structured with proper persistence settings referencing the kv_default backend.


156-164: LGTM!

The vector_stores, safety, and telemetry sections are properly configured with appropriate defaults and references to the defined providers.

examples/run.yaml (3)

1-6: LGTM!

Good documentation header explaining the configuration purpose and requirements.


54-58: Safety shields intentionally disabled.

The safety providers and shields are disabled with a documented reason (infinite loop issue with LLM calls). This is acceptable for the example configuration, but ensure the underlying issue is tracked for resolution.

Is there a tracking issue for the infinite loop problem mentioned in the comment? If not, consider creating one to ensure this gets addressed.


147-156: LGTM!

The registered_resources section correctly registers the RAG tool group while keeping models and shields empty (consistent with the disabled safety configuration and dynamic model registration via OpenAI).

tests/e2e/configs/run-rhelai.yaml (3)

30-37: LGTM with note on TLS verification.

The vLLM inference provider configuration is appropriate for the RHEL AI test environment. Note that tls_verify: false is acceptable for testing but should not be used in production environments.


153-161: LGTM!

The vector_stores configuration and telemetry block are correctly configured for the RHEL AI test environment, following the Llama Stack 0.3.x pattern.


114-136: LGTM!

The storage backends and stores configuration is well-structured and consistent with the other configuration files in this PR.

    config:
      api_key: ${env.AZURE_API_KEY}
      api_base: https://ols-test.openai.azure.com/
      api_version: 2024-02-15-preview
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Azure OpenAI API versions 2024 2025 latest current

💡 Result:

Short answer

  • Latest GA Azure OpenAI Service API version: 2024-06-01. [1]
  • Latest preview / Foundry API versions (most recent preview cycle): 2025-04-01-preview (and incremental preview dates like 2025-03-01-preview, 2025-02-01-preview, 2024-12-01-preview are documented). [2]

Notes

  • Microsoft documents the API-version lifecycle and change history for Azure OpenAI / Foundry (preview versions listed through 2025-04-01-preview). [2]
  • If you use the Assistants API, v1 is being retired (migration to v2 required—retirement notices in 2025), so check the specific retirement date for your region. [3]

Sources

  1. Rollout / Azure OpenAI Service summary (latest GA = 2024-06-01). [4]
  2. Microsoft Learn — Azure OpenAI / Foundry API version lifecycle (preview releases through 2025-04-01-preview). [2]
  3. Announcement / notices re: Assistants API v1 retirement. [3]

Would you like direct links to the specific Microsoft docs or help updating code to a particular api-version?


Update Azure OpenAI API version to current release.

The api_version: 2024-02-15-preview is outdated. The latest GA version is 2024-06-01, with preview versions available through 2025-04-01-preview. Update to at least the stable 2024-06-01 release to ensure access to current features and improvements.

🤖 Prompt for AI Agents
In @examples/azure-run.yaml at line 36, Update the api_version value for Azure
OpenAI in the YAML: replace the outdated "2024-02-15-preview" with the stable GA
"2024-06-01" (or a newer preview like "2025-04-01-preview" if preview features
are required) by editing the api_version key in the Azure config block so the
example uses a current supported API version.

Comment on lines +66 to +68
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
      openai_api_key: '********'
⚠️ Potential issue | 🟡 Minor

Placeholder API key for braintrust scoring provider.

Same issue as in run.yaml - the placeholder value will fail at runtime if used.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @examples/run.yaml around lines 66 - 69, The YAML snippet uses a hardcoded
placeholder API key ("********") for the braintrust provider which will fail at
runtime; update the provider config
(provider_id/provider_type/config.openai_api_key) to read the real key from a
secure source (e.g., environment variable or secrets manager) and validate
presence before use (fail fast with a clear error if the env var/secret is
missing) so the provider doesn't attempt to start with a placeholder value.

"urllib3==2.6.2",
# Used for agent card configuration
"PyYAML>=6.0.0",
"einops>=0.8.1",
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Search for einops usage in Python files
rg "einops" --type py -i

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# View pyproject.toml around line 63 to see the dependency and comments
sed -n '50,75p' pyproject.toml

Repository: lightspeed-core/lightspeed-stack

Length of output: 911


🏁 Script executed:

# Check the overall structure and comments in dependencies section
sed -n '1,100p' pyproject.toml | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 2912


🏁 Script executed:

# Double-check with a broader search for einops in all file types
rg "einops" -i

Repository: lightspeed-core/lightspeed-stack

Length of output: 817


🏁 Script executed:

# Check if einops might be mentioned in documentation or requirements
find . -type f \( -name "*.md" -o -name "*.txt" -o -name "*.rst" \) -exec grep -l "einops" {} \;

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


Add an explanatory comment for the einops dependency.

Every other dependency in this section has a comment explaining its purpose (e.g., "Used by Llama Stack version checker", "Used for agent card configuration"). einops lacks this comment, reducing maintainability. Add a brief comment above the line to clarify what einops is used for.

🤖 Prompt for AI Agents
In @pyproject.toml at line 63, The dependency line "einops>=0.8.1" lacks an
explanatory comment; add a brief single-line comment immediately above that line
describing why einops is required (e.g., "Used for tensor rearrangements and
shape transformations in model preprocessing/visual modules") so it matches the
style of other dependencies and improves maintainability.

Comment on lines +60 to +63
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
      openai_api_key: '********'
⚠️ Potential issue | 🟡 Minor

Placeholder API key for braintrust scoring provider.

The openai_api_key: '********' is a redacted placeholder that will fail at runtime if the braintrust scoring provider is used. Consider using an environment variable like the other API keys.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @run.yaml around lines 60 - 63, The braintrust provider config currently uses
a hardcoded placeholder openai_api_key: '********' which will fail at runtime;
update the inline::braintrust provider block (refer to provider_type:
inline::braintrust and provider_id: braintrust) to read the OpenAI key from an
environment variable (e.g., reference a secret like OPENAI_API_KEY used
elsewhere) instead of the literal ********, and ensure any runtime code that
loads config supports env interpolation or use your secrets manager so the
provider's openai_api_key is populated securely at runtime.

Comment on lines 17 to 28
 conversations_store:
-  db_path: ~/.llama/storage/conversations.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
   type: sqlite
 datasets: []
 image_name: starter
 # external_providers_dir: /opt/app-root/src/.llama/providers.d
 inference_store:
-  db_path: ~/.llama/storage/inference-store.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
   type: sqlite
 metadata_store:
-  db_path: ~/.llama/storage/registry.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
   type: sqlite
⚠️ Potential issue | 🟠 Major

Environment variable pattern may cause database conflicts.

All three store paths use ${env.SQLITE_STORE_DIR:=...} with different default values. If SQLITE_STORE_DIR is set, all stores will point to the same file path, causing conflicts.

The pattern should either:

  1. Use distinct env vars per store (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH), or
  2. Use SQLITE_STORE_DIR as a directory with filenames appended: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db

This also conflicts with line 45 where storage_dir expects a directory path.

🔧 Suggested fix using directory pattern
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
   type: sqlite
 datasets: []
 image_name: starter
 # external_providers_dir: /opt/app-root/src/.llama/providers.d
 inference_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
   type: sqlite
 metadata_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
   type: sqlite
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 17 - 28, The three store entries
(conversations_store, inference_store, metadata_store) currently all use the
same env pattern `${env.SQLITE_STORE_DIR:=...}` which makes them point to the
same file when SQLITE_STORE_DIR is set; update these keys so they either use
distinct env vars per store (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH,
METADATA_DB_PATH) or treat SQLITE_STORE_DIR as a directory and append filenames
(e.g., use SQLITE_STORE_DIR with `/conversations.db`, `/inference-store.db`,
`/registry.db`), and ensure this matches the existing storage_dir expectation
for a directory.

Comment on lines +60 to +63
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
      openai_api_key: '********'
⚠️ Potential issue | 🟡 Minor

Braintrust API key should use environment variable.

The openai_api_key: '********' appears to be either a placeholder or obfuscated value. If this provider is used in CI tests, it will fail. If it's intentionally disabled, consider removing the provider or adding a comment.

Per project guidelines, use environment variables for sensitive data.

🔧 Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 60 - 63, The Braintrust provider
block (provider_id: braintrust, provider_type: inline::braintrust) contains a
hardcoded/placeholder openai_api_key value; replace it with an environment
variable reference (e.g., use a CI/secret env var like
BRAINTRUST_OPENAI_API_KEY) so the key is not committed, or if this provider is
intentionally disabled remove the provider block or add a clear comment
explaining it's disabled; ensure the config key name openai_api_key stays the
same and that your CI/concourse/github-actions reads the chosen env var before
tests run.

- FAISS e2e test (for all inference providers)
- Disable shields in all configurations
- Dummy rag db

Update/remove old configuration examples
@are-ces are-ces force-pushed the faiss-support-final branch from 3dc2804 to f2b7f5e on January 13, 2026 at 13:16
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/providers.md (1)

23-23: Typo: "atributes" → "attributes"

-The tables below summarize each provider category, containing the following atributes:
+The tables below summarize each provider category, containing the following attributes:
🤖 Fix all issues with AI agents
In @.github/workflows/e2e_tests_providers.yaml:
- Around line 4-5: The workflow currently uses a bare "on: push" which triggers
on every branch and removed manual dispatch, PR validation, and scheduled runs;
update the "on" section to a mapping that includes "push" with "branches" (e.g.,
["main","release"]) to limit pushes, add "pull_request" with the same "branches"
to validate PRs, re-add "workflow_dispatch" to allow manual runs, and include a
"schedule" entry (cron) to run periodic E2E checks; modify the YAML keys "on",
"push", "pull_request", "workflow_dispatch", and "schedule" accordingly so the
workflow is targeted, can be manually triggered, validates PRs, and runs on a
cron.

In @examples/azure-run.yaml:
- Around line 137-152: The config references default_embedding_model =
"nomic-ai/nomic-embed-text-v1.5" but that embedding model is not present in
registered_resources.models; add a models entry for the embedding (e.g., a model
object with model_id matching the referenced name or a chosen id, provider_id
set to the correct provider for Nomic, model_type set to "embedding", and
provider_model_id set to "nomic-ai/nomic-embed-text-v1.5") so
default_embedding_model points to a registered model; ensure the same fix is
applied for the other occurrence noted (lines 153-157).

In @examples/run.yaml:
- Around line 119-120: The YAML contains two identical top-level telemetry
blocks; remove the duplicate so only a single "telemetry:" mapping remains at
the top level, deleting the extra "telemetry: enabled: true" occurrence and
ensuring surrounding indentation/spacing keeps the YAML valid.
- Around line 146-157: The YAML registers no models while RAG references an
embedding model; add the embedding model to registered_resources.models so
default_embedding_model matches a registered model (e.g., add an entry with
model_id like "nomic-embed-text-v1.5" or similar, provider_id (e.g.,
"nomic-ai"), and provider_model_id set to "nomic-ai/nomic-embed-text-v1.5");
ensure the model_id you add is the same identifier used by
default_embedding_model and that the toolgroup entry (toolgroup_id:
builtin::rag) can resolve that model.
- Line 58: The YAML has a syntax error where the keys "provider_type" and
"scoring" are on the same line; update the config so "provider_type:
inline::llama-guard" and "scoring:" are on separate lines (ensure proper
indentation for the "scoring" block), so the YAML parser can correctly recognize
both keys.

In @examples/vllm-rhaiis.yaml:
- Around line 17-27: The config uses the same `${env.SQLITE_STORE_DIR:=...}`
substitution for conversations_store, inference_store, and metadata_store which
replaces the entire path and will make all three point to the same file when
SQLITE_STORE_DIR is set; update the three entries so each store has its own env
var (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or switch
to a single directory env var (e.g., SQLITE_STORE_DIR) combined with per-store
filenames (concatenate directory + filename) so that conversations_store,
inference_store, and metadata_store each resolve to distinct DB files.
- Around line 46-52: The storage_dir value is using the misnamed env var
SQLITE_STORE_DIR which implies a DB path; update the config to use a dedicated
environment variable (e.g., FILES_STORAGE_DIR) for file storage paths and adjust
the default accordingly. Locate the storage_dir key in the inline::localfs
provider block and replace `${env.SQLITE_STORE_DIR:=~/.llama/storage/files}`
with `${env.FILES_STORAGE_DIR:=~/.llama/storage/files}` (or another appropriate
FILES_* var), and ensure any documentation or env examples are updated to
reference FILES_STORAGE_DIR instead of SQLITE_STORE_DIR.
- Around line 153-157: The configuration selects the embedding model
"nomic-ai/nomic-embed-text-v1.5" but the Python dependency "nomic" is not
declared; add "nomic" to the project dependencies (pyproject.toml/requirements)
alongside the existing sentence-transformers and einops entries so the model can
be loaded at runtime, specifying a stable version constraint (or latest)
compatible with sentence-transformers.

In @examples/vllm-rhelai.yaml:
- Line 15: Remove the trailing whitespace after the APIs list on line 15 (the
blank/whitespace-only content following the "APIs" entry) so the line contains
no trailing spaces; simply delete the extra spaces at the end of that line to
clean up formatting.
- Around line 143-146: The YAML defines a shield (shield_id: llama-guard,
provider_id: llama-guard, provider_shield_id: openai/gpt-4o-mini) and sets
default_shield_id: llama-guard while the safety providers array is empty, which
is inconsistent; either remove the shield block and the default_shield_id entry
(if safety is intended to be disabled) or populate the safety providers array
with a matching provider definition for provider_id: llama-guard (and configure
any required provider_shield mapping) so the default_shield_id resolves at
runtime.
- Around line 17-27: The db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) currently all use the same env
var SQLITE_STORE_DIR with different inline defaults, which causes different
stores to collapse to the same path when SQLITE_STORE_DIR is set; change each
entry to either use its own env var (e.g., SQLITE_CONVERSATIONS_PATH,
SQLITE_INFERENCE_PATH, SQLITE_METADATA_PATH) or treat SQLITE_STORE_DIR strictly
as a directory prefix and append distinct filenames outside the variable (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db), and apply the same
change across all example and test config files to avoid cross-store collisions.

In @run.yaml:
- Around line 156-160: The default_embedding_model currently sets provider_id:
sentence-transformers and model_id: nomic-ai/nomic-embed-text-v1.5 which doesn't
match the registered embedding model (sentence-transformers/all-mpnet-base-v2);
update the default_embedding_model.model_id to the registered model_id
(sentence-transformers/all-mpnet-base-v2) so vector_stores.default_provider_id
and the registered embedding entry are consistent (mirror the same fix applied
in run-ci.yaml).

In @tests/e2e/configs/run-ci.yaml:
- Around line 161-165: The default_embedding_model
(nomic-ai/nomic-embed-text-v1.5) is not registered in
registered_resources.models (only sentence-transformers/all-mpnet-base-v2 is
present), causing a mismatch; fix by either adding a registration entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models or changing
vector_stores.default_embedding_model to the existing
sentence-transformers/all-mpnet-base-v2 so the default references a registered
model.

In @tests/e2e/configs/run-rhaiis.yaml:
- Around line 115-122: Both storage backends kv_default and sql_default use the
same env var SQLITE_STORE_DIR so setting it makes them point to the same DB
file; change the db_path entries so each backend uses a distinct env var or
distinct filename (e.g., SQLITE_KV_STORE_DIR for kv_default and
SQLITE_SQL_STORE_DIR for sql_default, or keep SQLITE_STORE_DIR but append
different filenames like ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
and .../sql_store.db) so kv_default.db_path and sql_default.db_path resolve to
different files.
- Around line 18-28: The three db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) incorrectly treat
SQLITE_STORE_DIR as a full filepath and thus will point all databases to the
same file when the env var is set; update each db_path to use SQLITE_STORE_DIR
as a base directory and append distinct filenames (e.g., conversations.db,
inference-store.db, registry.db) so each DB has its own file, and ensure the
storage_dir usage at the other location (where storage_dir expects a directory)
remains consistent with this change.
- Around line 46-53: The storage_dir entry for provider_id
"meta-reference-files" currently uses
${env.SQLITE_STORE_DIR:=~/.llama/storage/files} which is inconsistent with the
repo's directory-vs-file env var convention; update this value to the
directory-specific env var used elsewhere (for example replace SQLITE_STORE_DIR
with the directory-specific name, e.g. SQLITE_STORE_DIR_DIR or the agreed-upon
directory env var) so storage_dir consistently expects a directory path.
- Line 16: Remove the trailing whitespace after the list item "- vector_io" in
the YAML so the line ends exactly with "- vector_io" (no extra spaces or tabs);
update the file to trim trailing whitespace on that line and run a
YAML/whitespace linter or git diff to confirm no trailing characters remain.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 54-58: The config disables safety via "safety: []" but still
registers a shield and sets "safety.default_shield_id" which creates an
inconsistent state; fix by removing or commenting out the shield entry under
"registered_resources.shields" and clearing or commenting
"safety.default_shield_id" (or set it to null) so no shield is referenced while
safety is disabled, and add a clear inline warning comment near "safety: []"
explaining why shields are disabled to prevent accidental re-enablement (a sketch of this option follows at the end of this list).
- Around line 154-158: The config references default_embedding_model
(provider_id sentence-transformers, model_id nomic-ai/nomic-embed-text-v1.5) but
no embedding model is registered under registered_resources.models; add a model
entry in registered_resources.models that registers this embedding model using
the same provider_id and model_id so the RAG/vector_stores flow can resolve it
(ensure the entry key/name matches any expected reference and use the symbols
registered_resources.models and default_embedding_model to locate where to add
the registration).
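
A minimal sketch of the "disable shields consistently" option described above for run-rhelai.yaml; the section names follow the keys quoted in these comments, and the exact placement in the file is assumed:

safety:
  # No default shield while the safety providers list is empty
  default_shield_id: null
registered_resources:
  shields: []

Re-enabling shields later would mean restoring both the inline::llama-guard entry under providers.safety and the matching shields registration.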
🧹 Nitpick comments (4)
tests/e2e/configs/run-azure.yaml (1)

31-44: Consider environment variable for Azure api_base.

The Azure endpoint https://ols-test.openai.azure.com/ is hardcoded while api_key uses an environment variable. For consistency and flexibility across environments, consider templating the base URL:

♻️ Optional: Use environment variable for api_base
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
       api_version: 2024-02-15-preview
tests/e2e/configs/run-ci.yaml (1)

60-63: Use environment variable for braintrust API key.

The hardcoded '********' placeholder will cause braintrust scoring to fail if invoked. Consider using environment variable substitution for consistency with other API keys in this config.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
examples/azure-run.yaml (1)

34-37: Hardcoded Azure endpoint in example configuration.

The api_base is hardcoded to a specific test endpoint (https://ols-test.openai.azure.com/). For an example configuration, consider using an environment variable to make it more reusable.

Suggested fix
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE}
       api_version: 2024-02-15-preview
run.yaml (1)

60-63: Use environment variable for braintrust API key.

Same issue as run-ci.yaml: the placeholder '********' should be replaced with ${env.OPENAI_API_KEY} for consistency and functionality.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc2804 and f2b7f5e.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
  • .github/workflows/e2e_tests_providers.yaml
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-granite-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/e2e/features/faiss.feature
  • docker-compose.yaml
  • docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (2)
tests/e2e/test_list.txt

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain test list in tests/e2e/test_list.txt

Files:

  • tests/e2e/test_list.txt
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Check pyproject.toml for supported Python versions before development
Always check pyproject.toml for existing dependencies before adding new ones
Always verify current library versions in pyproject.toml rather than assuming versions
Use pylint with source-roots = "src" configuration

Files:

  • pyproject.toml
🧠 Learnings (9)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai.yaml
  • examples/azure-run.yaml
  • examples/vllm-rhaiis.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-ci.yaml
  • examples/vertexai-run.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • examples/vllm-rhelai.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-vertexai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • pyproject.toml
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (30)
pyproject.toml (1)

63-63: Add an explanatory comment for consistency with other dependencies.

The version einops>=0.8.1 is correct (latest release published Feb 9, 2025). However, unlike all other dependencies in this file, einops lacks an explanatory comment. Add a comment clarifying its purpose—likely related to FAISS/RAG vectorization support given the PR context and the project's default RAG configuration using FAISS.

examples/vllm-rhelai.yaml (5)

53-57: Safety providers disabled — verify this is intentional for the example.

The safety providers are disabled with a comment about an "infinite loop issue with LLM calls." While this is documented, shipping an example configuration with safety disabled could lead users to run unprotected deployments.

Consider:

  1. Adding a more prominent warning in documentation about this limitation.
  2. Tracking this as a known issue to be resolved before GA.

Is there a tracking issue for the infinite loop problem mentioned in the comment?


66-72: LGTM!

The FAISS vector_io provider configuration looks correct with appropriate persistence settings using the kv_default backend.
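
For context, such a block in these configs looks roughly like the sketch below; the namespace value is an assumption, while the kv_default backend name comes from the descriptions in this review:

  vector_io:
  - provider_id: faiss
    provider_type: inline::faiss
    config:
      persistence:
        # FAISS index state is persisted through the shared kv_default backend
        namespace: vector_io::faiss
        backend: kv_default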


73-110: LGTM!

The agents, batches, datasetio, and eval provider configurations are well-structured with appropriate namespace organization and backend references.


153-157: LGTM!

The vector_stores configuration correctly sets FAISS as the default provider with the nomic embedding model, aligning with the PR objectives for full FAISS support.
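
Spelled out, the defaults being referenced are along these lines (values mirror the ones named in these comments):

vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5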


160-161: LGTM!

Telemetry is enabled without an explicit provider block, which is the correct approach for Llama Stack version 0.3.x based on learnings.

examples/vllm-rhaiis.yaml (5)

1-15: LGTM on basic structure.

The version and APIs configuration looks appropriate for the FAISS VectorIO provider support. Minor note: Line 15 has trailing whitespace that could be cleaned up.


66-72: FAISS vector_io configuration looks correct.

The inline FAISS provider is properly configured with persistence settings using the kv_default backend. This aligns well with the PR objective of fully supporting the FAISS VectorIO provider.


114-121: Same environment variable conflict applies to storage backends.

Lines 118 and 121 have the same SQLITE_STORE_DIR pattern issue flagged earlier. Both kv_default and sql_default backends would point to the same path if the env var is set.


143-146: Shield naming may be misleading.

The shield is named llama-guard but uses openai/gpt-4o-mini as the backend (provider_shield_id). This could confuse users expecting Meta's Llama Guard safety model. Consider renaming to reflect the actual backend, or add a comment clarifying this is a placeholder configuration.

Per the PR description mentioning "revoked support for shields due to unusability"—is this configuration intentionally a workaround, or should it be removed/commented out?


160-161: Telemetry configuration is correct.

The simplified telemetry configuration with just enabled: true aligns with Llama Stack 0.3.x requirements where provider blocks are not needed for telemetry. Based on learnings from this repository.

tests/e2e/configs/run-rhaiis.yaml (4)

144-147: Verify provider_shield_id for llama-guard shield.

The shield is named llama-guard but uses openai/gpt-4o-mini as the provider_shield_id. This appears to route safety checks through OpenAI's API rather than a Llama Guard model. Please confirm this is intentional for the e2e test configuration.


67-73: FAISS vector_io provider configuration looks correct.

The FAISS provider is properly configured with kv_default persistence backend, which is defined in the storage section. This aligns with the PR objective to fully support the FAISS VectorIO provider.


154-158: Vector stores configuration is consistent with provider definitions.

The default embedding model correctly references the sentence-transformers provider defined at lines 43-45, and FAISS is properly set as the default vector_io provider.


161-162: Telemetry configuration is appropriate for Llama Stack 0.3.x.

Based on learnings, telemetry provider configuration is not supported under the providers section in Llama Stack 0.3.x, so enabling it with just enabled: true is correct.

tests/e2e/test_list.txt (1)

1-1: LGTM!

The new features/faiss.feature entry correctly maintains the test list as per the coding guidelines, enabling FAISS e2e test discovery for the new vector store functionality.

examples/vllm-rhoai.yaml (3)

1-27: Configuration structure looks good with environment-driven paths.

The centralized storage layout with ${env.SQLITE_STORE_DIR:=~/.llama/storage/...} templating provides good flexibility for deployment environments. The separation of conversations, inference, and metadata stores is well-organized.


66-72: FAISS vector_io provider correctly configured.

The FAISS provider with kv_default backend persistence aligns with the PR's objective to fully support the FAISS VectorIO provider.


160-161: Review comment contains factual errors and incorrect assumptions.

The file names are incorrect: run-azure.yaml and run-vertexai.yaml do not exist; the actual files are azure-run.yaml and vertexai-run.yaml. More importantly, the claim is inaccurate: only azure-run.yaml includes telemetry in its apis list (line 12), while vertexai-run.yaml does not.

Additionally, based on Llama Stack 0.3.x documentation, telemetry provider configuration is not supported under the providers section—telemetry is enabled via telemetry.enabled: true without requiring an explicit apis entry. The vllm-rhoai.yaml configuration is following the correct pattern. The inconsistency with azure-run.yaml warrants investigation, but it does not indicate that vllm-rhoai.yaml is misconfigured.

Likely an incorrect or invalid review comment.

tests/e2e/configs/run-vertexai.yaml (2)

31-44: Vertex AI inference provider configuration looks correct.

The provider setup with project/location environment variables and the allowed_models restriction to google/gemini-2.5-flash is appropriate for controlled e2e testing.


149-157: Vector stores and telemetry configuration is consistent.

The default FAISS provider with nomic-ai/nomic-embed-text-v1.5 embedding model aligns with the centralized vector store pattern across the PR. Based on learnings, telemetry.enabled: true without a provider block is the correct approach for Llama Stack 0.3.x.

tests/e2e-prow/rhoai/configs/run.yaml (2)

1-162: Configuration structure is well-organized for RHOAI prow testing.

The comprehensive provider setup with FAISS vector_io, environment-driven storage paths, and centralized backend configuration aligns with the PR's objectives.


139-143: The provider_model_id: null value is intentional for RHOAI vllm configurations. This pattern is consistently used across both the test configuration and the official example configuration (examples/vllm-rhoai.yaml), while other vllm environments (RHELAI, RHAIIS) use environment variables instead. RHOAI vllm is a supported provider in LCS, and the null value appears to be the expected configuration for this environment.

tests/e2e/configs/run-azure.yaml (1)

66-72: FAISS vector_io configuration is consistent across environments.

The FAISS provider setup matches the pattern used in vertexai and rhoai configs, ensuring consistent behavior across different inference backends.

examples/vertexai-run.yaml (3)

4-14: telemetry API not listed but telemetry.enabled is set.

Similar to examples/vllm-rhoai.yaml, the apis list here omits telemetry while the test configs (run-azure.yaml, run-vertexai.yaml) explicitly include it. Verify whether this inconsistency between examples and test configs is intentional.


30-43: Vertex AI provider configuration is appropriate for the example.

The allowed_models restriction and environment-driven project/location settings provide a clear reference implementation.


148-156: Vector stores and safety defaults are correctly configured.

The FAISS default provider and nomic-ai/nomic-embed-text-v1.5 embedding model provide a complete vector store setup aligned with the PR objectives.

docs/providers.md (2)

103-103: Documentation update looks correct.

The change from ✅ to ❌ for llama-guard aligns with the PR description stating shields support was revoked due to unusability issues.


160-160: LGTM: rag-runtime now marked as supported.

This correctly reflects the FAISS VectorIO provider support being added in this PR.

run.yaml (1)

163-164: Telemetry configuration follows correct pattern.

Per retrieved learnings, telemetry in Llama Stack 0.3.x should use telemetry.enabled: true without a provider block. This is correctly implemented.

Comment on lines 4 to +5
 on:
-  schedule:
-    - cron: "0 0 * * *" # Runs once a day at midnight UTC
-  workflow_dispatch:
+  push
⚠️ Potential issue | 🟠 Major

Reconsider removing workflow_dispatch and adding branch filtering.

The bare push trigger will run on every push to any branch, which may cause excessive CI usage. Additionally:

  1. Removing workflow_dispatch eliminates the ability to manually trigger tests for debugging
  2. Removing the cron schedule means no proactive detection of external breakages (e.g., provider API changes, credential expiry)
  3. No pull_request trigger means these tests won't validate PRs before merge

Consider a more targeted configuration:

🔧 Suggested trigger configuration
 on: 
-  push
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+  workflow_dispatch:
🤖 Prompt for AI Agents
In @.github/workflows/e2e_tests_providers.yaml around lines 4 - 5, The workflow
currently uses a bare "on: push" which triggers on every branch and removed
manual dispatch, PR validation, and scheduled runs; update the "on" section to a
mapping that includes "push" with "branches" (e.g., ["main","release"]) to limit
pushes, add "pull_request" with the same "branches" to validate PRs, re-add
"workflow_dispatch" to allow manual runs, and include a "schedule" entry (cron)
to run periodic E2E checks; modify the YAML keys "on", "push", "pull_request",
"workflow_dispatch", and "schedule" accordingly so the workflow is targeted, can
be manually triggered, validates PRs, and runs on a cron.

Comment on lines 137 to 152
 registered_resources:
   models:
   - model_id: gpt-4o-mini
-    model_type: llm
     provider_id: azure
-    provider_model_id: gpt-4o-mini
+    model_type: llm
+    provider_model_id: gpt-4o-mini
   shields:
   - shield_id: llama-guard
     provider_id: llama-guard
     provider_shield_id: openai/gpt-4o-mini
   datasets: []
   scoring_fns: []
   benchmarks: []
   tool_groups:
   - toolgroup_id: builtin::rag
     provider_id: rag-runtime
⚠️ Potential issue | 🟠 Major

Missing embedding model registration.

Same issue as other configs: default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but no embedding model is registered. Add the embedding model to registered_resources.models.

Register the embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:

Also applies to: 153-157

🤖 Prompt for AI Agents
In @examples/azure-run.yaml around lines 137 - 152, The config references
default_embedding_model = "nomic-ai/nomic-embed-text-v1.5" but that embedding
model is not present in registered_resources.models; add a models entry for the
embedding (e.g., a model object with model_id matching the referenced name or a
chosen id, provider_id set to the correct provider for Nomic, model_type set to
"embedding", and provider_model_id set to "nomic-ai/nomic-embed-text-v1.5") so
default_embedding_model points to a registered model; ensure the same fix is
applied for the other occurrence noted (lines 153-157).
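
For reference, a sketch of how the registration pairs with the default_embedding_model reference elsewhere in the same file; the provider_id and model_id must match between the two blocks. The 768 embedding dimension follows the suggestion above and should be verified against the model card.

registered_resources:
  models:
  - model_id: nomic-ai/nomic-embed-text-v1.5
    model_type: embedding
    provider_id: sentence-transformers
    provider_model_id: nomic-ai/nomic-embed-text-v1.5
    metadata:
      embedding_dimension: 768  # assumed; confirm against the model card
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5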

- config:
excluded_categories: []
provider_id: llama-guard
provider_type: inline::llama-guard scoring:

⚠️ Potential issue | 🔴 Critical

Critical YAML syntax error: missing newline breaks configuration.

Line 58 has provider_type: inline::llama-guard scoring: — the scoring: key is incorrectly placed on the same line. This will cause a YAML parse error and prevent the configuration from loading.

Fix the syntax error
   safety:
   - config:
       excluded_categories: []
     provider_id: llama-guard
-    provider_type: inline::llama-guard  scoring:
+    provider_type: inline::llama-guard
+  scoring:
   - provider_id: basic
     provider_type: inline::basic
🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

🤖 Prompt for AI Agents
In @examples/run.yaml at line 58, The YAML has a syntax error where the keys
"provider_type" and "scoring" are on the same line; update the config so
"provider_type: inline::llama-guard" and "scoring:" are on separate lines
(ensure proper indentation for the "scoring" block), so the YAML parser can
correctly recognize both keys.

Comment on lines +119 to +120
telemetry:
enabled: true

⚠️ Potential issue | 🟡 Minor

Duplicate telemetry block.

The telemetry section appears twice: at lines 119-120 and again at lines 165-166. This is likely a copy-paste error. Remove the duplicate.

Remove duplicate telemetry block
 safety:
   default_shield_id: llama-guard
-telemetry:
-  enabled: true

Also applies to: 165-166

🤖 Prompt for AI Agents
In @examples/run.yaml around lines 119 - 120, The YAML contains two identical
top-level telemetry blocks; remove the duplicate so only a single "telemetry:"
mapping remains at the top level, deleting the extra "telemetry: enabled: true"
occurrence and ensuring surrounding indentation/spacing keeps the YAML valid.

Comment on lines +146 to +157
registered_resources:
models: []
shields:
- shield_id: llama-guard
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini
datasets: []
scoring_fns: []
benchmarks: []
tool_groups:
- toolgroup_id: builtin::rag # Register the RAG tool
provider_id: rag-runtime

⚠️ Potential issue | 🟠 Major

Empty models list but embedding model referenced.

registered_resources.models is empty ([]), but default_embedding_model references nomic-ai/nomic-embed-text-v1.5. Register at least the embedding model for RAG to function.

Register required models
 registered_resources:
-  models: []
+  models:
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
🤖 Prompt for AI Agents
In @examples/run.yaml around lines 146 - 157, The YAML registers no models while
RAG references an embedding model; add the embedding model to
registered_resources.models so default_embedding_model matches a registered
model (e.g., add an entry with model_id like "nomic-embed-text-v1.5" or similar,
provider_id (e.g., "nomic-ai"), and provider_model_id set to
"nomic-ai/nomic-embed-text-v1.5"); ensure the model_id you add is the same
identifier used by default_embedding_model and that the toolgroup entry
(toolgroup_id: builtin::rag) can resolve that model.

Comment on lines +18 to +28
conversations_store:
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
type: sqlite
datasets: []
# external_providers_dir: /opt/app-root/src/.llama/providers.d
inference_store:
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
type: sqlite
metadata_store:
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
type: sqlite

⚠️ Potential issue | 🟠 Major

Environment variable SQLITE_STORE_DIR used inconsistently across database paths.

All db_path entries use ${env.SQLITE_STORE_DIR:=<default>} where the env var would completely replace the path. If a user sets SQLITE_STORE_DIR, all databases (conversations, inference-store, registry) would point to the same file, causing data corruption.

Additionally, Line 51 uses the same env var for storage_dir (a directory), while these lines use it for db_path (a file).

Consider using the env var as a base directory with filenames appended:

Proposed fix
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
   type: sqlite
 datasets: []
 # external_providers_dir: /opt/app-root/src/.llama/providers.d
 inference_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
   type: sqlite
 metadata_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
   type: sqlite
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhaiis.yaml around lines 18 - 28, The three db_path
entries (conversations_store.db_path, inference_store.db_path,
metadata_store.db_path) incorrectly treat SQLITE_STORE_DIR as a full filepath
and thus will point all databases to the same file when the env var is set;
update each db_path to use SQLITE_STORE_DIR as a base directory and append
distinct filenames (e.g., conversations.db, inference-store.db, registry.db) so
each DB has its own file, and ensure the storage_dir usage at the other location
(where storage_dir expects a directory) remains consistent with this change.
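
To make the distinction concrete, a sketch of how the directory-based pattern resolves; /data/sqlite is a hypothetical override, not a value set anywhere in this repository.

# With SQLITE_STORE_DIR unset, the defaults expand to separate files:
#   ~/.llama/storage/conversations.db
#   ~/.llama/storage/inference-store.db
#   ~/.llama/storage/registry.db
# With SQLITE_STORE_DIR=/data/sqlite (hypothetical), the files stay distinct:
#   /data/sqlite/conversations.db
#   /data/sqlite/inference-store.db
#   /data/sqlite/registry.db
conversations_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
  type: sqlite
inference_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
  type: sqlite
metadata_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
  type: sqlite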

Comment on lines 46 to +53
files:
- provider_id: meta-reference-files
provider_type: inline::localfs
config:
storage_dir: ${env.FILES_STORAGE_DIR:=~/.llama/storage}
- config:
metadata_store:
table_name: files_metadata
backend: sql_default
storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage/files}
provider_id: meta-reference-files
provider_type: inline::localfs

⚠️ Potential issue | 🟠 Major

Files provider storage_dir uses the same env var inconsistently.

This line expects a directory while other usages expect file paths. If the env var pattern is changed as suggested above, this line should also be updated for consistency.

Proposed fix (aligned with directory-based env var)
   files:
   - config:
       metadata_store:
         table_name: files_metadata
         backend: sql_default
-      storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage/files}
+      storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/files
     provider_id: meta-reference-files
     provider_type: inline::localfs
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhaiis.yaml around lines 46 - 53, The storage_dir
entry for provider_id "meta-reference-files" currently uses
${env.SQLITE_STORE_DIR:=~/.llama/storage/files} which is inconsistent with the
repo's directory-vs-file env var convention; update this value to the
directory-specific env var used elsewhere (for example replace SQLITE_STORE_DIR
with the directory-specific name, e.g. SQLITE_STORE_DIR_DIR or the agreed-upon
directory env var) so storage_dir consistently expects a directory path.

Comment on lines 115 to +122
storage:
backends:
kv_default:
type: kv_sqlite
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
sql_default:
type: sql_sqlite
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db
db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}

⚠️ Potential issue | 🟠 Major

Same SQLITE_STORE_DIR env var issue in storage backends.

The kv_default and sql_default backends have the same issue where setting SQLITE_STORE_DIR would cause both to point to the same file.

Proposed fix
 storage:
   backends:
     kv_default:
       type: kv_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/rag/kv_store.db
     sql_default:
       type: sql_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhaiis.yaml around lines 115 - 122, Both storage
backends kv_default and sql_default use the same env var SQLITE_STORE_DIR so
setting it makes them point to the same DB file; change the db_path entries so
each backend uses a distinct env var or distinct filename (e.g.,
SQLITE_KV_STORE_DIR for kv_default and SQLITE_SQL_STORE_DIR for sql_default, or
keep SQLITE_STORE_DIR but append different filenames like
${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db and .../sql_store.db) so
kv_default.db_path and sql_default.db_path resolve to different files.

Comment on lines +54 to +58
safety: [] # WARNING: Shields disabled due to infinite loop issue with LLM calls
# - config:
# excluded_categories: []
# provider_id: llama-guard
# provider_type: inline::llama-guard

⚠️ Potential issue | 🟡 Minor

Inconsistent safety configuration: shield registered but provider disabled.

The safety provider is explicitly disabled (safety: []) due to an infinite loop issue, but:

  • A shield is still registered in registered_resources.shields (lines 144-147)
  • safety.default_shield_id is set to llama-guard (line 160)

This inconsistency may cause confusing behavior or runtime errors if safety checks are triggered. Consider either:

  1. Commenting out the shield registration and default_shield_id while safety is disabled
  2. Adding a more prominent warning in the config
Suggested alignment
 registered_resources:
   models:
   - model_id: ${env.RHEL_AI_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHEL_AI_MODEL}
-  shields:
-  - shield_id: llama-guard
-    provider_id: llama-guard
-    provider_shield_id: openai/gpt-4o-mini
+  shields: [] # Disabled - see safety provider comment
   datasets: []
...
 safety:
-  default_shield_id: llama-guard
+  default_shield_id: null # Disabled - see safety provider comment

Also applies to: 144-147, 159-160

🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhelai.yaml around lines 54 - 58, The config disables
safety via "safety: []" but still registers a shield and sets
"safety.default_shield_id" which creates an inconsistent state; fix by removing
or commenting out the shield entry under "registered_resources.shields" and
clearing or commenting "safety.default_shield_id" (or set it to null) so no
shield is referenced while safety is disabled, and add a clear inline warning
comment near "safety: []" explaining why shields are disabled to prevent
accidental re-enablement.

Comment on lines 154 to +158
vector_stores:
default_provider_id: documentation_faiss
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: all-mpnet-base-v2
model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🟠 Major

Missing embedding model registration.

default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but no embedding models are registered in registered_resources.models. The sentence-transformers provider is configured (lines 43-45), but without a registered embedding model, RAG operations may fail.

Register the embedding model
 registered_resources:
   models:
   - model_id: ${env.RHEL_AI_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHEL_AI_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhelai.yaml around lines 154 - 158, The config
references default_embedding_model (provider_id sentence-transformers, model_id
nomic-ai/nomic-embed-text-v1.5) but no embedding model is registered under
registered_resources.models; add a model entry in registered_resources.models
that registers this embedding model using the same provider_id and model_id so
the RAG/vector_stores flow can resolve it (ensure the entry key/name matches any
expected reference and use the symbols registered_resources.models and
default_embedding_model to locate where to add the registration).
