
Conversation


@are-ces are-ces commented Dec 18, 2025

Description

  • Added a small dummy RAG DB for e2e tests.
  • Created e2e tests for FAISS RAG (making sure the retrieved chunks come from the docs in the DB).
  • Added new required dependencies (pyproject.toml).
  • Updated docs/providers; revoked support for shields, as they are currently unusable.
  • Updated configs in the examples folder; removed old unsupported configs (lls 0.2.x format).

Test:
Ask the agent "What is the title of the article from Paul?", expected answer: "great work"

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

NA

Related Tickets & Documents

  • Related Issue # LCORE-603
  • Closes # LCORE-603

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • FAISS end-to-end tests added; telemetry API added and enabled by default.
  • Improvements

    • Large config reorganization: centralized storage/backends, unified providers, consolidated resource registry; updated example manifests and safety/provider defaults.
  • Chores

    • Added local storage mount for RAG test data and updated test list entries.
  • Dependencies

    • Added einops dependency.
  • Documentation

    • Updated provider status indicators in docs.



coderabbitai bot commented Dec 18, 2025

Walkthrough

Centralized and reorganized many YAML manifests into a storage-backed, provider-centric layout; added RAG volume mounts to docker-compose files; removed several obsolete example manifests; added FAISS e2e tests and updated dependency and provider status in docs.

Changes

Cohort / File(s) Summary
Docker Compose Configuration
docker-compose-library.yaml, docker-compose.yaml
Added RAG volume mount: ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z.
Documentation
docs/providers.md
Updated provider status entries: llama-guard and rag-runtime statuses changed.
Deleted Example Configurations
Examples removed
examples/gemini-run.yaml, examples/openai-faiss-run.yaml, examples/openai-pgvector-run.yaml, examples/vllm-granite-run.yaml, examples/vllm-llama-faiss-run.yaml, examples/vllm-qwen3-run.yaml
Removed multiple obsolete example manifests.
Reworked Example Configurations
examples/azure-run.yaml, examples/run.yaml, examples/vertexai-run.yaml, examples/vllm-rhaiis.yaml, examples/vllm-rhelai.yaml, examples/vllm-rhoai.yaml
Reorganized to centralized storage.backends/storage.stores, unified providers block, added registered_resources (models, shields, vector_stores, tool_groups), env-templated db paths, telemetry and tool_runtime defaults.
E2E CI / Runner Configs
tests/e2e/configs/*, tests/e2e-prow/rhoai/configs/run.yaml, tests/e2e/configs/run-*.yaml
Applied provider/storage reorg to CI/e2e configs, added telemetry, vector_store defaults, updated storage paths/namespaces, aligned providers (OpenAI/azure/faiss/nomic embeddings).
E2E Tests & Test List
tests/e2e/features/faiss.feature, tests/e2e/test_list.txt
Added FAISS integration feature (verify vector store registration; query vector DB via file_search) and appended test to list.
Dependencies
pyproject.toml
Added dependency: einops>=0.8.1.

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
  • radofuchs
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title clearly and specifically describes the main objective: adding full support for the Faiss VectorIO provider, which aligns with the primary changes across configuration files, documentation, and end-to-end tests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.





@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🧹 Nitpick comments (6)
examples/vllm-rhaiis-run.yaml (1)

33-37: Hardcoded port 8000 in URL.

The URL hardcodes port 8000 (http://${env.RHAIIS_URL}:8000/v1/), unlike the RHEL AI config, which uses ${env.RHEL_AI_PORT}. Consider using an environment variable for the port to maintain deployment flexibility:

url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
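
For context, a minimal sketch of how the parameterized port could sit inside the inference provider block (the provider_id value and the RHAIIS_PORT variable name are assumptions; the other fields mirror the RHAIIS config quoted later in this review):

  inference:
  - provider_id: vllm              # assumed id; use whatever this file already defines
    provider_type: remote::vllm
    config:
      # host comes from the environment; the port defaults to 8000 but stays overridable
      url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
      api_token: ${env.RHAIIS_API_KEY}
      tls_verify: false
      max_tokens: 2048
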
tests/e2e/features/faiss.feature (1)

8-27: Consider adding a note about the port difference.

This scenario correctly uses @skip-in-library-mode since it directly queries the llama-stack service on port 8321. The second scenario (lines 29-39) omits the hostname/port override, so it will use defaults to test through lightspeed-stack. This asymmetry is intentional but worth a brief inline comment for maintainability.

tests/e2e/configs/run-azure.yaml (1)

34-36: Consider parameterizing the Azure API base URL.

The api_base is hardcoded to a specific test endpoint. For flexibility across environments, consider using an environment variable with a default:

api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
examples/azure-run.yaml (2)

34-36: Consider using an environment variable for the API base URL.

The api_base is hardcoded while api_key uses an environment variable. For consistency and deployment flexibility across environments, consider parameterizing the base URL:

   config: 
     api_key: ${env.AZURE_API_KEY}
-    api_base: https://ols-test.openai.azure.com/
+    api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
     api_version: 2024-02-15-preview

111-116: Inconsistent storage path structure between backends.

kv_default uses ~/.llama/storage/rag/kv_store.db (includes /rag/ subfolder) while sql_default uses ~/.llama/storage/sql_store.db (no subfolder). This asymmetry may cause confusion. Consider aligning the path structure or documenting the rationale.
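
A minimal sketch of one possible aligned layout, assuming the storage.backends schema used in these configs (paths are illustrative only):

storage:
  backends:
    kv_default:
      type: kv_sqlite
      # both backends live under the same base directory, differing only in filename
      db_path: ~/.llama/storage/kv_store.db
    sql_default:
      # keep the type this file already declares for sql_default
      db_path: ~/.llama/storage/sql_store.db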

run.yaml (1)

119-125: Same storage path inconsistency as azure-run.yaml.

The kv_default backend uses ~/.llama/storage/rag/kv_store.db while sql_default uses ~/.llama/storage/sql_store.db. The comment on line 120 explains that registry and RAG are unified in kv_default, which provides context. However, consider whether the /rag/ subdirectory name is still appropriate given it stores more than just RAG data.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96888ca and 85c9528.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/providers.md (2 hunks)
  • examples/azure-run.yaml (1 hunks)
  • examples/gemini-run.yaml (0 hunks)
  • examples/openai-faiss-run.yaml (0 hunks)
  • examples/openai-pgvector-run.yaml (0 hunks)
  • examples/run.yaml (1 hunks)
  • examples/vertexai-run.yaml (4 hunks)
  • examples/vllm-granite-run.yaml (0 hunks)
  • examples/vllm-llama-faiss-run.yaml (0 hunks)
  • examples/vllm-qwen3-run.yaml (0 hunks)
  • examples/vllm-rhaiis-run.yaml (1 hunks)
  • examples/vllm-rhelai-run.yaml (1 hunks)
  • examples/vllm-rhoai-run.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • run.yaml (3 hunks)
  • tests/e2e-prow/rhoai/configs/run.yaml (4 hunks)
  • tests/e2e/configs/run-azure.yaml (4 hunks)
  • tests/e2e/configs/run-ci.yaml (3 hunks)
  • tests/e2e/configs/run-rhaiis.yaml (4 hunks)
  • tests/e2e/configs/run-rhelai.yaml (4 hunks)
  • tests/e2e/configs/run-vertexai.yaml (4 hunks)
  • tests/e2e/features/faiss.feature (1 hunks)
  • tests/e2e/test_list.txt (1 hunks)
💤 Files with no reviewable changes (6)
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
🧰 Additional context used
📓 Path-based instructions (3)
tests/e2e/test_list.txt

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain test list in tests/e2e/test_list.txt for end-to-end tests

Files:

  • tests/e2e/test_list.txt
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Configure pylint with source-roots = "src"
Exclude src/auth/k8s.py from pyright type checking

Files:

  • pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/test_list.txt
🪛 YAMLlint (1.37.1)
tests/e2e/configs/run-rhaiis.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

tests/e2e/configs/run-rhelai.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (30)
pyproject.toml (1)

55-55: Consider using a flexible version constraint for urllib3 or explain the exact version pin.

urllib3 2.6.1 is the latest non-vulnerable version, but 2.6.2 is available as a bug fix. The exact version pin (==2.6.1) is unnecessarily restrictive. If there's no specific compatibility reason for avoiding 2.6.2, consider using >=2.6.1,<2.7 instead. If an exact pin is necessary, document the reason in a comment above the dependency.

The einops dependency is appropriate: einops 0.8.1 is the latest version with no known vulnerabilities.

⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Always check `pyproject.toml` for existing dependencies and versions before adding new ones
docs/providers.md (2)

103-103: Documentation accurately reflects the disabled llama-guard shield.

The change from ✅ to ❌ for llama-guard aligns with the configuration changes across the example files where shields are disabled with the comment about "infinite loop issue with LLM calls".


160-160: rag-runtime is now correctly marked as supported.

The rag-runtime provider is properly documented as supported (✅), which aligns with its configuration in all the new example YAML files.

examples/vllm-rhelai-run.yaml (1)

1-151: New RHEL AI configuration file is well-structured.

The configuration follows the consolidated provider pattern with:

  • Environment-driven paths and credentials
  • FAISS vector store with sentence-transformers embeddings
  • Properly disabled shields with documented reasoning
  • Consistent storage backend configuration
examples/vertexai-run.yaml (2)

30-67: Provider configuration is well-organized.

The inference providers are properly configured with:

  • Google Vertex AI remote provider with project/location from environment variables
  • Sentence-transformers for embeddings
  • FAISS vector store with persistence

132-141: Empty models array may need attention.

The registered_resources.models array is empty while the Vertex AI provider specifies allowed_models: ["google/gemini-2.5-flash"]. Verify if a model registration is needed here, or if the allowed_models configuration in the provider is sufficient for Vertex AI.
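
If explicit registration turns out to be necessary, a hedged sketch of such an entry (field names mirror the model registrations in the other example configs; the provider_id value here is an assumption and must match this file's inference provider):

registered_resources:
  models:
  - model_id: google/gemini-2.5-flash
    provider_id: vertexai            # assumption: use the provider_id from this file's inference block
    model_type: llm
    provider_model_id: google/gemini-2.5-flash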

examples/vllm-rhoai-run.yaml (2)

134-138: Verify the hardcoded model configuration and null provider_model_id.

Unlike other example configs that use environment variables (e.g., ${env.RHEL_AI_MODEL}), this file hardcodes model_id: meta-llama/Llama-3.2-1B-Instruct and sets provider_model_id: null.

Consider using environment variables for consistency:

- model_id: ${env.RHOAI_MODEL}
  provider_id: vllm
  model_type: llm
  provider_model_id: ${env.RHOAI_MODEL}

37-37: Different max_tokens value than other vLLM configs.

This config uses max_tokens: 1024 while vllm-rhelai-run.yaml and vllm-rhaiis-run.yaml use max_tokens: 2048. Verify if this is intentional for RHOAI deployments.

examples/vllm-rhaiis-run.yaml (1)

1-151: RHAIIS configuration follows the established pattern.

The configuration is well-structured and consistent with other example configurations in this PR.

tests/e2e/configs/run-rhelai.yaml (1)

1-27: Configuration header and store definitions look correct.

The version, image_name, APIs, and store configurations follow the established pattern with proper environment variable substitution.

tests/e2e-prow/rhoai/configs/run.yaml (1)

1-150: Configuration structure looks well-organized.

The consolidated provider configuration with environment-variable-backed paths is consistent. The explicit warnings about disabled shields (lines 49, 138) document the known issue clearly.

tests/e2e/configs/run-vertexai.yaml (2)

30-67: Provider configuration looks correct.

The Vertex AI inference provider is properly configured with project/location environment variables, and the FAISS vector_io provider follows the correct structure with persistence settings.


132-146: Registered resources and vector stores configuration is consistent.

Empty models list is appropriate here since allowed_models is specified in the provider configuration (line 36). The vector_stores defaults align with the FAISS provider setup.

tests/e2e/configs/run-ci.yaml (2)

53-74: Scoring and vector_io providers are well-configured.

The scoring providers include useful documentation comments (e.g., "Enable the RAG tool", "Define the storage backend for RAG"). The placeholder '********' for openai_api_key is appropriate for example configurations.


114-155: Telemetry and storage configuration additions look good.

The inline documentation (line 120, 153) explaining the storage backend purpose improves maintainability. The vector_stores configuration with FAISS defaults is consistent with the PR's objective.

examples/run.yaml (3)

1-6: Good documentation header.

The introductory comments clearly explain the configuration purpose and prerequisites (OpenAI API key, rag-content tool reference). This helps users understand how to use this example.


74-80: FAISS vector_io configuration is correct.

The persistence configuration with namespace: vector_io::faiss and backend: kv_default properly integrates with the storage backends defined below. This aligns with the PR's goal of fully supporting the FAISS VectorIO provider.


147-161: Registered resources and vector stores complete the FAISS integration.

The vector_stores section properly configures:

  • default_provider_id: faiss matching the vector_io provider
  • default_embedding_model using sentence-transformers with the nomic model

This provides a complete working example for FAISS-based RAG.

tests/e2e/test_list.txt (1)

1-1: LGTM!

The new faiss.feature is correctly added to the test list, following the coding guidelines for maintaining end-to-end tests in this file.

docker-compose-library.yaml (1)

16-16: LGTM!

The volume mount correctly exposes the test RAG database to the container, enabling FAISS-based e2e tests.

docker-compose.yaml (1)

14-14: LGTM!

Consistent with the corresponding mount in docker-compose-library.yaml, ensuring FAISS e2e tests work in both library and non-library modes.

tests/e2e/features/faiss.feature (1)

1-39: Feature file follows BDD conventions.

The Gherkin structure is correct and scenarios appropriately test FAISS vector store registration and retrieval. As per coding guidelines, this uses the behave framework with proper Given/When/Then steps.

tests/e2e/configs/run-azure.yaml (2)

48-52: Safety providers disabled with documented reason.

The commented-out safety configuration includes a clear warning about the infinite loop issue. This is good practice for documenting why a feature is disabled.


61-67: FAISS vector_io provider correctly configured.

The FAISS provider configuration with persistence to kv_default backend aligns with the storage configuration and the mounted RAG volume path.

examples/azure-run.yaml (3)

4-14: Missing telemetry API compared to run.yaml.

The run.yaml includes telemetry in the APIs list and has a telemetry: enabled: true section, but this Azure configuration omits both. If telemetry should be available in Azure deployments, add it for consistency.
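
A minimal sketch of the two additions, mirroring the pattern in run.yaml (no provider block is required for telemetry on Llama Stack 0.3.x, as discussed later in this thread):

# under the top-level apis list, add:
apis:
- telemetry        # alongside the existing entries

# and at the top level of the file:
telemetry:
  enabled: true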


48-52: Shields disabled with documented warning — acceptable for known issue.

The safety providers array is empty with a clear warning comment about the infinite loop issue. The corresponding shields: [] in registered_resources is consistent. This is appropriately documented.

Also applies to: 138-138


132-150: Registered resources and vector_stores configuration looks correct.

The configuration properly:

  • Registers the gpt-4o-mini model with the Azure provider
  • Sets up the RAG toolgroup with rag-runtime
  • Configures FAISS as the default vector store provider with nomic-embed-text-v1.5 as the embedding model
run.yaml (3)

31-39: Well-structured inference providers with environment-driven configuration.

The OpenAI provider correctly uses environment variables for both the API key and allowed models with sensible defaults. The inline sentence-transformers provider is appropriately configured for embedding support.


68-74: FAISS vector_io provider correctly configured for persistence.

The FAISS provider is properly set up with a persistence backend pointing to kv_default, which aligns with the storage configuration. This enables the RAG functionality as intended by the PR.


148-155: RAG toolgroup and vector_stores configuration complete.

The configuration properly:

  • Registers builtin::rag toolgroup with the rag-runtime provider
  • Sets FAISS as the default vector store provider
  • Configures nomic-ai/nomic-embed-text-v1.5 as the default embedding model via sentence-transformers

This completes the FAISS VectorIO provider setup required by the PR objectives.

run.yaml Outdated
Comment on lines 114 to 115
telemetry:
enabled: true

@coderabbitai coderabbitai bot Dec 18, 2025


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find run.yaml file
fd -n "run.yaml" --type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 306


🏁 Script executed:

#!/bin/bash
# Find run.yaml and check its structure
fd -type f -name "*.yaml" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 246


🏁 Script executed:

find . -name "run.yaml" -type f 2>/dev/null | head -5

Repository: lightspeed-core/lightspeed-stack

Length of output: 177


🏁 Script executed:

#!/bin/bash
# Search for run.yaml files
find . -type f -name "run.yaml" 2>/dev/null

# Also search for telemetry configuration patterns
rg -l "telemetry:" --type yaml 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 340


🏁 Script executed:

#!/bin/bash
# Locate run.yaml and examine structure around telemetry config
if [ -f "./run.yaml" ]; then
  echo "=== Found run.yaml ==="
  wc -l ./run.yaml
  echo "=== Lines 100-130 ==="
  sed -n '100,130p' ./run.yaml
  echo "=== Searching for providers section ==="
  grep -n "providers:" ./run.yaml || echo "No explicit providers section found"
elif [ -f "run.yaml" ]; then
  wc -l run.yaml
  sed -n '100,130p' run.yaml
  grep -n "providers:" run.yaml || echo "No explicit providers section found"
else
  find . -name "run.yaml" -type f 2>/dev/null -exec sh -c 'echo "=== Found: $1 ==="; wc -l "$1"; sed -n "100,130p" "$1"; grep -n "providers:" "$1" || echo "No providers section"' _ {} \;
fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 1099


🏁 Script executed:

#!/bin/bash
# Read the providers section
sed -n '1,80p' ./run.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 2155


🏁 Script executed:

#!/bin/bash
# Check entire telemetry section and nearby context
sed -n '110,125p' ./run.yaml
# Search for any backend/exporter configurations
grep -i "backend\|exporter\|endpoint" ./run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 1284


🏁 Script executed:

#!/bin/bash
# Check if telemetry is listed in apis and look for telemetry-related documentation
sed -n '1,20p' ./run.yaml
# Search for any telemetry configuration or documentation
rg -i "telemetry" --type yaml -B2 -A2 ./run.yaml | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 504


🏁 Script executed:

#!/bin/bash
# Search for telemetry-related documentation, READMEs, or code comments
find . -type f \( -name "README*" -o -name "*.md" -o -name "telemetry*" \) | head -20
rg -i "telemetry.*provider|telemetry.*backend|telemetry.*config" --type yaml --type python -B1 -A1 | head -30

Repository: lightspeed-core/lightspeed-stack

Length of output: 617


🏁 Script executed:

#!/bin/bash
# Search for telemetry documentation
rg -i "telemetry" docs/ -l 2>/dev/null | head -10
# Also check config documentation
grep -i "telemetry" README.md docs/config.md 2>/dev/null

Repository: lightspeed-core/lightspeed-stack

Length of output: 319


🏁 Script executed:

#!/bin/bash
# Check providers documentation for telemetry config
grep -i "telemetry" docs/providers.md -B2 -A5 | head -40
# Also check deployment guide
grep -i "telemetry" docs/deployment_guide.md -B2 -A5 | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 2149


🏁 Script executed:

#!/bin/bash
# Get more context from deployment_guide.md about telemetry configuration
sed -n '1,2000p' docs/deployment_guide.md | grep -i "telemetry" -B5 -A10 | head -60

Repository: lightspeed-core/lightspeed-stack

Length of output: 2874


🏁 Script executed:

#!/bin/bash
# Check other run.yaml examples to see how telemetry should be configured
echo "=== Checking examples/run.yaml ==="
grep -A5 "telemetry:" examples/run.yaml | head -20
echo ""
echo "=== Checking tests/configuration/run.yaml ==="
grep -A5 "telemetry:" tests/configuration/run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 391


🏁 Script executed:

#!/bin/bash
# Check if there are any code comments or documentation about when telemetry provider config is required
rg -i "telemetry.*provider|telemetry.*config|enabled.*true" docs/ -B2 -A2 | head -40
# Also check the main config documentation
head -100 docs/config.md

Repository: lightspeed-core/lightspeed-stack

Length of output: 4089


Add telemetry provider configuration to match test and documentation patterns.

Telemetry is enabled but lacks an explicit provider configuration under the providers section, unlike other APIs (inference, safety, scoring, etc.). The test configuration at tests/configuration/run.yaml shows the proper pattern with providers.telemetry using the meta-reference provider. Either add explicit telemetry provider config or document why using a default backend is acceptable.

🤖 Prompt for AI Agents
In run.yaml around lines 114 to 115, telemetry is enabled but missing an
explicit providers.telemetry entry; add a providers.telemetry block mirroring
the pattern used by inference/safety/scoring (e.g., include a telemetry provider
key such as meta-reference with its required connection/config fields used in
tests) so telemetry has an explicit provider configuration, or update docs to
justify relying on a default backend—ensure the YAML structure follows the
existing providers.<provider> pattern used elsewhere.

Contributor Author


provider config for telemetry is not supported in 0.3.x

Contributor


@are-ces, thank you for the clarification! I've noted that telemetry provider configuration is not supported in version 0.3.x, which explains why the configuration is valid as-is with just the enabled: true flag.


✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.


Comment on lines 146 to 150
vector_stores:
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: nomic-ai/nomic-embed-text-v1.5
Contributor


⚠️ Potential issue | 🟡 Minor

Vector stores configuration uses v1.5, though v2 is now available as a newer alternative.

The FAISS and sentence-transformers configuration is properly set. However, Nomic Embed Text V2 is now available for production workloads as a successor to v1.5, offering multilingual support if needed. Additionally, ensure that when embeddings are generated at runtime, the text prompt includes a task instruction prefix (e.g., search_document: or search_query: for RAG applications) to align with the model's requirements.

🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 146 to 150, the vector store
config pins the embedding model to nomic-ai/nomic-embed-text-v1.5; update
model_id to the v2 release (e.g., nomic-ai/nomic-embed-text-v2) to use the newer
multilingual production model and adjust any downstream compatibility settings
if required, and ensure that runtime embedding calls prepend the appropriate
task instruction prefix (for example "search_document:" or "search_query:") to
the text passed to the embedding API so the model receives the required task
context for RAG/search use cases.

Comment on lines +31 to +32
And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
When I use "query" to ask question with authorization header
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n tests/e2e/features/faiss.feature | sed -n '30,35p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 542


Complete the JWT token with all three segments.

A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots. The token on line 31 is missing the signature segment and the payload appears truncated. Provide a complete, valid JWT token with all three parts (header.payload.signature).

🤖 Prompt for AI Agents
In tests/e2e/features/faiss.feature around lines 31 to 32, the Authorization
Bearer token is incomplete (missing the third signature segment and has a
truncated payload); replace that string with a well-formed JWT consisting of
three Base64url-encoded segments separated by dots (header.payload.signature).
Ensure the token used in the test is complete and valid for your test
environment (either a real signed token or a locally generated test token with
matching server validation), and update the line to include the full token
string.

@are-ces are-ces force-pushed the faiss-support-final branch from 160b644 to ca83927 on December 18, 2025 at 10:32

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @tests/e2e/configs/run-ci.yaml:
- Around line 60-63: The Braintrust provider config currently hardcodes
openai_api_key: '********' which will fail at runtime; update the provider
config for provider_type inline::braintrust to read the API key from an
environment variable (e.g., using the config expression or secret interpolation
used elsewhere in CI configs) instead of the literal '********', ensuring the
provider_id/provider_type block uses the env var name so the runtime picks up
the real key.
- Around line 160-164: registered_resources has malformed YAML: move the model
entries currently under the top-level list into a nested models: mapping and
ensure shields: [] remains a key with a models: list beneath it; specifically,
under registered_resources->shields keep "shields: []" as an empty sequence or
mapping and create a "models:" key containing entries with "provider_id:" and
"model_id:" (e.g., add an entry for provider_id: sentence-transformers and
model_id: nomic-ai/nomic-embed-text-v1.5) so the default_embedding_model
referenced in vector_stores is registered; validate indentation and YAML
structure so the models are children of the models: key, not sibling keys at the
same level as shields.
🧹 Nitpick comments (2)
tests/e2e/configs/run-ci.yaml (2)

48-52: Track the shields/safety infinite loop issue.

Safety providers and shields are disabled due to an "infinite loop issue with LLM calls". While this may be acceptable for E2E testing, ensure this known issue is tracked and prioritized for resolution before production deployment.

Is there an open issue tracking the shields infinite loop problem? If not, would you like me to help create one?


120-120: Consider breaking the long comment into multiple lines.

The comment explaining the unified storage backend is informative but spans over 100 characters, which impacts readability.

♻️ Suggested refactor
-    kv_default: # Define the storage backend type for RAG, in this case registry and RAG are unified i.e. information on registered resources (e.g. models, vector_stores) are saved together with the RAG chunks
+    kv_default:
+      # Define the storage backend type for RAG. In this case, registry and RAG are unified,
+      # i.e., information on registered resources (e.g., models, vector_stores) are saved
+      # together with the RAG chunks.
       type: kv_sqlite
       db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca83927 and c80a78d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • tests/e2e/configs/run-ci.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: server mode / vertexai
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: library mode / vertexai
🔇 Additional comments (2)
tests/e2e/configs/run-ci.yaml (2)

31-36: LGTM: Inference provider properly configured.

The OpenAI inference provider configuration correctly uses environment variables for the API key and model selection, with a sensible default for E2E testing.


68-74: LGTM: FAISS vector_io provider properly configured.

The FAISS provider configuration correctly references the kv_default backend for persistence, which aligns with the PR objective to fully support FAISS.


@tisnik tisnik left a comment


I'm not sure if the deletion of "old" examples should be part of this PR. Otherwise looks OK, thank you very much.

And REST API service prefix is /v1

@skip-in-library-mode
Scenario: Verify vector store is registered
Contributor


This scenario tests llama-stack behavior. It was useful for debugging and could be added as a single step to make sure the vector_store exists, but the entire test should not be there.

Contributor


I think that this should be moved to info.feature (as stated above).

Contributor Author


@radofuchs and I had a discussion about this. His suggestion was to remove the llama-stack test scenario entirely and move its functionality into the other FAISS scenario; my solution has instead been to test our /rags endpoint rather than the llama-stack functionality.

Contributor Author


IMO faiss.feature is the right place, because when testing FAISS we want to make sure that llama-stack is able to register the FAISS vector store.

Contributor


There is a different issue, though, now that the rags endpoint is tested there: the tests for rags need to cover more than just the 200 case. We also need to test that it fails when llama-stack is down, as well as the auth errors. So I would move these tests to info.feature to cover all of these there.

@are-ces are-ces force-pushed the faiss-support-final branch from c80a78d to 2993407 on January 13, 2026 at 12:31

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/e2e/configs/run-ci.yaml (1)

145-150: Embedding model mismatch between registration and default.

The registered_resources.models includes sentence-transformers/all-mpnet-base-v2 (768 dimensions), but vector_stores.default_embedding_model references nomic-ai/nomic-embed-text-v1.5. Using different embedding models for registration versus default could cause dimension mismatches or inconsistent vector retrieval.

Consider aligning these to use the same model:

Option A: Use the registered model as default
 vector_stores:
   default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2
Option B: Register the nomic model instead
   - model_id: sentence-transformers/all-mpnet-base-v2
+  - model_id: nomic-ai/nomic-embed-text-v1.5
     model_type: embedding
     provider_id: sentence-transformers
-    provider_model_id: sentence-transformers/all-mpnet-base-v2
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
     metadata:
-      embedding_dimension: 768
+      embedding_dimension: 768  # Verify nomic model dimension

Also applies to: 161-165

🤖 Fix all issues with AI agents
In @run.yaml:
- Around line 60-63: The braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) currently contains a hardcoded placeholder
openai_api_key: '********'; replace this with an environment variable reference
(e.g., set openai_api_key from process/env such as ${OPENAI_API_KEY} or a
project-specific name like ${BRAINTRUST_OPENAI_API_KEY}) to match other
credential handling so the actual key is not committed in the repo.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config declares an empty safety providers list
(`safety: []`) but still registers a shield named `llama-guard` in
`registered_resources.shields`, causing an inconsistency; either remove the
`llama-guard` entry from `registered_resources.shields` or re-enable the
`llama-guard` provider in the `safety` list (e.g., uncomment and include the
provider config) so that every registered shield has a corresponding provider;
update whichever of `registered_resources.shields` or the `safety` block you
choose to keep so they match.
🧹 Nitpick comments (6)
examples/vertexai-run.yaml (1)

132-138: Empty models array may require documentation.

Unlike the vllm configs which pre-register models, this VertexAI config has an empty models array in registered_resources. Users will need to register models separately. Consider adding a comment indicating that models should be registered via API or by adding entries here with the appropriate provider_model_id from the allowed_models list (e.g., google/gemini-2.5-flash).

examples/vllm-rhoai-run.yaml (1)

34-37: TLS verification disabled - security consideration.

tls_verify: false bypasses certificate validation. While this may be acceptable for development or internal environments with self-signed certificates, it introduces security risks in production. Consider:

  1. Adding a comment explaining when this is appropriate
  2. Using an environment variable to control this setting: tls_verify: ${env.VLLM_TLS_VERIFY:=true}
♻️ Suggested approach
   config:
     url: ${env.KSVC_URL}/v1/
     api_token: ${env.VLLM_API_KEY}
-    tls_verify: false
+    tls_verify: ${env.VLLM_TLS_VERIFY:=true}  # Set to false only for self-signed certs in dev/test
     max_tokens: 1024
examples/vllm-rhaiis-run.yaml (1)

34-37: URL construction reduces flexibility.

The URL is constructed with hardcoded protocol (http://) and port (:8000). This limits deployment flexibility:

  • Cannot switch to HTTPS without changing the config
  • Cannot use a different port without modifying the file

Consider making the entire URL configurable via environment variable, similar to the RHOAI config.

♻️ Suggested approach
   config:
-    url: http://${env.RHAIIS_URL}:8000/v1/
+    url: ${env.RHAIIS_URL}/v1/  # e.g., http://localhost:8000 or https://rhaiis.example.com
     api_token: ${env.RHAIIS_API_KEY}
-    tls_verify: false
+    tls_verify: ${env.RHAIIS_TLS_VERIFY:=true}
     max_tokens: 2048
examples/azure-run.yaml (1)

30-39: Hardcoded test endpoint and preview API version.

Two concerns with the Azure configuration:

  1. Hardcoded test endpoint (line 35): api_base: https://ols-test.openai.azure.com/ appears to be an internal test endpoint. Example configurations should use environment variables to avoid confusion and accidental usage of test infrastructure.

  2. Preview API version (line 36): api_version: 2024-02-15-preview is a preview version that may be deprecated. Consider documenting the minimum required API version or using a stable release.

♻️ Suggested fix
   inference:
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
-      api_version: 2024-02-15-preview
+      api_base: ${env.AZURE_API_BASE}  # e.g., https://your-resource.openai.azure.com/
+      api_version: ${env.AZURE_API_VERSION:=2024-02-15-preview}
examples/run.yaml (1)

66-69: Use environment variable for API key consistency.

The braintrust provider uses a hardcoded placeholder '********' for openai_api_key. For consistency with other sensitive values in this config (e.g., ${env.OPENAI_API_KEY}), consider using an environment variable.

Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
tests/e2e/configs/run-ci.yaml (1)

60-63: Use environment variable for API key consistency.

Same as in examples/run.yaml - the braintrust provider uses a hardcoded placeholder. Use an environment variable for consistency with other secrets management in this config.

Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c80a78d and 2993407.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/vllm-rhelai-run.yaml
  • examples/vllm-rhoai-run.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • docker-compose.yaml
  • pyproject.toml
  • docker-compose-library.yaml
  • examples/vllm-rhelai-run.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai-run.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/azure-run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (24)
docs/providers.md (2)

103-103: LGTM - Provider status update aligns with PR objectives.

The llama-guard provider status change from ✅ to ❌ is consistent with the "infinite loop issue with LLM calls" mentioned in the configuration files. Consider adding a note in the documentation explaining why this provider is currently unsupported, so users understand it's a known issue rather than a missing feature.


160-160: LGTM - RAG runtime support documented.

The rag-runtime provider is now marked as supported, which aligns with the FAISS VectorIO provider support being added in this PR.

examples/vertexai-run.yaml (3)

30-39: LGTM - VertexAI inference configuration.

The dual-provider setup with Google Vertex for LLM inference and sentence-transformers for embeddings is well-structured.


48-52: Safety provider disabled - ensure this is tracked.

The llama-guard provider is commented out with a warning about an "infinite loop issue with LLM calls." This aligns with the documentation update marking llama-guard as unsupported. Ensure there's an issue tracking this bug for future resolution.


142-146: LGTM - Vector stores configuration.

The default FAISS provider and sentence-transformers embedding model configuration is consistent with the PR's goal of supporting FAISS VectorIO.

examples/vllm-rhaiis-run.yaml (1)

134-138: LGTM - Environment-driven model configuration.

Using environment variables for both model_id and provider_model_id is more flexible than hardcoding, allowing the same config to work across different RHAIIS deployments with different models.

examples/azure-run.yaml (2)

132-145: LGTM - Resource configuration is complete.

The registered resources section properly links the gpt-4o-mini model to the Azure provider and includes the RAG tool group configuration.


146-150: LGTM - Vector stores configuration consistent across examples.

The FAISS vector store with sentence-transformers embedding model is consistently configured across all example files, which supports maintainability.

examples/vllm-rhoai-run.yaml (1)

134-138: No changes needed. Setting provider_model_id: null is intentional and supported behavior in Llama Stack. When set to null (or omitted), the provider defaults to using the model_id value directly. This is documented behavior and is used consistently in the codebase.

Likely an incorrect or invalid review comment.

tests/e2e/configs/run-azure.yaml (1)

1-160: Configuration structure looks correct for Azure provider integration.

The file follows the new centralized providers/storage configuration pattern correctly. Telemetry is properly configured with just enabled: true as expected for Llama Stack 0.3.x. All sensitive credentials use environment variables appropriately.

Based on learnings, the telemetry configuration aligns with the documented pattern.

tests/e2e/configs/run-rhaiis.yaml (1)

1-161: Configuration correctly implements RHAIIS provider setup.

The file properly configures the vLLM-based RHAIIS inference provider with environment-variable-driven URLs and credentials. The structure aligns with the unified providers/storage schema introduced in this PR.

tests/e2e-prow/rhoai/configs/run.yaml (2)

139-142: Verify intentional use of null for provider_model_id.

The provider_model_id is explicitly set to null while model_id is set to meta-llama/Llama-3.2-1B-Instruct. Typically, provider_model_id maps to the model identifier expected by the provider. If the vLLM provider uses the model_id directly or infers it from the endpoint, this may be intentional. Otherwise, consider setting it to match the deployed model.


143-146: Potential mismatch: llama-guard shield references openai/gpt-4o-mini.

The shield is configured with provider_id: llama-guard but provider_shield_id: openai/gpt-4o-mini. This pattern is consistent across other configs, but verify that the llama-guard safety provider correctly routes to the OpenAI provider for the underlying model inference.

tests/e2e/configs/run-vertexai.yaml (2)

30-36: Vertex AI provider configuration looks good.

The Google Vertex AI provider is correctly configured with environment variables for project and location. The allowed_models array explicitly lists the permitted model.


136-137: Verify empty models array is intentional.

The registered_resources.models is an empty array, while other configuration files (e.g., run-azure.yaml, run-rhaiis.yaml) explicitly register models. If models should be dynamically discovered via allowed_models in the provider config, this is fine. Otherwise, consider registering the Vertex AI model explicitly for consistency.

run.yaml (3)

12-12: Verify telemetry in the apis list.

This file includes telemetry in the apis list (line 12), while other configuration files in this PR do not. Verify if this is intentional for the root/starter configuration or if it should be removed for consistency with the other configs.


117-120: Helpful inline documentation.

The comment explaining that registry and RAG storage are unified in kv_default is valuable for understanding the configuration design.


156-164: Vector stores and telemetry configuration looks correct.

The vector_stores defaults, safety configuration, and telemetry enablement follow the expected pattern for Llama Stack 0.3.x. Based on learnings, the telemetry configuration correctly uses enabled: true without a provider block.

tests/e2e/configs/run-rhelai.yaml (2)

155-157: Verify embedding model availability.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 via sentence-transformers provider, but no embedding model is explicitly registered under registered_resources.models. Confirm that the sentence-transformers provider auto-loads this model or add an explicit registration.
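
If it is not auto-loaded, an explicit registration could look roughly like this (mirrors the registration pattern suggested elsewhere in this review; the embedding dimension is an assumption to verify against the model card):

registered_resources:
  models:
  - model_id: nomic-ai/nomic-embed-text-v1.5
    model_type: embedding
    provider_id: sentence-transformers
    provider_model_id: nomic-ai/nomic-embed-text-v1.5
    metadata:
      embedding_dimension: 768   # assumed value; confirm before relying on it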


160-161: LGTM!

Telemetry configuration correctly uses enabled: true without a provider block, which aligns with Llama Stack 0.3.x requirements.

examples/run.yaml (2)

157-161: LGTM!

The vector_stores configuration correctly defines the default FAISS provider and embedding model. The use of sentence-transformers with nomic model is consistent with other configuration files.


1-7: Good documentation header.

The header comments clearly explain the purpose, requirements, and external tooling reference. This is helpful for users setting up OpenAI + FAISS configurations.

tests/e2e/configs/run-ci.yaml (2)

48-52: LGTM!

Safety provider is properly configured with llama-guard and correctly referenced in registered_resources.shields and safety.default_shield_id. This differs from other configs where safety is disabled, which is appropriate for CI testing.


168-169: LGTM!

Telemetry is correctly enabled and listed in the APIs. Configuration follows Llama Stack 0.3.x conventions.

@are-ces are-ces force-pushed the faiss-support-final branch from 2993407 to 3dc2804 on January 13, 2026 at 13:06

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🤖 Fix all issues with AI agents
In @examples/azure-run.yaml:
- Line 36: Update the api_version value for Azure OpenAI in the YAML: replace
the outdated "2024-02-15-preview" with the stable GA "2024-06-01" (or a newer
preview like "2025-04-01-preview" if preview features are required) by editing
the api_version key in the Azure config block so the example uses a current
supported API version.

In @examples/run.yaml:
- Around line 66-69: The YAML snippet uses a hardcoded placeholder API key
("********") for the braintrust provider which will fail at runtime; update the
provider config (provider_id/provider_type/config.openai_api_key) to read the
real key from a secure source (e.g., environment variable or secrets manager)
and validate presence before use (fail fast with a clear error if the env
var/secret is missing) so the provider doesn't attempt to start with a
placeholder value.
- Around line 157-161: The examples run.yaml is missing the telemetry
configuration block referenced by the telemetry API; add a telemetry section at
the end of the file with telemetry: enabled: true (matching the run.yaml
pattern) so telemetry is explicitly enabled and consistent with the API usage.

In @pyproject.toml:
- Line 63: The dependency line "einops>=0.8.1" lacks an explanatory comment; add
a brief single-line comment immediately above that line describing why einops is
required (e.g., "Used for tensor rearrangements and shape transformations in
model preprocessing/visual modules") so it matches the style of other
dependencies and improves maintainability.

In @run.yaml:
- Around line 60-63: The braintrust provider config currently uses a hardcoded
placeholder openai_api_key: '********' which will fail at runtime; update the
inline::braintrust provider block (refer to provider_type: inline::braintrust
and provider_id: braintrust) to read the OpenAI key from an environment variable
(e.g., reference a secret like OPENAI_API_KEY used elsewhere) instead of the
literal ********, and ensure any runtime code that loads config supports env
interpolation or use your secrets manager so the provider's openai_api_key is
populated securely at runtime.

In @tests/e2e/configs/run-ci.yaml:
- Around line 116-123: The two storage backends (kv_default and sql_default)
both use the same env var pattern ${env.SQLITE_STORE_DIR:=...}, causing
collisions when the env var is set; change them to distinct env vars and/or
paths so they cannot override each other (e.g., use
${env.SQLITE_KV_STORE_DIR:=~/.llama/storage/rag/kv_store.db} for kv_default and
${env.SQLITE_SQL_STORE_DIR:=~/.llama/storage/sql_store.db} for sql_default) and
ensure the db_path values for kv_default and sql_default remain unique and
consistent with the other store definitions referenced elsewhere.
- Around line 161-165: The default_embedding_model value (model_id
nomic-ai/nomic-embed-text-v1.5) is not present in registered_resources.models
which will break FAISS usage; either change
vector_stores.default_embedding_model to the already-registered model_id
(sentence-transformers/all-mpnet-base-v2) or add an entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models with model_id
nomic-ai/nomic-embed-text-v1.5, model_type: embedding, provider_id:
sentence-transformers, provider_model_id: nomic-ai/nomic-embed-text-v1.5 and
correct embedding_dimension in metadata so the default and registry match.
- Around line 17-28: The three store entries (conversations_store,
inference_store, metadata_store) currently all use the same env pattern
`${env.SQLITE_STORE_DIR:=...}` which makes them point to the same file when
SQLITE_STORE_DIR is set; update these keys so they either use distinct env vars
per store (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or
treat SQLITE_STORE_DIR as a directory and append filenames (e.g., use
SQLITE_STORE_DIR with `/conversations.db`, `/inference-store.db`,
`/registry.db`), and ensure this matches the existing storage_dir expectation
for a directory.
- Around line 60-63: The Braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) contains a hardcoded/placeholder
openai_api_key value; replace it with an environment variable reference (e.g.,
use a CI/secret env var like BRAINTRUST_OPENAI_API_KEY) so the key is not
committed, or if this provider is intentionally disabled remove the provider
block or add a clear comment explaining it's disabled; ensure the config key
name openai_api_key stays the same and that your CI/concourse/github-actions
reads the chosen env var before tests run.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config sets safety: [] but still registers shields and
sets safety.default_shield_id to "llama-guard", causing an inconsistency; fix by
removing the shield registration block (the entries that reference provider_id:
llama-guard / provider_type: inline::llama-guard) and remove or clear
safety.default_shield_id (the safety.default_shield_id: llama-guard setting) so
the config consistently disables safety providers, or alternatively re-enable
the safety provider entry and its config under safety: if you intend to keep
shields.
🧹 Nitpick comments (2)
examples/vllm-rhelai-run.yaml (1)

36-36: Consider documenting TLS verification in production guidance.

tls_verify: false is acceptable for examples/development, but users copying this config should be reminded to enable TLS verification in production environments. A brief comment would help.
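One way to phrase that reminder inline (a sketch; only the tls_verify key comes from the file under review):

# WARNING: TLS verification is disabled only to keep this example simple.
# Set tls_verify: true for production deployments.
tls_verify: false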

tests/e2e/configs/run-azure.yaml (1)

35-35: Hardcoded Azure endpoint in test configuration.

The api_base is hardcoded to https://ols-test.openai.azure.com/. This is acceptable for e2e tests but consider using an environment variable for flexibility across test environments.

♻️ Suggested change
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2993407 and 3dc2804.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis-run.yaml
  • examples/vllm-rhelai-run.yaml
  • examples/vllm-rhoai-run.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-granite-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (7)
  • docs/providers.md
  • examples/vllm-rhaiis-run.yaml
  • docker-compose.yaml
  • tests/e2e/test_list.txt
  • tests/e2e/features/faiss.feature
  • examples/vertexai-run.yaml
  • tests/e2e/configs/run-vertexai.yaml
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Check pyproject.toml for supported Python versions before development
Always check pyproject.toml for existing dependencies before adding new ones
Always verify current library versions in pyproject.toml rather than assuming versions
Use pylint with source-roots = "src" configuration

Files:

  • pyproject.toml
🧠 Learnings (5)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai-run.yaml
  • examples/azure-run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • examples/run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
  • run.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (25)
tests/e2e/configs/run-rhaiis.yaml (6)

17-27: LGTM!

The store configurations use consistent environment variable patterns with sensible defaults and separate database files for each store type.


66-72: LGTM!

The FAISS vector_io provider configuration correctly sets up persistence with the kv_default backend and an appropriate namespace. This aligns with the PR objective of fully supporting the FAISS VectorIO provider.
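The shape being approved is approximately the following; the persistence/namespace/backend field names are inferred from the wording of this review, so read it as an illustration rather than the verbatim file content:

vector_io:
- provider_id: faiss
  provider_type: inline::faiss
  config:
    persistence:
      namespace: vector_io::faiss   # namespace value assumed for illustration
      backend: kv_default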


73-110: LGTM!

The agents, batches, datasetio, and eval provider configurations follow consistent patterns with appropriate namespace isolation and backend references.


114-136: LGTM!

The storage configuration properly defines both kv_sqlite and sql_sqlite backends with appropriate store definitions. The inference store includes sensible write queue settings for performance.


143-146: Verify shield configuration consistency.

The shield is registered with provider_id: llama-guard (which maps to inline::llama-guard provider), but provider_shield_id references openai/gpt-4o-mini. This appears inconsistent—typically the provider_shield_id should match the provider's expected model identifier.

Given the PR notes that shield support was revoked due to unusability, please confirm whether this configuration is intentionally a placeholder or if it should be corrected/removed.


153-161: LGTM!

The vector stores configuration correctly sets FAISS as the default provider with sentence-transformers embedding model. The telemetry configuration follows the Llama Stack 0.3.x pattern of using simple enabled: true without a provider block. Based on learnings, this is the correct approach.

tests/e2e/configs/run-ci.yaml (3)

12-12: LGTM!

The telemetry API addition and the telemetry.enabled: true configuration align with the Llama Stack 0.3.x pattern where telemetry can be enabled without requiring an explicit provider block. Based on learnings.

Also applies to: 168-169


31-74: LGTM!

The providers block is well-structured with appropriate comments explaining the RAG tool and FAISS storage backend configuration. The inference provider correctly uses environment variables for the API key and allowed models.


158-167: LGTM!

The RAG tool group registration properly references the rag-runtime provider, and the safety shield configuration correctly references the registered llama-guard shield.

docker-compose-library.yaml (1)

16-16: LGTM!

The new volume mount correctly exposes the e2e RAG test database to the container at the expected storage path. The :Z flag is appropriate for SELinux relabeling.

examples/vllm-rhelai-run.yaml (1)

1-151: LGTM - Well-structured vLLM RHEL AI configuration.

Configuration properly uses environment variables for sensitive values and follows the centralized storage-backed architecture pattern. The shields disabled warning (line 49, 139) is appropriately documented.

examples/vllm-rhoai-run.yaml (1)

1-151: LGTM - Consistent RHOAI configuration.

Configuration mirrors the RHEL AI variant appropriately with RHOAI-specific environment variables (KSVC_URL, VLLM_API_KEY). The hardcoded model ID with provider_model_id: null is valid for vLLM deployments.

tests/e2e/configs/run-azure.yaml (1)

1-160: LGTM - Comprehensive Azure e2e test configuration.

Configuration correctly enables safety provider (llama-guard) for e2e testing, unlike examples where shields are disabled. The telemetry configuration at root level aligns with Llama Stack 0.3.x requirements. Based on learnings.

examples/azure-run.yaml (1)

1-150: LGTM - Azure example configuration follows established patterns.

Configuration properly disables shields with documented warnings and uses the centralized storage-backed architecture.

tests/e2e-prow/rhoai/configs/run.yaml (2)

1-161: LGTM - Well-configured RHOAI prow test configuration.

Configuration appropriately enables safety (llama-guard) and telemetry for CI/prow testing. The dual inference providers (vllm + openai) provide flexibility for the test environment.


144-146: Shield configuration depends on OpenAI provider.

The provider_shield_id: openai/gpt-4o-mini indicates llama-guard uses OpenAI as its backing model. Ensure OPENAI_API_KEY is available in the prow test environment, as the shield will fail without it even though vllm is the primary inference provider.

run.yaml (3)

12-14: LGTM!

The telemetry API is correctly added to the apis list, and the telemetry configuration at lines 163-164 follows the correct Llama Stack 0.3.x pattern with telemetry.enabled: true without a provider block. Based on learnings, this is the expected approach.


68-74: LGTM!

The FAISS vector_io provider configuration is well-structured with proper persistence settings referencing the kv_default backend.


156-164: LGTM!

The vector_stores, safety, and telemetry sections are properly configured with appropriate defaults and references to the defined providers.

examples/run.yaml (3)

1-6: LGTM!

Good documentation header explaining the configuration purpose and requirements.


54-58: Safety shields intentionally disabled.

The safety providers and shields are disabled with a documented reason (infinite loop issue with LLM calls). This is acceptable for the example configuration, but ensure the underlying issue is tracked for resolution.

Is there a tracking issue for the infinite loop problem mentioned in the comment? If not, consider creating one to ensure this gets addressed.


147-156: LGTM!

The registered_resources section correctly registers the RAG tool group while keeping models and shields empty (consistent with the disabled safety configuration and dynamic model registration via OpenAI).

tests/e2e/configs/run-rhelai.yaml (3)

30-37: LGTM with note on TLS verification.

The vLLM inference provider configuration is appropriate for the RHEL AI test environment. Note that tls_verify: false is acceptable for testing but should not be used in production environments.


153-161: LGTM!

The vector_stores configuration and telemetry block are correctly configured for the RHEL AI test environment, following the Llama Stack 0.3.x pattern.


114-136: LGTM!

The storage backends and stores configuration is well-structured and consistent with the other configuration files in this PR.

      config:
        api_key: ${env.AZURE_API_KEY}
        api_base: https://ols-test.openai.azure.com/
        api_version: 2024-02-15-preview

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

Azure OpenAI API versions 2024 2025 latest current

💡 Result:

Short answer

  • Latest GA Azure OpenAI Service API version: 2024-06-01. [1]
  • Latest preview / Foundry API versions (most recent preview cycle): 2025-04-01-preview (and incremental preview dates like 2025-03-01-preview, 2025-02-01-preview, 2024-12-01-preview are documented). [2]

Notes

  • Microsoft documents the API-version lifecycle and change history for Azure OpenAI / Foundry (preview versions listed through 2025-04-01-preview). [2]
  • If you use the Assistants API, v1 is being retired (migration to v2 required—retirement notices in 2025), so check the specific retirement date for your region. [3]

Sources

  1. Rollout / Azure OpenAI Service summary (latest GA = 2024-06-01). [4]
  2. Microsoft Learn — Azure OpenAI / Foundry API version lifecycle (preview releases through 2025-04-01-preview). [2]
  3. Announcement / notices re: Assistants API v1 retirement. [3]

Would you like direct links to the specific Microsoft docs or help updating code to a particular api-version?


Update Azure OpenAI API version to current release.

The api_version: 2024-02-15-preview is outdated. The latest GA version is 2024-06-01, with preview versions available through 2025-04-01-preview. Update to at least the stable 2024-06-01 release to ensure access to current features and improvements.

🤖 Prompt for AI Agents
In @examples/azure-run.yaml at line 36, Update the api_version value for Azure
OpenAI in the YAML: replace the outdated "2024-02-15-preview" with the stable GA
"2024-06-01" (or a newer preview like "2025-04-01-preview" if preview features
are required) by editing the api_version key in the Azure config block so the
example uses a current supported API version.
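Applied to the Azure block quoted above, the change would look roughly like this (only api_version changes; the other keys stay as in the example):

- provider_id: azure
  provider_type: remote::azure
  config:
    api_key: ${env.AZURE_API_KEY}
    api_base: https://ols-test.openai.azure.com/
    api_version: 2024-06-01   # was 2024-02-15-preview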

Comment on lines +66 to +68
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
       openai_api_key: '********'

⚠️ Potential issue | 🟡 Minor

Placeholder API key for braintrust scoring provider.

Same issue as in run.yaml - the placeholder value will fail at runtime if used.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @examples/run.yaml around lines 66 - 69, The YAML snippet uses a hardcoded
placeholder API key ("********") for the braintrust provider which will fail at
runtime; update the provider config
(provider_id/provider_type/config.openai_api_key) to read the real key from a
secure source (e.g., environment variable or secrets manager) and validate
presence before use (fail fast with a clear error if the env var/secret is
missing) so the provider doesn't attempt to start with a placeholder value.

Comment on lines +60 to +63
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
       openai_api_key: '********'

⚠️ Potential issue | 🟡 Minor

Braintrust API key should use environment variable.

The openai_api_key: '********' appears to be either a placeholder or obfuscated value. If this provider is used in CI tests, it will fail. If it's intentionally disabled, consider removing the provider or adding a comment.

Per project guidelines, use environment variables for sensitive data.

🔧 Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 60 - 63, The Braintrust provider
block (provider_id: braintrust, provider_type: inline::braintrust) contains a
hardcoded/placeholder openai_api_key value; replace it with an environment
variable reference (e.g., use a CI/secret env var like
BRAINTRUST_OPENAI_API_KEY) so the key is not committed, or if this provider is
intentionally disabled remove the provider block or add a clear comment
explaining it's disabled; ensure the config key name openai_api_key stays the
same and that your CI/concourse/github-actions reads the chosen env var before
tests run.

@are-ces are-ces force-pushed the faiss-support-final branch from 3dc2804 to f2b7f5e Compare January 13, 2026 13:16

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 19

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/providers.md (1)

23-23: Typo: "atributes" → "attributes"

-The tables below summarize each provider category, containing the following atributes:
+The tables below summarize each provider category, containing the following attributes:
🤖 Fix all issues with AI agents
In @.github/workflows/e2e_tests_providers.yaml:
- Around line 4-5: The workflow currently uses a bare "on: push" which triggers
on every branch and removed manual dispatch, PR validation, and scheduled runs;
update the "on" section to a mapping that includes "push" with "branches" (e.g.,
["main","release"]) to limit pushes, add "pull_request" with the same "branches"
to validate PRs, re-add "workflow_dispatch" to allow manual runs, and include a
"schedule" entry (cron) to run periodic E2E checks; modify the YAML keys "on",
"push", "pull_request", "workflow_dispatch", and "schedule" accordingly so the
workflow is targeted, can be manually triggered, validates PRs, and runs on a
cron.

In @examples/azure-run.yaml:
- Around line 137-152: The config references default_embedding_model =
"nomic-ai/nomic-embed-text-v1.5" but that embedding model is not present in
registered_resources.models; add a models entry for the embedding (e.g., a model
object with model_id matching the referenced name or a chosen id, provider_id
set to the correct provider for Nomic, model_type set to "embedding", and
provider_model_id set to "nomic-ai/nomic-embed-text-v1.5") so
default_embedding_model points to a registered model; ensure the same fix is
applied for the other occurrence noted (lines 153-157).

In @examples/run.yaml:
- Around line 119-120: The YAML contains two identical top-level telemetry
blocks; remove the duplicate so only a single "telemetry:" mapping remains at
the top level, deleting the extra "telemetry: enabled: true" occurrence and
ensuring surrounding indentation/spacing keeps the YAML valid.
- Around line 146-157: The YAML registers no models while RAG references an
embedding model; add the embedding model to registered_resources.models so
default_embedding_model matches a registered model (e.g., add an entry with
model_id like "nomic-embed-text-v1.5" or similar, provider_id (e.g.,
"nomic-ai"), and provider_model_id set to "nomic-ai/nomic-embed-text-v1.5");
ensure the model_id you add is the same identifier used by
default_embedding_model and that the toolgroup entry (toolgroup_id:
builtin::rag) can resolve that model.
- Line 58: The YAML has a syntax error where the keys "provider_type" and
"scoring" are on the same line; update the config so "provider_type:
inline::llama-guard" and "scoring:" are on separate lines (ensure proper
indentation for the "scoring" block), so the YAML parser can correctly recognize
both keys.

In @examples/vllm-rhaiis.yaml:
- Around line 17-27: The config uses the same `${env.SQLITE_STORE_DIR:=...}`
substitution for conversations_store, inference_store, and metadata_store which
replaces the entire path and will make all three point to the same file when
SQLITE_STORE_DIR is set; update the three entries so each store has its own env
var (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or switch
to a single directory env var (e.g., SQLITE_STORE_DIR) combined with per-store
filenames (concatenate directory + filename) so that conversations_store,
inference_store, and metadata_store each resolve to distinct DB files.
- Around line 46-52: The storage_dir value is using the misnamed env var
SQLITE_STORE_DIR which implies a DB path; update the config to use a dedicated
environment variable (e.g., FILES_STORAGE_DIR) for file storage paths and adjust
the default accordingly. Locate the storage_dir key in the inline::localfs
provider block and replace `${env.SQLITE_STORE_DIR:=~/.llama/storage/files}`
with `${env.FILES_STORAGE_DIR:=~/.llama/storage/files}` (or another appropriate
FILES_* var), and ensure any documentation or env examples are updated to
reference FILES_STORAGE_DIR instead of SQLITE_STORE_DIR.
- Around line 153-157: The configuration selects the embedding model
"nomic-ai/nomic-embed-text-v1.5" but the Python dependency "nomic" is not
declared; add "nomic" to the project dependencies (pyproject.toml/requirements)
alongside the existing sentence-transformers and einops entries so the model can
be loaded at runtime, specifying a stable version constraint (or latest)
compatible with sentence-transformers.

In @examples/vllm-rhelai.yaml:
- Line 15: Remove the trailing whitespace after the APIs list on line 15 (the
blank/whitespace-only content following the "APIs" entry) so the line contains
no trailing spaces; simply delete the extra spaces at the end of that line to
clean up formatting.
- Around line 143-146: The YAML defines a shield (shield_id: llama-guard,
provider_id: llama-guard, provider_shield_id: openai/gpt-4o-mini) and sets
default_shield_id: llama-guard while the safety providers array is empty, which
is inconsistent; either remove the shield block and the default_shield_id entry
(if safety is intended to be disabled) or populate the safety providers array
with a matching provider definition for provider_id: llama-guard (and configure
any required provider_shield mapping) so the default_shield_id resolves at
runtime.
- Around line 17-27: The db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) currently all use the same env
var SQLITE_STORE_DIR with different inline defaults, which causes different
stores to collapse to the same path when SQLITE_STORE_DIR is set; change each
entry to either use its own env var (e.g., SQLITE_CONVERSATIONS_PATH,
SQLITE_INFERENCE_PATH, SQLITE_METADATA_PATH) or treat SQLITE_STORE_DIR strictly
as a directory prefix and append distinct filenames outside the variable (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db), and apply the same
change across all example and test config files to avoid cross-store collisions.

In @run.yaml:
- Around line 156-160: The default_embedding_model currently sets provider_id:
sentence-transformers and model_id: nomic-ai/nomic-embed-text-v1.5 which doesn't
match the registered embedding model (sentence-transformers/all-mpnet-base-v2);
update the default_embedding_model.model_id to the registered model_id
(sentence-transformers/all-mpnet-base-v2) so vector_stores.default_provider_id
and the registered embedding entry are consistent (mirror the same fix applied
in run-ci.yaml).

In @tests/e2e/configs/run-ci.yaml:
- Around line 161-165: The default_embedding_model
(nomic-ai/nomic-embed-text-v1.5) is not registered in
registered_resources.models (only sentence-transformers/all-mpnet-base-v2 is
present), causing a mismatch; fix by either adding a registration entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models or changing
vector_stores.default_embedding_model to the existing
sentence-transformers/all-mpnet-base-v2 so the default references a registered
model.

In @tests/e2e/configs/run-rhaiis.yaml:
- Around line 115-122: Both storage backends kv_default and sql_default use the
same env var SQLITE_STORE_DIR so setting it makes them point to the same DB
file; change the db_path entries so each backend uses a distinct env var or
distinct filename (e.g., SQLITE_KV_STORE_DIR for kv_default and
SQLITE_SQL_STORE_DIR for sql_default, or keep SQLITE_STORE_DIR but append
different filenames like ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
and .../sql_store.db) so kv_default.db_path and sql_default.db_path resolve to
different files.
- Around line 18-28: The three db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) incorrectly treat
SQLITE_STORE_DIR as a full filepath and thus will point all databases to the
same file when the env var is set; update each db_path to use SQLITE_STORE_DIR
as a base directory and append distinct filenames (e.g., conversations.db,
inference-store.db, registry.db) so each DB has its own file, and ensure the
storage_dir usage at the other location (where storage_dir expects a directory)
remains consistent with this change.
- Around line 46-53: The storage_dir entry for provider_id
"meta-reference-files" currently uses
${env.SQLITE_STORE_DIR:=~/.llama/storage/files} which is inconsistent with the
repo's directory-vs-file env var convention; update this value to a
directory-specific env var (for example FILES_STORAGE_DIR, matching the
suggestion for the other localfs provider in this review) so storage_dir
consistently expects a directory path.
- Line 16: Remove the trailing whitespace after the list item "- vector_io" in
the YAML so the line ends exactly with "- vector_io" (no extra spaces or tabs);
update the file to trim trailing whitespace on that line and run a
YAML/whitespace linter or git diff to confirm no trailing characters remain.

In @tests/e2e/configs/run-rhelai.yaml:
- Around line 54-58: The config disables safety via "safety: []" but still
registers a shield and sets "safety.default_shield_id" which creates an
inconsistent state; fix by removing or commenting out the shield entry under
"registered_resources.shields" and clearing or commenting
"safety.default_shield_id" (or set it to null) so no shield is referenced while
safety is disabled, and add a clear inline warning comment near "safety: []"
explaining why shields are disabled to prevent accidental re-enablement.
- Around line 154-158: The config references default_embedding_model
(provider_id sentence-transformers, model_id nomic-ai/nomic-embed-text-v1.5) but
no embedding model is registered under registered_resources.models; add a model
entry in registered_resources.models that registers this embedding model using
the same provider_id and model_id so the RAG/vector_stores flow can resolve it
(ensure the entry key/name matches any expected reference and use the symbols
registered_resources.models and default_embedding_model to locate where to add
the registration).
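As referenced in the workflow item above, the requested trigger section could look roughly like this; branch names and the cron expression are placeholders, not values taken from the repository:

on:
  push:
    branches: ["main", "release"]
  pull_request:
    branches: ["main", "release"]
  workflow_dispatch:
  schedule:
  - cron: "0 4 * * *"   # placeholder nightly run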
🧹 Nitpick comments (4)
tests/e2e/configs/run-azure.yaml (1)

31-44: Consider environment variable for Azure api_base.

The Azure endpoint https://ols-test.openai.azure.com/ is hardcoded while api_key uses an environment variable. For consistency and flexibility across environments, consider templating the base URL:

♻️ Optional: Use environment variable for api_base
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
       api_version: 2024-02-15-preview
tests/e2e/configs/run-ci.yaml (1)

60-63: Use environment variable for braintrust API key.

The hardcoded '********' placeholder will cause braintrust scoring to fail if invoked. Consider using environment variable substitution for consistency with other API keys in this config.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.OPENAI_API_KEY}
examples/azure-run.yaml (1)

34-37: Hardcoded Azure endpoint in example configuration.

The api_base is hardcoded to a specific test endpoint (https://ols-test.openai.azure.com/). For an example configuration, consider using an environment variable to make it more reusable.

Suggested fix
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
+      api_base: ${env.AZURE_API_BASE}
       api_version: 2024-02-15-preview
run.yaml (1)

60-63: Use environment variable for braintrust API key.

Same issue as run-ci.yaml: the placeholder '********' should be replaced with ${env.OPENAI_API_KEY} for consistency and functionality.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3dc2804 and f2b7f5e.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
  • .github/workflows/e2e_tests_providers.yaml
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-granite-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/e2e/features/faiss.feature
  • docker-compose.yaml
  • docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (2)
tests/e2e/test_list.txt

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain test list in tests/e2e/test_list.txt

Files:

  • tests/e2e/test_list.txt
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Check pyproject.toml for supported Python versions before development
Always check pyproject.toml for existing dependencies before adding new ones
Always verify current library versions in pyproject.toml rather than assuming versions
Use pylint with source-roots = "src" configuration

Files:

  • pyproject.toml
🧠 Learnings (9)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/vllm-rhoai.yaml
  • examples/azure-run.yaml
  • examples/vllm-rhaiis.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-ci.yaml
  • examples/vertexai-run.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • examples/vllm-rhelai.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-vertexai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • pyproject.toml
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (30)
pyproject.toml (1)

63-63: Add an explanatory comment for consistency with other dependencies.

The version einops>=0.8.1 is correct (latest release published Feb 9, 2025). However, unlike all other dependencies in this file, einops lacks an explanatory comment. Add a comment clarifying its purpose—likely related to FAISS/RAG vectorization support given the PR context and the project's default RAG configuration using FAISS.

examples/vllm-rhelai.yaml (5)

53-57: Safety providers disabled — verify this is intentional for the example.

The safety providers are disabled with a comment about an "infinite loop issue with LLM calls." While this is documented, shipping an example configuration with safety disabled could lead users to run unprotected deployments.

Consider:

  1. Adding a more prominent warning in documentation about this limitation.
  2. Tracking this as a known issue to be resolved before GA.

Is there a tracking issue for the infinite loop problem mentioned in the comment?


66-72: LGTM!

The FAISS vector_io provider configuration looks correct with appropriate persistence settings using the kv_default backend.


73-110: LGTM!

The agents, batches, datasetio, and eval provider configurations are well-structured with appropriate namespace organization and backend references.


153-157: LGTM!

The vector_stores configuration correctly sets FAISS as the default provider with the nomic embedding model, aligning with the PR objectives for full FAISS support.


160-161: LGTM!

Telemetry is enabled without an explicit provider block, which is the correct approach for Llama Stack version 0.3.x based on learnings.

examples/vllm-rhaiis.yaml (5)

1-15: LGTM on basic structure.

The version and APIs configuration looks appropriate for the FAISS VectorIO provider support. Minor note: Line 15 has trailing whitespace that could be cleaned up.


66-72: FAISS vector_io configuration looks correct.

The inline FAISS provider is properly configured with persistence settings using the kv_default backend. This aligns well with the PR objective of fully supporting the FAISS VectorIO provider.


114-121: Same environment variable conflict applies to storage backends.

Lines 118 and 121 have the same SQLITE_STORE_DIR pattern issue flagged earlier. Both kv_default and sql_default backends would point to the same path if the env var is set.


143-146: Shield naming may be misleading.

The shield is named llama-guard but uses openai/gpt-4o-mini as the backend (provider_shield_id). This could confuse users expecting Meta's Llama Guard safety model. Consider renaming to reflect the actual backend, or add a comment clarifying this is a placeholder configuration.

Per the PR description mentioning "revoked support for shields due to unusability"—is this configuration intentionally a workaround, or should it be removed/commented out?


160-161: Telemetry configuration is correct.

The simplified telemetry configuration with just enabled: true aligns with Llama Stack 0.3.x requirements where provider blocks are not needed for telemetry. Based on learnings from this repository.

tests/e2e/configs/run-rhaiis.yaml (4)

144-147: Verify provider_shield_id for llama-guard shield.

The shield is named llama-guard but uses openai/gpt-4o-mini as the provider_shield_id. This appears to route safety checks through OpenAI's API rather than a Llama Guard model. Please confirm this is intentional for the e2e test configuration.


67-73: FAISS vector_io provider configuration looks correct.

The FAISS provider is properly configured with kv_default persistence backend, which is defined in the storage section. This aligns with the PR objective to fully support the FAISS VectorIO provider.


154-158: Vector stores configuration is consistent with provider definitions.

The default embedding model correctly references the sentence-transformers provider defined at lines 43-45, and FAISS is properly set as the default vector_io provider.


161-162: Telemetry configuration is appropriate for Llama Stack 0.3.x.

Based on learnings, telemetry provider configuration is not supported under the providers section in Llama Stack 0.3.x, so enabling it with just enabled: true is correct.

tests/e2e/test_list.txt (1)

1-1: LGTM!

The new features/faiss.feature entry correctly maintains the test list as per the coding guidelines, enabling FAISS e2e test discovery for the new vector store functionality.

examples/vllm-rhoai.yaml (3)

1-27: Configuration structure looks good with environment-driven paths.

The centralized storage layout with ${env.SQLITE_STORE_DIR:=~/.llama/storage/...} templating provides good flexibility for deployment environments. The separation of conversations, inference, and metadata stores is well-organized.


66-72: FAISS vector_io provider correctly configured.

The FAISS provider with kv_default backend persistence aligns with the PR's objective to fully support the FAISS VectorIO provider.


160-161: Review comment contains factual errors and incorrect assumptions.

The file names are incorrect: run-azure.yaml and run-vertexai.yaml do not exist; the actual files are azure-run.yaml and vertexai-run.yaml. More importantly, the claim is inaccurate: only azure-run.yaml includes telemetry in its apis list (line 12), while vertexai-run.yaml does not.

Additionally, based on Llama Stack 0.3.x documentation, telemetry provider configuration is not supported under the providers section—telemetry is enabled via telemetry.enabled: true without requiring an explicit apis entry. The vllm-rhoai.yaml configuration is following the correct pattern. The inconsistency with azure-run.yaml warrants investigation, but it does not indicate that vllm-rhoai.yaml is misconfigured.

Likely an incorrect or invalid review comment.

tests/e2e/configs/run-vertexai.yaml (2)

31-44: Vertex AI inference provider configuration looks correct.

The provider setup with project/location environment variables and the allowed_models restriction to google/gemini-2.5-flash is appropriate for controlled e2e testing.
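For orientation, the provider block being described is approximately of this form; the config key names, the env var names, and the placement of allowed_models are assumptions based on the wording above, not copied from the file:

inference:
- provider_id: vertexai
  provider_type: remote::vertexai
  config:
    project: ${env.VERTEXAI_PROJECT}      # env var name assumed
    location: ${env.VERTEXAI_LOCATION}    # env var name assumed
    allowed_models:
    - google/gemini-2.5-flash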


149-157: Vector stores and telemetry configuration is consistent.

The default FAISS provider with nomic-ai/nomic-embed-text-v1.5 embedding model aligns with the centralized vector store pattern across the PR. Based on learnings, telemetry.enabled: true without a provider block is the correct approach for Llama Stack 0.3.x.

tests/e2e-prow/rhoai/configs/run.yaml (2)

1-162: Configuration structure is well-organized for RHOAI prow testing.

The comprehensive provider setup with FAISS vector_io, environment-driven storage paths, and centralized backend configuration aligns with the PR's objectives.


139-143: The provider_model_id: null value is intentional for RHOAI vllm configurations. This pattern is consistently used across both the test configuration and the official example configuration (examples/vllm-rhoai.yaml), while other vllm environments (RHELAI, RHAIIS) use environment variables instead. RHOAI vllm is a supported provider in LCS, and the null value appears to be the expected configuration for this environment.

tests/e2e/configs/run-azure.yaml (1)

66-72: FAISS vector_io configuration is consistent across environments.

The FAISS provider setup matches the pattern used in vertexai and rhoai configs, ensuring consistent behavior across different inference backends.

examples/vertexai-run.yaml (3)

4-14: telemetry API not listed but telemetry.enabled is set.

Similar to examples/vllm-rhoai.yaml, the apis list here omits telemetry while the test configs (run-azure.yaml, run-vertexai.yaml) explicitly include it. Verify whether this inconsistency between examples and test configs is intentional.


30-43: Vertex AI provider configuration is appropriate for the example.

The allowed_models restriction and environment-driven project/location settings provide a clear reference implementation.


148-156: Vector stores and safety defaults are correctly configured.

The FAISS default provider and nomic-ai/nomic-embed-text-v1.5 embedding model provide a complete vector store setup aligned with the PR objectives.

docs/providers.md (2)

103-103: Documentation update looks correct.

The change from ✅ to ❌ for llama-guard aligns with the PR description stating shields support was revoked due to unusability issues.


160-160: LGTM: rag-runtime now marked as supported.

This correctly reflects the FAISS VectorIO provider support being added in this PR.

run.yaml (1)

163-164: Telemetry configuration follows correct pattern.

Per retrieved learnings, telemetry in Llama Stack 0.3.x should use telemetry.enabled: true without a provider block. This is correctly implemented.

Comment on lines 154 to +158

 vector_stores:
-  default_provider_id: documentation_faiss
+  default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: all-mpnet-base-v2
+    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🟠 Major

Missing embedding model registration.

default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but no embedding models are registered in registered_resources.models. The sentence-transformers provider is configured (lines 43-45), but without a registered embedding model, RAG operations may fail.

Register the embedding model
 registered_resources:
   models:
   - model_id: ${env.RHEL_AI_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHEL_AI_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhelai.yaml around lines 154 - 158, The config
references default_embedding_model (provider_id sentence-transformers, model_id
nomic-ai/nomic-embed-text-v1.5) but no embedding model is registered under
registered_resources.models; add a model entry in registered_resources.models
that registers this embedding model using the same provider_id and model_id so
the RAG/vector_stores flow can resolve it (ensure the entry key/name matches any
expected reference and use the symbols registered_resources.models and
default_embedding_model to locate where to add the registration).

@are-ces are-ces force-pushed the faiss-support-final branch from f2b7f5e to 57b58a4 Compare January 14, 2026 08:23

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/e2e/configs/run-vertexai.yaml (1)

137-153: Register the embedding model in registered_resources.models.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 with provider sentence-transformers, but registered_resources.models is empty. This differs from the pattern in run-ci.yaml, which explicitly registers its embedding model. Explicit registration ensures consistent model discovery and lifecycle management.

🔧 Suggested fix: Register the embedding model
 registered_resources:
-  models: []
+  models:
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
🤖 Fix all issues with AI agents
In `@examples/vllm-rhoai.yaml`:
- Around line 17-27: The three store entries (conversations_store,
inference_store, metadata_store) incorrectly embed the full filepath into the
SQLITE_STORE_DIR env var, causing all stores to resolve to the same file when
SQLITE_STORE_DIR is set; change each db_path to treat SQLITE_STORE_DIR as a
directory by using the env default for the directory only and append the
specific filenames (e.g., conversations.db, inference-store.db, registry.db)
outside the variable expansion, and apply the same directory-only fix to the
other occurrences of SQLITE_STORE_DIR elsewhere in the file.
- Around line 143-146: The shield entry is misconfigured: update the shields
block so the guard references a registered model instead of an external
provider_shield_id; register a guard model in the models section (e.g., add a
model entry with an id you will reference, provider_id set to openai or vllm,
and provider_model_id like meta-llama/Llama-Guard-3-8B), then change the shield
to reference that model id via shield_id and set provider_shield_id: null (or
remove provider_shield_id) so the inline::llama-guard provider uses the
registered model; ensure the model id used in the shield matches the new models
entry and that provider_id/provider_model_id are correctly populated.

In `@tests/e2e-prow/rhoai/configs/run.yaml`:
- Around line 139-158: The config references an embedding model under
default_embedding_model (model_id: nomic-ai/nomic-embed-text-v1.5) that is not
listed in the models array; add a models entry for that embedding model so it is
registered (e.g., create a new model object with model_id set to
"nomic-ai/nomic-embed-text-v1.5", provider_id "sentence-transformers", and
model_type "embedding" — include provider_model_id or other required fields
consistent with the existing models schema).

In `@tests/e2e/configs/run-ci.yaml`:
- Around line 18-28: The db_path patterns for conversations_store,
inference_store and metadata_store use SQLITE_STORE_DIR as a full-file default
which causes all paths to collapse to the same value when the env var is set;
change each to treat SQLITE_STORE_DIR as a directory (e.g. use SQLITE_STORE_DIR
defaulting to ~/.llama/storage and then append the specific filename) so
conversations_store.db_path, inference_store.db_path and metadata_store.db_path
resolve to distinct files, and make the same change for
storage.backends.kv_default.db_path and storage.backends.sql_default.db_path
(also update the analogous entries around the referenced 118-123 region).

In `@tests/e2e/configs/run-rhaiis.yaml`:
- Around line 154-158: The config's vector_stores.default_embedding_model
references provider_id "sentence-transformers" and model_id
"nomic-ai/nomic-embed-text-v1.5" but that model is not present in
registered_resources.models; either add an entry for
"nomic-ai/nomic-embed-text-v1.5" to registered_resources.models (matching the
provider_id and model_id fields) or change vector_stores.default_embedding_model
to a model_id that is already registered by the sentence-transformers provider
so the system can resolve the embedding model at runtime.
♻️ Duplicate comments (25)
examples/vllm-rhaiis.yaml (3)

17-27: Environment variable usage causes path conflicts when set.

This issue was previously flagged. All db_path entries use the same ${env.SQLITE_STORE_DIR:=...} variable. If a user sets SQLITE_STORE_DIR=/custom/path, all stores will resolve to the same path, causing database conflicts.


46-52: Mismatched environment variable for file storage directory.

This issue was previously flagged. storage_dir uses ${env.SQLITE_STORE_DIR:=...} which is semantically inconsistent for file storage.


153-157: Add nomic package dependency for embedding model.

This issue was previously flagged. The nomic-ai/nomic-embed-text-v1.5 model requires the nomic package in pyproject.toml.

examples/vertexai-run.yaml (1)

17-27: Same environment variable path conflict pattern.

All db_path entries use ${env.SQLITE_STORE_DIR:=...}. When this env var is set, all stores will point to the same path, causing data corruption. This is the same systemic issue affecting other configuration files in this PR.

examples/vllm-rhelai.yaml (3)

15-15: Minor: Trailing whitespace.

This was previously flagged.


17-27: Environment variable path conflict pattern.

This systemic issue was previously flagged across configuration files.


53-57: Configuration inconsistency: Shield registered but no safety provider.

This was previously flagged. The shield is registered at lines 143-146 with provider_id: llama-guard, and default_shield_id: llama-guard is set at line 159, but the safety providers array is empty (line 53). This will cause runtime errors.

Either remove the shield registration and default_shield_id, or configure the safety provider (uncomment lines 54-57).

Also applies to: 143-146, 158-159

examples/run.yaml (5)

65-68: Placeholder API key for braintrust scoring provider.

This was previously flagged. The openai_api_key: '********' is a placeholder that will fail at runtime. Use an environment variable instead.


119-120: Duplicate telemetry block.

This was previously flagged. The telemetry section appears twice: at lines 119-120 and again at lines 165-166. Remove the duplicate.

Also applies to: 165-166


146-162: Empty models list but embedding model referenced.

This was previously flagged. registered_resources.models is empty ([]), but default_embedding_model references nomic-ai/nomic-embed-text-v1.5. Consider registering the embedding model.


23-34: Same environment variable path conflict pattern.

All db_path entries use the same ${env.SQLITE_STORE_DIR:=...} variable with different defaults. This systemic issue affects multiple configuration files.


54-61: Critical YAML syntax error: missing newline breaks configuration.

Line 58 has provider_type: inline::llama-guard scoring: — the scoring: key is incorrectly placed on the same line. This will cause a YAML parse error and prevent the configuration from loading.

🐛 Fix the syntax error
   safety:
   - config:
       excluded_categories: []
     provider_id: llama-guard
-    provider_type: inline::llama-guard  scoring:
-  - provider_id: basic
+    provider_type: inline::llama-guard
+  scoring:
+  - provider_id: basic
     provider_type: inline::basic
     config: {}
run.yaml (2)

17-28: Same environment variable path conflict pattern.

All db_path entries use the same ${env.SQLITE_STORE_DIR:=...} variable. This is the same systemic issue affecting all configuration files in this PR. Consider using separate environment variables per store or restructuring to use the env var as a directory prefix.


60-63: Placeholder API key for braintrust scoring provider.

This was previously flagged. The openai_api_key: '********' is a placeholder that will fail at runtime if the braintrust scoring provider is used.

tests/e2e/configs/run-rhelai.yaml (2)

54-58: Safety configuration inconsistency remains unresolved.

The safety provider is disabled (safety: []) but shields are still registered and default_shield_id is set. This creates an inconsistent state that may cause confusing behavior.

Also applies to: 144-147, 159-160


154-158: Embedding model not registered.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but it's not in registered_resources.models.

tests/e2e/configs/run-ci.yaml (2)

60-63: Braintrust API key placeholder should use environment variable.

The openai_api_key: '********' is a placeholder that will cause failures if this provider is used. Per project guidelines, use environment variables for sensitive data. Based on learnings.


161-165: Embedding model mismatch persists.

The default_embedding_model uses nomic-ai/nomic-embed-text-v1.5, but only sentence-transformers/all-mpnet-base-v2 is registered. Either register the nomic model or change the default to the registered model.

tests/e2e/configs/run-rhaiis.yaml (3)

16-16: Trailing whitespace after - vector_io.

Line 16 has trailing whitespace which should be removed.


18-28: Environment variable pattern issue for database paths.

The SQLITE_STORE_DIR pattern issue flagged previously still exists. Apply directory-based pattern for consistency.

Also applies to: 115-122


46-53: Files provider storage_dir uses same env var inconsistently.

This expects a directory while other usages expect file paths. Should align with directory-based pattern.

tests/e2e/configs/run-azure.yaml (2)

137-157: Missing embedding model registration for e2e tests.

Same as the example config: default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but no embedding model is registered in registered_resources.models. This will likely cause the FAISS RAG e2e tests to fail when attempting to generate embeddings.

Register the embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    provider_id: sentence-transformers
+    model_type: embedding
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:

37-37: Update Azure OpenAI API version to current release.

The api_version: 2024-02-15-preview is outdated. For e2e tests, using an up-to-date API version ensures tests validate against current Azure OpenAI behavior.
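A hedged sketch of the bump, following the env-var pattern suggested in a later pass of this review (2024-06-01 is the GA version cited there; verify against current Azure documentation before committing):

    config:
      api_key: ${env.AZURE_API_KEY}
      api_base: ${env.AZURE_API_BASE}
      api_version: '2024-06-01'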

examples/azure-run.yaml (2)

37-37: Update Azure OpenAI API version to current release.

The api_version: 2024-02-15-preview is outdated. The latest GA version is 2024-06-01, with newer preview versions available. Update to at least the stable release.


137-157: Missing embedding model registration.

The default_embedding_model at lines 155-157 references nomic-ai/nomic-embed-text-v1.5 with sentence-transformers provider, but no embedding model is registered in registered_resources.models. This may cause runtime failures when vector operations attempt to use the embedding model.

Register the embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    provider_id: sentence-transformers
+    model_type: embedding
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
🧹 Nitpick comments (4)
tests/e2e/configs/run-vertexai.yaml (2)

18-28: Environment variable pattern may cause database path collisions.

All db_path entries use ${env.SQLITE_STORE_DIR:=<full_path>} where the env var completely replaces the path. If SQLITE_STORE_DIR is set, all databases point to that same value.

Consider using the env var as a directory base:

🔧 Suggested fix
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
   type: sqlite
 ...
 inference_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
   type: sqlite
 metadata_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
   type: sqlite

116-122: Same SQLITE_STORE_DIR pattern issue in storage backends.

Apply the same fix to storage backend paths for consistency.

🔧 Suggested fix
     kv_default:
       type: kv_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/rag/kv_store.db
     sql_default:
       type: sql_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db
tests/e2e/configs/run-rhelai.yaml (1)

18-28: Environment variable pattern may cause database path collisions.

Same SQLITE_STORE_DIR pattern issue as other config files. If the env var is set, all database paths resolve to the same value.

🔧 Apply directory-based pattern
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db

Apply similarly to inference_store, metadata_store, and storage.backends paths.

Also applies to: 119-119, 122-122

tests/e2e-prow/rhoai/configs/run.yaml (1)

18-28: Environment variable pattern may cause database path collisions.

All db_path entries use ${env.SQLITE_STORE_DIR:=<full_path>}. If SQLITE_STORE_DIR is set, all databases point to that same value, causing conflicts.

🔧 Use directory-based pattern
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db

Apply to all db_path entries.

Also applies to: 119-119, 122-122

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2b7f5e and 57b58a4.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/openai-pgvector-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
  • pyproject.toml
  • tests/e2e/test_list.txt
  • tests/e2e/features/faiss.feature
  • docs/providers.md
  • docker-compose.yaml
  • docker-compose-library.yaml
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-azure.yaml
  • examples/vllm-rhoai.yaml
  • examples/azure-run.yaml
  • examples/vertexai-run.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • examples/vllm-rhaiis.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • examples/vllm-rhelai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • examples/vllm-rhaiis.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (18)
examples/vllm-rhoai.yaml (6)

38-52: LGTM on provider additions.

The OpenAI and sentence-transformers provider configurations are correctly structured. Note that the storage_dir on line 50 is affected by the same environment variable issue flagged above.


58-72: LGTM!

The scoring, tool_runtime, and vector_io provider configurations are well-structured. The FAISS vector_io provider with persistence to kv_default backend aligns with the PR objective of fully supporting the FAISS VectorIO provider.


73-110: LGTM!

Provider configurations for agents, batches, datasetio, and eval follow consistent patterns with proper namespace isolation and backend references.


111-121: Storage backend configuration is consistent.

The storage backends correctly define kv_default and sql_default that are referenced by providers. The environment variable issue on lines 118 and 121 is covered in the earlier comment.


142-142: Verify if provider_model_id: null is intentional.

Setting provider_model_id to null may indicate auto-detection or could be an incomplete configuration. Please confirm this is the intended behavior for the vLLM provider.


153-161: LGTM!

The vector_stores configuration correctly sets FAISS as the default provider with nomic-ai embedding model, aligning with the PR objectives. Based on learnings, the telemetry configuration with just enabled: true is correct for Llama Stack 0.3.x.

examples/vllm-rhaiis.yaml (1)

53-72: LGTM: Safety and vector_io providers are properly configured.

The safety provider (llama-guard) is correctly defined, matching the shield registration at lines 143-146. The vector_io faiss provider with kv_default persistence is properly configured.

examples/vertexai-run.yaml (2)

30-43: LGTM: Vertex AI inference provider configuration.

The Google Vertex AI remote provider is properly configured with environment-driven project/location settings and an allowed_models list. The sentence-transformers inline provider for embeddings is also correctly defined.


148-156: No action needed. The configuration file specifies a Hugging Face model ID (nomic-ai/nomic-embed-text-v1.5) that will be resolved by Llama Stack at runtime, not a direct Python package dependency. The nomic package is neither imported in the codebase nor required as a project dependency.

Likely an incorrect or invalid review comment.

examples/vllm-rhelai.yaml (1)

66-72: LGTM: FAISS vector_io provider configuration.

The faiss provider with kv_default persistence namespace is correctly configured for RAG storage.

run.yaml (2)

139-155: LGTM: Model registration and tool groups.

The gpt-4o-mini model is properly registered with the OpenAI provider, and the RAG tool group is correctly configured with the rag-runtime provider.


156-164: No action needed. The configuration is correct.

The nomic-ai/nomic-embed-text-v1.5 model is compatible with sentence-transformers and is already in use throughout the codebase (run.yaml, e2e configs, examples). The sentence-transformers>=5.0.0 dependency is present in pyproject.toml. No separate "nomic" package dependency is required—sentence-transformers automatically downloads and loads HuggingFace models.

Likely an incorrect or invalid review comment.

tests/e2e/configs/run-vertexai.yaml (1)

156-157: LGTM!

Telemetry configuration is correct - using telemetry.enabled: true without a provider block, which aligns with Llama Stack 0.3.x requirements. Based on learnings.

tests/e2e/configs/run-rhelai.yaml (1)

30-111: LGTM!

Provider structure is well-organized with proper list markers and indentation. The YAML structure issues from past reviews have been addressed.

tests/e2e-prow/rhoai/configs/run.yaml (2)

140-143: Verify provider_model_id: null is intentional.

The model registration has provider_model_id: null. Confirm this is intended for dynamic model resolution by vllm, or if it should match model_id.


30-111: LGTM!

Provider configuration structure is consistent and well-organized.

tests/e2e/configs/run-ci.yaml (1)

64-74: LGTM!

Good inline comments explaining the RAG tool and FAISS storage configuration. The structure is clear and self-documenting.

tests/e2e/configs/run-rhaiis.yaml (1)

92-111: LGTM!

The datasetio and eval provider sections now have correct YAML structure with proper list markers. The structure issues from past reviews have been addressed.


@are-ces force-pushed the faiss-support-final branch from 57b58a4 to 0d28b3d (January 14, 2026 09:29)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@examples/vertexai-run.yaml`:
- Around line 136-147: registered_resources currently has models: [] despite
inference providers being configured; add explicit model registrations under the
registered_resources.models array for each allowed provider/model (e.g., add
entries referencing google/gemini-2.5-flash for the google-vertex provider and
any OpenAI or sentence-transformers models you intend to use) so that the system
does not rely on implicit auto-registration; update the models array to include
model objects with model_id/model_name, provider_id (matching provider
definitions like google-vertex, openai, sentence-transformers), and any required
fields such as default/allowed flags to align with existing provider configs.

In `@run.yaml`:
- Around line 17-27: The env pattern currently uses
${env.SQLITE_STORE_DIR:=<full_path>} as the entire DB path which makes
conversations_store, inference_store, and metadata_store resolve to the same
file when SQLITE_STORE_DIR is set; change each path to use the env var as a
directory prefix (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db) so each store (look
for keys conversations_store, inference_store, metadata_store and the other
occurrences mentioned at lines ~45, ~120, ~123) points to its own filename under
the provided directory.

In `@tests/e2e/configs/run-vertexai.yaml`:
- Around line 149-157: The config uses vector_stores.default_embedding_model
with provider_id "sentence-transformers" and model_id
"nomic-ai/nomic-embed-text-v1.5" but that model is not listed under
registered_resources.models (unlike run-ci.yaml which registers
sentence-transformers/all-mpnet-base-v2); either add an entry for
"sentence-transformers" -> model "nomic-ai/nomic-embed-text-v1.5" to
registered_resources.models in this YAML, or explicitly document/verify that the
sentence-transformers provider will auto-register unlisted models and add a
comment or test to assert that auto-registration behavior for
vector_stores.default_embedding_model works as intended.

In `@tests/e2e/features/faiss.feature`:
- Around line 8-22: The Authorization header in the faiss.feature step "I set
the Authorization header to Bearer ..." contains a truncated JWT and must be
replaced with a complete, valid test token (three base64url parts) or pulled
from a test config/env var; also avoid hardcoding the vector store id
"vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee" in the response assertion — instead
read the expected id from a fixture/config or change the assertion in the "Then
The body of the response has the following structure" step to verify the rags
array contains an id pattern (or presence) rather than an exact hardcoded value
so the test is not tightly coupled to specific test data.
♻️ Duplicate comments (16)
examples/vllm-rhoai.yaml (1)

144-147: Shield configuration still incomplete for inline::llama-guard.

This issue was previously flagged: the inline::llama-guard provider requires a registered Llama Guard model. The current provider_shield_id: openai/gpt-4o-mini is invalid for this provider type—it should be null, and shield_id should reference a registered guard model (e.g., meta-llama/Llama-Guard-3-8B).
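If that reading is right, the registration might look like this (the guard model id is illustrative and would also need to be added under registered_resources.models):

  shields:
  - shield_id: meta-llama/Llama-Guard-3-8B
    provider_id: llama-guard
    provider_shield_id: null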

tests/e2e-prow/rhoai/configs/run.yaml (1)

139-158: Embedding model referenced but not registered in models list.

The default_embedding_model at line 158 references nomic-ai/nomic-embed-text-v1.5 with provider sentence-transformers, but only meta-llama/Llama-3.2-1B-Instruct is registered in registered_resources.models. Consider adding the embedding model registration for consistency and to prevent potential runtime issues if the system requires explicit model registration.

examples/vllm-rhelai.yaml (1)

16-16: Minor: Trailing whitespace on line 16.

This line has trailing whitespace after the APIs list which should be removed for cleaner formatting.

tests/e2e/features/faiss.feature (1)

24-34: Same truncated JWT token issue.

This scenario has the same incomplete JWT token as flagged in past review comments. Please provide a complete, valid JWT with all three segments.

tests/e2e/configs/run-rhaiis.yaml (2)

16-16: Trailing whitespace persists on line 16.

The line after - vector_io contains trailing whitespace which should be removed.


154-162: Embedding model registration question applies here as well.

Same consideration as other configs - the nomic-ai/nomic-embed-text-v1.5 model may need explicit registration in registered_resources.models depending on whether sentence-transformers auto-registers models.

Telemetry configuration is correct per Llama Stack 0.3.x conventions. Based on learnings, this is the proper approach.

examples/vllm-rhaiis.yaml (1)

153-161: Verify nomic package dependency for embedding model.

The nomic-ai/nomic-embed-text-v1.5 model may require the nomic package as a runtime dependency. A previous review comment noted that while einops>=0.8.1 is in pyproject.toml, the nomic package might be missing.

run.yaml (2)

139-160: Missing embedding model registration.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 (line 160), but no embedding model is registered in registered_resources.models (lines 140-144). Only the LLM model gpt-4o-mini is registered, which will cause RAG operations to fail.

Register the embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: openai
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:

60-63: Placeholder API key for braintrust scoring provider.

The openai_api_key: '********' is a redacted placeholder that will fail at runtime if the braintrust scoring provider is used. Per learnings, use environment variables for sensitive data.

🔧 Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
examples/azure-run.yaml (1)

137-157: Missing embedding model registration.

The default_embedding_model at line 155-157 references nomic-ai/nomic-embed-text-v1.5, but no embedding model is registered in registered_resources.models (lines 138-142). Only the LLM model gpt-4o-mini is registered. This will cause RAG operations to fail when the system tries to resolve the embedding model.

Register the embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
tests/e2e/configs/run-rhelai.yaml (1)

154-158: Missing embedding model registration.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but no embedding model is registered in registered_resources.models (lines 139-143). Only the LLM model ${env.RHEL_AI_MODEL} is registered. This inconsistency will cause RAG operations to fail.

Register the embedding model
 registered_resources:
   models:
   - model_id: ${env.RHEL_AI_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHEL_AI_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
tests/e2e/configs/run-ci.yaml (2)

60-63: Placeholder API key for braintrust scoring provider.

The openai_api_key: '********' is a placeholder that will fail at runtime if the braintrust scoring provider is used. Per project guidelines from learnings, use environment variables for sensitive data.

🔧 Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}

161-165: Embedding model mismatch: default model not registered.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 (line 165), but the registered embedding model is sentence-transformers/all-mpnet-base-v2 (lines 145-150). This inconsistency may cause runtime failures when RAG tries to use the unregistered default model.

Option A: Register the nomic embedding model
   - model_id: sentence-transformers/all-mpnet-base-v2
     model_type: embedding
     provider_id: sentence-transformers
     provider_model_id: sentence-transformers/all-mpnet-base-v2
     metadata:
       embedding_dimension: 768
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
Option B: Use the registered model as default
 vector_stores:
   default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2
examples/run.yaml (3)

58-58: Critical YAML syntax error: scoring: key on same line breaks parsing.

Line 58 has provider_type: inline::llama-guard scoring: — the scoring: key is incorrectly placed on the same line as the value. This is invalid YAML and will cause a parse error, preventing the configuration from loading.

🐛 Fix the syntax error
   safety:
   - config:
       excluded_categories: []
     provider_id: llama-guard
-    provider_type: inline::llama-guard  scoring:
+    provider_type: inline::llama-guard
+  scoring:
   - provider_id: basic
     provider_type: inline::basic

119-120: Duplicate telemetry blocks.

The telemetry section appears twice: at lines 119-120 and again at lines 175-176. This is likely a copy-paste error and may cause unexpected YAML parsing behavior where the second occurrence overrides the first.

Remove the duplicate at lines 119-120
 scoring_fns: []
-telemetry:
-  enabled: true
 server:
   port: 8321

Also applies to: 175-176


168-172: Embedding model mismatch: default differs from registered.

The default_embedding_model uses nomic-ai/nomic-embed-text-v1.5 (line 172), but the registered embedding model is sentence-transformers/all-mpnet-base-v2 (lines 152-157). Either register the nomic model or align the default to the registered model.

Option A: Register the nomic model
   - model_id: sentence-transformers/all-mpnet-base-v2
     model_type: embedding
     provider_id: sentence-transformers
     provider_model_id: sentence-transformers/all-mpnet-base-v2
     metadata:
       embedding_dimension: 768
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
Option B: Use registered model as default
 vector_stores:
   default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2
🧹 Nitpick comments (2)
tests/e2e/configs/run-azure.yaml (1)

16-16: Remove trailing whitespace.

Line 16 contains trailing whitespace after the - vector_io entry, similar to other config files.

Proposed fix
 - vector_io
-      
+
examples/vllm-rhaiis.yaml (1)

4-15: Missing telemetry in apis list.

Unlike the other configuration files (run-vertexai.yaml, run-rhaiis.yaml, run-azure.yaml) which include - telemetry in their apis list, this file omits it. However, telemetry.enabled: true is set at line 160. For consistency with other configs and to ensure the telemetry API is available, consider adding - telemetry to the apis list.

Also, line 15 contains trailing whitespace.

Proposed fix
 - scoring
+- telemetry
 - tool_runtime
 - vector_io
-      
+
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57b58a4 and 0d28b3d.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-llama-faiss-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • pyproject.toml
  • tests/e2e/test_list.txt
  • docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (1)
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
🧠 Learnings (6)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e-prow/rhoai/configs/run.yaml
  • examples/vllm-rhaiis.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/vllm-rhoai.yaml
  • examples/vllm-rhelai.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • examples/azure-run.yaml
  • examples/vertexai-run.yaml
  • examples/run.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-vertexai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • examples/vllm-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/features/faiss.feature
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (30)
examples/vllm-rhoai.yaml (5)

17-28: Store paths are now distinct and consistent.

The hardcoded paths for conversations_store, inference_store, and metadata_store each point to separate .db files under ~/.llama/storage/, which resolves the previous concern about path conflicts.


34-37: Note: TLS verification disabled for vLLM endpoint.

tls_verify: false is acceptable for internal cluster traffic or development, but ensure this is intentional for your deployment environment. For production with external endpoints, enabling TLS verification is recommended.
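For an external or production endpoint, the same block would simply flip the flag (env var names here are illustrative, not taken from this manifest):

    config:
      url: ${env.VLLM_URL}
      tls_verify: true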


67-73: FAISS vector_io provider correctly configured.

The inline::faiss provider with kv_default persistence backend aligns with the PR objective to support the FAISS VectorIO provider.


154-158: FAISS default and embedding model configuration looks good.

Setting faiss as the default vector provider with nomic-ai/nomic-embed-text-v1.5 for embeddings is appropriate for the FAISS VectorIO support objective.


161-162: Telemetry configuration is correct for Llama Stack 0.3.x.

Using just telemetry.enabled: true without a provider block is the correct approach for version 0.3.x. Based on learnings, telemetry provider configuration is not supported under the providers section in this version.

tests/e2e-prow/rhoai/configs/run.yaml (3)

1-28: LGTM! Configuration header and store paths are well-structured.

The version, image_name, APIs list, and sqlite store configurations follow the expected pattern. The telemetry API addition aligns with the broader PR changes.


30-73: Provider configurations look correct.

The providers section properly configures inference (vllm, openai, sentence-transformers), files, safety, scoring, tool_runtime, and vector_io with faiss. The faiss persistence namespace vector_io::faiss with kv_default backend follows the expected pattern.


161-162: Telemetry configuration is correct.

Per learnings, in Llama Stack 0.3.x, telemetry can be enabled with just telemetry.enabled: true without requiring an explicit provider block. This configuration follows that pattern correctly.

examples/vertexai-run.yaml (2)

30-43: VertexAI inference provider configuration looks correct.

The google-vertex provider properly uses environment variables for project and location, and specifies allowed_models. The sentence-transformers inline provider for embeddings is appropriately configured.


148-156: Vector stores and telemetry configuration is consistent.

The faiss default provider and nomic-embed-text embedding model configuration matches the pattern used across other configuration files in this PR. Telemetry enablement follows the correct approach for Llama Stack 0.3.x.

examples/vllm-rhelai.yaml (3)

54-58: Safety provider is now properly configured.

The previous review flagged an inconsistency where shields were registered but no safety provider existed. This has been addressed - the llama-guard safety provider is now configured with excluded_categories: []. This resolves the configuration inconsistency.


138-147: Model and shield configuration aligns with provider setup.

The model uses environment variables (${env.RHEL_AI_MODEL}) which provides flexibility for different deployments. The shield configuration with llama-guard now has a backing safety provider, resolving the previous inconsistency.


154-162: Vector stores and telemetry configuration is consistent with other configs.

The faiss default provider, nomic-embed-text embedding model, and telemetry enablement follow the established pattern across all configuration files in this PR.

docs/providers.md (1)

160-160: rag-runtime support status correctly updated.

The rag-runtime tool runtime provider is now marked as supported, which aligns with the PR's FAISS VectorIO integration and the addition of e2e tests for RAG functionality. The pip dependencies listed are comprehensive for RAG operations.

docker-compose.yaml (1)

14-14: RAG volume mount supports FAISS e2e testing.

The added volume mount maps the test RAG database (./tests/e2e/rag) to the container's expected storage path (/opt/app-root/src/.llama/storage/rag). The test data directory exists with the expected kv_store.db file. The :Z flag is appropriate for SELinux relabeling in containerized environments.
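For context, the mount looks like this in the compose file (the service name is an assumption; only the volume entry itself is taken from this PR):

services:
  llama-stack:
    volumes:
    - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z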

tests/e2e/features/faiss.feature (1)

1-6: LGTM - Well-structured BDD feature file.

The Background setup correctly initializes the service and REST API prefix, following proper Gherkin conventions as expected for behave e2e tests. Based on learnings, this aligns with the BDD framework guidelines.

tests/e2e/configs/run-vertexai.yaml (3)

2-2: LGTM - Appropriate image name for VertexAI configuration.


31-44: Well-structured inference providers.

The provider configuration correctly sets up VertexAI as the primary inference provider with environment-based credentials, OpenAI as secondary, and sentence-transformers for embeddings. The inline sentence-transformers configuration aligns with the vector_stores default_embedding_model.


66-72: FAISS vector_io provider correctly configured.

The persistence configuration with namespace: vector_io::faiss and backend: kv_default properly integrates with the storage backend defined at line 118.

tests/e2e/configs/run-rhaiis.yaml (4)

1-3: LGTM - Version and image name updates.

The configuration correctly identifies as version 2 with rhaiis-configuration image name.


17-28: Database paths now use hardcoded values - addresses previous env var concern.

The db_path entries for conversations_store, inference_store, and metadata_store now use direct paths instead of the problematic ${env.SQLITE_STORE_DIR:=...} pattern that would have caused path conflicts. This resolves the previously flagged issue.


43-45: Sentence-transformers provider correctly configured.

The inline sentence-transformers provider with empty config aligns with the vector_stores default_embedding_model specification.


67-73: FAISS vector_io provider properly configured.

Configuration matches the pattern used in other config files with correct namespace and backend references.

tests/e2e/configs/run-azure.yaml (4)

2-2: LGTM - Azure configuration image name.


31-44: Azure inference provider properly configured.

The Azure provider configuration includes appropriate fields (api_key, api_base, api_version) with environment variable substitution. The OpenAI and sentence-transformers providers are correctly set up as secondary options.


66-72: FAISS vector_io configuration consistent with other configs.


153-161: Vector stores and telemetry configuration looks good.

The configuration follows the established pattern. The v1.5 embedding model choice was previously noted in past comments as having a v2 alternative available, but v1.5 remains a valid production choice.

examples/vllm-rhaiis.yaml (3)

1-3: LGTM - Version and image name updates.


17-27: Database paths now use hardcoded values - addresses previous env var concern.

The db_path entries now use direct paths (~/.llama/storage/...) instead of the problematic ${env.SQLITE_STORE_DIR:=...} pattern. This resolves the previously flagged path conflict issue.


46-52: Storage directory path now hardcoded - addresses previous concern.

The storage_dir now uses a direct path instead of the mismatched SQLITE_STORE_DIR env var. This resolves the semantic inconsistency flagged previously.


Comment on lines 136 to 147
 registered_resources:
   models: []
-  shields: []
-  vector_dbs: []
+  shields:
+  - shield_id: llama-guard
+    provider_id: llama-guard
+    provider_shield_id: openai/gpt-4o-mini
   datasets: []
   scoring_fns: []
   benchmarks: []
   tool_groups:
   - toolgroup_id: builtin::rag
     provider_id: rag-runtime

⚠️ Potential issue | 🟡 Minor

No models registered despite having inference providers configured.

The models: [] on line 137 means no models are explicitly registered, yet the inference providers (google-vertex, openai, sentence-transformers) are configured. While some providers may auto-register models, this could cause issues if the system expects explicit model registration. Consider adding model entries for the allowed models (e.g., google/gemini-2.5-flash from line 36).

🔧 Suggested model registration
 registered_resources:
-  models: []
+  models:
+  - model_id: google/gemini-2.5-flash
+    provider_id: google-vertex
+    model_type: llm
+    provider_model_id: google/gemini-2.5-flash
   shields:
   - shield_id: llama-guard
🤖 Prompt for AI Agents
In `@examples/vertexai-run.yaml` around lines 136 - 147, registered_resources
currently has models: [] despite inference providers being configured; add
explicit model registrations under the registered_resources.models array for
each allowed provider/model (e.g., add entries referencing
google/gemini-2.5-flash for the google-vertex provider and any OpenAI or
sentence-transformers models you intend to use) so that the system does not rely
on implicit auto-registration; update the models array to include model objects
with model_id/model_name, provider_id (matching provider definitions like
google-vertex, openai, sentence-transformers), and any required fields such as
default/allowed flags to align with existing provider configs.

run.yaml Outdated
Comment on lines 17 to 27
 conversations_store:
-  db_path: ~/.llama/storage/conversations.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
   type: sqlite
 datasets: []
 image_name: starter
 # external_providers_dir: /opt/app-root/src/.llama/providers.d
 inference_store:
-  db_path: ~/.llama/storage/inference-store.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
   type: sqlite
 metadata_store:
-  db_path: ~/.llama/storage/registry.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}

⚠️ Potential issue | 🟠 Major

Environment variable pattern causes path collisions when set.

All store paths use ${env.SQLITE_STORE_DIR:=<full_path>} with different default values. If SQLITE_STORE_DIR is set, all paths resolve to the same value, causing database collisions. The pattern should use the env var as a directory prefix, not the entire path.

🔧 Use directory-based pattern
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
   type: sqlite
 ...
 inference_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
   type: sqlite
 metadata_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
   type: sqlite

Also apply to lines 45, 120, and 123.

🤖 Prompt for AI Agents
In `@run.yaml` around lines 17 - 27, The env pattern currently uses
${env.SQLITE_STORE_DIR:=<full_path>} as the entire DB path which makes
conversations_store, inference_store, and metadata_store resolve to the same
file when SQLITE_STORE_DIR is set; change each path to use the env var as a
directory prefix (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db) so each store (look
for keys conversations_store, inference_store, metadata_store and the other
occurrences mentioned at lines ~45, ~120, ~123) points to its own filename under
the provided directory.

Comment on lines 8 to 20
  Scenario: Verify vector store is registered
    Given The system is in default state
    And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
    And REST API service hostname is localhost
    And REST API service port is 8080
    When I access REST API endpoint rags using HTTP GET method
    Then The status code of the response is 200
    And the body of the response has the following structure
    """
    {
      "rags": [
        "vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee"
      ]
    }
    """

⚠️ Potential issue | 🟡 Minor

Truncated JWT token on line 10.

The Authorization Bearer token is incomplete—it's missing the signature segment. A valid JWT requires three Base64url-encoded parts separated by dots (header.payload.signature). The token ends abruptly at Ikpva which appears to be mid-payload.

Additionally, the hardcoded vector_store ID vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee tightly couples this test to specific test data. Consider documenting this dependency or making it configurable.

🤖 Prompt for AI Agents
In `@tests/e2e/features/faiss.feature` around lines 8 - 22, The Authorization
header in the faiss.feature step "I set the Authorization header to Bearer ..."
contains a truncated JWT and must be replaced with a complete, valid test token
(three base64url parts) or pulled from a test config/env var; also avoid
hardcoding the vector store id "vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee" in the
response assertion — instead read the expected id from a fixture/config or
change the assertion in the "Then The body of the response has the following
structure" step to verify the rags array contains an id pattern (or presence)
rather than an exact hardcoded value so the test is not tightly coupled to
specific test data.

@are-ces force-pushed the faiss-support-final branch from 0d28b3d to 9ae2114 (January 14, 2026 10:32)
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
examples/vllm-rhaiis.yaml (1)

4-15: Add telemetry to apis list to match all other configuration files.

The file is inconsistent with the established pattern: every other YAML configuration file in the codebase includes - telemetry in the apis list when telemetry.enabled: true is configured. This file is the only exception. Add telemetry to maintain consistency.

Proposed fix
 apis:
 - agents
 - batches
 - datasetio
 - eval
 - files
 - inference
 - safety
 - scoring
+- telemetry
 - tool_runtime
 - vector_io
run.yaml (1)

139-145: Add embedding model registration to registered_resources.models for consistency with test configurations.

The embedding model nomic-ai/nomic-embed-text-v1.5 is defined in vector_stores.default_embedding_model but not registered in registered_resources.models. Test configurations (run-ci.yaml, examples/run.yaml) register their embedding models with model_type: embedding in the models section. Add a similar entry here:

Example from run-ci.yaml
registered_resources:
  models:
  - model_id: gpt-4o-mini
    provider_id: openai
    model_type: llm
    provider_model_id: gpt-4o-mini
  - model_id: sentence-transformers/all-mpnet-base-v2
    model_type: embedding
    provider_id: sentence-transformers
    provider_model_id: sentence-transformers/all-mpnet-base-v2
    metadata:
      embedding_dimension: 768
🤖 Fix all issues with AI agents
In `@examples/run.yaml`:
- Around line 146-173: The registered embedding model (model_id:
sentence-transformers/all-mpnet-base-v2) does not match the
default_embedding_model (model_id: nomic-ai/nomic-embed-text-v1.5); update the
registered_resources.models entry to register nomic-ai/nomic-embed-text-v1.5
with provider_id nomic-ai (or alternatively change default_embedding_model to
sentence-transformers/all-mpnet-base-v2) so the configured default embedding
model and the registered embedding model match.

In `@examples/vllm-rhelai.yaml`:
- Around line 138-161: Add a registered_resources.models entry for the embedding
model referenced by vector_stores.default_embedding_model (model_id
nomic-ai/nomic-embed-text-v1.5) so it’s explicitly declared; create a model
object with model_id: nomic-ai/nomic-embed-text-v1.5, model_type: embedding,
provider_id: sentence-transformers, provider_model_id:
nomic-ai/nomic-embed-text-v1.5 and include metadata.embedding_dimension: 768 (or
the correct dimension), and apply the same addition to the corresponding
manifest that also references this embedding.

In `@examples/vllm-rhoai.yaml`:
- Around line 154-158: The YAML references an embedding model under
vector_stores.default_embedding_model (provider_id: sentence-transformers,
model_id: nomic-ai/nomic-embed-text-v1.5) but that model is not registered in
registered_resources.models; add an entry for this model in
registered_resources.models with the same provider_id and model_id (and any
required fields like type/params per registry schema) so the model is
discoverable at runtime and avoids failures when the vector store initializes.

In `@tests/e2e/configs/run-azure.yaml`:
- Around line 34-37: The config block currently hardcodes api_base and
api_version while api_key uses ${env.AZURE_API_KEY}; change api_base and
api_version to environment variables (e.g., ${env.AZURE_API_BASE} and
${env.AZURE_API_VERSION}) so they follow the same externalized pattern as
api_key and allow central version/endpoint bumps; update any references or docs
that expect those values if needed and ensure tests/CI set those env vars.

In `@tests/e2e/features/faiss.feature`:
- Around line 25-32: Fix the inconsistent indentation in the feature steps by
aligning the "Then The status code of the response is 200" and "And The response
should contain following fragments" lines with the preceding "When I use
\"query\"..." step; remove the extra leading spaces so all steps use the same
indentation level (match lines containing 'When I use "query"' and the
multi-line JSON block) to keep formatting consistent and readable.
♻️ Duplicate comments (19)
examples/vllm-rhoai.yaml (1)

144-147: Shield configuration for inline::llama-guard looks invalid/misleading.
provider_shield_id: openai/gpt-4o-mini doesn’t look like a shield identifier; it looks like a model id. If Llama Stack expects the shield to reference a registered guard model (and provider_shield_id to be null/omitted for inline llama-guard), this example will break.

Llama Stack 0.3.x inline::llama-guard: how should registered_resources.shields be configured? Should provider_shield_id be omitted/null and shield_id reference a registered guard model?

Also applies to: 55-58

examples/azure-run.yaml (3)

34-38: Update api_version to a current Azure OpenAI version.
2024-02-15-preview is likely stale; examples should point to a currently supported GA (or clearly label as preview).

Azure OpenAI: what is the latest supported GA api-version as of 2026-01, and what preview versions are currently supported?

137-158: default_embedding_model references an unregistered embedding model (example may fail).
Register nomic-ai/nomic-embed-text-v1.5 under registered_resources.models (provider: sentence-transformers) or change the default to a registered embedding.

Proposed fix
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    provider_id: sentence-transformers
+    model_type: embedding
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
   shields:

143-146: Shield config for inline::llama-guard likely misconfigured.
Same concern as other manifests: provider_shield_id: openai/gpt-4o-mini looks like a model id, not a shield id.

Llama Stack 0.3.x inline::llama-guard: confirm correct shields configuration fields and whether provider_shield_id should be null/omitted.

Also applies to: 53-58

examples/vertexai-run.yaml (1)

34-43: Register the Vertex LLM and the default embedding model (current config has models: []).
registered_resources.models is empty while allowed_models (Vertex) and default_embedding_model (sentence-transformers) are both set, so this example will likely fail unless Llama Stack auto-registers models for these providers.

Proposed fix (explicit registrations)
 registered_resources:
-  models: []
+  models:
+  - model_id: google/gemini-2.5-flash
+    provider_id: google-vertex
+    model_type: llm
+    provider_model_id: google/gemini-2.5-flash
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    provider_id: sentence-transformers
+    model_type: embedding
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
   shields:
   - shield_id: llama-guard
Llama Stack 0.3.x: does registered_resources.models need explicit entries for remote::vertexai allowed_models and for vector_stores.default_embedding_model?

Also applies to: 136-152

tests/e2e/features/faiss.feature (2)

8-12: Don’t commit truncated/hardcoded Bearer tokens; use a valid test token from env/fixture.
Current token strings are incomplete JWTs and will fail if auth is enforced; even if auth is lax, hardcoding tokens in-repo is brittle.

Example change (env-driven token)
-    And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
+    And I set the Authorization header to Bearer ${E2E_JWT}
behave / Gherkin tests: what is the recommended way in this repo to inject secrets (like Authorization tokens) into e2e runs—env vars, fixture files, or a step that mints a local token?

Also applies to: 22-25


13-20: Avoid asserting a single hardcoded vector store id.
Asserting "vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee" will break if the fixture DB is regenerated. Prefer asserting non-empty, matching an ^vs_[0-9a-f-]{36}$ shape, or asserting membership against an id read from the fixture you mount.

tests/e2e/configs/run-vertexai.yaml (1)

137-153: Register (or prove auto-registration of) the default embedding model.
vector_stores.default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but registered_resources.models is empty, which can break model resolution depending on how strict config validation is.

Proposed fix (explicitly register the embedding model)
 registered_resources:
-  models: []
+  models:
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
   shields:
   - shield_id: llama-guard
     provider_id: llama-guard
     provider_shield_id: openai/gpt-4o-mini
tests/e2e/configs/run-rhaiis.yaml (2)

138-158: Register (or prove auto-registration of) nomic-ai/nomic-embed-text-v1.5.
vector_stores.default_embedding_model references nomic v1.5, but registered_resources.models only declares the RHAIIS LLM.

Proposed fix
 registered_resources:
   models:
   - model_id: ${env.RHAIIS_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHAIIS_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5

1-2: Config header changes look fine; trim the stray whitespace-only line.
Line 16 appears to contain trailing whitespace after - vector_io.

Proposed fix
 - vector_io
-      
+

Also applies to: 13-16

tests/e2e/configs/run-ci.yaml (2)

140-150: Fix embedding model mismatch: default embedding model is not registered.
registered_resources.models registers sentence-transformers/all-mpnet-base-v2, but vector_stores.default_embedding_model points to nomic-ai/nomic-embed-text-v1.5.

Option A: Register the nomic embedding model
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: openai
     model_type: llm
     provider_model_id: gpt-4o-mini
   - model_id: sentence-transformers/all-mpnet-base-v2
     model_type: embedding
     provider_id: sentence-transformers
     provider_model_id: sentence-transformers/all-mpnet-base-v2
     metadata:
       embedding_dimension: 768
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
Option B: Align default to the already-registered embedding model
 vector_stores:
   default_provider_id: faiss
   default_embedding_model: # Define the default embedding model for RAG
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2

Also applies to: 161-165


60-63: Don’t commit Braintrust/OpenAI keys (even placeholders) in CI configs.
Replace openai_api_key: '********' with an env var (or remove the provider if not used in CI).

Proposed fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
tests/e2e/configs/run-rhelai.yaml (1)

154-158: Register (or align) the embedding model referenced by vector_stores.
nomic-ai/nomic-embed-text-v1.5 is used as default embedding model but isn’t declared under registered_resources.models.

Proposed fix
 registered_resources:
   models:
   - model_id: ${env.RHEL_AI_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHEL_AI_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5

Also applies to: 138-143

tests/e2e-prow/rhoai/configs/run.yaml (1)

154-158: Register the embedding model referenced by default_embedding_model.

Proposed fix
 registered_resources:
   models:
   - model_id: meta-llama/Llama-3.2-1B-Instruct
     provider_id: vllm
     model_type: llm
     provider_model_id: null
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
   shields:

Also applies to: 139-148

examples/run.yaml (3)

54-61: Fix YAML parse error: scoring: is on the same line as provider_type.

Proposed fix
   safety:
   - config:
       excluded_categories: []
     provider_id: llama-guard
-    provider_type: inline::llama-guard  scoring:
+    provider_type: inline::llama-guard
+  scoring:
   - provider_id: basic
     provider_type: inline::basic
     config: {}

65-69: Don’t ship placeholder secrets ('********') in runnable examples.

Proposed fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}

119-120: Remove duplicate top-level telemetry block.
YAML parsers typically keep the last occurrence of a duplicate key, so the second block can silently override the first and confuse readers.

Proposed fix
-telemetry:
-  enabled: true

Also applies to: 175-176

run.yaml (2)

18-28: ${env.SQLITE_STORE_DIR:=...} is used as a full file path in multiple places, so different stores collide on the same file when the variable is set.

Proposed fix (treat SQLITE_STORE_DIR as a directory)
 conversations_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
   type: sqlite
 ...
 inference_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
   type: sqlite
 metadata_store:
-  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
   type: sqlite
 ...
   files:
   - config:
       metadata_store:
         table_name: files_metadata
         backend: sql_default
-      storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage/files}
+      storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/files
 ...
 storage:
   backends:
     kv_default:
       type: kv_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/rag/kv_store.db
     sql_default:
       type: sql_sqlite
-      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}
+      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db

Also applies to: 45-47, 118-123


60-63: Don’t ship placeholder secrets ('********') in runnable configs.

Proposed fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🧹 Nitpick comments (2)
examples/vllm-rhaiis.yaml (1)

30-42: Consider dropping the unused openai inference provider (or register a model for it).
Right now providers.inference includes openai, but all registered models point at vllm, so this can mislead example users.

Also applies to: 138-143

examples/vllm-rhelai.yaml (1)

30-45: Consider dropping the unused openai inference provider (or register a model for it).
Keeps the example focused and avoids implying OpenAI is active when it isn’t.

Also applies to: 139-143

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d28b3d and 9ae2114.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-granite-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
  • pyproject.toml
  • docker-compose-library.yaml
  • tests/e2e/test_list.txt
  • docs/providers.md
🧰 Additional context used
📓 Path-based instructions (1)
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
🧠 Learnings (6)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/features/faiss.feature
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • examples/run.yaml
  • examples/vllm-rhoai.yaml
  • examples/azure-run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • examples/vllm-rhaiis.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-rhelai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • examples/vllm-rhaiis.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (14)
examples/vertexai-run.yaml (1)

4-15: Include telemetry in the apis list to match the configuration pattern in other examples.

The file enables telemetry (lines 155-156) but omits it from the apis list (lines 4-14). The canonical example examples/run.yaml includes telemetry in its apis list when telemetry is enabled, and this file should be consistent with it.
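
A minimal sketch of the apis list with telemetry added, mirroring the list used by the other reworked manifests (the file's existing apis entries should stay as they are; only telemetry is new):

apis:
- agents
- batches
- datasetio
- eval
- files
- inference
- safety
- scoring
- telemetry
- tool_runtime
- vector_io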

examples/vllm-rhoai.yaml (1)

139-143: No action required. The configuration provider_model_id: null is intentional and valid for vLLM providers in Llama Stack 0.3.x. When set to null, the server automatically uses the model_id value (meta-llama/Llama-3.2-1B-Instruct) as the provider-side model identifier at startup.

docker-compose.yaml (1)

11-14: The ./tests/e2e/rag directory is already committed to the repository.

The volume mount is properly configured. The directory exists and is tracked in git with kv_store.db already present. The :Z flag for SELinux relabeling is already used consistently on multiple volume mounts in both docker-compose.yaml and docker-compose-library.yaml, and the configuration is actively running in CI (ubuntu-latest) without issues.

No changes needed.

tests/e2e/configs/run-vertexai.yaml (2)

13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, telemetry.enabled: true is sufficient in 0.3.x-style configs, and adding telemetry to apis aligns with that.

Also applies to: 156-157


30-72: Providers centralization + FAISS persistence wiring looks consistent.
The providers.vector_io FAISS entry uses a stable namespace and kv_default backend, which should keep RAG state isolated under the rag kv store.

tests/e2e/configs/run-azure.yaml (2)

13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, keeping telemetry as telemetry.enabled: true (plus listing telemetry in apis) is the right shape here.

Also applies to: 160-162


137-157: Register the embedding model referenced by vector_stores.
The configuration references nomic-ai/nomic-embed-text-v1.5 as the default embedding model in vector_stores.default_embedding_model, but this model is not registered in registered_resources.models. Only gpt-4o-mini is currently registered. For consistency and to ensure the embedding model is properly tracked, add it to the models list.

Proposed fix
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: azure
     model_type: llm
     provider_model_id: gpt-4o-mini
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
tests/e2e/configs/run-rhaiis.yaml (1)

161-162: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, telemetry.enabled: true is the intended way to enable telemetry here.

tests/e2e/configs/run-rhelai.yaml (1)

13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, this telemetry configuration shape is correct.

Also applies to: 161-162

tests/e2e/configs/run-ci.yaml (1)

13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, this is the right way to enable telemetry in these configs.

Also applies to: 168-169

examples/vllm-rhaiis.yaml (2)

1-2: Version/image_name normalization looks good.


137-158: Embedding model should be registered in registered_resources.models for consistency with reference configs.

The file uses vector_stores.default_embedding_model but does not register nomic-ai/nomic-embed-text-v1.5 in registered_resources.models. The reference example (examples/run.yaml) registers embedding models with model_type: embedding. While most configs in this PR omit registration, the pattern inconsistency should be resolved.

Proposed fix (register the embedding model)
 registered_resources:
   models:
   - model_id: ${env.RHAIIS_MODEL}
     provider_id: vllm
     model_type: llm
     provider_model_id: ${env.RHAIIS_MODEL}
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
   shields:
   - shield_id: llama-guard
     provider_id: llama-guard
     provider_shield_id: openai/gpt-4o-mini

Also note: sentence-transformers>=5.0.0 is currently only in the optional llslibdev dependency group, not in the main project dependencies.

examples/vllm-rhelai.yaml (1)

1-3: Version/image_name normalization looks good.

tests/e2e-prow/rhoai/configs/run.yaml (1)

139-143: No action needed. The provider_model_id: null configuration is valid for the vllm provider and is used consistently across multiple example and test configurations in the repository.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines 138 to +161
registered_resources:
models:
- model_id: all-mpnet-base-v2
provider_id: sentence-transformers
provider_model_id: all-mpnet-base-v2
model_type: embedding
metadata:
embedding_dimension: 768
- model_id: ${env.RHEL_AI_MODEL}
provider_id: vllm
model_type: llm
provider_model_id: ${env.RHEL_AI_MODEL}

shields:
- shield_id: llama-guard
provider_id: ${env.SAFETY_MODEL:+llama-guard}
provider_shield_id: ${env.SAFETY_MODEL:=}
- shield_id: code-scanner
provider_id: ${env.CODE_SCANNER_MODEL:+code-scanner}
provider_shield_id: ${env.CODE_SCANNER_MODEL:=}
provider_id: llama-guard
provider_shield_id: openai/gpt-4o-mini
datasets: []
scoring_fns: []
benchmarks: []
tool_groups:
- toolgroup_id: builtin::rag
provider_id: rag-runtime
server:
port: 8321
telemetry:
enabled: true
vector_stores:
default_provider_id: documentation_faiss
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: all-mpnet-base-v2
model_id: nomic-ai/nomic-embed-text-v1.5
safety:
default_shield_id: llama-guard
telemetry:

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's find the vllm-rhelai.yaml file and read its full content
find . -name "vllm-rhelai.yaml" -type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 103


🏁 Script executed:

# Look for config schema files
find . -name "*schema*" -o -name "*config*.json" -o -name "*config*.yaml" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 275


🏁 Script executed:

# List example manifest files to understand patterns
git ls-files examples/ | grep -E "\.(yaml|yml)$" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 891


🏁 Script executed:

# Read the full vllm-rhelai.yaml file
cat ./examples/vllm-rhelai.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 3851


🏁 Script executed:

# Check size of config.json and then read it
wc -l ./docs/config.json

Repository: lightspeed-core/lightspeed-stack

Length of output: 99


🏁 Script executed:

# Compare with other similar manifest files - check vllm-rhaiis.yaml
cat ./examples/vllm-rhaiis.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 3820


🏁 Script executed:

# Read the config.json schema to understand embedding model requirements
head -300 ./docs/config.json

Repository: lightspeed-core/lightspeed-stack

Length of output: 10586


🏁 Script executed:

# Check the byok-rag example manifest
cat ./examples/lightspeed-stack-byok-rag.yaml | grep -A 30 "vector_stores"

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Search for embedding model registrations in all manifests
grep -r "model_type.*embedding" ./examples/ || echo "No embedding models found in registered_resources"

Repository: lightspeed-core/lightspeed-stack

Length of output: 121


🏁 Script executed:

# Check the run.yaml to see how embedding models are registered
grep -B 5 -A 5 "model_type: embedding" ./examples/run.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 418


🏁 Script executed:

# Check if there are any other manifests with embedding model registrations
grep -B 5 -A 5 "model_type: embedding" ./examples/*.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 638


🏁 Script executed:

# Search config.json for embedding model schema requirements
grep -A 20 "EmbeddingModel\|embedding_model\|registered_resources" ./docs/config.json | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 800


🏁 Script executed:

# Search for RegisteredResources or Model schema in config.json
grep -B 5 -A 30 "RegisteredResources\|\"Model\"\|\"models\"" ./docs/config.json | head -150

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Look for the actual RegisteredResources schema definition
grep -A 50 "\"RegisteredResources\"" ./docs/config.json

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Let's check if there are any comments or documentation about embedding models in the codebase
rg "embedding.*model.*register" -A 3 -B 3 --type yaml || echo "No documentation found"

Repository: lightspeed-core/lightspeed-stack

Length of output: 98


🏁 Script executed:

# Check the entire structure of run.yaml's registered_resources section for comparison
sed -n '/registered_resources:/,/^[a-z]/p' ./examples/run.yaml | head -60

Repository: lightspeed-core/lightspeed-stack

Length of output: 713


Register the embedding model in registered_resources.models. The default embedding model nomic-ai/nomic-embed-text-v1.5 is referenced in vector_stores.default_embedding_model but not registered in registered_resources.models. Register it with model_type: embedding following the pattern in other manifests:

Example registration
- model_id: nomic-ai/nomic-embed-text-v1.5
  model_type: embedding
  provider_id: sentence-transformers
  provider_model_id: nomic-ai/nomic-embed-text-v1.5
  metadata:
    embedding_dimension: 768

Also applies to: examples/vllm-rhaiis.yaml (lines 154-158)

🤖 Prompt for AI Agents
In `@examples/vllm-rhelai.yaml` around lines 138 - 161, Add a
registered_resources.models entry for the embedding model referenced by
vector_stores.default_embedding_model (model_id nomic-ai/nomic-embed-text-v1.5)
so it’s explicitly declared; create a model object with model_id:
nomic-ai/nomic-embed-text-v1.5, model_type: embedding, provider_id:
sentence-transformers, provider_model_id: nomic-ai/nomic-embed-text-v1.5 and
include metadata.embedding_dimension: 768 (or the correct dimension), and apply
the same addition to the corresponding manifest that also references this
embedding.

Comment on lines 154 to +158
 vector_stores:
-  default_provider_id: documentation_faiss
+  default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: all-mpnet-base-v2
+    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "vllm-rhoai.yaml" -type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 208


🏁 Script executed:

cat -n examples/vllm-rhoai.yaml | head -170

Repository: lightspeed-core/lightspeed-stack

Length of output: 4947


🌐 Web query:

Llama Stack 0.3.x vector_stores default_embedding_model registration requirements

💡 Result:

Short answer — for Llama Stack 0.3.x the vector_store’s default_embedding_model must be a registered embedding model (an entry in the platform’s models registry) and you must supply its embedding dimension (either via the model metadata or explicitly when creating the store). Concretely:

  • The embedding model must be registered in Llama Stack’s models registry (an identifier returned by models.list()). [1][2]
  • The model entry must include embedding metadata (embedding_dimension) or you must pass embedding_dimension when creating the vector store. [1][2]
  • When creating/registering a vector_store you either reference the embedding model identifier (embedding_model) and embedding_dimension, or rely on a distribution-level default_embedding_model set in run.yaml / distribution config. [2][3]
  • Provider info: vector stores normally require a provider_id (or use the distribution default provider); for inline/dev you can use the built-in providers (faiss, milvus, etc.). [3][2]

Sources: Llama Stack client/CLI and “Working with Llama Stack” docs (v0.3.x). [1][2][3]

References
[1] Llama Stack client CLI docs (embedding-model default, create/unregister vector db).
[2] “Working with Llama Stack” — examples showing registering/selecting embedding models and creating vector stores.
[3] Starter distribution docs — default providers (faiss, etc.) and env/config defaults.


Register the default_embedding_model in registered_resources.models. In Llama Stack 0.3.x, embedding models referenced by vector_stores.default_embedding_model must be registered in the models registry. Currently, nomic-ai/nomic-embed-text-v1.5 is used but not defined under registered_resources.models, which will cause runtime failures.

Proposed fix
 registered_resources:
   models:
   - model_id: meta-llama/Llama-3.2-1B-Instruct
     provider_id: vllm
     model_type: llm
     provider_model_id: null
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    provider_id: sentence-transformers
+    model_type: embedding
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    embedding_dimension: 768
   shields:
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
vector_stores:
default_provider_id: documentation_faiss
default_provider_id: faiss
default_embedding_model:
provider_id: sentence-transformers
model_id: all-mpnet-base-v2
model_id: nomic-ai/nomic-embed-text-v1.5
registered_resources:
models:
- model_id: meta-llama/Llama-3.2-1B-Instruct
provider_id: vllm
model_type: llm
provider_model_id: null
- model_id: nomic-ai/nomic-embed-text-v1.5
provider_id: sentence-transformers
model_type: embedding
provider_model_id: nomic-ai/nomic-embed-text-v1.5
embedding_dimension: 768
shields:
🤖 Prompt for AI Agents
In `@examples/vllm-rhoai.yaml` around lines 154 - 158, The YAML references an
embedding model under vector_stores.default_embedding_model (provider_id:
sentence-transformers, model_id: nomic-ai/nomic-embed-text-v1.5) but that model
is not registered in registered_resources.models; add an entry for this model in
registered_resources.models with the same provider_id and model_id (and any
required fields like type/params per registry schema) so the model is
discoverable at runtime and avoids failures when the vector store initializes.

Comment on lines +34 to +37
    config:
      api_key: ${env.AZURE_API_KEY}
      api_base: https://ols-test.openai.azure.com/
      api_version: 2024-02-15-preview

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and check the actual content around lines 34-37
find . -name "run-azure.yaml" -type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 110


🏁 Script executed:

# Get the actual file content to verify the snippet
cat -n tests/e2e/configs/run-azure.yaml | sed -n '30,40p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 440


🏁 Script executed:

# Check if there are other e2e config files to see patterns
ls -la tests/e2e/configs/ 2>/dev/null || find . -path "*/e2e/configs*" -type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 568


🏁 Script executed:

# Search for ${env. patterns in config files to see if this syntax is supported
rg '\$\{env\.' tests/e2e/ 2>/dev/null || echo "No env var usage found"

Repository: lightspeed-core/lightspeed-stack

Length of output: 1356


🏁 Script executed:

# Check the entire run-azure.yaml to understand the structure
wc -l tests/e2e/configs/run-azure.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 112


🏁 Script executed:

# Check what the other config files do for api_base and api_version (or similar endpoint configs)
cat -n tests/e2e/configs/run-ci.yaml | sed -n '30,45p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 683


🏁 Script executed:

# Check run-rhelai and run-rhaiis for their endpoint patterns
cat -n tests/e2e/configs/run-rhelai.yaml | head -50

Repository: lightspeed-core/lightspeed-stack

Length of output: 1477


🏁 Script executed:

# Check if there are any scripts or documentation about running these e2e tests
find . -name "*.md" -o -name "*e2e*" -o -name "*test*" | grep -i readme

Repository: lightspeed-core/lightspeed-stack

Length of output: 1314


🏁 Script executed:

# Search for how AZURE_API_BASE or similar env vars might be used elsewhere
rg "AZURE_API" . --type yaml --type md

Repository: lightspeed-core/lightspeed-stack

Length of output: 401


🏁 Script executed:

# Check if the config parsing/loading code supports env var expansion
rg -A 5 -B 5 "api_base|api_version" . --type py | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 1700


🏁 Script executed:

# Check the examples/azure-run.yaml to see if it has the same hardcoded values
cat -n examples/azure-run.yaml | sed -n '1,50p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 1498


🏁 Script executed:

# Check tests/e2e/README.md for guidance on running these configs
cat tests/e2e/README.md

Repository: lightspeed-core/lightspeed-stack

Length of output: 375


Externalize hardcoded Azure endpoint and API version to environment variables for consistent portability across deployments.

The api_base and api_version are hardcoded, while api_key already uses ${env.AZURE_API_KEY}. This inconsistency breaks the established pattern seen across other e2e configs (run-rhelai.yaml, run-rhaiis.yaml, run-vertexai.yaml), which externalize all deployment-specific values. Hardcoding preview versions also prevents central bumping without file edits.

Proposed change
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
-      api_version: 2024-02-15-preview
+      api_base: ${env.AZURE_API_BASE}
+      api_version: ${env.AZURE_API_VERSION}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
config:
api_key: ${env.AZURE_API_KEY}
api_base: https://ols-test.openai.azure.com/
api_version: 2024-02-15-preview
- provider_id: azure
provider_type: remote::azure
config:
api_key: ${env.AZURE_API_KEY}
api_base: ${env.AZURE_API_BASE}
api_version: ${env.AZURE_API_VERSION}
🤖 Prompt for AI Agents
In `@tests/e2e/configs/run-azure.yaml` around lines 34 - 37, The config block
currently hardcodes api_base and api_version while api_key uses
${env.AZURE_API_KEY}; change api_base and api_version to environment variables
(e.g., ${env.AZURE_API_BASE} and ${env.AZURE_API_VERSION}) so they follow the
same externalized pattern as api_key and allow central version/endpoint bumps;
update any references or docs that expect those values if needed and ensure
tests/CI set those env vars.

Comment on lines +25 to +32
When I use "query" to ask question with authorization header
"""
{"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Always use the file_search tool to answer. Write only lowercase letters"}
"""
Then The status code of the response is 200
And The response should contain following fragments
| Fragments in LLM response |
| great work |

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n tests/e2e/features/faiss.feature | sed -n '20,35p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 842


🏁 Script executed:

# Check if the file exists and get its full content to verify indentation
if [ -f "tests/e2e/features/faiss.feature" ]; then
  head -35 tests/e2e/features/faiss.feature | cat -A
fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 1405


🌐 Web query:

behave Python Gherkin indentation handling step definition parsing

💡 Result:

Short summary — how behave handles Gherkin indentation and step parsing

  • General indentation: behave mostly ignores indentation in feature files; whitespace is for readability only and does not change step parsing. Keywords (Feature/Scenario/Given/When/Then/And/But) are recognized regardless of indentation (spaces or tabs). [1]

  • Steps and "And/But": "And"/"But" inherit the type of the preceding step; their indentation doesn’t change their semantics. [1]

  • Docstrings / PyStrings (multi-line step text): this is the one place where indentation is significant. behave expects docstrings to be delimited with triple-quotes (""") and will strip the common leading whitespace based on the first line of the docstring; all subsequent lines must have at least that amount of indentation. The docstring text becomes available to step code (context.text). [2][4]

  • Tables: data tables (pipe-delimited | a | b |) are parsed line-by-line and attached to the last step or Examples section; normal indentation is ignored but table syntax must be valid. [3]

  • Parser behavior: behave's parser is a line-oriented state machine implementing Gherkin (with small extensions). It recognizes states for steps, tables and MULTILINE_TEXT and uses indentation rules above when switching states. See the parser/state-machine overview. [3]

References

  • behave Gherkin docs (features, steps, docstrings). [1]
  • behave docs — Step data / Text (docstrings) explanation. [2]
  • Parser system / MULTILINE_TEXT and table handling (parser implementation overview). [3]

Sources
[1] behave — Feature Testing Setup / Gherkin (readthedocs).
[2] behave — Feature Testing Setup / Step Data (readthedocs).
[3] behave parser system / MULTILINE_TEXT and table handling (parser overview).


Fix step indentation inconsistency for consistent formatting.
Lines 29–30 have extra indentation compared to lines 25–28. While behave ignores step indentation during parsing, keep consistent indentation across all steps for readability.

🤖 Prompt for AI Agents
In `@tests/e2e/features/faiss.feature` around lines 25 - 32, Fix the inconsistent
indentation in the feature steps by aligning the "Then The status code of the
response is 200" and "And The response should contain following fragments" lines
with the preceding "When I use \"query\"..." step; remove the extra leading
spaces so all steps use the same indentation level (match lines containing 'When
I use "query"' and the multi-line JSON block) to keep formatting consistent and
readable.

- FAISS e2e test (for all inference providers)
- Disable shields in all configurations
- Dummy rag db

Update/remove old configuration examples
@are-ces are-ces force-pushed the faiss-support-final branch from 0ef52a1 to e7709d9 on January 14, 2026 12:13
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/vllm-rhaiis.yaml (1)

4-14: Missing telemetry in APIs list.

Unlike other configuration files in this PR (e.g., vllm-rhoai.yaml, vllm-rhelai.yaml), this file lacks telemetry in the apis list, even though telemetry.enabled: true is set at lines 160-161. Add telemetry to the apis list for consistency.

Suggested fix
 apis:
 - agents
 - batches
 - datasetio
 - eval
 - files
 - inference
 - safety
 - scoring
+- telemetry
 - tool_runtime
 - vector_io
🤖 Fix all issues with AI agents
In `@run.yaml`:
- Around line 156-160: The default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but it’s not registered; add an entry under
registered_resources.models to register that embedding model using the same
pattern as other models: create a model object with model_id set to
nomic-ai/nomic-embed-text-v1.5, provider_id set to sentence-transformers,
model_type set to embedding, and provider_model_id set to
nomic-ai/nomic-embed-text-v1.5 so the default_embedding_model resolves
correctly.

In `@tests/e2e/configs/run-rhaiis.yaml`:
- Around line 154-158: The default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but that model isn't listed in
registered_resources.models; add an entry under registered_resources.models for
model_id: nomic-ai/nomic-embed-text-v1.5 with provider_id: sentence-transformers
(or alternatively document that the model must be pre-registered via the Llama
Stack client) so the model referenced by default_embedding_model is available at
runtime.
♻️ Duplicate comments (21)
examples/azure-run.yaml (1)

37-37: Update Azure OpenAI API version to current release.

The api_version: 2024-02-15-preview is outdated. Consider updating to at least the stable 2024-06-01 release.

Suggested fix
-      api_version: 2024-02-15-preview
+      api_version: 2024-06-01
examples/vertexai-run.yaml (1)

136-147: No models registered despite having inference providers configured.

The models: [] means no models are explicitly registered, yet inference providers (google-vertex, openai, sentence-transformers) are configured. Consider adding model entries for the allowed models.

Suggested model registration
 registered_resources:
-  models: []
+  models:
+  - model_id: google/gemini-2.5-flash
+    provider_id: google-vertex
+    model_type: llm
+    provider_model_id: google/gemini-2.5-flash
   shields:
run.yaml (1)

60-63: Placeholder API key for braintrust scoring provider.

The openai_api_key: '********' is a redacted placeholder that will fail at runtime if the braintrust scoring provider is used. Consider using an environment variable.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
examples/vllm-rhoai.yaml (2)

144-147: Shield configuration incomplete for inline::llama-guard provider.

The inline::llama-guard safety provider (line 55-58) requires a separate registered Llama Guard model served by an inference provider. Setting provider_shield_id: openai/gpt-4o-mini is incorrect—this field should be null for inline llama-guard, and shield_id should reference a registered guard model.

To fix:

  1. Register a Llama Guard model in registered_resources.models
  2. Update the shield to use provider_shield_id: null and reference the registered guard model

154-158: Register nomic-ai/nomic-embed-text-v1.5 in registered_resources.models.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but this model is not registered in registered_resources.models. In Llama Stack 0.3.x, embedding models must be registered with model_type: embedding.

Proposed fix
 registered_resources:
   models:
   - model_id: meta-llama/Llama-3.2-1B-Instruct
     provider_id: vllm
     model_type: llm
     provider_model_id: null
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
examples/vllm-rhelai.yaml (2)

144-147: Shield configuration mismatch with inline::llama-guard provider.

Same issue as other configuration files: provider_shield_id: openai/gpt-4o-mini is incorrect for the inline::llama-guard provider. This should be provider_shield_id: null, and a Llama Guard model should be registered and referenced.


154-158: Register the embedding model in registered_resources.models.

The default_embedding_model (nomic-ai/nomic-embed-text-v1.5) is not registered in registered_resources.models. Add an entry with model_type: embedding to ensure the vector store can resolve the model at runtime.

tests/e2e-prow/rhoai/configs/run.yaml (1)

144-147: Shield configuration uses incorrect provider_shield_id for inline::llama-guard.

Consistent with other configuration files, the shield's provider_shield_id: openai/gpt-4o-mini is incompatible with the inline::llama-guard provider. This should be null with a registered guard model.

examples/run.yaml (4)

65-68: Placeholder API key for braintrust scoring provider.

The openai_api_key: '********' placeholder will fail at runtime. Use an environment variable reference instead.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}

152-157: Embedding model mismatch: registered model differs from default.

registered_resources.models registers sentence-transformers/all-mpnet-base-v2, but vector_stores.default_embedding_model references nomic-ai/nomic-embed-text-v1.5. These must match for RAG to function correctly.

Option A (recommended): Update registered model to match default
 registered_resources:
   models:
   - model_id: gpt-4o-mini
     provider_id: openai
     model_type: llm
     provider_model_id: gpt-4o-mini
-  - model_id: sentence-transformers/all-mpnet-base-v2
+  - model_id: nomic-ai/nomic-embed-text-v1.5
     model_type: embedding
     provider_id: sentence-transformers
-    provider_model_id: sentence-transformers/all-mpnet-base-v2
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
     metadata:
       embedding_dimension: 768

Also applies to: 168-172


119-120: Duplicate telemetry block.

The telemetry section appears twice: at lines 119-120 and again at lines 175-176. Remove the duplicate to avoid configuration ambiguity.

Remove the first occurrence (lines 119-120)
 scoring_fns: []
-telemetry:
-  enabled: true
 server:
   port: 8321

Also applies to: 175-176


54-61: Critical YAML syntax error: scoring: incorrectly on same line.

Line 58 has provider_type: inline::llama-guard scoring: where scoring: is placed on the same line. This is invalid YAML syntax and will cause a parse error, preventing the configuration from loading.

🐛 Fix the YAML syntax error
   safety:
   - config:
       excluded_categories: []
     provider_id: llama-guard
-    provider_type: inline::llama-guard  scoring:
-  - provider_id: basic
+    provider_type: inline::llama-guard
+  scoring:
+  - provider_id: basic
     provider_type: inline::basic
     config: {}
examples/vllm-rhaiis.yaml (2)

143-146: Shield configuration inconsistent with inline::llama-guard provider.

Same issue as other files: provider_shield_id: openai/gpt-4o-mini is incorrect for the inline::llama-guard safety provider. Should be null with a properly registered guard model.


153-157: Register the embedding model for vector stores.

nomic-ai/nomic-embed-text-v1.5 is referenced as the default embedding model but not registered in registered_resources.models. This will cause runtime failures.

tests/e2e/features/faiss.feature (3)

14-19: Hardcoded vector_store ID tightly couples test to specific test data.

The UUID vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee creates a dependency on specific test database state. Consider documenting this dependency or making the expected ID configurable/discoverable.


22-32: Fix inconsistent step indentation for readability.

Lines 29-30 have extra indentation compared to lines 25-28. While behave ignores step indentation during parsing, maintain consistent formatting for better readability.

Suggested alignment
     When I use "query" to ask question with authorization header
     """
     {"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Always use the file_search tool to answer. Write only lowercase letters"}
     """
-     Then The status code of the response is 200
-      And The response should contain following fragments
+    Then The status code of the response is 200
+     And The response should contain following fragments
           | Fragments in LLM response |
           | great work                |

10-10: Truncated JWT token - missing signature segment.

The Bearer token appears incomplete: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva ends mid-payload and is missing the third (signature) segment. A valid JWT requires three Base64url-encoded parts separated by dots.

Consider using a complete test JWT or extracting this to a test fixture/environment variable.

tests/e2e/configs/run-rhelai.yaml (1)

154-158: Verify embedding model registration requirements.

Same as run-rhaiis.yaml: the default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but it's not registered in registered_resources.models. Verify whether the sentence-transformers provider auto-registers models or if explicit registration is required.

tests/e2e/configs/run-azure.yaml (1)

34-37: Externalize hardcoded Azure endpoint and API version.

The api_key uses ${env.AZURE_API_KEY}, but api_base and api_version are hardcoded. This inconsistency breaks the pattern established in other e2e configs (e.g., run-rhelai.yaml externalizes RHEL_AI_URL and RHEL_AI_PORT). Hardcoding the preview version also prevents central version bumps without file edits.

Proposed change
   - provider_id: azure
     provider_type: remote::azure
     config: 
       api_key: ${env.AZURE_API_KEY}
-      api_base: https://ols-test.openai.azure.com/
-      api_version: 2024-02-15-preview
+      api_base: ${env.AZURE_API_BASE}
+      api_version: ${env.AZURE_API_VERSION:=2024-02-15-preview}
tests/e2e/configs/run-ci.yaml (2)

60-63: Braintrust API key should use environment variable.

The openai_api_key: '********' is still a placeholder value. If this provider is actually used in CI tests, authentication will fail. Per project guidelines, sensitive data should use environment variables.

Suggested fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}

161-165: Embedding model mismatch: default model not registered.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 (line 165), but only sentence-transformers/all-mpnet-base-v2 is registered in registered_resources.models (lines 145-150). This mismatch will cause runtime failures when the RAG system attempts to use the unregistered embedding model.

Option A: Register the nomic embedding model

Add to registered_resources.models:

  - model_id: nomic-ai/nomic-embed-text-v1.5
    model_type: embedding
    provider_id: sentence-transformers
    provider_model_id: nomic-ai/nomic-embed-text-v1.5
    metadata:
      embedding_dimension: 768
Option B: Use the registered model as default
 vector_stores:
   default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: nomic-ai/nomic-embed-text-v1.5
+    model_id: sentence-transformers/all-mpnet-base-v2
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9ae2114 and e7709d9.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/azure-run.yaml
  • examples/gemini-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/run.yaml
  • examples/vertexai-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • pyproject.toml
  • run.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/features/faiss.feature
  • tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
  • examples/vllm-qwen3-run.yaml
  • examples/openai-pgvector-run.yaml
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • docker-compose-library.yaml
  • tests/e2e/test_list.txt
🧰 Additional context used
📓 Path-based instructions (2)
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Check pyproject.toml for supported Python versions before development
Always check pyproject.toml for existing dependencies before adding new ones
Always verify current library versions in pyproject.toml rather than assuming versions
Use pylint with source-roots = "src" configuration

Files:

  • pyproject.toml
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
🧠 Learnings (9)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones

Applied to files:

  • pyproject.toml
  • examples/vllm-rhaiis.yaml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.

Applied to files:

  • pyproject.toml
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-rhaiis.yaml
  • tests/e2e/configs/run-rhelai.yaml
  • tests/e2e/configs/run-vertexai.yaml
  • tests/e2e/configs/run-ci.yaml
  • tests/e2e/configs/run-azure.yaml
  • tests/e2e-prow/rhoai/configs/run.yaml
  • examples/vllm-rhaiis.yaml
  • examples/run.yaml
  • examples/vllm-rhoai.yaml
  • examples/vllm-rhelai.yaml
  • examples/vertexai-run.yaml
  • examples/azure-run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`

Applied to files:

  • tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data

Applied to files:

  • tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.

Applied to files:

  • run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/features/faiss.feature
📚 Learning: 2025-08-25T09:05:18.350Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 445
File: tests/e2e/features/health.feature:50-53
Timestamp: 2025-08-25T09:05:18.350Z
Learning: The step definition for ignoring specific fields in response body comparisons is implemented in tests/e2e/features/steps/common_http.py at line 175 as `then('The body of the response, ignoring the "{field}" field, is the following')`.

Applied to files:

  • tests/e2e/features/faiss.feature
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory

Applied to files:

  • tests/e2e/features/faiss.feature
🪛 YAMLlint (1.37.1)
examples/run.yaml

[error] 58-58: syntax error: mapping values are not allowed here

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (37)
pyproject.toml (1)

63-64: Dependency addition follows project conventions. The einops>=0.8.1 constraint is properly documented and aligns with the project's established pattern of pinning transitive dependencies at the top level for better control. The version constraint is current and appropriate.

docs/providers.md (2)

103-103: Documentation now reflects llama-guard as unsupported.

Per the past review, this was addressed. Note that configuration files still reference llama-guard for testing/example purposes even though it's marked unsupported in the official provider status.


160-160: rag-runtime now correctly marked as supported.

This aligns with the PR objective to fully support the FAISS VectorIO provider and RAG functionality.

docker-compose.yaml (1)

14-14: LGTM - RAG volume mount for e2e tests.

The volume mapping correctly provides access to the RAG test database, aligning with the storage paths configured in run.yaml (~/.llama/storage/rag/).

examples/azure-run.yaml (2)

66-72: FAISS vector_io provider configuration looks correct.

The persistence configuration with namespace vector_io::faiss and kv_default backend aligns with the storage backend defined at lines 116-118.


160-161: Telemetry configuration is correct for Llama Stack 0.3.x.

Based on learnings, telemetry provider configuration is not supported under the providers section in version 0.3.x. The telemetry.enabled: true pattern is correct.

examples/vertexai-run.yaml (2)

65-71: FAISS vector_io configuration is correctly structured.

The persistence configuration aligns with the storage backend definitions and follows the same pattern as other config files.


148-156: Vector stores, safety, and telemetry configuration looks good.

The configuration correctly references the FAISS provider and sentence-transformers for embeddings, with telemetry enabled per 0.3.x patterns.

run.yaml (4)

31-39: Inference provider configuration looks correct.

The OpenAI remote provider with environment-driven API key and allowed models, plus the inline sentence-transformers provider for embeddings, is properly structured.


68-74: FAISS vector_io provider configuration is correct.

The inline FAISS provider with persistence namespace vector_io::faiss using kv_default backend properly integrates with the storage configuration.


118-123: Storage backend paths now use separate environment variables.

The KV_STORE_PATH and SQL_STORE_PATH env vars avoid the collision issue from using a single SQLITE_STORE_DIR for all paths.
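
A sketch of what the split looks like, using Llama Stack's ${env.VAR:=default} substitution; the backend type names and default paths below are illustrative, not taken from the PR:

storage:
  backends:
    kv_default:
      type: kv_sqlite                                   # illustrative type name
      db_path: ${env.KV_STORE_PATH:=/tmp/kv_store.db}   # key-value data
    sql_default:
      type: sql_sqlite                                  # illustrative type name
      db_path: ${env.SQL_STORE_PATH:=/tmp/sql_store.db} # SQL-backed stores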


161-164: Safety and telemetry configuration is correct.

The default_shield_id: llama-guard aligns with the registered shield, and telemetry.enabled: true follows the 0.3.x pattern per learnings.
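
The two top-level blocks referenced here, sketched with only the fields named in this comment (a minimal illustration, not the full schema):

safety:
  default_shield_id: llama-guard   # must match a shield in registered_resources.shields
telemetry:
  enabled: true                    # 0.3.x style: no telemetry provider block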

examples/vllm-rhoai.yaml (1)

1-28: LGTM on top-level structure and stores.

The version, image_name, API list, and individual store configurations (conversations_store, inference_store, metadata_store) are well-structured with hardcoded paths that avoid the previously flagged environment variable conflict issue.

examples/vllm-rhelai.yaml (1)

54-58: Safety provider now correctly configured.

The inline::llama-guard safety provider with excluded_categories: [] is properly structured.
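
A sketch of that provider entry; only the provider type and the empty excluded_categories list come from the comment, and the provider_id is assumed to match the shield name:

providers:
  safety:
  - provider_id: llama-guard
    provider_type: inline::llama-guard
    config:
      excluded_categories: []   # no guard categories are filtered out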

tests/e2e-prow/rhoai/configs/run.yaml (1)

138-158: No action needed—embedding model configuration is correct.

The nomic-ai/nomic-embed-text-v1.5 model is properly configured in vector_stores.default_embedding_model (lines 156–158), which is the expected pattern. Analysis of other test config files (run-azure.yaml, run-rhaiis.yaml, run-rhelai.yaml, run-vertexai.yaml) shows they all follow the same pattern: embedding models are defined in vector_stores, not in registered_resources.models. Only run-ci.yaml registers an embedding model in registered_resources, and it registers a different model than the one in vector_stores, suggesting that configuration is an exception rather than a requirement. The current state in run.yaml aligns with the standard configuration pattern used across the test suite.

tests/e2e/configs/run-rhaiis.yaml (3)

1-2: LGTM!

Version and image_name configuration are properly set for the RHAIIS environment.


119-122: LGTM!

Storage paths now use separate environment variables (KV_STORE_PATH and SQL_STORE_PATH) with appropriate defaults, which resolves the previous concern about database path conflicts.


161-162: LGTM!

Telemetry is correctly configured at the top level with enabled: true, which aligns with Llama Stack 0.3.x conventions. Based on learnings, telemetry provider configuration is not supported under the providers section.

tests/e2e/features/faiss.feature (1)

1-6: LGTM!

The Background setup correctly initializes the service and sets the REST API prefix to /v1. This follows the behave BDD framework conventions as per coding guidelines.

tests/e2e/configs/run-rhelai.yaml (4)

1-2: LGTM!

Version and image_name properly configured for the RHEL AI environment.


54-58: LGTM!

The safety provider is now properly configured with the llama-guard provider, resolving the previous inconsistency where safety was disabled but a shield was still registered.


119-122: LGTM!

Storage paths use separate environment variables (KV_STORE_PATH and SQL_STORE_PATH) with appropriate defaults, consistent with the pattern established across other configuration files.


161-162: LGTM!

Telemetry configuration is correct for Llama Stack 0.3.x.

tests/e2e/configs/run-vertexai.yaml (4)

31-44: LGTM!

The inference providers are properly configured with google-vertex as the primary provider, along with openai and sentence-transformers inline providers. The Vertex AI configuration correctly uses environment variables for project and location.
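
A rough sketch of that provider list; the remote::vertexai type string and the project/location key names are assumptions inferred from the comment, the environment variable names are placeholders, and the openai provider is omitted for brevity:

providers:
  inference:
  - provider_id: google-vertex
    provider_type: remote::vertexai        # assumed provider type
    config:
      project: ${env.VERTEX_AI_PROJECT}    # placeholder env var
      location: ${env.VERTEX_AI_LOCATION}  # placeholder env var
  - provider_id: sentence-transformers
    provider_type: inline::sentence-transformers
    config: {}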


66-72: LGTM!

The FAISS vector_io provider is correctly configured with persistence using the kv_default backend, consistent with the other configuration files in this PR.


118-121: LGTM!

Storage backend paths use environment variables with appropriate defaults, consistent with the established pattern.


149-157: LGTM!

Vector stores, safety, and telemetry configurations are properly set. The structure aligns with the centralized configuration pattern used across other e2e config files.

tests/e2e/configs/run-azure.yaml (4)

1-2: LGTM!

Image name correctly set for Azure configuration.


66-72: LGTM!

FAISS vector_io provider configuration is consistent with other e2e config files.


118-121: LGTM!

Storage backend paths correctly use environment variables.


153-161: LGTM!

Vector stores, safety, and telemetry configurations are properly structured and consistent with the centralized configuration pattern.

tests/e2e/configs/run-ci.yaml (6)

1-16: LGTM!

The image_name: starter and addition of telemetry API are consistent with the PR's centralized configuration approach and align with the telemetry enablement at line 169.


17-28: LGTM!

The store paths are now distinct hardcoded values, resolving the previous environment variable collision issue.


64-74: LGTM!

The RAG tool runtime and FAISS vector_io providers are properly configured with appropriate persistence settings using the kv_default backend.
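
For context, a minimal sketch of the rag-runtime tool provider referenced here; the FAISS vector_io entry follows the same persistence pattern shown earlier, and the empty config is an assumption:

providers:
  tool_runtime:
  - provider_id: rag-runtime
    provider_type: inline::rag-runtime
    config: {}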


116-138: LGTM!

The storage backends now use distinct environment variables (KV_STORE_PATH and SQL_STORE_PATH) with separate default paths, correctly resolving the previous collision issue.


139-160: LGTM!

The YAML structure is now correct with proper nesting under models:, shields:, and tool_groups: keys.
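
A sketch of the nesting being verified; llama-guard matches the shield the safety block references, builtin::rag is the conventional Llama Stack RAG tool group id, and the remaining identifiers and field names are placeholders to be checked against the schema:

registered_resources:
  models:
  - model_id: gpt-4o-mini        # placeholder model
    provider_id: openai
    model_type: llm
  shields:
  - shield_id: llama-guard
    provider_id: llama-guard
  tool_groups:
  - toolgroup_id: builtin::rag
    provider_id: rag-runtime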


166-169: LGTM!

The safety configuration correctly references the registered llama-guard shield, and the telemetry configuration follows Llama Stack 0.3.x conventions (enabled via flag without requiring a provider block). Based on learnings, this is the correct approach.


Comment on lines +156 to +160
vector_stores:
  default_provider_id: faiss
  default_embedding_model: # Define the default embedding model for RAG
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for sentence-transformers references in YAML
rg -n "sentence-transformers" --type yaml -A 5 -B 2 | head -60

echo "---"

# Check for any model registration patterns for embedding models
rg -n "model_type.*embedding" --type yaml -C 2

echo "---"

# Also search for embedding_model patterns in config
rg -n "embedding_model" --type yaml -C 3 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 8331


🏁 Script executed:

#!/bin/bash
# Search for registered_resources.models to understand registration structure
rg -n "registered_resources|models" run.yaml -C 3 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 930


🏁 Script executed:

#!/bin/bash
# Look for provider configuration and how it handles model loading
rg -n "provider_id.*sentence-transformers" -A 10 -B 5

Repository: lightspeed-core/lightspeed-stack

Length of output: 32128


🏁 Script executed:

#!/bin/bash
# Check if there's any code or documentation about provider initialization
fd -e "py" -e "md" -e "json" | xargs grep -l "sentence.transformers\|SentenceTransformers" 2>/dev/null | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 301


🏁 Script executed:

#!/bin/bash
# Search for documentation on inline::sentence-transformers behavior
rg -n "inline::sentence-transformers" --type md -A 5 -B 2 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 2024


🏁 Script executed:

#!/bin/bash
# Check if there's any code that shows how models are resolved/loaded
fd -e "py" | xargs grep -l "sentence.transformers\|embedding.*model" 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 352


🏁 Script executed:

#!/bin/bash
# Look at the full registered_resources.models section in run.yaml to see what's actually there
sed -n '139,160p' run.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 688


Register the embedding model in registered_resources.models.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but does not have an entry in registered_resources.models. Add the embedding model registration matching the pattern used in run-ci.yaml and examples/run.yaml:

Suggested registration
registered_resources:
  models:
  - model_id: gpt-4o-mini
    provider_id: openai
    model_type: llm
    provider_model_id: gpt-4o-mini
  - model_id: nomic-ai/nomic-embed-text-v1.5
    provider_id: sentence-transformers
    model_type: embedding
    provider_model_id: nomic-ai/nomic-embed-text-v1.5
🤖 Prompt for AI Agents
In `@run.yaml` around lines 156-160, the default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but it’s not registered; add an entry under
registered_resources.models to register that embedding model using the same
pattern as other models: create a model object with model_id set to
nomic-ai/nomic-embed-text-v1.5, provider_id set to sentence-transformers,
model_type set to embedding, and provider_model_id set to
nomic-ai/nomic-embed-text-v1.5 so the default_embedding_model resolves
correctly.

Comment on lines 154 to +158
 vector_stores:
-  default_provider_id: documentation_faiss
+  default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: all-mpnet-base-v2
+    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if other config files register embedding models explicitly
rg -n "nomic-ai/nomic-embed-text" tests/e2e/configs/ --type yaml -B 5 -A 5

Repository: lightspeed-core/lightspeed-stack

Length of output: 3262


🏁 Script executed:

#!/bin/bash
# Search for sentence-transformers provider implementation to understand registration
find . -type f -name "*.py" | xargs rg -l "sentence.transformers|sentence-transformers" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 219


🌐 Web query:

Llama Stack sentence-transformers provider auto-register embedding models

💡 Result:

Short answer: Llama Stack includes an inline sentence-transformers provider for local embedding models, but it does not magically auto-register arbitrary HF/Sentence‑Transformers models — the distribution ships/assumes a couple default embedding models (granite-embedding-30m, granite-embedding-125m) and you must register additional models with the Llama Stack client (or enable the provider) before use. See docs and example register command below.

Key points and commands

  • Provider: inline::sentence-transformers is shipped as a provider in the starter distribution (providers are disabled by default). [1][2]
  • Defaults: the starter docs note granite-embedding-30m and granite-embedding-125m as the default sentence‑transformers embeddings. [3]
  • Registering an extra model (example): run the server, then register via the client:
    llama-stack-client models register all-MiniLM-L6-v2 \
      --provider-id sentence-transformers \
      --provider-model-id all-minilm:latest \
      --metadata '{"embedding_dimension":384}' \
      --model-type embedding
    (This example and workflow appear in the Llama Stack / Red Hat docs.) [3][4]

References

  1. Llama Stack providers list — inline::sentence-transformers. [LLama‑Stack docs]. [1]
  2. Llama Stack inline::sentence-transformers provider page. [2]
  3. Red Hat / Llama Stack “Working with Llama Stack” — default embedding models and example register command. [3]
  4. opendatahub / Llama Stack client usage showing registered models. [6]

🏁 Script executed:

#!/bin/bash
# Search for registered_resources or explicit model registration
rg -n "registered_resources|register.*nomic-embed-text" . --type yaml --type py -B 2 -A 2

Repository: lightspeed-core/lightspeed-stack

Length of output: 3770


🏁 Script executed:

#!/bin/bash
# Check for model registration in Python code
rg -n "nomic-embed-text|nomic-ai" . --type py -B 3 -A 3

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Look for setup or initialization files that might register models
fd -e py -e yaml | xargs rg -l "models.*register|register.*models" | grep -E "(setup|init|fixture|conftest)" | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Search for sentence-transformers provider implementation
find . -type f -name "*.py" -path "*/sentence*transformers*" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Search for vector store implementation and default_embedding_model handling
rg -n "default_embedding_model|vector_stores" . --type py -B 2 -A 5 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 9671


🏁 Script executed:

#!/bin/bash
# Check for test conftest files that might set up models
fd -name "conftest.py" | xargs cat | grep -A 10 -B 10 "nomic\|embedding\|register" 2>/dev/null | head -50

Repository: lightspeed-core/lightspeed-stack

Length of output: 306


Add embedding model registration for nomic-ai/nomic-embed-text-v1.5 in registered_resources.

The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 with provider_id: sentence-transformers, but this model is not registered in registered_resources.models. Unlike default models (granite-embedding-30m, granite-embedding-125m), the sentence-transformers provider does not auto-register arbitrary Hugging Face models—they must be explicitly registered via the Llama Stack client or in the configuration's registered_resources section to be available at runtime during RAG operations.

Either register the model in registered_resources.models or document the required pre-registration step:

registered_resources:
  models:
    - model_id: nomic-ai/nomic-embed-text-v1.5
      provider_id: sentence-transformers
🤖 Prompt for AI Agents
In `@tests/e2e/configs/run-rhaiis.yaml` around lines 154-158, the
default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but that model
isn't listed in registered_resources.models; add an entry under
registered_resources.models for model_id: nomic-ai/nomic-embed-text-v1.5 with
provider_id: sentence-transformers (or alternatively document that the model
must be pre-registered via the Llama Stack client) so the model referenced by
default_embedding_model is available at runtime.
