LCORE-603: Lightspeed core needs to fully support Faiss VectorIO provider #935
base: main
Conversation
Walkthrough
Centralized and reorganized many YAML manifests into a storage-backed, provider-centric layout; added RAG volume mounts to docker-compose files; removed several obsolete example manifests; added FAISS e2e tests and updated dependency and provider status in docs.
Changes
Sequence Diagram(s): (omitted)
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 7
🧹 Nitpick comments (6)
examples/vllm-rhaiis-run.yaml (1)
33-37: Hardcoded port 8000 in URL. The URL hardcodes port 8000: `http://${env.RHAIIS_URL}:8000/v1/`, unlike the RHEL AI config which uses `${env.RHEL_AI_PORT}`. Consider using an environment variable for the port to maintain deployment flexibility: `url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/`
tests/e2e/features/faiss.feature (1)
8-27: Consider adding a note about the port difference. This scenario correctly uses `@skip-in-library-mode` since it directly queries the llama-stack service on port 8321. The second scenario (lines 29-39) omits the hostname/port override, so it will use defaults to test through lightspeed-stack. This asymmetry is intentional but worth a brief inline comment for maintainability.
tests/e2e/configs/run-azure.yaml (1)
34-36: Consider parameterizing the Azure API base URL. The `api_base` is hardcoded to a specific test endpoint. For flexibility across environments, consider using an environment variable with a default: `api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}`
examples/azure-run.yaml (2)
34-36: Consider using an environment variable for the API base URL. The `api_base` is hardcoded while `api_key` uses an environment variable. For consistency and deployment flexibility across environments, consider parameterizing the base URL:
  config:
    api_key: ${env.AZURE_API_KEY}
-   api_base: https://ols-test.openai.azure.com/
+   api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
    api_version: 2024-02-15-preview
111-116: Inconsistent storage path structure between backends.
`kv_default` uses `~/.llama/storage/rag/kv_store.db` (includes a `/rag/` subfolder) while `sql_default` uses `~/.llama/storage/sql_store.db` (no subfolder). This asymmetry may cause confusion. Consider aligning the path structure or documenting the rationale.
run.yaml (1)
119-125: Same storage path inconsistency as azure-run.yaml. The `kv_default` backend uses `~/.llama/storage/rag/kv_store.db` while `sql_default` uses `~/.llama/storage/sql_store.db`. The comment on line 120 explains that registry and RAG are unified in `kv_default`, which provides context. However, consider whether the `/rag/` subdirectory name is still appropriate given it stores more than just RAG data.
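A minimal sketch of what an aligned layout could look like, assuming the kv_sqlite/sql_sqlite backend types used in this PR and the per-backend environment variables suggested later in this review; the exact directory naming is only an illustration:
  storage:
    backends:
      kv_default:
        type: kv_sqlite
        db_path: ${env.SQLITE_KV_STORE_DIR:=~/.llama/storage/rag/kv_store.db}    # registry + RAG chunks
      sql_default:
        type: sql_sqlite
        db_path: ${env.SQLITE_SQL_STORE_DIR:=~/.llama/storage/rag/sql_store.db}  # same subfolder, so both backends live together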
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
`tests/e2e/rag/kv_store.db` is excluded by `!**/*.db`
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (25)
- docker-compose-library.yaml (1 hunks)
- docker-compose.yaml (1 hunks)
- docs/providers.md (2 hunks)
- examples/azure-run.yaml (1 hunks)
- examples/gemini-run.yaml (0 hunks)
- examples/openai-faiss-run.yaml (0 hunks)
- examples/openai-pgvector-run.yaml (0 hunks)
- examples/run.yaml (1 hunks)
- examples/vertexai-run.yaml (4 hunks)
- examples/vllm-granite-run.yaml (0 hunks)
- examples/vllm-llama-faiss-run.yaml (0 hunks)
- examples/vllm-qwen3-run.yaml (0 hunks)
- examples/vllm-rhaiis-run.yaml (1 hunks)
- examples/vllm-rhelai-run.yaml (1 hunks)
- examples/vllm-rhoai-run.yaml (1 hunks)
- pyproject.toml (1 hunks)
- run.yaml (3 hunks)
- tests/e2e-prow/rhoai/configs/run.yaml (4 hunks)
- tests/e2e/configs/run-azure.yaml (4 hunks)
- tests/e2e/configs/run-ci.yaml (3 hunks)
- tests/e2e/configs/run-rhaiis.yaml (4 hunks)
- tests/e2e/configs/run-rhelai.yaml (4 hunks)
- tests/e2e/configs/run-vertexai.yaml (4 hunks)
- tests/e2e/features/faiss.feature (1 hunks)
- tests/e2e/test_list.txt (1 hunks)
💤 Files with no reviewable changes (6)
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
🧰 Additional context used
📓 Path-based instructions (3)
tests/e2e/test_list.txt
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests
Files:
tests/e2e/test_list.txt
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/faiss.feature
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Configure pylint with `source-roots = "src"`
Exclude `src/auth/k8s.py` from pyright type checking
Files:
pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.
Applied to files:
tests/e2e/test_list.txt
🪛 YAMLlint (1.37.1)
tests/e2e/configs/run-rhaiis.yaml
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
tests/e2e/configs/run-rhelai.yaml
[error] 94-94: duplication of key "config" in mapping
(key-duplicates)
[error] 98-98: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 99-99: duplication of key "provider_type" in mapping
(key-duplicates)
[error] 100-100: duplication of key "config" in mapping
(key-duplicates)
[error] 104-104: duplication of key "provider_id" in mapping
(key-duplicates)
[error] 105-105: duplication of key "provider_type" in mapping
(key-duplicates)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (30)
pyproject.toml (1)
55-55: Consider using a flexible version constraint for urllib3 or explain the exact version pin. urllib3 2.6.1 is the latest non-vulnerable version, but 2.6.2 is available as a bug fix. The exact version pin (`==2.6.1`) is unnecessarily restrictive. If there's no specific compatibility reason for avoiding 2.6.2, consider using `>=2.6.1,<2.7` instead. If an exact pin is necessary, document the reason in a comment above the dependency.
The einops dependency is appropriate: einops 0.8.1 is the latest version with no known vulnerabilities.
⛔ Skipped due to learnings
Learnt from: CR Repo: lightspeed-core/lightspeed-stack PR: 0 File: CLAUDE.md:0-0 Timestamp: 2025-11-24T16:58:04.410Z Learning: Always check `pyproject.toml` for existing dependencies and versions before adding new ones
docs/providers.md (2)
103-103: Documentation accurately reflects the disabled llama-guard shield. The change from ✅ to ❌ for llama-guard aligns with the configuration changes across the example files where shields are disabled with the comment about "infinite loop issue with LLM calls".
160-160: rag-runtime is now correctly marked as supported. The rag-runtime provider is properly documented as supported (✅), which aligns with its configuration in all the new example YAML files.
examples/vllm-rhelai-run.yaml (1)
1-151: New RHEL AI configuration file is well-structured. The configuration follows the consolidated provider pattern with (a minimal sketch follows this list):
- Environment-driven paths and credentials
- FAISS vector store with sentence-transformers embeddings
- Properly disabled shields with documented reasoning
- Consistent storage backend configuration
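A minimal sketch of the FAISS-plus-embeddings pairing from that list, using the persistence fields quoted elsewhere in this review (`namespace: vector_io::faiss`, `backend: kv_default`); the `inline::faiss` and `inline::sentence-transformers` provider type names are assumptions:
  providers:
    inference:
      - provider_id: sentence-transformers
        provider_type: inline::sentence-transformers   # local embedding models, no remote calls
        config: {}
    vector_io:
      - provider_id: faiss
        provider_type: inline::faiss                    # FAISS vector store
        config:
          persistence:
            namespace: vector_io::faiss                 # keyspace inside the kv backend
            backend: kv_default                         # kv_sqlite backend defined under storage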
examples/vertexai-run.yaml (2)
30-67: Provider configuration is well-organized. The inference providers are properly configured with:
- Google Vertex AI remote provider with project/location from environment variables
- Sentence-transformers for embeddings
- FAISS vector store with persistence
132-141: Empty models array may need attention. The `registered_resources.models` array is empty while the Vertex AI provider specifies `allowed_models: ["google/gemini-2.5-flash"]`. Verify if a model registration is needed here, or if the `allowed_models` configuration in the provider is sufficient for Vertex AI.
examples/vllm-rhoai-run.yaml (2)
134-138: Verify the hardcoded model configuration and null provider_model_id. Unlike other example configs that use environment variables (e.g., `${env.RHEL_AI_MODEL}`), this file hardcodes `model_id: meta-llama/Llama-3.2-1B-Instruct` and sets `provider_model_id: null`. Consider using environment variables for consistency:
  - model_id: ${env.RHOAI_MODEL}
    provider_id: vllm
    model_type: llm
    provider_model_id: ${env.RHOAI_MODEL}
37-37: Different max_tokens value than other vLLM configs. This config uses `max_tokens: 1024` while `vllm-rhelai-run.yaml` and `vllm-rhaiis-run.yaml` use `max_tokens: 2048`. Verify if this is intentional for RHOAI deployments.
examples/vllm-rhaiis-run.yaml (1)
1-151: RHAIIS configuration follows the established pattern. The configuration is well-structured and consistent with other example configurations in this PR.
tests/e2e/configs/run-rhelai.yaml (1)
1-27: Configuration header and store definitions look correct. The version, image_name, APIs, and store configurations follow the established pattern with proper environment variable substitution.
tests/e2e-prow/rhoai/configs/run.yaml (1)
1-150: Configuration structure looks well-organized. The consolidated provider configuration with environment-variable-backed paths is consistent. The explicit warnings about disabled shields (lines 49, 138) document the known issue clearly.
tests/e2e/configs/run-vertexai.yaml (2)
30-67: Provider configuration looks correct. The Vertex AI inference provider is properly configured with project/location environment variables, and the FAISS vector_io provider follows the correct structure with persistence settings.
132-146: Registered resources and vector stores configuration is consistent. The empty `models` list is appropriate here since `allowed_models` is specified in the provider configuration (line 36). The `vector_stores` defaults align with the FAISS provider setup.
tests/e2e/configs/run-ci.yaml (2)
53-74: Scoring and vector_io providers are well-configured. The scoring providers include useful documentation comments (e.g., "Enable the RAG tool", "Define the storage backend for RAG"). The placeholder `'********'` for `openai_api_key` is appropriate for example configurations.
114-155: Telemetry and storage configuration additions look good. The inline documentation (lines 120, 153) explaining the storage backend purpose improves maintainability. The `vector_stores` configuration with FAISS defaults is consistent with the PR's objective.
examples/run.yaml (3)
1-6: Good documentation header. The introductory comments clearly explain the configuration purpose and prerequisites (OpenAI API key, rag-content tool reference). This helps users understand how to use this example.
74-80: FAISS vector_io configuration is correct. The persistence configuration with `namespace: vector_io::faiss` and `backend: kv_default` properly integrates with the storage backends defined below. This aligns with the PR's goal of fully supporting the FAISS VectorIO provider.
147-161: Registered resources and vector stores complete the FAISS integration. The `vector_stores` section properly configures:
- `default_provider_id: faiss` matching the vector_io provider
- `default_embedding_model` using sentence-transformers with the nomic model
This provides a complete working example for FAISS-based RAG.
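For reference, a sketch of the vector_stores block those bullets describe, with the key and model names taken from this review (the same block is quoted verbatim later in the thread for run-azure.yaml):
  vector_stores:
    default_provider_id: faiss                     # must match the vector_io provider id
    default_embedding_model:
      provider_id: sentence-transformers
      model_id: nomic-ai/nomic-embed-text-v1.5     # default embedding model for new vector stores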
tests/e2e/test_list.txt (1)
1-1: LGTM! The new `faiss.feature` is correctly added to the test list, following the coding guidelines for maintaining end-to-end tests in this file.
docker-compose-library.yaml (1)
16-16: LGTM! The volume mount correctly exposes the test RAG database to the container, enabling FAISS-based e2e tests.
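As an illustration only (the real host and container paths live in docker-compose-library.yaml), such a mount boils down to a single volume entry; both paths and the service name below are hypothetical:
  services:
    lightspeed-stack:                                 # hypothetical service name
      volumes:
        - ./tests/e2e/rag:/root/.llama/storage/rag    # hypothetical paths: expose the prebuilt kv_store.db to the container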
docker-compose.yaml (1)
14-14: LGTM! Consistent with the corresponding mount in `docker-compose-library.yaml`, ensuring FAISS e2e tests work in both library and non-library modes.
tests/e2e/features/faiss.feature (1)
1-39: Feature file follows BDD conventions. The Gherkin structure is correct and scenarios appropriately test FAISS vector store registration and retrieval. As per coding guidelines, this uses the behave framework with proper Given/When/Then steps.
tests/e2e/configs/run-azure.yaml (2)
48-52: Safety providers disabled with documented reason. The commented-out safety configuration includes a clear warning about the infinite loop issue. This is good practice for documenting why a feature is disabled.
61-67: FAISS vector_io provider correctly configured. The FAISS provider configuration with persistence to the `kv_default` backend aligns with the storage configuration and the mounted RAG volume path.
examples/azure-run.yaml (3)
4-14: Missing `telemetry` API compared to run.yaml. The `run.yaml` includes `telemetry` in the APIs list and has a `telemetry: enabled: true` section, but this Azure configuration omits both. If telemetry should be available in Azure deployments, add it for consistency.
48-52: Shields disabled with documented warning — acceptable for known issue. The safety providers array is empty with a clear warning comment about the infinite loop issue. The corresponding `shields: []` in registered_resources is consistent. This is appropriately documented.
Also applies to: 138-138
132-150: Registered resources and vector_stores configuration looks correct. The configuration properly:
- Registers the `gpt-4o-mini` model with the Azure provider
- Sets up the RAG toolgroup with `rag-runtime`
- Configures FAISS as the default vector store provider with `nomic-embed-text-v1.5` as the embedding model
run.yaml (3)
31-39: Well-structured inference providers with environment-driven configuration. The OpenAI provider correctly uses environment variables for both the API key and allowed models with sensible defaults. The inline sentence-transformers provider is appropriately configured for embedding support.
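A sketch of what such an inference block may look like; the `remote::openai` provider type and the `OPENAI_MODEL` variable with its default are assumptions, while `OPENAI_API_KEY` and gpt-4o-mini appear elsewhere in this review:
  inference:
    - provider_id: openai
      provider_type: remote::openai                            # assumed provider type name
      config:
        api_key: ${env.OPENAI_API_KEY}                         # key comes from the environment, never hard-coded
        allowed_models: ["${env.OPENAI_MODEL:=gpt-4o-mini}"]   # hypothetical env var with a default model
    - provider_id: sentence-transformers
      provider_type: inline::sentence-transformers             # local embeddings for the FAISS store
      config: {}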
68-74: FAISS vector_io provider correctly configured for persistence. The FAISS provider is properly set up with a persistence backend pointing to `kv_default`, which aligns with the storage configuration. This enables the RAG functionality as intended by the PR.
148-155: RAG toolgroup and vector_stores configuration complete. The configuration properly:
- Registers the `builtin::rag` toolgroup with the `rag-runtime` provider
- Sets FAISS as the default vector store provider
- Configures `nomic-ai/nomic-embed-text-v1.5` as the default embedding model via `sentence-transformers`
This completes the FAISS VectorIO provider setup required by the PR objectives.
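A sketch of the toolgroup registration behind the first bullet; the `registered_resources`/`tool_groups` key names are assumptions, the ids are quoted from the comment:
  registered_resources:
    tool_groups:
      - toolgroup_id: builtin::rag     # RAG tool exposed to agents
        provider_id: rag-runtime       # tool_runtime provider marked supported in docs/providers.md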
run.yaml
Outdated
telemetry:
  enabled: true
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find run.yaml file
fd -n "run.yaml" --type fRepository: lightspeed-core/lightspeed-stack
Length of output: 306
🏁 Script executed:
#!/bin/bash
# Find run.yaml and check its structure
fd -type f -name "*.yaml" | head -20
Repository: lightspeed-core/lightspeed-stack
Length of output: 246
🏁 Script executed:
find . -name "run.yaml" -type f 2>/dev/null | head -5Repository: lightspeed-core/lightspeed-stack
Length of output: 177
🏁 Script executed:
#!/bin/bash
# Search for run.yaml files
find . -type f -name "run.yaml" 2>/dev/null
# Also search for telemetry configuration patterns
rg -l "telemetry:" --type yaml 2>/dev/null | head -10Repository: lightspeed-core/lightspeed-stack
Length of output: 340
🏁 Script executed:
#!/bin/bash
# Locate run.yaml and examine structure around telemetry config
if [ -f "./run.yaml" ]; then
echo "=== Found run.yaml ==="
wc -l ./run.yaml
echo "=== Lines 100-130 ==="
sed -n '100,130p' ./run.yaml
echo "=== Searching for providers section ==="
grep -n "providers:" ./run.yaml || echo "No explicit providers section found"
elif [ -f "run.yaml" ]; then
wc -l run.yaml
sed -n '100,130p' run.yaml
grep -n "providers:" run.yaml || echo "No explicit providers section found"
else
find . -name "run.yaml" -type f 2>/dev/null -exec sh -c 'echo "=== Found: $1 ==="; wc -l "$1"; sed -n "100,130p" "$1"; grep -n "providers:" "$1" || echo "No providers section"' _ {} \;
fi
Repository: lightspeed-core/lightspeed-stack
Length of output: 1099
🏁 Script executed:
#!/bin/bash
# Read the providers section
sed -n '1,80p' ./run.yaml
Repository: lightspeed-core/lightspeed-stack
Length of output: 2155
🏁 Script executed:
#!/bin/bash
# Check entire telemetry section and nearby context
sed -n '110,125p' ./run.yaml
# Search for any backend/exporter configurations
grep -i "backend\|exporter\|endpoint" ./run.yaml | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 1284
🏁 Script executed:
#!/bin/bash
# Check if telemetry is listed in apis and look for telemetry-related documentation
sed -n '1,20p' ./run.yaml
# Search for any telemetry configuration or documentation
rg -i "telemetry" --type yaml -B2 -A2 ./run.yaml | head -40Repository: lightspeed-core/lightspeed-stack
Length of output: 504
🏁 Script executed:
#!/bin/bash
# Search for telemetry-related documentation, READMEs, or code comments
find . -type f \( -name "README*" -o -name "*.md" -o -name "telemetry*" \) | head -20
rg -i "telemetry.*provider|telemetry.*backend|telemetry.*config" --type yaml --type python -B1 -A1 | head -30Repository: lightspeed-core/lightspeed-stack
Length of output: 617
🏁 Script executed:
#!/bin/bash
# Search for telemetry documentation
rg -i "telemetry" docs/ -l 2>/dev/null | head -10
# Also check config documentation
grep -i "telemetry" README.md docs/config.md 2>/dev/nullRepository: lightspeed-core/lightspeed-stack
Length of output: 319
🏁 Script executed:
#!/bin/bash
# Check providers documentation for telemetry config
grep -i "telemetry" docs/providers.md -B2 -A5 | head -40
# Also check deployment guide
grep -i "telemetry" docs/deployment_guide.md -B2 -A5 | head -40Repository: lightspeed-core/lightspeed-stack
Length of output: 2149
🏁 Script executed:
#!/bin/bash
# Get more context from deployment_guide.md about telemetry configuration
sed -n '1,2000p' docs/deployment_guide.md | grep -i "telemetry" -B5 -A10 | head -60
Repository: lightspeed-core/lightspeed-stack
Length of output: 2874
🏁 Script executed:
#!/bin/bash
# Check other run.yaml examples to see how telemetry should be configured
echo "=== Checking examples/run.yaml ==="
grep -A5 "telemetry:" examples/run.yaml | head -20
echo ""
echo "=== Checking tests/configuration/run.yaml ==="
grep -A5 "telemetry:" tests/configuration/run.yaml | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 391
🏁 Script executed:
#!/bin/bash
# Check if there are any code comments or documentation about when telemetry provider config is required
rg -i "telemetry.*provider|telemetry.*config|enabled.*true" docs/ -B2 -A2 | head -40
# Also check the main config documentation
head -100 docs/config.md
Repository: lightspeed-core/lightspeed-stack
Length of output: 4089
Add telemetry provider configuration to match test and documentation patterns.
Telemetry is enabled but lacks an explicit provider configuration under the providers section, unlike other APIs (inference, safety, scoring, etc.). The test configuration at tests/configuration/run.yaml shows the proper pattern with providers.telemetry using the meta-reference provider. Either add explicit telemetry provider config or document why using a default backend is acceptable.
🤖 Prompt for AI Agents
In run.yaml around lines 114 to 115, telemetry is enabled but missing an
explicit providers.telemetry entry; add a providers.telemetry block mirroring
the pattern used by inference/safety/scoring (e.g., include a telemetry provider
key such as meta-reference with its required connection/config fields used in
tests) so telemetry has an explicit provider configuration, or update docs to
justify relying on a default backend—ensure the YAML structure follows the
existing providers.<provider> pattern used elsewhere.
provider config for telemetry is not supported in 0.3.x
@are-ces, thank you for the clarification! I've noted that telemetry provider configuration is not supported in version 0.3.x, which explains why the configuration is valid as-is with just the enabled: true flag.
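So, per this thread, the entire telemetry configuration in 0.3.x reduces to the flag itself:
  telemetry:
    enabled: true   # no provider block needed in Llama Stack 0.3.x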
✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
tests/e2e/configs/run-azure.yaml
Outdated
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
Vector stores configuration uses v1.5, though v2 is now available as a newer alternative.
The FAISS and sentence-transformers configuration is properly set. However, Nomic Embed Text V2 is now available for production workloads as a successor to v1.5, offering multilingual support if needed. Additionally, ensure that when embeddings are generated at runtime, the text prompt includes a task instruction prefix (e.g., search_document: or search_query: for RAG applications) to align with the model's requirements.
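If the bump is taken, the change is confined to the default embedding model entry; the v2 identifier below follows the reviewer's example and should be verified against the actual model catalogue before use:
  vector_stores:
    default_provider_id: faiss
    default_embedding_model:
      provider_id: sentence-transformers
      model_id: nomic-ai/nomic-embed-text-v2   # reviewer's suggested successor to v1.5; verify the exact id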
🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 146 to 150, the vector store
config pins the embedding model to nomic-ai/nomic-embed-text-v1.5; update
model_id to the v2 release (e.g., nomic-ai/nomic-embed-text-v2) to use the newer
multilingual production model and adjust any downstream compatibility settings
if required, and ensure that runtime embedding calls prepend the appropriate
task instruction prefix (for example "search_document:" or "search_query:") to
the text passed to the embedding API so the model receives the required task
context for RAG/search use cases.
And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
When I use "query" to ask question with authorization header
🧩 Analysis chain
🏁 Script executed:
cat -n tests/e2e/features/faiss.feature | sed -n '30,35p'
Repository: lightspeed-core/lightspeed-stack
Length of output: 542
Complete the JWT token with all three segments.
A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots. The token on line 31 is missing the signature segment and the payload appears truncated. Provide a complete, valid JWT token with all three parts (header.payload.signature).
🤖 Prompt for AI Agents
In tests/e2e/features/faiss.feature around lines 31 to 32, the Authorization
Bearer token is incomplete (missing the third signature segment and has a
truncated payload); replace that string with a well-formed JWT consisting of
three Base64url-encoded segments separated by dots (header.payload.signature).
Ensure the token used in the test is complete and valid for your test
environment (either a real signed token or a locally generated test token with
matching server validation), and update the line to include the full token
string.
160b644 to ca83927 (Compare)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @tests/e2e/configs/run-ci.yaml:
- Around line 60-63: The Braintrust provider config currently hardcodes
openai_api_key: '********' which will fail at runtime; update the provider
config for provider_type inline::braintrust to read the API key from an
environment variable (e.g., using the config expression or secret interpolation
used elsewhere in CI configs) instead of the literal '********', ensuring the
provider_id/provider_type block uses the env var name so the runtime picks up
the real key.
- Around line 160-164: registered_resources has malformed YAML: move the model
entries currently under the top-level list into a nested models: mapping and
ensure shields: [] remains a key with a models: list beneath it; specifically,
under registered_resources->shields keep "shields: []" as an empty sequence or
mapping and create a "models:" key containing entries with "provider_id:" and
"model_id:" (e.g., add an entry for provider_id: sentence-transformers and
model_id: nomic-ai/nomic-embed-text-v1.5) so the default_embedding_model
referenced in vector_stores is registered; validate indentation and YAML
structure so the models are children of the models: key, not sibling keys at the
same level as shields.
🧹 Nitpick comments (2)
tests/e2e/configs/run-ci.yaml (2)
48-52: Track the shields/safety infinite loop issue. Safety providers and shields are disabled due to an "infinite loop issue with LLM calls". While this may be acceptable for E2E testing, ensure this known issue is tracked and prioritized for resolution before production deployment.
Is there an open issue tracking the shields infinite loop problem? If not, would you like me to help create one?
120-120: Consider breaking the long comment into multiple lines. The comment explaining the unified storage backend is informative but spans over 100 characters, which impacts readability.
♻️ Suggested refactor
- kv_default: # Define the storage backend type for RAG, in this case registry and RAG are unified i.e. information on registered resources (e.g. models, vector_stores) are saved together with the RAG chunks
+ kv_default:
+   # Define the storage backend type for RAG. In this case, registry and RAG are unified,
+   # i.e., information on registered resources (e.g., models, vector_stores) are saved
+   # together with the RAG chunks.
    type: kv_sqlite
    db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (1)
tests/e2e/configs/run-ci.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: build-pr
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: library mode / azure
- GitHub Check: E2E: library mode / vertexai
🔇 Additional comments (2)
tests/e2e/configs/run-ci.yaml (2)
31-36: LGTM: Inference provider properly configured. The OpenAI inference provider configuration correctly uses environment variables for the API key and model selection, with a sensible default for E2E testing.
68-74: LGTM: FAISS vector_io provider properly configured. The FAISS provider configuration correctly references the `kv_default` backend for persistence, which aligns with the PR objective to fully support FAISS.
tisnik
left a comment
I'm not sure if the deletion of "old" examples should be part of this PR. Otherwise looks OK, thank you very much.
And REST API service prefix is /v1

@skip-in-library-mode
Scenario: Verify vector store is registered
This scenario is testing llama-stack behavior; it was useful in debugging and could be added as a single step to make sure the vector_store exists, but the entire test should not be there.
I think that this should be moved to info.feature (as stated above).
@radofuchs and I had a discussion about this; his suggestion was to remove the scenario for the llama-stack test entirely and move the test functionality inside the other faiss scenario. My solution has been, instead of testing the llama-stack functionality, to test our endpoint /rags.
imo faiss.feature is the right place because when testing faiss we want to make sure that llama-stack is able to register the faiss vector store
There is a different issue, though, now that you started to test the rags endpoint there, since the tests for rags need to cover more than only the 200 case. We also need to test that it fails when llama-stack is down, as well as the auth errors. So I would move these tests to info.feature to cover all of these there.
c80a78d to 2993407 (Compare)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/configs/run-ci.yaml (1)
145-150: Embedding model mismatch between registration and default. `registered_resources.models` includes `sentence-transformers/all-mpnet-base-v2` (768 dimensions), but `vector_stores.default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5`. Using different embedding models for registration versus default could cause dimension mismatches or inconsistent vector retrieval. Consider aligning these to use the same model:
Option A: Use the registered model as default
  vector_stores:
    default_provider_id: faiss
    default_embedding_model:
      provider_id: sentence-transformers
-     model_id: nomic-ai/nomic-embed-text-v1.5
+     model_id: sentence-transformers/all-mpnet-base-v2
Option B: Register the nomic model instead
- model_id: sentence-transformers/all-mpnet-base-v2
+ model_id: nomic-ai/nomic-embed-text-v1.5
  model_type: embedding
  provider_id: sentence-transformers
- provider_model_id: sentence-transformers/all-mpnet-base-v2
+ provider_model_id: nomic-ai/nomic-embed-text-v1.5
  metadata:
-   embedding_dimension: 768
+   embedding_dimension: 768 # Verify nomic model dimension
Also applies to: 161-165
🤖 Fix all issues with AI agents
In @run.yaml:
- Around line 60-63: The braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) currently contains a hardcoded placeholder
openai_api_key: '********'; replace this with an environment variable reference
(e.g., set openai_api_key from process/env such as ${OPENAI_API_KEY} or a
project-specific name like ${BRAINTRUST_OPENAI_API_KEY}) to match other
credential handling so the actual key is not committed in the repo.
In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config declares an empty safety providers list
(`safety: []`) but still registers a shield named `llama-guard` in
`registered_resources.shields`, causing an inconsistency; either remove the
`llama-guard` entry from `registered_resources.shields` or re-enable the
`llama-guard` provider in the `safety` list (e.g., uncomment and include the
provider config) so that every registered shield has a corresponding provider;
update whichever of `registered_resources.shields` or the `safety` block you
choose to keep so they match.
🧹 Nitpick comments (6)
examples/vertexai-run.yaml (1)
132-138: Empty models array may require documentation. Unlike the vllm configs which pre-register models, this VertexAI config has an empty `models` array in `registered_resources`. Users will need to register models separately. Consider adding a comment indicating that models should be registered via API or by adding entries here with the appropriate `provider_model_id` from the `allowed_models` list (e.g., `google/gemini-2.5-flash`).
examples/vllm-rhoai-run.yaml (1)
34-37: TLS verification disabled - security consideration.
`tls_verify: false` bypasses certificate validation. While this may be acceptable for development or internal environments with self-signed certificates, it introduces security risks in production. Consider:
- Adding a comment explaining when this is appropriate
- Using an environment variable to control this setting:
`tls_verify: ${env.VLLM_TLS_VERIFY:=true}`
♻️ Suggested approach
  config:
    url: ${env.KSVC_URL}/v1/
    api_token: ${env.VLLM_API_KEY}
-   tls_verify: false
+   tls_verify: ${env.VLLM_TLS_VERIFY:=true} # Set to false only for self-signed certs in dev/test
    max_tokens: 1024
examples/vllm-rhaiis-run.yaml (1)
34-37: URL construction reduces flexibility. The URL is constructed with a hardcoded protocol (`http://`) and port (`:8000`). This limits deployment flexibility:
- Cannot switch to HTTPS without changing the config
- Cannot use a different port without modifying the file
Consider making the entire URL configurable via environment variable, similar to the RHOAI config.
♻️ Suggested approach
  config:
-   url: http://${env.RHAIIS_URL}:8000/v1/
+   url: ${env.RHAIIS_URL}/v1/ # e.g., http://localhost:8000 or https://rhaiis.example.com
    api_token: ${env.RHAIIS_API_KEY}
-   tls_verify: false
+   tls_verify: ${env.RHAIIS_TLS_VERIFY:=true}
    max_tokens: 2048
examples/azure-run.yaml (1)
30-39: Hardcoded test endpoint and preview API version. Two concerns with the Azure configuration:
Hardcoded test endpoint (line 35): `api_base: https://ols-test.openai.azure.com/` appears to be an internal test endpoint. Example configurations should use environment variables to avoid confusion and accidental usage of test infrastructure.
Preview API version (line 36): `api_version: 2024-02-15-preview` is a preview version that may be deprecated. Consider documenting the minimum required API version or using a stable release.
♻️ Suggested fix
  inference:
  - provider_id: azure
    provider_type: remote::azure
    config:
      api_key: ${env.AZURE_API_KEY}
-     api_base: https://ols-test.openai.azure.com/
-     api_version: 2024-02-15-preview
+     api_base: ${env.AZURE_API_BASE} # e.g., https://your-resource.openai.azure.com/
+     api_version: ${env.AZURE_API_VERSION:=2024-02-15-preview}
66-69: Use environment variable for API key consistency. The `braintrust` provider uses a hardcoded placeholder `'********'` for `openai_api_key`. For consistency with other sensitive values in this config (e.g., `${env.OPENAI_API_KEY}`), consider using an environment variable.
Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-     openai_api_key: '********'
+     openai_api_key: ${env.OPENAI_API_KEY}
60-63: Use environment variable for API key consistency. Same as in `examples/run.yaml` - the `braintrust` provider uses a hardcoded placeholder. Use an environment variable for consistency with other secrets management in this config.
Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-     openai_api_key: '********'
+     openai_api_key: ${env.OPENAI_API_KEY}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
`tests/e2e/rag/kv_store.db` is excluded by `!**/*.db`
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis-run.yaml
- examples/vllm-rhelai-run.yaml
- examples/vllm-rhoai-run.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-pgvector-run.yaml
- examples/openai-faiss-run.yaml
- examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- docker-compose.yaml
- pyproject.toml
- docker-compose-library.yaml
- examples/vllm-rhelai-run.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
examples/vllm-rhoai-run.yaml
tests/e2e/configs/run-rhaiis.yaml
examples/azure-run.yaml
tests/e2e-prow/rhoai/configs/run.yaml
tests/e2e/configs/run-azure.yaml
examples/vertexai-run.yaml
examples/vllm-rhaiis-run.yaml
examples/run.yaml
tests/e2e/configs/run-rhelai.yaml
tests/e2e/configs/run-vertexai.yaml
tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: server mode / ci
🔇 Additional comments (24)
docs/providers.md (2)
103-103: LGTM - Provider status update aligns with PR objectives. The llama-guard provider status change from ✅ to ❌ is consistent with the "infinite loop issue with LLM calls" mentioned in the configuration files. Consider adding a note in the documentation explaining why this provider is currently unsupported, so users understand it's a known issue rather than a missing feature.
160-160: LGTM - RAG runtime support documented. The rag-runtime provider is now marked as supported, which aligns with the FAISS VectorIO provider support being added in this PR.
examples/vertexai-run.yaml (3)
30-39: LGTM - VertexAI inference configuration. The dual-provider setup with Google Vertex for LLM inference and sentence-transformers for embeddings is well-structured.
48-52: Safety provider disabled - ensure this is tracked. The llama-guard provider is commented out with a warning about an "infinite loop issue with LLM calls." This aligns with the documentation update marking llama-guard as unsupported. Ensure there's an issue tracking this bug for future resolution.
142-146: LGTM - Vector stores configuration. The default FAISS provider and sentence-transformers embedding model configuration is consistent with the PR's goal of supporting FAISS VectorIO.
examples/vllm-rhaiis-run.yaml (1)
134-138: LGTM - Environment-driven model configuration. Using environment variables for both `model_id` and `provider_model_id` is more flexible than hardcoding, allowing the same config to work across different RHAIIS deployments with different models.
examples/azure-run.yaml (2)
132-145: LGTM - Resource configuration is complete. The registered resources section properly links the gpt-4o-mini model to the Azure provider and includes the RAG tool group configuration.
146-150: LGTM - Vector stores configuration consistent across examples. The FAISS vector store with sentence-transformers embedding model is consistently configured across all example files, which supports maintainability.
examples/vllm-rhoai-run.yaml (1)
134-138: No changes needed. Setting `provider_model_id: null` is intentional and supported behavior in Llama Stack. When set to `null` (or omitted), the provider defaults to using the `model_id` value directly. This is documented behavior and is used consistently in the codebase.
Likely an incorrect or invalid review comment.
tests/e2e/configs/run-azure.yaml (1)
1-160: Configuration structure looks correct for Azure provider integration. The file follows the new centralized providers/storage configuration pattern correctly. Telemetry is properly configured with just `enabled: true` as expected for Llama Stack 0.3.x. All sensitive credentials use environment variables appropriately.
Based on learnings, the telemetry configuration aligns with the documented pattern.
tests/e2e/configs/run-rhaiis.yaml (1)
1-161: Configuration correctly implements RHAIIS provider setup. The file properly configures the vLLM-based RHAIIS inference provider with environment-variable-driven URLs and credentials. The structure aligns with the unified providers/storage schema introduced in this PR.
tests/e2e-prow/rhoai/configs/run.yaml (2)
139-142: Verify intentional use of `null` for `provider_model_id`. The `provider_model_id` is explicitly set to `null` while `model_id` is set to `meta-llama/Llama-3.2-1B-Instruct`. Typically, `provider_model_id` maps to the model identifier expected by the provider. If the vLLM provider uses the `model_id` directly or infers it from the endpoint, this may be intentional. Otherwise, consider setting it to match the deployed model.
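If an explicit value is preferred over null, the registration could mirror the pattern suggested for the other example configs; this is a sketch, with the model id copied from the comment above:
  registered_resources:
    models:
      - model_id: meta-llama/Llama-3.2-1B-Instruct
        provider_id: vllm
        model_type: llm
        provider_model_id: meta-llama/Llama-3.2-1B-Instruct   # match the model actually served by the vLLM endpoint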
143-146: Potential mismatch: `llama-guard` shield references `openai/gpt-4o-mini`. The shield is configured with `provider_id: llama-guard` but `provider_shield_id: openai/gpt-4o-mini`. This pattern is consistent across other configs, but verify that the llama-guard safety provider correctly routes to the OpenAI provider for the underlying model inference.
tests/e2e/configs/run-vertexai.yaml (2)
30-36: Vertex AI provider configuration looks good. The Google Vertex AI provider is correctly configured with environment variables for project and location. The `allowed_models` array explicitly lists the permitted model.
136-137: Verify empty `models` array is intentional. The `registered_resources.models` is an empty array, while other configuration files (e.g., `run-azure.yaml`, `run-rhaiis.yaml`) explicitly register models. If models should be dynamically discovered via `allowed_models` in the provider config, this is fine. Otherwise, consider registering the Vertex AI model explicitly for consistency.
run.yaml (3)
12-12: Verify `telemetry` in the `apis` list. This file includes `telemetry` in the `apis` list (line 12), while other configuration files in this PR do not. Verify if this is intentional for the root/starter configuration or if it should be removed for consistency with the other configs.
117-120: Helpful inline documentation. The comment explaining that registry and RAG storage are unified in `kv_default` is valuable for understanding the configuration design.
156-164: Vector stores and telemetry configuration looks correct. The `vector_stores` defaults, safety configuration, and telemetry enablement follow the expected pattern for Llama Stack 0.3.x. Based on learnings, the telemetry configuration correctly uses `enabled: true` without a provider block.
tests/e2e/configs/run-rhelai.yaml (2)
155-157: Verify embedding model availability. The `default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5` via the `sentence-transformers` provider, but no embedding model is explicitly registered under `registered_resources.models`. Confirm that the sentence-transformers provider auto-loads this model or add an explicit registration.
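If an explicit registration is the chosen route, the entry would look roughly like this; the embedding_dimension value is an assumption that the review elsewhere flags as needing verification:
  registered_resources:
    models:
      - model_id: nomic-ai/nomic-embed-text-v1.5
        model_type: embedding
        provider_id: sentence-transformers
        provider_model_id: nomic-ai/nomic-embed-text-v1.5
        metadata:
          embedding_dimension: 768   # assumed; verify the nomic model's actual dimension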
160-161: LGTM! Telemetry configuration correctly uses `enabled: true` without a provider block, which aligns with Llama Stack 0.3.x requirements.
examples/run.yaml (2)
157-161: LGTM! The vector_stores configuration correctly defines the default FAISS provider and embedding model. The use of sentence-transformers with the nomic model is consistent with other configuration files.
1-7: Good documentation header. The header comments clearly explain the purpose, requirements, and external tooling reference. This is helpful for users setting up OpenAI + FAISS configurations.
tests/e2e/configs/run-ci.yaml (2)
48-52: LGTM! Safety provider is properly configured with `llama-guard` and correctly referenced in `registered_resources.shields` and `safety.default_shield_id`. This differs from other configs where safety is disabled, which is appropriate for CI testing.
168-169: LGTM! Telemetry is correctly enabled and listed in the APIs. Configuration follows Llama Stack 0.3.x conventions.
2993407 to 3dc2804 (Compare)
Actionable comments posted: 10
🤖 Fix all issues with AI agents
In @examples/azure-run.yaml:
- Line 36: Update the api_version value for Azure OpenAI in the YAML: replace
the outdated "2024-02-15-preview" with the stable GA "2024-06-01" (or a newer
preview like "2025-04-01-preview" if preview features are required) by editing
the api_version key in the Azure config block so the example uses a current
supported API version.
In @examples/run.yaml:
- Around line 66-69: The YAML snippet uses a hardcoded placeholder API key
("********") for the braintrust provider which will fail at runtime; update the
provider config (provider_id/provider_type/config.openai_api_key) to read the
real key from a secure source (e.g., environment variable or secrets manager)
and validate presence before use (fail fast with a clear error if the env
var/secret is missing) so the provider doesn't attempt to start with a
placeholder value.
- Around line 157-161: The examples run.yaml is missing the telemetry
configuration block referenced by the telemetry API; add a telemetry section at
the end of the file with telemetry: enabled: true (matching the run.yaml
pattern) so telemetry is explicitly enabled and consistent with the API usage.
In @pyproject.toml:
- Line 63: The dependency line "einops>=0.8.1" lacks an explanatory comment; add
a brief single-line comment immediately above that line describing why einops is
required (e.g., "Used for tensor rearrangements and shape transformations in
model preprocessing/visual modules") so it matches the style of other
dependencies and improves maintainability.
In @run.yaml:
- Around line 60-63: The braintrust provider config currently uses a hardcoded
placeholder openai_api_key: '********' which will fail at runtime; update the
inline::braintrust provider block (refer to provider_type: inline::braintrust
and provider_id: braintrust) to read the OpenAI key from an environment variable
(e.g., reference a secret like OPENAI_API_KEY used elsewhere) instead of the
literal ********, and ensure any runtime code that loads config supports env
interpolation or use your secrets manager so the provider's openai_api_key is
populated securely at runtime.
In @tests/e2e/configs/run-ci.yaml:
- Around line 116-123: The two storage backends (kv_default and sql_default)
both use the same env var pattern ${env.SQLITE_STORE_DIR:=...}, causing
collisions when the env var is set; change them to distinct env vars and/or
paths so they cannot override each other (e.g., use
${env.SQLITE_KV_STORE_DIR:=~/.llama/storage/rag/kv_store.db} for kv_default and
${env.SQLITE_SQL_STORE_DIR:=~/.llama/storage/sql_store.db} for sql_default) and
ensure the db_path values for kv_default and sql_default remain unique and
consistent with the other store definitions referenced elsewhere.
- Around line 161-165: The default_embedding_model value (model_id
nomic-ai/nomic-embed-text-v1.5) is not present in registered_resources.models
which will break FAISS usage; either change
vector_stores.default_embedding_model to the already-registered model_id
(sentence-transformers/all-mpnet-base-v2) or add an entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models with model_id
nomic-ai/nomic-embed-text-v1.5, model_type: embedding, provider_id:
sentence-transformers, provider_model_id: nomic-ai/nomic-embed-text-v1.5 and
correct embedding_dimension in metadata so the default and registry match.
- Around line 17-28: The three store entries (conversations_store,
inference_store, metadata_store) currently all use the same env pattern
`${env.SQLITE_STORE_DIR:=...}` which makes them point to the same file when
SQLITE_STORE_DIR is set; update these keys so they either use distinct env vars
per store (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or
treat SQLITE_STORE_DIR as a directory and append filenames (e.g., use
SQLITE_STORE_DIR with `/conversations.db`, `/inference-store.db`,
`/registry.db`), and ensure this matches the existing storage_dir expectation
for a directory.
- Around line 60-63: The Braintrust provider block (provider_id: braintrust,
provider_type: inline::braintrust) contains a hardcoded/placeholder
openai_api_key value; replace it with an environment variable reference (e.g.,
use a CI/secret env var like BRAINTRUST_OPENAI_API_KEY) so the key is not
committed, or if this provider is intentionally disabled remove the provider
block or add a clear comment explaining it's disabled; ensure the config key
name openai_api_key stays the same and that your CI/concourse/github-actions
reads the chosen env var before tests run.
In @tests/e2e/configs/run-rhelai.yaml:
- Around line 53-57: The config sets safety: [] but still registers shields and
sets safety.default_shield_id to "llama-guard", causing an inconsistency; fix by
removing the shield registration block (the entries that reference provider_id:
llama-guard / provider_type: inline::llama-guard) and remove or clear
safety.default_shield_id (the safety.default_shield_id: llama-guard setting) so
the config consistently disables safety providers, or alternatively re-enable
the safety provider entry and its config under safety: if you intend to keep
shields.
🧹 Nitpick comments (2)
examples/vllm-rhelai-run.yaml (1)
36-36: Consider documenting TLS verification in production guidance.
`tls_verify: false` is acceptable for examples/development, but users copying this config should be reminded to enable TLS verification in production environments. A brief comment would help.
tests/e2e/configs/run-azure.yaml (1)
35-35: Hardcoded Azure endpoint in test configuration. The `api_base` is hardcoded to `https://ols-test.openai.azure.com/`. This is acceptable for e2e tests but consider using an environment variable for flexibility across test environments.
♻️ Suggested change
- api_base: https://ols-test.openai.azure.com/
+ api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
`tests/e2e/rag/kv_store.db` is excluded by `!**/*.db`
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis-run.yaml
- examples/vllm-rhelai-run.yaml
- examples/vllm-rhoai-run.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/vllm-granite-run.yaml
- examples/gemini-run.yaml
- examples/openai-pgvector-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-faiss-run.yaml
- examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (7)
- docs/providers.md
- examples/vllm-rhaiis-run.yaml
- docker-compose.yaml
- tests/e2e/test_list.txt
- tests/e2e/features/faiss.feature
- examples/vertexai-run.yaml
- tests/e2e/configs/run-vertexai.yaml
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Check `pyproject.toml` for supported Python versions before development
Always check `pyproject.toml` for existing dependencies before adding new ones
Always verify current library versions in `pyproject.toml` rather than assuming versions
Use pylint with `source-roots = "src"` configuration
Files:
pyproject.toml
🧠 Learnings (5)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.
Applied to files:
pyproject.toml
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- examples/vllm-rhoai-run.yaml
- examples/azure-run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- examples/run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
- tests/e2e/configs/run-ci.yaml
- run.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: server mode / ci
🔇 Additional comments (25)
tests/e2e/configs/run-rhaiis.yaml (6)
17-27: LGTM! The store configurations use consistent environment variable patterns with sensible defaults and separate database files for each store type.
66-72: LGTM! The FAISS vector_io provider configuration correctly sets up persistence with the `kv_default` backend and an appropriate namespace. This aligns with the PR objective of fully supporting the FAISS VectorIO provider.
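For reference, the shape of such a provider block is roughly as follows. This is a sketch based on the description above; the exact namespace string is an assumption.

```yaml
providers:
  vector_io:
    - provider_id: faiss
      provider_type: inline::faiss
      config:
        persistence:
          backend: kv_default          # shared KV backend declared under storage.backends
          namespace: vector_io::faiss  # assumed namespace value
```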
73-110: LGTM! The agents, batches, datasetio, and eval provider configurations follow consistent patterns with appropriate namespace isolation and backend references.
114-136: LGTM! The storage configuration properly defines both kv_sqlite and sql_sqlite backends with appropriate store definitions. The inference store includes sensible write queue settings for performance.
143-146: Verify shield configuration consistency. The shield is registered with `provider_id: llama-guard` (which maps to the `inline::llama-guard` provider), but `provider_shield_id` references `openai/gpt-4o-mini`. This appears inconsistent; typically the `provider_shield_id` should match the provider's expected model identifier.
Given the PR notes that shield support was revoked due to unusability, please confirm whether this configuration is intentionally a placeholder or if it should be corrected/removed.
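If the shield is kept, a consistent registration would point `provider_shield_id` at a model the safety provider can actually serve. A hypothetical sketch (the Llama Guard model id is an assumption, not taken from the config):

```yaml
registered_resources:
  shields:
    - shield_id: llama-guard
      provider_id: llama-guard                         # the inline::llama-guard safety provider
      provider_shield_id: meta-llama/Llama-Guard-3-8B  # assumed guard model id
```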
153-161: LGTM! The vector stores configuration correctly sets FAISS as the default provider with the sentence-transformers embedding model. The telemetry configuration follows the Llama Stack 0.3.x pattern of using a simple `enabled: true` without a provider block. Based on learnings, this is the correct approach.
tests/e2e/configs/run-ci.yaml (3)
12-12: LGTM! The telemetry API addition and the `telemetry.enabled: true` configuration align with the Llama Stack 0.3.x pattern where telemetry can be enabled without requiring an explicit provider block. Based on learnings.
Also applies to: 168-169
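The 0.3.x pattern amounts to no more than the following minimal sketch:

```yaml
apis:
  - telemetry
telemetry:
  enabled: true   # no provider block under `providers` is needed in 0.3.x
```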
31-74: LGTM! The providers block is well-structured with appropriate comments explaining the RAG tool and FAISS storage backend configuration. The inference provider correctly uses environment variables for the API key and allowed models.
158-167: LGTM! The RAG tool group registration properly references the `rag-runtime` provider, and the safety shield configuration correctly references the registered `llama-guard` shield.
docker-compose-library.yaml (1)
16-16: LGTM! The new volume mount correctly exposes the e2e RAG test database to the container at the expected storage path. The `:Z` flag is appropriate for SELinux relabeling.
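The mount in question follows the usual compose pattern, roughly as in the sketch below; the service name and container path are assumptions based on the comment, not copied from the file.

```yaml
services:
  llama-stack:   # assumed service name
    volumes:
      # expose the prebuilt e2e RAG database; :Z relabels the mount for SELinux
      - ./tests/e2e/rag:/root/.llama/storage/rag:Z
```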
examples/vllm-rhelai-run.yaml (1)
1-151: LGTM - Well-structured vLLM RHEL AI configuration. Configuration properly uses environment variables for sensitive values and follows the centralized storage-backed architecture pattern. The shields-disabled warning (lines 49, 139) is appropriately documented.
examples/vllm-rhoai-run.yaml (1)
1-151: LGTM - Consistent RHOAI configuration. Configuration mirrors the RHEL AI variant appropriately with RHOAI-specific environment variables (`KSVC_URL`, `VLLM_API_KEY`). The hardcoded model ID with `provider_model_id: null` is valid for vLLM deployments.
tests/e2e/configs/run-azure.yaml (1)
1-160: LGTM - Comprehensive Azure e2e test configuration. Configuration correctly enables the safety provider (llama-guard) for e2e testing, unlike the examples where shields are disabled. The telemetry configuration at root level aligns with Llama Stack 0.3.x requirements. Based on learnings.
examples/azure-run.yaml (1)
1-150: LGTM - Azure example configuration follows established patterns. Configuration properly disables shields with documented warnings and uses the centralized storage-backed architecture.
tests/e2e-prow/rhoai/configs/run.yaml (2)
1-161: LGTM - Well-configured RHOAI prow test configuration. Configuration appropriately enables safety (llama-guard) and telemetry for CI/prow testing. The dual inference providers (vllm + openai) provide flexibility for the test environment.
144-146: Shield configuration depends on OpenAI provider. The `provider_shield_id: openai/gpt-4o-mini` indicates llama-guard uses OpenAI as its backing model. Ensure `OPENAI_API_KEY` is available in the prow test environment, as the shield will fail without it even though vllm is the primary inference provider.
run.yaml (3)
12-14: LGTM! The telemetry API is correctly added to the apis list, and the telemetry configuration at lines 163-164 follows the correct Llama Stack 0.3.x pattern with `telemetry.enabled: true` without a provider block. Based on learnings, this is the expected approach.
68-74: LGTM! The FAISS vector_io provider configuration is well-structured with proper persistence settings referencing the `kv_default` backend.
156-164: LGTM! The vector_stores, safety, and telemetry sections are properly configured with appropriate defaults and references to the defined providers.
examples/run.yaml (3)
1-6: LGTM! Good documentation header explaining the configuration purpose and requirements.
54-58: Safety shields intentionally disabled. The safety providers and shields are disabled with a documented reason (infinite loop issue with LLM calls). This is acceptable for the example configuration, but ensure the underlying issue is tracked for resolution.
Is there a tracking issue for the infinite loop problem mentioned in the comment? If not, consider creating one to ensure this gets addressed.
147-156: LGTM! The registered_resources section correctly registers the RAG tool group while keeping models and shields empty (consistent with the disabled safety configuration and dynamic model registration via OpenAI).
tests/e2e/configs/run-rhelai.yaml (3)
30-37: LGTM with note on TLS verification. The vLLM inference provider configuration is appropriate for the RHEL AI test environment. Note that `tls_verify: false` is acceptable for testing but should not be used in production environments.
153-161: LGTM! The vector_stores configuration and telemetry block are correctly configured for the RHEL AI test environment, following the Llama Stack 0.3.x pattern.
114-136: LGTM! The storage backends and stores configuration is well-structured and consistent with the other configuration files in this PR.
config:
  api_key: ${env.AZURE_API_KEY}
  api_base: https://ols-test.openai.azure.com/
  api_version: 2024-02-15-preview
🧩 Analysis chain
🌐 Web query:
Azure OpenAI API versions 2024 2025 latest current
💡 Result:
Short answer
- Latest GA Azure OpenAI Service API version: 2024-06-01. [1]
- Latest preview / Foundry API versions (most recent preview cycle): 2025-04-01-preview (and incremental preview dates like 2025-03-01-preview, 2025-02-01-preview, 2024-12-01-preview are documented). [2]
Notes
- Microsoft documents the API-version lifecycle and change history for Azure OpenAI / Foundry (preview versions listed through 2025-04-01-preview). [2]
- If you use the Assistants API, v1 is being retired (migration to v2 required—retirement notices in 2025), so check the specific retirement date for your region. [3]
Sources
- Rollout / Azure OpenAI Service summary (latest GA = 2024-06-01). [4]
- Microsoft Learn — Azure OpenAI / Foundry API version lifecycle (preview releases through 2025-04-01-preview). [2]
- Announcement / notices re: Assistants API v1 retirement. [3]
Would you like direct links to the specific Microsoft docs or help updating code to a particular api-version?
Update Azure OpenAI API version to current release.
The api_version: 2024-02-15-preview is outdated. The latest GA version is 2024-06-01, with preview versions available through 2025-04-01-preview. Update to at least the stable 2024-06-01 release to ensure access to current features and improvements.
🤖 Prompt for AI Agents
In @examples/azure-run.yaml at line 36, Update the api_version value for Azure
OpenAI in the YAML: replace the outdated "2024-02-15-preview" with the stable GA
"2024-06-01" (or a newer preview like "2025-04-01-preview" if preview features
are required) by editing the api_version key in the Azure config block so the
example uses a current supported API version.
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: '********'
Placeholder API key for braintrust scoring provider.
Same issue as in run.yaml - the placeholder value will fail at runtime if used.
Suggested fix
- provider_id: braintrust
provider_type: inline::braintrust
config:
- openai_api_key: '********'
+ openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @examples/run.yaml around lines 66 - 69, The YAML snippet uses a hardcoded
placeholder API key ("********") for the braintrust provider which will fail at
runtime; update the provider config
(provider_id/provider_type/config.openai_api_key) to read the real key from a
secure source (e.g., environment variable or secrets manager) and validate
presence before use (fail fast with a clear error if the env var/secret is
missing) so the provider doesn't attempt to start with a placeholder value.
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: '********'
Braintrust API key should use environment variable.
The openai_api_key: '********' appears to be either a placeholder or obfuscated value. If this provider is used in CI tests, it will fail. If it's intentionally disabled, consider removing the provider or adding a comment.
Per project guidelines, use environment variables for sensitive data.
🔧 Suggested fix
- provider_id: braintrust
provider_type: inline::braintrust
config:
- openai_api_key: '********'
+ openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 60 - 63, The Braintrust provider
block (provider_id: braintrust, provider_type: inline::braintrust) contains a
hardcoded/placeholder openai_api_key value; replace it with an environment
variable reference (e.g., use a CI/secret env var like
BRAINTRUST_OPENAI_API_KEY) so the key is not committed, or if this provider is
intentionally disabled remove the provider block or add a clear comment
explaining it's disabled; ensure the config key name openai_api_key stays the
same and that your CI/concourse/github-actions reads the chosen env var before
tests run.
3dc2804 to f2b7f5e
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/providers.md (1)
23-23: Typo: "atributes" → "attributes"
- The tables below summarize each provider category, containing the following atributes:
+ The tables below summarize each provider category, containing the following attributes:
🤖 Fix all issues with AI agents
In @.github/workflows/e2e_tests_providers.yaml:
- Around line 4-5: The workflow currently uses a bare "on: push" which triggers
on every branch and removed manual dispatch, PR validation, and scheduled runs;
update the "on" section to a mapping that includes "push" with "branches" (e.g.,
["main","release"]) to limit pushes, add "pull_request" with the same "branches"
to validate PRs, re-add "workflow_dispatch" to allow manual runs, and include a
"schedule" entry (cron) to run periodic E2E checks; modify the YAML keys "on",
"push", "pull_request", "workflow_dispatch", and "schedule" accordingly so the
workflow is targeted, can be manually triggered, validates PRs, and runs on a
cron.
In @examples/azure-run.yaml:
- Around line 137-152: The config references default_embedding_model =
"nomic-ai/nomic-embed-text-v1.5" but that embedding model is not present in
registered_resources.models; add a models entry for the embedding (e.g., a model
object with model_id matching the referenced name or a chosen id, provider_id
set to the correct provider for Nomic, model_type set to "embedding", and
provider_model_id set to "nomic-ai/nomic-embed-text-v1.5") so
default_embedding_model points to a registered model; ensure the same fix is
applied for the other occurrence noted (lines 153-157).
In @examples/run.yaml:
- Around line 119-120: The YAML contains two identical top-level telemetry
blocks; remove the duplicate so only a single "telemetry:" mapping remains at
the top level, deleting the extra "telemetry: enabled: true" occurrence and
ensuring surrounding indentation/spacing keeps the YAML valid.
- Around line 146-157: The YAML registers no models while RAG references an
embedding model; add the embedding model to registered_resources.models so
default_embedding_model matches a registered model (e.g., add an entry with
model_id like "nomic-embed-text-v1.5" or similar, provider_id (e.g.,
"nomic-ai"), and provider_model_id set to "nomic-ai/nomic-embed-text-v1.5");
ensure the model_id you add is the same identifier used by
default_embedding_model and that the toolgroup entry (toolgroup_id:
builtin::rag) can resolve that model.
- Line 58: The YAML has a syntax error where the keys "provider_type" and
"scoring" are on the same line; update the config so "provider_type:
inline::llama-guard" and "scoring:" are on separate lines (ensure proper
indentation for the "scoring" block), so the YAML parser can correctly recognize
both keys.
In @examples/vllm-rhaiis.yaml:
- Around line 17-27: The config uses the same `${env.SQLITE_STORE_DIR:=...}`
substitution for conversations_store, inference_store, and metadata_store which
replaces the entire path and will make all three point to the same file when
SQLITE_STORE_DIR is set; update the three entries so each store has its own env
var (e.g., CONVERSATIONS_DB_PATH, INFERENCE_DB_PATH, METADATA_DB_PATH) or switch
to a single directory env var (e.g., SQLITE_STORE_DIR) combined with per-store
filenames (concatenate directory + filename) so that conversations_store,
inference_store, and metadata_store each resolve to distinct DB files.
- Around line 46-52: The storage_dir value is using the misnamed env var
SQLITE_STORE_DIR which implies a DB path; update the config to use a dedicated
environment variable (e.g., FILES_STORAGE_DIR) for file storage paths and adjust
the default accordingly. Locate the storage_dir key in the inline::localfs
provider block and replace `${env.SQLITE_STORE_DIR:=~/.llama/storage/files}`
with `${env.FILES_STORAGE_DIR:=~/.llama/storage/files}` (or another appropriate
FILES_* var), and ensure any documentation or env examples are updated to
reference FILES_STORAGE_DIR instead of SQLITE_STORE_DIR.
- Around line 153-157: The configuration selects the embedding model
"nomic-ai/nomic-embed-text-v1.5" but the Python dependency "nomic" is not
declared; add "nomic" to the project dependencies (pyproject.toml/requirements)
alongside the existing sentence-transformers and einops entries so the model can
be loaded at runtime, specifying a stable version constraint (or latest)
compatible with sentence-transformers.
In @examples/vllm-rhelai.yaml:
- Line 15: Remove the trailing whitespace after the APIs list on line 15 (the
blank/whitespace-only content following the "APIs" entry) so the line contains
no trailing spaces; simply delete the extra spaces at the end of that line to
clean up formatting.
- Around line 143-146: The YAML defines a shield (shield_id: llama-guard,
provider_id: llama-guard, provider_shield_id: openai/gpt-4o-mini) and sets
default_shield_id: llama-guard while the safety providers array is empty, which
is inconsistent; either remove the shield block and the default_shield_id entry
(if safety is intended to be disabled) or populate the safety providers array
with a matching provider definition for provider_id: llama-guard (and configure
any required provider_shield mapping) so the default_shield_id resolves at
runtime.
- Around line 17-27: The db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) currently all use the same env
var SQLITE_STORE_DIR with different inline defaults, which causes different
stores to collapse to the same path when SQLITE_STORE_DIR is set; change each
entry to either use its own env var (e.g., SQLITE_CONVERSATIONS_PATH,
SQLITE_INFERENCE_PATH, SQLITE_METADATA_PATH) or treat SQLITE_STORE_DIR strictly
as a directory prefix and append distinct filenames outside the variable (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db), and apply the same
change across all example and test config files to avoid cross-store collisions.
In @run.yaml:
- Around line 156-160: The default_embedding_model currently sets provider_id:
sentence-transformers and model_id: nomic-ai/nomic-embed-text-v1.5 which doesn't
match the registered embedding model (sentence-transformers/all-mpnet-base-v2);
update the default_embedding_model.model_id to the registered model_id
(sentence-transformers/all-mpnet-base-v2) so vector_stores.default_provider_id
and the registered embedding entry are consistent (mirror the same fix applied
in run-ci.yaml).
In @tests/e2e/configs/run-ci.yaml:
- Around line 161-165: The default_embedding_model
(nomic-ai/nomic-embed-text-v1.5) is not registered in
registered_resources.models (only sentence-transformers/all-mpnet-base-v2 is
present), causing a mismatch; fix by either adding a registration entry for
nomic-ai/nomic-embed-text-v1.5 under registered_resources.models or changing
vector_stores.default_embedding_model to the existing
sentence-transformers/all-mpnet-base-v2 so the default references a registered
model.
In @tests/e2e/configs/run-rhaiis.yaml:
- Around line 115-122: Both storage backends kv_default and sql_default use the
same env var SQLITE_STORE_DIR so setting it makes them point to the same DB
file; change the db_path entries so each backend uses a distinct env var or
distinct filename (e.g., SQLITE_KV_STORE_DIR for kv_default and
SQLITE_SQL_STORE_DIR for sql_default, or keep SQLITE_STORE_DIR but append
different filenames like ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
and .../sql_store.db) so kv_default.db_path and sql_default.db_path resolve to
different files.
- Around line 18-28: The three db_path entries (conversations_store.db_path,
inference_store.db_path, metadata_store.db_path) incorrectly treat
SQLITE_STORE_DIR as a full filepath and thus will point all databases to the
same file when the env var is set; update each db_path to use SQLITE_STORE_DIR
as a base directory and append distinct filenames (e.g., conversations.db,
inference-store.db, registry.db) so each DB has its own file, and ensure the
storage_dir usage at the other location (where storage_dir expects a directory)
remains consistent with this change.
- Around line 46-53: The storage_dir entry for provider_id
"meta-reference-files" currently uses
${env.SQLITE_STORE_DIR:=~/.llama/storage/files} which is inconsistent with the
repo's directory-vs-file env var convention; update this value to the
directory-specific env var used elsewhere (for example replace SQLITE_STORE_DIR
with the directory-specific name, e.g. SQLITE_STORE_DIR_DIR or the agreed-upon
directory env var) so storage_dir consistently expects a directory path.
- Line 16: Remove the trailing whitespace after the list item "- vector_io" in
the YAML so the line ends exactly with "- vector_io" (no extra spaces or tabs);
update the file to trim trailing whitespace on that line and run a
YAML/whitespace linter or git diff to confirm no trailing characters remain.
In @tests/e2e/configs/run-rhelai.yaml:
- Around line 54-58: The config disables safety via "safety: []" but still
registers a shield and sets "safety.default_shield_id" which creates an
inconsistent state; fix by removing or commenting out the shield entry under
"registered_resources.shields" and clearing or commenting
"safety.default_shield_id" (or set it to null) so no shield is referenced while
safety is disabled, and add a clear inline warning comment near "safety: []"
explaining why shields are disabled to prevent accidental re-enablement.
- Around line 154-158: The config references default_embedding_model
(provider_id sentence-transformers, model_id nomic-ai/nomic-embed-text-v1.5) but
no embedding model is registered under registered_resources.models; add a model
entry in registered_resources.models that registers this embedding model using
the same provider_id and model_id so the RAG/vector_stores flow can resolve it
(ensure the entry key/name matches any expected reference and use the symbols
registered_resources.models and default_embedding_model to locate where to add
the registration).
🧹 Nitpick comments (4)
tests/e2e/configs/run-azure.yaml (1)
31-44: Consider environment variable for Azure `api_base`. The Azure endpoint `https://ols-test.openai.azure.com/` is hardcoded while `api_key` uses an environment variable. For consistency and flexibility across environments, consider templating the base URL:
♻️ Optional: Use environment variable for api_base
  - provider_id: azure
    provider_type: remote::azure
    config:
      api_key: ${env.AZURE_API_KEY}
-     api_base: https://ols-test.openai.azure.com/
+     api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
      api_version: 2024-02-15-preview
tests/e2e/configs/run-ci.yaml (1)
60-63: Use environment variable for braintrust API key. The hardcoded `'********'` placeholder will cause braintrust scoring to fail if invoked. Consider using environment variable substitution for consistency with other API keys in this config.
Suggested fix
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-     openai_api_key: '********'
+     openai_api_key: ${env.OPENAI_API_KEY}
examples/azure-run.yaml (1)
34-37: Hardcoded Azure endpoint in example configuration. The `api_base` is hardcoded to a specific test endpoint (`https://ols-test.openai.azure.com/`). For an example configuration, consider using an environment variable to make it more reusable.
Suggested fix
  - provider_id: azure
    provider_type: remote::azure
    config:
      api_key: ${env.AZURE_API_KEY}
-     api_base: https://ols-test.openai.azure.com/
+     api_base: ${env.AZURE_API_BASE}
      api_version: 2024-02-15-preview
run.yaml (1)
60-63: Use environment variable for braintrust API key. Same issue as `run-ci.yaml`: the placeholder `'********'` should be replaced with `${env.OPENAI_API_KEY}` for consistency and functionality.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
tests/e2e/rag/kv_store.db is excluded by !**/*.db
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (26)
- .github/workflows/e2e_tests_providers.yaml
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/vllm-granite-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/gemini-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-llama-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
- tests/e2e/features/faiss.feature
- docker-compose.yaml
- docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (2)
tests/e2e/test_list.txt
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain test list in
tests/e2e/test_list.txt
Files:
tests/e2e/test_list.txt
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Check `pyproject.toml` for supported Python versions before development
Always check `pyproject.toml` for existing dependencies before adding new ones
Always verify current library versions in `pyproject.toml` rather than assuming versions
Use pylint with `source-roots = "src"` configuration
Files:
pyproject.toml
🧠 Learnings (9)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory
Applied to files:
tests/e2e/test_list.txt
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- examples/vllm-rhoai.yaml
- examples/azure-run.yaml
- examples/vllm-rhaiis.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-rhaiis.yaml
- examples/run.yaml
- tests/e2e/configs/run-ci.yaml
- examples/vertexai-run.yaml
- tests/e2e/configs/run-rhelai.yaml
- examples/vllm-rhelai.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-vertexai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
pyproject.toml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.
Applied to files:
pyproject.toml
🪛 YAMLlint (1.37.1)
examples/run.yaml
[error] 58-58: syntax error: mapping values are not allowed here
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
- GitHub Check: E2E: server mode / ci
🔇 Additional comments (30)
pyproject.toml (1)
63-63: Add an explanatory comment for consistency with other dependencies. The version `einops>=0.8.1` is correct (latest release published Feb 9, 2025). However, unlike all other dependencies in this file, `einops` lacks an explanatory comment. Add a comment clarifying its purpose, likely related to FAISS/RAG vectorization support given the PR context and the project's default RAG configuration using FAISS.
examples/vllm-rhelai.yaml (5)
53-57: Safety providers disabled - verify this is intentional for the example. The safety providers are disabled with a comment about an "infinite loop issue with LLM calls." While this is documented, shipping an example configuration with safety disabled could lead users to run unprotected deployments.
Consider:
- Adding a more prominent warning in documentation about this limitation.
- Tracking this as a known issue to be resolved before GA.
Is there a tracking issue for the infinite loop problem mentioned in the comment?
66-72: LGTM! The FAISS vector_io provider configuration looks correct with appropriate persistence settings using the `kv_default` backend.
73-110: LGTM! The agents, batches, datasetio, and eval provider configurations are well-structured with appropriate namespace organization and backend references.
153-157: LGTM! The vector_stores configuration correctly sets FAISS as the default provider with the nomic embedding model, aligning with the PR objectives for full FAISS support.
160-161: LGTM! Telemetry is enabled without an explicit provider block, which is the correct approach for Llama Stack version 0.3.x based on learnings.
examples/vllm-rhaiis.yaml (5)
1-15: LGTM on basic structure. The version and APIs configuration looks appropriate for the FAISS VectorIO provider support. Minor note: Line 15 has trailing whitespace that could be cleaned up.
66-72: FAISS vector_io configuration looks correct. The inline FAISS provider is properly configured with persistence settings using the `kv_default` backend. This aligns well with the PR objective of fully supporting the FAISS VectorIO provider.
114-121: Same environment variable conflict applies to storage backends. Lines 118 and 121 have the same `SQLITE_STORE_DIR` pattern issue flagged earlier. Both `kv_default` and `sql_default` backends would point to the same path if the env var is set.
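Treating the variable as a directory prefix and appending distinct filenames keeps the two backends separate, for example (a sketch mirroring the fix suggested elsewhere in this review):

```yaml
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/kv_store.db
    sql_default:
      type: sql_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db
```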
143-146: Shield naming may be misleading. The shield is named `llama-guard` but uses `openai/gpt-4o-mini` as the backend (`provider_shield_id`). This could confuse users expecting Meta's Llama Guard safety model. Consider renaming to reflect the actual backend, or add a comment clarifying this is a placeholder configuration.
Per the PR description mentioning "revoked support for shields due to unusability", is this configuration intentionally a workaround, or should it be removed/commented out?
160-161: Telemetry configuration is correct. The simplified telemetry configuration with just `enabled: true` aligns with Llama Stack 0.3.x requirements where provider blocks are not needed for telemetry. Based on learnings from this repository.
tests/e2e/configs/run-rhaiis.yaml (4)
144-147: Verify `provider_shield_id` for llama-guard shield. The shield is named `llama-guard` but uses `openai/gpt-4o-mini` as the `provider_shield_id`. This appears to route safety checks through OpenAI's API rather than a Llama Guard model. Please confirm this is intentional for the e2e test configuration.
67-73: FAISS vector_io provider configuration looks correct. The FAISS provider is properly configured with the `kv_default` persistence backend, which is defined in the storage section. This aligns with the PR objective to fully support the FAISS VectorIO provider.
154-158: Vector stores configuration is consistent with provider definitions. The default embedding model correctly references the `sentence-transformers` provider defined at lines 43-45, and FAISS is properly set as the default vector_io provider.
161-162: Telemetry configuration is appropriate for Llama Stack 0.3.x. Based on learnings, telemetry provider configuration is not supported under the `providers` section in Llama Stack 0.3.x, so enabling it with just `enabled: true` is correct.
tests/e2e/test_list.txt (1)
1-1: LGTM! The new `features/faiss.feature` entry correctly maintains the test list as per the coding guidelines, enabling FAISS e2e test discovery for the new vector store functionality.
examples/vllm-rhoai.yaml (3)
1-27: Configuration structure looks good with environment-driven paths. The centralized storage layout with `${env.SQLITE_STORE_DIR:=~/.llama/storage/...}` templating provides good flexibility for deployment environments. The separation of conversations, inference, and metadata stores is well-organized.
66-72: FAISS vector_io provider correctly configured. The FAISS provider with `kv_default` backend persistence aligns with the PR's objective to fully support the FAISS VectorIO provider.
160-161: Review comment contains factual errors and incorrect assumptions. The file names are incorrect: `run-azure.yaml` and `run-vertexai.yaml` do not exist; the actual files are `azure-run.yaml` and `vertexai-run.yaml`. More importantly, the claim is inaccurate: only `azure-run.yaml` includes `telemetry` in its `apis` list (line 12), while `vertexai-run.yaml` does not.
Additionally, based on Llama Stack 0.3.x documentation, telemetry provider configuration is not supported under the `providers` section; telemetry is enabled via `telemetry.enabled: true` without requiring an explicit `apis` entry. The `vllm-rhoai.yaml` configuration is following the correct pattern. The inconsistency with `azure-run.yaml` warrants investigation, but it does not indicate that `vllm-rhoai.yaml` is misconfigured.
Likely an incorrect or invalid review comment.
tests/e2e/configs/run-vertexai.yaml (2)
31-44: Vertex AI inference provider configuration looks correct. The provider setup with project/location environment variables and the `allowed_models` restriction to `google/gemini-2.5-flash` is appropriate for controlled e2e testing.
149-157: Vector stores and telemetry configuration is consistent. The default FAISS provider with `nomic-ai/nomic-embed-text-v1.5` embedding model aligns with the centralized vector store pattern across the PR. Based on learnings, `telemetry.enabled: true` without a provider block is the correct approach for Llama Stack 0.3.x.
tests/e2e-prow/rhoai/configs/run.yaml (2)
1-162: Configuration structure is well-organized for RHOAI prow testing. The comprehensive provider setup with FAISS vector_io, environment-driven storage paths, and centralized backend configuration aligns with the PR's objectives.
139-143: The `provider_model_id: null` value is intentional for RHOAI vllm configurations. This pattern is consistently used across both the test configuration and the official example configuration (examples/vllm-rhoai.yaml), while other vllm environments (RHELAI, RHAIIS) use environment variables instead. RHOAI vllm is a supported provider in LCS, and the null value appears to be the expected configuration for this environment.
tests/e2e/configs/run-azure.yaml (1)
66-72: FAISS vector_io configuration is consistent across environments. The FAISS provider setup matches the pattern used in vertexai and rhoai configs, ensuring consistent behavior across different inference backends.
examples/vertexai-run.yaml (3)
4-14: `telemetry` API not listed but `telemetry.enabled` is set. Similar to `examples/vllm-rhoai.yaml`, the `apis` list here omits `telemetry` while the test configs (`run-azure.yaml`, `run-vertexai.yaml`) explicitly include it. Verify whether this inconsistency between examples and test configs is intentional.
30-43: Vertex AI provider configuration is appropriate for the example. The `allowed_models` restriction and environment-driven project/location settings provide a clear reference implementation.
148-156: Vector stores and safety defaults are correctly configured. The FAISS default provider and `nomic-ai/nomic-embed-text-v1.5` embedding model provide a complete vector store setup aligned with the PR objectives.
docs/providers.md (2)
103-103: Documentation update looks correct. The change from ✅ to ❌ for llama-guard aligns with the PR description stating shields support was revoked due to unusability issues.
160-160: LGTM: rag-runtime now marked as supported. This correctly reflects the FAISS VectorIO provider support being added in this PR.
run.yaml (1)
163-164: Telemetry configuration follows correct pattern. Per retrieved learnings, telemetry in Llama Stack 0.3.x should use `telemetry.enabled: true` without a provider block. This is correctly implemented.
  vector_stores:
-   default_provider_id: documentation_faiss
+   default_provider_id: faiss
    default_embedding_model:
      provider_id: sentence-transformers
-     model_id: all-mpnet-base-v2
+     model_id: nomic-ai/nomic-embed-text-v1.5
Missing embedding model registration.
default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but no embedding models are registered in registered_resources.models. The sentence-transformers provider is configured (lines 43-45), but without a registered embedding model, RAG operations may fail.
Register the embedding model
registered_resources:
models:
- model_id: ${env.RHEL_AI_MODEL}
provider_id: vllm
model_type: llm
provider_model_id: ${env.RHEL_AI_MODEL}
+ - model_id: nomic-ai/nomic-embed-text-v1.5
+ model_type: embedding
+ provider_id: sentence-transformers
+ provider_model_id: nomic-ai/nomic-embed-text-v1.5
+ metadata:
+ embedding_dimension: 768
shields:🤖 Prompt for AI Agents
In @tests/e2e/configs/run-rhelai.yaml around lines 154 - 158, The config
references default_embedding_model (provider_id sentence-transformers, model_id
nomic-ai/nomic-embed-text-v1.5) but no embedding model is registered under
registered_resources.models; add a model entry in registered_resources.models
that registers this embedding model using the same provider_id and model_id so
the RAG/vector_stores flow can resolve it (ensure the entry key/name matches any
expected reference and use the symbols registered_resources.models and
default_embedding_model to locate where to add the registration).
f2b7f5e to 57b58a4
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/e2e/configs/run-vertexai.yaml (1)
137-153: Register the embedding model in `registered_resources.models`. The `default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5` with provider `sentence-transformers`, but `registered_resources.models` is empty. This differs from the pattern in `run-ci.yaml`, which explicitly registers its embedding model. Explicit registration ensures consistent model discovery and lifecycle management.
🔧 Suggested fix: Register the embedding model
  registered_resources:
-   models: []
+   models:
+     - model_id: nomic-ai/nomic-embed-text-v1.5
+       model_type: embedding
+       provider_id: sentence-transformers
+       provider_model_id: nomic-ai/nomic-embed-text-v1.5
+       metadata:
+         embedding_dimension: 768
    shields:
🤖 Fix all issues with AI agents
In `@examples/vllm-rhoai.yaml`:
- Around line 17-27: The three store entries (conversations_store,
inference_store, metadata_store) incorrectly embed the full filepath into the
SQLITE_STORE_DIR env var, causing all stores to resolve to the same file when
SQLITE_STORE_DIR is set; change each db_path to treat SQLITE_STORE_DIR as a
directory by using the env default for the directory only and append the
specific filenames (e.g., conversations.db, inference-store.db, registry.db)
outside the variable expansion, and apply the same directory-only fix to the
other occurrences of SQLITE_STORE_DIR elsewhere in the file.
- Around line 143-146: The shield entry is misconfigured: update the shields
block so the guard references a registered model instead of an external
provider_shield_id; register a guard model in the models section (e.g., add a
model entry with an id you will reference, provider_id set to openai or vllm,
and provider_model_id like meta-llama/Llama-Guard-3-8B), then change the shield
to reference that model id via shield_id and set provider_shield_id: null (or
remove provider_shield_id) so the inline::llama-guard provider uses the
registered model; ensure the model id used in the shield matches the new models
entry and that provider_id/provider_model_id are correctly populated.
In `@tests/e2e-prow/rhoai/configs/run.yaml`:
- Around line 139-158: The config references an embedding model under
default_embedding_model (model_id: nomic-ai/nomic-embed-text-v1.5) that is not
listed in the models array; add a models entry for that embedding model so it is
registered (e.g., create a new model object with model_id set to
"nomic-ai/nomic-embed-text-v1.5", provider_id "sentence-transformers", and
model_type "embedding" — include provider_model_id or other required fields
consistent with the existing models schema).
In `@tests/e2e/configs/run-ci.yaml`:
- Around line 18-28: The db_path patterns for conversations_store,
inference_store and metadata_store use SQLITE_STORE_DIR as a full-file default
which causes all paths to collapse to the same value when the env var is set;
change each to treat SQLITE_STORE_DIR as a directory (e.g. use SQLITE_STORE_DIR
defaulting to ~/.llama/storage and then append the specific filename) so
conversations_store.db_path, inference_store.db_path and metadata_store.db_path
resolve to distinct files, and make the same change for
storage.backends.kv_default.db_path and storage.backends.sql_default.db_path
(also update the analogous entries around the referenced 118-123 region).
In `@tests/e2e/configs/run-rhaiis.yaml`:
- Around line 154-158: The config's vector_stores.default_embedding_model
references provider_id "sentence-transformers" and model_id
"nomic-ai/nomic-embed-text-v1.5" but that model is not present in
registered_resources.models; either add an entry for
"nomic-ai/nomic-embed-text-v1.5" to registered_resources.models (matching the
provider_id and model_id fields) or change vector_stores.default_embedding_model
to a model_id that is already registered by the sentence-transformers provider
so the system can resolve the embedding model at runtime.
♻️ Duplicate comments (25)
examples/vllm-rhaiis.yaml (3)
17-27: Environment variable usage causes path conflicts when set. This issue was previously flagged. All `db_path` entries use the same `${env.SQLITE_STORE_DIR:=...}` variable. If a user sets `SQLITE_STORE_DIR=/custom/path`, all stores will resolve to the same path, causing database conflicts.
46-52: Mismatched environment variable for file storage directory. This issue was previously flagged. `storage_dir` uses `${env.SQLITE_STORE_DIR:=...}`, which is semantically inconsistent for file storage.
153-157: Add `nomic` package dependency for embedding model. This issue was previously flagged. The `nomic-ai/nomic-embed-text-v1.5` model requires the `nomic` package in `pyproject.toml`.
examples/vertexai-run.yaml (1)
17-27: Same environment variable path conflict pattern. All `db_path` entries use `${env.SQLITE_STORE_DIR:=...}`. When this env var is set, all stores will point to the same path, causing data corruption. This is the same systemic issue affecting other configuration files in this PR.
examples/vllm-rhelai.yaml (3)
15-15: Minor: Trailing whitespace. This was previously flagged.
17-27: Environment variable path conflict pattern. This systemic issue was previously flagged across configuration files.
53-57: Configuration inconsistency: Shield registered but no safety provider. This was previously flagged. The shield is registered at lines 143-146 with `provider_id: llama-guard`, and `default_shield_id: llama-guard` is set at line 159, but the `safety` providers array is empty (line 53). This will cause runtime errors. Either remove the shield registration and `default_shield_id`, or configure the safety provider (uncomment lines 54-57).
Also applies to: 143-146, 158-159
examples/run.yaml (5)
65-68: Placeholder API key for braintrust scoring provider. This was previously flagged. The `openai_api_key: '********'` is a placeholder that will fail at runtime. Use an environment variable instead.
119-120: Duplicate `telemetry` block. This was previously flagged. The `telemetry` section appears twice: at lines 119-120 and again at lines 165-166. Remove the duplicate.
Also applies to: 165-166
146-162: Empty models list but embedding model referenced. This was previously flagged. `registered_resources.models` is empty ([]), but `default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5`. Consider registering the embedding model.
23-34: Same environment variable path conflict pattern. All `db_path` entries use the same `${env.SQLITE_STORE_DIR:=...}` variable with different defaults. This systemic issue affects multiple configuration files.
54-61: Critical YAML syntax error: missing newline breaks configuration. Line 58 has `provider_type: inline::llama-guard scoring:`; the `scoring:` key is incorrectly placed on the same line. This will cause a YAML parse error and prevent the configuration from loading.
🐛 Fix the syntax error
safety:
- config:
    excluded_categories: []
  provider_id: llama-guard
- provider_type: inline::llama-guard scoring:
- - provider_id: basic
+ provider_type: inline::llama-guard
+ scoring:
+ - provider_id: basic
  provider_type: inline::basic
  config: {}
run.yaml (2)
17-28: Same environment variable path conflict pattern. All `db_path` entries use the same `${env.SQLITE_STORE_DIR:=...}` variable. This is the same systemic issue affecting all configuration files in this PR. Consider using separate environment variables per store or restructuring to use the env var as a directory prefix.
60-63: Placeholder API key for braintrust scoring provider. This was previously flagged. The `openai_api_key: '********'` is a placeholder that will fail at runtime if the braintrust scoring provider is used.
tests/e2e/configs/run-rhelai.yaml (2)
54-58: Safety configuration inconsistency remains unresolved. The safety provider is disabled (`safety: []`) but shields are still registered and `default_shield_id` is set. This creates an inconsistent state that may cause confusing behavior.
Also applies to: 144-147, 159-160
154-158: Embedding model not registered. The `default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5`, but it's not in `registered_resources.models`.
tests/e2e/configs/run-ci.yaml (2)
60-63: Braintrust API key placeholder should use environment variable. The `openai_api_key: '********'` is a placeholder that will cause failures if this provider is used. Per project guidelines, use environment variables for sensitive data. Based on learnings.
161-165: Embedding model mismatch persists. The `default_embedding_model` uses `nomic-ai/nomic-embed-text-v1.5`, but only `sentence-transformers/all-mpnet-base-v2` is registered. Either register the nomic model or change the default to the registered model.
tests/e2e/configs/run-rhaiis.yaml (3)
16-16: Trailing whitespace after `- vector_io`. Line 16 has trailing whitespace which should be removed.
18-28: Environment variable pattern issue for database paths. The `SQLITE_STORE_DIR` pattern issue flagged previously still exists. Apply directory-based pattern for consistency.
Also applies to: 115-122
46-53: Files provider `storage_dir` uses same env var inconsistently. This expects a directory while other usages expect file paths. Should align with directory-based pattern.
tests/e2e/configs/run-azure.yaml (2)
137-157: Missing embedding model registration for e2e tests. Same as the example config: `default_embedding_model` references `nomic-ai/nomic-embed-text-v1.5` but no embedding model is registered in `registered_resources.models`. This will likely cause the FAISS RAG e2e tests to fail when attempting to generate embeddings.
Register the embedding model
  registered_resources:
    models:
      - model_id: gpt-4o-mini
        provider_id: azure
        model_type: llm
        provider_model_id: gpt-4o-mini
+     - model_id: nomic-ai/nomic-embed-text-v1.5
+       provider_id: sentence-transformers
+       model_type: embedding
+       provider_model_id: nomic-ai/nomic-embed-text-v1.5
+       metadata:
+         embedding_dimension: 768
    shields:
37-37: Update Azure OpenAI API version to current release. The `api_version: 2024-02-15-preview` is outdated. For e2e tests, using an up-to-date API version ensures tests validate against current Azure OpenAI behavior.
examples/azure-run.yaml (2)
37-37: Update Azure OpenAI API version to current release. The `api_version: 2024-02-15-preview` is outdated. The latest GA version is `2024-06-01`, with newer preview versions available. Update to at least the stable release.
137-157: Missing embedding model registration. The `default_embedding_model` at lines 155-157 references `nomic-ai/nomic-embed-text-v1.5` with the `sentence-transformers` provider, but no embedding model is registered in `registered_resources.models`. This may cause runtime failures when vector operations attempt to use the embedding model.
Register the embedding model
  registered_resources:
    models:
      - model_id: gpt-4o-mini
        provider_id: azure
        model_type: llm
        provider_model_id: gpt-4o-mini
+     - model_id: nomic-ai/nomic-embed-text-v1.5
+       provider_id: sentence-transformers
+       model_type: embedding
+       provider_model_id: nomic-ai/nomic-embed-text-v1.5
+       metadata:
+         embedding_dimension: 768
    shields:
🧹 Nitpick comments (4)
tests/e2e/configs/run-vertexai.yaml (2)
18-28: Environment variable pattern may cause database path collisions. All `db_path` entries use `${env.SQLITE_STORE_DIR:=<full_path>}` where the env var completely replaces the path. If `SQLITE_STORE_DIR` is set, all databases point to that same value.
Consider using the env var as a directory base:
🔧 Suggested fix
conversations_store:
-   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
    type: sqlite
...
inference_store:
-   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
    type: sqlite
metadata_store:
-   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
    type: sqlite
116-122: Same `SQLITE_STORE_DIR` pattern issue in storage backends. Apply the same fix to storage backend paths for consistency.
🔧 Suggested fix
kv_default:
  type: kv_sqlite
- db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
+ db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/rag/kv_store.db
sql_default:
  type: sql_sqlite
- db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}
+ db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.db
tests/e2e/configs/run-rhelai.yaml (1)
18-28: Environment variable pattern may cause database path collisions. Same `SQLITE_STORE_DIR` pattern issue as other config files. If the env var is set, all database paths resolve to the same value.
🔧 Apply directory-based pattern
conversations_store:
-   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
Apply similarly to `inference_store`, `metadata_store`, and `storage.backends` paths.
Also applies to: 119-119, 122-122
tests/e2e-prow/rhoai/configs/run.yaml (1)
18-28: Environment variable pattern may cause database path collisions. All `db_path` entries use `${env.SQLITE_STORE_DIR:=<full_path>}`. If `SQLITE_STORE_DIR` is set, all databases point to that same value, causing conflicts.
🔧 Use directory-based pattern
conversations_store:
-   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+   db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
Apply to all `db_path` entries.
Also applies to: 119-119, 122-122
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
tests/e2e/rag/kv_store.db is excluded by !**/*.db
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/openai-pgvector-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-faiss-run.yaml
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
🚧 Files skipped from review as they are similar to previous changes (6)
- pyproject.toml
- tests/e2e/test_list.txt
- tests/e2e/features/faiss.feature
- docs/providers.md
- docker-compose.yaml
- docker-compose-library.yaml
🧰 Additional context used
🧠 Learnings (5)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- tests/e2e/configs/run-azure.yaml
- examples/vllm-rhoai.yaml
- examples/azure-run.yaml
- examples/vertexai-run.yaml
- tests/e2e/configs/run-rhaiis.yaml
- examples/run.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhelai.yaml
- examples/vllm-rhaiis.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-vertexai.yaml
- examples/vllm-rhelai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
examples/vllm-rhaiis.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml
[error] 58-58: syntax error: mapping values are not allowed here
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (18)
examples/vllm-rhoai.yaml (6)
38-52: LGTM on provider additions. The OpenAI and sentence-transformers provider configurations are correctly structured. Note that the storage_dir on line 50 is affected by the same environment variable issue flagged above.
58-72: LGTM! The scoring, tool_runtime, and vector_io provider configurations are well-structured. The FAISS vector_io provider with persistence to the kv_default backend aligns with the PR objective of fully supporting the FAISS VectorIO provider.
73-110: LGTM! Provider configurations for agents, batches, datasetio, and eval follow consistent patterns with proper namespace isolation and backend references.
111-121: Storage backend configuration is consistent. The storage backends correctly define kv_default and sql_default, which are referenced by providers. The environment variable issue on lines 118 and 121 is covered in the earlier comment.
142-142: Verify if provider_model_id: null is intentional. Setting provider_model_id to null may indicate auto-detection or could be an incomplete configuration. Please confirm this is the intended behavior for the vLLM provider.
153-161: LGTM! The vector_stores configuration correctly sets FAISS as the default provider with the nomic-ai embedding model, aligning with the PR objectives. Based on learnings, the telemetry configuration with just
enabled: trueis correct for Llama Stack 0.3.x.examples/vllm-rhaiis.yaml (1)
53-72: LGTM: Safety and vector_io providers are properly configured.The safety provider (
llama-guard) is correctly defined, matching the shield registration at lines 143-146. The vector_io faiss provider with kv_default persistence is properly configured.examples/vertexai-run.yaml (2)
30-43: LGTM: Vertex AI inference provider configuration. The Google Vertex AI remote provider is properly configured with environment-driven project/location settings and an allowed_models list. The sentence-transformers inline provider for embeddings is also correctly defined.
148-156: No action needed. The configuration file specifies a Hugging Face model ID (nomic-ai/nomic-embed-text-v1.5) that will be resolved by Llama Stack at runtime, not a direct Python package dependency. Thenomicpackage is neither imported in the codebase nor required as a project dependency.Likely an incorrect or invalid review comment.
examples/vllm-rhelai.yaml (1)
66-72: LGTM: FAISS vector_io provider configuration.The faiss provider with kv_default persistence namespace is correctly configured for RAG storage.
run.yaml (2)
139-155: LGTM: Model registration and tool groups.The
gpt-4o-minimodel is properly registered with the OpenAI provider, and the RAG tool group is correctly configured with therag-runtimeprovider.
156-164: No action needed. The configuration is correct.The
nomic-ai/nomic-embed-text-v1.5model is compatible with sentence-transformers and is already in use throughout the codebase (run.yaml, e2e configs, examples). Thesentence-transformers>=5.0.0dependency is present in pyproject.toml. No separate "nomic" package dependency is required—sentence-transformers automatically downloads and loads HuggingFace models.Likely an incorrect or invalid review comment.
tests/e2e/configs/run-vertexai.yaml (1)
156-157: LGTM!Telemetry configuration is correct - using
telemetry.enabled: truewithout a provider block, which aligns with Llama Stack 0.3.x requirements. Based on learnings.tests/e2e/configs/run-rhelai.yaml (1)
30-111: LGTM!Provider structure is well-organized with proper list markers and indentation. The YAML structure issues from past reviews have been addressed.
tests/e2e-prow/rhoai/configs/run.yaml (2)
140-143: Verifyprovider_model_id: nullis intentional.The model registration has
provider_model_id: null. Confirm this is intended for dynamic model resolution by vllm, or if it should matchmodel_id.
30-111: LGTM!Provider configuration structure is consistent and well-organized.
tests/e2e/configs/run-ci.yaml (1)
64-74: LGTM!Good inline comments explaining the RAG tool and FAISS storage configuration. The structure is clear and self-documenting.
tests/e2e/configs/run-rhaiis.yaml (1)
92-111: LGTM!The
datasetioandevalprovider sections now have correct YAML structure with proper list markers. The structure issues from past reviews have been addressed.
57b58a4 to 0d28b3d Compare
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@examples/vertexai-run.yaml`:
- Around line 136-147: registered_resources currently has models: [] despite
inference providers being configured; add explicit model registrations under the
registered_resources.models array for each allowed provider/model (e.g., add
entries referencing google/gemini-2.5-flash for the google-vertex provider and
any OpenAI or sentence-transformers models you intend to use) so that the system
does not rely on implicit auto-registration; update the models array to include
model objects with model_id/model_name, provider_id (matching provider
definitions like google-vertex, openai, sentence-transformers), and any required
fields such as default/allowed flags to align with existing provider configs.
In `@run.yaml`:
- Around line 17-27: The env pattern currently uses
${env.SQLITE_STORE_DIR:=<full_path>} as the entire DB path which makes
conversations_store, inference_store, and metadata_store resolve to the same
file when SQLITE_STORE_DIR is set; change each path to use the env var as a
directory prefix (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db) so each store (look
for keys conversations_store, inference_store, metadata_store and the other
occurrences mentioned at lines ~45, ~120, ~123) points to its own filename under
the provided directory.
In `@tests/e2e/configs/run-vertexai.yaml`:
- Around line 149-157: The config uses vector_stores.default_embedding_model
with provider_id "sentence-transformers" and model_id
"nomic-ai/nomic-embed-text-v1.5" but that model is not listed under
registered_resources.models (unlike run-ci.yaml which registers
sentence-transformers/all-mpnet-base-v2); either add an entry for
"sentence-transformers" -> model "nomic-ai/nomic-embed-text-v1.5" to
registered_resources.models in this YAML, or explicitly document/verify that the
sentence-transformers provider will auto-register unlisted models and add a
comment or test to assert that auto-registration behavior for
vector_stores.default_embedding_model works as intended.
In `@tests/e2e/features/faiss.feature`:
- Around line 8-22: The Authorization header in the faiss.feature step "I set
the Authorization header to Bearer ..." contains a truncated JWT and must be
replaced with a complete, valid test token (three base64url parts) or pulled
from a test config/env var; also avoid hardcoding the vector store id
"vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee" in the response assertion — instead
read the expected id from a fixture/config or change the assertion in the "Then
The body of the response has the following structure" step to verify the rags
array contains an id pattern (or presence) rather than an exact hardcoded value
so the test is not tightly coupled to specific test data.
♻️ Duplicate comments (16)
examples/vllm-rhoai.yaml (1)
144-147: Shield configuration still incomplete for inline::llama-guard. This issue was previously flagged: the
inline::llama-guard provider requires a registered Llama Guard model. The current provider_shield_id: openai/gpt-4o-mini is invalid for this provider type; it should be null, and shield_id should reference a registered guard model (e.g., meta-llama/Llama-Guard-3-8B).
tests/e2e-prow/rhoai/configs/run.yaml (1)
139-158: Embedding model referenced but not registered in models list. The
default_embedding_model at line 158 references nomic-ai/nomic-embed-text-v1.5 with provider sentence-transformers, but only meta-llama/Llama-3.2-1B-Instruct is registered in registered_resources.models. Consider adding the embedding model registration for consistency and to prevent potential runtime issues if the system requires explicit model registration.
examples/vllm-rhelai.yaml (1)
16-16: Minor: Trailing whitespace on line 16.This line has trailing whitespace after the APIs list which should be removed for cleaner formatting.
tests/e2e/features/faiss.feature (1)
24-34: Same truncated JWT token issue.This scenario has the same incomplete JWT token as flagged in past review comments. Please provide a complete, valid JWT with all three segments.
tests/e2e/configs/run-rhaiis.yaml (2)
16-16: Trailing whitespace persists on line 16.The line after
- vector_iocontains trailing whitespace which should be removed.
154-162: Embedding model registration question applies here as well.Same consideration as other configs - the
nomic-ai/nomic-embed-text-v1.5model may need explicit registration inregistered_resources.modelsdepending on whether sentence-transformers auto-registers models.Telemetry configuration is correct per Llama Stack 0.3.x conventions. Based on learnings, this is the proper approach.
examples/vllm-rhaiis.yaml (1)
153-161: Verifynomicpackage dependency for embedding model.The
nomic-ai/nomic-embed-text-v1.5model may require thenomicpackage as a runtime dependency. A previous review comment noted that whileeinops>=0.8.1is inpyproject.toml, thenomicpackage might be missing.run.yaml (2)
139-160: Missing embedding model registration.The
default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5(line 160), but no embedding model is registered inregistered_resources.models(lines 140-144). Only the LLM modelgpt-4o-miniis registered, which will cause RAG operations to fail.Register the embedding model
registered_resources: models: - model_id: gpt-4o-mini provider_id: openai model_type: llm provider_model_id: gpt-4o-mini + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 + metadata: + embedding_dimension: 768 shields:
60-63: Placeholder API key for braintrust scoring provider.The
openai_api_key: '********'is a redacted placeholder that will fail at runtime if the braintrust scoring provider is used. Per learnings, use environment variables for sensitive data.🔧 Suggested fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}examples/azure-run.yaml (1)
137-157: Missing embedding model registration.The
default_embedding_modelat line 155-157 referencesnomic-ai/nomic-embed-text-v1.5, but no embedding model is registered inregistered_resources.models(lines 138-142). Only the LLM modelgpt-4o-miniis registered. This will cause RAG operations to fail when the system tries to resolve the embedding model.Register the embedding model
registered_resources: models: - model_id: gpt-4o-mini provider_id: azure model_type: llm provider_model_id: gpt-4o-mini + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 + metadata: + embedding_dimension: 768 shields:tests/e2e/configs/run-rhelai.yaml (1)
154-158: Missing embedding model registration.The
default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5, but no embedding model is registered inregistered_resources.models(lines 139-143). Only the LLM model${env.RHEL_AI_MODEL}is registered. This inconsistency will cause RAG operations to fail.Register the embedding model
registered_resources: models: - model_id: ${env.RHEL_AI_MODEL} provider_id: vllm model_type: llm provider_model_id: ${env.RHEL_AI_MODEL} + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 + metadata: + embedding_dimension: 768 shields:tests/e2e/configs/run-ci.yaml (2)
60-63: Placeholder API key for braintrust scoring provider.The
openai_api_key: '********'is a placeholder that will fail at runtime if the braintrust scoring provider is used. Per project guidelines from learnings, use environment variables for sensitive data.🔧 Suggested fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
161-165: Embedding model mismatch: default model not registered.The
default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5(line 165), but the registered embedding model issentence-transformers/all-mpnet-base-v2(lines 145-150). This inconsistency may cause runtime failures when RAG tries to use the unregistered default model.Option A: Register the nomic embedding model
- model_id: sentence-transformers/all-mpnet-base-v2 model_type: embedding provider_id: sentence-transformers provider_model_id: sentence-transformers/all-mpnet-base-v2 metadata: embedding_dimension: 768 + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 + metadata: + embedding_dimension: 768Option B: Use the registered model as default
vector_stores: default_provider_id: faiss default_embedding_model: provider_id: sentence-transformers - model_id: nomic-ai/nomic-embed-text-v1.5 + model_id: sentence-transformers/all-mpnet-base-v2examples/run.yaml (3)
58-58: Critical YAML syntax error:scoring:key on same line breaks parsing.Line 58 has
provider_type: inline::llama-guard scoring:— thescoring:key is incorrectly placed on the same line as the value. This is invalid YAML and will cause a parse error, preventing the configuration from loading.🐛 Fix the syntax error
safety: - config: excluded_categories: [] provider_id: llama-guard - provider_type: inline::llama-guard scoring: + provider_type: inline::llama-guard + scoring: - provider_id: basic provider_type: inline::basic
119-120: Duplicatetelemetryblocks.The
telemetrysection appears twice: at lines 119-120 and again at lines 175-176. This is likely a copy-paste error and may cause unexpected YAML parsing behavior where the second occurrence overrides the first.Remove the duplicate at lines 119-120
scoring_fns: [] -telemetry: - enabled: true server: port: 8321Also applies to: 175-176
168-172: Embedding model mismatch: default differs from registered.The
default_embedding_modelusesnomic-ai/nomic-embed-text-v1.5(line 172), but the registered embedding model issentence-transformers/all-mpnet-base-v2(lines 152-157). Either register the nomic model or align the default to the registered model.Option A: Register the nomic model
- model_id: sentence-transformers/all-mpnet-base-v2 model_type: embedding provider_id: sentence-transformers provider_model_id: sentence-transformers/all-mpnet-base-v2 metadata: embedding_dimension: 768 + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 + metadata: + embedding_dimension: 768Option B: Use registered model as default
vector_stores: default_provider_id: faiss default_embedding_model: provider_id: sentence-transformers - model_id: nomic-ai/nomic-embed-text-v1.5 + model_id: sentence-transformers/all-mpnet-base-v2
🧹 Nitpick comments (2)
tests/e2e/configs/run-azure.yaml (1)
16-16: Remove trailing whitespace. Line 16 contains trailing whitespace after the - vector_io entry, similar to other config files.
Proposed fix
- vector_io - +examples/vllm-rhaiis.yaml (1)
4-15: Missing telemetry in apis list. Unlike the other configuration files (run-vertexai.yaml, run-rhaiis.yaml, run-azure.yaml) which include - telemetry in their apis list, this file omits it. However, telemetry.enabled: true is set at line 160. For consistency with other configs and to ensure the telemetry API is available, consider adding - telemetry to the apis list. Also, line 15 contains trailing whitespace.
Proposed fix
- scoring +- telemetry - tool_runtime - vector_io - +
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- tests/e2e/rag/kv_store.db is excluded by !**/*.db
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/vllm-llama-faiss-run.yaml
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-pgvector-run.yaml
- examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
- pyproject.toml
- tests/e2e/test_list.txt
- docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (1)
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/faiss.feature
🧠 Learnings (6)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- tests/e2e-prow/rhoai/configs/run.yaml
- examples/vllm-rhaiis.yaml
- tests/e2e/configs/run-rhaiis.yaml
- examples/vllm-rhoai.yaml
- examples/vllm-rhelai.yaml
- tests/e2e/configs/run-rhelai.yaml
- examples/azure-run.yaml
- examples/vertexai-run.yaml
- examples/run.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-vertexai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
examples/vllm-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/features/faiss.feature
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml
[error] 58-58: syntax error: mapping values are not allowed here
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (30)
examples/vllm-rhoai.yaml (5)
17-28: Store paths are now distinct and consistent.The hardcoded paths for
conversations_store,inference_store, andmetadata_storeeach point to separate.dbfiles under~/.llama/storage/, which resolves the previous concern about path conflicts.
34-37: Note: TLS verification disabled for vLLM endpoint.
tls_verify: falseis acceptable for internal cluster traffic or development, but ensure this is intentional for your deployment environment. For production with external endpoints, enabling TLS verification is recommended.
67-73: FAISS vector_io provider correctly configured.The
inline::faissprovider withkv_defaultpersistence backend aligns with the PR objective to support the FAISS VectorIO provider.
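To make the wiring concrete, here is a minimal sketch combining the pieces discussed in these comments (provider, persistence backend, and default vector store). The field names and paths follow the configs reviewed in this PR and may differ in other Llama Stack versions:

providers:
  vector_io:
  - provider_id: faiss
    provider_type: inline::faiss
    config:
      persistence:
        namespace: vector_io::faiss   # stable namespace for FAISS data
        backend: kv_default           # references the kv_sqlite backend below
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: ~/.llama/storage/rag/kv_store.db
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5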
154-158: FAISS default and embedding model configuration looks good.Setting
faissas the default vector provider withnomic-ai/nomic-embed-text-v1.5for embeddings is appropriate for the FAISS VectorIO support objective.
161-162: Telemetry configuration is correct for Llama Stack 0.3.x.Using just
telemetry.enabled: truewithout a provider block is the correct approach for version 0.3.x. Based on learnings, telemetry provider configuration is not supported under theproviderssection in this version.tests/e2e-prow/rhoai/configs/run.yaml (3)
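For reference, the minimal form this describes is just the following top-level block, with no telemetry entry under providers:

telemetry:
  enabled: true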
1-28: LGTM! Configuration header and store paths are well-structured.The version, image_name, APIs list, and sqlite store configurations follow the expected pattern. The telemetry API addition aligns with the broader PR changes.
30-73: Provider configurations look correct.The providers section properly configures inference (vllm, openai, sentence-transformers), files, safety, scoring, tool_runtime, and vector_io with faiss. The faiss persistence namespace
vector_io::faisswithkv_defaultbackend follows the expected pattern.
161-162: Telemetry configuration is correct.Per learnings, in Llama Stack 0.3.x, telemetry can be enabled with just
telemetry.enabled: truewithout requiring an explicit provider block. This configuration follows that pattern correctly.examples/vertexai-run.yaml (2)
30-43: VertexAI inference provider configuration looks correct.The google-vertex provider properly uses environment variables for project and location, and specifies allowed_models. The sentence-transformers inline provider for embeddings is appropriately configured.
148-156: Vector stores and telemetry configuration is consistent.The faiss default provider and nomic-embed-text embedding model configuration matches the pattern used across other configuration files in this PR. Telemetry enablement follows the correct approach for Llama Stack 0.3.x.
examples/vllm-rhelai.yaml (3)
54-58: Safety provider is now properly configured.The previous review flagged an inconsistency where shields were registered but no safety provider existed. This has been addressed - the
llama-guardsafety provider is now configured withexcluded_categories: []. This resolves the configuration inconsistency.
138-147: Model and shield configuration aligns with provider setup.The model uses environment variables (
${env.RHEL_AI_MODEL}) which provides flexibility for different deployments. The shield configuration withllama-guardnow has a backing safety provider, resolving the previous inconsistency.
154-162: Vector stores and telemetry configuration is consistent with other configs.The faiss default provider, nomic-embed-text embedding model, and telemetry enablement follow the established pattern across all configuration files in this PR.
docs/providers.md (1)
160-160: rag-runtime support status correctly updated.The
rag-runtimetool runtime provider is now marked as supported (✅), which aligns with the PR's FAISS VectorIO integration and the addition of e2e tests for RAG functionality. The pip dependencies listed are comprehensive for RAG operations.docker-compose.yaml (1)
14-14: RAG volume mount supports FAISS e2e testing.The added volume mount maps the test RAG database (
./tests/e2e/rag) to the container's expected storage path (/opt/app-root/src/.llama/storage/rag). The test data directory exists with the expectedkv_store.dbfile. The:Zflag is appropriate for SELinux relabeling in containerized environments.tests/e2e/features/faiss.feature (1)
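A minimal compose sketch of that mount; the service name is chosen only for illustration:

services:
  llama-stack:   # hypothetical service name
    volumes:
      # expose the committed FAISS kv_store.db where the stack expects RAG storage
      - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z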
1-6: LGTM - Well-structured BDD feature file.The Background setup correctly initializes the service and REST API prefix, following proper Gherkin conventions as expected for behave e2e tests. Based on learnings, this aligns with the BDD framework guidelines.
tests/e2e/configs/run-vertexai.yaml (3)
2-2: LGTM - Appropriate image name for VertexAI configuration.
31-44: Well-structured inference providers.The provider configuration correctly sets up VertexAI as the primary inference provider with environment-based credentials, OpenAI as secondary, and sentence-transformers for embeddings. The inline sentence-transformers configuration aligns with the vector_stores default_embedding_model.
66-72: FAISS vector_io provider correctly configured.The persistence configuration with
namespace: vector_io::faissandbackend: kv_defaultproperly integrates with the storage backend defined at line 118.tests/e2e/configs/run-rhaiis.yaml (4)
1-3: LGTM - Version and image name updates.The configuration correctly identifies as version 2 with
rhaiis-configurationimage name.
17-28: Database paths now use hardcoded values - addresses previous env var concern.The
db_pathentries forconversations_store,inference_store, andmetadata_storenow use direct paths instead of the problematic${env.SQLITE_STORE_DIR:=...}pattern that would have caused path conflicts. This resolves the previously flagged issue.
43-45: Sentence-transformers provider correctly configured.The inline sentence-transformers provider with empty config aligns with the vector_stores default_embedding_model specification.
67-73: FAISS vector_io provider properly configured.Configuration matches the pattern used in other config files with correct namespace and backend references.
tests/e2e/configs/run-azure.yaml (4)
2-2: LGTM - Azure configuration image name.
31-44: Azure inference provider properly configured.The Azure provider configuration includes appropriate fields (
api_key,api_base,api_version) with environment variable substitution. The OpenAI and sentence-transformers providers are correctly set up as secondary options.
66-72: FAISS vector_io configuration consistent with other configs.
153-161: Vector stores and telemetry configuration looks good.The configuration follows the established pattern. The v1.5 embedding model choice was previously noted in past comments as having a v2 alternative available, but v1.5 remains a valid production choice.
examples/vllm-rhaiis.yaml (3)
1-3: LGTM - Version and image name updates.
17-27: Database paths now use hardcoded values - addresses previous env var concern.The
db_pathentries now use direct paths (~/.llama/storage/...) instead of the problematic${env.SQLITE_STORE_DIR:=...}pattern. This resolves the previously flagged path conflict issue.
46-52: Storage directory path now hardcoded - addresses previous concern.The
storage_dirnow uses a direct path instead of the mismatchedSQLITE_STORE_DIRenv var. This resolves the semantic inconsistency flagged previously.
registered_resources:
  models: []
  shields: []
  vector_dbs: []
  shields:
  - shield_id: llama-guard
    provider_id: llama-guard
    provider_shield_id: openai/gpt-4o-mini
  datasets: []
  scoring_fns: []
  benchmarks: []
  tool_groups:
  - toolgroup_id: builtin::rag
    provider_id: rag-runtime
No models registered despite having inference providers configured.
The models: [] on line 137 means no models are explicitly registered, yet the inference providers (google-vertex, openai, sentence-transformers) are configured. While some providers may auto-register models, this could cause issues if the system expects explicit model registration. Consider adding model entries for the allowed models (e.g., google/gemini-2.5-flash from line 36).
🔧 Suggested model registration
registered_resources:
- models: []
+ models:
+ - model_id: google/gemini-2.5-flash
+ provider_id: google-vertex
+ model_type: llm
+ provider_model_id: google/gemini-2.5-flash
shields:
- shield_id: llama-guard
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
registered_resources:
  models:
  - model_id: google/gemini-2.5-flash
    provider_id: google-vertex
    model_type: llm
    provider_model_id: google/gemini-2.5-flash
  shields:
  - shield_id: llama-guard
    provider_id: llama-guard
    provider_shield_id: openai/gpt-4o-mini
  datasets: []
  scoring_fns: []
  benchmarks: []
  tool_groups:
  - toolgroup_id: builtin::rag
    provider_id: rag-runtime
🤖 Prompt for AI Agents
In `@examples/vertexai-run.yaml` around lines 136 - 147, registered_resources
currently has models: [] despite inference providers being configured; add
explicit model registrations under the registered_resources.models array for
each allowed provider/model (e.g., add entries referencing
google/gemini-2.5-flash for the google-vertex provider and any OpenAI or
sentence-transformers models you intend to use) so that the system does not rely
on implicit auto-registration; update the models array to include model objects
with model_id/model_name, provider_id (matching provider definitions like
google-vertex, openai, sentence-transformers), and any required fields such as
default/allowed flags to align with existing provider configs.
run.yaml
Outdated
 conversations_store:
-  db_path: ~/.llama/storage/conversations.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
   type: sqlite
 datasets: []
 image_name: starter
 # external_providers_dir: /opt/app-root/src/.llama/providers.d
 inference_store:
-  db_path: ~/.llama/storage/inference-store.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
   type: sqlite
 metadata_store:
-  db_path: ~/.llama/storage/registry.db
+  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
Environment variable pattern causes path collisions when set.
All store paths use ${env.SQLITE_STORE_DIR:=<full_path>} with different default values. If SQLITE_STORE_DIR is set, all paths resolve to the same value, causing database collisions. The pattern should use the env var as a directory prefix, not the entire path.
🔧 Use directory-based pattern
conversations_store:
- db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db}
+ db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
type: sqlite
...
inference_store:
- db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db}
+ db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
type: sqlite
metadata_store:
- db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db}
+ db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
type: sqlite
Also apply to lines 45, 120, and 123.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
conversations_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db
  type: sqlite
datasets: []
image_name: starter
# external_providers_dir: /opt/app-root/src/.llama/providers.d
inference_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db
  type: sqlite
metadata_store:
  db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db
🤖 Prompt for AI Agents
In `@run.yaml` around lines 17 - 27, The env pattern currently uses
${env.SQLITE_STORE_DIR:=<full_path>} as the entire DB path which makes
conversations_store, inference_store, and metadata_store resolve to the same
file when SQLITE_STORE_DIR is set; change each path to use the env var as a
directory prefix (e.g.,
${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db) so each store (look
for keys conversations_store, inference_store, metadata_store and the other
occurrences mentioned at lines ~45, ~120, ~123) points to its own filename under
the provided directory.
Scenario: Verify vector store is registered
  Given The system is in default state
  And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
  And REST API service hostname is localhost
  And REST API service port is 8080
  When I access REST API endpoint rags using HTTP GET method
  Then The status code of the response is 200
  And the body of the response has the following structure
  """
  {
    "rags": [
      "vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee"
    ]
  }
  """
Truncated JWT token on line 10.
The Authorization Bearer token is incomplete—it's missing the signature segment. A valid JWT requires three Base64url-encoded parts separated by dots (header.payload.signature). The token ends abruptly at Ikpva which appears to be mid-payload.
Additionally, the hardcoded vector_store ID vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee tightly couples this test to specific test data. Consider documenting this dependency or making it configurable.
🤖 Prompt for AI Agents
In `@tests/e2e/features/faiss.feature` around lines 8 - 22, The Authorization
header in the faiss.feature step "I set the Authorization header to Bearer ..."
contains a truncated JWT and must be replaced with a complete, valid test token
(three base64url parts) or pulled from a test config/env var; also avoid
hardcoding the vector store id "vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee" in the
response assertion — instead read the expected id from a fixture/config or
change the assertion in the "Then The body of the response has the following
structure" step to verify the rags array contains an id pattern (or presence)
rather than an exact hardcoded value so the test is not tightly coupled to
specific test data.
0d28b3d to 9ae2114 Compare
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
examples/vllm-rhaiis.yaml (1)
4-15: Add telemetry to apis list to match all other configuration files. The file is inconsistent with the established pattern: every other YAML configuration file in the codebase includes - telemetry in the apis list when telemetry.enabled: true is configured. This file is the only exception. Add telemetry to maintain consistency.
Proposed fix
- telemetryin theapislist whentelemetry.enabled: trueis configured. This file is the only exception. Add telemetry to maintain consistency.Proposed fix
apis: - agents - batches - datasetio - eval - files - inference - safety - scoring +- telemetry - tool_runtime - vector_iorun.yaml (1)
139-145: Add embedding model registration toregistered_resources.modelsfor consistency with test configurations.The embedding model
nomic-ai/nomic-embed-text-v1.5is defined invector_stores.default_embedding_modelbut not registered inregistered_resources.models. Test configurations (run-ci.yaml, examples/run.yaml) register their embedding models withmodel_type: embeddingin the models section. Add a similar entry here:Example from run-ci.yaml
registered_resources: models: - model_id: gpt-4o-mini provider_id: openai model_type: llm provider_model_id: gpt-4o-mini - model_id: sentence-transformers/all-mpnet-base-v2 model_type: embedding provider_id: sentence-transformers provider_model_id: sentence-transformers/all-mpnet-base-v2 metadata: embedding_dimension: 768
🤖 Fix all issues with AI agents
In `@examples/run.yaml`:
- Around line 146-173: The registered embedding model (model_id:
sentence-transformers/all-mpnet-base-v2) does not match the
default_embedding_model (model_id: nomic-ai/nomic-embed-text-v1.5); update the
registered_resources.models entry to register nomic-ai/nomic-embed-text-v1.5
with provider_id nomic-ai (or alternatively change default_embedding_model to
sentence-transformers/all-mpnet-base-v2) so the configured default embedding
model and the registered embedding model match.
In `@examples/vllm-rhelai.yaml`:
- Around line 138-161: Add a registered_resources.models entry for the embedding
model referenced by vector_stores.default_embedding_model (model_id
nomic-ai/nomic-embed-text-v1.5) so it’s explicitly declared; create a model
object with model_id: nomic-ai/nomic-embed-text-v1.5, model_type: embedding,
provider_id: sentence-transformers, provider_model_id:
nomic-ai/nomic-embed-text-v1.5 and include metadata.embedding_dimension: 768 (or
the correct dimension), and apply the same addition to the corresponding
manifest that also references this embedding.
In `@examples/vllm-rhoai.yaml`:
- Around line 154-158: The YAML references an embedding model under
vector_stores.default_embedding_model (provider_id: sentence-transformers,
model_id: nomic-ai/nomic-embed-text-v1.5) but that model is not registered in
registered_resources.models; add an entry for this model in
registered_resources.models with the same provider_id and model_id (and any
required fields like type/params per registry schema) so the model is
discoverable at runtime and avoids failures when the vector store initializes.
In `@tests/e2e/configs/run-azure.yaml`:
- Around line 34-37: The config block currently hardcodes api_base and
api_version while api_key uses ${env.AZURE_API_KEY}; change api_base and
api_version to environment variables (e.g., ${env.AZURE_API_BASE} and
${env.AZURE_API_VERSION}) so they follow the same externalized pattern as
api_key and allow central version/endpoint bumps; update any references or docs
that expect those values if needed and ensure tests/CI set those env vars.
In `@tests/e2e/features/faiss.feature`:
- Around line 25-32: Fix the inconsistent indentation in the feature steps by
aligning the "Then The status code of the response is 200" and "And The response
should contain following fragments" lines with the preceding "When I use
\"query\"..." step; remove the extra leading spaces so all steps use the same
indentation level (match lines containing 'When I use "query"' and the
multi-line JSON block) to keep formatting consistent and readable.
♻️ Duplicate comments (19)
examples/vllm-rhoai.yaml (1)
144-147: Shield configuration for inline::llama-guard looks invalid/misleading.
provider_shield_id: openai/gpt-4o-mini doesn't look like a shield identifier; it looks like a model id. If Llama Stack expects the shield to reference a registered guard model (and provider_shield_id to be null/omitted for inline llama-guard), this example will break.
Llama Stack 0.3.x inline::llama-guard: how should registered_resources.shields be configured? Should provider_shield_id be omitted/null and shield_id reference a registered guard model?
Also applies to: 55-58
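If Llama Stack does expect a registered guard model, a hedged sketch of that shape could look like the following; the guard model id and its provider are illustrative and are not taken from this PR:

registered_resources:
  models:
  - model_id: meta-llama/Llama-Guard-3-8B        # assumed guard model, not registered in this PR
    provider_id: vllm                            # assumed provider for the guard model
    model_type: llm
    provider_model_id: meta-llama/Llama-Guard-3-8B
  shields:
  - shield_id: llama-guard
    provider_id: llama-guard
    provider_shield_id: null                     # per the open question above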
examples/azure-run.yaml (3)
34-38: Updateapi_versionto a current Azure OpenAI version.
2024-02-15-previewis likely stale; examples should point to a currently supported GA (or clearly label as preview).Azure OpenAI: what is the latest supported GA api-version as of 2026-01, and what preview versions are currently supported?
137-158:default_embedding_modelreferences an unregistered embedding model (example may fail).
Registernomic-ai/nomic-embed-text-v1.5underregistered_resources.models(provider:sentence-transformers) or change the default to a registered embedding.Proposed fix
registered_resources: models: - model_id: gpt-4o-mini provider_id: azure model_type: llm provider_model_id: gpt-4o-mini + - model_id: nomic-ai/nomic-embed-text-v1.5 + provider_id: sentence-transformers + model_type: embedding + provider_model_id: nomic-ai/nomic-embed-text-v1.5 shields:
143-146: Shield config forinline::llama-guardlikely misconfigured.
Same concern as other manifests:provider_shield_id: openai/gpt-4o-minilooks like a model id, not a shield id.Llama Stack 0.3.x inline::llama-guard: confirm correct shields configuration fields and whether provider_shield_id should be null/omitted.Also applies to: 53-58
examples/vertexai-run.yaml (1)
34-43: Register the Vertex LLM and the default embedding model (current config hasmodels: []).
Withregistered_resources.models: [], butallowed_modelsset (Vertex) anddefault_embedding_modelset (sentence-transformers), this example likely fails unless Llama Stack auto-registers models for these providers.Proposed fix (explicit registrations)
registered_resources: - models: [] + models: + - model_id: google/gemini-2.5-flash + provider_id: google-vertex + model_type: llm + provider_model_id: google/gemini-2.5-flash + - model_id: nomic-ai/nomic-embed-text-v1.5 + provider_id: sentence-transformers + model_type: embedding + provider_model_id: nomic-ai/nomic-embed-text-v1.5 shields: - shield_id: llama-guardLlama Stack 0.3.x: does registered_resources.models need explicit entries for remote::vertexai allowed_models and for vector_stores.default_embedding_model?Also applies to: 136-152
tests/e2e/features/faiss.feature (2)
8-12: Don’t commit truncated/hardcoded Bearer tokens; use a valid test token from env/fixture.
Current token strings are incomplete JWTs and will fail if auth is enforced; even if auth is lax, hardcoding tokens in-repo is brittle.Example change (env-driven token)
- And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva + And I set the Authorization header to Bearer ${E2E_JWT}behave / Gherkin tests: what is the recommended way in this repo to inject secrets (like Authorization tokens) into e2e runs—env vars, fixture files, or a step that mints a local token?Also applies to: 22-25
13-20: Avoid asserting a single hardcoded vector store id.
Asserting"vs_37316db9-e60d-4e5f-a1d4-d2a22219aaee"will break if the fixture DB is regenerated. Prefer asserting non-empty, matching an^vs_[0-9a-f-]{36}$shape, or asserting membership against an id read from the fixture you mount.tests/e2e/configs/run-vertexai.yaml (1)
137-153: Register (or prove auto-registration of) the default embedding model.
vector_stores.default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5butregistered_resources.modelsis empty, which can break model resolution depending on how strict config validation is.Proposed fix (explicitly register the embedding model)
registered_resources: - models: [] + models: + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 shields: - shield_id: llama-guard provider_id: llama-guard provider_shield_id: openai/gpt-4o-minitests/e2e/configs/run-rhaiis.yaml (2)
138-158: Register (or prove auto-registration of)nomic-ai/nomic-embed-text-v1.5.
vector_stores.default_embedding_modelreferences nomic v1.5, butregistered_resources.modelsonly declares the RHAIIS LLM.Proposed fix
registered_resources: models: - model_id: ${env.RHAIIS_MODEL} provider_id: vllm model_type: llm provider_model_id: ${env.RHAIIS_MODEL} + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5
1-2: Config header changes look fine; trim the stray whitespace-only line.
Line 16 appears to contain trailing whitespace after- vector_io.Proposed fix
- vector_io - +Also applies to: 13-16
tests/e2e/configs/run-ci.yaml (2)
140-150: Fix embedding model mismatch: default embedding model is not registered.
registered_resources.modelsregisterssentence-transformers/all-mpnet-base-v2, butvector_stores.default_embedding_modelpoints tonomic-ai/nomic-embed-text-v1.5.Option A: Register the nomic embedding model
registered_resources: models: - model_id: gpt-4o-mini provider_id: openai model_type: llm provider_model_id: gpt-4o-mini - model_id: sentence-transformers/all-mpnet-base-v2 model_type: embedding provider_id: sentence-transformers provider_model_id: sentence-transformers/all-mpnet-base-v2 metadata: embedding_dimension: 768 + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5Option B: Align default to the already-registered embedding model
vector_stores: default_provider_id: faiss default_embedding_model: # Define the default embedding model for RAG provider_id: sentence-transformers - model_id: nomic-ai/nomic-embed-text-v1.5 + model_id: sentence-transformers/all-mpnet-base-v2Also applies to: 161-165
60-63: Don’t commit Braintrust/OpenAI keys (even placeholders) in CI configs.
Replaceopenai_api_key: '********'with an env var (or remove the provider if not used in CI).Proposed fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}tests/e2e/configs/run-rhelai.yaml (1)
154-158: Register (or align) the embedding model referenced byvector_stores.
nomic-ai/nomic-embed-text-v1.5is used as default embedding model but isn’t declared underregistered_resources.models.Proposed fix
registered_resources: models: - model_id: ${env.RHEL_AI_MODEL} provider_id: vllm model_type: llm provider_model_id: ${env.RHEL_AI_MODEL} + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5Also applies to: 138-143
tests/e2e-prow/rhoai/configs/run.yaml (1)
154-158: Register the embedding model referenced bydefault_embedding_model.Proposed fix
registered_resources: models: - model_id: meta-llama/Llama-3.2-1B-Instruct provider_id: vllm model_type: llm provider_model_id: null + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 shields:Also applies to: 139-148
examples/run.yaml (3)
54-61: Fix YAML parse error: scoring: is on the same line as provider_type.
Proposed fix
 safety:
 - config:
     excluded_categories: []
   provider_id: llama-guard
-  provider_type: inline::llama-guard scoring:
+  provider_type: inline::llama-guard
+  scoring:
 - provider_id: basic
   provider_type: inline::basic
   config: {}
65-69: Don't ship placeholder secrets ('********') in runnable examples.
Proposed fix
 - provider_id: braintrust
   provider_type: inline::braintrust
   config:
-    openai_api_key: '********'
+    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
119-120: Remove duplicate top-level telemetry block. YAML will typically keep the last key, so this can silently override and confuse.
Proposed fix
YAML will typically keep the last key, so this can silently override and confuse.Proposed fix
-telemetry: - enabled: trueAlso applies to: 175-176
run.yaml (2)
18-28: ${env.SQLITE_STORE_DIR:=...} is used as a full file path in multiple places → path collisions when set.
Proposed fix (treat SQLITE_STORE_DIR as a directory)
conversations_store: - db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/conversations.db} + db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/conversations.db type: sqlite ... inference_store: - db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/inference-store.db} + db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/inference-store.db type: sqlite metadata_store: - db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/registry.db} + db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/registry.db type: sqlite ... files: - config: metadata_store: table_name: files_metadata backend: sql_default - storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage/files} + storage_dir: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/files ... storage: backends: kv_default: type: kv_sqlite - db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db} + db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/rag/kv_store.db sql_default: type: sql_sqlite - db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db} + db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage}/sql_store.dbAlso applies to: 45-47, 118-123
60-63: Don’t ship placeholder secrets ('********') in runnable configs.Proposed fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🧹 Nitpick comments (2)
examples/vllm-rhaiis.yaml (1)
30-42: Consider dropping the unusedopenaiinference provider (or register a model for it).
Right nowproviders.inferenceincludesopenai, but all registered models point atvllm, so this can mislead example users.Also applies to: 138-143
examples/vllm-rhelai.yaml (1)
30-45: Consider dropping the unusedopenaiinference provider (or register a model for it).
Keeps the example focused and avoids implying OpenAI is active when it isn’t.Also applies to: 139-143
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- tests/e2e/rag/kv_store.db is excluded by !**/*.db
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/vllm-granite-run.yaml
- examples/gemini-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (4)
- pyproject.toml
- docker-compose-library.yaml
- tests/e2e/test_list.txt
- docs/providers.md
🧰 Additional context used
📓 Path-based instructions (1)
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/faiss.feature
🧠 Learnings (6)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/features/faiss.feature
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- examples/run.yaml
- examples/vllm-rhoai.yaml
- examples/azure-run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- examples/vllm-rhaiis.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-rhelai.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
examples/vllm-rhaiis.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
🪛 YAMLlint (1.37.1)
examples/run.yaml
[error] 58-58: syntax error: mapping values are not allowed here
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (14)
examples/vertexai-run.yaml (1)
4-15: Include telemetry in the apis list to match the configuration pattern in other examples. The file enables telemetry (lines 155-156) but omits it from the apis list (lines 4-14). The canonical example examples/run.yaml includes telemetry in its apis list when telemetry is enabled, and this should be consistent.
examples/vllm-rhoai.yaml (1)
139-143: No action required. The configuration provider_model_id: null is intentional and valid for vLLM providers in Llama Stack 0.3.x. When set to null, the server automatically uses the model_id value (meta-llama/Llama-3.2-1B-Instruct) as the provider-side model identifier at startup.
docker-compose.yaml (1)
11-14: The ./tests/e2e/rag directory is already committed to the repository. The volume mount is properly configured. The directory exists and is tracked in git with kv_store.db already present. The :Z flag for SELinux relabeling is already used consistently on multiple volume mounts in both docker-compose.yaml and docker-compose-library.yaml, and the configuration is actively running in CI (ubuntu-latest) without issues. No changes needed.
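For reference, the mount under discussion has roughly this shape in a compose file (a sketch; the service name and container path are assumptions, not copied from the repository):

services:
  llama-stack:                                  # hypothetical service name
    volumes:
      # expose the pre-built RAG SQLite store; :Z relabels it for SELinux hosts
      - ./tests/e2e/rag:/root/.llama/storage/rag:Z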
tests/e2e/configs/run-vertexai.yaml (2)
13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings,telemetry.enabled: trueis sufficient in 0.3.x-style configs, and addingtelemetrytoapisaligns with that.Also applies to: 156-157
30-72: Providers centralization + FAISS persistence wiring looks consistent.
Theproviders.vector_ioFAISS entry uses a stablenamespaceandkv_defaultbackend, which should keep RAG state isolated under the rag kv store.tests/e2e/configs/run-azure.yaml (2)
13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, keeping telemetry astelemetry.enabled: true(plus listingtelemetryinapis) is the right shape here.Also applies to: 160-162
137-157: Register the embedding model referenced byvector_stores.
The configuration referencesnomic-ai/nomic-embed-text-v1.5as the default embedding model invector_stores.default_embedding_model, but this model is not registered inregistered_resources.models. Onlygpt-4o-miniis currently registered. For consistency and to ensure the embedding model is properly tracked, add it to the models list.Proposed fix
registered_resources: models: - model_id: gpt-4o-mini provider_id: azure model_type: llm provider_model_id: gpt-4o-mini + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5tests/e2e/configs/run-rhaiis.yaml (1)
161-162: Telemetry enablement is correctly minimal (no provider block).
Based on learnings,telemetry.enabled: trueis the intended way to enable telemetry here.tests/e2e/configs/run-rhelai.yaml (1)
13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, this telemetry configuration shape is correct.Also applies to: 161-162
tests/e2e/configs/run-ci.yaml (1)
13-15: Telemetry enablement is correctly minimal (no provider block).
Based on learnings, this is the right way to enable telemetry in these configs.Also applies to: 168-169
examples/vllm-rhaiis.yaml (2)
1-2: Version/image_name normalization looks good.
137-158: Embedding model should be registered inregistered_resources.modelsfor consistency with reference configs.The file uses
vector_stores.default_embedding_modelbut does not registernomic-ai/nomic-embed-text-v1.5inregistered_resources.models. The reference example (examples/run.yaml) registers embedding models withmodel_type: embedding. While most configs in this PR omit registration, the pattern inconsistency should be resolved.Proposed fix (register the embedding model)
registered_resources: models: - model_id: ${env.RHAIIS_MODEL} provider_id: vllm model_type: llm provider_model_id: ${env.RHAIIS_MODEL} + - model_id: nomic-ai/nomic-embed-text-v1.5 + model_type: embedding + provider_id: sentence-transformers + provider_model_id: nomic-ai/nomic-embed-text-v1.5 shields: - shield_id: llama-guard provider_id: llama-guard provider_shield_id: openai/gpt-4o-miniAlso note:
sentence-transformers>=5.0.0is currently only in the optionalllslibdevdependency group, not in main project dependencies.examples/vllm-rhelai.yaml (1)
1-3: Version/image_name normalization looks good.tests/e2e-prow/rhoai/configs/run.yaml (1)
139-143: No action needed. Theprovider_model_id: nullconfiguration is valid for the vllm provider and is used consistently across multiple example and test configurations in the repository.
Diff hunk from examples/vllm-rhelai.yaml (removed and added lines appear adjacent, in source order):

registered_resources:
  models:
  - model_id: all-mpnet-base-v2
    provider_id: sentence-transformers
    provider_model_id: all-mpnet-base-v2
    model_type: embedding
    metadata:
      embedding_dimension: 768
  - model_id: ${env.RHEL_AI_MODEL}
    provider_id: vllm
    model_type: llm
    provider_model_id: ${env.RHEL_AI_MODEL}
  shields:
  - shield_id: llama-guard
    provider_id: ${env.SAFETY_MODEL:+llama-guard}
    provider_shield_id: ${env.SAFETY_MODEL:=}
  - shield_id: code-scanner
    provider_id: ${env.CODE_SCANNER_MODEL:+code-scanner}
    provider_shield_id: ${env.CODE_SCANNER_MODEL:=}
    provider_id: llama-guard
    provider_shield_id: openai/gpt-4o-mini
  datasets: []
  scoring_fns: []
  benchmarks: []
  tool_groups:
  - toolgroup_id: builtin::rag
    provider_id: rag-runtime
server:
  port: 8321
telemetry:
  enabled: true
vector_stores:
  default_provider_id: documentation_faiss
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: all-mpnet-base-v2
    model_id: nomic-ai/nomic-embed-text-v1.5
safety:
  default_shield_id: llama-guard
telemetry:
🧩 Analysis chain
🏁 Script executed:
# First, let's find the vllm-rhelai.yaml file and read its full content
find . -name "vllm-rhelai.yaml" -type fRepository: lightspeed-core/lightspeed-stack
Length of output: 103
🏁 Script executed:
# Look for config schema files
find . -name "*schema*" -o -name "*config*.json" -o -name "*config*.yaml" | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 275
🏁 Script executed:
# List example manifest files to understand patterns
git ls-files examples/ | grep -E "\.(yaml|yml)$" | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 891
🏁 Script executed:
# Read the full vllm-rhelai.yaml file
cat ./examples/vllm-rhelai.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 3851
🏁 Script executed:
# Check size of config.json and then read it
wc -l ./docs/config.jsonRepository: lightspeed-core/lightspeed-stack
Length of output: 99
🏁 Script executed:
# Compare with other similar manifest files - check vllm-rhaiis.yaml
cat ./examples/vllm-rhaiis.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 3820
🏁 Script executed:
# Read the config.json schema to understand embedding model requirements
head -300 ./docs/config.jsonRepository: lightspeed-core/lightspeed-stack
Length of output: 10586
🏁 Script executed:
# Check the byok-rag example manifest
cat ./examples/lightspeed-stack-byok-rag.yaml | grep -A 30 "vector_stores"Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
# Search for embedding model registrations in all manifests
grep -r "model_type.*embedding" ./examples/ || echo "No embedding models found in registered_resources"Repository: lightspeed-core/lightspeed-stack
Length of output: 121
🏁 Script executed:
# Check the run.yaml to see how embedding models are registered
grep -B 5 -A 5 "model_type: embedding" ./examples/run.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 418
🏁 Script executed:
# Check if there are any other manifests with embedding model registrations
grep -B 5 -A 5 "model_type: embedding" ./examples/*.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 638
🏁 Script executed:
# Search config.json for embedding model schema requirements
grep -A 20 "EmbeddingModel\|embedding_model\|registered_resources" ./docs/config.json | head -100Repository: lightspeed-core/lightspeed-stack
Length of output: 800
🏁 Script executed:
# Search for RegisteredResources or Model schema in config.json
grep -B 5 -A 30 "RegisteredResources\|\"Model\"\|\"models\"" ./docs/config.json | head -150Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
# Look for the actual RegisteredResources schema definition
grep -A 50 "\"RegisteredResources\"" ./docs/config.jsonRepository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
# Let's check if there are any comments or documentation about embedding models in the codebase
rg "embedding.*model.*register" -A 3 -B 3 --type yaml || echo "No documentation found"Repository: lightspeed-core/lightspeed-stack
Length of output: 98
🏁 Script executed:
# Check the entire structure of run.yaml's registered_resources section for comparison
sed -n '/registered_resources:/,/^[a-z]/p' ./examples/run.yaml | head -60Repository: lightspeed-core/lightspeed-stack
Length of output: 713
Register embedding model in registered_resources.models. The default embedding model nomic-ai/nomic-embed-text-v1.5 is referenced in vector_stores.default_embedding_model but not registered in registered_resources.models. Register it with model_type: embedding following the pattern in other manifests:
Example registration
- model_id: nomic-ai/nomic-embed-text-v1.5
model_type: embedding
provider_id: sentence-transformers
provider_model_id: nomic-ai/nomic-embed-text-v1.5
metadata:
embedding_dimension: 768
Also applies to: examples/vllm-rhaiis.yaml (lines 154-158)
🤖 Prompt for AI Agents
In `@examples/vllm-rhelai.yaml` around lines 138 - 161, Add a
registered_resources.models entry for the embedding model referenced by
vector_stores.default_embedding_model (model_id nomic-ai/nomic-embed-text-v1.5)
so it’s explicitly declared; create a model object with model_id:
nomic-ai/nomic-embed-text-v1.5, model_type: embedding, provider_id:
sentence-transformers, provider_model_id: nomic-ai/nomic-embed-text-v1.5 and
include metadata.embedding_dimension: 768 (or the correct dimension), and apply
the same addition to the corresponding manifest that also references this
embedding.
 vector_stores:
-  default_provider_id: documentation_faiss
+  default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: all-mpnet-base-v2
+    model_id: nomic-ai/nomic-embed-text-v1.5
🧩 Analysis chain
🏁 Script executed:
find . -name "vllm-rhoai.yaml" -type fRepository: lightspeed-core/lightspeed-stack
Length of output: 208
🏁 Script executed:
cat -n examples/vllm-rhoai.yaml | head -170Repository: lightspeed-core/lightspeed-stack
Length of output: 4947
🌐 Web query:
Llama Stack 0.3.x vector_stores default_embedding_model registration requirements
💡 Result:
Short answer — for Llama Stack 0.3.x the vector_store’s default_embedding_model must be a registered embedding model (an entry in the platform’s models registry) and you must supply its embedding dimension (either via the model metadata or explicitly when creating the store). Concretely:
- The embedding model must be registered in Llama Stack’s models registry (an identifier returned by models.list()). [1][2]
- The model entry must include embedding metadata (embedding_dimension) or you must pass embedding_dimension when creating the vector store. [1][2]
- When creating/registering a vector_store you either reference the embedding model identifier (embedding_model) and embedding_dimension, or rely on a distribution-level default_embedding_model set in run.yaml / distribution config. [2][3]
- Provider info: vector stores normally require a provider_id (or use the distribution default provider); for inline/dev you can use the built-in providers (faiss, milvus, etc.). [3][2]
Sources: Llama Stack client/CLI and “Working with Llama Stack” docs (v0.3.x). [1][2][3]
References
[1] Llama Stack client CLI docs (embedding-model default, create/unregister vector db).
[2] “Working with Llama Stack” — examples showing registering/selecting embedding models and creating vector stores.
[3] Starter distribution docs — default providers (faiss, etc.) and env/config defaults.
Register the default_embedding_model in registered_resources.models. In Llama Stack 0.3.x, embedding models referenced by vector_stores.default_embedding_model must be registered in the models registry. Currently, nomic-ai/nomic-embed-text-v1.5 is used but not defined under registered_resources.models, which will cause runtime failures.
Proposed fix
registered_resources:
models:
- model_id: meta-llama/Llama-3.2-1B-Instruct
provider_id: vllm
model_type: llm
provider_model_id: null
+ - model_id: nomic-ai/nomic-embed-text-v1.5
+ provider_id: sentence-transformers
+ model_type: embedding
+ provider_model_id: nomic-ai/nomic-embed-text-v1.5
+ embedding_dimension: 768
shields:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| vector_stores: | |
| default_provider_id: documentation_faiss | |
| default_provider_id: faiss | |
| default_embedding_model: | |
| provider_id: sentence-transformers | |
| model_id: all-mpnet-base-v2 | |
| model_id: nomic-ai/nomic-embed-text-v1.5 | |
| registered_resources: | |
| models: | |
| - model_id: meta-llama/Llama-3.2-1B-Instruct | |
| provider_id: vllm | |
| model_type: llm | |
| provider_model_id: null | |
| - model_id: nomic-ai/nomic-embed-text-v1.5 | |
| provider_id: sentence-transformers | |
| model_type: embedding | |
| provider_model_id: nomic-ai/nomic-embed-text-v1.5 | |
| embedding_dimension: 768 | |
| shields: |
🤖 Prompt for AI Agents
In `@examples/vllm-rhoai.yaml` around lines 154 - 158, The YAML references an
embedding model under vector_stores.default_embedding_model (provider_id:
sentence-transformers, model_id: nomic-ai/nomic-embed-text-v1.5) but that model
is not registered in registered_resources.models; add an entry for this model in
registered_resources.models with the same provider_id and model_id (and any
required fields like type/params per registry schema) so the model is
discoverable at runtime and avoids failures when the vector store initializes.
config:
  api_key: ${env.AZURE_API_KEY}
  api_base: https://ols-test.openai.azure.com/
  api_version: 2024-02-15-preview
🧩 Analysis chain
🏁 Script executed:
# First, verify the file exists and check the actual content around lines 34-37
find . -name "run-azure.yaml" -type fRepository: lightspeed-core/lightspeed-stack
Length of output: 110
🏁 Script executed:
# Get the actual file content to verify the snippet
cat -n tests/e2e/configs/run-azure.yaml | sed -n '30,40p'Repository: lightspeed-core/lightspeed-stack
Length of output: 440
🏁 Script executed:
# Check if there are other e2e config files to see patterns
ls -la tests/e2e/configs/ 2>/dev/null || find . -path "*/e2e/configs*" -type fRepository: lightspeed-core/lightspeed-stack
Length of output: 568
🏁 Script executed:
# Search for ${env. patterns in config files to see if this syntax is supported
rg '\$\{env\.' tests/e2e/ 2>/dev/null || echo "No env var usage found"Repository: lightspeed-core/lightspeed-stack
Length of output: 1356
🏁 Script executed:
# Check the entire run-azure.yaml to understand the structure
wc -l tests/e2e/configs/run-azure.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 112
🏁 Script executed:
# Check what the other config files do for api_base and api_version (or similar endpoint configs)
cat -n tests/e2e/configs/run-ci.yaml | sed -n '30,45p'Repository: lightspeed-core/lightspeed-stack
Length of output: 683
🏁 Script executed:
# Check run-rhelai and run-rhaiis for their endpoint patterns
cat -n tests/e2e/configs/run-rhelai.yaml | head -50Repository: lightspeed-core/lightspeed-stack
Length of output: 1477
🏁 Script executed:
# Check if there are any scripts or documentation about running these e2e tests
find . -name "*.md" -o -name "*e2e*" -o -name "*test*" | grep -i readmeRepository: lightspeed-core/lightspeed-stack
Length of output: 1314
🏁 Script executed:
# Search for how AZURE_API_BASE or similar env vars might be used elsewhere
rg "AZURE_API" . --type yaml --type mdRepository: lightspeed-core/lightspeed-stack
Length of output: 401
🏁 Script executed:
# Check if the config parsing/loading code supports env var expansion
rg -A 5 -B 5 "api_base|api_version" . --type py | head -80Repository: lightspeed-core/lightspeed-stack
Length of output: 1700
🏁 Script executed:
# Check the examples/azure-run.yaml to see if it has the same hardcoded values
cat -n examples/azure-run.yaml | sed -n '1,50p'Repository: lightspeed-core/lightspeed-stack
Length of output: 1498
🏁 Script executed:
# Check tests/e2e/README.md for guidance on running these configs
cat tests/e2e/README.mdRepository: lightspeed-core/lightspeed-stack
Length of output: 375
Externalize hardcoded Azure endpoint and API version to environment variables for consistent portability across deployments.
The api_base and api_version are hardcoded, while api_key already uses ${env.AZURE_API_KEY}. This inconsistency breaks the established pattern seen across other e2e configs (run-rhelai.yaml, run-rhaiis.yaml, run-vertexai.yaml), which externalize all deployment-specific values. Hardcoding preview versions also prevents central bumping without file edits.
Proposed change
- provider_id: azure
provider_type: remote::azure
config:
api_key: ${env.AZURE_API_KEY}
- api_base: https://ols-test.openai.azure.com/
- api_version: 2024-02-15-preview
+ api_base: ${env.AZURE_API_BASE}
+ api_version: ${env.AZURE_API_VERSION}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| config: | |
| api_key: ${env.AZURE_API_KEY} | |
| api_base: https://ols-test.openai.azure.com/ | |
| api_version: 2024-02-15-preview | |
| - provider_id: azure | |
| provider_type: remote::azure | |
| config: | |
| api_key: ${env.AZURE_API_KEY} | |
| api_base: ${env.AZURE_API_BASE} | |
| api_version: ${env.AZURE_API_VERSION} |
🤖 Prompt for AI Agents
In `@tests/e2e/configs/run-azure.yaml` around lines 34 - 37, The config block
currently hardcodes api_base and api_version while api_key uses
${env.AZURE_API_KEY}; change api_base and api_version to environment variables
(e.g., ${env.AZURE_API_BASE} and ${env.AZURE_API_VERSION}) so they follow the
same externalized pattern as api_key and allow central version/endpoint bumps;
update any references or docs that expect those values if needed and ensure
tests/CI set those env vars.
When I use "query" to ask question with authorization header
"""
{"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Always use the file_search tool to answer. Write only lowercase letters"}
"""
Then The status code of the response is 200
And The response should contain following fragments
  | Fragments in LLM response |
  | great work |
🧩 Analysis chain
🏁 Script executed:
cat -n tests/e2e/features/faiss.feature | sed -n '20,35p'Repository: lightspeed-core/lightspeed-stack
Length of output: 842
🏁 Script executed:
# Check if the file exists and get its full content to verify indentation
if [ -f "tests/e2e/features/faiss.feature" ]; then
head -35 tests/e2e/features/faiss.feature | cat -A
fiRepository: lightspeed-core/lightspeed-stack
Length of output: 1405
🌐 Web query:
behave Python Gherkin indentation handling step definition parsing
💡 Result:
Short summary — how behave handles Gherkin indentation and step parsing
-
General indentation: behave mostly ignores indentation in feature files; whitespace is for readability only and does not change step parsing. Keywords (Feature/Scenario/Given/When/Then/And/But) are recognized regardless of indentation (spaces or tabs). [1]
-
Steps and "And/But": "And"/"But" inherit the type of the preceding step; their indentation doesn’t change their semantics. [1]
-
Docstrings / PyStrings (multi-line step text): this is the one place where indentation is significant. behave expects docstrings to be delimited with triple-quotes (""") and will strip the common leading whitespace based on the first line of the docstring; all subsequent lines must have at least that amount of indentation. The docstring text becomes available to step code (context.text). [2][4]
-
Tables: data tables (pipe-delimited | a | b |) are parsed line-by-line and attached to the last step or Examples section; normal indentation is ignored but table syntax must be valid. [3]
-
Parser behavior: behave's parser is a line-oriented state machine implementing Gherkin (with small extensions). It recognizes states for steps, tables and MULTILINE_TEXT and uses indentation rules above when switching states. See the parser/state-machine overview. [3]
References
- behave Gherkin docs (features, steps, docstrings). [1]
- behave docs — Step data / Text (docstrings) explanation. [2]
- Parser system / MULTILINE_TEXT and table handling (parser implementation overview). [3]
Sources
[1] behave — Feature Testing Setup / Gherkin (readthedocs).
[2] behave — Feature Testing Setup / Step Data (readthedocs).
[3] behave parser system / MULTILINE_TEXT and table handling (parser overview).
Fix step indentation inconsistency for consistent formatting.
Lines 29–30 have extra indentation compared to lines 25–28. While behave ignores step indentation during parsing, keep consistent indentation across all steps for readability.
🤖 Prompt for AI Agents
In `@tests/e2e/features/faiss.feature` around lines 25 - 32, Fix the inconsistent
indentation in the feature steps by aligning the "Then The status code of the
response is 200" and "And The response should contain following fragments" lines
with the preceding "When I use \"query\"..." step; remove the extra leading
spaces so all steps use the same indentation level (match lines containing 'When
I use "query"' and the multi-line JSON block) to keep formatting consistent and
readable.
- FAISS e2e test (for all inference providers)
- Disable shields in all configurations
- Dummy rag db
- Update/remove old configuration examples
Force-pushed from 0ef52a1 to e7709d9 (Compare)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/vllm-rhaiis.yaml (1)
4-14: Missing telemetry in APIs list. Unlike other configuration files in this PR (e.g., vllm-rhoai.yaml, vllm-rhelai.yaml), this file lacks telemetry in the apis list, even though telemetry.enabled: true is set at lines 160-161. Add telemetry to the APIs for consistency. Suggested fix:
 apis:
 - agents
 - batches
 - datasetio
 - eval
 - files
 - inference
 - safety
 - scoring
+- telemetry
 - tool_runtime
 - vector_io
🤖 Fix all issues with AI agents
In `@run.yaml`:
- Around line 156-160: The default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but it’s not registered; add an entry under
registered_resources.models to register that embedding model using the same
pattern as other models: create a model object with model_id set to
nomic-ai/nomic-embed-text-v1.5, provider_id set to sentence-transformers,
model_type set to embedding, and provider_model_id set to
nomic-ai/nomic-embed-text-v1.5 so the default_embedding_model resolves
correctly.
In `@tests/e2e/configs/run-rhaiis.yaml`:
- Around line 154-158: The default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but that model isn't listed in
registered_resources.models; add an entry under registered_resources.models for
model_id: nomic-ai/nomic-embed-text-v1.5 with provider_id: sentence-transformers
(or alternatively document that the model must be pre-registered via the Llama
Stack client) so the model referenced by default_embedding_model is available at
runtime.
♻️ Duplicate comments (21)
examples/azure-run.yaml (1)
37-37: Update Azure OpenAI API version to current release. The api_version: 2024-02-15-preview is outdated. Consider updating to at least the stable 2024-06-01 release. Suggested fix:
- api_version: 2024-02-15-preview
+ api_version: 2024-06-01
examples/vertexai-run.yaml (1)
136-147: No models registered despite having inference providers configured. The models: [] means no models are explicitly registered, yet inference providers (google-vertex, openai, sentence-transformers) are configured. Consider adding model entries for the allowed models. Suggested model registration:
 registered_resources:
-  models: []
+  models:
+  - model_id: google/gemini-2.5-flash
+    provider_id: google-vertex
+    model_type: llm
+    provider_model_id: google/gemini-2.5-flash
run.yaml (1)
60-63: Placeholder API key for braintrust scoring provider. The openai_api_key: '********' is a redacted placeholder that will fail at runtime if the braintrust scoring provider is used. Consider using an environment variable. Suggested fix:
  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-     openai_api_key: '********'
+     openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
examples/vllm-rhoai.yaml (2)
144-147: Shield configuration incomplete for inline::llama-guard provider. The inline::llama-guard safety provider (lines 55-58) requires a separate registered Llama Guard model served by an inference provider. Setting provider_shield_id: openai/gpt-4o-mini is incorrect: this field should be null for inline llama-guard, and shield_id should reference a registered guard model. To fix:
- Register a Llama Guard model in registered_resources.models
- Update the shield to use provider_shield_id: null and reference the registered guard model (a minimal sketch follows)
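A minimal sketch of the shape described above (model and shield ids are illustrative; the guard model would have to be served by one of the configured inference providers):

registered_resources:
  models:
  - model_id: llama-guard-3-1b                  # hypothetical guard model registration
    provider_id: vllm
    model_type: llm
    provider_model_id: meta-llama/Llama-Guard-3-1B
  shields:
  - shield_id: llama-guard-3-1b                 # references the registered guard model
    provider_id: llama-guard
    provider_shield_id: null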
154-158: Register nomic-ai/nomic-embed-text-v1.5 in registered_resources.models. The default_embedding_model references nomic-ai/nomic-embed-text-v1.5, but this model is not registered in registered_resources.models. In Llama Stack 0.3.x, embedding models must be registered with model_type: embedding. Proposed fix:
 registered_resources:
   models:
   - model_id: meta-llama/Llama-3.2-1B-Instruct
     provider_id: vllm
     model_type: llm
     provider_model_id: null
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
   shields:
examples/vllm-rhelai.yaml (2)
144-147: Shield configuration mismatch withinline::llama-guardprovider.Same issue as other configuration files:
provider_shield_id: openai/gpt-4o-miniis incorrect for theinline::llama-guardprovider. This should beprovider_shield_id: null, and a Llama Guard model should be registered and referenced.
154-158: Register the embedding model inregistered_resources.models.The
default_embedding_model(nomic-ai/nomic-embed-text-v1.5) is not registered inregistered_resources.models. Add an entry withmodel_type: embeddingto ensure the vector store can resolve the model at runtime.tests/e2e-prow/rhoai/configs/run.yaml (1)
144-147: Shield configuration uses incorrectprovider_shield_idforinline::llama-guard.Consistent with other configuration files, the shield's
provider_shield_id: openai/gpt-4o-miniis incompatible with theinline::llama-guardprovider. This should benullwith a registered guard model.examples/run.yaml (4)
65-68: Placeholder API key for braintrust scoring provider.The
openai_api_key: '********'placeholder will fail at runtime. Use an environment variable reference instead.Suggested fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
152-157: Embedding model mismatch: registered model differs from default.
registered_resources.modelsregisterssentence-transformers/all-mpnet-base-v2, butvector_stores.default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5. These must match for RAG to function correctly.Option A (recommended): Update registered model to match default
registered_resources: models: - model_id: gpt-4o-mini provider_id: openai model_type: llm provider_model_id: gpt-4o-mini - - model_id: sentence-transformers/all-mpnet-base-v2 + - model_id: nomic-ai/nomic-embed-text-v1.5 model_type: embedding provider_id: sentence-transformers - provider_model_id: sentence-transformers/all-mpnet-base-v2 + provider_model_id: nomic-ai/nomic-embed-text-v1.5 metadata: embedding_dimension: 768Also applies to: 168-172
119-120: Duplicatetelemetryblock.The
telemetrysection appears twice: at lines 119-120 and again at lines 175-176. Remove the duplicate to avoid configuration ambiguity.Remove the first occurrence (lines 119-120)
scoring_fns: [] -telemetry: - enabled: true server: port: 8321Also applies to: 175-176
54-61: Critical YAML syntax error:scoring:incorrectly on same line.Line 58 has
provider_type: inline::llama-guard scoring:wherescoring:is placed on the same line. This is invalid YAML syntax and will cause a parse error, preventing the configuration from loading.🐛 Fix the YAML syntax error
safety: - config: excluded_categories: [] provider_id: llama-guard - provider_type: inline::llama-guard scoring: - - provider_id: basic + provider_type: inline::llama-guard + scoring: + - provider_id: basic provider_type: inline::basic config: {}examples/vllm-rhaiis.yaml (2)
143-146: Shield configuration inconsistent withinline::llama-guardprovider.Same issue as other files:
provider_shield_id: openai/gpt-4o-miniis incorrect for theinline::llama-guardsafety provider. Should benullwith a properly registered guard model.
153-157: Register the embedding model for vector stores.
nomic-ai/nomic-embed-text-v1.5is referenced as the default embedding model but not registered inregistered_resources.models. This will cause runtime failures.tests/e2e/features/faiss.feature (3)
14-19: Hardcoded vector_store ID tightly couples test to specific test data.The UUID
vs_37316db9-e60d-4e5f-a1d4-d2a22219aaeecreates a dependency on specific test database state. Consider documenting this dependency or making the expected ID configurable/discoverable.
22-32: Fix inconsistent step indentation for readability.Lines 29-30 have extra indentation compared to lines 25-28. While behave ignores step indentation during parsing, maintain consistent formatting for better readability.
Suggested alignment
When I use "query" to ask question with authorization header """ {"query": "What is the title of the article from Paul?", "system_prompt": "You are an assistant. Always use the file_search tool to answer. Write only lowercase letters"} """ - Then The status code of the response is 200 - And The response should contain following fragments + Then The status code of the response is 200 + And The response should contain following fragments | Fragments in LLM response | | great work |
10-10: Truncated JWT token - missing signature segment.The Bearer token appears incomplete:
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpvaends mid-payload and is missing the third (signature) segment. A valid JWT requires three Base64url-encoded parts separated by dots.Consider using a complete test JWT or extracting this to a test fixture/environment variable.
tests/e2e/configs/run-rhelai.yaml (1)
154-158: Verify embedding model registration requirements.Same as
run-rhaiis.yaml: thedefault_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5but it's not registered inregistered_resources.models. Verify whether the sentence-transformers provider auto-registers models or if explicit registration is required.tests/e2e/configs/run-azure.yaml (1)
34-37: Externalize hardcoded Azure endpoint and API version.The
api_keyuses${env.AZURE_API_KEY}, butapi_baseandapi_versionare hardcoded. This inconsistency breaks the pattern established in other e2e configs (e.g.,run-rhelai.yamlexternalizesRHEL_AI_URLandRHEL_AI_PORT). Hardcoding the preview version also prevents central version bumps without file edits.Proposed change
- provider_id: azure provider_type: remote::azure config: api_key: ${env.AZURE_API_KEY} - api_base: https://ols-test.openai.azure.com/ - api_version: 2024-02-15-preview + api_base: ${env.AZURE_API_BASE} + api_version: ${env.AZURE_API_VERSION:=2024-02-15-preview}tests/e2e/configs/run-ci.yaml (2)
60-63: Braintrust API key should use environment variable.The
openai_api_key: '********'is still a placeholder value. If this provider is actually used in CI tests, authentication will fail. Per project guidelines, sensitive data should use environment variables.Suggested fix
- provider_id: braintrust provider_type: inline::braintrust config: - openai_api_key: '********' + openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
161-165: Embedding model mismatch: default model not registered.The
default_embedding_modelreferencesnomic-ai/nomic-embed-text-v1.5(line 165), but onlysentence-transformers/all-mpnet-base-v2is registered inregistered_resources.models(lines 145-150). This mismatch will cause runtime failures when the RAG system attempts to use the unregistered embedding model.Option A: Register the nomic embedding model
Add to
registered_resources.models:- model_id: nomic-ai/nomic-embed-text-v1.5 model_type: embedding provider_id: sentence-transformers provider_model_id: nomic-ai/nomic-embed-text-v1.5 metadata: embedding_dimension: 768Option B: Use the registered model as default
vector_stores: default_provider_id: faiss default_embedding_model: provider_id: sentence-transformers - model_id: nomic-ai/nomic-embed-text-v1.5 + model_id: sentence-transformers/all-mpnet-base-v2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
tests/e2e/rag/kv_store.db is excluded by !**/*.db
uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
- docker-compose-library.yaml
- docker-compose.yaml
- docs/providers.md
- examples/azure-run.yaml
- examples/gemini-run.yaml
- examples/openai-faiss-run.yaml
- examples/openai-pgvector-run.yaml
- examples/run.yaml
- examples/vertexai-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/vllm-qwen3-run.yaml
- examples/vllm-rhaiis.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- pyproject.toml
- run.yaml
- tests/e2e-prow/rhoai/configs/run.yaml
- tests/e2e/configs/run-azure.yaml
- tests/e2e/configs/run-ci.yaml
- tests/e2e/configs/run-rhaiis.yaml
- tests/e2e/configs/run-rhelai.yaml
- tests/e2e/configs/run-vertexai.yaml
- tests/e2e/features/faiss.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (6)
- examples/vllm-qwen3-run.yaml
- examples/openai-pgvector-run.yaml
- examples/gemini-run.yaml
- examples/vllm-granite-run.yaml
- examples/vllm-llama-faiss-run.yaml
- examples/openai-faiss-run.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- docker-compose-library.yaml
- tests/e2e/test_list.txt
🧰 Additional context used
📓 Path-based instructions (2)
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Checkpyproject.tomlfor supported Python versions before development
Always checkpyproject.tomlfor existing dependencies before adding new ones
Always verify current library versions inpyproject.tomlrather than assuming versions
Use pylint withsource-roots = "src"configuration
Files:
pyproject.toml
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/faiss.feature
🧠 Learnings (9)
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to pyproject.toml : Always check `pyproject.toml` for existing dependencies before adding new ones
Applied to files:
- pyproject.toml
- examples/vllm-rhaiis.yaml
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.
Applied to files:
pyproject.toml
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
tests/e2e/configs/run-rhaiis.yamltests/e2e/configs/run-rhelai.yamltests/e2e/configs/run-vertexai.yamltests/e2e/configs/run-ci.yamltests/e2e/configs/run-azure.yamltests/e2e-prow/rhoai/configs/run.yamlexamples/vllm-rhaiis.yamlexamples/run.yamlexamples/vllm-rhoai.yamlexamples/vllm-rhelai.yamlexamples/vertexai-run.yamlexamples/azure-run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt`
Applied to files:
tests/e2e/configs/run-rhaiis.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Never commit secrets/keys; use environment variables for sensitive data
Applied to files:
tests/e2e/configs/run-ci.yaml
📚 Learning: 2025-12-18T10:21:03.056Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In run.yaml for Llama Stack 0.3.x, do not add a telemetry provider block under providers. To enable telemetry, set telemetry.enabled: true directly (no provider block required). This pattern applies specifically to run.yaml configuration in this repository.
Applied to files:
run.yaml
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Applied to files:
tests/e2e/features/faiss.feature
📚 Learning: 2025-08-25T09:05:18.350Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 445
File: tests/e2e/features/health.feature:50-53
Timestamp: 2025-08-25T09:05:18.350Z
Learning: The step definition for ignoring specific fields in response body comparisons is implemented in tests/e2e/features/steps/common_http.py at line 175 as `then('The body of the response, ignoring the "{field}" field, is the following')`.
Applied to files:
tests/e2e/features/faiss.feature
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory
Applied to files:
tests/e2e/features/faiss.feature
🪛 YAMLlint (1.37.1)
examples/run.yaml
[error] 58-58: syntax error: mapping values are not allowed here
(syntax)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (37)
pyproject.toml (1)
63-64: Dependency addition follows project conventions. Theeinops>=0.8.1constraint is properly documented and aligns with the project's established pattern of pinning transitive dependencies at the top level for better control. The version constraint is current and appropriate.docs/providers.md (2)
103-103: Documentation now reflects llama-guard as unsupported.Per the past review, this was addressed. Note that configuration files still reference
llama-guardfor testing/example purposes even though it's marked unsupported in the official provider status.
160-160: rag-runtime now correctly marked as supported.This aligns with the PR objective to fully support the FAISS VectorIO provider and RAG functionality.
docker-compose.yaml (1)
14-14: LGTM - RAG volume mount for e2e tests.The volume mapping correctly provides access to the RAG test database, aligning with the storage paths configured in
run.yaml(~/.llama/storage/rag/).examples/azure-run.yaml (2)
66-72: FAISS vector_io provider configuration looks correct.The persistence configuration with namespace
vector_io::faissandkv_defaultbackend aligns with the storage backend defined at lines 116-118.
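The provider block being praised here looks roughly like this (a sketch assembled from the namespace and backend names quoted in this review, not copied verbatim from the file):

vector_io:
- provider_id: faiss
  provider_type: inline::faiss
  config:
    persistence:
      namespace: vector_io::faiss     # key prefix for the FAISS index data
      backend: kv_default             # shared kv_sqlite backend defined under storage.backends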
160-161: Telemetry configuration is correct for Llama Stack 0.3.x.Based on learnings, telemetry provider configuration is not supported under the
providerssection in version 0.3.x. Thetelemetry.enabled: truepattern is correct.examples/vertexai-run.yaml (2)
65-71: FAISS vector_io configuration is correctly structured.The persistence configuration aligns with the storage backend definitions and follows the same pattern as other config files.
148-156: Vector stores, safety, and telemetry configuration looks good.The configuration correctly references the FAISS provider and sentence-transformers for embeddings, with telemetry enabled per 0.3.x patterns.
run.yaml (4)
31-39: Inference provider configuration looks correct.The OpenAI remote provider with environment-driven API key and allowed models, plus the inline sentence-transformers provider for embeddings, is properly structured.
68-74: FAISS vector_io provider configuration is correct.The inline FAISS provider with persistence namespace
vector_io::faissusingkv_defaultbackend properly integrates with the storage configuration.
118-123: Storage backend paths now use separate environment variables.The
KV_STORE_PATHandSQL_STORE_PATHenv vars avoid the collision issue from using a singleSQLITE_STORE_DIRfor all paths.
161-164: Safety and telemetry configuration is correct.The
default_shield_id: llama-guardaligns with the registered shield, andtelemetry.enabled: truefollows the 0.3.x pattern per learnings.examples/vllm-rhoai.yaml (1)
1-28: LGTM on top-level structure and stores.The version, image_name, API list, and individual store configurations (conversations_store, inference_store, metadata_store) are well-structured with hardcoded paths that avoid the previously flagged environment variable conflict issue.
examples/vllm-rhelai.yaml (1)
54-58: Safety provider now correctly configured.The
inline::llama-guardsafety provider withexcluded_categories: []is properly structured.tests/e2e-prow/rhoai/configs/run.yaml (1)
138-158: No action needed—embedding model configuration is correct.The
nomic-ai/nomic-embed-text-v1.5model is properly configured invector_stores.default_embedding_model(lines 156–158), which is the expected pattern. Analysis of other test config files (run-azure.yaml,run-rhaiis.yaml,run-rhelai.yaml,run-vertexai.yaml) shows they all follow the same pattern: embedding models are defined invector_stores, not inregistered_resources.models. Onlyrun-ci.yamlregisters an embedding model inregistered_resources, and it registers a different model than the one invector_stores, suggesting that configuration is an exception rather than a requirement. The current state inrun.yamlaligns with the standard configuration pattern used across the test suite.tests/e2e/configs/run-rhaiis.yaml (3)
1-2: LGTM!Version and image_name configuration are properly set for the RHAIIS environment.
119-122: LGTM!Storage paths now use separate environment variables (
KV_STORE_PATHandSQL_STORE_PATH) with appropriate defaults, which resolves the previous concern about database path conflicts.
161-162: LGTM!Telemetry is correctly configured at the top level with
enabled: true, which aligns with Llama Stack 0.3.x conventions. Based on learnings, telemetry provider configuration is not supported under theproviderssection.tests/e2e/features/faiss.feature (1)
1-6: LGTM!The Background setup correctly initializes the service and sets the REST API prefix to
/v1. This follows the behave BDD framework conventions as per coding guidelines.tests/e2e/configs/run-rhelai.yaml (4)
1-2: LGTM!Version and image_name properly configured for the RHEL AI environment.
54-58: LGTM!The safety provider is now properly configured with the llama-guard provider, resolving the previous inconsistency where safety was disabled but a shield was still registered.
119-122: LGTM!Storage paths use separate environment variables (
KV_STORE_PATHandSQL_STORE_PATH) with appropriate defaults, consistent with the pattern established across other configuration files.
161-162: LGTM!Telemetry configuration is correct for Llama Stack 0.3.x.
tests/e2e/configs/run-vertexai.yaml (4)
31-44: LGTM!The inference providers are properly configured with google-vertex as the primary provider, along with openai and sentence-transformers inline providers. The Vertex AI configuration correctly uses environment variables for project and location.
66-72: LGTM!The FAISS vector_io provider is correctly configured with persistence using the
kv_defaultbackend, consistent with the other configuration files in this PR.
118-121: LGTM!Storage backend paths use environment variables with appropriate defaults, consistent with the established pattern.
149-157: LGTM!Vector stores, safety, and telemetry configurations are properly set. The structure aligns with the centralized configuration pattern used across other e2e config files.
tests/e2e/configs/run-azure.yaml (4)
1-2: LGTM!Image name correctly set for Azure configuration.
66-72: LGTM!FAISS vector_io provider configuration is consistent with other e2e config files.
118-121: LGTM!Storage backend paths correctly use environment variables.
153-161: LGTM!Vector stores, safety, and telemetry configurations are properly structured and consistent with the centralized configuration pattern.
tests/e2e/configs/run-ci.yaml (6)
1-16: LGTM!The
image_name: starterand addition oftelemetryAPI are consistent with the PR's centralized configuration approach and align with the telemetry enablement at line 169.
17-28: LGTM!The store paths are now distinct hardcoded values, resolving the previous environment variable collision issue.
64-74: LGTM!The RAG tool runtime and FAISS vector_io providers are properly configured with appropriate persistence settings using the
kv_defaultbackend.
116-138: LGTM!The storage backends now use distinct environment variables (
KV_STORE_PATHandSQL_STORE_PATH) with separate default paths, correctly resolving the previous collision issue.
139-160: LGTM!The YAML structure is now correct with proper nesting under
models:,shields:, andtool_groups:keys.
166-169: LGTM!The safety configuration correctly references the registered
llama-guardshield, and the telemetry configuration follows Llama Stack 0.3.x conventions (enabled via flag without requiring a provider block). Based on learnings, this is the correct approach.
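For readers comparing configs, the endorsed pattern reduces to the following (a sketch showing only the keys discussed in this review):

safety:
  default_shield_id: llama-guard    # must match a shield under registered_resources.shields
telemetry:
  enabled: true                     # no provider block is needed on Llama Stack 0.3.x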
vector_stores:
  default_provider_id: faiss
  default_embedding_model: # Define the default embedding model for RAG
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for sentence-transformers references in YAML
rg -n "sentence-transformers" --type yaml -A 5 -B 2 | head -60
echo "---"
# Check for any model registration patterns for embedding models
rg -n "model_type.*embedding" --type yaml -C 2
echo "---"
# Also search for embedding_model patterns in config
rg -n "embedding_model" --type yaml -C 3 | head -80Repository: lightspeed-core/lightspeed-stack
Length of output: 8331
🏁 Script executed:
#!/bin/bash
# Search for registered_resources.models to understand registration structure
rg -n "registered_resources|models" run.yaml -C 3 | head -100Repository: lightspeed-core/lightspeed-stack
Length of output: 930
🏁 Script executed:
#!/bin/bash
# Look for provider configuration and how it handles model loading
rg -n "provider_id.*sentence-transformers" -A 10 -B 5Repository: lightspeed-core/lightspeed-stack
Length of output: 32128
🏁 Script executed:
#!/bin/bash
# Check if there's any code or documentation about provider initialization
fd -e "py" -e "md" -e "json" | xargs grep -l "sentence.transformers\|SentenceTransformers" 2>/dev/null | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 301
🏁 Script executed:
#!/bin/bash
# Search for documentation on inline::sentence-transformers behavior
rg -n "inline::sentence-transformers" --type md -A 5 -B 2 | head -80Repository: lightspeed-core/lightspeed-stack
Length of output: 2024
🏁 Script executed:
#!/bin/bash
# Check if there's any code that shows how models are resolved/loaded
fd -e "py" | xargs grep -l "sentence.transformers\|embedding.*model" 2>/dev/null | head -10Repository: lightspeed-core/lightspeed-stack
Length of output: 352
🏁 Script executed:
#!/bin/bash
# Look at the full registered_resources.models section in run.yaml to see what's actually there
sed -n '139,160p' run.yamlRepository: lightspeed-core/lightspeed-stack
Length of output: 688
Register the embedding model in registered_resources.models.
The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but does not have an entry in registered_resources.models. Add the embedding model registration matching the pattern used in run-ci.yaml and examples/run.yaml:
Suggested registration
registered_resources:
models:
- model_id: gpt-4o-mini
provider_id: openai
model_type: llm
provider_model_id: gpt-4o-mini
- model_id: nomic-ai/nomic-embed-text-v1.5
provider_id: sentence-transformers
model_type: embedding
provider_model_id: nomic-ai/nomic-embed-text-v1.5🤖 Prompt for AI Agents
In `@run.yaml` around lines 156 - 160, The default_embedding_model references
nomic-ai/nomic-embed-text-v1.5 but it’s not registered; add an entry under
registered_resources.models to register that embedding model using the same
pattern as other models: create a model object with model_id set to
nomic-ai/nomic-embed-text-v1.5, provider_id set to sentence-transformers,
model_type set to embedding, and provider_model_id set to
nomic-ai/nomic-embed-text-v1.5 so the default_embedding_model resolves
correctly.
 vector_stores:
-  default_provider_id: documentation_faiss
+  default_provider_id: faiss
   default_embedding_model:
     provider_id: sentence-transformers
-    model_id: all-mpnet-base-v2
+    model_id: nomic-ai/nomic-embed-text-v1.5
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if other config files register embedding models explicitly
rg -n "nomic-ai/nomic-embed-text" tests/e2e/configs/ --type yaml -B 5 -A 5Repository: lightspeed-core/lightspeed-stack
Length of output: 3262
🏁 Script executed:
#!/bin/bash
# Search for sentence-transformers provider implementation to understand registration
find . -type f -name "*.py" | xargs rg -l "sentence.transformers|sentence-transformers" | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 219
🌐 Web query:
Llama Stack sentence-transformers provider auto-register embedding models
💡 Result:
Short answer: Llama Stack includes an inline sentence-transformers provider for local embedding models, but it does not magically auto-register arbitrary HF/Sentence‑Transformers models — the distribution ships/assumes a couple default embedding models (granite-embedding-30m, granite-embedding-125m) and you must register additional models with the Llama Stack client (or enable the provider) before use. See docs and example register command below.
Key points and commands
- Provider: inline::sentence-transformers is shipped as a provider in the starter distribution (providers are disabled by default). [1][2]
- Defaults: the starter docs note granite-embedding-30m and granite-embedding-125m as the default sentence‑transformers embeddings. [3]
- Registering an extra model (example): run the server, then register via the client:
llama-stack-client models register all-MiniLM-L6-v2
--provider-id sentence-transformers
--provider-model-id all-minilm:latest
--metadata '{"embedding_dimension":384}'
--model-type embedding
(This example and workflow appear in the Llama Stack / Red Hat docs.) [3][4]
References
- Llama Stack providers list — inline::sentence-transformers. [LLama‑Stack docs]. [1]
- Llama Stack inline::sentence-transformers provider page. [2]
- Red Hat / Llama Stack “Working with Llama Stack” — default embedding models and example register command. [3]
- opendatahub / Llama Stack client usage showing registered models. [6]
🏁 Script executed:
#!/bin/bash
# Search for registered_resources or explicit model registration
rg -n "registered_resources|register.*nomic-embed-text" . --type yaml --type py -B 2 -A 2Repository: lightspeed-core/lightspeed-stack
Length of output: 3770
🏁 Script executed:
#!/bin/bash
# Check for model registration in Python code
rg -n "nomic-embed-text|nomic-ai" . --type py -B 3 -A 3Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
#!/bin/bash
# Look for setup or initialization files that might register models
fd -e py -e yaml | xargs rg -l "models.*register|register.*models" | grep -E "(setup|init|fixture|conftest)" | head -10Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
#!/bin/bash
# Search for sentence-transformers provider implementation
find . -type f -name "*.py" -path "*/sentence*transformers*" | head -20Repository: lightspeed-core/lightspeed-stack
Length of output: 58
🏁 Script executed:
#!/bin/bash
# Search for vector store implementation and default_embedding_model handling
rg -n "default_embedding_model|vector_stores" . --type py -B 2 -A 5 | head -100Repository: lightspeed-core/lightspeed-stack
Length of output: 9671
🏁 Script executed:
#!/bin/bash
# Check for test conftest files that might set up models
fd -name "conftest.py" | xargs cat | grep -A 10 -B 10 "nomic\|embedding\|register" 2>/dev/null | head -50Repository: lightspeed-core/lightspeed-stack
Length of output: 306
Add embedding model registration for nomic-ai/nomic-embed-text-v1.5 in registered_resources.
The default_embedding_model references nomic-ai/nomic-embed-text-v1.5 with provider_id: sentence-transformers, but this model is not registered in registered_resources.models. Unlike default models (granite-embedding-30m, granite-embedding-125m), the sentence-transformers provider does not auto-register arbitrary Hugging Face models—they must be explicitly registered via the Llama Stack client or in the configuration's registered_resources section to be available at runtime during RAG operations.
Either register the model in registered_resources.models or document the required pre-registration step:
registered_resources:
models:
- model_id: nomic-ai/nomic-embed-text-v1.5
provider_id: sentence-transformers
🤖 Prompt for AI Agents
In `@tests/e2e/configs/run-rhaiis.yaml` around lines 154 - 158, The
default_embedding_model references nomic-ai/nomic-embed-text-v1.5 but that model
isn't listed in registered_resources.models; add an entry under
registered_resources.models for model_id: nomic-ai/nomic-embed-text-v1.5 with
provider_id: sentence-transformers (or alternatively document that the model
must be pre-registered via the Llama Stack client) so the model referenced by
default_embedding_model is available at runtime.
Description
- FAISS support
- e2e rag test (making sure the retrieved chunks are from the docs in the db)
- Dependencies (pyproject.toml)
- Updated docs/providers, revoked support for shields as they are unusable now
- Cleaned up the examples folder, removed old unsupported configs (lls 0.2.x format)
Test:
Ask the agent "What is the title of the article from Paul?", expected answer: "great work"
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
NA
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
New Features
Improvements
Chores
Dependencies
Documentation