LCORE-1107: Bumped up llama-stack to RHOAI 3.2 and consuming downstream image #906
Conversation
Walkthrough
Replaces mode-based e2e workflow config selection with environment-driven CONFIG_ENVIRONMENT, adds quay.io Docker login steps in multiple workflows, bumps llama-stack packages to 0.3.4 (and test/constants), centralizes storage to ~/.llama/storage with unified kv/sql backends, removes legacy library/server-mode configs, and adds LLAMA_STACK_LOGGING env support.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 3
🧹 Nitpick comments (3)
README.md (1)
645-647: Uncomment the login command for clarity.

The login guidance includes a commented-out `podman login quay.io` command, which might confuse users. Consider uncommenting it or removing the comment, as users need to execute this command before starting services. Apply this diff to uncomment the login command:

```diff
 # Login to quay.io to access the downstream llama-stack image
-# podman login quay.io
+podman login quay.io

 # Start both services
 podman compose up --build
```

.github/workflows/e2e_tests.yaml (1)
142-147: Quay login step is fine; add minimal hardening

The quay login is wired correctly to `QUAY_DOWNSTREAM_USERNAME`/`QUAY_DOWNSTREAM_TOKEN` and will fail fast on bad credentials. To make it a bit more robust:

- Quote the secret to avoid edge cases with shell metacharacters:

```bash
echo "$QUAY_DOWNSTREAM_TOKEN" | docker login quay.io -u="$QUAY_DOWNSTREAM_USERNAME" --password-stdin
```

- Optionally add a simple guard so missing secrets fail with a clearer message before attempting login (a sketch follows).

Also please confirm in repo settings that both secrets are defined for this workflow's environment(s).
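A minimal sketch of such a guard, assuming bash and the secret names this workflow already uses (the step placement is hypothetical):

```bash
# Fail early with a clear message if the quay secrets are missing,
# instead of letting docker login produce an opaque credential error.
if [ -z "$QUAY_DOWNSTREAM_USERNAME" ] || [ -z "$QUAY_DOWNSTREAM_TOKEN" ]; then
  echo "❌ QUAY_DOWNSTREAM_USERNAME / QUAY_DOWNSTREAM_TOKEN are not set" >&2
  exit 1
fi
echo "$QUAY_DOWNSTREAM_TOKEN" | docker login quay.io -u="$QUAY_DOWNSTREAM_USERNAME" --password-stdin
```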
.github/workflows/e2e_tests_rhaiis.yaml (1)

125-130: Verify quay secret names and mirror hardening from main e2e workflow

The quay login step is structurally fine, but this workflow uses `QUAY_ROBOT_USERNAME`/`QUAY_ROBOT_TOKEN` whereas the main e2e workflow uses `QUAY_DOWNSTREAM_*`. That's OK if both pairs are intentionally defined, but it's an easy place for config drift.

- Confirm that `QUAY_ROBOT_USERNAME`/`QUAY_ROBOT_TOKEN` exist in the repo/org secrets and are the intended credentials for this job.
- For consistency and safety, you can also quote the secret:

```bash
echo "$QUAY_ROBOT_TOKEN" | docker login quay.io -u="$QUAY_ROBOT_USERNAME" --password-stdin
```

and optionally align the secret naming with the other workflows if they are meant to share the same robot account.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (17)
- .github/workflows/e2e_tests.yaml (2 hunks)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
- Containerfile (1 hunks)
- README.md (2 hunks)
- docker-compose.yaml (1 hunks)
- pyproject.toml (1 hunks)
- requirements.aarch64.txt (1 hunks)
- requirements.x86_64.txt (1 hunks)
- run.yaml (3 hunks)
- run_library.yaml (0 hunks)
- src/constants.py (1 hunks)
- test.containerfile (1 hunks)
- tests/e2e/configs/run-azure.yaml (2 hunks)
- tests/e2e/configs/run-ci.yaml (2 hunks)
- tests/e2e/configs/run-library.yaml (0 hunks)
- tests/e2e/features/info.feature (1 hunks)
💤 Files with no reviewable changes (2)
- run_library.yaml
- tests/e2e/configs/run-library.yaml
🧰 Additional context used
📓 Path-based instructions (4)
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Configure pylint with `source-roots = "src"`
Exclude `src/auth/k8s.py` from pyright type checking
Files:
pyproject.toml
tests/e2e/features/**/*.feature
📄 CodeRabbit inference engine (CLAUDE.md)
Use behave (BDD) framework with Gherkin feature files for end-to-end tests
Files:
tests/e2e/features/info.feature
src/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., `from auth import get_auth_dependency`)
All modules must start with descriptive docstrings explaining their purpose
Use `logger = logging.getLogger(__name__)` pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (`str | int`) and `Optional[Type]` or `Type | None`
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use `async def` for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: `Configuration` for config classes, `Error`/`Exception` for exceptions, `Resolver` for strategy patterns, `Interface` for abstract base classes
Abstract classes must use ABC with `@abstractmethod` decorators
Include complete type annotations for all class attributes in Python classes
Use `import logging` and module logger pattern with standard log levels: debug, info, warning, error
Files:
src/constants.py
src/**/constants.py
📄 CodeRabbit inference engine (CLAUDE.md)
Define shared constants in central `constants.py` file with descriptive comments
Files:
src/constants.py
🧠 Learnings (3)
📚 Learning: 2025-08-18T10:56:55.349Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:0-0
Timestamp: 2025-08-18T10:56:55.349Z
Learning: The lightspeed-stack project intentionally uses a "generic image" approach, bundling many dependencies directly in the base runtime image to work for everyone, rather than using lean base images with optional dependency groups.
Applied to files:
README.md, requirements.x86_64.txt, requirements.aarch64.txt, test.containerfile
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.
Applied to files:
requirements.x86_64.txt, requirements.aarch64.txt
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.
Applied to files:
requirements.x86_64.txt, requirements.aarch64.txt
🪛 GitHub Actions: E2E Tests
test.containerfile
[error] 2-2: Docker build failed: failed to resolve source metadata for quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 (401 UNAUTHORIZED).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (16)
docker-compose.yaml (1)
26-28: LGTM! The optional debug logging environment variable is a good addition for troubleshooting. The default empty value ensures no behavior change unless explicitly enabled.
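For illustration, enabling it at compose time might look like this. The `all=debug` value follows upstream llama-stack's documented `category=level` convention; treat the exact category names as an assumption to verify against upstream docs:

```bash
# Hypothetical usage: export the variable before starting the stack;
# llama-stack reads semicolon-separated category=level pairs.
export LLAMA_STACK_LOGGING="all=debug"
podman compose up --build
```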
README.md (1)
626-628: LGTM - Clear access requirement documentation. The IMPORTANT note clearly communicates the authentication requirement for accessing the downstream llama-stack image.
test.containerfile (1)
10-11: Storage path migration is consistent. The change from `.llama/distributions/ollama` to `.llama/storage` aligns with the storage path consolidation across the codebase.

pyproject.toml (1)
31-32: Both `llama-stack==0.3.4` and `llama-stack-client==0.3.4` are available on PyPI. The specified versions are valid and no action is required.

.github/workflows/e2e_tests_rhelai.yaml (1)
126-132: Secrets are already properly configured and used across workflows. The `QUAY_ROBOT_USERNAME` and `QUAY_ROBOT_TOKEN` secrets are referenced in multiple workflow files (e2e_tests_rhelai.yaml, e2e_tests_rhaiis.yaml, and e2e_tests.yaml), confirming they are configured in the repository. No action needed.

src/constants.py (1)
5-5: The version bump to llama-stack 0.3.4 is safe. The release contains only bug fixes and backports (Starlette dependency bump, queue initialization fixes, adapter policies, error handling improvements, and RBAC hardening) with no breaking changes reported.requirements.aarch64.txt (2)
requirements.aarch64.txt (2)

1396-1399: llama-stack and llama-stack-client bumped to 0.3.4 — versions aligned across all files. Version bump verified:
- Both requirements.aarch64.txt and requirements.x86_64.txt pin llama-stack==0.3.4 with matching hashes
- pyproject.toml specifies llama-stack==0.3.4 and llama-stack-client==0.3.4
- No stale 0.3.0–0.3.3 pins remain
- aarch64 and x86_64 maintain parity
- Test expectations (tests/e2e/features/info.feature) already expect version 0.3.4
1400-1406: llama-stack-client bumped to 0.3.4 — version-aligned with server. Confirmed: all four requirements files (aarch64, hermetic, torch, x86_64) and pyproject.toml pin llama-stack-client==0.3.4. No stale pins remain. The version is consistent across the codebase and actively used in 70+ locations. Approve.
requirements.x86_64.txt (1)
1396-1402: llama-stack / llama-stack-client bump looks consistent

The pins to `llama-stack==0.3.4` and `llama-stack-client==0.3.4` align with the PR's stated upgrade and the updated e2e expectations; no issues from this file alone. Please double-check that:

- `pyproject.toml` and `requirements.aarch64.txt` are on the same versions, and
- the e2e `info` tests pass against a real deployment with 0.3.4.

tests/e2e/features/info.feature (1)
19-19: llama-stack version assertion matches the upgrade; confirm LCS version

The `info` scenario now correctly expects `llama-stack` version `0.3.4`, which matches the dependency bump. Please just confirm that the service's own version reported in the same payload is still `0.3.1` (line 18). If the LCS app version was also bumped as part of this work, that expectation should be updated in lockstep to avoid flaky e2e failures.
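A sketch of what such an assertion typically looks like in a behave feature file; the step wording here is hypothetical, not the repo's actual step definitions:

```gherkin
Scenario: Check that the info endpoint reports expected versions
  When I access the REST API endpoint "info"
  Then the response should contain the service version "0.3.1"
  And the response should contain the llama-stack version "0.3.4"
```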
17-27: Centralized storage layout and path changes look coherent; confirm against llama-stack 0.3.4 schema

The run-ci config now:

- Moves `conversations_store`, `inference_store`, and `metadata_store` under `~/.llama/storage/...`, avoiding ephemeral/tmp-style locations.
- Introduces a `storage` block with:
  - `backends.kv_default`/`sql_default` pointing to dedicated KV/SQL SQLite files.
  - `stores` for `metadata`, `inference`, `conversations`, and `prompts`, wired by namespace/table_name.
- Updates providers (agents, batches, datasetio, eval, files, vector_io) to use those backends via `persistence`/`kvstore`/`metadata_store` settings.
- Registers the builtin `rag-runtime` tool group under `registered_resources.tool_groups`.

From a structure and wiring standpoint this all hangs together, and it matches the PR's goal of unifying storage. Since this depends on the exact config schema in `llama-stack==0.3.4` (sketched below), please verify against the current upstream docs that:

- The combination of legacy `*_store` keys with the new `storage` section is supported (or, if only `storage` is honored now, that any remaining legacy fields are safe to ignore), and
- The chosen namespaces/table names (`agents_state`, `agents_responses`, `faiss_store`, etc.) match how your code queries these stores so you don't silently bifurcate data between old and new locations.

Also applies to: 32-38, 44-45, 51-52, 64-65, 71-73, 102-105, 110-142
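A condensed sketch of the shape being described, assuming the llama-stack 0.3.x `storage` schema. Backend and store names are taken from the review above; the specific namespaces, table names, and db file names are illustrative assumptions, not a verified copy of the config:

```yaml
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: ~/.llama/storage/kvstore.db   # assumed filename
    sql_default:
      type: sql_sqlite
      db_path: ~/.llama/storage/sql_store.db # assumed filename
  stores:
    metadata:
      backend: kv_default
      namespace: registry                    # assumed namespace
    inference:
      backend: sql_default
      table_name: inference_store            # assumed table name
    conversations:
      backend: sql_default
      table_name: openai_conversations       # assumed table name
    prompts:
      backend: kv_default
      namespace: prompts
```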
tests/e2e/configs/run-azure.yaml (3)
16-26: Clarify duplication of storage store definitions.

The configuration defines storage stores in two places: top-level `conversations_store`, `inference_store`, `metadata_store` (lines 16–26) with direct `db_path` entries, and under the new `storage.stores` section (lines 120–134) with backend references. This duplication may cause confusion or runtime conflicts. Confirm whether:

- Both definitions are intentional (serving different purposes)
- The top-level definitions are legacy and should be removed
- Which one takes precedence during initialization
69-75: Verify `storage_dir` configuration in files provider.

The files provider uses a mix of centralized storage (`metadata_store` via the `sql_default` backend) and a direct filesystem path (`storage_dir`). Confirm that:

- `storage_dir: ~/.llama/storage` is the correct location for file storage and is distinct from database storage
- The directory is properly created and has appropriate permissions at runtime
- This mixed approach (DB metadata + filesystem storage) is the intended design for the files provider
112-134: Configuration pattern is correct for llama-stack 0.3.4.

The centralized storage/backends structure with `kv_sqlite` and `sql_sqlite` backend types is the standard, documented pattern for llama-stack 0.3.x. The configuration correctly uses named backends and references them via `backend` + `namespace` (for KV stores) or `table_name` (for SQL stores), matching documented examples. All namespace and table_name identifiers used by providers follow the established pattern.

run.yaml (2)
1-1: No action required. The version: 2 schema is already validated by the existing test suite in the repository. All test configurations and examples consistently use `version: '2'`, and the E2E tests would fail if llama-stack 0.3.4 did not recognize this version or support the new storage/backends configuration structure. The schema validation is implicitly confirmed through these passing test fixtures.
77-83: No action needed. The empty config `{}` for `sentence-transformers` is valid and intentional (sketched below) — the `inline::sentence-transformers` provider in llama-stack uses sensible defaults and explicitly documents empty config as the standard configuration pattern. The provider functions correctly for embedding tasks without additional configuration fields, provided the sentence-transformers library is available in the runtime environment. Likely an incorrect or invalid review comment.
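For reference, the provider entry under discussion typically looks like this inside a run.yaml `providers` section (a sketch; surrounding keys elided):

```yaml
providers:
  inference:
    - provider_id: sentence-transformers
      provider_type: inline::sentence-transformers
      config: {}   # empty config is the documented default
```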
The updated config-selection step in .github/workflows/e2e_tests.yaml:

```diff
+          CONFIG_ENVIRONMENT: ${{ matrix.environment || 'ci' }}
         run: |
           CONFIGS_DIR="tests/e2e/configs"
-          MODE="$CONFIG_MODE"
-          echo "Deployment mode: $MODE"
+          ENVIRONMENT="$CONFIG_ENVIRONMENT"
           echo "Looking for configurations in $CONFIGS_DIR/"
-          # Select config based on mode:
-          # - library mode: run-library.yaml (llama-stack 0.3.0 format)
-          # - server mode: run-ci.yaml (original format)
-          if [ "$MODE" == "library" ]; then
-            CONFIG_FILE="$CONFIGS_DIR/run-library.yaml"
-          else
-            CONFIG_FILE="$CONFIGS_DIR/run-ci.yaml"
-          fi
-          echo "Using configuration: $CONFIG_FILE"
-          if [ ! -f "$CONFIG_FILE" ]; then
-            echo "❌ Configuration not found: $CONFIG_FILE"
-            echo "Available configs:"
-            ls -la "$CONFIGS_DIR"/*.yaml
-            exit 1
-          fi
+          # List available configurations
+          if [ -d "$CONFIGS_DIR" ]; then
+            echo "Available configurations:"
+            ls -la "$CONFIGS_DIR"/*.yaml 2>/dev/null || echo "No YAML files found in $CONFIGS_DIR/"
+          else
+            echo "Configs directory '$CONFIGS_DIR' not found!"
+            exit 1
+          fi
+          # Determine which config file to use
+          CONFIG_FILE="$CONFIGS_DIR/run-$ENVIRONMENT.yaml"
+          echo "Looking for: $CONFIG_FILE"
+          if [ -f "$CONFIG_FILE" ]; then
+            echo "✅ Found config for environment: $ENVIRONMENT"
+            cp "$CONFIG_FILE" run.yaml
+            echo "✅ Config copied to run.yaml"
+          else
+            echo "❌ Configuration file not found: $CONFIG_FILE"
+            echo "Available files in $CONFIGS_DIR:"
+            ls -la "$CONFIGS_DIR/"
+            exit 1
+          fi
```
Config selection is correct; fix hard‑coded source path in summary
The new CONFIG_ENVIRONMENT / run-$ENVIRONMENT.yaml selection logic is sound and will work for ci and azure environments.
However, the summary step always prints:

```bash
echo "Source config: tests/e2e/configs/run-ci.yaml"
```

so azure runs will be mis-reported. Consider deriving this from the matrix value:

```diff
-echo "Source config: tests/e2e/configs/run-ci.yaml"
+echo "Source config: tests/e2e/configs/run-${{ matrix.environment }}.yaml"
```

This keeps logs accurate if more environments are added later.
Also applies to: 133-133
🤖 Prompt for AI Agents
In .github/workflows/e2e_tests.yaml around lines 96-126 (and also line 133), the
summary log prints a hard-coded path ("tests/e2e/configs/run-ci.yaml") which
misreports the actual chosen config for other environments; update the echo to
derive the path from the variables you already set — e.g. use CONFIGS_DIR and
CONFIG_ENVIRONMENT or the CONFIG_FILE variable (echo "Source config:
$CONFIG_FILE" or echo "Source config:
$CONFIGS_DIR/run-$CONFIG_ENVIRONMENT.yaml") so the log reflects the real
selected file for all matrix environments.
```dockerfile
RUN mkdir -p /opt/app-root/src/.llama/storage /opt/app-root/src/.llama/providers.d && \
    chown -R 1001:1001 /opt/app-root/src/.llama
```
🧩 Analysis chain
🏁 Script executed (against lightspeed-core/lightspeed-stack; output elided):

```bash
#!/bin/bash
# Search for any remaining references to the old storage path structure
rg -n "distributions/ollama" --type yaml --type py
```
Storage path migration is incomplete and will cause runtime failures.
The change in the Containerfile creates the new .llama/storage directory, but 65+ references to the old .llama/distributions/ollama path remain active across configuration files (tests/, examples/). These files still expect databases and storage at the old location. The migration must include updating all db_path, storage_dir, and related environment variable defaults in YAML configuration files to use .llama/storage instead of .llama/distributions/ollama.
🤖 Prompt for AI Agents
In Containerfile around lines 78-79, the Dockerfile creates .llama/storage but
the project still references the old .llama/distributions/ollama path across
config and example files; update all YAML and env defaults (db_path,
storage_dir, STORAGE_DIR, DB_PATH, etc.) in tests/, examples/, configs/ and any
scripts to point to .llama/storage (or make path configurable via a single env
var), ensure relative paths and permissions remain correct, and run a
project-wide search/replace for ".llama/distributions/ollama" to replace with
".llama/storage", then update any README or docs referencing the old path and
add a compatibility note or symlink creation step if backward compatibility is
required.
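A minimal sketch of the suggested sweep. The directory list and the in-place `sed` flag are assumptions; review the matches before rewriting anything:

```bash
# List files still pointing at the old layout...
rg -l '\.llama/distributions/ollama' tests/ examples/ configs/ || true

# ...then rewrite them to the new centralized path
rg -l '\.llama/distributions/ollama' tests/ examples/ configs/ \
  | xargs sed -i 's#\.llama/distributions/ollama#.llama/storage#g'
```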
```diff
@@ -1,13 +1,13 @@
 # Custom Red Hat llama-stack image with missing dependencies
-FROM quay.io/opendatahub/llama-stack:rhoai-v3.0-latest
+FROM quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2
```
Critical: Pipeline failure due to unauthorized access to base image.
The GitHub Actions pipeline is failing with 401 UNAUTHORIZED when attempting to pull quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2. This prevents the PR from being tested and merged.
Root causes:

- The GitHub Actions workflow may not have access to the `aipcc` organization on quay.io
- The authentication step added in `.github/workflows/e2e_tests_rhelai.yaml` may need to be added to other workflows as well
- The secrets (`QUAY_ROBOT_USERNAME`, `QUAY_ROBOT_TOKEN`) may not be configured or may lack permissions

Resolution steps:

- Verify that the robot account has pull permissions for `quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2`
- Add Docker login steps to all workflows that build or pull this image (similar to the change in `.github/workflows/e2e_tests_rhelai.yaml`)
- Verify the image tag `rhoai-3.2` exists in the registry
Run this script to check which workflow files might need the login step:
```bash
#!/bin/bash
# Find workflow files that reference the new base image or use test.containerfile
rg -l "test\.containerfile|quay\.io/aipcc" .github/workflows/
```

🧰 Tools

🪛 GitHub Actions: E2E Tests

[error] 2-2: Docker build failed: failed to resolve source metadata for quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 (401 UNAUTHORIZED).
🤖 Prompt for AI Agents
In test.containerfile around lines 2 to 2, the base image
quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2 is causing 401 UNAUTHORIZED in CI;
ensure the robot account can pull the image, confirm the quay robot credentials
(QUAY_ROBOT_USERNAME and QUAY_ROBOT_TOKEN) are configured in GitHub secrets with
pull permissions, add Docker/login (quay) steps to every GitHub Actions workflow
that builds or pulls this image (mirror the login step from
.github/workflows/e2e_tests_rhelai.yaml), and verify the rhoai-3.2 tag exists in
the registry; to find all workflows needing the change, run the provided search
script to list workflows referencing test.containerfile or quay.io/aipcc and
update each to perform quay.io authentication before any docker pull/build.
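A sketch of the login step to mirror; the step name and exact placement are assumptions based on the description of the change in .github/workflows/e2e_tests_rhelai.yaml:

```yaml
- name: Login to quay.io
  env:
    QUAY_ROBOT_USERNAME: ${{ secrets.QUAY_ROBOT_USERNAME }}
    QUAY_ROBOT_TOKEN: ${{ secrets.QUAY_ROBOT_TOKEN }}
  run: |
    # Authenticate before any docker pull/build that touches quay.io/aipcc
    echo "$QUAY_ROBOT_TOKEN" | docker login quay.io -u="$QUAY_ROBOT_USERNAME" --password-stdin
```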
PS: I will update other configs (RHAIIS, RHEL AI) in another PR |
(force-pushed 8daa53b to 681b7e2)
Actionable comments posted: 1
♻️ Duplicate comments (1)
.github/workflows/e2e_tests.yaml (1)
133-133: Fix hard-coded config path in log.

The summary still prints a hard-coded path that will misreport the actual config for the `azure` environment. Apply this diff to use the actual config file:

```diff
-          echo "Source config: tests/e2e/configs/run-ci.yaml"
+          echo "Source config: $CONFIG_FILE"
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (19)
- .github/workflows/e2e_tests.yaml (2 hunks)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
- Containerfile (1 hunks)
- README.md (2 hunks)
- docker-compose.yaml (1 hunks)
- pyproject.toml (1 hunks)
- requirements.aarch64.txt (1 hunks)
- requirements.x86_64.txt (1 hunks)
- run.yaml (3 hunks)
- run_library.yaml (0 hunks)
- src/constants.py (1 hunks)
- test.containerfile (1 hunks)
- tests/e2e/configs/run-azure.yaml (2 hunks)
- tests/e2e/configs/run-ci.yaml (2 hunks)
- tests/e2e/configs/run-library.yaml (0 hunks)
- tests/e2e/configuration/lightspeed-stack-library-mode.yaml (0 hunks)
- tests/e2e/configuration/lightspeed-stack-server-mode.yaml (0 hunks)
- tests/e2e/features/info.feature (1 hunks)
💤 Files with no reviewable changes (4)
- tests/e2e/configs/run-library.yaml
- run_library.yaml
- tests/e2e/configuration/lightspeed-stack-library-mode.yaml
- tests/e2e/configuration/lightspeed-stack-server-mode.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
- tests/e2e/features/info.feature
- Containerfile
- requirements.x86_64.txt
- src/constants.py
- README.md
🧰 Additional context used
📓 Path-based instructions (1)
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
pyproject.toml: Configure pylint with `source-roots = "src"`
Exclude `src/auth/k8s.py` from pyright type checking
Files:
pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-08-18T10:58:14.951Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:47-47
Timestamp: 2025-08-18T10:58:14.951Z
Learning: psycopg2-binary is required by some llama-stack providers in the lightspeed-stack project, so it cannot be replaced with psycopg v3 or moved to optional dependencies without breaking llama-stack functionality.
Applied to files:
requirements.aarch64.txt
📚 Learning: 2025-08-18T10:57:39.266Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:59-59
Timestamp: 2025-08-18T10:57:39.266Z
Learning: In the lightspeed-stack project, transitive dependencies like faiss-cpu are intentionally pinned as top-level dependencies to maintain better control over the dependency graph and avoid version conflicts when bundling ML/LLM tooling packages.
Applied to files:
requirements.aarch64.txt
📚 Learning: 2025-08-18T10:56:55.349Z
Learnt from: matysek
Repo: lightspeed-core/lightspeed-stack PR: 292
File: pyproject.toml:0-0
Timestamp: 2025-08-18T10:56:55.349Z
Learning: The lightspeed-stack project intentionally uses a "generic image" approach, bundling many dependencies directly in the base runtime image to work for everyone, rather than using lean base images with optional dependency groups.
Applied to files:
requirements.aarch64.txt, test.containerfile
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-pr
🔇 Additional comments (16)
test.containerfile (2)
2-2: LGTM! Verify image access is configured.

The base image change to `quay.io/aipcc/llama-stack/cpu-ubi9:rhoai-3.2` aligns with the PR objectives to consume the downstream image. The Docker login steps have been added to the three e2e test workflows, which should resolve the previous 401 UNAUTHORIZED issue. Ensure the robot account credentials (`QUAY_ROBOT_USERNAME`/`QUAY_ROBOT_TOKEN` or `QUAY_DOWNSTREAM_USERNAME`/`QUAY_DOWNSTREAM_TOKEN`) are configured in GitHub secrets and have pull permissions for this image. Also verify the `rhoai-3.2` tag exists in the registry.
10-10: LGTM! Storage path updated consistently.

The directory change from `.llama/distributions/ollama` to `.llama/storage` aligns with the centralized storage model introduced across all configuration files in this PR.

.github/workflows/e2e_tests_rhelai.yaml (1)
126-132: LGTM! Docker authentication added correctly.

The Docker login step is properly positioned before service startup and uses secure password-stdin input. This enables authenticated pulls of the new downstream base image.
.github/workflows/e2e_tests.yaml (2)
96-126: LGTM! Config selection logic improved.

The environment-based configuration selection using `CONFIG_ENVIRONMENT` is cleaner than the previous mode-based branching. The validation and error handling are appropriate.
142-148: LGTM! Note secret name inconsistency across workflows.

The Docker login step is correctly implemented. However, this workflow uses `QUAY_DOWNSTREAM_USERNAME`/`QUAY_DOWNSTREAM_TOKEN` while the RHELAI and RHAIIS workflows use `QUAY_ROBOT_USERNAME`/`QUAY_ROBOT_TOKEN`. Ensure both sets of secrets are configured, or standardize on one naming convention. Verify that the secrets `QUAY_DOWNSTREAM_USERNAME` and `QUAY_DOWNSTREAM_TOKEN` are configured in the GitHub repository settings.

.github/workflows/e2e_tests_rhaiis.yaml (1)
125-131: LGTM! Docker authentication added correctly.

The Docker login step matches the implementation in the RHELAI workflow and is properly positioned before service startup.
tests/e2e/configs/run-ci.yaml (3)
17-27: LGTM! Storage paths centralized.

The consolidation of database paths under `~/.llama/storage/` improves organization and aligns with the new storage model introduced in this PR.
32-106: LGTM! Provider configs migrated to namespace/backend pattern.

The migration from direct `db_path` entries to the namespace/backend pattern for persistence and kvstore configurations is a clean refactoring that centralizes storage management. This approach aligns with llama-stack 0.3.4's two-level storage architecture (backend definitions + store references), allowing multiple stores to reuse a single backend and avoiding duplicated configuration.
110-142: Storage and registered_resources configuration is correct for llama-stack 0.3.4.

The new `storage` section with `backends` and `stores` aligns with the llama-stack 0.3.4 documented structure (a breaking change introduced in 0.3.x that consolidates backend declarations and reduces credential duplication). The empty `models` list in `registered_resources` is intentional and consistent across CI and test configs in this codebase: models are registered at runtime. The `tool_groups` entry mapping `builtin::rag` to `rag-runtime` is correctly configured and matches the pattern used throughout example and configuration files.

tests/e2e/configs/run-azure.yaml (3)
17-27: LGTM! Storage paths centralized consistently.

The storage consolidation matches the pattern in `run-ci.yaml`, ensuring consistency across environments.
83-85: LGTM! Sentence-transformers provider added.

The addition of the `sentence-transformers` inference provider with an empty config is appropriate for embedding support.
112-148: LGTM! Storage and model registration configured for Azure.

The storage backend structure is consistent with `run-ci.yaml`, and the pre-registered `gpt-4o-mini` model in `registered_resources` is appropriate for the Azure environment, unlike the CI environment which leaves the models list empty.

run.yaml (2)
17-27: LGTM! Main config storage paths centralized.

The storage path updates in the main `run.yaml` are consistent with the test configurations, ensuring uniform storage management across all environments.
32-142: LGTM! Provider and storage configuration unified.

The migration to the namespace/backend pattern for provider configs and the addition of the `storage` and `registered_resources` sections bring the main config in line with the new llama-stack 0.3.4 architecture. The changes are consistent with the test configurations.

pyproject.toml (1)
31-32: Version bump to llama-stack 0.3.4 and llama-stack-client 0.3.4 is safe and compatible.

The bump introduces no breaking changes (only bugfixes and chores per release notes). The project's Python requirement (`>=3.12,<3.14`) is compatible with llama-stack-client 0.3.4's requirement of Python 3.12+. Additionally, the pyproject.toml correctly implements the required coding guidelines: pylint is configured with `source-roots = "src"` and pyright excludes `src/authentication/k8s.py` from type checking.

requirements.aarch64.txt (1)
1396-1406: Version alignment and regeneration look good. Both `llama-stack==0.3.4` and `llama-stack-client==0.3.4` are properly pinned, Python 3.12 is supported by this release, and the identical hashes across `requirements.aarch64.txt` and `requirements.x86_64.txt` confirm synchronized regeneration with no cross-arch drift.
```diff
+      # Enable debug logging if needed
+      - LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}
```
🧩 Analysis chain
🏁 Scripts executed (against lightspeed-core/lightspeed-stack; outputs elided):

```bash
# Find references to LLAMA_STACK_LOGGING and locate the containerfile
rg "LLAMA_STACK_LOGGING" -i -A 3 -B 3

# Locate and examine the test.containerfile referenced in docker-compose.yaml
fd "test.containerfile" -x cat {}

# Check for any documentation or README about environment variables
fd -e "md" -e "txt" | head -20 | xargs grep -l "LLAMA_STACK_LOGGING" 2>/dev/null || echo "No docs found with LLAMA_STACK_LOGGING"

# Find and examine run.yaml
fd "run.yaml" -x cat {}

# Check for any logging-related configuration or comments in the repo
rg "logging|LOGGING" -i -l | head -10 | xargs cat 2>/dev/null | grep -i "logging\|debug" | head -20

# Check README, docs, or scripts that might explain LLAMA_STACK_LOGGING
fd -e "md" -e "sh" -e "py" -e "txt" | xargs grep -l "LLAMA_STACK" 2>/dev/null | head -10

# Look for any CI/CD or entrypoint scripts that might use or document this
fd -e "sh" -e "py" | xargs grep -B2 -A2 "LLAMA_STACK_LOGGING" 2>/dev/null

# Check if there's a Makefile or setup guide
fd "Makefile\|makefile\|setup\|SETUP\|README" | head -10 | xargs cat 2>/dev/null | grep -i "llama\|logging" | head -20

# Read README to understand LLAMA_STACK_LOGGING context
cat README.md | grep -i "logging\|environment\|variable" -A 3 -B 3 | head -50

# Check the tutorial script for LLAMA_STACK usage patterns
cat scripts/llama_stack_tutorial.sh | head -100

# Check constants.py for any logging-related constants
cat src/constants.py
```
Document expected values for LLAMA_STACK_LOGGING or remove if unused.
LLAMA_STACK_LOGGING is not documented in this repository. The variable defaults to empty, and the comment "Enable debug logging if needed" doesn't clarify what values are accepted or whether the llama-stack container actually consumes it. Either document the supported values (e.g., debug|info|warn|error) or confirm the variable is intentionally optional and can be omitted entirely when not set in the host environment.
🤖 Prompt for AI Agents
In docker-compose.yaml around lines 26 to 28, the LLAMA_STACK_LOGGING env var is
undocumented and ambiguous; either document its accepted values or remove it if
unused — update the file to one of these: a) replace the placeholder comment
with a clear comment listing supported values (e.g.,
LLAMA_STACK_LOGGING=debug|info|warn|error) and add that the default is
empty/unspecified, or b) if the llama-stack image does not consume this
variable, remove the LLAMA_STACK_LOGGING line entirely from the service
definition and any related docs; ensure any chosen option is reflected in repo
README or an environment variable example file.
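A sketch of option (a), assuming llama-stack's `category=level` logging convention; the exact supported categories and levels should be confirmed against upstream docs:

```yaml
environment:
  # Optional llama-stack log tuning: semicolon-separated category=level pairs,
  # e.g. "all=debug" or "server=debug;core=info" (values assumed from upstream
  # llama-stack docs). Empty (the default) leaves upstream defaults untouched.
  - LLAMA_STACK_LOGGING=${LLAMA_STACK_LOGGING:-}
```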
asimurka left a comment: LGTM

tisnik left a comment: LGTM
Description
Breaking change: consumers of server mode will need to request access to the downstream llama-stack image from AIPCC.