
Conversation

Contributor

@are-ces are-ces commented Dec 16, 2025

Description

Updating RHEL AI, RHAIIS, RHOAI configs to work with LLS 0.3.x

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)
NA

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features
    • Added multiple example configurations showcasing multi-provider deployments, vector storage, safety shields, embedding model registration, RAG tooling, batch processing, telemetry, and modular storage for notebook ecosystems.
  • Chores
    • Updated CI workflows to use downstream credentials for container registry login.


coderabbitai bot commented Dec 16, 2025

Walkthrough

Updates CI workflow Docker login environment variable names and adds/restructures several YAML configuration files for vLLM/RHEL AI/RHOAI deployments and e2e tests, migrating from inline provider layouts to modular, store-backed configurations.

Changes

  • CI workflow credential renames (.github/workflows/e2e_tests_rhaiis.yaml, .github/workflows/e2e_tests_rhelai.yaml): Replace QUAY_ROBOT_USERNAME/QUAY_ROBOT_TOKEN with QUAY_DOWNSTREAM_USERNAME/QUAY_DOWNSTREAM_TOKEN in the Docker login step for quay.io.
  • New example vLLM configurations (examples/vllm-rhaiis.yaml, examples/vllm-rhelai.yaml, examples/vllm-rhoai.yaml): Add three new multi-provider, multi-store YAML configs defining providers, storage backends/stores, registered resources (models, shields), vector stores, telemetry, and environment-driven fields.
  • Reworked e2e-prow configuration (tests/e2e-prow/rhoai/configs/run.yaml): Replace the monolithic inline provider model with a modular storage-backed config: new storage/backends, registered_resources, vector_stores, safety/shields, and reorganized provider sections.
  • Reworked e2e test configurations (tests/e2e/configs/run-rhaiis.yaml, tests/e2e/configs/run-rhelai.yaml): Migrate from inline/local provider settings to consolidated providers, storage backends/stores, registered_resources (models/shields), vector store defaults, safety defaults, and env-driven path/model references.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to focus during review:
    • CI workflow credential variable rename consistency with repository secrets and other workflows
    • Cross-file consistency of provider IDs, provider_model_id, and registered_resources references
    • Storage backend/store names, namespaces, and env-driven path variables across examples and tests
    • Vector store and embedding model references and safety/shield IDs

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
  • radofuchs
  • asimurka

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
  • Description check: ✅ Passed. Skipped because CodeRabbit's high-level summary is enabled.
  • Title check: ✅ Passed. The title accurately describes the main objective of the PR: updating configuration files for three platforms (RHEL AI, RHAIIS, RHOAI) to be compatible with LLS 0.3.x, which aligns with the substantial config restructuring across all modified files.
  • Docstring coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; check skipped.


@are-ces changed the title from "Updating RHEL AI, RHAIIS, RHOAI configs to work with LLS 0.3.x" to "Update RHEL AI, RHAIIS, RHOAI configs to work with LLS 0.3.x" on Dec 16, 2025
@are-ces changed the title from "Update RHEL AI, RHAIIS, RHOAI configs to work with LLS 0.3.x" to "Update llama-stack configs for RHEL AI, RHAIIS, RHOAI to work with LLS 0.3.x" on Dec 16, 2025
@are-ces changed the title from "Update llama-stack configs for RHEL AI, RHAIIS, RHOAI to work with LLS 0.3.x" to "Update llama-stack configs for RHEL AI, RHAIIS, RHOAI to LLS 0.3.x" on Dec 16, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
examples/vllm-rhelai.yaml (1)

19-23: Trailing whitespace and TLS verification consideration.

Line 23 has trailing whitespace after 2048. Additionally, tls_verify: false is set which disables certificate verification. Ensure this is acceptable for your deployment environment; consider enabling TLS verification for production deployments.

       url: http://${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}/v1/
       api_token: ${env.RHEL_AI_API_KEY}
       tls_verify: false
-      max_tokens: 2048  
+      max_tokens: 2048
examples/vllm-rhaiis.yaml (1)

19-23: Trailing whitespace; hardcoded port differs from RHEL AI config.

Line 23 has trailing whitespace. Also note that this config uses a hardcoded port :8000 while vllm-rhelai.yaml uses a dynamic port via ${env.RHEL_AI_PORT}. Verify this is intentional for the RHAIIS deployment pattern.

       url: http://${env.RHAIIS_URL}:8000/v1/
       api_token: ${env.RHAIIS_API_KEY}
       tls_verify: false
-      max_tokens: 2048  
+      max_tokens: 2048
tests/e2e/configs/run-rhaiis.yaml (1)

19-23: Trailing whitespace on max_tokens line.

Line 23 has trailing whitespace after 2048.

       url: http://${env.RHAIIS_URL}:8000/v1/
       api_token: ${env.RHAIIS_API_KEY}
       tls_verify: false
-      max_tokens: 2048  
+      max_tokens: 2048
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bace7c4 and f192004.

📒 Files selected for processing (8)
  • .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
  • .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
  • examples/vllm-rhaiis.yaml (1 hunks)
  • examples/vllm-rhelai.yaml (1 hunks)
  • examples/vllm-rhoai.yaml (1 hunks)
  • tests/e2e-prow/rhoai/configs/run.yaml (2 hunks)
  • tests/e2e/configs/run-rhaiis.yaml (1 hunks)
  • tests/e2e/configs/run-rhelai.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (17)
.github/workflows/e2e_tests_rhaiis.yaml (1)

125-130: Credential migration is consistent with rhelai workflow.

The secret migration is implemented identically to the RHEL AI workflow, maintaining consistency across both E2E test environments.

The same verification applies here: ensure QUAY_DOWNSTREAM_USERNAME and QUAY_DOWNSTREAM_TOKEN secrets are configured in the repository settings. The verification script provided in the review of .github/workflows/e2e_tests_rhelai.yaml will check both files.

.github/workflows/e2e_tests_rhelai.yaml (1)

126-131: Credential migration is correctly implemented across all workflows.

The approach of mapping new secret names (QUAY_DOWNSTREAM_USERNAME and QUAY_DOWNSTREAM_TOKEN) to existing local environment variable names (QUAY_ROBOT_USERNAME and QUAY_ROBOT_TOKEN) maintains backward compatibility. This pattern is consistently applied across all three e2e test workflows, with no workflows directly referencing the old secret names.

Ensure that the new secrets QUAY_DOWNSTREAM_USERNAME and QUAY_DOWNSTREAM_TOKEN are properly configured in the GitHub repository settings before this workflow runs, otherwise authentication will fail.
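The mapping described above, feeding the new repository secrets into the pre-existing local env variable names so later commands are unchanged, would look roughly like this in the workflow's login step (a sketch; the actual step layout and action usage in the workflow may differ):

```yaml
# Sketch of the credential rename: new secret names on the right,
# unchanged env var names on the left, so the login command itself
# does not need to change.
- name: Login to quay.io
  env:
    QUAY_ROBOT_USERNAME: ${{ secrets.QUAY_DOWNSTREAM_USERNAME }}
    QUAY_ROBOT_TOKEN: ${{ secrets.QUAY_DOWNSTREAM_TOKEN }}
  run: |
    echo "$QUAY_ROBOT_TOKEN" | docker login -u "$QUAY_ROBOT_USERNAME" --password-stdin quay.io
```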

examples/vllm-rhelai.yaml (1)

134-140: Conditional shield configuration looks correct.

The shield configurations use shell-style parameter expansion (${env.SAFETY_MODEL:+llama-guard} and ${env.SAFETY_MODEL:=}) to optionally enable shields based on environment variables. This allows flexible deployment where safety models can be conditionally enabled.
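For readers unfamiliar with this substitution style: the `${env.VAR:+word}` and `${env.VAR:=word}` forms mirror POSIX shell parameter expansion. The snippet below illustrates the semantics in plain shell (the `SAFETY_MODEL` value is hypothetical; this is not llama-stack code):

```shell
# POSIX parameter expansion mirrored by the ${env....} config syntax:
#   ${VAR:+word}  -> "word" if VAR is set and non-empty, otherwise empty
#   ${VAR:=word}  -> VAR's value, assigning "word" first if VAR is unset/empty
unset SAFETY_MODEL
printf 'provider_id=%s\n' "${SAFETY_MODEL:+llama-guard}"   # provider_id=

SAFETY_MODEL='meta-llama/Llama-Guard-3-8B'                  # hypothetical value
printf 'provider_id=%s\n' "${SAFETY_MODEL:+llama-guard}"   # provider_id=llama-guard
```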

examples/vllm-rhaiis.yaml (1)

1-157: Configuration structure is consistent with the modular LLS 0.3.x pattern.

The file follows the same structure as other example configs with appropriate RHAIIS-specific environment variables. Storage backends, resource registrations, and provider configurations align with the expected modular architecture.

tests/e2e-prow/rhoai/configs/run.yaml (2)

129-132: Hardcoded model ID appropriate for test configuration.

Using a hardcoded model (meta-llama/Llama-3.2-1B-Instruct) is appropriate for a test configuration, providing deterministic behavior. The smaller 1B model is suitable for e2e testing.


98-120: Storage configuration follows the new modular backend pattern.

The storage section properly defines kv_default and sql_default backends with environment-driven paths and appropriate stores for metadata, inference, conversations, and prompts. This aligns with the LLS 0.3.x architecture.
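A rough shape of the storage section being described is sketched below. The field names and backend types here are illustrative, inferred from the review notes rather than copied from the actual run.yaml, so treat this as orientation only:

```yaml
# Illustrative sketch only -- names inferred from the review, not from the file.
storage:
  backends:
    kv_default:
      type: kv_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=/tmp}/kv_store.db
    sql_default:
      type: sql_sqlite
      db_path: ${env.SQLITE_STORE_DIR:=/tmp}/sql_store.db
  stores:
    metadata:
      backend: kv_default
      namespace: registry
    inference:
      backend: sql_default
      table_name: inference_store
    conversations:
      backend: sql_default
      table_name: conversations
    prompts:
      backend: kv_default
      namespace: prompts
```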

examples/vllm-rhoai.yaml (2)

129-132: Hardcoded model ID differs from other example configs.

This example config uses a hardcoded model ID (meta-llama/Llama-3.2-1B-Instruct) while vllm-rhelai.yaml and vllm-rhaiis.yaml use environment variables (${env.RHEL_AI_MODEL} and ${env.RHAIIS_MODEL} respectively). Consider whether this should also use an environment variable for consistency and flexibility.


15-26: Inference provider configuration looks correct.

The vLLM remote provider and sentence-transformers inline provider are properly configured. No trailing whitespace issues in this file.

tests/e2e/configs/run-rhaiis.yaml (1)

121-146: Resource registrations properly configured for RHAIIS testing.

The registered resources section correctly defines the embedding model with metadata and uses environment variables for the RHAIIS model. Shield configurations follow the same conditional pattern as other configs.

tests/e2e/configs/run-rhelai.yaml (8)

1-13: LGTM!

The version, image name, and APIs configuration looks well-structured. The comprehensive API list aligns with the modular LLS 0.3.x configuration approach.


19-23: Verify that tls_verify: false is intentional for e2e tests.

Disabling TLS verification is acceptable for e2e tests with self-signed certificates or local endpoints, but this setting should not propagate to production configurations.


27-59: LGTM!

The provider configurations for vector_io, files, safety, and agents are well-structured with proper namespace separation and consistent backend references.


60-97: LGTM!

The remaining provider configurations follow consistent patterns with proper namespace separation for kvstore backends.


98-120: LGTM!

The storage configuration with separate backends for key-value and SQL storage, along with properly configured stores, follows a clean modular pattern.


134-140: Verify behavior when shield environment variables are unset.

The conditional syntax ${env.SAFETY_MODEL:+llama-guard} will result in an empty provider_id when SAFETY_MODEL is not set. Confirm that LLS 0.3.x gracefully handles shields with empty/null provider_id, or consider omitting these entries entirely when the environment variables are not configured.


147-150: LGTM!

Server and telemetry configuration is appropriate for e2e testing.


151-157: LGTM!

The vector_stores and safety defaults correctly reference providers and resources defined earlier in the configuration.

@are-ces force-pushed the update-provider-configs-clean branch from 183e135 to f192004 on December 16, 2025 at 15:07
@are-ces marked this pull request as draft on December 16, 2025 at 15:09
@are-ces force-pushed the update-provider-configs-clean branch from f192004 to 38f0904 on December 16, 2025 at 15:14
@are-ces marked this pull request as ready for review on December 16, 2025 at 15:15

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/e2e/configs/run-rhaiis.yaml (1)

134-140: Verify conditional shield registration handles missing environment variables correctly.

The conditional syntax ${env.SAFETY_MODEL:+llama-guard} will result in an empty string for provider_id when the environment variable is not set, potentially causing runtime errors.

tests/e2e/configs/run-rhelai.yaml (1)

134-140: Verify conditional shield registration handles missing environment variables correctly.

The conditional syntax ${env.SAFETY_MODEL:+llama-guard} will result in an empty string for provider_id when the environment variable is not set, potentially causing runtime errors.

🧹 Nitpick comments (1)
tests/e2e/configs/run-rhaiis.yaml (1)

16-26: Consider making the port configurable for consistency.

The RHAIIS URL hardcodes port 8000, while the RHELAI configuration uses an environment variable RHEL_AI_PORT for flexibility. Consider using ${env.RHAIIS_PORT:=8000} to allow port customization while maintaining the default.

Apply this diff if port flexibility is desired:

-      url: http://${env.RHAIIS_URL}:8000/v1/
+      url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
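One note on the suggested `:=` form: in plain shell semantics, `${VAR:=default}` both substitutes and assigns the default, so subsequent references see the value as well. A minimal plain-shell illustration (not llama-stack code):

```shell
# ${VAR:=default} substitutes the default AND assigns it to VAR.
unset RHAIIS_PORT
printf 'http://host:%s/v1/\n' "${RHAIIS_PORT:=8000}"   # http://host:8000/v1/
printf '%s\n' "$RHAIIS_PORT"                           # 8000 (assigned above)
```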
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f192004 and 38f0904.

📒 Files selected for processing (8)
  • .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
  • .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
  • examples/vllm-rhaiis.yaml (1 hunks)
  • examples/vllm-rhelai.yaml (1 hunks)
  • examples/vllm-rhoai.yaml (1 hunks)
  • tests/e2e-prow/rhoai/configs/run.yaml (2 hunks)
  • tests/e2e/configs/run-rhaiis.yaml (1 hunks)
  • tests/e2e/configs/run-rhelai.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • .github/workflows/e2e_tests_rhelai.yaml
  • examples/vllm-rhelai.yaml
  • examples/vllm-rhoai.yaml
  • examples/vllm-rhaiis.yaml
  • .github/workflows/e2e_tests_rhaiis.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: E2E: library mode / vertexai
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: server mode / vertexai
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (6)
tests/e2e-prow/rhoai/configs/run.yaml (3)

15-97: LGTM!

The provider configurations are well-structured with proper references to storage backends and correct YAML syntax. The modular, multi-provider approach aligns with the LLS 0.3.x architecture.


98-157: LGTM!

The storage, registered resources, and server configurations are properly structured with appropriate environment variable fallbacks and defaults. The modular design with empty placeholder lists for datasets, scoring_fns, and benchmarks allows for future extensibility.


129-132: Verify whether this hardcoded model ID should be consistent with other configs.

The inconsistency is confirmed: RHOAI config uses a hardcoded meta-llama/Llama-3.2-1B-Instruct while run-rhaiis.yaml uses ${env.RHAIIS_MODEL} and run-rhelai.yaml uses ${env.RHEL_AI_MODEL}. Either align this config with the environment variable pattern used elsewhere, or document why RHOAI requires a specific hardcoded model.

tests/e2e/configs/run-rhaiis.yaml (1)

1-157: Configuration structure is solid.

The overall configuration correctly implements the modular, store-backed architecture for LLS 0.3.x. The use of environment variables for model IDs provides good flexibility for different test scenarios.

tests/e2e/configs/run-rhelai.yaml (2)

16-26: LGTM! Good use of environment variables for URL construction.

The inference provider configuration properly uses environment variables for both the URL and port (${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}), providing maximum flexibility for different deployment scenarios.


1-157: Configuration structure is solid.

The overall configuration correctly implements the modular, store-backed architecture for LLS 0.3.x with appropriate use of environment variables throughout.

Comment on lines +134 to +140
shields:
  - shield_id: llama-guard
    provider_id: ${env.SAFETY_MODEL:+llama-guard}
    provider_shield_id: ${env.SAFETY_MODEL:=}
  - shield_id: code-scanner
    provider_id: ${env.CODE_SCANNER_MODEL:+code-scanner}
    provider_shield_id: ${env.CODE_SCANNER_MODEL:=}

⚠️ Potential issue | 🟠 Major

Verify conditional shield registration handles missing environment variables correctly.

The conditional syntax ${env.SAFETY_MODEL:+llama-guard} will result in an empty string for provider_id when the environment variable is not set. This could lead to runtime errors when shields are referenced but have no valid provider.

Verify that LLS 0.3.x handles shields with empty provider_id gracefully, or ensure these environment variables are always set in the test environment. Run the following to check how shields are referenced:

#!/bin/bash
# Check if shields with empty provider_id are handled or if env vars are always set
rg -n "SAFETY_MODEL|CODE_SCANNER_MODEL" --type=yaml --type=sh

@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit a6fb210 into lightspeed-core:main Dec 16, 2025
19 of 27 checks passed