Update llama-stack configs for RHEL AI, RHAIIS, RHOAI to LLS 0.3.x #925
Conversation
Walkthrough

Updates CI workflow Docker login environment variable names and adds/restructures several YAML configuration files for vLLM/RHEL AI/RHOAI deployments and e2e tests, migrating from inline provider layouts to modular, store-backed configurations.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (3)
examples/vllm-rhelai.yaml (1)

19-23: Trailing whitespace and TLS verification consideration.

Line 23 has trailing whitespace after `2048`. Additionally, `tls_verify: false` is set, which disables certificate verification. Ensure this is acceptable for your deployment environment; consider enabling TLS verification for production deployments.

```diff
  url: http://${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}/v1/
  api_token: ${env.RHEL_AI_API_KEY}
  tls_verify: false
- max_tokens: 2048 
+ max_tokens: 2048
```
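If verification is enabled for production, the provider block would presumably look something like the sketch below. This is a hedged sketch rather than the reviewed config: the `RHEL_AI_CA_BUNDLE` variable is hypothetical, and whether `tls_verify` accepts a CA-bundle path in addition to a boolean should be checked against the LLS 0.3.x vLLM provider schema.

```yaml
# Sketch only: production-style TLS for the remote vLLM provider.
url: https://${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}/v1/   # https once TLS is verified
api_token: ${env.RHEL_AI_API_KEY}
tls_verify: true                        # verify the server certificate chain
# tls_verify: ${env.RHEL_AI_CA_BUNDLE}  # hypothetical: path to an internal CA bundle, if supported
max_tokens: 2048
```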
examples/vllm-rhaiis.yaml (1)

19-23: Trailing whitespace; hardcoded port differs from RHEL AI config.

Line 23 has trailing whitespace. Also note that this config uses a hardcoded port `:8000` while `vllm-rhelai.yaml` uses a dynamic port via `${env.RHEL_AI_PORT}`. Verify this is intentional for the RHAIIS deployment pattern.

```diff
  url: http://${env.RHAIIS_URL}:8000/v1/
  api_token: ${env.RHAIIS_API_KEY}
  tls_verify: false
- max_tokens: 2048 
+ max_tokens: 2048
```

tests/e2e/configs/run-rhaiis.yaml (1)
19-23: Trailing whitespace on max_tokens line.

Line 23 has trailing whitespace after `2048`.

```diff
  url: http://${env.RHAIIS_URL}:8000/v1/
  api_token: ${env.RHAIIS_API_KEY}
  tls_verify: false
- max_tokens: 2048 
+ max_tokens: 2048
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
- examples/vllm-rhaiis.yaml (1 hunks)
- examples/vllm-rhelai.yaml (1 hunks)
- examples/vllm-rhoai.yaml (1 hunks)
- tests/e2e-prow/rhoai/configs/run.yaml (2 hunks)
- tests/e2e/configs/run-rhaiis.yaml (1 hunks)
- tests/e2e/configs/run-rhelai.yaml (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (17)
.github/workflows/e2e_tests_rhaiis.yaml (1)
125-130: Credential migration is consistent with the rhelai workflow.

The secret migration is implemented identically to the RHEL AI workflow, maintaining consistency across both E2E test environments.

The same verification applies here: ensure the `QUAY_DOWNSTREAM_USERNAME` and `QUAY_DOWNSTREAM_TOKEN` secrets are configured in the repository settings. The verification script provided in the review of `.github/workflows/e2e_tests_rhelai.yaml` will check both files.

.github/workflows/e2e_tests_rhelai.yaml (1)
126-131: Credential migration is correctly implemented across all workflows.

The approach of mapping new secret names (`QUAY_DOWNSTREAM_USERNAME` and `QUAY_DOWNSTREAM_TOKEN`) to existing local environment variable names (`QUAY_ROBOT_USERNAME` and `QUAY_ROBOT_TOKEN`) maintains backward compatibility. This pattern is consistently applied across all three e2e test workflows, with no workflows directly referencing the old secret names.

Ensure that the new secrets `QUAY_DOWNSTREAM_USERNAME` and `QUAY_DOWNSTREAM_TOKEN` are properly configured in the GitHub repository settings before this workflow runs, otherwise authentication will fail.
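Based on that description, the login step presumably maps the new secrets onto the old env var names along these lines — a sketch, not the actual workflow diff; the step name and layout are illustrative:

```yaml
# Sketch: new repository secrets feed the existing env var names,
# so downstream steps keep working unchanged.
- name: Login to quay.io
  env:
    QUAY_ROBOT_USERNAME: ${{ secrets.QUAY_DOWNSTREAM_USERNAME }}
    QUAY_ROBOT_TOKEN: ${{ secrets.QUAY_DOWNSTREAM_TOKEN }}
  run: docker login -u "$QUAY_ROBOT_USERNAME" -p "$QUAY_ROBOT_TOKEN" quay.io
```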
examples/vllm-rhelai.yaml (1)

134-140: Conditional shield configuration looks correct.

The shield configurations use shell-style parameter expansion (`${env.SAFETY_MODEL:+llama-guard}` and `${env.SAFETY_MODEL:=}`) to optionally enable shields based on environment variables. This allows flexible deployment where safety models can be conditionally enabled.
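Because the substitution is shell-style, both forms can be sanity-checked directly in bash (`example-guard-model` is just a placeholder value, and this assumes llama-stack's `${env.…}` substitution mirrors POSIX parameter expansion):

```bash
unset SAFETY_MODEL
echo "provider_id=${SAFETY_MODEL:+llama-guard}"  # provider_id=  (unset -> empty)
echo "provider_shield_id=${SAFETY_MODEL:=}"      # empty; := also assigns "" to SAFETY_MODEL

SAFETY_MODEL=example-guard-model
echo "provider_id=${SAFETY_MODEL:+llama-guard}"  # provider_id=llama-guard
echo "provider_shield_id=${SAFETY_MODEL:=}"      # provider_shield_id=example-guard-model
```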
examples/vllm-rhaiis.yaml (1)

1-157: Configuration structure is consistent with the modular LLS 0.3.x pattern.

The file follows the same structure as other example configs with appropriate RHAIIS-specific environment variables. Storage backends, resource registrations, and provider configurations align with the expected modular architecture.
tests/e2e-prow/rhoai/configs/run.yaml (2)
129-132: Hardcoded model ID appropriate for test configuration.

Using a hardcoded model (`meta-llama/Llama-3.2-1B-Instruct`) is appropriate for a test configuration, providing deterministic behavior. The smaller 1B model is suitable for e2e testing.
98-120: Storage configuration follows the new modular backend pattern.

The storage section properly defines `kv_default` and `sql_default` backends with environment-driven paths and appropriate stores for metadata, inference, conversations, and prompts. This aligns with the LLS 0.3.x architecture.
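For orientation, here is a rough sketch of that shape. The backend names and store keys come from the review, but the `type` values, the `db_path` key, the `STORAGE_DIR` variable, and the per-store sub-keys are assumptions to verify against the actual LLS 0.3.x schema:

```yaml
# Sketch only; keys marked "assumed" are not confirmed by this PR.
storage:
  backends:
    kv_default:
      type: kv_sqlite                               # assumed type name
      db_path: ${env.STORAGE_DIR:=/tmp/lls}/kv.db   # STORAGE_DIR is hypothetical
    sql_default:
      type: sql_sqlite                              # assumed type name
      db_path: ${env.STORAGE_DIR:=/tmp/lls}/sql.db
  stores:
    metadata:
      backend: kv_default
      namespace: registry        # illustrative
    inference:
      backend: sql_default
      table_name: inference_store
    conversations:
      backend: sql_default
      table_name: conversations
    prompts:
      backend: kv_default
      namespace: prompts
```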
examples/vllm-rhoai.yaml (2)

129-132: Hardcoded model ID differs from other example configs.

This example config uses a hardcoded model ID (`meta-llama/Llama-3.2-1B-Instruct`) while `vllm-rhelai.yaml` and `vllm-rhaiis.yaml` use environment variables (`${env.RHEL_AI_MODEL}` and `${env.RHAIIS_MODEL}` respectively). Consider whether this should also use an environment variable for consistency and flexibility.

15-26: Inference provider configuration looks correct.

The vLLM remote provider and sentence-transformers inline provider are properly configured. No trailing whitespace issues in this file.
tests/e2e/configs/run-rhaiis.yaml (1)
121-146: Resource registrations properly configured for RHAIIS testing.

The registered resources section correctly defines the embedding model with metadata and uses environment variables for the RHAIIS model. Shield configurations follow the same conditional pattern as other configs.
tests/e2e/configs/run-rhelai.yaml (8)
1-13: LGTM!

The version, image name, and APIs configuration looks well-structured. The comprehensive API list aligns with the modular LLS 0.3.x configuration approach.

19-23: Verify that `tls_verify: false` is intentional for e2e tests.

Disabling TLS verification is acceptable for e2e tests with self-signed certificates or local endpoints, but this setting should not propagate to production configurations.

27-59: LGTM!

The provider configurations for vector_io, files, safety, and agents are well-structured with proper namespace separation and consistent backend references.

60-97: LGTM!

The remaining provider configurations follow consistent patterns with proper namespace separation for kvstore backends.

98-120: LGTM!

The storage configuration with separate backends for key-value and SQL storage, along with properly configured stores, follows a clean modular pattern.

134-140: Verify behavior when shield environment variables are unset.

The conditional syntax `${env.SAFETY_MODEL:+llama-guard}` will result in an empty `provider_id` when `SAFETY_MODEL` is not set. Confirm that LLS 0.3.x gracefully handles shields with empty/null `provider_id`, or consider omitting these entries entirely when the environment variables are not configured.

147-150: LGTM!

Server and telemetry configuration is appropriate for e2e testing.

151-157: LGTM!

The vector_stores and safety defaults correctly reference providers and resources defined earlier in the configuration.
Force-pushed 183e135 to f192004, then f192004 to 38f0904.
Actionable comments posted: 1
♻️ Duplicate comments (2)
tests/e2e/configs/run-rhaiis.yaml (1)
134-140: Verify conditional shield registration handles missing environment variables correctly.

The conditional syntax `${env.SAFETY_MODEL:+llama-guard}` will result in an empty string for `provider_id` when the environment variable is not set, potentially causing runtime errors.
tests/e2e/configs/run-rhelai.yaml (1)

134-140: Verify conditional shield registration handles missing environment variables correctly.

The conditional syntax `${env.SAFETY_MODEL:+llama-guard}` will result in an empty string for `provider_id` when the environment variable is not set, potentially causing runtime errors.
🧹 Nitpick comments (1)
tests/e2e/configs/run-rhaiis.yaml (1)
16-26: Consider making the port configurable for consistency.

The RHAIIS URL hardcodes port 8000, while the RHELAI configuration uses an environment variable `RHEL_AI_PORT` for flexibility. Consider using `${env.RHAIIS_PORT:=8000}` to allow port customization while maintaining the default.

Apply this diff if port flexibility is desired:

```diff
- url: http://${env.RHAIIS_URL}:8000/v1/
+ url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
```
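With the shell-style semantics noted earlier, `:=` falls back to (and assigns) the default when the variable is unset — a quick bash check with a placeholder host:

```bash
unset RHAIIS_PORT
echo "http://rhaiis.example.internal:${RHAIIS_PORT:=8000}/v1/"
# -> http://rhaiis.example.internal:8000/v1/  (RHAIIS_PORT is now "8000")

RHAIIS_PORT=9000
echo "http://rhaiis.example.internal:${RHAIIS_PORT:=8000}/v1/"
# -> http://rhaiis.example.internal:9000/v1/
```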
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- .github/workflows/e2e_tests_rhaiis.yaml (1 hunks)
- .github/workflows/e2e_tests_rhelai.yaml (1 hunks)
- examples/vllm-rhaiis.yaml (1 hunks)
- examples/vllm-rhelai.yaml (1 hunks)
- examples/vllm-rhoai.yaml (1 hunks)
- tests/e2e-prow/rhoai/configs/run.yaml (2 hunks)
- tests/e2e/configs/run-rhaiis.yaml (1 hunks)
- tests/e2e/configs/run-rhelai.yaml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- .github/workflows/e2e_tests_rhelai.yaml
- examples/vllm-rhelai.yaml
- examples/vllm-rhoai.yaml
- examples/vllm-rhaiis.yaml
- .github/workflows/e2e_tests_rhaiis.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
- GitHub Check: build-pr
- GitHub Check: E2E: library mode / vertexai
- GitHub Check: E2E: server mode / azure
- GitHub Check: E2E: server mode / ci
- GitHub Check: E2E: library mode / azure
- GitHub Check: E2E: server mode / vertexai
- GitHub Check: E2E: library mode / ci
🔇 Additional comments (6)
tests/e2e-prow/rhoai/configs/run.yaml (3)
15-97: LGTM!

The provider configurations are well-structured with proper references to storage backends and correct YAML syntax. The modular, multi-provider approach aligns with the LLS 0.3.x architecture.

98-157: LGTM!

The storage, registered resources, and server configurations are properly structured with appropriate environment variable fallbacks and defaults. The modular design with empty placeholder lists for datasets, scoring_fns, and benchmarks allows for future extensibility.
129-132: Verify whether this hardcoded model ID should be consistent with other configs.

The inconsistency is confirmed: the RHOAI config uses a hardcoded `meta-llama/Llama-3.2-1B-Instruct` while run-rhaiis.yaml uses `${env.RHAIIS_MODEL}` and run-rhelai.yaml uses `${env.RHEL_AI_MODEL}`. Either align this config with the environment variable pattern used elsewhere, or document why RHOAI requires a specific hardcoded model.

tests/e2e/configs/run-rhaiis.yaml (1)

1-157: Configuration structure is solid.

The overall configuration correctly implements the modular, store-backed architecture for LLS 0.3.x. The use of environment variables for model IDs provides good flexibility for different test scenarios.
tests/e2e/configs/run-rhelai.yaml (2)
16-26: LGTM! Good use of environment variables for URL construction.

The inference provider configuration properly uses environment variables for both the URL and port (`${env.RHEL_AI_URL}:${env.RHEL_AI_PORT}`), providing maximum flexibility for different deployment scenarios.

1-157: Configuration structure is solid.

The overall configuration correctly implements the modular, store-backed architecture for LLS 0.3.x with appropriate use of environment variables throughout.
```yaml
shields:
  - shield_id: llama-guard
    provider_id: ${env.SAFETY_MODEL:+llama-guard}
    provider_shield_id: ${env.SAFETY_MODEL:=}
  - shield_id: code-scanner
    provider_id: ${env.CODE_SCANNER_MODEL:+code-scanner}
    provider_shield_id: ${env.CODE_SCANNER_MODEL:=}
```
Verify conditional shield registration handles missing environment variables correctly.
The conditional syntax ${env.SAFETY_MODEL:+llama-guard} will result in an empty string for provider_id when the environment variable is not set. This could lead to runtime errors when shields are referenced but have no valid provider.
Verify that LLS 0.3.x handles shields with empty provider_id gracefully, or ensure these environment variables are always set in the test environment. Run the following to check how shields are referenced:
```bash
#!/bin/bash
# Check if shields with empty provider_id are handled or if env vars are always set
rg -n "SAFETY_MODEL|CODE_SCANNER_MODEL" --type=yaml --type=sh
```
tisnik left a comment:
LGTM
Description
Updating RHEL AI, RHAIIS, RHOAI configs to work with LLS 0.3.x
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
NA
Related Tickets & Documents
Checklist before requesting a review
Testing