Skip to content

fix demo - use arm compatible mongodb image from bitnami#673

Open
thapakazi wants to merge 2 commits intoNVIDIA:mainfrom
thapakazi:main
Open

fix demo - use arm compatible mongodb image from bitnami#673
thapakazi wants to merge 2 commits intoNVIDIA:mainfrom
thapakazi:main

Conversation

@thapakazi
Copy link

@thapakazi thapakazi commented Jan 12, 2026

Summary

while trying to run the demo on mac(arm64), mongo image is missing for arm64,

 message: 'Back-off pulling image "docker.io/bitnamilegacy/mongodb:8.0.3-debian-12-r1":                                             │
│           ErrImagePull: rpc error: code = NotFound desc = failed to pull and unpack                                                        │
│           image "docker.io/bitnamilegacy/mongodb:8.0.3-debian-12-r1": no match for                                                         │
│           platform in manifest: not found' 

so using arm64 available image from bitnami: https://hub.docker.com/r/bitnamisecure/mongodb

Type of Change

  • 🐛 Demo fix for arm64 processor
  • ✨ New feature
  • 💥 Breaking change
  • 📚 Documentation
  • 🔧 Refactoring
  • 🔨 Build/CI

Component(s) Affected

  • Core Services
  • Documentation/CI
  • Fault Management
  • Health Monitors
  • Janitor
  • Other: DEMO

Testing

  • Tests pass locally
  • Manual testing completed
  • No breaking changes (or documented)

Checklist

  • Self-review completed
  • Documentation updated (if needed)
  • Ready for review

Summary by CodeRabbit

  • New Features
    • Local fault-injection demo now supports architecture-specific overrides (ARM64) and uses an external demo values file for easier customization.
  • Chores
    • Switched demo component installation to OCI-based manifests and consolidated configuration into a single external values file.
    • Updated MongoDB container image and simplified demo setup and sequencing for more reliable local deployments.

✏️ Tip: You can customize this high-level summary in your review settings.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Jan 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 12, 2026

📝 Walkthrough

Walkthrough

Replaces an embedded NVSentinel/MongoDB YAML in the demo setup script with an external values file, adds architecture-specific overrides for ARM64, switches NVSentinel install to use the external values file (OCI-based), and updates script flow to conditionally include per-arch values and apply Helm with --values.

Changes

Cohort / File(s) Summary
Setup script
demos/local-fault-injection-demo/scripts/00-setup.sh
Remove large embedded YAML; introduce BASE_DIR, load external 00-demo-values.yaml, add conditional arch overrides for ARM64/aarch64, and update helm install/upgrade invocation to use --values plus optional arch override.
Demo values (default)
demos/local-fault-injection-demo/scripts/00-demo-values.yaml
Add new external values file containing NVSentinel/demo configuration: enables GPU health monitoring, core components, MongoDB store config, resource requests/limits, simpleHealthClient, nodeSelector/tolerations, and image settings.
Demo values (ARM overrides)
demos/local-fault-injection-demo/scripts/00-demo-values-arm.yaml
Add ARM-specific overrides including Bitnami MongoDB ARM image (docker.io/dlavrenuek/bitnami-mongodb-arm:8.0.4) under mongodb-store.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through scripts with a curious glance,
Pulled YAML out to give Helm a chance.
Per-arch carrots for ARM, neat and small,
Values file tucked in — tidy for all. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: updating the demo to use an ARM-compatible MongoDB image from Bitnami to fix failures on Mac (arm64).

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
demos/local-fault-injection-demo/scripts/00-setup.sh (1)

222-233: Update bitnamilegacy images for arm64 compatibility.

Both referenced images lack arm64 support and will fail on arm64 systems:

  • Line 224: bitnamilegacy/nginx:1.27.2-debian-12-r2 — amd64-only
  • Line 232: bitnamilegacy/mongodb-exporter:0.41.2-debian-12-r1 — amd64-only

These images need to be replaced with versions that have multi-arch manifests including arm64.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d84499 and 435ff04.

📒 Files selected for processing (1)
  • demos/local-fault-injection-demo/scripts/00-setup.sh

@lalitadithya
Copy link
Collaborator

Hey, Thank you for your PR! For ARM we actually make use the following image in CI: https://github.com/NVIDIA/NVSentinel/blob/main/distros/kubernetes/nvsentinel/values-tilt-arm64.yaml#L19. Can we use the same? It might be better to update the demo script to detect the architecture and update the values file at runtime

- Move inline Helm configurations from 00-setup.sh to standalone YAML files.
- Add 00-demo-values-arm.yaml to support ARM64 overrides(MongoDB images)
- Refactor setup script to use the new values files and improve path handling.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @demos/local-fault-injection-demo/scripts/00-demo-values.yaml:
- Around line 69-73: The ARM override file is missing ARM-compatible image
overrides for nginx and mongodb-exporter; update 00-demo-values-arm.yaml to
mirror the pattern used for mongodb by replacing the bitnamilegacy
image.repository/tag for the nginx and mongodb-exporter entries (the
image.repository and image.tag keys under the nginx and mongodb-exporter
sections) with ARM-compatible registry/repository/tag values (e.g., an ARM build
equivalent) and ensure image.pullPolicy matches the main arm override entries so
both services use ARM-capable images.
🧹 Nitpick comments (1)
demos/local-fault-injection-demo/scripts/00-demo-values.yaml (1)

46-46: Consider removing fragile line number reference.

The comment referencing "lines 73-184" from values-tilt.yaml will become stale as that file evolves. Consider removing the line numbers or referencing a section name instead.

Suggested change
-# MongoDB Store configuration - copied from values-tilt.yaml (lines 73-184)
+# MongoDB Store configuration - based on values-tilt.yaml
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 435ff04 and 30474ab.

📒 Files selected for processing (3)
  • demos/local-fault-injection-demo/scripts/00-demo-values-arm.yaml
  • demos/local-fault-injection-demo/scripts/00-demo-values.yaml
  • demos/local-fault-injection-demo/scripts/00-setup.sh
✅ Files skipped from review due to trivial changes (1)
  • demos/local-fault-injection-demo/scripts/00-demo-values-arm.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • demos/local-fault-injection-demo/scripts/00-setup.sh
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Applies to **/values.yaml : Include examples for non-obvious configurations in Helm chart documentation
📚 Learning: 2025-11-24T22:20:48.152Z
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Applies to **/values.yaml : Include examples for non-obvious configurations in Helm chart documentation

Applied to files:

  • demos/local-fault-injection-demo/scripts/00-demo-values.yaml
📚 Learning: 2025-11-26T13:54:26.427Z
Learnt from: rupalis-nv
Repo: NVIDIA/NVSentinel PR: 361
File: distros/kubernetes/nvsentinel/values-tilt.yaml:215-223
Timestamp: 2025-11-26T13:54:26.427Z
Learning: For values-tilt.yaml files: Keep documentation minimal and implementation-focused. Unlike production values.yaml files which need comprehensive inline comments, values-tilt.yaml is a Tilt-specific testing/development override file that doesn't require extensive documentation.

Applied to files:

  • demos/local-fault-injection-demo/scripts/00-demo-values.yaml
📚 Learning: 2025-11-24T22:20:48.152Z
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Health monitors should include Helm chart in `distros/kubernetes/nvsentinel/charts/`

Applied to files:

  • demos/local-fault-injection-demo/scripts/00-demo-values.yaml
🔇 Additional comments (4)
demos/local-fault-injection-demo/scripts/00-demo-values.yaml (4)

1-14: LGTM - Global settings are well-organized.

The configuration clearly enables only the GPU health monitor while explicitly disabling others, which is appropriate for a focused demo. The inline comments improve readability.


15-44: LGTM - Component configuration is minimal and demo-appropriate.

Good practice to explicitly disable unused components. The comment on line 34 explaining why labeler is disabled is helpful.


94-98: Hardcoded password is acceptable for local demo context.

The rootPassword: "demo-password" is fine for a local demo setup with persistence disabled. Just ensure this file isn't accidentally used in any non-demo environments.


100-130: LGTM - Resource configurations are reasonable for demo workloads.

The resource requests and limits are appropriately sized for a local demo environment.

Comment on lines +69 to +73
image:
registry: "docker.io"
repository: "bitnamilegacy/mongodb"
tag: "8.0.3-debian-12-r1"
pullPolicy: "IfNotPresent"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find the ARM override values file
find . -name "00-demo-values-arm.yaml" -o -name "*values*arm*.yaml" | head -20

Repository: NVIDIA/NVSentinel

Length of output: 182


🏁 Script executed:

#!/bin/bash
# Get a better understanding of the directory structure
ls -la demos/local-fault-injection-demo/scripts/ 2>/dev/null | head -20

Repository: NVIDIA/NVSentinel

Length of output: 643


🏁 Script executed:

#!/bin/bash
# Read the ARM override file
echo "=== ARM Override File ==="
cat demos/local-fault-injection-demo/scripts/00-demo-values-arm.yaml

echo -e "\n=== Base Values File ==="
cat demos/local-fault-injection-demo/scripts/00-demo-values.yaml

Repository: NVIDIA/NVSentinel

Length of output: 2925


🏁 Script executed:

#!/bin/bash
# Read the CI ARM values file for comparison
echo "=== CI ARM Values File ==="
cat distros/kubernetes/nvsentinel/values-tilt-arm64.yaml | head -150

Repository: NVIDIA/NVSentinel

Length of output: 854


ARM override file is incomplete; nginx and mongodb-exporter images still use bitnamilegacy variants on ARM.

The 00-demo-values-arm.yaml file only provides an ARM-compatible override for mongodb (dlavrenuek/bitnami-mongodb-arm), but fails to override the following bitnamilegacy images that also likely lack ARM manifests:

  • nginx (lines 82-83): bitnamilegacy/nginx:1.27.2-debian-12-r2
  • mongodb-exporter (lines 90-91): bitnamilegacy/mongodb-exporter:0.41.2-debian-12-r1

Add ARM-compatible image overrides for both nginx and mongodb-exporter to ensure the demo runs on ARM systems.

🤖 Prompt for AI Agents
In @demos/local-fault-injection-demo/scripts/00-demo-values.yaml around lines 69
- 73, The ARM override file is missing ARM-compatible image overrides for nginx
and mongodb-exporter; update 00-demo-values-arm.yaml to mirror the pattern used
for mongodb by replacing the bitnamilegacy image.repository/tag for the nginx
and mongodb-exporter entries (the image.repository and image.tag keys under the
nginx and mongodb-exporter sections) with ARM-compatible registry/repository/tag
values (e.g., an ARM build equivalent) and ensure image.pullPolicy matches the
main arm override entries so both services use ARM-capable images.

Comment on lines +152 to +165

# pass architecture specific overrides
local arch_overrides=""
case "$(uname -m)" in
arm64|aarch64)
arch_overrides="--values $BASE_DIR/00-demo-values-arm.yaml"
;;
esac

helm upgrade --install nvsentinel oci://ghcr.io/nvidia/nvsentinel \
--version "$nvsentinel_version" \
--namespace "$NAMESPACE" \
--values /tmp/nvsentinel-demo-values.yaml \
--values $BASE_DIR/00-demo-values.yaml \
$arch_overrides \
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the main change, and values(from files) specific to arm is passed if arch == arm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments