Skip to content

Feature/support none tls data store#697

Open
aireet wants to merge 7 commits intoNVIDIA:mainfrom
aireet:feature/support-none-tls-dataStore
Open

Feature/support none tls data store#697
aireet wants to merge 7 commits intoNVIDIA:mainfrom
aireet:feature/support-none-tls-dataStore

Conversation

@aireet
Copy link
Contributor

@aireet aireet commented Jan 16, 2026

Summary

Add password-based authentication support for MongoDB and PostgreSQL datastores, allowing connections without requiring TLS certificates. Password is securely read from Kubernetes Secret instead of ConfigMap.

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 💥 Breaking change
  • 📚 Documentation
  • 🔧 Refactoring
  • 🔨 Build/CI

Component(s) Affected

  • Core Services
  • Documentation/CI
  • Fault Management
  • Health Monitors
  • Janitor
  • Other: ____________

Testing

  • Tests pass locally (go build successful)
  • Manual testing completed (configuration verified)
  • No breaking changes (backward compatible - Secret is optional)

Checklist

  • Self-review completed
  • Documentation updated (values.yaml example added)
  • Ready for review

Summary by CodeRabbit

  • New Features
    • Added password-based authentication support for datastore connections, enabling credentials to be securely provided through Kubernetes secrets across all deployment services.
    • Extended authentication capabilities to support both password-based and TLS-based connection methods for MongoDB and PostgreSQL databases.

✏️ Tip: You can customize this high-level summary in your review settings.

Add support for password-based authentication in PostgreSQL datastore,
allowing connections without requiring TLS certificates.

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
Add support for password-based authentication in MongoDB datastore,
allowing connections without requiring TLS certificates using
DATASTORE_PASSWORD environment variable.

When DATASTORE_PASSWORD is set, the client uses password authentication
with tls=false. Otherwise, it falls back to certificate authentication.

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
Remove DATASTORE_PASSWORD from ConfigMap (security concern) and add
support for reading password from Kubernetes Secret.

Components updated:
- ConfigMap: removed DATASTORE_PASSWORD
- DaemonSet, Deployments: added envFrom.secretKeyRef support

Usage in values.yaml:
  global:
    datastore:
      credentials:
        existingSecret: my-datastore-secret
        existingSecretPasswordKey: password  # default: "password"

Secret example:
  apiVersion: v1
  kind: Secret
  metadata:
    name: my-datastore-secret
  stringData:
    password: "your-secure-password"

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
Add example configuration for password-based datastore authentication
using Kubernetes Secret in values.yaml comments.

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
@copy-pr-bot
Copy link

copy-pr-bot bot commented Jan 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 16, 2026

Warning

Rate limit exceeded

@aireet has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 16 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 9644a89 and 7cde165.

📒 Files selected for processing (3)
  • store-client/pkg/config/env_loader.go
  • tests/data/password-auth-fault-configmap.yaml
  • tests/fault_management_test.go
📝 Walkthrough

Walkthrough

Adds password-based datastore authentication support across Kubernetes deployments and the configuration loader. The DATASTORE_PASSWORD environment variable is conditionally injected from secrets in multiple Helm templates, with corresponding password-based MongoDB and PostgreSQL connection logic implemented in the config loader.

Changes

Cohort / File(s) Summary
Helm Deployment Templates
distros/kubernetes/nvsentinel/charts/csp-health-monitor/templates/deployment.yaml, distros/kubernetes/nvsentinel/charts/event-exporter/templates/deployment.yaml, distros/kubernetes/nvsentinel/charts/fault-quarantine/templates/deployment.yaml, distros/kubernetes/nvsentinel/charts/fault-remediation/templates/deployment.yaml, distros/kubernetes/nvsentinel/charts/health-events-analyzer/templates/deployment.yaml, distros/kubernetes/nvsentinel/charts/node-drainer/templates/deployment.yaml, distros/kubernetes/nvsentinel/templates/daemonset.yaml
Conditional DATASTORE_PASSWORD environment variable added to container specs, sourced from Kubernetes secrets when .Values.global.datastore.credentials.existingSecret is set. References use existingSecretPasswordKey with default "password" fallback.
Helm Values Documentation
distros/kubernetes/nvsentinel/values.yaml
Added commented example configuration blocks for datastore username, TLS options (sslmode, sslcert, sslkey, sslrootcert), and credentials section with existingSecret and existingSecretPasswordKey parameters.
Configuration Loader
store-client/pkg/config/env_loader.go
Implemented password-based authentication paths for both MongoDB and PostgreSQL. Introduces branching logic: when DATASTORE_PASSWORD is present, builds connection URI with user credentials and disables TLS; otherwise uses TLS-based authentication with certificate configuration. Added net/url import for URI construction.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A password springs forth from secrets so deep,
Across the deployments, conditions now keep,
From Kubernetes vaults to the config it flows,
Both MongoDB and Postgres now know,
Authentication paths where credentials can go! 🔐

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Feature/support none tls data store' is partially related to the main changes. It mentions TLS support but is somewhat vague and uses non-standard phrasing ('none tls'). The actual work involves adding password-based authentication support for MongoDB and PostgreSQL, updating Helm templates across multiple services, and documenting Secret-based credential handling—more specific than the title suggests. Consider revising the title to be more specific and clear, such as 'Add password-based authentication support for datastores' or 'Support non-TLS datastore connections via password authentication'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Fix all issues with AI agents
In
`@distros/kubernetes/nvsentinel/charts/csp-health-monitor/templates/deployment.yaml`:
- Around line 141-147: The template accesses
.Values.global.datastore.credentials.existingSecret without nil-safety; update
the conditional to check existence of datastore and credentials first (e.g. {{-
if and .Values.global.datastore .Values.global.datastore.credentials
.Values.global.datastore.credentials.existingSecret }} ) before emitting the
DATASTORE_PASSWORD secretKeyRef block, and apply the same change for the
maintenance-notifier container where
.Values.global.datastore.credentials.existingSecret is used so both occurrences
guard against nil .datastore and .credentials.
- Around line 207-213: The env var block that sets DATASTORE_PASSWORD uses only
{{- if .Values.global.datastore.credentials.existingSecret }} and can panic if
.Values.global.datastore.credentials is nil; guard it the same way as the main
container by checking that .Values.global.datastore.credentials is defined and
that existingSecret is non-empty before creating the secretKeyRef. Update the
conditional to something like a combined nil-safe test for
.Values.global.datastore.credentials and
.Values.global.datastore.credentials.existingSecret (so the DATASTORE_PASSWORD
env var and secretKeyRef are only rendered when both exist), leaving the
DATASTORE_PASSWORD name and key logic unchanged.

In
`@distros/kubernetes/nvsentinel/charts/fault-quarantine/templates/deployment.yaml`:
- Around line 148-154: The env var block referencing DATASTORE_PASSWORD should
first guard that .Values.global.datastore.credentials and
.Values.global.datastore.credentials.existingSecret are non-nil before accessing
nested keys; update the conditional around the DATASTORE_PASSWORD stanza to
match the existing pattern (check .Values.global.datastore.credentials and then
.Values.global.datastore.credentials.existingSecret) and keep using the
existingSecretPasswordKey with default "password" to avoid nil panics when
credentials or the key are absent.

In
`@distros/kubernetes/nvsentinel/charts/fault-remediation/templates/deployment.yaml`:
- Around line 157-163: The template accesses
.Values.global.datastore.credentials.existingSecret directly which can error if
global.datastore or credentials is nil; update the conditional that sets the
DATASTORE_PASSWORD env var to first guard the parent fields (e.g., check
.Values.global.datastore and .Values.global.datastore.credentials) before
checking .Values.global.datastore.credentials.existingSecret so the secret block
only renders when those parents exist and the existingSecret key is present;
keep the env var name DATASTORE_PASSWORD and the existingSecretPasswordKey
default behavior unchanged.

In
`@distros/kubernetes/nvsentinel/charts/health-events-analyzer/templates/deployment.yaml`:
- Around line 128-134: The template accesses
.Values.global.datastore.credentials.existingSecret without guarding for nil;
update the conditional to check that .Values.global.datastore and
.Values.global.datastore.credentials exist before referencing existingSecret
(e.g. use an and-based conditional such as checking .Values.global.datastore,
.Values.global.datastore.credentials, and
.Values.global.datastore.credentials.existingSecret) so the block that sets
DATASTORE_PASSWORD only runs when those nested values are present; adjust the
conditional around the DATASTORE_PASSWORD secretKeyRef to use that guarded
condition and keep the existing key default logic for existingSecretPasswordKey.

In `@distros/kubernetes/nvsentinel/templates/configmap-datastore.yaml`:
- Around line 33-35: The template currently injects
.Values.global.datastore.connection.password into the ConfigMap
(configmap-datastore.yaml); remove that rendering to avoid storing credentials
in plaintext and instead require using a Kubernetes Secret (e.g., expect
.Values.global.datastore.secretName and reference that secret in pod specs) or,
if you must support legacy behavior, gate the plaintext export behind an
explicit opt-in flag like .Values.global.datastore.allowPlaintextInConfigMap =
true; update the template to either drop the password block entirely or wrap it
with the explicit opt-in conditional and add/validate the secret-based reference
in the consuming manifests.

In `@distros/kubernetes/nvsentinel/templates/daemonset.yaml`:
- Around line 158-164: Guard against nil by expanding the conditional to ensure
.Values.global.datastore and .Values.global.datastore.credentials exist before
reading .Values.global.datastore.credentials.existingSecret; update the
conditional that controls the DATASTORE_PASSWORD env var (the block referencing
DATASTORE_PASSWORD and .Values.global.datastore.credentials.existingSecret /
existingSecretPasswordKey) to check the parent keys first (e.g., use a combined
if that verifies .Values.global.datastore and
.Values.global.datastore.credentials are defined) so the template won’t error
when datastore or credentials are undefined.

In `@store-client/pkg/config/env_loader.go`:
- Around line 306-314: The DSN construction in env_loader.go currently builds a
keyword-style connectionURI using plain string formatting in the block that sets
connectionURI when password != "", which breaks on passwords with spaces or
special characters; update the logic that sets connectionURI to build a
URL-style DSN (postgresql://) using net/url's url.URL and url.UserPassword to
automatically escape the username and password (keep the same host, port, dbname
as Path or Query as appropriate), ensure you import "net/url" (and "net" if
needed for host formatting), and leave the StandardCertificateConfig assignment
intact.
- Around line 196-225: Replace the raw fmt.Sprintf URI construction with a
percent-encoded URL built via net/url: use url.UserPassword(username, password)
and construct a url.URL with Scheme "mongodb", Host fmt.Sprintf("%s:%s", host,
port), Path "/"+database, and set RawQuery to "tls=false"; assign url.String()
to connectionURI. Update imports to include "net/url" and ensure you still set
databaseName and certConfig as before (refer to variables connectionURI,
username, password, host, port, database and the StandardCertificateConfig
block).
🧹 Nitpick comments (1)
distros/kubernetes/nvsentinel/values.yaml (1)

35-49: Add a note discouraging connection.password in values.yaml.
Docs describe secret-based credentials but don’t warn that plaintext connection.password is insecure/legacy. A brief note keeps the values docs comprehensive and reduces misuse.

📝 Suggested doc tweak
   #     # Option 2: Password-based authentication (non-TLS)
   #     # Password is read from Kubernetes Secret for security
   #     # Create secret: kubectl create secret generic datastore-creds --from-literal=password=your-password
+  #     # NOTE: Do not set connection.password in values.yaml; use credentials.existingSecret instead.
   #   credentials:
   #     existingSecret: "datastore-credentials"
   #     existingSecretPasswordKey: "password"  # Optional, defaults to "password"

As per coding guidelines, document all values and call out insecure/legacy options.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a04cd6 and 368ed70.

📒 Files selected for processing (10)
  • distros/kubernetes/nvsentinel/charts/csp-health-monitor/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/event-exporter/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/fault-quarantine/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/fault-remediation/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/health-events-analyzer/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/node-drainer/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/templates/configmap-datastore.yaml
  • distros/kubernetes/nvsentinel/templates/daemonset.yaml
  • distros/kubernetes/nvsentinel/values.yaml
  • store-client/pkg/config/env_loader.go
🧰 Additional context used
📓 Path-based instructions (4)
**/values.yaml

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/values.yaml: Document all values in Helm chart values.yaml with inline comments
Include examples for non-obvious configurations in Helm chart documentation
Note truthy value requirements in Helm chart documentation where applicable

Files:

  • distros/kubernetes/nvsentinel/values.yaml
**/*.go

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.go: Follow standard Go conventions with gofmt and golint
Use structured logging via log/slog in Go code
Wrap errors with context using fmt.Errorf("context: %w", err) in Go code
Within retry.RetryOnConflict blocks, return errors without wrapping to preserve retry behavior
Use meaningful variable names such as synced over ok for cache sync checks
Use client-go for Kubernetes API interactions in Go code
Prefer informers over direct API calls for watching Kubernetes resources
Implement proper shutdown handling with context cancellation in Go code
Package-level godoc required for all Go packages
Function comments required for all exported Go functions
Use inline comments for complex logic only in Go code
TODO comments should reference issues in Go code
Extract informer event handler setup into helper methods
Use separate informers for different Kubernetes resource types

Files:

  • store-client/pkg/config/env_loader.go
**/daemonset*.yaml

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Explain DaemonSet variant selection logic in Helm chart documentation

Files:

  • distros/kubernetes/nvsentinel/templates/daemonset.yaml
distros/kubernetes/**/*daemonset*.yaml

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

distros/kubernetes/**/*daemonset*.yaml: Separate DaemonSets should be created for kata vs regular nodes using nodeAffinity based on kata.enabled label
Regular node DaemonSets should use /var/log volume mount for file-based logs
Kata node DaemonSets should use /run/log/journal and /var/log/journal volume mounts for systemd journal

Files:

  • distros/kubernetes/nvsentinel/templates/daemonset.yaml
🧠 Learnings (6)
📓 Common learnings
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Use Kubernetes secrets for sensitive data
📚 Learning: 2025-11-24T22:20:48.152Z
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Use Kubernetes secrets for sensitive data

Applied to files:

  • distros/kubernetes/nvsentinel/charts/node-drainer/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/health-events-analyzer/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/event-exporter/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/charts/fault-remediation/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/values.yaml
  • distros/kubernetes/nvsentinel/charts/fault-quarantine/templates/deployment.yaml
  • distros/kubernetes/nvsentinel/templates/daemonset.yaml
  • distros/kubernetes/nvsentinel/charts/csp-health-monitor/templates/deployment.yaml
📚 Learning: 2025-11-24T22:20:48.152Z
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Applies to **/values.yaml : Document all values in Helm chart `values.yaml` with inline comments

Applied to files:

  • distros/kubernetes/nvsentinel/values.yaml
📚 Learning: 2025-11-26T13:54:26.427Z
Learnt from: rupalis-nv
Repo: NVIDIA/NVSentinel PR: 361
File: distros/kubernetes/nvsentinel/values-tilt.yaml:215-223
Timestamp: 2025-11-26T13:54:26.427Z
Learning: For values-tilt.yaml files: Keep documentation minimal and implementation-focused. Unlike production values.yaml files which need comprehensive inline comments, values-tilt.yaml is a Tilt-specific testing/development override file that doesn't require extensive documentation.

Applied to files:

  • distros/kubernetes/nvsentinel/values.yaml
📚 Learning: 2025-11-24T22:20:48.152Z
Learnt from: CR
Repo: NVIDIA/NVSentinel PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-24T22:20:48.152Z
Learning: Applies to **/values.yaml : Include examples for non-obvious configurations in Helm chart documentation

Applied to files:

  • distros/kubernetes/nvsentinel/values.yaml
📚 Learning: 2025-12-23T05:02:22.108Z
Learnt from: tanishagoyal2
Repo: NVIDIA/NVSentinel PR: 609
File: store-client/pkg/client/postgresql_pipeline_builder.go:119-132
Timestamp: 2025-12-23T05:02:22.108Z
Learning: In the NVSentinel codebase, protobuf fields stored in MongoDB should use lowercase field names (e.g., processingstrategy, componentclass, checkname). Ensure pipeline filters and queries that access protobuf fields in the database consistently use lowercase field names in the store-client package, avoiding camelCase mappings for MongoDB reads/writes.

Applied to files:

  • store-client/pkg/config/env_loader.go
🔇 Additional comments (2)
distros/kubernetes/nvsentinel/charts/node-drainer/templates/deployment.yaml (1)

149-155: SecretKeyRef wiring for DATASTORE_PASSWORD looks good.
Conditional secret injection aligns with the new auth mode and keeps credentials out of ConfigMaps.

distros/kubernetes/nvsentinel/charts/event-exporter/templates/deployment.yaml (1)

134-140: SecretKeyRef injection for DATASTORE_PASSWORD looks good.
This cleanly integrates secret-based credentials with the new password auth flow.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Add nil-safety checks for .Values.global.datastore.credentials
before accessing nested fields to prevent template rendering failures.

fix(config): encode credentials in connection URIs

Use url.UserPassword() for both MongoDB and PostgreSQL to properly
handle special characters (@, :, /, ?, #, $, [, ]) in usernames
and passwords when constructing connection URIs.

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
Remove password field from datastore.yaml in ConfigMap to prevent
credential exposure. Passwords should only come from Kubernetes
Secret via secretKeyRef.

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
@lalitadithya
Copy link
Collaborator

/ok to test 9644a89

Copy link
Contributor

@KaivalyaMDabhadkar KaivalyaMDabhadkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aireet can please fix the linter issues?

Copy link
Contributor

@KaivalyaMDabhadkar KaivalyaMDabhadkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add an e2e test here to test whether mongodb works with the other components with the newly added username/pwd auth mechanism?

@aireet aireet force-pushed the feature/support-none-tls-dataStore branch from c446348 to 46c69a6 Compare January 19, 2026 12:15
…separate helper functions

Add buildMongoDBConnectionConfig/buildMongoDBPasswordAuthConfig/buildMongoDBCertAuthConfig for MongoDB
Add buildPostgreSQLConnectionConfig/buildPostgreSQLPasswordAuthConfig/buildPostgreSQLCertAuthConfig for PostgreSQL

This separates concerns about authentication method and improves testability

Signed-off-by: aireet <xuexuexue_xxx@proton.me>
@aireet aireet force-pushed the feature/support-none-tls-dataStore branch from 46c69a6 to 7cde165 Compare January 19, 2026 12:16
@lalitadithya
Copy link
Collaborator

hey @aireet , apologies for the delay in review. Can you please rebase and resolve conflicts, and we can move forward with this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants