PersistentVolume Sync Operator

The PersistentVolume Sync Operator externalizes PV metadata to an S3-compatible backend, decoupling storage identity from the cluster. This creates a centralized source of truth for reconstructing volumes across any cluster sharing the underlying storage (e.g., NAS, GFS, SDS).

The operator mirrors PersistentVolume definitions and lifecycle states by capturing exact specifications and volume handles from the Protected cluster. These manifests are then used to reconstruct identical PV objects within a Recovery cluster. This mechanism ensures that during a disaster recovery event, the recovery site has immediate, pre-configured access to the underlying storage, significantly reducing Recovery Time Objectives (RTO) by eliminating manual storage re-provisioning.

VolumeSync

Operator Logic: Metadata Synchronization and Storage Mapping

The PersistentVolume Sync Operator facilitates cross-cluster disaster recovery by synchronizing the state and specifications of Kubernetes storage resources. Its behavior is divided into two distinct logical paths:

Resource Specification Capture (Sync)

The operator performs continuous, point-in-time captures of Kubernetes resource specifications for all targeted PersistentVolumes (PV) and PersistentVolumeClaims (PVC).

  • Intelligent Scoping: It monitors resources across designated StorageClasses, maintaining a comprehensive map of the source cluster's storage landscape.
  • State Capture Mechanism: The operator serializes metadata, labels, and specific volume requirements (capacity, access modes, and volume handles) into a portable, cluster-agnostic format (see the sketch after this list).
  • Preservation of Intent: By capturing the full lifecycle state, the operator ensures that the original storage configuration is preserved and ready for immediate reconstitution on the recovery cluster.
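
A minimal sketch of this capture path, assuming the kube, k8s-openapi, serde_json, and anyhow crates, is shown below; the function name export_labeled_pvs is illustrative rather than the operator's actual API, and the sync label is taken from the samples later in this README.

use k8s_openapi::api::core::v1::PersistentVolume;
use kube::{api::{Api, ListParams}, Client};

// Illustrative capture path: collect every opted-in PV as a portable JSON manifest.
async fn export_labeled_pvs(client: Client) -> anyhow::Result<Vec<String>> {
    // PVs are cluster-scoped, so discovery is not restricted to a namespace.
    let pvs: Api<PersistentVolume> = Api::all(client);
    // Only PVs explicitly opted in via the sync label are captured.
    let lp = ListParams::default().labels("volumesyncs.storage.cndev.nl/sync=enabled");
    let mut manifests = Vec::new();
    for pv in pvs.list(&lp).await? {
        // Serialize metadata, labels, and spec (capacity, access modes, volume
        // handle) into a cluster-agnostic JSON document for the object store.
        manifests.push(serde_json::to_string_pretty(&pv)?);
    }
    Ok(manifests)
}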

StorageClass Decoupling (Heterogeneous Environment Support)

The operator explicitly decouples the PersistentVolumeClaims from the StorageClass definitions. While it captures the metadata for PVCs tied to various classes, it does not recreate the StorageClass objects on the recovery cluster.

  • Architectural Intent: This is a deliberate design choice to support Heterogeneous Storage Environments. In many disaster recovery scenarios, the infrastructure at the recovery site differs from the primary site. For example, the recovery cluster may utilize a localized file cache or a different storage endpoint to optimize performance or adhere to site-specific infrastructure constraints.
  • Flexible Binding Logic: By synchronizing only the PV/PVC metadata and omitting the StorageClass, the operator enables the recovery cluster's local storage controller to handle the binding process. This allows the restored claims to be dynamically mapped to the appropriate local backend while maintaining the data's integrity and volume handles.
  • Site-Specific Optimization: This decoupling ensures that storage policies (such as replication factors or IOPS limits) can be tailored to the recovery site's specific capabilities without needing to mirror the primary site's configuration exactly.

Recovery Workflow

  • Metadata Extraction: The operator scans the source cluster for labeled storage resources and captures their specifications.
  • Transformation: Using the clean_metadata logic (sketched after this list), the operator strips environment-specific internal annotations while preserving the core volume requirements.
  • Local Re-Binding: On the recovery cluster, the operator recreates the PVCs. These claims then automatically target the pre-existing local StorageClasses defined on the recovery site, ensuring the data is served via the correct local file cache or storage provider.
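
The snippet below is a hypothetical sketch of what this transformation could look like against the k8s-openapi types; the operator's real clean_metadata implementation may strip a different set of fields.

use k8s_openapi::api::core::v1::PersistentVolume;

// Strip cluster-local fields so the captured manifest can be re-applied on the
// recovery cluster (illustrative, not the operator's exact logic).
fn clean_metadata(mut pv: PersistentVolume) -> PersistentVolume {
    let meta = &mut pv.metadata;
    // Server-populated identity fields are invalid on another cluster.
    meta.uid = None;
    meta.resource_version = None;
    meta.creation_timestamp = None;
    meta.managed_fields = None;
    // Drop environment-specific internal annotations while keeping labels
    // and the core spec (capacity, access modes, volume handle) intact.
    if let Some(annotations) = meta.annotations.as_mut() {
        annotations.remove("kubectl.kubernetes.io/last-applied-configuration");
    }
    if let Some(spec) = pv.spec.as_mut() {
        // The claimRef UID points at a PVC that only exists on the source cluster.
        if let Some(claim_ref) = spec.claim_ref.as_mut() {
            claim_ref.uid = None;
            claim_ref.resource_version = None;
        }
    }
    // Runtime status (phase, bound state) is recomputed on the recovery side.
    pv.status = None;
    pv
}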

Key Benefit: This approach provides a "Clean Slate" recovery where storage logic is kept local to the cluster, preventing the migration of invalid or incompatible storage provider configurations from the primary site.

Supported Storage Architectures

The PersistentVolume Sync Operator is designed for storage backends that provide cross-cluster data accessibility. For the operator to successfully reconstruct volumes on a recovery site, the underlying data must be reachable by the nodes in the Recovery cluster using the same volume handles captured from the Protected cluster.

Network Attached Storage (NAS)

NAS is the most common use case for the operator. Because NAS protocols provide a global namespace, the metadata (path/IP) remains valid across different clusters.

  • Protocols: NFS, SMB/CIFS.
  • Featured Support: Ctera GFS. Since Ctera provides a global file system, it is uniquely suited for this operator, allowing volumes to be mounted as RWX (ReadWriteMany) across geographic boundaries.
  • Recovery Requirement: The Recovery cluster must have network routability to the same storage endpoints.

Storage Area Network (SAN) & Block Storage

Using the operator with SAN backends (like IBM SVC) requires an additional layer to ensure data is present at both sites.

  • Clustered File Systems: To achieve RWX on SAN storage, a clustered filesystem such as IBM Spectrum Scale (GPFS) should be used. The operator can sync the metadata for these volumes, provided the CSI driver handles are consistent.
  • Replication Requirement: The SAN LUNs must be replicated at the hardware level (e.g., via IBM HyperSwap or Global Mirror).
  • Volume Handles: The operator captures the LUN UID. The Recovery cluster must be able to "see" the same LUN UID via its local Fibre Channel or iSCSI fabric.

Compatibility Matrix

Storage Category | Technology Example | Supported | Metadata Sync Key | Requirements for Success
SAN (Enterprise) | IBM SVC + Spectrum Scale | Yes | LUN UID (WWN) | Hardware-level replication (HyperSwap/Global Mirror).
Standard NAS | NFS / SMB | Yes | Export Path & IP/DNS | Network routability to the same storage endpoint.
Standard NAS | NetApp ONTAP | Yes | Volume UUID / Junction Path | SnapMirror or MetroCluster configured between sites.
Standard NAS | Dell PowerScale (Isilon) | Yes | File System ID / SmartConnect | SyncIQ replication and DNS-based SmartConnect availability.
Global File System | Ctera GFS | Yes | File Path / Share ID | Active global namespace across both sites.
Global File System | NetApp Global File Cache | Yes | Cache ID / Backend Volume | Backend ONTAP storage reachable with consistent cache coherency.
Global File System | Microsoft Azure Files (Premium) | Yes | Share Name / Storage Account | Cross-region replication (GZRS) or paired-region failover.
Distributed SDS | Ceph (CephFS) | Yes | Monitor IPs & FS Path | Recovery cluster must have access to the Ceph Monitor/OSD network.
Distributed SDS | Red Hat OpenShift Data Foundation | Yes | StorageClass / FSID | Stretch or mirrored clusters with quorum maintained.
Cloud Native | Longhorn | Yes | Engine Name / Frontend | Longhorn “Disaster Recovery Volumes” or cross-cluster backend.
Cloud Native | Portworx (Pure Storage PX) | Yes | Volume ID / Cluster UUID | PX-DR or Stork-based replication and scheduler awareness.

Implementation Note: RWX vs. RWO

While the operator can technically sync RWO (ReadWriteOnce) volumes, it is most effective for RWX (ReadWriteMany) workloads. For RWO block storage, ensure that the source cluster has fully released the volume (including any SCSI reservations) before the Recovery cluster attempts to reconstruct and mount it; otherwise, the mount operation will fail at the infrastructure level.

Features

  • 🔄 Cluster-wide PV discovery (no namespace restrictions)
  • ☁️ Backend-agnostic object storage support (Azure Blob, S3, MinIO, Cloudian)
  • 📤 Export storage definitions from the Protected cluster to object storage
  • 📥 Recreate PV objects on the Recovery cluster pointing to the same shared storage
  • 🏷 Cluster identity detection via a configurable value
  • 🧹 Automatic retention-based cleanup of historical exports
  • 📡 Event-driven + periodic sync using Kubernetes watches and optional scheduling (see the sketch below)
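
The last feature, combining an event-driven watch with a periodic re-sync, could be structured roughly as in the sketch below, assuming kube with its runtime feature, tokio, futures, and anyhow; the 300-second interval and the println! placeholders are illustrative only.

use futures::TryStreamExt;
use k8s_openapi::api::core::v1::PersistentVolume;
use kube::{runtime::watcher, Api, Client};
use std::time::Duration;

// Illustrative sync loop: react to PV watch events and run a periodic full
// re-sync as a safety net for missed events.
async fn run_sync_loop(client: Client) -> anyhow::Result<()> {
    let pvs: Api<PersistentVolume> = Api::all(client);
    let cfg = watcher::Config::default()
        .labels("volumesyncs.storage.cndev.nl/sync=enabled");
    let mut events = Box::pin(watcher(pvs, cfg));
    let mut tick = tokio::time::interval(Duration::from_secs(300));
    loop {
        tokio::select! {
            // Event-driven path: a watched PV was created, updated, or deleted.
            maybe_event = events.try_next() => {
                if maybe_event?.is_some() {
                    println!("PV watch event received; exporting affected manifests");
                }
            }
            // Periodic path: full export regardless of watch activity.
            _ = tick.tick() => {
                println!("running periodic full sync");
            }
        }
    }
}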

Use Cases

  • 🌐 Multi-cluster DR for shared RWX storage
  • 💾 PV metadata backup and restore
  • 🔁 Migration of PVs between clusters
  • 🧭 Stateless failover for NFS-backed workloads

Upcoming release

Features currently in development for the upcoming release:

  • Validating admission webhook enforcing a maximum of one pvsync custom resource per cluster
  • Advanced Helm chart for production deployments
  • Update the CR status with more information (see the sketch after this list):
    • pub error_message: Option<String>,
    • pub last_run: Option<chrono::DateTime<chrono::Utc>>,
    • pub managed_volumes: Vec<String>,
  • Optimize the current logging implementation (via tracing + tracing-subscriber + EnvFilter)
  • Implement traces (via tracing + tracing-subscriber + opentelemetry)
  • Implement metrics (via tikv/prometheus, exposed via axum)
  • Instead of the external S3 watcher relying on polling/listing comparison, investigate an alternative based on ETags
  • Watcher optimizations (namespace exclusions, field pruning, debouncing repetitions)
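
A hedged sketch of how that richer status could be modeled is shown below; only the three field names come from the list above, while the struct name, derives, and inner types are assumptions (chrono is assumed to be built with its serde feature).

use serde::{Deserialize, Serialize};

// Illustrative status type; the operator's actual definition may differ.
#[derive(Clone, Debug, Default, Deserialize, Serialize)]
pub struct PersistentVolumeSyncStatus {
    /// Last reconciliation error, if any.
    pub error_message: Option<String>,
    /// Timestamp of the last successful sync run.
    pub last_run: Option<chrono::DateTime<chrono::Utc>>,
    /// Names of the PersistentVolumes currently managed by this resource.
    pub managed_volumes: Vec<String>,
}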

Build container

source ../00-ENV/env.sh
CVERSION="v0.6.2"

docker login ghcr.io -u bartvanbenthem -p $CR_PAT

docker build -t pvsync:$CVERSION .

docker tag pvsync:$CVERSION ghcr.io/bartvanbenthem/pvsync:$CVERSION
docker push ghcr.io/bartvanbenthem/pvsync:$CVERSION

# test image
docker run --rm -it --entrypoint /bin/sh pvsync:$CVERSION

/# ls -l /usr/local/bin/pvsync
/# /usr/local/bin/pvsync

Deploy CRD

kubectl apply -f ./config/crd/pvsync.storage.cndev.nl.yaml
# kubectl delete -f ./config/crd/pvsync.storage.cndev.nl.yaml

Create secret

# secret containing object storage credentials
source ../00-ENV/env.sh
kubectl -n kube-system create secret generic pvsync \
  --from-literal=OBJECT_STORAGE_ACCOUNT=$OBJECT_STORAGE_ACCOUNT \
  --from-literal=OBJECT_STORAGE_SECRET=$OBJECT_STORAGE_SECRET \
  --from-literal=OBJECT_STORAGE_BUCKET=$OBJECT_STORAGE_BUCKET \
  --from-literal=S3_ENDPOINT_URL=""
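
Assuming the chart surfaces this secret to the operator as environment variables (an assumption, not something documented here), reading the values could look like the sketch below; the ObjectStorageConfig name is illustrative.

use std::env;

// Connection settings for the object storage backend, read from the
// environment variables defined in the secret above (illustrative sketch).
pub struct ObjectStorageConfig {
    pub account: String,
    pub secret: String,
    pub bucket: String,
    // Optional custom S3 endpoint (e.g. MinIO or Cloudian); an empty value
    // falls back to the provider default.
    pub endpoint_url: Option<String>,
}

impl ObjectStorageConfig {
    pub fn from_env() -> Result<Self, env::VarError> {
        Ok(Self {
            account: env::var("OBJECT_STORAGE_ACCOUNT")?,
            secret: env::var("OBJECT_STORAGE_SECRET")?,
            bucket: env::var("OBJECT_STORAGE_BUCKET")?,
            endpoint_url: env::var("S3_ENDPOINT_URL").ok().filter(|s| !s.is_empty()),
        })
    }
}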

Deploy Operator

helm install pvsync ./config/operator/chart --create-namespace --namespace kube-system
kubectl -n kube-system get pods
# helm -n kube-system uninstall pvsync

Sample volume sync resource on protected cluster

# use label: volumesyncs.storage.cndev.nl/sync: "enabled"
# to enable a sync on a persistent volume
kubectl apply -f ./config/samples/pvsync-protected-example.yaml
kubectl describe persistentvolumesyncs.storage.cndev.nl example-protected-cluster
# kubectl delete -f ./config/samples/pvsync-protected-example.yaml

Sample volume sync resource on recovery cluster

kubectl apply -f ./config/samples/pvsync-recovery-example.yaml
kubectl describe persistentvolumesyncs.storage.cndev.nl example-recovery-cluster
# kubectl delete -f ./config/samples/pvsync-recovery-example.yaml

Test Watchers & Reconciler by Creating Persistent Volumes

kubectl apply -f ./config/samples/test-pv-nolabel.yaml
kubectl apply -f ./config/samples/test-pv.yaml
# kubectl delete -f ./config/samples/test-pv.yaml
# kubectl delete -f ./config/samples/test-pv-nolabel.yaml

CR Spec

apiVersion: storage.cndev.nl/v1alpha1
kind: PersistentVolumeSync
metadata:
  name: protected-cluster
  labels:
    volumesyncs.storage.cndev.nl/name: protected-cluster
    volumesyncs.storage.cndev.nl/part-of: pvsync-operator
  annotations:
    description: "Disaster Recovery PVSYNC Module"
spec:
  protectedCluster: mycluster
  mode: Protected
  cloudProvider: azure
  retention: 15
---
apiVersion: storage.cndev.nl/v1alpha1
kind: PersistentVolumeSync
metadata:
  name: recovery-cluster
  labels:
    volumesyncs.storage.cndev.nl/name: recovery-cluster
    volumesyncs.storage.cndev.nl/part-of: pvsync-operator
  annotations:
    description: "Disaster Recovery PVSYNC Module"
spec:
  protectedCluster: mycluster 
  mode: Recovery 
  cloudProvider: azure 
  pollingInterval: 25 
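
For reference, the manifests above would map onto a Rust spec type roughly like the sketch below; the field names mirror the YAML keys, while the enum, integer widths, and derives are assumptions rather than the operator's actual definitions.

use serde::{Deserialize, Serialize};

// Illustrative mirror of the PersistentVolumeSync spec shown above.
#[derive(Clone, Debug, Deserialize, Serialize)]
pub enum SyncMode {
    Protected,
    Recovery,
}

#[derive(Clone, Debug, Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
pub struct PersistentVolumeSyncSpec {
    // Identity of the protected cluster whose exports are written or consumed.
    pub protected_cluster: String,
    // Protected (export) or Recovery (import) behaviour.
    pub mode: SyncMode,
    // Object storage flavour, e.g. "azure".
    pub cloud_provider: String,
    // Number of historical exports to keep (used in Protected mode).
    pub retention: Option<u32>,
    // Polling interval for the Recovery mode comparison loop.
    pub polling_interval: Option<u32>,
}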
