The PersistentVolume Sync Operator externalizes PV metadata to an S3-compatible backend, decoupling storage identity from the cluster. This creates a centralized source of truth for reconstructing volumes across any cluster sharing the underlying storage (e.g., NAS, GFS, SDS).
The operator mirrors PersistentVolume definitions and lifecycle states by capturing exact specifications and volume handles from the Protected cluster. These manifests are then used to reconstruct identical PV objects within a Recovery cluster. This mechanism ensures that during a disaster recovery event, the recovery site has immediate, pre-configured access to the underlying storage, significantly reducing Recovery Time Objectives (RTO) by eliminating manual storage re-provisioning.
The PersistentVolume Sync Operator facilitates cross-cluster disaster recovery by synchronizing the state and specifications of Kubernetes storage resources. Its behavior is divided into two distinct logical paths:
The operator performs continuous, point-in-time captures of Kubernetes resource specifications for all targeted PersistentVolumes (PV) and PersistentVolumeClaims (PVC).
- Intelligent Scoping: It monitors resources across designated StorageClasses, maintaining a comprehensive map of the source cluster's storage landscape.
- State Capture Mechanism: The operator serializes metadata, labels, and specific volume requirements (capacity, access modes, and volume handles) into a portable, cluster-agnostic format.
- Preservation of Intent: By capturing the full lifecycle state, the operator ensures that the original storage configuration is preserved and ready for immediate reconstitution on the recovery cluster.
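As an illustration of the captured format, a labeled NFS-backed PV might serialize to something like the following. This is a sketch only: the field selection shown is an assumption, and the operator's actual export schema may differ.

```yaml
# Illustrative capture of a labeled NFS-backed PV; the exact export
# schema used by the operator may differ from this sketch
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-shared-data
  labels:
    volumesyncs.storage.cndev.nl/sync: "enabled"   # opt-in label used by the operator
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nas.example.internal    # volume handle: endpoint + path stay valid cross-cluster
    path: /exports/shared-data
```

Everything in this manifest is cluster-agnostic: nothing binds it to the Protected cluster, which is what allows identical reconstruction on the Recovery site.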
The operator explicitly decouples the PersistentVolumeClaims from the StorageClass definitions. While it captures the metadata for PVCs tied to various classes, it does not recreate the StorageClass objects on the recovery cluster.
- Architectural Intent: This is a deliberate design choice to support Heterogeneous Storage Environments. In many disaster recovery scenarios, the infrastructure at the recovery site differs from the primary site. For example, the recovery cluster may utilize a localized file cache or a different storage endpoint to optimize performance or adhere to site-specific infrastructure constraints.
- Flexible Binding Logic: By synchronizing only the PV/PVC metadata and omitting the StorageClass, the operator enables the recovery cluster's local storage controller to handle the binding process. This allows the restored claims to be dynamically mapped to the appropriate local backend while maintaining the data's integrity and volume handles.
- Site-Specific Optimization: This decoupling ensures that storage policies (such as replication factors or IOPS limits) can be tailored to the recovery site's specific capabilities without needing to mirror the primary site's configuration exactly.
- Metadata Extraction: The operator scans the source cluster for labeled storage resources and captures their specifications.
- Transformation: Using the clean_metadata logic, the operator strips environment-specific internal annotations while preserving the core volume requirements.
- Local Re-Binding: On the recovery cluster, the operator recreates the PVCs. These claims then automatically target the pre-existing local StorageClasses defined on the recovery site, ensuring the data is served via the correct local file cache or storage provider.
Key Benefit: This approach provides a "Clean Slate" recovery where storage logic is kept local to the cluster, preventing the migration of invalid or incompatible storage provider configurations from the primary site.
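As a sketch of the re-binding step, a recovered claim might look like the following, where `local-file-cache` is a hypothetical StorageClass that exists only on the recovery cluster:

```yaml
# Hypothetical recovered PVC: the StorageClass itself is never synced, so
# "local-file-cache" must already be defined on the recovery cluster
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
  namespace: production
spec:
  storageClassName: local-file-cache   # recovery-site class, defined locally
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
```

Because only the claim's core requirements travel between sites, the recovery cluster's local storage controller is free to satisfy them with its own backend.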
The PersistentVolume Sync Operator is designed for storage backends that provide cross-cluster data accessibility. For the operator to successfully reconstruct volumes on a recovery site, the underlying data must be reachable by the nodes in the Recovery cluster using the same volume handles captured from the Protected cluster.
NAS is the most common use case for the operator. Because NAS exports are addressed by a stable endpoint and path, the PV metadata (server IP/DNS and export path) remains valid across different clusters.

- Protocols: NFS, SMB/CIFS.
- Featured Support: Ctera GFS. Since Ctera provides a global file system, it is uniquely suited for this operator, allowing volumes to be mounted as RWX (ReadWriteMany) across geographic boundaries.
- Recovery Requirement: The Recovery cluster must have network routability to the same storage endpoints.
Using the operator with SAN backends (like IBM SVC) requires an additional layer to ensure data is present at both sites.
- Clustered File Systems: To achieve RWX on SAN storage, a clustered filesystem such as IBM Spectrum Scale (GPFS) should be used. The operator can sync the metadata for these volumes, provided the CSI driver handles are consistent.
- Replication Requirement: The SAN LUNs must be replicated at the hardware level (e.g., via IBM HyperSwap or Global Mirror).
- Volume Handles: The operator captures the LUN UID. The Recovery cluster must be able to "see" the same LUN UID via its local Fibre Channel or iSCSI fabric.
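For example, a CSI-backed PV for a replicated LUN could carry the LUN UID in its volume handle. The driver name and handle format below are assumptions for illustration and depend on the actual CSI driver in use:

```yaml
# Hypothetical SAN-backed PV; the captured volumeHandle must resolve to the
# same LUN UID (WWN) on the Recovery cluster's FC/iSCSI fabric
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-san-lun
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  csi:
    driver: block.csi.ibm.com                          # assumed driver name
    volumeHandle: "600507680c8084d1f80000000000001a"   # example LUN UID (WWN)
    fsType: ext4
```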
| Storage Category | Technology Example | Supported | Metadata Sync Key | Requirements for Success |
|---|---|---|---|---|
| SAN (Enterprise) | IBM SVC + Spectrum Scale | Yes | LUN UID (WWN) | Hardware-level replication (HyperSwap/Global Mirror). |
| Standard NAS | NFS / SMB | Yes | Export Path & IP/DNS | Network routability to the same storage endpoint. |
| Standard NAS | NetApp ONTAP | Yes | Volume UUID / Junction Path | SnapMirror or MetroCluster configured between sites. |
| Standard NAS | Dell PowerScale (Isilon) | Yes | File System ID / SmartConnect | SyncIQ replication and DNS-based SmartConnect availability. |
| Global File System | Ctera GFS | Yes | File Path / Share ID | Active global namespace across both sites. |
| Global File System | NetApp Global File Cache | Yes | Cache ID / Backend Volume | Backend ONTAP storage reachable with consistent cache coherency. |
| Global File System | Microsoft Azure Files (Premium) | Yes | Share Name / Storage Account | Cross-region replication (GZRS) or paired-region failover. |
| Distributed SDS | Ceph (CephFS) | Yes | Monitor IPs & FS Path | Recovery cluster must have access to the Ceph Monitor/OSD network. |
| Distributed SDS | Red Hat OpenShift Data Foundation | Yes | StorageClass / FSID | Stretch or mirrored clusters with quorum maintained. |
| Cloud Native | Longhorn | Yes | Engine Name / Frontend | Longhorn “Disaster Recovery Volumes” or cross-cluster backend. |
| Cloud Native | Portworx (Pure Storage PX) | Yes | Volume ID / Cluster UUID | PX-DR or Stork-based replication and scheduler awareness. |
While the operator can technically sync RWO (ReadWriteOnce) volumes, it is most effective for RWX (ReadWriteMany) workloads. For RWO block storage, ensure that the source cluster has fully released the volume (including any SCSI reservations) before the Recovery cluster attempts to reconstruct and mount it; otherwise, the mount operation will fail at the infrastructure level.
- 🔄 Cluster-wide PV discovery (no namespace restrictions)
- ☁️ Backend-agnostic object storage support (Azure Blob, S3, MinIO, Cloudian)
- 📤 Export storage definitions from the Protected cluster to object storage
- 📥 Recreate PV objects on the Recovery cluster pointing to the same shared storage
- 🏷 Cluster identity detection via configurable value
- 🧹 Automatic retention-based cleanup of historical exports
- 📡 Event-driven + periodic sync using Kubernetes watches and optional scheduling
- 🌐 Multi-cluster DR for shared RWX storage
- 💾 PV metadata backup and restore
- 🔁 Migration of PVs between clusters
- 🧭 Stateless failover for NFS-backed workloads
Features currently in development for the upcoming release:
- Validating admission webhook enforcing a maximum of one PersistentVolumeSync custom resource per cluster
- Advanced Helm chart for production deployments
- Update the CR status with more information:
  - `pub error_message: Option<String>`
  - `pub last_run: Option<chrono::DateTime<chrono::Utc>>`
  - `pub managed_volumes: Vec<String>`
- Optimize current logging implementation (via tracing + tracing-subscriber + EnvFilter)
- Implement traces (via tracing + tracing-subscriber + opentelemetry)
- Implement metrics (via tikv/prometheus exposed via axum)
- Instead of an external S3 watcher based on polling/listing comparison, investigate an ETag-based alternative
- Watcher optimizations (namespace exclusions, field pruning, debouncing of repeated events)
source ../00-ENV/env.sh
CVERSION="v0.6.2"
docker login ghcr.io -u bartvanbenthem -p $CR_PAT
docker build -t pvsync:$CVERSION .
docker tag pvsync:$CVERSION ghcr.io/bartvanbenthem/pvsync:$CVERSION
docker push ghcr.io/bartvanbenthem/pvsync:$CVERSION
# test image
docker run --rm -it --entrypoint /bin/sh pvsync:$CVERSION
/# ls -l /usr/local/bin/pvsync
/# /usr/local/bin/pvsync
kubectl apply -f ./config/crd/pvsync.storage.cndev.nl.yaml
# kubectl delete -f ./config/crd/pvsync.storage.cndev.nl.yaml
# secret containing object storage
source ../00-ENV/env.sh
kubectl -n kube-system create secret generic pvsync \
--from-literal=OBJECT_STORAGE_ACCOUNT=$OBJECT_STORAGE_ACCOUNT \
--from-literal=OBJECT_STORAGE_SECRET=$OBJECT_STORAGE_SECRET \
--from-literal=OBJECT_STORAGE_BUCKET=$OBJECT_STORAGE_BUCKET \
--from-literal=S3_ENDPOINT_URL=""
helm install pvsync ./config/operator/chart --create-namespace --namespace kube-system
kubectl -n kube-system get pods
# helm -n kube-system uninstall pvsync
# use label: volumesyncs.storage.cndev.nl/sync: "enabled"
# to enable a sync on a persistent volume
kubectl apply -f ./config/samples/pvsync-protected-example.yaml
kubectl describe persistentvolumesyncs.storage.cndev.nl example-protected-cluster
# kubectl delete -f ./config/samples/pvsync-protected-example.yaml
kubectl apply -f ./config/samples/pvsync-recovery-example.yaml
kubectl describe persistentvolumesyncs.storage.cndev.nl example-recovery-cluster
# kubectl delete -f ./config/samples/pvsync-recovery-example.yaml
kubectl apply -f ./config/samples/test-pv-nolabel.yaml
kubectl apply -f ./config/samples/test-pv.yaml
# kubectl delete -f ./config/samples/test-pv.yaml
# kubectl delete -f ./config/samples/test-pv-nolabel.yaml
apiVersion: storage.cndev.nl/v1alpha1
kind: PersistentVolumeSync
metadata:
name: protected-cluster
labels:
volumesyncs.storage.cndev.nl/name: protected-cluster
volumesyncs.storage.cndev.nl/part-of: pvsync-operator
annotations:
description: "Disaster Recovery PVSYNC Module"
spec:
protectedCluster: mycluster
mode: Protected
cloudProvider: azure
retention: 15
---
apiVersion: storage.cndev.nl/v1alpha1
kind: PersistentVolumeSync
metadata:
name: recovery-cluster
labels:
volumesyncs.storage.cndev.nl/name: recovery-cluster
volumesyncs.storage.cndev.nl/part-of: pvsync-operator
annotations:
description: "Disaster Recovery PVSYNC Module"
spec:
protectedCluster: mycluster
mode: Recovery
cloudProvider: azure
pollingInterval: 25 