# Storage & Data Management with Rook Ceph

The Armada Edge Platform leverages Rook Ceph as the primary distributed storage solution, providing reliable, scalable, and high-performance storage across edge deployments.
## Overview
Rook Ceph delivers enterprise-grade storage capabilities optimized for edge environments:
- Distributed Storage: Fault-tolerant storage across multiple edge nodes
- Multiple Storage Types: Block (RBD), Object (S3), and File (CephFS) storage
- Self-Healing: Automatic recovery from hardware failures
- Edge-Optimized: Minimal resource footprint suitable for edge deployments
## Storage Classes

### Available Storage Classes
The platform provides several pre-configured storage classes optimized for different use cases:
```yaml
# High-performance SSD storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
# Retained storage for critical data (PVs kept after PVC deletion)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-retain
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Retain
volumeBindingMode: Immediate
---
# Shared file system storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
reclaimPolicy: Delete
volumeBindingMode: Immediate
```
### Storage Class Usage Guidelines

| Storage Class | Use Case | Performance | Durability |
|---|---|---|---|
| rook-ceph-block | Databases, single-pod applications | High | High (3x replication) |
| rook-ceph-block-retain | Critical data, backup storage | High | High (3x replication); PV retained after PVC deletion |
| rook-cephfs | Multi-pod shared storage, logs | Medium | High (3x replication) |
## PersistentVolume Configuration

### Basic Block Storage
For applications requiring dedicated block storage:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: rook-ceph-block
```
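A claim like this is consumed by referencing it from a pod spec. A minimal sketch, assuming the `database-storage` claim above (the pod name, image, and command are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: database  # illustrative name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-storage
```

Because `rook-ceph-block` uses `Immediate` binding, the volume is provisioned as soon as the claim is created; the pod then attaches it on whichever node it is scheduled to.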
### Shared File Storage
For applications requiring shared access across multiple pods:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-logs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: rook-cephfs
```
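Because the claim requests `ReadWriteMany`, several pods can mount the same CephFS volume concurrently. A minimal sketch of a two-replica Deployment sharing it (the names and writer command are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-writer  # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: log-writer
  template:
    metadata:
      labels:
        app: log-writer
    spec:
      containers:
        - name: writer
          image: busybox:1.36
          # Both replicas append to the same CephFS-backed directory
          command: ["sh", "-c", "while true; do date >> /logs/out.log; sleep 10; done"]
          volumeMounts:
            - name: logs
              mountPath: /logs
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: shared-logs
```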
### StatefulSet Storage
For stateful applications like databases:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            # Required by the postgres image; the secret name is a placeholder
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: password
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-storage
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: rook-ceph-block
```
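Each replica receives its own claim from the template, named `<claim>-<statefulset>-<ordinal>`; after the rollout the claims can be verified with:

```bash
# Expect one PVC per replica: postgres-storage-postgres-0, -1 and -2
kubectl get pvc
```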
## Edge-Specific Storage Optimizations

### Bandwidth-Conscious Configurations
Optimize for limited network bandwidth between edge nodes:
```yaml
# Reduce replication for non-critical data
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-edge-optimized
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: edge-pool  # Pool with replication factor 2
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Delete
```
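The `edge-pool` referenced above has to exist as a CephBlockPool with a replica count of 2. A minimal sketch (the host failure domain is an assumption about the edge site layout):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: edge-pool
  namespace: rook-ceph
spec:
  failureDomain: host  # assumption: individual nodes are the failure domain
  replicated:
    size: 2            # two copies instead of three to cut replication traffic
```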
### Local Storage for Performance-Critical Workloads
For ultra-low latency requirements:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```
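With the `no-provisioner` provisioner, PersistentVolumes are not created dynamically; one has to be defined per local disk. A minimal sketch, assuming an SSD mounted at `/mnt/disks/ssd0` on node `edge-node-01` (both the path and node name are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-ssd-edge-node-01
spec:
  capacity:
    storage: 200Gi
  accessModes:
    - ReadWriteOnce
  # Statically created local PVs are typically retained and cleaned up manually
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-ssd
  local:
    path: /mnt/disks/ssd0  # illustrative mount point
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - edge-node-01  # illustrative node name
```

Note that data on a local PV lives only on that node; it trades Ceph's fault tolerance for lower latency.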
## Backup and Disaster Recovery

### Automated Snapshots

Define a snapshot class and take snapshots of critical data:
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: rook-ceph-snapshot
driver: rook-ceph.rbd.csi.ceph.com
deletionPolicy: Delete
parameters:
  clusterID: rook-ceph
  pool: replicapool
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: database-snapshot
spec:
  volumeSnapshotClassName: rook-ceph-snapshot
  source:
    persistentVolumeClaimName: database-storage
```
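A snapshot can later be restored by creating a new claim that uses it as a data source; a minimal sketch, assuming the `database-snapshot` above:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage-restored  # illustrative name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi  # must be at least the size of the snapshotted volume
  storageClassName: rook-ceph-block
  dataSource:
    name: database-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```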
### Cross-Site Replication
For disaster recovery across edge sites:
```yaml
# Mirror configuration for critical data: enable mirroring on the pool
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
  mirroring:
    enabled: true
    mode: image  # per-image mirroring; peer bootstrap secrets must also be exchanged
---
# rbd-mirror daemon that performs the replication to the remote site
apiVersion: ceph.rook.io/v1
kind: CephRBDMirror
metadata:
  name: rbd-mirror
  namespace: rook-ceph
spec:
  count: 1
```
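Mirroring health can be spot-checked from the toolbox pod (assuming the rook-ceph-tools deployment used in the troubleshooting section is installed):

```bash
# Peer sites, per-image health and replication lag for the mirrored pool
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- rbd mirror pool status replicapool --verbose
```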
## Monitoring and Observability

### Storage Metrics
Key metrics to monitor:
```
# Ceph cluster health
ceph_health_status
ceph_cluster_total_bytes
ceph_cluster_used_bytes

# Pool metrics
ceph_pool_stored_bytes
ceph_pool_objects

# OSD metrics
ceph_osd_up
ceph_osd_in
ceph_disk_occupation
```
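These metrics are exposed by the Ceph manager's Prometheus module. The same information can be spot-checked directly from the toolbox pod:

```bash
# Ad-hoc view of cluster health and capacity
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph status
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph df
```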
### Alerting Rules
Critical storage alerts:
```yaml
# Low storage space
- alert: CephClusterSpaceLow
  expr: ceph_cluster_used_bytes / ceph_cluster_total_bytes > 0.85
  for: 5m
  annotations:
    summary: "Ceph cluster storage usage high"

# OSD down
- alert: CephOSDDown
  expr: ceph_osd_up == 0
  for: 1m
  annotations:
    summary: "Ceph OSD is down"
```
## Best Practices

### Performance Optimization
- Choose appropriate storage class based on workload requirements
- Use local SSDs for Ceph OSDs when possible
- Configure proper CPU/memory limits for Ceph processes (see the sketch after this list)
- Monitor IOPS and latency regularly
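In Rook, the CPU/memory limits mentioned above are set per daemon type in the CephCluster `spec.resources` section. A partial sketch (the values are illustrative and depend on node size and workload):

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  # ...existing cluster settings unchanged...
  resources:
    osd:
      requests:
        cpu: "1"
        memory: 4Gi
      limits:
        memory: 4Gi
    mon:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        memory: 2Gi
```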
### Security

- Enable encryption at rest for sensitive data (see the sketch after this list)
- Use RBAC to control storage access
- Regular security updates for Ceph components
- Network segmentation for Ceph traffic
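To illustrate the encryption-at-rest item, the Ceph CSI RBD driver can encrypt each volume with LUKS when the StorageClass requests it. A sketch, assuming a key-management entry named `edge-kms` has been configured for ceph-csi (the KMS setup itself is not shown and the name is illustrative):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-encrypted
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  encrypted: "true"          # LUKS-encrypt each RBD image at rest
  encryptionKMSID: edge-kms  # illustrative: key-management entry configured for ceph-csi
reclaimPolicy: Delete
volumeBindingMode: Immediate
```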
### Capacity Planning
- Monitor growth trends and plan capacity accordingly
- Maintain 20% free space for optimal performance
- Consider the replication factor in capacity calculations (see the example after this list)
- Plan for hardware failures and replacement cycles
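As a quick worked example for the replication-factor item: with the default 3x replication, 30 TiB of raw capacity yields roughly 10 TiB of usable space. `ceph df` from the toolbox reports both views:

```bash
# RAW STORAGE shows physical capacity; the POOLS section shows logical data
# (STORED) versus space consumed after replication (USED)
kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph df
```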
## Troubleshooting

### Common Issues

**PVC stuck in Pending state:**
```bash
# Check storage class availability
kubectl get storageclass

# Check Ceph cluster health
kubectl -n rook-ceph get cephcluster

# Check CSI driver pods
kubectl -n rook-ceph get pods -l app=csi-rbdplugin
```
**Poor storage performance:**

```bash
# Check OSD status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd status

# Check OSD commit/apply latency
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd perf

# Monitor per-pool client I/O rates
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool stats
```
### Support Resources
- Rook Documentation
- Ceph Documentation
- Internal monitoring dashboards
- Support ticket system for escalation
This guide provides comprehensive coverage of Rook Ceph storage management on the Armada Edge Platform. For additional support, consult the troubleshooting section or contact the platform team.