Storage & Data Management with Rook Ceph

The Armada Edge Platform leverages Rook Ceph as the primary distributed storage solution, providing reliable, scalable, and high-performance storage across edge deployments.

Overview

Rook Ceph delivers enterprise-grade storage capabilities optimized for edge environments:

  • Distributed Storage: Fault-tolerant storage across multiple edge nodes
  • Multiple Storage Types: Block (RBD), Object (S3), and File (CephFS) storage
  • Self-Healing: Automatic recovery from hardware failures
  • Edge-Optimized: Minimal resource footprint suitable for edge deployments

Storage Classes

Available Storage Classes

The platform provides several pre-configured storage classes optimized for different use cases:

# High-performance SSD storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
# Critical-data storage (volumes retained after PVC deletion)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block-retain
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Retain
volumeBindingMode: Immediate
---
# Shared file system storage
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph
  fsName: myfs
reclaimPolicy: Delete
volumeBindingMode: Immediate

Storage Class Usage Guidelines

| Storage Class | Use Case | Performance | Durability |
| --- | --- | --- | --- |
| rook-ceph-block | Databases, single-pod applications | High | High (3x replication) |
| rook-ceph-block-retain | Critical data, backup storage | High | Very High (retained after deletion) |
| rook-cephfs | Multi-pod shared storage, logs | Medium | High (3x replication) |

PersistentVolume Configuration

Basic Block Storage

For applications requiring dedicated block storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
  storageClassName: rook-ceph-block
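
To consume the claim, reference it from a pod's volumes section. A minimal sketch (the pod and container names are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: database                  # illustrative name
spec:
  containers:
    - name: db
      image: postgres:15
      env:
        - name: POSTGRES_PASSWORD
          value: example          # illustrative; use a Secret in practice
      volumeMounts:
        # Mounts the rook-ceph-block volume claimed above
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-storage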

Shared File Storage

For applications requiring shared access across multiple pods:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-logs
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: rook-cephfs
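
Because the claim requests ReadWriteMany, several pods can mount the same volume at once. A minimal sketch of a multi-replica Deployment writing to the shared volume (names and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: log-writer               # illustrative name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: log-writer
  template:
    metadata:
      labels:
        app: log-writer
    spec:
      containers:
        - name: writer
          image: busybox:1.36
          # Each replica appends its hostname to a file on the shared CephFS volume
          command: ["sh", "-c", "while true; do hostname >> /logs/writers.log; sleep 10; done"]
          volumeMounts:
            - name: logs
              mountPath: /logs
      volumes:
        - name: logs
          persistentVolumeClaim:
            claimName: shared-logs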

StatefulSet Storage

For stateful applications like databases:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            - name: POSTGRES_PASSWORD
              value: example          # illustrative; use a Secret in practice
          volumeMounts:
            - name: postgres-storage
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: postgres-storage
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
        storageClassName: rook-ceph-block
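
The volumeClaimTemplates entry creates one dedicated rook-ceph-block PVC per replica (postgres-storage-postgres-0, -1, -2), which you can verify once the pods are scheduled:

# One PVC per StatefulSet replica
kubectl get pvc | grep postgres-storage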

Edge-Specific Storage Optimizations

Bandwidth-Conscious Configurations

Optimize for limited network bandwidth between edge nodes:

# Reduce replication for non-critical data
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-edge-optimized
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: edge-pool # Pool with replication factor 2
  imageFormat: "2"
  imageFeatures: layering
reclaimPolicy: Delete
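
The edge-pool referenced above must exist before this storage class can provision volumes. A minimal sketch of such a pool, assuming the default rook-ceph namespace and a host-level failure domain:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: edge-pool
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 2          # two copies instead of three to save bandwidth and capacity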

Local Storage for Performance-Critical Workloads

For ultra-low latency requirements:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
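
The no-provisioner storage class does not create volumes dynamically; each local disk must be registered as a PersistentVolume pinned to its node. A sketch, assuming a hypothetical node name and mount path:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-ssd-node1          # illustrative name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-ssd
  local:
    path: /mnt/ssd               # assumed mount point of the local SSD
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - edge-node-1    # assumed node name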

Backup and Disaster Recovery

Automated Snapshots

Define a snapshot class and take point-in-time snapshots of critical data:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: rook-ceph-snapshot
driver: rook-ceph.rbd.csi.ceph.com
deletionPolicy: Delete
parameters:
  clusterID: rook-ceph
  pool: replicapool
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: database-snapshot
spec:
  volumeSnapshotClassName: rook-ceph-snapshot
  source:
    persistentVolumeClaimName: database-storage
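
To restore, create a new PVC that uses the snapshot as its data source (the claim name here is illustrative):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage-restored   # illustrative name
spec:
  storageClassName: rook-ceph-block
  dataSource:
    name: database-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi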

Cross-Site Replication

For disaster recovery across edge sites, enable RBD mirroring on the pool and deploy the rbd-mirror daemon; peering is then established by exchanging a bootstrap peer token with the remote cluster:

# Enable image-level mirroring on the pool and run an rbd-mirror daemon
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  replicated:
    size: 3
  mirroring:
    enabled: true
    mode: image
---
apiVersion: ceph.rook.io/v1
kind: CephRBDMirror
metadata:
  name: rbd-mirror
  namespace: rook-ceph
spec:
  count: 1

Monitoring and Observability

Storage Metrics

Key metrics to monitor:

# Ceph cluster health
ceph_health_status
ceph_cluster_total_bytes
ceph_cluster_used_bytes

# Pool metrics
ceph_pool_stored_bytes
ceph_pool_objects

# OSD metrics
ceph_osd_up
ceph_osd_in
ceph_disk_occupation
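
For example, overall cluster utilization as a percentage can be derived directly from the byte counters:

# Percentage of raw capacity currently in use
100 * ceph_cluster_used_bytes / ceph_cluster_total_bytes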

Alerting Rules

Critical storage alerts:

# Low storage space
- alert: CephClusterSpaceLow
  expr: ceph_cluster_used_bytes / ceph_cluster_total_bytes > 0.85
  for: 5m
  annotations:
    summary: "Ceph cluster storage usage high"

# OSD down
- alert: CephOSDDown
  expr: ceph_osd_up == 0
  for: 1m
  annotations:
    summary: "Ceph OSD is down"

Best Practices

Performance Optimization

  1. Choose appropriate storage class based on workload requirements
  2. Use local SSDs for Ceph OSDs when possible
  3. Configure proper CPU/memory limits for Ceph processes
  4. Monitor IOPS and latency regularly
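
For item 3, Rook lets you set per-daemon requests and limits in the CephCluster spec. A sketch of the relevant portion only (values are illustrative and should be tuned to the edge hardware):

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  resources:
    osd:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        memory: 4Gi
    mon:
      requests:
        cpu: 500m
        memory: 1Gi
      limits:
        memory: 2Gi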

Security

  1. Enable encryption at rest for sensitive data
  2. Use RBAC to control storage access
  3. Regular security updates for Ceph components
  4. Network segmentation for Ceph traffic

Capacity Planning

  1. Monitor growth trends and plan capacity accordingly
  2. Maintain 20% free space for optimal performance
  3. Consider replication factor in capacity calculations
  4. Plan for hardware failures and replacement cycles
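
As a worked example: to provide 10 TiB of usable capacity with 3x replication while keeping 20% of raw space free, plan for roughly 10 TiB × 3 / 0.8 ≈ 37.5 TiB of raw disk across the cluster.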

Troubleshooting

Common Issues

PVC stuck in Pending state:

# Check storage class availability
kubectl get storageclass

# Check Ceph cluster health
kubectl -n rook-ceph get cephcluster

# Check CSI driver pods
kubectl -n rook-ceph get pods -l app=csi-rbdplugin
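
The events on the claim itself usually point to the root cause (replace <pvc-name> with the name of the pending claim):

# Inspect provisioning events on the stuck claim
kubectl describe pvc <pvc-name>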

Poor storage performance:

# Check OSD status
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd status

# Check OSD commit/apply latency
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd perf

# Monitor IOPS
kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph osd pool stats

Support Resources


This guide provides comprehensive coverage of Rook Ceph storage management on the Armada Edge Platform. For additional support, consult the troubleshooting section or contact the platform team.