Configure Storage in Kubernetes
Redpanda brokers must store their data on disk (/var/lib/redpanda/data). By default, the Redpanda Helm chart uses the default StorageClass in a Kubernetes cluster to create one PersistentVolumeClaim for each Pod that runs a Redpanda broker. The default StorageClass in your Kubernetes cluster depends on the Kubernetes platform that you are using. You can customize the Helm chart to use any of the following storage volumes:
- PersistentVolumes
- hostPath volumes
- emptyDir volumes
Prerequisites
- If you’re configuring Redpanda for production, you must create and mount an XFS file system on any storage volumes that host the data directory of Redpanda (/var/lib/redpanda/data). XFS is a high-performance file system that is required for running Redpanda in production. NFS file systems are not supported.
- Review the storage best practices.
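For dynamically provisioned volumes, one way to meet the XFS requirement is to ask the storage driver to format each volume as XFS. The following StorageClass is a minimal sketch, assuming a CSI driver that honors the standard csi.storage.k8s.io/fstype parameter. The name redpanda-xfs and the <csi-provisioner> placeholder are illustrative and do not come from the Helm chart; check your driver's documentation for the parameters it supports.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: redpanda-xfs                    # hypothetical name
provisioner: <csi-provisioner>          # placeholder: your cluster's CSI driver
parameters:
  csi.storage.k8s.io/fstype: xfs        # assumption: the driver honors this key
volumeBindingMode: WaitForFirstConsumer # bind only after the Pod is scheduled
```

If you use a class like this, its name is what you would pass as storage.persistentVolume.storageClass.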
Use PersistentVolumes
A PersistentVolume is storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. For details about PersistentVolumes, see the Kubernetes documentation.
You can configure the Helm chart to use PersistentVolumes with a static provisioner or a dynamic provisioner. Redpanda recommends using a StorageClass with a dynamic provisioner. See the best practices.
Dynamic provisioners
A dynamic provisioner creates a PersistentVolume on demand for each Redpanda broker.
Managed Kubernetes platforms and cloud environments usually provide a dynamic provisioner. If you are running Kubernetes on-premises, make sure that you have a dynamic provisioner for your storage type.
- Make sure that you have at least one StorageClass in the cluster:

      kubectl get storageclass

  Example output
  In a Google GKE cluster, this is the result:

      NAME                 PROVISIONER            AGE
      standard (default)   kubernetes.io/gce-pd   1d

  This StorageClass is marked as the default, which means that this class is used to provision a PersistentVolume when the PersistentVolumeClaim doesn’t specify the StorageClass.
- Configure your StorageClass:

  - To use your Kubernetes cluster’s default StorageClass, set storage.persistentVolume.storageClass to an empty string (""):

    Helm + Operator

    redpanda-cluster.yaml

        apiVersion: cluster.redpanda.com/v1alpha1
        kind: Redpanda
        metadata:
          name: redpanda
        spec:
          chartRef: {}
          clusterSpec:
            storage:
              persistentVolume:
                enabled: true
                size: 20Gi
                storageClass: ""

        kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

    Helm

    --values

    storageclass.yaml

        storage:
          persistentVolume:
            enabled: true
            size: 20Gi
            storageClass: ""

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --values storageclass.yaml --reuse-values

    --set

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --set storage.persistentVolume.enabled=true \
          --set storage.persistentVolume.size=20Gi \
          --set storage.persistentVolume.storageClass=""
  - To use a specific StorageClass, set its name in the storage.persistentVolume.storageClass configuration:

    Helm + Operator

    redpanda-cluster.yaml

        apiVersion: cluster.redpanda.com/v1alpha1
        kind: Redpanda
        metadata:
          name: redpanda
        spec:
          chartRef: {}
          clusterSpec:
            storage:
              persistentVolume:
                enabled: true
                size: 20Gi
                storageClass: "<storage-class>"

        kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

    Helm

    --values

    storageclass.yaml

        storage:
          persistentVolume:
            enabled: true
            size: 20Gi
            storageClass: "<storage-class>"

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --values storageclass.yaml --reuse-values

    --set

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --set storage.persistentVolume.enabled=true \
          --set storage.persistentVolume.size=20Gi \
          --set storage.persistentVolume.storageClass="<storage-class>"
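The (default) marker in the kubectl get storageclass output comes from an annotation on the StorageClass object. As a rough sketch, a default class like the GKE one in the example output would carry the annotation shown here. Every field other than the annotation is reduced to the minimum and may differ in your cluster.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    # This annotation is what makes the class the cluster default.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/gce-pd
```

If no class in your cluster carries this annotation, an empty storageClass value has no default to fall back on, so name a specific StorageClass instead.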
Static provisioners
When you use a static provisioner, an existing PersistentVolume in the cluster is selected and bound to one PersistentVolumeClaim for each Redpanda broker.
- Create one PersistentVolume for each Redpanda broker. Make sure to create PersistentVolumes with a capacity of at least the value of the storage.persistentVolume.size configuration.
- Set storage.persistentVolume.storageClass to a dash ("-") to use a PersistentVolume with a static provisioner:

  Helm + Operator

  redpanda-cluster.yaml

      apiVersion: cluster.redpanda.com/v1alpha1
      kind: Redpanda
      metadata:
        name: redpanda
      spec:
        chartRef: {}
        clusterSpec:
          storage:
            persistentVolume:
              enabled: true
              storageClass: "-"

      kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

  Helm

  --values

  storageclass.yaml

      storage:
        persistentVolume:
          enabled: true
          storageClass: "-"

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --values storageclass.yaml --reuse-values

  --set

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --set storage.persistentVolume.enabled=true \
        --set storage.persistentVolume.storageClass="-"
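To illustrate the first step, here is a minimal sketch of one statically provisioned PersistentVolume backed by a local disk. It assumes that the claims created for storageClass "-" request no StorageClass and use the ReadWriteOnce access mode, so the volume sets no storageClassName. The name redpanda-pv-0, the 20Gi capacity, and the placeholders are illustrative. You would create one such object per broker.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redpanda-pv-0                   # hypothetical name, one PV per broker
spec:
  capacity:
    storage: 20Gi                       # at least storage.persistentVolume.size
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce                     # assumption: matches the chart's claims
  persistentVolumeReclaimPolicy: Retain
  local:
    path: <absolute-path>               # placeholder: XFS mount point on the node
  nodeAffinity:                         # Kubernetes requires this for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>           # placeholder: node that owns the disk
```

A volume like this is tied to one node, so the broker whose claim binds to it is always scheduled on that node.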
Resize PersistentVolumes
To give Redpanda brokers more storage, you can expand the size of PersistentVolumes. The way you expand PersistentVolumes depends on the provisioner that you use.
The process for resizing PersistentVolumes that use a static provisioner varies depending on the way your file system is allocated. Follow the recommended process for your system. You do not need to make any configuration changes to the Helm chart.
To resize a PersistentVolume that uses a dynamic provisioner:
- Make sure that your StorageClass supports volume expansion. For a list of volume types that support volume expansion, see the Kubernetes documentation.
- Increase the value of the storage.persistentVolume.size configuration:

  Helm + Operator

  redpanda-cluster.yaml

      apiVersion: cluster.redpanda.com/v1alpha1
      kind: Redpanda
      metadata:
        name: redpanda
      spec:
        chartRef: {}
        clusterSpec:
          storage:
            persistentVolume:
              enabled: true
              size: <custom-size>Gi

      kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

  Helm

  --values

  persistentvolume-size.yaml

      storage:
        persistentVolume:
          enabled: true
          size: <custom-size>Gi

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --values persistentvolume-size.yaml --reuse-values

  --set

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --set storage.persistentVolume.enabled=true \
        --set storage.persistentVolume.size=<custom-size>Gi
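For the first step, the field to look for on the StorageClass is allowVolumeExpansion. The following excerpt is a sketch of what an expandable class looks like when you inspect it, for example with kubectl get storageclass <storage-class> -o yaml. The placeholders stand in for your own class and provisioner.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: <storage-class>                 # placeholder: the class your claims use
provisioner: <provisioner>              # placeholder: must support expansion
allowVolumeExpansion: true              # without this, resize requests are rejected
```

If the field is missing or false, Kubernetes rejects attempts to increase the size of claims that use the class.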
Delete PersistentVolumeClaims
To prevent accidental loss of data, PersistentVolumeClaims are not deleted when Redpanda brokers are removed from a cluster. It is your responsibility to delete PersistentVolumeClaims when they are no longer needed. Check the reclaim policy of your PersistentVolumes before deleting a PersistentVolumeClaim. PersistentVolumes are cluster-scoped, so this command takes no namespace:
kubectl get persistentvolume
For descriptions of each reclaim policy, see the Kubernetes documentation.
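The reclaim policy appears in the RECLAIM POLICY column of the command output and as a field on each PersistentVolume. As a sketch, the relevant part of a volume inspected with kubectl get persistentvolume <persistent-volume-name> -o yaml looks like the following excerpt. The name is a placeholder.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: <persistent-volume-name>        # placeholder
spec:
  # Retain: the volume and its data remain after the claim is deleted.
  # Delete: the volume and its backing storage are removed with the claim.
  persistentVolumeReclaimPolicy: Retain
```

With Delete, removing the PersistentVolumeClaim also removes the broker's data, so confirm that the data is no longer needed first.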
Use hostPath volumes
A hostPath volume mounts a file or directory from the host node’s file system into your Pod. For details about hostPath volumes, see the Kubernetes documentation.
To store Redpanda data in hostPath volumes:
- Set the storage.hostPath configuration to the absolute path of a directory on the local worker node.
- Set storage.persistentVolume.enabled to false.
- Set statefulset.initContainers.setDataDirOwnership.enabled to true.
Pods that run Redpanda brokers must have read/write access to their data directories. The initContainer is responsible for setting write permissions on the data directories. By default, statefulset.initContainers.setDataDirOwnership is disabled because most storage drivers call SetVolumeOwnership to give Redpanda permissions to the root of the storage mount. However, some storage drivers, such as hostPath, do not call SetVolumeOwnership. In this case, you must enable the initContainer to set the permissions.
To set permissions on the data directories, the initContainer must run as root. However, be aware that an initContainer running as root can introduce security risks.

Use only for development and testing
If the Pod is deleted and recreated, it might be scheduled on another worker node and no longer have access to the same hostPath volume data.
Helm + Operator

redpanda-cluster.yaml

    apiVersion: cluster.redpanda.com/v1alpha1
    kind: Redpanda
    metadata:
      name: redpanda
    spec:
      chartRef: {}
      clusterSpec:
        storage:
          hostPath: "<absolute-path>"
          persistentVolume:
            enabled: false
        statefulset:
          initContainers:
            setDataDirOwnership:
              enabled: true

    kubectl apply -f redpanda-cluster.yaml --namespace <namespace>
Helm

--values

hostpath.yaml

    storage:
      hostPath: "<absolute-path>"
      persistentVolume:
        enabled: false
    statefulset:
      initContainers:
        setDataDirOwnership:
          enabled: true

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --values hostpath.yaml --reuse-values

--set

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --set storage.persistentVolume.enabled=false \
      --set storage.hostPath=<absolute-path> \
      --set statefulset.initContainers.setDataDirOwnership.enabled=true
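To make the node dependency concrete, the following generic Pod excerpt sketches how a hostPath volume maps a node directory to the Redpanda data directory. It is not the manifest that the Helm chart renders. The Pod name, the volume name datadir, and the image placeholder are assumptions for illustration only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hostpath-example                # hypothetical name
spec:
  containers:
    - name: redpanda
      image: <redpanda-image>           # placeholder
      volumeMounts:
        - name: datadir                 # hypothetical volume name
          mountPath: /var/lib/redpanda/data
  volumes:
    - name: datadir
      hostPath:
        path: <absolute-path>           # same value as storage.hostPath
```

Because the path refers to whichever node runs the Pod, the same manifest sees different data on different nodes.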
Use emptyDir volumes
An emptyDir volume is first created when a Pod is assigned to a node, and the volume exists as long as the Pod is running on that node. For details about emptyDir volumes, see the Kubernetes documentation.
To store Redpanda data in emptyDir volumes, set the storage.hostPath configuration to an empty string (""), and set storage.persistentVolume.enabled to false.
Use only for development and testing
When a Pod is removed from a node for any reason, the data in the emptyDir volume is deleted permanently.
Helm + Operator

redpanda-cluster.yaml

    apiVersion: cluster.redpanda.com/v1alpha1
    kind: Redpanda
    metadata:
      name: redpanda
    spec:
      chartRef: {}
      clusterSpec:
        storage:
          hostPath: ""
          persistentVolume:
            enabled: false

    kubectl apply -f redpanda-cluster.yaml --namespace <namespace>
Helm

--values

emptydir.yaml

    storage:
      hostPath: ""
      persistentVolume:
        enabled: false

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --values emptydir.yaml --reuse-values

--set

    helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
      --set storage.hostPath="" \
      --set storage.persistentVolume.enabled=false
Next steps
Enable rack awareness to minimize data loss in the event of a rack failure.