Configure Storage in Kubernetes

Redpanda brokers must store their data on disk (/var/lib/redpanda/data). By default, the Redpanda Helm chart uses the default StorageClass in a Kubernetes cluster to create one PersistentVolumeClaim for each Pod that runs a Redpanda broker. The default StorageClass in your Kubernetes cluster depends on the Kubernetes platform that you are using. You can customize the Helm chart to use any of the following storage volumes: PersistentVolumes, hostPath volumes, or emptyDir volumes.

Prerequisites

  • If you’re configuring Redpanda for production, you must create and mount an XFS file system on any storage volumes that host the data directory of Redpanda (/var/lib/redpanda/data). XFS is a high-performance file system that is required for running Redpanda in production. NFS file systems are not supported.

  • Review the storage best practices.
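
As a quick sanity check for the XFS prerequisite, the following sketch reports the file system type that backs a directory on a worker node. The function names fs_type and is_xfs are hypothetical helpers, not part of Redpanda or Kubernetes, and the sketch assumes GNU coreutils stat is installed on the node.

```shell
# Hypothetical helper: print the type of the file system that backs a path.
# Assumes GNU coreutils stat: -f queries the file system, %T prints its type name.
fs_type() {
  stat -f -c %T "$1"
}

# Hypothetical helper: succeed only if the path is on an XFS file system.
is_xfs() {
  [ "$(fs_type "$1")" = "xfs" ]
}
```

For example, running is_xfs <mount-path> && echo "XFS" on the node prints XFS only when the mount point that hosts the Redpanda data directory is formatted as XFS.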

Use PersistentVolumes

A PersistentVolume is storage in the cluster that has been provisioned by an administrator or dynamically provisioned using StorageClasses. For details about PersistentVolumes, see the Kubernetes documentation.

You can configure the Helm chart to use PersistentVolumes with a static provisioner or a dynamic provisioner. Redpanda recommends using a StorageClass with a dynamic provisioner. See the best practices.

Dynamic provisioners

A dynamic provisioner creates a PersistentVolume on demand for each Redpanda broker.

Managed Kubernetes platforms and cloud environments usually provide a dynamic provisioner. If you are running Kubernetes on-premises, make sure that you have a dynamic provisioner for your storage type.

  1. Make sure that you have at least one StorageClass in the cluster:

    kubectl get storageclass

    Example output

    In a Google GKE cluster, this is the result:

    NAME                 PROVISIONER            AGE
    standard (default)   kubernetes.io/gce-pd   1d

    This StorageClass is marked as the default, which means that this class is used to provision a PersistentVolume when the PersistentVolumeClaim doesn’t specify the StorageClass.

  2. Configure your StorageClass:

    • To use your Kubernetes cluster’s default StorageClass, set storage.persistentVolume.storageClass to an empty string (""):

      • Helm + Operator

        redpanda-cluster.yaml

        apiVersion: cluster.redpanda.com/v1alpha1
        kind: Redpanda
        metadata:
          name: redpanda
        spec:
          chartRef: {}
          clusterSpec:
            storage:
              persistentVolume:
                enabled: true
                size: 20Gi
                storageClass: ""

        kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

      • Helm

        • --values

          storageclass.yaml

          storage:
            persistentVolume:
              enabled: true
              size: 20Gi
              storageClass: ""

          helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
            --values storageclass.yaml --reuse-values

        • --set

          helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
            --set storage.persistentVolume.enabled=true \
            --set storage.persistentVolume.size=20Gi \
            --set storage.persistentVolume.storageClass=""

    • To use a specific StorageClass, set its name in the storage.persistentVolume.storageClass configuration:

      • Helm + Operator

        redpanda-cluster.yaml

        apiVersion: cluster.redpanda.com/v1alpha1
        kind: Redpanda
        metadata:
          name: redpanda
        spec:
          chartRef: {}
          clusterSpec:
            storage:
              persistentVolume:
                enabled: true
                size: 20Gi
                storageClass: "<storage-class>"

        kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

      • Helm

        • --values

          storageclass.yaml

          storage:
            persistentVolume:
              enabled: true
              size: 20Gi
              storageClass: "<storage-class>"

          helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
            --values storageclass.yaml --reuse-values

        • --set

          helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
            --set storage.persistentVolume.enabled=true \
            --set storage.persistentVolume.size=20Gi \
            --set storage.persistentVolume.storageClass="<storage-class>"
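
After deploying with a dynamic provisioner, it can help to confirm that each broker received a bound PersistentVolumeClaim from the expected StorageClass. The following is a minimal sketch; the function name list_broker_pvcs is a hypothetical helper, and the column headers are arbitrary labels chosen for this example.

```shell
# Hypothetical helper: list the PersistentVolumeClaims in a namespace together
# with their status, StorageClass, and requested size.
# $1 is the namespace that Redpanda is deployed in.
list_broker_pvcs() {
  kubectl get persistentvolumeclaim --namespace "$1" \
    -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,STORAGECLASS:.spec.storageClassName,SIZE:.spec.resources.requests.storage
}
```

For example, list_broker_pvcs <namespace> should show one claim per broker. A claim that stays in the Pending status usually means that no provisioner could satisfy it.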

Static provisioners

When you use a static provisioner, an existing PersistentVolume in the cluster is selected and bound to one PersistentVolumeClaim for each Redpanda broker.

  1. Create one PersistentVolume for each Redpanda broker. Make sure to create PersistentVolumes with a capacity of at least the value of the storage.persistentVolume.size configuration.

  2. Set the storage.persistentVolume.storageClass to a dash ("-") to use a PersistentVolume with a static provisioner:

    • Helm + Operator

      redpanda-cluster.yaml

      apiVersion: cluster.redpanda.com/v1alpha1
      kind: Redpanda
      metadata:
        name: redpanda
      spec:
        chartRef: {}
        clusterSpec:
          storage:
            persistentVolume:
              enabled: true
              storageClass: "-"

      kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

    • Helm

      • --values

        storageclass.yaml

        storage:
          persistentVolume:
            enabled: true
            storageClass: "-"

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --values storageclass.yaml --reuse-values

      • --set

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --set storage.persistentVolume.enabled=true \
          --set storage.persistentVolume.storageClass="-"
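
Step 1 above leaves the shape of the PersistentVolume open. As one possible sketch, the following shell function prints a manifest for a local PersistentVolume pinned to a single worker node. The function name print_local_pv and all argument values are hypothetical. The sketch also assumes that setting storageClass to "-" makes the chart request an empty storageClassName, so the volume declares an empty storageClassName to match; check the generated PersistentVolumeClaim in your cluster before relying on this.

```shell
# Hypothetical helper: print a local PersistentVolume manifest.
# $1 = volume name, $2 = capacity (at least storage.persistentVolume.size),
# $3 = worker node hostname, $4 = absolute path of the mounted disk on that node.
print_local_pv() {
  pv_name="$1" pv_size="$2" pv_node="$3" pv_path="$4"
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ${pv_name}
spec:
  capacity:
    storage: ${pv_size}
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  # Assumption: an empty class matches the claim created for storageClass "-".
  storageClassName: ""
  local:
    path: ${pv_path}
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - ${pv_node}
EOF
}
```

For example, print_local_pv redpanda-pv-0 20Gi <node-name> <absolute-path> | kubectl apply -f - creates one volume. Repeat the call with a different name, node, or path for each broker.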

Resize PersistentVolumes

To give Redpanda brokers more storage, you can expand the size of PersistentVolumes. The way you expand PersistentVolumes depends on the provisioner that you use.

The process for resizing PersistentVolumes that use a static provisioner varies depending on the way your file system is allocated. Follow the recommended process for your system. You do not need to make any configuration changes to the Helm chart.

To resize a PersistentVolume that uses a dynamic provisioner:

  1. Make sure that your StorageClass is capable of volume expansions. For a list of volumes that support volume expansion, see the Kubernetes documentation.

  2. Increase the value of the storage.persistentVolume.size configuration:

    • Helm + Operator

      redpanda-cluster.yaml

      apiVersion: cluster.redpanda.com/v1alpha1
      kind: Redpanda
      metadata:
        name: redpanda
      spec:
        chartRef: {}
        clusterSpec:
          storage:
            persistentVolume:
              enabled: true
              size: <custom-size>Gi

      kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

    • Helm

      • --values

        persistentvolume-size.yaml

        storage:
          persistentVolume:
            enabled: true
            size: <custom-size>Gi

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --values persistentvolume-size.yaml --reuse-values

      • --set

        helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
          --set storage.persistentVolume.enabled=true \
          --set storage.persistentVolume.size=<custom-size>Gi
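
Before increasing the size, you can check whether a StorageClass permits expansion by reading its allowVolumeExpansion field. The following is a minimal sketch; the function name storageclass_allows_expansion is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the named StorageClass sets
# allowVolumeExpansion to true. $1 is the StorageClass name.
storageclass_allows_expansion() {
  [ "$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')" = "true" ]
}
```

For example, storageclass_allows_expansion <storage-class> || echo "expansion not enabled" prints a message when the field is unset or false. The underlying volume type must also support expansion.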

Delete PersistentVolumeClaims

To prevent accidental loss of data, PersistentVolumeClaims are not deleted when Redpanda brokers are removed from a cluster. It is your responsibility to delete PersistentVolumeClaims when they are no longer needed. Check the reclaim policy of your PersistentVolumes before deleting a PersistentVolumeClaim.

kubectl get persistentvolume

For descriptions of each reclaim policy, see the Kubernetes documentation.
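
The check-then-delete sequence can be sketched as two small shell functions. The names list_pv_reclaim_policies and delete_pvc are hypothetical helpers, and the column headers are arbitrary labels chosen for this example.

```shell
# Hypothetical helper: show each PersistentVolume with its reclaim policy
# and the claim that it is bound to. PersistentVolumes are cluster-scoped,
# so no namespace is passed.
list_pv_reclaim_policies() {
  kubectl get persistentvolume \
    -o custom-columns=NAME:.metadata.name,RECLAIM:.spec.persistentVolumeReclaimPolicy,STATUS:.status.phase,CLAIM:.spec.claimRef.name
}

# Hypothetical helper: delete one PersistentVolumeClaim.
# $1 = namespace, $2 = claim name.
delete_pvc() {
  kubectl delete persistentvolumeclaim "$2" --namespace "$1"
}
```

With a reclaim policy of Delete, removing the claim also removes the underlying volume and its data, so review the RECLAIM column before calling delete_pvc <namespace> <claim-name>.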

Use hostPath volumes

A hostPath volume mounts a file or directory from the host node’s file system into your Pod. For details about hostPath volumes, see the Kubernetes documentation.

To store Redpanda data in hostPath volumes:

  1. Set the storage.hostPath configuration to the absolute path of a directory on the local worker node.

  2. Set storage.persistentVolume.enabled to false.

  3. Set statefulset.initContainers.setDataDirOwnership.enabled to true.

Pods that run Redpanda brokers must have read/write access to their data directories. The initContainer is responsible for setting write permissions on the data directories. By default, statefulset.initContainers.setDataDirOwnership is disabled because most storage drivers call SetVolumeOwnership to give Redpanda permissions to the root of the storage mount. However, some storage drivers, such as hostPath, do not call SetVolumeOwnership. In this case, you must enable the initContainer to set the permissions.

To set permissions on the data directories, the initContainer must run as root. However, be aware that an initContainer running as root can introduce the following security risks:

  • Privilege escalation: If attackers gain access to the initContainer, they can escalate privileges to gain full control over the system. For example, attackers could use the initContainer to gain unauthorized access to sensitive data, tamper with the system, or start denial-of-service attacks.

  • Container breakouts: If the container is misconfigured or the container runtime has a vulnerability, attackers could escape from the initContainer and access the host operating system.

  • Image tampering: If attackers gain access to the container image of the initContainer, they could add malicious code or backdoors to it. Image tampering could compromise the security of the entire cluster.

Use only for development and testing

If the Pod is deleted and recreated, it might be scheduled on another worker node and no longer have access to the same hostPath volume data.

  • Helm + Operator

    redpanda-cluster.yaml

    apiVersion: cluster.redpanda.com/v1alpha1
    kind: Redpanda
    metadata:
      name: redpanda
    spec:
      chartRef: {}
      clusterSpec:
        storage:
          hostPath: "<absolute-path>"
          persistentVolume:
            enabled: false
        statefulset:
          initContainers:
            setDataDirOwnership:
              enabled: true

    kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

  • Helm

    • --values

      hostpath.yaml

      storage:
        hostPath: "<absolute-path>"
        persistentVolume:
          enabled: false
      statefulset:
        initContainers:
          setDataDirOwnership:
            enabled: true

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --values hostpath.yaml --reuse-values

    • --set

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --set storage.persistentVolume.enabled=false \
        --set storage.hostPath=<absolute-path> \
        --set statefulset.initContainers.setDataDirOwnership.enabled=true

Use emptyDir volumes

An emptyDir volume is first created when a Pod is assigned to a node, and the volume exists as long as the Pod is running on that node. For details about emptyDir volumes, see the Kubernetes documentation.

To store Redpanda data in emptyDir volumes, set the storage.hostPath configuration to an empty string (""), and set storage.persistentVolume.enabled to false.

Use only for development and testing

When a Pod is removed from a node for any reason, the data in the emptyDir volume is deleted permanently.

  • Helm + Operator

    redpanda-cluster.yaml

    apiVersion: cluster.redpanda.com/v1alpha1
    kind: Redpanda
    metadata:
      name: redpanda
    spec:
      chartRef: {}
      clusterSpec:
        storage:
          hostPath: ""
          persistentVolume:
            enabled: false

    kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

  • Helm

    • --values

      emptydir.yaml

      storage:
        hostPath: ""
        persistentVolume:
          enabled: false

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --values emptydir.yaml --reuse-values

    • --set

      helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
        --set storage.persistentVolume.enabled=false

Next steps

Enable rack awareness to minimize data loss in the event of a rack failure.