
Backup and Restore etcd in RKE2

This how-to is a step-by-step guide to backing up and restoring the etcd database in a Rancher Kubernetes Engine 2 (RKE2) environment. It details the default backup settings, including file path locations, explains how to configure an S3 bucket as a backup target, outlines how to view etcd backups through a Kubernetes ConfigMap, and provides the steps to restore the etcd database from a snapshot, including the actions required on the Control Plane nodes.

note

Placeholder values are used throughout the guide and should be replaced with information pertinent to your Kubernetes setup.

Configuration

By default, RKE2 stores backups on the local filesystem at /var/lib/rancher/rke2/server/db/snapshots/.
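RKE2 also takes scheduled snapshots automatically; the defaults are one snapshot every 12 hours with five snapshots retained. If you need a different cadence, the following config.yaml keys are a sketch using the standard RKE2 snapshot options; verify them against your RKE2 version:

etcd-snapshot-schedule-cron: "0 */12 * * *"
etcd-snapshot-retention: 5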

To configure an S3 bucket as the backup target, add the following settings to /etc/rancher/rke2/config.yaml:

etcd-s3: true
etcd-s3-region: <aws-region>
etcd-s3-bucket: <s3-bucket-name>
etcd-s3-folder: <s3-folder-name>
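
If the nodes cannot reach the bucket through an IAM instance profile, you also need credentials, and for non-AWS object stores an endpoint. The keys below are the standard RKE2 etcd S3 options, shown as a sketch with placeholder values:

etcd-s3-endpoint: <s3-endpoint>
etcd-s3-access-key: <s3-access-key>
etcd-s3-secret-key: <s3-secret-key>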

etcd Backup

You can view the etcd backups in a designated ConfigMap specific to your Kubernetes setup.

kubectl get cm -n <namespace> <etcd-snapshots-configmap> -o yaml

Each snapshot appears as an entry under the ConfigMap's data, one for the local copy and one for the S3 copy:

local-etcd-snapshot-<node-name>-<timestamp>: '{"name":"etcd-snapshot-<node-name>-<timestamp>","location":"file://<path-to-snapshot>","nodeName":"<node-name>","createdAt":"<creation-date>","size":<snapshot-size>,"status":"successful","compressed":false}'

s3-etcd-snapshot-<node-name>-<timestamp>: '{"name":"etcd-snapshot-<node-name>-<timestamp>","nodeName":"s3","size":<snapshot-size>,"status":"successful","s3Config":{"endpoint":"<s3-endpoint>","bucket":"<s3-bucket-name>","region":"<s3-region>","folder":"<s3-folder>"},"compressed":false}'
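
For a default RKE2 installation, the ConfigMap is typically rke2-etcd-snapshots in the kube-system namespace (an assumption to verify on your cluster):

kubectl get cm -n kube-system rke2-etcd-snapshots -o yaml

You can also take an on-demand snapshot from any server node with rke2 etcd-snapshot save; it is written to the same snapshot directory (and to S3, if configured) and is reflected in the ConfigMap.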

etcd Restore

To restore etcd, follow the official RKE2 documentation for restoring a snapshot to existing nodes; the steps below summarize that process. A restore can take several minutes to complete, so allow it time to finish.

  • On all Control Plane nodes, stop the RKE2 service:
systemctl stop rke2-server
  • On the first Control Plane node, perform the cluster reset (for a snapshot stored in S3, see the variant after this list):
rke2 server \
--cluster-reset \
--cluster-reset-restore-path=<PATH-TO-SNAPSHOT>
  • On the first Control Plane node, start the RKE2 service:
systemctl start rke2-server
  • On the remaining Control Plane nodes, delete the etcd data directory and start the RKE2 service:
rm -rf /var/lib/rancher/rke2/server/db
systemctl start rke2-server
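
If the snapshot is stored in S3, the cluster reset on the first Control Plane node can restore it directly from the bucket. The following is a sketch reusing the S3 settings from the Configuration section; pass the snapshot name, not a local path:

rke2 server \
--cluster-reset \
--etcd-s3 \
--etcd-s3-bucket=<s3-bucket-name> \
--etcd-s3-region=<aws-region> \
--etcd-s3-folder=<s3-folder-name> \
--cluster-reset-restore-path=<snapshot-name>
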
info

In this how-to, replace placeholders such as <namespace>, <etcd-snapshots-configmap>, <node-name>, and <timestamp> with the actual values for your Kubernetes setup.