Example Scenarios

These example scenarios for backup and restore differ depending on your version of RKE.

This walkthrough will demonstrate how to restore an etcd cluster from a local snapshot with the following steps:

  1. Back up the cluster
  2. Simulate a node failure
  3. Add a new etcd node to the cluster
  4. Restore etcd on the new node from the backup
  5. Confirm that cluster operations are restored

In this example, the Kubernetes cluster was deployed on two AWS nodes.

Name    IP         Role
node1   10.0.0.1   [controlplane, worker]
node2   10.0.0.2   [etcd]
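
For reference, a minimal cluster.yml matching this layout might look like the following sketch (the ubuntu SSH user is the one used later in this walkthrough; adjust the user and SSH key settings for your environment):

nodes:
- address: 10.0.0.1
  hostname_override: node1
  user: ubuntu
  role:
  - controlplane
  - worker
- address: 10.0.0.2
  hostname_override: node2
  user: ubuntu
  role:
  - etcd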

1. Back Up the Cluster

Take a local snapshot of the Kubernetes cluster.

You can upload this snapshot directly to an S3 backend with the S3 options.

$ rke etcd snapshot-save --name snapshot.db --config cluster.yml
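
For example, a snapshot can be uploaded straight to S3 with a command along these lines (the bucket name, endpoint, and credentials below are placeholders; check rke etcd snapshot-save --help for the exact S3 flags supported by your RKE version):

$ rke etcd snapshot-save --name snapshot.db --config cluster.yml \
    --s3 \
    --access-key S3_ACCESS_KEY \
    --secret-key S3_SECRET_KEY \
    --bucket-name backup-bucket \
    --s3-endpoint s3.amazonaws.com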

2. Simulate a Node Failure

To simulate the failure, let's power down node2.

root@node2:~# poweroff

Name    IP         Role
node1   10.0.0.1   [controlplane, worker]
node2   10.0.0.2   [etcd]  (powered off)
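
If kubectl is configured against the cluster (for example on node1), you can optionally confirm that node2 is unreachable; after a short delay it should be reported as NotReady:

> kubectl get nodes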

3. Add a New etcd Node to the Kubernetes Cluster

Before updating and restoring etcd, you will need to add the new node into the Kubernetes cluster with the etcd role. In the cluster.yml, comment out the old node and add in the new node.

nodes:
- address: 10.0.0.1
  hostname_override: node1
  user: ubuntu
  role:
  - controlplane
  - worker
# - address: 10.0.0.2
#   hostname_override: node2
#   user: ubuntu
#   role:
#   - etcd
- address: 10.0.0.3
  hostname_override: node3
  user: ubuntu
  role:
  - etcd
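
Before moving on, it can help to confirm that the new node is reachable over SSH and has Docker installed, since RKE provisions nodes over SSH. This is an optional sanity check, assuming your SSH key for the ubuntu user is already loaded:

$ ssh ubuntu@10.0.0.3 docker version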

4. Restore etcd on the New Node from the Backup

Prerequisite

If the snapshot was created with RKE v1.1.4 or higher, the cluster state file is included in the snapshot and is automatically extracted and used for the restore. If the snapshot was created with RKE v1.1.3 or lower, make sure your cluster.rkestate is present before starting the restore, because it contains the certificate data for the cluster.
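
If you are unsure which case applies, two quick checks before the restore (this assumes cluster.rkestate sits next to cluster.yml, which is where RKE creates it by default):

$ rke --version
$ ls -l cluster.rkestate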

After the new node is added to the cluster.yml, run rke etcd snapshot-restore to launch etcd from the backup:

$ rke etcd snapshot-restore --name snapshot.db --config cluster.yml

The snapshot is expected to be saved at /opt/rke/etcd-snapshots.

If you want to retrieve the snapshot directly from S3, add the S3 options.
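
For example, a restore that pulls the snapshot from S3 could look roughly like this (placeholder credentials and bucket name; the S3 flags mirror those of snapshot-save, see rke etcd snapshot-restore --help for your version):

$ rke etcd snapshot-restore --name snapshot.db --config cluster.yml \
    --s3 \
    --access-key S3_ACCESS_KEY \
    --secret-key S3_SECRET_KEY \
    --bucket-name backup-bucket \
    --s3-endpoint s3.amazonaws.com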

Note: As of v0.2.0, the file pki.bundle.tar.gz is no longer required for the restore process because the certificates required for the restore are preserved within cluster.rkestate.

5. Confirm that Cluster Operations are Restored

The rke etcd snapshot-restore command triggers rke up using the new cluster.yml. Confirm that your Kubernetes cluster is functional by checking the pods on your cluster.

> kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-65899c769f-kcdpr   1/1     Running   0          17s
nginx-65899c769f-pc45c   1/1     Running   0          17s
nginx-65899c769f-qkhml   1/1     Running   0          17s
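
As an additional check, you can verify that node3 has joined the cluster and that node2 is no longer listed (node names depend on your hostname_override values):

> kubectl get nodes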