
Etcd Backup and Restore with BCM 10.0

Prerequisites
  • The following article was written with Bright Cluster Manager version 10 (10.23.09 or newer) in mind.
  • We assume shared storage is available as a mount at /cm/shared; we will create a target directory there for our Etcd backups.
  • The backup part of this KB article (Section 2) is always a good idea, and it can be executed on a running system, without downtime of Etcd. A snapshot is always made on the Etcd leader, and can be used in the future to restore all Etcd members of the cluster in the case of required disaster recovery.
  • The restore part of this KB article (Section 3) should only be followed if your entire Etcd cluster has to be recreated from the backup, and/or downtime is acceptable. If you run a multi-node Etcd cluster, broken members can be replaced or fixed by synchronizing from the remaining working Etcd members. This is often a better approach when possible, and is explained in this KB article:  https://kb.brightcomputing.com/knowledge-base/etcd-membership-reconfiguration-in-bright-9-0/
1. Etcd installations
  • Bright Kubernetes setups always require an odd number of Etcd nodes. Typically this is one or three nodes.
  • Three Etcd nodes are recommended, but single-node Etcd deployments are also possible. If you have a single-node deployment it is worth considering adding two more nodes by following the “Add new nodes” portion of the following KB article: https://kb.brightcomputing.com/knowledge-base/etcd-membership-reconfiguration-in-bright-9-0/
  • Etcd members, when installed on Compute Nodes, are marked as datanodes. This prevents Full Provisioning from unintentionally wiping the Etcd database (usually stored in /var/lib/etcd).
  • Etcd stores its data in /var/lib/etcd by default; BCM calls this the spool directory.
The spool directory is configurable; it can be found in the Etcd::Host role via cmsh. Please note that in this example the Kubernetes cluster’s label is default; the name of the Configuration Overlay is different for each Kubernetes cluster managed by BCM. In the example below, the spool directory was left at its default.
[cluster->configurationoverlay[kube-default-etcd]->roles[Etcd::Host]]% get spool
/var/lib/etcd

If your spool directory is different, please use that instead of /var/lib/etcd throughout the rest of the KB article.
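On the sizing recommendation above: Etcd uses Raft, which needs a majority (quorum) of members to stay available, and an even-sized cluster adds no fault tolerance over the next-smaller odd size. A quick illustration of the arithmetic:

```shell
# Quorum for an n-member Raft cluster is floor(n/2)+1; the cluster
# tolerates n-quorum simultaneous member failures.
for n in 1 2 3 4 5; do
    quorum=$(( n / 2 + 1 ))
    tolerated=$(( n - quorum ))
    echo "$n members: quorum=$quorum, tolerated failures=$tolerated"
done
```

Note that 4 members tolerate the same single failure as 3, which is why one, three or five members are the usual choices.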

2. Create the snapshot
2.1. Check the cluster health

First log in to one of the Etcd cluster nodes (any of the devices with the Etcd::Host role). Load the module file and check the endpoint health. Please note that we set an additional environment variable (ETCDCTL_ENDPOINTS, only relevant for three-member Etcd clusters) to get the output for all endpoints at once. Please also note that in the example below the Kubernetes cluster has the label/name default; this label can be different.

# module load etcd/kube-default/<version>
# ETCDCTL_ENDPOINTS=$(etcdctl member list | awk -F ',' '{print $5}' | sed 's/\s//' | paste -sd ",")
# etcdctl -w table endpoint health
+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
|     https://10.141.0.2:2379 |   true | 15.065402ms |       |
|     https://10.141.0.1:2379 |   true | 22.261303ms |       |
| https://10.141.255.254:2379 |   true |  15.98154ms |       |
+-----------------------------+--------+-------------+-------+

In this example output the Health is good for all endpoints, which means we are good to go for creating a snapshot. If not, please fix the broken node first (see https://kb.brightcomputing.com/knowledge-base/etcd-membership-reconfiguration-in-bright-9-0/).
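The ETCDCTL_ENDPOINTS one-liner above works because `etcdctl member list` prints one comma-separated record per member, with the client URL in the fifth field. A sketch of the same pipeline against hypothetical member-list output:

```shell
# Two hypothetical member-list records (ID, status, name, peer URL, client URL, learner):
sample='10cee25dc156ff4a, started, node002, https://10.141.0.2:2380, https://10.141.0.2:2379, false
4a336cbcb0bafdc0, started, node001, https://10.141.0.1:2380, https://10.141.0.1:2379, false'

# Field 5 is the client URL; sed strips the leading space, paste joins with commas:
endpoints=$(printf '%s\n' "$sample" | awk -F ',' '{print $5}' | sed 's/\s//' | paste -sd ",")
echo "$endpoints"
# -> https://10.141.0.2:2379,https://10.141.0.1:2379
```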

2.2. Find the Etcd leader

Now we execute another query to find the leader.

# etcdctl -w table endpoint status
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     https://10.141.0.2:2379 | 10cee25dc156ff4a |  3.5.15 |   34 MB |     false |      false |        21 |     132769 |             132769 |        |
| https://10.141.255.254:2379 | 1e8ae2b8f6e1cbd9 |  3.5.15 |   34 MB |     false |      false |        21 |     132769 |             132769 |        |
|     https://10.141.0.1:2379 | 4a336cbcb0bafdc0 |  3.5.15 |   35 MB |      true |      false |        21 |     132769 |             132769 |        |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Here we see that the leader node is the endpoint with the 10.141.0.1 internal IP. This is the internal IP for node001 in this example cluster. We will now switch to that node (if it happens to be the node we are already on, there is no need to ssh).
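Reading the IS LEADER column by eye works, but the leader endpoint can also be extracted with a short pipeline. A sketch against a trimmed copy of the table rows above (in the `|`-separated table, field 6 is IS LEADER and field 2 is the endpoint):

```shell
# Two trimmed data rows from `etcdctl -w table endpoint status`:
sample='|     https://10.141.0.2:2379 | 10cee25dc156ff4a |  3.5.15 |   34 MB |     false |
|     https://10.141.0.1:2379 | 4a336cbcb0bafdc0 |  3.5.15 |   35 MB |      true |'

# Print the (whitespace-stripped) endpoint of the row where IS LEADER is true:
leader=$(printf '%s\n' "$sample" | awk -F '|' '$6 ~ /true/ {gsub(/ /, "", $2); print $2}')
echo "$leader"
# -> https://10.141.0.1:2379
```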

root@??# ssh root@10.141.0.1 # node001

We will have to “reload” the module file, and this time we will not set the $ETCDCTL_ENDPOINTS environment variable!

root@node001:~# module unload etcd/kube-default/3.5.15
root@node001:~# module load etcd/kube-default/3.5.15

(If we just SSH-ed to node001, we do not have to unload first; the module unload would print an error, which can be ignored.)

We can verify we indeed only have the leader as an endpoint to talk to using:

root@node001:~# echo $ETCDCTL_ENDPOINTS 
https://10.141.0.1:2379
2.3. Prepare backup location

We will use a directory on the /cm/shared storage, which is shared among all Etcd hosts in our example cluster.

root@node001:~# mkdir -p /cm/shared/backup/etcd
2.4. Use etcdctl to save the snapshot

We make the snapshot on this Etcd leader node. The snapshot contains the entire Etcd state for all members.

root@node001:~# etcdctl snapshot save "/cm/shared/backup/etcd/etcd-$(hostname)-$(date +"%Y%m%d%H%M%S")"
{"level":"info","ts":"2024-10-17T16:38:09.943338+0200","caller":"snapshot/v3_snapshot.go:65","msg":"created temporary db file","path":"/cm/shared/backup/etcd/etcd-node001-20241017163809.part"}
{"level":"info","ts":"2024-10-17T16:38:09.953540+0200","logger":"client","caller":"v3@v3.5.15/maintenance.go:212","msg":"opened snapshot stream; downloading"}
{"level":"info","ts":"2024-10-17T16:38:09.954000+0200","caller":"snapshot/v3_snapshot.go:73","msg":"fetching snapshot","endpoint":"https://10.141.0.1:2379"}
{"level":"info","ts":"2024-10-17T16:38:10.432409+0200","logger":"client","caller":"v3@v3.5.15/maintenance.go:220","msg":"completed snapshot read; closing"}
{"level":"info","ts":"2024-10-17T16:38:10.705496+0200","caller":"snapshot/v3_snapshot.go:88","msg":"fetched snapshot","endpoint":"https://10.141.0.1:2379","size":"35 MB","took":"now"}
{"level":"info","ts":"2024-10-17T16:38:10.708300+0200","caller":"snapshot/v3_snapshot.go:97","msg":"saved","path":"/cm/shared/backup/etcd/etcd-node001-20241017163809"}
Snapshot saved at /cm/shared/backup/etcd/etcd-node001-20241017163809

As can be seen in the above output, we saved a snapshot to: /cm/shared/backup/etcd/etcd-node001-20241017163809.
This file can be used to restore all Etcd members when needed, see the next section (Section 3).
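For recurring backups, the timestamped naming used above can be wrapped in a small script. A minimal sketch; the 30-day retention period and the ETCD_BACKUP_DIR override are assumptions for illustration, not BCM defaults:

```shell
# Build a snapshot path the same way the save command above does:
# etcd-<hostname>-<YYYYmmddHHMMSS>
backup_dir="${ETCD_BACKUP_DIR:-/cm/shared/backup/etcd}"
snap="$backup_dir/etcd-$(hostname)-$(date +%Y%m%d%H%M%S)"
echo "$snap"

# The save itself plus a 30-day retention sweep; this part assumes the
# etcd module environment is loaded and we are on the leader node:
if command -v etcdctl >/dev/null 2>&1; then
    mkdir -p "$backup_dir"
    etcdctl snapshot save "$snap"
    find "$backup_dir" -type f -name 'etcd-*' -mtime +30 -delete
fi
```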


3. Restore the snapshot

This section assumes that there is an actual need to restore a snapshot, for example after hardware failure of all Etcd members. If some members are still up and running, then replacing, adding or removing members while the cluster remains operational is possibly a better solution; again, the following KB article might be more appropriate for that case: https://kb.brightcomputing.com/knowledge-base/etcd-membership-reconfiguration-in-bright-9-0/.

If the above KB article cannot save the Etcd cluster, then this section explains how to restore all Etcd members at once from the snapshot. This requires downtime of Etcd: we do not want Etcd to corrupt its database while we are restoring the snapshot.

3.1. Make sure all Etcd Hosts are UP

This section assumes a disaster recovery scenario: perhaps one server is completely new, or all servers are corrupted or suffered data loss. Before we can restore the database, all the nodes need to be up and running. We can check this using cmsh.

root@bcm:~# cmsh
[bcm]% device
[bcm->device]% list -l etcd::host
Type                   Hostname (key)    MAC                Category         Ip              Network        Status          
---------------------- ----------------- ------------------ ---------------- --------------- -------------- ----------------
HeadNode               bcm               FA:16:3E:CB:0F:A1                    10.141.255.254  internalnet    [   UP   ]      
PhysicalNode           node001           FA:16:3E:47:A8:B6  default          10.141.0.1      internalnet    [   UP   ]      
PhysicalNode           node002           FA:16:3E:C9:2D:95  default          10.141.0.2      internalnet    [   UP   ]      

Above we listed all the devices with the etcd::host role and confirmed their status to be UP.

3.2. Stop all Etcd services

We can then also use cmsh to stop the Etcd services. Let’s first get the list of hostnames: bcm, node001 and node002 in this example.

root@bcm:~# cmsh
[bcm]% device
[bcm->device]% roleoverview -v | grep Etcd
Etcd::Host                  bcm             overlay:kube-default-etcd
Etcd::Host                  node001         overlay:kube-default-etcd
Etcd::Host                  node002         overlay:kube-default-etcd

We can stop the etcd service on those nodes using:

[bcm->device]% foreach -n bcm,node001,node002 (services; stop etcd)
Thu Oct 17 16:10:30 2024 [notice] bcm: Service etcd was stopped
Thu Oct 17 16:10:31 2024 [notice] node001: Service etcd was stopped
Thu Oct 17 16:10:32 2024 [notice] node002: Service etcd was stopped

This might take a few seconds. After that, we want to confirm that the services have indeed stopped:

[bcm->device]% foreach -n bcm,node001,node002 (services; status etcd)
Service      Status
------------ -----------
etcd         [STOPPED ]
Service      Status
------------ -----------
etcd         [STOPPED ]
Service      Status
------------ -----------
etcd         [STOPPED ]

Now we are safe to continue restoring the snapshot, without running processes corrupting the database.

3.3. Recreate the spool dir

We will use pdsh to reset the spool directory for all Etcd members at once.

# move the spool dir out of the way:
pdsh -w bcm,node001,node002 mv /var/lib/etcd /var/lib/etcd.old

# create new:
pdsh -w bcm,node001,node002 mkdir /var/lib/etcd

# fix permissions and ownership:
pdsh -w bcm,node001,node002 chmod 0700 /var/lib/etcd
pdsh -w bcm,node001,node002 chown etcd:etcd /var/lib/etcd

As can be seen, we use the same comma-separated list of nodes that we previously determined in Section 3.2.
These pdsh commands do not produce any output; to verify that they worked correctly, we can run the following and compare against the expected output:

root@bcm:~# pdsh -w bcm,node001,node002 'ls -al /var/lib/etcd'
node002: total 3
node002: drwx------ 3 etcd etcd 20 Oct 18 03:56 .
node002: drwxr-xr-x 62 root root 4096 Oct 18 03:59 ..
node001: total 3
node001: drwx------ 3 etcd etcd 20 Oct 18 03:57 .
node001: drwxr-xr-x 62 root root 4096 Oct 18 03:59 ..
bcm: total 3
bcm: drwx------ 3 etcd etcd 20 Oct 18 03:57 .
bcm: drwxr-xr-x 79 root root 4096 Oct 18 03:59 ..

The drwx------ (0700) permissions and the etcd:etcd ownership are important for /var/lib/etcd, and the directory is expected to be empty (no member subdirectory, which exists for a populated database).
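These checks can also be scripted rather than read from `ls` output. A sketch against a scratch directory (on the real nodes the path would be /var/lib/etcd, the expected owner etcd:etcd, and the commands would be run through pdsh):

```shell
# Recreate an empty spool directory with the required 0700 mode:
spool="$(mktemp -d)/etcd"
mkdir "$spool"
chmod 0700 "$spool"

# Verify: mode must be 700 and there must be no 'member' subdirectory yet.
stat -c '%a' "$spool"          # -> 700
[ ! -d "$spool/member" ] && echo "spool is empty, safe to restore into"
```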

3.4. Import the snapshot

Again, we will use pdsh to do it all at once. Please note that this is unfortunately a slightly tedious step: all nodes need the full list of Etcd members, as documented here: https://etcd.io/docs/v3.5/op-guide/recovery/.

Therefore, please edit the below pdsh command, replacing <etcd-node1-hostname>, <etcd-node1-ip>, <etcd-node2-hostname>, <etcd-node2-ip>, <etcd-node3-hostname> and <etcd-node3-ip>:

pdsh -w bcm,node001,node002 ". /etc/profile.d/modules.sh; module load etcd && etcdctl snapshot restore --data-dir=/var/lib/etcd --name \$(hostname) --initial-cluster <etcd-node1-hostname>=https://<etcd-node1-ip>:2380,<etcd-node2-hostname>=https://<etcd-node2-ip>:2380,<etcd-node3-hostname>=https://<etcd-node3-ip>:2380 --initial-advertise-peer-urls=https://\$(hostname -i):2380 /cm/shared/backup/etcd/etcd-node001-20241017163809"

For a single-member Etcd cluster:

pdsh -w node001 ". /etc/profile.d/modules.sh; module load etcd && etcdctl snapshot restore --data-dir=/var/lib/etcd --name \$(hostname) --initial-cluster <etcd-node1-hostname>=https://<etcd-node1-ip>:2380 --initial-advertise-peer-urls=https://\$(hostname -i):2380 /cm/shared/backup/etcd/etcd-node001-20241017163809"

For a five-member Etcd cluster:

pdsh -w node001,node002,node003,node004,node005 ". /etc/profile.d/modules.sh; module load etcd && etcdctl snapshot restore --data-dir=/var/lib/etcd --name \$(hostname) --initial-cluster <etcd-node1-hostname>=https://<etcd-node1-ip>:2380,<etcd-node2-hostname>=https://<etcd-node2-ip>:2380,<etcd-node3-hostname>=https://<etcd-node3-ip>:2380,<etcd-node4-hostname>=https://<etcd-node4-ip>:2380,<etcd-node5-hostname>=https://<etcd-node5-ip>:2380 --initial-advertise-peer-urls=https://\$(hostname -i):2380 /cm/shared/backup/etcd/etcd-node001-20241017163809"
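Since assembling the --initial-cluster string by hand is error-prone, it can also be generated from a list of “hostname ip” pairs. A sketch using this example cluster’s values (pure string assembly; no etcd commands involved):

```shell
# One "hostname ip" pair per line, taken from the example cluster:
members='bcm 10.141.255.254
node001 10.141.0.1
node002 10.141.0.2'

# Turn each pair into name=https://ip:2380 and join with commas:
initial_cluster=$(printf '%s\n' "$members" \
    | awk '{printf "%s=https://%s:2380\n", $1, $2}' \
    | paste -sd ",")
echo "$initial_cluster"
# -> bcm=https://10.141.255.254:2380,node001=https://10.141.0.1:2380,node002=https://10.141.0.2:2380
```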

To continue with the same example cluster that we’ve dealt with so far in this KB article, the command with replaced values looks like this:

pdsh -w bcm,node001,node002 ". /etc/profile.d/modules.sh; module load etcd && etcdctl snapshot restore --data-dir=/var/lib/etcd --name \$(hostname) --initial-cluster bcm=https://10.141.255.254:2380,node001=https://10.141.0.1:2380,node002=https://10.141.0.2:2380 --initial-advertise-peer-urls=https://\$(hostname -i):2380 /cm/shared/backup/etcd/etcd-node001-20241017163809"

Example output of the above command. The deprecation warning can be ignored for now.

bcm: Deprecated: Use `etcdutl snapshot restore` instead.
bcm:
bcm: 2024-10-30T11:51:42+01:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}
node002: Deprecated: Use `etcdutl snapshot restore` instead.
node002:
node002: 2024-10-30T11:51:42+01:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}
node001: Deprecated: Use `etcdutl snapshot restore` instead.
node001:
node001: 2024-10-30T11:51:42+01:00 info snapshot/v3_snapshot.go:265 restoring snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}
node002: 2024-10-30T11:51:43+01:00 info membership/store.go:141 Trimming membership information from the backend...
node001: 2024-10-30T11:51:43+01:00 info membership/store.go:141 Trimming membership information from the backend...
node002: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "39fd91fec477489", "added-peer-peer-urls": ["https://10.141.0.1:2380"]}
node002: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "10f8defdec135a31", "added-peer-peer-urls": ["https://10.141.255.254:2380"]}
node002: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "5a2d7ce8e12419a7", "added-peer-peer-urls": ["https://10.141.0.2:2380"]}
node001: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "39fd91fec477489", "added-peer-peer-urls": ["https://10.141.0.1:2380"]}
node001: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "10f8defdec135a31", "added-peer-peer-urls": ["https://10.141.255.254:2380"]}
node001: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "5a2d7ce8e12419a7", "added-peer-peer-urls": ["https://10.141.0.2:2380"]}
bcm: 2024-10-30T11:51:43+01:00 info membership/store.go:141 Trimming membership information from the backend...
node002: 2024-10-30T11:51:43+01:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}
bcm: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "39fd91fec477489", "added-peer-peer-urls": ["https://10.141.0.1:2380"]}
bcm: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "10f8defdec135a31", "added-peer-peer-urls": ["https://10.141.255.254:2380"]}
bcm: 2024-10-30T11:51:43+01:00 info membership/cluster.go:421 added member {"cluster-id": "601f2e306eb58e49", "local-member-id": "0", "added-peer-id": "5a2d7ce8e12419a7", "added-peer-peer-urls": ["https://10.141.0.2:2380"]}
node001: 2024-10-30T11:51:43+01:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}
bcm: 2024-10-30T11:51:43+01:00 info snapshot/v3_snapshot.go:293 restored snapshot {"path": "/cm/shared/backup/etcd/etcd-bcm-20241030112048", "wal-dir": "/var/lib/etcd/member/wal", "data-dir": "/var/lib/etcd", "snap-dir": "/var/lib/etcd/member/snap", "initial-memory-map-size": 0}

We need to fix the ownership once more, since the restore recreates subdirectories owned by root by default.

pdsh -w bcm,node001,node002  chown etcd:etcd -R /var/lib/etcd
3.5. Start the Etcd services

This will be different from Section 3.2: starting via cmsh is not going to work, since all etcd services need to start in parallel. We will therefore use pdsh instead of cmsh.

root@bcm:~# pdsh -w bcm,node001,node002 systemctl start etcd

If the restore command was executed correctly (no typos in hostnames and/or IP addresses), this start command should exit without hanging. Next, we can double-check that the services have indeed started:

root@bcm:~# pdsh -w bcm,node001,node002 systemctl status etcd | grep Active:
bcm: Active: active (running) since Thu 2024-10-31 09:12:52 CET; 1min 51s ago
node001: Active: active (running) since Thu 2024-10-31 09:12:52 CET; 1min 51s ago
node002: Active: active (running) since Thu 2024-10-31 09:12:52 CET; 1min 51s ago

Now Sections 2.1 and 2.2 can be repeated in order to query Etcd for its endpoint health status.

root@bcm:~# module load etcd/kube-default/3.5.15 
root@bcm:~# ETCDCTL_ENDPOINTS=$(etcdctl member list | awk -F ',' '{print $5}' | sed 's/\s//' | paste -sd ",")
root@bcm:~# etcdctl -w table endpoint health
+-----------------------------+--------+-------------+-------+
|          ENDPOINT           | HEALTH |    TOOK     | ERROR |
+-----------------------------+--------+-------------+-------+
| https://10.141.255.254:2379 |   true |  18.23326ms |       |
|     https://10.141.0.2:2379 |   true | 25.209949ms |       |
|     https://10.141.0.1:2379 |   true | 28.479844ms |       |
+-----------------------------+--------+-------------+-------+
root@bcm:~# etcdctl -w table endpoint status
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|          ENDPOINT           |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     https://10.141.0.1:2379 |  39fd91fec477489 |  3.5.15 |   34 MB |     false |      false |         2 |       1126 |               1126 |        |
| https://10.141.255.254:2379 | 10f8defdec135a31 |  3.5.15 |   34 MB |      true |      false |         2 |       1126 |               1126 |        |
|     https://10.141.0.2:2379 | 5a2d7ce8e12419a7 |  3.5.15 |   34 MB |     false |      false |         2 |       1126 |               1126 |        |
+-----------------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
root@bcm:~# etcdctl -w table member list
+------------------+---------+---------+-----------------------------+-----------------------------+------------+
|        ID        | STATUS  |  NAME   |         PEER ADDRS          |        CLIENT ADDRS         | IS LEARNER |
+------------------+---------+---------+-----------------------------+-----------------------------+------------+
|  39fd91fec477489 | started | node001 |     https://10.141.0.1:2380 |     https://10.141.0.1:2379 |      false |
| 10f8defdec135a31 | started |     bcm | https://10.141.255.254:2380 | https://10.141.255.254:2379 |      false |
| 5a2d7ce8e12419a7 | started | node002 |     https://10.141.0.2:2380 |     https://10.141.0.2:2379 |      false |
+------------------+---------+---------+-----------------------------+-----------------------------+------------+

Pay special attention to whether the members are considered part of the same cluster, and not each restored individually as a stand-alone member. In the latter case, the member list output would show only one member on each of the nodes. (The above output is correct.)

In case Etcd is reporting a healthy status, the next logical check is whether the Kubernetes API server works with our restored Etcd database. A quick sanity check is to see whether commands such as the following yield the expected output.

# kubectl get nodes -o wide
...
# kubectl get pod -A -o wide
...
Updated on October 31, 2024
