1. Prerequisites
- This article was written with Bright Cluster Manager 9.0 in mind, but the same steps should also work for newer versions (at least 9.1 and 9.2).
- We assume shared storage is available as a mount on /cm/shared; we will create a target directory there for our Etcd backups.
- The restore part of this KB article should only be followed if your entire Etcd cluster has to be recreated from the backup. If you run a multi-node Etcd cluster, broken members can be replaced or fixed by synchronizing from the remaining working Etcd members. This is often a better approach than restoring from snapshots, and is described here: https://kb.brightcomputing.com/knowledge-base/etcd-membership-reconfiguration-in-bright-9-0/. The snapshots are still a good backup to have.
- Please note that there is a subtle difference between Etcd versions 3.4.x and 3.5.x. We used to set /var/lib/etcd with 0755 permissions, but this has been changed to 0700. (Setting it to 0755 will result in Etcd not starting.)
- In the same vein, etcdctl has been replaced by etcdutl for certain operations. This KB article still uses etcdctl, and should still work with all versions in BCM 9.0, 9.1 and 9.2.
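If you are not sure which Etcd version your deployment runs (and therefore which permissions apply), a quick check along the following lines should work; the exact module file name depends on your Kubernetes cluster name and installed Etcd version:
# module load etcd/kube-default/<your_version>
# etcdctl version    # prints the etcdctl and API versions, e.g. "etcdctl version: 3.5.x"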
2. Etcd installations
- Bright Kubernetes setups always require an odd number of Etcd nodes.
- Three Etcd nodes are recommended, but single-node Etcd deployments are also possible.
- Etcd nodes are marked as datanodes. This prevents Full Provisioning from unintentionally wiping the Etcd database.
- Etcd stores its data in /var/lib/etcd by default; this directory is called the spool directory.
The spool directory can be changed. It can be found in the Etcd::Host role via cmsh:
[cluster->configurationoverlay[kube-default-etcd]->roles[Etcd::Host]]% get spool
/var/lib/etcd
In case it's not /var/lib/etcd, please substitute the correct path in the rest of this article.
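The nodes that currently carry the Etcd::Host role, and the spool directory they use, can also be queried non-interactively. This is only a convenience sketch; it assumes the default overlay name kube-default-etcd:
# cmsh -c "configurationoverlay use kube-default-etcd; get nodes"
# cmsh -c "configurationoverlay use kube-default-etcd; roles; use etcd::host; get spool"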
3. Check the cluster health
First, log in to one of the Etcd cluster nodes (any of the devices with the Etcd::Host role).
Then load the module file and check the health:
# module load etcd/kube-default/<version>
# etcdctl endpoint health
https://10.141.0.1:2379 is healthy: successfully committed proposal: took = 16.551531ms
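In a multi-node Etcd cluster it can be worth checking all members at once; --cluster and the table output writer are standard etcdctl options, but verify them against your etcdctl version:
# etcdctl endpoint health --cluster            # health of every member
# etcdctl endpoint status --cluster -w table   # leader, DB size and raft index per member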
4. Prepare a location for the backup
# mkdir -p /cm/shared/backup/etcd
5. Create the snapshot
# etcdctl snapshot save /cm/shared/backup/etcd/etcd-$(hostname)-$(date +"%Y%m%d%H%M%S")
{"level":"info","ts":1647867412.727886,"caller":"snapshot/v3_snapshot.go:110","msg":"created temporary db file","path":"/cm/shared/backup/etcd/etcd-node001-20220321135652.part"}
{"level":"info","ts":1647867412.7418113,"caller":"snapshot/v3_snapshot.go:121","msg":"fetching snapshot","endpoint":"https://10.141.0.1:2379"}
{"level":"info","ts":1647867412.841619,"caller":"snapshot/v3_snapshot.go:134","msg":"fetched snapshot","endpoint":"https://10.141.0.1:2379","took":0.110848759}
{"level":"info","ts":1647867412.849341,"caller":"snapshot/v3_snapshot.go:143","msg":"saved","path":"/cm/shared/backup/etcd/etcd-node001-20220321135652"}
Snapshot saved at /cm/shared/backup/etcd/etcd-node001-20220321135652
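Optionally, the integrity of the freshly written snapshot file can be verified right away (on Etcd 3.5.x this subcommand is deprecated in favour of etcdutl, but it is still available):
# etcdctl snapshot status /cm/shared/backup/etcd/etcd-node001-20220321135652 -w table   # shows hash, revision, total keys and size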
6. Restore the snapshot
This section assumes that there is an actual need to restore a snapshot, for example hardware failure of all Etcd members. If some members are still up and running, then replacing, adding or removing members while the cluster remains operational is possibly a better solution.
6.1. Important: First stop Etcd service(s)
Even if the nodes are SHUTOFF, we ensure that Bright Cluster Manager won’t try to start Etcd before the backup has been restored.
If an Etcd node comes up with a new HDD, and this disk is provisioned from scratch, Etcd's spool directory (/var/lib/etcd) will be empty, and Etcd will come up as new, with nothing in its database.
This is a problem: once the connection with Etcd is established again, Kubernetes will consider the empty database the new desired state, and start terminating all containers that do not match this desired state.
Once we have restored the backup, Kubernetes will do its best to make the actual state match the desired state again, but running jobs and so on will already have been interrupted. This is why we need to ensure the Etcd services do not come up before the backup has been restored.
Kube API servers
Since we're about to unassign the Etcd::Host roles from some nodes, and with that stop the etcd services, the API servers will be restarted with an empty list of servers for the --etcd-servers=https://10.141.0.1:2379 parameter. This will result in them failing to restart.
During this period kubectl cannot be used to query Kubernetes resources, but the containerized services running inside Kubernetes should continue to run where possible. When Pods crash, Kubernetes won't be able to issue reschedules. This will all start working again once the Etcd backup has been restored and we re-assign the Etcd::Host roles.
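If you want to confirm that state, a simple check with a short timeout can be used; the exact error message will vary, but the request is expected to fail while Etcd is unavailable:
# kubectl --request-timeout=5s get nodes   # expected to fail (timeout / connection error) while Etcd is down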
How to Stop the Etcd service(s)
- Let's launch cmsh.
- Remove the node(s) from the Configuration Overlay:
[cluster]% configurationoverlay
[cluster->configurationoverlay]% use kube-default-etcd
[cluster->configurationoverlay[kube-default-etcd]]% removefrom nodes node001
[cluster->configurationoverlay*[kube-default-etcd*]]% commit
Tue Mar 22 07:14:05 2022 [notice] node001: Service etcd was stopped
If the node was already SHUTOFF, that means Kubernetes is already incapable of updating any changes to its state. If the node is still UP, it will become so after the last etcd service is stopped.
To re-emphasize: if this is a three-node Etcd cluster, do this for all the Etcd nodes, not just one, and not one-by-one. Etcd needs to be "collectively" not running before we restore the snapshot.
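For a three-node setup this could look roughly as follows (the node names are just an example; if your cmsh version does not accept multiple names in one removefrom command, run it once per node):
[cluster->configurationoverlay]% use kube-default-etcd
[cluster->configurationoverlay[kube-default-etcd]]% removefrom nodes node001 node002 node003
[cluster->configurationoverlay*[kube-default-etcd*]]% commit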
Three extra steps if the node is still UP and we're dealing with a Compute Node.
- Log in to the Etcd node.
- Stop the etcd service: systemctl stop etcd (optionally check first with systemctl status etcd whether stopping it is necessary).
- Move the spool dir out of the way: mv /var/lib/etcd /var/lib/etcd.old
If you do not plan to do a full reprovisioning of the Compute Node, recreate the directory:
For Etcd version 3.4.x: mkdir /var/lib/etcd; chmod 0755 /var/lib/etcd
For Etcd version 3.5.x: mkdir /var/lib/etcd; chmod 0700 /var/lib/etcd
Now it’s safe to power on the node and do a FULL provisioning. (See next section)
The following steps apply if we're dealing with a Head Node:
- Stop the etcd service: systemctl stop etcd (optionally check first with systemctl status etcd whether stopping it is necessary).
- Move the spool dir out of the way: mv /var/lib/etcd /var/lib/etcd.old
- Recreate the directory.
For Etcd version 3.4.x: mkdir /var/lib/etcd; chmod 0755 /var/lib/etcd
For Etcd version 3.5.x: mkdir /var/lib/etcd; chmod 0700 /var/lib/etcd
6.2. FULL provisioning
Please keep in mind that this step wipes the entire disk for the node, and reprovisions it from scratch.
In this example, we had an HDD failure in node001, so we already lost our Etcd data in /var/lib/etcd. Let's say this was a single-node Etcd cluster, and we want to do a FULL provisioning, and then try to recover Etcd with the snapshot we made in Step 5 once this is completed.
# make sure we allow for FULL provisioning
[cluster->device[node001]]% set datanode no
[cluster->device*[node001*]]% commit
# configure our next provisioning to be FULL
[cluster->device[node001]]% set nextinstallmode full
[cluster->device*[node001*]]% commit
[cluster->device[node001]]% reboot   # or start if powered off
node001: Reboot in progress ...
If we were dealing with a three-node Etcd cluster, we'd have to do this for all three nodes.
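The provisioning progress can be followed from cmsh, for example with a quick status query (a convenience check, not a required step):
# cmsh -c "device; status node001"   # reports the current state of node001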
6.3. Restore the snapshot
Once the node is back up, log in to the node, and restore the backup with the following commands.
# module load etcd/kube-default/<your_version>
# etcdctl snapshot restore --data-dir=/var/lib/etcd /cm/shared/backup/etcd/etcd-node001-20220321135652
{"level":"info","ts":1647875570.8151839,"caller":"snapshot/v3_snapshot.go:287","msg":"restoring snapshot","path":"/cm/shared/backup/etcd/etcd-node001-20220321135652","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/lib/etcd/member/snap"}
{"level":"info","ts":1647875570.862523,"caller":"mvcc/kvstore.go:378","msg":"restored last compact revision","meta-bucket-name":"meta","meta-bucket-name-key":"finishedCompactRev","restored-compact-revision":10502}
{"level":"info","ts":1647875570.8769147,"caller":"membership/cluster.go:392","msg":"added member","cluster-id":"cdf818194e3a8c32","local-member-id":"0","added-peer-id":"8e9e05c52164694d","added-peer-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":1647875570.9070866,"caller":"snapshot/v3_snapshot.go:300","msg":"restored snapshot","path":"/cm/shared/backup/etcd/etcd-node001-20220321135652","wal-dir":"/var/lib/etcd/member/wal","data-dir":"/var/lib/etcd","snap-dir":"/var/lib/etcd/member/snap"}
# chmod 755 /var/lib/etcd   # use chmod 700 for 3.5.x Etcd!
# chown etcd:etcd -R /var/lib/etcd
Please don't forget the last two commands (chmod and chown), or Etcd won't be able to start due to permission issues.
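As a quick sanity check of the restore and of the permissions (the expected mode is 755 for Etcd 3.4.x and 700 for 3.5.x):
# stat -c '%a %U:%G' /var/lib/etcd   # expect "755 etcd:etcd" (or "700 etcd:etcd" on Etcd 3.5.x)
# ls /var/lib/etcd/member            # expect the restored "snap" and "wal" directories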
If we were dealing with a three-node Etcd cluster, we'd have to do this for all three nodes.
6.4. Start the Etcd Service(s)
Something to keep in mind before continuing: any changes that were made after the snapshot was taken will not match the (restored) desired state. Those containers might still be running, since the connection with Etcd was lost. Once the connection is re-established, Kubernetes will fix the actual state by terminating them. This is not a problem if the snapshot is recent enough.
Having said that, re-assigning the Etcd::Host role to the node(s) can be done by adding them back to the Configuration Overlay:
[cluster->configurationoverlay[kube-default-etcd]]% append nodes node001
[cluster->configurationoverlay*[kube-default-etcd*]]% commit
Bright Cluster Manager will start the Etcd service after a short delay.
Now Kubernetes should be able to reconnect to Etcd.
If we were dealing with a three-node Etcd cluster, we'd have to do this for all three nodes.
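Once the service is running again, the same checks as in Step 3 can be repeated, and kubectl should respond again:
# module load etcd/kube-default/<version>
# etcdctl endpoint health
# kubectl get nodes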
7. Final notes
Please ensure that each Etcd::Host node has the datanode property set to "yes".
[cluster->device[node001]]% set datanode yes
[cluster->device*[node001*]]% commit
The Kubernetes API servers should start automatically. If not, you can start them manually from cmsh.
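For reference, starting or restarting an API server from cmsh could look roughly like the following sketch; the service name shown (kube-apiserver) is an assumption and may differ per BCM version, so list the services on the node first:
[cluster]% device use node001
[cluster->device[node001]]% services
[cluster->device[node001]->services]% list                   # find the exact API server service name
[cluster->device[node001]->services]% restart kube-apiserver # assumed service name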