1. Prerequisites
- This article is written with Bright Cluster Manager 9.2 in mind, where Kubernetes is currently deployed with the default version 1.21.4 using containerd as its container runtime.
- The instructions are written with RHEL 8 and Ubuntu 20.04 in mind.
- These instructions have been run in dev environments a couple of times, and all known caveats should be covered by this KB article. We do, however, recommend making a backup of Etcd so that a rollback to an older version is possible. This backup can be made without interrupting the running cluster. Please follow the instructions at the following URL to create a snapshot of Etcd: https://kb.brightcomputing.com/knowledge-base/etcd-backup-and-restore-with-bright-9-0/
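As a rough illustration only (the linked KB article is the authoritative procedure for Bright clusters), an Etcd snapshot is typically taken with etcdctl along the following lines. The endpoint and certificate paths below are placeholders and must be adjusted to your environment:

# Hedged sketch: adjust the endpoint and certificate paths to your Etcd setup
ETCDCTL_API=3 etcdctl snapshot save /root/etcd-backup-$(date +%F).db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/path/to/etcd-ca.pem \
    --cert=/path/to/etcd-client.pem \
    --key=/path/to/etcd-client-key.pem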
2. Upgrade approach
- For the purposes of this KB article we will use the following example deployment of six nodes: three control-plane nodes (two of which are Head Nodes in an HA setup, which is not a requirement) and three worker nodes make up the Kubernetes cluster.
[root@ea-k8s-a ~]# module load kubernetes/default/1.21.4
[root@ea-k8s-a ~]# kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
ea-k8s-a   Ready    control-plane,master   37m   v1.21.4
ea-k8s-b   Ready    control-plane,master   36m   v1.21.4
node001    Ready    control-plane,master   37m   v1.21.4
node002    Ready    worker                 37m   v1.21.4
node003    Ready    worker                 37m   v1.21.4
node004    Ready    worker                 37m   v1.21.4
[root@ea-k8s-a ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.4", GitCommit:"3cce4a82b44f032d0cd1a1790e6d2f5a55d20aae", GitTreeState:"clean", BuildDate:"2021-08-11T18:16:05Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.4", GitCommit:"3cce4a82b44f032d0cd1a1790e6d2f5a55d20aae", GitTreeState:"clean", BuildDate:"2021-08-11T18:10:22Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
3. Prepare a configuration overlay for control-plane
We are upgrading from version 1.21 to 1.24, and new parameters have been added to Kubernetes in the meantime. If we upgrade the kube-apiserver without setting them, it will no longer start because of the missing parameters.
We will create a configuration overlay for future use, without any nodes, categories, or head nodes assigned to it.
[ea-k8s-a]% configurationoverlay
[ea-k8s-a->configurationoverlay]% clone kube-default-master kube-default-master-new
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]]% set priority 520
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]]% clear nodes
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]]% clear categories
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]]% roles
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*]% use kubernetes::apiserver
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::ApiServer*]]% append options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::ApiServer*]]% use kubernetes::controller
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::Controller*]]% set options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::Controller*]]% use kubernetes::node
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::Node*]]% set cnipluginbinariespath "/opt/cni/bin"
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::Node*]]% append options "--cgroup-driver=systemd"
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]->roles*[Kubernetes::Node*]]% commit
To make it easier to apply, here’s the sequence of cmsh commands used there:
configurationoverlay
clone kube-default-master kube-default-master-new
set priority 520
clear nodes
clear categories
roles
use kubernetes::apiserver
append options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
use kubernetes::controller
set options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
use kubernetes::node
set cnipluginbinariespath "/opt/cni/bin"
append options "--cgroup-driver=systemd"
commit
A deprecated option may still be set in the kubernetes::apiserver role. After the commit, run these commands to make sure:
use kubernetes::apiserver
get options
If --feature-gates=RunAsGroup=true is listed, it needs to be removed. Use these commands to get this done:
removefrom options "--feature-gates=RunAsGroup=true"
commit
4. Prepare software images
We will bump the kubernetes package for each software image that is relevant to the Kubernetes cluster. In this example scenario our three compute nodes are provisioned from /cm/images/default-image. We will use the cm-chroot-sw-img program to replace the kubernetes package.
[root@ea-k8s-a ~]# cm-chroot-sw-img /cm/images/default-image/   # go into chroot
$ apt install -y cm-kubernetes121- cm-kubernetes124             # for ubuntu
$ yum swap -y cm-kubernetes121 cm-kubernetes124                 # for RHEL
$ exit
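As an optional sanity check (not part of the original procedure), the package swap inside the image can be verified from outside the chroot:

# Hypothetical verification; pick the variant matching the image's distribution
chroot /cm/images/default-image rpm -q cm-kubernetes124       # RHEL
chroot /cm/images/default-image dpkg -l 'cm-kubernetes*'      # Ubuntu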
5. Image update one of the workers
We start with one worker to see if we can update one of the kubelets. This should give us some confidence before upgrading all of the kubelets. We do not start with the control plane (Kubernetes API server, etc.), since additional command-line flags have been added since Kubernetes version 1.21.
In our example node002 is a worker, and we will first drain the node. See https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/ for more details. This is not strictly necessary, but usually recommended.
[root@ea-k8s-a ~]# kubectl cordon node002 # disable scheduling
[root@ea-k8s-a ~]# kubectl drain node002 --ignore-daemonsets --delete-emptydir-data # optionally drain as well
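Optionally, before continuing, one can confirm that only DaemonSet-managed Pods are left on the node (an extra check, not required by the procedure):

kubectl get pods -A -o wide --field-selector spec.nodeName=node002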
The drain command will evict all Pods and prevent anything from being scheduled on the node. After the command finishes successfully we will issue an imageupdate on node002 via cmsh.
[root@ea-k8s-a ~]# cmsh
[ea-k8s-a]% device
[ea-k8s-a->device]% imageupdate -w node002
Wed Nov 23 15:09:02 2022 [notice] ea-k8s-a: Provisioning started: sending ea-k8s-a:/cm/images/default-image to node002:/, mode UPDATE, dry run = no
Wed Nov 23 15:09:56 2022 [notice] ea-k8s-a: Provisioning completed: sent ea-k8s-a:/cm/images/default-image to node002:/, mode UPDATE, dry run = no
imageupdate -w node002 [ COMPLETED ]
We will now restart cmd, kubelet and kube-proxy services on the node.
[root@ea-k8s-a ~]# pdsh -w node002 'systemctl daemon-reload; systemctl restart cmd; systemctl restart kubelet.service; systemctl restart kube-proxy.service'
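To confirm the services came back up before checking the version through the API server, a quick optional status check could look like this:

pdsh -w node002 'systemctl is-active kubelet kube-proxy'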
After a few moments, verify that the kubelet has been updated correctly.
[root@ea-k8s-a ~]# kubectl get nodes
NAME       STATUS                     ROLES                  AGE   VERSION
ea-k8s-a   Ready                      control-plane,master   66m   v1.21.4
ea-k8s-b   Ready                      control-plane,master   66m   v1.21.4
node001    Ready                      control-plane,master   66m   v1.21.4
node002    Ready,SchedulingDisabled   worker                 66m   v1.24.0
node003    Ready                      worker                 66m   v1.21.4
node004    Ready                      worker                 66m   v1.21.4
Notice how node002 has its version set to v1.24.0.
Now we can re-enable scheduling for the node.
[root@ea-k8s-a ~]# kubectl uncordon node002
node/node002 uncordoned
6. Image update the rest of the workers
This can be done similarly to step 5, one by one, or in batches. In the case of this KB article we’ll do the remaining compute nodes node00[3-4] in one go, without draining them first.
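If you do prefer to drain these workers first, the step 5 commands can simply be looped over the batch (an optional sketch; remember to uncordon the nodes again afterwards):

# Optional: drain the batch first, mirroring step 5
for n in node003 node004; do
    kubectl cordon "$n"
    kubectl drain "$n" --ignore-daemonsets --delete-emptydir-data
done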
- We issue an imageupdate, but for the whole category in cmsh:
device; imageupdate -c default -w
- We restart the services:
pdsh -w node00[3-4] 'systemctl daemon-reload; systemctl restart cmd; systemctl restart kubelet.service; systemctl restart kube-proxy.service'
- We confirm the version has updated.
[root@ea-k8s-a ~]# kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
ea-k8s-a   Ready    control-plane,master   76m   v1.21.4
ea-k8s-b   Ready    control-plane,master   75m   v1.21.4
node001    Ready    control-plane,master   76m   v1.21.4
node002    Ready    worker                 76m   v1.24.0
node003    Ready    worker                 76m   v1.24.0
node004    Ready    worker                 76m   v1.24.0
7. Update one of the control-plane nodes
We will pick node001 and add the node to the new overlay created in step 3. If your cluster does not have control-plane nodes running on compute nodes, see the next section on how to update the Head Nodes, and pick a Head Node that runs as a control-plane.
This node has not received an image update yet, because in our example it is in a separate category from the one used by the workers, so in this scenario we need to do that first:
[root@ea-k8s-update ~]# cmsh
[ea-k8s-update]% device
[ea-k8s-update->device]% imageupdate -w node001
Wed Nov 23 15:28:44 2022 [notice] ea-k8s-update: Provisioning started: sending ea-k8s-update:/cm/images/default-image to node001:/, mode UPDATE, dry run = no
Wed Nov 23 15:29:32 2022 [notice] ea-k8s-update: Provisioning completed: sent ea-k8s-update:/cm/images/default-image to node001:/, mode UPDATE, dry run = no
imageupdate -w node001 [ COMPLETED ]
Now we proceed with setting up the configuration overlay:
[ea-k8s-a]% configurationoverlay
[ea-k8s-a->configurationoverlay]% use kube-default-master-new
[ea-k8s-a->configurationoverlay[kube-default-master-new]]% append nodes node001
[ea-k8s-a->configurationoverlay*[kube-default-master-new*]]% commit
We expect the Kube API server to be restarted automatically; however, we also want to restart the scheduler and controller-manager.
pdsh -w node001 "systemctl daemon-reload; systemctl restart kube-scheduler; systemctl restart kube-controller-manager"
In this case we can try to exercise the API server on the node via curl:
[root@ea-k8s-a ~]# curl -k https://node001:6443; echo
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "Unauthorized",
  "reason": "Unauthorized",
  "code": 401
}
The authorization error is expected here and is not important for now, but we mention it for completeness: one way to do an authenticated request would be to use a token (which we embed in the kubeconfig by default for the root user):
[root@ea-k8s-a ~]# grep token .kube/config-default
    token: 'SOME_LONG_STRING'
[root@ea-k8s-a ~]# export TOKEN=SOME_LONG_STRING
[root@ea-k8s-a ~]# curl -s https://node001:6443/openapi/v2 --header "Authorization: Bearer $TOKEN" --cacert /cm/local/apps/kubernetes/var/etc/kubeca-default.pem | less
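Another optional check (not part of the original procedure, and assuming the API server certificate accepts node001 as a server name) is to point kubectl at node001 explicitly and confirm that it now reports the 1.24 server version:

kubectl --kubeconfig ~/.kube/config-default --server https://node001:6443 version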
8. Updating Head Nodes
First we need to execute step 4 on the Head Nodes. In case there are two, execute the following on both.
[root@ea-k8s-a ~]# apt install -y cm-kubernetes121- cm-kubernetes124   # for ubuntu
[root@ea-k8s-a ~]# yum swap -y cm-kubernetes121 cm-kubernetes124       # for RHEL
We can do kubelet + kube-proxy first as before, or we can do all services at once. Sections 5 and 7 can be referenced for the detailed steps. The imageupdate steps can be omitted, since those are only relevant for Compute Nodes.
We will update the worker services on the active Head Node first, and verify that the version has updated.
First, we add the active Head Node into the overlay that was created for master nodes:
root@ea-k8s-ubuntu-a:~# cmsh
[ea-k8s-ubuntu-a]% configurationoverlay
[ea-k8s-ubuntu-a->configurationoverlay]% use kube-default-master-new
[ea-k8s-ubuntu-a->configurationoverlay[kube-default-master-new]]% append nodes master
[ea-k8s-ubuntu-a->configurationoverlay*[kube-default-master-new*]]% commit
Then we can restart the Kubernetes services:
[root@ea-k8s-a ~]# systemctl daemon-reload; systemctl restart kubelet; systemctl restart kube-proxy;
[root@ea-k8s-a ~]# kubectl get nodes
NAME       STATUS                     ROLES                  AGE   VERSION
ea-k8s-a   Ready,SchedulingDisabled   control-plane,master   47m   v1.24.0
ea-k8s-b   Ready,SchedulingDisabled   control-plane,master   47m   v1.21.4
node001    Ready                      control-plane,master   47m   v1.24.0
node002    Ready                      worker                 47m   v1.24.0
node003    Ready                      worker                 47m   v1.24.0
node004    Ready                      worker                 47m   v1.24.0
And now we restart the Scheduler and Controller-Manager.
[root@ea-k8s-a ~]# systemctl daemon-reload; systemctl restart kube-scheduler; systemctl restart kube-controller-manager
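Alternatively, all Kubernetes services on the Head Node can be restarted in one go with a wildcard, as is also done in the rollback section later in this article:

systemctl daemon-reload; systemctl restart '*kube*.service'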
Finally, we will repeat for the secondary Head Node. And after that, the cluster should be fully updated.
9. Updating Addons
Issuing the following command updates the addons. The output of the command has been omitted to avoid cluttering this KB article, but backups of the original YAML files are made to the following directory: /cm/local/apps/kubernetes/var/. This information is also printed as part of the output.
cm-kubernetes-setup -v --update-addons
The update script will have backed up the old configuration inside NVIDIA Bright Cluster Manager as well:
[ea-k8s-a]% kubernetes
[ea-k8s-a->kubernetes[default]]% appgroups
[ea-k8s-a->kubernetes[default]->appgroups]% list
Name (key)                       Applications
-------------------------------- ------------------------------
system                           <13 in submode>
system-backup-2022-11-23-193952  <13 in submode>
Restart ingress-nginx
Due to the differences that exist in the configuration of ingress-nginx between 1.21 and 1.24, its jobs have to be deleted so that cmd can restart them with the proper configuration for 1.24:
[root@ea-k8s-a ~]# kubectl delete job -n ingress-nginx --all
After a few minutes, the jobs should be visible again:
[root@ea-k8s-a ~]# kubectl get jobs -A
NAMESPACE       NAME                             COMPLETIONS   DURATION   AGE
ingress-nginx   ingress-nginx-admission-create   1/1           7s         7m17s
ingress-nginx   ingress-nginx-admission-patch    1/1           7s         7m17s
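Optionally, also confirm that the ingress-nginx Pods themselves are back up and running (a quick additional check, output omitted here):

kubectl get pods -n ingress-nginx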
10. Finalize the update.
Kubernetes should be ready at this point. We can get rid of the old module file and make one final change to the configuration overlays.
[root@ea-k8s-a ~]# pdsh -A rm -rf /cm/local/modulefiles/kubernetes/default/1.21.4
[ea-k8s-a]% configurationoverlay
[ea-k8s-a->configurationoverlay]% remove kube-default-master
[ea-k8s-a->configurationoverlay*]% commit
Successfully removed 1 ConfigurationOverlays
Successfully committed 0 ConfigurationOverlays
[ea-k8s-a->configurationoverlay]% set kube-default-master-new priority 510
[ea-k8s-a->configurationoverlay]% set kube-default-master-new name kube-default-master
[ea-k8s-a->configurationoverlay*]% commit
Successfully committed 1 ConfigurationOverlays
[ea-k8s-a->configurationoverlay]% kubernetes
[ea-k8s-a->kubernetes[default]]% labelsets
[ea-k8s-a->kubernetes[default]->labelsets]% show
[ea-k8s-a->kubernetes[default]->labelsets]% use master
[ea-k8s-a->kubernetes[default]->labelsets[master]]% append overlays kube-default-master
[ea-k8s-a->kubernetes*[default*]->labelsets*[master*]]% commit
11. Roll back the update.
In order to go back to the previous version 1.21, we have to follow the reverse of steps 1-10.
Downgrade the addons
This is only needed if Step 9 was executed.
[root@ea-k8s-a ~]# cmsh
[ea-k8s-a]% kubernetes
[ea-k8s-a->kubernetes[default]]% appgroups
[ea-k8s-a->kubernetes[default]->appgroups]% list
Name (key)                       Applications
-------------------------------- ------------------------------
system                           <13 in submode>
system-backup-2022-11-24-112008  <13 in submode>
[ea-k8s-a->kubernetes[default]->appgroups]% set system enabled no
[ea-k8s-a->kubernetes*[default*]->appgroups*]% set system-backup-2022-11-24-112008 enabled yes
[ea-k8s-a->kubernetes*[default*]->appgroups*]% commit
This should keep Kubernetes busy for a minute. After it is done restoring all the resources, do the following steps:
Downgrading the packages
We need to downgrade the newly installed cm-kubernetes124 package everywhere back to cm-kubernetes121. This means that the following command needs to be executed on both Head Nodes and in the relevant software images.
apt install -y cm-kubernetes124- cm-kubernetes121 # for ubuntu
yum swap -y cm-kubernetes124 cm-kubernetes121 # for RHEL
Image update relevant nodes
We need to image update the relevant nodes next, in order for all Kubernetes nodes to have the Kubernetes 1.21 binaries again (e.g. imageupdate -c default -w in cmsh).
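For example, mirroring the earlier imageupdate invocations:

[root@ea-k8s-a ~]# cmsh
[ea-k8s-a]% device
[ea-k8s-a->device]% imageupdate -c default -w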
Restore the configuration overlay
Depending on whether Step 10 was executed, and whether the kube-default-master-new overlay was already removed, the rollback can be different. In case kube-default-master-new still exists, we can remove + commit it. The lower-priority original kube-default-master overlay should take over the configuration.
[root@ea-k8s-a ~]# cmsh
[ea-k8s-a]% configurationoverlay
[ea-k8s-a->configurationoverlay]% remove kube-default-master-new
[ea-k8s-a->configurationoverlay*]% commit
In the second case, where kube-default-master was updated in Step 10, we have to remove the extra parameters from the roles and restore the original settings, as follows.
[ea-k8s-a]% configurationoverlay
[ea-k8s-a->configurationoverlay]% use kube-default-master
[ea-k8s-a->configurationoverlay[kube-default-master]]% roles
[ea-k8s-a->configurationoverlay[kube-default-master]->roles]% use kubernetes::apiserver
[ea-k8s-a->configurationoverlay[kube-default-master]->roles[Kubernetes::ApiServer]]% removefrom options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::ApiServer*]]% use kubernetes::controller
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Controller]]% removefrom options "--feature-gates=LegacyServiceAccountTokenNoAutoGeneration=false"
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Controller*]]% use kubernetes::node
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Node]]% set cnipluginbinariespath /cm/local/apps/kubernetes/current/bin/cni
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Node*]]% removefrom options "--cgroup-driver=systemd"
[ea-k8s-a->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Node*]]% commit
In both cases the Kube API servers may be restarted and can produce errors until we complete the next step.
Restart services
On all the nodes relevant to the Kubernetes cluster, we need to execute the following reload and restarts; in our example setup, as follows. Please note that this includes a restart of the Bright Cluster Manager daemon (cmd).
[root@ea-k8s-a ~]# pdsh -w ea-k8s-a,ea-k8s-b,node00[1-4] "systemctl daemon-reload; systemctl restart cmd; systemctl restart '*kube*.service'"
We can clean up the module file for version 1.24 to prevent it from popping up in tab-completion.
[root@ea-k8s-a ~]# pdsh -A rm -rf /cm/local/modulefiles/kubernetes/default/1.24.0
All versions should be back to 1.21.4:
[root@ea-k8s-a ~]# kubectl get nodes
NAME       STATUS   ROLES                  AGE   VERSION
ea-k8s-a   Ready    control-plane,master   22h   v1.21.4
ea-k8s-b   Ready    control-plane,master   22h   v1.21.4
node001    Ready    control-plane,master   22h   v1.21.4
node002    Ready    worker                 22h   v1.21.4
node003    Ready    worker                 22h   v1.21.4
node004    Ready    worker                 22h   v1.21.4
Hopefully resources inside Kubernetes are also running in good health and without issues.
It is very unlikely with this downgrade from 1.24 back to 1.21; however, should something get into an invalid, unrecoverable state, we can restore the Etcd database at this point with the snapshot created in Step 1. The instructions for this are explained in the same KB article referenced in Step 1.
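Before attempting a restore, the snapshot file created in Step 1 can be sanity-checked first (a hedged example with a placeholder path; the full restore procedure is documented in the linked KB article):

# Placeholder path: point this at the snapshot file created in Step 1
ETCDCTL_API=3 etcdctl snapshot status /path/to/etcd-snapshot.db --write-out=table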