Changing Kubernetes Cluster CIDR with Calico or Flannel in Bright 8.2, 9.0, 9.1, 9.2.

1. Prerequisites
  • This article was written with Bright Cluster Manager 9.1 in mind, but the procedure also applies to versions 8.2, 9.0, and 9.2.
2. Kubernetes networks

During Bright Cluster Manager’s Kubernetes setup wizard, the administrator is asked to define two CIDRs: one for the Kube Pod Network and one for the Kube Service Network.

In cmsh the references to these networks can be found in the Kubernetes submode (some output omitted for brevity):

[root@headnode ~]# cmsh
[headnode]% kubernetes 
[headnode->kubernetes[default]]% show
Parameter                                   Value                                                                                              
------------------------------------------- ---------------------------------------------------------------------------------------------------
...
Service Network                             kube-default-service                                                                               
Pod Network                                 kube-default-pod                                                                                   
Pod Network Node Mask                                                                                                                          
Internal Network                            internalnet                                                                                        
KubeDNS IP                                  10.150.255.254                                                                                     
...

The networks themselves can be found and configured in the network submode:

[headnode]% network list
Name (key)            Type           Netmask bits   Base address     Domain name            IPv6
--------------------- -------------- -------------- ---------------- ---------------------- ----
externalnet           External       24             192.168.200.0    openstacklocal         no  
globalnet             Global         0              0.0.0.0          cm.cluster                 
internalnet           Internal       16             10.141.0.0       eth.cluster                
kube-default-pod      Internal       16             172.28.0.0       pod.cluster.local          
kube-default-service  Internal       16             10.150.0.0       service.cluster.local    
3. Kubernetes configuration

Three relevant parameters for the Kube Controller Manager are populated with the aforementioned ranges:

  • --cluster-cidr=172.28.0.0/16
  • --service-cluster-ip-range=10.150.0.0/16
  • --allocate-node-cidrs (required for --cluster-cidr to take effect)

Other relevant parameters that we don’t explicitly set or change are:

  • --node-cidr-mask-size
    • --node-cidr-mask-size-ipv4 (default 24)
    • --node-cidr-mask-size-ipv6 (default 64)

The cluster CIDR is a /16 by default and the node CIDR mask is /24 by default, which means at most 256 nodes can each be allocated a /24 pod CIDR. The number of nodes can be increased either by enlarging the cluster CIDR (a lower mask value, e.g. /15) or by making the per-node CIDR smaller (a higher node-CIDR mask value, e.g. /27), at the cost of fewer pod IPs per node.
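
As a quick sanity check, the node count follows from simple arithmetic. The shell snippet below is only illustrative, using the default values discussed above:

# Maximum number of node CIDRs = 2^(node mask - cluster mask)
CLUSTER_MASK=16   # prefix length of --cluster-cidr
NODE_MASK=24      # --node-cidr-mask-size-ipv4
echo $(( 2 ** (NODE_MASK - CLUSTER_MASK) ))   # 256 nodes, each with a /24
# With --node-cidr-mask-size-ipv4 27 instead: 2^(27-16) = 2048 nodes,
# but each node only gets a /27 (32 addresses) for its pods.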

3.1. Changing the node-CIDR mask

Since this mask is not managed by Bright Cluster Manager by default, it can be added via cmsh inside the Kubernetes::Controller role as a generic additional option:

[root@headnode ~]# cmsh
[headnode]% configurationoverlay 
[headnode->configurationoverlay]% use kube-default-master 
[headnode->configurationoverlay[kube-default-master]]% roles
[headnode->configurationoverlay[kube-default-master]->roles]% use kubernetes::controller 
[headnode->configurationoverlay[kube-default-master]->roles[Kubernetes::Controller]]% append options "--node-cidr-mask-size-ipv4 27"
[headnode->configurationoverlay*[kube-default-master*]->roles*[Kubernetes::Controller*]]% commit
3.2. Controller Manager configuration file

With Bright Cluster Manager, the CIDR configuration passed to the Kubernetes Controller Manager can be found in the following parameters file:

/cm/local/apps/kubernetes/var/etc/controller-manager
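
To verify which values ended up in this file, the CIDR-related flags can be grepped for (an illustrative check; the exact formatting of the file may differ slightly per version):

grep -E 'cluster-cidr|service-cluster-ip-range|allocate-node-cidrs|node-cidr-mask-size' \
    /cm/local/apps/kubernetes/var/etc/controller-manager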

3.3. Kube Proxy configuration file

The second component that receives the cluster CIDR is kube-proxy, via the following configuration file:

/cm/local/apps/kubernetes/var/etc/proxy.yaml

The relevant configuration in that YAML file is:

clusterCIDR: 172.28.0.0/16
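
A quick way to confirm that kube-proxy was given the same cluster CIDR (purely a verification step, not required by the procedure):

grep clusterCIDR /cm/local/apps/kubernetes/var/etc/proxy.yaml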

4. Changing the Kube Pod Network in cmsh

In this example we’ll change a previously chosen /22 CIDR to a /16.

We have a seven-node Kubernetes cluster, but only four of the nodes have a pod CIDR allocated, as can be seen with the following kubectl command:

[root@localhost ~]# module load kubernetes/default/1.18.15 
[root@localhost ~]# kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR} ' | sed 's/ /\n/g'
172.29.3.0/24
172.29.0.0/24
172.29.2.0/24
172.29.1.0/24

Since a /22 can be divided into exactly four /24 subnets, this is expected. We want to expand the cluster CIDR to a /16 so that every node can get a /24 podCIDR.
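
To see exactly which nodes are still missing a podCIDR, node names and CIDRs can be listed side by side (a convenience command, not part of the original procedure):

# Prints "<node name><TAB><podCIDR>"; nodes without a CIDR show an empty second column
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'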

[root@localhost ~]# cmsh
[localhost]% network 
[localhost->network]% use kube-default-pod 
[localhost->network[kube-default-pod]]% show
Parameter                        Value                                           
-------------------------------- ------------------------------------------------
...
Base address                     172.28.0.0                                      
Broadcast address                172.29.255.255                                  
Dynamic range start              0.0.0.0                                         
Dynamic range end                0.0.0.0                                         
Netmask bits                     22                                              
Gateway                          0.0.0.0                                         
...
[localhost->network[kube-default-pod]]% set netmaskbits 16
[localhost->network*[kube-default-pod*]]% commit

In a recent enough version of Bright Cluster Manager, the above will cause the Kubernetes configuration to be updated and the Kubernetes services to be restarted automatically. These versions are:

  • >= 8.2-28
  • >= 9.0-19
  • >= 9.1-13
  • >= 9.2-3

For older versions, changes made in the network submode do not automatically propagate to the Kubernetes configuration files (this has been fixed in the versions listed above). As a workaround, we briefly modify the priority of the Kubernetes master configuration overlay to make Bright Cluster Manager write out the new configuration. Execute the following in cmsh:

[localhost->network[kube-default-pod]]% configurationoverlay 
[localhost->configurationoverlay]% use kube-default-master 
[localhost->configurationoverlay[kube-default-master]]% get priority 
510
[localhost->configurationoverlay[kube-default-master]]% set priority 511
[localhost->configurationoverlay*[kube-default-master*]]% commit
[localhost->configurationoverlay[kube-default-master]]% 
Tue Apr 19 21:03:38 2022 [notice] node004: Service kube-proxy was restarted
Tue Apr 19 21:03:38 2022 [notice] node002: Service kube-controller-manager was restarted
Tue Apr 19 21:03:38 2022 [notice] node001: Service kube-proxy was restarted
Tue Apr 19 21:03:38 2022 [notice] node002: Service kube-proxy was restarted
Tue Apr 19 21:03:38 2022 [notice] node001: Service kube-controller-manager was restarted
Tue Apr 19 21:03:38 2022 [notice] node005: Service kube-proxy was restarted
Tue Apr 19 21:03:38 2022 [notice] node006: Service kube-proxy was restarted
Tue Apr 19 21:03:38 2022 [notice] node003: Service kube-proxy was restarted

After the restarts the priority can be restored to its original value.

[localhost->configurationoverlay[kube-default-master]]% set priority 510
[localhost->configurationoverlay*[kube-default-master*]]% commit

This won’t result in another restart, since we didn’t change any configuration (such as the CIDR) after the previous restart.

5. Check the node CIDRs

After the kube service restarts from the previous section, we should already see more nodes getting a podCIDR; in this case all seven now have one:

[root@localhost ~]# module load kubernetes/default/1.18.15 
[root@localhost ~]# kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR} ' | sed 's/ /\n/g'
172.29.3.0/24
172.29.4.0/24
172.29.5.0/24
172.29.6.0/24
172.29.0.0/24
172.29.2.0/24
172.29.1.0/24
6. Updating Calico Configuration

If Calico is used, we can use calicoctl to list the default IP pool that was created when Calico was first initialized. If this pool was created with the previous CIDR (which is likely), it will show up in the output below:

[root@localhost ~]# calicoctl get pool -o wide
NAME                  CIDR            NAT    IPIPMODE   VXLANMODE   DISABLED   SELECTOR   
default-ipv4-ippool   172.29.0.0/22   true   Always     Never       false      all()      

In this case we are expanding the CIDR from a /22 to a /16, but the same approach can be used if, for example, 172.29.0.0/16 had been changed to 172.30.0.0/16.

[root@localhost ~]# calicoctl get pool -o yaml > pool.yaml
[root@localhost ~]# vim pool.yaml  # change the CIDR
[root@localhost ~]# calicoctl delete -f pool.yaml && calicoctl apply -f pool.yaml

Calico should start handing out addresses from the /16 range for new pods right away. Existing pods need to be recreated in order to get an IP from the modified pool.
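
One way to recreate only the affected pods is sketched below; it assumes the old pool was 172.29.0.0/22 and that the pod IP is the seventh column of kubectl's wide output (as with kubectl 1.18). Note that pods not managed by a controller will not come back by themselves:

# Delete every pod (in all namespaces) whose IP still falls in the old range,
# so that its controller recreates it with an address from the new pool.
kubectl get pods -A -o wide --no-headers | \
  awk '$7 ~ /^172\.29\./ {print $1, $2}' | \
  while read ns pod; do
    kubectl delete pod "$pod" -n "$ns"
  done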

7. Updating Flannel Configuration

If Flannel is used, the Controller Manager might keep logging errors such as:

Apr 20 11:43:33 node002 kube-controller-manager[27266]: E0420 11:43:33.265097   27266 controllermanager.go:521] Error starting "nodeipam"
Apr 20 11:43:33 node002 kube-controller-manager[27266]: F0420 11:43:33.265119   27266 controllermanager.go:235] error starting controllers: failed to mark cidr[172.29.1.0/24] at idx [0] as occupied for node: node001: cidr 172.29.1.0/24 is out the range of cluster cidr 10.75.0.0/16

In this example we’ve changed the base network from 172.29.0.0/16 to 10.75.0.0/16. Some more manual work is required for Flannel.

First, delete the interfaces used by Flannel on the Kubernetes nodes:

sudo ip link del cni0;
sudo ip link del flannel.1

For example by using pdsh:

pdsh -A "sudo ip link del cni0; sudo ip link del flannel.1"

Second, delete the Flannel and CoreDNS pods:

root@headnode:~# kubectl delete pod -n kube-system -l app=flannel
pod "kube-flannel-ds-amd64-tqnwx" deleted
pod "kube-flannel-ds-amd64-wc87p" deleted
pod "kube-flannel-ds-amd64-zcq9n" deleted
pod "kube-flannel-ds-amd64-zpmqf" deleted
root@headnode:~# kubectl delete pod -n kube-system -l k8s-app=kube-dns
pod "coredns-b5cdc886c-8vp2b" deleted
pod "coredns-b5cdc886c-jswxd" deleted

Third, we also need to manually change the allocated podCIDRs on the node objects. The easiest way we have found so far is based on a ServerFault answer.

The example bash script below can be copied and pasted on the head node after loading the Kubernetes module file. It recreates each Kubernetes node resource, replacing “172.29.” with “10.75.”.

while read node; do
  # Dump the node object, rewrite the old prefix (dots escaped so sed matches
  # them literally), then delete and recreate the node with the new podCIDR.
  kubectl get node $node -o yaml > /tmp/$node.yaml
  sed -i.bak "s/172\.29\./10.75./g" /tmp/$node.yaml
  kubectl delete node $node
  kubectl create -f /tmp/$node.yaml
done < <(kubectl get nodes --no-headers | awk '{print $1}')

Confirm that the podCIDRs are now correct (below example output from a different cluster, with only four nodes):

root@headnode:~# kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR} ' | sed 's/ /\n/g'
10.75.1.0/24
10.75.2.0/24
10.75.3.0/24
10.75.0.0/24

Final step: recreate any other Kubernetes pods that are still using an IP from the old CIDR, so that they are migrated to the new range.
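
One way to do that, assuming the affected workloads are managed by Deployments or DaemonSets, is a rolling restart per namespace (shown here for two example namespaces):

# Rolling-restart workloads so their pods come back with an IP from 10.75.0.0/16
kubectl rollout restart deployment -n default
kubectl rollout restart daemonset -n kube-system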

Updated on May 17, 2022
