There are two ways to handle this issue:
- Change the load order of the kernel modules
- Detect an invalid network interface order and reboot, either
  - via a custom initialize script, or
  - via the init script of the initrd
Reordering the kernel modules
NICs may get reordered because each of them uses a different kernel module, say module A and module B.
If those kernel modules are close together in the list of kernel modules for a software image, then once in a while they may detect their respective hardware in a different order from the one implied by the order of the kernel modules in the list.
One way to deal with this is to separate those kernel modules by placing other kernel modules between them. This can be done either with CMGUI, or with the cmsh up and down commands.
Below is a brief walkthrough for reordering the modules with cmsh:
[headnode->softwareimage]% use default-image
Module (key) Parameters
[headnode->softwareimage[default-image]->kernelmodules]% up NIC_MODULE_A
[headnode->softwareimage[default-image]->kernelmodules]% down NIC_MODULE_B
<repeat until the modules are further apart>
<wait until the initrd gets regenerated>
Mon Mar 11 16:10:52 2013 [notice] headnode: Initial ramdisk for image default-image is being generated
Mon Mar 11 16:12:18 2013 [notice] headnode: Initial ramdisk for image default-image was regenerated successfully.
Detecting an invalid network interface order and rebooting
Another way to deal with NICs being identified in an incorrect order is to check the order explicitly when the node boots, and to reboot the node if the order is incorrect. This only works if the order of detection varies from boot to boot. If, for example, the kernel driver changes and the new order consistently favors the wrong NIC being identified first, the node ends up in an endless reboot loop. Some care is therefore advised when writing the custom script.
Depending on the individual circumstances, this should be done either via a custom initialize script, or via a modified init script of the initrd. The latter should work in most cases. The former is easier to do, but might not work if one of the interfaces identified in the incorrect order is the boot interface (again, this depends on individual circumstances).
Custom initialize script
When writing a custom initialize script, use the “sreboot” command (that’s not a typo) to reboot the node. Details on writing initialize scripts can be found in the Administrator Manual.
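As an illustration only, the check in an initialize script could compare the MAC address of an interface against the address expected for it. The sketch below assumes that sysfs is readable under /sys/class/net when the script runs. The function names nic_mac and mac_matches, the interface name eth0, and the MAC address are hypothetical and must be adapted to your hardware.

```shell
# Sketch only: the interface name, the MAC address, and the availability
# of sysfs at this point are assumptions.

# Print the MAC address of interface $1 as reported by sysfs.
# SYSFS_NET can be overridden for testing; it defaults to /sys/class/net.
nic_mac() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/address" 2>/dev/null
}

# Succeed only if interface $1 has MAC address $2 (case-insensitive).
mac_matches() {
    expected=$(echo "$2" | tr 'A-F' 'a-f')
    [ "$(nic_mac "$1")" = "$expected" ]
}

# Hypothetical use in an initialize script:
# if ! mac_matches eth0 "00:11:22:AA:BB:CC"; then
#     echo "eth0 has an unexpected MAC address, rebooting"
#     sreboot
# fi
```

The sketch does nothing to prevent the endless reboot loop described above, so it is only suitable where the detection order really does vary between boots.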
Custom code in the initrd’s init script
You can also insert custom code into the init script of the initrd. The code detects, before the node installer starts, whether the interfaces were brought up in the incorrect order, and reboots the node if they were.
Note that, depending on the situation, a custom initialize script might be used as an alternative to modifying the initrd’s init script. However, if the NICs are brought up in the incorrect order, the node installer might never reach the point at which the initialize script is run.
Please refer to the following KB article on how to manually insert code snippets into the initrd’s init script.
Using the guidelines in that KB article, insert your custom code snippet right before the following (existing) code fragment:
ifconfig $BOOTDEV mtu $MTU
if [ "$?" -ne "0" ]; then
echo "WARNING: Setting MTU failed!"
echo "Setting MTU $MTU."
The exact location inside the init script is not that important, as long as it is not in the code branch which gets executed by the cloud nodes.
The exact code which you need to write to test for network interfaces being out of order will, of course, be specific to your individual case. If you need any help with this, please contact our support.
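As a starting point, one possible check is to look at which kernel driver is bound to a given interface name. This is a minimal sketch, not a tested snippet for any particular initrd. It assumes that sysfs is mounted, and that readlink and basename are available in the initrd. The function names nic_driver and nic_order_ok, the interface name eth0, and the reboot command are hypothetical. NIC_MODULE_A is the placeholder used earlier in this article.

```shell
# Sketch only: interface names, driver names, and the reboot command are
# assumptions to be replaced with values that match your hardware.

# Print the kernel driver bound to interface $1, read from sysfs.
# SYSFS_NET can be overridden for testing; it defaults to /sys/class/net.
nic_driver() {
    link="${SYSFS_NET:-/sys/class/net}/$1/device/driver"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}

# Succeed only if interface $1 is bound to driver $2.
nic_order_ok() {
    [ "$(nic_driver "$1")" = "$2" ]
}

# Hypothetical use inside the init script of the initrd:
# if ! nic_order_ok eth0 NIC_MODULE_A; then
#     echo "WARNING: eth0 is not bound to NIC_MODULE_A, rebooting."
#     reboot -f   # assumption: use the reboot command your initrd provides
# fi
```

A driver-based check only helps when the NICs use different kernel modules, which is the case described at the start of this article. The driver name reported by sysfs may differ slightly from the module name, so compare against what the running system actually reports.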