The Mellanox IB kernel modules fail to load with mlnx-ofed-49, mlnx-ofed-50, mlnx-ofed-51, mlnx-ofed-52

Warning: These instructions may not be appropriate for all use cases. Please check the vendor release notes carefully before implementing.

Linux kernel ABI (kABI) compatibility appears to have been broken in newer Linux kernel releases.
The kernels in SLES 12 SP5, RHEL/CentOS 7.7 onwards, and RHEL/CentOS 8.1 onwards appear to be affected.
Other versions, for example SLES 15, may be affected as well.

The issue shows up as follows when attempting to start the openibd service:

# service openibd start
Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx4_ib belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx4_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx4_en belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx5_core belong to kernel which is not a part of ML[FAILED] skipping…
Module mlx5_ib belong to kernel which is not a part of MLNX[FAILED] skipping…
Module mlx5_fpga_tools does not exist, skipping… [FAILED]
Module ib_umad belong to kernel which is not a part of MLNX[FAILED] skipping…
Module ib_uverbs belong to kernel which is not a part of ML[FAILED] skipping…
Module ib_ipoib belong to kernel which is not a part of MLN[FAILED]skipping…
Loading HCA driver and Access Layer:                       [  OK  ]
Module rdma_cm belong to kernel which is not a part of MLNX[FAILED]skipping…
Module ib_ucm does not exist, skipping…                  [FAILED]
Module rdma_ucm belong to kernel which is not a part of MLN[FAILED]skipping…
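
To see which file a module name resolves to on an affected node, a small check along the following lines may help. It is a minimal sketch: it assumes the usual /lib/modules/<release>/ layout, in which modules shipped with the kernel live under kernel/ and kABI-linked modules under weak-updates/. The function names module_origin and report_module are hypothetical, and the test openibd itself performs is not shown in this article, so this does not reproduce it.

```shell
#!/bin/sh
# module_origin PATH: classify a module file by its directory under
# /lib/modules/<release>/ (hypothetical helper, not part of openibd).
module_origin() {
    rest=${1#/lib/modules/*/}      # strip "/lib/modules/<release>/"
    case $rest in
        /*)             echo unknown ;;       # not under /lib/modules
        kernel/*)       echo in-tree ;;       # shipped with the kernel package
        weak-updates/*) echo weak-updates ;;  # linked in for kABI compatibility
        *)              echo out-of-tree ;;   # e.g. extra/ or updates/
    esac
}

# report_module MODULE: print the file modprobe would resolve and its origin.
report_module() {
    path=$(modinfo -n "$1") || return 1
    printf '%s: %s (%s)\n' "$1" "$path" "$(module_origin "$path")"
}

# Usage on a node:
#   report_module mlx5_core
```

A module reported as in-tree comes from the distribution kernel rather than from the OFED packages.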

There are two potential solutions.

Option 1:

To work around this issue, the upstream Mellanox installer provides a --add-kernel-support flag. Unfortunately, the Bright-packaged version of the Mellanox OFED does not provide this functionality, as it has the potential to break MPI and workload manager compatibility.

As a workaround for the Bright packages, perform the following:

1. In /etc/init.d/openibd, on line 132, change FORCE=0 to FORCE=1. This causes openibd to ignore the kernel difference, but it then relies on weak-updates.

2. Edit /etc/infiniband/openib.conf and set UCM_LOAD=no and MLX5_FPGA_LOAD=no. As most customers are not using legacy cards or FPGAs, this should not be an issue.

3. Restart the openibd service.
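
The three steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not a supported procedure: the helper names force_openibd and set_conf are hypothetical, GNU sed is assumed for the -i option, and the FORCE line is matched by its text rather than by line number because line 132 may differ between OFED versions. Back up both files before trying it.

```shell
#!/bin/sh
# force_openibd FILE: change a "FORCE=0" line to "FORCE=1" in the init script.
# Returns non-zero if no FORCE=1 line is present afterwards.
force_openibd() {
    sed -i 's/^\([[:space:]]*\)FORCE=0[[:space:]]*$/\1FORCE=1/' "$1"
    grep -q '^[[:space:]]*FORCE=1' "$1"
}

# set_conf FILE KEY VALUE: replace an existing "KEY=..." line, or append one
# if the key is not present yet.
set_conf() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage on a node, as root:
#   force_openibd /etc/init.d/openibd
#   set_conf /etc/infiniband/openib.conf UCM_LOAD no
#   set_conf /etc/infiniband/openib.conf MLX5_FPGA_LOAD no
#   service openibd restart
```

Because both helpers take the file path as an argument, they can equally be pointed at the copies of these files inside a software image.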

Once complete, the Mellanox OFED modules should load as expected.

# service openibd start
Loading HCA driver and Access Layer:                       [  OK  ]
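
Since the "Loading HCA driver and Access Layer" line also reports OK in the failing output shown earlier, it may be worth confirming that the modules are actually loaded. The sketch below checks a list in /proc/modules format; the helper name missing_modules is hypothetical, and the module names in the usage line are taken from the output above and should be adjusted to the hardware in use.

```shell
#!/bin/sh
# missing_modules FILE MODULE...: print each MODULE that is not listed in FILE.
# FILE is expected to be in /proc/modules format (module name in column 1).
missing_modules() {
    list=$1
    shift
    for mod in "$@"; do
        awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$list" \
            || echo "$mod"
    done
}

# Usage on a node (no output means every listed module is loaded):
#   missing_modules /proc/modules mlx5_core mlx5_ib ib_uverbs ib_umad rdma_cm rdma_ucm
```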


The same changes also need to be applied to your software images.

Option 2:
The alternative to the above steps is to use the upstream Mellanox installer with the --add-kernel-support flag.

Updated on May 5, 2021

Comments

  1. This is a bad kb article. Gave me a lot of grief and caused BeeGFS to use TCP rather than RDMA. In my opinion, it should be retracted.

    1. You make a fair point. We have added a warning to the top of the article.
