  • How Do I Add a BCM ISO as an APT Repository on BCM Ubuntu Clusters?

    First, mount the BCM ISO on the head node. For example:

        # mount -o loop,ro bcm-11.0-ubuntu2404.iso /mnt

    For installing updates to the head node, you can create a .list file under /etc/apt/sources.list.d/. For example:

        # cat /etc/apt/sources.list.d/bcm-dvd.list
        deb [trusted=yes] file:///mnt/data/packages/11.25.04/ubuntu/2404/ ./
        deb [trusted=yes] file:///mnt/data/packages/dist/ubuntu/2404/ ./

    The second line isn’t necessary…

  • How to Run a Script Only at First Boot of Compute Node

    Let’s say you have some software installer that needs to be run on a compute node the first time that node is booted. An rc.local script can be added to that node’s software image that checks for the existence of some file to see if the installer has been run…
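
The marker-file check described above could be sketched as the following shell function. This is only an illustration under stated assumptions: the function name `run_once`, the marker path, and the installer path are hypothetical and do not come from the article.

```shell
#!/bin/sh
# Hypothetical first-boot guard, intended to be called from rc.local.
# Usage: run_once MARKER_FILE COMMAND [ARGS...]
run_once() {
    marker=$1
    shift

    # If the marker file exists, the command already ran on an earlier boot.
    if [ -e "$marker" ]; then
        return 0
    fi

    # Run the command; create the marker only if it succeeded,
    # so that a failed run is retried on the next boot.
    if "$@"; then
        mkdir -p "$(dirname "$marker")" && touch "$marker"
    else
        echo "run_once: command failed, will retry on next boot" >&2
        return 1
    fi
}
```

In rc.local this might be invoked as `run_once /var/lib/firstboot/installer.done /root/install.sh`, where both paths are placeholders. Note that the marker must live on storage that survives a reboot of the node, otherwise the installer runs on every boot.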

  • How to Change TTL for Head Node DNS Service?

    Bright Versions: 8.1, 8.2, 9.0, 9.1, 9.2
    BCM Versions: 10.0

    On the cluster’s head node(s), the minimum TTL used by the start-of-authority (SOA) record for the DNS service may be changed with the DNSSoaTTL AdvancedConfig directive. If no AdvancedConfig directive is currently defined in the /cm/local/apps/cmd/etc/cmd.conf file…

  • How to Use a Compute Node as a Redundant Slurm Controller?

    Let’s say that the BCM cluster already has Slurm deployed and that the cluster has one head node that is serving as the lone Slurm server (controller). Perhaps you do not have additional hardware that you can use as a secondary head node that can be configured for head node…

  • Preventing NetworkManager from Overwriting /etc/resolv.conf on RHEL Compute Nodes

    When software images for compute nodes use RHEL/Rocky Linux 8 or later, you may find that the NetworkManager service overwrites the /etc/resolv.conf file on the compute node after it has been provisioned by the BCM node installer. For example:

        [root@hpc077 ~]# cat /etc/resolv.conf
        #…

  • Why Are Software Images Not Syncing from the Active to the Passive Head Node?

    After setting up head node high availability (HA) on a BCM cluster, you may notice that the software images are not being synced from the active head node to the passive head node. If specific categories or nodegroups have been set in the provisioning role of the head nodes in…

  • How to Use the By-path Method for Defining Block Devices in the Disk Setup

    When block device naming is not persistent, you may specify the “by-path” name on the <blockdev> line in a node’s disk setup instead of, say, /dev/nvme2n1. For example:

        <blockdev>/dev/disk/by-path/pci-0000:03:00.0-scsi-0:3:111:0</blockdev>

    You may find the “by-path” names for a node’s block devices by running the following command on that node:

        # ls…

  • How to Use Lmod Spider Cache?

    The BCM Lmod package is built with spider cache functionality. The recommended directory for storing this cache is /var/lib/lmod/mData/cacheDir, and the recommended location for the timestamp file is /var/lib/lmod/mData/cacheTS.txt. The cache may be generated and updated as follows:

        # /usr/share/lmod/lmod/libexec/update_lmod_system_cache_files -t /var/lib/lmod/mData/cacheTS.txt -d /var/lib/lmod/mData/cacheDir /cm/shared/modulefiles

    Note how the…

  • Why Is DNS Resolution Failing in My Kubernetes Containers?

    If the cluster is running Bright 9.2 and using Ubuntu 22.04 for the nodes that are part of the Kubernetes cluster, then you may notice that DNS resolution inside Kubernetes containers is failing. For example:

        # kubectl exec -i -t dnsutils -- nslookup www.google.com
        Server: 10.150.255.254
        Address: 10.150.255.254#53
        Non-authoritative answer:…