  • How to Use the By-path Method for Defining Block Devices in the Disk Setup

    When block device naming is not persistent, you may specify the “by-path” name on the <blockdev> line in a node’s disk setup instead of, say, /dev/nvme2n1. For example: <blockdev>/dev/disk/by-path/pci-0000:03:00.0-scsi-0:3:111:0</blockdev> You may find the “by-path” names for a node’s block devices by running the following command on that node: # ls…

  • How to use Lmod spider cache?

    The BCM Lmod package is built with spider cache functionality. The recommended directory for storing this cache is /var/lib/lmod/mData/cacheDir, and it is recommended that the timestamp file be located at /var/lib/lmod/mData/cacheTS.txt. The cache may be generated and updated as follows: # /usr/share/lmod/lmod/libexec/update_lmod_system_cache_files -t /var/lib/lmod/mData/cacheTS.txt -d /var/lib/lmod/mData/cacheDir /cm/shared/modulefiles Note how the…

  • Why Is DNS Resolution Failing in My Kubernetes Containers?

    If the cluster is running Bright 9.2 and using Ubuntu 22.04 for the nodes that are part of the Kubernetes cluster, then you may notice that DNS resolution inside Kubernetes containers is failing. For example: # kubectl exec -i -t dnsutils -- nslookup www.google.com Server: 10.150.255.254 Address: 10.150.255.254#53 Non-authoritative answer:…

  • How do I remove a generic device from Bright?

    To remove a generic device from Bright using cmsh, the device must first be closed. For example, suppose you have a generic device called powershelf01 that you no longer want to use with your Bright cluster. The device may be removed by running the following cmsh commands…

  • Why is sshare for Slurm 23.02 complaining about the priority/basic plugin?

    If you run the “sshare” tool for Slurm 23.02 on your cluster, you may see the following error output: $ sshare sshare: error: plugin_load_from_file: dlopen(/cm/shared/apps/slurm/23.02.2/lib64/slurm/priority_basic.so): /cm/shared/apps/slurm/23.02.2/lib64/slurm/priority_basic.so: undefined symbol: job_list sshare: error: Couldn't load specified plugin name for priority/basic: Dlopen of plugin file failed sshare: error: cannot create priority context for…

  • How Do I Set up GPFS 5 on a Bright cluster?

    Run the installer and accept the license. # ./Spectrum_Scale_Developer-5.1.0.3-x86_64-Linux-install NOTE: If you have the purchased version of the GPFS software, the name of the installer will be different. In that case, replace the above with the name of the installer from the purchased version. Install onto the head node(s) the packages…

  • How to have Bright monitor a BittWare FPGA card

    If you have a BittWare FPGA card (e.g. XUP-VVH) that can be inserted into a PCI/PCIe slot of a Bright-managed compute node, then you can follow these procedures to allow Bright to monitor the card’s sensor data. Installing the toolkit First, in order to be able to gather sensor…