This document assumes Intel Cluster Checker 2019. With other versions of Cluster Checker, the steps should be similar but may differ slightly.
- Install your Bright cluster and bring up all of the nodes
- Install the Intel Cluster Runtime RPM from the Bright YUM repository
yum install intel-cluster-runtime
- Arrange for the intel-cluster-runtime environment module to be loaded at every login for new users and for root
echo "module load intel-cluster-runtime" >> /etc/skel/.bashrc
echo "module load intel-cluster-runtime" >> /root/.bashrc
echo "module load shared intel-cluster-runtime" >> /cm/images/default-image/root/.bashrc
NOTE 1: The ‘shared’ module is needed for root on the compute nodes because, by default, root logins do not touch /cm/shared. This avoids lock issues when the NFS server is down.
NOTE 2: Cluster Checker is run as root rather than as an ordinary user because dmidecode needs root privileges to obtain information about the memory DIMMs.
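The echo commands above append a line every time they are run, so repeating this step leaves duplicate "module load" lines in the files. A minimal sketch of an idempotent variant follows. The helper name append_once is hypothetical and is not part of Bright or the Intel tooling.

```shell
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (append_once is a hypothetical helper name used for illustration only.)
append_once() {
    # grep flags: -q quiet, -x match the whole line, -F treat LINE as a fixed string.
    # If FILE does not exist yet, grep fails and the file is created by the append.
    if ! grep -qxF -- "$2" "$1" 2>/dev/null; then
        printf '%s\n' "$2" >> "$1"
    fi
}

# Example calls (commented out so that sourcing this snippet changes nothing):
# append_once /etc/skel/.bashrc "module load intel-cluster-runtime"
# append_once /root/.bashrc "module load intel-cluster-runtime"
# append_once /cm/images/default-image/root/.bashrc "module load shared intel-cluster-runtime"
```

Running the three example calls more than once leaves exactly one copy of each line in each file.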
- Download Intel Cluster Checker from the Intel website at https://software.intel.com/en-us/intel-cluster-checker
- Copy the file l_clck_p_2019.0.015.tgz to the cluster’s root account.
- Untar the file
tar -xvzf l_clck_p_2019.0.015.tgz
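The installer step expects the archive to unpack into the directory l_clck_p_2019.0.015/. If a different download is used, the directory name may differ. The sketch below lists the top-level entries of a .tgz archive without extracting it. The helper name archive_topdirs is hypothetical.

```shell
# archive_topdirs ARCHIVE
# Print the distinct top-level entries of a gzip-compressed tar archive.
# (archive_topdirs is a hypothetical helper name used for illustration only.)
archive_topdirs() {
    # tar -t lists the contents without extracting anything.
    # sed strips a leading "./" and drops empty lines, cut keeps the first
    # path component, and sort -u removes duplicates.
    tar -tzf "$1" | sed -e 's|^\./||' -e '/^$/d' | cut -d/ -f1 | sort -u
}

# Example call (commented out so that sourcing this snippet does nothing):
# archive_topdirs l_clck_p_2019.0.015.tgz
```

The name printed is the directory to change into before running install.sh.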
- Run the installer
cd l_clck_p_2019.0.015/
./install.sh
- Go through the installer
- Accept the license agreement by scrolling through it and typing “accept”
- Make sure that the installation target is set to “[ Current system only]” (which is the default)
- Select option 1 to finish configuring the installation targets
- Select option 1 to start the installation
- Wait until the installation has finished, then press Enter a few times to exit
- Intel Cluster Checker is now installed in /opt/intel/clck_latest
- Copy the /opt/intel tree into the software image:
cp -a /opt/intel /cm/images/default-image/opt/
- Propagate the changes to your nodes, for example with the following command in CMSH:
device imageupdate -w -c default
- Load the environment settings
source /opt/intel/clck_latest/bin/clckvars.sh
- Set a shared temporary directory, because root’s home directory is not shared across the nodes
export CLCK_SHARED_TEMP_DIR=/home/cmsupport
- Create a nodes file
for i in `seq -w 001 004`; do echo node$i; done > nodefile
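On clusters with a different node prefix or node range, the same nodes file can be produced with a small function. This is a minimal sketch that assumes node names consist of a prefix followed by a three-digit, zero-padded number, as in the command above. The function name make_nodefile is hypothetical.

```shell
# make_nodefile PREFIX FIRST LAST
# Print one hostname per line, with the number zero-padded to three digits.
# FIRST and LAST must be plain decimal numbers without leading zeros
# (for example 1 and 4), because a leading zero would be read as octal.
# (make_nodefile is a hypothetical helper name used for illustration only.)
make_nodefile() {
    prefix="$1"
    n="$2"
    last="$3"
    while [ "$n" -le "$last" ]; do
        printf '%s%03d\n' "$prefix" "$n"
        n=$((n + 1))
    done
}

# Example call (commented out so that sourcing this snippet writes no file):
# make_nodefile node 1 4 > nodefile
```

The example call writes node001 through node004, one per line, which matches the output of the seq loop.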
- Run Cluster Checker
clck -f nodefile
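A run depends on the preparation steps above: a non-empty nodes file and a CLCK_SHARED_TEMP_DIR that points to a writable directory. The sketch below checks both before clck is started. The function name clck_preflight is hypothetical and is not part of Cluster Checker. It only inspects the head node and does not verify that the directory is actually shared with the compute nodes.

```shell
# clck_preflight NODEFILE
# Return 0 if NODEFILE is non-empty and CLCK_SHARED_TEMP_DIR names a writable
# directory. Otherwise print a message to stderr and return 1.
# (clck_preflight is a hypothetical helper name used for illustration only.)
clck_preflight() {
    nodefile="$1"
    # -s is true only for an existing file with a size greater than zero.
    if [ ! -s "$nodefile" ]; then
        echo "nodes file '$nodefile' is missing or empty" >&2
        return 1
    fi
    if [ -z "${CLCK_SHARED_TEMP_DIR:-}" ]; then
        echo "CLCK_SHARED_TEMP_DIR is not set" >&2
        return 1
    fi
    if [ ! -d "$CLCK_SHARED_TEMP_DIR" ] || [ ! -w "$CLCK_SHARED_TEMP_DIR" ]; then
        echo "'$CLCK_SHARED_TEMP_DIR' is not a writable directory" >&2
        return 1
    fi
    return 0
}

# Example call (commented out so that sourcing this snippet runs nothing):
# clck_preflight nodefile && clck -f nodefile
```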
- Check the clck_results file for results or run clck-analyze
clck-analyze -f nodefile
- If the results contain spurious failures (“false negatives”), it may be necessary to run with a custom configuration that disables tests or changes thresholds
cp $CLCK_ROOT/etc/clck.xml ~/my_clck.xml
- Modify the configuration according to the Intel Cluster Checker documentation and re-run with:
clck -f nodefile -c ~/my_clck.xml