  • How to disable a GPU on a node

    In certain scenarios it can be necessary to disable a GPU on a node, for example when the GPU becomes faulty and a replacement is about to arrive. This article shows two ways to disable an NVIDIA GPU on a compute node. Method 1: Collect the…

  • Exclude lists for the DGX Docker nodes

    This article applies to DGX nodes where Docker has been set up without using NVIDIA Bright Cluster Manager. When Docker is set up using Bright, a few Docker files and directories are added to different exclude lists to avoid syncing undesired files between the software image and…

  • How to run SLURM jobs in Singularity containers via Jupyter

    This article demonstrates a procedure to run SLURM jobs in Singularity containers via Jupyter on a Bright 9.2 Ubuntu 20.04 cluster. We assume that Jupyter, SLURM and Singularity have already been set up on the target cluster by following the NVIDIA Bright Cluster Manager manuals. The…

  • Enabling Kdump (Ubuntu)

    The instructions in this article can be followed to enable Kdump on Ubuntu 20.04 compute nodes. As an additional precaution, if you have a test compute node, consider trying this procedure on it first. One possibility is to clone the existing production software image in use on the…

  • How to create a Docker image to run Jupyter kernels

    This article demonstrates a procedure to create a Docker image that can be used to run Jupyter kernels via Kubernetes. Some sample Docker files for creating Jupyter kernel-compatible images can be found in the following directories: In addition to the Docker images mentioned in the sample Docker files, any…