How do I validate that my DGX cluster is working properly? One of the best ways to stress test your DGX cluster is to use NVIDIA’s HPC benchmarks, which can be...
How should I set up Slurm on a DGX cluster? Background: A workload management system is helpful for scheduling jobs on a cluster of nodes. The steps below describe how...
General considerations for installing a Bright DGX cluster: Loading the correct kernel modules. If you are going to use the built-in gigabit Ethernet interface as your internal cluster...
Optimizing and validating JupyterHub setup to support more user sessions: The default configuration of Bright’s Jupyter integration supports up to 1000 users on a single login node as long as...
Exclude lists for the DGX Docker nodes: This article is applicable for the DGX nodes where Docker has been set up without using NVIDIA Bright Cluster Manager....
How to upgrade DGX A100 Firmware from headnode: This article describes how to stage the NVIDIA DGX A100 Firmware Update Utility for PXE booting from the BCM headnode....