How do I validate that my DGX cluster is working properly? One of the best ways to stress test your DGX cluster is to use NVIDIA’s HPC benchmarks, which can be...
Installing NVIDIA DGX software stack in Bright software images: This article has been split into a number of new articles that are specific to a Linux distribution: Installing NVIDIA...
How can I run a simple test to stress test my GPUs? Make sure CUDA, git, and cmake are installed on the head node of the cluster. Clone the Multi GPU Benchmark...
How should I set up Slurm on a DGX cluster? A workload management system helps schedule jobs on a cluster of nodes. The steps below...
General considerations for installing a Bright DGX cluster: Loading the correct kernel modules. If you are going to use the built-in gigabit Ethernet interface as your internal cluster...
How do I use NGC containers with Bright’s Jupyter setup? These instructions are not relevant for installations of Bright Cluster Manager 9.2 and newer. An integration with Kubernetes’ Jupyter operator...
Installing NVIDIA DGX software stack in Bright RHEL7 software images: This document describes the procedure for installing the official NVIDIA DGX software stack in a Bright RHEL7 software image. The...
Installing NVIDIA DGX software stack in Bright RHEL8 software images: This document describes the procedure for installing the official NVIDIA DGX software stack in a Bright RHEL8 software image. The...
Using enroot and pyxis in Bright Cluster Manager: These instructions are not relevant for installations of Bright Cluster Manager 9.1 and newer. An integration with enroot and pyxis...
Optimizing and validating JupyterHub setup to support more user sessions: The default configuration of Bright’s Jupyter integration supports up to 1000 users on a single login node as long as...
Installing NVIDIA DGX software stack in Bright Ubuntu 20.04 software images: This document describes the procedure for installing the official NVIDIA DGX software stack in a Bright Ubuntu 20.04 software image....
Exclude lists for the DGX Docker nodes: This article applies to DGX nodes where Docker has been set up without using NVIDIA Bright Cluster Manager....
How to upgrade DGX A100 Firmware from headnode: This article describes how to stage the NVIDIA DGX A100 Firmware Update Utility for PXE booting from the BCM head node....