Why has my software suite vanished from the bunch of nodes I installed it on? How do I fix it? Most likely you installed the software directly onto those nodes instead of on the “software image”. Then, after a reboot...
How do I use Spack on my Bright cluster? Bright Cluster Manager provides a good selection of ready-to-use libraries and tools that are commonly used in a high performance...
How can I fix “Failed to initialize NVML: Driver/library version mismatch”? Notice of Knowledge Base Relocation Our Knowledge Base has been relocated to the NVIDIA Enterprise Support Portal. This update is...
How to have Bright monitor a BittWare FPGA card If you have a BittWare FPGA card that can be inserted into a PCI/PCIe slot of a Bright-managed compute node...
How can I use Grafana to monitor multiple Bright clusters?
How do I use Grafana to visualize monitoring data from a Bright cluster? Although Bright View has extensive capabilities when it comes to visualizing monitoring information, it may be desirable to be able...
How do I validate that my DGX cluster is working properly? One of the best ways to stress test your DGX cluster is to use NVIDIA’s HPC benchmarks which can be...
How can I get access to nightly builds of packages?
Running Jupyter kernel with Conda (Anaconda/Miniconda) environments Bright Cluster Manager’s data science add-on provides many ML-related packages that can be used to run AI workloads on...
Enabling Kdump (RHEL/CentOS)
Deploying NICE DCV on a Bright cluster
Firefox issue – Secure Connection Failed
How to create a Docker image to run Jupyter kernels This article demonstrates a procedure to create a Docker image which can be used to run Jupyter kernels via Kubernetes....
Upgrading Slurm This article goes over the steps needed to upgrade the Bright-provided Slurm packages to a newer major version...
Installing Kubernetes on Air-Gapped Systems Kubernetes is most easily installed on a cluster that is able to access the internet. For clusters without internet access...
How do I ensure that the container images I run on my BCM cluster through Kubernetes are secure?
Enabling Kdump (Ubuntu)
How do I add a QCOW image as a software image?
Installing Kubernetes on Air-Gapped 9.2 Systems Kubernetes is most easily installed on a cluster that is able to access the internet. For clusters without internet access...
How to install Azure Managed Lustre FS client on top of BCM 10 This article was tested on BCM 10 with Ubuntu 22.04. 1. Clone the default software image: cmsh softwareimage clone default-image amlfs-image 2....
Kubernetes 1.30.5 setup on BCM 10.24.09 Airgapped 1. Prerequisites and Requirements We will set up the following: Kubernetes version: v1.30.5 Base Command Manager (BCM) version: 10.24.09 On Linux...
Required security upgrade for nvidia-container-toolkit A security issue has been found in nvidia-container-toolkit. All nodes that have this package installed need to have the package...
Deprecation of cm-singularity and cm-apptainer packages
Pythoncm script examples on software image related operations This article provides Pythoncm script examples for software image related operations. Prerequisites: Install Pythoncm and load the module:...