• General considerations for installing a Bright DGX cluster

    Loading the correct kernel modules: If you are going to use the built-in gigabit Ethernet interface as your internal cluster network between the head node(s) and the DGX nodes, nothing special needs to be done in terms of loading kernel modules. This is because the igb module…

  • How should I set up Slurm on a DGX cluster?

    A workload management system helps schedule jobs on a cluster of nodes. The steps below describe how to set up Slurm so that GPUs have to be explicitly requested. This way it becomes much easier to share GPU and CPU compute…

  • How can I run a simple test to stress test my GPUs?

    Make sure CUDA, git and cmake are installed on the head node of the cluster; clone the Multi GPU Benchmark (mgbench) repository under a user account (e.g. cmsupport); load the CUDA environment module; build it; create a file mgbench.slurm; submit a number of jobs. Each job…

  • How do I validate that my DGX cluster is working properly?

    One of the best ways to stress test your DGX cluster is to use NVIDIA’s HPC benchmarks, which can be found in NGC. Since this software is packaged as a container image, a container runtime engine such as Singularity is needed to run it. It is worth…

  • How can I have multiple network interfaces on a node in the same IP subnet?

    When you configure multiple network interfaces on a single machine, each with an IP address in the same IP subnet, additional configuration is needed to allow the networking stack in the Linux kernel to use these interfaces properly. By default you will find that only one…

  • How do I use Grafana to visualize monitoring data from a Bright cluster?

    Although Bright View has extensive capabilities for visualizing monitoring information, it may be desirable to visualize monitoring data using Grafana (e.g. when Grafana is used as a datacenter-wide monitoring tool). As of Bright 8.2 it is possible to follow the instructions below to…

  • How can I use Grafana to monitor multiple Bright clusters?

    As of Bright 8.2-24 / 9.0-12 / 9.1-2, it is possible to add the cluster as a data source to Grafana so that the Grafana interface can be used to visualize monitoring information. To quickly verify that cmdaemon on your cluster is compatible, you can check that the build date…

  • What is Bright Computing’s take on CentOS versus CentOS Stream?

    In December 2020, Red Hat announced that it would discontinue CentOS 8 by the end of 2021 and instead focus on CentOS Stream going forward. Fortunately, CentOS 7 will continue to be updated until 2024 and is therefore not affected by this change. While CentOS traditionally has been…

  • How do I use Spack on my Bright cluster?

    Bright Cluster Manager provides a good selection of ready-to-use libraries and tools that are commonly used in a high performance computing environment. You may nevertheless need a tool or library for which Bright does not provide a package. Spack is a package manager that can be of use…