
How should I set up Slurm on a DGX cluster?

Background

A workload management system is helpful for scheduling jobs on a cluster of nodes. The steps below describe how to set up Slurm so that GPUs must be explicitly requested, which makes it much easier for many users to share the cluster's GPU and CPU resources.

Installation Steps

1. To start the process, we are going to remove the current Slurm setup, if there is any:

# cm-wlm-setup --disable --wlm-cluster-name=slurm --yes-i-really-mean-it

2. Next, we will start the interactive setup tool:

# cm-wlm-setup

3. At the first screen, we will choose the Step By Step setup option:

4. Select Slurm as the workload management system to set up:

5. Set an appropriate name for the Slurm cluster (slurm will do fine if you only have one Slurm instance on this Bright cluster).

6. Select your head node as the Slurm server:

7. Choose a name for the configuration overlay that is about to be created (the defaults are fine):

8. Select Yes to configure GPU resources:

9. Initially, cm-wlm-setup will let you set up Slurm clients without GPUs. Assuming that there are no nodes to be set up without GPUs, unselect all categories and press OK.

10. Likewise, assuming there are no individual compute nodes without GPUs, leave all the options at the following screen unselected and press OK:

11. Choose a suitable name for the configuration overlay of Slurm clients without GPUs (the default is fine):

12. Choose a suitable name for the configuration overlay of Slurm clients with GPUs (the default is fine):

13. Select the categories of compute nodes with GPUs that you would like to include in the configuration overlay that was created in the previous step:

14. Select any additional nodes with GPUs that should be added to the configuration overlay:

15. Select a priority for the configuration overlay (the default is fine for all practical cases):

16. Leave the number of slots unconfigured:

17. Select the categories of nodes from which jobs will be submitted: choose the category of GPU compute nodes here. If you have a category of login nodes, you will want to add it as well. We will add the head node in the next screen:

18. Select any additional nodes from which you will be submitting jobs (e.g. the head node of the cluster):

19. Choose a name for the configuration overlay of submit hosts (the default is fine):

20. Choose a name for the configuration overlay of accounting nodes:

21. Select the head node as the accounting node:

22. Add the 8 GPUs in each node as GPU resources that can be requested:

NOTE: It is also possible to rely on Slurm’s GPU autodetect capabilities. Please consult the Bright documentation for details.
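
The GPU resources defined here end up in Slurm's GRES configuration (gres.conf), which Bright generates from this overlay. On a DGX node with 8 GPUs, an entry might look roughly like the sketch below; the node names and device paths are assumptions, so treat it as an illustration rather than something to edit by hand:

# Illustrative gres.conf entry for nodes with 8 GPUs (assumed node names and device paths)
NodeName=dgx-[01-08] Name=gpu File=/dev/nvidia[0-7]
# With Slurm's GPU autodetection, a single "AutoDetect=nvml" line can replace explicit entries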

23. Leave the MPS settings empty unless you intend to use the CUDA Multi-Process Service (MPS).

NOTE: To use MPS, you must perform additional setup steps to start/stop the MPS daemon through the prolog/epilog.
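
As a rough illustration of what those prolog/epilog hooks involve, the sketch below starts and stops the MPS control daemon; the directory paths and the way these snippets hook into Bright's prolog/epilog mechanism are assumptions, so consult the Bright documentation for the supported integration:

# Prolog sketch: start the MPS control daemon for the job
export CUDA_MPS_PIPE_DIRECTORY=/tmp/mps_$SLURM_JOB_ID
export CUDA_MPS_LOG_DIRECTORY=/tmp/mps_log_$SLURM_JOB_ID
mkdir -p "$CUDA_MPS_PIPE_DIRECTORY" "$CUDA_MPS_LOG_DIRECTORY"
nvidia-cuda-mps-control -d

# Epilog sketch: shut the daemon down and clean up
export CUDA_MPS_PIPE_DIRECTORY=/tmp/mps_$SLURM_JOB_ID
echo quit | nvidia-cuda-mps-control
rm -rf /tmp/mps_$SLURM_JOB_ID /tmp/mps_log_$SLURM_JOB_ID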

24. Enable the following cgroup resource constraints to make sure that jobs cannot use CPU cores or GPUs that they did not request:
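
These options correspond to parameters in Slurm's cgroup.conf. With all of them enabled, the generated file would contain entries along the following lines (shown for reference only; Bright manages the file):

# Constrain jobs to the cores, devices (GPUs) and memory they were allocated
CgroupAutomount=yes
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes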

25. Create a default queue. More queues can always be defined later:
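
A Slurm queue is a partition: in the generated slurm.conf, the default queue created here shows up roughly as the line below. The node list and limits are assumptions based on the example cluster in this article, and Bright manages slurm.conf, so this is for reference only:

PartitionName=defq Nodes=dgx-[01-08] Default=YES MaxTime=INFINITE State=UP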

26. Select Save & Deploy:

27. Store the configuration for later:

Post-Installation Steps

1. After the setup completes, you will want to reboot all compute nodes from cmsh:

device power reset -c dgx

2. After the nodes come back up, you can verify that Slurm is working properly by checking the sinfo output:

[root@utilitynode-01 ~]# sinfo                                                                                                
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST                                                                             
defq*        up   infinite      8   idle dgx-[01-08] 
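
To confirm that the GPUs were registered as generic resources, you can also ask sinfo to print the GRES column (the exact output depends on your node names and GPU count):

[root@utilitynode-01 ~]# sinfo -N -o "%N %G"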

3. By default, Slurm is configured not to allow multiple jobs on the same node. To change this behavior and allow, for example, a maximum of 8 simultaneous jobs to run on a single node, you can do the following:

[root@utilitynode-01 ~]# cmsh
[utilitynode-01]% wlm use slurm 
[utilitynode-01->wlm[slurm]]% jobqueue 
[utilitynode-01->wlm[slurm]->jobqueue]% use defq 
[utilitynode-01->wlm[slurm]->jobqueue[defq]]% get oversubscribe 
NO
[utilitynode-01->wlm[slurm]->jobqueue[defq]]% set oversubscribe YES:8
[utilitynode-01->wlm[slurm]->jobqueue*[defq*]]% commit
[utilitynode-01->wlm[slurm]->jobqueue[defq]]% 
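
The cmsh setting above maps to the OverSubscribe parameter of the partition in the generated slurm.conf, i.e. something like the line below. It is shown for reference only; keep managing the value through cmsh so that Bright does not overwrite manual edits:

PartitionName=defq Nodes=dgx-[01-08] Default=YES MaxTime=INFINITE State=UP OverSubscribe=YES:8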

4. To verify that GPU reservation is working, first run a job without requesting any GPUs:

[root@utilitynode-01 ~]# srun nvidia-smi
No devices were found
srun: error: dgx-06: task 0: Exited with exit code 6
[root@utilitynode-01 ~]#

5. Then request GPUs, for example 2 GPUs:

[root@utilitynode-01 ~]# srun --gres=gpu:2 nvidia-smi
Thu Mar  4 08:50:44 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.102.04   Driver Version: 450.102.04   CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  A100-SXM4-40GB      On   | 00000000:07:00.0 Off |                    0 |
| N/A   30C    P0    54W / 400W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  A100-SXM4-40GB      On   | 00000000:0F:00.0 Off |                    0 |
| N/A   30C    P0    53W / 400W |      0MiB / 40537MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
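
With this configuration in place, batch jobs also have to request GPUs explicitly. A minimal sbatch script might look like the sketch below; the job name, partition, GPU count and time limit are example values:

#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --partition=defq
#SBATCH --gres=gpu:4
#SBATCH --time=00:10:00

# The job only sees the GPUs it requested
nvidia-smi

Submit the script with sbatch and check the resulting output file to confirm that only the requested GPUs are visible.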