slurmrestd is a stateless, REST-compatible API to the slurm control plane. This article covers how to install the service and configure it for basic operation.
Important Notes
- Instructions are based on Bright Cluster Manager 9.2 on a Rocky8 based cluster running slurm 21.08, although these steps should work for most Bright clusters where the slurmXX.XX-slurmrestd package is available.
- SLES15-specific steps are provided.
- Currently Bright Cluster Manager provides no integration or management of slurmrestd.
- This installation is only for trusted networks. Refer to the slurm documentation on the security and best practices for slurmrestd. The service should not be exposed on untrusted networks without additional security.
- Instructions presume that slurm is installed and operating normally.
Installation
The first step is to install the slurmrestd package on the primary head node.
NOTE: Make sure to install the slurmrestd package whose version matches the installed SLURM version. Installing a non-matching package may result in SLURM failures. If you would like to update SLURM, please use the appropriate KB article.
# yum install `rpm -qa | grep -e 'slurm[[:digit:]][[:digit:]]\.[[:digit:]][[:digit:]]-[[:digit:]]' | awk -F- '{print $1 "-slurmrestd-" $2 "-" $3}'`
Dependencies resolved.
Installing:
 slurm21.08-slurmrestd   x86_64   21.08.8-100612_cm9.2_457a042d80   cm-rhel8-9.2-updates   179 k
Installing dependencies:
 http-parser             x86_64   2.8.0-9.el8 ...
Running transaction
  Preparing        :                                                                  1/1
  Installing       : http-parser-2.8.0-9.el8.x86_64                                   1/2
  Installing       : slurm21.08-slurmrestd-21.08.8-100612_cm9.2_457a042d80.x86_64     2/2
  Running scriptlet: slurm21.08-slurmrestd-21.08.8-100612_cm9.2_457a042d80.x86_64     2/2
  Verifying        : slurm21.08-slurmrestd-21.08.8-100612_cm9.2_457a042d80.x86_64     1/2
  Verifying        : http-parser-2.8.0-9.el8.x86_64                                   2/2
Installed:
  http-parser-2.8.0-9.el8.x86_64
  slurm21.08-slurmrestd-21.08.8-100612_cm9.2_457a042d80.x86_64
Complete!
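The backtick expression in the install command builds the slurmrestd package name from the installed base slurm package. The following is a minimal sketch of that transformation on its own, which may help when checking what yum will be asked to install. The function name `slurmrestd_pkg_name` is illustrative and not part of Bright or Slurm.

```shell
# Illustrative helper (hypothetical name): derive the matching slurmrestd
# package name from a base package name as printed by `rpm -qa`.
# It reuses the grep and awk expressions from the install command.
slurmrestd_pkg_name() {
    printf '%s\n' "$1" |
        grep -e 'slurm[[:digit:]][[:digit:]]\.[[:digit:]][[:digit:]]-[[:digit:]]' |
        awk -F- '{print $1 "-slurmrestd-" $2 "-" $3}'
}

# Example using the package version from the transcript above
slurmrestd_pkg_name 'slurm21.08-21.08.8-100612_cm9.2_457a042d80.x86_64'
```

For names that are not the base package (for example one with an extra component between the slurm version and the release), the grep does not match and nothing is printed.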
At this point, you should have an installed binary for slurmrestd and can test to confirm that it is able to communicate with the slurm controller (non-SLES only).
# module load slurm
# echo -e "GET /slurm/v0.0.37/diag/ HTTP/1.0\r\n\r\n" | SLURMRESTD_SECURITY=disable_user_check /cm/shared/apps/slurm/current/sbin/slurmrestd | head -n 20
Sample output:
slurmrestd: operations_router: [fd:0->fd:1] GET /slurm/v0.0.37/diag/
slurmrestd: rest_auth/local: slurm_rest_auth_p_authenticate: [fd:0->fd:1] accepted connection from user: root[0]
slurmrestd: rest_auth/local: slurm_rest_auth_p_apply: apply local auth for user root
HTTP/1.0 200 OK
Content-Length: 1383
Content-Type: application/json

{
  "meta": {
    "plugin": {
      "type": "openapi\/v0.0.37",
      "name": "Slurm OpenAPI v0.0.37"
    },
    "Slurm": {
      "version": {
        "major": 21,
        "micro": 8,
        "minor": 8
      },
      "release": "21.08.8-2"
    }
  },
  "errors": [
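Because `echo -e` behaves differently across shells, a small helper built on `printf` can make the hand-written request easier to reuse with another API version. This is only a sketch, assuming the same `/slurm/<version>/diag/` path as the test above; `diag_request` is a name invented for this example.

```shell
# Hypothetical helper: emit the raw HTTP/1.0 request used in the test above
# for a given OpenAPI plugin version (for example v0.0.36 or v0.0.37).
diag_request() {
    printf 'GET /slurm/%s/diag/ HTTP/1.0\r\n\r\n' "$1"
}

# Show the request line without its trailing carriage return
diag_request v0.0.37 | head -n 1 | tr -d '\r'
```

The helper's output can be piped into slurmrestd in place of the `echo -e` command, for example `diag_request v0.0.37 | SLURMRESTD_SECURITY=disable_user_check /cm/shared/apps/slurm/current/sbin/slurmrestd | head -n 20`.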
If you are not on SLES, you may jump to the next section of this article.
For SLES, a couple of extra steps are needed to run the test above (do not execute these if you are not on SUSE):
# module load slurm
# echo "LD_LIBRARY_PATH=/cm/local/apps/http-parser/lib:\$LD_LIBRARY_PATH" >> /etc/sysconfig/slurmrestd
# echo -e "GET /slurm/v0.0.36/diag/ HTTP/1.0\r\n\r\n" | SLURMRESTD_SECURITY=disable_user_check LD_LIBRARY_PATH=/cm/local/apps/http-parser/lib:$LD_LIBRARY_PATH /cm/shared/apps/slurm/current/sbin/slurmrestd | head -n 20
Sample output:
slurmrestd: operations_router: [fd:0->fd:1] GET /slurm/v0.0.36/diag/
slurmrestd: rest_auth/local: slurm_rest_auth_p_authenticate: slurm_rest_auth_p_authenticate: [fd:0->fd:1] accepted connection from uid:0
slurmrestd: rest_auth/local: slurm_rest_auth_p_apply: apply local auth for user root
HTTP/1.0 200 OK
Content-Length: 1367
Content-Type: application/json

{
  "meta": {
    "plugin": {
      "type": "openapi\/v0.0.36",
      "name": "REST v0.0.36"
    },
    "Slurm": {
      "version": {
        "major": 20,
        "micro": 9,
        "minor": 11
      },
      "release": "20.11.9"
    }
  },
  "errors": [
Note that you will likely want API version v0.0.36 on Bright 9.1 and v0.0.37 on Bright 9.2. You can check which version you need by running the following:
# echo -e "GET /openapi.json HTTP/1.0\r\n\r\n" | SLURMRESTD_SECURITY=disable_user_check LD_LIBRARY_PATH=/cm/local/apps/http-parser/lib:$LD_LIBRARY_PATH /cm/shared/apps/slurm/current/sbin/slurmrestd | grep -A20 openapi | grep version
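As a cross-check, the plugin version also appears in the "type" field of the diag responses shown in the sample outputs. The sketch below pulls it out with sed; `api_version_from_response` is a hypothetical name, and the pattern assumes the field is formatted as in those samples.

```shell
# Hypothetical helper: read a slurmrestd JSON response on stdin and print the
# OpenAPI plugin version found in the "type" field, e.g. v0.0.37.
# The optional backslash handles the escaped form "openapi\/v0.0.37".
api_version_from_response() {
    sed -n 's|.*"type": "openapi\\\{0,1\}/\(v[0-9.]*\)".*|\1|p'
}

# Example with the field as it appears in the sample output
printf '%s\n' '    "type": "openapi\/v0.0.37",' | api_version_from_response
```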
Setting up JWT Authentication
Follow the instructions on the SchedMD site to configure the appropriate JWT keys for the site. The steps below configure a basic JWT key for a Bright-managed SLURM cluster.
# module load slurm
# echo $SLURM_CONF
# export JWT_KEY=`dirname $SLURM_CONF`/jwt.key
# echo $JWT_KEY # confirm this outputs a path on the system
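If `SLURM_CONF` happens to be empty (for example because the slurm module was not loaded), `dirname` prints `.` and the key path silently becomes `./jwt.key`. A guarded variant of the same derivation might look like the sketch below; `jwt_key_path` is a hypothetical name and the example path is a placeholder, not the real location on a Bright cluster.

```shell
# Hypothetical helper: print the jwt.key path next to a given slurm.conf,
# failing instead of silently producing "./jwt.key" when the argument is empty.
jwt_key_path() {
    if [ -z "$1" ]; then
        echo "SLURM_CONF is empty; run 'module load slurm' first" >&2
        return 1
    fi
    printf '%s/jwt.key\n' "$(dirname "$1")"
}

# Example with a placeholder path
jwt_key_path /path/to/etc/slurm/slurm.conf
```

With the slurm module loaded, this could be used as `export JWT_KEY=$(jwt_key_path "$SLURM_CONF")`.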
Next, we will generate a simple key and make the necessary changes to slurm.conf.
install -m 0600 -o slurm -g slurm <(dd if=/dev/random bs=32 count=1) $JWT_KEY
cat <<END >> $SLURM_CONF
# Added to support slurmrestd
AuthAltTypes=auth/jwt
AuthAltParameters=jwt_key=$JWT_KEY
# end slurmrestd
END
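Since `dd` with `count=1` writes whatever a single read returns, it can be worth confirming that the key file really holds 32 bytes and has the intended mode. The check below is a sketch under those assumptions; `check_jwt_key` is a hypothetical name, and ownership by the slurm user is not verified here.

```shell
# Hypothetical helper: confirm a key file exists, is exactly 32 bytes,
# and has mode 0600. Ownership (slurm:slurm) is not checked.
check_jwt_key() {
    key=$1
    if [ ! -f "$key" ]; then
        echo "missing key file: $key" >&2
        return 1
    fi
    size=$(( $(wc -c < "$key") ))
    if [ "$size" -ne 32 ]; then
        echo "unexpected key size: $size bytes" >&2
        return 1
    fi
    # find prints the path only when the permission bits are exactly 600
    if [ -z "$(find "$key" -prune -perm 600 -print)" ]; then
        echo "key mode is not 0600" >&2
        return 1
    fi
    echo "key looks OK: $key"
}
```

After generating the key, it could be called as `check_jwt_key "$JWT_KEY"`.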
Next, we have to create an appropriate service file to start slurmrestd. Newer versions of the slurmrestd package come with this service file so only add it if not already present.
systemctl cat slurmrestd.service || cat <<END > /usr/lib/systemd/system/slurmrestd.service
[Unit]
RequiresMountsFor=/cm/shared
Description=Slurm REST daemon
After=network-online.target munge.service slurmctld.service
Wants=network-online.target
[Service]
User=daemon
Type=simple
EnvironmentFile=-/etc/sysconfig/slurmrestd
Environment="SLURM_JWT=daemon"
ExecStart=/cm/shared/apps/slurm/current/sbin/slurmrestd \$SLURMRESTD_OPTIONS
ExecReload=/bin/kill -HUP \$MAINPID
[Install]
WantedBy=multi-user.target
END
cat <<END > /etc/sysconfig/slurmrestd
SLURMRESTD_OPTIONS=-u daemon -a rest_auth/jwt 0.0.0.0:6820
END
mkdir -p /etc/systemd/system/slurmrestd.service.d
cat <<END > /etc/systemd/system/slurmrestd.service.d/99-cmd.conf
[Service]
Environment=SLURM_CONF=$SLURM_CONF
END
systemctl daemon-reload
systemctl restart slurmctld
cmsh -x -q -c 'device use master; services; add slurmrestd; set monitored yes; set autostart yes; commit'
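The here-documents above use an unquoted `END` delimiter, so the shell expands `$VAR` while writing the file, whereas `\$VAR` is written literally for systemd to resolve at run time. The small demonstration below illustrates the difference; the function name and the path are placeholders chosen for this example only.

```shell
# Demonstration only: an unquoted here-document expands $conf immediately,
# while \$MAINPID is written out literally for systemd to expand later.
render_example() {
    conf=/path/to/slurm.conf   # placeholder value for this demo
    cat <<END
Environment=SLURM_CONF=$conf
ExecReload=/bin/kill -HUP \$MAINPID
END
}

render_example
```

Any variable that the generated file should still contain as a variable, such as `$MAINPID` or `$SLURMRESTD_OPTIONS`, therefore needs the backslash, while values that should be fixed at generation time, such as the slurm.conf path, do not.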
Testing
Now that slurmrestd is installed and operational, you can execute the following to confirm that it is operating normally.
NOTE: Your openapi version may be different than v0.0.37.
. <(scontrol token)
echo $SLURM_JWT
wget --header "X-SLURM-USER-TOKEN: $SLURM_JWT" --header "X-SLURM-USER-NAME: $USER" -q http://localhost:6820/slurm/v0.0.37/diag/ -O - | head -n 10
The output of the above should mirror the test from earlier.
{
  "meta": {
    "plugin": {
      "type": "openapi\/v0.0.37",
      "name": "Slurm OpenAPI v0.0.37"
    },
    "Slurm": {
      "version": {
        "major": 21,
        "micro": 8,
#
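The test relies on `scontrol token` printing a single line of the form `SLURM_JWT=<token>`, which is why its output can be sourced directly. In shells without process substitution, the token can be extracted explicitly instead. This is a sketch; `token_from_scontrol` is a hypothetical name and the token in the example is made up.

```shell
# Hypothetical helper: extract the token value from `scontrol token` output,
# which is a single line of the form SLURM_JWT=<token>.
token_from_scontrol() {
    sed -n 's/^SLURM_JWT=//p'
}

# Example with a made-up token string (not a real JWT)
printf 'SLURM_JWT=aaa.bbb.ccc\n' | token_from_scontrol
```

On a cluster this could be used as `SLURM_JWT=$(scontrol token | token_from_scontrol)` before running the wget command.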