This article was tested on DGX OS 6.2 BCM10
Before listing the instructions that has to be followed to create UDEV rules for consistent NVME disk renaming, we would like to list a couple of facts:
– Creating an XML disk setup with consistent NVME disk renaming can be done without creating udev rules but just by using the PCI ID addresses directly in the disklayout. If you have a valid and pressing need to use this method, then please contact BCM support for more detailed instructions.
– The reason why we prefer the udev rules is because it hides the ugly details of using the PCI ID addresses and it makes the disk setup more readable.
1. Edit /cm/images/<IMAGENAME>/usr/lib/udev/rules.d/60-persistent-storage-<DGXTYPE>.rules and add the following lines:
a. For DGX H100
########## persistent nvme rules by HW address ########## KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:01:00.0", SYMLINK+="disk/by-id/osdisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:02:00.0", SYMLINK+="disk/by-id/osdisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ab:00.0", SYMLINK+="disk/by-id/raiddisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ac:00.0", SYMLINK+="disk/by-id/raiddisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ad:00.0", SYMLINK+="disk/by-id/raiddisk-3" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ae:00.0", SYMLINK+="disk/by-id/raiddisk-4" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2a:00.0", SYMLINK+="disk/by-id/raiddisk-5" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2b:00.0", SYMLINK+="disk/by-id/raiddisk-6" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2c:00.0", SYMLINK+="disk/by-id/raiddisk-7" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2d:00.0", SYMLINK+="disk/by-id/raiddisk-8" ########## persistent nvme rules by HW address ##########
b. For DGX A100
########## persistent nvme rules by HW address ########## KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:22:00.0", SYMLINK+="disk/by-id/osdisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:23:00.0", SYMLINK+="disk/by-id/osdisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:08:00.0", SYMLINK+="disk/by-id/raiddisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:09:00.0", SYMLINK+="disk/by-id/raiddisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:52:00.0", SYMLINK+="disk/by-id/raiddisk-3" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:53:00.0", SYMLINK+="disk/by-id/raiddisk-4" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:89:00.0", SYMLINK+="disk/by-id/raiddisk-5" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:8a:00.0", SYMLINK+="disk/by-id/raiddisk-6" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ca:00.0", SYMLINK+="disk/by-id/raiddisk-7" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:cb:00.0", SYMLINK+="disk/by-id/raiddisk-8" ########## persistent nvme rules by HW address ##########
c. For DGX B200
########## persistent nvme rules by HW address ########## KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:01:00.0", SYMLINK+="disk/by-id/osdisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:02:00.0", SYMLINK+="disk/by-id/osdisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ab:00.0", SYMLINK+="disk/by-id/raiddisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ac:00.0", SYMLINK+="disk/by-id/raiddisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ad:00.0", SYMLINK+="disk/by-id/raiddisk-3" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:ae:00.0", SYMLINK+="disk/by-id/raiddisk-4" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2a:00.0", SYMLINK+="disk/by-id/raiddisk-5" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2b:00.0", SYMLINK+="disk/by-id/raiddisk-6" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2c:00.0", SYMLINK+="disk/by-id/raiddisk-7" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0000:2d:00.0", SYMLINK+="disk/by-id/raiddisk-8" ########## persistent nvme rules by HW address ##########
d. For DGX GB200
########## persistent nvme rules by HW address ########## KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0015:01:00.0", SYMLINK+="disk/by-id/osdisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:07:00.0", SYMLINK+="disk/by-id/raiddisk-1" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0006:09:00.0", SYMLINK+="disk/by-id/raiddisk-2" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:07:00.0", SYMLINK+="disk/by-id/raiddisk-3" KERNEL=="nvme[0-9]n[0-9]", ATTRS{address}=="0016:09:00.0", SYMLINK+="disk/by-id/raiddisk-4" ########## persistent nvme rules by HW address ##########
Note: Please replace <DGXTYPE> with the appropriate DGX type from the above types; i.e. a100, h100, b200, gb200.
2. Edit /cm/node-installer/usr/lib/udev/rules.d/60-persistent-storage-<DGXTYPE>.rules and add the same lines as above
3. append nvme_core.multipath=n to kernel parameter to disable NVMe multipath capability
cmsh softwareimage use <IMAGENAME> append kernelparameters " nvme_core.multipath=n" commit
4a. For A100, B200, H100: Create the following disksetup using the persistent names created by the UDEV rules
cmsh category use <categoryname> set disksetup ### the disksetup will open in the default editor; copy/paste the following XML; save changes and exit commit <?xml version="1.0" encoding="UTF-8"?> <diskSetup> <device> <blockdev>/dev/disk/by-id/osdisk-1</blockdev> <partition id="boot1" partitiontype="esp"> <size>512M</size> <type>linux</type> <filesystem>fat</filesystem> <mountPoint>/boot/efi</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </partition> <partition id="slash1"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/osdisk-2</blockdev> <partition id="boot2" partitiontype="esp"> <size>512M</size> <type>linux</type> <filesystem>fat</filesystem> <mountOptions>defaults,noatime,nodiratime</mountOptions> </partition> <partition id="slash2"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-1</blockdev> <partition id="raid1" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-2</blockdev> <partition id="raid2" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-3</blockdev> <partition id="raid3" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-4</blockdev> <partition id="raid4" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-5</blockdev> <partition id="raid5" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-6</blockdev> <partition id="raid6" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-7</blockdev> <partition id="raid7" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-8</blockdev> <partition id="raid8" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <raid id="slash"> <member>slash1</member> <member>slash2</member> <level>1</level> <filesystem>ext4</filesystem> <mountPoint>/</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </raid> <raid id="raid"> <member>raid1</member> <member>raid2</member> <member>raid3</member> <member>raid4</member> <member>raid5</member> <member>raid6</member> <member>raid7</member> <member>raid8</member> <level>0</level> <filesystem>ext4</filesystem> <mountPoint>/raid</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </raid> </diskSetup>
4b. For GB200: Create the following disksetup using persistent names created by the UDEV rules
cmsh category use <categoryname> set disksetup ### the disksetup will open in the default editor; copy/paste the following XML; save changes and exit commit <?xml version="1.0" encoding="UTF-8"?> <diskSetup> <device> <blockdev>/dev/disk/by-id/osdisk-1</blockdev> <partition id="efi" partitiontype="esp"> <size>512M</size> <type>linux</type> <filesystem>fat</filesystem> <mountPoint>/boot/efi</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </partition> <partition id="boot1"> <size>4G</size> <type>linux</type> <filesystem>ext2</filesystem> <mountPoint>/boot</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </partition> <partition id="slash1"> <size>max</size> <type>linux</type> <filesystem>ext4</filesystem> <mountPoint>/</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-1</blockdev> <partition id="raid1" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-2</blockdev> <partition id="raid2" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-3</blockdev> <partition id="raid3" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <device> <blockdev>/dev/disk/by-id/raiddisk-4</blockdev> <partition id="raid4" partitiontype="esp"> <size>max</size> <type>linux raid</type> </partition> </device> <raid id="raid"> <member>raid1</member> <member>raid2</member> <member>raid3</member> <member>raid4</member> <level>0</level> <filesystem>ext4</filesystem> <mountPoint>/raid</mountPoint> <mountOptions>defaults,noatime,nodiratime</mountOptions> </raid> </diskSetup>
5. Add a finalize script to replace the /dev/nvmeXnY of /boot/efi in /etc/fstab with the UUID
cmsh category use <categoryname> set finalizescript ### the finalizescript will open in the default editor; copy/paste the following script; save changes and exit commit #!/bin/bash sed -i "s/.*\/boot\/efi.*/UUID=\"$(blkid -l -t PARTLABEL=/boot/efi -s UUID -o value)\" \/boot\/efi vfat defaults,noatime,nodiratime 0 2/" /localdisk/etc/fstab
Create a service to regenerate the contents of /boot/efi/EFI
cm-chroot-sw-img /cm/images/<IMAGENAME> cat > /etc/systemd/system/nvsm-bootefi.service << EOF [Unit] Description= Fix /boot/efi partition Before=nvsm.service nvsm-notifier.service [Service] Type=oneshot RemainAfterExit=yes ExecStartPre=/usr/sbin/grub-install --no-nvram ExecStart=cp -r --preserve /boot/efi/EFI/dgx /boot/efi/EFI/ubuntu [Install] WantedBy=multi-user.target EOF systemctl enable nvsm-bootefi.service
IMPORTANT NOTE:
On some systems it is necessary to disable NVMe/Harddisk boot option from the BIOS menu to avoid falling back to booting from local harddisk when GRUB is installed on the MBR via the grub-install command.