  • Major Operating System (OS) upgrades on a BCM cluster

    Preamble: There are several paths available for upgrading the operating system of a BCM cluster. This article breaks the upgrade process down into separate sections, which may be applied to your cluster environment. It is worth noting that in Bright 9.0 and later releases, Multi-OS and…

  • Kubernetes: Limit GPU resource usage for a namespace

    About: This article contains Kubernetes resource quota examples for limiting GPU usage per namespace. For further information on setting resource quotas against namespaces in Kubernetes, see: https://kubernetes.io/docs/concepts/policy/resource-quotas/#viewing-and-setting-quotas Applying a resource quota: In this example, we have the user “john” who has a Kubernetes namespace called “john-restricted”. In order to define limits…

  • Bright Computing Ansible galaxy module examples

    About: This article contains Ansible code examples for implementing the Bright Computing Galaxy module. https://galaxy.ansible.com/brightcomputing Requirements: These instructions have been tested on Bright 9.2 (9.2-13) and later releases. A functional Ansible environment (Python 3.9, Ansible 2.10+) is required. Set up the Ansible environment: There are a number of approaches to installing Ansible. This approach…

  • How do I add the Dell OpenManage tools to a Bright 9.2 Ubuntu 22.04 headnode?

    Installation instructions are available in the Dell OpenManage repositories. Dell OpenManage Repository The following example will install OpenManage version 11 on an Ubuntu 22.04 headnode. apt-get install gpg libssl-dev echo 'deb http://linux.dell.com/repo/community/openmanage/11000/jammy jammy main' | sudo tee -a /etc/apt/sources.list.d/linux.dell.com.sources.list apt-get update sudo wget https://linux.dell.com/repo/pgp_pubkeys/0x1285491434D8786F.asc sudo apt-key add 0x1285491434D8786F.asc apt update…

  • Configuring Nvidia Bright Cluster Manager as an Infiniband-only cluster

    Caveats: This information is valid at the time of writing (9.2-6 release). Currently, the installation of a cluster requires an Ethernet interface, configured as “internalnet”. This is needed to generate a valid license file and associated certificates. Selecting InfiniBand-only interfaces in the Bright Installer at the “Head node interfaces” step…

  • Deploying NICE DCV on a Bright cluster

    This article discusses deploying a NICE DCV server on Bright managed compute nodes. We recommend reviewing the excellent third-party NICE DCV upstream documentation, which is available here: NICE DCV (amazon.com) Before we start… This article doesn’t seek to replace the upstream documentation, but rather details the integration with Bright Cluster Manager. This…

  • Enabling Kdump (RHEL/CentOS)

    You can use the instructions below to configure kdump on a Bright managed cluster. These instructions should work on RHEL/CentOS 7 and 8. 1. Install required packages Install kexec-tools in the software image: # yum install kexec-tools --installroot=/cm/images/<image-name> 2. Modify the software image Configure the software image to allow crashkernel…

  • Troubleshooting provisioning issues on a system with SuperMicro BMC

    Please note that this article is not a substitute for contacting your vendor for guidance on hardware issues. Here are some steps that may assist in resolving provisioning issues with SuperMicro BMCs. Is the system running the latest BMC firmware? Check the vendor website. Have you attempted a BMC reset? “ipmitool…