How can I use Grafana to monitor multiple Bright clusters?

As of Bright 8.2-24 / 9.0-12 / 9.1-2, a Bright cluster can be added to Grafana as a data source, so that the Grafana interface can be used to visualize its monitoring information. To quickly verify that cmdaemon on your cluster is compatible, check that the build date of cmdaemon (as reported by rpm -qi cmdaemon) is after November 28th, 2020.
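The build date check can also be scripted. The following is a minimal sketch, assuming that "after November 28th" means on or after 2020-11-29 00:00 UTC; the function names and the cutoff constant are illustrative and not part of any Bright tooling. It reads the build time through rpm's standard BUILDTIME query tag.

```shell
# Epoch seconds for 2020-11-29 00:00:00 UTC (assumed cutoff: first moment after November 28th 2020).
CUTOFF_EPOCH=1606608000

# Succeeds (exit status 0) if the given build time, in epoch seconds, is at or past the cutoff.
build_is_new_enough() {
    [ "$1" -ge "$CUTOFF_EPOCH" ]
}

# Query the installed cmdaemon package and report whether its build looks recent enough.
check_cmdaemon() {
    build_epoch=$(rpm -q --queryformat '%{BUILDTIME}\n' cmdaemon) || return 2
    if build_is_new_enough "$build_epoch"; then
        echo "cmdaemon build is recent enough"
    else
        echo "cmdaemon build is too old"
        return 1
    fi
}
```

Running check_cmdaemon on the head node prints one of the two messages. The version numbers listed above remain the authoritative requirement; the build date is only a quick indicator.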

An open-source tool called bright-grafana is available to make it as easy as possible to create a setup that lets you pull and visualize monitoring data from multiple Bright clusters. Bright-grafana is available from GitHub.

Setting up Grafana

Bright-grafana should be deployed on a stand-alone server running Grafana. This server could be a single Bright head node, but it does not need to be. If you don’t have Grafana installed yet, you can use the install-grafana.sh script:

git clone https://github.com/Bright-Computing/bright-grafana.git
cd bright-grafana
./install-grafana.sh
systemctl enable grafana-server.service
systemctl start grafana-server.service

At this point you should have a working Grafana deployment on port 3000 of the server on which you installed it. You can verify this by pointing your web browser to http://yourserver:3000. If this does not work, please check the firewall / packet filtering settings. The default login for Grafana is admin with password admin.
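If no browser is at hand, the same check can be done from the command line. The sketch below assumes that curl is installed and uses Grafana's /api/health endpoint; the function names are illustrative, and yourserver is the same placeholder as above.

```shell
# Build the health-check URL for a Grafana server (port defaults to 3000).
grafana_health_url() {
    printf 'http://%s:%s/api/health\n' "$1" "${2:-3000}"
}

# Ask Grafana for its health status; fails if the server is unreachable
# or does not answer within 5 seconds.
check_grafana() {
    curl --silent --fail --max-time 5 "$(grafana_health_url "$@")"
}
```

For example, check_grafana yourserver prints a short JSON status document when Grafana is up. A timeout or a refused connection usually points at the firewall / packet filtering settings mentioned above.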

Adding clusters to be monitored

The bright_grafana tool can be used to add clusters that are to be monitored. Root access to the cluster is required at the time that it is being added. The bright_grafana tool will take care of:

  • Adding the cluster to the local cache clusters.json
  • Copying (via rsync) the cluster's pythoncm version to the local directory
  • Creating a Grafana profile with the minimal required tokens for running queries
  • Creating a certificate with the Grafana profile
  • Configuring a Grafana datasource to the cluster in /etc/grafana/provisioning/datasources
  • Copying or updating the default dashboards to /var/lib/grafana/dashboards/bright
  • Testing the configuration by doing a basic query

Assuming that the bright-grafana repository has been cloned already and that a Python 3.7 interpreter is available, the following command will add a cluster that is to be monitored:

module load python37
cd bright-grafana
./bright_grafana.py -u root -p yourpassword -H mycluster.mydomain.com -a
systemctl restart grafana-server.service

For more information about the bright_grafana tool, you may use the --help option.
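Since the goal is to monitor several clusters, the add command can be wrapped in a loop. This is a minimal sketch that reuses only the options shown above (-u, -p, -H, -a); the function name, the BRIGHT_GRAFANA variable and the host names in the usage note are placeholders, and it assumes that all clusters share the same root password.

```shell
# Path to the tool; override BRIGHT_GRAFANA if it lives elsewhere.
BRIGHT_GRAFANA=${BRIGHT_GRAFANA:-./bright_grafana.py}

# Add every cluster given on the command line, stopping at the first failure.
# Usage: add_clusters PASSWORD HOST [HOST...]
add_clusters() {
    password=$1
    shift
    for host in "$@"; do
        "$BRIGHT_GRAFANA" -u root -p "$password" -H "$host" -a || return 1
    done
}
```

For example, add_clusters yourpassword cluster1.mydomain.com cluster2.mydomain.com adds two clusters, after which a single systemctl restart grafana-server.service is enough.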

You will now find in Grafana all of the dashboards that are defined in the dashboards directory. Each dashboard now has a data source selector that lets you select the cluster from which monitoring data should be pulled.
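If the dashboards or the data source do not show up, it can help to look at what was written to the provisioning locations listed earlier. The sketch below only lists those two directories; the function and variable names are illustrative, and the names of the files that the tool creates are not assumed.

```shell
# Locations named in this article; override them if your Grafana layout differs.
DATASOURCE_DIR=${DATASOURCE_DIR:-/etc/grafana/provisioning/datasources}
DASHBOARD_DIR=${DASHBOARD_DIR:-/var/lib/grafana/dashboards/bright}

# List the contents of both directories, or report a directory that is missing.
show_provisioned() {
    for dir in "$DATASOURCE_DIR" "$DASHBOARD_DIR"; do
        if [ -d "$dir" ]; then
            echo "== $dir"
            ls -1 "$dir"
        else
            echo "missing: $dir"
        fi
    done
}
```

Calling show_provisioned after adding a cluster should show at least one data source entry and the default dashboards.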

Adding a new dashboard

The easiest way to define a new dashboard is to create the desired dashboard for a single cluster, export it to JSON, and store it in the dashboards folder. The bright_grafana tool can then add the data source selector, which lets you select any of the clusters that you have defined.

After making changes to the dashboards in the dashboards directory, it is necessary to re-run the bright_grafana tool as follows and restart Grafana:

./bright_grafana.py -b
systemctl restart grafana-server.service
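A hand-edited or freshly exported dashboard file with a syntax error is easy to miss, so it can be worth checking the JSON before re-running the tool. This is a minimal sketch using Python's standard json.tool module; the function name is illustrative, and it assumes that python3 is on the PATH and that the dashboards are stored as *.json files.

```shell
# Check that every *.json file in a directory (default: dashboards) parses as JSON.
# Returns non-zero if at least one file is invalid.
validate_dashboards() {
    dir=${1:-dashboards}
    status=0
    for f in "$dir"/*.json; do
        [ -e "$f" ] || continue   # no matching files: the glob stays unexpanded
        if python3 -m json.tool "$f" >/dev/null 2>&1; then
            echo "ok: $f"
        else
            echo "invalid JSON: $f"
            status=1
        fi
    done
    return "$status"
}
```

Running validate_dashboards from inside the bright-grafana directory before ./bright_grafana.py -b only catches syntax errors; it does not check that a file is a valid Grafana dashboard.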

Entity series supported by CMDaemon

The following series are supported:

  • hostname (all devices)
  • node (hostnames for devices of type node)
  • category
  • wlm (workload management system instance)
  • job_id

Updated on January 12, 2021
