Monitoring service

The beegfs-mon service collects statistics from the system and makes them available to the user through a time series database (InfluxDB). To visualize the data, beegfs-mon provides predefined Grafana panels that can be used out of the box, but any other tool can be used instead.

Installation

The service and the Grafana panels are contained in the optional beegfs-mon package. The package is available from the general BeeGFS repository.

Additionally, a working and reachable InfluxDB setup is required. Installing InfluxDB should be simple in most cases, since prebuilt packages are available for all distributions supported by BeeGFS. Installation instructions for InfluxDB version 1.x can be found at https://docs.influxdata.com/influxdb/v1.8/introduction/install/ and for version 2.x at https://docs.influxdata.com/influxdb/v2.0/install/.

InfluxDB can be installed on the same host as beegfs-mon, but an existing installation on another host works as well. Just make sure beegfs-mon can access it via HTTP.
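A quick way to verify reachability is to probe InfluxDB's HTTP /ping endpoint from the host that will run beegfs-mon. The following is a minimal sketch, assuming curl is installed; the helper names are hypothetical and not part of BeeGFS, and port 8086 is only the InfluxDB default.

```shell
#!/bin/sh
# Hypothetical helpers, not part of BeeGFS.

# Print the URL of InfluxDB's /ping endpoint for a host and an optional port.
# The port falls back to 8086, the InfluxDB default.
influx_ping_url() {
    printf 'http://%s:%s/ping\n' "$1" "${2:-8086}"
}

# Succeed if the endpoint answers within 5 seconds without an HTTP error status.
influx_reachable() {
    curl --silent --fail --max-time 5 --output /dev/null "$(influx_ping_url "$@")"
}
```

For example, `influx_reachable influxdb.example.com && echo reachable` (with a placeholder host name) prints "reachable" only if the request succeeds.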

If you want to use the prebuilt Grafana panels (or create your own), you also need Grafana. It does not need to run on the same host either; it only needs HTTP access to the InfluxDB instance. For installation instructions, please refer to the official website: http://docs.grafana.org/installation/

Configuration

Before running beegfs-mon, edit the configuration file /etc/beegfs/beegfs-mon.conf. If everything is installed on the same host, you only need to specify the management host (sysMgmtdHost). If InfluxDB runs on another host or you need a different database name, also adjust the corresponding entries (dbHostName, dbHostPort, dbDatabase). For InfluxDB version 2.x, change the value of dbType from influxdb to influxdb2 and set the bucket name with the dbBucket entry.
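The edits can also be scripted. The sketch below assumes the configuration file uses plain "key = value" lines; the helper set_conf_key, the host name, and the bucket name are hypothetical. It works on a scratch file, so nothing under /etc is modified.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a key/value configuration file.
# An existing assignment of the key is rewritten; otherwise a line is appended.
# Values containing "|" or "&" would need escaping for sed and are not handled.
set_conf_key() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    if grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file"; then
        sed "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp"
    else
        { cat "$file"; printf '%s = %s\n' "$key" "$value"; } > "$tmp"
    fi
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file" && rm -f "$tmp"
}

# Scratch file with a few entries, standing in for beegfs-mon.conf.
conf=$(mktemp)
printf 'sysMgmtdHost =\ndbType = influxdb\ndbHostName = localhost\n' > "$conf"

set_conf_key "$conf" sysMgmtdHost mgmtd01.example.com   # assumed host name
set_conf_key "$conf" dbType influxdb2                   # only for InfluxDB 2.x
set_conf_key "$conf" dbBucket beegfs_mon                # assumed bucket name
cat "$conf"
```

To apply the same changes to a real installation, point conf at /etc/beegfs/beegfs-mon.conf (after making a backup) instead of creating the scratch file.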

After editing the configuration, you can start the service with

$ systemctl start beegfs-mon

Grafana panels: Default installation

A set of Grafana panels for use with BeeGFS is provided by the beegfs-mon-grafana package. Once the package is installed, the panels can be imported using the script /opt/beegfs/scripts/grafana/import-dashboards. For the out-of-the-box setup with InfluxDB and Grafana on the same host, just use

$ cd /opt/beegfs/scripts/grafana
$ ./import-dashboards default

Grafana panels: Custom installation

In any other case, either provide the script with the URLs to InfluxDB and Grafana (call the script without arguments for usage instructions) or install the panels manually. The latter can be done from within Grafana's web interface:

First, the data source must be defined. In the main menu, click on Data Sources and then Add Data Source. Enter a name, and the hostname and port where your InfluxDB is running. Set the name of the database (default: beegfs_mon). Save.

To add the dashboards, select Dashboards/Import from the main menu. Choose one of the dashboard .json files located at /opt/beegfs/scripts/grafana/. Depending on your InfluxDB version, select either a file ending with influxdbv1.json (e.g. beegfs_overview_influxdbv1.json) for InfluxDB version 1 or a file ending with influxdbv2.json (e.g. beegfs_overview_influxdbv2.json) for InfluxDB version 2. Select the data source you created previously from the dropdown menu and click Import. Repeat for the rest of the panels.
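To see which files apply to your InfluxDB version before importing them, the file name suffix can be matched on the command line. This is a small sketch; dashboards_for_version is a hypothetical helper, not a script shipped with BeeGFS.

```shell
#!/bin/sh
# Hypothetical helper: list the dashboard files in a directory whose names
# end in influxdbv<version>.json, for InfluxDB major version 1 or 2.
dashboards_for_version() {
    dir=$1 version=$2
    case $version in
        1|2) ;;
        *) echo "unsupported InfluxDB version: $version" >&2; return 1 ;;
    esac
    for f in "$dir"/*influxdbv"$version".json; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$f" ] && printf '%s\n' "$f"
    done
    return 0
}
```

For example, `dashboards_for_version /opt/beegfs/scripts/grafana 2` lists the files intended for InfluxDB version 2.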

You can now click on Dashboards in the main menu and then on the button to the right of it. A list of the installed dashboards should appear, from which you can select the one you want to view. If your BeeGFS setup, the beegfs-mon service, and InfluxDB are already running and configured properly, you should already see some collected data.

For more documentation and help in using Grafana, please visit the official website http://docs.grafana.org.

Grafana Alerts setup

To use the alerting feature, run the alerting script provided in the beegfs-mon-grafana package. The script, located at /opt/beegfs/scripts/grafana/import-alerts, sets up preconfigured BeeGFS alerts, including an email template, a contact point, and notification policies. After running the script, replace the placeholder email address in the contact point configuration with your own address to receive alerts. By default, all alerts are paused; you can unpause them from the Grafana UI.

Steps to unpause alert evaluation

  1. In the left-side menu, click on “Alerting”

  2. Click on “BeeGFS-Alert” to see the list of existing alerts

  3. Identify the alert you wish to unpause, then click “Edit” (the pen icon)

  4. Scroll down to find the “Pause evaluation” option. Click the button to unpause the alert

  5. Save your changes and exit editing mode

By completing these steps, you’ve successfully unpaused the specified alert, allowing it to resume evaluation based on the configured conditions.

Steps to edit a contact point

  1. In the left-side menu, click “Alerting”

  2. Click “Contact points” to view a list of existing contact points

  3. Find the BeeGFS email contact point to edit, and then click “Edit” (the pen icon)

  4. Replace the placeholder in the “Addresses” field with your email address and click “Save contact point”

Note

For email alerts to work, the SMTP settings must be configured in the Grafana configuration file /etc/grafana/grafana.ini. Without them, Grafana cannot send email notifications when an alert fires.
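Grafana keeps these settings in the [smtp] section of grafana.ini, where the enabled key switches mail delivery on. The sketch below checks whether that key is set to true; smtp_enabled is a hypothetical helper, and it only inspects the file, so environment variable overrides are not taken into account.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the [smtp] section of the given ini file
# contains an uncommented "enabled = true" line.
smtp_enabled() {
    awk '
        # Track whether the current section header is [smtp].
        /^[ \t]*\[/ { in_smtp = ($0 ~ /^[ \t]*\[smtp\]/); next }
        # Lines commented out with ";" or "#" do not match this pattern.
        in_smtp && /^[ \t]*enabled[ \t]*=[ \t]*true/ { found = 1 }
        END { exit !found }
    ' "$1"
}
```

For example, `smtp_enabled /etc/grafana/grafana.ini || echo "SMTP is not enabled"` prints a hint when the setting is missing. The SMTP host, credentials, and sender address in the same section still need to match your mail server.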

Alert customization for system-specific requirements

The preconfigured alerts can be customized to match a specific system. The following elements can be modified:

  1. Pending period

    Setting a pending period helps suppress unnecessary alerts for short-term issues. The pending period is the length of time an alert rule may remain in breach of its condition before the alert fires.

  2. Alert condition

    An alert condition is the query or expression whose result determines whether the alert fires. Each alert rule has exactly one condition that determines whether it triggers.

For more details see the Grafana documentation on alert rules.

Steps to edit an alert rule

  1. In the left-side menu, click “Alerting”

  2. Click on “BeeGFS-Alert” to see the list of existing alerts

  3. Identify the alert you wish to edit, then click on “Edit” (the pen icon)

  4. Adjust the alert condition threshold or pending period as per your requirements

For more details see the Grafana documentation on queries and conditions.

Usage

You can connect to Grafana using your web browser. If you installed the predefined panels, you will find five of them: one for the BeeGFS overview, one for meta service statistics, one for storage, one for storage targets, and one for client operations. You can change the node shown using the drop-down menu in the upper left corner.

If you want to write your own Grafana panels or use other software to process the collected data, you can access the InfluxDB using one of its provided APIs. Please refer to the InfluxDB documentation for details. Here you find a reference of the used fields and tags in the database.
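As an illustration, the sketch below queries an InfluxDB 1.x instance through its HTTP /query endpoint using curl. The helper influx_query and the CURL override (which allows a dry run with echo) are hypothetical; the base URL is an assumption, and beegfs_mon is the default database name mentioned above. InfluxDB 2.x uses a different, token-authenticated API that is not covered here.

```shell
#!/bin/sh
# Hypothetical helper: send an InfluxQL statement to an InfluxDB 1.x
# instance and print the JSON response.
# Usage: influx_query <base-url> <database> <statement>
# Set CURL=echo to print the curl arguments instead of sending a request.
influx_query() {
    "${CURL:-curl}" --silent --get "$1/query" \
        --data-urlencode "db=$2" \
        --data-urlencode "q=$3"
}
```

For example, `influx_query http://localhost:8086 beegfs_mon 'SHOW MEASUREMENTS'` lists the measurements stored in the database, which is a convenient starting point for building your own queries.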

Apache Cassandra Support

beegfs-mon supports the use of an Apache Cassandra database as its database backend. Unless you already have a Cassandra installation you want to use, or have other reasons to specifically use Cassandra, we recommend using InfluxDB. It is more lightweight and easier to handle. Also, there are no Grafana panels available for Cassandra.

To use Cassandra, you need to install a third-party library: https://github.com/datastax/cpp-driver. For BeeGFS version 7.1, it has to be version 2.9. Make sure the dynamic library is located in the standard library path so that it can be loaded by the service. To load the library and use Cassandra, change the corresponding line in the mon configuration file from influxdb to cassandra. Cassandra uses slightly different configuration options, as you can see there, but you can achieve the same functionality as with InfluxDB. Please refer to the configuration file documentation for details.