Quick Start Guide
This section provides example commands for the manual installation procedure. For general information on manual setup, please have a look at Advanced Topics.
Example Setup
In this example, we use the following hardware and software configuration:
Software: RHEL 8 or similar on all nodes
Host Services: (see note on dedicated hosts below)
- node01: Management Server
- node02: Metadata Server
- node03: Storage Server
- node04: Client
Storage:
- Storage servers with RAID-6 data partition formatted with XFS mounted to /data. (See Storage Node Tuning)
- Metadata servers with RAID-1 data partition formatted with ext4 mounted to /data. (See Metadata Node Tuning)
Note
In this example, we are using dedicated hosts for all BeeGFS services. This is just to show the different installation steps for each service. BeeGFS allows running any combination of services (including client and storage/metadata service) on the same machine.
In particular, the management service is not performance-critical and thus typically does not run on a dedicated machine.
Note
Starting over with a fresh installation
In case you already had BeeGFS installed on one or more of your nodes (for testing, for example) and want to start over with a completely fresh system, you need to make sure that you use fresh data directories for all BeeGFS services.
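A quick way to confirm that a data directory is fresh is to check that it is either missing or empty before running the setup scripts. The following is only a minimal sketch: the function name is_fresh_dir is hypothetical and not part of BeeGFS, it assumes GNU find, and the path in the usage comment is the metadata directory used later in this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BeeGFS): succeeds if the path does not
# exist yet, or if it is a directory that contains no entries.
is_fresh_dir() {
    local dir="$1"
    # A path that does not exist at all counts as fresh.
    [ -e "$dir" ] || return 0
    # An existing non-directory (e.g. a regular file) is not a usable data directory.
    [ -d "$dir" ] || return 1
    # find stops at the first entry below $dir; empty output means the directory is empty.
    [ -z "$(find "$dir" -mindepth 1 -print -quit)" ]
}

# Example usage on the metadata node:
#   is_fresh_dir /data/beegfs/beegfs_meta || echo "directory is not empty" >&2
```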
Step 1: Package Download and Installation
First, the BeeGFS package repository file for your distribution needs to be downloaded to all nodes. Visit the BeeGFS download page and follow the step-by-step directions to add the BeeGFS package repositories to your Linux distribution’s package manager.
With the BeeGFS package repository configured we can install the packages from the repository:
$ ssh root@node01 yum install beegfs-mgmtd libbeegfs-license # management service; libbeegfs-license is only required to use enterprise features
$ ssh root@node02 yum install beegfs-meta libbeegfs-ib # metadata service; libbeegfs-ib is only required for RDMA
$ ssh root@node03 yum install beegfs-storage libbeegfs-ib # storage service; libbeegfs-ib is only required for RDMA
$ ssh root@node04 yum install beegfs-client beegfs-tools beegfs-utils # client and command-line tools and utilities
Note
The beegfs-tools package contains the beegfs command line tool, which can show statistics and perform administrative tasks. The beegfs-utils package contains the file system checker (beegfs-fsck).
To enable support for remote direct memory access (RDMA) based on the OFED ibverbs API, please install the additional libbeegfs-ib package. BeeGFS will automatically enable RDMA on startup if the corresponding hardware and drivers are present.
To use enterprise features such as storage pools or quotas, please install the libbeegfs-license package on the management node and download your BeeGFS license to /etc/beegfs/license.pem.
Now that all services are installed, the next step is to configure the automatic client module build.
Step 2: Client Kernel Module Autobuild
Note
This step is only relevant if you have RDMA-capable network hardware (InfiniBand, Omni-Path, RoCE). Otherwise, skip this step.
Unlike in earlier versions, RDMA support based on the OFED ibverbs API is available by default.
If you want to use a third-party driver, however, you have to specify it in the file /etc/beegfs/beegfs-client-autobuild.conf (for example, buildArgs=-j8 OFED_INCLUDE_PATH=/usr/src/openib/include) and rebuild the client:
$ ssh root@node04
$ vim /etc/beegfs/beegfs-client-autobuild.conf
Add the OFED_INCLUDE_PATH to the buildArgs parameter:
buildArgs=-j8 OFED_INCLUDE_PATH=/usr/src/openib/include
Note
Defining OFED_INCLUDE_PATH is only required if you installed separate kernel driver modules. If you are not sure, check the returned path information of this command:
$ modinfo ib_core
Separate OFED modules are usually located in the /lib/modules/<kernel_version> directory.
To build the client without RDMA support, define BEEGFS_NO_RDMA=1.
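Instead of editing the file interactively, the buildArgs line could also be updated with a small script. This is only a sketch under a few assumptions: the helper name set_build_args is hypothetical, it relies on GNU sed, and it expects the file to contain at most one uncommented buildArgs= line (one is appended if none exists). The include path in the usage comment is the example value from this step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the buildArgs line in a client autobuild config file.
# Usage: set_build_args <config-file> <build-args>
# Note: <build-args> must not contain '|' or '&', which are special to sed here.
set_build_args() {
    local conf="$1" args="$2"
    if grep -q '^buildArgs=' "$conf"; then
        # Replace the existing uncommented buildArgs line in place (GNU sed).
        sed -i "s|^buildArgs=.*|buildArgs=${args}|" "$conf"
    else
        # No active buildArgs line yet, so append one.
        printf 'buildArgs=%s\n' "$args" >> "$conf"
    fi
}

# Example (run as root on node04):
#   set_build_args /etc/beegfs/beegfs-client-autobuild.conf \
#       "-j8 OFED_INCLUDE_PATH=/usr/src/openib/include"
```

Commented-out lines are left untouched, so any documentation inside the configuration file is preserved.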
Step 3: Basic Configuration
Before we can run the services, we need to update a few basic settings.
Management Service
By default, the management service stores its data at /var/lib/beegfs/mgmtd.sqlite. Its main task is keeping track of file system configuration and state, including the list of nodes, targets, pools, mirrors, etc. Thus, it does not typically require a dedicated machine or much storage space, and its data access is not performance-critical. The exception is when a particularly large number of quotas is configured: in that case, more storage space and faster disks may be required, especially if quotas are configured to update very frequently.
To initialize the database for a new BeeGFS installation run:
$ ssh root@node01
$ /opt/beegfs/sbin/beegfs-mgmtd --init
The management service requires TLS to be configured (see Configuring TLS). For a test system, TLS could simply be disabled by setting tls-disable = true in /etc/beegfs/beegfs-mgmtd.toml, but this is discouraged for production. This service also requires connection-based authentication to be configured. Again, for a test system it could be disabled by setting auth-disable = true, but this is discouraged for production.
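Because both switches are meant for test systems only, it can be useful to scan a configuration file for them before going to production. The following is a minimal sketch, assuming the settings appear as uncommented tls-disable = true or auth-disable = true lines exactly as named above; the function name find_disabled_security is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check: print every security switch that is disabled in a
# beegfs-mgmtd.toml-style file. Returns 1 if any was found, 0 otherwise.
find_disabled_security() {
    local conf="$1" found=0 key
    for key in tls-disable auth-disable; do
        # Match "<key> = true" with optional whitespace; lines starting with '#' do not match.
        if grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*true([[:space:]]|#|\$)" "$conf"; then
            echo "$key is set to true in $conf"
            found=1
        fi
    done
    return "$found"
}

# Example (on node01):
#   find_disabled_security /etc/beegfs/beegfs-mgmtd.toml || echo "not production-ready" >&2
```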
To use enterprise features such as storage pools or quotas, please install the libbeegfs-license package on the management node and download your BeeGFS license to /etc/beegfs/license.pem.
Note
Please see the chapter Management Service for additional configuration options.
Metadata Service
The metadata service needs to know where it can store its data and where the management service is running. Typically, you will have multiple metadata services running on different machines.
Optionally, you can also define a custom numeric metadata service ID (range 1..65535). As this service is running on a server with name node02 in our example, we will also pick number 2 as metadata service ID here.
$ ssh root@node02
$ /opt/beegfs/sbin/beegfs-setup-meta -p /data/beegfs/beegfs_meta -s 2 -m node01
Note
The metadata service will store metadata as extended attributes (xattr) on the underlying file system for performance reasons. Xattrs have to be supported and enabled on the underlying file system. Please see the chapter Metadata Node Tuning on how to enable extended attributes for ext4 file systems.
Storage Service
The storage service needs to know where it can store its data and how to reach the management server. Typically, you will have multiple storage services running on different machines and/or multiple storage targets (e.g., multiple RAID volumes) per storage service.
Optionally, you can also define a custom numeric storage service ID and numeric storage target ID (both in range 1..65535). As this service is running on a server with name node03 in our example, we will pick number 3 as ID for this storage service and we will use 301 as storage target ID to show that this is the first target (01) of storage service 3.
$ ssh root@node03
$ /opt/beegfs/sbin/beegfs-setup-storage -p /mnt/myraid1/beegfs_storage -s 3 -i 301 -m node01
To add a second storage target on this same machine:
$ /opt/beegfs/sbin/beegfs-setup-storage -p /mnt/myraid2/beegfs_storage -s 3 -i 302
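The numbering convention used above (301 and 302 for the first and second target of storage service 3) can be written down as a small calculation. The sketch below is purely illustrative: the function name target_id is hypothetical, and the convention itself is just the one chosen in this example, not something BeeGFS enforces.

```shell
#!/usr/bin/env bash
# Hypothetical helper illustrating the numbering convention of this guide:
# target ID = service ID * 100 + index of the target on that service.
# Pass plain decimal numbers without leading zeros (e.g. 1, not 01).
target_id() {
    local service_id="$1" index="$2"
    local id=$(( service_id * 100 + index ))
    # IDs must stay within the range 1..65535 stated in this guide.
    if [ "$id" -lt 1 ] || [ "$id" -gt 65535 ]; then
        echo "target ID $id is outside 1..65535" >&2
        return 1
    fi
    echo "$id"
}

# Example:
#   target_id 3 1   # first target of storage service 3
#   target_id 3 2   # second target of storage service 3
```

Larger service IDs quickly exceed the allowed range with this scheme, which the function reports as an error instead of printing an ID.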
Client
The client needs to know where the management service is running.
$ ssh root@node04
$ /opt/beegfs/sbin/beegfs-setup-client -m node01
The client mount directory is defined in a separate configuration file. This file will be used by the beegfs-client service startup script. By default, BeeGFS will be mounted to /mnt/beegfs. Thus, you need to perform this step only if you want to mount the file system to a different location.
$ ssh root@node04
$ vim /etc/beegfs/beegfs-mounts.conf
The first entry defines the mount directory. The second entry refers to the corresponding configuration file for this mount point.
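Based on this description, a line in the file consists of a mount directory followed by a client configuration file; with the defaults mentioned in this guide, that would read /mnt/beegfs /etc/beegfs/beegfs-client.conf. The sketch below lists the configured mounts. It assumes whitespace-separated fields and treats lines starting with # as comments (an assumption, not something this guide confirms); the function name list_beegfs_mounts is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each mount directory and its client config file
# from a beegfs-mounts.conf-style file.
list_beegfs_mounts() {
    local conf="$1" mount_dir client_conf _rest
    # The "|| [ -n ... ]" part also handles a last line without trailing newline.
    while read -r mount_dir client_conf _rest || [ -n "$mount_dir" ]; do
        # Skip blank lines and (assumed) comment lines.
        case "$mount_dir" in
            ''|'#'*) continue ;;
        esac
        printf '%s -> %s\n' "$mount_dir" "$client_conf"
    done < "$conf"
}

# Example (on node04):
#   list_beegfs_mounts /etc/beegfs/beegfs-mounts.conf
```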
Connection authentication
We strongly recommend configuring connection authentication with a connAuthFile at this point. Please see Authentication for more information. If connection authentication is not needed, connDisableAuthentication has to be set to true on all nodes.
Step 4: Service Startup
BeeGFS services can be started in arbitrary order by using the corresponding systemctl service scripts. By default, all services log to the system journal (use journalctl -u <service> to filter logs for a particular service).
$ ssh root@node01 systemctl start beegfs-mgmtd
$ ssh root@node02 systemctl start beegfs-meta
$ ssh root@node03 systemctl start beegfs-storage
$ ssh root@node04 systemctl start beegfs-client
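The same four commands can be generated from the node-to-service mapping of the example setup, which also makes it easy to run a follow-up check such as systemctl is-active on every node. This is only a sketch: the variable and function names are hypothetical, and by default the helper merely prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Node-to-service mapping taken from the example setup in this guide.
BEEGFS_SERVICES="node01:beegfs-mgmtd node02:beegfs-meta node03:beegfs-storage node04:beegfs-client"

# Hypothetical helper: print (default) or run a systemctl action on every node.
# Set RUN=1 in the environment to actually execute the commands over ssh.
beegfs_systemctl_all() {
    local action="$1" pair node service
    for pair in $BEEGFS_SERVICES; do
        node="${pair%%:*}"      # text before the first ':'
        service="${pair#*:}"    # text after the first ':'
        if [ "${RUN:-0}" = "1" ]; then
            ssh "root@${node}" systemctl "$action" "$service"
        else
            echo "ssh root@${node} systemctl ${action} ${service}"
        fi
    done
}

# Example:
#   beegfs_systemctl_all start              # print the four start commands
#   RUN=1 beegfs_systemctl_all is-active    # query the state of each service
```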
Note
BeeGFS clients have a mount sanity check and cancel a mount operation if servers are unreachable.
If you want to mount even with unreachable servers, set sysMountSanityCheckMS=0 in the file /etc/beegfs/beegfs-client.conf.
Note
Some Linux distributions enable SELinux by default. If you are seeing an “Access denied” error when you access the BeeGFS mount, read ‘Access denied’ error on the client, even with correct permissions.
Congratulations, your parallel file system is now up and running!
Step 5: Check Connectivity
Note
This step is especially relevant if you have RDMA-capable network hardware (InfiniBand, Omni-Path, RoCE) to ensure that RDMA is used as the transport protocol. However, you can also check that BeeGFS is using the intended routes to the servers by using the commands below.
Set up your client node to interact with BeeGFS using the new beegfs tool. The new tool does not use a configuration file, but rather uses flags and/or environment variables for configuration. This means that if you want persistent configuration for the tool, it can be set in your ~/.bashrc file (or the equivalent for your shell). If you followed the recommendations in the quick start guide, no additional configuration is required and the tool will work out of the box. Below is common configuration to be aware of in case you deviated from the guide.
Since the client where the tool is running should already have BeeGFS mounted, the management address will be automatically determined. If BeeGFS was not mounted or you had multiple BeeGFS instances mounted, then you would need to specify which BeeGFS you wish to manage:
$ ssh node04
$ echo "export BEEGFS_MGMTD_ADDR='<BEEGFS-MGMTD-IP-OR-HOSTNAME>:8010'" >> ~/.bashrc
$ source ~/.bashrc
If you choose to use a self-signed TLS certificate and enable connection authentication, then no configuration is needed as long as the TLS certificate and secret are already at /etc/beegfs/cert.pem and /etc/beegfs/conn.auth on the machine where you are running the tool. If you choose to disable TLS and/or connection authentication, you will also need to specify those options:
$ ssh node04
$ echo "export BEEGFS_TLS_DISABLE='true'" >> ~/.bashrc
$ echo "export BEEGFS_AUTH_DISABLE='true'" >> ~/.bashrc
$ source ~/.bashrc
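Note that appending with echo and >> adds a new line every time the command is repeated. A small helper can make this idempotent. This is a minimal sketch; the function name append_once is hypothetical, and the variable names in the usage comment are the ones shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already present, so repeated runs do not duplicate exports.
append_once() {
    local line="$1" file="$2"
    # Make sure the file exists so grep does not fail on a missing file.
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: treat the line as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example (on node04):
#   append_once "export BEEGFS_TLS_DISABLE='true'" ~/.bashrc
#   append_once "export BEEGFS_AUTH_DISABLE='true'" ~/.bashrc
```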
Once you have the correct configuration in place, check the detected network interfaces and transport protocols with the following commands:
$ ssh node04
$ beegfs node list --with-nics
$ beegfs health net # Display connections the client is actually using.
$ beegfs health df # Display free space and inodes on metadata and storage targets.
$ beegfs health check # Check for common issues.
Note
Some commands such as health check require root privileges. By default, environment variables are not preserved when using sudo. For testing with a non-root user you could use sudo -E <command> to preserve the entire environment, but this is generally discouraged in production due to security risks. Instead, configure your /etc/sudoers file to only keep necessary and safe variables, for example: Defaults env_keep += "BEEGFS_MGMTD_ADDR" or Defaults env_keep += "BEEGFS_*" (for all BeeGFS namespaced environment variables). Alternatively, define these in shell-specific global configuration such as /etc/profile or /etc/bash.bashrc.
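On many distributions, such a Defaults line can also live in a drop-in file below /etc/sudoers.d instead of /etc/sudoers itself. The sketch below writes one; it assumes that your /etc/sudoers includes that directory, and both the function name write_env_keep and the file name in the usage comment are hypothetical. Validating the result with visudo -cf before relying on it is advisable.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a sudoers drop-in that preserves the given
# environment variable (or pattern). The target path is chosen by the caller.
write_env_keep() {
    local target="$1" pattern="$2"
    printf 'Defaults env_keep += "%s"\n' "$pattern" > "$target"
    # 0440 is the conventional mode for sudoers files.
    chmod 0440 "$target"
}

# Example (as root on node04):
#   write_env_keep /etc/sudoers.d/beegfs 'BEEGFS_MGMTD_ADDR'
#   visudo -cf /etc/sudoers.d/beegfs   # syntax check of the new file
```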
Please check the system journal (journalctl) if you do not see expected RDMA connections and verify that the libbeegfs-ib package is installed.
Make sure that the interfaces are listed in your order of preference (i.e., the primary interface should be listed first in the output of beegfs node list --with-nics) and that you only see interfaces that you want BeeGFS to use. (See Configure allowed network interfaces)
Note that BeeGFS clients establish connections only when needed and also drop idle connections, so you might not see metadata server connections in beegfs health net, for example, until you have performed a metadata operation such as ls on the client mount.