Backup

Backup Best Practices

  • Use incremental backups: keeping a history of multiple backups lets you recover from data corruption that went undetected for some time, as well as from user errors such as accidental deletion.

  • Store your backups on separate machines, or even off-site, to protect them against hardware failures and other catastrophes.

  • Test your restore procedure before your system goes into production to avoid extended downtime.

Minimal Backup of Important System Configuration

Warning

This is the BARE MINIMUM of what you need to save! This will NOT PREVENT DATA LOSS when servers crash!

If you do not have the spare disk space or infrastructure for a full system backup, you still need to back up the mgmtd data and the configuration files of all daemons. If you lose your mgmtd data, BeeGFS will no longer be able to find its root directory or buddy mirror associations. Fixing a missing mgmtd directory requires extensive and expensive manual work, with no guarantee of success.

The best approach is to set up cron jobs that automatically save the most relevant data with an incremental backup tool such as BorgBackup. The following instructions instead use tar, to give a simple overview of the data that needs to be saved.
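As a sketch of the cron-based approach, the following crontab entry and script use BorgBackup; the repository path, schedule, and retention policy are assumptions to adapt to your site:

```shell
# /etc/cron.d/beegfs-backup -- run the backup script nightly at 02:30
# 30 2 * * * root /usr/local/sbin/beegfs-backup.sh

# /usr/local/sbin/beegfs-backup.sh
#!/bin/bash
set -e

# Initialize the repository once beforehand with:
#   borg init --encryption=repokey /mnt/backup/beegfs-borg
REPO=/mnt/backup/beegfs-borg

# Incremental archive of the mgmtd store directory and all config files.
borg create --stats \
    "${REPO}::beegfs-{now:%Y-%m-%d}" \
    /mnt/beegfs-mgmtd-disk \
    /etc/beegfs

# Thin out old archives: keep 7 daily and 4 weekly backups.
borg prune --keep-daily=7 --keep-weekly=4 "$REPO"
```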

First find your mgmtd store directory:

# grep storeMgmtdDirectory /etc/beegfs/beegfs-mgmtd.conf
storeMgmtdDirectory = /mnt/beegfs-mgmtd-disk

# ls -l /mnt/beegfs-mgmtd-disk
total 28
-rw------- 1 root root  8 Jul  9 12:20 clients.nodes
-rw------- 1 root root 88 Jul  9 12:20 format.conf
-rw-r--r-- 1 root root  0 Jul  9 12:20 lock.pid
-rw------- 1 root root 13 Jul  9 12:20 meta.nodes
-rw------- 1 root root 64 Jul  9 12:20 nodeStates
-rw-r--r-- 1 root root 36 Jul  9 12:20 nodeUUID
-rw------- 1 root root 13 Jul  9 12:20 storage.nodes
-rw------- 1 root root 64 Jul  9 12:20 targetStates

Then save the mgmtd data directory:

# tar --force-local -cpzf mgmtd_storagedir_$(date +'%F_%T').tar.gz /mnt/beegfs-mgmtd-disk

Then save the configuration files on all your BeeGFS nodes:

# tar --force-local -cpzf beegfs_configs_$(date +'%F_%T')_$(hostname).tar.gz /etc/beegfs

Full System Backup

Creating consistent snapshots of a BeeGFS file system requires a shutdown of the whole system. If you back up a running system instead, the backup may contain inconsistencies, with no guarantee that they can be repaired. If you still choose to back up an online system, run a File System Check after restoring the backup to attempt a repair.
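The File System Check mentioned above is performed with the beegfs-fsck tool. Assuming a standard installation, the check is started from a client node roughly as follows (all services must be running and reachable):

```shell
# Check the file system for errors after restoring an online backup;
# beegfs-fsck will report problems and offer repairs.
beegfs-fsck --checkfs
```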

The following is a step-by-step guide to creating a full backup of an entire BeeGFS system.

System Shutdown

Cleanly shut down the system components in the following order:

  • Unmount all clients.

  • Shut down meta services.

  • Shut down storage services.

  • Shut down mgmtd service.

You may adapt and use the following script for this task:

#!/bin/bash

set -e
set -o pipefail

# Change to reflect your system.
BEEGFS_MGMTD_HOST=node0
BEEGFS_META_HOSTS=( node0 node1 node2 )
BEEGFS_STORAGE_HOSTS=( node1 node2 node3 node4 )
BEEGFS_CLIENT_HOSTS=( node0 client0 client1 client2 client3 )

for host in "${BEEGFS_CLIENT_HOSTS[@]}"; do
   ssh "root@${host}" systemctl stop beegfs-client
   # Abort if a BeeGFS mount is still present on the host. (A plain
   # "! ( ... )" check would not abort here, since "set -e" ignores
   # commands prefixed with "!".)
   if ssh "root@${host}" 'mount | grep -q beegfs'; then
      echo "BeeGFS mount on ${host} failed to unmount." >&2
      exit 1
   fi
done

for host in "${BEEGFS_META_HOSTS[@]}"; do
   ssh "root@${host}" systemctl stop beegfs-meta
done

for host in "${BEEGFS_STORAGE_HOSTS[@]}"; do
   ssh "root@${host}" systemctl stop beegfs-storage
done

ssh "root@${BEEGFS_MGMTD_HOST}" systemctl stop beegfs-mgmtd

Metadata Daemon Backup

The BeeGFS metadata server uses extended attributes to store its data on the underlying filesystem.

As extended attributes are typically not copied by default by many backup tools, the following sections show two ways to back up a BeeGFS metadata server. Keep in mind that these are just examples: other tools such as rsync also have options to preserve extended attributes and hardlinks and could likewise be used to back up BeeGFS metadata.

BeeGFS metadata uses hardlinks on the underlying file system. It is important that these are preserved when metadata is backed up and restored.

Using GNU tar

Backup of the data:

$ cd /path/to/beegfs/meta_parent_dir
$ tar czvf /path/to/archive/meta.tar.gz metadir/ --xattrs

Restoring the backup:

$ cd /path/to/beegfs/meta_parent_dir
$ tar xvf /path/to/archive/meta.tar.gz --xattrs

Old versions of GNU tar might not support extended attributes. In that case see the workaround in the next section.

Using GNU tar without support for extended attributes

Older GNU tar versions do not support extended attributes, so the metadata backup requires two steps, using tar and getfattr.

Backup of the data:

$ cd /path/to/beegfs/meta_parent_dir
$ tar czvf /path/to/archive/meta.tar.gz metadir/
$ getfattr -R -d -P metadir/ > /path/to/archive/ea.bak

Restoring the backup:

$ cd /path/to/beegfs/meta_parent_dir
$ tar xvf /path/to/archive/meta.tar.gz
$ setfattr --restore=/path/to/archive/ea.bak

Important Step: Check Metadata Backup

To be sure that extended attributes have been included in the backup, it is recommended to verify that the commands above work in your environment. This test can be done on a subdirectory (e.g. metadir/inodes/1D/4F/). After restoring, run getfattr -d metadir/inodes/1D/4F/some_file to check that the files still have their BeeGFS extended attributes.

The hardlink count of a metadata file can be checked with the stat tool (e.g. stat metadir/dentries/.../some_file), which shows a link count greater than one if the file has hardlinks.
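The hardlink part of this check can be rehearsed on a scratch directory before touching real metadata; the layout below is illustrative (add --xattrs to both tar commands, where supported, to cover the extended-attribute check as well):

```shell
# Rehearse the backup/restore roundtrip on a scratch copy.
workdir=$(mktemp -d)
mkdir -p "$workdir/metadir"
cd "$workdir"

# Create a file with a hardlink, as BeeGFS metadata does on disk.
touch metadir/some_file
ln metadir/some_file metadir/some_hardlink

# tar roundtrip into a separate restore directory.
tar cpf meta.tar metadir/
mkdir restored && tar xpf meta.tar -C restored/

# A hardlinked file must still show a link count greater than one.
stat -c '%h' restored/metadir/some_file    # prints 2
```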

Storage Daemon Backup

Backing up the storage server data is similar to the metadata server backup, with two key differences:

  • The storage server may have multiple storage targets. The first target of a daemon is used to store the node ID, so make sure your targets remain in the same order after backup and restore. The best way to ensure this is to back up your configuration files along with your data.

  • No extended attributes are used: the storage server stores only file contents, plus the mtime and atime timestamps. When quota is enabled, the chunk files are also owned by the BeeGFS user they belong to, so ownership must be preserved as well.

$ rsync -a /path/to/beegfs/storage/target/ /path/to/beegfs/backup/target/

or

# tar cvpf /mnt/backups/storagebackup_target0.tar /path/to/beegfs/storage/target

Restarting BeeGFS

Start the services in the reverse of the shutdown order:

  • Start mgmtd

  • Start storage

  • Start meta

  • Start clients
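Adapting the shutdown script above, the startup sequence can be sketched as a function; the host names are the same placeholders as before, and start_beegfs is called once the lists match your system:

```shell
#!/bin/bash
set -e

# Change to reflect your system.
BEEGFS_MGMTD_HOST=node0
BEEGFS_META_HOSTS=( node0 node1 node2 )
BEEGFS_STORAGE_HOSTS=( node1 node2 node3 node4 )
BEEGFS_CLIENT_HOSTS=( node0 client0 client1 client2 client3 )

start_beegfs() {
   # mgmtd first, so the other services can register when they come up.
   ssh "root@${BEEGFS_MGMTD_HOST}" systemctl start beegfs-mgmtd

   for host in "${BEEGFS_STORAGE_HOSTS[@]}"; do
      ssh "root@${host}" systemctl start beegfs-storage
   done

   for host in "${BEEGFS_META_HOSTS[@]}"; do
      ssh "root@${host}" systemctl start beegfs-meta
   done

   # Clients last, once all services are up; the client service
   # mounts BeeGFS on start.
   for host in "${BEEGFS_CLIENT_HOSTS[@]}"; do
      ssh "root@${host}" systemctl start beegfs-client
   done
}
```

Invoke start_beegfs after adjusting the host lists.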