Mirroring¶
Warning
Mirroring is not a replacement for backups. If files are accidentally deleted or overwritten by a user or process, mirroring won’t help you bring the old file back. You are still responsible for making regular backups of your important data.
This article is about using mirroring. For a general explanation, see Mirroring.
By default, mirroring is disabled for a new file system instance. Both metadata and storage mirroring can be enabled with the beegfs command line tool. The beegfs tool is provided by the beegfs-tools package and is usually run from a client node.
Before metadata or storage mirroring can be enabled, buddy groups need to be defined, as these are the basis for mirroring.
Management of Mirror Buddy Groups¶
When defining buddy groups, administrators should take the following into consideration:
Are the targets equal (or nearly equal) in size?
Are the targets on different servers or in different racks/failure domains for redundancy?
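The two questions above can be turned into a small pre-check before any group is created. The following is only a sketch: the function name check_buddy_pair, the failure-domain labels, and the 10% size tolerance are assumptions made for illustration and are not BeeGFS rules or tools.

```shell
#!/bin/sh
# Hypothetical helper: decide whether two targets are reasonable buddies.
# Arguments: size_a_gib domain_a size_b_gib domain_b
# The 10% size tolerance is an assumption for illustration, not a BeeGFS rule.
check_buddy_pair() {
    size_a=$1; domain_a=$2; size_b=$3; domain_b=$4

    # Targets in the same failure domain give no redundancy if that domain fails.
    if [ "$domain_a" = "$domain_b" ]; then
        echo "reject: same failure domain ($domain_a)"
        return 1
    fi

    # Order the two sizes so the comparison works in either direction.
    if [ "$size_a" -ge "$size_b" ]; then
        larger=$size_a; smaller=$size_b
    else
        larger=$size_b; smaller=$size_a
    fi

    # Integer arithmetic: the smaller target must be at least 90% of the larger.
    if [ $((smaller * 100)) -lt $((larger * 90)) ]; then
        echo "reject: sizes differ by more than 10% ($size_a vs $size_b GiB)"
        return 1
    fi

    echo "ok"
}

# Example call with made-up values.
check_buddy_pair 4000 rack1 4000 rack2
```

The sizes and domain labels have to come from your own inventory; the helper only encodes the two checklist items.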
Defining Buddy Groups¶
Warning
When creating a buddy group that contains the metadata node that owns the root directory, you must assign that target as the primary so that mirroring can be enabled on the root directory. It is also recommended to follow the steps to activate mirroring for the root inode immediately afterwards. Otherwise, if a switchover occurs before the root inode is mirrored, you may run into problems enabling mirroring.
Buddy groups are created using the beegfs mirror create command. To define a buddy group, first decide which targets will be the primary and secondary in the group, and what alias you want to use for the group. You can also optionally specify a numerical ID or allow one to be assigned automatically. These numerical IDs only need to be unique amongst the other buddy groups of a particular type (i.e., you could have a metadata target 1 and a metadata buddy group 1); however, it is recommended that they are unique to simplify troubleshooting.
For example, to create a metadata buddy group with ID 1, where the primary target is meta:1 and the secondary target is meta:2, with the alias “m1m2”, you would run:
beegfs mirror create --node-type=meta --num-id=1 --primary=meta:1 --secondary=meta:2 m1m2
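When several groups have to be defined, it can help to print the commands first and review them before running anything. The sketch below is a dry run that only prints text. It uses just the flags shown in the example above; the function name build_mirror_create, the pairing of targets (1,2) and (3,4), and the aliases derived from them are assumptions for illustration.

```shell
#!/bin/sh
# Print (do not run) a "beegfs mirror create" command for one buddy group.
# Arguments: node_type num_id primary secondary alias
build_mirror_create() {
    printf 'beegfs mirror create --node-type=%s --num-id=%s --primary=%s --secondary=%s %s\n' \
        "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical plan: pair metadata targets (1,2) and (3,4) as groups 1 and 2.
# Whether these pairs sit in different failure domains must be checked separately.
group_id=1
for first in 1 3; do
    second=$((first + 1))
    build_mirror_create meta "$group_id" "meta:$first" "meta:$second" "m${first}m${second}"
    group_id=$((group_id + 1))
done
```

The first printed line matches the example command above. Once the plan looks right, the printed lines can be run one by one.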
Note
In previous versions of BeeGFS it was possible to define buddy groups automatically. In practice this often led to mishaps where buddy groups were not defined optimally, because the system couldn’t take topology constraints into consideration, for example when using multi-mode with two BeeGFS services on the same physical server.
List defined Mirror Buddy Groups¶
Mirror buddy groups can be listed by running:
$ beegfs mirror list
$ beegfs mirror list --node-type=meta
$ beegfs mirror list --node-type=storage
Define Stripe Pattern¶
After defining storage buddy mirror groups in your system, you have to define a data stripe pattern that uses them. See Striping.
Caveats of Storage Mirroring¶
Storage buddy mirroring provides protection against many failure modes of a distributed system, such as failing drives, failing servers, and unstable or failing networks, among others. It does not provide perfect protection when a system is degraded; the reduced protection mostly affects only the degraded part of the system. If any storage buddy group is in a degraded state, another failure may cause data loss. Administrative actions can also cause data loss or corruption if the system is in an unstable or degraded state. Such actions should be avoided if at all possible, or made safe, for example by ensuring that no access to the system is possible while they are performed.
Setting states of active storage targets¶
When the state of a storage target is manually changed from GOOD to NEEDS_RESYNC, clients accessing files during the propagation period “see” different versions of the global state, which affects data and file locks. This happens because the state is not propagated synchronously to all clients. By default, propagation happens every 30 seconds, so the period will not last longer than a minute. The following sequence of events is therefore possible:
An administrator sets the state of an active storage target which is the secondary of a buddy group to NEEDS_RESYNC with beegfs mirror resync start <buddy-group>.
The state is propagated to the primary of the buddy group. The primary will no longer forward written data to the secondary.
A client writes data to a file residing on the buddy group. The data is not forwarded to the secondary.
A different client reads data from the file. If the client attempts to read from the primary, no data loss occurs. If the client attempts to read from the secondary, which is possible without problems in a stable system, the client will receive stale data.
If the two clients in this example used the file system to communicate, e.g., by calling flock for the file they share, the second client would not see the expected data. Accesses to the file will only stop considering the secondary as a source once all clients have received the updated state information, which may take up to 30 seconds.
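The sequence described above can be reproduced with a toy model. The sketch below involves no BeeGFS commands: the two variables stand in for the file contents on the primary and the secondary, and the forwarding flag is a simplification introduced for illustration.

```shell
#!/bin/sh
# Toy model of the stale-read sequence; nothing here talks to BeeGFS.
primary_data="v1"
secondary_data="v1"
forwarding=1    # the primary forwards writes while the secondary is GOOD

# A write always reaches the primary and reaches the secondary only
# while forwarding is still enabled.
write_file() {
    primary_data=$1
    if [ "$forwarding" -eq 1 ]; then
        secondary_data=$1
    fi
}

# The secondary is set to NEEDS_RESYNC and the primary stops forwarding.
forwarding=0

# Client A writes; only the primary receives the new data.
write_file "v2"

# Client B has not yet received the new state and may read from either target.
echo "read from primary:   $primary_data"
echo "read from secondary: $secondary_data"
```

In this model the read from the secondary still returns the old contents, which is the stale data a client with outdated state information could receive.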
Setting the state of a primary storage target may exhibit the same effects. Setting states for targets that are currently GOOD, and thereby triggering a switchover, must be avoided while clients are still able to access data on the target. Propagation of the switchover takes some time, during which clients may attempt to access data on the target that was set to non-GOOD. If the access was a write, that write may be lost.
Fsync may fail without setting targets to NEEDS_RESYNC¶
When fsync is configured to propagate to the storage servers and trigger an fsync there, an error during fsync may leave the system in an unpredictable state if the error occurred on the secondary of a buddy group. If the fsync operation failed on the secondary due to a disk error, the error may be detected only during the next operation of the secondary. If a failover happens before the error is detected, the automatic resync from the new primary (old secondary, which has failed) to the new secondary (old primary) may cause data loss.
Activating Metadata Mirroring¶
After defining metadata mirror buddy groups, you have to activate metadata mirroring. See Metadata Mirroring.
Enable Mirroring¶
Storage mirroring can be enabled on a per-directory basis, so that some data in the file system can be mirrored while other data might not be mirrored. On the metadata side, it is also possible to activate or deactivate mirroring per directory, but certain logical restrictions apply. For example, for a directory to be mirrored effectively, the whole path to it must also be mirrored.
Mirroring settings of a directory will be applied to new file entries and will be inherited by new subdirectories. For instance, if metadata mirroring is enabled for directory /mnt/beegfs/mydir1, then a new subdirectory /mnt/beegfs/mydir1/mydir2 will also automatically have metadata mirroring enabled.
After metadata mirroring is enabled for a file system using the beegfs mirror init command, the metadata of the root directory will be mirrored by default. Therefore, newly created directories under the root will also have metadata mirroring enabled. It is possible to exclude new folders from metadata mirroring by creating them using beegfs entry create directory --no-mirror <name>. For more information about metadata mirroring, please see Metadata Mirroring.
To enable file contents mirroring for a certain existing directory, see the built-in help of the beegfs tool (remember to define buddy groups first):
$ beegfs entry set --pattern=mirrored --help
File contents mirroring can subsequently be disabled using beegfs entry set --pattern=raid0.
Files that were created in the directory while mirroring was enabled will remain mirrored.
To check the metadata and file contents mirroring settings run:
$ beegfs entry info /mnt/beegfs/mydir/myfile
To check the state of all targets (metadata and storage) run:
$ beegfs target list --state
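The two checks above can be combined into one helper that is run after changing mirroring settings. This is a minimal sketch: the function name check_mirroring and the BEEGFS_CMD variable are hypothetical and exist only so the helper can be dry-run on a machine without BeeGFS. The beegfs subcommands themselves are the ones shown above.

```shell
#!/bin/sh
# BEEGFS_CMD is a hypothetical override so the sketch can be dry-run
# (for example with BEEGFS_CMD=echo) on a machine without BeeGFS.
BEEGFS_CMD=${BEEGFS_CMD:-beegfs}

# Show the mirroring settings of each given path, then the state of all targets.
check_mirroring() {
    for path in "$@"; do
        # Stop at the first path that cannot be queried.
        "$BEEGFS_CMD" entry info "$path" || return 1
    done
    "$BEEGFS_CMD" target list --state
}
```

A call such as check_mirroring /mnt/beegfs/mydir/myfile prints the entry information for the file followed by the target states. With BEEGFS_CMD=echo it only prints the commands' arguments.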
Restoring Metadata and Storage Target Data after Failures¶
If a storage target or metadata server is not reachable, it will be marked as offline and won’t receive data updates. Usually, when the target or server re-registers, it will automatically be synchronized from the remaining mirror in the buddy group (self-healing). However, in some cases you might need to start a synchronization process manually. For more information on how to do that and how to monitor synchronization, please see Resynchronization of mirrored targets.