Data Management API
Overview
The BeeGFS Data Management API provides integration with external data management solutions that need to monitor and react to file system activity and/or control the state of file system entries. A common use case is hierarchical storage management (HSM) solutions that automatically move data between high-cost and low-cost storage media, for example between SSD-backed BeeGFS storage and external tape storage.
There are three main components:
A system for applications to subscribe to a stream of Filesystem Modification Events from each BeeGFS metadata service using gRPC.
File access flags that block different types of file access. These access checks can only be bypassed by BeeGFS clients configured with the sysBypassFileAccessCheckOnMeta parameter. These access flags and the resulting locking behavior they impose are unique to BeeGFS and distinct from Linux permissions or advisory file locks via flock. They are intended to provide stricter, more consistent locking guarantees than otherwise possible in POSIX.
File data states used to change client behavior when operations are blocked by access flags.
Note
Prior to BeeGFS 8.3, file data states did not modify access flag behavior and open always returned
EWOULDBLOCK immediately. Clients and servers must be upgraded to 8.3+ to leverage the new
behavior. Mixed versions are still supported, but always fall back to the pre-8.3 behavior.
Access flags and data states are part of a file’s persistent metadata and visible/enforced on all
clients once set. They are also preserved across metadata operations like rename, the same as
other file metadata like user or group IDs. Currently, file access flags and data states can be set,
modified, and inspected using the BeeGFS CTL command-line tool or the Golang library, and can also be set using an ioctl (8.3+).
See below for more details.
Example Use Case
An HSM solution can subscribe to BeeGFS File System Modification events to receive updates about which files are being created, accessed, modified, renamed, and removed from BeeGFS. It can use these events to keep an out-of-tree index updated, then apply policies to move file contents between storage tiers based on user criteria such as last file access or modification timestamps.
Internal BeeGFS file access flags can be set to take a read and/or write lock on a file’s contents
and prevent regular client access while a sync is in progress. If the HSM chooses to leave behind an
empty “stub file”, files can be left locked to indefinitely block client access. Whenever a client
attempts to access a locked file, an OPEN_BLOCKED event is triggered, which HSM software can use
to trigger a restore of the file’s contents.
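The restore trigger described above can be sketched as a small filter over incoming events. This is a minimal illustration only: the `Event` record and the `plan_restores` helper are hypothetical stand-ins, not the real gRPC message or any BeeGFS API, and the only event name taken from the text is OPEN_BLOCKED.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified event record (not the real gRPC message)."""
    kind: str  # "OPEN_BLOCKED" is from the text; other kinds are illustrative
    path: str


def plan_restores(events, stubbed_paths):
    """Return paths to restore, in first-seen order, without duplicates.

    Only OPEN_BLOCKED events for files the HSM has offloaded (stubbed)
    lead to a restore; every other event is ignored by this sketch.
    """
    planned = []
    seen = set()
    for event in events:
        if event.kind != "OPEN_BLOCKED":
            continue
        if event.path not in stubbed_paths or event.path in seen:
            continue
        seen.add(event.path)
        planned.append(event.path)
    return planned
```

Deduplicating by path matters because several clients may attempt to open the same locked file while a restore is already pending.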
File Access Flags and Data States
Setting Access Flags
Warning
Generally, access flags should be managed only by a single data management application responsible for managing file contents in BeeGFS. Manually modifying access flags set by an external data management application using BeeGFS CTL may lead to unintended behavior.
One or both of the following access flags can be set on a file to block access from regular clients:
Read Lock: Block open(2) with O_RDONLY or O_RDWR.
Write Lock: Block open(2) with O_WRONLY or O_RDWR (with or without O_TRUNC).
The O_APPEND flag can be combined with other flags, but on its own it does not affect whether
read/write access is allowed through that file descriptor, so it is ignored by the access checks.
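The mapping from open(2) access modes to the two locks can be modeled as follows. This is a sketch of the rules as stated above, not the metadata service's implementation; the `READ_LOCK` and `WRITE_LOCK` bit values and the function name are assumptions chosen for illustration.

```python
import os

# Illustrative bit values only; not the BeeGFS on-disk or wire encoding.
READ_LOCK = 0x1
WRITE_LOCK = 0x2

# Mask that isolates the access mode from the other open(2) flags.
ACCESS_MODE_MASK = os.O_RDONLY | os.O_WRONLY | os.O_RDWR


def is_open_blocked(open_flags, access_flags):
    """Return True if an open(2) with open_flags would be blocked.

    Only the access mode is considered, so modifiers such as O_APPEND
    or O_TRUNC do not change the result.
    """
    mode = open_flags & ACCESS_MODE_MASK
    wants_read = mode in (os.O_RDONLY, os.O_RDWR)
    wants_write = mode in (os.O_WRONLY, os.O_RDWR)
    blocked_read = wants_read and bool(access_flags & READ_LOCK)
    blocked_write = wants_write and bool(access_flags & WRITE_LOCK)
    return blocked_read or blocked_write
```

For example, `is_open_blocked(os.O_WRONLY | os.O_APPEND, READ_LOCK)` is `False` because a write-only open does not conflict with a read lock.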
When an open(2) is blocked, EWOULDBLOCK (i.e., resource temporarily unavailable) is returned.
If file system modification events are enabled, an OPEN_BLOCKED event is triggered for the file.
Note
This deviates slightly from POSIX: technically, open(2) without O_NONBLOCK should block.
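Because a blocked open surfaces as EWOULDBLOCK, an application that wants to tolerate short lock windows may choose to retry on its own. The following is a minimal application-side sketch, not part of BeeGFS; the function name, the default attempt count and delay, and the injectable `opener` and `sleep` parameters are all assumptions made for illustration and testing.

```python
import errno
import os
import time


def open_with_retry(path, flags, attempts=5, delay_s=1.0,
                    opener=os.open, sleep=time.sleep):
    """Open path, retrying while the open fails with EWOULDBLOCK.

    Any other error, or EWOULDBLOCK on the final attempt, is re-raised.
    Returns whatever opener returns (a file descriptor for os.open).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return opener(path, flags)
        except OSError as exc:
            last_attempt = attempt == attempts - 1
            if exc.errno != errno.EWOULDBLOCK or last_attempt:
                raise
            sleep(delay_s)
```

Retrying in the application is a design choice; for files that will stay locked indefinitely it only delays the error.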
Other metadata operations such as unlink, rename, hardlink, setattr, getxattr,
setxattr are not affected by the read/write locks. Additional locks may be added in the future
for these operations.
The metadata service enforces a few rules when transitioning between access flags:
Acquiring stricter locks (adding flags): not allowed if conflicting read/write sessions exist.
For example, if any client has a file open with the O_RDWR flag, attempting to take a write lock would fail with the error IN_USE.
Relaxing locks (removing flags): not allowed if dependent read/write sessions exist.
Relaxing a lock is only allowed when no conflicting open file descriptors exist. This is intended to protect against races/conflicts where there is an open file descriptor relying on the lock for protected access. This protection only applies to file descriptors opened with access modes that match those protected by the lock. For example, a write lock may be removed if the file is only opened read-only, but not if there are any open descriptors with write access that may be relying on the lock for protected access.
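One way to read the two transition rules is that a lock bit may only change, in either direction, while no open session uses the matching access mode. The sketch below models that reading; it is an interpretation of the text, not the metadata service's code, and the bit values, function name, and session counters are hypothetical. A descriptor opened with O_RDWR counts as both a reader and a writer.

```python
# Illustrative bit values only; not the BeeGFS encoding.
READ_LOCK = 0x1
WRITE_LOCK = 0x2


def can_change_access_flags(current, requested, open_readers, open_writers):
    """Return True if moving from current to requested flags is allowed.

    A lock bit that is being added or removed requires that no open
    session with the matching access mode exists.
    """
    changed = current ^ requested  # bits that differ between old and new
    if changed & READ_LOCK and open_readers > 0:
        return False
    if changed & WRITE_LOCK and open_writers > 0:
        return False
    return True
```

In the real system a rejected transition is reported as an error such as IN_USE rather than a boolean.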
Access flags can be updated/inspected using BeeGFS CTL (hidden CLI command or the Go library):
$ beegfs entry set --access-flags=<unlocked | read-lock | write-lock | read-write-lock | none> /mnt/beegfs/myfile
For example, to set the write access flag and verify it is set:
$ beegfs entry set --access-flags=write /mnt/beegfs/myfile
$ beegfs entry info --columns=path,type,access /mnt/beegfs/myfile
PATH TYPE ACCESS
/myfile file Locked (write)
All access flags can be cleared by specifying none:
$ beegfs entry set --access-flags=none /mnt/beegfs/myfile
$ beegfs entry info --columns=path,type,access /mnt/beegfs/myfile
PATH TYPE ACCESS
/myfile file Unlocked
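A data management application could wrap the two commands shown above so that the lock is always released, even if the sync fails. This is a minimal sketch: the `write_locked` helper and its injectable `run` parameter are hypothetical, and the command lines are copied from the examples above rather than from a verified CLI reference.

```python
import contextlib
import subprocess


@contextlib.contextmanager
def write_locked(path, run=subprocess.run):
    """Hold the write access flag on path for the duration of the block.

    Note that clearing with "none" removes all access flags, including
    a read lock that may have been set before entering the block.
    """
    run(["beegfs", "entry", "set", "--access-flags=write", path], check=True)
    try:
        yield path
    finally:
        run(["beegfs", "entry", "set", "--access-flags=none", path], check=True)
```

Usage would look like `with write_locked("/mnt/beegfs/myfile"): sync_file(...)`, where `sync_file` stands for whatever the application does while it holds the lock.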
Bypassing Access Flags
A BeeGFS client used by HSM software can specify sysBypassFileAccessCheckOnMeta=true in its
beegfs-client.conf file to bypass all access checks and manage the contents of locked and
unlocked files. It is the responsibility of the HSM to ensure the appropriate file access locks are
taken and released if exclusive access to a file is required.
Warning
Take care that these special clients are not inadvertently used by other users or applications.
Setting Data States
Warning
Generally, data states should be managed only by a single data management application responsible for managing file contents in BeeGFS. Manually modifying data states set by an external data management application using BeeGFS CTL may lead to unintended behavior.
Data states can be updated/inspected using BeeGFS CTL (hidden CLI command or the Go library):
$ beegfs entry set --data-state=0 /mnt/beegfs/myfile
$ beegfs entry info --columns=path,type,state /mnt/beegfs/myfile
PATH TYPE STATE
/myfile file Available
The following data states are currently supported:
| Data State | Client Error | Behavior |
|---|---|---|
| Available (0) | EWOULDBLOCK | Waits up to |
| Manual Restore (1) | EWOULDBLOCK | Immediately returns an error. Intended to indicate when auto restore is not supported or not allowed by policy and requires explicit administrative intervention. |
| Auto Restore (2) | EWOULDBLOCK | Waits up to |
| Delayed Restore (3) | EWOULDBLOCK | Immediately returns an error. Intended to indicate the contents will be restored automatically, but may take a very long time. |
| Unavailable (4) | EREMOTEIO | Immediately returns an error. Intended to indicate the backing media is permanently lost, destroyed, or the file/object was deleted by a third party application. |
| Reserved (5) | EWOULDBLOCK | Immediately returns an error. Behavior is subject to change in a future release. |
| Reserved (6) | EWOULDBLOCK | Immediately returns an error. Behavior is subject to change in a future release. |
| Reserved (7) | EWOULDBLOCK | Immediately returns an error. Behavior is subject to change in a future release. |
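For tooling that needs to reason about these states, the table can be encoded as a small lookup. This is a sketch that mirrors the table only: the `DataState` enum and the `blocked_open_behavior` helper are hypothetical names, and the numeric fallback for EREMOTEIO is the Linux value, used here as an assumption for platforms where Python does not define it.

```python
import errno
from enum import IntEnum

# EREMOTEIO is Linux-specific; 121 is its Linux value (assumed fallback).
EREMOTEIO = getattr(errno, "EREMOTEIO", 121)


class DataState(IntEnum):
    """Named data states from the table; 5 to 7 are reserved."""
    AVAILABLE = 0
    MANUAL_RESTORE = 1
    AUTO_RESTORE = 2
    DELAYED_RESTORE = 3
    UNAVAILABLE = 4


def blocked_open_behavior(state):
    """Return (client_waits, error_number) for a blocked open.

    client_waits is True for the states whose behavior is to wait
    before returning an error; the others fail immediately.
    """
    value = int(state)
    if not 0 <= value <= 7:
        raise ValueError(f"data state out of range: {value}")
    client_waits = value in (DataState.AVAILABLE, DataState.AUTO_RESTORE)
    error_number = EREMOTEIO if value == DataState.UNAVAILABLE else errno.EWOULDBLOCK
    return client_waits, error_number
```

Reserved values are accepted and treated as "fail immediately with EWOULDBLOCK", matching the table, but callers should not rely on that since the behavior is documented as subject to change.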
By default, the data state of all files is Available. This allows a data management application to
lock a file for exclusive access while syncing it, but not cause other applications to encounter an
error when attempting to open that file unless tuneFileOpenRetryTimeoutMS is exceeded. How
frequently a client checks file availability can be tuned using tuneFileOpenRetryIntervalMS.
Both parameters are configured on a per client (mount) basis in beegfs-client.conf.
The Auto Restore state is intended for cases where the contents of a file have been offloaded and
are no longer locally available in BeeGFS, but can be restored within tuneFileOpenRetryTimeoutMS. By
default, the timeout is five minutes, but it can be increased as needed to facilitate longer restore
times. The timeout can be set to different values on each client, for example if some clients are
used for interactive access where it is better to immediately return an error, and other clients are
used for non-interactive long-running jobs where waiting for the file to be restored is acceptable.
When a client is waiting for a file to be restored, the blocked open can be interrupted by sending a
standard Linux signal (such as SIGINT) to the process that first attempted to open the file.
Note that if multiple processes attempt to open the same file, only the first one to issue an open
request can be interrupted. Due to restrictions in the Linux Virtual File System (VFS) layer, the
other processes remain in an uninterruptible sleep until the open issued by the first is interrupted.
Note
Interrupting a blocked open does not signal to the data management application to cancel any restore operations that might have been triggered.
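An application can use the same signal mechanism to impose its own deadline on a blocked open. The sketch below uses SIGALRM via an interval timer; it is an illustration under assumptions, not BeeGFS-specific code: the `open_with_deadline` helper is hypothetical, it must run in the main thread (a Python restriction on installing signal handlers), and per the note above only the first process waiting on a given file can be interrupted this way.

```python
import os
import signal


def open_with_deadline(path, flags, seconds):
    """Call os.open, raising TimeoutError if it blocks longer than seconds.

    Works by arming a real-time interval timer whose SIGALRM handler
    raises, which interrupts the blocked system call.
    """
    def on_alarm(signum, frame):
        raise TimeoutError(f"open({path!r}) did not complete within {seconds}s")

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        return os.open(path, flags)
    finally:
        # Disarm the timer and restore the earlier handler in all cases.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

As the note above says, giving up on the open does not cancel a restore that the data management application may already have started.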
The other states always immediately return an error, but allow data management applications to
surface more information to users through standard BeeGFS tooling such as beegfs entry info.
Additionally, the Unavailable state can be used to mark files that cannot be restored, as
returning the default EWOULDBLOCK (resource temporarily unavailable) would be misleading.