Quota

Introduction

BeeGFS allows the definition of system-wide quotas for disk space allocation and number of chunk files, on a per-user or per-group basis. This can be used to organize users into tiers with different levels of restriction and to prevent individual users from consuming all of the file system’s resources.

The BeeGFS quota management mechanism is composed of two features: quota tracking and quota enforcement. Quota tracking allows the query of the amount of data and the number of chunk files that users and groups are using in the system, without imposing any restriction.

Quota enforcement allows the definition and application of quota limits in the whole system. When this feature is enabled, the BeeGFS management daemon collects quota reports from all storage targets at regular intervals, checks for exceeded quota limits, and informs the rest of the system about which users are no longer allowed to consume more resources.

BeeGFS quota management relies on quota data provided by the underlying file systems of the storage targets. Therefore, the capabilities of those file systems determine which types of quota BeeGFS is able to manage. For example, if the storage targets use a version of ZFS prior to 0.7.4, BeeGFS will allow the definition of quotas only for used space, not for the number of files, as the latter is not supported by old releases of ZFS. If you use ZFS 0.7.4 or later, the latest version of BeeGFS will allow you to define both types of quota.

Quota limits are always configured on individual storage pools. Each pool can have default limits that apply to all users and groups, as well as limits for specific users or groups. Per-user and per-group limits take precedence over the default limits, and the creation of new files is prohibited when the applicable limits are reached.

The following sections explain in more detail how these features work and how they can be configured.

Note

Quota limits for inodes (number of files) relate to the data chunk files created on storage targets and not files created by end-users under the BeeGFS mount point. It is also important to understand that such quota limits do not concern the number of directories created in the system.

Quota tracking

This section provides information on how to enable tracking of used disk space and number of chunk files on the storage targets. Note that metadata targets do not have any requirements for quota tracking.

Requirements and general notes

Quota tracking is designed to generally work with any underlying local file system on the storage servers that supports user and group quota (reported through the system call quotactl()), but has only been fully tested with ext4, XFS, and ZFS.

If you also create files on the storage targets outside of the BeeGFS storage directory, note that the blocks and inodes occupied by those files also count as used resources for the user who owns them. The reports are also distorted if multiple storage targets are located within the same local file system instance. These scenarios are not uncommon on test systems.

Files stored in the disposal directory (which do not appear under the BeeGFS client mount point) also count toward the space used by users. Therefore, try clearing the disposal directory if you think that the reported used space differs from the disk space actually used.

Specifying tracked users and groups

By default, the management service retrieves the user and group IDs used for quota checking and enforcement from the same source as the commands getent passwd and getent group. This source could be a central LDAP database or another user management system, but the passwd and group databases on all nodes should be synchronized so that user and group IDs are consistent across all nodes. Ensure these commands return a full list of all users and groups when run without any additional arguments. Being able to query specific users or groups (e.g., getent passwd <user>) is not sufficient, because the management service needs to be able to enumerate the full list. You may need to adjust the configuration of the System Security Services Daemon (SSSD) to enable enumeration.
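A quick way to check enumeration is to count the entries that getent returns at or above the intended minimum ID. The helper below is a minimal sketch: the function name count_ids_at_least is hypothetical and not part of BeeGFS. It reads passwd- or group-formatted lines on standard input, where the numeric ID is the third colon-separated field.

```shell
# Count passwd/group-style entries whose numeric ID (field 3) is >= a minimum.
# count_ids_at_least is a hypothetical helper name, not a BeeGFS command.
count_ids_at_least() {
    awk -F: -v min="$1" '$3 ~ /^[0-9]+$/ && $3 + 0 >= min + 0 { n++ } END { print n + 0 }'
}

# Example use on the management node (not run here):
#   getent passwd | count_ids_at_least 1000
#   getent group  | count_ids_at_least 1000
```

A result of 0, or a count far below the number of accounts you expect, suggests that enumeration is disabled in the name service.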

Warning

If the getent commands do not return anything on the management node, you will not be able to use the command beegfs quota list-usage to query used space for all users at once, and you must use the file and/or range options below to manually specify the users and groups to track.

By default, no users or groups are queried by the BeeGFS quota system, and you must define the minimum user and group ID for which quotas should be enabled. For example, to query all users and groups including system accounts, you would configure the following in beegfs-mgmtd.toml:

quota-user-system-ids-min = 0
quota-group-system-ids-min = 0

You may want to omit system users (like root) and groups. In that case, set the minimum based on the maximum ID for system users/groups on your Linux distribution (1000 is typical):

quota-user-system-ids-min = 1000
quota-group-system-ids-min = 1000
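On many distributions the boundary between system and regular accounts is recorded as UID_MIN and GID_MIN in /etc/login.defs. As a sketch, assuming your distribution uses that file, the hypothetical helper below prints the value of such a key so it can be copied into the options above.

```shell
# Print the value of a key (e.g. UID_MIN) from a login.defs-style file.
# login_defs_value is a hypothetical helper name; the default path is the
# usual location of the file and is an assumption about your distribution.
login_defs_value() {
    awk -v key="$1" '$1 == key { print $2; exit }' "${2:-/etc/login.defs}"
}

# Example use (not run here):
#   login_defs_value UID_MIN
#   login_defs_value GID_MIN
```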

More advanced filtering of the users and/or groups tracked by the quota system is also possible with additional configuration options in beegfs-mgmtd.toml. These options can be used together and in conjunction with the quota-*-system-ids-min options. These steps are also required if you are not able to use the passwd and group databases as the source of IDs:

  1. You may provide specific ranges of user and/or group IDs that should be queried. Do not define unnecessarily large ranges as this could decrease query performance. This option is helpful when only a small range of IDs should be queried instead of all IDs available in the system. Ranges can be specified with:

    quota-user-ids-range = "<start>-<end>"
    quota-group-ids-range = "<start>-<end>"
    
  2. You can also provide files with lists of specific user and/or group IDs that should be queried. Each file should contain the desired IDs separated by newlines or whitespace. This query type is helpful when the IDs to be queried are not sequential. The file paths can be specified with:

    quota-user-ids-file = "<path>"
    quota-group-ids-file = "<path>"
    

After specifying the desired configuration don’t forget to restart the management service so the changes take effect.
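Before restarting, it can be worth checking that an ID file contains only numeric IDs separated by whitespace or newlines, as the format above requires. The following is a minimal sketch: the function name is hypothetical, and the check reflects only the format described here, not the management service's actual parser.

```shell
# Succeed only if every whitespace-separated token in the file is a
# non-negative integer. Prints the first offending token otherwise.
# check_id_file is a hypothetical helper, not a BeeGFS tool.
check_id_file() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i !~ /^[0-9]+$/) { print "invalid ID: " $i; bad = 1; exit }
    } END { exit bad + 0 }' "$1"
}

# Example use (not run here):
#   check_id_file /path/to/quota-user-ids-file && echo "file looks fine"
```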

Dynamically Generating User and Group ID Lists from LDAP

If you do not wish to track and enforce BeeGFS quotas for all LDAP users and groups, you can dynamically generate files containing lists of user and group IDs from LDAP using the ldapsearch command. Below are examples for creating these files:

Generate User ID List

To create a file containing the user IDs, use the following command:

ldapsearch -x -LLL -b "ou=users,dc=example,dc=com" "(objectClass=posixAccount)" uidNumber | awk '/^uidNumber: / {print $2}' > /path/to/quota-user-ids-file
  • -x: Simple authentication.

  • -LLL: Strips the LDAP output header and comments for easier parsing.

  • -b: Specifies the search base (adjust based on your LDAP structure).

  • "(objectClass=posixAccount)": Filters for user accounts.

  • uidNumber: Attribute that contains user IDs.

  • awk: Extracts and writes user IDs to the file.

Ensure the attribute names (uidNumber, gidNumber) and object classes (posixAccount, posixGroup) match your LDAP schema. Use ldapsearch with no filters to inspect your LDAP entries if needed.
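To see what the awk filter does, it can be run against a small LDIF sample instead of a live directory. The entries below are invented for illustration, and extract_ldif_ids is a hypothetical wrapper; only lines that start with the attribute name followed by a colon and a space contribute to the output.

```shell
# Extract numeric IDs from LDIF text on standard input.
# $1 is the attribute name (uidNumber or gidNumber).
# extract_ldif_ids is a hypothetical wrapper around the awk filter above.
extract_ldif_ids() {
    awk -v attr="$1" 'index($0, attr ": ") == 1 { print $2 }'
}

# Invented sample in the shape produced by "ldapsearch -LLL ... uidNumber".
sample_ldif='dn: uid=alice,ou=users,dc=example,dc=com
uidNumber: 1001

dn: uid=bob,ou=users,dc=example,dc=com
uidNumber: 1002
'

# Prints 1001 and 1002, one per line.
printf '%s' "$sample_ldif" | extract_ldif_ids uidNumber
```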

Generate Group ID List

To create a file containing the group IDs, use the following command:

ldapsearch -x -LLL -b "ou=groups,dc=example,dc=com" "(objectClass=posixGroup)" gidNumber | awk '/^gidNumber: / {print $2}' > /path/to/quota-group-ids-file
  • Replace the base (ou=groups,dc=example,dc=com) and filters (objectClass=posixGroup) as needed to match your LDAP schema.

Again, adjust attributes as needed to match your LDAP schema.

Automating with a Cron Job

To keep these files updated dynamically, consider scheduling the commands using a cron job. For example:

  1. Edit the cron jobs for the root user (or another privileged user):

    crontab -e
    
  2. Add the following entries to update the files daily at midnight:

    0 0 * * * ldapsearch -x -LLL -b "ou=users,dc=example,dc=com" "(objectClass=posixAccount)" uidNumber | awk '/^uidNumber: / {print $2}' > /path/to/quota-user-ids-file
    0 0 * * * ldapsearch -x -LLL -b "ou=groups,dc=example,dc=com" "(objectClass=posixGroup)" gidNumber | awk '/^gidNumber: / {print $2}' > /path/to/quota-group-ids-file
    

Note that once the management service has been configured to use these files (see below), it does not need to be restarted whenever the files are updated, because it automatically reloads them from disk.
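One caveat with redirecting straight into the ID file is that the shell truncates the file before ldapsearch runs, so a failed query would leave it empty. A possible safeguard, sketched below with a hypothetical helper name, is to write to a temporary file first and replace the target only when at least one numeric ID was produced. The 600 permission mode is an assumption that matches a management service running as root.

```shell
# Read IDs from standard input and atomically replace the target file, but
# only if the input contains at least one numeric ID.
# replace_id_file is a hypothetical helper, not part of BeeGFS.
replace_id_file() {
    target=$1
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    # Keep only purely numeric lines, sorted and de-duplicated.
    grep -E '^[0-9]+$' | sort -n -u > "$tmp"
    if [ -s "$tmp" ]; then
        # Restrict access, then rename within the same directory.
        chmod 600 "$tmp" && mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        echo "no IDs received, keeping existing $target" >&2
        return 1
    fi
}

# Example pipeline for a cron script (not run here):
#   ldapsearch -x -LLL -b "ou=users,dc=example,dc=com" "(objectClass=posixAccount)" uidNumber \
#     | awk '/^uidNumber: / {print $2}' | replace_id_file /path/to/quota-user-ids-file
```

Because the helper is a shell function, the cron entry would call a small script that defines it rather than the pipeline directly.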

Applying the Configuration

After generating these files, specify their paths in the BeeGFS management configuration:

quota-user-ids-file = "/path/to/quota-user-ids-file"
quota-group-ids-file = "/path/to/quota-group-ids-file"

Finally, restart the BeeGFS management service for the changes to take effect:

systemctl restart beegfs-mgmtd

This method ensures that quota tracking and enforcement work reliably, even in environments with non-sequential or dynamically managed IDs from LDAP.

Warning

Protect the generated files by limiting access to only the user running the BeeGFS management service (typically root) to ensure they are not exposed to unauthorized users.

Enabling quota during a new BeeGFS installation

Follow these steps if you are about to set up a new BeeGFS instance that should support quota.

In this example, we assume that /dev/sdb is the underlying disk or RAID array of a storage target, which is mounted to the directory /data.

  1. Start by enabling quota support for the underlying file system on the storage targets, as described below for ext4, XFS, and ZFS.

    ext4: Enable quota support for ext4:

    # Mount device with quota support for users and groups
    $ mount /dev/sdb /data -t ext4 -orw,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv1,...
    
    # Create quota database files
    $ quotacheck -cug /data
    
    # Calculate current quota values
    $ quotacheck -vug /data
    
    # Enable quota counting
    $ quotaon -vug /data
    

    XFS: Enable quota support for XFS:

    # Mount device with quota support for users and groups
    $ mount /dev/sdb /data -t xfs -orw,uqnoenforce,gqnoenforce,...
    

    ZFS: Enable quota support for ZFS:

    Make sure that the package libzfs2-devel is installed on your system. On Debian/Ubuntu systems install libzfslinux-dev. Nothing else needs to be done, because quota tracking is supported automatically based on libzfs.

  2. Perform the installation as usual but before starting any clients or services update the following configuration:

    1. Set quota-enable = true in /etc/beegfs/beegfs-mgmtd.toml on your management node and ensure you have specified tracked users and groups.

    2. Set quotaEnabled = true in /etc/beegfs/beegfs-client.conf on all client nodes.

      This setting causes the client to send extra user data to the servers, namely the uid and gid of the user making each I/O syscall. This extra data allows BeeGFS to correctly compute the disk space used and the number of files created by each user. If this option is not set on a client node, all syscalls performed on that node will count against the quota of the root user instead of the actual caller.
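When many client nodes need the same change, the edit can be scripted. The helper below is a sketch that assumes the option already appears in the file as an uncommented key = value line; set_conf_option is a hypothetical name, not a BeeGFS tool.

```shell
# Set "key = value" in a BeeGFS .conf file, replacing an existing
# uncommented line for that key. Fails if the key is not present.
# A backup of the original file is kept with the suffix .bak.
# set_conf_option is a hypothetical helper, not a BeeGFS tool.
set_conf_option() {
    file=$1 key=$2 value=$3
    grep -Eq "^${key}[[:space:]]*=" "$file" || {
        echo "$key not found in $file" >&2
        return 1
    }
    sed -i.bak -E "s|^(${key}[[:space:]]*=).*|\1 ${value}|" "$file"
}

# Example use on a client node (not run here):
#   set_conf_option /etc/beegfs/beegfs-client.conf quotaEnabled true
```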

Enabling quota for an existing BeeGFS installation

Take these steps if you want to enable quota support for an existing BeeGFS instance that was previously used without quota support.

In this example, we assume that /dev/sdb is the underlying disk or RAID array of a storage target, which is mounted to the directory /data.

  1. Stop all BeeGFS server and client services.

  2. Enable quota support for the underlying file system on the storage targets, as described below for ext4, XFS and ZFS.

    ext4: Enable quota support for ext4:

    # Mount device with quota support for users and groups
    $ mount /dev/sdb /data -t ext4 -orw,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv1,...
    
    # Create quota database files
    $ quotacheck -cug /data
    
    # Calculate current quota values
    $ quotacheck -vug /data
    
    # Enable quota counting
    $ quotaon -vug /data
    

    XFS: Enable quota support for XFS:

    # Mount device with quota support for users and groups
    $ mount /dev/sdb /data -t xfs -orw,uqnoenforce,gqnoenforce,...
    

    ZFS: Enable quota support for ZFS:

    Make sure that the package libzfs2-devel is installed on your system. On Debian/Ubuntu systems install libzfslinux-dev. Nothing else needs to be done, because quota tracking is supported automatically based on libzfs.

  3. Set quota-enable = true in /etc/beegfs/beegfs-mgmtd.toml on your management node and ensure you have specified tracked users and groups.

  4. Set quotaEnabled = true in /etc/beegfs/beegfs-client.conf on all client nodes.

    This setting causes the client to send extra user data to the servers, namely the uid and gid of the user making each I/O syscall. This extra data allows BeeGFS to correctly compute the disk space used and the number of files created by each user. If this option is not set on a client node, all syscalls performed on that node will count against the quota of the root user instead of the actual caller.

  5. Start all BeeGFS services.

  6. With the beegfs-utils package installed, run the following command on one of the client nodes to update the ownership information of the existing data chunk files on the storage servers for quota tracking. This command can take a while to complete, but it is executed only once, and the system can be online while the chunk files are being updated.

    $ beegfs-fsck --enablequota
    

    This command could be re-executed if you discover later that some clients didn’t have option quotaEnabled set to true, and you want to update the ownership information of the data chunk files created in the meantime.
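If usage values look wrong after these steps, one thing worth checking is whether each storage target is actually mounted with the quota options from step 2. The sketch below assumes a Linux system where /proc/mounts lists the active mount options; the helper name is hypothetical.

```shell
# Succeed if the given mount point is listed with the given option in a
# /proc/mounts-style file (fields: device, mount point, type, options, ...).
# mount_has_option is a hypothetical helper name.
mount_has_option() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                # Compare only the option name, ignoring any "=value" part.
                name = opts[i]; sub(/=.*/, "", name)
                if (name == opt) found = 1
            }
        }
        END { exit !found }' "${3:-/proc/mounts}"
}

# Example use on a storage server (not run here):
#   mount_has_option /data usrjquota && echo "ext4 user quota option present"
```

The kernel may report option names that differ from those passed to mount, so compare against the actual contents of /proc/mounts on your system.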

Querying quota information

Quota information can be queried with the commands under beegfs quota. These commands query the management service, which periodically pulls quota information from the storage nodes and calculates/enforces quota limits (if enabled/defined). This means the reported usage may lag slightly behind the actual usage, depending on the quota-update-interval set in beegfs-mgmtd.toml.

Here are some usage examples:

  • Show quota information for all normal users:

    $ beegfs quota list-usage --type=user
    
  • Show quota information for the user ID 1000:

    $ beegfs quota list-usage --type=user --ids=1000
    
  • Show quota information for group IDs range 1000 to 1500:

    $ beegfs quota list-usage --type=group --ids=1000-1500
    
  • Show the default quota limits for each storage pool:

    $ beegfs quota list-defaults
    
  • To get quota information for a specific storage pool, include the --pool=<entityID> option in the command. For example:

    $ beegfs quota list-usage --pool=<entityID>
    
  • Show more examples and general help:

    $ beegfs quota --help
    

Note

If the underlying file system of the storage targets is ZFS, querying/setting quotas for the number of files is not supported due to underlying ZFS limitations. Thus, the values for used files/inodes will not be available.

Quota enforcement

This section provides information on how to activate quota enforcement in a BeeGFS system.

Requirements

Quota enforcement requires quota tracking to be enabled, as described above.

Enable quota enforcement

Take the steps below on each service to enable quota enforcement in the whole system.

Storage Service Setting

  1. Set the option below to true in the storage configuration file /etc/beegfs/beegfs-storage.conf:

    quotaEnableEnforcement = true
    
  2. Restart the storage service daemon.

Meta Service Setting

  1. Set the option below to true in the meta configuration file /etc/beegfs/beegfs-meta.conf:

    quotaEnableEnforcement = true
    
  2. Restart the meta service daemon.

Management Service Settings

Take the following steps to enable quota enforcement in the management service. All options presented in this section are found in the file /etc/beegfs/beegfs-mgmtd.toml.

  1. Quota reports are collected from the storage targets and quota limits are checked by the management service at regular intervals. The interval is set by the option quota-update-interval (default: 30 seconds). A shorter interval reduces the time until an exceeded limit is noticed and enforced, so keeping it low reduces the chance of a user temporarily exceeding a limit. On the other hand, frequent queries add load to the system and may reduce performance, so change this option with caution. If you reduce the interval, also consider changing the type of quota query (system, range, or file), as described in the section Specifying tracked users and groups. All quota query types are updated at the specified interval.

    quota-update-interval = 30
    
  2. If you haven’t already, follow the steps in the section Specifying tracked users and groups to define what users and groups should have quota tracking and enforcement enabled.

  3. Set the following option to true to activate quota enforcement on the system (note that quota-enable = true must also be set):

    quota-enforce = true
    
  4. Restart the management service daemon.

  5. These changes won’t be noticed by the other server services until they are restarted. Therefore, restart the storage service daemons and the metadata service daemons.
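Since enforcement depends on settings in three different files, a small check can catch a missed option. The sketch below prints the value assigned to a key in a key = value style file; conf_value is a hypothetical helper, and it does naive line matching rather than real TOML parsing.

```shell
# Print the value assigned to a key in a "key = value" style file, with
# surrounding whitespace and double quotes removed. Commented-out lines
# are ignored. conf_value is a hypothetical helper, not a BeeGFS tool.
conf_value() {
    awk -F= -v key="$2" '
        {
            k = $1; gsub(/[ \t]/, "", k)
            if (k == key) {
                v = substr($0, index($0, "=") + 1)
                gsub(/^[ \t"]+|[ \t"]+$/, "", v)
                print v; exit
            }
        }' "$1"
}

# Example use (not run here); each command should print "true":
#   conf_value /etc/beegfs/beegfs-mgmtd.toml quota-enable
#   conf_value /etc/beegfs/beegfs-mgmtd.toml quota-enforce
#   conf_value /etc/beegfs/beegfs-storage.conf quotaEnableEnforcement
#   conf_value /etc/beegfs/beegfs-meta.conf quotaEnableEnforcement
```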

Setting quota limits

Quota limits can be set with the command beegfs quota set-limits and are configured on a per-storage pool basis. Here are some usage examples:

  • Set quota limit for user ID 1000 to 1 gigabyte and 500 chunk files on pool 1:

    $ beegfs quota set-limits --uid=1000 --space=1G --inode=500 storage:1
    
  • Set quota limit for group ID 1289 to 10 gigabytes and 22 chunk files on pool 2:

    $ beegfs quota set-limits --gid=1289 --space=10G --inode=22 storage:2
    
  • Reset the quota limits for user ID 1000 on pool 1 so the defaults will be enforced:

    $ beegfs quota set-limits --uid=1000 --space=reset --inode=reset storage:1
    
  • Set quota limit for group ID 1289 to 10 gigabytes and reset the chunk file limit on pool 1:

    $ beegfs quota set-limits --gid=1289 --space=10G --inode=reset storage:1
    
  • Set the default user quota limits on pool 1 to 10 gigabytes and unlimited chunk files:

    $ beegfs quota set-defaults --user-space=10G --user-inode=unlimited storage:1
    

Note

Similar to the beegfs quota list-usage mode, it is possible to set quota limits for several users and/or groups at once by giving a comma-separated list of IDs or a range in the form <min>-<max>.
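For many users with different limits, the set-limits calls can be generated from a small table. The sketch below only prints the commands so they can be reviewed before being run. It uses only the options shown in the examples above; the input format (user ID, space limit, chunk file limit per line) and the function name are invented for illustration.

```shell
# Read "uid space inode" lines from standard input and print one
# "beegfs quota set-limits" command per line for the given pool.
# Lines that are empty or start with "#" are skipped.
# print_set_limits is a hypothetical helper; it prints, it does not execute.
print_set_limits() {
    pool=$1
    while read -r uid space inode; do
        case $uid in ''|'#'*) continue ;; esac
        printf 'beegfs quota set-limits --uid=%s --space=%s --inode=%s %s\n' \
            "$uid" "$space" "$inode" "$pool"
    done
}

# Invented sample input: user ID, space limit, chunk file limit.
printf '1000 1G 500\n# temporary account\n1001 10G reset\n' | print_set_limits storage:1
```

Printing instead of executing keeps the sketch safe to try on any machine; after review, the generated lines can be run on a node where the beegfs tool is installed.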

Project directory quota tracking

The BeeGFS quota management mechanism is based on user and group quota. Group quota can be used for project directories by using the setgid flag on a directory (chmod g+s /mnt/beegfs/project01). If this flag is set, all files created in the directory will automatically have the group of the directory instead of the primary group of the user who created the file.

A slightly different approach that achieves a very similar result is to mount BeeGFS with the grpid mount option. Doing so acts like a global setgid bit and makes new files and directories inherit group IDs from their parent directories across the entire filesystem.

With this approach, it is useful to also create a separate group for the project, e.g., a group project01 and apply it to the project directory (chown root:project01 /mnt/beegfs/project01). To avoid conflicts with per-user quota limits, the same approach can be used not only for shared project directories but also for user directories, in which case each user has its own group.
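The effect of the setgid flag can be tried out on any local Linux file system before applying it under the BeeGFS mount point. The sketch below uses a temporary directory as a stand-in for a project directory and relies on GNU stat; both are assumptions made for illustration.

```shell
# Demonstrate group inheritance in a setgid directory, using a temporary
# directory as a stand-in for /mnt/beegfs/project01.
project_dir=$(mktemp -d)
chmod g+s "$project_dir"

# Files and subdirectories created inside inherit the directory's group;
# on Linux, subdirectories also inherit the setgid flag itself.
touch "$project_dir/data.bin"
mkdir "$project_dir/results"

dir_gid=$(stat -c %g "$project_dir")
file_gid=$(stat -c %g "$project_dir/data.bin")
echo "directory gid: $dir_gid, file gid: $file_gid"
```

On a real project directory, the group would first be changed with chown as described above, so that the inherited group is the project group rather than the creating user's primary group.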

Alternatively, if you want to track used space or number of files based on subdirectory trees, you might want to look at the Robinhood Policy Engine.

Robinhood can run parallel scans of the file system at regular intervals and store the discovered file and directory information in a SQL database. This makes it possible to run various queries against the database with fast results. In addition, automatic actions can be defined in Robinhood for certain events, e.g., when the used space limit defined for a certain subdirectory tree is exceeded.

As BeeGFS keeps all the metadata for such scans readily available on the metadata servers (usually flash storage), crawling a file system in parallel is fast. To make sure that the SQL database of Robinhood does not reduce the scan speed, it is recommended to have the Robinhood database also on flash storage.