Storage Pools

Storage pools can be considered a template for the stripe pattern BeeGFS uses to assign file chunks to specific storage targets. Instead of directly assigning a stripe pattern, a user can now assign a storage pool, which is much easier and more intuitive. The storage pool assignment can be part of the directory metadata, and therefore apply to all newly created files in a particular directory, or it can be part of the file metadata, applying only to that file and overriding the directory default.

On a freshly installed BeeGFS, or after an update from an older version which does not support storage pools, all targets and storage buddy groups will be in the “default” pool. You can check the members of each pool with beegfs pool list:

$ beegfs pool list
ALIAS    ID         MEMBERS
default  storage:1  target_0-6734A781-1
                    target_1-6734A781-1
                    target_2-6734A781-1
                    target_3-6734A781-1
                    target_0-6734A784-2
                    target_1-6734A784-2
                    target_2-6734A784-2
                    target_3-6734A784-2
                    s103_s104

Because this system is using buddy mirroring, the default pool contains the buddy group s103_s104 in addition to the targets. Another way to check pool membership is beegfs target list, which prints each target together with the pool it belongs to.
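For scripting, it can help to flatten the listing into one line per pool member. The following is a minimal sketch, assuming the three-column layout shown in this document and aliases that contain no whitespace; pool_members is a hypothetical helper, not part of the beegfs tool.

```shell
# pool_members: read `beegfs pool list` output on stdin and print one
# "<alias> <member>" pair per line.
pool_members() {
    awk '
        NR == 1 { next }                               # skip the ALIAS/ID/MEMBERS header
        NF == 0 { next }                               # skip blank lines
        NF == 3 { alias = $1; print alias, $3; next }  # first row of a pool
        NF == 2 { alias = $1; next }                   # pool without members
        NF == 1 { print alias, $1 }                    # continuation row: member only
    '
}

# Usage on a live system (not executed here):
#   beegfs pool list | pool_members
```

Reading from stdin keeps the helper independent of the beegfs binary, so it can be tried against saved output first.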

A new storage pool can be added using the beegfs pool create command. The beegfs pool create command requires the list of targets and/or buddy groups to assign to the pool, along with the alias and optionally a numeric ID to assign to the pool. If no ID is specified, the first unused ID will be used. Adding a target or group to a storage pool will remove it from the default storage pool, as targets and groups can only be part of one storage pool at a time.

$ beegfs pool create --targets=target_0-6734A781-1,target_0-6734A784-2 archive
Pool created: archive[storage:2, uid:46]

$ beegfs pool list
ALIAS    ID         MEMBERS
default  storage:1  target_1-6734A781-1
                    target_2-6734A781-1
                    target_3-6734A781-1
                    target_1-6734A784-2
                    target_2-6734A784-2
                    target_3-6734A784-2
                    s103_s104
archive  storage:2  target_0-6734A781-1
                    target_0-6734A784-2

Targets that are members of a buddy group cannot be assigned directly to a pool. Instead, assign the buddy group to the pool, and its targets will be moved to the pool automatically:

$ beegfs pool create --targets=target_2-6734A781-1,target_3-6734A781-1 mirrored
Error: rpc error: code = Unknown desc = Create pool: Target target_2-6734A781-1 can't be assigned directly as it's part of a buddy group

$ beegfs pool create --groups s103_s104 mirrored
Pool created: mirrored[storage:3, uid:48]

$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
archive   storage:2  target_0-6734A781-1
                     target_0-6734A784-2
mirrored  storage:3  target_2-6734A781-1
                     target_3-6734A781-1
                     s103_s104

The command beegfs pool delete removes an existing storage pool. Before deleting a pool, its members must be assigned to other pools using the beegfs pool assign command:

$ beegfs pool delete mirrored
Error: rpc error: code = Unknown desc = Delete pool: 2 targets and 1 buddy groups are still assigned to this pool

$ beegfs pool assign --groups s103_s104 default
Pool assigned: default[storage:1, uid:2]

$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     target_2-6734A781-1
                     target_3-6734A781-1
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
                     s103_s104
archive   storage:2  target_0-6734A781-1
                     target_0-6734A784-2
mirrored  storage:3

$ beegfs pool delete mirrored
Pool can be deleted: mirrored[storage:3, uid:48]
If you really want to delete it, please add the --yes flag to the command.

Warning

When a pool is removed, existing files and directories that reference this pool ID are not updated and will continue to reference the now deleted pool ID. This prevents pool deletions from accidentally triggering mass reconfiguration/restriping of entries in the file system. Existing files will remain on the same targets even though those targets are now in a different pool, and directories will continue to refer to the now deleted pool.

Note that new files cannot be created in directories that reference the deleted pool until either the directory's stripe pattern is updated or a new pool with the same ID is created. Use beegfs entry set to update the pool on existing directories and beegfs entry migrate to move existing files to new targets if desired.

The alias assigned to a pool can be changed at any time using beegfs pool set-alias.

Assigning a Directory Stripe Pattern

The following command sets the default stripe pattern of the directory first_dir so that it uses only targets from the archive storage pool (ID 2).

$ beegfs entry set --pool=archive first_dir/
Processed 1 entries.
Configuration Updates: Pool (archive)

All newly created files and files copied into that directory will only be stored on targets belonging to the archive pool. To move existing files between pools, see Migrating Files between Storage Pools.

For example, after touching a new file, its entry info reads like this:

$ beegfs entry info first_dir/my_file
PATH                ENTRY ID      TYPE  META NODE        META MIRROR  STORAGE POOL  STRIPE PATTERN  STORAGE TARGETS  BUDDY GROUPS  REMOTE TARGETS  COOL DOWN
/first_dir/my_file  0-6758BC3C-2  file  node_meta_2 (2)            1  archive (2)   RAID0 (4x1M)    101,201          (unmirrored)  (none)          (n/a)

Note that even though the (default) stripe pattern in this example specifies four targets, only two targets are used. This is because storage pool 2 only contains two targets. When a pool is specified, targets outside this pool will not be used.
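The arithmetic behind this can be written down as a tiny helper. This is only an illustrative sketch: effective_stripe_targets is a hypothetical function, and it models just the rule described above (a file cannot use more targets than its pool contains), ignoring anything else that may influence target selection.

```shell
# effective_stripe_targets REQUESTED POOL_TARGETS
# Prints the smaller of the requested stripe width and the pool size.
effective_stripe_targets() {
    requested=$1
    pool_targets=$2
    if [ "$requested" -le "$pool_targets" ]; then
        echo "$requested"
    else
        echo "$pool_targets"
    fi
}

# Pattern asks for 4 targets, the archive pool holds 2:
effective_stripe_targets 4 2   # prints 2
```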

Note that the entry set command has more command line arguments giving you more fine-grained control over the data placement. For example, the number of targets and the chunk size can be controlled. beegfs entry set --help will print an explanation of all available options. To allow unprivileged users to use the entry set command, it is necessary to set the option sysAllowUserSetPattern = true in /etc/beegfs/beegfs-meta.conf.

Creating Files

By default, new files inherit the stripe pattern, and therefore the storage pool setting, from their parent directory. The beegfs tool provides a way to override this setting, allowing the user to specify the storage pool a file’s contents reside on. On the command line, beegfs entry create file can be used to create a new, empty file on a specific storage pool:

$ beegfs entry create file --pool=archive new_archived_file
NAME                STATUS   ENTRY ID      TYPE
/new_archived_file  Success  0-6758BCB7-2  file

Much like the entry set mode, the entry create file mode supports many more options, like setting the stripe pattern and chunk size. Use beegfs entry create file --help to get an overview and explanation of all available options.

The stripe pattern settings of a file can be verified using beegfs entry info, just like those of a directory.

$ beegfs entry info . new_archived_file
PATH                ENTRY ID           TYPE  META NODE        META MIRROR  STORAGE POOL  STRIPE PATTERN  STORAGE TARGETS  BUDDY GROUPS  REMOTE TARGETS  COOL DOWN
/                   root          directory  node_meta_2 (2)            1  default (1)   RAID0 (4x1M)    (directory)      (directory)   (none)          (n/a)
/new_archived_file  0-6758BCB7-2       file  node_meta_2 (2)            1  archive (2)   RAID0 (4x1M)    101,201          (unmirrored)  (none)          (n/a)

Notice how, instead of inheriting the default storage pool from its parent directory (root), new_archived_file is assigned to the archive pool and only uses targets from that pool. The new file is empty and ready to be written to by any application, and its data will be striped across targets 101 and 201.

Migrating Files between Storage Pools

Files can be migrated from one storage pool to another using the beegfs tool. For example, if you had written some data to the archive/ directory and wanted to bring it back to the default pool, you could run:

$ sudo beegfs entry migrate --from-pools archive --pool default archive/ --recurse --yes
Migration statistics: {MigrationStatusUnknown:0 MigrationErrors:0 MigrationNotSupported:0 MigrationSkipped:1 MigrationNotNeeded:0 MigrationNeeded:0 MigratedFiles:5 MigratedDirectories:0}

This recursively migrates all files under the archive/ directory assigned to the archive pool to the default pool. By default only files are migrated and the pool set on the archive/ directory (and any subdirectories) is not modified, but could also be updated as part of the migration by specifying --update-directories or by later using the beegfs entry set command.
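When a migration runs from a script, the statistics line can be checked automatically. The sketch below assumes the statistics are printed in exactly the Name:value form shown in the example output; migration_stat is a hypothetical helper, not a beegfs subcommand.

```shell
# migration_stat NAME: read a "Migration statistics: {...}" line on stdin
# and print the numeric value of the counter called NAME.
migration_stat() {
    sed -n "s/.*[{ ]$1:\([0-9][0-9]*\).*/\1/p"
}

# Statistics line copied from the example run above.
line='Migration statistics: {MigrationStatusUnknown:0 MigrationErrors:0 MigrationNotSupported:0 MigrationSkipped:1 MigrationNotNeeded:0 MigrationNeeded:0 MigratedFiles:5 MigratedDirectories:0}'

errors=$(printf '%s\n' "$line" | migration_stat MigrationErrors)
if [ "$errors" -ne 0 ]; then
    echo "migration reported $errors errors" >&2
fi
```

The leading `[{ ]` in the pattern anchors the match to the start of a counter name, so a name that happens to be the suffix of another counter is not picked up by mistake.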

Warning

If any files in the source pool have buddy mirroring enabled, ensure there is at least one buddy group in the destination pool. Otherwise, those files cannot be migrated.

Extra Notes and Caveats

Note that storage pools are different from capacity pools. Capacity pools (as seen e.g., by running beegfs health capacity) are used by BeeGFS to balance free space on the targets within a capacity pool.

If buddy groups are defined, targets in a buddy group are always in the same storage pool as their buddy. Targets and buddy groups cannot be moved between pools. They have to be removed from one pool, effectively moving them back to the default pool, before they can be added to another pool.

Changing the storage pool for a directory only affects new files in that directory. Existing files stay in the storage pool they were created in and have to be moved over manually. For instance, the following command will create a new copy of the file in the new storage pool, and then move it over the original file’s name, removing the old copy:

cp file file.tmp && mv file.tmp file

The same applies when moving a file into a directory. If the file was on the same BeeGFS instance before, it keeps its existing stripe pattern and storage pool. Copying a file into the directory creates a new copy of the file and applies the storage pool setting from the directory, as does moving a file into the directory from outside the BeeGFS instance.
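The difference between moving and copying can be observed on any POSIX file system, which may help build intuition. The sketch below is only an analogy: it compares inode numbers in a temporary directory rather than BeeGFS entry IDs, and inode_of and mv_vs_cp_demo are hypothetical helper names.

```shell
# Print the inode number of a file.
inode_of() {
    ls -i "$1" | awk '{ print $1 }'
}

# Show that mv within one file system keeps the file, while cp creates a new one.
mv_vs_cp_demo() {
    tmp=$(mktemp -d) || return 1
    mkdir "$tmp/dir"
    echo data > "$tmp/file"
    original=$(inode_of "$tmp/file")

    mv "$tmp/file" "$tmp/dir/file"        # rename: same underlying file
    moved=$(inode_of "$tmp/dir/file")

    cp "$tmp/dir/file" "$tmp/dir/copy"    # copy: a newly created file
    copied=$(inode_of "$tmp/dir/copy")

    [ "$original" = "$moved" ] && echo "mv kept the original file"
    [ "$original" != "$copied" ] && echo "cp created a new file"
    rm -rf "$tmp"
}

mv_vs_cp_demo
```

Because a copy is a newly created file, it picks up the settings of the directory it is created in, whereas a renamed file carries its existing settings along.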

When all targets have been removed from a pool, it is no longer possible to create files in directories that have that pool selected as their storage pool. The application will receive a “Remote I/O error”, and the metadata server log will give the reason: “No storage targets available”. This can happen, for example, when a file system has been freshly created and all targets have been moved to other storage pools. The “default” pool is then empty, but still selected as the pool for the root directory of the file system. To resolve this situation, select a pool that contains at least one target for that directory using the command beegfs entry set --pool=<entityID>, so that new files can be created in that directory.
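Empty pools can be spotted before they cause such errors. The following is a minimal sketch, assuming that a pool without members is printed as a row with only the ALIAS and ID columns, as in the listing of the emptied mirrored pool earlier in this document; empty_pools is a hypothetical helper.

```shell
# empty_pools: read `beegfs pool list` output on stdin and print the alias
# of every pool whose row has no MEMBERS column.
empty_pools() {
    awk 'NR > 1 && NF == 2 { print $1 }'
}

# Usage on a live system (not executed here):
#   beegfs pool list | empty_pools
```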

When using the beegfs entry migrate command, files are migrated based on the targets/groups they are currently using. The list of targets/groups to migrate away from is determined by explicitly specifying individual targets, by selecting all targets of a particular node, or by selecting all targets assigned to a particular pool. Keep in mind that the --from-pools=<entityID> parameter considers not the pool a file was originally created in, but the pool its targets/groups are currently assigned to, which may have changed if the targets were moved between pools.