Storage Pools
Storage pools can be considered a template for the stripe pattern BeeGFS uses to assign file chunks to specific storage targets. Instead of directly assigning a stripe pattern, a user can now assign a storage pool, which is much easier and more intuitive. The storage pool assignment can be part of the directory metadata, and therefore apply to all newly created files in a particular directory, or it can be part of the file metadata, applying only to that file and overriding the directory default.
On a freshly installed BeeGFS, or after an update from an older version which does not support storage pools, all targets and storage buddy groups will be in the “default” pool. You can check the members of each pool with beegfs pool list:
$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_0-6734A781-1
                     target_1-6734A781-1
                     target_2-6734A781-1
                     target_3-6734A781-1
                     target_0-6734A784-2
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
                     s103_s104
Because this system uses buddy mirroring, the default pool contains the buddy group s103_s104 in addition to the targets. Another way to check the storage pools is beegfs target list, which prints each target together with the pool it belongs to.
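For scripting, the listing above can be turned into a mapping from pool alias to members. The following is a minimal sketch, assuming the whitespace-separated layout shown in this section: a row whose second field looks like storage:<n> starts a new pool, and any other non-empty row is treated as a continuation of the MEMBERS column. The function name parse_pool_list is hypothetical, and the real output layout may differ between versions.

```python
import re

# Assumption: pool IDs are printed as "storage:<number>", as in the listing above.
POOL_ID = re.compile(r"^storage:\d+$")


def parse_pool_list(text):
    """Return {alias: {"id": "storage:N", "members": [...]}} from listing text."""
    pools = {}
    current = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[:2] == ["ALIAS", "ID"]:
            continue  # skip blank lines and the header row
        if len(fields) >= 2 and POOL_ID.match(fields[1]):
            # A new pool row: alias, ID, and optionally the first member.
            current = fields[0]
            pools[current] = {"id": fields[1], "members": fields[2:]}
        elif current is not None:
            # A continuation row that only carries further members.
            pools[current]["members"].extend(fields)
    return pools


sample = """\
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     s103_s104
archive   storage:2  target_0-6734A781-1
mirrored  storage:3
"""
pools = parse_pool_list(sample)
print(pools["default"]["members"])
```

A pool without members, such as mirrored in the sample, yields an empty list rather than being dropped.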
A new storage pool can be added using the beegfs pool create command. It requires the list of targets and/or buddy groups to assign to the pool, along with the alias and, optionally, a numeric ID for the pool. If no ID is specified, the first unused ID is used. Adding a target or group to a storage pool removes it from the default storage pool, as targets and groups can only be part of one storage pool at a time.
$ beegfs pool create --targets=target_0-6734A781-1,target_0-6734A784-2 archive
Pool created: archive[storage:2, uid:46]
$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     target_2-6734A781-1
                     target_3-6734A781-1
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
                     s103_s104
archive   storage:2  target_0-6734A781-1
                     target_0-6734A784-2
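The bookkeeping described above (one pool per target, first unused ID when none is given) can be pictured with a small toy model. This is only an illustrative sketch and not how the management service is implemented: the class PoolRegistry and the short target names are hypothetical.

```python
class PoolRegistry:
    """Toy model: every target belongs to exactly one pool at a time."""

    DEFAULT_ID = 1

    def __init__(self, targets):
        self.aliases = {self.DEFAULT_ID: "default"}
        # All targets start out in the default pool.
        self.pool_of = {target: self.DEFAULT_ID for target in targets}

    def _first_unused_id(self):
        pool_id = 1
        while pool_id in self.aliases:
            pool_id += 1
        return pool_id

    def create(self, alias, targets, pool_id=None):
        unknown = [t for t in targets if t not in self.pool_of]
        if unknown:
            raise ValueError(f"unknown targets: {unknown}")
        if pool_id is None:
            pool_id = self._first_unused_id()
        elif pool_id in self.aliases:
            raise ValueError(f"pool ID {pool_id} is already in use")
        self.aliases[pool_id] = alias
        for target in targets:
            # Re-pointing the target implicitly removes it from its old pool.
            self.pool_of[target] = pool_id
        return pool_id

    def members(self, pool_id):
        return [t for t, p in self.pool_of.items() if p == pool_id]


registry = PoolRegistry(["t0a", "t1a", "t0b", "t1b"])
archive_id = registry.create("archive", ["t0a", "t0b"])
print(archive_id, registry.members(1), registry.members(archive_id))
```

Because membership is stored as a single pool ID per target, a target can never appear in two pools, which mirrors the behavior seen in the listing.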
Targets that are members of a buddy group cannot be assigned directly to a pool. Instead, assign the buddy group to the pool; its targets are then moved to the pool automatically:
$ beegfs pool create --targets=target_2-6734A781-1,target_3-6734A781-1 mirrored
Error: rpc error: code = Unknown desc = Create pool: Target target_2-6734A781-1 can't be assigned directly as it's part of a buddy group
$ beegfs pool create --groups s103_s104 mirrored
Pool created: mirrored[storage:3, uid:48]
$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
archive   storage:2  target_0-6734A781-1
                     target_0-6734A784-2
mirrored  storage:3  target_2-6734A781-1
                     target_3-6734A781-1
                     s103_s104
The command beegfs pool delete removes an existing storage pool. Before deleting a pool, its members must be assigned to other pools using the beegfs pool assign command:
$ beegfs pool delete mirrored
Error: rpc error: code = Unknown desc = Delete pool: 2 targets and 1 buddy groups are still assigned to this pool
$ beegfs pool assign --groups s103_s104 default
Pool assigned: default[storage:1, uid:2]
$ beegfs pool list
ALIAS     ID         MEMBERS
default   storage:1  target_1-6734A781-1
                     target_2-6734A781-1
                     target_3-6734A781-1
                     target_1-6734A784-2
                     target_2-6734A784-2
                     target_3-6734A784-2
                     s103_s104
archive   storage:2  target_0-6734A781-1
                     target_0-6734A784-2
mirrored  storage:3
$ beegfs pool delete mirrored
Pool can be deleted: mirrored[storage:3, uid:48]
If you really want to delete it, please add the --yes flag to the command.
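When this drain-then-delete sequence has to be scripted, it may help to generate the command lines first and review them before running anything. The sketch below only builds argument lists and executes nothing. The helper name drain_and_delete_commands is hypothetical, and it assumes that pool assign accepts --targets in the same form as pool create and that several groups can be passed as a comma-separated list; check beegfs pool assign --help before relying on either.

```python
def drain_and_delete_commands(pool, targets, groups, group_members,
                              destination="default"):
    """Build the commands that empty `pool` and then delete it.

    targets:       target aliases currently listed in the pool
    groups:        buddy group aliases currently listed in the pool
    group_members: dict mapping a group alias to its member target aliases
    """
    # Targets that belong to a buddy group move with their group and must
    # not be assigned directly.
    mirrored = set()
    for group in groups:
        mirrored |= set(group_members.get(group, ()))
    standalone = [t for t in targets if t not in mirrored]

    commands = []
    if groups:
        commands.append(["beegfs", "pool", "assign",
                         "--groups", ",".join(groups), destination])
    if standalone:
        # Assumption: --targets works for `pool assign` as it does for `pool create`.
        commands.append(["beegfs", "pool", "assign",
                         "--targets=" + ",".join(standalone), destination])
    commands.append(["beegfs", "pool", "delete", pool, "--yes"])
    return commands


# The group membership below is illustrative only.
cmds = drain_and_delete_commands(
    "mirrored",
    targets=["target_2-6734A781-1", "target_3-6734A781-1"],
    groups=["s103_s104"],
    group_members={"s103_s104": {"target_2-6734A781-1", "target_3-6734A781-1"}},
)
for argv in cmds:
    print(" ".join(argv))
```

Keeping the commands as lists makes it straightforward to pass them to subprocess.run later without shell quoting issues.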
Warning
When a pool is removed, existing files and directories that reference its pool ID are not updated and continue to reference the now-deleted pool ID. This prevents pool deletions from accidentally triggering mass reconfiguration/restriping of entries in the file system. Existing files remain on the same targets even though those targets are now in a different pool, and directories continue to refer to the deleted pool.
Note that new files cannot be created in directories that reference the deleted pool until either their stripe pattern is updated or a new pool with the same ID is created. Use entry set to update the pool on existing directories and entry migrate to move existing files to new targets if desired.
The alias assigned to a pool can be changed at any time using beegfs pool set-alias.
Assigning a Directory Stripe Pattern
The following command sets the default pattern of the directory first_dir so that it uses only targets from the archive storage pool (ID 2).
$ beegfs entry set --pool=archive first_dir/
Processed 1 entries.
Configuration Updates: Pool (archive)
All newly created files and files copied into that directory will only be stored on targets belonging to the archive pool. To move existing files between pools, see Migrating Files between Storage Pools.
For example, after touching a new file, its entry info reads like this:
$ beegfs entry info first_dir/my_file
PATH                ENTRY ID      TYPE  META NODE        META MIRROR  STORAGE POOL  STRIPE PATTERN  STORAGE TARGETS  BUDDY GROUPS  REMOTE TARGETS  COOL DOWN
/first_dir/my_file  0-6758BC3C-2  file  node_meta_2 (2)  1            archive (2)   RAID0 (4x1M)    101,201          (unmirrored)  (none)          (n/a)
Note that even though the (default) stripe pattern in this example specifies four targets, only two targets are used. This is because storage pool 2 only contains two targets. When a pool is specified, targets outside this pool will not be used.
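The effect of a small pool on a wider stripe pattern can be illustrated with a few lines of arithmetic. This is a simplified sketch, not the actual target chooser: it assumes the first N pool targets are picked and that chunks are distributed round-robin in the textbook RAID0 manner. Both function names are hypothetical.

```python
MiB = 1024 * 1024


def effective_targets(requested, pool_targets):
    """Targets actually used: at most `requested`, never outside the pool."""
    return pool_targets[:min(requested, len(pool_targets))]


def target_for_offset(offset, chunk_size, targets):
    """Textbook RAID0: chunk i of the file lands on target i mod len(targets)."""
    return targets[(offset // chunk_size) % len(targets)]


# A 4x1M pattern applied to a pool that only holds targets 101 and 201.
used = effective_targets(4, [101, 201])
print(used)  # [101, 201]
print([target_for_offset(i * MiB, MiB, used) for i in range(4)])  # [101, 201, 101, 201]
```

With only two targets available, every second 1 MiB chunk lands on the same target, so the file gets a stripe width of two instead of four.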
Note that the entry set command has more command line arguments giving you more fine-grained control over the data placement. For example, the number of targets and the chunk size can be controlled. beegfs entry set --help will print an explanation of all available options. To allow unprivileged users to use the entry set command, it is necessary to set the option sysAllowUserSetPattern = true in /etc/beegfs/beegfs-meta.conf.
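To check from a script whether this option is enabled, the configuration text can be scanned for the key. The sketch below assumes the usual key = value layout with # comments and treats a missing key as disabled, which is an assumption about the default. The function name allows_user_set_pattern is hypothetical.

```python
def allows_user_set_pattern(conf_text):
    """Return True if sysAllowUserSetPattern is set to true in the given text."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "sysAllowUserSetPattern":
            return value.lower() == "true"
    return False  # assumption: the option is off when it is not set


example = """\
# metadata service settings
sysAllowUserSetPattern = true
"""
print(allows_user_set_pattern(example))
```

In practice the text would be read from /etc/beegfs/beegfs-meta.conf on the metadata node.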
Creating Files
By default, new files inherit the stripe pattern, and therefore the storage pool setting, from their parent directory. beegfs provides a way to override this setting, allowing the user to specify the storage pool a file’s contents reside on. On the command line, beegfs entry create file can be used to create a new, empty file on a specific storage pool:
$ beegfs entry create file --pool=archive new_archived_file
NAME                STATUS   ENTRY ID      TYPE
/new_archived_file  Success  0-6758BCB7-2  file
Much like the entry set mode, the entry create file mode supports many more options, like setting the stripe pattern and chunk size. Use beegfs entry create file --help to get an overview and explanation of all available options.
The stripe pattern settings of a file can be verified using beegfs entry info, just like those of a directory.
$ beegfs entry info . new_archived_file
PATH                ENTRY ID      TYPE       META NODE        META MIRROR  STORAGE POOL  STRIPE PATTERN  STORAGE TARGETS  BUDDY GROUPS  REMOTE TARGETS  COOL DOWN
/                   root          directory  node_meta_2 (2)  1            default (1)   RAID0 (4x1M)    (directory)      (directory)   (none)          (n/a)
/new_archived_file  0-6758BCB7-2  file       node_meta_2 (2)  1            archive (2)   RAID0 (4x1M)    101,201          (unmirrored)  (none)          (n/a)
Notice how, instead of inheriting the default storage pool from its parent directory (root), new_archived_file is assigned to the archive pool and only uses targets from that pool. This new file is empty and ready to be written to by any application, and its data will be striped across targets 101 and 201.
Migrating Files between Storage Pools
Files can be migrated from one storage pool to another using the beegfs tool. For example, if you had written some data to the archive/ directory and wanted to bring it back to the default pool, you could run:
$ sudo beegfs entry migrate --from-pools archive --pool default archive/ --recurse --yes
Migration statistics: {MigrationStatusUnknown:0 MigrationErrors:0 MigrationNotSupported:0 MigrationSkipped:1 MigrationNotNeeded:0 MigrationNeeded:0 MigratedFiles:5 MigratedDirectories:0}
This recursively migrates all files under the archive/ directory assigned to the archive pool to the default pool. By default only files are migrated and the pool set on the archive/ directory (and any subdirectories) is not modified, but it can also be updated as part of the migration by specifying --update-directories or later by using the beegfs entry set command.
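When migrations run from a script, the statistics line printed above is a convenient place to detect problems. The following is a minimal sketch that assumes the exact Name:count layout shown in the example output. The function name parse_migration_stats is hypothetical.

```python
import re


def parse_migration_stats(line):
    """Turn 'Migration statistics: {Name:0 Other:5 ...}' into a dict of ints."""
    body = line[line.index("{") + 1:line.rindex("}")]
    return {name: int(count) for name, count in re.findall(r"(\w+):(\d+)", body)}


line = ("Migration statistics: {MigrationStatusUnknown:0 MigrationErrors:0 "
        "MigrationNotSupported:0 MigrationSkipped:1 MigrationNotNeeded:0 "
        "MigrationNeeded:0 MigratedFiles:5 MigratedDirectories:0}")
stats = parse_migration_stats(line)
print(stats["MigratedFiles"], stats["MigrationErrors"])
```

A wrapper could, for instance, treat a non-zero MigrationErrors count as a failure even if the command itself returned normally.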
Warning
If any files in the source pool have buddy mirroring enabled, ensure there is at least one buddy group in the destination pool, otherwise those files cannot be migrated.
Extra Notes and Caveats
Note that storage pools are different from capacity pools. Capacity pools (as seen, e.g., by running beegfs health capacity) are used by BeeGFS to balance free space on the targets within a capacity pool.
If buddy groups are defined, targets in a buddy group are always in the same storage pool as their buddy. Targets and buddy groups can not be moved between pools. They have to be removed from one pool, effectively moving them back to the default pool, before they can be added to another pool.
Changing the storage pool for a directory only affects new files in that directory. Existing files stay in the storage pool they were created in and have to be moved over manually. For instance, the following command will create a new copy of the file in the new storage pool, and then move it over the original file’s name, removing the old copy:
cp file file.tmp && mv file.tmp file
The same applies to moving a file into a directory. If the file was on the same BeeGFS instance before, it keeps its existing stripe pattern and storage pool. Copying a file into the directory creates a new copy of the file and applies the storage pool setting from the directory, as does moving a file into the directory from outside the BeeGFS instance.
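The copy-then-rename step shown with cp and mv can also be done from a script. The sketch below is one possible equivalent and is not part of the BeeGFS tooling: it creates the temporary copy in the same directory, so the new file is created under that directory's current settings, and then renames it over the original. The function name rewrite_file is hypothetical. Like the shell version, it does not preserve hard links and should not be used on files that are open for writing.

```python
import os
import shutil
import tempfile


def rewrite_file(path):
    """Copy `path` to a new file in the same directory, then rename over it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rewrite-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # copies data plus mode and timestamps
        os.replace(tmp_path, path)    # rename over the original name
    except BaseException:
        # Do not leave the temporary copy behind if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "file")
    with open(demo_path, "w") as handle:
        handle.write("data")
    rewrite_file(demo_path)
    with open(demo_path) as handle:
        print(handle.read())
```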
When all targets have been removed from a pool, it is no longer possible to create files in directories that have that pool selected as their storage pool. The application receives a “Remote I/O error”, and the metadata server log gives the reason: “No storage targets available”. This can happen, for example, when a file system has been freshly created and all targets have been moved to other storage pools. The “default” pool is then empty, but still selected as the pool for the root directory of the file system. To resolve this situation, select a pool that contains at least one target for that directory using the command beegfs entry set --pool=<entityID>, so that new files can be created in that directory again.
When using the migrate command, files are migrated based on the targets/groups they are currently using. The list of targets/groups to migrate away from is determined either by explicitly specifying individual targets, by selecting all targets of a particular node, or by selecting all targets assigned to a particular pool. Keep in mind that the --from-pools=<entityID> parameter (as used in the migration example above) does not consider the pool in which files were originally created, but rather the pool to which their targets/groups are currently assigned, which may have changed if the targets were moved between pools.
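The difference between "the pool a file was created in" and "the pool its targets are in now" is easy to see in a small model. This sketch assumes that a file is selected when at least one of its targets currently belongs to the source pool. The function name files_to_migrate and all file names and target IDs are hypothetical.

```python
def files_to_migrate(file_targets, pool_members, from_pool):
    """Select files that currently use at least one target of `from_pool`."""
    source = set(pool_members[from_pool])
    return sorted(name for name, targets in file_targets.items()
                  if source & set(targets))


file_targets = {"a.dat": [101, 201], "b.dat": [102]}

# a.dat was created while targets 101 and 201 were both in the archive pool.
before = {"archive": [101, 201], "default": [102]}
# Later, target 101 was moved to the default pool.
after = {"archive": [201], "default": [101, 102]}

print(files_to_migrate(file_targets, before, "default"))
print(files_to_migrate(file_targets, after, "default"))
```

After target 101 changes pools, a.dat is selected when migrating away from the default pool, even though it was never created there.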