Storage Pools

Storage pools can be considered a template for the stripe pattern BeeGFS uses to assign file chunks to specific storage targets. Instead of directly assigning a stripe pattern, a user can now assign a storage pool, which is much easier and more intuitive. The storage pool assignment can be part of the directory metadata, and therefore apply to all newly created files in a particular directory, or it can be part of the file metadata, applying only to that file and overriding the directory default.

On a freshly installed BeeGFS, or after an update from an older version which does not support storage pools, all targets will be in the “Default Pool”. You can check the target assignment using beegfs-ctl --liststoragepools:

$ beegfs-ctl --liststoragepools
Pool ID   Pool Description                      Targets                 Buddy Groups
======= ================== ============================ ============================
      1            Default 1,2,3,4,5,6,7,8

Another way to check the storage pools is using beegfs-ctl --listtargets --storagepools, which prints a list of targets, and the storage pool ID for each target.

A new storage pool can be added using the --addstoragepool mode:

$ beegfs-ctl --liststoragepools
Pool ID   Pool Description                      Targets                 Buddy Groups
======= ================== ============================ ============================
      1            Default 1,2,3,4,5,6,7,8

$ beegfs-ctl --addstoragepool --desc="first pool" --targets=1,5
Successfully created storage pool.

ID: 2; Description: first pool

$ beegfs-ctl --addstoragepool --desc="second pool" --targets=2,6
Successfully created storage pool.

ID: 3; Description: second pool

$ beegfs-ctl --addstoragepool --desc="third pool" --targets=3,7
Successfully created storage pool.

ID: 4; Description: third pool

$ beegfs-ctl --liststoragepools
Pool ID   Pool Description                      Targets                 Buddy Groups
======= ================== ============================ ============================
      1            Default 4,8
      2         first pool 1,5
      3        second pool 2,6
      4         third pool 3,7

The command line arguments to --addstoragepool include a descriptive text for the storage pool, the list of targets to initially add, and optionally a numeric ID to use for the new storage pool. If no ID is specified, the first unused ID will be used.

Adding a target to a storage pool will remove it from the Default storage pool, as each target can only be part of one storage pool at a time.
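Because every target belongs to exactly one pool, the pool listing can be inverted into a target-to-pool lookup. The following is a minimal sketch, not part of BeeGFS: the helper name pool_of_target is made up for illustration, the sample text is the listing shown above, and the parsing assumes that the Buddy Groups column is empty, so that the target list is the last field of each row. On a live system the input would come from beegfs-ctl --liststoragepools, and the column layout may differ between versions.

```shell
#!/bin/sh
# Sample text copied from the --liststoragepools output shown above.
sample_output='Pool ID   Pool Description                      Targets                 Buddy Groups
======= ================== ============================ ============================
      1            Default 4,8
      2         first pool 1,5
      3        second pool 2,6
      4         third pool 3,7'

# pool_of_target TARGET_ID (hypothetical helper)
# Reads a pool listing on stdin and prints the ID of the pool that
# contains TARGET_ID. Prints nothing if the target is not listed.
# Assumption: the Buddy Groups column is empty, so the last field of
# each data row is the comma-separated target list.
pool_of_target() {
    awk -v want="$1" '
        NR <= 2 { next }                      # skip header and separator rows
        NF >= 3 {
            n = split($NF, ids, ",")          # break "3,7" into single IDs
            for (i = 1; i <= n; i++)
                if (ids[i] == want) { print $1; exit }
        }
    '
}

printf '%s\n' "$sample_output" | pool_of_target 7   # prints 4
```

Treat the parsing as illustrative only: a pool that lists buddy groups, or an empty pool whose description ends in a number, would need a stricter parser.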

The mode --removestoragepool deletes an existing storage pool. Targets belonging to that storage pool are moved back to the Default pool.

$ beegfs-ctl --removestoragepool 4
Removing storage pool 4 will move all targets of the pool to the default pool.
Furthermore, ID 4 will be reused for new pools, i.e. if the pool ID is still
referenced in stripe patterns, the stripe patterns will automatically reference any
new pool with ID 4.

Do you really want to continue? (y/n)
y
Successfully removed storage pool with ID 4.

$ beegfs-ctl --liststoragepools
Pool ID   Pool Description                      Targets                 Buddy Groups
======= ================== ============================ ============================
      1            Default 3,4,7,8
      2         first pool 1,5
      3        second pool 2,6

The mode --modifystoragepool lets you modify an existing pool. Targets and mirror buddy groups can be added to existing storage pools or removed from them. Furthermore, the description text of the pool can be changed.

Assigning a Directory Stripe Pattern

The following command sets the default pattern of the directory first_dir so that it uses only targets from the storage pool with ID 2.

$ beegfs-ctl --setpattern --storagepoolid=2 first_dir
New storage pool ID: 2

All newly created files and files copied into that directory will only be stored on targets belonging to storage pool 2. To move existing files between pools, see Migrating Files between Storage Pools.

For example, after touching a new file, its entry info reads like this:

$ beegfs-ctl --getentryinfo first_dir/my_file
EntryID: 0-5A041C08-1
Metadata node: fslab-c18.beeg.local [ID: 1]
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 512K
+ Number of storage targets: desired: 4; actual: 2
+ Storage targets:
  + 5 @ fslab-c19.beeg.local [ID: 2]
  + 1 @ fslab-c18.beeg.local [ID: 1]

Note that even though the (default) stripe pattern in this example specifies four targets, only two targets are used. This is because storage pool 2 only contains two targets. When a pool is specified, targets outside this pool will not be used.
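The gap between the desired and the actual number of targets can also be read out of the entry info by a script. The sketch below works on the sample output shown above; the helper names stripe_targets and stripe_shortfall are hypothetical, and the field positions are an assumption based on that sample only.

```shell
#!/bin/sh
# Sample text copied from the --getentryinfo output shown above.
entry_info='EntryID: 0-5A041C08-1
Metadata node: fslab-c18.beeg.local [ID: 1]
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 512K
+ Number of storage targets: desired: 4; actual: 2
+ Storage targets:
  + 5 @ fslab-c19.beeg.local [ID: 2]
  + 1 @ fslab-c18.beeg.local [ID: 1]'

# stripe_targets (hypothetical helper): print the target IDs listed
# under "Storage targets:", one per line, in the order they appear.
stripe_targets() {
    awk '
        /^[+] Storage targets:/ { in_list = 1; next }
        in_list && $1 == "+" && $3 == "@" { print $2 }
    '
}

# stripe_shortfall (hypothetical helper): print desired minus actual.
# The matched line looks like:
#   + Number of storage targets: desired: 4; actual: 2
stripe_shortfall() {
    awk '/Number of storage targets:/ {
        desired = $7; actual = $9
        sub(/;/, "", desired)                 # strip the trailing ";" from "4;"
        print desired - actual
    }'
}

printf '%s\n' "$entry_info" | stripe_targets     # prints 5, then 1
printf '%s\n' "$entry_info" | stripe_shortfall   # prints 2
```

A shortfall greater than zero, as in this example, means that the pool holds fewer targets than the stripe pattern asks for.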

The --setpattern mode has more command line arguments that give you more fine-grained control over the data placement. For example, the number of targets and the chunk size can be controlled. beegfs-ctl --setpattern --help will print an explanation of all available options. To allow unprivileged users to use the --setpattern command, it is necessary to set the option sysAllowUserSetPattern = true in /etc/beegfs/beegfs-meta.conf.
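Whether unprivileged users are allowed to set patterns can be checked by looking for the option in the metadata server configuration. The following is a sketch under assumptions: the helper name user_setpattern_allowed is invented here, the file is treated as plain key = value lines with # comments, and the last assignment is assumed to win if the option appears more than once.

```shell
#!/bin/sh
# user_setpattern_allowed CONF_FILE (hypothetical helper)
# Succeeds (exit status 0) only if CONF_FILE sets
# sysAllowUserSetPattern to true.
# Assumptions: "key = value" lines, "#" starts a comment line,
# and the last assignment in the file takes effect.
user_setpattern_allowed() {
    awk -F= '
        /^[[:space:]]*#/ { next }             # ignore comment lines
        {
            key = $1; val = $2
            gsub(/[[:space:]]/, "", key)      # drop padding around the key
            gsub(/[[:space:]]/, "", val)      # drop padding around the value
            if (key == "sysAllowUserSetPattern") last = val
        }
        END { exit !(last == "true") }
    ' "$1"
}
```

On a metadata server it could be called as user_setpattern_allowed /etc/beegfs/beegfs-meta.conf && echo "users may set patterns".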

Creating Files

By default, new files inherit the stripe pattern, and therefore the storage pool setting, from their parent directory. beegfs-ctl provides a way to override this setting, allowing the user to specify the storage pool a file’s contents reside on.

On the command line, beegfs-ctl can be used to create a new, empty file on a specific storage pool:

$ beegfs-ctl --createfile --storagepoolid=2 new_file

Much like the --setpattern mode, the --createfile mode supports many more options, like setting the stripe pattern and chunk size. beegfs-ctl --createfile --help can be used to get an overview and explanation of all available options.

The stripe pattern settings of a file can be verified using beegfs-ctl --getentryinfo, just like those of a directory.

$ beegfs-ctl --getentryinfo new_file
EntryID: 0-5A0446CE-1
Metadata node: fslab-c18.beeg.local [ID: 1]
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 512K
+ Number of storage targets: desired: 4; actual: 2
+ Storage targets:
  + 1 @ fslab-c18.beeg.local [ID: 1]
  + 5 @ fslab-c19.beeg.local [ID: 2]

The list of storage targets used for this file contains only targets which belong to the storage pool specified when the file was created. The file created by beegfs-ctl is still empty. It can now be opened and filled with data by any application. All data written to the file will go to the specified targets.
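The check that a file only uses targets of its pool can be scripted as a simple membership test. This is a minimal sketch: the helper name targets_in_pool is hypothetical, and both the pool's target list and the file's target IDs are typed in by hand here rather than parsed from beegfs-ctl output.

```shell
#!/bin/sh
# targets_in_pool POOL_TARGETS (hypothetical helper)
# Reads target IDs on stdin, one per line, and succeeds only if every
# ID appears in the comma-separated POOL_TARGETS list (e.g. "1,5").
targets_in_pool() {
    pool=",$1,"                               # pad so that ",1," cannot match ",11,"
    while read -r id; do
        case "$pool" in
            *",$id,"*) ;;                     # target belongs to the pool
            *) echo "target $id is outside the pool" >&2
               return 1 ;;
        esac
    done
    return 0
}

# Targets 1 and 5 of the example file against a pool holding targets 1 and 5.
printf '1\n5\n' | targets_in_pool 1,5 && echo "all targets in pool"
```

With a target list such as 1 and 3 the function reports target 3 and returns a non-zero status.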

Migrating Files between Storage Pools

Files can be migrated from one storage pool to another using the beegfs-ctl tool.

$ beegfs-ctl --migrate --storagepoolid=<from> --destinationpoolid=<to> <path>

This will move everything below <path> to storage targets of the destination pool. Note: If files from the source pool have buddy mirroring enabled, make sure that there is at least one buddy mirrored target in the destination pool. Otherwise, these files can’t be moved.
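Since a migration touches everything below the given path, it can help to assemble and review the command before running it. The sketch below only prints the command and never executes it; the helper name migrate_cmd and the path /mnt/beegfs/first_dir are hypothetical, while the options are the ones shown above.

```shell
#!/bin/sh
# migrate_cmd FROM_POOL TO_POOL PATH (hypothetical helper)
# Prints the beegfs-ctl migrate command for review; it does not run it.
migrate_cmd() {
    if [ "$#" -ne 3 ]; then
        echo "usage: migrate_cmd <from-pool-id> <to-pool-id> <path>" >&2
        return 2
    fi
    for id in "$1" "$2"; do
        case "$id" in
            ''|*[!0-9]*)                      # empty or containing a non-digit
                echo "pool IDs must be numeric: '$id'" >&2
                return 2 ;;
        esac
    done
    if [ "$1" = "$2" ]; then
        echo "source and destination pool are the same" >&2
        return 2
    fi
    printf 'beegfs-ctl --migrate --storagepoolid=%s --destinationpoolid=%s %s\n' \
        "$1" "$2" "$3"
}

migrate_cmd 2 3 /mnt/beegfs/first_dir
```

The printed line is meant for reading, not for piping back into a shell: a path containing spaces would need quoting before it is executed.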

Extra Notes and Caveats

Note that storage pools are different from capacity pools. Capacity pools (as seen, for example, by running beegfs-df) are used by BeeGFS to balance free space on the targets within a capacity pool.

If buddy groups are defined, targets in a buddy group are always in the same storage pool as their buddy. Targets and buddy groups cannot be moved directly between pools. They have to be removed from one pool, which effectively moves them back to the default pool, before they can be added to another pool.

Changing the storage pool for a directory only affects new files in that directory. Existing files stay in the storage pool they were created in and have to be moved over manually. For instance, the following command will create a new copy of the file in the new storage pool, and then move it over the original file’s name, removing the old copy:

cp file file.tmp && mv file.tmp file

The same applies to moving a file into a directory. If the file was on the same BeeGFS instance before, it keeps its existing stripe pattern and storage pool. Copying a file into the directory creates a new copy of the file and applies the storage pool setting of the directory, as does moving a file into it from outside the BeeGFS instance.
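The copy-and-rename one-liner shown above can be wrapped in a small helper that verifies the copy before it replaces the original. This is a sketch under assumptions, not a BeeGFS tool: the name rewrite_in_place is invented, mktemp is assumed to be available, and the file must not be written to by another process while it is being copied.

```shell
#!/bin/sh
# rewrite_in_place FILE (hypothetical helper)
# Replaces FILE with a fresh copy of itself. The copy is created in the
# same directory, so it is a new file there and is striped according to
# that directory's current settings.
rewrite_in_place() {
    src=$1
    [ -f "$src" ] || { echo "not a regular file: $src" >&2; return 1; }
    dir=$(dirname "$src")
    tmp=$(mktemp "$dir/.rewrite.XXXXXX") || return 1
    # -p keeps mode and timestamps; cmp confirms the copy is identical
    # before the original is replaced.
    if cp -p "$src" "$tmp" && cmp -s "$src" "$tmp"; then
        mv -f "$tmp" "$src"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Unlike the bare one-liner, a failed or incomplete copy leaves the original file untouched. Hard links to the original are not preserved, because the file is replaced by a new one.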

When all targets have been removed from a pool, it is no longer possible to create files in directories that have that pool selected as their storage pool. The application will receive a “Remote I/O error”, and the metadata server log will give the reason: “No storage targets available”. This can happen, for example, when a file system has been freshly created and all targets have been moved to storage pools. The “Default” pool is then empty, but still selected as the pool for the root directory of the file system. To resolve this situation, select a pool that contains at least one target for that directory, using the command beegfs-ctl --setpattern --storagepoolid=..., so that new files can be created in that directory.

When using the --migrate mode to migrate files out of a pool that has been changed in the meantime (targets removed, or the pool even emptied), keep in mind that the --storagepoolid=<from> parameter refers not to the pool the file was created in, but to the pool the file currently resides in. If targets are moved to another pool, this ID changes as well.