Striping

Striping in BeeGFS can be configured on a per-directory and per-file basis. Each directory has its own stripe pattern configuration, which is inherited by new subdirectories and applied to every file created inside that directory. Two basic stripe pattern parameters can currently be configured: the desired number of storage targets for each file and the chunk size (or block size) for each file stripe.
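To make the two parameters concrete, the following minimal sketch models plain round-robin striping, as in a RAID0 pattern. The function name `locate_chunk` and the zero-based "target slot" numbering are illustrative assumptions, not part of BeeGFS. The slot is a position within the file's list of stripe targets, not a BeeGFS target ID.

```python
def locate_chunk(offset: int, chunk_size: int, num_targets: int) -> tuple[int, int]:
    """Return (chunk_index, target_slot) for a byte offset in a striped file.

    Assumes simple round-robin placement: chunk 0 goes to slot 0,
    chunk 1 to slot 1, and so on, wrapping around after num_targets chunks.
    """
    if offset < 0 or chunk_size <= 0 or num_targets <= 0:
        raise ValueError("offset must be >= 0; chunk_size and num_targets must be > 0")
    chunk_index = offset // chunk_size
    target_slot = chunk_index % num_targets
    return chunk_index, target_slot


MIB = 1024 * 1024

# With 4 targets and 1 MiB chunks, the fifth chunk wraps back to slot 0.
for offset in (0, 1 * MIB, 3 * MIB + 10, 4 * MIB):
    print(offset, locate_chunk(offset, chunk_size=MIB, num_targets=4))
```

Under this model, a larger chunk size keeps more contiguous data on one target, while a larger number of targets spreads a file over more servers.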

The stripe pattern parameters can be configured with the command-line control tool beegfs. This tool lets you inspect or change the stripe pattern of files and directories in the file system at runtime. The following command shows the current stripe settings of the root directory of your BeeGFS mount (in this case /mnt/beegfs):

$ beegfs entry info /mnt/beegfs/
PATH  ENTRY ID       TYPE  META NODE        META MIRROR  STORAGE POOL  STRIPE PATTERN  STORAGE TARGETS  BUDDY GROUPS  REMOTE TARGETS  COOL DOWN
/     root      directory  node_meta_2 (2)            1  default (1)   RAID0 (4x1M)    (directory)      (directory)   (none)          (n/a)
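The STRIPE PATTERN column packs the pattern type, the number of targets, and the chunk size into one field. As a rough sketch, the field can be split as shown below. The helper `parse_stripe_pattern` is hypothetical, and the `TYPE (NxSIZE)` layout is assumed from the single sample above. Other patterns or tool versions may print the field differently.

```python
import re


def parse_stripe_pattern(field: str) -> dict:
    """Split a field such as 'RAID0 (4x1M)' into its three components.

    The layout 'TYPE (TARGETSxCHUNKSIZE)' is an assumption based on the
    sample output; it is not a documented, stable format.
    """
    match = re.fullmatch(r"\s*(\S+)\s*\((\d+)x(\w+)\)\s*", field)
    if match is None:
        raise ValueError(f"unrecognized stripe pattern field: {field!r}")
    pattern_type, num_targets, chunk_size = match.groups()
    return {
        "type": pattern_type,
        "num_targets": int(num_targets),
        "chunk_size": chunk_size,  # kept as text, e.g. '1M'
    }


print(parse_stripe_pattern("RAID0 (4x1M)"))
```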

The entry info can also be printed in the horizontal format from BeeGFS 7 by adding the --retro flag, and additional details can be printed with --verbose:

$ beegfs entry info /mnt/beegfs/ --retro --verbose
Entry type: directory
EntryID: root
ParentID:
Stripe pattern details:
+ Type: RAID0
+ Chunksize: 1M
+ Number of storage targets: desired: 4
+ Storage Pool: 1  (default)
Inlined inode: no
Inode info:
+ Path: 38/51/root
+ Metadata buddy group: 1
+ Current primary metadata node: node_meta_2 [ID: 2]

Use the command entry set to apply new striping settings to a directory. For example, to stripe files across 4 storage targets with a chunk size of 4 MiB, run:

$ beegfs entry set --num-targets=4 --chunk-size=4MiB /mnt/beegfs/
Processed 1 entries.
Configuration Updates: Chunksize (4194304), DefaultNumTargets (4)
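The Chunksize value in the output above is reported in bytes. The following sketch shows the conversion. The helper `size_to_bytes` is hypothetical and handles only the binary suffixes KiB, MiB, and GiB; it does not claim to cover every size format that the beegfs tool accepts.

```python
UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}


def size_to_bytes(size: str) -> int:
    """Convert a string such as '4MiB' to a byte count (binary units only)."""
    for suffix, factor in UNITS.items():
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    return int(size)  # no suffix: treat the value as plain bytes


print(size_to_bytes("4MiB"))  # 4194304
print(size_to_bytes("1MiB"))  # 1048576
```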

Stripe settings will be applied to new files, not to existing files in the directory. To apply the pattern to existing files, recreate them by performing a deep copy or using the entry migrate command.
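A deep copy can be sketched as follows, assuming the file is not being written by another process during the copy. The helper name `recreate_file` is illustrative. It creates a new file in the same directory, so the new file picks up that directory's current stripe settings, and then renames it over the original.

```python
import os
import shutil
import tempfile


def recreate_file(path: str) -> None:
    """Replace `path` with a fresh copy created in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created inside the target directory on purpose:
    # a newly created file is what receives the directory's stripe pattern.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".restripe-")
    os.close(fd)
    try:
        shutil.copy2(path, tmp_path)  # copies data, timestamps, and mode bits
        os.replace(tmp_path, path)    # rename the copy over the original
    except BaseException:
        # Clean up the partial copy if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This sketch does not preserve file ownership or hard links, and it temporarily needs free space for a second copy of the file. A call could look like `recreate_file("/mnt/beegfs/data/results.bin")`, where the path is a made-up example.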

Buddy Mirroring

If you have buddy mirror groups (see Mirroring) defined in your system, you can set the stripe pattern to use buddy groups as stripe targets instead of individual storage targets. To do so, add the option --pattern=mirrored to the command, as follows. In this particular example, the data will be striped across 4 buddy groups with a chunk size of 1 MiB:

$ beegfs entry set --num-targets=4 --chunk-size=1MiB --pattern=mirrored /mnt/beegfs/
Processed 1 entries.
Configuration Updates: Chunksize (1048576), DefaultNumTargets (4), StripePattern (Buddy Mirror)

To switch back to non-mirrored mode, set the pattern to raid0.

Impact on network communication

The data chunk size has an impact on the communication between client and storage servers in several ways.

When a process writes data to a file located on BeeGFS, the client identifies the storage targets that contain the data chunks to be modified (by querying the metadata servers) and sends modification messages to the storage servers that hold those chunks. The maximum size of such messages is determined by the data chunk size of the file.

If you define --chunk-size=1MiB, 1 MiB will be the maximum size of each message. If the amount of data written to the file is larger than the maximum message size, more messages have to be sent to the servers, which may cause performance loss. Slightly increasing the chunk size to a few MiB therefore reduces the number of messages, and this can have a positive performance impact, even in a system with a single target.
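The relationship between chunk size and message count can be estimated with a simple ceiling division. This is a simplified model, not a description of the actual BeeGFS protocol: it assumes the write starts on a chunk boundary and ignores RDMA buffer limits and client caching. The function name `message_count` is illustrative.

```python
def message_count(write_size: int, chunk_size: int) -> int:
    """Estimate the messages needed for a chunk-aligned write.

    Assumes each message carries at most one chunk of data.
    """
    if write_size < 0 or chunk_size <= 0:
        raise ValueError("write_size must be >= 0 and chunk_size must be > 0")
    return -(-write_size // chunk_size)  # ceiling division


MIB = 1024 * 1024

# A 64 MiB write with three different chunk sizes.
for chunk_size in (512 * 1024, 1 * MIB, 4 * MIB):
    print(chunk_size, message_count(64 * MIB, chunk_size))
```

In this model, a 64 MiB write needs 128 messages with 512 KiB chunks, 64 with 1 MiB chunks, and 16 with 4 MiB chunks.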

On the other hand, it is important to make sure that a data chunk fits into the RDMA buffers available on the client (see Client Node Tuning). Otherwise the messages are split, which again increases the number of messages.

You also have to consider the file cache settings. When the client is using the buffered cache (tuneFileCacheType = buffered), it uses a file cache buffer of 512 KiB to accumulate changes on the same data. This data is sent to the servers only when the client needs data from outside the boundaries of that buffer. So, the larger this buffer, the less communication is needed between the client and the servers. You should set this buffer size to a multiple of the data chunk size. For example, adding tuneFileCacheBufSize = 2097152 to the BeeGFS client configuration file raises the file cache buffer size to 2 MiB.
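The "multiple of the chunk size" recommendation is easy to check before editing the client configuration. The helper `is_chunk_multiple` below is a hypothetical convenience function, not part of BeeGFS. Both arguments are byte counts.

```python
def is_chunk_multiple(buf_size: int, chunk_size: int) -> bool:
    """Return True if buf_size is a positive whole multiple of chunk_size."""
    if buf_size <= 0 or chunk_size <= 0:
        raise ValueError("buf_size and chunk_size must be > 0")
    return buf_size % chunk_size == 0


MIB = 1024 * 1024

print(is_chunk_multiple(2097152, 1 * MIB))     # 2 MiB buffer, 1 MiB chunks
print(is_chunk_multiple(2097152, 4 * MIB))     # 2 MiB buffer, 4 MiB chunks
print(is_chunk_multiple(512 * 1024, 1 * MIB))  # 512 KiB buffer, 1 MiB chunks
```

Only the first combination satisfies the recommendation. In the other two cases the buffer is smaller than one chunk.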
