Striping API

The Striping API was developed during the European Commission-funded exascale project DEEP-ER. It lets application developers define some parameters of the stripe pattern. Because developers know the I/O pattern of each file their application uses, they can choose the striping settings per file to get the best performance for the application.

A file is striped across multiple storage targets. In addition to striping, the data can be mirrored for resiliency. Both the number of storage targets used for striping and the chunk size can be changed. The application developer can control these settings using the striping API. The file system administrator can also set the stripe pattern (raid0 and buddymirror, see Mirroring) using beegfs-ctl, while some striping settings (number of targets and chunk size) can also be changed by the user (see Striping for details).

../_images/striping.png

Chunks of file data are distributed across multiple storage targets. If the file is buddy-mirrored, each chunk is duplicated onto two targets. Each target can store chunks of unmirrored files and, if it is part of a buddy group, chunks of mirrored files as well.

The picture shows that Target 1 and Target 3 belong to BuddyMirrorGroup 1, while Target 2 and Target 4 belong to BuddyMirrorGroup 2. In this example every target belongs to a BuddyMirrorGroup so that it can store mirrored files, but the BeeGFS design allows unmirrored files to be stored on such a target as well.

File #1 is mirrored and striped across BuddyGroup 1 and BuddyGroup 2. The 1st chunk of File #1 is stored on Target 1 and Target 3, the 2nd chunk is stored on Target 2 and Target 4, the 3rd chunk is stored again on Target 1 and Target 3, and so on.

File #2 is mirrored on BuddyGroup 1 only and is not striped across several groups. The 1st chunk of File #2 is stored on Target 1 and Target 3, the 2nd chunk is stored again on Target 1 and Target 3, and so on.

File #3 is striped across all 4 targets, but not mirrored. The 1st chunk of File #3 is stored on Target 2, the 2nd chunk on Target 3, the 3rd chunk on Target 4, the 4th chunk on Target 1, the 5th chunk again on Target 2, and so on.
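The chunk placement described for the picture can be sketched in a few lines. This is only an illustration, assuming plain round-robin assignment as in the figure; the function names chunkIndexForOffset and stripeEntryForChunk are hypothetical and are not part of the BeeGFS API.

```cpp
#include <cstdint>
#include <vector>

// Index of the chunk that contains the given byte offset.
// chunkSize must be greater than 0.
std::uint64_t chunkIndexForOffset(std::uint64_t offset, std::uint64_t chunkSize)
{
   return offset / chunkSize;
}

// Round-robin selection: chunk i is stored on entry (i mod N) of the stripe list.
// For an unmirrored file the list holds target IDs, for a buddy-mirrored file it
// holds buddy group IDs (each group then stores the chunk on both of its targets).
// The list must not be empty.
unsigned stripeEntryForChunk(std::uint64_t chunkIndex,
   const std::vector<unsigned>& stripeList)
{
   return stripeList[chunkIndex % stripeList.size()];
}
```

With the stripe list {2, 3, 4, 1} of File #3, chunk indices 0 to 4 map to Targets 2, 3, 4, 1 and 2, which matches the description above. With the list {1, 2} of File #1, the chunks alternate between BuddyGroup 1 and BuddyGroup 2.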

See also

Architecture.

Installation

Install the package beegfs-client-devel or beegfs-client-dev, depending on your operating system, from the repository that matches your installed BeeGFS version. The package contains the required header files. A library is not required because the API uses ioctls (input/output control calls) to communicate with the file system. The header file is installed to /usr/include/beegfs/beegfs.h.
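Because the API consists of a header only, a build can check at compile time whether the development package is installed. The following is a minimal sketch, assuming a compiler that supports __has_include (standard since C++17, also offered by recent GCC and Clang in older modes). The macro HAVE_BEEGFS_HEADER and the function beegfsHeaderAvailable are hypothetical names chosen for this illustration.

```cpp
// Probe for the striping API header at compile time.
#if defined(__has_include)
#  if __has_include(<beegfs/beegfs.h>)
#    define HAVE_BEEGFS_HEADER 1
#  endif
#endif

#ifndef HAVE_BEEGFS_HEADER
#  define HAVE_BEEGFS_HEADER 0
#endif

// Returns true if this translation unit was compiled with the header available.
bool beegfsHeaderAvailable()
{
   return HAVE_BEEGFS_HEADER == 1;
}
```

Code that calls the striping API can then be wrapped in #if HAVE_BEEGFS_HEADER, so that the same source still builds on machines without the package.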

API Specification

beegfs.h
#define BEEGFS_IOCTL_NODESTRID_BUFLEN   256      // max buffer size for the node string ID

// stripe pattern types
#define BEEGFS_STRIPEPATTERN_INVALID      0      // Stripe pattern is invalid
#define BEEGFS_STRIPEPATTERN_RAID0        1      // Stripe pattern RAID0
#define BEEGFS_STRIPEPATTERN_RAID10       2      // Stripe pattern RAID10
                                                 // (deprecated since 2015.03)
#define BEEGFS_STRIPEPATTERN_BUDDYMIRROR  3      // Stripe pattern for Buddy Mirroring
                                                 // (supported since 2015.03)


/**
 * Struct for details of a stripe target
 */
struct BeegfsIoctl_GetStripeTargetV2_Arg
{
   /* inputs */
   uint32_t targetIndex;

   /* outputs */
   uint32_t targetOrGroup; // target ID if the file is not buddy mirrored, otherwise mirror group ID

   uint32_t primaryTarget; // target ID != 0 if buddy mirrored
   uint32_t secondaryTarget; // target ID != 0 if buddy mirrored

   uint32_t primaryNodeID; // node ID of target (if unmirrored) or primary target (if mirrored)
   uint32_t secondaryNodeID; // node ID of secondary target, or 0 if unmirrored

   char primaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
   char secondaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
};


/**
 * Get the path to the client config file of an active BeeGFS mountpoint.
 *
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getConfigFile(int fd, char** outCfgFile);

/**
 * Get the path to the client runtime config file in procfs.
 *
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getRuntimeConfigFile(int fd, char** outCfgFile);

/**
 * Test if the underlying file system is a BeeGFS.
 *
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_testIsBeeGFS(int fd);

/**
 * Get the mountID aka clientID aka nodeID of client mount aka sessionID.
 *
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getMountID(int fd, char** outMountID);

/**
 * Get the stripe info of a file.
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param outPatternType type of stripe pattern (BEEGFS_STRIPEPATTERN_...)
 * @param outChunkSize chunk size for striping.
 * @param outNumTargets number of targets for striping.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeInfo(int fd, unsigned* outPatternType, unsigned* outChunkSize, uint16_t*
    outNumTargets);

/**
 * Get the stripe target of a file (with 0-based index).
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetNumID numeric ID of target at given index.
 * @param outNodeNumID numeric ID to node to which this target is assigned.
 * @param outNodeStrID string ID of the node to which this target is assigned; buffer will be
 *        alloc'ed and needs to be free'd by caller if success is returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTarget(int fd, uint16_t targetIndex, uint16_t* outTargetNumID,
    uint16_t* outNodeNumID, char** outNodeStrID);

/**
 * Get the stripe target of a file (with 0-based index).
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetInfo pointer to struct that will be filled with information about the selected
 *        stripe target
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTargetV2(int fd, uint32_t targetIndex,
    struct BeegfsIoctl_GetStripeTargetV2_Arg* outTargetInfo);

/**
 * Create a new regular file with stripe hints.
 *
 * As the stripe pattern cannot be changed when a file is already created, this is an exclusive
 * create, so it will return an error if the file already existed.
 *
 * @param fd file descriptor pointing to parent directory for the new file.
 * @param filename name of created file.
 * @param mode permission bits of new file (i.e. symbolic constants like S_IRWXU or 0644).
 * @param numtargets desired number of storage targets for striping; 0 for directory default; ~0 to
 *        use all available targets.
 * @param chunksize chunksize per storage target for striping in bytes; 0 for directory default;
 *        must be 2^n >= 64KiB.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_createFile(int fd, const char* filename, mode_t mode, unsigned numtargets,
    unsigned chunksize);

/**
 * Checks if the required API version of the application is compatible to current API version
 *
 * @param required_major_version the required major API version of the user application
 * @param required_minor_version the minimal required minor API version of the user application
 * @return true if the required version and the API version are compatible, if not false is returned
 */
bool beegfs_checkApiVersion(const unsigned required_major_version,
    const unsigned required_minor_version);
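The comment of beegfs_createFile() states that the chunk size must be 0 (directory default) or a power of two of at least 64 KiB. An application may want to check a user-supplied value before calling the API. The following is a minimal sketch of such a check; MIN_CHUNK_SIZE and isValidChunkSize are hypothetical names and not part of beegfs.h, and the file system itself remains the authority on which values it accepts.

```cpp
// Smallest chunk size named in the beegfs_createFile() comment.
static const unsigned MIN_CHUNK_SIZE = 64 * 1024; // 64 KiB

// True if chunkSize follows the documented rule:
// 0 (use the directory default) or a power of two >= 64 KiB.
bool isValidChunkSize(unsigned chunkSize)
{
   if (chunkSize == 0)
      return true;

   // a power of two has exactly one bit set
   const bool isPowerOfTwo = (chunkSize & (chunkSize - 1)) == 0;

   return isPowerOfTwo && (chunkSize >= MIN_CHUNK_SIZE);
}
```

For example, 1048576 (1 MiB) passes the check, while 32768 (too small) and 100000 (not a power of two) do not.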

Code Examples

The header files of the striping API are located in the default system include path and will be found automatically by your compiler. There is no additional shared library that needs to be linked to your application.

Create a file with a special stripe pattern

The following program creates a file with a specific stripe pattern.

create-with-pattern.cpp
#include <beegfs/beegfs.h>

#include <dirent.h>
#include <errno.h>
#include <iostream>
#include <libgen.h>
#include <stdlib.h>
#include <string>
#include <sys/stat.h>


static const mode_t MODE_FLAG = S_IRWXU | S_IRGRP | S_IROTH;
static const unsigned numtargets = 8;
static const unsigned chunksize = 1048576; // 1 Mebibyte


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
      std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
      exit(-1);
   }

   std::string file(argv[1]);

   // basename() and dirname() may modify the buffer they are given, so each call gets its own
   // copy of the path
   std::string fileNameBuf(file);
   std::string parentDirBuf(file);
   std::string fileName(basename(&fileNameBuf[0]) );
   std::string parentDirectory(dirname(&parentDirBuf[0]) );

   // check if we got a file name from the given path
   if(fileName.empty() )
   {
      std::cout << "Cannot get file name from given path: " << file << std::endl;
      exit(-1);
   }

   // check if we got the parent directory path from the given path
   if(parentDirectory.empty() )
   {
      std::cout << "Cannot get parent directory path from given path: " << file << std::endl;
      exit(-1);
   }

   // open the directory to get a directory stream
   DIR* parentDir = opendir(parentDirectory.c_str() );
   if(parentDir == NULL)
   {
      std::cout << "Cannot get directory stream of directory: " << parentDirectory
         << " errno: " << errno << std::endl;
      exit(-1);
   }

   // get a fd of the parent directory (owned by the directory stream, closed by closedir() )
   int fd = dirfd(parentDir);
   if(fd == -1)
   {
      std::cout << "Cannot get fd from directory: " << parentDirectory
         << " errno: " << errno << std::endl;
      exit(-1);
   }

   // check if the parent directory is located on a BeeGFS, because the striping API works only on
   // BeeGFS (Results of the BeeGFS ioctl on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
      std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
      exit(-1);
   }

   // create the file with the given stripe pattern
   bool isFileCreated = beegfs_createFile(fd, fileName.c_str(), MODE_FLAG, numtargets, chunksize);
   if(isFileCreated)
   {
      std::cout << "File successfully created: " << file << std::endl;
   }
   else
   {
      std::cout << "Cannot create file: " << file << " errno: " << errno << std::endl;
      exit(-1);
   }

   // release the directory stream and its file descriptor
   closedir(parentDir);

   return 0;
}

Compile like this:

$ g++ create-with-pattern.cpp -o create-with-pattern -I /usr/include/
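POSIX allows basename() and dirname() to modify the string passed to them, which is easy to get wrong when both are applied to the same buffer. As an alternative, the path can be split on a std::string. The following is a simplified sketch; splitPath is a hypothetical helper, and it does not strip trailing slashes or resolve "." and ".." components.

```cpp
#include <string>
#include <utility>

// Splits a path into (parent directory, file name) without modifying the input.
// Simplified: a path ending in '/' yields an empty file name.
std::pair<std::string, std::string> splitPath(const std::string& path)
{
   const std::string::size_type pos = path.rfind('/');

   if (pos == std::string::npos)
      return std::make_pair(std::string("."), path); // no slash: current directory

   if (pos == 0)
      return std::make_pair(std::string("/"), path.substr(1) ); // directly below the root

   return std::make_pair(path.substr(0, pos), path.substr(pos + 1) );
}
```

The first element can be passed to opendir() and the second element to beegfs_createFile() as the file name.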

Retrieve the stripe pattern of a file

The following program retrieves the stripe pattern settings from a file.

retrieve-pattern.cpp
#include <beegfs/beegfs.h>

#include <errno.h>
#include <fcntl.h>
#include <iostream>
#include <stdlib.h>
#include <string>
#include <unistd.h>


static const int OPEN_FLAGS = O_RDWR;


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
      std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
      exit(-1);
   }

   std::string file(argv[1]);

   // open the provided file (no mode argument is needed because the file is not created here)
   int fd = open(file.c_str(), OPEN_FLAGS);
   if(fd == -1)
   {
      std::cout << "Open: cannot open file: " << file << " errno: " << errno << std::endl;
      exit(-1);
   }

   // check if the file is located on a BeeGFS, because the striping API works only on BeeGFS
   // (Results of the BeeGFS ioctls on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
      std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
      exit(-1);
   }

   unsigned outPatternType = 0;
   unsigned outChunkSize = 0;
   uint16_t outNumTargets = 0;

   // retrieve the stripe pattern of the file and print it to the console
   bool stripeInfoRetVal = beegfs_getStripeInfo(fd, &outPatternType, &outChunkSize, &outNumTargets);
   if(stripeInfoRetVal)
   {
      std::string patternType;
      switch(outPatternType)
      {
         case BEEGFS_STRIPEPATTERN_RAID0:
            patternType = "RAID0";
            break;
         case BEEGFS_STRIPEPATTERN_RAID10:
            patternType = "RAID10";
            break;
         case BEEGFS_STRIPEPATTERN_BUDDYMIRROR:
            patternType = "BUDDYMIRROR";
            break;
         default:
            patternType = "INVALID";
            break;
      }
      std::cout << "Stripe pattern of file: " << file << std::endl;
      std::cout << "+ Type: " << patternType << std::endl;
      std::cout << "+ Chunksize: " << outChunkSize << " Byte" << std::endl;
      std::cout << "+ Number of storage targets: " << outNumTargets << std::endl;
      std::cout << "+ Storage targets:" << std::endl;

      // get the targets which are used for the file and print them to the console
      for (uint32_t targetIndex = 0; targetIndex < outNumTargets; targetIndex++)
      {
         struct BeegfsIoctl_GetStripeTargetV2_Arg outTargetInfo;

         bool stripeTargetRetVal = beegfs_getStripeTargetV2(fd, targetIndex, &outTargetInfo);
         if(stripeTargetRetVal)
         {
            if(outPatternType == BEEGFS_STRIPEPATTERN_BUDDYMIRROR)
            {
               // mirrored: print the buddy group with its primary and secondary target
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.primaryTarget
                  << " @ " << outTargetInfo.primaryNodeStrID
                  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.secondaryTarget
                  << " @ " << outTargetInfo.secondaryNodeStrID
                  << " [ID: "<< outTargetInfo.secondaryNodeID << "]" << std::endl;
            }
            else
            {
               // unmirrored: print the target and the node it is assigned to
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.primaryNodeStrID
                  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
            }
         }
         else
         {
            std::cout << "Cannot get stripe targets of file: " << file << std::endl;
            exit(-1);
         }
      }
   }
   else
   {
      std::cout << "Cannot get stripe info of file: " << file << std::endl;
      exit(-1);
   }

   close(fd);

   return 0;
}

Compile like this:

$ g++ retrieve-pattern.cpp -o retrieve-pattern -I /usr/include/
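The chunk size and number of targets reported by beegfs_getStripeInfo() can be used to estimate how the data of a file is spread over its stripe targets. The following sketch assumes that chunks are assigned round-robin starting with the first stripe target and that the file has no holes; bytesPerStripeEntry is a hypothetical helper and not part of the API.

```cpp
#include <cstdint>
#include <vector>

// Bytes stored per stripe entry for a file of fileSize bytes.
// Entry i of the result corresponds to the stripe target with index i.
// chunkSize and numTargets must be greater than 0.
std::vector<std::uint64_t> bytesPerStripeEntry(std::uint64_t fileSize,
   std::uint64_t chunkSize, unsigned numTargets)
{
   const std::uint64_t fullChunks = fileSize / chunkSize;
   const std::uint64_t remainder = fileSize % chunkSize;

   // every entry receives the same number of complete rounds
   std::vector<std::uint64_t> bytes(numTargets, (fullChunks / numTargets) * chunkSize);

   // the full chunks of the last, incomplete round go to the first entries
   const std::uint64_t extraChunks = fullChunks % numTargets;
   for (std::uint64_t i = 0; i < extraChunks; i++)
      bytes[i] += chunkSize;

   // a trailing partial chunk lands on the next entry in line
   if (remainder != 0)
      bytes[extraChunks] += remainder;

   return bytes;
}
```

For a file of 2.5 MiB with a chunk size of 1 MiB and 4 targets, the first two targets hold 1 MiB each, the third holds 0.5 MiB and the fourth holds nothing. For a buddy-mirrored file the same amount is stored on both targets of each group.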