Striping API

The Striping API was developed during the exascale project DEEP-ER, which was funded by the European Commission. The API allows the application developer to define some parameters of the stripe pattern. Because the application developer knows the I/O pattern of each file of the application, they can choose the striping settings per file to get the best performance for the application.

A file is striped across multiple storage targets. It is also possible to mirror the striped data for resiliency. Both the number of storage targets used for striping and the chunk size can be changed. The application developer can control these settings using the striping API. The file system administrator can also set the stripe pattern (raid0 and buddymirror, see Mirroring) using beegfs-ctl, while some striping settings (number of targets and chunk size) can also be changed by the user (see Striping for details).

../_images/striping.png

Chunks of file data are distributed across multiple storage targets. If the file is buddy-mirrored, each chunk is duplicated onto two targets. Each target can store chunks of unmirrored files and, if it is part of a buddy group, chunks of mirrored files as well.

The picture shows that Target 1 and Target 3 belong to BuddyMirrorGroup 1, and that Target 2 and Target 4 belong to BuddyMirrorGroup 2. In this example, every target belongs to a BuddyMirrorGroup so that it can store mirrored files, but the BeeGFS design allows unmirrored files to be stored on such a target as well.

File #1 is mirrored and striped across BuddyGroup 1 and BuddyGroup 2. So the 1st chunk of File #1 is stored on Target 1 and Target 3, the 2nd chunk is stored on Target 2 and Target 4, the 3rd chunk is stored again on Target 1 and Target 3, and so on.

File #2 is only mirrored on BuddyGroup 1. The 1st chunk of File #2 is stored on Target 1 and Target 3, the 2nd chunk is stored again on Target 1 and Target 3, and so on.

File #3 is only striped across all 4 targets, but not mirrored. The 1st chunk of File #3 is stored on Target 2, the 2nd chunk is stored on Target 3, the 3rd chunk is stored on Target 4, the 4th chunk is stored on Target 1, the 5th chunk is stored again on Target 2, and so on.

See also

Architecture.

Installation

Install the package beegfs-client-devel or beegfs-client-dev, depending on your operating system, from the repository that matches your installed BeeGFS version. The package contains the required header files. A library is not required because the API uses ioctl (input/output control) calls to communicate with the file system. The header file is installed to /usr/include/beegfs/beegfs.h.

API Specification

beegfs.h
#define BEEGFS_IOCTL_NODESTRID_BUFLEN   256      // max buffer size for the node string ID

// stripe pattern types
#define BEEGFS_STRIPEPATTERN_INVALID      0      // Stripe pattern is invalid
#define BEEGFS_STRIPEPATTERN_RAID0        1      // Stripe pattern RAID0
#define BEEGFS_STRIPEPATTERN_RAID10       2      // Stripe pattern RAID10
                   // (deprecated since 2015.03)
#define BEEGFS_STRIPEPATTERN_BUDDYMIRROR  3      // Stripe pattern for Buddy Mirroring
                   // (supported since 2015.03)


/**
 * Struct for details of a stripe target
 */
struct BeegfsIoctl_GetStripeTargetV2_Arg
{
   /* inputs */
   uint32_t targetIndex;

   /* outputs */
   uint32_t targetOrGroup; // target ID if the file is not buddy mirrored, otherwise mirror group ID

   uint32_t primaryTarget; // target ID != 0 if buddy mirrored
   uint32_t secondaryTarget; // target ID != 0 if buddy mirrored

   uint32_t primaryNodeID; // node ID of target (if unmirrored) or primary target (if mirrored)
   uint32_t secondaryNodeID; // node ID of secondary target, or 0 if unmirrored

   char primaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
   char secondaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
};


/**
 * Get the path to the client config file of an active BeeGFS mountpoint.
 *
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getConfigFile(int fd, char** outCfgFile);

/**
 * Get the path to the client runtime config file in procfs.
 *
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getRuntimeConfigFile(int fd, char** outCfgFile);

/**
 * Test if the underlying file system is a BeeGFS.
 *
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_testIsBeeGFS(int fd);

/**
 * Get the mountID aka clientID aka nodeID of client mount aka sessionID.
 *
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getMountID(int fd, char** outMountID);

/**
 * Get the stripe info of a file.
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param outPatternType type of stripe pattern (BEEGFS_STRIPEPATTERN_...)
 * @param outChunkSize chunk size for striping.
 * @param outNumTargets number of targets for striping.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeInfo(int fd, unsigned* outPatternType, unsigned* outChunkSize, uint16_t*
     outNumTargets);

/**
 * Get the stripe target of a file (with 0-based index).
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetNumID numeric ID of target at given index.
 * @param outNodeNumID numeric ID to node to which this target is assigned.
 * @param outNodeStrID string ID of the node to which this target is assigned; buffer will be 
 *        alloc'ed and needs to be free'd by caller if success is returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTarget(int fd, uint16_t targetIndex, uint16_t* outTargetNumID,
     uint16_t* outNodeNumID, char** outNodeStrID);

/**
 * Get the stripe target of a file (with 0-based index).
 *
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetInfo pointer to struct that will be filled with information about the selected
 *        stripe target
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTargetV2(int fd, uint32_t targetIndex,
     struct BeegfsIoctl_GetStripeTargetV2_Arg* outTargetInfo);

/**
 * Create a new regular file with stripe hints.
 *
 * As the stripe pattern cannot be changed when a file is already created, this is an exclusive
 * create, so it will return an error if the file already existed.
 *
 * @param fd file descriptor pointing to parent directory for the new file.
 * @param filename name of created file.
 * @param mode permission bits of new file (i.e. symbolic constants like S_IRWXU or 0644).
 * @param numtargets desired number of storage targets for striping; 0 for directory default; ~0 to
 *        use all available targets.
 * @param chunksize chunksize per storage target for striping in bytes; 0 for directory default;
 *        must be 2^n >= 64KiB.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_createFile(int fd, const char* filename, mode_t mode, unsigned numtargets,
     unsigned chunksize);

/**
 * Checks if the required API version of the application is compatible to current API version
 *
 * @param required_major_version the required major API version of the user application
 * @param required_minor_version the minimal required minor API version of the user application
 * @return true if the required version and the API version are compatible, if not false is returned
 */
bool beegfs_checkApiVersion(const unsigned required_major_version,
     const unsigned required_minor_version);

Code Examples

The header files of the striping API are located in the default system include path and will be found automatically by your compiler. There is no additional shared library that needs to be linked to your application.

Create a file with a specific stripe pattern

The following program creates a file with a specific stripe pattern.

create-with-pattern.cpp
#include <beegfs/beegfs.h>

#include <dirent.h>
#include <errno.h>
#include <iostream>
#include <libgen.h>
#include <stdlib.h>
#include <string>
#include <sys/stat.h>



static const mode_t MODE_FLAG = S_IRWXU | S_IRGRP | S_IROTH;
static const unsigned numtargets = 8;
static const unsigned chunksize = 1048576; // 1 Mebibyte


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
     std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
     exit(-1);
   }

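   std::string file(argv[1]);

   // basename() and dirname() may modify the string they are given, so pass each its own copy
   std::string fileNameBuf(file);
   std::string parentDirBuf(file);
   std::string fileName(basename(&fileNameBuf[0]) );
   std::string parentDirectory(dirname(&parentDirBuf[0]) );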

   // check if we got a file name from the given path
   if(fileName.empty() )
   {
     std::cout << "Can not get file name from given path: " << file << std::endl;
     exit(-1);
   }

   // check if we got the parent directory path from the given path
   if(parentDirectory.empty() )
   {
     std::cout << "Can not get parent directory path from given path: " << file << std::endl;
     exit(-1);
   }

   // open the directory to get a directory stream 
   DIR* parentDir = opendir(parentDirectory.c_str() );
   if(parentDir == NULL)
   {
     std::cout << "Can not get directory stream of directory: " << parentDirectory
       << " errno: " << errno << std::endl;
     exit(-1);
   }

   // get a fd of the parent directory
   int fd = dirfd(parentDir);
   if(fd == -1)
   {
     std::cout << "Can not get fd from directory: " << parentDirectory
       << " errno: " << errno << std::endl;
     exit(-1);
   }

   // check if the parent directory is located on a BeeGFS, because the striping API works only on
   // BeeGFS (Results of the BeeGFS ioctl on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
     std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
     exit(-1);
   }

   // create the file with the given stripe pattern
   bool isFileCreated = beegfs_createFile(fd, fileName.c_str(), MODE_FLAG, numtargets, chunksize);
   if(isFileCreated)
   {
     std::cout << "File successfully created: " << file << std::endl;
   }
   else
   {
     std::cout << "Can not create file: " << file << " errno: " << errno << std::endl;
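     exit(-1);
   }

   // release the directory stream (this also closes the fd obtained from dirfd)
   closedir(parentDir);

   return 0;
}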

Compile like this:

$ g++ create-with-pattern.cpp -o create-with-pattern -I /usr/include/

Retrieve the stripe pattern of a file

The following program retrieves the stripe pattern settings from a file.

retrieve-pattern.cpp
#include <beegfs/beegfs.h>

#include <errno.h>
#include <fcntl.h>
#include <iostream>
#include <stdlib.h>
#include <string>
#include <unistd.h>



static const int OPEN_FLAGS = O_RDWR;


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
      std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
      exit(-1);
   }

   std::string file(argv[1]);

   // open the provided file (no mode argument is needed because the file is not created)
   int fd = open(file.c_str(), OPEN_FLAGS);
   if(fd == -1)
   {
      std::cout << "Open: can not open file: " << file << " errno: " << errno << std::endl;
      exit(-1);
   }

   // check if the file is located on a BeeGFS, because the striping API works only on BeeGFS
   // (Results of the BeeGFS ioctls on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
      std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
      exit(-1);
   }

   unsigned outPatternType = 0;
   unsigned outChunkSize = 0;
   uint16_t outNumTargets = 0;

   // retrieve the stripe pattern of the file and print it to the console
   bool stripeInfoRetVal = beegfs_getStripeInfo(fd, &outPatternType, &outChunkSize, &outNumTargets);
   if(stripeInfoRetVal)
   {
      std::string patternType;
      switch(outPatternType)
      {
         case BEEGFS_STRIPEPATTERN_RAID0:
            patternType = "RAID0";
            break;
         case BEEGFS_STRIPEPATTERN_RAID10:
            patternType = "RAID10";
            break;
         case BEEGFS_STRIPEPATTERN_BUDDYMIRROR:
            patternType = "BUDDYMIRROR";
            break;
         default:
            patternType = "INVALID";
      }
      std::cout << "Stripe pattern of file: " << file << std::endl;
      std::cout << "+ Type: " << patternType << std::endl;
      std::cout << "+ Chunksize: " << outChunkSize << " Byte" << std::endl;
      std::cout << "+ Number of storage targets: " << outNumTargets << std::endl;
      std::cout << "+ Storage targets:" << std::endl;

      // get the targets which are used for the file and print them to the console
      for (unsigned targetIndex = 0; targetIndex < outNumTargets; targetIndex++)
      {
         // zero-initialize the struct before it is filled by the API
         struct BeegfsIoctl_GetStripeTargetV2_Arg outTargetInfo = {};

         bool stripeTargetRetVal = beegfs_getStripeTargetV2(fd, targetIndex, &outTargetInfo);
         if(stripeTargetRetVal)
         {
            if(outPatternType == BEEGFS_STRIPEPATTERN_BUDDYMIRROR)
            {
               // mirrored file: print the primary and the secondary target of the buddy group
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.primaryTarget
                  << " @ " << outTargetInfo.primaryNodeStrID
                  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.secondaryTarget
                  << " @ " << outTargetInfo.secondaryNodeStrID
                  << " [ID: "<< outTargetInfo.secondaryNodeID << "]" << std::endl;
            }
            else
            {
               // unmirrored file: targetOrGroup is the target ID
               std::cout << "  + " << outTargetInfo.targetOrGroup
                  << " @ " << outTargetInfo.primaryNodeStrID
                  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
            }
         }
         else
         {
            std::cout << "Can not get stripe targets of file: " << file << std::endl;
            exit(-1);
         }
      }
   }
   else
   {
      std::cout << "Can not get stripe info of file: " << file << std::endl;
      exit(-1);
   }

   close(fd);

   return 0;
}

Compile like this:

$ g++ retrieve-pattern.cpp -o retrieve-pattern -I /usr/include/