From af464b49c3049b313129c1d7eaa2f34ed1b8bc3a Mon Sep 17 00:00:00 2001
From: Frank Baker
+
+
+
+
+
+The HDF5 file format describes how HDF5 data structures and dataset raw
+data are mapped to a linear format address space and the HDF5
+library implements that bidirectional mapping in terms of an
+API. However, the HDF5 format specifications do not indicate how
+the format address space is mapped onto storage and HDF (version 5 and
+earlier) simply mapped the format address space directly onto a single
+file by convention.
+
+
+Since the early versions of HDF5 it has been apparent that users want the ability to
+map the format address space onto different types of storage (a single file,
+multiple files, local memory, global memory, network distributed global
+memory, a network protocol, etc.) with various types of maps. For
+instance, some users want to be able to handle very large format address
+spaces on operating systems that support only 2GB files by partitioning the
+format address space into equal-sized parts each served by a separate
+file. Other users want the same multi-file storage capability but want to
+partition the address space according to purpose (raw data in one file, object
+headers in another, global heap in a third, etc.) in order to improve I/O
+speeds.
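The equal-sized partitioning described above comes down to integer division. The sketch below is illustrative only: `addr_t`, `member_loc_t`, and `locate` are hypothetical names, not part of the HDF5 API, and the member size is assumed to be a fixed, nonzero byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the library's format address type. */
typedef uint64_t addr_t;

/* Location of one byte of the format address space inside a file family. */
typedef struct {
    uint64_t member;   /* zero-based index of the member file */
    uint64_t offset;   /* byte offset within that member file */
} member_loc_t;

/* Map a format address onto (member, offset) for equal-sized members. */
static member_loc_t
locate(addr_t addr, uint64_t member_size)
{
    member_loc_t loc;

    assert(member_size > 0);
    loc.member = addr / member_size;
    loc.offset = addr % member_size;
    return loc;
}
```

With a member size of 100 bytes, format address 250 would land in the third member (index 2) at offset 50.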
+
+
+In fact, the number of storage variations is probably larger than the
+number of methods that the HDF5 team is capable of implementing and
+supporting. Therefore, a Virtual File Layer API is being
+implemented which will allow application teams or departments to design
+and implement their own mapping between the HDF5 format address space
+and storage, with each mapping being a separate file driver
+(possibly written in terms of other file drivers). The HDF5 team will
+provide a small set of useful file drivers which will also serve as
+examples for those who wish to write their own:
+
+
+Most application writers will use a driver defined by the HDF5 library or
+contributed by another programming team. This chapter describes how existing
+drivers are used.
+
+
+Each file driver is defined in its own public header file which should
+be included by any application which plans to use that driver. The
+predefined drivers are in header files whose names begin with
+`H5FD' followed by the driver name and `.h'. The `hdf5.h'
+header file includes all the predefined driver header files.
+
+
+Once the appropriate header file is included a symbol of the form
+`H5FD_' followed by the upper-case driver name will be the driver
+identification number.(1) However, the
+value may change if the library is closed (e.g., by calling
+
+In order to create or open a file one must define the method by which the
+storage is accessed(2) and does so by creating a file access property list(3) which is passed to the
+Each file driver will have its own initialization function
+whose name is
+An alternative to using the driver initialization function is to set the
+driver directly using the
+It is also possible to query the file driver information from a file access
+property list by calling
+The
+Like file access properties in the previous section, data transfer properties
+can be set using a driver initialization function or a general purpose
+function. For example, to set the MPI-IO driver to use independent access for
+I/O operations one would say:
+
+
+The alternative is to initialize a driver defined C
+The transfer property list can be queried in a manner similar to the file
+access property list: the driver provides a function (or functions) to return
+various information about the transfer property list:
+
+
+The HDF5 specifications describe two things: the mapping of data onto a linear
+format address space and the C API which performs the mapping.
+However, the mapping of the format address space onto storage intentionally
+falls outside the scope of the HDF5 specs. This is a direct result of the fact
+that it is not generally possible to store information about how to access
+storage inside the storage itself. For instance, given only the file name
+`/arborea/1225/work/f%03d' the HDF5 library is unable to tell whether the
+name refers to a file on the local file system, a family of files on the local
+file system, a file on host `arborea' port 1225, a family of files on a
+remote system, etc.
+
+
+Two ways in which the library could figure out where the storage is located are:
+storage access information can be provided by the user, or the library can try
+all known file access methods. This implementation uses the former method.
+
+
+In general, if a file was created with one driver then it isn't possible to
+open it with another driver. There are of course exceptions: a file created
+with MPIO could probably be opened with the sec2 driver, any file created
+by the sec2 driver could be opened as a family of files with one member,
+etc. In fact, sometimes a file must not only be opened with the same
+driver but also with the same driver properties. The predefined drivers are
+written in such a way that specifying the correct driver is sufficient for
+opening a file.
+
+
+A driver is simply a collection of functions and data structures which are
+registered with the HDF5 library at runtime. The functions fall into these
+categories:
+
+
+Some drivers need information about file access and data transfers which are
+very specific to the driver. The information is usually implemented as a pair
+of pointers to C structs which are allocated and initialized as part of an
+HDF5 property list and passed down to various driver functions. There are two
+classes of settings: file access modes that describe how to access the file
+through the driver, and data transfer modes which are settings that control
+I/O operations. Each file opened by a particular driver may have a different
+access mode; each dataset I/O request for a particular file may have a
+different data transfer mode.
+
+
+Since each driver has its own particular requirements for various settings,
+each driver is responsible for defining the mode structures that it
+needs. Higher layers of the library treat the structures as opaque but must be
+able to copy and free them. Thus, the driver provides either the size of the
+structure or a pair of function pointers for each of the mode types.
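A higher layer that treats the mode structures as opaque could copy and free them roughly as follows. This is a sketch, not library code: `mode_class_t`, `mode_copy`, and `mode_free` are hypothetical names, and the rule that a copy callback takes precedence over a plain size is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical description of one mode type (file access or data transfer). */
typedef struct {
    size_t size;                      /* nonzero: struct may be copied bytewise */
    void *(*copy)(const void *mode);  /* optional deep-copy callback            */
    void (*release)(void *mode);      /* optional free callback                 */
} mode_class_t;

/* Copy an opaque mode struct; the callback wins over the size (assumption). */
static void *
mode_copy(const mode_class_t *cls, const void *mode)
{
    void *dup;

    assert(cls);
    if (!mode) return NULL;
    if (cls->copy) return cls->copy(mode);
    if (0 == cls->size) return NULL;
    dup = malloc(cls->size);
    if (dup) memcpy(dup, mode, cls->size);
    return dup;
}

/* Free an opaque mode struct with the callback if given, else free(). */
static void
mode_free(const mode_class_t *cls, void *mode)
{
    assert(cls);
    if (!mode) return;
    if (cls->release) cls->release(mode);
    else free(mode);
}
```

A driver whose mode struct holds no handles can supply only the size; a driver such as the family driver, whose struct holds property list handles, would supply the callbacks.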
+
+
+Example: The family driver needs to know how the format address
+space is partitioned and the file access property list to use for the
+family members.
+
+
+In order to copy or free one of these structures the member file access
+or data transfer properties must also be copied or freed. This is done
+by providing a copy and close function for each structure:
+
+
+Example: The file access property list copy and close functions
+for the family driver:
+
+
+Generally when a file is created or opened the file access properties
+for the driver are copied into the file pointer which is returned and
+they may be modified from their original value (for instance, the file
+family driver modifies the member size property when opening an existing
+family). In order to support the
+Example: The file family driver copies the member size file
+access property list into the return value:
+
+
+The higher layers of the library expect files to have a name and allow the
+file to be accessed in various modes. The driver must be able to create a new
+file, replace an existing file, or open an existing file. Opening or creating
+a file should return a handle, a pointer to a specialization of the
+
+Example: The family driver requires handles to the underlying
+storage, the size of the members for this particular file (which might be
+different than the member size specified in the file access property list if
+an existing file family is being opened), the name used to open the file in
+case additional members must be created, and the flags to use for creating
+those additional members. The
+Example: The sec2 driver needs to keep track of the underlying Unix
+file descriptor and also the end of format address space and current Unix file
+size. It also keeps track of the current file position and last operation
+(read, write, or unknown) in order to optimize calls to
+All drivers must define a function for opening/creating a file. This
+function should have a prototype which is:
+
+
+HDF5
+Virtual File Layer
+Proposal 1999-08-11
+Robb Matzke
+
+
+Introduction
+
+
+
+
+
+
+
+H5FD_SEC2
+read
and write
to perform I/O to a single file. All I/O
+requests are unbuffered although the driver does optimize file seeking
+operations to some extent.
+
+H5FD_STDIO
+H5FD_CORE
+H5FD_MPIO
+H5FD_FAMILY
+h5repart
tool can be used to change the sizes of the
+family members when stored as files or to convert a family of files to a
+single file or vice versa.
+
+H5FD_SPLIT
+Using a File Driver
+
+Driver Header Files
+
+H5close
) and the symbol is referenced again.
+
+Creating and Opening Files
+
+H5Fcreate
or
+H5Fopen
function. A default file access property list is created by
+calling H5Pcreate
and then the file driver information is inserted by
+calling a driver initialization function such as H5Pset_fapl_family
:
+
+
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+hsize_t member_size = 100*1024*1024; /*100MB*/
+H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+
+
+H5Pset_fapl_
followed by the driver name and which
+takes a file access property list as the first argument followed by
+additional driver-dependent arguments.
+
+H5Pset_driver
function.(4) Its second argument is the file driver identifier, which may
+have a different numeric value from run to run depending on the order in which
+the file drivers are registered with the library. The third argument
+encapsulates the additional arguments of the driver initialization
+function. This method only works if the file driver writer has made the
+driver-specific property list structure a public datatype, which is
+often not the case.
+
+
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
+H5Pset_driver(fapl, H5FD_FAMILY, &fa);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+
+
+H5Pget_driver
to determine the driver and then
+calling a driver-defined query function to obtain the driver information:
+
+
+hid_t driver = H5Pget_driver(fapl);
+if (H5FD_SEC2==driver) {
+ /*nothing further to get*/
+} else if (H5FD_FAMILY==driver) {
+ hid_t member_fapl;
+ hsize_t member_size;
+ H5Pget_fapl_family(fapl, &member_size, &member_fapl);
+} else if (....) {
+ ....
+}
+
+
+
+
+Performing I/O
+
+H5Dread
and H5Dwrite
functions transfer data between
+application memory and the file. They both take an optional data transfer
+property list which has some general driver-independent properties and
+optional driver-defined properties. An application will typically perform I/O
+in one of three styles via the H5Dread
or H5Dwrite
function:
+
+
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
+H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
+H5Pclose(dxpl);
+
+
+struct
and pass it
+to the H5Pset_driver
function:
+
+
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
+H5Pset_driver(dxpl, H5FD_MPIO, &dx);
+H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
+
+
+
+hid_t driver = H5Pget_driver(dxpl);
+if (H5FD_MPIO==driver) {
+ H5FD_mpio_xfer_t xfer_mode;
+ H5Pget_dxpl_mpio(dxpl, &xfer_mode);
+} else {
+ ....
+}
+
+
+
+
+File Driver Interchangeability
+
+Implementation of a Driver
+
+
+
+
+
+
+Mode Functions
+
+
+/* Driver-specific file access properties */
+typedef struct H5FD_family_fapl_t {
+ hsize_t memb_size; /*size of each member */
+ hid_t memb_fapl_id; /*file access property list of each memb*/
+} H5FD_family_fapl_t;
+
+/* Driver specific data transfer properties */
+typedef struct H5FD_family_dxpl_t {
+ hid_t memb_dxpl_id; /*data xfer property list of each memb */
+} H5FD_family_dxpl_t;
+
+
+
+static void *
+H5FD_family_fapl_copy(const void *_old_fa)
+{
+ const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
+ H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
+ assert(new_fa);
+
+ memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
+ new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
+ return new_fa;
+}
+
+static herr_t
+H5FD_family_fapl_free(void *_fa)
+{
+ H5FD_family_fapl_t *fa = (H5FD_family_fapl_t*)_fa;
+ H5Pclose(fa->memb_fapl_id);
+ free(fa);
+ return 0;
+}
+
+
+H5Fget_access_plist
function the
+driver must provide a fapl_get
callback which creates a copy of
+the driver-specific properties based on a particular file.
+
+
+static void *
+H5FD_family_fapl_get(H5FD_t *_file)
+{
+ H5FD_family_t *file = (H5FD_family_t*)_file;
+ H5FD_family_fapl_t *fa = calloc(1, sizeof(H5FD_family_fapl_t));
+ if (!fa) return NULL;
+
+ fa->memb_size = file->memb_size;
+ fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
+ return fa;
+}
+
+
+
+
+File Functions
+
+H5FD_t
struct, which allows read-only or read-write access and which
+will be passed to the other driver functions as they are
+called.(5)
+
+
+typedef struct {
+ /* Public fields */
+ H5FD_class_t *cls; /*class data defined below*/
+
+ /* Private fields -- driver-defined */
+
+} H5FD_t;
+
+
+eoa
member caches the size of the format
+address space so the family members don't have to be queried in order to find
+it.
+
+
+/* The description of a file belonging to this driver. */
+typedef struct H5FD_family_t {
+ H5FD_t pub; /*public stuff, must be first */
+ hid_t memb_fapl_id; /*file access property list for members */
+ hsize_t memb_size; /*maximum size of each member file */
+ int nmembs; /*number of family members */
+ int amembs; /*number of member slots allocated */
+ H5FD_t **memb; /*dynamic array of member pointers */
+ haddr_t eoa; /*end of allocated addresses */
+ char *name; /*name generator printf format */
+ unsigned flags; /*flags for opening additional members */
+} H5FD_family_t;
+
+
+lseek
. The
+device
and inode
fields are defined on Unix in order to uniquely
+identify the file and will be discussed below.
+
+
+typedef struct H5FD_sec2_t {
+ H5FD_t pub; /*public stuff, must be first */
+ int fd; /*the unix file */
+ haddr_t eoa; /*end of allocated region */
+ haddr_t eof; /*end of file; current file size*/
+ haddr_t pos; /*current file I/O position */
+ int op; /*last operation */
+ dev_t device; /*file device number */
+ ino_t inode; /*file i-node number */
+} H5FD_sec2_t;
+
+
+
+
+Opening Files
+
+
+The file name name and file access property list fapl are
+the same as were specified in the H5Fcreate
or H5Fopen
+call. The flags are the same as in those calls also except the
+flag H5F_ACC_CREAT
is also present if the call was to
+H5Fcreate
and they are documented in the `H5Fpublic.h'
+file. The maxaddr argument is the maximum format address that the
+driver should be prepared to handle (the minimum address is always
+zero).
+
+
+
+Example: The sec2 driver opens a Unix file with the requested name
+and saves information which uniquely identifies the file (the Unix device
+number and inode).
+
+
+static H5FD_t *
+H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/,
+               haddr_t maxaddr)
+{
+    unsigned o_flags;
+    int fd;
+    struct stat sb;
+    H5FD_sec2_t *file=NULL;
+
+    /* Check arguments */
+    if (!name || !*name) return NULL;
+    if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL;
+    if (ADDR_OVERFLOW(maxaddr)) return NULL;
+
+    /* Build the open flags */
+    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
+    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
+    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
+    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;
+
+    /* Open the file */
+    if ((fd=open(name, o_flags, 0666))<0) return NULL;
+    if (fstat(fd, &sb)<0) {
+        close(fd);
+        return NULL;
+    }
+
+    /* Create the new file struct; do not leak the descriptor on failure */
+    file = calloc(1, sizeof(H5FD_sec2_t));
+    if (!file) {
+        close(fd);
+        return NULL;
+    }
+    file->fd = fd;
+    file->eof = sb.st_size;
+    file->pos = HADDR_UNDEF;
+    file->op = OP_UNKNOWN;
+    file->device = sb.st_dev;
+    file->inode = sb.st_ino;
+
+    return (H5FD_t*)file;
+}
+
+
+
+Closing a file simply means that all cached data should be flushed to the next
+lower layer, the file should be closed at the next lower layer, and all
+file-related data structures should be freed. All information needed by the
+close function is already present in the file handle.
+
++
+The file argument is the handle which was returned by the open
+function, and the close
should free only memory associated with the
+driver-specific part of the handle (the public parts will have already been released by HDF5's virtual file layer).
+
+Example: The sec2 driver just closes the underlying Unix file,
+making sure that the actual file size is the same as that known to the
+library by writing a zero to the last file position if it hasn't been
+written by some previous operation (which happens in the same code which
+flushes the file contents and is shown below).
+
+
+static herr_t
+H5FD_sec2_close(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+
+    if (H5FD_sec2_flush(_file)<0) return -1;
+    if (close(file->fd)<0) return -1;
+    free(file);
+    return 0;
+}
+
+
+
+Occasionally an application will attempt to open a single file more than one
+time in order to obtain multiple handles to the file. HDF5 allows the files to
+share information(6) but in order to
+accomplish this HDF5 must be able to tell when two names refer to the same
+file. It does this by associating a driver-defined key with each file opened
+by a driver and comparing the key for an open request with the keys for all
+other files currently open by the same driver.
+
++
+The driver may provide a function which compares two files f1 and
+f2 belonging to the same driver and returns a negative, positive, or
+zero value a la the strcmp
function.(7) If this
+function is not provided then HDF5 assumes that all calls to the open
+callback return unique files regardless of the arguments and it is up to the
+application to avoid doing this if that assumption is incorrect.
+
+Each time a file is opened the library calls the cmp
function to
+compare that file with all other files currently open by the same driver and
+if one of them matches (at most one can match) then the file which was just
+opened is closed and the previously opened file is used instead.
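The lookup described above can be sketched as a linear scan over the files a driver currently has open. The names below (`file_t`, `file_cmp`, `find_open`) are hypothetical stand-ins chosen for illustration; the key is assumed to be a Unix-style (device, inode) pair.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver file handle keyed by (device, inode). */
typedef struct {
    unsigned long device;
    unsigned long inode;
} file_t;

/* strcmp-style comparison of two file keys. */
static int
file_cmp(const file_t *f1, const file_t *f2)
{
    assert(f1 && f2);
    if (f1->device < f2->device) return -1;
    if (f1->device > f2->device) return 1;
    if (f1->inode < f2->inode) return -1;
    if (f1->inode > f2->inode) return 1;
    return 0;
}

/* Return the already-open file whose key matches candidate, or NULL. */
static file_t *
find_open(file_t **open_files, size_t nopen, const file_t *candidate)
{
    size_t i;

    for (i = 0; i < nopen; i++) {
        if (0 == file_cmp(open_files[i], candidate)) return open_files[i];
    }
    return NULL;
}
```

When `find_open` returns a match, the caller would close the newly opened handle and continue with the existing one; when it returns NULL, the new handle is kept.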
+
+
+Opening a file twice with incompatible flags will result in failure. For
+instance, opening a file with the truncate flag is a two step process which
+first opens the file without truncation so keys can be compared, and if no
+matching file is found already open then the file is closed and immediately
+reopened with the truncation flag set (if a matching file is already open then
+the truncating open will fail).
+
+
+Example: The sec2 driver uses the Unix device and i-node as the
+key. They were initialized when the file was opened.
+
+
+static int
+H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
+{
+    const H5FD_sec2_t *f1 = (const H5FD_sec2_t*)_f1;
+    const H5FD_sec2_t *f2 = (const H5FD_sec2_t*)_f2;
+
+    if (f1->device < f2->device) return -1;
+    if (f1->device > f2->device) return 1;
+
+    if (f1->inode < f2->inode) return -1;
+    if (f1->inode > f2->inode) return 1;
+
+    return 0;
+}
+
+
+
+Some drivers may also need to store certain information in the file superblock
+in order to be able to reliably open the file at a later date. This is done by
+three functions: one to determine how much space will be necessary to store
+the information in the superblock, one to encode the information, and one to
+decode the information. These functions are optional, but if any one is
+defined then the other two must also be defined.
+
++
+The sb_size
function returns the number of bytes necessary to encode
+information needed later if the file is reopened. The sb_encode
+function encodes information from the file into buffer buf
+allocated by the caller. It also writes an 8-character string (plus null
+termination) into the name
argument, which should be a unique
+identification for the driver. The sb_decode
+function looks at
+the name, decodes
+data from the buffer buf, and updates the file argument with the new information,
+advancing *p in the process.
+
+The part of this which is somewhat tricky is that the file must be readable
+before the superblock information is decoded. File access modes fall outside
+the scope of the HDF5 file format, but they are placed inside the boot block
+for convenience.(8)
+
+
+Example: To be written later.
+
+
+
+HDF5 does not assume that a file is a linear address space of bytes. Instead,
+the library will call functions to allocate and free portions of the HDF5
+format address space, which in turn map onto functions in the file driver to
+allocate and free portions of file address space. The library tells the file
+driver how much format address space it wants to allocate and the driver
+decides what format address to use and how that format address is mapped onto
+the file address space. Usually the format address is chosen so that the file
+address can be calculated in constant time for data I/O operations (which are
+always specified by format addresses).
+
+
+The HDF5 format allows an optional userblock to appear before the actual HDF5
+data in such a way that if the userblock is removed from the file and
+everything remaining is shifted downward in the file address space, then the
+file is still a valid HDF5 file. The userblock size can be zero or any
+power of two greater than or equal to 512 and the file superblock begins
+immediately after the userblock.
+
+
+HDF5 allocates space for the userblock and superblock by calling an
+allocation function defined below, which must return a chunk of memory at
+format address zero on the first call.
+
+
+The library makes many types of allocation requests:
+
+H5FD_MEM_SUPER
+H5FD_MEM_BTREE
+H5FD_MEM_DRAW
+H5FD_MEM_META
+H5FD_MEM_GROUP
+H5FD_MEM_GHEAP
+H5FD_MEM_LHEAP
+H5FD_MEM_OHDR
+
+When a chunk of memory is freed the library adds it to a free list and
+allocation requests are satisfied from the free list before requesting memory
+from the file driver. Each type of allocation request enumerated above has its
+own free list, but the file driver can specify that certain object types can
+share a free list. It does so by providing an array which maps a request type
+to a free list. If any value of the map is H5MF_DEFAULT
(zero) then the
+object's own free list is used. The special value H5MF_NOLIST
indicates
+that the library should not attempt to maintain a free list for that
+particular object type, instead calling the file driver each time an object of
+that type is freed.
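Resolving a request type to its free list through such a map might look like the sketch below. The enum is a hypothetical stand-in for the library's memory types: the default value is taken to be zero as stated above, while the value chosen for the "no list" marker (-1) is an assumption made only for this example.

```c
#include <assert.h>

/* Hypothetical stand-in for the request types and special map values. */
typedef enum {
    MEM_NOLIST = -1,  /* assumed value: keep no free list for this type */
    MEM_DEFAULT = 0,  /* use the request type's own free list           */
    MEM_SUPER, MEM_BTREE, MEM_DRAW, MEM_META,
    MEM_GROUP, MEM_GHEAP, MEM_LHEAP, MEM_OHDR,
    MEM_NTYPES
} mem_t;

/*
 * Return the free list that serves requests of the given type, or
 * MEM_NOLIST when the driver must be called for every free.
 */
static mem_t
free_list_for(const mem_t map[MEM_NTYPES], mem_t type)
{
    assert(type > MEM_DEFAULT && type < MEM_NTYPES);
    if (MEM_DEFAULT == map[type]) return type;
    return map[type];
}
```

A map that is all zeros gives every request type its own list; setting two entries to the same value makes those types share one list.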
+
+
+Mappings predefined in the `H5FDpublic.h' file are: +
H5FD_FLMAP_SINGLE
+H5FD_FLMAP_DICHOTOMY
+H5FD_FLMAP_DEFAULT
+
+Example: To make a map that manages object headers on one free list
+and everything else on another free list one might initialize the map with the
+following code: (the use of H5FD_MEM_SUPER
is arbitrary)
+
+
+H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
+
+for (mt=0; mt<H5FD_MEM_NTYPES; mt++) {
+    map[mt] = (H5FD_MEM_OHDR==mt) ? mt : H5FD_MEM_SUPER;
+}
+
+
+If an allocation request cannot be satisfied from the free list then one of
+two things happens. If the driver defines an allocation callback then it is
+used to allocate space; otherwise new memory is allocated from the end of the
+format address space by incrementing the end-of-address marker.
+
++
+The file argument is the file from which space is to be allocated,
+type is the type of memory being requested (from the list above) without
+being mapped according to the freelist map and size is the number of
+bytes being requested. The library is allowed to allocate large chunks of
+storage and manage them in a layer above the file driver (although the current
+library doesn't do that). The allocation function should return a format
+address for the first byte allocated. The allocated region extends from that
+address for size bytes. If the request cannot be honored then the
+undefined address value is returned (HADDR_UNDEF
). The first call to
+this function for a file which has never had memory allocated must
+return a format address of zero or HADDR_UNDEF
since this is how the
+library allocates space for the userblock and/or superblock.
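When a driver defines no allocation callback, the fallback of growing the format address space at its end can be sketched as follows. All names here (`addr_t`, `ADDR_UNDEF`, `file_t`, `alloc_from_eoa`) are hypothetical, and the per-file `maxaddr` limit is an assumption used to show the failure path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the format address type and its undefined value. */
typedef uint64_t addr_t;
#define ADDR_UNDEF ((addr_t)(-1))

/* Minimal file state: end-of-address marker and the largest usable address. */
typedef struct {
    addr_t eoa;
    addr_t maxaddr;
} file_t;

/* Allocate size bytes at the end of the format address space. */
static addr_t
alloc_from_eoa(file_t *file, uint64_t size)
{
    addr_t addr;

    assert(file && file->eoa <= file->maxaddr);
    if (size > file->maxaddr - file->eoa) return ADDR_UNDEF;
    addr = file->eoa;
    file->eoa += size;   /* the region is [addr, addr+size) */
    return addr;
}
```

Because the marker starts at zero for a file that has never had memory allocated, the first successful call returns format address zero, which matches the requirement stated above.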
+
+Example: To be written later. + +
+ + +
+When the library is finished using a certain region of the format address
+space it will return the space to the free list according to the type of
+memory being freed and the free list map described above. If the free list has
+been disabled for a particular memory usage type (according to the free list
+map) and the driver defines a free
callback then it will be
+invoked. The free
callback is also invoked for all entries on the free
+list when the file is closed.
+
+
+
+The file argument is the file for which space is being freed; type
+is the type of object being freed (from the list above) without being mapped
+according to the freelist map; addr is the first format address to free;
+and size is the size in bytes of the region being freed. The region
+being freed may refer to just part of the region originally allocated and/or
+may cross allocation boundaries provided all regions being freed have the same
+usage type. However, the library will never attempt to free regions which have
+already been freed or which have never been allocated.
+
+A driver may choose to not define the free
function, in which case
+format addresses will be leaked. This isn't normally a huge problem since the
+library contains a simple free list of its own and freeing parts of the format
+address space is not a common occurrence.
+
+
+Example: To be written later. + +
+
+
+Each file driver must have some mechanism for setting and querying the end of
+address, or EOA, marker. The EOA marker is the first format address
+after the last format address ever allocated. If the last part of the
+allocated address range is freed then the driver may optionally decrease the
+eoa marker.
+
++
+This function returns the current value of the EOA marker for the specified
+file.
+
+Example: The sec2 driver just returns the current eoa marker value
+which is cached in the file structure:
+
+
+static haddr_t
+H5FD_sec2_get_eoa(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    return file->eoa;
+}
+
+
+The eoa marker is initially zero when a file is opened and the library may set
+it to some other value shortly after the file is opened (after the superblock
+is read and the saved eoa marker is determined) or when allocating additional
+memory in the absence of an alloc
callback (described above).
+
+
+Example: The sec2 driver simply caches the eoa marker in the file
+structure and does not extend the underlying Unix file. When the file is
+flushed or closed then the Unix file size is extended to match the eoa marker.
+
+
+static herr_t
+H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    file->eoa = addr;
+    return 0;
+}
+
+
+
+These functions operate on data, transferring a region of the format address
+space between memory and files.
+
+
+A driver must specify two functions to transfer data from the library to the
+file and vice versa.
+
++
+The read
function reads data from file file beginning at address
+addr and continuing for size bytes into the buffer buf
+supplied by the caller. The write
function transfers data in the
+opposite direction. Both functions take a data transfer property list
+dxpl which indicates the fine points of how the data is to be
+transferred and which comes directly from the H5Dread
or
+H5Dwrite
function.
+
+Both functions should return a negative value if they fail to transfer the
+requested data, or non-negative if they succeed. The library will never
+attempt to read from unallocated regions of the format address space.
+
+
+Example: The sec2 driver just makes system calls. It tries not to
+call lseek
if the current operation is the same as the previous
+operation and the file position is correct. It also fills the output buffer
+with zeros when reading between the current EOF and EOA markers and restarts
+system calls which were interrupted.
+
+
+static herr_t
+H5FD_sec2_read(H5FD_t *_file, hid_t dxpl_id/*unused*/, haddr_t addr,
+               hsize_t size, void *buf/*out*/)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    ssize_t nbytes;
+
+    assert(file && file->pub.cls);
+    assert(buf);
+
+    /* Check for overflow conditions */
+    if (REGION_OVERFLOW(addr, size)) return -1;
+    if (addr+size>file->eoa) return -1;
+
+    /* Seek to the correct location */
+    if ((addr!=file->pos || OP_READ!=file->op) &&
+        file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
+        file->pos = HADDR_UNDEF;
+        file->op = OP_UNKNOWN;
+        return -1;
+    }
+
+    /*
+     * Read data, being careful of interrupted system calls, partial results,
+     * and the end of the file.
+     */
+    while (size>0) {
+        do nbytes = read(file->fd, buf, size);
+        while (-1==nbytes && EINTR==errno);
+        if (-1==nbytes) {
+            /* error */
+            file->pos = HADDR_UNDEF;
+            file->op = OP_UNKNOWN;
+            return -1;
+        }
+        if (0==nbytes) {
+            /* end of file but not end of format address space */
+            memset(buf, 0, size);
+            size = 0;
+        }
+        assert(nbytes>=0);
+        assert((hsize_t)nbytes<=size);
+        size -= (hsize_t)nbytes;
+        addr += (haddr_t)nbytes;
+        buf = (char*)buf + nbytes;
+    }
+
+    /* Update current position */
+    file->pos = addr;
+    file->op = OP_READ;
+    return 0;
+}
+
+
+Example: The sec2 write
callback is similar except it updates
+the file EOF marker when extending the file.
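The seek-avoidance rule used by the sec2 read and write callbacks can be isolated into a small predicate. The sketch below uses hypothetical names (`op_t`, `POS_UNDEF`, `need_seek`) and assumes, as in the read example, that a seek is skipped only when both the cached position and the last operation match the new request.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the cached last-operation and position values. */
typedef enum { OP_UNKNOWN, OP_READ, OP_WRITE } op_t;
#define POS_UNDEF ((uint64_t)(-1))

/*
 * Return nonzero when the file must be repositioned before an I/O request
 * of kind next_op at address addr, given the cached position and the kind
 * of the previous operation.
 */
static int
need_seek(uint64_t pos, op_t last_op, uint64_t addr, op_t next_op)
{
    assert(OP_UNKNOWN != next_op);
    return pos != addr || last_op != next_op;
}
```

After a failed system call the cached state would be reset to `POS_UNDEF` and `OP_UNKNOWN`, which forces a seek on the next request.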
+
+
+Some drivers may desire to cache data in memory in order to make larger I/O
+requests to the underlying file and thus improve bandwidth. Such drivers
+should register a cache flushing function so that the library can ensure that
+data has been flushed out of the drivers in response to the application
+calling H5Fflush
.
+
+
+
+Flush all data for file file to storage.
+
+Example: The sec2 driver doesn't cache any data but it also doesn't
+extend the Unix file as aggressively as it should. Therefore, when finalizing a
+file it should write a zero to the last byte of the allocated region so that
+when reopening the file later the EOF marker will be at least as large as the
+EOA marker saved in the superblock (otherwise HDF5 will refuse to open the
+file, claiming that the data appears to be truncated).
+
+
+static herr_t
+H5FD_sec2_flush(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+
+    if (file->eoa>file->eof) {
+        if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1;
+        if (write(file->fd, "", 1)!=1) return -1;
+        file->eof = file->eoa;
+        file->pos = file->eoa;
+        file->op = OP_WRITE;
+    }
+
+    return 0;
+}
+
+
+
+Before a driver can be used the HDF5 library needs to be told of its
+existence. This is done by registering the driver, which results in a driver
+identification number. Instead of passing many arguments to the registration
+function, the driver information is entered into a structure and the address
+of the structure is passed to the registration function where it is
+copied. This allows the HDF5 API to be extended while providing backward
+compatibility at the source level.
+
++
+The driver described by struct cls is registered with the library and an
+ID number for the driver is returned.
+
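The register-by-copy pattern can be illustrated with a small fixed-size table. This is only a sketch of the idea: `driver_class_t`, `register_driver`, `lookup_driver`, and the table size are hypothetical, and the real class structure carries many more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced driver class description. */
typedef struct {
    const char *name;
    size_t fapl_size;
} driver_class_t;

#define MAX_DRIVERS 16
static driver_class_t registry[MAX_DRIVERS];
static int nregistered = 0;

/* Copy the class into the registry and return its ID, or -1 on failure. */
static int
register_driver(const driver_class_t *cls)
{
    if (!cls || !cls->name || nregistered >= MAX_DRIVERS) return -1;
    registry[nregistered] = *cls;   /* copied: caller may reuse its struct */
    return nregistered++;
}

/* Return the registered copy for an ID, or NULL if the ID is unknown. */
static const driver_class_t *
lookup_driver(int id)
{
    if (id < 0 || id >= nregistered) return NULL;
    return &registry[id];
}
```

Because the structure is copied at registration time, later changes to the caller's struct do not affect the registered driver, and the ID depends only on registration order.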
+The H5FD_class_t
type is a struct with the following fields:
+
+
const char *name
+size_t fapl_size
+void *(*fapl_copy)(const void *fapl)
+fapl_size
when both are defined.
+void (*fapl_free)(void *fapl)
+free
function to free the
+structure.
+size_t dxpl_size
+void *(*dxpl_copy)(const void *dxpl)
+dxpl_size
when both are
+defined.
+void (*dxpl_free)(void *dxpl)
+free
function to
+free the structure.
+H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr)
+herr_t (*close)(H5FD_t *file)
+int (*cmp)(const H5FD_t *f1, const H5FD_t *f2)
+haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size)
+herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size)
+haddr_t (*get_eoa)(H5FD_t *file)
+herr_t (*set_eoa)(H5FD_t *file, haddr_t)
+haddr_t (*get_eof)(H5FD_t *file)
+herr_t (*read)(H5FD_t *file, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer)
+herr_t (*write)(H5FD_t *file, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer)
+herr_t (*flush)(H5FD_t *file)
+H5FD_mem_t fl_map[H5FD_MEM_NTYPES]
++Example: The sec2 driver would be registered as: + +
+
+static const H5FD_class_t H5FD_sec2_g = {
+    "sec2",                 /*name      */
+    MAXADDR,                /*maxaddr   */
+    NULL,                   /*sb_size   */
+    NULL,                   /*sb_encode */
+    NULL,                   /*sb_decode */
+    0,                      /*fapl_size */
+    NULL,                   /*fapl_get  */
+    NULL,                   /*fapl_copy */
+    NULL,                   /*fapl_free */
+    0,                      /*dxpl_size */
+    NULL,                   /*dxpl_copy */
+    NULL,                   /*dxpl_free */
+    H5FD_sec2_open,         /*open      */
+    H5FD_sec2_close,        /*close     */
+    H5FD_sec2_cmp,          /*cmp       */
+    NULL,                   /*alloc     */
+    NULL,                   /*free      */
+    H5FD_sec2_get_eoa,      /*get_eoa   */
+    H5FD_sec2_set_eoa,      /*set_eoa   */
+    H5FD_sec2_get_eof,      /*get_eof   */
+    H5FD_sec2_read,         /*read      */
+    H5FD_sec2_write,        /*write     */
+    H5FD_sec2_flush,        /*flush     */
+    H5FD_FLMAP_SINGLE,      /*fl_map    */
+};
+
+hid_t
+H5FD_sec2_init(void)
+{
+    if (!H5FD_SEC2_g) {
+        H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g);
+    }
+    return H5FD_SEC2_g;
+}
+
+
+A driver can be removed from the library by unregistering it. + +
++
+Unregistering a driver makes it unusable for creating new file access or data +transfer property lists but doesn't affect any property lists or files that +already use that driver. + +
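+The registration and unregistration semantics described above can be sketched
+with a toy registry. This is only an illustration under stated assumptions:
+drv_class_t, drv_register, drv_use and drv_unregister are hypothetical names,
+the class struct is reduced to two fields, and the real library may track
+users differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for H5FD_class_t. */
typedef struct {
    const char *name;
    int (*flush)(void *file);
} drv_class_t;

#define MAX_DRIVERS 8

typedef struct {
    drv_class_t cls;        /* private copy of the caller's struct       */
    int         registered; /* may new property lists pick this driver?  */
    int         nusers;     /* property lists/files still using the copy */
} drv_slot_t;

static drv_slot_t slots[MAX_DRIVERS];

/* Copy *cls into a free slot; return a positive ID, or -1 if full. */
static int
drv_register(const drv_class_t *cls)
{
    int i;

    assert(cls != NULL);
    for (i = 0; i < MAX_DRIVERS; i++) {
        if (!slots[i].registered && slots[i].nusers == 0) {
            slots[i].cls = *cls;
            slots[i].registered = 1;
            return i + 1;
        }
    }
    return -1;
}

/* Attach a new user to the driver; refused once it is unregistered. */
static const drv_class_t *
drv_use(int id)
{
    if (id < 1 || id > MAX_DRIVERS || !slots[id - 1].registered) return NULL;
    slots[id - 1].nusers++;
    return &slots[id - 1].cls;
}

/* Unregister: no new users, but existing users keep a valid pointer. */
static int
drv_unregister(int id)
{
    if (id < 1 || id > MAX_DRIVERS || !slots[id - 1].registered) return -1;
    slots[id - 1].registered = 0;
    return 0;
}

/* Demo: returns 1 if an existing user survives unregistration while a
 * new user is refused. */
static int
demo_registry(void)
{
    drv_class_t cls = { "demo", NULL };
    int id = drv_register(&cls);
    const drv_class_t *in_use = drv_use(id);

    cls.name = "changed";   /* the registry holds its own copy */
    drv_unregister(id);
    return in_use != NULL && strcmp(in_use->name, "demo") == 0 &&
           drv_use(id) == NULL;
}
```

+Because the registry copies the struct, a later version of the library could
+append fields to the class without changing the signature of the registration
+function.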
+ + + ++
+This function is intended to be used by driver functions, not applications.
+It returns a pointer directly into the file access property list
+fapl
which is a copy of the driver's file access mode originally
+provided to the H5Pset_driver
function. If its argument is a data
+transfer property list fxpl
then it returns a pointer to the
+driver-specific data transfer information instead.
+
+The various private H5F_low_*
functions will be replaced by public
+H5FD*
functions so they can be called from drivers.
+
+
+All private functions H5F_addr_*
which operate on addresses will be
+renamed as public functions by removing the first underscore so they can be
+called by drivers.
+
+
+The haddr_t
address data type will be passed by value throughout the
+library. The original intent was that this type would eventually be a union of
+file address types for the various drivers and may become quite large, but
+that was back when drivers were part of HDF5. It will become an alias for an
+unsigned integer type (32 or 64 bits depending on how the library was
+configured).
+
+
+The various H5F*.c
driver files will be renamed H5FD*.c
and each
+will have a corresponding header file. All driver functions except the
+initializer and API will be declared static.
+
+
+This documentation doesn't cover optimization functions which would be useful +to drivers like MPI-IO. Some drivers may be able to perform data pipeline +operations more efficiently than HDF5 and need to be given a chance to +override those parts of the pipeline. The pipeline would be designed to call +various H5FD optimization functions at various points which return one of +three values: the operation is not implemented by the driver, the operation is +implemented but failed in a non-recoverable manner, or the operation is +implemented and succeeded. + +
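+One possible shape for such a three-valued hook is sketched below. Since no
+optimization API has been defined yet, every name here (opt_status_t,
+opt_copy_fn, pipeline_copy) is hypothetical, and a plain memory copy stands
+in for a real pipeline operation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status codes; no such API is defined by the library yet. */
typedef enum {
    OPT_NOT_IMPLEMENTED,    /* driver has no opinion; use the generic path */
    OPT_FAILED,             /* driver tried and failed, non-recoverable    */
    OPT_SUCCEEDED           /* driver handled the operation itself         */
} opt_status_t;

/* Hypothetical driver hook for one pipeline step. */
typedef opt_status_t (*opt_copy_fn)(void *dst, const void *src, size_t n);

/* Pipeline step: offer the operation to the driver first and fall back
 * to the generic code only when the driver does not implement it.
 * Returns 0 on success, -1 on failure. */
static int
pipeline_copy(opt_copy_fn hook, void *dst, const void *src, size_t n)
{
    opt_status_t st;

    assert(dst != NULL && src != NULL);
    st = hook ? hook(dst, src, n) : OPT_NOT_IMPLEMENTED;
    switch (st) {
    case OPT_SUCCEEDED:
        return 0;
    case OPT_FAILED:
        return -1;          /* do not retry with the generic path */
    case OPT_NOT_IMPLEMENTED:
    default:
        memcpy(dst, src, n);
        return 0;
    }
}

/* A hook that always reports a non-recoverable failure. */
static opt_status_t
always_fails(void *dst, const void *src, size_t n)
{
    (void)dst; (void)src; (void)n;
    return OPT_FAILED;
}
```

+Keeping "not implemented" distinct from "failed" lets the pipeline fall back
+silently in the first case while still reporting real errors in the second.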
++Various parts of HDF5 check only the top-level file driver and do +something special if it is the MPI-IO driver. However, we might want to be +able to put the MPI-IO driver under other drivers such as the raw part of a +split driver or under a debug driver whose sole purpose is to accumulate +statistics as it passes all requests through to the MPI-IO driver. Therefore +we will probably need a function which takes a format address and/or object +type and returns the driver which would have been used at the lowest level to +process the request. + +
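+A lookup of this kind might be sketched as follows. The sketch assumes that
+each driver records, per memory type, the driver underneath it, and it routes
+on object type only; drv_t, mem_type_t and lowest_driver are hypothetical
+names, not library API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical memory types, reduced to two for the sketch. */
typedef enum { MEM_META, MEM_RAW, MEM_NTYPES } mem_type_t;

/* Hypothetical driver node: a name plus the driver that serves each
 * memory type underneath it (NULL for a lowest-level driver). */
typedef struct drv drv_t;
struct drv {
    const char *name;
    drv_t      *child[MEM_NTYPES];
};

/* Follow the per-type links down to the driver that would finally
 * process a request of the given type. */
static const drv_t *
lowest_driver(const drv_t *top, mem_type_t type)
{
    assert(top != NULL);
    while (top->child[type])
        top = top->child[type];
    return top;
}

/* True if drv is the driver with the given name. */
static int
is_driver(const drv_t *drv, const char *name)
{
    return drv != NULL && strcmp(drv->name, name) == 0;
}

/* Demo stack: a split driver whose raw part is a debug driver that
 * passes everything through to an MPI-IO driver. */
static const char *
demo_lowest(mem_type_t type)
{
    static drv_t mpio  = { "mpio",  { NULL,  NULL   } };
    static drv_t sec2  = { "sec2",  { NULL,  NULL   } };
    static drv_t debug = { "debug", { &mpio, &mpio  } };
    static drv_t split = { "split", { &sec2, &debug } };

    return lowest_driver(&split, type)->name;
}
```

+With such a function, code that currently tests the top-level driver could
+instead test is_driver(lowest_driver(top, MEM_RAW), "mpio") and would keep
+working when the MPI-IO driver sits beneath a split or debug driver.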
+ ++
The driver name is by convention and might +not apply to drivers which are not distributed with HDF5. +
The access method also indicates how to translate +the storage name to a storage server such as a file, network protocol, or +memory. +
The term +"file access property list" is a misnomer since storage isn't +required to be a file. +
This +function is overloaded to operate on data transfer property lists also, as +described below. +
Read-only access is only appropriate when opening an existing +file. +
For instance, writing data to one handle will cause +the data to be immediately visible on the other handle. +
The ordering is +arbitrary as long as it's consistent within a particular file driver. +
File access modes do not describe data, but rather +describe how the HDF5 format address space is mapped to the underlying +file(s). Thus, in general the mapping must be known before the file superblock +can be read. However, the user usually knows enough about the mapping for the +superblock to be readable and once the superblock is read the library can fill +in the missing parts of the mapping. +
+This document was generated on 18 November 1999 using the +texi2html +translator version 1.51.
+ +