HDF5 files are composed of a super block describing information required to portably access files on multiple platforms, followed by information about the groups in a file and the datasets in the file. The super block contains information about the size of offsets and lengths of objects, the number of entries in symbol tables (used to store groups) and additional version information for the file.
The HDF5 library assumes that all files are implicitly opened for read access at all times. Passing the H5F_ACC_RDWR parameter to H5Fopen() also allows write access to a file. H5Fcreate() assumes write access as well as read access. Passing H5F_ACC_TRUNC forces the truncation of an existing file; otherwise H5Fcreate() will fail rather than overwrite an existing file.
Files are created with the H5Fcreate() function, and existing files can be accessed with H5Fopen(). Both functions return an object ID which should eventually be released by calling H5Fclose().
hid_t H5Fcreate (const char *name, unsigned flags, hid_t create_properties, hid_t access_properties)
This function creates a new file with the given name, opened for read and write access. If the H5F_ACC_TRUNC flag is set, any current file is truncated when the new file is created. If a file of the same name exists and the H5F_ACC_TRUNC flag is not set (or the H5F_ACC_EXCL bit is set), this function will fail. Passing H5P_DEFAULT for the creation and/or access property lists uses the library's default values for those properties. Creating and changing the values of a property list is documented further below. The return value is an ID for the open file, which should be closed by calling H5Fclose() when it's no longer needed. A negative value is returned for failure.
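The creation rules above amount to a small decision table. The sketch below is only a model of the wording in this section, not library code: the DEMO_ constants and the demo_create_outcome() helper are hypothetical names, and the real flag values come from the HDF5 headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the library's flag bits. */
enum {
    DEMO_ACC_TRUNC = 0x1,
    DEMO_ACC_EXCL  = 0x2
};

typedef enum {
    CREATE_NEW,       /* no file existed, a new one is created  */
    CREATE_TRUNCATE,  /* the existing file is truncated         */
    CREATE_FAIL       /* the existing file is kept, call fails  */
} create_outcome;

/* Model of the H5Fcreate() rules as stated in the text above. */
create_outcome demo_create_outcome(bool file_exists, unsigned flags)
{
    /* Only the two modelled bits are accepted. */
    assert((flags & ~(unsigned)(DEMO_ACC_TRUNC | DEMO_ACC_EXCL)) == 0);

    if (!file_exists)
        return CREATE_NEW;
    if ((flags & DEMO_ACC_EXCL) || !(flags & DEMO_ACC_TRUNC))
        return CREATE_FAIL;
    return CREATE_TRUNCATE;
}
```

In this model, an existing file is only ever replaced when the truncate bit is set and the exclusive bit is clear.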
hid_t H5Fopen (const char *name, unsigned flags, hid_t access_properties)
This function opens an existing file for read access, and also for write access if the H5F_ACC_RDWR flag is set. The access_properties argument is a file access property list ID or H5P_DEFAULT for the default I/O access parameters. Creating and changing the parameters for access property lists is documented further below. A file which is opened more than once returns a unique identifier for each H5Fopen() call and can be accessed through all of its file IDs. The return value is an ID for the open file, which should be closed by calling H5Fclose() when it's no longer needed. A negative value is returned for failure.
herr_t H5Fclose (hid_t file_id)
This function closes a file that was opened with H5Fcreate() or H5Fopen(). After closing a file the file_id should not be used again. This function returns zero for success or a negative value for failure.
herr_t H5Fflush (hid_t object_id, H5F_scope_t scope)
Additional parameters to H5Fcreate() or H5Fopen() are passed through property list objects, which are created with the H5Pcreate() function. These objects allow many parameters of a file's creation or access to be changed from the default values. Property lists are used as a portable and extensible method of modifying multiple parameter values with simple API functions.
There are two kinds of file-related property lists, namely file creation properties and file access properties.
File creation property lists apply to H5Fcreate() only and are used to control the file meta data which is maintained in the super block of the file. The parameters which can be modified are:
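User block size: set and queried with the H5Pset_userblock() and H5Pget_userblock() calls.
Offset and length sizes: set and queried with the H5Pset_sizes() and H5Pget_sizes() calls.
Symbol table parameters: set and queried with the H5Pset_sym_k() and H5Pget_sym_k() calls.
Indexed storage parameters: set and queried with the H5Pset_istore_k() and H5Pget_istore_k() calls.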
File access property lists apply to H5Fcreate() or H5Fopen() and are used to control different methods of performing I/O on files.
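Unbuffered I/O: local permanent files can be accessed with the functions described in section 2 of the POSIX manual, namely open(), lseek(), read(), write(), and close(). The lseek64() function is used on operating systems that support it. This driver is enabled and configured with H5Pset_fapl_sec2().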
Buffered I/O: local permanent files can be accessed with the functions declared in stdio.h, namely fopen(), fseek(), fread(), fwrite(), and fclose(). The fseek64() function is used on operating systems that support it. This driver is enabled and configured with H5Pset_fapl_stdio().
Memory I/O: the core driver uses malloc() and free() to create storage space for the file. The total size of the file must be small enough to fit in virtual memory. The name supplied to H5Fcreate() is irrelevant, and H5Fopen() will always fail.
Parallel I/O: the MPI communicator and info object are passed to MPI_File_open() during file creation or open. The access_mode controls the kind of parallel access the application intends. (Note that it is likely that the next API revision will remove the access_mode parameter and have access control specified via the raw data transfer property list of H5Dread() and H5Dwrite().) These parameters are set and queried with the H5Pset_fapl_mpio() and H5Pget_fapl_mpio() calls.
Data alignment: the alignment of allocations in the file is controlled with the H5Pset_alignment() function. Any allocation request at least as large as some threshold will be aligned on an address which is a multiple of some number.
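The alignment rule can be made concrete with a little arithmetic. The following is a minimal sketch of the rule as worded above, not the library's allocator: demo_align_address() and its parameter names are hypothetical, and it assumes an unaligned address is rounded up to the next multiple.

```c
#include <assert.h>
#include <stdint.h>

/* Return the address at which a request of `size` bytes would be placed
 * if the next free address is `addr`. Requests of at least `threshold`
 * bytes are rounded up to a multiple of `alignment`; smaller requests
 * keep their address. */
uint64_t demo_align_address(uint64_t addr, uint64_t size,
                            uint64_t threshold, uint64_t alignment)
{
    assert(alignment > 0);

    if (size < threshold)
        return addr;

    uint64_t rem = addr % alignment;
    return rem == 0 ? addr : addr + (alignment - rem);
}
```

For example, with a threshold of 1024 and an alignment of 512, a 4096-byte request that would otherwise start at address 1000 is moved to 1024, while a 100-byte request stays at 1000.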
The following example shows how to create a file with 64-bit object offsets and lengths:
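    hid_t create_plist;
    hid_t file_id;

    /* Request 8-byte (64-bit) offsets and lengths */
    create_plist = H5Pcreate(H5P_FILE_CREATE);
    H5Pset_sizes(create_plist, 8, 8);

    /* Create the file, truncating any existing file of the same name */
    file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC,
                        create_plist, H5P_DEFAULT);
    .
    .
    .
    H5Fclose(file_id);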
The following example shows how to open an existing file for independent dataset access by MPI parallel I/O:
    hid_t access_plist;
    hid_t file_id;

    access_plist = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(access_plist, MPI_COMM_WORLD, MPI_INFO_NULL);

    /* H5Fopen must be called collectively */
    file_id = H5Fopen("test.h5", H5F_ACC_RDWR, access_plist);
    .
    .
    .
    /* H5Fclose must be called collectively */
    H5Fclose(file_id);
HDF5 is able to access its address space through various types of low-level file drivers. For instance, an address space might correspond to a single file on a Unix file system, multiple files on a Unix file system, multiple files on a parallel file system, or a block of memory within the application. Generally, an HDF5 address space is referred to as an HDF5 file regardless of how the space is organized at the storage level.
The sec2 driver uses functions from section 2 of the POSIX manual to access files stored on a local file system. These are the open(), close(), read(), write(), and lseek() functions. If the operating system supports lseek64() then it is used instead of lseek(). The library buffers meta data regardless of the low-level driver, but using this driver prevents data from being buffered again by the lowest layers of the HDF5 library.
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_SEC2 if the sec2 driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_sec2 (hid_t access_properties)
The stdio driver uses the functions declared in the stdio.h header file to access permanent files in a local file system. These are the fopen(), fclose(), fread(), fwrite(), and fseek() functions. If the operating system supports fseek64() then it is used instead of fseek(). Use of this driver introduces an additional layer of buffering beneath the HDF5 library.
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_STDIO if the stdio driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_stdio (hid_t access_properties)
The core driver uses malloc() and free() to allocate space for a file in the heap. Reading and writing a file of this type results in memory-to-memory copies instead of disk I/O and as a result is somewhat faster. However, the total file size must not exceed the amount of available virtual memory, and only one HDF5 file handle can access the file (because the name of such a file is insignificant and H5Fopen() always fails).
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_CORE if the core driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_core (hid_t access_properties, size_t block_size, hbool_t backing_store)
herr_t H5Pget_fapl_core (hid_t access_properties, size_t *block_size, hbool_t *backing_store)
This function queries the values set with H5Pset_fapl_core().
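To illustrate the idea of a file that lives entirely in heap memory, here is a toy model in plain C. It is not the core driver's implementation: the demo_core_ names are hypothetical, the buffer is grown with realloc() for brevity, and treating block_size as the growth increment is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy in-memory "file": a heap buffer that grows in fixed increments. */
typedef struct {
    unsigned char *buf;    /* heap storage, NULL until the first write */
    size_t         eof;    /* logical end of file in bytes             */
    size_t         cap;    /* bytes currently allocated                */
    size_t         block;  /* growth increment in bytes                */
} demo_core_file;

void demo_core_init(demo_core_file *f, size_t block)
{
    assert(block > 0);
    f->buf = NULL;
    f->eof = 0;
    f->cap = 0;
    f->block = block;
}

/* Copy n bytes into the file at offset off. The buffer grows to the next
 * multiple of f->block and new space is zero-filled.
 * Returns 0 on success, -1 on overflow or allocation failure. */
int demo_core_write(demo_core_file *f, size_t off, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (n > SIZE_MAX - off)
        return -1;

    size_t end = off + n;
    if (end > f->cap) {
        size_t blocks = (end - 1) / f->block + 1;
        if (blocks > SIZE_MAX / f->block)
            return -1;
        size_t newcap = blocks * f->block;
        unsigned char *p = realloc(f->buf, newcap);
        if (p == NULL)
            return -1;
        memset(p + f->cap, 0, newcap - f->cap);
        f->buf = p;
        f->cap = newcap;
    }
    memcpy(f->buf + off, src, n);   /* memory-to-memory copy, no disk I/O */
    if (end > f->eof)
        f->eof = end;
    return 0;
}

/* Copy n bytes out of the file. Fails if the range extends past eof. */
int demo_core_read(const demo_core_file *f, size_t off, void *dst, size_t n)
{
    if (off > f->eof || n > f->eof - off)
        return -1;
    if (n > 0)
        memcpy(dst, f->buf + off, n);
    return 0;
}

void demo_core_close(demo_core_file *f)
{
    free(f->buf);
    f->buf = NULL;
    f->eof = 0;
    f->cap = 0;
}
```

Because the data exists only in this buffer, nothing survives demo_core_close(), which mirrors why reopening such a file by name cannot work.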
This driver uses MPI I/O to provide parallel access to a file.
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_MPIO if the mpi driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_mpio (hid_t access_properties, MPI_Comm comm, MPI_Info info)
herr_t H5Pget_fapl_mpio (hid_t access_properties, MPI_Comm *comm, MPI_Info *info)
This function queries the values set with H5Pset_fapl_mpio().
A single HDF5 address space may be split into multiple files which, together, form a file family. Each member of the family must be the same logical size although the size and disk storage reported by ls(1) may be substantially smaller. The name passed to H5Fcreate() or H5Fopen() should include a printf(3c) style integer format specifier which will be replaced with the family member number (the first family member is zero).
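As a rough illustration of the naming scheme, the helper below expands a family name for one member number with snprintf(). The function name demo_family_member_name() and the example pattern are hypothetical, and the sketch trusts the caller to supply a pattern with exactly one integer specifier.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand a family name pattern such as "data%05d.h5" for one member.
 * Returns 0 on success, or -1 if the pattern has no '%' or the result
 * does not fit in buf. The pattern is otherwise not validated. */
int demo_family_member_name(char *buf, size_t size,
                            const char *pattern, int member)
{
    assert(buf != NULL && pattern != NULL);

    if (strchr(pattern, '%') == NULL)
        return -1;

    int n = snprintf(buf, size, pattern, member);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

With the pattern "data%05d.h5", member 0 expands to "data00000.h5" and member 3 to "data00003.h5".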
Any HDF5 file can be split into a family of files by running the file through split(1) and numbering the output files. However, because HDF5 is lazy about extending the size of family members, a valid file cannot generally be created by concatenation of the family members. Additionally, split and cat don't attempt to generate files with holes. The h5repart program can be used to repartition an HDF5 file or family into another file or family and preserves holes in the files.
h5repart [-v] [-b block_size[suffix]] [-m member_size[suffix]] source destination
A family is used for the source and/or destination if its name includes a printf-style integer format such as "%d". The -v switch prints input and output file names on the standard error stream for progress monitoring, -b sets the I/O block size (the default is 1kB), and -m sets the output member size if the destination is a family name (the default is 1GB). The block and member sizes may be suffixed with the letters g, m, or k for GB, MB, or kB respectively.
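The size arguments can be read as a number followed by an optional one-letter multiplier. The parser below is a sketch of that reading, not code taken from h5repart: demo_parse_size() is a hypothetical name, and binary multiples (k = 1024) are an assumption, since the text does not say whether kB means 1000 or 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse "<digits>[g|m|k]" into a byte count.
 * Returns 0 for malformed input. Overflow is not checked. */
uint64_t demo_parse_size(const char *s)
{
    assert(s != NULL);

    if (*s < '0' || *s > '9')
        return 0;                       /* must start with a digit */

    char *end;
    unsigned long long n = strtoull(s, &end, 10);

    switch (*end) {
    case 'g': n <<= 30; end++; break;   /* assumed: 1g = 2^30 bytes */
    case 'm': n <<= 20; end++; break;   /* assumed: 1m = 2^20 bytes */
    case 'k': n <<= 10; end++; break;   /* assumed: 1k = 2^10 bytes */
    case '\0': break;                   /* plain byte count         */
    default:  return 0;                 /* unknown suffix           */
    }
    return *end == '\0' ? (uint64_t)n : 0;
}
```

Under these assumptions the stated defaults correspond to demo_parse_size("1k") for the block size and demo_parse_size("1g") for the member size.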
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_FAMILY if the family driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_family (hid_t access_properties, hsize_t memb_size, hid_t member_properties)
If the off_t type is four bytes then the maximum family member size is usually 2^31-1 because the byte at offset 2,147,483,647 is generally inaccessible. Additional parameters may be added to this function in the future.
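Since every member has the same logical size, an address in the HDF5 address space can be mapped to a member number and an offset inside that member by division. The sketch below assumes members are laid out back to back in address order; the demo_ names are hypothetical and this is not the family driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Largest member size when off_t is four bytes: 2^31 - 1. */
#define DEMO_MAX_MEMB_SIZE_32 ((uint64_t)INT32_MAX)

typedef struct {
    uint64_t member;  /* zero-based family member number */
    uint64_t offset;  /* byte offset inside that member  */
} demo_family_pos;

/* Map an address in the HDF5 address space to a family member. */
demo_family_pos demo_family_locate(uint64_t addr, uint64_t memb_size)
{
    assert(memb_size > 0);

    demo_family_pos pos = { addr / memb_size, addr % memb_size };
    return pos;
}
```

With a member size of 1000 bytes, address 2500 falls in member 2 at offset 500.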
herr_t H5Pget_fapl_family (hid_t access_properties, hsize_t *memb_size, hid_t *member_properties)
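If member_properties is non-null then it will contain a copy of the member access property list, which should be closed by calling H5Pclose() when the application is finished with it. If memb_size is non-null then it will contain the logical size in bytes of each family member. In the future, additional arguments may be added to this function to match those added to H5Pset_fapl_family().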
On occasion, it might be useful to separate meta data from raw data. The split driver does this by creating two files: one for meta data and another for raw data. The application provides a base file name to H5Fcreate() or H5Fopen() and this driver appends a file extension which defaults to .meta for the meta data file and .raw for the raw data file. Each file can have its own file access property list which allows, for instance, a split file with meta data stored with the core driver and raw data stored with the sec2 driver.
hid_t H5Pget_driver (hid_t access_properties)
This function returns the constant H5FD_SPLIT if the split driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_fapl_split (hid_t access_properties, const char *meta_extension, hid_t meta_properties, const char *raw_extension, hid_t raw_properties)
The meta file will have a name which is formed by appending meta_extension (or .meta) to the end of the base name and will be accessed according to the meta_properties. The raw file will have a name which is formed by appending raw_extension (or .raw) to the base name and will be accessed according to the raw_properties. Additional parameters may be added to this function in the future.
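The naming rule for the two files can be sketched with plain string handling. In the helper below, demo_split_names() is a hypothetical name and the use of a NULL extension to select the default is an assumption made for illustration; the defaults themselves are the ones given in the text.

```c
#include <assert.h>
#include <string.h>

/* Join base and ext into out. Returns -1 if the result does not fit. */
static int demo_join(char *out, size_t size,
                     const char *base, const char *ext)
{
    if (strlen(base) + strlen(ext) + 1 > size)
        return -1;
    strcpy(out, base);
    strcat(out, ext);
    return 0;
}

/* Build the meta and raw file names for a base name.
 * A NULL extension selects ".meta" or ".raw" respectively.
 * Both output buffers must hold at least `size` bytes. */
int demo_split_names(const char *base,
                     const char *meta_ext, const char *raw_ext,
                     char *meta_name, char *raw_name, size_t size)
{
    assert(base != NULL && meta_name != NULL && raw_name != NULL);

    if (demo_join(meta_name, size, base, meta_ext ? meta_ext : ".meta") < 0)
        return -1;
    if (demo_join(raw_name, size, base, raw_ext ? raw_ext : ".raw") < 0)
        return -1;
    return 0;
}
```

For the base name "test" with both extensions left at their defaults, this yields "test.meta" and "test.raw".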