HDF5 files are composed of a "boot block" describing information required to portably access files on multiple platforms, followed by information about the groups in a file and the datasets in the file. The boot block contains information about the size of offsets and lengths of objects, the number of entries in symbol tables (used to store groups) and additional version information for the file.
The HDF5 library assumes that all files are implicitly opened for read access at all times. Passing the H5F_ACC_RDWR parameter to H5Fopen() also allows write access to a file. H5Fcreate() assumes write access as well as read access. Passing H5F_ACC_TRUNC forces the truncation of an existing file; otherwise H5Fcreate() will fail rather than overwrite an existing file.
Files are created with the H5Fcreate() function, and existing files can be accessed with H5Fopen(). Both functions return an object ID which should eventually be released by calling H5Fclose().
hid_t H5Fcreate (const char *name, uintn flags, hid_t create_properties, hid_t access_properties)
If the H5F_ACC_TRUNC flag is set, any existing file is truncated when the new file is created. If a file of the same name exists and the H5F_ACC_TRUNC flag is not set (or the H5F_ACC_EXCL bit is set), this function will fail. Passing H5P_DEFAULT for the creation and/or access property lists uses the library's default values for those properties. Creating and changing the values of a property list is documented further below. The return value is an ID for the open file, which should be closed by calling H5Fclose() when it is no longer needed. A negative value is returned for failure.
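The flag rules for file creation can be summarized as a small decision function. The following is only a model of the behavior described in the text, not HDF5 code: the constants MODEL_ACC_TRUNC and MODEL_ACC_EXCL, the type create_outcome_t, and the function create_outcome() are hypothetical names introduced for illustration, and the numeric values are not taken from the HDF5 headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for H5F_ACC_TRUNC and H5F_ACC_EXCL.
 * The numeric values are illustrative only. */
enum { MODEL_ACC_TRUNC = 0x1, MODEL_ACC_EXCL = 0x2 };

typedef enum {
    CREATE_NEW,       /* no file existed, so a new one is created  */
    CREATE_TRUNCATE,  /* an existing file is truncated             */
    CREATE_FAIL       /* the call fails, existing file is untouched */
} create_outcome_t;

/* Model of what a create call does for a given set of flags. */
create_outcome_t create_outcome(bool file_exists, unsigned flags)
{
    /* Only the two modeled flags are understood by this sketch. */
    assert((flags & ~(unsigned)(MODEL_ACC_TRUNC | MODEL_ACC_EXCL)) == 0);

    if (!file_exists)
        return CREATE_NEW;
    if (flags & MODEL_ACC_EXCL)
        return CREATE_FAIL;      /* exclusive creation never overwrites */
    if (flags & MODEL_ACC_TRUNC)
        return CREATE_TRUNCATE;
    return CREATE_FAIL;          /* no truncate flag: refuse to overwrite */
}
```

For example, create_outcome(true, MODEL_ACC_TRUNC) yields CREATE_TRUNCATE, while create_outcome(true, 0) yields CREATE_FAIL.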
hid_t H5Fopen (const char *name, uintn flags, hid_t access_properties)
The file is opened for reading, and also for writing if the H5F_ACC_RDWR flag is set. The access_properties argument is a file access property list ID or H5P_DEFAULT for the default I/O access parameters. Creating and changing the parameters for access property lists is documented further below. A file which is opened more than once returns a unique identifier for each H5Fopen() call and can be accessed through all of its file IDs. The return value is an ID for the open file, which should be closed by calling H5Fclose() when it is no longer needed. A negative value is returned for failure.
herr_t H5Fclose (hid_t file_id)
Closes a file ID returned by H5Fcreate() or H5Fopen(). After closing a file, the file_id should not be used again. This function returns zero for success or a negative value for failure.
herr_t H5Fflush (hid_t object_id)
Additional parameters to H5Fcreate() or H5Fopen() are passed through property list objects, which are created with the H5Pcreate() function. These objects allow many parameters of a file's creation or access to be changed from the default values. Property lists are used as a portable and extensible method of modifying multiple parameter values with simple API functions. 
There are two kinds of file-related property lists, namely file creation properties and file access properties. File creation property lists apply to H5Fcreate() only and are used to control the file meta-data which is maintained in the boot block of the file. The parameters which can be modified are:
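The user block size, set and queried with the H5Pset_userblock() and H5Pget_userblock() calls.
The sizes of object offsets and lengths, set and queried with the H5Pset_sizes() and H5Pget_sizes() calls.
The symbol table parameters, set and queried with the H5Pset_sym_k() and H5Pget_sym_k() calls.
The indexed storage parameters, set and queried with the H5Pset_istore_k() and H5Pget_istore_k() calls.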
File access property lists apply to H5Fcreate() or H5Fopen() and are used to control different methods of performing I/O on files.
The sec2 driver accesses local files with functions from section 2 of the POSIX manual, namely open(), lseek(), read(), write(), and close(). The lseek64() function is used on operating systems that support it. This driver is enabled and configured with H5Pset_sec2(), and queried with H5Pget_sec2().
The stdio driver accesses local files with the functions declared in the stdio.h header file, namely fopen(), fseek(), fread(), fwrite(), and fclose(). The fseek64() function is used on operating systems that support it. This driver is enabled and configured with H5Pset_stdio(), and queried with H5Pget_stdio().
The core driver uses malloc() and free() to create storage space for the file. The total size of the file must be small enough to fit in virtual memory. The name supplied to H5Fcreate() is irrelevant, and H5Fopen() will always fail.
The mpi driver provides parallel access to a file through MPI I/O. The communicator and info object are passed to MPI_File_open() during file creation or open. The access_mode controls the kind of parallel access the application intends. (Note that it is likely that the next API revision will remove the access_mode parameter and have access control specified via the raw data transfer property list of H5Dread() and H5Dwrite().) These parameters are set and queried with the H5Pset_mpi() and H5Pget_mpi() calls.
Data alignment is controlled with the H5Pset_alignment() function. Any allocation request at least as large as some threshold will be aligned on an address which is a multiple of some number.
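The alignment rule can be illustrated with a little arithmetic. This is a minimal sketch of the rule as stated in the text, not the library's allocator: the function aligned_address() and its parameter names are hypothetical, and it assumes that "aligned" means rounding the next free address up to the nearest multiple.

```c
#include <assert.h>
#include <stdint.h>

/* Return the address at which a request of `size` bytes would be placed
 * when the next free address is `next_free`.
 * Requests of at least `threshold` bytes are rounded up to a multiple of
 * `alignment`; smaller requests are placed at `next_free` unchanged. */
uint64_t aligned_address(uint64_t next_free, uint64_t size,
                         uint64_t threshold, uint64_t alignment)
{
    assert(alignment > 0);

    if (alignment == 1 || size < threshold)
        return next_free;

    uint64_t remainder = next_free % alignment;
    return remainder == 0 ? next_free : next_free + (alignment - remainder);
}
```

With a threshold of 1024 and an alignment of 512, a 4096-byte request at free address 1000 lands at 1024, while a 100-byte request stays at 1000.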
The following example shows how to create a file with 64-bit object offsets and lengths:
    hid_t create_template;
    hid_t file_id;

    create_template = H5Pcreate(H5P_FILE_CREATE);
    H5Pset_sizes(create_template, 8, 8);

    file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC,
                        create_template, H5P_DEFAULT);
    /* . . . */
    H5Fclose(file_id);
The following example shows how to open an existing file for independent dataset access by MPI parallel I/O:
    hid_t access_template;
    hid_t file_id;

    access_template = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_mpi(access_template, MPI_COMM_WORLD, MPI_INFO_NULL);

    /* H5Fopen must be called collectively */
    file_id = H5Fopen("test.h5", H5F_ACC_RDWR, access_template);
    /* . . . */
    /* H5Fclose must be called collectively */
    H5Fclose(file_id);
HDF5 is able to access its address space through various types of low-level file drivers. For instance, an address space might correspond to a single file on a Unix file system, multiple files on a Unix file system, multiple files on a parallel file system, or a block of memory within the application. Generally, an HDF5 address space is referred to as an "HDF5 file" regardless of how the space is organized at the storage level.
The sec2 driver uses functions from section 2 of the POSIX manual to access files stored on a local file system. These are the open(), close(), read(), write(), and lseek() functions. If the operating system supports lseek64() then it is used instead of lseek(). The library buffers meta data regardless of the low-level driver, but using this driver prevents data from being buffered again by the lowest layers of the HDF5 library.
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_SEC2 if the sec2 driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_sec2 (hid_t access_properties)
herr_t H5Pget_sec2 (hid_t access_properties)
In the future, additional arguments may be added to this function to match those added to H5Pset_sec2().
The stdio driver uses the functions declared in the stdio.h header file to access permanent files in a local file system. These are the fopen(), fclose(), fread(), fwrite(), and fseek() functions. If the operating system supports fseek64() then it is used instead of fseek(). Use of this driver introduces an additional layer of buffering beneath the HDF5 library.
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_STDIO if the stdio driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_stdio (hid_t access_properties)
herr_t H5Pget_stdio (hid_t access_properties)
In the future, additional arguments may be added to this function to match those added to H5Pset_stdio().
The core driver uses malloc() and free() to allocate space for a file in the heap. Reading and writing a file of this type results in memory-to-memory copies instead of disk I/O and is therefore somewhat faster. However, the total file size must not exceed the amount of available virtual memory, and only one HDF5 file handle can access the file (because the name of such a file is insignificant and H5Fopen() always fails).
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_CORE if the core driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_core (hid_t access_properties, size_t block_size)
herr_t H5Pget_core (hid_t access_properties, size_t *block_size)
In the future, additional arguments may be added to this function to match those added to H5Pset_core().
This driver uses MPI I/O to provide parallel access to a file.
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_MPI if the mpi driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_mpi (hid_t access_properties, MPI_Comm comm, MPI_Info info)
herr_t H5Pget_mpi (hid_t access_properties, MPI_Comm *comm, MPI_Info *info)
In the future, additional arguments may be added to this function to match those added to H5Pset_mpi().
A single HDF5 address space may be split into multiple files which, together, form a file family. Each member of the family must be the same logical size although the size and disk storage reported by ls(1) may be substantially smaller. The name passed to H5Fcreate() or H5Fopen() should include a printf(3c) style integer format specifier which will be replaced with the family member number (the first family member is zero).
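The naming scheme can be sketched with snprintf(). This is only an illustration of how a member number might be substituted into a family name; the helper family_member_name() and the template "f%d" used in the usage note are hypothetical, and the library's own expansion may differ in detail. Because the template is used as a format string, it should come from a trusted source.

```c
#include <assert.h>
#include <stdio.h>

/* Expand a family name template containing one printf-style integer
 * conversion (for example "%d" or "%05d") into the name of family
 * member number `member`, where the first member is zero.
 * Returns 0 on success, or -1 if the name does not fit in `out`. */
int family_member_name(char *out, size_t out_size,
                       const char *name_template, int member)
{
    assert(out != NULL && name_template != NULL && member >= 0);

    int n = snprintf(out, out_size, name_template, member);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Calling family_member_name(buf, sizeof buf, "f%d", 3) stores "f3" in buf, and a template such as "data-%05d.h5" gives zero-padded member numbers.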
Any HDF5 file can be split into a family of files by running the file through split(1) and numbering the output files. However, because HDF5 is lazy about extending the size of family members, a valid file cannot generally be created by concatenation of the family members. Additionally, split and cat don't attempt to generate files with holes. The h5repart program can be used to repartition an HDF5 file or family into another file or family and preserves holes in the files.
h5repart [-v] [-b block_size[suffix]] [-m member_size[suffix]] source destination
The source and destination are treated as file families if their names include a printf-style integer format such as "%d". The -v switch prints input and output file names on the standard error stream for progress monitoring, -b sets the I/O block size (the default is 1kB), and -m sets the output member size if the destination is a family name (the default is 1GB). The block and member sizes may be suffixed with the letters g, m, or k for GB, MB, or kB respectively.
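The size arguments can be parsed along the following lines. This is a minimal sketch, not the code used by h5repart: the function parse_size() is hypothetical, and it assumes that k, m, and g stand for powers of 1024, which the text above does not state explicitly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse strings such as "512", "64k", "10m" or "1g" into a byte count.
 * ASSUMPTION: k, m and g are multiples of 1024, 1024^2 and 1024^3.
 * Returns 0 for a malformed string (so a literal "0" is not
 * distinguishable from an error); overflow is not checked. */
uint64_t parse_size(const char *text)
{
    assert(text != NULL);

    if (!isdigit((unsigned char)text[0]))
        return 0;

    char *end = NULL;
    unsigned long long value = strtoull(text, &end, 10);

    uint64_t multiplier = 1;
    switch (*end) {
    case 'k': multiplier = 1024ULL;               end++; break;
    case 'm': multiplier = 1024ULL * 1024;        end++; break;
    case 'g': multiplier = 1024ULL * 1024 * 1024; end++; break;
    case '\0': break;
    default:  return 0;               /* unknown suffix */
    }
    if (*end != '\0')
        return 0;                     /* trailing characters after suffix */

    return (uint64_t)value * multiplier;
}
```

Under that assumption the documented defaults correspond to parse_size("1k") for the block size and parse_size("1g") for the member size.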
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_FAMILY if the family driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_family (hid_t access_properties, hsize_t memb_size, hid_t member_properties)
If the off_t type is four bytes then the maximum family member size is usually 2^31-1 because the byte at offset 2,147,483,647 is generally inaccessible. Additional parameters may be added to this function in the future.
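To make the role of the member size concrete, here is a sketch of how an address in the HDF5 address space might map onto a family member. It assumes that members are laid out back to back starting at member zero and that every member has the same logical size; the type family_location_t, the macro MAX_MEMBER_SIZE_32BIT_OFF_T, and the function locate_in_family() are hypothetical and not part of the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Largest member size quoted in the text for a four-byte off_t: 2^31 - 1. */
#define MAX_MEMBER_SIZE_32BIT_OFF_T ((uint64_t)INT32_MAX)

typedef struct {
    uint64_t member;  /* zero-based family member number     */
    uint64_t offset;  /* byte offset within that member file */
} family_location_t;

/* Map an address in the flat address space to a member and an offset.
 * ASSUMPTION: members are concatenated logically in numeric order. */
family_location_t locate_in_family(uint64_t address, uint64_t memb_size)
{
    assert(memb_size > 0);

    family_location_t loc;
    loc.member = address / memb_size;
    loc.offset = address % memb_size;
    return loc;
}
```

With a member size of 1000 bytes, address 2500 falls in member 2 at offset 500.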
herr_t H5Pget_family (hid_t access_properties, hsize_t *memb_size, hid_t *member_properties)
If member_properties is non-null then it will contain a copy of the member access property list, which should be closed by calling H5Pclose() when the application is finished with it. If memb_size is non-null then it will contain the logical size in bytes of each family member. In the future, additional arguments may be added to this function to match those added to H5Pset_family().
On occasion, it might be useful to separate meta data from raw data. The split driver does this by creating two files: one for meta data and another for raw data. The application provides a base file name to H5Fcreate() or H5Fopen() and this driver appends a file extension which defaults to ".meta" for the meta data file and ".raw" for the raw data file. Each file can have its own file access property list which allows, for instance, a split file with meta data stored with the core driver and raw data stored with the sec2 driver.
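The resulting file names can be sketched as a plain concatenation of base name and extension. The helper split_member_name() is hypothetical and only illustrates the naming rule described above; it is not how the driver is implemented.

```c
#include <assert.h>
#include <stdio.h>

/* Build the name of one half of a split file by appending `extension`
 * (for example ".meta" or ".raw") to the base name.
 * Returns 0 on success, or -1 if the name does not fit in `out`. */
int split_member_name(char *out, size_t out_size,
                      const char *base, const char *extension)
{
    assert(out != NULL && base != NULL && extension != NULL);

    int n = snprintf(out, out_size, "%s%s", base, extension);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For a base name of "run" and the default extensions, this gives "run.meta" and "run.raw".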
H5F_driver_t H5Pget_driver (hid_t access_properties)
Returns H5F_LOW_SPLIT if the split driver is defined as the low-level driver for the specified access property list.
herr_t H5Pset_split (hid_t access_properties, const char *meta_extension, hid_t meta_properties, const char *raw_extension, hid_t raw_properties)
herr_t H5Pget_split (hid_t access_properties, size_t meta_ext_size, char *meta_extension, hid_t *meta_properties, size_t raw_ext_size, char *raw_extension, hid_t *raw_properties)
The meta and raw property lists returned through meta_properties and raw_properties should be closed by calling H5Pclose() when the application is finished with them, but if the meta and/or raw file has no property list then a negative value is returned for that property list handle. Also, if meta_extension and/or raw_extension are non-null pointers, at most meta_ext_size or raw_ext_size characters of the meta or raw file name extension will be copied to the specified buffer. If the actual name is longer than what was requested then the result will not be null terminated (similar to strncpy()). In the future, additional arguments may be added to this function to match those added to H5Pset_split().
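Because the copied extension may lack a terminating null, callers need to terminate the buffer themselves. The sketch below reproduces the strncpy()-like behavior described above with plain C strings; copy_extension() and copy_extension_terminated() are hypothetical helpers, not HDF5 functions.

```c
#include <assert.h>
#include <string.h>

/* Copy at most `buf_size` characters of `extension` into `buf`, as
 * strncpy() does. Returns 1 if the copy is null terminated, or 0 if
 * the extension was too long and the buffer holds no terminator. */
int copy_extension(char *buf, size_t buf_size, const char *extension)
{
    assert(buf != NULL && extension != NULL);

    strncpy(buf, extension, buf_size);
    return strlen(extension) < buf_size;
}

/* Caller-side pattern: request one character fewer than the buffer
 * holds and write the terminator explicitly. */
void copy_extension_terminated(char *buf, size_t buf_size,
                               const char *extension)
{
    assert(buf != NULL && buf_size > 0);

    copy_extension(buf, buf_size - 1, extension);
    buf[buf_size - 1] = '\0';
}
```

Applied to H5Pget_split(), the same pattern would mean passing a size one smaller than the buffer and setting the last byte to zero afterwards.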