HDF5
1.15.0.b1054b4
API Reference
A dataset is a multidimensional array of data elements, together with supporting metadata. To create a dataset, the application program must specify the location at which to create the dataset, the dataset name, the datatype and dataspace of the data array, and the property lists.
A datatype is a collection of properties, all of which can be stored on disk, and which, when taken as a whole, provide complete information for data conversion to or from that datatype.
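There are two categories of datatypes in HDF5: pre-defined datatypes, which are opened and closed by the HDF5 library, and derived datatypes, which are created or derived from the pre-defined ones.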
Some of the HDF5 predefined atomic datatypes and native datatypes are listed below.
Datatype | Description |
---|---|
H5T_STD_I32LE | Four-byte, little-endian, signed, two's complement integer |
H5T_STD_U16BE | Two-byte, big-endian, unsigned integer |
H5T_IEEE_F32BE | Four-byte, big-endian, IEEE floating point |
H5T_IEEE_F64LE | Eight-byte, little-endian, IEEE floating point |
H5T_C_S1 | One-byte, null-terminated string of eight-bit characters |

Native Datatype | Corresponding C or FORTRAN Type |
---|---|
C | |
H5T_NATIVE_INT | int |
H5T_NATIVE_FLOAT | float |
H5T_NATIVE_CHAR | char |
H5T_NATIVE_DOUBLE | double |
H5T_NATIVE_LDOUBLE | long double |
Fortran | |
H5T_NATIVE_INTEGER | integer |
H5T_NATIVE_REAL | real |
H5T_NATIVE_DOUBLE | double precision |
H5T_NATIVE_CHARACTER | character |
In this tutorial, we consider only HDF5 predefined integers.
For further information on datatypes, see HDF5 Datatypes in the HDF5 User Guide, in addition to the Datatype Basics tutorial topic.
A dataspace describes the dimensionality of the data array. A dataspace is either a regular N-dimensional array of data points, called a simple dataspace, or a more general collection of data points organized in another manner, called a complex dataspace. In this tutorial, we only consider simple dataspaces.
[Figure: HDF5 dataspaces]
The dimensions of a dataset can be fixed (unchanging), or they may be unlimited, which means that they are extensible. A dataspace can also describe a portion of a dataset, making it possible to do partial I/O operations on selections.
Property lists are a mechanism for modifying the default behavior when creating or accessing objects. For more information on property lists, see the Property Lists Basics tutorial topic.
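The following property lists can be specified when creating a dataset: the link creation property list, the dataset creation property list, and the dataset access property list.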
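To create an empty dataset (no data written), take the following steps: obtain the identifier of the location where the dataset is to be created; specify the datatype, the dataspace, and the property lists (or use the defaults); create the dataset; close the datatype, the dataspace, and the property lists if necessary; and close the dataset.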
In HDF5, datatypes and dataspaces are independent objects which are created separately from any dataset that they might be attached to. Because of this, the creation of a dataset requires the definition of the datatype and dataspace. In this tutorial, we use the HDF5 predefined datatypes (integer) and consider only simple dataspaces. Hence, only the creation of dataspace objects is needed.
The High Level HDF5 Lite APIs (H5LT,H5LD) include functions that simplify and condense the steps for creating datasets in HDF5. The examples in the following section use the standard APIs. For a quick start you may prefer to look at the HDF5 Lite APIs (H5LT,H5LD) at this time.
If you plan to work with images, please look at the High Level HDF5 Images API (H5IM), as well.
See Examples from Learning the Basics for the examples used in the Learning the Basics tutorial.
The example shows how to create an empty dataset. It creates a file called dset.h5 in the C version (dsetf.h5 in Fortran), defines the dataset dataspace, creates a dataset which is a 4x6 integer array, and then closes the dataspace, the dataset, and the file.
For details on compiling an HDF5 application, see Compiling HDF5 Applications.
H5Screate_simple creates a new simple dataspace and returns a dataspace identifier. H5Sclose releases and terminates access to a dataspace.
H5Dcreate creates an empty dataset at the specified location and returns a dataset identifier. H5Dclose closes the dataset and releases the resources used by the dataset. This call is mandatory.
Note that when the pre-defined datatypes are used in FORTRAN, calls must be made to h5open_f and h5close_f to initialize and terminate access to the pre-defined datatypes.
h5open_f must be called before any other HDF5 library subroutine call is made; h5close_f must be called after the final HDF5 library subroutine call.
See the programming example for an illustration of the use of these calls.
The contents of the file dset.h5 (dsetf.h5 for FORTRAN) are shown below:
dset.h5 in DDL:

    HDF5 "dset.h5" {
    GROUP "/" {
       DATASET "dset" {
          DATATYPE { H5T_STD_I32BE }
          DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
          DATA {
             0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0,
             0, 0, 0, 0, 0, 0
          }
       }
    }
    }

dsetf.h5 in DDL:

    HDF5 "dsetf.h5" {
    GROUP "/" {
       DATASET "dset" {
          DATATYPE { H5T_STD_I32BE }
          DATASPACE { SIMPLE ( 6, 4 ) / ( 6, 4 ) }
          DATA {
             0, 0, 0, 0,
             0, 0, 0, 0,
             0, 0, 0, 0,
             0, 0, 0, 0,
             0, 0, 0, 0,
             0, 0, 0, 0
          }
       }
    }
    }

Note that H5T_STD_I32BE, a 32-bit big-endian integer, is an HDF5 atomic datatype.
The following is the simplified DDL dataset definition: