Introduction to HDF5
This is an introduction to the HDF5 data model and programming model. Being a Getting Started or QuickStart document, this Introduction to HDF5 is intended to provide enough information for you to develop a basic understanding of how HDF5 works and is meant to be used. Knowledge of the current version of HDF will make it easier to follow the text, but it is not required. More complete information of the sort you will need to actually use HDF5 is available in the HDF5 documentation, including the HDF5 User Guide and the HDF5 Reference Manual.
Code examples are available in the source code tree when you install HDF5. The directories hdf5/examples, hdf5/doc/html/examples/, and hdf5/doc/html/Tutor/examples/ contain the examples used in this document.
HDF5 is a completely new Hierarchical Data Format product consisting of a data format specification and a supporting library implementation. HDF5 is designed to address some of the limitations of the older HDF product and to address current and anticipated requirements of modern systems and applications. 1
We urge you to look at HDF5, the format and the library, and give us
feedback on what you like or do not like about it, and what features
you would like to see added to it.
Why HDF5?
HDF5 includes the following improvements.
This release includes the basic functionality that was planned for the HDF5 library. However, the library does not implement all of the features detailed in the format and API specifications. Here is a listing of some of the limitations of the current release:
A detailed list of changes in HDF5 since the last release, HDF5 Release 1.0, can be found in the file hdf5/RELEASE in the source code installation. At a higher level, those changes include:
- h5ls and h5dump tools and a new HDF5 to HDF4 conversion tool, h5toh4
The changes in bugfix releases, numbered Release 1.4.x, will be listed in the RELEASE, HISTORY, and README files distributed with the source code.
The changes as HDF5 has evolved from the first Alpha release to the present are summarized in the file hdf5/HISTORY in the source code installation.
HDF5 files are organized in a hierarchical structure, with two primary structures: groups and datasets.
Working with groups and group members is similar in many ways to working with directories and files in UNIX. As with UNIX directories and files, objects in an HDF5 file are often described by giving their full (or absolute) path names.
- / signifies the root group.
- /foo signifies a member of the root group called foo.
- /foo/zoo signifies a member of the group foo, which in turn is a member of the root group.
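The path convention above can be illustrated without the library. The following is a minimal sketch in plain C; path_depth is a hypothetical helper, not an HDF5 routine, and it only counts how many group or object names an absolute path contains.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the components of an absolute HDF5-style path name.
 * "/" has 0 components, "/foo" has 1, "/foo/zoo" has 2.
 * path_depth is a hypothetical helper, not part of the HDF5 API.
 */
static int path_depth(const char *path)
{
    int    depth = 0;
    size_t i;

    assert(path != NULL && path[0] == '/');   /* absolute names only */

    for (i = 0; path[i] != '\0'; i++) {
        /* A component starts at any non-slash character that follows a slash. */
        if (path[i] == '/' && path[i + 1] != '\0' && path[i + 1] != '/')
            depth++;
    }
    return depth;
}
```

With this helper, the root group has depth 0, /foo has depth 1, and /foo/zoo has depth 2, which mirrors how deep each object sits below the root group.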
Any HDF5 group or dataset may have an associated attribute list. An HDF5 attribute is a user-defined HDF5 structure that provides extra information about an HDF5 object. Attributes are described in more detail below.
An HDF5 group is a structure containing zero or more HDF5 objects. A group has two parts:
A dataset is stored in a file in two parts: a header and a data array.
The header contains information that is needed to interpret the array portion of the dataset, as well as metadata (or pointers to metadata) that describes or annotates the dataset. Header information includes the name of the object, its dimensionality, its number-type, information about how the data itself is stored on disk, and other information used by the library to speed up access to the dataset or maintain the file's integrity.
There are four essential classes of information in any header: name, datatype, dataspace, and storage layout:
Name. A dataset name is a sequence of alphanumeric ASCII characters.
Datatype. HDF5 allows one to define many different kinds of datatypes. There are two categories of datatypes: atomic datatypes and compound datatypes. Atomic datatypes can also be system-specific, or NATIVE, and all datatypes can be named:
NATIVE datatypes are system-specific instances of atomic datatypes.
Atomic datatypes include integers and floating-point numbers. Each atomic type belongs to a particular class and has several properties: size, order, precision, and offset. In this introduction, we consider only a few of these properties.
Atomic classes include integer, float, date and time, string, bit field, and opaque. (Note: Only integer, float and string classes are available in the current implementation.)
Properties of integer types include size, order (endian-ness), and signed-ness (signed/unsigned).
Properties of float types include the size and location of the exponent and mantissa, and the location of the sign bit.
The datatypes that are supported in the current implementation are:
NATIVE datatypes. Although it is possible to describe nearly any kind of atomic datatype, most applications will use predefined datatypes that are supported by their compiler. In HDF5 these are called native datatypes. NATIVE datatypes are C-like datatypes that are generally supported by the hardware of the machine on which the library was compiled. In order to be portable, applications should almost always use the NATIVE designation to describe data values in memory.
The NATIVE architecture has base names which do not follow the same rules as the others. Instead, native type names are similar to the C type names. The following figure shows several examples.
Example | Corresponding C Type
H5T_NATIVE_CHAR | signed char
H5T_NATIVE_UCHAR | unsigned char
H5T_NATIVE_SHORT | short
H5T_NATIVE_USHORT | unsigned short
H5T_NATIVE_INT | int
H5T_NATIVE_UINT | unsigned
H5T_NATIVE_LONG | long
H5T_NATIVE_ULONG | unsigned long
H5T_NATIVE_LLONG | long long
H5T_NATIVE_ULLONG | unsigned long long
H5T_NATIVE_FLOAT | float
H5T_NATIVE_DOUBLE | double
H5T_NATIVE_LDOUBLE | long double
H5T_NATIVE_HSIZE | hsize_t
H5T_NATIVE_HSSIZE | hssize_t
H5T_NATIVE_HERR | herr_t
H5T_NATIVE_HBOOL | hbool_t
See Datatypes in the HDF Users Guide for further information.
A compound datatype is one in which a collection of several datatypes is represented as a single unit, similar to a struct in C. The parts of a compound datatype are called members. The members of a compound datatype may be of any datatype, including another compound datatype. It is possible to read members from a compound type without reading the whole type.
Named datatypes. Normally each dataset has its own datatype, but sometimes we may want to share a datatype among several datasets. This can be done using a named datatype. A named datatype is stored in the file independently of any dataset, and referenced by all datasets that have that datatype. Named datatypes may have an associated attribute list.
See Datatypes in the HDF Users Guide for further information.
Dataspace. A dataset dataspace describes the dimensionality of the dataset. The dimensions of a dataset can be fixed (unchanging), or they may be unlimited, which means that they are extendible (i.e. they can grow larger).
Properties of a dataspace consist of the rank (number of dimensions) of the data array, the actual sizes of the dimensions of the array, and the maximum sizes of the dimensions of the array. For a fixed-dimension dataset, the actual size is the same as the maximum size of a dimension. When a dimension is unlimited, the maximum size is set to the value H5S_UNLIMITED. (An example below shows how to create extendible datasets.)
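The relationship between actual and maximum dimension sizes can be sketched in plain C. This is only an illustration of the rule stated above, not library code: sketch_space_t, sketch_can_resize, and the SKETCH_UNLIMITED sentinel are hypothetical names, and the sentinel value is a stand-in rather than the library's own definition of H5S_UNLIMITED.

```c
#include <assert.h>

#define SKETCH_MAX_RANK  4
#define SKETCH_UNLIMITED (~0UL)   /* stand-in sentinel for an unlimited dimension */

/* Hypothetical description of a simple dataspace: rank, actual and maximum sizes. */
typedef struct {
    int           rank;
    unsigned long dims[SKETCH_MAX_RANK];      /* actual size of each dimension  */
    unsigned long maxdims[SKETCH_MAX_RANK];   /* maximum size of each dimension */
} sketch_space_t;

/*
 * Return 1 if every requested dimension size fits under the corresponding
 * maximum size (an unlimited dimension accepts any size), otherwise 0.
 */
static int sketch_can_resize(const sketch_space_t *sp, const unsigned long *new_dims)
{
    int d;

    assert(sp->rank >= 1 && sp->rank <= SKETCH_MAX_RANK);

    for (d = 0; d < sp->rank; d++) {
        if (sp->maxdims[d] != SKETCH_UNLIMITED && new_dims[d] > sp->maxdims[d])
            return 0;
    }
    return 1;
}
```

A fixed 3x3 dataspace (maximum sizes equal to actual sizes) rejects a request for 10x3, while a dataspace whose first dimension is unlimited accepts it.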
A dataspace can also describe portions of a dataset, making it possible to do partial I/O operations on selections. Selection is supported by the dataspace interface (H5S). Given an n-dimensional dataset, there are currently four ways to do partial selection:
Since I/O operations have two end-points, the raw data transfer functions require two dataspace arguments: one describes the application memory dataspace or subset thereof, and the other describes the file dataspace or subset thereof.
See Dataspaces in the HDF Users Guide for further information.
Storage layout. The HDF5 format makes it possible to store data in a variety of ways. The default storage layout format is contiguous, meaning that data is stored in the same linear way that it is organized in memory. Two other storage layout formats are currently defined for HDF5: compact, and chunked. In the future, other storage layouts may be added.
Compact storage is used when the amount of data is small and can be stored directly in the object header. (Note: Compact storage is not supported in this release.)
Chunked storage involves dividing the dataset into equal-sized "chunks" that are stored separately. Chunking has three important benefits.
See Datasets and Dataset Chunking Issues in the HDF Users Guide for further information. We particularly encourage you to read Dataset Chunking Issues since the issue is complex and beyond the scope of this document.
The Attribute API (H5A) is used to read or write attribute information. Attributes can be identified by name or by an index value when they are accessed. The use of an index value makes it possible to iterate through all of the attributes associated with a given object.
The HDF5 format and I/O library are designed with the assumption that attributes are small datasets. They are always stored in the object header of the object they are attached to. Because of this, large datasets should not be stored as attributes. How large is "large" is not defined by the library and is up to the user's interpretation. (Large datasets with metadata can be stored as supplemental datasets in a group with the primary dataset.)
See Attributes in the HDF Users Guide for further information.
For those who are interested, this section takes a look at the low-level elements of the file as the file is written to disk (or other storage media) and the relation of those low-level elements to the higher level elements with which users typically are more familiar. The HDF5 API generally exposes only the high-level elements to the user; the low-level elements are often hidden. The rest of this Introduction does not assume an understanding of this material.
The format of an HDF5 file on disk encompasses several key ideas of the HDF4 and AIO file formats as well as addressing some shortcomings therein. The new format is more self-describing than the HDF4 format and is more uniformly applied to data objects in the file.
Figure 1: Relationships among the HDF5 root group, other groups, and objects
An HDF5 file appears to the user as a directed graph. The nodes of this graph are the higher-level HDF5 objects that are exposed by the HDF5 APIs:
At the lowest level, as information is actually written to the disk, an HDF5 file is made up of the following objects:
Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces
See the HDF5 File Format Specification for further information.
The current HDF5 API is implemented only in C. The API provides routines for creating HDF5 files, creating and writing groups, datasets, and their attributes to HDF5 files, and reading groups, datasets and their attributes from HDF5 files.
All C routines in the HDF5 library begin with a prefix of the form H5*, where * is a single letter indicating the object on which the operation is to be performed:
- H5Fopen, which opens an HDF5 file.
- H5Gset, which sets the working group to the specified group.
- H5Tcopy, which creates a copy of an existing datatype.
- H5Screate_simple, which creates simple dataspaces.
- H5Dread, which reads all or part of a dataset into a buffer in memory.
- H5Pset_chunk, which sets the number of dimensions and the size of a chunk.
- H5Aget_name, which retrieves the name of an attribute.
- H5Zregister, which registers new compression and uncompression functions for use with the HDF5 library.
- H5Eprint, which prints the current error stack.
- H5Rcreate, which creates a reference.
- H5Iget_type, which retrieves the type of an object.
There are a number of definitions and declarations that should be included with any HDF5 program. These definitions and declarations are contained in several include files. The main include file is hdf5.h. This file includes all of the other files that your program is likely to need. Be sure to include hdf5.h in any program that uses the HDF5 library.
In this section we describe how to program some basic operations on files, including how to
This programming model shows how to create a file and also how to close the file.
The following code fragment implements the specified model. If there is a possibility that the file already exists, the user must add the flag H5ACC_TRUNC
to the access mode to overwrite the previous file's information.
hid_t file; /* file identifier */
herr_t status; /* return status of H5Fclose */
/*
* Create a new file using H5ACC_TRUNC access,
* default file creation properties, and default file
* access properties.
* Then close the file.
*/
file = H5Fcreate(FILE, H5ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
status = H5Fclose(file);
Recall that datatypes and dimensionality (dataspace) are independent objects, which are created separately from any dataset that they might be attached to. Because of this the creation of a dataset requires, at a minimum, separate definitions of datatype, dimensionality, and dataset. Hence, to create a dataset the following steps need to be taken:
The following code illustrates the creation of these three components of a dataset object.
hid_t dataset, datatype, dataspace; /* declare identifiers */
hsize_t dimsf[2]; /* dataset dimensions */
/*
* Create dataspace: Describe the size of the array and
* create the data space for fixed size dataset.
*/
dimsf[0] = NX;
dimsf[1] = NY;
dataspace = H5Screate_simple(RANK, dimsf, NULL);
/*
* Define datatype for the data in the file.
* We will store little endian integer numbers.
*/
datatype = H5Tcopy(H5T_NATIVE_INT);
status = H5Tset_order(datatype, H5T_ORDER_LE);
/*
* Create a new dataset within the file using defined
* dataspace and datatype and default dataset creation
* properties.
* NOTE: H5T_NATIVE_INT can be used as datatype if conversion
* to little endian is not needed.
*/
dataset = H5Dcreate(file, DATASETNAME, datatype, dataspace, H5P_DEFAULT);
The datatype, dataspace and dataset objects should be released once they are no longer needed by a program. Since each is an independent object, they must be released (or closed) separately. The following lines of code close the datatype, dataspace, and dataset that were created in the preceding section.
H5Tclose(datatype);
H5Dclose(dataset);
H5Sclose(dataspace);
Having defined the datatype, dataset, and dataspace parameters, you write out the data with a call to H5Dwrite.
/*
* Write the data to the dataset using default transfer
* properties.
*/
status = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
H5P_DEFAULT, data);
The third and fourth parameters of H5Dwrite in the example describe the dataspaces in memory and in the file, respectively. They are set to the value H5S_ALL to indicate that an entire dataset is to be written. In a later section we look at how we would access a portion of a dataset.
Example 1 contains a program that creates a file and a dataset, and writes the dataset to the file.
Reading is analogous to writing. If, in the previous example, we wish to read an entire dataset, we would use the same basic calls with the same parameters. Of course, the routine H5Dread would replace H5Dwrite.
Although reading is analogous to writing, it is often necessary to query a file to obtain information about a dataset. For instance, we often need to know about the datatype associated with a dataset, as well as dataspace information (e.g. rank and dimensions). There are several "get" routines for obtaining this information. The following code segment illustrates how we would get this kind of information:
/*
* Get datatype and dataspace identifiers and then query
* dataset class, order, size, rank and dimensions.
*/
datatype = H5Dget_type(dataset); /* datatype identifier */
class = H5Tget_class(datatype);
if (class == H5T_INTEGER) printf("Data set has INTEGER type \n");
order = H5Tget_order(datatype);
if (order == H5T_ORDER_LE) printf("Little endian order \n");
size = H5Tget_size(datatype);
printf(" Data size is %d \n", (int)size);
dataspace = H5Dget_space(dataset); /* dataspace identifier */
rank = H5Sget_simple_extent_ndims(dataspace);
status_n = H5Sget_simple_extent_dims(dataspace, dims_out, NULL);
printf("rank %d, dimensions %lu x %lu \n", rank, (unsigned long)dims_out[0], (unsigned long)dims_out[1]);
In the previous discussion, we described how to access an entire dataset with one write (or read) operation. HDF5 also supports access to portions (or selections) of a dataset in one read/write operation. Currently selections are limited to hyperslabs, their unions, and lists of independent points. Both types of selection will be discussed in the following sections. Several sample cases of selection reading/writing are shown in the following figure.
[Figure: four sample selections, panels (a) through (d)]
In example (a) a single hyperslab is read from the midst of a two-dimensional array in a file and stored in the corner of a smaller two-dimensional array in memory. In (b) a regular series of blocks is read from a two-dimensional array in the file and stored as a contiguous sequence of values at a certain offset in a one-dimensional array in memory. In (c) a sequence of points with no regular pattern is read from a two-dimensional array in a file and stored as a sequence of points with no regular pattern in a three-dimensional array in memory. In (d) a union of hyperslabs in the file dataspace is read and the data is stored in another union of hyperslabs in the memory dataspace.
As these examples illustrate, whenever we perform partial read/write operations on the data, the following information must be provided: file dataspace, file dataspace selection, memory dataspace and memory dataspace selection. After the required information is specified, actual read/write operation on the portion of data is done in a single call to the HDF5 read/write functions H5Dread(write).
Hyperslabs are portions of datasets. A hyperslab selection can be a logically contiguous collection of points in a dataspace, or it can be a regular pattern of points or blocks in a dataspace. The following picture illustrates a selection of regularly spaced 3x2 blocks in an 8x12 dataspace.
[Figure: regularly spaced 3x2 blocks selected in an 8x12 dataspace]
Four parameters are required to describe a completely general hyperslab. Each parameter is an array whose rank is the same as that of the dataspace:
- start: a starting location for the hyperslab. In the example start is (0,1).
- stride: the number of elements to separate each element or block to be selected. In the example stride is (4,3). If the stride parameter is set to NULL, the stride size defaults to 1 in each dimension.
- count: the number of elements or blocks to select along each dimension. In the example, count is (2,4).
- block: the size of the block selected from the dataspace. In the example, block is (3,2). If the block parameter is set to NULL, the block size defaults to a single element in each dimension, as if the block array was set to all 1s.
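To make the four parameters concrete, the following plain C sketch computes how many elements a hyperslab selects and whether a given coordinate falls inside it. It does not call the library; slab_npoints and slab_contains are hypothetical helpers, and the membership test assumes that the stride in each dimension is at least as large as the block.

```c
#include <assert.h>

/*
 * Number of elements selected by a hyperslab: in each dimension there are
 * count[d] blocks of block[d] elements, so the total is the product of
 * count[d] * block[d] over all dimensions.
 */
static unsigned long slab_npoints(int rank, const unsigned long *count,
                                  const unsigned long *block)
{
    unsigned long n = 1;
    int d;

    assert(rank > 0);
    for (d = 0; d < rank; d++)
        n *= count[d] * block[d];
    return n;
}

/*
 * Return 1 if coord lies inside the hyperslab, otherwise 0.
 * Assumes stride[d] >= block[d] >= 1 in every dimension.
 */
static int slab_contains(int rank,
                         const unsigned long *start, const unsigned long *stride,
                         const unsigned long *count, const unsigned long *block,
                         const unsigned long *coord)
{
    int d;

    for (d = 0; d < rank; d++) {
        unsigned long rel;

        if (coord[d] < start[d])
            return 0;                      /* before the first block       */
        rel = coord[d] - start[d];
        if (rel / stride[d] >= count[d])
            return 0;                      /* past the last block          */
        if (rel % stride[d] >= block[d])
            return 0;                      /* in the gap between blocks    */
    }
    return 1;
}
```

For the example parameters start=(0,1), stride=(4,3), count=(2,4), block=(3,2), slab_npoints gives (2*3)*(4*2) = 48 elements; element (0,1) is selected, while (0,0) and (3,1) are not.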
In what order is data copied? When actual I/O is performed data values are copied by default from one dataspace to another in so-called row-major, or C order. That is, it is assumed that the first dimension varies slowest, the second next slowest, and so forth.
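Row-major order can be stated as a formula for the position of an element in a flat buffer. The sketch below is plain C and independent of the library; row_major_offset is a hypothetical helper that shows how the last dimension varies fastest.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Linear offset of the element at coord in an array with the given
 * dimensions, stored in row-major (C) order: the first dimension varies
 * slowest and the last dimension varies fastest.
 */
static size_t row_major_offset(int rank, const size_t *dims, const size_t *coord)
{
    size_t off = 0;
    int d;

    assert(rank > 0);
    for (d = 0; d < rank; d++) {
        assert(coord[d] < dims[d]);        /* coordinate must be in range */
        off = off * dims[d] + coord[d];    /* Horner-style accumulation   */
    }
    return off;
}
```

In an 8x12 array, moving one step along the second dimension advances the offset by 1, while moving one step along the first dimension advances it by 12.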
Example without strides or blocks. Suppose we want to read a 3x4 hyperslab from a dataset in a file beginning at the element <1,2> in the dataset. In order to do this, we must create a dataspace that describes the overall rank and dimensions of the dataset in the file, as well as the position and size of the hyperslab that we are extracting from that dataset. The following code illustrates the selection of the hyperslab in the file dataspace.
/*
* Define file dataspace.
*/
dataspace = H5Dget_space(dataset); /* dataspace identifier */
rank = H5Sget_simple_extent_ndims(dataspace);
status_n = H5Sget_simple_extent_dims(dataspace, dims_out, NULL);
/*
* Define hyperslab in the dataset.
*/
offset[0] = 1;
offset[1] = 2;
count[0] = 3;
count[1] = 4;
status = H5Sselect_hyperslab(dataspace, H5S_SELECT_SET, offset, NULL,
count, NULL);
This describes the dataspace from which we wish to read. We need to define the dataspace in memory analogously. Suppose, for instance, that we have in memory a 3 dimensional 7x7x3 array into which we wish to read the 3x4 hyperslab described above beginning at the element <3,0,0>. Since the in-memory dataspace has three dimensions, we have to describe the hyperslab as an array with three dimensions, with the last dimension being 1: <3,4,1>. Notice that we must describe two things: the dimensions of the in-memory array, and the size and position of the hyperslab that we wish to read in. The following code illustrates how this would be done.
/*
* Define memory dataspace.
*/
dimsm[0] = 7;
dimsm[1] = 7;
dimsm[2] = 3;
memspace = H5Screate_simple(RANK_OUT,dimsm,NULL);
/*
* Define memory hyperslab.
*/
offset_out[0] = 3;
offset_out[1] = 0;
offset_out[2] = 0;
count_out[0] = 3;
count_out[1] = 4;
count_out[2] = 1;
status = H5Sselect_hyperslab(memspace, H5S_SELECT_SET, offset_out, NULL,
count_out, NULL);
Example 2 contains a complete program that performs these operations.
Example with strides and blocks. Consider the 8x12 dataspace described above, in which we selected eight 3x2 blocks. Suppose we wish to fill these eight blocks.
[Figure: the 8x12 dataspace with eight 3x2 blocks selected]
This hyperslab has the following parameters: start=(0,1), stride=(4,3), count=(2,4), block=(3,2).
Suppose that the source dataspace in memory is this 50-element one dimensional array called vector:
[Figure: the 50-element one-dimensional array vector]
The following code will write 48 elements from vector to our file dataset, starting with the second element in vector.
/*
 * Select hyperslab for the dataset in the file, using 3x2 blocks,
 * (4,3) stride and (2,4) count starting at the position (0,1).
 */
start[0] = 0; start[1] = 1;
stride[0] = 4; stride[1] = 3;
count[0] = 2; count[1] = 4;
block[0] = 3; block[1] = 2;
ret = H5Sselect_hyperslab(fid, H5S_SELECT_SET, start, stride, count, block);
/*
 * Create dataspace for the first dataset.
 */
mid1 = H5Screate_simple(MSPACE1_RANK, dim1, NULL);
/*
 * Select hyperslab.
 * We will use 48 elements of the vector buffer starting at the second element.
 * Selected elements are 1 2 3 . . . 48
 */
start[0] = 1;
stride[0] = 1;
count[0] = 48;
block[0] = 1;
ret = H5Sselect_hyperslab(mid1, H5S_SELECT_SET, start, stride, count, block);
/*
 * Write selection from the vector buffer to the dataset in the file.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid1, fid, H5P_DEFAULT, vector);
After these operations, the file dataspace will have the following values.
[Figure: contents of the 8x12 file dataspace after the write]
Notice that the values are inserted in the file dataset in row-major order.
Example 3 includes this code and other example code illustrating the use of hyperslab selection.
Independent points are selected with H5Sselect_elements. Suppose, for example, that we wish to write the values 53, 59, 61, 67 to the following elements of the 8x12 array used in the previous example: (0,0), (3,3), (3,5), and (5,6). The following code selects the points and writes them to the dataset:
#define FSPACE_RANK 2 /* Dataset rank as it is stored in the file */
#define NPOINTS 4 /* Number of points that will be selected and overwritten */
#define MSPACE2_RANK 1 /* Rank of the second dataset in memory */
#define MSPACE2_DIM 4 /* Dataset size in memory */
hsize_t dim2[] = {MSPACE2_DIM}; /* Dimension size of the second dataset (in memory) */
int values[] = {53, 59, 61, 67}; /* New values to be written */
hssize_t coord[NPOINTS][FSPACE_RANK]; /* Array to store selected points from the file dataspace */
/*
 * Create dataspace for the second dataset.
 */
mid2 = H5Screate_simple(MSPACE2_RANK, dim2, NULL);
/*
 * Select sequence of NPOINTS points in the file dataspace.
 */
coord[0][0] = 0; coord[0][1] = 0;
coord[1][0] = 3; coord[1][1] = 3;
coord[2][0] = 3; coord[2][1] = 5;
coord[3][0] = 5; coord[3][1] = 6;
ret = H5Sselect_elements(fid, H5S_SELECT_SET, NPOINTS, (const hssize_t **)coord);
/*
 * Write new selection of points to the dataset.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid2, fid, H5P_DEFAULT, values);
After these operations, the file dataspace will have the following values:
[Figure: contents of the 8x12 file dataspace after the selected points are overwritten]
Example 3 contains a complete program that performs these subsetting operations.
Selecting a union of hyperslabs
The HDF5 Library allows the user to select a union of hyperslabs and
write or read the selection into another selection. The shapes of
the two selections may differ, but the number of elements must be equal.
Suppose that we want to read two overlapping hyperslabs from the dataset written in the previous example into a union of hyperslabs in the memory dataset. This exercise is illustrated in the two figures immediately below. Note that the memory dataset has a different shape from the previously written dataset. Similarly, the selection in the memory dataset could have a different shape than the selected union of hyperslabs in the original file; for simplicity, we will preserve the selection's shape in this example.
[Figures: the two overlapping hyperslabs selected in the 8x12 file dataset, and the corresponding union of hyperslabs in the memory dataset]
The following lines of code show the required steps.
First obtain the dataspace identifier for the dataset in the file.
/*
 * Get dataspace of the open dataset.
 */
fid = H5Dget_space(dataset);
Then select the hyperslab with the size 3x4 and the left upper corner at the position (1,2):
/*
 * Select first hyperslab for the dataset in the file. The following
 * elements are selected:
 *     10  0 11 12
 *     18  0 19 20
 *      0 59  0 61
 *
 */
start[0] = 1; start[1] = 2;
block[0] = 1; block[1] = 1;
stride[0] = 1; stride[1] = 1;
count[0] = 3; count[1] = 4;
ret = H5Sselect_hyperslab(fid, H5S_SELECT_SET, start, stride, count, block);
Now select the second hyperslab with the size 6x5 at the position (2,4), and create the union with the first hyperslab.
/*
 * Add second selected hyperslab to the selection.
 * The following elements are selected:
 *     19 20  0 21 22
 *      0 61  0  0  0
 *     27 28  0 29 30
 *     35 36 67 37 38
 *     43 44  0 45 46
 *      0  0  0  0  0
 * Note that two hyperslabs overlap. Common elements are:
 *     19 20
 *      0 61
 */
start[0] = 2; start[1] = 4;
block[0] = 1; block[1] = 1;
stride[0] = 1; stride[1] = 1;
count[0] = 6; count[1] = 5;
ret = H5Sselect_hyperslab(fid, H5S_SELECT_OR, start, stride, count, block);
Note that when we add the selected hyperslab to the union, the second argument to the H5Sselect_hyperslab function has to be H5S_SELECT_OR instead of H5S_SELECT_SET. Using H5S_SELECT_SET would reset the selection to the second hyperslab.
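The difference between replacing a selection and adding to it can be mimicked with a small bitmap. The sketch below is plain C and does not use the library; SEL_SET and SEL_OR are hypothetical stand-ins for H5S_SELECT_SET and H5S_SELECT_OR, and select_rect handles only simple rectangles with unit stride and unit block.

```c
#include <assert.h>
#include <string.h>

#define GRID_ROWS 8
#define GRID_COLS 12

/* Hypothetical stand-ins for H5S_SELECT_SET and H5S_SELECT_OR. */
typedef enum { SEL_SET, SEL_OR } sel_op_t;

/* One flag per element of the 8x12 dataspace: 1 if selected, 0 if not. */
typedef struct {
    unsigned char on[GRID_ROWS][GRID_COLS];
} selection_t;

/* Select an nrows x ncols rectangle whose upper left corner is (row0, col0). */
static void select_rect(selection_t *sel, sel_op_t op,
                        int row0, int col0, int nrows, int ncols)
{
    int r, c;

    assert(row0 >= 0 && col0 >= 0);
    assert(row0 + nrows <= GRID_ROWS && col0 + ncols <= GRID_COLS);

    if (op == SEL_SET)
        memset(sel->on, 0, sizeof sel->on);   /* discard any earlier selection */

    for (r = row0; r < row0 + nrows; r++)
        for (c = col0; c < col0 + ncols; c++)
            sel->on[r][c] = 1;                /* overlapping elements count once */
}

/* Count the selected elements. */
static int selected_count(const selection_t *sel)
{
    int r, c, n = 0;

    for (r = 0; r < GRID_ROWS; r++)
        for (c = 0; c < GRID_COLS; c++)
            n += sel->on[r][c];
    return n;
}
```

Selecting the 3x4 rectangle at (1,2) with SEL_SET and then the 6x5 rectangle at (2,4) with SEL_OR leaves 12 + 30 - 4 = 38 elements selected, since the four common elements are counted once. Using SEL_SET for the second call would leave only the 30 elements of the second rectangle.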
Now define the memory dataspace and select the union of the hyperslabs in the memory dataset.
/*
 * Create memory dataspace.
 */
mid = H5Screate_simple(MSPACE_RANK, mdim, NULL);
/*
 * Select two hyperslabs in memory. Hyperslabs have the same
 * size and shape as the selected hyperslabs for the file dataspace.
 */
start[0] = 0; start[1] = 0;
block[0] = 1; block[1] = 1;
stride[0] = 1; stride[1] = 1;
count[0] = 3; count[1] = 4;
ret = H5Sselect_hyperslab(mid, H5S_SELECT_SET, start, stride, count, block);
start[0] = 1; start[1] = 2;
block[0] = 1; block[1] = 1;
stride[0] = 1; stride[1] = 1;
count[0] = 6; count[1] = 5;
ret = H5Sselect_hyperslab(mid, H5S_SELECT_OR, start, stride, count, block);
Finally we can read the selected data from the file dataspace to the selection in memory with one call to the H5Dread function.
ret = H5Dread(dataset, H5T_NATIVE_INT, mid, fid, H5P_DEFAULT, matrix_out);
Example 3 includes this code along with the previous selection example.
Properties of compound datatypes. A compound datatype is similar to a struct in C or a common block in Fortran. It is a collection of one or more atomic types or small arrays of such types. To create and use a compound datatype you need to refer to various properties of the compound datatype:
Properties of members of a compound datatype are defined when the member is added to the compound type and cannot be subsequently modified.
Defining compound datatypes. Compound datatypes must be built out of other datatypes. First, one creates an empty compound datatype and specifies its total size. Then members are added to the compound datatype in any order.
Member names. Each member must have a descriptive name, which is the key used to uniquely identify the member within the compound datatype. A member name in an HDF5 datatype does not necessarily have to be the same as the name of the corresponding member in the C struct in memory, although this is often the case. Nor does one need to define all members of the C struct in the HDF5 compound datatype (or vice versa).
Offsets. Usually a C struct will be defined to hold a data point in memory, and the offsets of the members in memory will be the offsets of the struct members from the beginning of an instance of the struct. The library defines the macro to compute the offset of a member within a struct:
HOFFSET(s,m)
Here is an example in which a compound datatype is created to describe complex numbers whose type is defined by the complex_t struct.
typedef struct {
double re; /*real part */
double im; /*imaginary part */
} complex_t;
complex_t tmp; /*used only to compute offsets */
hid_t complex_id = H5Tcreate (H5T_COMPOUND, sizeof tmp);
H5Tinsert (complex_id, "real", HOFFSET(tmp,re),
H5T_NATIVE_DOUBLE);
H5Tinsert (complex_id, "imaginary", HOFFSET(tmp,im),
H5T_NATIVE_DOUBLE);
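The information passed to the library for each member (a name, an offset within the struct, and a member type) can be inspected in plain C. The sketch below uses the standard offsetof macro from stddef.h as a stand-in for HOFFSET; member_info_t and describe_complex are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double re;   /* real part      */
    double im;   /* imaginary part */
} complex_t;

/* Hypothetical record of what is known about one member of the struct. */
typedef struct {
    const char *name;     /* member name as it would appear in the file */
    size_t      offset;   /* byte offset from the start of the struct   */
    size_t      size;     /* size of the member in bytes                */
} member_info_t;

/*
 * Fill in the two members of complex_t and return the total size of the
 * struct, which is the size a compound datatype describing it would need.
 */
static size_t describe_complex(member_info_t out[2])
{
    out[0].name   = "real";
    out[0].offset = offsetof(complex_t, re);
    out[0].size   = sizeof(double);

    out[1].name   = "imaginary";
    out[1].offset = offsetof(complex_t, im);
    out[1].size   = sizeof(double);

    /* Every member must lie entirely inside the struct. */
    assert(out[1].offset + out[1].size <= sizeof(complex_t));

    return sizeof(complex_t);
}
```

The first member always sits at offset 0; the offset of the second member depends on the compiler's layout, which is why a macro rather than a hand-written constant is used to compute it.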
Example 4 shows how to create a compound datatype, write an array that has the compound datatype to the file, and read back subsets of the members.
An extendible dataset is one whose dimensions can grow. In HDF5, it is possible to define a dataset to have certain initial dimensions, then later to increase the size of any of the initial dimensions.
For example, you can create and store the following 3x3 HDF5 dataset:
1 1 1
1 1 1
1 1 1
then later extend this into a 10x3 dataset by adding 7 rows, such as this:
1 1 1
1 1 1
1 1 1
2 2 2
2 2 2
2 2 2
2 2 2
2 2 2
2 2 2
2 2 2
then further extend it to a 10x5 dataset by adding two columns, such as this:
1 1 1 3 3
1 1 1 3 3
1 1 1 3 3
2 2 2 3 3
2 2 2 3 3
2 2 2 3 3
2 2 2 3 3
2 2 2 3 3
2 2 2 3 3
2 2 2 3 3
HDF5 requires you to use chunking in order to define extendible datasets. Chunking makes it possible to extend datasets efficiently, without having to reorganize storage excessively.
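One way to see why chunking suits extendible datasets is to count the chunks needed to cover a dataset of a given size. The following plain C sketch is an illustration only, not a description of the library's internal bookkeeping; chunks_needed is a hypothetical helper that rounds up in each dimension.

```c
#include <assert.h>

/*
 * Number of equal-sized chunks needed to cover a dataset with the given
 * dimensions. Each dimension needs ceil(dims[d] / chunk_dims[d]) chunks,
 * and the total is the product over all dimensions.
 */
static unsigned long chunks_needed(int rank, const unsigned long *dims,
                                   const unsigned long *chunk_dims)
{
    unsigned long total = 1;
    int d;

    assert(rank > 0);
    for (d = 0; d < rank; d++) {
        assert(chunk_dims[d] > 0);
        total *= (dims[d] + chunk_dims[d] - 1) / chunk_dims[d];   /* round up */
    }
    return total;
}
```

With a 2x5 chunk shape, a 3x3 dataset is covered by 2 chunks and a 10x3 dataset by 5 chunks. Extending to 10x5 still needs 5 chunks, because the added columns fall inside chunks that already cover the dataset.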
The following operations are required in order to write an extendible dataset:
For example, suppose we wish to create a dataset similar to the one shown above. We want to start with a 3x3 dataset, then later extend it in both directions.
Declaring unlimited dimensions. We could declare the dataspace to have unlimited dimensions with the following code, which uses the predefined constant H5S_UNLIMITED to specify unlimited dimensions.
hsize_t dims[2] = { 3, 3}; /* dataset dimensions at the creation time */
hsize_t maxdims[2] = {H5S_UNLIMITED, H5S_UNLIMITED};
/*
 * Create the data space with unlimited dimensions.
 */
dataspace = H5Screate_simple(RANK, dims, maxdims);
Enabling chunking. We can then set the dataset storage layout properties to enable chunking. We do this using the routine H5Pset_chunk:
hid_t cparms;
hsize_t chunk_dims[2] ={2, 5};
/*
 * Modify dataset creation properties to enable chunking.
 */
cparms = H5Pcreate (H5P_DATASET_CREATE);
status = H5Pset_chunk( cparms, RANK, chunk_dims);
Then create a dataset.
/*
 * Create a new dataset within the file using cparms
 * creation properties.
 */
dataset = H5Dcreate(file, DATASETNAME, H5T_NATIVE_INT, dataspace, cparms);
Extending dataset size. Finally, when we want to extend the size of the dataset, we invoke H5Dextend. In the following example, we extend the dataset along the first dimension by seven rows, so that the new dimensions are 10x3:
    hsize_t size[2];    /* new dataset dimensions */
    /*
     * Extend the dataset. Dataset becomes 10 x 3.
     */
    dims[0] = dims[0] + 7;
    size[0] = dims[0];
    size[1] = dims[1];
    status = H5Dextend(dataset, size);
Example 5 shows how to create a 3x3 extendible dataset, write the dataset, extend the dataset to 10x3, write the dataset again, extend it again to 10x5, and write the dataset once more.
Example 6 shows how to read the data written by Example 5.
Groups provide a mechanism for organizing meaningful and extendible sets of datasets within an HDF5 file. The H5G API contains routines for working with groups.
Creating a group. To create a group, use H5Gcreate. For example, the following code creates a group called Data in the root group.
    /*
     * Create a group in the file.
     */
    grp = H5Gcreate(file, "/Data", 0);
A group may be created in another group by providing the absolute name of the group to the H5Gcreate function or by specifying its location. For example, to create the group Data_new in the Data group, one can use either of the following calls:
    /*
     * Create group "Data_new" in the group "Data" by specifying
     * absolute name of the group.
     */
    grp_new = H5Gcreate(file, "/Data/Data_new", 0);
or
    /*
     * Create group "Data_new" in the "Data" group.
     */
    grp_new = H5Gcreate(grp, "Data_new", 0);
Note that the group identifier grp is used as the first parameter in the H5Gcreate function when the relative name is provided.
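The relationship between absolute and relative names can be sketched outside the library. The helper below is hypothetical: it is not part of the H5G API and does not describe how HDF5 resolves names internally. It only shows, under the assumption that a location is known by its absolute path, which absolute name a (location, name) pair ends up referring to.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the H5G API): build the absolute
 * name that a (location, name) pair refers to.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int resolve_name(const char *loc_path, const char *name,
                 char *out, size_t out_size)
{
    int n;

    assert(loc_path != NULL && name != NULL && out != NULL);

    if (name[0] == '/') {
        /* Absolute name: the location plays no part in the result. */
        n = snprintf(out, out_size, "%s", name);
    } else if (strcmp(loc_path, "/") == 0) {
        /* Relative to the root group: avoid a doubled slash. */
        n = snprintf(out, out_size, "/%s", name);
    } else {
        /* Relative to any other group: join with a single slash. */
        n = snprintf(out, out_size, "%s/%s", loc_path, name);
    }
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Called with the location "/Data" and the name "Data_new", or with the location "/" and the name "/Data/Data_new", the helper produces the same string, "/Data/Data_new", which mirrors the two equivalent H5Gcreate calls shown above.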
The third parameter in H5Gcreate optionally specifies how much file space to reserve to store the names that will appear in this group. If a non-positive value is supplied, then a default size is chosen.
H5Gclose closes the group and releases the group identifier.
Creating a dataset in a particular group. As with groups, a dataset can be created in a particular group by specifying its absolute name as illustrated in the following example:
    /*
     * Create the dataset "Compressed_Data" in the group using the
     * absolute name. The dataset creation property list is modified
     * to use GZIP compression with the compression effort set to 6.
     * Note that compression can be used only when the dataset is
     * chunked.
     */
    dims[0] = 1000;
    dims[1] = 20;
    cdims[0] = 20;
    cdims[1] = 20;
    dataspace = H5Screate_simple(RANK, dims, NULL);
    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(plist, 2, cdims);
    H5Pset_deflate(plist, 6);
    dataset = H5Dcreate(file, "/Data/Compressed_Data", H5T_NATIVE_INT,
                        dataspace, plist);
A relative dataset name may also be used when a dataset is created. First obtain the identifier of the group in which the dataset is to be created. Then create the dataset with H5Dcreate as illustrated in the following example:
    /*
     * Open the group.
     */
    grp = H5Gopen(file, "Data");
    /*
     * Create the dataset "Compressed_Data" in the "Data" group
     * by providing a group identifier and a relative dataset
     * name as parameters to the H5Dcreate function.
     */
    dataset = H5Dcreate(grp, "Compressed_Data", H5T_NATIVE_INT,
                        dataspace, plist);
Accessing an object in a group. Any object in a group can be accessed by its absolute or relative name. The following lines of code show how to use the absolute name to access the dataset Compressed_Data in the group Data created in the examples above:
    /*
     * Open the dataset "Compressed_Data" in the "Data" group.
     */
    dataset = H5Dopen(file, "/Data/Compressed_Data");
The same dataset can be accessed in another manner. First access the group to which the dataset belongs, then open the dataset.
    /*
     * Open the group "Data" in the file.
     */
    grp = H5Gopen(file, "Data");
    /*
     * Access the "Compressed_Data" dataset in the group.
     */
    dataset = H5Dopen(grp, "Compressed_Data");
Example 7 shows how to create a group in a file and a dataset in a group. It uses the iterator function H5Giterate to find the names of the objects in the root group, and H5Glink and H5Gunlink to create a new group name and delete the original name.
Think of an attribute as a small dataset that is attached to a normal dataset or group. The H5A API contains routines for working with attributes. Since attributes share many of the characteristics of datasets, the programming model for working with attributes is analogous in many ways to the model for working with datasets. The primary differences are that an attribute must be attached to a dataset or a group, and subsetting operations cannot be performed on attributes.
To create an attribute belonging to a particular dataset or group, first create a dataspace for the attribute with the call to H5Screate, then create the attribute using H5Acreate. For example, the following code creates an attribute called "Integer attribute" that is a member of a dataset whose identifier is dataset. The attribute identifier is attr2. H5Awrite then sets the value of the attribute to that of the integer variable point. H5Aclose then releases the attribute identifier.
    int point = 1;    /* Value of the scalar attribute */
    /*
     * Create scalar attribute.
     */
    aid2 = H5Screate(H5S_SCALAR);
    attr2 = H5Acreate(dataset, "Integer attribute", H5T_NATIVE_INT, aid2,
                      H5P_DEFAULT);
    /*
     * Write scalar attribute.
     */
    ret = H5Awrite(attr2, H5T_NATIVE_INT, &point);
    /*
     * Close attribute dataspace.
     */
    ret = H5Sclose(aid2);
    /*
     * Close attribute.
     */
    ret = H5Aclose(attr2);
To read a scalar attribute whose name and datatype are known, first open the attribute using H5Aopen_name, then use H5Aread to get its value. For example, the following reads a scalar attribute called "Integer attribute" whose datatype is a native integer, and whose parent dataset has the identifier dataset.
    /*
     * Attach to the scalar attribute using attribute name, then read and
     * display its value.
     */
    attr = H5Aopen_name(dataset, "Integer attribute");
    ret = H5Aread(attr, H5T_NATIVE_INT, &point_out);
    printf("The value of the attribute \"Integer attribute\" is %d \n", point_out);
    ret = H5Aclose(attr);
Reading an attribute whose characteristics are not known. It may be necessary to query a file to obtain information about an attribute, namely its name, datatype, rank and dimensions. The following code opens an attribute by its index value using H5Aopen_idx, then reads its value as a 4-character string.
    /*
     * Attach to the string attribute using its index, then read and
     * display the value.
     */
    attr = H5Aopen_idx(dataset, 2);
    atype = H5Tcopy(H5T_C_S1);
    H5Tset_size(atype, 4);
    ret = H5Aread(attr, atype, string_out);
    printf("The value of the attribute with the index 2 is %s \n", string_out);
In practice, if the characteristics of attributes are not known, the code involved in accessing and processing the attribute can be quite complex. For this reason, HDF5 includes a function called H5Aiterate, which applies a user-supplied function to each of a set of attributes. The user-supplied function can contain the code that interprets, accesses and processes each attribute.
Example 8 illustrates the use of the H5Aiterate function, as well as the other attribute examples described above.
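The idea of applying a user-supplied function to each member of a set can be sketched in plain C with a function pointer. The code below is not the H5Aiterate API: the type attr_op_t and the functions iterate_names and count_op are hypothetical, and the rule that a nonzero return value stops the iteration is an assumption made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical operator type: called once per attribute name.
 * In this sketch, a nonzero return value stops the iteration.
 */
typedef int (*attr_op_t)(const char *attr_name, void *op_data);

/* Apply op to each name in turn; return the first nonzero status, else 0. */
int iterate_names(const char *const names[], size_t n_names,
                  attr_op_t op, void *op_data)
{
    size_t i;

    assert(op != NULL);
    for (i = 0; i < n_names; i++) {
        int status = op(names[i], op_data);
        if (status != 0)
            return status;
    }
    return 0;
}

/* Example operator: count the attributes visited. */
int count_op(const char *attr_name, void *op_data)
{
    (void)attr_name;                 /* the name is not needed for counting */
    *(size_t *)op_data += 1;
    return 0;
}
```

A caller would pass an array of names, the operator count_op, and the address of a size_t counter initialized to zero; after the call the counter holds the number of names visited. The per-attribute logic lives entirely in the operator, which is the design point the text makes about H5Aiterate.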
An object reference is based on the relative file address of the object header in the file and is constant for the life of the object. Once a reference to an object is created and stored in a dataset in the file, it can be used to dereference the object it points to. References are handy for creating a file index or for grouping related objects by storing references to them in one dataset.
    #include <stdlib.h>
    #include <hdf5.h>

    #define FILE1 "trefer1.h5"

    /* 1-D dataset with fixed dimensions */
    #define SPACE1_NAME "Space1"
    #define SPACE1_RANK 1
    #define SPACE1_DIM1 4

    /* 2-D dataset with fixed dimensions */
    #define SPACE2_NAME "Space2"
    #define SPACE2_RANK 2
    #define SPACE2_DIM1 10
    #define SPACE2_DIM2 10

    int main(void)
    {
        hid_t fid1;                 /* HDF5 File IDs */
        hid_t dataset;              /* Dataset ID */
        hid_t group;                /* Group ID */
        hid_t sid1;                 /* Dataspace ID */
        hid_t tid1;                 /* Datatype ID */
        hsize_t dims1[] = {SPACE1_DIM1};
        hobj_ref_t *wbuf;           /* buffer to write to disk */
        int *tu32;                  /* Temporary pointer to int data */
        int i;                      /* counting variables */
        const char *write_comment = "Foo!";  /* Comments for group */
        herr_t ret;                 /* Generic return value */

        /* Compound datatype */
        typedef struct s1_t {
            unsigned int a;
            unsigned int b;
            float c;
        } s1_t;

        /* Allocate write buffers */
        wbuf = (hobj_ref_t *)malloc(sizeof(hobj_ref_t) * SPACE1_DIM1);
        tu32 = malloc(sizeof(int) * SPACE1_DIM1);

        /* Create file */
        fid1 = H5Fcreate(FILE1, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        /* Create dataspace for datasets */
        sid1 = H5Screate_simple(SPACE1_RANK, dims1, NULL);

        /* Create a group */
        group = H5Gcreate(fid1, "Group1", -1);

        /* Set group's comment */
        ret = H5Gset_comment(group, ".", write_comment);

        /* Create a dataset (inside Group1) */
        dataset = H5Dcreate(group, "Dataset1", H5T_STD_U32LE, sid1, H5P_DEFAULT);

        for (i = 0; i < SPACE1_DIM1; i++)
            tu32[i] = i * 3;

        /* Write selection to disk */
        ret = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, tu32);

        /* Close Dataset */
        ret = H5Dclose(dataset);

        /* Create another dataset (inside Group1) */
        dataset = H5Dcreate(group, "Dataset2", H5T_NATIVE_UCHAR, sid1, H5P_DEFAULT);

        /* Close Dataset */
        ret = H5Dclose(dataset);

        /* Create a datatype to refer to */
        tid1 = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));

        /* Insert fields */
        ret = H5Tinsert(tid1, "a", HOFFSET(s1_t, a), H5T_NATIVE_INT);
        ret = H5Tinsert(tid1, "b", HOFFSET(s1_t, b), H5T_NATIVE_INT);
        ret = H5Tinsert(tid1, "c", HOFFSET(s1_t, c), H5T_NATIVE_FLOAT);

        /* Save datatype for later */
        ret = H5Tcommit(group, "Datatype1", tid1);

        /* Close datatype */
        ret = H5Tclose(tid1);

        /* Close group */
        ret = H5Gclose(group);

        /* Create a dataset to store references */
        dataset = H5Dcreate(fid1, "Dataset3", H5T_STD_REF_OBJ, sid1, H5P_DEFAULT);

        /* Create reference to dataset */
        ret = H5Rcreate(&wbuf[0], fid1, "/Group1/Dataset1", H5R_OBJECT, -1);

        /* Create reference to dataset */
        ret = H5Rcreate(&wbuf[1], fid1, "/Group1/Dataset2", H5R_OBJECT, -1);

        /* Create reference to group */
        ret = H5Rcreate(&wbuf[2], fid1, "/Group1", H5R_OBJECT, -1);

        /* Create reference to named datatype */
        ret = H5Rcreate(&wbuf[3], fid1, "/Group1/Datatype1", H5R_OBJECT, -1);

        /* Write selection to disk */
        ret = H5Dwrite(dataset, H5T_STD_REF_OBJ, H5S_ALL, H5S_ALL, H5P_DEFAULT, wbuf);

        /* Close disk dataspace */
        ret = H5Sclose(sid1);

        /* Close Dataset */
        ret = H5Dclose(dataset);

        /* Close file */
        ret = H5Fclose(fid1);

        free(wbuf);
        free(tu32);
        return 0;
    }
Remarks:
The following call creates a dataset to store references:
    dataset = H5Dcreate(fid1, "Dataset3", H5T_STD_REF_OBJ, sid1, H5P_DEFAULT);
Notice that the H5T_STD_REF_OBJ datatype is used to specify that references to objects will be stored. The datatype H5T_STD_REF_DSETREG is used to store dataset region references and is discussed later.
The calls to the H5Rcreate function create references to the objects and store them in the buffer wbuf. The signature of the H5Rcreate function is:
    herr_t H5Rcreate(void *buf, hid_t loc_id, const char *name,
                     H5R_type_t ref_type, hid_t space_id)
The first argument is the buffer in which the reference is stored.
The second and third arguments identify the referenced object. In this example, the file identifier fid1 and the absolute name /Group1/Dataset1 identify the dataset. One could also use the group identifier of group Group1 and the relative name of the dataset Dataset1 to create the same reference.
The fourth argument specifies the type of the reference. This example uses references to objects (H5R_OBJECT). Another type of reference, reference to the dataset region (H5R_DATASET_REGION), is discussed later.
The fifth argument is a dataspace identifier. When references to objects are created, as in this example, it is set to -1.
The H5Dwrite function writes a dataset with the references to the file. Notice that the H5T_STD_REF_OBJ datatype is used to describe the dataset's memory datatype.
The contents of the trefer1.h5 file created by this example are as follows:
    HDF5 "trefer1.h5" {
    GROUP "/" {
       DATASET "Dataset3" {
          DATATYPE { H5T_REFERENCE }
          DATASPACE { SIMPLE ( 4 ) / ( 4 ) }
          DATA {
             DATASET 0:1696, DATASET 0:2152, GROUP 0:1320, DATATYPE 0:2268
          }
       }
       GROUP "Group1" {
          DATASET "Dataset1" {
             DATATYPE { H5T_STD_U32LE }
             DATASPACE { SIMPLE ( 4 ) / ( 4 ) }
             DATA {
                0, 3, 6, 9
             }
          }
          DATASET "Dataset2" {
             DATATYPE { H5T_STD_U8LE }
             DATASPACE { SIMPLE ( 4 ) / ( 4 ) }
             DATA {
                0, 0, 0, 0
             }
          }
          DATATYPE "Datatype1" {
             H5T_STD_I32BE "a";
             H5T_STD_I32BE "b";
             H5T_IEEE_F32BE "c";
          }
       }
    }
    }
Notice how the data in dataset Dataset3 is described. The two numbers with the colon in between represent a unique identifier of the object. These numbers are constant for the life of the object.
When the references are read back from the file, the H5T_STD_REF_OBJ datatype must be used to describe the memory datatype.
The following example opens and reads dataset Dataset3 from the file created previously. Then the program dereferences the references to dataset Dataset1, the group and the named datatype, and opens those objects. The program reads and displays the dataset's data, the group's comment, and the number of members of the compound datatype.
    #include <stdio.h>
    #include <stdlib.h>
    #include <hdf5.h>

    #define FILE1 "trefer1.h5"

    /* dataset with fixed dimensions */
    #define SPACE1_NAME "Space1"
    #define SPACE1_RANK 1
    #define SPACE1_DIM1 4

    int main(void)
    {
        hid_t fid1;             /* HDF5 File IDs */
        hid_t dataset,          /* Dataset ID */
              dset2;            /* Dereferenced dataset ID */
        hid_t group;            /* Group ID */
        hid_t sid1;             /* Dataspace ID */
        hid_t tid1;             /* Datatype ID */
        hobj_ref_t *rbuf;       /* buffer to read from disk */
        int *tu32;              /* temp. buffer read from disk */
        int i;                  /* counting variables */
        char read_comment[10];
        herr_t ret;             /* Generic return value */

        /* Allocate read buffers */
        rbuf = malloc(sizeof(hobj_ref_t) * SPACE1_DIM1);
        tu32 = malloc(sizeof(int) * SPACE1_DIM1);

        /* Open the file */
        fid1 = H5Fopen(FILE1, H5F_ACC_RDWR, H5P_DEFAULT);

        /* Open the dataset */
        dataset = H5Dopen(fid1, "/Dataset3");

        /* Read selection from disk */
        ret = H5Dread(dataset, H5T_STD_REF_OBJ, H5S_ALL, H5S_ALL, H5P_DEFAULT, rbuf);

        /* Open dataset object */
        dset2 = H5Rdereference(dataset, H5R_OBJECT, &rbuf[0]);

        /* Check information in referenced dataset */
        sid1 = H5Dget_space(dset2);
        ret = H5Sget_simple_extent_npoints(sid1);

        /* Read from disk */
        ret = H5Dread(dset2, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, tu32);
        printf("Dataset data : \n");
        for (i = 0; i < SPACE1_DIM1; i++)
            printf(" %d ", tu32[i]);
        printf("\n");
        printf("\n");

        /* Close the dataspace and the dereferenced Dataset */
        ret = H5Sclose(sid1);
        ret = H5Dclose(dset2);

        /* Open group object */
        group = H5Rdereference(dataset, H5R_OBJECT, &rbuf[2]);

        /* Get group's comment */
        ret = H5Gget_comment(group, ".", 10, read_comment);
        printf("Group comment is %s \n", read_comment);
        printf(" \n");

        /* Close group */
        ret = H5Gclose(group);

        /* Open datatype object */
        tid1 = H5Rdereference(dataset, H5R_OBJECT, &rbuf[3]);

        /* Verify correct datatype */
        {
            H5T_class_t tclass;

            tclass = H5Tget_class(tid1);
            if (tclass == H5T_COMPOUND)
                printf("Number of compound datatype members is %d \n",
                       H5Tget_nmembers(tid1));
            printf(" \n");
        }

        /* Close datatype */
        ret = H5Tclose(tid1);

        /* Close Dataset */
        ret = H5Dclose(dataset);

        /* Close file */
        ret = H5Fclose(fid1);

        /* Free memory buffers */
        free(rbuf);
        free(tu32);
        return 0;
    }
The output of this program is as follows:
    Dataset data :
     0 3 6 9
    Group comment is Foo!
    Number of compound datatype members is 3
Remarks:
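The H5Dread function was used to read dataset Dataset3 containing the references to the objects. The H5T_STD_REF_OBJ memory datatype was used to read references to memory.
H5Rdereference obtains the object's identifier. The signature of this function is:
    hid_t H5Rdereference(hid_t dataset, H5R_type_t ref_type, void *ref)
The first argument is the identifier of the dataset that contains the references.
The second argument specifies the reference type. H5R_OBJECT was used to specify a reference to an object. Another type, used to specify a reference to a dataset region and discussed later, is H5R_DATASET_REGION.
The third argument is a pointer to the reference to be dereferenced.
This example relies on knowing the type of each referenced object in advance. When the type is not known, H5Rget_object_type should be used to identify the type of object the reference points to.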
    #include <stdlib.h>
    #include <hdf5.h>

    #define FILE2 "trefer2.h5"
    #define SPACE1_NAME "Space1"
    #define SPACE1_RANK 1
    #define SPACE1_DIM1 4

    /* Dataset with fixed dimensions */
    #define SPACE2_NAME "Space2"
    #define SPACE2_RANK 2
    #define SPACE2_DIM1 10
    #define SPACE2_DIM2 10

    /* Element selection information */
    #define POINT1_NPOINTS 10

    int main(void)
    {
        hid_t fid1;                     /* HDF5 File IDs */
        hid_t dset1,                    /* Dataset ID */
              dset2;                    /* Dereferenced dataset ID */
        hid_t sid1,                     /* Dataspace ID #1 */
              sid2;                     /* Dataspace ID #2 */
        hsize_t dims1[] = {SPACE1_DIM1},
                dims2[] = {SPACE2_DIM1, SPACE2_DIM2};
        hssize_t start[SPACE2_RANK];    /* Starting location of hyperslab */
        hsize_t stride[SPACE2_RANK];    /* Stride of hyperslab */
        hsize_t count[SPACE2_RANK];     /* Element count of hyperslab */
        hsize_t block[SPACE2_RANK];     /* Block size of hyperslab */
        hssize_t coord1[POINT1_NPOINTS][SPACE2_RANK];
                                        /* Coordinates for point selection */
        hdset_reg_ref_t *wbuf;          /* buffer to write to disk */
        int *dwbuf;                     /* Buffer for writing numeric data to disk */
        int i;                          /* counting variables */
        herr_t ret;                     /* Generic return value */

        /* Allocate write buffers; calloc zeroes the unused references */
        wbuf = calloc(SPACE1_DIM1, sizeof(hdset_reg_ref_t));
        dwbuf = malloc(sizeof(int) * SPACE2_DIM1 * SPACE2_DIM2);

        /* Create file */
        fid1 = H5Fcreate(FILE2, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

        /* Create dataspace for datasets */
        sid2 = H5Screate_simple(SPACE2_RANK, dims2, NULL);

        /* Create a dataset */
        dset2 = H5Dcreate(fid1, "Dataset2", H5T_STD_U8LE, sid2, H5P_DEFAULT);

        for (i = 0; i < SPACE2_DIM1 * SPACE2_DIM2; i++)
            dwbuf[i] = i * 3;

        /* Write selection to disk */
        ret = H5Dwrite(dset2, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dwbuf);

        /* Close Dataset */
        ret = H5Dclose(dset2);

        /* Create dataspace for the reference dataset */
        sid1 = H5Screate_simple(SPACE1_RANK, dims1, NULL);

        /* Create a dataset */
        dset1 = H5Dcreate(fid1, "Dataset1", H5T_STD_REF_DSETREG, sid1, H5P_DEFAULT);

        /* Create references */

        /* Select 6x6 hyperslab for first reference */
        start[0] = 2;  start[1] = 2;
        stride[0] = 1; stride[1] = 1;
        count[0] = 6;  count[1] = 6;
        block[0] = 1;  block[1] = 1;
        ret = H5Sselect_hyperslab(sid2, H5S_SELECT_SET, start, stride, count, block);

        /* Store first dataset region */
        ret = H5Rcreate(&wbuf[0], fid1, "/Dataset2", H5R_DATASET_REGION, sid2);

        /* Select sequence of ten points for second reference */
        coord1[0][0] = 6; coord1[0][1] = 9;
        coord1[1][0] = 2; coord1[1][1] = 2;
        coord1[2][0] = 8; coord1[2][1] = 4;
        coord1[3][0] = 1; coord1[3][1] = 6;
        coord1[4][0] = 2; coord1[4][1] = 8;
        coord1[5][0] = 3; coord1[5][1] = 2;
        coord1[6][0] = 0; coord1[6][1] = 4;
        coord1[7][0] = 9; coord1[7][1] = 0;
        coord1[8][0] = 7; coord1[8][1] = 1;
        coord1[9][0] = 3; coord1[9][1] = 3;
        ret = H5Sselect_elements(sid2, H5S_SELECT_SET, POINT1_NPOINTS,
                                 (const hssize_t **)coord1);

        /* Store second dataset region */
        ret = H5Rcreate(&wbuf[1], fid1, "/Dataset2", H5R_DATASET_REGION, sid2);

        /* Write selection to disk */
        ret = H5Dwrite(dset1, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL, H5P_DEFAULT, wbuf);

        /* Close all objects */
        ret = H5Sclose(sid1);
        ret = H5Dclose(dset1);
        ret = H5Sclose(sid2);

        /* Close file */
        ret = H5Fclose(fid1);

        free(wbuf);
        free(dwbuf);
        return 0;
    }
Remarks:
The following call creates a dataset to store references to dataset regions (selections):
    dset1 = H5Dcreate(fid1, "Dataset1", H5T_STD_REF_DSETREG, sid1, H5P_DEFAULT);
Notice that the H5T_STD_REF_DSETREG datatype is used.
The selections are made on the dataspace handle sid2 with calls to H5Sselect_hyperslab and H5Sselect_elements. The handle was created when dataset Dataset2 was created and it describes the dataset's dataspace. It was not closed when the dataset was closed to decrease the number of function calls used in the example. In a real application program, one should open the dataset and determine its dataspace using the H5Dget_space function.
H5Rcreate is used to create a dataset region reference and store it in a buffer. The signature of the function is:
    herr_t H5Rcreate(void *buf, hid_t loc_id, const char *name,
                     H5R_type_t ref_type, hid_t space_id)
The file identifier fid1 and the absolute name /Dataset2 were used to identify the dataset. The reference to the region of this dataset is stored in the buffer buf.
The H5R_DATASET_REGION reference type is used, and the dataspace identifier sid2, which carries the selection, is passed as the last argument.
The contents of the file trefer2.h5 created by this program are as follows:
    HDF5 "trefer2.h5" {
    GROUP "/" {
       DATASET "Dataset1" {
          DATATYPE { H5T_REFERENCE }
          DATASPACE { SIMPLE ( 4 ) / ( 4 ) }
          DATA {
             DATASET 0:744 {(2,2)-(7,7)},
             DATASET 0:744 {(6,9), (2,2), (8,4), (1,6), (2,8), (3,2), (0,4),
                            (9,0), (7,1), (3,3)},
             NULL, NULL
          }
       }
       DATASET "Dataset2" {
          DATATYPE { H5T_STD_U8LE }
          DATASPACE { SIMPLE ( 10, 10 ) / ( 10, 10 ) }
          DATA {
             0, 3, 6, 9, 12, 15, 18, 21, 24, 27,
             30, 33, 36, 39, 42, 45, 48, 51, 54, 57,
             60, 63, 66, 69, 72, 75, 78, 81, 84, 87,
             90, 93, 96, 99, 102, 105, 108, 111, 114, 117,
             120, 123, 126, 129, 132, 135, 138, 141, 144, 147,
             150, 153, 156, 159, 162, 165, 168, 171, 174, 177,
             180, 183, 186, 189, 192, 195, 198, 201, 204, 207,
             210, 213, 216, 219, 222, 225, 228, 231, 234, 237,
             240, 243, 246, 249, 252, 255, 255, 255, 255, 255,
             255, 255, 255, 255, 255, 255, 255, 255, 255, 255
          }
       }
    }
    }
Notice how the raw data of the dataset with the dataset regions is displayed. Each element of the raw data consists of a reference to the dataset (DATASET number1:number2) and its selected region. If the selection is a hyperslab, the corner coordinates of the hyperslab are displayed. For the point selection, the coordinates of each point are displayed. Since only two selections were stored, the third and fourth elements of the dataset Dataset1 are set to NULL. This was done by the buffer initialization in the program.
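The corner coordinates shown for the hyperslab follow directly from the start, stride, count and block values passed to H5Sselect_hyperslab. The sketch below reproduces that arithmetic in plain C. Both helpers are hypothetical and are not HDF5 calls; they assume a regular hyperslab whose blocks do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an HDF5 call): number of elements selected
 * by a regular hyperslab, the product of count[i] * block[i].
 */
unsigned long hyperslab_npoints(size_t rank,
                                const unsigned long count[],
                                const unsigned long block[])
{
    unsigned long n = 1;
    size_t i;

    for (i = 0; i < rank; i++)
        n *= count[i] * block[i];
    return n;
}

/*
 * Hypothetical helper: opposite corners of the smallest box that
 * encloses a regular hyperslab.
 */
void hyperslab_corners(size_t rank,
                       const unsigned long start[],
                       const unsigned long stride[],
                       const unsigned long count[],
                       const unsigned long block[],
                       unsigned long low[], unsigned long high[])
{
    size_t i;

    for (i = 0; i < rank; i++) {
        assert(count[i] > 0 && block[i] > 0);
        low[i] = start[i];
        /* The last block begins at start + (count - 1) * stride
         * and covers block elements. */
        high[i] = start[i] + (count[i] - 1) * stride[i] + block[i] - 1;
    }
}
```

For the selection used in the example (start 2, stride 1, count 6 and block 1 in both dimensions) these helpers give 36 elements and the corners (2,2) and (7,7), which agrees with the region {(2,2)-(7,7)} shown in the listing.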
Open and read the dataset containing the references to the dataset regions. The datatype H5T_STD_REF_DSETREG must be used during the read operation.
Use H5Rdereference to obtain the dataset identifier from the read dataset region reference.
OR
Use H5Rget_region to obtain the dataspace identifier for the dataset containing the selection from the read dataset region reference.
With the dataspace identifier, the H5Sget_select_* functions can be used to obtain information about the selection.
    #include <stdio.h>
    #include <stdlib.h>
    #include <hdf5.h>

    #define FILE2 "trefer2.h5"
    #define NPOINTS 10

    /* 1-D dataset with fixed dimensions */
    #define SPACE1_NAME "Space1"
    #define SPACE1_RANK 1
    #define SPACE1_DIM1 4

    /* 2-D dataset with fixed dimensions */
    #define SPACE2_NAME "Space2"
    #define SPACE2_RANK 2
    #define SPACE2_DIM1 10
    #define SPACE2_DIM2 10

    int main(void)
    {
        hid_t fid1;                 /* HDF5 File IDs */
        hid_t dset1,                /* Dataset ID */
              dset2;                /* Dereferenced dataset ID */
        hid_t sid1,                 /* Dataspace ID #1 */
              sid2;                 /* Dataspace ID #2 */
        hsize_t *coords;            /* Coordinate buffer */
        hsize_t low[SPACE2_RANK];   /* Selection bounds */
        hsize_t high[SPACE2_RANK];  /* Selection bounds */
        hdset_reg_ref_t *rbuf;      /* buffer to read from disk */
        int *drbuf;                 /* Buffer for reading numeric data from disk */
        int i, j;                   /* counting variables */
        herr_t ret;                 /* Generic return value */

        /* Allocate read buffers */
        rbuf = malloc(sizeof(hdset_reg_ref_t) * SPACE1_DIM1);
        drbuf = calloc(SPACE2_DIM1 * SPACE2_DIM2, sizeof(int));

        /* Open the file */
        fid1 = H5Fopen(FILE2, H5F_ACC_RDWR, H5P_DEFAULT);

        /* Open the dataset */
        dset1 = H5Dopen(fid1, "/Dataset1");

        /* Read selection from disk */
        ret = H5Dread(dset1, H5T_STD_REF_DSETREG, H5S_ALL, H5S_ALL, H5P_DEFAULT, rbuf);

        /* Try to open objects */
        dset2 = H5Rdereference(dset1, H5R_DATASET_REGION, &rbuf[0]);

        /* Check information in referenced dataset */
        sid1 = H5Dget_space(dset2);
        ret = H5Sget_simple_extent_npoints(sid1);
        printf(" Number of elements in the dataset is : %d\n", ret);

        /* Read from disk */
        ret = H5Dread(dset2, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, drbuf);

        for (i = 0; i < SPACE2_DIM1; i++) {
            for (j = 0; j < SPACE2_DIM2; j++)
                printf(" %d ", drbuf[i * SPACE2_DIM2 + j]);
            printf("\n");
        }

        /* Get the hyperslab selection */
        sid2 = H5Rget_region(dset1, H5R_DATASET_REGION, &rbuf[0]);

        /* Verify correct hyperslab selected */
        ret = H5Sget_select_npoints(sid2);
        printf(" Number of elements in the hyperslab is : %d \n", ret);
        ret = H5Sget_select_hyper_nblocks(sid2);

        /* Allocate space for the hyperslab blocks: two corners per block */
        coords = malloc(ret * SPACE2_RANK * sizeof(hsize_t) * 2);
        ret = H5Sget_select_hyper_blocklist(sid2, 0, ret, coords);
        printf(" Hyperslab coordinates are : \n");
        printf(" ( %lu , %lu ) ( %lu , %lu ) \n",
               (unsigned long)coords[0], (unsigned long)coords[1],
               (unsigned long)coords[2], (unsigned long)coords[3]);
        free(coords);
        ret = H5Sget_select_bounds(sid2, low, high);

        /* Close region space */
        ret = H5Sclose(sid2);

        /* Get the element selection */
        sid2 = H5Rget_region(dset1, H5R_DATASET_REGION, &rbuf[1]);

        /* Verify correct elements selected */
        ret = H5Sget_select_elem_npoints(sid2);
        printf(" Number of selected elements is : %d\n", ret);

        /* Allocate space for the element points */
        coords = malloc(ret * SPACE2_RANK * sizeof(hsize_t));
        ret = H5Sget_select_elem_pointlist(sid2, 0, ret, coords);
        printf(" Coordinates of selected elements are : \n");
        for (i = 0; i < 2 * NPOINTS; i = i + 2)
            printf(" ( %lu , %lu ) \n",
                   (unsigned long)coords[i], (unsigned long)coords[i + 1]);
        free(coords);
        ret = H5Sget_select_bounds(sid2, low, high);

        /* Close region space */
        ret = H5Sclose(sid2);

        /* Close first space */
        ret = H5Sclose(sid1);

        /* Close dereferenced Dataset */
        ret = H5Dclose(dset2);

        /* Close Dataset */
        ret = H5Dclose(dset1);

        /* Close file */
        ret = H5Fclose(fid1);

        /* Free memory buffers */
        free(rbuf);
        free(drbuf);
        return 0;
    }
The output of this program is:
     Number of elements in the dataset is : 100
     0 3 6 9 12 15 18 21 24 27
     30 33 36 39 42 45 48 51 54 57
     60 63 66 69 72 75 78 81 84 87
     90 93 96 99 102 105 108 111 114 117
     120 123 126 129 132 135 138 141 144 147
     150 153 156 159 162 165 168 171 174 177
     180 183 186 189 192 195 198 201 204 207
     210 213 216 219 222 225 228 231 234 237
     240 243 246 249 252 255 255 255 255 255
     255 255 255 255 255 255 255 255 255 255
     Number of elements in the hyperslab is : 36
     Hyperslab coordinates are :
     ( 2 , 2 ) ( 7 , 7 )
     Number of selected elements is : 10
     Coordinates of selected elements are :
     ( 6 , 9 )
     ( 2 , 2 )
     ( 8 , 4 )
     ( 1 , 6 )
     ( 2 , 8 )
     ( 3 , 2 )
     ( 0 , 4 )
     ( 9 , 0 )
     ( 7 , 1 )
     ( 3 , 3 )
Remarks:
The dataset with the region references was read by H5Dread with the H5T_STD_REF_DSETREG datatype specified.
The read reference can be used to obtain the dataset identifier with the following call:
    dset2 = H5Rdereference(dset1, H5R_DATASET_REGION, &rbuf[0]);
or to obtain spatial information (dataspace and selection) with the call to H5Rget_region:
    sid2 = H5Rget_region(dset1, H5R_DATASET_REGION, &rbuf[0]);
The reference to the dataset region has information for both the dataset itself and its selection. In both functions, the first argument is the identifier of the dataset containing the region references, the second argument specifies the reference type, and the third argument points to the reference.
This example uses several H5Sget_select_* functions to obtain information about selections:
    H5Sget_select_npoints: returns the number of elements in the hyperslab
    H5Sget_select_hyper_nblocks: returns the number of blocks in the hyperslab
    H5Sget_select_hyper_blocklist: returns the "lower left" and "upper right" coordinates of the blocks in the hyperslab selection
    H5Sget_select_bounds: returns the coordinates of the "minimal" block containing a hyperslab selection
    H5Sget_select_elem_npoints: returns the number of points in the element selection
    H5Sget_select_elem_pointlist: returns the coordinates of the element selection
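The notion of a "minimal" block containing a selection can be illustrated for a point selection with a few lines of plain C. The helper point_bounds below is hypothetical and is not an HDF5 call. It assumes the coordinates are stored point by point in a flat array, which is the layout the example program reads back when it prints coords[i] and coords[i+1].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not an HDF5 call): smallest box containing a set
 * of points. coords holds npoints * rank values, stored point by point.
 */
void point_bounds(size_t npoints, size_t rank,
                  const unsigned long coords[],
                  unsigned long low[], unsigned long high[])
{
    size_t p, d;

    assert(npoints > 0 && rank > 0);

    /* Start from the first point, then widen the box as needed. */
    for (d = 0; d < rank; d++)
        low[d] = high[d] = coords[d];

    for (p = 1; p < npoints; p++) {
        for (d = 0; d < rank; d++) {
            unsigned long c = coords[p * rank + d];
            if (c < low[d])
                low[d] = c;
            if (c > high[d])
                high[d] = c;
        }
    }
}
```

Applied to the ten points selected in the example, the box runs from (0,0) to (9,9): the points (0,4) and (9,0) fix the extremes of the first dimension, and (9,0) and (6,9) fix those of the second.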
HDF Help Desk
Last modified: 30 January 2001 Describes HDF5 Release 1.4, February 2001 | Copyright |