Introduction to HDF5 1.0 Beta

This is an introduction to the HDF5 data model and programming model. Being a Getting Started or QuickStart document, this Introduction to HDF5 is intended to provide enough information for you to develop a basic understanding of how HDF5 works and is meant to be used. Knowledge of the current version of HDF will make it easier to follow the text, but it is not required. More complete information of the sort you will need to actually use HDF5 is available in the HDF5 documentation at http://hdf.ncsa.uiuc.edu/nra/BigHDF/. Available documents include the following:

Code examples are available in the source code tree when you install HDF5.

What is HDF5?

HDF5 is a new, experimental version of HDF that is designed to address some of the limitations of the current version of HDF (HDF4.x) and to address current and anticipated requirements of modern systems and applications.

We urge you to look at this new version of HDF and give us feedback on what you like or do not like about it, and what features you would like to see added to it.

Why HDF5? The development of HDF5 is motivated by a number of limitations in the current HDF format, as well as limitations in the library. Some of these limitations are:

HDF5 includes the following improvements.

Limitations of the current release

The beta release includes most of the basic functionality that is planned for the HDF5 library. However, the library does not implement all of the features detailed in the format and API specifications. Here is a listing of some of the limitations of the current release:

Changes in the current release

A detailed listing of changes in HDF5 since the last release (HDF5 1.0 alpha 2.0) can be found in the file hdf5/RELEASE in the beta code installation. Important changes include:

HDF5 file organization and data model

HDF5 files are organized in a hierarchical structure, with two primary structures: groups and datasets.

Working with groups and group members is similar in many ways to working with directories and files in UNIX. As with UNIX directories and files, objects in an HDF5 file are often described by giving their full path names.

/ signifies the root group.
/foo signifies a member of the root group called foo.
/foo/zoo signifies a member of the group foo, which in turn is a member of the root group.
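The path-name convention above can be illustrated without the library. The following is a minimal sketch in plain C, not part of the HDF5 API; the function name path_depth is hypothetical. It counts how many member names an absolute path contains, so that / has depth 0 and /foo/zoo has depth 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an HDF5 routine: count the member names in
 * an absolute path such as "/foo/zoo". Repeated slashes are skipped. */
size_t path_depth(const char *path)
{
    size_t depth = 0;

    assert(path != NULL && path[0] == '/');   /* absolute paths only */
    while (*path != '\0') {
        while (*path == '/')                  /* skip separators */
            path++;
        if (*path == '\0')
            break;
        depth++;                              /* found one name */
        while (*path != '\0' && *path != '/') /* skip over the name */
            path++;
    }
    return depth;
}
```

Calling path_depth("/foo") in this sketch yields 1, reflecting that foo is a direct member of the root group.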

Any HDF5 group, dataset, or named datatype may have an associated attribute list. An HDF5 attribute is a user-defined HDF5 structure that provides extra information about an HDF5 object. Attributes are described in more detail below.

HDF5 Groups

An HDF5 group is a structure containing zero or more HDF5 objects. A group has two parts:

HDF5 Datasets

A dataset is stored in a file in two parts: a header and a data array.

The header contains information that is needed to interpret the array portion of the dataset, as well as metadata (or pointers to metadata) that describes or annotates the dataset. Header information includes the name of the object, its dimensionality, its number-type, information about how the data itself is stored on disk, and other information used by the library to speed up access to the dataset or maintain the file's integrity.

There are four essential classes of information in any header: name, datatype, dataspace, and storage layout:

Name. A dataset name is a sequence of alphanumeric ASCII characters.

Datatype. HDF5 allows one to define many different kinds of datatypes. There are two basic categories of datatypes: atomic types and compound types. Atomic types are those that are not decomposed at the data type interface level, such as integers and floats. Compound types are made up of atomic types. Named datatypes, discussed later in this document, provide a mechanism for sharing a datatype across datasets, ensuring that the datatype is identical for each dataset.

Atomic datatypes include integers and floating-point numbers. Each atomic type belongs to a particular class and has several properties: size, order, precision, and offset. In this introduction, we consider only a few of these properties.

Atomic type classes include integer, float, date and time, string, bit field, and opaque. (Note: Only the integer, float, and string classes are available in the current implementation.)

Properties of integer types include size, order (endian-ness), and signed-ness (signed/unsigned).
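The order property can be made concrete with a short sketch in plain C. It does not call the HDF5 library; host_is_little_endian and store_le32 are hypothetical names chosen for this illustration. The first function reports the byte order of the machine it runs on, and the second lays out a 32-bit unsigned integer in little-endian order whatever the host order is.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint16_t      probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Write value into out[0..3] in little-endian order on any host. */
void store_le32(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)(value & 0xFFu);          /* least significant */
    out[1] = (unsigned char)((value >> 8) & 0xFFu);
    out[2] = (unsigned char)((value >> 16) & 0xFFu);
    out[3] = (unsigned char)((value >> 24) & 0xFFu);  /* most significant */
}
```

Because store_le32 uses shifts rather than reinterpreting memory, the bytes it produces are the same on big-endian and little-endian machines.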

Properties of float types include the size and location of the exponent and mantissa, and the location of the sign bit.

The datatypes that are supported in the current implementation are:

A compound datatype is one in which a collection of simple datatypes are represented as a single unit, similar to a struct in C. The parts of a compound datatype are called members. The members of a compound datatype may be of any datatype, including another compound datatype. It is possible to read members from a compound type without reading the whole type.

Dataspace. A dataset dataspace describes the dimensionality of the dataset. The dimensions of a dataset can be fixed (unchanging), or they may be unlimited, which means that they are extendible (i.e. they can grow larger).

Properties of a dataspace consist of the rank (number of dimensions) of the data array, the actual sizes of the dimensions of the array, and the maximum sizes of the dimensions of the array. For a fixed-dimension dataset, the actual size is the same as the maximum size of a dimension. When a dimension is unlimited, the maximum size is set to the value H5S_UNLIMITED. (An example below shows how to create extendible datasets.)
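As a rough mental model of these three properties, the following plain C sketch stores a rank, current sizes, and maximum sizes, and checks whether a proposed set of sizes fits within the maximums. Everything here is hypothetical and for illustration only: sketch_space_t, SKETCH_UNLIMITED, and sketch_can_resize are not library names, and the library's own representation may differ.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_RANK  8
#define SKETCH_UNLIMITED ((unsigned long)-1)  /* stand-in for an unlimited maximum */

typedef struct {
    int           rank;                      /* number of dimensions */
    unsigned long dims[SKETCH_MAX_RANK];     /* current sizes */
    unsigned long maxdims[SKETCH_MAX_RANK];  /* maximum sizes */
} sketch_space_t;

/* Return 1 if the dataspace may take the proposed sizes, else 0. */
int sketch_can_resize(const sketch_space_t *s, const unsigned long *new_dims)
{
    int i;

    assert(s != NULL && s->rank > 0 && s->rank <= SKETCH_MAX_RANK);
    for (i = 0; i < s->rank; i++) {
        if (s->maxdims[i] != SKETCH_UNLIMITED && new_dims[i] > s->maxdims[i])
            return 0;   /* this dimension would exceed its fixed maximum */
    }
    return 1;
}
```

In this model a fixed 3x3 dataspace has dims and maxdims both equal to {3, 3}, so any request to grow it is rejected.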

A dataspace can also describe portions of a dataset, making it possible to do partial I/O operations on selections. Selection is supported by the dataspace interface (H5S). Given an n-dimensional dataset, there are currently three ways to do partial selection:

  1. Select a logically contiguous n-dimensional hyperslab.
  2. Select a non-contiguous hyperslab consisting of elements or blocks of elements (hyperslabs) that are equally spaced.
  3. Select a list of independent points.

Since I/O operations have two end-points, the raw data transfer functions require two dataspace arguments: one describes the application memory dataspace or subset thereof, and the other describes the file dataspace or subset thereof.

See Dataspaces at http://hdf.ncsa.uiuc.edu/nra/BigHDF/Dataspaces.html in the HDF User’s Guide for further information.

Storage layout. The HDF5 format makes it possible to store data in a variety of ways. The default storage layout format is contiguous, meaning that data is stored in the same linear way that it is organized in memory. Two other storage layout formats are currently defined for HDF5: compact and chunked. In the future, other storage layouts may be added.

Compact storage is used when the amount of data is small and can be stored directly in the object header. (Note: Compact storage is not supported in this release.)

Chunked storage involves dividing the dataset into equal-sized "chunks" that are stored separately. Chunking has three important benefits.

  1. It makes it possible to achieve good performance when accessing subsets of the datasets, even when the subset to be chosen is orthogonal to the normal storage order of the dataset.
  2. It makes it possible to compress large datasets and still achieve good performance when accessing subsets of the dataset.
  3. It makes it possible to extend the dimensions of a dataset efficiently in any direction.
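The arithmetic behind chunking can be sketched in a few lines of plain C. The helpers below are hypothetical (they are not HDF5 routines) and assume a 2-D dataset divided into equal-sized chunks: one finds the chunk that holds a given element, and the other counts how many chunks are needed to cover a dimension.

```c
#include <assert.h>

/* Hypothetical helper: for a 2-D dataset split into chunks of
 * chunk_rows x chunk_cols elements, report which chunk holds
 * element (row, col). */
void chunk_of_element(unsigned long row, unsigned long col,
                      unsigned long chunk_rows, unsigned long chunk_cols,
                      unsigned long *chunk_row, unsigned long *chunk_col)
{
    assert(chunk_rows > 0 && chunk_cols > 0);
    *chunk_row = row / chunk_rows;
    *chunk_col = col / chunk_cols;
}

/* Number of chunks needed to cover n elements along one dimension. */
unsigned long chunks_along(unsigned long n, unsigned long chunk_size)
{
    assert(chunk_size > 0);
    return (n + chunk_size - 1) / chunk_size;   /* round up */
}
```

With 2x5 chunks, for instance, growing a dimension from 3 to 10 rows raises chunks_along from 2 to 5: new chunks are added, and in this model the existing ones do not need to move.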

See Datasets at http://hdf.ncsa.uiuc.edu/nra/BigHDF/Datasets.html in the HDF User’s Guide for further information.

HDF5 Attributes

Attributes are small named datasets that are attached to primary datasets, groups, or named datatypes. Attributes can be used to describe the nature and/or the intended usage of a dataset or group. An attribute has two parts: (1) a name and (2) a value. The value part contains one or more data entries of the same data type.

The Attribute API (H5A) is used to read or write attribute information. When accessing attributes, they can be identified by name or by an index value. The use of an index value makes it possible to iterate through all of the attributes associated with a given object.
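The two ways of identifying an attribute can be pictured with a small plain C model. This sketch is only an analogy, not the H5A interface: sketch_attr_t, attr_by_index, and attr_by_name are hypothetical names, and each attribute holds a single int for simplicity. Lookup by name is written as an iteration over index values, which is the pattern that index access makes possible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;    /* attribute name */
    int         value;   /* a single integer value, for simplicity */
} sketch_attr_t;

/* Look up by index; returns NULL if idx is out of range. */
const sketch_attr_t *attr_by_index(const sketch_attr_t *list, size_t n,
                                   size_t idx)
{
    return idx < n ? &list[idx] : NULL;
}

/* Look up by name by visiting every index; returns NULL if absent. */
const sketch_attr_t *attr_by_name(const sketch_attr_t *list, size_t n,
                                  const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    }
    return NULL;
}
```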

The HDF5 format and I/O library are designed with the assumption that attributes are small datasets. They are always stored in the object header of the object they are attached to. Because of this, large datasets should not be stored as attributes. How large is "large" is not defined by the library and is up to the user's interpretation. (Large datasets with metadata can be stored as supplemental datasets in a group with the primary dataset.)

See Attributes at http://hdf.ncsa.uiuc.edu/nra/BigHDF/Attributes.html in the HDF User’s Guide for further information.

The HDF5 Applications Programming Interface (API)

The current HDF5 API is implemented only in C. The API provides routines for creating HDF5 files, creating and writing groups, datasets, and their attributes to HDF5 files, and reading groups, datasets and their attributes from HDF5 files.

Naming conventions

All C routines in the HDF5 library begin with a prefix of the form H5*, where * is a single letter indicating the object on which the operation is to be performed:

Include files

There are a number of definitions and declarations that should be included with any HDF5 program. These definitions and declarations are contained in several include files. The main include file is hdf5.h. This file includes all of the other files that your program is likely to need. Be sure to include hdf5.h in any program that accesses HDF5.

Predefined atomic datatypes

A datatype is a collection of data type properties, all of which can be stored on disk, and which, when taken as a whole, provide complete information for data conversion to or from that data type. The datatype (H5T) interface provides functions to set and query properties of a data type.

A data point is an instance of a data type, which is an instance of a type class. We have defined a set of type classes and properties which can be extended at a later time. The atomic type classes describe types that cannot be decomposed at the data type interface level; all other classes are compound.

NATIVE datatypes. Although it is possible to describe nearly any kind of atomic data type, most applications will use predefined datatypes that are supported by their compiler. In HDF5 these are called "native" datatypes. NATIVE datatypes are C-like datatypes that are generally supported by the hardware of the machine on which the library was compiled. In order to be portable, applications should almost always use the NATIVE designation to describe data values in memory.

The NATIVE architecture has base names which do not follow the same rules as the others. Instead, native type names are similar to the C type names. Here are some examples:

     Example                 Corresponding C Type
     H5T_NATIVE_CHAR         signed char
     H5T_NATIVE_UCHAR        unsigned char
     H5T_NATIVE_SHORT        short
     H5T_NATIVE_USHORT       unsigned short
     H5T_NATIVE_INT          int
     H5T_NATIVE_UINT         unsigned
     H5T_NATIVE_LONG         long
     H5T_NATIVE_ULONG        unsigned long
     H5T_NATIVE_LLONG        long long
     H5T_NATIVE_ULLONG       unsigned long long
     H5T_NATIVE_FLOAT        float
     H5T_NATIVE_DOUBLE       double
     H5T_NATIVE_LDOUBLE      long double
     H5T_NATIVE_HSIZE        hsize_t
     H5T_NATIVE_HSSIZE       hssize_t
     H5T_NATIVE_HERR         herr_t
     H5T_NATIVE_HBOOL        hbool_t

See Datatypes at http://hdf.ncsa.uiuc.edu/HDF5/Datatypes.html in the HDF User’s Guide for further information.

Named datatypes. Normally each dataset has its own datatype, but sometimes we may want to share a datatype among several datasets. This can be done using a named datatype. A named data type is stored in a file independent of any dataset, and referenced by all datasets that have that datatype. Named datatypes are discussed more fully in the Datatypes document referenced immediately above.

Programming models

In this section we describe how to program some basic operations on files, including how to

How to create an HDF5 file

This programming model shows how to create a file and also how to close the file.

  1. Create the file using H5Fcreate. Obtain a file identifier.
  2. Close the file with H5Fclose.

The following code fragment implements the specified model. If there is a possibility that the file already exists, the user must add the flag H5ACC_TRUNC to the access mode to overwrite the previous file's information.

hid_t       file;                          /* file identifier */
herr_t      status;                        /* return status */

/*
 * Create a new file using H5ACC_TRUNC access,
 * default file creation properties, and default file
 * access properties.
 * Then close the file.
 * (FILE is assumed to be a macro defined as the file name string.)
 */
file = H5Fcreate(FILE, H5ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
status = H5Fclose(file);

How to create and initialize the essential components of a dataset for writing to a file

Recall that datatypes and dimensionality (dataspace) are independent objects, which are created separately from any dataset that they might be attached to. Because of this the creation of a dataset requires, at a minimum, separate definitions of datatype, dimensionality, and dataset. Hence, to create a dataset the following steps need to be taken:

  1. Create and initialize a dataspace for the dataset to be written.
  2. Define the datatype for the dataset to be written.
  3. Create and initialize the dataset itself.

The following code illustrates the creation of these three components of a dataset object.

hid_t    dataset, datatype, dataspace;   /* declare identifiers */
hsize_t  dimsf[2];                       /* dataset dimensions */
herr_t   status;                         /* return status */

/* 
 * 1. Create dataspace: Describe the size of the array and 
 * create the data space for fixed size dataset. 
 */
dimsf[0] = NX;
dimsf[1] = NY;
dataspace = H5Screate_simple(RANK, dimsf, NULL); 

/* 
 * 2. Define datatype for the data in the file.
 * We will store little endian integer numbers.
 */
datatype = H5Tcopy(H5T_NATIVE_INT);
status = H5Tset_order(datatype, H5T_ORDER_LE);

/*
 * 3. Create a new dataset within the file using defined 
 * dataspace and datatype and default dataset creation
 * properties.
 * NOTE: H5T_NATIVE_INT can be used as datatype if conversion
 * to little endian is not needed.
 */
dataset = H5Dcreate(file, DATASETNAME, datatype, dataspace, H5P_DEFAULT);

How to discard objects when they are no longer needed

The datatype, dataspace, and dataset objects should be released once they are no longer needed by a program. Since each is an independent object, they must be released (or closed) separately. The following lines of code close the datatype, dataset, and dataspace that were created in the preceding section.

H5Tclose(datatype);

H5Dclose(dataset);

H5Sclose(dataspace);

How to write a dataset to a new file

Having defined the datatype, dataset, and dataspace parameters, you write out the data with a call to H5Dwrite.

/*
 * Write the data to the dataset using default transfer
 * properties.
 */
status = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                  H5P_DEFAULT, data);

The third and fourth parameters of H5Dwrite in the example describe the dataspaces in memory and in the file, respectively. They are set to the value H5S_ALL to indicate that an entire dataset is to be written. In a later section we look at how we would access a portion of a dataset.

Example 1 contains a program that creates a file and a dataset, and writes the dataset to the file.

Reading is analogous to writing. If, in the previous example, we wish to read an entire dataset, we would use the same basic calls with the same parameters. Of course, the routine H5Dread would replace H5Dwrite.

Getting information about a dataset

Although reading is analogous to writing, it is often necessary to query a file to obtain information about a dataset. For instance, we often need to know about the datatype associated with a dataset, as well as dataspace information (e.g. rank and dimensions). There are several "get" routines for obtaining this information. The following code segment illustrates how we would get this kind of information:

/*
 * Get datatype and dataspace identifiers and then query
 * dataset class, order, size, rank and dimensions.
 */

datatype  = H5Dget_type(dataset);     /* datatype identifier */ 
class     = H5Tget_class(datatype);
if (class == H5T_INTEGER) printf("Data set has INTEGER type \n");
order     = H5Tget_order(datatype);
if (order == H5T_ORDER_LE) printf("Little endian order \n");

size  = H5Tget_size(datatype);
/* Cast before printing so the format matches the argument type. */
printf(" Data size is %lu \n", (unsigned long)size);

dataspace = H5Dget_space(dataset);    /* dataspace identifier */
rank      = H5Sextent_ndims(dataspace);
status_n  = H5Sextent_dims(dataspace, dims_out);
printf("rank %d, dimensions %lu x %lu \n", rank,
       (unsigned long)dims_out[0], (unsigned long)dims_out[1]);

Reading and writing a portion of a dataset

In the previous discussion, we described how to access an entire dataset with one write (or read) operation. HDF5 also supports access to portions (or selections) of a dataset in one read/write operation. Currently, selections are limited to hyperslabs and lists of independent points. Both types of selection are discussed in the following sections. Several sample cases of selection reading/writing are shown in the following figure.

<<< Insert dataspace figure here. (If you see this note, check the copy of this Introduction at http://hdf.ncsa.uiuc.edu/HDF5/H5.intro.html to see the figure.) >>>

In example (a) a single hyperslab is read from the midst of a 2-D array in a file and stored in the corner of a smaller 2-D array in memory. In (b) a regular series of blocks is read from a 2-D array in the file and stored as a contiguous sequence of values at a certain offset in a 1-D array in memory. In (c) a sequence of points with no regular pattern is read from a 2-D array in a file and stored as a sequence of points with no regular pattern in a 3-D array in memory.

As these examples illustrate, whenever we perform partial read/write operations on the data, the following information must be provided: file dataspace, file dataspace selection, memory dataspace, and memory dataspace selection. After the required information is specified, the actual read/write operation on the portion of data is done in a single call to the HDF5 read/write functions H5Dread or H5Dwrite.

Selecting hyperslabs

Hyperslabs are portions of datasets. A hyperslab selection can be a logically contiguous collection of points in a dataspace, or it can be a regular pattern of points or blocks in a dataspace. The following picture illustrates a selection of regularly spaced 3x2 blocks in an 8x12 dataspace.

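     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . . . . . . . . . . . .
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . . . . . . . . . . . .

(X marks a selected element; . marks an element that is not selected.)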

Four parameters are required to describe a completely general hyperslab. Each parameter is an array whose rank is the same as that of the dataspace:

In what order is data copied? When actual I/O is performed data values are copied by default from one dataspace to another in so-called row-major, or 'C' order. That is, it is assumed that the first dimension varies slowest, the second next slowest, and so forth.
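Row-major order can be stated precisely as a formula for the position of an element in linear storage. The sketch below is plain C with a hypothetical function name, row_major_index; it is an illustration of the ordering rule and not a library routine.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index of coord[] in an array with the given dims, row-major:
 * the first dimension varies slowest and the last varies fastest. */
size_t row_major_index(int rank, const size_t *dims, const size_t *coord)
{
    size_t index = 0;
    int    i;

    assert(rank > 0);
    for (i = 0; i < rank; i++) {
        assert(coord[i] < dims[i]);          /* coordinate must be in range */
        index = index * dims[i] + coord[i];  /* Horner-style accumulation */
    }
    return index;
}
```

In an 8x12 array, for example, element <1,2> lands at linear position 1*12 + 2 = 14, directly after the twelve elements of row 0 and the first two elements of row 1.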

Example without strides or blocks. Suppose we want to read a 3x4 hyperslab from a dataset in a file beginning at the element <1,2> in the dataset. In order to do this, we must create a dataspace that describes the overall rank and dimensions of the dataset in the file, as well as the position and size of the hyperslab that we are extracting from that dataset. The following code illustrates how this would be done.

/* 
 * Define hyperslab in the dataset. 
 */
offset[0] = 1;
offset[1] = 2;
count[0]  = 3;
count[1]  = 4;
status = H5Sselect_hyperslab(dataspace, H5S_SELECT_SET, offset, NULL, count, NULL);

This describes the dataspace from which we wish to read. We need to define the dataspace in memory analogously. Suppose, for instance, that we have in memory a three-dimensional 7x7x3 array into which we wish to read the 3x4 hyperslab described above, beginning at the element <3,0,0>. Since the in-memory dataspace has three dimensions, we have to describe the hyperslab as an array with three dimensions, with the last dimension being 1: <3,4,1>.

Notice that now we must describe two things: the dimensions of the in-memory array, and the size and position of the hyperslab that we wish to read in. The following code illustrates how this would be done.

/*
 * Define the memory dataspace.
 */
dimsm[0] = 7;
dimsm[1] = 7;
dimsm[2] = 3;
memspace = H5Screate_simple(RANK_OUT,dimsm,NULL);   

/* 
 * Define memory hyperslab. 
 */
offset_out[0] = 3;
offset_out[1] = 0;
offset_out[2] = 0;
count_out[0]  = 3;
count_out[1]  = 4;
count_out[2]  = 1;
status = H5Sselect_hyperslab(memspace, H5S_SELECT_SET, offset_out, NULL, count_out, NULL);

Example 2 contains a complete program that performs these operations.
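The effect of pairing the two selections can be imitated with ordinary arrays. The following plain C sketch is not HDF5 code; copy_block is a hypothetical helper, and the 5x6 size of the file array is an assumption made for illustration, since the text does not state the size of the file dataset. It copies a block from a 2-D array into plane 0 of a 7x7x3 array, element by element in row-major order.

```c
#include <assert.h>

/* Copy a rows x cols block that starts at (frow, fcol) in a 5 x 6
 * "file" array into plane 0 of a 7 x 7 x 3 "memory" array, starting
 * at (mrow, mcol). The 5 x 6 size is assumed for illustration. */
void copy_block(int file_data[5][6], int frow, int fcol,
                int mem[7][7][3], int mrow, int mcol,
                int rows, int cols)
{
    int i, j;

    assert(frow >= 0 && fcol >= 0 && frow + rows <= 5 && fcol + cols <= 6);
    assert(mrow >= 0 && mcol >= 0 && mrow + rows <= 7 && mcol + cols <= 7);
    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            mem[mrow + i][mcol + j][0] = file_data[frow + i][fcol + j];
}
```

Calling copy_block(file_data, 1, 2, mem, 3, 0, 3, 4) mirrors the selections described above: a 3x4 block at <1,2> in the file is placed at <3,0,0> in memory, with a count of 1 in the last dimension.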

Example with strides and blocks. Consider the 8x12 dataspace described above, in which we selected eight 3x2 blocks. Suppose we wish to fill these eight blocks:

     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . . . . . . . . . . . .
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . X X . X X . X X . X X
     . . . . . . . . . . . .

This hyperslab has the following parameters: start=(0,1), stride=(4,3), count=(2,4), block=(3,2).
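One way to read these four parameters is as a membership test applied to each dimension independently. The plain C sketch below uses hypothetical helper names (selected_1d and selected_2d) and assumes that blocks do not overlap, that is, that the stride is at least as large as the block; it is an illustration of the selection rule, not library code.

```c
#include <assert.h>

/* Return 1 if index i along one dimension is selected by the
 * (start, stride, count, block) quadruple, else 0. */
int selected_1d(unsigned long i, unsigned long start, unsigned long stride,
                unsigned long count, unsigned long block)
{
    unsigned long rel;

    assert(block > 0 && stride >= block);   /* blocks must not overlap */
    if (i < start)
        return 0;
    rel = i - start;
    if (rel / stride >= count)              /* past the last block */
        return 0;
    return rel % stride < block;            /* inside a block, not a gap */
}

/* A 2-D element is selected when both of its coordinates are.
 * Parameters from the text: start=(0,1), stride=(4,3),
 * count=(2,4), block=(3,2). */
int selected_2d(unsigned long row, unsigned long col)
{
    return selected_1d(row, 0, 4, 2, 3) && selected_1d(col, 1, 3, 4, 2);
}
```

Applied to every element of the 8x12 dataspace, selected_2d accepts rows 0-2 and 4-6 and columns 1-2, 4-5, 7-8, and 10-11, for a total of 48 elements in eight 3x2 blocks.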

Suppose that the source dataspace in memory is this 50-element one-dimensional array called vector:

     -1  1  2  3  4  5  6  7  ...  47  48  -1

The following code will write 48 elements from vector to our file dataset, starting with the second element in vector.

/* Select hyperslab for the dataset in the file, using 3x2 blocks, (4,3) stride
 * (2,4) count starting at the position (0,1).
 */
start[0]  = 0; start[1]  = 1;
stride[0] = 4; stride[1] = 3;
count[0]  = 2; count[1]  = 4;    
block[0]  = 3; block[1]  = 2;
ret = H5Sselect_hyperslab(fid, H5S_SELECT_SET, start, stride, count, block);

/*
 * Create dataspace for the first dataset.
 */
mid1 = H5Screate_simple(MSPACE1_RANK, dim1, NULL);

/*
 * Select hyperslab. 
 * We will use 48 elements of the vector buffer starting at the second element.
 * Selected elements are 1 2 3 . . . 48
 */
start[0]  = 1;
stride[0] = 1;
count[0]  = 48;
block[0]  = 1;
ret = H5Sselect_hyperslab(mid1, H5S_SELECT_SET, start, stride, count, block);
 
/*
 * Write selection from the vector buffer to the dataset in the file.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid1, fid, H5P_DEFAULT, vector);

 

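After these operations, the file dataspace will have the following values. (A . marks an element that was not written.)

      .  1  2  .  3  4  .  5  6  .  7  8
      .  9 10  . 11 12  . 13 14  . 15 16
      . 17 18  . 19 20  . 21 22  . 23 24
      .  .  .  .  .  .  .  .  .  .  .  .
      . 25 26  . 27 28  . 29 30  . 31 32
      . 33 34  . 35 36  . 37 38  . 39 40
      . 41 42  . 43 44  . 45 46  . 47 48
      .  .  .  .  .  .  .  .  .  .  .  .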

Notice that the values are inserted in the file dataset in row-major order.
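The row-major fill can be reproduced with ordinary arrays to check the placement of individual values. This plain C sketch is a simulation, not HDF5 code; scatter_row_major is a hypothetical name, and the selection pattern (start=(0,1), stride=(4,3), count=(2,4), block=(3,2)) is hard-coded for the 8x12 case.

```c
#include <assert.h>

#define ROWS 8
#define COLS 12

/* Copy values from src[] into the selected cells of grid, visiting the
 * grid in row-major order. Returns the number of values copied. */
int scatter_row_major(const int *src, int grid[ROWS][COLS])
{
    int r, c, n = 0;

    assert(src != 0);
    for (r = 0; r < ROWS; r++) {
        for (c = 0; c < COLS; c++) {
            int row_ok = (r % 4) < 3;                  /* rows 0-2 and 4-6 */
            int col_ok = c >= 1 && ((c - 1) % 3) < 2;  /* cols 1-2, 4-5, 7-8, 10-11 */

            if (row_ok && col_ok)
                grid[r][c] = src[n++];
        }
    }
    return n;
}
```

Passing a pointer to the second element of vector (the value 1) fills row 0 with 1 through 8, row 1 with 9 through 16, and so on, skipping rows 3 and 7.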

Example 7 includes this code and other example code illustrating the use of hyperslab selection.

Selecting a list of independent points

A hyperslab specifies a regular pattern of elements in a dataset. It is also possible to specify a list of independent elements to read or write using the function H5Sselect_elements. Suppose, for example, that we wish to write the values 53, 59, 61, 67 to the following elements of the 8x12 array used in the previous example: (0,0), (3,3), (3,5), and (5,6). The following code selects the points:

#define FSPACE_RANK      2    /* Dataset rank as it is stored in the file */
#define NPOINTS          4    /* Number of points that will be selected 
                                 and overwritten */ 
#define MSPACE2_RANK     1    /* Rank of the second dataset in memory */ 
#define MSPACE2_DIM      4    /* Dataset size in memory */ 

 
hsize_t dim2[] = {MSPACE2_DIM};       /* Dimension size of the second 
                                         dataset (in memory) */ 
int     values[] = {53, 59, 61, 67};  /* New values to be written */
hssize_t coord[NPOINTS][FSPACE_RANK]; /* Array to store selected points 
                                         from the file dataspace */ 

/*
 * Create dataspace for the second dataset.
 */
mid2 = H5Screate_simple(MSPACE2_RANK, dim2, NULL);

/*
 * Select sequence of NPOINTS points in the file dataspace.
 */
coord[0][0] = 0; coord[0][1] = 0;
coord[1][0] = 3; coord[1][1] = 3;
coord[2][0] = 3; coord[2][1] = 5;
coord[3][0] = 5; coord[3][1] = 6;

ret = H5Sselect_elements(fid, H5S_SELECT_SET, NPOINTS, 
                         (const hssize_t **)coord);

/*
 * Write new selection of points to the dataset.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid2, fid, H5P_DEFAULT, values);   

 

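After these operations, the file dataspace will have the following values. (A . marks an element that was not written.)

     53  1  2  .  3  4  .  5  6  .  7  8
      .  9 10  . 11 12  . 13 14  . 15 16
      . 17 18  . 19 20  . 21 22  . 23 24
      .  .  . 59  . 61  .  .  .  .  .  .
      . 25 26  . 27 28  . 29 30  . 31 32
      . 33 34  . 35 36 67 37 38  . 39 40
      . 41 42  . 43 44  . 45 46  . 47 48
      .  .  .  .  .  .  .  .  .  .  .  .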

Example 7 contains a complete program that performs these subsetting operations.
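For comparison with the hyperslab case, a point selection can be simulated with a plain coordinate list. The sketch below is ordinary C rather than HDF5 code, and scatter_points is a hypothetical helper: it stores each value at the matching coordinate, in the order the points appear in the list.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 8
#define COLS 12

/* Hypothetical helper: store values[i] at the i-th listed coordinate.
 * Points are visited in list order, not in row-major order. */
void scatter_points(int grid[ROWS][COLS], size_t npoints,
                    long coord[][2], const int *values)
{
    size_t i;

    for (i = 0; i < npoints; i++) {
        assert(coord[i][0] >= 0 && coord[i][0] < ROWS);
        assert(coord[i][1] >= 0 && coord[i][1] < COLS);
        grid[coord[i][0]][coord[i][1]] = values[i];
    }
}
```

With the coordinates (0,0), (3,3), (3,5), and (5,6) and the values 53, 59, 61, and 67, only those four cells of the grid change.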

Creating compound datatypes

Properties of compound datatypes. A compound datatype is similar to a struct in C or a common block in Fortran. It is a collection of one or more atomic types or small arrays of such types. To create and use a compound datatype, you need to refer to various properties of the compound datatype:

Properties of members of a compound data type are defined when the member is added to the compound type and cannot be subsequently modified.

Defining compound datatypes. Compound datatypes must be built out of other datatypes. First, one creates an empty compound data type and specifies its total size. Then members are added to the compound data type in any order.

Member names. Each member must have a descriptive name, which is the key used to uniquely identify the member within the compound data type. A member name in an HDF5 data type does not necessarily have to be the same as the name of the corresponding member in the C struct in memory, although this is often the case. Nor does one need to define all members of the C struct in the HDF5 compound data type (or vice versa).

Offsets. Usually a C struct will be defined to hold a data point in memory, and the offsets of the members in memory will be the offsets of the struct members from the beginning of an instance of the struct. The library defines a macro to compute the offset of a member within a struct:
  HOFFSET(s,m)

This macro computes the offset of member m within a struct variable s.

Here is an example in which a compound data type is created to describe complex numbers whose type is defined by the complex_t struct.

typedef struct {
   double re;   /*real part */
   double im;   /*imaginary part */
} complex_t;

complex_t tmp;  /*used only to compute offsets */
hid_t complex_id = H5Tcreate (H5T_COMPOUND, sizeof tmp);
H5Tinsert (complex_id, "real", HOFFSET(tmp,re),
           H5T_NATIVE_DOUBLE);
H5Tinsert (complex_id, "imaginary", HOFFSET(tmp,im),
           H5T_NATIVE_DOUBLE);
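Member offsets are also what make it possible to pick a single member out of a sequence of records. The sketch below shows the idea in plain C using the standard offsetof macro from <stddef.h>, which takes a type name where HOFFSET as described above takes a struct variable. The helper extract_double_member is hypothetical and is not how the library is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    double re;   /* real part */
    double im;   /* imaginary part */
} complex_t;

/* Hypothetical helper: copy one double member, located at byte offset
 * member_offset inside each record, from n records into out[]. */
void extract_double_member(const void *records, size_t record_size,
                           size_t member_offset, size_t n, double *out)
{
    const unsigned char *base = (const unsigned char *)records;
    size_t               i;

    assert(member_offset + sizeof(double) <= record_size);
    for (i = 0; i < n; i++)
        memcpy(&out[i], base + i * record_size + member_offset,
               sizeof(double));
}
```

For an array z of complex_t, the call extract_double_member(z, sizeof(complex_t), offsetof(complex_t, im), n, out) gathers only the imaginary parts and never touches the re members.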

Example 3 shows how to create a compound data type, write an array that has the compound data type to the file, and read back subsets of the members.

Creating and writing extendible datasets

An extendible dataset is one whose dimensions can grow. In HDF5, it is possible to define a dataset to have certain initial dimensions, then later to increase the size of any of the initial dimensions.

For example, you can create and store the following 3x3 HDF5 dataset:

     1 1 1
     1 1 1 
     1 1 1 

then later extend it into a 10x3 dataset by adding 7 rows, such as this:

     1 1 1 
     1 1 1 
     1 1 1 
     2 2 2
     2 2 2
     2 2 2
     2 2 2
     2 2 2
     2 2 2
     2 2 2

then further extend it to a 10x5 dataset by adding two columns, such as this:

     1 1 1 3 3 
     1 1 1 3 3 
     1 1 1 3 3 
     2 2 2 3 3
     2 2 2 3 3
     2 2 2 3 3
     2 2 2 3 3
     2 2 2 3 3
     2 2 2 3 3
     2 2 2 3 3

The current version of HDF5 requires you to use chunking in order to define extendible datasets. Chunking makes it possible to extend datasets efficiently, without having to reorganize storage excessively.
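To see what reorganizing storage means in practice, consider extending an array that is stored contiguously in row-major order. The plain C sketch below is an illustration only and says nothing about how the library stores data; grow_2d is a hypothetical helper. When columns are added, nearly every existing element ends up at a different linear offset, which is the kind of rearrangement that chunked storage is meant to avoid.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: return a new rows x cols row-major array that
 * keeps the old_rows x old_cols contents and sets new cells to fill.
 * The caller frees the result. Returns NULL on allocation failure. */
int *grow_2d(const int *old, size_t old_rows, size_t old_cols,
             size_t rows, size_t cols, int fill)
{
    size_t r, c;
    int   *out;

    assert(rows >= old_rows && cols >= old_cols && rows > 0 && cols > 0);
    out = malloc(rows * cols * sizeof *out);
    if (out == NULL)
        return NULL;
    for (r = 0; r < rows; r++)
        for (c = 0; c < cols; c++)
            out[r * cols + c] = (r < old_rows && c < old_cols)
                                    ? old[r * old_cols + c]  /* existing value */
                                    : fill;                  /* newly added cell */
    return out;
}
```

Growing the 3x3 array of 1s to 10x3 with fill value 2, and then to 10x5 with fill value 3, reproduces the three pictures above.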

Three operations are required in order to write an extendible dataset:

  1. Declare the dataspace of the dataset to have unlimited dimensions for all dimensions that might eventually be extended.
  2. When creating the dataset, set the storage layout for the dataset to chunked.
  3. Extend the size of the dataset.

For example, suppose we wish to create a dataset similar to the one shown above. We want to start with a 3x3 dataset, then later extend it in both directions.

Declaring unlimited dimensions. We could declare the dataspace to have unlimited dimensions with the following code, which uses the predefined constant H5S_UNLIMITED to specify unlimited dimensions.

hsize_t dims[2] = {3, 3};   /* dataset dimensions at creation time */
hsize_t maxdims[2] = {H5S_UNLIMITED, H5S_UNLIMITED};
/*
 * 1. Create the data space with unlimited dimensions. 
 */
dataspace = H5Screate_simple(RANK, dims, maxdims); 

Enabling chunking. We can then modify the dataset storage layout properties to enable chunking. We do this using the routine H5Pset_chunk:

hid_t cparms; 
hsize_t chunk_dims[2] ={2, 5};
/* 
* 2. Modify dataset creation properties to enable chunking.
*/
cparms = H5Pcreate (H5P_DATASET_CREATE);
status = H5Pset_chunk( cparms, RANK, chunk_dims);

Extending dataset size. Finally, when we want to extend the size of the dataset, we invoke H5Dextend to extend the size of the dataset. In the following example, we extend the dataset along the first dimension, by seven rows, so that the new dimensions are <10,3>:

/*
* Extend the dataset. Dataset becomes 10 x 3.
*/
dims[0] = dims[0] + 7;
size[0] = dims[0]; 
size[1] = dims[1]; 
status = H5Dextend (dataset, size);

 

Example 4 shows how to create a 3x3 extendible dataset, write the dataset, extend the dataset to 10x3, write the dataset again, extend it again to 10x5, and write the dataset again.

Example 5 shows how to read the data written by Example 4.

Working with groups in a file

Groups provide a mechanism for organizing datasets in an HDF5 file in meaningful ways. The H5G API contains routines for working with groups.

Creating a group. To create a group, use H5Gcreate. For example, the following code creates two groups that are members of the root group. They are called /IntData and /FloatData. The return value (dir) is the group identifier.

/*
* Create two groups in a file.
*/
dir = H5Gcreate(file, "/IntData", 0);
status = H5Gclose(dir);
dir = H5Gcreate(file,"/FloatData", 0);
status = H5Gclose(dir);

The third parameter in H5Gcreate optionally specifies how much file space to reserve to store the names that will appear in this group. If a non-positive value is supplied then a default size is chosen.

H5Gclose closes the group and releases the group identifier.

 

Creating an object in a particular group. Except for single-object HDF5 files, every object in an HDF5 file must belong to a group, and hence has a path name. Hence, we put an object in a particular group by giving its path name when we create it. For example, the following code creates a dataset IntArray in the group /IntData:

/*
 * Create dataset in the /IntData group by specifying full path.
 */
dims[0] = 2;
dims[1] = 3;
dataspace = H5Screate_simple(2, dims, NULL);
dataset = H5Dcreate(file, "/IntData/IntArray", H5T_NATIVE_INT, dataspace, H5P_DEFAULT); 

Changing the current group. The HDF5 Group API supports the idea of a current group. This is analogous to the current working directory idea in UNIX. You can set the current group in HDF5 with the routine H5Gset. The following code shows how to set a current group, then create a dataset (FloatArray) in that group.

/*
 * Set current group to /FloatData.
 */
status = H5Gset (file, "/FloatData");

/* 
 * Create a dataset in the current group.
 */

dims[0] = 5;
dims[1] = 10;
dataspace = H5Screate_simple(2, dims, NULL);
dataset = H5Dcreate(file, "FloatArray", H5T_NATIVE_FLOAT, dataspace, H5P_DEFAULT); 
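The relationship between a current group and a relative name follows the UNIX analogy and can be sketched in plain C. The helper below, resolve_name, is hypothetical and is not part of the H5G API; it only illustrates how a name such as FloatArray would correspond to a full path once the current group is /FloatData.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the full path of name relative to the
 * current group cwg. Names starting with '/' are already absolute.
 * Returns 0 on success, -1 if out is too small. */
int resolve_name(const char *cwg, const char *name,
                 char *out, size_t out_size)
{
    int n;

    assert(cwg != NULL && cwg[0] == '/' && name != NULL);
    if (name[0] == '/')
        n = snprintf(out, out_size, "%s", name);         /* absolute name */
    else if (strcmp(cwg, "/") == 0)
        n = snprintf(out, out_size, "/%s", name);        /* root group */
    else
        n = snprintf(out, out_size, "%s/%s", cwg, name); /* relative name */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

Under this model, the relative name FloatArray with current group /FloatData corresponds to /FloatData/FloatArray, while an absolute name such as /IntData/IntArray is unaffected by the current group.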

Example 6 shows how to create an HDF5 file with two groups, and to place some datasets within those groups.

Working with attributes

Think of an attribute as a small dataset that is attached to a normal dataset or group. The H5A API contains routines for working with attributes. Since attributes share many of the characteristics of datasets, the programming model for working with attributes is analogous in many ways to the model for working with datasets. The primary differences are that an attribute must be attached to a dataset or a group, and subsetting operations cannot be performed on attributes.

To create an attribute belonging to a particular dataset or group, first create a dataspace for the attribute (H5Screate), then create the attribute using H5Acreate. For example, the following code creates an attribute called Integer attribute that is attached to a dataset whose identifier is dataset. The attribute identifier is attr2. H5Awrite then sets the value of the attribute to that of the integer variable point. H5Aclose then releases the attribute identifier.

int point = 1;                         /* Value of the scalar attribute */ 

/*
 * Create scalar attribute.
 */
aid2  = H5Screate(H5S_SCALAR);
attr2 = H5Acreate(dataset, "Integer attribute", H5T_NATIVE_INT, aid2,
                  H5P_DEFAULT);

/*
 * Write scalar attribute.
 */
ret = H5Awrite(attr2, H5T_NATIVE_INT, &point); 

/*
 * Close attribute dataspace.
 */
ret = H5Sclose(aid2); 

/*
 * Close attribute.
 */
ret = H5Aclose(attr2); 

 

To read a scalar attribute whose name and datatype are known, first open the attribute using H5Aopen_name, then use H5Aread to get its value. For example, the following code reads a scalar attribute called Integer attribute whose datatype is a native integer, and whose parent dataset has the identifier dataset.

/*
 * Attach to the scalar attribute using attribute name, then read and 
 * display its value.
 */
attr = H5Aopen_name(dataset,"Integer attribute");
ret  = H5Aread(attr, H5T_NATIVE_INT, &point_out);
printf("The value of the attribute \"Integer attribute\" is %d \n", point_out); 
ret =  H5Aclose(attr);

Reading an attribute whose characteristics are not known. It may be necessary to query a file to obtain information about an attribute, namely its name, datatype, rank and dimensions. The following code opens an attribute by its index value using H5Aopen_idx, then reads its value as a 4-character string.

/*
 * Attach to the string attribute using its index, then read and display the value.
 */
attr =  H5Aopen_idx(dataset, 2);
atype = H5Tcopy(H5T_C_S1);
        H5Tset_size(atype, 4);
ret   = H5Aread(attr, atype, string_out);
printf("The value of the attribute with the index 2 is %s \n", string_out);

In practice, if the characteristics of attributes are not known, the code involved in accessing and processing the attribute can be quite complex. For this reason, HDF5 includes a function called H5Aiterate, which applies a user-supplied function to each of a set of attributes. The user-supplied function can contain the code that interprets, accesses and processes each attribute.

Example 8 illustrates the use of the H5Aiterate function, as well as the other attribute examples described above.
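The idea of applying a user-supplied function to each member of a set can be shown without the library. The sketch below is plain C and only illustrates the callback pattern; toy_attr_t, toy_op_t, toy_iterate and sum_values are hypothetical names, and the convention that a nonzero return value stops the loop is an assumption of this sketch, not a description of H5Aiterate itself.

```c
#include <stddef.h>

/* Hypothetical stand-in for an attribute: just a name and an int value. */
typedef struct {
    const char *name;
    int         value;
} toy_attr_t;

/*
 * Operator type for this sketch. The opdata pointer is passed through
 * untouched so the caller can collect results. A nonzero return value
 * stops the iteration early (an assumption of this sketch).
 */
typedef int (*toy_op_t)(const toy_attr_t *attr, void *opdata);

/* Apply op to each of the n attributes in turn. */
int toy_iterate(const toy_attr_t *attrs, size_t n, toy_op_t op, void *opdata)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret = op(&attrs[i], opdata);
        if (ret != 0)
            return ret;      /* operator asked to stop */
    }
    return 0;
}

/* Example operator: add each attribute value to the int behind opdata. */
int sum_values(const toy_attr_t *attr, void *opdata)
{
    *(int *)opdata += attr->value;
    return 0;
}
```

Calling toy_iterate with sum_values and a pointer to an int that starts at zero leaves the sum of all values in that int. All per-attribute logic lives in the operator, which is the same division of labor the text describes for H5Aiterate.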

Example code

Example 1. How to create a homogeneous multi-dimensional dataset and write it to a file.

This example creates a 2-dimensional HDF5 dataset of little endian 32-bit integers.


/*  
 *  This example writes data to the HDF5 file.
 *  Data conversion is performed during write operation.  
 */
 
#include <hdf5.h>

#define FILE        "SDS.h5"
#define DATASETNAME "IntArray" 
#define NX     5                      /* dataset dimensions */
#define NY     6
#define RANK   2

main ()
{
   hid_t       file, dataset;         /* file and dataset identifiers */
   hid_t       datatype, dataspace;   /* identifiers */
   hsize_t     dimsf[2];              /* dataset dimensions */
   herr_t      status;                             
   int         data[NX][NY];          /* data to write */
   int         i, j;

/* 
 * Data  and output buffer initialization. 
 */

for (j = 0; j < NX; j++) {
    for (i = 0; i < NY; i++)
        data[j][i] = i + j;
}     
                                       /*  0 1 2 3 4 5 
                                           1 2 3 4 5 6
                                           2 3 4 5 6 7
                                           3 4 5 6 7 8
                                           4 5 6 7 8 9   */

/*
 * Create a new file using H5F_ACC_TRUNC access,
 * default file creation properties, and default file
 * access properties.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/*
 * Describe the size of the array and create the data space for fixed
 * size dataset. 
 */
dimsf[0] = NX;
dimsf[1] = NY;
dataspace = H5Screate_simple(RANK, dimsf, NULL); 

/* 
 * Define datatype for the data in the file.
 * We will store little endian INT numbers.
 */
datatype = H5Tcopy(H5T_NATIVE_INT);
status = H5Tset_order(datatype, H5T_ORDER_LE);
/*
 * Create a new dataset within the file using defined dataspace and
 * datatype and default dataset creation properties.
 */
dataset = H5Dcreate(file, DATASETNAME, datatype, dataspace,
                    H5P_DEFAULT);

/*
 * Write the data to the dataset using default transfer properties.
 */
status = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                  H5P_DEFAULT, data);

/*
 * Close/release resources.
 */
H5Sclose(dataspace);
H5Tclose(datatype);
H5Dclose(dataset);
H5Fclose(file);
 
}     

 

Example 2. How to read a hyperslab from file into memory.

This example reads a hyperslab from a 2-d HDF5 dataset into a 3-d dataset in memory.

/*  
 *   This example reads a hyperslab from the SDS.h5 file 
 *   created by the h5_write.c program into a two-dimensional
 *   plane of a three-dimensional array. 
 *   Information about the dataset in the SDS.h5 file is obtained. 
 */
 
#include "hdf5.h"

#define FILE        "SDS.h5"
#define DATASETNAME "IntArray" 
#define NX_SUB  3           /* hyperslab dimensions */ 
#define NY_SUB  4 
#define NX 7           /* output buffer dimensions */ 
#define NY 7 
#define NZ  3 
#define RANK         2
#define RANK_OUT     3

main ()
{
   hid_t       file, dataset;         /* identifiers */
   hid_t       datatype, dataspace;   
   hid_t       memspace; 
   H5T_class_t class;                 /* data type class */
   H5T_order_t order;                 /* data order */
   size_t      size;                  /* size of the data element
                                         stored in file */ 
   hsize_t     dimsm[3];              /* memory space dimensions */
   hsize_t     dims_out[2];           /* dataset dimensions */      
   herr_t      status;                             

   int         data_out[NX][NY][NZ ]; /* output buffer */
   
   hsize_t      count[2];              /* size of the hyperslab in the file */
   hsize_t      offset[2];             /* hyperslab offset in the file */
   hsize_t      count_out[3];          /* size of the hyperslab in memory */
   hsize_t      offset_out[3];         /* hyperslab offset in memory */
   int          i, j, k, status_n, rank;

for (j = 0; j < NX; j++) {
    for (i = 0; i < NY; i++) {
        for (k = 0; k < NZ ; k++)
            data_out[j][i][k] = 0;
    }
} 
 
/*
 * Open the file and the dataset.
 */
file = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
dataset = H5Dopen(file, DATASETNAME);

/*
 * Get datatype and dataspace identifiers and then query
 * dataset class, order, size, rank and dimensions.
 */

datatype  = H5Dget_type(dataset);     /* datatype identifier */ 
class     = H5Tget_class(datatype);
if (class == H5T_INTEGER) printf("Data set has INTEGER type \n");
order     = H5Tget_order(datatype);
if (order == H5T_ORDER_LE) printf("Little endian order \n");

size  = H5Tget_size(datatype);
printf(" Data size is %d \n", (int)size);

dataspace = H5Dget_space(dataset);    /* dataspace identifier */
rank      = H5Sextent_ndims(dataspace);
status_n  = H5Sextent_dims(dataspace, dims_out, NULL);
printf("rank %d, dimensions %lu x %lu \n", rank,
       (unsigned long)dims_out[0], (unsigned long)dims_out[1]);

/* 
 * Define hyperslab in the dataset. 
 */
offset[0] = 1;
offset[1] = 2;
count[0]  = NX_SUB;
count[1]  = NY_SUB;
status = H5Sselect_hyperslab(dataspace, H5S_SELECT_SET, offset, NULL, 
                             count, NULL);

/*
 * Define the memory dataspace.
 */
dimsm[0] = NX;
dimsm[1] = NY;
dimsm[2] = NZ ;
memspace = H5Screate_simple(RANK_OUT,dimsm,NULL);   

/* 
 * Define memory hyperslab. 
 */
offset_out[0] = 3;
offset_out[1] = 0;
offset_out[2] = 0;
count_out[0]  = NX_SUB;
count_out[1]  = NY_SUB;
count_out[2]  = 1;
status = H5Sselect_hyperslab(memspace, H5S_SELECT_SET, offset_out, NULL, 
                             count_out, NULL);

/*
 * Read data from hyperslab in the file into the hyperslab in 
 * memory and display.
 */
status = H5Dread(dataset, H5T_NATIVE_INT, memspace, dataspace,
                 H5P_DEFAULT, data_out);
for (j = 0; j < NX; j++) {
    for (i = 0; i < NY; i++) printf("%d ", data_out[j][i][0]);
    printf("\n");
}
                                         /*  0 0 0 0 0 0 0
                                             0 0 0 0 0 0 0
                                             0 0 0 0 0 0 0
                                             3 4 5 6 0 0 0  
                                             4 5 6 7 0 0 0
                                             5 6 7 8 0 0 0
                                             0 0 0 0 0 0 0 */

/*
 * Close/release resources.
 */
H5Tclose(datatype);
H5Dclose(dataset);
H5Sclose(dataspace);
H5Sclose(memspace);
H5Fclose(file);

}     
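The element mapping performed by the two selections in Example 2 can be written out as ordinary loops. The sketch below is plain C with no HDF5 calls; copy_hyperslab is a hypothetical helper and the array sizes are fixed to match Example 2. It assumes simple hyperslabs with no stride or block arguments, as in the example.

```c
#define FX 5    /* file dataset dimensions, as written by Example 1 */
#define FY 6
#define MX 7    /* memory buffer dimensions, as in Example 2 */
#define MY 7
#define MZ 3

/*
 * Hypothetical helper: copy a count[0] x count[1] block that starts at
 * (off[0], off[1]) in the 2-D source into the 3-D destination, starting
 * at (moff[0], moff[1]) within the plane moff[2].
 * The caller must make sure both blocks lie inside their arrays.
 */
void copy_hyperslab(int src[FX][FY], const int off[2], const int count[2],
                    int dst[MX][MY][MZ], const int moff[3])
{
    int r, c;

    for (r = 0; r < count[0]; r++)
        for (c = 0; c < count[1]; c++)
            dst[moff[0] + r][moff[1] + c][moff[2]] =
                src[off[0] + r][off[1] + c];
}
```

With the source filled as in Example 1 (element [j][i] holds i + j), offset (1,2), count (3,4) and memory offset (3,0,0), rows 3 to 5 of plane 0 receive 3 4 5 6, 4 5 6 7 and 5 6 7 8, which agrees with the output shown in the comment at the end of Example 2.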

 

Example 3. Working with compound datatypes.

This example shows how to create a compound data type, write an array which has the compound data type to the file, and read back subsets of fields.

/*
 * This example shows how to create a compound data type,
 * write an array which has the compound data type to the file,
 * and read back fields' subsets.
 */

#include "hdf5.h"

#define FILE          "SDScompound.h5"
#define DATASETNAME   "ArrayOfStructures"
#define LENGTH        10
#define RANK          1

main()

{

   
/* First structure  and dataset*/
typedef struct s1_t {
    int    a;
    float  b;
    double c; 
} s1_t;
s1_t       s1[LENGTH];
hid_t      s1_tid;     /* File datatype identifier */

/* Second structure (subset of s1_t)  and dataset*/
typedef struct s2_t {
    double c;
    int    a;
} s2_t;
s2_t       s2[LENGTH];
hid_t      s2_tid;    /* Memory datatype identifier */

/* Third "structure" ( will be used to read float field of s1) */
hid_t      s3_tid;   /* Memory datatype identifier */
float      s3[LENGTH];

int        i;
hid_t      file, datatype, dataset, space; /* Identifiers */
herr_t     status;
hsize_t    dim[] = {LENGTH};   /* Dataspace dimensions */


/*
 * Initialize the data
 */
   for (i = 0; i< LENGTH; i++) {
        s1[i].a = i;
        s1[i].b = i*i;
        s1[i].c = 1./(i+1);
}

/*
 * Create the data space.
 */
space = H5Screate_simple(RANK, dim, NULL);

/*
 * Create the file.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/*
 * Create the memory data type. 
 */
s1_tid = H5Tcreate (H5T_COMPOUND, sizeof(s1_t));
H5Tinsert(s1_tid, "a_name", HOFFSET(s1_t, a), H5T_NATIVE_INT);
H5Tinsert(s1_tid, "c_name", HOFFSET(s1_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s1_tid, "b_name", HOFFSET(s1_t, b), H5T_NATIVE_FLOAT);

/* 
 * Create the dataset.
 */
dataset = H5Dcreate(file, DATASETNAME, s1_tid, space, H5P_DEFAULT);

/*
 * Write data to the dataset.
 */
status = H5Dwrite(dataset, s1_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s1);

/*
 * Release resources
 */
H5Tclose(s1_tid);
H5Sclose(space);
H5Dclose(dataset);
H5Fclose(file);
 
/*
 * Open the file and the dataset.
 */
file = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
 
dataset = H5Dopen(file, DATASETNAME);

/* 
 * Create a data type for s2
 */
s2_tid = H5Tcreate(H5T_COMPOUND, sizeof(s2_t));

H5Tinsert(s2_tid, "c_name", HOFFSET(s2_t, c), H5T_NATIVE_DOUBLE);
H5Tinsert(s2_tid, "a_name", HOFFSET(s2_t, a), H5T_NATIVE_INT);

/*
 * Read two fields c and a from s1 dataset. Fields in the file
 * are found by their names "c_name" and "a_name".
 */
status = H5Dread(dataset, s2_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s2);

/*
 * Display the fields
 */
printf("\n");
printf("Field c : \n");
for( i = 0; i < LENGTH; i++) printf("%.4f ", s2[i].c);
printf("\n");

printf("\n");
printf("Field a : \n");
for( i = 0; i < LENGTH; i++) printf("%d ", s2[i].a);
printf("\n");

/* 
 * Create a data type for s3.
 */
s3_tid = H5Tcreate(H5T_COMPOUND, sizeof(float));

status = H5Tinsert(s3_tid, "b_name", 0, H5T_NATIVE_FLOAT);

/*
 * Read field b from s1 dataset. Field in the file is found by its name.
 */
status = H5Dread(dataset, s3_tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, s3);

/*
 * Display the field
 */
printf("\n");
printf("Field b : \n");
for( i = 0; i < LENGTH; i++) printf("%.4f ", s3[i]);
printf("\n");

/*
 * Release resources
 */
H5Tclose(s2_tid);
H5Tclose(s3_tid);
H5Dclose(dataset);
H5Fclose(file);
}
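Example 3 relies on fields being matched by name, so that a memory type with fewer fields, or fields in a different order, can still be filled from the file. The sketch below shows that matching step in plain C for a single record. It is an illustration only: field_t and copy_fields_by_name are hypothetical, no datatype conversion is attempted, and nothing here is meant to describe how the library implements the transfer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical field descriptor: name, byte offset, and size in bytes. */
typedef struct {
    const char *name;
    size_t      offset;
    size_t      size;
} field_t;

/*
 * Copy the fields listed in dst_fields from one source record to one
 * destination record, matching fields by name. No type conversion is
 * done: a field is copied only if its size is the same on both sides.
 * Returns the number of fields copied.
 */
size_t copy_fields_by_name(const void *src, const field_t *src_fields,
                           size_t n_src,
                           void *dst, const field_t *dst_fields,
                           size_t n_dst)
{
    size_t i, j, copied = 0;

    for (i = 0; i < n_dst; i++) {
        for (j = 0; j < n_src; j++) {
            if (strcmp(dst_fields[i].name, src_fields[j].name) == 0 &&
                dst_fields[i].size == src_fields[j].size) {
                memcpy((char *)dst + dst_fields[i].offset,
                       (const char *)src + src_fields[j].offset,
                       dst_fields[i].size);
                copied++;
                break;
            }
        }
    }
    return copied;
}
```

The descriptor tables play the role that the H5Tinsert calls play in Example 3: each entry pairs a field name with an offset obtained from offsetof (HOFFSET in the example). Describing s2_t with only c_name and a_name then copies those two fields and skips b_name.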

 

Example 4. Creating and writing an extendible dataset.

This example shows how to create a 3x3 extendible dataset, to extend the dataset to 10x3, then to extend it again to 10x5.

/*  
 *   This example shows how to work with extendible dataset.
 *   In the current version of the library dataset MUST be
 *   chunked.
 *   
 */
 
#include "hdf5.h"

#define FILE        "SDSextendible.h5"
#define DATASETNAME "ExtendibleArray" 
#define RANK         2
#define NX     10
#define NY     5 

main ()
{
   hid_t       file;                          /* identifiers */
   hid_t       datatype, dataspace, dataset;  
   hid_t       filespace;                   
   hid_t       cparms;                     
   hsize_t      dims[2]  = { 3, 3};            /* dataset dimensions
                                                 at the creation time  */ 
   hsize_t      dims1[2] = { 3, 3};            /* data1 dimensions */ 
   hsize_t      dims2[2] = { 7, 1};            /* data2 dimensions */  
   hsize_t      dims3[2] = { 2, 2};            /* data3 dimensions */ 

   hsize_t      maxdims[2] = {H5S_UNLIMITED, H5S_UNLIMITED};
   hsize_t      chunk_dims[2] ={2, 5};
   hsize_t      size[2];
   hssize_t     offset[2];

   herr_t      status;                             

   int         data1[3][3] = { 1, 1, 1,       /* data to write */
                               1, 1, 1,
                               1, 1, 1 };      

   int         data2[7]    = { 2, 2, 2, 2, 2, 2, 2};

   int         data3[2][2] = { 3, 3,
                               3, 3};

/*
 * Create the data space with unlimited dimensions. 
 */
dataspace = H5Screate_simple(RANK, dims, maxdims); 

/*
 * Create a new file. If file exists its contents will be overwritten.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/* 
 * Modify dataset creation properties, i.e. enable chunking.
 */
cparms = H5Pcreate (H5P_DATASET_CREATE);
status = H5Pset_chunk( cparms, RANK, chunk_dims);

/*
 * Create a new dataset within the file using cparms
 * creation properties.
 */
dataset = H5Dcreate(file, DATASETNAME, H5T_NATIVE_INT, dataspace,
                 cparms);

/*
 * Extend the dataset. This call assures that dataset is at least 3 x 3.
 */
size[0]   = 3; 
size[1]   = 3; 
status = H5Dextend (dataset, size);

/*
 * Select a hyperslab.
 */
filespace = H5Dget_space (dataset);
offset[0] = 0;
offset[1] = 0;
status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL,
                             dims1, NULL);  

/*
 * Write the data to the hyperslab.
 */
status = H5Dwrite(dataset, H5T_NATIVE_INT, dataspace, filespace,
                  H5P_DEFAULT, data1);

/*
 * Extend the dataset. Dataset becomes 10 x 3.
 */
dims[0]   = dims1[0] + dims2[0];
size[0]   = dims[0];  
size[1]   = dims[1]; 
status = H5Dextend (dataset, size);

/*
 * Select a hyperslab.
 */
filespace = H5Dget_space (dataset);
offset[0] = 3;
offset[1] = 0;
status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL,
                              dims2, NULL);  

/*
 * Define memory space
 */
dataspace = H5Screate_simple(RANK, dims2, NULL); 

/*
 * Write the data to the hyperslab.
 */
status = H5Dwrite(dataset, H5T_NATIVE_INT, dataspace, filespace,
                  H5P_DEFAULT, data2);

/*
 * Extend the dataset. Dataset becomes 10 x 5.
 */
dims[1]   = dims1[1] + dims3[1];
size[0]   = dims[0];  
size[1]   = dims[1]; 
status = H5Dextend (dataset, size);

/*
 * Select a hyperslab
 */
filespace = H5Dget_space (dataset);
offset[0] = 0;
offset[1] = 3;
status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, 
                             dims3, NULL);  

/*
 * Define memory space.
 */
dataspace = H5Screate_simple(RANK, dims3, NULL); 

/*
 * Write the data to the hyperslab.
 */
status = H5Dwrite(dataset, H5T_NATIVE_INT, dataspace, filespace,
                  H5P_DEFAULT, data3);

/*
 * Resulting dataset
 *                 
         1 1 1 3 3
         1 1 1 3 3
         1 1 1 0 0
         2 0 0 0 0
         2 0 0 0 0
         2 0 0 0 0
         2 0 0 0 0
         2 0 0 0 0
         2 0 0 0 0
         2 0 0 0 0
 */ 
/*
 * Close/release resources.
 */
H5Dclose(dataset);
H5Sclose(dataspace);
H5Sclose(filespace);
H5Fclose(file);

}     

 

Example 5. Reading data.

This example shows how to read data from the chunked dataset written by Example 4.

/*  
 *   This example shows how to read data from a chunked dataset.
 *   We will read from the file created by h5_extend_write.c 
 */
 
#include "hdf5.h"

#define FILE        "SDSextendible.h5"
#define DATASETNAME "ExtendibleArray" 
#define RANK         2
#define RANKC        1
#define NX           10
#define NY           5 

main ()
{
   hid_t       file;                        /* identifiers */
   hid_t       datatype, dataset;  
   hid_t       filespace;                   
   hid_t       memspace;                  
   hid_t       cparms;                   
   H5T_class_t class;                       /* data type class */
   size_t      elem_size;                   /* size of the data element
                                               stored in file */ 
   hsize_t     dims[2];                     /* dataset and chunk dimensions */ 
   hsize_t     chunk_dims[2];
   hsize_t     col_dims[1];
   size_t      size[2];
   hsize_t     count[2];
   hsize_t     offset[2];

   herr_t      status, status_n;                             

   int         data_out[NX][NY];  /* buffer for dataset to be read */
   int         chunk_out[2][5];   /* buffer for chunk to be read */
   int         column[10];        /* buffer for column to be read */
   int         i, j, rank, rank_chunk;

 
/*
 * Open the file and the dataset.
 */
file = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);
dataset = H5Dopen(file, DATASETNAME);
 
/*
 * Get dataset rank and dimension.
 */
 
filespace = H5Dget_space(dataset);    /* Get filespace identifier first. */
rank      = H5Sextent_ndims(filespace);
status_n  = H5Sextent_dims(filespace, dims, NULL);
printf("dataset rank %d, dimensions %lu x %lu \n", rank,
       (unsigned long)dims[0], (unsigned long)dims[1]);

/*
 * Get creation properties list.
 */
cparms = H5Dget_create_plist(dataset); /* Get properties identifier first. */

/* 
 * Check if dataset is chunked.
 */
 if (H5D_CHUNKED == H5Pget_layout(cparms))  {

/*
 * Get chunking information: rank and dimensions
 */
rank_chunk = H5Pget_chunk(cparms, 2, chunk_dims);
printf("chunk rank %d, dimensions %lu x %lu \n", rank_chunk,
        (unsigned long)chunk_dims[0], (unsigned long)chunk_dims[1]);
} 
 
/*
 * Define the memory space to read dataset.
 */
memspace = H5Screate_simple(RANK,dims,NULL);
 
/*
 * Read dataset back and display.
 */
status = H5Dread(dataset, H5T_NATIVE_INT, memspace, filespace,
                 H5P_DEFAULT, data_out);
    printf("\n");
    printf("Dataset: \n");
for (j = 0; j < dims[0]; j++) {
    for (i = 0; i < dims[1]; i++) printf("%d ", data_out[j][i]);
    printf("\n");
}     

/*
            dataset rank 2, dimensions 10 x 5 
            chunk rank 2, dimensions 2 x 5 

            Dataset:
            1 1 1 3 3 
            1 1 1 3 3 
            1 1 1 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
            2 0 0 0 0 
*/

/*
 * Read the third column from the dataset.
 * First define memory dataspace, then define hyperslab
 * and read it into column array.
 */
col_dims[0] = 10;
memspace =  H5Screate_simple(RANKC, col_dims, NULL);

/*
 * Define the column (hyperslab) to read.
 */
offset[0] = 0;
offset[1] = 2;
count[0]  = 10;
count[1]  = 1;
status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL,
                             count, NULL);
status = H5Dread(dataset, H5T_NATIVE_INT, memspace, filespace,
                 H5P_DEFAULT, column);
printf("\n");
printf("Third column: \n");
for (i = 0; i < 10; i++) {
     printf("%d \n", column[i]);
}

/*

            Third column: 
            1 
            1 
            1 
            0 
            0 
            0 
            0 
            0 
            0 
            0 
*/

/*
 * Define the memory space to read a chunk.
 */
memspace = H5Screate_simple(rank_chunk,chunk_dims,NULL);

/*
 * Define chunk in the file (hyperslab) to read.
 */
offset[0] = 2;
offset[1] = 0;
count[0]  = chunk_dims[0];
count[1]  = chunk_dims[1];
status = H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, 
                              count, NULL);

/*
 * Read chunk back and display.
 */
status = H5Dread(dataset, H5T_NATIVE_INT, memspace, filespace,
                 H5P_DEFAULT, chunk_out);
    printf("\n");
    printf("Chunk: \n");
for (j = 0; j < chunk_dims[0]; j++) {
    for (i = 0; i < chunk_dims[1]; i++) printf("%d ", chunk_out[j][i]);
    printf("\n");
}     
/*
         Chunk: 
         1 1 1 0 0 
         2 0 0 0 0 
*/

/*
 * Close/release resources.
 */
H5Pclose(cparms);
H5Dclose(dataset);
H5Sclose(filespace);
H5Sclose(memspace);
H5Fclose(file);

}     
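The relationship between the dataset dimensions and the chunk dimensions reported by Example 5 is simple integer arithmetic. The sketch below is plain C; chunks_along and chunk_index are hypothetical helpers, and the row-major numbering of chunks is an assumption made for illustration. It says nothing about how the library actually locates chunks in the file.

```c
/*
 * Number of chunks needed to cover `extent` elements along one
 * dimension when each chunk holds `chunk` elements (ceiling division).
 */
unsigned long chunks_along(unsigned long extent, unsigned long chunk)
{
    return (extent + chunk - 1) / chunk;
}

/*
 * Row-major number of the chunk that contains element (row, col) of a
 * 2-D dataset with the given dimensions and chunk dimensions.
 * The numbering scheme is an assumption of this sketch.
 */
unsigned long chunk_index(unsigned long row, unsigned long col,
                          const unsigned long dims[2],
                          const unsigned long chunk_dims[2])
{
    unsigned long chunks_per_row = chunks_along(dims[1], chunk_dims[1]);

    return (row / chunk_dims[0]) * chunks_per_row + (col / chunk_dims[1]);
}
```

For the 10 x 5 dataset with 2 x 5 chunks, five chunks cover the rows and one covers the columns. The hyperslab read at offset (2,0) in Example 5 is exactly the chunk numbered 1 under this scheme, which is why it can be read into a 2 x 5 buffer.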

 

Example 6. Creating groups.

This example shows how to create an HDF5 file with two groups, and to place some datasets within those groups.

/*
 * This example shows how to create groups within the file and    
 * datasets within the file and groups.
 */ 


#include "hdf5.h"


#define FILE    "DIR.h5"
#define RANK    2

main()
{

   hid_t    file, dir;
   hid_t    dataset, dataspace;

   herr_t   status;
   hsize_t  dims[2];
   hsize_t  size[1];

/*
 * Create a file.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/*
 * Create two groups in a file.
 */
dir = H5Gcreate(file, "/IntData", 0);
status = H5Gclose(dir);

dir = H5Gcreate(file,"/FloatData", 0);
status = H5Gclose(dir);

/* 
 * Create dataspace for the character string
 */
size[0] = 80;
dataspace = H5Screate_simple(1, size, NULL);

/*
 * Create dataset "String" in the root group.  
 */
dataset = H5Dcreate(file, "String", H5T_NATIVE_CHAR, dataspace, H5P_DEFAULT);
H5Dclose(dataset);

/*
 * Create dataset "String" in the /IntData group.  
 */
dataset = H5Dcreate(file, "/IntData/String", H5T_NATIVE_CHAR, dataspace,
                    H5P_DEFAULT);
H5Dclose(dataset);

/*
 * Create dataset "String" in the /FloatData group.  
 */
dataset = H5Dcreate(file, "/FloatData/String", H5T_NATIVE_CHAR, dataspace,
                    H5P_DEFAULT);
H5Sclose(dataspace);
H5Dclose(dataset);

/*
 * Create IntArray dataset in the /IntData group by specifying full path.
 */
dims[0] = 2;
dims[1] = 3;
dataspace = H5Screate_simple(RANK, dims, NULL);
dataset = H5Dcreate(file, "/IntData/IntArray", H5T_NATIVE_INT, dataspace,
                    H5P_DEFAULT); 
H5Sclose(dataspace);
H5Dclose(dataset);

/*
 * Set current group to /IntData and attach to the dataset String.
 */

status = H5Gset (file, "/IntData");
dataset = H5Dopen(file, "String");
if (dataset > 0) printf("String dataset in /IntData group is found\n"); 
H5Dclose(dataset);

/*
 * Set current group to /FloatData.
 */
status = H5Gset (file, "/FloatData");

/* 
 * Create two datasets FloatArray and DoubleArray.
 */

dims[0] = 5;
dims[1] = 10;
dataspace = H5Screate_simple(RANK, dims, NULL);
dataset = H5Dcreate(file, "FloatArray", H5T_NATIVE_FLOAT, dataspace, H5P_DEFAULT); 
H5Sclose(dataspace);
H5Dclose(dataset);

dims[0] = 4;
dims[1] = 6;
dataspace = H5Screate_simple(RANK, dims, NULL);
dataset = H5Dcreate(file, "DoubleArray", H5T_NATIVE_DOUBLE, dataspace,
                    H5P_DEFAULT); 
H5Sclose(dataspace);
H5Dclose(dataset);

/* 
 * Attach to /FloatData/String dataset.
 */

dataset = H5Dopen(file, "/FloatData/String");
if (dataset > 0) printf("/FloatData/String dataset is found\n"); 
H5Dclose(dataset);
H5Fclose(file);

}

Example 7. Writing selected data from memory to a file.

This example shows how to use the selection capabilities of HDF5 to write selected data to a file. It includes the examples discussed in the text.

/* 
 *  This program shows how the H5Sselect_hyperslab and H5Sselect_elements
 *  functions are used to write selected data from memory to the file.
 *  Program takes 48 elements from the linear buffer and writes them into
 *  the matrix using 3x2 blocks, (4,3) stride and (2,4) count. 
 *  Then four elements  of the matrix are overwritten with the new values and 
 *  file is closed. Program reopens the file and reads and displays the result.
 */ 
 
#include <hdf5.h>

#define FILE "Select.h5"

#define MSPACE1_RANK     1          /* Rank of the first dataset in memory */
#define MSPACE1_DIM      50         /* Dataset size in memory */ 

#define MSPACE2_RANK     1          /* Rank of the second dataset in memory */ 
#define MSPACE2_DIM      4          /* Dataset size in memory */ 

#define FSPACE_RANK      2          /* Dataset rank as it is stored in the file */
#define FSPACE_DIM1      8          /* Dimension sizes of the dataset as it is
                                       stored in the file */
#define FSPACE_DIM2      12 

                                    /* We will read dataset back from the file
                                       to the dataset in memory with these
                                       dataspace parameters. */  
#define MSPACE_RANK      2
#define MSPACE_DIM1      8 
#define MSPACE_DIM2      12 

#define NPOINTS          4          /* Number of points that will be selected 
                                       and overwritten */ 
main ()
{

   hid_t   file, dataset;           /* File and dataset identifiers */
   hid_t   mid1, mid2, fid, mid;    /* Dataspace identifiers */
   hsize_t dim1[] = {MSPACE1_DIM};  /* Dimension size of the first dataset 
                                       (in memory) */ 
   hsize_t dim2[] = {MSPACE2_DIM};  /* Dimension size of the second dataset
                                       (in memory) */ 
   hsize_t fdim[] = {FSPACE_DIM1, FSPACE_DIM2}; 
                                    /* Dimension sizes of the dataset (on disk) */
   hsize_t mdim[] = {MSPACE_DIM1, MSPACE_DIM2}; 
                                    /* Dimension sizes when we 
                                                   read data back */
   hssize_t start[2]; /* Start of hyperslab */
   hsize_t stride[2]; /* Stride of hyperslab */
   hsize_t count[2];  /* Block count */
   hsize_t block[2];  /* Block sizes */

   hssize_t coord[NPOINTS][FSPACE_RANK]; /* Array to store selected points 
                                            from the file dataspace */ 
   herr_t  ret;
   uint    i,j;
   int     matrix[MSPACE_DIM1][MSPACE_DIM2];
   int     vector[MSPACE1_DIM];
   int     values[] = {53, 59, 61, 67};  /* New values to be written */
/*
 * Buffers' initialization.
 */
vector[0] = vector[MSPACE1_DIM - 1] = -1;
for (i = 1; i < MSPACE1_DIM - 1; i++) vector[i] = i;

for (i = 0; i < MSPACE_DIM1; i++) {
    for (j = 0; j < MSPACE_DIM2; j++)
        matrix[i][j] = 0;
}
/*
 * Create a file.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/* 
 * Create dataspace for the dataset in the file.
 */
fid = H5Screate_simple(FSPACE_RANK, fdim, NULL);

/*
 * Create dataset and write it into the file.
 */
dataset = H5Dcreate(file, "Matrix in file", H5T_NATIVE_INT, fid, H5P_DEFAULT);
ret = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, matrix);

/*
 * Select hyperslab for the dataset in the file, using 3x2 blocks, (4,3) stride
 * (2,4) count starting at the position (0,1).
 */
start[0]  = 0; start[1]  = 1;
stride[0] = 4; stride[1] = 3;
count[0]  = 2; count[1]  = 4;    
block[0]  = 3; block[1]  = 2;
ret = H5Sselect_hyperslab(fid, H5S_SELECT_SET, start, stride, count, block);

/*
 * Create dataspace for the first dataset.
 */
mid1 = H5Screate_simple(MSPACE1_RANK, dim1, NULL);

/*
 * Select hyperslab. 
 * We will use 48 elements of the vector buffer starting at the second element.
 * Selected elements are 1 2 3 . . . 48
 */
start[0]  = 1;
stride[0] = 1;
count[0]  = 48;
block[0]  = 1;
ret = H5Sselect_hyperslab(mid1, H5S_SELECT_SET, start, stride, count, block);
 
/*
 * Write selection from the vector buffer to the dataset in the file.
 *
 * File dataset should look like this:       
 *                    0  1  2  0  3  4  0  5  6  0  7  8 
 *                    0  9 10  0 11 12  0 13 14  0 15 16
 *                    0 17 18  0 19 20  0 21 22  0 23 24
 *                    0  0  0  0  0  0  0  0  0  0  0  0
 *                    0 25 26  0 27 28  0 29 30  0 31 32
 *                    0 33 34  0 35 36  0 37 38  0 39 40
 *                    0 41 42  0 43 44  0 45 46  0 47 48
 *                    0  0  0  0  0  0  0  0  0  0  0  0
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid1, fid, H5P_DEFAULT, vector);

/*
 * Reset the selection for the file dataspace fid.
 */
ret = H5Sselect_none(fid);

/*
 * Create dataspace for the second dataset.
 */
mid2 = H5Screate_simple(MSPACE2_RANK, dim2, NULL);

/*
 * Select sequence of NPOINTS points in the file dataspace.
 */
coord[0][0] = 0; coord[0][1] = 0;
coord[1][0] = 3; coord[1][1] = 3;
coord[2][0] = 3; coord[2][1] = 5;
coord[3][0] = 5; coord[3][1] = 6;

ret = H5Sselect_elements(fid, H5S_SELECT_SET, NPOINTS, 
                         (const hssize_t **)coord);

/*
 * Write new selection of points to the dataset.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, mid2, fid, H5P_DEFAULT, values);   

/*
 * File dataset should look like this:     
 *                   53  1  2  0  3  4  0  5  6  0  7  8 
 *                    0  9 10  0 11 12  0 13 14  0 15 16
 *                    0 17 18  0 19 20  0 21 22  0 23 24
 *                    0  0  0 59  0 61  0  0  0  0  0  0
 *                    0 25 26  0 27 28  0 29 30  0 31 32
 *                    0 33 34  0 35 36 67 37 38  0 39 40
 *                    0 41 42  0 43 44  0 45 46  0 47 48
 *                    0  0  0  0  0  0  0  0  0  0  0  0
 *                                        
 */
   
/*
 * Close file and memory dataspaces.
 */
ret = H5Sclose(mid1); 
ret = H5Sclose(mid2); 
ret = H5Sclose(fid); 
 
/*
 * Close dataset.
 */
ret = H5Dclose(dataset);

/*
 * Close the file.
 */
ret = H5Fclose(file);
/*
 * Open the file.
 */
file = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);

/*
 * Open the dataset.
 */
dataset = H5Dopen(file,"Matrix in file");

/*
 * Read data back to the buffer matrix.
 */
ret = H5Dread(dataset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                  H5P_DEFAULT, matrix);

/*
 * Display the result.
 */
for (i=0; i < MSPACE_DIM1; i++) {
    for(j=0; j < MSPACE_DIM2; j++) printf("%3d  ", matrix[i][j]);
    printf("\n");
}

}
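The start, stride, count and block parameters used in Example 7 can be checked by hand with a small amount of index arithmetic. The sketch below is plain C with no HDF5 calls; scatter_hyperslab is a hypothetical helper with the matrix size fixed to that of Example 7. It assumes that blocks do not overlap (block is no larger than stride in each dimension) and that selected elements are filled in row-major order, which is the order implied by the expected matrix in the example's comments.

```c
#define ROWS 8
#define COLS 12

/*
 * Hypothetical helper: walk the matrix in row-major order and store
 * successive values from src into every element that belongs to the
 * (start, stride, count, block) selection.
 *
 * Along each dimension, an index x is selected when d = x - start
 * satisfies d >= 0, d / stride < count, and d % stride < block.
 * Returns the number of elements written.
 */
int scatter_hyperslab(int m[ROWS][COLS], const int *src,
                      const int start[2], const int stride[2],
                      const int count[2], const int block[2])
{
    int r, c, n = 0;

    for (r = 0; r < ROWS; r++) {
        int dr = r - start[0];
        if (dr < 0 || dr / stride[0] >= count[0] || dr % stride[0] >= block[0])
            continue;                       /* row not selected */
        for (c = 0; c < COLS; c++) {
            int dc = c - start[1];
            if (dc < 0 || dc / stride[1] >= count[1] ||
                dc % stride[1] >= block[1])
                continue;                   /* column not selected */
            m[r][c] = src[n++];
        }
    }
    return n;
}
```

With start (0,1), stride (4,3), count (2,4) and block (3,2), the helper selects 48 elements: rows 0 to 2 and 4 to 6, and columns 1-2, 4-5, 7-8 and 10-11. Filling them from the values 1 to 48 reproduces the first matrix shown in the comments of Example 7.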

Example 8. Writing and reading attributes.

This example shows how to create HDF5 attributes, to attach them to a dataset, and to read through all of the attributes of a dataset.

/* 
 *  This program illustrates the usage of the H5A Interface functions.
 *  It creates and writes a dataset, and then creates and writes array,
 *  scalar, and string attributes of the dataset. 
 *  The program then reopens the file, attaches to the scalar attribute by
 *  name, and reads and displays its value. Next, the index of the third
 *  attribute is used to read and display the value of the string attribute.
 *  Finally, the H5Aiterate function is used to iterate through the dataset
 *  attributes and display their names. The operator function also reads and
 *  displays the values of the array attribute. 
 */ 
 
#include <stdio.h>    /* printf, puts */
#include <stdlib.h>   /* malloc, free */
#include <hdf5.h>

#define FILE "Attributes.h5"

#define RANK  1   /* Rank and size of the dataset  */ 
#define SIZE  7

#define ARANK  2   /* Rank and dimension sizes of the first dataset attribute */
#define ADIM1  2
#define ADIM2  3 
#define ANAME  "Float attribute"      /* Name of the array attribute */
#define ANAMES "Character attribute" /* Name of the string attribute */

herr_t attr_info(hid_t loc_id, const char *name, void *opdata); 
                                     /* Operator function */

int main (void)
{

   hid_t   file, dataset;       /* File and dataset identifiers */
   
   hid_t   fid;                 /* Dataspace identifier */
   hid_t   attr1, attr2, attr3; /* Attribute identifiers */
   hid_t   attr;
   hid_t   aid1, aid2, aid3;    /* Attribute dataspace identifiers */ 
   hid_t   atype;               /* Attribute type */

   hsize_t fdim[] = {SIZE};
   hsize_t adim[] = {ADIM1, ADIM2};  /* Dimensions of the first attribute  */
   
   float matrix[ADIM1][ADIM2]; /* Attribute data */ 

   herr_t  ret;                /* Return value */
   unsigned i,j;               /* Counters */
   int     idx;                /* Attribute index */
   char    string_out[80] = ""; /* Zero-filled buffer to read string attribute back,
                                   so the 4-byte value is always terminated */
   int     point_out;          /* Buffer to read scalar attribute back */

/*
 * Data initialization.
 */
int vector[] = {1, 2, 3, 4, 5, 6, 7};  /* Dataset data */
int point = 1;                         /* Value of the scalar attribute */ 
char string[] = "ABCD";                /* Value of the string attribute */

   
for (i=0; i < ADIM1; i++) {            /* Values of the array attribute */
    for (j=0; j < ADIM2; j++)
        matrix[i][j] = -1.;
}

/*
 * Create a file.
 */
file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

/* 
 * Create the dataspace for the dataset in the file.
 */
fid = H5Screate(H5S_SIMPLE);
ret = H5Sset_extent_simple(fid, RANK, fdim, NULL);

/*
 * Create the dataset in the file.
 */
dataset = H5Dcreate(file, "Dataset", H5T_NATIVE_INT, fid, H5P_DEFAULT);

/*
 * Write data to the dataset.
 */
ret = H5Dwrite(dataset, H5T_NATIVE_INT, H5S_ALL , H5S_ALL, H5P_DEFAULT, vector);

/*
 * Create dataspace for the first attribute. 
 */
aid1 = H5Screate(H5S_SIMPLE);
ret  = H5Sset_extent_simple(aid1, ARANK, adim, NULL);

/*
 * Create array attribute.
 */
attr1 = H5Acreate(dataset, ANAME, H5T_NATIVE_FLOAT, aid1, H5P_DEFAULT);

/*
 * Write array attribute.
 */
ret = H5Awrite(attr1, H5T_NATIVE_FLOAT, matrix);

/*
 * Create scalar attribute.
 */
aid2  = H5Screate(H5S_SCALAR);
attr2 = H5Acreate(dataset, "Integer attribute", H5T_NATIVE_INT, aid2,
                  H5P_DEFAULT);

/*
 * Write scalar attribute.
 */
ret = H5Awrite(attr2, H5T_NATIVE_INT, &point); 

/*
 * Create string attribute.
 */
aid3  = H5Screate(H5S_SCALAR);
atype = H5Tcopy(H5T_C_S1);
        H5Tset_size(atype, 4);
attr3 = H5Acreate(dataset, ANAMES, atype, aid3, H5P_DEFAULT);

/*
 * Write string attribute.
 */
ret = H5Awrite(attr3, atype, string); 

/*
 * Close attribute and file dataspaces.
 */
ret = H5Sclose(aid1); 
ret = H5Sclose(aid2); 
ret = H5Sclose(aid3); 
ret = H5Sclose(fid); 

/*
 * Close the attributes and the string datatype.
 */ 
ret = H5Aclose(attr1);
ret = H5Aclose(attr2);
ret = H5Aclose(attr3);
ret = H5Tclose(atype);
 
/*
 * Close the dataset.
 */
ret = H5Dclose(dataset);

/*
 * Close the file.
 */
ret = H5Fclose(file);

/*
 * Reopen the file.
 */
file = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT);

/*
 * Open the dataset.
 */
dataset = H5Dopen(file,"Dataset");

/*
 * Attach to the scalar attribute using attribute name, then read and 
 * display its value.
 */
attr = H5Aopen_name(dataset,"Integer attribute");
ret  = H5Aread(attr, H5T_NATIVE_INT, &point_out);
printf("The value of the attribute \"Integer attribute\" is %d \n", point_out); 
ret =  H5Aclose(attr);

/*
 * Attach to the string attribute using its index, then read and display the value.
 */
attr = H5Aopen_idx(dataset, 2);
atype = H5Tcopy(H5T_C_S1);
        H5Tset_size(atype, 4);
ret   = H5Aread(attr, atype, string_out);
printf("The value of the attribute with the index 2 is %s \n", string_out);
ret   = H5Aclose(attr);
ret   = H5Tclose(atype);

/*
 * Get attribute info using iteration function. 
 */
idx = H5Aiterate(dataset, NULL, attr_info, NULL);

/*
 * Close the dataset and the file.
 */
H5Dclose(dataset);
H5Fclose(file);
return 0;  
}

/*
 * Operator function.
 */
herr_t attr_info(hid_t loc_id, const char *name, void *opdata)
{
    hid_t attr, atype, aspace;  /* Attribute, datatype and dataspace identifiers */
    int   rank;
    hsize_t sdim[64]; 
    herr_t ret;
    int i;
    size_t npoints;             /* Number of elements in the array attribute. */ 
    float *float_array;         /* Pointer to the array attribute. */
/*
 * Open the attribute using its name.
 */    
    attr = H5Aopen_name(loc_id, name);

/*
 * Display attribute name.
 */
    printf("\n");
    printf("Name : ");
    puts(name);

/* 
 * Get attribute datatype, dataspace, rank, and dimensions.
 */
    atype = H5Aget_type(attr);
    aspace = H5Aget_space(attr);
    rank = H5Sextent_ndims(aspace);
    ret = H5Sextent_dims(aspace, sdim, NULL);
/*
 *  Display rank and dimension sizes for the array attribute.
 */

    if (rank > 0) {
        printf("Rank : %d \n", rank); 
        printf("Dimension sizes : ");
        for (i = 0; i < rank; i++) printf("%d ", (int)sdim[i]);
        printf("\n");
    }

/*
 * Read array attribute and display its type and values.
 */

    if (H5T_FLOAT == H5Tget_class(atype)) {
        printf("Type : FLOAT \n"); 
        npoints = H5Sextent_npoints(aspace);
        float_array = (float *)malloc(sizeof(float)*(int)npoints); 
        ret = H5Aread(attr, atype, float_array);
        printf("Values : ");
        for (i = 0; i < (int)npoints; i++) printf("%f ", float_array[i]); 
        printf("\n");
        free(float_array);
    }

 
/*
 * Release all identifiers.
 */
    H5Tclose(atype);
    H5Sclose(aspace);
    H5Aclose(attr);
    return 0;
}
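
The operator-function pattern used with H5Aiterate in Example 8 can be modeled without the library. The following is a minimal sketch in plain C, assuming a fixed in-memory list of attribute names. The names name_op_t, iterate_names, and count_op are hypothetical, and the sketch does not reproduce how HDF5 implements iteration. It shows only the control flow: the iterator calls the operator once per name, passes the caller's opdata pointer through unchanged, and, in this sketch, stops early when the operator returns a nonzero value.

```c
#include <assert.h>
#include <stddef.h>

/* Operator signature modeled on attr_info: a name plus caller data. */
typedef int (*name_op_t)(const char *name, void *opdata);

/*
 * Call op once for each of the count names, starting at *idx when idx
 * is not NULL.  Stops at the first nonzero return and hands that value
 * back; returns 0 when every remaining name was visited.
 */
int iterate_names(const char *const names[], size_t count, size_t *idx,
                  name_op_t op, void *opdata)
{
    size_t k = (idx != NULL) ? *idx : 0;
    int    status = 0;

    assert(names != NULL && op != NULL);

    for (; k < count; k++) {
        status = op(names[k], opdata);
        if (status != 0) {
            k++;    /* Resume after the name that stopped the iteration */
            break;
        }
    }
    if (idx != NULL)
        *idx = k;   /* Report where the iteration ended */
    return status;
}

/* Example operator: counts the names it is given. */
int count_op(const char *name, void *opdata)
{
    (void)name;               /* The name itself is not needed here */
    (*(size_t *)opdata)++;
    return 0;                 /* Zero means "keep going" in this sketch */
}
```

With the three attribute names from Example 8 and a counter passed as opdata, iterate_names calls count_op three times and leaves the counter at 3. Passing state through opdata in this way avoids global variables in the operator.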

 


HDF Help Desk
Last modified: 11 September 1998