Introduction to HDF5 
HDF5 Reference Manual 
Other HDF5 documents and links 
And in this document, the HDF5 User's Guide:    
Files   Datasets   Datatypes   Dataspaces   Groups  
References   Attributes   Property Lists   Error Handling  
Filters   Palettes   Caching   Chunking   Mounting Files  
Performance   Debugging   Environment   DDL  
Ragged Arrays  

The Group Interface (H5G)

1. Introduction

An object in HDF5 consists of an object header at a fixed file address. The header contains messages describing various properties of the object, such as its storage location, layout, and compression, and some of these messages point to other data, such as the raw data of a dataset. The address of the object header is also known as an OID, and HDF5 has facilities for translating names to OIDs.

Every HDF5 object has at least one name and a set of names can be stored together in a group. Each group implements a name space where the names are any length and unique with respect to other names in the group.

Since a group is a type of HDF5 object it has an object header and a name which exists as a member of some other group. In this way, groups can be linked together to form a directed graph. One particular group is called the Root Group and is the group to which the HDF5 file super block points. Its name is "/" by convention. The full name of an object is created by joining component names with slashes much like Unix.

Figure: Group Graph Example

However, unlike Unix which arranges directories hierarchically, HDF5 arranges groups in a directed graph. Therefore, there is no ".." entry in a group since a group can have more than one parent. There is no "." entry either but the library understands it internally.

2. Names

HDF5 places few restrictions on names: component names may be any length except zero and may contain any character except slash ("/") and the null terminator. A full name may be composed of any number of component names separated by slashes, with any of the component names being the special name ".". A name which begins with a slash is an absolute name which is looked up beginning at the root group of the file while all other relative names are looked up beginning at the specified group. Multiple consecutive slashes in a full name are treated as single slashes and trailing slashes are not significant. A special case is the name "/" (or equivalent) which refers to the root group.
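
The naming rules above can be modeled with a small stand-alone helper. This is only a sketch of the rules as stated in this section, not code from the HDF5 library: the function name normalize_name is hypothetical, and dropping "." components is an assumption based on "." referring to the current group.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the HDF5 API).
 * Writes a canonical form of `name` into `out`:
 *   - runs of slashes collapse to a single slash,
 *   - trailing slashes are dropped,
 *   - "." components are dropped (assumed to mean "this group").
 * Returns 0 on success, -1 if `name` is empty or `out` is too small. */
int normalize_name(const char *name, char *out, size_t outsize)
{
    size_t n = 0;
    const char *p = name;

    if (name[0] == '\0' || outsize == 0)
        return -1;

    if (name[0] == '/') {              /* absolute name: starts at the root */
        if (outsize < 2)
            return -1;
        out[n++] = '/';
    }

    while (*p != '\0') {
        const char *start;
        size_t len;

        while (*p == '/')              /* skip a run of slashes */
            p++;
        start = p;
        while (*p != '\0' && *p != '/')
            p++;
        len = (size_t)(p - start);

        if (len == 0)                  /* only trailing slashes were left */
            break;
        if (len == 1 && start[0] == '.')
            continue;                  /* "." adds nothing to the path */

        if (n > 0 && out[n - 1] != '/') {
            if (n + 1 >= outsize)
                return -1;
            out[n++] = '/';            /* separator between components */
        }
        if (n + len >= outsize)
            return -1;
        memcpy(out + n, start, len);
        n += len;
    }

    if (n == 0)                        /* relative name made only of "." */
        out[n++] = '.';
    out[n] = '\0';
    return 0;
}
```

With this model, "//foo///bar/" and "/foo/bar" normalize to the same string, and "///" normalizes to "/".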

Functions which operate on names generally take a location identifier which is either a file ID or a group ID and perform the lookup with respect to that location. Some possibilities are:

Location Type   Object Name   Description
File ID         /foo/bar      The object bar in group foo in the root group.
Group ID        /foo/bar      The object bar in group foo in the root group of the file containing the specified group. In other words, the group ID's only purpose is to supply a file.
File ID         /             The root group of the specified file.
Group ID        /             The root group of the file containing the specified group.
File ID         foo/bar       The object bar in group foo in the root group of the specified file.
Group ID        foo/bar       The object bar in group foo in the specified group.
File ID         .             The root group of the file.
Group ID        .             The specified group.
Other ID        .             The specified object.
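
The file ID and group ID rows of the table reduce to a single decision: where does the lookup start? The following sketch encodes that decision. The type and function names are hypothetical and exist only for illustration; they are not HDF5 identifiers.

```c
/* Hypothetical types for illustration; not HDF5 identifiers. */
typedef enum { LOC_FILE, LOC_GROUP } loc_kind;
typedef enum { START_ROOT, START_GROUP } lookup_start;

/* Decide where a name lookup begins, following the table above:
 * absolute names always start at the root group of the file, and
 * relative names start at the root group for a file ID or at the
 * specified group for a group ID. */
lookup_start where_lookup_starts(loc_kind kind, const char *name)
{
    if (name[0] == '/')
        return START_ROOT;
    return kind == LOC_FILE ? START_ROOT : START_GROUP;
}
```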

Note, however, that object names within a group must be unique. For example, H5Dcreate returns an error if a dataset with the specified name already exists at the specified location.

3. Creating, Opening, and Closing Groups

Groups are created with the H5Gcreate() function, and existing groups can be accessed with H5Gopen(). Both functions return an object ID which should eventually be released by calling H5Gclose().

hid_t H5Gcreate (hid_t location_id, const char *name, size_t size_hint)
This function creates a new group with the specified name at the specified location which is either a file ID or a group ID. The name must not already be taken by some other object and all parent groups must already exist. The size_hint is a hint for the number of bytes to reserve to store the names which will be eventually added to the new group. Passing a value of zero for size_hint is usually adequate since the library is able to dynamically resize the name heap, but a correct hint may result in better performance. The return value is a handle for the open group and it should be closed by calling H5Gclose() when it's no longer needed. A negative value is returned for failure.

hid_t H5Gopen (hid_t location_id, const char *name)
This function opens an existing group with the specified name at the specified location which is either a file ID or a group ID and returns an object ID. The object ID should be released by calling H5Gclose() when it is no longer needed. A negative value is returned for failure.

herr_t H5Gclose (hid_t group_id)
This function releases resources used by a group which was opened by H5Gcreate() or H5Gopen(). After closing a group the group_id should not be used again. This function returns zero for success or a negative value for failure.
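
Because H5Gcreate() requires that all parent groups already exist, a program that wants a deeply nested group has to create each ancestor first, shallowest to deepest. The sketch below shows one way to enumerate those ancestors. It deliberately avoids calling HDF5 so that it stands alone: for_each_prefix, prefix_fn, prefix_log, and record_prefix are hypothetical names, and the callback marks the place where a real program would call H5Gcreate() (or H5Gopen() if the group already exists).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: called once per path prefix. */
typedef int (*prefix_fn)(const char *prefix, void *ctx);

/* Calls visit() for every ancestor of `path`, shallowest first, and
 * finally for `path` itself. The root group "/" is never visited
 * because it always exists. Returns 0 on success, -1 on failure. */
int for_each_prefix(const char *path, prefix_fn visit, void *ctx)
{
    char buf[1024];
    size_t len = strlen(path);
    size_t i;

    if (len == 0 || len >= sizeof buf)
        return -1;
    for (i = 1; i <= len; i++) {
        /* A prefix ends just before a slash or at the end of the path. */
        if (path[i] == '/' || path[i] == '\0') {
            if (path[i - 1] == '/')
                continue;              /* empty component, nothing new */
            memcpy(buf, path, i);
            buf[i] = '\0';
            if (visit(buf, ctx) < 0)
                return -1;
        }
    }
    return 0;
}

/* Demo callback: records each prefix on its own line. A real program
 * would create or open the group named by `prefix` here instead. */
typedef struct { char text[256]; } prefix_log;

int record_prefix(const char *prefix, void *ctx)
{
    prefix_log *log = (prefix_log *)ctx;
    size_t used = strlen(log->text);
    size_t plen = strlen(prefix);

    if (used + plen + 1 >= sizeof log->text)
        return -1;
    memcpy(log->text + used, prefix, plen);
    log->text[used + plen] = '\n';
    log->text[used + plen + 1] = '\0';
    return 0;
}
```

For the path /a/b/c the callback sees /a, then /a/b, then /a/b/c, which is the order in which the groups would have to be created.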

4. Objects with Multiple Names

An object (including a group) can have more than one name. Creating the object gives it the first name, and then functions described here can be used to give it additional names. The association between a name and the object is called a link and HDF5 supports two types of links: a hard link is a direct association between the name and the object where both exist in a single HDF5 address space, and a soft link is an indirect association.

Figure: Hard Link Example

Figure: Soft Link Example

Object Creation
The creation of an object creates a hard link which is indistinguishable from other hard links that might be added later.

herr_t H5Glink (hid_t file_id, H5G_link_t link_type, const char *current_name, const char *new_name)
Creates a new name for an object that has some current name (possibly one of many names it currently has). If the link_type is H5G_LINK_HARD then a new hard link is created. Otherwise, if link_type is H5G_LINK_SOFT, a soft link is created which is an alias for the current_name. When creating a soft link the object need not exist. This function returns zero for success or negative for failure.

herr_t H5Gunlink (hid_t file_id, const char *name)
This function removes an association between a name and an object. Object headers keep track of how many hard links refer to the object and when the hard link count reaches zero the object can be removed from the file (but objects which are open are not removed until all handles to the object are closed).
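
The bookkeeping described for H5Gunlink can be illustrated with a toy model. This is not HDF5 library code; the structure and function names are hypothetical, and the model tracks only the two counters the text mentions: hard links and open handles.

```c
/* Toy model of the counters kept for one object; not HDF5 code. */
typedef struct {
    int hard_links;    /* number of hard links that name the object */
    int open_handles;  /* number of open IDs that refer to the object */
    int removed;       /* set to 1 once the object may be removed */
} object_model;

/* An object can be removed only when nothing names it and nothing
 * holds it open. */
static void maybe_remove(object_model *obj)
{
    if (obj->hard_links == 0 && obj->open_handles == 0)
        obj->removed = 1;
}

/* Adding a hard link (object creation adds the first one). */
void model_link(object_model *obj)
{
    obj->hard_links++;
}

/* Removing a hard link; returns -1 if there is no link to remove. */
int model_unlink(object_model *obj)
{
    if (obj->hard_links <= 0)
        return -1;
    obj->hard_links--;
    maybe_remove(obj);
    return 0;
}

void model_open(object_model *obj)
{
    obj->open_handles++;
}

/* Closing a handle; returns -1 if the object is not open. */
int model_close(object_model *obj)
{
    if (obj->open_handles <= 0)
        return -1;
    obj->open_handles--;
    maybe_remove(obj);
    return 0;
}
```

In this model, unlinking the last hard link of an object that is still open leaves removed at 0. The flag is set only when the last handle is closed.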

5. Comments

Objects can have a comment associated with them. The comment is set and queried with these two functions:

herr_t H5Gset_comment (hid_t loc_id, const char *name, const char *comment)
The previous comment (if any) for the specified object is replaced with a new comment. If the comment argument is the empty string or a null pointer then the comment message is removed from the object. Comments should be relatively short, null-terminated, ASCII strings.

herr_t H5Gget_comment (hid_t loc_id, const char *name, size_t bufsize, char *comment)
The comment string for an object is returned through the comment buffer. At most bufsize characters including a null terminator are copied, and the result is not null terminated if the comment is longer than the supplied buffer. If an object doesn't have a comment then the empty string is returned.
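
Since the result may not be null terminated when the comment does not fit, callers usually terminate the buffer themselves. The sketch below models the copy behavior described above with a plain loop and shows the defensive pattern. Both function names are hypothetical. model_get_comment is a stand-in for the copy step only and is an assumption about its behavior, not the library's implementation.

```c
#include <stddef.h>

/* Stand-in for the copy step described above (not HDF5 code):
 * copies at most `bufsize` characters, including the terminator.
 * If `stored` does not fit, `buf` is left without a terminator. */
void model_get_comment(const char *stored, size_t bufsize, char *buf)
{
    size_t i;

    for (i = 0; i < bufsize; i++) {
        buf[i] = stored[i];
        if (stored[i] == '\0')
            break;
    }
}

/* Caller-side pattern: always terminate the buffer afterwards, at
 * the cost of losing one more character of a truncated comment. */
void get_comment_terminated(const char *stored, size_t bufsize, char *buf)
{
    if (bufsize == 0)
        return;
    model_get_comment(stored, bufsize, buf);
    buf[bufsize - 1] = '\0';
}
```

With a real file, the same pattern would apply to the buffer passed to H5Gget_comment: write a null byte into the last position of the buffer after the call returns.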

6. Unlinking Datasets with H5Gmove and H5Gunlink

Exercise caution in the use of H5Gmove and H5Gunlink.

Note that H5Gmove and H5Gunlink each include a step that unlinks pointers to a dataset or group. If the link that is removed is on the only path leading to a dataset or group, that dataset or group will become inaccessible in the file.

Consider the following example. Assume that the group group2 can only be accessed via the following path, where top_group is a member of the file's root group:

              /top_group/group1/group2/ 
Using H5Gmove, top_group is renamed to be a member of group2. At this point, since top_group was the only route from the root group to group1, there is no longer a path by which one can access group1, group2, or any member datasets. top_group and any member datasets have also become inaccessible.
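
The example can be checked with a small reachability model. The sketch below treats groups as nodes of a directed graph and links as edges, then tests which groups can still be reached from the root after the move. It is an illustration only: the graph type and the functions are hypothetical and do not call HDF5.

```c
#include <string.h>

#define MAX_NODES 8

/* Node numbers for the groups in the example above. */
enum { ROOT, TOP_GROUP, GROUP1, GROUP2, NUM_GROUPS };

/* edge[a][b] != 0 means group a contains a link to group b. */
typedef struct {
    int n;
    int edge[MAX_NODES][MAX_NODES];
} link_graph;

/* Builds the chain /top_group/group1/group2. */
link_graph example_graph(void)
{
    link_graph g;

    memset(&g, 0, sizeof g);
    g.n = NUM_GROUPS;
    g.edge[ROOT][TOP_GROUP] = 1;
    g.edge[TOP_GROUP][GROUP1] = 1;
    g.edge[GROUP1][GROUP2] = 1;
    return g;
}

/* Depth-first search that marks every node reachable from `node`. */
static void mark_reachable(const link_graph *g, int node, int *seen)
{
    int next;

    seen[node] = 1;
    for (next = 0; next < g->n; next++)
        if (g->edge[node][next] && !seen[next])
            mark_reachable(g, next, seen);
}

/* Returns 1 if `to` can be reached from `from` by following links. */
int reachable(const link_graph *g, int from, int to)
{
    int seen[MAX_NODES] = {0};

    mark_reachable(g, from, seen);
    return seen[to];
}

/* Model of a move: unlink `child` from one group, link it in another. */
void move_link(link_graph *g, int old_parent, int new_parent, int child)
{
    g->edge[old_parent][child] = 0;
    g->edge[new_parent][child] = 1;
}
```

Calling move_link(&g, ROOT, GROUP2, TOP_GROUP) on the example graph leaves the three groups linked to each other in a cycle, but none of them is reachable from ROOT any longer.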

HDF Help Desk
Last modified: 1 November 2000
Describes HDF5 Release 1.4 Beta, December 2000