Groups

1. Introduction

An object in HDF5 consists of an object header at a fixed file address. The header contains messages describing various properties of the object, such as its storage location, layout, and compression, and some of these messages point to other data such as the raw data of a dataset. The address of the object header is also known as an OID, and HDF5 has facilities for translating names to OIDs.

Every HDF5 object has at least one name, and a set of names can be stored together in a group. Each group implements a name space in which names may be any length and must be unique with respect to the other names in the group.

Since a group is a type of HDF5 object it has an object header and a name which exists as a member of some other group. In this way, groups can be linked together to form a directed graph. One particular group is called the Root Group and is the group to which the HDF5 file boot block points. Its name is "/" by convention. The full name of an object is created by joining component names with slashes much like Unix.

Group Graph Example

However, unlike Unix which arranges directories hierarchically, HDF5 arranges groups in a directed graph. Therefore, there is no ".." entry in a group since a group can have more than one parent. There is no "." entry either but the library understands it internally.

2. Names

HDF5 places few restrictions on names: component names may be any length except zero and may contain any character except slash ("/") and the null terminator. A full name may be composed of any number of component names separated by slashes, and any of the component names may be the special name ".". A name which begins with a slash is an absolute name and is looked up beginning at the root group of the file. All other names are relative names and are looked up beginning at the current working group (described below) or at a specified group. Multiple consecutive slashes in a full name are treated as a single slash, and trailing slashes are not significant. A special case is the name "/" (or equivalent), which refers to the root group.
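
The naming rules above can be sketched as a small canonicalization routine. This is only an illustration: normalize_name() is a hypothetical helper, not an HDF5 function, and dropping "." components during normalization is an assumption based on "." referring to the group already being searched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of HDF5): write a canonical form of `name`
 * into `out`. Runs of slashes collapse to one, trailing slashes are dropped,
 * and "." components are skipped (an assumption, see the text above).
 * Returns 0 on success, -1 if `name` is empty or `out` is too small. */
int normalize_name(const char *name, char *out, size_t outsize)
{
    assert(name != NULL && out != NULL);
    if (name[0] == '\0' || outsize < 2)
        return -1;

    size_t len = 0;
    if (name[0] == '/')
        out[len++] = '/';               /* absolute name: keep one leading slash */

    const char *p = name;
    while (*p != '\0') {
        while (*p == '/')
            p++;                        /* skip a run of slashes */
        const char *start = p;
        while (*p != '\0' && *p != '/')
            p++;                        /* scan one component */
        size_t n = (size_t)(p - start);
        if (n == 0)
            break;                      /* only trailing slashes were left */
        if (n == 1 && start[0] == '.')
            continue;                   /* "." adds nothing to the path */

        size_t need_sep = (len > 0 && out[len - 1] != '/') ? 1 : 0;
        if (len + need_sep + n + 1 > outsize)
            return -1;                  /* no room for component + terminator */
        if (need_sep)
            out[len++] = '/';
        memcpy(out + len, start, n);
        len += n;
    }

    if (len == 0)
        out[len++] = '.';               /* relative name made only of "." */
    out[len] = '\0';
    return 0;
}
```

With this sketch, "//foo///bar/" normalizes to "/foo/bar", "./foo/./bar" to "foo/bar", and "///" to "/".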

Functions which operate on names generally take a location identifier which is either a file ID or a group ID and perform the lookup with respect to that location. Some possibilities are:

Location Type   Object Name   Description
-------------   -----------   -----------
File ID         /foo/bar      The object bar in group foo in the root group of the specified file.
Group ID        /foo/bar      The object bar in group foo in the root group of the file containing the specified group. In other words, the group ID's only purpose is to supply a file.
File ID         /             The root group of the specified file.
Group ID        /             The root group of the file containing the specified group.
File ID         foo/bar       The object bar in group foo in the current working group of the specified file. The initial current working group is the root group of the file as described below.
Group ID        foo/bar       The object bar in group foo in the specified group.
File ID         .             The current working group of the specified file.
Group ID        .             The specified group.
Other ID        .             The specified object.
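
The table can be read as a decision rule for where a lookup begins. The following is a minimal sketch of that rule; the enum types and lookup_start() are hypothetical names invented for illustration, and combinations the table does not list are reported as START_NOT_COVERED rather than guessed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical types for illustration; HDF5 does not define these. */
typedef enum { LOC_FILE, LOC_GROUP, LOC_OTHER } loc_type_t;

typedef enum {
    START_ROOT_GROUP,     /* root group of the file */
    START_CURRENT_GROUP,  /* current working group of the file */
    START_SPECIFIED,      /* the group or object named by the ID itself */
    START_NOT_COVERED     /* combination not listed in the table */
} start_t;

/* Return where a name lookup begins for a given location type and name. */
start_t lookup_start(loc_type_t loc, const char *name)
{
    assert(name != NULL);
    if (name[0] == '\0')
        return START_NOT_COVERED;        /* zero-length names are not allowed */
    if (loc == LOC_OTHER)                /* table lists only "." for other IDs */
        return (strcmp(name, ".") == 0) ? START_SPECIFIED : START_NOT_COVERED;
    if (name[0] == '/')
        return START_ROOT_GROUP;         /* absolute: the ID only supplies a file */
    return (loc == LOC_FILE) ? START_CURRENT_GROUP : START_SPECIFIED;
}
```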

3. Creating, Opening, and Closing Groups

Groups are created with the H5Gcreate() function, and existing groups can be accessed with H5Gopen(). Both functions return an object ID which should eventually be released by calling H5Gclose().

hid_t H5Gcreate (hid_t location_id, const char *name, size_t size_hint)
This function creates a new group with the specified name at the specified location which is either a file ID or a group ID. The name must not already be taken by some other object and all parent groups must already exist. The size_hint is a hint for the number of bytes to reserve to store the names which will be eventually added to the new group. Passing a value of zero for size_hint is usually adequate since the library is able to dynamically resize the name heap, but a correct hint may result in better performance. The return value is a handle for the open group and it should be closed by calling H5Gclose() when it's no longer needed. A negative value is returned for failure.

hid_t H5Gopen (hid_t location_id, const char *name)
This function opens an existing group with the specified name at the specified location which is either a file ID or a group ID and returns an object ID. The object ID should be released by calling H5Gclose() when it is no longer needed. A negative value is returned for failure.

herr_t H5Gclose (hid_t group_id)
This function releases resources used by a group which was opened by H5Gcreate() or H5Gopen(). After closing a group the group_id should not be used again. This function returns zero for success or a negative value for failure.

4. Current Working Group

Each file handle (hid_t file_id) has a current working group, initially the root group of the file. Names which do not begin with a slash are relative to the specified group or to the current working group as described above. For instance, the name "/Foo/Bar/Baz" is resolved by first looking up "Foo" in the root group. But the name "Foo/Bar/Baz" is resolved by first looking up "Foo" in the current working group.

herr_t H5Gset (hid_t location_id, const char *name)
The group with the specified name is made the current working group for the file which contains it. The location_id can be a file handle or a group handle and the name is resolved as described above. Each file handle has its own current working group, and if the location_id is a group handle then the file handle is derived from the group handle. This function returns zero for success or negative for failure.

herr_t H5Gpush (hid_t location_id, const char *name)
Each file handle has a stack of groups and the top group on that stack is the current working group. The stack initially contains only the root group. This function pushes a new group onto the stack and returns zero for success or negative for failure.

herr_t H5Gpop (hid_t location_id)
This function pops one group off the group stack for the specified file (if the location_id is a group then the file is derived from that group), changing the current working group to the new top-of-stack group. The function returns zero for success or negative for failure (failure includes attempting to pop from an empty stack). If the last item is popped from the stack then the current working group is set to the root group.
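
The push and pop behavior described above can be modeled with a small array-backed stack. This is a toy model and not the library's implementation: cwg_stack_t and the cwg_* functions are hypothetical names, groups are represented by name strings only, and the fixed depth limit is an artifact of the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define CWG_MAX_DEPTH 16   /* arbitrary limit chosen for this sketch */

/* Hypothetical model of a per-file group stack; entries are group names. */
typedef struct {
    const char *groups[CWG_MAX_DEPTH];
    size_t      depth;
} cwg_stack_t;

/* The stack initially contains only the root group. */
void cwg_init(cwg_stack_t *s)
{
    assert(s != NULL);
    s->groups[0] = "/";
    s->depth = 1;
}

/* Push a group; it becomes the current working group. Returns 0 or -1. */
int cwg_push(cwg_stack_t *s, const char *group)
{
    assert(s != NULL && group != NULL);
    if (s->depth == CWG_MAX_DEPTH)
        return -1;
    s->groups[s->depth++] = group;
    return 0;
}

/* Pop one group. Popping from an empty stack is a failure. */
int cwg_pop(cwg_stack_t *s)
{
    assert(s != NULL);
    if (s->depth == 0)
        return -1;
    s->depth--;
    return 0;
}

/* The top of the stack is the current working group; if the last item
 * has been popped, the current working group is the root group. */
const char *cwg_current(const cwg_stack_t *s)
{
    assert(s != NULL);
    return (s->depth > 0) ? s->groups[s->depth - 1] : "/";
}
```

In this model, popping the initial root entry succeeds and leaves the root group as the current working group, while a further pop fails because the stack is empty.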

5. Objects with Multiple Names

An object (including a group) can have more than one name. Creating the object gives it the first name, and then functions described here can be used to give it additional names. The association between a name and the object is called a link and HDF5 supports two types of links: a hard link is a direct association between the name and the object where both exist in a single HDF5 address space, and a soft link is an indirect association.

Hard Link Example

Soft Link Example

Object Creation
The creation of an object creates a hard link which is indistinguishable from other hard links that might be added later.

herr_t H5Glink (hid_t file_id, H5G_link_t link_type, const char *current_name, const char *new_name)
Creates a new name for an object that has some current name (possibly one of many names it currently has). If the link_type is H5G_LINK_HARD then a new hard link is created. Otherwise, if the link_type is H5G_LINK_SOFT, a soft link is created which is an alias for the current_name. When creating a soft link the object need not exist. This function returns zero for success or negative for failure. This function is not part of the prototype API.

herr_t H5Gunlink (hid_t file_id, const char *name)
This function removes an association between a name and an object. Object headers keep track of how many hard links refer to the object and when the hard link count reaches zero the object can be removed from the file (but objects which are open are not removed until all handles to the object are closed). This function is not part of the prototype API.

6. Comments

Objects can have a comment associated with them. The comment is set and queried with these two functions:

herr_t H5Gset_comment (hid_t loc_id, const char *name, const char *comment)
The previous comment (if any) for the specified object is replaced with a new comment. If the comment argument is the empty string or a null pointer then the comment message is removed from the object. Comments should be relatively short, null-terminated, ASCII strings.

herr_t H5Gget_comment (hid_t loc_id, const char *name, size_t bufsize, char *comment)
The comment string for an object is returned through the comment buffer. At most bufsize characters including a null terminator are copied, and the result is not null terminated if the comment is longer than the supplied buffer. If an object doesn't have a comment then the empty string is returned.
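
Because the result is not null terminated when the comment is longer than the buffer, callers may want to terminate the buffer themselves. The sketch below mimics the copy rule described above with a hypothetical stand-in, toy_get_comment(), so that it does not depend on the library. The wrapper get_comment_terminated() is likewise an illustrative name, not an HDF5 function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the copy rule: at most `bufsize` characters,
 * including the null terminator, are copied into `buf`. If the stored
 * comment does not fit, the result is left without a terminator. A missing
 * comment (NULL here) yields the empty string. */
void toy_get_comment(const char *stored, size_t bufsize, char *buf)
{
    assert(buf != NULL);
    if (stored == NULL)
        stored = "";
    size_t n = strlen(stored) + 1;  /* characters plus terminator */
    if (n > bufsize)
        n = bufsize;                /* truncated: the terminator is lost */
    memcpy(buf, stored, n);
}

/* Caller-side wrapper that always leaves a terminated string in `buf`,
 * at the cost of one character when the comment is truncated. */
void get_comment_terminated(const char *stored, size_t bufsize, char *buf)
{
    assert(bufsize > 0);
    toy_get_comment(stored, bufsize, buf);
    buf[bufsize - 1] = '\0';
}
```

With a four-byte buffer and the comment "abcdef", the stand-in copies "abcd" with no terminator, while the wrapper yields the terminated string "abc".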

Robb Matzke
Last modified: Wed Jul 22 14:24:34 EDT 1998