The Dataspace Interface (H5S)
1. Introduction
The dataspace interface (H5S) provides a mechanism to describe the positions
of the elements of a dataset. It is designed so that new features can be
added easily without disrupting applications that use it. A dataset (defined
with the dataset interface) is composed of a collection of raw data points of
homogeneous type, defined with the datatype (H5T) interface, and organized
according to a dataspace defined with this interface.
A dataspace describes where the elements of a dataset are located.
A dataspace is either a regular N-dimensional array of data points,
called a simple dataspace, or a more general collection of data
points organized in another manner, called a complex dataspace.
A scalar dataspace is a special case of the simple dataspace
and is defined to be 0-dimensional and a single data point in size. Currently,
only scalar and simple dataspaces are supported by this version
of the H5S interface.
Complex dataspaces will be defined and implemented in a future
version. They are intended for structures that are awkward to express
in simple dataspaces, such as irregularly
gridded data or adaptive mesh refinement data. This interface provides
functions to set and query properties of a dataspace.
Operations on a dataspace include defining or extending the extent of
the dataspace, selecting portions of the dataspace for I/O and storing the
dataspaces in the file. The extent of a dataspace is the range of coordinates
over which dataset elements are defined and stored. Dataspace selections are
subsets of the extent (up to the entire extent) which are selected for some
operation.
For example, a 2-dimensional dataspace with an extent of 10 by 10 may have
the following very simple selection:
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
0 | - | - | - | - | - | - | - | - | - | - |
1 | - | X | X | X | - | - | - | - | - | - |
2 | - | X | X | X | - | - | - | - | - | - |
3 | - | X | X | X | - | - | - | - | - | - |
4 | - | X | X | X | - | - | - | - | - | - |
5 | - | X | X | X | - | - | - | - | - | - |
6 | - | - | - | - | - | - | - | - | - | - |
7 | - | - | - | - | - | - | - | - | - | - |
8 | - | - | - | - | - | - | - | - | - | - |
9 | - | - | - | - | - | - | - | - | - | - |
Example 1: Contiguous rectangular selection
Or, a more complex selection may be defined:
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
0 | - | - | - | - | - | - | - | - | - | - |
1 | - | X | X | X | - | - | X | - | - | - |
2 | - | X | - | X | - | - | - | - | - | - |
3 | - | X | - | X | - | - | X | - | - | - |
4 | - | X | - | X | - | - | - | - | - | - |
5 | - | X | X | X | - | - | X | - | - | - |
6 | - | - | - | - | - | - | - | - | - | - |
7 | - | - | X | X | X | X | - | - | - | - |
8 | - | - | - | - | - | - | - | - | - | - |
9 | - | - | - | - | - | - | - | - | - | - |
Example 2: Non-contiguous selection
Selections within dataspaces have an offset, which locates the selection
within the extent of the dataspace. Selection offsets
default to 0 in each dimension, but may be changed to move the selection within
a dataspace. In example 2 above, if the offset were changed to (1,1), the selection
would look like this:
  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
0 | - | - | - | - | - | - | - | - | - | - |
1 | - | - | - | - | - | - | - | - | - | - |
2 | - | - | X | X | X | - | - | X | - | - |
3 | - | - | X | - | X | - | - | - | - | - |
4 | - | - | X | - | X | - | - | X | - | - |
5 | - | - | X | - | X | - | - | - | - | - |
6 | - | - | X | X | X | - | - | X | - | - |
7 | - | - | - | - | - | - | - | - | - | - |
8 | - | - | - | X | X | X | X | - | - | - |
9 | - | - | - | - | - | - | - | - | - | - |
Example 3: Non-contiguous selection with 1,1 offset
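The effect of an offset can be sketched without any library calls. The block below is only an illustration: the point2d type and the apply_offset helper are hypothetical names, not part of the H5S interface, and the sketch models a selection as an explicit list of (row, column) points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D coordinate type used only for this illustration. */
typedef struct {
    long row;
    long col;
} point2d;

/* Shift every selected point by the per-dimension offset, in place.
 * This models how an offset moves a selection within the extent; the
 * selection itself (the shape) is unchanged. */
void apply_offset(point2d *points, size_t npoints, long row_off, long col_off)
{
    assert(points != NULL || npoints == 0);
    for (size_t i = 0; i < npoints; i++) {
        points[i].row += row_off;
        points[i].col += col_off;
    }
}
```

With an offset of (1,1), the point at row 1, column 1 of example 2 moves to row 2, column 2, which matches the top-left X in example 3.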
Selections also have a linearization ordering of the points selected
(defaulting to "C" order, i.e. last dimension changing fastest). The
linearization order may be specified for each point or it may be chosen by
the axis of the dataspace. For example, writing each point as (column, row),
with the default "C" ordering,
example 1's selected points are iterated through in this order: (1,1), (2,1),
(3,1), (1,2), (2,2), etc. With "FORTRAN" ordering, example 1's selected points
would be iterated through in this order: (1,1), (1,2), (1,3), (1,4), (1,5),
(2,1), (2,2), etc.
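The two orderings can be reproduced with a pair of nested loops. This is a minimal sketch, assuming the points listed above are written as (column, row), which is the reading that agrees with the 3-column by 5-row selection in example 1. The xy_point type and linearize_rect function are hypothetical names, not library calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical point type: x is the column index, y is the row index. */
typedef struct {
    int x;
    int y;
} xy_point;

/* Fill out[] with the points of the rectangle [x0,x1] x [y0,y1] (inclusive).
 * fortran_order == 0: the column index x changes fastest ("C" order).
 * fortran_order != 0: the row index y changes fastest ("FORTRAN" order).
 * Returns the number of points written; out must have room for all of them. */
size_t linearize_rect(int x0, int x1, int y0, int y1,
                      int fortran_order, xy_point *out)
{
    assert(x0 <= x1 && y0 <= y1 && out != NULL);
    size_t n = 0;
    if (!fortran_order) {
        for (int y = y0; y <= y1; y++)
            for (int x = x0; x <= x1; x++)
                out[n++] = (xy_point){x, y};
    } else {
        for (int x = x0; x <= x1; x++)
            for (int y = y0; y <= y1; y++)
                out[n++] = (xy_point){x, y};
    }
    return n;
}
```

For example 1, the call linearize_rect(1, 3, 1, 5, 0, out) writes 15 points beginning (1,1), (2,1), (3,1), (1,2), and passing a non-zero fortran_order gives (1,1), (1,2), (1,3), (1,4), (1,5), (2,1).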
A dataspace may be stored in the file as a permanent object, to allow many
datasets to use a commonly defined dataspace. Dataspaces with extendable
extents (i.e. unlimited dimensions) cannot be stored as permanent
dataspaces.
Dataspaces may be created using an existing permanent dataspace as a
container to locate the new dataspace within. These dataspaces are complete
dataspaces and may be used to define datasets. A dataspace with a "parent"
can be queried to determine the parent dataspace and the location within the
parent. These dataspaces must currently have the same number of dimensions as
the parent dataspace.
2. General Dataspace Operations
The functions defined in this section operate on dataspaces as a whole.
New dataspaces can be created from scratch or copied from existing
dataspaces. When a dataspace is no longer needed, its resources should be released
by calling H5Sclose().
-
hid_t H5Screate(H5S_class_t type)
-
This function creates a new dataspace of a particular type. The
types currently supported are H5S_SCALAR, H5S_SIMPLE, and H5S_NONE, although
others are planned to be added later. The H5S_NONE dataspace can only hold a
selection, not an extent.
-
hid_t H5Sopen(hid_t location, const char *name)
-
This function opens a permanent dataspace for use in an application.
The location argument is a file or group ID and name is
an absolute or relative path to the permanent dataspace. The dataspace ID which
is returned is a handle to a permanent dataspace which can't be modified.
-
hid_t H5Scopy (hid_t space)
-
This function creates a new dataspace which is an exact copy of the
dataspace space.
-
hid_t H5Ssubspace (hid_t space)
-
This function uses the currently defined selection and offset in space
to create a dataspace which is located within space. The space
dataspace must be a sharable dataspace located in the file, not a dataspace for
a dataset. The relationship of the new dataspace within the existing dataspace
is preserved when the new dataspace is used to create datasets. Currently,
only subspaces which are equivalent to simple dataspaces (i.e. rectangular
contiguous areas) are allowed. A subspace is not "simplified" or reduced in
the number of dimensions used if the selection is "flat" in one dimension.
Subspaces always have the same number of dimensions as their parent dataspace.
-
herr_t H5Scommit (hid_t location, const char *name, hid_t space)
-
The dataspace specified by space is stored in the file specified
by location. The location may be either a file or group handle
and name is an absolute or relative path to the location to store the
dataspace. After this call, the dataspace is permanent and can't be modified.
-
herr_t H5Sclose (hid_t space)
-
Releases resources associated with a dataspace. Subsequent use of the
dataspace identifier after this call is undefined.
-
H5S_class_t H5Sextent_class (hid_t space)
-
Queries a dataspace to determine its current class. The value
returned is H5S_SCALAR or H5S_SIMPLE on success, or
H5S_NO_CLASS on failure.
3. Dataspace Extent Operations
These functions operate on the extent portion of a dataspace.
-
herr_t H5Sset_extent_simple (hid_t space, int rank, const hsize_t
*current_size, const hsize_t *maximum_size)
-
Sets or resets the size of an existing dataspace, where rank is
the dimensionality, or number of dimensions, of the dataspace.
current_size is an array of size rank which contains the new size
of each dimension in the dataspace. maximum_size is an array of size
rank which contains the maximum size of each dimension in the dataspace.
Any previous extent is removed from the dataspace, the dataspace type is set to
H5S_SIMPLE and the extent is set as specified.
-
herr_t H5Sset_extent_none (hid_t space)
-
Removes the extent from a dataspace and sets the type to H5S_NONE.
-
herr_t H5Sextent_copy (hid_t dest_space,
hid_t source_space)
-
Copies the extent from source_space to dest_space, which may
change the type of the dataspace. Returns non-negative on success, negative on
failure.
-
hsize_t H5Sextent_npoints (hid_t space)
-
This function determines the number of elements in a dataspace. For example, a
simple 3-dimensional dataspace with dimensions 2, 3 and 4 would have 24
elements.
Returns the number of elements in the dataspace, negative on failure.
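The element count of a simple extent is the product of its dimension sizes. The following is a minimal sketch of that arithmetic; extent_npoints is a hypothetical helper, not the library function, and it treats rank 0 as a scalar dataspace holding a single element, as described in the introduction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of H5S): the number of elements in a
 * simple extent is the product of its dimension sizes. A rank of 0 models
 * a scalar dataspace, which holds exactly one element. */
unsigned long long extent_npoints(int rank, const unsigned long long *dims)
{
    assert(rank >= 0);
    assert(rank == 0 || dims != NULL);

    unsigned long long n = 1;
    for (int i = 0; i < rank; i++)
        n *= dims[i];
    return n;
}
```

For dimensions 2, 3 and 4 this yields 2 * 3 * 4 = 24, the value quoted above.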
-
int H5Sextent_ndims (hid_t space)
-
This function determines the dimensionality (or rank) of a dataspace.
Returns the number of dimensions in the dataspace, negative on failure.
-
herr_t H5Sextent_dims (hid_t space, hsize_t *dims,
hsize_t *max)
-
The function retrieves the size of the extent of the dataspace space by
placing the size of each dimension in the array dims. Also retrieves
the size of the maximum extent of the dataspace, placing the results in
max.
Returns non-negative on success, negative on failure.
4. Dataspace Selection Operations
Selections are maintained separately from extents in dataspaces and operations
on the selection of a dataspace do not affect the extent of the dataspace.
Selections are independent of extent type and the boundaries of selections are
reconciled with the extent at the time of the data transfer. Selection offsets
apply a selection to a location within an extent, allowing the same selection
to be moved within the extent without requiring a new selection to be specified.
Offsets default to 0 when the dataspace is created. Offsets are applied when
an I/O transfer is performed (and checked during calls to H5Sselect_valid).
Selections have an iteration order for the points selected, which can be any
permutation of the dimensions involved (defaulting to 'C' array order) or a
specific order for the selected points, for selections composed of single array
elements with H5Sselect_elements. Selections can also be copied or combined
together in various ways with H5Sselect_op. Further methods of selecting
portions of a dataspace may be added in the future.
-
herr_t H5Sselect_hyperslab (hid_t space, h5s_selopt_t op,
const hssize_t * start, const hsize_t * stride,
const hsize_t * count, const hsize_t * block)
-
This function selects a hyperslab region to add to the current selected region
for the space dataspace. The start, stride, count
and block arrays must be the same size as the rank of the dataspace.
The selection operator op determines how the new selection is to be
combined with the already existing selection for the dataspace. Currently,
only the H5S_SELECT_SET operator is supported, which replaces the existing
selection with the parameters from this call. Overlapping blocks are not
supported with the H5S_SELECT_SET operator.
The start array determines the starting coordinates of the hyperslab
to select. The stride array determines how many elements to move in each
dimension between the starts of successive blocks. A stride of 1 moves to
each element in that dimension of the dataspace; a stride of 2 moves to
every other element in that dimension.
Stride values of 0 are not allowed. If the stride parameter is NULL,
a contiguous hyperslab is selected (as if each value in the stride array
were set to 1). The count array determines how many blocks to
select from the dataspace in each dimension. The block array determines
the size of the element block selected from the dataspace. If the block
parameter is NULL, the block size defaults to a single element
in each dimension (as if each value in the block array were set to 1).
For example, in a 2-dimensional dataspace, setting start to [1,1],
stride to [4,4], count to [3,7] and block to [2,2] selects
21 2x2 blocks of array elements starting with location (1,1) and selecting
blocks at locations (1,1), (5,1), (9,1), (1,5), (5,5), etc.
Regions selected with this function call default to 'C' order iteration when
I/O is performed.
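The arithmetic behind the example above can be sketched as follows. Both functions are hypothetical helpers written for this illustration; they do not call the library. The sketch assumes that a NULL stride or block means 1 in each dimension, as stated above, and it lists block origins with the first coordinate varying fastest only because that is the order used in the example text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the origin (first element) of one selected block. */
typedef struct {
    long long c0;
    long long c1;
} origin2d;

/* Write the origin of every block of a 2-D hyperslab into out[].
 * A NULL stride is treated as 1 in each dimension; 0 is rejected.
 * Returns the number of blocks, count[0] * count[1]. */
size_t hyperslab_block_origins_2d(const long long start[2],
                                  const unsigned long long stride[2],
                                  const unsigned long long count[2],
                                  origin2d *out)
{
    assert(start != NULL && count != NULL && out != NULL);
    unsigned long long s0 = stride ? stride[0] : 1;
    unsigned long long s1 = stride ? stride[1] : 1;
    assert(s0 != 0 && s1 != 0);

    size_t n = 0;
    for (unsigned long long j = 0; j < count[1]; j++) {
        for (unsigned long long i = 0; i < count[0]; i++) {
            out[n].c0 = start[0] + (long long)(i * s0);
            out[n].c1 = start[1] + (long long)(j * s1);
            n++;
        }
    }
    return n;
}

/* Number of elements selected, assuming the blocks do not overlap:
 * the product of count[i] * block[i]. A NULL block means 1x...x1 blocks. */
unsigned long long hyperslab_npoints(int rank,
                                     const unsigned long long *count,
                                     const unsigned long long *block)
{
    assert(rank > 0 && count != NULL);
    unsigned long long n = 1;
    for (int i = 0; i < rank; i++)
        n *= count[i] * (block ? block[i] : 1);
    return n;
}
```

With start [1,1], stride [4,4], count [3,7] and block [2,2], the first function reports 21 blocks starting at (1,1), (5,1), (9,1), (1,5), (5,5), and the second reports 21 * 4 = 84 selected elements.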
-
herr_t H5Sselect_elements (hid_t space, h5s_selopt_t op,
const size_t num_elements, const hssize_t *coord[])
-
This function selects array elements to be included in the selection for the
space dataspace. The number of elements selected is given by
num_elements. The coord array is a two-dimensional array of size
<dataspace rank> by <num_elements> (i.e. a list of
coordinates in the array). The order of the element coordinates in the
coord array also specifies the order in which the array elements are
iterated through when I/O is performed. Duplicate coordinate locations are not
checked for.
The selection operator op determines how the new selection is to be
combined with the already existing selection for the dataspace. Currently,
only the H5S_SELECT_SET operator is supported, which replaces the existing
selection with the parameters from this call. When operators other than
H5S_SELECT_SET are used to combine a new selection with an existing selection,
the selection ordering is reset to 'C' array ordering.
-
herr_t H5Sselect_all (hid_t space)
-
This function selects the special H5S_SELECT_ALL region for the space
dataspace. H5S_SELECT_ALL selects the entire dataspace for any dataspace it is
applied to.
-
herr_t H5Sselect_none (hid_t space)
-
This function resets the selection region for the space
dataspace not to include any elements.
-
herr_t H5Sselect_op (hid_t space1, h5s_selopt_t op,
hid_t space2)
-
Uses space2 to perform an operation on space1. The valid
operations for op are:
- H5S_SELECT_COPY
- Copies the selection from space2 into space1, removing any
previously defined selection for space1. The selection order
and offset are also copied to space1.
- H5S_SELECT_UNION
- Performs a set union of the selection of the dataspace space2
with the selection from the dataspace space1, with the result
being stored in space1. The selection order for space1 is
reset to 'C' order.
- H5S_SELECT_INTERSECT
- Performs a set intersection of the selection from space2 with
space1, with the result being stored in space1. The
selection order for space1 is reset to 'C' order.
- H5S_SELECT_DIFFERENCE
- Performs a set difference of the selection from space2 with
space1, with the result being stored in space1. The
selection order for space1 is reset to 'C' order.
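The set semantics of these operators can be modeled on flat selection masks. This is only a sketch: sel_op and select_op_mask are hypothetical names, each mask entry is 1 when the corresponding element is selected, and selection order and offset are not modeled. The direction of the difference is an assumption, since the text does not state it; the sketch keeps the elements of space1 that are not in space2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operator codes standing in for the H5S_SELECT_* values. */
typedef enum {
    SEL_COPY,
    SEL_UNION,
    SEL_INTERSECT,
    SEL_DIFFERENCE
} sel_op;

/* Combine mask2 into mask1 element by element; the result is stored in
 * mask1. Both masks have n entries, where 1 means "selected".
 * SEL_DIFFERENCE is assumed to mean mask1 minus mask2. */
void select_op_mask(unsigned char *mask1, sel_op op,
                    const unsigned char *mask2, size_t n)
{
    assert(mask1 != NULL && mask2 != NULL);
    for (size_t i = 0; i < n; i++) {
        switch (op) {
        case SEL_COPY:
            mask1[i] = mask2[i];
            break;
        case SEL_UNION:
            mask1[i] = (unsigned char)(mask1[i] || mask2[i]);
            break;
        case SEL_INTERSECT:
            mask1[i] = (unsigned char)(mask1[i] && mask2[i]);
            break;
        case SEL_DIFFERENCE:
            mask1[i] = (unsigned char)(mask1[i] && !mask2[i]);
            break;
        }
    }
}
```

A mask is a convenient model here because it makes the four operators one-line boolean expressions; it says nothing about how selections are actually represented inside the library.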
-
herr_t H5Sselect_order (hid_t space,
hsize_t perm_vector[])
-
This function selects the order to iterate through the dimensions of a dataspace
when performing I/O on a selection. If a specific order has already been
selected for the selection with H5Sselect_elements, this function will remove
it and use a dimension-oriented ordering on the selected elements. The elements
of the perm_vector array must be unique and between 0 and the rank of the
dataspace, minus 1. The order of the elements in perm_vector specifies
the order to iterate through the selection for each dimension of the dataspace.
To iterate through a 3-dimensional dataspace selection in 'C' order, specify
the elements of the perm_vector as [0, 1, 2]; for FORTRAN order they
would be [2, 1, 0]. Other orderings, such as [1, 2, 0], are also possible, but
may execute more slowly.
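The constraint on perm_vector (each entry unique and in the range 0 to rank minus 1) amounts to requiring a permutation of the dimension indices. A minimal sketch of that check follows; perm_vector_is_valid and MAX_RANK are hypothetical names chosen for this illustration, and the rank limit of 32 is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 32 /* arbitrary limit assumed for this sketch */

/* Return 1 if perm[0..rank-1] contains each of 0..rank-1 exactly once,
 * and 0 otherwise. The caller must pass a rank between 1 and MAX_RANK. */
int perm_vector_is_valid(int rank, const unsigned long long *perm)
{
    assert(rank > 0 && rank <= MAX_RANK && perm != NULL);

    unsigned char seen[MAX_RANK] = {0};
    for (int i = 0; i < rank; i++) {
        if (perm[i] >= (unsigned long long)rank)
            return 0; /* entry out of range */
        if (seen[perm[i]])
            return 0; /* duplicate entry */
        seen[perm[i]] = 1;
    }
    return 1;
}
```

The vectors [0, 1, 2], [2, 1, 0] and [1, 2, 0] all pass this check for a 3-dimensional dataspace, while [0, 0, 1] and [0, 1, 3] do not.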
-
hbool_t H5Sselect_valid (hid_t space)
-
This function verifies that the selection for a dataspace is within the extent
of the dataspace, if the currently set offset for the dataspace is used.
Returns TRUE if the selection is contained within the extent, FALSE if it
is not contained within the extent and FAIL on error conditions (such as if
the selection or extent is not defined).
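The check described above can be sketched with a bounding box. This is an illustration under stated assumptions, not the library's implementation: selection_in_extent is a hypothetical helper, the selection is represented only by its inclusive lower and upper corners before the offset is applied, and the return values 1, 0 and -1 stand in for TRUE, FALSE and FAIL.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if the selection's bounding box [lo, hi], shifted by offset,
 * lies inside an extent of size dims[i] in each dimension; 0 if it does
 * not; -1 if the selection or extent is missing or malformed.
 * A NULL offset is treated as 0 in every dimension. */
int selection_in_extent(int rank,
                        const long long *lo, const long long *hi,
                        const long long *offset,
                        const unsigned long long *dims)
{
    assert(rank > 0);
    if (lo == NULL || hi == NULL || dims == NULL)
        return -1; /* selection or extent not defined */

    for (int i = 0; i < rank; i++) {
        long long off = offset ? offset[i] : 0;
        if (lo[i] > hi[i])
            return -1; /* malformed bounding box */
        if (lo[i] + off < 0)
            return 0;  /* falls below coordinate 0 */
        if ((unsigned long long)(hi[i] + off) >= dims[i])
            return 0;  /* extends past the end of the extent */
    }
    return 1;
}
```

For the selection in example 2 (rows 1 to 7, columns 1 to 6, in a 10 by 10 extent), an offset of (1,1) is still valid, while an offset of 3 in the row dimension would push row 7 to row 10 and fail the check.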
-
hsize_t H5Sselect_npoints (hid_t space)
-
This function determines the number of elements in the current selection
of a dataspace.
-
herr_t H5Soffset_simple (hid_t space, const hssize_t *
offset)
-
Sets the offset of a simple dataspace space. The offset array
must have the same number of elements as the number of dimensions of the
dataspace. If the offset array is set to NULL, the offset
for the dataspace is reset to 0.
5. Misc. Dataspace Operations
-
herr_t H5Slock (hid_t space)
-
Locks the dataspace so that it cannot be modified or closed. When the library
exits, the dataspace will be unlocked and closed.
-
hid_t H5Screate_simple(int rank, const hsize_t *current_size,
const hsize_t *maximum_size)
-
This function is a "convenience" wrapper to create a simple dataspace
and set its extent in one call. It is equivalent to calling H5Screate()
and H5Sset_extent_simple() in two steps.
-
int H5Sis_subspace(hid_t space)
-
This function returns positive if space is located within another
dataspace, zero if it is not, and negative on a failure.
-
char *H5Ssubspace_name(hid_t space)
-
This function returns the name of the named dataspace that space
is located within. If space is not located within another dataspace,
or an error occurs, NULL is returned. The application is responsible for
freeing the string returned.
-
herr_t H5Ssubspace_location(hid_t space, hsize_t *loc)
-
If space is located within another dataspace, this function puts
the location of the origin of space in the loc array. The loc
array must be at least as large as the number of dimensions of space.
If space is not located within another dataspace
or an error occurs, a negative value is returned, otherwise a non-negative value
is returned.
Robb Matzke
Quincey Koziol
Last modified: Thu May 28 15:12:04 EST 1998