/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* Copyright by The HDF Group. *
* All rights reserved. *
* *
* This file is part of HDF5. The full HDF5 copyright notice, including *
* terms governing use, modification, and redistribution, is contained in *
* the COPYING file, which can be found at the root of the source code *
* distribution tree, or in https://www.hdfgroup.org/licenses. *
* If you do not have access to either file, you may request a copy from *
* help@hdfgroup.org. *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
/*
* Programmer: Quincey Koziol
* Saturday, September 12, 2015
*
* Purpose: This file contains declarations which define macros for the
* H5T package. Including this header means that the source file
* is part of the H5T package.
*/
#ifndef H5Tmodule_H
#define H5Tmodule_H
/* Define the proper control macros for the generic FUNC_ENTER/LEAVE and error
* reporting macros.
*/
#define H5T_MODULE
#define H5_MY_PKG H5T
#define H5_MY_PKG_ERR H5E_DATATYPE
#define H5_MY_PKG_INIT YES
/** \page H5T_UG HDF5 Datatypes
*
* \section sec_datatype HDF5 Datatypes
* HDF5 datatypes describe the element type of HDF5 datasets and attributes.
* There's a large set of predefined datatypes, but users may find it useful
* to define new datatypes through a process called \Emph{derivation}.
*
* The element type is automatically persisted as part of the HDF5 metadata of
* attributes and datasets. Additionally, datatype definitions can be persisted
* to HDF5 files and linked to groups as HDF5 datatype objects or so-called
* \Emph{committed datatypes}.
*
* \subsection subsec_datatype_intro Introduction and Definitions
*
* An HDF5 dataset is an array of data elements, arranged according to the specifications
* of the dataspace. In general, a data element is the smallest addressable unit of storage
* in the HDF5 file. (Compound datatypes are the exception to this rule.) The HDF5 datatype
* defines the storage format for a single data element. See the figure below.
*
* The model for HDF5 attributes is very similar to that of datasets: an attribute has a dataspace
* and a datatype, as shown in the figure below. The information in this chapter applies to both
* datasets and attributes.
*
*
* <table>
* <caption align=top>Datatype classes and their properties</caption>
* <tr>
* <th>Class</th>
* <th>Description</th>
* <th>Properties</th>
* <th>Notes</th>
* </tr>
* <tr>
* <td>Integer</td>
* <td>Two’s complement integers</td>
* <td>Size (bytes), precision (bits), offset (bits), pad, byte order, signed/unsigned</td>
* <td></td>
* </tr>
* <tr>
* <td>Float</td>
* <td>Floating-point numbers</td>
* <td>Size (bytes), precision (bits), offset (bits), pad, byte order, sign position,
* exponent position, exponent size (bits), exponent sign, exponent bias, mantissa position,
* mantissa size (bits), mantissa sign, mantissa normalization, internal padding</td>
* <td>See IEEE 754 for a definition of these properties. These properties describe
* non-IEEE 754 floating-point formats as well.</td>
* </tr>
* <tr>
* <td>Character</td>
* <td>Array of 1-byte character encoding</td>
* <td>Size (characters), character set, byte order, pad/no pad, pad character</td>
* <td>Currently, ASCII and UTF-8 are supported.</td>
* </tr>
* <tr>
* <td>Bitfield</td>
* <td>String of bits</td>
* <td>Size (bytes), precision (bits), offset (bits), pad, byte order</td>
* <td>A sequence of bit values packed into one or more bytes.</td>
* </tr>
* <tr>
* <td>Opaque</td>
* <td>Uninterpreted data</td>
* <td>Size (bytes), precision (bits), offset (bits), pad, byte order, tag</td>
* <td>A sequence of bytes, stored and retrieved as a block.
* The ‘tag’ is a string that can be used to label the value.</td>
* </tr>
* <tr>
* <td>Enumeration</td>
* <td>A list of discrete values, with symbolic names in the form of strings.</td>
* <td>Number of elements, element names, element values</td>
* <td>Enumeration is a list of pairs (name, value). The name is a string; the
* value is an unsigned integer.</td>
* </tr>
* <tr>
* <td>Reference</td>
* <td>Reference to object or region within the HDF5 file</td>
* <td></td>
* <td>@see H5R</td>
* </tr>
* <tr>
* <td>Array</td>
* <td>Array (1 to #H5S_MAX_RANK dimensions) of data elements</td>
* <td>Number of dimensions, dimension sizes, base datatype</td>
* <td>The array is accessed atomically: no selection or sub-setting.</td>
* </tr>
* <tr>
* <td>Variable-length</td>
* <td>A variable-length 1-dimensional array of data elements</td>
* <td>Current size, base type</td>
* <td></td>
* </tr>
* <tr>
* <td>Compound</td>
* <td>A datatype composed of a sequence of datatypes</td>
* <td>Number of members, member names, member types, member offset, member class,
* member size, byte order</td>
* <td></td>
* </tr>
* </table>
*
* \subsubsection subsubsec_datatype_model_predefine Predefined Datatypes
* The HDF5 library predefines a modest number of commonly used datatypes. These types have
* standard symbolic names of the form H5T_arch_base, where arch is an architecture name and
* base is a programming type name.
*
* <table>
* <caption align=top>Table 2. Architectures used in predefined datatypes</caption>
* <tr>
* <th>Architecture Name</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>IEEE</td>
* <td>IEEE-754 standard floating-point types in various byte orders.</td>
* </tr>
* <tr>
* <td>STD</td>
* <td>This is an architecture that contains semi-standard datatypes like signed
* two’s complement integers, unsigned integers, and bitfields in various
* byte orders.</td>
* </tr>
* <tr>
* <td>C, FORTRAN</td>
* <td>Types which are specific to the C or Fortran programming languages
* are defined in these architectures. For instance, #H5T_C_S1 defines a
* base string type with null termination which can be used to derive string
* types of other lengths.</td>
* </tr>
* <tr>
* <td>NATIVE</td>
* <td>This architecture contains C-like datatypes for the machine on which
* the library was compiled. The types were actually defined by running
* the H5detect program when the library was compiled. In order to be
* portable, applications should almost always use this architecture
* to describe data in memory.</td>
* </tr>
* <tr>
* <td>CRAY</td>
* <td>Cray architectures. These are word-addressable, big-endian systems
* with non-IEEE floating point.</td>
* </tr>
* <tr>
* <td>INTEL</td>
* <td>All Intel and compatible CPUs.
* These are little-endian systems with IEEE floating point.</td>
* </tr>
* <tr>
* <td>MIPS</td>
* <td>All MIPS CPUs commonly used in SGI systems. These are big-endian
* systems with IEEE floating point.</td>
* </tr>
* <tr>
* <td>ALPHA</td>
* <td>All DEC Alpha CPUs. These are little-endian systems with IEEE floating point.</td>
* </tr>
* </table>
*
* <table>
* <caption align=top>Table 5. Some predefined datatypes</caption>
* <tr>
* <th>Example</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>#H5T_IEEE_F64LE</td>
* <td>Eight-byte, little-endian, IEEE floating point</td>
* </tr>
* <tr>
* <td>#H5T_IEEE_F32BE</td>
* <td>Four-byte, big-endian, IEEE floating point</td>
* </tr>
* <tr>
* <td>#H5T_STD_I32LE</td>
* <td>Four-byte, little-endian, signed two’s complement integer</td>
* </tr>
* <tr>
* <td>#H5T_STD_U16BE</td>
* <td>Two-byte, big-endian, unsigned integer</td>
* </tr>
* <tr>
* <td>#H5T_C_S1</td>
* <td>One-byte, null-terminated string of eight-bit characters</td>
* </tr>
* <tr>
* <td>#H5T_INTEL_B64</td>
* <td>Eight-byte bit field on an Intel CPU</td>
* </tr>
* <tr>
* <td>#H5T_STD_REF_OBJ</td>
* <td>Reference to an entire object in a file</td>
* </tr>
* </table>
*
* The HDF5 library predefines a set of \Emph{NATIVE} datatypes whose names are similar to C type names.
* Each native type is an alias for the appropriate HDF5 datatype on the given platform. For
* example, #H5T_NATIVE_INT corresponds to a C int type. On an Intel-based PC, this type is the same as
* #H5T_STD_I32LE, while on a MIPS system this would be equivalent to #H5T_STD_I32BE. Table 6 shows
* examples of \Emph{NATIVE} types and corresponding C types for a common 32-bit workstation.
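*
* The mapping from a native type to an STD type depends only on the size and byte order of the
* C type on the host. The following is a minimal sketch that does not use the HDF5 library; the
* helper names host_is_little_endian and native_int_std_name are hypothetical and exist only
* for this illustration.
*
```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

// Returns 1 when the least significant byte is stored at the lowest address.
int host_is_little_endian(void)
{
    const unsigned int probe = 1u;
    return *(const unsigned char *)&probe == 1;
}

// Writes the STD-style name that a C int corresponds to on this host,
// for example "H5T_STD_I32LE". Returns the value returned by snprintf.
int native_int_std_name(char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "H5T_STD_I%u%s",
                    (unsigned)(sizeof(int) * CHAR_BIT),
                    host_is_little_endian() ? "LE" : "BE");
}
```
*
* The result is computed at run time, so the same source reports a little-endian name on one
* host and a big-endian name on another.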
*
*
* <table>
* <caption align=top>Table 8. General operations on datatype objects</caption>
* <tr>
* <th>API Function</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref hid_t \ref H5Tcreate (\ref H5T_class_t class, size_t size)</td>
* <td>
* Create a new datatype object of the datatype class specified by the class argument. The
* following datatype classes are supported with this function:
* \li #H5T_COMPOUND
* \li #H5T_OPAQUE
* \li #H5T_ENUM
*
* Other datatypes are created with \ref H5Tcopy().
* </td>
* </tr>
* <tr>
* <td>\ref hid_t \ref H5Tcopy (\ref hid_t type)</td>
* <td>Obtain a modifiable transient datatype which is a copy of type. If type is a dataset identifier
* then the type returned is a modifiable transient copy of the datatype of the specified dataset.</td>
* </tr>
* <tr>
* <td>\ref hid_t \ref H5Topen (\ref hid_t location, const char *name, #H5P_DEFAULT)</td>
* <td>Open a committed datatype. The committed datatype returned by this function is read-only.</td>
* </tr>
* <tr>
* <td>\ref htri_t \ref H5Tequal (\ref hid_t type1, \ref hid_t type2)</td>
* <td>Determines if two types are equal.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tclose (\ref hid_t type)</td>
* <td>Releases resources associated with a datatype obtained from \ref H5Tcopy, \ref H5Topen, or
* \ref H5Tcreate. It is illegal to close an immutable transient datatype (for example, predefined types).</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tcommit (\ref hid_t location, const char *name, \ref hid_t type,
* #H5P_DEFAULT, #H5P_DEFAULT, #H5P_DEFAULT)</td>
* <td>Commit a transient datatype (not immutable) to a file to become a committed datatype. Committed
* datatypes can be shared.</td>
* </tr>
* <tr>
* <td>\ref htri_t \ref H5Tcommitted (\ref hid_t type)</td>
* <td>Test whether the datatype is transient or committed (named).</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tlock (\ref hid_t type)</td>
* <td>Make a transient datatype immutable (read-only and not closable). Predefined types are locked.</td>
* </tr>
* </table>
*
* In order to use a datatype, the object must be created (\ref H5Tcreate), or a reference obtained by
* cloning from an existing type (\ref H5Tcopy), or opened (\ref H5Topen). In addition, a reference to the
* datatype of a dataset or attribute can be obtained with \ref H5Dget_type or \ref H5Aget_type. For
* composite datatypes a reference to the datatype for members or base types can be obtained
* (\ref H5Tget_member_type, \ref H5Tget_super). When the datatype object is no longer needed, the
* reference is discarded with \ref H5Tclose.
*
* Two datatype objects can be tested to see if they are the same with \ref H5Tequal. This function
* compares the datatype definitions rather than the identifiers: it returns true if the two datatype
* references refer to the same datatype object, and also if two distinct datatype objects define
* equivalent datatypes (the same datatype class and datatype properties).
*
* A datatype can be written to the file as a first class object (\ref H5Tcommit). This is a committed
* datatype and can be used in the same way as any other datatype.
*
* \subsubsection subsubsec_datatype_program_discover Discovery of Datatype Properties
* Any HDF5 datatype object can be queried to discover all of its datatype properties. For each
* datatype class, there is a set of API functions to retrieve the datatype properties for this class.
*
*
* <table>
* <caption align=top>Table 9. Functions to discover properties of atomic datatypes</caption>
* <tr>
* <th>API Function</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref H5T_class_t \ref H5Tget_class (\ref hid_t type)</td>
* <td>The datatype class: #H5T_INTEGER, #H5T_FLOAT, #H5T_STRING, #H5T_BITFIELD, #H5T_OPAQUE, #H5T_COMPOUND,
* #H5T_REFERENCE, #H5T_ENUM, #H5T_VLEN, #H5T_ARRAY</td>
* </tr>
* <tr>
* <td>size_t \ref H5Tget_size (\ref hid_t type)</td>
* <td>The total size of the element in bytes, including padding which may appear on either side of the
* actual value.</td>
* </tr>
* <tr>
* <td>\ref H5T_order_t \ref H5Tget_order (\ref hid_t type)</td>
* <td>The byte order describes how the bytes of the datatype are laid out in memory. If the lowest memory
* address contains the least significant byte of the datum then it is said to be little-endian or
* #H5T_ORDER_LE. If the bytes are in the opposite order then they are said to be big-endian or
* #H5T_ORDER_BE.</td>
* </tr>
* <tr>
* <td>size_t \ref H5Tget_precision (\ref hid_t type)</td>
* <td>The precision property identifies the number of significant bits of a datatype and the offset property
* (defined below) identifies its location. Some datatypes occupy more bytes than are needed to store
* the value. For instance, a short on a Cray is 32 significant bits in an eight-byte field.</td>
* </tr>
* <tr>
* <td>int \ref H5Tget_offset (\ref hid_t type)</td>
* <td>The offset property defines the bit location of the least significant bit of a bit field whose length
* is precision.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tget_pad (\ref hid_t type, \ref H5T_pad_t *lsb, \ref H5T_pad_t *msb)</td>
* <td>Padding is the bits of a data element which are not significant as defined by the precision and offset
* properties. Padding in the low-numbered bits is lsb padding and padding in the high-numbered bits is msb
* padding. Padding bits can be set to zero (#H5T_PAD_ZERO) or one (#H5T_PAD_ONE).</td>
* </tr>
* </table>
*
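* The relationship between precision, offset, and padding can be illustrated with plain bit
* arithmetic. The following is a minimal sketch that does not call the HDF5 library; the helper
* extract_significant_bits is hypothetical, and it assumes the whole element fits in a 64-bit
* container.
*
```c
#include <stdint.h>

// Hypothetical helper, not part of the HDF5 API: returns the `precision`
// significant bits that start `offset` bits above the least significant bit
// of `container`. Everything outside that range is treated as padding.
uint64_t extract_significant_bits(uint64_t container, unsigned offset, unsigned precision)
{
    if (offset >= 64 || precision == 0)
        return 0;
    // Build a mask of `precision` one-bits without ever shifting by 64.
    uint64_t mask = (precision >= 64) ? UINT64_MAX : ((UINT64_C(1) << precision) - 1);
    return (container >> offset) & mask;
}
```
*
* For the 16-bit value 0xABCD with offset 4 and precision 8, the helper returns 0xBC: bits 0 to 3
* are lsb padding and bits 12 to 15 are msb padding.
*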
*
* <table>
* <caption align=top>Table 10. Functions to discover properties of atomic numeric datatypes</caption>
* <tr>
* <th>API Function</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref H5T_sign_t \ref H5Tget_sign (\ref hid_t type)</td>
* <td>(INTEGER) Integer data can be signed two’s complement (#H5T_SGN_2) or unsigned (#H5T_SGN_NONE).</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tget_fields (\ref hid_t type, size_t *spos, size_t *epos, size_t *esize,
* size_t *mpos, size_t *msize)</td>
* <td>(FLOAT) A floating-point data element has bit fields which are the exponent and mantissa as well as a
* mantissa sign bit. These properties define the location (bit position of least significant bit of the
* field) and size (in bits) of each field. The sign bit is always of length one and none of the fields
* are allowed to overlap.</td>
* </tr>
* <tr>
* <td>size_t \ref H5Tget_ebias (\ref hid_t type)</td>
* <td>(FLOAT) The exponent is stored as a non-negative value which is ebias larger than the true
* exponent.</td>
* </tr>
* <tr>
* <td>\ref H5T_norm_t \ref H5Tget_norm (\ref hid_t type)</td>
* <td>
* (FLOAT) This property describes the normalization method of the mantissa.
* \li #H5T_NORM_MSBSET: the mantissa is shifted left (if non-zero) until the first bit
* after the radix point is set and the exponent is adjusted accordingly. All bits of the
* mantissa after the radix point are stored.
* \li #H5T_NORM_IMPLIED: the mantissa is shifted left (if non-zero) until the first
* bit after the radix point is set and the exponent is adjusted accordingly. The first
* bit after the radix point is not stored since it is always set.
* \li #H5T_NORM_NONE: the fractional part of the mantissa is stored without normalizing it.
* </td>
* </tr>
* <tr>
* <td>\ref H5T_pad_t \ref H5Tget_inpad (\ref hid_t type)</td>
* <td>(FLOAT) If any internal bits (that is, bits between the sign bit, the mantissa field,
* and the exponent field but within the precision field) are unused, then they will be
* filled according to the value of this property. The padding can be:
* #H5T_PAD_BACKGROUND, #H5T_PAD_ZERO, or #H5T_PAD_ONE.</td>
* </tr>
* </table>
*
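* As an illustration of the floating-point field properties, the sketch below decodes an
* IEEE 754 single-precision value using the field positions of that format (sign at bit 31,
* 8 exponent bits starting at bit 23, 23 mantissa bits starting at bit 0, exponent bias 127).
* It does not call the HDF5 library. The type f32_fields and the function decode_f32 are
* hypothetical names, the code assumes that the C float type is IEEE 754 binary32, and it
* ignores special cases such as zero, subnormal numbers, infinities, and NaN.
*
```c
#include <stdint.h>
#include <string.h>

// Illustrative result type; the local names below follow the parameter names
// used by H5Tget_fields and H5Tget_ebias.
typedef struct {
    unsigned sign;      // value of the sign bit
    int      exponent;  // true exponent: stored exponent minus ebias
    uint32_t mantissa;  // stored fraction bits (implied leading bit omitted)
} f32_fields;

f32_fields decode_f32(float value)
{
    const unsigned spos = 31, epos = 23, esize = 8, mpos = 0, msize = 23;
    const int ebias = 127;
    uint32_t bits;
    f32_fields out;

    // Copy the object representation; assumes float is IEEE 754 binary32.
    memcpy(&bits, &value, sizeof bits);
    out.sign     = (bits >> spos) & 1u;
    out.exponent = (int)((bits >> epos) & ((1u << esize) - 1u)) - ebias;
    out.mantissa = (bits >> mpos) & ((1u << msize) - 1u);
    return out;
}
```
*
* For example, -6.5 is -1.625 x 2^2, so the decoded sign is 1, the true exponent is 2, and the
* stored mantissa is 0x500000 (the fraction 0.625 scaled by 2^23).
*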
*
* <table>
* <caption align=top>Table 13. Functions to discover properties of composite datatypes</caption>
* <tr>
* <th>API Function</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>int \ref H5Tget_nmembers (\ref hid_t type_id)</td>
* <td>(COMPOUND) The number of fields in the compound datatype.</td>
* </tr>
* <tr>
* <td>\ref H5T_class_t \ref H5Tget_member_class (\ref hid_t cdtype_id, unsigned member_no)</td>
* <td>(COMPOUND) The datatype class of compound datatype member member_no.</td>
* </tr>
* <tr>
* <td>char* \ref H5Tget_member_name (\ref hid_t type_id, unsigned field_idx)</td>
* <td>(COMPOUND) The name of field field_idx of a compound datatype.</td>
* </tr>
* <tr>
* <td>size_t \ref H5Tget_member_offset (\ref hid_t type_id, unsigned memb_no)</td>
* <td>(COMPOUND) The byte offset of the beginning of a field within a compound datatype.</td>
* </tr>
* <tr>
* <td>\ref hid_t \ref H5Tget_member_type (\ref hid_t type_id, unsigned field_idx)</td>
* <td>(COMPOUND) The datatype of the specified member.</td>
* </tr>
* <tr>
* <td>int \ref H5Tget_array_ndims (\ref hid_t adtype_id)</td>
* <td>(ARRAY) The number of dimensions (rank) of the array datatype object.</td>
* </tr>
* <tr>
* <td>int \ref H5Tget_array_dims (\ref hid_t adtype_id, hsize_t dims[])</td>
* <td>(ARRAY) The sizes of the dimensions and the dimension permutations of the array datatype object.</td>
* </tr>
* <tr>
* <td>\ref hid_t \ref H5Tget_super (\ref hid_t type)</td>
* <td>(ARRAY, VL, ENUM) The base datatype from which the datatype type is derived.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tenum_nameof (\ref hid_t type, const void *value, char *name, size_t size)</td>
* <td>(ENUM) The symbol name that corresponds to the specified value of the enumeration datatype.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tenum_valueof (\ref hid_t type, const char *name, void *value)</td>
* <td>(ENUM) The value that corresponds to the specified name of the enumeration datatype.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tget_member_value (\ref hid_t type, unsigned memb_no, void *value)</td>
* <td>(ENUM) The value of the enumeration datatype member memb_no.</td>
* </tr>
* </table>
*
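* The member count and member byte offsets of a compound datatype correspond closely to the
* layout of a C struct. The sketch below builds such a member table with the standard offsetof
* macro and does not call the HDF5 library. The record type sensor_t and the helpers
* sensor_nmembers and sensor_member_offset are hypothetical and exist only for this illustration.
*
```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical record type, used only for this illustration.
typedef struct {
    int32_t id;
    double  value;
    char    tag[4];
} sensor_t;

typedef struct {
    const char *name;
    size_t      offset;  // byte offset of the member within sensor_t
    size_t      size;    // size of the member in bytes
} member_info;

// One entry per member, in declaration order.
static const member_info sensor_members[] = {
    {"id",    offsetof(sensor_t, id),    sizeof(int32_t)},
    {"value", offsetof(sensor_t, value), sizeof(double)},
    {"tag",   offsetof(sensor_t, tag),   sizeof(char[4])},
};

// Number of members, analogous to what H5Tget_nmembers reports.
size_t sensor_nmembers(void)
{
    return sizeof sensor_members / sizeof sensor_members[0];
}

// Byte offset of member `idx`, analogous to H5Tget_member_offset.
// This sketch returns 0 for an out-of-range index.
size_t sensor_member_offset(size_t idx)
{
    return idx < sensor_nmembers() ? sensor_members[idx].offset : 0;
}
```
*
* The offsets are taken from the compiler, so any alignment padding between members is
* reflected in the table rather than assumed.
*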
*
* \subsubsection subsubsec_datatype_program_define Definition of Datatypes
* The HDF5 library enables user programs to create and modify datatypes. The essential steps are:
*
* <table>
* <caption align=top>Table 15. API methods that set properties of atomic datatypes</caption>
* <tr>
* <th>Functions</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_size (\ref hid_t type, size_t size)</td>
* <td>Set the total size of the element in bytes. This includes padding which may appear on either
* side of the actual value. If this property is reset to a smaller value which would cause the
* significant part of the data to extend beyond the edge of the datatype, then the offset property
* is decremented a bit at a time. If the offset reaches zero and the significant part of the data
* still extends beyond the edge of the datatype then the precision property is decremented a bit at
* a time. Decreasing the size of a datatype may fail if the #H5T_FLOAT bit fields would extend beyond
* the significant part of the type.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_order (\ref hid_t type, \ref H5T_order_t order)</td>
* <td>Set the byte order to little-endian (#H5T_ORDER_LE) or big-endian (#H5T_ORDER_BE).</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_precision (\ref hid_t type, size_t precision)</td>
* <td>Set the number of significant bits of a datatype. The offset property (defined below) identifies
* its location. The size property defined above represents the entire size (in bytes) of the datatype.
* If the precision is decreased then padding bits are inserted on the MSB side of the significant
* bits (this will fail for #H5T_FLOAT types if it results in the sign, mantissa, or exponent bit field
* extending beyond the edge of the significant bit field). On the other hand, if the precision is
* increased so that it “hangs over” the edge of the total size then the offset property is decremented
* a bit at a time. If the offset reaches zero and the significant bits still hang over the edge, then
* the total size is increased a byte at a time.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_offset (\ref hid_t type, size_t offset)</td>
* <td>Set the bit location of the least significant bit of a bit field whose length is precision. The
* bits of the entire data are numbered beginning at zero at the least significant bit of the least
* significant byte (the byte at the lowest memory address for a little-endian type or the byte at
* the highest address for a big-endian type). If the offset is increased so the significant bits
* “hang over” the edge of the datum, then the size property is automatically incremented.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_pad (\ref hid_t type, \ref H5T_pad_t lsb, \ref H5T_pad_t msb)</td>
* <td>Set the padding to zeros (#H5T_PAD_ZERO) or ones (#H5T_PAD_ONE). Padding is the bits of a
* data element which are not significant as defined by the precision and offset properties. Padding
* in the low-numbered bits is lsb padding and padding in the high-numbered bits is msb padding.</td>
* </tr>
* </table>
*
*
* <table>
* <caption align=top>Table 16. API methods that set properties of numeric datatypes</caption>
* <tr>
* <th>Functions</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_sign (\ref hid_t type, \ref H5T_sign_t sign)</td>
* <td>(INTEGER) Integer data can be signed two’s complement (#H5T_SGN_2) or unsigned (#H5T_SGN_NONE).</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_fields (\ref hid_t type, size_t spos, size_t epos, size_t esize,
* size_t mpos, size_t msize)</td>
* <td>(FLOAT) Set the properties that define the location (bit position of least significant bit of the
* field) and size (in bits) of each field. The sign bit is always of length one and none of the fields
* are allowed to overlap.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_ebias (\ref hid_t type, size_t ebias)</td>
* <td>(FLOAT) The exponent is stored as a non-negative value which is ebias larger than the true
* exponent.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_norm (\ref hid_t type, \ref H5T_norm_t norm)</td>
* <td>
* (FLOAT) This property describes the normalization method of the mantissa.
* \li #H5T_NORM_MSBSET: the mantissa is shifted left (if non-zero) until the first bit
* after the radix point is set and the exponent is adjusted accordingly. All bits of the
* mantissa after the radix point are stored.
* \li #H5T_NORM_IMPLIED: the mantissa is shifted left (if non-zero) until the first bit
* after the radix point is set and the exponent is adjusted accordingly. The first bit after
* the radix point is not stored since it is always set.
* \li #H5T_NORM_NONE: the fractional part of the mantissa is stored without normalizing it.
* </td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_inpad (\ref hid_t type, \ref H5T_pad_t inpad)</td>
* <td>
* (FLOAT) If any internal bits (that is, bits between the sign bit, the mantissa field,
* and the exponent field but within the precision field) are unused, then they will be
* filled according to the value of this property. The padding can be:
* \li #H5T_PAD_BACKGROUND
* \li #H5T_PAD_ZERO
* \li #H5T_PAD_ONE
* </td>
* </tr>
* </table>
*
*
* <table>
* <caption align=top>Table 17. API methods that set properties of string datatypes</caption>
* <tr>
* <th>Functions</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_size (\ref hid_t type, size_t size)</td>
* <td>Set the length of the string, in bytes. The precision is automatically set to 8*size.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_precision (\ref hid_t type, size_t precision)</td>
* <td>The precision must be a multiple of 8.</td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_cset (\ref hid_t type_id, \ref H5T_cset_t cset)</td>
* <td>
* Two character sets are currently supported:
* \li ASCII (#H5T_CSET_ASCII)
* \li UTF-8 (#H5T_CSET_UTF8)
* </td>
* </tr>
* <tr>
* <td>\ref herr_t \ref H5Tset_strpad (\ref hid_t type_id, H5T_str_t strpad)</td>
* <td>
* The string datatype has a fixed length, but the string may be shorter than the length. This
* property defines the storage mechanism for the leftover bytes. The method used to store
* character strings differs with the programming language:
* \li C usually null-terminates strings
* \li Fortran left-justifies and space-pads strings
*
* Valid string padding values, as passed in the parameter strpad, are as follows:
* \li #H5T_STR_NULLTERM: Null terminate (as C does)
* \li #H5T_STR_NULLPAD: Pad with zeros
* \li #H5T_STR_SPACEPAD: Pad with spaces (as Fortran does)
* </td>
* </tr>
* </table>
*
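* The three string padding modes can be sketched as a small routine that fills a fixed-length
* field. This is a library-independent illustration under stated assumptions: the enum pad_mode
* and the function store_fixed_string are hypothetical (they are not the HDF5 H5T_str_t type or
* an HDF5 function), and the truncation rule for strings that are too long is a choice made for
* this sketch.
*
```c
#include <stddef.h>
#include <string.h>

// Hypothetical enum mirroring the three padding modes.
typedef enum { PAD_NULLTERM, PAD_NULLPAD, PAD_SPACEPAD } pad_mode;

// Stores `src` into a fixed-length field of `size` bytes, truncating if needed.
void store_fixed_string(char *dst, size_t size, const char *src, pad_mode mode)
{
    if (size == 0)
        return;
    // A null-terminated field reserves its last byte for the terminator.
    size_t capacity = (mode == PAD_NULLTERM) ? size - 1 : size;
    size_t len = strlen(src);
    if (len > capacity)
        len = capacity;
    memcpy(dst, src, len);
    // Fill the leftover bytes with spaces or with zeros.
    memset(dst + len, (mode == PAD_SPACEPAD) ? ' ' : '\0', size - len);
}
```
*
* With an 8-byte field, the string "HDF5" is followed by four spaces in the space-padded mode
* and by four zero bytes in the other two modes. The modes differ when the string fills the
* field: only the null-terminated mode gives up a character to keep the terminator.
*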
*
* <table>
* <caption align=top>Representing data with multiple measurements</caption>
* <tr>
* <th>Storage Strategy</th>
* <th>Stored as</th>
* <th>Remarks</th>
* </tr>
* <tr>
* <td>Multiple planes</td>
* <td>Several datasets with identical dataspaces</td>
* <td>This is optimal when variables are accessed individually, or when an application often uses
* only selected variables.</td>
* </tr>
* <tr>
* <td>Additional dimension</td>
* <td>One dataset, the last “dimension” is a vector of variables</td>
* <td>This can give good performance, although selecting only a few variables may be slow. This may
* not reflect the science.</td>
* </tr>
* <tr>
* <td>Record with multiple values</td>
* <td>One dataset with compound datatype</td>
* <td>This enables the variables to be read all together or selected. Also handles “vectors” of
* heterogeneous data.</td>
* </tr>
* <tr>
* <td>Vector or Tensor value</td>
* <td>One dataset, each data element is a small array of values.</td>
* <td>This uses the same amount of space as the previous two, and may represent the science model
* better.</td>
* </tr>
* </table>
*
*
* <table>
* <caption align=top>Storage method advantages and disadvantages</caption>
* <tr>
* <th>Method</th>
* <th>Advantages</th>
* <th>Disadvantages</th>
* </tr>
* <tr>
* <td>Multiple Datasets</td>
* <td>Easy to access each plane, can select any plane(s)</td>
* <td>Less efficient to access a ‘column’ through the planes</td>
* </tr>
* <tr>
* <td>N+1 Dimension</td>
* <td>All access patterns supported</td>
* <td>
* \li Must be homogeneous datatype
* \li The added dimension may not make sense in the scientific model
* </td>
* </tr>
* <tr>
* <td>Compound Datatype</td>
* <td>Can be heterogeneous datatype</td>
* <td>
* \li Planes must be named, selection is by plane
* \li Not a natural representation for a matrix
* </td>
* </tr>
* <tr>
* <td>Array</td>
* <td>A natural representation for vector or tensor data</td>
* <td>Cannot access elements separately (no access by plane)</td>
* </tr>
* </table>
*
* An array datatype may be multi-dimensional with 1 to #H5S_MAX_RANK (the maximum rank
* of a dataset is currently 32) dimensions. The dimensions can be any size greater than 0, but
* unlimited dimensions are not supported (although the datatype can be a variable-length datatype).
*
* An array datatype is created with the #H5Tarray_create call, which specifies the number of
* dimensions, the size of each dimension, and the base type of the array. The array datatype can
* then be used in any way that any datatype object is used. The example below shows the creation
* of a datatype that is a two-dimensional array of native integers, and this is then used to create a
* dataset. Note that the dataset can have a dataspace with any number and size of dimensions. The figure
* below shows the layout in memory assuming that the native integers are 4 bytes. Each
* data element has 6 elements, for a total of 24 bytes.
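*
* The size arithmetic for an array datatype is the product of the dimension sizes multiplied by
* the size of the base type. The sketch below does not call the HDF5 library; the helper
* array_element_size is hypothetical, and the 3 x 2 shape used in the usage note is an assumption
* (it is one two-dimensional shape with 6 elements).
*
```c
#include <stddef.h>

// Hypothetical helper: bytes occupied by one element of an array datatype,
// computed as the product of the dimension sizes times the base type size.
size_t array_element_size(unsigned rank, const size_t dims[], size_t base_size)
{
    size_t count = 1;
    for (unsigned i = 0; i < rank; i++)
        count *= dims[i];
    return count * base_size;
}
```
*
* With dims set to {3, 2} and a 4-byte base type, the helper returns 24, which matches the
* total given above.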
*
*
* <table>
* <caption align=top>A string stored as one-character elements in a one-dimensional array</caption>
* <tr>
* <td>
* a) #H5T_NATIVE_CHAR: The dataset is a one-dimensional array with 29 elements, and each element
* is a single character.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig16a.gif
* </td>
* </tr>
* <tr>
* <td>
* b) Fixed-length string: The dataset is a one-dimensional array with two elements, and each
* element is 20 characters.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig16b.gif
* </td>
* </tr>
* <tr>
* <td>
* c) Variable-length string: The dataset is a one-dimensional array with two elements, and each
* element is a variable-length string. This is the same result when stored as a fixed-length
* string except that the first element of the array will need only 11 bytes for storage instead of 20.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig16c.gif
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig16d.gif
* </td>
* </tr>
* </table>
*
*
* First, a dataset may have the datatype #H5T_NATIVE_CHAR, with each character of
* the string as an element of the dataset. This will store an unstructured block of text data, but
* gives little indication of any structure in the text. See item a in the figure above.
*
* A second alternative is to store the data using the datatype class #H5T_STRING with each
* element a fixed length. See item b in the figure above. In this approach, each element might be a
* word or a sentence, addressed by the dataspace. The dataset reserves space for the specified
* number of characters, although some strings may be shorter. This approach is simple and usually
* is fast to access, but can waste storage space if the length of the strings varies.
*
* A third alternative is to use a variable-length datatype. See item c in the figure above. This can
* be done using the standard mechanisms described above. The program would use hvl_t structures
* to write and read the data.
*
* A fourth alternative is to use a special feature of the string datatype class to set the size of the
* datatype to #H5T_VARIABLE. See item c in the figure above. The example below shows a
* declaration of a datatype of type #H5T_C_S1 which is set to #H5T_VARIABLE. The HDF5
* Library automatically translates between this and the hvl_t structure. Note: the #H5T_VARIABLE
* size can only be used with string datatypes.
*
* <table>
* <caption align=top>The storage layout for the four member datatypes</caption>
* <tr>
* <td>
* a) Compound type ‘s1_t’, size 16 bytes.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig20a.gif
* </td>
* </tr>
* <tr>
* <td>
* b) Compound type ‘s2_t’, size 8 bytes.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig20b.gif
* </td>
* </tr>
* <tr>
* <td>
* c) Array type ‘s3_tid’, 40 integers, total size 40 bytes.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig20c.gif
* </td>
* </tr>
* <tr>
* <td>
* d) String type ‘s4_tid’, size 25 bytes.
* </td>
* </tr>
* <tr>
* <td>
* \image html Dtypes_fig20d.gif
* </td>
* </tr>
* </table>
*
*
* <table>
* <caption align=top>Datatype APIs</caption>
* <tr>
* <th>Function</th>
* <th>Description</th>
* </tr>
* <tr>
* <td>
* \code
* hid_t H5Topen (hid_t location, const char *name, H5P_DEFAULT)
* \endcode
* </td>
* <td>
* A committed datatype can be opened by calling this function, which returns a datatype identifier
* on success or a negative value on failure. The identifier should eventually be released by calling
* #H5Tclose(). The committed datatype returned by this function is read-only.
* The location is either a file or group identifier.
* </td>
* </tr>
* <tr>
* <td>
* \code
* herr_t H5Tcommit (hid_t location, const char *name, hid_t type, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT)
* \endcode
* </td>
* <td>
* A transient datatype (not immutable) can be written to a file and turned into a committed datatype by
* calling this function. The location is either a file or group identifier and when combined with name
* refers to a new committed datatype.
* </td>
* </tr>
* <tr>
* <td>
* \code
* htri_t H5Tcommitted (hid_t type)
* \endcode
* </td>
* <td>
* A type can be queried to determine if it is a committed type or a transient type. If this function
* returns a positive value then the type is committed. Datasets which return committed datatypes with
* #H5Dget_type() are able to share the datatype with other datasets in the same file.
* </td>
* </tr>
* </table>
*
*
* \subsection subsec_datatype_transfer Data Transfer: Datatype Conversion and Selection
* When data is transferred (written or read), the storage layout of the data elements in the source may
* differ from the layout in the destination. For example, an integer might be stored on disk in big-endian
* byte order and read into memory with little-endian byte order. In this case, each data element will be
* transformed by the HDF5 Library during the data transfer.
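*
* The kind of per-element transformation involved can be illustrated with a byte-order
* conversion written in plain C. This is a sketch of the idea only and is not how the HDF5
* Library implements its conversions; the helper names load_be32 and store_le32 are
* hypothetical.
*
```c
#include <stdint.h>

// Assemble a 32-bit value from four bytes stored most significant byte first.
// The result does not depend on the byte order of the host.
uint32_t load_be32(const unsigned char bytes[4])
{
    return ((uint32_t)bytes[0] << 24) | ((uint32_t)bytes[1] << 16) |
           ((uint32_t)bytes[2] << 8)  |  (uint32_t)bytes[3];
}

// Write a 32-bit value into four bytes, least significant byte first.
void store_le32(uint32_t value, unsigned char bytes[4])
{
    for (int i = 0; i < 4; i++)
        bytes[i] = (unsigned char)(value >> (8 * i));
}
```
*
* Reading the big-endian bytes 12 34 56 78 (hexadecimal) yields the value 0x12345678, and storing
* that value in little-endian order produces the bytes 78 56 34 12.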
*
* The conversion of data elements is controlled by specifying the datatype of the source and
* specifying the intended datatype of the destination. The storage format on disk is the datatype
* specified when the dataset is created. The datatype of memory must be specified in the library
* call.
*
* In order to be convertible, the datatype of the source and destination must have the same
* datatype class (with the exception of enumeration type). Thus, integers can be converted to other
* integers, and floats to other floats, but integers cannot (yet) be converted to floats. For each
* atomic datatype class, the possible conversions are defined. An enumeration datatype can be
* converted to an integer or a floating-point number datatype.
*
* In general, any datatype can be converted to another datatype of the same datatype class. The
* HDF5 Library automatically converts all properties. If the destination is too small to hold the
* source value then an overflow or underflow exception occurs. If a handler is defined with the
* #H5Pset_type_conv_cb function, it will be called. Otherwise, a default action will be performed.
* The table below summarizes the default actions.
*
*