path: root/doc/html/Datatypes.html
Diffstat (limited to 'doc/html/Datatypes.html')
-rw-r--r-- doc/html/Datatypes.html 3114
1 files changed, 0 insertions, 3114 deletions
diff --git a/doc/html/Datatypes.html b/doc/html/Datatypes.html
deleted file mode 100644
index 232d7fb..0000000
--- a/doc/html/Datatypes.html
+++ /dev/null
@@ -1,3114 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <title>Datatype Interface (H5T)</title>
-
-<!-- #BeginLibraryItem "/ed_libs/styles_UG.lbi" -->
-<!--
- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
- * Copyright by the Board of Trustees of the University of Illinois. *
- * All rights reserved. *
- * *
- * This file is part of HDF5. The full HDF5 copyright notice, including *
- * terms governing use, modification, and redistribution, is contained in *
- * the files COPYING and Copyright.html. COPYING can be found at the root *
- * of the source code distribution tree; Copyright.html can be found at the *
- * root level of an installed copy of the electronic HDF5 document set and *
- * is linked from the top-level documents page. It can also be found at *
- * http://hdf.ncsa.uiuc.edu/HDF5/doc/Copyright.html. If you do not have *
- * access to either file, you may request a copy from hdfhelp@ncsa.uiuc.edu. *
- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
- -->
-
-<link href="ed_styles/UGelect.css" rel="stylesheet" type="text/css">
-<!-- #EndLibraryItem --></head>
-
- <body bgcolor="#FFFFFF">
-
-
-<!-- #BeginLibraryItem "/ed_libs/NavBar_UG.lbi" --><hr>
-<center>
-<table border=0 width=98%>
-<tr><td valign=top align=left>
- <a href="index.html">HDF5 documents and links</a>&nbsp;<br>
- <a href="H5.intro.html">Introduction to HDF5</a>&nbsp;<br>
- <a href="RM_H5Front.html">HDF5 Reference Manual</a>&nbsp;<br>
- <a href="http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/index.html">HDF5 User's Guide for Release 1.6</a>&nbsp;<br>
- <!--
- <a href="Glossary.html">Glossary</a><br>
- -->
-</td>
-<td valign=top align=right>
- And in this document, the
- <a href="H5.user.html"><strong>HDF5 User's Guide from Release 1.4.5:</strong></a>&nbsp;&nbsp;&nbsp;&nbsp;
- <br>
- <a href="Files.html">Files</a>&nbsp;&nbsp;
- <a href="Datasets.html">Datasets</a>&nbsp;&nbsp;
- <a href="Datatypes.html">Datatypes</a>&nbsp;&nbsp;
- <a href="Dataspaces.html">Dataspaces</a>&nbsp;&nbsp;
- <a href="Groups.html">Groups</a>&nbsp;&nbsp;
- <br>
- <a href="References.html">References</a>&nbsp;&nbsp;
- <a href="Attributes.html">Attributes</a>&nbsp;&nbsp;
- <a href="Properties.html">Property Lists</a>&nbsp;&nbsp;
- <a href="Errors.html">Error Handling</a>&nbsp;&nbsp;
- <br>
- <a href="Filters.html">Filters</a>&nbsp;&nbsp;
- <a href="Caching.html">Caching</a>&nbsp;&nbsp;
- <a href="Chunking.html">Chunking</a>&nbsp;&nbsp;
- <a href="MountingFiles.html">Mounting Files</a>&nbsp;&nbsp;
- <br>
- <a href="Performance.html">Performance</a>&nbsp;&nbsp;
- <a href="Debugging.html">Debugging</a>&nbsp;&nbsp;
- <a href="Environment.html">Environment</a>&nbsp;&nbsp;
- <a href="ddl.html">DDL</a>&nbsp;&nbsp;
-</td></tr>
-</table>
-</center>
-<hr>
-<!-- #EndLibraryItem --><h1>The Datatype Interface (H5T)</h1>
-
- <h2>1. Introduction</h2>
-
- <p>The datatype interface provides a mechanism to describe the
- storage format of individual data points of a dataset and is
- intended to allow new features to be added easily without
- disrupting applications that use the datatype interface. A
- dataset (the H5D interface) is composed of a collection of raw
- data points of homogeneous type organized according to the data
- space (the H5S interface).
-
- <p>A datatype is a collection of datatype properties, all of
- which can be stored on disk, and which when taken as a whole,
- provide complete information for data conversion to or from that
- datatype. The interface provides functions to set and query
- properties of a datatype.
-
- <p>A <em>data point</em> is an instance of a <em>datatype</em>,
- which is an instance of a <em>type class</em>. We have defined
- a set of type classes and properties which can be extended at a
- later time. The atomic type classes are those which describe
- types which cannot be decomposed at the datatype interface
- level; all other classes are compound.
-
- <h2>2. General Datatype Operations</h2>
-
- <p>The functions defined in this section operate on datatypes as
- a whole. New datatypes can be created from scratch or copied
- from existing datatypes. When a datatype is no longer needed
- its resources should be released by calling <code>H5Tclose()</code>.
-
- <p> Datatypes come in two flavors: named datatypes and transient
- datatypes. A named datatype is stored in a file while the
- transient flavor is independent of any file. Named datatypes
- are always read-only, but transient types come in three
- varieties: modifiable, read-only, and immutable. The difference
- between read-only and immutable types is that immutable types
- cannot be closed except when the entire library is closed (the
- predefined types like <code>H5T_NATIVE_INT</code> are immutable
- transient types).
-
- <dl>
- <dt><code>hid_t H5Tcreate (H5T_class_t <em>class</em>, size_t
- <em>size</em>)</code>
- <dd> Datatypes can be created by calling this
- function, where <em>class</em> is a datatype class
- identifier. However, the only class currently allowed is
- <code>H5T_COMPOUND</code> to create a new empty compound
- datatype where <em>size</em> is the total size in bytes of an
- instance of this datatype. Other datatypes are created with
- <code>H5Tcopy()</code>. All functions that return datatype
- identifiers return a negative value for failure.
-
- <br><br>
- <dt><code>hid_t H5Topen (hid_t <em>location</em>, const char
- *<em>name</em>)</code>
- <dd>A named datatype can be opened by calling this function,
- which returns a datatype identifier. The identifier should
- eventually be released by calling <code>H5Tclose()</code> to
- release resources. The named datatype returned by this
- function is read-only; a negative value is returned on
- failure. The <em>location</em> is either a file or group
- identifier.
-
- <br><br>
- <dt><code>herr_t H5Tcommit (hid_t <em>location</em>, const char
- *<em>name</em>, hid_t <em>type</em>)</code>
- <dd>A transient datatype (not immutable) can be committed to a
- file and turned into a named datatype by calling this
- function. The <em>location</em> is either a file or group
- identifier and when combined with <em>name</em> refers to a new
- named datatype.
-
- <br><br>
- <dt><code>htri_t H5Tcommitted (hid_t <em>type</em>)</code>
- <dd>A type can be queried to determine if it is a named type or
- a transient type. If this function returns a positive value
- then the type is named (that is, it has been committed perhaps
- by some other application). Datasets which return committed
- datatypes with <code>H5Dget_type()</code> are able to share
- the datatype with other datasets in the same file.
-
- <br><br>
- <dt><code>hid_t H5Tcopy (hid_t <em>type</em>)</code>
- <dd>This function returns a modifiable transient datatype
- which is a copy of <em>type</em> or a negative value for
- failure. If <em>type</em> is a dataset identifier then the type
- returned is a modifiable transient copy of the datatype of
- the specified dataset.
-
- <br><br>
- <dt><code>herr_t H5Tclose (hid_t <em>type</em>)</code>
- <dd>Releases resources associated with a datatype. The
- datatype identifier should not be subsequently used since the
- results would be unpredictable. It is illegal to close an
- immutable transient datatype.
-
- <br><br>
- <dt><code>htri_t H5Tequal (hid_t <em>type1</em>, hid_t
- <em>type2</em>)</code>
- <dd>Determines if two types are equal. If <em>type1</em> and
- <em>type2</em> are the same then this function returns
- <code>TRUE</code>, otherwise it returns <code>FALSE</code> (an
- error results in a negative return value).
-
- <br><br>
- <dt><code>herr_t H5Tlock (hid_t <em>type</em>)</code>
- <dd>A transient datatype can be locked, making it immutable
- (read-only and not closable). The library does this to all
- predefined types to prevent the application from inadvertently
- modifying or deleting (closing) them, but the application is
- also allowed to do this for its own datatypes. Immutable
- datatypes are closed when the library closes (either by
- <code>H5close()</code> or by normal program termination).
- </dl>
-
- <h2>3. Properties of Atomic Types</h2>
-
- <p>An atomic type is a type which cannot be decomposed into
- smaller units at the API level. All atomic types have a common
- set of properties which are augmented by properties specific to
- a particular type class. Some of these properties also apply to
- compound datatypes, but we discuss them only as they apply to
- atomic datatypes here. The properties and the functions that
- query and set their values are:
-
- <dl>
- <dt><code>H5T_class_t H5Tget_class (hid_t <em>type</em>)</code>
- <dd>This property holds one of the class names:
- <code>H5T_INTEGER, H5T_FLOAT, H5T_TIME, H5T_STRING, or
- H5T_BITFIELD</code>. This property is read-only and is set
- when the datatype is created or copied (see
- <code>H5Tcreate()</code>, <code>H5Tcopy()</code>). If this
- function fails it returns <code>H5T_NO_CLASS</code> which has
- a negative value (all other class constants are non-negative).
-
- <br><br>
- <dt><code>size_t H5Tget_size (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_size (hid_t <em>type</em>, size_t
- <em>size</em>)</code>
- <dd>This property is the total size of the datum in bytes, including
- padding which may appear on either side of the actual value.
- If this property is reset to a smaller value which would cause
- the significant part of the data to extend beyond the edge of
- the datatype then the <code>offset</code> property is
- decremented a bit at a time. If the offset reaches zero and
- the significant part of the data still extends beyond the edge
- of the datatype then the <code>precision</code> property is
- decremented a bit at a time. Decreasing the size of a
- datatype may fail if the <code>H5T_FLOAT</code> bit fields would
- extend beyond the significant part of the type. Adjusting the
- size of an <code>H5T_STRING</code> automatically adjusts the
- precision as well. On error, <code>H5Tget_size()</code>
- returns zero which is never a valid size.
-
- <br><br>
- <dt><code>H5T_order_t H5Tget_order (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_order (hid_t <em>type</em>, H5T_order_t
- <em>order</em>)</code>
- <dd>All atomic datatypes have a byte order which describes how
- the bytes of the datatype are laid out in memory. If the
- lowest memory address contains the least significant byte of
- the datum then it is said to be <em>little-endian</em> or
- <code>H5T_ORDER_LE</code>. If the bytes are in the opposite
- order then they are said to be <em>big-endian</em> or
- <code>H5T_ORDER_BE</code>. Some datatypes have the same byte
- order on all machines and are <code>H5T_ORDER_NONE</code>
- (like character strings). If <code>H5Tget_order()</code>
- fails then it returns <code>H5T_ORDER_ERROR</code> which is a
- negative value (all successful return values are
- non-negative).
-
- <br><br>
- <dt><code>size_t H5Tget_precision (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_precision (hid_t <em>type</em>, size_t
- <em>precision</em>)</code>
- <dd>Some datatypes occupy more bytes than what is needed to
- store the value. For instance, a <code>short</code> on a Cray
- is 32 significant bits in an eight-byte field. The
- <code>precision</code> property identifies the number of
- significant bits of a datatype and the <code>offset</code>
- property (defined below) identifies its location. The
- <code>size</code> property defined above represents the entire
- size (in bytes) of the datatype. If the precision is
- decreased then padding bits are inserted on the MSB side of
- the significant bits (this will fail for
- <code>H5T_FLOAT</code> types if it results in the sign,
- mantissa, or exponent bit field extending beyond the edge of
- the significant bit field). On the other hand, if the
- precision is increased so that it "hangs over" the edge of the
- total size then the <code>offset</code> property is
- decremented a bit at a time. If the <code>offset</code>
- reaches zero and the significant bits still hang over the
- edge, then the total size is increased a byte at a time. The
- precision of an <code>H5T_STRING</code> is read-only and is
- always eight times the value returned by
- <code>H5Tget_size()</code>. <code>H5Tget_precision()</code>
- returns zero on failure since zero is never a valid precision.
-
- <br><br>
- <dt><code>size_t H5Tget_offset (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_offset (hid_t <em>type</em>, size_t
- <em>offset</em>)</code>
- <dd>While the <code>precision</code> property defines the number
- of significant bits, the <code>offset</code> property defines
- the location of those bits within the entire datum. The bits
- of the entire datum are numbered beginning at zero at the least
- significant bit of the least significant byte (the byte at the
- lowest memory address for a little-endian type or the byte at
- the highest address for a big-endian type). The
- <code>offset</code> property defines the bit location of the
- least significant bit of a bit field whose length is
- <code>precision</code>. If the offset is increased so the
- significant bits "hang over" the edge of the datum, then the
- <code>size</code> property is automatically incremented. The
- offset is a read-only property of an <code>H5T_STRING</code>
- and is always zero. <code>H5Tget_offset()</code> returns zero
- on failure which is also a valid offset, but is guaranteed to
- succeed if a call to <code>H5Tget_precision()</code> succeeds
- with the same arguments.
-
- <br><br>
- <dt><code>herr_t H5Tget_pad (hid_t <em>type</em>, H5T_pad_t
- *<em>lsb</em>, H5T_pad_t *<em>msb</em>)</code>
- <dt><code>herr_t H5Tset_pad (hid_t <em>type</em>, H5T_pad_t
- <em>lsb</em>, H5T_pad_t <em>msb</em>)</code>
- <dd>The bits of a datum which are not significant as defined by
- the <code>precision</code> and <code>offset</code> properties
- are called <em>padding</em>. Padding falls into two
- categories: padding in the low-numbered bits is <em>lsb</em>
- padding and padding in the high-numbered bits is <em>msb</em>
- padding (bits are numbered according to the description for
- the <code>offset</code> property). Padding bits can always be
- set to zero (<code>H5T_PAD_ZERO</code>) or always set to one
- (<code>H5T_PAD_ONE</code>). The current pad types are returned
- through arguments of <code>H5Tget_pad()</code> either of which
- may be null pointers.
- </dl>
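- <p>As an illustration of how <code>precision</code> and
- <code>offset</code> select the significant bits of a datum, the
- following plain C sketch extracts them from a value held in a
- 64-bit integer. It does not call the HDF5 library; the helper
- name <code>significant_bits()</code> is hypothetical and only
- mirrors the bit numbering described above.

```c
#include <stdint.h>

/* Hypothetical helper, not part of HDF5: return the `precision`
 * significant bits that start at bit `offset` of `datum`, where bit 0
 * is the least significant bit. Bits outside that field are padding. */
uint64_t significant_bits(uint64_t datum, unsigned offset, unsigned precision)
{
    if (offset >= 64 || precision == 0)
        return 0;                      /* nothing to extract */
    datum >>= offset;                  /* drop the lsb padding */
    if (precision >= 64)
        return datum;                  /* avoid an undefined 64-bit shift */
    return datum & ((UINT64_C(1) << precision) - 1);  /* drop the msb padding */
}
```

- <p>With an offset of 8 and a precision of 16, for instance, the
- low byte is treated as lsb padding and everything above bit 23
- as msb padding.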
-
- <h3>3.1. Properties of Integer Atomic Types</h3>
-
- <p>Integer atomic types (<code>class=H5T_INTEGER</code>)
- describe integer number formats. Such types include the
- following information which describes the type completely and
- allows conversion between various integer atomic types.
-
- <dl>
- <dt><code>H5T_sign_t H5Tget_sign (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_sign (hid_t <em>type</em>, H5T_sign_t
- <em>sign</em>)</code>
- <dd>Integer data can be signed two's complement
- (<code>H5T_SGN_2</code>) or unsigned
- (<code>H5T_SGN_NONE</code>). Whether data is signed or not
- becomes important when converting between two integer
- datatypes of differing sizes as it determines how values are
- truncated and sign extended.
- </dl>
-
- <h3>3.2. Properties of Floating-point Atomic Types</h3>
-
- <p>The library supports floating-point atomic types
- (<code>class=H5T_FLOAT</code>) as long as the bits of the
- exponent are contiguous and stored as a biased positive number,
- the bits of the mantissa are contiguous and stored as a positive
- magnitude, and a sign bit exists which is set for negative
- values. Properties specific to floating-point types are:
-
- <dl>
- <dt><code>herr_t H5Tget_fields (hid_t <em>type</em>, size_t
- *<em>spos</em>, size_t *<em>epos</em>, size_t
- *<em>esize</em>, size_t *<em>mpos</em>, size_t
- *<em>msize</em>)</code>
- <dt><code>herr_t H5Tset_fields (hid_t <em>type</em>, size_t
- <em>spos</em>, size_t <em>epos</em>, size_t <em>esize</em>,
- size_t <em>mpos</em>, size_t <em>msize</em>)</code>
- <dd>A floating-point datum has bit fields which are the exponent
- and mantissa as well as a mantissa sign bit. These properties
- define the location (bit position of least significant bit of
- the field) and size (in bits) of each field. The bit
- positions are numbered beginning at zero at the beginning of
- the significant part of the datum (see the descriptions of the
- <code>precision</code> and <code>offset</code>
- properties). The sign bit is always of length one and none of
- the fields are allowed to overlap. When expanding a
- floating-point type one should set the precision first; when
- decreasing the size one should set the field positions and
- sizes first.
-
- <br><br>
- <dt><code>size_t H5Tget_ebias (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_ebias (hid_t <em>type</em>, size_t
- <em>ebias</em>)</code>
- <dd>The exponent is stored as a non-negative value which is
- <code>ebias</code> larger than the true exponent.
- <code>H5Tget_ebias()</code> returns zero on failure which is
- also a valid exponent bias, but the function is guaranteed to
- succeed if <code>H5Tget_precision()</code> succeeds when
- called with the same arguments.
-
- <br><br>
- <dt><code>H5T_norm_t H5Tget_norm (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_norm (hid_t <em>type</em>, H5T_norm_t
- <em>norm</em>)</code>
- <dd>This property determines the normalization method of the
- mantissa.
- <ul>
- <li>If the value is <code>H5T_NORM_MSBSET</code> then the
- mantissa is shifted left (if non-zero) until the first bit
- after the radix point is set and the exponent is adjusted
- accordingly. All bits of the mantissa after the radix
- point are stored.
-
- <li>If its value is <code>H5T_NORM_IMPLIED</code> then the
- mantissa is shifted left (if non-zero) until the first bit
- after the radix point is set and the exponent is adjusted
- accordingly. The first bit after the radix point is not stored
- since it's always set.
-
- <li>If its value is <code>H5T_NORM_NONE</code> then the fractional
- part of the mantissa is stored without normalizing it.
- </ul>
-
- <br><br>
- <dt><code>H5T_pad_t H5Tget_inpad (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_inpad (hid_t <em>type</em>, H5T_pad_t
- <em>inpad</em>)</code>
- <dd>If any internal bits (that is, bits between the sign bit,
- the mantissa field, and the exponent field but within the
- precision field) are unused, then they will be filled
- according to the value of this property. The <em>inpad</em>
- argument can be <code>H5T_PAD_ZERO</code> if the internal
- padding should always be set to zero, or <code>H5T_PAD_ONE</code>
- if it should always be set to one.
- <code>H5Tget_inpad()</code> returns <code>H5T_PAD_ERROR</code>
- on failure which is a negative value (successful return is
- always non-negative).
- </dl>
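- <p>The field properties can be made concrete with the familiar
- IEEE 754 single-precision layout: a sign bit at position 31, an
- 8-bit exponent at position 23 with a bias of 127, and a 23-bit
- mantissa at position 0 with an implied leading bit. The sketch
- below is plain C, not HDF5 code; the struct and function names
- are hypothetical, and it assumes the host <code>float</code> is
- a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Field layout of an IEEE 754 single-precision value, described with
 * the quantities that H5Tget_fields() and H5Tget_ebias() report.
 * The struct and function names are hypothetical, not HDF5 API. */
struct float_fields {
    unsigned sign;          /* 1 bit at position 31 (spos)                 */
    unsigned exponent;      /* 8 bits at position 23 (epos, esize), biased */
    uint32_t mantissa;      /* 23 bits at position 0 (mpos, msize)         */
    int      true_exponent; /* stored exponent minus the bias (ebias 127)  */
};

struct float_fields split_float(float value)
{
    uint32_t bits;
    struct float_fields f;

    /* Assumption: float is a 32-bit IEEE 754 type on this machine. */
    memcpy(&bits, &value, sizeof bits);
    f.sign = (bits >> 31) & 0x1u;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    f.true_exponent = (int)f.exponent - 127;
    return f;
}
```

- <p>Because this layout uses implied normalization, the leading
- one of the mantissa is not among the 23 stored bits.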
-
- <h3>3.3. Properties of Date and Time Atomic Types</h3>
-
- <p>Dates and times (<code>class=H5T_TIME</code>) are stored as
- character strings in one of the ISO-8601 formats like
- "<em>1997-12-05 16:25:30</em>"; as character strings using the
- Unix asctime(3) format like "<em>Thu Dec 05 16:25:30 1997</em>";
- as an integer value by juxtaposition of the year, month, and
- day-of-month, hour, minute and second in decimal like
- <em>19971205162530</em>; as an integer value in Unix time(2)
- format; or other variations.
-
- <h3>3.4. Properties of Character String Atomic Types</h3>
-
- <p>Fixed-length character string types are used to store textual
- information. The <code>offset</code> property of a string is
- always zero and the <code>precision</code> property is eight
- times as large as the value returned by
- <code>H5Tget_size()</code> (since precision is measured in bits
- while size is measured in bytes). Both properties are
- read-only.
-
- <dl>
- <dt><code>H5T_cset_t H5Tget_cset (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_cset (hid_t <em>type</em>, H5T_cset_t
- <em>cset</em>)</code>
- <dd>HDF5 is able to distinguish between character sets of
- different nationalities and to convert between them to the
- extent possible. The only character set currently supported
- is <code>H5T_CSET_ASCII</code>.
-
- <br><br>
- <dt><code>H5T_str_t H5Tget_strpad (hid_t <em>type</em>)</code>
- <dt><code>herr_t H5Tset_strpad (hid_t <em>type</em>, H5T_str_t
- <em>strpad</em>)</code>
- <dd>The method used to store character strings differs with the
- programming language: C usually null terminates strings while
- Fortran left-justifies and space-pads strings. This property
- defines the storage mechanism and can be one of the following:
-
- <p>
- <dl>
- <dt><code>H5T_STR_NULLTERM</code>
- <dd>A C-style string which is guaranteed to be null
- terminated. When converting from a longer string the
- value will be truncated and then a null character
- appended.
-
- <br><br>
- <dt><code>H5T_STR_NULLPAD</code>
- <dd>A C-style string which is padded with null characters
- but not necessarily null terminated. Conversion from a
- long string to a shorter <code>H5T_STR_NULLPAD</code>
- string will truncate but not null terminate. Conversion
- from a short value to a longer value will append null
- characters as with <code>H5T_STR_NULLTERM</code>.
-
- <br><br>
- <dt><code>H5T_STR_SPACEPAD</code>
- <dd>A Fortran-style string which is padded with space
- characters. This is the same as
- <code>H5T_STR_NULLPAD</code> except the padding character
- is a space instead of a null.
- </dl>
-
- <p><code>H5Tget_strpad()</code> returns
- <code>H5T_STR_ERROR</code> on failure, a negative value (all
- successful return values are non-negative).
- </dl>
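- <p>The three storage mechanisms differ only in how a value is
- truncated and padded to fit a fixed-length field. The plain C
- sketch below imitates that behavior for a C string source. It is
- not the library's conversion code, and the enum and function
- names are hypothetical stand-ins rather than HDF5 identifiers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three storage mechanisms; these are
 * not the HDF5 H5T_str_t constants. */
enum pad_kind { PAD_NULLTERM, PAD_NULLPAD, PAD_SPACEPAD };

/* Store the C string `src` in a fixed-length field of `size` bytes. */
void store_fixed(char *dst, size_t size, const char *src, enum pad_kind kind)
{
    size_t len = strlen(src);
    size_t room;
    size_t i;

    if (size == 0)
        return;
    /* A null-terminated field reserves its last byte for the terminator. */
    room = (kind == PAD_NULLTERM) ? size - 1 : size;
    if (len > room)
        len = room;                               /* truncate long values */
    memcpy(dst, src, len);
    for (i = len; i < size; i++)                  /* pad short values */
        dst[i] = (kind == PAD_SPACEPAD) ? ' ' : '\0';
}
```

- <p>Storing a five-character value in a four-byte field keeps
- three characters plus a terminator in the null-terminated case
- and four characters with no terminator in the null-padded case.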
-
- <h3>3.5. Properties of Bit Field Atomic Types</h3>
-
- <p>Converting a bit field (<code>class=H5T_BITFIELD</code>) from
- one type to another simply copies the significant bits. If the
- destination is smaller than the source then bits are truncated.
- Otherwise new bits are filled according to the <code>msb</code>
- padding type.
-
- <h3>3.6. Character and String Datatype Issues</h3>
-
- The <code>H5T_NATIVE_CHAR</code> and <code>H5T_NATIVE_UCHAR</code>
- datatypes are actually numeric data (1-byte integers). If the
- application wishes to store character data, then an HDF5
- <em>string</em> datatype should be derived from
- <code>H5T_C_S1</code> instead.
-
- <h4>Motivation</h4>
-
- HDF5 defines at least three classes of datatypes:
- integer data, floating point data, and character data.
- However, the C language defines only integer and
- floating point datatypes; character data in C is
- overloaded on the 8- or 16-bit integer types and
- character strings are overloaded on arrays of those
- integer types which, by convention, are terminated with
- a zero element.
-
- In C, the variable <code>unsigned char s[256]</code> is
- either an array of numeric data, a single character string
- with at most 255 characters, or an array of 256 characters,
- depending entirely on usage. For uniformity with the
- other <code>H5T_NATIVE_</code> types, HDF5 uses the
- numeric interpretation of <code>H5T_NATIVE_CHAR</code>
- and <code>H5T_NATIVE_UCHAR</code>.
-
-
- <h4>Usage</h4>
-
- To store <code>unsigned char s[256]</code> data as an
- array of integer values, use the HDF5 datatype
- <code>H5T_NATIVE_UCHAR</code> and a data space that
- describes the 256-element array. Some other application
- that reads the data will then be able to read, say, a
- 256-element array of 2-byte integers and HDF5 will
- perform the numeric translation.
-
- To store <code>unsigned char s[256]</code> data as a
- character string, derive a fixed length string datatype
- from <code>H5T_C_S1</code> by increasing its size to
- 256 characters. Some other application that reads the
- data will be able to read, say, a space padded string
- of 16-bit characters and HDF5 will perform the character
- and padding translations.
-
- <pre>
- /* Copy the one-byte C string type, then widen it to 256 bytes. */
- hid_t s256 = H5Tcopy(H5T_C_S1);
- H5Tset_size(s256, 256);
- </pre>
-
- To store <code>unsigned char s[256]</code> data as
- an array of 256 ASCII characters, use an
- HDF5 data space to describe the array and derive a
- one-character string type from <code>H5T_C_S1</code>.
- Some other application will be able to read a subset
- of the array as 16-bit characters and HDF5 will
- perform the character translations.
- The <code>H5T_STR_NULLPAD</code> padding is necessary because
- if <code>H5T_STR_NULLTERM</code> were used
- (the default) then the single character of storage
- would be for the null terminator and no useful data
- would actually be stored (unless the length were
- incremented to more than one character).
-
- <pre>
- /* One-character string; null padding leaves the byte free for data. */
- hid_t s1 = H5Tcopy(H5T_C_S1);
- H5Tset_strpad(s1, H5T_STR_NULLPAD);
- </pre>
-
- <h4>Summary</h4>
-
- The C language uses the term <code>char</code> to
- represent one-byte numeric data and does not make
- character strings a first-class datatype.
- HDF5 makes a distinction between integer and
- character data and maps the C <code>signed char</code>
- (<code>H5T_NATIVE_CHAR</code>) and
- <code>unsigned char</code> (<code>H5T_NATIVE_UCHAR</code>)
- datatypes to the HDF5 integer type class.
-
- <h2>4. Properties of Opaque Types</h2>
-
- <p>Opaque types (<code>class=H5T_OPAQUE</code>) provide the
- application with a mechanism for describing data which cannot be
- otherwise described by HDF5. The only properties associated with
- opaque types are a size in bytes and an ASCII tag which is
- manipulated with <code>H5Tset_tag()</code> and
- <code>H5Tget_tag()</code> functions. The library contains no
- predefined conversion functions but the application is free to
- register conversions between any two opaque types or between an
- opaque type and some other type.
-
- <h2>5. Properties of Compound Types</h2>
-
- <p>A compound datatype is similar to a <code>struct</code> in C
- or a common block in Fortran: it is a collection of one or more
- atomic types or small arrays of such types. Each
- <em>member</em> of a compound type has a name which is unique
- within that type, and a byte offset that determines the first
- byte (smallest byte address) of that member in a compound datum.
- A compound datatype has the following properties:
-
- <dl>
- <dt><code>H5T_class_t H5Tget_class (hid_t <em>type</em>)</code>
- <dd>All compound datatypes belong to the type class
- <code>H5T_COMPOUND</code>. This property is read-only and is
- defined when a datatype is created or copied (see
- <code>H5Tcreate()</code> or <code>H5Tcopy()</code>).
-
- <br><br>
- <dt><code>size_t H5Tget_size (hid_t <em>type</em>)</code>
- <dd>Compound datatypes have a total size in bytes which is
- returned by this function. All members of a compound
- datatype must exist within this size. A value of zero is returned
- for failure; all successful return values are positive.
-
- <br><br>
- <dt><code>int H5Tget_nmembers (hid_t <em>type</em>)</code>
- <dd>A compound datatype consists of zero or more members
- (defined in any order) with unique names and which occupy
- non-overlapping regions within the datum. In the functions
- that follow, individual members are referenced by an index
- number between zero and <em>N</em>-1, inclusive, where
- <em>N</em> is the value returned by this function.
- <code>H5Tget_nmembers()</code> returns -1 on failure.
-
- <br><br>
- <dt><code>char *H5Tget_member_name (hid_t <em>type</em>, unsigned
- <em>membno</em>)</code>
- <dd>Each member has a name which is unique among its siblings in
- a compound datatype. This function returns a pointer to a
- null-terminated copy of the name allocated with
- <code>malloc()</code> or the null pointer on failure. The
- caller is responsible for freeing the memory returned by this
- function.
-
- <br><br>
- <dt><code>size_t H5Tget_member_offset (hid_t <em>type</em>, unsigned
- <em>membno</em>)</code>
- <dd>The byte offset of member number <em>membno</em> with
- respect to the beginning of the containing compound datum is
- returned by this function. A zero is returned on failure
- which is also a valid offset, but this function is guaranteed
- to succeed if a call to <code>H5Tget_member_class()</code>
- succeeds when called with the same <em>type</em> and
- <em>membno</em> arguments.
-
- <br><br>
- <dt><code>hid_t H5Tget_member_type (hid_t <em>type</em>, unsigned
- <em>membno</em>)</code>
- <dd>Each member has its own datatype, a copy of which is
- returned by this function. The returned datatype identifier
- should be released by eventually calling
- <code>H5Tclose()</code> on that type.
- </dl>
-
- <p>Properties of members of a compound datatype are
- defined when the member is added to the compound type (see
- <code>H5Tinsert()</code>) and cannot be subsequently modified.
- This makes it impossible to define recursive data structures.
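- <p>Since a compound datatype resembles a C <code>struct</code>,
- the byte offsets of its members can be taken from the compiler
- with <code>offsetof</code>. The plain C sketch below gathers the
- offsets for an example record and checks the two layout rules
- stated above: every member lies within the total size and no two
- members overlap. The record and the helper
- <code>members_fit()</code> are hypothetical and do not call the
- HDF5 library.

```c
#include <stddef.h>

/* Example record; its layout is chosen by the C compiler. */
struct sample {
    int    id;
    double value;
    char   tag[4];
};

/* Byte offsets a compound datatype would record for each member. */
static const size_t sample_offsets[3] = {
    offsetof(struct sample, id),
    offsetof(struct sample, value),
    offsetof(struct sample, tag),
};

/* Member sizes in bytes, in the same order as sample_offsets. */
static const size_t sample_sizes[3] = {
    sizeof(int), sizeof(double), sizeof(char[4]),
};

/* Return 1 if every member lies within `total` bytes and no two
 * members overlap; return 0 otherwise. */
int members_fit(const size_t *off, const size_t *len, int n, size_t total)
{
    int i, j;

    for (i = 0; i < n; i++) {
        if (off[i] + len[i] > total)
            return 0;                 /* member extends past the datum */
        for (j = i + 1; j < n; j++)
            if (off[i] < off[j] + len[j] && off[j] < off[i] + len[i])
                return 0;             /* two members share bytes */
    }
    return 1;
}
```

- <p>The actual offset values depend on the compiler's alignment
- rules, which is why they are computed rather than written by hand.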
-
-
- <a name="DTypes-PredefinedAtomic">
- <h2>6. Predefined Atomic Datatypes</h2>
- </a>
-
- <p>The library predefines a modest number of datatypes having
- names like <code>H5T_<em>arch</em>_<em>base</em></code> where
- <em>arch</em> is an architecture name and <em>base</em> is a
- programming type name. New types can be derived from the
- predefined types by copying the predefined type (see
- <code>H5Tcopy()</code>) and then modifying the result.
-
- <p>
- <center>
- <table align=center width="80%">
- <tr>
- <th align=left width="20%">Architecture Name</th>
- <th align=left width="80%">Description</th>
- </tr>
-
- <tr valign=top>
- <td><code>IEEE</code></td>
- <td>This architecture defines standard floating point
- types in various byte orders.</td>
- </tr>
-
- <tr valign=top>
- <td><code>STD</code></td>
- <td>This is an architecture that contains semi-standard
- datatypes like signed two's complement integers,
- unsigned integers, and bitfields in various byte
- orders.</td>
- </tr>
-
- <tr valign=top>
- <td><code>UNIX</code></td>
- <td>Types which are specific to Unix operating systems are
- defined in this architecture. The only type currently
- defined is the Unix date and time type
- (<code>time_t</code>).</td>
- </tr>
-
- <tr valign=top>
- <td><code>C<br>FORTRAN</code></td>
- <td>Types which are specific to the C or Fortran
- programming languages are defined in these
- architectures. For instance, <code>H5T_C_S1</code>
- defines a base string type with null termination which
- can be used to derive string types of other
- lengths.</td>
- </tr>
-
- <tr valign=top>
- <td><code>NATIVE</code></td>
- <td>This architecture contains C-like datatypes for the
- machine on which the library was compiled. The types
- were actually defined by running the
- <code>H5detect</code> program when the library was
- compiled. In order to be portable, applications should
- almost always use this architecture to describe things
- in memory.</td>
- </tr>
-
- <tr valign=top>
- <td><code>CRAY</code></td>
- <td>Cray architectures. These are word-addressable,
- big-endian systems with non-IEEE floating point.</td>
- </tr>
-
- <tr valign=top>
- <td><code>INTEL</code></td>
- <td>All Intel and compatible CPUs including 80286, 80386,
- 80486, Pentium, Pentium-Pro, and Pentium-II. These are
- little-endian systems with IEEE floating-point.</td>
- </tr>
-
- <tr valign=top>
- <td><code>MIPS</code></td>
- <td>All MIPS CPUs commonly used in SGI systems. These
- are big-endian systems with IEEE floating-point.</td>
- </tr>
-
- <tr valign=top>
- <td><code>ALPHA</code></td>
- <td>All DEC Alpha CPUs, little-endian systems with IEEE
- floating-point.</td>
- </tr>
- </table>
- </center>
-
- <p>The base name of most types consists of a letter, a precision
- in bits, and an indication of the byte order. The letters are:
-
- <p>
- <center>
- <table border align=center width="40%">
- <tr>
- <td align=center width="30%">B</td>
- <td width="70%">Bitfield</td>
- </tr>
- <tr>
- <td align=center>D</td>
- <td>Date and time</td>
- </tr>
- <tr>
- <td align=center>F</td>
- <td>Floating point</td>
- </tr>
- <tr>
- <td align=center>I</td>
- <td>Signed integer</td>
- </tr>
- <tr>
- <td align=center>R</td>
- <td>References</td>
- </tr>
- <tr>
- <td align=center>S</td>
- <td>Character string</td>
- </tr>
- <tr>
- <td align=center>U</td>
- <td>Unsigned integer</td>
- </tr>
- </table>
- </center>
-
- <p>The byte order is a two-letter sequence:
-
- <p>
- <center>
- <table border align=center width="40%">
- <tr>
- <td align=center width="30%">BE</td>
- <td width="70%">Big endian</td>
- </tr>
- <tr>
- <td align=center>LE</td>
- <td>Little endian</td>
- </tr>
- <tr>
- <td align=center>VX</td>
- <td>Vax order</td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table align=center width="80%">
- <tr>
- <th align=left><br><br>Example</th>
- <th align=left><br><br>Description</th>
- </tr>
-
- <tr valign=top>
- <td><code>H5T_IEEE_F64LE</code></td>
- <td>Eight-byte, little-endian, IEEE floating-point</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_IEEE_F32BE</code></td>
- <td>Four-byte, big-endian, IEEE floating point</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_STD_I32LE</code></td>
- <td>Four-byte, little-endian, signed two's complement integer</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_STD_U16BE</code></td>
- <td>Two-byte, big-endian, unsigned integer</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_UNIX_D32LE</code></td>
- <td>Four-byte, little-endian, time_t</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_C_S1</code></td>
- <td>One-byte, null-terminated string of eight-bit characters</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_INTEL_B64</code></td>
- <td>Eight-byte bit field on an Intel CPU</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_CRAY_F64</code></td>
- <td>Eight-byte Cray floating point</td>
- </tr>
- <tr valign=top>
- <td><code>H5T_STD_ROBJ</code></td>
- <td>Reference to an entire object in a file</td>
- </tr>
- </table>
- </center>
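-
- <p>The naming rule described above (a letter, a precision in bits,
- and an optional byte order) can be sketched as a small parser. This
- is only an illustration: <code>base_name_t</code> and
- <code>parse_base_name()</code> are hypothetical names that are not
- part of the HDF5 API, and names that do not follow the rule, such as
- <code>ROBJ</code>, are simply rejected.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type, not part of HDF5: the pieces of a base name. */
typedef struct {
    char letter;     /* B, D, F, I, R, S, or U */
    int  precision;  /* precision in bits */
    char order[3];   /* "BE", "LE", "VX", or "" when no order is given */
} base_name_t;

/* Splits a base name such as "F64LE" into letter, precision, and order.
 * Returns 0 on success and -1 if the name does not follow the rule. */
static int parse_base_name(const char *name, base_name_t *out)
{
    char *end;
    long bits;

    if (name == NULL || out == NULL || name[0] == '\0')
        return -1;
    if (strchr("BDFIRSU", name[0]) == NULL)
        return -1;
    if (!isdigit((unsigned char)name[1]))
        return -1;

    bits = strtol(name + 1, &end, 10);
    if (bits <= 0 || bits > 65535)
        return -1;

    /* The byte order is optional, e.g. "B64" in H5T_INTEL_B64. */
    if (*end != '\0' && strcmp(end, "BE") != 0 &&
        strcmp(end, "LE") != 0 && strcmp(end, "VX") != 0)
        return -1;

    out->letter = name[0];
    out->precision = (int)bits;
    strcpy(out->order, end);  /* at most two characters plus the terminator */
    return 0;
}
```

- <p>Under these assumptions, <code>"F64LE"</code> yields the letter
- <code>F</code>, a precision of 64 bits, and the order <code>LE</code>.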
-
- <p>The <code>NATIVE</code> architecture has base names which don't
- follow the same rules as the others. Instead, native type names
- are similar to the C type names. Here are some examples:
-
- <p>
- <center>
- <table align=center width="80%">
- <tr>
- <th align=left><br><br>Example</th>
- <th align=left><br><br>Corresponding C Type</th>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_CHAR</code></td>
- <td><code>char</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_SCHAR</code></td>
- <td><code>signed char</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_UCHAR</code></td>
- <td><code>unsigned char</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_SHORT</code></td>
- <td><code>short</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_USHORT</code></td>
- <td><code>unsigned short</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_INT</code></td>
- <td><code>int</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_UINT</code></td>
- <td><code>unsigned</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_LONG</code></td>
- <td><code>long</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_ULONG</code></td>
- <td><code>unsigned long</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_LLONG</code></td>
- <td><code>long long</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_ULLONG</code></td>
- <td><code>unsigned long long</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_FLOAT</code></td>
- <td><code>float</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_DOUBLE</code></td>
- <td><code>double</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_LDOUBLE</code></td>
- <td><code>long double</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_HSIZE</code></td>
- <td><code>hsize_t</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_HSSIZE</code></td>
- <td><code>hssize_t</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_HERR</code></td>
- <td><code>herr_t</code></td>
- </tr>
- <tr>
- <td><code>H5T_NATIVE_HBOOL</code></td>
- <td><code>hbool_t</code></td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A 128-bit
- integer</h4></caption>
- <tr>
- <td>
- <p>To create a 128-bit, little-endian signed integer
- type one could use the following (increasing the
- precision of a type automatically increases the total
- size):
-
- <p><code><pre>
-hid_t new_type = H5Tcopy (H5T_NATIVE_INT);
-H5Tset_precision (new_type, 128);
-H5Tset_order (new_type, H5T_ORDER_LE);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: An 80-character
- string</h4></caption>
- <tr>
- <td>
- <p>To create an 80-byte null terminated string type one
- might do this (the offset of a character string is
- always zero and the precision is adjusted
- automatically to match the size):
-
- <p><code><pre>
-hid_t str80 = H5Tcopy (H5T_C_S1);
-H5Tset_size (str80, 80);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>A complete list of the datatypes predefined in HDF5 can be found in
- <a href="PredefDTypes.html"><cite>HDF5 Predefined Datatypes</cite></a>
- in the <a href="RM_H5Front.html"><cite>HDF5 Reference Manual</cite></a>.
-
-
- <h2>7. Defining Compound Datatypes</h2>
-
- <p>Unlike atomic datatypes which are derived from other atomic
- datatypes, compound datatypes are created from scratch. First,
- one creates an empty compound datatype and specifies its total
- size. Then members are added to the compound datatype in any
- order.
-
- <p>Usually a C struct will be defined to hold a data point in
- memory, and the offsets of the members in memory will be the
- offsets of the struct members from the beginning of an instance
- of the struct.
-
- <dl>
- <dt><code>HOFFSET(s,m)</code>
- <dd>This macro computes the offset of member <em>m</em> within
- a struct <em>s</em>.
- <dt><code>offsetof(s,m)</code>
- <dd>This macro defined in <code>stddef.h</code> does
- exactly the same thing as the <code>HOFFSET()</code> macro.
- </dl>
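-
- <p>As a minimal sketch of what these macros compute, the fragment
- below uses the standard <code>offsetof</code> macro on a two-member
- struct. <code>MEMBER_OFFSET</code>, <code>offset_of_re()</code>, and
- <code>offset_of_im()</code> are illustrative names only;
- <code>MEMBER_OFFSET</code> stands in for <code>HOFFSET</code> on the
- assumption, stated above, that both do the same thing.

```c
#include <stddef.h>

/* A struct holding one complex number in memory. */
typedef struct {
    double re;  /* real part */
    double im;  /* imaginary part */
} complex_t;

/* Illustrative stand-in for HOFFSET, assumed equivalent to offsetof. */
#define MEMBER_OFFSET(S, M) offsetof(S, M)

/* Byte offset of each member from the start of a complex_t instance. */
static size_t offset_of_re(void) { return MEMBER_OFFSET(complex_t, re); }
static size_t offset_of_im(void) { return MEMBER_OFFSET(complex_t, im); }
```

- <p>The first member always sits at offset zero; the offset of the
- second depends on the compiler's alignment rules, which is why the
- offsets are computed rather than written by hand.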
-
- <p>Each member must have a descriptive name which is the
- key used to uniquely identify the member within the compound
- datatype. A member name in an HDF5 datatype does not
- necessarily have to be the same as the name of the member in the
- C struct, although this is often the case. Nor does one need to
- define all members of the C struct in the HDF5 compound
- datatype (or vice versa).
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A simple struct</h4></caption>
- <tr>
- <td>
- <p>An HDF5 datatype is created to describe complex
- numbers whose type is defined by the
- <code>complex_t</code> struct.
-
- <p><code><pre>
-typedef struct {
- double re; /*real part*/
- double im; /*imaginary part*/
-} complex_t;
-
-hid_t complex_id = H5Tcreate (H5T_COMPOUND, sizeof(complex_t));
-H5Tinsert (complex_id, "real", HOFFSET(complex_t,re),
- H5T_NATIVE_DOUBLE);
-H5Tinsert (complex_id, "imaginary", HOFFSET(complex_t,im),
- H5T_NATIVE_DOUBLE);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>Member alignment is handled by the <code>HOFFSET</code>
- macro. However, data stored on disk does not require alignment,
- so unaligned versions of compound data structures can be created
- to improve space efficiency on disk. These unaligned compound
- datatypes can be created by computing offsets by hand to
- eliminate inter-member padding, or the members can be packed by
- calling <code>H5Tpack()</code> (which modifies a datatype
- directly, so it is usually preceded by a call to
- <code>H5Tcopy()</code>):
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A packed struct</h4></caption>
- <tr>
- <td>
- <p>This example shows how to create a disk version of a
- compound datatype in order to store data on disk in
- as compact a form as possible. Packed compound
- datatypes should generally not be used to describe memory
- as they may violate alignment constraints for the
- architecture being used. Note also that using a
- packed datatype for disk storage may involve a higher
- data conversion cost.
- <p><code><pre>
-hid_t complex_disk_id = H5Tcopy (complex_id);
-H5Tpack (complex_disk_id);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
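-
- <p>The effect of removing inter-member padding can be sketched
- without the library. The fragment below is not how
- <code>H5Tpack()</code> is implemented; <code>member_t</code>,
- <code>sample_t</code>, and <code>pack_offsets()</code> are
- hypothetical names, and the members are assumed to be listed in
- order of increasing offset.

```c
#include <stddef.h>

/* A struct in which most compilers insert padding after `tag`. */
typedef struct {
    char   tag;
    double value;
} sample_t;

/* Hypothetical description of one member: byte offset and size. */
typedef struct {
    size_t offset;
    size_t size;
} member_t;

/* Rewrites the offsets so that each member starts where the previous
 * one ends, and returns the total size of the packed layout. */
static size_t pack_offsets(member_t *members, size_t n)
{
    size_t next = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        members[i].offset = next;
        next += members[i].size;
    }
    return next;
}
```

- <p>For <code>sample_t</code> the packed size is
- <code>sizeof(char) + sizeof(double)</code>, which is never larger
- than <code>sizeof(sample_t)</code>.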
-
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A flattened struct</h4></caption>
- <tr>
- <td>
- <p>Compound datatypes that have a compound datatype
- member can be handled two ways. This example shows
- that the compound datatype can be flattened,
- resulting in a compound type with only atomic
- members.
-
- <p><code><pre>
-typedef struct {
- complex_t x;
- complex_t y;
-} surf_t;
-
-hid_t surf_id = H5Tcreate (H5T_COMPOUND, sizeof(surf_t));
-H5Tinsert (surf_id, "x-re", HOFFSET(surf_t,x.re),
- H5T_NATIVE_DOUBLE);
-H5Tinsert (surf_id, "x-im", HOFFSET(surf_t,x.im),
- H5T_NATIVE_DOUBLE);
-H5Tinsert (surf_id, "y-re", HOFFSET(surf_t,y.re),
- H5T_NATIVE_DOUBLE);
-H5Tinsert (surf_id, "y-im", HOFFSET(surf_t,y.im),
- H5T_NATIVE_DOUBLE);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A nested struct</h4></caption>
- <tr>
- <td>
- <p>However, when the <code>complex_t</code> is used
- often it becomes inconvenient to list its members over
- and over again. So the alternative approach to
- flattening is to define a compound datatype and then
- use it as the type of the compound members, as is done
- here (the typedefs are defined in the previous
- examples).
-
- <p><code><pre>
-hid_t complex_id, surf_id; /*hdf5 datatypes*/
-
-complex_id = H5Tcreate (H5T_COMPOUND, sizeof(complex_t));
-H5Tinsert (complex_id, "re", HOFFSET(complex_t,re),
- H5T_NATIVE_DOUBLE);
-H5Tinsert (complex_id, "im", HOFFSET(complex_t,im),
- H5T_NATIVE_DOUBLE);
-
-surf_id = H5Tcreate (H5T_COMPOUND, sizeof(surf_t));
-H5Tinsert (surf_id, "x", HOFFSET(surf_t,x), complex_id);
-H5Tinsert (surf_id, "y", HOFFSET(surf_t,y), complex_id);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
-
-
- <a name="Datatypes_Enum">&nbsp;</a>
- <h2>8. Enumeration Datatypes</h2>
-
- <h3>8.1. Introduction</h3>
-
- <p>An HDF enumeration datatype is a 1:1 mapping between a set of
- symbols and a set of integer values, and an order is imposed on
- the symbols by their integer values. The symbols are passed
- between the application and library as character strings and all
- the values for a particular enumeration type are of the same
- integer type, which is not necessarily a native type.
-
- <h3>8.2. Creation</h3>
-
- <p>Creation of an enumeration datatype resembles creation of a
- compound datatype: first an empty enumeration type is created,
- then members are added to the type, then the type is optionally
- locked.
-
- <dl>
- <dt><code>hid_t H5Tcreate(H5T_class_t <em>type_class</em>,
- size_t <em>size</em>)</code>
- <dd>This function creates a new empty enumeration datatype based
- on a native signed integer type. The first argument is the
- constant <code>H5T_ENUM</code> and the second argument is the
- size in bytes of the native integer on which the enumeration
- type is based. If the architecture does not support a native
- signed integer of the specified size then an error is
- returned.
-
- <pre>
-/* Based on a native signed short */
-hid_t hdf_en_colors = H5Tcreate(H5T_ENUM, sizeof(short));</pre>
-
-
- <dt><code>hid_t H5Tenum_create(hid_t <em>base</em>)</code>
- <dd>This function creates a new empty enumeration datatype based
- on some integer datatype <em>base</em> and is a
- generalization of the <code>H5Tcreate()</code> function. This
- function is useful when creating an enumeration type based on
- some non-native integer datatype, but it can be used for
- native types as well.
-
- <pre>
-/* Based on a native unsigned short */
-hid_t hdf_en_colors_1 = H5Tenum_create(H5T_NATIVE_USHORT);
-
-/* Based on a MIPS 16-bit unsigned integer */
-hid_t hdf_en_colors_2 = H5Tenum_create(H5T_MIPS_U16);
-
-/* Based on a big-endian 16-bit unsigned integer */
-hid_t hdf_en_colors_3 = H5Tenum_create(H5T_STD_U16BE);</pre>
-
-
- <dt><code>herr_t H5Tenum_insert(hid_t <em>etype</em>, const char
- *<em>symbol</em>, void *<em>value</em>)</code>
- <dd>Members are inserted into the enumeration datatype
- <em>etype</em> with this function. Each member has a symbolic
- name <em>symbol</em> and some integer representation
- <em>value</em>. The <em>value</em> argument must point to a value
- of the same datatype as specified when the enumeration type
- was created. The order of member insertion is not important
- but all symbol names and values must be unique within a
- particular enumeration type.
-
- <pre>
-short val;
-H5Tenum_insert(hdf_en_colors, "RED", (val=0,&amp;val));
-H5Tenum_insert(hdf_en_colors, "GREEN", (val=1,&amp;val));
-H5Tenum_insert(hdf_en_colors, "BLUE", (val=2,&amp;val));
-H5Tenum_insert(hdf_en_colors, "WHITE", (val=3,&amp;val));
-H5Tenum_insert(hdf_en_colors, "BLACK", (val=4,&amp;val));</pre>
-
-
- <dt><code>herr_t H5Tlock(hid_t <em>etype</em>)</code>
- <dd>This function locks a datatype so it cannot be modified or
- freed unless the entire HDF5 library is closed. Its use is
- completely optional but using it on an application datatype
- makes that datatype act like a predefined datatype.
-
- <pre>
-H5Tlock(hdf_en_colors);</pre>
-
- </dl>
-
- <h3>8.3. Integer Operations</h3>
-
- <p>Because an enumeration datatype is derived from an integer
- datatype, any operation which can be performed on integer
- datatypes can also be performed on enumeration datatypes. This
- includes:
-
- <p>
- <center>
- <table>
- <tr>
- <td><code>H5Topen()</code></td>
- <td><code>H5Tcreate()</code></td>
- <td><code>H5Tcopy()</code></td>
- <td><code>H5Tclose()</code></td>
- </tr><tr>
- <td><code>H5Tequal()</code></td>
- <td><code>H5Tlock()</code></td>
- <td><code>H5Tcommit()</code></td>
- <td><code>H5Tcommitted()</code></td>
- </tr><tr>
- <td><code>H5Tget_class()</code></td>
- <td><code>H5Tget_size()</code></td>
- <td><code>H5Tget_order()</code></td>
- <td><code>H5Tget_pad()</code></td>
- </tr><tr>
- <td><code>H5Tget_precision()</code></td>
- <td><code>H5Tget_offset()</code></td>
- <td><code>H5Tget_sign()</code></td>
- <td><code>H5Tset_size()</code></td>
- </tr><tr>
- <td><code>H5Tset_order()</code></td>
- <td><code>H5Tset_precision()</code></td>
- <td><code>H5Tset_offset()</code></td>
- <td><code>H5Tset_pad()</code></td>
- </tr><tr>
- <td><code>H5Tset_sign()</code></td>
- </tr>
- </table>
- </center>
-
- <p>In addition, the new function <code>H5Tget_super()</code> will
- be defined for all datatypes that are derived from existing
- types (currently just enumeration types).
-
- <dl>
- <dt><code>hid_t H5Tget_super(hid_t <em>type</em>)</code>
- <dd>Return the datatype from which <em>type</em> is
- derived. When <em>type</em> is an enumeration datatype then
- the returned value will be an integer datatype but not
- necessarily a native type. One use of this function would be
- to create a new enumeration type based on the same underlying
- integer type and values but with possibly different symbols.
-
- <pre>
-hid_t itype = H5Tget_super(hdf_en_colors);
-hid_t hdf_fr_colors = H5Tenum_create(itype);
-H5Tclose(itype);
-
-short val;
-H5Tenum_insert(hdf_fr_colors, "rouge", (val=0,&amp;val));
-H5Tenum_insert(hdf_fr_colors, "vert", (val=1,&amp;val));
-H5Tenum_insert(hdf_fr_colors, "bleu", (val=2,&amp;val));
-H5Tenum_insert(hdf_fr_colors, "blanc", (val=3,&amp;val));
-H5Tenum_insert(hdf_fr_colors, "noir", (val=4,&amp;val));
-H5Tlock(hdf_fr_colors);</pre>
- </dl>
-
- <h3>8.4. Type Functions</h3>
-
- <p>A small set of functions is available for querying properties
- of an enumeration type. These functions are likely to be used
- by browsers to display datatype information.
-
- <dl>
- <dt><code>int H5Tget_nmembers(hid_t <em>etype</em>)</code>
- <dd>When given an enumeration datatype <em>etype</em> this
- function returns the number of members defined for that
- type. This function is already implemented for compound
- datatypes.
-
- <br><br>
- <dt><code>char *H5Tget_member_name(hid_t <em>etype</em>, unsigned
- <em>membno</em>)</code>
- <dd>Given an enumeration datatype <em>etype</em> this function
- returns the symbol name for the member indexed by
- <em>membno</em>. Members are numbered from zero to
- <em>N</em>-1 where <em>N</em> is the return value from
- <code>H5Tget_nmembers()</code>. The members are stored in no
- particular order. This function is already implemented for
- compound datatypes. If an error occurs then the null pointer
- is returned. The return value should be freed by calling
- <code>free()</code>.
-
- <br><br>
- <dt><code>herr_t H5Tget_member_value(hid_t <em>etype</em>, unsigned
- <em>membno</em>, void *<em>value</em>/*out*/)</code>
- <dd>Given an enumeration datatype <em>etype</em> this function
- returns the value associated with the member indexed by
- <em>membno</em> (as described for
- <code>H5Tget_member_name()</code>). The value returned
- is in the domain of the underlying integer
- datatype which is often a native integer type. The
- application should ensure that the memory pointed to by
- <em>value</em> is large enough to contain the result (the size
- can be obtained by calling <code>H5Tget_size()</code> on
- either the enumeration type or the underlying integer type
- when the type is not known by the C compiler).
-
- <pre>
-int n = H5Tget_nmembers(hdf_en_colors);
-unsigned u;
-for (u=0; u&lt;(unsigned)n; u++) {
- char *symbol = H5Tget_member_name(hdf_en_colors, u);
- short val;
- H5Tget_member_value(hdf_en_colors, u, &amp;val);
- printf("#%u %20s = %d\n", u, symbol, val);
- free(symbol);
-}</pre>
-
- <p>
- Output:
- <pre>
-#0 BLACK = 4
-#1 BLUE = 2
-#2 GREEN = 1
-#3 RED = 0
-#4 WHITE = 3</pre>
- </dl>
-
- <h3>8.5. Data Functions</h3>
-
- <p>In addition to querying about the enumeration type properties,
- an application may want to make queries about enumerated
- data. These functions perform efficient mappings between symbol
- names and values.
-
- <dl>
- <dt><code>herr_t H5Tenum_valueof(hid_t <em>etype</em>, const char
- *<em>symbol</em>, void *<em>value</em>/*out*/)</code>
- <dd>Given an enumeration datatype <em>etype</em> this function
- returns through <em>value</em> the bit pattern associated with
- the symbol name <em>symbol</em>. The <em>value</em> argument
- should point to memory which is large enough to hold the result,
- which is returned as the underlying integer datatype specified
- when the enumeration type was created, often a native integer
- type.
-
- <br><br>
- <dt><code>herr_t H5Tenum_nameof(hid_t <em>etype</em>, void
- *<em>value</em>, char *<em>symbol</em>, size_t
- <em>size</em>)</code>
- <dd>This function translates a bit pattern pointed to by
- <em>value</em> to a symbol name according to the mapping
- defined in the enumeration datatype <em>etype</em> and stores
- at most <em>size</em> characters of that name (counting the
- null terminator) to the <em>symbol</em> buffer. If the name is
- longer than the result buffer then the result is not null
- terminated and the function returns failure. If <em>value</em>
- points to a bit pattern which is not in the domain of the
- enumeration type then the first byte of the <em>symbol</em>
- buffer is set to zero and the function fails.
-
- <pre>
-short data[1000] = {4, 2, 0, 0, 5, 1, ...};
-int i;
-char symbol[32];
-
-for (i=0; i&lt;1000; i++) {
- if (H5Tenum_nameof(hdf_en_colors, data+i, symbol,
- sizeof symbol)&lt;0) {
- if (symbol[0]) {
- strcpy(symbol+sizeof(symbol)-4, "...");
- } else {
- strcpy(symbol, "UNKNOWN");
- }
- }
- printf("%d %s\n", data[i], symbol);
-}</pre>
-
- <p>
- Output:
- <pre>
-4 BLACK
-2 BLUE
-0 RED
-0 RED
-5 UNKNOWN
-1 GREEN
-...</pre>
- </dl>
-
- <h3>8.6. Conversion</h3>
-
- <p>Enumerated data can be converted from one type to another
- provided the destination enumeration type contains all the
- symbols of the source enumeration type. The conversion operates
- by matching up the symbol names of the source and destination
- enumeration types to build a mapping from source value to
- destination value. For instance, if we are translating from an
- enumeration type that defines a sequence of integers as the
- values for the colors to a type that defines a different bit for
- each color then the mapping might look like this:
-
- <p><img src="EnumMap.gif" alt="Enumeration Mapping">
-
- <p>That is, a source value of <code>2</code> which corresponds to
- <code>BLUE</code> would be mapped to <code>0x0004</code>. The
- following code snippet builds the second datatype, then
- converts a raw data array from one datatype to another, and
- then prints the result.
-
- <pre>
-/* Create a new enumeration type */
-short val;
-hid_t bits = H5Tcreate(H5T_ENUM, sizeof val);
-H5Tenum_insert(bits, "RED", (val=0x0001,&amp;val));
-H5Tenum_insert(bits, "GREEN", (val=0x0002,&amp;val));
-H5Tenum_insert(bits, "BLUE", (val=0x0004,&amp;val));
-H5Tenum_insert(bits, "WHITE", (val=0x0008,&amp;val));
-H5Tenum_insert(bits, "BLACK", (val=0x0010,&amp;val));
-
-/* The data; the last value, 5, has no symbol in the source type */
-short data[6] = {1, 4, 2, 0, 3, 5};
-int i;
-
-/* Convert all six values from one type to another */
-H5Tconvert(hdf_en_colors, bits, 6, data, NULL, plist_id);
-
-/* Print the data */
-for (i=0; i&lt;6; i++) {
- printf("0x%04x\n", (unsigned)(unsigned short)data[i]);
-}</pre>
-
- <p>
- Output:
- <pre>
-
-0x0002
-0x0010
-0x0004
-0x0001
-0x0008
-0xffff</pre>
-
- <p>If the source data stream contains values which are not in the
- domain of the conversion map then an overflow exception is
- raised within the library, causing the application defined
- overflow handler to be invoked (see
- <code>H5Tset_overflow()</code>). If no overflow handler is
- defined then all bits of the destination value will be set.
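-
- <p>The name-matching rule described in this section can be sketched
- in plain C. This is an illustration only, not the library's
- conversion code: <code>enum_member_t</code> and
- <code>convert_by_name()</code> are hypothetical names, the lookup is
- a simple linear search, and values with no match are set to all
- bits on, mirroring the behavior described above when no overflow
- handler is defined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of one enumeration member. */
typedef struct {
    const char *symbol;
    short       value;
} enum_member_t;

/* Converts n values in place from the src enumeration to the dst
 * enumeration by matching symbol names. A value with no symbol in src,
 * or whose symbol is missing from dst, has all of its bits set.
 * Returns the number of values that could not be mapped. */
static size_t convert_by_name(const enum_member_t *src, size_t nsrc,
                              const enum_member_t *dst, size_t ndst,
                              short *data, size_t n)
{
    size_t i, s, d, unmatched = 0;

    for (i = 0; i < n; i++) {
        const char *symbol = NULL;
        int found = 0;

        /* Source value -> symbol name */
        for (s = 0; s < nsrc; s++) {
            if (src[s].value == data[i]) {
                symbol = src[s].symbol;
                break;
            }
        }
        /* Symbol name -> destination value */
        for (d = 0; symbol != NULL && d < ndst; d++) {
            if (strcmp(dst[d].symbol, symbol) == 0) {
                data[i] = dst[d].value;
                found = 1;
                break;
            }
        }
        if (!found) {
            data[i] = (short)-1;  /* all bits set */
            unmatched++;
        }
    }
    return unmatched;
}
```

- <p>Applied to the two color types and the six data values used in
- the snippet above, this sketch maps the first five values to their
- bit-flag equivalents and leaves the sixth with all bits set.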
-
- <p>The HDF library will not provide conversions between enumerated
- data and integers although the application is free to do so
- (this is a policy we apply to all classes of HDF datatypes).
- However, since enumeration types are derived from
- integer types it is permissible to treat enumerated data as
- integers and perform integer conversions in that context.
-
- <h3>8.7. Symbol Order</h3>
-
- <p>Symbol order is determined by the integer values associated
- with each symbol. When the integer datatype is a native type,
- testing the relative order of two symbols is an easy process:
- simply compare the values of the symbols. If only the symbol
- names are available then the values must first be determined by
- calling <code>H5Tenum_valueof()</code>.
-
- <pre>
-short val1, val2;
-H5Tenum_valueof(hdf_en_colors, "WHITE", &amp;val1);
-H5Tenum_valueof(hdf_en_colors, "BLACK", &amp;val2);
-if (val1 &lt; val2) ...</pre>
-
- <p>When the underlying integer datatype is not a native type then
- the easiest way to compare symbols is to first create a similar
- enumeration type that contains all the same symbols but has a
- native integer type (HDF type conversion features can be used to
- convert the non-native values to native values). Once we have a
- native type we can compare symbol order as just described. If
- <code>foreign</code> is some non-native enumeration type then a
- native type can be created as follows:
-
- <pre>
-int n = H5Tget_nmembers(foreign);
-hid_t itype = H5Tget_super(foreign);
-void *val = malloc(n * MAX(H5Tget_size(itype), sizeof(int)));
-char **name = malloc(n * sizeof(char*));
-unsigned u;
-
-/* Get foreign type information */
-for (u=0; u&lt;(unsigned)n; u++) {
- name[u] = H5Tget_member_name(foreign, u);
- H5Tget_member_value(foreign, u,
- (char*)val+u*H5Tget_size(foreign));
-}
-
-/* Convert integer values to new type */
-H5Tconvert(itype, H5T_NATIVE_INT, n, val, NULL, plist_id);
-
-/* Build a native type */
-hid_t native = H5Tenum_create(H5T_NATIVE_INT);
-for (u=0; u&lt;(unsigned)n; u++) {
- H5Tenum_insert(native, name[u], (int*)val+u);
- free(name[u]);
-}
-free(name);
-free(val);</pre>
-
- <p>It is also possible to convert enumerated data to a new type
- that has a different order defined for the symbols. For
- instance, we can define a new type, <code>reverse</code> that
- defines the same five colors but in the reverse order.
-
- <pre>
-short val;
-int i;
-char sym[8];
-short data[5] = {0, 1, 2, 3, 4};
-
-hid_t reverse = H5Tenum_create(H5T_NATIVE_SHORT);
-H5Tenum_insert(reverse, "BLACK", (val=0,&amp;val));
-H5Tenum_insert(reverse, "WHITE", (val=1,&amp;val));
-H5Tenum_insert(reverse, "BLUE", (val=2,&amp;val));
-H5Tenum_insert(reverse, "GREEN", (val=3,&amp;val));
-H5Tenum_insert(reverse, "RED", (val=4,&amp;val));
-
-/* Print data */
-for (i=0; i&lt;5; i++) {
- H5Tenum_nameof(hdf_en_colors, data+i, sym, sizeof sym);
- printf ("%d %s\n", data[i], sym);
-}
-
-puts("Converting...");
-H5Tconvert(hdf_en_colors, reverse, 5, data, NULL, plist_id);
-
-/* Print data */
-for (i=0; i&lt;5; i++) {
- H5Tenum_nameof(reverse, data+i, sym, sizeof sym);
- printf ("%d %s\n", data[i], sym);
-}</pre>
-
- <p>
- Output:
- <pre>
-0 RED
-1 GREEN
-2 BLUE
-3 WHITE
-4 BLACK
-Converting...
-4 RED
-3 GREEN
-2 BLUE
-1 WHITE
-0 BLACK</pre>
-
- <h3>8.8. Equality</h3>
-
- <p>The order that members are inserted into an enumeration type is
- unimportant; the important part is the associations between the
- symbol names and the values. Thus, two enumeration datatypes
- will be considered equal if and only if both types have the same
- symbol/value associations and both have equal underlying integer
- datatypes. Type equality is tested with the
- <code>H5Tequal()</code> function.
-
- <h3>8.9. Interacting with C's <code>enum</code> Type</h3>
-
- <p>Although HDF enumeration datatypes are similar to C
- <code>enum</code> datatypes, there are some important
- differences:
-
- <p>
- <center>
- <table border width="80%">
- <tr>
- <th>Difference</th>
- <th>Motivation/Implications</th>
- </tr>
-
- <tr>
- <td valign=top>Symbols are unquoted in C but quoted in
- HDF.</td>
- <td valign=top>This allows the application to manipulate
- symbol names in ways that are not possible with C.</td>
- </tr>
-
- <tr>
- <td valign=top>The C compiler automatically replaces all
- symbols with their integer values but HDF requires
- explicit calls to do the same.</td>
- <td valign=top>C resolves symbols at compile time while
- HDF resolves symbols at run time.</td>
- </tr>
-
- <tr>
- <td valign=top>The mapping from symbols to integers is
- <em>N</em>:1 in C but 1:1 in HDF.</td>
- <td valign=top>HDF can translate from value to name
- uniquely and large <code>switch</code> statements are
- not necessary to print values in human-readable
- format.</td>
- </tr>
-
- <tr>
- <td valign=top>A symbol must appear in only one C
- <code>enum</code> type but may appear in multiple HDF
- enumeration types.</td>
- <td valign=top>The translation from symbol to value in HDF
- requires the datatype to be specified while in C the
- datatype is not necessary because it can be inferred
- from the symbol.</td>
- </tr>
-
- <tr>
- <td valign=top>The underlying integer value is always a
- native integer in C but can be a foreign integer type in
- HDF.</td>
- <td valign=top>This allows HDF to describe data that might
- reside on a foreign architecture, such as data stored in
- a file.</td>
- </tr>
-
- <tr>
- <td valign=top>The sign and size of the underlying integer
- datatype is chosen automatically by the C compiler but
- must be fully specified with HDF.</td>
- <td valign=top>Since HDF doesn't require finalization of a
- datatype, complete specification of the type must be
- supplied before the type is used. Requiring that
- information at the time of type creation was a design
- decision to simplify the library.</td>
- </tr>
- </table>
- </center>
-
- <p>The examples below use the following C datatypes:
-
- <p>
- <table width="90%" bgcolor="white">
- <tr>
- <td>
- <code><pre>
-/* English color names */
-typedef enum {
- RED,
- GREEN,
- BLUE,
- WHITE,
- BLACK
-} c_en_colors;
-
-/* Spanish color names, reverse order */
-typedef enum {
- NEGRO,
- BLANCO,
- AZUL,
- VERDE,
- ROJO
-} c_sp_colors;
-
-/* No enum definition for French names */
- </pre></code>
- </td>
- </tr>
- </table>
-
- <h4>Creating HDF Types from C Types</h4>
-
- <p>An HDF enumeration datatype can be created from a C
- <code>enum</code> type simply by passing pointers to the C
- <code>enum</code> values to <code>H5Tenum_insert()</code>. For
- instance, to create HDF types for the <code>c_en_colors</code>
- type shown above:
-
- <p>
- <table width="90%" bgcolor="white">
- <tr>
- <td>
- <code><pre>
-
-c_en_colors val;
-hid_t hdf_en_colors = H5Tcreate(H5T_ENUM, sizeof(c_en_colors));
-H5Tenum_insert(hdf_en_colors, "RED", (val=RED, &amp;val));
-H5Tenum_insert(hdf_en_colors, "GREEN", (val=GREEN,&amp;val));
-H5Tenum_insert(hdf_en_colors, "BLUE", (val=BLUE, &amp;val));
-H5Tenum_insert(hdf_en_colors, "WHITE", (val=WHITE,&amp;val));
-H5Tenum_insert(hdf_en_colors, "BLACK", (val=BLACK,&amp;val));</pre></code>
- </td>
- </tr>
- </table>
-
- <h4>Name Changes between Applications</h4>
-
- <p>Occasionally two applications wish to exchange data but they
- use different names for the constants they exchange. For
- instance, an English and a Spanish program may want to
- communicate color names although they use different symbols in
- the C <code>enum</code> definitions. The communication is still
- possible although the applications must agree on common terms
- for the colors. The following example shows the Spanish code to
- read the values assuming that the applications have agreed that
- the color information will be exchanged using English color
- names:
-
- <p>
- <table width="90%" bgcolor="white">
- <tr>
- <td>
- <code><pre>
-
-c_sp_colors val, data[1000];
-hid_t hdf_sp_colors = H5Tcreate(H5T_ENUM, sizeof(c_sp_colors));
-H5Tenum_insert(hdf_sp_colors, "RED", (val=ROJO, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "GREEN", (val=VERDE, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "BLUE", (val=AZUL, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "WHITE", (val=BLANCO, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "BLACK", (val=NEGRO, &amp;val));
-
-H5Dread(dataset, hdf_sp_colors, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);</pre></code>
- </td>
- </tr>
- </table>
-
-
- <h4>Symbol Ordering across Applications</h4>
-
- <p>Since symbol ordering is completely determined by the integer values
- assigned to each symbol in the C <code>enum</code> definition,
- the relative order of <code>enum</code> symbols is not necessarily
- the same in two applications that exchange data. HDF can convert
- from one application's integer values to the other's so a symbol in one
- application's C <code>enum</code> gets mapped to the same symbol
- in the other application's C <code>enum</code>, but the relative
- order of the symbols is not preserved.
-
- <p>For example, an application may be defined to use the
- definition of <code>c_en_colors</code> defined above where
- <code>WHITE</code> is less than <code>BLACK</code>, but some
- other application might define the colors in some other
- order. If each application defines an HDF enumeration type based
- on that application's C <code>enum</code> type then HDF will
- modify the integer values as data is communicated from one
- application to the other so that a <code>RED</code> value
- in the first application is also a <code>RED</code> value in the
- other application.
-
- <p>A case of this reordering of symbol names was also shown in the
- previous code snippet (as well as a change of language), where
- HDF changed the integer values so 0 (<code>RED</code>) in the
- input file became 4 (<code>ROJO</code>) in the <code>data</code>
- array. In the input file, <code>WHITE</code> was less than
- <code>BLACK</code>; in the application the opposite is true.
-
- <p>In fact, the ability to change the order of symbols is often
- convenient when the enumeration type is used only to group
- related symbols that don't have any well defined order
- relationship.
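-
- <p>The symbol-based mapping described above can be sketched in plain C.
- This is an illustration only and does not call the HDF5 API:
- <code>enum_member_t</code> and <code>map_by_symbol</code> are
- hypothetical names, and the library's own conversion code may be
- organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One member of an enumeration: a symbol name and its integer value. */
typedef struct {
    const char *name;
    int         value;
} enum_member_t;

/*
 * Map a value of the source enumeration to the value that carries the
 * same symbol name in the destination enumeration.
 * Returns 0 on success, -1 if either lookup fails.
 */
int map_by_symbol(const enum_member_t *src, size_t nsrc,
                  const enum_member_t *dst, size_t ndst,
                  int src_value, int *dst_value)
{
    size_t i, j;

    assert(src != NULL && dst != NULL && dst_value != NULL);
    for (i = 0; i < nsrc; i++) {
        if (src[i].value != src_value)
            continue;
        for (j = 0; j < ndst; j++) {
            if (strcmp(src[i].name, dst[j].name) == 0) {
                *dst_value = dst[j].value;
                return 0;
            }
        }
        return -1; /* symbol has no counterpart in the destination */
    }
    return -1;     /* value is not a member of the source */
}
```

- <p>Because the lookup goes through the symbol name, the integer
- values of the two tables may be assigned in any order.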
-
- <h4>Internationalization</h4>
-
- <p>The HDF enumeration type conversion features can also be used
- to provide internationalization of debugging output. A program
- written with the <code>c_en_colors</code> datatype could define
- a separate HDF datatype for languages such as English, Spanish,
- and French and cast the enumerated value to one of these HDF
- types to print the result.
-
- <p>
- <table width="90%" bgcolor="white">
- <tr>
- <td>
- <code><pre>
-
-c_en_colors val, *data=...;
-
-hid_t hdf_sp_colors = H5Tcreate(H5T_ENUM, sizeof val);
-H5Tenum_insert(hdf_sp_colors, "ROJO", (val=RED, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "VERDE", (val=GREEN, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "AZUL", (val=BLUE, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "BLANCO", (val=WHITE, &amp;val));
-H5Tenum_insert(hdf_sp_colors, "NEGRO", (val=BLACK, &amp;val));
-
-hid_t hdf_fr_colors = H5Tcreate(H5T_ENUM, sizeof val);
-H5Tenum_insert(hdf_fr_colors, "ROUGE", (val=RED, &amp;val));
-H5Tenum_insert(hdf_fr_colors, "VERT", (val=GREEN, &amp;val));
-H5Tenum_insert(hdf_fr_colors, "BLEU", (val=BLUE, &amp;val));
-H5Tenum_insert(hdf_fr_colors, "BLANC", (val=WHITE, &amp;val));
-H5Tenum_insert(hdf_fr_colors, "NOIR", (val=BLACK, &amp;val));
-
-void
-nameof(lang_t language, c_en_colors val, char *name, size_t size)
-{
- switch (language) {
- case ENGLISH:
- H5Tenum_nameof(hdf_en_colors, &amp;val, name, size);
- break;
- case SPANISH:
- H5Tenum_nameof(hdf_sp_colors, &amp;val, name, size);
- break;
- case FRENCH:
- H5Tenum_nameof(hdf_fr_colors, &amp;val, name, size);
- break;
- }
-}</pre></code>
- </td>
- </tr>
- </table>
-
- <h3>8.10. Goals That Have Been Met</h3>
-
- <p>The main goal of enumeration types is to provide communication
- of enumerated data using symbolic equivalence. That is, a
- symbol written to a dataset by one application should be read as
- the same symbol by some other application.
-
- <p>
- <table width="90%">
- <tr>
- <td valign=top><b>Architecture Independence</b></td>
- <td valign=top>Two applications shall be able to exchange
- enumerated data even when the underlying integer values
- have different storage formats. HDF accomplishes this for
- enumeration types by building them upon integer types.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Preservation of Order Relationship</b></td>
- <td valign=top>The relative order of symbols shall be
- preserved between two applications that use equivalent
- enumeration datatypes. Unlike numeric values that have
- an implicit ordering, enumerated data has an explicit
- order defined by the enumeration datatype and HDF
- records this order in the file.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Order Independence</b></td>
- <td valign=top>An application shall be able to change the
- relative ordering of the symbols in an enumeration
- datatype. This is accomplished by defining a new type with
- different integer values and converting data from one type
- to the other.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Subsets</b></td>
- <td valign=top>An application shall be able to read
- enumerated data from an archived dataset even after the
- application has defined additional members for the
- enumeration type. An application shall be able to write
- to a dataset when the dataset contains a superset of the
- members defined by the application. Similar rules apply
- for in-core conversions between enumerated datatypes.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Targetable</b></td>
- <td valign=top>An application shall be able to target a
- particular architecture or application when storing
- enumerated data. This is accomplished by allowing
- non-native underlying integer types and converting the
- native data to non-native data.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Efficient Data Transfer</b></td>
- <td valign=top>An application that defines a file dataset
- that corresponds to some native C enumerated data array
- shall be able to read and write to that dataset directly
- using only Posix read and write functions. HDF already
- optimizes this case for integers, so the same optimization
- will apply to enumerated data.</td>
- </tr>
-
- <tr>
- <td valign=top><b>Efficient Storage</b></td>
- <td valign=top>Enumerated data shall be stored in a manner
- which is space efficient. HDF stores the enumerated data
- as integers and allows the application to choose the size
- and format of those integers.</td>
- </tr>
- </table>
-
-
-
-
-
-
-
-<h2>9. Variable-length Datatypes</h2>
-
-<h3>9.1. Overview And Justification</h3>
-
-Variable-length (VL) datatypes are sequences of an existing datatype
-(atomic, VL, or compound) which are not fixed in length from one dataset location
-to another. In essence, they are similar to C character strings -- a sequence of
-a type which is pointed to by a particular type of <em>pointer</em> -- although
-they are implemented more closely to FORTRAN strings by including an explicit
-length in the pointer instead of using a particular value to terminate the
-sequence.
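-
-<p>
-The "explicit length in the pointer" idea can be sketched in plain C.
-The struct below is a stand-in modeled on the <code>len</code> and
-<code>p</code> fields of <code>hvl_t</code> used later in this chapter;
-<code>vl_seq_t</code>, <code>vl_seq_fill</code>, and
-<code>vl_total_elems</code> are hypothetical names and are not part of
-the HDF5 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for hvl_t: a sequence is an explicit length plus a pointer. */
typedef struct {
    size_t len; /* number of base-type elements */
    void  *p;   /* pointer to the elements */
} vl_seq_t;

/* Fill seq with n unsigned ints: base, base+1, ... Returns 0 or -1. */
int vl_seq_fill(vl_seq_t *seq, size_t n, unsigned base)
{
    unsigned *elems;
    size_t i;

    assert(seq != NULL && n > 0);
    elems = (unsigned *)malloc(n * sizeof(unsigned));
    if (elems == NULL)
        return -1;
    for (i = 0; i < n; i++)
        elems[i] = base + (unsigned)i;
    seq->len = n;
    seq->p = elems;
    return 0;
}

/* Total number of elements held by a ragged array of sequences. */
size_t vl_total_elems(const vl_seq_t *seqs, size_t nseqs)
{
    size_t i, total = 0;

    for (i = 0; i < nseqs; i++)
        total += seqs[i].len;
    return total;
}
```

-<p>
-No terminator value is reserved, so any bit pattern of the base type
-can appear inside a sequence.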
-
-<p>
-VL datatypes are useful to the scientific community in many different ways,
-some of which are listed below:
-<ul>
- <li>Ragged arrays: Multi-dimensional ragged arrays can be implemented with
- the last (fastest changing) dimension being ragged by using a
- VL datatype as the type of the element stored. (Or as a field in a
- compound datatype.)
- <li>Fractal arrays: If a compound datatype has a VL field of another compound
- type with VL fields (a <em>nested</em> VL datatype), this can be used to
- implement ragged arrays of ragged arrays, to whatever nesting depth is
- required for the user.
- <li>Polygon lists: A common storage requirement is to efficiently store arrays
- of polygons with different numbers of vertices. VL datatypes can be
- used to efficiently and succinctly describe an array of polygons with
- different numbers of vertices.
- <li>Character strings: Perhaps the most common use of VL datatypes will be to
- store C-like VL character strings in dataset elements or as attributes
- of objects.
- <li>Indices: An array of VL object references could be used as an index to
- all the objects in a file which contain a particular sequence of
- dataset values. Perhaps an array something like the following:
- <pre>
- Value1: Object1, Object3, Object9
- Value2: Object0, Object12, Object14, Object21, Object22
- Value3: Object2
- Value4: &lt;none&gt;
- Value5: Object1, Object10, Object12
- .
- .
- </pre>
- <li>Object Tracking: An array of VL dataset region references can be used as
- a method of tracking objects or features appearing in a sequence of
- datasets. Perhaps an array of them would look like:
- <pre>
- Feature1: Dataset1:Region, Dataset3:Region, Dataset9:Region
- Feature2: Dataset0:Region, Dataset12:Region, Dataset14:Region,
- Dataset21:Region, Dataset22:Region
- Feature3: Dataset2:Region
- Feature4: &lt;none&gt;
- Feature5: Dataset1:Region, Dataset10:Region, Dataset12:Region
- .
- .
- </pre>
-</ul>
-
-
-<h3>9.2. Variable-length Datatype Memory Management</h3>
-
-Because each element of a dataset with a VL datatype may have a different
-sequence length, the memory for the VL data must be dynamically
-allocated. Currently there are two methods of managing the memory for
-VL datatypes: the standard C malloc/free memory allocation routines or a method
-of calling user-defined memory management routines to allocate or free memory.
-Since the memory allocated when reading (or writing) may be complicated to
-release, an HDF5 routine is provided to traverse a memory buffer and free the
-VL datatype information without leaking memory.
-
-
-<h4>Variable-length datatypes cannot be divided</h4>
-
-VL datatypes are designed so that they cannot be subdivided by the library
-with selections, etc. This design was chosen due to the complexities in
-specifying selections on each VL element of a dataset through a selection API
-that is easy to understand. Also, the selection APIs work on dataspaces, not
-on datatypes. At some point in time, we may want to create a way for
-dataspaces to have VL components to them and we would need to allow selections
-of those VL regions, but that is beyond the scope of this document.
-
-
-<h4>What happens if the library runs out of memory while reading?</h4>
-
-It is possible for a call to <code>H5Dread</code> to fail while reading in
-VL datatype information if the memory required exceeds that which is available.
-In this case, the <code>H5Dread</code> call will fail gracefully and any
-VL data which has been allocated prior to the memory shortage will be returned
-to the system via the memory management routines detailed below.
-It may be possible to design a <em>partial read</em> API function at a
-later date, if demand for such a function warrants.
-
-
-<h4>Strings as variable-length datatypes</h4>
-
-Since character strings are a special case of VL data that is implemented
-in many different ways on different machines and in different programming
-languages, they are handled somewhat differently from other VL datatypes in HDF5.
-
-<p>
-HDF5 has native VL strings for each language API, which are stored the
-same way on disk, but are exported through each language API in a natural way
-for that language. When retrieving VL strings from a dataset, users may choose
-to have them stored in memory as a native VL string or in HDF5's <code>hvl_t</code>
-struct for VL datatypes.
-
-<p>
-VL strings may be created in one of two ways: by creating a VL datatype with
-a base type of <code>H5T_NATIVE_ASCII</code>, <code>H5T_NATIVE_UNICODE</code>,
-etc., or by creating a string datatype and setting its length to
-<code>H5T_VARIABLE</code>. The second method is used to access
-native VL strings in memory. The library will convert between the two types,
-but they are stored on disk using different datatypes and have different
-memory representations.
-
-<p>
-Multi-byte character representations, such as UNICODE or <em>wide</em>
-characters in C/C++, will need the appropriate character and string datatypes
-created so that they can be described properly through the datatype API.
-Additional conversions between these types and the current ASCII characters
-will also be required.
-
-<p>
-Variable-width character strings (which might be compressed data or some
-other encoding) are not currently handled by this design. We will evaluate
-how to implement them based on user feedback.
-
-
-<h3>9.3. Variable-length Datatype API</h3>
-
-<h4>Creation</h4>
-
-VL datatypes are created with the <code>H5Tvlen_create()</code> function
-as follows:
-<dl>
- <dd><em>type_id</em> = <code>H5Tvlen_create</code>(<em>hid_t</em> <code>base_type_id</code>);
-</dl>
-
-<p>
-The base datatype will be the datatype that the sequence is composed of,
-characters for character strings, vertex coordinates for polygon lists, etc.
-The base datatype specified for the VL datatype can be of any HDF5 datatype,
-including another VL datatype, a compound datatype, or an atomic datatype.
-
-
-<h4>Query base datatype of VL datatype</h4>
-
-It may be necessary to know the base datatype of a VL datatype before
-memory is allocated, etc. The base datatype is queried with the
-<code>H5Tget_super()</code> function, described in the H5T documentation.
-
-
-<h4>Query minimum memory required for VL information</h4>
-
-In order to predict the memory usage that <code>H5Dread</code> may need
-to allocate to store VL data while reading the data, the
-<code>H5Dvlen_get_buf_size()</code> function is provided:
-<dl>
- <dd><em>herr_t</em>
- <code>H5Dvlen_get_buf_size</code>(<em>hid_t</em> <code>dataset_id</code>,
- <em>hid_t</em> <code>type_id</code>,
- <em>hid_t</em> <code>space_id</code>,
- <em>hsize_t</em> *<code>size</code>)
-</dl>
- (This function is not implemented in Release 1.2.)
-
-<p>
-This routine checks the number of bytes required to store the VL data from
-the dataset, using the <code>space_id</code> for the selection in the dataset
-on disk and the <code>type_id</code> for the memory representation of the
-VL data in memory. The *<code>size</code> value is modified according to
-how many bytes are required to store the VL data in memory.
-
-
-<h4>Specifying how to manage memory for the VL datatype</h4>
-
-The memory management method is determined by dataset transfer properties
-passed into the <code>H5Dread</code> and <code>H5Dwrite</code> functions
-with the dataset transfer property list.
-
-<p>
-Default memory management is set by using <code>H5P_DEFAULT</code>
-for the dataset transfer property list identifier.
-If <code>H5P_DEFAULT</code> is used with <code>H5Dread</code>,
-the system <code>malloc</code> and <code>free</code> calls
-will be used for allocating and freeing memory.
-In such a case, <code>H5P_DEFAULT</code> should also be passed
-as the property list identifier to <code>H5Dvlen_reclaim</code>.
-
-<p>
-The rest of this subsection is relevant only to those who choose
-<i>not</i> to use default memory management.
-
-<p>
-The user can choose whether to use the
-system <code>malloc</code> and <code>free</code> calls or
-user-defined, or custom, memory management functions.
-If user-defined memory management functions are to be used,
-the memory allocation and free routines must be defined via
-<code>H5Pset_vlen_mem_manager()</code>, as follows:
-<dl>
- <dd><em>herr_t</em>
- <code>H5Pset_vlen_mem_manager</code>(<em>hid_t</em> <code>plist_id</code>,
- <em>H5MM_allocate_t</em> <code>alloc</code>,
- <em>void</em> *<code>alloc_info</code>,
- <em>H5MM_free_t</em> <code>free</code>,
- <em>void</em> *<code>free_info</code>)
-</dl>
-
-
-<p>
-The <code>alloc</code> and <code>free</code> parameters
-identify the memory management routines to be used.
-If the user has defined custom memory management routines,
-<code>alloc</code> and/or <code>free</code> should be set to make
-those routine calls (i.e., the name of the routine is used as
-the value of the parameter);
-if the user prefers to use the system's <code>malloc</code>
-and/or <code>free</code>, the <code>alloc</code> and
-<code>free</code> parameters, respectively, should be set to
-<code>NULL</code>.
-<p>
-The prototypes for the user-defined functions would appear as follows:
-<dl>
- <dd><code>typedef</code> <em>void</em>
- *(*<code>H5MM_allocate_t</code>)(<em>size_t</em> <code>size</code>,
- <em>void</em> *<code>info</code>) ;
- <dd><code>typedef</code> <em>void</em>
- (*<code>H5MM_free_t</code>)(<em>void</em> *<code>mem</code>,
- <em>void</em> *<code>free_info</code>) ;
-</dl>
-
-<p>
-The <code>alloc_info</code> and <code>free_info</code> parameters can be
-used to pass along any required information to the user's memory management
-routines.
-
-<p>
-In summary, if the user has defined custom memory management
-routines, the name(s) of the routines are passed in the
-<code>alloc</code> and <code>free</code> parameters and the
-custom routines' parameters are passed in the
-<code>alloc_info</code> and <code>free_info</code> parameters.
-If the user wishes to use the system <code>malloc</code> and
-<code>free</code> functions, the <code>alloc</code> and/or
-<code>free</code> parameters are set to <code>NULL</code>
-and the <code>alloc_info</code> and <code>free_info</code>
-parameters are ignored.
-
-<h4>Recovering memory from VL buffers read in</h4>
-
-The complex memory buffers created for a VL datatype may be reclaimed with
-the <code>H5Dvlen_reclaim()</code> function call, as follows:
-<dl>
- <dd><em>herr_t</em>
- <code>H5Dvlen_reclaim</code>(<em>hid_t</em> <code>type_id</code>,
- <em>hid_t</em> <code>space_id</code>,
- <em>hid_t</em> <code>plist_id</code>,
- <em>void</em> *<code>buf</code>);
-</dl>
-
-<p>
-The <code>type_id</code> must be the datatype stored in the buffer,
-<code>space_id</code> describes the selection for the memory buffer
-to free the VL datatypes within,
-<code>plist_id</code> is the dataset transfer property list which
-was used for the I/O transfer to create the buffer, and
-<code>buf</code> is the pointer to the buffer to free the VL memory within.
-The VL structures (<code>hvl_t</code>) in the user's buffer are
-modified to zero out the VL information after it has been freed.
-
-<p>
-If nested VL datatypes were used to create the buffer,
-this routine frees them from the bottom up,
-releasing all the memory without creating memory leaks.
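-
-<p>
-The bottom-up order can be sketched in plain C. This is not the
-library's implementation: <code>vl_seq_t</code> is a stand-in for
-<code>hvl_t</code>, <code>vl_reclaim</code> is a hypothetical helper, and
-the nesting is described by a simple depth count rather than by a
-datatype identifier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for hvl_t. */
typedef struct {
    size_t len;
    void  *p;
} vl_seq_t;

/*
 * Free one sequence. depth is the number of VL levels below this one:
 * 0 means the elements are plain values, 1 means each element is itself
 * a vl_seq_t, and so on. Returns the number of buffers freed.
 */
size_t vl_reclaim(vl_seq_t *seq, unsigned depth)
{
    size_t freed = 0;
    size_t i;

    assert(seq != NULL);
    if (seq->p == NULL)
        return 0;
    if (depth > 0) {
        vl_seq_t *inner = (vl_seq_t *)seq->p;

        for (i = 0; i < seq->len; i++)
            freed += vl_reclaim(&inner[i], depth - 1); /* children first */
    }
    free(seq->p);
    seq->p = NULL; /* zero the descriptor after freeing */
    seq->len = 0;
    return freed + 1;
}
```

-<p>
-Freeing the children before the parent matters because the parent
-buffer holds the only pointers to the child buffers.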
-
-
-<h3>9.4. Code Examples</h3>
-
-The following example creates a one-dimensional dataset of four
-variable-length elements holding the values shown below, one element per column.
-<pre>
- 0 10 20 30
- 11 21 31
- 22 32
- 33
-</pre>
-Each element of the VL datatype is of H5T_NATIVE_UINT type.
-<p>
-The array is stored in the dataset and then read back into memory.
-Default memory management routines are used for writing the VL data.
-Custom memory management routines are used for reading the VL data and
-reclaiming memory space.
-
-<center>
-<table border align=center width="100%">
- <caption align=bottom><h4>Example: Variable-length Datatypes</h4></caption>
- <tr>
- <td>
- <pre>
-#include &lt;stdio.h&gt;
-#include &lt;stdlib.h&gt;
-#include &lt;hdf5.h&gt;
-
-#define FILE "vltypes.h5"
-#define MAX(X,Y) ((X)&gt;(Y)?(X):(Y))
-
-/* 1-D dataset with fixed dimensions */
-#define SPACE_NAME "Space"
-#define SPACE_RANK 1
-#define SPACE_DIM 4
-
-void *vltypes_alloc_custom(size_t size, void *info);
-void vltypes_free_custom(void *mem, void *info);
-
-/****************************************************************
-**
-** vltypes_alloc_custom(): VL datatype custom memory
-** allocation routine. This routine just uses malloc to
-** allocate the memory and increments the amount of memory
-** allocated.
-**
-****************************************************************/
-void *vltypes_alloc_custom(size_t size, void *info)
-{
-
- void *ret_value=NULL; /* Pointer to return */
- int *mem_used=(int *)info; /* Get the pointer to the memory used */
- size_t extra; /* Extra space needed */
-
- /*
- * This weird contortion is required on the DEC Alpha to keep the
- * alignment correct.
- */
- extra=MAX(sizeof(void *),sizeof(int));
-
-    if((ret_value=(void *)malloc(extra+size))!=NULL) {
-        *(int *)ret_value=(int)size;
-        *mem_used+=(int)size;
-        /* Skip past the size header only when the allocation succeeded */
-        ret_value=((unsigned char *)ret_value)+extra;
-    } /* end if */
-    return(ret_value);
-}
-/******************************************************************
-** vltypes_free_custom(): VL datatype custom memory
-** free routine. This routine just uses free to
-** release the memory and decrements the amount of memory
-** allocated.
-** ****************************************************************/
-void vltypes_free_custom(void *_mem, void *info)
-
-{
- unsigned char *mem;
- int *mem_used=(int *)info; /* Get the pointer to the memory used */
- size_t extra; /* Extra space needed */
- /*
- * This weird contortion is required on the DEC Alpha to keep the
- * alignment correct.
- */
- extra=MAX(sizeof(void *),sizeof(int));
- if(_mem!=NULL) {
- mem=((unsigned char *)_mem)-extra;
- *mem_used-=*(int *)mem;
- free(mem);
- } /* end if */
-}
-
-int main(void)
-
-{
- hvl_t wdata[SPACE_DIM]; /* Information to write */
- hvl_t rdata[SPACE_DIM]; /* Information read in */
- hid_t fid; /* HDF5 File IDs */
- hid_t dataset; /* Dataset ID */
- hid_t sid; /* Dataspace ID */
- hid_t tid; /* Datatype ID */
- hid_t xfer_pid; /* Dataset transfer property list ID */
- hsize_t dims[] = {SPACE_DIM};
-    unsigned i,j; /* counting variables */
- int mem_used=0; /* Memory used during allocation */
- herr_t ret; /* Generic return value */
-
- /*
- * Allocate and initialize VL data to write
- */
- for(i=0; i&lt;SPACE_DIM; i++) {
-
- wdata[i].p= (unsigned int *)malloc((i+1)*sizeof(unsigned int));
- wdata[i].len=i+1;
- for(j=0; j&lt;(i+1); j++)
- ((unsigned int *)wdata[i].p)[j]=i*10+j;
- } /* end for */
-
- /*
- * Create file.
- */
- fid = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
-
- /*
- * Create dataspace for datasets.
- */
- sid = H5Screate_simple(SPACE_RANK, dims, NULL);
-
- /*
- * Create a datatype to refer to.
- */
- tid = H5Tvlen_create (H5T_NATIVE_UINT);
-
- /*
- * Create a dataset.
- */
- dataset=H5Dcreate(fid, "Dataset", tid, sid, H5P_DEFAULT);
-
- /*
- * Write dataset to disk.
- */
- ret=H5Dwrite(dataset, tid, H5S_ALL, H5S_ALL, H5P_DEFAULT, wdata);
-
- /*
- * Change to the custom memory allocation routines for reading
- * VL data
- */
- xfer_pid=H5Pcreate(H5P_DATASET_XFER);
-
-    ret=H5Pset_vlen_mem_manager(xfer_pid, vltypes_alloc_custom,
-                                &amp;mem_used, vltypes_free_custom,
-                                &amp;mem_used);
-
- /*
-     * Read dataset from disk. vltypes_alloc_custom and
-     * vltypes_free_custom will be used to manage memory.
- */
- ret=H5Dread(dataset, tid, H5S_ALL, H5S_ALL, xfer_pid, rdata);
-
- /*
- * Display data read in
- */
- for(i=0; i&lt;SPACE_DIM; i++) {
-        printf("%u-th element length is %u \n", i,
-               (unsigned) rdata[i].len);
-        for(j=0; j&lt;rdata[i].len; j++) {
-            printf(" %u ",((unsigned int *)rdata[i].p)[j] );
- }
- printf("\n");
- } /* end for */
-
- /*
- * Reclaim the read VL data. vltypes_free_custom will be used
- * to reclaim the space.
- */
- ret=H5Dvlen_reclaim(tid, sid, xfer_pid, rdata);
-
- /*
- * Reclaim the write VL data. C language free function will be
- * used to reclaim space.
- */
- ret=H5Dvlen_reclaim(tid, sid, H5P_DEFAULT, wdata);
-
- /*
- * Close Dataset
- */
- ret = H5Dclose(dataset);
-
- /*
- * Close datatype
- */
- ret = H5Tclose(tid);
-
- /*
- * Close disk dataspace
- */
- ret = H5Sclose(sid);
-
- /*
- * Close dataset transfer property list
- */
- ret = H5Pclose(xfer_pid);
-
- /*
- * Close file
- */
- ret = H5Fclose(fid);
-
-    return 0;
-}
- </pre>
- </td>
- </tr>
-</table>
-</center>
-
-And the output from this sample code would be as follows:
-
-<center>
-<table border align=center width="100%">
- <caption align=bottom><h4>Example: Variable-length Datatypes, Sample Output</h4></caption>
- <tr>
- <td>
- <pre>
-0-th element length is 1
-0
-1-th element length is 2
-10 11
-2-th element length is 3
-20 21 22
-3-th element length is 4
-30 31 32 33
- </pre>
- </td>
- </tr>
-</table>
-</center>
-
-<p>
-For further samples of VL datatype code, see the tests in <code>test/tvltypes.c</code>
-in the HDF5 distribution.
-
-
-
-
-<h2>10. Array Datatypes</h2>
-
-The array class of datatypes, <code>H5T_ARRAY</code>, allows the
-construction of true, homogeneous, multi-dimensional arrays.
-Since these are homogeneous arrays, each element of the array will be
-of the same datatype, designated at the time the array is created.
-
-<p>
-Arrays can be nested.
-Not only is an array datatype used as an element of an HDF5 dataset,
-but the elements of an array datatype may be of any datatype,
-including another array datatype.
-
-<p>
-Array datatypes cannot be subdivided for I/O; the entire array must
-be transferred from one dataset to another.
-
-<p>
-Within the limitations outlined in the next paragraph, array datatypes
-may be <em>N</em>-dimensional and of any dimension size.
-Unlimited dimensions, however, are not supported.
-Functionality similar to unlimited dimension arrays is available through
-the use of variable-length datatypes.
-
-<p>
-The maximum number of dimensions, i.e., the maximum rank, of an array
-datatype is specified by the HDF5 library constant <code>H5S_MAX_RANK</code>.
-The minimum rank is 1 (one).
-All dimension sizes must be greater than 0 (zero).
-
-<p>
-One array datatype may only be converted to another array datatype
-if the number of dimensions and the sizes of the dimensions are equal
-and the datatype of the first array's elements can be converted
-to the datatype of the second array's elements.
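-
-<p>
-The shape half of this rule can be sketched in plain C. The names
-<code>array_shape_t</code>, <code>array_shapes_match</code>, and
-<code>MAX_RANK</code> are hypothetical and stand in for whatever the
-library uses internally; whether the element datatypes are convertible
-is a separate check that this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANK 32 /* assumed bound for this sketch only */

/* Rank and dimension sizes of an array datatype. */
typedef struct {
    int    rank;
    size_t dims[MAX_RANK];
} array_shape_t;

/* Returns 1 when both rank and every dimension size match, else 0. */
int array_shapes_match(const array_shape_t *a, const array_shape_t *b)
{
    int i;

    assert(a != NULL && b != NULL);
    assert(a->rank >= 1 && a->rank <= MAX_RANK);
    assert(b->rank >= 1 && b->rank <= MAX_RANK);
    if (a->rank != b->rank)
        return 0;
    for (i = 0; i < a->rank; i++) {
        if (a->dims[i] != b->dims[i])
            return 0;
    }
    return 1;
}
```

-<p>
-Note that a 5 x 4 array and a 4 x 5 array do not match under this rule,
-even though they hold the same number of elements.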
-
-<h3>10.1 Array Datatype APIs</h3>
-
-The functions for creating and manipulating array datatypes are
-as follows:
-
-<dir>
-<table>
- <tr>
- <td><code><b>H5Tarray_create</b></code>
- </td><td>&nbsp;&nbsp;
- </td><td>Creates an array datatype.
- </td></tr><tr><td colspan=3><dir>
- <em>hid_t</em> <code>H5Tarray_create</code>(
- <em>hid_t</em> <code>base</code>,
- <em>int</em> <code>rank</code>,
- <em>const hsize_t</em> <code>dims[/*rank*/]</code>,
- <em>const int</em> <code>perm[/*rank*/]</code>
- )
- </dir>
- </td></tr><tr>
- <td><code><b>H5Tget_array_ndims</b></code>
- </td><td>&nbsp;&nbsp;
- </td><td>Retrieves the rank of the array datatype.
- </td></tr><tr><td colspan=3><dir>
- <em>int</em> <code>H5Tget_array_ndims</code>(
- <em>hid_t</em> <code>adtype_id</code>
- )
- </dir>
- </td></tr><tr>
- <td><code><b>H5Tget_array_dims</b></code>
- </td><td>&nbsp;&nbsp;
- </td><td>Retrieves the dimension sizes of the array datatype.
- </td></tr><tr><td colspan=3><dir>
- <em>int</em> <code>H5Tget_array_dims</code>(
- <em>hid_t</em> <code>adtype_id</code>,
- <em>hsize_t</em> <code>dims[]</code>,
- <em>int</em> <code>perm[]</code>
- )
- </dir>
- </td></tr>
-</table>
-</dir>
-
-
-<h3>10.2 Transition Issues in Adapting Existing Software<br>
-&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
-(Transition to HDF5 Release 1.4 Only)</h3>
-
-The array datatype class is new with Release 1.4;
-prior releases included an array element for compound datatypes.
-<p>
-The use of the array datatype class will not interfere with the
-use of existing compound datatypes. Applications may continue to
-read and write the older field arrays, but they will no longer be
-able to create array fields in newly-defined compound datatypes.
-<p>
-Existing array fields will be transparently mapped to array datatypes
-when they are read in.
-
-
-<h3>10.3 Code Example</h3>
-
-The following example creates an array datatype and a dataset
-containing elements of the array datatype in an HDF5 file.
-It then writes the dataset to the file.
-<p>
-
-<center>
-<table border align=center width="100%">
- <caption align=bottom><h4>Example: Array Datatype</h4></caption>
- <tr>
- <td>
- <pre>
-#include &lt;stdio.h&gt;
-#include &lt;hdf5.h&gt;
-
-#define FILE "SDS_array_type.h5"
-#define DATASETNAME "IntArray"
-#define ARRAY_DIM1 5 /* array dimensions and rank */
-#define ARRAY_DIM2 4
-#define ARRAY_RANK 2
-#define SPACE_DIM 10 /* dataset dimensions and rank */
-#define RANK 1
-
-int
-main (void)
-{
- hid_t file, dataset; /* file and dataset handles */
- hid_t datatype, dataspace; /* handles */
- hsize_t sdims[] = {SPACE_DIM}; /* dataset dimensions */
- hsize_t adims[] = {ARRAY_DIM1, ARRAY_DIM2}; /* array dimensions */
- hsize_t adims_out[2];
- herr_t status;
- int data[SPACE_DIM][ARRAY_DIM1][ARRAY_DIM2]; /* data to write */
- int k, i, j;
- int array_rank_out;
-
- /*
- * Data and output buffer initialization.
- */
- for (k = 0; k &lt; SPACE_DIM; k++) {
- for (j = 0; j &lt; ARRAY_DIM1; j++) {
- for (i = 0; i &lt; ARRAY_DIM2; i++)
- data[k][j][i] = k;
- }
- }
- /*
- * Create a new file using H5F_ACC_TRUNC access,
- * default file creation properties, and default file
- * access properties.
- */
- file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
-
- /*
- * Describe the size of the array and create the data space for fixed
- * size dataset.
- */
- dataspace = H5Screate_simple(RANK, sdims, NULL);
-
- /*
- * Define array datatype for the data in the file.
- */
- datatype = H5Tarray_create(H5T_NATIVE_INT, ARRAY_RANK, adims, NULL);
-
- /*
- * Create a new dataset within the file using defined dataspace and
- * datatype and default dataset creation properties.
- */
- dataset = H5Dcreate(file, DATASETNAME, datatype, dataspace,
- H5P_DEFAULT);
-
- /*
- * Write the data to the dataset using default transfer properties.
- */
- status = H5Dwrite(dataset, datatype, H5S_ALL, H5S_ALL,
- H5P_DEFAULT, data);
-
-
- /*
- * Close/release resources.
- */
- H5Sclose(dataspace);
- H5Tclose(datatype);
- H5Dclose(dataset);
- /*
- * Reopen dataset, and return information about its datatype.
- */
- dataset = H5Dopen(file, DATASETNAME);
- datatype = H5Dget_type(dataset);
- array_rank_out = H5Tget_array_ndims(datatype);
- status = H5Tget_array_dims(datatype, adims_out, NULL);
- printf(" Array datatype rank is %d \n", array_rank_out);
- printf(" Array dimensions are %d x %d \n", (int)adims_out[0],
- (int)adims_out[1]);
-
- H5Tclose(datatype);
- H5Dclose(dataset);
- H5Fclose(file);
-
- return 0;
-}
- </pre>
- </td>
- </tr>
-</table>
-</center>
-
-
-
- <h2>11. Sharing Datatypes among Datasets</h2>
-
- <p>If a file has lots of datasets which have a common datatype,
- then the file could be made smaller by having all the datasets
- share a single datatype. Instead of storing a copy of the
- datatype in each dataset object header, a single datatype is stored
- and the object headers point to it. The space savings are
- probably significant only for datasets with a compound datatype,
- since the atomic datatypes can be described with just a few
- bytes anyway.
-
- <p>To create a bunch of datasets that share a single datatype
- just create the datasets with a committed (named) datatype.
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: Shared Datatypes</h4></caption>
- <tr>
- <td>
- <p>To create two datasets that share a common datatype
- one just commits the datatype, giving it a name, and
- then uses that datatype to create the datasets.
-
- <p><code><pre>
-hid_t t1 = ...some transient type...;
-H5Tcommit (file, "shared_type", t1);
-hid_t dset1 = H5Dcreate (file, "dset1", t1, space, H5P_DEFAULT);
-hid_t dset2 = H5Dcreate (file, "dset2", t1, space, H5P_DEFAULT);
- </pre></code>
-
- <p>And to create two additional datasets later which
- share the same type as the first two datasets:
-
- <p><code><pre>
-hid_t dset1 = H5Dopen (file, "dset1");
-hid_t t2 = H5Dget_type (dset1);
-hid_t dset3 = H5Dcreate (file, "dset3", t2, space, H5P_DEFAULT);
-hid_t dset4 = H5Dcreate (file, "dset4", t2, space, H5P_DEFAULT);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
-
-
-
- <a name="Datatypes-DataConversion">
- <h2>12. Data Conversion</h2>
- </a>
-
- <p>The library is capable of converting data from one type to
- another and does so automatically when reading or writing the
- raw data of a dataset, attribute data, or fill values. The
- application can also change the type of data stored in an array.
-
- <p>In order to ensure that data conversion exceeds disk I/O rates,
- common data conversion paths can be hand-tuned and optimized for
- performance. The library contains very efficient code for
- conversions between most native datatypes and a few non-native
- datatypes, but if a hand-tuned conversion function is not
- available, then the library falls back to a slower but more
- general conversion function. The application programmer can
- define additional conversion functions when the library's
- repertoire is insufficient. In fact, if an application does
- define a conversion function which would be of general interest,
- we request that the function be submitted to the HDF5
- development team for inclusion in the library.
-
- <p><b>Note:</b> The HDF5 library contains a deliberately limited
- set of conversion routines. It can convert from one integer
- format to another, from one floating point format to another,
- and from one struct to another. It can also perform byte
- swapping when the source and destination types are otherwise the
- same. The library does not contain any functions for converting
- data between integer and floating point formats. It is
- anticipated that some users will find it necessary to develop
- float to integer or integer to float conversion functions at the
- application level; users are invited to submit those functions
- to be considered for inclusion in future versions of the
- library.
-
- <p>A conversion path contains a source and destination datatype
- and each path contains a <em>hard</em> conversion function
- and/or a <em>soft</em> conversion function. The only difference
- between hard and soft functions is the way in which the library
- chooses which function applies: A hard function applies to a
- specific conversion path while a soft function may apply to
- multiple paths. When both hard and soft functions apply to a
- conversion path, then the hard function is favored and when
- multiple soft functions apply, the one defined last is favored.
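-
- <p>The selection rule can be sketched in plain C. This is not the
- library's lookup code: <code>dtype_t</code>, <code>conv_entry_t</code>,
- and <code>choose_conv</code> are hypothetical, the table is assumed to
- be in registration order, and a soft function is assumed to match on a
- pair of type classes.

```c
#include <assert.h>
#include <stddef.h>

/* A concrete type (id) and the class it belongs to (cls). */
typedef struct {
    int id;
    int cls;
} dtype_t;

/* One registered conversion function, listed in registration order. */
typedef struct {
    int         is_hard; /* 1: src/dst are type ids; 0: they are classes */
    int         src;
    int         dst;
    const char *name;
} conv_entry_t;

/* Returns the name of the function chosen for src -> dst, or NULL. */
const char *choose_conv(const conv_entry_t *table, size_t n,
                        dtype_t src, dtype_t dst)
{
    const char *soft = NULL;
    size_t i;

    assert(table != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (table[i].is_hard) {
            if (table[i].src == src.id && table[i].dst == dst.id)
                return table[i].name; /* a hard function is favored */
        } else if (table[i].src == src.cls && table[i].dst == dst.cls) {
            soft = table[i].name;     /* the one defined last is kept */
        }
    }
    return soft;
}
```

- <p>A path with no applicable function yields <code>NULL</code> in
- this sketch; how the library reports that case is not shown here.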
-
- <p>A data conversion function is of type <code>H5T_conv_t</code>,
- which is defined as follows:
-
-<dir><pre><em>typedef</em> herr_t (<em>*H5T_conv_t</em>) (hid_t <em>src_id</em>,
- hid_t <em>dst_id</em>,
- H5T_cdata_t *<em>cdata</em>,
- hsize_t <em>nelmts</em>,
- size_t <em>buf_stride</em>,
- size_t <em>bkg_stride</em>,
- void *<em>buffer</em>,
- void *<em>bkg_buffer</em>,
- hid_t <em>dset_xfer_plist</em>);</pre></dir>
-
-
- <p>The conversion function is called with
- the source and destination datatypes (<em>src_id</em> and
- <em>dst_id</em>),
- the path-constant data struct (<em>cdata</em>),
- the number of instances of the datatype to convert (<em>nelmts</em>),
- a conversion buffer (<em>buffer</em>) which initially contains
- an array of data having the source type and on return will
- contain an array of data having the destination type,
- a temporary or background buffer (<em>bkg_buffer</em>,
- see description of <code>H5T_BKG_YES</code> below),
- conversion and background buffer strides (<em>buf_stride</em> and
- <em>bkg_stride</em>) that indicate what data is to be converted, and
- a dataset transfer properties list (<em>dset_xfer_plist</em>).
-
- <p><em>buf_stride</em> and <em>bkg_stride</em> are in bytes and
- are related to the size of the datatype.
- If every data element is to be converted, the parameter's value
- is equal to the size of the datatype;
- if every other data element is to be converted, the parameter's value
- is equal to twice the size of the datatype; etc.
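-
- <p>The effect of the stride can be sketched with a function that
- visits elements of a buffer at a given byte distance. The function
- name <code>swap4_strided</code> and the fixed 4-byte element size
- are assumptions made for this illustration; the function is not part
- of the library.

```c
#include <stddef.h>

/* Reverse the bytes of each selected 4-byte element in place.
 * buf_stride is the distance in bytes between consecutive elements to
 * convert: 4 visits every element, 8 visits every other element. */
static void
swap4_strided(void *buffer, size_t nelmts, size_t buf_stride)
{
    unsigned char *p = (unsigned char *)buffer;
    size_t i;

    for (i = 0; i < nelmts; i++, p += buf_stride) {
        unsigned char t;

        t = p[0]; p[0] = p[3]; p[3] = t;
        t = p[1]; p[1] = p[2]; p[2] = t;
    }
}
```

- <p>With a stride of 8 and two elements, only the first and third
- 4-byte elements of the buffer are swapped; the elements in between
- are left untouched.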
-
- <p><em>dset_xfer_plist</em> may contain properties that are passed
- to the read and write calls.
- This parameter is currently used only with variable-length data.
-
- <p><em>bkg_buffer</em> and <em>bkg_stride</em> are used only with
- compound datatypes.
-
- <p>The path-constant data struct, <code>H5T_cdata_t</code>,
- is declared as follows:
-
-<dir><pre><em>typedef</em> struct <em>H5T_cdata_t</em> {
-    H5T_cmd_t  <em>command</em>;
-    H5T_bkg_t  <em>need_bkg</em>;
-    hbool_t    <em>recalc</em>;
-    void      *<em>priv</em>;
-} <em>H5T_cdata_t</em>;</pre></dir>
-
- <p>The <code>command</code> field of the <em>cdata</em> argument
- determines what happens within the conversion function. Its
- values can be:
-
- <dl>
- <dt><code>H5T_CONV_INIT</code>
- <dd>This command is sent to hard conversion functions when they
- are registered, and to soft conversion functions when the library
- is determining whether a conversion can be used for a particular
- path. <em>src_id</em> and <em>dst_id</em> are the end-points
- of the path being queried and <em>cdata</em> is all zero. The
- conversion function should examine the source and destination
- types and return zero if the conversion is possible and a negative
- value otherwise (hard conversions need not do this since they have
- presumably been registered only on paths they support). If
- the conversion is possible, the function may allocate and
- initialize private data and assign the pointer to the
- <code>priv</code> field of <em>cdata</em> (or private data can
- be initialized later). It should also initialize the
- <code>need_bkg</code> field described below. The <em>buffer</em>
- and <em>bkg_buffer</em> pointers will be null pointers.
-
- <br><br>
- <dt><code>H5T_CONV_CONV</code>
- <dd>This command indicates that data points should be converted.
- The conversion function should initialize the
- <code>priv</code> field of <em>cdata</em> if it wasn't
- initialized during the <code>H5T_CONV_INIT</code> command and
- then convert <em>nelmts</em> instances of the
- source type to the destination type. The
- <em>buffer</em> serves as both input and output. The
- background buffer (<em>bkg_buffer</em>) is supplied according
- to the value of the <code>need_bkg</code> field of
- <em>cdata</em> (the values are described below).
-
- <br><br>
- <dt><code>H5T_CONV_FREE</code>
- <dd>The conversion function is about to be removed from some
- path and the private data (the
- <code><em>cdata</em>->priv</code> pointer) should be freed and
- set to null. All other pointer arguments are null,
- <em>src_id</em> and <em>dst_id</em> are invalid
- (negative), and the <em>nelmts</em> argument is zero.
-
- <br><br>
- <dt><em>Others...</em>
- <dd>Other commands might be implemented later and conversion
- functions that don't support those commands should return a
- negative value.
- </dl>
-
-
- <p>Whether a background buffer is supplied to a conversion
- function, and whether the background buffer is initialized
- depends on the value of <code><em>cdata</em>->need_bkg</code>
- which the conversion function should have initialized during the
- H5T_CONV_INIT command. It can have one of these values:
-
- <dl>
- <dt><code>H5T_BKG_NONE</code>
- <dd>No background buffer will be supplied to the conversion
- function. This is the default.
-
- <br><br>
- <dt><code>H5T_BKG_TEMP</code>
- <dd>A background buffer will be supplied but it will not be
- initialized. This is useful for those functions requiring some
- extra buffer space as the buffer can probably be allocated
- more efficiently by the library (the application can supply
- the buffer as part of the dataset transfer property list).
-
- <br><br>
- <dt><code>H5T_BKG_YES</code>
- <dd>An initialized background buffer is passed to the conversion
- function. The buffer is initialized with the current values
- of the destination for the data which is passed in through the
- <em>buffer</em> argument. It can be used to "fill in between
- the cracks". For instance, if the destination type is a
- compound datatype and we are initializing only part of the
- compound datatype from the source type then the background
- buffer can be used to initialize the other part of the
- destination.
- </dl>
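-
- <p>The role of an initialized background buffer
- (<code>H5T_BKG_YES</code>) can be sketched as follows. The record
- layouts <code>src_rec_t</code> and <code>dst_rec_t</code> and the
- function <code>conv_with_bkg</code> are hypothetical and exist only
- for this illustration; a real conversion function would receive the
- same buffers through the <code>H5T_conv_t</code> arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layouts, for illustration only. */
typedef struct { int a; } src_rec_t;            /* source has member a only */
typedef struct { int a; double b; } dst_rec_t;  /* destination has a and b */

/* Convert nelmts records in place. The buffer must be large enough to
 * hold nelmts destination records. Elements are processed right to left
 * because dst_rec_t is larger than src_rec_t. Member b does not exist in
 * the source, so its value is taken from the background buffer, which
 * holds the current contents of the destination. */
static void
conv_with_bkg(void *buffer, const void *bkg_buffer, size_t nelmts)
{
    unsigned char       *buf = (unsigned char *)buffer;
    const unsigned char *bkg = (const unsigned char *)bkg_buffer;
    size_t i;

    for (i = nelmts; i-- > 0; ) {
        src_rec_t s;
        dst_rec_t d;

        memcpy(&s, buf + i * sizeof s, sizeof s);  /* read source element */
        memcpy(&d, bkg + i * sizeof d, sizeof d);  /* start from background */
        d.a = s.a;                                 /* fill the converted member */
        memcpy(buf + i * sizeof d, &d, sizeof d);  /* write destination element */
    }
}
```

- <p>After the call, each destination record carries the converted
- member from the source and the untouched member from the background,
- which is the "fill in between the cracks" behavior described above.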
-
- <p>The <code>recalc</code> field of <em>cdata</em> is set when the
- conversion path table changes. It can be used by conversion
- functions that cache other conversion paths so they know when
- their cache needs to be recomputed.
-
-
- <p>Once a conversion function is written it can be registered and
- unregistered with these functions:
-
- <dl>
- <dt><code>herr_t H5Tregister(H5T_pers_t <em>pers</em>, const
- char *<em>name</em>, hid_t <em>src_type</em>, hid_t
- <em>dest_type</em>, H5T_conv_t <em>func</em>)</code>
- <dd>Once a conversion function is written, the library must be
- notified so it can be used. The function can be registered as
- a hard (<code>H5T_PERS_HARD</code>) or soft
- (<code>H5T_PERS_SOFT</code>) conversion depending on the value
- of <em>pers</em>, displacing any previous conversions for all
- applicable paths. The <em>name</em> is used only for
- debugging but must be supplied. If <em>pers</em> is
- <code>H5T_PERS_SOFT</code> then only the type classes of the
- <em>src_type</em> and <em>dest_type</em> are used. For
- instance, to register a general soft conversion function that
- can be applied to any integer to integer conversion one could
- say: <code>H5Tregister(H5T_PERS_SOFT, "i2i", H5T_NATIVE_INT,
- H5T_NATIVE_INT, convert_i2i)</code>. One special conversion
- path called the "no-op" conversion path is always defined by
- the library and used as the conversion function when no data
- transformation is necessary. The application can redefine this
- path by specifying a new hard conversion function with a
- negative value for both the source and destination datatypes,
- but the library might not call the function under certain
- circumstances.
-
- <br><br>
- <dt><code>herr_t H5Tunregister (H5T_pers_t <em>pers</em>, const
- char *<em>name</em>, hid_t <em>src_type</em>, hid_t
- <em>dest_type</em>, H5T_conv_t <em>func</em>)</code>
- <dd>Any conversion path or function that matches the criteria
- specified by a call to this function is removed from the type
- conversion table. All fields have the same interpretation as
- for <code>H5Tregister()</code> with the added feature that any
- (or all) may be wild cards. The
- <code>H5T_PERS_DONTCARE</code> constant should be used to
- indicate a wild card for the <em>pers</em> argument. The wild
- card <em>name</em> is the null pointer or empty string, the
- wild card for the <em>src_type</em> and <em>dest_type</em>
- arguments is any negative value, and the wild card for the
- <em>func</em> argument is the null pointer. The special no-op
- conversion path is never removed by this function.
- </dl>
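-
- <p>The wild card rules listed for <code>H5Tunregister()</code> can
- be summarized by a small matching predicate. This is a sketch of the
- rules as stated in the text, not the library's code: the types
- <code>pers_t</code>, <code>entry_t</code>, and
- <code>conv_fn_t</code>, and the functions <code>matches</code> and
- <code>dummy_conv</code>, are hypothetical names used only here.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for the library types; not the real HDF5 definitions. */
typedef int (*conv_fn_t)(void);
typedef enum { PERS_DONTCARE = -1, PERS_HARD = 0, PERS_SOFT = 1 } pers_t;

typedef struct {
    pers_t      pers;
    const char *name;
    int         src;    /* source type id (non-negative when valid) */
    int         dst;    /* destination type id (non-negative when valid) */
    conv_fn_t   func;
} entry_t;

/* Placeholder conversion function so that an entry has something to hold. */
static int dummy_conv(void) { return 0; }

/* Return 1 if table entry e matches the removal criteria, else 0.
 * Each criterion is skipped when it holds its wild card value. */
static int
matches(const entry_t *e, pers_t pers, const char *name,
        int src, int dst, conv_fn_t func)
{
    if (pers != PERS_DONTCARE && pers != e->pers)
        return 0;
    if (name != NULL && name[0] != '\0' && strcmp(name, e->name) != 0)
        return 0;                       /* NULL or "" is the wild card */
    if (src >= 0 && src != e->src)
        return 0;                       /* any negative value is a wild card */
    if (dst >= 0 && dst != e->dst)
        return 0;
    if (func != NULL && func != e->func)
        return 0;                       /* NULL is the wild card */
    return 1;
}
```

- <p>Passing wild cards for every argument matches every entry, which
- is why the text notes that the special no-op path is exempt from
- removal.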
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: A conversion
- function</h4></caption>
- <tr>
- <td>
- <p>Here's an example application-level function that
- converts Cray <code>unsigned short</code> to any
- big-endian unsigned integer that has no padding. A Cray
- <code>short</code> is a big-endian value which has 32
- bits of precision in the high-order bits of a 64-bit
- word.
-
- <p><code><pre>
-#include &lt;stddef.h&gt;
-#include &lt;stdlib.h&gt;
-#include &lt;hdf5.h&gt;
-
-/* H5T_CRAY_USHORT is assumed to be a datatype defined by the application. */
-
-typedef struct {
-    size_t dst_size;    /* size of the destination type in bytes */
-    int    direction;   /* 1: left to right, -1: right to left */
-} cray_ushort2be_t;
-
-herr_t
-cray_ushort2be (hid_t src_id, hid_t dst_id,
-                H5T_cdata_t *cdata, hsize_t nelmts,
-                size_t buf_stride, size_t bkg_stride, void *buf,
-                void *background, hid_t plist)
-{
-    unsigned char    *src = (unsigned char *)buf;
-    unsigned char    *dst = src;
-    cray_ushort2be_t *priv = NULL;
-    size_t            dst_size, j;
-    hsize_t           i;
-
-    switch (cdata-&gt;command) {
-    case H5T_CONV_INIT:
-        /*
-         * We are being queried to see if we handle this
-         * conversion. We can handle conversion from
-         * Cray unsigned short to any other big-endian
-         * unsigned integer that doesn't have padding.
-         */
-        if (H5Tequal (src_id, H5T_CRAY_USHORT) &lt;= 0 ||
-            H5T_ORDER_BE != H5Tget_order (dst_id) ||
-            H5T_SGN_NONE != H5Tget_sign (dst_id) ||
-            8*H5Tget_size (dst_id) != H5Tget_precision (dst_id)) {
-            return -1;
-        }
-
-        /*
-         * Initialize private data. If the destination size
-         * is larger than the source size, then we must
-         * process the elements from right to left.
-         */
-        priv = malloc (sizeof(cray_ushort2be_t));
-        if (NULL == priv) return -1;
-        cdata-&gt;priv = priv;
-        priv-&gt;dst_size = H5Tget_size (dst_id);
-        if (priv-&gt;dst_size&gt;8) {
-            priv-&gt;direction = -1;
-        } else {
-            priv-&gt;direction = 1;
-        }
-        break;
-
-    case H5T_CONV_FREE:
-        /*
-         * Free private data.
-         */
-        free (cdata-&gt;priv);
-        cdata-&gt;priv = NULL;
-        break;
-
-    case H5T_CONV_CONV:
-        /*
-         * Convert each element, watch out for overlap src
-         * with dst on the left-most element of the buffer.
-         * The elements are assumed to be contiguous, so the
-         * stride arguments are not used.
-         */
-        priv = (cray_ushort2be_t *)(cdata-&gt;priv);
-        dst_size = priv-&gt;dst_size;
-        if (priv-&gt;direction&lt;0 &amp;&amp; nelmts&gt;0) {
-            src += (nelmts - 1) * 8;
-            dst += (nelmts - 1) * dst_size;
-        }
-        for (i=0; i&lt;nelmts; i++) {
-            if (src==dst &amp;&amp; dst_size&lt;4) {
-                for (j=0; j&lt;dst_size; j++) {
-                    dst[j] = src[j+4-dst_size];
-                }
-            } else {
-                for (j=0; j&lt;4 &amp;&amp; j&lt;dst_size; j++) {
-                    dst[dst_size-(j+1)] = src[3-j];
-                }
-                for (j=4; j&lt;dst_size; j++) {
-                    dst[dst_size-(j+1)] = 0;
-                }
-            }
-            src += 8 * priv-&gt;direction;
-            dst += (ptrdiff_t)dst_size * priv-&gt;direction;
-        }
-        break;
-
-    default:
-        /*
-         * Unknown command.
-         */
-        return -1;
-    }
-    return 0;
-}
- </pre></code>
-
- <p>The <em>background</em> argument is ignored since
- it's generally not applicable to atomic datatypes.
- </td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom><h4>Example: Soft
- Registration</h4></caption>
- <tr>
- <td>
- <p>The conversion function described in the previous
- example applies to more than one conversion path.
- Instead of enumerating all possible paths, we register
- it as a soft function and allow it to decide which
- paths it can handle.
-
- <p><code><pre>
-H5Tregister(H5T_PERS_SOFT, "cus2be",
- H5T_NATIVE_INT, H5T_NATIVE_INT,
- cray_ushort2be);
- </pre></code>
-
- <p>This causes it to be consulted for any conversion
- from an integer type to another integer type. The
- second argument is just a short identifier which will
- be printed with the datatype conversion statistics.
- </td>
- </tr>
- </table>
- </center>
-
-
- <p><b>NOTE:</b> The idea of a master soft list and being able to
- query conversion functions for their abilities tries to overcome
- problems we saw with AIO. Namely, that there was a dichotomy
- between generic conversions and specific conversions that made
- it very difficult to write a conversion function that operated
- on, say, integers of any size and order as long as they don't
- have zero padding. The AIO mechanism required such a function
- to be explicitly registered (like
- <code>H5Tregister_hard()</code>) for each and every possible
- conversion path whether that conversion path was actually used
- or not.</p>
-
-
-<hr>
-<!-- #BeginLibraryItem "/ed_libs/Footer.lbi" --><address>
-<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
-<br>
-Describes HDF5 Release 1.4.5, February 2003
-</address><!-- #EndLibraryItem --><!-- Created: Thu Dec 4 14:57:32 EST 1997 -->
-<!-- hhmts start -->
-Last modified: 2 August 2001
-<!-- hhmts end -->
-
-
-</body>
-</html>