[svn-r467] Restructuring documentation.

author: Quincey Koziol <koziol@hdfgroup.org> 1998-07-08 14:54:54 (GMT)
committer: Quincey Koziol <koziol@hdfgroup.org> 1998-07-08 14:54:54 (GMT)
commit: bd1e676c521d881b3143829f493a28b5ced1294b (patch)
tree: 69c50f9fe21ce87f293d8617a6bd51b4cc1e0244 /doc/html/Datasets.html
parent: 73345095897d9698bb1f2f7df830bf80a56dc65a (diff)
download: hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.zip
hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.tar.gz
hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.tar.bz2
1 files changed, 839 insertions, 0 deletions
diff --git a/doc/html/Datasets.html b/doc/html/Datasets.html
new file mode 100644
index 0000000..e0f9680
--- /dev/null
+++ b/doc/html/Datasets.html
@@ -0,0 +1,839 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>The Dataset Interface (H5D)</title>
+  </head>
+
+  <body>
+    <h1>The Dataset Interface (H5D)</h1>
+
+    <h2>1. Introduction</h2>
+
+    <p>The purpose of the dataset interface is to provide a mechanism
+      to describe properties of datasets and to transfer data between
+      memory and disk.  A dataset is composed of a collection of raw
+      data points and four classes of meta data to describe the data
+      points.  The interface is hopefully designed in such a way as to
+      allow new features to be added without disrupting current
+      applications that use the dataset interface.
+
+    <p>The four classes of meta data are:
+
+    <dl>
+      <dt>Constant Meta Data
+      <dd>Meta data that is created when the dataset is created and
+	exists unchanged for the life of the dataset.  For instance,
+	the data type of stored array elements is defined when the
+	dataset is created and cannot be subsequently changed.
+	
+      <dt>Persistent Meta Data
+      <dd>Meta data that is an integral and permanent part of a
+	dataset but can change over time.  For instance, the size in
+	any dimension can increase over time if such an increase is
+	allowed when the dataset was created.
+
+      <dt>Memory Meta Data
+      <dd>Meta data that exists to describe how raw data is organized
+	in the application's memory space.  For instance, the data
+	type of elements in an application array might not be the same
+	as the data type of those elements as stored in the HDF5 file.
+
+      <dt>Transport Meta Data
+      <dd>Meta data that is used only during the transfer of raw data
+	from one location to another.  For instance, the number of
+	processes participating in a collective I/O request or hints
+	to the library to control caching of raw data.
+    </dl>
+
+    <p>Each of these classes of meta data is handled differently by
+      the library although the same API might be used to create them.
+      For instance, the data type exists as constant meta data and as
+      memory meta data; the same API (the <code>H5T</code> API) is
+      used to manipulate both pieces of meta data but they're handled
+      by the dataset API (the <code>H5D</code> API) in different
+      manners.
+
+
+
+    <h2>2. Storage Layout Properties</h2>
+
+    <p>The dataset API partitions these terms on three orthogonal axes
+      (layout, compression, and external storage) and uses a
+      <em>dataset creation property list</em> to hold the various
+      settings and pass them through the dataset interface.  This is
+      similar to the way HDF5 files are created with a file creation
+      property list.  A dataset creation property list is always
+      derived from the default dataset creation property list (use
+      <code>H5Pcreate()</code> to get a copy of the default property
+      list) by modifying properties with various
+      <code>H5Pset_<em>property</em>()</code> functions.
+
+    <dl>
+      <dt><code>herr_t H5Pset_layout (hid_t <em>plist_id</em>,
+	  H5D_layout_t <em>layout</em>)</code>
+      <dd>The storage layout is a piece of constant meta data that
+	describes what method the library uses to organize the raw
+	data on disk. The default layout is contiguous storage.
+
+	<br><br>
+	<dl>
+	  <dt><code>H5D_COMPACT</code>
+	  <dd>The raw data is presumably small and can be stored
+	    directly in the object header.  Such data is
+	    non-extendible, non-compressible, non-sparse, and cannot
+	    be stored externally. Most of these restrictions are
+	    arbitrary but are enforced because of the small size of
+	    the raw data. Storing data in this format eliminates the
+	    disk seek/read request normally necessary to read raw
+	    data. <b>This layout is not implemented yet.</b>
+
+	    <br><br>
+	  <dt><code>H5D_CONTIGUOUS</code>
+	  <dd>The raw data is large, non-extendible, non-compressible,
+	    non-sparse, and can be stored externally.  This is the
+	    default value for the layout property.  The term
+	    <em>large</em> means that it may not be possible to hold
+	    the entire dataset in memory.  The non-compressibility is
+	    a side effect of the data being large, contiguous, and
+	    fixed-size at the physical level, which could cause
+	    partial I/O requests to be extremely expensive if
+	    compression were allowed.
+
+	    <br><br>
+	  <dt><code>H5D_CHUNKED</code>
+	  <dd>The raw data is large and can be extended in any
+	    dimension at any time (provided the data space also allows
+	    the extension). It may be sparse at the chunk level (each
+	    chunk is non-sparse, but there might only be a few chunks)
+	    and each chunk can be compressed and/or stored externally.
+	    A dataset is partitioned into chunks so each chunk is the
+	    same logical size.  The chunks are indexed by a B-tree and
+	    are allocated on demand (although it might be useful to be
+	    able to preallocate storage for parts of a chunked array
+	    to reduce contention for the B-tree in a parallel
+	    environment). The chunk size must be defined with
+	    <code>H5Pset_chunk()</code>.
+
+	    <br><br>
+	  <dt><em>others...</em>
+	  <dd>Other layout types may be defined later without breaking
+	    existing code.  However, to be able to correctly read or
+	    modify data stored with one of these new layouts, the
+	    application will need to be linked with a new version of
+	    the library.  This happens automatically on systems with
+	    dynamic linking.
+	</dl>
+    </dl>
+
+    <p>Once the general layout is defined, the user can define
+      properties of that layout.  Currently, the only layout that has
+      user-settable properties is the <code>H5D_CHUNKED</code> layout,
+      which needs to know the dimensionality and chunk size.
+
+    <dl>
+      <dt><code>herr_t H5Pset_chunk (hid_t <em>plist_id</em>, int
+	  <em>ndims</em>, hsize_t <em>dim</em>[])</code>
+      <dd>This function defines the logical size of a chunk for
+	chunked layout.  If the layout property is set to
+	<code>H5D_CHUNKED</code> and the chunk size is set to
+	<em>dim</em>. The number of elements in the <em>dim</em> array
+	is the dimensionality, <em>ndims</em>. One need not call
+	<code>H5Dset_layout()</code> when using this function since
+	the chunked layout is implied.
+    </dl>
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example: Chunked Storage</h4></caption>
+	  <tr>
+	    <td>
+	      <p>This example shows how a two-dimensional dataset
+		is partitioned into chunks.  The library can manage file
+		memory by moving the chunks around, and each chunk could be
+		compressed.  The chunks are allocated in the file on demand
+		when data is written to the chunk.
+	      <center>
+		<img alt="Chunked Storage" src="chunk1.gif">
+	      </center>
+
+	      <p><code><pre>
+size_t hsize[2] = {1000, 1000};
+plist = H5Pcreate (H5P_DATASET_CREATE);
+H5Pset_chunk (plist, 2, size);
+	      </pre></code>
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+
+    <p>Although it is most efficient if I/O requests are aligned on chunk
+      boundaries, this is not a constraint.  The application can perform I/O
+      on any set of data points as long as the set can be described by the
+      data space.  The set on which I/O is performed is called the
+      <em>selection</em>.
+
+    <h2>3. Compression Properties</h2>
+
+    <p>Some types of storage layout allow data compression which is
+      defined by the functions described here. <b>Compression is not
+      implemented yet.</b>
+
+    <dl>
+      <dt><code>herr_t H5Pset_compression (hid_t <em>plist_id</em>,
+	  H5Z_method_t <em>method</em>)</code>
+      <dt><code>H5Z_method_t H5Pget_compression (hid_t
+	  <em>plist_id</em>)</code>
+      <dd>These functions set and query the compression method that
+	is used to compress the raw data of a dataset.  The
+	<em>plist_id</em> is a dataset creation property list.  The
+	possible values for the compression method are:
+
+	<br><br>
+	<dl>
+	  <dt><code>H5Z_NONE</code>
+	  <dd>This is the default and specifies that no compression is
+	    to be performed.
+
+	    <br><br>
+	  <dt><code>H5Z_DEFLATE</code>
+	  <dd>This specifies that a variation of the Lempel-Ziv 1977
+	    (LZ77) encoding is used, the same encoding used by the
+	    free GNU <code>gzip</code> program.
+	</dl>
+
+	<br><br>
+      <dt><code>herr_t H5Pset_deflate (hid_t <em>plist_id</em>,
+	  int <em>level</em>)</code>
+      <dt><code>int H5Pget_deflate (hid_t <em>plist_id</em>)</code>
+      <dd>These functions set or query the deflate level of
+	dataset creation property list <em>plist_id</em>.  The
+	<code>H5Pset_deflate()</code> sets the compression method to
+	<code>H5Z_DEFLATE</code> and sets the compression level to
+	some integer between one and nine (inclusive).  One results in
+	the fastest compression while nine results in the best
+	compression ratio.  The default value is six if
+	<code>H5Pset_deflate()</code> isn't called.  The
+	<code>H5Pget_deflate()</code> returns the compression level
+	for the deflate method, or negative if the method is not the
+	deflate method.
+    </dl>
+
+    <h2>4. External Storage Properties</h2>
+
+    <p>Some storage formats may allow storage of data across a set of
+      non-HDF5 files. Currently, only the <code>H5D_CONTIGUOUS</code> storage
+      format allows external storage.  A set segments (offsets and sizes) in
+      one or more files is defined as an external file list, or <em>EFL</em>,
+      and the contiguous logical addresses of the data storage are mapped onto
+      these segments.
+
+    <dl>
+      <dt><code>herr_t H5Pset_external (hid_t <em>plist</em>, const
+	  char *<em>name</em>, off_t <em>offset</em>, hsize_t
+	  <em>size</em>)</code>
+      <dd>This function adds a new segment to the end of the external
+	file list of the specified dataset creation property list.  The
+	segment begins a byte <em>offset</em> of file <em>name</em> and
+	continues for <em>size</em> bytes.  The space represented by this
+	segment is adjacent to the space already represented by the external
+	file list.  The last segment in a file list may have the size
+	<code>H5F_UNLIMITED</em>.
+
+	  <br><br>
+      <dt><code>int H5Pget_external_count (hid_t <em>plist</em>)</code>
+      <dd>Calling this function returns the number of segments in an
+	external file list. If the dataset creation property list has no
+	external data then zero is returned.
+
+	<br><br>
+      <dt><code>herr_t H5Pget_external (hid_t <em>plist</em>, int
+	  <em>idx</em>, size_t <em>name_size</em>, char *<em>name</em>, off_t
+	  *<em>offset</em>, hsize_t *<em>size</em>)</code>
+      <dd>This is the counterpart for the <code>H5Pset_external()</code>
+	function.  Given a dataset creation property list and a zero-based
+	index into that list, the file name, byte offset, and segment size are
+	returned through non-null arguments.  At most <em>name_size</em>
+	characters are copied into the <em>name</em> argument which is not
+	null terminated if the file name is longer than the supplied name
+	buffer (this is similar to <code>strncpy()</code>).
+    </dl>
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example: Multiple Segments</h4></caption>
+	  <tr>
+	    <td>
+	      <p>This example shows how a contiguous, one-dimensional dataset
+		is partitioned into three parts and each of those parts is
+		stored in a segment of an external file. The top rectangle
+		represents the logical address space of the dataset
+		while the bottom rectangle represents an external file.
+	      <center>
+		<img alt="Multiple Segments" src="extern1.gif">
+	      </center>
+
+	      <p><code><pre>
+plist = H5Pcreate (H5P_DATASET_CREATE);
+H5Pset_external (plist, "velocity.data", 3000, 1000);
+H5Pset_external (plist, "velocity.data", 0, 2500);
+H5Pset_external (plist, "velocity.data", 4500, 1500);
+	      </pre></code>
+
+	      <p>One should note that the segments are defined in order of the
+		logical addresses they represent, not their order within the
+		external file.  It would also have been possible to put the
+		segments in separate files.  Care should be taken when setting
+		up segments in a single file since the library doesn't
+		automatically check for segments that overlap.
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example: Multi-Dimensional</h4></caption>
+	  <tr>
+	    <td>
+	      <p>This example shows how a contiguous, two-dimensional dataset
+		is partitioned into three parts and each of those parts is
+		stored in a separate external file. The top rectangle
+		represents the logical address space of the dataset
+		while the bottom rectangles represent external files.
+	      <center>
+		<img alt="Multiple Dimensions" src="extern2.gif">
+	      </center>
+
+	      <p><code><pre>
+plist = H5Pcreate (H5P_DATASET_CREATE);
+H5Pset_external (plist, "scan1.data", 0, 24);
+H5Pset_external (plist, "scan2.data", 0, 24);
+H5Pset_external (plist, "scan3.data", 0, 16);
+	      </pre></code>
+
+	      <p>The library maps the multi-dimensional array onto a linear
+		address space like normal, and then maps that address space
+		into the segments defined in the external file list.
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>The segments of an external file can exist beyond the end of the
+      file. The library reads that part of a segment as zeros.  When writing
+      to a segment that exists beyond the end of a file, the file is
+      automatically extended. Using this feature, one can create a segment
+      (or set of segments) which is larger than the current size of the
+      dataset, which allows to dataset to be extended at a future time
+      (provided the data space also allows the extension).
+
+    <p>All referenced external data files must exist before performing raw
+      data I/O on the dataset. This is normally not a problem since those
+      files are being managed directly by the application, or indirectly
+      through some other library.
+
+
+    <h2>5. Data Type</h2>
+
+    <p>Raw data has a constant data type which describes the data type
+      of the raw data stored in the file, and a memory data type that
+      describes the data type stored in application memory.  Both data
+      types are manipulated with the <a
+      href="Datatypes.html"><code>H5T</code></a> API.
+
+    <p>The constant file data type is associated with the dataset when
+      the dataset is created in a manner described below.  Once
+      assigned, the constant datatype can never be changed.
+
+    <p>The memory data type is specified when data is transferred
+      to/from application memory.  In the name of data sharability,
+      the memory data type must be specified, but can be the same
+      type identifier as the constant data type.
+
+    <p>During dataset I/O operations, the library translates the raw
+      data from the constant data type to the memory data type or vice
+      versa.  Structured data types include member offsets to allow
+      reordering of struct members and/or selection of a subset of
+      members and array data types include index permutation
+      information to allow things like transpose operations (<b>the
+      prototype does not support array reordering</b>) Permutations
+      are relative to some extrinsic descritpion of the dataset.
+
+
+
+    <h2>6. Data Space</h2>
+
+    <p>The dataspace of a dataset defines the number of dimensions
+      and the size of each dimension and is manipulated with the
+      <code>H5S</code> API.  The <em>simple</em> dataspace consists of
+      maximum dimension sizes and actual dimension sizes, which are
+      usually the same.  However, maximum dimension sizes can be the
+      constant <code>H5D_UNLIMITED</code> in which case the actual
+      dimension size can be incremented with calls to
+      <code>H5Dextend()</code>. The maximium dimension sizes are
+      constant meta data while the actual dimension sizes are
+      persistent meta data.  Initial actual dimension sizes are
+      supplied at the same time as the maximum dimension sizes when
+      the dataset is created.
+
+    <p>The dataspace can also be used to define partial I/O
+      operations. Since I/O operations have two end-points, the raw
+      data transfer functions take two data space arguments: one which
+      describes the application memory data space or subset thereof
+      and another which describes the file data space or subset
+      thereof.
+
+
+    <h2>7. Setting Constant or Persistent Properties</h2>
+
+    <p>Each dataset has a set of constant and persistent properties
+      which describe the layout method, pre-compression
+      transformation, compression method, data type, external storage,
+      and data space.  The constant properties are set as described
+      above in a dataset creation property list whose identifier is
+      passed to <code>H5Dcreate()</code>.
+
+    <dl>
+      <dt><code>hid_t H5Dcreate (hid_t <em>file_id</em>, const char
+	  *<em>name</em>, hid_t <em>type_id</em>, hid_t
+	  <em>space_id</em>, hid_t <em>create_plist_id</em>)</code>
+      <dd>A dataset is created by calling <code>H5Dcreate</code> with
+	a file identifier, a dataset name, a data type, a data space,
+	and constant properties.  The data type and data space are the
+	type and space of the dataset as it will exist in the file,
+	which may be different than in application memory.  The
+	<em>create_plist_id</em> is a <code>H5P_DATASET_CREATE</code>
+	property list created with <code>H5Pcreate()</code> and
+	initialized with the various functions described above.
+	<code>H5Dcreate()</code> returns a dataset handle for success
+	or negative for failure. The handle should eventually be
+	closed by calling <code>H5Dclose()</code> to release resources
+	it uses.
+
+	<br><br>
+      <dt><code>hid_t H5Dopen (hid_t <em>file_id</em>, const char
+	  *<em>name</em>)</code>
+      <dd>An existing dataset can be opened for access by calling this
+	function.  A dataset handle is returned for success or a
+	negative value is returned for failure.  The handle should
+	eventually be closed by calling <code>H5Dclose()</code> to
+	release resources it uses.
+
+	<br><br>
+      <dt><code>herr_t H5Dclose (hid_t <em>dataset_id</em>)</code>
+      <dd>This function closes a dataset handle and releases all
+	resources it might have been using.  The handle should not be
+	used in subsequent calls to the library.
+
+	<br><br>
+      <dt><code>herr_t H5Dextend (hid_t <em>dataset_id</em>,
+	  hsize_t <em>dim</em>[])</code>
+      <dd>This function extends a dataset by increasing the size in
+	one or more dimensions.  Not all datasets can be extended.
+    </dl>
+
+
+
+    <h2>8. Querying Constant or Persistent Properties</h2>
+
+    <p>Constant or persistent properties can be queried with a set of
+      three functions.  Each function returns an identifier for a copy
+      of the requested properties.  The identifier can be passed to
+      various functions which modify the underlying object to derive a
+      new object; the original dataset is completely unchanged.  The
+      return values from these functions should be properly destroyed
+      when no longer needed.
+
+    <dl>
+      <dt><code>hid_t H5Dget_type (hid_t <em>dataset_id</em>)</code>
+      <dd>Returns an identifier for a copy of the dataset permanent
+	data type or negative for failure.
+
+      <dt><code>hid_t H5Dget_space (hid_t <em>dataset_id</em>)</code>
+      <dd>Returns an identifier for a copy of the dataset permanent
+	data space, which also contains information about the current
+	size of the dataset if the data set is extendable with
+	<code>H5Dextend()</code>.
+
+      <dt><code>hid_t H5Dget_create_plist (hid_t
+	  <em>dataset_id</em>)</code>
+      <dd>Returns an identifier for a copy of the dataset creation
+	property list. The new property list is created by examining
+	various permanent properties of the dataset.  This is mostly a
+	catch-all for everything but type and space.
+    </dl>
+
+
+
+    <h2>9. Setting Memory and Transfer Properties</h2>
+
+    <p>A dataset also has memory properties which describe memory
+      within the application, and transfer properties that control
+      various aspects of the I/O operations.  The memory can have a
+      data type different than the permanent file data type (different
+      number types, different struct member offsets, different array
+      element orderings) and can also be a different size (memory is a
+      subset of the permanent dataset elements, or vice versa).  The
+      transfer properties might provide caching hints or collective
+      I/O information. Therefore, each I/O operation must specify
+      memory and transfer properties.
+
+    <p>The memory properties are specified with <em>type_id</em> and
+      <em>space_id</em> arguments while the transfer properties are
+      specified with the <em>transfer_id</em> property list for the
+      <code>H5Dread()</code> and <code>H5Dwrite()</code> functions
+      (these functions are described below).
+
+    <dl>
+      <dt><code>herr_t H5Pset_buffer (hid_t <em>xfer_plist</em>,
+	  size_t <em>max_buf_size</em>, void *<em>tconv_buf</em>, void
+	  *<em>bkg_buf</em>)</code>
+      <dt><code>size_t H5Pget_buffer (hid_t <em>xfer_plist</em>, void
+	  **<em>tconv_buf</em>, void **<em>bkg_buf</em>)</code>
+      <dd>Sets or retrieves the maximum size in bytes of the temporary
+	buffer used for data type conversion in the I/O pipeline.  An
+	application-defined buffer can also be supplied as the
+	<em>tconv_buf</em> argument, otherwise a buffer will be
+	allocated and freed on demand by the library. A second
+	temporary buffer <em>bkg_buf</em> can also be supplied and
+	should be the same size as the <em>tconv_buf</em>.  The
+	default values are 1MB for the maximum buffer size, and null
+	pointers for each buffer indicating that they should be
+	allocated on demand and freed when no longer needed. The
+	<code>H5Pget_buffer()</code> function returns the maximum
+	buffer size or zero on error.
+    </dl>
+
+    <p>If the maximum size of the temporary I/O pipeline buffers is
+      too small to hold the entire I/O request, then the I/O request
+      will be fragmented and the transfer operation will be strip
+      mined.  However, certain restrictions apply to the strip
+      mining.  For instance, when performing I/O on a hyperslab of a
+      simple data space the strip mining is in terms of the slowest
+      varying dimension.  So if a 100x200x300 hyperslab is requested,
+      the temporary buffer must be large enough to hold a 1x200x300
+      sub-hyperslab.
+
+    <p>To prevent strip mining from happening, the application should
+      use <code>H5Pset_buffer()</code> to set the size of the
+      temporary buffer so it's large enough to hold the entire
+      request.
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example</h4></caption>
+	  <tr>
+	    <td>
+	      <p>This example shows how to define a function that sets
+		a dataset transfer property list so that strip mining
+		does not occur.  It takes an (optional) dataset transfer
+		property list, a dataset, a data space that describes
+		what data points are being transfered, and a data type
+		for the data points in memory.  It returns a (new)
+		dataset transfer property list with the temporary
+		buffer size set to an appropriate value. The return
+		value should be passed as the fifth argument to
+		<code>H5Dread()</code> or <code>H5Dwrite()</code>.
+	      <p><code><pre>
+ 1 hid_t
+ 2 disable_strip_mining (hid_t xfer_plist, hid_t dataset,
+ 3                       hid_t space, hid_t mem_type)
+ 4 {
+ 5     hid_t file_type;          /* File data type */
+ 6     size_t type_size;         /* Sizeof larger type */
+ 7     size_t size;              /* Temp buffer size */
+ 8     hid_t xfer_plist;         /* Return value */
+ 9 
+10     file_type = H5Dget_type (dataset);
+11     type_size = MAX(H5Tget_size(file_type), H5Tget_size(mem_type));
+12     H5Tclose (file_type);
+13     size = H5Sget_npoints(space) * type_size;
+14     if (xfer_plist&lt;0) xfer_plist = H5Pcreate (H5P_DATASET_XFER);
+15     H5Pset_buffer(xfer_plist, size, NULL, NULL);
+16     return xfer_plist;
+17 }
+	      </pre></code>
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+      
+
+    <h2>10. Querying Memory or Transfer Properties</h2>
+
+    <p>Unlike constant and persistent properties, a dataset cannot be
+      queried for it's memory or transfer properties.  Memory
+      properties cannot be queried because the application already
+      stores those properties separate from the buffer that holds the
+      raw data, and the buffer may hold multiple segments from various
+      datasets and thus have more than one set of memory properties.
+      The transfer properties cannot be queried from the dataset
+      because they're associated with the transfer itself and not with
+      the dataset (but one can call
+      <code>H5Pget_<em>property</em>()</code> to query transfer
+      properties from a tempalate).
+
+
+    <h2>11. Raw Data I/O</h2>
+
+    <p>All raw data I/O is accomplished through these functions which
+      take a dataset handle, a memory data type, a memory data space,
+      a file data space, transfer properties, and an application
+      memory buffer.  They translate data between the memory data type
+      and space and the file data type and space.  The data spaces can
+      be used to describe partial I/O operations.
+      
+    <dl>
+      <dt><code>herr_t H5Dread (hid_t <em>dataset_id</em>, hid_t
+	  <em>mem_type_id</em>, hid_t <em>mem_space_id</em>, hid_t
+	  <em>file_space_id</em>, hid_t <em>xfer_plist_id</em>,
+	  void *<em>buf</em>/*out*/)</code>
+      <dd>Reads raw data from the specified dataset into <em>buf</em>
+	converting from file data type and space to memory data type
+	and space.
+
+	<br><br>
+      <dt><code>herr_t H5Dwrite (hid_t <em>dataset_id</em>, hid_t
+	  <em>mem_type_id</em>, hid_t <em>mem_space_id</em>, hid_t
+	  <em>file_space_id</em>, hid_t <em>xfer_plist_id</em>,
+	  const void *<em>buf</em>)</code>
+      <dd>Writes raw data from an application buffer <em>buf</em> to
+	the specified dataset converting from memory data type and
+	space to file data type and space.
+    </dl>
+
+
+    <p>In the name of sharability, the memory datatype must be
+      supplied.  However, it can be the same identifier as was used to
+      create the dataset or as was returned by
+      <code>H5Dget_type()</code>; the library will not implicitly
+      derive memory data types from constant data types.
+
+    <p>For complete reads of the dataset one may supply
+      <code>H5S_ALL</code> as the argument for the file data space.
+      If <code>H5S_ALL</code> is also supplied as the memory data
+      space then no data space conversion is performed.  This is a
+      somewhat dangerous situation since the file data space might be
+      different than what the application expects.
+
+
+
+    <h2>12. Examples</h2>
+
+    <p>The examples in this section illustrate some common dataset
+      practices.
+
+
+    <p>This example shows how to create a dataset which is stored in
+      memory as a two-dimensional array of native <code>double</code>
+      values but is stored in the file in Cray <code>float</code>
+      format using LZ77 compression.  The dataset is written to the
+      HDF5 file and then read back as a two-dimensional array of
+      <code>float</code> values.
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example 1</h4></caption>
+	  <tr>
+	    <td>
+	      <p><code><pre>
+ 1 hid_t file, data_space, dataset, properties;
+ 2 double dd[500][600];
+ 3 float ff[500][600];
+ 4 hsize_t dims[2], chunk_size[2];
+ 5 
+ 6 /* Describe the size of the array */
+ 7 dims[0] = 500;
+ 8 dims[1] = 600;
+ 9 data_space = H5Screate_simple (2, dims);
+10 
+11 
+12 /*
+13  * Create a new file using with read/write access,
+14  * default file creation properties, and default file
+15  * access properties.
+16  */
+17 file = H5Fcreate ("test.h5", H5F_ACC_RDWR, H5P_DEFAULT,
+18                   H5P_DEFAULT);
+19 
+20 /* 
+21  * Set the dataset creation plist to specify that
+22  * the raw data is to be partitioned into 100x100 element
+23  * chunks and that each chunk is to be compressed with
+24  * LZ77.
+25  */
+26 chunk_size[0] = chunk_size[1] = 100;
+27 properties = H5Pcreate (H5P_DATASET_CREATE);
+28 H5Pset_chunk (properties, 2, chunk_size);
+29 H5Pset_compression (properties, H5D_COMPRESS_LZ77);
+30 
+31 /*
+32  * Create a new dataset within the file.  The data type
+33  * and data space describe the data on disk, which may
+34  * be different than the format used in the application's
+35  * memory.
+36  */
+37 dataset = H5Dcreate (file, "dataset", H5T_CRAY_FLOAT,
+38                      data_space, properties);
+39 
+40 /*
+41  * Write the array to the file.  The data type and data
+42  * space describe the format of the data in the `dd'
+43  * buffer.  The raw data is translated to the format
+44  * required on disk defined above.  We use default raw
+45  * data transfer properties.
+46  */
+47 H5Dwrite (dataset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
+48           H5P_DEFAULT, dd);
+49 
+50 /*
+51  * Read the array as floats.  This is similar to writing
+52  * data except the data flows in the opposite direction.
+53  */
+54 H5Dread (dataset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL,
+55          H5P_DEFAULT, ff);
+56 
+64 H5Dclose (dataset);
+65 H5Sclose (data_space);
+66 H5Pclose (properties);
+67 H5Fclose (file);
+	      </pre></code>
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>This example uses the file created in Example 1 and reads a
+      hyperslab of the 500x600 file dataset.  The hyperslab size is
+      100x200 and it is located beginning at element
+      &lt;200,200&gt;. We read the hyperslab into an 200x400 array in
+      memory beginning at element &lt;0,0&gt; in memory.  Visually,
+      the transfer looks something like this: 
+
+      <center>
+	<img alt="Raw Data Transfer" src="dataset_p1.gif">
+      </center>
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example 2</h4></caption>
+	  <tr>
+	    <td>
+	      <p><code><pre>
+ 1 hid_t file, mem_space, file_space, dataset;
+ 2 double dd[200][400];
+ 3 hssize_t offset[2];
+ 4 hsize size[2];
+ 5 
+ 6 /*
+ 7  * Open an existing file and its dataset.
+ 8  */
+ 9 file = H5Fopen ("test.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
+10 dataset = H5Dopen (file, "dataset");
+11 
+12 /*
+13  * Describe the file data space.
+14  */
+15 offset[0] = 200; /*offset of hyperslab in file*/
+16 offset[1] = 200;
+17 size[0] = 100;   /*size of hyperslab*/
+18 size[1] = 200;
+19 file_space = H5Dget_space (dataset);
+20 H5Sset_hyperslab (file_space, 2, offset, size);
+21 
+22 /*
+23  * Describe the memory data space.
+24  */
+25 size[0] = 200;  /*size of memory array*/
+26 size[1] = 400;
+27 mem_space = H5Screate_simple (2, size);
+28 
+29 offset[0] = 0;  /*offset of hyperslab in memory*/
+30 offset[1] = 0;
+31 size[0] = 100;  /*size of hyperslab*/
+32 size[1] = 200;
+33 H5Sset_hyperslab (mem_space, 2, offset, size);
+34 
+35 /*
+36  * Read the dataset.
+37  */
+38 H5Dread (dataset, H5T_NATIVE_DOUBLE, mem_space,
+39          file_space, H5P_DEFAULT, dd);
+40 
+41 /*
+42  * Close/release resources.
+43  */
+44 H5Dclose (dataset);
+45 H5Sclose (mem_space);
+46 H5Sclose (file_space);
+47 H5Fclose (file);
+	      </pre></code>
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+    <p>If the file contains a compound data structure one of whose
+      members is a floating point value (call it "delta") but the
+      application is interested in reading an array of floating point
+      values which are just the "delta" values, then the application
+      should cast the floating point array as a struct with a single
+      "delta" member.
+
+    <p>
+      <center>
+	<table border align=center width="100%">
+	  <caption align=bottom><h4>Example 3</h4></caption>
+	  <tr>
+	    <td>
+	      <p><code><pre>
+ 1 hid_t file, dataset, type;
+ 2 double delta[200];
+ 3 
+ 4 /*
+ 5  * Open an existing file and its dataset.
+ 6  */
+ 7 file = H5Fopen ("test.h5", H5F_ACC_RDONLY, H5P_DEFAULT);
+ 8 dataset = H5Dopen (file, "dataset");
+ 9 
+10 /*
+11  * Describe the memory data type, a struct with a single
+12  * "delta" member.
+13  */
+14 type = H5Tcreate (H5T_COMPOUND, sizeof(double));
+15 H5Tinsert (type, "delta", 0, H5T_NATIVE_DOUBLE);
+16 
+17 /*
+18  * Read the dataset.
+19  */
+20 H5Dread (dataset, type, H5S_ALL, H5S_ALL,
+21          H5P_DEFAULT, dd);
+22 
+23 /*
+24  * Close/release resources.
+25  */
+26 H5Dclose (dataset);
+27 H5Tclose (type);
+28 H5Fclose (file);
+	      </pre></code>
+	    </td>
+	  </tr>
+	</table>
+      </center>
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Tue Dec  2 09:17:09 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Wed May 13 18:57:47 EDT 1998
+<!-- hhmts end -->
+  </body>
+</html>
author	Quincey Koziol <koziol@hdfgroup.org>	1998-07-08 14:54:54 (GMT)
committer	Quincey Koziol <koziol@hdfgroup.org>	1998-07-08 14:54:54 (GMT)
commit	bd1e676c521d881b3143829f493a28b5ced1294b (patch)
tree	69c50f9fe21ce87f293d8617a6bd51b4cc1e0244 /doc/html/Datasets.html
parent	73345095897d9698bb1f2f7df830bf80a56dc65a (diff)
download	hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.zip hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.tar.gz hdf5-bd1e676c521d881b3143829f493a28b5ced1294b.tar.bz2