summaryrefslogtreecommitdiffstats
path: root/doc/html/Filters.html
diff options
context:
space:
mode:
Diffstat (limited to 'doc/html/Filters.html')
-rw-r--r--doc/html/Filters.html593
1 files changed, 0 insertions, 593 deletions
diff --git a/doc/html/Filters.html b/doc/html/Filters.html
deleted file mode 100644
index a253cfb..0000000
--- a/doc/html/Filters.html
+++ /dev/null
@@ -1,593 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <title>Filters</title>
-
-<!-- #BeginLibraryItem "/ed_libs/styles_UG.lbi" -->
-<!--
- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
- * Copyright by the Board of Trustees of the University of Illinois. *
- * All rights reserved. *
- * *
- * This file is part of HDF5. The full HDF5 copyright notice, including *
- * terms governing use, modification, and redistribution, is contained in *
- * the files COPYING and Copyright.html. COPYING can be found at the root *
- * of the source code distribution tree; Copyright.html can be found at the *
- * root level of an installed copy of the electronic HDF5 document set and *
- * is linked from the top-level documents page. It can also be found at *
- * http://hdf.ncsa.uiuc.edu/HDF5/doc/Copyright.html. If you do not have *
- * access to either file, you may request a copy from hdfhelp@ncsa.uiuc.edu. *
- * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
- -->
-
-<link href="ed_styles/UGelect.css" rel="stylesheet" type="text/css">
-<!-- #EndLibraryItem --></head>
-
- <body bgcolor="#FFFFFF">
-
-
-<!-- #BeginLibraryItem "/ed_libs/NavBar_UG.lbi" --><hr>
-<center>
-<table border=0 width=98%>
-<tr><td valign=top align=left>
- <a href="index.html">HDF5 documents and links</a>&nbsp;<br>
- <a href="H5.intro.html">Introduction to HDF5</a>&nbsp;<br>
- <a href="RM_H5Front.html">HDF5 Reference Manual</a>&nbsp;<br>
- <a href="http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/index.html">HDF5 User's Guide for Release 1.6</a>&nbsp;<br>
- <!--
- <a href="Glossary.html">Glossary</a><br>
- -->
-</td>
-<td valign=top align=right>
- And in this document, the
- <a href="H5.user.html"><strong>HDF5 User's Guide from Release 1.4.5:</strong></a>&nbsp;&nbsp;&nbsp;&nbsp;
- <br>
- <a href="Files.html">Files</a>&nbsp;&nbsp;
- <a href="Datasets.html">Datasets</a>&nbsp;&nbsp;
- <a href="Datatypes.html">Datatypes</a>&nbsp;&nbsp;
- <a href="Dataspaces.html">Dataspaces</a>&nbsp;&nbsp;
- <a href="Groups.html">Groups</a>&nbsp;&nbsp;
- <br>
- <a href="References.html">References</a>&nbsp;&nbsp;
- <a href="Attributes.html">Attributes</a>&nbsp;&nbsp;
- <a href="Properties.html">Property Lists</a>&nbsp;&nbsp;
- <a href="Errors.html">Error Handling</a>&nbsp;&nbsp;
- <br>
- <a href="Filters.html">Filters</a>&nbsp;&nbsp;
- <a href="Caching.html">Caching</a>&nbsp;&nbsp;
- <a href="Chunking.html">Chunking</a>&nbsp;&nbsp;
- <a href="MountingFiles.html">Mounting Files</a>&nbsp;&nbsp;
- <br>
- <a href="Performance.html">Performance</a>&nbsp;&nbsp;
- <a href="Debugging.html">Debugging</a>&nbsp;&nbsp;
- <a href="Environment.html">Environment</a>&nbsp;&nbsp;
- <a href="ddl.html">DDL</a>&nbsp;&nbsp;
-</td></tr>
-</table>
-</center>
-<hr>
-<!-- #EndLibraryItem --><h1>Filters in HDF5</h1>
-
- <b>Note: Transient pipelines described in this document have not
- been implemented.</b>
-
- <h2>1. Introduction</h2>
-
- <p>HDF5 allows chunked data<sup><a href="#fn1">1</a></sup>
- to pass through user-defined filters
- on the way to or from disk. The filters operate on chunks of an
- <code>H5D_CHUNKED</code> dataset can be arranged in a pipeline
- so output of one filter becomes the input of the next filter.
-
- <p>Each filter has a two-byte identification number (type
- <code>H5Z_filter_t</code>) allocated by NCSA and can also be
- passed application-defined integer resources to control its
- behavior. Each filter also has an optional ASCII comment
- string.
-
- <p>
- <center>
- <table align=center width="80%">
- <caption alignment=top>
- <b>Values for <code>H5Z_filter_t</code></b>
- </caption>
-
- <tr>
- <th width="30%">Value</th>
- <th width="70%">Description</th>
- </tr>
-
- <tr valign=top>
- <td><code>0-255</code></td>
- <td>These values are reserved for filters predefined and
- registered by the HDF5 library and of use to the general
- public. They are described in a separate section
- below.</td>
- </tr>
-
- <tr valign=top>
- <td><code>256-511</code></td>
- <td>Filter numbers in this range are used for testing only
- and can be used temporarily by any organization. No
- attempt is made to resolve numbering conflicts since all
- definitions are by nature temporary.</td>
- </tr>
-
- <tr valign=top>
- <td><code>512-65535</code></td>
- <td>Reserved for future assignment. Please contact the
- <a href="mailto:hdf5dev@ncsa.uiuc.edu">HDF5 development
- team</a> to reserve a value or range of values for
- use by your filters.</td>
- </table>
- </center>
-
- <h2>2. Defining and Querying the Filter Pipeline</h2>
-
- <p>Two types of filters can be applied to raw data I/O: permanent
- filters and transient filters. The permanent filter pipeline is
- defned when the dataset is created while the transient pipeline
- is defined for each I/O operation. During an
- <code>H5Dwrite()</code> the transient filters are applied first
- in the order defined and then the permanent filters are applied
- in the order defined. For an <code>H5Dread()</code> the
- opposite order is used: permanent filters in reverse order, then
- transient filters in reverse order. An <code>H5Dread()</code>
- must result in the same amount of data for a chunk as the
- original <code>H5Dwrite()</code>.
-
- <p>The permanent filter pipeline is defined by calling
- <code>H5Pset_filter()</code> for a dataset creation property
- list while the transient filter pipeline is defined by calling
- that function for a dataset transfer property list.
-
- <dl>
- <dt><code>herr_t H5Pset_filter (hid_t <em>plist</em>,
- H5Z_filter_t <em>filter</em>, unsigned int <em>flags</em>,
- size_t <em>cd_nelmts</em>, const unsigned int
- <em>cd_values</em>[])</code>
- <dd>This function adds the specified <em>filter</em> and
- corresponding properties to the end of the transient or
- permanent output filter pipeline (depending on whether
- <em>plist</em> is a dataset creation or dataset transfer
- property list). The <em>flags</em> argument specifies certain
- general properties of the filter and is documented below. The
- <em>cd_values</em> is an array of <em>cd_nelmts</em> integers
- which are auxiliary data for the filter. The integer values
- will be stored in the dataset object header as part of the
- filter information.
-
- <br><br>
- <dt><code>int H5Pget_nfilters (hid_t <em>plist</em>)</code>
- <dd>This function returns the number of filters defined in the
- permanent or transient filter pipeline depending on whether
- <em>plist</em> is a dataset creation or dataset transfer
- property list. In each pipeline the filters are numbered from
- 0 through <em>N</em>-1 where <em>N</em> is the value returned
- by this function. During output to the file the filters of a
- pipeline are applied in increasing order (the inverse is true
- for input). Zero is returned if there are no filters in the
- pipeline and a negative value is returned for errors.
-
- <br><br>
- <dt><code>H5Z_filter_t H5Pget_filter (hid_t <em>plist</em>,
- int <em>filter_number</em>, unsigned int *<em>flags</em>,
- size_t *<em>cd_nelmts</em>, unsigned int
- *<em>cd_values</em>, size_t namelen, char name[])</code>
- <dd>This is the query counterpart of
- <code>H5Pset_filter()</code> and returns information about a
- particular filter number in a permanent or transient pipeline
- depending on whether <em>plist</em> is a dataset creation or
- dataset transfer property list. On input, <em>cd_nelmts</em>
- indicates the number of entries in the <em>cd_values</em>
- array allocated by the caller while on exit it contains the
- number of values defined by the filter. The
- <em>filter_number</em> should be a value between zero and
- <em>N</em>-1 as described for <code>H5Pget_nfilters()</code>
- and the function will return failure (a negative value) if the
- filter number is out of range. If <em>name</em> is a pointer
- to an array of at least <em>namelen</em> bytes then the filter
- name will be copied into that array. The name will be null
- terminated if the <em>namelen</em> is large enough. The
- filter name returned will be the name appearing in the file or
- else the name registered for the filter or else an empty string.
- </dl>
-
- <p>The flags argument to the functions above is a bit vector of
- the following fields:
-
- <p>
- <center>
- <table align=center width="80%">
- <caption align=top>
- <b>Values for the <em>flags</em> argument</b>
- </caption>
-
- <tr>
- <th width="30%">Value</th>
- <th width="70%">Description</th>
- </tr>
-
- <tr valign=top>
- <td><code>H5Z_FLAG_OPTIONAL</code></td>
- <td>If this bit is set then the filter is optional. If
- the filter fails (see below) during an
- <code>H5Dwrite()</code> operation then the filter is
- just excluded from the pipeline for the chunk for which
- it failed; the filter will not participate in the
- pipeline during an <code>H5Dread()</code> of the chunk.
- This is commonly used for compression filters: if the
- compression result would be larger than the input then
- the compression filter returns failure and the
- uncompressed data is stored in the file. If this bit is
- clear and a filter fails then the
- <code>H5Dwrite()</code> or <code>H5Dread()</code> also
- fails.</td>
- </tr>
- </table>
- </center>
-
- <h2>3. Defining Filters</h2>
-
- <p>Each filter is bidirectional, handling both input and output to
- the file, and a flag is passed to the filter to indicate the
- direction. In either case the filter reads a chunk of data from
- a buffer, usually performs some sort of transformation on the
- data, places the result in the same or new buffer, and returns
- the buffer pointer and size to the caller. If something goes
- wrong the filter should return zero to indicate a failure.
-
- <p>During output, a filter that fails or isn't defined and is
- marked as optional is silently excluded from the pipeline and
- will not be used when reading that chunk of data. A required
- filter that fails or isn't defined causes the entire output
- operation to fail. During input, any filter that has not been
- excluded from the pipeline during output and fails or is not
- defined will cause the entire input operation to fail.
-
- <p>Filters are defined in two phases. The first phase is to
- define a function to act as the filter and link the function
- into the application. The second phase is to register the
- function, associating the function with an
- <code>H5Z_filter_t</code> identification number and a comment.
-
- <dl>
- <dt><code>typedef size_t (*H5Z_func_t)(unsigned int
- <em>flags</em>, size_t <em>cd_nelmts</em>, const unsigned int
- <em>cd_values</em>[], size_t <em>nbytes</em>, size_t
- *<em>buf_size</em>, void **<em>buf</em>)</code>
- <dd>The <em>flags</em>, <em>cd_nelmts</em>, and
- <em>cd_values</em> are the same as for the
- <code>H5Pset_filter()</code> function with the additional flag
- <code>H5Z_FLAG_REVERSE</code> which is set when the filter is
- called as part of the input pipeline. The input buffer is
- pointed to by <em>*buf</em> and has a total size of
- <em>*buf_size</em> bytes but only <em>nbytes</em> are valid
- data. The filter should perform the transformation in place if
- possible and return the number of valid bytes or zero for
- failure. If the transformation cannot be done in place then
- the filter should allocate a new buffer with
- <code>malloc()</code> and assign it to <em>*buf</em>,
- assigning the allocated size of that buffer to
- <em>*buf_size</em>. The old buffer should be freed
- by calling <code>free()</code>.
-
- <br><br>
- <dt><code>herr_t H5Zregister (H5Z_filter_t <em>filter_id</em>,
- const char *<em>comment</em>, H5Z_func_t
- <em>filter</em>)</code>
- <dd>The <em>filter</em> function is associated with a filter
- number and a short ASCII comment which will be stored in the
- hdf5 file if the filter is used as part of a permanent
- pipeline during dataset creation.
- </dl>
-
-
- <h2>4. Predefined Filters</h2>
-
- <p>If <code>zlib</code> version 1.1.2 or later was found
- during configuration then the library will define a filter whose
- <code>H5Z_filter_t</code> number is
- <code>H5Z_FILTER_DEFLATE</code>. Since this compression method
- has the potential for generating compressed data which is larger
- than the original, the <code>H5Z_FLAG_OPTIONAL</code> flag
- should be turned on so such cases can be handled gracefully by
- storing the original data instead of the compressed data. The
- <em>cd_nvalues</em> should be one with <em>cd_value[0]</em>
- being a compression agression level between zero and nine,
- inclusive (zero is the fastest compression while nine results in
- the best compression ratio).
-
- <p>A convenience function for adding the
- <code>H5Z_FILTER_DEFLATE</code> filter to a pipeline is:
-
- <dl>
- <dt><code>herr_t H5Pset_deflate (hid_t <em>plist</em>, unsigned
- <em>aggression</em>)</code>
- <dd>The deflate compression method is added to the end of the
- permanent or transient filter pipeline depending on whether
- <em>plist</em> is a dataset creation or dataset transfer
- property list. The <em>aggression</em> is a number between
- zero and nine (inclusive) to indicate the tradeoff between
- speed and compression ratio (zero is fastest, nine is best
- ratio).
- </dl>
-
- <p>Even if the <code>zlib</code> isn't detected during
- configuration the application can define
- <code>H5Z_FILTER_DEFLATE</code> as a permanent filter. If the
- filter is marked as optional (as with
- <code>H5Pset_deflate()</code>) then it will always fail and be
- automatically removed from the pipeline. Applications that read
- data will fail only if the data is actually compressed; they
- won't fail if <code>H5Z_FILTER_DEFLATE</code> was part of the
- permanent output pipeline but was automatically excluded because
- it didn't exist when the data was written.
-
- <p><code>zlib</code> can be acquired from
- <code><a href="http://www.cdrom.com/pub/infozip/zlib/">http://www.cdrom.com/pub/infozip/zlib/</a></code>.
-
- <h2>5. Example</h2>
-
- <p>This example shows how to define and register a simple filter
- that adds a checksum capability to the data stream.
-
- <p>The function that acts as the filter always returns zero
- (failure) if the <code>md5()</code> function was not detected at
- configuration time (left as an excercise for the reader).
- Otherwise the function is broken down to an input and output
- half. The output half calculates a checksum, increases the size
- of the output buffer if necessary, and appends the checksum to
- the end of the buffer. The input half calculates the checksum
- on the first part of the buffer and compares it to the checksum
- already stored at the end of the buffer. If the two differ then
- zero (failure) is returned, otherwise the buffer size is reduced
- to exclude the checksum.
-
- <p>
- <center>
- <table border align=center width="100%">
- <tr>
- <td>
- <p><code><pre>
-
-size_t
-md5_filter(unsigned int flags, size_t cd_nelmts,
- const unsigned int cd_values[], size_t nbytes,
- size_t *buf_size, void **buf)
-{
-#ifdef HAVE_MD5
- unsigned char cksum[16];
-
- if (flags & H5Z_REVERSE) {
- /* Input */
- assert(nbytes>=16);
- md5(nbytes-16, *buf, cksum);
-
- /* Compare */
- if (memcmp(cksum, (char*)(*buf)+nbytes-16, 16)) {
- return 0; /*fail*/
- }
-
- /* Strip off checksum */
- return nbytes-16;
-
- } else {
- /* Output */
- md5(nbytes, *buf, cksum);
-
- /* Increase buffer size if necessary */
- if (nbytes+16>*buf_size) {
- *buf_size = nbytes + 16;
- *buf = realloc(*buf, *buf_size);
- }
-
- /* Append checksum */
- memcpy((char*)(*buf)+nbytes, cksum, 16);
- return nbytes+16;
- }
-#else
- return 0; /*fail*/
-#endif
-}
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>Once the filter function is defined it must be registered so
- the HDF5 library knows about it. Since we're testing this
- filter we choose one of the <code>H5Z_filter_t</code> numbers
- from the reserved range. We'll randomly choose 305.
-
- <p>
- <center>
- <table border align=center width="100%">
- <tr>
- <td>
- <p><code><pre>
-
-#define FILTER_MD5 305
-herr_t status = H5Zregister(FILTER_MD5, "md5 checksum", md5_filter);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <p>Now we can use the filter in a pipeline. We could have added
- the filter to the pipeline before defining or registering the
- filter as long as the filter was defined and registered by time
- we tried to use it (if the filter is marked as optional then we
- could have used it without defining it and the library would
- have automatically removed it from the pipeline for each chunk
- written before the filter was defined and registered).
-
- <p>
- <center>
- <table border align=center width="100%">
- <tr>
- <td>
- <p><code><pre>
-
-hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
-hsize_t chunk_size[3] = {10,10,10};
-H5Pset_chunk(dcpl, 3, chunk_size);
-H5Pset_filter(dcpl, FILTER_MD5, 0, 0, NULL);
-hid_t dset = H5Dcreate(file, "dset", H5T_NATIVE_DOUBLE, space, dcpl);
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
- <h2>6. Filter Diagnostics</h2>
-
- <p>If the library is compiled with debugging turned on for the H5Z
- layer (usually as a result of <code>configure
- --enable-debug=z</code>) then filter statistics are printed when
- the application exits normally or the library is closed. The
- statistics are written to the standard error stream and include
- two lines for each filter that was used: one for input and one
- for output. The following fields are displayed:
-
- <p>
- <center>
- <table align=center width="80%">
- <tr>
- <th width="30%">Field Name</th>
- <th width="70%">Description</th>
- </tr>
-
- <tr valign=top>
- <td>Method</td>
- <td>This is the name of the method as defined with
- <code>H5Zregister()</code> with the charaters
- &quot;&lt; or &quot;&gt;&quot; prepended to indicate
- input or output.</td>
- </tr>
-
- <tr valign=top>
- <td>Total</td>
- <td>The total number of bytes processed by the filter
- including errors. This is the maximum of the
- <em>nbytes</em> argument or the return value.
- </tr>
-
- <tr valign=top>
- <td>Errors</td>
- <td>This field shows the number of bytes of the Total
- column which can be attributed to errors.</td>
- </tr>
-
- <tr valign=top>
- <td>User, System, Elapsed</td>
- <td>These are the amount of user time, system time, and
- elapsed time in seconds spent in the filter function.
- Elapsed time is sensitive to system load. These times
- may be zero on operating systems that don't support the
- required operations.</td>
- </tr>
-
- <tr valign=top>
- <td>Bandwidth</td>
- <td>This is the filter bandwidth which is the total
- number of bytes processed divided by elapsed time.
- Since elapsed time is subject to system load the
- bandwidth numbers cannot always be trusted.
- Furthermore, the bandwidth includes bytes attributed to
- errors which may significanly taint the value if the
- function is able to detect errors without much
- expense.</td>
- </tr>
- </table>
- </center>
-
- <p>
- <center>
- <table border align=center width="100%">
- <caption align=bottom>
- <b>Example: Filter Statistics</b>
- </caption>
- <tr>
- <td>
- <p><code><pre>
-H5Z: filter statistics accumulated over life of library:
- Method Total Errors User System Elapsed Bandwidth
- ------ ----- ------ ---- ------ ------- ---------
- >deflate 160000 40000 0.62 0.74 1.33 117.5 kBs
- &lt;deflate 120000 0 0.11 0.00 0.12 1.000 MBs
- </pre></code>
- </td>
- </tr>
- </table>
- </center>
-
-
-<hr>
-
-
- <p><a name="fn1">Footnote 1:</a> Dataset chunks can be compressed
- through the use of filters. Developers should be aware that
- reading and rewriting compressed chunked data can result in holes
- in an HDF5 file. In time, enough such holes can increase the
- file size enough to impair application or library performance
- when working with that file. See
- &ldquo;<a href="Performance.html#Freespace">Freespace Management</a>&rdquo;
- in the chapter
- &ldquo;<a href="Performance.html">Performance Analysis and Issues</a>.&rdquo;</p>
-
-
-<!-- #BeginLibraryItem "/ed_libs/NavBar_UG.lbi" --><hr>
-<center>
-<table border=0 width=98%>
-<tr><td valign=top align=left>
- <a href="index.html">HDF5 documents and links</a>&nbsp;<br>
- <a href="H5.intro.html">Introduction to HDF5</a>&nbsp;<br>
- <a href="RM_H5Front.html">HDF5 Reference Manual</a>&nbsp;<br>
- <a href="http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/index.html">HDF5 User's Guide for Release 1.6</a>&nbsp;<br>
- <!--
- <a href="Glossary.html">Glossary</a><br>
- -->
-</td>
-<td valign=top align=right>
- And in this document, the
- <a href="H5.user.html"><strong>HDF5 User's Guide from Release 1.4.5:</strong></a>&nbsp;&nbsp;&nbsp;&nbsp;
- <br>
- <a href="Files.html">Files</a>&nbsp;&nbsp;
- <a href="Datasets.html">Datasets</a>&nbsp;&nbsp;
- <a href="Datatypes.html">Datatypes</a>&nbsp;&nbsp;
- <a href="Dataspaces.html">Dataspaces</a>&nbsp;&nbsp;
- <a href="Groups.html">Groups</a>&nbsp;&nbsp;
- <br>
- <a href="References.html">References</a>&nbsp;&nbsp;
- <a href="Attributes.html">Attributes</a>&nbsp;&nbsp;
- <a href="Properties.html">Property Lists</a>&nbsp;&nbsp;
- <a href="Errors.html">Error Handling</a>&nbsp;&nbsp;
- <br>
- <a href="Filters.html">Filters</a>&nbsp;&nbsp;
- <a href="Caching.html">Caching</a>&nbsp;&nbsp;
- <a href="Chunking.html">Chunking</a>&nbsp;&nbsp;
- <a href="MountingFiles.html">Mounting Files</a>&nbsp;&nbsp;
- <br>
- <a href="Performance.html">Performance</a>&nbsp;&nbsp;
- <a href="Debugging.html">Debugging</a>&nbsp;&nbsp;
- <a href="Environment.html">Environment</a>&nbsp;&nbsp;
- <a href="ddl.html">DDL</a>&nbsp;&nbsp;
-</td></tr>
-</table>
-</center>
-<hr>
-<!-- #EndLibraryItem --><!-- #BeginLibraryItem "/ed_libs/Footer.lbi" --><address>
-<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
-<br>
-Describes HDF5 Release 1.4.5, February 2003
-</address><!-- #EndLibraryItem --><!-- Created: Fri Apr 17 13:39:35 EDT 1998 -->
-<!-- hhmts start -->
-Last modified: 2 August 2001
-<!-- hhmts end -->
-
-
-</body>
-</html>