<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> <html> <head> <title>Data Caching</title> </head> <body bgcolor="#FFFFFF"> <hr> <center> <table border=0 width=98%> <tr><td valign=top align=left> <a href="H5.intro.html">Introduction to HDF5</a> <br> <a href="RM_H5Front.html">HDF5 Reference Manual</a> <br> <a href="index.html">Other HDF5 documents and links</a> <br> <!-- <a href="Glossary.html">Glossary</a><br> --> </td> <td valign=top align=right> And in this document, the <a href="H5.user.html">HDF5 User's Guide</a>: <a href="Files.html">Files</a> <br> <a href="Datasets.html">Datasets</a> <a href="Datatypes.html">Data Types</a> <a href="Dataspaces.html">Dataspaces</a> <a href="Groups.html">Groups</a> <a href="References.html">References</a> <br> <a href="Attributes.html">Attributes</a> <a href="Properties.html">Property Lists</a> <a href="Errors.html">Error Handling</a> <a href="Filters.html">Filters</a> Caching <br> <a href="Chunking.html">Chunking</a> <a href="Debugging.html">Debugging</a> <a href="Environment.html">Environment</a> <a href="ddl.html">DDL</a> <a href="Ragged.html">Ragged Arrays</a> <!-- <hr> And in this document, the <a href="H5.user.html">HDF5 User's Guide</a>: <a href="Attributes.html">H5A</a> <a href="Datasets.html">H5D</a> <a href="Errors.html">H5E</a> <a href="Files.html">H5F</a> <a href="Groups.html">H5G</a> <a href="Properties.html">H5P</a> <a href="References.html">H5R & H5I</a> <a href="Ragged.html">H5RA</a> <a href="Dataspaces.html">H5S</a> <a href="Datatypes.html">H5T</a> <a href="Filters.html">H5Z</a> <a href="Caching.html">Caching</a> <a href="Chunking.html">Chunking</a> <a href="Debugging.html">Debugging</a> <a href="Environment.html">Environment</a> <a href="ddl.html">DDL</a> --> </td></tr> </table> </center> <hr> <h1>Meta Data Caching</h1> <p>The HDF5 library caches two types of data: meta data and raw data. 
The meta data cache holds file objects such as the file header, symbol table nodes, global heap collections, and object headers and their messages in a partially decoded state. The cache has a fixed number of entries, which is set with the file access property list (defaults to 10k), and each entry can hold a single meta data object. Collisions between objects are handled by preempting the older object in favor of the new one.
<h1>Raw Data Chunk Caching</h1>
<p>Raw data chunks are cached because I/O requests at the application level typically don't map well to chunks at the storage level. The chunk cache has a maximum size in bytes, which is set with the file access property list (defaults to 1MB). When the limit is reached, chunks are preempted based on the following set of heuristics.
<ul>
<li>Chunks which have not been accessed for a long time relative to other chunks are penalized.
<li>Chunks which have been accessed frequently in the recent past are favored.
<li>Chunks which are completely read and not written, completely written but not read, or completely read and completely written are penalized according to <em>w0</em>, an application-defined weight between 0 and 1 inclusive. A weight of 0 does not penalize such chunks, while a weight of 1 penalizes those chunks more than all other chunks. The default is 0.75.
<li>Chunks which are larger than the maximum cache size do not participate in the cache.
</ul>
<p>Choose a large value for <em>w0</em> if I/O requests typically do not overlap, and a smaller value if the requests do overlap. For instance, reading an entire two-dimensional array through non-overlapping "windows" in row-major order would benefit from a high <em>w0</em> value, because a window that has been read completely will not be needed again. Reading a diagonal across the dataset, where each request overlaps the previous request, would benefit from a small <em>w0</em>.
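<p>The effect of <em>w0</em> on preemption can be illustrated with a small, self-contained sketch. This is not the library's implementation: the <code>toy_chunk</code> type, the <code>toy_pick_victim()</code> function, and the specific rule of scanning the oldest <em>w0</em> fraction of a least-recently-used list are all assumptions made for illustration. The rule was chosen only because it reproduces the two endpoints described above: a weight of 0 never penalizes fully accessed chunks, and a weight of 1 preempts them before any other chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chunk record for illustration; not an HDF5 type. */
typedef struct {
    int id;              /* illustrative chunk identifier */
    int fully_accessed;  /* 1 if completely read and/or completely written */
} toy_chunk;

/*
 * Return the index of the chunk to preempt, or -1 if the list is empty.
 *
 * lru[] is ordered from least recently used (index 0) to most recently
 * used (index n - 1).
 *
 * Assumed policy (an illustration only): scan the oldest w0 fraction of
 * the list and preempt the first fully accessed chunk found there. If
 * there is none, fall back to the least recently used chunk.
 */
int toy_pick_victim(const toy_chunk *lru, size_t n, double w0)
{
    size_t window, i;

    assert(lru != NULL || n == 0);
    if (n == 0)
        return -1;

    /* Clamp the weight to the documented range [0, 1]. */
    if (w0 < 0.0)
        w0 = 0.0;
    if (w0 > 1.0)
        w0 = 1.0;

    /* Number of oldest entries to examine; truncates toward zero. */
    window = (size_t)(w0 * (double)n);

    for (i = 0; i < window; i++) {
        if (lru[i].fully_accessed)
            return (int)i;
    }
    return 0; /* plain least-recently-used choice */
}
```

<p>With a four-entry list in which only the third-oldest chunk is fully accessed, this sketch returns index 0 for weights of 0 and 0.5 (plain least-recently-used behavior) and index 2 for weights of 0.75 and 1, where the fully accessed chunk falls inside the scanned window.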
<h1>The API</h1>
<p>The parameters for both caches are part of a file access property list and are set and queried with this pair of functions:
<dl>
<dt><code>herr_t H5Pset_cache(hid_t <em>plist</em>, unsigned int <em>mdc_nelmts</em>, size_t <em>rdcc_nbytes</em>, double <em>w0</em>)</code>
<dt><code>herr_t H5Pget_cache(hid_t <em>plist</em>, unsigned int *<em>mdc_nelmts</em>, size_t *<em>rdcc_nbytes</em>, double *<em>w0</em>)</code>
<dd>Sets or queries the meta data cache and raw data chunk cache parameters. The <em>plist</em> argument is a file access property list. The number of elements (objects) in the meta data cache is <em>mdc_nelmts</em>. The total size of the raw data chunk cache in bytes is <em>rdcc_nbytes</em>, and the preemption policy weight is <em>w0</em>. For <code>H5Pget_cache()</code>, any (or all) of the pointer arguments may be null pointers.
</dl>
<hr> <center> <table border=0 width=98%> <tr><td valign=top align=left> <a href="H5.intro.html">Introduction to HDF5</a> <br> <a href="RM_H5Front.html">HDF5 Reference Manual</a> <br> <a href="index.html">Other HDF5 documents and links</a> <br> <!-- <a href="Glossary.html">Glossary</a><br> --> </td> <td valign=top align=right> And in this document, the <a href="H5.user.html">HDF5 User's Guide</a>: <a href="Files.html">Files</a> <br> <a href="Datasets.html">Datasets</a> <a href="Datatypes.html">Data Types</a> <a href="Dataspaces.html">Dataspaces</a> <a href="Groups.html">Groups</a> <a href="References.html">References</a> <br> <a href="Attributes.html">Attributes</a> <a href="Properties.html">Property Lists</a> <a href="Errors.html">Error Handling</a> <a href="Filters.html">Filters</a> Caching <br> <a href="Chunking.html">Chunking</a> <a href="Debugging.html">Debugging</a> <a href="Environment.html">Environment</a> <a href="ddl.html">DDL</a> <a href="Ragged.html">Ragged Arrays</a> <!-- <hr> And in this document, the <a href="H5.user.html">HDF5 User's Guide</a>: <a href="Attributes.html">H5A</a> <a href="Datasets.html">H5D</a> 
<a href="Errors.html">H5E</a> <a href="Files.html">H5F</a> <a href="Groups.html">H5G</a> <a href="Properties.html">H5P</a> <a href="References.html">H5R & H5I</a> <a href="Ragged.html">H5RA</a> <a href="Dataspaces.html">H5S</a> <a href="Datatypes.html">H5T</a> <a href="Filters.html">H5Z</a> <a href="Caching.html">Caching</a> <a href="Chunking.html">Chunking</a> <a href="Debugging.html">Debugging</a> <a href="Environment.html">Environment</a> <a href="ddl.html">DDL</a> --> </td></tr> </table> </center> <!-- <hr> <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address> --> <!-- Created: Tue May 26 15:20:14 EDT 1998 --> <!-- hhmts start --> <!-- Last modified: Tue May 26 15:38:27 EDT 1998 --> <!-- hhmts end --> <hr> <address> <a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a> </address> Last modified: 30 October 1998 </body> </html>