The HDF5 library caches two types of data: meta data and raw data. The meta data cache holds file objects such as the file header, symbol table nodes, global heap collections, and object headers and their messages in a partially decoded state. The cache has a fixed number of entries, set with the file access property list (the default is 10,000 entries), and each entry can hold a single meta data object. Collisions between objects are handled by preempting the older object in favor of the new one.
Raw data chunks are cached because I/O requests at the application level typically do not map well to chunks at the storage level. The chunk cache has a maximum size in bytes, set with the file access property list (the default is 1 MB). When the limit is reached, chunks are preempted in least-recently-used order, with chunks that have been completely read or completely written favored for preemption according to the policy weight w0 (a value between 0 and 1).
One should choose a large value for w0 if I/O requests typically do not overlap, and a smaller value if they do. For instance, reading an entire two-dimensional array from non-overlapping "windows" in row-major order benefits from a high w0, while reading a diagonal across the dataset, where each request overlaps the previous one, benefits from a small w0.
The cache parameters for both caches are part of a file access property list and are set and queried with this pair of functions:
herr_t H5Pset_cache(hid_t plist, unsigned int mdc_nelmts,
                    size_t rdcc_nbytes, double w0)
herr_t H5Pget_cache(hid_t plist, unsigned int *mdc_nelmts,
                    size_t *rdcc_nbytes, double *w0)
For H5Pget_cache(), any (or all) of the pointer arguments may be null pointers, in which case the corresponding value is not returned.
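As a sketch of how the pair might be used together, assuming the prototypes given above (newer HDF5 releases use a different H5Pset_cache signature with additional parameters, so check your version's reference manual), the chosen values here are illustrative, not recommendations:

```c
#include <hdf5.h>
#include <stdio.h>

int main(void)
{
    /* Create a file access property list and tune both caches. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    unsigned int mdc_nelmts  = 5000;             /* meta data cache entries */
    size_t       rdcc_nbytes = 4 * 1024 * 1024;  /* 4 MB chunk cache */
    double       w0          = 0.75;             /* mostly non-overlapping I/O */

    H5Pset_cache(fapl, mdc_nelmts, rdcc_nbytes, w0);

    /* Query back only the chunk cache size; unwanted outputs may be NULL. */
    size_t nbytes = 0;
    H5Pget_cache(fapl, NULL, &nbytes, NULL);
    printf("chunk cache: %lu bytes\n", (unsigned long)nbytes);

    H5Pclose(fapl);
    return 0;
}
```

The property list would then be passed to H5Fcreate() or H5Fopen() so the caches of the opened file use these settings.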