<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<title>Data Caching</title>
</head>
<body bgcolor="#FFFFFF">
<hr>
<center>
<table border=0 width=98%>
<tr><td valign=top align=left>
<a href="H5.intro.html">Introduction to HDF5</a> <br>
<a href="RM_H5Front.html">HDF5 Reference Manual</a> <br>
<a href="index.html">Other HDF5 documents and links</a> <br>
<!--
<a href="Glossary.html">Glossary</a><br>
-->
</td>
<td valign=top align=right>
And in this document, the
<a href="H5.user.html"><strong>HDF5 User's Guide:</strong></a>
<br>
<a href="Files.html">Files</a>
<a href="Datasets.html">Datasets</a>
<a href="Datatypes.html">Datatypes</a>
<a href="Dataspaces.html">Dataspaces</a>
<a href="Groups.html">Groups</a>
<br>
<a href="References.html">References</a>
<a href="Attributes.html">Attributes</a>
<a href="Properties.html">Property Lists</a>
<a href="Errors.html">Error Handling</a>
<br>
<a href="Filters.html">Filters</a>
Caching
<a href="Chunking.html">Chunking</a>
<a href="MountingFiles.html">Mounting Files</a>
<br>
<a href="Performance.html">Performance</a>
<a href="Debugging.html">Debugging</a>
<a href="Environment.html">Environment</a>
<a href="ddl.html">DDL</a>
</td></tr>
</table>
</center>
<hr>
<h1>Data Caching</h1>
<h2>1. Meta Data Caching</h2>
<p>The HDF5 library caches two types of data: meta data and raw
data. The meta data cache holds file objects like the file
header, symbol table nodes, global heap collections, object
headers and their messages, etc. in a partially decoded
state. The cache has a fixed number of entries which is set with
the file access property list (defaults to 10k) and each entry
can hold a single meta data object. Collisions between objects
are handled by preempting the older object in favor of the new
one.
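<p>The collision rule above can be pictured with a small sketch. This
is an illustration only, not the library's implementation: the slot
count <code>MDC_NSLOTS</code>, the types <code>mdc_slot_t</code> and
<code>mdc_t</code>, the helper functions, and the choice of mapping an
object to a slot by its file address modulo the slot count are all
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of a fixed-size cache in which each slot holds
 * one object and a collision preempts the older object. None of these
 * names come from the HDF5 library. */
#define MDC_NSLOTS 8   /* tiny on purpose, for illustration only */

typedef struct {
    int           in_use;   /* nonzero once the slot holds an object   */
    unsigned long addr;     /* file address of the cached object       */
    int           payload;  /* stand-in for a partially decoded object */
} mdc_slot_t;

typedef struct {
    mdc_slot_t slot[MDC_NSLOTS];
} mdc_t;

/* Mark every slot as empty. */
static void mdc_init(mdc_t *c)
{
    assert(c != NULL);
    for (size_t i = 0; i < MDC_NSLOTS; i++)
        c->slot[i].in_use = 0;
}

/* Assumed mapping from an object address to a slot index. */
static size_t mdc_index(unsigned long addr)
{
    return (size_t)(addr % MDC_NSLOTS);
}

/* Return a pointer to the cached payload, or NULL on a miss. */
static const int *mdc_find(const mdc_t *c, unsigned long addr)
{
    const mdc_slot_t *s = &c->slot[mdc_index(addr)];
    return (s->in_use && s->addr == addr) ? &s->payload : NULL;
}

/* Store an object in its slot. Returns 1 if an older, different
 * object was preempted to make room, 0 otherwise. */
static int mdc_insert(mdc_t *c, unsigned long addr, int payload)
{
    mdc_slot_t *s = &c->slot[mdc_index(addr)];
    int preempted = s->in_use && s->addr != addr;

    s->in_use  = 1;
    s->addr    = addr;
    s->payload = payload;
    return preempted;
}
```

<p>With eight slots, objects at addresses 3 and 11 share a slot in this
sketch, so inserting the second one preempts the first and a later
lookup of address 3 misses.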
<h2>2. Raw Data Chunk Caching</h2>
<p>Raw data chunks are cached because I/O requests at the
application level typically don't map well to chunks at the
storage level. The chunk cache has a maximum size in bytes, which
is set with the file access property list (defaults to 1MB). When
the limit is reached, chunks are preempted based on the
following set of heuristics:
<ul>
<li>Chunks which have not been accessed for a long time
relative to other chunks are penalized.
<li>Chunks which have been accessed frequently in the recent
past are favored.
<li>Chunks which are completely read and not written, completely
written but not read, or completely read and completely
written are penalized according to <em>w0</em>, an
application-defined weight between 0 and 1 inclusive. A weight
of zero does not penalize such chunks while a weight of 1
penalizes those chunks more than all other chunks. The
default is 0.75.
<li>Chunks which are larger than the maximum cache size do not
participate in the cache.
</ul>
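<p>One way to picture how these heuristics interact is the sketch
below. It is an assumption-laden illustration, not the algorithm the
library uses: the <code>chunk_t</code> type, the function names, and
the penalty formula are invented for the example, and the
access-frequency heuristic is left out to keep it short. Age is scaled
into the range [0, 1), and a fully read or fully written chunk has its
penalty pulled toward 1 in proportion to <em>w0</em>, so a weight of 0
leaves it unchanged and a weight of 1 places it ahead of every other
chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one cached chunk. */
typedef struct {
    size_t   nbytes;          /* size of the chunk in bytes           */
    unsigned last_used;       /* logical time of most recent access   */
    int      fully_accessed;  /* nonzero if completely read and/or
                                 completely written                   */
} chunk_t;

/* A chunk larger than the whole cache does not participate. */
static int chunk_is_cacheable(size_t chunk_nbytes, size_t rdcc_nbytes)
{
    return chunk_nbytes <= rdcc_nbytes;
}

/* Penalty in [0, 1]; a higher value means the chunk is preempted
 * sooner. 'oldest' is the smallest last_used value in the cache. */
static double chunk_penalty(const chunk_t *ch, unsigned now,
                            unsigned oldest, double w0)
{
    assert(w0 >= 0.0 && w0 <= 1.0);
    assert(oldest <= ch->last_used && ch->last_used <= now);

    /* Age relative to the other chunks, always strictly below 1. */
    double span = (double)(now - oldest) + 1.0;
    double penalty = (double)(now - ch->last_used) / span;

    /* Blend toward 1 according to w0 for fully accessed chunks. */
    if (ch->fully_accessed)
        penalty = (1.0 - w0) * penalty + w0;
    return penalty;
}

/* Return the index of the chunk to preempt, or -1 if there is none. */
static int pick_victim(const chunk_t *chunks, size_t n,
                       unsigned now, double w0)
{
    if (n == 0)
        return -1;

    unsigned oldest = chunks[0].last_used;
    for (size_t i = 1; i < n; i++)
        if (chunks[i].last_used < oldest)
            oldest = chunks[i].last_used;

    int victim = 0;
    double worst = -1.0;
    for (size_t i = 0; i < n; i++) {
        double p = chunk_penalty(&chunks[i], now, oldest, w0);
        if (p > worst) {
            worst = p;
            victim = (int)i;
        }
    }
    return victim;
}
```

<p>In this sketch, a recently used but fully read chunk is the first
to go when <em>w0</em> is 1, while with <em>w0</em> set to 0 the
least recently used chunk is chosen regardless of how completely it
has been read or written.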
<p>One should choose large values for <em>w0</em> if I/O requests
typically do not overlap, but smaller values for <em>w0</em> if
the requests do overlap. For instance, reading an entire 2-D
array by reading from non-overlapping "windows" in row-major
order would benefit from a high <em>w0</em> value, while reading
a diagonal across the dataset, where each request overlaps the
previous request, would benefit from a small <em>w0</em>.
<h2>3. Data Caching Operations</h2>
<p>The cache parameters for both caches are part of a file access
property list and are set and queried with this pair of
functions:
<dl>
<dt><code>herr_t H5Pset_cache(hid_t <em>plist</em>, unsigned int
<em>mdc_nelmts</em>, size_t <em>rdcc_nbytes</em>, double
<em>w0</em>)</code>
<dt><code>herr_t H5Pget_cache(hid_t <em>plist</em>, unsigned int
*<em>mdc_nelmts</em>, size_t *<em>rdcc_nbytes</em>, double
*<em>w0</em>)</code>
<dd>Sets or queries the meta data cache and raw data chunk cache
parameters. The <em>plist</em> is a file access property
list. The number of elements (objects) in the meta data cache
is <em>mdc_nelmts</em>. The total size in bytes of the raw data
chunk cache is <em>rdcc_nbytes</em>, and the preemption policy
weight is <em>w0</em>. For <code>H5Pget_cache()</code> any (or
all) of the pointer arguments may be null pointers.
</dl>
<hr>
<address>
<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
</address>
<!-- Created: Tue May 26 15:20:14 EDT 1998 -->
<!-- hhmts start -->
Last modified: 13 December 1999
<!-- hhmts end -->
<br>
Describes HDF5 Release 1.4 Beta, December 2000
</body>
</html>