<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <title>Data Caching</title>
  </head>

  <body bgcolor="#FFFFFF">


<hr>
<center>
<table border=0 width=98%>
<tr><td valign=top align=left>
   <a href="H5.intro.html">Introduction to HDF5</a>&nbsp;<br>
   <a href="RM_H5Front.html">HDF5 Reference Manual</a>&nbsp;<br>
   <a href="index.html">Other HDF5 documents and links</a>&nbsp;<br>
   <!--
   <a href="Glossary.html">Glossary</a><br>
   -->
</td>
<td valign=top align=right>
   And in this document, the 
   <a href="H5.user.html"><strong>HDF5 User's Guide:</strong></a>&nbsp;&nbsp;&nbsp;&nbsp;
      <br>
      <a href="Files.html">Files</a>&nbsp;&nbsp;
      <a href="Datasets.html">Datasets</a>&nbsp;&nbsp;
      <a href="Datatypes.html">Datatypes</a>&nbsp;&nbsp;
      <a href="Dataspaces.html">Dataspaces</a>&nbsp;&nbsp;
      <a href="Groups.html">Groups</a>&nbsp;&nbsp;
      <br>
      <a href="References.html">References</a>&nbsp;&nbsp;
      <a href="Attributes.html">Attributes</a>&nbsp;&nbsp;
      <a href="Properties.html">Property Lists</a>&nbsp;&nbsp;
      <a href="Errors.html">Error Handling</a>&nbsp;&nbsp;
      <br>
      <a href="Filters.html">Filters</a>&nbsp;&nbsp;
      <a href="Palettes.html">Palettes</a>&nbsp;&nbsp;
      Caching&nbsp;&nbsp;
      <a href="Chunking.html">Chunking</a>&nbsp;&nbsp;
      <a href="MountingFiles.html">Mounting Files</a>&nbsp;&nbsp;
      <br>
      <a href="Performance.html">Performance</a>&nbsp;&nbsp;
      <a href="Debugging.html">Debugging</a>&nbsp;&nbsp;
      <a href="Environment.html">Environment</a>&nbsp;&nbsp;
      <a href="ddl.html">DDL</a>&nbsp;&nbsp;
      <br>
      <a href="Ragged.html">Ragged Arrays</a>&nbsp;&nbsp;
</td></tr>
</table>
</center>
<hr>


    <h1>Data Caching</h1>

    <h2>1. Meta Data Caching</h2>

    <p>The HDF5 library caches two types of data: meta data and raw
      data.  The meta data cache holds file objects such as the file
      header, symbol table nodes, global heap collections, and object
      headers and their messages in a partially decoded state.  The
      cache has a fixed number of entries, set through the file access
      property list (defaults to 10k), and each entry can hold a
      single meta data object.  When two objects collide on the same
      entry, the older object is preempted in favor of the new one.
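
    <p>As a rough illustration of the collision rule described above,
      the following C sketch models a fixed-size cache in which each
      entry holds one object and a colliding insert preempts the older
      occupant.  Everything in it is hypothetical: the names
      (<code>mdc_t</code>, <code>mdc_insert</code>, and so on), the
      eight-entry size, and the address-modulo-size mapping are
      assumptions made for the example, not details taken from the
      HDF5 library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of these are HDF5 identifiers. */
#define MDC_NSLOTS 8u  /* tiny entry count so collisions are easy to provoke */

typedef struct {
    int           in_use;   /* nonzero once the entry holds an object */
    unsigned long addr;     /* file address identifying the object */
    int           payload;  /* stand-in for the partially decoded object */
} mdc_slot_t;

typedef struct {
    mdc_slot_t slot[MDC_NSLOTS];
    unsigned   preemptions;  /* how many older objects were displaced */
} mdc_t;

/* Assumed mapping: entry index = address modulo the number of entries. */
size_t mdc_index(unsigned long addr) { return (size_t)(addr % MDC_NSLOTS); }

void mdc_init(mdc_t *c) { memset(c, 0, sizeof *c); }

/* Store an object; returns 1 if an older, different object was preempted. */
int mdc_insert(mdc_t *c, unsigned long addr, int payload)
{
    mdc_slot_t *s = &c->slot[mdc_index(addr)];
    int preempted = s->in_use && s->addr != addr;

    if (preempted)
        c->preemptions++;
    s->in_use = 1;
    s->addr = addr;
    s->payload = payload;
    return preempted;
}

/* Look up an object; returns 1 on a hit and copies the payload out. */
int mdc_lookup(const mdc_t *c, unsigned long addr, int *payload)
{
    const mdc_slot_t *s = &c->slot[mdc_index(addr)];

    if (!s->in_use || s->addr != addr)
        return 0;
    if (payload)
        *payload = s->payload;
    return 1;
}

/* Usage: addresses 3 and 11 share entry 3, so the second insert preempts. */
void mdc_demo(void)
{
    mdc_t c;

    mdc_init(&c);
    assert(mdc_insert(&c, 3, 30) == 0);
    assert(mdc_insert(&c, 11, 110) == 1);
    assert(mdc_lookup(&c, 3, NULL) == 0);  /* the older object is gone */
}
```

    <p>In this sketch a lookup for a preempted object simply misses,
      so the caller would have to load and decode the object again.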

    <h2>2. Raw Data Chunk Caching</h2>

    <p>Raw data chunks are cached because I/O requests at the
      application level typically do not map well to chunks at the
      storage level.  The chunk cache has a maximum size in bytes,
      set with the file access property list (defaults to 1MB).
      When the limit is reached, chunks are preempted based on the
      following set of heuristics:

    <ul>
      <li>Chunks which have not been accessed for a long time
	relative to other chunks are penalized.
      <li>Chunks which have been accessed frequently in the recent
	past are favored.
      <li>Chunks which are completely read and not written, completely 
	written but not read, or completely read and completely
	written are penalized according to <em>w0</em>, an
	application-defined weight between 0 and 1 inclusive. A weight 
	of zero does not penalize such chunks while a weight of 1
	penalizes those chunks more than all other chunks.  The
	default is 0.75.
      <li>Chunks which are larger than the maximum cache size do not
	participate in the cache.
    </ul>
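
    <p>One way to picture how these heuristics could combine is the
      toy scoring function below.  It is a sketch under stated
      assumptions and is not the library's preemption algorithm, which
      this document does not spell out: the structure
      <code>chunk_info_t</code>, the functions
      <code>chunk_penalty</code> and <code>pick_victim</code>, and the
      particular formula are all invented for illustration.  The
      formula is chosen only so that idle chunks score higher,
      recently busy chunks score lower, and a fully read or fully
      written chunk is pushed toward the top of the range in
      proportion to <em>w0</em>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one cached chunk (not an HDF5 type). */
typedef struct {
    size_t   nbytes;       /* size of the chunk in memory */
    unsigned last_access;  /* logical time of the most recent access */
    unsigned recent_hits;  /* number of accesses in the recent past */
    int      fully_done;   /* completely read and/or completely written */
} chunk_info_t;

/* Chunks larger than the whole cache do not participate at all. */
int chunk_is_cacheable(const chunk_info_t *ch, size_t max_cache_bytes)
{
    return ch->nbytes <= max_cache_bytes;
}

/* Toy penalty in [0,1]; a higher value means "preempt sooner".
 * Assumes now >= ch->last_access and 0 <= w0 <= 1. */
double chunk_penalty(const chunk_info_t *ch, unsigned now, double w0)
{
    unsigned age = now - ch->last_access;
    double   p;

    assert(w0 >= 0.0 && w0 <= 1.0);
    p = (double)age / ((double)age + 1.0);  /* grows with idle time, < 1 */
    p /= 1.0 + (double)ch->recent_hits;     /* frequent recent use lowers it */
    if (ch->fully_done)
        p += w0 * (1.0 - p);                /* w0 = 0: no change; w0 = 1: top */
    return p;
}

/* Index of the cacheable chunk with the highest penalty, or -1 if none. */
int pick_victim(const chunk_info_t *chunks, size_t n,
                size_t max_cache_bytes, unsigned now, double w0)
{
    int    victim = -1;
    double worst = -1.0;
    size_t i;

    for (i = 0; i < n; i++) {
        double p;

        if (!chunk_is_cacheable(&chunks[i], max_cache_bytes))
            continue;
        p = chunk_penalty(&chunks[i], now, w0);
        if (p > worst) {
            worst = p;
            victim = (int)i;
        }
    }
    return victim;
}

/* Usage: chunk 0 is old and idle; chunk 1 is busy but fully processed. */
void chunk_demo(void)
{
    chunk_info_t chunks[2] = { {1024, 0, 0, 0}, {1024, 9, 5, 1} };

    assert(pick_victim(chunks, 2, 1u << 20, 10, 0.0) == 0);
    assert(pick_victim(chunks, 2, 1u << 20, 10, 1.0) == 1);
}
```

    <p>With <em>w0</em> set to 0 the fully processed chunk is scored
      like any other chunk, and with <em>w0</em> set to 1 its penalty
      rises to the top of the range, above every chunk that is not
      fully processed.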

    <p>One should choose large values for <em>w0</em> if I/O requests
      typically do not overlap, but smaller values for <em>w0</em> if
      the requests do overlap.  For instance, reading an entire 2-D
      array from non-overlapping "windows" in row-major order would
      benefit from a high <em>w0</em> value, while reading a diagonal
      across the dataset, where each request overlaps the previous
      request, would benefit from a small <em>w0</em>.

    <h2>3. Data Caching Operations</h2>

    <p>The cache parameters for both caches are part of a file access
      property list and are set and queried with this pair of
      functions:

    <dl>
      <dt><code>herr_t H5Pset_cache(hid_t <em>plist</em>, unsigned int 
	  <em>mdc_nelmts</em>, size_t <em>rdcc_nbytes</em>, double
	  <em>w0</em>)</code>
      <dt><code>herr_t H5Pget_cache(hid_t <em>plist</em>, unsigned int 
	  *<em>mdc_nelmts</em>, size_t *<em>rdcc_nbytes</em>, double
	  *<em>w0</em>)</code>
      <dd>Sets or queries the meta data cache and raw data chunk cache 
	parameters.  The <em>plist</em> is a file access property
	list.  The number of elements (objects) in the meta data cache 
	is <em>mdc_nelmts</em>.  The total size of the raw data chunk
	cache in bytes is <em>rdcc_nbytes</em>, and the preemption
	policy weight is <em>w0</em>.  For <code>H5Pget_cache()</code>
	any (or all) of the pointer arguments may be null pointers.
    </dl>
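
    <p>The set/query convention described above, in which a query
      accepts a null pointer for any value the caller does not need,
      can be sketched with a small stand-in.  The structure
      <code>cache_props_t</code> and the functions
      <code>example_set_cache</code> and
      <code>example_get_cache</code> are hypothetical and only mirror
      the argument order shown above; they are not HDF5 calls.  The
      range check on <em>w0</em> is likewise an assumption of the
      sketch, and the values used are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the cache settings held by a property list. */
typedef struct {
    unsigned mdc_nelmts;   /* number of meta data cache entries */
    size_t   rdcc_nbytes;  /* raw data chunk cache size in bytes */
    double   w0;           /* preemption policy weight, 0 to 1 inclusive */
} cache_props_t;

/* Store all three settings; returns 0 on success, -1 on a bad argument. */
int example_set_cache(cache_props_t *p, unsigned mdc_nelmts,
                      size_t rdcc_nbytes, double w0)
{
    if (p == NULL || w0 < 0.0 || w0 > 1.0)
        return -1;
    p->mdc_nelmts = mdc_nelmts;
    p->rdcc_nbytes = rdcc_nbytes;
    p->w0 = w0;
    return 0;
}

/* Query the settings; any output pointer may be NULL to skip that value. */
int example_get_cache(const cache_props_t *p, unsigned *mdc_nelmts,
                      size_t *rdcc_nbytes, double *w0)
{
    if (p == NULL)
        return -1;
    if (mdc_nelmts)
        *mdc_nelmts = p->mdc_nelmts;
    if (rdcc_nbytes)
        *rdcc_nbytes = p->rdcc_nbytes;
    if (w0)
        *w0 = p->w0;
    return 0;
}

/* Usage: set everything, then ask only for the chunk cache size. */
void cache_props_demo(void)
{
    cache_props_t props;
    size_t        nbytes = 0;

    assert(example_set_cache(&props, 1000, (size_t)1024 * 1024, 0.75) == 0);
    assert(example_get_cache(&props, NULL, &nbytes, NULL) == 0);
    assert(nbytes == (size_t)1024 * 1024);
}
```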





<hr>
<address>
<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
</address>

<!-- Created: Tue May 26 15:20:14 EDT 1998 -->
<!-- hhmts start -->
Last modified:  14 October 1999
<!-- hhmts end -->

  </body>
</html>