<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <title>Dataset Chunking</title>
  </head>

  <body bgcolor="#FFFFFF">


<hr>
<center>
<table border=0 width=98%>
<tr><td valign=top align=left>
   <a href="H5.intro.html">Introduction to HDF5</a>&nbsp;<br>
   <a href="RM_H5Front.html">HDF5 Reference Manual</a>&nbsp;<br>
   <a href="index.html">Other HDF5 documents and links</a>&nbsp;<br>
   <!--
   <a href="Glossary.html">Glossary</a><br>
   -->
</td>
<td valign=top align=right>
   And in this document, the 
   <a href="H5.user.html"><strong>HDF5 User's Guide:</strong></a>&nbsp;&nbsp;&nbsp;&nbsp;
      <br>
      <a href="Files.html">Files</a>&nbsp;&nbsp;
      <a href="Datasets.html">Datasets</a>&nbsp;&nbsp;
      <a href="Datatypes.html">Datatypes</a>&nbsp;&nbsp;
      <a href="Dataspaces.html">Dataspaces</a>&nbsp;&nbsp;
      <a href="Groups.html">Groups</a>&nbsp;&nbsp;
      <br>
      <a href="References.html">References</a>&nbsp;&nbsp;
      <a href="Attributes.html">Attributes</a>&nbsp;&nbsp;
      <a href="Properties.html">Property Lists</a>&nbsp;&nbsp;
      <a href="Errors.html">Error Handling</a>&nbsp;&nbsp;
      <br>
      <a href="Filters.html">Filters</a>&nbsp;&nbsp;
      <a href="Palettes.html">Palettes</a>&nbsp;&nbsp;
      <a href="Caching.html">Caching</a>&nbsp;&nbsp;
      Chunking&nbsp;&nbsp;
      <a href="MountingFiles.html">Mounting Files</a>&nbsp;&nbsp;
      <br>
      <a href="Performance.html">Performance</a>&nbsp;&nbsp;
      <a href="Debugging.html">Debugging</a>&nbsp;&nbsp;
      <a href="Environment.html">Environment</a>&nbsp;&nbsp;
      <a href="ddl.html">DDL</a>&nbsp;&nbsp;
      <br>
      <a href="Ragged.html">Ragged Arrays</a>&nbsp;&nbsp;
</td></tr>
</table>
</center>
<hr>


    <h1>Dataset Chunking Issues</h1>

    <h2>Table of Contents</h2>

    <ul>
      <li><a href="#S1">1. Introduction</a>
      <li><a href="#S2">2. The Raw Data Chunk Cache</a>
      <li><a href="#S3">3. Cache Efficiency</a>
      <li><a href="#S4">4. Fragmentation</a>
      <li><a href="#S5">5. File Storage Overhead</a>
    </ul>

    <h2><a name="S1">1. Introduction</a></h2>


    <p><em>Chunking</em> refers to a storage layout where a dataset is
      partitioned into fixed-size multi-dimensional chunks.  The
      chunks cover the dataset, but the dataset's dimensions need not
      be integral multiples of the chunk dimensions.  If no data is
      ever written to a chunk, that chunk is not allocated on disk.
      Figure 1 shows a 25x48
      element dataset covered by nine 10x20 chunks and 11 data points
      written to the dataset.  No data was written to the region of
      the dataset covered by three of the chunks so those chunks were
      never allocated in the file -- the other chunks are allocated at 
      independent locations in the file and written in their entirety.

      <center><img src="Chunk_f1.gif" alt="Figure 1"><br><b>Figure 1</b></center>

    <p>The HDF5 library treats chunks as atomic objects -- disk I/O is 
      always in terms of complete chunks<a href="#fn1">(1)</a>. This
      allows data filters to be defined by the application to perform
      tasks such as compression, encryption, checksumming,
      <em>etc</em>. on entire chunks. As shown in Figure 2, if
      <code>H5Dwrite()</code> touches only a few bytes of the chunk,
      the entire chunk is read from the file, the data passes upward
      through the filter pipeline, the few bytes are modified, the
      data passes downward through the filter pipeline, and the entire
      chunk is written back to the file.

      <center><img src="Chunk_f2.gif" alt="Figure 2"><br><b>Figure 2</b></center>
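    <p>As a sketch of how this layout is selected in the first place,
      the chunk shape from Figure 1 can be set on a dataset creation
      property list before the dataset is created.  The file and
      dataset names below are illustrative, and the five-argument
      <code>H5Dcreate()</code> matches the release this document
      describes:

```c
#include "hdf5.h"

int main(void)
{
    hsize_t dims[2]  = {25, 48};   /* the dataset from Figure 1 */
    hsize_t chunk[2] = {10, 20};   /* chunk dimensions */

    hid_t file  = H5Fcreate("example.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);

    /* Chunked layout is requested through a dataset creation
       property list before the dataset exists. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);

    hid_t dset = H5Dcreate(file, "chunked", H5T_NATIVE_INT, space, dcpl);

    H5Dclose(dset);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}
```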

    <h2><a name="S2">2. The Raw Data Chunk Cache</a></h2>

    <p>It's obvious from Figure 2 that calling <code>H5Dwrite()</code> 
      many times from the application would result in poor performance
      even if the data being written all falls within a single chunk.
      A raw data chunk cache layer was added between the top of the
      filter stack and the bottom of the byte modification layer<a
      href="#fn2">(2)</a>. By default, the chunk cache will store 521
      chunks or 1MB of data (whichever is less) but these values can
      be modified with <code>H5Pset_cache()</code>.
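    <p>A sketch of changing those limits with
      <code>H5Pset_cache()</code> on a file access property list; the
      metadata cache element count and the 4MB byte limit below are
      illustrative values, not recommendations:

```c
#include "hdf5.h"

int main(void)
{
    /* The raw data chunk cache is configured per file,
       through a file access property list. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Keep the default 521 chunk cache slots but raise the byte
       limit from 1MB to 4MB.  The first numeric argument (metadata
       cache elements) and the last (preemption weight) are shown
       with illustrative values. */
    H5Pset_cache(fapl,
                 8192,              /* metadata cache elements (illustrative) */
                 521,               /* chunk cache slots */
                 4 * 1024 * 1024,   /* chunk cache bytes */
                 0.75);             /* preemption weight w0 */

    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC,
                           H5P_DEFAULT, fapl);
    H5Pclose(fapl);
    H5Fclose(file);
    return 0;
}
```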

    <p>The preemption policy for the cache favors certain chunks and
      tries not to preempt them.

    <ul>
      <li>Chunks that have been accessed frequently in the recent past
	are favored.
      <li>A chunk which has just entered the cache is favored.
      <li>A chunk which has been completely read or completely written
	but not partially read or written is penalized according to
	an application-specified weighting between zero and one.
      <li>A chunk which is larger than the maximum cache size is not
	eligible for caching.
    </ul>

    <h2><a name="S3">3. Cache Efficiency</a></h2>

    <p>Now for some real numbers... A 2000x2000 element dataset is
      created and covered by a 20x20 array of chunks (each chunk is 100x100
      elements). The raw data cache is adjusted to hold at most 25 chunks by
      setting the maximum number of bytes to 25 times the chunk size in
      bytes. Then the application creates a square, two-dimensional memory
      buffer and uses it as a window into the dataset, first reading and then
      rewriting in row-major order by moving the window across the dataset
      (the read and write tests both start with a cold cache).

    <p>The measure of efficiency in Figure 3 is the number of bytes requested
      by the application divided by the number of bytes transferred from the
      file.  There are at least a couple ways to get an estimate of the cache
      performance: one way is to turn on <a href="Debugging.html">cache
      debugging</a> and look at the number of cache misses.  A more accurate
      and specific way is to register a data filter whose sole purpose is to
      count the number of bytes that pass through it (that's the method used
      below).

    <center><img src="Chunk_f3.gif" alt="Figure 3"><br><b>Figure 3</b></center>

    <p>The read efficiency is less than one for two reasons: collisions in the
      cache are handled by preempting one of the colliding chunks, and the
      preemption algorithm occasionally preempts a chunk which hasn't been
      referenced for a long time but is about to be referenced in the near
      future.

    <p>The write test results in lower efficiency for most window
      sizes because HDF5 is unaware that the application is about to
      overwrite the entire dataset and must read in most chunks before 
      modifying parts of them.

    <p>There is a simple way to improve efficiency for this example.
      It turns out that any chunk that has been completely read or
      written is a good candidate for preemption.  If we increase the
      penalty for such chunks from the default 0.75 to the maximum
      1.00 then efficiency improves.

      <center><img src="Chunk_f4.gif" alt="Figure 4"><br><b>Figure 4</b></center>
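    <p>The penalty is the final argument to
      <code>H5Pset_cache()</code>.  A sketch of raising it to 1.00 as
      described above; the other arguments simply restate the cache
      defaults named in this document (the metadata cache element
      count is illustrative):

```c
#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Raise the preemption penalty w0 for fully read or fully
       written chunks from the default 0.75 to the maximum 1.00,
       keeping the default 521 slots and 1MB byte limit. */
    H5Pset_cache(fapl, 8192, 521, 1024 * 1024, 1.0);

    hid_t file = H5Fopen("example.h5", H5F_ACC_RDWR, fapl);
    H5Pclose(fapl);
    H5Fclose(file);
    return 0;
}
```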

    <p>The read efficiency is still less than one because of
      collisions in the cache.  The number of collisions can often be
      reduced by increasing the number of slots in the cache.  Figure
      5 shows what happens when the maximum number of slots is
      increased by an order of magnitude from the default (this change 
      has no major effect on memory used by the test since the byte
      limit was not increased for the cache).

      <center><img src="Chunk_f5.gif" alt="Figure 5"><br><b>Figure 5</b></center>

    <p>Although the application eventually overwrites every chunk
      completely, the library has no way of knowing this beforehand,
      since most calls to <code>H5Dwrite()</code> modify only a
      portion of any given chunk. Therefore, the first modification of
      a chunk causes the chunk to be read from disk into the chunk
      buffer through the filter pipeline.  Eventually HDF5 might
      provide a dataset transfer property that turns off this read
      operation, resulting in write efficiency equal to read
      efficiency.


    <h2><a name="S4">4. Fragmentation</a></h2>

    <p>Even if the application transfers the entire dataset contents with a
      single call to <code>H5Dread()</code> or <code>H5Dwrite()</code> it's
      possible the request will be broken into smaller, more manageable
      pieces by the library.  This is almost certainly true if the data
      transfer includes a type conversion.

      <center><img src="Chunk_f6.gif" alt="Figure 6"><br><b>Figure 6</b></center>

    <p>By default the strip size is 1MB, but it can be changed by calling
      <code>H5Pset_buffer()</code>.
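    <p>A sketch of changing the strip size on a dataset transfer
      property list; the 8MB value is illustrative, and passing
      <code>NULL</code> for the buffer pointers lets the library
      allocate the conversion and background buffers itself:

```c
#include "hdf5.h"

int main(void)
{
    /* The conversion buffer size is a dataset transfer property. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);

    /* Raise the strip size from the default 1MB to 8MB (an
       illustrative value); NULL buffers mean the library allocates
       the type conversion and background buffers internally. */
    H5Pset_buffer(dxpl, 8 * 1024 * 1024, NULL, NULL);

    /* ... pass dxpl as the transfer property list argument to
       H5Dread() or H5Dwrite() ... */

    H5Pclose(dxpl);
    return 0;
}
```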


    <h2><a name="S5">5. File Storage Overhead</a></h2>

    <p>The chunks of the dataset are allocated at independent
      locations throughout the HDF5 file and a B-tree maps chunk
      N-dimensional addresses to file addresses. The more chunks that
      are allocated for a dataset the larger the B-tree. Large B-trees
      have two disadvantages:

    <ul>
      <li>The file storage overhead is higher and more disk I/O is
	required to traverse the tree from root to leaves.
      <li>The increased number of B-tree nodes will result in higher
	contention for the metadata cache.
    </ul>

    <p>There are three ways to reduce the number of B-tree nodes.  The
      obvious way is to reduce the number of chunks by choosing a larger chunk
      size (doubling the chunk size will cut the number of B-tree nodes in
      half).  Another method is to adjust the split ratios for the B-tree by
      calling <code>H5Pset_split_ratios()</code>, but this method typically
      results in only a slight improvement over the default settings.
      Finally, the out-degree of each node can be increased by calling
      <code>H5Pset_istore_k()</code> (increasing the out-degree actually
      increases file overhead while decreasing the number of nodes).
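    <p>A sketch of the latter two methods; the values are illustrative
      only, and the split-ratio function is the one this document
      names (later HDF5 releases expose it as
      <code>H5Pset_btree_ratios()</code>):

```c
#include "hdf5.h"

int main(void)
{
    /* The chunk B-tree out-degree is a file creation property;
       the argument is one half the B-tree rank. */
    hid_t fcpl = H5Pcreate(H5P_FILE_CREATE);
    H5Pset_istore_k(fcpl, 64);   /* larger nodes, fewer of them (illustrative) */
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, fcpl, H5P_DEFAULT);

    /* The split ratios, by contrast, are a dataset transfer property;
       these values favor splits that leave existing nodes full, which
       suits append-heavy access patterns (illustrative values). */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_split_ratios(dxpl, 0.1, 0.5, 1.0);

    H5Pclose(dxpl);
    H5Pclose(fcpl);
    H5Fclose(file);
    return 0;
}
```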

    <hr>

    <p><a name="fn1">Footnote 1:</a> Parallel versions of the library
      can access individual bytes of a chunk when the underlying file
      uses MPI-IO.

    <p><a name="fn2">Footnote 2:</a> The raw data chunk cache was
      added before the second alpha release.





<hr>
<address>
<a href="mailto:hdfhelp@ncsa.uiuc.edu">HDF Help Desk</a>
</address>

<!-- Created: Tue Oct 20 12:38:40 EDT 1998 -->
<!-- hhmts start -->
Last modified:  13 December 1999
<!-- hhmts end -->

<br>
Describes HDF5 Release 1.4 Beta, December 2000



  </body>
</html>