diff options
Diffstat (limited to 'doc/html/Chunking.html')
-rw-r--r-- | doc/html/Chunking.html | 193 |
1 files changed, 193 insertions, 0 deletions
diff --git a/doc/html/Chunking.html b/doc/html/Chunking.html new file mode 100644 index 0000000..ccf4f9e --- /dev/null +++ b/doc/html/Chunking.html @@ -0,0 +1,193 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <title>Dataset Chunking Pitfalls</title> + </head> + + <body> + <h1>Dataset Chunking Pitfalls</h1> + + <h2>Table of Contents</h2> + + <ul> + <li><a href="#S1">1. Introduction</a> + <li><a href="#S2">2. Raw Data Chunk Cache</a> + <li><a href="#S3">3. Cache Efficiency</a> + <li><a href="#S4">4. Fragmentation</a> + <li><a href="#S5">5. File Storage Overhead</a> + </ul> + + <h2><a name="S1">1. Introduction</a></h2> + + + <p><em>Chunking</em> refers to a storage layout where a dataset is + partitioned into fixed-size multi-dimensional chunks. The + chunks cover the dataset but the dataset need not be an integral + number of chunks. If no data is ever written to a chunk then + that chunk isn't allocated on disk. Figure 1 shows a 25x48 + element dataset covered by nine 10x20 chunks and 11 data points + written to the dataset. No data was written to the region of + the dataset covered by three of the chunks so those chunks were + never allocated in the file -- the other chunks are allocated at + independent locations in the file and written in their entirety. + + <center><image src="Chunk_f1.gif"><br><b>Figure 1</b></center> + + <p>The HDF5 library treats chunks as atomic objects -- disk I/O is + always in terms of complete chunks<a href="#fn1">(1)</a>. This + allows data filters to be defined by the application to perform + tasks such as compression, encryption, checksumming, + <em>etc</em>. on entire chunks. As shown in Figure 2, if + <code>H5Dwrite()</code> touches only a few bytes of the chunk, + the entire chunk is read from the file, the data passes upward + through the filter pipeline, the few bytes are modified, the + data passes downward through the filter pipeline, and the entire + chunk is written back to the file. + + <center><image src="Chunk_f2.gif"><br><b>Figure 2</b></center> + + <h2><a name="S2">2. The Raw Data Chunk Cache</a></h2> + + <p>It's obvious from Figure 2 that calling <code>H5Dwrite()</code> + many times from the application would result in poor performance + even if the data being written all falls within a single chunk. + A raw data chunk cache layer was added between the top of the + filter stack and the bottom of the byte modification layer<a + href="#fn2">(2)</a>. By default, the chunk cache will store 521 + chunks or 1MB of data (whichever is less) but these values can + be modified with <code>H5Pset_cache()</code>. + + <p>The preemption policy for the cache favors certain chunks and + tries not to preempt them. + + <ul> + <li>Chunks that have been accessed frequently in the near past + are favored. + <li>A chunk which has just entered the cache is favored. + <li>A chunk which has been completely read or completely written + but not partially read or written is penalized according to + some application specified weighting between zero and one. + <li>A chunk which is larger than the maximum cache size is not + eligible for caching. + </ul> + + <h2><a name="S3">3. Cache Efficiency</a></h2> + + <p>Now for some real numbers... A 2000x2000 element dataset is + created and covered by a 20x20 array of chunks (each chunk is 100x100 + elements). The raw data cache is adjusted to hold at most 25 chunks by + setting the maximum number of bytes to 25 times the chunk size in + bytes. Then the application creates a square, two-dimensional memory + buffer and uses it as a window into the dataset, first reading and then + rewriting in row-major order by moving the window across the dataset + (the read and write tests both start with a cold cache). + + <p>The measure of efficiency in Figure 3 is the number of bytes requested + by the application divided by the number of bytes transferred from the + file. There are at least a couple ways to get an estimate of the cache + performance: one way is to turn on <a href="Debugging.html">cache + debugging</a> and look at the number of cache misses. A more accurate + and specific way is to register a data filter whose sole purpose is to + count the number of bytes that pass through it (that's the method used + below). + + <center><image src="Chunk_f3.gif"><br><b>Figure 3</b></center> + + <p>The read efficiency is less than one for two reasons: collisions in the + cache are handled by preempting one of the colliding chunks, and the + preemption algorithm occasionally preempts a chunk which hasn't been + referenced for a long time but is about to be referenced in the near + future. + + <p>The write test results in lower efficiency for most window + sizes because HDF5 is unaware that the application is about to + overwrite the entire dataset and must read in most chunks before + modifying parts of them. + + <p>There is a simple way to improve efficiency for this example. + It turns out that any chunk that has been completely read or + written is a good candidate for preemption. If we increase the + penalty for such chunks from the default 0.75 to the maximum + 1.00 then efficiency improves. + + <center><image src="Chunk_f4.gif"><br><b>Figure 4</b></center> + + <p>The read efficiency is still less than one because of + collisions in the cache. The number of collisions can often be + reduced by increasing the number of slots in the cache. Figure + 5 shows what happens when the maximum number of slots is + increased by an order of magnitude from the default (this change + has no major effect on memory used by the test since the byte + limit was not increased for the cache). + + <center><image src="Chunk_f5.gif"><br><b>Figure 5</b></center> + + <p>Although the application eventually overwrites every chunk + completely the library has know way of knowing this before hand + since most calls to <code>H5Dwrite()</code> modify only a + portion of any given chunk. Therefore, the first modification of + a chunk will cause the chunk to be read from disk into the chunk + buffer through the filter pipeline. Eventually HDF5 might + contain a data set transfer property that can turn off this read + operation resulting in write efficiency which is equal to read + efficiency. + + + <h2><a name="S4">4. Fragmentation</a></h2> + + <p>Even if the application transfers the entire dataset contents with a + single call to <code>H5Dread()</code> or <code>H5Dwrite()</code> it's + possible the request will be broken into smaller, more manageable + pieces by the library. This is almost certainly true if the data + transfer includes a type conversion. + + <center><image src="Chunk_f6.gif"><br><b>Figure 6</b></center> + + <p>By default the strip size is 1MB but it can be changed by calling + <code>H5Pset_buffer()</code>. + + + <h2><a name="S5">5. File Storage Overhead</a></h2> + + <p>The chunks of the dataset are allocated at independent + locations throughout the HDF5 file and a B-tree maps chunk + N-dimensional addresses to file addresses. The more chunks that + are allocated for a dataset the larger the B-tree. Large B-trees + have two disadvantages: + + <ul> + <li>The file storage overhead is higher and more disk I/O is + required to traverse the tree from root to leaves. + <li>The increased number of B-tree nodes will result in higher + contention for the meta data cache. + </ul> + + <p>There are three ways to reduce the number of B-tree nodes. The + obvious way is to reduce the number of chunks by choosing a larger chunk + size (doubling the chunk size will cut the number of B-tree nodes in + half). Another method is to adjust the split ratios for the B-tree by + calling <code>H5Pset_split_ratios()</code>, but this method typically + results in only a slight improvement over the default settings. + Finally, the out-degree of each node can be increased by calling + <code>H5Pset_istore_k()</code> (increasing the out degree actually + increases file overhead while decreasing the number of nodes). + + <hr> + + <p><a name="fn1">Footnote 1:</a> Parallel versions of the library + can access individual bytes of a chunk when the underlying file + uses MPI-IO. + + <p><a name="fn2">Footnote 2:</a> The raw data chunk cache was + added before the second alpha release. + + + + <hr> + <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address> +<!-- Created: Tue Oct 20 12:38:40 EDT 1998 --> +<!-- hhmts start --> +Last modified: Fri Oct 23 10:30:52 EDT 1998 +<!-- hhmts end --> + </body> +</html> |