Diffstat (limited to 'doc/html/ExternalFiles.html')
-rw-r--r-- | doc/html/ExternalFiles.html | 278 |
1 files changed, 278 insertions, 0 deletions
diff --git a/doc/html/ExternalFiles.html b/doc/html/ExternalFiles.html
new file mode 100644
index 0000000..39ebd2b
--- /dev/null
+++ b/doc/html/ExternalFiles.html
@@ -0,0 +1,278 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>External Files in HDF5</title>
+  </head>
+
+  <body>
+    <center><h1>External Files in HDF5</h1></center>
+
+    <h3>Overview of Layers</h3>
+
+    <p>This table shows some of the layers of HDF5. Each layer calls
+      functions at the same or lower layers and never functions at
+      higher layers. An object identifier (OID) takes various forms
+      at the various layers: at layer 0 an OID is an absolute physical
+      file address; at layers 1 and 2 it's an absolute virtual file
+      address. At layers 3 through 6 it's a relative address, and at
+      layers 7 and above it's an object handle.
+
+    <p><center>
+      <table border cellpadding=4 width="60%">
+        <tr align=center>
+          <td>Layer-7</td>
+          <td>Groups</td>
+          <td>Datasets</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-6</td>
+          <td>Indirect Storage</td>
+          <td>Symbol Tables</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-5</td>
+          <td>B-trees</td>
+          <td>Object Hdrs</td>
+          <td>Heaps</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-4</td>
+          <td>Caching</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-3</td>
+          <td>H5F chunk I/O</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-2</td>
+          <td>H5F low</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-1</td>
+          <td>File Family</td>
+          <td>Split Meta/Raw</td>
+        </tr>
+        <tr align=center>
+          <td>Layer-0</td>
+          <td>Section-2 I/O</td>
+          <td>Standard I/O</td>
+          <td>Malloc/Free</td>
+        </tr>
+      </table>
+    </center>
+
+    <h3>Single Address Space</h3>
+
+    <p>The simplest form of hdf5 file is a single file containing only
+      hdf5 data. The file begins with the boot block, which is
+      followed until the end of the file by hdf5 data. The next most
+      complicated file allows non-hdf5 data (user defined data or
+      internal wrappers) to appear before the boot block and after the
+      end of the hdf5 data.
+      The hdf5 data is treated as a single linear address space in both cases.
+
+    <p>The next level of complexity comes when non-hdf5 data is
+      interspersed with the hdf5 data. We handle that by including
+      the non-hdf5 interspersed data in the hdf5 address space and
+      simply not referencing it (eventually we might add those
+      addresses to a "do-not-disturb" list using the same mechanism as
+      the hdf5 free list, but it's not absolutely necessary). This is
+      implemented except for the "do-not-disturb" list.
+
+    <p>The most complicated single address space hdf5 file is when we
+      allow the address space to be split among multiple physical
+      files. For instance, a >2GB file can be split into smaller
+      chunks and transferred to a 32-bit machine, then accessed as a
+      single logical hdf5 file. The library already supports >32 bit
+      addresses, so at layer 1 we split a 64-bit address into a 32-bit
+      file number and a 32-bit offset (the 64 and 32 are
+      arbitrary). The rest of the library still operates with a linear
+      address space.
+
+    <p>Another variation might be a family of two files where all the
+      meta data is stored in one file and all the raw data is stored
+      in another file to allow the HDF5 wrapper to be easily replaced
+      with some other wrapper.
+
+    <p>The <code>H5Fcreate</code> and <code>H5Fopen</code> functions
+      would need to be modified to pass file-type info down to layer 2
+      so the correct drivers can be called and parameters passed to
+      the drivers to initialize them.
+
+    <h4>Implementation</h4>
+
+    <p>I've implemented fixed-size family members. The entire hdf5
+      file is partitioned into members where each member is the same
+      size. The family scheme is used if one passes a name to
+      <code>H5F_open</code> (which is called by <code>H5Fopen()</code>
+      and <code>H5Fcreate</code>) that contains a
+      <code>printf(3c)</code>-style integer format specifier.
+      Currently, the default low-level file driver is used for all
+      family members (H5F_LOW_DFLT, usually set to be Section 2 I/O or
+      Section 3 stdio), but we'll probably eventually want to pass
+      that as a parameter of the file access template, which hasn't
+      been implemented yet. When creating a family, a default family
+      member size is used (defined at the top of H5Ffamily.c, currently
+      64MB) but that also should be settable in the file access
+      template. When opening an existing family, the size of the first
+      member is used to determine the member size (flushing/closing a
+      family ensures that the first member is the correct size) but
+      the other family members don't have to be that large (the local
+      address space, however, is logically the same size for all
+      members).
+
+    <p>I haven't implemented a split meta/raw family yet but am rather
+      curious to see how it would perform. I was planning to use the
+      `.h5' extension for the meta data file and `.raw' for the raw
+      data file. The high-order bit in the address would determine
+      whether the address refers to meta data or raw data. If the user
+      passes a name that ends with `.raw' to <code>H5F_open</code>
+      then we'll choose the split family and use the default low level
+      driver for each of the two family members. Eventually we'll
+      want to pass these kinds of things through the file access
+      template instead of relying on naming conventions.
+
+    <h3>External Raw Data</h3>
+
+    <p>We also need the ability to point to raw data that isn't in the
+      HDF5 linear address space. For instance, a dataset might be
+      striped across several raw data files.
+
+    <p>Fortunately, the only two packages that need to be aware of
+      this are the packages for reading/writing contiguous raw data
+      and discontiguous raw data. Since contiguous raw data is a
+      special case, I'll discuss how to implement external raw data in
+      the discontiguous case.
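Looking back at the fixed-size family scheme from the Implementation notes, the address arithmetic can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the code in H5Ffamily.c: the names `family_loc_t`, `family_locate`, and `family_member_name` are hypothetical, the template `data-%05u.h5` used in the note below is made up, and the sketch assumes a linear address maps to a member by plain division by the member size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; none of these are HDF5 library symbols. */
typedef struct {
    unsigned membno;   /* index of the family member that holds the byte */
    uint64_t offset;   /* byte offset inside that member */
} family_loc_t;

/* Map a linear file address onto a family of fixed-size members. */
static family_loc_t family_locate(uint64_t addr, uint64_t member_size)
{
    family_loc_t loc;

    assert(member_size > 0);
    loc.membno = (unsigned)(addr / member_size);
    loc.offset = addr % member_size;
    return loc;
}

/* Build a member file name from a printf-style template that contains
 * one unsigned integer conversion. Returns 0 on success and -1 if the
 * name did not fit in the buffer or formatting failed. */
static int family_member_name(char *buf, size_t size, const char *tmpl,
                              unsigned membno)
{
    int n = snprintf(buf, size, tmpl, membno);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With a 64MB member size, an address of 3*64MB+5 would land at offset 5 of member 3, and the made-up template `data-%05u.h5` would name that member `data-00003.h5`. The rest of the library never sees this split, which matches the point that layers above layer 1 still operate on a linear address space.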
+
+    <p>Discontiguous data is stored as a B-tree whose keys are the
+      chunk indices and whose leaf nodes point to the raw data by
+      storing a file address. So what we need is some way to name the
+      external files, and a way to efficiently store the external file
+      name for each chunk.
+
+    <p>I propose adding to the object header an <em>External File
+      List</em> message that is a 1-origin array of file names.
+      Then, in the B-tree, each key has an index into the External
+      File List (or zero for the HDF5 file) for the file where the
+      chunk can be found. The external file index is only used at
+      the leaf nodes to get to the raw data (the entire B-tree is in
+      the HDF5 file) but because of the way keys are copied among
+      the B-tree nodes, it's much easier to store the index with
+      every key.
+
+    <h3>Multiple HDF5 Files</h3>
+
+    <p>One might also want to combine two or more HDF5 files in a
+      manner similar to mounting file systems in Unix. That is, the
+      group structure and meta data from one file appear as though
+      they exist in the first file. One opens File-A, and then
+      <em>mounts</em> File-B at some point in File-A, the <em>mount
+      point</em>, so that traversing into the mount point actually
+      causes one to enter the root object of File-B. File-A and
+      File-B are each complete HDF5 files and can be accessed
+      individually without mounting them.
+
+    <p>We need a couple of additional pieces of machinery to make this
+      work. First, an haddr_t type (a file address) doesn't contain
+      any info about which HDF5 file's address space the address
+      belongs to. But since haddr_t is an opaque type except at
+      layers 2 and below, it should be quite easy to add a pointer to
+      the HDF5 file. This would also remove the H5F_t argument from
+      most of the low-level functions since it would be part of the
+      OID.
+
+    <p>The other thing we need is a table of mount points and some
+      functions that understand them.
+      We would add the following table to each H5F_t struct:
+
+    <p><code><pre>
+struct H5F_mount_t {
+   H5F_t *parent;         /* Parent HDF5 file if any */
+   struct {
+      H5F_t *f;           /* File which is mounted */
+      haddr_t where;      /* Address of mount point */
+   } *mount;              /* Array sorted by mount point */
+   intn nmounts;          /* Number of mounted files */
+   intn alloc;            /* Size of mount table */
+};
+    </pre></code>
+
+    <p>The <code>H5Fmount</code> function takes the ID of an open
+      file, the name of a to-be-mounted file, the name of the mount
+      point, and a file access template (like <code>H5Fopen</code>).
+      It opens the new file and adds a record to the parent's mount
+      table. The <code>H5Funmount</code> function takes the parent
+      file ID and the name of the mount point and closes the file
+      that's mounted at that point. The <code>H5Fclose</code>
+      function closes/unmounts files recursively.
+
+    <p>The <code>H5G_iname</code> function which translates a name to
+      a file address (<code>haddr_t</code>) looks at the mount table
+      at each step in the translation and switches files where
+      appropriate. All name-to-address translations occur through
+      this function.
+
+    <h3>How Long?</h3>
+
+    <p>I'm expecting to be able to implement the two new flavors of
+      single linear address space in about two days. It took two hours
+      to implement the malloc/free file driver at level zero and I
+      don't expect this to be much more work.
+
+    <p>I'm expecting three days to implement the external raw data for
+      discontiguous arrays. Adding the file index to the B-tree is
+      quite trivial; adding the external file list message shouldn't
+      be too hard since the object header message class from which this
+      message derives is fully implemented; and changing
+      <code>H5F_istore_read</code> should be trivial.
+      Most of the time will be spent designing a way to cache Unix file
+      descriptors efficiently since the total number of open files
+      allowed per process could be much smaller than the total number
+      of HDF5 files and external raw data files.
+
+    <p>I'm expecting four days to implement being able to mount one
+      HDF5 file on another. I was originally planning a lot more, but
+      making <code>haddr_t</code> opaque turned out to be much easier
+      than I planned (I did it last Fri). Most of the work will
+      probably be removing the redundant H5F_t arguments for lots of
+      functions.
+
+    <h3>Conclusion</h3>
+
+    <p>The external raw data could be implemented as a single linear
+      address space, but doing so would require one to allocate large
+      enough file addresses throughout the file (>32bits) before the
+      file was created. It would make mixing an HDF5 file family with
+      external raw data, or an external HDF5 wrapper around an HDF4 file,
+      a more difficult process. So I consider the implementation of
+      external raw data files as a single HDF5 linear address space a
+      kludge.
+
+    <p>The ability to mount one HDF5 file on another might not be a
+      very important feature especially since each HDF5 file must be a
+      complete file by itself. It's not possible to stripe an array
+      over multiple HDF5 files because the B-tree wouldn't be complete
+      in any one file, so the only choice is to stripe the array
+      across multiple raw data files and store the B-tree in the HDF5
+      file. On the other hand, it might be useful if one file
+      contains some public data which can be mounted by other files
+      (e.g., a mesh topology shared among collaborators and mounted by
+      files that contain other fields defined on the mesh). Of course
+      the applications can open the two files separately, but it might
+      be more portable if we support it in the library.
+
+    <p>So we're looking at about two weeks to implement all three
+      versions.
+      I didn't get a chance to do any of them in AIO although we had long-term plans for the first two with a
+      possibility of the third. They'll be much easier to implement in
+      HDF5 than AIO since I've been keeping these in mind from the
+      start.
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Sat Nov 8 18:08:52 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Wed Nov 12 15:01:14 EST 1997
+<!-- hhmts end -->
+  </body>
+</html>