author: Frank Baker <fbaker@hdfgroup.org>, 2000-08-25 17:42:24 (GMT)
committer: Frank Baker <fbaker@hdfgroup.org>, 2000-08-25 17:42:24 (GMT)
commit: 2e8cd59163cda6b051eae53aac58f54e0a9e9351
tree: 5d9058906b7a9a39b0815476d7daea0229c67eb4 /doc/html/TechNotes/H4-H5Compat.html
parent: 3ff571ab58fedf8dbb20ac8e1eea251ffb343f9e
[svn-r2482] Bringing "HDF5 Technical Notes" into development branch (from R1.2 branch)
Diffstat (limited to 'doc/html/TechNotes/H4-H5Compat.html')
-rw-r--r-- doc/html/TechNotes/H4-H5Compat.html | 271
1 files changed, 271 insertions, 0 deletions
diff --git a/doc/html/TechNotes/H4-H5Compat.html b/doc/html/TechNotes/H4-H5Compat.html
new file mode 100644
index 0000000..2992476
--- /dev/null
+++ b/doc/html/TechNotes/H4-H5Compat.html
@@ -0,0 +1,271 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <title>Backward/Forward Compatibility</title>
+ </head>
+
+ <body>
+ <h1>Backward/Forward Compatibility</h1>
+
+ <p>The HDF5 development must proceed in such a manner as to
+ satisfy the following conditions:
+
+ <ol type=A>
+ <li>HDF5 applications can produce data that HDF5
+ applications can read and write and HDF4 applications can produce
+ data that HDF4 applications can read and write. The situation
+ that demands this condition is obvious.</li>
+
+ <li>HDF5 applications are able to produce data that HDF4 applications
+ can read and HDF4 applications can subsequently modify the
+ file subject to certain constraints depending on the
+ implementation. This condition is for the temporary
+ situation where a consumer has neither been relinked with a new
+ HDF4 API built on top of the HDF5 API nor recompiled with the
+ HDF5 API.</li>
+
+ <li>HDF5 applications can read existing HDF4 files and subsequently
+ modify the file subject to certain constraints depending on
+ the implementation. This condition is for the temporary
+ situation in which the producer has neither been relinked with a
+ new HDF4 API built on top of the HDF5 API nor recompiled with
+ the HDF5 API, or the permanent situation of HDF5 consumers
+ reading archived HDF4 files.</li>
+ </ol>
+
+ <p>There's at least one invariant: new object features introduced
+ in the HDF5 file format (like 2-d arrays of structs) might be
+ impossible to "translate" to a format that an old HDF4
+ application can understand, because either the HDF4 file format
+ or the HDF4 API has no mechanism to describe the object.
+
+ <p>What follows is one possible implementation based on how
+ Condition B was solved in the AIO/PDB world. It also attempts
+ to satisfy these goals:
+
+ <ol type=1>
+ <li>The main HDF5 library contains as little extra baggage as
+ possible by either relying on external programs to take care
+ of compatibility issues or by incorporating the logic of such
+ programs as optional modules in the HDF5 library. Conditions B
+ and C are separate programs/modules.</li>
+
+ <li>No extra baggage means not only that the library proper is
+ small, but also that it can be implemented from the ground up
+ (rather than migrated from HDF4 source) with minimal regard for
+ HDF4, thus keeping the logic straightforward.</li>
+
+ <li>Compatibility issues are handled behind the scenes when
+ necessary (and possible) but can be carried out explicitly
+ during things like data migration.</li>
+ </ol>
+
+ <hr>
+ <h2>Wrappers</h2>
+
+ <p>The proposed implementation uses <i>wrappers</i> to handle
+ compatibility issues. A Format-X file is <i>wrapped</i> in a
+ Format-Y file by creating a Format-Y skeleton that replicates
+ the Format-X meta data. The Format-Y skeleton points to the raw
+ data stored in Format-X without moving the raw data. The
+ restriction is that the raw data storage methods in Format-Y
+ must be a superset of the raw data storage methods in Format-X
+ (otherwise the raw data must be copied to Format-Y). We're
+ assuming that the meta data is small relative to the entire file.
+
+ <p>The wrapper can be a separate file that has pointers into the
+ first file or it can be contained within the first file. If
+ contained in a single file, the file can appear as a Format-Y
+ file or simultaneously a Format-Y and Format-X file.
+
+ <p>The Format-X meta-data can be thought of as the original
+ wrapper around raw data and Format-Y is a second wrapper around
+ the same data. The wrappers are independent of one another;
+ modifying the meta-data in one wrapper causes the other to
+ become out of date. Modification of raw data doesn't invalidate
+ either view as long as the meta data that describes its storage
+ isn't modified. For instance, an array element can change values
+ if storage is already allocated for the element, but if storage
+ isn't allocated then the meta data describing the storage must
+ change, invalidating all wrappers but one.
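+ <p>As a rough illustration of the rule above, the following Python
+ sketch models wrappers as flags on a shared store of raw data. The
+ names <code>Wrapper</code>, <code>WrappedFile</code>, and their
+ methods are made up for this note and are not part of any HDF
+ library; real wrappers hold far more than a validity flag.

```python
from dataclasses import dataclass, field


@dataclass
class Wrapper:
    """One meta data view (for example HDF4 or HDF5) over shared raw data."""
    name: str
    up_to_date: bool = True


@dataclass
class WrappedFile:
    """Raw data plus the wrappers that describe it (hypothetical model)."""
    raw: dict = field(default_factory=dict)       # element index -> value
    wrappers: dict = field(default_factory=dict)  # wrapper name -> Wrapper

    def add_wrapper(self, name):
        self.wrappers[name] = Wrapper(name)

    def write(self, via, index, value):
        """Write one element through the wrapper named `via`."""
        if not self.wrappers[via].up_to_date:
            raise RuntimeError(f"{via} wrapper is out of date")
        if index not in self.raw:
            # New storage changes the storage meta data, so every
            # wrapper except the one doing the write becomes stale.
            for name, wrapper in self.wrappers.items():
                if name != via:
                    wrapper.up_to_date = False
        self.raw[index] = value

    def read(self, via, index):
        """Read one element; refuse to go through a stale wrapper."""
        if not self.wrappers[via].up_to_date:
            raise RuntimeError(f"{via} wrapper is out of date")
        return self.raw[index]
```

+ <p>In this model, overwriting an element that already has storage
+ leaves both views usable, while writing a new element through one
+ wrapper makes access through the other fail until it is regenerated.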
+
+ <p>It's perfectly legal to modify the meta data of one wrapper
+ without modifying the meta data in the other wrapper(s). The
+ illegal part is accessing the raw data through a wrapper which
+ is out of date.
+
+ <p>If raw data is wrapped by more than one internal wrapper
+ (<i>internal</i> means that the wrapper is in the same file as
+ the raw data) then access to that file must assume that
+ unreferenced parts of that file contain meta data for another
+ wrapper and cannot be reclaimed as free memory.
+
+ <hr>
+ <h2>Implementation of Condition B</h2>
+
+ <p>Since this is a temporary situation which can't be
+ automatically detected by the HDF5 library, we must rely
+ on the application to notify the HDF5 library whether or not it
+ must satisfy Condition B. (Even if we don't rely on the
+ application, at some point someone is going to remove the
+ Condition B constraint from the library.) So the module that
+ handles Condition B is conditionally compiled and then enabled
+ on a per-file basis.
+
+ <p>If the application desires to produce an HDF4 file (determined
+ by arguments to <code>H5Fopen</code>), and the Condition B
+ module is compiled into the library, then <code>H5Fclose</code>
+ calls the module to traverse the HDF5 wrapper and generate an
+ additional internal or external HDF4 wrapper (wrapper specifics
+ are described below). If Condition B is implemented as a module
+ then it can benefit from the metadata already cached by the main
+ library.
+
+ <p>An internal HDF4 wrapper would be used if the HDF5 file is
+ writable and the user doesn't mind that the HDF5 file is
+ modified. An external wrapper would be used if the file isn't
+ writable or if the user wants the data file to be primarily HDF5
+ but a few applications need an HDF4 view of the data.
+
+ <p>Modifying through the HDF5 library an HDF5 file that has an
+ internal HDF4 wrapper should invalidate the HDF4 wrapper (and
+ optionally regenerate it when <code>H5Fclose</code> is
+ called). The HDF5 library must understand how wrappers work, but
+ not necessarily anything about the HDF4 file format.
+
+ <p>Modifying through the HDF5 library an HDF5 file that has an
+ external HDF4 wrapper will cause the HDF4 wrapper to become out
+ of date (but possibly regenerated during <code>H5Fclose</code>).
+ <b>Note: Perhaps the next release of the HDF4 library should
+ ensure that the HDF4 wrapper file has a more recent modification
+ time than the raw data file (the HDF5 file) to which it
+ points(?)</b>
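+ <p>The modification-time check suggested in the note could look
+ something like the sketch below. The function name
+ <code>wrapper_is_current</code> is hypothetical, and comparing file
+ system timestamps is only one possible way to detect a stale
+ external wrapper.

```python
import os


def wrapper_is_current(wrapper_path, data_path):
    """Return True if the wrapper file is strictly newer than the data file.

    Equal timestamps are treated as stale, which is the conservative
    reading of "more recent modification time".
    """
    return os.path.getmtime(wrapper_path) > os.path.getmtime(data_path)
```

+ <p>A reader that finds the wrapper stale would then either refuse
+ to use it or regenerate it from the raw data file.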
+
+ <p>Modifying through the HDF4 library an HDF5 file that has an
+ internal or external HDF4 wrapper will cause the HDF5 wrapper to
+ become out of date. However, there is no way for the old HDF4
+ library to notify the HDF5 wrapper that it's out of date.
+ Therefore the HDF5 library must be able to detect when the HDF5
+ wrapper is out of date and be able to fix it. If the HDF4
+ wrapper is complete then the easy way is to ignore the original
+ HDF5 wrapper and generate a new one from the HDF4 wrapper. The
+ other approach is to compare the HDF4 and HDF5 wrappers under
+ three assumptions: where they differ, HDF4 is the right one;
+ where HDF4 omits data, HDF4 is a partial wrapper (rather than
+ HDF4 having deleted the data); and where HDF4 has new data, the
+ new meta data is copied to the HDF5 wrapper. On the other hand,
+ perhaps we don't need to allow these situations (modifying an
+ HDF5 file with the old HDF4 library and then accessing it with
+ the HDF5 library is either disallowed or causes HDF5 objects
+ that can't be described by HDF4 to be lost).
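+ <p>The compare-and-merge approach described above can be sketched
+ as follows, assuming (purely for illustration) that each wrapper's
+ meta data is a dictionary from object name to description. The
+ function <code>reconcile</code> and this representation are
+ inventions of the sketch, not part of either library.

```python
def reconcile(hdf5_meta, hdf4_meta):
    """Rebuild the HDF5 view, treating the HDF4 view as authoritative.

    Both arguments are hypothetical dicts mapping object names to
    descriptions. Neither input is modified.
    """
    # Objects that HDF4 omits are kept: HDF4 is assumed to be a
    # partial wrapper, not to have deleted them.
    merged = dict(hdf5_meta)
    # Where the views differ HDF4 wins, and objects that exist only
    # in HDF4 are copied across.
    merged.update(hdf4_meta)
    return merged
```

+ <p>Note that under these assumptions a genuine deletion made through
+ the HDF4 library is indistinguishable from an object the partial
+ wrapper never described, so the object survives in the merged view.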
+
+ <p>To convert an HDF5 file to an HDF4 file on demand, one simply
+ opens the file with the HDF4 flag and closes it. This is also
+ how AIO implemented backward compatibility with PDB in its file
+ format.
+
+ <hr>
+ <h2>Implementation of Condition C</h2>
+
+ <p>This condition must be satisfied for all time because there
+ will always be archived HDF4 files. If a pure HDF4 file (that
+ is, one without HDF5 meta data) is opened with an HDF5 library,
+ then <code>H5Fopen</code> builds an internal or external HDF5
+ wrapper and then accesses the raw data through that wrapper. If
+ the HDF5 library modifies the file then the HDF4 wrapper becomes
+ out of date. However, since the HDF5 library hasn't been
+ released, we can at least implement it to disable and/or reclaim
+ the HDF4 wrapper.
+
+ <p>If an external and temporary HDF5 wrapper is desired, the
+ wrapper is created through the cache like all other HDF5 files.
+ The data appears on disk only if a particular cached datum is
+ preempted. Instead of calling <code>H5Fclose</code> on the HDF5
+ wrapper file we call <code>H5Fabort</code> which immediately
+ releases all file resources without updating the file, and then
+ we unlink the file from Unix.
+
+ <hr>
+ <h2>What do wrappers look like?</h2>
+
+ <p>External wrappers are quite obvious: they contain only things
+ from the format specs for the wrapper and nothing from the
+ format specs of the format which they wrap.
+
+ <p>An internal HDF4 wrapper is added to an HDF5 file in such a way
+ that the file appears to be both an HDF4 file and an HDF5
+ file. HDF4 requires an HDF4 file header at file offset zero. If
+ a user block is present then we just move the user block down a
+ bit (and truncate it) and insert the minimum HDF4 signature.
+ The HDF4 <code>dd</code> list and any other data it needs are
+ appended to the end of the file and the HDF5 signature uses the
+ logical file length field to determine the beginning of the
+ trailing part of the wrapper.
+
+ <p>
+ <center>
+ <table border width="60%">
+ <tr>
+ <td>HDF4 minimal file header. Its main job is to point to
+ the <code>dd</code> list at the end of the file.</td>
+ </tr>
+ <tr>
+ <td>User-defined block which is truncated by the size of the
+ HDF4 file header so that the HDF5 boot block file address
+ doesn't change.</td>
+ </tr>
+ <tr>
+ <td>The HDF5 boot block and data, unmodified by adding the
+ HDF4 wrapper.</td>
+ </tr>
+ <tr>
+ <td>The main part of the HDF4 wrapper. The <code>dd</code>
+ list will have entries for all parts of the file so
+ hdpack(?) doesn't (re)move anything.</td>
+ </tr>
+ </table>
+ </center>
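+ <p>A file laid out like the table above should be recognizable as
+ both formats from its signatures alone. The sketch below checks
+ for that. The byte values and the search offsets (0, 512, 1024,
+ 2048, ...) are assumptions taken from the HDF4 and HDF5 formats as
+ later released, which may not match what this note had in mind,
+ and the function names are made up for the example.

```python
HDF4_MAGIC = b"\x0e\x03\x13\x01"        # first four bytes of an HDF4 file
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"   # eight-byte HDF5 signature


def find_hdf5_signature(data):
    """Return the offset of the HDF5 signature in `data`, or None.

    Checks offset 0, then 512, 1024, 2048, ... (doubling each time).
    """
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= len(data):
        if data[offset:offset + len(HDF5_SIGNATURE)] == HDF5_SIGNATURE:
            return offset
        offset = 512 if offset == 0 else offset * 2
    return None


def classify(data):
    """Return the set of formats whose signatures appear in `data`."""
    formats = set()
    if data[:len(HDF4_MAGIC)] == HDF4_MAGIC:
        formats.add("HDF4")
    if find_hdf5_signature(data) is not None:
        formats.add("HDF5")
    return formats
```

+ <p>Only signatures are examined here; whether either wrapper is
+ actually up to date has to be determined separately.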
+
+ <p>When such a file is opened by the HDF5 library for
+ modification, the library shifts the user block back to address
+ zero and fills the vacated space with zeros, then either
+ truncates the file at the end of the HDF5 data or adds the
+ trailing HDF4 wrapper to the free list. This prevents HDF4
+ applications from reading the file with an out of date wrapper.
+
+ <p>If there is no user block then we have a problem. The HDF5
+ boot block must be moved to make room for the HDF4 file header.
+ But moving just the boot block causes problems because all file
+ addresses stored in the file are relative to the boot block
+ address. The only option is to shift the entire file contents
+ by 512 bytes to open up a user block. (Too bad we don't have
+ hooks into the Unix i-node stuff; then we could shift the entire
+ file contents by the size of a file system page without ever
+ performing I/O on the file.) :-)
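+ <p>Without such hooks, the shift has to be done with ordinary I/O.
+ A minimal sketch, assuming the file can be rewritten in place: the
+ function name <code>shift_file</code> and its parameters are
+ hypothetical, and a real implementation would also need to guard
+ against being interrupted part way through.

```python
import os


def shift_file(path, shift=512, chunk=64 * 1024):
    """Move the whole contents of `path` up by `shift` bytes in place.

    The opened gap at the start of the file is filled with zeros.
    """
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        pos = size
        # Copy from the tail backwards so that no byte is overwritten
        # before it has been read.
        while pos > 0:
            start = max(0, pos - chunk)
            f.seek(start)
            block = f.read(pos - start)
            f.seek(start + shift)
            f.write(block)
            pos = start
        # Zero the gap that now precedes the original contents.
        f.seek(0)
        f.write(b"\0" * shift)
```

+ <p>Because everything moves together, addresses stored relative to
+ the boot block stay valid; only the boot block's own file address
+ changes.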
+
+ <p>Is it possible to place an HDF5 wrapper in an HDF4 file? I
+ don't know enough about the HDF4 format, but I would suspect it
+ might be possible to open a hole at file address 512 (and
+ possibly before) by moving some things to the end of the file
+ to make room for the HDF5 signature. The remainder of the HDF5
+ wrapper goes at the end of the file and entries are added to the
+ HDF4 <code>dd</code> list to mark the location(s) of the HDF5
+ wrapper.
+
+ <hr>
+ <h2>Other Thoughts</h2>
+
+ <p>Conversion programs that copy an entire HDF4 file to a separate,
+ self-contained HDF5 file and vice versa might be useful.
+
+
+
+
+ <hr>
+ <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Fri Oct 3 11:52:31 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Wed Oct 8 12:34:42 EST 1997
+<!-- hhmts end -->
+ </body>
+</html>