1 files changed, 252 insertions, 0 deletions
diff --git a/doc/html/review1a.html b/doc/html/review1a.html
new file mode 100644
index 0000000..78a5a84
--- /dev/null
+++ b/doc/html/review1a.html
@@ -0,0 +1,252 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <title>Group Examples</title>
+  </head>
+  <body>
+    <center><h1>Group Examples</h1></center>
+
+    <hr>
+    <h1>Background</h1>
+
+    <p>Directories (or now <i>Groups</i>) are currently implemented as
+      a directed graph with a single entry point into the graph which
+      is the <i>Root Object</i>. The root object is usually a
+      group. All objects have at least one predecessor (the <i>Root
+      Object</i> always has the HDF5 file boot block as a
+      predecessor).  The number of predecessors of a group is also
+      known as the <i>hard link count</i> or just <i>link count</i>.
+      Unlike Unix directories, HDF5 groups have no ".."  entry since
+      any group can have multiple predecessors.  Given the handle or
+      id of some object, returning a full name for that object
+      would require an expensive graph traversal.
+
+    <p>A special optimization is that a file may contain a single
+      non-group object and no group(s).  The object has one
+      predecessor which is the file boot block.  However, once a root
+      group is created it never disappears (although I suppose it
+      could if we wanted).
+
+    <p>A special object called a <i>Symbolic Link</i> is simply a
+      name. Usually the name refers to some (other) object, but that
+      object need not exist.  Symbolic links in HDF5 will have the
+      same semantics as symbolic links in Unix.
+
+    <p>The symbol table graph contains "entries" for each name.  An
+      entry contains the file address for the object header and
+      possibly certain messages cached from the object header.
+
+    <p>The H5G package understands the notion of <i>opening</i> an object,
+      which means that given the name of the object, a handle to the
+      object is returned (this isn't an API function).  Objects can be
+      opened multiple times simultaneously through the same name or,
+      if the object has hard links, through other names.  The name of
+      an object cannot be removed from a group if the object is opened
+      through that group (although the name can change within the
+      group).
+
+    <p>Below the API, object attributes can be read without opening
+      the object; object attributes cannot change without first
+      opening that object. The one exception is that the contents of a
+      group can change without opening the group.
+
+    <hr>
+    <h1>Building a hierarchy from a flat namespace</h1>
+
+    <p>Assuming we have a flat namespace (that is, the root object is
+      a group which contains names for all other objects in the file
+      and none of those objects are groups), then we can build a
+      hierarchy of groups that also refer to the objects.
+
+    <p>The file initially contains `foo', `bar', and `baz' in the root
+      group.  We wish to add groups `grp1' and `grp2' so that `grp1'
+      contains objects `foo' and `baz' and `grp2' contains objects
+      `bar' and `baz' (so `baz' appears in both groups).
+
+    <p>In either case below, one might want to move the flat objects
+      into some other group (like `flat') so their names don't
+      interfere with the rest of the hierarchy (or move the hierarchy
+      into a directory called `/hierarchy').
+
+    <h2>with symbolic links</h2>
+
+    <p>Create group `grp1' and add symbolic links called `foo' whose
+      value is `/foo' and `baz' whose value is `/baz'.  Similarly for
+      `grp2'.
+
+    <p>Accessing `grp1/foo' involves searching the root group for
+      the name `grp1', then searching that group for `foo', then
+      searching the root group for `foo' (the link's value).  Alternatively, one
+      could change working groups to the grp1 group and then ask for
+      `foo' which searches `grp1' for the name `foo', then searches
+      the root group for the name `foo'.
+
+    <p>Deleting `/grp1/foo' deletes the symbolic link without
+      affecting the `/foo' object.  Deleting `/foo' leaves the
+      `/grp1/foo' link dangling.
+
+    <h2>with hard links</h2>
+
+    <p>Creating the hierarchy is the same as with symbolic links, but using hard links.
+
+    <p>Accessing `/grp1/foo' searches the root group for the name
+      `grp1', then searches that group for the name `foo'.  If the
+      current working group is `/grp1' then we just search for the
+      name `foo'.
+
+    <p>Deleting `/grp1/foo' leaves `/foo' intact, and vice versa.
+
+    <h2>the code</h2>
+
+    <p>Depending on the eventual API...
+
+    <code><pre>
+H5Gcreate (file_id, "/grp1");
+H5Glink (file_id, H5G_HARD, "/foo", "/grp1/foo");
+    </pre></code>
+
+    or
+
+    <code><pre>
+group_id = H5Gcreate (root_id, "grp1");
+H5Glink (file_id, H5G_HARD, root_id, "foo", group_id, "foo");
+H5Gclose (group_id);
+    </pre></code>
+
+
+    <hr>
+    <h1>Building a flat namespace from a hierarchy</h1>
+    
+    <p>Similar to above, but in this case we have to watch out that
+      we don't get two names which are the same: what happens to
+      `/grp1/baz' and `/grp2/baz'?  If they really refer to the same
+      object then we just have `/baz', but if they point to two
+      different objects what happens?
+
+    <p>The other thing to watch out for is cycles in the graph when we
+      traverse it to build the flat namespace.
+
+    <hr>
+    <h1>Listing the Group Contents</h1>
+
+    <p>Two things to watch out for are that the group contents don't
+      appear to change in a manner which would confuse the
+      application, and that listing everything in a group is as
+      efficient as possible.
+
+    <h2>Method A</h2>
+
+    <p>Query the number of things in a group and then query each item
+      by index. A trivial implementation would be O(n*n) and wouldn't
+      protect the caller from changes to the directory which move
+      entries around and therefore change their indices.
+
+      <code><pre>
+n = H5GgetNumContents (group_id);
+for (i=0; i&lt;n; i++) {
+   H5GgetNameByIndex (group_id, i, ...); /*don't worry about args yet*/
+}
+    </pre></code>
+
+    <h2>Method B</h2>
+
+    <p>The API contains a single function that reads all information
+      from the specified group and returns that info through an array.
+      The caller is responsible for freeing the array allocated by the
+      query and the things to which it points.  This also makes it
+      clear that the returned value is a snapshot of the group which
+      doesn't change if the group is modified.
+
+      <code><pre>
+n = H5Glist (file_id, "/grp1", &amp;info, ...); /*library allocates info*/
+for (i=0; i&lt;n; i++) {
+   printf ("name = %s\n", info[i].name);
+   free (info[i].name); /*and maybe other fields too?*/
+}
+free (info);
+    </pre></code>
+
+    Notice that it would be difficult to expand the info struct since
+    its definition is part of the API.
+
+    <h2>Method C</h2>
+
+    <p>The caller asks for a snapshot of the group and then accesses
+      items in the snapshot through various query-by-index API
+      functions.  When finished, the caller notifies the library that
+      it's done with the snapshot.  The word "snapshot" makes it clear
+      that subsequent changes to the directory will not be reflected in
+      the snapshot.
+
+      <code><pre>
+snapshot_id = H5Gsnapshot (group_id); /*or perhaps group_name */
+n = H5GgetNumContents (snapshot_id);
+for (i=0; i&lt;n; i++) {
+   H5GgetNameByIndex (snapshot_id, i, ...);
+}
+H5Grelease (snapshot_id);
+    </pre></code>
+
+    In fact, we could allow the user to leave off the H5Gsnapshot and
+    H5Grelease and use group_id in the H5GgetNumContents and
+    H5GgetNameByIndex so they can choose between Method A and Method
+    C.
+
+    <hr>
+    <h1>An implementation of Method C</h1>
+
+    <dl>
+      <dt><code>hid_t H5Gsnapshot (hid_t group_id)</code>
+      <dd>Opens every object in the specified group and stores the
+	handles in an array managed by the library (linear-time
+	operation).  Open object handles are essentially symbol table
+	entries with a little extra info (symbol table entries cache
+	certain things about the object which are also found in the
+	object header). Because the objects are open (A) they cannot be
+	removed from the group, (B) querying the object returns the
+	latest info even if something else has that object open, (C)
+	if the object is renamed within the group then the name returned by
+	<code>H5GgetNameByIndex</code> changes too.  Adding new entries
+	to a group doesn't affect the snapshot.
+
+      <dt><code>char *H5GgetNameByIndex (hid_t snapshot_id, int
+	  index)</code>
+      <dd>Uses the open object handle from entry <code>index</code> of
+	the snapshot array to get the object name.  This is a
+	constant-time operation.  The name is updated automatically if
+	the object is renamed within the group.
+
+      <dt><code>H5Gget&lt;whatever&gt;ByIndex...()</code>
+      <dd>Uses the open object handle from entry <code>index</code>,
+	which is just a symbol table entry, and reads the appropriate
+	object header message(s) which might be cached in the symbol
+	table entry.  This is a constant-time operation if cached,
+	linear in the number of messages if not cached.
+
+      <dt><code>H5Grelease (hid_t snapshot_id)</code>
+      <dd>Closes each object referred to by the snapshot and then frees
+	the snapshot array.  This is a linear-time operation.
+    </dl>
+
+    <hr>
+    <h1>To return <code>char*</code> or some HDF5 string type?</h1>
+
+    <p>In either case, the caller has to release resources associated
+      with the return value, calling free() or some HDF5 function.
+
+    <p>Names in the current implementation of the H5G package don't
+      contain embedded null characters and are always null terminated.
+
+    <p>Eventually the caller probably wants a <code>char*</code> so it
+      can pass it to some non-HDF5 function.  Does that require
+      strdup'ing the string again? Then the caller has to free() the
+      char* <i>and</i> release the HDF5 string.
+
+    <hr>
+    <address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
+<!-- Created: Fri Sep 26 12:03:20 EST 1997 -->
+<!-- hhmts start -->
+Last modified: Fri Oct  3 09:32:10 EST 1997
+<!-- hhmts end -->
+  </body>
+</html>