summaryrefslogtreecommitdiffstats
path: root/doc/html/symtab
diff options
context:
space:
mode:
Diffstat (limited to 'doc/html/symtab')
-rw-r--r--doc/html/symtab313
1 files changed, 313 insertions, 0 deletions
diff --git a/doc/html/symtab b/doc/html/symtab
new file mode 100644
index 0000000..a657729
--- /dev/null
+++ b/doc/html/symtab
@@ -0,0 +1,313 @@
+A number of issues involving caching of object header messages in
+symbol table entries must be resolved.
+
+What is the motivation for these changes?
+
+ If we make objects completely independent of object name it allows
+ us to refer to one object by multiple names (a concept called hard
+ links in Unix file systems), which in turn provides an easy way to
+ share data between datasets.
+
+ Every object in an HDF5 file has a unique, constant object header
+ address which serves as a handle (or OID) for the object. The
+ object header contains messages which describe the object.
+
+ HDF5 allows some of the object header messages to be cached in
+ symbol table entries so that the object header doesn't have to be
+ read from disk. For instance, an entry for a directory caches the
+ directory disk addresses required to access that directory, so the
+ object header for that directory is seldom read.
+
+ If an object has multiple names (that is, a link count greater than
+ one), then it has multiple symbol table entries which point to it.
+ All symbol table entries must agree on header messages. The
+ current mechanism is to turn off the caching of header messages in
+ symbol table entries when the header link count is more than one,
+ and to allow caching once the link count returns to one.
+
+ However, in the current implementation, a package is allowed to
+ copy a symbol table entry and use it as a private cache for the
+ object header. This doesn't work for a number of reasons (all but
+ one require a `delete symbol entry' operation).
+
+ 1. If two packages hold copies of the same symbol table entry,
+ they don't notify each other of changes to the symbol table
+ entry. Eventually, one package reads a cached message and
+ gets the wrong value because the other package changed the
+ message in the object header.
+
+ 2. If one package holds a copy of the symbol table entry and
+ some other part of HDF5 removes the object and replaces it
+ with some other object, then the original package will
+ continue to access the non-existent object using the new
+ object header.
+
+ 3. If one package holds a copy of the symbol table entry and
+ some other part of HDF5 (re)moves the directory which
+ contains the object, then the package will be unable to
+ update the symbol table entry with the new cached
+ data. Packages that refer to the object by the new name will
+ use old cached data.
+
+
+The basic problem is that there may be multiple copies of the object
+symbol table entry floating around in the code when there should
+really be at most one per hard link.
+
+ Level 0: A copy may exist on disk as part of a symbol table node, which
+ is a small 1d array of symbol table entries.
+
+ Level 1: A copy may be cached in memory as part of a symbol table node
+ in the H5Gnode.c file by the H5AC layer.
+
+ Level 2a: Another package may be holding a copy so it can perform
+ fast lookup of any header messages that might be cached in
+ the symbol table entry. It can't point directly to the
+ cached symbol table node because that node can dissappear
+ at any time.
+
+ Level 2b: Packages may hold more than one copy of a symbol table
+ entry. For instance, if H5D_open() is called twice for
+ the same name, then two copies of the symbol table entry
+ for the dataset exist in the H5D package.
+
+How can level 2a and 2b be combined?
+
+ If package data structures contained pointers to symbol table
+ entries instead of copies of symbol table entries and if H5G
+ allocated one symbol table entry per hard link, then it's trivial
+ for Level 2a and 2b to benefit from one another's actions since
+ they share the same cache.
+
+How does this work conceptually?
+
+ Level 2a and 2b must notify Level 1 of their intent to use (or stop
+ using) a symbol table entry to access an object header. The
+ notification of the intent to access an object header is called
+ `opening' the object and releasing the access is `closing' the
+ object.
+
+ Opening an object requires an object name which is used to locate
+ the symbol table entry to use for caching of object header
+ messages. The return value is a handle for the object. Figure 1
+ shows the state after Dataset1 opens Object with a name that maps
+ through Entry1. The open request created a copy of Entry1 called
+ Shadow1 which exists even if SymNode1 is preempted from the H5AC
+ layer.
+
+ ______
+ Object / \
+ SymNode1 +--------+ |
+ +--------+ _____\ | Header | |
+ | | / / +--------+ |
+ +--------+ +---------+ \______/
+ | Entry1 | | Shadow1 | /____
+ +--------+ +---------+ \ \
+ : : \
+ +--------+ +----------+
+ | Dataset1 |
+ +----------+
+ FIGURE 1
+
+
+
+ The SymNode1 can appear and disappear from the H5AC layer at any
+ time without affecting the Object Header data cached in the Shadow.
+ The rules are:
+
+ * If the SymNode1 is present and is about to disappear and the
+ Shadow1 dirty bit is set, then Shadow1 is copied over Entry1, the
+ Entry1 dirty bit is set, and the Shadow1 dirty bit is cleared.
+
+ * If something requests a copy of Entry1 (for a read-only peek
+ request), and Shadow1 exists, then a copy (not pointer) of Shadow1
+ is returned instead.
+
+ * Entry1 cannot be deleted while Shadow1 exists.
+
+ * Entry1 cannot change directly if Shadow1 exists since this means
+ that some other package has opened the object and may be modifying
+ it. I haven't decided if it's useful to ever change Entry1
+ directly (except of course within the H5G layer itself).
+
+ * Shadow1 is created when Dataset1 `opens' the object through
+ Entry1. Dataset1 is given a pointer to Shadow1 and Shadow1's
+ reference count is incremented.
+
+ * When Dataset1 `closes' the Object the Shadow1 reference count is
+ decremented. When the reference count reaches zero, if the
+ Shadow1 dirty bit is set, then Shadow1's contents are copied to
+ Entry1, and the Entry1 dirty bit is set. Shadow1 is then deleted
+ if its reference count is zero. This may require reading SymNode1
+ back into the H5AC layer.
+
+What happens when another Dataset opens the Object through Entry1?
+
+ If the current state is represented by the top part of Figure 2,
+ then Dataset2 will be given a pointer to Shadow1 and the Shadow1
+ reference count will be incremented to two. The Object header link
+ count remains at one so Object Header messages continue to be cached
+ by Shadow1. Dataset1 and Dataset2 benefit from one another
+ actions. The resulting state is represented by Figure 2.
+
+ _____
+ SymNode1 Object / \
+ +--------+ _____\ +--------+ |
+ | | / / | Header | |
+ +--------+ +---------+ +--------+ |
+ | Entry1 | | Shadow1 | /____ \_____/
+ +--------+ +---------+ \ \
+ : : _ \
+ +--------+ |\ +----------+
+ \ | Dataset1 |
+ \________ +----------+
+ \ \
+ +----------+ |
+ | Dataset2 | |- New Dataset
+ +----------+ |
+ /
+ FIGURE 2
+
+
+What happens when the link count for Object increases while Dataset
+has the Object open?
+
+ SymNode2
+ +--------+
+ SymNode1 Object | |
+ +--------+ ____\ +--------+ /______ +--------+
+ | | / / | header | \ `| Entry2 |
+ +--------+ +---------+ +--------+ +--------+
+ | Entry1 | | Shadow1 | /____ : :
+ +--------+ +---------+ \ \ +--------+
+ : : \
+ +--------+ +----------+ \________________/
+ | Dataset1 | |
+ +----------+ New Link
+
+ FIGURE 3
+
+ The current state is represented by the left part of Figure 3. To
+ create a new link the Object Header had to be located by traversing
+ through Entry1/Shadow1. On the way through, the Entry1/Shadow1
+ cache is invalidated and the Object Header link count is
+ incremented. Entry2 is then added to SymNode2.
+
+ Since the Object Header link count is greater than one, Object
+ header data will not be cached in Entry1/Shadow1.
+
+ If the initial state had been all of Figure 3 and a third link is
+ being added and Object is open by Entry1 and Entry2, then creation
+ of the third link will invalidate the cache in Entry1 or Entry2. It
+ doesn't matter which since both caches are already invalidated
+ anyway.
+
+What happens if another Dataset opens the same object by another name?
+
+ If the current state is represented by Figure 3, then a Shadow2 is
+ created and associated with Entry2. However, since the Object
+ Header link count is more than one, nothing gets cached in Shadow2
+ (or Shadow1).
+
+What happens if the link count decreases?
+
+ If the current state is represented by all of Figure 3 then it isn't
+ possible to delete Entry1 because the object is currently open
+ through that entry. Therefore, the link count must have
+ decreased because Entry2 was removed.
+
+ As Dataset1 reads/writes messages in the Object header they will
+ begin to be cached in Shadow1 again because the Object header link
+ count is one.
+
+What happens if the object is removed while it's open?
+
+ That operation is not allowed.
+
+What happens if the directory containing the object is deleted?
+
+ That operation is not allowed since deleting the directory requires
+ that the directory be empty. The directory cannot be emptied
+ because the open object cannot be removed from the directory.
+
+What happens if the object is moved?
+
+ Moving an object is a process consisting of creating a new
+ hard-link with the new name and then deleting the old name.
+ This will fail if the object is open.
+
+What happens if the directory containing the entry is moved?
+
+ The entry and the shadow still exist and are associated with one
+ another.
+
+What if a file is flushed or closed when objects are open?
+
+ Flushing a symbol table with open objects writes correct information
+ to the file since Shadow is copied to Entry before the table is
+ flushed.
+
+ Closing a file with open objects will create a valid file but will
+ return failure.
+
+How is the Shadow associated with the Entry?
+
+ A symbol table is composed of one or more symbol nodes. A node is a
+ small 1-d array of symbol table entries. The entries can move
+ around within a node and from node-to-node as entries are added or
+ removed from the symbol table and nodes can move around within a
+ symbol table, being created and destroyed as necessary.
+
+ Since a symbol table has an object header with a unique and constant
+ file offset, and since H5G contains code to efficiently locate a
+ symbol table entry given it's name, we use these two values as a key
+ within a shadow to associate the shadow with the symbol table
+ entry.
+
+ struct H5G_shadow_t {
+ haddr_t stab_addr; /*symbol table header address*/
+ char *name; /*entry name wrt symbol table*/
+ hbool_t dirty; /*out-of-date wrt stab entry?*/
+ H5G_entry_t ent; /*my copy of stab entry */
+ H5G_entry_t *main; /*the level 1 entry or null */
+ H5G_shadow_t *next, *prev; /*other shadows for this stab*/
+ };
+
+ The set of shadows will be organized in a hash table of linked
+ lists. Each linked list will contain the shadows associated with a
+ particular symbol table header address and the list will be sorted
+ lexicographically.
+
+ Also, each Entry will have a pointer to the corresponding Shadow or
+ null if there is no shadow.
+
+ When a symbol table node is loaded into the main cache, we look up
+ the linked list of shadows in the shadow hash table based on the
+ address of the symbol table object header. We then traverse that
+ list matching shadows with symbol table entries.
+
+ We assume that opening/closing objects will be a relatively
+ infrequent event compared with loading/flushing symbol table
+ nodes. Therefore, if we keep the linked list of shadows sorted it
+ costs O(N) to open and close objects where N is the number of open
+ objects in that symbol table (instead of O(1)) but it costs only
+ O(N) to load a symbol table node (instead of O(N^2)).
+
+What about the root symbol entry?
+
+ Level 1 storage for the root symbol entry is always available since
+ it's stored in the hdf5_file_t struct instead of a symbol table
+ node. However, the contents of that entry can move from the file
+ handle to a symbol table node by H5G_mkroot(). Therefore, if the
+ root object is opened, we keep a shadow entry for it whose
+ `stab_addr' field is zero and whose `name' is null.
+
+ For this reason, the root object should always be read through the
+ H5G interface.
+
+One more key invariant: The H5O_STAB message in a symbol table header
+never changes. This allows symbol table entries to cache the H5O_STAB
+message for the symbol table to which it points without worrying about
+whether the cache will ever be invalidated.
+
+