Diffstat (limited to 'doc/html/symtab')
-rw-r--r-- | doc/html/symtab | 313 |
1 files changed, 313 insertions, 0 deletions
diff --git a/doc/html/symtab b/doc/html/symtab
new file mode 100644
index 0000000..a657729
--- /dev/null
+++ b/doc/html/symtab
@@ -0,0 +1,313 @@
+A number of issues involving caching of object header messages in
+symbol table entries must be resolved.
+
+What is the motivation for these changes?
+
+   If we make objects completely independent of object name it allows
+   us to refer to one object by multiple names (a concept called hard
+   links in Unix file systems), which in turn provides an easy way to
+   share data between datasets.
+
+   Every object in an HDF5 file has a unique, constant object header
+   address which serves as a handle (or OID) for the object. The
+   object header contains messages which describe the object.
+
+   HDF5 allows some of the object header messages to be cached in
+   symbol table entries so that the object header doesn't have to be
+   read from disk. For instance, an entry for a directory caches the
+   directory disk addresses required to access that directory, so the
+   object header for that directory is seldom read.
+
+   If an object has multiple names (that is, a link count greater than
+   one), then it has multiple symbol table entries which point to it.
+   All symbol table entries must agree on header messages. The
+   current mechanism is to turn off the caching of header messages in
+   symbol table entries when the header link count is more than one,
+   and to allow caching once the link count returns to one.
+
+   However, in the current implementation, a package is allowed to
+   copy a symbol table entry and use it as a private cache for the
+   object header. This doesn't work for a number of reasons (all but
+   one require a `delete symbol entry' operation).
+
+      1. If two packages hold copies of the same symbol table entry,
+         they don't notify each other of changes to the symbol table
+         entry. Eventually, one package reads a cached message and
+         gets the wrong value because the other package changed the
+         message in the object header.
+
+      2.
+         If one package holds a copy of the symbol table entry and
+         some other part of HDF5 removes the object and replaces it
+         with some other object, then the original package will
+         continue to access the non-existent object using the new
+         object header.
+
+      3. If one package holds a copy of the symbol table entry and
+         some other part of HDF5 (re)moves the directory which
+         contains the object, then the package will be unable to
+         update the symbol table entry with the new cached
+         data. Packages that refer to the object by the new name will
+         use old cached data.
+
+
+The basic problem is that there may be multiple copies of the object
+symbol table entry floating around in the code when there should
+really be at most one per hard link.
+
+   Level 0:  A copy may exist on disk as part of a symbol table node,
+             which is a small 1d array of symbol table entries.
+
+   Level 1:  A copy may be cached in memory as part of a symbol table
+             node in the H5Gnode.c file by the H5AC layer.
+
+   Level 2a: Another package may be holding a copy so it can perform
+             fast lookup of any header messages that might be cached in
+             the symbol table entry. It can't point directly to the
+             cached symbol table node because that node can disappear
+             at any time.
+
+   Level 2b: Packages may hold more than one copy of a symbol table
+             entry. For instance, if H5D_open() is called twice for
+             the same name, then two copies of the symbol table entry
+             for the dataset exist in the H5D package.
+
+How can level 2a and 2b be combined?
+
+   If package data structures contained pointers to symbol table
+   entries instead of copies of symbol table entries and if H5G
+   allocated one symbol table entry per hard link, then it's trivial
+   for Level 2a and 2b to benefit from one another's actions since
+   they share the same cache.
+
+How does this work conceptually?
+
+   Level 2a and 2b must notify Level 1 of their intent to use (or stop
+   using) a symbol table entry to access an object header.
+   The notification of the intent to access an object header is
+   called `opening' the object and releasing the access is `closing'
+   the object.
+
+   Opening an object requires an object name which is used to locate
+   the symbol table entry to use for caching of object header
+   messages. The return value is a handle for the object. Figure 1
+   shows the state after Dataset1 opens Object with a name that maps
+   through Entry1. The open request created a copy of Entry1 called
+   Shadow1 which exists even if SymNode1 is preempted from the H5AC
+   layer.
+
+                                                      ______
+                                             Object  /      \
+          SymNode1                         +--------+        |
+         +--------+            _____\      | Header |        |
+         |        |           /     /      +--------+        |
+         +--------+ +---------+                      \______/
+         | Entry1 | | Shadow1 | /____
+         +--------+ +---------+ \    \
+         :        :                   \
+         +--------+                    +----------+
+                                       | Dataset1 |
+                                       +----------+
+                            FIGURE 1
+
+
+
+   The SymNode1 can appear and disappear from the H5AC layer at any
+   time without affecting the Object Header data cached in the Shadow.
+   The rules are:
+
+   * If the SymNode1 is present and is about to disappear and the
+     Shadow1 dirty bit is set, then Shadow1 is copied over Entry1, the
+     Entry1 dirty bit is set, and the Shadow1 dirty bit is cleared.
+
+   * If something requests a copy of Entry1 (for a read-only peek
+     request), and Shadow1 exists, then a copy (not pointer) of Shadow1
+     is returned instead.
+
+   * Entry1 cannot be deleted while Shadow1 exists.
+
+   * Entry1 cannot change directly if Shadow1 exists since this means
+     that some other package has opened the object and may be modifying
+     it. I haven't decided if it's useful to ever change Entry1
+     directly (except of course within the H5G layer itself).
+
+   * Shadow1 is created when Dataset1 `opens' the object through
+     Entry1. Dataset1 is given a pointer to Shadow1 and Shadow1's
+     reference count is incremented.
+
+   * When Dataset1 `closes' the Object the Shadow1 reference count is
+     decremented.
+     When the reference count reaches zero, if the Shadow1 dirty bit
+     is set, then Shadow1's contents are copied to Entry1, and the
+     Entry1 dirty bit is set. Shadow1 is then deleted if its
+     reference count is zero. This may require reading SymNode1 back
+     into the H5AC layer.
+
+What happens when another Dataset opens the Object through Entry1?
+
+   If the current state is represented by the top part of Figure 2,
+   then Dataset2 will be given a pointer to Shadow1 and the Shadow1
+   reference count will be incremented to two. The Object header link
+   count remains at one so Object Header messages continue to be cached
+   by Shadow1. Dataset1 and Dataset2 benefit from one another's
+   actions. The resulting state is represented by Figure 2.
+
+                                                       _____
+          SymNode1                           Object   /     \
+         +--------+            _____\      +--------+        |
+         |        |           /     /      | Header |        |
+         +--------+ +---------+            +--------+        |
+         | Entry1 | | Shadow1 | /____                 \_____/
+         +--------+ +---------+ \    \
+         :        :           _       \
+         +--------+          |\        +----------+
+                               \       | Dataset1 |
+                                \________ +----------+
+                                         \              \
+                                          +----------+   |
+                                          | Dataset2 |   |- New Dataset
+                                          +----------+   |
+                                                        /
+                            FIGURE 2
+
+
+What happens when the link count for Object increases while Dataset
+has the Object open?
+
+                                                            SymNode2
+                                                           +--------+
+  SymNode1                       Object                    |        |
+ +--------+            ____\   +--------+      /______     +--------+
+ |        |           /    /   | header |      \          `| Entry2 |
+ +--------+ +---------+        +--------+                  +--------+
+ | Entry1 | | Shadow1 | /____                              :        :
+ +--------+ +---------+ \    \                             +--------+
+ :        :                   \
+ +--------+                    +----------+   \________________/
+                               | Dataset1 |           |
+                               +----------+        New Link
+
+                            FIGURE 3
+
+   The current state is represented by the left part of Figure 3. To
+   create a new link the Object Header had to be located by traversing
+   through Entry1/Shadow1. On the way through, the Entry1/Shadow1
+   cache is invalidated and the Object Header link count is
+   incremented. Entry2 is then added to SymNode2.
+
+   Since the Object Header link count is greater than one, Object
+   header data will not be cached in Entry1/Shadow1.
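
The open/close reference counting and the link-count rule described so
far can be sketched in C. This is only a minimal model under stated
assumptions: the types and functions below (obj_header_t, entry_t,
shadow_t, shadow_open, shadow_cache_msg, obj_add_link, shadow_close)
are hypothetical stand-ins invented for illustration and are not the
real H5G or H5O structures. A single integer stands in for a cached
header message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real H5O/H5G types. */
typedef struct {
    int nlink;               /* hard-link count kept in the object header */
} obj_header_t;

typedef struct {
    bool cache_valid;        /* true if cached_msg mirrors the header */
    int  cached_msg;         /* stand-in for one cached header message */
    bool dirty;              /* contents differ from the backing copy */
} entry_t;

typedef struct {
    entry_t  ent;            /* the shadow's private copy of the entry */
    entry_t *main_ent;       /* Level 1 entry this shadow belongs to */
    int      nrefs;          /* number of outstanding opens */
} shadow_t;

/* Open: the first open copies the entry; every open bumps the count. */
void shadow_open(shadow_t *sh, entry_t *e)
{
    if (sh->nrefs == 0) {
        sh->ent = *e;
        sh->ent.dirty = false;
        sh->main_ent = e;
    }
    sh->nrefs++;
}

/* Cache a header message only while the object has exactly one link. */
bool shadow_cache_msg(shadow_t *sh, const obj_header_t *oh, int msg)
{
    if (oh->nlink != 1)
        return false;
    sh->ent.cached_msg = msg;
    sh->ent.cache_valid = true;
    sh->ent.dirty = true;
    return true;
}

/* Adding a hard link invalidates the cache, then bumps the link count. */
void obj_add_link(obj_header_t *oh, shadow_t *sh)
{
    sh->ent.cache_valid = false;
    sh->ent.dirty = true;
    oh->nlink++;
}

/* Close: on the last close, write a dirty shadow back to its entry. */
int shadow_close(shadow_t *sh)
{
    assert(sh->nrefs > 0);
    if (--sh->nrefs == 0 && sh->ent.dirty) {
        *sh->main_ent = sh->ent;
        sh->main_ent->dirty = true;
        sh->ent.dirty = false;
    }
    return sh->nrefs;
}
```

In this model two opens through the same entry share one shadow, so a
message cached by the first caller is visible to the second, and a new
hard link turns caching off for both.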
+
+   If the initial state had been all of Figure 3 and a third link is
+   being added and Object is open by Entry1 and Entry2, then creation
+   of the third link will invalidate the cache in Entry1 or Entry2. It
+   doesn't matter which since both caches are already invalidated
+   anyway.
+
+What happens if another Dataset opens the same object by another name?
+
+   If the current state is represented by Figure 3, then a Shadow2 is
+   created and associated with Entry2. However, since the Object
+   Header link count is more than one, nothing gets cached in Shadow2
+   (or Shadow1).
+
+What happens if the link count decreases?
+
+   If the current state is represented by all of Figure 3 then it isn't
+   possible to delete Entry1 because the object is currently open
+   through that entry. Therefore, the link count must have
+   decreased because Entry2 was removed.
+
+   As Dataset1 reads/writes messages in the Object header they will
+   begin to be cached in Shadow1 again because the Object header link
+   count is one.
+
+What happens if the object is removed while it's open?
+
+   That operation is not allowed.
+
+What happens if the directory containing the object is deleted?
+
+   That operation is not allowed since deleting the directory requires
+   that the directory be empty. The directory cannot be emptied
+   because the open object cannot be removed from the directory.
+
+What happens if the object is moved?
+
+   Moving an object is a process consisting of creating a new
+   hard-link with the new name and then deleting the old name.
+   This will fail if the object is open.
+
+What happens if the directory containing the entry is moved?
+
+   The entry and the shadow still exist and are associated with one
+   another.
+
+What if a file is flushed or closed when objects are open?
+
+   Flushing a symbol table with open objects writes correct information
+   to the file since Shadow is copied to Entry before the table is
+   flushed.
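
The flush rule just stated, together with the earlier rule that an
entry cannot be removed while the object is open through it, can be
sketched as follows. This is an illustrative model, not HDF5 code:
sym_node_t, entry_t, shadow_t, node_flush, node_remove and NODE_SIZE
are hypothetical names, and a plain array stands in for the on-disk
copy of the node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NODE_SIZE 4              /* hypothetical entries per symbol node */

typedef struct shadow shadow_t;

typedef struct {
    int       cached_msg;        /* stand-in for cached header data */
    bool      dirty;             /* entry differs from the disk copy */
    shadow_t *shadow;            /* NULL unless the object is open */
} entry_t;

struct shadow {
    entry_t ent;                 /* the shadow's copy of the entry */
    bool    dirty;               /* shadow differs from its entry */
};

typedef struct {
    entry_t entries[NODE_SIZE];
    int     nentries;
} sym_node_t;

/* Flush: copy each dirty shadow over its entry, then write every
 * dirty entry to `disk`.  Returns the number of entries written. */
int node_flush(sym_node_t *node, entry_t *disk)
{
    int nwritten = 0;
    for (int i = 0; i < node->nentries; i++) {
        entry_t *e = &node->entries[i];
        if (e->shadow != NULL && e->shadow->dirty) {
            e->cached_msg = e->shadow->ent.cached_msg;
            e->dirty = true;
            e->shadow->dirty = false;
        }
        if (e->dirty) {
            disk[i] = *e;
            disk[i].shadow = NULL;   /* pointers are never stored on disk */
            disk[i].dirty = false;
            e->dirty = false;
            nwritten++;
        }
    }
    return nwritten;
}

/* Remove: refused while a shadow exists, i.e. while the object is open. */
bool node_remove(sym_node_t *node, int idx)
{
    if (idx < 0 || idx >= node->nentries)
        return false;
    if (node->entries[idx].shadow != NULL)
        return false;
    for (int i = idx; i + 1 < node->nentries; i++)
        node->entries[i] = node->entries[i + 1];
    node->nentries--;
    return true;
}
```

Because removal is refused for any entry that has a shadow, a
directory holding an open object can never become empty in this model,
which matches the reasoning given above for directory deletion.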
+
+   Closing a file with open objects will create a valid file but will
+   return failure.
+
+How is the Shadow associated with the Entry?
+
+   A symbol table is composed of one or more symbol nodes. A node is a
+   small 1-d array of symbol table entries. The entries can move
+   around within a node and from node-to-node as entries are added or
+   removed from the symbol table and nodes can move around within a
+   symbol table, being created and destroyed as necessary.
+
+   Since a symbol table has an object header with a unique and constant
+   file offset, and since H5G contains code to efficiently locate a
+   symbol table entry given its name, we use these two values as a key
+   within a shadow to associate the shadow with the symbol table
+   entry.
+
+   struct H5G_shadow_t {
+      haddr_t      stab_addr;      /*symbol table header address*/
+      char         *name;          /*entry name wrt symbol table*/
+      hbool_t      dirty;          /*out-of-date wrt stab entry?*/
+      H5G_entry_t  ent;            /*my copy of stab entry      */
+      H5G_entry_t  *main;          /*the level 1 entry or null  */
+      H5G_shadow_t *next, *prev;   /*other shadows for this stab*/
+   };
+
+   The set of shadows will be organized in a hash table of linked
+   lists. Each linked list will contain the shadows associated with a
+   particular symbol table header address and the list will be sorted
+   lexicographically.
+
+   Also, each Entry will have a pointer to the corresponding Shadow or
+   null if there is no shadow.
+
+   When a symbol table node is loaded into the main cache, we look up
+   the linked list of shadows in the shadow hash table based on the
+   address of the symbol table object header. We then traverse that
+   list matching shadows with symbol table entries.
+
+   We assume that opening/closing objects will be a relatively
+   infrequent event compared with loading/flushing symbol table
+   nodes.
+   Therefore, if we keep the linked list of shadows sorted it costs
+   O(N) to open and close objects where N is the number of open
+   objects in that symbol table (instead of O(1)) but it costs only
+   O(N) to load a symbol table node (instead of O(N^2)).
+
+What about the root symbol entry?
+
+   Level 1 storage for the root symbol entry is always available since
+   it's stored in the hdf5_file_t struct instead of a symbol table
+   node. However, the contents of that entry can move from the file
+   handle to a symbol table node by H5G_mkroot(). Therefore, if the
+   root object is opened, we keep a shadow entry for it whose
+   `stab_addr' field is zero and whose `name' is null.
+
+   For this reason, the root object should always be read through the
+   H5G interface.
+
+One more key invariant: The H5O_STAB message in a symbol table header
+never changes. This allows symbol table entries to cache the H5O_STAB
+message for the symbol table to which it points without worrying about
+whether the cache will ever be invalidated.
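
The cost trade-off for the sorted shadow list discussed under "How is
the Shadow associated with the Entry?" can be illustrated with a small
sketch. It models only one per-symbol-table list, leaving out the hash
table keyed by symbol table header address. The names shadow_t,
entry_t, shadow_insert and shadow_match are hypothetical, and the
sketch assumes that the entries of a loaded node are themselves sorted
by name, which the note above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down shadow: only the fields the list needs. */
typedef struct shadow {
    const char    *name;         /* entry name within the symbol table */
    struct shadow *next, *prev;  /* neighbours in the sorted list */
} shadow_t;

typedef struct {
    const char *name;
    shadow_t   *shadow;          /* matching shadow, or NULL */
} entry_t;

/* Insert `sh` so the list stays sorted by name.  O(N) per open.
 * Returns the (possibly new) head of the list. */
shadow_t *shadow_insert(shadow_t *head, shadow_t *sh)
{
    shadow_t *prev = NULL, *cur = head;
    while (cur != NULL && strcmp(cur->name, sh->name) < 0) {
        prev = cur;
        cur = cur->next;
    }
    sh->prev = prev;
    sh->next = cur;
    if (cur != NULL)
        cur->prev = sh;
    if (prev != NULL) {
        prev->next = sh;
        return head;
    }
    return sh;
}

/* Match shadows to the entries of a freshly loaded node with a single
 * merge-style pass over both sorted sequences, instead of searching
 * the whole list once per entry.  Returns the number of matches. */
int shadow_match(shadow_t *head, entry_t *ents, int n)
{
    int matched = 0, i = 0;
    shadow_t *sh = head;
    while (sh != NULL && i < n) {
        int c = strcmp(sh->name, ents[i].name);
        if (c == 0) {
            ents[i].shadow = sh;
            matched++;
            sh = sh->next;
            i++;
        } else if (c < 0) {
            sh = sh->next;           /* shadow belongs to another node */
        } else {
            ents[i].shadow = NULL;   /* entry is not open */
            i++;
        }
    }
    while (i < n)
        ents[i++].shadow = NULL;
    return matched;
}
```

With an unsorted list each entry would need its own scan of the list;
the sorted list makes opening an object slower in exchange for a
single pass when a node is loaded.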