From c492036b0d5a6dbcd2389af48fec95c80098210e Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 09:22:04 -0500 Subject: Add the VFD SWMR User's Guide, a work in progress. --- doc/vfd-swmr-user-guide.md | 468 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 468 insertions(+) create mode 100644 doc/vfd-swmr-user-guide.md diff --git a/doc/vfd-swmr-user-guide.md b/doc/vfd-swmr-user-guide.md new file mode 100644 index 0000000..e80ba98 --- /dev/null +++ b/doc/vfd-swmr-user-guide.md @@ -0,0 +1,468 @@ +# Welcome to VFD SWMR + +Thank you for volunteering to test VFD SWMR. VFD SMWR is a new feature +of the HDF5 library that lets a process write data to an HDF5 file while +one or more processes read the file. Use cases range from monitoring +data collection and/or steering experiments in progress to financial +applications. + +VFD SWMR is designed to be a more flexible, more modular, +better-performing replacement for the existing SWMR feature. VFD +SWMR allows HDF5 objects (groups, datasets, attributes) to be +created and destroyed in the course of a reader-writer session. It +compartmentalizes much of the SWMR functionality in a virtual-file +driver (VFD), thus easing The HDF Group's software-maintenance burden. +And it makes guarantees for the maximum time from write to availability +of data for read, provided that the reading and writing systems and +their interconnections can keep up with the data flow. + +# Quick start + +Follow these instructions to download, configure, and build the VFD SWMR +project in a jiffy. Then install the HDF5 library and utilites built +by the VFD SWMR project. + +## Download + +The latest source code here for VFD SWMR is found on the `multi` +branch of [the VFD SWMR +repository](https://bitbucket.hdfgroup.org/scm/~dyoung/vchoi_fork.git). 
+ +Clone the repository in a new directory, then switch to the VFD SWMR branch: + +``` +% git clone https://bitbucket.hdfgroup.org/scm/~dyoung/vchoi_fork.git swmr +% cd swmr +% git checkout multi +``` + +## Build + +Set up autotools: + +``` +% sh ./autogen.sh +``` + +Create a build directory, change to that directory, and run the +configure script: + +``` +% mkdir -p ../build/swmr +% cd ../build/swmr +% ../../swmr/configure CFLAGS="-g -O3" +``` + +You don't have to provide the CFLAGS, but usually I want compiler +optimizations, and I want debugging symbols. + +Build the project: + +``` +% make +``` + +## Test + +We recommend that you run the full HDF5 test suite to make sure that VFD +SWMR works correctly on your system. To test the library and utilities, run + +``` +% make check +``` + +If the tests don't pass, please let the developers know! + +# Sample programs + +## Extensible datasets + +For an example of a program that uses VFD SWMR to write/read many +extensible datasets, have a look at `test/vfd_swmr_bigset_writer.c`, the +"bigset" test. We compile two binaries from that source file, one that +operates in write mode, and a second that operates in read mode. + +In write mode, "bigset" creates an HDF5 file containing one or more +datasets that are extensible in either one dimension or two. Then it +runs for several steps, increasing the size of each dataset in each +dimension once every step. The dimensions, number of datasets, the +step increase in dataset size, and the number of steps are configurable +using command-line options -d, -s, -r and -c, and -n, respectively---use +the -h option to get a usage message. Each dataset is written with a +predictable pattern. + +In read mode, "bigset" reads each dataset from an HDF5 file created +by a "bigset" writer and verifies the patterns. It takes the same +command-line parameters as the "bigset" writer.
The reader and writer +may run concurrently; the reader "polls" the content until it is just +shy of complete, given the number of steps expected. + +To run a bigset test, I open a couple of terminal windows, one for the +reader and one for the writer. I change to the `test` directory under +my build directory, and I run the writer in one window: + +% ./vfd_swmr_bigset_writer -n 50 -d 2 + +and in the other window, I run the reader: + +% ./vfd_swmr_bigset_reader -n 50 -d 2 -W + +The writer will wait for a signal before it quits. You may tap +Control-C to make it quit. If you don't want it to wait, then you can +pass option flag -W. (The reader accepts the same flag.) Use the `-q` +option to suppress the progress messages that the programs write to the +standard error stream. + +When the writer is creating a dataset extensible in one dimension +(`-d 1`), you can add the `-V` option flag to create a virtual +dataset with content in three source datasets in the same HDF5 +file. + +The `-M` option works like `-V`, only the writer creates the virtual +dataset on three source datasets each in a different HDF5 file. + +## The VFD SWMR demos + +TBD: edit this section + +My repository containing the VFD SWMR +demos is https://bitbucket.hdfgroup.org/scm/~dyoung/swmr-demo.git . +They are the same two demos that were shown at the webinar. + +Before you build the demos, you will need to install the HDF5 library +and utilities built from our VFD SWMR branch in your home directory +somewhere. In the ./configure step, I set the install directory using +a --prefix=$HOME/path/for/library command-line option, but there may be +another way. Update the H5CC lines in the Makefiles with the path to +h5cc. Then you should be able to `make`, `make clean`, etc. + +Under gaussians/, two programs are built, `wgaussians` and `rgaussians`. +If you start both from the same directory in different terminals, you +should see the "bouncing 2-D Gaussian distributions" in the `rgaussians` +terminal. 
+ +The creation-deletion (`credel`) demo is also run in two terminals. The +two command lines are given in `credel/README.md`. You need to use the +`h5ls` installed from the VFD SWMR branch, since only that version has the +`--poll` option. + +# Developer tips + +## Configuring VFD SWMR + +### File-creation properties + +To use VFD SWMR, creating your HDF5 file with paged allocation strategy +is mandatory. This call enables the paged allocation strategy: + +``` +ret = H5Pset_file_space_strategy(fcpl, H5F_FSPACE_STRATEGY_PAGE, false, 1); +``` + +Allocated storage that is smaller than the page size will +not overlap a page boundary, and allocated storage that is one page or +greater in size will start on a page boundary. VFD SWMR relies on that +allocation strategy. + +### File-access properties + +In this section we dissect `vfd_swmr_create_fapl()`, a helper routine in +the VFD SWMR tests, to show how to configure your application to use VFD +SWMR. + +``` +hid_t +vfd_swmr_create_fapl(bool writer, bool only_meta_pages, bool use_vfd_swmr) +{ + H5F_vfd_swmr_config_t config; + hid_t fapl; + +/** + ** h5_fileaccess() is also a helper routine for the tests. + ** In your program, you can replace the h5_fileaccess() call with a call to + ** H5Pcreate(H5P_FILE_ACCESS). + **/ + + /* Create file access property list */ + if((fapl = h5_fileaccess()) < 0) { + warnx("h5_fileaccess"); + return badhid; + } + +/** + ** VFD SWMR has only been tested with the latest file format. It may + ** malfunction with older formats, we just don't know. We force the + ** latest version here. + **/ + + /* FOR NOW: set to use latest format, the "old" parameter is not used */ + if(H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST) < 0) { + warnx("H5Pset_libver_bounds"); + return badhid; + } + + /* + * Set up to open the file with VFD SWMR configured. + */ + +/** + ** VFD SWMR relies on metadata reads and writes to go through the + ** page buffer. 
Note that the default page size is 4096 bytes. This + ** call sets the total page buffer size to 4096 bytes. So we have + ** effectively created a one-page page buffer! That is adequate for + ** testing, but it may not be best for your application. + ** + ** If `only_meta_pages` is true, then the entire page buffer is + ** dedicated to metadata. That's fine for VFD SWMR. + ** + ** NOTE WELL: when VFD SWMR is enabled, the meta-/raw-data pages proportion + ** set by H5Pset_page_buffer_size() does not actually control the + ** pages reserved for raw data. *All* pages are dedicated to buffering + ** metadata. + **/ + + /* Enable page buffering */ + if(H5Pset_page_buffer_size(fapl, 4096, only_meta_pages ? 100 : 0, 0) < 0) { + warnx("H5Pset_page_buffer_size"); + return badhid; + } + +/** + ** Add VFD SWMR-specific configuration to the file-access property list + ** (`fapl`) using an H5Pset_vfd_swmr_config() call. + ** + ** When VFD SWMR is enabled, changes to the HDF5 metadata accumulate in + ** RAM until a configurable unit of time known as a *tick* has passed. + ** At the end of each tick, a snapshot of the metadata at the end of + ** the tick is "published"---that is, made visible to the readers. + ** + ** The length of a *tick* is configurable in units of 100 milliseconds + ** using the `tick_len` parameter. Below, `tick_len` is set to `4` to + ** select a tick length of 400ms. + ** + ** A snapshot does not persist forever, but it expires after a number + ** of ticks, given by the *maximum lag*, has passed. Below, `max_lag` + ** is set to `7` to select a maximum lag of 7 ticks. After a snapshot + ** has expired, the writer may overwrite it. + ** + ** When a reader first enters the API, it starts to use, or "selects," + ** the metadata in the newest snapshot, and on every subsequent API + ** entry, if a tick has passed since the last selection, and if new + ** snapshots are available, then the reader selects the latest. 
+ ** + ** If a reader spends longer than `max_lag - 1` ticks (2400ms with + ** the example configuration) inside the HDF5 API, then its snapshot + ** may expire, resulting in undefined behavior. When a snapshot + ** expires while the reader is using it, we say that the writer has + ** "overrun" the reader. The writer cannot currently detect overruns. + ** Frequently the reader will detect an overrun and force the program + ** to exit with a diagnostic assertion failure. + ** + ** The application tells VFD SWMR whether or not to configure for + ** reading or writing a file by setting the `writer` parameter to + ** `true` for writing or `false` for reading. + ** + ** VFD SWMR snapshots are stored in a "shadow file" that is shared + ** between writer and readers. On a POSIX system, the shadow file + ** may be placed on any *local* filesystem that the reader and writer + ** share. The `md_file_path` parameter tells where to put the shadow + ** file. + ** + ** The `md_pages_reserved` parameter tells how many pages to reserve + ** at the beginning of the shadow file for the shadow-file header + ** and the shadow index. The header has an entire page to itself. + ** The remaining `md_pages_reserved - 1` pages are reserved for the + ** shadow index. If the index grows larger than its initial + ** allocation, then it will move to a new location in the shadow file, + ** and the initial allocation will be reclaimed. `md_pages_reserved` + ** must be at least 2. + ** + ** The `version` parameter tells what version of VFD SWMR configuration + ** the parameter struct `config` contains. For now, it should be + ** initialized to `H5F__CURR_VFD_SWMR_CONFIG_VERSION`. 
+ **/ + + memset(&config, 0, sizeof(config)); + + config.version = H5F__CURR_VFD_SWMR_CONFIG_VERSION; + config.tick_len = 4; + config.max_lag = 7; + config.writer = writer; + config.md_pages_reserved = 128; + HDstrcpy(config.md_file_path, "./my_md_file"); + + /* Enable VFD SWMR configuration */ + if(use_vfd_swmr && H5Pset_vfd_swmr_config(fapl, &config) < 0) { + warnx("H5Pset_vfd_swmr_config"); + return badhid; + } + return fapl; +} +``` + +## Using virtual datasets (VDS) + +An application may want to use VFD SWMR to create, read, or write +a virtual dataset. Unfortunately, VDS does not work properly with +VFD SWMR at this time. In this section, we describe some workarounds +that can be used with great care to make VDS and VFD SWMR operate +simultaneously. + +A virtual dataset, when it is read or written, will open files on +an application's behalf in order to access the source datasets +inside. If a virtual dataset resides on file `v.h5`, and one of +its source datasets resides on a second file, `s1.h5`, then the +virtual dataset will try to open `s1.h5` using the same file-access +properties as `v.h5`. Thus, if `v.h5` is open with VFD SWMR with +shadow file `v.shadow`, then the virtual dataset will try to open +`s1.h5` with the same shadow file, which will fail. + +Suppose that `v.h5` is *not* open with VFD SWMR, but it was opened +with default file-access properties. Then the virtual dataset will +open the source dataset on `s1.h5` with default file-access +properties, too. This default virtual-dataset behavior is not +helpful to the application that wants to use VFD SWMR to read or +write source datasets. + +To use VFD SWMR with VDS, an application should *pre-open* each file +using its preferred file-access properties, including independent shadow +filenames for each source file. As long as the virtual dataset remains +in use, the application should leave each of the pre-opened files open. 
+In this way the library, when it tries to open the source files, will +always find them already open and re-use the already-open files with the +file-access properties established on first open. + +## Pushing HDF5 content to reader visibility + +With VFD SWMR, ordinarily it should not be necessary to call +H5Fflush(). In fact, when VFD SWMR is active, calling H5Fflush() +may slow down your program considerably because the call will not +return until after `max_lag` ticks have passed. + +A writer can make its last changes to an HDF5 file visible to all +readers immediately using the new call, `H5Fvfd_swmr_end_tick()`. +A writer should use `H5Fvfd_swmr_end_tick()` carefully: by calling +it more frequently than once a tick, a writer may corrupt a reader's +view of the HDF5 file. + +When VFD SWMR is enabled, raw data is not cached in the page buffer. On +each tick, the content of chunk caches and other unwritten raw data is +flushed directly to the HDF5 file, so that raw data is always available +before the HDF5 structural metadata that describes it. + +## Reading up-to-date content + +The HDF Group (THG) expects that in one class of VFD SWMR application, +instruments on a particle accelerator will continuously generate +2-dimensional data frames and add them to HDF5 datasets while an +experiment is ongoing. The datasets will be written to an HDF5 +file opened in VFD SWMR mode. Experimenters will monitor a real-time +display of the datasets while the experiment takes place. A second +program, possibly running on a second computer, will generate the +display. The second program will open the HDF5 file in VFD SWMR +mode, too. + +THG developed a demonstration program for this class of application, +and we have some advice based on that experience. + +The writer typically will increase a dataset's dimensions by a +frame, using `H5Dset_extent()`, before it writes the data of that +frame with `H5Dwrite()`.
It's possible that a snapshot of the HDF5 +file will propagate to the reader between the `H5Dset_extent()` +call and the `H5Dwrite()`. Values read by `H5Dread()` from the last frame +at that juncture will not reflect the actual experimental data. +Instead, the reader will see arbitrary values or the fill value. +To display those values would be distracting and misleading to +the experimenter. + +On the reader, a strategy for displaying the most current, bona fide application +data is to read the dimensions of the frames dataset, `d`, compute +the number `n` of full frames contained in `d`, and read the +next-to-last frame, at index `n - 2`. THG uses a variant of this strategy +in its `gaussians` demo. + +On the writer, a strategy for protecting against snapshots between +the `H5Dset_extent()` and `H5Dwrite()` calls is to suspend VFD +SWMR's clock across both of the calls. The +`H5Fvfd_swmr_disable_end_of_tick()` call takes a file identifier +and stops new snapshots from being taken on the given file until +`H5Fvfd_swmr_enable_end_of_tick()` is called on the same file. + +# Known issues + +## Variable-length data + +A VFD SWMR reader cannot reliably read back a variable-length dataset +written by VFD SWMR. For example, a variable-length string +created and written as follows + +``` + hid_t dset, space, type; + const char *data = "content"; + + type = H5Tcopy(H5T_C_S1); + + H5Tset_size(type, H5T_VARIABLE); + + space = H5Screate(H5S_SCALAR); + + dset = H5Dcreate2(..., "string", type, space, H5P_DEFAULT, H5P_DEFAULT, + H5P_DEFAULT); + + H5Dwrite(dset, type, space, space, H5P_DEFAULT, &data); +``` + +and read back like this, + +``` + char *data; + herr_t ret; + + ret = H5Dread(..., ..., H5S_ALL, H5S_ALL, H5P_DEFAULT, &data); +``` + +may produce either an error return from `H5Dread` (`ret < 0`) or +a `NULL` pointer (`data == NULL`). + +Planned improvements to the HDF5 *global heap* may alleviate this +problem. There is no schedule for those improvements.
+ +Improvements to VFD SWMR may also alleviate the problem. + +## Microsoft Windows + +VFD SWMR does not support Microsoft Windows at this time. We do plan to +add support this year. + +## Supported filesystems + +A VFD SWMR writer and readers share a couple of files, the HDF5 (`.h5`) +file and the shadow file. VFD SWMR relies on writes to the files to +take effect in the order described in the POSIX documentation for +`read(2)` and `write(2)` system calls. If the VFD SWMR readers and the +writer run on the same POSIX host, this ordering should take effect, +regardless of the underlying filesystem. + +If the VFD SWMR reader and the writer run on *different* hosts, then +the write-ordering rules depend on the shared filesystem. VFD SWMR is +not generally expected to work with NFS at this time. GPFS is reputed +to order writes according to POSIX convention, so we expect VFD SWMR +to work with GPFS. (Caveat: we are still looking for an authoritative +description of GPFS I/O semantics.) + +The HDF Group plans to add support for NFS to VFD SWMR in the future. + +## File-opening order + +If an application tries to open a file in VFD SWMR reader mode, and the +file is not already open by a VFD SWMR writer, then the application will +sleep in the `H5Fopen()` call until either the writer opens the same +file (using the same shadow file) or the reader times out after several +seconds. + +# Reporting bugs + +VFD SWMR is still under construction, so I think that you will find some +bugs. Please do not hesitate to report them. + +TBD: email addresses here -- cgit v0.12 From da4c72a138f0e418d3f72d6f095f5f40c92d9648 Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 10:23:01 -0500 Subject: Incorporate Mike's changes, fix some of my punctuation and markdown. 
--- doc/SWMR Example.png | Bin 0 -> 81265 bytes doc/SWMRdataflow.png | Bin 0 -> 66196 bytes doc/vfd-swmr-user-guide.md | 50 +++++++++++++++++++++++++++++---------------- 3 files changed, 32 insertions(+), 18 deletions(-) create mode 100644 doc/SWMR Example.png create mode 100644 doc/SWMRdataflow.png diff --git a/doc/SWMR Example.png b/doc/SWMR Example.png new file mode 100644 index 0000000..e35624c Binary files /dev/null and b/doc/SWMR Example.png differ diff --git a/doc/SWMRdataflow.png b/doc/SWMRdataflow.png new file mode 100644 index 0000000..f28d86b Binary files /dev/null and b/doc/SWMRdataflow.png differ diff --git a/doc/vfd-swmr-user-guide.md b/doc/vfd-swmr-user-guide.md index e80ba98..894d523 100644 --- a/doc/vfd-swmr-user-guide.md +++ b/doc/vfd-swmr-user-guide.md @@ -1,26 +1,36 @@ # Welcome to VFD SWMR -Thank you for volunteering to test VFD SWMR. VFD SMWR is a new feature -of the HDF5 library that lets a process write data to an HDF5 file while -one or more processes read the file. Use cases range from monitoring -data collection and/or steering experiments in progress to financial -applications. +Thank you for volunteering to test VFD SWMR. + +SWMR, which stands for Single Writer/Multiple Reader, is a feature +of the HDF5 library that lets a process write data to an HDF5 file +while one or more processes read the file. Use cases range from +monitoring data collection and/or steering experiments in progress +to financial applications. + +The following diagram illustrates how SWMR works. + + + VFD SWMR is designed to be a more flexible, more modular, -better-performing replacement for the existing SWMR feature. VFD -SWMR allows HDF5 objects (groups, datasets, attributes) to be -created and destroyed in the course of a reader-writer session. It -compartmentalizes much of the SWMR functionality in a virtual-file -driver (VFD), thus easing The HDF Group's software-maintenance burden. 
-And it makes guarantees for the maximum time from write to availability -of data for read, provided that the reading and writing systems and -their interconnections can keep up with the data flow. +better-performing replacement for the existing SWMR feature. + +* VFD SWMR allows HDF5 objects (groups, datasets, attributes) to be + created and destroyed in the course of a reader-writer session. +* It compartmentalizes much of the SWMR functionality in a virtual-file + driver (VFD), thus easing The HDF Group's software-maintenance burden. +* And it makes guarantees for the maximum time from write to availability + of data for read, provided that the reading and writing systems and + their interconnections can keep up with the data flow. + +For details on how VFD SWMR is implemented, see [LINK to RFC]. # Quick start -Follow these instructions to download, configure, and build the VFD SWMR -project in a jiffy. Then install the HDF5 library and utilites built -by the VFD SWMR project. +Follow these instructions to download, configure, and build the +VFD SWMR project in a jiffy. Then install the HDF5 library and +utilities built by the VFD SWMR project. ## Download @@ -101,15 +111,19 @@ To run a bigset test, I open a couple of terminal windows, one for the reader and one for the writer. I change to the `test` directory under my build directory, and I run the writer in one window: +``` % ./vfd_swmr_bigset_writer -n 50 -d 2 +``` and in the other window, I run the reader: +``` % ./vfd_swmr_bigset_reader -n 50 -d 2 -W +``` The writer will wait for a signal before it quits. You may tap Control-C to make it quit. If you don't want it to wait, then you can -pass option flag -W. (The reader accepts the same flag.) Use the `-q` +pass option flag `-W`. (The reader accepts the same flag.) Use the `-q` option to suppress the progress messages that the programs write to the standard error stream. @@ -119,7 +133,7 @@ dataset with content in three source datasets in the same HDF5 file.
The `-M` option works like `-V`, only the writer creates the virtual -dataset on three source datasets each in a different HDF5 file. +dataset on three source datasets, each in a different HDF5 file. ## The VFD SWMR demos -- cgit v0.12 From e1419c872e27a4178e3e6eabf78f48544d7f03ca Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 10:46:36 -0500 Subject: Do not use first person singular in the SWMR demos section. In the `vfd_swmr_create_fapl()` dissection, change the /** **/ comments in the literal code to plain markdown paragraphs. Slightly change wording and markdown elsewhere. --- doc/vfd-swmr-user-guide.md | 203 +++++++++++++++++++++++---------------------- 1 file changed, 102 insertions(+), 101 deletions(-) diff --git a/doc/vfd-swmr-user-guide.md b/doc/vfd-swmr-user-guide.md index 894d523..69b7f96 100644 --- a/doc/vfd-swmr-user-guide.md +++ b/doc/vfd-swmr-user-guide.md @@ -137,28 +137,26 @@ dataset on three source datasets, each in a different HDF5 file. ## The VFD SWMR demos -TBD: edit this section - -My repository containing the VFD SWMR -demos is https://bitbucket.hdfgroup.org/scm/~dyoung/swmr-demo.git . -They are the same two demos that were shown at the webinar. +The VFD SWMR demos are in a [separate +repository](https://bitbucket.hdfgroup.org/scm/~dyoung/swmr-demo.git). Before you build the demos, you will need to install the HDF5 library -and utilities built from our VFD SWMR branch in your home directory -somewhere. In the ./configure step, I set the install directory using -a --prefix=$HOME/path/for/library command-line option, but there may be -another way. Update the H5CC lines in the Makefiles with the path to -h5cc. Then you should be able to `make`, `make clean`, etc. - -Under gaussians/, two programs are built, `wgaussians` and `rgaussians`. -If you start both from the same directory in different terminals, you -should see the "bouncing 2-D Gaussian distributions" in the `rgaussians` -terminal. 
- -The creation-deletion (`credel`) demo is also run in two terminals. The -two command lines are given in `credel/README.md`. You need to use the -`h5ls` installed from the VFD SWMR branch, since only that version has the -`--poll` option. +and utilities built from the VFD SWMR branch in your home directory +somewhere. In the ./configure step, use the command-line option +`--prefix=$HOME/path/for/library` to set the directory you prefer. +In the demo Makefiles, update the `H5CC` variable with the path to +the `h5cc` installed from the VFD SWMR branch. Then you should be +able to `make` and `make clean` the demos. + +Under `gaussians/`, two programs are built, `wgaussians` and +`rgaussians`. If you start both from the same directory in different +terminals, you should see the "bouncing 2-D Gaussian distributions" +in the `rgaussians` terminal. + +The creation-deletion (`credel`) demo is also run in two terminals. +The two command lines are given in `credel/README.md`. You need +to use the `h5ls` installed from the VFD SWMR branch, since only +that version has the `--poll` option. # Developer tips @@ -191,24 +189,26 @@ vfd_swmr_create_fapl(bool writer, bool only_meta_pages, bool use_vfd_swmr) H5F_vfd_swmr_config_t config; hid_t fapl; -/** - ** h5_fileaccess() is also a helper routine for the tests. - ** In your program, you can replace the h5_fileaccess() call with a call to - ** H5Pcreate(H5P_FILE_ACCESS). - **/ +``` + +`h5_fileaccess()` is also a helper routine for the tests. In your +program, you can replace the `h5_fileaccess()` call with a call to +`H5Pcreate(H5P_FILE_ACCESS)`. +``` /* Create file access property list */ if((fapl = h5_fileaccess()) < 0) { warnx("h5_fileaccess"); return badhid; } +``` -/** - ** VFD SWMR has only been tested with the latest file format. It may - ** malfunction with older formats, we just don't know. We force the - ** latest version here. - **/ +VFD SWMR has only been tested with the latest file format. 
It may +malfunction with older formats; we just don't know. We force the +latest version here. + +``` /* FOR NOW: set to use latest format, the "old" parameter is not used */ if(H5Pset_libver_bounds(fapl, H5F_LIBVER_LATEST, H5F_LIBVER_LATEST) < 0) { warnx("H5Pset_libver_bounds"); @@ -218,84 +218,86 @@ vfd_swmr_create_fapl(bool writer, bool only_meta_pages, bool use_vfd_swmr) /* * Set up to open the file with VFD SWMR configured. */ +``` + +VFD SWMR relies on metadata reads and writes going through the +page buffer. Note that the default page size is 4096 bytes. This +call sets the total page buffer size to 4096 bytes. So we have +effectively created a one-page page buffer! That is adequate for +testing, but it may not be best for your application. -/** - ** VFD SWMR relies on metadata reads and writes to go through the - ** page buffer. Note that the default page size is 4096 bytes. This - ** call sets the total page buffer size to 4096 bytes. So we have - ** effectively created a one-page page buffer! That is adequate for - ** testing, but it may not be best for your application. - ** - ** If `only_meta_pages` is true, then the entire page buffer is - ** dedicated to metadata. That's fine for VFD SWMR. - ** - ** NOTE WELL: when VFD SWMR is enabled, the meta-/raw-data pages proportion - ** set by H5Pset_page_buffer_size() does not actually control the - ** pages reserved for raw data. *All* pages are dedicated to buffering - ** metadata. - **/ +If `only_meta_pages` is true, then the entire page buffer is +dedicated to metadata. That's fine for VFD SWMR. +*Note well*: when VFD SWMR is enabled, the meta-/raw-data pages proportion +set by `H5Pset_page_buffer_size()` does not actually control the +pages reserved for raw data. *All* pages are dedicated to buffering +metadata. + + +``` /* Enable page buffering */ if(H5Pset_page_buffer_size(fapl, 4096, only_meta_pages ?
100 : 0, 0) < 0) { warnx("H5Pset_page_buffer_size"); return badhid; } +``` -/** - ** Add VFD SWMR-specific configuration to the file-access property list - ** (`fapl`) using an H5Pset_vfd_swmr_config() call. - ** - ** When VFD SWMR is enabled, changes to the HDF5 metadata accumulate in - ** RAM until a configurable unit of time known as a *tick* has passed. - ** At the end of each tick, a snapshot of the metadata at the end of - ** the tick is "published"---that is, made visible to the readers. - ** - ** The length of a *tick* is configurable in units of 100 milliseconds - ** using the `tick_len` parameter. Below, `tick_len` is set to `4` to - ** select a tick length of 400ms. - ** - ** A snapshot does not persist forever, but it expires after a number - ** of ticks, given by the *maximum lag*, has passed. Below, `max_lag` - ** is set to `7` to select a maximum lag of 7 ticks. After a snapshot - ** has expired, the writer may overwrite it. - ** - ** When a reader first enters the API, it starts to use, or "selects," - ** the metadata in the newest snapshot, and on every subsequent API - ** entry, if a tick has passed since the last selection, and if new - ** snapshots are available, then the reader selects the latest. - ** - ** If a reader spends longer than `max_lag - 1` ticks (2400ms with - ** the example configuration) inside the HDF5 API, then its snapshot - ** may expire, resulting in undefined behavior. When a snapshot - ** expires while the reader is using it, we say that the writer has - ** "overrun" the reader. The writer cannot currently detect overruns. - ** Frequently the reader will detect an overrun and force the program - ** to exit with a diagnostic assertion failure. - ** - ** The application tells VFD SWMR whether or not to configure for - ** reading or writing a file by setting the `writer` parameter to - ** `true` for writing or `false` for reading. 
- ** - ** VFD SWMR snapshots are stored in a "shadow file" that is shared - ** between writer and readers. On a POSIX system, the shadow file - ** may be placed on any *local* filesystem that the reader and writer - ** share. The `md_file_path` parameter tells where to put the shadow - ** file. - ** - ** The `md_pages_reserved` parameter tells how many pages to reserve - ** at the beginning of the shadow file for the shadow-file header - ** and the shadow index. The header has an entire page to itself. - ** The remaining `md_pages_reserved - 1` pages are reserved for the - ** shadow index. If the index grows larger than its initial - ** allocation, then it will move to a new location in the shadow file, - ** and the initial allocation will be reclaimed. `md_pages_reserved` - ** must be at least 2. - ** - ** The `version` parameter tells what version of VFD SWMR configuration - ** the parameter struct `config` contains. For now, it should be - ** initialized to `H5F__CURR_VFD_SWMR_CONFIG_VERSION`. - **/ +Add VFD SWMR-specific configuration to the file-access property list +(`fapl`) using an `H5Pset_vfd_swmr_config()` call. + +When VFD SWMR is enabled, changes to the HDF5 metadata accumulate in +RAM until a configurable unit of time known as a *tick* has passed. +At the end of each tick, a snapshot of the metadata at the end of +the tick is "published"---that is, made visible to the readers. + +The length of a *tick* is configurable in units of 100 milliseconds +using the `tick_len` parameter. Below, `tick_len` is set to `4` to +select a tick length of 400ms. + +A snapshot does not persist forever, but it expires after a number +of ticks, given by the *maximum lag*, has passed. Below, `max_lag` +is set to `7` to select a maximum lag of 7 ticks. After a snapshot +has expired, the writer may overwrite it. 
+ +When a reader first enters the API, it starts to use, or "selects," +the metadata in the newest snapshot, and on every subsequent API +entry, if a tick has passed since the last selection, and if new +snapshots are available, then the reader selects the latest. + +If a reader spends longer than `max_lag - 1` ticks (2400ms with +the example configuration) inside the HDF5 API, then its snapshot +may expire, resulting in undefined behavior. When a snapshot +expires while the reader is using it, we say that the writer has +"overrun" the reader. The writer cannot currently detect overruns. +Frequently the reader will detect an overrun and force the program +to exit with a diagnostic assertion failure. + +The application tells VFD SWMR whether or not to configure for +reading or writing a file by setting the `writer` parameter to +`true` for writing or `false` for reading. + +VFD SWMR snapshots are stored in a "shadow file" that is shared +between writer and readers. On a POSIX system, the shadow file +may be placed on any *local* filesystem that the reader and writer +share. The `md_file_path` parameter tells where to put the shadow +file. + +The `md_pages_reserved` parameter tells how many pages to reserve +at the beginning of the shadow file for the shadow-file header +and the shadow index. The header has an entire page to itself. +The remaining `md_pages_reserved - 1` pages are reserved for the +shadow index. If the index grows larger than its initial +allocation, then it will move to a new location in the shadow file, +and the initial allocation will be reclaimed. `md_pages_reserved` +must be at least 2. + +The `version` parameter tells what version of VFD SWMR configuration +the parameter struct `config` contains. For now, it should be +initialized to `H5F__CURR_VFD_SWMR_CONFIG_VERSION`. 
+ +``` memset(&config, 0, sizeof(config)); config.version = H5F__CURR_VFD_SWMR_CONFIG_VERSION; @@ -319,8 +321,7 @@ vfd_swmr_create_fapl(bool writer, bool only_meta_pages, bool use_vfd_swmr) An application may want to use VFD SWMR to create, read, or write a virtual dataset. Unfortunately, VDS does not work properly with VFD SWMR at this time. In this section, we describe some workarounds -that can be used with great care to make VDS and VFD SWMR operate -simultaneously. +that can be used with great care to make VDS and VFD SWMR cooperate. A virtual dataset, when it is read or written, will open files on an application's behalf in order to access the source datasets -- cgit v0.12 From 9ee6be22b99ee9a0e88385fb5929f666331306dd Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 15:45:08 -0500 Subject: Clarify my descriptions of H5FD_dedup() and H5FD_deduplicate(). --- src/H5FD.c | 31 ++++++++++++++++++++++++------- 1 file changed, 24 insertions(+), 7 deletions(-) diff --git a/src/H5FD.c b/src/H5FD.c index c8b8da2..602b198 100644 --- a/src/H5FD.c +++ b/src/H5FD.c @@ -679,9 +679,18 @@ done: FUNC_LEAVE_API(ret_value) } -/* Return `other` if `self` has no de-duplication method. Otherwise, return - * `other` if it duplicates `self`, `self` if `other` does NOT duplicate it, - * NULL if `other` conflicts with `self` or if there is an error. +/* Helper routine for H5FD_deduplicate(): compare `self` and `other` using + * the deduplication method of `self`, if it has one; otherwise compare using + * `H5FDcmp()`. + * + * If `self` has no de-duplication method, compare `self` and `other` + * using `H5FDcmp()` and return `self` if they're equal and `other` if + * unequal. + * + * If `self` does have a de-duplication method, call it and return the + * method's result: `other` if it duplicates `self`, `self` if `other` + * does NOT duplicate it, NULL if `other` conflicts with `self` or if + * there is an error. 
* * Unlike H5FD_deduplicate(), this routine does not free `self` under any * circumstances. @@ -700,11 +709,19 @@ H5FD_dedup(H5FD_t *self, H5FD_t *other, hid_t fapl) return other; } -/* If any other open H5FD_t is functionally equivalent to `file` under - * the given file-access properties, then return it and close `file`. +/* Search the already-opened VFD instances for an instance similar to the + * instance `file` newly-opened using file-access properties given by `fapl`. + * + * If there is an already-open instance that is functionally + * identical to `file`, close `file` and return the already-open instance. + * + * If there is an already-open instance that conflicts with `file` because, + * for example, its file-access properties are incompatible with `fapl`'s + * or, for another example, it is under exclusive control by a third VFD + * instance, then close `file` and return `NULL`. * - * If any other open H5FD_t is not equivalent to `file`, but its - * operation would conflict with `file`, then return NULL and close `file`. + * Otherwise, return `file` to indicate that there are no identical or + * conflicting VFD instances already open. */ H5FD_t * H5FD_deduplicate(H5FD_t *file, hid_t fapl) -- cgit v0.12 From ea6dd16fbb8ab5a5c09d4b03c9a60512da2d0ba5 Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 15:45:34 -0500 Subject: Describe the behavior of H5FD_vfd_swmr_dedup() in excruciating detail. 
--- src/H5FDvfd_swmr.c | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/src/H5FDvfd_swmr.c b/src/H5FDvfd_swmr.c index f0e0cfd..5322cf3 100644 --- a/src/H5FDvfd_swmr.c +++ b/src/H5FDvfd_swmr.c @@ -508,6 +508,38 @@ H5FD_vfd_swmr_cmp(const H5FD_t *_f1, const H5FD_t *_f2) FUNC_LEAVE_NOAPI(ret_value) } /* end H5FD_vfd_swmr_cmp() */ +/* Compare the already-opened VFD instance `_self` with the + * VFD instance `_other` newly-opened with file-access properties `fapl` + * and indicate whether the instances duplicate each other, if they conflict + * with each other, or if they are dissimilar. + * + * If `_self` duplicates `_other`, return `_self`. + * + * Return NULL on error, or if `_other` and `_self` refer to the same file + * but the file-access properties, `fapl`, conflict with the properties of + * `_self`. + * + * If `_other` neither duplicates nor conflicts with `_self`, then return + * `_other`. + * + * # Judging duplicate/conflicting/dissimilar VFD instances + * + * `_self` duplicates `_other` if `_other` is also an instance of SWMR + * class, the instances' lower files are equal under `H5FD_cmp()`, and + * the file-access properties of `_self` match `fapl`. + * The wildcard `fapl` value, `H5P_FILE_ACCESS_ANY_VFD`, matches all. + * + * `_self` also duplicates `_other` if `_other` is not a SWMR instance, but + * it equals the lower file of `_self` under `H5FD_cmp()`, and `fapl` is + * `H5P_FILE_ACCESS_ANY_VFD`. + * + * `_self` and `_other` conflict if both are SWMR instances referring to + * the same lower file, and their file-access properties differ. + * + * `_self` and `_other` conflict if `_other` is not a SWMR instance, it + * equals the lower file of `_self`, and `fapl` is not equal to + * `H5P_FILE_ACCESS_ANY_VFD`. 
+ */ static H5FD_t * H5FD_vfd_swmr_dedup(H5FD_t *_self, H5FD_t *_other, hid_t fapl) { -- cgit v0.12 From c97970f7a1d4ff0f8cca9a700989a477370dd231 Mon Sep 17 00:00:00 2001 From: David Young Date: Tue, 25 Aug 2020 15:45:56 -0500 Subject: Use the terminology "expected" and "unexpected" errors instead of "soft" and "hard" errors. --- test/testvfdswmr.sh.in | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/test/testvfdswmr.sh.in b/test/testvfdswmr.sh.in index a355245..b492667 100644 --- a/test/testvfdswmr.sh.in +++ b/test/testvfdswmr.sh.in @@ -731,7 +731,7 @@ if test $nerrors -eq 0 ; then echo "VFD SWMR tests passed." if test $nsofterrors -ne 0 ; then echo - echo "${nsofterrors} soft errors occurred. That's safe to ignore." + echo "${nsofterrors} expected errors occurred. Expected errors are ok." fi if test -z "$HDF5_NOCLEANUP"; then # delete the test directory @@ -739,8 +739,9 @@ if test $nerrors -eq 0 ; then fi exit 0 else - echo -n "VFD SWMR tests failed with $nerrors hard errors " - echo "and $nsofterrors soft errors." + echo -n "VFD SWMR tests failed with $nerrors unexpected errors " + echo "and $nsofterrors expected errors. Expected errors are ok." + echo "Please report unexpected errors, they may indicate a bug." exit 1 fi -- cgit v0.12
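
The timing rules stated in the user's-guide patch above (`tick_len` counted in units of 100 ms, a snapshot expiring after `max_lag` ticks, and a reader budget of `max_lag - 1` ticks inside the HDF5 API) can be checked with a few lines of arithmetic. The following is a minimal sketch only: the helper names `tick_ms`, `snapshot_lifetime_ms`, and `reader_budget_ms` are hypothetical and are not part of the HDF5 API, and the `max_lag > 0` precondition is an assumption of this sketch, not a documented library constraint.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these exist in HDF5. */

/* The guide states that tick_len is expressed in units of 100 ms. */
enum { TICK_UNIT_MS = 100 };

/* Length of one tick, in milliseconds. */
static uint32_t
tick_ms(uint32_t tick_len)
{
    return tick_len * TICK_UNIT_MS;
}

/* Time after which a published snapshot expires: max_lag ticks. */
static uint32_t
snapshot_lifetime_ms(uint32_t tick_len, uint32_t max_lag)
{
    return max_lag * tick_ms(tick_len);
}

/* Longest time a reader can spend inside the HDF5 API before its
 * snapshot may expire: max_lag - 1 ticks.
 */
static uint32_t
reader_budget_ms(uint32_t tick_len, uint32_t max_lag)
{
    /* Assumption of this sketch: guard against unsigned underflow. */
    assert(max_lag > 0);
    return (max_lag - 1) * tick_ms(tick_len);
}
```

With the example configuration from the guide (`tick_len = 4`, `max_lag = 7`), `tick_ms(4)` gives 400, `snapshot_lifetime_ms(4, 7)` gives 2800, and `reader_budget_ms(4, 7)` gives 2400, which agrees with the 400ms tick and the 2400ms limit quoted in the text.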