diff options
author | Gerd Heber <gheber@hdfgroup.org> | 2020-11-21 05:18:14 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-11-21 05:18:14 (GMT) |
commit | d4a3097ec5d9e44d377c4b91a05b3e0c5f9f1e2c (patch) | |
tree | 493b82ecf3d4c1f5a908f7d20c3ea0f47ab707da /doxygen/dox | |
parent | 133c0ad36f5a5864b840a26b6d85b13ad59b6819 (diff) | |
download | hdf5-d4a3097ec5d9e44d377c4b91a05b3e0c5f9f1e2c.zip hdf5-d4a3097ec5d9e44d377c4b91a05b3e0c5f9f1e2c.tar.gz hdf5-d4a3097ec5d9e44d377c4b91a05b3e0c5f9f1e2c.tar.bz2 |
Full set of current H5F documentation. (#105)
* First cut of the H5 public API documentation.
* Added H5Z "bonus track."
* Applied Quincey's patch.
* Added the missing patches from Quincey's original patch.
* H5PL (complete) and basic H5VL API documentation.
* Added H5I API docs.
* Added H5L API docs.
* First installment from Elena's H5T batch.
* Second installment of Elena's H5T batch.
* Final installment of Elena's H5T batch.
* Migrated documentation for SWMR functions.
* Catching up on MDC functions.
* Integrated the H5F MDC function documentation.
* Added MDC and parallel H5F functions.
* Slightly updated main page.
* Added doxygen/dox/H5AC_cache_config_t.dox to MANIFEST.
Diffstat (limited to 'doxygen/dox')
-rw-r--r-- | doxygen/dox/H5AC_cache_config_t.dox | 415 | ||||
-rw-r--r-- | doxygen/dox/mainpage.dox | 70 |
2 files changed, 454 insertions, 31 deletions
diff --git a/doxygen/dox/H5AC_cache_config_t.dox b/doxygen/dox/H5AC_cache_config_t.dox new file mode 100644 index 0000000..9b9862b --- /dev/null +++ b/doxygen/dox/H5AC_cache_config_t.dox @@ -0,0 +1,415 @@ +/** + * \page H5AC-cache-config-t Metadata Cache Configuration + * \tableofcontents + * + * \section gcf General configuration fields + * + * \par version + * Integer field containing the version number of this version + * of the H5AC_cache_config_t structure. Any instance of + * H5AC_cache_config_t passed to the cache must have a known + * version number, or an error will be flagged. + * + * \par rpt_fcn_enabled + * \parblock + * Boolean field used to enable and disable the default + * reporting function. This function is invoked every time the + * automatic cache resize code is run, and reports on its activities. + * + * This is a debugging function, and should normally be turned off. + * \endparblock + * + * \par open_trace_file + * \parblock + * Boolean field indicating whether the trace_file_name + * field should be used to open a trace file for the cache. + * + * \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead + * + * The trace file is a debuging feature that allow the capture of + * top level metadata cache requests for purposes of debugging and/or + * optimization. This field should normally be set to \c FALSE, as + * trace file collection imposes considerable overhead. + * + * This field should only be set to \c TRUE when the trace_file_name + * contains the full path of the desired trace file, and either + * there is no open trace file on the cache, or the \c close_trace_file + * field is also \c TRUE. + * \endparblock + * + * \par close_trace_file + * \parblock + * Boolean field indicating whether the current trace + * file (if any) should be closed. + * + * \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead + * + * See the above comments on the open_trace_file field. This field + * should be set to \c FALSE unless there is an open trace file on the + * cache that you wish to close. + * \endparblock + * + * \par trace_file_name + * \parblock + * Full path of the trace file to be opened if the + * open_trace_file field is \c TRUE. + * + * \Emph{*** DEPRECATED ***} Use \Code{H5Fstart/stop} logging functions instead + * + * In the parallel case, an ascii representation of the mpi rank of + * the process will be appended to the file name to yield a unique + * trace file name for each process. + * + * The length of the path must not exceed #H5AC__MAX_TRACE_FILE_NAME_LEN + * characters. + * \endparblock + * + * \par evictions_enabled + * \parblock + * Boolean field used to either report the current + * evictions enabled status of the cache, or to set the cache's + * evictions enabled status. + * + * In general, the metadata cache should always be allowed to + * evict entries. However, in some cases it is advantageous to + * disable evictions briefly, and thereby postpone metadata + * writes. However, this must be done with care, as the cache + * can grow quickly. If you do this, re-enable evictions as + * soon as possible and monitor cache size. + * + * At present, evictions can only be disabled if automatic + * cache resizing is also disabled (that is, \Code{(incr_mode == + * H5C_incr__off ) && ( decr_mode == H5C_decr__off )}). There + * is no logical reason why this should be so, but it simplifies + * implementation and testing, and I can't think of any reason + * why it would be desireable. If you can think of one, I'll + * revisit the issue. (JM) + * \endparblock + * + * \par set_initial_size + * Boolean flag indicating whether the size of the + * initial size of the cache is to be set to the value given in + * the initial_size field. If set_initial_size is \c FALSE, the + * initial_size field is ignored. + * + * \par initial_size + * If enabled, this field contain the size the cache is + * to be set to upon receipt of this structure. Needless to say, + * initial_size must lie in the closed interval \Code{[min_size, max_size]}. + * + * \par min_clean_fraction + * \c double in the range 0 to 1 indicating the fraction + * of the cache that is to be kept clean. This field is only used + * in parallel mode. Typical values are 0.1 to 0.5. + * + * \par max_size + * Maximum size to which the cache can be adjusted. The + * supplied value must fall in the closed interval + * \Code{[MIN_MAX_CACHE_SIZE, MAX_MAX_CACHE_SIZE]}. Also, \c max_size must + * be greater than or equal to \c min_size. + * + * \par min_size + * Minimum size to which the cache can be adjusted. The + * supplied value must fall in the closed interval + * \Code{[H5C__MIN_MAX_CACHE_SIZE, H5C__MAX_MAX_CACHE_SIZE]}. Also, \c min_size + * must be less than or equal to \c max_size. + * + * \par epoch_length + * \parblock + * Number of accesses on the cache over which to collect + * hit rate stats before running the automatic cache resize code, + * if it is enabled. + * + * At the end of an epoch, we discard prior hit rate data and start + * collecting afresh. The epoch_length must lie in the closed + * interval \Code{[H5C__MIN_AR_EPOCH_LENGTH, H5C__MAX_AR_EPOCH_LENGTH]}. + * \endparblock + * + * + * \section csicf Cache size increase control fields + * + * \par incr_mode + * Instance of the \c H5C_cache_incr_mode enumerated type whose + * value indicates how we determine whether the cache size should be + * increased. At present there are two possible values: + * \li \c H5C_incr__off: Don't attempt to increase the size of the cache + * automatically.\n + * When this increment mode is selected, the remaining fields + * in the cache size increase section ar ignored. + * \li \c H5C_incr__threshold: Attempt to increase the size of the cache + * whenever the average hit rate over the last epoch drops + * below the value supplied in the \c lower_hr_threshold + * field.\n + * Note that this attempt will fail if the cache is already + * at its maximum size, or if the cache is not already using + * all available space. + * + * Note that you must set \c decr_mode to \c H5C_incr__off if you + * disable metadata cache entry evictions. + * + * \par lower_hr_threshold + * \parblock + * Lower hit rate threshold. If the increment mode + * (\c incr_mode) is \c H5C_incr__threshold and the hit rate drops below the + * value supplied in this field in an epoch, increment the cache size by + * \c size_increment. Note that cache size may not be incremented above + * \c max_size, and that the increment may be further restricted by the + * \c max_increment field if it is enabled. + * + * When enabled, this field must contain a value in the range [0.0, 1.0]. + * Depending on the \c incr_mode selected, it may also have to be less than + * \c upper_hr_threshold. + * \endparblock + * + * \par increment + * \parblock + * Double containing the multiplier used to derive the new + * cache size from the old if a cache size increment is triggered. + * The increment must be greater than 1.0, and should not exceed 2.0. + * + * The new cache size is obtained my multiplying the current max cache + * size by the increment, and then clamping to \c max_size and to stay + * within the \c max_increment as necessary. + * \endparblock + * + * \par apply_max_increment + * Boolean flag indicating whether the \c max_increment + * field should be used to limit the maximum cache size increment. + * + * \par max_increment + * If enabled by the \c apply_max_increment field described + * above, this field contains the maximum number of bytes by which the + * cache size can be increased in a single re-size. + * + * \par flash_incr_mode + * \parblock + * Instance of the \c H5C_cache_flash_incr_mode enumerated + * type whose value indicates whether and by which algorithm we should + * make flash increases in the size of the cache to accommodate insertion + * of large entries and large increases in the size of a single entry. + * + * The addition of the flash increment mode was occasioned by performance + * problems that appear when a local heap is increased to a size in excess + * of the current cache size. While the existing re-size code dealt with + * this eventually, performance was very bad for the remainder of the + * epoch. + * + * At present, there are two possible values for the \c flash_incr_mode: + * + * \li \c H5C_flash_incr__off: Don't perform flash increases in the size of the cache. + * + * \li \c H5C_flash_incr__add_space: Let \c x be either the size of a newly + * newly inserted entry, or the number of bytes by which the + * size of an existing entry has been increased.\n + * If \Code{x > flash_threshold * current max cache size}, + * increase the current maximum cache size by \Code{x * flash_multiple} + * less any free space in the cache, and star a new epoch. For + * now at least, pay no attention to the maximum increment. + * + * In both of the above cases, the flash increment pays no attention to + * the maximum increment (at least in this first incarnation), but DOES + * stay within max_size. + * + * With a little thought, it should be obvious that the above flash + * cache size increase algorithm is not sufficient for all circumstances + * -- for example, suppose the user round robins through + * \Code{(1/flash_threshold) +1} groups, adding one data set to each on each + * pass. Then all will increase in size at about the same time, requiring + * the max cache size to at least double to maintain acceptable + * performance, however the above flash increment algorithm will not be + * triggered. + * + * Hopefully, the add space algorithms detailed above will be sufficient + * for the performance problems encountered to date. However, we should + * expect to revisit the issue. + * \endparblock + * + * \par flash_multiple + * Double containing the multiple described above in the + * \c H5C_flash_incr__add_space section of the discussion of the + * \c flash_incr_mode section. This field is ignored unless \c flash_incr_mode + * is \c H5C_flash_incr__add_space. + * + * \par flash_threshold + * Double containing the factor by which current max cache + * size is multiplied to obtain the size threshold for the add_space flash + * increment algorithm. The field is ignored unless \c flash_incr_mode is + * \c H5C_flash_incr__add_space. + * + * + * \section csdcf Cache size decrease control fields + * + * \par decr_mode + * \parblock + * Instance of the \c H5C_cache_decr_mode enumerated type whose + * value indicates how we determine whether the cache size should be + * decreased. At present there are four possibilities. + * + * \li \c H5C_decr__off: Don't attempt to decrease the size of the cache + * automatically.\n + * When this increment mode is selected, the remaining fields + * in the cache size decrease section are ignored. + * \li \c H5C_decr__threshold: Attempt to decrease the size of the cache + * whenever the average hit rate over the last epoch rises + * above the value supplied in the \c upper_hr_threshold + * field. + * \li \c H5C_decr__age_out: At the end of each epoch, search the cache for + * entries that have not been accessed for at least the number + * of epochs specified in the epochs_before_eviction field, and + * evict these entries. Conceptually, the maximum cache size + * is then decreased to match the new actual cache size. However, + * this reduction may be modified by the \c min_size, the + * \c max_decrement, and/or the \c empty_reserve. + * \li \c H5C_decr__age_out_with_threshold: Same as age_out, but we only + * attempt to reduce the cache size when the hit rate observed + * over the last epoch exceeds the value provided in the + * \c upper_hr_threshold field. + * + * Note that you must set \c decr_mode to \c H5C_decr__off if you + * disable metadata cache entry evictions. + * \endparblock + * + * \par upper_hr_threshold + * \parblock + * Upper hit rate threshold. The use of this field + * varies according to the current \c decr_mode : + * + * \c H5C_decr__off or \c H5C_decr__age_out: The value of this field is + * ignored. + * + * \li \c H5C_decr__threshold: If the hit rate exceeds this threshold in any + * epoch, attempt to decrement the cache size by size_decrement.\n + * Note that cache size may not be decremented below \c min_size.\n + * Note also that if the \c upper_threshold is 1.0, the cache size\n + * will never be reduced. + * + * \li \c H5C_decr__age_out_with_threshold: If the hit rate exceeds this + * threshold in any epoch, attempt to reduce the cache size + * by evicting entries that have not been accessed for more + * than the specified number of epochs. + * \endparblock + * + * \par decrement + * \parblock + * This field is only used when the decr_mode is + * \c H5C_decr__threshold. + * + * The field is a double containing the multiplier used to derive the + * new cache size from the old if a cache size decrement is triggered. + * The decrement must be in the range 0.0 (in which case the cache will + * try to contract to its minimum size) to 1.0 (in which case the + * cache will never shrink). + * \endparblock + * + * \par apply_max_decrement + * Boolean flag used to determine whether decrements + * in cache size are to be limited by the \c max_decrement field. + * + * \par max_decrement + * Maximum number of bytes by which the cache size can be + * decreased in a single re-size. Note that decrements may also be + * restricted by the \c min_size of the cache, and (in age out modes) by + * the \c empty_reserve field. + * + * \par epochs_before_eviction + * \parblock + * Integer field used in \c H5C_decr__age_out and + * \c H5C_decr__age_out_with_threshold decrement modes. + * + * This field contains the number of epochs an entry must remain + * unaccessed before it is evicted in an attempt to reduce the + * cache size. If applicable, this field must lie in the range + * \Code{[1, H5C__MAX_EPOCH_MARKERS]}. + * \endparblock + * + * \par apply_empty_reserve + * Boolean field controlling whether the empty_reserve + * field is to be used in computing the new cache size when the + * decr_mode is H5C_decr__age_out or H5C_decr__age_out_with_threshold. + * + * \par empty_reserve + * \parblock + * To avoid a constant racheting down of cache size by small + * amounts in the \c H5C_decr__age_out and \c H5C_decr__age_out_with_threshold + * modes, this field allows one to require that any cache size + * reductions leave the specified fraction of unused space in the cache. + * + * The value of this field must be in the range [0.0, 1.0]. I would + * expect typical values to be in the range of 0.01 to 0.1. + * \endparblock + * + * + * \section pcf Parallel Configuration Fields + * + * In PHDF5, all operations that modify metadata must be executed collectively. + * + * We used to think that this was enough to ensure consistency across the + * metadata caches, but since we allow processes to read metadata individually, + * the order of dirty entries in the LRU list can vary across processes, + * which can result in inconsistencies between the caches. + * + * PHDF5 uses several strategies to prevent such inconsistencies in metadata, + * all of which use the fact that the same stream of dirty metadata is seen + * by all processes for purposes of synchronization. This is done by + * having each process count the number of bytes of dirty metadata generated, + * and then running a "sync point" whenever this count exceeds a user + * specified threshold (see \c dirty_bytes_threshold below). + * + * The current metadata write strategy is indicated by the + * \c metadata_write_strategy field. The possible values of this field, along + * with the associated metadata write strategies are discussed below. + * + * \par dirty_bytes_threshold + * \parblock + * Threshold of dirty byte creation used to + * synchronize updates between caches. (See above for outline and + * motivation.) + * + * This value MUST be consistent across all processes accessing the + * file. This field is ignored unless HDF5 has been compiled for + * parallel. + * \endparblock + * + * \par metadata_write_strategy + * Integer field containing a code indicating the + * desired metadata write strategy. The valid values of this field + * are enumerated and discussed below: + * + * \li #H5AC_METADATA_WRITE_STRATEGY__PROCESS_0_ONLY\n + * When metadata_write_strategy is set to this value, only process + * zero is allowed to write dirty metadata to disk. All other + * processes must retain dirty metadata until they are informed at + * a sync point that the dirty metadata in question has been written + * to disk.\n + * When the sync point is reached (or when there is a user generated + * flush), process zero flushes sufficient entries to bring it into + * complience with its min clean size (or flushes all dirty entries in + * the case of a user generated flush), broad casts the list of + * entries just cleaned to all the other processes, and then exits + * the sync point.\n + * Upon receipt of the broadcast, the other processes mark the indicated + * entries as clean, and leave the sync point as well. + * + * \li #H5AC_METADATA_WRITE_STRATEGY__DISTRIBUTED\n + * In the distributed metadata write strategy, process zero still makes + * the decisions as to what entries should be flushed, but the actual + * flushes are distributed across the processes in the computation to + * the extent possible.\n + * In this strategy, when a sync point is triggered (either by dirty + * metadata creation or manual flush), all processes enter a barrier.\n + * On the other side of the barrier, process 0 constructs an ordered + * list of the entries to be flushed, and then broadcasts this list + * to the caches in all the processes.\n + * All processes then scan the list of entries to be flushed, flushing + * some, and marking the rest as clean. The algorithm for this purpose + * ensures that each entry in the list is flushed exactly once, and + * all are marked clean in each cache.\n + * Note that in the case of a flush of the cache, no message passing + * is necessary, as all processes have the same list of dirty entries, + * and all of these entries must be flushed. Thus in this case it is + * sufficient for each process to sort its list of dirty entries after + * leaving the initial barrier, and use this list as if it had been + * received from process zero.\n + * To avoid possible messages from the past/future, all caches must + * wait until all caches are done before leaving the sync point. + */
\ No newline at end of file diff --git a/doxygen/dox/mainpage.dox b/doxygen/dox/mainpage.dox index 83fc323..eda967b 100644 --- a/doxygen/dox/mainpage.dox +++ b/doxygen/dox/mainpage.dox @@ -1,36 +1,44 @@ -/*! \mainpage API Documentation for HDF5 Version 1.13 (Draft) +/*! \mainpage HDF5 C-API Reference + * + * The HDF5 C-API provides applications with fine-grained control over all + * aspects HDF5 functionality. This functionality is grouped into the following + * \Emph{modules}: + * \li \ref H5A "Attributes" — Management of HDF5 attributes (\ref H5A) + * \li \ref H5D "Datasets" — Management of HDF5 datasets (\ref H5D) + * \li \ref H5S "Dataspaces" — Management of HDF5 dataspaces which describe the shape of datasets and attributes (\ref H5S) + * \li \ref H5T "Datatypes" — Management of datatypes which describe elements of datasets and attributes (\ref H5T) + * \li \ref H5E "Error Handling" — Functions for handling errors that occur within HDF5 (\ref H5E) + * \li \ref H5F "Files" — Management of HDF5 files (\ref H5F) + * \li \ref H5Z "Filters" — Configuration of filters that process data during I/O operation (\ref H5Z) + * \li \ref H5G "Groups" — Management of groups in HDF5 files (\ref H5G) + * \li \ref H5I "Identifiers" — Management of object identifiers and object names (\ref H5I) + * \li \ref H5 "Library" — General purpose library functions (\ref H5) + * \li \ref H5L "Links" — Management of links in HDF5 groups (\ref H5L) + * \li \ref H5O "Objects" — Management of objects in HDF5 files (\ref H5O) + * \li \ref H5PL "Plugins" — Programmatic control over dynamically loaded plugins (\ref H5PL) + * \li \ref H5P "Property Lists" — Management of property lists to control HDF5 library behavior (\ref H5P) + * \li \ref H5R "References" — Management of references to specific objects and data regions in an HDF5 file (\ref H5R) + * \li \ref H5VL "Virtual Object Layer" — Management of the Virtual Object Layer (\ref H5VL) + * + * Here are a few simple rules to follow: + * + * \li \Bold{Handle discipline:} If you acquire a handle (by creation or coopy), \Emph{you own it!} (..., i.e., you have to close it.) + * \li \Bold{Dynamic memory allocation:} ... + * \li \Bold{Use of locations:} Identifier + name combo + * + * \attention \Bold{C++ Developers using HDF5 C-API functions beware:}\n + * If a C routine that takes a function pointer as an argument is called from + * within C++ code, the C routine should be returned from normally. + * Examples of this kind of routine include callbacks such as H5Pset_elink_cb() + * and H5Pset_type_conv_cb() and functions such as H5Tconvert() and H5Ewalk2().\n + * Exiting the routine in its normal fashion allows the HDF5 C library to clean + * up its work properly. In other words, if the C++ application jumps out of + * the routine back to the C++ \c catch statement, the library is not given the + * opportunity to close any temporary data structures that were set up when the + * routine was called. The C++ application should save some state as the + * routine is started so that any problem that occurs might be diagnosed. * * \todo Fix the search form for server deployments. * \todo Make it mobile-friendly * - * \section intro_sec Introduction - * - * \todo Write an introduction. - * - * \section quick_links Quick Links - * - * <ul> - * <li>\ref PDT "Predefined Datatypes"</li> - * <li>\ref api-compat-macros "API Compatibility Macros"</li> - * <li><a href="https://hdf5.wiki/">HDF5 Wiki</a></li> - * </ul> - * - * \section using_locations The Use of Locations (Identifier + Name) in the HDF5 API - * - * \todo Make this crystal clear! - * - * \section cpp_note Programming Note for C++ Developers Using C Functions - * - * If a C routine that takes a function pointer as an argument is called from - * within C++ code, the C routine should be returned from normally. - * - * Examples of this kind of routine include callbacks such as H5Pset_elink_cb() - * and H5Pset_type_conv_cb() and functions such as H5Tconvert() and H5Ewalk2(). - * - * Exiting the routine in its normal fashion allows the HDF5 C library to clean - * up its work properly. In other words, if the C++ application jumps out of - * the routine back to the C++ \c catch statement, the library is not given the - * opportunity to close any temporary data structures that were set up when the - * routine was called. The C++ application should save some state as the - * routine is started so that any problem that occurs might be diagnosed. */
\ No newline at end of file |