From b9e4deec10cc943a7ecb3ac6bc6bd6695b2f33a0 Mon Sep 17 00:00:00 2001 From: Gerd Heber Date: Tue, 23 Nov 2021 08:05:40 -0600 Subject: Next batch of Doxygen updates. (#1180) * Sketch of the H5S life cycle. * Committing clang-format changes * Fix H5S_UNLIMITED snafu. * Updated RM template and RM page. * Added H5S life cycle. * Committing clang-format changes * Added H5T life cycle. * Committing clang-format changes * Cleaner layout (?) * Cleaned the H5F life cycle. Called out unfinished biz. * Committing clang-format changes * Remaining life cycle skeletons. * Committing clang-format changes * Committing clang-format changes * Added H5Z life cycle. * Committing clang-format changes * Added H5G life cycle. * Committing clang-format changes * H5 and H5I life cycle updates. * Committing clang-format changes * Added H5PL life cycle. * Committing clang-format changes * Added H5L life cycle. * Committing clang-format changes * Fix for Chris' comment. * Add a variable for Doxygen pre-processor definitions. * Forgot to add the H5M API. * Clarify the H5Z life cycle. * Committing clang-format changes * Add H5Zdevelop.h to Doxygen.in. Added H5I life cycle. * Committing clang-format changes * Clarified introduction and fixed missing label declaration. * Added H5O life cycle. * Committing clang-format changes * H5O cleanup, part 1. * Committing clang-format changes * Cleaned up some of the endless repetition in H5O. * Committing clang-format changes * Cookbook & RFC draft layouts. * Updated manifest. * Updated the manifest, the example paths, and sketched the 1st recipe. * Committing clang-format changes * Outlined two more recipes. * Committing clang-format changes * More recipes and RFCs. * Committing clang-format changes * Draft of templatized RFC references. * Another batch of RFC changes. * Another batch of RFCs. * Fixed reference. * RFCs in reverse chronological order. * First cut of RFCs. * Fixed reference. * Updated recipes. * Updated recipes. * More RFCs. * Updated D*PL comments. * Added H5P descriptions. * Committing clang-format changes * H5R life-cycle snapshot. * Committing clang-format changes * H5R life-cycle. Added line numbers to life-cycle examples. * Committing clang-format changes * Fixed formatting for H5Dchunk_iter(). * Added comment on collective mode requirement w/ compression. * Simplified API compat. macro dox. * More API vers. updates. * Hide the async macro entrails. * Latest VFD SWMR RFC. * Create a tag file for permalinks. * Added TODOs for metadoc. * Removed duplication. * Revised RM landing page. * Trimmed more duplication. * Committing clang-format changes * Revised H5D. * Committing clang-format changes * Updated survey link. * Added Doxygen RM entry template link. * Added the "Multi-Thread HDF5" RFC. * Added DOXYGEN_TAG_FILE. * Added selection I/O RFC. * Added the VFD Sub-filing RFC. * Updated meta-documentation and added two old presentations. * Added a few more RFCs (4). * Fixed MANIFEST. * Updated meta-documentation. * Added Filters technical note. * Fixed MANIFEST. * Restore the path stripper. * Experimental full-text search via Google. * Better full-text search integration. * Whoops. Forgot this one. * Oh boy. * Make CMake happy. * Added "Debugging HDF5 Applications" technical note. * Another batch of RFCs. Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> --- MANIFEST | 10 +- configure.ac | 6 +- doxygen/CMakeLists.txt | 1 + doxygen/Doxyfile.in | 3 +- doxygen/aliases | 16 +- doxygen/dox/About.dox | 120 ++- doxygen/dox/FTS.dox | 8 + doxygen/dox/FileFormatSpec.dox | 23 - doxygen/dox/OtherSpecs.dox | 11 - doxygen/dox/RFC.dox | 11 + doxygen/dox/Specifications.dox | 38 +- doxygen/dox/TechnicalNotes.dox | 26 + doxygen/examples/DebuggingHDF5Applications.html | 392 +++++++ doxygen/examples/FileFormat.html | 1275 +++++++++++++++++++++++ doxygen/examples/Filters.html | 450 ++++++++ doxygen/examples/IOFlow.html | 137 +++ doxygen/hdf5doxy_layout.xml | 1 + doxygen/img/IOFlow.gif | Bin 0 -> 57285 bytes doxygen/img/IOFlow2.gif | Bin 0 -> 29805 bytes doxygen/img/IOFlow3.gif | Bin 0 -> 21442 bytes 20 files changed, 2479 insertions(+), 49 deletions(-) create mode 100644 doxygen/dox/FTS.dox delete mode 100644 doxygen/dox/FileFormatSpec.dox delete mode 100644 doxygen/dox/OtherSpecs.dox create mode 100644 doxygen/examples/DebuggingHDF5Applications.html create mode 100644 doxygen/examples/FileFormat.html create mode 100644 doxygen/examples/Filters.html create mode 100644 doxygen/examples/IOFlow.html create mode 100644 doxygen/img/IOFlow.gif create mode 100644 doxygen/img/IOFlow2.gif create mode 100644 doxygen/img/IOFlow3.gif diff --git a/MANIFEST b/MANIFEST index 0030b6e..1cbb856 100644 --- a/MANIFEST +++ b/MANIFEST @@ -216,11 +216,10 @@ ./doxygen/dox/Cookbook.dox ./doxygen/dox/DDLBNF110.dox ./doxygen/dox/DDLBNF112.dox -./doxygen/dox/FileFormatSpec.dox +./doxygen/dox/FTS.dox ./doxygen/dox/GettingStarted.dox ./doxygen/dox/H5AC_cache_config_t.dox ./doxygen/dox/MetadataCachingInHDF5.dox -./doxygen/dox/OtherSpecs.dox ./doxygen/dox/Overview.dox ./doxygen/dox/ReferenceManual.dox ./doxygen/dox/RFC.dox @@ -236,9 +235,12 @@ ./doxygen/dox/cookbook/Files.c ./doxygen/dox/cookbook/Files.dox ./doxygen/dox/cookbook/Performance.dox +./doxygen/examples/DebuggingHDF5Applications.html ./doxygen/examples/FF-IH_FileGroup.gif ./doxygen/examples/FF-IH_FileObject.gif +./doxygen/examples/FileFormat.html ./doxygen/examples/FileFormatSpecChunkDiagram.jpg +./doxygen/examples/Filters.html ./doxygen/examples/H5Pset_metadata_read_attempts.c ./doxygen/examples/H5Pset_object_flush_cb.c ./doxygen/examples/H5.format.1.0.html @@ -267,6 +269,7 @@ ./doxygen/examples/H5Z_examples.c ./doxygen/examples/H5_examples.c ./doxygen/examples/ImageSpec.html +./doxygen/examples/IOFlow.html ./doxygen/examples/PaletteExample1.gif ./doxygen/examples/Palettes.fm.anc.gif ./doxygen/examples/TableSpec.html @@ -282,6 +285,9 @@ ./doxygen/img/FF-IH_FileObject.gif ./doxygen/img/FileFormatSpecChunkDiagram.jpg ./doxygen/img/HDFG-logo.png +./doxygen/img/IOFlow.gif +./doxygen/img/IOFlow2.gif +./doxygen/img/IOFlow3.gif ./doxygen/img/PaletteExample1.gif ./doxygen/img/Palettes.fm.anc.gif ./doxygen/img/ftv2node.png diff --git a/configure.ac b/configure.ac index 164fd12..8559792 100644 --- a/configure.ac +++ b/configure.ac @@ -892,7 +892,7 @@ if test "X-$DIMENSION_SCALES_WITH_NEW_REF" = X- ; then DIMENSION_SCALES_WITH_NEW_REF=no fi -case "X-$DIMENSION_SCALES_WITH_NEW_REF" in +case "X-$DIMENSION_SCALES_WITH_NEW_REF" in X-yes) AC_MSG_RESULT([yes]) AC_DEFINE([DIMENSION_SCALES_WITH_NEW_REF], [1], @@ -1193,6 +1193,7 @@ if test "X$HDF5_DOXYGEN" = "Xyes"; then AC_SUBST([DOXYGEN_EXTERNAL_SEARCH]) AC_SUBST([DOXYGEN_SEARCHENGINE_URL]) AC_SUBST([DOXYGEN_STRIP_FROM_PATH]) + AC_SUBST([DOXYGEN_STRIP_FROM_INC_PATH]) AC_SUBST([DOXYGEN_PREDEFINED]) # SRCDIR Environment variables used inside doxygen macro for the source location: @@ -1200,7 +1201,7 @@ if test "X$HDF5_DOXYGEN" = "Xyes"; then DOXYGEN_VERSION_STRING=${PACKAGE_VERSION} DOXYGEN_INCLUDE_ALIASES='$(SRCDIR)/doxygen/aliases' DOXYGEN_PROJECT_LOGO='$(SRCDIR)/doxygen/img/HDFG-logo.png' - DOXYGEN_PROJECT_BRIEF='C-API Reference' + DOXYGEN_PROJECT_BRIEF='' DOXYGEN_INPUT_DIRECTORY='$(SRCDIR) $(SRCDIR)/doxygen/dox' DOXYGEN_OPTIMIZE_OUTPUT_FOR_C=YES DOXYGEN_MACRO_EXPANSION=YES @@ -1216,6 +1217,7 @@ if test "X$HDF5_DOXYGEN" = "Xyes"; then DOXYGEN_EXTERNAL_SEARCH=NO DOXYGEN_SEARCHENGINE_URL= DOXYGEN_STRIP_FROM_PATH='$(SRCDIR)' + DOXYGEN_STRIP_FROM_INC_PATH='$(SRCDIR)' DOXYGEN_PREDEFINED='H5_HAVE_DIRECT H5_HAVE_LIBHDFS H5_HAVE_MAP_API H5_HAVE_PARALLEL H5_HAVE_ROS3_VFD' DX_INIT_DOXYGEN([HDF5], [./doxygen/Doxyfile], [hdf5lib_docs]) diff --git a/doxygen/CMakeLists.txt b/doxygen/CMakeLists.txt index 36ce590..3462d50 100644 --- a/doxygen/CMakeLists.txt +++ b/doxygen/CMakeLists.txt @@ -27,6 +27,7 @@ if (DOXYGEN_FOUND) set (DOXYGEN_EXTERNAL_SEARCH NO) set (DOXYGEN_SEARCHENGINE_URL) set (DOXYGEN_STRIP_FROM_PATH ${HDF5_SOURCE_DIR}) + set (DOXYGEN_STRIP_FROM_INC_PATH ${HDF5_SOURCE_DIR}) set (DOXYGEN_PREDEFINED "H5_HAVE_DIRECT H5_HAVE_LIBHDFS H5_HAVE_MAP_API H5_HAVE_PARALLEL H5_HAVE_ROS3_VFD") # This configure and individual custom targets work together diff --git a/doxygen/Doxyfile.in b/doxygen/Doxyfile.in index 44d9974..8c871de 100644 --- a/doxygen/Doxyfile.in +++ b/doxygen/Doxyfile.in @@ -179,7 +179,7 @@ STRIP_FROM_PATH = @DOXYGEN_STRIP_FROM_PATH@ # specify the list of include paths that are normally passed to the compiler # using the -I flag. -STRIP_FROM_INC_PATH = +STRIP_FROM_INC_PATH = @DOXYGEN_STRIP_FROM_INC_PATH@ # If the SHORT_NAMES tag is set to YES, doxygen will generate much shorter (but # less readable) file names. This can be useful is your file systems doesn't @@ -856,6 +856,7 @@ INPUT_ENCODING = UTF-8 FILE_PATTERNS = H5*public.h \ H5*module.h \ H5FDcore.h \ + H5FDdevelop.h \ H5FDdirect.h \ H5FDfamily.h \ H5FDhdfs.h \ diff --git a/doxygen/aliases b/doxygen/aliases index 06c3445..68efeb7 100644 --- a/doxygen/aliases +++ b/doxygen/aliases @@ -24,6 +24,7 @@ ALIASES += htri_t="Returns zero (false), a positive (true) or a negative (failur ALIASES += api_vers_2{3}="\1() is a macro that is mapped to either \2() or \3().\n\see \ref api-compat-macros" ALIASES += api_vers_3{4}="\1() is a macro that is mapped to either \2() or \3() or \4().\n\see \ref api-compat-macros" +ALIASES += api_vers_4{5}="\1() is a macro that is mapped to either \2() or \3() or \4() or \5().\n\see \ref api-compat-macros" ALIASES += deprecation_note{1}="\deprecated Superseded by \1." @@ -252,10 +253,17 @@ ALIASES += ref_vol_doc="VOL documentation" ################################################################################ ALIASES += ref_rfc20210528="Multi-Thread HDF5" +ALIASES += ref_rfc20210219="Selection I/O" +ALIASES += ref_rfc20200213="VFD Sub-filing" +ALIASES += ref_rfc20200210="Onion VFD" ALIASES += ref_rfc20190923="Virtual Object Layer (VOL) API Compatibility" +ALIASES += ref_rfc20190715="Variable-Length Data in HDF5 Sketch Design" ALIASES += ref_rfc20190410="A Plugin Interface for HDF5 Virtual File Drivers" ALIASES += ref_rfc20181231="Dataset Object Header Size" ALIASES += ref_rfc20181220="MS 3.2 – Addressing Scalability: Scalability of open, close, flush CASE STUDY: CGNS Hotspot analysis of CGNS cgp_open" +ALIASES += ref_rfc20180830="Sparse Chunks" +ALIASES += ref_rfc20180829="H5FD_MIRROR Virtual File Driver" +ALIASES += ref_rfc20180815="Splitter_VFD" ALIASES += ref_rfc20180620="Chunk query functionality in HDF5" ALIASES += ref_rfc20180610="VFD SWMR" ALIASES += ref_rfc20180321="API Contexts" @@ -298,7 +306,7 @@ ALIASES += ref_rfc20120305="h5repack: Improved Hyperslab selections for Large Chunked Datasets" ALIASES += ref_rfc20120120="A Maintainer’s Guide for the Datatype Module in HDF5 Library" ALIASES += ref_rfc20120104="Actual I/O Mode" -ALIASES += ref_rfc20111119="New public functions to handle comparison" +ALIASES += ref_rfc20111119="New public functions to handle comparison" ALIASES += ref_rfc20110825="Merging Named Datatypes in H5Ocopy()" ALIASES += ref_rfc20110811="Expanding the HDF5 Hyperslab Selection Interface" ALIASES += ref_rfc20110726="HDF5 File Space Allocation and Aggregation" @@ -318,8 +326,12 @@ ALIASES += ref_rfc20091218="HDF5 Tools Library Functions" ALIASES += ref_rfc20090612="Default EPSILON values for comparing floating point data" ALIASES += ref_rfc20081218="Reporting of Non-Comparable Datasets by h5diff" +ALIASES += ref_rfc20081205="External Link Traversal Callback" +ALIASES += ref_rfc20081030="Setting Raw Data Chunk Cache Parameters in HDF5" ALIASES += ref_rfc20080915="Performance Report for Free-space Manager" ALIASES += ref_rfc20080904="Setting File Access Property List for accessing External Link" +ALIASES += ref_rfc20080728="Native Time Types in HDF5" +ALIASES += ref_rfc20080723="Special Values in HDF5" ALIASES += ref_rfc20080301="Dynamic Transformations to HDF5 Data" ALIASES += ref_rfc20080209="Using SVN branching to improve software development process at THG" ALIASES += ref_rfc20080206="Maintaining the HISTORY.txt and RELEASE.txt files in HDF5" @@ -327,7 +339,7 @@ ALIASES += ref_rfc20071111="NaN detection in HDF5" ALIASES += ref_rfc20070801="Metadata Journaling to Improve Crash Survivability" ALIASES += ref_rfc20070413="API Compatibility Strategies for HDF5" -ALIASES += ref_rfc20070115="A 'Private' Heap for HDF5" +ALIASES += ref_rfc20070115="A "Private" Heap for HDF5" ALIASES += ref_rfc20060623="Performance Comparison of Collective I/O and Independent I/O with Derived Datatypes" ALIASES += ref_rfc20060604="h5stat tool" ALIASES += ref_rfc20060505="Simple Performance Test on Fletcher32 Filter" diff --git a/doxygen/dox/About.dox b/doxygen/dox/About.dox index 32930a8..0b21fcc 100644 --- a/doxygen/dox/About.dox +++ b/doxygen/dox/About.dox @@ -12,12 +12,118 @@ of documentation done right. \section about_documentation Documentation about Documentation -\li \todo Describe how to add a reference or a new RFC -\li \todo Describe how to add an example -\li \todo Describe how to include plain HTML -\li \todo Describe how to add an API macro -\li \todo Describe the custom commands -\li \todo Describe the S3 bucket layout and update routine -\li \todo Link the RM template +In this section, we describe common documentation maintenance tasks. + +\subsection plain_html Including Plain HTML Pages + +The most common use case for this is the inclusion of older documentation. +New documentation should, whenever possible, be created using Doxygen markdown! + +Use Doxygen's htmlinclude +special command to include existing plain HTML pages. + +An example from this documentation set can be seen +here. + +\subsection new_rm_entry Creating a New Reference Manual Entry + +Please refer to the \ref RMT for guidance on how to create a new reference manual entry. + +\subsubsection new_example Adding and Referencing API Examples + +For each HDF5 module, such as \Code{H5F}, there is an examples source file called +\Code{H5*_examples.c}. For example, the \Code{H5F} API examples are located in + +H5F_examples.c. Examples are code blocks marked as Doxygen +snippets. +For example, the source code for the H5Fcreate() API sample is located between +the +\verbatim +//! +... +//! +\endverbatim +comments in + +H5F_examples.c. + +Add a new API example by adding a new code block enclosed between matching +snippet tags. The name of the tag is usually the function name stripped of +the module prefix. + +The inclusion of such a block of code can then be triggered via Doxygen's +snippet +special command. For example, the following markup +\verbatim +* \snippet H5F_examples.c create +\endverbatim +yields +\snippet H5F_examples.c create + +\subsubsection api_macro Adding an API Macro + +API macros are handled by the api_vers_2, api_vers_3, api_vers_4 +custom commands. The numbers indicate the number of potential API function +mappings. For example, H5Acreate() has two potential mappings, H5Acreate1() and +H5Acreate2(). To trigger the creation of a reference manual entry for H5Acreate() +use the following markup: +\verbatim +\api_vers_2{H5Acreate,H5Acreate1,H5Acreate2} +\endverbatim +This yields: + +\api_vers_2{H5Acreate,H5Acreate1,H5Acreate2} + +\subsection custom_commands Creating Custom Commands + +See Doxygen's Custom Commands documentation +as a general reference. + +All custom commands for this project are located in the +aliases +file in the doxygen +subdirectory of the main HDF5 repo. + +The custom commands are grouped in sections. Find a suitable section for your command or +ask for help if unsure! + +\subsection new_rfc Adding a New RFC or Referencing an Existing RFC + +For ease of reference, we define custom commands for each RFC in the RFCs section +of the +aliases +file. For example the custom command \Code{ref_rfc20141210} can be used to insert a +reference to "RFC: Virtual Object Layer". In other words, the markup +\verbatim +\ref_rfc20141210 +\endverbatim +yields a clickable link: + +\ref_rfc20141210 + +To add a new RFC, add a custom command for the RFC to the +aliases +file. The naming convention for the custom command is \Code{ref_rfcYYYYMMDD}, +where \Code{YYYYMMDD} is the ID of the RFC. The URL is composed of the prefix +\verbatim +https://docs.hdfgroup.org/hdf5/rfc/ +\endverbatim +and the name of your RFC file, typically, a PDF file, i.e., the full URL would +be +\verbatim +https://docs.hdfgroup.org/hdf5/rfc/my_great_rfc_name.pdf +\endverbatim + +\subsection hosting How Do Updates and Changes Get Published? + +Currently, the files underlying this documentation website are stored in an +bucket on AWS S3. The top-level bucket is
s3://docs.hdfgroup.org/hdf5/
+There are folders for the development branch and all supported release +version. + +Talk to your friendly IT-team if you need write access, or you need someone to +push an updated version for you! + +\todo Make the publication a GitHub action! */ \ No newline at end of file diff --git a/doxygen/dox/FTS.dox b/doxygen/dox/FTS.dox new file mode 100644 index 0000000..9dae7c1 --- /dev/null +++ b/doxygen/dox/FTS.dox @@ -0,0 +1,8 @@ +/** \page FTS Full-Text Search + +\htmlonly + + +\endhtmlonly + +*/ \ No newline at end of file diff --git a/doxygen/dox/FileFormatSpec.dox b/doxygen/dox/FileFormatSpec.dox deleted file mode 100644 index fc10574..0000000 --- a/doxygen/dox/FileFormatSpec.dox +++ /dev/null @@ -1,23 +0,0 @@ -/** \page FMT3 HDF5 File Format Specification Version 3.0 - -\htmlinclude H5.format.html - -*/ - -/** \page FMT2 HDF5 File Format Specification Version 2.0 - -\htmlinclude H5.format.2.0.html - -*/ - -/** \page FMT11 HDF5 File Format Specification Version 1.1 - -\htmlinclude H5.format.1.1.html - -*/ - -/** \page FMT1 HDF5 File Format Specification Version 1.0 - -\htmlinclude H5.format.1.0.html - -*/ \ No newline at end of file diff --git a/doxygen/dox/OtherSpecs.dox b/doxygen/dox/OtherSpecs.dox deleted file mode 100644 index e53f26e..0000000 --- a/doxygen/dox/OtherSpecs.dox +++ /dev/null @@ -1,11 +0,0 @@ -/** \page IMG HDF5 Image and Palette Specification Version 1.2 - -\htmlinclude ImageSpec.html - -*/ - -/** \page TBL HDF5 Table Specification Version 1.0 - -\htmlinclude TableSpec.html - -*/ diff --git a/doxygen/dox/RFC.dox b/doxygen/dox/RFC.dox index c16dcea..c2562b0 100644 --- a/doxygen/dox/RFC.dox +++ b/doxygen/dox/RFC.dox @@ -3,10 +3,17 @@ + + + + + + + @@ -69,8 +76,12 @@ + + + + diff --git a/doxygen/dox/Specifications.dox b/doxygen/dox/Specifications.dox index 4ae48d0..5a36d61 100644 --- a/doxygen/dox/Specifications.dox +++ b/doxygen/dox/Specifications.dox @@ -19,4 +19,40 @@ \li HDF5 Dimension Scale Specification -*/ \ No newline at end of file +*/ + +/** \page FMT3 HDF5 File Format Specification Version 3.0 + +\htmlinclude H5.format.html + +*/ + +/** \page FMT2 HDF5 File Format Specification Version 2.0 + +\htmlinclude H5.format.2.0.html + +*/ + +/** \page FMT11 HDF5 File Format Specification Version 1.1 + +\htmlinclude H5.format.1.1.html + +*/ + +/** \page FMT1 HDF5 File Format Specification Version 1.0 + +\htmlinclude H5.format.1.0.html + +*/ + +/** \page IMG HDF5 Image and Palette Specification Version 1.2 + +\htmlinclude ImageSpec.html + +*/ + +/** \page TBL HDF5 Table Specification Version 1.0 + +\htmlinclude TableSpec.html + +*/ diff --git a/doxygen/dox/TechnicalNotes.dox b/doxygen/dox/TechnicalNotes.dox index 2bda175..0cabdeb 100644 --- a/doxygen/dox/TechnicalNotes.dox +++ b/doxygen/dox/TechnicalNotes.dox @@ -1,6 +1,10 @@ /** \page TN Technical Notes \li \link api-compat-macros API Compatibility Macros \endlink +\li \ref APPDBG "Debugging HDF5 Applications" +\li \ref FMTDISC "File Format Walkthrough" +\li \ref FILTER "Filters" +\li \ref IOFLOW "HDF5 Raw I/O Flow Notes" \li \ref TNMDC "Metadata Caching in HDF5" \li \ref MT "Thread Safe library" \li \ref VFL "Virtual File Layer" @@ -13,8 +17,30 @@ */ +/** \page IOFLOW HDF5 Raw I/O Flow Notes + +\htmlinclude IOFlow.html + +*/ + /** \page VFL HDF5 Virtual File Layer \htmlinclude VFL.html */ + +/** \page FMTDISC HDF5 File Format Discussion + +\htmlinclude FileFormat.html + +/** \page FILTER HDF5 Filters + +\htmlinclude Filters.html + +*/ + +/** \page APPDBG Debugging HDF5 Applications + +\htmlinclude DebuggingHDF5Applications.html + +*/ \ No newline at end of file diff --git a/doxygen/examples/DebuggingHDF5Applications.html b/doxygen/examples/DebuggingHDF5Applications.html new file mode 100644 index 0000000..c6aaf74 --- /dev/null +++ b/doxygen/examples/DebuggingHDF5Applications.html @@ -0,0 +1,392 @@ + + + Debugging HDF5 Applications + +

Introduction

+ +

The HDF5 library contains a number of debugging features to + make programmers' lives easier including the ability to print + detailed error messages, check invariant conditions, display + timings and other statistics, and trace API function calls and + return values. + +

+
Error Messages +
Error messages are normally displayed automatically on the + standard error stream and include a stack trace of the library + including file names, line numbers, and function names. The + application has complete control over how error messages are + displayed and can disable the display on a permanent or + temporary basis. Refer to the documentation for the H5E error + handling package. + +

+
Invariant Conditions +
Unless NDEBUG is defined during compiling, the + library will include code to verify that invariant conditions + have the expected values. When a problem is detected the + library will display the file and line number within the + library and the invariant condition that failed. A core dump + may be generated for post mortem debugging. The code to + perform these checks can be included on a per-package bases. + +

+
Timings and Statistics +
The library can be configured to accumulate certain + statistics about things like cache performance, datatype + conversion, data space conversion, and data filters. The code + is included on a per-package basis and enabled at runtime by + an environment variable. + +

+
API Tracing +
All API calls made by an application can be displayed and + include formal argument names and actual values and the + function return value. This code is also conditionally + included at compile time and enabled at runtime. +
+ +

The statistics and tracing can be displayed on any output + stream (including streams opened by the shell) with output from + different packages even going to different streams. + +

Error Messages

+ +

By default any API function that fails will print an error + stack to the standard error stream. + +

+

+
RFC IDTitleComments
2021-05-28 \ref_rfc20210528
2021-02-19 \ref_rfc20210219
2020-02-13 \ref_rfc20200213
2020-02-10 \ref_rfc20200210
2019-09-23 \ref_rfc20190923
2019-07-15 \ref_rfc20190715
2019-04-10 \ref_rfc20190410
2018-12-31 \ref_rfc20181231
2018-12-20 \ref_rfc20181220
2018-08-30 \ref_rfc20180830
2018-08-29 \ref_rfc20180829
2018-08-15 \ref_rfc20180815
2018-06-20 \ref_rfc20180620
2018-06-10 \ref_rfc20180610
2018-03-21 \ref_rfc20180321
2009-09-07 \ref_rfc20090907
2009-06-12 \ref_rfc20090612
2008-12-18 \ref_rfc20081218
2008-12-05 \ref_rfc20081205
2008-10-30 \ref_rfc20081030
2008-09-15 \ref_rfc20080915
2008-09-04 \ref_rfc20080904
2008-07-28 \ref_rfc20080728
2008-07-23 \ref_rfc20080723
2008-03-01 \ref_rfc20080301
2008-02-09 \ref_rfc20080209
2008-02-06 \ref_rfc20080206
+ + + +
+


+HDF5-DIAG: Error detected in thread 0.  Back trace follows.
+  #000: H5F.c line 1245 in H5Fopen(): unable to open file
+    major(04): File interface
+    minor(10): Unable to open file
+  #001: H5F.c line 846 in H5F_open(): file does not exist
+    major(04): File interface
+    minor(10): Unable to open file
+	      
+
+ + +

The error handling package (H5E) is described + elsewhere. + +

Invariant Conditions

+ +

To include checks for invariant conditions the library should + be configured with --disable-production, the + default for versions before 1.2. The library designers have made + every attempt to handle error conditions gracefully but an + invariant condition assertion may fail in certain cases. The + output from a failure usually looks something like this: + +

+

+ + + + +
+


+Assertion failed: H5.c:123: i<NELMTS(H5_debug_g)
+IOT Trap, core dumped.
+	      
+
+
+ +

Timings and Statistics

+ +

Code to accumulate statistics is included at compile time by + using the --enable-debug configure switch. The + switch can be followed by an equal sign and a comma-separated + list of package names or else a default list is used. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
NameDefaultDescription
aNoAttributes
acYesMeta data cache
bYesB-Trees
dYesDatasets
eYesError handling
fYesFiles
gYesGroups
hgYesGlobal heap
hlNoLocal heaps
iYesInterface abstraction
mfNoFile memory management
mmYesLibrary memory managment
oNoObject headers and messages
pYesProperty lists
sYesData spaces
tYesDatatypes
vYesVectors
zYesRaw data filters
+
+ +

In addition to including the code at compile time the + application must enable each package at runtime. This is done + by listing the package names in the HDF5_DEBUG + environment variable. That variable may also contain file + descriptor numbers (the default is `2') which control the output + for all following packages up to the next file number. The + word all refers to all packages. Any word my be + preceded by a minus sign to turn debugging off for the package. + +

+

+ + + + + + + + + + + + + + +
Sample debug specifications
allThis causes debugging output from all packages to be + sent to the standard error stream.
all -t -sDebugging output for all packages except datatypes + and data spaces will appear on the standard error + stream.
-all ac 255 t,sThis disables all debugging even if the default was to + debug something, then output from the meta data cache is + send to the standard error stream and output from data + types and spaces is sent to file descriptor 255 which + should be redirected by the shell.
+
+ +

The components of the HDF5_DEBUG value may be + separated by any non-lowercase letter. + +

API Tracing

+ +

The HDF5 library can trace API calls by printing the + function name, the argument names and their values, and the + return value. Some people like to see lots of output during + program execution instead of using a good symbolic debugger, and + this feature is intended for their consumption. For example, + the output from h5ls foo after turning on tracing, + includes: + +

+

+ + + + +
+
+H5Tcopy(type=184549388) = 184549419 (type);
+H5Tcopy(type=184549392) = 184549424 (type);
+H5Tlock(type=184549424) = SUCCEED;
+H5Tcopy(type=184549393) = 184549425 (type);
+H5Tlock(type=184549425) = SUCCEED;
+H5Fopen(filename="foo", flags=0, access=H5P_DEFAULT) = FAIL;
+HDF5-DIAG: Error detected in thread 0.  Back trace follows.
+  #000: H5F.c line 1245 in H5Fopen(): unable to open file
+    major(04): File interface
+    minor(10): Unable to open file
+  #001: H5F.c line 846 in H5F_open(): file does not exist
+    major(04): File interface
+    minor(10): Unable to open file
+	      
+
+
+ +

The code that performs the tracing must be included in the + library by specifying the --enable-trace + configuration switch (the default for versions before 1.2). Then + the word trace must appear in the value of the + HDF5_DEBUG variable. The output will appear on the + last file descriptor before the word trace or two + (standard error) by default. + +

+

+ + + + + + + +
To display the trace on the standard error stream: +
$ env HDF5_DEBUG=trace a.out
+	      
+
To send the trace to a file: +
$ env HDF5_DEBUG="55 trace" a.out 55>trace-output
+	      
+
+
+ +

Performance

+ +

If the library was not configured for tracing then there is no + unnecessary overhead since all tracing code is excluded. + However, if tracing is enabled but not used there is a small + penalty. First, code size is larger because of extra + statically-declared character strings used to store argument + types and names and extra auto variable pointer in each + function. Also, execution is slower because each function sets + and tests a local variable and each API function calls the + H5_trace() function. + +

If tracing is enabled and turned on then the penalties from the + previous paragraph apply plus the time required to format each + line of tracing information. There is also an extra call to + H5_trace() for each API function to print the return value. + +

Safety

+ +

The tracing mechanism is invoked for each API function before + arguments are checked for validity. If bad arguments are passed + to an API function it could result in a segmentation fault. + However, the tracing output is line-buffered so all previous + output will appear. + +

Completeness

+ +

There are two API functions that don't participate in + tracing. They are H5Eprint() and + H5Eprint_cb() because their participation would + mess up output during automatic error reporting. + +

On the other hand, a number of API functions are called during + library initialization and they print tracing information. + +

Implementation

+ +

For those interested in the implementation here is a + description. Each API function should have a call to one of the + H5TRACE() macros immediately after the + FUNC_ENTER() macro. The first argument is the + return type encoded as a string. The second argument is the + types of all the function arguments encoded as a string. The + remaining arguments are the function arguments. This macro was + designed to be as terse and unobtrousive as possible. + +

In order to keep the H5TRACE() calls synchronized + with the source code we've written a perl script which gets + called automatically just before Makefile dependencies are + calculated for the file. However, this only works when one is + using GNU make. To reinstrument the tracing explicitly, invoke + the trace program from the hdf5 bin directory with + the names of the source files that need to be updated. If any + file needs to be modified then a backup is created by appending + a tilde to the file name. + +

+

+ + + + + +
Explicit Instrumentation
+
+$ ../bin/trace *.c
+H5E.c: in function `H5Ewalk_cb':
+H5E.c:336: warning: trace info was not inserted
+	      
+
+
+ +

Note: The warning message is the result of a comment of the + form /*NO TRACE*/ somewhere in the function + body. Tracing information will not be updated or inserted if + such a comment exists. + +

Error messages have the same format as a compiler so that they + can be parsed from program development environments like + Emacs. Any function which generates an error will not be + modified.

+ + diff --git a/doxygen/examples/FileFormat.html b/doxygen/examples/FileFormat.html new file mode 100644 index 0000000..fc35357 --- /dev/null +++ b/doxygen/examples/FileFormat.html @@ -0,0 +1,1275 @@ + + + + HDF5 File Format Discussion + + + + + + + + + + +

HDF5 File Format Discussion

+

Quincey Koziol
+ koziol@ncsa.uiuc.edu
+ May 15, 2003 +

+ +
    + +
  1. Document's Audience:

    + +
      +
    • Current H5 library designers and knowledgable external developers.
    • +
    + +
  2. Background Reading:

    + +
    +
    HDF5 File Format Specification +
    This describes the current HDF5 file format. +
    + +
  3. Introduction:

    + +
    +
    What is this document about?
    +
    This document attempts to explain the HDF5 file format + specification with a few examples and describes some potential + improvements to the format specification. +

    +
    + +
  4. File Format Examples:

    + +

    This section has several small programs and describes the format of a file +created with each of them. +

    + +

    Example program one - Create an empty file: +

     
    +#include "hdf5.h"
    +#include 
    +
    +int main()
    +{
    +    hid_t fid;      /* File ID */
    +    herr_t ret;     /* Generic return value */
    +
    +    /* Create the file */
    +    fid=H5Fcreate("example1.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    +    assert(fid>=0);
    +
    +    /* Close the file */
    +    ret=H5Fclose(fid);
    +    assert(ret>=0);
    +
    +    return(0);
    +}
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Super Block +
    bytebytebytebyte
    \211'H''D''F'
    \r\n\032\n
    0000
    0880
    416
    0x00000003
    0

    0xffffffffffffffff

    ?

    0xffffffffffffffff

    + + + + + + + + + + + + + + + + + + + + +
    0

    928

    H5G_CACHED_STAB (1)
    0
    + + + + + + + +

    384


    96

    +
    +
    +
    +
    +
     
    +%h5debug example1.h5
    +
    +Reading signature at address 0 (rel)
    +File Super Block...
    +File name:                                         example1.h5
    +File access flags                                  0x00000000
    +File open reference count:                         1
    +Address of super block:                            0 (abs)
    +Size of user block:                                0 bytes
    +Super block version number:                        0
    +Free list version number:                          0
    +Root group symbol table entry version number:      0
    +Shared header version number:                      0
    +Size of file offsets (haddr_t type):               8 bytes
    +Size of file lengths (hsize_t type):               8 bytes
    +Symbol table leaf node 1/2 rank:                   4
    +Symbol table internal node 1/2 rank:               16
    +File consistency flags:                            0x00000003
    +Base address:                                      0 (abs)
    +Free list address:                                 UNDEF (rel)
    +Address of driver information block:               UNDEF (rel)
    +Root group symbol table entry:
    +   Name offset into private heap:                  0
    +   Object header address:                          928
    +   Dirty:                                          Yes
    +   Cache info type:                                Symbol Table
    +   Cached information:
    +      B-tree address:                              384
    +      Heap address:                                96
    + 
    + + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group Object Header +
    bytebytebytebyte
    102
    1
    32
    0x001116
    0x010
    + + + + + + + +

    384


    96

    +
    00
    0x000
    +
    +
    +
     
    +%h5debug example1.h5 928
    +
    +New address: 928
    +Reading signature at address 928 (rel)
    +Object Header...
    +Dirty:                                             0
    +Version:                                           1
    +Header size (in bytes):                            16
    +Number of links:                                   1
    +Number of messages (allocated):                    2 (32)
    +Number of chunks (allocated):                      1 (8)
    +Chunk 0...
    +   Dirty:                                          0
    +   Address:                                        944
    +   Size in bytes:                                  32
    +Message 0...
    +   Message ID (sequence number):                   0x0011 stab(0)
    +   Shared message:                                 No
    +   Constant:                                       Yes
    +   Raw size in obj header:                         16 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      B-tree address:                              384
    +      Name heap address:                           96
    +Message 1...
    +   Message ID (sequence number):                   0x0000 null(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         0 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group Local Heap +
    bytebytebytebyte
    'H''E''A''P'
    0
    256
    8
    128
    +
    +
    + +
     
    +%h5debug example1.h5 96
    +
    +New address: 96
    +Reading signature at address 96 (rel)
    +Local Heap...
    +Dirty:                                             0
    +Header size (in bytes):                            32
    +Address of heap data:                              128
    +Data bytes allocated on disk:                      256
    +Data bytes allocated in core:                      256
    +Free Blocks (offset, size):
    +   Block #0:                                        8,      248
    +Percent of heap used:                              3.12%
    +Data follows (`__' indicates free region)...
    +     0: 00 00 00 00 00 00 00 00  __ __ __ __ __ __ __ __ ........
    +    16: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    32: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    48: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    64: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    80: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    96: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   112: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   128: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   144: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   160: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   176: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   192: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   208: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   224: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +   240: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group B-tree +
    bytebytebytebyte
    'T''R''E''E'
    000

    0xffffffffffffffff


    0xffffffffffffffff

    +
    +
    +
     
    +%h5debug example1.h5 384 96
    +
    +New address: 384
    +Reading signature at address 384 (rel)
    +Tree type ID:                                      H5B_SNODE_ID
    +Size of node:                                      544
    +Size of raw (disk) key:                            8
    +Dirty flag:                                        False
    +Number of initial dirty children:                  0
    +Level:                                             0
    +Address of left sibling:                           UNDEF
    +Address of right sibling:                          UNDEF
    +Number of children (max):                          0 (32)
    +
    + 
    + +

    + +

    Example program two - Create a file with a single dataset in it: +

     
    +#include "hdf5.h"
    +#include 
    +
    +int main()
    +{
    +    hid_t fid;      /* File ID */
    +    hid_t sid;      /* Dataspace ID */
    +    hid_t did;      /* Dataset ID */
    +    herr_t ret;     /* Generic return value */
    +
    +    /* Create the file */
    +    fid=H5Fcreate("example2.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    +    assert(fid>=0);
    +
    +    /* Create a scalar dataspace for the dataset */
    +    sid=H5Screate(H5S_SCALAR);
    +    assert(sid>=0);
    +
    +    /* Create a trivial dataset */
    +    did=H5Dcreate(fid, "Dataset", H5T_NATIVE_INT, sid, H5P_DEFAULT);
    +    assert(did>=0);
    +
    +    /* Close the dataset */
    +    ret=H5Dclose(did);
    +    assert(ret>=0);
    +
    +    /* Close the dataspace */
    +    ret=H5Sclose(sid);
    +    assert(ret>=0);
    +
    +    /* Close the file */
    +    ret=H5Fclose(fid);
    +    assert(ret>=0);
    +
    +    return(0);
    +}
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Super Block +
    bytebytebytebyte
    \211'H''D''F'
    \r\n\032\n
    0000
    0880
    416
    0x00000003
    0

    0xffffffffffffffff

    ?

    0xffffffffffffffff

    + + + + + + + + + + + + + + + + + + + + +
    0

    928

    H5G_CACHED_STAB (1)
    0
    + + + + + + + +

    384


    96

    +
    +
    +
    +
    +
     
    +%h5debug example2.h5
    +
    +Reading signature at address 0 (rel)
    +File Super Block...
    +File name:                                         example2.h5
    +File access flags                                  0x00000000
    +File open reference count:                         1
    +Address of super block:                            0 (abs)
    +Size of user block:                                0 bytes
    +Super block version number:                        0
    +Free list version number:                          0
    +Root group symbol table entry version number:      0
    +Shared header version number:                      0
    +Size of file offsets (haddr_t type):               8 bytes
    +Size of file lengths (hsize_t type):               8 bytes
    +Symbol table leaf node 1/2 rank:                   4
    +Symbol table internal node 1/2 rank:               16
    +File consistency flags:                            0x00000003
    +Base address:                                      0 (abs)
    +Free list address:                                 UNDEF (rel)
    +Address of driver information block:               UNDEF (rel)
    +Root group symbol table entry:
    +   Name offset into private heap:                  0
    +   Object header address:                          928
    +   Dirty:                                          Yes
    +   Cache info type:                                Symbol Table
    +   Cached entry information:
    +      B-tree address:                              384
    +      Heap address:                                96
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group Object Header +
    bytebytebytebyte
    102
    1
    32
    0x001116
    0x010
    + + + + + + + +

    384


    96

    +
    00
    0x000
    +
    +
    +
     
    +%h5debug example2.h5 928
    +
    +New address: 928
    +Reading signature at address 928 (rel)
    +Object Header...
    +Dirty:                                             0
    +Version:                                           1
    +Header size (in bytes):                            16
    +Number of links:                                   1
    +Number of messages (allocated):                    2 (32)
    +Number of chunks (allocated):                      1 (8)
    +Chunk 0...
    +   Dirty:                                          0
    +   Address:                                        944
    +   Size in bytes:                                  32
    +Message 0...
    +   Message ID:                                     0x0011 stab(0)
    +   Shared message:                                 No
    +   Constant:                                       Yes
    +   Raw size in obj header:                         16 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      B-tree address:                              384
    +      Name heap address:                           96
    +Message 1...
    +   Message ID:                                     0x0000 null(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         0 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group Local Heap +
    bytebytebytebyte
    'H''E''A''P'
    0
    256
    16
    128
    +
    +
    + +
     
    +%h5debug example2.h5 96
    +
    +New address: 96
    +Reading signature at address 96 (rel)
    +Local Heap...
    +Dirty:                                             0
    +Header size (in bytes):                            32
    +Address of heap data:                              128
    +Data bytes allocated on disk:                      256
    +Data bytes allocated in core:                      256
    +Free Blocks (offset, size):
    +   Block #0:                                       16,      240
    +Percent of heap used:                              6.25%
    +Data follows (`__' indicates free region)...
    +      0: 00 00 00 00 00 00 00 00  44 61 74 61 73 65 74 00 ........Dataset.
    +     16: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +     32: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +     48: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +     64: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +     80: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +     96: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    112: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    128: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    144: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    160: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    176: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    192: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    208: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    224: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    +    240: __ __ __ __ __ __ __ __  __ __ __ __ __ __ __ __
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group B-tree +
    bytebytebytebyte
    'T''R''E''E'
    001

    0xffffffffffffffff


    0xffffffffffffffff


    0


    1248


    8

    +
    +
    +
     
    +%h5debug example2.h5 384 96
    +
    +New address: 384
    +Reading signature at address 384 (rel)
    +Tree type ID:                                      H5B_SNODE_ID
    +Size of node:                                      544
    +Size of raw (disk) key:                            8
    +Dirty flag:                                        False
    +Number of initial dirty children:                  0
    +Level:                                             0
    +Address of left sibling:                           UNDEF
    +Address of right sibling:                          UNDEF
    +Number of children (max):                          1 (32)
    +Child 0...
    +   Address:                                        1248
    +   Left Key:
    +      Heap offset:                                 0
    +      Name :
    +   Right Key:
    +      Heap offset:                                 8
    +      Name :                                       Dataset
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + +
    + Root Group B-tree Symbol Table Node +
    bytebytebytebyte
    'S''N''O''D'
    101
    + + + + + + + + + + + + + + + + + + + + +
    8

    976

    0
    0


    0


    +
    +
    +
    +
     
    +%h5debug example2.h5 1248 96
    +
    +New address: 1248
    +Reading signature at address 1248 (rel)
    +Symbol Table Node...
    +Dirty:                                             No
    +Size of Node (in bytes):                           328
    +Number of Symbols:                                 1 of 8
    +Symbol 0:
    +   Name:                                           `Dataset'
    +   Name offset into private heap:                  8
    +   Object header address:                          976
    +   Dirty:                                          No
    +   Cache info type:                                Nothing Cached
    + 
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + '/Dataset' Object Header +
    bytebytebytebyte
    Version: 1Reserved: 0Number of Header Messages: 6
    Object Reference Count: 1
    Total Object Header Size: 256
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Fill Value Header Message +
    Message Type: 0x0005Message Data Size: 8
    Flags: 0x01Reserved: 0
    Version: 1Space Allocation Time: 2 (Late)Fill Value Writing Time: 0 (At allocation)Fill Value Defined: 0 (Undefined)
    Fill Value Datatype Size: 0 (Use dataset's datatype for fill-value datatype)
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Datatype Header Message +
    Message Type: 0x0003Message Data Size: 16
    Flags: 0x01Reserved: 0
    + + + + + +
    Version: 0x1Class: 0x0 (Fixed-Point)
    +
    Fixed-Point Bit-Field: 0x08 (Little-endian, No padding, Signed)
    Size: 4
    Bit Offset: 0Bit Precision: 32
    Message Alignment Filler: -
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Dataspace Header Message +
    Message Type: 0x0001Message Data Size: 8
    Flags: 0x00Reserved: 0
    Version: 1Rank: 0 (Scalar)Flags: 0x00 (No maximum dimensions, no permutation information)Reserved: 0
    Reserved: 0
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Layout Header Message +
    Message Type: 0x0008Message Data Size: 24
    Flags: 0x00Reserved: 0
    Version: 1Rank: 1 (Dataspace rank+1)Class: 1 (Contiguous)Reserved: 0
    Reserved: 0

    Address: 0xffffffffffffffff (Undefined)

    Dimension 0 Size: 4 (Datatype size)
    Message Alignment Filler: -
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Modification Date & Time Header Message +
    Message Type: 0x0012Message Data Size: 8
    Flags: 0x00Reserved: 0
    Version: 1Reserved: 0
    Seconds Since Epoch: 1052401700 (2003-05-08 08:48:20 CDT)
    +
    + + + + + + + + + + + + + + + + + +
    + Null Header Message +
    Message Type: 0x0000Message Data Size: 144
    Flags: 0x00Reserved: 0
    +
    +
    +
    +
     
    +%h5debug example2.h5 976
    +
    +New address: 976
    +Reading signature at address 976 (rel)
    +Object Header...
    +Dirty:                                             0
    +Version:                                           1
    +Header size (in bytes):                            16
    +Number of links:                                   1
    +Number of messages (allocated):                    6 (32)
    +Number of chunks (allocated):                      1 (8)
    +Chunk 0...
    +   Dirty:                                          0
    +   Address:                                        992
    +   Size in bytes:                                  256
    +Message 0...
    +   Message ID (sequence number):                   0x0005 `fill_new' (0)
    +   Shared:                                         No
    +   Constant:                                       Yes
    +   Raw size in obj header:                         8 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      Version:                                     1
    +      Space Allocation Time:                       Late
    +      Fill Time:                                   On Allocation
    +      Fill Value Defined:                          Undefined
    +      Size:                                        0
    +      Data type:                                   
    +Message 1...
    +   Message ID (sequence number):                   0x0003 data_type(0)
    +   Shared message:                                 No
    +   Constant:                                       Yes
    +   Raw size in obj header:                         16 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      Type class:                                  integer
    +      Size:                                        4 bytes
    +      Byte order:                                  little endian
    +      Precision:                                   32 bits
    +      Offset:                                      0 bits
    +      Low pad type:                                zero
    +      High pad type:                               zero
    +      Sign scheme:                                 2's comp
    +Message 2...
    +   Message ID (sequence number):                   0x0001 simple_dspace(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         8 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      Rank:                                        0
    +Message 3...
    +   Message ID (sequence number):                   0x0008 layout(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         24 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      Data address:                                UNDEF
    +      Number of dimensions:                        1
    +      Size:                                        {4}
    +Message 4...
    +   Message ID (sequence number):                   0x0012 mtime_new(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         8 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      Time:                                        2003-03-05 14:52:00 CST
    +Message 5...
    +   Message ID (sequence number):                   0x0000 null(0)
    +   Shared message:                                 No
    +   Constant:                                       No
    +   Raw size in obj header:                         144 bytes
    +   Chunk number:                                   0
    +   Message Information:
    +      
    + 
    + +

    + +
+ + + + diff --git a/doxygen/examples/Filters.html b/doxygen/examples/Filters.html new file mode 100644 index 0000000..2d5bc5e --- /dev/null +++ b/doxygen/examples/Filters.html @@ -0,0 +1,450 @@ + + + Filters +

Filters in HDF5

+ + Note: Transient pipelines described in this document have not + been implemented. + +

Introduction

+ +

HDF5 allows chunked data to pass through user-defined filters + on the way to or from disk. The filters operate on chunks of an + H5D_CHUNKED dataset can be arranged in a pipeline + so output of one filter becomes the input of the next filter. + +

Each filter has a two-byte identification number (type + H5Z_filter_t) allocated by The HDF Group and can also be + passed application-defined integer resources to control its + behavior. Each filter also has an optional ASCII comment + string. + +

+ + + + + + + + + + + + + + + + + + + +
Values for H5Z_filter_tDescription
0-255These values are reserved for filters predefined and + registered by the HDF5 library and of use to the general + public. They are described in a separate section + below.
256-511Filter numbers in this range are used for testing only + and can be used temporarily by any organization. No + attempt is made to resolve numbering conflicts since all + definitions are by nature temporary.
512-65535Reserved for future assignment. Please contact the + HDF5 development team + to reserve a value or range of values for + use by your filters.
+ +

Defining and Querying the Filter Pipeline

+ +

Two types of filters can be applied to raw data I/O: permanent + filters and transient filters. The permanent filter pipeline is + defned when the dataset is created while the transient pipeline + is defined for each I/O operation. During an + H5Dwrite() the transient filters are applied first + in the order defined and then the permanent filters are applied + in the order defined. For an H5Dread() the + opposite order is used: permanent filters in reverse order, then + transient filters in reverse order. An H5Dread() + must result in the same amount of data for a chunk as the + original H5Dwrite(). + +

The permanent filter pipeline is defined by calling + H5Pset_filter() for a dataset creation property + list while the transient filter pipeline is defined by calling + that function for a dataset transfer property list. + +

+
herr_t H5Pset_filter (hid_t plist, + H5Z_filter_t filter, unsigned int flags, + size_t cd_nelmts, const unsigned int + cd_values[]) +
This function adds the specified filter and + corresponding properties to the end of the transient or + permanent output filter pipeline (depending on whether + plist is a dataset creation or dataset transfer + property list). The flags argument specifies certain + general properties of the filter and is documented below. The + cd_values is an array of cd_nelmts integers + which are auxiliary data for the filter. The integer values + will be stored in the dataset object header as part of the + filter information. +
int H5Pget_nfilters (hid_t plist) +
This function returns the number of filters defined in the + permanent or transient filter pipeline depending on whether + plist is a dataset creation or dataset transfer + property list. In each pipeline the filters are numbered from + 0 through N-1 where N is the value returned + by this function. During output to the file the filters of a + pipeline are applied in increasing order (the inverse is true + for input). Zero is returned if there are no filters in the + pipeline and a negative value is returned for errors. +
H5Z_filter_t H5Pget_filter (hid_t plist, + int filter_number, unsigned int *flags, + size_t *cd_nelmts, unsigned int + *cd_values, size_t namelen, char name[]) +
This is the query counterpart of + H5Pset_filter() and returns information about a + particular filter number in a permanent or transient pipeline + depending on whether plist is a dataset creation or + dataset transfer property list. On input, cd_nelmts + indicates the number of entries in the cd_values + array allocated by the caller while on exit it contains the + number of values defined by the filter. The + filter_number should be a value between zero and + N-1 as described for H5Pget_nfilters() + and the function will return failure (a negative value) if the + filter number is out of range. If name is a pointer + to an array of at least namelen bytes then the filter + name will be copied into that array. The name will be null + terminated if the namelen is large enough. The + filter name returned will be the name appearing in the file or + else the name registered for the filter or else an empty string. +
+ +

The flags argument to the functions above is a bit vector of + the following fields: + +

+ + + + + + + + + + +
Values for flagsDescription
H5Z_FLAG_OPTIONALIf this bit is set then the filter is optional. If + the filter fails (see below) during an + H5Dwrite() operation then the filter is + just excluded from the pipeline for the chunk for which + it failed; the filter will not participate in the + pipeline during an H5Dread() of the chunk. + This is commonly used for compression filters: if the + compression result would be larger than the input then + the compression filter returns failure and the + uncompressed data is stored in the file. If this bit is + clear and a filter fails then the + H5Dwrite() or H5Dread() also + fails.
+ +

Defining Filters

+ +

Each filter is bidirectional, handling both input and output to + the file, and a flag is passed to the filter to indicate the + direction. In either case the filter reads a chunk of data from + a buffer, usually performs some sort of transformation on the + data, places the result in the same or new buffer, and returns + the buffer pointer and size to the caller. If something goes + wrong the filter should return zero to indicate a failure. + +

During output, a filter that fails or isn't defined and is + marked as optional is silently excluded from the pipeline and + will not be used when reading that chunk of data. A required + filter that fails or isn't defined causes the entire output + operation to fail. During input, any filter that has not been + excluded from the pipeline during output and fails or is not + defined will cause the entire input operation to fail. + +

Filters are defined in two phases. The first phase is to + define a function to act as the filter and link the function + into the application. The second phase is to register the + function, associating the function with an + H5Z_filter_t identification number and a comment. + +

+
typedef size_t (*H5Z_func_t)(unsigned int + flags, size_t cd_nelmts, const unsigned int + cd_values[], size_t nbytes, size_t + *buf_size, void **buf) +
The flags, cd_nelmts, and + cd_values are the same as for the + H5Pset_filter() function with the additional flag + H5Z_FLAG_REVERSE which is set when the filter is + called as part of the input pipeline. The input buffer is + pointed to by *buf and has a total size of + *buf_size bytes but only nbytes are valid + data. The filter should perform the transformation in place if + possible and return the number of valid bytes or zero for + failure. If the transformation cannot be done in place then + the filter should allocate a new buffer with + malloc() and assign it to *buf, + assigning the allocated size of that buffer to + *buf_size. The old buffer should be freed + by calling free(). + +

+
herr_t H5Zregister (H5Z_filter_t filter_id, + const char *comment, H5Z_func_t + filter) +
The filter function is associated with a filter + number and a short ASCII comment which will be stored in the + hdf5 file if the filter is used as part of a permanent + pipeline during dataset creation. +
+ +

Predefined Filters

+ +

If zlib version 1.1.2 or later was found + during configuration then the library will define a filter whose + H5Z_filter_t number is + H5Z_FILTER_DEFLATE. Since this compression method + has the potential for generating compressed data which is larger + than the original, the H5Z_FLAG_OPTIONAL flag + should be turned on so such cases can be handled gracefully by + storing the original data instead of the compressed data. The + cd_nvalues should be one with cd_value[0] + being a compression agression level between zero and nine, + inclusive (zero is the fastest compression while nine results in + the best compression ratio). + +

A convenience function for adding the + H5Z_FILTER_DEFLATE filter to a pipeline is: + +

+
herr_t H5Pset_deflate (hid_t plist, unsigned + aggression) +
The deflate compression method is added to the end of the + permanent or transient filter pipeline depending on whether + plist is a dataset creation or dataset transfer + property list. The aggression is a number between + zero and nine (inclusive) to indicate the tradeoff between + speed and compression ratio (zero is fastest, nine is best + ratio). +
+ +

Even if the zlib isn't detected during + configuration the application can define + H5Z_FILTER_DEFLATE as a permanent filter. If the + filter is marked as optional (as with + H5Pset_deflate()) then it will always fail and be + automatically removed from the pipeline. Applications that read + data will fail only if the data is actually compressed; they + won't fail if H5Z_FILTER_DEFLATE was part of the + permanent output pipeline but was automatically excluded because + it didn't exist when the data was written. + +

zlib can be acquired from + + http://www.cdrom.com/pub/infozip/zlib/. + +

Example

+ +

This example shows how to define and register a simple filter + that adds a checksum capability to the data stream. + +

The function that acts as the filter always returns zero + (failure) if the md5() function was not detected at + configuration time (left as an excercise for the reader). + Otherwise the function is broken down to an input and output + half. The output half calculates a checksum, increases the size + of the output buffer if necessary, and appends the checksum to + the end of the buffer. The input half calculates the checksum + on the first part of the buffer and compares it to the checksum + already stored at the end of the buffer. If the two differ then + zero (failure) is returned, otherwise the buffer size is reduced + to exclude the checksum. + +

+ + + + +
+


+                  size_t
+                  md5_filter(unsigned int flags, size_t cd_nelmts,
+                  const unsigned int cd_values[], size_t nbytes,
+                  size_t *buf_size, void **buf)
+                  {
+                  #ifdef HAVE_MD5
+                  unsigned char       cksum[16];
+
+                  if (flags & H5Z_REVERSE) {
+                  /* Input */
+                  assert(nbytes>=16);
+                  md5(nbytes-16, *buf, cksum);
+
+                  /* Compare */
+                  if (memcmp(cksum, (char*)(*buf)+nbytes-16, 16)) {
+                  return 0; /*fail*/
+                  }
+
+                  /* Strip off checksum */
+                  return nbytes-16;
+
+                  } else {
+                  /* Output */
+                  md5(nbytes, *buf, cksum);
+
+                  /* Increase buffer size if necessary */
+                  if (nbytes+16>*buf_size) {
+                  *buf_size = nbytes + 16;
+                  *buf = realloc(*buf, *buf_size);
+                  }
+
+                  /* Append checksum */
+                  memcpy((char*)(*buf)+nbytes, cksum, 16);
+                  return nbytes+16;
+                  }
+                  #else
+                  return 0; /*fail*/
+                  #endif
+                  }
+	          
+
+ +

Once the filter function is defined it must be registered so + the HDF5 library knows about it. Since we're testing this + filter we choose one of the H5Z_filter_t numbers + from the reserved range. We'll randomly choose 305. + +

+

+ + + + +
+


+                  #define FILTER_MD5 305
+                  herr_t status = H5Zregister(FILTER_MD5, "md5 checksum", md5_filter);
+	          
+
+ +

Now we can use the filter in a pipeline. We could have added + the filter to the pipeline before defining or registering the + filter as long as the filter was defined and registered by time + we tried to use it (if the filter is marked as optional then we + could have used it without defining it and the library would + have automatically removed it from the pipeline for each chunk + written before the filter was defined and registered). + +

+

+ + + + +
+


+                  hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
+                  hsize_t chunk_size[3] = {10,10,10};
+                  H5Pset_chunk(dcpl, 3, chunk_size);
+                  H5Pset_filter(dcpl, FILTER_MD5, 0, 0, NULL);
+                  hid_t dset = H5Dcreate(file, "dset", H5T_NATIVE_DOUBLE, space, dcpl);
+	          
+
+ +

6. Filter Diagnostics

+ +

If the library is compiled with debugging turned on for the H5Z + layer (usually as a result of configure + --enable-debug=z) then filter statistics are printed when + the application exits normally or the library is closed. The + statistics are written to the standard error stream and include + two lines for each filter that was used: one for input and one + for output. The following fields are displayed: + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
MethodThis is the name of the method as defined with + H5Zregister() with the charaters + "< or ">" prepended to indicate + input or output.
TotalThe total number of bytes processed by the filter + including errors. This is the maximum of the + nbytes argument or the return value. +
ErrorsThis field shows the number of bytes of the Total + column which can be attributed to errors.
User, System, ElapsedThese are the amount of user time, system time, and + elapsed time in seconds spent in the filter function. + Elapsed time is sensitive to system load. These times + may be zero on operating systems that don't support the + required operations.
BandwidthThis is the filter bandwidth which is the total + number of bytes processed divided by elapsed time. + Since elapsed time is subject to system load the + bandwidth numbers cannot always be trusted. + Furthermore, the bandwidth includes bytes attributed to + errors which may significanly taint the value if the + function is able to detect errors without much + expense.
+ +

+

+ + + + + +
+ Example: Filter Statistics +
+

H5Z: filter statistics accumulated ov=
+                  er life of library:
+                  Method     Total  Errors  User  System  Elapsed Bandwidth
+                  ------     -----  ------  ----  ------  ------- ---------
+                  >deflate  160000   40000  0.62    0.74     1.33 117.5 kBs
+                  <deflate  120000       0  0.11    0.00     0.12 1.000 MBs
+	          
+
+ +
+ + +

Footnote 1: Dataset chunks can be compressed + through the use of filters. Developers should be aware that + reading and rewriting compressed chunked data can result in holes + in an HDF5 file. In time, enough such holes can increase the + file size enough to impair application or library performance + when working with that file. See + + Freespace Management + in the chapter + + Performance Analysis and Issues.

+ diff --git a/doxygen/examples/IOFlow.html b/doxygen/examples/IOFlow.html new file mode 100644 index 0000000..6b2c27e --- /dev/null +++ b/doxygen/examples/IOFlow.html @@ -0,0 +1,137 @@ + + + + HDF5 Raw I/O Flow Notes + + + + + + + + +

HDF5 Raw I/O Flow Notes

+

Quincey Koziol
+ koziol@ncsa.uiuc.edu
+ August 20, 2003 +

+ +
    + +
  1. Document's Audience:

    + +
      +
    • Current H5 library designers and knowledgable external developers.
    • +
    + +
  2. Background Reading:

    + +
  3. Introduction:

    + +
    +
    What is this document about?
    +
    This document attempts to supplement the flow charts describing + the flow of control for raw data I/O in the library. +

    +
    + +
  4. Figures:

    +

    The following figures provide the main information:

    + + + + +
    High-Level View of Writing Raw Data
    Perform Serial or Parallel I/O
    Gather/Convert/Scatter
    + +
  5. Notes From Accompanying Figures:

    + +

    This section provides notes to augment the information in the accompanying + figures. +

    + +
      +
    1. Validate Parameters - Resolve any H5S_ALL parameters + for dataspace selections to actual dataspaces, allocate + conversion buffers, etc. +
    2. + +
    3. Space Allocated in File? - Space may not have been allocated + in the file to store the dataset data, if "late allocation" was chosen + for the allocation time when the dataset was created. +
    4. + +
    5. Allocate & Fill Space - These operations allocate both contiguous + and chunked dataset's space in the file. The chunked dataset space + allocation iterates through all the chunks in the file and allocates + both the B-tree information and the raw data in the file. Because of + the way filters work, fill-values are written out for chunked datasets + as they are allocated, instead of as a separate step. + In parallel + I/O, the chunked dataset allocation can potentially be time-consuming, + since all the raw data in the dataset is allocated from one process. +
    6. + +
    7. Datatype Conversion Needed? - This currently is the deciding + factor between doing "direct I/O" (in serial or parallel) and needing + to perform gather/convert/scatter operations. I believe that MPI + is capable of performing a limited range of type conversions and if so, + we should add support to detect when they can be used. This will + allow more I/O operations to be performed collectively. +
    8. + +
    9. Collective I/O Requested/Allowed? - A user has to both request + that collective I/O occur and also their I/O operation must meet the + requirements that the library sets for supporting collective parallel + I/O: +
        +
      • The dataspace must be scalar or simple (which is a no-op really, + since we don't support "complex" dataspaces in the library + currently). +
      • +
      • The selection must be regular. "all" selections + and hyperslab selections that were + made with only one call to H5Sselect_hyperslab() (i.e. not a + hyperslab selection that has been aggregated over multiple + selection calls) are regular. Supporting point and + irregular hyperslab selections are on the "to do" list. +
      • +
      • The dataset must be stored contiguously on disk (as shown in the + figure also). Supporting chunked dataset storage is also + on the "to do" list. +
      • +
      +
    10. + +
    11. Build "chunk map" - This step still has some scalability issues + as it creates a data structure that is proportional to the number of + chunks which will be written to, which could potentially be very large. + Building the "chunk map" information incrementally is on the "to do" + list also. +
    12. + +
    13. Perform Chunked I/O - As the figure shows, there is no support + for collective parallel I/O on chunked datasets currently. As noted + earlier, this is on the "to do" list. +
    14. + +
    15. Perform "Direct" Serial I/O - "Direct" serial I/O writes data + from the application's buffer, without any intervening buffer or memory + copies. For maximum efficiency and performance, the elements in the + selections should be adjoining. +
    16. + +
    17. Perform Collective Parallel I/O - This step also writes data + directly from an application buffer, but additionally uses collective + MPI I/O operations to combine the data from each process in the parallel + application in an efficient manner. +
    18. +
    + +
+ + + + diff --git a/doxygen/hdf5doxy_layout.xml b/doxygen/hdf5doxy_layout.xml index fc20aa1..6efa690 100644 --- a/doxygen/hdf5doxy_layout.xml +++ b/doxygen/hdf5doxy_layout.xml @@ -12,6 +12,7 @@ + diff --git a/doxygen/img/IOFlow.gif b/doxygen/img/IOFlow.gif new file mode 100644 index 0000000..3e79030 Binary files /dev/null and b/doxygen/img/IOFlow.gif differ diff --git a/doxygen/img/IOFlow2.gif b/doxygen/img/IOFlow2.gif new file mode 100644 index 0000000..c75ca79 Binary files /dev/null and b/doxygen/img/IOFlow2.gif differ diff --git a/doxygen/img/IOFlow3.gif b/doxygen/img/IOFlow3.gif new file mode 100644 index 0000000..316cd1e Binary files /dev/null and b/doxygen/img/IOFlow3.gif differ -- cgit v0.12