From 1d680fe04c545d601678d876ad22dea7bfc31edd Mon Sep 17 00:00:00 2001 From: Gerd Heber Date: Mon, 26 Apr 2021 14:07:29 -0500 Subject: Merge doxygen2 into develop (#553) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Fixed warnings and started H5Epublic.h. * Include H5FD* headers to correctly resolve references. * Doxygen2 (#330) * H5Eauto_is_v2. * Added a few more calls. * Added a few more H5E calls. * First cut of H5E v2. * Added the deprecated v1 calls. * Updated spacing. * Once more. * Taking some inspiration from Eigen3. * Add doxygen for the assigned functions: H5Pregister1,H5Pinsert1,H5Pen… (#352) * Add doxygen for the assigned functions: H5Pregister1,H5Pinsert1,H5Pencode1, H5Pget_filter_by_id1,H5Pget_version, H5Pset_file_space,H5Pget_file_space. Someone already added H5Pget_filter1. Also fixes an extra parameter 'close' callback function for H5Pregister2. * doxygen work. fixes format by using clang-format. * doxygen work for H5Pregister1 etc. Addressed Barbara and Gerd's comments. For Quincey's comments, since we are not supposed to change the source code, I leave this to future improvements. * added documentation for H5P APIs (#350) * add documentation for H5Pget_buffer,H5Pget_data_transform,H5Pget_edc_check,H5Pget_hyper_vector_size,H5Pget_preserve,H5Pget_type_conv_cb,H5Pget_vlen_mem_manager,H5Pset_btree_ratios * format corrections * fixed grammar * fixed herr_t * Better name. * A fresh look.
* add doxygen to H5Ppublic.h * use attention instead of warning * Add doxygen comments in H5Ppublic.h (#375) * Add doxygen comments in H5Ppublic.h * H5Pset_meta_block_size * H5Pset_metadata_read_attempts * H5Pset_multi_type * H5Pset_object_flush_cb * H5Pset_sieve_buf_size * H5Pset_small_data_block_size * H5Pset_all_coll_metadata_ops * H5Pget_all_coll_metadata_ops * Add DOXYGEN_EXAMPLES_DIR to src/CMakeLists.txt * Fix clang-format errors * Fix filenames in doxygen/examples * add doxygen to H5Ppublic.h (#378) * add doxygen to H5Ppublic.h * use attention instead of warning Co-authored-by: Kimmy Mu * Revert "add doxygen to H5Ppublic.h (#378)" This reverts commit 2ee1821b138a5c00b15ea57ce9e950367480f5f2. * Updated Doxygen variables. * I forgot to copy two images. * Enable desktop search by default. * Add my assigned Doxygen documentation. * Remove whitespace at EOL. Appease clang-format. * Addressed Chris' comments. * Added an alias for asynchronous functions. * One space is enough for all of us. * Slightly restructured RM page. * address some issues * reformatting * Style external links. * reformatting * reformatting * Added "Metadata Caching in HDF5" as a technical note example. * Revise this soon! * Added specification examples. * Fixed references. * Added H5AC cache image stuff and file format study. * Added older FMT versions. Where did 1.0 go? * Updated C/C++ note and replaced ambiguous labels. * Reformat source with clang v10.0.1. * Added the VFL technical note. * Added what I believe might be called version 1.0 of the format. * Added the remaining specs. * Added H5Z callback documentation and fixed a few mistakes. * Added dox for deprecated H5G calls and fixed a few snippet blockIDs. * clang-format happy? * Ok? * Bonus track: Deprecated H5D functions. * Carry over the more detailed group description. * Added documentation for the missing and deprecated H5R calls. * Life is easier and less repetitive w/ snippets. Use them! 
* Eliminate the snippet block ID artifacts in the HTML rendering. * Fixed snippet HTML artifacts and added a few missing calls. * Under 20 H5Ps to go! * Almost complete! * "This is a form of pedantry up with which I will not put." (Churchill) * Let's not waste as much space on bulleted lists! * First complete (?) draft of the Doxygen-based RM. * Completeness check and minor fixes along the way. * Pedantry. * Adding missing H5FD calls checkpoint. * Pedantry. * More pedantry. * Added H5Pset_fapl_log. * First draft of H5ES. * Fixed warnings. * Prep. for map module. * First cut of the map module. * Pedantry. * Possible H5F introduction. * Fix the indentation. * Pedantry. * Ditto. * Thanks to the reviewers for their comments. * Added missing images. * Line numbers are a distraction here. * More examples, references, and clean-up. Don't repeat yourself! * Clang pedantry. * Ditto. * More reviewer comments... * Templatized references and cleaned up \todos. * Committing clang-format changes * Fixed MANIFEST. * Addressed Quincey's comments. (OCPLs) * Fixed a few more \todo items. * Fixed more \todo items. * Added attribute life cycle. * Forgot the examples file. * Committing clang-format changes * Pedantry. * Live and learn! * Added a sample H5D life cycle. * Committing clang-format changes * Pedantry. 
Co-authored-by: kyang2014 Co-authored-by: Scot Breitenfeld Co-authored-by: Kimmy Mu Co-authored-by: Christopher Hogan Co-authored-by: jya-kmu <53388330+jya-kmu@users.noreply.github.com> Co-authored-by: David Young Co-authored-by: Larry Knox Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com> --- MANIFEST | 50 +- configure.ac | 24 +- doxygen/Doxyfile.in | 35 +- doxygen/aliases | 63 +- doxygen/dox/About.dox | 11 + doxygen/dox/Cookbook.dox | 5 + doxygen/dox/DDLBNF110.dox | 650 + doxygen/dox/DDLBNF112.dox | 653 + doxygen/dox/FileFormatSpec.dox | 23 + doxygen/dox/GettingStarted.dox | 3 + doxygen/dox/H5Fget_info.dox | 7 +- doxygen/dox/H5Lget_info.dox | 5 +- doxygen/dox/H5Lget_info_by_idx.dox | 5 +- doxygen/dox/H5Literate.dox | 4 +- doxygen/dox/H5Literate_by_name.dox | 4 +- doxygen/dox/H5Lvisit.dox | 4 +- doxygen/dox/H5Lvisit_by_name.dox | 4 +- doxygen/dox/MetadataCachingInHDF5.dox | 1020 + doxygen/dox/OtherSpecs.dox | 11 + doxygen/dox/Overview.dox | 32 + doxygen/dox/ReferenceManual.dox | 43 + doxygen/dox/Specifications.dox | 22 + doxygen/dox/TechnicalNotes.dox | 20 + doxygen/dox/api-compat-macros.dox | 1 - doxygen/dox/mainpage.dox | 44 - doxygen/dox/maybe_metadata_reads.dox | 82 + doxygen/examples/FF-IH_FileGroup.gif | Bin 0 -> 3407 bytes doxygen/examples/FF-IH_FileObject.gif | Bin 0 -> 2136 bytes doxygen/examples/FileFormatSpecChunkDiagram.jpg | Bin 0 -> 29237 bytes doxygen/examples/H5.format.1.0.html | 4050 ++++ doxygen/examples/H5.format.1.1.html | 6439 ++++++ doxygen/examples/H5.format.2.0.html | 14902 ++++++++++++++ doxygen/examples/H5.format.html | 20400 +++++++++++++++++++ doxygen/examples/H5A_examples.c | 145 + doxygen/examples/H5D_examples.c | 173 + doxygen/examples/H5F_examples.c | 187 + doxygen/examples/H5Pget_metadata_read_attempts.1.c | 22 + doxygen/examples/H5Pget_metadata_read_attempts.2.c | 44 + doxygen/examples/H5Pget_metadata_read_attempts.3.c | 44 + doxygen/examples/H5Pget_object_flush_cb.c | 41 + 
doxygen/examples/H5Pset_metadata_read_attempts.c | 59 + doxygen/examples/H5Pset_object_flush_cb.c | 41 + doxygen/examples/ImageSpec.html | 1203 ++ doxygen/examples/PaletteExample1.gif | Bin 0 -> 2731 bytes doxygen/examples/Palettes.fm.anc.gif | Bin 0 -> 4748 bytes doxygen/examples/TableSpec.html | 193 + doxygen/examples/ThreadSafeLibrary.html | 787 + doxygen/examples/VFL.html | 1601 ++ doxygen/hdf5_footer.html | 21 + doxygen/hdf5_header.html | 61 + doxygen/hdf5_navtree_hacks.js | 246 + doxygen/hdf5doxy.css | 251 + doxygen/hdf5doxy_layout.xml | 182 + doxygen/img/FF-IH_FileGroup.gif | Bin 0 -> 3407 bytes doxygen/img/FF-IH_FileObject.gif | Bin 0 -> 2136 bytes doxygen/img/FileFormatSpecChunkDiagram.jpg | Bin 0 -> 29237 bytes doxygen/img/HDFG-logo.png | Bin 4541 -> 1689 bytes doxygen/img/PaletteExample1.gif | Bin 0 -> 2731 bytes doxygen/img/Palettes.fm.anc.gif | Bin 0 -> 4748 bytes doxygen/img/ftv2node.png | Bin 0 -> 86 bytes doxygen/img/ftv2pnode.png | Bin 0 -> 229 bytes src/CMakeLists.txt | 1 + src/H5ACpublic.h | 343 +- src/H5Amodule.h | 40 +- src/H5Apublic.h | 193 +- src/H5Cpublic.h | 23 +- src/H5Dmodule.h | 38 +- src/H5Dpublic.h | 415 +- src/H5ESmodule.h | 33 + src/H5ESpublic.h | 167 +- src/H5Emodule.h | 29 + src/H5Epublic.h | 788 +- src/H5FDcore.h | 64 +- src/H5FDdirect.h | 63 +- src/H5FDfamily.h | 52 +- src/H5FDhdfs.h | 14 +- src/H5FDlog.h | 405 +- src/H5FDmirror.h | 14 +- src/H5FDmpi.h | 8 +- src/H5FDmpio.h | 225 +- src/H5FDmulti.h | 219 +- src/H5FDpublic.h | 88 +- src/H5FDros3.h | 14 +- src/H5FDsplitter.h | 14 +- src/H5FDstdio.h | 16 +- src/H5FDwindows.h | 30 + src/H5Fmodule.h | 30 +- src/H5Fpublic.h | 588 +- src/H5Gmodule.h | 91 +- src/H5Gpublic.h | 826 +- src/H5Ipublic.h | 22 +- src/H5Lmodule.h | 2 + src/H5Lpublic.h | 76 +- src/H5MMpublic.h | 5 + src/H5Mmodule.h | 45 + src/H5Mpublic.h | 373 +- src/H5Opublic.h | 227 +- src/H5PLpublic.h | 6 +- src/H5Pmodule.h | 5 +- src/H5Ppublic.h | 3246 ++- src/H5Rpublic.h | 492 +- src/H5Spublic.h | 8 +- src/H5Tmodule.h | 4 - 
src/H5Tpublic.h | 114 +- src/H5VLconnector.h | 9 +- src/H5VLmodule.h | 4 + src/H5VLpublic.h | 14 +- src/H5Zmodule.h | 8 +- src/H5Zpublic.h | 145 +- src/H5public.h | 150 +- 110 files changed, 61921 insertions(+), 1782 deletions(-) create mode 100644 doxygen/dox/About.dox create mode 100644 doxygen/dox/Cookbook.dox create mode 100644 doxygen/dox/DDLBNF110.dox create mode 100644 doxygen/dox/DDLBNF112.dox create mode 100644 doxygen/dox/FileFormatSpec.dox create mode 100644 doxygen/dox/GettingStarted.dox create mode 100644 doxygen/dox/MetadataCachingInHDF5.dox create mode 100644 doxygen/dox/OtherSpecs.dox create mode 100644 doxygen/dox/Overview.dox create mode 100644 doxygen/dox/ReferenceManual.dox create mode 100644 doxygen/dox/Specifications.dox create mode 100644 doxygen/dox/TechnicalNotes.dox delete mode 100644 doxygen/dox/mainpage.dox create mode 100644 doxygen/dox/maybe_metadata_reads.dox create mode 100644 doxygen/examples/FF-IH_FileGroup.gif create mode 100644 doxygen/examples/FF-IH_FileObject.gif create mode 100644 doxygen/examples/FileFormatSpecChunkDiagram.jpg create mode 100644 doxygen/examples/H5.format.1.0.html create mode 100644 doxygen/examples/H5.format.1.1.html create mode 100644 doxygen/examples/H5.format.2.0.html create mode 100644 doxygen/examples/H5.format.html create mode 100644 doxygen/examples/H5A_examples.c create mode 100644 doxygen/examples/H5D_examples.c create mode 100644 doxygen/examples/H5F_examples.c create mode 100644 doxygen/examples/H5Pget_metadata_read_attempts.1.c create mode 100644 doxygen/examples/H5Pget_metadata_read_attempts.2.c create mode 100644 doxygen/examples/H5Pget_metadata_read_attempts.3.c create mode 100644 doxygen/examples/H5Pget_object_flush_cb.c create mode 100644 doxygen/examples/H5Pset_metadata_read_attempts.c create mode 100644 doxygen/examples/H5Pset_object_flush_cb.c create mode 100644 doxygen/examples/ImageSpec.html create mode 100644 doxygen/examples/PaletteExample1.gif create mode 100644 
doxygen/examples/Palettes.fm.anc.gif create mode 100644 doxygen/examples/TableSpec.html create mode 100644 doxygen/examples/ThreadSafeLibrary.html create mode 100644 doxygen/examples/VFL.html create mode 100644 doxygen/hdf5_footer.html create mode 100644 doxygen/hdf5_header.html create mode 100644 doxygen/hdf5_navtree_hacks.js create mode 100644 doxygen/hdf5doxy.css create mode 100644 doxygen/hdf5doxy_layout.xml create mode 100644 doxygen/img/FF-IH_FileGroup.gif create mode 100644 doxygen/img/FF-IH_FileObject.gif create mode 100644 doxygen/img/FileFormatSpecChunkDiagram.jpg create mode 100644 doxygen/img/PaletteExample1.gif create mode 100644 doxygen/img/Palettes.fm.anc.gif create mode 100644 doxygen/img/ftv2node.png create mode 100644 doxygen/img/ftv2pnode.png diff --git a/MANIFEST b/MANIFEST index 9b0ce1c..f7a4983 100644 --- a/MANIFEST +++ b/MANIFEST @@ -207,7 +207,12 @@ ./doxygen/aliases ./doxygen/Doxyfile.in -./doxygen/dox/api-compat-macros.dox +./doxygen/dox/About.dox +./doxygen/dox/Cookbook.dox +./doxygen/dox/DDLBNF110.dox +./doxygen/dox/DDLBNF112.dox +./doxygen/dox/FileFormatSpec.dox +./doxygen/dox/GettingStarted.dox ./doxygen/dox/H5AC_cache_config_t.dox ./doxygen/dox/H5Acreate.dox ./doxygen/dox/H5Aiterate.dox @@ -224,12 +229,53 @@ ./doxygen/dox/H5Ovisit_by_name.dox ./doxygen/dox/H5Ovisit.dox ./doxygen/dox/H5Sencode.dox -./doxygen/dox/mainpage.dox +./doxygen/dox/MetadataCachingInHDF5.dox +./doxygen/dox/OtherSpecs.dox +./doxygen/dox/Overview.dox +./doxygen/dox/ReferenceManual.dox +./doxygen/dox/Specifications.dox +./doxygen/dox/TechnicalNotes.dox +./doxygen/dox/api-compat-macros.dox +./doxygen/dox/maybe_metadata_reads.dox ./doxygen/dox/rm-template.dox +./doxygen/examples/FF-IH_FileGroup.gif +./doxygen/examples/FF-IH_FileObject.gif +./doxygen/examples/FileFormatSpecChunkDiagram.jpg +./doxygen/examples/H5Pset_metadata_read_attempts.c +./doxygen/examples/H5Pset_object_flush_cb.c +./doxygen/examples/H5.format.1.0.html +./doxygen/examples/H5.format.1.1.html 
+./doxygen/examples/H5.format.2.0.html +./doxygen/examples/H5.format.html +./doxygen/examples/H5A_examples.c +./doxygen/examples/H5D_examples.c ./doxygen/examples/H5Fclose.c ./doxygen/examples/H5Fcreate.c +./doxygen/examples/H5F_examples.c +./doxygen/examples/H5Pget_metadata_read_attempts.1.c +./doxygen/examples/H5Pget_metadata_read_attempts.2.c +./doxygen/examples/H5Pget_metadata_read_attempts.3.c +./doxygen/examples/H5Pget_object_flush_cb.c +./doxygen/examples/ImageSpec.html +./doxygen/examples/PaletteExample1.gif +./doxygen/examples/Palettes.fm.anc.gif +./doxygen/examples/TableSpec.html +./doxygen/examples/ThreadSafeLibrary.html +./doxygen/examples/VFL.html ./doxygen/examples/hello_hdf5.c +./doxygen/hdf5_footer.html +./doxygen/hdf5_header.html +./doxygen/hdf5_navtree_hacks.js +./doxygen/hdf5doxy.css +./doxygen/hdf5doxy_layout.xml +./doxygen/img/FF-IH_FileGroup.gif +./doxygen/img/FF-IH_FileObject.gif +./doxygen/img/FileFormatSpecChunkDiagram.jpg ./doxygen/img/HDFG-logo.png +./doxygen/img/PaletteExample1.gif +./doxygen/img/Palettes.fm.anc.gif +./doxygen/img/ftv2node.png +./doxygen/img/ftv2pnode.png ./examples/Attributes.txt ./examples/Makefile.am diff --git a/configure.ac b/configure.ac index d769ddc..b6694ad 100644 --- a/configure.ac +++ b/configure.ac @@ -95,12 +95,12 @@ AC_CONFIG_COMMANDS([pubconf], [ sed 's/#define /#define H5_/' pubconf if test ! 
-f src/H5pubconf.h; then - /bin/mv -f pubconf src/H5pubconf.h + mv -f pubconf src/H5pubconf.h elif (diff pubconf src/H5pubconf.h >/dev/null); then rm -f pubconf echo "src/H5pubconf.h is unchanged" else - /bin/mv -f pubconf src/H5pubconf.h + mv -f pubconf src/H5pubconf.h fi echo "Post process src/libhdf5.settings" sed '/^#/d' < src/libhdf5.settings > libhdf5.settings.TMP @@ -1116,16 +1116,34 @@ if test "X$HDF5_DOXYGEN" = "Xyes"; then AC_SUBST([DOXYGEN_OPTIMIZE_OUTPUT_FOR_C]) AC_SUBST([DOXYGEN_MACRO_EXPANSION]) AC_SUBST([DOXYGEN_OUTPUT_DIRECTORY]) + AC_SUBST([DOXYGEN_EXAMPLES_DIRECTORY]) + AC_SUBST([DOXYGEN_LAYOUT_FILE]) + AC_SUBST([DOXYGEN_HTML_HEADER]) + AC_SUBST([DOXYGEN_HTML_FOOTER]) + AC_SUBST([DOXYGEN_HTML_EXTRA_STYLESHEET]) + AC_SUBST([DOXYGEN_HTML_EXTRA_FILES]) + AC_SUBST([DOXYGEN_SERVER_BASED_SEARCH]) + AC_SUBST([DOXYGEN_EXTERNAL_SEARCH]) + AC_SUBST([DOXYGEN_SEARCHENGINE_URL]) DOXYGEN_PACKAGE=${PACKAGE_NAME} DOXYGEN_VERSION_STRING=${PACKAGE_VERSION} DOXYGEN_INCLUDE_ALIASES='$(SRCDIR)/doxygen/aliases' DOXYGEN_PROJECT_LOGO='$(SRCDIR)/doxygen/img/HDFG-logo.png' - DOXYGEN_PROJECT_BRIEF="C-API Reference" + DOXYGEN_PROJECT_BRIEF= DOXYGEN_INPUT_DIRECTORY='$(SRCDIR) $(SRCDIR)/doxygen/dox' DOXYGEN_OPTIMIZE_OUTPUT_FOR_C=YES DOXYGEN_MACRO_EXPANSION=YES DOXYGEN_OUTPUT_DIRECTORY=hdf5lib_docs + DOXYGEN_EXAMPLES_DIRECTORY='$(SRCDIR)/doxygen/examples' + DOXYGEN_LAYOUT_FILE='$(SRCDIR)/doxygen/hdf5doxy_layout.xml' + DOXYGEN_HTML_HEADER='$(SRCDIR)/doxygen/hdf5_header.html' + DOXYGEN_HTML_FOOTER='$(SRCDIR)/doxygen/hdf5_footer.html' + DOXYGEN_HTML_EXTRA_STYLESHEET='$(SRCDIR)/doxygen/hdf5doxy.css' + DOXYGEN_HTML_EXTRA_FILES='$(SRCDIR)/doxygen/hdf5_navtree_hacks.js $(SRCDIR)/doxygen/img/ftv2node.png $(SRCDIR)/doxygen/img/ftv2pnode.png' + DOXYGEN_SERVER_BASED_SEARCH=NO + DOXYGEN_EXTERNAL_SEARCH=NO + DOXYGEN_SEARCHENGINE_URL= DX_INIT_DOXYGEN([HDF5], [../doxygen/Doxyfile], [hdf5lib_docs]) diff --git a/doxygen/Doxyfile.in b/doxygen/Doxyfile.in index 2395d6c..b1cb955 100644 --- 
a/doxygen/Doxyfile.in +++ b/doxygen/Doxyfile.in @@ -738,7 +738,7 @@ FILE_VERSION_FILTER = # DoxygenLayout.xml, doxygen will parse it automatically even if the LAYOUT_FILE # tag is left empty. -LAYOUT_FILE = +LAYOUT_FILE = @DOXYGEN_LAYOUT_FILE@ # The CITE_BIB_FILES tag can be used to specify one or more bib files containing # the reference definitions. This must be a list of .bib files. The .bib @@ -855,7 +855,16 @@ INPUT_ENCODING = UTF-8 FILE_PATTERNS = H5*public.h \ H5*module.h \ + H5FDcore.h \ + H5FDdirect.h \ + H5FDfamily.h \ + H5FDlog.h \ + H5FDmpi.h \ H5FDmpio.h \ + H5FDmulti.h \ + H5FDsec2.h \ + H5FDstdio.h \ + H5FDwindows.h \ H5VLconnector.h \ H5VLconnector_passthru.h \ H5VLnative.h \ @@ -908,7 +917,7 @@ EXCLUDE_SYMBOLS = # that contain example code fragments that are included (see the \include # command). -EXAMPLE_PATH = ../src ../examples ../test examples +EXAMPLE_PATH = ../src ../examples ../test @DOXYGEN_EXAMPLES_DIRECTORY@ # If the value of the EXAMPLE_PATH tag contains directories, you can use the # EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp and @@ -1169,7 +1178,7 @@ HTML_FILE_EXTENSION = .html # of the possible markers and block names see the documentation. # This tag requires that the tag GENERATE_HTML is set to YES. -HTML_HEADER = +HTML_HEADER = @DOXYGEN_HTML_HEADER@ # The HTML_FOOTER tag can be used to specify a user-defined HTML footer for each # generated HTML page. If the tag is left blank doxygen will generate a standard @@ -1179,7 +1188,7 @@ HTML_HEADER = # that doxygen normally uses. # This tag requires that the tag GENERATE_HTML is set to YES. -HTML_FOOTER = +HTML_FOOTER = @DOXYGEN_HTML_FOOTER@ # The HTML_STYLESHEET tag can be used to specify a user-defined cascading style # sheet that is used by each HTML page. It can be used to fine-tune the look of @@ -1204,7 +1213,7 @@ HTML_STYLESHEET = # list). For an example see the documentation. # This tag requires that the tag GENERATE_HTML is set to YES. 
-HTML_EXTRA_STYLESHEET = +HTML_EXTRA_STYLESHEET = @DOXYGEN_HTML_EXTRA_STYLESHEET@ # The HTML_EXTRA_FILES tag can be used to specify one or more extra images or # other source files which should be copied to the HTML output directory. Note @@ -1214,7 +1223,7 @@ HTML_EXTRA_STYLESHEET = # files will be copied as-is; there are no commands or markers available. # This tag requires that the tag GENERATE_HTML is set to YES. -HTML_EXTRA_FILES = +HTML_EXTRA_FILES = @DOXYGEN_HTML_EXTRA_FILES@ # The HTML_COLORSTYLE_HUE tag controls the color of the HTML output. Doxygen # will adjust the colors in the style sheet and background images according to @@ -1272,7 +1281,7 @@ HTML_DYNAMIC_MENUS = NO # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. -HTML_DYNAMIC_SECTIONS = NO +HTML_DYNAMIC_SECTIONS = YES # With HTML_INDEX_NUM_ENTRIES one can control the preferred number of entries # shown in the various tree structured indices initially; the user can expand @@ -1484,7 +1493,7 @@ ECLIPSE_DOC_ID = org.doxygen.Project # The default value is: NO. # This tag requires that the tag GENERATE_HTML is set to YES. -DISABLE_INDEX = NO +DISABLE_INDEX = YES # The GENERATE_TREEVIEW tag is used to specify whether a tree-like index # structure should be generated to display hierarchical information. If the tag @@ -1632,7 +1641,7 @@ MATHJAX_CODEFILE = # The default value is: YES. # This tag requires that the tag GENERATE_HTML is set to YES. -SEARCHENGINE = NO +SEARCHENGINE = YES # When the SERVER_BASED_SEARCH tag is enabled the search engine will be # implemented using a web server instead of a web client using JavaScript. There @@ -1644,7 +1653,7 @@ SEARCHENGINE = NO # The default value is: NO. # This tag requires that the tag SEARCHENGINE is set to YES. -SERVER_BASED_SEARCH = YES +SERVER_BASED_SEARCH = @DOXYGEN_SERVER_BASED_SEARCH@ # When EXTERNAL_SEARCH tag is enabled doxygen will no longer generate the PHP # script for searching. 
Instead the search results are written to an XML file @@ -1660,7 +1669,7 @@ SERVER_BASED_SEARCH = YES # The default value is: NO. # This tag requires that the tag SEARCHENGINE is set to YES. -EXTERNAL_SEARCH = NO +EXTERNAL_SEARCH = @DOXYGEN_EXTERNAL_SEARCH@ # The SEARCHENGINE_URL should point to a search engine hosted by a web server # which will return the search results when EXTERNAL_SEARCH is enabled. @@ -1671,7 +1680,7 @@ EXTERNAL_SEARCH = NO # Searching" for details. # This tag requires that the tag SEARCHENGINE is set to YES. -SEARCHENGINE_URL = +SEARCHENGINE_URL = @DOXYGEN_SEARCHENGINE_URL@ # When SERVER_BASED_SEARCH and EXTERNAL_SEARCH are both enabled the unindexed # search data is written to a file for indexing by an external tool. With the @@ -2168,7 +2177,7 @@ INCLUDE_FILE_PATTERNS = # recursively expanded use the := operator instead of the = operator. # This tag requires that the tag ENABLE_PREPROCESSING is set to YES. -PREDEFINED = +PREDEFINED = H5_HAVE_PARALLEL # If the MACRO_EXPANSION and EXPAND_ONLY_PREDEF tags are set to YES then this # tag can be used to specify a list of macro names that should be expanded. 
The diff --git a/doxygen/aliases b/doxygen/aliases index 2255c9b..af43902 100644 --- a/doxygen/aliases +++ b/doxygen/aliases @@ -1,3 +1,5 @@ +ALIASES += THG="The HDF Group" + ################################################################################ # Styling ################################################################################ @@ -35,6 +37,15 @@ ALIASES += op{1}="\param[in] \1 Callback function" ALIASES += op_data="\param[in,out] op_data User-defined callback function context" ALIASES += op_data{1}="\param[in,out] \1 User-defined callback function context" +ALIASES += op_data_in="\param[in] op_data User-defined callback function context" +ALIASES += op_data_in{1}="\param[in] \1 User-defined callback function context" + +################################################################################ +# Asynchronous +################################################################################ + +ALIASES += async_variant_of{1}="Asynchronous version of \1()" + ################################################################################ # Attributes ################################################################################ @@ -67,6 +78,13 @@ ALIASES += file_type_id{1}="\param[in] \1 Datatype (in-file) identifier" ALIASES += mem_type_id{1}="\param[in] \1 Datatype (in-memory) identifier" ################################################################################ +# Errors +################################################################################ + +ALIASES += estack_id="\param[in] estack_id Error stack identifier" +ALIASES += estack_id{1}="\param[in] \1 Error stack identifier" + +################################################################################ # Files ################################################################################ @@ -103,6 +121,13 @@ ALIASES += fg_loc_id="\loc_id. The identifier may be that of a file or group." ALIASES += fg_loc_id{1}="\loc_id{\1}. 
The identifier may be that of a file or group." ################################################################################ +# Maps +################################################################################ + +ALIASES += map_id="\param[in] map_id Map identifier" +ALIASES += map_id{1}="\param[in] \1 Map identifier" + +################################################################################ # Property lists ################################################################################ @@ -121,6 +146,9 @@ ALIASES += dcpl_id{1}="\param[in] \1 Dataset creation property list identifier" ALIASES += dxpl_id="\param[in] dxpl_id Dataset transfer property list identifier" ALIASES += dxpl_id{1}="\param[in] \1 Dataset transfer property list identifier" +ALIASES += gacpl_id="\param[in] plist_id File, group, dataset, datatype, link, or attribute access property list identifier" +ALIASES += gacpl_id{1}="\param[in] \1 File, group, dataset, datatype, link, or attribute access property list identifier" + ALIASES += gapl_id="\param[in] gapl_id Group access property list identifier" ALIASES += gapl_id{1}="\param[in] \1 Group access property list identifier" @@ -133,9 +161,18 @@ ALIASES += lapl_id{1}="\param[in] \1 Link access property list identifier" ALIASES += lcpl_id="\param[in] lcpl_id Link creation property list identifier" ALIASES += lcpl_id{1}="\param[in] \1 Link creation property list identifier" +ALIASES += mapl_id="\param[in] mapl_id Map access property list identifier" +ALIASES += mapl_id{1}="\param[in] \1 Map access property list identifier" + +ALIASES += mcpl_id="\param[in] mcpl_id Map creation property list identifier" +ALIASES += mcpl_id{1}="\param[in] \1 Map creation property list identifier" + ALIASES += oapl_id="\param[in] oapl_id Object access property list identifier" ALIASES += oapl_id{1}="\param[in] \1 Object access property list identifier" +ALIASES += ocpl_id="\param[in] ocpl_id Object creation property list identifier" +ALIASES += 
ocpl_id{1}="\param[in] \1 Object creation property list identifier" + ALIASES += plist_id="\param[in] plist_id Property list identifier" ALIASES += plist_id{1}="\param[in] \1 Property list identifier" @@ -173,7 +210,8 @@ ALIASES += fgdta_loc_obj_id{1}="\loc_obj_id{\1}. The identifier may be that of a ALIASES += app_file="\param[in] app_file For internal use only, not a visible user parameter" ALIASES += app_func="\param[in] app_func For internal use only, not a visible user parameter" ALIASES += app_line="\param[in] app_line For internal use only, not a visible user parameter" -ALIASES += es_id="\param[in] es_id The event set ID to add this asynchronous operation to. H5ES_NONE may be used for synchronous execution." +ALIASES += es_id="\param[in] es_id Event set identifier" +ALIASES += es_id{1}="\param[in] \1 Event set identifier" ################################################################################ # Others @@ -181,11 +219,32 @@ ALIASES += es_id="\param[in] es_id The event set ID to add this asynchronous ope ALIASES += estack_id="\param[in] estack_id Error stack identifier" ALIASES += estack_id{1}="\param[in] \1 Error stack identifier" +ALIASES += cpp_c_api_note="\attention \Bold{C++ Developers using HDF5 C-API functions beware:}\n Several functions in this C-API take function pointers or callbacks as arguments. Examples include H5Pset_elink_cb(), H5Pset_type_conv_cb(), H5Tconvert(), and H5Ewalk2(). Application code must ensure that those callback functions return normally so that the HDF5 library can manage its resources and maintain a consistent state. For instance, those functions must not use the C \c setjmp / \c longjmp mechanism to leave those callback functions. Within the context of C++, any exceptions thrown within the callback function must be caught, such as with a \Code{catch(…)} statement. Any exception state can be placed within the provided user data function call arguments, and may be thrown again once the calling function has returned. 
Exceptions raised and not handled inside the callback are not supported, as they might leave the HDF5 library in an inconsistent state. Similarly, C++20 coroutines cannot be used as callbacks, since they do not support plain return statements. If a callback function yields execution to another C++20 coroutine calling HDF5 functions as well, this may lead to undefined behavior." +ALIASES += sa_metadata_ops="\sa \li H5Pget_all_coll_metadata_ops() \li H5Pget_coll_metadata_write() \li H5Pset_all_coll_metadata_ops() \li H5Pset_coll_metadata_write() \li \ref maybe_metadata_reads" + +################################################################################ +# References +################################################################################ + +ALIASES += ref_cons_semantics="Enabling a Strict Consistency Semantics Model in Parallel HDF5" +ALIASES += ref_dld_filters="HDF5 Dynamically Loaded Filters" +ALIASES += ref_file_image_ops="HDF5 File Image Operations" +ALIASES += ref_filter_pipe="Data Flow Pipeline for H5Dread()" +ALIASES += ref_group_impls="Group implementations in HDF5" +ALIASES += ref_h5lib_relver="HDF5 Library Release Version Numbers" +ALIASES += ref_mdc_in_hdf5="Metadata Caching in HDF5" +ALIASES += ref_mdc_logging="Metadata Cache Logging" +ALIASES += ref_news_112="New Features in HDF5 Release 1.12" +ALIASES += ref_h5ocopy="Copying Committed Datatypes with H5Ocopy()" +ALIASES += ref_sencode_fmt_change="RFC H5Sencode() / H5Sdecode() Format Change" +ALIASES += ref_vlen_strings="\Emph{Creating variable-length string datatypes}" +ALIASES += ref_vol_doc="VOL documentation" ################################################################################ # The Usual Suspects ################################################################################ +ALIASES += click4more="(Click on an enumerator, field, or type for more information.)" ALIASES += csets="
#H5T_CSET_ASCII US ASCII
#H5T_CSET_UTF8 UTF-8 Unicode encoding
" ALIASES += datatype_class=" \li #H5T_INTEGER \li #H5T_FLOAT \li #H5T_STRING \li #H5T_BITFIELD \li #H5T_OPAQUE \li #H5T_COMPOUND \li #H5T_REFERENCE \li #H5T_ENUM \li #H5T_VLEN \li #H5T_ARRAY" ALIASES += file_access="
#H5F_ACC_RDWR File was opened with read/write access.
#H5F_ACC_RDONLY File was opened with read-only access.
#H5F_ACC_SWMR_WRITE File was opened with read/write access for a single-writer/multiple-reader (SWMR) scenario. Note that the writer process must also open the file with the #H5F_ACC_RDWR flag.
#H5F_ACC_SWMR_READ File was opened with read-only access for a single-writer/multiple-reader (SWMR) scenario. Note that the reader process must also open the file with the #H5F_ACC_RDONLY flag.
" @@ -201,5 +260,5 @@ ALIASES += scopes=" + * + * + * + * + * + * + * + * + *
#H5F_SCOPE_GLOBAL Flushes the entire v ALIASES += sign_prop="
#H5T_SGN_NONE 0 Unsigned integer type
#H5T_SGN_2 1 Two's complement signed integer type
" ALIASES += storage_type="
#H5G_STORAGE_TYPE_COMPACT Compact storage
#H5G_STORAGE_TYPE_DENSE Indexed storage
#H5G_STORAGE_TYPE_SYMBOL_TABLE Symbol tables, the original HDF5 structure
" ALIASES += str_pad_type="
#H5T_STR_NULLTERM 0 Null terminate (as C does)
#H5T_STR_NULLPAD 1 Pad with zeros
#H5T_STR_SPACEPAD 2 Pad with spaces (as FORTRAN does)
" -ALIASES += virtual=" \see Supporting Functions: \li H5Pget_layout() \li H5Pset_layout() \li H5Sget_regular_hyperslab() \li H5Sis_regular_hyperslab() \li H5Sselect_hyperslab() \see VDS Functions: \li H5Pget_virtual_count() \li H5Pget_virtual_dsetname() \li H5Pget_virtual_filename() \li H5Pget_virtual_prefix() \li H5Pget_virtual_printf_gap() \li H5Pget_virtual_srcspace() \li H5Pget_virtual_view() \li H5Pget_virtual_vspace() \li H5Pset_virtual \li H5Pset_virtual_prefix() \li H5Pset_virtual_printf_gap() \li H5Pset_virtual_view()" +ALIASES += see_virtual=" \see Supporting Functions: H5Pget_layout(), H5Pset_layout(), H5Sget_regular_hyperslab(), H5Sis_regular_hyperslab(), H5Sselect_hyperslab() \see VDS Functions: H5Pget_virtual_count(), H5Pget_virtual_dsetname(), H5Pget_virtual_filename(), H5Pget_virtual_prefix(), H5Pget_virtual_printf_gap(), H5Pget_virtual_srcspace(), H5Pget_virtual_view(), H5Pget_virtual_vspace(), H5Pset_virtual(), H5Pset_virtual_prefix(), H5Pset_virtual_printf_gap(), H5Pset_virtual_view()" ALIASES += obj_info_fields="
Flag Purpose
#H5O_INFO_BASIC Fill in the fileno, addr, type, and rc fields
#H5O_INFO_TIME Fill in the atime, mtime, ctime, and btime fields
#H5O_INFO_NUM_ATTRS Fill in the num_attrs field
#H5O_INFO_HDR Fill in the hdr field
#H5O_INFO_META_SIZE Fill in the meta_size field
#H5O_INFO_ALL #H5O_INFO_BASIC | #H5O_INFO_TIME | #H5O_INFO_NUM_ATTRS | #H5O_INFO_HDR | #H5O_INFO_META_SIZE
" diff --git a/doxygen/dox/About.dox b/doxygen/dox/About.dox new file mode 100644 index 0000000..3be9202 --- /dev/null +++ b/doxygen/dox/About.dox @@ -0,0 +1,11 @@ +/** \page About About + +The implementation of this documentation set is based on the fantastic work of the +Eigen project. +Please refer to their GitLab repository +and the online version of their +Doxygen-based documentation. +Not only does Eigen set a standard as a piece of software, but also as an example +of documentation done right. + +*/ \ No newline at end of file diff --git a/doxygen/dox/Cookbook.dox b/doxygen/dox/Cookbook.dox new file mode 100644 index 0000000..4abc896 --- /dev/null +++ b/doxygen/dox/Cookbook.dox @@ -0,0 +1,5 @@ +/** \page Cookbook Cookbook + + Healthy, everyday recipes for every taste and budget... + + */ \ No newline at end of file diff --git a/doxygen/dox/DDLBNF110.dox b/doxygen/dox/DDLBNF110.dox new file mode 100644 index 0000000..f7e4267 --- /dev/null +++ b/doxygen/dox/DDLBNF110.dox @@ -0,0 +1,650 @@ +/** \page DDLBNF110 DDL in BNF through HDF5 1.10 + +\todo Revise this & break it up! + +\section intro110 Introduction + +This document contains the data description language (DDL) for an HDF5 file. The +description is in Backus-Naur Form (BNF). + +\section expo110 Explanation of Symbols + +This section contains a brief explanation of the symbols used in the DDL. 
+ +\code{.unparsed} +::= defined as + a token with the name tname + | one of or + opt zero or one occurrence of + * zero or more occurrence of + + one or more occurrence of + [0-9] an element in the range between 0 and 9 + '[' the token within the quotes (used for special characters) + TBD To Be Decided +\endcode + +\section ddl110 The DDL + +\code{.unparsed} + ::= HDF5 { opt } + + ::= + + ::= SUPER_BLOCK { + SUPERBLOCK_VERSION + FREELIST_VERSION + SYMBOLTABLE_VERSION + OBJECTHEADER_VERSION + OFFSET_SIZE + LENGTH_SIZE + BTREE_RANK + BTREE_LEAF + ISTORE_K + + USER_BLOCK { + USERBLOCK_SIZE + } + } + + ::= FILE_SPACE_STRATEGY + FREE_SPACE_PERSIST + FREE_SPACE_SECTION_THRESHOLD + FILE_SPACE_PAGE_SIZE + + ::= H5F_FSPACE_STRATEGY_FSM_AGGR | H5F_FSPACE_STRATEGY_PAGE | + H5F_FSPACE_STRATEGY_AGGR | H5F_FSPACE_STRATEGY_NONE | + Unknown strategy + + ::= GROUP "/" { + * + opt + opt + * + * + } + + ::= | | | + + ::= DATATYPE { + + } + + ::= the assigned name for anonymous named type is + in the form of #oid, where oid is the object id + of the type + + ::= | | | one of or + opt zero or one occurrence of + * zero or more occurrence of + + one or more occurrence of + [0-9] an element in the range between 0 and 9 + '[' the token within the quotes (used for special characters) + TBD To Be Decided +\endcode + +\section ddl112 The DDL + +\code{.unparsed} + ::= HDF5 { opt } + + ::= + + ::= SUPER_BLOCK { + SUPERBLOCK_VERSION + FREELIST_VERSION + SYMBOLTABLE_VERSION + OBJECTHEADER_VERSION + OFFSET_SIZE + LENGTH_SIZE + BTREE_RANK + BTREE_LEAF + ISTORE_K + + USER_BLOCK { + USERBLOCK_SIZE + } + } + + ::= FILE_SPACE_STRATEGY + FREE_SPACE_PERSIST + FREE_SPACE_SECTION_THRESHOLD + FILE_SPACE_PAGE_SIZE + + ::= H5F_FSPACE_STRATEGY_FSM_AGGR | H5F_FSPACE_STRATEGY_PAGE | + H5F_FSPACE_STRATEGY_AGGR | H5F_FSPACE_STRATEGY_NONE | + Unknown strategy + + ::= GROUP "/" { + * + opt + opt + * + * + } + + ::= | | | + + ::= DATATYPE { + + } + + ::= the assigned name for anonymous named type is + in the form 
of #oid, where oid is the object id + of the type + + ::= | | download it as a tgz archive for offline reading. + +This is the documentation set for HDF5 in terms of specifications and software +developed and maintained by The HDF +Group. It is impractical to document the entire HDF5 ecosystem in one place, +and you should also consult the documentation sets of the many outstanding +community projects. + +For a first contact with HDF5, the best place to start is the \link +GettingStarted getting started \endlink page, which shows you how to write and +compile your first program with HDF5. + +The \b main \b documentation is organized by documentation flavor. Most +technical documentation consists to varying degrees of information related to +tasks, concepts, or reference material. As its title +suggests, the \link RM Reference Manual \endlink is 100% reference material, +while the \link Cookbook \endlink is focused on tasks. The different guide-type +documents cover a mix of tasks, concepts, and reference, to help a certain +audience succeed. + +Finally, do not miss the search engine (top right-hand corner)! If you are +looking for a specific function, it'll take you there directly. If unsure, it'll +give you an idea of what's on offer and a few promising leads. + +\par ToDo List + There is plenty of unfinished business. 
+ +*/ diff --git a/doxygen/dox/ReferenceManual.dox b/doxygen/dox/ReferenceManual.dox new file mode 100644 index 0000000..596a224 --- /dev/null +++ b/doxygen/dox/ReferenceManual.dox @@ -0,0 +1,43 @@ +/** \page RM Reference Manual + +The functions provided by the HDF5 C-API are grouped into the following +\Emph{modules}: + +\li \ref H5A "Attributes" — Management of HDF5 attributes (\ref H5A) +\li \ref H5D "Datasets" — Management of HDF5 datasets (\ref H5D) +\li \ref H5S "Dataspaces" — Management of HDF5 dataspaces which describe the shape of datasets and attributes (\ref H5S) +\li \ref H5T "Datatypes" — Management of datatypes which describe elements of datasets and attributes (\ref H5T) +\li \ref H5E "Error Handling" — Functions for handling HDF5 errors (\ref H5E) +\li \ref H5ES "Event Sets" — Functions for handling HDF5 event sets (\ref H5ES) +\li \ref H5F "Files" — Management of HDF5 files (\ref H5F) +\li \ref H5Z "Filters" — Configuration of filters that process data during I/O operation (\ref H5Z) +\li \ref H5G "Groups" — Management of groups in HDF5 files (\ref H5G) +\li \ref H5I "Identifiers" — Management of object identifiers and object names (\ref H5I) +\li \ref H5 "Library" — General purpose library functions (\ref H5) +\li \ref H5L "Links" — Management of links in HDF5 groups (\ref H5L) +\li \ref H5M "Maps" — Management of HDF5 maps (\ref H5M) +\li \ref H5O "Objects" — Management of objects in HDF5 files (\ref H5O) +\li \ref H5PL "Plugins" — Programmatic control over dynamically loaded plugins (\ref H5PL) +\li \ref H5P "Property Lists" — Management of property lists to control HDF5 library behavior (\ref H5P) +\li \ref H5R "References" — Management of references to specific objects and data regions in an HDF5 file (\ref H5R) +\li \ref H5VL "Virtual Object Layer" — Management of the Virtual Object Layer (\ref H5VL) + +\par Asynchronous Functions + A subset of functions has \ref ASYNC "asynchronous variants". 
+ +\par API Versioning + See \ref api-compat-macros + +\par Deprecated Functions and Types + A list of deprecated functions and types can be found + here. + +\par Etiquette + Here are a few simple rules to follow: + \li \Bold{Handle discipline:} If you acquire a handle (by creation or copy), \Emph{you own it!} (..., i.e., you have to close it.) + \li \Bold{Dynamic memory allocation:} ... + \li \Bold{Use of locations:} Identifier + name combo + +\cpp_c_api_note + +*/ \ No newline at end of file diff --git a/doxygen/dox/Specifications.dox b/doxygen/dox/Specifications.dox new file mode 100644 index 0000000..4ae48d0 --- /dev/null +++ b/doxygen/dox/Specifications.dox @@ -0,0 +1,22 @@ +/** \page SPEC Specifications + +\section DDL + +\li \ref DDLBNF110 "DDL in BNF through HDF5 1.10" +\li \ref DDLBNF112 "DDL in BNF for HDF5 1.12 and above" + +\section File Format + +\li \ref FMT1 "HDF5 File Format Specification Version 1.0" +\li \ref FMT11 "HDF5 File Format Specification Version 1.1" +\li \ref FMT2 "HDF5 File Format Specification Version 2.0" +\li \ref FMT3 "HDF5 File Format Specification Version 3.0" + +\section Other + +\li \ref IMG "HDF5 Image and Palette Specification Version 1.2" +\li \ref TBL "HDF5 Table Specification Version 1.0" +\li + HDF5 Dimension Scale Specification + +*/ \ No newline at end of file diff --git a/doxygen/dox/TechnicalNotes.dox b/doxygen/dox/TechnicalNotes.dox new file mode 100644 index 0000000..2bda175 --- /dev/null +++ b/doxygen/dox/TechnicalNotes.dox @@ -0,0 +1,20 @@ +/** \page TN Technical Notes + +\li \link api-compat-macros API Compatibility Macros \endlink +\li \ref TNMDC "Metadata Caching in HDF5" +\li \ref MT "Thread Safe library" +\li \ref VFL "Virtual File Layer" + + */ + +/** \page MT HDF5 Thread Safe library + +\htmlinclude ThreadSafeLibrary.html + +*/ + +/** \page VFL HDF5 Virtual File Layer + +\htmlinclude VFL.html + +*/ diff --git a/doxygen/dox/api-compat-macros.dox b/doxygen/dox/api-compat-macros.dox index 6b85ccb..4a1578d 
100644 --- a/doxygen/dox/api-compat-macros.dox +++ b/doxygen/dox/api-compat-macros.dox @@ -1,5 +1,4 @@ /** \page api-compat-macros API Compatibility Macros - \tableofcontents \section audience Audience The target audience for this document has existing applications that use the diff --git a/doxygen/dox/mainpage.dox b/doxygen/dox/mainpage.dox deleted file mode 100644 index eda967b..0000000 --- a/doxygen/dox/mainpage.dox +++ /dev/null @@ -1,44 +0,0 @@ -/*! \mainpage HDF5 C-API Reference - * - * The HDF5 C-API provides applications with fine-grained control over all - * aspects HDF5 functionality. This functionality is grouped into the following - * \Emph{modules}: - * \li \ref H5A "Attributes" — Management of HDF5 attributes (\ref H5A) - * \li \ref H5D "Datasets" — Management of HDF5 datasets (\ref H5D) - * \li \ref H5S "Dataspaces" — Management of HDF5 dataspaces which describe the shape of datasets and attributes (\ref H5S) - * \li \ref H5T "Datatypes" — Management of datatypes which describe elements of datasets and attributes (\ref H5T) - * \li \ref H5E "Error Handling" — Functions for handling errors that occur within HDF5 (\ref H5E) - * \li \ref H5F "Files" — Management of HDF5 files (\ref H5F) - * \li \ref H5Z "Filters" — Configuration of filters that process data during I/O operation (\ref H5Z) - * \li \ref H5G "Groups" — Management of groups in HDF5 files (\ref H5G) - * \li \ref H5I "Identifiers" — Management of object identifiers and object names (\ref H5I) - * \li \ref H5 "Library" — General purpose library functions (\ref H5) - * \li \ref H5L "Links" — Management of links in HDF5 groups (\ref H5L) - * \li \ref H5O "Objects" — Management of objects in HDF5 files (\ref H5O) - * \li \ref H5PL "Plugins" — Programmatic control over dynamically loaded plugins (\ref H5PL) - * \li \ref H5P "Property Lists" — Management of property lists to control HDF5 library behavior (\ref H5P) - * \li \ref H5R "References" — Management of references to specific objects and 
data regions in an HDF5 file (\ref H5R) - * \li \ref H5VL "Virtual Object Layer" — Management of the Virtual Object Layer (\ref H5VL) - * - * Here are a few simple rules to follow: - * - * \li \Bold{Handle discipline:} If you acquire a handle (by creation or coopy), \Emph{you own it!} (..., i.e., you have to close it.) - * \li \Bold{Dynamic memory allocation:} ... - * \li \Bold{Use of locations:} Identifier + name combo - * - * \attention \Bold{C++ Developers using HDF5 C-API functions beware:}\n - * If a C routine that takes a function pointer as an argument is called from - * within C++ code, the C routine should be returned from normally. - * Examples of this kind of routine include callbacks such as H5Pset_elink_cb() - * and H5Pset_type_conv_cb() and functions such as H5Tconvert() and H5Ewalk2().\n - * Exiting the routine in its normal fashion allows the HDF5 C library to clean - * up its work properly. In other words, if the C++ application jumps out of - * the routine back to the C++ \c catch statement, the library is not given the - * opportunity to close any temporary data structures that were set up when the - * routine was called. The C++ application should save some state as the - * routine is started so that any problem that occurs might be diagnosed. - * - * \todo Fix the search form for server deployments. - * \todo Make it mobile-friendly - * - */ \ No newline at end of file diff --git a/doxygen/dox/maybe_metadata_reads.dox b/doxygen/dox/maybe_metadata_reads.dox new file mode 100644 index 0000000..25c905f --- /dev/null +++ b/doxygen/dox/maybe_metadata_reads.dox @@ -0,0 +1,82 @@ +/** + * \page maybe_metadata_reads Functions with No Access Property List Parameter that May Generate Metadata Reads + * + * \ingroup GACPL + * + * Currently there are several operations in HDF5 that can issue metadata reads + * from the metadata cache, but that take no property list. 
It is therefore not + * possible to set a collective requirement individually for those operations. The + * only solution with the HDF5 1.10.0 release is to set the collective requirement + * globally on H5Fopen() or H5Fcreate() for all metadata operations to be + * collective. + * + * The following is a list of those functions in the HDF5 library. This list is + * integral to the discussion in the H5Pset_all_coll_metadata_ops() entry: + * + *
+ *
+ * H5Awrite()
+ * H5Aread()
+ * H5Arename()
+ * H5Aiterate2()
+ * H5Adelete()
+ * H5Aexists()
+ *
+ * H5Dget_space_status()
+ * H5Dget_storage_size()
+ * H5Dset_extent()
+ * H5Ddebug()
+ * H5Dclose()
+ * H5Dget_create_plist()
+ * H5Dget_space()   (when dataset is a virtual dataset)
+ *
+ * H5Gget_create_plist()
+ * H5Gget_info()
+ * H5Gclose()
+ *
+ * H5Literate()
+ * H5Lvisit()
+ *
+ * H5Rcreate()
+ * H5Rdereference2()   (when reference is an object reference)
+ * H5Rget_region()
+ * H5Rget_obj_type2()
+ * H5Rget_name()
+ *
+ * H5Ocopy()
+ * H5Oopen_by_addr()
+ * H5Oincr_refcount()
+ * H5Odecr_refcount()
+ * H5Oget_info()
+ * H5Oset_comment()
+ * H5Ovisit()
+ *
+ * H5Fis_hdf5()
+ * H5Fflush()
+ * H5Fclose()
+ * H5Fget_file_image()
+ * H5Freopen()
+ * H5Fget_freespace()
+ * H5Fget_info2()
+ * H5Fget_free_sections()
+ * H5Fmount()
+ * H5Funmount()
+ *
+ * H5Iget_name()
+ *
+ * H5Tget_create_plist()
+ * H5Tclose()
+ *
+ * H5Zunregister()
+ * 
+ * + * In addition, \b most deprecated functions fall into this category. + * + * The HDF Group may address the above limitation in a future major release, but + * no decision has been made at this time. Such a change might, for example, + * include adding new versions of some or all the above functions with an extra + * property list parameter to allow an individual setting for the collective + * calling requirement. + * + * \sa_metadata_ops + */ diff --git a/doxygen/examples/FF-IH_FileGroup.gif b/doxygen/examples/FF-IH_FileGroup.gif new file mode 100644 index 0000000..b0d76f5 Binary files /dev/null and b/doxygen/examples/FF-IH_FileGroup.gif differ diff --git a/doxygen/examples/FF-IH_FileObject.gif b/doxygen/examples/FF-IH_FileObject.gif new file mode 100644 index 0000000..8eba623 Binary files /dev/null and b/doxygen/examples/FF-IH_FileObject.gif differ diff --git a/doxygen/examples/FileFormatSpecChunkDiagram.jpg b/doxygen/examples/FileFormatSpecChunkDiagram.jpg new file mode 100644 index 0000000..03fd90a Binary files /dev/null and b/doxygen/examples/FileFormatSpecChunkDiagram.jpg differ diff --git a/doxygen/examples/H5.format.1.0.html b/doxygen/examples/H5.format.1.0.html new file mode 100644 index 0000000..2d3ffbe --- /dev/null +++ b/doxygen/examples/H5.format.1.0.html @@ -0,0 +1,4050 @@ + + + + HDF5 File Format Specification + + + + +
+ + + +
+
    +
  1. Introduction +
  2. Disk Format Level 0 - File Signature and Super Block +
  3. Disk Format Level 1 - File Infrastructure + +
      +
    1. Disk Format Level 1A - B-link Trees and B-tree Nodes +
    2. Disk Format Level 1B - Group +
    3. Disk Format Level 1C - Group Entry +
    4. Disk Format Level 1D - Local Heaps +
    5. Disk Format Level 1E - Global Heap +
    6. Disk Format Level 1F - Free-space Index +
    +
    +
  4. Disk Format Level 2 - Data Objects + +
      +
    1. Disk Format Level 2a - Data Object Headers +
        +
      1. Name: NIL +
      2. Name: Simple Dataspace + +
      3. Name: Datatype +
      4. Name: Data Storage - Fill Value +
      5. Name: Reserved - not assigned yet +
      +
    +
    +
+
   +
    + +
  1. Disk Format Level 2 - Data Objects + (Continued) +
      +
    1. Disk Format Level 2a - Data Object Headers(Continued) +
        +
      1. Name: Data Storage - Compact +
      2. Name: Data Storage - External Data Files +
      3. Name: Data Storage - Layout +
      4. Name: Reserved - not assigned yet +
      5. Name: Reserved - not assigned yet +
      6. Name: Data Storage - Filter Pipeline +
      7. Name: Attribute +
      8. Name: Object Name +
      9. Name: Object Modification Date and Time +
      10. Name: Shared Object Message +
      11. Name: Object Header Continuation +
      12. Name: Group Message +
      +
    2. Disk Format: Level 2b - Shared Data Object Headers +
    3. Disk Format: Level 2c - Data Object Data Storage +
    +
    +
+
+
+ +

+ + +

Introduction

+ + + + + + + +
  +
+ HDF5 Groups +
 
  + Figure 1: Relationships among the HDF5 root group, other groups, and objects +
+
 
  + HDF5 Objects +  
  + Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces +
+
 
+ + +

The format of an HDF5 file on disk encompasses several + key ideas of the HDF4 and AIO file formats and + addresses some of their shortcomings. The new format is + more self-describing than the HDF4 format and is more + uniformly applied to data objects in the file. + +

An HDF5 file appears to the user as a directed graph. + The nodes of this graph are the higher-level HDF5 objects + that are exposed by the HDF5 APIs: + +

    +
  • Groups +
  • Datasets +
  • Datatypes +
  • Dataspaces +
+ +

At the lowest level, as information is actually written to the disk, + an HDF5 file is made up of the following objects: +

    +
  • A super block +
  • B-tree nodes (containing either symbol nodes or raw data chunks) +
  • Object headers + +
  • Collections +
  • Local heaps +
  • Free space +
+ + The HDF5 library uses these lower-level objects to represent the + higher-level objects that are then presented to the user or + to applications through the APIs. + For instance, a group is an object header that contains a message that + points to a local heap and to a B-tree which points to symbol nodes. + A dataset is an object header that contains messages that describe + datatype, space, layout, filters, external files, fill value, etc., + with the layout message pointing to either a raw data chunk or to a + B-tree that points to raw data chunks. + + +

This Document

+ +

This document describes the lower-level data objects; + the higher-level objects and their properties are described + in the HDF5 User's Guide. + + + + + + +

Three levels of information comprise the file format. + Level 0 contains basic information for identifying and + defining information about the file. Level 1 information contains + the group information (stored as a B-tree) and is used as the + index for all the objects in the file. Level 2 is the rest + of the file and contains all of the data objects, with each object + partitioned into header information, also known as + meta information, and data. + +

The sizes of various fields in the following layout tables are + determined by looking at the number of columns the field spans + in the table. There are three exceptions: (1) The size may be + overridden by specifying a size in parentheses, (2) the size of + addresses is determined by the Size of Offsets field + in the super block, and (3) the size of size fields is determined + by the Size of Lengths field in the super block. + + + +

+

+ + +

+ Disk Format: Level 0 - File Signature and Super Block

+ +

The super block may begin at certain predefined offsets within + the HDF5 file, allowing a block of unspecified content for + users to place additional information at the beginning (and + end) of the HDF5 file without limiting the HDF5 library's + ability to manage the objects within the file itself. This + feature was designed to accommodate wrapping an HDF5 file in + another file format or adding descriptive information to the + file without requiring the modification of the actual file's + information. The super block is located by searching for the + HDF5 file signature at byte offset 0, byte offset 512, and at + successive locations in the file, each twice + the previous location, i.e., 0, 512, 1024, 2048, etc. + +
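The search described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the library's implementation: the names `candidate_offsets` and `find_super_block` are hypothetical, and the signature bytes are the ones given in the File Signature field description of this specification.

```python
import io

# The 8-byte HDF5 file signature: \211 H D F \r \n \032 \n
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"

def candidate_offsets(file_size):
    """Yield 0, 512, 1024, 2048, ... while the signature still fits in the file."""
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= file_size:
        yield offset
        # After offset 0 comes 512; every later offset is twice the previous one.
        offset = 512 if offset == 0 else offset * 2

def find_super_block(stream):
    """Return the byte offset of the super block signature, or None if absent."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    for offset in candidate_offsets(file_size):
        stream.seek(offset)
        if stream.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
            return offset
    return None
```

Because the candidate offsets double, the number of probes grows only logarithmically with the file size, so a wrapper of arbitrary length in front of the HDF5 data adds little search cost.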

The super block is composed of a file signature, followed by + super block and group version numbers, information + about the sizes of offset and length values used to describe + items within the file, the size of each group page, + and a group entry for the root object in the file. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ HDF5 Super Block Layout +
byte | byte | byte | byte
HDF5 File Signature (8 bytes)
Version # of Super Block | Version # of Global Free-space Storage | Version # of Group | Reserved
Version # of Shared Header Message Format | Size of Offsets | Size of Lengths | Reserved (zero)
Group Leaf Node K | Group Internal Node K
File Consistency Flags
Base Address*
Address of Global Free-space Heap*
End of File Address*
Driver Information Block Address*
Root Group Address*
+ + + +
+
+ (Items marked with an asterisk (*) in the above table +
+ are of the size specified in "Size of Offsets.") +
+
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
File SignatureThis field contains a constant value and can be used to + quickly identify a file as being an HDF5 file. The + constant value is designed to allow easy identification of + an HDF5 file and to allow certain types of data corruption + to be detected. The file signature of an HDF5 file always + contains the following values: + +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
decimal | 137 | 72 | 68 | 70 | 13 | 10 | 26 | 10
hexadecimal | 89 | 48 | 44 | 46 | 0d | 0a | 1a | 0a
ASCII C Notation | \211 | H | D | F | \r | \n | \032 | \n
+
+
+ + This signature both identifies the file as an HDF5 file + and provides for immediate detection of common + file-transfer problems. The first two bytes distinguish + HDF5 files on systems that expect the first two bytes to + identify the file type uniquely. The first byte is + chosen as a non-ASCII value to reduce the probability + that a text file may be misrecognized as an HDF5 file; + also, it catches bad file transfers that clear bit + 7. Bytes two through four name the format. The CR-LF + sequence catches bad file transfers that alter newline + sequences. The control-Z character stops file display + under MS-DOS. The final line feed checks for the inverse + of the CR-LF translation problem. (This is a direct + descendant of the PNG file signature.)
Version Number of the Super BlockThis value is used to determine the format of the + information in the super block. When the format of the + information in the super block is changed, the version number + is incremented to the next integer and can be used to + determine how the information in the super block is + formatted.
Version Number of the Global Free-space HeapThis value is used to determine the format of the + information in the Global Free-space Heap.
Version Number of the GroupThis value is used to determine the format of the + information in the Group. When the format of + the information in the Group is changed, the + version number is incremented to the next integer and can be + used to determine how the information in the Group + is formatted.
Version Number of the Shared Header Message FormatThis value is used to determine the format of the + information in a shared object header message, which is + stored in the global small-data heap. Since the format + of the shared header messages differs from the private + header messages, a version number is used to identify changes + in the format.
Size of OffsetsThis value contains the number of bytes used to store + addresses in the file. The values for the addresses of + objects in the file are offsets relative to a base address, + usually the address of the super block signature. This + allows a wrapper to be added after the file is created + without invalidating the internal offset locations.
Size of LengthsThis value contains the number of bytes used to store + the size of an object.
Group Leaf Node KEach leaf node of a group B-tree will have at + least this many entries but not more than twice this + many. If a group has a single leaf node then it + may have fewer entries.
Group Internal Node KEach internal node of a group B-tree will have + at least K pointers to other nodes but not more than 2K + pointers. If the group has only one internal + node then it might have fewer than K pointers.
Bytes per B-tree PageThis value contains the number of bytes used for symbol + pairs per page of the B-trees used in the file. All + B-tree pages will have the same size per page. +
+ For 32-bit file offsets, 340 objects is the maximum + per 4KB page; for 64-bit file offsets, 254 objects will fit + per 4KB page. In general, the equation is: +
+    <number of objects> = FLOOR((<page size> - <offset size>) / (<Symbol size> + <offset size>)) - 1
File Consistency FlagsThis value contains flags to indicate information + about the consistency of the information contained + within the file. Currently, the following bit flags are + defined: +
    +
  • Bit 0 set indicates that the file is opened for + write-access. +
  • Bit 1 set indicates that the file has + been verified for consistency and is guaranteed to be + consistent with the format defined in this document. +
  • Bits 2-31 are reserved for future use. +
+ Bit 0 should be + set as the first action when a file is opened for write + access and should be cleared only as the final action + when closing a file. Bit 1 should be cleared during + normal access to a file and only set after the file's + consistency is guaranteed by the library or a + consistency utility.
Base AddressThis is the absolute file address of the first byte of + the HDF5 data within the file. The library currently + constrains this value to be the absolute file address + of the super block itself when creating new files; + future versions of the library may provide greater + flexibility. Unless otherwise noted, + all other file addresses are relative to this base + address.
Address of Global Free-space HeapFree-space management is not yet defined in the HDF5 + file format and is not handled by the library. + Currently this field always contains the + undefined address 0xfff...ff. + +
End of File AddressThis is the relative file address of the first byte past + the end of all HDF5 data. It is used to determine whether a + file has been accidentally truncated and as an address where + file data allocation can occur if the free list is not + used.
Driver Information Block AddressThis is the relative file address of the file driver + information block which contains driver-specific + information needed to reopen the file. If there is no + driver information block then this entry should be the + undefined address (all bits set).
Root Group AddressThis is the address of the root group (described later + in this document), which serves as the entry point into + the group graph.
+
+ + +
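Two of the fields described above lend themselves to a short worked example. The Python sketch below evaluates the Bytes per B-tree Page equation and decodes the File Consistency Flags. The helper names are hypothetical, and the symbol size of 8 bytes is an assumption: it is not stated in this section, but it is the value that reproduces the quoted figures of 340 and 254 objects per 4KB page.

```python
# Assumed symbol size in bytes. It is chosen because it reproduces the
# figures quoted in the field description; it is not taken from the text.
ASSUMED_SYMBOL_SIZE = 8

def objects_per_page(page_size, offset_size, symbol_size=ASSUMED_SYMBOL_SIZE):
    """FLOOR((page size - offset size) / (symbol size + offset size)) - 1."""
    return (page_size - offset_size) // (symbol_size + offset_size) - 1

WRITE_ACCESS = 1 << 0         # bit 0: file is opened for write access
VERIFIED_CONSISTENT = 1 << 1  # bit 1: file has been verified for consistency
RESERVED_MASK = 0xFFFFFFFF & ~(WRITE_ACCESS | VERIFIED_CONSISTENT)  # bits 2-31

def decode_consistency_flags(flags):
    """Split the 32-bit File Consistency Flags value into named booleans."""
    return {
        "write_access": bool(flags & WRITE_ACCESS),
        "verified": bool(flags & VERIFIED_CONSISTENT),
        "reserved_bits_set": bool(flags & RESERVED_MASK),
    }

print(objects_per_page(4096, 4))  # 32-bit offsets: 340
print(objects_per_page(4096, 8))  # 64-bit offsets: 254
```

A reader that finds any reserved bit set would probably want to treat the file as written by a newer format revision, although this section does not prescribe that behavior.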

The file driver information block is an optional region of the + file which contains information needed by the file driver in + order to reopen a file. The format of the file driver information + block is: + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Driver Information Block +
byte | byte | byte | byte
Version | Reserved (zero)
Driver Information Size (4 bytes)
Driver Identification (8 bytes)
Driver Information
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
VersionThe version number of the driver information block. The + file format documented here is version zero.
Driver Information SizeThe size in bytes of the Driver Information part of this + structure.
Driver IdentificationThis is an eight-byte ASCII string without null + termination which identifies the driver and version number + of the Driver Information block. The predefined drivers + supplied with the HDF5 library are identified by the + letters NCSA followed by the first four characters of + the driver name. If the Driver Information block is not + the original version then the last letter(s) of the + identification will be replaced by a version number in + ASCII. + For example, the various versions of the family driver + will be identified by NCSAfami, NCSAfam0, + NCSAfam1, etc. + (NCSAfami is simply NCSAfamily truncated + to eight characters. Subsequent identifiers will be created by + substituting sequential numerical values for the final character, + starting with zero.) +

+ Identification for user-defined drivers + is arbitrary but should be unique.

Driver InformationDriver information is stored in a format defined by the + file driver and encoded/decoded by the driver callbacks + invoked from the H5FD_sb_encode and + H5FD_sb_decode functions.
+
+ + +
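The naming rule and the block layout above can be sketched in Python as follows. Several details are assumptions rather than statements of this specification: the function names are hypothetical, multi-byte fields are read as little-endian, and the Version field is taken to occupy one byte followed by three reserved bytes (the column spans of the layout table are not recoverable here).

```python
import struct

def driver_identification(driver_name, revision=None):
    """Build the 8-byte identifier for a predefined driver.

    With revision=None the original form is returned ("NCSA" plus the first
    four characters of the name). Otherwise the last character(s) are
    replaced by the revision number in ASCII.
    """
    if len(driver_name) < 4:
        raise ValueError("sketch assumes a driver name of at least four characters")
    ident = "NCSA" + driver_name[:4]
    if revision is not None:
        if revision < 0:
            raise ValueError("revision must be non-negative")
        digits = str(revision)
        ident = ident[:8 - len(digits)] + digits
    return ident.encode("ascii")

def parse_driver_info_block(buf):
    """Return (version, identification, driver information bytes)."""
    # Assumed layout: 1-byte version, 3 reserved bytes, 4-byte size, 8-byte id.
    version, info_size, ident = struct.unpack_from("<B3xI8s", buf, 0)
    info = buf[16:16 + info_size]
    return version, ident, info
```

For the family driver this yields `NCSAfami` for the original version and `NCSAfam0`, `NCSAfam1`, and so on for later ones, matching the examples in the field description.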

+

+ + +

+ Disk Format: Level 1 - File Infrastructure

+

Disk Format: Level 1A - B-link Trees and B-tree Nodes

+ +

B-link trees allow flexible storage for objects which tend to grow + in ways that cause the object to be stored discontiguously. B-trees + are described in various algorithms books including "Introduction to + Algorithms" by Thomas H. Cormen, Charles E. Leiserson, and Ronald + L. Rivest. The B-link tree, in which the sibling nodes at a + particular level in the tree are stored in a doubly-linked list, + is described in the "Efficient Locking for Concurrent Operations + on B-trees" paper by Philip L. Lehman and S. Bing Yao as published + in the ACM Transactions on Database Systems, Vol. 6, + No. 4, December 1981. + +

The B-link trees implemented by the file format contain one more + key than the number of children. In other words, each child + pointer out of a B-tree node has a left key and a right key. + The pointers out of internal nodes point to sub-trees while + the pointers out of leaf nodes point to symbol nodes and + raw data chunks. + Aside from that difference, internal nodes and leaf nodes + are identical. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ B-tree Nodes +
byte | byte | byte | byte
Node Signature
Node Type | Node Level | Entries Used
Address of Left Sibling
Address of Right Sibling
Key 0 (variable size)
Address of Child 0
Key 1 (variable size)
Address of Child 1
...
Key 2K (variable size)
Address of Child 2K
Key 2K+1 (variable size)
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Node SignatureThe ASCII character string TREE is + used to indicate the + beginning of a B-link tree node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file.
Node TypeEach B-link tree points to a particular type of data. + This field indicates the type of data as well as + implying the maximum degree K of the tree and + the size of each Key field. +
+
+
0 +
This tree points to group nodes. +
1 +
This tree points to a raw data chunk. +
+
Node LevelThe node level indicates the level at which this node + appears in the tree (leaf nodes are at level zero). Not + only does the level indicate whether child pointers + point to sub-trees or to data, but it can also be used + to help file consistency checking utilities reconstruct + damaged trees.
Entries UsedThis determines the number of children to which this + node points. All nodes of a particular type of tree + have the same maximum degree, but most nodes will point + to fewer than that number of children. The valid child + pointers and keys appear at the beginning of the node + and the unused pointers and keys appear at the end of + the node. The unused pointers and keys have undefined + values.
Address of Left SiblingThis is the file address of the left sibling of the + current node relative to the super block. If the current + node is the left-most node at this level then this field + is the undefined address (all bits set).
Address of Right SiblingThis is the file address of the right sibling of the + current node relative to the super block. If the current + node is the right-most node at this level then this + field is the undefined address (all bits set).
Keys and Child PointersEach tree has 2K+1 keys with 2K + child pointers interleaved between the keys. The number + of keys and child pointers actually containing valid + values is determined by the Entries Used field. If + that field is N then the B-link tree contains + N child pointers and N+1 keys.
KeyThe format and size of the key values is determined by + the type of data to which this tree points. The keys are + ordered and are boundaries for the contents of the child + pointer; that is, the key values represented by child + N fall between Key N and Key + N+1. Whether the interval is open or closed on + each end is determined by the type of data to which the + tree points. +

+ The format of the key depends on the node type. + For nodes of node type 1, the key is formatted as follows: +

+ + + + + + + + + + + +
Bytes 1-4: Size of chunk in bytes.
Bytes 5-8: Filter mask, a 32-bit bitfield indicating which filters have been applied to that chunk.
N fields of 8 bytes each: A 64-bit index indicating the offset of the chunk within the dataset, where N is the number of dimensions of the dataset. For example, if a chunk in a 3-dimensional dataset begins at the position [5,5,5], there will be three such 64-bit indices, each with the value of 5.
+
+

+ For nodes of node type 0, the key is formatted as follows: +

+ + + + + +
A single field of Size of Lengths + bytesIndicates the byte offset into the local heap + for the first object name in the subtree which + that key describes.
+
+
Child PointersThe tree node contains file addresses of subtrees or + data depending on the node level. Nodes at Level 0 point + to data addresses, either data chunk or group nodes. + Nodes at non-zero levels point to other nodes of the + same B-tree.
+
+ +
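The node type 1 key layout described above can be illustrated with a short Python sketch. It assumes little-endian byte order and that the caller already knows the number of dataset dimensions; `parse_chunk_key` is a hypothetical helper name used only for illustration, not part of the HDF5 library.

```python
import struct

def parse_chunk_key(buf, ndims, pos=0):
    """Decode one node-type-1 (chunk) B-tree key starting at buf[pos]."""
    # Bytes 1-4: chunk size; bytes 5-8: filter mask (little-endian is assumed).
    chunk_size, filter_mask = struct.unpack_from("<II", buf, pos)
    # One 64-bit offset per dataset dimension follows the two 32-bit fields.
    offsets = struct.unpack_from("<%dQ" % ndims, buf, pos + 8)
    return chunk_size, filter_mask, offsets

# A key for a chunk that begins at [5,5,5] in a 3-dimensional dataset.
key = struct.pack("<II3Q", 4096, 0, 5, 5, 5)
print(parse_chunk_key(key, ndims=3))
```

For a 3-dimensional dataset such a key occupies 4 + 4 + 3 × 8 = 32 bytes.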

+ Each B-tree node looks like this: + +

+ + + + + + + + + + + + + +
key[0]  child[0]  key[1]  child[1]  key[2]  ...  ...  key[N-1]  child[N-1]  key[N]
+
where child[i] is a pointer to a sub-tree (at a level above Level 0) or to data (at Level 0). Each key[i] describes an item stored by the B-tree (a chunk or an object of a group node). The range of values represented by child[i] is indicated by key[i] and key[i+1].

The following question must next be answered: + "Is the value described by key[i] contained in + child[i-1] or in child[i]?" + The answer depends on the type of tree. + In trees for groups (node type 0) the object described by + key[i] is the greatest object contained in + child[i-1] while in chunk trees (node type 1) the + chunk described by key[i] is the least chunk in + child[i]. + +

That means that key[0] for group trees is sometimes unused; + it points to offset zero in the heap, which is always the + empty string and compares as "less-than" any valid object name. + +

And key[N] for chunk trees is sometimes unused; + it contains a chunk offset which compares as "greater-than" + any other chunk offset and has a chunk byte size of zero + to indicate that it is not actually allocated. + + +
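The interval rule for the two tree types can be sketched as a lookup over the keys of a single node. This is an illustrative model only: keys are represented as directly comparable Python values (strings for group names, tuples for chunk offsets), whereas a real implementation would resolve group keys through the local heap. The function name `find_child` is hypothetical.

```python
from bisect import bisect_left, bisect_right

def find_child(keys, target, node_type):
    """Return the index i of the child whose key range covers target, or None.

    keys holds key[0]..key[N] for a node with N children.
    """
    n_children = len(keys) - 1
    if node_type == 0:
        # Group tree: key[i+1] is the greatest name in child[i],
        # so child[i] covers key[i] < target <= key[i+1].
        i = bisect_left(keys, target) - 1
    elif node_type == 1:
        # Chunk tree: key[i] is the least chunk in child[i],
        # so child[i] covers key[i] <= target < key[i+1].
        i = bisect_right(keys, target) - 1
    else:
        raise ValueError("unknown node type")
    return i if 0 <= i < n_children else None

# Group node: key[0] is the empty string, which sorts before every name.
print(find_child(["", "d", "m"], "d", node_type=0))
# Chunk node: the last key is an upper bound that no real chunk reaches.
print(find_child([(0,), (10,), (20,)], (10,), node_type=1))
```

The only difference between the two branches is which end of the interval is closed, which is why key[0] is a pure lower bound in group trees and key[N] is a pure upper bound in chunk trees.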

Disk Format: Level 1B - Group and Symbol Nodes

+ +

A group is an object internal to the file that allows arbitrary nesting of objects (including other groups). A group maps a set of names to a set of file addresses relative to the base address. Certain meta data for an object to which the group points can be duplicated in the group symbol table in addition to the object header.

An HDF5 object name space can be stored hierarchically by + partitioning the name into components and storing each + component in a group. The group entry for a + non-ultimate component points to the group containing + the next component. The group entry for the last + component points to the object being named. + +

A group is a collection of group nodes pointed + to by a B-link tree. Each group node contains entries + for one or more symbols. If an attempt is made to add a + symbol to an already full group node containing + 2K entries, then the node is split and one node + contains K symbols and the other contains + K+1 symbols. + +

+

+ + + + + + + + + + + + + + + + + + + +
+ Group Node (A Leaf of a B-tree) +
bytebytebytebyte
Node Signature
Version NumberReserved for Future UseNumber of Symbols


Group Entries


+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Node SignatureThe ASCII character string SNOD is + used to indicate the + beginning of a group node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file.
Version NumberThe version number for the group node. This + document describes version 1.
Number of SymbolsAlthough all group nodes have the same length, + most contain fewer than the maximum possible number of + symbol entries. This field indicates how many entries + contain valid data. The valid entries are packed at the + beginning of the group node while the remaining + entries contain undefined values.
Group EntriesEach symbol has an entry in the group node. + The format of the entry is described below.
+
+ +
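As a rough sketch of how a reader might validate the start of a group node, the following Python fragment checks the signature and extracts the symbol count. The field widths (1-byte version, 1 reserved byte, 2-byte symbol count) and the little-endian byte order are assumptions made for this example; `parse_group_node_header` is a hypothetical name.

```python
import struct

def parse_group_node_header(buf):
    """Return (version, number_of_symbols) from the start of a group node."""
    # Assumed layout: 4-byte signature, 1-byte version, 1 reserved byte,
    # 2-byte number of symbols, all little-endian.
    signature, version, _reserved, num_symbols = struct.unpack_from("<4sBBH", buf, 0)
    if signature != b"SNOD":
        raise ValueError("not a group node: bad signature %r" % signature)
    return version, num_symbols

header = struct.pack("<4sBBH", b"SNOD", 1, 0, 3)
print(parse_group_node_header(header))
```

Only the first `num_symbols` group entries that follow the header hold valid data; the rest of the node is undefined.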

+ Disk Format: Level 1C - Group Entry

+ +

Each group entry in a group node is designed + to allow for very fast browsing of stored objects. + Toward that design goal, the group entries + include space for caching certain constant meta data from the + object header. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Group Entry +
bytebytebytebyte
Name Offset (<size> bytes)
Object Header Address
Cache Type
Reserved


Scratch-pad Space (16 bytes)


+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Name OffsetThis is the byte offset into the group local + heap for the name of the object. The name is null + terminated.
Object Header AddressEvery object has an object header which serves as a + permanent location for the object's meta data. In addition + to appearing in the object header, some meta data can be + cached in the scratch-pad space.
Cache TypeThe cache type is determined from the object header. + It also determines the format for the scratch-pad space. +
+
+
0 +
No data is cached by the group entry. This + is guaranteed to be the case when an object header + has a link count greater than one. + +
1 +
Object header meta data is cached in the group + entry. This implies that the group + entry refers to another group. + +
2 +
The entry is a symbolic link. The first four bytes + of the scratch-pad space are the offset into the local + heap for the link value. The object header address + will be undefined. + +
N +
Other cache values can be defined later and + libraries that do not understand the new values will + still work properly. +
+
ReservedThese four bytes are present so that the scratch-pad + space is aligned on an eight-byte boundary. They are + always set to zero.
Scratch-pad SpaceThis space is used for different purposes, depending + on the value of the Cache Type field. Any meta-data + about a dataset object represented in the scratch-pad + space is duplicated in the object header for that + dataset. This meta data can include the datatype + and the size of the dataspace for a dataset whose datatype + is atomic and whose dataspace is fixed and less than + four dimensions. + Furthermore, no data is cached in the group + entry scratch-pad space if the object header for + the group entry has a link count greater than + one.
+
+ +

Format of the Scratch-pad Space

+ +

The group entry scratch-pad space is formatted + according to the value in the Cache Type field. + +

If the Cache Type field contains the value zero + (0) then no information is + stored in the scratch-pad space. + +

If the Cache Type field contains the value one + (1), then the scratch-pad space + contains cached meta data for another object header + in the following format: + +

+

+ + + + + + + + + + + + + + +
+ Object Header Scratch-pad Format +
bytebytebytebyte
Address of B-tree
Address of Name Heap
+
+ +

+

+ + + + + + + + + + + + + + + +
Field NameDescription
Address of B-treeThis is the file address for the root of the + group's B-tree.
Address of Name HeapThis is the file address for the group's local + heap, in which are stored the symbol names.
+
+ + +

If the Cache Type field contains the value two (2), then the scratch-pad space contains cached meta data for a symbolic link in the following format:

+

+ + + + + + + + + + + + + +
+ Symbolic Link Scratch-pad Format +
bytebytebytebyte
Offset to Link Value
+
+ +

+

+ + + + + + + + + + +
Field NameDescription
Offset to Link ValueThe value of a symbolic link (that is, the name of the + thing to which it points) is stored in the local heap. + This field is the 4-byte offset into the local heap for + the start of the link value, which is null terminated.
+
+ +
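The group entry layout and the three scratch-pad interpretations can be sketched together. The example below assumes that the name offset and all file addresses are 8 bytes wide and little-endian (in a real file these widths come from the super block); `parse_group_entry` and its `field_size` parameter are hypothetical names for illustration.

```python
import struct

def parse_group_entry(buf, field_size=8):
    """Decode one group entry; field_size is the assumed offset/address width."""
    pos = 0
    name_offset = int.from_bytes(buf[pos:pos + field_size], "little")
    pos += field_size
    header_addr = int.from_bytes(buf[pos:pos + field_size], "little")
    pos += field_size
    # 4-byte cache type followed by 4 reserved bytes.
    cache_type, _reserved = struct.unpack_from("<II", buf, pos)
    pos += 8
    scratch = buf[pos:pos + 16]
    entry = {"name_offset": name_offset,
             "header_addr": header_addr,
             "cache_type": cache_type}
    if cache_type == 1:
        # Cached object header data: B-tree address, then name heap address.
        entry["btree_addr"] = int.from_bytes(scratch[:field_size], "little")
        entry["heap_addr"] = int.from_bytes(scratch[field_size:2 * field_size], "little")
    elif cache_type == 2:
        # Symbolic link: first four bytes are the offset of the link value.
        entry["link_offset"] = struct.unpack_from("<I", scratch, 0)[0]
    # Cache type 0 and unknown types: the scratch-pad is ignored.
    return entry

raw = struct.pack("<QQII", 8, 0x400, 1, 0) + struct.pack("<QQ", 0x88, 0x2A8)
print(parse_group_entry(raw))
```

Ignoring the scratch-pad for unrecognized cache types mirrors the statement that libraries which do not understand new values still work properly.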

Disk Format: Level 1D - Local Heaps

+ +

A heap is a collection of small heap objects. Objects can be + inserted and removed from the heap at any time. + The address of a heap does not change once the heap is created. + References to objects are stored in the group table; + the names of those objects are stored in the local heap. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Local Heaps +
bytebytebytebyte
Heap Signature
Reserved (zero)
Data Segment Size
Offset to Head of Free-list (<size> bytes)
Address of Data Segment
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Heap SignatureThe ASCII character string HEAP + is used to indicate the + beginning of a heap. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file.
Data Segment Size: The total amount of disk memory allocated for the heap data. This may be larger than the amount of space required by the objects stored in the heap. The extra unused space holds a linked list of free blocks.
Offset to Head of Free-list: This is the offset within the heap data segment of the first free block (or all 0xff bytes if there is no free block). The free block contains <size> bytes that are the offset of the next free block (or all 0xff bytes if this is the last free block) followed by <size> bytes that store the size of this free block.
Address of Data SegmentThe data segment originally starts immediately after + the heap header, but if the data segment must grow as a + result of adding more objects, then the data segment may + be relocated, in its entirety, to another part of the + file.
+
+ +

Objects within the heap should be aligned on an 8-byte boundary. + +
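The free-list encoding of a local heap can be illustrated by walking the list inside a data segment. This sketch assumes 8-byte little-endian `<size>` fields; `walk_free_list` is a hypothetical helper, and the cycle check is a defensive addition that the format description does not mention.

```python
def walk_free_list(data_segment, head, size=8):
    """Yield (offset, length) for each free block in a local heap data segment."""
    end = (1 << (8 * size)) - 1  # all 0xff bytes marks the end of the list
    seen = set()
    while head != end:
        if head in seen:
            raise ValueError("cycle in free list at offset %d" % head)
        seen.add(head)
        # Each free block starts with the next-block offset, then its own size.
        nxt = int.from_bytes(data_segment[head:head + size], "little")
        length = int.from_bytes(data_segment[head + size:head + 2 * size], "little")
        yield head, length
        head = nxt

END = 2**64 - 1
segment = bytearray(64)
segment[16:32] = (40).to_bytes(8, "little") + (16).to_bytes(8, "little")
segment[40:56] = END.to_bytes(8, "little") + (24).to_bytes(8, "little")
print(list(walk_free_list(segment, head=16)))
```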

Disk Format: Level 1E - Global Heap

+ +

Each HDF5 file has a global heap which stores various types of + information which is typically shared between datasets. The + global heap was designed to satisfy these goals: + +

    +
  1. Repeated access to a heap object must be efficient without + resulting in repeated file I/O requests. Since global heap + objects will typically be shared among several datasets, it is + probable that the object will be accessed repeatedly. + +

    +
  2. Collections of related global heap objects should result in + fewer and larger I/O requests. For instance, a dataset of + void pointers will have a global heap object for each + pointer. Reading the entire set of void pointer objects + should result in a few large I/O requests instead of one small + I/O request for each object. + +

    +
  3. It should be possible to remove objects from the global heap + and the resulting file hole should be eligible to be reclaimed + for other uses. +

    +
+ +

The implementation of the heap makes use of the memory management already available at the file level and combines that with a new top-level object called a collection to achieve Goal 2. The global heap is the set of all collections. Each global heap object belongs to exactly one collection and each collection contains one or more global heap objects. For the purposes of disk I/O and caching, a collection is treated as an atomic object.

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ A Global Heap Collection +
bytebytebytebyte
Magic Number
VersionReserved
Collection Size

Global Heap Object 1 + (described below)


Global Heap Object 2


...


Global Heap Object N


Global Heap Object 0 (free space)

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Magic Number: The magic number for global heap collections is the four bytes G, C, O, and L.
VersionEach collection has its own version number so that new + collections can be added to old files. This document + describes version zero of the collections. +
Collection Size: This is the size in bytes of the entire collection including this field. The default (and minimum) collection size is 4096 bytes, which is a typical file system block size and which allows for 170 16-byte heap objects plus their overhead.
Object 1 through NThe objects are stored in any order with no + intervening unused space.
Object 0Object 0 (zero), when present, represents the free space in + the collection. Free space always appears at the end of + the collection. If the free space is too small to store + the header for Object 0 (described below) then the + header is implied and the collection contains no free space. +
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Global Heap Object +
bytebytebytebyte
Object IDReference Count
Reserved
Object Data Size

Object Data

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Object IDEach object has a unique identification number within a + collection. The identification numbers are chosen so that + new objects have the smallest value possible with the + exception that the identifier 0 always refers to the + object which represents all free space within the + collection.
Reference CountAll heap objects have a reference count field. An + object which is referenced from some other part of the + file will have a positive reference count. The reference + count for Object 0 is always zero.
ReservedZero padding to align next field on an 8-byte + boundary.
Object Size: This is the size of the fields above plus the object data stored for the object. The actual storage size is rounded up to a multiple of eight.
Object DataThe object data is treated as a one-dimensional array + of bytes to be interpreted by the caller.
+
+ +
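Two of the rules above lend themselves to a small worked example: new objects receive the smallest available identifier (with 0 reserved for free space), and storage sizes are rounded up to a multiple of eight. The helper names below are hypothetical and the code is only a sketch of those two rules, not of any library routine.

```python
def next_object_id(used_ids):
    """Return the smallest identifier >= 1 not already used in a collection."""
    used = set(used_ids)
    candidate = 1  # identifier 0 always denotes the collection's free space
    while candidate in used:
        candidate += 1
    return candidate

def padded_size(nbytes):
    """Round a storage size up to the next multiple of eight bytes."""
    return (nbytes + 7) & ~7

print(next_object_id([0, 1, 2, 4]))
print(padded_size(13))
```

Reusing the lowest free identifier keeps identifiers compact when objects are removed and later replaced.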

Disk Format: Level 1F - Free-space Heap

+ +

The Free-space Index is a collection of blocks of data, + dispersed throughout the file, which are currently not used by + any file objects. + +

The super block contains a pointer to the root of the free-space description; that pointer is currently (i.e., in HDF5 Release 1.2) required to be the undefined address 0xfff...ff.

The free-space index is not otherwise publicly defined at this time.

+

+ + +

Disk Format: Level 2 - Data Objects

+ +

Data objects contain the real information in the file. These + objects compose the scientific data and other information which + are generally thought of as "data" by the end-user. All the + other information in the file is provided as a framework for + these data objects. + +

A data object is composed of header information and data + information. The header information contains the information + needed to interpret the data information for the data object as + well as additional "meta-data" or pointers to additional + "meta-data" used to describe or annotate each data object. + +

+ Disk Format: Level 2a - Data Object Headers

+ +

The header information of an object is designed to encompass + all the information about an object which would be desired to be + known, except for the data itself. This information includes + the dimensionality, number-type, information about how the data + is stored on disk (in external files, compressed, broken up in + blocks, etc.), as well as other information used by the library + to speed up access to the data objects or maintain a file's + integrity. The header of each object is not necessarily located + immediately prior to the object's data in the file and in fact + may be located in any position in the file. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Object Headers +
bytebytebytebyte
Version # of Object HeaderReservedNumber of Header Messages
Object Reference Count

Total Object Header Size

Header Message Type #1Size of Header Message Data #1
FlagsReserved

Header Message Data #1

.
.
.
Header Message Type #nSize of Header Message Data #n
FlagsReserved

Header Message Data #n

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Version number of the object headerThis value is used to determine the format of the + information in the object header. When the format of the + information in the object header is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted.
ReservedAlways set to zero.
Number of header messagesThis value determines the number of messages listed in + this object header. This provides a fast way for software + to prepare storage for the messages in the header.
Object Reference CountThis value specifies the number of references to this + object within the current file. References to the + data object from external files are not tracked.
Total Object Header SizeThis value specifies the total number of bytes of header + message data following this length field for the current + message as well as any continuation data located elsewhere + in the file.
Header Message TypeThe header message type specifies the type of + information included in the header message data following + the type along with a small amount of other information. + Bit 15 of the message type is set if the message is + constant (constant messages cannot be changed since they + may be cached in group entries throughout the + file). The header message types for the pre-defined + header messages will be included in further discussion + below.
Size of Header Message DataThis value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size includes + padding bytes to make the message a multiple of eight + bytes.
FlagsThis is a bit field with the following definition: +
+
0 +
If set, the message data is constant. This is used + for messages like the datatype message of a dataset. +
1 +
If set, the message is stored in the global heap and + the Header Message Data field contains a Shared Object + message and the Size of Header Message Data field + contains the size of that Shared Object message. +
2-7 +
Reserved +
+
Header Message DataThe format and length of this field is determined by the + header message type and size respectively. Some header + message types do not require any data and this information + can be eliminated by setting the length of the message to + zero. The data is padded with enough zeros to make the + size a multiple of eight.
+
+ +
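The per-message fields described above can be decoded with a few bit operations. In this sketch the field widths (2-byte type, 2-byte size, 1-byte flags, 3 reserved bytes) and the little-endian byte order are assumptions, and `parse_message_header` is a hypothetical name.

```python
import struct

def parse_message_header(buf, pos=0):
    """Decode the fixed fields that precede one header message's data."""
    # Assumed widths: type (2 bytes), data size (2 bytes), flags (1 byte),
    # followed by 3 reserved bytes, so the data starts 8 bytes after pos.
    msg_type, data_size, flags = struct.unpack_from("<HHB", buf, pos)
    return {
        "type": msg_type & 0x7FFF,               # type number without bit 15
        "type_constant_bit": bool(msg_type & 0x8000),  # bit 15 of the type
        "data_size": data_size,                  # includes padding to 8 bytes
        "constant": bool(flags & 0x01),          # flag bit 0
        "shared": bool(flags & 0x02),            # flag bit 1
        "data_offset": pos + 8,
    }

raw = struct.pack("<HHB3x", 0x8003, 16, 0x01) + bytes(16)
print(parse_message_header(raw))
```

Because the data size already includes padding, the next message would begin at `data_offset + data_size` under these assumptions.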

The header message types and the message data associated with them compose the critical "meta-data" about each object. Some header messages are required for each object while others are optional. Some optional header messages may also be repeated several times in the header itself; the requirements and number of times allowed in the header are noted in each header message description below.

The following is a list of currently defined header messages: + +


+

Name: NIL

+ Type: 0x0000
+ Length: varies
+ Status: Optional, may be repeated.
+ Purpose and Description: The NIL message is used to + indicate a message + which is to be ignored when reading the header messages for a data object. + [Probably one which has been deleted for some reason.]
+ Format of Data: Unspecified.
+ + + + +
+

Name: Simple Dataspace

+ + Type: 0x0001
+ Length: Varies according to the number of dimensions, + as described in the following table
+ Status: The Simple Dataspace message is required + and may not be repeated. This message is currently used with + datasets and named dataspaces.
+ +

The Simple Dataspace message describes the number + of dimensions and size of each dimension that the data object + has. This message is only used for datasets which have a + simple, rectilinear grid layout; datasets requiring a more + complex layout (irregularly structured or unstructured grids, etc.) + must use the Complex Dataspace message for expressing + the space the dataset inhabits. + (Note: The Complex Dataspace functionality is + not yet implemented (as of HDF5 Release 1.2). It is not described + in this document.) + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Simple Dataspace Message +
bytebytebytebyte
VersionDimensionalityFlagsReserved
Reserved
Dimension Size #1 (<size> bytes)
.
.
.
Dimension Size #n (<size> bytes)
Dimension Maximum #1 (<size> bytes)
.
.
.
Dimension Maximum #n (<size> bytes)
Permutation Index #1
.
.
.
Permutation Index #n
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Version This value is used to determine the format of the + Simple Dataspace Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted.
DimensionalityThis value is the number of dimensions that the data + object has.
FlagsThis field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. Bit 1 is used to indicate that + permutation indices are present for each dimension.
Dimension Size #n (<size> bytes)This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension.
Dimension Maximum #n (<size> bytes)This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + value <UNLIMITED> (all bits set) which indicates + that the data may expand along this dimension + indefinitely. If these values are not stored, the maximum + value of each dimension is assumed to be the same as the + current size value.
Permutation Index #n (4 bytes)This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. If these values are + not stored, the first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension.
+
+ + + + + + + +
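The way the Flags field controls which parts of the Simple Dataspace message are present can be sketched as follows. The example assumes 8-byte little-endian `<size>` fields and an 8-byte fixed prefix (version, dimensionality, flags, and five reserved bytes); `parse_simple_dataspace` is a hypothetical name.

```python
def parse_simple_dataspace(buf, size=8):
    """Decode a Simple Dataspace message; size is the assumed <size> width."""
    version, rank, flags = buf[0], buf[1], buf[2]
    pos = 8  # skip version, dimensionality, flags and the reserved bytes

    def read_ints(count, width):
        nonlocal pos
        values = [int.from_bytes(buf[pos + i * width:pos + (i + 1) * width], "little")
                  for i in range(count)]
        pos += count * width
        return values

    dims = read_ints(rank, size)
    # Bit 0: maximum dimensions present; otherwise they equal the current sizes.
    max_dims = read_ints(rank, size) if flags & 0x01 else list(dims)
    # Bit 1: 4-byte permutation indices present.
    permutation = read_ints(rank, 4) if flags & 0x02 else None
    return {"version": version, "dims": dims,
            "max_dims": max_dims, "permutation": permutation}

UNLIMITED = 2**64 - 1  # all bits set
msg = bytes([1, 2, 0x01, 0]) + bytes(4) + b"".join(
    v.to_bytes(8, "little") for v in (10, 20, UNLIMITED, 20))
print(parse_simple_dataspace(msg))
```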
+

Name: Datatype

+ + Type: 0x0003
+ Length: variable
+ Status: One required per dataset or named datatype
+ +

The datatype message defines the datatype for each data point + of a dataset. A datatype can describe an atomic type like a + fixed- or floating-point type or a compound type like a C + struct. A datatype does not, however, describe how data points + are combined to produce a dataset. Datatypes are stored on disk + as a datatype message, which is a list of datatype classes and + their associated properties. + +

+

+ + + + + + + + + + + + + + + + + + + + + + +
+ Datatype Message +
bytebytebytebyte
Type Class and VersionClass Bit Field
Size in Bytes (4 bytes)


Properties


+
+ +

The Class Bit Field and Properties fields vary depending + on the Type Class, which is the low-order four bits of the Type + Class and Version field (the high-order four bits are the + version, which should be set to the value one). The type class + is one of 0 (fixed-point number), 1 (floating-point number), + 2 (date and time), 3 (text string), 4 (bit field), 5 (opaque), + 6 (compound), 7 (reference), 8 (enumeration), or 9 (variable-length). + The Class Bit Field is zero and the size of the + Properties field is zero except for the cases noted here. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field for Fixed-point Numbers (Class 0) +
BitsMeaning
0: Byte Order. If zero, byte order is little-endian; otherwise, byte order is big-endian.
1, 2Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.
3Signed. If this bit is set then the fixed-point + number is in 2's complement form.
4-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + +
+ Properties for Fixed-point Numbers (Class 0) +
ByteByteByteByte
Bit OffsetBit Precision
+
+ +
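To make the packing of the Type Class and Version byte and the fixed-point bit field concrete, here is a minimal decoding sketch. It assumes a 1-byte class/version field, a 3-byte class bit field, 2-byte Bit Offset and Bit Precision properties, and little-endian encoding of the message itself; `parse_datatype_header` is a hypothetical name.

```python
import struct

def parse_datatype_header(buf):
    """Decode the fixed part of a datatype message and class 0 properties."""
    class_and_version = buf[0]
    bit_field = int.from_bytes(buf[1:4], "little")  # 24-bit class bit field
    (size,) = struct.unpack_from("<I", buf, 4)
    info = {
        "class": class_and_version & 0x0F,   # low-order four bits
        "version": class_and_version >> 4,   # high-order four bits
        "size": size,
    }
    if info["class"] == 0:  # fixed-point number
        bit_offset, precision = struct.unpack_from("<HH", buf, 8)
        info.update(
            big_endian=bool(bit_field & 0x01),
            lo_pad=(bit_field >> 1) & 1,
            hi_pad=(bit_field >> 2) & 1,
            signed=bool(bit_field & 0x08),
            bit_offset=bit_offset,
            precision=precision,
        )
    return info

# A 4-byte, little-endian, signed integer with 32 bits of precision.
raw = bytes([0x10, 0x08, 0x00, 0x00]) + struct.pack("<I", 4) + struct.pack("<HH", 0, 32)
print(parse_datatype_header(raw))
```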

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field for Floating-point Numbers (Class 1) +
BitsMeaning
0: Byte Order. If zero, byte order is little-endian; otherwise, byte order is big-endian.
1, 2, 3: Padding type. Bit 1 is the low bits pad type, bit 2 is the high bits pad type, and bit 3 is the internal bits pad type. If a datum has unused bits at either end of, or between, the sign bit, exponent, or mantissa, then the value of bit 1, 2, or 3 is copied to those locations.
4-5: Normalization. The value can be 0 if there is no normalization, 1 if the most significant bit of the mantissa is always set (except for 0.0), and 2 if the most significant bit of the mantissa is not stored but is implied to be set. The value 3 is reserved and will not appear in this field.
6-7Reserved (zero).
8-15Sign. This is the bit position of the sign + bit.
16-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Properties for Floating-point Numbers (Class 1) +
ByteByteByteByte
Bit OffsetBit Precision
Exponent LocationExponent Size in BitsMantissa LocationMantissa Size in Bits
Exponent Bias
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + +
+ Bit Field for Strings (Class 3) +
BitsMeaning
0-3Padding type. This four-bit value determines the + type of padding to use for the string. The values are: + +
+
0 Null terminate. +
A zero byte marks the end of the string and is + guaranteed to be present after converting a long + string to a short string. When converting a short + string to a long string the value is padded with + additional null characters as necessary. + +

+
1 Null pad. +
Null characters are added to the end of the value + during conversions from short values to long values + but conversion in the opposite direction simply + truncates the value. + +

+
2 Space pad. +
Space characters are added to the end of the value + during conversions from short values to long values + but conversion in the opposite direction simply + truncates the value. This is the Fortran + representation of the string. + +

+
3-15 Reserved. +
These values are reserved for future use. +
+
4-7Character Set. The character set to use for + encoding the string. The only character set supported is + the 8-bit ASCII (zero) so no translations have been defined + yet.
8-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field for Bitfield Types (Class 4) +
BitsMeaning
0: Byte Order. If zero, byte order is little-endian; otherwise, byte order is big-endian.
1, 2Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.
3-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + +
+ Properties for Bitfield Types (Class 4) +
ByteByteByteByte
Bit OffsetBit Precision
+
+ +

+

+ + + + + + + + + + + + +
+ Bit Field for Opaque Types (Class 5) +
BitsMeaning
0-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + +
+ Properties for Opaque Types (Class 5) +
ByteByteByteByte

Null-terminated ASCII Tag
+ (multiple of 8 bytes)

+
+ +

+

+ + + + + + + + + + + + + + + + +
+ Bit Field for Compound Types (Class 6) +
BitsMeaning
0-15Number of Members. This field contains the number + of members defined for the compound datatype. The member + definitions are listed in the Properties field of the data + type message. +
16-23: Reserved (zero).
+
+ +

The Properties field of a compound datatype is a list of the + member definitions of the compound datatype. The member + definitions appear one after another with no intervening bytes. + The member types are described with a recursive datatype + message. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Properties for Compound Types (Class 6) +
ByteByteByteByte


Name (null terminated, multiple of + eight bytes)


Byte Offset of Member in Compound Instance
Dimensionalityreserved
Dimension Permutation
Reserved
Size of Dimension 0 (required)
Size of Dimension 1 (required)
Size of Dimension 2 (required)
Size of Dimension 3 (required)


Member Type Message


+
+ +

+

+ + + + + + + + + + + + + + + + + +
+ Bit Field for Enumeration Types (Class 8) +
BitsMeaning
0-15Number of Members. The number of name/value + pairs defined for the enumeration type.
16-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + +
+ Properties for Enumeration Types (Class 8) +
ByteByteByteByte

Parent Type


Names


Values

+
+ +
+ + + + + + + + + + + +
Parent Type:Each enumeration type is based on some parent type, + usually an integer. The information for that parent type is + described recursively by this field.
Names:The name for each name/value pair. Each name is + stored as a null terminated ASCII string in a multiple of + eight bytes. The names are in no particular order.
Values:The list of values in the same order as the names. + The values are packed (no inter-value padding) and the + size of each value is determined by the parent type.
+
+ + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field for Variable-length Types (Class 9) +
BitsMeaning
0-3
Type
+
0 Variable-length sequence
+
This variable-length datatype can be of any sequence + of data. Variable-length sequences do not have padding + or character set information.
+
1 Variable-length string
+
This variable-length datatype is composed of a series of + characters. Variable-length strings have padding and + character set information.
+
4-7
Padding type (variable-length string only)
+
This four-bit value determines the type of padding + used for variable-length strings. The values are the same + as for the string padding type, as follows:
+
0 Null terminate
+
A zero byte marks the end of a string and is guaranteed + to be present after converting a long string to a short + string. When converting a short string to a long string, + the value is padded with additional null characters + as necessary. +
1 Null pad
+
Null characters are added to the end of the value + during conversion from a short string to a longer string. + Conversion from a long string to a shorter string + simply truncates the value.
+
2 Space pad
+
Space characters are added to the end of the value + during conversion from a short string to a longer string. + Conversion from a long string to a shorter string simply + truncates the value. + This is the Fortran representation of the string. +
+
3-15 Reserved
+
These values are reserved for future use.
+
8-11
Character set (variable-length string only)
+
This four-bit value specifies the character set + to be used for encoding the string.
+
0 8-bit ASCII
+
As of this writing (July 2002, Release 1.4.4), + 8-bit ASCII is the only character set supported. + Therefore, no translations have been defined.
+
12-23Reserved (zero).
+
+ +

+

+ + + + + + + + + + + + + + +
+ Properties for Variable-length Types (Class 9) +
ByteByteByteByte

Parent Type

+
+ +
+ + + + + +
Parent Type:Each variable-length type is based on + some parent type. The information for that parent type is + described recursively by this field.
+
+ + + +
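The three sub-fields of the variable-length class bit field can be extracted with shifts and masks, as in the sketch below. The function name `decode_vlen_bit_field` is hypothetical, and raising an error for type values other than 0 and 1 is a choice made for this example rather than something the format prescribes.

```python
PADDING = {0: "null terminate", 1: "null pad", 2: "space pad"}
CHARSETS = {0: "8-bit ASCII"}

def decode_vlen_bit_field(bit_field):
    """Decode the class 9 bit field into type, padding and character set."""
    vl_type = bit_field & 0xF                 # bits 0-3
    if vl_type == 0:
        # Sequences carry no padding or character set information.
        return {"type": "sequence"}
    if vl_type == 1:
        padding = (bit_field >> 4) & 0xF      # bits 4-7
        charset = (bit_field >> 8) & 0xF      # bits 8-11
        return {"type": "string",
                "padding": PADDING.get(padding, "reserved"),
                "charset": CHARSETS.get(charset, "unknown")}
    raise ValueError("unknown variable-length type %d" % vl_type)

print(decode_vlen_bit_field(0x021))
```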

+ + + + +


+

Name: Data Storage - Fill Value

+ Type: 0x0004
+ Length: varies
+ Status: Optional, may not be repeated.
+ +

The fill value message stores a single data point value which is returned to the application when an uninitialized data point is read from the dataset. The fill value is interpreted with the same datatype as the dataset. If no fill value message is present then a fill value of all zeros is assumed.

+

+ + + + + + + + + + + + + + + + + +
+ Fill Value Message +
bytebytebytebyte
Size (4 bytes)

Fill Value

+
+ +

+

+ + + + + + + + + + + + + + + +
Field NameDescription
Size (4 bytes)This is the size of the Fill Value field in bytes.
Fill ValueThe fill value. The bytes of the fill value are + interpreted using the same datatype as for the dataset.
+
+ +
+

Name: Reserved - Not Assigned Yet

+ Type: 0x0005
+ Length: N/A
+ Status: N/A
+ + + +
+

Name: Data Storage - Compact

+ + Type: 0x0006
+ Length: varies
+ Status: Optional, may not be repeated.
+ +

This message indicates that the data for the data object is + stored within the current HDF file by including the actual + data as the header data for this message. The data is + stored internally in + the normal format, i.e. in one chunk, uncompressed, etc. + +

Note that one and only one of the Data Storage headers can be + stored for each data object. + +

Format of Data: The message data is actually composed + of dataset data, so the format will be determined by the dataset + format. + + + +


+

Name: Data Storage - + External Data Files

+ Type: 0x0007
+ Length: varies
+ Status: Optional, may not be repeated.
+ +

Purpose and Description: The external object message + indicates that the data for an object is stored outside the HDF5 + file. The filename of the object is stored as a Uniform + Resource Locator (URL) naming the file that contains the + data. An external file list record also contains the byte offset + of the start of the data within the file and the amount of space + reserved in the file for that data. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ External File List Message +
byte | byte | byte | byte
Version | Reserved
Allocated Slots | Used Slots

Heap Address


Slot Definitions...

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version | This value is used to determine the format of the + External File List Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted.
Reserved | This field is reserved for future use.
Allocated Slots | The total number of slots allocated in the message. Its + value must be at least as large as the value contained in + the Used Slots field.
Used Slots | The number of initial slots which contain valid + information. The remaining slots are zero filled.
Heap Address | This is the address of a local name heap which contains + the names for the external files. The name at offset zero + in the heap is always the empty string.
Slot Definitions | The slot definitions are stored in order according to + the array addresses they represent. If more slots have + been allocated than what has been used then the defined + slots are all at the beginning of the list.
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + +
+ External File List Slot +
byte | byte | byte | byte

Name Offset (<size> bytes)


File Offset (<size> bytes)


Size

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Name Offset (<size> bytes) | The byte offset within the local name heap for the name + of the file. File names are stored as a URL which has a + protocol name, a host name, a port number, and a file + name: + protocol:port//host/file. + If the protocol is omitted then "file:" is assumed. If + the port number is omitted then a default port for that + protocol is used. If both the protocol and the port + number are omitted then the colon can also be omitted. If + the double slash and host name are omitted then + "localhost" is assumed. The file name is the only + mandatory part, and if the leading slash is missing then + it is relative to the application's current working + directory (the use of relative names is not + recommended).
File Offset (<size> bytes) | This is the byte offset to the start of the data in the + specified file. For files that contain data for a single + dataset this will usually be zero.
Size | This is the total number of bytes reserved in the + specified file for raw data storage. For a file that + contains exactly one complete dataset which is not + extendable, the size will usually be the exact size of the + dataset. However, by making the size larger one allows + HDF5 to extend the dataset. The size can be set to a value + larger than the entire file since HDF5 will read zeros + past the end of the file without failing.
+
+ + +
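The slot fields above suggest how a byte of dataset data can be traced to an external file. The following is a minimal sketch, assuming that the used slots, taken in order, form one continuous stream of raw data and that each slot contributes `size` bytes starting at `file_offset` in its file. The `Slot` type, the `locate` function, and the file names are hypothetical and not part of the format.

```python
from typing import NamedTuple


class Slot(NamedTuple):
    name: str         # file name, as looked up in the local name heap
    file_offset: int  # start of the data within that file
    size: int         # bytes reserved in that file for raw data


def locate(slots, data_offset):
    """Return (file name, offset in that file) for one byte of dataset data."""
    if data_offset < 0:
        raise ValueError("offset must be non-negative")
    start = 0  # dataset offset at which the current slot begins
    for slot in slots:
        if data_offset < start + slot.size:
            return slot.name, slot.file_offset + (data_offset - start)
        start += slot.size
    raise ValueError("offset lies beyond the space reserved in the external files")


# Hypothetical example: 100 bytes in a.raw, then 50 bytes in b.raw from byte 16.
slots = [Slot("a.raw", 0, 100), Slot("b.raw", 16, 50)]
print(locate(slots, 120))
```

Here dataset byte 120 falls 20 bytes into the second slot, so it maps to byte 36 of the second file.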
+

Name: Data Storage - Layout

+ + Type: 0x0008
+ Length: varies
+ Status: Required for datasets, may not be repeated. + +

Purpose and Description: Data layout describes how the + elements of a multi-dimensional array are arranged in the linear + address space of the file. Two types of data layout are + supported: + +

    +
  1. The array can be stored in one contiguous area of the file. + The layout requires that the size of the array be constant and + does not permit chunking, compression, checksums, encryption, + etc. The message stores the total size of the array and the + offset of an element from the beginning of the storage area is + computed as in C. + +
  2. The array domain can be regularly decomposed into chunks and + each chunk is allocated separately. This layout supports + arbitrary element traversals, compression, encryption, and + checksums, and the chunks can be distributed across external + raw data files (these features are described in other + messages). The message stores the size of a chunk instead of + the size of the entire array; the size of the entire array can + be calculated by traversing the B-tree that stores the chunk + addresses. +
+ +
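For the contiguous layout, "computed as in C" means row-major order, in which the last dimension varies fastest. A minimal sketch of that calculation follows; the function name and the explicit `element_size` parameter are illustrative assumptions (in a real file the element size comes from the datatype message).

```python
def element_offset(dims, index, element_size):
    """Byte offset of an element from the start of a contiguous, C-ordered array."""
    if len(dims) != len(index):
        raise ValueError("index rank must match array rank")
    linear = 0
    for extent, i in zip(dims, index):
        if not 0 <= i < extent:
            raise IndexError("index out of bounds")
        linear = linear * extent + i  # last dimension varies fastest
    return linear * element_size


# Element [1][2] of a 3 x 4 array of 4-byte values: (1 * 4 + 2) * 4 bytes.
print(element_offset((3, 4), (1, 2), 4))
```

The same arithmetic applies within a single chunk of the chunked layout, with the chunk dimensions in place of the array dimensions.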

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Data Layout Message +
byte | byte | byte | byte
Version | Dimensionality | Layout Class | Reserved
Reserved

Address

Dimension 0 (4-bytes)
Dimension 1 (4-bytes)
...
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version | A version number for the layout message. This + documentation describes version one.
Dimensionality | An array has a fixed dimensionality. This field + specifies the number of dimension size fields later in the + message.
Layout Class | The layout class specifies how the other fields of the + layout message are to be interpreted. A value of one + indicates contiguous storage while a value of two + indicates chunked storage. Other values will be defined + in the future.
Address | For contiguous storage, this is the address of the first + byte of storage. For chunked storage this is the address + of the B-tree that is used to look up the addresses of the + chunks.
Dimensions | For contiguous storage the dimensions define the entire + size of the array while for chunked storage they define + the size of a single chunk.
+
+ + +
+

Name: Reserved - Not Assigned Yet

+ Type: 0x0009
+ Length: N/A
+ Status: N/A
+ Purpose and Description: N/A
+ Format of Data: N/A + +
+

Name: Reserved - Not Assigned Yet

+ Type: 0x000A
+ Length: N/A
+ Status: N/A
+ Purpose and Description: N/A
+ Format of Data: N/A + +
+

Name: Data Storage - Filter Pipeline

+ Type: 0x000B
+ Length: varies
+ Status: Optional, may not be repeated. + +

Purpose and Description: This message describes the + filter pipeline which should be applied to the data stream by + providing filter identification numbers, flags, a name, and + client data. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Pipeline Message +
byte | byte | byte | byte
Version | Number of Filters | Reserved
Reserved

Filter List

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version | The version number for this message. This document + describes version one.
Number of Filters | The total number of filters described by this + message. The maximum possible number of filters in a + message is 32.
Filter List | A description of each filter. A filter description + appears in the next table.
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Description +
byte | byte | byte | byte
Filter Identification | Name Length
Flags | Client Data Number of Values

Name


Client Data

Padding
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Filter Identification | This is a unique (except in the case of testing) + identifier for the filter. Values from zero through 255 + are reserved for filters defined by the NCSA HDF5 + library. Values 256 through 511 have been set aside for + use when developing/testing new filters. The remaining + values are allocated to specific filters by contacting the + HDF5 Development + Team.
Name Length | Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.
Flags | The flags indicate certain properties for a filter. The + bit values defined so far are: + +
+
bit 1 +
If set then the filter is an optional filter. + During output, if an optional filter fails it will be + silently removed from the pipeline. +
+
Client Data Number of Values | Each filter can store a few integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.
Name | If the Name Length field is non-zero then it will + contain the size of this field, a multiple of eight. This + field contains a null-terminated, ASCII character + string to serve as a comment/name for the filter.
Client Data | This is an array of four-byte integers which will be + passed to the filter function. The Client Data Number of + Values determines the number of elements in the + array.
Padding | Four bytes of zeros are added to the message at this + point if the Client Data Number of Values field contains + an odd number.
+
+ +
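The padding rules in the table above determine how many bytes one filter description occupies. The sketch below works through that arithmetic. It assumes the four leading fields are two bytes each (eight bytes in total), which is how the layout table reads but is not stated in the text; the function names and the sample filter name are hypothetical.

```python
def pad8(n):
    """Round n up to the next multiple of eight."""
    return (n + 7) & ~7


def filter_description_size(name, client_values):
    """Size in bytes of one filter description, following the table above."""
    header = 8  # assumed: ID, name length, flags, value count at 2 bytes each
    # The stored name includes a null terminator and is padded to a multiple of 8.
    name_len = pad8(len(name) + 1) if name else 0
    data = 4 * len(client_values)  # four-byte integers
    padding = 4 if len(client_values) % 2 else 0  # only for an odd count
    return header + name_len + data + padding


# A 7-character name plus terminator fills 8 bytes; one value needs 4 bytes of padding.
print(filter_description_size(b"deflate", [6]))
```

With these rules every filter description ends on an eight-byte boundary.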
+

Name: Attribute

+ Type: 0x000C
+ Length: varies
+ Status: Optional, may be repeated.
+ +

Purpose and Description: The Attribute + message is used to list objects in the HDF file which are used + as attributes, or "meta-data" about the current object. An + attribute is a small dataset; it has a name, a datatype, a data + space, and raw data. Since attributes are stored in the object + header they must be relatively small (<64 KB) and can be + associated with any type of object which has an object header + (groups, datasets, named types and spaces, etc.). + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message +
byte | byte | byte | byte
Version | Reserved | Name Size
Type Size | Space Size

Name


Type


Space


Data

+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version | Version number for the message. This document describes + version 1 of attribute messages.
Reserved | This field is reserved for later use and is set to + zero.
Name Size | The length of the attribute name in bytes including the + null terminator. Note that the Name field below may + contain additional padding not represented by this + field.
Type Size | The length of the datatype description in the Type + field below. Note that the Type field may contain + additional padding not represented by this field.
Space Size | The length of the dataspace description in the Space + field below. Note that the Space field may contain + additional padding not represented by this field.
Name | The null-terminated attribute name. This field is + padded with additional null characters to make it a + multiple of eight bytes.
Type | The datatype description follows the same format as + described for the datatype object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.
Space | The dataspace description follows the same format as + described for the dataspace object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.
Data | The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. This + field is not padded with additional zero + bytes.
+
+ +
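Because the Name, Type, and Space fields are each padded to a multiple of eight bytes while the size fields record the unpadded lengths, a reader has to round up before skipping to the next field. A minimal sketch of that bookkeeping follows. It assumes an eight-byte fixed part (one-byte Version and Reserved fields followed by three two-byte sizes), which is inferred from the layout table; the function names and sample sizes are hypothetical.

```python
def pad8(n):
    """Round n up to the next multiple of eight."""
    return (n + 7) & ~7


def attribute_field_offsets(name_size, type_size, space_size):
    """Offsets of the variable-length fields within an attribute message body."""
    header = 8  # assumed: version (1), reserved (1), three 2-byte size fields
    name_at = header
    type_at = name_at + pad8(name_size)
    space_at = type_at + pad8(type_size)
    data_at = space_at + pad8(space_size)  # the data itself is not padded
    return {"name": name_at, "type": type_at, "space": space_at, "data": data_at}


# A 6-byte name ("units" plus terminator), 12-byte datatype, 20-byte dataspace.
print(attribute_field_offsets(6, 12, 20))
```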
+

Name: Object Name

+ +

Type: 0x000D
+ Length: varies
+ Status: Optional, may not be repeated. + +

Purpose and Description: The object name or comment is + designed to be a short description of an object. An object name + is a sequence of non-null ASCII characters (no \0 bytes) with no other + formatting included by the library. + +

+

+ + + + + + + + + + + + + +
+ Name Message +
byte | byte | byte | byte

Name

+
+ +

+

+ + + + + + + + + + +
Field Name | Description
Name | A null terminated ASCII character string.
+
+ +
+

Name: Object Modification Date & Time

+ +

Type: 0x000E
+ Length: fixed
+ Status: Optional, may not be repeated. + +

Purpose and Description: The object modification date + and time is a timestamp which indicates (using ISO-8601 date and + time format) the last modification of an object. The time is + updated when any object header message changes according to the + system clock where the change was posted. + +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Modification Time Message +
byte | byte | byte | byte
Year
Month | Day of Month
Hour | Minute
Second | Reserved
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Year | The four-digit year as an ASCII string. For example, + 1998. All fields of this message should be interpreted + as Coordinated Universal Time (UTC).
Month | The month number as a two digit ASCII string where + January is 01 and December is 12.
Day of Month | The day number within the month as a two digit ASCII + string. The first day of the month is 01.
Hour | The hour of the day as a two digit ASCII string where + midnight is 00 and 11:00pm is 23.
Minute | The minute of the hour as a two digit ASCII string where + the first minute of the hour is 00 and + the last is 59.
Second | The second of the minute as a two digit ASCII string + where the first second of the minute is 00 + and the last is 59.
Reserved | This field is reserved and should always be zero.
+
+ +
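Since every field of this message is an ASCII digit string, the body can be decoded with ordinary date parsing. The sketch below assumes the body is 16 bytes long: 14 digit characters followed by two reserved bytes, as the layout table suggests. The function name and the sample timestamp are hypothetical.

```python
from datetime import datetime, timezone


def parse_modification_time(body: bytes) -> datetime:
    """Decode a modification time message body into a UTC datetime."""
    if len(body) != 16:
        raise ValueError("expected a 16-byte message body")
    # Year (4), month, day, hour, minute, second (2 each); reserved bytes ignored.
    stamp = body[:14].decode("ascii")
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)  # all fields are UTC


print(parse_modification_time(b"19980704123059\x00\x00").isoformat())
```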
+

Name: Shared Object Message

+ Type: 0x000F
+ Length: 4 Bytes
+ Status: Optional, may be repeated. + +

A constant message can be shared among several object headers + by writing that message in the global heap and having the object + headers all point to it. The pointing is accomplished with a + Shared Object message which is understood directly by the object + header layer of the library. It is also possible to have a + message of one object header point to a message in some other + object header, but care must be exercised to prevent cycles. + +

If a message is shared, then the message appears in the global + heap and its message ID appears in the Header Message Type + field of the object header. Also, the Flags field in the object + header for that message will have bit two set (the + H5O_FLAG_SHARED bit). The message body in the + object header will be that of a Shared Object message defined + here and not that of the pointed-to message. + +

+

+ + + + + + + + + + + + + + + + + + + +
+ Shared Message Message +
byte + byte + byte + byte +
Version | Flags | Reserved
Reserved

Pointer

+
+ +

+

+ + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version | The version number for the message. This document + describes version one of shared messages.
Flags | The Shared Message message points to a message which is + shared among multiple object headers. The Flags field + describes the type of sharing: + +
+
Bit 0 +
If this bit is clear then the actual message is the + first message in some other object header; otherwise + the actual message is stored in the global heap. + +
Bits 2-7 +
Reserved (always zero) +
+
Pointer | This field points to the actual message. The format of + the pointer depends on the value of the Flags field. If + the actual message is in the global heap then the pointer + is the file address of the global heap collection that + holds the message, and a four-byte index into that + collection. Otherwise the pointer is a group entry + that points to some other object header.
+
+ + +
+

Name: Object Header Continuation

+Type: 0x0010
+Length: fixed
+Status: Optional, may be repeated.
+Purpose and Description: The object header continuation is the location +in the file of more header messages for the current data object. This can be +used when header blocks are large, or likely to change over time.
+Format of Data:

+ The object header continuation is formatted as follows (assuming a 4-byte +length & offset are being used in the current file): + +

+

+ + + + + + + + + + + + + +
+HDF5 Object Header Continuation Message Layout +
byte | byte | byte | byte
Header Continuation Offset
Header Continuation Length
+
+ +

+

+
The elements of the Header Continuation Message are described below: +
+
+
Header Continuation Offset: (<offset> bytes) +
This value is the offset in bytes from the beginning of the file where the +header continuation information is located. +
Header Continuation Length: (<length> bytes) +
This value is the length in bytes of the header continuation information in +the file. +
+
+ + + +
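The two fields of the continuation message can be read directly once the file's offset and length sizes are known. This is a minimal sketch: the function name is hypothetical, and little-endian byte order is an assumption here, since this section does not state the byte order of metadata fields.

```python
def parse_continuation(body: bytes, size_of_offsets: int = 4, size_of_lengths: int = 4):
    """Return (offset, length) from an object header continuation message body."""
    needed = size_of_offsets + size_of_lengths
    if len(body) < needed:
        raise ValueError("message body too short")
    # Assumption: fields are stored in little-endian byte order.
    offset = int.from_bytes(body[:size_of_offsets], "little")
    length = int.from_bytes(body[size_of_offsets:needed], "little")
    return offset, length


# More header messages at file offset 0x1000, occupying 64 bytes.
print(parse_continuation(bytes([0x00, 0x10, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00])))
```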
+

Name: Group Message

+Type: 0x0011
+Length: fixed
+Status: Required for groups, may not be repeated.
+Purpose and Description: Each group has a B-tree and a +name heap which are pointed to by this message.
+Format of data: +

The group message is formatted as follows: + +

+

+ + + + + + + + + + + + + + +
+HDF5 Object Header Group Message Layout +
byte | byte | byte | byte
B-tree Address
Heap Address
+
+ +

+

+
The elements of the Group Message are described below: +
+
+
B-tree Address (<offset> bytes) +
This value is the offset in bytes from the beginning of the file +where the B-tree is located. +
Heap Address (<offset> bytes) +
This value is the offset in bytes from the beginning of the file +where the group name heap is located. +
+
+ +

Disk Format: Level 2b - Shared Data Object Headers

+

In order to share header messages between several dataset objects, object +header messages may be placed into the global heap. Since these +messages require additional information beyond the basic object header message +information, the format of the shared message is detailed below. + +

+

+ + + + + + + + + + + + + +
+HDF5 Shared Object Header Message +
byte | byte | byte | byte
Reference Count of Shared Header Message

Shared Object Header Message

+
+ +

+

+
The elements of the shared object header message are described below: +
+
+
Reference Count of Shared Header Message: (32-bit unsigned integer) +
This value is used to keep a count of the number of dataset objects which +refer to this message from their dataset headers. When this count reaches zero, +the shared message header may be removed from the global heap. +
Shared Object Header Message: (various lengths) +
The data stored for the shared object header message is formatted in the +same way as the private object header messages described in the object header +description earlier in this document and begins with the header message Type. +
+
+ + +

Disk Format: Level 2c - Data Object Data Storage

+

The data for an object is stored separately from the header +information in the file and may not actually be located in the HDF5 file +itself if the header indicates that the data is stored externally. The +information for each record in the object is stored according to the +dimensionality of the object (indicated in the dimensionality header message). +Multi-dimensional data is stored in C order [same as current scheme], i.e. the +"last" dimension changes fastest. +

Data whose elements are composed of simple number-types are stored in +native-endian IEEE format, unless they are specifically defined as being stored +in a different machine format with the architecture-type information from the +number-type header message. This means that each architecture will need to +[potentially] byte-swap data values into the internal representation for that +particular machine. +

Data with a "variable" sized number-type is stored in a data heap +internal to the HDF5 file. Global heap identifiers are stored in the +data object storage. +

Data whose elements are composed of pointer number-types are stored in several +different ways depending on the particular pointer type involved. Simple +pointers are just stored as the dataset offset of the object being pointed to with the +size of the pointer being the same number of bytes as offsets in the file. +Partial-object pointers are stored as a heap-ID which points to the following +information within the file-heap: an offset of the object pointed to, number-type +information (same format as header message), dimensionality information (same +format as header message), sub-set start and end information (i.e. a coordinate +location for each), and field start and end names (i.e. a [pointer to the] +string indicating the first field included and a [pointer to the] string name +for the last field). + +

Data of a compound datatype is stored as a contiguous stream of the items +in the structure, with each item formatted according to its datatype. + + + diff --git a/doxygen/examples/H5.format.1.1.html b/doxygen/examples/H5.format.1.1.html new file mode 100644 index 0000000..ebbbe8e --- /dev/null +++ b/doxygen/examples/H5.format.1.1.html @@ -0,0 +1,6439 @@ + + + + HDF5 File Format Specification Version 1.1 + + + + + + +

+ + + +
+
    +
  1. Introduction +
  2. Disk Format Level 0 - File Metadata + +
      +
    1. Disk Format Level 0A - File Signature and Super Block +
    2. Disk Format Level 0B - File Driver Info +
    +
    +
  3. Disk Format Level 1 - File Infrastructure + +
      +
    1. Disk Format Level 1A - B-link Trees and B-tree Nodes +
    2. Disk Format Level 1B - Group +
    3. Disk Format Level 1C - Group Entry +
    4. Disk Format Level 1D - Local Heaps +
    5. Disk Format Level 1E - Global Heap +
    6. Disk Format Level 1F - Free-space Index +
    +
    +
  4. Disk Format Level 2 - Data Objects + +
      +
    1. Disk Format Level 2a - Data Object Headers +
        +
      1. Name: NIL +
      2. Name: Simple Dataspace + +
      3. Name: Reserved - not assigned yet +
      4. Name: Datatype +
      5. Name: Data Storage - Fill Value (Old) +
      6. Name: Data Storage - Fill Value +
      +
    +
    +
+
   +
    + +
  1. Disk Format Level 2 - Data Objects + (Continued) +
      +
    1. Disk Format Level 2a - Data Object Headers(Continued) +
        + +
      1. Name: Reserved - not assigned yet +
      2. Name: Data Storage - External Data Files +
      3. Name: Data Storage - Layout +
      4. Name: Reserved - not assigned yet +
      5. Name: Reserved - not assigned yet +
      6. Name: Data Storage - Filter Pipeline +
      7. Name: Attribute +
      8. Name: Object Comment +
      9. Name: Object Modification Date and Time (Old) +
      10. Name: Shared Object Message +
      11. Name: Object Header Continuation +
      12. Name: Group Message +
      13. Name: Object Modification Date and Time +
      +
    2. Disk Format: Level 2b - Data Object Data Storage +
    +
    +
  2. Appendix +
+
+
+ +
+
+ + +

Introduction

+ + + + + + + +
  +
+ HDF5 Groups +
 
  + Figure 1: Relationships among the HDF5 root group, other groups, and objects +
+
 
  + HDF5 Objects +  
  + Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces +
+
 
+ + +

The format of an HDF5 file on disk encompasses several + key ideas of the HDF4 and AIO file formats as well as + addressing some shortcomings therein. The new format is + more self-describing than the HDF4 format and is more + uniformly applied to data objects in the file. + +

An HDF5 file appears to the user as a directed graph. + The nodes of this graph are the higher-level HDF5 objects + that are exposed by the HDF5 APIs: + +

    +
  • Groups +
  • Datasets +
  • Named datatypes +
+ +

At the lowest level, as information is actually written to the disk, + an HDF5 file is made up of the following objects: +

    +
  • A super block +
  • B-tree nodes (containing either symbol nodes or raw data chunks) +
  • Object headers +
  • A global heap +
  • Local heaps +
  • Free space +
+ +

The HDF5 library uses these low-level objects to represent the + higher-level objects that are then presented to the user or + to applications through the APIs. + For instance, a group is an object header that contains a message that + points to a local heap and to a B-tree which points to symbol nodes. + A dataset is an object header that contains messages that describe + datatype, space, layout, filters, external files, fill value, etc + with the layout message pointing to either a raw data chunk or to a + B-tree that points to raw data chunks. + + +

This Document

+ +

This document describes the lower-level data objects; + the higher-level objects and their properties are described + in the HDF5 User's Guide. + +

Three levels of information comprise the file format. + Level 0 contains basic information for identifying and + defining information about the file. Level 1 information contains + the information about the pieces of a file shared by many objects + in the file (such as B-trees and heaps). Level 2 is the rest + of the file and contains all of the data objects, with each object + partitioned into header information, also known as + metadata, and data. + +

The sizes of various fields in the following layout tables are + determined by looking at the number of columns the field spans + in the table. There are three exceptions: (1) The size may be + overridden by specifying a size in parentheses, (2) the size of + addresses is determined by the Size of Offsets field + in the super block and is indicated in this document with a + superscripted 'O', and (3) the size of length fields is determined + by the Size of Lengths field in the super block and is + indicated in this document with a superscripted 'L'. + +

Values for all fields in this document should be treated as unsigned + integers, unless otherwise noted in the description of a field. + Additionally, all metadata fields are stored in little-endian byte + order. +

+ +
+
+ +

+ Disk Format: Level 0 - File Metadata

+ +

+ Disk Format: Level 0A - File Signature and Super Block

+ +

The super block may begin at certain predefined offsets within + the HDF5 file, allowing a block of unspecified content for + users to place additional information at the beginning (and + end) of the HDF5 file without limiting the HDF5 library's + ability to manage the objects within the file itself. This + feature was designed to accommodate wrapping an HDF5 file in + another file format or adding descriptive information to the + file without requiring the modification of the actual file's + information. The super block is located by searching for the + HDF5 file signature at byte offset 0, byte offset 512 and at + successive locations in the file, each double + the previous location, i.e. 0, 512, 1024, 2048, etc. + +
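The search described above can be sketched as a short loop over the candidate offsets. The signature bytes are those given in the HDF5 File Signature field description of this document; the function name is hypothetical, and this is an illustration rather than the library's implementation.

```python
import io

SIGNATURE = b"\x89HDF\r\n\x1a\n"  # the 8-byte HDF5 file signature


def find_superblock(stream):
    """Return the byte offset of the HDF5 signature, or None if it is not found."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    offset = 0
    while offset + len(SIGNATURE) <= file_size:
        stream.seek(offset)
        if stream.read(len(SIGNATURE)) == SIGNATURE:
            return offset
        # Candidate offsets are 0, 512, 1024, 2048, ...
        offset = 512 if offset == 0 else offset * 2
    return None


# A hypothetical file with a 1024-byte user block in front of the super block.
wrapped = io.BytesIO(b"\0" * 1024 + SIGNATURE + b"\0" * 8)
print(find_superblock(wrapped))
```

Any binary stream that supports `seek` and `read` works, for example a file opened with `open(path, "rb")`.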

The super block is composed of a file signature, followed by + super block and group version numbers, information + about the sizes of offset and length values used to describe + items within the file, the size of each group page, + and a group entry for the root object in the file. + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ HDF5 Super Block Layout +
byte | byte | byte | byte

HDF5 File Signature (8 bytes)

Version # of Super Block | Version # of Global Free-space Storage | Version # of Root Group Symbol Table Entry | Reserved (zero)
Version # of Shared Header Message Format | Size of Offsets | Size of Lengths | Reserved (zero)
Group Leaf Node K | Group Internal Node K
File Consistency Flags
Indexed Storage Internal Node K^1 | Reserved (zero)^1
Base Address^O
Address of Global Free-space Heap^O
End of File Address^O
Driver Information Block Address^O
Root Group Symbol Table Entry
+ + + + +
+ (Items marked with an 'O' in the above table are +
+ of the size specified in "Size of Offsets.") +
+ (Items marked with a '1' in the above table are +
+ new in version 1 of the superblock.) +
+
+ +
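The layout above can be read with a fixed-size part followed by four addresses whose width depends on Size of Offsets. The following is a minimal sketch for a version 1 super block. The field widths (one byte for each version and size field, two bytes for each K value and the reserved field beside it, four bytes for the flags) are assumed from the column spans of the layout table, and the function and key names are hypothetical. The root group symbol table entry that follows is not decoded.

```python
import struct

# Signature, eight 1-byte fields, two 2-byte K values, 4-byte flags,
# then the 2-byte indexed storage K and 2 reserved bytes (version 1 only).
FIXED = struct.Struct("<8s8B2HI2H")


def parse_superblock_v1(buf: bytes):
    """Decode the fields of a version 1 super block up to the four addresses."""
    (sig, v_super, v_free, v_root, _r0, v_shared, size_off, size_len, _r1,
     leaf_k, internal_k, flags, istore_k, _r2) = FIXED.unpack_from(buf, 0)
    if sig != b"\x89HDF\r\n\x1a\n":
        raise ValueError("missing HDF5 file signature")
    if v_super != 1:
        raise ValueError("this sketch only handles super block version 1")
    pos = FIXED.size
    addresses = {}
    for name in ("base", "free_space", "end_of_file", "driver_info"):
        if pos + size_off > len(buf):
            raise ValueError("buffer too short for the address fields")
        addresses[name] = int.from_bytes(buf[pos:pos + size_off], "little")
        pos += size_off
    return {
        "size_of_offsets": size_off,
        "size_of_lengths": size_len,
        "group_leaf_node_k": leaf_k,
        "group_internal_node_k": internal_k,
        "consistency_flags": flags,
        "indexed_storage_internal_node_k": istore_k,
        "addresses": addresses,
        "root_entry_offset": pos,  # where the root group symbol table entry begins
    }
```

With eight-byte offsets, the root group symbol table entry begins 60 bytes after the start of the signature (28 fixed bytes plus four 8-byte addresses).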
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
HDF5 File Signature +

This field contains a constant value and can be used to + quickly identify a file as being an HDF5 file. The + constant value is designed to allow easy identification of + an HDF5 file and to allow certain types of data corruption + to be detected. The file signature of an HDF5 file always + contains the following values: +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Decimal: 137 72 68 70 13 10 26 10
Hexadecimal: 89 48 44 46 0d 0a 1a 0a
ASCII C Notation: \211 H D F \r \n \032 \n
+
+
+ +

This signature both identifies the file as an HDF5 file + and provides for immediate detection of common + file-transfer problems. The first two bytes distinguish + HDF5 files on systems that expect the first two bytes to + identify the file type uniquely. The first byte is + chosen as a non-ASCII value to reduce the probability + that a text file may be misrecognized as an HDF5 file; + also, it catches bad file transfers that clear bit + 7. Bytes two through four name the format. The CR-LF + sequence catches bad file transfers that alter newline + sequences. The control-Z character stops file display + under MS-DOS. The final line feed checks for the inverse + of the CR-LF translation problem. (This is a direct + descendant of the PNG file + signature.) +

+ +

This field is present in version 0+ of the superblock. +

+
Version Number of the Super Block +

This value is used to determine the format of the + information in the super block. When the format of the + information in the super block is changed, the version number + is incremented to the next integer and can be used to + determine how the information in the super block is + formatted. +

+ +

Values of 0 and 1 are defined for this field. +

+ +

This field is present in version 0+ of the superblock. +

+
Version Number of the File Free-space Information +

This value is used to determine the format of the + information in the File Free-space Information. +

+

The only value currently valid in this field is '0', which + indicates that the free space index is formatted as described + below. +

+ +

This field is present in version 0+ of the superblock. +

+
Version Number of the Root Group Symbol Table Entry +

This value is used to determine the format of the + information in the Root Group Symbol Table Entry. When the + format of the information in that field is changed, the + version number is incremented to the next integer and can be + used to determine how the information in the field + is formatted. +

+

The only value currently valid in this field is '0', which + indicates that the root group symbol table entry is formatted as + described below. +

+ +

This field is present in version 0+ of the superblock. +

+
Version Number of the Shared Header Message Format +

This value is used to determine the format of the + information in a shared object header message. Since the format + of the shared header messages differs from the other private + header messages, a version number is used to identify changes + in the format. +

+

The only value currently valid in this field is '0', which + indicates that shared header messages are formatted as + described below. +

+ +

This field is present in version 0+ of the superblock. +

+
Size of Offsets +

This value contains the number of bytes used to store + addresses in the file. The values for the addresses of + objects in the file are offsets relative to a base address, + usually the address of the super block signature. This + allows a wrapper to be added after the file is created + without invalidating the internal offset locations. +

+ +

This field is present in version 0+ of the superblock. +

+
Size of Lengths +

This value contains the number of bytes used to store + the size of an object. +

+ +

This field is present in version 0+ of the superblock. +

+
Group Leaf Node K +

Each leaf node of a group B-tree will have at + least this many entries but not more than twice this + many. If a group has a single leaf node then it + may have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0+ of the superblock. +

+
Group Internal Node K +

Each internal node of a group B-tree will have at + least this many entries but not more than twice this + many. If the group has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0+ of the superblock. +

+
File Consistency Flags +

This value contains flags to indicate information + about the consistency of the information contained + within the file. Currently, the following bit flags are + defined: +

    +
  • Bit 0 set indicates that the file is opened for + write-access. +
  • Bit 1 set indicates that the file has + been verified for consistency and is guaranteed to be + consistent with the format defined in this document. +
  • Bits 2-31 are reserved for future use. +
+ Bit 0 should be + set as the first action when a file is opened for write + access and should be cleared only as the final action + when closing a file. Bit 1 should be cleared during + normal access to a file and only set after the file's + consistency is guaranteed by the library or a + consistency utility. +

+ +

This field is present in version 0+ of the superblock. +

+
Indexed Storage Internal Node K +

Each internal node of an indexed storage B-tree will have at + least this many entries but not more than twice this + many. If the tree has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 1+ of the superblock. +

+
Base Address +

This is the absolute file address of the first byte of + the HDF5 data within the file. The library currently + constrains this value to be the absolute file address + of the super block itself when creating new files; + future versions of the library may provide greater + flexibility. When opening an existing file and this address does + not match the offset of the superblock, the library assumes + that the entire contents of the HDF5 file have been adjusted in + the file and adjusts the base address and end of file address to + reflect their new positions in the file. Unless otherwise noted, + all other file addresses are relative to this base + address. +

+ +

This field is present in version 0+ of the superblock. +

+
Address of Global Free-space Index +

Free-space management is not yet defined in the HDF5 + file format and is not handled by the library. + Currently this field always contains the + undefined address. +

+ +

This field is present in version 0+ of the superblock. +

+
End of File Address +

This is the absolute file address of the first byte past + the end of all HDF5 data. It is used to determine whether a + file has been accidentally truncated and as an address where + file data allocation can occur if space from the free list is + not used. +

+ +

This field is present in version 0+ of the superblock. +

+
Driver Information Block Address +

This is the relative file address of the file driver + information block which contains driver-specific + information needed to reopen the file. If there is no + driver information block then this entry should be the + undefined address. +

+ +

This field is present in version 0+ of the superblock. +

+
Root Group Symbol Table Entry +

This is the symbol table entry + of the root group, which serves as the entry point into + the group graph for the file. +

+ +

This field is present in version 0+ of the superblock. +

+
+
+ +
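The File Consistency Flags and the address fields described above can be illustrated with a short sketch. It is written in Python for brevity, and the function names are hypothetical rather than part of the HDF5 library. Two points are assumptions that this table does not state: that multi-byte values are stored little-endian, and that the undefined address is encoded with all bits set.

```python
def decode_consistency_flags(flags: int) -> dict:
    """Interpret the File Consistency Flags field (hypothetical helper)."""
    return {
        "write_access": bool(flags & 0x1),           # bit 0: opened for write access
        "verified": bool(flags & 0x2),               # bit 1: verified as consistent
        "reserved_bits": flags & ~0x3 & 0xFFFFFFFF,  # bits 2-31, expected to be zero
    }


def read_address(buf: bytes, pos: int, size_of_offsets: int):
    """Read one address occupying 'Size of Offsets' bytes.

    Assumes little-endian storage and an all-ones undefined address;
    returns None when the stored value is the undefined address.
    """
    raw = buf[pos:pos + size_of_offsets]
    if len(raw) != size_of_offsets:
        raise ValueError("buffer too short for an address")
    value = int.from_bytes(raw, "little")
    undefined = (1 << (8 * size_of_offsets)) - 1
    return None if value == undefined else value
```

A reader would add the Base Address to any value returned by `read_address` to obtain an absolute file position, since addresses are relative to the base address unless otherwise noted.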

+ Disk Format: Level 0B - File Driver Info

+ +

The file driver information block is an optional region of the + file which contains information needed by the file driver in + order to reopen a file. The format of the file driver information + block is: + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Driver Information Block +
byte | byte | byte | byte
Version | Reserved (zero)
Driver Information Size (4 bytes)

Driver Identification (8 bytes)



Driver Information (n bytes)


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version +

The version number of the driver information block. The + file format documented here is version zero. +

+
Driver Information Size +

The size in bytes of the Driver Information part of this + structure. +

+
Driver Identification +

This is an eight-byte ASCII string without null + termination which identifies the driver and version number + of the Driver Information block. The predefined drivers + supplied with the HDF5 library are identified by the + letters NCSA followed by the first four characters of + the driver name. If the Driver Information block is not + the original version then the last letter(s) of the + identification will be replaced by a version number in + ASCII. +

+

+ For example, the various versions of the multi driver + will be identified by NCSAmult. + (NCSAmult is simply NCSAmulti truncated + to eight characters. Subsequent identifiers will be created by + substituting sequential numerical values for the final character, + starting with zero.) multi driver is the only default driver that + is encoded in this field. +

+

+ Identification for user-defined drivers + is eight bytes long and arbitrary, but it should be unique and avoid + the four-character prefix "NCSA". +

+
Driver Information | Driver information is encoded/decoded in a format defined by the + file driver. The multi driver is the only default driver that has driver + information stored in this field. Its format is explained in the + following block.
+
+ +
+
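The driver information block layout above can be read with a few lines of code. The following is a minimal sketch, assuming little-endian integers (not stated in this table); `parse_driver_info_block` is a hypothetical name used only for illustration.

```python
import struct


def parse_driver_info_block(buf: bytes) -> dict:
    """Split a driver information block into its documented fields."""
    if len(buf) < 16:
        raise ValueError("buffer too short for a driver information block")
    # 1-byte version, 3 reserved bytes, 4-byte Driver Information Size
    version, info_size = struct.unpack_from("<B3xI", buf, 0)
    # 8-byte ASCII identification without null termination
    ident = buf[8:16].decode("ascii")
    info = buf[16:16 + info_size]
    if len(info) != info_size:
        raise ValueError("driver information is truncated")
    return {
        "version": version,
        "driver_id": ident,
        # Predefined drivers use the four-character prefix "NCSA".
        "predefined": ident.startswith("NCSA"),
        "driver_info": info,
    }
```

The `driver_info` bytes are returned uninterpreted because their format is defined by the file driver.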

The driver information for the multi driver has the following format:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Multi Driver Message +
byte | byte | byte | byte
Member Mapping | Member Mapping | Member Mapping | Member Mapping
Member Mapping | Member Mapping | Reserved | Reserved

Address of Member File 1


End of Address for Member File 1


Address of Member File 2


End of Address for Member File 2


... ...


Name of Member File 1


Name of Member File 2


... ...

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Member Mapping

The multi driver enables different types of HDF5 data and + metadata to be written to separate files. These files are viewed by the + library as a single virtual HDF5 file with a single file address space. + At most six files can be created. + In sequence, these Member Mapping fields are for the super block, + B-tree, raw data, global heap, local heap, + and object header. More than one type of data can be written to the + same file.

+

These Member Mapping fields are integer values from 1 to 6 + indicating how the data can be mapped to or merged with another type of + data. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Member Mapping | Description
1 | The super block data.
2 | The B-tree data.
3 | The raw data.
4 | The global heap data.
5 | The local heap data.
6 | The object header data.

+ For example, if the third field has the value 3 and all the rest have the + value 1, it means there are two files, one for raw data, one for super block, + B-tree, global heap, local heap, and object header. +
Reserved

These fields are reserved and should always be zero.

Address of Member File

Specifies the virtual address at which the member file starts. + This is normally an eight-byte integer with a value from 0 (zero) to the + maximum address value.

End of Address for Member File

The end of the allocated address range for the member file. This is normally an eight-byte + integer value.

Name of Member File

The null-terminated name of the member file. Its length should be a multiple of + 8 bytes; additional bytes are padded with NULLs. The default naming + convention is %s-X.h5, where X is one of the letters + s (for super block), b (for B-tree), r (for raw data), + g (for global heap), l (for local heap), and o (for + object header). The name of the whole HDF5 file is substituted for the %s + in the string. +

+
+
+ +
+
+ +
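The Member Mapping fields and the member file naming convention described above can be sketched as follows. This is only an illustration: the helper names are hypothetical, and it assumes that a member file is named after the data type that the mapping value points to, which the table implies but does not state outright.

```python
# The six data types in field order, with the letter used in default file names.
USAGE_TYPES = [
    ("super block", "s"),
    ("B-tree", "b"),
    ("raw data", "r"),
    ("global heap", "g"),
    ("local heap", "l"),
    ("object header", "o"),
]


def member_files(mapping, base_name):
    """Group the six data types by the member file they are mapped to."""
    if len(mapping) != 6:
        raise ValueError("expected six Member Mapping values")
    files = {}
    for (type_name, _), target in zip(USAGE_TYPES, mapping):
        if not 1 <= target <= 6:
            raise ValueError("Member Mapping values range from 1 to 6")
        letter = USAGE_TYPES[target - 1][1]
        files.setdefault(f"{base_name}-{letter}.h5", []).append(type_name)
    return files


def encode_member_name(name: str) -> bytes:
    """Null-terminate a member file name and pad it to a multiple of 8 bytes."""
    raw = name.encode("ascii") + b"\0"
    return raw + b"\0" * (-len(raw) % 8)
```

With the mapping `[1, 1, 3, 1, 1, 1]` from the example in the table, `member_files` reports two files: one holding the raw data and one holding everything else.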

+ Disk Format: Level 1 - File Infrastructure

+

Disk Format: Level 1A - B-link Trees and B-tree Nodes

+ +

B-link trees allow flexible storage for objects which tend to grow + in ways that cause the object to be stored discontiguously. B-trees + are described in various algorithms books including "Introduction to + Algorithms" by Thomas H. Cormen, Charles E. Leiserson, and Ronald + L. Rivest. The B-link tree, in which the sibling nodes at a + particular level in the tree are stored in a doubly-linked list, + is described in the "Efficient Locking for Concurrent Operations + on B-trees" paper by Phillip Lehman and S. Bing Yao as published + in the ACM Transactions on Database Systems, Vol. 6, + No. 4, December 1981. + +

The B-link trees implemented by the file format contain one more + key than the number of children. In other words, each child + pointer out of a B-tree node has a left key and a right key. + The pointers out of internal nodes point to sub-trees while + the pointers out of leaf nodes point to symbol nodes and + raw data chunks. + Aside from that difference, internal nodes and leaf nodes + are identical. + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ B-tree Nodes +
byte | byte | byte | byte
Signature
Node Type | Node Level | Entries Used
Address of Left Sibling (O)
Address of Right Sibling (O)
Key 0 (variable size)
Address of Child 0 (O)
Key 1 (variable size)
Address of Child 1 (O)
...
Key 2K (variable size)
Address of Child 2K (O)
Key 2K+1 (variable size)
+ + + +
+ (Items marked with an 'O' in the above table are +
+ of the size specified in "Size of Offsets.") +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Signature +

The ASCII character string "TREE" is + used to indicate the + beginning of a B-link tree node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file. +

+
Node Type +

Each B-link tree points to a particular type of data. + This field indicates the type of data as well as + implying the maximum degree K of the tree and + the size of each Key field. +

+ + + + + + + + + + + + + + +
Node Type | Description
0 | This tree points to group nodes.
1 | This tree points to raw data chunk nodes.
+
Node Level +

The node level indicates the level at which this node + appears in the tree (leaf nodes are at level zero). Not + only does the level indicate whether child pointers + point to sub-trees or to data, but it can also be used + to help file consistency checking utilities reconstruct + damaged trees. +

+
Entries Used +

This determines the number of children to which this + node points. All nodes of a particular type of tree + have the same maximum degree, but most nodes will point + to fewer than that number of children. The valid child + pointers and keys appear at the beginning of the node + and the unused pointers and keys appear at the end of + the node. The unused pointers and keys have undefined + values. +

+
Address of Left Sibling +

This is the relative file address of the left sibling of + the current node. If the current + node is the left-most node at this level then this field + is the undefined address. +

+
Address of Right Sibling +

This is the relative file address of the right sibling of + the current node. If the current + node is the right-most node at this level then this + field is the undefined address. +

+
Keys and Child Pointers +

Each node has room for 2K+1 keys with 2K + child pointers interleaved between the keys. The number + of keys and child pointers actually containing valid + values is determined by the node's Entries Used field. + If that field is N then the node contains + N child pointers and N+1 keys. +

+
Key +

The format and size of the key values is determined by + the type of data to which this tree points. The keys are + ordered and are boundaries for the contents of the child + pointer; that is, the key values represented by child + N fall between Key N and Key + N+1. Whether the interval is open or closed on + each end is determined by the type of data to which the + tree points. +

+ +

+ The format of the key depends on the node type. + For nodes of node type 0 (group nodes), the key is formatted as + follows: +

+ + + + + +
A single field of Size of Lengths + bytes: Indicates the byte offset into the local heap + for the first object name in the subtree which + that key describes. +
+
+

+ +

+ For nodes of node type 1 (chunked raw data nodes), the key is + formatted as follows: +

+ + + + + + + + + + + + + +
Bytes 1-4: Size of chunk in bytes.
Bytes 5-8: Filter mask, a 32-bit bitfield indicating which + filters have been skipped for this chunk. Each filter + has an index number in the pipeline (starting at 0, with + the first filter to apply) and if that filter is skipped, + the bit corresponding to its index is set.
N 64-bit fields: A 64-bit index indicating the offset of the + chunk within the dataset where N is the number + of dimensions of the dataset. For example, if + a chunk in a 3-dimensional dataset begins at the + position [5,5,5], there will be three + such 64-bit indices, each with the value of + 5.
+
+

+
Child Pointer +

The tree node contains file addresses of subtrees or + data depending on the node level. Nodes at Level 0 point + to data addresses, either raw data chunks or group nodes. + Nodes at non-zero levels point to other nodes of the + same B-tree. +

+

For raw data chunk nodes, the child pointer is the address + of a single raw data chunk. For group nodes, the child pointer + points to a symbol table, which contains + information for multiple symbol table entries. +

+
+
+ +
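The fixed-size part of a B-tree node, as laid out in the table above, can be decoded with a short routine. The sketch below assumes little-endian integers and uses a hypothetical function name; it stops before the keys because their size depends on the node type.

```python
import struct


def parse_btree_node_header(buf: bytes, size_of_offsets: int) -> dict:
    """Decode the fields that precede Key 0 in a B-tree node."""
    so = size_of_offsets
    if len(buf) < 8 + 2 * so:
        raise ValueError("buffer too short for a B-tree node header")
    if buf[:4] != b"TREE":
        raise ValueError("missing TREE signature")
    # 1-byte node type, 1-byte node level, 2-byte entries used
    node_type, node_level, entries_used = struct.unpack_from("<BBH", buf, 4)
    left = int.from_bytes(buf[8:8 + so], "little")
    right = int.from_bytes(buf[8 + so:8 + 2 * so], "little")
    return {
        "node_type": node_type,        # 0 = group nodes, 1 = raw data chunks
        "node_level": node_level,      # 0 = leaf
        "entries_used": entries_used,  # number of valid child pointers
        "left_sibling": left,          # raw value; may be the undefined address
        "right_sibling": right,
        "keys_start": 8 + 2 * so,      # byte position of Key 0
    }
```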

+ Conceptually, each B-tree node looks like this: +

+ + + + + + + + + + + + + +
key[0] child[0] key[1] child[1] key[2] ... ... key[N-1] child[N-1] key[N]
+
+
+ + where child[i] is a pointer to a sub-tree (at a level + above Level 0) or to data (at Level 0). + Each key[i] describes an item stored by the B-tree + (a chunk or an object of a group node). The range of values + represented by child[i] is indicated by key[i] + and key[i+1]. + + +

The following question must next be answered: + "Is the value described by key[i] contained in + child[i-1] or in child[i]?" + The answer depends on the type of tree. + In trees for groups (node type 0) the object described by + key[i] is the greatest object contained in + child[i-1] while in chunk trees (node type 1) the + chunk described by key[i] is the least chunk in + child[i]. + +

That means that key[0] for group trees is sometimes unused; + it points to offset zero in the heap, which is always the + empty string and compares as "less-than" any valid object name. + +

And key[N] for chunk trees is sometimes unused; + it contains a chunk offset which compares as "greater-than" + any other chunk offset and has a chunk byte size of zero + to indicate that it is not actually allocated. + + +
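The two key conventions above differ only in which end of each interval is closed, which a small sketch makes concrete. The functions below are hypothetical helpers, not library code. They assume the group keys have already been resolved from heap offsets to names, and that chunk keys are compared as tuples of per-dimension offsets; the real comparison rules may differ in detail.

```python
from bisect import bisect_left, bisect_right


def child_for_name(keys, name):
    """Group tree (node type 0): child[i] holds names in (key[i], key[i+1]]."""
    n_children = len(keys) - 1
    i = bisect_left(keys, name) - 1
    return i if 0 <= i < n_children else None


def child_for_chunk(keys, offset):
    """Chunk tree (node type 1): child[i] holds chunks in [key[i], key[i+1])."""
    n_children = len(keys) - 1
    i = bisect_right(keys, offset) - 1
    return i if 0 <= i < n_children else None
```

In a group tree the name equal to `key[i]` therefore lands in `child[i-1]`, while in a chunk tree the chunk equal to `key[i]` lands in `child[i]`, matching the answer given above.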

Disk Format: Level 1B - Group and Symbol Nodes

+ +

A group is an object internal to the file that allows + arbitrary nesting of objects within the file (including other groups). + A group maps a set of names in the group to a set of relative + file addresses where objects with those names are located in + the file. Certain metadata for an object to which the group points + can be cached in the group's symbol table in addition to the + object's header. + +

An HDF5 object name space can be stored hierarchically by + partitioning the name into components and storing each + component in a group. The group entry for a + non-ultimate component points to the group containing + the next component. The group entry for the last + component points to the object being named. + +

A group is a collection of group nodes pointed + to by a B-link tree. Each group node contains entries + for one or more symbols. If an attempt is made to add a + symbol to an already full group node containing + 2K entries, then the node is split and one node + contains K symbols and the other contains + K+1 symbols. + +
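The split rule in the preceding paragraph can be sketched as follows. The text does not say which of the two nodes receives the extra symbol, so placing it in the right-hand node is an assumption, and `split_full_node` is a hypothetical name.

```python
def split_full_node(entries, new_entry, k):
    """Insert a symbol into a full group node (2K entries) by splitting it.

    Returns two sorted lists holding K and K+1 symbols respectively.
    """
    if len(entries) != 2 * k:
        raise ValueError("node is not full")
    merged = sorted(entries + [new_entry])
    return merged[:k], merged[k:]
```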
+

+ + + + + + + + + + + + + + + + + + + +
+ Group Node (A Leaf of a B-tree) +
byte | byte | byte | byte
Signature
Version Number | Reserved (0) | Number of Symbols


Group Entries


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Signature +

The ASCII character string "SNOD" is + used to indicate the + beginning of a group node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file. +

+
Version Number +

The version number for the group node. This + document describes version 1. (There is no version '0' + of the group node.) +

+
Number of Symbols +

Although all group nodes have the same length, + most contain fewer than the maximum possible number of + symbol entries. This field indicates how many entries + contain valid data. The valid entries are packed at the + beginning of the group node while the remaining + entries contain undefined values. +

+
Group Entries +

Each symbol has an entry in the group node. + The format of the entry is described below. + There are 2K entries in each group node, where + K is the "Group Leaf Node K" value from the + super block. +

+
+
+ +

+ Disk Format: Level 1C - Group Entry

+ +

Each group entry in a group node is designed + to allow for very fast browsing of stored objects. + Toward that design goal, the group entries + include space for caching certain constant metadata from the + object header. + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Group Entry +
byte | byte | byte | byte
Name Offset (O)
Object Header Address (O)
Cache Type
Reserved


Scratch-pad Space (16 bytes)


+ + + +
+ (Items marked with an 'O' in the above table are +
+ of the size specified in "Size of Offsets.") +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Name Offset +

This is the byte offset into the group local + heap for the name of the object. The name is null + terminated. +

+
Object Header Address +

Every object has an object header which serves as a + permanent location for the object's metadata. In addition + to appearing in the object header, some metadata can be + cached in the scratch-pad space. +

+
Cache Type +

The cache type is determined from the object header. + It also determines the format for the scratch-pad space: +
+ + + + + + + + + + + + + + + + + + + + + +
Type: | Description:
0 | No data is cached by the group entry. This + is guaranteed to be the case when an object header + has a link count greater than one. +
1 | Object header metadata is cached in the group + entry. This implies that the group + entry refers to another group. +
2 | The entry is a symbolic link. The first four bytes + of the scratch-pad space are the offset into the local + heap for the link value. The object header address + will be undefined. +
N | Other cache values can be defined later and + libraries that do not understand the new values will + still work properly. +
+

+
Reserved +

These four bytes are present so that the scratch-pad + space is aligned on an eight-byte boundary. They are + always set to zero. +

+
Scratch-pad Space +

This space is used for different purposes, depending + on the value of the Cache Type field. Any metadata + about a dataset object represented in the scratch-pad + space is duplicated in the object header for that + dataset. This metadata can include the datatype + and the size of the dataspace for a dataset whose datatype + is atomic and whose dataspace is fixed and less than + four dimensions. +

+

+ Furthermore, no data is cached in the group + entry scratch-pad space if the object header for + the group entry has a link count greater than + one. +

+
+
+ +

Format of the Scratch-pad Space

+ +

The group entry scratch-pad space is formatted + according to the value in the Cache Type field. + +

If the Cache Type field contains the value zero + (0) then no information is + stored in the scratch-pad space. + +

If the Cache Type field contains the value one + (1), then the scratch-pad space + contains cached metadata for another object header + in the following format: + +
+

+ + + + + + + + + + + + + + +
+ Object Header Scratch-pad Format +
byte | byte | byte | byte
Address of B-tree (O)
Address of Name Heap (O)
+ + + +
+ (Items marked with an 'O' in the above table are +
+ of the size specified in "Size of Offsets.") +
+
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description
Address of B-tree +

This is the file address for the root of the + group's B-tree. +

+
Address of Name Heap +

This is the file address for the group's local + heap, in which are stored the group's symbol names. +

+
+
+ + +

If the Cache Type field contains the value two + (2), then the scratch-pad space + contains cached metadata for a symbolic link + in the following format: + +
+

+ + + + + + + + + + + + + +
+ Symbolic Link Scratch-pad Format +
byte | byte | byte | byte
Offset to Link Value
+
+ +
+
+ + + + + + + + + + +
Field Name | Description
Offset to Link Value +

The value of a symbolic link (that is, the name of the + thing to which it points) is stored in the local heap. + This field is the 4-byte offset into the local heap for + the start of the link value, which is null terminated. +

+
+
+ +
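Putting the group entry layout and the three scratch-pad formats together, one entry can be decoded as sketched below. The function name is hypothetical and little-endian byte order is assumed. Unknown cache types are left uninterpreted, in line with the note that readers that do not understand new values should still work.

```python
import struct


def parse_group_entry(buf: bytes, pos: int, size_of_offsets: int) -> dict:
    """Decode one group entry starting at byte position 'pos'."""
    so = size_of_offsets
    entry_size = 2 * so + 4 + 4 + 16  # two offsets, cache type, reserved, scratch-pad
    if len(buf) - pos < entry_size:
        raise ValueError("buffer too short for a group entry")
    name_offset = int.from_bytes(buf[pos:pos + so], "little")
    header_addr = int.from_bytes(buf[pos + so:pos + 2 * so], "little")
    cache_type, _reserved = struct.unpack_from("<II", buf, pos + 2 * so)
    scratch = buf[pos + 2 * so + 8:pos + entry_size]
    entry = {
        "name_offset": name_offset,
        "object_header_address": header_addr,
        "cache_type": cache_type,
        "entry_size": entry_size,
    }
    if cache_type == 1:
        # Cached metadata for another group: B-tree and name heap addresses.
        entry["btree_address"] = int.from_bytes(scratch[:so], "little")
        entry["name_heap_address"] = int.from_bytes(scratch[so:2 * so], "little")
    elif cache_type == 2:
        # Symbolic link: 4-byte offset of the link value in the local heap.
        entry["link_value_offset"] = struct.unpack_from("<I", scratch, 0)[0]
    return entry
```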

Disk Format: Level 1D - Local Heaps

+ +

A heap is a collection of small heap objects. Objects can be + inserted and removed from the heap at any time. + The address of a heap does not change once the heap is created. + References to objects are stored in the group table; + the names of those objects are stored in the local heap. +

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Local Heap +
byte | byte | byte | byte
Signature
Version | Reserved (zero)
Data Segment Size (L)
Offset to Head of Free-list (L)
Address of Data Segment (O)
+ + + + +
+ (Items marked with an 'L' in the above table are +
+ of the size specified in "Size of Lengths.") +
+ (Items marked with an 'O' in the above table are +
+ of the size specified in "Size of Offsets.") +
+
+ +

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Signature +

The ASCII character string "HEAP" + is used to indicate the + beginning of a heap. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+
Version +

Each local heap has its own version number so that new + heaps can be added to old files. This document + describes version zero (0) of the local heap. +

+
Data Segment Size +

The total amount of disk memory allocated for the heap + data. This may be larger than the amount of space + required by the objects stored in the heap. The extra + unused space in the heap holds a linked list of free blocks. +

+
Offset to Head of Free-list +

This is the offset within the heap data segment of the + first free block (or the + undefined address if there is no + free block). The free block contains "Size of Lengths" bytes that + are the offset of the next free block (or the + value '1' if this is the + last free block) followed by "Size of Lengths" bytes that store + the size of this free block. The size of the free block includes + the space used to store the offset of the next free block and + the size of the current block, making the minimum size of a free block + 2 * "Size of Lengths". +

+
Address of Data Segment +

The data segment originally starts immediately after + the heap header, but if the data segment must grow as a + result of adding more objects, then the data segment may + be relocated, in its entirety, to another part of the + file. +

+
+
+ +

Objects within the heap should be aligned on an 8-byte boundary. + +
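The free-list encoding and the null-terminated names held in the local heap data segment can be sketched as follows. Both helpers are hypothetical, assume little-endian integers, and assume that the undefined address used for an empty free list has all bits set.

```python
def free_blocks(data: bytes, head: int, size_of_lengths: int):
    """Yield (offset, size) for each free block in a heap data segment."""
    sl = size_of_lengths
    undefined = (1 << (8 * sl)) - 1  # assumed encoding of "no free block"
    offset = head
    seen = set()
    while offset != undefined:
        if offset in seen or offset + 2 * sl > len(data):
            raise ValueError("corrupt free list")
        seen.add(offset)
        next_offset = int.from_bytes(data[offset:offset + sl], "little")
        size = int.from_bytes(data[offset + sl:offset + 2 * sl], "little")
        yield offset, size
        if next_offset == 1:  # the value 1 marks the last free block
            break
        offset = next_offset


def heap_string(data: bytes, offset: int) -> str:
    """Return the null-terminated name stored at 'offset' in the data segment."""
    end = data.index(b"\0", offset)
    return data[offset:end].decode("ascii")
```

Because every free block stores its own next-offset and size, the smallest block that can appear in the list is `2 * size_of_lengths` bytes, as stated in the field description.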

Disk Format: Level 1E - Global Heap

+ +

Each HDF5 file has a global heap which stores various types of + information which is typically shared between datasets. The + global heap was designed to satisfy these goals: + +

    +
  1. Repeated access to a heap object must be efficient without + resulting in repeated file I/O requests. Since global heap + objects will typically be shared among several datasets, it is + probable that the object will be accessed repeatedly. +
  2. Collections of related global heap objects should result in + fewer and larger I/O requests. For instance, a dataset of + object references will have a global heap object for each + reference. Reading the entire set of object references + should result in a few large I/O requests instead of one small + I/O request for each reference. +
  3. It should be possible to remove objects from the global heap + and the resulting file hole should be eligible to be reclaimed + for other uses. +
+

+ +

The implementation of the heap makes use of the memory + management already available at the file level and combines that + with a new top-level object called a collection to + achieve the second goal above. The global heap is the set of all collections. + Each global heap object belongs to exactly one collection and + each collection contains one or more global heap objects. For + the purposes of disk I/O and caching, a collection is treated as + an atomic object. +

+ +

The HDF5 library creates global heap collections as needed, so there may + be multiple collections throughout the file. The set of all of them is + abstractly called the "global heap", although they don't actually link + to each other, and there is no global place in the file where you can + discover all of the collections. The collections are found simply by + finding a reference to one through another object in the file (e.g., + variable-length datatype elements). +

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ A Global Heap Collection +
byte | byte | byte | byte
Signature
Version | Reserved (zero)
Collection Size (L)

Global Heap Object 1


Global Heap Object 2


...


Global Heap Object N


Global Heap Object 0 (free space)

+ + + +
+ (Items marked with an 'L' in the above table are +
+ of the size specified in "Size of Lengths.") +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Signature +

The ASCII character string "GCOL" + is used to indicate the + beginning of a collection. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+
Version +

Each collection has its own version number so that new + collections can be added to old files. This document + describes version one (1) of the collections (there is no + version zero (0)). +

+
Collection Size +

This is the size in bytes of the entire collection + including this field. The default (and minimum) + collection size is 4096 bytes which is a typical file + system block size. This allows for 127 16-byte heap + objects plus their overhead (the collection header of 16 bytes + and the 16 bytes of information about each heap object). +

+
Global Heap Object 1 through N +

The objects are stored in any order with no + intervening unused space. +

+
Global Heap Object 0 +

Global Heap Object 0 (zero), when present, represents the free + space in the collection. Free space always appears at the end of + the collection. If the free space is too small to store the header + for Object 0 (described below) then the header is implied and the + collection contains no free space. +

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Global Heap Object +
byte | byte | byte | byte
Heap Object ID | Reference Count
Reserved
Object Size (L)

Object Data

+ + + +
+ (Items marked with an 'L' in the above table are +
+ of the size specified in "Size of Lengths.") +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Heap Object ID +

Each object has a unique identification number within a + collection. The identification numbers are chosen so that + new objects have the smallest value possible with the + exception that the identifier 0 always refers to the + object which represents all free space within the + collection. +

+
Reference Count +

All heap objects have a reference count field. An + object which is referenced from some other part of the + file will have a positive reference count. The reference + count for Object 0 is always zero. +

+
Reserved +

Zero padding to align next field on an 8-byte boundary. +

+
Object Size +

This is the size of the object data stored for the object. + The actual storage space allocated for the object data is rounded + up to a multiple of eight. +

+
Object Data +

The object data is treated as a one-dimensional array + of bytes to be interpreted by the caller. +

+
+
+ +
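The collection and heap object layouts above suggest a simple way to walk a global heap collection. The sketch below is an illustration under stated assumptions: integers are little-endian, the function name is hypothetical, and the walk simply stops at Object 0 without interpreting its size.

```python
import struct


def iter_global_heap_objects(buf: bytes, size_of_lengths: int):
    """Yield (object_id, reference_count, data) for objects in one collection."""
    sl = size_of_lengths
    if buf[:4] != b"GCOL":
        raise ValueError("missing GCOL signature")
    collection_size = int.from_bytes(buf[8:8 + sl], "little")
    limit = min(collection_size, len(buf))
    object_header = 8 + sl  # ID, reference count, reserved, object size
    pos = 8 + sl            # first object follows the collection header
    while pos + object_header <= limit:
        obj_id, ref_count = struct.unpack_from("<HH", buf, pos)
        size = int.from_bytes(buf[pos + 8:pos + 8 + sl], "little")
        if obj_id == 0:
            break  # Object 0 represents the free space at the end
        yield obj_id, ref_count, buf[pos + object_header:pos + object_header + size]
        # Storage for the object data is rounded up to a multiple of eight.
        pos += object_header + (size + 7) // 8 * 8
```

With a Size of Lengths of 8, both the collection header and each object header occupy 16 bytes, which agrees with the sizes quoted in the Collection Size description.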

Disk Format: Level 1F - Free-space Index

+ +

The free-space index is a collection of blocks of data, + dispersed throughout the file, which are currently not used by + any file objects. + +

The super block contains a pointer to the root of the free-space description; + that pointer is currently required to be the + undefined address. + +

The format of the free-space index is not defined at this time. + + + +
+


+ +

Disk Format: Level 2 - Data Objects

+ +

Data objects contain the real information in the file. These + objects compose the scientific data and other information which + are generally thought of as "data" by the end-user. All the + other information in the file is provided as a framework for + these data objects. +

+ +

A data object is composed of header information and data + information. The header information contains the information + needed to interpret the data information for the data object as + well as additional "metadata" or pointers to additional + "metadata" used to describe or annotate each data object. +

+ +

+ Disk Format: Level 2A - Data Object Headers

+ +

The header information of an object is designed to encompass + all the information about an object, except for the data itself. + This information includes + the dataspace, datatype, information about how the data + is stored on disk (in external files, compressed, broken up in + blocks, etc.), as well as other information used by the library + to speed up access to the data objects or maintain a file's + integrity. Information stored by user applications as attributes + is also stored in the object's header. The header of each object is + not necessarily located immediately prior to the object's data in the + file and in fact may be located in any position in the file. The order + of the messages in an object header is not significant. +

+ +

Header messages are aligned on 8-byte boundaries. +

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Object Headers +
byte | byte | byte | byte
Version | Reserved (zero) | Number of Header Messages
Object Reference Count
Object Header Size
Header Message Type #1 | Size of Header Message Data #1
Header Message #1 Flags | Reserved (zero)

Header Message Data #1

.
.
.
Header Message Type #n | Size of Header Message Data #n
Header Message #n Flags | Reserved (zero)

Header Message Data #n

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version +

This value is used to determine the format of the + information in the object header. When the format of the + information in the object header is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. This + document describes version one (1) (there was no version + zero (0)). +

+
Number of Header Messages +

This value determines the number of messages listed in + object headers for this object. This value includes the messages + in continuation messages for this object. +

+
Object Reference Count +

This value specifies the number of "hard links" to this object + within the current file. References to the object from external + files, "soft links" in this file and object references in this + file are not tracked. +

+
Object Header Size +

This value specifies the number of bytes of header message data + following this length field that contain object header messages + for this object header. This value does not include the size of + object header continuation blocks for this object elsewhere in the + file. +

+
Header Message Type +

This value specifies the type of information included in the + following header message data. The header message types for the + pre-defined header messages are included in sections below. +

+
Size of Header Message Data +

This value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size includes + padding bytes to make the message a multiple of eight + bytes. +

+
Header Message Flags +

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
BitDescription
0If set, the message data is constant. This is used + for messages like the datatype message of a dataset. +
1If set, the message is stored in the global heap. + The Header Message Data field contains a Shared Object + message and the Size of Header Message Data field + contains the size of that Shared Object message. +
2-7Reserved
+

+
Header Message Data +

The format and length of this field is determined by the + header message type and size respectively. Some header + message types do not require any data and this information + can be eliminated by setting the length of the message to + zero. The data is padded with enough zeros to make the + size a multiple of eight. +

+
+
+ +
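
The object header layout above can be walked with a short parser. The following is a minimal sketch, not the library's implementation. It assumes little-endian fields, the field widths implied by the four-byte-wide table (1-byte version, 2-byte message count, 2-byte message type and size, 1-byte flags), and that the first message starts at the next 8-byte boundary after the 12-byte prefix. The function name `parse_object_header_v1` and the dictionary keys are hypothetical.

```python
import struct


def parse_object_header_v1(buf):
    """Parse a version 1 object header and its in-line header messages."""
    # Prefix: version (1), reserved (1), message count (2),
    # reference count (4), header size (4). Little-endian is assumed.
    version, _reserved, n_messages, ref_count, header_size = struct.unpack_from(
        "<BBHII", buf, 0
    )
    if version != 1:
        raise ValueError(f"unsupported object header version {version}")

    # Assumption: messages begin at the next 8-byte boundary after the
    # 12-byte prefix, because header messages are 8-byte aligned.
    pos = 16
    end = pos + header_size
    messages = []
    while pos < end and len(messages) < n_messages:
        # Message header: type (2), data size (2), flags (1), reserved (3).
        msg_type, data_size, flags = struct.unpack_from("<HHB", buf, pos)
        pos += 8
        messages.append(
            {
                "type": msg_type,
                "constant": bool(flags & 0x01),  # flag bit 0
                "shared": bool(flags & 0x02),  # flag bit 1
                "data": bytes(buf[pos:pos + data_size]),
            }
        )
        pos += data_size  # the stored size already includes padding
    return {"ref_count": ref_count, "messages": messages}
```

The loop stops at the end of this header block even when fewer messages than the stated count were read, because the count also covers messages stored in continuation blocks elsewhere in the file.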

The header message types and the message data associated with + them compose the critical "metadata" about each object. Some + header messages are required for each object while others are + optional. Some optional header messages may also be repeated + several times in the header itself; the requirements and number + of times allowed in the header are noted in each header + message description below. +

+ +

The following is a list of currently defined header messages: +

+ +
+

Name: NIL

+ +

Header Message Type: 0x0000 +

+

Length: varies +

+

Status: Optional, may be repeated. +

+

Purpose and Description: The NIL message is used to indicate a + message which is to be ignored when reading the header messages for a + data object. [Possibly one which has been deleted for some reason.] +

+

Format of Data: Unspecified. +

+ +
+

Name: Simple Dataspace

+ +

Header Message Type: 0x0001 +

+

Length: Varies according to the number of dimensions, + as described in the following table. +

+

Status: Required for dataset objects, may not be + repeated. +

+

Description: The simple dataspace message describes the + number of dimensions (i.e. "rank") and size of each dimension that the + data object has. This message is only used for datasets which have a + simple, rectilinear grid layout; datasets requiring a more complex + layout (irregularly structured or unstructured grids, etc.) must use + the Complex Dataspace message for expressing the space the + dataset inhabits. (Note: The Complex Dataspace + functionality is not yet implemented and it is not described in this + document.) +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Simple Dataspace Message +
bytebytebytebyte
VersionDimensionalityFlagsReserved
Reserved
Dimension #1 SizeL
.
.
.
Dimension #n SizeL
Dimension #1 Maximum SizeL
.
.
.
Dimension #n Maximum SizeL
Permutation Index #1L
.
.
.
Permutation Index #nL
+ + + +
+ (Items marked with an 'L' in the above table are
+ of the size specified in "Size of Lengths.") +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Version +

This value is used to determine the format of the + Simple Dataspace Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the message is formatted. This + document describes version one (1) (there was no version + zero (0)). +

+
Dimensionality +

This value is the number of dimensions that the data + object has. +

+
Flags +

This field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. Bit 1 is used to indicate that + permutation indices are present. +

+
Dimension #n Size +

This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+
Dimension #n Maximum Size +

This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + "unlimited" size which indicates + that the data may expand along this dimension indefinitely. + If these values are not stored, the maximum size of each + dimension is assumed to be the dimension's current size. +

+
Permutation Index #n +

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. If these values are + not stored, the first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+
+
+ +
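
As an illustration of how the Flags field controls the optional parts of the message, the following sketch decodes a Simple Dataspace message. It assumes little-endian fields and takes the "Size of Lengths" as a parameter (8 is only a default chosen for the example). The name `parse_simple_dataspace` is hypothetical, and the special "unlimited" value is not interpreted.

```python
def parse_simple_dataspace(buf, size_of_lengths=8):
    """Decode a version 1 Simple Dataspace message into Python lists."""
    version, rank, flags = buf[0], buf[1], buf[2]
    if version != 1:
        raise ValueError(f"unsupported dataspace message version {version}")
    pos = 8  # version, dimensionality, flags, reserved, then 4 reserved bytes

    def read_lengths(start):
        # Read `rank` consecutive values, each `size_of_lengths` bytes wide.
        stop = start + rank * size_of_lengths
        values = [
            int.from_bytes(buf[i:i + size_of_lengths], "little")
            for i in range(start, stop, size_of_lengths)
        ]
        return values, stop

    dims, pos = read_lengths(pos)

    # Bit 0: maximum dimensions are present; otherwise they equal the
    # current sizes.
    max_dims = list(dims)
    if flags & 0x01:
        max_dims, pos = read_lengths(pos)

    # Bit 1: permutation indices are present.
    permutation = None
    if flags & 0x02:
        permutation, pos = read_lengths(pos)

    return {"dims": dims, "max_dims": max_dims, "permutation": permutation}
```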

+ + + +
+

Name: Reserved - Not Assigned Yet

+ Header Message Type: 0x0002
+ Length: N/A
+ Status: N/A
+ Format of Data: N/A
+ +

Purpose and Description: This message type was skipped during + the initial specification of the file format and may be used in a + future expansion to the format. + + +


+

Name: Datatype

+ +

Header Message Type: 0x0003 +

+

Length: variable +

+

Status: Required for dataset or named datatype objects, + may not be repeated. +

+ +

Description: The datatype message defines the datatype + for each element of a dataset. A datatype can describe an atomic type + like a fixed- or floating-point type or a compound type like a C + struct. + Datatype messages are stored + as a list of datatype classes and + their associated properties. +

+ +

Datatype messages that are part of a dataset object + do not describe how elements are related to one another; the dataspace + message is used for that purpose. Datatype messages that are part of + a named datatype object describe an "abstract" datatype that can be + used by other objects in the file. +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Datatype Message +
bytebytebytebyte
Class and VersionClass Bit Field, Bits 0-7Class Bit Field, Bits 8-15Class Bit Field, Bits 16-23
Size


Properties


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Class and Version +

The version of the datatype message and the datatype's class + information are packed together in this field. The version + number is packed in the top 4 bits of the field and the class + is contained in the bottom 4 bits. +

+

The version number information is used for changes in the + format of the datatype message and is described here: + + + + + + + + + + + + + + + + + + +
VersionDescription
0Never used +
1Used by early versions of the library to encode + compound datatypes with explicit array fields. + See the compound datatype description below for + further details. +
2The current version used by the library. +
+

+

The class of the datatype determines the format for the class + bit field and properties portion of the datatype message, which + are described below. The + following classes are currently defined: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Fixed-Point
1Floating-Point
2Time
3String
4Bitfield
5Opaque
6Compound
7Reference
8Enumerated
9Variable-Length
10Array
+

+
Class Bit Fields +

The information in these bit fields is specific to each datatype + class and is described below. All bits not defined for a + datatype class are set to zero. +

+
Size +

The size of the datatype in bytes. +

+
Properties +

This variable-sized field encodes information specific to each + datatype class and is described below. If there is no + property information specified for a datatype class, the size + of this field is zero. +

+
+
+

+ +
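
The packing of the first eight bytes of a datatype message can be sketched as follows. This is an illustrative decoder, assuming a little-endian 4-byte Size field. The name `parse_datatype_header` and the dictionary keys are hypothetical.

```python
DATATYPE_CLASSES = {
    0: "Fixed-Point",
    1: "Floating-Point",
    2: "Time",
    3: "String",
    4: "Bitfield",
    5: "Opaque",
    6: "Compound",
    7: "Reference",
    8: "Enumerated",
    9: "Variable-Length",
    10: "Array",
}


def parse_datatype_header(buf):
    """Split the fixed part of a datatype message into its fields."""
    version = buf[0] >> 4  # top 4 bits
    type_class = buf[0] & 0x0F  # bottom 4 bits
    # Bits 0-7 come first, then 8-15, then 16-23, so the three bytes
    # read as one little-endian 24-bit integer.
    class_bits = int.from_bytes(buf[1:4], "little")
    size = int.from_bytes(buf[4:8], "little")
    return {
        "version": version,
        "class": DATATYPE_CLASSES.get(type_class, "Unknown"),
        "class_bits": class_bits,
        "size": size,
    }
```

The class-specific Properties field, if any, starts immediately after these eight bytes.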

Class specific information for Fixed-Point Numbers (Class 0): + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.
1, 2Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.
3Signed. If this bit is set then the fixed-point + number is in 2's complement form.
4-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + + +
+ Property Descriptions +
ByteByteByteByte
Bit OffsetBit Precision
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field NameDescription
Bit Offset +

The bit offset of the first significant bit of the fixed-point + value within the datatype. The bit offset specifies the number + of bits "to the right of" the value. +

+
Bit Precision +

The number of bits of precision of the fixed-point value + within the datatype. +

+
+
+

+ +
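
To show how the fixed-point bit field and properties combine, the sketch below extracts a value from a stored datum. It assumes the two property fields are 2-byte little-endian integers, as the table layout suggests. The name `decode_fixed_point` is hypothetical, and the padding bits are simply ignored on read.

```python
import struct


def decode_fixed_point(datum, class_bits, properties):
    """Extract the integer stored in `datum` for a Class 0 datatype."""
    byte_order = "big" if class_bits & 0x01 else "little"  # bit 0
    signed = bool(class_bits & 0x08)  # bit 3
    bit_offset, precision = struct.unpack_from("<HH", properties, 0)

    raw = int.from_bytes(datum, byte_order)
    # Drop the bits "to the right of" the value, then keep `precision` bits.
    value = (raw >> bit_offset) & ((1 << precision) - 1)

    # Two's complement: a set top bit means the value is negative.
    if signed and value >> (precision - 1):
        value -= 1 << precision
    return value
```

For example, a 12-bit unsigned value stored at bit offset 4 of a 16-bit datum is recovered by shifting right by four bits and masking with `0xFFF`.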

Class specific information for Floating-Point Numbers (Class 1): + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.
1, 2, 3Padding type. Bit 1 is the low bits pad type, bit 2 + is the high bits pad type, and bit 3 is the internal bits + pad type. If a datum has unused bits at either end or between + the sign bit, exponent, or mantissa, then the value of bit + 1, 2, or 3 is copied to those locations.
4-5Normalization. The value can be 0 if there is no + normalization, 1 if the most significant bit of the + mantissa is always set (except for 0.0), and 2 if the most + significant bit of the mantissa is not stored but is + implied to be set. The value 3 is reserved and will not + appear in this field.
6-7Reserved (zero).
8-15Sign Location. This is the bit position of the sign + bit. Bits are numbered with the least significant bit zero.
16-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Property Descriptions +
ByteByteByteByte
Bit OffsetBit Precision
Exponent LocationExponent SizeMantissa LocationMantissa Size
Exponent Bias
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Bit Offset +

The bit offset of the first significant bit of the floating-point + value within the datatype. The bit offset specifies the number + of bits "to the right of" the value. +

+
Bit Precision +

The number of bits of precision of the floating-point value + within the datatype. +

+
Exponent Location +

The bit position of the exponent field. Bits are numbered with + the least significant bit number zero. +

+
Exponent Size +

The size of the exponent field in bits. +

+
Mantissa Location +

The bit position of the mantissa field. Bits are numbered with + the least significant bit number zero. +

+
Mantissa Size +

The size of the mantissa field in bits. +

+
Exponent Bias +

The bias of the exponent field. +

+
+
+

+ +
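
As a worked example of the floating-point fields, the following sketch reconstructs a value from a datum using the sign, exponent, and mantissa descriptions. It is deliberately narrow. It assumes a bit offset of zero and 2-byte, 1-byte, and 4-byte little-endian property fields as laid out in the table. It handles only normalization value 2 (implied leading bit), and it does not treat zero, infinities, or other special encodings. The name `decode_float` is hypothetical.

```python
import struct


def decode_float(datum, class_bits, properties):
    """Decode a finite, normalized Class 1 value (implied leading bit only)."""
    byte_order = "big" if class_bits & 0x01 else "little"  # bit 0
    normalization = (class_bits >> 4) & 0x03  # bits 4-5
    sign_location = (class_bits >> 8) & 0xFF  # bits 8-15
    (_bit_offset, _precision, exp_loc, exp_size,
     man_loc, man_size, exp_bias) = struct.unpack_from("<HHBBBBI", properties, 0)
    if normalization != 2:
        raise NotImplementedError("only the implied-leading-bit form is sketched")

    raw = int.from_bytes(datum, byte_order)
    sign = (raw >> sign_location) & 1
    exponent = (raw >> exp_loc) & ((1 << exp_size) - 1)
    mantissa = (raw >> man_loc) & ((1 << man_size) - 1)

    # The unstored most significant bit contributes the leading 1.
    significand = 1.0 + mantissa / (1 << man_size)
    value = significand * 2.0 ** (exponent - exp_bias)
    return -value if sign else value
```

With the familiar IEEE 754 single-precision parameters (sign at bit 31, an 8-bit exponent at bit 23 with bias 127, a 23-bit mantissa at bit 0), the sketch reproduces ordinary 32-bit floats.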

Class specific information for Time (Class 2): + +
+

+ + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.
1-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + +
+ Property Descriptions +
ByteByte
Bit Precision
+
+ +
+
+ + + + + + + + + + + +
Field NameDescription
Bit Precision +

The number of bits of precision of the time value. +

+
+
+

+ +

Class specific information for Strings (Class 3): + +
+

+ + + + + + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-3Padding type. This four-bit value determines the + type of padding to use for the string. The values are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Null Terminate: A zero byte marks the end of the + string and is guaranteed to be present after + converting a long string to a short string. When + converting a short string to a long string the value is + padded with additional null characters as necessary. +
1Null Pad: Null characters are added to the end of + the value during conversions from short values to long + values but conversion in the opposite direction simply + truncates the value. +
2Space Pad: Space characters are added to the end of + the value during conversions from short values to long + values but conversion in the opposite direction simply + truncates the value. This is the Fortran + representation of the string. +
3-15Reserved +
+
4-7Character Set. The character set to use for + encoding the string. The only character set supported is + the 8-bit ASCII (zero) so no translations have been defined + yet.
8-23Reserved (zero).
+
+ +

There are no properties defined for the string class. +

+

+ +
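
The three string padding types differ only in how a value is fitted into a destination of a different length. The sketch below illustrates the rules as described in the table. `fit_string` is a hypothetical helper that takes the string content without any terminator.

```python
def fit_string(text, dst_len, padding):
    """Fit `text` (bytes, no terminator) into exactly `dst_len` bytes."""
    if padding == 0:
        # Null Terminate: always leave room for at least one zero byte.
        if dst_len < 1:
            raise ValueError("a null-terminated string needs at least 1 byte")
        body = text[:dst_len - 1]
        return body + b"\x00" * (dst_len - len(body))
    if padding == 1:
        # Null Pad: pad with zero bytes, truncate without a terminator.
        return text[:dst_len].ljust(dst_len, b"\x00")
    if padding == 2:
        # Space Pad: pad with spaces, truncate without a terminator.
        return text[:dst_len].ljust(dst_len, b" ")
    raise ValueError(f"reserved padding type {padding}")
```

Note that only the null-terminated form loses a character of payload when the value is truncated to a shorter length.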

Class specific information for Bitfields (Class 4): + +
+

+ + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.
1, 2Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.
3-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + + +
+ Property Description +
ByteByteByteByte
Bit OffsetBit Precision
+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription
Bit Offset +

The bit offset of the first significant bit of the bitfield + within the datatype. The bit offset specifies the number + of bits "to the right of" the value. +

+
Bit Precision +

The number of bits of precision of the bitfield + within the datatype. +

+
+
+

+ +

Class specific information for Opaque (Class 5): + +
+

+ + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-7Length of ASCII tag in bytes.
8-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + +
+ Property Description +
ByteByteByteByte

ASCII Tag
+
+
+ +
+
+ + + + + + + + + + +
Field NameDescription
ASCII Tag +

This NUL-terminated string provides a description for the + opaque type. It is NUL-padded to a multiple of 8 bytes. +

+
+
+

+ +

Class specific information for Compound (Class 6): + +
+

+ + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-15Number of Members. This field contains the number + of members defined for the compound datatype. The member + definitions are listed in the Properties field of the data + type message. +
16-23Reserved (zero).
+
+

+ +

The Properties field of a compound datatype is a list of the + member definitions of the compound datatype. The member + definitions appear one after another with no intervening bytes. + The member types are described with a recursive datatype + message. + +

Note that the property descriptions differ between + versions of the datatype message. Additionally note that the version + 1 properties are deprecated and have been replaced with the version + 2 properties in versions of the HDF5 library from the 1.4 release + onward. + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Properties Description for Datatype Version 1 +
ByteByteByteByte

Name

Byte Offset of Member
DimensionalityReserved (zero)
Dimension Permutation
Reserved (zero)
Dimension #1 Size (required)
Dimension #2 Size (required)
Dimension #3 Size (required)
Dimension #4 Size (required)

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Name +

This NUL-terminated string provides the name of the + member. It is NUL-padded to a multiple of 8 bytes. +

+
Byte Offset of Member +

This is the byte offset of the member within the datatype. +

+
Dimensionality +

If set to zero, this field indicates a scalar member. If set + to a value greater than zero, this field indicates that the + member is an array of values. For array members, the size of + the array is indicated by the 'Dimension #n Size' fields in + this message. +

+
Dimension Permutation +

This field was intended to allow an array field to have + its dimensions permuted, but this was never implemented. + This field should always be set to zero. +

+
Dimension #n Size +

This field is the size of a dimension of the array field as + stored in the file. The first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+
Member Type Message +

This field is a datatype message describing the datatype of + the member. +

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Properties Description for Datatype Version 2 +
ByteByteByteByte

Name

Byte Offset of Member

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Name +

This NUL-terminated string provides the name of the + member. It is NUL-padded to a multiple of 8 bytes. +

+
Byte Offset of Member +

This is the byte offset of the member within the datatype. +

+
Member Type Message +

This field is a datatype message describing the datatype of + the member. +

+
+
+

+ +
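
Because member definitions are packed with no intervening bytes, a reader has to compute the padded name length itself. The sketch below reads the name and byte offset of one version 2 member and returns the position where the recursive member type message begins. It assumes ASCII names and a 4-byte little-endian offset. `read_member_prefix` is a hypothetical name.

```python
import struct


def read_member_prefix(buf, pos):
    """Return (name, byte_offset, position of the member type message)."""
    end = buf.index(b"\x00", pos)  # first NUL terminates the name
    name = buf[pos:end].decode("ascii")
    # Name plus its terminator, rounded up to a multiple of 8 bytes.
    pos += (end - pos + 1 + 7) // 8 * 8
    (byte_offset,) = struct.unpack_from("<I", buf, pos)
    return name, byte_offset, pos + 4
```

An 8-character name occupies 16 bytes, since the terminating NUL pushes it past the first 8-byte boundary.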

Class specific information for Reference (Class 7): + +
+

+ + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-3Type. This four-bit value contains the type of reference + described. The values defined are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Object Reference: A reference to another object in this + HDF5 file. +
1Dataset Region Reference: A reference to a region within + a dataset in this HDF5 file. +
2Internal Reference: A reference to a region within the + current dataset. (Not currently implemented) +
3-15Reserved +
+ +
4-23Reserved (zero).
+
+ +

There are no properties defined for the reference class. +

+

+ +

Class specific information for Enumeration (Class 8): + +
+

+ + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-15Number of Members. The number of name/value + pairs defined for the enumeration type.
16-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Property Description +
ByteByteByteByte

Base Type


Names


Values

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Base Type +

Each enumeration type is based on some parent type, usually an + integer. The information for that parent type is described + recursively by this field. +

+
Names +

The name for each name/value pair. Each name is stored as a null + terminated ASCII string in a multiple of eight bytes. The names + are in no particular order. +

+
Values +

The list of values in the same order as the names. The values + are packed (no inter-value padding) and the size of each value + is determined by the parent type. +

+
+
+

+ + +
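
The Names and Values areas of an enumeration can be paired up as in the sketch below. It assumes the Base Type has already been parsed and is an unsigned little-endian integer of `value_size` bytes, and that names are ASCII. `parse_enum_members` is a hypothetical helper that starts at the first name.

```python
def parse_enum_members(buf, pos, n_members, value_size):
    """Return a {name: value} mapping for an enumeration's members."""
    names = []
    for _ in range(n_members):
        end = buf.index(b"\x00", pos)
        names.append(buf[pos:end].decode("ascii"))
        # Each name, with its terminator, fills a multiple of eight bytes.
        pos += (end - pos + 1 + 7) // 8 * 8

    # Values follow in the same order as the names, with no padding.
    values = [
        int.from_bytes(buf[pos + i * value_size:pos + (i + 1) * value_size], "little")
        for i in range(n_members)
    ]
    return dict(zip(names, values))
```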

Class specific information for Variable-Length (Class 9): + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bit Field Description +
BitsMeaning
0-3Type. This four-bit value contains the type of + variable-length datatype described. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Sequence: A variable-length sequence of any + datatype. Variable-length sequences do not have padding or + character set information. +
1String: A variable-length sequence of characters. + Variable-length strings have padding and character set + information. +
2-15Reserved +
+ +
4-7Padding type. (variable-length string only) + This four-bit value determines the type of padding + used for variable-length strings. The values are the same + as for the string padding type, as follows: + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Null terminate: A zero byte marks the end of a string + and is guaranteed to be present after converting a long + string to a short string. When converting a short string + to a long string, the value is padded with additional null + characters as necessary. +
1Null pad: Null characters are added to the end of the + value during conversion from a short string to a longer + string. Conversion from a long string to a shorter string + simply truncates the value. +
2Space pad: Space characters are added to the end of the + value during conversion from a short string to a longer + string. Conversion from a long string to a shorter string + simply truncates the value. This is the Fortran + representation of the string. +
3-15Reserved +
+ + This value is set to zero for variable-length sequences. + +
8-11Character Set. (variable-length string only) + This four-bit value specifies the character set + to be used for encoding the string: + + + + + + + + + + + + + + + +
ValueDescription
0ASCII: As of this writing (July 2003, Release 1.6.0), + 8-bit ASCII is the only character set supported. Therefore, + no translations have been defined. +
1-15Reserved +
+ + This value is set to zero for variable-length sequences. + +
12-23Reserved (zero).
+
+ +
+
+ + + + + + + + + + + + + + +
+ Property Description +
ByteByteByteByte

Base Type

+
+ +
+
+ + + + + + + + + + + +
Field NameDescription
Base Type +

Each variable-length type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+

+ +
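
The three four-bit subfields of the variable-length class bit field can be separated with shifts and masks. The following sketch is illustrative. `decode_vlen_class_bits` and the dictionary keys are hypothetical names.

```python
VLEN_TYPES = {0: "Sequence", 1: "String"}
VLEN_PADDING = {0: "Null terminate", 1: "Null pad", 2: "Space pad"}
VLEN_CHARSETS = {0: "ASCII"}


def decode_vlen_class_bits(class_bits):
    """Split the 24-bit class bit field of a Class 9 datatype."""
    vl_type = VLEN_TYPES.get(class_bits & 0xF, "Reserved")  # bits 0-3
    if vl_type != "String":
        # Padding and character set apply to variable-length strings only.
        return {"type": vl_type, "padding": None, "charset": None}
    return {
        "type": vl_type,
        "padding": VLEN_PADDING.get((class_bits >> 4) & 0xF, "Reserved"),  # bits 4-7
        "charset": VLEN_CHARSETS.get((class_bits >> 8) & 0xF, "Reserved"),  # bits 8-11
    }
```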

Class specific information for Array (Class 10): + +

There are no bit fields defined for the array class. +

+ +

Note that the dimension information defined in the property for this + datatype class is independent of dataspace information for a dataset. + The dimension information here describes the dimensionality of the + information within a data element (or a component of an element, if the + array datatype is nested within another datatype) and the dataspace for a + dataset describes the location of the elements in a dataset. +

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Property Description +
ByteByteByteByte
DimensionalityReserved (zero)
Dimension #1 Size
.
.
.
Dimension #n Size
Permutation Index #1
.
.
.
Permutation Index #n

Base Type

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Dimensionality +

This value is the number of dimensions that the array has. +

+
Dimension #n Size +

This value is the size of the dimension of the array + as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+
Permutation Index #n +

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. Currently, dimension + permutations are not supported and these indices should be set + to the index position minus one (i.e. the first dimension should + be set to 0, the second dimension should be set to 1, etc.) +

+
Base Type +

Each array type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ +

+ +
+

Name: Data Storage - Fill Value (Old)

+ +

Header Message Type: 0x0004 +

+

Length: varies +

+

Status: Optional, may not be repeated. +

+ +

Description: The fill value message stores a single + data value which is returned to the application when an uninitialized + data element is read from a dataset. The fill value is interpreted + with the same datatype as the dataset. If no fill value message is + present then a fill value of all zero bytes is assumed. +

+ +

This fill value message is deprecated in favor of the "new" + fill value message (Message Type 0x0005) and is only written to the + file for forward compatibility with versions of the HDF5 library before + the 1.6.0 version. Additionally, it only appears for datasets with a + user defined fill value (as opposed to the library default fill value + or an explicitly set "undefined" fill value). +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + +
+ Fill Value Message (Old) +
bytebytebytebyte
Size

Fill Value

+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription
Size +

This is the size of the Fill Value field in bytes. +

+
Fill Value +

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. +

+
+
+

+ +
+

Name: Data Storage - Fill Value

+ +

Header Message Type: 0x0005 +

+

Length: varies +

+

Status: Required for dataset objects, may not be repeated. +

+ +

Description: The fill value message stores a single + data value which is returned to the application when an uninitialized + data element is read from a dataset. The fill value is interpreted + with the same datatype as the dataset. +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Fill Value Message +
bytebytebytebyte
VersionSpace Allocation TimeFill Value Write TimeFill Value Defined
Size

Fill Value

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Version +

The version number information is used for changes in the + format of the fill value message and is described here: + + + + + + + + + + + + + + + + + + +
VersionDescription
0Never used +
1Used by version 1.6.x of the library to encode + fill values. In this version, the Size field is + always present. +
2The current version used by the library (version + 1.7.3 or later). In this version, the Size and + Fill Value fields are + only present if the Fill Value Defined field is set + to 1. +
+

+
Space Allocation Time +

When the storage space for the dataset's raw data will be + allocated. The allowed values are: + + + + + + + + + + + + + + + + + + +
ValueDescription
1Early allocation. Storage space for the entire dataset + should be allocated in the file when the dataset is + created. +
2Late allocation. Storage space for the entire dataset + should not be allocated until the dataset is written + to. +
3Incremental allocation. Storage space for a + portion of the dataset should not be allocated until that portion + of the dataset is written to. This is currently + used in conjunction with chunked data storage for + datasets. +
+

+
Fill Value Write Time +

At the time that storage space for the dataset's raw data is + allocated, this value indicates whether the fill value should + be written to the raw data storage elements. The allowed values + are: + + + + + + + + + + + + + + + + + + +
ValueDescription
0On allocation. The fill value is always written to + the raw data storage when the storage space is allocated. +
1Never. The fill value should never be written to + the raw data storage. +
2Fill value written if set by user. The fill value + will be written to the raw data storage when the storage + space is allocated only if the user explicitly set + the fill value. If the fill value is the library + default or is undefined, it will not be written to + the raw data storage. +
+

+
Fill Value Defined +

This value indicates if a fill value is defined for this + dataset. If this value is 0, the fill value is undefined. + If this value is 1, a fill value is defined for this dataset. + For version 2 or later of the fill value message, this value + controls the presence of the Size field. +

+
Size +

This is the size of the Fill Value field in bytes. This field + is not present if the Version field is >1 and the Fill Value + Defined field is set to 0. +

+
Fill Value +

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. This field is + not present if the Version field is >1 and the Fill Value + Defined field is set to 0. +

+
+
+

+ + + +
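
The version-dependent presence of the Size and Fill Value fields can be handled as in the sketch below. It assumes a 4-byte little-endian Size field, and `parse_fill_value` is a hypothetical name. The raw fill value bytes are returned undecoded, since interpreting them requires the dataset's datatype.

```python
import struct


def parse_fill_value(buf):
    """Decode a Fill Value message (type 0x0005), versions 1 and 2."""
    version, alloc_time, write_time, defined = struct.unpack_from("<BBBB", buf, 0)
    if version not in (1, 2):
        raise ValueError(f"unsupported fill value message version {version}")

    value = None
    # Version 1 always stores Size; version 2 stores Size and Fill Value
    # only when the Fill Value Defined field is 1.
    if version == 1 or defined == 1:
        (size,) = struct.unpack_from("<I", buf, 4)
        value = bytes(buf[8:8 + size])

    return {
        "space_allocation_time": alloc_time,
        "fill_value_write_time": write_time,
        "defined": bool(defined),
        "value": value,
    }
```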
+

Name: Reserved - Not Assigned Yet

+

Header Message Type: 0x0006

+

Length: N/A

+

Status: N/A

+

Format of Data: N/A

+ +

Purpose and Description: This message type was skipped during + the initial specification of the file format and may be used in a + future expansion to the format.

+ +
+

Name: Data Storage - + External Data Files

+

Header Message Type: 0x0007

+

Length: varies

+

Status: Optional, may not be repeated.

+ +

Purpose and Description: The external object message + indicates that the data for an object is stored outside the HDF5 + file. The filename of the object is stored as a Uniform + Resource Locator (URL) of the actual filename containing the + data. An external file list record also contains the byte offset + of the start of the data within the file and the amount of space + reserved in the file for that data.

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ External File List Message +
bytebytebytebyte
VersionReserved
Allocated SlotsUsed Slots

Heap Address


Slot Definitions...

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Version +

The version number information is used for changes in the format of External File + List Message and is described here: + + + + + + + + + + + +
VersionDescription
0Never used. +
1The current version used by the library. +
+

+
Reserved +

This field is reserved for future use.

+
Allocated Slots +

The total number of slots allocated in the message. Its value must be at least as + large as the value contained in the Used Slots field. (The current library simply + uses the number of Used Slots for this message)

+
Used Slots +

The number of initial slots which contain valid information.

+
Heap Address +

This is the address of a local heap which contains the names for the external + files (The local heap information can be found in Disk Format Level 1D in this + document). The name at offset zero in the heap is always the empty string.

+
Slot Definitions +

The slot definitions are stored in order according to the array addresses they + represent.

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ External File List Slot +
bytebytebytebyte

Name Offset(<size> bytes)


File Offset(<size> bytes)


Size

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription
Name Offset(<size> bytes) +

The byte offset within the local name heap for the name + of the file. File names are stored as a URL which has a + protocol name, a host name, a port number, and a file + name: + protocol:port//host/file. + If the protocol is omitted then "file:" is assumed. If + the port number is omitted then a default port for that + protocol is used. If both the protocol and the port + number are omitted then the colon can also be omitted. If + the double slash and host name are omitted then + "localhost" is assumed. The file name is the only + mandatory part, and if the leading slash is missing then + it is relative to the application's current working + directory (the use of relative names is not + recommended).

+
File Offset(<size> bytes) +

This is the byte offset to the start of the data in the + specified file. For files that contain data for a single + dataset this will usually be zero.

+
Size +

This is the total number of bytes reserved in the + specified file for raw data storage. For a file that + contains exactly one complete dataset which is not + extendable, the size will usually be the exact size of the + dataset. However, by making the size larger one allows + HDF5 to extend the dataset. The size can be set to a value + larger than the entire file since HDF5 will read zeros + past the end of the file without failing.

+
+
+ + +
+

Name: Data Storage - Layout

+ +

Header Message Type: 0x0008

+

Length: varies

+

Status: Required for datasets, may not be repeated.

+ +

Purpose and Description: Data layout describes how the + elements of a multi-dimensional array are arranged in the linear + address space of the file. Three types of data layout are + supported: + +

    +
  1. Contiguous: The array can be stored in one contiguous area of the file. + The layout requires that the size of the array be constant and + does not permit chunking, compression, checksums, encryption, + etc. The message stores the total size of the array and the + offset of an element from the beginning of the storage area is + computed as in C. + +
  2. Chunked: The array domain can be regularly decomposed into chunks and + each chunk is allocated separately. This layout supports + arbitrary element traversals, compression, encryption, and + checksums, and the chunks can be distributed across external + raw data files (these features are described in other + messages). The message stores the size of a chunk instead of + the size of the entire array; the size of the entire array can + be calculated by traversing the B-tree that stores the chunk + addresses. + +
  3. Compact: The array can be stored in one contiguous block, as part of + this object header message (this is called "compact" storage below). +
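The "computed as in C" rule for contiguous storage can be sketched as follows. This is a minimal illustration, not library code: the function name `element_byte_offset` and its parameters are hypothetical, and it assumes the dimensions are listed slowest-changing first, as the layout message stores them.

```python
from typing import Sequence


def element_byte_offset(index: Sequence[int], dims: Sequence[int], element_size: int) -> int:
    """Byte offset of one element from the start of a contiguous, C-ordered array."""
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same rank")
    linear = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("index out of bounds for its dimension")
        # The last dimension changes fastest, so earlier indices are scaled
        # by the sizes of all later dimensions.
        linear = linear * n + i
    return linear * element_size


# In a 3 x 4 array of 8-byte elements, element (1, 2) is the seventh element.
print(element_byte_offset((1, 2), (3, 4), 8))  # 48
```

Adding the result to the Address field of a contiguous layout would give the file address of the element, provided the address is not the "undefined address".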
+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Data Layout Message (Versions 1 and 2) +
byte | byte | byte | byte
Version | Dimensionality | Layout Class | Reserved
Reserved

Address

Dimension 0 (4-bytes)
Dimension 1 (4-bytes)
...
Dataset Element Size (optional)
Compact Data Size (4-bytes)

Compact Data...

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version +

The version number information is used for changes in the format of the data + layout message and is described here:

+ + + + + + + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by version 1.4 and earlier of the library to encode layout information. + Data space is always allocated when the dataset is created.
2 | Used by version 1.6.x of the library to encode layout information. + Data space is allocated only when it is necessary.
+
Dimensionality

An array has a fixed dimensionality. This field + specifies the number of dimension size fields later in the + message.

Layout Class

The layout class specifies how the other fields of the + layout message are to be interpreted. A value of one + indicates contiguous storage, a value of two indicates chunked storage, + while a value of zero indicates compact storage. Other values will be defined + in the future.

Address

For contiguous storage, this is the address of the first + byte of storage. For chunked storage this is the address + of the B-tree that is used to look up the addresses of the + chunks. This field is not present for compact storage. + If the version for this message is set to 2, the address + may have the "undefined address" value, to indicate that + storage has not yet been allocated for this array.

Dimensions

For contiguous and compact storage the dimensions define + the entire size of the array while for chunked storage they define + the size of a single chunk. In all cases, they are in units of + array elements (not bytes). The first dimension stored in the list + of dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+
Dataset Element Size

The size of a dataset element, in bytes. This field is only + present for chunked storage. +

+
Compact Data Size

This field is only present for compact data storage. + It contains the size of the raw data for the dataset array.

Compact Data

This field is only present for compact data storage. + It contains the raw data for the dataset array.

+
+ +
+

Version 3 of this message re-structured the format into specific + properties that are required for each layout class. + +
+

+ + + + + + + + + + + + + + + + + + + +
+ Data Layout Message (Version 3) +
byte | byte | byte | byte
Version | Layout Class | (empty)

Properties

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version +

The version number information is used for changes in the format of layout message + and is described here:

+ + + + + + + + + + +
Version | Description
3 | Used by version 1.6.3 and later of the library to store properties + for each layout class.
+
Layout Class

The layout class specifies how the other fields of the layout message are to be + interpreted. A value of one indicates contiguous storage, a value of two + indicates chunked storage, while a value of zero indicates compact storage.

Properties

This variable-sized field encodes information specific to each + layout class and is described below. If there is no property + information specified for a layout class, the size of this field + is zero bytes.

+
+ +
+

Class-specific information for compact layout (Class 0): (Note: The dimensionality information + is in the Dataspace message) + +
+

+ + + + + + + + + + + + + + + + + + +
+ Property Descriptions +
byte | byte | byte | byte
Size | (empty)

Raw Data...

+
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description
Size

This field contains the size of the raw data for the dataset array.

Raw Data

This field contains the raw data for the dataset array.

+
+ +
+

Class-specific information for contiguous layout (Class 1): (Note: The dimensionality information + is in the Dataspace message) + +
+

+ + + + + + + + + + + + + + + + + +
+ Property Descriptions +
byte | byte | byte | byte

Address


Size

+
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description
Address

This is the address of the first byte of raw data storage. + The address may have the "undefined address" value, to indicate + that storage has not yet been allocated for this array.

Size

This field contains the size allocated to store the raw data.

+
+ +
+

Class-specific information for chunked layout (Class 2): + +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Property Descriptions +
byte | byte | byte | byte
Dimensionality | (empty)

Address

Dimension 0 (4-bytes)
Dimension 1 (4-bytes)
...
Dataset Element Size
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Dimensionality

A chunk has a fixed dimensionality. This field specifies + the number of dimension size fields later in the message.

Address

This is the address of the B-tree that is used to look up the addresses of the + chunks. The address may have the "undefined address" value, to indicate that + storage has not yet been allocated for this array.

Dimensions

These values define the dimension size of a single chunk, in + units of array elements (not bytes). The first dimension stored in + the list of dimensions is the slowest changing dimension and the + last dimension stored is the fastest changing dimension. +

+
Dataset Element Size

The size of a dataset element, in bytes. +

+
+
+ +
+
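The three class-specific property layouts of the version 3 message can be read with a small decoder such as the sketch below. Several details are assumptions rather than statements from the tables above: fields are taken to be little-endian, addresses and sizes are taken to be 8 bytes wide (in a real file these widths come from the superblock), the compact Size field is taken to be 2 bytes, Dimensionality 1 byte, and each dimension and the Dataset Element Size 4 bytes. The function name `parse_layout_v3` is hypothetical.

```python
def parse_layout_v3(buf: bytes, size_of_offsets: int = 8, size_of_lengths: int = 8) -> dict:
    """Decode a version 3 Data Layout message body into a dictionary."""
    if len(buf) < 2 or buf[0] != 3:
        raise ValueError("not a version 3 layout message")
    layout_class = buf[1]
    pos = 2  # assumed: properties start right after Version and Layout Class

    def read_uint(nbytes: int) -> int:
        nonlocal pos
        if pos + nbytes > len(buf):
            raise ValueError("message is truncated")
        value = int.from_bytes(buf[pos:pos + nbytes], "little")
        pos += nbytes
        return value

    def read_address():
        # An address with all bits set is the "undefined address":
        # storage has not been allocated yet.
        value = read_uint(size_of_offsets)
        undefined = (1 << (8 * size_of_offsets)) - 1
        return None if value == undefined else value

    if layout_class == 0:  # compact: raw data lives inside the message
        size = read_uint(2)
        return {"class": "compact", "size": size, "data": buf[pos:pos + size]}
    if layout_class == 1:  # contiguous: address and allocated size
        address = read_address()
        size = read_uint(size_of_lengths)
        return {"class": "contiguous", "address": address, "size": size}
    if layout_class == 2:  # chunked: B-tree address and chunk dimensions
        rank = read_uint(1)
        address = read_address()
        chunk_dims = [read_uint(4) for _ in range(rank)]
        element_size = read_uint(4)
        return {"class": "chunked", "address": address,
                "chunk_dims": chunk_dims, "element_size": element_size}
    raise ValueError("unknown layout class %d" % layout_class)
```

The chunked branch follows the table as written and treats Dimensionality as the count of dimension fields only; a reader working with real files should check that against the library version that wrote them.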

Name: Reserved - Not Assigned Yet

+

Header Message Type: 0x0009

+

Length: N/A

+

Status: N/A

+

Format of Data: N/A

+ +

Purpose and Description: This message type was skipped during the initial + specification of the file format and may be used in a future expansion to the format. + +


+

Name: Reserved - Not Assigned Yet

+

Header Message Type: 0x000A

+

Length: N/A

+

Status: N/A

+

Format of Data: N/A

+ +

Purpose and Description: This message type was skipped during the initial + specification of the file format and may be used in a future expansion to the format. + +


+

Name: Data Storage - Filter Pipeline

+

Header Message Type: 0x000B

+

Length: varies

+

Status: Optional, may not be repeated.

+ +

Description: This message describes the + filter pipeline which should be applied to the data stream by + providing filter identification numbers, flags, a name, and + client data.

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Pipeline Message +
byte | byte | byte | byte
Version | Number of Filters | Reserved
Reserved

Filter List

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number for this message. This document + describes version 1.

Number of Filters

The total number of filters described by this + message. The maximum possible number of filters in a + message is 32.

Filter List

A description of each filter. A filter description + appears in the next table.

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Description +
byte | byte | byte | byte
Filter Identification | Name Length
Flags | Number of Values for Client Data

Name


Client Data

Padding
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Filter Identification +

+ This value, often referred to as a filter identifier, + is designed to be a unique identifier for the filter. + Values from zero through 32,767 are reserved for filters + supported by The HDF Group in the HDF5 library and for + filters requested and supported by third parties. + Filters supported by The HDF Group are documented immediately + below. Information on 3rd-party filters can be found at + + https://support.hdfgroup.org/services/contributions.html#filters. + 1 +

+ To request a filter identifier, please contact + The HDF Group’s Help Desk at + . + You will be asked to provide the following information: +

    +
  1. Contact information for the developer requesting the + new identifier +
  2. A short description of the new filter +
  3. Links to any relevant information, including licensing + information +
+

+ Values from 32768 to 65535 are reserved for non-distributed uses + (for example, internal company usage) or for application usage + when testing a feature. The HDF Group does not track or document + the use of the filters with identifiers from this range. + +

+ The filters currently in library version 1.6.5 are + listed below: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Identification | Name | Description
1 | deflate | GZIP deflate compression
2 | shuffle | Data element shuffling
3 | fletcher32 | Fletcher32 checksum
4 | szip | SZIP compression
+

Name Length

Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.

Flags

The flags indicate certain properties for a filter. The + bit values defined so far are:

+ + + + + + + + + + +
Value | Description
bit 1 | If set then the filter is an optional filter. + During output, if an optional filter fails it will be + silently removed from the pipeline.
+
Client Data Number of Values

Each filter can store a few integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.

Name

If the Name Length field is non-zero then it will + contain the size of this field, a multiple of eight. This + field contains a null-terminated, ASCII character + string to serve as a comment/name for the filter.

Client Data

This is an array of four-byte integers which will be + passed to the filter function. The Client Data Number of + Values determines the number of elements in the array.

Padding

Four bytes of zeros are added to the message at this + point if the Client Data Number of Values field contains + an odd number.

+
+

+


+ 1If you are reading + an earlier version of this document, this link may have changed. + If the link does not work, use the latest version of this document + on The HDF Group’s website, + + https://support.hdfgroup.org/HDF5/doc/H5.format.html; + the link there will always be correct. + (Return) +

+ +
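The Filter Pipeline Message and Filter Description layouts above can be walked with a short decoder. The sketch below is illustrative only. It assumes little-endian fields, a 1-byte Version and Number of Filters followed by 6 reserved bytes, and four 2-byte fields at the start of each filter description; it also reads client data as unsigned integers, which the text does not specify. The name `parse_filter_pipeline_v1` is hypothetical.

```python
import struct


def parse_filter_pipeline_v1(buf: bytes) -> list:
    """Decode a version 1 Filter Pipeline message into a list of filter records."""
    version, nfilters = struct.unpack_from("<BB", buf, 0)
    if version != 1:
        raise ValueError("unsupported filter pipeline version %d" % version)
    if nfilters > 32:
        raise ValueError("a message holds at most 32 filters")
    pos = 8  # Version, Number of Filters, then 2 + 4 reserved bytes
    filters = []
    for _ in range(nfilters):
        filter_id, name_length, flags, nvalues = struct.unpack_from("<HHHH", buf, pos)
        pos += 8
        # Name Length already includes the null padding to a multiple of eight.
        name = buf[pos:pos + name_length].split(b"\0", 1)[0].decode("ascii")
        pos += name_length
        client_data = [int.from_bytes(buf[pos + 4 * i:pos + 4 * i + 4], "little")
                       for i in range(nvalues)]
        pos += 4 * nvalues
        if nvalues % 2 == 1:
            pos += 4  # four bytes of zero padding follow an odd number of values
        filters.append({"id": filter_id, "name": name,
                        "flags": flags, "client_data": client_data})
    return filters
```

The raw Flags value is returned without interpretation, so the caller decides how to test the optional-filter bit.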
+

Name: Attribute

+

Header Message Type: 0x000C +

Length: varies +

Status: Optional, may be repeated. + +

Description: The Attribute + message is used to list objects in the HDF file which are used + as attributes, or "metadata" about the current object. An + attribute is a small dataset; it has a name, a datatype, a data + space, and raw data. Since attributes are stored in the object + header they must be relatively small (<64KB) and can be + associated with any type of object which has an object header + (groups, datasets, named types and spaces, etc.). + +

Note: Attributes on an object must have unique names. (The HDF5 library + currently enforces this by causing the creation of an attribute with + a duplicate name to fail). Attributes on different objects may have the + same name, however. + +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message (Version 1) +
byte | byte | byte | byte
Version | Reserved | Name Size
Datatype Size | Dataspace Size

Name


Datatype


Dataspace


Data

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number information is used for changes in the format of the + attribute message and is described here:

+ + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by the library before version 1.6 to encode the attribute message. + This version does not support shared datatypes.
+
Reserved

This field is reserved for later use and is set to + zero.

Name Size

The length of the attribute name in bytes including the + null terminator. Note that the Name field below may + contain additional padding not represented by this + field.

Datatype Size

The length of the datatype description in the Datatype + field below. Note that the Datatype field may contain + additional padding not represented by this field.

Dataspace Size

The length of the dataspace description in the Dataspace + field below. Note that the Dataspace field may contain + additional padding not represented by this field.

Name

The null-terminated attribute name. This field is + padded with additional null characters to make it a + multiple of eight bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. This + field is not padded with additional bytes.

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message (Version 2) +
byte | byte | byte | byte
Version | Flag | Name Size
Type Size | Space Size

Name


Type


Space


Data

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number information is used for changes in the format of the + attribute message and is described here:

+ + + + + + + + + + +
Version | Description
2 | Used by version 1.6.x and later of the library to encode the attribute message. + This version supports shared datatypes. The name, type, and space fields + are not padded with additional zero bytes.
+
Flag

This field indicates whether the data type of this attribute is shared:

+ + + + + + + + + + + + + + + +
Value | Description
0 | Datatype is not shared.
1 | Datatype is shared.
+
Name Size

The length of the attribute name in bytes including the + null terminator.

Datatype Size

The length of the datatype description in the Datatype + field below.

Dataspace Size

The length of the dataspace description in the Dataspace + field below.

Name

The null-terminated attribute name. This field is not + padded with additional bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. This + field is not padded with additional bytes.

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. This + field is not padded with additional bytes.

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. This + field is not padded with additional zero + bytes.

+
+ +
+
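The practical difference between the two attribute message versions is where each variable-length field starts: version 1 pads the name, datatype, and dataspace to multiples of eight bytes, while version 2 stores them back to back. The sketch below computes the field offsets from the 8-byte fixed part of the message. It assumes little-endian fields and 2-byte size fields; the names `pad8` and `attribute_field_offsets` are hypothetical.

```python
import struct


def pad8(n: int) -> int:
    """Round n up to the next multiple of eight."""
    return (n + 7) & ~7


def attribute_field_offsets(message: bytes) -> dict:
    """Return byte offsets of Name, Datatype, Dataspace, and Data within the message."""
    version, _flag, name_size, datatype_size, dataspace_size = struct.unpack_from(
        "<BBHHH", message, 0)
    if version == 1:
        step = pad8           # each field is padded to a multiple of eight bytes
    elif version == 2:
        step = lambda n: n    # fields are not padded
    else:
        raise ValueError("unsupported attribute message version %d" % version)
    name_at = 8               # the fixed part of the message is eight bytes long
    datatype_at = name_at + step(name_size)
    dataspace_at = datatype_at + step(datatype_size)
    data_at = dataspace_at + step(dataspace_size)
    return {"name": name_at, "datatype": datatype_at,
            "dataspace": dataspace_at, "data": data_at}
```

For a name of 6 bytes, a datatype of 8 bytes, and a dataspace of 12 bytes, version 1 places the raw data at offset 40 and version 2 places it at offset 34.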

Name: Object Comment

+ +

Header Message Type: 0x000D

+

Length: varies

+

Status: Optional, may not be repeated.

+ +

Description: The object comment is + designed to be a short description of an object. An object comment + is a sequence of non-zero (\0) ASCII characters with no other + formatting included by the library.

+ +

Format of Data: +
+

+ + + + + + + + + + + + + +
+ Object Comment Message +
byte | byte | byte | byte

Comment

+
+ +
+
+ + + + + + + + + + +
Field Name | Description
Comment | A null-terminated ASCII character string.
+
+ +
+

Name: Object Modification Date & Time (Old)

+ +

Header Message Type: 0x000E

+

Length: fixed

+

Status: Optional, may not be repeated.

+ +

Description: The object modification date + and time is a timestamp which indicates (using ISO-8601 date and + time format) the last modification of an object. The time is + updated when any object header message changes according to the + system clock where the change was posted. + +

This modification time message is deprecated in favor of the "new" + modification time message (Message Type 0x0012) and is no longer written + to the file in versions of the HDF5 library after the 1.6.0 version. +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Modification Time Message +
byte | byte | byte | byte
Year
Month | Day of Month
Hour | Minute
Second | Reserved
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Year

The four-digit year as an ASCII string. For example, + 1998. All fields of this message should be interpreted + as Coordinated Universal Time (UTC).

Month

The month number as a two digit ASCII string where + January is 01 and December is 12.

Day of Month

The day number within the month as a two digit ASCII + string. The first day of the month is 01.

Hour

The hour of the day as a two digit ASCII string where + midnight is 00 and 11:00pm is 23.

Minute

The minute of the hour as a two digit ASCII string where + the first minute of the hour is 00 and + the last is 59.

Second

The second of the minute as a two digit ASCII string + where the first second of the minute is 00 + and the last is 59.

Reserved

This field is reserved and should always be zero.

+
+ +
+
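Because every field of the old modification time message is a fixed-width ASCII string, the first 14 bytes can be parsed in one step. This is a minimal sketch under the assumption that the fields are laid out in table order (4-byte year, then five 2-byte fields, then 2 reserved bytes); the function name `parse_old_modification_time` is hypothetical.

```python
from datetime import datetime, timezone


def parse_old_modification_time(message: bytes) -> datetime:
    """Decode the 16-byte old-style modification time message as a UTC datetime."""
    if len(message) < 14:
        raise ValueError("message is too short")
    # Year (4), month (2), day (2), hour (2), minute (2), second (2); reserved bytes are ignored.
    text = message[:14].decode("ascii")
    return datetime.strptime(text, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


print(parse_old_modification_time(b"19980315134507\0\0").isoformat())
```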

Name: Shared Object Message

+

Header Message Type: 0x000F

+

Length: Fixed

+

Status: Optional, may be repeated.

+ +

Description: A constant message can be shared among + several object headers. A Shared Object Message contains the address of + the object message to be shared. Care must be exercised to prevent cycles when a + message of one object header points to a message in some other object header. + Starting from Version 2 of the Shared Object Message, the Flags + field becomes unused. +

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + + + +
+ Shared Object Message (Version 1) +
byte + byte + byte + byte +
Version | Flags | Reserved
Reserved

Pointer

+
+ +
+
+ + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number is used when there are changes in the format + of a shared object message and is described here:

+ + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by the library before version 1.6.1. In this version, + the Flags field is used to indicate whether the actual message is + stored in the global heap (never implemented). The Pointer field + either contains the header message address in the global heap + (never implemented) or the address of the shared object header.
+
Flags

The Shared Message message points to a message which is + shared among multiple object headers. The Flags field + describes the type of sharing:

+ + + + + + + + + + + + + + + +
Bit | Description
0 | If this bit is clear then the actual message is the + first message in some other object header; otherwise + the actual message is stored in the global heap (never + implemented).
2-7 | Reserved (always zero)
+
Pointer

The address of the object header + containing the message to be shared.

+
+ +
+
+ + + + + + + + + + + + + + + +
+ Shared Object Message (Version 2) +
byte + byte + byte + byte +
Version | Flags | (empty)

Pointer

+
+ +
+
+ + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number is used when there are changes in the format + of a shared object message and is described here:

+ + + + + + + + + + +
Version | Description
2 | Used by version 1.6.1 and later of the library. In this version, + the Flags field is not used and the Pointer field contains the address + of the object header containing the message to be shared.
+
Flags

Unused.

Pointer

The address of the object header + containing the message to be shared.

+
+ + +
+

Name: Object Header Continuation

+

Header Message Type: 0x0010

+

Length: fixed

+

Status: Optional, may be repeated.

+

Description: The object header continuation is the location + in the file of more header messages for the current data object. This can be + used when header blocks become too large or are likely to change over time.

+ +

Format of Data: +
+

+ + + + + + + + + + + + + + + + + +
+ Object Header Continuation Message +
byte | byte | byte | byte

Offset


Length

+
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description
Offset

This value is the offset in bytes from the beginning of the file where the + header continuation information is located.

Length

This value is the length in bytes of the header continuation information in + the file.

+
+ +
+

Name: Group Message

+

Header Message Type: 0x0011

+

Length: fixed

+

Status: Required for groups, may not be repeated.

+

Description: Each group has a B-tree and a + name heap which are pointed to by this message.

+

Format of data: + +
+

+ + + + + + + + + + + + + + + + + +
+ Group Message +
byte | byte | byte | byte

B-tree Address


Heap Address

+
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description
B-tree Address

This value is the offset in bytes from the beginning of the file + where the B-tree is located.

Heap Address

This value is the offset in bytes from the beginning of the file + where the group name heap is located.

+
+ +
+

Name: Object Modification Date & Time

+ +

Header Message Type: 0x0012

+

Length: Fixed

+

Status: Optional, may not be repeated.

+ +

Description: The object modification date + and time is a timestamp which indicates the last modification of an object. + The time is updated when any object header message changes according to the + system clock where the change was posted. +

+ +

Format of Data: +

+ + + + + + + + + + + + + + + + + + +
+ Modification Time Message +
byte | byte | byte | byte
Version | Reserved
Seconds After Epoch
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description
Version

The version number is used for changes in the format of Object Modification Time + and is described here:

+ + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by version 1.6.1 and later of the library to encode time. In + this version, the time is the number of seconds after the Epoch.
+
Reserved

This field is reserved and should always be zero.

Seconds After Epoch

The number of seconds since 0 hours, 0 minutes, 0 seconds, + January 1, 1970, Coordinated Universal Time.

+
+ +
+
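The newer modification time message is a single binary timestamp, so decoding it is short. The sketch below assumes a 1-byte Version, 3 reserved bytes, and a 4-byte little-endian unsigned Seconds After Epoch field; the function name `parse_modification_time` is hypothetical.

```python
import struct
from datetime import datetime, timezone


def parse_modification_time(message: bytes) -> datetime:
    """Decode a version 1 modification time message as a UTC datetime."""
    # "B" reads the version, "3x" skips the reserved bytes, "I" reads the seconds.
    version, seconds = struct.unpack_from("<B3xI", message, 0)
    if version != 1:
        raise ValueError("unsupported modification time version %d" % version)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# One day after the Epoch.
print(parse_modification_time(bytes([1, 0, 0, 0]) + (86400).to_bytes(4, "little")).isoformat())
```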

Disk Format: Level 2b - Data Object Data Storage

+

The data for an object is stored separately from the header +information in the file and may not actually be located in the HDF5 file +itself if the header indicates that the data is stored externally. The +information for each record in the object is stored according to the +dimensionality of the object (indicated in the dimensionality header message). +Multi-dimensional data is stored in C order [same as current scheme], i.e. the +"last" dimension changes fastest. +

Data whose elements are composed of simple number-types are stored in +native-endian IEEE format, unless they are specifically defined as being stored +in a different machine format with the architecture-type information from the +number-type header message. This means that each architecture will need to +[potentially] byte-swap data values into the internal representation for that +particular machine. +

Data with a variable-length datatype is stored in the global heap +of the HDF5 file. Global heap identifiers are stored in the +data object storage. +

Data whose elements are composed of pointer number-types are stored in several +different ways depending on the particular pointer type involved. Simple +pointers are just stored as the dataset offset of the object being pointed to with the +size of the pointer being the same number of bytes as offsets in the file. +Dataset region references are stored as a heap-ID which points to the following +information within the file-heap: an offset of the object pointed to, number-type +information (same format as header message), dimensionality information (same +format as header message), sub-set start and end information (i.e. a coordinate +location for each), and field start and end names (i.e. a [pointer to the] +string indicating the first field included and a [pointer to the] string name +for the last field). + +

Data of a compound datatype is stored as a contiguous stream of the items +in the structure, with each item formatted according to its datatype.

+ +
+

Appendix

+

Definitions of various terms used in this document. +

+

The "undefined address" for a file is a +file address with all bits set, i.e. 0xffff...ff. +

The "unlimited size" for a size is a +value with all bits set, i.e. 0xffff...ff. + + + diff --git a/doxygen/examples/H5.format.2.0.html b/doxygen/examples/H5.format.2.0.html new file mode 100644 index 0000000..3653489 --- /dev/null +++ b/doxygen/examples/H5.format.2.0.html @@ -0,0 +1,14902 @@ + + + + + HDF5 File Format Specification Version 2.0 + + + + +

+ + + + + + + +
+
I. Introduction
    A. This Document
    B. Changes for HDF5 1.10
II. Disk Format: Level 0 - File Metadata
    A. Disk Format: Level 0A - Format Signature and Superblock
    B. Disk Format: Level 0B - File Driver Info
    C. Disk Format: Level 0C - Superblock Extension
III. Disk Format: Level 1 - File Infrastructure
    A. Disk Format: Level 1A - B-trees and B-tree Nodes
        1. Disk Format: Level 1A1 - Version 1 B-trees (B-link Trees)
        2. Disk Format: Level 1A2 - Version 2 B-trees
    B. Disk Format: Level 1B - Group Symbol Table Nodes
    C. Disk Format: Level 1C - Symbol Table Entry
    D. Disk Format: Level 1D - Local Heaps
    E. Disk Format: Level 1E - Global Heap
    F. Disk Format: Level 1F - Fractal Heap
    G. Disk Format: Level 1G - Free-space Manager
    H. Disk Format: Level 1H - Shared Object Header Message Table
IV. Disk Format: Level 2 - Data Objects
    A. Disk Format: Level 2A - Data Object Headers
        1. Disk Format: Level 2A1 - Data Object Header Prefix
            a. Version 1 Data Object Header Prefix
            b. Version 2 Data Object Header Prefix
        2. Disk Format: Level 2A2 - Data Object Header Messages
            a. The NIL Message
            b. The Dataspace Message
            c. The Link Info Message
            d. The Datatype Message
            e. The Data Storage - Fill Value (Old) Message
            f. The Data Storage - Fill Value Message
            g. The Link Message
            h. The Data Storage - External Data Files Message
            i. The Data Storage - Layout Message
            j. The Bogus Message
            k. The Group Info Message
            l. The Data Storage - Filter Pipeline Message
            m. The Attribute Message
            n. The Object Comment Message
            o. The Object Modification Time (Old) Message
            p. The Shared Message Table Message
            q. The Object Header Continuation Message
            r. The Symbol Table Message
            s. The Object Modification Time Message
            t. The B-tree ‘K’ Values Message
            u. The Driver Info Message
            v. The Attribute Info Message
            w. The Object Reference Count Message
            x. The File Space Info Message
    B. Disk Format: Level 2B - Data Object Data Storage
V. Appendix A: Definitions
VI. Appendix B: File Memory Allocation Types
+
+
+ + + +
+
+
+

I. Introduction

+ + + + + + + +
  +
+ [Image: HDF5 Groups]
+ Figure 1: Relationships among the HDF5 root group, other groups, and objects
+ [Image: HDF5 Objects]
+ Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces
+
 
+ + +

The format of an HDF5 file on disk encompasses several + key ideas of the HDF4 and AIO file formats as well as + addressing some shortcomings therein. The new format is + more self-describing than the HDF4 format and is more + uniformly applied to data objects in the file.

+ +

An HDF5 file appears to the user as a directed graph. + The nodes of this graph are the higher-level HDF5 objects + that are exposed by the HDF5 APIs:

+ +
    +
  • Groups
  • +
  • Datasets
  • +
  • Committed (formerly Named) datatypes
  • +
+ +

At the lowest level, as information is actually written to the disk, + an HDF5 file is made up of the following objects:

+
    +
  • A superblock
  • +
  • B-tree nodes
  • +
  • Heap blocks
  • +
  • Object headers
  • +
  • Object data
  • +
  • Free space
  • +
+ +

The HDF5 Library uses these low-level objects to represent the + higher-level objects that are then presented to the user or + to applications through the APIs. For instance, a group is an + object header that contains a message that points to a local + heap (for storing the links to objects in the group) and to a + B-tree (which indexes the links). A dataset is an object header + that contains messages that describe datatype, dataspace, layout, + filters, external files, fill value, and other elements with the + layout message pointing to either a raw data chunk or to a + B-tree that points to raw data chunks.

+ + +
+

I.A. This Document

+ +

This document describes the lower-level data objects; + the higher-level objects and their properties are described + in the HDF5 User’s Guide.

+ +

Three levels of information comprise the file format. + Level 0 contains basic information for identifying and + defining information about the file. Level 1 contains + the information about the pieces of a file shared by many objects + in the file (such as B-trees and heaps). Level 2 is the rest + of the file and contains all of the data objects, with each object + partitioned into header information, also known as + metadata, and data.

+ +

The sizes of various fields in the following layout tables are + determined by looking at the number of columns the field spans + in the table. There are three exceptions: (1) The size may be + overridden by specifying a size in parentheses, (2) the size of + addresses is determined by the Size of Offsets field + in the superblock and is indicated in this document with a + superscripted ‘O’, and (3) the size of length fields is determined + by the Size of Lengths field in the superblock and is + indicated in this document with a superscripted ‘L’.

+ +

Values for all fields in this document should be treated as unsigned + integers, unless otherwise noted in the description of a field. + Additionally, all metadata fields are stored in little-endian byte + order. +
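The two conventions above (unsigned values, little-endian byte order, with a width taken from the superblock) can be sketched as a small helper. This is a minimal illustration, not library code; the name `read_uint` is hypothetical.

```python
def read_uint(buf: bytes, offset: int, size: int) -> int:
    """Decode an unsigned little-endian integer of `size` bytes at `offset`.

    `size` would typically be the Size of Offsets or Size of Lengths value
    read from the superblock, or a fixed width taken from a layout table.
    """
    field = buf[offset:offset + size]
    if len(field) != size:
        raise ValueError("buffer too short for the requested field")
    return int.from_bytes(field, byteorder="little", signed=False)
```

For instance, the two bytes `10 27` (hexadecimal) decode to 10000, because the least significant byte is stored first.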

+ +

All checksums used in the format are computed with the + Jenkins’ + lookup3 algorithm. +

+ +

Whenever a bit flag or field is mentioned for an entry, bits are + numbered from the lowest bit position in the entry. +

+ +

Various tables in this document are aligned with entries reading “This space inserted only to align table nicely”. These entries only improve the presentation of the table and do not represent any values or padding in the file.

+ + +
+

I.B. Changes for HDF5 1.10

+ +

As of October 2015, changes in the file format for HDF5 1.10 + have not yet been finalized.


II. Disk Format: Level 0 - File Metadata

+ +
+

II.A. Disk Format: Level 0A - Format Signature and Superblock

+ +

The superblock may begin at certain predefined offsets within the HDF5 file, allowing a block of unspecified content for users to place additional information at the beginning (and end) of the HDF5 file without limiting the HDF5 Library’s ability to manage the objects within the file itself. This feature was designed to accommodate wrapping an HDF5 file in another file format or adding descriptive information to an HDF5 file without requiring the modification of the actual file’s information. The superblock is located by searching for the HDF5 format signature at byte offset 0, byte offset 512, and at successive locations in the file, each double the previous location; in other words, at these byte offsets: 0, 512, 1024, 2048, and so on.
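The search described above can be sketched as follows. This is an illustrative helper (the name `find_superblock_offset` is hypothetical) that works on a file already read into memory; the signature bytes are the ones listed in the Format Signature field description.

```python
from typing import Optional

# Format signature: \211 H D F \r \n \032 \n
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_superblock_offset(data: bytes) -> Optional[int]:
    """Return the byte offset of the format signature, or None if not found.

    Only the candidate offsets 0, 512, 1024, 2048, ... are probed.
    """
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= len(data):
        if data[offset:offset + len(HDF5_SIGNATURE)] == HDF5_SIGNATURE:
            return offset
        # After offset 0 comes 512; every later candidate doubles the previous one.
        offset = 512 if offset == 0 else offset * 2
    return None
```

A signature that sits at any other offset (for example, byte 100) is deliberately not found, since the format only permits the offsets listed above.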

+ +

The superblock is composed of the format signature, followed by a superblock version number and information that is specific to each version of the superblock. Currently, there are three versions of the superblock format. Version 0 is the default format, while version 1 is basically the same as version 0 with additional information when a non-default B-tree ‘K’ value is stored. Version 2 is the latest format, with some fields eliminated or compressed and with superblock extension and checksum support.

Versions 0 and 1 of the superblock are described below:

+ + +
Superblock (Versions 0 and 1)

byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Version # of File’s Free Space Storage | Version # of Root Group Symbol Table Entry | Reserved (zero)
Version # of Shared Header Message Format | Size of Offsets | Size of Lengths | Reserved (zero)
Group Leaf Node K | Group Internal Node K
File Consistency Flags
Indexed Storage Internal Node K [1] | Reserved (zero) [1]
Base Address [O]
Address of File Free space Info [O]
End of File Address [O]
Driver Information Block Address [O]
Root Group Symbol Table Entry

(Items marked with an ‘O’ in the above table are of the size specified in “Size of Offsets.”)
(Items marked with a ‘1’ in the above table are new in version 1 of the superblock.)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Format Signature

This field contains a constant value and can be used to + quickly identify a file as being an HDF5 file. The + constant value is designed to allow easy identification of + an HDF5 file and to allow certain types of data corruption + to be detected. The file signature of an HDF5 file always + contains the following values:

Decimal: 137 72 68 70 13 10 26 10
Hexadecimal: 89 48 44 46 0d 0a 1a 0a
ASCII C Notation: \211 H D F \r \n \032 \n

This signature both identifies the file as an HDF5 file and provides for immediate detection of common file-transfer problems. The first two bytes distinguish HDF5 files on systems that expect the first two bytes to identify the file type uniquely. The first byte is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as an HDF5 file; also, it catches bad file transfers that clear bit 7. Bytes two through four name the format. The CR-LF sequence catches bad file transfers that alter newline sequences. The control-Z character stops file display under MS-DOS. The final line feed checks for the inverse of the CR-LF translation problem. (This is a direct descendant of the PNG file signature.)

+

This field is present in version 0+ of the superblock. +

Version Number of the Superblock

This value is used to determine the format of the + information in the superblock. When the format of the + information in the superblock is changed, the version number + is incremented to the next integer and can be used to + determine how the information in the superblock is + formatted.

+ +

Values of 0, 1 and 2 are defined for this field. (The format + of version 2 is described below, not here) +

+ +

This field is present in version 0+ of the superblock. +

+

Version Number of the File’s Free Space + Information

+

This value is used to determine the format of the + file’s free space information. +

+

The only value currently valid in this field is ‘0’, which + indicates that the file’s free space is as described + below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Version Number of the Root Group Symbol Table + Entry

This value is used to determine the format of the + information in the Root Group Symbol Table Entry. When the + format of the information in that field is changed, the + version number is incremented to the next integer and can be + used to determine how the information in the field + is formatted.

+

The only value currently valid in this field is ‘0’, + which indicates that the root group symbol table entry is + formatted as described below.

+

This field is present in version 0 and 1 of the + superblock.

+

Version Number of the Shared Header Message Format

This value is used to determine the format of the + information in a shared object header message. Since the format + of the shared header messages differs from the other private + header messages, a version number is used to identify changes + in the format. +

+

The only value currently valid in this field is ‘0’, which + indicates that shared header messages are formatted as + described below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Size of Offsets

This value contains the number of bytes used to store + addresses in the file. The values for the addresses of + objects in the file are offsets relative to a base address, + usually the address of the superblock signature. This + allows a wrapper to be added after the file is created + without invalidating the internal offset locations. +

+ +

This field is present in version 0+ of the superblock. +

+

Size of Lengths

This value contains the number of bytes used to store + the size of an object. +

+

This field is present in version 0+ of the superblock. +

+

Group Leaf Node K

+

Each leaf node of a group B-tree will have at + least this many entries but not more than twice this + many. If a group has a single leaf node then it + may have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Group Internal Node K

+

Each internal node of a group B-tree will have at + least this many entries but not more than twice this + many. If the group has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

File Consistency Flags

+

This value contains flags to indicate information + about the consistency of the information contained + within the file. Currently, the following bit flags are + defined: +

  • Bit 0 set indicates that the file is opened for write-access.
  • Bit 1 set indicates that the file has been verified for consistency and is guaranteed to be consistent with the format defined in this document.
  • Bits 2-31 are reserved for future use.

Bit 0 should be set as the first action when a file is opened for write access and should be cleared only as the final action when closing a file. Bit 1 should be cleared during normal access to a file and only set after the file’s consistency is guaranteed by the library or a consistency utility.

+ +

This field is present in version 0+ of the superblock. +
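As a small illustration of the bit numbering used for the File Consistency Flags (bit 0 is the lowest bit of the field), the flags could be decoded as below. The function and key names are hypothetical.

```python
def decode_consistency_flags(flags: int) -> dict:
    """Split the File Consistency Flags value into its defined bits."""
    return {
        "open_for_write": bool(flags & 0x1),       # bit 0
        "verified_consistent": bool(flags & 0x2),  # bit 1
        "reserved_bits_set": bool(flags >> 2),     # bits 2 and up should be zero
    }
```

A reader that finds bit 0 set in a file nobody has open can treat this as a hint that the file was not closed cleanly.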

+

Indexed Storage Internal Node K

+

Each internal node of an indexed storage B-tree will have at + least this many entries but not more than twice this + many. If the index storage B-tree has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 1 of the superblock. +

+

Base Address

+

This is the absolute file address of the first byte of + the HDF5 data within the file. The library currently + constrains this value to be the absolute file address + of the superblock itself when creating new files; + future versions of the library may provide greater + flexibility. When opening an existing file and this address does + not match the offset of the superblock, the library assumes + that the entire contents of the HDF5 file have been adjusted in + the file and adjusts the base address and end of file address to + reflect their new positions in the file. Unless otherwise noted, + all other file addresses are relative to this base + address. +

+ +

This field is present in version 0+ of the superblock. +

+

Address of Global Free-space Index

+

The file’s free space is not persistent for version 0 and 1 of + the superblock. + Currently this field always contains the + undefined address. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

End of File Address

+

This is the absolute file address of the first byte past the end of all HDF5 data. It is used to determine whether a file has been accidentally truncated and as an address where file data allocation can occur if space from the free list is not used.

+ +

This field is present in version 0+ of the superblock. +

+

Driver Information Block Address

+

This is the relative file address of the file driver + information block which contains driver-specific + information needed to reopen the file. If there is no + driver information block then this entry should be the + undefined address. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Root Group Symbol Table Entry

+

This is the symbol table entry + of the root group, which serves as the entry point into + the group graph for the file. +

+ +

This field is present in version 0 and 1 of the superblock. +
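The version 0 and 1 layout can be read with a short parser. The sketch below is illustrative only: the function name is hypothetical, and the field widths (two bytes for each K value, four bytes for the consistency flags, two reserved bytes after the version 1 K value) are assumptions taken from the column spans of the layout table.

```python
import struct

SIGNATURE = b"\x89HDF\r\n\x1a\n"


def parse_superblock_v0_v1(buf: bytes) -> dict:
    """Decode the fixed part of a version 0 or 1 superblock starting at buf[0]."""
    if buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    # Eight 1-byte fields, two 2-byte K values, one 4-byte flags field (assumed widths).
    head = "<8BHHI"
    (version, free_space_version, root_entry_version, _reserved_a,
     shared_msg_version, size_of_offsets, size_of_lengths, _reserved_b,
     group_leaf_k, group_internal_k, flags) = struct.unpack_from(head, buf, 8)
    if version not in (0, 1):
        raise ValueError(f"not a version 0 or 1 superblock: {version}")
    fields = {
        "version": version,
        "free_space_version": free_space_version,
        "root_entry_version": root_entry_version,
        "shared_msg_version": shared_msg_version,
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
        "group_leaf_k": group_leaf_k,
        "group_internal_k": group_internal_k,
        "consistency_flags": flags,
    }
    pos = 8 + struct.calcsize(head)
    if version == 1:
        # Version 1 adds a 2-byte K value followed by 2 reserved bytes.
        fields["indexed_storage_internal_k"], _reserved_c = struct.unpack_from("<HH", buf, pos)
        pos += 4
    for name in ("base_address", "free_space_address",
                 "end_of_file_address", "driver_info_address"):
        raw = buf[pos:pos + size_of_offsets]
        if len(raw) != size_of_offsets:
            raise ValueError("buffer too short for an address field")
        fields[name] = int.from_bytes(raw, "little")
        pos += size_of_offsets
    # The Root Group Symbol Table Entry follows; it is not decoded here.
    fields["root_symbol_table_entry_offset"] = pos
    return fields
```

The address fields are sized only after Size of Offsets has been read, which is why they cannot be folded into the single fixed-width `struct` format.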

+

Version 2 of the superblock is described below:

+ +
Superblock (Version 2)

byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Size of Offsets | Size of Lengths | File Consistency Flags
Base Address [O]
Superblock Extension Address [O]
End of File Address [O]
Root Group Object Header Address [O]
Superblock Checksum

(Items marked with an ‘O’ in the above table are of the size specified in “Size of Offsets.”)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Format Signature

+

This field is the same as described for versions 0 and 1 of the + superblock. +

Version Number of the Superblock

+

This field has a value of 2 and has the same meaning as for + versions 0 and 1. +

+

Size of Offsets

+

This field is the same as described for versions 0 and 1 of the + superblock. +

+

Size of Lengths

+

This field is the same as described for versions 0 and 1 of the + superblock. +

+

File Consistency Flags

+

This field is the same as described for versions 0 and 1 except + that it is smaller (the number of reserved bits has been reduced + from 30 to 6). +

+

Base Address

+

This field is the same as described for versions 0 and 1 of the + superblock. +

+

Superblock Extension Address

+

This field is the address of the object header for the superblock extension. If there is no extension then this entry should be the undefined address.

+

End of File Address

+

This field is the same as described for versions 0 and 1 of the + superblock. +

+

Root Group Object Header Address

+

This is the address of + the root group object header, + which serves as the entry point into the group graph for the file. +

+

Superblock Checksum

+

The checksum for the superblock. +
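A version 2 superblock can be decoded along the same lines. In the sketch below the function name is hypothetical, the checksum is assumed to be four bytes wide, and the undefined address is assumed to be the value with all bits set (its definition lies outside this section). The checksum is extracted but not verified, since that would require an implementation of the lookup3 algorithm.

```python
import struct
from typing import Optional

SIGNATURE = b"\x89HDF\r\n\x1a\n"


def parse_superblock_v2(buf: bytes) -> dict:
    """Decode a version 2 superblock starting at buf[0]."""
    if buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    version, size_of_offsets, size_of_lengths, flags = struct.unpack_from("<4B", buf, 8)
    if version != 2:
        raise ValueError(f"not a version 2 superblock: {version}")
    # Assumption: the undefined address has every bit set.
    undefined = (1 << (8 * size_of_offsets)) - 1

    def address(pos: int) -> Optional[int]:
        value = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        return None if value == undefined else value

    fields = {
        "version": version,
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
        "consistency_flags": flags,
    }
    pos = 12
    for name in ("base_address", "superblock_extension_address",
                 "end_of_file_address", "root_group_object_header_address"):
        fields[name] = address(pos)
        pos += size_of_offsets
    # Stored checksum (assumed 4 bytes); raises struct.error on a truncated buffer.
    (fields["checksum"],) = struct.unpack_from("<I", buf, pos)
    return fields
```

Returning `None` for an undefined address makes the "no superblock extension" case explicit to the caller.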

+
+
+ +
+

II.B. Disk Format: Level 0B - File Driver Info

+ +

The driver information block is an optional region of the + file which contains information needed by the file driver + to reopen a file. The format is described below:

+ + +
Driver Information Block

byte | byte | byte | byte
Version | Reserved
Driver Information Size
Driver Identification (8 bytes)
Driver Information (variable size)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number of the Driver Information Block. + This document describes version 0. +

+

Driver Information Size

+

The size in bytes of the Driver Information field. +

+

Driver Identification

+

This is an eight-byte ASCII string without null + termination which identifies the driver and/or version number + of the Driver Information Block. The predefined driver encoded + in this field by the HDF5 Library is identified by the + letters NCSA followed by the first four characters of + the driver name. If the Driver Information block is not + the original version then the last letter(s) of the + identification will be replaced by a version number in + ASCII, starting with 0. +

+

The identification for user-defined drivers is also eight bytes long. It can be arbitrary but should be unique and should avoid the four-character prefix “NCSA”.

+

Driver Information

Driver information is stored in a format defined by the + file driver (see description below).
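The block layout above could be decoded as sketched here. The function name is hypothetical, and the widths (one version byte, three reserved bytes, a four-byte size) are assumptions based on the layout table.

```python
import struct


def parse_driver_info_block(buf: bytes) -> dict:
    """Decode a Driver Information Block starting at buf[0]."""
    # 1-byte version, 3 reserved bytes, 4-byte Driver Information Size (assumed widths).
    version, info_size = struct.unpack_from("<B3xI", buf, 0)
    identification = buf[8:16]
    if len(identification) != 8:
        raise ValueError("buffer too short for the Driver Identification")
    info = buf[16:16 + info_size]
    if len(info) != info_size:
        raise ValueError("buffer too short for the Driver Information")
    return {
        "version": version,
        # Eight ASCII characters, not null-terminated.
        "identification": identification.decode("ascii"),
        "predefined_driver": identification.startswith(b"NCSA"),
        "info": info,
    }
```

The raw `info` bytes are returned untouched because their format depends on the driver named in the identification.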
+
+ +
The two drivers encoded in the Driver Identification field are as follows:

  • Multi driver:

    The identifier for this driver is “NCSAmulti”; only its first eight characters, “NCSAmult”, fit in the eight-byte Driver Identification field. This driver provides a mechanism for segregating raw data and different types of metadata into multiple files. These files are viewed by the library as a single virtual HDF5 file with a single file address space. A maximum of 6 files will be created for the following data: superblock, B-tree, raw data, global heap, local heap, and object header. More than one type of data can be written to the same file.

  • Family driver:

    The identifier for this driver is “NCSAfami” and is encoded in this field for library version 1.8 and after. This driver is designed for systems that do not support files larger than 2 gigabytes; it splits the HDF5 file address space across several smaller files. It does nothing to segregate metadata and raw data; they are mixed in the address space just as they would be in a single contiguous file.

The formats of the Driver Information field for the above two drivers are described below:

+ +
Multi Driver Information

byte | byte | byte | byte
Member Mapping | Member Mapping | Member Mapping | Member Mapping
Member Mapping | Member Mapping | Reserved | Reserved
Address of Member File 1
End of Address for Member File 1
Address of Member File 2
End of Address for Member File 2
...
Address of Member File N
End of Address for Member File N
Name of Member File 1 (variable size)
Name of Member File 2 (variable size)
...
Name of Member File N (variable size)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Member Mapping

These fields are integer values from 1 to 6 indicating how the data can be mapped to or merged with another type of data.

Member Mapping | Description
1 | The superblock data.
2 | The B-tree data.
3 | The raw data.
4 | The global heap data.
5 | The local heap data.
6 | The object header data.

+

For example, if the third field has the value 3 and all the rest have the + value 1, it means there are two files: one for raw data, and one for superblock, + B-tree, global heap, local heap, and object header.
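The example above can be reproduced with a few lines of code. The sketch assumes that the i-th Member Mapping field names the data type whose file also stores data of type i; the function and constant names are hypothetical.

```python
USAGE_TYPES = ("superblock", "B-tree", "raw data",
               "global heap", "local heap", "object header")


def group_member_files(mapping):
    """Group the six data types by the member file they are mapped to.

    `mapping` holds the six Member Mapping values in field order.
    Returns {mapping value: [data types stored in that member file]}.
    """
    if len(mapping) != len(USAGE_TYPES):
        raise ValueError("expected exactly six Member Mapping values")
    files = {}
    for usage, target in zip(USAGE_TYPES, mapping):
        if not 1 <= target <= 6:
            raise ValueError(f"invalid Member Mapping value: {target}")
        files.setdefault(target, []).append(usage)
    return files
```

With the mapping `[1, 1, 3, 1, 1, 1]` the result has two entries: one file for raw data and one shared by the other five data types.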

+

Reserved

These fields are reserved and should always be zero.

Address of Member File N

This field specifies the virtual address at which the member file starts.

+

N is the number of member files.

+

End of Address for Member File N

This field is the end of the allocated address for the member file. +

Name of Member File N

This field is the null-terminated name of the member file, and its length should be a multiple of 8 bytes. Additional bytes will be padded with NULLs. The default naming convention is %s-X.h5, where X is one of the letters s (for superblock), b (for B-tree), r (for raw data), g (for global heap), l (for local heap), and o (for object header). The name of the whole HDF5 file is substituted for the %s in the string.
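The naming and padding rules for member file names can be illustrated as follows. Both helper names are hypothetical, and ASCII encoding of the name is an assumption.

```python
SUFFIX = {
    "superblock": "s", "B-tree": "b", "raw data": "r",
    "global heap": "g", "local heap": "l", "object header": "o",
}


def default_member_name(base: str, usage: str) -> str:
    """Apply the default %s-X.h5 convention to the name of the whole file."""
    return f"{base}-{SUFFIX[usage]}.h5"


def encode_member_name(name: str) -> bytes:
    """Null-terminate a member file name and pad it to a multiple of 8 bytes."""
    raw = name.encode("ascii") + b"\0"
    padded_length = -(-len(raw) // 8) * 8  # round up to the next multiple of 8
    return raw.ljust(padded_length, b"\0")
```

For a file named `data`, the raw data member would be `data-r.h5`, which occupies 16 bytes once the terminator and padding are added.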

+
+
+ +
+
Family Driver Information

byte | byte | byte | byte
Size of Member File
+ +
+
+ + + + + + + + + + +
Field NameDescription

Size of Member File

This field is the size of the member file in the family of files.

+
+ +
+

II.C. Disk Format: Level 0C - Superblock Extension

+ +

The superblock extension is used to store superblock metadata + which is either optional, or added after the version of the superblock + was defined. Superblock extensions may only exist when version 2+ of + superblock is used. A superblock extension is an object header which may + hold the following messages:


III. Disk Format: Level 1 - File Infrastructure

+ +
+

III.A. Disk Format: Level 1A - B-trees and B-tree Nodes

+ +

B-trees allow flexible storage for objects which tend to grow + in ways that cause the object to be stored discontiguously. B-trees + are described in various algorithms books including “Introduction to + Algorithms” by Thomas H. Cormen, Charles E. Leiserson, and Ronald + L. Rivest. B-trees are used in several places in the HDF5 file format, + when an index is needed for another data structure.

+ +

The version 1 B-tree structure described below is the original index structure, but it is limited by some bugs in our implementation (mainly in how it handles deleting records). The version 1 B-trees are being phased out in favor of the version 2 B-trees described below, although both types of structures may be found in the same file, depending on application settings when creating the file.

+ +
+

III.A.1. Disk Format: Level 1A1 - Version 1 B-trees (B-link Trees)

+ +

Version 1 B-trees in HDF5 files are an implementation of the B-link tree, in which the sibling nodes at a particular level in the tree are stored in a doubly-linked list. The B-link tree is described in the “Efficient Locking for Concurrent Operations on B-trees” paper by Phillip Lehman and S. Bing Yao as published in the ACM Transactions on Database Systems, Vol. 6, No. 4, December 1981.

+ +

The B-link trees implemented by the file format contain one more + key than the number of children. In other words, each child + pointer out of a B-tree node has a left key and a right key. + The pointers out of internal nodes point to sub-trees while + the pointers out of leaf nodes point to symbol nodes and + raw data chunks. + Aside from that difference, internal nodes and leaf nodes + are identical.

+ +
B-link Tree Nodes

byte | byte | byte | byte
Signature
Node Type | Node Level | Entries Used
Address of Left Sibling [O]
Address of Right Sibling [O]
Key 0 (variable size)
Address of Child 0 [O]
Key 1 (variable size)
Address of Child 1 [O]
...
Key 2K (variable size)
Address of Child 2K [O]
Key 2K+1 (variable size)

(Items marked with an ‘O’ in the above table are of the size specified in the “Size of Offsets” field in the superblock.)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “TREE” is + used to indicate the + beginning of a B-link tree node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Node Type

+

Each B-link tree points to a particular type of data. This field indicates the type of data as well as implying the maximum degree K of the tree and the size of each Key field.

Node Type | Description
0 | This tree points to group nodes.
1 | This tree points to raw data chunk nodes.

+

Node Level

+

The node level indicates the level at which this node + appears in the tree (leaf nodes are at level zero). Not + only does the level indicate whether child pointers + point to sub-trees or to data, but it can also be used + to help file consistency checking utilities reconstruct + damaged trees. +

+

Entries Used

+

This determines the number of children to which this node points. All nodes of a particular type of tree have the same maximum degree, but most nodes will point to fewer than that number of children. The valid child pointers and keys appear at the beginning of the node and the unused pointers and keys appear at the end of the node. The unused pointers and keys have undefined values.

+

Address of Left Sibling

+

This is the relative file address of the left sibling of + the current node. If the current + node is the left-most node at this level then this field + is the undefined address. +

+

Address of Right Sibling

+

This is the relative file address of the right sibling of + the current node. If the current + node is the right-most node at this level then this + field is the undefined address. +
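The fixed fields at the start of a B-link tree node can be decoded as sketched below. The function name is hypothetical, and the widths (one byte each for Node Type and Node Level, two bytes for Entries Used) are assumptions based on the layout table; `size_of_offsets` comes from the superblock.

```python
import struct


def parse_btree_v1_node_header(buf: bytes, size_of_offsets: int) -> dict:
    """Decode the fields that precede the keys of a version 1 B-tree node."""
    if buf[:4] != b"TREE":
        raise ValueError("missing B-link tree node signature")
    node_type, node_level, entries_used = struct.unpack_from("<BBH", buf, 4)
    pos = 8
    siblings = []
    for _ in range(2):  # left sibling, then right sibling
        raw = buf[pos:pos + size_of_offsets]
        if len(raw) != size_of_offsets:
            raise ValueError("buffer too short for a sibling address")
        siblings.append(int.from_bytes(raw, "little"))
        pos += size_of_offsets
    return {
        "node_type": node_type,        # 0: group nodes, 1: raw data chunk nodes
        "node_level": node_level,      # 0 means leaf
        "entries_used": entries_used,  # number of valid child pointers
        "left_sibling": siblings[0],
        "right_sibling": siblings[1],
        "keys_offset": pos,            # Key 0 starts here
    }
```

The keys and child pointers that follow are interleaved, and the key size depends on the node type, so they need a type-specific decoder.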

+

Keys and Child Pointers

+

Each node has 2K+1 keys with 2K child pointers interleaved between the keys. The number of keys and child pointers actually containing valid values is determined by the node’s Entries Used field. If that field is N then the node contains N child pointers and N+1 keys.

+

Key

+

The format and size of the key values is determined by + the type of data to which this tree points. The keys are + ordered and are boundaries for the contents of the child + pointer; that is, the key values represented by child + N fall between Key N and Key + N+1. Whether the interval is open or closed on + each end is determined by the type of data to which the + tree points. +

+ +

The format of the key depends on the node type. For nodes of node type 0 (group nodes), the key is formatted as follows:

A single field of Size of Lengths bytes: Indicates the byte offset into the local heap for the first object name in the subtree which that key describes.
+

+ + +

For nodes of node type 1 (chunked raw data nodes), the key is formatted as follows:

Bytes 1-4: Size of chunk in bytes.
Bytes 5-8: Filter mask, a 32-bit bit field indicating which filters have been skipped for this chunk. Each filter has an index number in the pipeline (starting at 0, with the first filter to apply) and if that filter is skipped, the bit corresponding to its index is set.
(D + 1) 64-bit fields: The offset of the chunk within the dataset where D is the number of dimensions of the dataset, and the last value is the offset within the dataset’s datatype and should always be zero. For example, if a chunk in a 3-dimensional dataset begins at the position [5,5,5], there will be three such 64-bit values, each with the value of 5, followed by a 0 value.
+
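A chunk key with this layout can be decoded as in the following sketch. The function name is hypothetical, and the number of dataset dimensions D must be supplied by the caller, since it is not stored in the key itself.

```python
import struct


def parse_chunk_key(buf: bytes, ndims: int) -> dict:
    """Decode a node type 1 (raw data chunk) key for a dataset with `ndims` dimensions."""
    chunk_size, filter_mask = struct.unpack_from("<II", buf, 0)
    # D dataset offsets followed by one datatype offset, each 64 bits wide.
    offsets = struct.unpack_from(f"<{ndims + 1}Q", buf, 8)
    return {
        "chunk_size": chunk_size,
        # Pipeline indexes of the filters that were skipped for this chunk.
        "skipped_filters": [i for i in range(32) if (filter_mask >> i) & 1],
        "offsets": offsets[:-1],
        "datatype_offset": offsets[-1],  # should always be zero
    }
```

For the 3-dimensional example above, `offsets` would be `(5, 5, 5)` and `datatype_offset` would be 0.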

+ +

Child Pointer

+

The tree node contains file addresses of subtrees or + data depending on the node level. Nodes at Level 0 point + to data addresses, either raw data chunks or group nodes. + Nodes at non-zero levels point to other nodes of the + same B-tree. +

+

For raw data chunk nodes, the child pointer is the address + of a single raw data chunk. For group nodes, the child pointer + points to a symbol table, which contains + information for multiple symbol table entries. +

+
+
+ +

Conceptually, each B-tree node looks like this:

key[0] child[0] key[1] child[1] key[2] ... ... key[N-1] child[N-1] key[N]

where child[i] is a pointer to a sub-tree (at a level above Level 0) or to data (at Level 0). Each key[i] describes an item stored by the B-tree (a chunk or an object of a group node). The range of values represented by child[i] is indicated by key[i] and key[i+1].

The following question must next be answered: + “Is the value described by key[i] contained in + child[i-1] or in child[i]?” + The answer depends on the type of tree. + In trees for groups (node type 0) the object described by + key[i] is the greatest object contained in + child[i-1] while in chunk trees (node type 1) the + chunk described by key[i] is the least chunk in + child[i].

+ +

That means that key[0] for group trees is sometimes unused; + it points to offset zero in the heap, which is always the + empty string and compares as “less-than” any valid object name.

+ +

And key[N] for chunk trees is sometimes unused; + it contains a chunk offset which compares as “greater-than” + any other chunk offset and has a chunk byte size of zero + to indicate that it is not actually allocated.
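The two interval conventions can be made concrete with a small lookup routine. This is only a sketch: keys are modelled as plain comparable Python values (names for group trees, offset tuples for chunk trees), whereas a real reader would first resolve group keys through the local heap. The function name is hypothetical.

```python
from bisect import bisect_left, bisect_right


def child_index(keys, value, node_type):
    """Return i such that child[i] covers `value`, or None if no child does.

    `keys` holds key[0] .. key[N] in ascending order for a node with N children.
    """
    if node_type == 0:
        # Group trees: key[i] is the greatest object in child[i-1],
        # so child[i] covers the interval (key[i], key[i+1]].
        i = bisect_left(keys, value) - 1
    elif node_type == 1:
        # Chunk trees: key[i] is the least chunk in child[i],
        # so child[i] covers the interval [key[i], key[i+1]).
        i = bisect_right(keys, value) - 1
    else:
        raise ValueError(f"unknown node type: {node_type}")
    return i if 0 <= i < len(keys) - 1 else None
```

With group keys `["", "dset_m", "zeta"]`, the name `"dset_m"` falls in child 0, while with chunk keys `[(0,), (10,), (20,)]` the offset `(10,)` falls in child 1.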

+ +
+

III.A.2. Disk Format: Level 1A2 - Version 2 B-trees

+ +

Version 2 B-trees are “traditional” B-trees, with one major difference. + Instead of just using a simple pointer (or address in the file) to a + child of an internal node, the pointer to the child node contains two + additional pieces of information: the number of records in the child + node itself, and the total number of records in the child node and + all its descendants. Storing this additional information allows fast + array-like indexing to locate the nth record in the B-tree.
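To show why the stored record counts allow array-like indexing, the sketch below locates the nth record (counting from zero) without visiting subtrees that cannot contain it. It uses a hypothetical in-memory model (`Node`, `ChildPointer`) rather than the on-disk encoding.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class ChildPointer:
    node: "Node"
    num_records: int    # records in the child node itself
    total_records: int  # records in the child node and all its descendants


@dataclass
class Node:
    records: List[Any]
    children: List[ChildPointer] = field(default_factory=list)  # empty for a leaf


def nth_record(node: Node, n: int) -> Any:
    """Return the record at in-order position n (n >= 0) below `node`."""
    while node.children:
        for i, child in enumerate(node.children):
            if n < child.total_records:
                node = child.node  # the record lies inside this subtree
                break
            n -= child.total_records  # skip the whole subtree at once
            if i < len(node.records):
                if n == 0:
                    return node.records[i]  # record between child i and child i+1
                n -= 1
        else:
            raise IndexError("record index out of range")
    return node.records[n]
```

Each step down the tree costs work proportional to the number of children in one node, so the lookup never has to count records by walking leaves.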

+ +

The entry into a version 2 B-tree is a header which contains global information about the structure of the B-tree. The root node address field in the header points to the B-tree root node, which is either an internal or leaf node, depending on the value in the header’s depth field. An internal node consists of records plus pointers to further leaf or internal nodes in the tree. A leaf node consists solely of records. The format of the records depends on the B-tree type (stored in the header).

+ +
Version 2 B-tree Header

byte | byte | byte | byte
Signature
Version | Type | This space inserted only to align table nicely
Node Size
Record Size | Depth
Split Percent | Merge Percent | This space inserted only to align table nicely
Root Node Address [O]
Number of Records in Root Node | This space inserted only to align table nicely
Total Number of Records in B-tree [L]
Checksum

(Items marked with an ‘O’ in the above table are of the size specified in the “Size of Offsets” field in the superblock.)
(Items marked with an ‘L’ in the above table are of the size specified in the “Size of Lengths” field in the superblock.)
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “BTHD” is used to indicate the header of a version 2 B-tree.

+

Version

+

The version number for this B-tree header. This document + describes version 0. +

+

Type

+

This field indicates the type of B-tree:

Value | Description
0 | A “testing” B-tree; this value should not be used for storing records in actual HDF5 files.
1 | This B-tree is used for indexing indirectly accessed, non-filtered ‘huge’ fractal heap objects.
2 | This B-tree is used for indexing indirectly accessed, filtered ‘huge’ fractal heap objects.
3 | This B-tree is used for indexing directly accessed, non-filtered ‘huge’ fractal heap objects.
4 | This B-tree is used for indexing directly accessed, filtered ‘huge’ fractal heap objects.
5 | This B-tree is used for indexing the ‘name’ field for links in indexed groups.
6 | This B-tree is used for indexing the ‘creation order’ field for links in indexed groups.
7 | This B-tree is used for indexing shared object header messages.
8 | This B-tree is used for indexing the ‘name’ field for indexed attributes.
9 | This B-tree is used for indexing the ‘creation order’ field for indexed attributes.

+

The format of records for each type is described below.

+

Node Size

+

This is the size in bytes of all B-tree nodes. +

+

Record Size

+

This field is the size in bytes of the B-tree record. +

+

Depth

+

This is the depth of the B-tree. +

+

Split Percent

+

The percent full that a node needs to increase above before it + is split. +

+

Merge Percent

+

The percent full that a node needs to decrease below before it is merged.

+

Root Node Address

+

This is the address of the root B-tree node. A B-tree with + no records will have the undefined + address in this field. +

+

Number of Records in Root Node

+

This is the number of records in the root node. +

+

Total Number of Records in B-tree

+

This is the total number of records in the entire B-tree. +

+

Checksum

+

This is the checksum for the B-tree header. +

+
+
+ +
+
+
Version 2 B-tree Internal Node

byte | byte | byte | byte
Signature
Version | Type | Records 0, 1, 2...N-1 (variable size)
Child Node Pointer 0 [O]
Number of Records N0 for Child Node 0 (variable size)
Total Number of Records for Child Node 0 (optional, variable size)
Child Node Pointer 1 [O]
Number of Records N1 for Child Node 1 (variable size)
Total Number of Records for Child Node 1 (optional, variable size)
...
Child Node Pointer N [O]
Number of Records Nn for Child Node N (variable size)
Total Number of Records for Child Node N (optional, variable size)
Checksum

(Items marked with an ‘O’ in the above table are of the size specified in the “Size of Offsets” field in the superblock.)
+ + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “BTIN” is used to indicate the internal node of a version 2 B-tree.

+

Version

+

The version number for this B-tree internal node. + This document describes version 0. +

+

Type

+

This field is the type of the B-tree node. It should always + be the same as the B-tree type in the header. +

+

Records

+

The size of this field is determined by the number of records + for this node and the record size (from the header). The format + of records depends on the type of B-tree. +

+

Child Node Pointer

+

This field is the address of the child node pointed to by the + internal node. +

+

Number of Records in Child Node

+

This is the number of records in the child node pointed to by + the corresponding Node Pointer. +

+

The number of bytes used to store this field is determined by + the maximum possible number of records able to be stored in the + child node. +

+

+ The maximum number of records in a child node is computed + in the following way: + +

    +
  • Subtract the fixed-size overhead for the child node (for example, its signature, version, and checksum) and one pointer triplet of information for the child node (because there is one more pointer triplet than records in each internal node) from the size of nodes for the B-tree.
  • Divide that result by the size of a record plus the pointer triplet of information stored to reach each child node from this node.
+ +

+

+ Note that leaf nodes do not encode any + child pointer triplets, so the maximum number of records in a + leaf node is just the node size minus the leaf node overhead, + divided by the record size. +

+

Also note that the first level of internal nodes above the leaf nodes does not encode the Total Number of Records in Child Node value in the child pointer triplets (since it is the same as the Number of Records in Child Node), so the maximum number of records in these nodes is computed with the equation above, but using (Child Pointer, Number of Records in Child Node) pairs instead of triplets.

+

+ The number of + bytes used to encode this field is the least number of bytes + required to encode the maximum number of records in a child + node value for the child nodes below this level + in the B-tree. +

+

+ For example, if the maximum number of child records is + 123, one byte will be used to encode these values in this + node; if the maximum number of child records is + 20000, two bytes will be used to encode these values in this + node; and so on. The maximum number of bytes used to + encode these values is 8 (in other words, an unsigned + 64-bit integer). +

+

Total Number of Records in Child Node

+

This is the total number of records for the node pointed to by the corresponding Node Pointer and all its children. This field exists only in nodes whose depth in the B-tree is greater than 1 (in other words, the “twig” internal nodes, just above leaf nodes, do not store this field in their child node pointers).

+

The number of bytes used to store this field is determined by + the maximum possible number of records able to be stored in the + child node and its descendants. +

+

+ The maximum possible number of records able to be stored in a + child node and its descendants is computed iteratively, in the + following way: The maximum number of records in a leaf node + is computed, then that value is used to compute the maximum + possible number of records in the first level of internal nodes + above the leaf nodes. Multiplying these two values together + determines the maximum possible number of records in child node + pointers for the level of nodes two levels above leaf nodes. + This process is continued up to any level in the B-tree. +

+

+ The number of bytes used to encode this value is computed in + the same way as for the Number of Records in Child Node + field. +

+

Checksum

+

This is the checksum for this node. +

+
+
+ +
+
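The capacity and field-width rules described for the Number of Records in Child Node field can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names are hypothetical, and the 10-byte fixed overhead (4-byte signature, 1-byte version, 1-byte type, 4-byte checksum) is an assumption read from the node layouts.

```python
# Assumed fixed per-node overhead: signature (4) + version (1) + type (1) + checksum (4).
NODE_OVERHEAD = 10


def bytes_to_encode(max_value):
    """Least number of bytes that can hold max_value as an unsigned integer."""
    nbytes = max(1, (max_value.bit_length() + 7) // 8)
    if nbytes > 8:
        raise ValueError("value does not fit in an unsigned 64-bit integer")
    return nbytes


def max_leaf_records(node_size, record_size):
    """Leaf nodes hold no child pointers, so only the fixed overhead is subtracted."""
    return (node_size - NODE_OVERHEAD) // record_size


def max_internal_records(node_size, record_size, pointer_size):
    """pointer_size is the size of one child pointer pair or triplet.

    One extra pointer is subtracted up front because an internal node
    has one more child pointer than it has records.
    """
    return (node_size - (NODE_OVERHEAD + pointer_size)) // (record_size + pointer_size)


# Hypothetical example: 512-byte nodes, 11-byte records, 8-byte offsets.
leaf_max = max_leaf_records(512, 11)
# A "twig" node stores (Child Pointer, Number of Records) pairs.
pair_size = 8 + bytes_to_encode(leaf_max)
twig_max = max_internal_records(512, 11, pair_size)
```

With these example values a leaf holds at most 45 records, so one byte encodes the record count in each twig-level child pointer.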
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree Leaf Node +
byte | byte | byte | byte
Signature
Version | Type | Record 0, 1, 2...N-1 (variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “BTLF” is used to indicate the leaf node of a version 2 B-link tree.

+

Version

+

The version number for this B-tree leaf node. + This document describes version 0. +

+

Type

+

This field is the type of the B-tree node. It should always + be the same as the B-tree type in the header. +

+

Records

+

The size of this field is determined by the number of records + for this node and the record size (from the header). The format + of records depends on the type of B-tree. +

+

Checksum

+

This is the checksum for this node. +

+
+
+ +
+

The record layout for each stored (in other words, non-testing) + B-tree type is as follows:

+ +
+ + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 1 Record Layout - Indirectly Accessed, Non-Filtered, + ‘Huge’ Fractal Heap Objects +
byte | byte | byte | byte

Huge Object Address (O)


Huge Object Length (L)


Huge Object ID (L)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Huge Object Address

+

The address of the huge object in the file. +

+

Huge Object Length

+

The length of the huge object in the file. +

+

Huge Object ID

+

The heap ID for the huge object. +

+
+
+ +
+
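A Type 1 record can be decoded directly from the layout above once the superblock's “Size of Offsets” and “Size of Lengths” are known. The sketch below is illustrative only: the function name is hypothetical, and little-endian byte order is an assumption not restated in this section.

```python
def decode_type1_record(buf, size_of_offsets, size_of_lengths):
    """Split a Type 1 record into address, length, and heap ID.

    Assumes multi-byte integers are stored little-endian.
    """
    o, n = size_of_offsets, size_of_lengths
    if len(buf) < o + 2 * n:
        raise ValueError("record is shorter than the Type 1 layout requires")
    return {
        "address": int.from_bytes(buf[0:o], "little"),
        "length": int.from_bytes(buf[o:o + n], "little"),
        "heap_id": int.from_bytes(buf[o + n:o + 2 * n], "little"),
    }


# Hypothetical record with 8-byte offsets and lengths.
example = (
    (0x1000).to_bytes(8, "little")
    + (4096).to_bytes(8, "little")
    + (7).to_bytes(8, "little")
)
record = decode_type1_record(example, 8, 8)
```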
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 2 Record Layout - Indirectly Accessed, Filtered, + ‘Huge’ Fractal Heap Objects +
byte | byte | byte | byte

Filtered Huge Object Address (O)


Filtered Huge Object Length (L)

Filter Mask

Filtered Huge Object Memory Size (L)


Huge Object ID (L)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Filtered Huge Object Address

+

The address of the filtered huge object in the file. +

+

Filtered Huge Object Length

+

The length of the filtered huge object in the file. +

+

Filter Mask

+

A 32-bit bit field indicating which filters have been skipped for + this chunk. Each filter has an index number in the pipeline + (starting at 0, with the first filter to apply) and if that + filter is skipped, the bit corresponding to its index is set. +

+

Filtered Huge Object Memory Size

+

The size of the de-filtered huge object in memory. +

+

Huge Object ID

+

The heap ID for the huge object. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 3 Record Layout - Directly Accessed, Non-Filtered, + ‘Huge’ Fractal Heap Objects +
byte | byte | byte | byte

Huge Object Address (O)


Huge Object Length (L)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Huge Object Address

+

The address of the huge object in the file. +

+

Huge Object Length

+

The length of the huge object in the file. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 4 Record Layout - Directly Accessed, Filtered, + ‘Huge’ Fractal Heap Objects +
byte | byte | byte | byte

Filtered Huge Object Address (O)


Filtered Huge Object Length (L)

Filter Mask

Filtered Huge Object Memory Size (L)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Filtered Huge Object Address

+

The address of the filtered huge object in the file. +

+

Filtered Huge Object Length

+

The length of the filtered huge object in the file. +

+

Filter Mask

+

A 32-bit bit field indicating which filters have been skipped for + this chunk. Each filter has an index number in the pipeline + (starting at 0, with the first filter to apply) and if that + filter is skipped, the bit corresponding to its index is set. +

+

Filtered Huge Object Memory Size

+

The size of the de-filtered huge object in memory. +

+
+
+ +
+
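The Filter Mask used by the filtered record types (Types 2 and 4) can be read bit by bit: bit i is set when the filter at pipeline index i was skipped. A minimal sketch, with a hypothetical function name:

```python
def skipped_filters(filter_mask, n_filters):
    """Return the pipeline indexes whose bit is set in the 32-bit filter mask."""
    if not 0 <= n_filters <= 32:
        raise ValueError("the mask has room for at most 32 filters")
    return [i for i in range(n_filters) if (filter_mask >> i) & 1]


# Hypothetical three-filter pipeline in which the first and third filters were skipped.
skipped = skipped_filters(0b101, 3)
```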
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 5 Record Layout - Link Name for Indexed Group +
byte | byte | byte | byte
Hash of Name
ID (bytes 1-4)
ID (bytes 5-7)
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Hash

+

This field is the hash value of the name for the link. The hash value is computed by applying Jenkins’ lookup3 checksum algorithm to the link’s name.

+

ID

+

This is a 7-byte sequence that holds the heap ID for the link record in the group’s fractal heap.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 6 Record Layout - Creation Order for Indexed Group +
byte | byte | byte | byte

Creation Order (8 bytes)

ID (bytes 1-4)
ID (bytes 5-7)
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Creation Order

+

This field is the creation order value for the link. +

+

ID

+

This is a 7-byte sequence that holds the heap ID for the link record in the group’s fractal heap.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 7 Record Layout - Shared Object Header Messages (Sub-Type 0 - Message in Heap) +
byte | byte | byte | byte
Message Location | This space inserted only to align table nicely
Hash
Reference Count

Heap ID (8 bytes)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Message Location

+

This field indicates the location where the message is stored:
Value | Description
0 | Shared message is stored in shared message index heap.
1 | Shared message is stored in object header.

+

Hash

+

This field is the hash value of the shared message. The hash value is computed by applying Jenkins’ lookup3 checksum algorithm to the shared message.

+

Reference Count

+

The number of objects which reference this message.

+

Heap ID

+

This is an 8-byte sequence that holds the heap ID for the shared message in the shared message index’s fractal heap.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 7 Record Layout - Shared Object Header Messages (Sub-Type 1 - Message in Object Header) +
byte | byte | byte | byte
Message Location | This space inserted only to align table nicely
Hash
Reserved (zero) | Message Type | Object Header Index

Object Header Address (O)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Message Location

+

This field indicates the location where the message is stored:
Value | Description
0 | Shared message is stored in shared message index heap.
1 | Shared message is stored in object header.

+

Hash

+

This field is the hash value of the shared message. The hash value is computed by applying Jenkins’ lookup3 checksum algorithm to the shared message.

+

Message Type

+

The object header message type of the shared message.

+

Object Header Index

+

This field indicates that the shared message is the nth message + of its type in the specified object header.

+

Object Header Address

+

The address of the object header containing the shared message.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 B-tree, Type 8 Record Layout - Attribute Name for Indexed Attributes +
byte | byte | byte | byte

Heap ID (8 bytes)

Message Flags | This space inserted only to align table nicely
Creation Order
Hash of Name
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Heap ID

+

This is an 8-byte sequence that holds the heap ID for the attribute in the object’s attribute fractal heap.

+

Message Flags

The object header message flags for the attribute message.

+

Creation Order

+

This field is the creation order value for the attribute. +

+

Hash

+

This field is the hash value of the name for the attribute. The hash value is computed by applying Jenkins’ lookup3 checksum algorithm to the attribute’s name.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + +
Version 2 B-tree, Type 9 Record Layout - Creation Order for Indexed Attributes
byte | byte | byte | byte

Heap ID (8 bytes)

Message Flags | This space inserted only to align table nicely
Creation Order
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Heap ID

+

This is an 8-byte sequence that holds the heap ID for the attribute in the object’s attribute fractal heap.

+

Message Flags

+

The object header message flags for the attribute message.

+

Creation Order

+

This field is the creation order value for the attribute. +

+
+
+ + +
+

+III.B. Disk Format: Level 1B - Group Symbol Table Nodes

+ +

A group is an object internal to the file that allows + arbitrary nesting of objects within the file (including other groups). + A group maps a set of link names in the group to a set of relative + file addresses of objects in the file. Certain metadata for an object to + which the group points can be cached in the group’s symbol table entry in + addition to being in the object’s header.

+ +

An HDF5 object name space can be stored hierarchically by + partitioning the name into components and storing each + component as a link in a group. The link for a + non-ultimate component points to the group containing + the next component. The link for the last + component points to the object being named.

+ +

One implementation of a group is a collection of symbol table nodes + indexed by a B-link tree. Each symbol table node contains entries + for one or more links. If an attempt is made to add a link to an already + full symbol table node containing 2K entries, then the node is + split and one node contains K symbols and the other contains + K+1 symbols.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Symbol Table Node (A Leaf of a B-link tree) +
byte | byte | byte | byte
Signature
Version Number | Reserved (zero) | Number of Symbols


Group Entries


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “SNOD” is + used to indicate the + beginning of a symbol table node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version Number

+

The version number for the symbol table node. This document describes version 1. (There is no version ‘0’ of the symbol table node.)

+

Number of Symbols

+

Although all symbol table nodes have the same length, + most contain fewer than the maximum possible number of + link entries. This field indicates how many entries + contain valid data. The valid entries are packed at the + beginning of the symbol table node while the remaining + entries contain undefined values. +

+

Group Entries

+

Each link has an entry in the symbol table node. + The format of the entry is described below. + There are 2K entries in each group node, where + K is the “Group Leaf Node K” value from the + superblock. +

+
+
+ +
+
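Because every symbol table node reserves room for 2K entries, its on-disk size follows from the “Group Leaf Node K” value and the entry size. The sketch below assumes an 8-byte node header (4-byte signature, 1-byte version, 1-byte reserved, 2-byte symbol count) and the Symbol Table Entry layout from section III.C (two offsets, a 4-byte cache type, 4 reserved bytes, and 16 bytes of scratch-pad space). The function names are hypothetical.

```python
def symbol_table_entry_size(size_of_offsets):
    """Two offset-sized fields + cache type (4) + reserved (4) + scratch-pad (16)."""
    return 2 * size_of_offsets + 4 + 4 + 16


def symbol_table_node_size(group_leaf_k, size_of_offsets):
    """Assumed 8-byte header followed by room for 2K entries."""
    header = 4 + 1 + 1 + 2
    return header + 2 * group_leaf_k * symbol_table_entry_size(size_of_offsets)


# Hypothetical file with K = 4 and 8-byte offsets.
node_size = symbol_table_node_size(4, 8)
```

Under these assumptions each entry takes 40 bytes and the whole node takes 328 bytes, regardless of how many entries currently hold valid data.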

+III.C. Disk Format: Level 1C - Symbol Table Entry

+ +

Each symbol table entry in a symbol table node is designed + to allow for very fast browsing of stored objects. + Toward that design goal, the symbol table entries + include space for caching certain constant metadata from the + object header.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Symbol Table Entry +
byte | byte | byte | byte

Link Name Offset (O)


Object Header Address (O)

Cache Type
Reserved (zero)


Scratch-pad Space (16 bytes)


+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Link Name Offset

+

This is the byte offset into the group’s local + heap for the name of the link. The name is null + terminated. +

+

Object Header Address

+

Every object has an object header which serves as a + permanent location for the object’s metadata. In addition + to appearing in the object header, some of the object’s metadata + can be cached in the scratch-pad space. +

+

Cache Type

+

The cache type is determined from the object header. It also determines the format for the scratch-pad space:
Type | Description
0 | No data is cached by the group entry. This is guaranteed to be the case when an object header has a link count greater than one.
1 | Group object header metadata is cached in the scratch-pad space. This implies that the symbol table entry refers to another group.
2 | The entry is a symbolic link. The first four bytes of the scratch-pad space are the offset into the local heap for the link value. The object header address will be undefined.

+ +

Reserved

+

These four bytes are present so that the scratch-pad + space is aligned on an eight-byte boundary. They are + always set to zero. +

+

Scratch-pad Space

+

This space is used for different purposes, depending + on the value of the Cache Type field. Any metadata + about an object represented in the scratch-pad + space is duplicated in the object header for that + object. +

+

+ Furthermore, no data is cached in the group + entry scratch-pad space if the object header for + the object has a link count greater than one. +

+
+
+ +
+

Format of the Scratch-pad Space

+ +

The symbol table entry scratch-pad space is formatted + according to the value in the Cache Type field.

+ +

If the Cache Type field contains the value zero + (0) then no information is + stored in the scratch-pad space.

+ +

If the Cache Type field contains the value one + (1), then the scratch-pad space + contains cached metadata for another object header + in the following format:

+ +
+ + + + + + + + + + + + + + + + + +
+ Object Header Scratch-pad Format +
byte | byte | byte | byte

Address of B-tree (O)


Address of Name Heap (O)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + +
Field Name | Description

Address of B-tree

+

This is the file address for the root of the + group’s B-tree. +

+

Address of Name Heap

+

This is the file address for the group’s local + heap, in which are stored the group’s symbol names. +

+
+
+ + +
+

If the Cache Type field contains the value two + (2), then the scratch-pad space + contains cached metadata for a symbolic link + in the following format:

+ +
+ + + + + + + + + + + + + +
+ Symbolic Link Scratch-pad Format +
byte | byte | byte | byte
Offset to Link Value
+
+ +
+
+ + + + + + + + + + +
Field Name | Description

Offset to Link Value

+

The value of a symbolic link (that is, the name of the + thing to which it points) is stored in the local heap. + This field is the 4-byte offset into the local heap for + the start of the link value, which is null terminated. +

+
+
+ +
+
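Putting the entry layout and the scratch-pad formats together, a symbol table entry could be decoded roughly as follows. This is a sketch under stated assumptions: the function and key names are hypothetical, and little-endian byte order is assumed.

```python
def parse_symbol_table_entry(buf, size_of_offsets):
    """Decode one symbol table entry and interpret its scratch-pad space."""
    o = size_of_offsets
    if len(buf) < 2 * o + 24:
        raise ValueError("buffer is shorter than a symbol table entry")
    entry = {
        "link_name_offset": int.from_bytes(buf[0:o], "little"),
        "object_header_address": int.from_bytes(buf[o:2 * o], "little"),
        "cache_type": int.from_bytes(buf[2 * o:2 * o + 4], "little"),
    }
    # Four reserved bytes follow the cache type; the scratch-pad is the last 16 bytes.
    scratch = buf[2 * o + 8:2 * o + 24]
    if entry["cache_type"] == 1:
        entry["btree_address"] = int.from_bytes(scratch[0:o], "little")
        entry["name_heap_address"] = int.from_bytes(scratch[o:2 * o], "little")
    elif entry["cache_type"] == 2:
        entry["link_value_offset"] = int.from_bytes(scratch[0:4], "little")
    return entry


# Hypothetical entry for a sub-group, using 8-byte offsets.
group_entry = parse_symbol_table_entry(
    (16).to_bytes(8, "little")
    + (0x320).to_bytes(8, "little")
    + (1).to_bytes(4, "little")
    + bytes(4)
    + (0x88).to_bytes(8, "little")
    + (0x2A8).to_bytes(8, "little"),
    8,
)
```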

+III.D. Disk Format: Level 1D - Local Heaps

+ +

A local heap is a collection of small pieces of data that are particular to a single object in the HDF5 file. Objects can be inserted into and removed from the heap at any time. The address of a heap does not change once the heap is created. For example, a group stores addresses of objects in symbol table nodes with the names of links stored in the group’s local heap.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Local Heap +
byte | byte | byte | byte
Signature
Version | Reserved (zero)

Data Segment Size (L)


Offset to Head of Free-list (L)


Address of Data Segment (O)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “HEAP” + is used to indicate the + beginning of a heap. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

Each local heap has its own version number so that new + heaps can be added to old files. This document + describes version zero (0) of the local heap. +

+

Data Segment Size

+

The total amount of disk space allocated for the heap data. This may be larger than the amount of space required by the objects stored in the heap. The extra unused space in the heap holds a linked list of free blocks.

+

Offset to Head of Free-list

+

This is the offset within the heap data segment of the + first free block (or the + undefined address if there is no + free block). The free block contains “Size of Lengths” bytes that + are the offset of the next free block (or the + value ‘1’ if this is the + last free block) followed by “Size of Lengths” bytes that store + the size of this free block. The size of the free block includes + the space used to store the offset of the next free block and + the size of the current block, making the minimum size of a free + block 2 * “Size of Lengths”. +

+

Address of Data Segment

+

The data segment originally starts immediately after + the heap header, but if the data segment must grow as a + result of adding more objects, then the data segment may + be relocated, in its entirety, to another part of the + file. +

+
+
+ +

Objects within a local heap should be aligned on an 8-byte boundary.

+ +
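The free-list description for the local heap amounts to a singly linked list threaded through the data segment. The following sketch walks that list. The function name is hypothetical, little-endian byte order is assumed, and the undefined address is assumed to be the all-ones value of “Size of Lengths” bytes.

```python
def walk_free_list(data_segment, head_offset, size_of_lengths):
    """Return (offset, size) for each free block in a local heap data segment."""
    n = size_of_lengths
    undefined = (1 << (8 * n)) - 1  # assumed encoding of the undefined address
    blocks = []
    if head_offset == undefined:
        return blocks  # no free block in this heap
    offset = head_offset
    seen = set()
    while True:
        if offset in seen or offset + 2 * n > len(data_segment):
            raise ValueError("corrupt free list")
        seen.add(offset)
        next_offset = int.from_bytes(data_segment[offset:offset + n], "little")
        size = int.from_bytes(data_segment[offset + n:offset + 2 * n], "little")
        blocks.append((offset, size))
        if next_offset == 1:  # the value 1 marks the last free block
            return blocks
        offset = next_offset


# Hypothetical 64-byte data segment with free blocks at offsets 16 and 48.
segment = bytearray(64)
segment[16:24] = (48).to_bytes(8, "little")
segment[24:32] = (16).to_bytes(8, "little")
segment[48:56] = (1).to_bytes(8, "little")
segment[56:64] = (16).to_bytes(8, "little")
free_blocks = walk_free_list(bytes(segment), 16, 8)
```

The cycle check is a defensive addition for damaged files and is not part of the format itself.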
+

+III.E. Disk Format: Level 1E - Global Heap

+ +

Each HDF5 file has a global heap which stores various types of + information which is typically shared between datasets. The + global heap was designed to satisfy these goals:

+ +
    +
  A. Repeated access to a heap object must be efficient without resulting in repeated file I/O requests. Since global heap objects will typically be shared among several datasets, it is probable that the object will be accessed repeatedly.
  B. Collections of related global heap objects should result in fewer and larger I/O requests. For instance, a dataset of object references will have a global heap object for each reference. Reading the entire set of object references should result in a few large I/O requests instead of one small I/O request for each reference.
  C. It should be possible to remove objects from the global heap, and the resulting file hole should be eligible to be reclaimed for other uses.
+ + +

The implementation of the heap makes use of the memory management + already available at the file level and combines that with a new + object called a collection to achieve goal B. The global heap + is the set of all collections. Each global heap object belongs to + exactly one collection and each collection contains one or more global + heap objects. For the purposes of disk I/O and caching, a collection is + treated as an atomic object, addressing goal A. +

+ +

When a global heap object is deleted from a collection (which occurs + when its reference count falls to zero), objects located after the + deleted object in the collection are packed down toward the beginning + of the collection and the collection’s global heap object 0 is created + (if possible) or its size is increased to account for the recently + freed space. There are no gaps between objects in each collection, + with the possible exception of the final space in the collection, if + it is not large enough to hold the header for the collection’s global + heap object 0. These features address goal C. +

+ +

The HDF5 Library creates global heap collections as needed, so there may + be multiple collections throughout the file. The set of all of them is + abstractly called the “global heap”, although they do not actually link + to each other, and there is no global place in the file where you can + discover all of the collections. The collections are found simply by + finding a reference to one through another object in the file. For + example, data of variable-length datatype elements is stored in the + global heap and is accessed via a global heap ID. The format for + global heap IDs is described at the end of this section. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ A Global Heap Collection +
byte | byte | byte | byte
Signature
Version | Reserved (zero)

Collection Size (L)


Global Heap Object 1


Global Heap Object 2


...


Global Heap Object N


Global Heap Object 0 (free space)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “GCOL” + is used to indicate the + beginning of a collection. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

Each collection has its own version number so that new + collections can be added to old files. This document + describes version one (1) of the collections (there is no + version zero (0)). +

+

Collection Size

+

This is the size in bytes of the entire collection + including this field. The default (and minimum) + collection size is 4096 bytes which is a typical file + system block size. This allows for 127 16-byte heap + objects plus their overhead (the collection header of 16 bytes + and the 16 bytes of information about each heap object). +

+

Global Heap Object 1 through N

+

The objects are stored in any order with no + intervening unused space. +

+

Global Heap Object 0

+

Global Heap Object 0 (zero), when present, represents the free + space in the collection. Free space always appears at the end of + the collection. If the free space is too small to store the header + for Object 0 (described below) then the header is implied and the + collection contains no free space. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Global Heap Object +
byte | byte | byte | byte
Heap Object Index | Reference Count
Reserved (zero)

Object Size (L)


Object Data

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Heap Object Index

+

Each object has a unique identification number within a + collection. The identification numbers are chosen so that + new objects have the smallest value possible with the + exception that the identifier 0 always refers to the + object which represents all free space within the + collection. +

+

Reference Count

+

All heap objects have a reference count field. An + object which is referenced from some other part of the + file will have a positive reference count. The reference + count for Object 0 is always zero. +

+

Reserved

+

Zero padding to align next field on an 8-byte boundary. +

+

Object Size

+

This is the size of the object data stored for the object. + The actual storage space allocated for the object data is rounded + up to a multiple of eight. +

+

Object Data

+

The object data is treated as a one-dimensional array + of bytes to be interpreted by the caller. +

+
+ +
+ +
+
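The collection and object layouts above suggest a simple sequential scan. The sketch below lists the in-use objects of one collection and stops at Object 0 (the free space). It assumes little-endian integers, a 1-byte version followed by 3 reserved bytes, and 2-byte index and reference count fields. The function name is hypothetical.

```python
def list_global_heap_objects(collection, size_of_lengths=8):
    """Return (index, reference_count, data) for each in-use object."""
    if collection[:4] != b"GCOL":
        raise ValueError("missing GCOL signature")
    n = size_of_lengths
    collection_size = int.from_bytes(collection[8:8 + n], "little")
    object_header = 2 + 2 + 4 + n  # index, reference count, reserved, size
    pos = 8 + n  # first object follows the collection header
    objects = []
    while pos + object_header <= collection_size:
        index = int.from_bytes(collection[pos:pos + 2], "little")
        if index == 0:
            break  # Object 0 represents the trailing free space
        ref_count = int.from_bytes(collection[pos + 2:pos + 4], "little")
        size = int.from_bytes(collection[pos + 8:pos + 8 + n], "little")
        start = pos + object_header
        objects.append((index, ref_count, bytes(collection[start:start + size])))
        # Storage for the object data is rounded up to a multiple of eight.
        pos = start + (size + 7) // 8 * 8
    return objects


# Hypothetical minimum-size collection holding one 5-byte object.
col = bytearray(4096)
col[0:4] = b"GCOL"
col[4] = 1
col[8:16] = (4096).to_bytes(8, "little")
col[16:18] = (1).to_bytes(2, "little")
col[18:20] = (1).to_bytes(2, "little")
col[24:32] = (5).to_bytes(8, "little")
col[32:37] = b"hello"
found = list_global_heap_objects(bytes(col))
```

The same header sizes account for the capacity figure quoted for Collection Size: (4096 - 16) // (16 + 16) gives 127 sixteen-byte objects.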

+ The format for the ID used to locate an object in the global heap is + described here:

+ +
+ + + + + + + + + + + + + + + + + +
+ Global Heap ID +
byte | byte | byte | byte

Collection Address (O)

Object Index
+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Collection Address

+

This field is the address of the global heap collection + where the data object is stored. +

+

Object Index

+

This field is the index of the data object within the + global heap collection. +

+
+
+ + +
+
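A global heap ID can be split into its two fields as sketched below. The 4-byte width of the object index is an assumption taken from the single four-byte row in the layout, little-endian byte order is assumed, and the function name is hypothetical.

```python
def decode_global_heap_id(buf, size_of_offsets):
    """Return (collection_address, object_index) from a global heap ID."""
    o = size_of_offsets
    if len(buf) < o + 4:
        raise ValueError("buffer is shorter than a global heap ID")
    address = int.from_bytes(buf[0:o], "little")
    index = int.from_bytes(buf[o:o + 4], "little")  # assumed 4-byte index
    return address, index


# Hypothetical ID pointing at object 3 of the collection at address 0x1400.
heap_id = (0x1400).to_bytes(8, "little") + (3).to_bytes(4, "little")
collection_address, object_index = decode_global_heap_id(heap_id, 8)
```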

+III.F. Disk Format: Level 1F - Fractal Heap

+ +

+ Each fractal heap consists of a header and zero or more direct and + indirect blocks (described below). The header contains general + information as well as + initialization parameters for the doubling table. The Root + Block Address in the header points to the first direct or + indirect block in the heap. +

+ +

+ Fractal heaps are based on a data structure called a doubling + table. A doubling table provides a mechanism for quickly + extending an array-like data structure that minimizes the number of + empty blocks in the heap, while retaining very fast lookup of any + element within the array. More information on fractal heaps and + doubling tables can be found in the RFC + “Private + Heaps in HDF5.” +

+ +

The fractal heap implements the doubling table structure with indirect and direct blocks. Indirect blocks in the heap do not actually contain data for objects in the heap; their “size” is abstract, because they represent the indexing structure for locating the direct blocks in the doubling table. Direct blocks contain the actual data for objects stored in the heap.

+ +

+ All indirect blocks have a constant number of block entries in each + row, called the width of the doubling table (stored in + the heap header). + + The number + of rows for each indirect block in the heap is determined by the + size of the block that the indirect block represents in the + doubling table (calculation of this is shown below) and is + constant, except for the “root” + indirect block, which expands and shrinks its number of rows as + needed. +

+ +

+ Blocks in the first two rows of an indirect block + are Starting Block Size number of bytes in size, + and the blocks in each subsequent row are twice the size of + the blocks in the previous row. In other words, blocks in + the third row are twice the Starting Block Size, + blocks in the fourth row are four times the + Starting Block Size, and so on. Entries for + blocks up to the Maximum Direct Block Size point to + direct blocks, and entries for blocks greater than that size + point to further indirect blocks (which have their own + entries for direct and indirect blocks). +
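The row-size rule for the doubling table can be expressed in a few lines. This is a sketch with hypothetical function names, where rows are numbered from 0.

```python
def row_block_size(row, starting_block_size):
    """Size of each block in a given row of the doubling table."""
    if row < 2:
        return starting_block_size  # the first two rows share the starting size
    return starting_block_size << (row - 1)  # each later row doubles the previous one


def entry_is_direct(row, starting_block_size, max_direct_block_size):
    """Entries for blocks up to the maximum direct block size point to direct blocks."""
    return row_block_size(row, starting_block_size) <= max_direct_block_size


# Hypothetical heap: 512-byte starting blocks, 64 KiB maximum direct block size.
sizes = [row_block_size(r, 512) for r in range(5)]
```

For this example heap the first five rows hold blocks of 512, 512, 1024, 2048, and 4096 bytes, and row 8 is the last row whose entries point to direct blocks.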

+ +

+ The number of rows of blocks, nrows, in an + indirect block of size iblock_size is given by the + following expression: +

nrows = (log2(iblock_size) - log2(<Starting Block Size> * <Width>)) + 1

+ +

+ The maximum number of rows of direct blocks, max_dblock_rows, + in any indirect block of a fractal heap is given by the + following expression: +

max_dblock_rows = (log2(<Max. Direct Block Size>) - log2(<Starting Block Size>)) + 2

+ +

+ Using the computed values for nrows and + max_dblock_rows, along with the Width of the + doubling table, the number of direct and indirect block entries + (K and N in the indirect block description, below) + in an indirect block can be computed: +

K = MIN(nrows, max_dblock_rows) * Width

+ If nrows is less than or equal to max_dblock_rows, + N is 0. Otherwise, N is simply computed: +

N = (nrows - max_dblock_rows) * Width

+ +

The size of indirect blocks on disk is determined by the number of rows in the indirect block (computed above). The size of direct blocks on disk is exactly the size of the block in the doubling table.
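The row and entry counts for an indirect block can be worked through in code. In this sketch the function names are hypothetical and all sizes are assumed to be powers of two. N is computed as the number of entries in the rows beyond max_dblock_rows, which is this example's reading of the definition of N as the count of indirect block entries.

```python
def log2_exact(value):
    """Base-2 logarithm of a power of two."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a power of two")
    return value.bit_length() - 1


def indirect_block_rows(iblock_size, starting_block_size, width):
    """nrows for an indirect block representing iblock_size bytes."""
    return log2_exact(iblock_size) - log2_exact(starting_block_size * width) + 1


def max_direct_block_rows(max_direct_block_size, starting_block_size):
    """max_dblock_rows for the heap."""
    return log2_exact(max_direct_block_size) - log2_exact(starting_block_size) + 2


def entry_counts(iblock_size, starting_block_size, max_direct_block_size, width):
    """Return (K, N): direct and indirect block entry counts."""
    nrows = indirect_block_rows(iblock_size, starting_block_size, width)
    max_rows = max_direct_block_rows(max_direct_block_size, starting_block_size)
    k = min(nrows, max_rows) * width
    # Assumed reading: rows past max_rows hold only indirect block entries.
    n = 0 if nrows <= max_rows else (nrows - max_rows) * width
    return k, n


# Hypothetical heap: width 4, 512-byte starting blocks, 64 KiB maximum direct blocks.
k, n = entry_counts(1 << 22, 512, 65536, 4)
```

For this example a 4 MiB indirect block has 12 rows, of which 9 are direct block rows, giving K = 36 and N = 12.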

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fractal Heap Header +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely
Heap ID Length | I/O Filters’ Encoded Length
Flags | This space inserted only to align table nicely
Maximum Size of Managed Objects

Next Huge Object ID (L)


v2 B-tree Address of Huge Objects (O)


Amount of Free Space in Managed Blocks (L)


Address of Managed Block Free Space Manager (O)


Amount of Managed Space in Heap (L)


Amount of Allocated Managed Space in Heap (L)


Offset of Direct Block Allocation Iterator in Managed Space (L)


Number of Managed Objects in Heap (L)


Size of Huge Objects in Heap (L)


Number of Huge Objects in Heap (L)


Size of Tiny Objects in Heap (L)


Number of Tiny Objects in Heap (L)

Table Width | This space inserted only to align table nicely

Starting Block Size (L)


Maximum Direct Block Size (L)

Maximum Heap Size | Starting # of Rows in Root Indirect Block

Address of Root Block (O)

Current # of Rows in Root Indirect Block | This space inserted only to align table nicely

Size of Filtered Root Direct Block (optional) (L)

I/O Filter Mask (optional)
I/O Filter Information (optional, variable size)
Checksum
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “FRHP” + is used to indicate the + beginning of a fractal heap header. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

This document describes version 0.

+

Heap ID Length

+

This is the length in bytes of heap object IDs for this heap.

+

I/O Filters’ Encoded Length

+

This is the size in bytes of the encoded I/O Filter Information. +

+

Flags

+

This field is the heap status flag and is a bit field + indicating additional information about the fractal heap. + + + + + + + + + + + + + + + + + + +
Bit(s) | Description
0 | If set, the ID value to use for huge objects has wrapped + around. If the value for the Next Huge Object ID + has wrapped around, each new huge object inserted into the + heap will require a search for an ID value. +
1 | If set, the direct blocks in the heap are checksummed. +
2-7 | Reserved

+ +

Maximum Size of Managed Objects

+

This is the maximum size of managed objects allowed in the heap. + Objects greater than this size are ‘huge’ objects and will be + stored in the file directly, rather than in a direct block for + the heap. +

+

Next Huge Object ID

+

This is the next ID value to use for a huge object in the heap. +

+

v2 B-tree Address of Huge Objects

+

This is the address of the v2 B-tree + used to track huge objects in the heap. The type of records + stored in the v2 B-tree will + be determined by whether the address & length of a huge object + can fit into a heap ID (if yes, it is a “directly” accessed + huge object) and whether there is a filter used on objects + in the heap. +

+

Amount of Free Space in Managed Blocks

+

This is the total amount of free space in managed direct blocks + (in bytes). +

+

Address of Managed Block Free Space Manager

+

This is the address of the + Free-space Manager for + managed blocks. +

+

Amount of Managed Space in Heap

+

This is the total amount of managed space in the heap (in bytes), + essentially the upper bound of the heap’s linear address space. +

+

Amount of Allocated Managed Space in Heap

+

This is the total amount of managed space (in bytes) actually + allocated in + the heap. This can be less than the Amount of Managed Space + in Heap field, if some direct blocks in the heap’s linear + address space are not allocated. +

+

Offset of Direct Block Allocation Iterator in Managed Space

+

This is the linear heap offset (in bytes) at which the next direct + block should be allocated. This may be less than + the Amount of Managed Space in Heap value because the + heap’s address space is increased by a “row” of direct blocks + at a time, rather than by single direct block increments. +

+

Number of Managed Objects in Heap

+

This is the number of managed objects in the heap. +

+

Size of Huge Objects in Heap

+

This is the total size of huge objects in the heap (in bytes). +

+

Number of Huge Objects in Heap

+

This is the number of huge objects in the heap. +

+

Size of Tiny Objects in Heap

+

This is the total size of tiny objects that are packed in heap + IDs (in bytes). +

+

Number of Tiny Objects in Heap

+

This is the number of tiny objects that are packed in heap IDs. +

+

Table Width

+

This is the number of columns in the doubling table for managed + blocks. This value must be a power of two. +

+

Starting Block Size

+

This is the starting block size to use in the doubling table for + managed blocks (in bytes). This value must be a power of two. +

+

Maximum Direct Block Size

+

This is the maximum size allowed for a managed direct block. + Objects inserted into the heap that are larger than this value + (less the # of bytes of direct block prefix/suffix) + are stored as ‘huge’ objects. This value must be a power of + two. +

+

Maximum Heap Size

+

This is the maximum size of the heap’s linear address space for + managed objects (in bytes). The value stored is the log2 of + the actual value, that is: the # of bits of the address space. + ‘Huge’ and ‘tiny’ objects are not counted in this value, since + they do not store objects in the linear address space of the + heap. +

+

Starting # of Rows in Root Indirect Block

+

This is the starting number of rows for the root indirect block. + A value of 0 indicates that the root indirect block will have + the maximum number of rows needed to address the heap’s Maximum + Heap Size. +

+

Address of Root Block

+

This is the address of the root block for the heap. It can + be the undefined address if + there is no data in the heap. It either points to a direct + block (if the Current # of Rows in the Root Indirect Block + value is 0), or an indirect block. +

+

Current # of Rows in Root Indirect Block

+

This is the current number of rows in the root indirect block. + A value of 0 indicates that Address of Root Block + points to a direct block instead of an indirect block. +

+

Size of Filtered Root Direct Block

+

This is the size of the root direct block, if filters are + applied to heap objects (in bytes). This field is only + stored in the header if the I/O Filters’ Encoded Length + is greater than 0. +

+

I/O Filter Mask

+

This is the filter mask for the root direct block, if filters + are applied to heap objects. This mask has the same format as + that used for the filter mask in chunked raw data records in a + v1 B-tree. + This field is only + stored in the header if the I/O Filters’ Encoded Length + is greater than 0. +

+

I/O Filter Information

+

This is the I/O filter information encoding direct blocks and + huge objects, if filters are applied to heap objects. This + field is encoded as a Filter Pipeline + message. + The size of this field is determined by I/O Filters’ + Encoded Length. +

+

Checksum

+

This is the checksum for the header.

+
+
+ +
+
+
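As a small illustration of the header fields described above, the following Python sketch decodes the Flags byte and lists which optional header fields are present. The function names and dictionary keys are hypothetical; only the bit assignments and the "encoded length greater than 0" rule come from the field descriptions.

```python
def decode_heap_flags(flags):
    """Split the fractal heap header Flags byte into its documented bits."""
    return {
        # Bit 0: the Next Huge Object ID value has wrapped around.
        "huge_id_wrapped": bool(flags & 0x01),
        # Bit 1: direct blocks in the heap carry a checksum.
        "direct_blocks_checksummed": bool(flags & 0x02),
        # Bits 2-7 are reserved.
        "reserved_bits": (flags >> 2) & 0x3F,
    }


def optional_header_fields(io_filters_encoded_length):
    """Names of the optional header fields stored for this heap."""
    if io_filters_encoded_length > 0:
        return [
            "Size of Filtered Root Direct Block",
            "I/O Filter Mask",
            "I/O Filter Information",
        ]
    return []
```

A reader that sees bit 1 set should expect a Checksum field in every direct block of the heap.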
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fractal Heap Direct Block +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely
Heap Header Address [O]
Block Offset (variable size)
Checksum (optional)
Object Data (variable size)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “FHDB” + is used to indicate the + beginning of a fractal heap direct block. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

This document describes version 0.

+

Heap Header Address

+

This is the address for the fractal heap header that this + block belongs to. This field is principally used for file + integrity checking. +

+

Block Offset

+

This is the offset of the block within the fractal heap’s + address space (in bytes). The number of bytes used to encode + this field is the Maximum Heap Size (in the heap’s + header) divided by 8 and rounded up to the next highest integer, + for values that are not a multiple of 8. This value is + principally used for file integrity checking. +

+

Checksum

+

This is the checksum for the direct block.

+

This field is only present if bit 1 of Flags in the + heap’s header is set.

+

Object Data

+

This section of the direct block stores the actual data for + objects in the heap. The size of this section is determined by + the direct block’s size minus the size of the other fields + stored in the direct block (for example, the Signature, + Version, and others including the Checksum if it is + present). +

+
+
+ +
+
+
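The Block Offset width and the resulting Object Data size of a direct block can be worked out as in the Python sketch below. It assumes a 4-byte Signature (the four ASCII characters), a 1-byte Version, and a 4-byte Checksum; those widths and the function names are assumptions made for illustration.

```python
def block_offset_size(max_heap_size_bits):
    """Bytes used for Block Offset: Maximum Heap Size / 8, rounded up."""
    return (max_heap_size_bits + 7) // 8


def direct_block_data_size(block_size, size_of_offsets,
                           max_heap_size_bits, checksummed):
    """Bytes left for Object Data after the other direct block fields."""
    overhead = 4                    # Signature (assumed 4 bytes)
    overhead += 1                   # Version (assumed 1 byte)
    overhead += size_of_offsets     # Heap Header Address
    overhead += block_offset_size(max_heap_size_bits)
    if checksummed:                 # Present only if bit 1 of Flags is set
        overhead += 4               # Checksum (assumed 4 bytes)
    return block_size - overhead
```

With 8-byte offsets and a Maximum Heap Size of 32 bits, a checksummed 512-byte direct block under these assumptions leaves 491 bytes for object data.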
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fractal Heap Indirect Block +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely
Heap Header Address [O]
Block Offset (variable size)
Child Direct Block #0 Address [O]
Size of Filtered Direct Block #0 (optional) [L]
Filter Mask for Direct Block #0 (optional)
Child Direct Block #1 Address [O]
Size of Filtered Direct Block #1 (optional) [L]
Filter Mask for Direct Block #1 (optional)
...
Child Direct Block #K-1 Address [O]
Size of Filtered Direct Block #K-1 (optional) [L]
Filter Mask for Direct Block #K-1 (optional)
Child Indirect Block #0 Address [O]
Child Indirect Block #1 Address [O]
...
Child Indirect Block #N-1 Address [O]
Checksum
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “FHIB” is used to + indicate the beginning of a fractal heap indirect block. This + gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Heap Header Address

+

This is the address for the fractal heap header that this + block belongs to. This field is principally used for file + integrity checking. +

+

Block Offset

+

This is the offset of the block within the fractal heap’s + address space (in bytes). The number of bytes used to encode + this field is the Maximum Heap Size (in the heap’s + header) divided by 8 and rounded up to the next highest integer, + for values that are not a multiple of 8. This value is + principally used for file integrity checking. +

+

Child Direct Block #K Address

+

This field is the address of the child direct block. + The size of the [uncompressed] direct block can be computed by + its offset in the heap’s linear address space. +

+

Size of Filtered Direct Block #K

+

This is the size of the child direct block after passing through + the I/O filters defined for this heap (in bytes). If no I/O + filters are present for this heap, this field is not present. +

+

Filter Mask for Direct Block #K

+

This is the I/O filter mask for the filtered direct block. + This mask has the same format as that used for the filter mask + in chunked raw data records in a v1 B-tree. + If no I/O filters are present for this heap, this field is not + present. +

+

Child Indirect Block #N Address

+

This field is the address of the child indirect block. + The size of the indirect block can be computed by + its offset in the heap’s linear address space. +

+

Checksum

+

This is the checksum for the indirect block.

+
+ +
+ +
+

An object in the fractal heap is identified by means of a fractal heap ID, + which encodes information to locate the object in the heap. + Currently, the fractal heap stores an object in one of three ways, + depending on the object’s size:

+ +
+ + + + + + + + + + + + + + + + + + + + +
Type | Description
Tiny +

When an object is small enough to be encoded in the heap ID, the + object’s data is embedded in the fractal heap ID itself. There are + 2 sub-types for this type of object: normal and extended. The + sub-type for tiny heap IDs depends on whether the heap ID is large + enough to store objects greater than 16 bytes or not. If the + heap ID length is 18 bytes or smaller, the ‘normal’ tiny heap ID + form is used. If the heap ID length is greater than 18 bytes, + the ‘extended’ form is used. See format description below + for both sub-types. +

+
Huge +

When the size of an object is larger than Maximum Size of + Managed Objects in the Fractal Heap Header, the + object’s data is stored on its own in the file and the object + is tracked/indexed via a version 2 B-tree. All huge objects + for a particular fractal heap use the same v2 B-tree. All huge + objects for a particular fractal heap use the same format for + their huge object IDs. +

+ +

Depending on whether the IDs for a heap are large enough to hold + the object’s retrieval information and whether I/O pipeline filters + are applied to the heap’s objects, 4 sub-types are derived for + huge object IDs for this heap:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Sub-type | Description
Directly accessed, non-filtered +

The object’s address and length are embedded in the + fractal heap ID itself and the object is directly accessed + from them. This allows the object to be accessed without + resorting to the B-tree. +

+
Directly accessed, filtered +

The filtered object’s address, length, filter mask and + de-filtered size are embedded in the fractal heap ID itself + and the object is accessed directly with them. This allows + the object to be accessed without resorting to the B-tree. +

+
Indirectly accessed, non-filtered +

The object is located by using a B-tree key embedded in + the fractal heap ID to retrieve the address and length from + the version 2 B-tree for huge objects. Then, the address + and length are used to access the object. +

+
Indirectly accessed, filtered +

The object is located by using a B-tree key embedded in + the fractal heap ID to retrieve the filtered object’s + address, length, filter mask and de-filtered size from the + version 2 B-tree for huge objects. Then, this information + is used to access the object. +

+
+
+ +
Managed +

When the size of an object does not meet the above two + conditions, the object is stored and managed via the direct and + indirect blocks based on the doubling table. +

+
+
+ + +

The specific format for each type of heap ID is described below: +

+ +
+ + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Tiny Objects (sub-type 1 - ‘Normal’) +
byte | byte | byte | byte
Version, Type & Length | This space inserted only to align table nicely
Data (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Version, Type & Length

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Tiny objects have a value of 2. +
0-3 | The length of the tiny object. The value stored + is one less than the actual length (since zero-length + objects are not allowed to be stored in the heap). + For example, an object of actual length 1 has an + encoded length of 0, an object of actual length 2 + has an encoded length of 1, and so on. +

+ +

Data

+

This is the data for the object. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Tiny Objects (sub-type 2 - ‘Extended’) +
byte | byte | byte | byte
Version, Type & Length | Extended Length | This space inserted only to align table nicely
Data (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Version, Type & Length

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Tiny objects have a value of 2. +
0-3 | These 4 bits, together with the next byte, form an + unsigned 12-bit integer for holding the length of the + object. These 4 bits are bits 8-11 of the 12-bit integer. + See description for the Extended Length field below. +

+ +

Extended Length

+

This byte, together with the 4 bits in the previous byte, + forms an unsigned 12-bit integer for holding the length of + the tiny object. These 8 bits are bits 0-7 of the 12-bit + integer formed. The value stored is one less than the actual + length (since zero-length objects are not allowed to be + stored in the heap). For example, an object of actual length + 1 has an encoded length of 0, an object of actual length + 2 has an encoded length of 1, and so on. +

+

Data

+

This is the data for the object. +

+
+
+ + +
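The two tiny-object layouts above can be decoded as in the following Python sketch. It assumes the heap ID passed in is exactly Heap ID Length bytes long, so that its length selects between the ‘normal’ and ‘extended’ forms; the function name and return shape are hypothetical.

```python
def decode_tiny_heap_id(heap_id):
    """Return (version, length, data) for a tiny-object fractal heap ID."""
    first = heap_id[0]
    version = (first >> 6) & 0x3      # bits 6-7
    id_type = (first >> 4) & 0x3      # bits 4-5
    if id_type != 2:
        raise ValueError("not a tiny-object heap ID")
    if len(heap_id) <= 18:
        # 'Normal' form: 4-bit length, stored as actual length minus one.
        length = (first & 0x0F) + 1
        data_start = 1
    else:
        # 'Extended' form: low 4 bits are bits 8-11, next byte is bits 0-7.
        length = (((first & 0x0F) << 8) | heap_id[1]) + 1
        data_start = 2
    data = bytes(heap_id[data_start:data_start + length])
    return version, length, data
```

Any bytes in the ID beyond the decoded length are treated as padding by this sketch.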
+
+
+ + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Huge Objects (sub-type 1 & 2): indirectly accessed, non-filtered/filtered +
byte | byte | byte | byte
Version & Type | This space inserted only to align table nicely
v2 B-tree Key [L] (variable size)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field Name | Description

Version & Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

v2 B-tree Key

This field is the B-tree key for retrieving the information + from the version 2 B-tree for huge objects needed to access the + object. See the description of v2 B-tree + records sub-type 1 & 2 for a description of the fields. New key + values are derived from Next Huge Object ID in the + Fractal Heap Header.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Huge Objects (sub-type 3): directly accessed, non-filtered +
byte | byte | byte | byte
Version & Type | This space inserted only to align table nicely
Address [O]
Length [L]

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Version & Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

Address

This field is the address of the object in the file.

+

Length

This field is the length of the object in the file.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Huge Objects (sub-type 4): directly accessed, filtered +
byte | byte | byte | byte
Version & Type | This space inserted only to align table nicely
Address [O]
Length [L]
Filter Mask
De-filtered Size [L]

+ + + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
 (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Version & Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

Address

This field is the address of the filtered object in the file.

+

Length

This field is the length of the filtered object in the file.

+

Filter Mask

This field is the I/O pipeline filter mask for the + filtered object in the file.

+

De-filtered Size

This field is the size of the de-filtered object in the file.

+
+
+ +
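Which of the four huge-object ID sub-types applies to a heap can be worked out from the heap ID length, as in the Python sketch below. It assumes that one byte of each ID is taken by the Version & Type field and that the Filter Mask occupies 4 bytes; the function name and the exact fit test are illustrative assumptions, not a statement of how any particular implementation decides.

```python
FILTER_MASK_SIZE = 4  # Assumed width of the Filter Mask field, in bytes.


def huge_id_subtype(heap_id_length, size_of_offsets, size_of_lengths,
                    filtered):
    """Return the huge-object ID sub-type (1-4) used by a heap."""
    payload = heap_id_length - 1  # Bytes left after Version & Type.
    if filtered:
        # Address + Length + Filter Mask + De-filtered Size.
        needed = (size_of_offsets + size_of_lengths
                  + FILTER_MASK_SIZE + size_of_lengths)
        return 4 if needed <= payload else 2
    # Address + Length.
    needed = size_of_offsets + size_of_lengths
    return 3 if needed <= payload else 1
```

Sub-types 3 and 4 are the directly accessed forms; sub-types 1 and 2 fall back to a key into the v2 B-tree.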
+
+
+ + + + + + + + + + + + + + + + + + + + + +
Fractal Heap ID for Managed Objects +
byte | byte | byte | byte
Version & Type | This space inserted only to align table nicely
Offset (variable size)
Length (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Version & Type

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Managed objects have a value of 0. +
0-3 | Reserved. +

+

Offset

This field is the offset of the object in the heap. + This field’s size is the minimum number of bytes + necessary to encode the Maximum Heap Size value + (from the Fractal Heap Header). For example, if the + value of the Maximum Heap Size is less than 256 bytes, + this field is 1 byte in length, a Maximum Heap Size + of 256-65535 bytes uses a 2 byte length, and so on.

Length

This field is the length of the object in the heap. This + field’s size is determined by taking the minimum value of Maximum + Direct Block Size and Maximum Size of Managed + Objects in the Fractal Heap Header; + the minimum number of bytes needed to encode that value is + used for the size of this field.

+
+ +
+
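The sizing rule for the Offset and Length fields, and the decoding of a managed-object ID, can be sketched in Python as follows. The helper names are hypothetical, the heap-size value is taken as the example in the Offset description treats it, and the little-endian byte order is an assumption of this sketch.

```python
def min_bytes_to_encode(value):
    """Minimum number of whole bytes needed to hold a non-negative value."""
    return max(1, (value.bit_length() + 7) // 8)


def managed_id_length_size(max_direct_block_size, max_managed_object_size):
    """Size in bytes of the Length field of a managed-object heap ID."""
    return min_bytes_to_encode(min(max_direct_block_size,
                                   max_managed_object_size))


def decode_managed_heap_id(heap_id, offset_size, length_size):
    """Return (offset, length) from a managed-object fractal heap ID."""
    first = heap_id[0]
    if (first >> 4) & 0x3 != 0:       # bits 4-5: managed objects use 0
        raise ValueError("not a managed-object heap ID")
    start = 1
    # Assumption: multi-byte integers are stored little-endian.
    offset = int.from_bytes(heap_id[start:start + offset_size], "little")
    start += offset_size
    length = int.from_bytes(heap_id[start:start + length_size], "little")
    return offset, length
```

Following the example in the Offset description, a value below 256 needs one byte and a value from 256 to 65535 needs two.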

+III.G. Disk Format: Level 1G - Free-space Manager

+ +

+ Free-space managers are used to describe space within a heap or + the entire HDF5 file that is not currently used for that heap or + file. +

+ +

+ The free-space manager header contains metadata information + about the space being tracked, along with the address of the list + of free space sections which actually describes the free + space. The header records information about free-space sections being + tracked, creation parameters for handling free-space sections of a + client, and section information used to locate the collection of + free-space sections. +

+ +

+ The free-space section list stores a collection of + free-space sections that is specific to each client of the + free-space manager. + + For example, the fractal heap is a client of the free space manager + and uses it to track unused space within the heap. There are 4 + types of section records for the fractal heap, each of which has + its own format, listed below. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Free-space Manager Header +
byte | byte | byte | byte
Signature
Version | Client ID | This space inserted only to align table nicely
Total Space Tracked [L]
Total Number of Sections [L]
Number of Serialized Sections [L]
Number of Un-Serialized Sections [L]
Number of Section Classes | This space inserted only to align table nicely
Shrink Percent | Expand Percent
Size of Address Space | This space inserted only to align table nicely
Maximum Section Size [L]
Address of Serialized Section List [O]
Size of Serialized Section List Used [L]
Allocated Size of Serialized Section List [L]
Checksum
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “FSHD” is used to + indicate the beginning of the Free-space Manager Header. + This gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version

+

This is the version number for the Free-space Manager Header + and this document describes version 0.

+

Client ID

+

This is the client ID for identifying the user of this + free-space manager: + + + + + + + + + + + + + + + + + + + +
ID | Description
0 | Fractal heap +
1 | File +
2+ | Reserved. +

+ +

Total Space Tracked

+

This is the total amount of free space being tracked, in bytes. +

+

Total Number of Sections

+

This is the total number of free-space sections being tracked. +

+

Number of Serialized Sections

+

This is the number of serialized free-space sections being + tracked. +

+

Number of Un-Serialized Sections

+

This is the number of un-serialized free-space sections being + managed. Un-serialized sections are created by the free-space + client when the list of sections is read in. +

+

Number of Section Classes

+

This is the number of section classes handled by this free space + manager for the free-space client. +

+

Shrink Percent

+

This is the percent of current size to shrink the allocated + serialized free-space section list. +

+

Expand Percent

+

This is the percent of current size to expand the allocated + serialized free-space section list. +

+

Size of Address Space

+

This is the size of the address space that free-space sections + are within. This is stored as the log2 of the + actual value (in other words, the number of bits required + to store values within that address space). +

+

Maximum Section Size

+

This is the maximum size of a section to be tracked. +

+

Address of Serialized Section List

+

This is the address where the serialized free-space section + list is stored. +

+

Size of Serialized Section List Used

+

This is the size of the serialized free-space section + list used (in bytes). This value must be less than + or equal to the allocated size of serialized section + list, below. +

+

Allocated Size of Serialized Section List

+

This is the size of serialized free-space section list + actually allocated (in bytes). +

+

Checksum

+

This is the checksum for the free-space manager header.

+
+
+ +
+

The free-space sections being managed are stored in a + free-space section list, described below. The sections + in the free-space section list are stored in the following way: + a count of the number of sections describing a particular size of + free space and the size of the free-space described (in bytes), + followed by a list of section description records; then another + section count and size, followed by the list of section + descriptions for that size; and so on.

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Free-space Section List +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely
Free-space Manager Header Address [O]
Number of Section Records in Set #0 (variable size)
Size of Free-space Section Described in Record Set #0 (variable size)
Record Set #0 Section Record #0 Offset (variable size)
Record Set #0 Section Record #0 Type | This space inserted only to align table nicely
Record Set #0 Section Record #0 Data (variable size)
...
Record Set #0 Section Record #K-1 Offset (variable size)
Record Set #0 Section Record #K-1 Type | This space inserted only to align table nicely
Record Set #0 Section Record #K-1 Data (variable size)
Number of Section Records in Set #1 (variable size)
Size of Free-space Section Described in Record Set #1 (variable size)
Record Set #1 Section Record #0 Offset (variable size)
Record Set #1 Section Record #0 Type | This space inserted only to align table nicely
Record Set #1 Section Record #0 Data (variable size)
...
Record Set #1 Section Record #K-1 Offset (variable size)
Record Set #1 Section Record #K-1 Type | This space inserted only to align table nicely
Record Set #1 Section Record #K-1 Data (variable size)
...
...
Number of Section Records in Set #N-1 (variable size)
Size of Free-space Section Described in Record Set #N-1 (variable size)
Record Set #N-1 Section Record #0 Offset (variable size)
Record Set #N-1 Section Record #0 Type | This space inserted only to align table nicely
Record Set #N-1 Section Record #0 Data (variable size)
...
Record Set #N-1 Section Record #K-1 Offset (variable size)
Record Set #N-1 Section Record #K-1 Type | This space inserted only to align table nicely
Record Set #N-1 Section Record #K-1 Data (variable size)
Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field Name | Description

Signature

+

The ASCII character string “FSSE” is used to + indicate the beginning of the Free-space Section Information. + This gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version

+

This is the version number for the Free-space Section List + and this document describes version 0.

+

Free-space Manager Header Address

+

This is the address of the Free-space Manager Header. + This field is principally used for file + integrity checking. +

+

Number of Section Records for Set #N

+

This is the number of free-space section records for set #N. + The length of this field is the minimum number of bytes needed + to store the number of serialized sections (from the + free-space manager header). +

+ +

+ The number of sets of free-space section records is + determined by the size of serialized section list in + the free-space manager header. +

+

Section Size for Record Set #N

+

This is the size (in bytes) of the free-space section described + for all the section records in set #N. +

+ +

+ The length of this field is the minimum number of bytes needed + to store the maximum section size (from the + free-space manager header). +

+

Record Set #N Section #K Offset

+

This is the offset (in bytes) of the free-space section within + the client for the free-space manager. +

+ +

+ The length of this field is the minimum number of bytes needed + to store the size of address space (from the + free-space manager header). +

+

Record Set #N Section #K Type

+

This is the type of the section record, used to decode the + record set #N section #K data information. The defined + record type for file client is: + + + + + + + + + + + + + + + +
Type | Description
0 | File’s section (a range of actual bytes in file) +
1+ | Reserved. +

+ +

The defined record types for a fractal heap client are: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Type | Description
0 | Fractal heap “single” section +
1 | Fractal heap “first row” section +
2 | Fractal heap “normal row” section +
3 | Fractal heap “indirect” section +
4+ | Reserved. +

+ +

Record Set #N Section #K Data

+

This is the section-type specific information for each record + in the record set, described below. +

+

Checksum

+

This is the checksum for the Free-space Section List. +

+
+
+ +
+
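The variable field widths and the set-by-set layout of the section list can be sketched in Python for the simplest case, a file client whose records (type 0) carry no additional data. The function names are hypothetical. The sketch also assumes little-endian integers, a 1-byte Type field, and that the buffer passed in holds only the record sets, that is, the bytes between the Free-space Manager Header Address and the Checksum.

```python
def min_bytes_to_encode(value):
    """Minimum number of whole bytes needed to hold a non-negative value."""
    return max(1, (value.bit_length() + 7) // 8)


def section_list_field_sizes(num_serialized_sections, max_section_size,
                             address_space_bits):
    """Return byte widths of the (count, section size, offset) fields."""
    count_size = min_bytes_to_encode(num_serialized_sections)
    size_size = min_bytes_to_encode(max_section_size)
    # Size of Address Space is stored as a number of bits.
    offset_size = (address_space_bits + 7) // 8
    return count_size, size_size, offset_size


def parse_file_section_sets(buf, count_size, size_size, offset_size):
    """Return [(section_size, [offsets...]), ...] for a file client."""
    sets = []
    pos = 0
    while pos < len(buf):
        count = int.from_bytes(buf[pos:pos + count_size], "little")
        pos += count_size
        size = int.from_bytes(buf[pos:pos + size_size], "little")
        pos += size_size
        offsets = []
        for _ in range(count):
            offset = int.from_bytes(buf[pos:pos + offset_size], "little")
            pos += offset_size
            record_type = buf[pos]      # Assumed 1-byte Type field.
            pos += 1
            if record_type != 0:
                raise ValueError("only file sections (type 0) are handled")
            offsets.append(offset)      # Type 0 stores no record data.
        sets.append((size, offsets))
    return sets
```

Fractal heap clients would additionally need to decode the per-type record data described below.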

+ The section-type specific data for each free-space section record is + described below: +

+ +
+ + + + + + +
+ File’s Section Data Record +
No additional record data stored
+
+ +
+
+
+ + + + + + +
+ Fractal Heap “Single” Section Data Record +
No additional record data stored
+
+ +
+
+
+ + + + + + +
+ Fractal Heap “First Row” Section Data Record +
Same format as “indirect” section data
+
+ +
+
+
+ + + + + + +
+ Fractal Heap “Normal Row” Section Data Record +
No additional record data stored
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Fractal Heap “Indirect” Section Data Record +
byte | byte | byte | byte
Fractal Heap Indirect Block Offset (variable size)
Block Start Row | Block Start Column
Number of Blocks | This space inserted only to align table nicely
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Fractal Heap Block Offset

+

The offset of the indirect block in the fractal heap’s address + space containing the empty blocks. +

+

+ The number of bytes used to encode this field is the minimum + number of bytes needed to encode values for the Maximum + Heap Size (in the fractal heap’s header). +

+

Block Start Row

+

This is the row that the empty blocks start in. +

+

Block Start Column

+

This is the column that the empty blocks start in. +

+

Number of Blocks

+

This is the number of empty blocks covered by the section. +

+
+
+ +
+

+III.H. Disk Format: Level 1H - Shared Object Header Message Table

+ +

+ The shared object header message table is used to locate + object + header messages that are shared between two or more object headers + in the file. Shared object header messages are stored and indexed + in the file in one of two ways: indexed sequentially in a + shared header message list or indexed with a v2 B-tree. + The shared messages themselves are either stored in a fractal + heap (when two or more objects share the message), or remain in an + object’s header (when only one object uses the message currently, + but the message can be shared in the future). +

+ +

+ The shared object header message table + contains a list of shared message index headers. Each index header + records information about the version of the index format, the index + storage type, flags for the message types indexed, the number of + messages in the index, the address where the index resides, + and the fractal heap address if shared messages are stored there. +

+ +

+ Each index can be either a list or a v2 B-tree and may transition + between those two forms as the number of messages in the index + varies. Each shared message record contains information used to + locate the shared message from either a fractal heap or an object + header. The types of messages that can be shared are: Dataspace, + Datatype, Fill Value, Filter Pipeline and Attribute. +

+ +

+ The shared object header message table is pointed to + from a shared message table message + in the superblock extension for a file. This message stores the + version of the table format, along with the number of index headers + in the table. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Object Header Message Table +
byte | byte | byte | byte
Signature
Version for index #0 | Index Type for index #0 | Message Type Flags for index #0
Minimum Message Size for index #0
List Cutoff for index #0 | v2 B-tree Cutoff for index #0
Number of Messages for index #0 | This space inserted only to align table nicely

Index Address^O for index #0

Fractal Heap Address^O for index #0

...
...
Version for index #N-1 | Index Type for index #N-1 | Message Type Flags for index #N-1
Minimum Message Size for index #N-1
List Cutoff for index #N-1 | v2 B-tree Cutoff for index #N-1
Number of Messages for index #N-1 | This space inserted only to align table nicely

Index Address^O for index #N-1

Fractal Heap Address^O for index #N-1

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “SMTB” is used to + indicate the beginning of the Shared Object Header Message table. + This gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version for index #N

+

This is the version number for the list of shared object header message + indexes and this document describes version 0.

+

Index Type for index #N

+

The type of index can be an unsorted list or a v2 B-tree. +

+

Message Type Flags for index #N

+

This field indicates the type of messages tracked in the index, + as follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bits | Description
0 | If set, the index tracks Dataspace Messages.
1 | If set, the index tracks Datatype Messages.
2 | If set, the index tracks Fill Value Messages.
3 | If set, the index tracks Filter Pipeline Messages.
4 | If set, the index tracks Attribute Messages.
5-15 | Reserved (zero).

+ + +

+ An index can track more than one type of message, but each type + of message can only be in one index. +

+

Minimum Message Size for index #N

+

This is the message size sharing threshold for the index. + If the encoded size of the message is less than this value, the + message is not shared. +

+

List Cutoff for index #N

+

This is the cutoff value for the indexing of messages to + switch from a list to a v2 B-tree. If the number of messages + is greater than this value, the index should be a v2 B-tree. +

+

v2 B-tree Cutoff for index #N

+

This is the cutoff value for the indexing of messages to + switch from a v2 B-tree back to a list. If the number of + messages is less than this value, the index should be a list. +

+

Number of Messages for index #N

+

The number of shared messages being tracked for the index. +

+

Index Address for index #N

+

This field is the address of the list or v2 B-tree where the + index nodes reside. +

+

Fractal Heap Address for index #N

+

This field is the address of the fractal heap if shared messages + are stored there. +

+

Checksum

+

This is the checksum for the table.

+
+
+ +
+
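The Message Type Flags, Minimum Message Size, and the two cutoff fields together determine what an index tracks and which form it takes. The sketch below restates those rules in Python; the function names and the `"list"`/`"btree"` labels are illustrative assumptions, not identifiers from the HDF5 library.

```python
from typing import List

# Bit positions from the Message Type Flags table above.
MESSAGE_TYPE_BITS = {
    0: "Dataspace",
    1: "Datatype",
    2: "Fill Value",
    3: "Filter Pipeline",
    4: "Attribute",
}


def tracked_message_types(flags: int) -> List[str]:
    """Names of the message types an index tracks, given its 16-bit flags."""
    if flags >> 5:
        raise ValueError("bits 5-15 are reserved and must be zero")
    return [name for bit, name in MESSAGE_TYPE_BITS.items() if flags & (1 << bit)]


def is_shared(encoded_size: int, minimum_message_size: int) -> bool:
    """A message smaller than the threshold is not shared."""
    return encoded_size >= minimum_message_size


def index_form(current: str, num_messages: int,
               list_cutoff: int, btree_cutoff: int) -> str:
    """Form ("list" or "btree") the index should take after a count change."""
    if current == "list" and num_messages > list_cutoff:
        return "btree"
    if current == "btree" and num_messages < btree_cutoff:
        return "list"
    return current
```

Keeping the v2 B-tree cutoff below the list cutoff leaves a band of message counts in which the index keeps its current form, so it does not flip back and forth when the count hovers around a single threshold.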

+ Shared messages are indexed either with a shared message record + list, described below, or using a v2 B-tree (using record type 7). + The number of records in the shared message record list is + determined in the index’s entry in the shared object header message + table. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Message Record List +
bytebytebytebyte
Signature
Shared Message Record #0
Shared Message Record #1
...
Shared Message Record #N-1
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “SMLI” is used to + indicate the beginning of a list of index nodes. + This gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Shared Message Record #N

+

The record for locating the shared message, either in the + fractal heap for the index, or an object header (see format for + index nodes below). +

+

Checksum

+

This is the checksum for the list. +

+
+
+ +
+

+ The record for each shared message in an index is stored in one of the + following forms: +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Message Record, for messages stored in a fractal heap +
byte | byte | byte | byte
Message Location | This space inserted only to align table nicely
Hash Value
Reference Count

Fractal Heap ID

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Message Location

+

This has a value of 0 indicating that the message is stored in + the heap. +

+

Hash Value

+

This is the hash value for the message. +

+

Reference Count

+

This is the number of times the message is used in the file. +

+

Fractal Heap ID

+

This is an 8-byte fractal heap ID for the message as stored in + the fractal heap for the index. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Message Record, for messages stored in an object header +
byte | byte | byte | byte
Message Location | This space inserted only to align table nicely
Hash Value
Reserved | Message Type | Creation Index

Object Header Address^O

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Message Location

+

This has a value of 1 indicating that the message is stored in + an object header. +

+

Hash Value

+

This is the hash value for the message. +

+

Message Type

+

This is the message type in the object header. +

+

Creation Index

+

This is the creation index of the message within the object + header. +

+

Object Header Address

+

This is the address of the object header where the message is + located. +

+
+
+ + + +
+
+
+
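The two record forms can be told apart by the Message Location value alone. The following is a rough sketch of a decoder, under several assumptions that this section does not state outright: multi-byte fields are little-endian, fields are packed with no alignment padding, and the field widths are 1 byte for Message Location, Reserved, and Message Type, 2 bytes for Creation Index, and 4 bytes for Hash Value and Reference Count. The function name is hypothetical.

```python
import struct


def parse_shared_message_record(buf: bytes, size_of_offsets: int) -> dict:
    """Decode one shared message record (field widths are assumptions)."""
    location = buf[0]
    (hash_value,) = struct.unpack_from("<I", buf, 1)
    if location == 0:
        # Message lives in the fractal heap for the index.
        (reference_count,) = struct.unpack_from("<I", buf, 5)
        heap_id = bytes(buf[9:17])
        if len(heap_id) != 8:
            raise ValueError("record is truncated")
        return {"location": "heap", "hash": hash_value,
                "reference_count": reference_count, "heap_id": heap_id}
    if location == 1:
        # Message lives in an object header; buf[5] is the reserved byte.
        message_type = buf[6]
        (creation_index,) = struct.unpack_from("<H", buf, 7)
        raw_address = buf[9:9 + size_of_offsets]
        if len(raw_address) != size_of_offsets:
            raise ValueError("record is truncated")
        return {"location": "object_header", "hash": hash_value,
                "message_type": message_type,
                "creation_index": creation_index,
                "object_header_address": int.from_bytes(raw_address, "little")}
    raise ValueError(f"unknown message location {location}")
```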

+IV. Disk Format: Level 2 - Data Objects

+ +

Data objects contain the “real” user-visible information in the file. + These objects compose the scientific data and other information which + are generally thought of as “data” by the end-user. All the + other information in the file is provided as a framework for + storing and accessing these data objects. +

+ +

A data object is composed of header and data + information. The header information contains the information + needed to interpret the data information for the object as + well as additional “metadata” or pointers to additional + “metadata” used to describe or annotate each object. +

+ +
+

+IV.A. Disk Format: Level 2A - Data Object Headers

+ +

The header information of an object is designed to encompass + all of the information about an object, except for the data itself. + This information includes the dataspace, the datatype, information + about how the data is stored on disk (in external files, compressed, + broken up in blocks, and so on), as well as other information used + by the library to speed up access to the data objects or maintain + a file’s integrity. Information stored by user applications + as attributes is also stored in the object’s header. The header + of each object is not necessarily located immediately prior to the + object’s data in the file and in fact may be located in any + position in the file. The order of the messages in an object header + is not significant.

+ +

Object headers are composed of a prefix and a set of messages. The + prefix contains the information needed to interpret the messages and + a small amount of metadata about the object, and the messages contain + the majority of the metadata about the object. +

+ +
+

+IV.A.1. Disk Format: Level 2A1 - Data Object Header Prefix

+ +
+

+IV.A.1.a. Version 1 Data Object Header Prefix

+ +

Header messages are aligned on 8-byte boundaries for version 1 + object headers. +
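As a small illustration of the 8-byte rule, the padded size of a version 1 message's data can be computed by rounding up to the next multiple of eight. The helper name below is hypothetical.

```python
def padded_message_size(data_size: int, alignment: int = 8) -> int:
    """Round data_size up to the next multiple of alignment."""
    if data_size < 0:
        raise ValueError("data_size must be non-negative")
    return (data_size + alignment - 1) // alignment * alignment
```

A message with 13 bytes of data would therefore record a size of 16, with three zero bytes of padding; a message with no data stays at size zero.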

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Version 1 Object Header +
byte | byte | byte | byte
Version | Reserved (zero) | Total Number of Header Messages
Object Reference Count
Object Header Size
Header Message Type #1 | Size of Header Message Data #1
Header Message #1 Flags | Reserved (zero)

Header Message Data #1

.
.
.
Header Message Type #n | Size of Header Message Data #n
Header Message #n Flags | Reserved (zero)

Header Message Data #n

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

This value is used to determine the format of the + information in the object header. When the format of the + object header is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. This + is version one (1) (there was no version zero (0)) of the + object header. +

+

Total Number of Header Messages

+

This value determines the total number of messages listed in + object headers for this object. This value includes the messages + in continuation messages for this object. +

+

Object Reference Count

+

This value specifies the number of “hard links” to this object + within the current file. References to the object from external + files, “soft links” in this file and object references in this + file are not tracked. +

+

Object Header Size

+

This value specifies the number of bytes of header message data + following this length field that contain object header messages + for this object header. This value does not include the size of + object header continuation blocks for this object elsewhere in the + file. +

+

Header Message #n Type

+

This value specifies the type of information included in the + following header message data. The message types for + header messages are defined in sections below. +

+

Size of Header Message #n Data

+

This value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size includes + padding bytes to make the message a multiple of eight + bytes. +

+

Header Message #n Flags

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, the message data is constant. This is used for messages like the datatype message of a dataset.
1 | If set, the message is shared and stored in another location than the object header. The Header Message Data field contains a Shared Message (described in the Data Object Header Messages section below) and the Size of Header Message Data field contains the size of that Shared Message.
2 | If set, the message should not be shared.
3 | If set, the HDF5 decoder should fail to open this object if it does not understand the message’s type and the file is open with permissions allowing write access to the file. (Normally, unknown messages can just be ignored by HDF5 decoders)
4 | If set, the HDF5 decoder should set bit 5 of this message’s flags (in other words, this bit field) if it does not understand the message’s type and the object is modified in any way. (Normally, unknown messages can just be ignored by HDF5 decoders)
5 | If set, this object was modified by software that did not understand this message. (Normally, unknown messages should just be ignored by HDF5 decoders) (Can be used to invalidate an index or a similar feature)
6 | If set, this message is shareable.
7 | If set, the HDF5 decoder should always fail to open this object if it does not understand the message’s type (whether it is open for read-only or read-write access). (Normally, unknown messages can just be ignored by HDF5 decoders)

+ +

Header Message #n Data

+

The format and length of this field is determined by the + header message type and size respectively. Some header + message types do not require any data and this information + can be eliminated by setting the length of the message to + zero. The data is padded with enough zeroes to make the + size a multiple of eight. +

+
+
+ +
+
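Bits 3, 4, 5, and 7 of the Header Message Flags tell a decoder how to treat a message whose type it does not understand. The sketch below restates those rules; the constant and function names are hypothetical and do not come from the HDF5 library.

```python
FAIL_IF_UNKNOWN_AND_WRITABLE = 0x08  # bit 3
MARK_IF_UNKNOWN = 0x10               # bit 4
WAS_UNKNOWN = 0x20                   # bit 5
FAIL_IF_UNKNOWN_ALWAYS = 0x80        # bit 7


def must_fail_on_unknown(flags: int, file_is_writable: bool) -> bool:
    """True if a decoder that does not understand the message type
    should refuse to open the object."""
    if flags & FAIL_IF_UNKNOWN_ALWAYS:
        return True
    return bool(flags & FAIL_IF_UNKNOWN_AND_WRITABLE) and file_is_writable


def flags_after_modifying_object(flags: int) -> int:
    """Flags to write back after a decoder that does not understand the
    message type has modified the object."""
    if flags & MARK_IF_UNKNOWN:
        return flags | WAS_UNKNOWN
    return flags
```

When neither bit 3 nor bit 7 is set, the unknown message is simply ignored, which is the normal case described in the table.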

+IV.A.1.b. Version 2 Data Object Header Prefix

+ +

Note that the “total number of messages” field has been dropped from + the data object header prefix in this version. The number of messages + in the data object header is just determined by the messages encountered + in all the object header blocks.

+ +

Note also that the fields and messages in this version of data object + headers have no alignment or padding bytes inserted - they are + stored packed together.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 Object Header +
byte | byte | byte | byte
Signature
Version | Flags | This space inserted only to align table nicely
Access Time (optional)
Modification Time (optional)
Change Time (optional)
Birth Time (optional)
Maximum # of compact attributes (optional) | Minimum # of dense attributes (optional)
Size of Chunk #0 (variable size) | This space inserted only to align table nicely
Header Message Type #1 | Size of Header Message Data #1 | Header Message #1 Flags
Header Message #1 Creation Order (optional) | This space inserted only to align table nicely

Header Message Data #1

.
.
.
Header Message Type #n | Size of Header Message Data #n | Header Message #n Flags
Header Message #n Creation Order (optional) | This space inserted only to align table nicely

Header Message Data #n

Gap (optional, variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “OHDR” + is used to indicate the + beginning of an object header. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

This field has a value of 2 indicating version 2 of the object header. +

+

Flags

+

This field is a bit field indicating additional information + about the object header. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bit(s) | Description
0-1 | This two-bit field determines the size of the Size of Chunk #0 field. The values are:
    Value | Description
    0 | The Size of Chunk #0 field is 1 byte.
    1 | The Size of Chunk #0 field is 2 bytes.
    2 | The Size of Chunk #0 field is 4 bytes.
    3 | The Size of Chunk #0 field is 8 bytes.
2 | If set, attribute creation order is tracked.
3 | If set, attribute creation order is indexed.
4 | If set, non-default attribute storage phase change values are stored.
5 | If set, access, modification, change and birth times are stored.
6-7 | Reserved

+ +

Access Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object’s raw data was last accessed + (in other words, read or written). +

+

This field is present if bit 5 of flags is set. +

+

Modification Time

+

This 32-bit value represents the number of seconds after + the UNIX epoch when the object’s raw data was last + modified (in other words, written). +

+

This field is present if bit 5 of flags is set. +

+

Change Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object’s metadata was last changed. +

+

This field is present if bit 5 of flags is set. +

+

Birth Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object was created. +

+

This field is present if bit 5 of flags is set. +

+

Maximum # of compact attributes

+

This is the maximum number of attributes to store in the compact + format before switching to the indexed format. +

+

This field is present if bit 4 of flags is set. +

+

Minimum # of dense attributes

+

This is the minimum number of attributes to store in the indexed + format before switching to the compact format. +

+

This field is present if bit 4 of flags is set. +

+

Size of Chunk #0

+

+ This unsigned value specifies the number of bytes of header + message data following this field that contain object header + information. +

+

+ This value does not include the size of object header + continuation blocks for this object elsewhere in the file. +

+

+ The length of this field varies depending on bits 0 and 1 of + the flags field. +

+

Header Message #n Type

+

Same format as version 1 of the object header, described above. +

+

Size of Header Message #n Data

+

This value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size of messages + in this version does not include any padding bytes. +

+

Header Message #n Flags

+

Same format as version 1 of the object header, described above. +

+

Header Message #n Creation Order

+

This field stores the order that a message of a given type + was created in. +

+

This field is present if bit 2 of flags is set. +

+

Header Message #n Data

+

Same format as version 1 of the object header, described above. +

+

Gap

+

A gap in an object header chunk is inferred by the end of the + messages for the chunk before the beginning of the chunk’s + checksum. Gaps are always smaller than the size of an + object header message prefix (message type + message size + + message flags). +

+

Gaps are formed when a message (typically an attribute message) + in an earlier chunk is deleted and a message from a later + chunk that does not quite fit into the free space is moved + into the earlier chunk. +

+

Checksum

+

This is the checksum for the object header chunk. +

+
+
+ +
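Because the version 2 prefix has optional and variable-size fields, a reader has to consult the Flags byte before it can find the first header message. The sketch below walks the prefix in order. It assumes little-endian encoding and 2-byte widths for the two attribute phase-change fields, neither of which is stated in this section; the function name is hypothetical.

```python
import struct


def parse_v2_object_header_prefix(buf: bytes) -> dict:
    """Decode the fixed and optional prefix fields of a version 2 object header."""
    if buf[:4] != b"OHDR":
        raise ValueError("missing OHDR signature")
    version, flags = buf[4], buf[5]
    if version != 2:
        raise ValueError(f"expected version 2, found {version}")
    pos = 6
    out = {"flags": flags}
    if flags & 0x20:  # bit 5: the four 32-bit timestamps are stored
        (out["access_time"], out["modification_time"],
         out["change_time"], out["birth_time"]) = struct.unpack_from("<4I", buf, pos)
        pos += 16
    if flags & 0x10:  # bit 4: attribute storage phase change values are stored
        out["max_compact"], out["min_dense"] = struct.unpack_from("<2H", buf, pos)
        pos += 4
    width = 1 << (flags & 0x03)  # bits 0-1: 1, 2, 4 or 8 bytes
    if pos + width > len(buf):
        raise ValueError("prefix is truncated")
    out["chunk0_size"] = int.from_bytes(buf[pos:pos + width], "little")
    pos += width
    out["messages_offset"] = pos  # first header message starts here
    return out
```

The checksum at the end of the chunk is not verified here; a complete reader would also validate it before trusting the decoded fields.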

The header message types and the message data associated with + them compose the critical “metadata” about each object. Some + header messages are required for each object while others are + optional. Some optional header messages may also be repeated + several times in the header itself. The requirements and number + of times allowed in the header are noted in each header + message description below. +

+ + +
+

+IV.A.2. Disk Format: Level 2A2 - Data Object Header Messages

+ +

Data object header messages are small pieces of metadata that are + stored in the data object header for each object in an HDF5 file. + Data object header messages provide the metadata required to describe + an object and its contents, as well as optional pieces of metadata + that annotate the meaning or purpose of the object. +

+ +

Data object header messages are either stored directly in the data + object header for the object or are shared between multiple objects + in the file. When a message is shared, a flag in the Message Flags + indicates that the actual Message Data + portion of that message is stored in another location (such as another + data object header, or a heap in the file) and the Message Data + field contains the information needed to locate the actual information + for the message. +

+ +

+ The format of shared message data is described here:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Message (Version 1) +
byte | byte | byte | byte
Version | Type | Reserved (zero)
Reserved (zero)

Address^O

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number is used when there are changes in the format + of a shared object message and is described here: + + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by the library before version 1.6.1.

+

Type

The type of shared message location: + + + + + + + + + + +
Value | Description
0 | Message stored in another object’s header (a committed message).

+

Address

The address of the object header + containing the message to be shared.

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Shared Message (Version 2) +
byte | byte | byte | byte
Version | Type | This space inserted only to align table nicely

Address^O

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number is used when there are changes in the format + of a shared object message and is described here: + + + + + + + + + + +
Version | Description
2 | Used by the library of version 1.6.1 and after.

+

Type

The type of shared message location: + + + + + + + + + + +
Value | Description
0 | Message stored in another object’s header (a committed message).

+

Address

The address of the object header + containing the message to be shared.

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Shared Message (Version 3) +
byte | byte | byte | byte
Version | Type | This space inserted only to align table nicely
Location (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number indicates changes in the format of shared + object message and is described here: + + + + + + + + + + +
Version | Description
3 | Used by the library of version 1.8 and after. In this version, the Type field can indicate that the message is stored in the fractal heap.

+

Type

The type of shared message location: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Message is not shared and is not shareable.
1 | Message stored in file’s shared object header message heap (a shared message).
2 | Message stored in another object’s header (a committed message).
3 | Message is not shared, but is shareable.

+

Location

This field contains either a Size of Offsets-bytes + address of the object header + containing the message to be shared, or an 8-byte fractal heap ID + for the message in the file’s shared object header message + heap. +

+
+
+ + +
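In a version 3 shared message, the Type field decides how the variable-size Location field is read. A minimal sketch follows, assuming little-endian addresses (not stated in this section); it handles only Type values 1 and 2, the two for which a Location is described, and the function name is hypothetical.

```python
def parse_shared_message_v3(buf: bytes, size_of_offsets: int) -> dict:
    """Decode a version 3 shared message into its location."""
    version, location_type = buf[0], buf[1]
    if version != 3:
        raise ValueError(f"expected version 3, found {version}")
    if location_type == 1:
        # Shared message: 8-byte fractal heap ID in the shared message heap.
        heap_id = bytes(buf[2:10])
        if len(heap_id) != 8:
            raise ValueError("message is truncated")
        return {"kind": "shared", "heap_id": heap_id}
    if location_type == 2:
        # Committed message: address of the object header holding it.
        raw = buf[2:2 + size_of_offsets]
        if len(raw) != size_of_offsets:
            raise ValueError("message is truncated")
        return {"kind": "committed", "address": int.from_bytes(raw, "little")}
    raise ValueError(f"type {location_type} carries no location in this sketch")
```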

The following is a list of currently defined header messages: +

+ +
+

IV.A.2.a. The NIL Message

+ + +
+ + + + + + + + +
Header Message Name: NIL
Header Message Type: 0x0000
Length: Varies
Status: Optional; may be repeated.
Description:The NIL message is used to indicate a message which is to be + ignored when reading the header messages for a data object. + [Possibly one which has been deleted for some reason.] +
Format of Data: Unspecified
+ + + +
+

IV.A.2.b. The Dataspace Message

+ + +
+ + + + + + + + + + +
Header Message Name: Dataspace
Header Message Type: 0x0001
Length: Varies according to the number of + dimensions, as described in the following table.
Status: Required for dataset objects; + may not be repeated.
Description:The dataspace message describes the number of dimensions (in + other words, “rank”) and size of each dimension that + the data object has. This message is only used for datasets which + have a simple, rectilinear, array-like layout; datasets requiring + a more complex layout are not yet supported. +
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Dataspace Message - Version 1 +
byte | byte | byte | byte
Version | Dimensionality | Flags | Reserved
Reserved

Dimension #1 Size^L

.
.
.

Dimension #n Size^L

Dimension #1 Maximum Size^L (optional)

.
.
.

Dimension #n Maximum Size^L (optional)

Permutation Index #1^L (optional)

.
.
.

Permutation Index #n^L (optional)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

This value is used to determine the format of the + Dataspace Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. This + document describes version one (1) (there was no version + zero (0)). +

+

Dimensionality

+

This value is the number of dimensions that the data + object has. +

+

Flags

+

This field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. Bit 1 is used to indicate that + permutation indices are present. +

+

Dimension #n Size

+

This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Dimension #n Maximum Size

+

This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + “unlimited” size which indicates + that the data may expand along this dimension indefinitely. + If these values are not stored, the maximum size of each + dimension is assumed to be the dimension’s current size. +

+

Permutation Index #n

+

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. If these values are + not stored, the first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+
+
+ + + +
+

Version 2 of the dataspace message dropped the optional + permutation index value support, as it was never implemented in the + HDF5 Library:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Dataspace Message - Version 2 +
byte | byte | byte | byte
Version | Dimensionality | Flags | Type

Dimension #1 Size^L

.
.
.

Dimension #n Size^L

Dimension #1 Maximum Size^L (optional)

.
.
.

Dimension #n Maximum Size^L (optional)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

This value is used to determine the format of the + Dataspace Message. This field should be ‘2’ for version 2 + format messages. +

+

Dimensionality

+

This value is the number of dimensions that the data object has. +

+

Flags

+

This field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. +

+

Type

+

This field indicates the type of the dataspace: + + + + + + + + + + + + + + + + + + +
Value | Description
0 | A scalar dataspace; in other words, a dataspace with a single, dimensionless element.
1 | A simple dataspace; in other words, a dataspace with a rank > 0 and an appropriate # of dimensions.
2 | A null dataspace; in other words, a dataspace with no elements.

+

Dimension #n Size

+

This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Dimension #n Maximum Size

+

This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + “unlimited” size which indicates + that the data may expand along this dimension indefinitely. + If these values are not stored, the maximum size of each + dimension is assumed to be the dimension’s current size. +

+
+
+ + + + + +
+
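The two dataspace message versions differ only in their four or eight leading bytes and in whether permutation indices may follow. The sketch below decodes both, assuming little-endian values (not stated in this section) and a caller-supplied Size of Lengths from the superblock. The function name is hypothetical.

```python
from typing import List, Tuple


def parse_dataspace_message(buf: bytes, size_of_lengths: int) -> dict:
    """Decode a version 1 or version 2 dataspace message."""
    version, rank, flags = buf[0], buf[1], buf[2]
    if version == 1:
        space_type = None  # version 1 has no Type field
        pos = 8            # version, dimensionality, flags, 5 reserved bytes
    elif version == 2:
        space_type = buf[3]
        pos = 4
    else:
        raise ValueError(f"unsupported dataspace message version {version}")

    def read_values(start: int) -> Tuple[List[int], int]:
        end = start + rank * size_of_lengths
        if end > len(buf):
            raise ValueError("message is truncated")
        values = [int.from_bytes(buf[i:i + size_of_lengths], "little")
                  for i in range(start, end, size_of_lengths)]
        return values, end

    dims, pos = read_values(pos)
    if flags & 0x01:       # bit 0: maximum dimensions are present
        max_dims, pos = read_values(pos)
    else:                  # otherwise the maximum is the current size
        max_dims = list(dims)
    permutation = None
    if version == 1 and flags & 0x02:  # bit 1: permutation indices (v1 only)
        permutation, pos = read_values(pos)
    return {"version": version, "type": space_type, "dims": dims,
            "max_dims": max_dims, "permutation": permutation}
```

Dimensions are returned in stored order, so the first entry is the slowest-changing dimension and the last is the fastest-changing one.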

IV.A.2.c. The Link Info Message

+ + +
+ + + + + + + + +
Header Message Name: Link Info
Header Message Type: 0x0002
Length: Varies
Status: Optional; may not be + repeated.
Description:The link info message tracks variable information about the + current state of the links for a “new style” + group’s behavior. Variable information will be stored in + this message and constant information will be stored in the + Group Info message. +
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Link Info +
byte | byte | byte | byte
Version | Flags | This space inserted only to align table nicely

Maximum Creation Index (8 bytes, optional)

Fractal Heap Address^O

Address of v2 B-tree for Name Index^O

Address of v2 B-tree for Creation Order Index^O (optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number for this message. This document describes + version 0.

+

Flags

This field determines various optional aspects of the link + info message: + + + + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, creation order for the links is tracked.
1 | If set, creation order for the links is indexed.
2-7 | Reserved

+ +

Maximum Creation Index

This 64-bit value is the maximum creation order index value + stored for a link in this group.

+

This field is present if bit 0 of flags is set.

+

Fractal Heap Address

+

+ This is the address of the fractal heap to store dense links. + Each link stored in the fractal heap is stored as a + Link Message. +

+

+ If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

Address of v2 B-tree for Name Index

This is the address of the version 2 B-tree to index names of links.

+

If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

Address of v2 B-tree for Creation Order Index

This is the address of the version 2 B-tree to index creation order of links.

+

If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

This field exists if bit 1 of flags is set.

+
+
+ + +
+
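As a rough illustration of the Link Info layout described above, the following Python sketch decodes a version 0 message body. It assumes multi-byte fields are little-endian and that the caller already knows the superblock's “Size of Offsets” (8 is used as a default here); the function name `parse_link_info` and the dictionary keys are hypothetical, not part of the HDF5 library.

```python
import struct

def parse_link_info(buf, size_of_offsets=8):
    """Decode the body of a version 0 Link Info message (illustrative only)."""
    version, flags = buf[0], buf[1]
    if version != 0:
        raise ValueError("only version 0 is described here")
    info = {
        "creation_order_tracked": bool(flags & 0x01),  # bit 0
        "creation_order_indexed": bool(flags & 0x02),  # bit 1
    }
    pos = 2

    def read_address():
        # Addresses are "Size of Offsets" bytes wide (assumed little-endian).
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        pos += size_of_offsets
        return value

    if flags & 0x01:
        # Maximum Creation Index: 8 bytes, present only if bit 0 is set.
        (info["max_creation_index"],) = struct.unpack_from("<Q", buf, pos)
        pos += 8
    info["fractal_heap_address"] = read_address()
    info["name_index_btree_address"] = read_address()
    if flags & 0x02:
        # Creation order B-tree address: present only if bit 1 is set.
        info["creation_order_btree_address"] = read_address()
    return info
```

The optional fields are read in the order in which they appear in the layout table, so a message with neither flag set is just the two-byte header followed by two addresses.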

IV.A.2.d. The Datatype Message

+ + +
+ + + + + + + + +
Header Message Name: Datatype
Header Message Type: 0x0003 +
Length: Variable
Status: Required for dataset or committed + datatype (formerly named datatype) objects; may not be repeated. +
Description:

The datatype message defines the datatype for each element + of a dataset or a common datatype for sharing between multiple + datasets. A datatype can describe an atomic type like a fixed- + or floating-point type or more complex types like a C struct + (compound datatype), array (array datatype) or C++ vector + (variable-length datatype).

+

Datatype messages that are part of a dataset object do not describe how elements are related to one another; the dataspace message is used for that purpose. Datatype messages that are part of a committed datatype (formerly named datatype) object describe a common datatype that can be shared by multiple datasets in the file.

+
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Datatype Message +
byte | byte | byte | byte
Class and Version | Class Bit Field, Bits 0-7 | Class Bit Field, Bits 8-15 | Class Bit Field, Bits 16-23
Size


Properties


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Class and Version

+

The version of the datatype message and the datatype’s class + information are packed together in this field. The version + number is packed in the top 4 bits of the field and the class + is contained in the bottom 4 bits. +

+

The version number information is used for changes in the + format of the datatype message and is described here: + + + + + + + + + + + + + + + + + + + + + + +
Version | Description
0 | Never used
1 | Used by early versions of the library to encode compound datatypes with explicit array fields. See the compound datatype description below for further details.
2 | Used when an array datatype needs to be encoded.
3 | Used when a VAX byte-ordered type needs to be encoded. Packs various other datatype classes more efficiently also.

+ +

The class of the datatype determines the format for the class + bit field and properties portion of the datatype message, which + are described below. The + following classes are currently defined: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Fixed-Point
1 | Floating-Point
2 | Time
3 | String
4 | Bit field
5 | Opaque
6 | Compound
7 | Reference
8 | Enumerated
9 | Variable-Length
10 | Array

+ +

Class Bit Fields

+

The information in these bit fields is specific to each datatype + class and is described below. All bits not defined for a + datatype class are set to zero. +

+

Size

+

The size of a datatype element in bytes. +

+

Properties

+

This variable-sized sequence of bytes encodes information + specific to each datatype class and is described for each class + below. If there is no property information specified for a + datatype class, the size of this field is zero bytes. +

+
+
+ + +
+
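The common part of the datatype message can be sketched in a few lines of Python. This is only an illustration under two assumptions: multi-byte fields are little-endian, and the Size field occupies four bytes as the layout table suggests. The names `parse_datatype_header` and `DATATYPE_CLASSES` are hypothetical.

```python
import struct

# Class values from the table above.
DATATYPE_CLASSES = {
    0: "Fixed-Point", 1: "Floating-Point", 2: "Time", 3: "String",
    4: "Bit field", 5: "Opaque", 6: "Compound", 7: "Reference",
    8: "Enumerated", 9: "Variable-Length", 10: "Array",
}

def parse_datatype_header(buf):
    """Split a datatype message into its common fields (illustrative only)."""
    class_and_version = buf[0]
    version = class_and_version >> 4       # top 4 bits
    class_id = class_and_version & 0x0F    # bottom 4 bits
    # Bits 0-7, 8-15 and 16-23 of the class bit field follow as three bytes.
    bit_field = int.from_bytes(buf[1:4], "little")
    (size,) = struct.unpack_from("<I", buf, 4)
    return {
        "version": version,
        "class": DATATYPE_CLASSES.get(class_id, "Unknown"),
        "bit_field": bit_field,
        "size": size,
        "properties": bytes(buf[8:]),      # class-specific, may be empty
    }
```

Keeping the 24-bit class bit field as a single integer makes the per-class bit numbers in the following tables usable directly as shift amounts.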

Class specific information for Fixed-Point Numbers (Class 0):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fixed-point Bit Field Description +
BitsMeaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.

1, 2

Padding type. Bit 1 is the lo_pad bit and bit 2 + is the hi_pad bit. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.

3

Signed. If this bit is set then the fixed-point + number is in 2’s complement form.

4-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + +
+ Fixed-Point Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field NameDescription

Bit Offset

+

The bit offset of the first significant bit of the fixed-point + value within the datatype. The bit offset specifies the number + of bits “to the right of” the value (which are set to the + lo_pad bit value). +

+

Bit Precision

+

The number of bits of precision of the fixed-point value + within the datatype. This value, combined with the datatype + element’s size and the Bit Offset field specifies the number + of bits “to the left of” the value (which are set to the + hi_pad bit value). +

+
+
+ + +
+
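The fixed-point bit field and properties can be decoded as in the following sketch. It assumes the Bit Offset and Bit Precision fields are two bytes each (half of the four-byte row) and little-endian; `decode_fixed_point` is a hypothetical helper name. The number of high padding bits is derived from the element size, bit offset, and precision as the Bit Precision description states.

```python
import struct

def decode_fixed_point(bit_field, properties, size):
    """Interpret class 0 bit field and properties (illustrative only)."""
    bit_offset, precision = struct.unpack_from("<HH", properties, 0)
    return {
        "byte_order": "big-endian" if bit_field & 0x01 else "little-endian",
        "lo_pad": (bit_field >> 1) & 1,      # bit 1
        "hi_pad": (bit_field >> 2) & 1,      # bit 2
        "signed": bool(bit_field & 0x08),    # bit 3: 2's complement
        "bit_offset": bit_offset,            # bits "to the right of" the value
        "precision": precision,
        # Bits "to the left of" the value, filled with the hi_pad bit.
        "high_pad_bits": size * 8 - bit_offset - precision,
    }
```

For example, a 4-byte element with bit offset 4 and precision 24 leaves 4 low padding bits and 4 high padding bits.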

Class specific information for Floating-Point Numbers (Class 1):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Floating-Point Bit Field Description +
BitsMeaning

0, 6

Byte Order. These two non-contiguous bits specify the + “endianness” of the bytes in the datatype element. + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bit 6 | Bit 0 | Description
0 | 0 | Byte order is little-endian
0 | 1 | Byte order is big-endian
1 | 0 | Reserved
1 | 1 | Byte order is VAX-endian

+

1, 2, 3

Padding type. Bit 1 is the low bits pad type, bit 2 + is the high bits pad type, and bit 3 is the internal bits + pad type. If a datum has unused bits at either end or between + the sign bit, exponent, or mantissa, then the value of bit + 1, 2, or 3 is copied to those locations.

4-5

Mantissa Normalization. This 2-bit bit field specifies + how the most significant bit of the mantissa is managed. + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | No normalization
1 | The most significant bit of the mantissa is always set (except for 0.0).
2 | The most significant bit of the mantissa is not stored, but is implied to be set.
3 | Reserved.

+

7

Reserved (zero).

8-15

Sign Location. This is the bit position of the sign + bit. Bits are numbered with the least significant bit zero.

16-23

Reserved (zero).

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Floating-Point Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
Exponent Location | Exponent Size | Mantissa Location | Mantissa Size
Exponent Bias
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Bit Offset

+

The bit offset of the first significant bit of the floating-point + value within the datatype. The bit offset specifies the number + of bits “to the right of” the value. +

+

Bit Precision

+

The number of bits of precision of the floating-point value + within the datatype. +

+

Exponent Location

+

The bit position of the exponent field. Bits are numbered with + the least significant bit number zero. +

+

Exponent Size

+

The size of the exponent field in bits. +

+

Mantissa Location

+

The bit position of the mantissa field. Bits are numbered with + the least significant bit number zero. +

+

Mantissa Size

+

The size of the mantissa field in bits. +

+

Exponent Bias

+

The bias of the exponent field. +

+
+
+ + +
+
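A sketch of decoding the floating-point class follows. It assumes little-endian property fields with the widths implied by the layout table (two bytes each for Bit Offset and Bit Precision, one byte each for the exponent and mantissa location and size, four bytes for Exponent Bias). The helper name `decode_floating_point` is hypothetical.

```python
import struct

BYTE_ORDERS = {(0, 0): "little-endian", (0, 1): "big-endian",
               (1, 0): "reserved", (1, 1): "VAX-endian"}

def decode_floating_point(bit_field, properties):
    """Interpret class 1 bit field and properties (illustrative only)."""
    (bit_offset, precision, exp_loc, exp_size,
     man_loc, man_size, exp_bias) = struct.unpack_from("<HHBBBBI", properties, 0)
    return {
        # Byte order is split across bits 6 and 0.
        "byte_order": BYTE_ORDERS[((bit_field >> 6) & 1, bit_field & 1)],
        "low_pad": (bit_field >> 1) & 1,
        "high_pad": (bit_field >> 2) & 1,
        "internal_pad": (bit_field >> 3) & 1,
        "mantissa_normalization": (bit_field >> 4) & 0x3,  # bits 4-5
        "sign_location": (bit_field >> 8) & 0xFF,          # bits 8-15
        "bit_offset": bit_offset,
        "precision": precision,
        "exponent_location": exp_loc,
        "exponent_size": exp_size,
        "mantissa_location": man_loc,
        "mantissa_size": man_size,
        "exponent_bias": exp_bias,
    }
```

As a usage example, a 64-bit IEEE 754 layout would have sign location 63, an 11-bit exponent at bit 52, a 52-bit mantissa at bit 0, bias 1023, and mantissa normalization value 2 (implied leading bit).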

Class specific information for Time (Class 2):

+ + +
+ + + + + + + + + + + + + + + + + +
+ Time Bit Field Description +
BitsMeaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.

1-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + +
+ Time Property Description +
ByteByte
Bit Precision
+
+ +
+
+ + + + + + + + + + + +
Field NameDescription

Bit Precision

+

The number of bits of precision of the time value. +

+
+
+ + +
+

Class specific information for Strings (Class 3):

+ + +
+ + + + + + + + + + + + + + + + + + + + + + +
+ String Bit Field Description +
BitsMeaning

0-3

Padding type. This four-bit value determines the + type of padding to use for the string. The values are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Null Terminate: A zero byte marks the end of the string and is guaranteed to be present after converting a long string to a short string. When converting a short string to a long string the value is padded with additional null characters as necessary.
1 | Null Pad: Null characters are added to the end of the value during conversions from short values to long values but conversion in the opposite direction simply truncates the value.
2 | Space Pad: Space characters are added to the end of the value during conversions from short values to long values but conversion in the opposite direction simply truncates the value. This is the Fortran representation of the string.
3-15 | Reserved

+

4-7

Character Set. The character set used to + encode the string. + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | ASCII character set encoding
1 | UTF-8 character set encoding
2-15 | Reserved

+

8-23

Reserved (zero).

+
+ +

There are no properties defined for the string class. +

+ + +
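Because the string class has no properties, everything is carried in the class bit field. A minimal sketch of decoding it, using the hypothetical helper name `decode_string_bit_field`:

```python
STRING_PADDING = {0: "null-terminate", 1: "null-pad", 2: "space-pad"}
STRING_CHARSET = {0: "ASCII", 1: "UTF-8"}

def decode_string_bit_field(bit_field):
    """Return (padding type, character set) for a class 3 datatype."""
    padding = bit_field & 0x0F          # bits 0-3
    charset = (bit_field >> 4) & 0x0F   # bits 4-7
    return (STRING_PADDING.get(padding, "reserved"),
            STRING_CHARSET.get(charset, "reserved"))
```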

Class specific information for bit fields (Class 4):

+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Bitfield Bit Field Description +
BitsMeaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big endian.

1, 2

Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.

3-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + +
+ Bit Field Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Bit Offset

+

The bit offset of the first significant bit of the bit field + within the datatype. The bit offset specifies the number + of bits “to the right of” the value. +

+

Bit Precision

+

The number of bits of precision of the bit field + within the datatype. +

+
+
+ + +
+

Class specific information for Opaque (Class 5):

+ +
+ + + + + + + + + + + + + + + + + +
+ Opaque Bit Field Description +
BitsMeaning

0-7

Length of ASCII tag in bytes.

8-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + +
+ Opaque Property Description +
ByteByteByteByte

ASCII Tag
+
+
+ +
+
+ + + + + + + + + + +
Field NameDescription

ASCII Tag

+

This NUL-terminated string provides a description for the + opaque type. It is NUL-padded to a multiple of 8 bytes. +

+
+
+ + +
+

Class specific information for Compound (Class 6):

+ +
+ + + + + + + + + + + + + + + + + +
+ Compound Bit Field Description +
BitsMeaning

0-15

Number of Members. This field contains the number + of members defined for the compound datatype. The member + definitions are listed in the Properties field of the data + type message.

16-23

Reserved (zero).

+
+ + +

The Properties field of a compound datatype is a list of the + member definitions of the compound datatype. The member + definitions appear one after another with no intervening bytes. + The member types are described with a (recursively) encoded datatype + message.

+ +

Note that the property descriptions differ between versions of the datatype message. Additionally note that the version 0 datatype encoding is deprecated and has been replaced with later encodings in versions of the HDF5 Library from the 1.4 release onward.

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Compound Properties Description for Datatype Version 1 +
ByteByteByteByte

Name

Byte Offset of Member
Dimensionality | Reserved (zero)
Dimension Permutation
Reserved (zero)
Dimension #1 Size (required)
Dimension #2 Size (required)
Dimension #3 Size (required)
Dimension #4 Size (required)

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Name

+

This NUL-terminated string is the name of the member. It is NUL-padded to a multiple of 8 bytes.

+

Byte Offset of Member

+

This is the byte offset of the member within the datatype. +

+

Dimensionality

+

If set to zero, this field indicates a scalar member. If set to a value greater than zero, this field indicates that the member is an array of values. For array members, the size of the array is indicated by the ‘Dimension #n Size’ fields in this message.

+

Dimension Permutation

+

This field was intended to allow an array field to have + its dimensions permuted, but this was never implemented. + This field should always be set to zero. +

+

Dimension #n Size

+

This field is the size of a dimension of the array field as + stored in the file. The first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+

Member Type Message

+

This field is a datatype message describing the datatype of + the member. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Compound Properties Description for Datatype Version 2 +
ByteByteByteByte

Name

Byte Offset of Member

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Name

+

This NUL-terminated string is the name of the member. It is NUL-padded to a multiple of 8 bytes.

+

Byte Offset of Member

+

This is the byte offset of the member within the datatype. +

+

Member Type Message

+

This field is a datatype message describing the datatype of + the member. +

+
+
+ + +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Compound Properties Description for Datatype Version 3 +
ByteByteByteByte

Name

Byte Offset of Member (variable size)

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Name

This NUL-terminated string is the name of the member. It is not NUL-padded to a multiple of 8 bytes.

Byte Offset of Member

This is the byte offset of the member within the datatype. + The field size is the minimum number of bytes necessary, + based on the size of the datatype element. For example, a + datatype element size of less than 256 bytes uses a 1 byte + length, a datatype element size of 256-65535 bytes uses a + 2 byte length, and so on.

Member Type Message

This field is a datatype message describing the datatype of + the member.

+
+ + +
+
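The variable-size Byte Offset of Member field in version 3 can be illustrated with a short sketch. The rule implemented here is the one stated above (one byte for element sizes below 256, two bytes for 256-65535, and so on). The function names are hypothetical, and the member name is assumed to be ASCII with a little-endian offset following it.

```python
def member_offset_size(datatype_size):
    """Minimum number of bytes needed to hold an offset within the datatype."""
    width = 1
    while datatype_size >= 1 << (8 * width):
        width += 1
    return width

def parse_v3_member_prefix(buf, pos, datatype_size):
    """Read a member's name and byte offset; return them with the position
    at which the member's (recursively encoded) datatype message begins."""
    end = buf.index(b"\x00", pos)            # name is NUL-terminated, unpadded
    name = buf[pos:end].decode("ascii")
    pos = end + 1
    width = member_offset_size(datatype_size)
    offset = int.from_bytes(buf[pos:pos + width], "little")
    return name, offset, pos + width
```

Decoding the member type itself requires a full datatype parser, which is why the sketch stops at the position where that nested message starts.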

Class specific information for Reference (Class 7):

+ +
+ + + + + + + + + + + + + + + + + +
+ Reference Bit Field Description +
BitsMeaning

0-3

Type. This four-bit value contains the type of reference + described. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Object Reference: A reference to another object in this HDF5 file.
1 | Dataset Region Reference: A reference to a region within a dataset in this HDF5 file.
2-15 | Reserved

+ +

4-23

Reserved (zero).

+
+ +

There are no properties defined for the reference class. +

+ + +
+

Class specific information for Enumeration (Class 8):

+ +
+ + + + + + + + + + + + + + + + + +
+ Enumeration Bit Field Description +
BitsMeaning

0-15

Number of Members. The number of name/value + pairs defined for the enumeration type.

16-23

Reserved (zero).

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Enumeration Property Description for Datatype Versions 1 & 2 +
ByteByteByteByte

Base Type


Names


Values

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Base Type

+

Each enumeration type is based on some parent type, usually an + integer. The information for that parent type is described + recursively by this field. +

+

Names

+

The name for each name/value pair. Each name is stored as a null + terminated ASCII string in a multiple of eight bytes. The names + are in no particular order. +

+

Values

+

The list of values in the same order as the names. The values + are packed (no inter-value padding) and the size of each value + is determined by the parent type. +

+
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Enumeration Property Description for Datatype Version 3 +
ByteByteByteByte

Base Type


Names


Values

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Base Type

+

Each enumeration type is based on some parent type, usually an + integer. The information for that parent type is described + recursively by this field. +

+

Names

+

The name for each name/value pair. Each name is stored as a null + terminated ASCII string, not padded to a multiple of + eight bytes. The names are in no particular order. +

+

Values

+

The list of values in the same order as the names. The values + are packed (no inter-value padding) and the size of each value + is determined by the parent type. +

+
+
+ + + +
+
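For the version 3 enumeration layout, the names and values that follow the base type can be read as in this sketch. It assumes the caller has already parsed the base type, knows the member count from bits 0-15 of the class bit field, and that the parent type is a little-endian signed integer of `value_size` bytes. These are illustrative assumptions only, and `parse_enum_v3_members` is a hypothetical name.

```python
def parse_enum_v3_members(buf, pos, member_count, value_size):
    """Read unpadded NUL-terminated names, then the packed values."""
    names = []
    for _ in range(member_count):
        end = buf.index(b"\x00", pos)
        names.append(buf[pos:end].decode("ascii"))
        pos = end + 1                       # no padding in version 3
    values = []
    for _ in range(member_count):
        raw = buf[pos:pos + value_size]
        values.append(int.from_bytes(raw, "little", signed=True))
        pos += value_size                   # values are packed back to back
    return dict(zip(names, values))
```

For versions 1 and 2 the same loop would have to skip the padding that rounds each name up to a multiple of eight bytes.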

Class specific information for Variable-Length (Class 9):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Variable-Length Bit Field Description +
BitsMeaning

0-3

Type. This four-bit value contains the type of + variable-length datatype described. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Sequence: A variable-length sequence of any datatype. Variable-length sequences do not have padding or character set information.
1 | String: A variable-length sequence of characters. Variable-length strings have padding and character set information.
2-15 | Reserved

+ +

4-7

Padding type. (variable-length string only) + This four-bit value determines the type of padding + used for variable-length strings. The values are the same + as for the string padding type, as follows: + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Null terminate: A zero byte marks the end of a string and is guaranteed to be present after converting a long string to a short string. When converting a short string to a long string, the value is padded with additional null characters as necessary.
1 | Null pad: Null characters are added to the end of the value during conversion from a short string to a longer string. Conversion from a long string to a shorter string simply truncates the value.
2 | Space pad: Space characters are added to the end of the value during conversion from a short string to a longer string. Conversion from a long string to a shorter string simply truncates the value. This is the Fortran representation of the string.
3-15 | Reserved

+ +

This value is set to zero for variable-length sequences.

+ +

8-11

Character Set. (variable-length string only) + This four-bit value specifies the character set + to be used for encoding the string: + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | ASCII character set encoding
1 | UTF-8 character set encoding
2-15 | Reserved

+ +

This value is set to zero for variable-length sequences.

+ +

12-23

Reserved (zero).

+
+ +
+
+
+ + + + + + + + + + + + + + +
+ Variable-Length Property Description +
ByteByteByteByte

Base Type

+
+ +
+
+ + + + + + + + + + + +
Field NameDescription

Base Type

+

Each variable-length type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ + +
+
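The variable-length bit field can be decoded along the following lines. This is a sketch only, and `decode_vlen_bit_field` is a hypothetical helper name; the padding and character set values are returned as the raw numbers from the tables above.

```python
def decode_vlen_bit_field(bit_field):
    """Interpret the class 9 bit field (illustrative only)."""
    kind = bit_field & 0x0F                      # bits 0-3
    if kind == 0:
        # Sequences carry no padding or character set information.
        return {"type": "sequence"}
    if kind == 1:
        return {
            "type": "string",
            "padding": (bit_field >> 4) & 0x0F,  # bits 4-7
            "charset": (bit_field >> 8) & 0x0F,  # bits 8-11
        }
    raise ValueError("reserved variable-length type")
```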

Class specific information for Array (Class 10):

+ +

There are no bit fields defined for the array class. +

+ +

Note that the dimension information defined in the property for this + datatype class is independent of dataspace information for a dataset. + The dimension information here describes the dimensionality of the + information within a data element (or a component of an element, if the + array datatype is nested within another datatype) and the dataspace for a + dataset describes the size and locations of the elements in a dataset. +

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Array Property Description for Datatype Version 2 +
ByteByteByteByte
Dimensionality | Reserved (zero)
Dimension #1 Size
.
.
.
Dimension #n Size
Permutation Index #1
.
.
.
Permutation Index #n

Base Type

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Dimensionality

+

This value is the number of dimensions that the array has. +

+

Dimension #n Size

+

This value is the size of the dimension of the array + as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Permutation Index #n

+

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. Currently, dimension + permutations are not supported, and these indices should + be set to the index position minus one. In other words, + the first dimension should be set to 0, the second dimension + should be set to 1, and so on. +

+

Base Type

+

Each array type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Array Property Description for Datatype Version 3 +
ByteByteByteByte
Dimensionality | This space inserted only to align table nicely
Dimension #1 Size
.
.
.
Dimension #n Size

Base Type

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Dimensionality

+

This value is the number of dimensions that the array has. +

+

Dimension #n Size

+

This value is the size of the dimension of the array + as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Base Type

+

Each array type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ + + +
+
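The version 3 array properties can be read as in the sketch below. It assumes a one-byte Dimensionality field followed directly by little-endian four-byte dimension sizes, as the layout table suggests; `parse_array_v3_properties` is a hypothetical name. The base type that follows is another datatype message and is left to the caller.

```python
import struct
from math import prod

def parse_array_v3_properties(buf):
    """Return (dimension sizes, element count, offset of the base type)."""
    rank = buf[0]
    # Dimensions are stored slowest-changing first.
    dims = struct.unpack_from("<%dI" % rank, buf, 1)
    base_type_offset = 1 + 4 * rank
    return list(dims), prod(dims), base_type_offset
```

The element count is the number of base-type values inside one array element; it is independent of the dataspace of any dataset that uses the datatype.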

IV.A.2.e. The Data Storage - +Fill Value (Old) Message

+ + +
+ + + + + + + + +
Header Message Name: Fill Value + (old)
Header Message Type: 0x0004
Length: Varies
Status: Optional; may not be + repeated.
Description:

The fill value message stores a single data value which + is returned to the application when an uninitialized data element + is read from a dataset. The fill value is interpreted with the + same datatype as the dataset. If no fill value message is present + then a fill value of all zero bytes is assumed.

+

This fill value message is deprecated in favor of the + “new” fill value message (Message Type 0x0005) and + is only written to the file for forward compatibility with + versions of the HDF5 Library before the 1.6.0 version. + Additionally, it only appears for datasets with a user-defined + fill value (as opposed to the library default fill value or an + explicitly set “undefined” fill value).

+
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Fill Value Message (Old) +
bytebytebytebyte
Size

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Size

+

This is the size of the Fill Value field in bytes. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. +

+
+
+ + +
+

IV.A.2.f. The Data Storage - +Fill Value Message

+ + +
+ + + + + + + + +
Header Message Name: Fill + Value
Header Message Type: 0x0005
Length: Varies
Status: Required for dataset objects; + may not be repeated.
Description:The fill value message stores a single data value which is + returned to the application when an uninitialized data element + is read from a dataset. The fill value is interpreted with the + same datatype as the dataset.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Fill Value Message - Versions 1 & 2 +
byte | byte | byte | byte
Version | Space Allocation Time | Fill Value Write Time | Fill Value Defined
Size (optional)

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number information is used for changes in the + format of the fill value message and is described here: + + + + + + + + + + + + + + + + + + + + + + +
Version | Description
0 | Never used
1 | Initial version of this message.
2 | In this version, the Size and Fill Value fields are only present if the Fill Value Defined field is set to 1.
3 | This version packs the other fields in the message more efficiently than version 2.

+

+

Space Allocation Time

+

When the storage space for the dataset’s raw data will be + allocated. The allowed values are: + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Not used.
1 | Early allocation. Storage space for the entire dataset should be allocated in the file when the dataset is created.
2 | Late allocation. Storage space for the entire dataset should not be allocated until the dataset is written to.
3 | Incremental allocation. Storage space for the dataset should not be allocated until the portion of the dataset is written to. This is currently used in conjunction with chunked data storage for datasets.

+ +

Fill Value Write Time

+

At the time that storage space for the dataset’s raw data is + allocated, this value indicates whether the fill value should + be written to the raw data storage elements. The allowed values + are: + + + + + + + + + + + + + + + + + + +
Value | Description
0 | On allocation. The fill value is always written to the raw data storage when the storage space is allocated.
1 | Never. The fill value should never be written to the raw data storage.
2 | Fill value written if set by user. The fill value will be written to the raw data storage when the storage space is allocated only if the user explicitly set the fill value. If the fill value is the library default or is undefined, it will not be written to the raw data storage.

+ +

Fill Value Defined

+

This value indicates if a fill value is defined for this + dataset. If this value is 0, the fill value is undefined. + If this value is 1, a fill value is defined for this dataset. + For version 2 or later of the fill value message, this value + controls the presence of the Size and Fill Value fields. +

+

Size

+

This is the size of the Fill Value field in bytes. This field + is not present if the Version field is greater than 1, + and the Fill Value Defined field is set to 0. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. This field is + not present if the Version field is greater than 1, + and the Fill Value Defined field is set to 0. +

+
+
+ +
+
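The difference between versions 1 and 2 is only whether the Size and Fill Value fields are always present. A sketch of a parser for both, assuming a little-endian four-byte Size field and using the hypothetical name `parse_fill_value_v1_v2`:

```python
import struct

def parse_fill_value_v1_v2(buf):
    """Decode a version 1 or 2 fill value message (illustrative only)."""
    version, alloc_time, write_time, defined = buf[0], buf[1], buf[2], buf[3]
    if version not in (1, 2):
        raise ValueError("expected version 1 or 2")
    fill_value = None
    # Version 1 always stores Size; version 2 only when a fill value is defined.
    if version == 1 or defined == 1:
        (size,) = struct.unpack_from("<I", buf, 4)
        fill_value = bytes(buf[8:8 + size])
    return {
        "version": version,
        "space_allocation_time": alloc_time,
        "fill_value_write_time": write_time,
        "fill_value_defined": bool(defined),
        "fill_value": fill_value,
    }
```

The returned bytes still have to be interpreted with the dataset's datatype.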
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Fill Value Message - Version 3 +
byte | byte | byte | byte
Version | Flags | This space inserted only to align table nicely
Size (optional)

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number information is used for changes in the + format of the fill value message and is described here: + + + + + + + + + + + + + + + + + + + + + + +
Version | Description
0 | Never used
1 | Initial version of this message.
2 | In this version, the Size and Fill Value fields are only present if the Fill Value Defined field is set to 1.
3 | This version packs the other fields in the message more efficiently than version 2.

+ +

Flags

+

This field packs the settings that versions 1 and 2 store in separate bytes into a single byte of flags:
Bits | Description
0-1 | Space Allocation Time, with the same values as versions 1 and 2 of the message.
2-3 | Fill Value Write Time, with the same values as versions 1 and 2 of the message.
4 | Fill Value Undefined, indicating that the fill value has been marked as “undefined” for this dataset. Bits 4 and 5 cannot both be set.
5 | Fill Value Defined, with the same values as versions 1 and 2 of the message. Bits 4 and 5 cannot both be set.
6-7 | Reserved (zero).

+ +

Size

+

This is the size of the Fill Value field in bytes. This field + is not present if the Version field is greater than 1, + and the Fill Value Defined flag is set to 0. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. This field is + not present if the Version field is greater than 1, + and the Fill Value Defined flag is set to 0. +

+
+
+ + +
+
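The version 3 flags byte can be unpacked as in the following sketch, which again assumes a little-endian four-byte Size field immediately after the flags; `parse_fill_value_v3` is a hypothetical name.

```python
import struct

def parse_fill_value_v3(buf):
    """Decode a version 3 fill value message (illustrative only)."""
    version, flags = buf[0], buf[1]
    if version != 3:
        raise ValueError("expected version 3")
    undefined = bool(flags & 0x10)   # bit 4
    defined = bool(flags & 0x20)     # bit 5
    if undefined and defined:
        raise ValueError("bits 4 and 5 cannot both be set")
    fill_value = None
    if defined:
        # Size and Fill Value are present only when a fill value is defined.
        (size,) = struct.unpack_from("<I", buf, 2)
        fill_value = bytes(buf[6:6 + size])
    return {
        "space_allocation_time": flags & 0x03,         # bits 0-1
        "fill_value_write_time": (flags >> 2) & 0x03,  # bits 2-3
        "fill_value_undefined": undefined,
        "fill_value_defined": defined,
        "fill_value": fill_value,
    }
```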

IV.A.2.g. The Link Message

+ + +
+ + + + + + + + +
Header Message Name: Link
Header Message Type: 0x0006
Length: Varies
Status: Optional; may be + repeated.
Description:

This message encodes the information for a link in a + group’s object header, when the group is storing its links + “compactly”, or in the group’s fractal heap, + when the group is storing its links “densely”.

+

A group is storing its links compactly when the fractal heap + address in the Link Info + Message is set to the “undefined address” + value.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Link Message +
byte | byte | byte | byte
Version | Flags | Link type (optional) | This space inserted only to align table nicely

Creation Order (8 bytes, optional)

Link Name Character Set (optional) | Length of Link Name (variable size) | This space inserted only to align table nicely
Link Name (variable size)

Link Information (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes version 1.

+

Flags

This field contains information about the link and controls + the presence of other fields below. + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bits | Description
0-1 | Determines the size of the Length of Link Name field.
    Value | Description
    0 | The size of the Length of Link Name field is 1 byte.
    1 | The size of the Length of Link Name field is 2 bytes.
    2 | The size of the Length of Link Name field is 4 bytes.
    3 | The size of the Length of Link Name field is 8 bytes.
2 | Creation Order Field Present: if set, the Creation Order field is present. If not set, creation order information is not stored for links in this group.
3 | Link Type Field Present: if set, the link is not a hard link and the Link Type field is present. If not set, the link is a hard link.
4 | Link Name Character Set Field Present: if set, the link name is not represented with the ASCII character set and the Link Name Character Set field is present. If not set, the link name is represented with the ASCII character set.
5-7 | Reserved (zero).

+ +

Link type

This is the link class type and can be one of the following + values: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | A hard link (should never be stored in the file)
1 | A soft link.
2-63 | Reserved for future HDF5 internal use.
64 | An external link.
65-255 | Reserved, but available for user-defined link types.

+ +

This field is present if bit 3 of Flags is set.

+

Creation Order

This 64-bit value is an index of the link’s creation time within the group. Values start at 0 when the group is created and increment by one for each link added to the group. Removing a link from a group does not change existing links’ creation order field.

+

This field is present if bit 2 of Flags is set.

+

Link Name Character Set

This is the character set for encoding the link’s name: + + + + + + + + + + + + + + + +
Value   Description
0       ASCII character set encoding (this should never be stored in the file)
1       UTF-8 character set encoding

+ +

This field is present if bit 4 of Flags is set.

+

Length of link name

This is the length of the link’s name. The size of this field + depends on bits 0 and 1 of Flags.
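
The way the Flags byte controls the rest of the Link message can be sketched in a few lines of Python. This is only an illustration of the bit definitions given above; the function name and the dictionary keys are hypothetical and not part of the HDF5 library.

```python
def decode_link_flags(flags: int) -> dict:
    """Interpret the Flags byte of a Link message (illustrative sketch)."""
    return {
        # Bits 0-1 select 1, 2, 4 or 8 bytes for the Length of Link Name field.
        "name_length_size": 1 << (flags & 0x03),
        # Bit 2: the Creation Order field is present.
        "has_creation_order": bool(flags & 0x04),
        # Bit 3: the Link Type field is present (the link is not a hard link).
        "has_link_type": bool(flags & 0x08),
        # Bit 4: the Link Name Character Set field is present.
        "has_charset": bool(flags & 0x10),
    }


info = decode_link_flags(0x0A)  # bits 1 and 3 set
```

With `0x0A`, bits 0-1 hold the value 2, so the name length occupies 4 bytes, and the Link Type field is present.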

+

Link name

This is the name of the link. It is not NULL-terminated.

+

Link information

The format of this field depends on the link type.

+

For hard links, the field is formatted as follows: + + + + + + +
Size of Offsets bytes:The address of the object header for the object that the + link points to. +
+

+ +

+ For soft links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of soft link value.
Length of soft link value bytes:A non-NULL-terminated string storing the value of the + soft link. +
+

+ +

+ For external links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of external link value.
Length of external link value bytes:The first byte contains the version number in the + upper 4 bits and flags in the lower 4 bits for the external + link. Both version and flags are defined to be zero in + this document. The remaining bytes consist of two + NULL-terminated strings, with no padding between them. + The first string is the name of the HDF5 file containing + the object linked to and the second string is the full path + to the object linked to, within the HDF5 file’s + group hierarchy. +
+

+ +
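
As a rough illustration of the external link layout described above, the following Python sketch splits an external link value into its parts. It assumes the caller has already read the 2-byte length and passes only the value bytes; the function name is hypothetical, and the strings are returned as raw bytes because the text above does not fix their character encoding.

```python
def parse_external_link_value(value: bytes):
    """Split an external link value into version, flags, file name and path."""
    version = value[0] >> 4   # upper 4 bits of the first byte
    flags = value[0] & 0x0F   # lower 4 bits of the first byte
    # Two NULL-terminated strings follow with no padding between them.
    # Unpacking raises ValueError if either terminator is missing.
    file_name, object_path, _rest = value[1:].split(b"\x00", 2)
    return version, flags, file_name, object_path


parsed = parse_external_link_value(b"\x00data.h5\x00/group/dataset\x00")
```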

+ For user-defined links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of user-defined data.
Length of user-defined link value bytes:The data supplied for the user-defined link type.
+

+ +
+
+ +
+

IV.A.2.h. The Data Storage - +External Data Files Message

+ + +
+ + + + + + + + +
Header Message Name: External + Data Files
Header Message Type: 0x0007
Length: Varies
Status: Optional; may not be + repeated.
Description: The external data storage message indicates that the data for an object is stored outside the HDF5 file. The name of the file containing the data is stored as a Uniform Resource Locator (URL). An external file list record also contains the byte offset of the start of the data within the file and the amount of space reserved in the file for that data.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ External File List Message +
byte    byte    byte    byte
Version    Reserved (zero)
Allocated Slots    Used Slots

Heap AddressO


Slot Definitions...

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number information is used for changes in the format of + External Data Storage Message and is described here: + + + + + + + + + + + + + +
Version   Description
0         Never used.
1         The current version used by the library.

+ +

Allocated Slots

+

The total number of slots allocated in the message. Its value must be at least as large as the value contained in the Used Slots field. (The current library simply uses the number of Used Slots for this message.)

+

Used Slots

+

The number of initial slots that contain valid information.

+

Heap Address

+

This is the address of a local heap which contains the names for the external + files (The local heap information can be found in Disk Format Level 1D in this + document). The name at offset zero in the heap is always the empty string.

+

Slot Definitions

+

The slot definitions are stored in order according to the array addresses they + represent.

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ External File List Slot +
bytebytebytebyte

Name Offset in Local HeapL


Offset in External Data FileL


Data Size in External FileL

+ + + + + +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Name Offset in Local Heap

+

The byte offset within the local name heap for the name + of the file. File names are stored as a URL which has a + protocol name, a host name, a port number, and a file + name: + protocol:port//host/file. + If the protocol is omitted then “file:” is assumed. If + the port number is omitted then a default port for that + protocol is used. If both the protocol and the port + number are omitted then the colon can also be omitted. If + the double slash and host name are omitted then + “localhost” is assumed. The file name is the only + mandatory part, and if the leading slash is missing then + it is relative to the application’s current working + directory (the use of relative names is not + recommended). +

+

Offset in External Data File

+

This is the byte offset to the start of the data in the + specified file. For files that contain data for a single + dataset this will usually be zero.

+

Data Size in External File

+

This is the total number of bytes reserved in the + specified file for raw data storage. For a file that + contains exactly one complete dataset which is not + extendable, the size will usually be the exact size of the + dataset. However, by making the size larger one allows + HDF5 to extend the dataset. The size can be set to a value + larger than the entire file since HDF5 will read zeroes + past the end of the file without failing.
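
A minimal sketch of reading one External File List Slot is shown below. It assumes the three fields are little-endian unsigned integers of “Size of Lengths” bytes each, as marked by ‘L’ in the layout table; the function name and argument names are hypothetical.

```python
def parse_efl_slot(buf: bytes, size_of_lengths: int, offset: int = 0):
    """Read the three L-sized fields of one External File List Slot."""
    fields = []
    for _ in range(3):
        chunk = buf[offset:offset + size_of_lengths]
        if len(chunk) != size_of_lengths:
            raise ValueError("buffer too short for slot")
        fields.append(int.from_bytes(chunk, "little"))
        offset += size_of_lengths
    name_offset, file_offset, data_size = fields
    return name_offset, file_offset, data_size


# A slot naming the heap entry at offset 8, with data starting at byte 0
# of the external file and 4096 bytes reserved (8-byte lengths assumed).
raw = b"".join(v.to_bytes(8, "little") for v in (8, 0, 4096))
slot = parse_efl_slot(raw, size_of_lengths=8)
```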

+
+
+ + +
+

IV.A.2.i. The Data Storage - Layout +Message

+ + +
+ + + + + + + + +
Header Message Name: Data Storage - + Layout
Header Message Type: 0x0008
Length: Varies
Status: Required for datasets; may not + be repeated.
Description:Data layout describes how the elements of a multi-dimensional + array are stored in the HDF5 file. Three types of data layout + are supported: +
  1. Contiguous: The array is stored in one contiguous area of the file. This layout requires that the size of the array be constant: data manipulations such as chunking, compression, checksums, or encryption are not permitted. The message stores the total storage size of the array. The offset of an element from the beginning of the storage area is computed as in a C array.
  2. Chunked: The array domain is regularly decomposed into chunks, and each chunk is allocated and stored separately. This layout supports arbitrary element traversals, compression, encryption, and checksums. (These features are described in other messages.) The message stores the size of a chunk instead of the size of the entire array; the storage size of the entire array can be calculated by traversing the B-tree that stores the chunk addresses.
  3. Compact: The array is stored in one contiguous block, as part of this object header message.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Data Layout Message (Versions 1 and 2) +
bytebytebytebyte
VersionDimensionalityLayout ClassReserved (zero)
Reserved (zero)

Data AddressO (optional)

Dimension 0 Size
Dimension 1 Size
...
Dimension #n Size
Dataset Element Size (optional)
Compact Data Size (optional)

Compact Data... (variable size, optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number information is used for changes in the format of the data + layout message and is described here: + + + + + + + + + + + + + + + + + + + + +
Version   Description
0         Never used.
1         Used by version 1.4 and earlier of the library to encode layout information. Data space is always allocated when the dataset is created.
2         Used by version 1.6.x of the library to encode layout information. Data space is allocated only when it is necessary.

+

Dimensionality

An array has a fixed dimensionality. This field + specifies the number of dimension size fields later in the + message. The value stored for chunked storage is 1 greater than + the number of dimensions in the dataset’s dataspace. + For example, 2 is stored for a 1 dimensional dataset. +

+

Layout Class

The layout class specifies the type of storage for the data + and how the other fields of the layout message are to be + interpreted. + + + + + + + + + + + + + + + + + + + + + +
Value   Description
0       Compact Storage
1       Contiguous Storage
2       Chunked Storage
+

+

Data Address

For contiguous storage, this is the address of the raw + data in the file. For chunked storage this is the address + of the v1 B-tree that is used to look up the addresses of the + chunks. This field is not present for compact storage. + If the version for this message is greater than 1, the address + may have the “undefined address” value, to indicate that + storage has not yet been allocated for this array.

+

Dimension #n Size

For contiguous and compact storage the dimensions define + the entire size of the array while for chunked storage they define + the size of a single chunk. In all cases, they are in units of + array elements (not bytes). The first dimension stored in the list + of dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +
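
The “computed as in a C array” rule for contiguous storage, together with the slowest-to-fastest ordering of the dimension sizes, amounts to a row-major offset calculation. The sketch below is an illustration only: the function name is hypothetical, and the element size is assumed to be known from the dataset’s datatype.

```python
def element_byte_offset(dims, index, element_size):
    """Byte offset of an element from the start of contiguous storage.

    dims and index are listed slowest-changing dimension first, matching
    the order in which the Dimension Size fields are stored.
    """
    linear = 0
    for extent, i in zip(dims, index):
        if not 0 <= i < extent:
            raise IndexError("index out of bounds")
        # Row-major (C order): earlier dimensions have larger strides.
        linear = linear * extent + i
    return linear * element_size


# Element [1][2] of a 3 x 4 array of 8-byte values.
offset = element_byte_offset((3, 4), (1, 2), 8)
```

Here the linear element index is 1 * 4 + 2 = 6, so the byte offset is 6 * 8 = 48.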

+

Dataset Element Size

The size of a dataset element, in bytes. This field is only + present for chunked storage. +

+

Compact Data Size

This field is only present for compact data storage. + It contains the size of the raw data for the dataset array, in + bytes.

+

Compact Data

This field is only present for compact data storage. + It contains the raw data for the dataset array.

+
+
+ +
+

Version 3 of this message re-structured the format into specific + properties that are required for each layout class.

+ + +
+ + + + + + + + + + + + + + + + + + + +
+ Data Layout Message (Version 3) +
bytebytebytebyte
VersionLayout ClassThis space inserted only to align table nicely

Properties (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

+

The version number information is used for changes in the format of layout message + and is described here: + + + + + + + + + + +
Version   Description
3         Used by version 1.6.3 and later of the library to store properties for each layout class.

+

Layout Class

The layout class specifies the type of storage for the data + and how the other fields of the layout message are to be + interpreted. + + + + + + + + + + + + + + + + + + + + +
Value   Description
0       Compact Storage
1       Contiguous Storage
2       Chunked Storage
+

+

Properties

This variable-sized field encodes information specific to each + layout class and is described below. If there is no property + information specified for a layout class, the size of this field + is zero bytes.

+
+ +
+

Class-specific information for compact layout (Class 0): (Note: The dimensionality information + is in the Dataspace message)

+ + +
+ + + + + + + + + + + + + + + + + + +
+ Compact Storage Property Description +
bytebytebytebyte
SizeThis space inserted only to align table nicely

Raw Data... (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Size

This field contains the size of the raw data for the dataset + array, in bytes. +

+

Raw Data

This field contains the raw data for the dataset array.

+
+ + +
+

Class-specific information for contiguous layout (Class 1): (Note: The dimensionality information + is in the Dataspace message)

+ + +
+ + + + + + + + + + + + + + + + + +
+ Contiguous Storage Property Description +
bytebytebytebyte

AddressO


SizeL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Address

This is the address of the raw data in the file. + The address may have the “undefined address” value, to indicate + that storage has not yet been allocated for this array.

Size

This field contains the size allocated to store the raw data, + in bytes. +
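
The contiguous storage property can be read as sketched below. The sketch assumes little-endian fields and that the “undefined address” is the value with all bits set, as defined in the superblock portion of this specification; the function name is hypothetical.

```python
def parse_contiguous_property(buf: bytes, size_of_offsets: int, size_of_lengths: int):
    """Return (address, size); address is None when storage is unallocated."""
    end = size_of_offsets + size_of_lengths
    if len(buf) < end:
        raise ValueError("buffer too short for contiguous storage property")
    address = int.from_bytes(buf[:size_of_offsets], "little")
    size = int.from_bytes(buf[size_of_offsets:end], "little")
    # Assumed: the undefined address has every bit set.
    undefined = (1 << (8 * size_of_offsets)) - 1
    return (None if address == undefined else address), size


# Storage not yet allocated, 1024 bytes of raw data expected.
prop = parse_contiguous_property(b"\xff" * 8 + (1024).to_bytes(8, "little"), 8, 8)
```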

+
+
+ + +
+

Class-specific information for chunked layout (Class 2):

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Chunked Storage Property Description +
bytebytebytebyte
DimensionalityThis space inserted only to align table nicely

AddressO

Dimension 0 Size
Dimension 1 Size
...
Dimension #n Size
Dataset Element Size
+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Dimensionality

A chunk has a fixed dimensionality. This field specifies + the number of dimension size fields later in the message.

Address

This is the address of the v1 B-tree that is used to look up the + addresses of the chunks that actually store portions of the array + data. The address may have the “undefined address” value, to + indicate that storage has not yet been allocated for this array.

Dimension #n Size

These values define the dimension size of a single chunk, in + units of array elements (not bytes). The first dimension stored in + the list of dimensions is the slowest changing dimension and the + last dimension stored is the fastest changing dimension. +

+

Dataset Element Size

The size of a dataset element, in bytes. +

+
+
+ +
+

IV.A.2.j. The Bogus Message

+ + +
+ + + + + + + + +
Header Message Name: Bogus
Header Message Type: 0x0009
Length: 4 bytes
Status: For testing only; should never + be stored in a valid file.
Description:This message is used for testing the HDF5 Library’s + response to an “unknown” message type and should + never be encountered in a valid HDF5 file.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + +
+ Bogus Message +
bytebytebytebyte
Bogus Value
+
+ +
+
+ + + + + + + + + + +
Field NameDescription

Bogus Value

+

This value should always be: 0xdeadbeef.

+
+
+ +
+

IV.A.2.k. The Group Info Message +

+ + +
+ + + + + + + + +
Header Message Name: Group Info
Header Message Type: 0x000A
Length: Varies
Status: Optional; may not be + repeated.
Description:

This message stores information for the constants defining + a “new style” group’s behavior. Constant + information will be stored in this message and variable + information will be stored in the + Link Info message.

+

Note: the “estimated entry” information below is + used when determining the size of the object header for the + group when it is created.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Group Info Message +
byte    byte    byte    byte
Version    Flags    Link Phase Change: Maximum Compact Value (optional)
Link Phase Change: Minimum Dense Value (optional)    Estimated Number of Entries (optional)
Estimated Link Name Length of Entries (optional)    This space inserted only to align table nicely
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes version 0.

+

Flags

This is the group information flag with the following definition: + + + + + + + + + + + + + + + + + + + +
Bit     Description
0       If set, link phase change values are stored.
1       If set, the estimated entry information is non-default and is stored.
2-7     Reserved

+

Link Phase Change: Maximum Compact Value

This is the maximum number of links to store “compactly” (in the group’s object header).

+

This field is present if bit 0 of Flags is set.

+

Link Phase Change: Minimum Dense Value

This is the minimum number of links to store “densely” (in + the group’s fractal heap). The fractal heap’s address is + located in the Link Info + message.

+

This field is present if bit 0 of Flags is set.

+

Estimated Number of Entries

This is the estimated number of entries in the group.

+

If this field is not present, the default value of 4 + will be used for the estimated number of group entries.

+

This field is present if bit 1 of Flags is set.

+

Estimated Link Name Length of Entries

This is the estimated length of an entry’s name.

+

If this field is not present, the default value of 8 + will be used for the estimated link name length of group entries.

+

This field is present if bit 1 of Flags is set.
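
The optional fields of the Group Info message can be decoded as in the following sketch. It assumes each optional value is a 2-byte little-endian unsigned integer, as the layout table suggests, and applies the defaults of 4 and 8 stated above when bit 1 is clear; the function name and dictionary keys are hypothetical.

```python
import struct


def parse_group_info(buf: bytes) -> dict:
    """Decode a version 0 Group Info message (illustrative sketch)."""
    version, flags = buf[0], buf[1]
    if version != 0:
        raise ValueError("only version 0 is described here")
    info = {
        "max_compact": None,      # stored only if bit 0 is set
        "min_dense": None,        # stored only if bit 0 is set
        "est_num_entries": 4,     # default when bit 1 is clear
        "est_name_length": 8,     # default when bit 1 is clear
    }
    pos = 2
    if flags & 0x01:
        info["max_compact"], info["min_dense"] = struct.unpack_from("<HH", buf, pos)
        pos += 4
    if flags & 0x02:
        info["est_num_entries"], info["est_name_length"] = struct.unpack_from("<HH", buf, pos)
        pos += 4
    return info


defaults = parse_group_info(bytes([0, 0]))
custom = parse_group_info(bytes([0, 0x02]) + struct.pack("<HH", 16, 32))
```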

+
+
+

+ +
+

IV.A.2.l. The Data Storage - Filter +Pipeline Message

+ + +
+ + + + + + + + +
Header Message Name: + Data Storage - Filter Pipeline
Header Message Type: 0x000B
Length: Varies
Status: Optional; may not be + repeated.
Description:

This message describes the filter pipeline which should + be applied to the data stream by providing filter identification + numbers, flags, a name, and client data.

+

This message may be present in the object headers of both + dataset and group objects. For datasets, it specifies the + filters to apply to raw data. For groups, it specifies the + filters to apply to the group’s fractal heap. Currently, + only datasets using chunked data storage use the filter + pipeline on their raw data.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Pipeline Message - Version 1 +
bytebytebytebyte
VersionNumber of FiltersReserved (zero)
Reserved (zero)

Filter Description List (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This table + describes version 1.

Number of Filters

The total number of filters described in this + message. The maximum possible number of filters in a + message is 32.

Filter Description List

A description of each filter. A filter description + appears in the next table.

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Description +
byte    byte    byte    byte
Filter Identification Value    Name Length
Flags    Number of Client Data Values

Name (variable size, optional)


Client Data (variable size, optional)

Padding (variable size, optional)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Filter Identification Value

+

+ This value, often referred to as a filter identifier, + is designed to be a unique identifier for the filter. + Values from zero through 32,767 are reserved for filters + supported by The HDF Group in the HDF5 Library and for + filters requested and supported by third parties. + Filters supported by The HDF Group are documented immediately + below. Information on 3rd-party filters can be found at + The HDF Group’s + + Contributions page.

+ +

To request a filter identifier, please contact The HDF Group’s Help Desk. You will be asked to provide the following information:

+
  1. Contact information for the developer requesting the new identifier
  2. A short description of the new filter
  3. Links to any relevant information, including licensing information
+

+ Values from 32768 to 65535 are reserved for non-distributed uses + (for example, internal company usage) or for application usage + when testing a feature. The HDF Group does not track or document + the use of the filters with identifiers from this range.

+ +

+ The filters currently in library version 1.8.0 are + listed below: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Identification   Name          Description
0                N/A           Reserved
1                deflate       GZIP deflate compression
2                shuffle       Data element shuffling
3                fletcher32    Fletcher32 checksum
4                szip          SZIP compression
5                nbit          N-bit packing
6                scaleoffset   Scale and offset encoded values
+

Name Length

Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.

Flags

The flags indicate certain properties for a filter. The + bit values defined so far are: + + + + + + + + + + + + + + + +
Bit     Description
0       If set then the filter is an optional filter. During output, if an optional filter fails it will be silently skipped in the pipeline.
1-15    Reserved (zero)

+

Number of Client Data Values

Each filter can store integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.

Name

If the Name Length field is non-zero then it will + contain the size of this field, padded to a multiple of eight. This + field contains a null-terminated, ASCII character + string to serve as a comment/name for the filter.

Client Data

This is an array of four-byte integers which will be passed to the filter function. The Number of Client Data Values field determines the number of elements in the array.

Padding

Four bytes of zeroes are added to the message at this point if the Number of Client Data Values field contains an odd number.
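
A version 1 filter description could be decoded roughly as follows. The sketch assumes little-endian 2-byte header fields and treats the client data values as unsigned, which the text above does not specify; the function name and dictionary keys are hypothetical.

```python
import struct


def parse_filter_v1(buf: bytes, pos: int = 0):
    """Decode one version 1 filter description; returns (filter, next_pos)."""
    filter_id, name_len, flags, n_values = struct.unpack_from("<HHHH", buf, pos)
    pos += 8
    # Name Length already includes the terminator and the padding to a
    # multiple of eight, so the name is cut at the first null byte.
    name = buf[pos:pos + name_len].split(b"\x00", 1)[0].decode("ascii")
    pos += name_len
    values = struct.unpack_from("<%dI" % n_values, buf, pos)
    pos += 4 * n_values
    if n_values % 2:
        pos += 4  # four bytes of zero padding after an odd number of values
    description = {
        "id": filter_id,
        "name": name,
        "optional": bool(flags & 0x0001),
        "client_data": values,
    }
    return description, pos


raw = (struct.pack("<HHHH", 1, 8, 1, 1) + b"deflate\x00"
       + struct.pack("<I", 6) + b"\x00" * 4)
deflate, next_pos = parse_filter_v1(raw)
```

In this example the single client data value is followed by four padding bytes, so the next description would start at byte 24.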

+
+ +
+
+ + + + + + + + + + + + + + + + + + + +
+ Filter Pipeline Message - Version 2 +
bytebytebytebyte
VersionNumber of FiltersThis space inserted only to align table nicely

Filter Description List (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This table + describes version 2.

Number of Filters

The total number of filters described in this + message. The maximum possible number of filters in a + message is 32.

Filter Description List

A description of each filter. A filter description + appears in the next table.

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Filter Description +
byte    byte    byte    byte
Filter Identification Value    Name Length (optional)
Flags    Number of Client Data Values

Name (variable size, optional)


Client Data (variable size, optional)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Filter Identification Value

+

+ This value, often referred to as a filter identifier, + is designed to be a unique identifier for the filter. + Values from zero through 32,767 are reserved for filters + supported by The HDF Group in the HDF5 Library and for + filters requested and supported by third parties. + Filters supported by The HDF Group are documented immediately + below. Information on 3rd-party filters can be found at + The HDF Group’s + + Contributions page.

+ +

To request a filter identifier, please contact The HDF Group’s Help Desk. You will be asked to provide the following information:

+
  1. Contact information for the developer requesting the new identifier
  2. A short description of the new filter
  3. Links to any relevant information, including licensing information
+

+ Values from 32768 to 65535 are reserved for non-distributed uses + (for example, internal company usage) or for application usage + when testing a feature. The HDF Group does not track or document + the use of the filters with identifiers from this range.

+ +

+ The filters currently in library version 1.8.0 are + listed below: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Identification   Name          Description
0                N/A           Reserved
1                deflate       GZIP deflate compression
2                shuffle       Data element shuffling
3                fletcher32    Fletcher32 checksum
4                szip          SZIP compression
5                nbit          N-bit packing
6                scaleoffset   Scale and offset encoded values
+

Name Length

Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.

+

Filters with IDs less than 256 (in other words, filters + that are defined in this format documentation) do not store + the Name Length or Name fields. +

+

Flags

The flags indicate certain properties for a filter. The + bit values defined so far are: + + + + + + + + + + + + + + + +
Bit     Description
0       If set then the filter is an optional filter. During output, if an optional filter fails it will be silently skipped in the pipeline.
1-15    Reserved (zero)

+

Number of Client Data Values

Each filter can store integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.

Name

If the Name Length field is non-zero then it will + contain the size of this field, not padded to a multiple + of eight. This field contains a non-null-terminated, + ASCII character string to serve as a comment/name for the filter. +

+

Filters that are defined in this format documentation + such as deflate and shuffle do not store the Name + Length or Name fields. +

+

Client Data

This is an array of four-byte integers which will be passed to the filter function. The Number of Client Data Values field determines the number of elements in the array.
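
The differences in the version 2 filter description (no Name Length or Name for identifiers below 256, no padding) can be sketched as follows. As before, little-endian fields and unsigned client data values are assumptions, and the names are hypothetical. Because the Name Length and Name descriptions above disagree on null termination, the sketch tolerates a trailing null byte.

```python
import struct


def parse_filter_v2(buf: bytes, pos: int = 0):
    """Decode one version 2 filter description; returns (filter, next_pos)."""
    (filter_id,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    name_len = 0
    if filter_id >= 256:
        # Only filters not defined in this document store a Name Length.
        (name_len,) = struct.unpack_from("<H", buf, pos)
        pos += 2
    flags, n_values = struct.unpack_from("<HH", buf, pos)
    pos += 4
    # The name is not padded; strip a trailing null if one was written.
    name = buf[pos:pos + name_len].rstrip(b"\x00").decode("ascii")
    pos += name_len
    values = struct.unpack_from("<%dI" % n_values, buf, pos)
    pos += 4 * n_values  # no padding follows in version 2
    description = {
        "id": filter_id,
        "name": name,
        "optional": bool(flags & 0x0001),
        "client_data": values,
    }
    return description, pos


shuffle, next_pos = parse_filter_v2(struct.pack("<HHH", 2, 0, 1) + struct.pack("<I", 4))
```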

+
+
+ +
+

IV.A.2.m. The Attribute Message

+ + +
+ + + + + + + + +
Header Message Name: Attribute
Header Message Type: 0x000C
Length: Varies
Status: Optional; may be + repeated.
Description:

The Attribute message is used to store objects + in the HDF5 file which are used as attributes, or + “metadata” about the current object. An attribute + is a small dataset; it has a name, a datatype, a dataspace, and + raw data. Since attributes are stored in the object header, they + should be relatively small (in other words, less than 64KB). + They can be associated with any type of object which has an + object header (groups, datasets, or committed (named) + datatypes).

+

In 1.8.x versions of the library, attributes can be larger + than 64KB. See the + + “Special Issues” section of the Attributes chapter + in the HDF5 User’s Guide for more information.

+

Note: Attributes on an object must have unique names: + the HDF5 Library currently enforces this by causing the + creation of an attribute with a duplicate name to fail. + Attributes on different objects may have the same name, + however.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message (Version 1) +
byte    byte    byte    byte
Version    Reserved (zero)    Name Size
Datatype Size    Dataspace Size

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number information is used for changes in the format of the + attribute message and is described here: + + + + + + + + + + + + + + + +
Version   Description
0         Never used.
1         Used by the library before version 1.6 to encode attribute messages. This version does not support shared datatypes.

+

Name Size

The length of the attribute name in bytes including the + null terminator. Note that the Name field below may + contain additional padding not represented by this + field.

Datatype Size

The length of the datatype description in the Datatype + field below. Note that the Datatype field may contain + additional padding not represented by this field.

Dataspace Size

The length of the dataspace description in the Dataspace + field below. Note that the Dataspace field may contain + additional padding not represented by this field.

Name

The null-terminated attribute name. This field is + padded with additional null characters to make it a + multiple of eight bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. This + field is not padded with additional bytes.
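
Because the Name, Datatype, and Dataspace fields of a version 1 message are each padded to a multiple of eight bytes while the size fields record the unpadded lengths, a reader has to round each size up to find the next field. A minimal sketch, with hypothetical function names and an 8-byte fixed header taken from the layout table:

```python
def pad8(n: int) -> int:
    """Round n up to the next multiple of eight."""
    return (n + 7) & ~7


def attribute_v1_offsets(name_size: int, datatype_size: int, dataspace_size: int):
    """Byte offsets of Name, Datatype, Dataspace and Data within the message."""
    name_off = 8  # Version, Reserved, and the three 2-byte size fields
    datatype_off = name_off + pad8(name_size)
    dataspace_off = datatype_off + pad8(datatype_size)
    data_off = dataspace_off + pad8(dataspace_size)
    return name_off, datatype_off, dataspace_off, data_off


# A 6-byte name (including terminator), 20-byte datatype, 8-byte dataspace.
offsets = attribute_v1_offsets(6, 20, 8)
```

For these sizes the name occupies 8 bytes and the datatype 24 bytes once padded, giving offsets 8, 16, 40, and 48.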

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message (Version 2) +
bytebytebytebyte
VersionFlagsName Size
Datatype SizeDataspace Size

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number information is used for changes in the + format of the attribute message and is described here: + + + + + + + + + + +
Version   Description
2         Used by version 1.6.x and later of the library to encode attribute messages. This version supports shared datatypes. The name, datatype, and dataspace fields are not padded with additional zero bytes.

+

Flags

This bit field contains extra information about + interpreting the attribute message: + + + + + + + + + + + + + + + + +
Bit     Description
0       If set, datatype is shared.
1       If set, dataspace is shared.

+

Name Size

The length of the attribute name in bytes including the + null terminator.

Datatype Size

The length of the datatype description in the Datatype + field below.

Dataspace Size

The length of the dataspace description in the Dataspace + field below.

Name

The null-terminated attribute name. This field is not + padded with additional bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. +

+

If the Flags field indicates this attribute’s datatype is shared, this field will contain a “shared message” encoding instead of the datatype encoding.

+

This field is not padded with additional bytes. +

+

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. +

+

If the Flags field indicates this attribute’s dataspace is shared, this field will contain a “shared message” encoding instead of the dataspace encoding.

+

This field is not padded with additional bytes.

+

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. +

+

This field is not padded with additional zero bytes. +

+
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Message (Version 3) +
byte    byte    byte    byte
Version    Flags    Name Size
Datatype Size    Dataspace Size
Name Character Set Encoding    This space inserted only to align table nicely

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number information is used for changes in the + format of the attribute message and is described here: + + + + + + + + + + +
Version   Description
3         Used by version 1.8.x and later of the library to encode attribute messages. This version supports attributes with non-ASCII names.

+

Flags

This bit field contains extra information about + interpreting the attribute message: + + + + + + + + + + + + + + + + +
Bit     Description
0       If set, datatype is shared.
1       If set, dataspace is shared.

+

Name Size

The length of the attribute name in bytes including the + null terminator.

Datatype Size

The length of the datatype description in the Datatype + field below.

Dataspace Size

The length of the dataspace description in the Dataspace + field below.

Name Character Set Encoding

The character set encoding for the attribute’s name: + + + + + + + + + + + + + + + +
ValueDescription
0ASCII character set encoding +
1UTF-8 character set encoding +
+

+

Name

The null-terminated attribute name. This field is not + padded with additional bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. +

+

If the + Flag field indicates this attribute’s datatype is + shared, this field will contain a “shared message” encoding + instead of the datatype encoding. +

+

This field is not padded with additional bytes. +

+

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. +

+

If the + Flag field indicates this attribute’s dataspace is + shared, this field will contain a “shared message” encoding + instead of the dataspace encoding. +

+

This field is not padded with additional bytes.

+

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. +

+

This field is not padded with additional zero bytes. +

+
+
+ +
+
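The version 3 layout above can be walked with a short parser. This is a minimal sketch, not the library's implementation: the helper name `parse_attribute_v3` is hypothetical, and the field widths (1, 1, 2, 2, 2, and 1 bytes) are assumptions read from the byte columns of the layout table.

```python
import struct

# Fixed-size prefix of a version 3 attribute message, little-endian.
# Assumed widths: Version (1), Flags (1), Name Size (2), Datatype Size (2),
# Dataspace Size (2), Name Character Set Encoding (1).
PREFIX = struct.Struct("<BBHHHB")
CHARSETS = {0: "ascii", 1: "utf-8"}


def parse_attribute_v3(buf):
    """Hypothetical helper: decode the prefix and locate the variable fields."""
    version, flags, name_size, dtype_size, dspace_size, charset = PREFIX.unpack_from(buf, 0)
    if version != 3:
        raise ValueError(f"expected attribute message version 3, got {version}")
    name_start = PREFIX.size
    dtype_start = name_start + name_size      # Name is not padded
    dspace_start = dtype_start + dtype_size   # Datatype is not padded
    data_start = dspace_start + dspace_size   # Dataspace is not padded
    raw_name = bytes(buf[name_start:dtype_start])
    return {
        "name": raw_name[:-1].decode(CHARSETS[charset]),  # drop the null terminator
        "datatype_shared": bool(flags & 0x01),            # Flags bit 0
        "dataspace_shared": bool(flags & 0x02),           # Flags bit 1
        "datatype": (dtype_start, dtype_size),            # (offset, length)
        "dataspace": (dspace_start, dspace_size),         # (offset, length)
        "data_start": data_start,
    }
```

The sketch only locates the Datatype and Dataspace fields; decoding them (or the shared message encoding that replaces them when a flag bit is set) is left to the corresponding message parsers.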

IV.A.2.n. The Object Comment +Message

+ + +
+ + + + + + + + +
Header Message Name: Object + Comment
Header Message Type: 0x000D
Length: Varies
Status: Optional; may not be + repeated.
Description:The object comment is designed to be a short description of + an object. An object comment is a sequence of non-zero + (\0) ASCII characters with no other formatting + included by the library.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + +
+ Object Comment Message +
bytebytebytebyte

Comment (variable size)

+
+ +
+
+ + + + + + + + + + +
Field NameDescription

Comment

A null terminated ASCII character string.

+
+ +
+

IV.A.2.o. The Object +Modification Time (Old) Message

+ + +
+ + + + + + + + +
Header Message Name: Object + Modification Time (Old)
Header Message Type: 0x000E
Length: Fixed
Status: Optional; may not be + repeated.
Description:

The object modification date and time is a timestamp + which indicates (using ISO-8601 date and time format) the last + modification of an object. The time is updated when any object + header message changes according to the system clock where the + change was posted. All fields of this message should be + interpreted as coordinated universal time (UTC).

+

This modification time message is deprecated in favor of + the “new” Object + Modification Time message and is no longer written to the + file in versions of the HDF5 Library after the 1.6.0 + version.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Modification Time Message +
bytebytebytebyte
Year
MonthDay of Month
HourMinute
SecondReserved
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Year

The four-digit year as an ASCII string. For example, + 1998. +

Month

The month number as a two digit ASCII string where + January is 01 and December is 12.

Day of Month

The day number within the month as a two digit ASCII + string. The first day of the month is 01.

Hour

The hour of the day as a two digit ASCII string where + midnight is 00 and 11:00pm is 23.

Minute

The minute of the hour as a two digit ASCII string where + the first minute of the hour is 00 and + the last is 59.

Second

The second of the minute as a two digit ASCII string + where the first second of the minute is 00 + and the last is 59.

Reserved

This field is reserved and should always be zero.

+
+ +
+
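Because every field of the old modification time message is an ASCII digit string, the timestamp can be recovered with ordinary date parsing. The sketch below assumes the message is 16 bytes long (14 digit characters followed by 2 reserved bytes); `parse_old_mtime` is a hypothetical name.

```python
from datetime import datetime, timezone


def parse_old_mtime(buf):
    """Hypothetical helper: decode the ASCII fields YYYYMMDDhhmmss as UTC."""
    text = bytes(buf[:14]).decode("ascii")
    stamp = datetime.strptime(text, "%Y%m%d%H%M%S")
    # All fields of this message are interpreted as UTC.
    return stamp.replace(tzinfo=timezone.utc)
```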

IV.A.2.p. The Shared Message Table +Message

+ + +
+ + + + + + + + +
Header Message Name: Shared Message + Table
Header Message Type: 0x000F
Length: Fixed
Status: Optional; may not be + repeated.
Description:This message is used to locate the table of shared object + header message (SOHM) indexes. Each index consists of information + to find the shared messages from either the heap or object header. + This message is only found in the superblock + extension.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Shared Message Table Message +
bytebytebytebyte
VersionThis space inserted only to align table nicely

Shared Object Header Message Table AddressO

Number of IndicesThis space inserted only to align table nicely
+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes version 0.

Shared Object Header Message Table Address

This field is the address of the master table for shared + object header message indexes.

+

Number of Indices

This field is the number of indices in the master table. +

+
+ +
+

IV.A.2.q. The Object Header +Continuation Message

+ + +
+ + + + + + + + +
Header Message Name: Object Header + Continuation
Header Message Type: 0x0010
Length: Fixed
Status: Optional; may be + repeated.
Description:The object header continuation is the location in the file + of a block containing more header messages for the current data + object. This can be used when header blocks become too large or + are likely to change over time.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Object Header Continuation Message +
bytebytebytebyte

OffsetO


LengthL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Offset

This value is the address in the file where the + header continuation block is located.

Length

This value is the length in bytes of the header continuation + block in the file.

+
+
+ +

The format of the header continuation block that this message points + to depends on the version of the object header that the message is + contained within. +

+ +

+ Continuation blocks for version 1 object headers have no special + formatting information; they are merely a list of object header + message info sequences (type, size, flags, reserved bytes and data + for each message sequence). See the description + of Version 1 Data Object Header Prefix. +

+ +

Continuation blocks for version 2 object headers do have + special formatting information as described here + (see also the description of + Version 2 Data Object Header Prefix): +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Version 2 Object Header Continuation Block +
bytebytebytebyte
Signature
Header Message Type #1Size of Header Message Data #1Header Message #1 Flags
Header Message #1 Creation Order (optional)This space inserted only to align table nicely

Header Message Data #1

.
.
.
Header Message Type #nSize of Header Message Data #nHeader Message #n Flags
Header Message #n Creation Order (optional)This space inserted only to align table nicely

Header Message Data #n

Gap (optional, variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Signature

+

The ASCII character string “OCHK” + is used to indicate the + beginning of an object header continuation block. This gives file + consistency checking utilities a better chance of reconstructing a + damaged file. +

+

Header Message #n Type

+

Same format as version 1 of the object header, described above. +

Size of Header Message #n Data

+

Same format as version 1 of the object header, described above. +

Header Message #n Flags

+

Same format as version 1 of the object header, described above. +

Header Message #n Creation Order

+

This field stores the order that a message of a given type + was created in.

+

This field is present if bit 2 of flags is set.

+

Header Message #n Data

+

Same format as version 1 of the object header, described above. +

Gap

+

A gap in an object header chunk is inferred by the end of the + messages for the chunk before the beginning of the chunk’s + checksum. Gaps are always smaller than the size of an + object header message prefix (message type + message size + + message flags).

+

Gaps are formed when a message (typically an attribute message) + in an earlier chunk is deleted and a message from a later + chunk that does not quite fit into the free space is moved + into the earlier chunk.

+

Checksum

+

This is the checksum for the object header chunk. +

+
+
+ +
+
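As an illustration of the continuation message and the block it points to, the following sketch reads the Offset and Length fields and checks a block for the version 2 signature. The helper names are hypothetical, and the two size arguments stand for the "Size of Offsets" and "Size of Lengths" values taken from the superblock. The checksum is not verified here.

```python
def read_uint(buf, offset, size):
    """Read an unsigned little-endian integer of `size` bytes."""
    return int.from_bytes(buf[offset:offset + size], "little")


def parse_continuation_message(buf, size_of_offsets, size_of_lengths):
    """Return (address, length) of the header continuation block."""
    address = read_uint(buf, 0, size_of_offsets)
    length = read_uint(buf, size_of_offsets, size_of_lengths)
    return address, length


def is_v2_continuation_block(block):
    """Version 2 continuation blocks begin with the ASCII signature OCHK."""
    return bytes(block[:4]) == b"OCHK"
```

A version 1 continuation block has no signature, so a reader has to decide which check to apply from the version of the object header that contains the message.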

IV.A.2.r. The Symbol Table +Message

+ + +
+ + + + + + + + +
Header Message Name: Symbol Table + Message
Header Message Type: 0x0011
Length: Fixed
Status: Required for + “old style” groups; may not be repeated.
Description:Each “old style” group has a v1 B-tree and a + local heap for storing symbol table entries, which are located + with this message.
Format of data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Symbol Table Message +
bytebytebytebyte

v1 B-tree AddressO


Local Heap AddressO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

v1 B-tree Address

This value is the address of the v1 B-tree containing the + symbol table entries for the group.

Local Heap Address

This value is the address of the local heap containing + the link names for the symbol table entries for the group.

+
+ +
+

IV.A.2.s. The Object +Modification Time Message

+ + +
+ + + + + + + + +
Header Message Name: Object + Modification Time
Header Message Type: 0x0012
Length: Fixed
Status: Optional; may not be + repeated.
Description:The object modification time is a timestamp which indicates + the time of the last modification of an object. The time is + updated when any object header message changes according to + the system clock where the change was posted.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + +
+ Modification Time Message +
bytebytebytebyte
VersionReserved (zero)
Seconds After UNIX Epoch
+
+ +
+
+ + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number is used for changes in the format of Object Modification Time + and is described here: + + + + + + + + + + + + + + + +
VersionDescription
0Never used.
1Used by Version 1.6.1 and after of the library to encode time. In + this version, the time is the seconds after Epoch.

+

Seconds After UNIX Epoch

A 32-bit unsigned integer value that stores the number of + seconds since 0 hours, 0 minutes, 0 seconds, January 1, 1970, + Coordinated Universal Time.

+
+ +
+
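Decoding this message amounts to reading one version byte and one 32-bit counter. A minimal sketch, assuming the Reserved field occupies the three bytes between them (as the four-byte layout row suggests); `parse_mtime` is a hypothetical name.

```python
import struct
from datetime import datetime, timezone


def parse_mtime(buf):
    """Hypothetical helper: return the modification time as an aware datetime."""
    # B = Version, 3x = three reserved bytes (assumed), I = seconds since the epoch.
    version, seconds = struct.unpack_from("<B3xI", buf, 0)
    if version != 1:
        raise ValueError(f"unsupported modification time version {version}")
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```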

IV.A.2.t. The B-tree +‘K’ Values Message

+ + +
+ + + + + + + + +
Header Message Name: B-tree + ‘K’ Values
Header Message Type: 0x0013
Length: Fixed
Status: Optional; may not be + repeated.
Description:This message stores non-default ‘K’ values + for internal and leaf nodes of group or indexed storage v1 + B-trees. This message is only found in the superblock + extension.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + +
+ B-tree ‘K’ Values Message +
bytebytebytebyte
VersionIndexed Storage Internal Node KThis space inserted only to align table nicely
Group Internal Node KGroup Leaf Node K
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Indexed Storage Internal Node K

This is the node ‘K’ value for each internal node of an + indexed storage v1 B-tree. See the description of this field + in version 0 and 1 of the superblock as well as the section on + v1 B-trees. +

+

Group Internal Node K

This is the node ‘K’ value for each internal node of a group + v1 B-tree. See the description of this field in version 0 and + 1 of the superblock as well as the section on v1 B-trees. +

+

Group Leaf Node K

This is the node ‘K’ value for each leaf node of a group v1 + B-tree. See the description of this field in version 0 and 1 + of the superblock as well as the section on v1 B-trees. +

+
+
+ +
+
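The three ‘K’ values can be read in a single unpack call. This sketch assumes a 1-byte Version followed by three 2-byte values; those widths are an assumption, since the flattened layout table does not show them. `parse_btree_k` is a hypothetical name.

```python
import struct


def parse_btree_k(buf):
    """Hypothetical helper: decode a version 0 B-tree 'K' Values message."""
    # Assumed widths: Version (1 byte), then three 2-byte K values.
    version, indexed_internal_k, group_internal_k, group_leaf_k = struct.unpack_from("<BHHH", buf, 0)
    if version != 0:
        raise ValueError(f"unsupported B-tree K values version {version}")
    return {
        "indexed_storage_internal_k": indexed_internal_k,
        "group_internal_k": group_internal_k,
        "group_leaf_k": group_leaf_k,
    }
```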

IV.A.2.u. The Driver Info +Message

+ + +
+ + + + + + + + + +
Header Message Name: Driver + Info
Header Message Type: 0x0014
Length: Varies
Status: Optional; may not be + repeated.
+ Description:This message contains information needed by the file driver + to reopen a file. This message is only found in the + superblock extension: see the + “Disk Format: Level 0C - Superblock Extension” + section for more information. For more information on the fields + in the driver info message, see the + “Disk Format : Level 0B - File Driver Info” + section; those who use the multi and family file drivers will + find this section particularly helpful.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Driver Info Message +
bytebytebytebyte
VersionThis space inserted only to align table nicely

Driver Identification
Driver Information SizeThis space inserted only to align table nicely


Driver Information (variable size)


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Driver Identification

This is an eight-byte ASCII string without null termination which + identifies the driver. +

+

Driver Information Size

The size in bytes of the Driver Information field of this + message.

+

Driver Information

Driver information is stored in a format defined by the file driver.

+
+
+ +
+
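A reader can separate the driver identification from the driver-defined payload as sketched below. The 8-byte identification is stated in the field description; the 1-byte Version and 2-byte Driver Information Size are assumptions. `parse_driver_info` is a hypothetical name, and the payload is returned undecoded because its format is defined by the file driver.

```python
import struct

# Assumed layout: Version (1), Driver Identification (8), Driver Information Size (2).
HEADER = struct.Struct("<B8sH")


def parse_driver_info(buf):
    """Hypothetical helper: return (driver_id, driver_information_bytes)."""
    version, ident, info_size = HEADER.unpack_from(buf, 0)
    if version != 0:
        raise ValueError(f"unsupported driver info version {version}")
    info = bytes(buf[HEADER.size:HEADER.size + info_size])
    if len(info) != info_size:
        raise ValueError("driver information is truncated")
    # The identification is not null-terminated, so all 8 bytes are significant.
    return ident.decode("ascii"), info
```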

IV.A.2.v. The Attribute Info +Message

+ + +
+ + + + + + + + +
Header Message Name: Attribute + Info
Header Message Type: 0x0015
Length: Varies
Status: Optional; may not be + repeated.
Description:This message stores information about the attributes on an + object, such as the maximum creation index for the attributes + created and the location of the attribute storage when the + attributes are stored “densely”.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Attribute Info Message +
bytebytebytebyte
VersionFlagsMaximum Creation Index (optional)

Fractal Heap AddressO


Attribute Name v2 B-tree AddressO


Attribute Creation Order v2 B-tree AddressO (optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Flags

This is the attribute index information flag with the + following definition: + + + + + + + + + + + + + + + + + + + +
BitDescription
0If set, creation order for attributes is tracked. +
1If set, creation order for attributes is indexed. +
2-7Reserved

+ +

Maximum Creation Index

This is the maximum creation order index value for the + attributes on the object.

+

This field is present if bit 0 of Flags is set.

+

Fractal Heap Address

This is the address of the fractal heap to store dense + attributes.

+

Attribute Name v2 B-tree Address

This is the address of the version 2 B-tree to index the + names of densely stored attributes.

+

Attribute Creation Order v2 B-tree Address

This is the address of the version 2 B-tree to index the + creation order of densely stored attributes.

+

This field is present if bit 1 of Flags is set.

+
+
+ +
+
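The two optional fields make this message a useful example of flag-driven parsing. The sketch below is written under stated assumptions: Version and Flags are 1 byte each, Maximum Creation Index is 2 bytes, and `size_of_offsets` is the "Size of Offsets" value from the superblock. `parse_attribute_info` is a hypothetical name.

```python
def parse_attribute_info(buf, size_of_offsets):
    """Hypothetical helper: decode a version 0 Attribute Info message."""
    version, flags = buf[0], buf[1]
    if version != 0:
        raise ValueError(f"unsupported attribute info version {version}")
    pos = 2

    def take(size):
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + size], "little")
        pos += size
        return value

    # Present only if creation order is tracked (Flags bit 0).
    max_creation_index = take(2) if flags & 0x01 else None
    fractal_heap = take(size_of_offsets)
    name_btree = take(size_of_offsets)
    # Present only if creation order is indexed (Flags bit 1).
    creation_order_btree = take(size_of_offsets) if flags & 0x02 else None
    return {
        "max_creation_index": max_creation_index,
        "fractal_heap_address": fractal_heap,
        "name_btree_address": name_btree,
        "creation_order_btree_address": creation_order_btree,
    }
```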

IV.A.2.w. The Object Reference +Count Message

+ + +
+ + + + + + + + +
Header Message Name: Object Reference + Count
Header Message Type: 0x0016
Length: Fixed
Status: Optional; may not be + repeated.
Description:This message stores the number of hard links (in groups or + objects) pointing to an object: in other words, its + reference count.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + +
+ Object Reference Count +
bytebytebytebyte
VersionThis space inserted only to align table nicely
Reference count
+
+ +
+
+ + + + + + + + + + + + + + + + +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Reference Count

This unsigned 32-bit integer is the reference count for the + object. This message is only present in “version 2” + (or later) object headers; if it is not present in those object + header versions, the reference count for the object is assumed + to be 1.

+
+
+ +
+
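The default-to-1 rule can be folded into the reader. A minimal sketch, assuming a 1-byte Version immediately followed by the 32-bit count; `reference_count` is a hypothetical name, and passing `None` stands for "the object header contains no such message".

```python
import struct


def reference_count(message):
    """Hypothetical helper: reference count of an object with a version 2 header."""
    if message is None:
        return 1  # message absent: the count is assumed to be 1
    version, count = struct.unpack_from("<BI", message, 0)
    if version != 0:
        raise ValueError(f"unsupported reference count version {version}")
    return count
```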

IV.A.2.x. The File Space Info +Message

+ + +
+ + + + + + + + +
Header Message Name: File Space + Info
Header Message Type: 0x0018
Length: Fixed
Status: Optional; may not be + repeated.
+ Description:This message stores the file space management strategy (see + description below) that the library uses in handling file space + requests for the file. It also contains the free-space section + threshold used by the library’s free-space managers for + the file. If the strategy is 1, this message also contains the + addresses of the file’s free-space managers which track + free space for each type of file space allocation. There are + six basic types of file space allocation: superblock, B-tree, + raw data, global heap, local heap, and object header. See the + description of Free-space + Manager as well as the description of allocation types in + Appendix B.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ File Space Info +
bytebytebytebyte
VersionStrategyThresholdL
Superblock Free-space Manager AddressO
B-tree Free-space Manager AddressO
Raw Data Free-space Manager AddressO
Global Heap Free-space Manager AddressO
Local Heap Free-space Manager AddressO
Object Header Free-space Manager AddressO
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are of the size + specified in “Size of Offsets” field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are of the size + specified in “Size of Lengths” field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Field NameDescription

Version

This is the version number of this message. This document describes + version 0.

+

Strategy

This is the file space management strategy for the file. + There are four types of strategies: + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
1With this strategy, the HDF5 Library’s free-space managers track the + free space that results from the manipulation of HDF5 objects + in the HDF5 file. The free space information is saved when the + file is closed, and reloaded when the file is reopened. +
+ When space is needed for file metadata or raw data, + the HDF5 Library first requests space from the library’s free-space + managers. If the request is not satisfied, the library requests space + from the aggregators. If the request is still not satisfied, + the library requests space from the virtual file driver. + That is, the library will use all of the mechanisms for allocating + space. +
2This is the HDF5 Library’s default file space management strategy. + With this strategy, the library’s free-space managers track the free space + that results from the manipulation of HDF5 objects in the HDF5 file. + The free space information is NOT saved when the file is closed and + the free space that exists upon file closing becomes unaccounted + space in the file. +
+ As with strategy #1, the library will try all of the mechanisms + for allocating space. When space is needed for file metadata or + raw data, the library first requests space from the free-space + managers. If the request is not satisfied, the library requests + space from the aggregators. If the request is still not satisfied, + the library requests space from the virtual file driver. +
3With this strategy, the HDF5 Library does not track free space that results + from the manipulation of HDF5 objects in the HDF5 file and + the free space becomes unaccounted space in the file. +
+ When space is needed for file metadata or raw data, + the library first requests space from the aggregators. + If the request is not satisfied, the library requests space from + the virtual file driver. +
4With this strategy, the HDF5 Library does not track free space that results + from the manipulation of HDF5 objects in the HDF5 file and + the free space becomes unaccounted space in the file. +
+ When space is needed for file metadata or raw data, + the library requests space from the virtual file driver. +

+

Threshold

This is the free-space section threshold. + The library’s free-space managers will track only + free-space sections with size greater than or equal to + threshold. The default is to track free-space + sections of all sizes.

+

Superblock Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_SUPER allocation type. +

+

B-tree Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_BTREE allocation type. +

+

Raw Data Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_DRAW allocation type. +

+

Global Heap Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_GHEAP allocation type. +

+

Local Heap Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_LHEAP allocation type. +

+

Object Header Free-space Manager Address

This is the address of the free-space manager for + H5FD_MEM_OHDR allocation type. +

+
+
+
+ + +
+
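The following sketch shows one way to read this message. It assumes 1-byte Version and Strategy fields and, following the message description, reads the six manager addresses only when the strategy is 1. `parse_file_space_info` is a hypothetical name; the two size arguments come from the superblock.

```python
# Order of the six free-space manager addresses in the layout table.
ALLOCATION_TYPES = (
    "H5FD_MEM_SUPER", "H5FD_MEM_BTREE", "H5FD_MEM_DRAW",
    "H5FD_MEM_GHEAP", "H5FD_MEM_LHEAP", "H5FD_MEM_OHDR",
)


def parse_file_space_info(buf, size_of_offsets, size_of_lengths):
    """Hypothetical helper: decode a version 0 File Space Info message."""
    version, strategy = buf[0], buf[1]
    if version != 0:
        raise ValueError(f"unsupported file space info version {version}")
    pos = 2
    threshold = int.from_bytes(buf[pos:pos + size_of_lengths], "little")
    pos += size_of_lengths
    managers = {}
    if strategy == 1:  # assumption: addresses are stored only for strategy 1
        for name in ALLOCATION_TYPES:
            managers[name] = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
            pos += size_of_offsets
    return {"strategy": strategy, "threshold": threshold, "managers": managers}
```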

+IV.B. Disk Format: Level 2B - Data Object Data Storage

+ +

The data for an object is stored separately from its header + information in the file and may not actually be located in the HDF5 file + itself if the header indicates that the data is stored externally. The + information for each record in the object is stored according to the + dimensionality of the object (indicated in the dataspace header message). + Multi-dimensional array data is stored in C order; in other words, the + “last” dimension changes fastest.
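To make the C-order rule concrete, the sketch below computes the byte offset of one element within contiguously stored data. It illustrates only the ordering; it ignores chunking, filters, and external storage, and `c_order_offset` is a hypothetical name.

```python
def c_order_offset(index, shape, element_size):
    """Byte offset of element `index` in a C-ordered array of `shape`."""
    if len(index) != len(shape):
        raise ValueError("index and shape must have the same rank")
    linear = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for dimension of size {n}")
        # Each earlier dimension is scaled by the sizes of all later ones,
        # so the last dimension changes fastest.
        linear = linear * n + i
    return linear * element_size
```

For a 2 x 3 x 4 array of 4-byte elements, stepping the last index by one moves 4 bytes, while stepping the middle index by one moves 16 bytes.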

+ +

Data whose elements are composed of atomic datatypes are stored in IEEE + format, unless they are specifically defined as being stored in a different + machine format with the architecture-type information from the datatype + header message. This means that each architecture may need to + byte-swap data values into the internal representation for that particular + machine.

+ +

Data with a variable-length datatype is stored in the global heap + of the HDF5 file. Global heap identifiers are stored in the + data object storage.

+ +

Data whose elements are composed of reference datatypes are stored in + several different ways depending on the particular reference type involved. + Object pointers are just stored as the offset of the object header being + pointed to with the size of the pointer being the same number of bytes as + offsets in the file.

+ +

Dataset region references are stored as a heap-ID which points to +the following information within the file-heap: an offset of the object +pointed to, number-type information (same format as header message), +dimensionality information (same format as header message), sub-set start +and end information (in other words, a coordinate location for each), +and field start and end names (in other words, a [pointer to the] string +indicating the first field included and a [pointer to the] string name +for the last field).

+ +

Data of a compound datatype is stored as a contiguous stream of the items + in the structure, with each item formatted according to its datatype.

+ + + +
+
+
+

+V. Appendix A: Definitions

+ +

Definitions of various terms used in this document are included in +this section.

+ +
+ + + + + + + + + + + + + + + + +
TermDefinition
Undefined AddressThe undefined + address for a file is a file address with all bits + set: in other words, 0xffff...ff.
Unlimited SizeThe unlimited size + for a size is a value with all bits set: in other words, + 0xffff...ff.
+
+ + + +
+
+
+
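Since both definitions depend on the field width, a reader typically derives the sentinel value from the sizes recorded in the superblock. A minimal sketch with hypothetical helper names:

```python
def all_bits_set(size_in_bytes):
    """Value with every bit set for a field of the given width."""
    return (1 << (8 * size_in_bytes)) - 1


def is_undefined_address(address, size_of_offsets):
    """True if `address` is the undefined address for this offset size."""
    return address == all_bits_set(size_of_offsets)


def is_unlimited_size(size, size_of_lengths):
    """True if `size` is the unlimited size for this length size."""
    return size == all_bits_set(size_of_lengths)
```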

+VI. Appendix B: File Memory Allocation Types

+ +

There are six basic types of file memory allocation as follows: +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Basic Allocation TypeDescription
H5FD_MEM_SUPERFile memory allocated for Superblock.
H5FD_MEM_BTREEFile memory allocated for B-tree.
H5FD_MEM_DRAWFile memory allocated for raw data.
H5FD_MEM_GHEAPFile memory allocated for Global Heap.
H5FD_MEM_LHEAPFile memory allocated for Local Heap.
H5FD_MEM_OHDRFile memory allocated for Object Header.
+
+ +

There are other file memory allocation types that are mapped to the +above six basic allocation types because they are similar in nature. +The mapping is listed in the following table: +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Basic Allocation TypeMapping of Allocation Types to Basic Allocation Types
H5FD_MEM_SUPERnone
H5FD_MEM_BTREEH5FD_MEM_SOHM_INDEX
H5FD_MEM_DRAWH5FD_MEM_FHEAP_HUGE_OBJ
H5FD_MEM_GHEAPnone
H5FD_MEM_LHEAPH5FD_MEM_FHEAP_DBLOCK, H5FD_MEM_FSPACE_SINFO
H5FD_MEM_OHDRH5FD_MEM_FHEAP_HDR, H5FD_MEM_FHEAP_IBLOCK, H5FD_MEM_FSPACE_HDR, H5FD_MEM_SOHM_TABLE
+
+ +
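The mapping table above translates directly into a lookup. The sketch uses the type names as plain strings rather than the library's C enumeration values, and `basic_allocation_type` is a hypothetical name.

```python
BASIC_TYPES = frozenset({
    "H5FD_MEM_SUPER", "H5FD_MEM_BTREE", "H5FD_MEM_DRAW",
    "H5FD_MEM_GHEAP", "H5FD_MEM_LHEAP", "H5FD_MEM_OHDR",
})

# Mapped allocation type -> basic allocation type, as listed in the table.
MAPPED_TYPES = {
    "H5FD_MEM_SOHM_INDEX": "H5FD_MEM_BTREE",
    "H5FD_MEM_FHEAP_HUGE_OBJ": "H5FD_MEM_DRAW",
    "H5FD_MEM_FHEAP_DBLOCK": "H5FD_MEM_LHEAP",
    "H5FD_MEM_FSPACE_SINFO": "H5FD_MEM_LHEAP",
    "H5FD_MEM_FHEAP_HDR": "H5FD_MEM_OHDR",
    "H5FD_MEM_FHEAP_IBLOCK": "H5FD_MEM_OHDR",
    "H5FD_MEM_FSPACE_HDR": "H5FD_MEM_OHDR",
    "H5FD_MEM_SOHM_TABLE": "H5FD_MEM_OHDR",
}


def basic_allocation_type(name):
    """Resolve an allocation type name to one of the six basic types."""
    if name in BASIC_TYPES:
        return name
    return MAPPED_TYPES[name]  # raises KeyError for an unknown name
```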

Allocation types that are mapped to basic allocation types are described below: +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Allocation TypeDescription
H5FD_MEM_FHEAP_HDRFile memory allocated for Fractal Heap Header.
H5FD_MEM_FHEAP_DBLOCKFile memory allocated for Fractal Heap Direct Blocks.
H5FD_MEM_FHEAP_IBLOCKFile memory allocated for Fractal Heap Indirect Blocks.
H5FD_MEM_FHEAP_HUGE_OBJFile memory allocated for huge objects in the fractal heap.
H5FD_MEM_FSPACE_HDRFile memory allocated for Free-space Manager Header.
H5FD_MEM_FSPACE_SINFOFile memory allocated for Free-space Section List of the free-space manager.
H5FD_MEM_SOHM_TABLEFile memory allocated for Shared Object Header Message Table.
H5FD_MEM_SOHM_INDEXFile memory allocated for Shared Message Record List.
+
+ + diff --git a/doxygen/examples/H5.format.html b/doxygen/examples/H5.format.html new file mode 100644 index 0000000..e16805f --- /dev/null +++ b/doxygen/examples/H5.format.html @@ -0,0 +1,20400 @@ + + + + HDF5 File Format Specification Version 3.0 + + + + + + + + + + + +
+ + + + + + + +
+
    +
  1. Introduction
  2. + +
      +
    1. This Document
    2. +
    3. Changes for HDF5 1.12
    4. +
    5. Changes for HDF5 1.10
    6. +
    +
    + +
  3. Disk Format: Level 0 - File Metadata
  4. + +
      +
    1. Disk Format: Level 0A - Format Signature + and Superblock
    2. +
    3. Disk Format: Level 0B - File Driver + Info
    4. +
    5. Disk Format: Level 0C - Superblock + Extension
    6. +
    +
    +
  5. Disk Format: Level 1 - File Infrastructure
  6. + +
      +
    1. Disk Format: Level 1A - B-trees and B-tree + Nodes +
        +
      1. Disk Format: Level 1A1 - Version 1 + B-trees
      2. +
      3. Disk Format: Level 1A2 - Version 2 + B-trees
      4. +
      +
    2. +
    3. Disk Format: Level 1B - Group Symbol + Table Nodes
    4. +
    5. Disk Format: Level 1C - Symbol + Table Entry
    6. +
    7. Disk Format: Level 1D - Local Heaps
    8. +
    9. Disk Format: Level 1E - Global Heap
    10. +
    11. Disk Format: Level 1F - Global Heap + Block for Virtual Datasets
    12. +
    13. Disk Format: Level 1G - Fractal Heap
    14. +
    15. Disk Format: Level 1H - Free-space + Manager
    16. +
    17. Disk Format: Level 1I - Shared Object + Header Message Table
    18. +
    +
    +
  7. Disk Format: Level 2 - Data Objects
  8. + +
      +
    1. Disk Format: Level 2A - Data Object Headers
    2. +
        +
      1. Disk Format: Level 2A1 - + Data Object Header Prefix +
          +
        1. Version 1 Data + Object Header Prefix
        2. +
        3. Version 2 Data + Object Header Prefix
        4. +
        +
      2. +
      3. Disk Format: Level 2A2 - + Data Object Header Messages
      4. +
          +
        1. The NIL Message
        2. +
        3. The Dataspace Message
        4. +
        5. The Link Info Message
        6. +
        7. The Datatype Message
        8. +
        9. The Data Storage - + Fill Value (Old) Message
        10. +
        +
      +
    +
    +
+
  +
    +
  1. Disk Format: Level 2 - Data + Objects (Continued)
  2. +
      +
    1. Disk Format: Level 2A - Data Object + Headers (Continued) +
        +
      1. Disk Format: Level 2A2 - + Data Object Header Messages (Continued)
      2. +
          +
        1. The Data Storage - + Fill Value Message
        2. +
        3. The Link Message
        4. +
        5. The Data Storage - + External Data Files Message
        6. +
        7. The Data Layout Message
        8. +
        9. The Bogus Message
        10. +
        11. The Group Info + Message
        12. +
        13. The Data Storage - + Filter Pipeline Message
        14. +
        15. The Attribute + Message
        16. +
        17. The Object Comment + Message
        18. +
        19. The Object + Modification Time (Old) Message
        20. +
        21. The Shared Message + Table Message
        22. +
        23. The Object Header + Continuation Message
        24. +
        25. The Symbol + Table Message
        26. +
        27. The Object + Modification Time Message
        28. +
        29. The B-tree + ‘K’ Values Message
        30. +
        31. The Driver Info + Message
        32. +
        33. The Attribute Info + Message
        34. +
        35. The Object Reference + Count Message
        36. +
        37. The File Space Info + Message
        38. +
        +
      +
    2. +
    3. Disk Format: Level 2B - Data Object Data Storage
    4. +
    +
    +
  3. Appendix A: Definitions
  4. +
  5. Appendix B: File Space Allocation + Types
  6. +
  7. + Appendix C: Types of Indexes for Dataset Chunks
  8. + +
      +
    1. The Single Chunk Index
    2. +
    3. The Implicit Index
    4. +
    5. The Fixed Array Index
    6. +
    7. The Extensible Array Index
    8. +
    9. The Version 2 B-trees Index
    10. +
    +
    +
  9. + Appendix D: Encoding for Dataspace and Reference
  10. + +
      +
    1. Dataspace Encoding
    2. +
    3. Reference Encoding (Revised)
    4. +
    5. Reference Encoding (Backward Compatibility)
    6. +
    +
    +
+
+
+ + +

I. Introduction

+ + + + + + + +
  +
+ HDF5 Groups +
 
  + Figure 1: Relationships among the HDF5 root group, other groups, and objects +
+
 
  + HDF5 Objects +  
  + Figure 2: HDF5 objects -- datasets, datatypes, or dataspaces +
+
 
+ + +

The format of an HDF5 file on disk encompasses several + key ideas of the HDF4 and AIO file formats as well as + addressing some shortcomings therein. The new format is + more self-describing than the HDF4 format and is more + uniformly applied to data objects in the file.

+ +

An HDF5 file appears to the user as a directed graph. + The nodes of this graph are the higher-level HDF5 objects + that are exposed by the HDF5 APIs:

+ +
    +
  • Groups
  • +
  • Datasets
  • +
  • Committed (formerly Named) datatypes
  • +
+ +

At the lowest level, as information is actually written to the disk, + an HDF5 file is made up of the following objects:

+
    +
  • A superblock
  • +
  • B-tree nodes
  • +
  • Heap blocks
  • +
  • Object headers
  • +
  • Object data
  • +
  • Free space
  • +
+ +

The HDF5 Library uses these low-level objects to represent the + higher-level objects that are then presented to the user or + to applications through the APIs. For instance, a group is an + object header that contains a message that points to a local + heap (for storing the links to objects in the group) and to a + B-tree (which indexes the links). A dataset is an object header + that contains messages that describe the datatype, dataspace, + layout, filters, external files, fill value, and other elements + with the layout message pointing to either a raw data chunk or + to a B-tree that points to raw data chunks.

+ + +

I.A. This Document

+ +

This document describes the lower-level data objects; the higher-level objects and their properties are described in the HDF5 User’s Guide.

+ +

Three levels of information comprise the file format. Level 0 contains basic information for identifying and defining information about the file. Level 1 contains information about the pieces of a file shared by many objects in the file (such as B-trees and heaps). Level 2 is the rest of the file and contains all of the data objects, with each object partitioned into header information, also known as metadata, and data.

+ +

The various components of the lower-level data objects are described in pairs of tables. The first table shows the format layout, and the second table describes the fields. The titles of format layout tables begin with “Layout”. The titles of the tables where the fields are described begin with “Fields”. For example, the table that describes the format of the version 2 B-tree header has a title of “Layout: Version 2 B-tree Header”, and the fields in the version 2 B-tree header are described in the table titled “Fields: Version 2 B-tree Header”.

The sizes of various fields in the following layout tables are determined by looking at the number of columns the field spans in the table. There are exceptions:

  • The size may be overridden by specifying a size in parentheses.
  • The size of addresses is determined by the Size of Offsets field in the superblock and is indicated in this document with a superscripted ‘O’.
  • The size of length fields is determined by the Size of Lengths field in the superblock and is indicated in this document with a superscripted ‘L’.
+ +

Values for all fields in this document should be treated as unsigned integers, unless otherwise noted in the description of a field. Additionally, all metadata fields are stored in little-endian byte order.
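The field conventions above can be illustrated with a minimal Python sketch. The helper names `read_uint` and `is_undefined_address` are hypothetical and not part of any HDF5 API. The sketch also assumes that the "undefined address" mentioned throughout this document is the value with all bits set, which is how the upstream specification defines it elsewhere.

```python
def read_uint(buf: bytes, pos: int, size: int) -> int:
    """Decode an unsigned little-endian integer of `size` bytes at `pos`.

    `size` would typically be a fixed width from a layout table, or the
    Size of Offsets / Size of Lengths value read from the superblock.
    """
    if pos < 0 or pos + size > len(buf):
        raise ValueError("field extends past the end of the buffer")
    return int.from_bytes(buf[pos:pos + size], "little", signed=False)


def is_undefined_address(value: int, size_of_offsets: int) -> bool:
    """Assumption: the undefined address is the all-ones bit pattern."""
    return value == (1 << (8 * size_of_offsets)) - 1


# An 8-byte address field holding the value 96.
raw = bytes([0x60, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])
address = read_uint(raw, 0, 8)
```

Using `int.from_bytes` rather than fixed `struct` codes keeps the helper usable for any Size of Offsets or Size of Lengths value, not only 4 or 8.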

+ +

All checksums used in the format are computed with Jenkins’ lookup3 algorithm.

+ +

Whenever a bit flag or field is mentioned for an entry, bits are numbered from the lowest bit position in the entry.

+ +

Various format tables in this document have cells containing “This space inserted only to align table nicely”. These entries only improve the table presentation and do not represent any values or padding in the file.

+ + +

I.B. Changes for HDF5 1.12

+

The following sections have been + changed or added for the 1.12 release:

+ + + + +

I.C. Changes for HDF5 1.10

+ +

The following sections have been + changed or added for the 1.10 release:

+ + + + +

+ II. Disk Format: Level 0 - File Metadata

+ + + +

+ II.A. Disk Format: Level 0A - Format Signature and Superblock

+ +

The superblock may begin at certain predefined offsets within the HDF5 file, allowing a block of unspecified content for users to place additional information at the beginning (and end) of the HDF5 file without limiting the HDF5 Library’s ability to manage the objects within the file itself. This feature was designed to accommodate wrapping an HDF5 file in another file format or adding descriptive information to an HDF5 file without requiring the modification of the actual file’s information. The superblock is located by searching for the HDF5 format signature at byte offset 0, byte offset 512, and at successive locations in the file, each double the previous location; in other words, at these byte offsets: 0, 512, 1024, 2048, and so on.
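The search described above can be sketched in Python as follows. This is an illustration rather than the library's implementation: the function names are hypothetical, the file is assumed to be fully loaded into memory, and the eight signature bytes are those listed under the Format Signature field later in this section.

```python
# Format signature: \211 H D F \r \n \032 \n
HDF5_SIGNATURE = bytes([137, 72, 68, 70, 13, 10, 26, 10])


def candidate_offsets(file_size: int):
    """Yield 0, 512, 1024, 2048, ... while a signature could still fit."""
    offset = 0
    while offset + len(HDF5_SIGNATURE) <= file_size:
        yield offset
        offset = 512 if offset == 0 else offset * 2


def find_superblock(data: bytes):
    """Return the byte offset of the superblock, or None if not found."""
    for offset in candidate_offsets(len(data)):
        if data[offset:offset + 8] == HDF5_SIGNATURE:
            return offset
    return None


# A file with 1024 bytes of user content placed before the HDF5 data.
wrapped = b"\x00" * 1024 + HDF5_SIGNATURE + b"\x00" * 16
superblock_offset = find_superblock(wrapped)
```

Because the candidate offsets double at each step, the number of probes grows only logarithmically with the file size.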

+ +

The superblock is composed of the format signature, followed by a superblock version number and information that is specific to each version of the superblock.

Currently, there are four versions of the superblock format: +

  • Version 0 is the default format.
  • Version 1 is the same as version 0 but adds the “Indexed Storage Internal Node K” field for storing a non-default B-tree ‘K’ value.
  • Version 2 eliminates some fields and compresses others relative to superblock format versions 0 and 1. It adds checksum support and a superblock extension to store additional superblock metadata.
  • Version 3 is the same as version 2 except that the field “File Consistency Flags” is used for file locking. This format version will enable support for the latest version.
+ +

Versions 0 and 1 of the superblock are described below:

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Superblock (Versions 0 and 1) +
byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Version # of File’s Free Space Storage | Version # of Root Group Symbol Table Entry | Reserved (zero)
Version Number of Shared Header Message Format | Size of Offsets | Size of Lengths | Reserved (zero)
Group Leaf Node K | Group Internal Node K
File Consistency Flags
Indexed Storage Internal Node K^1 | Reserved (zero)^1
Base Address^O
Address of File Free space Info^O
End of File Address^O
Driver Information Block Address^O
Root Group Symbol Table Entry
+ + + + + + + + +
  + (Items marked with a ‘1’ in the above table are + new in version 1 of the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Superblock (Versions 0 and 1) +
Field NameDescription

Format Signature

This field contains a constant value and can be used to quickly identify a file as being an HDF5 file. The constant value is designed to allow easy identification of an HDF5 file and to allow certain types of data corruption to be detected. The file signature of an HDF5 file always contains the following values:

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Decimal: | 137 | 72 | 68 | 70 | 13 | 10 | 26 | 10
Hexadecimal: | 89 | 48 | 44 | 46 | 0d | 0a | 1a | 0a
ASCII C Notation: | \211 | H | D | F | \r | \n | \032 | \n
+
+

This signature both identifies the file as an HDF5 file and provides for immediate detection of common file-transfer problems. The first two bytes distinguish HDF5 files on systems that expect the first two bytes to identify the file type uniquely. The first byte is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as an HDF5 file; also, it catches bad file transfers that clear bit 7. Bytes two through four name the format. The CR-LF sequence catches bad file transfers that alter newline sequences. The control-Z character stops file display under MS-DOS. The final line feed checks for the inverse of the CR-LF translation problem. (This is a direct descendant of the PNG file signature.)

+

This field is present in version 0+ of the superblock. +

Version Number of the Superblock

This value is used to determine the format of the information in the superblock. When the format of the information in the superblock is changed, the version number is incremented to the next integer and can be used to determine how the information in the superblock is formatted.

+ +

Values of 0, 1, 2, and 3 are defined for this field (the format of versions 2 and 3 is described below, not here).

+ +

This field is present in version 0+ of the superblock. +

+

Version Number of the File’s Free Space + Information

+

This value is used to determine the format of the + file’s free space information. +

+

The only value currently valid in this field is ‘0’, which + indicates that the file’s free space is as described + below. +

+ +

This field is present in versions 0 and 1 of the + superblock. +

+

Version Number of the Root Group Symbol Table + Entry

This value is used to determine the format of the + information in the Root Group Symbol Table Entry. When the + format of the information in that field is changed, the + version number is incremented to the next integer and can be + used to determine how the information in the field + is formatted.

+

The only value currently valid in this field is ‘0’, + which indicates that the root group symbol table entry is + formatted as described below.

+

This field is present in version 0 and 1 of the + superblock.

+

Version Number of the Shared Header Message Format

This value is used to determine the format of the + information in a shared object header message. Since the format + of the shared header messages differs from the other private + header messages, a version number is used to identify changes + in the format. +

+

The only value currently valid in this field is ‘0’, which + indicates that shared header messages are formatted as + described below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Size of Offsets

This value contains the number of bytes used to store addresses in the file. The values for the addresses of objects in the file are offsets relative to a base address, usually the address of the superblock signature. This allows a wrapper to be added after the file is created without invalidating the internal offset locations.

+ +

This field is present in version 0+ of the superblock. +

+

Size of Lengths

This value contains the number of bytes used to store + the size of an object. +

+

This field is present in version 0+ of the superblock. +

+

Group Leaf Node K

+

Each leaf node of a group B-tree will have at least this many entries but not more than twice this many. If a group has a single leaf node, then it may have fewer entries.

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

Group Internal Node K

+

Each internal node of a group B-tree will have at + least this many entries but not more than twice this + many. If the group has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 0 and 1 of the superblock. +

+

File Consistency Flags

+

This field is unused and should be ignored. +

+

This field is present in version 0+ of the superblock. +

+

Indexed Storage Internal Node K

+

Each internal node of an indexed storage B-tree will have at + least this many entries but not more than twice this + many. If the index storage B-tree has only one internal + node then it might have fewer entries. +

+

This value must be greater than zero. +

+

See the description of B-trees below. +

+ +

This field is present in version 1 of the superblock. +

+

Base Address

+

This is the absolute file address of the first byte of the HDF5 data within the file. The library currently constrains this value to be the absolute file address of the superblock itself when creating new files; future versions of the library may provide greater flexibility. When an existing file is opened and this address does not match the offset of the superblock, the library assumes that the entire contents of the HDF5 file have been adjusted in the file and adjusts the base address and end of file address to reflect their new positions in the file. Unless otherwise noted, all other file addresses are relative to this base address.

+ +

This field is present in version 0+ of the superblock. +

+

Address of Global Free-space Index

+

The file’s free space is not persistent for versions 0 and 1 of the superblock. Currently this field always contains the undefined address.

+ +

This field is present in version 0 and 1 of the superblock. +

+

End of File Address

+

This is the absolute file address of the first byte past the end of all HDF5 data. It is used to determine whether a file has been accidentally truncated and as an address where file data allocation can occur if space from the free list is not used.

+ +

This field is present in version 0+ of the superblock. +

+

Driver Information Block Address

+

This is the relative file address of the file driver information block, which contains driver-specific information needed to reopen the file. If there is no driver information block, then this entry should be the undefined address.

+ +

This field is present in version 0 and 1 of the superblock. +

+

Root Group Symbol Table Entry

+

This is the symbol table entry + of the root group, which serves as the entry point into + the group graph for the file. +

+ +

This field is present in version 0 and 1 of the superblock. +

+
+
+ +
+
+
+
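As an illustration of the version 0 and 1 layout above, the following Python sketch decodes the fixed part of the superblock. The function name and dictionary keys are hypothetical. The field widths are assumptions based on the four-byte-wide layout rows: one byte for each version and size field, two bytes for each K value and for the version 1 reserved field, and four bytes for File Consistency Flags. The root group symbol table entry is not decoded; only its starting position is reported.

```python
import struct

SIGNATURE = b"\x89HDF\r\n\x1a\n"
FIXED = "<8B2HI"  # eight 1-byte fields, two 2-byte K values, 4-byte flags


def parse_superblock_v0_v1(buf: bytes) -> dict:
    if buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    (version, free_space_version, root_entry_version, _reserved0,
     shared_msg_version, size_of_offsets, size_of_lengths, _reserved1,
     group_leaf_k, group_internal_k, flags) = struct.unpack_from(FIXED, buf, 8)
    if version not in (0, 1):
        raise ValueError("not a version 0 or 1 superblock")
    fields = {
        "version": version,
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
        "group_leaf_k": group_leaf_k,
        "group_internal_k": group_internal_k,
    }
    pos = 8 + struct.calcsize(FIXED)
    if version == 1:
        # Indexed Storage Internal Node K plus two reserved bytes.
        fields["indexed_storage_internal_k"], _ = struct.unpack_from("<HH", buf, pos)
        pos += 4
    for name in ("base_address", "free_space_address",
                 "end_of_file_address", "driver_info_block_address"):
        if pos + size_of_offsets > len(buf):
            raise ValueError("buffer too short for address fields")
        fields[name] = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        pos += size_of_offsets
    fields["root_symbol_table_entry_offset"] = pos
    return fields


# Synthetic version 0 superblock with 8-byte offsets and lengths.
UNDEF = 2**64 - 1
sample_v0 = (SIGNATURE + bytes([0, 0, 0, 0, 0, 8, 8, 0])
             + struct.pack("<HHI", 4, 16, 0)
             + struct.pack("<4Q", 0, UNDEF, 2048, UNDEF))
sb_v0 = parse_superblock_v0_v1(sample_v0)
```

The address fields are read with `int.from_bytes` because their width depends on the Size of Offsets value decoded a few bytes earlier.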

Versions 2 and 3 of the superblock are described below:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Superblock (Versions 2 and 3) +
byte | byte | byte | byte
Format Signature (8 bytes)
Version # of Superblock | Size of Offsets | Size of Lengths | File Consistency Flags
Base Address^O
Superblock Extension Address^O
End of File Address^O
Root Group Object Header Address^O
Superblock Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Superblock (Versions 2 and 3) +
Field NameDescription

Format Signature

+

This field is the same as described for versions 0 and 1 of the + superblock. +

Version Number of the Superblock

+

This field has a value of 2 or 3 and has the same meaning as for versions 0 and 1.

+

Size of Offsets

+

This field is the same as described for + versions 0 and 1 of the + superblock. +

+

Size of Lengths

+

This field is the same as described for + versions 0 and 1 of the + superblock. +

+

File Consistency Flags

+

For superblock version 2: This field is unused and should be ignored.

For superblock version 3: This value contains flags to ensure file consistency for file locking. Currently, the following bit flags are defined:

  • Bit 0, if set, indicates that the file has been opened for write access.
  • Bit 1 is reserved for future use.
  • Bit 2, if set, indicates that the file has been opened for single-writer/multiple-reader (SWMR) write access.
  • Bits 3-7 are reserved for future use.

Bit 0 should be set as the first action when a file has been opened for write access. Bit 2 should be set when a file has been opened for SWMR write access. These two bits should be cleared only as the final action when closing a file.

+

This field is present in version 0+ of the superblock. +

+

The size of this field has been reduced from 4 bytes in superblock format versions 0 and 1 to 1 byte.

+

Base Address

+

This field is the same as described for versions 0 and + 1 of the superblock. +

+

Superblock Extension Address

+

This field is the address of the object header for the superblock extension. If there is no extension, then this entry should be the undefined address.

+

End of File Address

+

This field is the same as described for versions 0 and 1 of the + superblock. +

+

Root Group Object Header Address

+

This is the address of the root group object header, which serves as the entry point into the group graph for the file.

+

Superblock Checksum

+

The checksum for the superblock. +

+
+
+ +
+ +
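The version 2 and 3 layout is compact enough to decode in a few lines. The Python sketch below is illustrative only: the function name, constant names, and dictionary keys are hypothetical, and the stored checksum is returned without being verified because a lookup3 implementation is outside the scope of this example. The flag bits follow the File Consistency Flags description above and are interpreted only for version 3.

```python
SIGNATURE = b"\x89HDF\r\n\x1a\n"
FLAG_WRITE_ACCESS = 1 << 0  # bit 0: file opened for write access
FLAG_SWMR_WRITE = 1 << 2    # bit 2: file opened for SWMR write access


def parse_superblock_v2_v3(buf: bytes) -> dict:
    if buf[:8] != SIGNATURE:
        raise ValueError("missing HDF5 format signature")
    if len(buf) < 12:
        raise ValueError("buffer too short for superblock")
    version, size_of_offsets, size_of_lengths, flags = buf[8:12]
    if version not in (2, 3):
        raise ValueError("not a version 2 or 3 superblock")
    if len(buf) < 12 + 4 * size_of_offsets + 4:
        raise ValueError("buffer too short for superblock")
    fields = {
        "version": version,
        "size_of_offsets": size_of_offsets,
        "size_of_lengths": size_of_lengths,
    }
    pos = 12
    for name in ("base_address", "superblock_extension_address",
                 "end_of_file_address", "root_group_object_header_address"):
        fields[name] = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
        pos += size_of_offsets
    # Stored checksum only; it is not recomputed here.
    fields["checksum"] = int.from_bytes(buf[pos:pos + 4], "little")
    if version == 3:
        fields["opened_for_write"] = bool(flags & FLAG_WRITE_ACCESS)
        fields["opened_for_swmr_write"] = bool(flags & FLAG_SWMR_WRITE)
    return fields


# Synthetic version 3 superblock with bits 0 and 2 set in the flags byte.
sample_v3 = (SIGNATURE + bytes([3, 8, 8, 0b101])
             + (0).to_bytes(8, "little")
             + (2**64 - 1).to_bytes(8, "little")
             + (4096).to_bytes(8, "little")
             + (48).to_bytes(8, "little")
             + (0).to_bytes(4, "little"))
sb_v3 = parse_superblock_v2_v3(sample_v3)
```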

+ II.B. Disk Format: Level 0B - File Driver Info

+ +

The driver information block is an optional region of the file which contains information needed by the file driver to reopen a file. The format is described below:

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Driver Information Block +
byte | byte | byte | byte
Version | Reserved
Driver Information Size
Driver Identification (8 bytes)
Driver Information (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Driver Information Block +
Field NameDescription

Version

+

The version number of the Driver Information Block. + This document describes version 0. +

+

Driver Information Size

+

The size in bytes of the Driver Information field. +

+

Driver Identification

+

This is an eight-byte ASCII string without null termination which identifies the driver and/or version number of the Driver Information Block. The predefined driver encoded in this field by the HDF5 Library is identified by the letters NCSA followed by the first four characters of the driver name. If the Driver Information block is not the original version, then the last letter(s) of the identification will be replaced by a version number in ASCII, starting with 0.

The identification for user-defined drivers is also eight bytes long. It can be arbitrary but should be unique and should avoid the four-character prefix “NCSA”.

+

Driver Information

Driver information is stored in a format defined by the + file driver (see description below).
+
+ +
+
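A minimal Python sketch of decoding the driver information block follows. The function name and dictionary keys are hypothetical, and the widths are assumptions drawn from the four-byte-wide layout rows: a 1-byte version, 3 reserved bytes, a 4-byte Driver Information Size, and the 8-byte Driver Identification. The driver-specific payload is returned undecoded.

```python
import struct


def parse_driver_info_block(buf: bytes) -> dict:
    # 1-byte version, 3 reserved bytes, 4-byte size, 8-byte identification.
    version, info_size, ident = struct.unpack_from("<B3xI8s", buf, 0)
    start = struct.calcsize("<B3xI8s")
    if start + info_size > len(buf):
        raise ValueError("driver information extends past the buffer")
    return {
        "version": version,
        "identification": ident.decode("ascii"),
        # Identifiers of predefined library drivers begin with "NCSA".
        "predefined": ident[:4] == b"NCSA",
        "information": buf[start:start + info_size],
    }


# Synthetic block for the family driver with an 8-byte payload.
block = (bytes([0, 0, 0, 0]) + struct.pack("<I", 8) + b"NCSAfami"
         + (2**31).to_bytes(8, "little"))
info = parse_driver_info_block(block)
```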

The two drivers encoded in the Driver Identification field are as follows:

  • Multi driver:

    The identifier for this driver is “NCSAmult” (“NCSA” followed by the first four characters of the driver name). This driver provides a mechanism for segregating raw data and different types of metadata into multiple files. These files are viewed by the library as a single virtual HDF5 file with a single file address space. A maximum of 6 files will be created for the following data: superblock, B-tree, raw data, global heap, local heap, and object header. More than one type of data can be written to the same file.

  • Family driver:

    The identifier for this driver is “NCSAfami” and is encoded in this field for library version 1.8 and after. This driver is designed for systems that do not support files larger than 2 gigabytes by splitting the HDF5 file address space across several smaller files. It does nothing to segregate metadata and raw data; they are mixed in the address space just as they would be in a single contiguous file.

The format of the Driver Information field for the above two drivers is described below:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Multi Driver Information +
byte | byte | byte | byte
Member Mapping | Member Mapping | Member Mapping | Member Mapping
Member Mapping | Member Mapping | Reserved | Reserved
Address of Member File 1
End of Address for Member File 1
Address of Member File 2
End of Address for Member File 2
... ...
Address of Member File N
End of Address for Member File N
Name of Member File 1 (variable size)
Name of Member File 2 (variable size)
... ...
Name of Member File N (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Multi Driver Information +
Field NameDescription

Member Mapping

These fields are integer values from 1 to 6 indicating how the data can be mapped to or merged with another type of data.

Member Mapping | Description
1 | The superblock data.
2 | The B-tree data.
3 | The raw data.
4 | The global heap data.
5 | The local heap data.
6 | The object header data.

For example, if the third field has the value 3 and all the rest have the value 1, it means there are two files: one for raw data, and one for superblock, B-tree, global heap, local heap, and object header.

+

Reserved

These fields are reserved and should always be zero.

Address of Member File N

This field specifies the virtual address at which the member file starts.

+

N is the number of member files.

+

End of Address for Member File N

This field is the end of the allocated address for the member file. +

Name of Member File N

This field is the null-terminated name of the member file, and its length should be a multiple of 8 bytes. Additional bytes will be padded with NULLs. The default naming convention is %s-X.h5, where X is one of the letters s (for superblock), b (for B-tree), r (for raw data), g (for global heap), l (for local heap), and o (for object header). The name of the whole HDF5 file will be substituted for the %s in the string.

+
+
+ +
+
+
+
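The Member Mapping example and the name padding rule from the fields table above can be sketched in Python. This assumes that the i-th Member Mapping field applies to the i-th data type in the order listed in the table, which is consistent with the example given there. The function names are hypothetical.

```python
USAGE_TYPES = ["superblock", "B-tree", "raw data",
               "global heap", "local heap", "object header"]


def group_by_member(mapping):
    """Group the six data types by the member value each is mapped to."""
    if len(mapping) != 6 or any(not 1 <= m <= 6 for m in mapping):
        raise ValueError("expected six member mapping values from 1 to 6")
    members = {}
    for usage, member in zip(USAGE_TYPES, mapping):
        members.setdefault(member, []).append(usage)
    return members


def padded_name(name: str) -> bytes:
    """Null-terminate a member file name and pad it to a multiple of 8 bytes."""
    raw = name.encode("ascii") + b"\x00"
    return raw + b"\x00" * (-len(raw) % 8)


# Third field is 3, all others are 1: raw data gets its own member file.
members = group_by_member([1, 1, 3, 1, 1, 1])
encoded = padded_name("%s-r.h5")
```

Here `members` contains two entries, one holding only the raw data and one holding the five remaining data types, which matches the two-file example in the table.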
+ + + + + + + + + + + + + + +
+ Layout: Family Driver Information +
bytebytebytebyte

Size of Member File

+
+ +
+
+ + + + + + + + + + + +
+ Fields: Family Driver Information +
Field NameDescription

Size of Member File

This field is the size of the member file in the family of files.

+
+ +

+ II.C. Disk Format: Level 0C - Superblock Extension

+ +

The superblock extension is used to store superblock metadata that is either optional or was added after the version of the superblock was defined. Superblock extensions may only exist when version 2 or later of the superblock is used. A superblock extension is an object header which may hold the following messages:

+ + + + +

+ III. Disk Format: Level 1 - File Infrastructure

+ +

+ III.A. Disk Format: Level 1A - B-trees and B-tree Nodes

+ +

B-trees allow flexible storage for objects which tend to grow in ways that cause the object to be stored discontiguously. B-trees are described in various algorithms books including “Introduction to Algorithms” by Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest. B-trees are used in several places in the HDF5 file format when an index is needed for another data structure.

+ +

The version 1 B-tree structure described below is the original index structure. The version 1 B-trees are being phased out in favor of the version 2 B-trees described below. Note that both types of structures may be found in the same file depending on the application settings when creating the file.

+ +

+ III.A.1. Disk Format: Level 1A1 - Version 1 B-trees

+ +

Version 1 B-trees in HDF5 files are an implementation of the B-link tree. The sibling nodes at a particular level in the tree are stored in a doubly-linked list. See the “Efficient Locking for Concurrent Operations on B-trees” paper by Phillip Lehman and S. Bing Yao as published in the ACM Transactions on Database Systems, Vol. 6, No. 4, December 1981.

+ +

The B-trees implemented by the file format contain one more key than the number of children. In other words, each child pointer out of a B-tree node has a left key and a right key. The pointers out of internal nodes point to sub-trees while the pointers out of leaf nodes point to symbol nodes and raw data chunks. Aside from that difference, internal nodes and leaf nodes are identical.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: B-tree Nodes +
byte | byte | byte | byte
Signature
Node Type | Node Level | Entries Used
Address of Left Sibling^O
Address of Right Sibling^O
Key 1 (variable size)
Address of Child 1^O
Key 2 (variable size)
Address of Child 2^O
...
Key 2K (variable size)
Address of Child 2K^O
Key 2K+1 (variable size)
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: B-tree Nodes +
Field NameDescription

Signature

+

The ASCII character string “TREE” + is used to indicate the beginning of a B-tree node. This + gives file consistency checking utilities a better chance + of reconstructing a damaged file. +

+

Node Type

+

Each B-tree points to a particular type of data. This field indicates the type of data as well as implying the maximum degree K of the tree and the size of each Key field.

Node Type | Description
0 | This tree points to group nodes.
1 | This tree points to raw data chunk nodes.

+

Node Level

+

The node level indicates the level at which this node appears in the tree (leaf nodes are at level zero). Not only does the level indicate whether child pointers point to sub-trees or to data, but it can also be used to help file consistency checking utilities reconstruct damaged trees.

+

Entries Used

+

This determines the number of children to which this node points. All nodes of a particular type of tree have the same maximum degree, but most nodes will point to fewer than that number of children. The valid child pointers and keys appear at the beginning of the node and the unused pointers and keys appear at the end of the node. The unused pointers and keys have undefined values.

+

Address of Left Sibling

+

This is the relative file address of the left sibling of + the current node. If the current + node is the left-most node at this level then this field + is the undefined address. +

+

Address of Right Sibling

+

This is the relative file address of the right sibling of + the current node. If the current + node is the right-most node at this level then this + field is the undefined address. +

+

Keys and Child Pointers

+

Each tree has 2K+1 keys with 2K child pointers interleaved between the keys. The number of keys and child pointers actually containing valid values is determined by the node’s Entries Used field. If that field is N, then the B-tree node contains N child pointers and N+1 keys.

+

Key

+

The format and size of the key values is determined by the type of data to which this tree points. The keys are ordered and are boundaries for the contents of the child pointer; that is, the key values represented by child N fall between Key N and Key N+1. Whether the interval is open or closed on each end is determined by the type of data to which the tree points.

+ +

The format of the key depends on the node type. For nodes of node type 0 (group nodes), the key is formatted as follows:

A single field of Size of Lengths bytes: | Indicates the byte offset into the local heap for the first object name in the subtree which that key describes.

For nodes of node type 1 (chunked raw data nodes), the key is formatted as follows:

Bytes 1-4: | Size of chunk in bytes.
Bytes 5-8: | Filter mask, a 32-bit bit field indicating which filters have been skipped for this chunk. Each filter has an index number in the pipeline (starting at 0, with the first filter to apply) and if that filter is skipped, the bit corresponding to its index is set.
(D + 1) 64-bit fields: | The offset of the chunk within the dataset, where D is the number of dimensions of the dataset, and the last value is the offset within the dataset’s datatype and should always be zero. For example, if a chunk in a 3-dimensional dataset begins at the position [5,5,5], there will be three such 64-bit values, each with the value of 5, followed by a 0 value.
+

+ +

Child Pointer

+

The tree node contains file addresses of subtrees or data depending on the node level. Nodes at Level 0 point to data addresses, either raw data chunks or group nodes. Nodes at non-zero levels point to other nodes of the same B-tree.

+

For raw data chunk nodes, the child pointer is the address of a single raw data chunk. For group nodes, the child pointer points to a symbol table, which contains information for multiple symbol table entries.

+
+
+ +
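To make the chunk key format for node type 1 concrete, the following Python sketch decodes one key. It assumes a 4-byte chunk size followed by a 4-byte filter mask and then D + 1 unsigned 64-bit offsets, all little-endian. The function name and dictionary keys are hypothetical.

```python
import struct


def parse_chunk_key(buf: bytes, ndims: int, pos: int = 0) -> dict:
    """Decode a node type 1 (raw data chunk) key starting at `pos`."""
    chunk_size, filter_mask = struct.unpack_from("<II", buf, pos)
    offsets = struct.unpack_from(f"<{ndims + 1}Q", buf, pos + 8)
    return {
        "chunk_size": chunk_size,
        # Pipeline indices of the filters that were skipped for this chunk.
        "skipped_filters": [i for i in range(32) if filter_mask & (1 << i)],
        "chunk_offset": offsets[:-1],
        # Offset within the datatype; expected to be zero.
        "datatype_offset": offsets[-1],
    }


# Key for a chunk at [5,5,5] in a 3-dimensional dataset, with the filter
# at pipeline index 1 skipped.
key_bytes = struct.pack("<II4Q", 4096, 0b10, 5, 5, 5, 0)
key = parse_chunk_key(key_bytes, ndims=3)
```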

+ Conceptually, each B-tree node looks like this:

+
+ + + + + + + + + + + + + +
key[0] child[0] key[1] child[1] key[2] ... ... key[N-1] child[N-1] key[N]
+
+
where child[i] is a pointer to a sub-tree (at a level above Level 0) or to data (at Level 0). Each key[i] describes an item stored by the B-tree (a chunk or an object of a group node). The range of values represented by child[i] is indicated by key[i] and key[i+1].

The following question must next be answered: “Is the value described by key[i] contained in child[i-1] or in child[i]?” The answer depends on the type of tree. In trees for groups (node type 0), the object described by key[i] is the greatest object contained in child[i-1] while in chunk trees (node type 1) the chunk described by key[i] is the least chunk in child[i].

That means that key[0] for group trees is sometimes unused; it points to offset zero in the heap, which is always the empty string and compares as “less-than” any valid object name.

And key[N] for chunk trees is sometimes unused; it contains a chunk offset which compares as “greater-than” any other chunk offset and has a chunk byte size of zero to indicate that it is not actually allocated.
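The two interval conventions can be expressed as a small search routine. The Python sketch below is a simplification: keys are plain comparable values (strings standing in for object names, integers standing in for chunk offsets) rather than the on-disk key encodings, and the function name is hypothetical. For group trees, child[i] covers key[i] < value <= key[i+1]; for chunk trees, child[i] covers key[i] <= value < key[i+1].

```python
from bisect import bisect_left, bisect_right


def child_index(keys, value, node_type):
    """Return i such that child[i] may contain `value`, or None.

    `keys` holds the N+1 valid keys of a node with N children.
    """
    n_children = len(keys) - 1
    if node_type == 0:
        # Group tree: key[i] is the greatest object in child[i-1].
        i = bisect_left(keys, value) - 1
    elif node_type == 1:
        # Chunk tree: key[i] is the least chunk in child[i].
        i = bisect_right(keys, value) - 1
    else:
        raise ValueError("unknown node type")
    return i if 0 <= i < n_children else None


group_keys = ["", "dset3", "grp9"]  # key[0] is the empty string
chunk_keys = [0, 100, 200]          # key[N] is an upper sentinel
```

With these keys, the name "dset3" resolves to child 0 of the group node, while the chunk offset 100 resolves to child 1 of the chunk node, which reflects the difference between the two conventions.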

+ +

+ III.A.2. Disk Format: Level 1A2 - Version 2 B-trees

+ +

Version 2 (v2) B-trees are “traditional” B-trees with one major difference. Instead of just using a simple pointer (or address in the file) to a child of an internal node, the pointer to the child node contains two additional pieces of information: the number of records in the child node itself, and the total number of records in the child node and all its descendants. Storing this additional information allows fast array-like indexing to locate the nth record in the B-tree.
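The array-like indexing can be sketched with an in-memory model in Python. The classes `ChildPointer` and `Node` and the function `nth_record` are hypothetical and do not reflect the on-disk encoding. The sketch assumes the usual B-tree ordering in which an internal node with N records has N+1 children, and record i sorts between child i and child i+1.

```python
from dataclasses import dataclass, field


@dataclass
class ChildPointer:
    node: "Node"
    total_records: int  # records in the child and all of its descendants


@dataclass
class Node:
    records: list
    children: list = field(default_factory=list)  # empty for a leaf node


def nth_record(node: Node, n: int):
    """Return the n-th record (0-based) in sorted order."""
    while True:
        if not node.children:
            return node.records[n]
        for i, child in enumerate(node.children):
            if n < child.total_records:
                node = child.node  # descend without visiting other subtrees
                break
            n -= child.total_records
            if i < len(node.records):
                if n == 0:
                    return node.records[i]
                n -= 1
        else:
            raise IndexError("record index out of range")


leaves = [Node([1, 2]), Node([4, 5]), Node([7, 8, 9])]
root = Node([3, 6], [ChildPointer(leaf, len(leaf.records)) for leaf in leaves])
in_order = [nth_record(root, i) for i in range(9)]
```

Each step skips an entire subtree by subtracting its stored total, so the lookup visits only one node per level.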

+ +

The entry into a version 2 B-tree is a header which contains global information about the structure of the B-tree. The root node address field in the header points to the B-tree root node, which is either an internal or leaf node, depending on the value in the header’s depth field. An internal node consists of records plus pointers to further leaf or internal nodes in the tree. A leaf node consists solely of records. The format of the records depends on the B-tree type (stored in the header).

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree Header +
byte | byte | byte | byte
Signature
Version | Type | This space inserted only to align table nicely
Node Size
Record Size | Depth
Split Percent | Merge Percent | This space inserted only to align table nicely
Root Node Address^O
Number of Records in Root Node | This space inserted only to align table nicely
Total Number of Records in B-tree^L
Checksum
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree Header +
Field NameDescription

Signature

+

The ASCII character string “BTHD” is used to indicate the header of a version 2 (v2) B-tree node.

+

Version

+

The version number for this B-tree header. This document + describes version 0. +

+

Type

+

This field indicates the type of B-tree:

Value | Description
0 | This B-tree is used for testing only. This value should not be used for storing records in actual HDF5 files.
1 | This B-tree is used for indexing indirectly accessed, non-filtered ‘huge’ fractal heap objects.
2 | This B-tree is used for indexing indirectly accessed, filtered ‘huge’ fractal heap objects.
3 | This B-tree is used for indexing directly accessed, non-filtered ‘huge’ fractal heap objects.
4 | This B-tree is used for indexing directly accessed, filtered ‘huge’ fractal heap objects.
5 | This B-tree is used for indexing the ‘name’ field for links in indexed groups.
6 | This B-tree is used for indexing the ‘creation order’ field for links in indexed groups.
7 | This B-tree is used for indexing shared object header messages.
8 | This B-tree is used for indexing the ‘name’ field for indexed attributes.
9 | This B-tree is used for indexing the ‘creation order’ field for indexed attributes.
10 | This B-tree is used for indexing chunks of datasets with no filters and with more than one dimension of unlimited extent.
11 | This B-tree is used for indexing chunks of datasets with filters and more than one dimension of unlimited extent.

+

The format of records for each type is described below.

+

Node Size

+

This is the size in bytes of all B-tree nodes. +

+

Record Size

+

This field is the size in bytes of the B-tree record. +

+

Depth

+

This is the depth of the B-tree. +

+

Split Percent

+

The percent full that a node needs to increase above before it + is split. +

+

Merge Percent

+

The percent full that a node needs to decrease below before it + is merged. +

+

Root Node Address

+

This is the address of the root B-tree node. A B-tree with + no records will have the undefined + address in this field. +

+

Number of Records in Root Node

+

This is the number of records in the root node. +

+

Total Number of Records in B-tree

+

This is the total number of records in the entire B-tree. +

+

Checksum

+

This is the checksum for the B-tree header. +

+
+
+ +
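The header layout above can be decoded with a few lines of code. The following is a minimal sketch, not the library's implementation. It assumes little-endian byte order, and the fixed field widths (1-byte Version and Type, 4-byte Node Size, 2-byte Record Size and Depth, 1-byte Split and Merge Percent, 2-byte Number of Records in Root Node, 4-byte Checksum) are inferred from the column spans of the layout table. The function name and the returned dictionary keys are hypothetical.

```python
import struct


def parse_btree_v2_header(buf, size_of_offsets=8, size_of_lengths=8):
    """Decode a version 2 B-tree header (field widths are assumptions)."""
    if buf[:4] != b"BTHD":
        raise ValueError("missing 'BTHD' signature")
    pos = 4
    fixed = "<BBIHHBB"  # version, type, node size, record size, depth, split %, merge %
    (version, btree_type, node_size, record_size,
     depth, split_percent, merge_percent) = struct.unpack_from(fixed, buf, pos)
    pos += struct.calcsize(fixed)
    # 'O' field: width comes from Size of Offsets in the superblock.
    root_address = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    (root_records,) = struct.unpack_from("<H", buf, pos)
    pos += 2
    # 'L' field: width comes from Size of Lengths in the superblock.
    total_records = int.from_bytes(buf[pos:pos + size_of_lengths], "little")
    pos += size_of_lengths
    (checksum,) = struct.unpack_from("<I", buf, pos)
    return {
        "version": version, "type": btree_type, "node_size": node_size,
        "record_size": record_size, "depth": depth,
        "split_percent": split_percent, "merge_percent": merge_percent,
        "root_address": root_address, "root_records": root_records,
        "total_records": total_records, "checksum": checksum,
    }
```

The sketch only decodes the fields; it does not verify the checksum.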
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree Internal Node +
bytebytebytebyte
Signature
VersionTypeRecords 0, 1, 2...N-1 (variable size)

Child Node Pointer 0O


Number of Records N0 for Child + Node 0 (variable size)

Total Number of Records for Child Node 0 + (optional, variable size)

Child Node Pointer 1O


Number of Records N1 for + Child Node 1 (variable size)

Total Number of Records for Child Node 1 + (optional, variable size)
...

Child Node Pointer NO


Number of Records Nn for + Child Node N (variable size)

Total Number of Records for Child Node N + (optional, variable size)
Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ + +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree Internal Node +
Field NameDescription

Signature

+

The ASCII character string “BTIN” is + used to indicate the internal node of a B-tree. +

+

Version

+

The version number for this B-tree internal node. + This document describes version 0. +

+

Type

+

This field is the type of the B-tree node. It should always + be the same as the B-tree type in the header. +

+

Records

+

The size of this field is determined by the number of records + for this node and the record size (from the header). The format + of records depends on the type of B-tree. +

+

Child Node Pointer

+

This field is the address of the child node pointed to by the + internal node. +

+

Number of Records in Child Node

+

This is the number of records in the child node pointed to by + the corresponding Node Pointer. +

+

The number of bytes used to store this field is determined by + the maximum possible number of records able to be stored in the + child node. +

+

+ The maximum number of records in a child node is computed + in the following way: + +

    +
  • Subtract the fixed-size overhead for + the child node (for example, its signature, version, + and checksum, plus one pointer triplet + of information, because each internal node has one + more pointer triplet than records) + from the size of nodes for the B-tree.
  +
  • Divide that result by the size of a record plus the + pointer triplet of information stored to reach each + child node from this node.
  +
+ +

+

+ Note that leaf nodes do not encode any + child pointer triplets, so the maximum number of records in a + leaf node is just the node size minus the leaf node overhead, + divided by the record size. +

+

+ Also note that the first level of internal nodes above the + leaf nodes do not encode the Total Number of Records in Child + Node value in the child pointer triplets (since it is the + same as the Number of Records in Child Node), so the + maximum number of records in these nodes is computed with the + equation above, but using (Child Pointer, Number of + Records in Child Node) pairs instead of triplets. +

+

+ The number of + bytes used to encode this field is the least number of bytes + required to encode the maximum number of records in a child + node value for the child nodes below this level + in the B-tree. +

+

+ For example, if the maximum number of child records is + 123, one byte will be used to encode these values in this + node; if the maximum number of child records is + 20000, two bytes will be used to encode these values in this + node; and so on. The maximum number of bytes used to + encode these values is 8 (in other words, an unsigned + 64-bit integer). +

+

Total Number of Records in Child Node

+

This is the total number of records for the node pointed to by + the corresponding Node Pointer and all its children. + This field exists only in nodes whose depth in the B-tree + is greater than 1 (in other words, the “twig” + internal nodes, just above leaf nodes, do not store this + field in their child node pointers). +

+

The number of bytes used to store this field is determined by + the maximum possible number of records able to be stored in the + child node and its descendants. +

+

+ The maximum possible number of records able to be stored in a + child node and its descendants is computed iteratively, in the + following way: The maximum number of records in a leaf node + is computed, then that value is used to compute the maximum + possible number of records in the first level of internal nodes + above the leaf nodes. Multiplying these two values together + determines the maximum possible number of records in child node + pointers for the level of nodes two levels above leaf nodes. + This process is continued up to any level in the B-tree. +

+

+ The number of bytes used to encode this value is computed in + the same way as for the Number of Records in Child Node + field. +

+

Checksum

+

This is the checksum for this node. +

+
+
+ +
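The capacity rules in the field descriptions above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the 10-byte node overhead (4-byte signature, 1-byte version, 1-byte type, 4-byte checksum) is inferred from the node layouts, and the function names are hypothetical.

```python
# Assumed fixed overhead of a node: signature, version, type, checksum.
NODE_OVERHEAD = 4 + 1 + 1 + 4


def bytes_to_encode(max_value):
    """Least number of bytes able to hold max_value, capped at 8."""
    return min(8, max(1, (max_value.bit_length() + 7) // 8))


def max_leaf_records(node_size, record_size):
    """Leaf nodes hold records only, with no child pointer entries."""
    return (node_size - NODE_OVERHEAD) // record_size


def max_internal_records(node_size, record_size, pointer_size):
    """pointer_size is the size of one child pointer pair or triplet.

    An internal node has one more pointer entry than records, so one
    entry is counted as part of the overhead.
    """
    return (node_size - NODE_OVERHEAD - pointer_size) // (record_size + pointer_size)


# Example: 512-byte nodes, 11-byte records, 8-byte offsets.
leaf_max = max_leaf_records(512, 11)
pair_size = 8 + bytes_to_encode(leaf_max)  # (Child Pointer, Number of Records) pair
twig_max = max_internal_records(512, 11, pair_size)
```

With these inputs, `leaf_max` is 45, the record count fits in one byte, and `twig_max` is 24. For nodes more than one level above the leaves, `pointer_size` would also include the width of the Total Number of Records field.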
+
+
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree Leaf Node +
bytebytebytebyte
Signature
VersionTypeRecord 0, 1, 2...N-1 (variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree Leaf Node +
Field NameDescription

Signature

+

The ASCII character string “BTLF” + is used to indicate the leaf node of a version 2 (v2) B-tree. +

+

Version

+

The version number for this B-tree leaf node. + This document describes version 0. +

+

Type

+

This field is the type of the B-tree node. It should always + be the same as the B-tree type in the header. +

+

Records

+

The size of this field is determined by the number of records + for this node and the record size (from the header). The format + of records depends on the type of B-tree. +

+

Checksum

+

This is the checksum for this node. +

+
+
+ +
+
+
+

The record layout for each stored (in other words, non-testing) + B-tree type is as follows:

+ +
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 1 Record Layout - Indirectly + Accessed, Non-filtered, ‘Huge’ Fractal Heap Objects +
bytebytebytebyte

Huge Object AddressO


Huge Object LengthL


Huge Object IDL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 1 Record Layout - Indirectly + Accessed, Non-filtered, ‘Huge’ Fractal Heap Objects +
Field NameDescription

Huge Object Address

+

The address of the huge object in the file. +

+

Huge Object Length

+

The length of the huge object in the file. +

+

Huge Object ID

+

The heap ID for the huge object. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 2 Record Layout - Indirectly + Accessed, Filtered, ‘Huge’ Fractal Heap Objects +
bytebytebytebyte

Filtered Huge Object AddressO


Filtered Huge Object LengthL

Filter Mask

Filtered Huge Object Memory SizeL


Huge Object IDL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 2 Record Layout - Indirectly + Accessed, Filtered, ‘Huge’ Fractal Heap Objects +
Field NameDescription

Filtered Huge Object Address

+

The address of the filtered huge object in the file. +

+

Filtered Huge Object Length

+

The length of the filtered huge object in the file. +

+

Filter Mask

+

A 32-bit bit field indicating which filters have been skipped for + this object. Each filter has an index number in the pipeline + (starting at 0, with the first filter to apply) and if that + filter is skipped, the bit corresponding to its index is set. +

+

Filtered Huge Object Memory Size

+

The size of the de-filtered huge object in memory. +

+

Huge Object ID

+

The heap ID for the huge object. +

+
+
+ +
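The Filter Mask field described above can be read bit by bit. A minimal sketch, with a hypothetical function name; the pipeline length would come from the filter pipeline description and is passed in here as a plain integer:

```python
def skipped_filters(filter_mask, pipeline_length):
    """Return the pipeline indexes whose bit is set in the 32-bit filter mask."""
    return [index for index in range(pipeline_length)
            if filter_mask & (1 << index)]
```

For example, a mask of `0b101` on a three-filter pipeline means that filters 0 and 2 were skipped and filter 1 was applied.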
+
+
+
+ + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 3 Record Layout - Directly + Accessed, Non-filtered, ‘Huge’ Fractal Heap Objects +
bytebytebytebyte

Huge Object AddressO


Huge Object LengthL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 3 Record Layout - Directly + Accessed, Non-filtered, ‘Huge’ Fractal Heap Objects +
Field NameDescription

Huge Object Address

+

The address of the huge object in the file. +

+

Huge Object Length

+

The length of the huge object in the file. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 4 Record Layout - Directly + Accessed, Filtered, ‘Huge’ Fractal Heap Objects +
bytebytebytebyte

Filtered Huge Object AddressO


Filtered Huge Object LengthL

Filter Mask

Filtered Huge Object Memory SizeL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 4 Record Layout - Directly + Accessed, Filtered, ‘Huge’ Fractal Heap Objects +
Field NameDescription

Filtered Huge Object Address

+

The address of the filtered huge object in the file. +

+

Filtered Huge Object Length

+

The length of the filtered huge object in the file. +

+

Filter Mask

+

A 32-bit bit field indicating which filters have been skipped for + this object. Each filter has an index number in the pipeline + (starting at 0, with the first filter to apply) and if that + filter is skipped, the bit corresponding to its index is set. +

+

Filtered Huge Object Memory Size

+

The size of the de-filtered huge object in memory. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 5 Record Layout - Link Name + for Indexed Group +
bytebytebytebyte
Hash of Name
ID (bytes 1-4)
ID (bytes 5-7)
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 5 Record Layout - Link Name + for Indexed Group +
Field NameDescription

Hash

+

This field is the hash value of the name for the link. The hash + value is computed by applying Jenkins’ lookup3 checksum algorithm to + the link’s name. +

+

ID

+

This 7-byte sequence is the heap ID for the + link record in the group’s fractal heap.

+
+
+ +
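A Type 5 record as laid out above is 11 bytes long: a 4-byte hash followed by a 7-byte heap ID. The sketch below splits a record into those two parts. It assumes little-endian byte order for the hash, and the function name is hypothetical. It does not compute the lookup3 hash itself.

```python
import struct


def parse_type5_record(record):
    """Split a Type 5 record into (hash of name, 7-byte heap ID)."""
    if len(record) != 11:
        raise ValueError("a Type 5 record is expected to be 11 bytes")
    (name_hash,) = struct.unpack_from("<I", record, 0)
    heap_id = bytes(record[4:11])  # opaque ID into the group's fractal heap
    return name_hash, heap_id
```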
+
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 6 Record Layout - Creation + Order for Indexed Group +
bytebytebytebyte

Creation Order + (8 bytes)

ID (bytes 1-4)
ID (bytes 5-7)
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 6 Record Layout - Creation + Order for Indexed Group +
Field NameDescription

Creation Order

+

This field is the creation order value for the link. +

+

ID

+

This 7-byte sequence is the heap ID for the + link record in the group’s fractal heap.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 7 Record Layout - Shared + Object Header Messages (Sub-type 0 - Message in Heap) +
bytebytebytebyte
Message LocationThis space inserted only to align table nicely
Hash
Reference Count

Heap ID (8 bytes)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 7 Record Layout - Shared + Object Header Messages (Sub-type 0 - Message in Heap) +
Field NameDescription

Message Location

+

This field indicates the location where the message is stored: + + + + + + + + + + + + + +
ValueDescription
0Shared message is stored in shared message index heap. +
1Shared message is stored in object header. +

+

Hash

+

This field is the hash value of the shared message. The hash + value is computed by applying Jenkins’ lookup3 checksum algorithm to + the shared message.

+

Reference Count

+

The number of objects which reference this message.

+

Heap ID

+

This 8-byte sequence is the heap ID for the + shared message in the shared message index’s fractal heap.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 7 Record Layout - Shared + Object Header Messages (Sub-type 1 - Message in Object Header) +
bytebytebytebyte
Message LocationThis space inserted only to align table nicely
Hash
Reserved (zero)Message TypeObject Header Index

Object Header AddressO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 7 Record Layout - Shared + Object Header Messages (Sub-type 1 - Message in Object Header) +
Field NameDescription

Message Location

+

This field indicates the location where the message is stored: + + + + + + + + + + + + + +
ValueDescription
0Shared message is stored in shared message index heap. +
1Shared message is stored in object header. +

+

Hash

+

This field is the hash value of the shared message. The hash + value is computed by applying Jenkins’ lookup3 checksum algorithm to + the shared message.

+

Message Type

+

The object header message type of the shared message.

+

Object Header Index

+

This field indicates that the shared message is the nth message + of its type in the specified object header.

+

Object Header Address

+

The address of the object header containing the shared message.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 8 Record Layout - Attribute + Name for Indexed Attributes +
bytebytebytebyte

Heap ID (8 bytes)

Message FlagsThis space inserted only to align table nicely
Creation Order
Hash of Name
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 8 Record Layout - Attribute + Name for Indexed Attributes +
Field NameDescription

Heap ID

+

This 8-byte sequence is the heap ID for the + attribute in the object’s attribute fractal heap.

+

Message Flags

The object header message flags for the attribute message.

+

Creation Order

+

This field is the creation order value for the attribute. +

+

Hash

+

This field is the hash value of the name for the attribute. The hash + value is computed by applying Jenkins’ lookup3 checksum algorithm to + the attribute’s name. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 B-tree, Type 9 Record Layout - Creation + Order for Indexed Attributes +
bytebytebytebyte

Heap ID (8 bytes)

Message Flags + This space inserted only to align table nicely
Creation Order
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 9 Record Layout - Creation + Order for Indexed Attributes +
Field NameDescription

Heap ID

+

This 8-byte sequence is the heap ID for the + attribute in the object’s attribute fractal heap.

+

Message Flags

+

The object header message flags for the attribute message.

+

Creation Order

+

This field is the creation order value for the attribute. +

+
+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + Layout: Version 2 B-tree, Type 10 Record Layout - + Non-filtered Dataset Chunks +
bytebytebytebyte

AddressO


Dimension 0 Scaled Offset + (8 bytes)


Dimension 1 Scaled Offset + (8 bytes)


...


Dimension #n Scaled Offset + (8 bytes)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 10 Record Layout - + Non-filtered Dataset Chunks +
Field NameDescription

Address

+

This field is the address of the dataset chunk in the file.

+

Dimension #n Scaled Offset

+

This field is the scaled offset of the chunk within the + dataset. n is the number of dimensions for the + dataset. The first scaled offset stored in the list is for + the slowest changing dimension, and the last scaled offset + stored is for the fastest changing dimension. Each scaled offset + is calculated by dividing the chunk offset in that dimension by + the chunk dimension size.

+
+
+ +
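The scaled offset calculation and the Type 10 record layout can be sketched as below. The function names are hypothetical, little-endian byte order is assumed, and chunk offsets are taken to be element coordinates of the chunk's first element, listed from the slowest changing dimension to the fastest.

```python
def scaled_offsets(chunk_offsets, chunk_dims):
    """Divide each chunk offset by the chunk size in that dimension."""
    if len(chunk_offsets) != len(chunk_dims):
        raise ValueError("one offset is needed per dimension")
    return tuple(offset // dim for offset, dim in zip(chunk_offsets, chunk_dims))


def pack_type10_record(address, chunk_offsets, chunk_dims, size_of_offsets=8):
    """Encode a Type 10 record: chunk address, then one 8-byte scaled offset per dimension."""
    record = address.to_bytes(size_of_offsets, "little")
    for scaled in scaled_offsets(chunk_offsets, chunk_dims):
        record += scaled.to_bytes(8, "little")
    return record
```

For a dataset with 10 x 100 chunks, the chunk starting at element (20, 300) has scaled offsets (2, 3), and its record occupies 8 + 2 * 8 = 24 bytes when Size of Offsets is 8.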
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + Layout: Version 2 B-tree, Type 11 Record Layout - Filtered + Dataset Chunks +
bytebytebytebyte

AddressO


Chunk Size + (variable size; at most 8 bytes)

Filter Mask

Dimension 0 Scaled Offset + (8 bytes)


Dimension 1 Scaled Offset + (8 bytes)


...


Dimension #n Scaled Offset + (8 bytes)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 B-tree, Type 11 Record Layout - Filtered + Dataset Chunks +
Field NameDescription

Address

+

This field is the address of the dataset chunk in the file.

+

Chunk Size

+

This field is the size of the dataset chunk in bytes.

+

Filter Mask

+

This field is the filter mask, which indicates the filters + that were skipped for the dataset chunk. Each filter has an index + number in the pipeline and if that filter is skipped, + the bit corresponding to its index is set.

+

Dimension #n Scaled Offset

+

This field is the scaled offset of the chunk within + the dataset. n is the number of dimensions for + the dataset. The first scaled offset stored in the list + is for the slowest changing dimension, and the last scaled + offset stored is for the fastest changing dimension.

+
+
+ +

+ III.B. Disk Format: Level 1B - Group Symbol Table Nodes

+ +

A group is an object internal to the file that allows + arbitrary nesting of objects within the file (including other + groups). A group maps a set of link names in the group to a set + of relative file addresses of objects in the file. Certain metadata + for an object to which the group points can be cached in the + group’s symbol table entry in addition to being in the + object’s header.

+ +

An HDF5 object name space can be stored hierarchically by + partitioning the name into components and storing each + component as a link in a group. The link for a + non-ultimate component points to the group containing + the next component. The link for the last + component points to the object being named.

+ +

One implementation of a group is a collection of symbol table + nodes indexed by a B-tree. Each symbol table node contains entries + for one or more links. If an attempt is made to add a link to an + already full symbol table node containing 2K entries, then + the node is split and one node contains K symbols and the + other contains K+1 symbols.
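The split rule can be illustrated with a small sketch. It models a symbol table node as a sorted Python list of link names, which is a simplification, and it assumes that the lower node keeps K entries and the upper node receives K+1; the text above does not say which node gets the extra entry. The function name is hypothetical.

```python
import bisect


def insert_link(node, name, k):
    """Insert name into a sorted node; return one node, or two after a split."""
    if len(node) < 2 * k:
        bisect.insort(node, name)   # room left: insert in sorted position
        return [node]
    merged = sorted(node + [name])  # a full node plus the new link: 2K + 1 entries
    return [merged[:k], merged[k:]]  # K entries and K + 1 entries
```

With K = 2, adding "c" to the full node ["a", "b", "d", "e"] yields ["a", "b"] and ["c", "d", "e"].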

+ +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Symbol Table Node (A Leaf of a B-tree) +
bytebytebytebyte
Signature
Version NumberReserved (zero)Number of Symbols


Group Entries


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Symbol Table Node (A Leaf of a B-tree) +
Field NameDescription

Signature

+

The ASCII character string “SNOD” is + used to indicate the + beginning of a symbol table node. This gives file + consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version Number

+

The version number for the symbol table node. This + document describes version 1. (There is no version ‘0’ + of the symbol table node.) +

+

Number of Entries

+

Although all symbol table nodes have the same length, + most contain fewer than the maximum possible number of + link entries. This field indicates how many entries + contain valid data. The valid entries are packed at the + beginning of the symbol table node while the remaining + entries contain undefined values. +

+

Symbol Table Entries

+

Each link has an entry in the symbol table node. + The format of the entry is described below. + There are 2K entries in each group node, where + K is the “Group Leaf Node K” value from the + superblock. +

+
+
+ +

+ III.C. Disk Format: Level 1C - Symbol Table Entry

+ +

Each symbol table entry in a symbol table node is designed + to allow for very fast browsing of stored objects. + Toward that design goal, the symbol table entries + include space for caching certain constant metadata from the + object header.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Symbol Table Entry +
bytebytebytebyte

Link Name OffsetO


Object Header AddressO

Cache Type
Reserved (zero)


Scratch-pad Space + (16 bytes)


+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Symbol Table Entry +
Field NameDescription

Link Name Offset

+

This is the byte offset into the group’s local + heap for the name of the link. The name is null + terminated. +

+

Object Header Address

+

Every object has an object header which serves as a + permanent location for the object’s metadata. In addition + to appearing in the object header, some of the object’s metadata + can be cached in the scratch-pad space. +

+

Cache Type

+

The cache type is determined from the object header. + It also determines the format for the scratch-pad space: + + + + + + + + + + + + + + + + + + +
TypeDescription
0No data is cached by the group entry. This + is guaranteed to be the case when an object header + has a link count greater than one. +
1Group object header metadata is cached in the + scratch-pad space. This implies that the symbol table + entry refers to another group. +
2The entry is a symbolic link. The first four bytes + of the scratch-pad space are the offset into the local + heap for the link value. The object header address + will be undefined. +

+ +

Reserved

+

These four bytes are present so that the scratch-pad + space is aligned on an eight-byte boundary. They are + always set to zero. +

+

Scratch-pad Space

+

This space is used for different purposes, depending + on the value of the Cache Type field. Any metadata + about an object represented in the scratch-pad + space is duplicated in the object header for that + object. +

+

+ Furthermore, no data is cached in the group + entry scratch-pad space if the object header for + the object has a link count greater than one. +

+
+
+ +
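A symbol table entry with the layout above can be decoded as follows. This is a minimal sketch: little-endian byte order is assumed, link names are decoded as UTF-8 (an assumption), and the function name and dictionary keys are hypothetical. `heap_data` stands for the data segment of the group's local heap, already read into memory.

```python
import struct


def parse_symbol_table_entry(buf, heap_data, size_of_offsets=8):
    """Decode one symbol table entry and look up its link name in the local heap."""
    pos = 0
    name_offset = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    header_address = int.from_bytes(buf[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    cache_type, _reserved = struct.unpack_from("<II", buf, pos)
    pos += 8
    scratch_pad = bytes(buf[pos:pos + 16])
    # The link name is null terminated inside the heap data segment.
    end = heap_data.index(b"\x00", name_offset)
    name = heap_data[name_offset:end].decode("utf-8")
    return {
        "name": name,
        "object_header_address": header_address,
        "cache_type": cache_type,
        "scratch_pad": scratch_pad,
    }
```

The scratch-pad bytes are returned undecoded, since their meaning depends on the Cache Type field.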

Format of the Scratch-pad Space

+ +

The symbol table entry scratch-pad space is formatted + according to the value in the Cache Type field.

+ +

If the Cache Type field contains the value zero + (0) then no information is + stored in the scratch-pad space.

+ +

If the Cache Type field contains the value one + (1), then the scratch-pad space + contains cached metadata for another object header + in the following format:

+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Object Header Scratch-pad Format +
bytebytebytebyte

Address of B-treeO


Address of Name HeapO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Object Header Scratch-pad Format +
Field NameDescription

Address of B-tree

+

This is the file address for the root of the + group’s B-tree. +

+

Address of Name Heap

+

This is the file address for the group’s local + heap, in which are stored the group’s symbol names. +

+
+
+ + +
+
+
+

If the Cache Type field contains the value two + (2), then the scratch-pad space + contains cached metadata for a symbolic link + in the following format:

+ +
+ + + + + + + + + + + + + +
+ Layout: Symbolic Link Scratch-pad Format +
bytebytebytebyte
Offset to Link Value
+
+ +
+
+ + + + + + + + + + + +
+ Fields: Symbolic Link Scratch-pad Format +
Field NameDescription

Offset to Link Value

+

The value of a symbolic link (that is, the name of the + thing to which it points) is stored in the local heap. + This field is the 4-byte offset into the local heap for + the start of the link value, which is null terminated. +

+
+
+ +
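The three scratch-pad formats can be told apart with a small dispatch on the Cache Type value. The sketch below assumes little-endian byte order; the function name and dictionary keys are hypothetical.

```python
def decode_scratch_pad(cache_type, scratch_pad, size_of_offsets=8):
    """Interpret the 16-byte scratch-pad space according to the Cache Type."""
    if cache_type == 0:
        return {}  # nothing is cached
    if cache_type == 1:
        # Object header scratch-pad: two 'O'-sized addresses.
        btree = int.from_bytes(scratch_pad[:size_of_offsets], "little")
        heap = int.from_bytes(
            scratch_pad[size_of_offsets:2 * size_of_offsets], "little")
        return {"btree_address": btree, "name_heap_address": heap}
    if cache_type == 2:
        # Symbolic link scratch-pad: 4-byte offset into the local heap.
        return {"link_value_offset": int.from_bytes(scratch_pad[:4], "little")}
    raise ValueError("unknown cache type: %d" % cache_type)
```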

+ III.D. Disk Format: Level 1D - Local Heaps

+ +

A local heap is a collection of small pieces of data that are particular + to a single object in the HDF5 file. Objects can be + inserted into and removed from the heap at any time. + The address of a heap does not change once the heap is created. + For example, a group stores addresses of objects in symbol table nodes + with the names of links stored in the group’s local heap. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Local Heap +
bytebytebytebyte
Signature
VersionReserved (zero)

Data Segment SizeL


Offset to Head of Free-listL


Address of Data SegmentO

+ + + + + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Local Heap +
Field NameDescription

Signature

+

The ASCII character string “HEAP” + is used to indicate the + beginning of a heap. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

Each local heap has its own version number so that new + heaps can be added to old files. This document + describes version zero (0) of the local heap. +

+

Data Segment Size

+

The total amount of disk space allocated for the heap + data. This may be larger than the amount of space + required by the objects stored in the heap. The extra + unused space in the heap holds a linked list of free blocks. +

+

Offset to Head of Free-list

+

This is the offset within the heap data segment of the + first free block (or the + undefined address if there is no + free block). The free block contains + Size of Lengths bytes that + are the offset of the next free block (or the + value ‘1’ if this is the + last free block) followed by Size of Lengths bytes that store + the size of this free block. The size of the free block includes + the space used to store the offset of the next free block and + the size of the current block, making the minimum size of a free + block 2 * Size of Lengths. +

+

Address of Data Segment

+

The data segment originally starts immediately after + the heap header, but if the data segment must grow as a + result of adding more objects, then the data segment may + be relocated, in its entirety, to another part of the + file. +

+
+
+ +

Objects within a local heap should be aligned on an 8-byte boundary.
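The free-list structure described for the Offset to Head of Free-list field can be walked as in the sketch below. It assumes little-endian byte order and that the undefined address is encoded with all bits set; the function name is hypothetical. `data_segment` is the heap's data segment, already read into memory.

```python
def walk_free_list(data_segment, head_offset, size_of_lengths=8):
    """Return (offset, size) for each free block in a local heap data segment."""
    undefined = (1 << (8 * size_of_lengths)) - 1  # assumed encoding: all bits set
    blocks = []
    seen = set()
    offset = head_offset
    while offset != undefined:
        if offset in seen:
            raise ValueError("free list contains a cycle")
        seen.add(offset)
        next_offset = int.from_bytes(
            data_segment[offset:offset + size_of_lengths], "little")
        size = int.from_bytes(
            data_segment[offset + size_of_lengths:offset + 2 * size_of_lengths],
            "little")
        if size < 2 * size_of_lengths:
            raise ValueError("free block is smaller than its own header")
        blocks.append((offset, size))
        if next_offset == 1:  # the value 1 marks the last free block
            break
        offset = next_offset
    return blocks
```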

+ +

+ III.E. Disk Format: Level 1E - Global Heap

+ +

Each HDF5 file has a global heap which stores various types of + information which is typically shared between datasets. The + global heap was designed to satisfy these goals:

+ +
    +
  A. Repeated access to a heap object must be efficient without + resulting in repeated file I/O requests. Since global heap + objects will typically be shared among several datasets, it is + probable that the object will be accessed repeatedly.
  +
  B. Collections of related global heap objects should result in + fewer and larger I/O requests. For instance, a dataset of + object references will have a global heap object for each + reference. Reading the entire set of object references + should result in a few large I/O requests instead of one small + I/O request for each reference.
  +
  C. It should be possible to remove objects from the global heap + and the resulting file hole should be eligible to be reclaimed + for other uses.
  +
+ + +

The implementation of the heap makes use of the memory management + already available at the file level and combines that with a new + object called a collection to achieve goal B. The global heap + is the set of all collections. Each global heap object belongs to + exactly one collection, and each collection contains one or more global + heap objects. For the purposes of disk I/O and caching, a collection is + treated as an atomic object, addressing goal A. +

+ +

When a global heap object is deleted from a collection (which + occurs when its reference count falls to zero), objects located + after the deleted object in the collection are packed down toward + the beginning of the collection, and the collection’s + global heap object 0 is created (if possible), or its size is + increased to account for the recently freed space. There are + no gaps between objects in each collection, with the possible + exception of the final space in the collection, if it is not + large enough to hold the header for the collection’s + global heap object 0. These features address goal C. +

+ +

The HDF5 Library creates global heap collections as needed, so there may + be multiple collections throughout the file. The set of all of them is + abstractly called the “global heap”, although they do not actually link + to each other, and there is no global place in the file where you can + discover all of the collections. The collections are found simply by + finding a reference to one through another object in the file. For + example, data of variable-length datatype elements is stored in the + global heap and is accessed via a global heap ID. The format for + global heap IDs is described at the end of this section. +

+ +

For more information on global heaps for virtual datasets, see + “Disk Format: Level 1F - Global Heap + Block for Virtual Datasets.”

+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: A Global Heap Collection +
bytebytebytebyte
Signature
VersionReserved (zero)

Collection SizeL


Global Heap Object 1


Global Heap Object 2


...


Global Heap Object N


Global Heap Object 0 (free space)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: A Global Heap Collection +
Field Name | Description

Signature

+

The ASCII character string “GCOL” + is used to indicate the + beginning of a collection. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

Each collection has its own version number so that new + collections can be added to old files. This document + describes version one (1) of the collections (there is no + version zero (0)). +

+

Collection Size

+

This is the size in bytes of the entire collection + including this field. The default (and minimum) + collection size is 4096 bytes which is a typical file + system block size. This allows for 127 16-byte heap + objects plus their overhead (the collection header of 16 bytes + and the 16 bytes of information about each heap object). +

+

Global Heap Object 1 through N

+

The objects are stored in any order with no + intervening unused space. +

+

Global Heap Object 0

+

Global Heap Object 0 (zero), when present, represents the free + space in the collection. Free space always appears at the end of + the collection. If the free space is too small to store the header + for Object 0 (described below) then the header is implied and is not + written. +

+ The field Object Size for Object 0 indicates the + amount of possible free space in the collection including the 16-byte + header size of Object 0. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Global Heap Object +
byte | byte | byte | byte
Heap Object Index | Reference Count
Reserved (zero)

Object Size L


Object Data

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Global Heap Object +
Field Name | Description

Heap Object Index

+

Each object has a unique identification number within a + collection. The identification numbers are chosen so that + new objects have the smallest value possible with the + exception that the identifier 0 always refers to the + object which represents all free space within the + collection. +

+

Reference Count

+

All heap objects have a reference count field. An + object which is referenced from some other part of the + file will have a positive reference count. The reference + count for Object 0 is always zero. +

+

Reserved

+

Zero padding to align next field on an 8-byte boundary. +

+

Object Size

+

This is the size of the object data stored for the object. + The actual storage space allocated for the object data is rounded + up to a multiple of eight. +

+

Object Data

+

The object data is treated as a one-dimensional array + of bytes to be interpreted by the caller. +

+
+ +
+ +
+
+
+
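The collection and object layouts above can be walked with a few lines of code. The following is a minimal sketch, not the library's implementation: it assumes little-endian integers, a Size of Lengths of 8 bytes by default, and that the whole collection is already in memory. The names `parse_global_heap_collection` and `build_demo_collection` are hypothetical helpers introduced only for illustration.

```python
import struct

def parse_global_heap_collection(buf, size_of_lengths=8):
    """Split one global heap collection into its objects.

    Returns (version, collection_size, objects, free_space), where objects
    maps Heap Object Index -> (reference_count, data).
    """
    if buf[:4] != b"GCOL":
        raise ValueError("missing GCOL signature")
    version = buf[4]                       # bytes 5..7 are reserved (zero)
    coll_size = int.from_bytes(buf[8:8 + size_of_lengths], "little")
    obj_header = 8 + size_of_lengths       # index, ref count, reserved, size
    pos = 8 + size_of_lengths              # first object follows the header
    objects, free_space = {}, 0
    while pos + obj_header <= coll_size:
        index, ref_count = struct.unpack_from("<HH", buf, pos)
        size = int.from_bytes(buf[pos + 8:pos + obj_header], "little")
        if index == 0:
            free_space = size              # includes Object 0's own header
            break
        start = pos + obj_header
        objects[index] = (ref_count, bytes(buf[start:start + size]))
        pos = start + (size + 7) // 8 * 8  # storage is rounded up to 8 bytes
    return version, coll_size, objects, free_space

def build_demo_collection(size_of_lengths=8, coll_size=4096):
    """Assemble a tiny collection holding one 5-byte object (demo data only)."""
    out = bytearray(b"GCOL" + bytes([1, 0, 0, 0]))
    out += coll_size.to_bytes(size_of_lengths, "little")
    payload = b"hello"
    out += struct.pack("<HHI", 1, 1, 0) + len(payload).to_bytes(size_of_lengths, "little")
    out += payload + bytes(-len(payload) % 8)
    free = coll_size - len(out)
    out += struct.pack("<HHI", 0, 0, 0) + free.to_bytes(size_of_lengths, "little")
    out += bytes(coll_size - len(out))
    return bytes(out)
```

If the remaining space is too small to hold the header of Object 0, the loop simply ends and the sketch reports zero free space, matching the "implied header" case described above.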

+ + The format for the ID used to locate an object in the global heap is + described here:

+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Global Heap ID +
byte | byte | byte | byte

Collection Address O

Object Index
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Global Heap ID +
Field Name | Description

Collection Address

+

This field is the address of the global heap collection + where the data object is stored. +

+

Object Index

+

This field is the index of the data object within the + global heap collection. +

+
+
+ + + +
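As an illustration of the Global Heap ID layout, the sketch below splits a raw ID into its two fields. It assumes little-endian encoding, a Size of Offsets of 8 bytes by default, and a 4-byte Object Index (one full row of the layout table); `decode_global_heap_id` is a hypothetical name.

```python
def decode_global_heap_id(raw, size_of_offsets=8):
    """Return (collection_address, object_index) from a raw global heap ID."""
    address = int.from_bytes(raw[:size_of_offsets], "little")
    index = int.from_bytes(raw[size_of_offsets:size_of_offsets + 4], "little")
    return address, index
```

The returned address locates the collection; the index is then matched against the Heap Object Index of each object inside that collection.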

III.F. Disk Format: Level 1F - Global + Heap Block for Virtual Datasets

+ +

The layout for the global heap block used with virtual datasets is + described below. For more information on global heaps, see + “Disk Format: Level 1E - Global Heap.”

+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Global Heap Block for Virtual Dataset +
byte | byte | byte | byte
Version | This space inserted + only to align table nicely

Num Entries L


Source Filename #1 (variable size)


Source Dataset #1 (variable + size)


Source Selection #1 (variable + size)


Virtual Selection #1 (variable + size)

.
.
.

Source Filename #n (variable + size)


Source Dataset #n (variable + size)


Source Selection #n (variable + size)


Virtual Selection #n (variable + size)

Checksum
+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Global Heap Block for Virtual Dataset +
Field Name | Description

Version

+

The version number for the block; the value is 0.

+

Num Entries

The number of entries in the block.

+

Source Filename #n

+

The source file name where the source dataset is located. +

+

Source Dataset #n

The source dataset name that is mapped to the + virtual dataset.

Source Selection #n

+

The dataspace selection in the + source dataset that is mapped to the virtual selection. +

+

Virtual Selection #n

+

This is the dataspace selection in the virtual dataset that is + mapped to the source selection. +

+

Checksum

+

This is the checksum for the block.

+
+
+
+ +

+ III.G. Disk Format: Level 1G - Fractal Heap

+ +

+ Each fractal heap consists of a header and zero or more direct and + indirect blocks (described below). The header contains general + information as well as + initialization parameters for the doubling table. The Address + of Root Block field in the header points to the first direct or + indirect block in the heap. +

+ +

+ Fractal heaps are based on a data structure called a doubling + table. A doubling table provides a mechanism for quickly + extending an array-like data structure that minimizes the number of + empty blocks in the heap, while retaining very fast lookup of any + element within the array. More information on fractal heaps and + doubling tables can be found in the RFC + “Private + Heaps in HDF5.” +

+ +

+ The fractal heap implements the doubling table structure with + indirect and direct blocks. + Indirect blocks in the heap do not actually contain data for + objects in the heap; their “size” is abstract, and + they represent the indexing structure for locating the + direct blocks in the doubling table. + Direct blocks + contain the actual data for objects stored in the heap. +

+ +

+ All indirect blocks have a constant number of block entries in each + row, called the width of the doubling table + (see Table Width field in the header). + + The number + of rows for each indirect block in the heap is determined by the + size of the block that the indirect block represents in the + doubling table (calculation of this is shown below) and is + constant, except for the “root” + indirect block, which expands and shrinks its number of rows as + needed. +

+ +

+ Blocks in the first two rows of an indirect block + are Starting Block Size number of bytes in size. + For example, if the row width of the doubling table is 4, + then the first eight block entries in the + indirect block are Starting Block Size number of bytes in size. + The blocks in each subsequent row are twice the size of + the blocks in the previous row. In other words, blocks in + the third row are twice the Starting Block Size, + blocks in the fourth row are four times the + Starting Block Size, and so on. Entries for + blocks up to the Maximum Direct Block Size point to + direct blocks, and entries for blocks greater than that size + point to further indirect blocks (which have their own + entries for direct and indirect blocks). + Starting Block Size and + Maximum Direct Block Size are fields + stored in the header. +

+ +

+ The number of rows of blocks, nrows, in an + indirect block is calculated by the following expression: +

+ nrows = (log2(block_size) - + log2(<Starting Block Size>)) + 1 +

+where block_size is the size of the block that the indirect block +represents in the doubling table. +For example, to represent a block with block_size equal to 1024 +and Starting Block Size equal to 256, +three rows are needed. +

+ The maximum number of rows of direct blocks, max_dblock_rows, + in any indirect block of a fractal heap is given by the + following expression: +

+ max_dblock_rows = + (log2(<Maximum Direct Block Size>) - + log2(<Starting Block Size>)) + 2 +

+

+ Using the computed values for nrows and + max_dblock_rows, along with the width of the + doubling table, the number of direct and indirect block entries + (K and N in the indirect block description, below) + in an indirect block can be computed: +

+ K = MIN(nrows, max_dblock_rows) * + <Table Width> + +

+ If nrows is less than or equal to max_dblock_rows, + N is 0. Otherwise, N is simply computed: +

+ N = (nrows * + <Table Width>) - K +

+ +

+ The size of indirect blocks on disk is determined by the number + of rows in the indirect block (computed above). The size of direct + blocks on disk is exactly the size of the block in the doubling + table. +

+
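The row and entry calculations above can be sketched as follows. This is an illustrative reading of the formulas, not library code: all function names are hypothetical, block sizes are assumed to be powers of two, and N is taken to be the entries left over after the K direct-block entries (nrows times the table width, minus K), which is an interpretive assumption.

```python
def log2_exact(value):
    """log2 of a power of two; other inputs are rejected."""
    if value <= 0 or value & (value - 1):
        raise ValueError("expected a power of two")
    return value.bit_length() - 1

def indirect_block_rows(block_size, starting_block_size):
    """nrows for the indirect block that represents block_size."""
    return log2_exact(block_size) - log2_exact(starting_block_size) + 1

def max_direct_block_rows(max_direct_block_size, starting_block_size):
    """max_dblock_rows: how many rows can hold direct blocks."""
    return log2_exact(max_direct_block_size) - log2_exact(starting_block_size) + 2

def row_block_size(row, starting_block_size):
    """Block size for a zero-based row: rows 0 and 1 share the starting size,
    and every later row doubles the previous one."""
    return starting_block_size if row < 2 else starting_block_size << (row - 1)

def entry_counts(nrows, max_dblock_rows, table_width):
    """Return (K, N): direct and indirect entries in one indirect block."""
    k = min(nrows, max_dblock_rows) * table_width
    n = nrows * table_width - k   # zero when nrows <= max_dblock_rows
    return k, n
```

For example, with a Starting Block Size of 256 and a Maximum Direct Block Size of 1024, four rows can hold direct blocks: two rows of 256-byte blocks, one row of 512-byte blocks, and one row of 1024-byte blocks.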
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap Header +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely
Heap ID Length | I/O Filters’ Encoded Length
Flags | This space inserted only to align table nicely
Maximum Size of Managed Objects

Next Huge Object ID L


v2 B-tree Address of Huge Objects O


Amount of Free Space in Managed Blocks L


Address of Managed Block Free Space Manager O


Amount of Managed Space in Heap L


Amount of Allocated Managed Space in Heap L


Offset of Direct Block Allocation Iterator in Managed Space L


Number of Managed Objects in Heap L


Size of Huge Objects in Heap L


Number of Huge Objects in Heap L


Size of Tiny Objects in Heap L


Number of Tiny Objects in Heap L

Table Width | This space inserted only to align table nicely

Starting Block Size L


Maximum Direct Block Size L

Maximum Heap Size | Starting # of Rows in Root Indirect Block

Address of Root Block O

Current # of Rows in Root Indirect Block | This space inserted only to align table nicely

Size of Filtered Root Direct Block (optional) L

I/O Filter Mask (optional)
I/O Filter Information (optional, variable size)
Checksum
+ + + + + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap Header +
Field Name | Description

Signature

+

The ASCII character string “FRHP” + is used to indicate the + beginning of a fractal heap header. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

This document describes version 0.

+

Heap ID Length

+

This is the length in bytes of heap object IDs for this heap.

+

I/O Filters’ Encoded Length

+

This is the size in bytes of the encoded I/O Filter Information. +

+

Flags

+

This field is the heap status flag and is a bit field + indicating additional information about the fractal heap. + + + + + + + + + + + + + + + + + + +
Bit(s) | Description
0 | If set, the ID value to use for huge object has wrapped + around. If the value for the Next Huge Object ID + has wrapped around, each new huge object inserted into the + heap will require a search for an ID value. +
1 | If set, the direct blocks in the heap are checksummed. +
2-7 | Reserved

+ +

Maximum Size of Managed Objects

+

This is the maximum size of managed objects allowed in the heap. + Objects larger than this are ‘huge’ objects and will be + stored in the file directly, rather than in a direct block for + the heap. +

+

Next Huge Object ID

+

This is the next ID value to use for a huge object in the heap. +

+

v2 B-tree Address of Huge Objects

+

This is the address of the v2 B-tree + used to track huge objects in the heap. The type of records + stored in the v2 B-tree will + be determined by whether the address and length of a huge object + can fit into a heap ID (if yes, it is a “directly” accessed + huge object) and whether there is a filter used on objects + in the heap. +

+

Amount of Free Space in Managed Blocks

+

This is the total amount of free space in managed direct blocks + (in bytes). +

+

Address of Managed Block Free Space Manager

+

This is the address of the + Free-space Manager for + managed blocks. +

+

Amount of Managed Space in Heap

+

This is the total amount of managed space in the heap (in bytes), + essentially the upper bound of the heap’s linear address space. +

+

Amount of Allocated Managed Space in Heap

+

This is the total amount of managed space (in bytes) actually + allocated in + the heap. This can be less than the Amount of Managed Space + in Heap field, if some direct blocks in the heap’s linear + address space are not allocated. +

+

Offset of Direct Block Allocation Iterator in Managed Space

+

This is the linear heap offset (in bytes) at which the next direct + block should be allocated. This may be less than + the Amount of Managed Space in Heap value because the + heap’s address space is increased by a “row” of direct blocks + at a time, rather than by single direct block increments. +

+

Number of Managed Objects in Heap

+

This is the number of managed objects in the heap. +

+

Size of Huge Objects in Heap

+

This is the total size of huge objects in the heap (in bytes). +

+

Number of Huge Objects in Heap

+

This is the number of huge objects in the heap. +

+

Size of Tiny Objects in Heap

+

This is the total size of tiny objects that are packed in heap + IDs (in bytes). +

+

Number of Tiny Objects in Heap

+

This is the number of tiny objects that are packed in heap IDs. +

+

Table Width

+

This is the number of columns in the doubling table for managed + blocks. This value must be a power of two. +

+

Starting Block Size

+

This is the starting block size to use in the doubling table for + managed blocks (in bytes). This value must be a power of two. +

+

Maximum Direct Block Size

+

This is the maximum size allowed for a managed direct block. + Objects inserted into the heap that are larger than this value + (less the number of bytes of direct block prefix/suffix) + are stored as ‘huge’ objects. This value must be a power of + two. +

+

Maximum Heap Size

+

This is the maximum size of the heap’s linear address space for + managed objects (in bytes). The value stored is the log2 of + the actual value, that is: the number of bits of the address space. + ‘Huge’ and ‘tiny’ objects are not counted in this value, since + they do not store objects in the linear address space of the + heap. +

+

Starting # of Rows in Root Indirect Block

+

This is the starting number of rows for the root indirect block. + A value of 0 indicates that the root indirect block will have + the maximum number of rows needed to address the heap’s Maximum + Heap Size. +

+

Address of Root Block

+

This is the address of the root block for the heap. It can + be the undefined address if + there is no data in the heap. It either points to a direct + block (if the Current # of Rows in the Root Indirect + Block value is 0), or an indirect block. +

+

Current # of Rows in Root Indirect Block

+

This is the current number of rows in the root indirect block. + A value of 0 indicates that Address of Root Block + points to a direct block instead of an indirect block. +

+

Size of Filtered Root Direct Block

+

This is the size of the root direct block, if filters are + applied to heap objects (in bytes). This field is only + stored in the header if the I/O Filters’ Encoded Length + is greater than 0. +

+

I/O Filter Mask

+

This is the filter mask for the root direct block, if filters + are applied to heap objects. This mask has the same format as + that used for the filter mask in chunked raw data records in a + v1 B-tree. + This field is only + stored in the header if the I/O Filters’ Encoded Length + is greater than 0. +

+

I/O Filter Information

+

This is the I/O filter information encoding direct blocks and + huge objects, if filters are applied to heap objects. This + field is encoded as a Filter Pipeline + message. + The size of this field is determined by I/O Filters’ + Encoded Length. +

+

Checksum

+

This is the checksum for the header.

+
+
+ +
+
+
+
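The header fields can be read in order with a small table-driven parser. The sketch below is illustrative only. It assumes little-endian integers and reads the widths of the fields not marked L or O from the 4-byte-row layout (1 byte for Version and Flags, 2 bytes for Heap ID Length, I/O Filters' Encoded Length, Table Width, Maximum Heap Size and the two row counts, 4 bytes for Maximum Size of Managed Objects, I/O Filter Mask and Checksum). The checksum is returned but not verified, and every name in the code is hypothetical.

```python
PREFIX = [("version", 1), ("heap_id_length", 2), ("io_filters_encoded_length", 2),
          ("flags", 1), ("max_managed_object_size", 4)]
BODY = [("next_huge_object_id", "L"), ("huge_objects_btree_address", "O"),
        ("managed_free_space", "L"), ("free_space_manager_address", "O"),
        ("managed_space", "L"), ("allocated_managed_space", "L"),
        ("direct_block_iterator_offset", "L"), ("managed_object_count", "L"),
        ("huge_objects_size", "L"), ("huge_object_count", "L"),
        ("tiny_objects_size", "L"), ("tiny_object_count", "L"),
        ("table_width", 2), ("starting_block_size", "L"),
        ("max_direct_block_size", "L"), ("max_heap_size_bits", 2),
        ("starting_root_rows", 2), ("root_block_address", "O"),
        ("current_root_rows", 2)]

def parse_fractal_heap_header(buf, size_of_offsets=8, size_of_lengths=8):
    """Decode a fractal heap header into a dict (checksum is not verified)."""
    if buf[:4] != b"FRHP":
        raise ValueError("missing FRHP signature")
    widths = {"O": size_of_offsets, "L": size_of_lengths}
    pos, header = 4, {}

    def take(width):
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + width], "little")
        pos += width
        return value

    for name, kind in PREFIX + BODY:
        header[name] = take(widths.get(kind, kind))
    header["huge_ids_wrapped"] = bool(header["flags"] & 0x01)           # bit 0
    header["direct_blocks_checksummed"] = bool(header["flags"] & 0x02)  # bit 1
    filter_len = header["io_filters_encoded_length"]
    if filter_len > 0:                     # optional fields for filtered heaps
        header["filtered_root_direct_block_size"] = take(size_of_lengths)
        header["io_filter_mask"] = take(4)
        header["io_filter_info"] = bytes(buf[pos:pos + filter_len])
        pos += filter_len
    header["checksum"] = take(4)
    return header

def build_demo_header(values, size_of_offsets=8, size_of_lengths=8):
    """Inverse of the parser for a header without I/O filters (demo data only)."""
    widths = {"O": size_of_offsets, "L": size_of_lengths}
    out = bytearray(b"FRHP")
    for name, kind in PREFIX + BODY:
        out += values.get(name, 0).to_bytes(widths.get(kind, kind), "little")
    out += (0).to_bytes(4, "little")       # placeholder, not a real checksum
    return bytes(out)
```

Keeping the field list as data makes the dependence on Size of Offsets and Size of Lengths explicit and keeps the parser and the demo builder in step.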
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap Direct Block +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely

Heap Header Address O

Block Offset (variable size)
Checksum (optional)

Object Data (variable size)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap Direct Block +
Field Name | Description

Signature

+

The ASCII character string “FHDB” + is used to indicate the + beginning of a fractal heap direct block. This gives file consistency + checking utilities a better chance of reconstructing a + damaged file. +

+

Version

+

This document describes version 0.

+

Heap Header Address

+

This is the address for the fractal heap header that this + block belongs to. This field is principally used for file + integrity checking. +

+

Block Offset

+

This is the offset of the block within the fractal heap’s + address space (in bytes). The number of bytes used to encode + this field is the Maximum Heap Size (in the heap’s + header) divided by 8 and rounded up to the next highest integer, + for values that are not a multiple of 8. This value is + principally used for file integrity checking. +

+

Checksum

+

This is the checksum for the direct block.

+

This field is only present if bit 1 of Flags in the + heap’s header is set.

+

Object Data

+

This section of the direct block stores the actual data for + objects in the heap. The size of this section is determined by + the direct block’s size minus the size of the other fields + stored in the direct block (for example, the Signature, + Version, and others including the Checksum if it is + present). +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap Indirect Block +
byte | byte | byte | byte
Signature
Version | This space inserted only to align table nicely

Heap Header Address O

Block Offset (variable size)

Child Direct Block #0 Address O


Size of Filtered Direct Block #0 (optional) L

Filter Mask for Direct Block #0 (optional)

Child Direct Block #1 Address O


Size of Filtered Direct Block #1 (optional) L

Filter Mask for Direct Block #1 (optional)
...

Child Direct Block #K-1 Address O


Size of Filtered Direct Block #K-1 (optional) L

Filter Mask for Direct Block #K-1 (optional)

Child Indirect Block #0 Address O


Child Indirect Block #1 Address O

...

Child Indirect Block #N-1 Address O

Checksum
+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap Indirect Block +
Field Name | Description

Signature

+

The ASCII character string “FHIB” is used to + indicate the beginning of a fractal heap indirect block. This + gives file consistency checking utilities a better chance of + reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Heap Header Address

+

This is the address for the fractal heap header that this + block belongs to. This field is principally used for file + integrity checking. +

+

Block Offset

+

This is the offset of the block within the fractal heap’s + address space (in bytes). The number of bytes used to encode + this field is the Maximum Heap Size (in the heap’s + header) divided by 8 and rounded up to the next highest integer, + for values that are not a multiple of 8. This value is + principally used for file integrity checking. +

+

Child Direct Block #K Address

+

This field is the address of the child direct block. + The size of the [uncompressed] direct block can be computed by + its offset in the heap’s linear address space. +

+

Size of Filtered Direct Block #K

+

This is the size of the child direct block after passing through + the I/O filters defined for this heap (in bytes). If no I/O + filters are present for this heap, this field is not present. +

+

Filter Mask for Direct Block #K

+

This is the I/O filter mask for the filtered direct block. + This mask has the same format as that used for the filter mask + in chunked raw data records in a v1 B-tree. + If no I/O filters are present for this heap, this field is not + present. +

+

Child Indirect Block #N Address

+

This field is the address of the child indirect block. + The size of the indirect block can be computed by + its offset in the heap’s linear address space. +

+

Checksum

+

This is the checksum for the indirect block.

+
+ +
+ +
+

An object in the fractal heap is identified by means of a fractal heap ID, + which encodes information to locate the object in the heap. + Currently, the fractal heap stores an object in one of three ways, + depending on the object’s size:

+ +
+ + + + + + + + + + + + + + + + + + + + +
Type | Description
Tiny +

When an object is small enough to be encoded in the + heap ID, the object’s data is embedded in the fractal + heap ID itself. There are two sub-types for this type of + object: normal and extended. The sub-type for tiny heap + IDs depends on whether the heap ID is large enough to + store objects larger than 16 bytes. If the + heap ID length is 18 bytes or smaller, the + ‘normal’ tiny heap ID form is used. If the + heap ID length is greater than 18 bytes, the + ‘extended’ form is used. See the format + description below for both sub-types. +

+
Huge +

When the size of an object is larger than Maximum Size of + Managed Objects in the Fractal Heap Header, the + object’s data is stored on its own in the file and the object + is tracked/indexed via a version 2 B-tree. All huge objects + for a particular fractal heap use the same v2 B-tree. All huge + objects for a particular fractal heap use the same format for + their huge object IDs. +

+ +

Depending on whether the IDs for a heap are large enough to hold + the object’s retrieval information and whether I/O pipeline filters + are applied to the heap’s objects, 4 sub-types are derived for + huge object IDs for this heap:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Sub-type | Description
Directly accessed, non-filtered +

The object’s address and length are embedded in the + fractal heap ID itself and the object is directly accessed + from them. This allows the object to be accessed without + resorting to the B-tree. +

+
Directly accessed, filtered +

The filtered object’s address, length, filter mask and + de-filtered size are embedded in the fractal heap ID itself + and the object is accessed directly with them. This allows + the object to be accessed without resorting to the B-tree. +

+
Indirectly accessed, non-filtered +

The object is located by using a B-tree key embedded in + the fractal heap ID to retrieve the address and length from + the version 2 B-tree for huge objects. Then, the address + and length are used to access the object. +

+
Indirectly accessed, filtered +

The object is located by using a B-tree key embedded in + the fractal heap ID to retrieve the filtered object’s + address, length, filter mask and de-filtered size from the + version 2 B-tree for huge objects. Then, this information + is used to access the object. +

+
+
+ +
Managed +

When the size of an object does not meet the above two + conditions, the object is stored and managed via the direct and + indirect blocks based on the doubling table. +

+
+
+ + +
+
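The three storage types described above amount to a size-based decision. The sketch below shows one way to express it. The limits for tiny objects are an assumption derived from the ID layouts that follow (one header byte and at most 16 data bytes in the normal form, two header bytes and a 12-bit length in the extended form); they are not stated as a formula in this document, and the function names are hypothetical.

```python
def tiny_max_length(heap_id_length):
    """Largest object that can be embedded in a heap ID of the given length."""
    if heap_id_length <= 18:
        # 'Normal' form: 1 header byte, 4-bit length field (1..16 bytes)
        return min(heap_id_length - 1, 16)
    # 'Extended' form: 2 header bytes, 12-bit length field (1..4096 bytes)
    return min(heap_id_length - 2, 4096)

def storage_type(object_size, heap_id_length, max_managed_object_size):
    """Classify an object as 'tiny', 'huge', or 'managed' by its size."""
    if object_size <= tiny_max_length(heap_id_length):
        return "tiny"
    if object_size > max_managed_object_size:
        return "huge"
    return "managed"
```

The order of the checks mirrors the table: tiny first, then huge, with managed as the remaining case.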

The specific format for each type of heap ID is described below: +

+ +
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Tiny Objects (Sub-type 1 - + ‘Normal’) +
byte | byte | byte | byte
Version, Type, and Length | This space inserted only to align table nicely

Data (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Tiny Objects (Sub-type 1 - + ‘Normal’) +
Field Name | Description

Version, Type, and Length

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Tiny objects have a value of 2. +
0-3 | The length of the tiny object. The value stored + is one less than the actual length (since zero-length + objects are not allowed to be stored in the heap). + For example, an object of actual length 1 has an + encoded length of 0, an object of actual length 2 + has an encoded length of 1, and so on. +

+ +

Data

+

This is the data for the object. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Tiny Objects (Sub-type 2 - + ‘Extended’) +
byte | byte | byte | byte
Version, Type, and Length | Extended Length | This space inserted only to align table nicely
Data (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Tiny Objects (Sub-type 2 - + ‘Extended’) +
Field Name | Description

Version, Type, and Length

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Tiny objects have a value of 2. +
0-3 | These 4 bits, together with the next byte, form an + unsigned 12-bit integer for holding the length of the + object. These 4 bits are bits 8-11 of the 12-bit integer. + See description for the Extended Length field below. +

+ +

Extended Length

+

This byte, together with the 4 bits in the previous byte, + forms an unsigned 12-bit integer for holding the length of + the tiny object. These 8 bits are bits 0-7 of the 12-bit + integer formed. The value stored is one less than the actual + length (since zero-length objects are not allowed to be + stored in the heap). For example, an object of actual length + 1 has an encoded length of 0, an object of actual length + 2 has an encoded length of 1, and so on. +

+

Data

+

This is the data for the object. +

+
+
+ + +
+
+
+
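Both tiny-object ID forms can be decoded from the first one or two bytes. The following is a minimal sketch under two assumptions: the byte string passed in is exactly Heap ID Length bytes long (so its length selects the normal or extended form), and `decode_tiny_heap_id` is a hypothetical helper rather than a library function.

```python
def decode_tiny_heap_id(heap_id):
    """Return the object data embedded in a tiny-object fractal heap ID."""
    first = heap_id[0]
    version = (first >> 6) & 0x3           # bits 6-7
    id_type = (first >> 4) & 0x3           # bits 4-5
    if version != 0 or id_type != 2:
        raise ValueError("not a version 0 tiny-object heap ID")
    if len(heap_id) <= 18:
        # 'Normal' form: bits 0-3 hold (length - 1)
        length = (first & 0x0F) + 1
        start = 1
    else:
        # 'Extended' form: bits 0-3 are bits 8-11 of the 12-bit value,
        # the next byte is bits 0-7; the value stored is (length - 1)
        length = (((first & 0x0F) << 8) | heap_id[1]) + 1
        start = 2
    return bytes(heap_id[start:start + length])
```

Any bytes after the embedded data are padding up to the heap's ID length and are ignored.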
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Huge Objects (Sub-types 1 and 2): + Indirectly Accessed, Non-filtered/Filtered +
byte | byte | byte | byte
Version and Type | This space inserted + only to align table nicely

v2 B-tree Key L (variable size)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Huge Objects (Sub-types 1 and 2): + Indirectly Accessed, Non-filtered/Filtered +
Field Name | Description

Version and Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

v2 B-tree Key

This field is the B-tree key for retrieving the information + from the version 2 B-tree for huge objects needed to access the + object. See the description of v2 B-tree + records sub-types 1 and 2 for a description of the fields. New key + values are derived from Next Huge Object ID in the + Fractal Heap Header.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Huge Objects (Sub-type 3): + Directly Accessed, Non-filtered +
byte | byte | byte | byte
Version and Type | This space inserted only to align table nicely

Address O


Length L

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Huge Objects (Sub-type 3): + Directly Accessed, Non-filtered +
Field Name | Description

Version and Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

Address

This field is the address of the object in the file.

+

Length

This field is the length of the object in the file.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Huge Objects (Sub-type 4): + Directly Accessed, Filtered +
byte | byte | byte | byte
Version and Type | This space inserted only to align table nicely

Address O


Length L

Filter Mask

De-filtered Size L

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Huge Objects (Sub-type 4): + Directly Accessed, Filtered +
Field Name | Description

Version and Type

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Huge objects have a value of 1. +
0-3 | Reserved. +

+ +

Address

This field is the address of the filtered object in the file.

+

Length

This field is the length of the filtered object in the file.

+

Filter Mask

This field is the I/O pipeline filter mask for the + filtered object in the file.

+

De-filtered Size

This field is the size of the de-filtered object in the file.

+
+
+ +
+
+
+
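The two directly accessed huge-object ID layouts differ only in the trailing filter fields, so one decoder can cover both. This sketch assumes little-endian integers, 8-byte Size of Offsets and Size of Lengths by default, and a 4-byte Filter Mask (one full row of the layout table). Whether a heap's IDs are filtered must be known from the heap header; here it is passed in as the hypothetical `filtered` argument.

```python
def decode_huge_direct_id(heap_id, filtered, size_of_offsets=8, size_of_lengths=8):
    """Decode a directly accessed huge-object heap ID into its fields."""
    first = heap_id[0]
    if ((first >> 6) & 0x3) != 0 or ((first >> 4) & 0x3) != 1:
        raise ValueError("not a version 0 huge-object heap ID")
    layout = [("address", size_of_offsets), ("length", size_of_lengths)]
    if filtered:
        layout += [("filter_mask", 4), ("defiltered_size", size_of_lengths)]
    pos, fields = 1, {}                    # byte 0 is Version and Type
    for name, width in layout:
        fields[name] = int.from_bytes(heap_id[pos:pos + width], "little")
        pos += width
    return fields
```

With these fields in hand, the object can be read from the file without consulting the v2 B-tree.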
+ + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap ID for Managed Objects +
byte | byte | byte | byte
Version and Type | This space inserted only to align table nicely
Offset (variable size)
Length (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap ID for Managed Objects +
Field Name | Description

Version and Type

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + +
Bit | Description
6-7 | The current version of ID format. This document + describes version 0. +
4-5 | The ID type. Managed objects have a value of 0. +
0-3 | Reserved. +

+

Offset

This field is the offset of the object in the heap. + This field’s size is the minimum number of bytes + necessary to encode the Maximum Heap Size value + (from the Fractal Heap Header). For example, if the + value of the Maximum Heap Size is less than 256 bytes, + this field is 1 byte in length, a Maximum Heap Size + of 256-65535 bytes uses a 2 byte length, and so on.

Length

This field is the length of the object in the heap. The + size of this field is determined by taking the smaller of Maximum + Direct Block Size and Maximum Size of Managed + Objects in the Fractal Heap Header; as with the Offset field, + the minimum number of bytes needed to encode that value is + used for the size of this field.

+
+ +
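The sizing rules for a managed-object heap ID can be sketched in a few lines of Python. This is only an illustration, not the library's implementation: the function names are hypothetical, the three size parameters stand for the corresponding Fractal Heap Header values, and little-endian byte order for the multi-byte fields is an assumption not stated in this section.

```python
def min_bytes_to_encode(value: int) -> int:
    """Return the smallest number of bytes able to hold the unsigned `value`."""
    return max(1, (value.bit_length() + 7) // 8)


def decode_managed_heap_id(heap_id: bytes, max_heap_size: int,
                           max_direct_block_size: int,
                           max_managed_object_size: int) -> dict:
    """Decode a fractal heap ID for a managed object (hypothetical helper)."""
    first = heap_id[0]
    version = (first >> 6) & 0x3   # bits 6-7: version of the ID format
    id_type = (first >> 4) & 0x3   # bits 4-5: ID type, 0 for managed objects
    if version != 0 or id_type != 0:
        raise ValueError("not a version 0 managed-object heap ID")

    # Field widths follow the rules given in the field descriptions.
    offset_size = min_bytes_to_encode(max_heap_size)
    length_size = min_bytes_to_encode(
        min(max_direct_block_size, max_managed_object_size))

    pos = 1
    # Assumption: multi-byte fields are little-endian.
    offset = int.from_bytes(heap_id[pos:pos + offset_size], "little")
    pos += offset_size
    length = int.from_bytes(heap_id[pos:pos + length_size], "little")
    return {"version": version, "type": id_type,
            "offset": offset, "length": length}
```

With a Maximum Heap Size of 65535 the offset occupies two bytes, and with the smaller of the two object-size limits below 256 the length occupies one byte, so such an ID is four bytes long in total.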

+ III.H. Disk Format: Level 1H - Free-space Manager

+ +

+ Free-space managers are used to describe space within a heap or + the entire HDF5 file that is not currently used for that heap or + file. +

+ +

+ The free-space manager header contains metadata + about the space being tracked, along with the address of the list + of free-space sections, which actually describes the free + space. The header records information about free-space sections being + tracked, creation parameters for handling free-space sections of a + client, and section information used to locate the collection of + free-space sections. +

+ +

+ The free-space section list stores a collection of + free-space sections that is specific to each client of the + free-space manager. + + For example, the fractal heap is a client of the free space manager + and uses it to track unused space within the heap. There are 4 + types of section records for the fractal heap, each of which has + its own format, listed below. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Free-space Manager Header +
bytebytebytebyte
Signature
VersionClient IDThis space inserted only to align table nicely

Total Space TrackedL


Total Number of SectionsL


Number of Serialized SectionsL


Number of Un-Serialized SectionsL

Number of Section ClassesThis space inserted only to align table nicely
Shrink PercentExpand Percent
Size of Address SpaceThis space inserted only to align table nicely

Maximum Section Size L


Address of Serialized Section ListO


Size of Serialized Section List UsedL


Allocated Size of Serialized Section ListL

Checksum
+ + + + + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Free-space Manager Header +
Field NameDescription

Signature

+

The ASCII character string “FSHD” + is used to indicate the beginning of the Free-space Manager + Header. This gives file consistency checking utilities a + better chance of reconstructing a damaged file. +

+

Version

+

This is the version number for the Free-space Manager Header + and this document describes version 0.

+

Client ID

+

This is the client ID for identifying the user of this + free-space manager: + + + + + + + + + + + + + + + + + + + +
IDDescription
0Fractal heap +
1File +
2+Reserved. +

+ +

Total Space Tracked

+

This is the total amount of free space being tracked, in bytes. +

+

Total Number of Sections

+

This is the total number of free-space sections being tracked. +

+

Number of Serialized Sections

+

This is the number of serialized free-space sections being + tracked. +

+

Number of Un-Serialized Sections

+

This is the number of un-serialized free-space sections being + managed. Un-serialized sections are created by the free-space + client when the list of sections is read in. +

+

Number of Section Classes

+

This is the number of section classes handled by this free space + manager for the free-space client. +

+

Shrink Percent

+

This is the percent of current size to shrink the allocated + serialized free-space section list. +

+

Expand Percent

+

This is the percent of current size to expand the allocated + serialized free-space section list. +

+

Size of Address Space

+

This is the size of the address space that free-space sections + are within. This is stored as the log2 of the + actual value (in other words, the number of bits required + to store values within that address space). +

+

Maximum Section Size

+

This is the maximum size of a section to be tracked. +

+

Address of Serialized Section List

+

This is the address where the serialized free-space section + list is stored. +

+

Size of Serialized Section List Used

+

This is the size of the serialized free-space section + list used (in bytes). This value must be less than + or equal to the allocated size of serialized section + list, below. +

+

Allocated Size of Serialized Section List

+

This is the size of serialized free-space section list + actually allocated (in bytes). +

+

Checksum

+

This is the checksum for the free-space manager header.

+
+
+ +
+
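As an illustration of how the header fields line up, the following Python sketch reads a Free-space Manager Header from a byte buffer. It is a reading aid under stated assumptions, not the library's decoder: the function and key names are hypothetical, the 2-byte widths of the four fields between the ‘L’-sized groups are inferred from the four-byte rows of the layout table, little-endian byte order is assumed, and the checksum is read but not verified.

```python
def parse_free_space_manager_header(buf: bytes, size_of_offsets: int = 8,
                                    size_of_lengths: int = 8) -> dict:
    """Parse the header fields in layout order (hypothetical helper)."""
    if buf[:4] != b"FSHD":
        raise ValueError("missing FSHD signature")
    pos = 4

    def read_uint(width: int) -> int:
        # Assumption: unsigned little-endian integers.
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + width], "little")
        pos += width
        return value

    header = {"version": read_uint(1), "client_id": read_uint(1)}
    for name in ("total_space_tracked", "total_sections",
                 "serialized_sections", "unserialized_sections"):
        header[name] = read_uint(size_of_lengths)       # 'L'-sized fields
    for name in ("section_classes", "shrink_percent",
                 "expand_percent", "address_space_bits"):
        header[name] = read_uint(2)                     # assumed 2 bytes each
    header["max_section_size"] = read_uint(size_of_lengths)
    header["section_list_address"] = read_uint(size_of_offsets)  # 'O'-sized
    header["section_list_used"] = read_uint(size_of_lengths)
    header["section_list_allocated"] = read_uint(size_of_lengths)
    header["checksum"] = read_uint(4)                   # not verified here

    # The used size must not exceed the allocated size.
    if header["section_list_used"] > header["section_list_allocated"]:
        raise ValueError("used size exceeds allocated size")
    return header
```

The Size of Offsets and Size of Lengths values come from the superblock; the defaults of 8 here are only a convenience for the sketch.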

The free-space sections being managed are stored in a + free-space section list, described below. The sections + in the free-space section list are stored in the following way: + a count of the number of sections describing a particular size of + free space and the size of the free-space described (in bytes), + followed by a list of section description records; then another + section count and size, followed by the list of section + descriptions for that size; and so on.

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Free-space Section List +
bytebytebytebyte
Signature
VersionThis space inserted only to align table nicely

Free-space Manager Header AddressO

Number of Section Records in Set #0 (variable size)
Size of Free-space Section Described in Record Set #0 (variable size)
Record Set #0 Section Record #0 Offset(variable size)
Record Set #0 Section Record #0 TypeThis space inserted only to align table nicely
Record Set #0 Section Record #0 Data (variable size)
...
Record Set #0 Section Record #K-1 Offset(variable size)
Record Set #0 Section Record #K-1 TypeThis space inserted only to align table nicely
Record Set #0 Section Record #K-1 Data (variable size)
Number of Section Records in Set #1 (variable size)
Size of Free-space Section Described in Record Set #1 (variable size)
Record Set #1 Section Record #0 Offset(variable size)
Record Set #1 Section Record #0 TypeThis space inserted only to align table nicely
Record Set #1 Section Record #0 Data (variable size)
...
Record Set #1 Section Record #K-1 Offset(variable size)
Record Set #1 Section Record #K-1 TypeThis space inserted only to align table nicely
Record Set #1 Section Record #K-1 Data (variable size)
...
...
Number of Section Records in Set #N-1 (variable size)
Size of Free-space Section Described in Record Set #N-1 (variable size)
Record Set #N-1 Section Record #0 Offset(variable size)
Record Set #N-1 Section Record #0 TypeThis space inserted only to align table nicely
Record Set #N-1 Section Record #0 Data (variable size)
...
Record Set #N-1 Section Record #K-1 Offset(variable size)
Record Set #N-1 Section Record #K-1 TypeThis space inserted only to align table nicely
Record Set #N-1 Section Record #K-1 Data (variable size)
Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Free-space Section List +
Field NameDescription

Signature

+

The ASCII character string “FSSE” + is used to indicate the beginning of the Free-space Section + Information. This gives file consistency checking utilities + a better chance of reconstructing a damaged file. +

+

Version

+

This is the version number for the Free-space Section List + and this document describes version 0.

+

Free-space Manager Header Address

+

This is the address of the Free-space Manager Header. + This field is principally used for file + integrity checking. +

+

Number of Section Records for Set #N

+

This is the number of free-space section records for set #N. + The length of this field is the minimum number of bytes needed + to store the number of serialized sections (from the + free-space manager header). +

+ +

+ The number of sets of free-space section records is + determined by the size of serialized section list in + the free-space manager header. +

+

Section Size for Record Set #N

+

This is the size (in bytes) of the free-space section described + for all the section records in set #N. +

+ +

+ The length of this field is the minimum number of bytes needed + to store the maximum section size (from the + free-space manager header). +

+

Record Set #N Section #K Offset

+

This is the offset (in bytes) of the free-space section within + the client for the free-space manager. +

+ +

+ The length of this field is the minimum number of bytes needed + to store the size of address space (from the + free-space manager header). +

+

Record Set #N Section #K Type

+

This is the type of the section record, used to decode the + record set #N section #K data information. The defined + record type for file client is: + + + + + + + + + + + + + + + +
TypeDescription
0File’s section (a range of actual bytes in file) +
1+Reserved. +

+ +

The defined record types for a fractal heap client are: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
TypeDescription
0Fractal heap “single” section +
1Fractal heap “first row” section +
2Fractal heap “normal row” section +
3Fractal heap “indirect” section +
4+Reserved. +

+ +

Record Set #N Section #K Data

+

This is the section-type specific information for each record + in the record set, described below. +

+

Checksum

+

This is the checksum for the Free-space Section List. +

+
+
+ +
+

+ The section-type specific data for each free-space section record is + described below: +

+ +
+ + + + + + +
+ Layout: File’s Section Data Record +
No additional record data stored
+
+ +
+
+
+
+ + + + + + +
+ Layout: Fractal Heap “Single” Section Data Record +
No additional record data stored
+
+ +
+
+
+
+ + + + + + +
+ Layout: Fractal Heap “First Row” Section Data + Record +
Same format as “indirect” + section data
+
+ +
+
+
+
+ + + + + + +
+ Layout: Fractal Heap “Normal Row” Section Data + Record +
No additional record data stored
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fractal Heap “Indirect” Section + Data Record +
bytebytebytebyte
Fractal Heap Indirect Block Offset (variable size)
Block Start RowBlock Start Column
Number of BlocksThis space inserted only to align table nicely
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fractal Heap “Indirect” Section + Data Record +
Field NameDescription

Fractal Heap Block Offset

+

The offset of the indirect block in the fractal heap’s address + space containing the empty blocks. +

+

+ The number of bytes used to encode this field is the minimum + number of bytes needed to encode values for the Maximum + Heap Size (in the fractal heap’s header). +

+

Block Start Row

+

This is the row that the empty blocks start in. +

+

Block Start Column

+

This is the column that the empty blocks start in. +

+

Number of Blocks

+

This is the number of empty blocks covered by the section. +

+
+
+ +
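The record-set structure of the free-space section list can be illustrated with a short Python sketch for the simplest case, a file client, whose section records carry no additional data. The names are hypothetical and several points are assumptions rather than statements from this section: the offset width is taken as the number of whole bytes needed for the Size of Address Space bit count, the section type is assumed to occupy one byte, integers are assumed little-endian, and `body` is assumed to hold only the bytes between the Free-space Manager Header Address and the Checksum.

```python
def min_bytes_to_encode(value: int) -> int:
    """Return the smallest number of bytes able to hold the unsigned `value`."""
    return max(1, (value.bit_length() + 7) // 8)


def section_list_field_widths(serialized_sections: int, max_section_size: int,
                              address_space_bits: int) -> dict:
    """Derive variable field widths from free-space manager header values."""
    return {
        "count": min_bytes_to_encode(serialized_sections),
        "size": min_bytes_to_encode(max_section_size),
        # Assumption: Size of Address Space is a bit count, rounded up to bytes.
        "offset": max(1, (address_space_bits + 7) // 8),
    }


def parse_file_client_record_sets(body: bytes, widths: dict) -> list:
    """Return (offset, size) pairs for every section record in `body`."""
    pos = 0
    sections = []
    while pos < len(body):
        # Each record set starts with a record count and a shared section size.
        count = int.from_bytes(body[pos:pos + widths["count"]], "little")
        pos += widths["count"]
        size = int.from_bytes(body[pos:pos + widths["size"]], "little")
        pos += widths["size"]
        for _ in range(count):
            offset = int.from_bytes(body[pos:pos + widths["offset"]], "little")
            pos += widths["offset"]
            section_type = body[pos]   # assumed to be a single byte
            pos += 1
            if section_type != 0:
                raise ValueError("only file-client sections (type 0) are handled")
            # Type 0 stores no additional record data.
            sections.append((offset, size))
    return sections
```

A fractal heap client would additionally need to decode the “first row” and “indirect” section data records, which this sketch deliberately leaves out.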

+ III.I. Disk Format: Level 1I - Shared Object Header Message Table

+ +

+ The shared object header message table is used to locate + object + header messages that are shared between two or more object headers + in the file. Shared object header messages are stored and indexed + in the file in one of two ways: indexed sequentially in a + shared header message list or indexed with a v2 B-tree. + The shared messages themselves are either stored in a fractal + heap (when two or more objects share the message), or remain in an + object’s header (when only one object uses the message currently, + but the message can be shared in the future). +

+ +

+ The shared object header message table + contains a list of shared message index headers. Each index header + records information about the version of the index format, the index + storage type, flags for the message types indexed, the number of + messages in the index, the address where the index resides, + and the fractal heap address if shared messages are stored there. +

+ +

+ Each index can be either a list or a v2 B-tree and may transition + between those two forms as the number of messages in the index + varies. Each shared message record contains information used to + locate the shared message from either a fractal heap or an object + header. The types of messages that can be shared are: Dataspace, + Datatype, Fill Value, Filter Pipeline and Attribute. +

+ +

+ The shared object header message table is pointed to + from a shared message table message + in the superblock extension for a file. This message stores the + version of the table format, along with the number of index headers + in the table. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Object Header Message Table +
bytebytebytebyte
Signature
Version for index #0Index Type for index #0Message Type Flags for index #0
Minimum Message Size for index #0
List Cutoff for index #0v2 B-tree Cutoff for index #0
Number of Messages for index #0This space inserted only to align table nicely

Index AddressO for index #0


Fractal Heap AddressO for index #0

...
...
Version for index #N-1Index Type for index #N-1Message Type Flags for index #N-1
Minimum Message Size for index #N-1
List Cutoff for index #N-1v2 B-tree Cutoff for index #N-1
Number of Messages for index #N-1This space inserted only to align table nicely

Index AddressO for index #N-1


Fractal Heap AddressO for index #N-1

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Object Header Message Table +
Field NameDescription

Signature

+

The ASCII character string “SMTB” + is used to indicate the beginning of the Shared Object + Header Message table. This gives file consistency checking + utilities a better chance of reconstructing a damaged file. +

+

Version for index #N

+

This is the version number for the list of shared object header message + indexes and this document describes version 0.

+

Index Type for index #N

+

The type of index can be an unsorted list or a v2 B-tree. +

+

Message Type Flags for index #N

+

This field indicates the type of messages tracked in the index, + as follows: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
BitsDescription
0If set, the index tracks Dataspace Messages. +
1If set, the index tracks Datatype Messages. +
2If set, the index tracks Fill Value Messages. +
3If set, the index tracks Filter Pipeline Messages. +
4If set, the index tracks Attribute Messages. +
5-15Reserved (zero). +

+ + +

+ An index can track more than one type of message, but each type + of message can only be in one index. +

+

Minimum Message Size for index #N

+

This is the message size sharing threshold for the index. + If the encoded size of the message is less than this value, the + message is not shared. +

+

List Cutoff for index #N

+

This is the cutoff value for the indexing of messages to + switch from a list to a v2 B-tree. If the number of messages + is greater than this value, the index should be a v2 B-tree. +

+

v2 B-tree Cutoff for index #N

+

This is the cutoff value for the indexing of messages + to switch from a v2 B-tree back to a list. If the number + of messages is less than this value, the index should be + a list. +

+

Number of Messages for index #N

+

The number of shared messages being tracked for the index. +

+

Index Address for index #N

+

This field is the address of the list or v2 B-tree where the + index nodes reside. +

+

Fractal Heap Address for index #N

+

This field is the address of the fractal heap if shared messages + are stored there. +

+

Checksum

+

This is the checksum for the table.

+
+
+ +
+
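Two of the index fields lend themselves to a small worked example: the Message Type Flags bit field and the pair of cutoff values that govern when an index changes form. The Python sketch below is illustrative only; the function names and the strings "list" and "btree" are hypothetical labels, not identifiers from the format or the library.

```python
MESSAGE_TYPE_BITS = {
    0: "Dataspace",
    1: "Datatype",
    2: "Fill Value",
    3: "Filter Pipeline",
    4: "Attribute",
}


def decode_message_type_flags(flags: int) -> list:
    """List the message types tracked by an index, given its 16-bit flags."""
    if flags >> 5:
        raise ValueError("reserved bits 5-15 must be zero")
    return [name for bit, name in MESSAGE_TYPE_BITS.items()
            if flags & (1 << bit)]


def next_index_form(current: str, num_messages: int,
                    list_cutoff: int, btree_cutoff: int) -> str:
    """Apply the two cutoffs to decide which form the index should take."""
    if current == "list" and num_messages > list_cutoff:
        return "btree"    # too many messages for a list
    if current == "btree" and num_messages < btree_cutoff:
        return "list"     # few enough messages to fall back to a list
    return current
```

When the v2 B-tree cutoff is lower than the list cutoff, message counts between the two values leave the index in whatever form it already has, which avoids switching back and forth as messages are added and removed near a single threshold.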

+ Shared messages are indexed either with a shared message record + list, described below, or using a v2 B-tree (using record type 7). + The number of records in the shared message record list is + given by the index’s entry in the shared object header message + table. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message Record List +
bytebytebytebyte
Signature
Shared Message Record #0
Shared Message Record #1
...
Shared Message Record #N-1
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message Record List +
Field NameDescription

Signature

+

The ASCII character string “SMLI” + is used to indicate the beginning of a list of index nodes. + This gives file consistency checking utilities a better + chance of reconstructing a damaged file. +

+

Shared Message Record #N

+

The record for locating the shared message, either in the + fractal heap for the index, or an object header (see format for + index nodes below). +

+

Checksum

+

This is the checksum for the list. +

+
+
+ +
+

+ The record for each shared message in an index is stored in one + of the following forms: +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message Record for Messages Stored in a + Fractal Heap +
bytebytebytebyte
Message LocationThis space inserted only to align table nicely
Hash Value
Reference Count

Fractal Heap ID

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message Record for Messages Stored in a + Fractal Heap +
Field NameDescription

Message Location

+

This has a value of 0 indicating that the message is stored in + the heap. +

+

Hash Value

+

This is the hash value for the message. +

+

Reference Count

+

This is the number of times the message is used in the file. +

+

Fractal Heap ID

+

This is an 8-byte fractal heap ID for the message as stored in + the fractal heap for the index. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message Record for Messages Stored in an + Object Header +
bytebytebytebyte
Message LocationThis space inserted only to align table nicely
Hash Value
ReservedMessage TypeCreation Index

Object Header AddressO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message Record for Messages Stored in an + Object Header +
Field NameDescription

Message Location

+

This has a value of 1 indicating that the message is stored in + an object header. +

+

Hash Value

+

This is the hash value for the message. +

+

Message Type

+

This is the message type in the object header. +

+

Creation Index

+

This is the creation index of the message within the object + header. +

+

Object Header Address

+

This is the address of the object header where the message is + located. +

+
+
+ +
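The two record forms differ only after the Hash Value, so a reader can dispatch on the Message Location byte. The following Python sketch shows one way to do that. It is a hedged illustration: the function and key names are hypothetical, the field widths (4-byte hash and reference count, 1-byte reserved and message type, 2-byte creation index) are inferred from the four-byte rows of the layout tables, and little-endian byte order is assumed.

```python
def parse_shared_message_record(buf: bytes, size_of_offsets: int = 8) -> dict:
    """Decode one shared message record (hypothetical helper)."""
    location = buf[0]
    hash_value = int.from_bytes(buf[1:5], "little")

    if location == 0:
        # Message stored in the fractal heap for the index.
        return {
            "location": "heap",
            "hash": hash_value,
            "reference_count": int.from_bytes(buf[5:9], "little"),
            "heap_id": bytes(buf[9:17]),      # 8-byte fractal heap ID
        }
    if location == 1:
        # Message stored in an object header; buf[5] is the reserved byte.
        return {
            "location": "object_header",
            "hash": hash_value,
            "message_type": buf[6],
            "creation_index": int.from_bytes(buf[7:9], "little"),
            "object_header_address":
                int.from_bytes(buf[9:9 + size_of_offsets], "little"),
        }
    raise ValueError("unknown message location")
```

The Size of Offsets value comes from the superblock; the default of 8 is only a convenience for the sketch.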

+ IV. Disk Format: Level 2 - Data Objects

+ +

Data objects contain the “real” user-visible information in the file. + These objects compose the scientific data and other information which + are generally thought of as “data” by the end-user. All the + other information in the file is provided as a framework for + storing and accessing these data objects. +

+ +

A data object is composed of header and data + information. The header information contains the information + needed to interpret the data information for the object as + well as additional “metadata” or pointers to additional + “metadata” used to describe or annotate each object. +

+ +

+ IV.A. Disk Format: Level 2A - Data Object Headers

+ +

The header information of an object is designed to encompass + all of the information about an object, except for the data itself. + This information includes the dataspace, the datatype, information + about how the data is stored on disk (in external files, compressed, + broken up in blocks, and so on), as well as other information used + by the library to speed up access to the data objects or maintain + a file’s integrity. Information stored by user applications + as attributes is also stored in the object’s header. The header + of each object is not necessarily located immediately prior to the + object’s data in the file and in fact may be located in any + position in the file. The order of the messages in an object header + is not significant.

+ +

Object headers are composed of a prefix and a set of messages. The + prefix contains the information needed to interpret the messages and + a small amount of metadata about the object, and the messages contain + the majority of the metadata about the object. +

+ +

+ IV.A.1. Disk Format: Level 2A1 - Data Object Header Prefix

+ + + +

+ IV.A.1.a. Version 1 Data Object Header Prefix

+ +

Header messages are aligned on 8-byte boundaries for version 1 + object headers. +

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 1 Object Header +
bytebytebytebyte
VersionReserved (zero)Total Number of Header Messages
Object Reference Count
Object Header Size
Reserved (zero)
Header Message Type #1Size of Header Message Data #1
Header Message #1 FlagsReserved (zero)

Header Message Data #1

.
.
.
Header Message Type #nSize of Header Message Data #n
Header Message #n FlagsReserved (zero)

Header Message Data #n

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 1 Object Header +
Field NameDescription

Version

+

This value is used to determine the format of the + information in the object header. When the format of the + object header is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. This + is version one (1) (there was no version zero (0)) of the + object header. +

+

Total Number of Header Messages

+

This value determines the total number of messages listed in + object headers for this object. This value includes the messages + in continuation messages for this object. +

+

Object Reference Count

+

This value specifies the number of “hard links” to this object + within the current file. References to the object from external + files, “soft links” in this file and object references in this + file are not tracked. +

+

Object Header Size

+

This value specifies the number of bytes of header message data + following this length field that contain object header messages + for this object header. This value does not include the size of + object header continuation blocks for this object elsewhere in the + file. +

+

Header Message #n Type

+

This value specifies the type of information included in the + following header message data. The message types for + header messages are defined in sections below. +

+

Size of Header Message #n Data

+

This value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size includes + padding bytes to make the message a multiple of eight + bytes. +

+

Header Message #n Flags

+

This is a bit field with the following definition: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
BitDescription
0If set, the message data is constant. This is used + for messages like the datatype message of a dataset. +
1If set, the message is shared and stored + in another location than the object header. The Header + Message Data field contains a Shared Message + (described in the Data Object Header Messages + section below) + and the Size of Header Message Data field + contains the size of that Shared Message. +
2If set, the message should not be shared. +
3If set, the HDF5 decoder should fail to open this object + if it does not understand the message’s type and the file + is open with permissions allowing write access to the file. + (Normally, unknown messages can just be ignored by HDF5 + decoders) +
4If set, the HDF5 decoder should set bit 5 of this + message’s flags (in other words, this bit field) + if it does not understand the message’s type + and the object is modified in any way. (Normally, + unknown messages can just be ignored by HDF5 + decoders) +
5If set, this object was modified by software that did not + understand this message. + (Normally, unknown messages should just be ignored by HDF5 + decoders) (Can be used to invalidate an index or a similar + feature) +
6If set, this message is shareable. +
7If set, the HDF5 decoder should always fail to open this + object if it does not understand the message’s type (whether + it is open for read-only or read-write access). (Normally, + unknown messages can just be ignored by HDF5 decoders) +

+ +

Header Message #n Data

+

The format and length of this field are determined by the + header message type and size, respectively. Some header + message types do not require any data, and this information + can be eliminated by setting the length of the message to + zero. The data is padded with enough zeroes to make the + size a multiple of eight. +

+
+
+ +
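The header message flags that govern unknown message types, together with the version 1 padding rule, can be summarized in a short Python sketch. The function names are hypothetical and the code only restates the rules from the table above; it is not taken from the library.

```python
FAIL_IF_UNKNOWN_AND_WRITABLE = 1 << 3   # bit 3
MARK_IF_UNKNOWN = 1 << 4                # bit 4
WAS_UNKNOWN = 1 << 5                    # bit 5
FAIL_IF_UNKNOWN_ALWAYS = 1 << 7         # bit 7


def must_fail_on_unknown_type(flags: int, opened_for_write: bool) -> bool:
    """Decide whether a decoder must refuse an object with an unknown message."""
    if flags & FAIL_IF_UNKNOWN_ALWAYS:
        return True
    return bool(flags & FAIL_IF_UNKNOWN_AND_WRITABLE) and opened_for_write


def flags_after_modification(flags: int) -> int:
    """Flags to store after modifying an object whose message type is unknown."""
    if flags & MARK_IF_UNKNOWN:
        return flags | WAS_UNKNOWN      # bit 4 asks the decoder to set bit 5
    return flags


def padded_message_size(data_size: int) -> int:
    """Round a version 1 message data size up to a multiple of eight bytes."""
    return (data_size + 7) & ~7
```

For example, a message with only bit 3 set blocks opening the object when the file is writable but not when it is read-only, while bit 7 blocks it in both cases.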

+ IV.A.1.b. Version 2 Data Object Header Prefix

+ +

Note that the “total number of messages” field has been dropped from + the data object header prefix in this version. The number of messages + in the data object header is just determined by the messages encountered + in all the object header blocks.

+ +

Note also that the fields and messages in this version of data object + headers have no alignment or padding bytes inserted - they are + stored packed together.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 Object Header +
bytebytebytebyte
Signature
VersionFlagsThis space inserted only to align table nicely
Access time (optional)
Modification Time (optional)
Change Time (optional)
Birth Time (optional)
Maximum # of compact attributes (optional)Minimum # of dense attributes (optional)
Size of Chunk #0 (variable size)This space inserted only to align table nicely
Header Message Type #1Size of Header Message Data #1Header Message #1 Flags
Header Message #1 Creation Order (optional)This space inserted only to align table nicely

Header Message Data #1

.
.
.
Header Message Type #nSize of Header Message Data #nHeader Message #n Flags
Header Message #n Creation Order (optional)This space inserted only to align table nicely

Header Message Data #n

Gap (optional, variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 Object Header +
Field NameDescription

Signature

+

The ASCII character string “OHDR” + is used to indicate the beginning of an object header. This + gives file consistency checking utilities a better chance + of reconstructing a damaged file. +

+

Version

+

This field has a value of 2 indicating version 2 of the object header. +

+

Flags

+

This field is a bit field indicating additional information + about the object header. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bit(s)Description
0-1This two bit field determines the size of the + Size of Chunk #0 field. The values are: + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0The Size of Chunk #0 field is 1 byte. +
1The Size of Chunk #0 field is 2 bytes. +
2The Size of Chunk #0 field is 4 bytes. +
3The Size of Chunk #0 field is 8 bytes. +
+
2If set, attribute creation order is tracked.
3If set, attribute creation order is indexed.
4If set, non-default attribute storage phase change + values are stored.
5If set, access, modification, change and birth times + are stored.
6-7Reserved

+ +

Access Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object’s raw data was last accessed + (in other words, read or written). +

+

This field is present if bit 5 of flags is set. +

+

Modification Time

+

This 32-bit value represents the number of seconds after + the UNIX epoch when the object’s raw data was last + modified (in other words, written). +

+

This field is present if bit 5 of flags is set. +

+

Change Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object’s metadata was last changed. +

+

This field is present if bit 5 of flags is set. +

+

Birth Time

+

This 32-bit value represents the number of seconds after the + UNIX epoch when the object was created. +

+

This field is present if bit 5 of flags is set. +

+

Maximum # of compact attributes

+

This is the maximum number of attributes to store in the compact + format before switching to the indexed format. +

+

This field is present if bit 4 of flags is set. +

+

Minimum # of dense attributes

+

This is the minimum number of attributes to store in the indexed + format before switching to the compact format. +

+

This field is present if bit 4 of flags is set. +

+

Size of Chunk #0

+

+ This unsigned value specifies the number of bytes of header + message data following this field that contain object header + information. +

+

+ This value does not include the size of object header + continuation blocks for this object elsewhere in the file. +

+

+ The length of this field varies depending on bits 0 and 1 of + the flags field. +

+

Header Message #n Type

+

Same format as version 1 of the object header, described above. +

+

Size of Header Message #n Data

+

This value specifies the number of bytes of header + message data following the header message type and length + information for the current message. The size of messages + in this version does not include any padding bytes. +

+

Header Message #n Flags

+

Same format as version 1 of the object header, described above. +

+

Header Message #n Creation Order

+

This field stores the order that a message of a given type + was created in. +

+

This field is present if bit 2 of flags is set. +

+

Header Message #n Data

+

Same format as version 1 of the object header, described above. +

+

Gap

+

A gap in an object header chunk is inferred when the + messages for the chunk end before the beginning of the chunk’s + checksum. Gaps are always smaller than the size of an + object header message prefix (message type + message size + + message flags). +

+

Gaps are formed when a message (typically an attribute message) + in an earlier chunk is deleted and a message from a later + chunk that does not quite fit into the free space is moved + into the earlier chunk. +

+

Checksum

+

This is the checksum for the object header chunk. +

+
+
+ +
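Because several fields of the version 2 prefix are optional, the position of the first header message depends on the Flags byte. The Python sketch below walks the prefix up to and including the Size of Chunk #0 field. It is illustrative only: the function and key names are hypothetical, the 2-byte widths of the two attribute phase change fields are inferred from the layout table, and little-endian byte order is assumed.

```python
import struct


def parse_v2_object_header_prefix(buf: bytes) -> dict:
    """Parse a version 2 object header prefix (hypothetical helper)."""
    if buf[:4] != b"OHDR":
        raise ValueError("missing OHDR signature")
    version, flags = buf[4], buf[5]
    if version != 2:
        raise ValueError("not a version 2 object header")
    pos = 6

    times = None
    if flags & 0x20:    # bit 5: access, modification, change, birth times
        times = struct.unpack_from("<4I", buf, pos)
        pos += 16

    phase_change = None
    if flags & 0x10:    # bit 4: max compact / min dense attribute counts
        phase_change = struct.unpack_from("<2H", buf, pos)  # assumed 2 bytes each
        pos += 4

    # Bits 0-1 select a 1-, 2-, 4-, or 8-byte Size of Chunk #0 field.
    chunk_size_width = 1 << (flags & 0x03)
    chunk0_size = int.from_bytes(buf[pos:pos + chunk_size_width], "little")
    pos += chunk_size_width

    return {"flags": flags, "times": times, "phase_change": phase_change,
            "chunk0_size": chunk0_size, "messages_start": pos}
```

The returned `messages_start` is the offset of the first header message; whether each message prefix also carries a Creation Order field depends on bit 2 of the same flags.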

The header message types and the message data associated with + them compose the critical “metadata” about each object. Some + header messages are required for each object while others are + optional. Some optional header messages may also be repeated + several times in the header itself; the requirements and number + of times allowed in the header are noted in each header + message description below. +

+ + +

+ IV.A.2. Disk Format: Level 2A2 - Data Object Header Messages

+ +

Data object header messages are small pieces of metadata that are + stored in the data object header for each object in an HDF5 file. + Data object header messages provide the metadata required to describe + an object and its contents, as well as optional pieces of metadata + that annotate the meaning or purpose of the object. +

+ +

Data object header messages are either stored directly in the data + object header for the object or are shared between multiple objects + in the file. When a message is shared, a flag in the Message Flags + indicates that the actual Message Data + portion of that message is stored in another location (such as another + data object header, or a heap in the file) and the Message Data + field contains the information needed to locate the actual information + for the message. +

+ +

+ The format of shared message data is described here:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message (Version 1) +
byte | byte | byte | byte
Version | Type | Reserved (zero)
Reserved (zero)

Address^O

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message (Version 1) +
Field Name | Description

Version

The version number is used when there are changes in the format + of a shared object message and is described here: + + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by the library before version 1.6.1. +

+

Type

The type of shared message location: + + + + + + + + + + +
Value | Description
0 | Message stored in another object’s header (a committed + message). +

+

Address

The address of the object header + containing the message to be shared.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message (Version 2) +
byte | byte | byte | byte
Version | Type | (This space inserted only to align table nicely)

Address^O

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message (Version 2) +
Field Name | Description

Version

The version number is used when there are changes in the format + of a shared object message and is described here: + + + + + + + + + + +
Version | Description
2 | Used by library versions 1.6.1 and later. +

+

Type

The type of shared message location: + + + + + + + + + + +
Value | Description
0 | Message stored in another object’s header (a committed + message). +

+

Address

The address of the object header + containing the message to be shared.

+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message (Version 3) +
byte | byte | byte | byte
Version | Type | (This space inserted only to align table nicely)
Location (variable size)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message (Version 3) +
Field Name | Description

Version

The version number indicates changes in the format of the shared + object message and is described here: + + + + + + + + + + +
Version | Description
3 | Used by library versions 1.8 and later. In this + version, the Type field can indicate that + the message is stored in the fractal heap. +

+

Type

The type of shared message location: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Message is not shared and is not shareable. +
1 | Message stored in file’s shared object header message + heap (a shared message). +
2 | Message stored in another object’s header (a committed + message). +
3 | Message is not shared, but is shareable. +

+

Location

This field contains either a + Size of Offsets-bytes address of the object header + containing the message to be shared, or an 8-byte fractal heap + ID for the message in the file’s shared object header + message heap. +

+
+
+ + +
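As an illustration of the version 3 layout above, the following sketch decodes the Location field according to the Type value. It is a hedged example, not the library's implementation: the function name is hypothetical, multi-byte integers are assumed to be little-endian, and the Size of Offsets value (8 here) would normally come from the superblock. Only types 1 and 2 are handled, since those are the two cases the Location description covers.

```python
def parse_shared_message_v3(buf: bytes, size_of_offsets: int = 8) -> dict:
    """Decode a version 3 shared message: Version, Type, then Location."""
    version, msg_type = buf[0], buf[1]
    if version != 3:
        raise ValueError(f"expected version 3, got {version}")
    if msg_type == 1:
        # Shared message heap: Location is an 8-byte fractal heap ID (opaque).
        return {"type": msg_type, "heap_id": bytes(buf[2:10])}
    if msg_type == 2:
        # Committed message: Location is the object header address.
        # Assumption: the address is stored little-endian.
        address = int.from_bytes(buf[2:2 + size_of_offsets], "little")
        return {"type": msg_type, "address": address}
    raise ValueError(f"no Location decoding sketched for type {msg_type}")
```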

The following is a list of currently defined header messages: +

+ +

IV.A.2.a. The NIL Message

+ + +
+ + + + + + + + +
Header Message Name: NIL
Header Message Type: 0x0000
Length: Varies
Status: Optional; may be repeated.
Description: The NIL message is used to indicate a message which is to be + ignored when reading the header messages for a data object. + [Possibly one which has been deleted for some reason.] +
Format of Data: Unspecified
+ + + +

IV.A.2.b. The Dataspace Message

+ + +
+ + + + + + + + + + +
Header Message Name: Dataspace
Header Message Type: 0x0001
Length: Varies according to the number of + dimensions, as described in the following table.
Status: Required for dataset objects; + may not be repeated.
Description: The dataspace message describes the number of dimensions (in + other words, the “rank”) and the size of each dimension that + the data object has. This message is only used for datasets which + have a simple, rectilinear, array-like layout; datasets requiring + a more complex layout are not yet supported. +
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Dataspace Message - Version 1 +
byte | byte | byte | byte
Version | Dimensionality | Flags | Reserved
Reserved

Dimension #1 Size^L

.
.
.

Dimension #n Size^L


Dimension #1 Maximum Size^L (optional)

.
.
.

Dimension #n Maximum Size^L (optional)


Permutation Index #1^L (optional)

.
.
.

Permutation Index #n^L (optional)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Dataspace Message - Version 1 +
Field Name | Description

Version

+

This value is used to determine the format of the + Dataspace Message. When the format of the + information in the message is changed, the version number + is incremented and can be used to determine how the + information in the object header is formatted. This + document describes version one (1) (there was no version + zero (0)). +

+

Dimensionality

+

This value is the number of dimensions that the data + object has. +

+

Flags

+

This field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. Bit 1 is used to indicate that + permutation indices are present. +

+

Dimension #n Size

+

This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Dimension #n Maximum Size

+

This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + “unlimited” size which indicates + that the data may expand along this dimension indefinitely. + If these values are not stored, the maximum size of each + dimension is assumed to be the dimension’s current size. +

+

Permutation Index #n

+

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. If these values are + not stored, the first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+
+
+ + + +
+

Version 2 of the dataspace message dropped the optional + permutation index value support, as it was never implemented in the + HDF5 Library:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Dataspace Message - Version 2 +
byte | byte | byte | byte
Version | Dimensionality | Flags | Type

Dimension #1 Size^L

.
.
.

Dimension #n Size^L


Dimension #1 Maximum Size^L (optional)

.
.
.

Dimension #n Maximum Size^L (optional)

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Dataspace Message - Version 2 +
Field Name | Description

Version

+

This value is used to determine the format of the + Dataspace Message. This field should be ‘2’ for version 2 + format messages. +

+

Dimensionality

+

This value is the number of dimensions that the data object has. +

+

Flags

+

This field is used to store flags to indicate the + presence of parts of this message. Bit 0 (the least + significant bit) is used to indicate that maximum + dimensions are present. +

+

Type

+

This field indicates the type of the dataspace: + + + + + + + + + + + + + + + + + + +
Value | Description
0 | A scalar dataspace; in other words, + a dataspace with a single, dimensionless element. +
1 | A simple dataspace; in other words, + a dataspace with a rank greater than 0 and an + appropriate number of dimensions. +
2 | A null dataspace; in other words, + a dataspace with no elements. +

+

Dimension #n Size

+

This value is the current size of the dimension of the + data as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Dimension #n Maximum Size

+

This value is the maximum size of the dimension of the + data as stored in the file. This value may be the special + “unlimited” size which indicates + that the data may expand along this dimension indefinitely. + If these values are not stored, the maximum size of each + dimension is assumed to be the dimension’s current size. +

+
+
+ + + + + +
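The version 2 dataspace layout above can be read with a short routine like the one below. This is a sketch under stated assumptions: the function and dictionary names are hypothetical, integers are assumed to be little-endian, and Size of Lengths is taken to be 8 bytes rather than read from the superblock. When bit 0 of Flags is clear, the maximum sizes default to the current sizes, as the field description states.

```python
DATASPACE_TYPES = {0: "scalar", 1: "simple", 2: "null"}


def parse_dataspace_v2(buf: bytes, size_of_lengths: int = 8) -> dict:
    """Decode Version, Dimensionality, Flags, Type and the dimension lists."""
    version, rank, flags, space_type = buf[0], buf[1], buf[2], buf[3]
    if version != 2:
        raise ValueError(f"expected version 2, got {version}")

    def read_sizes(start: int) -> list:
        # Read `rank` consecutive unsigned integers of size_of_lengths bytes.
        return [
            int.from_bytes(
                buf[start + i * size_of_lengths:start + (i + 1) * size_of_lengths],
                "little",
            )
            for i in range(rank)
        ]

    dims = read_sizes(4)
    if flags & 0x01:
        # Bit 0 set: maximum sizes follow the current sizes.
        max_dims = read_sizes(4 + rank * size_of_lengths)
    else:
        # Not stored: each maximum is assumed equal to the current size.
        max_dims = list(dims)
    return {
        "type": DATASPACE_TYPES.get(space_type, "unknown"),
        "dims": dims,
        "max_dims": max_dims,
    }
```

The first entry of `dims` is the slowest-changing dimension and the last is the fastest-changing one, matching the order in which the sizes are stored.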

IV.A.2.c. The Link Info Message

+ + +
+ + + + + + + + +
Header Message Name: Link Info
Header Message Type: 0x0002
Length: Varies
Status: Optional; may not be + repeated.
Description: The link info message tracks variable information about the + current state of the links for a “new style” + group. Variable information is stored in + this message and constant information is stored in the + Group Info message. +
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Link Info +
byte | byte | byte | byte
Version | Flags | (This space inserted only to align table nicely)

Maximum Creation Index (8 bytes, optional)


Fractal Heap Address^O


Address of v2 B-tree for Name Index^O


Address of v2 B-tree for Creation Order Index^O (optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Link Info +
Field Name | Description

Version

+

The version number for this message. This document describes + version 0.

+

Flags

This field determines various optional aspects of the link + info message: + + + + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, creation order for the links is tracked. +
1 | If set, creation order for the links is indexed. +
2-7 | Reserved

+ +

Maximum Creation Index

This 64-bit value is the maximum creation order index value + stored for a link in this group.

+

This field is present if bit 0 of flags is set.

+

Fractal Heap Address

+

+ This is the address of the fractal heap used to store dense links. + Each link in the fractal heap is stored as a + Link Message. +

+

+ If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

Address of v2 B-tree for Name Index

This is the address of the version 2 B-tree to index names of links.

+

If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

Address of v2 B-tree for Creation Order Index

This is the address of the version 2 B-tree to index creation order of links.

+

If there are no links in the group, or the group’s links + are stored “compactly” (as object header messages), this + value will be the undefined + address. +

+

This field exists if bit 1 of flags is set.

+
+
+ + +
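Because two of the Link Info fields are optional, the offsets of the later fields depend on the Flags byte. The sketch below walks the layout in order; it is illustrative only, with a hypothetical function name, an assumed little-endian encoding, and Size of Offsets assumed to be 8 bytes.

```python
def parse_link_info(buf: bytes, size_of_offsets: int = 8) -> dict:
    """Decode a version 0 Link Info message, honoring the optional fields."""
    version, flags = buf[0], buf[1]
    if version != 0:
        raise ValueError(f"expected version 0, got {version}")
    pos = 2

    def take(nbytes: int) -> int:
        # Consume the next nbytes as an unsigned little-endian integer.
        nonlocal pos
        value = int.from_bytes(buf[pos:pos + nbytes], "little")
        pos += nbytes
        return value

    info = {
        "creation_order_tracked": bool(flags & 0x01),
        "creation_order_indexed": bool(flags & 0x02),
    }
    # Present only if bit 0 of flags is set.
    info["max_creation_index"] = take(8) if flags & 0x01 else None
    info["fractal_heap_address"] = take(size_of_offsets)
    info["name_index_address"] = take(size_of_offsets)
    # Present only if bit 1 of flags is set.
    info["creation_order_index_address"] = (
        take(size_of_offsets) if flags & 0x02 else None
    )
    return info
```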

IV.A.2.d. The Datatype Message

+ + +
+ + + + + + + + +
Header Message Name: Datatype
Header Message Type: 0x0003 +
Length: Variable
Status: Required for dataset or committed + datatype (formerly named datatype) objects; may not be repeated. +
Description:

The datatype message defines the datatype for each element + of a dataset or a common datatype for sharing between multiple + datasets. A datatype can describe an atomic type like a fixed- + or floating-point type or more complex types like a C struct + (compound datatype), array (array datatype), or C++ vector + (variable-length datatype).

+

Datatype messages that are part of a dataset object do not + describe how elements are related to one another; the dataspace + message is used for that purpose. Datatype messages that are part of + a committed datatype (formerly named datatype) object describe + a common datatype that can be shared by multiple datasets in the + file.

+
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Datatype Message +
byte | byte | byte | byte
Class and Version | Class Bit Field, Bits 0-7 | Class Bit Field, Bits 8-15 | Class Bit Field, Bits 16-23
Size


Properties


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Datatype Message +
Field Name | Description

Class and Version

+

The version of the datatype message and the datatype’s class + information are packed together in this field. The version + number is packed in the top 4 bits of the field and the class + is contained in the bottom 4 bits. +

+

The version number information is used for changes in the + format of the datatype message and is described here: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Version | Description
0 | Never used +
1 | Used by early versions of the library to encode + compound datatypes with explicit array fields. + See the compound datatype description below for + further details. +
2 | Used when an array datatype needs to be encoded. +
3 | Used when a VAX byte-ordered type needs to be + encoded. Also packs various other datatype classes more + efficiently. +
4 | Used to encode the revised reference datatype. +

+ +

The class of the datatype determines the format for the class + bit field and properties portion of the datatype message, which + are described below. The + following classes are currently defined: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Fixed-Point
1 | Floating-Point
2 | Time
3 | String
4 | Bit field
5 | Opaque
6 | Compound
7 | Reference
8 | Enumerated
9 | Variable-Length
10 | Array

+ +

Class Bit Fields

+

The information in these bit fields is specific to each datatype + class and is described below. All bits not defined for a + datatype class are set to zero. +

+

Size

+

The size of a datatype element in bytes. +

+

Properties

+

This variable-sized sequence of bytes encodes information + specific to each datatype class and is described for each class + below. If there is no property information specified for a + datatype class, the size of this field is zero bytes. +

+
+
+ + +
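The packing of the first eight bytes of a datatype message can be illustrated with the sketch below. The function name is hypothetical and the little-endian encoding of the Size field is an assumption; the version/class split (top 4 bits and bottom 4 bits) and the order of the three bit-field bytes follow the layout and field descriptions above.

```python
DATATYPE_CLASSES = {
    0: "Fixed-Point", 1: "Floating-Point", 2: "Time", 3: "String",
    4: "Bit field", 5: "Opaque", 6: "Compound", 7: "Reference",
    8: "Enumerated", 9: "Variable-Length", 10: "Array",
}


def parse_datatype_header(buf: bytes) -> dict:
    """Split the fixed-size prefix of a datatype message into its fields."""
    class_and_version = buf[0]
    version = class_and_version >> 4       # top 4 bits
    class_id = class_and_version & 0x0F    # bottom 4 bits
    # Bytes 1..3 hold bits 0-7, 8-15 and 16-23, so a little-endian read
    # yields a 24-bit integer whose bit n is class bit n.
    bit_field = int.from_bytes(buf[1:4], "little")
    size = int.from_bytes(buf[4:8], "little")  # element size in bytes
    return {
        "version": version,
        "class": DATATYPE_CLASSES.get(class_id, "unknown"),
        "bit_field": bit_field,
        "size": size,
        "properties": bytes(buf[8:]),
    }
```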
+
+ +

Class specific information for the Fixed-point Numbers class + (Class 0):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bits: Fixed-point Bit Field Description +
Bits | Meaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big-endian.

1, 2

Padding type. Bit 1 is the lo_pad bit and bit 2 + is the hi_pad bit. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.

3

Signed. If this bit is set then the fixed-point + number is in 2’s complement form.

4-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + +
+ Layout: Fixed-point Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Fixed-point Property Description +
Field Name | Description

Bit Offset

+

The bit offset of the first significant bit of the fixed-point + value within the datatype. The bit offset specifies the number + of bits “to the right of” the value (which are set to the + lo_pad bit value). +

+

Bit Precision

+

The number of bits of precision of the fixed-point value + within the datatype. This value, combined with the datatype + element’s size and the Bit Offset field specifies the number + of bits “to the left of” the value (which are set to the + hi_pad bit value). +

+
+
+ + +
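To make the fixed-point description concrete, the sketch below decodes the class bit field and the 4-byte property block, and derives the number of high padding bits from the element size as the Bit Precision description suggests. The function name is hypothetical and the two 16-bit property fields are assumed to be little-endian.

```python
def decode_fixed_point(bit_field: int, properties: bytes, size: int) -> dict:
    """Decode fixed-point (class 0) bit flags and properties.

    `size` is the datatype element size in bytes from the message prefix.
    """
    bit_offset = int.from_bytes(properties[0:2], "little")
    bit_precision = int.from_bytes(properties[2:4], "little")
    return {
        "byte_order": "big-endian" if bit_field & 0x01 else "little-endian",
        "lo_pad": (bit_field >> 1) & 1,   # value copied into low unused bits
        "hi_pad": (bit_field >> 2) & 1,   # value copied into high unused bits
        "signed": bool(bit_field & 0x08),  # two's complement if set
        "bit_offset": bit_offset,          # padding bits to the right
        "bit_precision": bit_precision,
        # Bits to the left of the value: total bits minus offset and precision.
        "hi_pad_bits": size * 8 - bit_offset - bit_precision,
    }
```

For a signed 32-bit little-endian integer, one would expect a bit field of `0x08`, a bit offset of 0, and a precision of 32, leaving no padding on either side.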
+
+ +

Class specific information for the Floating-point Numbers class + (Class 1):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bits: Floating-point Bit Field Description +
Bits | Meaning

0, 6

Byte Order. These two non-contiguous bits specify the + “endianness” of the bytes in the datatype element. + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Bit 6 | Bit 0 | Description
0 | 0 | Byte order is little-endian +
0 | 1 | Byte order is big-endian +
1 | 0 | Reserved +
1 | 1 | Byte order is VAX-endian +

+

1, 2, 3

Padding type. Bit 1 is the low bits pad type, bit 2 + is the high bits pad type, and bit 3 is the internal bits + pad type. If a datum has unused bits at either end or between + the sign bit, exponent, or mantissa, then the value of bit + 1, 2, or 3 is copied to those locations.

4-5

Mantissa Normalization. This 2-bit bit field specifies + how the most significant bit of the mantissa is managed. + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | No normalization +
1 | The most significant bit of the mantissa is always set + (except for 0.0). +
2 | The most significant bit of the mantissa is not stored, + but is implied to be set. +
3 | Reserved. +

+

7

Reserved (zero).

8-15

Sign Location. This is the bit position of the sign + bit. Bits are numbered with the least significant bit as bit zero.

16-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Floating-point Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
Exponent Location | Exponent Size | Mantissa Location | Mantissa Size
Exponent Bias
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Floating-point Property Description +
Field Name | Description

Bit Offset

+

The bit offset of the first significant bit of the floating-point + value within the datatype. The bit offset specifies the number + of bits “to the right of” the value. +

+

Bit Precision

+

The number of bits of precision of the floating-point value + within the datatype. +

+

Exponent Location

+

The bit position of the exponent field. Bits are numbered with + the least significant bit number zero. +

+

Exponent Size

+

The size of the exponent field in bits. +

+

Mantissa Location

+

The bit position of the mantissa field. Bits are numbered with + the least significant bit number zero. +

+

Mantissa Size

+

The size of the mantissa field in bits. +

+

Exponent Bias

+

The bias of the exponent field. +

+
+
+ + +
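The floating-point bit field is the least regular of the class bit fields, because the byte order is split across bits 0 and 6. The sketch below decodes it together with the 12-byte property block. The names are hypothetical and the little-endian encoding of the property fields is an assumption; the sample values used in the note afterwards correspond to a 32-bit IEEE 754 layout and are given for illustration only.

```python
import struct

FLOAT_BYTE_ORDER = {
    (0, 0): "little-endian",
    (0, 1): "big-endian",
    (1, 0): "reserved",
    (1, 1): "VAX-endian",
}


def decode_float_bit_field(bit_field: int) -> dict:
    """Decode the 24-bit class bit field of a floating-point datatype."""
    bit6 = (bit_field >> 6) & 1
    bit0 = bit_field & 1
    return {
        "byte_order": FLOAT_BYTE_ORDER[(bit6, bit0)],
        "lo_pad": (bit_field >> 1) & 1,
        "hi_pad": (bit_field >> 2) & 1,
        "internal_pad": (bit_field >> 3) & 1,
        "mantissa_normalization": (bit_field >> 4) & 0x03,
        "sign_location": (bit_field >> 8) & 0xFF,
    }


def decode_float_properties(properties: bytes) -> dict:
    """Decode the property block: two 16-bit, four 8-bit, one 32-bit field."""
    names = (
        "bit_offset", "bit_precision",
        "exponent_location", "exponent_size",
        "mantissa_location", "mantissa_size",
        "exponent_bias",
    )
    # Assumption: all fields are unsigned and little-endian.
    return dict(zip(names, struct.unpack("<HHBBBBI", properties[:12])))
```

With a bit field of `0x1F20` (normalization 2, sign bit at position 31) and properties of offset 0, precision 32, exponent at bit 23 with size 8, mantissa at bit 0 with size 23, and bias 127, the decoded values describe a little-endian single-precision float with an implied leading mantissa bit.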
+
+ +

Class specific information for the Time class (Class 2):

+ + +
+ + + + + + + + + + + + + + + + + +
+ Bits: Time Bit Field Description +
Bits | Meaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big-endian.

1-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + +
+ Layout: Time Property Description +
Byte | Byte
Bit Precision
+
+ +
+
+ + + + + + + + + + + + +
+ Fields: Time Property Description +
Field Name | Description

Bit Precision

+

The number of bits of precision of the time value. +

+
+
+ + +
+ +

Class specific information for the Strings class (Class 3):

+ + +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Bits: String Bit Field Description +
Bits | Meaning

0-3

Padding type. This four-bit value determines the + type of padding to use for the string. The values are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Null Terminate: A zero byte marks the end of the + string and is guaranteed to be present after + converting a long string to a short string. When + converting a short string to a long string the value is + padded with additional null characters as necessary. +
1 | Null Pad: Null characters are added to the end of + the value during conversions from short values to long + values but conversion in the opposite direction simply + truncates the value. +
2 | Space Pad: Space characters are added to the end of + the value during conversions from short values to long + values but conversion in the opposite direction simply + truncates the value. This is the Fortran + representation of the string. +
3-15 | Reserved +

+

4-7

Character Set. The character set used to + encode the string. + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | ASCII character set encoding +
1 | UTF-8 character set encoding +
2-15 | Reserved +

+

8-23

Reserved (zero).

+
+ +

There are no properties defined for the string class. +
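The two four-bit values of the string bit field can be extracted with simple masks, as in this sketch. The function and table names are hypothetical; values outside the defined ranges are reported as reserved.

```python
STRING_PADDING = {0: "Null Terminate", 1: "Null Pad", 2: "Space Pad"}
STRING_CHARSET = {0: "ASCII", 1: "UTF-8"}


def decode_string_bit_field(bit_field: int) -> dict:
    """Decode padding type (bits 0-3) and character set (bits 4-7)."""
    padding = bit_field & 0x0F
    charset = (bit_field >> 4) & 0x0F
    return {
        "padding": STRING_PADDING.get(padding, "Reserved"),
        "charset": STRING_CHARSET.get(charset, "Reserved"),
    }
```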

+ +
+
+ +

Class specific information for the Bit Fields class (Class 4):

+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Bits: Bitfield Bit Field Description +
Bits | Meaning

0

Byte Order. If zero, byte order is little-endian; + otherwise, byte order is big-endian.

1, 2

Padding type. Bit 1 is the lo_pad type and bit 2 + is the hi_pad type. If a datum has unused bits at either + end, then the lo_pad or hi_pad bit is copied to those + locations.

3-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + +
+ Layout: Bit Field Property Description +
Byte | Byte | Byte | Byte
Bit Offset | Bit Precision
+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Bit Field Property Description +
Field Name | Description

Bit Offset

+

The bit offset of the first significant bit of the bit field + within the datatype. The bit offset specifies the number + of bits “to the right of” the value. +

+

Bit Precision

+

The number of bits of precision of the bit field + within the datatype. +

+
+
+ + +
+
+ +

Class specific information for the Opaque class (Class 5):

+ +
+ + + + + + + + + + + + + + + + + +
+ Bits: Opaque Bit Field Description +
Bits | Meaning

0-7

Length of ASCII tag in bytes.

8-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + +
+ Layout: Opaque Property Description +
Byte | Byte | Byte | Byte

ASCII Tag
+
+
+ +
+
+ + + + + + + + + + + +
+ Fields: Opaque Property Description +
Field Name | Description

ASCII Tag

+

This NUL-terminated string provides a description for the + opaque type. It is NUL-padded to a multiple of 8 bytes. +

+
+
+ + +
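The padding rule for the ASCII Tag (NUL-terminated, then NUL-padded to a multiple of 8 bytes) can be sketched as follows. The function name is hypothetical, and the sketch covers only the Properties bytes, not the length stored in the class bit field.

```python
def encode_opaque_tag(tag: str) -> bytes:
    """Return the tag as NUL-terminated ASCII, padded to a multiple of 8 bytes."""
    raw = tag.encode("ascii") + b"\x00"      # terminator is always present
    padded_len = (len(raw) + 7) // 8 * 8     # round up to a multiple of 8
    return raw.ljust(padded_len, b"\x00")
```

A tag of exactly 8 characters therefore occupies 16 bytes, since the terminator pushes it past the first 8-byte boundary.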
+
+ +

Class specific information for the Compound class (Class 6):

+ +
+ + + + + + + + + + + + + + + + + +
+ Bits: Compound Bit Field Description +
Bits | Meaning

0-15

Number of Members. This field contains the number + of members defined for the compound datatype. The member + definitions are listed in the Properties field of the + datatype message.

16-23

Reserved (zero).

+
+ + +

The Properties field of a compound datatype is a list of the + member definitions of the compound datatype. The member + definitions appear one after another with no intervening bytes. + The member types are described with a (recursively) encoded datatype + message.

+ +

Note that the property descriptions differ between + versions of the datatype message. Additionally, note that the version + 0 datatype encoding is deprecated and has been replaced with later + encodings in versions of the HDF5 Library from the 1.4 release + onward.

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Compound Properties Description for Datatype Version 1 +
Byte | Byte | Byte | Byte

Name

Byte Offset of Member
Dimensionality | Reserved (zero)
Dimension Permutation
Reserved (zero)
Dimension #1 Size (required)
Dimension #2 Size (required)
Dimension #3 Size (required)
Dimension #4 Size (required)

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Compound Properties Description for Datatype Version 1 +
Field Name | Description

Name

+

This NUL-terminated string provides the name of the + compound datatype member. It is NUL-padded to a multiple of 8 bytes. +

+

Byte Offset of Member

+

This is the byte offset of the member within the datatype. +

+

Dimensionality

+

If set to zero, this field indicates a scalar member. If set + to a value greater than zero, this field indicates that the + member is an array of values. For array members, the size of + the array is indicated by the ‘Dimension #n Size’ fields in + this message. +

+

Dimension Permutation

+

This field was intended to allow an array field to have + its dimensions permuted, but this was never implemented. + This field should always be set to zero. +

+

Dimension #n Size

+

This field is the size of a dimension of the array field as + stored in the file. The first dimension stored in the list of + dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+

Member Type Message

+

This field is a datatype message describing the datatype of + the member. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Compound Properties Description for Datatype Version 2 +
Byte | Byte | Byte | Byte

Name

Byte Offset of Member

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Compound Properties Description for Datatype Version 2 +
Field Name | Description

Name

+

This NUL-terminated string provides the name of the + compound datatype member. It is NUL-padded to a multiple of 8 bytes. +

+

Byte Offset of Member

+

This is the byte offset of the member within the datatype. +

+

Member Type Message

+

This field is a datatype message describing the datatype of + the member. +

+
+
+ + +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Compound Properties Description for Datatype Version 3 +
Byte | Byte | Byte | Byte

Name

Byte Offset of Member (variable size)

Member Type Message

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Compound Properties Description for Datatype Version 3 +
Field Name | Description

Name

This NUL-terminated string provides the name of the + compound datatype member. It is not NUL-padded to a multiple of 8 + bytes.

Byte Offset of Member

This is the byte offset of the member within the datatype. + The field size is the minimum number of bytes necessary, + based on the size of the datatype element. For example, a + datatype element size of less than 256 bytes uses a 1-byte + field, a datatype element size of 256-65535 bytes uses a + 2-byte field, and so on.

Member Type Message

This field is a datatype message describing the datatype of + the member.

+
+ + +
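The variable-size Byte Offset of Member field in version 3 follows from the datatype element size. A minimal sketch of the rule, with a hypothetical function name, is shown below; it extends the "and so on" in the description by assuming the width keeps growing one byte at a time.

```python
def member_offset_width(datatype_size: int) -> int:
    """Minimum number of bytes needed for an offset within the datatype.

    Sizes below 256 need 1 byte, sizes of 256-65535 need 2 bytes; larger
    sizes are assumed to continue the same pattern.
    """
    width = 1
    while datatype_size >= (1 << (8 * width)):
        width += 1
    return width
```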
+
+ +

Class specific information for the Reference class (Class 7):

+ +
+ + + + + + + + + + + + + + + + + +
+ Bits: Reference Bit Field Description for Datatype Version < 4 +
Bits | Meaning

0-3

Type. This four-bit value contains the reference types which are supported for + backward compatibility. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Object Reference (H5R_OBJECT1): A reference to another object in this + HDF5 file. +
1 | Dataset Region Reference (H5R_DATASET_REGION1): A reference to a region within + a dataset in this HDF5 file. +
2-15 | Reserved +

+ +

4-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Bits: Reference Bit Field Description for Datatype Version 4 +
Bits | Meaning

0-3

Type. This four-bit value contains the revised reference types. + The values defined are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
2 | Object Reference (H5R_OBJECT2): A reference to another object + in this file or an external file. +
3 | Dataset Region Reference (H5R_DATASET_REGION2): A reference to a region within + a dataset in this file or an external file. +
4 | Attribute Reference (H5R_ATTR): A reference to an attribute attached to an + object in this file or an external file. +
5-15 | Reserved +

+ +

4-7

Version. This four-bit value contains the version for encoding + the revised reference types. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Unused +
1 | The version for encoding the revised reference types: Object Reference (2), + Dataset Region Reference (3) and Attribute Reference (4). +
2-15 | Reserved +

+ +

8-23

Reserved (zero).

+
+ +

There are no properties defined for the reference class. +
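For datatype version 4, the reference bit field carries both the reference type and an encoding version. The sketch below separates the two; the function and table names are hypothetical, and type values not listed in the version 4 table are returned as `None` rather than interpreted.

```python
REFERENCE_TYPES_V4 = {
    2: "H5R_OBJECT2",
    3: "H5R_DATASET_REGION2",
    4: "H5R_ATTR",
}


def decode_reference_bit_field_v4(bit_field: int) -> tuple:
    """Return (reference type name or None, encoding version)."""
    ref_type = bit_field & 0x0F                 # bits 0-3
    encoding_version = (bit_field >> 4) & 0x0F  # bits 4-7
    return REFERENCE_TYPES_V4.get(ref_type), encoding_version
```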

+ + +
+
+ +

Class specific information for the Enumeration class (Class 8):

+ +
+ + + + + + + + + + + + + + + + + +
+ Bits: Enumeration Bit Field Description +
Bits | Meaning

0-15

Number of Members. The number of name/value + pairs defined for the enumeration type.

16-23

Reserved (zero).

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Enumeration Property Description for Datatype Versions + 1 and 2 +
Byte | Byte | Byte | Byte

Base Type


Names


Values

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Enumeration Property Description for Datatype Versions + 1 and 2 +
Field Name | Description

Base Type

+

Each enumeration type is based on some parent type, usually an + integer. The information for that parent type is described + recursively by this field. +

+

Names

+

The name for each name/value pair. Each name is stored as a + null-terminated ASCII string, padded to a multiple of eight bytes. The names + are in no particular order. +

+

Values

+

The list of values in the same order as the names. The values + are packed (no inter-value padding) and the size of each value + is determined by the parent type. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Enumeration Property Description for Datatype Version 3 +
Byte | Byte | Byte | Byte

Base Type


Names


Values

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Enumeration Property Description for Datatype Version 3 +
Field Name | Description

Base Type

+

Each enumeration type is based on some parent type, usually an + integer. The information for that parent type is described + recursively by this field. +

+

Names

+

The name for each name/value pair. Each name is stored as a + null-terminated ASCII string, not padded to a multiple of + eight bytes. The names are in no particular order. +

+

Values

+

The list of values in the same order as the names. The values + are packed (no inter-value padding) and the size of each value + is determined by the parent type. +

+
+
+ + + +
+ +

Class specific information for the Variable-length class (Class 9):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Bits: Variable-length Bit Field Description +
Bits | Meaning

0-3

Type. This four-bit value contains the type of + variable-length datatype described. The values defined are: + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Sequence: A variable-length sequence of any datatype. + Variable-length sequences do not have padding or + character set information. +
1 | String: A variable-length sequence of characters. + Variable-length strings have padding and character set + information. +
2-15 | Reserved +

+ +

4-7

Padding type. (variable-length string only) + This four-bit value determines the type of padding + used for variable-length strings. The values are the same + as for the string padding type, as follows: + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
0 | Null terminate: A zero byte marks the end of a string + and is guaranteed to be present after converting a long + string to a short string. When converting a short string + to a long string, the value is padded with additional null + characters as necessary. +
1 | Null pad: Null characters are added to the end of the + value during conversion from a short string to a longer + string. Conversion from a long string to a shorter string + simply truncates the value. +
2 | Space pad: Space characters are added to the end of the + value during conversion from a short string to a longer + string. Conversion from a long string to a shorter string + simply truncates the value. This is the Fortran + representation of the string. +
3-15 | Reserved +

+ +

This value is set to zero for variable-length sequences.

+ +

8-11

Character Set. (variable-length string only) + This four-bit value specifies the character set + to be used for encoding the string: + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0ASCII character set encoding +
1UTF-8 character set encoding +
2-15Reserved +

+ +

This value is set to zero for variable-length sequences.

+ +

12-23

Reserved (zero).

+
+ +
+
+ + + + + + + + + + + + + + +
+ Layout: Variable-length Property Description +
ByteByteByteByte

Base Type

+
+ +
+
+ + + + + + + + + + + + +
+ Fields: Variable-length Property Description +
Field NameDescription

Base Type

+

Each variable-length type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ + +
+
+ +
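The bit layout above can be unpacked with a few shifts and masks. The following is a minimal sketch, not library code: the function name `decode_vlen_bits` and the returned dictionary keys are hypothetical, and the 24-bit class bit field is assumed to have already been read into a Python integer.

```python
# Lookup tables taken from the bit field description above.
VLEN_TYPES = {0: "sequence", 1: "string"}
PADDING = {0: "null-terminate", 1: "null-pad", 2: "space-pad"}
CHARSETS = {0: "ASCII", 1: "UTF-8"}


def decode_vlen_bits(bits: int) -> dict:
    """Split the 24-bit class bit field of a variable-length datatype.

    Hypothetical helper for illustration only.
    """
    vtype = bits & 0xF             # bits 0-3: type
    padding = (bits >> 4) & 0xF    # bits 4-7: padding type (strings only)
    charset = (bits >> 8) & 0xF    # bits 8-11: character set (strings only)
    reserved = (bits >> 12) & 0xFFF  # bits 12-23: reserved, must be zero
    is_string = vtype == 1
    return {
        "type": VLEN_TYPES.get(vtype, "reserved"),
        # Padding and character set are only meaningful for strings.
        "padding": PADDING.get(padding, "reserved") if is_string else None,
        "charset": CHARSETS.get(charset, "reserved") if is_string else None,
        "reserved_is_zero": reserved == 0,
    }
```

For a sequence, the padding and character set are reported as `None` because the specification sets both to zero and assigns them no meaning.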

Class specific information for the Array class (Class 10):

+ +

There are no bit fields defined for the array class. +

+ +

Note that the dimension information defined in the property for this + datatype class is independent of dataspace information for a dataset. + The dimension information here describes the dimensionality of the + information within a data element (or a component of an element, if the + array datatype is nested within another datatype) and the dataspace for a + dataset describes the size and locations of the elements in a dataset. +

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Array Property Description for Datatype Version 2 +
ByteByteByteByte
DimensionalityReserved (zero)
Dimension #1 Size
.
.
.
Dimension #n Size
Permutation Index #1
.
.
.
Permutation Index #n

Base Type

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Array Property Description for Datatype Version 2 +
Field NameDescription

Dimensionality

+

This value is the number of dimensions that the array has. +

+

Dimension #n Size

+

This value is the size of the dimension of the array + as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Permutation Index #n

+

This value is the index permutation used to map + each dimension from the canonical representation to an + alternate axis for each dimension. Currently, dimension + permutations are not supported, and these indices should + be set to the index position minus one. In other words, + the first dimension should be set to 0, the second dimension + should be set to 1, and so on. +

+

Base Type

+

Each array type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Array Property Description for Datatype Version 3 +
ByteByteByteByte
DimensionalityThis space inserted only to align table nicely
Dimension #1 Size
.
.
.
Dimension #n Size

Base Type

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Array Property Description for Datatype Version 3 +
Field NameDescription

Dimensionality

+

This value is the number of dimensions that the array has. +

+

Dimension #n Size

+

This value is the size of the dimension of the array + as stored in the file. The first dimension stored in + the list of dimensions is the slowest changing dimension + and the last dimension stored is the fastest changing + dimension. +

+

Base Type

+

Each array type is based on some parent type. The + information for that parent type is described recursively by + this field. +

+
+
+ + + +
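As an illustration of the version 3 array property, the sketch below reads the dimensionality and the dimension sizes and returns the offset at which the recursively encoded base type begins. It assumes little-endian integers and 4-byte dimension sizes (read off the four-byte-wide rows of the layout table); `parse_array_v3_dims` is a hypothetical name, and decoding the base type itself is left out.

```python
import struct


def parse_array_v3_dims(buf: bytes, offset: int = 0):
    """Return (dimension sizes, offset of the base type).

    Hypothetical helper; assumes little-endian, 4-byte dimension sizes.
    """
    ndims = buf[offset]            # Dimensionality: one byte
    offset += 1
    # Dimension sizes, slowest changing dimension first.
    dims = struct.unpack_from(f"<{ndims}I", buf, offset)
    offset += 4 * ndims
    return list(dims), offset
```

The caller would continue decoding a datatype message at the returned offset to obtain the parent type.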

IV.A.2.e. The Data Storage - + Fill Value (Old) Message

+ + +
+ + + + + + + + +
Header Message Name: Fill Value + (old)
Header Message Type: 0x0004
Length: Varies
Status: Optional; may not be + repeated.
Description:

The fill value message stores a single data value which + is returned to the application when an uninitialized data element + is read from a dataset. The fill value is interpreted with the + same datatype as the dataset. If no fill value message is present + then a fill value of all zero bytes is assumed.

+

This fill value message is deprecated in favor of the + “new” fill value message (Message Type 0x0005) and + is only written to the file for forward compatibility with + versions of the HDF5 Library before the 1.6.0 version. + Additionally, it only appears for datasets with a user-defined + fill value (as opposed to the library default fill value or an + explicitly set “undefined” fill value).

+
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Layout: Fill Value Message (Old) +
bytebytebytebyte
Size

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Fill Value Message (Old) +
Field NameDescription

Size

+

This is the size of the Fill Value field in bytes. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. +

+
+
+ + +
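The old fill value message is simple enough to decode in a few lines. This is a sketch under stated assumptions: the Size field is taken to be a 4-byte little-endian integer (as the layout table suggests), and `parse_old_fill_value` is a hypothetical name.

```python
import struct


def parse_old_fill_value(body: bytes) -> bytes:
    """Return the raw fill value bytes of an old (type 0x0004) message.

    Hypothetical helper; assumes a 4-byte little-endian Size field.
    """
    (size,) = struct.unpack_from("<I", body, 0)
    value = body[4:4 + size]
    if len(value) != size:
        raise ValueError("message body is shorter than the Size field claims")
    # The bytes must still be interpreted with the dataset's datatype.
    return value
```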

IV.A.2.f. The Data Storage - + Fill Value Message

+ + +
+ + + + + + + + +
Header Message Name: Fill + Value
Header Message Type: 0x0005
Length: Varies
Status: Required for dataset objects; + may not be repeated.
Description:The fill value message stores a single data value which is + returned to the application when an uninitialized data element + is read from a dataset. The fill value is interpreted with the + same datatype as the dataset.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fill Value Message - Versions 1 and 2 +
bytebytebytebyte
VersionSpace Allocation TimeFill Value Write TimeFill Value Defined
Size (optional)

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fill Value Message - Versions 1 and 2 +
Field NameDescription

Version

+

The version number information is used for changes in the + format of the fill value message and is described here: + + + + + + + + + + + + + + + + + + + + + + +
VersionDescription
0Never used +
1Initial version of this message. +
2In this version, the Size and Fill Value fields are + only present if the Fill Value Defined field is set + to 1. +
3This version packs the other fields in the message + more efficiently than version 2. +

+ +

Space Allocation Time

+

When the storage space for the dataset’s raw data will be + allocated. The allowed values are: + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Not used. +
1Early allocation. Storage space for the entire dataset + should be allocated in the file when the dataset is + created. +
2Late allocation. Storage space for the entire dataset + should not be allocated until the dataset is written + to. +
3Incremental allocation. Storage space for a + portion of the dataset should not be allocated until that + portion is written to. This is currently + used in conjunction with chunked data storage for + datasets. +

+ +

Fill Value Write Time

+

At the time that storage space for the dataset’s raw data is + allocated, this value indicates whether the fill value should + be written to the raw data storage elements. The allowed values + are: + + + + + + + + + + + + + + + + + + +
ValueDescription
0On allocation. The fill value is always written to + the raw data storage when the storage space is allocated. +
1Never. The fill value should never be written to + the raw data storage. +
2Fill value written if set by user. The fill value + will be written to the raw data storage when the storage + space is allocated only if the user explicitly set + the fill value. If the fill value is the library + default or is undefined, it will not be written to + the raw data storage. +

+ +

Fill Value Defined

+

This value indicates if a fill value is defined for this + dataset. If this value is 0, the fill value is undefined. + If this value is 1, a fill value is defined for this dataset. + For version 2 or later of the fill value message, this value + controls the presence of the Size and Fill Value fields. +

+

Size

+

This is the size of the Fill Value field in bytes. This field + is not present if the Version field is greater than 1, + and the Fill Value Defined field is set to 0. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. This field is + not present if the Version field is greater than 1, + and the Fill Value Defined field is set to 0. +

+
+
+ +
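The presence rule for the Size and Fill Value fields in versions 1 and 2 can be sketched as follows. The code assumes little-endian integers and a 4-byte Size field; the function name and dictionary keys are hypothetical.

```python
import struct

ALLOC_TIME = {1: "early", 2: "late", 3: "incremental"}
WRITE_TIME = {0: "on allocation", 1: "never", 2: "if set by user"}


def parse_fill_value_v1_v2(body: bytes) -> dict:
    """Decode a version 1 or 2 fill value message (hypothetical helper)."""
    version, alloc, write, defined = struct.unpack_from("<4B", body, 0)
    if version not in (1, 2):
        raise ValueError(f"unsupported version {version}")
    result = {
        "version": version,
        "alloc_time": ALLOC_TIME.get(alloc, "not used"),
        "write_time": WRITE_TIME.get(write, "unknown"),
        "defined": defined == 1,
        "fill_value": None,
    }
    # Version 1 always stores Size; version 2 stores Size and Fill Value
    # only when Fill Value Defined is 1.
    if version == 1 or defined == 1:
        (size,) = struct.unpack_from("<I", body, 4)
        result["fill_value"] = body[8:8 + size]
    return result
```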
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fill Value Message - Version 3 +
bytebytebytebyte
VersionFlagsThis space inserted only to align table nicely
Size (optional)

Fill Value (optional, variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fill Value Message - Version 3 +
Field NameDescription

Version

+

The version number information is used for changes in the + format of the fill value message and is described here: + + + + + + + + + + + + + + + + + + + + + + +
VersionDescription
0Never used +
1Initial version of this message. +
2In this version, the Size and Fill Value fields are + only present if the Fill Value Defined field is set + to 1. +
3This version packs the other fields in the message + more efficiently than version 2. +

+ +

Flags

+

This field packs the space allocation time, the fill value write time, and the + defined/undefined state of the fill value into a single byte. The bits are: + + + + + + + + + + + + + + + + + + + + + + + + + + +
BitsDescription
0-1Space Allocation Time, with the same + values as versions 1 and 2 of the message. +
2-3Fill Value Write Time, with the same + values as versions 1 and 2 of the message. +
4Fill Value Undefined, indicating that the fill + value has been marked as “undefined” for this dataset. + Bits 4 and 5 cannot both be set. +
5Fill Value Defined, with the same values as + versions 1 and 2 of the message. + Bits 4 and 5 cannot both be set. +
6-7Reserved (zero). +

+ +

Size

+

This is the size of the Fill Value field in bytes. This field + is not present if the Version field is greater than 1, + and the Fill Value Defined flag is set to 0. +

+

Fill Value

+

The fill value. The bytes of the fill value are interpreted + using the same datatype as for the dataset. This field is + not present if the Version field is greater than 1, + and the Fill Value Defined flag is set to 0. +

+
+
+ + +
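Version 3 moves the three one-byte fields of the earlier versions into a single Flags byte. The sketch below decodes that byte and reads the optional Size and Fill Value fields. It assumes little-endian integers and a 4-byte Size field; `parse_fill_value_v3` and its dictionary keys are hypothetical.

```python
import struct


def parse_fill_value_v3(body: bytes) -> dict:
    """Decode a version 3 fill value message (hypothetical helper)."""
    version, flags = body[0], body[1]
    if version != 3:
        raise ValueError(f"unsupported version {version}")
    undefined = bool(flags & 0x10)   # bit 4: Fill Value Undefined
    defined = bool(flags & 0x20)     # bit 5: Fill Value Defined
    if undefined and defined:
        raise ValueError("bits 4 and 5 cannot both be set")
    if flags & 0xC0:
        raise ValueError("reserved bits 6-7 must be zero")
    result = {
        "alloc_time": flags & 0x03,         # bits 0-1
        "write_time": (flags >> 2) & 0x03,  # bits 2-3
        "undefined": undefined,
        "defined": defined,
        "fill_value": None,
    }
    # Size and Fill Value are present only when the Defined flag is set.
    if defined:
        (size,) = struct.unpack_from("<I", body, 2)
        result["fill_value"] = body[6:6 + size]
    return result
```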

IV.A.2.g. The Link Message

+ + +
+ + + + + + + + +
Header Message Name: Link
Header Message Type: 0x0006
Length: Varies
Status: Optional; may be + repeated.
Description:

This message encodes the information for a link in a + group’s object header, when the group is storing its links + “compactly”, or in the group’s fractal heap, + when the group is storing its links “densely”.

+

A group is storing its links compactly when the fractal heap + address in the Link Info + Message is set to the “undefined address” + value.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Link Message +
bytebytebytebyte
VersionFlagsLink type (optional)This space inserted only to align table nicely

Creation Order (8 bytes, optional)

Link Name Character Set (optional)Length of Link Name (variable size)This space inserted only to align table nicely
Link Name (variable size)

Link Information (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Link Message +
Field NameDescription

Version

The version number for this message. This document describes version 1.

+

Flags

This field contains information about the link and controls + the presence of other fields below. + + + + + + + + + + + + + + + + + + + + + + + + + + +
BitsDescription
0-1Determines the size of the Length of Link Name + field. + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0The size of the Length of Link Name + field is 1 byte. +
1The size of the Length of Link Name + field is 2 bytes. +
2The size of the Length of Link Name + field is 4 bytes. +
3The size of the Length of Link Name + field is 8 bytes. +
+
2Creation Order Field Present: if set, the Creation + Order field is present. If not set, creation order + information is not stored for links in this group. +
3Link Type Field Present: if set, the link is not + a hard link and the Link Type field is present. + If not set, the link is a hard link. +
4Link Name Character Set Field Present: if set, the + link name is not represented with the ASCII character + set and the Link Name Character Set field is + present. If not set, the link name is represented with + the ASCII character set. +
5-7Reserved (zero). +

+ +

Link type

This is the link class type and can be one of the following + values: + + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0A hard link (should never be stored in the file) +
1A soft link. +
2-63Reserved for future HDF5 internal use. +
64An external link. +
65-255Reserved, but available for user-defined link types. +

+ +

This field is present if bit 3 of Flags is set.

+

Creation Order

This 64-bit value is an index of the link’s creation time within + the group. Values start at 0 when the group is created and increment + by one for each link added to the group. Removing a link from a + group does not change existing links’ creation order field. +

+

This field is present if bit 2 of Flags is set.

+

Link Name Character Set

This is the character set for encoding the link’s name: + + + + + + + + + + + + + + + +
ValueDescription
0ASCII character set encoding (this should never be stored + in the file) +
1UTF-8 character set encoding +

+ +

This field is present if bit 4 of Flags is set.

+

Length of link name

This is the length of the link’s name. The size of this field + depends on bits 0 and 1 of Flags.

+

Link name

This is the name of the link; it is not NULL-terminated.

+

Link information

The format of this field depends on the link type.

+

For hard links, the field is formatted as follows: + + + + + + +
+ Size of Offsets bytes:The address of the object header for the object that the + link points to. +
+

+ +

+ For soft links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of soft link value.
Length of soft link value bytes:A non-NULL-terminated string storing the value of the + soft link. +
+

+ +

+ For external links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of external link value.
Length of external link value bytes:The first byte contains the version number in the + upper 4 bits and flags in the lower 4 bits for the external + link. Both version and flags are defined to be zero in + this document. The remaining bytes consist of two + NULL-terminated strings, with no padding between them. + The first string is the name of the HDF5 file containing + the object linked to and the second string is the full path + to the object linked to, within the HDF5 file’s + group hierarchy. +
+

+ +

+ For user-defined links, the field is formatted as follows: + + + + + + + + + + +
Bytes 1-2:Length of user-defined data.
Length of user-defined link value bytes:The data supplied for the user-defined link type.
+

+ +
+
+ +
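Because several fields of the Link message are optional and one has a flag-dependent width, a worked decoder may help. The following is a sketch under stated assumptions, not the library's implementation: integers are assumed little-endian, the superblock's Size of Offsets is passed in as `size_of_offsets`, and the function name and dictionary keys are hypothetical. Link values are returned as raw bytes because the encoding of soft and external link values is not stated here.

```python
import struct


def parse_link_message(body: bytes, size_of_offsets: int = 8) -> dict:
    """Decode a version 1 Link message (hypothetical helper)."""
    version, flags = body[0], body[1]
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    pos = 2
    link_type = 0                      # hard link unless bit 3 is set
    if flags & 0x08:
        link_type = body[pos]
        pos += 1
    creation_order = None
    if flags & 0x04:                   # bit 2: Creation Order present
        (creation_order,) = struct.unpack_from("<Q", body, pos)
        pos += 8
    charset = 0                        # ASCII unless bit 4 is set
    if flags & 0x10:
        charset = body[pos]
        pos += 1
    len_size = 1 << (flags & 0x03)     # bits 0-1: 1, 2, 4 or 8 bytes
    name_len = int.from_bytes(body[pos:pos + len_size], "little")
    pos += len_size
    name = body[pos:pos + name_len].decode("utf-8" if charset == 1 else "ascii")
    info = body[pos + name_len:]
    if link_type == 0:                 # hard link: object header address
        target = int.from_bytes(info[:size_of_offsets], "little")
    elif link_type == 1 or link_type >= 65:   # soft or user-defined link
        n = int.from_bytes(info[:2], "little")
        target = info[2:2 + n]
    elif link_type == 64:              # external link
        n = int.from_bytes(info[:2], "little")
        payload = info[2:2 + n]
        # payload[0] holds version (upper 4 bits) and flags (lower 4 bits).
        file_name, obj_path, _ = payload[1:].split(b"\x00", 2)
        target = (file_name, obj_path)
    else:
        raise ValueError(f"reserved link type {link_type}")
    return {"name": name, "type": link_type,
            "creation_order": creation_order, "target": target}
```

For example, a hard link named `dset` with no optional fields occupies 2 header bytes, 1 length byte, 4 name bytes, and Size of Offsets address bytes.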

IV.A.2.h. The Data Storage - + External Data Files Message

+ + +
+ + + + + + + + +
Header Message Name: External + Data Files
Header Message Type: 0x0007
Length: Varies
Status: Optional; may not be + repeated.
Description:The external data storage message indicates that the data + for an object is stored outside the HDF5 file. The name of + each external file is stored as a Uniform Resource Locator (URL) + identifying the file that contains the data. An external file list + record also contains the byte offset of the start of the data + within the file and the amount of space reserved in the file + for that data.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: External File List Message +
bytebytebytebyte
VersionReserved (zero)
Allocated SlotsUsed Slots

Heap AddressO


Slot Definitions...

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: External File List Message +
Field NameDescription

Version

+

The version number information is used for changes in the format of + External Data Storage Message and is described here: + + + + + + + + + + + + + +
VersionDescription
0Never used.
1The current version used by the library.

+ +

Allocated Slots

+

The total number of slots allocated in the message. Its value must be at least as + large as the value contained in the Used Slots field. (The current library simply + uses the number of Used Slots for this message)

+

Used Slots

+

The number of initial slots which contain valid information.

+

Heap Address

+

This is the address of a local heap which contains the names for the external + files (The local heap information can be found in Disk Format Level 1D in this + document). The name at offset zero in the heap is always the empty string.

+

Slot Definitions

+

The slot definitions are stored in order according to the array addresses they + represent.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Layout: External File List Slot +
bytebytebytebyte

Name Offset in Local HeapL


Offset in External Data FileL


Data Size in External FileL

+ + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: External File List Slot +
Field NameDescription

Name Offset in Local Heap

+

The byte offset within the local name heap for the name + of the file. File names are stored as a URL which has a + protocol name, a host name, a port number, and a file + name: + protocol:port//host/file. + If the protocol is omitted then “file:” is assumed. If + the port number is omitted then a default port for that + protocol is used. If both the protocol and the port + number are omitted then the colon can also be omitted. If + the double slash and host name are omitted then + “localhost” is assumed. The file name is the only + mandatory part, and if the leading slash is missing then + it is relative to the application’s current working + directory (the use of relative names is not + recommended). +

+

Offset in External Data File

+

This is the byte offset to the start of the data in the + specified file. For files that contain data for a single + dataset this will usually be zero.

+

Data Size in External File

+

This is the total number of bytes reserved in the + specified file for raw data storage. For a file that + contains exactly one complete dataset which is not + extendable, the size will usually be the exact size of the + dataset. However, by making the size larger one allows + HDF5 to extend the dataset. The size can be set to a value + larger than the entire file since HDF5 will read zeroes + past the end of the file without failing.

+
+
+ + +
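The message header and its slots can be walked as in the sketch below. It assumes little-endian integers, a 1-byte Version followed by 3 reserved bytes, and 2-byte slot counts (widths read off the layout table); the superblock's Size of Offsets and Size of Lengths are passed in as parameters. The function name and dictionary keys are hypothetical, and file names are not resolved because that requires reading the local heap.

```python
import struct


def parse_external_file_list(body: bytes, size_of_offsets: int = 8,
                             size_of_lengths: int = 8) -> dict:
    """Decode an External Data Files message (hypothetical helper)."""
    version = body[0]                  # bytes 1-3 are reserved (zero)
    if version != 1:
        raise ValueError(f"unsupported version {version}")
    allocated, used = struct.unpack_from("<HH", body, 4)
    if allocated < used:
        raise ValueError("Allocated Slots must be >= Used Slots")
    pos = 8
    heap_address = int.from_bytes(body[pos:pos + size_of_offsets], "little")
    pos += size_of_offsets
    slots = []
    for _ in range(used):
        fields = []
        for _ in range(3):             # three 'L'-sized fields per slot
            fields.append(int.from_bytes(body[pos:pos + size_of_lengths], "little"))
            pos += size_of_lengths
        slots.append({"name_offset": fields[0],
                      "file_offset": fields[1],
                      "size": fields[2]})
    return {"heap_address": heap_address, "slots": slots}
```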

IV.A.2.i. The Data Layout Message

+ + +
+ + + + + + + + +
Header Message Name: Data Layout
Header Message Type: 0x0008
Length: Varies
Status: Required for datasets; may not + be repeated.
Description:The Data Layout message + describes how the elements of a multi-dimensional array are stored + in the HDF5 file. Four types of data layout are supported: +
    +
  1. Contiguous: The array is stored in one contiguous area of + the file. This layout requires that the size of the array be + constant: data manipulations such as chunking, compression, + checksums, or encryption are not permitted. The message stores + the total storage size of the array. The offset of an element + from the beginning of the storage area is computed as in a C + array.
  2. Chunked: The array domain is regularly decomposed into + chunks, and each chunk is allocated and stored separately. This + layout supports arbitrary element traversals, compression, + encryption, and checksums (these features are described + in other messages). The message stores the size of a chunk + instead of the size of the entire array; the storage size of + the entire array can be calculated by traversing the chunk index + that stores the chunk addresses.
  3. Compact: The array is stored in one contiguous block as + part of this object header message.
  4. Virtual: This is only supported for version 4 of the Data + Layout message. The message stores information that is used to + locate the global heap collection containing the Virtual Dataset + (VDS) mapping information. The mapping associates the VDS to + the source dataset elements that are stored across a collection + of HDF5 files.
Format of Data: See the tables + below.
+ + +
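For contiguous storage, "computed as in a C array" means row-major order: the first dimension changes slowest and the last changes fastest. A minimal sketch of that computation follows; `element_byte_offset` is a hypothetical name, and the element size is assumed to come from the dataset's datatype.

```python
def element_byte_offset(index, dims, element_size):
    """Byte offset of an element from the start of contiguous storage.

    Hypothetical helper; `index` and `dims` list the slowest changing
    dimension first, as in a C array.
    """
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same rank")
    linear = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("index out of bounds")
        linear = linear * n + i    # Horner-style row-major accumulation
    return linear * element_size
```

For a 3 x 4 array of 8-byte elements, element (1, 2) is at linear position 1 * 4 + 2 = 6, that is, 48 bytes from the start of the storage area.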
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Data Layout Message (Versions 1 and 2) +
bytebytebytebyte
VersionDimensionalityLayout ClassReserved (zero)
Reserved (zero)

Data AddressO (optional)

Dimension 1 Size
Dimension 2 Size
...
Dimension #n Size
Dataset Element Size (optional)
Compact Data Size (optional)

Compact Data... (variable size, optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Data Layout Message (Versions 1 and 2) +
Field NameDescription

Version

+

The version number information is used for changes in the format of the data + layout message and is described here: + + + + + + + + + + + + + + + + + + + + +
VersionDescription
0Never used.
1Used by version 1.4 and earlier of the library to encode layout information. + Space for the data is always allocated when the dataset is created.
2Used by version 1.6.[0,1,2] of the library to encode layout information. + Data space is allocated only when it is necessary.

+

Dimensionality

An array has a fixed dimensionality. This field + specifies the number of dimension size fields later in the + message. The value stored for chunked storage is 1 greater than + the number of dimensions in the dataset’s dataspace. + For example, 2 is stored for a 1 dimensional dataset. +

+

Layout Class

The layout class specifies the type of storage for the data + and how the other fields of the layout message are to be + interpreted. + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Compact Storage +
1Contiguous Storage +
2Chunked Storage +
+

+

Data Address

For contiguous storage, this is the address of the raw + data in the file. For chunked storage this is the address + of the v1 B-tree that is used to look up the addresses of the + chunks. This field is not present for compact storage. + If the version for this message is greater than 1, the address + may have the “undefined address” value, to indicate that + storage has not yet been allocated for this array.

+

Dimension #n Size

For contiguous and compact storage the dimensions define + the entire size of the array while for chunked storage they define + the size of a single chunk. In all cases, they are in units of + array elements (not bytes). The first dimension stored in the list + of dimensions is the slowest changing dimension and the last + dimension stored is the fastest changing dimension. +

+

Dataset Element Size

The size of a dataset element, in bytes. This field is only + present for chunked storage. +

+

Compact Data Size

This field is only present for compact data storage. + It contains the size of the raw data for the dataset array, in + bytes.

+

Compact Data

This field is only present for compact data storage. + It contains the raw data for the dataset array.

+
+
+ +
+

Version 3 of this message re-structured the format into specific + properties that are required for each layout class.

+ + +
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Data Layout Message (Version 3) +
bytebytebytebyte
VersionLayout ClassThis space inserted only to align table nicely

Properties (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Data Layout Message (Version 3) +
Field NameDescription

Version

+

The version number information is used for changes in the format of layout message + and is described here: + + + + + + + + + + +
VersionDescription
3Used by the version 1.6.3 and later of the library to store properties + for each layout class.

+

Layout Class

The layout class specifies the type of storage for the data + and how the other fields of the layout message are to be + interpreted. + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Compact Storage +
1Contiguous Storage +
2Chunked Storage +
+

+

Properties

This variable-sized field encodes information specific to each + layout class and is described below. If there is no property + information specified for a layout class, the size of this field + is zero bytes.

+
+ +
+ +

Class-specific information for compact storage (layout class 0): (Note: The dimensionality information + is in the Dataspace message)

+ + +
+ + + + + + + + + + + + + + + + + + +
+ Layout: Compact Storage Property Description +
bytebytebytebyte
SizeThis space inserted only to align table nicely

Raw Data... (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Compact Storage Property Description +
Field NameDescription

Size

This field contains the size of the raw data for the dataset + array, in bytes. +

+

Raw Data

This field contains the raw data for the dataset array.

+
+ + +
+ +

Class-specific information for contiguous storage (layout class 1): + (Note: The dimensionality information is in the Dataspace message)

+ + +
+ + + + + + + + + + + + + + + + + +
+ Layout: Contiguous Storage Property Description +
bytebytebytebyte

AddressO


SizeL

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Contiguous Storage Property Description +
Field NameDescription

Address

This is the address of the raw data in the file. + The address may have the “undefined address” value, to indicate + that storage has not yet been allocated for this array.

Size

This field contains the size allocated to store the raw data, + in bytes. +

+
+
+ + +
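The compact and contiguous properties of a version 3 Data Layout message can be decoded as sketched below. Several details are assumptions rather than statements from the tables above: integers are little-endian, the compact Size field is 2 bytes wide, and the "undefined address" is the value with all bits set. The function name and dictionary keys are hypothetical, and chunked storage is deliberately left out.

```python
def parse_layout_v3(body: bytes, size_of_offsets: int = 8,
                    size_of_lengths: int = 8) -> dict:
    """Decode compact or contiguous version 3 layouts (hypothetical helper)."""
    version, layout_class = body[0], body[1]
    if version != 3:
        raise ValueError(f"unsupported version {version}")
    props = body[2:]
    if layout_class == 0:              # compact: Size, then raw data
        size = int.from_bytes(props[:2], "little")   # assumed 2-byte Size
        return {"class": "compact", "data": props[2:2 + size]}
    if layout_class == 1:              # contiguous: Address, then Size
        o, l = size_of_offsets, size_of_lengths
        address = int.from_bytes(props[:o], "little")
        size = int.from_bytes(props[o:o + l], "little")
        # Assumption: all bits set marks storage that is not yet allocated.
        undefined = address == (1 << (8 * o)) - 1
        return {"class": "contiguous",
                "address": None if undefined else address,
                "size": size}
    raise NotImplementedError(f"layout class {layout_class} not handled here")
```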
+

Class-specific information for chunked storage (layout class 2):

+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Chunked Storage Property Description +
bytebytebytebyte
DimensionalityThis space inserted only to align table nicely

AddressO

Dimension 0 Size
Dimension 1 Size
...
Dimension #n Size
Dataset Element Size
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Chunked Storage Property Description +
Field NameDescription

Dimensionality

A chunk has a fixed dimensionality. This field specifies + the number of dimension size fields later in the message.

Address

This is the address of the v1 B-tree + that is used to look up the + addresses of the chunks that actually store portions of the array + data. The address may have the “undefined address” value, to + indicate that storage has not yet been allocated for this array.

Dimension #n Size

These values define the dimension size of a single chunk, in + units of array elements (not bytes). The first dimension stored in + the list of dimensions is the slowest changing dimension and the + last dimension stored is the fastest changing dimension. +

+

Dataset Element Size

The size of a dataset element, in bytes. +

+
+
+ + +
+ +

+ Version 4 of this message is similar to version 3 but has + additional information for the virtual layout class as well as + indexing information for the chunked layout class.

+ +
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Data Layout Message (Version 4) +
bytebytebytebyte
VersionLayout ClassThis space inserted + only to align table nicely

Properties (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Data Layout Message (Version 4) +
Field NameDescription

Version

+

The value for this field is 4 and is used by version 1.10.0 + and later of the library to store properties for each layout + class and indexing information for the chunked layout. +

+

Layout Class

The layout class specifies the type of storage for the data + and how the other fields of the layout message are to be + interpreted. + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Compact Storage +
1Contiguous Storage +
2Chunked Storage +
3Virtual Storage +
+

+

Properties

This variable-sized field encodes information specific to a + layout class as follows: + + + + + + + + + + + + + + + + + + + + + + + + + +
Layout ClassDescription
Compact StorageSee Compact Storage + Property Description for the version 3 +Data Layout message. +
Contiguous StorageSee Contiguous Storage + Property Description for the version 3 +Data Layout message. +
Chunked StorageSee Chunked Storage + Property Description below. +
Virtual StorageSee Virtual Storage + Property Description below. +
+ +

+
+ +
+ +

Class-specific information for chunked storage (layout + class 2):

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Chunked Storage Property Description +
bytebytebytebyte
FlagsDimensionalityDimension Size Encoded LengthThis space inserted to align table nicely
Dimension 0 Size (variable size)
Dimension 1 Size (variable size)
...
Dimension #n Size (variable size)
Chunk Indexing TypeThis space inserted only to align table nicely
Indexing Type Information (variable size)

AddressO

+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Chunked Storage Property Description +
Field NameDescription

Flags

This is the chunked layout feature flag:

+ + + + + + + + + + + + + + + + + +
ValueDescription
DONT_FILTER_PARTIAL_BOUND_CHUNKS (bit 0)Do not apply filter to a partial edge chunk. + +
SINGLE_INDEX_WITH_FILTER (bit 1)A filtered chunk for Single Chunk indexing. +
+ +

Dimensionality

A chunk has a fixed dimensionality. This field specifies + the number of Dimension Size fields later in the message.

Dimension Size Encoded Length

+

This is the size in bytes used to encode Dimension Size. +

+

Dimension #n Size

These values define the dimension size of a single chunk, in + units of array elements (not bytes). The first dimension stored in + the list of dimensions is the slowest changing dimension and the + last dimension stored is the fastest changing dimension. +

+

Chunk Indexing Type

There are five indexing types used to look up addresses + of the chunks. For more information on each type, see + “Appendix C: Types of Indexes for + Dataset Chunks.” + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
1Single Chunk indexing type. +
2Implicit indexing type. +
3Fixed Array indexing type. +
4Extensible Array indexing type. +
5Version 2 B-tree indexing type. +
+

+

Indexing Type Information

This variable-sized field encodes information specific to + an indexing type. More information on what is encoded with + each type can be found below this table. +

+

+

Address

This is the address specific to an indexing type. + The address may be undefined if the chunk or index storage is not allocated yet. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Value | Description
Single Chunk index | Address of the single chunk.
Implicit index | Address of the array of dataset chunks.
Fixed Array index | Address of the index.
Extensible Array index | Address of the index.
Version 2 B-tree index | Address of the index.
+ +

+
+
+ +
+ +
    +
  1. + + Index-specific information for Single Chunk: +
  2. + +

    The following information exists only when the chunk is filtered, + that is, when SINGLE_INDEX_WITH_FILTER + (bit 1) is set in the Flags field.

    + +
    + + + + + + + + + + + + + + + + + + +
    + Layout: Single Chunk Indexing Information +
    byte | byte | byte | byte

    Size of filtered chunk (L)

    Filters for chunk
    + + + + + +
      + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
    +
    + +
    +
    + + + + + + + + + + + + + + + + +
    + Fields: Single Chunk Indexing Information +
    Field Name | Description

    Size of filtered chunk

    This field is the size of a filtered chunk.

    Filters for chunk

    This field contains filters for the chunk.

    +
    +

    + +
    + +
  3. + + Index-specific information for Implicit: +
  4. + +
    + + + + + + + + + + + + + + +
    + Layout: Implicit Indexing Information +
    byte | byte | byte | byte
    + No specific indexing information
    +
    + +
    +
  5. + + Index-specific information for Fixed Array: +
  6. + +
    + + + + + + + + + + + + + + + +
    + Layout: Fixed Array Indexing Information +
    byte | byte | byte | byte
    Page Bits | This space inserted only to align table nicely
    +
    + +
    +
    + + + + + + + + + + + + +
    + Fields: Fixed Array Indexing Information +
    Field Name | Description

    Page Bits

    This field contains the number of bits needed to store the + maximum number of elements in a data block page.

    +
    +

    + +
    +
  7. + + Index-specific information for Extensible Array: +
  8. + +
    + + + + + + + + + + + + + + + + + + + + + +
    + Layout: Extensible Array Indexing Information +
    byte | byte | byte | byte
    Max Bits | Index Elements | Min Pointers | Min Elements
    Page Bits | This space inserted only to align table nicely
    +
    + +
    +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    + Fields: Extensible Array Indexing Information +
    Field Name | Description

    Max Bits

    This field contains the number of bits needed to store the maximum number of elements + in the array. +

    +

    Index Elements

    This field contains the number of elements to store in the + index block. +

    +

    Min Pointers

    This field contains the minimum number of data block pointers + for a superblock. +

    +

    Min Elements

    This field contains the minimum number of elements per data block. +

    +

    Page Bits

    This field contains the number of bits needed to store the + maximum number of elements in a data block page. +

    +
    +
    +

    +
    + +
  9. + + Index-specific information for Version 2 B-tree: +
  10. + +
    + + + + + + + + + + + + + + + + + + + +
    + Layout: Version 2 B-tree Indexing Information +
    byte | byte | byte | byte
    Node Size
    Split Percent | Merge Percent | This space inserted only to align table nicely
    +
    + +
    +
    + + + + + + + + + + + + + + + + + + + + + +
    + Fields: Version 2 B-tree Indexing Information +
    Field Name | Description

    Node Size

    This field is the size in bytes of a B-tree node. +

    +

    Split Percent

    This field is the percentage full of a B-tree node at which to split the node.

    Merge Percent

    This field is the percentage full of a B-tree node at which to merge the node.

    +
    +
+ + + +
+ +
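As an illustration of how the fields above fit together, the following Python sketch decodes a chunked storage property description from a byte buffer. It is not taken from the HDF5 Library: the function names, the dictionary keys, and the default of 8 bytes for Size of Offsets and Size of Lengths are assumptions made for the example. It also assumes little-endian integers, a 4-byte Filters for chunk field, and one byte per Fixed Array and Extensible Array parameter, as the layout tables suggest. For Single Chunk indexing it tests SINGLE_INDEX_WITH_FILTER (bit 1), the flag that the Flags table associates with a filtered single chunk.

```python
INDEX_TYPES = {
    1: "single chunk",
    2: "implicit",
    3: "fixed array",
    4: "extensible array",
    5: "v2 B-tree",
}


def read_uint(buf, pos, size):
    """Read an unsigned little-endian integer; return (value, new_pos)."""
    if pos + size > len(buf):
        raise ValueError("buffer too short")
    return int.from_bytes(buf[pos:pos + size], "little"), pos + size


def parse_chunked_layout(buf, size_of_offsets=8, size_of_lengths=8):
    """Decode the chunked storage property description (hypothetical helper)."""
    pos = 0
    flags, pos = read_uint(buf, pos, 1)
    ndims, pos = read_uint(buf, pos, 1)
    enc_len, pos = read_uint(buf, pos, 1)

    # Each Dimension Size field occupies `enc_len` bytes, slowest dimension first.
    dims = []
    for _ in range(ndims):
        dim, pos = read_uint(buf, pos, enc_len)
        dims.append(dim)

    index_type, pos = read_uint(buf, pos, 1)
    if index_type not in INDEX_TYPES:
        raise ValueError("unknown chunk indexing type %d" % index_type)

    info = {}
    if index_type == 1:
        if flags & 0x02:  # SINGLE_INDEX_WITH_FILTER
            info["filtered_size"], pos = read_uint(buf, pos, size_of_lengths)
            info["filters"], pos = read_uint(buf, pos, 4)  # assumed 4 bytes
    elif index_type == 3:
        info["page_bits"], pos = read_uint(buf, pos, 1)
    elif index_type == 4:
        for name in ("max_bits", "index_elements", "min_pointers",
                     "min_elements", "page_bits"):
            info[name], pos = read_uint(buf, pos, 1)
    elif index_type == 5:
        info["node_size"], pos = read_uint(buf, pos, 4)
        info["split_percent"], pos = read_uint(buf, pos, 1)
        info["merge_percent"], pos = read_uint(buf, pos, 1)
    # index_type == 2 (Implicit) carries no indexing information.

    address, pos = read_uint(buf, pos, size_of_offsets)
    return {
        "flags": flags,
        "dims": dims,
        "index_type": INDEX_TYPES[index_type],
        "index_info": info,
        "address": address,
    }
```

The Address is read last because its position depends on the variable-sized fields before it. A real reader would also need to recognize an undefined address.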

+ Class-specific information for virtual storage (layout class 3):

+ +
+ + + + + + + + + + + + + + + + + + +
+ Layout: Virtual Storage Property Description +
byte | byte | byte | byte

Address (O)

Index
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Virtual Storage Property Description +
Field Name | Description

Address

This is the address of the global heap collection where + the VDS mapping entries are stored. + See “Disk Format: Level 1F - + Global Heap Block for Virtual Datasets.” +

Index

This is the index of the data object within the global heap collection. +

+
+
+ +

IV.A.2.j. The Bogus Message

+ + +
+ + + + + + + + +
Header Message Name: Bogus
Header Message Type: 0x0009
Length: 4 bytes
Status: For testing only; should never + be stored in a valid file.
Description:This message is used for testing the HDF5 Library’s + response to an “unknown” message type and should + never be encountered in a valid HDF5 file.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + +
+ Layout: Bogus Message +
byte | byte | byte | byte
Bogus Value
+
+ +
+
+ + + + + + + + + + + +
+ Fields: Bogus Message +
Field Name | Description

Bogus Value

+

This value should always be: 0xdeadbeef.

+
+
+ +

IV.A.2.k. The Group Info Message +

+ + +
+ + + + + + + + +
Header Message Name: Group Info
Header Message Type: 0x000A
Length: Varies
Status: Optional; may not be + repeated.
Description:

This message stores information for the constants defining + a “new style” group’s behavior. Constant + information will be stored in this message and variable + information will be stored in the + Link Info message.

+

Note: the “estimated entry” information below is + used when determining the size of the object header for the + group when it is created.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Group Info Message +
byte | byte | byte | byte
Version | Flags | Link Phase Change: Maximum Compact Value (optional)
Link Phase Change: Minimum Dense Value (optional) | Estimated Number of Entries (optional)
Estimated Link Name Length of Entries (optional) | This space inserted only to align table nicely
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Group Info Message +
Field Name | Description

Version

The version number for this message. This document describes version 0.

+

Flags

This is the group information flag with the following definition: + + + + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, link phase change values are stored. +
1 | If set, the estimated entry information is non-default + and is stored. +
2-7 | Reserved

+

Link Phase Change: Maximum Compact Value

This is the maximum number of links to store “compactly” (in + the group’s object header).

+

This field is present if bit 0 of Flags is set.

+

Link Phase Change: Minimum Dense Value

This is the minimum number of links to store “densely” (in + the group’s fractal heap). The fractal heap’s address is + located in the Link Info + message.

+

This field is present if bit 0 of Flags is set.

+

Estimated Number of Entries

This is the estimated number of entries in the group.

+

If this field is not present, the default value of 4 + will be used for the estimated number of group entries.

+

This field is present if bit 1 of Flags is set.

+

Estimated Link Name Length of Entries

This is the estimated length of an entry’s link name.

+

If this field is not present, the default value of 8 + will be used for the estimated link name length of group entries.

+

This field is present if bit 1 of Flags is set.

+
+
+ + +
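The optional fields and their defaults can be sketched in Python as follows. This is an illustrative decoder, not library code: the function name and dictionary keys are hypothetical, and the two-byte width of each optional field is inferred from the layout table. Integers are assumed to be little-endian.

```python
import struct


def parse_group_info(buf):
    """Decode a version 0 Group Info message (hypothetical helper)."""
    version, flags = struct.unpack_from("<BB", buf, 0)
    if version != 0:
        raise ValueError("unsupported Group Info version %d" % version)
    pos = 2

    info = {
        "max_compact": None,    # not stored unless bit 0 of Flags is set
        "min_dense": None,
        "est_num_entries": 4,   # default when bit 1 of Flags is clear
        "est_name_len": 8,      # default when bit 1 of Flags is clear
    }
    if flags & 0x01:  # link phase change values are stored
        info["max_compact"], info["min_dense"] = struct.unpack_from("<HH", buf, pos)
        pos += 4
    if flags & 0x02:  # non-default estimated entry information is stored
        info["est_num_entries"], info["est_name_len"] = struct.unpack_from("<HH", buf, pos)
        pos += 4
    return info
```

Because both groups of fields are optional, the offset of the estimated entry information depends on whether the link phase change values precede it.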

IV.A.2.l. The Data Storage - Filter + Pipeline Message

+ + +
+ + + + + + + + +
Header Message Name: + Data Storage - Filter Pipeline
Header Message Type: 0x000B
Length: Varies
Status: Optional; may not be + repeated.
Description:

This message describes the filter pipeline which should + be applied to the data stream by providing filter identification + numbers, flags, a name, and client data.

+

This message may be present in the object headers of both + dataset and group objects. For datasets, it specifies the + filters to apply to raw data. For groups, it specifies the + filters to apply to the group’s fractal heap. Currently, + only datasets using chunked data storage use the filter + pipeline on their raw data.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Filter Pipeline Message - Version 1 +
byte | byte | byte | byte
Version | Number of Filters | Reserved (zero)
Reserved (zero)

Filter Description List (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Filter Pipeline Message - Version 1 +
Field Name | Description

Version

The version number for this message. This table + describes version 1.

Number of Filters

The total number of filters described in this + message. The maximum possible number of filters in a + message is 32.

Filter Description List

A description of each filter. A filter description + appears in the next table.

+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Filter Description - Version 1 +
byte | byte | byte | byte
Filter Identification Value | Name Length
Flags | Number Client Data Values

Name (variable size, optional)


Client Data (variable size, optional)

Padding (variable size, optional)
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Filter Description - Version 1 +
Field Name | Description

Filter Identification Value

+

+ This value, often referred to as a filter identifier, + is designed to be a unique identifier for the filter. + Values from zero through 32,767 are reserved for filters + supported by The HDF Group in the HDF5 Library and for + filters requested and supported by third parties. + Filters supported by The HDF Group are documented immediately + below. Information on 3rd-party filters can be found at + The HDF Group’s + + Contributions page.

+ +

+ To request a filter identifier, please contact + The HDF Group’s Help Desk. + You will be asked to provide the following information:

+
    +
  1. Contact information for the developer requesting the + new identifier
  2. +
  3. A short description of the new filter
  4. +
  5. Links to any relevant information, including licensing + information
  6. +
+

+ Values from 32768 to 65535 are reserved for non-distributed uses + (for example, internal company usage) or for application usage + when testing a feature. The HDF Group does not track or document + the use of the filters with identifiers from this range.

+ +

+ The filters currently in library version 1.8.0 are + listed below: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Identification | Name | Description
0 | N/A | Reserved
1 | deflate | GZIP deflate compression
2 | shuffle | Data element shuffling
3 | fletcher32 | Fletcher32 checksum
4 | szip | SZIP compression
5 | nbit | N-bit packing
6 | scaleoffset | Scale and offset encoded values
+

Name Length

Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.

Flags

The flags indicate certain properties for a filter. The + bit values defined so far are: + + + + + + + + + + + + + + + +
Bit | Description
0 | If set then the filter is an optional filter. + During output, if an optional filter fails it will be + silently skipped in the pipeline.
1-15 | Reserved (zero)

+

Number of Client Data Values

Each filter can store integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.

Name

If the Name Length field is non-zero then it will + contain the size of this field, padded to a multiple of eight. This + field contains a null-terminated, ASCII character string to serve + as a comment/name for the filter.

Client Data

This is an array of four-byte integers which will be + passed to the filter function. The Number of Client Data + Values field determines the number of elements in the array.

Padding

Four bytes of zeroes are added to the message at this + point if the Number of Client Data Values field contains + an odd number.

+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + +
+ Layout: Filter Pipeline Message - Version 2 +
byte | byte | byte | byte
Version | Number of Filters | This space inserted only to align table nicely

Filter Description List (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + +
+ Fields: Filter Pipeline Message - Version 2 +
Field Name | Description

Version

The version number for this message. This table + describes version 2.

Number of Filters

The total number of filters described in this + message. The maximum possible number of filters in a + message is 32.

Filter Description List

A description of each filter. A filter description + appears in the next table.

+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Filter Description - Version 2 +
byte | byte | byte | byte
Filter Identification Value | Name Length (optional)
Flags | Number Client Data Values

Name (variable size, optional)


Client Data (variable size, optional)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Filter Description - Version 2 +
Field Name | Description

Filter Identification Value

+

+ This value, often referred to as a filter identifier, + is designed to be a unique identifier for the filter. + Values from zero through 32,767 are reserved for filters + supported by The HDF Group in the HDF5 Library and for + filters requested and supported by third parties. + Filters supported by The HDF Group are documented immediately + below. Information on 3rd-party filters can be found at + The HDF Group’s + + Contributions page.

+ +

+ To request a filter identifier, please contact + The HDF Group’s Help Desk. + You will be asked to provide the following information:

+
    +
  1. Contact information for the developer requesting the + new identifier
  2. +
  3. A short description of the new filter
  4. +
  5. Links to any relevant information, including licensing + information
  6. +
+

+ Values from 32768 to 65535 are reserved for non-distributed uses + (for example, internal company usage) or for application usage + when testing a feature. The HDF Group does not track or document + the use of the filters with identifiers from this range.

+ +

+ The filters currently in library version 1.8.0 are + listed below: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Identification | Name | Description
0 | N/A | Reserved
1 | deflate | GZIP deflate compression
2 | shuffle | Data element shuffling
3 | fletcher32 | Fletcher32 checksum
4 | szip | SZIP compression
5 | nbit | N-bit packing
6 | scaleoffset | Scale and offset encoded values
+

Name Length

Each filter has an optional null-terminated ASCII name + and this field holds the length of the name including the + null termination padded with nulls to be a multiple of + eight. If the filter has no name then a value of zero is + stored in this field.

+

Filters with IDs less than 256 (in other words, filters + that are defined in this format documentation) do not store + the Name Length or Name fields. +

+

Flags

The flags indicate certain properties for a filter. The + bit values defined so far are: + + + + + + + + + + + + + + + +
Bit | Description
0 | If set then the filter is an optional filter. + During output, if an optional filter fails it will be + silently skipped in the pipeline.
1-15 | Reserved (zero)

+

Number of Client Data Values

Each filter can store integer values to control + how the filter operates. The number of entries in the + Client Data array is stored in this field.

Name

If the Name Length field is non-zero, then it will + contain the size of this field, not padded to a multiple + of eight. This field contains a non-null-terminated, + ASCII character string to serve as a comment/name for the filter. +

+

Filters that are defined in this format documentation + such as deflate and shuffle do not store the Name + Length or Name fields. +

+

Client Data

This is an array of four-byte integers which will be + passed to the filter function. The Number of Client Data + Values field determines the number of elements in the array.

+
+
+ +
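The differences between the two versions of the Filter Pipeline message (reserved bytes, the optional Name Length, and the padding after Client Data) can be sketched in Python as follows. This is an illustrative decoder under stated assumptions, not the library's implementation: the function name and dictionary keys are hypothetical, integers are assumed to be little-endian, and client data values are read as unsigned.

```python
import struct


def parse_filter_pipeline(buf):
    """Decode a version 1 or 2 Filter Pipeline message (hypothetical helper)."""
    version, nfilters = struct.unpack_from("<BB", buf, 0)
    if version == 1:
        pos = 8   # Version, Number of Filters, then 2 + 4 reserved bytes
    elif version == 2:
        pos = 2   # no reserved bytes in version 2
    else:
        raise ValueError("unsupported Filter Pipeline version %d" % version)
    if nfilters > 32:
        raise ValueError("at most 32 filters are allowed in a message")

    filters = []
    for _ in range(nfilters):
        (filter_id,) = struct.unpack_from("<H", buf, pos)
        pos += 2

        # Version 2 omits Name Length (and Name) for filter IDs below 256.
        name_len = 0
        if version == 1 or filter_id >= 256:
            (name_len,) = struct.unpack_from("<H", buf, pos)
            pos += 2

        flags, nvalues = struct.unpack_from("<HH", buf, pos)
        pos += 4

        # In version 1, Name Length already includes the padding to a
        # multiple of eight, so the name ends at the first null byte.
        name = buf[pos:pos + name_len].split(b"\0", 1)[0].decode("ascii")
        pos += name_len

        client_data = list(struct.unpack_from("<%dI" % nvalues, buf, pos))
        pos += 4 * nvalues

        # Version 1 adds four bytes of zeroes after an odd number of values.
        if version == 1 and nvalues % 2:
            pos += 4

        filters.append({
            "id": filter_id,
            "name": name,
            "optional": bool(flags & 0x0001),
            "client_data": client_data,
        })
    return filters
```

For example, a version 2 message that describes deflate with one client data value needs only ten bytes for that filter description, while the same description in version 1 also carries a Name Length field and four bytes of padding.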

IV.A.2.m. The Attribute Message

+ + +
+ + + + + + + + +
Header Message Name: Attribute
Header Message Type: 0x000C
Length: Varies
Status: Optional; may be + repeated.
Description:

The Attribute message is used to store objects + in the HDF5 file which are used as attributes, or + “metadata” about the current object. An attribute + is a small dataset; it has a name, a datatype, a dataspace, and + raw data. Since attributes are stored in the object header, they + should be relatively small (in other words, less than 64KB). + They can be associated with any type of object which has an + object header (groups, datasets, or committed (named) + datatypes).

+

In 1.8.x versions of the library, attributes can be larger + than 64KB. See the + + “Special Issues” section of the Attributes chapter + in the HDF5 User’s Guide for more information.

+

Note: Attributes on an object must have unique names: + the HDF5 Library currently enforces this by causing the + creation of an attribute with a duplicate name to fail. + Attributes on different objects may have the same name, + however.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Attribute Message (Version 1) +
byte | byte | byte | byte
Version | Reserved (zero) | Name Size
Datatype Size | Dataspace Size

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Attribute Message (Version 1) +
Field Name | Description

Version

The version number information is used for changes in the format of the + attribute message and is described here: + + + + + + + + + + + + + + + +
Version | Description
0 | Never used.
1 | Used by the library before version 1.6 to encode attribute messages. + This version does not support shared datatypes.

+

Name Size

The length of the attribute name in bytes including the + null terminator. Note that the Name field below may + contain additional padding not represented by this + field.

Datatype Size

The length of the datatype description in the Datatype + field below. Note that the Datatype field may contain + additional padding not represented by this field.

Dataspace Size

The length of the dataspace description in the Dataspace + field below. Note that the Dataspace field may contain + additional padding not represented by this field.

Name

The null-terminated attribute name. This field is + padded with additional null characters to make it a + multiple of eight bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. This + field is padded with additional zero bytes to make it a + multiple of eight bytes.

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. This + field is not padded with additional bytes.

+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Attribute Message (Version 2) +
byte | byte | byte | byte
Version | Flags | Name Size
Datatype Size | Dataspace Size

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Attribute Message (Version 2) +
Field Name | Description

Version

The version number information is used for changes in the + format of the attribute message and is described here: + + + + + + + + + + +
Version | Description
2 | Used by the library of version 1.6.x and after to encode + attribute messages. + This version supports shared datatypes. The fields of + name, datatype, and dataspace are not padded with + additional bytes of zero. +

+

Flags

This bit field contains extra information about + interpreting the attribute message: + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, datatype is shared.
1 | If set, dataspace is shared.

+

Name Size

The length of the attribute name in bytes including the + null terminator.

Datatype Size

The length of the datatype description in the Datatype + field below.

Dataspace Size

The length of the dataspace description in the Dataspace + field below.

Name

The null-terminated attribute name. This field is not + padded with additional bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. +

+

If the + Flags field indicates this attribute’s datatype is + shared, this field will contain a “shared message” encoding + instead of the datatype encoding. +

+

This field is not padded with additional bytes. +

+

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. +

+

If the + Flags field indicates this attribute’s dataspace is + shared, this field will contain a “shared message” encoding + instead of the dataspace encoding. +

+

This field is not padded with additional bytes.

+

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. +

+

This field is not padded with additional zero bytes. +

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Attribute Message (Version 3) +
byte | byte | byte | byte
Version | Flags | Name Size
Datatype Size | Dataspace Size
Name Character Set Encoding | This space inserted only to align table nicely

Name (variable size)


Datatype (variable size)


Dataspace (variable size)


Data (variable size)

+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Attribute Message (Version 3) +
Field Name | Description

Version

The version number information is used for changes in the + format of the attribute message and is described here: + + + + + + + + + + +
Version | Description
3 | Used by the library of version 1.8.x and after to + encode attribute messages. + This version supports attributes with non-ASCII names. +

+

Flags

This bit field contains extra information about + interpreting the attribute message: + + + + + + + + + + + + + + + + +
Bit | Description
0 | If set, datatype is shared.
1 | If set, dataspace is shared.

+

Name Size

The length of the attribute name in bytes including the + null terminator.

Datatype Size

The length of the datatype description in the Datatype + field below.

Dataspace Size

The length of the dataspace description in the Dataspace + field below.

Name Character Set Encoding

The character set encoding for the attribute’s name: + + + + + + + + + + + + + + + +
Value | Description
0 | ASCII character set encoding +
1 | UTF-8 character set encoding +
+

+

Name

The null-terminated attribute name. This field is not + padded with additional bytes.

Datatype

The datatype description follows the same format as + described for the datatype object header message. +

+

If the + Flags field indicates this attribute’s datatype is + shared, this field will contain a “shared message” encoding + instead of the datatype encoding. +

+

This field is not padded with additional bytes. +

+

Dataspace

The dataspace description follows the same format as + described for the dataspace object header message. +

+

If the + Flags field indicates this attribute’s dataspace is + shared, this field will contain a “shared message” encoding + instead of the dataspace encoding. +

+

This field is not padded with additional bytes.

+

Data

The raw data for the attribute. The size is determined + from the datatype and dataspace descriptions. +

+

This field is not padded with additional zero bytes. +

+
+
+ +
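The three attribute message versions differ mainly in padding and in the extra character set byte of version 3. The following Python sketch splits a message into its variable-sized fields. It is an illustration only: the function name and dictionary keys are hypothetical, integers are assumed to be little-endian, and the datatype and dataspace descriptions are returned as raw bytes rather than decoded. Because the size of Data follows from those descriptions, the sketch simply returns the bytes that remain.

```python
import struct


def split_attribute_message(buf):
    """Split an Attribute message into its fields (hypothetical helper)."""
    version, flags, name_size, dtype_size, dspace_size = struct.unpack_from(
        "<BBHHH", buf, 0)
    if version not in (1, 2, 3):
        raise ValueError("unsupported Attribute message version %d" % version)
    pos = 8

    encoding = "ascii"
    if version == 3:
        # Name Character Set Encoding: 0 is ASCII, 1 is UTF-8.
        encoding = {0: "ascii", 1: "utf-8"}[buf[pos]]
        pos += 1

    def stored(size):
        # Only version 1 pads Name, Datatype and Dataspace to a multiple of eight.
        return (size + 7) & ~7 if version == 1 else size

    name_raw = buf[pos:pos + name_size]
    pos += stored(name_size)
    datatype = buf[pos:pos + dtype_size]
    pos += stored(dtype_size)
    dataspace = buf[pos:pos + dspace_size]
    pos += stored(dspace_size)

    return {
        "version": version,
        # Name Size counts the null terminator, which is dropped here.
        "name": name_raw.rstrip(b"\0").decode(encoding),
        # The second byte is reserved in version 1, so no sharing is reported.
        "datatype_shared": version >= 2 and bool(flags & 0x01),
        "dataspace_shared": version >= 2 and bool(flags & 0x02),
        "datatype": datatype,
        "dataspace": dataspace,
        "data": buf[pos:],
    }
```

When one of the shared flags is set, the corresponding raw bytes hold a “shared message” encoding and would need a different decoder than the plain datatype or dataspace description.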

IV.A.2.n. The Object Comment + Message

+ + +
+ + + + + + + + +
Header Message Name: Object + Comment
Header Message Type: 0x000D
Length: Varies
Status: Optional; may not be + repeated.
Description:The object comment is designed to be a short description of + an object. An object comment is a sequence of non-null + (not \0) ASCII characters with no other formatting + included by the library.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + +
+ Layout: Object Comment Message +
byte | byte | byte | byte

Comment (variable size)

+
+ +
+
+ + + + + + + + + + + +
+ Fields: Object Comment Message +
Field Name | Description

Comment

A null-terminated ASCII character string.

+
+ +

IV.A.2.o. The Object + Modification Time (Old) Message

+ + +
+ + + + + + + + +
Header Message Name: Object + Modification Time (Old)
Header Message Type: 0x000E
Length: Fixed
Status: Optional; may not be + repeated.
Description:

The object modification date and time is a timestamp + which indicates (using ISO-8601 date and time format) the last + modification of an object. The time is updated when any object + header message changes according to the system clock where the + change was posted. All fields of this message should be + interpreted as coordinated universal time (UTC).

+

This modification time message is deprecated in favor of + the “new” Object + Modification Time message and is no longer written to the + file in versions of the HDF5 Library after the 1.6.0 + version.

Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Modification Time Message (Old) +
byte | byte | byte | byte
Year
Month | Day of Month
Hour | Minute
Second | Reserved
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Modification Time Message (Old) +
Field Name | Description

Year

The four-digit year as an ASCII string. For example, + 1998. +

Month

The month number as a two-digit ASCII string where + January is 01 and December is 12.

Day of Month

The day number within the month as a two-digit ASCII + string. The first day of the month is 01.

Hour

The hour of the day as a two-digit ASCII string where + midnight is 00 and 11:00pm is 23.

Minute

The minute of the hour as a two-digit ASCII string where + the first minute of the hour is 00 and + the last is 59.

Second

The second of the minute as a two-digit ASCII string + where the first second of the minute is 00 + and the last is 59.

Reserved

This field is reserved and should always be zero.

+
+ +
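Because every field of the old modification time message is an ASCII digit string, decoding it amounts to slicing the buffer. The Python sketch below illustrates this. The function name is hypothetical, and the 16-byte total (14 digits plus 2 reserved bytes) is inferred from the layout table.

```python
from datetime import datetime, timezone


def parse_old_modification_time(buf):
    """Decode the old Object Modification Time message (hypothetical helper)."""
    if len(buf) < 16:
        raise ValueError("message must be at least 16 bytes long")
    text = buf[:14].decode("ascii")   # YYYYMMDDhhmmss; bytes 14-15 are reserved
    if not text.isdigit():
        raise ValueError("timestamp fields must be ASCII digits")
    return datetime(
        int(text[0:4]),    # Year
        int(text[4:6]),    # Month
        int(text[6:8]),    # Day of Month
        int(text[8:10]),   # Hour
        int(text[10:12]),  # Minute
        int(text[12:14]),  # Second
        tzinfo=timezone.utc,  # all fields are interpreted as UTC
    )
```

The `datetime` constructor rejects out-of-range values such as month 13, so malformed timestamps surface as a `ValueError`.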

IV.A.2.p. The Shared Message Table + Message

+ + +
+ + + + + + + + +
Header Message Name: Shared Message + Table
Header Message Type: 0x000F
Length: Fixed
Status: Optional; may not be + repeated.
Description:This message is used to locate the table of shared object + header message (SOHM) indexes. Each index consists of information + to find the shared messages from either the heap or object header. + This message is only found in the superblock + extension.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Shared Message Table Message +
byte | byte | byte | byte
Version | This space inserted only to align table nicely

Shared Object Header Message Table Address (O)

Number of Indices | This space inserted only to align table nicely
+ + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Shared Message Table Message +
Field Name | Description

Version

The version number for this message. This document describes version 0.

Shared Object Header Message Table Address

This field is the address of the master table for shared + object header message indexes.

+

Number of Indices

This field is the number of indices in the master table. +

+
+ +

IV.A.2.q. The Object Header + Continuation Message

+ + +
+ + + + + + + + +
Header Message Name: Object Header + Continuation
Header Message Type: 0x0010
Length: Fixed
Status: Optional; may be + repeated.
Description:The object header continuation is the location in the file + of a block containing more header messages for the current data + object. This can be used when header blocks become too large or + are likely to change over time.
Format of Data: See the tables + below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Layout: Object Header Continuation Message +
byte | byte | byte | byte

Offset (O)


Length (L)

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Object Header Continuation Message +
Field Name | Description

Offset

This value is the address in the file where the + header continuation block is located.

Length

This value is the length in bytes of the header continuation + block in the file.

+
+
+ +

The format of the header continuation block that this message points + to depends on the version of the object header that the message is + contained within. +

+ +

+ Continuation blocks for version 1 object headers have no special + formatting information; they are merely a list of object header + message info sequences (type, size, flags, reserved bytes and data + for each message sequence). See the description + of Version 1 Data Object Header Prefix. +

+ +

Continuation blocks for version 2 object headers do have + special formatting information as described here + (see also the description of + Version 2 Data Object Header Prefix): +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 Object Header Continuation Block +
byte | byte | byte | byte
Signature
Header Message Type #1 | Size of Header Message Data #1 | Header Message #1 Flags
Header Message #1 Creation Order (optional) | This space inserted only to align table nicely

Header Message Data #1

.
.
.
Header Message Type #n | Size of Header Message Data #n | Header Message #n Flags
Header Message #n Creation Order (optional) | This space inserted only to align table nicely

Header Message Data #n

Gap (optional, variable size)
Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 Object Header Continuation Block +
Field Name | Description

Signature

+

The ASCII character string “OCHK” + is used to indicate the beginning of an object header + continuation block. This gives file consistency checking + utilities a better chance of reconstructing a damaged file. +

+

Header Message #n Type

+

Same format as version 1 of the object header, described above. +

Size of Header Message #n Data

+

Same format as version 1 of the object header, described above. +

Header Message #n Flags

+

Same format as version 1 of the object header, described above. +

Header Message #n Creation Order

+

This field stores the order that a message of a given type + was created in.

+

This field is present if bit 2 of the object header’s flags is set.

+

Header Message #n Data

+

Same format as version 1 of the object header, described above. +

Gap

+

A gap in an object header chunk is inferred when the messages for the chunk end before the beginning of the chunk’s checksum. Gaps are always smaller than the size of an object header message prefix (message type + message size + message flags).

Gaps are formed when a message (typically an attribute message) in an earlier chunk is deleted and a message from a later chunk that does not quite fit into the free space is moved into the earlier chunk.

+

Checksum

+

This is the checksum for the object header chunk. +

+
+
+ +
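The gap rule above can be sketched in C. This is an illustration only, not the library's implementation: the function name `ochk_walk` is hypothetical, and the field widths (1-byte type, 2-byte little-endian size, 1-byte flags, optional 2-byte creation order, 4-byte checksum) are assumptions read off the layout table. The checksum is not verified.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: count the header messages in a version 2
 * continuation block held in memory and report the size of the
 * trailing gap. Returns -1 if the block is malformed.
 */
int ochk_walk(const uint8_t *blk, size_t blk_size,
              int has_creation_order, size_t *gap_size)
{
    /* Assumed prefix: type (1) + size (2) + flags (1) [+ creation order (2)] */
    const size_t prefix = has_creation_order ? 6u : 4u;
    size_t pos = 4;   /* skip the "OCHK" signature */
    size_t end;
    int count = 0;

    if (blk_size < 8 || memcmp(blk, "OCHK", 4) != 0)
        return -1;
    end = blk_size - 4;   /* assumed 4-byte checksum at the end */

    /* Anything shorter than a message prefix can only be a gap. */
    while (end - pos >= prefix) {
        size_t data_size = (size_t)blk[pos + 1] | ((size_t)blk[pos + 2] << 8);

        if (data_size > end - pos - prefix)
            return -1;   /* message data would run into the checksum */
        pos += prefix + data_size;
        count++;
    }
    *gap_size = end - pos;
    return count;
}
```

A caller would pass the whole block, including signature and checksum, and set `has_creation_order` from bit 2 of the object header flags.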

IV.A.2.r. The Symbol Table Message

+ + +
+ + + + + + + + +
Header Message Name: Symbol Table Message
Header Message Type: 0x0011
Length: Fixed
Status: Required for “old style” groups; may not be repeated.
Description: Each “old style” group has a v1 B-tree and a local heap for storing symbol table entries, which are located with this message.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + +
+ Layout: Symbol Table Message +
bytebytebytebyte

v1 B-tree AddressO


Local Heap AddressO

+ + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Symbol Table Message +
Field NameDescription

v1 B-tree Address

This value is the address of the v1 B-tree containing the symbol table entries for the group.

Local Heap Address

This value is the address of the local heap containing the link names for the symbol table entries for the group.

+
+ +

IV.A.2.s. The Object Modification Time Message

+ + +
+ + + + + + + + +
Header Message Name: Object Modification Time
Header Message Type: 0x0012
Length: Fixed
Status: Optional; may not be repeated.
Description: The object modification time is a timestamp which indicates the time of the last modification of an object. The time is updated whenever any object header message changes, using the system clock of the machine where the change was posted.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + + +
+ Layout: Modification Time Message +
bytebytebytebyte
VersionReserved (zero)
Seconds After UNIX Epoch
+
+ +
+
+ + + + + + + + + + + + + + + + +
+ Fields: Modification Time Message +
Field NameDescription

Version

The version number is used for changes in the format of the Object Modification Time message and is described here:
Version  Description
0        Never used.
1        Used by version 1.6.1 and later of the library to encode time. In this version, the time is the number of seconds after the UNIX epoch.

+

Seconds After UNIX Epoch

A 32-bit unsigned integer value that stores the number of seconds since 0 hours, 0 minutes, 0 seconds, January 1, 1970, Coordinated Universal Time.

+
+ +
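A minimal C sketch of decoding this message body follows. The helper names are hypothetical, and the sketch assumes the 8-byte body suggested by the layout table (1-byte version, 3 reserved bytes, 4-byte timestamp) with the timestamp stored in little-endian byte order.

```c
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian unsigned integer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical decoder for the message body:
 * version (1 byte), reserved (3 bytes), seconds after the epoch (4 bytes).
 * Returns 0 on success, -1 if the version is not 1.
 */
int decode_mtime(const uint8_t msg[8], uint32_t *seconds)
{
    if (msg[0] != 1)
        return -1;   /* version 0 was never used */
    *seconds = read_le32(msg + 4);
    return 0;
}
```

Because the field is an unsigned 32-bit count of seconds, the decoded value can be widened to a larger time type before further arithmetic.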

IV.A.2.t. The B-tree ‘K’ Values Message

+ + +
+ + + + + + + + +
Header Message Name: B-tree ‘K’ Values
Header Message Type: 0x0013
Length: Fixed
Status: Optional; may not be repeated.
Description: This message stores non-default ‘K’ values for internal and leaf nodes of group or indexed storage v1 B-trees. This message is only found in the superblock extension.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + + + + +
+ Layout: B-tree ‘K’ Values Message +
bytebytebytebyte
VersionIndexed Storage Internal Node KThis space inserted only to align table nicely
Group Internal Node KGroup Leaf Node K
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: B-tree ‘K’ Values Message +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Indexed Storage Internal Node K

This is the node ‘K’ value for each internal node of an indexed storage v1 B-tree. See the description of this field in version 0 and 1 of the superblock as well as the section on v1 B-trees.

+

Group Internal Node K

This is the node ‘K’ value for each internal node of a group v1 B-tree. See the description of this field in version 0 and 1 of the superblock as well as the section on v1 B-trees.

+

Group Leaf Node K

This is the node ‘K’ value for each leaf node of a group v1 B-tree. See the description of this field in version 0 and 1 of the superblock as well as the section on v1 B-trees.

+
+
+ +

IV.A.2.u. The Driver Info Message

+ + +
+ + + + + + + + + +
Header Message Name: Driver Info
Header Message Type: 0x0014
Length: Varies
Status: Optional; may not be repeated.
Description: This message contains information needed by the file driver to reopen a file. This message is only found in the superblock extension: see the “Disk Format: Level 0C - Superblock Extension” section for more information. For more information on the fields in the driver info message, see the “Disk Format: Level 0B - File Driver Info” section; those who use the multi and family file drivers will find this section particularly helpful.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Driver Info Message +
bytebytebytebyte
VersionThis space inserted only to align table nicely

Driver Identification
Driver Information SizeThis space inserted only to align table nicely


Driver Information (variable size)


+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Driver Info Message +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Driver Identification

This is an eight-byte ASCII string without null termination which identifies the driver.

+

Driver Information Size

The size in bytes of the Driver Information field of this + message.

+

Driver Information

Driver information is stored in a format defined by the file driver.

+
+
+ +
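Because the driver identification is not null-terminated, a reader has to terminate it before treating it as a C string. The sketch below shows one way to do that; the function name `driver_info_id` is hypothetical, and it assumes the identification starts at byte 1, immediately after the one-byte version field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy the eight-byte driver identification out of
 * a Driver Info message body and null-terminate it.
 * Returns 0 on success, -1 if the body is too short or not version 0.
 */
int driver_info_id(const uint8_t *msg, size_t msg_size, char id[9])
{
    if (msg_size < 9 || msg[0] != 0)
        return -1;
    memcpy(id, msg + 1, 8);   /* the eight identification bytes */
    id[8] = '\0';             /* terminator is not stored in the file */
    return 0;
}
```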

IV.A.2.v. The Attribute Info Message

+ + +
+ + + + + + + + +
Header Message Name: Attribute Info
Header Message Type: 0x0015
Length: Varies
Status: Optional; may not be repeated.
Description: This message stores information about the attributes on an object, such as the maximum creation index for the attributes created and the location of the attribute storage when the attributes are stored “densely”.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Attribute Info Message +
bytebytebytebyte
VersionFlagsMaximum Creation Index (optional)

Fractal Heap AddressO


Attribute Name v2 B-tree AddressO


Attribute Creation Order v2 B-tree AddressO (optional)

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Attribute Info Message +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Flags

This is the attribute index information flag with the following definition:
Bit  Description
0    If set, creation order for attributes is tracked.
1    If set, creation order for attributes is indexed.
2-7  Reserved

+ +

Maximum Creation Index

This is the maximum creation order index value for the attributes on the object.

+

This field is present if bit 0 of Flags is set.

+

Fractal Heap Address

This is the address of the fractal heap to store dense attributes. Each attribute stored in the fractal heap is described by the Attribute Message.

+

Attribute Name v2 B-tree Address

This is the address of the version 2 B-tree to index the names of densely stored attributes.

+

Attribute Creation Order v2 B-tree Address

This is the address of the version 2 B-tree to index the creation order of densely stored attributes.

+

This field is present if bit 1 of Flags is set.

+
+
+ +
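The two optional fields make the message length depend on the Flags byte. The following C sketch computes the expected body size; the names are hypothetical, and the 2-byte width of Maximum Creation Index is an assumption read off the layout table.

```c
#include <stddef.h>
#include <stdint.h>

#define ATTR_INFO_TRACK_CORDER 0x01u   /* bit 0: creation order is tracked */
#define ATTR_INFO_INDEX_CORDER 0x02u   /* bit 1: creation order is indexed */

/*
 * Hypothetical helper: size in bytes of a version 0 Attribute Info
 * message body for the given flags and superblock Size of Offsets.
 */
size_t attr_info_size(uint8_t flags, size_t size_of_offsets)
{
    size_t size = 2;                    /* version + flags */

    if (flags & ATTR_INFO_TRACK_CORDER)
        size += 2;                      /* assumed 2-byte maximum creation index */
    size += 2 * size_of_offsets;        /* fractal heap and name B-tree addresses */
    if (flags & ATTR_INFO_INDEX_CORDER)
        size += size_of_offsets;        /* creation order B-tree address */
    return size;
}
```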

IV.A.2.w. The Object Reference Count Message

+ + +
+ + + + + + + + +
Header Message Name: Object Reference Count
Header Message Type: 0x0016
Length: Fixed
Status: Optional; may not be repeated.
Description: This message stores the number of hard links (in groups or objects) pointing to an object: in other words, its reference count.
Format of Data: See the tables below.
+ + +
+ + + + + + + + + + + + + + + + + + +
+ Layout: Object Reference Count +
bytebytebytebyte
VersionThis space inserted only to align table nicely
Reference count
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Object Reference Count +
Field NameDescription

Version

The version number for this message. This document describes + version 0.

+

Reference Count

This unsigned 32-bit integer is the reference count for the object. This message is only present in “version 2” (or later) object headers; if it is not present in those object header versions, the reference count for the object is assumed to be 1.

+
+
+ +
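The default-to-1 rule can be expressed as a small C helper. This is a sketch under stated assumptions: the name `object_ref_count` is hypothetical, the body is taken to be a version byte followed directly by the 32-bit count, and the count is assumed to be little-endian.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reference count of an object with a version 2
 * object header. `msg` points to the Object Reference Count message
 * body, or is NULL when the header carries no such message.
 */
uint32_t object_ref_count(const uint8_t *msg)
{
    if (msg == NULL)
        return 1;   /* an absent message implies a count of 1 */
    return (uint32_t)msg[1] | ((uint32_t)msg[2] << 8) |
           ((uint32_t)msg[3] << 16) | ((uint32_t)msg[4] << 24);
}
```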
+ +

IV.A.2.x. The File Space Info Message

+ +
+ + + + + + + +

+

+
Header Message Name: File Space Info
Header Message Type: 0x0017
Length: Fixed
Status: Optional; may not be repeated.
Description: This message stores the file space management information that the library uses in handling file space requests for the file. Version 0 of the message is used for release 1.10.0 only. Version 1 of the message is used for release 1.10.1+. There is no File Space Info message before release 1.10 because the library did not track file space across multiple file opens.

Note that version 0 is deprecated starting with release 1.10.1. When the 1.10.1+ library opens an HDF5 file with a version 0 message, it decodes the message and maps it to version 1. On file close, it encodes the message as a version 1 message.

The library uses the following three mechanisms to manage file space in an HDF5 file:

  • Free-space managers
    They track free-space sections of various sizes in the file that are not currently allocated. Each free-space manager corresponds to a file space type. There are two main groups of file space types: metadata and raw data. Metadata is further divided into five types: superblock, B-tree, global heap, local heap, and object header. See the description of Free-space Manager as well as the description of file space allocation types in Appendix B.
  • Aggregators
    The library manages two aggregators, one for metadata and one for raw data. An aggregator is a contiguous block of free-space in the file. The size of each aggregator is tunable via the public routines H5Pset_meta_block_size and H5Pset_small_data_block_size, respectively.
  • Virtual file drivers
    The library's virtual file driver interface dispatches requests for additional space to the allocation routine of the file driver associated with the file. For example, if the sec2 file driver is being used, its allocation routine will increase the size of the file to service the requests.

For release 1.10.0, the library derives the following four file space strategies based on the mechanisms:

  • H5F_FILE_SPACE_ALL
    • Mechanisms used: free-space managers, aggregators, and virtual file drivers
    • Does not persist free-space across file opens
    • This strategy is the library default
  • H5F_FILE_SPACE_ALL_PERSIST
    • Mechanisms used: free-space managers, aggregators, and virtual file drivers
    • Persists free-space across file opens
  • H5F_FILE_SPACE_AGGR_VFD
    • Mechanisms used: aggregators and virtual file drivers
    • Does not persist free-space across file opens
  • H5F_FILE_SPACE_VFD
    • Mechanisms used: virtual file drivers
    • Does not persist free-space across file opens
For release 1.10.1+, the free-space manager mechanism is modified to handle paged aggregation, which aggregates small metadata and raw data allocations into constant-sized, well-aligned pages to allow efficient I/O accesses. With the support of this feature, the library derives the following four file space strategies:
  • H5F_FSPACE_STRATEGY_FSM_AGGR
    • Mechanisms used: free-space managers, aggregators, and virtual file drivers
    • This strategy is the library default
  • H5F_FSPACE_STRATEGY_PAGE
    • Mechanisms used: free-space managers with embedded paged aggregation and virtual file drivers
  • H5F_FSPACE_STRATEGY_AGGR
    • Mechanisms used: aggregators and virtual file drivers
  • H5F_FSPACE_STRATEGY_NONE
    • Mechanisms used: virtual file drivers
By default, free-space is not persisted across file opens for any of the above four strategies. Users can call the public routine H5Pset_file_space_strategy to request persisting free-space.
Format of Data: See the tables below.
+

+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: File Space Info - Version 0 +
bytebytebytebyte
VersionStrategyThresholdL

Free-space manager addressO for H5FD_MEM_SUPER


Free-space manager addressO for H5FD_MEM_BTREE


Free-space manager addressO for H5FD_MEM_DRAW


Free-space manager addressO for H5FD_MEM_GHEAP


Free-space manager addressO for H5FD_MEM_LHEAP


Free-space manager addressO for H5FD_MEM_OHDR

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: File Space Info +
Field NameDescription

Version

This is version 0 of this message.

+

Strategy

This is the file space strategy used to manage file space. There are four types:
Value  Description
1      H5F_FILE_SPACE_ALL_PERSIST
2      H5F_FILE_SPACE_ALL
3      H5F_FILE_SPACE_AGGR_VFD
4      H5F_FILE_SPACE_VFD

+

Threshold

This is the smallest free-space section size that the free-space manager will track.

Free-space manager addresses

These are the six free-space manager addresses for the six file space allocation types:

  • H5FD_MEM_SUPER
  • H5FD_MEM_BTREE
  • H5FD_MEM_DRAW
  • H5FD_MEM_GHEAP
  • H5FD_MEM_LHEAP
  • H5FD_MEM_OHDR

Note that these six fields exist only if the value for the field “Strategy” is H5F_FILE_SPACE_ALL_PERSIST.

+
+
+
+ +
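Since the six addresses are conditional, the size of a version 0 message body depends on the Strategy value. A minimal C sketch, with hypothetical names and with the sizes of offsets and lengths taken from the superblock:

```c
#include <stddef.h>
#include <stdint.h>

/* Strategy values from the version 0 table above (enum names are hypothetical). */
enum {
    FS_V0_ALL_PERSIST = 1,
    FS_V0_ALL         = 2,
    FS_V0_AGGR_VFD    = 3,
    FS_V0_VFD         = 4
};

/*
 * Hypothetical helper: size in bytes of a version 0 File Space Info
 * message body.
 */
size_t fsinfo_v0_size(uint8_t strategy, size_t size_of_offsets,
                      size_t size_of_lengths)
{
    size_t size = 1 + 1 + size_of_lengths;   /* version + strategy + threshold */

    if (strategy == FS_V0_ALL_PERSIST)
        size += 6 * size_of_offsets;         /* six free-space manager addresses */
    return size;
}
```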
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: File Space Info - Version 1 +
bytebytebytebyte
VersionStrategyPersisting free-spaceThis space inserted only to align table nicely
Free-space Section ThresholdL
File Space Page Size
Page-end Metadata thresholdThis space inserted only to align table nicely

EOAO


AddressO of small-sized free-space manager for H5FD_MEM_SUPER


AddressO of small-sized free-space manager for H5FD_MEM_BTREE


AddressO of small-sized free-space manager for H5FD_MEM_DRAW


AddressO of small-sized free-space manager for H5FD_MEM_GHEAP


AddressO of small-sized free-space manager for H5FD_MEM_LHEAP


AddressO of small-sized free-space manager for H5FD_MEM_OHDR


AddressO of large-sized free-space manager for H5FD_MEM_SUPER


AddressO of large-sized free-space manager for H5FD_MEM_BTREE


AddressO of large-sized free-space manager for H5FD_MEM_DRAW


AddressO of large-sized free-space manager for H5FD_MEM_GHEAP


AddressO of large-sized free-space manager for H5FD_MEM_LHEAP


AddressO of large-sized free-space manager for H5FD_MEM_OHDR

+ + + + + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: File Space Info +
Field NameDescription

Version

This is version 1 of this message.

+

Strategy

This is the file space strategy used to manage file space. There are four types:
Value  Description
0      H5F_FSPACE_STRATEGY_FSM_AGGR
1      H5F_FSPACE_STRATEGY_PAGE
2      H5F_FSPACE_STRATEGY_AGGR
3      H5F_FSPACE_STRATEGY_NONE

+

Persisting free-space

Indicates whether free-space is persisted across file opens (true or false).

Free-space Section Threshold

This is the smallest free-space section size that the free-space manager will track.

File space page size

This is the file space page size, which is used when the paged aggregation feature is enabled.

Page-end metadata threshold

This is the smallest free-space section size at the end of a page that the free-space manager will track. This is used when the paged aggregation feature is enabled.

EOA

The EOA before the allocation of the free-space manager header and section info for the self-referential free-space managers when persisting free-space.
Note that self-referential free-space managers are managers whose own free-space header and section info require file space allocation.

Addresses of small-sized free-space managers

These are the addresses of the six small-sized free-space managers for the six file space allocation types:

  • H5FD_MEM_SUPER
  • H5FD_MEM_BTREE
  • H5FD_MEM_DRAW
  • H5FD_MEM_GHEAP
  • H5FD_MEM_LHEAP
  • H5FD_MEM_OHDR

Note that these six fields exist only if the value for the field “Persisting free-space” is true.

Addresses of large-sized free-space managers

These are the addresses of the six large-sized free-space managers for the six file space allocation types:

  • H5FD_MEM_SUPER
  • H5FD_MEM_BTREE
  • H5FD_MEM_DRAW
  • H5FD_MEM_GHEAP
  • H5FD_MEM_LHEAP
  • H5FD_MEM_OHDR

Note that these six fields exist only if the value for the field “Persisting free-space” is true.
+
+ +
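The text states that a version 0 message is mapped to version 1 on decode but does not spell out the mapping. The C sketch below shows one plausible mapping, inferred only by matching the mechanism lists of the 1.10.0 and 1.10.1+ strategies; it is an assumption, not the library's documented behavior, and it ignores how thresholds are carried over. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two version 1 fields derived here. */
struct fs_v1_settings {
    uint8_t strategy;   /* version 1 Strategy value (0..3) */
    bool    persist;    /* version 1 Persisting free-space */
};

/*
 * Hypothetical mapping from a version 0 Strategy value to version 1
 * settings, chosen so that the mechanisms used stay the same.
 * Returns 0 on success, -1 for an unknown version 0 value.
 */
int fsinfo_map_v0_to_v1(uint8_t v0_strategy, struct fs_v1_settings *out)
{
    switch (v0_strategy) {
    case 1:   /* H5F_FILE_SPACE_ALL_PERSIST -> FSM_AGGR, persisting */
        out->strategy = 0; out->persist = true;  return 0;
    case 2:   /* H5F_FILE_SPACE_ALL -> FSM_AGGR */
        out->strategy = 0; out->persist = false; return 0;
    case 3:   /* H5F_FILE_SPACE_AGGR_VFD -> AGGR */
        out->strategy = 2; out->persist = false; return 0;
    case 4:   /* H5F_FILE_SPACE_VFD -> NONE */
        out->strategy = 3; out->persist = false; return 0;
    default:
        return -1;
    }
}
```

H5F_FSPACE_STRATEGY_PAGE has no version 0 counterpart in this mapping, since paged aggregation was introduced with release 1.10.1+.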

IV.B. Disk Format: Level 2B - Data Object Data Storage

+ +

The data for an object is stored separately from its header information in the file and may not actually be located in the HDF5 file itself if the header indicates that the data is stored externally. The information for each record in the object is stored according to the dimensionality of the object (indicated in the dataspace header message). Multi-dimensional array data is stored in C order; in other words, the “last” dimension changes fastest.

+ +

Data whose elements are composed of atomic datatypes is stored in IEEE format, unless it is specifically defined as being stored in a different machine format with the architecture-type information from the datatype header message. This means that each architecture may need to byte-swap data values into the internal representation for that particular machine.

+ +

Data with a variable-length datatype is stored in the global heap of the HDF5 file. Global heap identifiers are stored in the data object storage.

+ +

Data whose elements are composed of reference datatypes is stored in several different ways depending on the particular reference type involved. Object pointers are stored as the offset of the object header being pointed to, and the size of the pointer is the same number of bytes as offsets in the file.

+ +

Dataset region references are stored as a heap ID which points to the following information within the file heap: an offset of the object pointed to, number-type information (same format as header message), dimensionality information (same format as header message), sub-set start and end information (in other words, a coordinate location for each), and field start and end names (in other words, a pointer to the string naming the first field included and a pointer to the string naming the last field).

+ +

Data of a compound datatype is stored as a contiguous stream of the items in the structure, with each item formatted according to its datatype.

Descriptions of the variable-length, reference, and compound datatype classes can be found in Datatype Message.

Information about the global heap and heap IDs can be found in Global Heap.

For reference datatypes, see also the encoding descriptions for Reference Encoding (Revised) and Reference Encoding (Backward Compatibility) in Appendix D.

+ +

V. Appendix A: Definitions

+ +

Definitions of various terms used in this document are included in this section.

+ +
+ + + + + + + + + + + + + + + + +
Term               Definition
Undefined Address  The undefined address for a file is a file address with all bits set: in other words, 0xffff...ff.
Unlimited Size     The unlimited size is a size value with all bits set: in other words, 0xffff...ff.
+
+ + +
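Because the undefined address is defined as "all bits set" at whatever width the superblock's Size of Offsets specifies, it can be tested on the raw encoded bytes without decoding them. A minimal sketch, with a hypothetical function name:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the encoded address at `addr`, which is
 * `size_of_offsets` bytes wide, has every bit set.
 */
bool is_undefined_address(const uint8_t *addr, size_t size_of_offsets)
{
    for (size_t i = 0; i < size_of_offsets; i++)
        if (addr[i] != 0xFF)
            return false;
    return true;
}
```

The same check applies to the unlimited size if the width passed in is the superblock's Size of Lengths instead.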

VI. Appendix B: File Space Allocation Types

+ +

There are six basic types of file space allocation as follows: +

+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Basic Allocation Type  Description
H5FD_MEM_SUPER         File space allocated for Superblock.
H5FD_MEM_BTREE         File space allocated for B-tree.
H5FD_MEM_DRAW          File space allocated for raw data.
H5FD_MEM_GHEAP         File space allocated for Global Heap.
H5FD_MEM_LHEAP         File space allocated for Local Heap.
H5FD_MEM_OHDR          File space allocated for Object Header.
+
+ +
+

There are other file space allocation types that are mapped to the above six basic types because they are similar in nature. The mapping and the corresponding description are listed in the following two tables:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Basic Allocation Type  Mapping of Allocation Types to Basic Allocation Types
H5FD_MEM_SUPER         none
H5FD_MEM_BTREE         H5FD_MEM_SOHM_INDEX
H5FD_MEM_DRAW          H5FD_MEM_FHEAP_HUGE_OBJ
H5FD_MEM_GHEAP         none
H5FD_MEM_LHEAP         H5FD_MEM_FHEAP_DBLOCK, H5FD_MEM_FSPACE_SINFO
H5FD_MEM_OHDR          H5FD_MEM_FHEAP_HDR, H5FD_MEM_FHEAP_IBLOCK, H5FD_MEM_FSPACE_HDR, H5FD_MEM_SOHM_TABLE
+
+ +
+

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Allocation Type          Description
H5FD_MEM_FHEAP_HDR       File space allocated for Fractal Heap Header.
H5FD_MEM_FHEAP_DBLOCK    File space allocated for Fractal Heap Direct Blocks.
H5FD_MEM_FHEAP_IBLOCK    File space allocated for Fractal Heap Indirect Blocks.
H5FD_MEM_FHEAP_HUGE_OBJ  File space allocated for huge objects in the fractal heap.
H5FD_MEM_FSPACE_HDR      File space allocated for Free-space Manager Header.
H5FD_MEM_FSPACE_SINFO    File space allocated for Free-space Section List of the free-space manager.
H5FD_MEM_SOHM_TABLE      File space allocated for Shared Object Header Message Table.
H5FD_MEM_SOHM_INDEX      File space allocated for Shared Message Record List.
+
+ +

VII. Appendix C: Types of Indexes for Dataset Chunks

+ +

For an HDF5 file without the latest format enabled, the library uses the Version 1 B-tree to index dataset chunks.

+ +

For an HDF5 file with the latest format enabled, the library uses one of the following five indexing types depending on a chunked dataset’s dimension specification and the way it is extended.

+ + +

VII.A. The Single Chunk Index

+ +

The Single Chunk index can be used when the dataset fulfills the following condition:

+ +
  • the current, maximum, and chunk dimension sizes are all the same
+ +

The dataset has only one chunk, and the address of the single chunk is stored in the version 4 Data Layout message. See the Chunked Storage Property Description layout and field description tables.

+ + +

VII.B. The Implicit Index

+ +

The Implicit index can be used when the dataset fulfills the following conditions:

+ +
  • fixed maximum dimension sizes
  • no filter applied to the dataset
  • the timing for the space allocation of the dataset chunks is H5P_ALLOC_TIME_EARLY
+ +

Since the dataset’s dimension sizes are known and storage space is to be allocated early, an array of dataset chunks is allocated based on the maximum dimension sizes when the dataset is created. The base address of the array is stored in the version 4 Data Layout message. See the Chunked Storage Property Description layout and field description tables.

+ +

When accessing a dataset chunk with a specified offset, the address of the chunk in the array is computed as below:

+ +

base address + (size of a chunk in bytes * chunk index associated with the offset)

+ +

A chunk index starts at 0 and increases according to the fastest changing dimension, then the next fastest, and so on. The chunk index for a dataset chunk offset is computed as below:

  1. Calculate the scaled offset for each dimension in scaled_offset:

        scaled_offset = chunk_offset/chunk_dims

  2. Calculate the # of chunks for each dimension in nchunks:

        nchunks = (curr_dims + chunk_dims - 1)/chunk_dims

  3. Calculate the down chunks for each dimension in down_chunks:

        /* n is the # of dimensions */
        for(i = (int)(n-1), acc = 1; i >= 0; i--) {
            down_chunks[i] = acc;
            acc *= nchunks[i];
        }

  4. Calculate the chunk index in chunk_index:

        /* n is the # of dimensions */
        for(u = 0, chunk_index = 0; u < n; u++)
            chunk_index += down_chunks[u] * scaled_offset[u];

For example, for a 2-dimensional dataset with curr_dims = [4,5] and chunk_dims = [3,2], there will be a total of 6 chunks, with 3 chunks in the fastest changing dimension and 2 chunks in the slowest changing dimension. See the figure below. The chunk index for the chunk offset [3,4] is computed as below:

  1. scaled_offset[0] = 1, scaled_offset[1] = 2
  2. nchunks[0] = 2, nchunks[1] = 3
  3. down_chunks[0] = 3, down_chunks[1] = 1
  4. chunk_index = 5
+ + + + + + + + + +
Figure 3. Implicit index chunk diagram
+ + + + + + +
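The four steps and the address formula can be combined into one self-contained C sketch. The function names and the `MAX_DIMS` bound are illustrative assumptions; the arithmetic follows the steps given above, with the per-dimension division and multiplication written out explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMS 32   /* illustrative upper bound on the rank (an assumption) */

/*
 * Hypothetical helper: chunk index for the chunk whose element offset
 * is `chunk_offset`, in a dataset of rank `n`.
 */
uint64_t implicit_chunk_index(size_t n, const uint64_t *chunk_offset,
                              const uint64_t *curr_dims,
                              const uint64_t *chunk_dims)
{
    uint64_t down_chunks[MAX_DIMS];
    uint64_t acc = 1, chunk_index = 0;

    assert(n > 0 && n <= MAX_DIMS);

    /* Steps 2 and 3: walk from the fastest changing (last) dimension. */
    for (size_t i = n; i-- > 0; ) {
        uint64_t nchunks = (curr_dims[i] + chunk_dims[i] - 1) / chunk_dims[i];

        down_chunks[i] = acc;
        acc *= nchunks;
    }

    /* Steps 1 and 4: scale each offset and accumulate the index. */
    for (size_t u = 0; u < n; u++)
        chunk_index += down_chunks[u] * (chunk_offset[u] / chunk_dims[u]);
    return chunk_index;
}

/* Address of a chunk in the implicitly indexed array of chunks. */
uint64_t implicit_chunk_address(uint64_t base_address,
                                uint64_t chunk_size_bytes,
                                uint64_t chunk_index)
{
    return base_address + chunk_size_bytes * chunk_index;
}
```

For the example above (curr_dims = [4,5], chunk_dims = [3,2], chunk offset [3,4]) the first function yields chunk index 5; the second then adds five chunk sizes to the base address.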

VII.C. The Fixed Array Index

+ +

The Fixed Array index can be used when the dataset fulfills the following condition:

+
  • fixed maximum dimension sizes
+ +

Since the maximum number of chunks is known, an array of on-disk chunk addresses sized for the maximum number of chunks is allocated when data is written to the dataset. To access a dataset chunk with a specified offset, the chunk index associated with the offset is calculated. The index is mapped into the array to locate the disk address for the chunk.

+ +

The Fixed Array (FA) index structure provides space and speed improvements in locating chunks over index structures that handle more dynamic data accesses, such as a Version 2 B-tree index. The entry into the Fixed Array is the Fixed Array header, which contains metadata about the entries stored in the array. The header contains a pointer to a data block which stores the array of entries that describe the dataset chunks. For greater efficiency, the array will be divided into multiple pages if the number of entries exceeds a threshold value. The space for the data block and any data block pages is allocated as a single contiguous block of space.

+ +

The content of the data block depends on whether paging is activated or not. When paging is not used, elements that describe the chunks are stored in the data block. If paging is turned on, the data block contains a bitmap indicating which pages are initialized, and the subsequent data block pages contain the entries that describe the chunks.

+ +

An entry describes either a filtered or non-filtered dataset chunk. The formats for both element types are described below.

+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fixed Array Header +
bytebytebytebyte
Signature
VersionClient IDEntry SizePage Bits

Max Num + EntriesL


Data Block + AddressO

Checksum
+ + + + + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fixed Array Header +
Field NameDescription

Signature

+

The ASCII character string “FAHD” is used to indicate the beginning of a Fixed Array header. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

+

Version

+

This document describes version 0.

+

Client ID

+

The ID for identifying the client of the Fixed Array:
ID  Description
0   Non-filtered dataset chunks
1   Filtered dataset chunks
2+  Reserved
+

+

Entry Size

+

The size in bytes of an entry in the Fixed Array. +

+

Page Bits

+

The number of bits needed to store the maximum number of entries in a data block page.

+

Max Num Entries

+

The maximum number of entries in the Fixed Array.

+

Data Block Address

+

The address of the data block in the Fixed Array. +

+

Checksum

+

The checksum for the header.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Fixed Array Data Block +
bytebytebytebyte
Signature
VersionClient IDThis space inserted + only to align table nicely

Header AddressO


Page Bitmap (variable size and + optional)


Elements (variable size and + optional)

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Fixed Array Data Block +
Field NameDescription

Signature

+

The ASCII character string “FADB” is used to indicate the beginning of a Fixed Array data block. This gives file consistency checking utilities a better chance of reconstructing a damaged file.

+

Version

+

This document describes version 0.

+

Client ID

+

The ID for identifying the client of the Fixed Array:
ID  Description
0   Non-filtered dataset chunks
1   Filtered dataset chunks
2+  Reserved
+

+

Header Address

+

The address of the Fixed Array header. Principally used for file integrity checking.

+

Page Bitmap

A bitmap indicating which data block pages are initialized.

+

Exists only if the data block is paged.

Elements

+

Contains the elements stored in the data block and exists only if the data block is not paged. There are two element types:
ID  Description
0   Non-filtered dataset chunks
1   Filtered dataset chunks
+

+

Checksum

+

The checksum for the Fixed Array data block.

+
+
+ +
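The relationship between Max Num Entries, Page Bits, and the optional Page Bitmap can be illustrated with a short C sketch. It rests on assumptions that the text does not state outright: a page is taken to hold 2^(Page Bits) entries, paging is taken to start once Max Num Entries exceeds that value, and the bitmap is taken to use one bit per page, rounded up to whole bytes. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of how a Fixed Array data block is laid out. */
struct fa_paging {
    bool     paged;          /* true if the data block is split into pages */
    uint64_t npages;         /* number of data block pages (0 if not paged) */
    uint64_t bitmap_bytes;   /* size of the page bitmap (0 if not paged) */
};

/* Assumes page_bits < 64. */
struct fa_paging fa_paging_info(uint64_t max_num_entries, uint8_t page_bits)
{
    struct fa_paging info = { false, 0, 0 };
    uint64_t page_entries = (uint64_t)1 << page_bits;   /* entries per page */

    if (max_num_entries > page_entries) {
        info.paged = true;
        info.npages = (max_num_entries + page_entries - 1) / page_entries;
        info.bitmap_bytes = (info.npages + 7) / 8;       /* one bit per page */
    }
    return info;
}
```

Under these assumptions, an array that fits in a single page stores its elements directly in the data block and has no bitmap.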
+
+
+
+ + + + + + + + + + + + + + + + + + +
+ Layout: Fixed Array Data Block Page +
bytebytebytebyte

Elements (variable + size)

Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Fixed Array Data Block Page +
Field NameDescription

Elements

+

Contains the elements stored in the data block page. There are two element types:
ID  Description
0   Non-filtered dataset chunks
1   Filtered dataset chunks
+

+

Checksum

+

The checksum for a Fixed Array data block page.

+
+
+ +
+
+
+ +
+ + + + + + + + + + + + + + +
+ Layout: Data Block Element for Non-filtered Dataset Chunk +
bytebytebytebyte

AddressO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + +
+ Fields: Data Block Element for Non-filtered Dataset Chunk +
Field NameDescription

Address

The address of the dataset chunk in the file. +

+
+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Data Block Element for Filtered Dataset Chunk +
bytebytebytebyte

AddressO


Chunk Size (variable size; at most + 8 bytes)

Filter Mask
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Data Block Element for Filtered Dataset Chunk +
Field NameDescription

Address

The address of the dataset chunk in the file. +

+

Chunk Size

The size of the dataset chunk in bytes. +

+

Filter Mask

Indicates the filter to skip for the dataset chunk. Each + filter has an index number in the pipeline; if that filter is + skipped, the bit corresponding to its index is set. +

+
+
+ + +
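The Filter Mask rule above (a set bit means the filter at that pipeline index was skipped) can be sketched in C as follows. This is a minimal illustration, not library code: the helper name `chunk_filter_skipped` is hypothetical, and it assumes the mask has already been decoded into a 32-bit integer with pipeline index 0 mapped to the least significant bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the filter at pipeline position
 * `filter_index` was skipped for this chunk, i.e. its bit in the
 * Filter Mask is set.  Assumes index 0 is the least significant bit. */
static bool
chunk_filter_skipped(uint32_t filter_mask, unsigned filter_index)
{
    assert(filter_index < 32); /* a 4-byte mask covers 32 pipeline positions */
    return ((filter_mask >> filter_index) & 1u) != 0;
}
```

A mask of 0 therefore means that every filter in the pipeline was applied to the chunk.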

VII.D. The Extensible Array Index

+ +

The Extensible Array index can be used when the dataset + fulfills the following condition:

+ +
    +
  • only one dimension of unlimited extent
  • +
+ +

The Extensible Array (EA) is a data structure that is used as a + chunk index in datasets where the dataspace has a single + unlimited dimension. In other words, one dimension is set to + H5S_UNLIMITED, and the other dimensions are any number + of fixed-size dimensions. The idea behind the extensible array is + that a particular data object can be located via a lightweight + indexing structure of fixed depth for a given address space. This + indexing structure requires only a few (2-3) file operations per + element lookup and gives good cache performance. Unlike the B-tree + structure, the extensible array is optimized for appends. Where a + B-tree would always add at the rightmost node under these + circumstances, either creating a deep tree (version 1) or requiring + expensive rebalances to correct (version 2), the extensible array + has already mapped out a pre-balanced internal structure. This + optimized internal structure is instantiated as needed when chunk + records are inserted into the structure.

+ + + + + + + +

An Extensible Array consists of a header, an index block, + secondary blocks, data blocks, and (optional) data block pages. The + general scheme is that the index block is used to reference a + secondary block, which is, in turn, used to reference the data block + page where the chunk information is stored. The data blocks will + be paged for efficiency when their size passes a threshold value. + These pages are laid out contiguously on the disk after the data + block, are initialized as needed, and are tracked via bitmaps + stored in the secondary block. The number of secondary and data + blocks/pages in a chunk index varies as they are allocated as + needed and the first few are (conceptually) stored in parent + elements as an optimization.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Extensible Array Header +
bytebytebytebyte
Signature
VersionClient IDElement SizeMax Nelmts Bits
Index Blk ElmtsData Blk Min ElmtsSecondary Blk Min Data PtrsMax Data Blk Page Nelmts Bits

Num Secondary BlksL


Secondary Blk SizeL


Num Data BlksL


Data Blk SizeL


Max Index SetL


Num ElementsL


Index Block AddressO

Checksum
+ + + + + + + + +
  + (Items marked with an ‘L’ in the above table are + of the size specified in the Size + of Lengths field in the superblock.) +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Extensible Array Header +
Field NameDescription

Signature

+

The ASCII character string “EAHD” + is used to indicate the beginning of an Extensible Array + header. This gives file consistency checking utilities a + better chance of reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Client ID

+

The ID for identifying the client of the + Extensible Array: + + + + + + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
2+Reserved. +
+

+

Element Size

+

The size in bytes of an element in the Extensible Array. +

+

Max Nelmts Bits

+

The number of bits needed to store the + maximum number of elements in the Extensible Array.

+

Index Blk Elmts

+

The number of elements to store in the index block. +

+

Data Blk Min Elmts

+

The minimum number of elements per data block. +

+

Secondary Blk Min Data Ptrs

+

The minimum number of data block pointers for a + secondary block. +

+

Max Data Blk Page Nelmts Bits

+

The number of bits needed to store the maximum number + of elements in a data block page. +

+

Num Secondary Blks

+

The number of secondary blocks created. +

+

Secondary Blk Size

+

The size of the secondary blocks created. +

+

Num Data Blks

+

The number of data blocks created. +

+

Data Blk Size

+

The size of the data blocks created. +

+

Max Index Set

+

The maximum index set. +

+

Num Elements

+

The number of elements realized. +

+

Index Block Address

+

The address of the index block. +

+

Checksum

+

The checksum for the header.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Extensible Array Index Block +
bytebytebytebyte
Signature
VersionClient IDThis space inserted + only to align table nicely

Header AddressO


Elements (variable size and + optional)


Data Block Addresses (variable + size and optional)


Secondary Block Addresses (variable + size and optional)

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Extensible Array Index Block +
Field NameDescription

Signature

+

The ASCII character string “EAIB” + is used to indicate the beginning of an Extensible Array + Index Block. This gives file consistency checking utilities + a better chance of reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Client ID

+

The client ID for identifying the user of the + Extensible Array: + + + + + + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
2+Reserved. +
+

+

Header Address

+

The address of the Extensible Array header. Principally + used for file integrity checking.

+

Elements

+

Contains the elements that are stored directly in + the index block. An optimization to avoid unnecessary + secondary blocks. +
+
+ There are two element types: + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
+

+

Data Block Addresses

+

Contains the addresses of the data blocks + that are stored directly in the Index Block. An + optimization to avoid unnecessary secondary blocks.

+

Secondary Block Addresses

+

Contains the addresses of the secondary + blocks.

+

Checksum

+

The checksum for the Extensible Array Index Block.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Extensible Array Secondary Block +
bytebytebytebyte
Signature
VersionClient IDThis space inserted + only to align table nicely

Header AddressO


Block Offset (variable + size)


Page Bitmap (variable size and + optional)


Data Block Addresses (variable + size and optional)

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Extensible Array Secondary Block +
Field NameDescription

Signature

+

The ASCII character string “EASB” + is used to indicate the beginning of an Extensible Array + Secondary Block. This gives file consistency checking utilities + a better chance of reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Client ID

+

The ID for identifying the client of the + Extensible Array: + + + + + + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
2+Reserved. +
+

+

Header Address

+

The address of the Extensible Array header. Principally + used for file integrity checking.

+

Block Offset

+

Stores the offset of the block in the array. +

+

Page Bitmap

+

A bitmap indicating which + data block pages are initialized. +

+ Exists only if the data block is paged. +

Data Block Addresses

+

Contains the addresses of the data blocks + referenced by this secondary block.

+

Checksum

+

The checksum for the Extensible Array + Secondary Block.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Extensible Array Data Block +
bytebytebytebyte
Signature
VersionClient IDThis space inserted + only to align table nicely

Header AddressO


Block Offset (variable + size)


Elements (variable size and + optional)

Checksum
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Extensible Array Data Block +
Field NameDescription

Signature

+

The ASCII character string “EADB” + is used to indicate the beginning of an Extensible Array + data block. This gives file consistency checking utilities + a better chance of reconstructing a damaged file. +

+

Version

+

This document describes version 0.

+

Client ID

+

The ID for identifying the client of the + Extensible Array: + + + + + + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
2+Reserved. +
+

+

Header Address

+

The address of the Extensible Array header. Principally + used for file integrity checking. +

+

Block Offset

+

The offset of the block in the array. +

Elements

+

Contains the elements stored in the data block and + exists only if the data block is not paged. +
+
+ There are two element types: + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
+

+

Checksum

+

The checksum for the Extensible Array data block.

+
+
+ +
+
+
+
+ + + + + + + + + + + + + + + + + + +
+ Layout: Extensible Array Data Block Page +
bytebytebytebyte

Elements (variable + size)

Checksum
+
+ +
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Extensible Array Data Block Page +
Field NameDescription

Elements

+

Contains the elements stored in the data block + page.

+

+ There are two element types: + + + + + + + + + + + + + + +
IDDescription
0Non-filtered dataset chunks +
1Filtered dataset chunks +
+

+

Checksum

+

The checksum for an Extensible Array data block + page.

+
+
+ +
+
+
+ +
+ + + + + + + + + + + + + + +
+ Layout: Data Block Element for Non-filtered Dataset Chunk +
bytebytebytebyte

AddressO

+ + + + + +
+
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + +
+ Fields: Data Block Element for Non-filtered Dataset Chunk +
Field NameDescription

Address

The address of the dataset chunk in the file. +

+
+
+

+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Data Block Element for Filtered Dataset Chunk +
bytebytebytebyte

AddressO


Chunk Size (variable size; at + most 8 bytes)

Filter Mask
+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+
+ +
+
+ + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Data Block Element for Filtered Dataset Chunk +
Field NameDescription

Address

The address of the dataset chunk in the file. +

+

Chunk Size

The size of the dataset chunk in bytes. +

+

Filter Mask

Indicates the filter to skip for the dataset chunk. + Each filter has an index number in the pipeline; if that + filter is skipped, the bit corresponding to its index is set. +

+
+
+ + +

VII.E. The Version 2 B-trees Index

+ +

The Version 2 B-trees index can be used when the dataset + fulfills the following condition:

+ +
    +
  • more than one dimension of unlimited extent
  • +
+ +

Version 2 B-trees can be used to index various objects in the + library. See “Version 2 B-trees” + for more information. The B-tree types 10 + and 11 record layouts are for + indexing dataset chunks.

+ +

VIII. Appendix D: + Encoding for dataspace and reference

+ + +

VIII.A. Dataspace Encoding

+H5Sencode is a public routine that encodes a dataspace description into a buffer while +H5Sdecode is the corresponding routine that decodes the description encoded in the buffer. +

+ See the reference manual description for these two public routines. + +
+
+
+

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Dataspace Description for H5Sencode/H5Sdecode +
bytebytebytebyte
Dataspace IDEncode VersionSize of SizeThis space inserted + only to align table nicely

Size of Extent +



Dataspace Message + (variable size) +



Dataspace Selection + (variable size) +

+ +
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Dataspace Description for H5Sencode/H5Sdecode +
Field NameDescription

Dataspace ID

+

The dataspace message ID, which is 1.

+

Encode Version

+

H5S_ENCODE_VERSION which is 0. +

+

Size of Size

+

The number of bytes used to store the size of an object. +

+

Size of Extent

+

Size of the dataspace message. +

+

Dataspace Message

+

The dataspace message information. See + Dataspace Message.

+

+

Dataspace Selection

+

The dataspace selection information. See + Dataspace Selection.

+
+
+ + +
+
+
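The fixed-size prefix of the encoded dataspace description can be read as in the sketch below. It is only an illustration under stated assumptions: the struct and function names are hypothetical, the "alignment" cell in the layout is taken to occupy no bytes, and Size of Extent is assumed to be a 4-byte little-endian integer immediately following Size of Size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of the fixed-size prefix. */
typedef struct {
    uint8_t  dataspace_id;   /* expected to be 1 */
    uint8_t  encode_version; /* expected to be 0 */
    uint8_t  size_of_size;   /* bytes used to store an object size */
    uint32_t size_of_extent; /* size of the dataspace message */
} enc_dataspace_prefix_t;

/* Parses the prefix from `buf`; returns 0 on success and -1 if the
 * buffer is too short or the ID/version do not match the text above. */
static int
parse_enc_dataspace_prefix(const uint8_t *buf, size_t len, enc_dataspace_prefix_t *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 7)
        return -1;
    out->dataspace_id   = buf[0];
    out->encode_version = buf[1];
    out->size_of_size   = buf[2];
    /* assumption: 4-byte little-endian value with no padding before it */
    out->size_of_extent = (uint32_t)buf[3] | ((uint32_t)buf[4] << 8) |
                          ((uint32_t)buf[5] << 16) | ((uint32_t)buf[6] << 24);
    if (out->dataspace_id != 1 || out->encode_version != 0)
        return -1;
    return 0;
}
```

The Dataspace Message and Dataspace Selection that follow are variable-sized and would be decoded separately.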
+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Dataspace Selection +
bytebytebytebyte
Selection Type

Selection Info (variable + size)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + +
+ Fields: Dataspace Selection +
Field NameDescription

Selection Type

+

There are 4 types of selection: + + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0H5S_SEL_NONE: Nothing selected +
1H5S_SEL_POINTS: Sequence of points selected +
2H5S_SEL_HYPER: Hyperslab selected +
3H5S_SEL_ALL: Entire extent selected +
+

Selection Info

+

There are 4 types of selection info: + + + + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
0Selection info for H5S_SEL_NONE +
1Selection info for H5S_SEL_POINTS +
2Selection info for H5S_SEL_HYPER +
3Selection info for H5S_SEL_ALL +
+

+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Selection Info for H5S_SEL_NONE +
bytebytebytebyte
Version

Reserved (zero, 8 bytes)

+
+ +
+
+
+ + + + + + + + + + + +
+ Fields: Selection Info for H5S_SEL_NONE +
Field NameDescription

Version

The version number for the H5S_SEL_NONE Selection Info. + The value is 1.

+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Selection Info for H5S_SEL_POINTS +
bytebytebytebyte
Version


Points Selection Info (variable size) +


+
+ +
+
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Selection Info for H5S_SEL_POINTS +
Field NameDescription

Version

The version number for the H5S_SEL_POINTS Selection Info. + The value is either 1 or 2.

Points Selection Info

Depending on version: + + + + + + + + + + + + + + + + +
VersionDescription
1See Version 1 Points Selection Info +
2See Version 2 Points Selection Info +
+

+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 1 Points Selection Info +
bytebytebytebyte
Reserved (zero)
Length
Rank
Num Points
Point #1: coordinate #1
.
.
.
Point #1: coordinate #u
.
.
.
Point #n: coordinate #1
.
.
.
Point #n: coordinate #u
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 1 Points Selection Info +
Field NameDescription

Length

The size in bytes from Length to the end of the + selection info.

Rank

The number of dimensions.

Num Points

The number of points in the selection.

Point #n: coordinate #u

The array of points in the selection. +

The points selected are #1 to #n where n is Num Points. +

The coordinates of each point are #1 to #u, where u is + Rank.

+
+ + +
+
+
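To make the Version 1 layout concrete, the following sketch looks up one coordinate of one point. It assumes, beyond what the tables state, that every field is a 4-byte little-endian integer and that the buffer starts at the Reserved field; the function names are hypothetical, and the Length field is not interpreted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Loads a 4-byte little-endian unsigned integer (assumed byte order). */
static uint32_t
load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads coordinate `dim` of point `point` (both 0-based) from a Version 1
 * Points Selection Info buffer that starts at the Reserved field.
 * Returns 0 on success, -1 if the indices or the buffer are out of range. */
static int
points_v1_coord(const uint8_t *info, size_t len, uint32_t point, uint32_t dim, uint32_t *coord)
{
    assert(info != NULL && coord != NULL);
    if (len < 16) /* Reserved + Length + Rank + Num Points */
        return -1;
    uint32_t rank    = load_le32(info + 8);
    uint32_t npoints = load_le32(info + 12);
    if (point >= npoints || dim >= rank)
        return -1;
    /* points are stored one after another, each with `rank` coordinates */
    uint64_t off = 16 + 4 * ((uint64_t)point * rank + dim);
    if (off + 4 > len)
        return -1;
    *coord = load_le32(info + off);
    return 0;
}
```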
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 Points Selection Info +
bytebytebytebyte
Encode SizeThis space inserted only to align table nicely +
Rank
Num Points

(2, 4 or 8 bytes)

Point #1: coordinate #1

(2, 4 or 8 bytes)

.
.
.
Point #1: coordinate #u

(2, 4 or 8 bytes)

.
.
.
Point #n: coordinate #1

(2, 4 or 8 bytes)

.
.
.
Point #n: coordinate #u

(2, 4 or 8 bytes)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 Points Selection Info +
Field NameDescription

Encode Size

The size for encoding the points selection info which can be 2, 4 or 8 bytes. +

Rank

The number of dimensions.

Num Points

The number of points in the selection. +

The field Encode Size indicates the size of this field.

Point #n: coordinate #u

The array of points in the selection. +

The points selected are #1 to #n where n is Num Points. +

The coordinates of each point are #1 to #u, where u is + Rank. +

The field Encode Size indicates the size of this field.

+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Selection Info for H5S_SEL_HYPER +
bytebytebytebyte
Version

Hyperslab Selection Info + (variable size)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Selection Info for H5S_SEL_HYPER +
Field NameDescription

Version

The version number for the H5S_SEL_HYPER selection info. + The value is 1, 2 or 3.

Hyperslab Selection Info

Depending on version: + + + + + + + + + + + + + + + + + + + + +
VersionDescription
1See Version 1 Hyperslab Selection Info. +
2See Version 2 Hyperslab Selection Info +
3See Version 3 Hyperslab Selection Info +
+

+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 1 Hyperslab Selection Info +
bytebytebytebyte
Reserved
Length
Rank
Num Blocks
Starting Offset #1 for Block #1
.
.
.
Starting Offset #n for Block #1
Ending Offset #1 for Block #1
.
.
.
Ending Offset #n for Block #1
.
.
.
.
.
.
.
.
.
Starting Offset #1 for Block #u
.
.
.
Starting Offset #n for Block #u
Ending Offset #1 for Block #u
.
.
.
Ending Offset #n for Block #u
+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 1 Hyperslab Selection Info +
Field NameDescription

Length

The size in bytes from the field Rank to the + end of the Selection Info.

Rank

The number of dimensions in the dataspace.

Num Blocks

The number of blocks in the selection.

Starting Offset #n for Block #u

The offset #n of the starting element in block #u. +

#n is from 1 to Rank. +

#u is from 1 to Num Blocks moving from the fastest + changing dimension to the slowest changing dimension. +

Ending Offset #n for Block #u

The offset #n of the ending element in block #u. +

#n is from 1 to Rank. +

#u is from 1 to Num Blocks moving from the fastest + changing dimension to the slowest changing dimension. +

+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 2 Hyperslab Selection Info +
bytebytebytebyte
FlagsThis space inserted + only to align table nicely
Length
Rank
Start #1 (8 bytes)

Stride #1 (8 bytes)

Count #1 (8 bytes)

Block #1 (8 bytes)

.
.
.
Start #n (8 bytes)

Stride #n (8 bytes)

Count #n (8 bytes)

Block #n (8 bytes)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 2 Hyperslab Selection Info +
Field NameDescription

Flags

This is a bit field with the following definition. + Currently, this is always set to 0x1. +

+ + + + + + + + + + +
BitDescription
0If set, it is a regular hyperslab; otherwise, irregular. +
+

Length

The size in bytes from the field Rank to the + end of the Selection Info.

Rank

The number of dimensions in the dataspace.

Start #n

The offset of the starting element in the block. +

#n is from 1 to Rank. +

Stride #n

The number of elements to move in each dimension. +

#n is from 1 to Rank. +

Count #n

The number of blocks to select in each dimension. +

#n is from 1 to Rank. +

Block #n

The size (in elements) of each block in each dimension. +

#n is from 1 to Rank. +

+
+ + + + +
+
+
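The meaning of Start, Stride, Count and Block is easiest to see for a single dimension. The sketch below lists the element indices that a regular hyperslab selects along one dimension; the function name and the output-buffer convention are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the indices selected along one dimension into `out` (capacity
 * `cap`) and returns how many were written.  Block i starts at
 * start + i * stride and covers `block` consecutive elements. */
static size_t
hyperslab_1d_indices(uint64_t start, uint64_t stride, uint64_t count, uint64_t block,
                     uint64_t *out, size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    for (uint64_t i = 0; i < count; i++) {
        for (uint64_t j = 0; j < block; j++) {
            if (n == cap) /* stop early rather than overflow the buffer */
                return n;
            out[n++] = start + i * stride + j;
        }
    }
    return n;
}
```

For example, Start 1, Stride 2, Count 5 and Block 1 select indices 1, 3, 5, 7 and 9. For a multi-dimensional dataspace the same rule applies independently to each dimension.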
+ +
+ + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 3 Hyperslab Selection Info +
bytebytebytebyte
FlagsEncode SizeThis space inserted + only to align table nicely
Rank

Regular/Irregular Hyperslab Selection Info +

(variable size)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 3 Hyperslab Selection Info +
Field NameDescription

Flags

This is a bit field with the following definition: +

+ + + + + + + + + + +
BitDescription
0If set, it is a regular hyperslab, otherwise, irregular. +
+

Encode Size

The size for encoding hyperslab selection info, which can be 2, 4 or 8 bytes.

Rank

The number of dimensions in the dataspace.

Regular/Irregular Hyperslab Selection Info

This is the selection info for version 3 hyperslab which can be regular or irregular. +

If bit 0 of the field Flags is set, + see Version 3 Regular Hyperslab Selection Info. +

Otherwise, see Version 3 Irregular Hyperslab Selection Info +

+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 3 Regular Hyperslab Selection Info +
bytebytebytebyte
Start #1

(2, 4 or 8 bytes)

Stride #1

(2, 4 or 8 bytes)

Count #1

(2, 4 or 8 bytes)

Block #1

(2, 4 or 8 bytes)

.
.
.
Start #n

(2, 4 or 8 bytes)

Stride #n

(2, 4 or 8 bytes)

Count #n

(2, 4 or 8 bytes)

Block #n

(2, 4 or 8 bytes)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Version 3 Regular Hyperslab Selection Info +
Field NameDescription

Start #n

The offset of the starting element in the block. +

#n is from 1 to Rank. +

The field Encode Size indicates the size of this field. +

Stride #n

The number of elements to move in each dimension. +

#n is from 1 to Rank. +

The field Encode Size indicates the size of this field. +

Count #n

The number of blocks to select in each dimension. +

#n is from 1 to Rank. +

The field Encode Size indicates the size of this field. +

Block #n

The size (in elements) of each block in each dimension. +

#n is from 1 to Rank. +

The field Encode Size indicates the size of this field. +

+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Version 3 Irregular Hyperslab Selection Info +
bytebytebytebyte
Num Blocks

(2, 4 or 8 bytes)

Starting Offset #1 for Block #1

(2, 4 or 8 bytes)

.
.
.
Starting Offset #n for Block #1

(2, 4 or 8 bytes)

Ending Offset #1 for Block #1

(2, 4 or 8 bytes)

.
.
.
Ending Offset #n for Block #1

(2, 4 or 8 bytes)

.
.
.
.
.
.
.
.
.
Starting Offset #1 for Block #u

(2, 4 or 8 bytes)

.
.
.
Starting Offset #n for Block #u

(2, 4 or 8 bytes)

Ending Offset #1 for Block #u

(2, 4 or 8 bytes)

.
.
.
Ending Offset #n for Block #u

(2, 4 or 8 bytes)

+
+ +
+
+
+ + + + + + + + + + + + + + + + + + +
+ Fields: Version 3 Irregular Hyperslab Selection Info +

Num Blocks

The number of blocks in the selection. +

The field Encode Size indicates the size of this field.

Starting Offset #n for Block #u

The offset #n of the starting element in block #u. +

#n is from 1 to Rank. +

#u is from 1 to Num Blocks moving from the fastest + changing dimension to the slowest changing dimension. +

The field Encode Size indicates the size of this field. +

Ending Offset #n for Block #u

The offset #n of the ending element in block #u. +

#n is from 1 to Rank. +

#u is from 1 to Num Blocks moving from the fastest + changing dimension to the slowest changing dimension. +

The field Encode Size indicates the size of this field. +

+
+ + +
+
+
+ +
+ + + + + + + + + + + + + + + + + +
+ Layout: Selection Info for H5S_SEL_ALL +
bytebytebytebyte
Version

Reserved (zero, + 8 bytes)

+
+ +
+
+
+ + + + + + + + + + + +
+ Fields: Selection Info for H5S_SEL_ALL +
Field NameDescription

Version

The version number for the H5S_SEL_ALL Selection Info; + the value is 1.

+
+ + +

VIII.B. Reference Encoding (Revised)

+

+
+ For the following reference type, + the Reference Header and Reference Block are stored together as the dataset's raw data: +

    +
  • Object Reference (H5R_OBJECT2) (without reference to an external file)
  • +
+

+ For the following reference types, + the Reference Header plus the Global Heap ID are stored + as the dataset's raw data in the file. + The global heap ID is used to locate the Reference Block stored in the global heap: +

    +
  • Object Reference (H5R_OBJECT2) (with reference to an external file)
  • +
  • Dataset Region Reference (H5R_DATASET_REGION2) (with/without reference to an external file)
  • +
  • Attribute Reference (H5R_ATTR) (with/without reference to an external file)
  • +
+
+
+ +
+ + + + + + + + + + + + + + + + +
+ Layout: Reference Header +
bytebytebytebyte
Reference TypeFlagsThis space inserted + only to align table nicely
+ +
+ +
+
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Reference Header +
Field NameDescription

Reference Type

+

There are 3 types of references: + + + + + + + + + + + + + + + + + + + + + +
ValueDescription
2H5R_OBJECT2: Object Reference +
3H5R_DATASET_REGION2: Dataset Region Reference +
4H5R_ATTR: Attribute Reference +
+ +

Flags

This field describes the reference: + + + + + + + + + + + + + + +
BitDescription
0If set, the reference is to an external file. +
1-7Reserved

+ +
+
+ +
+
+
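The storage rule from the start of this section can be combined with the Reference Type and Flags fields as in the sketch below. The enum and function names are hypothetical; only the numeric type values and the meaning of flag bit 0 come from the tables above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reference Type values as listed in the Reference Header fields. */
enum { REF_OBJECT2 = 2, REF_DATASET_REGION2 = 3, REF_ATTR = 4 };

/* Returns true if the Reference Block lives in the global heap (the raw
 * data then holds the Reference Header plus a Global Heap ID), and false
 * if header and block are stored together as the dataset's raw data. */
static bool
ref_block_in_global_heap(uint8_t ref_type, uint8_t flags)
{
    bool external = (flags & 0x01u) != 0; /* bit 0: reference to an external file */
    assert(ref_type >= REF_OBJECT2 && ref_type <= REF_ATTR);
    /* only an object reference without an external file is stored inline */
    return !(ref_type == REF_OBJECT2 && !external);
}
```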
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Layout: Reference Block +
bytebytebytebyte
Token SizeThis space inserted + only to align table nicely


Token + (variable size)


Length of External File NameThis space inserted + only to align table nicely


External File Name + (variable size)


Size of Dataspace Selection
Rank of Dataspace Selection


Dataspace Selection Information + (variable size)


Length of Attribute Name This space inserted + only to align table nicely


Attribute Name + (variable size)


+ +
+ +
+
+
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ Fields: Reference Block +
Field NameDescription

Token Size

This is the size of the token for the object. +

Token

+

+ This is the token for the object. +

+

Length of External File Name

This is the length for the external file name. +

This field exists if bit 0 of flags is set.

+

+

External File Name

This is the name of the external file being referenced.

+

+

This field exists if bit 0 of flags is set.

+

Dataspace Selection Information

See Dataspace Selection.

+

+

This field exists if the Reference Type is H5R_DATASET_REGION2.

+

Length of Attribute Name

This is the length of the attribute name. +

This field exists if the Reference Type is H5R_ATTR.

+

Attribute Name

This is the name of the attribute being referenced. +

This field exists if the Reference Type is H5R_ATTR.

+
+
+ +
+
+
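Which optional parts of a Reference Block are present follows from the Reference Type and Flags in the header. The sketch below expresses those conditions as a bit set; all names are hypothetical, and it only restates the "This field exists if" rules from the table above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flags naming the groups of fields in a Reference Block. */
enum {
    REFBLK_TOKEN     = 1u << 0, /* Token Size + Token (always present) */
    REFBLK_EXT_FILE  = 1u << 1, /* Length of External File Name + name */
    REFBLK_SELECTION = 1u << 2, /* Dataspace Selection Information */
    REFBLK_ATTR_NAME = 1u << 3  /* Length of Attribute Name + name */
};

/* Returns the set of field groups present for the given header values. */
static unsigned
ref_block_fields(uint8_t ref_type, uint8_t flags)
{
    unsigned present = REFBLK_TOKEN;
    assert(ref_type >= 2 && ref_type <= 4); /* the three revised reference types */
    if (flags & 0x01u) /* bit 0: reference to an external file */
        present |= REFBLK_EXT_FILE;
    if (ref_type == 3) /* H5R_DATASET_REGION2 */
        present |= REFBLK_SELECTION;
    if (ref_type == 4) /* H5R_ATTR */
        present |= REFBLK_ATTR_NAME;
    return present;
}
```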
+ + + +

VIII.C. Reference Encoding (Backward Compatibility)

+

+
+ The two references described below are maintained to preserve compatibility with previous versions of the library. +

+ For the following reference type, + the reference encoding is stored as the dataset's raw data in the file: +

    +
  • Object Reference (H5R_OBJECT1)
  • +
+

+ For the following reference type, + the Global Heap ID is stored as the dataset's raw data in the file. + The global heap ID is used to locate the reference encoding + stored in the global heap: +

    +
  • Dataset Region Reference (H5R_DATASET_REGION1)
  • +
+ +
+
+
+ + + + + + + + + + + + + + +
+ Layout: Reference for H5R_OBJECT1 +
bytebytebytebyte

Object AddressO

+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+
+ + + + + + + + + + + + +
+ Fields: Reference for H5R_OBJECT1 +
Field NameDescription

Object Address

+

Address of the object being referenced +

+
+ +
+
+
+ +
+ + + + + + + + + + + + + + + + + + +
+ Layout: Reference for H5R_DATASET_REGION1 +
bytebytebytebyte

Object AddressO



Dataspace Selection Information + (variable size)


+ + + + + +
  + (Items marked with an ‘O’ in the above table are + of the size specified in the Size + of Offsets field in the superblock.) +
+ +
+ +
+
+
+ + + + + + + + + + + + + + + + + +
+ Fields: Reference for H5R_DATASET_REGION1 +
Field NameDescription

Object Address

This is the address of the object being referenced. +

Dataspace Selection Information

This is the dataspace selection for the object being referenced. + See Dataspace Selection.

+

+
+
+ +
+
+
+ + + +
diff --git a/doxygen/examples/H5A_examples.c b/doxygen/examples/H5A_examples.c
new file mode 100644
index 0000000..f332efa
--- /dev/null
+++ b/doxygen/examples/H5A_examples.c
@@ -0,0 +1,145 @@
+/* -*- c-file-style: "stroustrup" -*- */
+
+#include "hdf5.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+
+int
+main(void)
+{
+    int ret_val = EXIT_SUCCESS;
+
+    //!
+    {
+        __label__ fail_acpl, fail_attr, fail_file;
+        hid_t file, acpl, fspace, attr;
+
+        unsigned mode = H5F_ACC_TRUNC;
+        char file_name[] = "f1.h5";
+        // attribute names can be arbitrary Unicode strings
+        char attr_name[] = "Χαρακτηριστικό";
+
+        if ((file = H5Fcreate(file_name, mode, H5P_DEFAULT, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_file;
+        }
+        if ((acpl = H5Pcreate(H5P_ATTRIBUTE_CREATE)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_acpl;
+        }
+        // use UTF-8 encoding for the attribute name
+        if (H5Pset_char_encoding(acpl, H5T_CSET_UTF8) < 0) {
+            ret_val = EXIT_FAILURE;
+            goto fail_fspace;
+        }
+        // create a scalar (singleton) attribute
+        if ((fspace = H5Screate(H5S_SCALAR)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_fspace;
+        }
+        // create an attribute on the root group
+        if ((attr = H5Acreate2(file, attr_name, H5T_STD_I32LE, fspace, acpl, H5P_DEFAULT)) ==
+            H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_attr;
+        }
+
+        H5Aclose(attr);
+fail_attr:
+        H5Sclose(fspace);
+fail_fspace:
+        H5Pclose(acpl);
+fail_acpl:
+        H5Fclose(file);
+fail_file:;
+    }
+    //!
+
+    //!
+    {
+        __label__ fail_attr, fail_file;
+        hid_t file, attr;
+
+        unsigned mode = H5F_ACC_RDONLY;
+        char file_name[] = "f1.h5";
+        char attr_name[] = "Χαρακτηριστικό";
+        int value;
+
+        if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_file;
+        }
+        if ((attr = H5Aopen(file, attr_name, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_attr;
+        }
+        // read the attribute value
+        if (H5Aread(attr, H5T_NATIVE_INT, &value) < 0)
+            ret_val = EXIT_FAILURE;
+
+        // do something w/ the attribute value
+
+        H5Aclose(attr);
+fail_attr:
+        H5Fclose(file);
+fail_file:;
+    }
+    //!
+
+    //!
+    {
+        __label__ fail_attr, fail_file;
+        hid_t file, attr;
+
+        unsigned mode = H5F_ACC_RDWR;
+        char file_name[] = "f1.h5";
+        char attr_name[] = "Χαρακτηριστικό";
+        int value = 1234;
+
+        if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_file;
+        }
+        if ((attr = H5Aopen(file, attr_name, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_attr;
+        }
+        // update the attribute value
+        if (H5Awrite(attr, H5T_NATIVE_INT, &value) < 0)
+            ret_val = EXIT_FAILURE;
+
+        H5Aclose(attr);
+fail_attr:
+        H5Fclose(file);
+fail_file:;
+    }
+    //!
+
+    //!
+    {
+        __label__ fail_attr, fail_file;
+        hid_t file;
+
+        unsigned mode = H5F_ACC_RDWR;
+        char file_name[] = "f1.h5";
+        char attr_name[] = "Χαρακτηριστικό";
+
+        if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) {
+            ret_val = EXIT_FAILURE;
+            goto fail_file;
+        }
+        // delete the attribute
+        if (H5Adelete(file, attr_name) < 0) {
+            ret_val = EXIT_FAILURE;
+            goto fail_attr;
+        }
+
+fail_attr:
+        H5Fclose(file);
+fail_file:;
+    }
+    //!
+ + return ret_val; +} diff --git a/doxygen/examples/H5D_examples.c b/doxygen/examples/H5D_examples.c new file mode 100644 index 0000000..aad057d --- /dev/null +++ b/doxygen/examples/H5D_examples.c @@ -0,0 +1,173 @@ +/* -*- c-file-style: "stroustrup" -*- */ + +#include "hdf5.h" + +#include +#include + +int +main(void) +{ + int ret_val = EXIT_SUCCESS; + + //! + { + __label__ fail_lcpl, fail_dset, fail_file; + hid_t file, lcpl, fspace, dset; + + unsigned mode = H5F_ACC_TRUNC; + char file_name[] = "d1.h5"; + // link names can be arbitrary Unicode strings + char dset_name[] = "σύνολο/δεδομένων"; + + if ((file = H5Fcreate(file_name, mode, H5P_DEFAULT, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + if ((lcpl = H5Pcreate(H5P_LINK_CREATE)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_lcpl; + } + // use UTF-8 encoding for link names + if (H5Pset_char_encoding(lcpl, H5T_CSET_UTF8) < 0) { + ret_val = EXIT_FAILURE; + goto fail_fspace; + } + // create intermediate groups as needed + if (H5Pset_create_intermediate_group(lcpl, 1) < 0) { + ret_val = EXIT_FAILURE; + goto fail_fspace; + } + // create a 1D dataspace + if ((fspace = H5Screate_simple(1, (hsize_t[]){10}, NULL)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_fspace; + } + // create a 32-bit integer dataset + if ((dset = H5Dcreate2(file, dset_name, H5T_STD_I32LE, fspace, lcpl, H5P_DEFAULT, H5P_DEFAULT)) == + H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_dset; + } + + H5Dclose(dset); +fail_dset: + H5Sclose(fspace); +fail_fspace: + H5Pclose(lcpl); +fail_lcpl: + H5Fclose(file); +fail_file:; + } + //! + + //! 
+ { + __label__ fail_dset, fail_file; + hid_t file, dset; + + unsigned mode = H5F_ACC_RDONLY; + char file_name[] = "d1.h5"; + // assume a priori knowledge of dataset name and size + char dset_name[] = "σύνολο/δεδομένων"; + int elts[10]; + + if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + if ((dset = H5Dopen2(file, dset_name, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_dset; + } + // read all dataset elements + if (H5Dread(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, elts) < 0) + ret_val = EXIT_FAILURE; + + // do something w/ the dataset elements + + H5Dclose(dset); +fail_dset: + H5Fclose(file); +fail_file:; + } + //! + + //! + { + __label__ fail_update, fail_fspace, fail_dset, fail_file; + hid_t file, dset, fspace; + + unsigned mode = H5F_ACC_RDWR; + char file_name[] = "d1.h5"; + char dset_name[] = "σύνολο/δεδομένων"; + int new_elts[6][2] = {{-1, 1}, {-2, 2}, {-3, 3}, {-4, 4}, {-5, 5}, {-6, 6}}; + + if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + if ((dset = H5Dopen2(file, dset_name, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_dset; + } + // get the dataset's dataspace + if ((fspace = H5Dget_space(dset)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_fspace; + } + // select the first 5 elements in odd positions + if (H5Sselect_hyperslab(fspace, H5S_SELECT_SET, (hsize_t[]){1}, (hsize_t[]){2}, (hsize_t[]){5}, + NULL) < 0) { + ret_val = EXIT_FAILURE; + goto fail_update; + } + + // (implicitly) select and write the first 5 elements of the second column of NEW_ELTS + if (H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, fspace, H5P_DEFAULT, new_elts) < 0) + ret_val = EXIT_FAILURE; + +fail_update: + H5Sclose(fspace); +fail_fspace: + H5Dclose(dset); +fail_dset: + H5Fclose(file); +fail_file:; + } + //! + + //! 
+ { + __label__ fail_delete, fail_file; + hid_t file; + + unsigned mode = H5F_ACC_RDWR; + char file_name[] = "d1.h5"; + char group_name[] = "σύνολο"; + char dset_name[] = "σύνολο/δεδομένων"; + + if ((file = H5Fopen(file_name, mode, H5P_DEFAULT)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + // delete (unlink) the dataset + if (H5Ldelete(file, dset_name, H5P_DEFAULT) < 0) { + ret_val = EXIT_FAILURE; + goto fail_delete; + } + // the previous call deletes (unlinks) only the dataset + if (H5Ldelete(file, group_name, H5P_DEFAULT) < 0) { + ret_val = EXIT_FAILURE; + goto fail_delete; + } + +fail_delete: + H5Fclose(file); +fail_file:; + } + + //! + + return ret_val; +} diff --git a/doxygen/examples/H5F_examples.c b/doxygen/examples/H5F_examples.c new file mode 100644 index 0000000..a7ce6fb --- /dev/null +++ b/doxygen/examples/H5F_examples.c @@ -0,0 +1,187 @@ +/* -*- c-file-style: "stroustrup" -*- */ + +#include "hdf5.h" + +#include +#include + +int +main(void) +{ + int ret_val = EXIT_SUCCESS; + + //! + { + __label__ fail_fapl, fail_fcpl, fail_file; + hid_t fcpl, fapl, file; + + if ((fcpl = H5Pcreate(H5P_FILE_CREATE)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_fcpl; + } + else { + // adjust the file creation properties + } + + if ((fapl = H5Pcreate(H5P_FILE_ACCESS)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_fapl; + } + else { + // adjust the file access properties + } + + unsigned mode = H5F_ACC_EXCL; + char name[] = "f1.h5"; + + if ((file = H5Fcreate(name, mode, fcpl, fapl)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + + // do something useful with FILE + + H5Fclose(file); +fail_file: + H5Pclose(fapl); +fail_fapl: + H5Pclose(fcpl); +fail_fcpl:; + } + //! + + //! 
+ { + __label__ fail_fapl, fail_file; + hid_t fapl, file; + + if ((fapl = H5Pcreate(H5P_FILE_ACCESS)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_fapl; + } + else { + // adjust the file access properties + } + + unsigned mode = H5F_ACC_RDWR; + char name[] = "f1.h5"; + + if ((file = H5Fopen(name, mode, fapl)) == H5I_INVALID_HID) { + ret_val = EXIT_FAILURE; + goto fail_file; + } + + // do something useful with FILE + + H5Fclose(file); +fail_file: + H5Pclose(fapl); +fail_fapl:; + } + //! + + //! + { + unsigned mode = H5F_ACC_TRUNC; + char name[] = "f11.h5"; + + hid_t file = H5Fcreate(name, mode, H5P_DEFAULT, H5P_DEFAULT); + if (file != H5I_INVALID_HID) + H5Fclose(file); + else + ret_val = EXIT_FAILURE; + } + //! + + //! + { + unsigned mode = H5F_ACC_RDONLY; + char name[] = "f11.h5"; + + hid_t file = H5Fopen(name, mode, H5P_DEFAULT); + if (file != H5I_INVALID_HID) + H5Fclose(file); + else + ret_val = EXIT_FAILURE; + } + //! + + //! + { + unsigned mode = H5F_ACC_RDWR; + char name[] = "f11.h5"; + + hid_t file = H5Fopen(name, mode, H5P_DEFAULT); + if (file != H5I_INVALID_HID) { + int step; + for (step = 0; step < 1000; ++step) { + + // do important work & flush every 20 steps + + if (step % 20 == 0) { + if (H5Fflush(file, H5F_SCOPE_LOCAL) < 0) { + perror("H5Fflush failed."); + ret_val = EXIT_FAILURE; + break; + } + } + } + + if (H5Fclose(file) < 0) + perror("H5Fclose failed."); + } + else + ret_val = EXIT_FAILURE; + } + //! + + //! + { + unsigned mode = H5F_ACC_RDWR; + char name[] = "f11.h5"; + + hid_t file = H5Fopen(name, mode, H5P_DEFAULT); + if (file != H5I_INVALID_HID) { + if (H5Fset_libver_bounds(file, H5F_LIBVER_EARLIEST, H5F_LIBVER_V18) >= 0) { + + // object creation will not exceed HDF5 version 1.8.x + } + else + perror("H5Fset_libver_bounds failed."); + + if (H5Fclose(file) < 0) + perror("H5Fclose failed."); + } + else + ret_val = EXIT_FAILURE; + } + //! + + //! 
+ { + hid_t file = H5Fopen("f11.h5", H5F_ACC_RDWR, H5P_DEFAULT); + if (file != H5I_INVALID_HID) { + hid_t group, child; + if ((group = H5Gcreate1(file, "mount_point", H5P_DEFAULT)) != H5I_INVALID_HID) { + if ((child = H5Fopen("f1.h5", H5F_ACC_RDONLY, H5P_DEFAULT)) != H5I_INVALID_HID) { + if (H5Fmount(group, ".", child, H5P_DEFAULT) >= 0) { + + // do something useful w/ the mounted file + } + else { + ret_val = EXIT_FAILURE; + perror("H5Fmount failed."); + } + H5Fclose(child); + } + H5Gclose(group); + } + H5Fclose(file); + } + else + ret_val = EXIT_FAILURE; + } + //! + + return ret_val; +} diff --git a/doxygen/examples/H5Pget_metadata_read_attempts.1.c b/doxygen/examples/H5Pget_metadata_read_attempts.1.c new file mode 100644 index 0000000..da325c0 --- /dev/null +++ b/doxygen/examples/H5Pget_metadata_read_attempts.1.c @@ -0,0 +1,22 @@ +/* Get a copy of file access property list */ +fapl = H5Pcreate(H5P_FILE_ACCESS); + +/* Retrieve the # of read attempts from the file access property list */ +H5Pget_metadata_read_attempts(fapl, &attempts); + +/* + * The value returned in "attempts" will be 1 (default for non-SWMR access). + */ + +/* Set the # of read attempts to 20 */ +H5Pset_metadata_read_attempts(fapl, 20); + +/* Retrieve the # of read attempts from the file access property list */ +H5Pget_metadata_read_attempts(fapl, &attempts); + +/* + * The value returned in "attempts" will be 20 as set. 
+ */ + +/* Close the property list */ +H5Pclose(fapl); diff --git a/doxygen/examples/H5Pget_metadata_read_attempts.2.c b/doxygen/examples/H5Pget_metadata_read_attempts.2.c new file mode 100644 index 0000000..2cd12db --- /dev/null +++ b/doxygen/examples/H5Pget_metadata_read_attempts.2.c @@ -0,0 +1,44 @@ +/* Open the file with SWMR access and default file access property list */ +fid = H5Fopen(FILE, (H5F_ACC_RDONLY | H5F_ACC_SWMR_READ), H5P_DEFAULT); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 100 (default for SWMR access). + */ + +/* Close the property list */ +H5Pclose(file_fapl); + +/* Close the file */ +H5Fclose(fid); + +/* Create a copy of file access property list */ +fapl = H5Pcreate(H5P_FILE_ACCESS); + +/* Set the # of read attempts */ +H5Pset_metadata_read_attempts(fapl, 20); + +/* Open the file with SWMR access and the non-default file access property list */ +fid = H5Fopen(FILE, (H5F_ACC_RDONLY | H5F_ACC_SWMR_READ), fapl); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 20. 
+ */ + +/* Close the property lists */ +H5Pclose(file_fapl); +H5Pclose(fapl); + +/* Close the file */ +H5Fclose(fid); diff --git a/doxygen/examples/H5Pget_metadata_read_attempts.3.c b/doxygen/examples/H5Pget_metadata_read_attempts.3.c new file mode 100644 index 0000000..4b5ea3a --- /dev/null +++ b/doxygen/examples/H5Pget_metadata_read_attempts.3.c @@ -0,0 +1,44 @@ +/* Open the file with non-SWMR access and default file access property list */ +fid = H5Fopen(FILE, H5F_ACC_RDONLY, H5P_DEFAULT); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 1 (default for non-SWMR access). + */ + +/* Close the property list */ +H5Pclose(file_fapl); + +/* Close the file */ +H5Fclose(fid); + +/* Create a copy of file access property list */ +fapl = H5Pcreate(H5P_FILE_ACCESS); + +/* Set the # of read attempts */ +H5Pset_metadata_read_attempts(fapl, 20); + +/* Open the file with non-SWMR access and the non-default file access property list */ +fid = H5Fopen(FILE, H5F_ACC_RDONLY, fapl); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 1 (default for non-SWMR access). 
+ */ + +/* Close the property lists */ +H5Pclose(file_fapl); +H5Pclose(fapl); + +/* Close the file */ +H5Fclose(fid); diff --git a/doxygen/examples/H5Pget_object_flush_cb.c b/doxygen/examples/H5Pget_object_flush_cb.c new file mode 100644 index 0000000..d18f3df --- /dev/null +++ b/doxygen/examples/H5Pget_object_flush_cb.c @@ -0,0 +1,41 @@ +hid_t fapl_id; +unsigned counter; +H5F_object_flush_t *ret_cb; +unsigned * ret_counter; + +/* Create a copy of the file access property list */ +fapl_id = H5Pcreate(H5P_FILE_ACCESS); + +/* Set up the object flush property values */ +/* flush_cb: callback function to invoke when an object flushes (see below) */ +/* counter: user data to pass along to the callback function */ +H5Pset_object_flush_cb(fapl_id, flush_cb, &counter); + +/* Open the file */ +file_id = H5Fopen(FILE, H5F_ACC_RDWR, H5P_DEFAULT); + +/* Get the file access property list for the file */ +fapl = H5Fget_access_plist(file_id); + +/* Retrieve the object flush property values for the file */ +H5Pget_object_flush_cb(fapl, &ret_cb, &ret_counter); +/* ret_cb will point to flush_cb() */ +/* ret_counter will point to counter */ + +/* +. +. +. +. +. +. +*/ + +/* The callback function for the object flush property */ +static herr_t +flush_cb(hid_t obj_id, void *_udata) +{ + unsigned *flush_ct = (unsigned *)_udata; + ++(*flush_ct); + return 0; +} diff --git a/doxygen/examples/H5Pset_metadata_read_attempts.c b/doxygen/examples/H5Pset_metadata_read_attempts.c new file mode 100644 index 0000000..7c2f65d --- /dev/null +++ b/doxygen/examples/H5Pset_metadata_read_attempts.c @@ -0,0 +1,59 @@ +//! 
[SWMR Access] +/* Create a copy of file access property list */ +fapl = H5Pcreate(H5P_FILE_ACCESS); + +/* Set the # of read attempts */ +H5Pset_metadata_read_attempts(fapl, 20); + +/* Open the file with SWMR access and the non-default file access property list */ +fid = H5Fopen(FILE, (H5F_ACC_RDONLY | H5F_ACC_SWMR_READ), fapl); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 20. + * The library will use 20 as the number of read attempts + * when reading checksummed metadata in the file + */ + +/* Close the property list */ +H5Pclose(fapl); +H5Pclose(file_fapl); + +/* Close the file */ +H5Fclose(fid); +//! [SWMR Access] + +//! [non-SWMR Access] +/* Create a copy of file access property list */ +fapl = H5Pcreate(H5P_FILE_ACCESS); + +/* Set the # of read attempts */ +H5Pset_metadata_read_attempts(fapl, 20); + +/* Open the file with SWMR access and the non-default file access property list */ +fid = H5Fopen(FILE, H5F_ACC_RDONLY, fapl); + +/* Get the file's file access roperty list */ +file_fapl = H5Fget_access_plist(fid); + +/* Retrieve the # of read attempts from the file's file access property list */ +H5Pget_metadata_read_attempts(file_fapl, &attempts); + +/* + * The value returned in "attempts" will be 1 (default for non-SWMR access). + * The library will use 1 as the number of read attempts + * when reading checksummed metadata in the file + */ + +/* Close the property lists */ +H5Pclose(fapl); +H5Pclose(file_fapl); + +/* Close the file */ +H5Fclose(fid); +//! 
[non-SWMR Access] diff --git a/doxygen/examples/H5Pset_object_flush_cb.c b/doxygen/examples/H5Pset_object_flush_cb.c new file mode 100644 index 0000000..1dfa90d --- /dev/null +++ b/doxygen/examples/H5Pset_object_flush_cb.c @@ -0,0 +1,41 @@ +hid_t file_id, fapl_id; +hid_t dataset_id, dapl_id; +unsigned counter; + +/* Create a copy of the file access property list */ +fapl_id = H5Pcreate(H5P_FILE_ACCESS); + +/* Set up the object flush property values */ +/* flush_cb: callback function to invoke when an object flushes (see below) */ +/* counter: user data to pass along to the callback function */ +H5Pset_object_flush_cb(fapl_id, flush_cb, &counter); + +/* Open the file */ +file_id = H5Fopen(FILE, H5F_ACC_RDWR, H5P_DEFAULT); + +/* Create a group */ +gid = H5Gcreate2(fid, “group”, H5P_DEFAULT, H5P_DEFAULT_H5P_DEFAULT); + +/* Open a dataset */ +dataset_id = H5Dopen2(file_id, DATASET, H5P_DEFAULT); + +/* The flush will invoke flush_cb() with counter */ +H5Dflush(dataset_id); +/* counter will be equal to 1 */ + +/* ... */ + +/* The flush will invoke flush_cb() with counter */ +H5Gflush(gid); +/* counter will be equal to 2 */ + +/* ... */ + +/* The callback function for object flush property */ +static herr_t +flush_cb(hid_t obj_id, void *_udata) +{ + unsigned *flush_ct = (unsigned *)_udata; + ++(*flush_ct); + return 0; +} diff --git a/doxygen/examples/ImageSpec.html b/doxygen/examples/ImageSpec.html new file mode 100644 index 0000000..1b700ff --- /dev/null +++ b/doxygen/examples/ImageSpec.html @@ -0,0 +1,1203 @@ + + + + + + Image Specification + +The HDF5 specification defines the standard objects and storage for the +standard HDF5 objects. (For information about the HDF5 library, model and +specification, see the HDF documentation.)  This document is an additional +specification do define a standard profile for how to store image data +in HDF5. Image data in HDF5 is stored as HDF5 datasets with standard attributes +to define the properties of the image. +

This specification is primarily concerned with two dimensional raster +data similar to HDF4 Raster Images.  Specifications for storing other +types of imagery will be covered in other documents. +

This specification defines: +

    +
  • +Standard storage and attributes for an Image dataset (Section +1)
  • + +
  • +Standard storage and attributes for Palettes (Section +2)
  • + +
  • +Standard for associating Palettes with Images. (Section +3)
  • +
+ +

+1. HDF5 Image Specification

+ +

+1.1 Overview

+Image data is stored as an HDF5 dataset with values of HDF5 class Integer +or Float.  A common example would be a two dimensional dataset, with +elements of class Integer, e.g., a two dimensional array of unsigned 8 +bit integers.  However, this specification does not limit the dimensions +or number type that may be used for an Image. +

The dataset for an image is distinguished from other datasets by giving +it an attribute "CLASS=IMAGE".  In addition, the Image dataset may +have an optional attribute "PALETTE" that is an array of object references +for zero or more palettes. The Image dataset may have additional attributes +to describe the image data, as defined in Section 1.2. +

A Palette is an HDF5 dataset which contains color map information. +A Palette dataset has an attribute "CLASS=PALETTE" and other attributes +indicating the type and size of the palette, as defined in Section +2.1. A Palette is an independent object, which can be shared +among several Image datasets. +

+1.2  Image Attributes

+The attributes for the Image are scalars unless otherwise noted.  +The length of String valued attributes should be at least the number of +characters. Optionally, String valued attributes may be stored in a String +longer than the minimum, in which case it must be zero terminated or null +padded.  "Required" attributes must always be used. "Optional" attributes +must be used when required. +
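The sizing rule for String valued attributes can be sketched as a small check. This is only an illustration under assumptions: `attr_string_matches` is a hypothetical helper, not part of the HDF5 API, and it operates on the raw bytes of an attribute that the caller has already read into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an HDF5 call): returns 1 if the SIZE raw bytes in
 * BUF hold the string EXPECTED according to the rule above, 0 otherwise. */
static int
attr_string_matches(const char *buf, size_t size, const char *expected)
{
    size_t len = strlen(expected);

    /* the stored string must hold at least the number of characters */
    if (size < len)
        return 0;
    if (memcmp(buf, expected, len) != 0)
        return 0;
    /* a longer string must be zero terminated or null padded */
    if (size > len && buf[len] != '\0')
        return 0;
    return 1;
}
```

With this check, a CLASS attribute stored in exactly 5 bytes and one stored in a longer, null padded buffer are both accepted as "IMAGE".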
  +

+Attributes

+ +
+
+Attribute name="CLASS" (Required)
+ +
+This attribute is type H5T_C_S1, with size 5.
+ +
+For all Images, the value of this attribute is "IMAGE".
+ +
+
+ +
+This attribute identifies this data set as intended to be interpreted as +an image that conforms to the specifications on this page.
+
+ +
+Attribute name="PALETTE"
+ +
+
+An Image dataset within an HDF5 file may optionally specify an array of +palettes with which it may be viewed. The dataset will have an attribute field called +"PALETTE" which contains a one-dimensional array of object reference +pointers (HDF5 datatype H5T_STD_REF_OBJ) which refer to palettes in the +file. The palette datasets must conform to the Palette specification in +section +2 below. The first palette in this array will be the default palette +with which the data may be viewed.
+
+ +
+
+
+ +
+Attribute name="IMAGE_SUBCLASS"
+ +
+If present, the value of this attribute indicates the type of Palette that +should be used with the Image.  This attribute is a scalar of type +H5T_C_S1, with size according to the string plus one.  The values +are:
+ +
+
+"IMAGE_GRAYSCALE" (length 15)
+ +
+A grayscale image
+ +
+"IMAGE_BITMAP" (length 12)
+ +
+A bit map image
+ +
+"IMAGE_TRUECOLOR" (length 15)
+ +
+A truecolor image
+ +
+"IMAGE_INDEXED" (length 13)
+ +
+An indexed image
+ +
+
+
+ +
+Attribute name="INTERLACE_MODE"
+ +
+For images with more than one component for each pixel, this optional attribute +specifies the layout of the data. The values are type H5T_C_S1 of length +15. See section 1.3 for information about the +storage layout for data.
+ +
+"INTERLACE_PIXEL" (default): the component values for a pixel are contiguous.
+ +
+"INTERLACE_PLANE": each component is stored as a plane.
+ +
+
+ +
+Attribute name="DISPLAY_ORIGIN"
+ +
+This optional attribute indicates the intended orientation of the data +on a two-dimensional raster display. The value indicates the corner +at which the pixel at (0, 0) should be displayed. The values are type H5T_C_S1 +of length 2. If DISPLAY_ORIGIN is not set, the orientation is undefined.
+ +
+"UL": (0,0) is at the upper left.
+ +
+"LL": (0,0) is at the lower left.
+ +
+"UR": (0,0) is at the upper right.
+ +
+"LR": (0,0) is at the lower right.
+
+ +
+Attribute name="IMAGE_WHITE_IS_ZERO"
+ +
+
+This attribute is of type H5T_NATIVE_UCHAR. 0 = false, 1 = true. +This is used for images with IMAGE_SUBCLASS="IMAGE_GRAYSCALE" or "IMAGE_BITMAP".
+
+ +
+
+Attribute name="IMAGE_MINMAXRANGE"
+ +
+If present, this attribute is an array of two numbers, of the same HDF5 +datatype as the data.  The first element is the minimum value of the +data, and the second is the maximum.  This is used for images with +IMAGE_SUBCLASS="IMAGE_GRAYSCALE", "IMAGE_BITMAP" or "IMAGE_INDEXED".
+
+ +
+Attribute name="IMAGE_BACKGROUNDINDEX"
+ +
+
+If set, this attribute indicates the index value that should be interpreted +as the "background color".  This attribute is HDF5 type H5T_NATIVE_UINT.
+
+ +
+Attribute name="IMAGE_TRANSPARENCY"
+ +
+
+If set, this attribute indicates the index value that should be interpreted +as the "transparent color". This attribute is HDF5 type H5T_NATIVE_UINT. +This attribute may not be used for IMAGE_SUBCLASS="IMAGE_TRUECOLOR".
+
+ +
+Attribute name="IMAGE_ASPECTRATIO"
+ +
+
+If set, this attribute indicates the aspect ratio.
+
+ +
+Attribute name="IMAGE_COLORMODEL"
+ +
+
+If set, this attribute indicates the color model of Palette that should +be used with the Image.  This attribute is of type H5T_C_S1, with +size 3, 4, or 5.  The value is one of the color models described in +the Palette specification in section 2.2 below.  +This attribute may be used only for IMAGE_SUBCLASS="IMAGE_TRUECOLOR" or +"IMAGE_INDEXED".
+
+ +
+Attribute name="IMAGE_GAMMACORRECTION"
+ +
+
+If set, this attribute gives the Gamma correction.  The attribute +is type H5T_NATIVE_FLOAT.  This attribute may be used only for IMAGE_SUBCLASS="IMAGE_TRUECOLOR" +or "IMAGE_INDEXED".
+
+Attribute name="IMAGE_VERSION" (Required) +
+
+This attribute is of type H5T_C_S1, with size corresponding to the length +of the version string.  This attribute identifies the version number +of this specification to which it conforms.  The current version number +is "1.2".
+ +
  +

  +
  +
  +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+Table 1. Attributes of an Image Dataset
+Attribute Name | (R = Required, O = Optional) | Type | String Size | Value
+CLASS | R | String | 5 | "IMAGE"
+PALETTE | O | Array Object References | | <references to Palette datasets> (1)
+IMAGE_SUBCLASS | O (2) | String | 15, 12, 15, 13 | "IMAGE_GRAYSCALE", "IMAGE_BITMAP", "IMAGE_TRUECOLOR", "IMAGE_INDEXED"
+INTERLACE_MODE | O (3,6) | String | 15 | The layout of components if more than one component per pixel.
+DISPLAY_ORIGIN | O | String | 2 | If set, indicates the intended location of the pixel (0,0).
+IMAGE_WHITE_IS_ZERO | O (3,4) | Unsigned Integer | | 0 = false, 1 = true
+IMAGE_MINMAXRANGE | O (3,5) | Array [2] <same datatype as data values> | | The (<minimum>, <maximum>) value of the data.
+IMAGE_BACKGROUNDINDEX | O (3) | Unsigned Integer | | The index of the background color.
+IMAGE_TRANSPARENCY | O (3,5) | Unsigned Integer | | The index of the transparent color.
+IMAGE_ASPECTRATIO | O (3,4) | Unsigned Integer | | The aspect ratio.
+IMAGE_COLORMODEL | O (3,6) | String | 3, 4, or 5 | The color model, as defined below in the Palette specification for attribute PAL_COLORMODEL.
+IMAGE_GAMMACORRECTION | O (3,6) | Float | | The gamma correction.
+IMAGE_VERSION | R | String | 3 | "1.2"
+ +
1.  The first element of the array is the default +Palette. +
2.  This attribute is required for images +that use one of the standard color map types listed. +
3. This attribute is required if set for the source +image, in the case that the image is translated from another file into +HDF5. +
4.  This applies to:  IMAGE_SUBCLASS="IMAGE_GRAYSCALE" +or "IMAGE_BITMAP". +
5.  This applies to:  IMAGE_SUBCLASS="IMAGE_GRAYSCALE", +"IMAGE_BITMAP", or "IMAGE_INDEXED". +
6.  This applies to: IMAGE_SUBCLASS="IMAGE_TRUECOLOR", +or "IMAGE_INDEXED".
+
+Table 2 summarizes the standard attributes for Image datasets using +the common sub-classes. R means that the attribute listed in the leftmost +column is Required for the image subclass in the first row, O means that +the attribute is Optional for that subclass, and N that the attribute cannot +be applied to that subclass. The first two rows show the only +attributes that are required +for all subclasses. +
  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+Table 2a. Applicability of Attributes to IMAGE sub-classes
+IMAGE_SUBCLASS (1) | IMAGE_GRAYSCALE | IMAGE_BITMAP
+CLASS | R | R
+IMAGE_VERSION | R | R
+INTERLACE_MODE | N | N
+IMAGE_WHITE_IS_ZERO | R | R
+IMAGE_MINMAXRANGE | O | O
+IMAGE_BACKGROUNDINDEX | O | O
+IMAGE_TRANSPARENCY | O | O
+IMAGE_ASPECTRATIO | O | O
+IMAGE_COLORMODEL | N | N
+IMAGE_GAMMACORRECTION | N | N
+PALETTE | O | O
+DISPLAY_ORIGIN | O | O
+ +
 
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+Table 2b. Applicability of Attributes to IMAGE sub-classes
+IMAGE_SUBCLASS | IMAGE_TRUECOLOR | IMAGE_INDEXED
+CLASS | R | R
+IMAGE_VERSION | R | R
+INTERLACE_MODE | R | N
+IMAGE_WHITE_IS_ZERO | N | N
+IMAGE_MINMAXRANGE | N | O
+IMAGE_BACKGROUNDINDEX | N | O
+IMAGE_TRANSPARENCY | N | O
+IMAGE_ASPECTRATIO | O | O
+IMAGE_COLORMODEL | O | O
+IMAGE_GAMMACORRECTION | O | O
+PALETTE | O | O
+DISPLAY_ORIGIN | O | O
+ +

+1.3 Storage Layout and Properties for Images

+In the case of an image with more than one component per pixel (e.g., Red, +Green, and Blue), the data may be arranged in one of two ways.  Following +HDF4 terminology, the data may be interlaced by pixel or by plane, which +should be indicated by the INTERLACE_MODE  attribute.  In both +cases, the dataset will have a dataspace with three dimensions, height, +width, and components.  The interlace modes specify different orders +for the dimensions. +
  + + + + + + + + + + + + + + + + + + + + +
+Table 3. Storage of multiple component image data.
+Interlace Mode | Dimensions in the Dataspace
+INTERLACE_PIXEL | [height][width][pixel components]
+INTERLACE_PLANE | [pixel components][height][width]
+ +

For example, consider a 5 (rows) by 10 (column) image, with Red, Green, +and Blue components.  Each component is an unsigned byte. In HDF5, +the datatype would be declared as an unsigned 8 bit integer.  For +pixel interlace, the dataspace would be a three dimensional array, with +dimensions: [10][5][3].  For plane interleave, the dataspace would +be three dimensions: [3][10][5]. +
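The two dimension orders in Table 3 translate into different offsets within a contiguous, row-major buffer. The following is a minimal sketch; the function names `offset_pixel` and `offset_plane` are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* INTERLACE_PIXEL: buffer laid out as [height][width][components].
 * Returns the element offset of component C of the pixel at (ROW, COL). */
static size_t
offset_pixel(size_t row, size_t col, size_t c, size_t width, size_t ncomp)
{
    return (row * width + col) * ncomp + c;
}

/* INTERLACE_PLANE: buffer laid out as [components][height][width].
 * Returns the element offset of component C of the pixel at (ROW, COL). */
static size_t
offset_plane(size_t row, size_t col, size_t c, size_t height, size_t width)
{
    return (c * height + row) * width + col;
}
```

With pixel interlace the components of one pixel are adjacent in memory, whereas with plane interlace all values of one component form a contiguous height by width plane.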

In the case of images with only one component, the dataspace may be +either a two dimensional array, or a three dimensional array with the third +dimension of size 1.  For example, a 5 by 10 image with 8 bit color +indexes would be an HDF5 dataset with type unsigned 8 bit integer.  +The dataspace could be either a two dimensional array, with dimensions +[10][5], or three dimensions, with dimensions either [10][5][1] or [1][10][5]. +

Image datasets may be stored with any chunking or compression properties +supported by HDF5. +

A note concerning compatibility with the HDF4 GR interface: An Image +dataset is stored as an HDF5 dataset. It is important to note that +the order of the dimensions is the same as for any other HDF5 dataset. +For a two dimensional image that is to be stored as a series of horizontal +scan lines, with the scan lines contiguous (i.e., the fastest changing +dimension is 'width'), the image will have a dataspace with dim[0] = +height and dim[1] = width. This is completely consistent +with all other HDF5 datasets. +

Users familiar with HDF4 should be cautioned that this is not the +same as HDF4, and specifically is not consistent with what the HDF4 +GR interface does. +
  +

+2.  HDF5 Palette Specification

+ +

+2.1 Overview

+A palette is the means by which color is applied to an image and is also +referred to as a color lookup table. It is a table in which every row contains +the numerical representation of a particular color. In the example of an +8 bit standard RGB color model palette, this numerical representation of +a color is presented as a triplet specifying the intensity of red, green, +and blue components that make up each color. +
+

+ +

In this example, the color component numeric type is an 8 bit unsigned +integer. While this is most common and recommended for general use, other +component color numeric datatypes, such as a 16 bit unsigned integer, +may be used. This type is specified as the type attribute of the palette +dataset. (see H5Tget_type(), H5Tset_type()) +

The minimum and maximum values of the component color numeric are specified +as an attribute of the palette dataset. See below (attribute PAL_MINMAXNUMERIC). +If this attribute does not exist, it is assumed that the range of values +will fill the space of the color numeric type. For example, with an 8 bit unsigned +integer, the valid range would be 0 to 255 for each color component. +
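The lookup described above can be sketched for the 8 bit RGB case as follows. This is an illustration under assumptions: the palette is taken to be already read into memory as rows of three unsigned 8 bit values, and the names `rgb8_t` and `palette_lookup` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One 8 bit RGB color; each component ranges over 0 to 255. */
typedef struct {
    uint8_t r, g, b;
} rgb8_t;

/* Look up the color for INDEX in a palette of N_ROWS rows of RGB triplets.
 * Returns 0 on success, -1 if INDEX is not a row of the palette. */
static int
palette_lookup(const uint8_t (*palette)[3], size_t n_rows, size_t index, rgb8_t *out)
{
    if (index >= n_rows)
        return -1;
    out->r = palette[index][0];
    out->g = palette[index][1];
    out->b = palette[index][2];
    return 0;
}
```

Each value of an indexed image selects one row of the palette, and the row supplies the red, green, and blue intensities.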

The HDF5 palette specification additionally allows for color models +beyond RGB. YUV, HSV, CMY, CMYK, YCbCr color models are supported, and +may be specified as a color model attribute of the palette dataset. (see +"Palette Attributes" for details). +

In HDF 4 and earlier, palettes were limited to 256 colors. The HDF5 +palette specification allows for palettes of varying length. The length +is specified as the number of rows of the palette dataset. +
  +
  + + + + +
Important Note: The specification of the Indexed +Palette will change substantially in the next version. The Palette +described here is deprecated and is not supported.
+ +
  + + + + +
Deprecated +

In a standard palette, the color entries are indexed directly. HDF5 +supports the notion of a range index table. Such a table defines an ascending +ordered list of ranges that map dataset values to the palette. If a range +index table exists for the palette, the PAL_TYPE attribute will be set +to "RANGEINDEX", and the PAL_RANGEINDEX attribute will contain an object +reference to a range index table array. If not, the PAL_TYPE attribute +either does not exist, or will be set to "STANDARD". +

The range index table array consists of a one dimensional array whose +length is one less than that of the palette dataset. Ideally, the range index would +be of the same type as the dataset it refers to; however, this is not a +requirement. +

Example 2: A range index array of type floating point +

+

+ +

The range index array attribute defines the "to" of the range. +Notice that the range index array attribute has one entry fewer than +the palette. The first entry of 0.1259 specifies that all values below +and up to 0.1259 inclusive will map to the first palette entry. The second +entry signifies that all values greater than 0.1259 up to 0.3278 inclusive +will map to the second palette entry, etc. All values greater than the last +range index array attribute (100000) map to the last entry in the palette.
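The mapping from a data value to a palette entry through a range index array can be sketched as a linear search. This is a minimal illustration, assuming a floating point range index already read into memory and a palette with at least one row; `range_index_lookup` is a hypothetical name.

```c
#include <stddef.h>

/* RANGE holds N_COLORS - 1 ascending, inclusive upper bounds, where N_COLORS
 * (assumed >= 1) is the number of palette rows. Returns the palette row that
 * VALUE maps to. */
static size_t
range_index_lookup(const double *range, size_t n_colors, double value)
{
    size_t i;

    for (i = 0; i + 1 < n_colors; ++i) {
        /* each entry is the inclusive "to" of its range */
        if (value <= range[i])
            return i;
    }
    /* values above the last bound map to the last palette entry */
    return n_colors - 1;
}
```

For a four row palette with the illustrative bounds {0.1259, 0.3278, 100000}, the value 0.1259 maps to the first entry, 0.2 to the second, and anything above 100000 to the last. Because the bounds are sorted, a binary search would serve equally well for long palettes.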

+ +

+2.2. Palette Attributes

+A palette exists in an HDF file as an independent data set with accompanying +attributes.  The Palette attributes are scalars except where noted +otherwise.  String values should have size the length of the string +value plus one.  "Required" attributes must be used.  "Optional" +attributes must be used when required. +

These attributes are defined as follows: +

+
+Attribute name="CLASS" (Required)
+ +
+This attribute is of type H5T_C_S1, with size 7.
+ +
+For all palettes, the value of this attribute is "PALETTE". This attribute +identifies this palette data set as a palette that conforms to the specifications +on this page.
+ +
+Attribute name="PAL_COLORMODEL" (Required)
+ +
+This attribute is of type H5T_C_S1, with size 3, 4, or 5.
+ +
+Possible values for this are "RGB", "YUV", "CMY", "CMYK", "YCbCr", "HSV".
+ +
+This defines the color model that the entries in the palette data set represent.
+ +
+
+"RGB"
+ +
+Each color index contains a triplet where the first value defines the red component, the second defines the green component, and the third the blue component.
+ +
+"CMY"
+ +
+Each color index contains a triplet where the first value defines the cyan component, the second defines the magenta component, and the third the yellow component.
+ +
+"CMYK"
+ +
+Each color index contains a quadruplet where the first value defines the cyan component, the second defines the magenta component, the third the yellow component, and the fourth the black component.
+ +
+"YCbCr"
+ +
+Class Y encoding model. Each color index contains a triplet where the first value defines the luminance, the second defines the Cb chrominance, and the third the Cr chrominance.
+ +
+"YUV"
+ +
+Composite encoding color model. Each color index contains a triplet where the first value defines the luminance component, the second defines the chrominance component, and the third the value component.
+ +
+"HSV"
+ +
+Each color index contains a triplet where the first value defines the hue component, the second defines the saturation component, and the third the value component. The hue component defines the hue spectrum, with a low value representing magenta/red and progressing to a high value representing blue/magenta, passing through yellow, green, and cyan. A low value for the saturation component means less color saturation than a high value. A low value for the value component will be darker than a high value.
+ +
+
+
+ +
+Attribute name="PAL_TYPE" (Required)
+ +
+This attribute is of type H5T_C_S1, with size 9 or 10.
+ +
+The currently supported values for this attribute are "STANDARD8" or "RANGEINDEX".
+ +
+A PAL_TYPE of "STANDARD8" defines a palette dataset such that the first +entry defines index 0, the second entry defines index 1, etc. up until +the length of the palette - 1. This assumes an image dataset with direct +indexes into the palette.
+
+ +
  + + + + +
Deprecated +

If the PAL_TYPE is set to "RANGEINDEX", there will be an additional attribute named "PAL_RANGEINDEX" (see Example 2 for more details).

+ + + + + +
+
+Attribute name="PAL_RANGEINDEX"   (Deprecated)
+ +
+
+The PAL_RANGEINDEX attribute contains an HDF object reference (HDF5 +datatype H5T_STD_REF_OBJ) pointer which specifies a range index array in +the file to be used for color lookups for the palette.  (Only for +PAL_TYPE="RANGEINDEX")
+
+
+ +
+Attribute name="PAL_MINMAXNUMERIC"
+ +
+
+If present, this attribute is an array of two numbers, of the same HDF5 +datatype as the palette elements or color numerics.
+ +
They specify the minimum and maximum values of the color numeric components. +For example, if the palette was an RGB of type Float, the color numeric +range for Red, Green, and Blue could be set to be between 0.0 and 1.0. +The intensity of the color guns would then be scaled accordingly to be +between this minimum and maximum attribute.
+Attribute name="PAL_VERSION"  (Required) +
This attribute is of type H5T_C_S1, with size corresponding to the +length of the version string.  This attribute identifies the version +number of this specification to which it conforms.  The current version +is "1.2".
+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 4. Attributes of a Palette Dataset

Attribute Name | (R = Required, O = Optional) | Type | String Size | Value
CLASS | R | String | 7 | "PALETTE"
PAL_COLORMODEL | R | String | 3, 4, or 5 | Color Model: "RGB", "YUV", "CMY", "CMYK", "YCbCr", or "HSV"
PAL_TYPE | R | String | 9 or 10 | "STANDARD8" or "RANGEINDEX" (Deprecated)
RANGE_INDEX (Deprecated) | see note 1 | Object Reference | | <Object Reference to Dataset of range index values>
PAL_MINMAXNUMERIC | O | Array[2] of <same datatype as palette> | | The first value is the <Minimum value for color values>, the second value is the <Maximum value for color values> (see note 2)
PAL_VERSION | R | String | 4 | "1.2"
+ +
  + + + + +
1.  The RANGE_INDEX attribute is required if the PAL_TYPE is "RANGEINDEX".  Otherwise, the RANGE_INDEX attribute should be omitted. (Range index is deprecated.)
+2.  The minimum and maximum are optional.  If not set, the range is assumed to be the maximum range of the number type.  If one of these values is set, then both should be set.  The value of the minimum must be less than or equal to the value of the maximum.
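The scaling implied by PAL_MINMAXNUMERIC can be sketched as follows. This is a minimal illustration under stated assumptions: `scale_component` is a hypothetical helper, and the 8-bit output range is an assumption about the display, not something the specification prescribes.

```c
#include <assert.h>

/* Hypothetical helper: scale one color component from the range
 * [min, max] given by PAL_MINMAXNUMERIC to an 8-bit display intensity.
 * The 0..255 target range is an assumption made for this sketch. */
unsigned char scale_component(double value, double min, double max)
{
    double t;

    assert(min <= max); /* the specification requires minimum <= maximum */

    if (max == min)
        return 0; /* degenerate range: avoid dividing by zero */

    t = (value - min) / (max - min);
    if (t < 0.0)
        t = 0.0; /* clamp values below the minimum */
    if (t > 1.0)
        t = 1.0; /* clamp values above the maximum */

    return (unsigned char)(t * 255.0 + 0.5); /* round to nearest */
}
```

For a floating-point RGB palette with a range of 0.0 to 1.0, each component of each entry would be passed through a function of this kind before being handed to the display.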
+
+Table 5 summarizes the uses of the standard attributes for a palette dataset. R means that the attribute listed in the leftmost column is Required for the palette type in the first row, O means that the attribute is Optional for that type, and N means that the attribute cannot be applied to that type. The first four rows show the attributes that are always required for the two palette types. +
  +
  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 5. Applicability of Attributes

PAL_TYPE | STANDARD8 | RANGEINDEX
CLASS | R | R
PAL_VERSION | R | R
PAL_COLORMODEL | R | R
RANGE_INDEX | N | R
PAL_MINMAXNUMERIC | O | O
+ +

+2.3. Storage Layout for Palettes

+The values of the Palette are stored as a dataset.  The datatype can +be any HDF 5 atomic numeric type.  The dataset will have dimensions +(nentries  by  ncomponents), where 'nentries' +is the number of colors (usually 256) and 'ncomponents' is the +number of values per color (3 for RGB, 4 for CMYK, etc.) +
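To make the (nentries by ncomponents) layout concrete, here is a small sketch of how one component could be addressed once the palette dataset has been read into memory. Both helpers are hypothetical, and the sketch assumes an 8-bit palette held in a contiguous, row-major C array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of values per color for a PAL_COLORMODEL
 * string. Only "CMYK" is described as a quadruplet; the other models in
 * this specification are triplets. */
size_t palette_ncomponents(const char *color_model)
{
    assert(color_model != NULL);
    return strcmp(color_model, "CMYK") == 0 ? 4 : 3;
}

/* Hypothetical helper: read one component of one palette entry from a
 * buffer laid out as nentries rows of ncomponents values each. */
unsigned char palette_component(const unsigned char *palette, size_t ncomponents,
                                size_t entry, size_t component)
{
    assert(palette != NULL);
    assert(component < ncomponents);
    return palette[entry * ncomponents + component];
}
```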
  +

+3.  Consistency and Correlation of Image and Palette +Attributes

+The objects in this specification are an extension to the base HDF5 specification +and library.  They are accessible with the standard HDF5 library, +but the semantics of the objects are not enforced by the base library.  +For example, it is perfectly possible to add an attribute called IMAGE +to any dataset, or to include an object reference to any +HDF5 dataset in a PALETTE attribute.  This would be a valid +HDF5 file, but not conformant to this specification.  The rules defined +in this specification must be implemented with appropriate software, and +applications must use conforming software to assure correctness. +

The Image and Palette specifications include several redundant standard +attributes, such as the IMAGE_COLORMODEL and the PAL_COLORMODEL.  +These attributes are informative not normative, in that it is acceptable +to attach a Palette to an Image dataset even if their attributes do not +match.  Software is not required to enforce consistency, and files +may contain mismatched associations of Images and Palettes.  In all +cases, it is up to applications to determine what kinds of images and color +models can be supported. +

For example, an Image that was created from a file with an "RGB" color model may have a "YUV" Palette in its PALETTE attribute array.  This would be a legal HDF5 file that also conforms to this specification, although it may or may not be correct for a given application.

+ + + diff --git a/doxygen/examples/PaletteExample1.gif b/doxygen/examples/PaletteExample1.gif new file mode 100644 index 0000000..8694d9d Binary files /dev/null and b/doxygen/examples/PaletteExample1.gif differ diff --git a/doxygen/examples/Palettes.fm.anc.gif b/doxygen/examples/Palettes.fm.anc.gif new file mode 100644 index 0000000..d344c03 Binary files /dev/null and b/doxygen/examples/Palettes.fm.anc.gif differ diff --git a/doxygen/examples/TableSpec.html b/doxygen/examples/TableSpec.html new file mode 100644 index 0000000..474176e --- /dev/null +++ b/doxygen/examples/TableSpec.html @@ -0,0 +1,193 @@ + + + HDF5 Table Specification + + +The HDF5 specification defines the standard objects and storage for the +standard HDF5 objects. (For information about the HDF5 library, model and +specification, see the HDF documentation.)  This document is an additional +specification to define a standard profile for how to store tables in HDF5. +Table data in HDF5 is stored as HDF5 datasets with standard attributes to define +the properties of the tables. + +

+1. Overview

+A generic table is a sequence of records; each record has a name and a type. Table data is stored as an HDF5 one dimensional compound dataset.  A table is defined as a collection of records whose values are stored in fixed-length fields. All records have the same structure and all values in each field have the same data type. +

The dataset for a table is distinguished from other datasets by giving +it an attribute "CLASS=TABLE".   +Optional attributes allow the storage of a title for the Table and for +each column, and a fill value for each column. +

+2.  Table Attributes

+The attributes for the Table are strings. They are written with the H5LTset_attribute_string +Lite API function.  "Required" attributes must always be used. "Optional" attributes +must be used when required. +
  +

+Attributes

+ +
+
+Attribute name="CLASS" (Required)
+ +
+This attribute is type H5T_C_S1, with size 5.
+ +
+For all Tables, the value of this attribute is "TABLE".
+ +
+This attribute identifies this data set as intended to be interpreted as a Table that conforms to the specifications on this page.
+
+ +
+Attribute name="VERSION" (Required) + +
+This attribute is of type H5T_C_S1, with size corresponding to the length +of the version string.  This attribute identifies the version number +of this specification to which it conforms.  The current version number +is "0.2".
+ +
+ +
+
+Attribute name="TITLE" (Optional)
+ +
+The TITLE is an optional String that is to be used as the +informative title of the whole table. +The TITLE is set with the parameter table_title of the function + H5TBmake_table
+
+ +
+
+Attribute name="FIELD_(n)_NAME" (Required)
+ +
+The FIELD_(n)_NAME is a required String that is to be used as the informative title of column n of the table. For each of the fields, the word FIELD_ is concatenated with the zero based field (n) index together with the name of the field.
+ +
+
+
+Attribute name="FIELD_(n)_FILL" (Optional)
+ +
+The FIELD_(n)_FILL is an optional String that is the fill value for +column n of the table. +For each of the fields the word FIELD_ is concatenated with + the zero based field (n) index together with the fill value, if present. +This value is written only when a fill value is defined for the table.
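The attribute naming scheme for the per-column attributes can be sketched as below. The helper `table_field_attr_name` is hypothetical and only builds the attribute name string; it does not call the HDF5 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "FIELD_<n>_<suffix>" for a zero based field
 * index, e.g. "FIELD_0_NAME" or "FIELD_2_FILL".
 * Returns 0 on success, -1 if the buffer is too small. */
int table_field_attr_name(char *buf, size_t buf_size, unsigned field_index,
                          const char *suffix)
{
    int n;

    assert(buf != NULL && suffix != NULL && strlen(suffix) > 0);

    n = snprintf(buf, buf_size, "FIELD_%u_%s", field_index, suffix);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}
```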
+ +
+ +
+ +
  +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Table 1. Attributes of a Table Dataset

Attribute Name | (R = Required, O = Optional) | Type | String Size | Value
CLASS | R | String | 5 | "TABLE"
VERSION | R | String | 3 | "0.2"
TITLE | O | String | |
FIELD_(n)_NAME | R | String | |
FIELD_(n)_FILL | O* | String | |
+
+ +
+

+

+  +
+* The attribute FIELD_(n)_FILL is written to the table if a fill value is +specified on the creation of the Table. Otherwise, it is not.

The following section of code shows the calls necessary to create a table. + +

/* Create a new HDF5 file using default properties. */
+ file_id = H5Fcreate( "my_table.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT );

+ +

/* Call the make table function */
+
status = H5TBmake_table( "Table Title", file_id, "Table1", NFIELDS, NRECORDS, dst_size, 
+ field_names, dst_offset, field_type, 
+ chunk_size, fill_data, compress, p_data );

+ +

/* Close the file. */
+ status = H5Fclose( file_id );
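The calls above rely on `dst_size`, `dst_offset`, and `field_names` having been prepared beforehand. A minimal sketch of that preparation, using only the C standard library, might look as follows. The `Particle` record, its fields, and the `layout_is_consistent` check are hypothetical and chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NFIELDS 3

/* Hypothetical record type; the field names and types are illustrative. */
typedef struct {
    char   name[16];
    int    latitude;
    double pressure;
} Particle;

/* Size of one record in memory (the dst_size argument). */
static const size_t dst_size = sizeof(Particle);

/* Byte offset of each field inside the record (the dst_offset argument). */
static const size_t dst_offset[NFIELDS] = {
    offsetof(Particle, name),
    offsetof(Particle, latitude),
    offsetof(Particle, pressure)
};

/* Column names, in the same order as dst_offset. */
static const char *const field_names[NFIELDS] = {"Name", "Latitude", "Pressure"};

/* Returns 1 if every field is named, the offsets strictly increase, and
 * every offset falls inside the record; returns 0 otherwise. */
int layout_is_consistent(void)
{
    size_t i;

    assert(dst_size > 0);

    for (i = 0; i < NFIELDS; i++) {
        if (field_names[i] == NULL || dst_offset[i] >= dst_size)
            return 0;
        if (i > 0 && dst_offset[i] <= dst_offset[i - 1])
            return 0;
    }
    return 1;
}
```

Using `offsetof` rather than hand-computed byte counts keeps the offsets correct when the compiler inserts padding between fields.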

+ + diff --git a/doxygen/examples/ThreadSafeLibrary.html b/doxygen/examples/ThreadSafeLibrary.html new file mode 100644 index 0000000..8daf386 --- /dev/null +++ b/doxygen/examples/ThreadSafeLibrary.html @@ -0,0 +1,787 @@ + + + + Thread Safe Library + + +

1. Library header files and conditional compilation

+ +

+The following code is placed at the beginning of H5private.h: +

+ +
+
+  #ifdef H5_HAVE_THREADSAFE
+  #include <pthread.h>
+  #endif
+  
+
+ +

+H5_HAVE_THREADSAFE is defined when the HDF-5 library is compiled with the --enable-threadsafe configuration option. In general, code for the non-threadsafe version of the HDF-5 library is placed within the #else part of the conditional compilation. The exceptions to this rule are the changes to the FUNC_ENTER (in H5private.h), HRETURN and HRETURN_ERROR (in H5Eprivate.h) macros (see sections 3.1 and 3.2). +

+ + +

2. Global variables/structures

+ +

2.1 Global library initialization variable

+ +

+In the threadsafe implementation, the global library initialization +variable H5_libinit_g is changed to a global structure +consisting of the variable with its associated lock (locks are explained +in section 4.1): +

+ +
+
+    hbool_t  H5_libinit_g = FALSE;
+  
+
+ +

+becomes +

+ +
+
+    H5_api_t H5_g;
+  
+
+ +

+where H5_api_t is +

+ +
+
+    typedef struct H5_api_struct {
+      H5_mutex_t init_lock;           /* API entrance mutex */
+      hbool_t H5_libinit_g;
+    } H5_api_t;
+  
+
+ +

+All former references to H5_libinit_g in the library are now +made using the macro H5_INIT_GLOBAL. If the threadsafe +library is to be used, the macro is set to H5_g.H5_libinit_g +instead. +

+ +

2.2 Global serialization variable

+ +

+A new global boolean variable H5_allow_concurrent_g is used to determine whether multiple threads are allowed to execute an API call simultaneously. It is set to FALSE. +

+ +

+All APIs that are allowed to do so have their own local variable that +shadows the global variable and is set to TRUE. In phase 1, +no such APIs exist. +

+ +

+It is defined in H5.c as follows: +

+ +
+
+    hbool_t H5_allow_concurrent_g = FALSE;
+  
+
+ +

2.3 Global thread initialization variable

+ +

+The global variable H5_first_init_g of type +pthread_once_t is used to allow only the first thread in the +application process to call an initialization function using +pthread_once. All subsequent calls to +pthread_once by any thread are disregarded. +

+ +

+The call sets up the mutex in the global structure H5_g (see section 2.1) via an initialization function H5_first_thread_init. The first thread initialization function is described in section 4.2. +

+ +

+H5_first_init_g is defined in H5.c as follows: +

+ +
+
+    pthread_once_t H5_first_init_g = PTHREAD_ONCE_INIT;
+  
+
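The once-only behavior that this section relies on can be tried in isolation. The sketch below is not library code: `first_thread_init`, `worker`, and `run_once_demo` are hypothetical names, and the counter exists only to show that the initialization function runs a single time regardless of how many threads call `pthread_once`.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

static pthread_once_t first_init = PTHREAD_ONCE_INIT;
static int init_calls = 0; /* written only inside the once-only function */

/* Stand-in for the first thread initialization function. */
static void first_thread_init(void)
{
    init_calls++;
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_once(&first_init, first_thread_init);
    return NULL;
}

/* Start up to nthreads threads that all request initialization, then
 * return how many times the initialization function actually ran. */
int run_once_demo(int nthreads)
{
    pthread_t tid[MAX_THREADS];
    int i, started = 0;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[started], NULL, worker, NULL) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    /* A later call from this thread is disregarded as well. */
    pthread_once(&first_init, first_thread_init);
    return init_calls;
}
```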
+ +

2.4 Global key for per-thread error stacks

+ +

+A global pthread-managed key H5_errstk_key_g is used to +allow pthreads to maintain a separate error stack (of type +H5E_t) for each thread. This is defined in H5.c +as: +

+ +
+
+    pthread_key_t H5_errstk_key_g;
+  
+
+ +

+Error stack management is described in section 4.3. +

+ +

2.5 Global structure and key for thread cancellation prevention

+ +

+We need to preserve the thread cancellation status of each thread +individually by using a key H5_cancel_key_g. The status is +preserved using a structure (of type H5_cancel_t) which +maintains the cancellability state of the thread before it entered the +library and a count (which works very much like the recursive lock +counter) which keeps track of the number of API calls the thread makes +within the library. +

+ +

+The structure is defined in H5private.h as: +

+ +
+
+    /* cancelability structure */
+    typedef struct H5_cancel_struct {
+      int previous_state;
+      unsigned int cancel_count;
+    } H5_cancel_t;
+  
+
+ +

+Thread cancellation is described in section 4.4. +

+ + +

3. Changes to Macro expansions

+ +

3.1 Changes to FUNC_ENTER

+ +

+The FUNC_ENTER macro is now extended to include macro calls that initialize the first thread, disable cancellability, and wrap a lock operation around the check of the global initialization flag. Note that cancellability should be disabled before acquiring the lock on the library. Doing otherwise would allow the thread to be cancelled just after it has acquired the lock on the library; in that scenario, if the cleanup routines are not properly set, the library would be permanently locked out. +

+ +

+The additional macro code and new macro definitions can be found in +Appendix E.1 to E.5. The changes are made in H5private.h. +

+ +

3.2 Changes to HRETURN and HRETURN_ERROR

+ +

+The HRETURN and HRETURN_ERROR macros are the +counterparts to the FUNC_ENTER macro described in section +3.1. FUNC_LEAVE makes a macro call to HRETURN, +so it is also covered here. +

+ +

+The basic changes to these two macros involve adding macro calls to call +an unlock operation and re-enable cancellability if necessary. It should +be noted that the cancellability should be re-enabled only after the +thread has released the lock to the library. The consequence of doing +otherwise would be similar to that described in section 3.1. +

+ +

+The additional macro code and new macro definitions can be found in Appendix E.6 to E.9. The changes are made in H5Eprivate.h. +

+ +

4. Implementation of threadsafe functionality

+ +

4.1 Recursive Locks

+ +

+A recursive mutex lock m allows a thread t1 to successfully lock m more +than once without blocking t1. Another thread t2 will block if t2 tries +to lock m while t1 holds the lock to m. If t1 makes k lock calls on m, +then it also needs to make k unlock calls on m before it releases the +lock. +
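The semantics just described can be observed with a short experiment. Note the swap: this sketch uses the `PTHREAD_MUTEX_RECURSIVE` mutex type provided by POSIX rather than the hand-built recursive lock this document goes on to describe, and all function names in it are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t m;

/* Thread t2: try to take m without blocking and report the result. */
static void *try_lock_once(void *result)
{
    int rc = pthread_mutex_trylock(&m);

    if (rc == 0)
        pthread_mutex_unlock(&m);
    *(int *)result = rc;
    return NULL;
}

/* Run try_lock_once in a second thread; returns its trylock result,
 * or -1 if the thread could not be created. */
static int probe_from_other_thread(void)
{
    pthread_t t2;
    int rc = -1;

    if (pthread_create(&t2, NULL, try_lock_once, &rc) != 0)
        return -1;
    pthread_join(t2, NULL);
    return rc;
}

/* Thread t1 locks m k times and unlocks it k - 1 times, so it still holds
 * the lock. Returns 1 if t2 then finds m busy and, after the k-th unlock,
 * finds it free. */
int recursive_lock_demo(int k)
{
    pthread_mutexattr_t attr;
    int i, busy_while_held, free_after_release;

    assert(k >= 1);

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    for (i = 0; i < k; i++)
        pthread_mutex_lock(&m); /* relocking by the owner does not block */
    for (i = 0; i < k - 1; i++)
        pthread_mutex_unlock(&m);

    busy_while_held = (probe_from_other_thread() == EBUSY);

    pthread_mutex_unlock(&m); /* k-th unlock releases the lock */
    free_after_release = (probe_from_other_thread() == 0);

    pthread_mutex_destroy(&m);
    return busy_while_held && free_after_release;
}
```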

+ +

+Our implementation of recursive locks is built on top of a pthread mutex lock (which is not recursive). It makes use of a pthread condition variable to have unsuccessful threads wait on the mutex. Waiting threads are awakened by a signal from the final unlock call made by the thread holding the lock. +

+ +

+Recursive locks are defined to be the following type +(H5private.h): +

+ +
+
+    typedef struct H5_mutex_struct {
+      pthread_t owner_thread;         /* current lock owner */
+      pthread_mutex_t atomic_lock;    /* lock for atomicity of new mechanism */
+      pthread_cond_t cond_var;        /* condition variable */
+      unsigned int lock_count;
+    } H5_mutex_t;
+  
+
+ +

+Detailed implementation code can be found in Appendix A. The +implementation changes are made in H5TS.c. +

+ +

4.2 First thread initialization

+ +

+Because the mutex lock associated with a recursive lock cannot be +statically initialized, a mechanism is required to initialize the +recursive lock associated with H5_g so that it can be used +for the first time. +

+ +

+The pthreads library allows this through the pthread_once call which, as described in section 2.3, allows only the first thread accessing the library in an application to initialize H5_g. +

+ +

+In addition to initializing H5_g, it also initializes the key (see section 2.4) for use with per-thread error stacks (see section 4.3). +

+ +

+The first thread initialization mechanism is implemented as the function +call H5_first_thread_init() in H5TS.c. This is +described in appendix B. +

+ +

4.3 Per-thread error stack management

+ +

+Pthreads allows individual threads to access dynamic and persistent per-thread data through the use of keys. Each key is associated with a table that maps threads to data items. Keys can be initialized by pthread_key_create() in pthreads (see sections 2.4 and 4.2). Per-thread data items are accessed using a key through the pthread_getspecific() and pthread_setspecific() calls, which read from and write to the association table respectively. +

+ +

+Per-thread error stacks are accessed through the key +H5_errstk_key_g which is initialized by the first thread +initialization call (see section 4.2). +

+ +

+In the non-threadsafe version of the library, there is a global stack +variable H5E_stack_g[1] which is no longer defined in the +threadsafe version. At the same time, the macro call to gain access to +the error stack H5E_get_my_stack is changed from: +

+ +
+
+    #define H5E_get_my_stack() (H5E_stack_g+0)
+  
+
+ +

+to: +

+ +
+
+    #define H5E_get_my_stack() H5E_get_stack()
+  
+
+ +

+where H5E_get_stack() is a surrogate function that does the +following operations: +

+ +
    +
  1. If a thread is attempting to get an error stack for the first time, the error stack is dynamically allocated for the thread and associated with H5_errstk_key_g using pthread_setspecific(). The first time is detected through pthread_getspecific(), which returns NULL if no previous value is associated with the thread using the key.
  2. If pthread_getspecific() returns a non-null value, then that is the pointer to the error stack associated with the thread and the stack can be used as usual.
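The two cases above can be sketched as a small standalone program. This is an illustration, not the library's code: `err_stack_t` stands in for H5E_t, the function names are hypothetical, and, unlike the appendix code, the sketch registers `free` as the key destructor and checks the allocation result.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Stand-in for the library's error stack type. */
typedef struct {
    int nused;
} err_stack_t;

static pthread_key_t  stack_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() releases a thread's stack when that thread exits */
    pthread_key_create(&stack_key, free);
}

/* Return the calling thread's stack, allocating it on first use. */
err_stack_t *get_my_stack(void)
{
    err_stack_t *s;

    pthread_once(&key_once, make_key);

    s = pthread_getspecific(stack_key);
    if (s == NULL) {
        /* first request from this thread: create and register a stack */
        s = calloc(1, sizeof *s);
        if (s != NULL && pthread_setspecific(stack_key, s) != 0) {
            free(s);
            s = NULL;
        }
    }
    return s;
}

/* Push *arg entries on this thread's stack, then report its depth. */
static void *push_errors(void *arg)
{
    int         *count = arg;
    int          n     = *count;
    err_stack_t *s     = get_my_stack();

    if (s == NULL) {
        *count = -1;
        return NULL;
    }
    while (n-- > 0)
        s->nused++;
    *count = get_my_stack()->nused;
    return NULL;
}

/* Returns 1 if two threads end up with independent stack depths. */
int per_thread_stack_demo(void)
{
    pthread_t a, b;
    int       na = 2, nb = 5;

    if (pthread_create(&a, NULL, push_errors, &na) != 0)
        return 0;
    if (pthread_create(&b, NULL, push_errors, &nb) != 0) {
        pthread_join(a, NULL);
        return 0;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return na == 2 && nb == 5;
}
```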
+ +

+A final change to the error reporting routines is as follows: the current implementation always reports errors as detected in thread 0. In the threadsafe implementation, this is changed to report the number returned by a call to pthread_self(). +

+ +

+The change in code (reflected in H5Eprint of file +H5E.c) is as follows: +

+ +
+
+    #ifdef H5_HAVE_THREADSAFE
+      fprintf (stream, "HDF5-DIAG: Error detected in thread %d."
+               ,pthread_self());
+    #else
+      fprintf (stream, "HDF5-DIAG: Error detected in thread 0.");
+    #endif
+  
+
+ +

+Code for H5E_get_stack() can be found in Appendix C. All the +above changes were made in H5E.c. +

+ +

4.4 Thread Cancellation safety

+ +

+To prevent thread cancellations from killing a thread while it is in the +library, we maintain per-thread information about the cancellability +status of the thread before it entered the library so that we can restore +that same status when the thread leaves the library. +

+ +

+By enter and leave the library, we mean the points when a +thread makes an API call from a user application and the time that API +call returns. Other API or callback function calls made from within that +API call are considered within the library. +

+ +

+Because other API calls may be made from within the first API call, we +need to maintain a counter to determine which was the first and +correspondingly the last return. +

+ +

+When a thread makes an API call, the macro H5_API_UNSET_CANCEL calls the worker function H5_cancel_count_inc() which does the following: +

+ +
    +
  1. If this is the first time the thread has entered the library, a new cancellability structure needs to be assigned to it.
  2. If the thread is already within the library when the API call is made, then cancel_count is simply incremented. Otherwise, we set the cancellability state to PTHREAD_CANCEL_DISABLE while storing the previous state into the cancellability structure. cancel_count is also incremented in this case.
+ +

+When a thread leaves an API call, the macro H5_API_SET_CANCEL calls the worker function H5_cancel_count_dec() which does the following: +

+ +
    +
  1. If cancel_count is greater than 1, indicating that the thread is not yet about to leave the library, then cancel_count is simply decremented.
  2. Otherwise, we reset the cancellability state back to its original state before it entered the library and decrement the count (back to zero).
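The counting scheme above can be exercised with a compact sketch. It simplifies the mechanism by keeping the per-thread structure in C11 `_Thread_local` storage instead of a pthread key, and the names `api_enter`, `api_leave`, and `current_cancel_state` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Same shape as the cancellability structure shown in section 2.5. */
typedef struct {
    int          previous_state;
    unsigned int cancel_count;
} cancel_t;

/* Simplification: C11 thread-local storage instead of a pthread key. */
static _Thread_local cancel_t my_cancel;

/* Called on every API entry; only the outermost entry changes the state. */
void api_enter(void)
{
    if (my_cancel.cancel_count == 0)
        pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &my_cancel.previous_state);
    my_cancel.cancel_count++;
}

/* Called on every API return; only the outermost return restores the state. */
void api_leave(void)
{
    assert(my_cancel.cancel_count > 0);

    if (my_cancel.cancel_count == 1)
        pthread_setcancelstate(my_cancel.previous_state, NULL);
    my_cancel.cancel_count--;
}

/* Report the calling thread's current cancellability state by setting a
 * state, capturing the old one, and putting the old one back. */
int current_cancel_state(void)
{
    int state;

    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &state);
    pthread_setcancelstate(state, NULL);
    return state;
}
```

A nested call leaves cancellation disabled until the outermost call returns, which is the property the counter exists to guarantee.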
+ +

+H5_cancel_count_inc and H5_cancel_count_dec are +described in Appendix D and may be found in H5TS.c. +

+ +

5. Test programs

+ +

+Except where stated, all tests involve 16 simultaneous threads that make +use of HDF-5 API calls without any explicit synchronization typically +required in a non-threadsafe environment. +

+ +

5.1 Data set create and write

+ +

+The test program sets up 16 threads to simultaneously create 16 different datasets named from zero to fifteen in a single file and then write an integer value into each dataset equal to the dataset's named value. +

+ +

+The main thread would join with all 16 threads and attempt to match the +resulting HDF-5 file with expected results - that each dataset contains +the correct value (0 for zero, 1 for one etc ...) and all +datasets were correctly created. +

+ +

+The test is implemented in the file ttsafe_dcreate.c. +

+ +

5.2 Test on error stack

+ +

+The error stack test is one in which 16 threads simultaneously try to +create datasets with the same name. The result, when properly serialized, +should be equivalent to 16 attempts to create the dataset with the same +name. +

+ +

+The error stack implementation runs correctly if it reports 15 instances of the dataset name conflict error and finally generates a correct HDF-5 file containing that single dataset. Each thread should report its own stack of errors with a thread number associated with it. +

+ +

+The test is implemented in the file ttsafe_error.c. +

+ +

5.3 Test on cancellation safety

+ +

+The main idea in the thread cancellation safety test is as follows: a child thread is spawned to create and write to a dataset. Following that, it makes an H5Diterate call on that dataset which activates a callback function. +

+ +

+A deliberate barrier is invoked at the callback function which waits for +both the main and child thread to arrive at that point. After that +happens, the main thread proceeds to make a thread cancel call on the +child thread while the latter sleeps for 3 seconds before proceeding to +write a new value to the dataset. +

+ +

+After the iterate call, the child thread logically proceeds to wait +another 3 seconds before writing another newer value to the dataset. +

+ +

+The test is correct if the main thread manages to read the second value at the end of the test. This means that cancellation did not take place until the end of the iteration call despite the 3 second wait within the iteration callback and the extra dataset write operation. Furthermore, the cancellation should occur before the child can proceed to write the last value into the dataset. +

+ +

5.4 Test on attribute creation

+ +

+A main thread makes 16 threaded calls to H5Acreate with a generated name for each attribute. Sixteen attributes should be created for the single dataset in random (chronological) order and receive values depending on their generated attribute names (e.g. attrib010 would receive the value 10). +

+ +

+After joining with all child threads, the main thread proceeds to read +each attribute by generated name to see if the value tallies. Failure is +detected if the attribute name does not exist (meaning they were never +created) or if the wrong values were read back. +

+ +

A. Recursive Lock implementation code

+ +
+
+  void H5_mutex_init(H5_mutex_t *H5_mutex)
+  {
+    H5_mutex->owner_thread = NULL;
+    pthread_mutex_init(&H5_mutex->atomic_lock, NULL);
+    pthread_cond_init(&H5_mutex->cond_var, NULL);
+    H5_mutex->lock_count = 0;
+  }
+
+  void H5_mutex_lock(H5_mutex_t *H5_mutex)
+  {
+    pthread_mutex_lock(&H5_mutex->atomic_lock);
+
+    if (pthread_equal(pthread_self(), H5_mutex->owner_thread)) {
+    	/* already owned by self - increment count */
+    	H5_mutex->lock_count++;
+    } else {
+    	if (H5_mutex->owner_thread == NULL) {
+    		/* no one else has locked it - set owner and grab lock */
+    		H5_mutex->owner_thread = pthread_self();
+    		H5_mutex->lock_count = 1;
+    	} else {
+    		/* if already locked by someone else */
+    		while (1) {
+    			pthread_cond_wait(&H5_mutex->cond_var, &H5_mutex->atomic_lock);
+
+    			if (H5_mutex->owner_thread == NULL) {
+    				H5_mutex->owner_thread = pthread_self();
+    				H5_mutex->lock_count = 1;
+    				break;
+    			} /* else do nothing and loop back to wait on condition*/
+    		}
+    	}
+    }
+
+    pthread_mutex_unlock(&H5_mutex->atomic_lock);
+  }
+
+  void H5_mutex_unlock(H5_mutex_t *H5_mutex)
+  {
+    pthread_mutex_lock(&H5_mutex->atomic_lock);
+    H5_mutex->lock_count--;
+
+    if (H5_mutex->lock_count == 0) {
+    	H5_mutex->owner_thread = NULL;
+    	pthread_cond_signal(&H5_mutex->cond_var);
+    }
+    pthread_mutex_unlock(&H5_mutex->atomic_lock);
+  }
+  
+
+ +

B. First thread initialization

+ +
+
+  void H5_first_thread_init(void)
+  {
+    /* initialize global API mutex lock                      */
+    H5_g.H5_libinit_g = FALSE;
+    H5_g.init_lock.owner_thread = NULL;
+    pthread_mutex_init(&H5_g.init_lock.atomic_lock, NULL);
+    pthread_cond_init(&H5_g.init_lock.cond_var, NULL);
+    H5_g.init_lock.lock_count = 0;
+
+    /* initialize key for thread-specific error stacks       */
+    pthread_key_create(&H5_errstk_key_g, NULL);
+
+    /* initialize key for thread cancellability mechanism    */
+    pthread_key_create(&H5_cancel_key_g, NULL);
+  }
+  
+
+ + +

C. Per-thread error stack acquisition

+ +
+
+  H5E_t *H5E_get_stack(void)
+  {
+    H5E_t *estack;
+
+    if ((estack = pthread_getspecific(H5_errstk_key_g)) != NULL) {
+    	return estack;
+    } else {
+    	/* no associated value with current thread - create one */
+    	estack = (H5E_t *)malloc(sizeof(H5E_t));
+    	pthread_setspecific(H5_errstk_key_g, (void *)estack);
+    	return estack;
+    }
+  }
+  
+
+ +

D. Thread cancellation mechanisms

+ +
+
+  void H5_cancel_count_inc(void)
+  {
+    H5_cancel_t *cancel_counter;
+
+    if (cancel_counter = pthread_getspecific(H5_cancel_key_g)) {
+      /* do nothing here */
+    } else {
+      /*
+       * first time thread calls library - create new counter and
+       * associate with key
+       */
+      cancel_counter = (H5_cancel_t *)malloc(sizeof(H5_cancel_t));
+      cancel_counter->cancel_count = 0;
+      pthread_setspecific(H5_cancel_key_g, (void *)cancel_counter);
+    }
+
+    if (cancel_counter->cancel_count == 0) {
+      /* thread entering library */
+      pthread_setcancelstate(PTHREAD_CANCEL_DISABLE,
+                             &(cancel_counter->previous_state));
+    }
+
+    cancel_counter->cancel_count++;
+  }
+
+  void H5_cancel_count_dec(void)
+  {
+    H5_cancel_t *cancel_counter = pthread_getspecific(H5_cancel_key_g);
+
+    if (cancel_counter->cancel_count == 1)
+      pthread_setcancelstate(cancel_counter->previous_state, NULL);
+
+    cancel_counter->cancel_count--;
+  }
+  
+
+ +

E. Macro expansion codes

+ +

E.1 FUNC_ENTER

+ +
+
+  /* Initialize the library */                                \
+  H5_FIRST_THREAD_INIT                                        \
+  H5_API_UNSET_CANCEL                                         \
+  H5_API_LOCK_BEGIN                                           \
+    if (!(H5_INIT_GLOBAL)) {                                  \
+      H5_INIT_GLOBAL = TRUE;                                  \
+        if (H5_init_library() < 0) {                          \
+          HRETURN_ERROR (H5E_FUNC, H5E_CANTINIT, err,         \
+                        "library initialization failed");     \
+        }                                                     \
+    }                                                         \
+    H5_API_LOCK_END                                           \
+             :
+             :
+             :
+  
+
+ +

E.2 H5_FIRST_THREAD_INIT

+ +
+
+  /* Macro for first thread initialization */
+  #define H5_FIRST_THREAD_INIT                                \
+    pthread_once(&H5_first_init_g, H5_first_thread_init);
+  
+
+ + +

E.3 H5_API_UNSET_CANCEL

+ +
+
+  #define H5_API_UNSET_CANCEL                                 \
+    if (H5_IS_API(FUNC)) {                                    \
+      H5_cancel_count_inc();                                  \
+    }
+  
+
+ + +

E.4 H5_API_LOCK_BEGIN

+ +
+
+  #define H5_API_LOCK_BEGIN                                   \
+     if (H5_IS_API(FUNC)) {                                   \
+       H5_mutex_lock(&H5_g.init_lock);
+  
+
+ + +

E.5 H5_API_LOCK_END

+ +
+
+  #define H5_API_LOCK_END }
+  
+
+ + +

E.6 HRETURN and HRETURN_ERROR

+ +
+
+            :
+            :
+    H5_API_UNLOCK_BEGIN                                       \
+    H5_API_UNLOCK_END                                         \
+    H5_API_SET_CANCEL                                         \
+    return ret_val;                                           \
+  }
+  
+
+ +

E.7 H5_API_UNLOCK_BEGIN

+ +
+
+  #define H5_API_UNLOCK_BEGIN                                 \
+    if (H5_IS_API(FUNC)) {                                    \
+      H5_mutex_unlock(&H5_g.init_lock);
+  
+
+ +

E.8 H5_API_UNLOCK_END

+ +
+
+  #define H5_API_UNLOCK_END }
+  
+
+ + +

E.9 H5_API_SET_CANCEL

+ +
+
+  #define H5_API_SET_CANCEL                                   \
+    if (H5_IS_API(FUNC)) {                                    \
+      H5_cancel_count_dec();                                  \
+    }
+  
+
+ +

By Chee Wai Lee

+

By Bill Wendling

+ + + diff --git a/doxygen/examples/VFL.html b/doxygen/examples/VFL.html new file mode 100644 index 0000000..9776f96 --- /dev/null +++ b/doxygen/examples/VFL.html @@ -0,0 +1,1601 @@ + + + + +HDF5 Virtual File Layer + + + + + + + + +Revision History +

Initial document, 18 November 1999.

+ +

Updated on 10/24/00, Quincey Koziol

+ +

Added the section “Programming Note for C++ Developers Using C +Functions,” 08/23/2012, Mark Evans + + + +

+


+

Table of Contents

+ +


+ + +

Introduction

+ +

+The HDF5 file format describes how HDF5 data structures and dataset raw +data are mapped to a linear format address space and the HDF5 +library implements that bidirectional mapping in terms of an +API. However, the HDF5 format specifications do not indicate how +the format address space is mapped onto storage and HDF (version 5 and +earlier) simply mapped the format address space directly onto a single +file by convention. + +

+

+Since the early versions of HDF5 it has been apparent that users want the ability to +map the format address space onto different types of storage (a single file, +multiple files, local memory, global memory, network distributed global +memory, a network protocol, etc.) with various types of maps. For +instance, some users want to be able to handle very large format address +spaces on operating systems that support only 2GB files by partitioning the +format address space into equal-sized parts each served by a separate +file. Other users want the same multi-file storage capability but want to +partition the address space according to purpose (raw data in one file, object +headers in another, global heap in a third, etc.) in order to improve I/O +speeds. + +

+

+In fact, the number of storage variations is probably larger than the +number of methods that the HDF5 team is capable of implementing and +supporting. Therefore, a Virtual File Layer API is being +implemented which will allow application teams or departments to design +and implement their own mapping between the HDF5 format address space +and storage, with each mapping being a separate file driver +(possibly written in terms of other file drivers). The HDF5 team will +provide a small set of useful file drivers which will also serve as +examples for those who wish to write their own: + +

+
+ +
H5FD_SEC2 +
+This is the default driver which uses POSIX file-system functions like +read and write to perform I/O to a single file. All I/O +requests are unbuffered although the driver does optimize file seeking +operations to some extent. +
H5FD_STDIO +
+This driver uses functions from `stdio.h' to perform buffered I/O +to a single file. + +
H5FD_CORE +
+This driver performs I/O directly to memory and can be used to create small +temporary files that never exist on permanent storage. This type of storage is +generally very fast since the I/O consists only of memory-to-memory copy +operations. + +
H5FD_MPIO +
+This is the driver of choice for accessing files in parallel using MPI and +MPI-IO. It is only predefined if the library is compiled with parallel I/O +support. + +
H5FD_FAMILY +
+Large format address spaces are partitioned into more manageable pieces and +sent to separate storage locations using an underlying driver of the user's +choice. The h5repart tool can be used to change the sizes of the +family members when stored as files or to convert a family of files to a +single file or vice versa. + +
H5FD_SPLIT +
+The format address space is split into meta data and raw data and each is +mapped onto separate storage using underlying drivers of the user's +choice. The meta data storage can be read by itself (for limited +functionality) or both files can be accessed together. +
+ + + +

Using a File Driver

+ +

+Most application writers will use a driver defined by the HDF5 library or +contributed by another programming team. This chapter describes how existing +drivers are used. + +

+ + + +

Driver Header Files

+ +

+Each file driver is defined in its own public header file which should +be included by any application which plans to use that driver. The +predefined drivers are in header files whose names begin with +`H5FD' followed by the driver name and `.h'. The `hdf5.h' +header file includes all the predefined driver header files. + +

+

+Once the appropriate header file is included a symbol of the form +`H5FD_' followed by the upper-case driver name will be the driver +identification number.(1) However, the +value may change if the library is closed (e.g., by calling +H5close) and the symbol is referenced again. + +

+ + +

Creating and Opening Files

+ +

+In order to create or open a file, the application must define the method by which the +storage is accessed(2). It does so by creating a file access property list(3) which is passed to the H5Fcreate or +H5Fopen function. A default file access property list is created by +calling H5Pcreate and then the file driver information is inserted by +calling a driver initialization function such as H5Pset_fapl_family: + +

+ +
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+hsize_t member_size = 100*1024*1024; /*100MB*/
+H5Pset_fapl_family(fapl, member_size, H5P_DEFAULT);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+
+ +

+Each file driver will have its own initialization function +whose name is H5Pset_fapl_ followed by the driver name and which +takes a file access property list as the first argument followed by +additional driver-dependent arguments. + +

+

+An alternative to using the driver initialization function is to set the +driver directly using the H5Pset_driver function.(4) Its second argument is the file driver identifier, which may +have a different numeric value from run to run depending on the order in which +the file drivers are registered with the library. The third argument +encapsulates the additional arguments of the driver initialization +function. This method only works if the file driver writer has made the +driver-specific property list structure a public datatype, which is +often not the case. + +

+ +
+hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
+static H5FD_family_fapl_t fa = {100*1024*1024, H5P_DEFAULT};
+H5Pset_driver(fapl, H5FD_FAMILY, &fa);
+hid_t file = H5Fcreate("foo%05d.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
+H5Pclose(fapl);
+
+ +

+It is also possible to query the file driver information from a file access +property list by calling H5Pget_driver to determine the driver and then +calling a driver-defined query function to obtain the driver information: + +

+ +
+hid_t driver = H5Pget_driver(fapl);
+if (H5FD_SEC2==driver) {
+    /*nothing further to get*/
+} else if (H5FD_FAMILY==driver) {
+    hid_t member_fapl;
+    hsize_t member_size;
+    H5Pget_fapl_family(fapl, &member_size, &member_fapl);
+} else if (....) {
+    ....
+}
+
+ + + +

Performing I/O

+ +

+The H5Dread and H5Dwrite functions transfer data between +application memory and the file. They both take an optional data transfer +property list which has some general driver-independent properties and +optional driver-defined properties. An application will typically perform I/O +in one of three styles via the H5Dread or H5Dwrite function: + +

+

+Like file access properties in the previous section, data transfer properties +can be set using a driver initialization function or a general purpose +function. For example, to set the MPI-IO driver to use independent access for +I/O operations one would say: + +

+ +
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_INDEPENDENT);
+H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
+H5Pclose(dxpl);
+
+ +

+The alternative is to initialize a driver-defined C struct and pass it +to the H5Pset_driver function: + +

+ +
+hid_t dxpl = H5Pcreate(H5P_DATA_XFER);
+static H5FD_mpio_dxpl_t dx = {H5FD_MPIO_INDEPENDENT};
+H5Pset_driver(dxpl, H5FD_MPIO, &dx);
+H5Dread(dataset, type, mspace, fspace, dxpl, buffer);
+
+ +

+The transfer property list can be queried in a manner similar to the file +access property list: the driver provides a function (or functions) to return +various information about the transfer property list: + +

+ +
+hid_t driver = H5Pget_driver(dxpl);
+if (H5FD_MPIO==driver) {
+    H5FD_mpio_xfer_t xfer_mode;
+    H5Pget_dxpl_mpio(dxpl, &xfer_mode);
+} else {
+    ....
+}
+
+ + + +

File Driver Interchangeability

+ +

+The HDF5 specifications describe two things: the mapping of data onto a linear +format address space and the C API which performs the mapping. +However, the mapping of the format address space onto storage intentionally +falls outside the scope of the HDF5 specs. This is a direct result of the fact +that it is not generally possible to store information about how to access +storage inside the storage itself. For instance, given only the file name +`/arborea/1225/work/f%03d' the HDF5 library is unable to tell whether the +name refers to a file on the local file system, a family of files on the local +file system, a file on host `arborea' port 1225, a family of files on a +remote system, etc. + +

+

+There are two ways in which the library could figure out where the storage is located: +storage access information can be provided by the user, or the library can try +all known file access methods. This implementation uses the former method. + +

+

+In general, if a file was created with one driver then it isn't possible to +open it with another driver. There are of course exceptions: a file created +with MPIO could probably be opened with the sec2 driver, any file created +by the sec2 driver could be opened as a family of files with one member, +etc. In fact, sometimes a file must not only be opened with the same +driver but also with the same driver properties. The predefined drivers are +written in such a way that specifying the correct driver is sufficient for +opening a file. + +

+ + +

Implementation of a Driver

+ +

+A driver is simply a collection of functions and data structures which are +registered with the HDF5 library at runtime. The functions fall into these +categories: + +

+ +
    +
  • Functions which operate on modes + +
  • Functions which operate on files + +
  • Functions which operate on the address space + +
  • Functions which operate on data + +
  • Functions for driver initialization + +
  • Optimization functions + +
+ + + +

Mode Functions

+ +

+Some drivers need information about file access and data transfers which are +very specific to the driver. The information is usually implemented as a pair +of pointers to C structs which are allocated and initialized as part of an +HDF5 property list and passed down to various driver functions. There are two +classes of settings: file access modes that describe how to access the file +through the driver, and data transfer modes which are settings that control +I/O operations. Each file opened by a particular driver may have a different +access mode; each dataset I/O request for a particular file may have a +different data transfer mode. + +

+

+Since each driver has its own particular requirements for various settings, +each driver is responsible for defining the mode structures that it +needs. Higher layers of the library treat the structures as opaque but must be +able to copy and free them. Thus, the driver provides either the size of the +structure or a pair of function pointers for each of the mode types. + +

+

+Example: The family driver needs to know how the format address +space is partitioned and the file access property list to use for the +family members. + +

+ +
+/* Driver-specific file access properties */
+typedef struct H5FD_family_fapl_t {
+    hsize_t     memb_size;      /*size of each member                   */
+    hid_t       memb_fapl_id;   /*file access property list of each memb*/
+} H5FD_family_fapl_t;
+
+/* Driver specific data transfer properties */
+typedef struct H5FD_family_dxpl_t {
+    hid_t       memb_dxpl_id;   /*data xfer property list of each memb  */
+} H5FD_family_dxpl_t;
+
+ +

+In order to copy or free one of these structures the member file access +or data transfer properties must also be copied or freed. This is done +by providing a copy and close function for each structure: + +

+

+Example: The file access property list copy and close functions +for the family driver: + +

+ +
+static void *
+H5FD_family_fapl_copy(const void *_old_fa)
+{
+    const H5FD_family_fapl_t *old_fa = (const H5FD_family_fapl_t*)_old_fa;
+    H5FD_family_fapl_t *new_fa = malloc(sizeof(H5FD_family_fapl_t));
+    assert(new_fa);
+
+    memcpy(new_fa, old_fa, sizeof(H5FD_family_fapl_t));
+    new_fa->memb_fapl_id = H5Pcopy(old_fa->memb_fapl_id);
+    return new_fa;
+}
+
+static herr_t
+H5FD_family_fapl_free(void *_fa)
+{
+    H5FD_family_fapl_t  *fa = (H5FD_family_fapl_t*)_fa;
+    H5Pclose(fa->memb_fapl_id);
+    free(fa);
+    return 0;
+}
+
+ +

+Generally when a file is created or opened the file access properties +for the driver are copied into the file pointer which is returned and +they may be modified from their original value (for instance, the file +family driver modifies the member size property when opening an existing +family). In order to support the H5Fget_access_plist function the +driver must provide a fapl_get callback which creates a copy of +the driver-specific properties based on a particular file. + +

+

+Example: The file family driver copies the member size and the member file +access property list into the return value: + +

+ +
+static void *
+H5FD_family_fapl_get(H5FD_t *_file)
+{
+    H5FD_family_t	*file = (H5FD_family_t*)_file;
+    H5FD_family_fapl_t	*fa = calloc(1, sizeof(H5FD_family_fapl_t));
+
+    fa->memb_size = file->memb_size;
+    fa->memb_fapl_id = H5Pcopy(file->memb_fapl_id);
+    return fa;
+}
+
+ + + +

File Functions

+ +

+The higher layers of the library expect files to have a name and allow the +file to be accessed in various modes. The driver must be able to create a new +file, replace an existing file, or open an existing file. Opening or creating +a file should return a handle, a pointer to a specialization of the +H5FD_t struct, which allows read-only or read-write access and which +will be passed to the other driver functions as they are +called.(5) + +

+ +
+typedef struct {
+    /* Public fields */
+    H5FD_class_t *cls; /*class data defined below*/
+
+    /* Private fields -- driver-defined */
+
+} H5FD_t;
+
+ +

+Example: The family driver requires handles to the underlying +storage, the size of the members for this particular file (which might be +different than the member size specified in the file access property list if +an existing file family is being opened), the name used to open the file in +case additional members must be created, and the flags to use for creating +those additional members. The eoa member caches the size of the format +address space so the family members don't have to be queried in order to find +it. + +

+ +
+/* The description of a file belonging to this driver. */
+typedef struct H5FD_family_t {
+    H5FD_t      pub;            /*public stuff, must be first           */
+    hid_t       memb_fapl_id;   /*file access property list for members */
+    hsize_t     memb_size;      /*maximum size of each member file      */
+    int         nmembs;         /*number of family members              */
+    int         amembs;         /*number of member slots allocated      */
+    H5FD_t      **memb;         /*dynamic array of member pointers      */
+    haddr_t     eoa;            /*end of allocated addresses            */
+    char        *name;          /*name generator printf format          */
+    unsigned    flags;          /*flags for opening additional members  */
+} H5FD_family_t;
+
+ +

+Example: The sec2 driver needs to keep track of the underlying Unix +file descriptor and also the end of format address space and current Unix file +size. It also keeps track of the current file position and last operation +(read, write, or unknown) in order to optimize calls to lseek. The +device and inode fields are defined on Unix in order to uniquely +identify the file and will be discussed below. + +

+ +
+typedef struct H5FD_sec2_t {
+    H5FD_t      pub;                    /*public stuff, must be first   */
+    int         fd;                     /*the unix file                 */
+    haddr_t     eoa;                    /*end of allocated region       */
+    haddr_t     eof;                    /*end of file; current file size*/
+    haddr_t     pos;                    /*current file I/O position     */
+    int         op;                     /*last operation                */
+    dev_t       device;                 /*file device number            */
+    ino_t       inode;                  /*file i-node number            */
+} H5FD_sec2_t;
+
+ + + +

Opening Files

+ +

+All drivers must define a function for opening/creating a file. This +function should have a prototype which is: + +

+

+

+
Function: static H5FD_t * open (const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr) +
+ +

+

+The file name name and file access property list fapl are +the same as were specified in the H5Fcreate or H5Fopen +call. The flags are also the same as in those calls, except that the +flag H5F_ACC_CREAT is also present if the call was to +H5Fcreate; they are documented in the `H5Fpublic.h' +file. The maxaddr argument is the maximum format address that the +driver should be prepared to handle (the minimum address is always +zero). +

+ +

+

+Example: The sec2 driver opens a Unix file with the requested name +and saves information which uniquely identifies the file (the Unix device +number and inode). + +

+ +
+static H5FD_t *
+H5FD_sec2_open(const char *name, unsigned flags, hid_t fapl_id/*unused*/,
+               haddr_t maxaddr)
+{
+    unsigned    o_flags;
+    int         fd;
+    struct stat sb;
+    H5FD_sec2_t *file=NULL;
+
+    /* Check arguments */
+    if (!name || !*name) return NULL;
+    if (0==maxaddr || HADDR_UNDEF==maxaddr) return NULL;
+    if (ADDR_OVERFLOW(maxaddr)) return NULL;
+
+    /* Build the open flags */
+    o_flags = (H5F_ACC_RDWR & flags) ? O_RDWR : O_RDONLY;
+    if (H5F_ACC_TRUNC & flags) o_flags |= O_TRUNC;
+    if (H5F_ACC_CREAT & flags) o_flags |= O_CREAT;
+    if (H5F_ACC_EXCL & flags) o_flags |= O_EXCL;
+
+    /* Open the file */
+    if ((fd=open(name, o_flags, 0666))<0) return NULL;
+    if (fstat(fd, &sb)<0) {
+        close(fd);
+        return NULL;
+    }
+
+    /* Create the new file struct */
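+    file = calloc(1, sizeof(H5FD_sec2_t));
+    if (!file) {
+        /* Out of memory: release the descriptor before failing */
+        close(fd);
+        return NULL;
+    }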
+    file->fd = fd;
+    file->eof = sb.st_size;
+    file->pos = HADDR_UNDEF;
+    file->op = OP_UNKNOWN;
+    file->device = sb.st_dev;
+    file->inode = sb.st_ino;
+
+    return (H5FD_t*)file;
+}
+
+ + + +

Closing Files

+ +

+Closing a file simply means that all cached data should be flushed to the next +lower layer, the file should be closed at the next lower layer, and all +file-related data structures should be freed. All information needed by the +close function is already present in the file handle. + +

+

+

+
Function: static herr_t close (H5FD_t *file) +
+ +

+

+The file argument is the handle which was returned by the open +function, and the close should free only memory associated with the +driver-specific part of the handle (the public parts will have already been released by HDF5's virtual file layer). +

+ +

+

+Example: The sec2 driver just closes the underlying Unix file, +making sure that the actual file size is the same as that known to the +library by writing a zero to the last file position if it hasn't been +written by some previous operation (which happens in the same code which +flushes the file contents and is shown below). + +

+ +
+static herr_t
+H5FD_sec2_close(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+
+    if (H5FD_sec2_flush(_file)<0) return -1;
+    if (close(file->fd)<0) return -1;
+    free(file);
+    return 0;
+}
+
+ + + +

File Keys

+ +

+Occasionally an application will attempt to open a single file more than one +time in order to obtain multiple handles to the file. HDF5 allows the files to +share information(6) but in order to +accomplish this HDF5 must be able to tell when two names refer to the same +file. It does this by associating a driver-defined key with each file opened +by a driver and comparing the key for an open request with the keys for all +other files currently open by the same driver. + +

+

+

+
Function: static int cmp (const H5FD_t *f1, const H5FD_t *f2) +
+ +

+

+The driver may provide a function which compares two files f1 and +f2 belonging to the same driver and returns a negative, positive, or +zero value in the manner of the strcmp function.(7) If this +function is not provided then HDF5 assumes that all calls to the open +callback return unique files regardless of the arguments, and it is up to the +application to avoid opening the same file more than once if that assumption is incorrect. +

+ +

+

+Each time a file is opened the library calls the cmp function to +compare that file with all other files currently open by the same driver and +if one of them matches (at most one can match) then the file which was just +opened is closed and the previously opened file is used instead. + +

+

+Opening a file twice with incompatible flags will result in failure. For +instance, opening a file with the truncate flag is a two step process which +first opens the file without truncation so keys can be compared, and if no +matching file is found already open then the file is closed and immediately +reopened with the truncation flag set (if a matching file is already open then +the truncating open will fail). + +

+

+Example: The sec2 driver uses the Unix device and i-node as the +key. They were initialized when the file was opened. + +

+ +
+static int
+H5FD_sec2_cmp(const H5FD_t *_f1, const H5FD_t *_f2)
+{
+    const H5FD_sec2_t   *f1 = (const H5FD_sec2_t*)_f1;
+    const H5FD_sec2_t   *f2 = (const H5FD_sec2_t*)_f2;
+
+    if (f1->device < f2->device) return -1;
+    if (f1->device > f2->device) return 1;
+
+    if (f1->inode < f2->inode) return -1;
+    if (f1->inode > f2->inode) return 1;
+
+    return 0;
+}
+
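+The lookup described in this section can be sketched without the HDF5 headers. This is only an illustration under stated assumptions: demo_key_file_t, demo_cmp and demo_find_open are hypothetical names, and the list of open files is a plain array rather than whatever structure the library really keeps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver file handle; only the key fields are kept. */
typedef struct demo_key_file_t {
    unsigned long device; /* device number, as in the sec2 example */
    unsigned long inode;  /* i-node number, as in the sec2 example */
} demo_key_file_t;

/* Three-way comparison in the style of strcmp: device first, then i-node. */
static int
demo_cmp(const demo_key_file_t *f1, const demo_key_file_t *f2)
{
    if (f1->device < f2->device) return -1;
    if (f1->device > f2->device) return 1;
    if (f1->inode < f2->inode) return -1;
    if (f1->inode > f2->inode) return 1;
    return 0;
}

/* Return the already-open file whose key matches `candidate`, or NULL if
 * no open file matches (at most one can match). */
static demo_key_file_t *
demo_find_open(demo_key_file_t **open_files, size_t nopen,
               const demo_key_file_t *candidate)
{
    size_t i;

    for (i = 0; i < nopen; i++) {
        if (0 == demo_cmp(open_files[i], candidate))
            return open_files[i];
    }
    return NULL;
}
```

+When demo_find_open returns a match, the caller would close the newly opened handle and reuse the returned one, which is the behavior this section describes.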
+ + + +

Saving Modes Across Opens

+ +

+Some drivers may also need to store certain information in the file superblock +in order to be able to reliably open the file at a later date. This is done by +three functions: one to determine how much space will be necessary to store +the information in the superblock, one to encode the information, and one to +decode the information. These functions are optional, but if any one is +defined then the other two must also be defined. + +

+

+

+
Function: static hsize_t sb_size (H5FD_t *file) +
+
Function: static herr_t sb_encode (H5FD_t *file, char *name, unsigned char *buf) +
+
Function: static herr_t sb_decode (H5FD_t *file, const char *name, const unsigned char *buf) +
+ +

+

+The sb_size function returns the number of bytes necessary to encode +information needed later if the file is reopened. The sb_encode +function encodes information from the file into buffer buf +allocated by the caller. It also writes an 8-character string (plus null +termination) into the name argument, which should be a unique +identification for the driver. The sb_decode function looks at +the name, decodes +data from the buffer buf, and updates the file argument with the new information. +

+ +

+

+The somewhat tricky part is that the file must be readable +before the superblock information is decoded. File access modes fall outside +the scope of the HDF5 file format, but they are placed inside the superblock +for convenience.(8) + +

+

+Example: To be written later. + +
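+Until the example is written, the following is a minimal sketch of how the three callbacks could fit together. It is not the library's implementation: the handle type demo_family_sb_t, the driver name DEMOFAM1, the 8-byte little-endian layout, and the simplified signatures (no H5FD_t) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_DRIVER_NAME "DEMOFAM1" /* 8 characters; hypothetical identifier */

/* Hypothetical file handle holding the one value saved in the superblock. */
typedef struct demo_family_sb_t {
    uint64_t memb_size; /* size of each family member */
} demo_family_sb_t;

/* Number of bytes the driver information occupies in the superblock. */
static size_t
demo_sb_size(const demo_family_sb_t *file)
{
    (void)file;
    return 8;
}

/* Write the driver name (8 characters plus terminator) into `name` and
 * encode memb_size into `buf` as 8 little-endian bytes. */
static int
demo_sb_encode(const demo_family_sb_t *file, char *name, unsigned char *buf)
{
    int i;

    strcpy(name, DEMO_DRIVER_NAME);
    for (i = 0; i < 8; i++)
        buf[i] = (unsigned char)((file->memb_size >> (8 * i)) & 0xff);
    return 0;
}

/* Check the driver name, then decode memb_size from `buf` into the file. */
static int
demo_sb_decode(demo_family_sb_t *file, const char *name, const unsigned char *buf)
{
    uint64_t value = 0;
    int      i;

    if (strcmp(name, DEMO_DRIVER_NAME) != 0)
        return -1; /* information was written by a different driver */
    for (i = 7; i >= 0; i--)
        value = (value << 8) | buf[i];
    file->memb_size = value;
    return 0;
}
```

+A fixed byte order is used so that the encoded information does not depend on the machine that wrote the file.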

+ + +

Address Space Functions

+ +

+HDF5 does not assume that a file is a linear address space of bytes. Instead, +the library will call functions to allocate and free portions of the HDF5 +format address space, which in turn map onto functions in the file driver to +allocate and free portions of file address space. The library tells the file +driver how much format address space it wants to allocate and the driver +decides what format address to use and how that format address is mapped onto +the file address space. Usually the format address is chosen so that the file +address can be calculated in constant time for data I/O operations (which are +always specified by format addresses). + +
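+As an illustration of a constant-time mapping, a family-style driver with equal-sized members could locate a format address with one division and one remainder. The sketch below is an assumption about how such a mapping might look; demo_location_t and demo_map_address are hypothetical names and do not come from the library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: where a format address lives in the storage. */
typedef struct demo_location_t {
    uint64_t member; /* index of the member file */
    uint64_t offset; /* byte offset within that member */
} demo_location_t;

/* Map a format address onto (member, offset) for members of equal size.
 * memb_size must be nonzero. */
static demo_location_t
demo_map_address(uint64_t addr, uint64_t memb_size)
{
    demo_location_t loc;

    loc.member = addr / memb_size;
    loc.offset = addr % memb_size;
    return loc;
}
```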

+ + + +

Userblock and Superblock

+ +

+The HDF5 format allows an optional userblock to appear before the actual HDF5 +data in such a way that if the userblock is removed from the file and +everything remaining is shifted downward in the file address space, then the +file is still a valid HDF5 file. The userblock size can be zero or any +power of two greater than or equal to 512 and the file superblock begins +immediately after the userblock. + +
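+The size rule can be checked with a few lines of C. The sketch reads the rule as "zero, or a power of two no smaller than 512", which is how the HDF5 file format specification states it; demo_userblock_size_ok is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if `size` is an acceptable userblock size, 0 otherwise. */
static int
demo_userblock_size_ok(uint64_t size)
{
    if (0 == size)
        return 1; /* no userblock at all */
    if (size < 512)
        return 0; /* too small */
    /* A power of two has exactly one bit set. */
    return 0 == (size & (size - 1));
}
```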

+

+HDF5 allocates space for the userblock and superblock by calling an +allocation function defined below, which must return a chunk of memory at +format address zero on the first call. + +

+ + +

Allocation of Format Regions

+ +

+The library makes many types of allocation requests: + +

+
+ +
H5FD_MEM_SUPER +
+An allocation request for the userblock and/or superblock. +
H5FD_MEM_BTREE +
+An allocation request for a node of a B-tree. +
H5FD_MEM_DRAW +
+An allocation request for the raw data of a dataset. +
H5FD_MEM_META +
+An allocation request for the raw data of a dataset which +the user has indicated will be relatively small. +
H5FD_MEM_GROUP +
+An allocation request for a group leaf node (internal nodes of the group tree +are allocated as H5FD_MEM_BTREE). +
H5FD_MEM_GHEAP +
+An allocation request for a global heap collection. Global heaps are used to +store certain types of references such as dataset region references. The set +of all global heap collections can become quite large. +
H5FD_MEM_LHEAP +
+An allocation request for a local heap. Local heaps are used to store the +names which are members of a group. The combined size of all local heaps is a +function of the number of object names in the file. +
H5FD_MEM_OHDR +
+An allocation request for (part of) an object header. Object headers are +relatively small and include meta information about objects (like the data +space and type of a dataset) and attributes. +
+ +

+When a chunk of memory is freed the library adds it to a free list and +allocation requests are satisfied from the free list before requesting memory +from the file driver. Each type of allocation request enumerated above has its +own free list, but the file driver can specify that certain object types can +share a free list. It does so by providing an array which maps a request type +to a free list. If any value of the map is H5FD_MEM_DEFAULT (zero) then the +object's own free list is used. The special value H5FD_MEM_NOLIST indicates +that the library should not attempt to maintain a free list for that +particular object type, instead calling the file driver each time an object of +that type is freed. + +

+

+Mappings predefined in the `H5FDpublic.h' file are: +

+ +
H5FD_FLMAP_SINGLE +
+All memory usage types are mapped to a single free list. +
H5FD_FLMAP_DICHOTOMY +
+Memory usage is segregated into meta data and raw data for the purposes of +memory management. +
H5FD_FLMAP_DEFAULT +
+Each memory usage type has its own free list. +
+ +

+Example: To make a map that manages object headers on one free list +and everything else on another free list one might initialize the map with the +following code: (the use of H5FD_MEM_SUPER is arbitrary) + +

+ +
+H5FD_mem_t mt, map[H5FD_MEM_NTYPES];
+
+for (mt=0; mt<H5FD_MEM_NTYPES; mt++) {
+    map[mt] = (H5FD_MEM_OHDR==mt) ? mt : H5FD_MEM_SUPER;
+}
+
+ +

+If an allocation request cannot be satisfied from the free list then one of +two things happens. If the driver defines an allocation callback then it is +used to allocate space; otherwise new memory is allocated from the end of the +format address space by incrementing the end-of-address marker. + +

+

+

+
Function: static haddr_t alloc (H5FD_t *file, H5FD_mem_t type, hsize_t size) +
+ +

+

+The file argument is the file from which space is to be allocated, +type is the type of memory being requested (from the list above) without +being mapped according to the freelist map and size is the number of +bytes being requested. The library is allowed to allocate large chunks of +storage and manage them in a layer above the file driver (although the current +library doesn't do that). The allocation function should return a format +address for the first byte allocated. The allocated region extends from that +address for size bytes. If the request cannot be honored then the +undefined address value is returned (HADDR_UNDEF). The first call to +this function for a file which has never had memory allocated must +return a format address of zero or HADDR_UNDEF since this is how the +library allocates space for the userblock and/or superblock. +

+ +

+ +

+Example: To be written later. + +
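+Until the example is written, here is a hedged sketch of the simplest possible allocation callback: it hands out space from the end of the format address space, so the first call on a new file returns address zero. The type demo_alloc_file_t, the constant DEMO_ADDR_UNDEF (standing in for HADDR_UNDEF) and the use of maxaddr as an upper bound on the end-of-address marker are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_ADDR_UNDEF UINT64_MAX /* stand-in for HADDR_UNDEF */

/* Hypothetical file handle with just the fields the allocator needs. */
typedef struct demo_alloc_file_t {
    uint64_t eoa;     /* first format address never allocated */
    uint64_t maxaddr; /* assumed upper bound for eoa; eoa <= maxaddr */
} demo_alloc_file_t;

/* Allocate `size` bytes at the current end of the format address space.
 * Returns the first address of the region, or DEMO_ADDR_UNDEF on failure.
 * The memory usage type is ignored by this simple allocator. */
static uint64_t
demo_alloc(demo_alloc_file_t *file, int type, uint64_t size)
{
    uint64_t addr = file->eoa;

    (void)type;
    if (size > file->maxaddr - file->eoa)
        return DEMO_ADDR_UNDEF; /* request does not fit */
    file->eoa += size;
    return addr;
}
```

+A driver that wants to place different memory usage types in different storage would inspect type instead of ignoring it.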

+ + +

Freeing Format Regions

+ +

+When the library is finished using a certain region of the format address +space it will return the space to the free list according to the type of +memory being freed and the free list map described above. If the free list has +been disabled for a particular memory usage type (according to the free list +map) and the driver defines a free callback then it will be +invoked. The free callback is also invoked for all entries on the free +list when the file is closed. + +

+

+

+
Function: static herr_t free (H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size) +
+ +

+

+The file argument is the file for which space is being freed; type +is the type of object being freed (from the list above) without being mapped +according to the freelist map; addr is the first format address to free; +and size is the size in bytes of the region being freed. The region +being freed may refer to just part of the region originally allocated and/or +may cross allocation boundaries provided all regions being freed have the same +usage type. However, the library will never attempt to free regions which have +already been freed or which have never been allocated. +

+ +

+

+A driver may choose to not define the free function, in which case +format addresses will be leaked. This isn't normally a huge problem since the +library contains a simple free list of its own and freeing parts of the format +address space is not a common occurrence. + +

+

+Example: To be written later. + +
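+Until the example is written, the following sketch shows one plausible free callback. It reclaims a region only when that region is the last part of the allocated range, by lowering the end-of-address marker, and otherwise leaks the addresses. demo_free_file_t and demo_free are hypothetical names and the simplified signature is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical file handle with just the end-of-address marker. */
typedef struct demo_free_file_t {
    uint64_t eoa; /* first format address never allocated */
} demo_free_file_t;

/* Free `size` bytes starting at `addr`. Returns 0 on success and -1 if the
 * region extends past the end-of-address marker. */
static int
demo_free(demo_free_file_t *file, int type, uint64_t addr, uint64_t size)
{
    (void)type;
    if (size > file->eoa || addr > file->eoa - size)
        return -1; /* region was never allocated */
    if (addr + size == file->eoa)
        file->eoa = addr; /* last region: shrink the address space */
    /* Any other region is simply leaked by this sketch. */
    return 0;
}
```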

+ + +

Querying Address Range

+ +

+Each file driver must have some mechanism for setting and querying the end of +address, or EOA, marker. The EOA marker is the first format address +after the last format address ever allocated. If the last part of the +allocated address range is freed then the driver may optionally decrease the +EOA marker. + +

+

+

+
Function: static haddr_t get_eoa (H5FD_t *file) +
+ +

+

+This function returns the current value of the EOA marker for the specified +file. +

+ +

+

+Example: The sec2 driver just returns the current eoa marker value +which is cached in the file structure: + +

+ +
+static haddr_t
+H5FD_sec2_get_eoa(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    return file->eoa;
+}
+
+ +

+The eoa marker is initially zero when a file is opened and the library may set +it to some other value shortly after the file is opened (after the superblock +is read and the saved eoa marker is determined) or when allocating additional +memory in the absence of an alloc callback (described above). + +

+

+Example: The sec2 driver simply caches the eoa marker in the file +structure and does not extend the underlying Unix file. When the file is +flushed or closed then the Unix file size is extended to match the eoa marker. + +

+ +
+static herr_t
+H5FD_sec2_set_eoa(H5FD_t *_file, haddr_t addr)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+    file->eoa = addr;
+    return 0;
+}
+
+ + + +

Data Functions

+ +

+These functions operate on data, transferring a region of the format address +space between memory and files. + +

+ + + +

Contiguous I/O Functions

+ +

+A driver must specify two functions to transfer data from the library to the +file and vice versa. + +

+

+

+
Function: static herr_t read (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buf) +
+
Function: static herr_t write (H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buf) +
+ +

+

+The read function reads data from file file beginning at address +addr and continuing for size bytes into the buffer buf +supplied by the caller. The write function transfers data in the +opposite direction. Both functions take a data transfer property list +dxpl which indicates the fine points of how the data is to be +transferred and which comes directly from the H5Dread or +H5Dwrite function. Both functions receive the type of +data being transferred, which may allow a driver to tune its behavior for +different kinds of data. +

+ +

+

+Both functions should return a negative value if they fail to transfer the +requested data, or non-negative if they succeed. The library will never +attempt to read from unallocated regions of the format address space. + +

+

+Example: The sec2 driver just makes system calls. It tries not to +call lseek if the current operation is the same as the previous +operation and the file position is correct. It also fills the output buffer +with zeros when reading between the current EOF and EOA markers and restarts +system calls which were interrupted. + +

+ +
+static herr_t
+H5FD_sec2_read(H5FD_t *_file, H5FD_mem_t type/*unused*/, hid_t dxpl_id/*unused*/,
+        haddr_t addr, hsize_t size, void *buf/*out*/)
+{
+    H5FD_sec2_t         *file = (H5FD_sec2_t*)_file;
+    ssize_t             nbytes;
+
+    assert(file && file->pub.cls);
+    assert(buf);
+
+    /* Check for overflow conditions */
+    if (REGION_OVERFLOW(addr, size)) return -1;
+    if (addr+size>file->eoa) return -1;
+
+    /* Seek to the correct location */
+    if ((addr!=file->pos || OP_READ!=file->op) &&
+        file_seek(file->fd, (file_offset_t)addr, SEEK_SET)<0) {
+        file->pos = HADDR_UNDEF;
+        file->op = OP_UNKNOWN;
+        return -1;
+    }
+
+    /*
+     * Read data, being careful of interrupted system calls, partial results,
+     * and the end of the file.
+     */
+    while (size>0) {
+        do nbytes = read(file->fd, buf, size);
+        while (-1==nbytes && EINTR==errno);
+        if (-1==nbytes) {
+            /* error */
+            file->pos = HADDR_UNDEF;
+            file->op = OP_UNKNOWN;
+            return -1;
+        }
+        if (0==nbytes) {
+            /* end of file but not end of format address space */
+            memset(buf, 0, size);
+            size = 0;
+        }
+        assert(nbytes>=0);
+        assert((hsize_t)nbytes<=size);
+        size -= (hsize_t)nbytes;
+        addr += (haddr_t)nbytes;
+        buf = (char*)buf + nbytes;
+    }
+
+    /* Update current position */
+    file->pos = addr;
+    file->op = OP_READ;
+    return 0;
+}
+
+ +

+Example: The sec2 write callback is similar except it updates +the file EOF marker when extending the file. + +
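The write callback is not listed here, so the following is a sketch reconstructed from the description and from the read callback above, not the actual sec2 source. The stand-in typedefs, the `demo_sec2_t` structure, and the use of plain `lseek` in place of `file_seek` are assumptions made to keep the block self-contained.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-ins for the HDF5 types (assumptions, so the sketch compiles alone). */
typedef uint64_t haddr_t;
typedef uint64_t hsize_t;
typedef int      herr_t;
typedef int      hid_t;
typedef int      H5FD_mem_t;
typedef struct H5FD_t { int unused; } H5FD_t;
#define HADDR_UNDEF ((haddr_t)(-1))

enum { OP_UNKNOWN, OP_READ, OP_WRITE };

/* Hypothetical driver-private file structure, modeled on the read example. */
typedef struct demo_sec2_t {
    H5FD_t  pub; /* public part, must be first    */
    int     fd;  /* Unix file descriptor          */
    haddr_t eoa; /* end of allocated region       */
    haddr_t eof; /* end of file                   */
    haddr_t pos; /* current file I/O position     */
    int     op;  /* last operation on this handle */
} demo_sec2_t;

static herr_t
demo_sec2_write(H5FD_t *_file, H5FD_mem_t type /*unused*/, hid_t dxpl_id /*unused*/,
                haddr_t addr, hsize_t size, const void *buf)
{
    demo_sec2_t *file = (demo_sec2_t *)_file;
    ssize_t      nbytes;

    (void)type;
    (void)dxpl_id;

    /* Reject regions that wrap around or extend past the EOA marker */
    if (addr + size < addr || addr + size > file->eoa) return -1;

    /* Seek only if the file position is not already correct */
    if ((addr != file->pos || OP_WRITE != file->op) &&
        lseek(file->fd, (off_t)addr, SEEK_SET) < 0) {
        file->pos = HADDR_UNDEF;
        file->op  = OP_UNKNOWN;
        return -1;
    }

    /* Write data, restarting interrupted calls and handling partial writes */
    while (size > 0) {
        do nbytes = write(file->fd, buf, (size_t)size);
        while (-1 == nbytes && EINTR == errno);
        if (-1 == nbytes) {
            file->pos = HADDR_UNDEF;
            file->op  = OP_UNKNOWN;
            return -1;
        }
        size -= (hsize_t)nbytes;
        addr += (haddr_t)nbytes;
        buf = (const char *)buf + nbytes;
    }

    /* Update current position and, if the file grew, the EOF marker */
    file->pos = addr;
    file->op  = OP_WRITE;
    if (file->pos > file->eof) file->eof = file->pos;
    return 0;
}
```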

+ + +

Flushing Cached Data

+ +

+Some drivers may desire to cache data in memory in order to make larger I/O +requests to the underlying file and thus improve bandwidth. Such drivers +should register a cache flushing function so that the library can ensure that +data has been flushed out of the drivers in response to the application +calling H5Fflush. + +

+

+

+
Function: static herr_t flush (H5FD_t *file) +
+ +

+

+Flush all data for file file to storage. +

+ +

+

+Example: The sec2 driver doesn't cache any data but it also doesn't +extend the Unix file as aggressively as it should. Therefore, when finalizing a +file it should write a zero to the last byte of the allocated region so that +when reopening the file later the EOF marker will be at least as large as the +EOA marker saved in the superblock (otherwise HDF5 will refuse to open the +file, claiming that the data appears to be truncated). + +

+ +
+static herr_t
+H5FD_sec2_flush(H5FD_t *_file)
+{
+    H5FD_sec2_t *file = (H5FD_sec2_t*)_file;
+
+    if (file->eoa>file->eof) {
+        if (-1==file_seek(file->fd, file->eoa-1, SEEK_SET)) return -1;
+        if (write(file->fd, "", 1)!=1) return -1;
+        file->eof = file->eoa;
+        file->pos = file->eoa;
+        file->op = OP_WRITE;
+    }
+
+    return 0;
+}
+
+ + + +

Optimization Functions

+ +

+The library is capable of performing several generic optimizations on I/O, but +these types of optimizations may not be appropriate for a given VFL driver. +

+ +

+Each driver may provide a query function to allow the library to query whether +to enable these optimizations. If a driver lacks a query function, the library +will disable all types of optimizations which can be queried. +

+ +

+

+
Function: static herr_t query (const H5FD_t *file, unsigned long *flags) +
+

+

+This function is called by the library to query which optimizations to enable +for I/O to this driver. These are the flags which are currently defined: + +

    +
    +
    H5FD_FEAT_AGGREGATE_METADATA (0x00000001) +
    Defining the H5FD_FEAT_AGGREGATE_METADATA for a VFL driver means that +the library will attempt to allocate a larger block for metadata and +then sub-allocate each metadata request from that larger block. +
    H5FD_FEAT_ACCUMULATE_METADATA (0x00000002) +
    Defining the H5FD_FEAT_ACCUMULATE_METADATA for a VFL driver means that +the library will attempt to cache metadata as it is written to the file +and build up a larger block of metadata to eventually pass to the VFL +'write' routine. +
    H5FD_FEAT_DATA_SIEVE (0x00000004) +
    Defining the H5FD_FEAT_DATA_SIEVE for a VFL driver means that +the library will attempt to cache raw data as it is read from/written to +a file in a "data sieve" buffer. See Rajeev Thakur's papers: +
      +
      +
      http://www.mcs.anl.gov/~thakur/papers/romio-coll.ps.gz +
      http://www.mcs.anl.gov/~thakur/papers/mpio-high-perf.ps.gz +
      +
    +
    +
+

+ +
+
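As an illustration of the flags listed above, a query callback might look like the following sketch. The flag values are the ones given in this section; the `demo_query` name and the stand-in typedefs are assumptions, and a real driver would enable only the optimizations that suit it.

```c
/* Stand-ins for the HDF5 types (assumptions, so the sketch compiles alone). */
typedef int herr_t;
typedef struct H5FD_t { int unused; } H5FD_t;

/* Flag values as listed in the text above */
#define H5FD_FEAT_AGGREGATE_METADATA  0x00000001
#define H5FD_FEAT_ACCUMULATE_METADATA 0x00000002
#define H5FD_FEAT_DATA_SIEVE          0x00000004

/* Tell the library which generic I/O optimizations it may apply. */
static herr_t
demo_query(const H5FD_t *file, unsigned long *flags /*out*/)
{
    (void)file; /* this sketch reports the same features for every file */

    if (flags) {
        *flags = 0;
        *flags |= H5FD_FEAT_AGGREGATE_METADATA;  /* sub-allocate metadata from larger blocks */
        *flags |= H5FD_FEAT_ACCUMULATE_METADATA; /* build up metadata before writing         */
        *flags |= H5FD_FEAT_DATA_SIEVE;          /* cache raw data in a sieve buffer         */
    }
    return 0;
}
```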

+ +

Registration of a Driver

+ +

+Before a driver can be used the HDF5 library needs to be told of its +existence. This is done by registering the driver, which results in a driver +identification number. Instead of passing many arguments to the registration +function, the driver information is entered into a structure and the address +of the structure is passed to the registration function where it is +copied. This allows the HDF5 API to be extended while providing backward +compatibility at the source level. + +

+

+

+
Function: hid_t H5FDregister (H5FD_class_t *cls) +
+ +

+

+The driver described by struct cls is registered with the library and an +ID number for the driver is returned. +

+ +

+

+The H5FD_class_t type is a struct with the following fields: + +

+
+ +
const char *name +
+A pointer to a constant, null-terminated driver name to be used for debugging +purposes. +
size_t fapl_size +
+The size in bytes of the file access mode structure or zero if the driver +supplies a copy function or doesn't define the structure. +
void *(*fapl_copy)(const void *fapl) +
+An optional function which copies a driver-defined file access mode structure. +This field takes precedence over fapl_size when both are defined. +
void (*fapl_free)(void *fapl) +
+An optional function to free the driver-defined file access mode structure. If +null, then the library calls the C free function to free the +structure. +
size_t dxpl_size +
+The size in bytes of the data transfer mode structure or zero if the driver +supplies a copy function or doesn't define the structure. +
void *(*dxpl_copy)(const void *dxpl) +
+An optional function which copies a driver-defined data transfer mode +structure. This field takes precedence over dxpl_size when both are +defined. +
void (*dxpl_free)(void *dxpl) +
+An optional function to free the driver-defined data transfer mode +structure. If null, then the library calls the C free function to +free the structure. +
H5FD_t *(*open)(const char *name, unsigned flags, hid_t fapl, haddr_t maxaddr) +
+The function which opens or creates a new file. +
herr_t (*close)(H5FD_t *file) +
+The function which ends access to a file. +
int (*cmp)(const H5FD_t *f1, const H5FD_t *f2) +
+An optional function to determine whether two open files have the same key. If +this function is not present then the library assumes that two files will +never be the same. +
herr_t (*query)(const H5FD_t *f, unsigned long *flags) +
+An optional function to determine which library optimizations a driver can +support. +
haddr_t (*alloc)(H5FD_t *file, H5FD_mem_t type, hsize_t size) +
+An optional function to allocate space in the file. +
herr_t (*free)(H5FD_t *file, H5FD_mem_t type, haddr_t addr, hsize_t size) +
+An optional function to free space in the file. +
haddr_t (*get_eoa)(H5FD_t *file) +
+A function to query how much of the format address space has been allocated. +
herr_t (*set_eoa)(H5FD_t *file, haddr_t) +
+A function to set the end of address space. +
haddr_t (*get_eof)(H5FD_t *file) +
+A function to return the current end-of-file marker value. +
herr_t (*read)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, void *buffer) +
+A function to read data from a file. +
herr_t (*write)(H5FD_t *file, H5FD_mem_t type, hid_t dxpl, haddr_t addr, hsize_t size, const void *buffer) +
+A function to write data to a file. +
herr_t (*flush)(H5FD_t *file) +
+A function which flushes cached data to the file. +
H5FD_mem_t fl_map[H5FD_MEM_NTYPES] +
+An array which maps a file allocation request type to a free list. +
+ +

+Example: The sec2 driver would be registered as: + +

+ +
+static const H5FD_class_t H5FD_sec2_g = {
+    "sec2",                                     /*name                  */
+    MAXADDR,                                    /*maxaddr               */
+    NULL,                                       /*sb_size               */
+    NULL,                                       /*sb_encode             */
+    NULL,                                       /*sb_decode             */
+    0,                                          /*fapl_size             */
+    NULL,                                       /*fapl_get              */
+    NULL,                                       /*fapl_copy             */
+    NULL,                                       /*fapl_free             */
+    0,                                          /*dxpl_size             */
+    NULL,                                       /*dxpl_copy             */
+    NULL,                                       /*dxpl_free             */
+    H5FD_sec2_open,                             /*open                  */
+    H5FD_sec2_close,                            /*close                 */
+    H5FD_sec2_cmp,                              /*cmp                   */
+    H5FD_sec2_query,                            /*query                 */
+    NULL,                                       /*alloc                 */
+    NULL,                                       /*free                  */
+    H5FD_sec2_get_eoa,                          /*get_eoa               */
+    H5FD_sec2_set_eoa,                          /*set_eoa               */
+    H5FD_sec2_get_eof,                          /*get_eof               */
+    H5FD_sec2_read,                             /*read                  */
+    H5FD_sec2_write,                            /*write                 */
+    H5FD_sec2_flush,                            /*flush                 */
+    H5FD_FLMAP_SINGLE,                          /*fl_map                */
+};
+
+hid_t
+H5FD_sec2_init(void)
+{
+    if (!H5FD_SEC2_g) {
+        H5FD_SEC2_g = H5FDregister(&H5FD_sec2_g);
+    }
+    return H5FD_SEC2_g;
+}
+
+ +

+A driver can be removed from the library by unregistering it. + +

+

+

+
Function: herr_t H5FDunregister (hid_t driver) +
+Where driver is the ID number returned when the driver was registered. +
+ +

+

+Unregistering a driver makes it unusable for creating new file access or data +transfer property lists but doesn't affect any property lists or files that +already use that driver. + +

+ + + + +

Programming Note +for C++ Developers Using C Functions

+ +

If a C routine that takes a function pointer as an argument is +called from within C++ code, the C routine should be returned from +normally.

+ +

Examples of this kind of routine include callbacks such as +H5Pset_elink_cb and H5Pset_type_conv_cb +and functions such as H5Tconvert and +H5Ewalk2.

+ +

Exiting the routine in its normal fashion allows the HDF5 C +Library to clean up its work properly. In other words, if the C++ +application jumps out of the routine back to the C++ +“catch” statement, the library is not given the +opportunity to close any temporary data structures that were set +up when the routine was called. The C++ application should save +some state as the routine is started so that any problem that +occurs might be diagnosed.

+ + + + + + + +

Querying Driver Information

+ +

+

+
Function: void * H5Pget_driver_data (hid_t fapl) +
+
Function: void * H5Pget_driver_data (hid_t fxpl) +
+ +

+

+This function is intended to be used by driver functions, not applications. +It returns a pointer directly into the file access property list +fapl which is a copy of the driver's file access mode originally +provided to the H5Pset_driver function. If its argument is a data +transfer property list fxpl then it returns a pointer to the +driver-specific data transfer information instead. +

+ +

+ + + +

Miscellaneous

+ +

+The various private H5F_low_* functions will be replaced by public +H5FD* functions so they can be called from drivers. + +

+

+All private functions H5F_addr_* which operate on addresses will be +renamed as public functions by removing the first underscore so they can be +called by drivers. + +

+

+The haddr_t address data type will be passed by value throughout the +library. The original intent was that this type would eventually be a union of +file address types for the various drivers and may become quite large, but +that was back when drivers were part of HDF5. It will become an alias for an +unsigned integer type (32 or 64 bits depending on how the library was +configured). + +

+

+The various H5F*.c driver files will be renamed H5FD*.c and each +will have a corresponding header file. All driver functions except the +initializer and API will be declared static. + +

+

+This documentation didn't cover optimization functions which would be useful +to drivers like MPI-IO. Some drivers may be able to perform data pipeline +operations more efficiently than HDF5 and need to be given a chance to +override those parts of the pipeline. The pipeline would be designed to call +various H5FD optimization functions at various points which return one of +three values: the operation is not implemented by the driver, the operation is +implemented but failed in a non-recoverable manner, the operation is +implemented and succeeded. + +
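The three outcomes described above could be modeled roughly as follows. This is a sketch of the idea only: no such interface is defined in this document, and every name here (`demo_opt_status_t`, `demo_pipeline_step`, and so on) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical three-valued result of a driver optimization hook */
typedef enum demo_opt_status_t {
    DEMO_OPT_UNSUPPORTED, /* operation is not implemented by the driver */
    DEMO_OPT_FAILED,      /* implemented, but failed non-recoverably    */
    DEMO_OPT_SUCCEEDED    /* implemented and succeeded                  */
} demo_opt_status_t;

typedef demo_opt_status_t (*demo_opt_fn)(void *buf, size_t size);

/* Stand-in for the library's generic implementation of a pipeline step */
static int
demo_generic_step(void *buf, size_t size)
{
    (void)buf;
    (void)size;
    return 0;
}

/* Two sample driver hooks, used only to exercise the dispatcher */
static demo_opt_status_t
demo_driver_declines(void *buf, size_t size)
{
    (void)buf;
    (void)size;
    return DEMO_OPT_UNSUPPORTED;
}

static demo_opt_status_t
demo_driver_fails(void *buf, size_t size)
{
    (void)buf;
    (void)size;
    return DEMO_OPT_FAILED;
}

/* Give the driver first chance; fall back to the generic path if it declines. */
static int
demo_pipeline_step(demo_opt_fn driver_step, void *buf, size_t size)
{
    demo_opt_status_t status =
        driver_step ? driver_step(buf, size) : DEMO_OPT_UNSUPPORTED;

    switch (status) {
        case DEMO_OPT_SUCCEEDED:
            return 0;
        case DEMO_OPT_FAILED:
            return -1; /* do not retry: the driver reported a hard failure */
        case DEMO_OPT_UNSUPPORTED:
        default:
            return demo_generic_step(buf, size);
    }
}
```

Keeping "not implemented" distinct from "failed" is what lets the pipeline fall back silently in one case and report an error in the other.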

+

+Various parts of HDF5 check only the top-level file driver and do +something special if it is the MPI-IO driver. However, we might want to be +able to put the MPI-IO driver under other drivers such as the raw part of a +split driver or under a debug driver whose sole purpose is to accumulate +statistics as it passes all requests through to the MPI-IO driver. Therefore +we will probably need a function which takes a format address and/or object +type and returns the driver which would have been used at the lowest level to +process the request. + +

+ +


+

Footnotes

+

(1)

+

The driver name is by convention and might +not apply to drivers which are not distributed with HDF5. +

(2)

+

The access method also indicates how to translate +the storage name to a storage server such as a file, network protocol, or +memory. +

(3)

+

The term +"file access property list" is a misnomer since storage isn't +required to be a file. +

(4)

+

This +function is overloaded to operate on data transfer property lists also, as +described below. +

(5)

+

Read-only access is only appropriate when opening an existing +file. +

(6)

+

For instance, writing data to one handle will cause +the data to be immediately visible on the other handle. +

(7)

+

The ordering is +arbitrary as long as it's consistent within a particular file driver. +

(8)

+

File access modes do not describe data, but rather +describe how the HDF5 format address space is mapped to the underlying +file(s). Thus, in general the mapping must be known before the file superblock +can be read. However, the user usually knows enough about the mapping for the +superblock to be readable and once the superblock is read the library can fill +in the missing parts of the mapping. +


+ + + + + diff --git a/doxygen/hdf5_footer.html b/doxygen/hdf5_footer.html new file mode 100644 index 0000000..520f3f5 --- /dev/null +++ b/doxygen/hdf5_footer.html @@ -0,0 +1,21 @@ + + +

+ + + + + + + diff --git a/doxygen/hdf5_header.html b/doxygen/hdf5_header.html new file mode 100644 index 0000000..4a575d6 --- /dev/null +++ b/doxygen/hdf5_header.html @@ -0,0 +1,61 @@ + + + + + + + +$projectname: $title +$title + + + +$treeview +$search +$mathjax + + + + + + + + +
Please, help us to better know about our user community by answering the following short survey: https://www.hdfgroup.org/
+ +
+ + +
+ + + + + + + + + + + + + + + + + + + + + +
+
$projectname +  $projectnumber +
+
$projectbrief
+
+
$projectbrief
+
$searchbox
+
+ + diff --git a/doxygen/hdf5_navtree_hacks.js b/doxygen/hdf5_navtree_hacks.js new file mode 100644 index 0000000..942970c --- /dev/null +++ b/doxygen/hdf5_navtree_hacks.js @@ -0,0 +1,246 @@ + +// generate a table of contents in the side-nav based on the h1/h2 tags of the current page. +function generate_autotoc() { + var headers = $("h1, h2"); + if(headers.length > 1) { + var toc = $("#side-nav").append(''); + toc = $("#nav-toc"); + var footer = $("#nav-path"); + var footerHeight = footer.height(); + toc = toc.append('
    '); + toc = toc.find('ul'); + var indices = new Array(); + indices[0] = 0; + indices[1] = 0; + + var h1counts = $("h1").length; + headers.each(function(i) { + var current = $(this); + var levelTag = current[0].tagName.charAt(1); + if(h1counts==0) + levelTag--; + var cur_id = current.attr("id"); + + indices[levelTag-1]+=1; + var prefix = indices[0]; + if (levelTag >1) { + prefix+="."+indices[1]; + } + + // Uncomment to add number prefixes + // current.html(prefix + " " + current.html()); + for(var l = levelTag; l < 2; ++l){ + indices[l] = 0; + } + + if(cur_id == undefined) { + current.attr('id', 'title' + i); + current.addClass('anchor'); + toc.append("
  • " + current.text() + "
  • "); + } else { + toc.append("
  • " + current.text() + "
  • "); + } + }); + resizeHeight(); + } +} + + +var global_navtree_object; + +// Overloaded to remove links to sections/subsections +function getNode(o, po) +{ + po.childrenVisited = true; + var l = po.childrenData.length-1; + for (var i in po.childrenData) { + var nodeData = po.childrenData[i]; + if((!nodeData[1]) || (nodeData[1].indexOf('#')==-1)) // <- we added this line + po.children[i] = newNode(o, po, nodeData[0], nodeData[1], nodeData[2], i==l); + } +} + +// Overloaded to adjust the size of the navtree wrt the toc +function resizeHeight() +{ + var header = $("#top"); + var sidenav = $("#side-nav"); + var content = $("#doc-content"); + var navtree = $("#nav-tree"); + var footer = $("#nav-path"); + var toc = $("#nav-toc"); + + var headerHeight = header.outerHeight(); + var footerHeight = footer.outerHeight(); + var tocHeight = toc.height(); + var windowHeight = $(window).height() - headerHeight - footerHeight; + content.css({height:windowHeight + "px"}); + navtree.css({height:(windowHeight-tocHeight) + "px"}); + sidenav.css({height:windowHeight + "px"}); +} + +// Overloaded to save the root node into global_navtree_object +function initNavTree(toroot,relpath) +{ + var o = new Object(); + global_navtree_object = o; // <- we added this line + o.toroot = toroot; + o.node = new Object(); + o.node.li = document.getElementById("nav-tree-contents"); + o.node.childrenData = NAVTREE; + o.node.children = new Array(); + o.node.childrenUL = document.createElement("ul"); + o.node.getChildrenUL = function() { return o.node.childrenUL; }; + o.node.li.appendChild(o.node.childrenUL); + o.node.depth = 0; + o.node.relpath = relpath; + o.node.expanded = false; + o.node.isLast = true; + o.node.plus_img = document.createElement("img"); + o.node.plus_img.src = relpath+"ftv2pnode.png"; + o.node.plus_img.width = 16; + o.node.plus_img.height = 22; + + if (localStorageSupported()) { + var navSync = $('#nav-sync'); + if (cachedLink()) { + showSyncOff(navSync,relpath); + 
navSync.removeClass('sync'); + } else { + showSyncOn(navSync,relpath); + } + navSync.click(function(){ toggleSyncButton(relpath); }); + } + + navTo(o,toroot,window.location.hash,relpath); + + $(window).bind('hashchange', function(){ + if (window.location.hash && window.location.hash.length>1){ + var a; + if ($(location).attr('hash')){ + var clslink=stripPath($(location).attr('pathname'))+':'+ + $(location).attr('hash').substring(1); + a=$('.item a[class$="'+clslink+'"]'); + } + if (a==null || !$(a).parent().parent().hasClass('selected')){ + $('.item').removeClass('selected'); + $('.item').removeAttr('id'); + } + var link=stripPath2($(location).attr('pathname')); + navTo(o,link,$(location).attr('hash'),relpath); + } else if (!animationInProgress) { + $('#doc-content').scrollTop(0); + $('.item').removeClass('selected'); + $('.item').removeAttr('id'); + navTo(o,toroot,window.location.hash,relpath); + } + }) + + $(window).on("load", showRoot); +} + +// return false if the the node has no children at all, or has only section/subsection children +function checkChildrenData(node) { + if (!(typeof(node.childrenData)==='string')) { + for (var i in node.childrenData) { + var url = node.childrenData[i][1]; + if(url.indexOf("#")==-1) + return true; + } + return false; + } + return (node.childrenData); +} + +// Modified to: +// 1 - remove the root node +// 2 - remove the section/subsection children +function createIndent(o,domNode,node,level) +{ + var level=-2; // <- we replaced level=-1 by level=-2 + var n = node; + while (n.parentNode) { level++; n=n.parentNode; } + if (checkChildrenData(node)) { // <- we modified this line to use checkChildrenData(node) instead of node.childrenData + var imgNode = document.createElement("span"); + imgNode.className = 'arrow'; + imgNode.style.paddingLeft=(16*level).toString()+'px'; + imgNode.innerHTML=arrowRight; + node.plus_img = imgNode; + node.expandToggle = document.createElement("a"); + node.expandToggle.href = "javascript:void(0)"; + 
node.expandToggle.onclick = function() { + if (node.expanded) { + $(node.getChildrenUL()).slideUp("fast"); + node.plus_img.innerHTML=arrowRight; + node.expanded = false; + } else { + expandNode(o, node, false, false); + } + } + node.expandToggle.appendChild(imgNode); + domNode.appendChild(node.expandToggle); + } else { + var span = document.createElement("span"); + span.className = 'arrow'; + span.style.width = 16*(level+1)+'px'; + span.innerHTML = ' '; + domNode.appendChild(span); + } +} + +// Overloaded to automatically expand the selected node +function selectAndHighlight(hash,n) +{ + var a; + if (hash) { + var link=stripPath($(location).attr('pathname'))+':'+hash.substring(1); + a=$('.item a[class$="'+link+'"]'); + } + if (a && a.length) { + a.parent().parent().addClass('selected'); + a.parent().parent().attr('id','selected'); + highlightAnchor(); + } else if (n) { + $(n.itemDiv).addClass('selected'); + $(n.itemDiv).attr('id','selected'); + } + if ($('#nav-tree-contents .item:first').hasClass('selected')) { + $('#nav-sync').css('top','30px'); + } else { + $('#nav-sync').css('top','5px'); + } + expandNode(global_navtree_object, n, true, true); // <- we added this line + showRoot(); +} + + +$(document).ready(function() { + + generate_autotoc(); + + (function (){ // wait until the first "selected" element has been created + try { + + // this line will triger an exception if there is no #selected element, i.e., before the tree structure is complete. + document.getElementById("selected").className = "item selected"; + + // ok, the default tree has been created, we can keep going... 
+ + // expand the "Chapters" node + if(window.location.href.indexOf('unsupported')==-1) + expandNode(global_navtree_object, global_navtree_object.node.children[0].children[2], true, true); + else + expandNode(global_navtree_object, global_navtree_object.node.children[0].children[1], true, true); + + // Hide the root node "HDF5" + $(document.getElementsByClassName('index.html')[0]).parent().parent().css({display:"none"}); + + } catch (err) { + setTimeout(arguments.callee, 10); + } + })(); + + $(window).on("load", resizeHeight); +}); diff --git a/doxygen/hdf5doxy.css b/doxygen/hdf5doxy.css new file mode 100644 index 0000000..8c03860 --- /dev/null +++ b/doxygen/hdf5doxy.css @@ -0,0 +1,251 @@ + +/******** HDF5 specific CSS code ************/ + +/**** Styles removing elements ****/ + +/* remove the "modules|classes" link for module pages (they are already in the TOC) */ +div.summary { + display:none; +} + +/* remove */ +div.contents hr { + display:none; +} + +/**** ****/ + +p, dl.warning, dl.attention, dl.note +{ + max-width:60em; + text-align:justify; +} + +li { + max-width:55em; + text-align:justify; +} + +img { + border: 0; +} + +div.fragment { + display:table; /* this allows the element to be larger than its parent */ + padding: 0pt; +} +pre.fragment { + border: 1px solid #cccccc; + + margin: 2px 0px 2px 0px; + padding: 3px 5px 3px 5px; +} + +/* Common style for all HDF5's tables */ + +table.example, table.manual, table.manual-vl, table.manual-hl { + max-width:100%; + border-collapse: collapse; + border-style: solid; + border-width: 1px; + border-color: #cccccc; + font-size: 1em; + + box-shadow: 5px 5px 5px rgba(0, 0, 0, 0.15); + -moz-box-shadow: 5px 5px 5px rgba(0, 0, 0, 0.15); + -webkit-box-shadow: 5px 5px 5px rgba(0, 0, 0, 0.15); +} + +table.example th, table.manual th, table.manual-vl th, table.manual-hl th { + padding: 0.5em 0.5em 0.5em 0.5em; + text-align: left; + padding-right: 1em; + color: #555555; + background-color: #F4F4E5; + + background-image: 
-webkit-gradient(linear,center top,center bottom,from(#FFFFFF), color-stop(0.3,#FFFFFF), color-stop(0.30,#FFFFFF), color-stop(0.98,#F4F4E5), to(#ECECDE)); + background-image: -moz-linear-gradient(center top, #FFFFFF 0%, #FFFFFF 30%, #F4F4E5 98%, #ECECDE); + filter: progid:DXImageTransform.Microsoft.gradient(startColorstr='#FFFFFF', endColorstr='#F4F4E5'); +} + +table.example td, table.manual td, table.manual-vl td, table.manual-hl td { + vertical-align:top; + border-width: 1px; + border-color: #cccccc; +} + +/* header of headers */ +table th.meta { + text-align:center; + font-size: 1.2em; + background-color:#FFFFFF; +} + +/* intermediate header */ +table th.inter { + text-align:left; + background-color:#FFFFFF; + background-image:none; + border-style:solid solid solid solid; + border-width: 1px; + border-color: #cccccc; +} + +/** class for example / output tables **/ + +table.example { +} + +table.example th { +} + +table.example td { + padding: 0.5em 0.5em 0.5em 0.5em; + vertical-align:top; +} + +/* standard class for the manual */ + +table.manual, table.manual-vl, table.manual-hl { + padding: 0.2em 0em 0.5em 0em; +} + +table.manual th, table.manual-vl th, table.manual-hl th { + margin: 0em 0em 0.3em 0em; +} + +table.manual td, table.manual-vl td, table.manual-hl td { + padding: 0.3em 0.5em 0.3em 0.5em; + vertical-align:top; + border-width: 1px; +} + +table.manual td.alt, table.manual tr.alt, table.manual-vl td.alt, table.manual-vl tr.alt { + background-color: #F4F4E5; +} + +table.manual-vl th, table.manual-vl td, table.manual-vl td.alt { + border-color: #cccccc; + border-width: 1px; + border-style: none solid none solid; +} + +table.manual-vl th.inter { + border-style: solid solid solid solid; +} + +table.manual-hl td { + border-color: #cccccc; + border-width: 1px; + border-style: solid none solid none; +} + +table td.code { + font-family: monospace; +} + +h2 { + margin-top:2em; + border-style: none none solid none; + border-width: 1px; + border-color: #cccccc; 
+} + +/**** Table of content in the side-nav ****/ + + +div.toc { + margin:0; + padding: 0.3em 0 0 0; + width:100%; + float:none; + position:absolute; + bottom:0; + border-radius:0px; + border-style: solid none none none; + max-height:50%; + overflow-y: scroll; +} + +div.toc h3 { + margin-left: 0.5em; + margin-bottom: 0.2em; +} + +div.toc ul { + margin: 0.2em 0 0.4em 0.5em; +} + +span.cpp11,span.cpp14,span.cpp17 { + color: #119911; + font-weight: bold; +} + +.newin3x { + color: #a37c1a; + font-weight: bold; +} + +div.warningbox { + max-width:60em; + border-style: solid solid solid solid; + border-color: red; + border-width: 3px; +} + +/**** old HDF5's styles ****/ + + +table.tutorial_code td { + border-color: transparent; /* required for Firefox */ + padding: 3pt 5pt 3pt 5pt; + vertical-align: top; +} + + +/* Whenever doxygen meets a '\n' or a '
    ', it will put + * the text containing the character into a

    . + * This little hack together with table.tutorial_code td.note + * aims at fixing this issue. */ +table.tutorial_code td.note p.starttd { + margin: 0px; + border: none; + padding: 0px; +} + +div.eimainmenu { + text-align: center; +} + +/* center version number on main page */ +h3.version { + text-align: center; +} + + +td.width20em p.endtd { + width: 20em; +} + +/* needed for huge screens */ +.ui-resizable-e { + background-repeat: repeat-y; +} + +/* Style external links -- nav-tree is different */ + +#nav-tree .label a { + padding:2px 16px 2px 2px; +} + +a { + outline: none; + text-decoration: none; + padding: 2px 1px 0; +} + +a[href*="http"] { + background: url('https://mdn.mozillademos.org/files/12982/external-link-52.png') no-repeat 100% 0; + background-size: 12px 12px; + padding-right: 16px; +} diff --git a/doxygen/hdf5doxy_layout.xml b/doxygen/hdf5doxy_layout.xml new file mode 100644 index 0000000..7f71c24 --- /dev/null +++ b/doxygen/hdf5doxy_layout.xml @@ -0,0 +1,182 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/doxygen/img/FF-IH_FileGroup.gif b/doxygen/img/FF-IH_FileGroup.gif new file mode 100644 index 0000000..b0d76f5 Binary files /dev/null and b/doxygen/img/FF-IH_FileGroup.gif differ diff --git a/doxygen/img/FF-IH_FileObject.gif b/doxygen/img/FF-IH_FileObject.gif new file mode 100644 index 0000000..8eba623 Binary files /dev/null and b/doxygen/img/FF-IH_FileObject.gif differ diff --git a/doxygen/img/FileFormatSpecChunkDiagram.jpg b/doxygen/img/FileFormatSpecChunkDiagram.jpg new file mode 100644 index 0000000..03fd90a Binary files /dev/null and b/doxygen/img/FileFormatSpecChunkDiagram.jpg differ diff --git 
a/doxygen/img/HDFG-logo.png b/doxygen/img/HDFG-logo.png index a2d52a9..38300ff 100644 Binary files a/doxygen/img/HDFG-logo.png and b/doxygen/img/HDFG-logo.png differ diff --git a/doxygen/img/PaletteExample1.gif b/doxygen/img/PaletteExample1.gif new file mode 100644 index 0000000..8694d9d Binary files /dev/null and b/doxygen/img/PaletteExample1.gif differ diff --git a/doxygen/img/Palettes.fm.anc.gif b/doxygen/img/Palettes.fm.anc.gif new file mode 100644 index 0000000..d344c03 Binary files /dev/null and b/doxygen/img/Palettes.fm.anc.gif differ diff --git a/doxygen/img/ftv2node.png b/doxygen/img/ftv2node.png new file mode 100644 index 0000000..63c605b Binary files /dev/null and b/doxygen/img/ftv2node.png differ diff --git a/doxygen/img/ftv2pnode.png b/doxygen/img/ftv2pnode.png new file mode 100644 index 0000000..c6ee22f Binary files /dev/null and b/doxygen/img/ftv2pnode.png differ diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt index 0c92bb2..74fd71f 100644 --- a/src/CMakeLists.txt +++ b/src/CMakeLists.txt @@ -1378,6 +1378,7 @@ if (DOXYGEN_FOUND) set (DOXYGEN_OPTIMIZE_OUTPUT_FOR_C YES) set (DOXYGEN_MACRO_EXPANSION YES) set (DOXYGEN_OUTPUT_DIRECTORY ${HDF5_BINARY_DIR}/hdf5lib_docs) + set (DOXYGEN_EXAMPLES_DIRECTORY ${HDF5_DOXYGEN_DIR}/examples) # This configure and custom target work together # Replace variables inside @@ with the current values diff --git a/src/H5ACpublic.h b/src/H5ACpublic.h index e6cebff..f8f4f28 100644 --- a/src/H5ACpublic.h +++ b/src/H5ACpublic.h @@ -442,124 +442,347 @@ extern "C" { #define H5AC_METADATA_WRITE_STRATEGY__PROCESS_0_ONLY 0 #define H5AC_METADATA_WRITE_STRATEGY__DISTRIBUTED 1 +/** + * H5AC_cache_config_t is a public structure intended for use in public APIs. + * At least in its initial incarnation, it is basically a copy of \c struct + * \c H5C_auto_size_ctl_t, minus the \c report_fcn field, and plus the + * \c dirty_bytes_threshold field. 
+ * + * The \c report_fcn field is omitted, as including it would require us to make the + * \c H5C_t structure public. + * + * The \c dirty_bytes_threshold field does not appear in \c H5C_auto_size_ctl_t, + * as synchronization between caches on different processes is handled at the \c + * H5AC level, not at the level of \c H5C. Note however that there is + * considerable interaction between this value and the other fields in this + * structure. + * + * Similarly, the \c open_trace_file, \c close_trace_file, and \c + * trace_file_name fields do not appear in \c H5C_auto_size_ctl_t, as most trace + * file issues are handled at the \c H5AC level. The one exception is storage + * of the pointer to the trace file, which is handled by \c H5C. + * + * The structure is in H5ACpublic.h as we may wish to allow different + * configuration options for metadata and raw data caches. + */ + +//! typedef struct H5AC_cache_config_t { /* general configuration fields: */ + //! int version; + /**< Integer field indicating the version of the H5AC_cache_config_t + * in use. This field should be set to #H5AC__CURR_CACHE_CONFIG_VERSION + * (defined in H5ACpublic.h). */ hbool_t rpt_fcn_enabled; + /**< Boolean flag indicating whether the adaptive cache resize report + * function is enabled. This field should almost always be set to disabled + * (0). Since resize algorithm activity is reported via stdout, it MUST be + * set to disabled (0) on Windows machines.\n + * The report function is not supported code, and can be expected to change + * between versions of the library. Use it at your own risk. */ hbool_t open_trace_file; + /**< Boolean field indicating whether the + * \ref H5AC_cache_config_t.trace_file_name "trace_file_name" + * field should be used to open a trace file for the cache.\n + * The trace file is a debugging feature that allows the capture + * of top level metadata cache requests for purposes of debugging + * and/or optimization. 
This field should normally be set to 0, as + * trace file collection imposes considerable overhead.\n + * This field should only be set to 1 when the + * \ref H5AC_cache_config_t.trace_file_name "trace_file_name" + * contains the full path of the desired trace file, and either + * there is no open trace file on the cache, or the + * \ref H5AC_cache_config_t.close_trace_file "close_trace_file" + * field is also 1.\n + * The trace file feature is unsupported unless used at the + * direction of The HDF Group. It is intended to allow The HDF + * Group to collect a trace of cache activity in cases of occult + * failures and/or poor performance seen in the field, so as to aid + * in reproduction in the lab. If you use it absent the direction + * of The HDF Group, you are on your own. */ + hbool_t close_trace_file; - char trace_file_name[H5AC__MAX_TRACE_FILE_NAME_LEN + 1]; + /**< Boolean field indicating whether the current trace file + *(if any) should be closed.\n + * See the above comments on the \ref H5AC_cache_config_t.open_trace_file + * "open_trace_file" field. This field should be set to 0 unless there is + * an open trace file on the cache that you wish to close.\n + * The trace file feature is unsupported unless used at the direction of + * The HDF Group. It is intended to allow The HDF Group to collect a trace + * of cache activity in cases of occult failures and/or poor performance + * seen in the field, so as to aid in reproduction in the lab. If you use + * it absent the direction of The HDF Group, you are on your own. 
*/ + + char trace_file_name[H5AC__MAX_TRACE_FILE_NAME_LEN + 1]; + /**< Full path of the trace file to be opened if the + * \ref H5AC_cache_config_t.open_trace_file "open_trace_file" field is set + * to 1.\n + * In the parallel case, an ASCII representation of the MPI rank of the + * process will be appended to the file name to yield a unique trace file + * name for each process.\n + * The length of the path must not exceed #H5AC__MAX_TRACE_FILE_NAME_LEN + * characters.\n + * The trace file feature is unsupported unless used at the direction of + * The HDF Group. It is intended to allow The HDF Group to collect a trace + * of cache activity in cases of occult failures and/or poor performance + * seen in the field, so as to aid in reproduction in the lab. If you use + * it absent the direction of The HDF Group, you are on your own. */ hbool_t evictions_enabled; + /**< A boolean flag indicating whether evictions from the metadata cache + * are enabled. This flag is initially set to enabled (1).\n + * In rare circumstances, the raw data throughput requirements may be so high + * that the user wishes to postpone metadata writes so as to reserve I/O + * throughput for raw data. The \p evictions_enabled field exists to allow + * this. However, this is an extreme step, and you have no business doing + * it unless you have read the User Guide section on metadata caching, and + * have considered all other options carefully.\n + * The \p evictions_enabled field may not be set to disabled (0) + * unless all adaptive cache resizing code is disabled via the + * \ref H5AC_cache_config_t.incr_mode "incr_mode", + * \ref H5AC_cache_config_t.flash_incr_mode "flash_incr_mode", and + * \ref H5AC_cache_config_t.decr_mode "decr_mode" fields.\n + * When this flag is set to disabled (\c 0), the metadata cache will not + * attempt to evict entries to make space for new entries, and thus will + * grow without bound.\n + * Evictions will be re-enabled when this field is set back to \c 1. 
+ * This should be done as soon as possible. */ hbool_t set_initial_size; - size_t initial_size; + /**< Boolean flag indicating whether the cache should be created + * with a user-specified initial size. */ + + size_t initial_size; + /**< If \ref H5AC_cache_config_t.set_initial_size "set_initial_size" + * is set to 1, \p initial_size must contain the desired initial size in + * bytes. This value must lie in the closed interval + * [ \p min_size, \p max_size ] (see below). */ double min_clean_fraction; + /**< This field specifies the minimum fraction of the cache + * that must be kept either clean or empty.\n + * The value must lie in the interval [0.0, 1.0]. 0.01 is a good place to + * start in the serial case. In the parallel case, a larger value is needed + * -- see the overview of the metadata cache in the + * “Metadata Caching in HDF5” section of the HDF5 User’s Guide + * for details. */ size_t max_size; + /**< Upper bound (in bytes) on the range of values that the + * adaptive cache resize code can select as the maximum cache size. */ + size_t min_size; + /**< Lower bound (in bytes) on the range of values that the + * adaptive cache resize code can select as the minimum cache size. */ long int epoch_length; + /**< Number of cache accesses between runs of the adaptive cache resize + * code. 50,000 is a good starting number. */ + //! /* size increase control fields: */ + //! enum H5C_cache_incr_mode incr_mode; + /**< Enumerated value indicating the operational mode of the automatic + * cache size increase code. At present, only two values listed in + * #H5C_cache_incr_mode are legal. 
*/ double lower_hr_threshold; + /**< Hit rate threshold used by the hit rate threshold cache size + * increment algorithm.\n + * When the hit rate over an epoch is below this threshold and the cache + * is full, the maximum size of the cache is multiplied by increment + * (below), and then clipped as necessary to stay within \p max_size, and + * possibly \p max_increment.\n + * This field must lie in the interval [0.0, 1.0]. 0.8 or 0.9 is a good + * place to start. */ double increment; + /**< Factor by which the hit rate threshold cache size increment + * algorithm multiplies the current cache max size to obtain a tentative + * new cache size.\n + * The actual cache size increase will be clipped to satisfy the \p max_size + * specified in the general configuration, and possibly max_increment + * below.\n + * The parameter must be greater than or equal to 1.0 -- 2.0 is a reasonable + * value.\n + * If you set it to 1.0, you will effectively disable cache size increases. + */ hbool_t apply_max_increment; - size_t max_increment; + /**< Boolean flag indicating whether an upper limit should be applied to + * the size of cache size increases. */ + + size_t max_increment; + /**< Maximum number of bytes by which cache size can be increased in a + * single step -- if applicable. */ enum H5C_cache_flash_incr_mode flash_incr_mode; - double flash_multiple; - double flash_threshold; + /**< Enumerated value indicating the operational mode of the flash cache + * size increase code. At present, only two listed values in + * #H5C_cache_flash_incr_mode are legal.*/ + + double flash_multiple; + /**< The factor by which the size of the triggering entry / entry size + * increase is multiplied to obtain the initial cache size increment. This + * increment may be reduced to reflect existing free space in the cache and + * the \p max_size field above.\n + * The parameter must lie in the interval [0.0, 1.0]. 
0.1 or 0.05 is a good + * place to start.\n + * At present, this field must lie in the range [0.1, 10.0]. */ + + double flash_threshold; + /**< The factor by which the current maximum cache size is multiplied to + * obtain the minimum size entry / entry size increase which may trigger a + * flash cache size increase. \n + * At present, this value must lie in the range [0.1, 1.0]. */ + //! /* size decrease control fields: */ + //! enum H5C_cache_decr_mode decr_mode; + /**< Enumerated value indicating the operational mode of the automatic + * cache size decrease code. At present, the values listed in + * #H5C_cache_decr_mode are legal.*/ double upper_hr_threshold; + /**< Hit rate threshold for the hit rate threshold and ageout with hit + * rate threshold cache size decrement algorithms.\n + * When \p decr_mode is #H5C_decr__threshold, and the hit rate over a given + * epoch exceeds the supplied threshold, the current maximum cache + * size is multiplied by decrement to obtain a tentative new (and smaller) + * maximum cache size.\n + * When \p decr_mode is #H5C_decr__age_out_with_threshold, there is + * no attempt to find and evict aged out entries unless the hit rate in + * the previous epoch exceeded the supplied threshold.\n + * This field must lie in the interval [0.0, 1.0].\n + * For #H5C_decr__threshold, .9995 or .99995 is a good place to start.\n + * For #H5C_decr__age_out_with_threshold, .999 might be more useful.*/ double decrement; + /**< In the hit rate threshold cache size decrease algorithm, this + * parameter contains the factor by which the current max cache size is + * multiplied to produce a tentative new cache size.\n + * The actual cache size decrease will be clipped to satisfy the + * \ref H5AC_cache_config_t.min_size "min_size" specified in the general + * configuration, and possibly \ref H5AC_cache_config_t.max_decrement + * "max_decrement".\n + * The parameter must be in the interval [0.0, 1.0].\n + * If you set it to 1.0, you will effectively 
+ * disable cache size decreases. 0.9 is a reasonable starting point. */ hbool_t apply_max_decrement; - size_t max_decrement; + /**< Boolean flag indicating whether an upper limit should be applied to + * the size of cache size decreases. */ + + size_t max_decrement; + /**< Maximum number of bytes by which the maximum cache size can be + * decreased in any single step -- if applicable.*/ int epochs_before_eviction; + /**< In the ageout based cache size reduction algorithms, this field + * contains the minimum number of epochs an entry must remain unaccessed in + * cache before the cache size reduction algorithm tries to evict it. 3 is a + * reasonable value. */ hbool_t apply_empty_reserve; - double empty_reserve; + /**< Boolean flag indicating whether the ageout based decrement + * algorithms will maintain an empty reserve when decreasing cache size. */ + + double empty_reserve; + /**< Empty reserve as a fraction of the maximum cache size if applicable.\n When + * so directed, the ageout based algorithms will not decrease the maximum + * cache size unless the empty reserve can be met.\n The parameter must lie + * in the interval [0.0, 1.0]. 0.1 or 0.05 is a good place to start. */ + //! /* parallel configuration fields: */ + //! size_t dirty_bytes_threshold; - int metadata_write_strategy; - + /**< Threshold number of bytes of dirty metadata generation for + * triggering synchronizations of the metadata caches serving the target + * file in the parallel case.\n Synchronization occurs whenever the number + * of bytes of dirty metadata created since the last synchronization exceeds + * this limit.\n This field only applies to the parallel case. While it is + * ignored elsewhere, it can still draw a value out of bounds error.\n It + * must be consistent across all caches on any given file.\n By default, + * this field is set to 256 KB. It shouldn't be more than half the current + * max cache size times the min clean fraction. 
*/ + + int metadata_write_strategy; + /**< Desired metadata write strategy. The valid values for this field + * are:\n #H5AC_METADATA_WRITE_STRATEGY__PROCESS_0_ONLY: Specifies that only + * process zero is allowed to write dirty metadata to disk.\n + * #H5AC_METADATA_WRITE_STRATEGY__DISTRIBUTED: Specifies that process zero + * still makes the decisions as to what entries should be flushed, but the + * actual flushes are distributed across the processes in the computation to + * the extent possible.\n The src/H5ACpublic.h include file in the HDF5 + * library has detailed information on each strategy. */ + //! } H5AC_cache_config_t; - -/**************************************************************************** - * - * structure H5AC_cache_image_config_t - * - * H5AC_cache_image_ctl_t is a public structure intended for use in public - * APIs. At least in its initial incarnation, it is a copy of struct - * H5C_cache_image_ctl_t. - * - * The fields of the structure are discussed individually below: - * - * version: Integer field containing the version number of this version - * of the H5C_image_ctl_t structure. Any instance of - * H5C_image_ctl_t passed to the cache must have a known - * version number, or an error will be flagged. - * - * generate_image: Boolean flag indicating whether a cache image should - * be created on file close. - * - * save_resize_status: Boolean flag indicating whether the cache image - * should include the adaptive cache resize configuration and status. - * Note that this field is ignored at present. - * - * entry_ageout: Integer field indicating the maximum number of - * times a prefetched entry can appear in subsequent cache images. - * This field exists to allow the user to avoid the buildup of - * infrequently used entries in long sequences of cache images. - * - * The value of this field must lie in the range - * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE (-1) to - * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__MAX (100). 
- * - * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE means that no limit - * is imposed on number of times a prefeteched entry can appear - * in subsequent cache images. - * - * A value of 0 prevents prefetched entries from being included - * in cache images. - * - * Positive integers restrict prefetched entries to the specified - * number of appearances. - * - * Note that the number of subsequent cache images that a prefetched - * entry has appeared in is tracked in an 8 bit field. Thus, while - * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__MAX can be increased from its - * current value, any value in excess of 255 will be the functional - * equivalent of H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE. - * - ****************************************************************************/ +//! #define H5AC__CURR_CACHE_IMAGE_CONFIG_VERSION 1 #define H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE -1 #define H5AC__CACHE_IMAGE__ENTRY_AGEOUT__MAX 100 +//! +/** + * H5AC_cache_image_config_t is a public structure intended for use in public + * APIs. At least in its initial incarnation, it is a copy of \c struct \c + * H5C_cache_image_ctl_t. + */ + typedef struct H5AC_cache_image_config_t { - int version; + int version; + /**< Integer field containing the version number of this version of the \c + * H5C_image_ctl_t structure. Any instance of \c H5C_image_ctl_t passed + * to the cache must have a known version number, or an error will be + * flagged. + */ hbool_t generate_image; + /**< Boolean flag indicating whether a cache image should be created on file + * close. + */ hbool_t save_resize_status; - int entry_ageout; + /**< Boolean flag indicating whether the cache image should include the + * adaptive cache resize configuration and status. Note that this field + * is ignored at present. + */ + int entry_ageout; + /**< Integer field indicating the maximum number of times a + * prefetched entry can appear in subsequent cache images. 
This field + * exists to allow the user to avoid the buildup of infrequently used + * entries in long sequences of cache images. + * + * The value of this field must lie in the range \ref + * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE (-1) to \ref + * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__MAX (100). + * + * \ref H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE means that no limit is + * imposed on the number of times a prefetched entry can appear in subsequent + * cache images. + * + * A value of 0 prevents prefetched entries from being included in cache + * images. + * + * Positive integers restrict prefetched entries to the specified number + * of appearances. + * + * Note that the number of subsequent cache images that a prefetched entry + * has appeared in is tracked in an 8 bit field. Thus, while \ref + * H5AC__CACHE_IMAGE__ENTRY_AGEOUT__MAX can be increased from its current + * value, any value in excess of 255 will be the functional equivalent of + * \ref H5AC__CACHE_IMAGE__ENTRY_AGEOUT__NONE. + */ } H5AC_cache_image_config_t; +//! + #ifdef __cplusplus } #endif diff --git a/src/H5Amodule.h b/src/H5Amodule.h index 45172bf..c89c93f 100644 --- a/src/H5Amodule.h +++ b/src/H5Amodule.h @@ -30,19 +30,39 @@ #define H5_MY_PKG_INIT YES /**\defgroup H5A H5A - * \brief Attribute Interface * - * \details The Attribute Interface, H5A, provides a mechanism for attaching - * additional information to a dataset, group, or named datatype. + * Use the functions in this module to manage HDF5 attributes. * - * Attributes are accessed by opening the object that they are - * attached to and are not independent objects. Typically an - * attribute is small in size and contains user metadata about the - * object that it is attached to. + * The Attribute Interface, H5A, provides a mechanism for attaching additional + * information to a dataset, group, or named datatype. * - * Attributes look similar to HDF5 datasets in that they have a - * datatype and dataspace. 
However, they do not support partial - * I/O operations and cannot be compressed or extended. + * Attributes are accessed by opening the object that they are attached to and + * are not independent objects. Typically an attribute is small in size and + * contains user metadata about the object that it is attached to. + * + * Attributes look similar to HDF5 datasets in that they have a datatype and + * dataspace. However, they do not support partial I/O operations and cannot be + * compressed or extended. + * + * + * + * + * + * + * + * + * + * + * + *
    <tr><th>Create</th><th>Read</th></tr>
    + * \snippet H5A_examples.c create + * + * \snippet H5A_examples.c read + *
    <tr><th>Update</th><th>Delete</th></tr>
    + * \snippet H5A_examples.c update + * + * \snippet H5A_examples.c delete + *
    * */ diff --git a/src/H5Apublic.h b/src/H5Apublic.h index 68d8a8a..2d58cdf 100644 --- a/src/H5Apublic.h +++ b/src/H5Apublic.h @@ -22,19 +22,40 @@ #include "H5Opublic.h" /* Object Headers */ #include "H5Tpublic.h" /* Datatypes */ -/* Information struct for attribute (for H5Aget_info/H5Aget_info_by_idx) */ -//! [H5A_info_t_snip] +//! +/** + * Information struct for H5Aget_info() / H5Aget_info_by_idx() + */ typedef struct { - hbool_t corder_valid; /* Indicate if creation order is valid */ - H5O_msg_crt_idx_t corder; /* Creation order */ - H5T_cset_t cset; /* Character set of attribute name */ - hsize_t data_size; /* Size of raw data */ + hbool_t corder_valid; /**< Indicate if creation order is valid */ + H5O_msg_crt_idx_t corder; /**< Creation order */ + H5T_cset_t cset; /**< Character set of attribute name */ + hsize_t data_size; /**< Size of raw data */ } H5A_info_t; -//! [H5A_info_t_snip] +//! -/* Typedef for H5Aiterate2() callbacks */ +//! +/** + * Typedef for H5Aiterate2() / H5Aiterate_by_name() callbacks + * \param[in] location_id The identifier for the group, dataset + * or named datatype being iterated over + * \param[in] attr_name The name of the current object attribute + * \param[in] ainfo The attribute’s info struct + * \param[in,out] op_data A pointer to the operator data passed in to + * H5Aiterate2() or H5Aiterate_by_name() + * \returns The return values from an operator are: + * \li Zero causes the iterator to continue, returning zero when + * all attributes have been processed. + * \li Positive causes the iterator to immediately return that + * positive value, indicating short-circuit success. The + * iterator can be restarted at the next attribute. + * \li Negative causes the iterator to immediately return that value, + * indicating failure. The iterator can be restarted at the next + * attribute. 
+ */ typedef herr_t (*H5A_operator2_t)(hid_t location_id /*in*/, const char *attr_name /*in*/, const H5A_info_t *ainfo /*in*/, void *op_data /*in,out*/); +//! /********************/ /* Public Variables */ @@ -105,8 +126,8 @@ H5_DLL herr_t H5Aclose(hid_t attr_id); * The attribute identifier returned by this function must be released * with H5Aclose() resource leaks will develop. * - * \note The \p acpl and \p aapl parameters are currently not used; specify - * #H5P_DEFAULT. + * \note The \p aapl parameter is currently not used; specify #H5P_DEFAULT. + * * \note If \p loc_id is a file identifier, the attribute will be attached * that file’s root group. * @@ -117,6 +138,11 @@ H5_DLL herr_t H5Aclose(hid_t attr_id); */ H5_DLL hid_t H5Acreate2(hid_t loc_id, const char *attr_name, hid_t type_id, hid_t space_id, hid_t acpl_id, hid_t aapl_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Acreate} + */ H5_DLL hid_t H5Acreate_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *attr_name, hid_t type_id, hid_t space_id, hid_t acpl_id, hid_t aapl_id, hid_t es_id); @@ -659,30 +685,12 @@ H5_DLL hid_t H5Aget_type(hid_t attr_id); * the value returned identifies the parameter to be operated on * in the next step of the iteration. * - * The #H5A_operator2_t prototype for the \p op parameter is a - * user defined function where: - * The operation receives the location identifier for the group or - * dataset being iterated over, \p location_id; the name of the - * current object attribute, \p attr_name; the attribute’s info - * struct, \p ainfo; and a pointer to the operator data passed - * into H5Aiterate2(), \p op_data. - * - * Valid return values from an operator and the resulting - * H5Aiterate2() and \p op behavior are as follows: - * - * \li Zero causes the iterator to continue, returning zero when - * all attributes have been processed. 
- * \li A positive value causes the iterator to immediately return - * that positive value, indicating short-circuit success. The - * iterator can be restarted at the next attribute, as - * indicated by the return value of \p idx. - * \li A negative value causes the iterator to immediately return - * that value, indicating failure. The iterator can be - * restarted at the next attribute, as indicated by the return - * value of \p idx. + * \p op is a user-defined function whose prototype is defined + * as follows: + * \snippet this H5A_operator2_t_snip + * \click4more * * \note This function is also available through the H5Aiterate() macro. - * \todo Add snippet for H5A_operator2_t * * \since 1.8.0 * @@ -751,13 +759,10 @@ H5_DLL herr_t H5Aiterate2(hid_t loc_id, H5_index_t idx_type, H5_iter_order_t ord * the value returned identifies the parameter to be operated on in * the next step of the iteration. * - * The #H5A_operator2_t prototype for the \p op parameter is a - * user defined function where: - * The operation receives the location identifier for the group or - * dataset being iterated over, \p location_id; the name of the - * current object attribute, \p attr_name; the attribute’s info - * struct, \p ainfo; and a pointer to the operator data passed - * into H5Aiterate_by_name(), \p op_data. + * \p op is a user-defined function whose prototype is defined + * as follows: + * \snippet this H5A_operator2_t_snip + * \click4more * * Valid return values from an operator and the resulting * H5Aiterate_by_name() and \p op behavior are as follows: @@ -777,17 +782,21 @@ H5_DLL herr_t H5Aiterate2(hid_t loc_id, H5_index_t idx_type, H5_iter_order_t ord * information regarding the properties of links required to access * the object, \p obj_name. * - * \todo Add snippet to show H5Aoperator2_t. 
* \since 1.8.0 * */ H5_DLL herr_t H5Aiterate_by_name(hid_t loc_id, const char *obj_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t *idx, H5A_operator2_t op, void *op_data, hid_t lapl_id); -H5_DLL hid_t H5Acreate_by_name_async(const char *app_file, const char *app_func, unsigned app_line, - hid_t loc_id, const char *obj_name, const char *attr_name, hid_t type_id, - hid_t space_id, hid_t acpl_id, hid_t aapl_id, hid_t lapl_id, - hid_t es_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Acreate_by_name} + */ +H5_DLL hid_t H5Acreate_by_name_async(const char *app_file, const char *app_func, unsigned app_line, + hid_t loc_id, const char *obj_name, const char *attr_name, hid_t type_id, + hid_t space_id, hid_t acpl_id, hid_t aapl_id, hid_t lapl_id, + hid_t es_id); /*--------------------------------------------------------------------------*/ /** * \ingroup H5A @@ -819,6 +828,11 @@ H5_DLL hid_t H5Acreate_by_name_async(const char *app_file, const char *app_func * \see H5Aclose(), H5Acreate() */ H5_DLL hid_t H5Aopen(hid_t obj_id, const char *attr_name, hid_t aapl_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aopen} + */ H5_DLL hid_t H5Aopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t obj_id, const char *attr_name, hid_t aapl_id, hid_t es_id); /*--------------------------------------------------------------------------*/ @@ -868,6 +882,11 @@ H5_DLL hid_t H5Aopen_async(const char *app_file, const char *app_func, unsigned */ H5_DLL hid_t H5Aopen_by_idx(hid_t loc_id, const char *obj_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t aapl_id, hid_t lapl_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aopen_by_idx} + */ H5_DLL hid_t H5Aopen_by_idx_async(const char *app_file, 
const char *app_func, unsigned app_line, hid_t loc_id, const char *obj_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t aapl_id, hid_t lapl_id, hid_t es_id); @@ -914,6 +933,11 @@ H5_DLL hid_t H5Aopen_by_idx_async(const char *app_file, const char *app_func, un */ H5_DLL hid_t H5Aopen_by_name(hid_t loc_id, const char *obj_name, const char *attr_name, hid_t aapl_id, hid_t lapl_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aopen_by_name} + */ H5_DLL hid_t H5Aopen_by_name_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *obj_name, const char *attr_name, hid_t aapl_id, hid_t lapl_id, hid_t es_id); @@ -967,6 +991,11 @@ H5_DLL herr_t H5Aread(hid_t attr_id, hid_t type_id, void *buf); * */ H5_DLL herr_t H5Arename(hid_t loc_id, const char *old_name, const char *new_name); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aread} + */ H5_DLL herr_t H5Aread_async(const char *app_file, const char *app_func, unsigned app_line, hid_t attr_id, hid_t dtype_id, void *buf, hid_t es_id); /*--------------------------------------------------------------------------*/ @@ -1001,15 +1030,40 @@ H5_DLL herr_t H5Aread_async(const char *app_file, const char *app_func, unsigned * */ H5_DLL herr_t H5Awrite(hid_t attr_id, hid_t type_id, const void *buf); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Awrite} + */ H5_DLL herr_t H5Awrite_async(const char *app_file, const char *app_func, unsigned app_line, hid_t attr_id, hid_t type_id, const void *buf, hid_t es_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Arename} + */ H5_DLL herr_t H5Arename_async(const char *app_file, const char *app_func, unsigned app_line, 
hid_t loc_id, const char *old_name, const char *new_name, hid_t es_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Arename_by_name} + */ H5_DLL herr_t H5Arename_by_name_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *obj_name, const char *old_attr_name, const char *new_attr_name, hid_t lapl_id, hid_t es_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aexists} + */ H5_DLL herr_t H5Aexists_async(const char *app_file, const char *app_func, unsigned app_line, hid_t obj_id, const char *attr_name, hbool_t *exists, hid_t es_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aexists_by_name} + */ H5_DLL herr_t H5Aexists_by_name_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *obj_name, const char *attr_name, hbool_t *exists, hid_t lapl_id, hid_t es_id); @@ -1038,6 +1092,11 @@ H5_DLL herr_t H5Aexists_by_name_async(const char *app_file, const char *app_func */ H5_DLL herr_t H5Arename_by_name(hid_t loc_id, const char *obj_name, const char *old_attr_name, const char *new_attr_name, hid_t lapl_id); +/*--------------------------------------------------------------------------*/ +/** + * \ingroup ASYNC + * \async_variant_of{H5Aclose} + */ H5_DLL herr_t H5Aclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t attr_id, hid_t es_id); @@ -1085,9 +1144,28 @@ H5_DLL herr_t H5Aclose_async(const char *app_file, const char *app_func, unsigne /* Typedefs */ -/* Typedef for H5Aiterate1() callbacks */ +//! 
+/** + * \brief Typedef for H5Aiterate1() callbacks + * + * \param[in] location_id The identifier for the group, dataset + * or named datatype being iterated over + * \param[in] attr_name The name of the current object attribute + * \param[in,out] operator_data A pointer to the operator data passed in to + * H5Aiterate1() + * \returns The return values from an operator are: + * \li Zero causes the iterator to continue, returning zero when + * all attributes have been processed. + * \li Positive causes the iterator to immediately return that + * positive value, indicating short-circuit success. The + * iterator can be restarted at the next attribute. + * \li Negative causes the iterator to immediately return that value, + * indicating failure. The iterator can be restarted at the next + * attribute. + */ typedef herr_t (*H5A_operator1_t)(hid_t location_id /*in*/, const char *attr_name /*in*/, void *operator_data /*in,out*/); +//! /* Function prototypes */ /* --------------------------------------------------------------------------*/ @@ -1163,8 +1241,6 @@ H5_DLL int H5Aget_num_attrs(hid_t loc_id); * * \brief Calls a user’s function for each attribute on an object * - * \todo make prototype parameter match function (idx vs attr_num) - * * \loc_id * \param[in,out] idx Starting (in) and ending (out) attribute index * \param[in] op User's function to pass each attribute to @@ -1186,29 +1262,12 @@ H5_DLL int H5Aget_num_attrs(hid_t loc_id); * \p op, is returned in \p idx. If \p idx is the null pointer, * then all attributes are processed. * - * The prototype for #H5A_operator1_t is a user defined function - * where: - * The operation receives the identifier for the group, dataset - * or named datatype being iterated over, \p loc_id, the name of - * the current object attribute, \p attr_name, and the pointer to - * the operator data passed in to H5Aiterate1(), \p op_data. 
- * - * The return values from an operator are: - * - * \li Zero causes the iterator to continue, returning zero when - * all attributes have been processed. - * \li Positive causes the iterator to immediately return that - * positive value, indicating short-circuit success. The - * iterator can be restarted at the next attribute. - * \li Negative causes the iterator to immediately return that value, - * indicating failure. The iterator can be restarted at the next - * attribute. - * - * \todo Add snippet to show H5A_operator1_t. + * \p op is a user-defined function whose prototype is defined as follows: + * \snippet this H5A_operator1_t_snip + * \click4more * * \version 1.8.0 The function \p H5Aiterate was renamed to H5Aiterate1() * and deprecated in this release. - * * \since 1.0.0 * */ diff --git a/src/H5Cpublic.h b/src/H5Cpublic.h index 0e6fb84..79ece10 100644 --- a/src/H5Cpublic.h +++ b/src/H5Cpublic.h @@ -31,15 +31,34 @@ extern "C" { #endif -enum H5C_cache_incr_mode { H5C_incr__off, H5C_incr__threshold }; +enum H5C_cache_incr_mode { + H5C_incr__off, + /** + *

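+ * <table>
+ * <tr><th>Create</th><th>Read</th></tr>
+ * <tr>
+ *   <td>
+ *   \snippet H5D_examples.c create
+ *   </td>
+ *   <td>
+ *   \snippet H5D_examples.c read
+ *   </td>
+ * </tr>
+ * <tr><th>Update</th><th>Delete</th></tr>
+ * <tr>
+ *   <td>
+ *   \snippet H5D_examples.c update
+ *   </td>
+ *   <td>
+ *   \snippet H5D_examples.c delete
+ *   </td>
+ * </tr>
+ * </table>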
    * - * A Dataset is used by other HDF5 APIs, either by name or by a handle, - * which is obtained by either creating or opening the dataset. */ #endif /* H5Dmodule_H */ diff --git a/src/H5Dpublic.h b/src/H5Dpublic.h index ce4ce84..608025a 100644 --- a/src/H5Dpublic.h +++ b/src/H5Dpublic.h @@ -39,30 +39,41 @@ /* Public Typedefs */ /*******************/ -/* Values for the H5D_LAYOUT property */ +//! +/** + * Values for the H5D_LAYOUT property + */ typedef enum H5D_layout_t { H5D_LAYOUT_ERROR = -1, - H5D_COMPACT = 0, /*raw data is very small */ - H5D_CONTIGUOUS = 1, /*the default */ - H5D_CHUNKED = 2, /*slow and fancy */ - H5D_VIRTUAL = 3, /*actual data is stored in other datasets */ - H5D_NLAYOUTS = 4 /*this one must be last! */ + H5D_COMPACT = 0, /**< raw data is very small */ + H5D_CONTIGUOUS = 1, /**< the default */ + H5D_CHUNKED = 2, /**< slow and fancy */ + H5D_VIRTUAL = 3, /**< actual data is stored in other datasets */ + H5D_NLAYOUTS = 4 /**< this one must be last! */ } H5D_layout_t; +//! -/* Types of chunk index data structures */ +//! +/** + * Types of chunk index data structures + */ typedef enum H5D_chunk_index_t { - H5D_CHUNK_IDX_BTREE = 0, /* v1 B-tree index (default) */ + H5D_CHUNK_IDX_BTREE = 0, /**< v1 B-tree index (default) */ H5D_CHUNK_IDX_SINGLE = - 1, /* Single Chunk index (cur dims[]=max dims[]=chunk dims[]; filtered & non-filtered) */ - H5D_CHUNK_IDX_NONE = 2, /* Implicit: No Index (H5D_ALLOC_TIME_EARLY, non-filtered, fixed dims) */ - H5D_CHUNK_IDX_FARRAY = 3, /* Fixed array (for 0 unlimited dims) */ - H5D_CHUNK_IDX_EARRAY = 4, /* Extensible array (for 1 unlimited dim) */ - H5D_CHUNK_IDX_BT2 = 5, /* v2 B-tree index (for >1 unlimited dims) */ - H5D_CHUNK_IDX_NTYPES /* This one must be last! 
*/ + 1, /**< Single Chunk index (cur dims[]=max dims[]=chunk dims[]; filtered & non-filtered) */ + H5D_CHUNK_IDX_NONE = 2, /**< Implicit: No Index (#H5D_ALLOC_TIME_EARLY, non-filtered, fixed dims) */ + H5D_CHUNK_IDX_FARRAY = 3, /**< Fixed array (for 0 unlimited dims) */ + H5D_CHUNK_IDX_EARRAY = 4, /**< Extensible array (for 1 unlimited dim) */ + H5D_CHUNK_IDX_BT2 = 5, /**< v2 B-tree index (for >1 unlimited dims) */ + H5D_CHUNK_IDX_NTYPES /**< This one must be last! */ } H5D_chunk_index_t; +//! -/* Values for the space allocation time property */ +//! +/** + * Values for the space allocation time property + */ typedef enum H5D_alloc_time_t { H5D_ALLOC_TIME_ERROR = -1, H5D_ALLOC_TIME_DEFAULT = 0, @@ -70,57 +81,84 @@ typedef enum H5D_alloc_time_t { H5D_ALLOC_TIME_LATE = 2, H5D_ALLOC_TIME_INCR = 3 } H5D_alloc_time_t; +//! -/* Values for the status of space allocation */ +//! +/** + * Values for the status of space allocation + */ typedef enum H5D_space_status_t { H5D_SPACE_STATUS_ERROR = -1, H5D_SPACE_STATUS_NOT_ALLOCATED = 0, H5D_SPACE_STATUS_PART_ALLOCATED = 1, H5D_SPACE_STATUS_ALLOCATED = 2 } H5D_space_status_t; +//! -/* Values for time of writing fill value property */ +//! +/** + * Values for time of writing fill value property + */ typedef enum H5D_fill_time_t { H5D_FILL_TIME_ERROR = -1, H5D_FILL_TIME_ALLOC = 0, H5D_FILL_TIME_NEVER = 1, H5D_FILL_TIME_IFSET = 2 } H5D_fill_time_t; +//! -/* Values for fill value status */ +//! +/** + * Values for fill value status + */ typedef enum H5D_fill_value_t { H5D_FILL_VALUE_ERROR = -1, H5D_FILL_VALUE_UNDEFINED = 0, H5D_FILL_VALUE_DEFAULT = 1, H5D_FILL_VALUE_USER_DEFINED = 2 } H5D_fill_value_t; +//! -/* Values for VDS bounds option */ +//! +/** + * Values for VDS bounds option + */ typedef enum H5D_vds_view_t { H5D_VDS_ERROR = -1, H5D_VDS_FIRST_MISSING = 0, H5D_VDS_LAST_AVAILABLE = 1 } H5D_vds_view_t; +//! -/* Callback for H5Pset_append_flush() in a dataset access property list */ +//! 
+/** + * Callback for H5Pset_append_flush() in a dataset access property list + */ typedef herr_t (*H5D_append_cb_t)(hid_t dataset_id, hsize_t *cur_dims, void *op_data); +//! -/** Define the operator function pointer for H5Diterate() */ -//! [H5D_operator_t_snip] +//! +/** + * Define the operator function pointer for H5Diterate() + */ typedef herr_t (*H5D_operator_t)(void *elem, hid_t type_id, unsigned ndim, const hsize_t *point, void *operator_data); -//! [H5D_operator_t_snip] +//! -/** Define the operator function pointer for H5Dscatter() */ -//! [H5D_scatter_func_t_snip] +//! +/** + * Define the operator function pointer for H5Dscatter() + */ typedef herr_t (*H5D_scatter_func_t)(const void **src_buf /*out*/, size_t *src_buf_bytes_used /*out*/, void *op_data); -//! [H5D_scatter_func_t_snip] +//! -/** Define the operator function pointer for H5Dgather() */ -//! [H5D_gather_func_t_snip] +//! +/** + * Define the operator function pointer for H5Dgather() + */ typedef herr_t (*H5D_gather_func_t)(const void *dst_buf, size_t dst_buf_bytes_used, void *op_data); -//! [H5D_gather_func_t_snip] +//! 
/********************/ /* Public Variables */ @@ -203,26 +241,8 @@ H5_DLL hid_t H5Dcreate2(hid_t loc_id, const char *name, hid_t type_id, hid_t spa /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dcreate2() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] name Name of the dataset to create - * \type_id - * \space_id - * \lcpl_id - * \dcpl_id - * \dapl_id - * \es_id - * - * \return \hid_t{dataset} - * - * \see H5Dcreate2() - * + * \ingroup ASYNC + * \async_variant_of{H5Dcreate} */ H5_DLL hid_t H5Dcreate_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t type_id, hid_t space_id, hid_t lcpl_id, hid_t dcpl_id, @@ -325,22 +345,8 @@ H5_DLL hid_t H5Dopen2(hid_t loc_id, const char *name, hid_t dapl_id); /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dopen2() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] name Name of the dataset to open - * \dapl_id - * \es_id - * - * \return \hid_t{dataset} - * - * \see H5Dopen2() - * + * \ingroup ASYNC + * \async_variant_of{H5Dopen} */ H5_DLL hid_t H5Dopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t dapl_id, hid_t es_id); @@ -370,24 +376,17 @@ H5_DLL hid_t H5Dget_space(hid_t dset_id); /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dget_space() - * - * \app_file - * \app_func - * \app_line - * \dset_id - * \es_id - * - * \return \hid_t{dataspace} - * - * \see H5Dget_space() - * + * \ingroup ASYNC + * \async_variant_of{H5Dget_space} */ H5_DLL hid_t H5Dget_space_async(const char *app_file, const char *app_func, unsigned app_line, hid_t dset_id, hid_t es_id); +/** + * 
-------------------------------------------------------------------------- + * \ingroup H5D + * \todo Document this function! + */ H5_DLL herr_t H5Dget_space_status(hid_t dset_id, H5D_space_status_t *allocation); /** @@ -614,7 +613,7 @@ H5_DLL herr_t H5Dget_num_chunks(hid_t dset_id, hid_t fspace_id, hsize_t *nchunks * using the coordinates specified by \p offset. * * If the queried chunk does not exist in the file, \p size will - * be set to 0, \p addr to #HADDR_UNDEF, and the buffer \p + * be set to 0, \p addr to \c HADDR_UNDEF, and the buffer \p * filter_mask will not be modified. * * \p offset is a pointer to a one-dimensional array with a size @@ -649,7 +648,7 @@ H5_DLL herr_t H5Dget_chunk_info_by_coord(hid_t dset_id, const hsize_t *offset, u * specified by the index index. The chunk belongs to a set of * chunks in the selection specified by fspace_id. If the queried * chunk does not exist in the file, the size will be set to 0 and - * address to #HADDR_UNDEF. The value pointed to by filter_mask will + * address to \c HADDR_UNDEF. The value pointed to by filter_mask will * not be modified. NULL can be passed in for any \p out parameters. * * \p chk_idx is the chunk index in the selection. Index value @@ -684,7 +683,7 @@ H5_DLL herr_t H5Dget_chunk_info(hid_t dset_id, hid_t fspace_id, hsize_t chk_idx, * * \dset_id * - * \return Returns the offset in bytes; otherwise, returns #HADDR_UNDEF, + * \return Returns the offset in bytes; otherwise, returns \c HADDR_UNDEF, * a negative value. 
* * \details H5Dget_offset() returns the address in the file of @@ -795,25 +794,8 @@ H5_DLL herr_t H5Dread(hid_t dset_id, hid_t mem_type_id, hid_t mem_space_id, hid_ /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dread() - * - * \app_file - * \app_func - * \app_line - * \dset_id Identifier of the dataset to read from - * \param[in] mem_type_id Identifier of the memory datatype - * \param[in] mem_space_id Identifier of the memory dataspace - * \param[in] file_space_id Identifier of the dataset's dataspace in the file - * \param[in] dxpl_id Identifier of a transfer property list - * \param[out] buf Buffer to receive data read from file - * \es_id - * - * \return \herr_t - * - * \see H5Dread() - * + * \ingroup ASYNC + * \async_variant_of{H5Dread} */ H5_DLL herr_t H5Dread_async(const char *app_file, const char *app_func, unsigned app_line, hid_t dset_id, hid_t mem_type_id, hid_t mem_space_id, hid_t file_space_id, hid_t dxpl_id, @@ -932,25 +914,8 @@ H5_DLL herr_t H5Dwrite(hid_t dset_id, hid_t mem_type_id, hid_t mem_space_id, hid /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dwrite() - * - * \app_file - * \app_func - * \app_line - * \param[in] dset_id Identifier of the dataset to read from - * \param[in] mem_type_id Identifier of the memory datatype - * \param[in] mem_space_id Identifier of the memory dataspace - * \param[in] file_space_id Identifier of the dataset's dataspace in the file - * \dxpl_id - * \param[out] buf Buffer with data to be written to the file - * \es_id - * - * \return \herr_t - * - * \see H5Dwrite() - * + * \ingroup ASYNC + * \async_variant_of{H5Dwrite} */ H5_DLL herr_t H5Dwrite_async(const char *app_file, const char *app_func, unsigned app_line, hid_t dset_id, hid_t mem_type_id, hid_t mem_space_id, hid_t file_space_id, hid_t dxpl_id, @@ -1292,22 +1257,8 @@ H5_DLL 
herr_t H5Dset_extent(hid_t dset_id, const hsize_t size[]); /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dset_extent() - * - * \app_file - * \app_func - * \app_line - * \dset_id - * \param[in] size[] Array containing the new magnitude of each dimension - * of the dataset - * \es_id - * - * \return \herr_t - * - * \see H5Dset_extent() - * + * \ingroup ASYNC + * \async_variant_of{H5Dset_extent} */ H5_DLL herr_t H5Dset_extent_async(const char *app_file, const char *app_func, unsigned app_line, hid_t dset_id, const hsize_t size[], hid_t es_id); @@ -1536,20 +1487,8 @@ H5_DLL herr_t H5Dclose(hid_t dset_id); /** * -------------------------------------------------------------------------- - * \ingroup H5D - * - * \brief Asynchronous version of H5Dclose() - * - * \app_file - * \app_func - * \app_line - * \dset_id - * \es_id - * - * \return \herr_t - * - * \see H5Dclose() - * + * \ingroup ASYNC + * \async_variant_of{H5Dclose} */ H5_DLL herr_t H5Dclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t dset_id, hid_t es_id); @@ -1608,9 +1547,181 @@ H5_DLL herr_t H5Dget_chunk_index_type(hid_t did, H5D_chunk_index_t *idx_type); /* Typedefs */ /* Function prototypes */ -H5_DLL hid_t H5Dcreate1(hid_t loc_id, const char *name, hid_t type_id, hid_t space_id, hid_t dcpl_id); -H5_DLL hid_t H5Dopen1(hid_t loc_id, const char *name); +/** + * -------------------------------------------------------------------------- + * \ingroup H5D + * + * \brief Creates a dataset at the specified location + * + * \fgdta_loc_id + * \param[in] name Name of the dataset to create + * \type_id + * \space_id + * \dcpl_id + * + * \return \hid_t{dataset} + * + * \deprecated This function is deprecated in favor of the function H5Dcreate2() + * or the macro H5Dcreate(). 
+ * + * \details H5Dcreate1() creates a dataset with a name, \p name, in the + * location specified by the identifier \p loc_id. \p loc_id may be a + * file, group, dataset, named datatype or attribute. If an attribute, + * dataset, or named datatype is specified for \p loc_id then the + * dataset will be created at the location where the attribute, + * dataset, or named datatype is attached. + * + * \p name can be a relative path based at \p loc_id or an absolute + * path from the root of the file. Use of this function requires that + * any intermediate groups specified in the path already exist. + * + * The dataset’s datatype and dataspace are specified by \p type_id and + * \p space_id, respectively. These are the datatype and dataspace of + * the dataset as it will exist in the file, which may differ from the + * datatype and dataspace in application memory. + * + * Names within a group are unique: H5Dcreate1() will return an error + * if a link with the name specified in \p name already exists at the + * location specified in \p loc_id. + * + * As is the case for any object in a group, the length of a dataset + * name is not limited. + * + * \p dcpl_id is an #H5P_DATASET_CREATE property list created with + * H5Pcreate() and initialized with various property list functions + * described in Property List Interface. + * + * H5Dcreate() and H5Dcreate_anon() return an error if the dataset’s + * datatype includes a variable-length (VL) datatype and the fill value + * is undefined, i.e., set to \c NULL in the dataset creation property + * list. Such a VL datatype may be directly included, indirectly + * included as part of a compound or array datatype, or indirectly + * included as part of a nested compound or array datatype. + * + * H5Dcreate() and H5Dcreate_anon() return a dataset identifier for + * success or a negative value for failure. The dataset identifier + * should eventually be closed by calling H5Dclose() to release + * resources it uses. 
+ * + * See H5Dcreate_anon() for discussion of the differences between + * H5Dcreate() and H5Dcreate_anon(). + * + * The HDF5 library provides flexible means of specifying a fill value, + * of specifying when space will be allocated for a dataset, and of + * specifying when fill values will be written to a dataset. + * + * \version 1.8.0 Function H5Dcreate() renamed to H5Dcreate1() and deprecated in this release. + * \since 1.0.0 + * + * \see H5Dopen2(), H5Dclose(), H5Tset_size() + * + */ +H5_DLL hid_t H5Dcreate1(hid_t loc_id, const char *name, hid_t type_id, hid_t space_id, hid_t dcpl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5D + * + * \brief Opens an existing dataset + * + * \fgdta_loc_id + * \param[in] name Name of the dataset to access + * + * \return \hid_t{dataset} + * + * \deprecated This function is deprecated in favor of the function H5Dopen2() + * or the macro H5Dopen(). + * + * \details H5Dopen1() opens an existing dataset for access at the location + * specified by \p loc_id. \p loc_id may be a file, group, dataset, + * named datatype or attribute. If an attribute, dataset, or named + * datatype is specified for \p loc_id then the dataset will be opened at + * the location where the attribute, dataset, or named datatype is + * attached. \p name is a dataset name and is used to identify the dataset + * in the file. + * + * A dataset opened with this function should be closed with H5Dclose() + * when the dataset is no longer needed so that resource leaks will not + * develop. + * + * \version 1.8.0 Function H5Dopen() renamed to H5Dopen1() and deprecated in this release. 
+ * \since 1.0.0 + * + */ +H5_DLL hid_t H5Dopen1(hid_t loc_id, const char *name); +/** + * -------------------------------------------------------------------------- + * \ingroup H5D + * + * \brief Extends a dataset + * + * \dset_id + * \param[in] size Array containing the new size of each dimension + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function H5Dset_extent(). + * + * \details H5Dextend() verifies that the dataset is at least of size \p size, + * extending it if necessary. The dimensionality of \p size is the same as + * that of the dataspace of the dataset being changed. + * + * This function can be applied to the following datasets: + * \li Any dataset with unlimited dimensions + * \li A dataset with fixed dimensions if the current dimension sizes + * are less than the maximum sizes set with \c maxdims + * (see H5Screate_simple()) + * + * Space on disk is immediately allocated for the new dataset extent if + * the dataset’s space allocation time is set to + * #H5D_ALLOC_TIME_EARLY. Fill values will be written to the dataset if + * the dataset’s fill time is set to #H5D_FILL_TIME_IFSET or + * #H5D_FILL_TIME_ALLOC. (See H5Pset_fill_time() and + * H5Pset_alloc_time().) + * + * This function ensures that the dataset dimensions are of at least + * the sizes specified in \p size. The function H5Dset_extent() must be + * used if the dataset dimension sizes are to be reduced. + * + * \version 1.8.0 Function deprecated in this release. Parameter \p size + * syntax changed to \Code{const hsize_t size[]} in this release. 
+ * + */ H5_DLL herr_t H5Dextend(hid_t dset_id, const hsize_t size[]); +/** + * -------------------------------------------------------------------------- + * \ingroup H5D + * + * \brief Reclaims variable-length (VL) datatype memory buffers + * + * \type_id + * \space_id + * \dxpl_id + * \param[in] buf Pointer to the buffer to be reclaimed + * + * \return \herr_t + * + * \deprecated This function has been deprecated in HDF5-1.12 in favor of the + * function H5Treclaim(). + * + * \details H5Dvlen_reclaim() reclaims memory buffers created to store VL + * datatypes. + * + * The \p type_id must be the datatype stored in the buffer. The \p + * space_id describes the selection for the memory buffer to free the + * VL datatypes within. The \p dxpl_id is the dataset transfer property + * list which was used for the I/O transfer to create the buffer. And + * \p buf is the pointer to the buffer to be reclaimed. + * + * The VL structures (\ref hvl_t) in the user's buffer are modified to + * zero out the VL information after the memory has been reclaimed. + * + * If nested VL datatypes were used to create the buffer, this routine + * frees them from the bottom up, releasing all the memory without + * creating memory leaks. + * + * \version 1.12.0 Routine was deprecated + * + */ H5_DLL herr_t H5Dvlen_reclaim(hid_t type_id, hid_t space_id, hid_t dxpl_id, void *buf); #endif /* H5_NO_DEPRECATED_SYMBOLS */ diff --git a/src/H5ESmodule.h b/src/H5ESmodule.h index ea9fd7a..cbc812e 100644 --- a/src/H5ESmodule.h +++ b/src/H5ESmodule.h @@ -29,4 +29,37 @@ #define H5_MY_PKG_ERR H5E_EVENTSET #define H5_MY_PKG_INIT YES +/** + * \defgroup H5ES H5ES + * \brief Event Set Interface + * + * \details \Bold{This interface can be only used with the HDF5 VOL connectors that + * enable the asynchronous feature in HDF5.} The native HDF5 library has + * only synchronous operations. 
+ * + * HDF5 VOL connectors with support for asynchronous operations: + * - ASYNC + * - DAOS + * + * \par Example: + * \code + * fid = H5Fopen(..); + * gid = H5Gopen(fid, ..); //Starts when H5Fopen completes + * did = H5Dopen(gid, ..); //Starts when H5Gopen completes + * + * es_id = H5EScreate(); // Create event set for tracking async operations + * status = H5Dwrite_async(did, .., es_id); //Asynchronous, starts when H5Dopen completes, + * // may run concurrently with other H5Dwrite_async + * // in event set. + * status = H5Dwrite_async(did, .., es_id); //Asynchronous, starts when H5Dopen completes, + * // may run concurrently with other H5Dwrite_async + * // in event set.... + * + * ... + * H5ESwait(es_id); // Wait for operations in event set to complete, buffers + * // used for H5Dwrite_async must only be changed after wait + * // returns. + * \endcode + */ + #endif /* H5ESmodule_H */ diff --git a/src/H5ESpublic.h b/src/H5ESpublic.h index 752218b..a7c2e58 100644 --- a/src/H5ESpublic.h +++ b/src/H5ESpublic.h @@ -47,20 +47,24 @@ typedef enum H5ES_status_t { H5ES_STATUS_FAIL /* An operation has completed, but failed */ } H5ES_status_t; -/* Information about failed operations in event set */ +//! 
+/** + * Information about failed operations in event set + */ typedef struct H5ES_err_info_t { /* Operation info */ - char * api_name; /* Name of HDF5 API routine called */ - char * api_args; /* "Argument string" for arguments to HDF5 API routine called */ - char * app_file_name; /* Name of source file where the HDF5 API routine was called */ - char * app_func_name; /* Name of function where the HDF5 API routine was called */ - unsigned app_line_num; /* Line # of source file where the HDF5 API routine was called */ - uint64_t op_ins_count; /* Counter of operation's insertion into event set */ - uint64_t op_ins_ts; /* Timestamp for when the operation was inserted into the event set */ + char * api_name; /**< Name of HDF5 API routine called */ + char * api_args; /**< "Argument string" for arguments to HDF5 API routine called */ + char * app_file_name; /**< Name of source file where the HDF5 API routine was called */ + char * app_func_name; /**< Name of function where the HDF5 API routine was called */ + unsigned app_line_num; /**< Line # of source file where the HDF5 API routine was called */ + uint64_t op_ins_count; /**< Counter of operation's insertion into event set */ + uint64_t op_ins_ts; /**< Timestamp for when the operation was inserted into the event set */ /* Error info */ - hid_t err_stack_id; /* ID for error stack from failed operation */ + hid_t err_stack_id; /**< ID for error stack from failed operation */ } H5ES_err_info_t; +//! /* H5ES_op_info_t: @@ -119,14 +123,157 @@ How to Trace Async Operations? extern "C" { #endif -H5_DLL hid_t H5EScreate(void); +/** + * \ingroup H5ES + * + * \brief Creates an event set + * + * \returns \hid_ti{event set} + * + * \details H5EScreate() creates a new event set and returns a corresponding + * event set identifier. 
+ * + * \since 1.13.0 + * + */ +H5_DLL hid_t H5EScreate(void); + +/** + * \ingroup H5ES + * + * \brief Waits for operations in an event set to complete + * + * \es_id + * \param[in] timeout Total time in nanoseconds to wait for all operations in + * the event set to complete + * \param[out] num_in_progress The number of operations still in progress + * \param[out] err_occurred Flag if an operation in the event set failed + * \returns \herr_t + * + * \details H5ESwait() waits for operations in an event set \p es_id to + * complete, for at most \p timeout nanoseconds. + * + * The timeout value is in nanoseconds and applies to the H5ESwait() + * call as a whole, not to each individual operation in the event set. + * For example, if "10" is passed as a timeout value and the event set + * waited 4 nanoseconds for the first operation to complete, the + * remaining operations would be allowed to wait for at most 6 + * nanoseconds more. The timeout is shared across all operations in + * the event set; once it reaches 0, any remaining operations are only + * checked for completion, not waited on. + * + * This call will stop waiting on operations and will return + * immediately if an operation fails. If a failure occurs, the value + * returned for the number of operations in progress may be inaccurate. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESwait(hid_t es_id, uint64_t timeout, size_t *num_in_progress, hbool_t *err_occurred); + +/** + * \ingroup H5ES + * + * \brief Retrieves the number of events in an event set + * + * \es_id + * \param[out] count The number of events in the event set + * \returns \herr_t + * + * \details H5ESget_count() retrieves the number of events in an event set specified + * by \p es_id. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESget_count(hid_t es_id, size_t *count); + +/** + * \ingroup H5ES + * + * \todo Fill in the blanks! 
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESget_op_counter(hid_t es_id, uint64_t *counter); + +/** + * \ingroup H5ES + * + * \brief Checks for failed operations + * + * \es_id + * \param[out] err_occurred Status indicating if error is present in the event + * set + * \returns \herr_t + * + * \details H5ESget_err_status() checks if event set specified by es_id has + * failed operations. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESget_err_status(hid_t es_id, hbool_t *err_occurred); + +/** + * \ingroup H5ES + * + * \brief Retrieves the number of failed operations + * + * \es_id + * \param[out] num_errs Number of errors + * \returns \herr_t + * + * \details H5ESget_err_count() retrieves the number of failed operations in an + * event set specified by \p es_id. + * + * The function does not wait for active operations to complete, so + * count may not include all failures. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESget_err_count(hid_t es_id, size_t *num_errs); + +/** + * \ingroup H5ES + * + * \brief Retrieves information about failed operations + * + * \es_id + * \param[in] num_err_info The number of elements in \p err_info array + * \param[out] err_info Array of structures + * \param[out] err_cleared Number of cleared errors + * \returns \herr_t + * + * \details H5ESget_err_info() retrieves information about failed operations in + * an event set specified by \p es_id. The strings retrieved for each + * error info must be released by calling H5free_memory(). + * + * Below is the description of the \ref H5ES_err_info_t structure: + * \snippet this H5ES_err_info_t_snip + * \click4more + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESget_err_info(hid_t es_id, size_t num_err_info, H5ES_err_info_t err_info[], size_t *err_cleared); + +/** + * \ingroup H5ES + * + * \brief Terminates access to an event set + * + * \es_id + * \returns \herr_t + * + * \details H5ESclose() terminates access to an event set specified by \p es_id. 
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5ESclose(hid_t es_id); #ifdef __cplusplus diff --git a/src/H5Emodule.h b/src/H5Emodule.h index 1670c03..43d5d36 100644 --- a/src/H5Emodule.h +++ b/src/H5Emodule.h @@ -29,4 +29,33 @@ #define H5_MY_PKG_ERR H5E_ERROR #define H5_MY_PKG_INIT YES +/** + * \defgroup H5E H5E + * \brief Error Handling Interface + * + * \details The Error interface provides error handling in the form of a stack. + * The \Code{FUNC_ENTER} macro clears the error stack whenever an + * interface function is entered. When an error is detected, an entry + * is pushed onto the stack. As the functions unwind, additional + * entries are pushed onto the stack. The API function will return some + * indication that an error occurred and the application can print the + * error stack. + * + * Certain API functions in the \c H5E package, such as H5Eprint1(), do + * not clear the error stack. Otherwise, any function which does not + * have an underscore immediately after the package name will clear the + * error stack. For instance, H5Fopen() clears the error stack while + * \Code{H5F_open} does not. + * + * An error stack has a fixed maximum size. If this size is exceeded + * then the stack will be truncated and only the inner-most functions + * will have entries on the stack. This is expected to be a rare + * condition. + * + * Each thread has its own error stack, but since multi-threading has + * not been added to the library yet, this package maintains a single + * error stack. The error stack is statically allocated to reduce the + * complexity of handling errors within the \c H5E package. 
+ */ + #endif /* H5Emodule_H */ diff --git a/src/H5Epublic.h b/src/H5Epublic.h index a2554d5..20a2107 100644 --- a/src/H5Epublic.h +++ b/src/H5Epublic.h @@ -26,18 +26,29 @@ /* Value for the default error stack */ #define H5E_DEFAULT (hid_t)0 -/* Different kinds of error information */ +/** + * Different kinds of error information + */ typedef enum H5E_type_t { H5E_MAJOR, H5E_MINOR } H5E_type_t; -/* Information about an error; element of error stack */ +/** + * Information about an error; element of error stack + */ typedef struct H5E_error2_t { - hid_t cls_id; /*class ID */ - hid_t maj_num; /*major error ID */ - hid_t min_num; /*minor error number */ - unsigned line; /*line in file where error occurs */ - const char *func_name; /*function in which error occurred */ - const char *file_name; /*file in which error occurred */ - const char *desc; /*optional supplied description */ + hid_t cls_id; + /**< Class ID */ + hid_t maj_num; + /**< Major error ID */ + hid_t min_num; + /**< Minor error number */ + unsigned line; + /**< Line in file where error occurs */ + const char *func_name; + /**< Function in which error occurred */ + const char *file_name; + /**< File in which error occurred */ + const char *desc; + /**< Optional supplied description */ } H5E_error2_t; /* When this header is included from a private header, don't make calls to H5open() */ @@ -138,10 +149,12 @@ H5_DLLVAR hid_t H5E_ERR_CLS_g; goto label; \ } -/* Error stack traversal direction */ +/** + * Error stack traversal direction + */ typedef enum H5E_direction_t { - H5E_WALK_UPWARD = 0, /*begin deep, end at API function */ - H5E_WALK_DOWNWARD = 1 /*begin at API function, end deep */ + H5E_WALK_UPWARD = 0, /**< begin w/ most specific error, end at API function */ + H5E_WALK_DOWNWARD = 1 /**< begin at API function, end w/ most specific error */ } H5E_direction_t; #ifdef __cplusplus @@ -149,30 +162,498 @@ extern "C" { #endif /* Error stack traversal callback function pointers */ +//! 
+/** + * \brief Callback function for H5Ewalk2() + * + * \param[in] n Indexed error position in the stack + * \param[in] err_desc Pointer to a data structure describing the error + * \param[in] client_data Pointer to client data in the format expected by the + * user-defined function + * \return \herr_t + */ typedef herr_t (*H5E_walk2_t)(unsigned n, const H5E_error2_t *err_desc, void *client_data); +//! + +//! +/** + * \brief Callback function for H5Eset_auto2() + * + * \estack_id{estack} + * \param[in] client_data Pointer to client data in the format expected by the + * user-defined function + * \return \herr_t + */ typedef herr_t (*H5E_auto2_t)(hid_t estack, void *client_data); +//! /* Public API functions */ -H5_DLL hid_t H5Eregister_class(const char *cls_name, const char *lib_name, const char *version); -H5_DLL herr_t H5Eunregister_class(hid_t class_id); -H5_DLL herr_t H5Eclose_msg(hid_t err_id); -H5_DLL hid_t H5Ecreate_msg(hid_t cls, H5E_type_t msg_type, const char *msg); -H5_DLL hid_t H5Ecreate_stack(void); -H5_DLL hid_t H5Eget_current_stack(void); -H5_DLL herr_t H5Eappend_stack(hid_t dst_stack_id, hid_t src_stack_id, hbool_t close_source_stack); -H5_DLL herr_t H5Eclose_stack(hid_t stack_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Registers a client library or application program with the HDF5 error API + * + * \param[in] cls_name Name of the error class + * \param[in] lib_name Name of the client library or application to which the error class belongs + * \param[in] version Version of the client library or application to which the + * error class belongs. Can be \c NULL. + * \return Returns a class identifier on success; otherwise returns H5I_INVALID_HID. + * + * \details H5Eregister_class() registers a client library or application + * program with the HDF5 error API so that the client library or + * application program can report errors together with the HDF5 + * library.
The caller receives an identifier for this error class for further + * error operations. The library name and version number will be + * printed out in the error message as a preamble. + * + * \since 1.8.0 + */ +H5_DLL hid_t H5Eregister_class(const char *cls_name, const char *lib_name, const char *version); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Removes an error class + * + * \param[in] class_id Error class identifier. + * \return \herr_t + * + * \details H5Eunregister_class() removes the error class specified by \p + * class_id. All the major and minor errors in this class will also be + * closed. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eunregister_class(hid_t class_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Closes an error message + * + * \param[in] err_id An error message identifier + * \return \herr_t + * + * \details H5Eclose_msg() closes an error message identifier, which can be + * either a major or minor message. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eclose_msg(hid_t err_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Adds a major or minor error message to an error class + * + * \param[in] cls An error class identifier + * \param[in] msg_type The type of the error message + * \param[in] msg Error message text + * \return Returns an error message identifier on success; otherwise returns H5I_INVALID_HID. + * + * \details H5Ecreate_msg() adds an error message to an error class defined by + * a client library or application program. The error message can be + * either major or minor as indicated by the parameter \p msg_type. + * + * Use H5Eclose_msg() to close the message identifier returned by this + * function.
+ * + * \since 1.8.0 + */ +H5_DLL hid_t H5Ecreate_msg(hid_t cls, H5E_type_t msg_type, const char *msg); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Creates a new, empty error stack + * + * \return \hid_ti{error stack} + * + * \details H5Ecreate_stack() creates a new empty error stack and returns the + * new stack’s identifier. Use H5Eclose_stack() to close the error stack + * identifier returned by this function. + * + * \since 1.8.0 + */ +H5_DLL hid_t H5Ecreate_stack(void); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Returns a copy of the current error stack + * + * \return \hid_ti{error stack} + * + * \details H5Eget_current_stack() copies the current error stack and returns an + * error stack identifier for the new copy. + * + * \since 1.8.0 + */ +H5_DLL hid_t H5Eget_current_stack(void); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Appends one error stack to another, optionally closing the source + * stack. + * + * \estack_id{dst_stack_id} + * \estack_id{src_stack_id} + * \param[in] close_source_stack Flag to indicate whether to close the source stack + * \return \herr_t + * + * \details H5Eappend_stack() appends the messages from error stack + * \p src_stack_id to the error stack \p dst_stack_id. + * If \p close_source_stack is \c TRUE, the source error stack + * will be closed. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eappend_stack(hid_t dst_stack_id, hid_t src_stack_id, hbool_t close_source_stack); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Closes an error stack handle + * + * \estack_id{stack_id} + * + * \return \herr_t + * + * \details H5Eclose_stack() closes the error stack handle \p stack_id + * and releases its resources. #H5E_DEFAULT cannot be closed. 
+ * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eclose_stack(hid_t stack_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Retrieves error class name + * + * \param[in] class_id Error class identifier + * \param[out] name Buffer for the error class name + * \param[in] size The maximum number of characters of the class name to be + * returned by this function in \p name. + * \return Returns a non-negative value on success; otherwise returns a negative value. + * + * \details H5Eget_class_name() retrieves the name of the error class specified + * by the class identifier. If a non-NULL pointer is passed in for \p + * name and \p size is greater than zero, up to \p size characters of + * the class name are returned. The length of the error class name is + * also returned. If NULL is passed in as \p name, only the length of + * the class name is returned. A return value of zero means the class + * has no name. The user is responsible for allocating sufficient + * buffer space for the name.
+ * + * \since 1.8.0 + */ H5_DLL ssize_t H5Eget_class_name(hid_t class_id, char *name, size_t size); -H5_DLL herr_t H5Eset_current_stack(hid_t err_stack_id); -H5_DLL herr_t H5Epush2(hid_t err_stack, const char *file, const char *func, unsigned line, hid_t cls_id, - hid_t maj_id, hid_t min_id, const char *msg, ...); -H5_DLL herr_t H5Epop(hid_t err_stack, size_t count); -H5_DLL herr_t H5Eprint2(hid_t err_stack, FILE *stream); -H5_DLL herr_t H5Ewalk2(hid_t err_stack, H5E_direction_t direction, H5E_walk2_t func, void *client_data); -H5_DLL herr_t H5Eget_auto2(hid_t estack_id, H5E_auto2_t *func, void **client_data); -H5_DLL herr_t H5Eset_auto2(hid_t estack_id, H5E_auto2_t func, void *client_data); -H5_DLL herr_t H5Eclear2(hid_t err_stack); -H5_DLL herr_t H5Eauto_is_v2(hid_t err_stack, unsigned *is_stack); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Replaces the current error stack + * + * \estack_id{err_stack_id} + * + * \return \herr_t + * + * \details H5Eset_current_stack() replaces the content of the current error + * stack with a copy of the content of the error stack specified by + * \p err_stack_id, and it closes the error stack specified by + * \p err_stack_id. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eset_current_stack(hid_t err_stack_id); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Pushes a new error record onto an error stack + * + * \estack_id{err_stack}. If the identifier is #H5E_DEFAULT, the error record + * will be pushed to the current stack. 
+ * \param[in] file Name of the file in which the error was detected + * \param[in] func Name of the function in which the error was detected + * \param[in] line Line number in the file where the error was detected + * \param[in] cls_id Error class identifier + * \param[in] maj_id Major error identifier + * \param[in] min_id Minor error identifier + * \param[in] msg Error description string + * \return \herr_t + * + * \details H5Epush2() pushes a new error record onto the error stack specified + * by \p err_stack.\n + * The error record contains the error class identifier \p cls_id, the + * major and minor message identifiers \p maj_id and \p min_id, the + * function name \p func where the error was detected, the file name \p + * file and line number \p line in the file where the error was + * detected, and an error description \p msg.\n + * The major and minor errors must be in the same error class.\n + * The function name, filename, and error description strings must be + * statically allocated.\n + * \p msg can be a format control string with additional + * arguments, which are supplied in the same way as for the C + * functions printf() and fprintf(). + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Epush2(hid_t err_stack, const char *file, const char *func, unsigned line, hid_t cls_id, + hid_t maj_id, hid_t min_id, const char *msg, ...); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Deletes the specified number of error messages from the error stack + * + * \estack_id{err_stack} + * \param[in] count The number of error messages to be deleted from the top + * of the error stack + * \return \herr_t + * + * \details H5Epop() deletes the number of error records specified in \p count + * from the top of the error stack specified by \p err_stack (including + * major, minor messages and description).
+ * + * \since 1.8.0 + */ +H5_DLL herr_t H5Epop(hid_t err_stack, size_t count); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Prints the specified error stack in a default manner + * + * \estack_id{err_stack} + * \param[in] stream File pointer, or \c NULL for \c stderr + * \return \herr_t + * + * \details H5Eprint2() prints the error stack specified by \p err_stack on the + * specified stream, \p stream. Even if the error stack is empty, a + * one-line message of the following form will be printed: + * \code{.unparsed} + * HDF5-DIAG: Error detected in HDF5 library version: 1.5.62 thread 0. + * \endcode + * + * A similar line will appear before the error messages of each error + * class stating the library name, library version number, and thread + * identifier. + * + * If \p err_stack is #H5E_DEFAULT, the current error stack will be + * printed. + * + * H5Eprint2() is a convenience function for H5Ewalk2() with a function + * that prints error messages. Users are encouraged to write their own + * more specific error handlers. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eprint2(hid_t err_stack, FILE *stream); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Walks the specified error stack, calling the specified function + * + * \estack_id{err_stack} + * \param[in] direction Direction in which the error stack is to be walked + * \param[in] func Function to be called for each error encountered + * \param[in] client_data Data to be passed to \p func + * \return \herr_t + * + * \details H5Ewalk2() walks the error stack specified by \p err_stack for the + * current thread and calls the function specified in \p func for each + * error along the way. + * + * If the value of \p err_stack is #H5E_DEFAULT, then H5Ewalk2() walks + * the current error stack.
+ * + * \p direction specifies whether the stack is walked from the inside + * out or the outside in. A value of #H5E_WALK_UPWARD means to begin + * with the most specific error and end at the API; a value of + * #H5E_WALK_DOWNWARD means to start at the API and end at the + * innermost function where the error was first detected. + * + * \p func, a function conforming to the #H5E_walk2_t prototype, will + * be called for each error in the error stack. Its arguments will + * include an index number \c n (beginning at zero regardless of stack + * traversal direction), an error stack entry \c err_desc, and the \c + * client_data pointer passed to H5Ewalk2(). The #H5E_walk2_t prototype + * is as follows: + * \snippet this H5E_walk2_t_snip + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Ewalk2(hid_t err_stack, H5E_direction_t direction, H5E_walk2_t func, void *client_data); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Returns the settings for the automatic error stack traversal + * function and its data + * + * \estack_id + * \param[out] func The function currently set to be called upon an error condition + * \param[out] client_data Data currently set to be passed to the error function + * \return \herr_t + * + * \details H5Eget_auto2() returns the settings for the automatic error stack + * traversal function, \p func, and its data, \p client_data, that are + * associated with the error stack specified by \p estack_id. + * + * Either or both of the \p func and \p client_data arguments may be + * \c NULL, in which case the value is not returned. + * + * The library initializes its default error stack traversal functions + * to H5Eprint1() and H5Eprint2(). A call to H5Eget_auto2() returns + * H5Eprint2() or the user-defined function passed in through + * H5Eset_auto2(). A call to H5Eget_auto1() returns H5Eprint1() or the + * user-defined function passed in through H5Eset_auto1().
However, if + * the application passes in a user-defined function through + * H5Eset_auto1(), it should call H5Eget_auto1() to query the traversal + * function. If the application passes in a user-defined function + * through H5Eset_auto2(), it should call H5Eget_auto2() to query the + * traversal function. + * + * Mixing the new style and the old style functions will cause a + * failure. For example, if the application sets a user-defined + * old-style traversal function through H5Eset_auto1(), a call to + * H5Eget_auto2() will fail and will indicate that the application has + * mixed H5Eset_auto1() and H5Eget_auto2(). On the other hand, mixing + * H5Eset_auto2() and H5Eget_auto1() will also cause a failure. But if + * the traversal functions are the library’s default H5Eprint1() or + * H5Eprint2(), mixing H5Eset_auto1() and H5Eget_auto2() or mixing + * H5Eset_auto2() and H5Eget_auto1() does not fail. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eget_auto2(hid_t estack_id, H5E_auto2_t *func, void **client_data); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Turns automatic error printing on or off + * + * \estack_id + * \param[in] func Function to be called upon an error condition + * \param[in] client_data Data passed to the error function + * \return \herr_t + * + * \details H5Eset_auto2() turns on or off automatic printing of errors for the + * error stack specified with \p estack_id. An \p estack_id value of + * #H5E_DEFAULT indicates the current stack. + * + * When automatic printing is turned on, by the use of a non-null \p func + * pointer, any API function which returns an error indication will + * first call \p func, passing it \p client_data as an argument. 
+ * + * \p func, a function compliant with the #H5E_auto2_t prototype, is + * defined in the H5Epublic.h source code file as: + * \snippet this H5E_auto2_t_snip + * + * When the library is first initialized, the auto printing function is + * set to H5Eprint2() (cast appropriately) and \p client_data is the + * standard error stream pointer, \c stderr. + * + * Automatic stack traversal is always in the #H5E_WALK_DOWNWARD + * direction. + * + * Automatic error printing is turned off with a H5Eset_auto2() call + * with a \c NULL \p func pointer. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eset_auto2(hid_t estack_id, H5E_auto2_t func, void *client_data); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Clears the specified error stack or the error stack for the current thread + * + * \estack_id{err_stack} + * \return \herr_t + * + * \details H5Eclear2() clears the error stack specified by \p err_stack, or, if + * \p err_stack is set to #H5E_DEFAULT, the error stack for the current + * thread. + * + * \p err_stack is an error stack identifier, such as that returned by + * H5Eget_current_stack(). + * + * The current error stack is also cleared whenever an API function is + * called, with certain exceptions (for instance, H5Eprint1() or + * H5Eprint2()). + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eclear2(hid_t err_stack); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Determines the type of error stack + * + * \estack_id{err_stack} + * \param[out] is_stack A flag indicating which error stack \c typedef the + * specified error stack conforms to + * + * \return \herr_t + * + * \details H5Eauto_is_v2() determines whether the error auto reporting function + * for an error stack conforms to the #H5E_auto2_t \c typedef or the + * #H5E_auto1_t \c typedef. 
+ * + * The \p is_stack parameter is set to 1 if the error stack conforms to + * #H5E_auto2_t and 0 if it conforms to #H5E_auto1_t. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Eauto_is_v2(hid_t err_stack, unsigned *is_stack); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Retrieves an error message + * + * \param[in] msg_id Error message identifier + * \param[out] type The type of the error message. Valid values are #H5E_MAJOR + * and #H5E_MINOR. + * \param[out] msg Error message buffer + * \param[in] size The length of error message to be returned by this function + * \return Returns the size of the error message in bytes on success; otherwise + * returns a negative value. + * + * \details H5Eget_msg() retrieves the error message including its length and + * type. The error message is specified by \p msg_id. The user is + * responsible for passing in sufficient buffer space for the + * message. If \p msg is not NULL and \p size is greater than zero, up + * to \p size characters of the error message are returned. The length + * of the message is also returned. If NULL is passed in as \p msg, + * only the length and type of the message are returned. A return + * value of zero means there is no message. + * + * \since 1.8.0 + */ H5_DLL ssize_t H5Eget_msg(hid_t msg_id, H5E_type_t *type, char *msg, size_t size); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Retrieves the number of error messages in an error stack + * + * \estack_id{error_stack_id} + * \return Returns a non-negative value on success; otherwise returns a negative value. + * + * \details H5Eget_num() retrieves the number of error records in the error + * stack specified by \p error_stack_id (including major, minor + * messages and description).
+ * + * \since 1.8.0 + */ H5_DLL ssize_t H5Eget_num(hid_t error_stack_id); /* Symbols defined for compatibility with previous versions of the HDF5 API. @@ -189,30 +670,259 @@ H5_DLL ssize_t H5Eget_num(hid_t error_stack_id); typedef hid_t H5E_major_t; typedef hid_t H5E_minor_t; -/* Information about an error element of error stack. */ +/** + * Information about an error element of error stack. + */ typedef struct H5E_error1_t { - H5E_major_t maj_num; /*major error number */ - H5E_minor_t min_num; /*minor error number */ - const char *func_name; /*function in which error occurred */ - const char *file_name; /*file in which error occurred */ - unsigned line; /*line in file where error occurs */ - const char *desc; /*optional supplied description */ + H5E_major_t maj_num; /**< major error number */ + H5E_minor_t min_num; /**< minor error number */ + const char *func_name; /**< function in which error occurred */ + const char *file_name; /**< file in which error occurred */ + unsigned line; /**< line in file where error occurs */ + const char *desc; /**< optional supplied description */ } H5E_error1_t; /* Error stack traversal callback function pointers */ +//! +/** + * \brief Callback function for H5Ewalk1() + * + * \param[in] n Indexed error position in the stack + * \param[in] err_desc Pointer to a data structure describing the error + * \param[in] client_data Pointer to client data in the format expected by the + * user-defined function + * \return \herr_t + */ typedef herr_t (*H5E_walk1_t)(int n, H5E_error1_t *err_desc, void *client_data); +//! + +//! +/** + * \brief Callback function for H5Eset_auto1() + * + * \param[in] client_data Pointer to client data in the format expected by the + * user-defined function + * \return \herr_t + */ typedef herr_t (*H5E_auto1_t)(void *client_data); +//! 
/* Function prototypes */ +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Clears the error stack for the current thread + * + * \return \herr_t + * + * \details H5Eclear1() clears the error stack for the current thread.\n + * The stack is also cleared whenever an API function is called, with + * certain exceptions (for instance, H5Eprint1()). + * + * \deprecated 1.8.0 Function H5Eclear() renamed to H5Eclear1() and deprecated + * in this release. + */ H5_DLL herr_t H5Eclear1(void); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Returns the current settings for the automatic error stack traversal + * function and its data + * + * \param[out] func Current setting for the function to be called upon an error + * condition + * \param[out] client_data Current setting for the data passed to the error + * function + * \return \herr_t + * + * \details H5Eget_auto1() returns the current settings for the automatic error + * stack traversal function, \p func, and its data, + * \p client_data. Either or both arguments may be \c NULL, in which case the + * value is not returned. + * + * The library initializes its default error stack traversal functions + * to H5Eprint1() and H5Eprint2(). A call to H5Eget_auto2() returns + * H5Eprint2() or the user-defined function passed in through + * H5Eset_auto2(). A call to H5Eget_auto1() returns H5Eprint1() or the + * user-defined function passed in through H5Eset_auto1(). However, if + * the application passes in a user-defined function through + * H5Eset_auto1(), it should call H5Eget_auto1() to query the traversal + * function. If the application passes in a user-defined function + * through H5Eset_auto2(), it should call H5Eget_auto2() to query the + * traversal function. + * + * Mixing the new style and the old style functions will cause a + * failure. 
For example, if the application sets a user-defined + * old-style traversal function through H5Eset_auto1(), a call to + * H5Eget_auto2() will fail and will indicate that the application has + * mixed H5Eset_auto1() and H5Eget_auto2(). Likewise, mixing + * H5Eset_auto2() and H5Eget_auto1() will cause a failure. But if + * the traversal functions are the library’s default H5Eprint1() or + * H5Eprint2(), mixing H5Eset_auto1() and H5Eget_auto2() or mixing + * H5Eset_auto2() and H5Eget_auto1() does not fail. + * + * \deprecated 1.8.0 Function H5Eget_auto() renamed to H5Eget_auto1() and + * deprecated in this release. + */ H5_DLL herr_t H5Eget_auto1(H5E_auto1_t *func, void **client_data); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Pushes a new error record onto the error stack + * + * \param[in] file Name of the file in which the error was detected + * \param[in] func Name of the function in which the error was detected + * \param[in] line Line number in the file where the error was detected + * \param[in] maj Major error identifier + * \param[in] min Minor error identifier + * \param[in] str Error description string + * \return \herr_t + * + * \details H5Epush1() pushes a new error record onto the error stack for the + * current thread.\n + * The error has major and minor numbers \p maj + * and \p min, the function \p func where the error was detected, the + * name of the file \p file where the error was detected, the line \p line + * within that file, and an error description string \p str.\n + * The function name, filename, and error description strings must be statically + * allocated. + * + * \since 1.4.0 + * \deprecated 1.8.0 Function H5Epush() renamed to H5Epush1() and + * deprecated in this release.
+ */ H5_DLL herr_t H5Epush1(const char *file, const char *func, unsigned line, H5E_major_t maj, H5E_minor_t min, const char *str); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Prints the current error stack in a default manner + * + * \param[in] stream File pointer, or \c NULL for \c stderr + * \return \herr_t + * + * \details H5Eprint1() prints the error stack for the current thread + * on the specified stream, \p stream. Even if the error stack is empty, a + * one-line message of the following form will be printed: + * \code{.unparsed} + * HDF5-DIAG: Error detected in thread 0. + * \endcode + * H5Eprint1() is a convenience function for H5Ewalk1() with a function + * that prints error messages. Users are encouraged to write their own + * more specific error handlers. + * + * \deprecated 1.8.0 Function H5Eprint() renamed to H5Eprint1() and + * deprecated in this release. + */ H5_DLL herr_t H5Eprint1(FILE *stream); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Turns automatic error printing on or off + * + * \param[in] func Function to be called upon an error condition + * \param[in] client_data Data passed to the error function + * \return \herr_t + * + * \details H5Eset_auto1() turns on or off automatic printing of errors. When + * turned on (non-null \p func pointer), any API function which returns + * an error indication will first call \p func, passing it \p + * client_data as an argument. + * + * \p func, a function conforming to the #H5E_auto1_t prototype, is + * defined in the H5Epublic.h source code file as: + * \snippet this H5E_auto1_t_snip + * + * When the library is first initialized, the auto printing function is + * set to H5Eprint1() (cast appropriately) and \p client_data is the + * standard error stream pointer, \c stderr.
+ * + * Automatic stack traversal is always in the #H5E_WALK_DOWNWARD + * direction. + * + * \deprecated 1.8.0 Function H5Eset_auto() renamed to H5Eset_auto1() and + * deprecated in this release. + */ H5_DLL herr_t H5Eset_auto1(H5E_auto1_t func, void *client_data); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Walks the current error stack, calling the specified function + * + * \param[in] direction Direction in which the error stack is to be walked + * \param[in] func Function to be called for each error encountered + * \param[in] client_data Data to be passed to \p func + * \return \herr_t + * + * \details H5Ewalk1() walks the error stack for the current thread and calls + * the function specified in \p func for each error along the way. + * + * \p direction specifies whether the stack is walked from the inside + * out or the outside in. A value of #H5E_WALK_UPWARD means to begin + * with the most specific error and end at the API; a value of + * #H5E_WALK_DOWNWARD means to start at the API and end at the + * innermost function where the error was first detected. + * + * \p func, a function conforming to the #H5E_walk1_t prototype, will + * be called for each error in the error stack. Its arguments will + * include an index number \c n (beginning at zero regardless of stack + * traversal direction), an error stack entry \c err_desc, and the \c + * client_data pointer passed to H5Ewalk1(). The #H5E_walk1_t prototype + * is as follows: + * \snippet this H5E_walk1_t_snip + * + * \deprecated 1.8.0 Function H5Ewalk() renamed to H5Ewalk1() and + * deprecated in this release.
+ */ H5_DLL herr_t H5Ewalk1(H5E_direction_t direction, H5E_walk1_t func, void *client_data); -H5_DLL char * H5Eget_major(H5E_major_t maj); -H5_DLL char * H5Eget_minor(H5E_minor_t min); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Returns a character string describing an error specified by a major + * error number + * + * \param[in] maj Major error number + * \return Returns a character string describing the error on success. + * + * \details Given a major error number, H5Eget_major() returns a + * character string that describes the error. + * + * \attention This function returns a dynamically allocated string (\c char + * array). An application calling this function must free the memory + * associated with the return value to prevent a memory leak. + * + * \deprecated 1.8.0 Function deprecated in this release. + */ +H5_DLL char *H5Eget_major(H5E_major_t maj); +/** + * -------------------------------------------------------------------------- + * \ingroup H5E + * + * \brief Returns a character string describing an error specified by a minor + * error number + * + * \param[in] min Minor error number + * \return Returns a character string describing the error on success. + * + * \details Given a minor error number, H5Eget_minor() returns a + * character string that describes the error. + * + * \attention In the Release 1.8.x series, H5Eget_minor() returns a + * dynamically allocated string (\c char array). An application + * calling this function from an HDF5 library of Release 1.8.0 or + * later must free the memory associated with the return value to + * prevent a memory leak. This is a change from the 1.6.x release + * series. + * + * \deprecated 1.8.0 Function deprecated and return type changed in this release.
+ */ +H5_DLL char *H5Eget_minor(H5E_minor_t min); #endif /* H5_NO_DEPRECATED_SYMBOLS */ #ifdef __cplusplus diff --git a/src/H5FDcore.h b/src/H5FDcore.h index f8a516a..d456c3e 100644 --- a/src/H5FDcore.h +++ b/src/H5FDcore.h @@ -25,8 +25,70 @@ #ifdef __cplusplus extern "C" { #endif -H5_DLL hid_t H5FD_core_init(void); +H5_DLL hid_t H5FD_core_init(void); + +/** + * \ingroup FAPL + * + * \brief Modifies the file access property list to use the #H5FD_CORE driver + * + * \fapl_id + * \param[in] increment Size, in bytes, of memory increments + * \param[in] backing_store Boolean flag indicating whether to write the file + * contents to disk when the file is closed + * \returns \herr_t + * + * \details H5Pset_fapl_core() modifies the file access property list to use the + * #H5FD_CORE driver. + * + * The #H5FD_CORE driver enables an application to work with a file in + * memory, speeding reads and writes as no disk access is made. File + * contents are stored only in memory until the file is closed. The \p + * backing_store parameter determines whether file contents are ever + * written to disk. + * + * \p increment specifies the increment by which allocated memory is to + * be increased each time more memory is required. + * + * When H5Fcreate() is used to create a core file and \p + * backing_store is set to 1 (TRUE), the file contents are flushed to a + * file with the same name as this core file when the file is closed or + * access to the file is terminated in memory. + * + * The application is allowed to open an existing file with the + * #H5FD_CORE driver. When H5Fopen() is used to open an existing file, + * if \p backing_store is set to 1 (TRUE) and the \c flags argument of + * H5Fopen() is set to #H5F_ACC_RDWR, any changes to the file contents + * are saved to the file when the file is closed. If \p backing_store + * is set to 0 (FALSE) and the \c flags argument of H5Fopen() is set to + * #H5F_ACC_RDWR, any changes to the file contents will be lost when + * the file is closed.
If the flags for + * H5Fopen() is set to #H5F_ACC_RDONLY, no change to the file is + * allowed either in memory or on file. + * + * \note Currently this driver cannot create or open family or multi files. + * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_fapl_core(hid_t fapl_id, size_t increment, hbool_t backing_store); + +/** + * \ingroup FAPL + * + * \brief Queries core file driver properties + * + * \fapl_id + * \param[out] increment Size, in bytes, of memory increments + * \param[out] backing_store Boolean flag indicating whether to write the file + * contents to disk when the file is closed + * \returns \herr_t + * + * \details H5Pget_fapl_core() queries the #H5FD_CORE driver properties as set + * by H5Pset_fapl_core(). + * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pget_fapl_core(hid_t fapl_id, size_t *increment /*out*/, hbool_t *backing_store /*out*/); #ifdef __cplusplus } diff --git a/src/H5FDdirect.h b/src/H5FDdirect.h index eec10de..f06de7f 100644 --- a/src/H5FDdirect.h +++ b/src/H5FDdirect.h @@ -37,8 +37,69 @@ extern "C" { #define FBSIZE_DEF 4096 #define CBSIZE_DEF 16 * 1024 * 1024 -H5_DLL hid_t H5FD_direct_init(void); +H5_DLL hid_t H5FD_direct_init(void); + +/** + * \ingroup FAPL + * + * \brief Sets up use of the direct I/O driver + * + * \fapl_id + * \param[in] alignment Required memory alignment boundary + * \param[in] block_size File system block size + * \param[in] cbuf_size Copy buffer size + * \returns \herr_t + * + * \details H5Pset_fapl_direct() sets the file access property list, \p fapl_id, + * to use the direct I/O driver, #H5FD_DIRECT. With this driver, data + * is written to or read from the file synchronously without being + * cached by the system. + * + * File systems usually require the data address in memory, the file + * address, and the size of the data to be aligned. The HDF5 library’s + * direct I/O driver is able to handle unaligned data, though that will + * consume some additional memory resources and may slow + * performance. 
To get better performance, use the system function \p + * posix_memalign to align the data buffer in memory and the HDF5 + * function H5Pset_alignment() to align the data in the file. Be aware, + * however, that aligned data I/O may cause the HDF5 file to be bigger + * than the actual data size would otherwise require because the + * alignment may leave some holes in the file. + * + * \p alignment specifies the required alignment boundary in memory. + * + * \p block_size specifies the file system block size. A value of 0 + * (zero) means to use the HDF5 library’s default value of 4KB. + * + * \p cbuf_size specifies the copy buffer size. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pset_fapl_direct(hid_t fapl_id, size_t alignment, size_t block_size, size_t cbuf_size); + +/** + * \ingroup FAPL + * + * \brief Retrieves direct I/O driver settings + * + * \fapl_id + * \param[out] boundary Required memory alignment boundary + * \param[out] block_size File system block size + * \param[out] cbuf_size Copy buffer size + * \returns \herr_t + * + * \details H5Pget_fapl_direct() retrieves the required memory alignment (\p + * boundary), file system block size (\p block_size), and copy buffer + * size (\p cbuf_size) settings for the direct I/O driver, #H5FD_DIRECT, + * from the file access property list \p fapl_id. + * + * See H5Pset_fapl_direct() for discussion of these values, + * requirements, and important considerations.
+ * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pget_fapl_direct(hid_t fapl_id, size_t *boundary /*out*/, size_t *block_size /*out*/, size_t *cbuf_size /*out*/); diff --git a/src/H5FDfamily.h b/src/H5FDfamily.h index f00836f..20ef532 100644 --- a/src/H5FDfamily.h +++ b/src/H5FDfamily.h @@ -26,8 +26,58 @@ extern "C" { #endif -H5_DLL hid_t H5FD_family_init(void); +H5_DLL hid_t H5FD_family_init(void); + +/** + * \ingroup FAPL + * + * \brief Sets the file access property list to use the family driver + * + * \fapl_id + * \param[in] memb_size Size in bytes of each file member + * \param[in] memb_fapl_id Identifier of file access property list for + * each family member + * \returns \herr_t + * + * \details H5Pset_fapl_family() sets the file access property list identifier, + * \p fapl_id, to use the family driver. + * + * \p memb_size is the size in bytes of each file member. This size + * will be saved in file when the property list \p fapl_id is used to + * create a new file. If \p fapl_id is used to open an existing file, + * \p memb_size has to be equal to the original size saved in file. A + * failure with an error message indicating the correct member size + * will be returned if \p memb_size does not match the size saved. If + * any user does not know the original size, #H5F_FAMILY_DEFAULT can be + * passed in. The library will retrieve the saved size. + * + * \p memb_fapl_id is the identifier of the file access property list + * to be used for each family member. + * + * \version 1.8.0 Behavior of the \p memb_size parameter was changed. 
+ * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_fapl_family(hid_t fapl_id, hsize_t memb_size, hid_t memb_fapl_id); + +/** + * \ingroup FAPL + * + * \brief Returns file access property list information + * + * \fapl_id + * \param[out] memb_size Size in bytes of each file member + * \param[out] memb_fapl_id Identifier of file access property list for + * each family member + * \returns \herr_t + * + * \details H5Pget_fapl_family() returns file access property list for use with + * the family driver. This information is returned through the output + * parameters. + * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pget_fapl_family(hid_t fapl_id, hsize_t *memb_size /*out*/, hid_t *memb_fapl_id /*out*/); #ifdef __cplusplus diff --git a/src/H5FDhdfs.h b/src/H5FDhdfs.h index abe7682..8d65ac7 100644 --- a/src/H5FDhdfs.h +++ b/src/H5FDhdfs.h @@ -112,8 +112,20 @@ typedef struct H5FD_hdfs_fapl_t { int32_t stream_buffer_size; } H5FD_hdfs_fapl_t; -H5_DLL hid_t H5FD_hdfs_init(void); +H5_DLL hid_t H5FD_hdfs_init(void); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pget_fapl_hdfs(hid_t fapl_id, H5FD_hdfs_fapl_t *fa_out); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pset_fapl_hdfs(hid_t fapl_id, H5FD_hdfs_fapl_t *fa); #ifdef __cplusplus diff --git a/src/H5FDlog.h b/src/H5FDlog.h index aa1f3cb..969c091 100644 --- a/src/H5FDlog.h +++ b/src/H5FDlog.h @@ -65,7 +65,410 @@ extern "C" { #endif -H5_DLL hid_t H5FD_log_init(void); +H5_DLL hid_t H5FD_log_init(void); + +/** + * \ingroup FAPL + * + * \brief Sets up the logging virtual file driver (#H5FD_LOG) for use + * + * \fapl_id + * \param[in] logfile Name of the log file + * \param[in] flags Flags specifying the types of logging activity + * \param[in] buf_size The size of the logging buffers, in bytes (see description) + * \returns \herr_t + * + * \details H5Pset_fapl_log() modifies the file access property list to use the + * logging driver, #H5FD_LOG. 
The logging virtual file driver (VFD) is + * a clone of the standard SEC2 (#H5FD_SEC2) driver with additional + * facilities for logging VFD metrics and activity to a file. + * + * \p logfile is the name of the file in which the logging entries are + * to be recorded. + * + * The actions to be logged are specified in the parameter \p flags + * using the pre-defined constants described in the following + * table. Multiple flags can be set through the use of a logical \c OR + * contained in parentheses. For example, logging read and write + * locations would be specified as + * \Code{(H5FD_LOG_LOC_READ|H5FD_LOG_LOC_WRITE)}. + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
    Table1: Logging Flags
    + * #H5FD_LOG_LOC_READ + * + * Track the location and length of every read, write, or seek operation. + *
    #H5FD_LOG_LOC_WRITE
    #H5FD_LOG_LOC_SEEK
    + * #H5FD_LOG_LOC_IO + * + * Track all I/O locations and lengths. The logical equivalent of the following: + * \Code{(#H5FD_LOG_LOC_READ | #H5FD_LOG_LOC_WRITE | #H5FD_LOG_LOC_SEEK)} + *
    + * #H5FD_LOG_FILE_READ + * + * Track the number of times each byte is read or written. + *
    #H5FD_LOG_FILE_WRITE
    + * #H5FD_LOG_FILE_IO + * + * Track the number of times each byte is read and written. The logical + * equivalent of the following: + * \Code{(#H5FD_LOG_FILE_READ | #H5FD_LOG_FILE_WRITE)} + *
    + * #H5FD_LOG_FLAVOR + * + * Track the type, or flavor, of information stored at each byte. + *
    + * #H5FD_LOG_NUM_READ + * + * Track the total number of read, write, seek, or truncate operations that occur. + *
    #H5FD_LOG_NUM_WRITE
    #H5FD_LOG_NUM_SEEK
    #H5FD_LOG_NUM_TRUNCATE
    + * #H5FD_LOG_NUM_IO + * + * Track the total number of all types of I/O operations. The logical equivalent + * of the following: + * \Code{(#H5FD_LOG_NUM_READ | #H5FD_LOG_NUM_WRITE | #H5FD_LOG_NUM_SEEK | #H5FD_LOG_NUM_TRUNCATE)} + *
    + * #H5FD_LOG_TIME_OPEN + * + * Track the time spent in open, stat, read, write, seek, or close operations. + *
    #H5FD_LOG_TIME_STAT
    #H5FD_LOG_TIME_READ
    #H5FD_LOG_TIME_WRITE
    #H5FD_LOG_TIME_SEEK
    #H5FD_LOG_TIME_CLOSE
    + * #H5FD_LOG_TIME_IO + * + * Track the time spent in each of the above operations. The logical equivalent + * of the following: + * \Code{(#H5FD_LOG_TIME_OPEN | #H5FD_LOG_TIME_STAT | #H5FD_LOG_TIME_READ | #H5FD_LOG_TIME_WRITE | + * #H5FD_LOG_TIME_SEEK | #H5FD_LOG_TIME_CLOSE)} + *
    + * #H5FD_LOG_ALLOC + * + * Track the allocation of space in the file. + *
    + * #H5FD_LOG_ALL + * + * Track everything. The logical equivalent of the following: + * \Code{(#H5FD_LOG_ALLOC | #H5FD_LOG_TIME_IO | #H5FD_LOG_NUM_IO | #H5FD_LOG_FLAVOR | #H5FD_LOG_FILE_IO | + * #H5FD_LOG_LOC_IO)} + *
    + * The logging driver can track the number of times each byte in the file is + * read from or written to (using #H5FD_LOG_FILE_READ and #H5FD_LOG_FILE_WRITE) + * and what kind of data is at that location (e.g., metadata, raw data; using + * #H5FD_LOG_FLAVOR). This information is tracked in internal buffers of size + * buf_size, which must be at least the maximum size in bytes of the file to be + * logged while the log driver is in use.\n + * One buffer of size buf_size will be created for each of #H5FD_LOG_FILE_READ, + * #H5FD_LOG_FILE_WRITE and #H5FD_LOG_FLAVOR when those flags are set; these + * buffers will not grow as the file increases in size. + * + * \par Output: + * This section describes the logging driver (LOG VFD) output.\n + * The table, immediately below, describes output of the various logging driver + * flags and function calls. A list of valid flavor values, describing the type + * of data stored, follows the table. + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
    Table2: Logging Output
    Flag | VFD Call | Output and Comments
    #H5FD_LOG_LOC_READ | Read | + * \Code{%10a-%10a (%10Zu bytes) (%s) Read}\n\n + * Start position\n + * End position\n + * Number of bytes\n + * Flavor of read\n\n + * Adds \Code{(\%f s)} and seek time if #H5FD_LOG_TIME_SEEK is also set. + *
    #H5FD_LOG_LOC_READ | Read Error | + * \Code{Error! Reading: %10a-%10a (%10Zu bytes)}\n\n + * Same parameters as non-error entry. + *
    #H5FD_LOG_LOC_WRITE | Write | + * \Code{%10a-%10a (%10Zu bytes) (%s) Written}\n\n + * Start position\n + * End position\n + * Number of bytes\n + * Flavor of write\n\n + * Adds \Code{(\%f s)} and seek time if #H5FD_LOG_TIME_SEEK is also set. + *
    #H5FD_LOG_LOC_WRITE | Write Error | + * \Code{Error! Writing: %10a-%10a (%10Zu bytes)}\n\n + * Same parameters as non-error entry. + *
    #H5FD_LOG_LOC_SEEK | Read, Write | + * \Code{Seek: From %10a-%10a}\n\n + * Start position\n + * End position\n\n + * Adds \Code{(\%f s)} and seek time if #H5FD_LOG_TIME_SEEK is also set. + *
    #H5FD_LOG_FILE_READ | Close | + * Begins with:\n + * Dumping read I/O information\n\n + * Then, for each range of identical values, there is this line:\n + * \Code{Addr %10-%10 (%10lu bytes) read from %3d times}\n\n + * Start address\n + * End address\n + * Number of bytes\n + * Number of times read\n\n + * Note: The data buffer is scanned and each range of identical values + * gets one entry in the log file to save space and make it easier to read. + *
    #H5FD_LOG_FILE_WRITE | Close | + * Begins with:\n + * Dumping write I/O information\n\n + * Then, for each range of identical values, there is this line:\n + * \Code{Addr %10-%10 (%10lu bytes) written to %3d times}\n\n + * Start address\n + * End address\n + * Number of bytes\n + * Number of times written\n\n + * Note: The data buffer is scanned and each range of identical values + * gets one entry in the log file to save space and make it easier to read. + *
    #H5FD_LOG_FLAVOR | Close | + * Begins with:\n + * Dumping I/O flavor information\n\n + * Then, for each range of identical values, there is this line:\n + * \Code{Addr %10-%10 (%10lu bytes) flavor is %s}\n\n + * Start address\n + * End address\n + * Number of bytes\n + * Flavor\n\n + * Note: The data buffer is scanned and each range of identical values + * gets one entry in the log file to save space and make it easier to read. + *
    #H5FD_LOG_NUM_READ | Close | + * Total number of read operations: \Code{%11u} + *
    #H5FD_LOG_NUM_WRITE | Close | + * Total number of write operations: \Code{%11u} + *
    #H5FD_LOG_NUM_SEEK | Close | + * Total number of seek operations: \Code{%11u} + *
    #H5FD_LOG_NUM_TRUNCATE | Close | + * Total number of truncate operations: \Code{%11u} + *
    #H5FD_LOG_TIME_OPEN | Open | + * Open took: \Code{(\%f s)} + *
    #H5FD_LOG_TIME_READ | Close, Read | + * Total time in read operations: \Code{\%f s}\n\n + * See also: #H5FD_LOG_LOC_READ + *
    #H5FD_LOG_TIME_WRITE | Close, Write | + * Total time in write operations: \Code{\%f s}\n\n + * See also: #H5FD_LOG_LOC_WRITE + *
    #H5FD_LOG_TIME_SEEK | Close, Read, Write | + * Total time in seek operations: \Code{\%f s}\n\n + * See also: #H5FD_LOG_LOC_SEEK or #H5FD_LOG_LOC_WRITE + *
    #H5FD_LOG_TIME_CLOSE | Close | + * Close took: \Code{(\%f s)} + *
    #H5FD_LOG_TIME_STAT | Open | + * Stat took: \Code{(\%f s)} + *
    #H5FD_LOG_ALLOC | Alloc | + * \Code{%10-%10 (%10Hu bytes) (\%s) Allocated}\n\n + * Start of address space\n + * End of address space\n + * Total size allocation\n + * Flavor of allocation + *
    + * + * \par Flavors: + * The \Emph{flavor} describes the type of stored information. The following + * table lists the flavors that appear in log output and briefly describes each. + * These terms are provided here to aid in the construction of log message + * parsers; a full description is beyond the scope of this document. + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
    Table3: Flavors of logged data
    Flavor | Description
    #H5FD_MEM_NOLIST | Error value
    #H5FD_MEM_DEFAULT | Value not yet set.\n + * May also be a datatype set in a larger allocation that will be + * suballocated by the library.
    #H5FD_MEM_SUPER | Superblock data
    #H5FD_MEM_BTREE | B-tree data
    #H5FD_MEM_DRAW | Raw data (for example, contents of a dataset)
    #H5FD_MEM_GHEAP | Global heap data
    #H5FD_MEM_LHEAP | Local heap data
    #H5FD_MEM_OHDR | Object header data
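Since the flavor names are listed to help with building log message parsers, here is a minimal sketch of parsing one read or write entry in plain C. It assumes that the `%10a` fields in the documented format are rendered as right-aligned decimal addresses and that the flavor appears as its symbolic name; both are assumptions, not something this patch states. The type `sketch_log_entry_t` and the function `sketch_parse_log_entry()` are hypothetical names and not part of the HDF5 API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed "Read"/"Written" entry (hypothetical type, not an HDF5 type). */
typedef struct {
    unsigned long long start;  /* start position */
    unsigned long long end;    /* end position */
    unsigned long long nbytes; /* number of bytes */
    char flavor[64];           /* flavor text, e.g. "H5FD_MEM_DRAW" (assumed) */
    char op[16];               /* "Read" or "Written" */
} sketch_log_entry_t;

/* Returns 1 if `line` matches the assumed layout
 *   "<start>-<end> (<nbytes> bytes) (<flavor>) Read|Written"
 * and 0 otherwise, e.g. for "Error! ..." entries. */
int sketch_parse_log_entry(const char *line, sketch_log_entry_t *out)
{
    assert(line != NULL && out != NULL);

    /* Whitespace in the format matches the padding of the fixed-width fields. */
    int n = sscanf(line, " %llu-%llu (%llu bytes) (%63[^)]) %15s",
                   &out->start, &out->end, &out->nbytes, out->flavor, out->op);
    if (n != 5)
        return 0;

    return strcmp(out->op, "Read") == 0 || strcmp(out->op, "Written") == 0;
}
```

A caller would read the log file line by line, pass each line to `sketch_parse_log_entry()`, and handle rejected lines (errors, seek entries, and the close-time summaries) separately.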
    + * + * \version 1.8.7 The flags parameter has been changed from \Code{unsigned int} + * to \Code{unsigned long long}. + * The implementation of the #H5FD_LOG_TIME_OPEN, #H5FD_LOG_TIME_READ, + * #H5FD_LOG_TIME_WRITE, and #H5FD_LOG_TIME_SEEK flags has been finished. + * New flags were added: #H5FD_LOG_NUM_TRUNCATE and #H5FD_LOG_TIME_STAT. + * \version 1.6.0 The \c verbosity parameter has been removed. + * Two new parameters have been added: \p flags of type \Code{unsigned} and + * \p buf_size of type \Code{size_t}. + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_fapl_log(hid_t fapl_id, const char *logfile, unsigned long long flags, size_t buf_size); #ifdef __cplusplus diff --git a/src/H5FDmirror.h b/src/H5FDmirror.h index 8aef934..49e24c1 100644 --- a/src/H5FDmirror.h +++ b/src/H5FDmirror.h @@ -61,8 +61,20 @@ typedef struct H5FD_mirror_fapl_t { char remote_ip[H5FD_MIRROR_MAX_IP_LEN + 1]; } H5FD_mirror_fapl_t; -H5_DLL hid_t H5FD_mirror_init(void); +H5_DLL hid_t H5FD_mirror_init(void); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pget_fapl_mirror(hid_t fapl_id, H5FD_mirror_fapl_t *fa_out); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pset_fapl_mirror(hid_t fapl_id, H5FD_mirror_fapl_t *fa); #ifdef __cplusplus diff --git a/src/H5FDmpi.h b/src/H5FDmpi.h index 3af5e41..cf49301 100644 --- a/src/H5FDmpi.h +++ b/src/H5FDmpi.h @@ -34,10 +34,12 @@ */ #define H5D_MULTI_CHUNK_IO_COL_THRESHOLD 60 -/* Type of I/O for data transfer properties */ +/** + * Type of I/O for data transfer properties + */ typedef enum H5FD_mpio_xfer_t { - H5FD_MPIO_INDEPENDENT = 0, /*zero is the default*/ - H5FD_MPIO_COLLECTIVE + H5FD_MPIO_INDEPENDENT = 0, /**< Use independent I/O access */ + H5FD_MPIO_COLLECTIVE /**< Use collective I/O access */ } H5FD_mpio_xfer_t; /* Type of chunked dataset I/O */ diff --git a/src/H5FDmpio.h b/src/H5FDmpio.h index 79b52c7..8caf11c 100644 --- a/src/H5FDmpio.h +++ b/src/H5FDmpio.h @@ 
-44,14 +44,237 @@ H5_DLLVAR hbool_t H5FD_mpi_opt_types_g; #ifdef __cplusplus extern "C" { #endif -H5_DLL hid_t H5FD_mpio_init(void); +H5_DLL hid_t H5FD_mpio_init(void); + +/** + * \ingroup FAPL + * + * \brief Stores MPI IO communicator information to the file access property list + * + * \fapl_id + * \param[in] comm MPI-2 communicator + * \param[in] info MPI-2 info object + * \returns \herr_t + * + * \details H5Pset_fapl_mpio() stores the user-supplied MPI IO parameters \p + * comm, for communicator, and \p info, for information, in the file + * access property list \p fapl_id. That property list can then be used + * to create and/or open a file. + * + * H5Pset_fapl_mpio() is available only in the parallel HDF5 library + * and is not a collective function. + * + * \p comm is the MPI communicator to be used for file open, as defined + * in \c MPI_File_open of MPI-2. This function makes a duplicate of the + * communicator, so modifications to \p comm after this function call + * returns have no effect on the file access property list. + * + * \p info is the MPI Info object to be used for file open, as defined + * in MPI_File_open() of MPI-2. This function makes a duplicate copy of + * the Info object, so modifications to the Info object after this + * function call returns will have no effect on the file access + * property list. + * + * If the file access property list already contains previously-set + * communicator and Info values, those values will be replaced and the + * old communicator and Info object will be freed. + * + * \note Raw dataset chunk caching is not currently supported when using this + * file driver in read/write mode. All calls to H5Dread() and H5Dwrite() + * will access the disk directly, and H5Pset_cache() and + * H5Pset_chunk_cache() will have no effect on performance.\n + * Raw dataset chunk caching is supported when this driver is used in + * read-only mode. 
+ * + * \version 1.4.5 Handling of the MPI Communicator and Info object changed at + * this release. A duplicate of each of these is now stored in the property + * list instead of pointers to each. + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_fapl_mpio(hid_t fapl_id, MPI_Comm comm, MPI_Info info); + +/** + * \ingroup FAPL + * + * \brief Returns MPI IO communicator information + * + * \fapl_id + * \param[out] comm MPI-2 communicator + * \param[out] info MPI-2 info object + * \returns \herr_t + * + * \details If the file access property list is set to the #H5FD_MPIO driver, + * H5Pget_fapl_mpio() returns duplicates of the stored MPI communicator + * and Info object through the \p comm and \p info pointers, if those + * values are non-null. + * + * Since the MPI communicator and Info object are duplicates of the + * stored information, future modifications to the access property list + * will not affect them. It is the responsibility of the application to + * free these objects. + * + * \version 1.4.5 Handling of the MPI Communicator and Info object changed at + * this release. A duplicate of each of these is now stored in the + * property list instead of pointers to each. + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pget_fapl_mpio(hid_t fapl_id, MPI_Comm *comm /*out*/, MPI_Info *info /*out*/); + +/** + * \ingroup DXPL + * + * \brief Sets data transfer mode + * + * \dxpl_id + * \param[in] xfer_mode Transfer mode + * \returns \herr_t + * + * \details H5Pset_dxpl_mpio() sets the data transfer property list \p dxpl_id + * to use transfer mode \p xfer_mode. The property list can then be + * used to control the I/O transfer mode during data I/O operations. + * + * Valid transfer modes are #H5FD_MPIO_INDEPENDENT (default) and + * #H5FD_MPIO_COLLECTIVE. 
+ * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_dxpl_mpio(hid_t dxpl_id, H5FD_mpio_xfer_t xfer_mode); + +/** + * \ingroup DXPL + * + * \brief Returns the data transfer mode + * + * \dxpl_id + * \param[out] xfer_mode Transfer mode + * \returns \herr_t + * + * \details H5Pget_dxpl_mpio() queries the data transfer mode currently set in + * the data transfer property list \p dxpl_id. + * + * Upon return, \p xfer_mode contains the data transfer mode, if it is + * non-null. + * + * H5Pget_dxpl_mpio() is not a collective function. + * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pget_dxpl_mpio(hid_t dxpl_id, H5FD_mpio_xfer_t *xfer_mode /*out*/); + +/** + * \ingroup DXPL + * + * \brief Sets data transfer mode + * + * \dxpl_id + * \param[in] opt_mode Transfer mode + * \returns \herr_t + * + * \details H5Pset_dxpl_mpio() sets the data transfer property list \p dxpl_id + * to use transfer mode xfer_mode. The property list can then be used + * to control the I/O transfer mode during data I/O operations. + * + * Valid transfer modes are #H5FD_MPIO_INDEPENDENT (default) and + * #H5FD_MPIO_COLLECTIVE. + * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_dxpl_mpio_collective_opt(hid_t dxpl_id, H5FD_mpio_collective_opt_t opt_mode); + +/** + * \ingroup DXPL + * + * \brief Sets a flag specifying linked-chunk I/O or multi-chunk I/O + * + * \dxpl_id + * \param[in] opt_mode Transfer mode + * \returns \herr_t + * + * \details H5Pset_dxpl_mpio_chunk_opt() specifies whether I/O is to be + * performed as linked-chunk I/O or as multi-chunk I/O. This function + * overrides the HDF5 library's internal algorithm for determining + * which mechanism to use. + * + * When an application uses collective I/O with chunked storage, the + * HDF5 library normally uses an internal algorithm to determine + * whether that I/O activity should be conducted as one linked-chunk + * I/O or as multi-chunk I/O. 
H5Pset_dxpl_mpio_chunk_opt() is provided + * so that an application can override the library's algorithm in + * circumstances where the library might lack the information needed to + * make an optimal decision. + * + * H5Pset_dxpl_mpio_chunk_opt() works by setting one of the following + * flags in the parameter \p opt_mode: + * - #H5FD_MPIO_CHUNK_ONE_IO - Do one-link chunked I/O + * - #H5FD_MPIO_CHUNK_MULTI_IO - Do multi-chunked I/O + * + * This function works by setting a corresponding property in the + * dataset transfer property list \p dxpl_id. + * + * The library performs I/O in the specified manner unless it + * determines that the low-level MPI IO package does not support the + * requested behavior; in such cases, the HDF5 library will internally + * use independent I/O. + * + * Use of this function is optional. + * + * \todo Add missing version information + * + */ H5_DLL herr_t H5Pset_dxpl_mpio_chunk_opt(hid_t dxpl_id, H5FD_mpio_chunk_opt_t opt_mode); + +/** + * \ingroup DXPL + * + * \brief Sets a numeric threshold for linked-chunk I/O + * + * \dxpl_id + * \param[in] num_chunk_per_proc + * \returns \herr_t + * + * \details H5Pset_dxpl_mpio_chunk_opt_num() sets a numeric threshold for the + * use of linked-chunk I/O. + * + * The library will calculate the average number of chunks selected by + * each process when doing collective access with chunked storage. If + * the number is greater than the threshold set in \p + * num_chunk_per_proc, the library will use linked-chunk I/O; + * otherwise, a separate I/O process will be invoked for each chunk + * (multi-chunk I/O). 
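The threshold rule described above can be sketched in plain C as follows. This only illustrates the comparison as the text states it, not the library's internal code: the names `sketch_chunk_io_t` and `sketch_choose_chunk_io()` are hypothetical, and the sketch assumes the average is the exact quotient of the total number of selected chunks and the number of processes.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two I/O strategies named in the text. */
typedef enum {
    SKETCH_LINKED_CHUNK_IO, /* one linked-chunk I/O operation */
    SKETCH_MULTI_CHUNK_IO   /* one I/O operation per chunk */
} sketch_chunk_io_t;

/* Chooses linked-chunk I/O when the average number of selected chunks per
 * process is strictly greater than the threshold, multi-chunk I/O otherwise. */
sketch_chunk_io_t sketch_choose_chunk_io(unsigned long total_selected_chunks,
                                         unsigned nprocs,
                                         unsigned num_chunk_per_proc)
{
    assert(nprocs > 0);

    /* avg > threshold  <=>  total > threshold * nprocs (avoids division). */
    unsigned long long lhs = (unsigned long long)total_selected_chunks;
    unsigned long long rhs = (unsigned long long)num_chunk_per_proc * nprocs;

    return (lhs > rhs) ? SKETCH_LINKED_CHUNK_IO : SKETCH_MULTI_CHUNK_IO;
}
```

With 8 processes selecting 40 chunks in total, the average is 5: a threshold of 4 gives linked-chunk I/O, while a threshold of 5 gives multi-chunk I/O because the comparison is strict.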
+ * + * \todo Add missing version information + * + */ H5_DLL herr_t H5Pset_dxpl_mpio_chunk_opt_num(hid_t dxpl_id, unsigned num_chunk_per_proc); + +/** + * \ingroup DXPL + * + * \brief Sets a ratio threshold for collective I/O + * + * \dxpl_id + * \param[in] percent_num_proc_per_chunk + * \returns \herr_t + * + * \details H5Pset_dxpl_mpio_chunk_opt_ratio() sets a threshold for the use of + * collective I/O based on the ratio of processes with collective + * access to a dataset with chunked storage. The decision whether to + * use collective I/O is made on a per-chunk basis. + * + * The library will calculate the percentage of the total number of + * processes, the ratio, that hold selections in each chunk. If that + * percentage is greater than the threshold set in \p + * percent_num_proc_per_chunk, the library will do collective I/O for this + * chunk; otherwise, independent I/O will be done for the chunk. + * + * \todo Add missing version information + * + */ H5_DLL herr_t H5Pset_dxpl_mpio_chunk_opt_ratio(hid_t dxpl_id, unsigned percent_num_proc_per_chunk); #ifdef __cplusplus } diff --git a/src/H5FDmulti.h b/src/H5FDmulti.h index 9e04d8d..62cc9c8 100644 --- a/src/H5FDmulti.h +++ b/src/H5FDmulti.h @@ -25,11 +25,228 @@ #ifdef __cplusplus extern "C" { #endif -H5_DLL hid_t H5FD_multi_init(void); +H5_DLL hid_t H5FD_multi_init(void); + +/** + * \ingroup FAPL + * + * \brief Sets up use of the multi-file driver + * + * \fapl_id + * \param[in] memb_map Maps memory usage types to other memory usage types + * \param[in] memb_fapl Property list for each memory usage type + * \param[in] memb_name Name generator for names of member files + * \param[in] memb_addr The offsets within the virtual address space, from 0 + * (zero) to #HADDR_MAX, at which each type of data storage begins + * \param[in] relax Allows read-only access to incomplete file sets when \c TRUE + * \returns \herr_t + * + * \details H5Pset_fapl_multi() sets the file access property list \p fapl_id to + * use the
multi-file driver. + * + * The multi-file driver enables different types of HDF5 data and + * metadata to be written to separate files. These files are viewed by + * the HDF5 library and the application as a single virtual HDF5 file + * with a single HDF5 file address space. The types of data that can be + * broken out into separate files include raw data, the superblock, + * B-tree data, global heap data, local heap data, and object + * headers. At the programmer's discretion, two or more types of data + * can be written to the same file while other types of data are + * written to separate files. + * + * The array \p memb_map maps memory usage types to other memory usage + * types and is the mechanism that allows the caller to specify how + * many files are created. The array contains #H5FD_MEM_NTYPES entries, + * which are either the value #H5FD_MEM_DEFAULT or a memory usage + * type. The number of unique values determines the number of files + * that are opened. + * + * The array \p memb_fapl contains a property list for each memory + * usage type that will be associated with a file. + * + * The array \p memb_name should be a name generator (a + * \Code{printf}-style format with a \Code{%s} which will be replaced + * with the name passed to H5FDopen(), usually from H5Fcreate() or + * H5Fopen()). + * + * The array \p memb_addr specifies the offsets within the virtual + * address space, from 0 (zero) to #HADDR_MAX, at which each type of + * data storage begins. + * + * If \p relax is set to 1 (TRUE), then opening an existing file for + * read-only access will not fail if some file members are + * missing. This allows a file to be accessed in a limited sense if + * just the meta data is available. + * + * Default values for each of the optional arguments are as follows: + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
    \p memb_map | The default member map contains the value #H5FD_MEM_DEFAULT for each element.
    + * \p memb_fapl + * + * The default value is #H5P_DEFAULT for each element. + *
    + * \p memb_name + * + * The default string is \Code{%s-X.h5} where \c X is one of the following letters: + * - \c s for #H5FD_MEM_SUPER + * - \c b for #H5FD_MEM_BTREE + * - \c r for #H5FD_MEM_DRAW + * - \c g for #H5FD_MEM_GHEAP + * - \c l for #H5FD_MEM_LHEAP + * - \c o for #H5FD_MEM_OHDR + *
    + * \p memb_addr + * + * The default setting is that the address space is equally divided + * among all of the elements: + * - #H5FD_MEM_SUPER \Code{-> 0 * (HADDR_MAX/6)} + * - #H5FD_MEM_BTREE \Code{-> 1 * (HADDR_MAX/6)} + * - #H5FD_MEM_DRAW \Code{-> 2 * (HADDR_MAX/6)} + * - #H5FD_MEM_GHEAP \Code{-> 3 * (HADDR_MAX/6)} + * - #H5FD_MEM_LHEAP \Code{-> 4 * (HADDR_MAX/6)} + * - #H5FD_MEM_OHDR \Code{-> 5 * (HADDR_MAX/6)} + *
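The default address split listed above is plain arithmetic, shown here as a short C sketch. It takes the maximum address as a parameter (`addr_max`) rather than using `HADDR_MAX`, so it compiles without the HDF5 headers. `SKETCH_NMEMB` and `sketch_default_memb_addr()` are hypothetical names, and the index order is assumed to follow the order of the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of member types that get their own region in the default layout. */
#define SKETCH_NMEMB 6

/* Fills offsets[i] with i * (addr_max / 6), mirroring the default layout:
 * index 0 = superblock, 1 = B-tree, 2 = raw data, 3 = global heap,
 * 4 = local heap, 5 = object headers (order assumed from the list above). */
void sketch_default_memb_addr(uint64_t addr_max, uint64_t offsets[SKETCH_NMEMB])
{
    assert(offsets != NULL);

    uint64_t share = addr_max / SKETCH_NMEMB; /* size of each region */
    for (int i = 0; i < SKETCH_NMEMB; i++)
        offsets[i] = (uint64_t)i * share;
}
```

Because the division is integer division, any remainder of `addr_max / 6` stays unused at the top of the address space.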
    + * + * \par Example: + * The following code sample sets up a multi-file access property list that + * partitions data into meta and raw files, each being one-half of the address:\n + * \code + * H5FD_mem_t mt, memb_map[H5FD_MEM_NTYPES]; + * hid_t memb_fapl[H5FD_MEM_NTYPES]; + * const char *memb[H5FD_MEM_NTYPES]; + * haddr_t memb_addr[H5FD_MEM_NTYPES]; + * + * // The mapping... + * for (mt=0; mt typedef enum { H5FD_FILE_IMAGE_OP_NO_OP, H5FD_FILE_IMAGE_OP_PROPERTY_LIST_SET, + /**< Passed to the \p image_malloc and \p image_memcpy callbacks when a + * file image buffer is to be copied while being set in a file access + * property list (FAPL)*/ H5FD_FILE_IMAGE_OP_PROPERTY_LIST_COPY, + /**< Passed to the \p image_malloc and \p image_memcpy callbacks + * when a file image buffer is to be copied when a FAPL is copied*/ H5FD_FILE_IMAGE_OP_PROPERTY_LIST_GET, + /** -/* Define structure to hold file image callbacks */ +/** + * Define structure to hold file image callbacks + */ +//! typedef struct { + /** + * \param[in] size Size in bytes of the file image buffer to allocate + * \param[in] file_image_op A value from H5FD_file_image_op_t indicating + * the operation being performed on the file image + * when this callback is invoked + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! void *(*image_malloc)(size_t size, H5FD_file_image_op_t file_image_op, void *udata); + //! + /** + * \param[in] dest Address of the destination buffer + * \param[in] src Address of the source buffer + * \param[in] file_image_op A value from #H5FD_file_image_op_t indicating + * the operation being performed on the file image + * when this callback is invoked + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! void *(*image_memcpy)(void *dest, const void *src, size_t size, H5FD_file_image_op_t file_image_op, void *udata); + //! 
+ /** + * \param[in] ptr Pointer to the buffer being reallocated + * \param[in] file_image_op A value from #H5FD_file_image_op_t indicating + * the operation being performed on the file image + * when this callback is invoked + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! void *(*image_realloc)(void *ptr, size_t size, H5FD_file_image_op_t file_image_op, void *udata); + //! + /** + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! herr_t (*image_free)(void *ptr, H5FD_file_image_op_t file_image_op, void *udata); + //! + /** + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! void *(*udata_copy)(void *udata); + //! + /** + * \param[in] udata Value passed in in the H5Pset_file_image_callbacks + * parameter \p udata + */ + //! herr_t (*udata_free)(void *udata); + //! + /** + * \brief The final field in the #H5FD_file_image_callbacks_t struct, + * provides a pointer to user-defined data. This pointer will be + * passed to the image_malloc, image_memcpy, image_realloc, and + * image_free callbacks. Define udata as NULL if no user-defined + * data is provided. + */ void *udata; } H5FD_file_image_callbacks_t; +//! 
#ifdef __cplusplus extern "C" { diff --git a/src/H5FDros3.h b/src/H5FDros3.h index 3ef6b8a..8e42ca2 100644 --- a/src/H5FDros3.h +++ b/src/H5FDros3.h @@ -89,8 +89,20 @@ typedef struct H5FD_ros3_fapl_t { extern "C" { #endif -H5_DLL hid_t H5FD_ros3_init(void); +H5_DLL hid_t H5FD_ros3_init(void); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pget_fapl_ros3(hid_t fapl_id, H5FD_ros3_fapl_t *fa_out); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pset_fapl_ros3(hid_t fapl_id, H5FD_ros3_fapl_t *fa); #ifdef __cplusplus diff --git a/src/H5FDsplitter.h b/src/H5FDsplitter.h index a201116..ee6e7c5 100644 --- a/src/H5FDsplitter.h +++ b/src/H5FDsplitter.h @@ -87,8 +87,20 @@ typedef struct H5FD_splitter_vfd_config_t { #ifdef __cplusplus extern "C" { #endif -H5_DLL hid_t H5FD_splitter_init(void); +H5_DLL hid_t H5FD_splitter_init(void); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pset_fapl_splitter(hid_t fapl_id, H5FD_splitter_vfd_config_t *config_ptr); + +/** + * \ingroup FAPL + * + * \todo Add missing documentation + */ H5_DLL herr_t H5Pget_fapl_splitter(hid_t fapl_id, H5FD_splitter_vfd_config_t *config_ptr); #ifdef __cplusplus diff --git a/src/H5FDstdio.h b/src/H5FDstdio.h index b3e06bb..9db92ed 100644 --- a/src/H5FDstdio.h +++ b/src/H5FDstdio.h @@ -28,7 +28,21 @@ extern "C" { #endif -H5_DLL hid_t H5FD_stdio_init(void); +H5_DLL hid_t H5FD_stdio_init(void); +/** + * \ingroup FAPL + * + * \brief Sets the standard I/O driver + * + * \fapl_id + * \returns \herr_t + * + * \details H5Pset_fapl_stdio() modifies the file access property list to use + * the standard I/O driver, H5FDstdio(). 
+ * + * \since 1.4.0 + * + */ H5_DLL herr_t H5Pset_fapl_stdio(hid_t fapl_id); #ifdef __cplusplus diff --git a/src/H5FDwindows.h b/src/H5FDwindows.h index c1c4654..79e73b6 100644 --- a/src/H5FDwindows.h +++ b/src/H5FDwindows.h @@ -27,6 +27,36 @@ extern "C" { #endif /* __cplusplus */ +/** + * \ingroup FAPL + * + * \brief Sets the Windows I/O driver + * + * \fapl_id + * \returns \herr_t + * + * \details H5Pset_fapl_windows() sets the default HDF5 Windows I/O driver on + * Windows systems. + * + * Since the HDF5 library uses this driver, #H5FD_WINDOWS, by default + * on Windows systems, it is not normally necessary for a user + * application to call H5Pset_fapl_windows(). While it is not + * recommended, there may be times when a user chooses to set a + * different HDF5 driver, such as the standard I/O driver (#H5FD_STDIO) + * or the sec2 driver (#H5FD_SEC2), in a Windows + * application. H5Pset_fapl_windows() is provided so that the + * application can return to the Windows I/O driver when the time + * comes. + * + * Only the Windows driver is tested on Windows systems; other drivers + * are used at the application’s and the user’s risk. + * + * Furthermore, the Windows driver is tested and available only on + * Windows systems; it is not available on non-Windows systems. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pset_fapl_windows(hid_t fapl_id); #ifdef __cplusplus diff --git a/src/H5Fmodule.h b/src/H5Fmodule.h index 33c302a..7f0299a 100644 --- a/src/H5Fmodule.h +++ b/src/H5Fmodule.h @@ -31,8 +31,34 @@ /** * \defgroup H5F H5F - * \brief File Interface - * \todo Describe concisely what the functions in this module are about. + * + * Use the functions in this module to manage HDF5 files. + * + * In the code snippets below, we show the skeletal life cycle of an HDF5 file, + * when creating a new file (left) or when opening an existing file (right). 
+ * File creation is essentially controlled through \ref FCPL, and file access to + * new and existing files is controlled through \ref FAPL. The file \c name and + * creation or access \c mode control the interaction with the underlying + * storage such as file systems. + * + * \Emph{Proper error handling is part of the life cycle.} + * + * + * + * + * + * + *
    Create | Open
    + * \snippet H5F_examples.c life_cycle + * + * \snippet H5F_examples.c life_cycle_w_open + *
    + * + * In addition to general file management functions, there are three categories + * of functions that deal with advanced file management tasks and use cases: + * 1. The control of the HDF5 \ref MDC + * 2. The use of (MPI-) \ref PH5F HDF5 + * 3. The \ref SWMR pattern * * \defgroup MDC Metadata Cache * \ingroup H5F diff --git a/src/H5Fpublic.h b/src/H5Fpublic.h index 24fe350..164b412 100644 --- a/src/H5Fpublic.h +++ b/src/H5Fpublic.h @@ -47,27 +47,23 @@ * We're assuming that these constants are used rather early in the hdf5 * session. */ -#define H5F_ACC_RDONLY (H5CHECK H5OPEN 0x0000u) /**< absence of rdwr => rd-only */ -#define H5F_ACC_RDWR (H5CHECK H5OPEN 0x0001u) /**< open for read and write */ -#define H5F_ACC_TRUNC (H5CHECK H5OPEN 0x0002u) /**< overwrite existing files */ -#define H5F_ACC_EXCL (H5CHECK H5OPEN 0x0004u) /**< fail if file already exists*/ +#define H5F_ACC_RDONLY (H5CHECK H5OPEN 0x0000u) /**< Absence of RDWR: read-only */ +#define H5F_ACC_RDWR (H5CHECK H5OPEN 0x0001u) /**< Open for read and write */ +#define H5F_ACC_TRUNC (H5CHECK H5OPEN 0x0002u) /**< Overwrite existing files */ +#define H5F_ACC_EXCL (H5CHECK H5OPEN 0x0004u) /**< Fail if file already exists*/ /* NOTE: 0x0008u was H5F_ACC_DEBUG, now deprecated */ -#define H5F_ACC_CREAT (H5CHECK H5OPEN 0x0010u) /**< create non-existing files */ +#define H5F_ACC_CREAT (H5CHECK H5OPEN 0x0010u) /**< Create non-existing files */ #define H5F_ACC_SWMR_WRITE \ - (H5CHECK 0x0020u) /**< indicate that this file is open for writing in a \ - single-writer/multi-reader (SWMR) scenario. \ - Note that the process(es) opening the file for reading must \ - open the file with RDONLY access, and use the special "SWMR_READ" \ - access flag. */ + (H5CHECK 0x0020u) /**< Indicate that this file is open for writing in a \ + * single-writer/multi-reader (SWMR) scenario. 
\ + * Note that the process(es) opening the file for reading \ + * must open the file with #H5F_ACC_RDONLY and use the \ + * #H5F_ACC_SWMR_READ access flag. */ #define H5F_ACC_SWMR_READ \ - (H5CHECK 0x0040u) /**< indicate that this file is \ - * open for reading in a \ - * single-writer/multi-reader (SWMR) \ - * scenario. Note that the \ - * process(es) opening the file \ - * for SWMR reading must also \ - * open the file with the RDONLY \ - * flag. */ + (H5CHECK 0x0040u) /**< Indicate that this file is open for reading in a \ + * single-writer/multi-reader (SWMR) scenario. Note that \ + * the process(es) opening the file for SWMR reading must \ + * also open the file with the #H5F_ACC_RDONLY flag. */ /** * Default property list identifier @@ -91,7 +87,7 @@ #define H5F_FAMILY_DEFAULT (hsize_t)0 #ifdef H5_HAVE_PARALLEL -/* +/** * Use this constant string as the MPI_Info key to set H5Fmpio debug flags. * To turn on H5Fmpio debug flags, set the MPI_Info value with this key to * have the value of a string consisting of the characters that turn on the @@ -101,11 +97,12 @@ #endif /* H5_HAVE_PARALLEL */ /** - * The difference between a single file and a set of mounted files + * The scope of an operation such as H5Fflush(), e.g., + * a single file vs. a set of mounted files */ typedef enum H5F_scope_t { - H5F_SCOPE_LOCAL = 0, /**< specified file handle only */ - H5F_SCOPE_GLOBAL = 1 /**< entire virtual file */ + H5F_SCOPE_LOCAL = 0, /**< The specified file handle only */ + H5F_SCOPE_GLOBAL = 1 /**< The entire virtual file */ } H5F_scope_t; /** @@ -117,16 +114,16 @@ typedef enum H5F_scope_t { * How does file close behave? 
*/ typedef enum H5F_close_degree_t { - H5F_CLOSE_DEFAULT = 0, /**< Use the degree pre-defined by underlining VFL */ + H5F_CLOSE_DEFAULT = 0, /**< Use the degree pre-defined by underlying VFD */ H5F_CLOSE_WEAK = 1, /**< File closes only after all opened objects are closed */ - H5F_CLOSE_SEMI = 2, /**< If no opened objects, file is close; otherwise, file close fails */ + H5F_CLOSE_SEMI = 2, /**< If no opened objects, file is closed; otherwise, file close fails */ H5F_CLOSE_STRONG = 3 /**< If there are opened objects, close them first, then close file */ } H5F_close_degree_t; /** * Current "global" information about file */ -//! [H5F_info2_t_snip] +//! typedef struct H5F_info2_t { struct { unsigned version; /**< Superblock version # */ @@ -144,7 +141,7 @@ typedef struct H5F_info2_t { H5_ih_info_t msgs_info; /**< Shared object header message index & heap size */ } sohm; } H5F_info2_t; -//! [H5F_info2_t_snip] +//! /** * Types of allocation requests. The values larger than #H5FD_MEM_DEFAULT @@ -176,12 +173,12 @@ typedef enum H5F_mem_t { /** * Free space section information */ -//! [H5F_sect_info_t_snip] +//! typedef struct H5F_sect_info_t { haddr_t addr; /**< Address of free space section */ hsize_t size; /**< Size of free space section */ } H5F_sect_info_t; -//! [H5F_sect_info_t_snip] +//! /** * Library's format versions @@ -193,7 +190,7 @@ typedef enum H5F_libver_t { H5F_LIBVER_V110 = 2, /**< Use the latest v110 format for storing objects */ H5F_LIBVER_V112 = 3, /**< Use the latest v112 format for storing objects */ H5F_LIBVER_V114 = 4, /**< Use the latest v114 format for storing objects */ - H5F_LIBVER_NBOUNDS + H5F_LIBVER_NBOUNDS /**< Sentinel */ } H5F_libver_t; #define H5F_LIBVER_LATEST H5F_LIBVER_V114 @@ -201,7 +198,7 @@ typedef enum H5F_libver_t { /** * File space handling strategy */ -//! [H5F_fspace_strategy_t_snip] +//! 
typedef enum H5F_fspace_strategy_t { H5F_FSPACE_STRATEGY_FSM_AGGR = 0, /**< Mechanisms: free-space managers, aggregators, and virtual file drivers This is the library default when not set */ @@ -211,7 +208,7 @@ typedef enum H5F_fspace_strategy_t { H5F_FSPACE_STRATEGY_NONE = 3, /**< Mechanisms: virtual file drivers */ H5F_FSPACE_STRATEGY_NTYPES /**< Sentinel */ } H5F_fspace_strategy_t; -//! [H5F_fspace_strategy_t_snip] +//! /** * File space handling strategy for release 1.10.0 @@ -228,7 +225,7 @@ typedef enum H5F_file_space_type_t { H5F_FILE_SPACE_NTYPES /**< Sentinel */ } H5F_file_space_type_t; -//! [H5F_retry_info_t_snip] +//! #define H5F_NUM_METADATA_READ_RETRY_TYPES 21 /** @@ -239,7 +236,7 @@ typedef struct H5F_retry_info_t { unsigned nbins; uint32_t *retries[H5F_NUM_METADATA_READ_RETRY_TYPES]; } H5F_retry_info_t; -//! [H5F_retry_info_t_snip] +//! /** * Callback for H5Pset_object_flush_cb() in a file access property list @@ -277,11 +274,6 @@ extern "C" { */ H5_DLL htri_t H5Fis_accessible(const char *container_name, hid_t fapl_id); /** - * \example H5Fcreate.c - * After creating an HDF5 file with H5Fcreate(), we close it with - * H5Fclose(). - */ -/** * \ingroup H5F * * \brief Creates an HDF5 file @@ -321,7 +313,8 @@ H5_DLL htri_t H5Fis_accessible(const char *container_name, hid_t fapl_id); * this file identifier should be closed by calling H5Fclose() when * it is no longer needed. * - * \include H5Fcreate.c + * \par Example + * \snippet H5F_examples.c minimal * * \note #H5F_ACC_TRUNC and #H5F_ACC_EXCL are mutually exclusive; use * exactly one. 
@@ -359,6 +352,11 @@ H5_DLL htri_t H5Fis_accessible(const char *container_name, hid_t fapl_id); * */ H5_DLL hid_t H5Fcreate(const char *filename, unsigned flags, hid_t fcpl_id, hid_t fapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Fcreate} + */ H5_DLL hid_t H5Fcreate_async(const char *app_file, const char *app_func, unsigned app_line, const char *filename, unsigned flags, hid_t fcpl_id, hid_t fapl_id, hid_t es_id); /** @@ -408,6 +406,9 @@ H5_DLL hid_t H5Fcreate_async(const char *app_file, const char *app_func, unsigne * identifier should be closed by calling H5Fclose() when it is no * longer needed. * + * \par Example + * \snippet H5F_examples.c open + * * \note #H5F_ACC_RDWR and #H5F_ACC_RDONLY are mutually exclusive; use * exactly one. * @@ -451,6 +452,11 @@ H5_DLL hid_t H5Fcreate_async(const char *app_file, const char *app_func, unsigne * */ H5_DLL hid_t H5Fopen(const char *filename, unsigned flags, hid_t fapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Fopen} + */ H5_DLL hid_t H5Fopen_async(const char *app_file, const char *app_func, unsigned app_line, const char *filename, unsigned flags, hid_t access_plist, hid_t es_id); /** @@ -479,6 +485,11 @@ H5_DLL hid_t H5Fopen_async(const char *app_file, const char *app_func, unsigned * */ H5_DLL hid_t H5Freopen(hid_t file_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Freopen} + */ H5_DLL hid_t H5Freopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t file_id, hid_t es_id); /** @@ -503,6 +514,9 @@ H5_DLL hid_t H5Freopen_async(const char *app_file, const char *app_func, unsigne * global or local. 
Valid values are as follows: * \scopes * + * \par Example + * \snippet H5F_examples.c flush + * * \attention HDF5 does not possess full control over buffering. H5Fflush() * flushes the internal HDF5 buffers then asks the operating system * (the OS) to flush the system buffers for the open files. After @@ -511,13 +525,13 @@ H5_DLL hid_t H5Freopen_async(const char *app_file, const char *app_func, unsigne * */ H5_DLL herr_t H5Fflush(hid_t object_id, H5F_scope_t scope); -H5_DLL herr_t H5Fflush_async(const char *app_file, const char *app_func, unsigned app_line, hid_t object_id, - H5F_scope_t scope, hid_t es_id); /** - * \example H5Fclose.c - * After creating an HDF5 file with H5Fcreate(), we close it with - * H5Fclose(). + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Fflush} */ +H5_DLL herr_t H5Fflush_async(const char *app_file, const char *app_func, unsigned app_line, hid_t object_id, + H5F_scope_t scope, hid_t es_id); /** * \ingroup H5F * @@ -534,8 +548,8 @@ H5_DLL herr_t H5Fflush_async(const char *app_file, const char *app_func, unsigne * identifier, or shared datatype identifier), the file will be fully * closed and access will end. * - * Use H5Fclose() as shown in the following example: - * \include H5Fclose.c + * \par Example + * \snippet H5F_examples.c minimal * * \note \Bold{Delayed close:} Note the following deviation from the * above-described behavior. 
If H5Fclose() is called for a file but one @@ -562,6 +576,11 @@ H5_DLL herr_t H5Fflush_async(const char *app_file, const char *app_func, unsigne * */ H5_DLL herr_t H5Fclose(hid_t file_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Fclose} + */ H5_DLL herr_t H5Fclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t file_id, hid_t es_id); /** @@ -655,7 +674,7 @@ H5_DLL hid_t H5Fget_access_plist(hid_t file_id); * \note The function will not return an error if intent is NULL; it will * simply do nothing. * - * \version 1.10.0 C function enhanced to work with SWMR functionality. + * \version 1.10.0 Function enhanced to work with SWMR functionality. * * \since 1.8.0 * @@ -707,7 +726,7 @@ H5_DLL herr_t H5Fget_fileno(hid_t file_id, unsigned long *fileno); * \c (#H5F_OBJ_DATASET|#H5F_OBJ_GROUP) would call for datasets and * groups. * - * \version 1.6.8, 1.8.2 C function return type changed to \c ssize_t. + * \version 1.6.8, 1.8.2 Function return type changed to \c ssize_t. * \version 1.6.5 #H5F_OBJ_LOCAL has been added as a qualifier on the types * of objects to be counted. #H5F_OBJ_LOCAL restricts the * search to objects opened through current file identifier. @@ -745,9 +764,9 @@ H5_DLL ssize_t H5Fget_obj_count(hid_t file_id, unsigned types); * To retrieve a count of open objects, use the H5Fget_obj_count() * function. This count can be used to set the \p max_objs parameter. * - * \version 1.8.2 C function return type changed to \c ssize_t and \p + * \version 1.8.2 Function return type changed to \c ssize_t and \p * max_objs parameter datatype changed to \c size_t. - * \version 1.6.8 C function return type changed to \c ssize_t and \p + * \version 1.6.8 Function return type changed to \c ssize_t and \p * max_objs parameter datatype changed to \c size_t. 
* \since 1.6.0 * @@ -798,6 +817,9 @@ H5_DLL herr_t H5Fget_vfd_handle(hid_t file_id, hid_t fapl, void **file_handle); * attribute, then the file will be mounted at the location where the * attribute, dataset, or named datatype is attached. * + * \par Example + * \snippet H5F_examples.c mount + * * \note To date, no file mount properties have been defined in HDF5. The * proper value to pass for \p plist is #H5P_DEFAULT, indicating the * default file mount property list. @@ -868,8 +890,6 @@ H5_DLL hssize_t H5Fget_freespace(hid_t file_id); * if any, the HDF5 portion of the file, and any data that may have * been appended beyond the data written through the HDF5 library. * - * \version 1.6.3 Fortran subroutine introduced in this release. - * * \since 1.6.3 * */ @@ -948,9 +968,7 @@ H5_DLL herr_t H5Fincrement_filesize(hid_t file_id, hsize_t increment); * * \note \Bold{Recommended Reading:} This function is part of the file image * operations feature set. It is highly recommended to study the guide - * "HDF5 File Image Operations" before using this feature set.\n See the - * "See Also" section below for links to other elements of HDF5 file - * image operations. \todo Fix the references. + * \ref_file_image_ops before using this feature set. * * \attention H5Pget_file_image() will fail, returning a negative value, if the * file is too large for the supplied buffer. @@ -958,8 +976,6 @@ H5_DLL herr_t H5Fincrement_filesize(hid_t file_id, hsize_t increment); * \see H5LTopen_file_image(), H5Pset_file_image(), H5Pget_file_image(), * H5Pset_file_image_callbacks(), H5Pget_file_image_callbacks() * - * \version 1.8.13 Fortran subroutine added in this release. - * * \since 1.8.0 * */ @@ -976,197 +992,18 @@ H5_DLL ssize_t H5Fget_file_image(hid_t file_id, void *buf_ptr, size_t buf_len); * \ref H5AC-cache-config-t "here". * \return \herr_t * + * \note The \c in direction applies only to the H5AC_cache_config_t::version + * field. All other fields are out parameters. 
+ * * \details H5Fget_mdc_config() loads the current metadata cache configuration * into the instance of H5AC_cache_config_t pointed to by the \p config_ptr - * parameter. - * - * Note that the \c version field of \p config_ptr must be initialized - * --this allows the library to support old versions of the H5AC_cache_config_t - * structure. - * - * \par General configuration section - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    int version IN: Integer field indicating the version of the H5AC_cache_config_t in use. This field should - * be set to #H5AC__CURR_CACHE_CONFIG_VERSION (defined in H5ACpublic.h).
    hbool_t rpt_fcn_enabled

    OUT: Boolean flag indicating whether the adaptive cache resize report function is enabled. This - * field should almost always be set to disabled (0). Since resize algorithm activity is - * reported via stdout, it MUST be set to disabled (0) on Windows machines.

    The - * report function is not supported code, and can be expected to change between versions of the - * library. Use it at your own risk.

    hbool_t open_trace_file OUT: Boolean field indicating whether the trace_file_name field should be used to - * open a trace file for the cache. This field will always be set to 0 in this - * context.
    hbool_t close_trace_file OUT: Boolean field indicating whether the current trace file (if any) should be closed. This field - * will always be set to 0 in this context.
    char*trace_file_name OUT: Full path name of the trace file to be opened if the open_trace_file field is - * set to 1. This field will always be set to the empty string in this context.
    hbool_t evictions_enabled OUT: Boolean flag indicating whether metadata cache entry evictions are - * enabled.
    hbool_t set_initial_size OUT: Boolean flag indicating whether the cache should be created with a user specified initial - * maximum size.

    If the configuration is loaded from the cache, this flag will always be set - * to 0.

    size_t initial_size OUT: Initial maximum size of the cache in bytes, if applicable.

    If the configuration is loaded - * from the cache, this field will contain the cache maximum size as of the time of the - * call.

    double min_clean_fraction OUT: Float value specifying the minimum fraction of the cache that must be kept either clean or - * empty when possible.
    size_t max_size OUT: Upper bound (in bytes) on the range of values that the adaptive cache resize code can select - * as the maximum cache size.
    size_t min_size OUT: Lower bound (in bytes) on the range of values that the adaptive cache resize code can select - * as the maximum cache size.
    long int epoch_length OUT: Number of cache accesses between runs of the adaptive cache resize - * code.
    - * - * \par Increment configuration section - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    enum H5C_cache_incr_mode incr_mode OUT: Enumerated value indicating the operational mode of the automatic cache size increase code. - * At present, only the following values are legal:

    \c H5C_incr__off: Automatic cache size increase - * is disabled.

    \c H5C_incr__threshold: Automatic cache size increase is enabled using the hit - * rate threshold algorithm.

    double lower_hr_threshold OUT: Hit rate threshold used in the hit rate threshold cache size increase algorithm.
    double increment OUT: The factor by which the current maximum cache size is multiplied to obtain an initial new - * maximum cache size if a size increase is triggered in the hit rate threshold cache size increase - * algorithm.
    hbool_t apply_max_increment OUT: Boolean flag indicating whether an upper limit will be applied to the size of cache size - * increases.
    size_t max_increment OUT: The maximum number of bytes by which the maximum cache size can be increased in a single step - * -- if applicable.
    enum H5C_cache_flash_incr_mode flash_incr_mode OUT: Enumerated value indicating the operational mode of the flash cache size increase code. At - * present, only the following values are legal:

    \c H5C_flash_incr__off: Flash cache size increase is - * disabled.

    \c H5C_flash_incr__add_space: Flash cache size increase is enabled using the add - * space algorithm.

    double flash_threshold OUT: The factor by which the current maximum cache size is multiplied to obtain the minimum size - * entry / entry size increase which may trigger a flash cache size - * increase.
    double flash_multiple OUT: The factor by which the size of the triggering entry / entry size increase is multiplied to - * obtain the initial cache size increment. This increment may be reduced to reflect existing free - * space in the cache and the max_size field above.
    - * - * \par Decrement configuration section - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    Decrement configuration - * section:
    enum H5C_cache_decr_mode decr_mode OUT: Enumerated value indicating the operational mode of the automatic cache size decrease code. - * At present, the following values are legal:

    H5C_decr__off: Automatic cache size decrease is - * disabled, and the remaining decrement fields are ignored.

    H5C_decr__threshold: Automatic - * cache size decrease is enabled using the hit rate threshold algorithm.

    H5C_decr__age_out: - * Automatic cache size decrease is enabled using the ageout algorithm.

    - *

    H5C_decr__age_out_with_threshold: Automatic cache size decrease is enabled using the ageout - * with hit rate threshold algorithm

    double upper_hr_threshold OUT: Upper hit rate threshold. This value is only used if the decr_mode is either - * H5C_decr__threshold or H5C_decr__age_out_with_threshold.
    double decrement OUT: Factor by which the current max cache size is multiplied to obtain an initial value for the - * new cache size when cache size reduction is triggered in the hit rate threshold cache size reduction - * algorithm.
    hbool_t apply_max_decrement OUT: Boolean flag indicating whether an upper limit should be applied to the size of cache size - * decreases.
    size_t max_decrement OUT: The maximum number of bytes by which cache size can be decreased in any single step, if - * applicable.
    int epochs_before_eviction OUT: The minimum number of epochs that an entry must reside unaccessed in cache before being - * evicted under either of the ageout cache size reduction algorithms.
    hbool_t apply_empty_reserve OUT: Boolean flag indicating whether an empty reserve should be maintained under either of the - * ageout cache size reduction algorithms.
    double empty_reserve OUT: Empty reserve for use with the ageout cache size reduction algorithms, if applicable.
    - * - * \par Parallel configuration section - * - * - * - * - *
    int dirty_bytes_threshold OUT: Threshold number of bytes of dirty metadata generation for triggering synchronizations of the - * metadata caches serving the target file in the parallel case.

    Synchronization occurs whenever the - * number of bytes of dirty metadata created since the last synchronization exceeds this - * limit.

    + * parameter.\n + * The fields of the H5AC_cache_config_t structure are shown below: + * \snippet H5ACpublic.h H5AC_cache_config_t_snip + * \click4more * * \since 1.8.0 * - * \todo Fix the reference! - * */ H5_DLL herr_t H5Fget_mdc_config(hid_t file_id, H5AC_cache_config_t *config_ptr); /** @@ -1183,240 +1020,11 @@ H5_DLL herr_t H5Fget_mdc_config(hid_t file_id, H5AC_cache_config_t *config_ptr); * * \details H5Fset_mdc_config() attempts to configure the file's metadata cache * according configuration supplied in \p config_ptr. - * - * \par General configuration fields - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    int version IN: Integer field indicating the version of the H5AC_cache_config_t in use. This - * field should be set to #H5AC__CURR_CACHE_CONFIG_VERSION (defined - * in H5ACpublic.h).
    hbool_t rpt_fcn_enabled IN: Boolean flag indicating whether the adaptive cache resize report function is enabled. This - * field should almost always be set to disabled (0). Since resize algorithm activity is - * reported via stdout, it MUST be set to disabled (0) on Windows machines.

    The report - * function is not supported code, and can be expected to change between versions of the library. Use - * it at your own risk.

    hbool_t open_trace_file IN: Boolean field indicating whether the trace_file_name field should be used to open - * a trace file for the cache.

    The trace file is a debugging feature that allows the capture of top - * level metadata cache requests for purposes of debugging and/or optimization.

    This field should normally be set to 0, as trace file collection imposes - * considerable overhead.

    This field should only be set to 1 when the trace_file_name contains - * the full path of the desired trace file, and either there is no open trace file on the cache, or the - * close_trace_file field is also 1.

    The trace file feature is - * unsupported unless used at the direction of The HDF Group. It is intended to allow The HDF Group to - * collect a trace of cache activity in cases of occult failures and/or poor performance seen in the - * field, so as to aid in reproduction in the lab. If you use it absent the direction of The HDF Group, - * you are on your own.

    hbool_t close_trace_file IN: Boolean field indicating whether the current trace file (if any) should be closed.

    See the - * above comments on the open_trace_file field. This field should be set to - * 0 unless there is an open trace file on the cache that you wish to close.

    The - * trace file feature is unsupported unless used at the direction of The HDF Group. It is intended to - * allow The HDF Group to collect a trace of cache activity in cases of occult failures and/or poor - * performance seen in the field, so as to aid in reproduction in the lab. If you use it absent the - * direction of The HDF Group, you are on your own.

    char trace_file_name[] IN: Full path of the trace file to be opened if the open_trace_file field is set - * to 1.

    In the parallel case, an ASCII representation of the MPI rank of the process - * will be appended to the file name to yield a unique trace file name for each process.

    The - * length of the path must not exceed #H5AC__MAX_TRACE_FILE_NAME_LEN characters.

    The trace file - * feature is unsupported unless used at the direction of The HDF Group. It is intended to allow The - * HDF Group to collect a trace of cache activity in cases of occult failures and/or poor performance - * seen in the field, so as to aid in reproduction in the lab. If you use it absent the direction of - * The HDF Group, you are on your own.

    hbool_t evictions_enabled IN: A boolean flag indicating whether evictions from the metadata cache are enabled. This flag is - * initially set to enabled (1).

    In rare circumstances, the raw data throughput - * requirements may be so high that the user wishes to postpone metadata writes so as to reserve I/O - * throughput for raw data. The evictions_enabled field exists to allow this. However, - * this is an extreme step, and you have no business doing it unless you have read the User Guide - * section on metadata caching, and have considered all other options carefully.

    The - * evictions_enabled field may not be set to disabled (0) unless all adaptive - * cache resizing code is disabled via the incr_mode, flash_incr_mode, and - * decr_mode fields.

    When this flag is set to disabled (0), the - * metadata cache will not attempt to evict entries to make space for new entries, and thus will grow - * without bound.

    Evictions will be re-enabled when this field is set back to 1. - * This should be done as soon as possible.

    hbool_t set_initial_size IN: Boolean flag indicating whether the cache should be forced to the user specified initial - * size.
    size_t initial_size IN: If set_initial_size is set to 1, then initial_size must - * contain the desired initial size in bytes. This value must lie in the closed interval - * [min_size, max_size] (see below).
    double min_clean_fraction IN: This field specifies the minimum fraction of the cache that must be kept either clean or - * empty.

    The value must lie in the interval [0.0, 1.0]. 0.01 is a good place to start in the serial - * case. In the parallel case, a larger value is needed -- see Metadata Caching in HDF5 in the collection - * "Advanced Topics in HDF5."

    size_t max_size IN: Upper bound (in bytes) on the range of values that the adaptive cache resize code can select - * as the maximum cache size.
    size_t min_size IN: Lower bound (in bytes) on the range of values that the adaptive cache resize code can select - * as the maximum cache size.
    long int epoch_length IN: Number of cache accesses between runs of the adaptive cache resize code. 50,000 is a good - * starting number.
    - * - * \par Increment configuration fields - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - * - *
    enum H5C_cache_incr_mode incr_mode IN: Enumerated value indicating the operational mode of the automatic cache size increase code. At - * present, only two values are legal:

    \c H5C_incr__off: Automatic cache size increase is disabled, - * and the remaining increment fields are ignored.

    \c H5C_incr__threshold: Automatic cache size - * increase is enabled using the hit rate threshold algorithm.

    double lower_hr_thresholdIN: Hit rate threshold used by the hit rate threshold cache size increment algorithm.

    When the - * hit rate over an epoch is below this threshold and the cache is full, the maximum size of the - * cache is multiplied by increment (below), and then clipped as necessary to stay within max_size, and - * possibly max_increment.

    This field must lie in the interval [0.0, 1.0]. 0.8 or 0.9 is a good - * starting point.

    double incrementIN: Factor by which the hit rate threshold cache size increment algorithm multiplies the current - * maximum cache size to obtain a tentative new cache size.

    The actual cache size increase will be - * clipped to satisfy the max_size specified in the general configuration, and possibly max_increment - * below.

    The parameter must be greater than or equal to 1.0 -- 2.0 is a reasonable - * value.

    If you set it to 1.0, you will effectively disable cache size increases.

    hbool_t apply_max_incrementIN: Boolean flag indicating whether an upper limit should be applied to the size of cache size - * increases.
    size_t max_incrementIN: Maximum number of bytes by which cache size can be increased in a single step -- if - * applicable.
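The interaction of lower_hr_threshold, increment, max_size, apply_max_increment, and max_increment described above can be sketched as a small pure function. This is only an illustration of the rules as stated in the field descriptions, not the library's implementation; the function name and its parameters hit_rate and cache_is_full are hypothetical and do not exist in the HDF5 API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT part of the HDF5 API): returns the maximum cache
 * size that the hit rate threshold increment rules, as described in the field
 * documentation, would select at the end of one epoch.
 */
size_t
sketch_incr_max_cache_size(size_t cur_max_size, double hit_rate, int cache_is_full,
                           double lower_hr_threshold, double increment, size_t max_size,
                           int apply_max_increment, size_t max_increment)
{
    double tentative;
    size_t new_max_size;

    /* Range checks taken from the field descriptions */
    assert(lower_hr_threshold >= 0.0 && lower_hr_threshold <= 1.0);
    assert(increment >= 1.0);
    assert(cur_max_size <= max_size);

    /* Grow only when the hit rate is below the threshold and the cache is full */
    if (hit_rate >= lower_hr_threshold || !cache_is_full)
        return cur_max_size;

    /* Tentative new size, clipped to max_size */
    tentative    = (double)cur_max_size * increment;
    new_max_size = (tentative >= (double)max_size) ? max_size : (size_t)tentative;

    /* Optionally limit the size of a single step to max_increment bytes */
    if (apply_max_increment && (new_max_size - cur_max_size) > max_increment)
        new_max_size = cur_max_size + max_increment;

    return new_max_size;
}
```

With increment set to 1.0 the tentative size equals the current size, which is why that setting effectively disables cache size increases.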
    enum H5C_cache_flash_incr_mode flash_incr_modeIN: Enumerated value indicating the operational mode of the flash cache size increase code. At - * present, only the following values are legal:

    \c H5C_flash_incr__off: Flash cache size increase is - * disabled.

    \c H5C_flash_incr__add_space: Flash cache size increase is enabled using the add - * space algorithm.

    double flash_thresholdIN: The factor by which the current maximum cache size is multiplied to obtain the minimum size - * entry / entry size increase which may trigger a flash cache size increase.

    At present, this value - * must lie in the range [0.1, 1.0].

    double flash_multipleIN: The factor by which the size of the triggering entry / entry size increase is multiplied to - * obtain the initial cache size increment. This increment may be reduced to reflect existing free - * space in the cache and the max_size field above.

    At present, this field must lie in - * the range [0.1, 10.0].
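As a worked illustration of flash_threshold and flash_multiple, the two products named above can be computed directly. The helper names are hypothetical (they are not HDF5 functions), and the sketch deliberately stops at the initial increment: the later reduction for free space and max_size is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT part of the HDF5 API): smallest entry size, or
 * entry size increase, that may trigger a flash cache size increase.
 */
size_t
sketch_flash_trigger_size(size_t cur_max_size, double flash_threshold)
{
    /* Legal range taken from the field description */
    assert(flash_threshold >= 0.1 && flash_threshold <= 1.0);
    return (size_t)((double)cur_max_size * flash_threshold);
}

/*
 * Hypothetical helper (NOT part of the HDF5 API): initial cache size
 * increment for a triggering entry, before any reduction is applied.
 */
size_t
sketch_flash_initial_increment(size_t entry_size, double flash_multiple)
{
    /* Legal range taken from the field description */
    assert(flash_multiple >= 0.1 && flash_multiple <= 10.0);
    return (size_t)((double)entry_size * flash_multiple);
}
```

For example, with a 4 MiB maximum cache size and a flash_threshold of 0.25, only entries of at least 1 MiB can trigger a flash increase; a 2 MiB entry with a flash_multiple of 1.5 yields an initial increment of 3 MiB.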

    - * \par Decrement configuration fields
    enum H5C_cache_decr_mode decr_modeIN: Enumerated value indicating the operational mode of the automatic cache size decrease code. At - * present, the following values are legal:

    \c H5C_decr__off: Automatic cache size decrease is - * disabled.

    \c H5C_decr__threshold: Automatic cache size decrease is enabled using the hit - * rate threshold algorithm.

    \c H5C_decr__age_out: Automatic cache size decrease is enabled using - * the ageout algorithm.

    \c H5C_decr__age_out_with_threshold: Automatic cache size decrease is - * enabled using the ageout with hit rate threshold algorithm.

    double upper_hr_thresholdIN: Hit rate threshold for the hit rate threshold and ageout with hit rate threshold cache size - * decrement algorithms.

    When \c decr_mode is \c H5C_decr__threshold, and the hit rate over a given - * epoch exceeds the supplied threshold, the current maximum cache size is multiplied by decrement to - * obtain a tentative new (and smaller) maximum cache size.

    When \c decr_mode is \c - * H5C_decr__age_out_with_threshold, there is no attempt to find and evict aged out entries unless the - * hit rate in the previous epoch exceeded the supplied threshold.

    This field must lie in the - * interval [0.0, 1.0].

    For \c H5C_decr__threshold, .9995 or .99995 is a good place to - * start.

    For \c H5C_decr__age_out_with_threshold, .999 might be more useful.

    double decrementIN: In the hit rate threshold cache size decrease algorithm, this parameter contains the factor by - * which the current max cache size is multiplied to produce a tentative new cache size.

    The actual - * cache size decrease will be clipped to satisfy the min_size specified in the general configuration, - * and possibly max_decrement below.

    The parameter must be in the interval - * [0.0, 1.0].

    If you set it to 1.0, you will effectively disable cache size decreases. 0.9 is a - * reasonable starting point.

    hbool_t apply_max_decrementIN: Boolean flag indicating whether an upper limit should be applied to the size of cache size - * decreases.
    size_t max_decrementIN: Maximum number of bytes by which the maximum cache size can be decreased in any single step -- - * if applicable.
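The hit rate threshold decrement rules (upper_hr_threshold, decrement, min_size, apply_max_decrement, max_decrement) mirror the increment case and can be sketched the same way. Again, this is a reading of the field descriptions under stated assumptions, not the library's code; the function name and the hit_rate parameter are hypothetical, and the ageout modes are not covered.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT part of the HDF5 API): returns the maximum cache
 * size that the hit rate threshold decrement rules, as described in the field
 * documentation, would select at the end of one epoch.
 */
size_t
sketch_decr_max_cache_size(size_t cur_max_size, double hit_rate, double upper_hr_threshold,
                           double decrement, size_t min_size, int apply_max_decrement,
                           size_t max_decrement)
{
    double tentative;
    size_t new_max_size;

    /* Range checks taken from the field descriptions */
    assert(upper_hr_threshold >= 0.0 && upper_hr_threshold <= 1.0);
    assert(decrement >= 0.0 && decrement <= 1.0);
    assert(cur_max_size >= min_size);

    /* Shrink only when the hit rate over the epoch exceeds the threshold */
    if (hit_rate <= upper_hr_threshold)
        return cur_max_size;

    /* Tentative new (smaller) size, clipped to min_size */
    tentative    = (double)cur_max_size * decrement;
    new_max_size = (tentative <= (double)min_size) ? min_size : (size_t)tentative;

    /* Optionally limit the size of a single step to max_decrement bytes */
    if (apply_max_decrement && (cur_max_size - new_max_size) > max_decrement)
        new_max_size = cur_max_size - max_decrement;

    return new_max_size;
}
```

A decrement of 1.0 leaves the size unchanged, matching the note that this value effectively disables cache size decreases.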
    int epochs_before_evictionIN: In the ageout based cache size reduction algorithms, this field contains the minimum number of - * epochs an entry must remain unaccessed in cache before the cache size reduction algorithm tries to - * evict it. 3 is a reasonable value.
    hbool_t apply_empty_reserveIN: Boolean flag indicating whether the ageout based decrement algorithms will maintain an empty - * reserve when decreasing cache size.
    double empty_reserveIN: Empty reserve as a fraction of maximum cache size if applicable.

    When so directed, the - * ageout based algorithms will not decrease the maximum cache size unless the empty reserve can be - * met.

    The parameter must lie in the interval [0.0, 1.0]. 0.1 or 0.05 is a good place to - * start.

    - * \par Parallel configuration fields
    int dirty_bytes_thresholdIN: Threshold number of bytes of dirty metadata generation for triggering synchronizations of the - * metadata caches serving the target file in the parallel case.

    Synchronization occurs whenever the - * number of bytes of dirty metadata created since the last synchronization exceeds this - * limit.

    This field only applies to the parallel case. While it is ignored elsewhere, it can - * still draw a value out of bounds error.

    It must be consistent across all caches on any given - * file.

    By default, this field is set to 256 KB. It shouldn't be more than half the current - * maximum cache size times the minimum clean fraction.
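The recommended ceiling for dirty_bytes_threshold is simple arithmetic: half the current maximum cache size times the minimum clean fraction. A minimal sketch, with hypothetical helper names that are not part of the HDF5 API, and assuming that "256 KB" means 262144 bytes:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (NOT part of the HDF5 API): recommended upper limit for
 * dirty_bytes_threshold, i.e. 0.5 * current maximum cache size * minimum
 * clean fraction.
 */
size_t
sketch_dirty_bytes_upper_limit(size_t cur_max_size, double min_clean_fraction)
{
    /* Legal range taken from the min_clean_fraction field description */
    assert(min_clean_fraction >= 0.0 && min_clean_fraction <= 1.0);
    return (size_t)(0.5 * (double)cur_max_size * min_clean_fraction);
}

/* Returns 1 if the threshold respects the recommendation, 0 otherwise */
int
sketch_dirty_bytes_threshold_ok(size_t dirty_bytes_threshold, size_t cur_max_size,
                                double min_clean_fraction)
{
    return dirty_bytes_threshold <= sketch_dirty_bytes_upper_limit(cur_max_size, min_clean_fraction);
}
```

For a 4 MiB cache with a minimum clean fraction of 0.25 the limit is 512 KiB, so the 256 KB default fits; for a 1 MiB cache with the same fraction the limit drops to 128 KiB and the default would exceed the recommendation.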

    + * \snippet H5ACpublic.h H5AC_cache_config_t_snip + * \click4more * * \since 1.8.0 * - * \todo Fix the MDC document reference! */ H5_DLL herr_t H5Fset_mdc_config(hid_t file_id, H5AC_cache_config_t *config_ptr); /** @@ -1495,13 +1103,12 @@ H5_DLL herr_t H5Fget_mdc_size(hid_t file_id, size_t *max_size_ptr, size_t *min_c * is enabled. However, the call should be useful if you choose to control metadata cache size from your * program. * - * See "Metadata Caching in HDF5" for details about the metadata cache and the adaptive cache resizing + * See \ref_mdc_in_hdf5 for details about the metadata cache and the adaptive cache resizing * algorithms. If you have not read, understood, and thought about the material covered in that * documentation, * you should not be using this API call. * \endparblock * - * \todo Fix the MDC document reference! */ H5_DLL herr_t H5Freset_mdc_hit_rate_stats(hid_t file_id); /** @@ -1797,6 +1404,9 @@ H5_DLL herr_t H5Fclear_elink_file_cache(hid_t file_id); * For the parameters \p low and \p high, see the description for * H5Pset_libver_bounds(). * + * \par Example + * \snippet H5F_examples.c libver_bounds + * * \since 1.10.2 * */ @@ -1835,7 +1445,7 @@ H5_DLL herr_t H5Fset_libver_bounds(hid_t file_id, H5F_libver_t low, H5F_libver_t * list, and H5Fget_mdc_logging_status() will return the current state of * the logging flags. * - * The log format is described in the \Emph{Metadata Cache Logging} document. + * The log format is described in the \ref_mdc_logging document. * * \note Logging can only be started or stopped if metadata cache logging was enabled * via H5Pset_mdc_log_options().\n @@ -1849,8 +1459,6 @@ H5_DLL herr_t H5Fset_libver_bounds(hid_t file_id, H5F_libver_t low, H5F_libver_t * * \since 1.10.0 * - * \todo Fix the document reference! 
- * */ H5_DLL herr_t H5Fstart_mdc_logging(hid_t file_id); /** @@ -1887,7 +1495,7 @@ H5_DLL herr_t H5Fstart_mdc_logging(hid_t file_id); * list, and H5Fget_mdc_logging_status() will return the current state of * the logging flags. * - * The log format is described in the \Emph{Metadata Cache Logging} document. + * The log format is described in the \ref_mdc_logging document. * * \note Logging can only be started or stopped if metadata cache logging was enabled * via H5Pset_mdc_log_options().\n @@ -1933,7 +1541,7 @@ H5_DLL herr_t H5Fstop_mdc_logging(hid_t file_id); * list, and H5Fget_mdc_logging_status() will return the current state of * the logging flags. * - * The log format is described in the \Emph{Metadata Cache Logging} document. + * The log format is described in the \ref_mdc_logging document. * * \note Unlike H5Fstart_mdc_logging() and H5Fstop_mdc_logging(), this function can * be called on any open file identifier. @@ -1944,7 +1552,7 @@ H5_DLL herr_t H5Fget_mdc_logging_status(hid_t file_id, hbool_t *is_enabled, hboo /** * \ingroup SWMR * - * \todo Finish this! + * \todo UFO? */ H5_DLL herr_t H5Fformat_convert(hid_t fid); /** @@ -1998,7 +1606,7 @@ H5_DLL herr_t H5Fget_page_buffering_stats(hid_t file_id, unsigned accesses[2], u * \brief Obtains information about a cache image if it exists * * \file_id - * \param[out] image_addr Offset of the cache image if it exists, or #HADDR_UNDEF if it does not + * \param[out] image_addr Offset of the cache image if it exists, or \c HADDR_UNDEF if it does not * \param[out] image_size Length of the cache image if it exists, or 0 if it does not * \returns \herr_t * @@ -2139,11 +1747,10 @@ H5_DLL herr_t H5Fwait(hid_t file_id); * the desired behavior. * \endparblock * - * \see Enabling a Strict Consistency Semantics Model in Parallel HDF5 + * \see \ref_cons_semantics * * \since 1.8.9 * - * \todo Fix the reference! 
*/ H5_DLL herr_t H5Fset_mpi_atomicity(hid_t file_id, hbool_t flag); /** @@ -2163,11 +1770,10 @@ H5_DLL herr_t H5Fset_mpi_atomicity(hid_t file_id, hbool_t flag); * Upon successful return, \p flag will be set to \c 1 if file access is set * to atomic mode and \c 0 if file access is set to nonatomic mode. * - * \see Enabling a Strict Consistency Semantics Model in Parallel HDF5 + * \see \ref_cons_semantics * * \since 1.8.9 * - * \todo Fix the reference! */ H5_DLL herr_t H5Fget_mpi_atomicity(hid_t file_id, hbool_t *flag); #endif /* H5_HAVE_PARALLEL */ @@ -2199,14 +1805,14 @@ H5_DLL herr_t H5Fget_mpi_atomicity(hid_t file_id, hbool_t *flag); #ifndef H5_NO_DEPRECATED_SYMBOLS /* Macros */ -#define H5F_ACC_DEBUG (H5CHECK H5OPEN 0x0000u) /*print debug info (deprecated)*/ +#define H5F_ACC_DEBUG (H5CHECK H5OPEN 0x0000u) /**< Print debug info \deprecated In which version? */ /* Typedefs */ /** * Current "global" information about file */ -//! [H5F_info1_t_snip] +//! typedef struct H5F_info1_t { hsize_t super_ext_size; /**< Superblock extension size */ struct { @@ -2214,7 +1820,7 @@ typedef struct H5F_info1_t { H5_ih_info_t msgs_info; /**< Shared object header message index & heap size */ } sohm; } H5F_info1_t; -//! [H5F_info1_t_snip] +//! /* Function prototypes */ /** @@ -2252,7 +1858,7 @@ typedef struct H5F_info1_t { * header indexes. Each index might be either a B-tree or * a list. * - * \version 1.10.0 C function H5Fget_info() renamed to H5Fget_info1() and + * \version 1.10.0 Function H5Fget_info() renamed to H5Fget_info1() and * deprecated in this release. * * \since 1.8.0 diff --git a/src/H5Gmodule.h b/src/H5Gmodule.h index 219342d..fe26bd2 100644 --- a/src/H5Gmodule.h +++ b/src/H5Gmodule.h @@ -31,9 +31,94 @@ /** * \defgroup H5G H5G - * \brief Group Interface - * \details The HDF5 Group Interface, H5G, provides a mechanism for managing - * HDF5 groups and their members, which are other HDF5 objects. 
+ * + * \details \Bold{Groups in HDF5:} A group associates names with objects and + * provides a mechanism for mapping a name to an object. Since all + * objects appear in at least one group (with the possible exception of + * the root object) and since objects can have names in more than one + * group, the set of all objects in an HDF5 file is a directed + * graph. The internal nodes (nodes with out-degree greater than zero) + * must be groups while the leaf nodes (nodes with out-degree zero) are + * either empty groups or objects of some other type. Exactly one + * object in every non-empty file is the root object. The root object + * always has a positive in-degree because it is pointed to by the file + * super block. + * + * \Bold{Locating objects in the HDF5 file hierarchy:} An object name + * consists of one or more components separated from one another by + * slashes. An absolute name begins with a slash and the object is + * located by looking for the first component in the root object, then + * looking for the second component in the first object, etc., until + * the entire name is traversed. A relative name does not begin with a + * slash and the traversal begins at the location specified by the + * create or access function. + * + * \Bold{Group implementations in HDF5:} The original HDF5 group + * implementation provided a single indexed structure for link + * storage. A new group implementation, in HDF5 Release 1.8.0, enables + * more efficient compact storage for very small groups, improved link + * indexing for large groups, and other advanced features. + * + * \li The \Emph{original indexed} format remains the default. Links + * are stored in a B-tree in the group’s local heap. + * \li Groups created in the new \Emph{compact-or-indexed} format, the + * implementation introduced with Release 1.8.0, can be tuned for + * performance, switching between the compact and indexed formats + * at thresholds set in the user application. 
+ * - The \Emph{compact} format will conserve file space and processing + * overhead when working with small groups and is particularly + * valuable when a group contains no links. Links are stored + * as a list of messages in the group’s header. + * - The \Emph{indexed} format will yield improved + * performance when working with large groups, e.g., groups + * containing thousands to millions of members. Links are stored in + * a fractal heap and indexed with an improved B-tree. + * \li The new implementation also enables the use of link names consisting of + * non-ASCII character sets (see H5Pset_char_encoding()) and is + * required for all link types other than hard or soft links, e.g., + * external and user-defined links (see the \ref H5L APIs). + * + * The original group structure and the newer structures are not + * directly interoperable. By default, a group will be created in the + * original indexed format. An existing group can be changed to a + * compact-or-indexed format if the need arises; there is no capability + * to change back. As stated above, once in the compact-or-indexed + * format, a group can switch between compact and indexed as needed. + * + * Groups will be initially created in the compact-or-indexed format + * only when one or more of the following conditions is met: + * \li The low version bound value of the library version bounds property + * has been set to Release 1.8.0 or later in the file access property + * list (see H5Pset_libver_bounds()). Currently, that would require an + * H5Pset_libver_bounds() call with the low parameter set to + * #H5F_LIBVER_LATEST.\n When this property is set for an HDF5 file, + * all objects in the file will be created using the latest available + * format; no effort will be made to create a file that can be read by + * older libraries. + * \li The creation order tracking property, #H5P_CRT_ORDER_TRACKED, has been + * set in the group creation property list (see H5Pset_link_creation_order()). 
+ * + * An existing group, currently in the original indexed format, will be + * converted to the compact-or-indexed format upon the occurrence of + * any of the following events: + * \li An external or user-defined link is inserted into the group. + * \li A link named with a string composed of non-ASCII characters is + * inserted into the group. + * + * The compact-or-indexed format offers performance improvements that + * will be most notable at the extremes, i.e., in groups with zero + * members and in groups with tens of thousands of members. But + * measurable differences may sometimes appear at a threshold as low as + * eight group members. Since these performance thresholds and criteria + * differ from application to application, tunable settings are + * provided to govern the switch between the compact and indexed + * formats (see H5Pset_link_phase_change()). Optimal thresholds will + * depend on the application and the operating environment. + * + * Future versions of HDF5 will retain the ability to create, read, + * write, and manipulate all groups stored in either the original + * indexed format or the compact-or-indexed format. + * */ #endif /* H5Gmodule_H */ diff --git a/src/H5Gpublic.h b/src/H5Gpublic.h index 68786dc..416ff2c 100644 --- a/src/H5Gpublic.h +++ b/src/H5Gpublic.h @@ -41,24 +41,31 @@ /* Public Typedefs */ /*******************/ -/* Types of link storage for groups */ +//! 
+/** + * Types of link storage for groups + */ typedef enum H5G_storage_type_t { - H5G_STORAGE_TYPE_UNKNOWN = -1, /* Unknown link storage type */ - H5G_STORAGE_TYPE_SYMBOL_TABLE, /* Links in group are stored with a "symbol table" */ - /* (this is sometimes called "old-style" groups) */ - H5G_STORAGE_TYPE_COMPACT, /* Links are stored in object header */ - H5G_STORAGE_TYPE_DENSE /* Links are stored in fractal heap & indexed with v2 B-tree */ + H5G_STORAGE_TYPE_UNKNOWN = -1, /**< Unknown link storage type */ + H5G_STORAGE_TYPE_SYMBOL_TABLE, /**< Links in group are stored with a "symbol table" */ + /**< (this is sometimes called "old-style" groups) */ + H5G_STORAGE_TYPE_COMPACT, /**< Links are stored in object header */ + H5G_STORAGE_TYPE_DENSE /**< Links are stored in fractal heap & indexed with v2 B-tree */ } H5G_storage_type_t; +//! -/* Information struct for group (for H5Gget_info/H5Gget_info_by_name/H5Gget_info_by_idx) */ -//! [H5G_info_t_snip] +//! +/** + * Information struct for group for + * H5Gget_info(), H5Gget_info_by_name(), and H5Gget_info_by_idx() + */ typedef struct H5G_info_t { - H5G_storage_type_t storage_type; /* Type of storage for links in group */ - hsize_t nlinks; /* Number of links in group */ - int64_t max_corder; /* Current max. creation order value for group */ - hbool_t mounted; /* Whether group has a file mounted on it */ + H5G_storage_type_t storage_type; /**< Type of storage for links in group */ + hsize_t nlinks; /**< Number of links in group */ + int64_t max_corder; /**< Current max. creation order value for group */ + hbool_t mounted; /**< Whether group has a file mounted on it */ } H5G_info_t; -//! [H5G_info_t_snip] +//! 
/********************/ /* Public Variables */ @@ -120,24 +127,8 @@ H5_DLL hid_t H5Gcreate2(hid_t loc_id, const char *name, hid_t lcpl_id, hid_t gcp /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gcreate2() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] name Name of the group to create - * \lcpl_id - * \gcpl_id - * \gapl_id - * \es_id - * - * \return \hid_t{group} - * - * \see H5Gcreate2() - * + * \ingroup ASYNC + * \async_variant_of{H5Gcreate} */ H5_DLL hid_t H5Gcreate_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t lcpl_id, hid_t gcpl_id, hid_t gapl_id, hid_t es_id); @@ -223,22 +214,8 @@ H5_DLL hid_t H5Gopen2(hid_t loc_id, const char *name, hid_t gapl_id); /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gopen2() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] name Name of the group to open - * \gapl_id - * \es_id - * - * \return \hid_t{group} - * - * \see H5Gopen2() - * + * \ingroup ASYNC + * \async_variant_of{H5Gopen} */ H5_DLL hid_t H5Gopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t gapl_id, hid_t es_id); @@ -296,21 +273,8 @@ H5_DLL herr_t H5Gget_info(hid_t loc_id, H5G_info_t *ginfo); /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gget_info() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[out] ginfo Struct in which group information is returned - * \es_id - * - * \return \hid_t{group} - * - * \see H5Gget_info() - * + * \ingroup ASYNC + * \async_variant_of{H5Gget_info} */ H5_DLL herr_t H5Gget_info_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, 
H5G_info_t *ginfo /*out*/, hid_t es_id); @@ -351,23 +315,8 @@ H5_DLL herr_t H5Gget_info_by_name(hid_t loc_id, const char *name, H5G_info_t *gi /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gget_info_by_name() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] name Name of the group to query - * \param[out] ginfo Struct in which group information is returned - * \lapl_id - * \es_id - * - * \return \herr_t - * - * \see H5Gget_info_by_name() - * + * \ingroup ASYNC + * \async_variant_of{H5Gget_info_by_name} */ H5_DLL herr_t H5Gget_info_by_name_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, H5G_info_t *ginfo /*out*/, @@ -423,29 +372,8 @@ H5_DLL herr_t H5Gget_info_by_idx(hid_t loc_id, const char *group_name, H5_index_ /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gcreate2() - * - * \app_file - * \app_func - * \app_line - * \fgdta_loc_id - * \param[in] group_name Name of the group to query - * \param[in] idx_type Transient index identifying object - * \param[in] order Transient index identifying object - * \param[in] n Position in the index of the group to query - * \param[out] ginfo Struct in which group information is returned - * \lapl_id - * \es_id - * - * \return Returns - * \li The size of the object name if successful, or - * \li 0 if no name is associated with the group identifier, or - * \li negative value, if failure occurred - * - * \see H5Gcreate2() - * + * \ingroup ASYNC + * \async_variant_of{H5Gget_info_by_idx} */ H5_DLL herr_t H5Gget_info_by_idx_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *group_name, H5_index_t idx_type, @@ -523,29 +451,18 @@ H5_DLL herr_t H5Grefresh(hid_t group_id); * Failure to release a group with this call will result in * 
resource leaks. * - * \since 1.0.0 + * \par Example + * \snippet H5F_examples.c mount * - * \version 1.4.0 Fortran function introduced in this release + * \since 1.0.0 * */ H5_DLL herr_t H5Gclose(hid_t group_id); /** * -------------------------------------------------------------------------- - * \ingroup H5G - * - * \brief Asynchronous version of H5Gcreate2() - * - * \app_file - * \app_func - * \app_line - * \group_id - * \es_id - * - * \return \herr_t - * - * \see H5Gcreate2() - * + * \ingroup ASYNC + * \async_variant_of{H5Gclose} */ H5_DLL herr_t H5Gclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t group_id, hid_t es_id); @@ -595,60 +512,649 @@ H5_DLL herr_t H5Gclose_async(const char *app_file, const char *app_func, unsigne /* Typedefs */ -/* +//! +/** * An object has a certain type. The first few numbers are reserved for use * internally by HDF5. Users may add their own types with higher values. The - * values are never stored in the file -- they only exist while an - * application is running. An object may satisfy the `isa' function for more - * than one type. + * values are never stored in the file -- they only exist while an application + * is running. An object may satisfy the `isa' function for more than one type. 
+ * + * \deprecated */ typedef enum H5G_obj_t { - H5G_UNKNOWN = -1, /* Unknown object type */ - H5G_GROUP, /* Object is a group */ - H5G_DATASET, /* Object is a dataset */ - H5G_TYPE, /* Object is a named data type */ - H5G_LINK, /* Object is a symbolic link */ - H5G_UDLINK, /* Object is a user-defined link */ - H5G_RESERVED_5, /* Reserved for future use */ - H5G_RESERVED_6, /* Reserved for future use */ - H5G_RESERVED_7 /* Reserved for future use */ + H5G_UNKNOWN = -1, /**< Unknown object type */ + H5G_GROUP, /**< Object is a group */ + H5G_DATASET, /**< Object is a dataset */ + H5G_TYPE, /**< Object is a named data type */ + H5G_LINK, /**< Object is a symbolic link */ + H5G_UDLINK, /**< Object is a user-defined link */ + H5G_RESERVED_5, /**< Reserved for future use */ + H5G_RESERVED_6, /**< Reserved for future use */ + H5G_RESERVED_7 /**< Reserved for future use */ } H5G_obj_t; +//! -/** Define the operator function pointer for for H5Giterate() */ -//! [H5G_iterate_t_snip] +//! +/** + * Callback for H5Giterate() + * + * \deprecated + */ typedef herr_t (*H5G_iterate_t)(hid_t group, const char *name, void *op_data); -//! [H5G_iterate_t_snip] +//! -/** Information about an object */ -//! [H5G_stat_t_snip] +//! 
+/** + * Information about an object + * + * \deprecated + */ typedef struct H5G_stat_t { - unsigned long fileno[2]; /*file number */ - unsigned long objno[2]; /*object number */ - unsigned nlink; /*number of hard links to object*/ - H5G_obj_t type; /*basic object type */ - time_t mtime; /*modification time */ - size_t linklen; /*symbolic link value length */ - H5O_stat_t ohdr; /* Object header information */ + unsigned long fileno[2]; /**< file number */ + unsigned long objno[2]; /**< object number */ + unsigned nlink; /**< number of hard links to object*/ + H5G_obj_t type; /**< basic object type */ + time_t mtime; /**< modification time */ + size_t linklen; /**< symbolic link value length */ + H5O_stat_t ohdr; /**< Object header information */ } H5G_stat_t; -//! [H5G_stat_t_snip] +//! /* Function prototypes */ -H5_DLL hid_t H5Gcreate1(hid_t loc_id, const char *name, size_t size_hint); -H5_DLL hid_t H5Gopen1(hid_t loc_id, const char *name); -H5_DLL herr_t H5Glink(hid_t cur_loc_id, H5G_link_t type, const char *cur_name, const char *new_name); -H5_DLL herr_t H5Glink2(hid_t cur_loc_id, const char *cur_name, H5G_link_t type, hid_t new_loc_id, - const char *new_name); -H5_DLL herr_t H5Gmove(hid_t src_loc_id, const char *src_name, const char *dst_name); -H5_DLL herr_t H5Gmove2(hid_t src_loc_id, const char *src_name, hid_t dst_loc_id, const char *dst_name); -H5_DLL herr_t H5Gunlink(hid_t loc_id, const char *name); -H5_DLL herr_t H5Gget_linkval(hid_t loc_id, const char *name, size_t size, char *buf /*out*/); -H5_DLL herr_t H5Gset_comment(hid_t loc_id, const char *name, const char *comment); -H5_DLL int H5Gget_comment(hid_t loc_id, const char *name, size_t bufsize, char *buf); -H5_DLL herr_t H5Giterate(hid_t loc_id, const char *name, int *idx, H5G_iterate_t op, void *op_data); -H5_DLL herr_t H5Gget_num_objs(hid_t loc_id, hsize_t *num_objs); -H5_DLL herr_t H5Gget_objinfo(hid_t loc_id, const char *name, hbool_t follow_link, - H5G_stat_t *statbuf /*out*/); -H5_DLL ssize_t 
H5Gget_objname_by_idx(hid_t loc_id, hsize_t idx, char *name, size_t size);
+/**
+ *-------------------------------------------------------------------------
+ * \ingroup H5G
+ *
+ * \brief Creates a new group and links it into the file
+ *
+ * \fgdta_loc_id
+ * \param[in] name      Name of the group to create
+ * \param[in] size_hint Optional parameter indicating the number of bytes
+ *                      to reserve for the names that will appear in the group
+ *
+ * \return \hid_t{group}
+ *
+ * \deprecated This function is deprecated in favor of H5Gcreate2().
+ *
+ * \details H5Gcreate1() creates a new group with the specified name at the
+ *          specified location, \p loc_id. \p loc_id may be a file, group,
+ *          dataset, named datatype or attribute. If an attribute, dataset, or
+ *          named datatype is specified for \p loc_id then the group will be
+ *          created at the location where the attribute, dataset, or named
+ *          datatype is attached. The name, \p name, must not already be taken by
+ *          some other object and all parent groups must already exist.
+ *
+ *          \p name can be a relative path based at \p loc_id or an absolute
+ *          path from the root of the file. Use of this function requires that
+ *          any intermediate groups specified in the path already exist.
+ *
+ *          The length of a group name, or of the name of any object within a
+ *          group, is not limited.
+ *
+ *          \p size_hint is a hint for the number of bytes to reserve to store
+ *          the names which will be eventually added to the new group. Passing a
+ *          value of zero for \p size_hint is usually adequate since the library
+ *          is able to dynamically resize the name heap, but a correct hint may
+ *          result in better performance. If a non-positive value is supplied
+ *          for \p size_hint, then a default size is chosen.
+ *
+ *          The return value is a group identifier for the open group. This
+ *          group identifier should be closed by calling H5Gclose() when it is
+ *          no longer needed.
+ * + * See H5Gcreate_anon() for a discussion of the differences between + * H5Gcreate1() and H5Gcreate_anon(). + * + * \par Example + * \snippet H5F_examples.c mount + * + * \version 1.8.0 Function H5Gcreate() renamed to H5Gcreate1() and deprecated + * in this release. + * \since 1.0.0 + * + */ +H5_DLL hid_t H5Gcreate1(hid_t loc_id, const char *name, size_t size_hint); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Opens an existing group for modification and returns a group + * identifier for that group + * + * \fgdta_loc_id + * \param[in] name Name of the group to open + * + * \return \hid_t{group} + * + * \deprecated This function is deprecated in favor of H5Gopen2(). + * + * \details H5Gopen1() opens an existing group, \p name, at the location + * specified by \p loc_id. + * + * H5Gopen1() returns a group identifier for the group that was + * opened. This group identifier should be released by calling + * H5Gclose() when it is no longer needed. + * + * \version 1.8.0 The function H5Gopen() was renamed to H5Gopen1() + * and deprecated in this release. + * \since 1.0.0 + * + */ +H5_DLL hid_t H5Gopen1(hid_t loc_id, const char *name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Creates a link of the specified type from \p new_name to \p + * cur_name + * + * \fg_loc_id{cur_loc_id} + * \param[in] type Link type + * \param[in] cur_name Name of the existing object + * \param[in] new_name New name for the object + * + * \return \herr_t + * + * \deprecated This function is deprecated. + * + * \details H5Glink() creates a new name for an object that has some current + * name, possibly one of many names it currently has. 
+ * + * If \p link_type is #H5G_LINK_HARD, then \p cur_name must specify + * the name of an existing object and both names are interpreted + * relative to \p cur_loc_id, which is either a file identifier or a + * group identifier. + * + * If \p link_type is #H5G_LINK_SOFT, then \p cur_name can be anything + * and is interpreted at lookup time relative to the group which + * contains the final component of \p new_name. For instance, if \p + * cur_name is \Code{./foo}, \p new_name is \Code{./x/y/bar}, and a + * request is made for \Code{./x/y/bar}, then the actual object looked + * up is \Code{./x/y/./foo}. + + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Glink(hid_t cur_loc_id, H5G_link_t type, const char *cur_name, const char *new_name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Creates a link of the specified type from \p cur_name to \p + * new_name + * + * \fg_loc_id{cur_loc_id} + * \param[in] cur_name Name of the existing object + * \param[in] type Link type + * \fg_loc_id{new_loc_id} + * \param[in] new_name New name for the object + * + * \return \herr_t + * + * \deprecated This function is deprecated. + * + * \details H5Glink2() creates a new name for an object that has some current + * name, possibly one of many names it currently has. + * + * If \p link_type is #H5G_LINK_HARD, then \p cur_name must specify the + * name of an existing object and both names are interpreted relative + * to \p cur_loc_id and \p new_loc_id, respectively, which are either + * file identifiers or group identifiers. + * + * If \p link_type is #H5G_LINK_SOFT, then \p cur_name can be anything + * and is interpreted at lookup time relative to the group which + * contains the final component of \p new_name. 
 For instance, if \p + * cur_name is \Code{./foo}, \p new_name is \Code{./x/y/bar}, and a + * request is made for \Code{./x/y/bar}, then the actual object looked + * up is \Code{./x/y/./foo}. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Glink2(hid_t cur_loc_id, const char *cur_name, H5G_link_t type, hid_t new_loc_id, + const char *new_name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Renames an object within an HDF5 file + * + * \fg_loc_id{src_loc_id} + * \param[in] src_name Object's original name + * \param[in] dst_name Object's new name + * + * \return \herr_t + * + * \deprecated This function is deprecated. + * + * \details H5Gmove() renames an object within an HDF5 file. The original name, + * \p src_name, is unlinked from the group graph and the new name, \p + * dst_name, is inserted as an atomic operation. Both names are + * interpreted relative to \p src_loc_id, which is either a file or a + * group identifier. + * + * \attention Exercise care in moving groups as it is possible to render data in + * a file inaccessible with H5Gmove(). See The Group Interface in the + * HDF5 User's Guide. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Gmove(hid_t src_loc_id, const char *src_name, const char *dst_name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Renames an object within an HDF5 file + * + * \fg_loc_id{src_loc_id} + * \param[in] src_name Object's original name + * \fg_loc_id{dst_loc_id} + * \param[in] dst_name Object's new name + * + * \return \herr_t + * + * \deprecated This function is deprecated. + * + * \details H5Gmove2() renames an object within an HDF5 file. The original name, + * \p src_name, is unlinked from the group graph and the new name, \p + * dst_name, is inserted as an atomic operation.
+ * + * \p src_name and \p dst_name are interpreted relative to \p + * src_loc_id and \p dst_loc_id, respectively, which are either file or + * group identifiers. + * + * \attention Exercise care in moving groups as it is possible to render data in + * a file inaccessible with H5Gmove2(). See The Group Interface in the + * HDF5 User's Guide. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Gmove2(hid_t src_loc_id, const char *src_name, hid_t dst_loc_id, const char *dst_name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Removes the link to an object from a group + * + * \fg_loc_id{loc_id} + * \param[in] name Name of the object to unlink + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function H5Ldelete(). + * + * \details H5Gunlink() removes the object specified by \p name from the group + * graph and decrements the link count for the object to which \p name + * points. This action eliminates any association between name and the + * object to which name pointed. + * + * Object headers keep track of how many hard links refer to an object; + * when the link count reaches zero, the object can be removed from the + * file. Objects which are open are not removed until all identifiers + * to the object are closed. + * + * If the link count reaches zero, all file space associated with the + * object will be released, i.e., identified in memory as freespace. If + * any object identifier is open for the object, the space will not be + * released until after the object identifier is closed. + * + * Note that space identified as freespace is available for re-use only + * as long as the file remains open; once a file has been closed, the + * HDF5 library loses track of freespace. See “Freespace Management” in + * the HDF5 User's Guide for further details. 
+ * + * \attention Exercise care in unlinking groups as it is possible to render data in + * a file inaccessible with H5Gunlink(). See The Group Interface in the + * HDF5 User's Guide. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Gunlink(hid_t loc_id, const char *name); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Returns the name of the object that the symbolic link points to + * + * \fg_loc_id{loc_id} + * \param[in] name Symbolic link to the object whose name is to be returned + * \param[in] size Maximum number of characters to be returned in \p buf + * \param[out] buf A buffer to hold the name of the object being sought + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function H5Lget_val(). + * + * \details H5Gget_linkval() returns up to \p size characters of the name of the + * object that the symbolic link \p name points to. + * + * The parameter \p loc_id is a file or group identifier. + * + * The parameter \p name must be a symbolic link pointing to the + * desired object and must be defined relative to \p loc_id. + * + * If \p size is smaller than the size of the returned object name, then + * the name stored in \p buf will not be null-terminated. + * + * This function fails if \p name is not a symbolic link. The presence + * of a symbolic link can be tested by passing zero for \p size and \c + * NULL for \p buf. + * + * This function should be used only after H5Lget_info1() (or the + * deprecated function H5Gget_objinfo()) has been called to verify that + * \p name is a symbolic link. + * + * \version 1.8.0 Function deprecated in this release.
+ * + */ +H5_DLL herr_t H5Gget_linkval(hid_t loc_id, const char *name, size_t size, char *buf /*out*/); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Sets comment for specified object + * + * \fgdt_loc_id + * \param[in] name Name of the object whose comment is to be set or reset; + * \p name must be \Code{'.'} (dot) if \p loc_id fully specifies + * the object for which the comment is to be set. + * \param[in] comment The new comment + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function + * H5Oset_comment(). + * + * \details H5Gset_comment() sets the comment for the object specified by \p + * loc_id and \p name to \p comment. Any previously existing comment is + * overwritten. + * + * \p loc_id can specify any object in the file. \p name can be one of + * the following: + * \li The name of the object relative to \p loc_id + * \li An absolute name of the object, starting from \c /, the file’s + * root group + * \li A dot (\c .), if \p loc_id fully specifies the object + * + * If \p comment is the empty string or a null pointer, the comment + * message is removed from the object. + * + * Comments should be relatively short, null-terminated, ASCII strings. + * + * Comments can be attached to any object that has an object header, + * e.g., datasets, groups, and named datatypes, but not symbolic links. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Gset_comment(hid_t loc_id, const char *name, const char *comment); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Retrieves comment for specified object + * + * \fgdt_loc_id + * \param[in] name Name of the object whose comment is to be retrieved; + * \p name must be \Code{'.'} (dot) if \p loc_id fully specifies + * the object for which the comment is to be retrieved.
+ * \param[in] bufsize Maximum number of comment characters to be returned in \p buf. + * \param[out] buf The comment + * + * \return Returns the number of characters in the comment, counting the \c NULL + * terminator, if successful; the value returned may be larger than + * \p bufsize. Otherwise returns a negative value. + * + * \deprecated This function is deprecated in favor of the function + * H5Oget_comment(). + * + * \details H5Gget_comment() retrieves the comment for the object specified + * by \p loc_id and \p name. The comment is returned in the buffer \p + * buf. + * + * \p loc_id can specify any object in the file. \p name can be one of + * the following: + * \li The name of the object relative to \p loc_id + * \li An absolute name of the object, starting from \c /, the file’s + * root group + * \li A dot (\c .), if \p loc_id fully specifies the object + * + * At most \p bufsize characters, including a null-terminator, are + * returned in \p buf. The returned value is not null-terminated if the + * comment is longer than the supplied buffer. If the size of the + * comment is unknown, a preliminary H5Gget_comment() call will + * return the size of the comment, including space for the + * null-terminator. + * + * If an object does not have a comment, the empty string is returned + * in \p buf. + * + * \version 1.8.0 Function deprecated in this release.
+ * + */ +H5_DLL int H5Gget_comment(hid_t loc_id, const char *name, size_t bufsize, char *buf); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Iterates over the entries of a group invoking a callback for each + * entry encountered + * + * \fg_loc_id + * \param[in] name Group over which the iteration is performed + * \param[in,out] idx Location at which to begin the iteration + * \param[in] op Operation to be performed on an object at each step of the + * iteration + * \param[in,out] op_data Data associated with the operation + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function + * H5Literate1(). + * + * \details H5Giterate() iterates over the members of name in the file or group + * specified with \p loc_id. For each object in the group, the \p + * op_data and some additional information, specified below, are passed + * to the operator function. The iteration begins with the \p idx + * object in the group and the next element to be processed by the + * operator is returned in \p idx. If \p idx is NULL, then the iterator + * starts at the first group member; since no stopping point is + * returned in this case, the iterator cannot be restarted if one of + * the calls to its operator returns non-zero. H5Giterate() does not + * recursively follow links into subgroups of the specified group. + * + * The prototype for \ref H5G_iterate_t is: + * \snippet this H5G_iterate_t_snip + * + * The operation receives the group identifier for the group being + * iterated over, \p group, the name of the current object within + * the group, \p name, and the pointer to the operator data + * passed in to H5Giterate(), \p op_data. + * + * The return values from an operator are: + * \li Zero causes the iterator to continue, returning zero when all + * group members have been processed. 
+ * \li Positive causes the iterator to immediately return that positive + * value, indicating short-circuit success. The iterator can be + * restarted at the next group member. + * \li Negative causes the iterator to immediately return that value, + * indicating failure. The iterator can be restarted at the next + * group member. + * + * H5Giterate() assumes that the membership of the group identified by + * \p name remains unchanged through the iteration. If the membership + * changes during the iteration, the function's behavior is undefined. + * + * H5Giterate() is not recursive. In particular, if a member of \p name + * is found to be a group, call it \c subgroup_a, H5Giterate() does not + * examine the members of \c subgroup_a. When recursive iteration is + * required, the application must handle the recursion, explicitly + * calling H5Giterate() on discovered subgroups. + + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Giterate(hid_t loc_id, const char *name, int *idx, H5G_iterate_t op, void *op_data); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Returns number of objects in the group specified by its identifier + * + * \fg_loc_id + * \param[out] num_objs Number of objects in the group + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the function H5Gget_info(). + * + * \details H5Gget_num_objs() returns number of objects in a group. Group is + * specified by its identifier \p loc_id. If a file identifier is + * passed in, then the number of objects in the root group is returned. + * + * \version 1.8.0 Function deprecated in this release. + * + */ +H5_DLL herr_t H5Gget_num_objs(hid_t loc_id, hsize_t *num_objs); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Returns information about an object. 
+ * + * \fgdt_loc_id + * \param[in] name Name of the object for which status is being sought + * \param[in] follow_link Link flag + * \param[out] statbuf Buffer in which to return information about the object + * + * \return \herr_t + * + * \deprecated This function is deprecated in favor of the functions H5Oget_info() + * and H5Lget_info1(). + * + * \details H5Gget_objinfo() returns information about the specified object + * through the \p statbuf argument. + * + * A file or group identifier, \p loc_id, and an object name, \p name, + * relative to \p loc_id, are commonly used to specify the + * object. However, if the object identifier is already known to the + * application, an alternative approach is to use that identifier, \c + * obj_id, in place of \p loc_id, and a dot (\c .) in place of \p + * name. Thus, the alternative versions of the first portion of an + * H5Gget_objinfo() call would be as follows: + * \code + * H5Gget_objinfo (loc_id name ...) + * H5Gget_objinfo (obj_id . ...) + * \endcode + * + * If the object is a symbolic link and follow_link is zero (0), then + * the information returned describes the link itself; otherwise the + * link is followed and the information returned describes the object + * to which the link points. If \p follow_link is non-zero but the + * final symbolic link is dangling (does not point to anything), then + * an error is returned. The \p statbuf fields are undefined for an + * error. The existence of an object can be tested by calling this + * function with a \c NULL \p statbuf. + * + * H5Gget_objinfo() fills in the following data structure (defined in + * H5Gpublic.h): + * \snippet this H5G_stat_t_snip + * + * where \ref H5O_stat_t (defined in H5Opublic.h) is: + * \snippet H5Opublic.h H5O_stat_t_snip + * + * \attention Some systems will be able to record the time accurately but unable + * to retrieve the correct time; such systems (e.g., Irix64) will + * report an \c mtime value of 0 (zero). 
+ * + * \version 1.8.0 Function deprecated in this release. + * \version 1.6.1 Two new fields were added to the \ref H5G_stat_t struct in + * this release. + * + */ +H5_DLL herr_t H5Gget_objinfo(hid_t loc_id, const char *name, hbool_t follow_link, + H5G_stat_t *statbuf /*out*/); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Returns the name of an object specified by an index + * + * \fg_loc_id + * \param[in] idx Transient index identifying object + * \param[in,out] name Pointer to user-provided buffer for the object name + * \param[in] size Name length + * + * \return Returns the size of the object name if successful, or 0 if no name is + * associated with the group identifier. Otherwise returns a negative + * value. + * + * \deprecated This function is deprecated in favor of the function H5Lget_name_by_idx(). + * + * \details H5Gget_objname_by_idx() returns the name of the object specified by + * the index \p idx in the group \p loc_id. + * + * The group is specified by a group identifier \p loc_id. If + * preferred, a file identifier may be passed in \p loc_id; that file's + * root group will be assumed. + * + * \p idx is the transient index used to iterate through the objects in + * the group. The value of \p idx is any nonnegative number less than + * the total number of objects in the group, which is returned by the + * function H5Gget_num_objs(). Note that this is a transient index; an + * object may have a different index each time a group is opened. + * + * The object name is returned in the user-specified buffer \p name. + * + * If the size of the provided buffer \p name is less than or equal to + * the actual object name length, the object name is truncated to + * \Code{size - 1} characters. + * + * Note that if the size of the object's name is unknown, a preliminary + * call to H5Gget_objname_by_idx() with \p name set to \c NULL will + * return the length of the object's name.
A second call to + * H5Gget_objname_by_idx() can then be used to retrieve the actual + * name. + * + * \version 1.8.0 Function deprecated in this release. + * \since 1.6.0 + * + */ +H5_DLL ssize_t H5Gget_objname_by_idx(hid_t loc_id, hsize_t idx, char *name, size_t size); +/** + *------------------------------------------------------------------------- + * \ingroup H5G + * + * \brief Returns the type of an object specified by an index + * + * \fg_loc_id + * \param[in] idx Transient index identifying object + * + * \return Returns the type of the object if successful. Otherwise returns a + * negative value. + * + * \deprecated This function is deprecated in favor of the function H5Oget_info(). + * + * \details H5Gget_objtype_by_idx() returns the type of the object specified by + * the index \p idx in the group \p loc_id. + * + * The group is specified by a group identifier \p loc_id. If + * preferred, a file identifier may be passed in \p loc_id; that file's + * root group will be assumed. + * + * \p idx is the transient index used to iterate through the objects in + * the group. This parameter is described in more detail in the + * discussion of H5Gget_objname_by_idx(). + * + * \version 1.8.0 Function deprecated in this release. + * \version 1.6.0 The function return type changed from \c int to the enumerated + * type \ref H5G_obj_t. + * \since 1.6.0 + * + */ H5_DLL H5G_obj_t H5Gget_objtype_by_idx(hid_t loc_id, hsize_t idx); #endif /* H5_NO_DEPRECATED_SYMBOLS */ diff --git a/src/H5Ipublic.h b/src/H5Ipublic.h index be347e4..8e5e167 100644 --- a/src/H5Ipublic.h +++ b/src/H5Ipublic.h @@ -32,6 +32,7 @@ * test/tmisc.c to verify that the H5I{inc|dec|get}_ref() routines * work correctly with it. \endinternal */ +//! typedef enum H5I_type_t { H5I_UNINIT = (-2), /**< uninitialized type */ H5I_BADID = (-1), /**< invalid Type */ @@ -53,6 +54,7 @@ typedef enum H5I_type_t { H5I_EVENTSET, /**< type ID for event sets */ H5I_NTYPES /**< number of library types, MUST BE LAST! 
*/ } H5I_type_t; +//! /** * Type of IDs to return to users @@ -86,30 +88,30 @@ typedef herr_t (*H5I_free_t)(void *, void **); /** * The type of a function to compare objects & keys */ -//! [H5I_search_func_t_snip] +//! typedef int (*H5I_search_func_t)(void *obj, hid_t id, void *key); -//! [H5I_search_func_t_snip] +//! /** * The type of H5Iiterate() callback functions */ -//! [H5I_iterate_func_t_snip] +//! typedef herr_t (*H5I_iterate_func_t)(hid_t id, void *udata); -//! [H5I_iterate_func_t_snip] +//! /** * The type of the realize_cb callback for H5Iregister_future */ -//! [H5I_future_realize_func_t_snip] +//! typedef herr_t (*H5I_future_realize_func_t)(void *future_object, hid_t *actual_object_id); -//! [H5I_future_realize_func_t_snip] +//! /** * The type of the discard_cb callback for H5Iregister_future */ -//! [H5I_future_discard_func_t_snip] +//! typedef herr_t (*H5I_future_discard_func_t)(void *future_object); -//! [H5I_future_discard_func_t_snip] +//! #ifdef __cplusplus extern "C" { @@ -173,7 +175,7 @@ H5_DLL hid_t H5Iregister(H5I_type_t type, const void *object); * * \details The \p realize_cb parameter is a function pointer that will be * invoked by the HDF5 library to convert a future object into an - * actual object. The \realize_cb function may be invoked by + * actual object. The \p realize_cb function may be invoked by * H5Iobject_verify() to return the actual object for a user-defined * ID class (i.e. an ID class registered with H5Iregister_type()) or * internally by the HDF5 library in order to use or get information @@ -281,7 +283,7 @@ H5_DLL void *H5Iremove_verify(hid_t id, H5I_type_t type); * \p id. * * Valid types returned by the function are: - * \types + * \id_types * * If no valid type can be determined or the identifier submitted is * invalid, the function returns #H5I_BADID. 
diff --git a/src/H5Lmodule.h b/src/H5Lmodule.h index 54b94a4..16f1f34 100644 --- a/src/H5Lmodule.h +++ b/src/H5Lmodule.h @@ -35,6 +35,8 @@ * * \defgroup TRAV Link Traversal * \ingroup H5L + * \defgroup H5LA Advanced Link Functions + * \ingroup H5L */ #endif /* H5Lmodule_H */ diff --git a/src/H5Lpublic.h b/src/H5Lpublic.h index d5ec346..4b5e9e4 100644 --- a/src/H5Lpublic.h +++ b/src/H5Lpublic.h @@ -92,7 +92,7 @@ typedef enum { /** * \brief Information struct for links */ -//! [H5L_info2_t_snip] +//! typedef struct { H5L_type_t type; /**< Type of link */ hbool_t corder_valid; /**< Indicate if creation order is valid */ @@ -103,7 +103,7 @@ typedef struct { size_t val_size; /**< Size of a soft link or user-defined link value */ } u; } H5L_info2_t; -//! [H5L_info2_t_snip] +//! /* The H5L_class_t struct can be used to override the behavior of a * "user-defined" link class. Users should populate the struct with callback @@ -150,7 +150,7 @@ typedef ssize_t (*H5L_query_func_t)(const char *link_name, const void *lnkdata, * "user-defined" link class. Users should populate the struct with callback * functions defined elsewhere. */ -//! [H5L_class_t_snip] +//! typedef struct { int version; /**< Version number of this struct */ H5L_type_t id; /**< Link type ID */ @@ -162,16 +162,16 @@ typedef struct { H5L_delete_func_t del_func; /**< Callback for link deletion */ H5L_query_func_t query_func; /**< Callback for queries */ } H5L_class_t; -//! [H5L_class_t_snip] +//! /** * \brief Prototype for H5Literate2(), H5Literate_by_name2() operator * * The H5O_token_t version is used in the VOL layer and future public API calls. */ -//! [H5L_iterate2_t_snip] +//! typedef herr_t (*H5L_iterate2_t)(hid_t group, const char *name, const H5L_info2_t *info, void *op_data); -//! [H5L_iterate2_t_snip] +//! 
/** * \brief Callback for external link traversal @@ -201,8 +201,6 @@ typedef herr_t (*H5L_elink_traverse_t)(const char *parent_file_name, const char * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lmove() moves a link within an HDF5 file. The original link, * \p src_name, is removed from \p src_loc and the new link, * \p dst_name, is inserted at dst_loc. This change is @@ -321,8 +319,6 @@ H5_DLL herr_t H5Lcopy(hid_t src_loc, const char *src_name, hid_t dst_loc, const * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lcreate_hard() creates a new hard link to a pre-existing object * in an HDF5 file. * @@ -357,6 +353,11 @@ H5_DLL herr_t H5Lcopy(hid_t src_loc, const char *src_name, hid_t dst_loc, const */ H5_DLL herr_t H5Lcreate_hard(hid_t cur_loc, const char *cur_name, hid_t dst_loc, const char *dst_name, hid_t lcpl_id, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Lcreate_hard} + */ H5_DLL herr_t H5Lcreate_hard_async(const char *app_file, const char *app_func, unsigned app_line, hid_t cur_loc_id, const char *cur_name, hid_t new_loc_id, const char *new_name, hid_t lcpl_id, hid_t lapl_id, hid_t es_id); @@ -373,8 +374,6 @@ H5_DLL herr_t H5Lcreate_hard_async(const char *app_file, const char *app_func, u * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lcreate_soft() creates a new soft link to an object in an HDF5 * file. 
* @@ -426,6 +425,11 @@ H5_DLL herr_t H5Lcreate_hard_async(const char *app_file, const char *app_func, u */ H5_DLL herr_t H5Lcreate_soft(const char *link_target, hid_t link_loc_id, const char *link_name, hid_t lcpl_id, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Lcreate_soft} + */ H5_DLL herr_t H5Lcreate_soft_async(const char *app_file, const char *app_func, unsigned app_line, const char *link_target, hid_t link_loc_id, const char *link_name, hid_t lcpl_id, hid_t lapl_id, hid_t es_id); @@ -440,8 +444,6 @@ H5_DLL herr_t H5Lcreate_soft_async(const char *app_file, const char *app_func, u * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Ldelete() removes the link specified by \p name from the location * \p loc_id. * @@ -468,6 +470,11 @@ H5_DLL herr_t H5Lcreate_soft_async(const char *app_file, const char *app_func, u * */ H5_DLL herr_t H5Ldelete(hid_t loc_id, const char *name, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Ldelete} + */ H5_DLL herr_t H5Ldelete_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t lapl_id, hid_t es_id); /** @@ -484,8 +491,6 @@ H5_DLL herr_t H5Ldelete_async(const char *app_file, const char *app_func, unsign * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Ldelete_by_idx() removes the \Emph{n}-th link in a group * according to the specified order, \p order, in the specified index, * \p index. 
@@ -500,6 +505,11 @@ H5_DLL herr_t H5Ldelete_async(const char *app_file, const char *app_func, unsign */ H5_DLL herr_t H5Ldelete_by_idx(hid_t loc_id, const char *group_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Ldelete_by_idx} + */ H5_DLL herr_t H5Ldelete_by_idx_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *group_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t lapl_id, hid_t es_id); @@ -516,8 +526,6 @@ H5_DLL herr_t H5Ldelete_by_idx_async(const char *app_file, const char *app_func, * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lget_val() returns tha value of link \p name. For smbolic links, * this is the path to which the link points, including the null * terminator. For external and user-defined links, it is the link @@ -575,8 +583,6 @@ H5_DLL herr_t H5Lget_val(hid_t loc_id, const char *name, void *buf /*out*/, size * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lget_val_by_idx() retrieves the value of the \Emph{n}-th link in * a group, according to the specified order, \p order, within an * index, \p index. @@ -630,8 +636,6 @@ H5_DLL herr_t H5Lget_val_by_idx(hid_t loc_id, const char *group_name, H5_index_t * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lexists() allows an application to determine whether the link \p * name exists in the location specified by \p loc_id. The link may be * of any type; only the presence of a link with that name is checked. 
@@ -707,6 +711,11 @@ H5_DLL herr_t H5Lget_val_by_idx(hid_t loc_id, const char *group_name, H5_index_t * */ H5_DLL htri_t H5Lexists(hid_t loc_id, const char *name, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Lexists} + */ H5_DLL herr_t H5Lexists_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hbool_t *exists, hid_t lapl_id, hid_t es_id); /** @@ -721,8 +730,6 @@ H5_DLL herr_t H5Lexists_async(const char *app_file, const char *app_func, unsign * * \return \herr_t * - * \todo We need to get the location ID story straight! - * * \details H5Lget_info2() returns information about the specified link through * the \p linfo argument. * @@ -830,8 +837,6 @@ H5_DLL herr_t H5Lget_info2(hid_t loc_id, const char *name, H5L_info2_t *linfo, h * * \see H5Lget_info2() * - * \todo Document H5Lget_info_by_idx() - * */ H5_DLL herr_t H5Lget_info_by_idx2(hid_t loc_id, const char *group_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, H5L_info2_t *linfo, hid_t lapl_id); @@ -957,6 +962,11 @@ H5_DLL ssize_t H5Lget_name_by_idx(hid_t loc_id, const char *group_name, H5_index */ H5_DLL herr_t H5Literate2(hid_t grp_id, H5_index_t idx_type, H5_iter_order_t order, hsize_t *idx, H5L_iterate2_t op, void *op_data); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Literate} + */ H5_DLL herr_t H5Literate_async(const char *app_file, const char *app_func, unsigned app_line, hid_t group_id, H5_index_t idx_type, H5_iter_order_t order, hsize_t *idx_p, H5L_iterate2_t op, void *op_data, hid_t es_id); @@ -1642,8 +1652,10 @@ H5_DLL herr_t H5Lcreate_external(const char *file_name, const char *obj_name, hi /* Typedefs */ -/* Information struct for link (for H5Lget_info1/H5Lget_info_by_idx1) */ -//! [H5L_info1_t_snip] +//! 
+/** + * Information struct for link (for H5Lget_info1() and H5Lget_info_by_idx1()) + */ typedef struct { H5L_type_t type; /**< Type of link */ hbool_t corder_valid; /**< Indicate if creation order is valid */ @@ -1654,7 +1666,7 @@ typedef struct { size_t val_size; /**< Size of a soft link or UD link value */ } u; } H5L_info1_t; -//! [H5L_info1_t_snip] +//! /** Callback during link traversal */ typedef hid_t (*H5L_traverse_0_func_t)(const char *link_name, hid_t cur_group, const void *lnkdata, @@ -1674,9 +1686,9 @@ typedef struct { } H5L_class_0_t; /** Prototype for H5Literate1() / H5Literate_by_name1() operator */ -//! [H5L_iterate1_t_snip] +//! typedef herr_t (*H5L_iterate1_t)(hid_t group, const char *name, const H5L_info1_t *info, void *op_data); -//! [H5L_iterate1_t_snip] +//! /* Function prototypes */ /** @@ -1694,8 +1706,6 @@ typedef herr_t (*H5L_iterate1_t)(hid_t group, const char *name, const H5L_info1_ * \deprecated As of HDF5-1.12 this function has been deprecated in favor of * the function H5Lget_info2() or the macro H5Lget_info(). * - * \todo We need to get the location ID story straight! - * * \details H5Lget_info1() returns information about the specified link through * the \p linfo argument. * diff --git a/src/H5MMpublic.h b/src/H5MMpublic.h index ebfb377..70ac644 100644 --- a/src/H5MMpublic.h +++ b/src/H5MMpublic.h @@ -29,8 +29,13 @@ #include "H5public.h" /* These typedefs are currently used for VL datatype allocation/freeing */ +//! typedef void *(*H5MM_allocate_t)(size_t size, void *alloc_info); +//! + +//! typedef void (*H5MM_free_t)(void *mem, void *free_info); +//! 
 #ifdef __cplusplus extern "C" { diff --git a/src/H5Mmodule.h b/src/H5Mmodule.h index 8b4f11f..3dae3e2 100644 --- a/src/H5Mmodule.h +++ b/src/H5Mmodule.h @@ -26,4 +26,49 @@ #define H5_MY_PKG_ERR H5E_MAP #define H5_MY_PKG_INIT YES +/** + * \defgroup H5M H5M + * \brief Map Interface + * + * \details \Bold{The interface can only be used with the HDF5 VOL connectors that + * implement map objects.} The native HDF5 library does not support this + * feature. + * + * While the HDF5 data model is a flexible way to store data, some + * applications require a more general way to index information. HDF5 + * effectively uses key-value stores internally for a variety of + * purposes, but it does not expose a generic key-value store to the + * API. The Map APIs provide this capability to HDF5 applications + * in the form of HDF5 map objects. These Map objects contain + * application-defined key-value stores, to which key-value pairs can + * be added, and from which values can be retrieved by key. + * + * HDF5 VOL connectors with support for map objects: + * - DAOS + * + * \par Example: + * \code + * hid_t file_id, fapl_id, map_id, vls_type_id; + * const char *names[2] = {"Alice", "Bob"}; + * uint64_t IDs[2] = {25385486, 34873275}; + * uint64_t val_out; + * + * + * + * vls_type_id = H5Tcopy(H5T_C_S1); + * H5Tset_size(vls_type_id, H5T_VARIABLE); + * file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl_id); + * map_id = H5Mcreate(file_id, "map", vls_type_id, H5T_NATIVE_UINT64, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT); + * H5Mput(map_id, vls_type_id, &names[0], H5T_NATIVE_UINT64, &IDs[0], H5P_DEFAULT); + * H5Mput(map_id, vls_type_id, &names[1], H5T_NATIVE_UINT64, &IDs[1], H5P_DEFAULT); + * H5Mget(map_id, vls_type_id, &names[0], H5T_NATIVE_UINT64, &val_out, H5P_DEFAULT); + * if(val_out != IDs[0]) + * ERROR; + * H5Mclose(map_id); + * H5Tclose(vls_type_id); + * H5Fclose(file_id); + * \endcode + * + */ + #endif /* H5Dmodule_H */ diff --git a/src/H5Mpublic.h b/src/H5Mpublic.h
index 4d8bef4..e0be828 100644 --- a/src/H5Mpublic.h +++ b/src/H5Mpublic.h @@ -61,8 +61,12 @@ typedef enum H5VL_map_specific_t { H5VL_MAP_DELETE /* H5Mdelete */ } H5VL_map_specific_t; -/* Callback for H5Miterate() */ +//! +/** + * Callback for H5Miterate() + */ typedef herr_t (*H5M_iterate_t)(hid_t map_id, const void *key, void *op_data); +//! /********************/ /* Public Variables */ @@ -81,38 +85,397 @@ extern "C" { */ #ifdef H5_HAVE_MAP_API +/** + * \ingroup H5M + * + * \brief Creates a map object + * + * \fgdta_loc_id + * \param[in] name Map object name + * \type_id{key_type_id} + * \type_id{val_type_id} + * \lcpl_id + * \mcpl_id + * \mapl_id + * \returns \hid_t{map object} + * + * \details H5Mcreate() creates a new map object for storing key-value + * pairs. The in-file datatype for keys is defined by \p key_type_id + * and the in-file datatype for values is defined by \p val_type_id. \p + * loc_id specifies the location in which to create the map object and \p + * name specifies the name of the link to the map object relative to + * \p loc_id.
+ * + * \since 1.13.0 + * + */ H5_DLL hid_t H5Mcreate(hid_t loc_id, const char *name, hid_t key_type_id, hid_t val_type_id, hid_t lcpl_id, hid_t mcpl_id, hid_t mapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Mcreate} + */ H5_DLL hid_t H5Mcreate_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t key_type_id, hid_t val_type_id, hid_t lcpl_id, hid_t mcpl_id, hid_t mapl_id, hid_t es_id); + +/** + * \ingroup H5M + * + * \brief + * + * \details + * + * \since 1.13.0 + * + */ H5_DLL hid_t H5Mcreate_anon(hid_t loc_id, hid_t key_type_id, hid_t val_type_id, hid_t mcpl_id, hid_t mapl_id); + +/** + * \ingroup H5M + * + * \brief Opens a map object + * + * \fgdta_loc_id{loc_id} + * \param[in] name Map object name relative to \p loc_id + * \mapl_id + * \returns \hid_t{map object} + * + * \details H5Mopen() finds a map object specified by \p name under the location + * specified by \p loc_id. The map object should be closed with + * H5Mclose() when the application is no longer interested in + * accessing it. + * + * \since 1.13.0 + * + */ H5_DLL hid_t H5Mopen(hid_t loc_id, const char *name, hid_t mapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Mopen} + */ H5_DLL hid_t H5Mopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t mapl_id, hid_t es_id); + +/** + * \ingroup H5M + * + * \brief Terminates access to a map object + * + * \map_id + * \returns \herr_t + * + * \details H5Mclose() closes access to a map object specified by \p map_id and + * releases resources used by it. + * + * It is illegal to subsequently use that same map identifier in calls + * to other map functions.
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mclose(hid_t map_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Mclose} + */ H5_DLL herr_t H5Mclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t map_id, hid_t es_id); -H5_DLL hid_t H5Mget_key_type(hid_t map_id); -H5_DLL hid_t H5Mget_val_type(hid_t map_id); -H5_DLL hid_t H5Mget_create_plist(hid_t map_id); -H5_DLL hid_t H5Mget_access_plist(hid_t map_id); + +/** + * \ingroup H5M + * + * \brief Gets key datatype for a map object + * + * \map_id + * \returns \hid_t{datatype} + * + * \details H5Mget_key_type() retrieves the key datatype as stored in the file for a + * map object specified by \p map_id and returns an identifier for the + * datatype. + * + * \since 1.13.0 + * + */ +H5_DLL hid_t H5Mget_key_type(hid_t map_id); + +/** + * \ingroup H5M + * + * \brief Gets value datatype for a map object + * + * \map_id + * \returns \hid_t{datatype} + * + * \details H5Mget_val_type() retrieves the value datatype as stored in the file for + * a map object specified by \p map_id and returns an identifier for the + * datatype. + * + * \since 1.13.0 + * + */ +H5_DLL hid_t H5Mget_val_type(hid_t map_id); + +/** + * \ingroup H5M + * + * \brief Gets creation property list for a map object + * + * \map_id + * \returns \hid_t{map creation property list} + * + * \details H5Mget_create_plist() returns an identifier for a copy of the + * creation property list for a map object specified by \p map_id. + * + * \since 1.13.0 + * + */ +H5_DLL hid_t H5Mget_create_plist(hid_t map_id); + +/** + * \ingroup H5M + * + * \brief Gets access property list for a map object + * + * \map_id + * \returns \hid_t{map access property list} + * + * \details H5Mget_access_plist() returns an identifier for a copy of the access + * property list for a map object specified by \p map_id.
+ * + * \since 1.13.0 + * + */ +H5_DLL hid_t H5Mget_access_plist(hid_t map_id); + +/** + * \ingroup H5M + * + * \brief Retrieves the number of key-value pairs in a map object + * + * \map_id + * \param[out] count The number of key-value pairs stored in the map object + * \dxpl_id + * \returns \herr_t + * + * \details H5Mget_count() retrieves the number of key-value pairs stored in a + * map specified by map_id. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mget_count(hid_t map_id, hsize_t *count, hid_t dxpl_id); + +/** + * \ingroup H5M + * + * \brief Adds a key-value pair to a map object + * + * \map_id + * \type_id{key_mem_type_id} + * \param[in] key Pointer to key buffer + * \type_id{val_mem_type_id} + * \param[in] value Pointer to value buffer + * \dxpl_id + * \returns \herr_t + * + * \details H5Mput() adds a key-value pair to a map object specified by \p + * map_id, or updates the value for the specified key if one was set + * previously. + * + * \p key_mem_type_id and \p val_mem_type_id specify the datatypes for + * the provided key and value buffers, and if different from those used + * to create the map object, the key and value will be internally + * converted to the datatypes for the map object. + * + * Any further options can be specified through the property list + * \p dxpl_id. 
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mput(hid_t map_id, hid_t key_mem_type_id, const void *key, hid_t val_mem_type_id, const void *value, hid_t dxpl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Mput} + */ H5_DLL herr_t H5Mput_async(const char *app_file, const char *app_func, unsigned app_line, hid_t map_id, hid_t key_mem_type_id, const void *key, hid_t val_mem_type_id, const void *value, hid_t dxpl_id, hid_t es_id); + +/** + * \ingroup H5M + * + * \brief Retrieves a key-value pair from a map object + * + * \map_id + * \type_id{key_mem_type_id} + * \param[in] key Pointer to key buffer + * \type_id{val_mem_type_id} + * \param[out] value Pointer to value buffer + * \dxpl_id + * \returns \herr_t + * + * \details H5Mget() retrieves from a map object specified by \p map_id, the + * value associated with the provided key \p key. \p key_mem_type_id + * and \p val_mem_type_id specify the datatypes for the provided key + * and value buffers. If the datatype specified by \p + * key_mem_type_id is different from that used to create the map object + * the key will be internally converted to the datatype for the map + * object for the query, and if the datatype specified by \p + * val_mem_type_id is different from that used to create the map object + * the returned value will be converted to have a datatype as specified + * by \p val_mem_type_id before the function returns. + * + * Any further options can be specified through the property list + * \p dxpl_id.
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mget(hid_t map_id, hid_t key_mem_type_id, const void *key, hid_t val_mem_type_id, void *value, hid_t dxpl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Mget} + */ H5_DLL herr_t H5Mget_async(const char *app_file, const char *app_func, unsigned app_line, hid_t map_id, hid_t key_mem_type_id, const void *key, hid_t val_mem_type_id, void *value, hid_t dxpl_id, hid_t es_id); + +/** + * \ingroup H5M + * + * \brief Checks if provided key exists in a map object + * + * \map_id + * \type_id{key_mem_type_id} + * \param[in] key Pointer to key buffer + * \param[out] exists Pointer to a buffer to return the existence status + * \dxpl_id + * \returns \herr_t + * + * \details H5Mexists() checks if the provided key is stored in the map object + * specified by \p map_id. If \p key_mem_type_id is different from that + * used to create the map object the key will be internally converted + * to the datatype for the map object for the query. + * + * Any further options can be specified through the property list + * \p dxpl_id. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mexists(hid_t map_id, hid_t key_mem_type_id, const void *key, hbool_t *exists, hid_t dxpl_id); + +/** + * \ingroup H5M + * + * \brief Iterates over all key-value pairs in a map object + * + * \map_id + * \param[in,out] idx iteration index + * \type_id{key_mem_type_id} + * \param[in] op User-defined iterator function + * \op_data + * \dxpl_id + * \returns \herr_t + * + * \details H5Miterate() iterates over all key-value pairs stored in the map + * object specified by \p map_id, making the callback specified by \p + * op for each. The \p idx parameter is an in/out parameter that may be + * used to restart a previously interrupted iteration. 
At the start of + * iteration \p idx should be set to 0, and to restart iteration at the + * same location on a subsequent call to H5Miterate(), \p idx should be + * the same value as returned by the previous call. Iterate callback is + * defined as: + * \snippet this H5M_iterate_t_snip + * The \p key parameter is the buffer for the key for this iteration, + * converted to the datatype specified by \p key_mem_type_id. The \p + * op_data parameter is a simple pass through of the value passed to + * H5Miterate(), which can be used to store application-defined data for + * iteration. A negative return value from this function will cause + * H5Miterate() to issue an error, while a positive return value will + * cause H5Miterate() to stop iterating and return this value without + * issuing an error. A return value of zero allows iteration to continue. + * + * Any further options can be specified through the property list \p dxpl_id. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Miterate(hid_t map_id, hsize_t *idx, hid_t key_mem_type_id, H5M_iterate_t op, void *op_data, hid_t dxpl_id); + +/** + * \ingroup H5M + * + * \brief Iterates over all key-value pairs in a map object + * + * \loc_id + * \param[in] map_name Map object name relative to the location specified by \p loc_id + * \param[in,out] idx Iteration index + * \type_id{key_mem_type_id} + * \param[in] op User-defined iterator function + * \op_data + * \dxpl_id + * \lapl_id + * \returns \herr_t + * + * \details H5Miterate_by_name() iterates over all key-value pairs stored in the + * map object specified by \p map_name relative to \p loc_id, making the + * callback specified by \p op for each. The \p idx parameter is an + * in/out parameter that may + * be used to restart a previously interrupted iteration. At the start + * of iteration \p idx should be set to 0, and to restart iteration at + * the same location on a subsequent call to H5Miterate_by_name(), \p idx + * should be the same value as returned by the previous call.
Iterate + * callback is defined as: + * \snippet this H5M_iterate_t_snip + * The \p key parameter is the buffer for the key for this iteration, + * converted to the datatype specified by \p key_mem_type_id. The \p + * op_data parameter is a simple pass through of the value passed to + * H5Miterate(), which can be used to store application-defined data + * for iteration. A negative return value from this function will cause + * H5Miterate() to issue an error, while a positive return value will cause + * H5Miterate() to stop iterating and return this value without issuing an + * error. A return value of zero allows iteration to continue. + * + * Any further options can be specified through the property list \p dxpl_id. + * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Miterate_by_name(hid_t loc_id, const char *map_name, hsize_t *idx, hid_t key_mem_type_id, H5M_iterate_t op, void *op_data, hid_t dxpl_id, hid_t lapl_id); + +/** + * \ingroup H5M + * + * \brief Deletes a key-value pair from a map object + * + * \map_id + * \type_id{key_mem_type_id} + * \param[in] key Pointer to key buffer + * \dxpl_id + * \returns \herr_t + * + * \details H5Mdelete() deletes a key-value pair from the map object specified + * by \p map_id. \p key_mem_type_id specifies the datatype for the + * provided key buffer key, and if different from that used to create + * the map object, the key will be internally converted to the datatype + * for the map object. + * + * Any further options can be specified through the property list \p dxpl_id.
+ * + * \since 1.13.0 + * + */ H5_DLL herr_t H5Mdelete(hid_t map_id, hid_t key_mem_type_id, const void *key, hid_t dxpl_id); /* API Wrappers for async routines */ diff --git a/src/H5Opublic.h b/src/H5Opublic.h index 04ef35b..7bc7784 100644 --- a/src/H5Opublic.h +++ b/src/H5Opublic.h @@ -86,17 +86,15 @@ #define H5O_INFO_NUM_ATTRS 0x0004u /* Fill in the num_attrs field */ #define H5O_INFO_ALL (H5O_INFO_BASIC | H5O_INFO_TIME | H5O_INFO_NUM_ATTRS) -/* Flags for H5Oget_native_info. - * Theses flags determine which fields will be filled in in the H5O_native_info_t - * struct. +//! +/** + * Flags for H5Oget_native_info(). These flags determine which fields will be + * filled in the \ref H5O_native_info_t struct. */ -//! [H5O_native_info_fields_snip] - #define H5O_NATIVE_INFO_HDR 0x0008u /* Fill in the hdr field */ #define H5O_NATIVE_INFO_META_SIZE 0x0010u /* Fill in the meta_size field */ #define H5O_NATIVE_INFO_ALL (H5O_NATIVE_INFO_HDR | H5O_NATIVE_INFO_META_SIZE) - -//! [H5O_native_info_fields_snip] +//! /* Convenience macro to check if the token is the 'undefined' token value */ #define H5O_IS_TOKEN_UNDEF(token) (!HDmemcmp(&(token), &(H5O_TOKEN_UNDEF), sizeof(H5O_token_t))) @@ -105,46 +103,48 @@ /* Public Typedefs */ /*******************/ -//! [H5O_type_t_snip] - -/* Types of objects in file */ +//! +/** + * Types of objects in file + */ typedef enum H5O_type_t { - H5O_TYPE_UNKNOWN = -1, /* Unknown object type */ - H5O_TYPE_GROUP, /* Object is a group */ - H5O_TYPE_DATASET, /* Object is a dataset */ - H5O_TYPE_NAMED_DATATYPE, /* Object is a named data type */ - H5O_TYPE_MAP, /* Object is a map */ - H5O_TYPE_NTYPES /* Number of different object types (must be last!)
*/ + H5O_TYPE_UNKNOWN = -1, /**< Unknown object type */ + H5O_TYPE_GROUP, /**< Object is a group */ + H5O_TYPE_DATASET, /**< Object is a dataset */ + H5O_TYPE_NAMED_DATATYPE, /**< Object is a named data type */ + H5O_TYPE_MAP, /**< Object is a map */ + H5O_TYPE_NTYPES /**< Number of different object types (must be last!) */ } H5O_type_t; +//! -//! [H5O_type_t_snip] - -/* Information struct for object header metadata (for H5Oget_info/H5Oget_info_by_name/H5Oget_info_by_idx) */ -//! [H5O_hdr_info_t_snip] - +//! +/** + * Information struct for object header metadata (for + * H5Oget_info(), H5Oget_info_by_name(), H5Oget_info_by_idx()) + */ typedef struct H5O_hdr_info_t { - unsigned version; /* Version number of header format in file */ - unsigned nmesgs; /* Number of object header messages */ - unsigned nchunks; /* Number of object header chunks */ - unsigned flags; /* Object header status flags */ + unsigned version; /**< Version number of header format in file */ + unsigned nmesgs; /**< Number of object header messages */ + unsigned nchunks; /**< Number of object header chunks */ + unsigned flags; /**< Object header status flags */ struct { - hsize_t total; /* Total space for storing object header in file */ - hsize_t meta; /* Space within header for object header metadata information */ - hsize_t mesg; /* Space within header for actual message information */ - hsize_t free; /* Free space within object header */ + hsize_t total; /**< Total space for storing object header in file */ + hsize_t meta; /**< Space within header for object header metadata information */ + hsize_t mesg; /**< Space within header for actual message information */ + hsize_t free; /**< Free space within object header */ } space; struct { - uint64_t present; /* Flags to indicate presence of message type in header */ - uint64_t shared; /* Flags to indicate message type is shared in header */ + uint64_t present; /**< Flags to indicate presence of message type in header */ + uint64_t shared; /**< 
Flags to indicate message type is shared in header */ } mesg; } H5O_hdr_info_t; +//! -//! [H5O_hdr_info_t_snip] - -//! [H5O_info2_t_snip] - -/* Data model information struct for objects */ -/* (For H5Oget_info / H5Oget_info_by_name / H5Oget_info_by_idx version 3) */ +//! +/** + * Data model information struct for objects + * (For H5Oget_info(), H5Oget_info_by_name(), H5Oget_info_by_idx() version 3) + */ typedef struct H5O_info2_t { unsigned long fileno; /* File number that object is located in */ H5O_token_t token; /* Token representing the object */ @@ -156,44 +156,52 @@ typedef struct H5O_info2_t { time_t btime; /* Birth time */ hsize_t num_attrs; /* # of attributes attached to object */ } H5O_info2_t; +//! -//! [H5O_info2_t_snip] - -//! [H5O_native_info_t_snip] - -/* Native file format information struct for objects */ -/* (For H5Oget_native_info / H5Oget_native_info_by_name / H5Oget_native_info_by_idx) */ +//! +/** + * Native file format information struct for objects. + * (For H5Oget_native_info(), H5Oget_native_info_by_name(), H5Oget_native_info_by_idx()) + */ typedef struct H5O_native_info_t { - H5O_hdr_info_t hdr; /* Object header information */ + H5O_hdr_info_t hdr; /**< Object header information */ /* Extra metadata storage for obj & attributes */ struct { - H5_ih_info_t obj; /* v1/v2 B-tree & local/fractal heap for groups, B-tree for chunked datasets */ - H5_ih_info_t attr; /* v2 B-tree & heap for attributes */ + H5_ih_info_t obj; /**< v1/v2 B-tree & local/fractal heap for groups, B-tree for chunked datasets */ + H5_ih_info_t attr; /**< v2 B-tree & heap for attributes */ } meta_size; } H5O_native_info_t; +//! -//! [H5O_native_info_t_snip] - -/* Typedef for message creation indexes */ +/** + * Typedef for message creation indexes + */ typedef uint32_t H5O_msg_crt_idx_t; -/* Prototype for H5Ovisit/H5Ovisit_by_name() operator (version 3) */ -//! [H5O_iterate2_t_snip] - +//! 
+/** + * Prototype for H5Ovisit(), H5Ovisit_by_name() operator (version 3) + */ typedef herr_t (*H5O_iterate2_t)(hid_t obj, const char *name, const H5O_info2_t *info, void *op_data); +//! -//! [H5O_iterate2_t_snip] - +//! typedef enum H5O_mcdt_search_ret_t { - H5O_MCDT_SEARCH_ERROR = -1, /* Abort H5Ocopy */ - H5O_MCDT_SEARCH_CONT, /* Continue the global search of all committed datatypes in the destination file */ - H5O_MCDT_SEARCH_STOP /* Stop the search, but continue copying. The committed datatype will be copied but - not merged. */ + H5O_MCDT_SEARCH_ERROR = -1, /**< Abort H5Ocopy */ + H5O_MCDT_SEARCH_CONT, /**< Continue the global search of all committed datatypes in the destination file + */ + H5O_MCDT_SEARCH_STOP /**< Stop the search, but continue copying. The committed datatype will be copied + but not merged. */ } H5O_mcdt_search_ret_t; +//! -/* Callback to invoke when completing the search for a matching committed datatype from the committed dtype - * list */ +//! +/** + * Callback to invoke when completing the search for a matching committed + * datatype from the committed dtype list + */ typedef H5O_mcdt_search_ret_t (*H5O_mcdt_search_cb_t)(void *op_data); +//! 
/********************/ /* Public Variables */ @@ -249,6 +257,11 @@ extern "C" { * */ H5_DLL hid_t H5Oopen(hid_t loc_id, const char *name, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Oopen} + */ H5_DLL hid_t H5Oopen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t lapl_id, hid_t es_id); @@ -332,6 +345,11 @@ H5_DLL hid_t H5Oopen_by_token(hid_t loc_id, H5O_token_t token); */ H5_DLL hid_t H5Oopen_by_idx(hid_t loc_id, const char *group_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Oopen_by_idx} + */ H5_DLL hid_t H5Oopen_by_idx_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *group_name, H5_index_t idx_type, H5_iter_order_t order, hsize_t n, hid_t lapl_id, hid_t es_id); @@ -553,6 +571,11 @@ H5_DLL herr_t H5Oget_info3(hid_t loc_id, H5O_info2_t *oinfo, unsigned fields); */ H5_DLL herr_t H5Oget_info_by_name3(hid_t loc_id, const char *name, H5O_info2_t *oinfo, unsigned fields, hid_t lapl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Oget_info_by_name} + */ H5_DLL herr_t H5Oget_info_by_name_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, H5O_info2_t *oinfo /*out*/, unsigned fields, hid_t lapl_id, hid_t es_id); @@ -968,7 +991,6 @@ H5_DLL herr_t H5Odecr_refcount(hid_t object_id); * - H5Pset_copy_object() * - H5Pset_create_intermediate_group() * - H5Pset_mcdt_search_cb() - * . 
* - Copying Committed Datatypes with #H5Ocopy - A comprehensive * discussion of copying committed datatypes (PDF) in * Advanced Topics in HDF5 @@ -980,6 +1002,11 @@ H5_DLL herr_t H5Odecr_refcount(hid_t object_id); */ H5_DLL herr_t H5Ocopy(hid_t src_loc_id, const char *src_name, hid_t dst_loc_id, const char *dst_name, hid_t ocpypl_id, hid_t lcpl_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Ocopy} + */ H5_DLL herr_t H5Ocopy_async(const char *app_file, const char *app_func, unsigned app_line, hid_t src_loc_id, const char *src_name, hid_t dst_loc_id, const char *dst_name, hid_t ocpypl_id, hid_t lcpl_id, hid_t es_id); @@ -1522,6 +1549,11 @@ H5_DLL herr_t H5Ovisit_by_name3(hid_t loc_id, const char *obj_name, H5_index_t i * */ H5_DLL herr_t H5Oclose(hid_t object_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Oclose} + */ H5_DLL herr_t H5Oclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t object_id, hid_t es_id); @@ -1572,6 +1604,11 @@ H5_DLL herr_t H5Oclose_async(const char *app_file, const char *app_func, unsigne * */ H5_DLL herr_t H5Oflush(hid_t obj_id); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Oflush} + */ H5_DLL herr_t H5Oflush_async(const char *app_file, const char *app_func, unsigned app_line, hid_t obj_id, hid_t es_id); /** @@ -1599,6 +1636,11 @@ H5_DLL herr_t H5Oflush_async(const char *app_file, const char *app_func, unsigne * */ H5_DLL herr_t H5Orefresh(hid_t oid); +/** + * -------------------------------------------------------------------------- + * \ingroup ASYNC + * \async_variant_of{H5Orefresh} + */ H5_DLL herr_t H5Orefresh_async(const char *app_file, const char *app_func, unsigned app_line, hid_t oid, hid_t es_id); @@ -1873,46 +1915,49 @@ H5_DLLVAR const H5O_token_t 
H5O_TOKEN_UNDEF_g; /* Typedefs */ -/* A struct that's part of the H5G_stat_t structure (deprecated) */ -//! [H5O_stat_t_snip] +//! +/** + * A struct that's part of the \ref H5G_stat_t structure + * \deprecated + */ typedef struct H5O_stat_t { - hsize_t size; /* Total size of object header in file */ - hsize_t free; /* Free space within object header */ - unsigned nmesgs; /* Number of object header messages */ - unsigned nchunks; /* Number of object header chunks */ + hsize_t size; /**< Total size of object header in file */ + hsize_t free; /**< Free space within object header */ + unsigned nmesgs; /**< Number of object header messages */ + unsigned nchunks; /**< Number of object header chunks */ } H5O_stat_t; -//! [H5O_stat_t_snip] - -//! [H5O_info1_t_snip] +//! -/* Information struct for object */ -/* (For H5Oget_info/H5Oget_info_by_name/H5Oget_info_by_idx versions 1 & 2) */ +//! +/** + * Information struct for object (For H5Oget_info(), H5Oget_info_by_name(), + * H5Oget_info_by_idx() versions 1 & 2.) + */ typedef struct H5O_info1_t { - unsigned long fileno; /* File number that object is located in */ - haddr_t addr; /* Object address in file */ - H5O_type_t type; /* Basic object type (group, dataset, etc.) */ - unsigned rc; /* Reference count of object */ - time_t atime; /* Access time */ - time_t mtime; /* Modification time */ - time_t ctime; /* Change time */ - time_t btime; /* Birth time */ - hsize_t num_attrs; /* # of attributes attached to object */ - H5O_hdr_info_t hdr; /* Object header information */ + unsigned long fileno; /**< File number that object is located in */ + haddr_t addr; /**< Object address in file */ + H5O_type_t type; /**< Basic object type (group, dataset, etc.) 
*/ + unsigned rc; /**< Reference count of object */ + time_t atime; /**< Access time */ + time_t mtime; /**< Modification time */ + time_t ctime; /**< Change time */ + time_t btime; /**< Birth time */ + hsize_t num_attrs; /**< # of attributes attached to object */ + H5O_hdr_info_t hdr; /**< Object header information */ /* Extra metadata storage for obj & attributes */ struct { - H5_ih_info_t obj; /* v1/v2 B-tree & local/fractal heap for groups, B-tree for chunked datasets */ - H5_ih_info_t attr; /* v2 B-tree & heap for attributes */ + H5_ih_info_t obj; /**< v1/v2 B-tree & local/fractal heap for groups, B-tree for chunked datasets */ + H5_ih_info_t attr; /**< v2 B-tree & heap for attributes */ } meta_size; } H5O_info1_t; +//! -//! [H5O_info1_t_snip] - -/* Prototype for H5Ovisit/H5Ovisit_by_name() operator (versions 1 & 2) */ -//! [H5O_iterate1_t_snip] - +//! +/** + * Prototype for H5Ovisit(), H5Ovisit_by_name() operator (versions 1 & 2) + */ typedef herr_t (*H5O_iterate1_t)(hid_t obj, const char *name, const H5O_info1_t *info, void *op_data); - -//! [H5O_iterate1_t_snip] +//! /* Function prototypes */ diff --git a/src/H5PLpublic.h b/src/H5PLpublic.h index c3555bc..55ff594 100644 --- a/src/H5PLpublic.h +++ b/src/H5PLpublic.h @@ -28,8 +28,7 @@ */ #define H5PL_NO_PLUGIN "::" -//! [H5PL_type_t_snip] - +//! /** * Plugin type (bit-position) used by the plugin library */ @@ -39,8 +38,7 @@ typedef enum H5PL_type_t { H5PL_TYPE_VOL = 1, /**< VOL driver */ H5PL_TYPE_NONE = 2 /**< Sentinel: This must be last! */ } H5PL_type_t; - -//! [H5PL_type_t_snip] +//! /* Common dynamic plugin type flags used by the set/get_loading_state functions */ #define H5PL_FILTER_PLUGIN 0x0001 diff --git a/src/H5Pmodule.h b/src/H5Pmodule.h index 130cb90..18f30c6 100644 --- a/src/H5Pmodule.h +++ b/src/H5Pmodule.h @@ -44,7 +44,6 @@ * and compressed. * * \todo Describe concisely what the functions in this module are about. 
- * \todo Clicking on "more" after "Property List Interface" at the top does not work * * \defgroup GPLO General Property List Operations * \ingroup H5P @@ -70,6 +69,10 @@ * \ingroup H5P * \defgroup OCPPL Object Copy Properties * \ingroup H5P + * \defgroup GACPL General Access Properties + * \ingroup H5P + * \defgroup MAPL Map Access Properties + * \ingroup H5P */ #endif /* H5Pmodule_H */ diff --git a/src/H5Ppublic.h b/src/H5Ppublic.h index 801c561..6235ac8 100644 --- a/src/H5Ppublic.h +++ b/src/H5Ppublic.h @@ -111,23 +111,50 @@ extern "C" { /*******************/ /* Define property list class callback function pointer types */ -//! [H5P_cls_create_func_t_snip] +//! typedef herr_t (*H5P_cls_create_func_t)(hid_t prop_id, void *create_data); -//! [H5P_cls_create_func_t_snip] -//! [H5P_cls_copy_func_t_snip] +//! + +//! typedef herr_t (*H5P_cls_copy_func_t)(hid_t new_prop_id, hid_t old_prop_id, void *copy_data); -//! [H5P_cls_copy_func_t_snip] -//! [H5P_cls_close_func_t_snip] +//! + +//! typedef herr_t (*H5P_cls_close_func_t)(hid_t prop_id, void *close_data); -//! [H5P_cls_close_func_t_snip] +//! /* Define property list callback function pointer types */ -//! [H5P_prp_cb1_t_snip] +//! +/** + * \brief Callback function for H5Pregister2(), H5Pregister1(), H5Pinsert2(), H5Pinsert1() + * + * \param[in] name The name of the property + * \param[in] size The size of the property in bytes + * \param[in,out] value The value for the property + * \return \herr_t + * + * \details H5P_prp_cb1_t() describes the parameters used by the + * property create, copy, and close callback functions. + */ typedef herr_t (*H5P_prp_cb1_t)(const char *name, size_t size, void *value); -//! [H5P_prp_cb1_t_snip] -//! [H5P_prp_cb2_t_snip] +//! + +//!
+/** + * \brief Callback function for H5Pregister2(), H5Pregister1(), H5Pinsert2(), H5Pinsert1() + * + * \plist_id{prop_id} + * \param[in] name The name of the property + * \param[in] size The size of the property in bytes + * \param[in] value The value for the property + * \return \herr_t + * + * \details H5P_prp_cb2_t() describes the parameters used by the + * property set, copy, and delete callback functions. + */ typedef herr_t (*H5P_prp_cb2_t)(hid_t prop_id, const char *name, size_t size, void *value); -//! [H5P_prp_cb2_t_snip] +//! + typedef H5P_prp_cb1_t H5P_prp_create_func_t; typedef H5P_prp_cb2_t H5P_prp_set_func_t; typedef H5P_prp_cb2_t H5P_prp_get_func_t; @@ -135,60 +162,93 @@ typedef herr_t (*H5P_prp_encode_func_t)(const void *value, void **buf, size_t *s typedef herr_t (*H5P_prp_decode_func_t)(const void **buf, void *value); typedef H5P_prp_cb2_t H5P_prp_delete_func_t; typedef H5P_prp_cb1_t H5P_prp_copy_func_t; -//! [H5P_prp_compare_func_t_snip] + +//! typedef int (*H5P_prp_compare_func_t)(const void *value1, const void *value2, size_t size); -//! [H5P_prp_compare_func_t_snip] +//! + typedef H5P_prp_cb1_t H5P_prp_close_func_t; /* Define property list iteration function type */ -//! [H5P_iterate_t_snip] +//! typedef herr_t (*H5P_iterate_t)(hid_t id, const char *name, void *iter_data); -//! [H5P_iterate_t_snip] +//! -/* Actual IO mode property */ +//! +/** + * Actual IO mode property + * + * \details The default value, #H5D_MPIO_NO_CHUNK_OPTIMIZATION, is used for all + * I/O operations that do not use chunk optimizations, including + * non-collective I/O and contiguous collective I/O. + */ typedef enum H5D_mpio_actual_chunk_opt_mode_t { - /* The default value, H5D_MPIO_NO_CHUNK_OPTIMIZATION, is used for all I/O - * operations that do not use chunk optimizations, including non-collective - * I/O and contiguous collective I/O. - */ H5D_MPIO_NO_CHUNK_OPTIMIZATION = 0, + /**< No chunk optimization was performed.
Either no collective I/O was + attempted or the dataset wasn't chunked. */ H5D_MPIO_LINK_CHUNK, + /**< Collective I/O is performed on all chunks simultaneously. */ H5D_MPIO_MULTI_CHUNK + /**< Each chunk was individually assigned collective or independent I/O based + on what fraction of processes access the chunk. If the fraction is greater + than the multi chunk ratio threshold, collective I/O is performed on that + chunk. The multi chunk ratio threshold can be set using + H5Pset_dxpl_mpio_chunk_opt_ratio(). The default value is 60%. */ } H5D_mpio_actual_chunk_opt_mode_t; +//! +//! +/** + * The following values are conveniently defined as a bit field so that + * we can switch from the default to independent or collective and then to + * mixed without having to check the original value. + */ typedef enum H5D_mpio_actual_io_mode_t { - /* The following four values are conveniently defined as a bit field so that - * we can switch from the default to independent or collective and then to - * mixed without having to check the original value. - * - * NO_COLLECTIVE means that either collective I/O wasn't requested or that - * no I/O took place. - * - * CHUNK_INDEPENDENT means that collective I/O was requested, but the - * chunk optimization scheme chose independent I/O for each chunk. - */ - H5D_MPIO_NO_COLLECTIVE = 0x0, + H5D_MPIO_NO_COLLECTIVE = 0x0, + /**< No collective I/O was performed. Collective I/O was not requested or + collective I/O isn't possible on this dataset */ H5D_MPIO_CHUNK_INDEPENDENT = 0x1, - H5D_MPIO_CHUNK_COLLECTIVE = 0x2, - H5D_MPIO_CHUNK_MIXED = 0x1 | 0x2, - - /* The contiguous case is separate from the bit field. 
*/ + /**< HDF5 performed one of the chunk collective optimization schemes and each + chunk was accessed independently */ + H5D_MPIO_CHUNK_COLLECTIVE = 0x2, + /**< HDF5 performed one of the chunk collective optimization schemes and each + chunk was accessed collectively */ + H5D_MPIO_CHUNK_MIXED = 0x1 | 0x2, + /**< HDF5 performed one of the chunk collective optimization schemes and some + chunks were accessed independently, some collectively. */ + /** \internal The contiguous case is separate from the bit field. */ H5D_MPIO_CONTIGUOUS_COLLECTIVE = 0x4 + /**< Collective I/O was performed on a contiguous dataset */ } H5D_mpio_actual_io_mode_t; +//! -/* Broken collective IO property */ +//! +/** + * Broken collective IO property + */ typedef enum H5D_mpio_no_collective_cause_t { - H5D_MPIO_COLLECTIVE = 0x00, - H5D_MPIO_SET_INDEPENDENT = 0x01, - H5D_MPIO_DATATYPE_CONVERSION = 0x02, - H5D_MPIO_DATA_TRANSFORMS = 0x04, - H5D_MPIO_MPI_OPT_TYPES_ENV_VAR_DISABLED = 0x08, - H5D_MPIO_NOT_SIMPLE_OR_SCALAR_DATASPACES = 0x10, - H5D_MPIO_NOT_CONTIGUOUS_OR_CHUNKED_DATASET = 0x20, - H5D_MPIO_PARALLEL_FILTERED_WRITES_DISABLED = 0x40, + H5D_MPIO_COLLECTIVE = 0x00, + /**< Collective I/O was performed successfully */ + H5D_MPIO_SET_INDEPENDENT = 0x01, + /**< Collective I/O was not performed because independent I/O was requested */ + H5D_MPIO_DATATYPE_CONVERSION = 0x02, + /**< Collective I/O was not performed because datatype conversions were required */ + H5D_MPIO_DATA_TRANSFORMS = 0x04, + /**< Collective I/O was not performed because data transforms needed to be applied */ + H5D_MPIO_MPI_OPT_TYPES_ENV_VAR_DISABLED = 0x08, + /**< \todo FIXME!
*/ + H5D_MPIO_NOT_SIMPLE_OR_SCALAR_DATASPACES = 0x10, + /**< Collective I/O was not performed because one of the dataspaces was neither simple nor scalar */ + H5D_MPIO_NOT_CONTIGUOUS_OR_CHUNKED_DATASET = 0x20, + /**< Collective I/O was not performed because the dataset was neither contiguous nor chunked */ + H5D_MPIO_PARALLEL_FILTERED_WRITES_DISABLED = 0x40, + /**< \todo FIXME! */ H5D_MPIO_ERROR_WHILE_CHECKING_COLLECTIVE_POSSIBLE = 0x80, - H5D_MPIO_NO_COLLECTIVE_MAX_CAUSE = 0x100 + /**< \todo FIXME! */ + H5D_MPIO_NO_COLLECTIVE_MAX_CAUSE = 0x100 + /**< Sentinel */ } H5D_mpio_no_collective_cause_t; +//! /********************/ /* Public Variables */ @@ -493,8 +553,6 @@ H5_DLL hid_t H5Pcreate(hid_t cls_id); * list of this class is being created. The #H5P_cls_create_func_t * callback function is defined as follows: * - * \todo fix snippets to work, when you click on them. - * * \snippet this H5P_cls_create_func_t_snip * * The parameters to this callback function are defined as follows: @@ -1233,8 +1291,6 @@ H5_DLL herr_t H5Pget_size(hid_t id, const char *name, size_t *size); * property list objects; the initial value is assumed to * have any necessary setup already performed on it. * - * \todo "cpp_note" goes here - * * \since 1.8.0 * */ @@ -1326,9 +1382,6 @@ H5_DLL htri_t H5Pisa_class(hid_t plist_id, hid_t pclass_id); * If the membership changes during the iteration, the function's * behavior is undefined. * - * - * \todo "cpp_note" goes here - * * \since 1.4.0 * */ @@ -1582,16 +1635,12 @@ H5_DLL int H5Piterate(hid_t id, int *idx, H5P_iterate_t iter_func, void *iter_da * property is being closed. 
The #H5P_prp_close_func_t callback * function is defined as follows: * - * \snippet this H5P_prp_cb2_t_snip + * \snippet this H5P_prp_cb1_t_snip * * The parameters to the callback function are defined as follows: * * * - * - * - * - * * * * @@ -1611,8 +1660,6 @@ H5_DLL int H5Piterate(hid_t id, int *idx, H5P_iterate_t iter_func, void *iter_da * list close routine returns an error value but the property list is * still closed. * - * \todo "cpp_note" goes here - * * \since 1.8.0 * */ @@ -1701,7 +1748,7 @@ H5_DLL herr_t H5Punregister(hid_t pclass_id, const char *name); /* Object creation property list (OCPL) routines */ /** - * \ingroup OCPL + * \ingroup DCPL * * \brief Verifies that all required filters are available * @@ -1805,7 +1852,7 @@ H5_DLL herr_t H5Pget_attr_phase_change(hid_t plist_id, unsigned *max_compact, un * \todo Signature for H5Pget_filter2 is different in H5Pocpl.c than in * H5Ppublic.h * - * \plist_id{plist_id} + * \ocpl_id{plist_id} * \param[in] idx Sequence number within the filter pipeline of the filter * for which information is sought * \param[out] flags Bit vector specifying certain general properties of the @@ -1869,7 +1916,7 @@ H5_DLL H5Z_filter_t H5Pget_filter2(hid_t plist_id, unsigned idx, unsigned int *f * * \brief Returns information about the specified filter * - * \plist_id + * \ocpl_id{plist_id} * \param[in] filter_id Filter identifier * \param[out] flags Bit vector specifying certain general * properties of the filter @@ -1927,10 +1974,7 @@ H5_DLL herr_t H5Pget_filter_by_id2(hid_t plist_id, H5Z_filter_t filter_id, unsig * * \brief Returns the number of filters in the pipeline * - * \todo Signature for H5Pget_nfilters() is different in H5Pocpl.c than in - * H5Ppublic.h. - * - * \plist_id + * \ocpl_id{plist_id} * * \return Returns the number of filters in the pipeline if successful; * otherwise returns a negative value. 
@@ -1938,8 +1982,8 @@ H5_DLL herr_t H5Pget_filter_by_id2(hid_t plist_id, H5Z_filter_t filter_id, unsig * \details H5Pget_nfilters() returns the number of filters defined in the * filter pipeline associated with the property list \p plist_id. * - * In each pipeline, the filters are numbered from 0 through N-1, - * where N is the value returned by this function. During output to + * In each pipeline, the filters are numbered from 0 through \Code{N-1}, + * where \c N is the value returned by this function. During output to * the file, the filters are applied in increasing order; during * input from the file, they are applied in decreasing order. * @@ -1987,7 +2031,7 @@ H5_DLL herr_t H5Pget_obj_track_times(hid_t plist_id, hbool_t *track_times); * * \brief Modifies a filter in the filter pipeline * - * \plist_id + * \ocpl_id{plist_id} * \param[in] filter Filter to be modified * \param[in] flags Bit vector specifying certain general properties * of the filter @@ -2016,7 +2060,7 @@ H5_DLL herr_t H5Pmodify_filter(hid_t plist_id, H5Z_filter_t filter, unsigned int * * \brief Delete one or more filters in the filter pipeline * - * \plist_id + * \ocpl_id{plist_id} * \param[in] filter Filter to be deleted * * \return \herr_t @@ -2170,7 +2214,7 @@ H5_DLL herr_t H5Pset_attr_phase_change(hid_t plist_id, unsigned max_compact, uns * * \brief Sets deflate (GNU gzip) compression method and compression level * - * \plist_id + * \ocpl_id{plist_id} * \param[in] level Compression level * * \return \herr_t @@ -2226,7 +2270,7 @@ H5_DLL herr_t H5Pset_deflate(hid_t plist_id, unsigned level); * * \brief Adds a filter to the filter pipeline * - * \param[in] plist_id Dataset or group creation property list identifier + * \ocpl_id{plist_id} * \param[in] filter Filter identifier for the filter to be added to the * pipeline * \param[in] flags Bit vector specifying certain general properties of @@ -2478,7 +2522,7 @@ H5_DLL herr_t H5Pset_deflate(hid_t plist_id, unsigned level); * (The SZIP filter is 
an exception to this rule; see H5Pset_szip() * for details.) * - * \todo Removed several references to links to documentation + * \see \ref_filter_pipe, \ref_group_impls * * \version 1.8.5 Function applied to group creation property lists. * \since 1.6.0 @@ -2491,7 +2535,7 @@ H5_DLL herr_t H5Pset_filter(hid_t plist_id, H5Z_filter_t filter, unsigned int fl * * \brief Sets up use of the Fletcher32 checksum filter * - * \param[in] plist_id Dataset or group creation property list identifier + * \ocpl_id{plist_id} * * \return \herr_t * @@ -2672,7 +2716,7 @@ H5_DLL herr_t H5Pget_shared_mesg_index(hid_t plist_id, unsigned index_num, unsig /** * \ingroup FCPL * - * \brief Retrieves number of shared object header message indexes in file + * \brief Retrieves the number of shared object header message indexes in file * creation property list * * \fcpl_id{plist_id} @@ -3237,8 +3281,8 @@ H5_DLL herr_t H5Pget_core_write_tracking(hid_t fapl_id, hbool_t *is_enabled, siz * * + * with no system buffering. This driver is POSIX-compliant and + * is the default file driver for all systems. * * * @@ -3276,8 +3320,9 @@ H5_DLL herr_t H5Pget_core_write_tracking(hid_t fapl_id, hbool_t *is_enabled, siz * * + * memory until the file is closed. At closing, the memory + * version of the file can be written back to disk or abandoned. + * * * * @@ -3286,7 +3331,8 @@ H5_DLL herr_t H5Pget_core_write_tracking(hid_t fapl_id, hbool_t *is_enabled, siz * + * systems that do not support files larger than 2 gigabytes. + * * * * @@ -3352,7 +3398,7 @@ H5_DLL hid_t H5Pget_driver(hid_t plist_id); * struct. Driver-specific versions of that struct are defined * for each low-level driver in the relevant source code file * H5FD*.c. For example, the struct used for the MULTI driver is - * #H5FD_multi_fapl_t defined in H5FDmulti.c. + * \c H5FD_multi_fapl_t defined in H5FDmulti.c. * * If no driver-specific properties have been registered, * H5Pget_driver_info() returns NULL. 
@@ -3369,8 +3415,50 @@ H5_DLL hid_t H5Pget_driver(hid_t plist_id); * */ H5_DLL const void *H5Pget_driver_info(hid_t plist_id); -H5_DLL herr_t H5Pget_elink_file_cache_size(hid_t plist_id, unsigned *efc_size); -H5_DLL herr_t H5Pget_evict_on_close(hid_t fapl_id, hbool_t *evict_on_close); +/** + * \ingroup FAPL + * + * \brief Retrieves the size of the external link open file cache + * + * \fapl_id{plist_id} + * \param[out] efc_size External link open file cache size in number of files + * + * \return \herr_t + * + * \details H5Pget_elink_file_cache_size() retrieves the number of files that + * can be held open in an external link open file cache. + * + * \since 1.8.7 + * + */ +H5_DLL herr_t H5Pget_elink_file_cache_size(hid_t plist_id, unsigned *efc_size); +/** + * \ingroup FAPL + * + * \brief Retrieves the file access property list setting that determines + * whether an HDF5 object will be evicted from the library's metadata + * cache when it is closed + * + * \fapl_id + * \param[out] evict_on_close Pointer to a variable that will indicate if + * the object will be evicted on close + * + * \return \herr_t + * + * \details The library's metadata cache is fairly conservative about holding on + * to HDF5 object metadata (object headers, chunk index structures, + * etc.), which can cause the cache size to grow, resulting in memory + * pressure on an application or system. When enabled, the "evict on + * close" property will cause all metadata for an object to be + * immediately evicted from the cache as long as it is not referenced + * by any other open object. + * + * See H5Pset_evict_on_close() for additional notes on behavior. + * + * \since 1.10.1 + * + */ +H5_DLL herr_t H5Pget_evict_on_close(hid_t fapl_id, hbool_t *evict_on_close); /** * \ingroup FAPL * @@ -3386,9 +3474,7 @@ H5_DLL herr_t H5Pget_evict_on_close(hid_t fapl_id, hbool_t *evict_on_close) * application can retrieve a file handle for low-level access to * a particular member of a family of files. 
The file handle is * retrieved with a separate call to H5Fget_vfd_handle() (or, - * in special circumstances, to H5FDget_vfd_handle()). - * - * \todo References the VFL documentation. + * in special circumstances, to H5FDget_vfd_handle(), see \ref VFL). * * \since 1.6.0 * @@ -3415,8 +3501,113 @@ H5_DLL herr_t H5Pget_family_offset(hid_t fapl_id, hsize_t *offset); * */ H5_DLL herr_t H5Pget_fclose_degree(hid_t fapl_id, H5F_close_degree_t *degree); +/** + * \ingroup FAPL + * + * \brief Retrieves a copy of the file image designated as the initial content + * and structure of a file + * + * \fapl_id + * \param[in,out] buf_ptr_ptr On input, \c NULL or a pointer to a + * pointer to a buffer that contains the + * file image.\n On successful return, if \p buf_ptr_ptr is not + * \c NULL, \Code{*buf_ptr_ptr} will contain a pointer to a copy + * of the initial image provided in the last call to + * H5Pset_file_image() for the supplied \p fapl_id. If no initial + * image has been set, \Code{*buf_ptr_ptr} will be \c NULL. + * \param[in,out] buf_len_ptr On input, \c NULL or a pointer to a buffer + * specifying the required size of the buffer to hold the file + * image.\n On successful return, if \p buf_len_ptr was not + * passed in as \c NULL, \p buf_len_ptr will return the required + * size in bytes of the buffer to hold the initial file image in + * the supplied file access property list, \p fapl_id. If no + * initial image is set, the value of \Code{*buf_len_ptr} will be + * set to 0 (zero) + * \return \herr_t + * + * \details H5Pget_file_image() allows an application to retrieve a copy of the + * file image designated for a VFD to use as the initial contents of a file. + * + * If file image callbacks are defined, H5Pget_file_image() will use + * them when allocating and loading the buffer to return to the + * application (see H5Pset_file_image_callbacks()). If file image + * callbacks are not defined, the function will use \c malloc and \c + * memcpy. 
When \c malloc and \c memcpy are used, it is the caller’s + * responsibility to discard the returned buffer with a call to \c + * free. + * + * It is the responsibility of the calling application to free the + * buffer whose address is returned in \p buf_ptr_ptr. This can be + * accomplished with \c free if file image callbacks have not been set + * (see H5Pset_file_image_callbacks()) or with the appropriate method + * if file image callbacks have been set. + * + * \see H5LTopen_file_image(), H5Fget_file_image(), H5Pset_file_image(), + * H5Pset_file_image_callbacks(), H5Pget_file_image_callbacks(), + * \ref H5FD_file_image_callbacks_t, \ref H5FD_file_image_op_t, + * + * HDF5 File Image Operations. + * + * + * \since 1.8.9 + * + */ H5_DLL herr_t H5Pget_file_image(hid_t fapl_id, void **buf_ptr_ptr, size_t *buf_len_ptr); +/** + * \ingroup FAPL + * + * \brief Retrieves callback routines for working with file images + * + * \fapl_id + * \param[in,out] callbacks_ptr Pointer to the instance of the + * #H5FD_file_image_callbacks_t struct in which the callback + * routines are to be returned\n + * Struct fields must be initialized to NULL before the call + * is made.\n + * Struct field contents upon return will match those passed in + * in the last H5Pset_file_image_callbacks() call for the file + * access property list \p fapl_id. + * \return \herr_t + * + * \details H5Pget_file_image_callbacks() retrieves the callback routines set for + * working with file images opened with the file access property list + * \p fapl_id. + * + * The callbacks must have been previously set with + * H5Pset_file_image_callbacks() in the file access property list. + * + * Upon the successful return of H5Pget_file_image_callbacks(), the + * fields in the instance of the #H5FD_file_image_callbacks_t struct + * pointed to by \p callbacks_ptr will contain the same values as were + * passed in the most recent H5Pset_file_image_callbacks() call for the + * file access property list \p fapl_id.
+ * + * \see H5LTopen_file_image(), H5Fget_file_image(), H5Pset_file_image(), + * H5Pset_file_image_callbacks(), H5Pget_file_image_callbacks(), + * \ref H5FD_file_image_callbacks_t, \ref H5FD_file_image_op_t, + * + * HDF5 File Image Operations. + * + * \since 1.8.9 + * + */ H5_DLL herr_t H5Pget_file_image_callbacks(hid_t fapl_id, H5FD_file_image_callbacks_t *callbacks_ptr); +/** + * \ingroup FAPL + * + * \brief Retrieves the file locking property values + * + * \fapl_id + * \param[out] use_file_locking File locking flag + * \param[out] ignore_when_disabled Ignore when disabled flag + * \return \herr_t + * + * \details H5Pget_file_locking() retrieves the file locking property values for + * the file access property list specified by \p fapl_id. + * + * \since 1.10.7 + * + */ H5_DLL herr_t H5Pget_file_locking(hid_t fapl_id, hbool_t *use_file_locking, hbool_t *ignore_when_disabled); /** * \ingroup FAPL @@ -3471,11 +3662,212 @@ H5_DLL herr_t H5Pget_gc_references(hid_t fapl_id, unsigned *gc_ref /*out*/); * */ H5_DLL herr_t H5Pget_libver_bounds(hid_t plist_id, H5F_libver_t *low, H5F_libver_t *high); +/** + * \ingroup FAPL + * + * \brief Get the current initial metadata cache configuration from the + * provided file access property list + * + * \fapl_id{plist_id} + * \param[in,out] config_ptr Pointer to the instance of #H5AC_cache_config_t + * in which the current metadata cache configuration is to be + * reported + * \return \herr_t + * + * \note The \c in direction applies only to the \ref H5AC_cache_config_t.version + * field. All other fields are \c out parameters. + * + * \details The fields of the #H5AC_cache_config_t structure are shown + * below: + * \snippet H5ACpublic.h H5AC_cache_config_t_snip + * \click4more + * + * H5Pget_mdc_config() gets the initial metadata cache configuration + * contained in a file access property list and loads it into the + * instance of #H5AC_cache_config_t pointed to by the \p config_ptr + * parameter. 
This configuration is used when the file is opened. + * + * Note that the version field of \Code{*config_ptr} must be + * initialized; this allows the library to support earlier versions of + * the #H5AC_cache_config_t structure. + * + * See the overview of the metadata cache in the special topics section + * of the user guide for details on the configuration data returned. If + * you haven't read and understood that documentation, the results of + * this call will not make much sense. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pget_mdc_config(hid_t plist_id, H5AC_cache_config_t *config_ptr); /* out */ +/** + * \ingroup FAPL + * + * \brief Retrieves the metadata cache image configuration values for a file + * access property list + * + * \fapl_id{plist_id} + * \param[out] config_ptr Pointer to metadata cache image configuration values + * \return \herr_t + * + * \details H5Pget_mdc_image_config() retrieves the metadata cache image values + * into \p config_ptr for the file access property list specified in \p + * plist_id. + * + * #H5AC_cache_image_config_t is defined as follows: + * \snippet H5ACpublic.h H5AC_cache_image_config_t_snip + * \click4more + * + * \since 1.10.1 + */ H5_DLL herr_t H5Pget_mdc_image_config(hid_t plist_id, H5AC_cache_image_config_t *config_ptr /*out*/); +/** + * \ingroup FAPL + * + * \brief Gets metadata cache logging options + * + * \fapl_id{plist_id} + * \param[out] is_enabled Flag whether logging is enabled + * \param[out] location Location of log in UTF-8/ASCII (file path/name) (On + * Windows, this must be ASCII) + * \param[out] location_size Size in bytes of the location string + * \param[out] start_on_access Whether the logging begins as soon as the file is + * opened or created + * \return \herr_t + * + * \details The metadata cache is a central part of the HDF5 library through + * which all file metadata reads and writes take place. 
File metadata + * is normally invisible to the user and is used by the library for + * purposes such as locating and indexing data. File metadata should + * not be confused with user metadata, which consists of attributes + * created by users and attached to HDF5 objects such as datasets via + * \ref H5A API calls. + * + * Due to the complexity of the cache, a trace/logging feature has been + * created that can be used by HDF5 developers for debugging and + * performance analysis. The functions that control this functionality + * will normally be of use to a very limited number of developers + * outside of The HDF Group. The functions have been documented to help + * users create logs that can be sent with bug reports. + * + * Control of the log functionality is straightforward. Logging is + * enabled via the H5Pset_mdc_log_options() function, which will modify + * the file access property list used to open or create a file. This + * function has a flag that determines whether logging begins at file + * open or starts in a paused state. Log messages can then be + * controlled via the H5Fstart_mdc_logging() / H5Fstop_mdc_logging() + * functions. H5Pget_mdc_log_options() can be used to examine a file + * access property list, and H5Fget_mdc_logging_status() will return + * the current state of the logging flags. + * + * The log format is described in the + * Metadata Cache Logging document. + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pget_mdc_log_options(hid_t plist_id, hbool_t *is_enabled, char *location, size_t *location_size, hbool_t *start_on_access); -H5_DLL herr_t H5Pget_meta_block_size(hid_t fapl_id, hsize_t *size /*out*/); +/** + * \ingroup FAPL + * + * \brief Returns the current metadata block size setting + * + * \fapl_id{fapl_id} + * \param[out] size Minimum size, in bytes, of metadata block allocations + * + * \return \herr_t + * + * \details Returns the current minimum size, in bytes, of new + * metadata block allocations. 
This setting is retrieved from the + * file access property list \p fapl_id. + * + * This value is set by H5Pset_meta_block_size() and is + * retrieved from the file access property list \p fapl_id. + * + * \since 1.4.0 + */ +H5_DLL herr_t H5Pget_meta_block_size(hid_t fapl_id, hsize_t *size); +/** + * \ingroup FAPL + * + * \brief Retrieves the number of read attempts from a file access + * property list + * + * \fapl_id{plist_id} + * \param[out] attempts The number of read attempts + * + * \return \herr_t + * + * \details H5Pget_metadata_read_attempts() retrieves the number of read + * attempts that is set in the file access property list \p plist_id. + * + * For a default file access property list, the value retrieved + * will depend on whether the user sets the number of attempts via + * H5Pset_metadata_read_attempts(): + * + *
      + * + *
    • If the number of attempts is set to N, the value + * returned will be N. + *
    • If the number of attempts is not set, the value returned + * will be the default for non-SWMR access (1). SWMR is short + * for single-writer/multiple-reader. + *
    + * + * For the file access property list of a specified HDF5 file, + * the value retrieved will depend on how the file is opened + * and whether the user sets the number of read attempts via + * H5Pset_metadata_read_attempts(): + * + *
      + *
    • For a file opened with SWMR access: + * + *
        + *
      • If the number of attempts is set to N, the value + * returned will be N. + *
      • If the number of attempts is not set, the value + * returned will be the default for SWMR access (100). + *
      + *
    • For a file opened without SWMR access, the value + * retrieved will always be the default for non-SWMR access + * (1). The value set via H5Pset_metadata_read_attempts() does + * not have any effect on non-SWMR access. + *
    + * + * \par Failure Modes + * \parblock + * + * When the input property list is not a file access property list. + * + * When the library is unable to retrieve the number of read attempts from + * the file access property list. + * + * \endparblock + * + * \par Examples + * \parblock + * + * The first example illustrates the two cases for retrieving the number + * of read attempts from a default file access property list. + * + * \include H5Pget_metadata_read_attempts.1.c + * + * The second example illustrates the two cases for retrieving the + * number of read attempts from the file access property list of a file + * opened with SWMR access. + * + * \include H5Pget_metadata_read_attempts.2.c + * + * The third example illustrates the two cases for retrieving the number + * of read attempts from the file access property list of a file opened + * with non-SWMR access. + * + * \include H5Pget_metadata_read_attempts.3.c + * + * \endparblock + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pget_metadata_read_attempts(hid_t plist_id, unsigned *attempts); /** * \ingroup FAPL @@ -3510,10 +3902,102 @@ H5_DLL herr_t H5Pget_metadata_read_attempts(hid_t plist_id, unsigned *attempts); * */ H5_DLL herr_t H5Pget_multi_type(hid_t fapl_id, H5FD_mem_t *type); +/** + * \ingroup FAPL + * + * \brief Retrieves the object flush property values from the file access property list + * + * \fapl_id{plist_id} + * \param[out] func The user-defined callback function + * \param[out] udata The user-defined input data for the callback function + * + * \return \herr_t + * + * \details H5Pget_object_flush_cb() gets the user-defined callback + * function that is set in the file access property list + * \p plist_id and stored in the parameter \p func. The callback is + * invoked whenever an object flush occurs in the file. This + * routine also obtains the user-defined input data that is + * passed along to the callback function in the parameter + * \p udata.
+ * + * \par Example + * \parblock + * The example below illustrates the usage of this routine to obtain the + * object flush property values. + * + * \include H5Pget_object_flush_cb.c + * \endparblock + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pget_object_flush_cb(hid_t plist_id, H5F_flush_cb_t *func, void **udata); -H5_DLL herr_t H5Pget_page_buffer_size(hid_t plist_id, size_t *buf_size, unsigned *min_meta_per, - unsigned *min_raw_per); +/** + * \ingroup FAPL + * + * \brief Retrieves the maximum size for the page buffer and the minimum + percentage for metadata and raw data pages + * + * \fapl_id{plist_id} + * \param[out] buf_size Maximum size, in bytes, of the page buffer + * \param[out] min_meta_perc Minimum metadata percentage to keep in the + * page buffer before allowing pages containing metadata to + * be evicted + * + * \param[out] min_raw_perc Minimum raw data percentage to keep in the + * page buffer before allowing pages containing raw data to + * be evicted + * + * \return \herr_t + * + * \details H5Pget_page_buffer_size() retrieves \p buf_size, the maximum + * size in bytes of the page buffer, \p min_meta_perc, the + * minimum metadata percentage, and \p min_raw_perc, the + * minimum raw data percentage. + * + * \since 1.10.1 + */ +H5_DLL herr_t H5Pget_page_buffer_size(hid_t plist_id, size_t *buf_size, unsigned *min_meta_perc, + unsigned *min_raw_perc); +/** + * \ingroup FAPL + * + * \brief Returns maximum data sieve buffer size + * + * \fapl_id{fapl_id} + * \param[out] size Maximum size, in bytes, of data sieve buffer + * + * \return \herr_t + * + * \details H5Pget_sieve_buf_size() retrieves \p size, the current maximum + * size of the data sieve buffer. + * + * This value is set by H5Pset_sieve_buf_size() and is retrieved + * from the file access property list \p fapl_id.
+ * + * \version 1.6.0 The \p size parameter has changed from type \c hsize_t + * to \c size_t + * \since 1.4.0 + */ H5_DLL herr_t H5Pget_sieve_buf_size(hid_t fapl_id, size_t *size /*out*/); +/** + * \ingroup FAPL + * + * \brief Retrieves the current small data block size setting + * + * \fapl_id{fapl_id} + * \param[out] size Maximum size, in bytes, of the small data block + * + * \result \herr_t + * + * \details H5Pget_small_data_block_size() retrieves the current setting + * for the size of the small data block. + * + * If the returned value is zero (0), the small data block + * mechanism has been disabled for the file. + * + * \since 1.4.4 + */ H5_DLL herr_t H5Pget_small_data_block_size(hid_t fapl_id, hsize_t *size /*out*/); /** * \ingroup FAPL @@ -3684,6 +4168,64 @@ H5_DLL herr_t H5Pset_alignment(hid_t fapl_id, hsize_t threshold, hsize_t alignme */ H5_DLL herr_t H5Pset_cache(hid_t plist_id, int mdc_nelmts, size_t rdcc_nslots, size_t rdcc_nbytes, double rdcc_w0); +/** + * \ingroup FAPL + * + * \brief Sets write tracking information for core driver, #H5FD_CORE + * + * \fapl_id{fapl_id} + * \param[in] is_enabled Boolean value specifying whether feature is + enabled + * \param[in] page_size Positive integer specifying size, in bytes, of + * write aggregation pages Value of 1 (one) enables + * tracking with no paging. + * + * \return \herr_t + * + * \details When a file is created or opened for writing using the core + * virtual file driver (VFD) with the backing store option + * turned on, the core driver can be configured to track + * changes to the file and write out only the modified bytes. + * + * This write tracking feature is enabled and disabled with \p + * is_enabled. The default setting is that write tracking is + * disabled, or off. + * + * To avoid a large number of small writes, changes can + * be aggregated into pages of a user-specified size, \p + * page_size. + * + * Setting \p page_size to 1 enables tracking with no page + * aggregation. 
+ * + * The backing store option is set via the function + * H5Pset_fapl_core(). + * + * \attention + * \parblock + * This function is only for use with the core VFD and must + * be used after the call to H5Pset_fapl_core(). It is an error + * to use this function with any other VFD. + * + * It is an error to use this function when the backing store + * flag has not been set using H5Pset_fapl_core(). + * + * This function only applies to the backing store write + * operation, which typically occurs when the file is flushed + * or closed. This function has no relationship to the + * increment parameter passed to H5Pset_fapl_core(). + * + * For optimum performance, the \p page_size parameter should be + * a power of two. + * + * It is an error to set the page size to 0. + * \endparblock + * + * \version 1.8.14 C function modified in this release to return error + * if \p page_size is set to 0 (zero). + * \since 1.8.13 + * + */ H5_DLL herr_t H5Pset_core_write_tracking(hid_t fapl_id, hbool_t is_enabled, size_t page_size); /** * \ingroup FAPL @@ -3712,13 +4254,146 @@ H5_DLL herr_t H5Pset_core_write_tracking(hid_t fapl_id, hbool_t is_enabled, size * */ H5_DLL herr_t H5Pset_driver(hid_t plist_id, hid_t driver_id, const void *driver_info); -H5_DLL herr_t H5Pset_elink_file_cache_size(hid_t plist_id, unsigned efc_size); -H5_DLL herr_t H5Pset_evict_on_close(hid_t fapl_id, hbool_t evict_on_close); -H5_DLL herr_t H5Pset_family_offset(hid_t fapl_id, hsize_t offset); /** * \ingroup FAPL * - * \brief Sets the file close degree + * \brief Sets the number of files that can be held open in an external + * link open file cache + * + * \par Motivation + * \parblock + * The external link open file cache holds files open after + * they have been accessed via an external link.
This cache reduces + * the number of times such files are opened when external links are + * accessed repeatedly and can significantly improve performance in + * certain heavy-use situations and when low-level file opens or closes + * are expensive. + * + * H5Pset_elink_file_cache_size() sets the number of files + * that will be held open in an external link open file + * cache. H5Pget_elink_file_cache_size() retrieves the size of an existing + * cache; and H5Fclear_elink_file_cache() clears an existing cache without + * closing it. + * \endparblock + * + * \fapl_id{plist_id} + * \param[in] efc_size External link open file cache size in number of files + * Default setting is 0 (zero). + * + * \return \herr_t + * + * \details H5Pset_elink_file_cache_size() specifies the number of files + * that will be held open in an external link open file cache. + * + * The default external link open file cache size is 0 (zero), + * meaning that files accessed via an external link are not + * held open. Setting the cache size to a positive integer + * turns on the cache; setting the size back to zero turns it + * off. + * + * With this property set, files are placed in the external + * link open file cache when they are opened via an + * external link. Files are then held open until either + * they are evicted from the cache or the parent file is + * closed. This property setting can improve performance when + * external links are repeatedly accessed. + * + * When the cache is full, files will be evicted using a least + * recently used (LRU) scheme; the file which has gone the + * longest time without being accessed through the parent file + * will be evicted and closed if nothing else is holding that + * file open. + * + * Files opened through external links inherit the parent + * file’s file access property list by default, and therefore + * inherit the parent file’s external link open file cache + * setting.
+ * + * When child files contain external links of their own, the + * caches can form a graph of cached external files. Closing + * the last external reference to such a graph will recursively + * close all files in the graph, even if cycles are present. + * \par Example + * \parblock + * The following code sets up an external link open file cache that will + * hold open up to 8 files reached through external links: + * + * \code + * status = H5Pset_elink_file_cache_size(fapl_id, 8); + * \endcode + * \endparblock + * + * \since 1.8.7 + */ +H5_DLL herr_t H5Pset_elink_file_cache_size(hid_t plist_id, unsigned efc_size); +/** + * \ingroup FAPL + * + * \brief Controls the library's behavior of evicting metadata associated with + * a closed object + * + * \fapl_id + * \param[in] evict_on_close Whether the HDF5 object should be evicted on close + * + * \return \herr_t + * + * \details The library's metadata cache is fairly conservative about holding + * on to HDF5 object metadata (object headers, chunk index structures, + * etc.), which can cause the cache size to grow, resulting in memory + * pressure on an application or system. When enabled, the "evict on + * close" property will cause all metadata for an object to be evicted + * from the cache as long as that metadata is not referenced by any other + * open object. + * + * This function only applies to file access property lists. + * + * The default library behavior is to not evict on object or file + * close. + * + * When applied to a file access property list, any subsequently opened + * object will inherit the "evict on close" property and will have + * its metadata evicted when the object is closed.
+ * + * \since 1.10.1 + * + */ +H5_DLL herr_t H5Pset_evict_on_close(hid_t fapl_id, hbool_t evict_on_close); +/** + * \ingroup FAPL + * + * \brief Sets offset property for low-level access to a file in a family of + * files + * + * \fapl_id + * \param[in] offset Offset in bytes within the HDF5 file + * + * \return \herr_t + * + * \details H5Pset_family_offset() sets the offset property in the file access + * property list \p fapl_id so that the user application can + * retrieve a file handle for low-level access to a particular member + * of a family of files. The file handle is retrieved with a separate + * call to H5Fget_vfd_handle() (or, in special circumstances, to + * H5FDget_vfd_handle(); see \ref VFL). + * + * The value of \p offset is an offset in bytes from the beginning of + * the HDF5 file, identifying a user-determined location within the + * HDF5 file. + * The file handle the user application is seeking is for the specific + * member-file in the associated family of files to which this offset + * is mapped. + * + * Use of this function is only appropriate for an HDF5 file written as + * a family of files with the \c FAMILY file driver. 
+ * + * \since 1.6.0 + * + */ +H5_DLL herr_t H5Pset_family_offset(hid_t fapl_id, hsize_t offset); +/** + * \ingroup FAPL + * + * \brief Sets the file close degree * * \fapl_id * \param[in] degree Pointer to a location containing the file close @@ -3776,9 +4451,279 @@ H5_DLL herr_t H5Pset_family_offset(hid_t fapl_id, hsize_t offset); * */ H5_DLL herr_t H5Pset_fclose_degree(hid_t fapl_id, H5F_close_degree_t degree); +/** + * \ingroup FAPL + * + * \brief Sets an initial file image in a memory buffer + * + * \fapl_id + * \param[in] buf_ptr Pointer to the initial file image, or + * NULL if no initial file image is desired + * \param[in] buf_len Size of the supplied buffer, or + * 0 (zero) if no initial image is desired + * + * \return \herr_t + * + * \details H5Pset_file_image() allows an application to provide a file image + * to be used as the initial contents of a file. + * Calling H5Pset_file_image() makes a copy of the buffer specified in + * \p buf_ptr of size \p buf_len. + * + * \par Motivation: + * H5Pset_file_image() and other elements of HDF5 are + * used to load an image of an HDF5 file into system memory and open + * that image as a regular HDF5 file. An application can then use the + * file without the overhead of disk I/O. + * + * \par Recommended Reading: + * This function is part of the file image + * operations feature set. It is highly recommended to study the guide + * [HDF5 File Image Operations] + * (https://portal.hdfgroup.org/display/HDF5/HDF5+File+Image+Operations + * ) before using this feature set. See the “See Also” section below + * for links to other elements of HDF5 file image operations.
+ * + * \see + * \li H5LTopen_file_image() + * \li H5Fget_file_image() + * \li H5Pget_file_image() + * \li H5Pset_file_image_callbacks() + * \li H5Pget_file_image_callbacks() + * + * \li [HDF5 File Image Operations] + * (https://portal.hdfgroup.org/display/HDF5/HDF5+File+Image+Operations) + * in [Advanced Topics in HDF5] + * (https://portal.hdfgroup.org/display/HDF5/Advanced+Topics+in+HDF5) + * + * \li Within H5Pset_file_image_callbacks(): + * \li Callback #H5FD_file_image_callbacks_t + * \li Callback #H5FD_file_image_op_t + * + * \version 1.8.13 Fortran subroutine added in this release. + * \since 1.8.9 + * + */ H5_DLL herr_t H5Pset_file_image(hid_t fapl_id, void *buf_ptr, size_t buf_len); +/** + * \ingroup FAPL + * + * \brief Sets the callbacks for working with file images + * + * \note **Motivation:** H5Pset_file_image_callbacks() and other elements + * of HDF5 are used to load an image of an HDF5 file into system + * memory and open that image as a regular HDF5 file. An application + * can then use the file without the overhead of disk I/O.\n + * **Recommended Reading:** This function is part of the file + * image operations feature set. It is highly recommended to study + * the guide [HDF5 File Image Operations] + * (https://portal.hdfgroup.org/display/HDF5/HDF5+File+Image+Operations + * ) before using this feature set. See the “See Also” section below + * for links to other elements of HDF5 file image operations. + * + * \fapl_id + * \param[in,out] callbacks_ptr Pointer to the instance of the + * #H5FD_file_image_callbacks_t structure + * + * \return \herr_t \n + * **Failure Modes**: Due to interactions between this function and + * H5Pset_file_image() and H5Pget_file_image(), + * H5Pset_file_image_callbacks() will fail if a file image has + * already been set in the target file access property list, \p fapl_id. + * + * \details H5Pset_file_image_callbacks() sets callback functions for working + * with file images in memory. 
+ * + * H5Pset_file_image_callbacks() allows an application to control the + * management of file image buffers through user-defined callbacks. + * These callbacks can be used in the management of file image buffers + * in property lists and with certain file drivers. + * + * H5Pset_file_image_callbacks() must be used before any file image has + * been set in the file access property list. Once a file image has + * been set, the function will fail. + * + * The callback routines set up by H5Pset_file_image_callbacks() are + * invoked when a new file image buffer is allocated, when an existing + * file image buffer is copied or resized, or when a file image buffer + * is released from use. + * + * Some file drivers allow the use of user-defined callback functions + * for allocating, freeing, and copying the driver’s internal buffer, + * potentially allowing optimizations such as avoiding large \c malloc + * and \c memcpy operations, or enabling detailed logging. + * + * From the perspective of the HDF5 library, the operations of the + * \ref H5FD_file_image_callbacks_t.image_malloc "image_malloc", + * \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy", + * \ref H5FD_file_image_callbacks_t.image_realloc "image_realloc", and + * \ref H5FD_file_image_callbacks_t.image_free "image_free" callbacks + * must be identical to those of the + * corresponding C standard library calls (\c malloc, \c memcpy, + * \c realloc, and \c free). While the operations must be identical, + * the file image callbacks have more parameters. The return values + * of \ref H5FD_file_image_callbacks_t.image_malloc "image_malloc" and + * \ref H5FD_file_image_callbacks_t.image_realloc "image_realloc" are identical to + * the return values of \c malloc and \c realloc.
The return values of + * \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy" and + * \ref H5FD_file_image_callbacks_t.image_free "image_free" differ from the return + * values of \c memcpy and \c free in that the return values of + * \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy" and + * \ref H5FD_file_image_callbacks_t.image_free "image_free" can also indicate failure. + * + * The callbacks and their parameters, along with a struct and + * an \c ENUM required for their use, are described below. + * + * Callback struct and \c ENUM: + * + * The callback functions set up by H5Pset_file_image_callbacks() use + * a struct and an \c ENUM that are defined as follows. + * + * The struct #H5FD_file_image_callbacks_t serves as a container + * for the callback functions and a pointer to user-supplied data. + * The struct is defined as follows: + * \snippet H5FDpublic.h H5FD_file_image_callbacks_t_snip + * + * Elements of the #H5FD_file_image_op_t are used by the + * callbacks to invoke certain operations on file images. The \c ENUM is + * defined as follows: + * \snippet H5FDpublic.h H5FD_file_image_op_t_snip + * + * The elements of the #H5FD_file_image_op_t are used in the following + * callbacks: + * + * - The \ref H5FD_file_image_callbacks_t.image_malloc "image_malloc" callback + * contains a pointer to a function that must appear to HDF5 to have + * functionality identical to that of the standard C library \c malloc() call. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h image_malloc_snip + * \n + * - The \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy" + * callback contains a pointer to a function + * that must appear to HDF5 to have functionality identical to that + * of the standard C library \c memcpy() call, except that it returns + * \c NULL on failure. (The \c memcpy C library routine is defined + * to return the \p dest parameter in all cases.)
+ * + * - Setting \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy" + * to \c NULL indicates that HDF5 should invoke + * the standard C library \c memcpy() routine when copying buffers. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h image_memcpy_snip + * \n + * - The \ref H5FD_file_image_callbacks_t.image_realloc "image_realloc" callback + * contains a pointer to a function that must appear to HDF5 to have + * functionality identical to that of the standard C library \c realloc() call. + * + * - Setting \ref H5FD_file_image_callbacks_t.image_realloc "image_realloc" + * to \p NULL indicates that HDF5 should + * invoke the standard C library \c realloc() routine when resizing + * file image buffers. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h image_realloc_snip + * \n + * - The \ref H5FD_file_image_callbacks_t.image_free "image_free" callback contains + * a pointer to a function that must appear to HDF5 to have functionality + * identical to that of the standard C library \c free() call, except + * that it will return \c 0 (\c SUCCEED) on success and \c -1 (\c FAIL) on failure. + * + * - Setting \ref H5FD_file_image_callbacks_t.image_free "image_free" + * to \c NULL indicates that HDF5 should invoke + * the standard C library \c free() routine when releasing file image + * buffers. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h image_free_snip + * \n + * - The \ref H5FD_file_image_callbacks_t.udata_copy "udata_copy" + * callback contains a pointer to a function + * that, from the perspective of HDF5, allocates a buffer of suitable + * size, copies the contents of the supplied \p udata into the new + * buffer, and returns the address of the new buffer. The function + * returns NULL on failure. This function is necessary if a non-NULL + * \p udata parameter is supplied, so that property lists containing + * the image callbacks can be copied. 
If the \p udata parameter below + * is \c NULL, then this parameter should be \c NULL as well. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h udata_copy_snip + * \n + * - The \ref H5FD_file_image_callbacks_t.udata_free "udata_free" + * callback contains a pointer to a function + * that, from the perspective of HDF5, frees a user data block. This + * function is necessary if a non-NULL udata parameter is supplied so + * that property lists containing image callbacks can be discarded + * without a memory leak. If the udata parameter below is \c NULL, + * this parameter should be \c NULL as well. + * + * - Signature in #H5FD_file_image_callbacks_t: + * \snippet H5FDpublic.h udata_free_snip + * + * - \p **udata**, the final field in the #H5FD_file_image_callbacks_t + * struct, provides a pointer to user-defined data. This pointer will + * be passed to the + * \ref H5FD_file_image_callbacks_t.image_malloc "image_malloc", + * \ref H5FD_file_image_callbacks_t.image_memcpy "image_memcpy", + * \ref H5FD_file_image_callbacks_t.image_realloc "image_realloc", and + * \ref H5FD_file_image_callbacks_t.image_free "image_free" callbacks. + * Define udata as \c NULL if no user-defined data is provided. + * + * \since 1.8.9 + * + */ H5_DLL herr_t H5Pset_file_image_callbacks(hid_t fapl_id, H5FD_file_image_callbacks_t *callbacks_ptr); +/** + * \ingroup FAPL + * + * \brief Sets the file locking property values + * + * \fapl_id + * \param[in] use_file_locking Toggle to specify file locking (or not) + * \param[in] ignore_when_disabled Toggle to ignore when disabled (or not) + * + * \return \herr_t + * + * \details H5Pset_file_locking() overrides the default file locking flag + * setting that was set when the library was configured. + * + * This setting can be overridden by the \c HDF5_USE_FILE_LOCKING + * environment variable. + * + * File locking is used when creating/opening a file to prevent + * problematic file accesses. 
+ * + * \since 1.10.7 + * + */ H5_DLL herr_t H5Pset_file_locking(hid_t fapl_id, hbool_t use_file_locking, hbool_t ignore_when_disabled); +/** + * \ingroup FAPL + * + * \brief Sets garbage collecting references flag + * + * \fapl_id + * \param[in] gc_ref Flag setting reference garbage collection to on (1) or off (0) + * + * \return \herr_t + * + * \details H5Pset_gc_references() sets the flag for garbage collecting + * references for the file. + * + * Dataset region references and other reference types use space in an + * HDF5 file's global heap. If garbage collection is on and the user + * passes in an uninitialized value in a reference structure, the heap + * might get corrupted. When garbage collection is off, however, and + * the user re-uses a reference, the previous heap block will be + * orphaned and not returned to the free heap space. + * + * When garbage collection is on, the user must initialize the + * reference structures to 0 or risk heap corruption. + * + * The default value for garbage collecting references is off. + * + */ H5_DLL herr_t H5Pset_gc_references(hid_t fapl_id, unsigned gc_ref); /** * \ingroup FAPL @@ -3861,10 +4806,11 @@ H5_DLL herr_t H5Pset_gc_references(hid_t fapl_id, unsigned gc_ref); * * * @@ -3872,14 +4818,15 @@ H5_DLL herr_t H5Pset_gc_references(hid_t fapl_id, unsigned gc_ref); * * * @@ -3888,39 +4835,356 @@ H5_DLL herr_t H5Pset_gc_references(hid_t fapl_id, unsigned gc_ref); * \p high=#H5F_LIBVER_V110 * * * *
    \ref hid_t \c prop_id IN: The identifier of the property list being closed
    \Code{const char * name} IN: The name of the property in the list
    #H5FD_SEC2 This driver uses POSIX file-system functions like read and * write to perform I/O to a single, permanent file on local disk - * with no system buffering. This driver is POSIX-compliant and is - * the default file driver for all systems. H5Pset_fapl_sec2()
    #H5FD_CORE With this driver, an application can work with a file in * memory for faster reads and writes. File contents are kept in - * memory until the file is closed. At closing, the memory version - * of the file can be written back to disk or abandoned. H5Pset_fapl_core()
    With this driver, the HDF5 file’s address space is partitioned * into pieces and sent to separate storage files using an * underlying driver of the user’s choice. This driver is for - * systems that do not support files larger than 2 gigabytes. H5Pset_fapl_family()
    \p low=#H5F_LIBVER_V18
    * \p high=#H5F_LIBVER_V18
    - * \li The library will create objects with the latest format versions - * available to library release 1.8.x. + * \li The library will create objects with the latest format + * versions available to library release 1.8.x. * \li API calls that create objects or features that are available - * to versions of the library greater than 1.8.x release will fail. + * to versions of the library greater than 1.8.x release will + * fail. * \li Earlier versions of the library may not be able to access * objects created with this setting.
    \p low=#H5F_LIBVER_V18
    * \p high=#H5F_LIBVER_V110
    - * \li The library will create objects with the latest format versions - * available to library release 1.8.x. + * \li The library will create objects with the latest format + * versions available to library release 1.8.x. * \li The library will allow objects to be created with the latest * format versions available to library release 1.10.x. - * Since 1.10.x is also #H5F_LIBVER_LATEST, there is no upper limit - * on the format versions to use. For example, if a newer format - * version is required to support a feature e.g. virtual dataset, - * this setting will allow the object to be created. + * Since 1.10.x is also #H5F_LIBVER_LATEST, there is no upper + * limit on the format versions to use. For example, if a + * newer format version is required to support a feature e.g. + * virtual dataset, this setting will allow the object to be + * created. * \li Earlier versions of the library may not be able to access * objects created with this setting.
    - * \li The library will create objects with the latest format versions - * available to library release 1.10.x. + * \li The library will create objects with the latest format + * versions available to library release 1.10.x. * \li The library will allow objects to be created with the latest * format versions available to library release 1.10.x. - * Since 1.10.x is also #H5F_LIBVER_LATEST, there is no upper limit - * on the format versions to use. For example, if a newer format - * version is required to support a feature e.g. virtual dataset, - * this setting will allow the object to be created. + * Since 1.10.x is also #H5F_LIBVER_LATEST, there is no upper + * limit on the format versions to use. For example, if a + * newer format version is required to support a feature e.g. + * virtual dataset, this setting will allow the object to be + * created. * \li This setting allows users to take advantage of the latest - * features and performance enhancements in the library. However, - * objects written with this setting may be accessible to a smaller - * range of library versions than would be the case if low is set - * to #H5F_LIBVER_EARLIEST. - * \li Earlier versions of the library may not be able to access objects created with this + * features and performance enhancements in the library. + * However, objects written with this setting may be + * accessible to a smaller range of library versions than + * would be the case if low is set to #H5F_LIBVER_EARLIEST. + * \li Earlier versions of the library may not be able to access + * objects created with this * setting. *
    * - * \version 1.10.2 #H5F_LIBVER_V18 added to the enumerated defines in #H5F_libver_t. + * \version 1.10.2 #H5F_LIBVER_V18 added to the enumerated defines in + * #H5F_libver_t. * * \since 1.8.0 * */ H5_DLL herr_t H5Pset_libver_bounds(hid_t plist_id, H5F_libver_t low, H5F_libver_t high); +/** + * \ingroup FAPL + * + * \brief Set the initial metadata cache configuration in the indicated File + * Access Property List to the supplied value + * + * \fapl_id{plist_id} + * \param[in] config_ptr Pointer to the instance of \p H5AC_cache_config_t + * containing the desired configuration + * \return \herr_t + * + * \details The fields of the #H5AC_cache_config_t structure are shown + * below: + * \snippet H5ACpublic.h H5AC_cache_config_t_snip + * \click4more + * + * \details H5Pset_mdc_config() attempts to set the initial metadata cache + * configuration to the supplied value. It will fail if an invalid + * configuration is detected. This configuration is used when the file + * is opened. + * + * See the overview of the metadata cache in the special topics section + * of the user manual for details on what is being configured. If you + * have not read and understood that documentation, you really should + * not be using this API call. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pset_mdc_config(hid_t plist_id, H5AC_cache_config_t *config_ptr); +/** + * \ingroup FAPL + * + * \brief Sets metadata cache logging options + * + * \fapl_id{plist_id} + * \param[in] is_enabled Whether logging is enabled + * \param[in] location Location of log in UTF-8/ASCII (file path/name) + * (On Windows, this must be ASCII) + * \param[in] start_on_access Whether the logging will begin as soon as the + * file is opened or created + * + * \return \herr_t + * + * \details The metadata cache is a central part of the HDF5 library through + * which all file metadata reads and writes take place. 
File metadata + * is normally invisible to the user and is used by the library for + * purposes such as locating and indexing data. File metadata should + * not be confused with user metadata, which consists of attributes + * created by users and attached to HDF5 objects such as datasets via + * H5A API calls. + * + * Due to the complexity of the cache, a trace/logging feature has + * been created that can be used by HDF5 developers for debugging and + * performance analysis. The functions that control this functionality + * will normally be of use to a very limited number of developers + * outside of The HDF Group. The functions have been documented to + * help users create logs that can be sent with bug reports. + * + * Control of the log functionality is straightforward. Logging is + * enabled via the H5Pset_mdc_log_options() function, + * which modifies the file access property list used to open or + * create a file. This function has a flag that determines whether + * logging begins at file open or starts in a paused state. Logging + * can then be started and stopped via the H5Fstart_mdc_logging() + * and H5Fstop_mdc_logging() functions. + * + * H5Pget_mdc_log_options() can be used to examine a file access + * property list, and H5Fget_mdc_logging_status() will return the + * current state of the logging flags. + * + * The log format is described in [Metadata Cache Logging] + * (https://portal.hdfgroup.org/display/HDF5/Fine-tuning+the+Metadata+Cache).
+ * + * \since 1.10.0 + * + */ H5_DLL herr_t H5Pset_mdc_log_options(hid_t plist_id, hbool_t is_enabled, const char *location, hbool_t start_on_access); +/** + * \ingroup FAPL + * + * \brief Sets the minimum metadata block size + * + * \fapl_id{fapl_id} + * \param[in] size Minimum size, in bytes, of metadata block allocations + * + * \return \herr_t + * + * \details H5Pset_meta_block_size() sets the minimum size, in bytes, of + * metadata block allocations when #H5FD_FEAT_AGGREGATE_METADATA is set by a VFL + * driver. + + * Each raw metadata block is initially allocated to be of the given size. + * Specific metadata objects (e.g., object headers, local heaps, B-trees) are then + * sub-allocated from this block. + * + * The default setting is 2048 bytes, meaning that the library will + * attempt to aggregate metadata in at least 2K blocks in the file. + * Setting the value to zero (\Code{0}) with this function will turn + * off metadata aggregation, even if the VFL driver attempts to use the + * metadata aggregation strategy. + * + * Metadata aggregation reduces the number of small data objects in the file that + * would otherwise be required for metadata. The aggregated block of metadata is + * usually written in a single write action and always in a contiguous block, + * potentially significantly improving library and application performance. + * + * \since 1.4.0 + */ H5_DLL herr_t H5Pset_meta_block_size(hid_t fapl_id, hsize_t size); +/** + * \ingroup FAPL + * + * \brief Sets the number of read attempts in a file access property list + * + * \fapl_id{plist_id} + * \param[in] attempts The number of read attempts. Must be a value greater than \Code{0} + * + * \return \herr_t + * + * \return Failure Modes: + * - When the user sets the number of read attempts to \Code{0}. + * - When the input property list is not a file access property list. + * - When the library is unable to set the number of read attempts in the file access property list. 
+ * + * \details H5Pset_metadata_read_attempts() sets the number of reads that the + * library will try when reading checksummed metadata in an HDF5 file opened + * with SWMR access. When reading such metadata, the library will compare the + * checksum computed for the metadata just read with the checksum stored within + * the piece of metadata. When performing SWMR operations on a file, the + * checksum check might fail when the library reads data on a system where + * reads and writes are not atomic. To remedy such situations, the library will + * repeatedly read the piece of metadata until the check passes, or fail the read + * when the allowed number of attempts is reached. + * + * The number of read attempts used by the library will depend on how the file is + * opened and whether the user sets the number of read attempts via this routine: + + * - For a file opened with SWMR access: + * - If the user sets the number of attempts to \Code{N}, the library will use \Code{N}. + * - If the user does not set the number of attempts, the library will use the + * default for SWMR access (\Code{100}). + * - For a file opened with non-SWMR access, the library will always use the default + * for non-SWMR access (\Code{1}). The value set via this routine does not have any effect + * during non-SWMR access. + * + * \b Example: The first example illustrates setting the number of read attempts for a file + * opened with SWMR access. + * + * \snippet H5Pset_metadata_read_attempts.c SWMR Access + * + * \b Example: The second example illustrates setting the number of + * read attempts for a file opened with non-SWMR access. The value + * set in the file access property list does not have any effect.
+ * + * \snippet H5Pset_metadata_read_attempts.c non-SWMR Access + * + * \note \b Motivation: On a system where reads and writes are not atomic, the library might + * read inconsistent checksummed metadata when performing + * single-writer/multiple-reader (SWMR) operations for an HDF5 file. Upon + * encountering such situations, the library will try reading the metadata + * again to obtain consistent data. This routine provides the means to set + * a number of read attempts other than the library default. + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pset_metadata_read_attempts(hid_t plist_id, unsigned attempts); +/** + * \ingroup FAPL + * + * \brief Specifies type of data to be accessed via the \Code{MULTI} driver, + * enabling more direct access + * + * \fapl_id{fapl_id} + * \param[in] type Type of data to be accessed + * + * \return \herr_t + * + * \details H5Pset_multi_type() sets the \Emph{type of data} property in the file + * access property list \p fapl_id. This setting enables a user + * application to specify the type of data the application wishes to + * access so that the application can retrieve a file handle for + * low-level access to the particular member of a set of \Code{MULTI} + * files in which that type of data is stored. The file handle is + * retrieved with a separate call to H5Fget_vfd_handle() (or, in special + * circumstances, to H5FDget_vfd_handle(); see \ref VFL). + * + * The type of data specified in \p type may be one of the following: + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + * + *
    #H5FD_MEM_SUPER Super block data
    #H5FD_MEM_BTREE B-tree data
    #H5FD_MEM_DRAW Dataset raw data
    #H5FD_MEM_GHEAP Global heap data
    #H5FD_MEM_LHEAP Local Heap data
    #H5FD_MEM_OHDR Object header data
    + * + * This function is for use only when accessing an HDF5 file written as a set of + * files with the \Code{MULTI} file driver. + * + * \since 1.6.0 + */ H5_DLL herr_t H5Pset_multi_type(hid_t fapl_id, H5FD_mem_t type); +/** + * \ingroup FAPL + * + * \brief Sets a callback function to invoke when an object flush occurs in the file + * + * \fapl_id{plist_id} + * \op{func} + * \op_data_in{udata} + * + * \return \herr_t + * + * \details H5Pset_object_flush_cb() sets, in the file access property list + * \p plist_id, the callback function to invoke whenever an object flush + * occurs in the file. Library objects are groups, datasets, and committed + * datatypes. + * + * The callback function \p func must conform to the prototype defined below: + * \code + * typedef herr_t (*H5F_flush_cb_t)(hid_t object_id, void *user_data); + * \endcode + * + * The parameters of the callback function, per the above prototype, are defined as follows: + * - \Code{object_id} is the identifier of the object which has just been flushed. + * - \Code{user_data} is the user-defined input data for the callback function. + * + * \b Example: The example below illustrates the usage of this routine to set + * the callback function to invoke when an object flush occurs. + * + * \include H5Pset_object_flush_cb.c + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pset_object_flush_cb(hid_t plist_id, H5F_flush_cb_t func, void *udata); +/** + * \ingroup FAPL + * + * \brief Sets the maximum size of the data sieve buffer + * + * \fapl_id{fapl_id} + * \param[in] size Maximum size, in bytes, of data sieve buffer + * + * \return \herr_t + * + * \details H5Pset_sieve_buf_size() sets \p size, the maximum size in bytes of the + * data sieve buffer, which is used by file drivers that are capable of + * using data sieving. + * + * The data sieve buffer is used when performing I/O on datasets in the + * file.
Using a buffer which is large enough to hold several pieces of + * the dataset being read in for hyperslab selections can improve + * performance substantially. + * + * The default value is set to 64 KB, indicating that file I/O for raw + * data reads and writes will occur in at least 64 KB blocks. Setting + * the value to zero (\Code{0}) with this API function will turn off + * the data sieving, even if the VFL driver attempts to use that + * strategy. + * + * Internally, the library checks the storage sizes of the datasets in + * the file. To save memory, it allocates a sieve buffer for each dataset + * whose size is the smaller of the size set in the file access property + * list and the size of the dataset. + * + * \version 1.6.0 The \p size parameter has changed from type \Code{hsize_t} to \Code{size_t}. + * + * \since 1.4.0 + */ H5_DLL herr_t H5Pset_sieve_buf_size(hid_t fapl_id, size_t size); +/** + * \ingroup FAPL + * + * \brief Sets the size of a contiguous block reserved for small data + * + * \fapl_id{fapl_id} + * \param[in] size Maximum size, in bytes, of the small data block. + The default size is \Code{2048}. + * + * \return \herr_t + * + * \details H5Pset_small_data_block_size() reserves blocks of \p size bytes for the + * contiguous storage of the raw data portion of \Emph{small} datasets. The + * HDF5 library then writes the raw data from small datasets to this + * reserved space, thus reducing unnecessary discontinuities within + * blocks of metadata and improving I/O performance. + * + * A small data block is actually allocated the first time a qualifying + * small dataset is written to the file. Space for the raw data portion + * of this small dataset is suballocated within the small data block. + * The raw data from each subsequent small dataset is also written to + * the small data block until it is filled; additional small data + * blocks are allocated as required.
+ * + * The HDF5 library employs an algorithm that determines whether I/O + * performance is likely to benefit from the use of this mechanism with + * each dataset as storage space is allocated in the file. A larger + * \p size will result in this mechanism being employed with larger + * datasets. + * + * The small data block size is set as an allocation property in the + * file access property list identified by \p fapl_id. + * + * Setting \p size to zero (\Code{0}) disables the small data block mechanism. + * + * \since 1.4.4 + */ H5_DLL herr_t H5Pset_small_data_block_size(hid_t fapl_id, hsize_t size); /** * \ingroup FAPL @@ -3943,14 +5207,242 @@ H5_DLL herr_t H5Pset_small_data_block_size(hid_t fapl_id, hsize_t size); H5_DLL herr_t H5Pset_vol(hid_t plist_id, hid_t new_vol_id, const void *new_vol_info); #ifdef H5_HAVE_PARALLEL +/** + * \ingroup GACPL + * + * \brief Sets metadata I/O mode for read operations to collective or independent (default) + * + * \gacpl_id + * \param[in] is_collective Boolean value indicating whether metadata reads are collective + * (\Code{1}) or independent (\Code{0}). + * Default mode: Independent (\Code{0}) + * + * \return \herr_t + * + * \details H5Pset_all_coll_metadata_ops() sets the metadata I/O mode for read + * operations in the access property list \p plist_id. + * + * When engaging in parallel I/O, all metadata write operations must be + * collective. If \p is_collective is \Code{1}, this property specifies + * that the HDF5 library will perform all metadata read operations + * collectively; if \p is_collective is \Code{0}, such operations may + * be performed independently. + * + * Users must be aware that several HDF5 operations can potentially + * issue metadata reads. These include opening a dataset, datatype, or + * group; reading an attribute; or issuing a \Emph{get info} call such + * as getting information for a group with H5Fget_info(). 
Collective + * I/O requirements must be kept in mind when issuing such calls in the + * context of parallel I/O. + * + * If this property is collective on a file access property list that + * is used in creating or opening a file, then the HDF5 library will + * assume that all metadata read operations issued on that file + * identifier will be issued collectively from all ranks irrespective + * of the individual setting of a particular operation. If this + * assumption is not adhered to, corruption will be introduced in the + * metadata cache and HDF5’s behavior will be undefined. + * + * Alternatively, a user may wish to avoid setting this property + * globally on the file access property list, and individually set it + * on particular object access property lists (dataset, group, link, + * datatype, attribute access property lists) for certain operations. + * This will indicate that only the operations issued with such an + * access property list will be called collectively and other + * operations may potentially be called independently. There are, + * however, several HDF5 operations that can issue metadata reads but + * have no property list in their function signatures to allow passing + * the collective requirement property. For those operations, the only + * option is to set the global collective requirement property on the + * file access property list; otherwise the metadata reads that can be + * triggered from those operations will be done independently by each + * process. + * + * Functions that do not accommodate an access property list but that + * might issue metadata reads are listed in \ref maybe_metadata_reads. + * + * \attention As noted above, corruption will be introduced into the metadata + * cache and HDF5 library behavior will be undefined when both of the following + * conditions exist: + * - A file is created or opened with a file access property list in which the + * collective metadata I/O property is set to \Code{1}. 
+ * - Any function is called that triggers an independent metadata read while the + * file remains open with that file access property list. + * + * \attention An approach that avoids this corruption risk is described above. + * + * \sa_metadata_ops + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pset_all_coll_metadata_ops(hid_t plist_id, hbool_t is_collective); +/** + * \ingroup GACPL + * + * \brief Retrieves metadata read mode setting + * + * \gacpl_id + * \param[out] is_collective Pointer to a buffer containing the Boolean value indicating whether metadata + * reads are collective (\Code{>0}) or independent (\Code{0}). + * Default mode: Independent (\Code{0}) + * + * \return \herr_t + * + * \details H5Pget_all_coll_metadata_ops() retrieves the collective metadata read setting from the access + * property list \p plist_id into \p is_collective. + * + * \sa_metadata_ops + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pget_all_coll_metadata_ops(hid_t plist_id, hbool_t *is_collective); +/** + * \ingroup FAPL + * + * \brief Sets metadata write mode to collective or independent (default) + * + * \fapl_id{plist_id} + * \param[in] is_collective Boolean value indicating whether metadata + * writes are collective (\Code{>0}) or independent (\Code{0}). + * \Emph{Default mode:} Independent (\Code{0}) + * \return \herr_t + * + * \details H5Pset_coll_metadata_write() tells the HDF5 library whether to + * perform metadata writes collectively (1) or independently (0). + * + * If collective access is selected, then on a flush of the metadata + * cache, all processes will divide the metadata cache entries to be + * flushed evenly among themselves and issue a single MPI-IO collective + * write operation. This is the preferred method when the size of the + * metadata created by the application is large. + * + * If independent access is selected, the library uses the default + * method for doing metadata I/O either from process zero or + * independently from each process.
+ * + * \sa_metadata_ops + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pset_coll_metadata_write(hid_t plist_id, hbool_t is_collective); +/** + * \ingroup FAPL + * + * \brief Retrieves metadata write mode setting + * + * \fapl_id{plist_id} + * \param[out] is_collective Pointer to a boolean value indicating whether + * metadata writes are collective (\Code{>0}) or independent (\Code{0}). + * \Emph{Default mode:} Independent (\Code{0}) + * \return \herr_t + * + * \details H5Pget_coll_metadata_write() retrieves the collective metadata write + * setting from the file access property list \p plist_id into \p is_collective. + * + * \sa_metadata_ops + * + * \since 1.10.0 + */ H5_DLL herr_t H5Pget_coll_metadata_write(hid_t plist_id, hbool_t *is_collective); + +/** + * \todo Add missing documentation + */ H5_DLL herr_t H5Pget_mpi_params(hid_t fapl_id, MPI_Comm *comm, MPI_Info *info); + +/** + * \todo Add missing documentation + */ H5_DLL herr_t H5Pset_mpi_params(hid_t fapl_id, MPI_Comm comm, MPI_Info info); #endif /* H5_HAVE_PARALLEL */ +/** + * \ingroup FAPL + * + * \brief Sets the metadata cache image option for a file access property list + * + * \fapl_id{plist_id} + * \param[in] config_ptr Pointer to metadata cache image configuration values + * \return \herr_t + * + * \details H5Pset_mdc_image_config() sets the metadata cache image option with + * configuration values specified by \p config_ptr for the file access + * property list specified in \p plist_id. + * + * #H5AC_cache_image_config_t is defined as follows: + * \snippet H5ACpublic.h H5AC_cache_image_config_t_snip + * \click4more + * + * \par Limitations: While it is an obvious error to request a cache image when + * opening the file read-only, it is not in general possible to test for + * this error in the H5Pset_mdc_image_config() call.
Rather than fail the + * subsequent file open, the library silently ignores the file image + * request in this case.\n It is also an error to request a cache image on + * a file that does not support superblock extension messages (i.e. a + * superblock version less than 2). As above, it is not always possible to + * detect this error in the H5Pset_mdc_image_config() call, and thus the + * request for a cache image will fail silently in this case as well.\n + * Creation of cache images is currently disabled in parallel -- as above, + * any request for a cache image in this context will fail silently.\n + * Files with cache images may be read in parallel applications, but note + * that the load of the cache image is a collective operation triggered by + * the first operation that accesses metadata after file open (or, if + * persistent free space managers are enabled, on the first allocation or + * deallocation of file space, or read of file space manager status, + * whichever comes first). Thus the parallel process may deadlock if any + * process does not participate in this access.\n + * In long sequences of file closes and opens, infrequently accessed + * metadata can accumulate in the cache image to the point where the cost + * of storing and restoring this metadata exceeds the benefit of retaining + * frequently used metadata in the cache image. When implemented, the + * #H5AC_cache_image_config_t::entry_ageout should address this problem. In + * the interim, not requesting a cache image every n file close/open cycles + * may be an acceptable work around. The choice of \c n will be driven by + * application behavior, but \Code{n = 10} seems a good starting point. 
+ * + * \since 1.10.1 + */ H5_DLL herr_t H5Pset_mdc_image_config(hid_t plist_id, H5AC_cache_image_config_t *config_ptr); +/** + * \ingroup FAPL + * + * \brief Sets the maximum size for the page buffer and the minimum percentage + * for metadata and raw data pages + * + * \fapl_id{plist_id} + * \param[in] buf_size Maximum size, in bytes, of the page buffer + * \param[in] min_meta_per Minimum metadata percentage to keep in the page buffer + * before allowing pages containing metadata to be evicted (Default is 0) + * \param[in] min_raw_per Minimum raw data percentage to keep in the page buffer + * before allowing pages containing raw data to be evicted (Default is 0) + * \return \herr_t + * + * \details H5Pset_page_buffer_size() sets \p buf_size, the maximum size in bytes + * of the page buffer. The default value is zero, meaning that page + * buffering is disabled. When a non-zero page buffer size is set, the + * library will enable page buffering if that size is larger than or equal + * to a single page size and a paged file space strategy is enabled + * using the functions H5Pset_file_space_strategy() and + * H5Pset_file_space_page_size(). + * + * The page buffer layer captures all I/O requests before they are + * issued to the VFD and "caches" them in fixed-size pages. Once the + * total number of pages exceeds the page buffer size, the library + * evicts pages from the page buffer by writing them to the VFD. At + * file close, the page buffer is flushed, writing all the pages to the + * file. + * + * If a non-zero page buffer size is set, and the file space strategy + * is not set to paged or the page size for the file space strategy is + * larger than the page buffer size, the subsequent call to H5Fcreate() + * or H5Fopen() using the \p plist_id will fail. + * + * The function also allows setting the minimum percentage of pages for + * metadata and raw data to prevent one type of data from evicting hot + * data of the other type.
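The three outcomes described above for H5Pset_page_buffer_size() can be summarized in a small decision function. This is only a sketch of the documented rules, not library code: the names pb_outcome_t and page_buffer_outcome() are hypothetical, and the real check happens inside H5Fcreate() or H5Fopen().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes; these are NOT HDF5 identifiers. */
typedef enum { PB_DISABLED, PB_ENABLED, PB_OPEN_FAILS } pb_outcome_t;

/*
 * Restates the documented behavior:
 *   - buf_size == 0                       -> page buffering stays disabled
 *   - buf_size > 0 and strategy not paged -> the file create/open fails
 *   - buf_size > 0 and page_size > buf    -> the file create/open fails
 *   - otherwise                           -> page buffering is enabled
 */
static pb_outcome_t page_buffer_outcome(size_t buf_size, bool paged_strategy, size_t page_size)
{
    if (buf_size == 0)
        return PB_DISABLED;
    if (!paged_strategy || page_size > buf_size)
        return PB_OPEN_FAILS;
    return PB_ENABLED;
}
```

For instance, a 4096-byte page buffer with a 4096-byte file space page size is accepted, while a 2048-byte buffer with the same page size is not.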
+ * + * \since 1.10.1 + * + */ H5_DLL herr_t H5Pset_page_buffer_size(hid_t plist_id, size_t buf_size, unsigned min_meta_per, unsigned min_raw_per); @@ -4288,7 +5780,7 @@ H5_DLL H5D_layout_t H5Pget_layout(hid_t plist_id); * virtual dataset that has the creation property list specified * by \p dcpl_id. * - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -4331,7 +5823,7 @@ H5_DLL herr_t H5Pget_virtual_count(hid_t dcpl_id, size_t *count /*out*/); * assigned to \p size for a second H5Pget_virtual_dsetname() * call, which will retrieve the actual dataset name. * - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -4375,7 +5867,7 @@ H5_DLL ssize_t H5Pget_virtual_dsetname(hid_t dcpl_id, size_t index, char *name / * \p size for a second H5Pget_virtual_filename() call, which * will retrieve the actual filename. * - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -4400,7 +5892,7 @@ H5_DLL ssize_t H5Pget_virtual_filename(hid_t dcpl_id, size_t index, char *name / * index, \p index, and returns a dataspace identifier for the * selection within the source dataset used in the mapping. * - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -4425,7 +5917,7 @@ H5_DLL hid_t H5Pget_virtual_srcspace(hid_t dcpl_id, size_t index); * index, \p index, and returns a dataspace identifier for the * selection within the virtual dataset used in the mapping. 
* - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -5159,6 +6651,133 @@ H5_DLL herr_t H5Pset_scaleoffset(hid_t plist_id, H5Z_SO_scale_type_t scale_type, * */ H5_DLL herr_t H5Pset_szip(hid_t plist_id, unsigned options_mask, unsigned pixels_per_block); + +/** + * \ingroup DCPL + * + * \brief Sets the mapping between virtual and source datasets + * + * \dcpl_id + * \param[in] vspace_id The dataspace identifier with the selection within the + * virtual dataset applied, possibly an unlimited selection + * \param[in] src_file_name The name of the HDF5 file where the source dataset is + * located or a \Code{"."} (period) for a source dataset in the same + * file. The file might not exist yet. The name can be specified using + * a C-style \c printf statement as described below. + * \param[in] src_dset_name The path to the HDF5 dataset in the file specified by + * \p src_file_name. The dataset might not exist yet. The dataset name + * can be specified using a C-style \c printf statement as described below. + * \param[in] src_space_id The source dataset’s dataspace identifier with a + * selection applied, possibly an unlimited selection + * \return \herr_t + * + * \details H5Pset_virtual() maps elements of the virtual dataset (VDS) + * described by the virtual dataspace identifier \p vspace_id to the + * elements of the source dataset described by the source dataset + * dataspace identifier \p src_space_id. The source dataset is + * identified by the name of the file where it is located, + * \p src_file_name, and the name of the dataset, \p src_dset_name. + * + * \par C-style \c printf Formatting Statements: + * C-style \c printf formatting allows a pattern to be specified in the name + * of a source file or dataset. Strings for the file and dataset names are + * treated as literals except for the following substitutions: + * + * + * + * + * + * + * + * + * + *
+ * <table>
+ * <tr>
+ * <td>\Code{"%%"}</td>
+ * <td>Replaced with a single \Code{"%"} (percent) character.</td>
+ * </tr>
+ * <tr>
+ * <td><code>"%<d>b"</code></td>
+ * <td>Where <code>"<d>"</code> is the virtual dataset dimension axis (0-based)
+ * and \Code{"b"} indicates that the block count of the selection in that
+ * dimension should be used. The full expression (for example, \Code{"%0b"})
+ * is replaced with a single numeric value when the mapping is evaluated at
+ * VDS access time. Example code for many source and virtual dataset mappings
+ * is available in the "Examples of Source to Virtual Dataset Mapping"
+ * chapter in the RFC: HDF5 Virtual Dataset.</td>
+ * </tr>
+ * </table>
    + * If the printf form is used for the source file or dataset names, the + * selection in the source dataset’s dataspace must be fixed-size. + * + * \par Source File Resolutions: + * When a source dataset residing in a different file is accessed, the + * library will search for the source file \p src_file_name as described + * below: + * \li If \p src_file_name is a \Code{"."} (period) then it refers to the + * file containing the virtual dataset. + * \li If \p src_file_name is a relative pathname, the following steps are + * performed: + * - The library will get the prefix(es) set in the environment + * variable \c HDF5_VDS_PREFIX and will try to prepend each prefix + * to \p src_file_name to form a new \p src_file_name. If the new + * \p src_file_name does not exist or if \c HDF5_VDS_PREFIX is not + * set, the library will get the prefix set via H5Pset_virtual_prefix() + * and prepend it to \p src_file_name to form a new \p src_file_name. + * If the new \p src_file_name does not exist or no prefix is being + * set by H5Pset_virtual_prefix() then the path of the file containing + * the virtual dataset is obtained. This path can be the absolute path + * or the current working directory plus the relative path of that + * file when it is created/opened. The library will prepend this path + * to \p src_file_name to form a new \p src_file_name. + * - If the new \p src_file_name does not exist, then the library will + * look for \p src_file_name and will return failure/success accordingly. + * \li If \p src_file_name is an absolute pathname, the library will first + * try to find \p src_file_name. If \p src_file_name does not exist, + * \p src_file_name is stripped of directory paths to form a new + * \p src_file_name. The search for the new \p src_file_name then follows + * the same steps as described above for a relative pathname. See + * examples below illustrating how \p src_file_name is stripped to form + * a new \p src_file_name. 
+ * \par + * Note that \p src_file_name is considered to be an absolute pathname when + * the following condition is true: + * \li For Unix, the first character of \p src_file_name is a slash + * (\Code{/}).\n For example, consider a \p src_file_name of + * \Code{/tmp/A.h5}. If that source file does not exist, the new + * \p src_file_name after stripping will be \Code{A.h5}. + * \li For Windows, there are 6 cases: + * 1. \p src_file_name is an absolute drive with absolute pathname.\n + * For example, consider a \p src_file_name of \Code{/tmp/A.h5}. + * If that source file does not exist, the new \p src_file_name + * after stripping will be \Code{A.h5}. + * 2. \p src_file_name is an absolute pathname without specifying + * drive name.\n For example, consider a \p src_file_name of + * \Code{/tmp/A.h5}. If that source file does not exist, the new + * \p src_file_name after stripping will be \Code{A.h5}. + * 3. \p src_file_name is an absolute drive with relative pathname.\n + * For example, consider a \p src_file_name of \Code{/tmp/A.h5}. + * If that source file does not exist, the new \p src_file_name + * after stripping will be \Code{tmp/A.h5}. + * 4. \p src_file_name is in UNC (Uniform Naming Convention) format + * with server name, share name, and pathname.\n + * For example, consider a \p src_file_name of \Code{/tmp/A.h5}. + * If that source file does not exist, the new \p src_file_name + * after stripping will be \Code{A.h5}. + * 5. \p src_file_name is in Long UNC (Uniform Naming Convention) + * format with server name, share name, and pathname.\n + * For example, consider a \p src_file_name of \Code{/tmp/A.h5}. + * If that source file does not exist, the new \p src_file_name + * after stripping will be \Code{A.h5}. + * 6. \p src_file_name is in Long UNC (Uniform Naming Convention) + * format with an absolute drive and an absolute pathname.\n + * For example, consider a \p src_file_name of \Code{/tmp/A.h5}. 
+ * If that source file does not exist, the new \p src_file_name + * after stripping will be \Code{A.h5} + * + * \see + * Virtual Dataset Overview + * + * \see_virtual + * + * \version 1.10.2 A change was made to the method of searching for VDS source files. + * \since 1.10.0 + * + */ H5_DLL herr_t H5Pset_virtual(hid_t dcpl_id, hid_t vspace_id, const char *src_file_name, const char *src_dset_name, hid_t src_space_id); @@ -5191,8 +6810,6 @@ H5_DLL herr_t H5Pset_virtual(hid_t dcpl_id, hid_t vspace_id, const char *src_fil * \p udata is the user-defined input data for the callback * function. * - * \todo Example Usage was removed and needs to be re-added - * * \since 1.10.0 * */ @@ -5298,7 +6915,7 @@ H5_DLL ssize_t H5Pget_efile_prefix(hid_t dapl_id, char *prefix /*out*/, size_t s * NULL will return the size of the prefix without the NULL * terminator. * - * \virtual + * \see_virtual * * \since 1.10.2 * @@ -5348,7 +6965,7 @@ H5_DLL herr_t H5Pget_virtual_printf_gap(hid_t dapl_id, hsize_t *gap_size); * list, \p dapl_id, and retrieves the flag, \p view, set by the * H5Pset_virtual_view() call. * - * \virtual + * \see_virtual * * \since 1.10.0 * @@ -5389,7 +7006,7 @@ H5_DLL herr_t H5Pget_virtual_view(hid_t dapl_id, H5D_vds_view_t *view); * \p boundary. It is a 1-dimensional array with \p ndims * elements, which should be the same as the rank of the * dataset’s dataspace. While appending to a dataset along a - * particular dimension index via H5DOappend(), the library + * particular dimension index via H5Dappend(), the library * determines a boundary is reached when the resulting dimension * size is divisible by \p boundary[index]. 
A zero value for * \p boundary[index] indicates no boundary is set for that @@ -5415,8 +7032,7 @@ H5_DLL herr_t H5Pget_virtual_view(hid_t dapl_id, H5D_vds_view_t *view); * * The callback function \p func must conform to the following * prototype: - * \Code{typedef herr_t (#H5D_append_cb_t)(hid_t dataset_id, - * hsize_t *cur_dims, void *user_data)} + * \snippet H5Dpublic.h H5D_append_cb_t_snip * * The parameters of the callback function, per the above * prototype, are defined as follows: @@ -5426,10 +7042,6 @@ H5_DLL herr_t H5Pget_virtual_view(hid_t dapl_id, H5D_vds_view_t *view); * a boundary is hit. * \li \p user_data is the user-defined input data. * - * \todo Example Usage was removed and should be added back. - * \todo Adding snippet for H5D_append_cb_t_snip did not work. - * \todo H5DOappend() not found - * * \since 1.10.0 * */ @@ -5643,7 +7255,7 @@ H5_DLL herr_t H5Pset_efile_prefix(hid_t dapl_id, const char *prefix); * buffer should not be freed until the property list has been * closed. * - * \virtual + * \see_virtual * * \since 1.10.2 * @@ -5672,122 +7284,345 @@ H5_DLL herr_t H5Pset_virtual_prefix(hid_t dapl_id, const char *prefix); * datasets will determine the extent of the unlimited virtual * dataset with the printf-style mappings. * - * Consider the following examples where the regularly spaced - * blocks of a virtual dataset are mapped to datasets with the - * names d-1, d-2, d-3, ..., d-N, ... : + * Consider the following examples where the regularly spaced + * blocks of a virtual dataset are mapped to datasets with the + * names d-1, d-2, d-3, ..., d-N, ... : + * + * \li If the dataset d-2 is missing and \p gap_size is set to 0, + * then the virtual dataset will contain only data found + * in d-1. + * \li If d-2 and d-3 are missing and \p gap_size is set to 2, + * then the virtual dataset will contain the data from + * d-1, d-3, ..., d-N, ... . 
The blocks that are mapped to + * d-2 and d-3 will be filled according to the virtual + * dataset’s fill value setting. + * + * \see_virtual + * + * \since 1.10.0 + * + */ +H5_DLL herr_t H5Pset_virtual_printf_gap(hid_t dapl_id, hsize_t gap_size); +/** + * \ingroup DAPL + * + * \brief Sets the view of the virtual dataset (VDS) to include or exclude + * missing mapped elements + * + * \dapl_id + * \param[in] view Flag specifying the extent of the data to be included + * in the view. Valid values are: + * \li #H5D_VDS_FIRST_MISSING: View includes all data + * before the first missing mapped data + * \li #H5D_VDS_LAST_AVAILABLE View includes all + * available mapped data + * + * \return \herr_t + * + * \details H5Pset_virtual_view() takes the access property list for the + * virtual dataset, \p dapl_id, and the flag, \p view, and sets + * the VDS view according to the flag value. + * + * If \p view is set to #H5D_VDS_FIRST_MISSING, the view includes + * all data before the first missing mapped data. This setting + * provides a view containing only the continuous data starting + * with the dataset’s first data element. Any break in + * continuity terminates the view. + * + * If \p view is set to #H5D_VDS_LAST_AVAILABLE, the view + * includes all available mapped data. + * + * Missing mapped data is filled with the fill value set in the + * VDS creation property list. 
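The difference between the two view settings can be illustrated with a simplified model in which the mapped source blocks are reduced to a row of present/missing flags. This sketch is an assumption-laden simplification: view_t and visible_blocks() are hypothetical stand-ins (not #H5D_vds_view_t or any HDF5 call), and it ignores the printf gap setting and the actual selection geometry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two documented view flags. */
typedef enum { VIEW_FIRST_MISSING, VIEW_LAST_AVAILABLE } view_t;

/*
 * Returns how many leading blocks fall inside the view.
 * present[i] is true when the source data mapped to block i exists.
 *   VIEW_FIRST_MISSING  : stop at the first missing block.
 *   VIEW_LAST_AVAILABLE : extend to the last block that exists; missing
 *                         blocks inside that extent would read as fill values.
 */
static size_t visible_blocks(const bool *present, size_t n, view_t view)
{
    size_t extent = 0;
    for (size_t i = 0; i < n; i++) {
        if (present[i])
            extent = i + 1;
        else if (view == VIEW_FIRST_MISSING)
            break;
    }
    return extent;
}
```

With blocks {present, present, missing, present}, the first setting yields an extent of two blocks and the second an extent of four.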
+ * + * \see_virtual + * + * \since 1.10.0 + * + */ +H5_DLL herr_t H5Pset_virtual_view(hid_t dapl_id, H5D_vds_view_t view); + +/* Dataset xfer property list (DXPL) routines */ +/** + * + * \ingroup DXPL + * + * \brief Gets B-tree split ratios for a dataset transfer property list + * + * \dxpl_id{plist_id} + * \param[out] left The B-tree split ratio for left-most nodes + * \param[out] middle The B-tree split ratio for all other nodes + * \param[out] right The B-tree split ratio for right-most nodes and lone nodes + * \return \herr_t + * + * \details H5Pget_btree_ratios() returns the B-tree split ratios for a dataset + * transfer property list. + * + * The B-tree split ratios are returned through the non-NULL arguments + * \p left, \p middle, and \p right, as set by the H5Pset_btree_ratios() + * function. + * + */ +H5_DLL herr_t H5Pget_btree_ratios(hid_t plist_id, double *left /*out*/, double *middle /*out*/, + double *right /*out*/); +/** + * + * \ingroup DXPL + * + * \brief Reads buffer settings + * + * \param[in] plist_id Identifier for the dataset transfer property list + * \param[out] tconv Address of the pointer to application-allocated type + * conversion buffer + * \param[out] bkg Address of the pointer to application-allocated + * background buffer + * + * \return Returns buffer size, in bytes, if successful; otherwise 0 on failure. + * + * \details H5Pget_buffer() reads values previously set with H5Pset_buffer(). + * + * \version 1.6.0 The return type changed from \p hsize_t to \p size_t. + * \version 1.4.0 The return type changed to \p hsize_t.
+ * + */ +H5_DLL size_t H5Pget_buffer(hid_t plist_id, void **tconv /*out*/, void **bkg /*out*/); +/** + * + * \ingroup DXPL + * + * \brief Retrieves a data transform expression + * + * \param[in] plist_id Identifier of the property list or class + * \param[out] expression Pointer to memory where the transform expression will + * be copied + * \param[in] size Number of bytes of the transform expression to copy + * to + * + * \return Success: the size of the transform expression. Failure: a negative + * value. + * + * \details H5Pget_data_transform() retrieves the data transform expression + * previously set in the dataset transfer property list \p plist_id + * by H5Pset_data_transform(). + * + * H5Pget_data_transform() can be used to both retrieve the transform + * expression and query its size. + * + * If \p expression is non-NULL, up to \p size bytes of the data + * transform expression are written to the buffer. If \p expression + * is NULL, \p size is ignored, and the function does not write + * anything to the buffer. The function always returns the size of + * the data transform expression. + * + * If 0 is returned for the size of the expression, no data transform + * expression exists for the property list. + * + * If an error occurs, the buffer pointed to by \p expression is + * unchanged, and the function returns a negative value. + * + * \par Example + * An example snippet from examples/h5_dtransform.c: + * \snippet h5_dtransform.c H5Pget_data_transform_snip + * + * \since 1.8.0 + * + */ +H5_DLL ssize_t H5Pget_data_transform(hid_t plist_id, char *expression /*out*/, size_t size); +/** + * + * \ingroup DXPL + * + * \brief Determines whether error-detection is enabled for dataset reads + * + * \param[in] plist_id Dataset transfer property list identifier + * + * \return Returns \p H5Z_ENABLE_EDC or \p H5Z_DISABLE_EDC if successful; + * otherwise returns a negative value. 
+ * + * \details H5Pget_edc_check() queries the dataset transfer property + * list \p plist_id to determine whether error detection is enabled for + * data read operations. + * + * \since 1.6.0 + * + */ +H5_DLL H5Z_EDC_t H5Pget_edc_check(hid_t plist_id); +/** + * + * \ingroup DXPL + * + * \brief Retrieves number of I/O vectors to be read/written in hyperslab I/O + * + * \param[in] fapl_id Dataset transfer property list identifier + * \param[out] size Number of I/O vectors to accumulate in memory for I/O operations + * + * \return \herr_t + * + * \details H5Pget_hyper_vector_size() retrieves the number of I/O vectors to be accumulated in + * memory before being issued to the lower levels of the HDF5 library for reading or + * writing the actual data. + * + * The number of I/O vectors set in the dataset transfer property list \p fapl_id is + * returned in \p size. Unless the default value is in use, \p size was + * previously set with a call to H5Pset_hyper_vector_size(). + * + * \since 1.6.0 + * + */ +H5_DLL herr_t H5Pget_hyper_vector_size(hid_t fapl_id, size_t *size /*out*/); +/** + * + * \ingroup DXPL + * + * \brief Checks status of the dataset transfer property list (\b DEPRECATED) + * + * \deprecated{H5Pget_preserve() is deprecated as it is no longer useful; + * compound datatype field preservation is now core functionality + * in the HDF5 library.} + * + * \param[in] plist_id Identifier for the dataset transfer property list + * + * \return Returns 1 or 0 if successful; otherwise returns a negative value. + * + * \details H5Pget_preserve() checks the status of the dataset transfer + * property list. + * + * \version 1.6.0 The flag parameter was changed from INTEGER to LOGICAL to + * better match the C API.
(Fortran 90) + * + */ +H5_DLL int H5Pget_preserve(hid_t plist_id); +/** + * + * \ingroup DXPL + * + * \brief Gets user-defined datatype conversion callback function * - * \li If the dataset d-2 is missing and \p gap_size is set to 0, - * then the virtual dataset will contain only data found - * in d-1. - * \li If d-2 and d-3 are missing and \p gap_size is set to 2, - * then the virtual dataset will contain the data from - * d-1, d-3, ..., d-N, ... . The blocks that are mapped to - * d-2 and d-3 will be filled according to the virtual - * dataset’s fill value setting. + * \param[in] dxpl_id Dataset transfer property list identifier + * \param[out] op User-defined type conversion callback function + * \param[out] operate_data User-defined input data for the callback function * - * \virtual + * \return \herr_t * - * \since 1.10.0 + * \details H5Pget_type_conv_cb() gets the user-defined datatype conversion + * callback function \p op in the dataset transfer property list + * \p dxpl_id. + * + * The parameter \p operate_data is a pointer to user-defined input + * data for the callback function. + * + * The callback function \p op defines the actions an application is + * to take when there is an exception during datatype conversion. + * + * Please refer to the function H5Pset_type_conv_cb() for more details. * */ -H5_DLL herr_t H5Pset_virtual_printf_gap(hid_t dapl_id, hsize_t gap_size); +H5_DLL herr_t H5Pget_type_conv_cb(hid_t dxpl_id, H5T_conv_except_func_t *op, void **operate_data); /** - * \ingroup DAPL * - * \brief Sets the view of the virtual dataset (VDS) to include or exclude - * missing mapped elements + * \ingroup DXPL * - * \dapl_id - * \param[in] view Flag specifying the extent of the data to be included - * in the view. 
Valid values are: - * \li #H5D_VDS_FIRST_MISSING: View includes all data - * before the first missing mapped data - * \li #H5D_VDS_LAST_AVAILABLE View includes all - * available mapped data + * \brief Gets the memory manager for variable-length datatype allocation in H5Dread() and H5Dvlen_reclaim() + * + * \param[in] plist_id Identifier for the dataset transfer property list + * \param[out] alloc_func User's allocate routine, or NULL for system malloc + * \param[out] alloc_info Extra parameter for user’s allocation routine. + * Contents are ignored if preceding + * parameter is NULL \param[out] free_func User's free routine, or NULL for + * system free \param[out] free_info + * Extra parameter for user’s free routine. Contents are ignored if preceding + * parameter is NULL * * \return \herr_t * - * \details H5Pset_virtual_view() takes the access property list for the - * virtual dataset, \p dapl_id, and the flag, \p view, and sets - * the VDS view according to the flag value. + * \details H5Pget_vlen_mem_manager() is the companion function to + * H5Pset_vlen_mem_manager(), returning the parameters set by + * that function. * - * If \p view is set to #H5D_VDS_FIRST_MISSING, the view includes - * all data before the first missing mapped data. This setting - * provides a view containing only the continuous data starting - * with the dataset’s first data element. Any break in - * continuity terminates the view. + */ +H5_DLL herr_t H5Pget_vlen_mem_manager(hid_t plist_id, H5MM_allocate_t *alloc_func, void **alloc_info, + H5MM_free_t *free_func, void **free_info); +/** * - * If \p view is set to #H5D_VDS_LAST_AVAILABLE, the view - * includes all available mapped data. + * \ingroup DXPL * - * Missing mapped data is filled with the fill value set in the - * VDS creation property list. 
+ * \brief Sets B-tree split ratios for a dataset transfer property list * - * \virtual + * \param[in] plist_id The dataset transfer property list identifier + * \param[in] left The B-tree split ratio for left-most nodes + * \param[in] middle The B-tree split ratio for all other nodes + * \param[in] right The B-tree split ratio for right-most nodes and lone + * nodes * - * \since 1.10.0 + * \return \herr_t + * + * \details H5Pset_btree_ratios() sets the B-tree split ratios for a dataset + * transfer property list. The split ratios determine what percent of + * children go in the first node when a node splits. + * + * The ratio \p left is used when the splitting node is the left-most + * node at its level in the tree; + * the ratio \p right is used when the splitting node is the right-most + * node at its level; and + * the ratio \p middle is used for all other cases. + * + * A node that is the only node at its level in the tree uses the + * ratio \p right when it splits. + * + * All ratios are real numbers between 0 and 1, inclusive. * */ -H5_DLL herr_t H5Pset_virtual_view(hid_t dapl_id, H5D_vds_view_t view); +H5_DLL herr_t H5Pset_btree_ratios(hid_t plist_id, double left, double middle, double right); -/* Dataset xfer property list (DXPL) routines */ -H5_DLL herr_t H5Pget_btree_ratios(hid_t plist_id, double *left /*out*/, double *middle /*out*/, - double *right /*out*/); -H5_DLL size_t H5Pget_buffer(hid_t plist_id, void **tconv /*out*/, void **bkg /*out*/); /** * * \ingroup DXPL * - * \brief Retrieves a data transform expression - * - * \param[in] plist_id Identifier of the property list or class - * \param[out] expression Pointer to memory where the transform expression - * will be copied - * \param[in] size Number of bytes of the transform expression to copy to - * - * \return Returns the size of the transform expression if successful; - * otherwise returns a negative value. 
+ * \brief Sets type conversion and background buffers * - * \details H5Pget_data_transform() retrieves the data transform - * expression previously set in the dataset transfer property - * list \p plist_id by H5Pset_data_transform(). + * \dxpl_id{plist_id} + * \param[in] size Size, in bytes, of the type conversion and background buffers + * \param[in] tconv Pointer to application-allocated type conversion buffer + * \param[in] bkg Pointer to application-allocated background buffer + * \return \herr_t * - * H5Pget_data_transform() can be used to both retrieve the - * transform expression and to query its size. + * \details Given a dataset transfer property list, H5Pset_buffer() sets the + * maximum size for the type conversion buffer and background buffer + * and optionally supplies pointers to application-allocated + * buffers. If the buffer size is smaller than the entire amount of + * data being transferred between the application and the file, and a + * type conversion buffer or background buffer is required, then strip + * mining will be used. * - * If \p expression is non-NULL, up to \p size bytes of the data - * transform expression are written to the buffer. If - * \p expression is NULL, \p size is ignored and the function - * does not write anything to the buffer. The function always - * returns the size of the data transform expression. + * Note that there are minimum size requirements for the buffer. Strip + * mining can only break the data up along the first dimension, so the + * buffer must be large enough to accommodate a complete slice that + * encompasses all of the remaining dimensions. For example, when strip + * mining a \Code{100x200x300} hyperslab of a simple data space, the + * buffer must be large enough to hold \Code{1x200x300} data + * elements. When strip mining a \Code{100x200x300x150} hyperslab of a + * simple data space, the buffer must be large enough to hold + * \Code{1x200x300x150} data elements. 
* - * If 0 is returned for the size of the expression, no data - * transform expression exists for the property list. + * If \p tconv and/or \p bkg are null pointers, then buffers will be + * allocated and freed during the data transfer. * - * If an error occurs, the buffer pointed to by \p expression is - * unchanged and the function returns a negative value. + * The default value for the maximum buffer is 1 MiB. * - * \since 1.8.0 + * \version 1.6.0 The \p size parameter has changed from type hsize_t to \c size_t. + * \version 1.4.0 The \p size parameter has changed to type hsize_t. * */ -H5_DLL ssize_t H5Pget_data_transform(hid_t plist_id, char *expression /*out*/, size_t size); -H5_DLL H5Z_EDC_t H5Pget_edc_check(hid_t plist_id); -H5_DLL herr_t H5Pget_hyper_vector_size(hid_t fapl_id, size_t *size /*out*/); -H5_DLL int H5Pget_preserve(hid_t plist_id); -H5_DLL herr_t H5Pget_type_conv_cb(hid_t dxpl_id, H5T_conv_except_func_t *op, void **operate_data); -H5_DLL herr_t H5Pget_vlen_mem_manager(hid_t plist_id, H5MM_allocate_t *alloc_func, void **alloc_info, - H5MM_free_t *free_func, void **free_info); -H5_DLL herr_t H5Pset_btree_ratios(hid_t plist_id, double left, double middle, double right); -H5_DLL herr_t H5Pset_buffer(hid_t plist_id, size_t size, void *tconv, void *bkg); +H5_DLL herr_t H5Pset_buffer(hid_t plist_id, size_t size, void *tconv, void *bkg); + /** * \ingroup DXPL * * \brief Sets a data transform expression * - * \param[in] plist_id Identifier of the property list or class + * \dxpl_id{plist_id} * \param[in] expression Pointer to the null-terminated data transform * expression - * * \return \herr_t * * \details H5Pset_data_transform() sets the data transform to be used for @@ -5795,11 +7630,11 @@ H5_DLL herr_t H5Pset_buffer(hid_t plist_id, size_t size, void *tconv, void *b * transfer property list \p plist_id. * * The \p expression parameter is a string containing an algebraic - * expression, such as (5/9.0)*(x-32) or x*(x-5). 
When a dataset - * is read or written with this property list, the transform - * expression is applied with the x being replaced by the values - * in the dataset. When reading data, the values in the file are - * not changed and the transformed data is returned to the user. + * expression, such as \Code{(5/9.0)*(x-32)} or \Code{x*(x-5)}. When a + * dataset is read or written with this property list, the transform + * expression is applied with the \c x being replaced by the values in + * the dataset. When reading data, the values in the file are not + * changed and the transformed data is returned to the user. * * Data transforms can only be applied to integer or * floating-point datasets. Order of operations is obeyed and @@ -5813,17 +7648,354 @@ H5_DLL herr_t H5Pset_buffer(hid_t plist_id, size_t size, void *tconv, void *b * */ H5_DLL herr_t H5Pset_data_transform(hid_t plist_id, const char *expression); + +/** + * \ingroup DXPL + * + * \brief Sets the dataset transfer property list to enable or disable error + * detection when reading data + * + * \dxpl_id{plist_id} + * \param[in] check Specifies whether error checking is enabled or disabled + * for dataset read operations + * \return \herr_t + * + * \details H5Pset_edc_check() sets the dataset transfer property list \p plist_id + * to enable or disable error detection when reading data. + * + * Whether error detection is enabled or disabled is specified in the + * \p check parameter. Valid values are #H5Z_ENABLE_EDC (default) and + * #H5Z_DISABLE_EDC. + * + * \note The initial error detection implementation, Fletcher32 checksum, + * supports error detection for chunked datasets only. + * + * \attention The Fletcher32 EDC checksum filter, set with H5Pset_fletcher32(), + * was added in HDF5 Release 1.6.0. In the original implementation, + * however, the checksum value was calculated incorrectly on + * little-endian systems. 
The error was fixed in HDF5 Release 1.6.3.\n + * As a result of this fix, an HDF5 library of Release 1.6.0 through + * Release 1.6.2 cannot read a dataset created or written with + * Release 1.6.3 or later if the dataset was created with the + * checksum filter and the filter is enabled in the reading + * library. (Libraries of Release 1.6.3 and later understand the + * earlier error and compensate appropriately.)\n + * \Bold{Work-around:} An HDF5 library of Release 1.6.2 or earlier + * will be able to read a dataset created or written with the + * checksum filter by an HDF5 library of Release 1.6.3 or later if + * the checksum filter is disabled for the read operation. This can + * be accomplished via an H5Pset_edc_check() call with the value + * #H5Z_DISABLE_EDC in the second parameter. This has the obvious + * drawback that the application will be unable to verify the + * checksum, but the data does remain accessible. + * + * \version 1.6.3 Error in checksum calculation on little-endian systems + * corrected in this release. + * \since 1.6.0 + * + */ H5_DLL herr_t H5Pset_edc_check(hid_t plist_id, H5Z_EDC_t check); + +/** + * \ingroup DXPL + * + * \brief Sets user-defined filter callback function + * + * \dxpl_id{plist_id} + * \param[in] func User-defined filter callback function + * \param[in] op_data User-defined input data for the callback function + * \return \herr_t + * + * \details H5Pset_filter_callback() sets the user-defined filter callback + * function \p func in the dataset transfer property list \p plist_id. + * + * The parameter \p op_data is a pointer to user-defined input data for + * the callback function and will be passed through to the callback + * function. + * + * The callback function \p func defines the actions an application is + * to take when a filter fails. 
The function prototype is as follows: + * \snippet H5Zpublic.h H5Z_filter_func_t_snip + * where \c filter indicates which filter has failed, \c buf and \c buf_size + * are used to pass in the failed data, and op_data is the required + * input data for this callback function. + * + * Valid callback function return values are #H5Z_CB_FAIL and #H5Z_CB_CONT. + * + * \since 1.6.0 + * + */ H5_DLL herr_t H5Pset_filter_callback(hid_t plist_id, H5Z_filter_func_t func, void *op_data); -H5_DLL herr_t H5Pset_hyper_vector_size(hid_t fapl_id, size_t size); + +/** + * \ingroup DXPL + * + * \brief Sets number of I/O vectors to be read/written in hyperslab I/O + * + * \dxpl_id{plist_id} + * \param[in] size Number of I/O vectors to accumulate in memory for I/O + * operations\n + * Must be greater than 1 (one)\n + * Default value: 1024 + * \return \herr_t + * + * \details H5Pset_hyper_vector_size() sets the number of I/O vectors to be + * accumulated in memory before being issued to the lower levels of + * the HDF5 library for reading or writing the actual data. + * + * The I/O vectors are hyperslab offset and length pairs and are + * generated during hyperslab I/O. + * + * The number of I/O vectors is passed in \p size to be set in the + * dataset transfer property list \p plist_id. \p size must be + * greater than 1 (one). + * + * H5Pset_hyper_vector_size() is an I/O optimization function; + * increasing vector_size should provide better performance, but the + * library will use more memory during hyperslab I/O. The default value + * of \p size is 1024. 
+ * + * \since 1.6.0 + * + */ +H5_DLL herr_t H5Pset_hyper_vector_size(hid_t plist_id, size_t size); + +/** + * \ingroup DXPL + * + * \brief Sets the dataset transfer property list \p status + * + * \dxpl_id{plist_id} + * \param[in] status Status toggle of the dataset transfer property list + * \return \herr_t + * + * \deprecated This function is deprecated as it no longer has any effect; + * compound datatype field preservation is now core functionality in + * the HDF5 library. + * + * \details H5Pset_preserve() sets the dataset transfer property list status to + * \c 1 or \c 0. + * + * When reading or writing compound datatypes and the destination is + * partially initialized and the read/write is intended to initialize + * the other members, one must set this property to \c 1. Otherwise the + * I/O pipeline treats the destination datapoints as completely + * uninitialized. + * + * \todo Add missing version information: introduction, deprecation, etc. + * Why is the declaration not in the deprecated section? + * + */ H5_DLL herr_t H5Pset_preserve(hid_t plist_id, hbool_t status); + +/** + * \ingroup DXPL + * + * \brief Sets user-defined datatype conversion callback function + * + * \dxpl_id + * \param[in] op User-defined type conversion callback function + * \param[in] operate_data User-defined input data for the callback function + * \return \herr_t + * + * \details H5Pset_type_conv_cb() sets the user-defined datatype conversion + * callback function \p op in the dataset transfer property list \p + * dxpl_id + * + * The parameter operate_data is a pointer to user-defined input data + * for the callback function and will be passed through to the callback + * function. + * + * The callback function \p op defines the actions an application is to + * take when there is an exception during datatype conversion. The + * function prototype is as follows: + * \snippet H5Tpublic.h H5T_conv_except_func_t_snip + * + * \todo Add version information. 
+ * + */ H5_DLL herr_t H5Pset_type_conv_cb(hid_t dxpl_id, H5T_conv_except_func_t op, void *operate_data); + +/** + * \ingroup DXPL + * + * \brief Sets the memory manager for variable-length datatype allocation in + * H5Dread() and H5Dvlen_reclaim() + * + * \dxpl_id{plist_id} + * \param[in] alloc_func User's allocate routine, or \c NULL for system \c malloc + * \param[in] alloc_info Extra parameter for user's allocation routine. + * Contents are ignored if preceding parameter is \c NULL. + * \param[in] free_func User's free routine, or \c NULL for system \c free + * \param[in] free_info Extra parameter for user's free routine. Contents are + * ignored if preceding parameter is \c NULL + * \return \herr_t + * + * \details H5Pset_vlen_mem_manager() sets the memory manager for + * variable-length datatype allocation in H5Dread() and free in + * H5Dvlen_reclaim(). + * + * The \p alloc_func and \p free_func parameters identify the memory + * management routines to be used. If the user has defined custom + * memory management routines, \p alloc_func and/or free_func should be + * set to make those routine calls (i.e., the name of the routine is + * used as the value of the parameter); if the user prefers to use the + * system's \c malloc and/or \c free, the \p alloc_func and \p + * free_func parameters, respectively, should be set to \c NULL + * + * The prototypes for these user-defined functions are as follows: + * \snippet H5MMpublic.h H5MM_allocate_t_snip + * + * \snippet H5MMpublic.h H5MM_free_t_snip + * + * The \p alloc_info and \p free_info parameters can be used to pass + * along any required information to the user's memory management + * routines. + * + * In summary, if the user has defined custom memory management + * routines, the name(s) of the routines are passed in the \p + * alloc_func and \p free_func parameters and the custom routines' + * parameters are passed in the \p alloc_info and \p free_info + * parameters. 
If the user wishes to use the system \c malloc and \c + * free functions, the \p alloc_func and/or \p free_func parameters are + * set to \c NULL and the \p alloc_info and \p free_info parameters are + * ignored. + * + * \todo Add version information. + */ H5_DLL herr_t H5Pset_vlen_mem_manager(hid_t plist_id, H5MM_allocate_t alloc_func, void *alloc_info, H5MM_free_t free_func, void *free_info); + #ifdef H5_HAVE_PARALLEL +/** + * \ingroup DXPL + * + * \brief Retrieves the type of chunk optimization that HDF5 actually performed + * on the last parallel I/O call (not necessarily the type requested) + * + * \dxpl_id{plist_id} + * \param[out] actual_chunk_opt_mode The type of chunk optimization performed by HDF5 + * \return \herr_t + * + * \par Motivation: + * A user can request collective I/O via a data transfer property list + * (DXPL) that has been suitably modified with H5Pset_dxpl_mpio(). + * However, HDF5 will sometimes ignore this request and perform independent + * I/O instead. This property allows the user to see what kind of I/O HDF5 + * actually performed. Used in conjunction with H5Pget_mpio_actual_io_mode(), + * this property allows the user to determine exactly what HDF5 did when + * attempting collective I/O. + * + * \details H5Pget_mpio_actual_chunk_opt_mode() retrieves the type of chunk + * optimization performed when collective I/O was requested. This + * property is set before I/O takes place, and will be set even if I/O + * fails. 
+ * + * Valid values returned in \p actual_chunk_opt_mode: + * \snippet this H5D_mpio_actual_chunk_opt_mode_t_snip + * \click4more + * + * \since 1.8.8 + * + */ H5_DLL herr_t H5Pget_mpio_actual_chunk_opt_mode(hid_t plist_id, H5D_mpio_actual_chunk_opt_mode_t *actual_chunk_opt_mode); +/** + * \ingroup DXPL + * + * \brief Retrieves the type of I/O that HDF5 actually performed on the last + * parallel I/O call (not necessarily the type requested) + * + * \dxpl_id{plist_id} + * \param[out] actual_io_mode The type of I/O performed by this process + * \return \herr_t + * + * \par Motivation: + * A user can request collective I/O via a data transfer property list + * (DXPL) that has been suitably modified with H5Pset_dxpl_mpio(). + * However, HDF5 will sometimes ignore this request and perform independent + * I/O instead. This property allows the user to see what kind of I/O HDF5 + * actually performed. Used in conjunction with H5Pget_mpio_actual_chunk_opt_mode(), + * this property allows the user to determine exactly what HDF5 did when + * attempting collective I/O. + * + * \details H5Pget_mpio_actual_io_mode() retrieves the type of I/O performed on + * the selection of the current process. This property is set after all + * I/O is completed; if I/O fails, it will not be set. + * + * Valid values returned in \p actual_io_mode: + * \snippet this H5D_mpio_actual_io_mode_t_snip + * \click4more + * + * \attention All processes do not need to have the same value. For example, if + * I/O is being performed using the multi chunk optimization scheme, + * one process's selection may include only chunks accessed + * collectively, while another may include chunks accessed + * independently. In this case, the first process will report + * #H5D_MPIO_CHUNK_COLLECTIVE while the second will report + * #H5D_MPIO_CHUNK_INDEPENDENT. 
+ * + * \see H5Pget_mpio_no_collective_cause(), H5Pget_mpio_actual_chunk_opt_mode() + * + * \since 1.8.8 + * + */ H5_DLL herr_t H5Pget_mpio_actual_io_mode(hid_t plist_id, H5D_mpio_actual_io_mode_t *actual_io_mode); +/** + * \ingroup DXPL + * + * \brief Retrieves local and global causes that broke collective I/O on the last + * parallel I/O call + * + * \dxpl_id{plist_id} + * \param[out] local_no_collective_cause An enumerated set value indicating the + * causes that prevented collective I/O in the local process + * \param[out] global_no_collective_cause An enumerated set value indicating + * the causes across all processes that prevented collective I/O + * \return \herr_t + * + * \par Motivation: + * A user can request collective I/O via a data transfer property list (DXPL) + * that has been suitably modified with H5P_SET_DXPL_MPIO. However, there are + * conditions that can cause HDF5 to forgo collective I/O and perform + * independent I/O. Such causes can be different across the processes of a + * parallel application. This function allows the user to determine what + * caused the HDF5 library to skip collective I/O locally, that is in the + * local process, and globally, across all processes. + * + * \details H5Pget_mpio_no_collective_cause() serves two purposes. It can be + * used to determine whether collective I/O was used for the last + * preceding parallel I/O call. If collective I/O was not used, the + * function retrieves the local and global causes that broke collective + * I/O on that parallel I/O call. The properties retrieved by this + * function are set before I/O takes place and are retained even when + * I/O fails. 
+ * + * Valid values returned in \p local_no_collective_cause and \p + * global_no_collective_cause are as follows or, if there are multiple + * causes, a bitwise OR of the relevant causes; the numbers in the + * center column are the bitmask values: + * \snippet this H5D_mpio_no_collective_cause_t_snip + * \click4more + * + * \attention Each process determines whether it can perform collective I/O and + * broadcasts the result. Those results are combined to make a + * collective decision; collective I/O will be performed only if all + * processes can perform collective I/O.\n + * If collective I/O was not used, the causes that prevented it are + * reported by individual process by means of an enumerated set. The + * causes may differ among processes, so H5Pget_mpio_no_collective_cause() + * returns two property values. The first value is the one produced + * by the local process to report local causes. This local information + * is encoded in an enumeration, the \ref H5D_mpio_no_collective_cause_t + * described above, with all individual causes combined into a single + * enumeration value by means of a bitwise OR operation. The second + * value reports global causes; this global value is the result of a + * bitwise-OR operation across the values returned by all the processes. 
+ * + * \since 1.8.10 + * + */ H5_DLL herr_t H5Pget_mpio_no_collective_cause(hid_t plist_id, uint32_t *local_no_collective_cause, uint32_t *global_no_collective_cause); #endif /* H5_HAVE_PARALLEL */ @@ -5880,6 +8052,36 @@ H5_DLL herr_t H5Pget_create_intermediate_group(hid_t plist_id, unsigned *crt_int H5_DLL herr_t H5Pset_create_intermediate_group(hid_t plist_id, unsigned crt_intmd); /* Group creation property list (GCPL) routines */ + +/** + * \ingroup GCPL + * + * \brief Returns the estimated link count and average link name length in a group + * + * \gcpl_id{plist_id} + * \param[out] est_num_entries The estimated number of links in the group + * referenced by \p plist_id + * \param[out] est_name_len The estimated average length of link names in the group + * referenced by \p plist_id + * \return \herr_t + * + * \details H5Pget_est_link_info() retrieves two settings from the group creation + * property list \p plist_id: the estimated number of links that are + * expected to be inserted into a group created with the property list + * and the estimated average length of those link names. + * + * The estimated number of links is returned in \p est_num_entries. The + * limit for \p est_num_entries is 64 K. + * + * The estimated average length of the anticipated link names is returned + * in \p est_name_len. The limit for \p est_name_len is 64 K. + * + * See \ref_group_impls for a discussion of the available types of HDF5 + * group structures. 
+ * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pget_est_link_info(hid_t plist_id, unsigned *est_num_entries /* out */, unsigned *est_name_len /* out */); /** @@ -5949,7 +8151,64 @@ H5_DLL herr_t H5Pget_link_creation_order(hid_t plist_id, unsigned *crt_order_fla */ H5_DLL herr_t H5Pget_link_phase_change(hid_t plist_id, unsigned *max_compact /*out*/, unsigned *min_dense /*out*/); +/** + * \ingroup GCPL + * + * \brief Retrieves the anticipated size of the local heap for original-style + * groups + * + * \gcpl_id{plist_id} + * \param[out] size_hint Anticipated size of local heap + * \return \herr_t + * + * \details H5Pget_local_heap_size_hint() queries the group creation property + * list, \p plist_id, for the anticipated size of the local heap, \p + * size_hint, for original-style groups, i.e., for groups of the style + * used prior to HDF5 Release 1.8.0. See H5Pset_local_heap_size_hint() + * for further discussion. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pget_local_heap_size_hint(hid_t plist_id, size_t *size_hint /*out*/); +/** + * \ingroup GCPL + * + * \brief Sets estimated number of links and length of link names in a group + * + * \gcpl_id{plist_id} + * \param[in] est_num_entries Estimated number of links to be inserted into group + * \param[in] est_name_len Estimated average length of link names + * \return \herr_t + * + * \details H5Pset_est_link_info() inserts two settings into the group creation + * property list plist_id: the estimated number of links that are + * expected to be inserted into a group created with the property list + * and the estimated average length of those link names. + * + * The estimated number of links is passed in \p est_num_entries. The + * limit for \p est_num_entries is 64 K. + * + * The estimated average length of the anticipated link names is passed + * in \p est_name_len. The limit for \p est_name_len is 64 K. 
+ * + * The values for these two settings are multiplied to compute the + * initial local heap size (for old-style groups, if the local heap + * size hint is not set) or the initial object header size (for + * new-style compact groups; see \ref_group_impls). Accurately setting + * these parameters will help reduce wasted file space. + * + * If a group is expected to have many links and to be stored in dense + * format, set \p est_num_entries to 0 (zero) for maximum + * efficiency. This will prevent the group from being created in the + * compact format. + * + * See \ref_group_impls for a discussion of the available types of HDF5 + * group structures. + * + * \since 1.8.0 + * + */ H5_DLL herr_t H5Pset_est_link_info(hid_t plist_id, unsigned est_num_entries, unsigned est_name_len); /** * \ingroup GCPL @@ -6039,15 +8298,119 @@ H5_DLL herr_t H5Pset_link_creation_order(hid_t plist_id, unsigned crt_order_flag * the number of links falls below this threshold are * automatically converted to compact format. * - * \since 1.8.0 + * \since 1.8.0 + * + */ +H5_DLL herr_t H5Pset_link_phase_change(hid_t plist_id, unsigned max_compact, unsigned min_dense); +/** + * \ingroup GCPL + * + * \brief Specifies the anticipated maximum size of a local heap + * + * \gcpl_id{plist_id} + * \param[in] size_hint Anticipated maximum size in bytes of local heap + * \return \herr_t + * + * \details H5Pset_local_heap_size_hint() is used with original-style HDF5 + * groups (see “Motivation” below) to specify the anticipated maximum + * local heap size, \p size_hint, for groups created with the group + * creation property list \p plist_id. The HDF5 library then uses \p + * size_hint to allocate contiguous local heap space in the file for + * each group created with \p plist_id. 
+ * + * For groups with many members or very few members, an appropriate + * initial value of \p size_hint would be the anticipated number of + * group members times the average length of group member names, plus a + * small margin: + * \code + * size_hint = max_number_of_group_members * + * (average_length_of_group_member_link_names + 2) + * \endcode + * If it is known that there will be groups with zero members, the use + * of a group creation property list with \p size_hint set to 1 (one) + * will guarantee the smallest possible local heap for each of those groups. + * + * Setting \p size_hint to zero (0) causes the library to make a + * reasonable estimate for the default local heap size. + * + * \par Motivation: + * In situations where backward-compatibility is required, specifically, when + * libraries prior to HDF5 Release 1.8.0 may be used to read the file, groups + * must be created and maintained in the original style. This is HDF5’s default + * behavior. If backward compatibility with pre-1.8.0 libraries is not a concern, + * greater efficiencies can be obtained with the new-format compact and indexed + * groups. See Group + * implementations in HDF5 in the \ref H5G API introduction (at the bottom).\n + * H5Pset_local_heap_size_hint() is useful for tuning file size when files + * contain original-style groups with either zero members or very large + * numbers of members.\n + * The original style of HDF5 groups, the only style available prior to HDF5 + * Release 1.8.0, was well-suited for moderate-sized groups but was not optimized + * for either very small or very large groups. This original style remains the + * default, but two new group implementations were introduced in HDF5 Release 1.8.0: + * compact groups to accommodate zero to small numbers of members and indexed groups + * for thousands or tens of thousands of members ... 
or millions, if that's what + * your application requires.\n + * The local heap size hint, \p size_hint, is a performance tuning parameter for + * original-style groups. As indicated above, an HDF5 group may have zero, a handful, + * or tens of thousands of members. Since the original style of HDF5 groups stores the + * metadata for all of these group members in a uniform format in a local heap, the size + * of that metadata (and hence, the size of the local heap) can vary wildly from group + * to group. To intelligently allocate space and to avoid unnecessary fragmentation of + * the local heap, it can be valuable to provide the library with a hint as to the local + * heap’s likely eventual size. This can be particularly valuable when it is known that + * a group will eventually have a great many members. It can also be useful in conserving + * space in a file when it is known that certain groups will never have any members. + * + * \since 1.8.0 + * + */ +H5_DLL herr_t H5Pset_local_heap_size_hint(hid_t plist_id, size_t size_hint); + +/* Map access property list (MAPL) routines */ +#ifdef H5_HAVE_MAP_API +/** + * \ingroup MAPL + * + * \brief Set map iteration hints + * + * \mapl_id + * \param[in] key_prefetch_size Number of keys to prefetch at a time during + * iteration + * \param[in] key_alloc_size The initial size of the buffer allocated to hold + * prefetched keys + * \return \herr_t + * + * \details H5Pset_map_iterate_hints() adjusts the behavior of H5Miterate() when + * prefetching keys for iteration. The \p key_prefetch_size parameter + * specifies the number of keys to prefetch at a time during + * iteration. The \p key_alloc_size parameter specifies the initial + * size of the buffer allocated to hold these prefetched keys. If this + * buffer is too small it will be reallocated to a larger size, though + * this may result in an additional I/O. + * + * \since 1.12.? 
+ * */ -H5_DLL herr_t H5Pset_link_phase_change(hid_t plist_id, unsigned max_compact, unsigned min_dense); -H5_DLL herr_t H5Pset_local_heap_size_hint(hid_t plist_id, size_t size_hint); - -/* Map access property list (MAPL) routines */ -#ifdef H5_HAVE_MAP_API H5_DLL herr_t H5Pset_map_iterate_hints(hid_t mapl_id, size_t key_prefetch_size, size_t key_alloc_size); +/** + * \ingroup MAPL + * + * \brief Gets map iteration hints + * + * \mapl_id + * \param[out] key_prefetch_size Pointer to number of keys to prefetch at a time + * during iteration + * \param[out] key_alloc_size Pointer to the initial size of the buffer allocated + * to hold prefetched keys + * \return \herr_t + * + * \details H5Pget_map_iterate_hints() returns the map iterate hints, \p key_prefetch_size + * and \p key_alloc_size, as set by H5Pset_map_iterate_hints(). + * + * \since 1.12.? + * + */ H5_DLL herr_t H5Pget_map_iterate_hints(hid_t mapl_id, size_t *key_prefetch_size /*out*/, size_t *key_alloc_size /*out*/); #endif /* H5_HAVE_MAP_API */ @@ -6409,9 +8772,6 @@ H5_DLL herr_t H5Pset_elink_acc_flags(hid_t lapl_id, unsigned flags); * * * - * \todo Add Programming Note for C++ Developers Using C Functions - * - * * \since 1.8.3 * */ @@ -6594,10 +8954,7 @@ H5_DLL herr_t H5Pset_nlinks(hid_t plist_id, size_t nlinks); * \li H5Pget_mcdt_search_cb() * \li H5Pset_copy_object() * \li H5Pset_mcdt_search_cb() - * - * \todo missing link to "Copying Committed Datatypes with H5Ocopy - A - * comprehensive discussion of copying committed datatypes (PDF) - * in Advanced Topics in HDF5 + * \li \ref_h5ocopy * * \since 1.8.9 * @@ -6711,8 +9068,7 @@ H5_DLL herr_t H5Pget_copy_object(hid_t plist_id, unsigned *copy_options /*out*/) * \li H5Pget_mcdt_search_cb() * \li H5Pset_copy_object() * \li H5Pset_mcdt_search_cb() - * - * \todo Link to Copying Committed Datatypes with H5Ocopy was removed. 
+ * \li \ref_h5ocopy * * \since 1.8.9 * @@ -6803,8 +9159,8 @@ H5_DLL herr_t H5Pget_mcdt_search_cb(hid_t plist_id, H5O_mcdt_search_cb_t *func, * \li H5Pget_mcdt_search_cb() * \li H5Pset_copy_object() * \li H5Pset_mcdt_search_cb() + * \li \ref_h5ocopy * - * \todo Link to Copying Committed Datatypes with H5Ocopy was removed. * \version 1.8.9 #H5O_COPY_MERGE_COMMITTED_DTYPE_FLAG added in this release. * * \since 1.8.0 @@ -6890,9 +9246,7 @@ H5_DLL herr_t H5Pset_copy_object(hid_t plist_id, unsigned copy_options); * \li H5Pget_mcdt_search_cb() * \li H5Pset_copy_object() * \li H5Pset_mcdt_search_cb() - * - * \todo Link removed to "Copying Committed Datatypes with H5Ocopy" in Advanced Topics in HDF5 - * \todo Programming Note for C++ Developers Using C Functions: + * \li \ref_h5ocopy * * \since 1.8.9 * @@ -6911,38 +9265,316 @@ H5_DLL herr_t H5Pset_mcdt_search_cb(hid_t plist_id, H5O_mcdt_search_cb_t func, v #define H5P_NO_CLASS H5P_ROOT /* Typedefs */ +/** + * \ingroup GPLOA + * + * \brief Registers a permanent property with a property list class + * + * \plistcls_id{cls_id} + * \param[in] name Name of property to register + * \param[in] size Size of property in bytes + * \param[in] def_value Default value for property in newly created + * property lists + * \param[in] prp_create Callback routine called when a property list is + * being created and the property value will be + * initialized + * \param[in] prp_set Callback routine called before a new value is + * copied into the property's value + * \param[in] prp_get Callback routine called when a property value is + * retrieved from the property + * \param[in] prp_del Callback routine called when a property is deleted + * from a property list + * \param[in] prp_copy Callback routine called when a property is copied + * from a property list + * \param[in] prp_close Callback routine called when a property list is + * being closed and the property value will be + * disposed of + * + * \return \herr_t + * + * \deprecated As of 
HDF5-1.8 this function was deprecated in favor of + * H5Pregister2() or the macro H5Pregister(). + * + * \details H5Pregister1() registers a new property with a property list + * class. The property will exist in all property list objects + * of that class after this routine is finished. The name of + * the property must not already exist. The default property + * value must be provided and all new property lists created + * with this property will have the property value set to the + * default provided. Any of the callback routines may be set + * to NULL if they are not needed. + * + * Zero-sized properties are allowed and do not store any data in + * the property list. These may be used as flags to indicate the + * presence or absence of a particular piece of information. The + * default pointer for a zero-sized property may be set to NULL. + * The property \p prp_create and \p prp_close callbacks are called for + * zero-sized properties, but the \p prp_set and \p prp_get callbacks + * are never called. + * + * The \p prp_create routine is called when a new property list with + * this property is being created. The #H5P_prp_create_func_t + * callback function is defined as #H5P_prp_cb1_t. + * + * The \p prp_create routine may modify the value to be set and those + * changes will be stored as the initial value of the property. + * If the \p prp_create routine returns a negative value, the new + * property value is not copied into the property and the + * \p prp_create routine returns an error value. + * + * The \p prp_set routine is called before a new value is copied into + * the property. The #H5P_prp_set_func_t callback function is defined + * as #H5P_prp_cb2_t. + * + * The \p prp_set routine may modify the value pointer to be set and + * those changes will be used when setting the property's value. 
+ * If the \p prp_set routine returns a negative value, the new property + * value is not copied into the property and the \p prp_set routine + * returns an error value. The \p prp_set routine will not be called + * for the initial value; only the \p prp_create routine will be + * called. + * + * \b Note: The \p prp_set callback function may be useful to range + * check the value being set for the property or may perform some + * transformation or translation of the value set. The \p prp_get + * callback would then reverse the transformation or translation. + * A single \p prp_get or \p prp_set callback could handle multiple + * properties by performing different actions based on the property + * name or other properties in the property list. + * + * The \p prp_get routine is called when a value is retrieved from a + * property value. The #H5P_prp_get_func_t callback function is + * defined as #H5P_prp_cb2_t. + * + * The \p prp_get routine may modify the value to be returned from the + * query and those changes will be returned to the calling routine. + * If the \p prp_get routine returns a negative value, the query + * routine returns an error value. + * + * The \p prp_del routine is called when a property is being + * deleted from a property list. The #H5P_prp_delete_func_t + * callback function is defined as #H5P_prp_cb2_t. + * + * The \p prp_del routine may modify the value passed in, but the + * value is not used by the library when the \p prp_del routine + * returns. If the \p prp_del routine returns a negative value, + * the property list deletion routine returns an error value but + * the property is still deleted. + * + * The \p prp_copy routine is called when a new property list with + * this property is being created through a copy operation. + * The #H5P_prp_copy_func_t callback function is defined as + * #H5P_prp_cb1_t.
+ * + * The \p prp_copy routine may modify the value to be set and those + * changes will be stored as the new value of the property. If + * the \p prp_copy routine returns a negative value, the new + * property value is not copied into the property and the \p prp_copy + * routine returns an error value. + * + * The \p prp_close routine is called when a property list with this + * property is being closed. The #H5P_prp_close_func_t callback + * function is defined as #H5P_prp_cb1_t. + * + * The \p prp_close routine may modify the value passed in, but the + * value is not used by the library when the \p prp_close routine + * returns. If the \p prp_close routine returns a negative value, the + * property list close routine returns an error value but the property + * list is still closed. + * + * The #H5P_prp_cb1_t is as follows: + * \snippet this H5P_prp_cb1_t_snip + * + * The #H5P_prp_cb2_t is as follows: + * \snippet this H5P_prp_cb2_t_snip + * + * + * \cpp_c_api_note + * + */ /* Function prototypes */ H5_DLL herr_t H5Pregister1(hid_t cls_id, const char *name, size_t size, void *def_value, H5P_prp_create_func_t prp_create, H5P_prp_set_func_t prp_set, H5P_prp_get_func_t prp_get, H5P_prp_delete_func_t prp_del, H5P_prp_copy_func_t prp_copy, H5P_prp_close_func_t prp_close); +/** + * \ingroup GPLOA + * + * \brief Registers a temporary property with a property list + * + * \plist_id + * \param[in] name Name of property to create + * \param[in] size Size of property in bytes + * \param[in] value Initial value for the property + * \param[in] prp_set Callback routine called before a new value is copied + * into the property's value + * \param[in] prp_get Callback routine called when a property value is + * retrieved from the property + * \param[in] prp_delete Callback routine called when a property is deleted + * from a property list + * \param[in] prp_copy Callback routine called when a property is copied + * from an existing property list + * \param[in] prp_close Callback 
routine called when a property list is + * being closed and the property value will be disposed + * of + * + * \return \herr_t + * + * \deprecated As of HDF5-1.8 this function was deprecated in favor of + * H5Pinsert2() or the macro H5Pinsert(). + * + * \details H5Pinsert1() creates a new property in a property + * list. The property will exist only in this property list and + * copies made from it. + * + * The initial property value must be provided in \p value and + * the property value will be set accordingly. + * + * The name of the property must not already exist in this list, + * or this routine will fail. + * + * The \p prp_set and \p prp_get callback routines may be set to NULL + * if they are not needed. + * + * Zero-sized properties are allowed and do not store any data + * in the property list. The default value of a zero-sized + * property may be set to NULL. They may be used to indicate the + * presence or absence of a particular piece of information. + * + * The \p prp_set routine is called before a new value is copied + * into the property. The #H5P_prp_set_func_t callback function + * is defined as #H5P_prp_cb2_t. + * The \p prp_set routine may modify the value pointer to be set and + * those changes will be used when setting the property's value. + * If the \p prp_set routine returns a negative value, the new property + * value is not copied into the property and the \p prp_set routine + * returns an error value. The \p prp_set routine will be called for + * the initial value. + * + * \b Note: The \p prp_set callback function may be useful to range + * check the value being set for the property or may perform some + * transformation or translation of the value set. The \p prp_get + * callback would then reverse the transformation or translation. + * A single \p prp_get or \p prp_set callback could handle multiple + * properties by performing different actions based on the + * property name or other properties in the property list.
+ * + * The \p prp_get routine is called when a value is retrieved from + * a property value. The #H5P_prp_get_func_t callback function + * is defined as #H5P_prp_cb2_t. + * + * The \p prp_get routine may modify the value to be returned from + * the query and those changes will be preserved. If the \p prp_get + * routine returns a negative value, the query routine returns + * an error value. + * + * The \p prp_delete routine is called when a property is being + * deleted from a property list. The #H5P_prp_delete_func_t + * callback function is defined as #H5P_prp_cb2_t. + * + * The \p prp_copy routine is called when a new property list with + * this property is being created through a copy operation. + * The #H5P_prp_copy_func_t callback function is defined as + * #H5P_prp_cb1_t. + * + * The \p prp_copy routine may modify the value to be set and those + * changes will be stored as the new value of the property. If the + * \p prp_copy routine returns a negative value, the new property value + * is not copied into the property and the \p prp_copy routine returns an + * error value. + * + * The \p prp_close routine is called when a property list with this + * property is being closed. + * The #H5P_prp_close_func_t callback function is defined as + * #H5P_prp_cb1_t. + * + * The \p prp_close routine may modify the value passed in, but the + * value is not used by the library when the \p prp_close routine + * returns. If the \p prp_close routine returns a negative value, + * the property list close routine returns an error value + * but the property list is still closed. + * + * \b Note: There is no \p prp_create callback routine for temporary + * property list objects; the initial value is assumed to + * have any necessary setup already performed on it.
+ * + * The #H5P_prp_cb1_t is as follows: + * \snippet this H5P_prp_cb1_t_snip + * + * The #H5P_prp_cb2_t is as follows: + * \snippet this H5P_prp_cb2_t_snip + + * \cpp_c_api_note + */ H5_DLL herr_t H5Pinsert1(hid_t plist_id, const char *name, size_t size, void *value, H5P_prp_set_func_t prp_set, H5P_prp_get_func_t prp_get, H5P_prp_delete_func_t prp_delete, H5P_prp_copy_func_t prp_copy, H5P_prp_close_func_t prp_close); +/** + * \ingroup GPLO + * + * \brief Encodes the property values in a property list into a binary + * buffer + * + * \plist_id + * \param[out] buf Buffer into which the property list will be encoded. + * If the provided buffer is NULL, the size of the + * buffer required is returned through \p nalloc; the + * function does nothing more. + * \param[out] nalloc The size of the required buffer + * + * \return \herr_t + * + * \deprecated As of HDF5-1.12 this function has been deprecated in favor of + * H5Pencode2() or the macro H5Pencode(). + * + * \details H5Pencode1() encodes the property list \p plist_id into the + * binary buffer \p buf. + * + * If the required buffer size is unknown, \p buf can be passed + * in as NULL and the function will set the required buffer size + * in \p nalloc. The buffer can then be created and the property + * list encoded with a subsequent H5Pencode1() call. + * + * If the buffer passed in is not big enough to hold the encoded + * properties, the H5Pencode1() call can be expected to fail with + * a segmentation fault. + * + * Properties that do not have encode callbacks will be skipped. + * There is currently no mechanism to register an encode callback for + * a user-defined property, so user-defined properties cannot currently + * be encoded. + * + * Some properties cannot be encoded, particularly properties that are + * reliant on local context. 
+ * + * \since 1.10.0 + * + */ H5_DLL herr_t H5Pencode1(hid_t plist_id, void *buf, size_t *nalloc); /** - * \ingroup OCPL + * \ingroup DCPL * * \brief Returns information about a filter in a pipeline (DEPRECATED) * - * \todo H5Pget_filter1() prototype does not match source in H5Pocpl.c. - * Also, it is not in a deprecated file. Is that okay? + * * * \plist_id{plist_id} - * \param[in] filter Sequence number within the filter pipeline of the filter - * for which information is sought - * \param[out] flags Bit vector specifying certain general properties of - * the filter + * \param[in] filter Sequence number within the filter pipeline of + * the filter for which information is sought + * \param[out] flags Bit vector specifying certain general properties + * of the filter * \param[in,out] cd_nelmts Number of elements in \p cd_values - * \param[out] cd_values Auxiliary data for the filter - * \param[in] namelen Anticipated number of characters in \p name - * \param[out] name Name of the filter + * \param[out] cd_values Auxiliary data for the filter + * \param[in] namelen Anticipated number of characters in \p name + * \param[out] name Name of the filter * * \return Returns the filter identifier if successful; Otherwise returns * a negative value. See: #H5Z_filter_t * + * \deprecated When was this function deprecated? + * * \details H5Pget_filter1() returns information about a filter, specified * by its filter number, in a filter pipeline, specified by the * property list with which it is associated. 
@@ -6977,13 +9609,127 @@ H5_DLL herr_t H5Pencode1(hid_t plist_id, void *buf, size_t *nalloc); H5_DLL H5Z_filter_t H5Pget_filter1(hid_t plist_id, unsigned filter, unsigned int *flags /*out*/, size_t *cd_nelmts /*out*/, unsigned cd_values[] /*out*/, size_t namelen, char name[]); -H5_DLL herr_t H5Pget_filter_by_id1(hid_t plist_id, H5Z_filter_t id, unsigned int *flags /*out*/, - size_t *cd_nelmts /*out*/, unsigned cd_values[] /*out*/, size_t namelen, - char name[] /*out*/); -H5_DLL herr_t H5Pget_version(hid_t plist_id, unsigned *boot /*out*/, unsigned *freelist /*out*/, - unsigned *stab /*out*/, unsigned *shhdr /*out*/); -H5_DLL herr_t H5Pset_file_space(hid_t plist_id, H5F_file_space_type_t strategy, hsize_t threshold); -H5_DLL herr_t H5Pget_file_space(hid_t plist_id, H5F_file_space_type_t *strategy, hsize_t *threshold); +/** + * \ingroup DCPL + * + * \brief Returns information about the specified filter + * + * \plist_id{plist_id} + * \param[in] id Filter identifier + * \param[out] flags Bit vector specifying certain general properties + * of the filter + * \param[in,out] cd_nelmts Number of elements in \p cd_values + * \param[out] cd_values Auxiliary data for the filter + * \param[in] namelen Anticipated number of characters in \p name + * \param[out] name Name of the filter + * + * + * \return Returns a non-negative value if successful; Otherwise returns + * a negative value. + * + * \deprecated As of HDF5-1.8 this function was deprecated in favor of + * H5Pget_filter_by_id2() or the macro H5Pget_filter_by_id(). + * + * \details H5Pget_filter_by_id1() returns information about a filter, specified + * in \p id, a filter identifier. + * + * \p plist_id must be a dataset or group creation property list and + * \p id must be in the associated filter pipeline. + * + * The \p id and \p flags parameters are used in the same + * manner as described in the discussion of H5Pset_filter(). 
+ * + * Aside from the fact that they are used for output, the parameters + * \p cd_nelmts and \p cd_values[] are used in the same manner as + * described in the discussion of H5Pset_filter(). + * On input, the \p cd_nelmts parameter indicates the number of entries + * in the \p cd_values[] array allocated by the calling program; + * on exit it contains the number of values defined by the filter. + * + * On input, the \p namelen parameter indicates the number of + * characters allocated for the filter name by the calling program + * in the array \p name[]. On exit \p name[] contains the name of the + * filter with one character of the name in each element of the array. + * + * If the filter specified in \p id is not set for the property + * list, an error will be returned and this function will fail. + * + * + * \version 1.8.5 Function extended to work with group creation property + * lists. + * \version 1.8.0 Function H5Pget_filter_by_id() renamed to + * H5Pget_filter_by_id1() and deprecated in this release. + * \version 1.6.0 Function introduced in this release. + */ +H5_DLL herr_t H5Pget_filter_by_id1(hid_t plist_id, H5Z_filter_t id, unsigned int *flags /*out*/, + size_t *cd_nelmts /*out*/, unsigned cd_values[] /*out*/, size_t namelen, + char name[] /*out*/); +/** + * \ingroup FCPL + * + * \brief Retrieves the version information of various objects + * for a file creation property list (deprecated) + * + * \plist_id + * \param[out] boot Pointer to location to return superblock version number + * \param[out] freelist Pointer to location to return global freelist version number + * \param[out] stab Pointer to location to return symbol table version number + * \param[out] shhdr Pointer to location to return shared object header version + * number + * + * \return \herr_t + * + * \deprecated Deprecated in favor of the function H5Fget_info() + * + * \details H5Pget_version() retrieves the version information of various objects + * for a file creation property list.
Any pointer parameters which are + * passed as NULL are not queried. + * + * \version 1.6.4 \p boot, \p freelist, \p stab, \p shhdr parameter types + * changed to unsigned. + * + */ +H5_DLL herr_t H5Pget_version(hid_t plist_id, unsigned *boot /*out*/, unsigned *freelist /*out*/, + unsigned *stab /*out*/, unsigned *shhdr /*out*/); +/** + * \ingroup FCPL + * + * \brief Sets the file space handling strategy and the free-space section + * size threshold. + * + * \fcpl_id{plist_id} + * \param[in] strategy The file space handling strategy to be used. See: + * #H5F_fspace_strategy_t + * \param[in] threshold The smallest free-space section size that the free + * space manager will track + * + * \return \herr_t + * + * \deprecated When was this function deprecated? + * + * \details Maps to the function H5Pset_file_space_strategy(). + * + */ +H5_DLL herr_t H5Pset_file_space(hid_t plist_id, H5F_file_space_type_t strategy, hsize_t threshold); +/** + * \ingroup FCPL + * + * \brief Retrieves the file space handling strategy, and threshold value for + * a file creation property list + * + * \fcpl_id{plist_id} + * \param[out] strategy Pointer to the file space handling strategy + * \param[out] threshold Pointer to the free-space section size threshold value + * + * \return \herr_t + * + * \deprecated When was this function deprecated? + * + * \details Maps to the function H5Pget_file_space_strategy() + * + * + */ +H5_DLL herr_t H5Pget_file_space(hid_t plist_id, H5F_file_space_type_t *strategy, hsize_t *threshold); #endif /* H5_NO_DEPRECATED_SYMBOLS */ #ifdef __cplusplus diff --git a/src/H5Rpublic.h b/src/H5Rpublic.h index 5d356e9..b33960e 100644 --- a/src/H5Rpublic.h +++ b/src/H5Rpublic.h @@ -40,40 +40,54 @@ /* Public Typedefs */ /*******************/ -/* +//! +/** * Reference types allowed. - * DO NOT CHANGE THE ORDER or VALUES as reference type values are encoded into - * the datatype message header. 
+ * + * \internal DO NOT CHANGE THE ORDER or VALUES as reference type values are + * encoded into the datatype message header. */ typedef enum { - H5R_BADTYPE = (-1), /* Invalid reference type */ - H5R_OBJECT1 = 0, /* Backward compatibility (object) */ - H5R_DATASET_REGION1 = 1, /* Backward compatibility (region) */ - H5R_OBJECT2 = 2, /* Object reference */ - H5R_DATASET_REGION2 = 3, /* Region reference */ - H5R_ATTR = 4, /* Attribute Reference */ - H5R_MAXTYPE = 5 /* Highest type (invalid) */ + H5R_BADTYPE = (-1), /**< Invalid reference type */ + H5R_OBJECT1 = 0, /**< Backward compatibility (object) */ + H5R_DATASET_REGION1 = 1, /**< Backward compatibility (region) */ + H5R_OBJECT2 = 2, /**< Object reference */ + H5R_DATASET_REGION2 = 3, /**< Region reference */ + H5R_ATTR = 4, /**< Attribute Reference */ + H5R_MAXTYPE = 5 /**< Highest type (invalid) */ } H5R_type_t; +//! /* Deprecated types are kept for backward compatibility with previous versions */ +//! /** - * Deprecated object reference type that is used with deprecated reference APIs. - * Note! This type can only be used with the "native" HDF5 VOL connector. + * \deprecated Deprecated object reference type that is used with deprecated + * reference APIs. + * + * \note This type can only be used with the "native" HDF5 VOL connector. */ typedef haddr_t hobj_ref_t; +//! +//! /** - * Dataset region reference type that is used with deprecated reference APIs. - * (Buffer to store heap ID and index) - * This needs to be large enough to store largest haddr_t in a worst case + * Buffer to store heap ID and index + * + * This needs to be large enough to store largest #haddr_t in a worst case * machine (8 bytes currently) plus an int. - * Note! This type can only be used with the "native" HDF5 VOL connector. + * + * \deprecated Dataset region reference type that is used with deprecated + * reference APIs. + * + * \note This type can only be used with the "native" HDF5 VOL connector. 
*/ typedef struct { uint8_t __data[H5R_DSET_REG_REF_BUF_SIZE]; } hdset_reg_ref_t; +//! +//! /** * Opaque reference type. The same reference type is used for object, * dataset region and attribute references. This is the type that @@ -81,10 +95,11 @@ typedef struct { */ typedef struct { union { - uint8_t __data[H5R_REF_BUF_SIZE]; /* opaque data */ - int64_t align; /* ensures alignment */ + uint8_t __data[H5R_REF_BUF_SIZE]; /**< opaque data */ + int64_t align; /**< ensures alignment */ } u; } H5R_ref_t; +//! /********************/ /* Public Variables */ @@ -121,8 +136,8 @@ extern "C" { * must be of the same type as the object being referenced, that is * a group, dataset or committed datatype property list. * - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * * H5Rdestroy() should be used to release the resource from the * reference. @@ -157,14 +172,12 @@ H5_DLL herr_t H5Rcreate_object(hid_t loc_id, const char *name, hid_t oapl_id, H5 * must be of the same type as the object being referenced, that is * a dataset property list in this case. * - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * * H5Rdestroy() should be used to release the resource from the * reference. * - * \see function_name() - * */ H5_DLL herr_t H5Rcreate_region(hid_t loc_id, const char *name, hid_t space_id, hid_t oapl_id, H5R_ref_t *ref_ptr); @@ -196,8 +209,8 @@ H5_DLL herr_t H5Rcreate_region(hid_t loc_id, const char *name, hid_t space_id, h * as that object, that is a group, dataset or committed datatype * property list. 
* - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * * H5Rdestroy() should be used to release the resource from the * reference. @@ -216,12 +229,12 @@ H5_DLL herr_t H5Rcreate_attr(hid_t loc_id, const char *name, const char *attr_na * * \return \herr_t * - * \details Given a reference, ref_ptr, to an object, region or attribute - * attached to an object, H5R_DESTROY releases allocated resources + * \details Given a reference, \p ref_ptr, to an object, region or attribute + * attached to an object, H5Rdestroy() releases allocated resources * from a previous create call. * - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * */ H5_DLL herr_t H5Rdestroy(H5R_ref_t *ref_ptr); @@ -236,23 +249,20 @@ H5_DLL herr_t H5Rdestroy(H5R_ref_t *ref_ptr); * * \param[in] ref_ptr Pointer to reference * - * \return Returns a valid reference type if successful; otherwise returns #H5R_UNKNOWN. + * \return Returns a valid reference type if successful; otherwise returns #H5R_BADTYPE . * * \details Given a reference, \p ref_ptr, H5Rget_type() returns the * type of the reference. * * Valid returned reference types are: + * \snippet this H5R_type_t_snip * - * #H5R_OBJECT2 Object reference version 2 - * #H5R_DATASET_REGION2 Region reference version 2 - * #H5R_ATTRIBUTE Attribute reference - * - * Note that #H5R_OBJECT1 and #H5R_DATASET REGION1 can never be - * associated to an H5R_ref_t reference and can therefore never be + * Note that #H5R_OBJECT1 and #H5R_DATASET_REGION1 can never be + * associated to an \ref H5R_ref_t reference and can therefore never be * returned through that function. 
* - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * */ H5_DLL H5R_type_t H5Rget_type(const H5R_ref_t *ref_ptr); @@ -273,8 +283,8 @@ H5_DLL H5R_type_t H5Rget_type(const H5R_ref_t *ref_ptr); * \details H5Requal() determines whether two references point to the * same object, region or attribute. * - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * */ H5_DLL htri_t H5Requal(const H5R_ref_t *ref1_ptr, const H5R_ref_t *ref2_ptr); @@ -320,8 +330,8 @@ H5_DLL herr_t H5Rcopy(const H5R_ref_t *src_ref_ptr, H5R_ref_t *dst_ref_ptr); * must be of the same type as the object being referenced, that is * a group or dataset property list. * - * H5R_ref_t is defined in H5Rpublic.h as: typedef unsigned char - * H5R_ref_t[#H5R_REF_BUF_SIZE]; + * \ref H5R_ref_t is defined in H5Rpublic.h as: + * \snippet this H5R_ref_t_snip * * The object opened with this function should be closed when it * is no longer needed so that resource leaks will not develop. 
Use @@ -333,22 +343,8 @@ H5_DLL hid_t H5Ropen_object(H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t oapl_id); /** * -------------------------------------------------------------------------- - * \ingroup H5R - * - * \brief Asynchronous version of H5Ropen_object() - * - * \app_file - * \app_func - * \app_line - * \param[out] ref_ptr Pointer to reference to open - * \rapl_id - * \oapl_id - * \es_id - * - * \return \hid_t{object} - * - * \see H5Ropen_object() - * + * \ingroup ASYNC + * \async_variant_of{H5Ropen_object} */ H5_DLL hid_t H5Ropen_object_async(const char *app_file, const char *app_func, unsigned app_line, H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t oapl_id, hid_t es_id); @@ -388,22 +384,8 @@ H5_DLL hid_t H5Ropen_region(H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t oapl_id); /** * -------------------------------------------------------------------------- - * \ingroup H5R - * - * \brief Asynchronous version of H5Ropen_region() - * - * \app_file - * \app_func - * \app_line - * \param[in] ref_ptr Pointer to reference to open - * \rapl_id - * \oapl_id - * \es_id - * - * \return \hid_t{dataspace} - * - * \see H5Ropen_region() - * + * \ingroup ASYNC + * \async_variant_of{H5Ropen_region} */ H5_DLL hid_t H5Ropen_region_async(const char *app_file, const char *app_func, unsigned app_line, H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t oapl_id, hid_t es_id); @@ -440,22 +422,8 @@ H5_DLL hid_t H5Ropen_attr(H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t aapl_id); /** * -------------------------------------------------------------------------- - * \ingroup H5R - * - * \brief Asynchronous version of H5Ropen_attr() - * - * \app_file - * \app_func - * \app_line - * \param[in] ref_ptr Pointer to reference to open - * \rapl_id - * \aapl_id - * \es_id - * - * \return \hid_t{attribute} - * - * \see H5Ropen_attr() - * + * \ingroup ASYNC + * \async_variant_of{H5Ropen_attr} */ H5_DLL hid_t H5Ropen_attr_async(const char *app_file, const char *app_func, unsigned app_line, H5R_ref_t *ref_ptr, hid_t rapl_id, hid_t
aapl_id, hid_t es_id); @@ -485,10 +453,7 @@ H5_DLL hid_t H5Ropen_attr_async(const char *app_file, const char *app_func, unsi * Upon success, the function returns in \p obj_type the type of * the object that the reference points to. Valid values for this * referenced object type are as followed (defined in H5Opublic.h): - * - * H5O_TYPE_GROUP Object is a group - * H5O_TYPE_DATASET Object is a dataset - * H5O_TYPE_NAMED_DATATYPE Object is a named datatype + * \snippet H5Opublic.h H5O_type_t_snip * */ H5_DLL herr_t H5Rget_obj_type3(H5R_ref_t *ref_ptr, hid_t rapl_id, H5O_type_t *obj_type); @@ -540,12 +505,12 @@ H5_DLL ssize_t H5Rget_file_name(const H5R_ref_t *ref_ptr, char *name, size_t siz * \details H5Rget_obj_name() retrieves the object name for the object, * region or attribute reference pointed to by \p ref_ptr. * - * The parameter \p rapl id is a reference access property list + * The parameter \p rapl_id is a reference access property list * identifier for the reference. The access property list can * be used to access external files that the reference points to * (through a file access property list). * - * Up to size characters of the name are returned in name; additional + * Up to size characters of the name are returned in \p name; additional * characters, if any, are not returned to the user application. If * the length of the name, which determines the required value of * \p size, is unknown, a preliminary call to H5Rget_obj_name() call @@ -553,8 +518,8 @@ H5_DLL ssize_t H5Rget_file_name(const H5R_ref_t *ref_ptr, char *name, size_t siz * object name. That value can then be passed in for \p size in the * second call to H5Rget_obj_name(), which will retrieve the actual * name. If there is no name associated with the object identifier - * or if the name is #NULL, H5Rget_obj_name() returns the size of - * the name buffer (the size does not include the #NULL terminator). 
+ * or if the name is NULL, H5Rget_obj_name() returns the size of + * the name buffer (the size does not include the \c \0 terminator). * * If \p ref_ptr is an object reference, \p name will be returned with * a name for the referenced object. If \p ref_ptr is a dataset region @@ -624,15 +589,334 @@ H5_DLL ssize_t H5Rget_attr_name(const H5R_ref_t *ref_ptr, char *name, size_t siz /* Function prototypes */ #ifndef H5_NO_DEPRECATED_SYMBOLS +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Retrieves the type of object that an object reference points to + * + * \param[in] id The dataset containing the reference object or the group + * containing that dataset + * \param[in] ref_type Type of reference to query + * \param[in] ref Reference to query + * + * \return Returns a valid object type if successful; otherwise returns a + * negative value (#H5G_UNKNOWN). + * + * \deprecated This function has been renamed from H5Rget_obj_type() and is + * deprecated in favor of the macro H5Rget_obj_type() or the + * function H5Rget_obj_type2(). + * + * \details Given an object reference, \p ref, H5Rget_obj_type1() returns the + * type of the referenced object. + * + * A \Emph{reference type} is the type of reference, either an object + * reference or a dataset region reference. An \Emph{object reference} + * points to an HDF5 object while a \Emph{dataset region reference} + * points to a defined region within a dataset. + * + * The \Emph{referenced object} is the object the reference points + * to. The \Emph{referenced object type}, or the type of the referenced + * object, is the type of the object that the reference points to. + * + * The location identifier, \p id, is the identifier for either the + * dataset containing the object reference or the group containing that + * dataset. 
+ * + * Valid reference types, to pass in as \p ref_type, include the + * following: + * \snippet this H5R_type_t_snip + * + * If the application does not already know the object reference type, + * that can be determined with three preliminary calls: + * + * \li Call H5Dget_type() on the dataset containing the reference to + * get a datatype identifier for the dataset’s datatype. + * \li Using that datatype identifier, H5Tget_class() returns a datatype + * class.\n If the datatype class is #H5T_REFERENCE, H5Tequal() can + * then be used to determine whether the reference’s datatype is + * #H5T_STD_REF_OBJ or #H5T_STD_REF_DSETREG: + * - If the datatype is #H5T_STD_REF_OBJ, the reference object type + * is #H5R_OBJECT. + * - If the datatype is #H5T_STD_REF_DSETREG, the reference object + * type is #H5R_DATASET_REGION. + * + * When the function completes successfully, it returns one of the + * following valid object type values (defined in H5Gpublic.h): + * \snippet H5Gpublic.h H5G_obj_t_snip + * + * \version 1.8.0 Function H5Rget_obj_type() renamed to H5Rget_obj_type1() and + * deprecated in this release. + * \since 1.6.0 + * + */ H5_DLL H5G_obj_t H5Rget_obj_type1(hid_t id, H5R_type_t ref_type, const void *ref); -H5_DLL hid_t H5Rdereference1(hid_t obj_id, H5R_type_t ref_type, const void *ref); + +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Opens the HDF5 object referenced + * + * \obj_id + * \param[in] ref_type The reference type of \p ref + * \param[in] ref Reference to open + * + * \return Returns identifier of referenced object if successful; otherwise + * returns a negative value. + * + * \deprecated This function has been renamed from H5Rdereference() and is + * deprecated in favor of the macro H5Rdereference() or the function + * H5Rdereference2(). 
+ * + * \details Given a reference, \p ref, to an object or a region in an object, + * H5Rdereference1() opens that object and returns an identifier. + * + * The parameter \p obj_id must be a valid identifier for an object in + * the HDF5 file containing the referenced object, including the file + * identifier. + * + * The parameter \p ref_type specifies the reference type of the + * reference \p ref. \p ref_type may contain either of the following + * values: + * - #H5R_OBJECT + * - #H5R_DATASET_REGION + * + * The object opened with this function should be closed when it is no + * longer needed so that resource leaks will not develop. Use the + * appropriate close function such as H5Oclose() or H5Dclose() for + * datasets. + * + * \version 1.10.0 Function H5Rdereference() renamed to H5Rdereference1() and + * deprecated in this release. + * \since 1.8.0 + * + */ +H5_DLL hid_t H5Rdereference1(hid_t obj_id, H5R_type_t ref_type, const void *ref); #endif /* H5_NO_DEPRECATED_SYMBOLS */ -H5_DLL herr_t H5Rcreate(void *ref, hid_t loc_id, const char *name, H5R_type_t ref_type, hid_t space_id); -H5_DLL herr_t H5Rget_obj_type2(hid_t id, H5R_type_t ref_type, const void *ref, H5O_type_t *obj_type); -H5_DLL hid_t H5Rdereference2(hid_t obj_id, hid_t oapl_id, H5R_type_t ref_type, const void *ref); -H5_DLL hid_t H5Rget_region(hid_t dataset, H5R_type_t ref_type, const void *ref); +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Creates a reference + * + * \param[out] ref Reference created by the function call + * \param[in] loc_id Location identifier used to locate the object being pointed to + * \param[in] name Name of object at location \p loc_id + * \param[in] ref_type Type of reference + * \param[in] space_id Dataspace identifier with selection. 
Used only for + * dataset region references; pass as -1 if reference is + * an object reference, i.e., of type #H5R_OBJECT + * + * \return \herr_t + * + * \details H5Rcreate() creates the reference, \p ref, of the type specified in + * \p ref_type, pointing to the object \p name located at \p loc_id. + * + * The HDF5 library maps the void type specified above for \p ref to + * the type specified in \p ref_type, which will be one of the following: + * \snippet this H5R_type_t_snip + * + * The parameters \p loc_id and \p name are used to locate the object. + * + * The parameter \p space_id identifies the dataset region that a + * dataset region reference points to. This parameter is used only with + * dataset region references and should be set to -1 if the reference + * is an object reference, #H5R_OBJECT. + * + * \since 1.8.0 + */ +H5_DLL herr_t H5Rcreate(void *ref, hid_t loc_id, const char *name, H5R_type_t ref_type, hid_t space_id); + +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Retrieves the type of object that an object reference points to + * + * \param[in] id The dataset containing the reference object or the group + * containing that dataset + * \param[in] ref_type Type of reference to query + * \param[in] ref Reference to query + * \param[out] obj_type Type of referenced object + * + * \return \herr_t + * + * \details Given an object reference, \p ref, H5Rget_obj_type2() returns the + * type of the referenced object in \p obj_type. + * + * A \Emph{reference type} is the type of reference, either an object + * reference or a dataset region reference. An \Emph{object reference} + * points to an HDF5 object while a \Emph{dataset region reference} + * points to a defined region within a dataset. + * + * The \Emph{referenced object} is the object the reference points + * to. 
The \Emph{referenced object type}, or the type of the referenced + * object, is the type of the object that the reference points to. + * + * The location identifier, \p id, is the identifier for either the + * dataset containing the object reference or the group containing that + * dataset. + * + * Valid reference types, to pass in as \p ref_type, include the + * following: + * \snippet this H5R_type_t_snip + * + * If the application does not already know the object reference type, + * that can be determined with three preliminary calls: + * + * \li Call H5Dget_type() on the dataset containing the reference to + * get a datatype identifier for the dataset’s datatype. + * \li Using that datatype identifier, H5Tget_class() returns a datatype + * class.\n If the datatype class is #H5T_REFERENCE, H5Tequal() can + * then be used to determine whether the reference’s datatype is + * #H5T_STD_REF_OBJ or #H5T_STD_REF_DSETREG: + * - If the datatype is #H5T_STD_REF_OBJ, the reference object type + * is #H5R_OBJECT. + * - If the datatype is #H5T_STD_REF_DSETREG, the reference object + * type is #H5R_DATASET_REGION. + * + * When the function completes successfully, it returns one of the + * following valid object type values (defined in H5Opublic.h): + * \snippet H5Opublic.h H5O_type_t_snip + * + * \since 1.8.0 + * + */ +H5_DLL herr_t H5Rget_obj_type2(hid_t id, H5R_type_t ref_type, const void *ref, H5O_type_t *obj_type); + +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Opens the HDF5 object referenced + * + * \obj_id + * \oapl_id + * \param[in] ref_type The reference type of \p ref + * \param[in] ref Reference to open + * + * \return Returns identifier of referenced object if successful; otherwise + * returns a negative value. + * + * \details Given a reference, \p ref, to an object or a region in an object, + * H5Rdereference2() opens that object and returns an identifier. 
+ * + * The parameter \p obj_id must be a valid identifier for the HDF5 file + * containing the referenced object or for any object in that HDF5 + * file. + * + * The parameter \p oapl_id is an object access property list + * identifier for the referenced object. The access property list must + * be of the same type as the object being referenced, that is a group, + * dataset, or datatype property list. + * + * The parameter \p ref_type specifies the reference type of the + * reference \p ref. \p ref_type may contain either of the following + * values: + * - #H5R_OBJECT + * - #H5R_DATASET_REGION + * + * The object opened with this function should be closed when it is no + * longer needed so that resource leaks will not develop. Use the + * appropriate close function such as H5Oclose() or H5Dclose() for + * datasets. + * + * \since 1.10.0 + * + */ +H5_DLL hid_t H5Rdereference2(hid_t obj_id, hid_t oapl_id, H5R_type_t ref_type, const void *ref); + +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Sets up a dataspace and selection as specified by a region reference + * + * \param[in] dataset File identifier or identifier for any object in the file + * containing the referenced region + * \param[in] ref_type Reference type of \p ref, which must be #H5R_DATASET_REGION + * \param[in] ref Region reference to open + * + * \return Returns a valid dataspace identifier if successful; otherwise returns + * a negative value. + * + * \details H5Rget_region() creates a copy of the dataspace of the dataset + * pointed to by a region reference, \p ref, and defines a selection + * matching the selection pointed to by ref within the dataspace copy. + * + * \p dataset is used to identify the file containing the referenced + * region; it can be a file identifier or an identifier for any object + * in the file. 
+ * + * The parameter \p ref_type specifies the reference type of \p ref and + * must contain the value #H5R_DATASET_REGION. + * + * Use H5Sclose() to release the dataspace identifier returned by this + * function when the identifier is no longer needed. + * + */ +H5_DLL hid_t H5Rget_region(hid_t dataset, H5R_type_t ref_type, const void *ref); + +/** + * -------------------------------------------------------------------------- + * \ingroup H5R + * + * \brief Retrieves a name for a referenced object + * + * \param[in] loc_id Identifier for the file containing the reference or for + * any object in that file + * \param[in] ref_type Type of reference + * \param[in] ref An object or dataset region reference + * \param[out] name A buffer to place the name of the referenced object or + * dataset region. If \c NULL, then this call will return the + * size in bytes of the name. + * \param[in] size The size of the \p name buffer. When the size is passed in, + * the \c NULL terminator needs to be included. + * + * \return Returns the length of the name if successful, returning 0 (zero) if + * no name is associated with the identifier. Otherwise returns a + * negative value. + * + * \details H5Rget_name() retrieves a name for the object identified by \p ref.\n + * \p loc_id is used to identify the file containing the reference. It + * can be the file identifier for the file containing the reference or + * an identifier for any object in that file. + * + * \ref H5R_type_t is the reference type of \p ref. Valid values + * include the following: + * \snippet this H5R_type_t_snip + * + * \p ref is the reference for which the target object’s name is + * sought. + * + * If \p ref is an object reference, \p name will be returned with a + * name for the referenced object. If \p ref is a dataset region + * reference, \p name will contain a name for the object containing the + * referenced region. 
+ * + * Up to \p size characters of the name are returned in \p name; + * additional characters, if any, are not returned to the user + * application. + * + * If the length of the name, which determines the required value of \p + * size, is unknown, a preliminary H5Rget_name() call can be made. The + * return value of this call will be the size of the object name. That + * value can then be assigned to \p size for a second H5Rget_name() + * call, which will retrieve the actual name. + * + * If there is no name associated with the object identifier or if the + * \p name is \c NULL, H5Rget_name() returns the size of the \p name + * buffer (the size does not include the \p NULL terminator). + * + * Note that an object in an HDF5 file may have multiple paths if there + * are multiple links pointing to it. This function may return any one + * of these paths. + * + * \since 1.8.0 + */ H5_DLL ssize_t H5Rget_name(hid_t loc_id, H5R_type_t ref_type, const void *ref, char *name, size_t size); #ifdef __cplusplus diff --git a/src/H5Spublic.h b/src/H5Spublic.h index 5037e0a..fd85dcc 100644 --- a/src/H5Spublic.h +++ b/src/H5Spublic.h @@ -364,10 +364,8 @@ H5_DLL hid_t H5Sdecode(const void *buf); * * \note Motivation: This function was introduced in HDF5-1.12 as part of the * H5Sencode() format change to enable 64-bit selection encodings and - * a dataspace selection that is tied to a file. See the New Features - * in HDF5 Release 1.12 as well as the H5Sencode() / H5Sdecode() Format Change RFC. - * - * \todo Fix the references. + * a dataspace selection that is tied to a file. See the \ref_news_112 + * as well as the \ref_sencode_fmt_change. * * \since 1.12.0 * @@ -1069,7 +1067,7 @@ H5_DLL herr_t H5Sselect_copy(hid_t dst_id, hid_t src_id); * buffer as: * \n 0 0 0 0 13 5 11 17 7 21 29 21 * - * \version 1.6.4 C coord parameter type changed to \p const #hsize_t. + * \version 1.6.4 C coord parameter type changed to \p const hsize_t. 
* \version 1.6.4 Fortran \p coord parameter type changed to \p INTEGER(HSIZE_T). * \since 1.0.0 * diff --git a/src/H5Tmodule.h b/src/H5Tmodule.h index 4f9edde..c489edc 100644 --- a/src/H5Tmodule.h +++ b/src/H5Tmodule.h @@ -44,12 +44,8 @@ * \ingroup H5T * \defgroup ENUM Enumeration Datatypes * \ingroup H5T - * \defgroup GTO General Datatype Operations - * \ingroup H5T * \defgroup OPAQUE Opaque Datatypes * \ingroup H5T - * \defgroup STRING String Datatypes - * \ingroup H5T * \defgroup VLEN Variable-length Sequence Datatypes * \ingroup H5T * diff --git a/src/H5Tpublic.h b/src/H5Tpublic.h index 1457053..5301ea2 100644 --- a/src/H5Tpublic.h +++ b/src/H5Tpublic.h @@ -28,7 +28,7 @@ * internal If this goes over 16 types (0-15), the file format will need to * change. */ -//! [H5T_class_t_snip] +//! typedef enum H5T_class_t { H5T_NO_CLASS = -1, /**< error */ H5T_INTEGER = 0, /**< integer types */ @@ -45,12 +45,12 @@ typedef enum H5T_class_t { H5T_NCLASSES /**< sentinel: this must be last */ } H5T_class_t; -//! [H5T_class_t_snip] +//! /** * Byte orders */ -//! [H5T_order_t_snip] +//! typedef enum H5T_order_t { H5T_ORDER_ERROR = -1, /**< error */ H5T_ORDER_LE = 0, /**< little endian */ @@ -60,12 +60,12 @@ typedef enum H5T_order_t { H5T_ORDER_NONE = 4 /**< no particular order (strings, bits,..) */ /*H5T_ORDER_NONE must be last */ } H5T_order_t; -//! [H5T_order_t_snip] +//! /** * Types of integer sign schemes */ -//! [H5T_sign_t_snip] +//! typedef enum H5T_sign_t { H5T_SGN_ERROR = -1, /**< error */ H5T_SGN_NONE = 0, /**< this is an unsigned type */ @@ -73,12 +73,12 @@ typedef enum H5T_sign_t { H5T_NSGN = 2 /** sentinel: this must be last! */ } H5T_sign_t; -//! [H5T_sign_t_snip] +//! /** * Floating-point normalization schemes */ -//! [H5T_norm_t_snip] +//! 
typedef enum H5T_norm_t { H5T_NORM_ERROR = -1, /**< error */ H5T_NORM_IMPLIED = 0, /**< msb of mantissa isn't stored, always 1 */ @@ -86,7 +86,7 @@ typedef enum H5T_norm_t { H5T_NORM_NONE = 2 /**< not normalized */ /*H5T_NORM_NONE must be last */ } H5T_norm_t; -//! [H5T_norm_t_snip] +//! /** * Character set to use for text strings. @@ -141,7 +141,7 @@ typedef enum H5T_str_t { /** * Type of padding to use in other atomic types */ -//! [H5T_pad_t_snip] +//! typedef enum H5T_pad_t { H5T_PAD_ERROR = -1, /**< error */ H5T_PAD_ZERO = 0, /**< always set to zero */ @@ -150,7 +150,7 @@ typedef enum H5T_pad_t { H5T_NPAD = 3 /**< sentinal: THIS MUST BE LAST */ } H5T_pad_t; -//! [H5T_pad_t_snip] +//! /** * Commands sent to conversion functions @@ -173,14 +173,14 @@ typedef enum H5T_bkg_t { /** * Type conversion client data */ -//! [H5T_cdata_t_snip] +//! typedef struct H5T_cdata_t { H5T_cmd_t command; /**< what should the conversion function do? */ H5T_bkg_t need_bkg; /**< is the background buffer needed? */ hbool_t recalc; /**< recalculate private data */ void * priv; /**< private data */ } H5T_cdata_t; -//! [H5T_cdata_t_snip] +//! /** * Conversion function persistence @@ -194,25 +194,32 @@ typedef enum H5T_pers_t { /** * The order to retrieve atomic native datatype */ -//! [H5T_direction_t_snip] +//! typedef enum H5T_direction_t { H5T_DIR_DEFAULT = 0, /**< default direction is inscendent */ H5T_DIR_ASCEND = 1, /**< in inscendent order */ H5T_DIR_DESCEND = 2 /**< in descendent order */ } H5T_direction_t; -//! [H5T_direction_t_snip] +//! 
/** * The exception type passed into the conversion callback function */ typedef enum H5T_conv_except_t { - H5T_CONV_EXCEPT_RANGE_HI = 0, /**< source value is greater than destination's range */ - H5T_CONV_EXCEPT_RANGE_LOW = 1, /**< source value is less than destination's range */ - H5T_CONV_EXCEPT_PRECISION = 2, /**< source value loses precision in destination */ - H5T_CONV_EXCEPT_TRUNCATE = 3, /**< source value is truncated in destination */ - H5T_CONV_EXCEPT_PINF = 4, /**< source value is positive infinity(floating number) */ - H5T_CONV_EXCEPT_NINF = 5, /**< source value is negative infinity(floating number) */ - H5T_CONV_EXCEPT_NAN = 6 /**< source value is NaN(floating number) */ + H5T_CONV_EXCEPT_RANGE_HI = 0, + /**< Source value is greater than destination's range */ + H5T_CONV_EXCEPT_RANGE_LOW = 1, + /**< Source value is less than destination's range */ + H5T_CONV_EXCEPT_PRECISION = 2, + /**< Source value loses precision in destination */ + H5T_CONV_EXCEPT_TRUNCATE = 3, + /**< Source value is truncated in destination */ + H5T_CONV_EXCEPT_PINF = 4, + /**< Source value is positive infinity */ + H5T_CONV_EXCEPT_NINF = 5, + /**< Source value is negative infinity */ + H5T_CONV_EXCEPT_NAN = 6 + /**< Source value is \c NaN (not a number, including \c QNaN and \c SNaN) */ } H5T_conv_except_t; /** @@ -254,17 +261,31 @@ extern "C" { /** * All datatype conversion functions are... */ -//! [H5T_conv_t_snip] +//! typedef herr_t (*H5T_conv_t)(hid_t src_id, hid_t dst_id, H5T_cdata_t *cdata, size_t nelmts, size_t buf_stride, size_t bkg_stride, void *buf, void *bkg, hid_t dset_xfer_plist); -//! [H5T_conv_t_snip] +//! +//! /** - * Exception handler. If an exception like overflow happenes during conversion, - * this function is called if it's registered through H5Pset_type_conv_cb(). + * \brief Exception handler. 
+ * + * \param[in] except_type The kind of exception that occurred + * \param[in] src_id Source datatype identifier + * \param[in] dst_id Destination datatype identifier + * \param[in] src_buf Source data buffer + * \param[in,out] dst_buf Destination data buffer + * \param[in,out] user_data Callback context + * \returns Valid callback function return values are #H5T_CONV_ABORT, + * #H5T_CONV_UNHANDLED and #H5T_CONV_HANDLED. + * + * \details If an exception like overflow happens during conversion, this + * function is called if it's registered through H5Pset_type_conv_cb(). + * */ typedef H5T_conv_ret_t (*H5T_conv_except_func_t)(H5T_conv_except_t except_type, hid_t src_id, hid_t dst_id, void *src_buf, void *dst_buf, void *user_data); +//! /* When this header is included from a private header, don't make calls to H5open() */ #undef H5OPEN @@ -1061,7 +1082,7 @@ H5_DLLVAR hid_t H5T_NATIVE_UINT_FAST64_g; * predefined datatype. * * When creating a variable-length string datatype, \p size must - * be #H5T_VARIABLE. + * be #H5T_VARIABLE; see \ref_vlen_strings. * * When creating a fixed-length string datatype, \p size will * be the length of the string in bytes. The length of the @@ -1075,13 +1096,9 @@ H5_DLLVAR hid_t H5T_NATIVE_UINT_FAST64_g; * The datatype identifier returned from this function should be * released with H5Tclose or resource leaks will result. * - * \since 1.2.0 - * * \see H5Tclose() * - * \todo Original has a reference to “Creating variable-length string - * datatypes”. - * \todo Create an example for H5Tcreate. + * \since 1.2.0 * */ H5_DLL hid_t H5Tcreate(H5T_class_t type, size_t size); @@ -1106,8 +1123,6 @@ H5_DLL hid_t H5Tcreate(H5T_class_t type, size_t size); * The returned datatype identifier should be released with H5Tclose() * to prevent resource leak. * - * \todo Create an example for H5Tcopy(). 
- * */ H5_DLL hid_t H5Tcopy(hid_t type_id); /** @@ -1130,8 +1145,6 @@ H5_DLL herr_t H5Tclose(hid_t type_id); * * \brief Asynchronous version of H5Tclose(). * - * \todo Create an example for H5Tclose_async(). - * */ H5_DLL herr_t H5Tclose_async(const char *app_file, const char *app_func, unsigned app_line, hid_t type_id, hid_t es_id); @@ -1222,8 +1235,6 @@ H5_DLL herr_t H5Tcommit2(hid_t loc_id, const char *name, hid_t type_id, hid_t lc * * \brief Asynchronous version of H5Tcommit2(). * - * \todo Create an example for H5Tcommit_async(). - * */ H5_DLL herr_t H5Tcommit_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t type_id, hid_t lcpl_id, hid_t tcpl_id, hid_t tapl_id, @@ -1257,8 +1268,6 @@ H5_DLL hid_t H5Topen2(hid_t loc_id, const char *name, hid_t tapl_id); * * \brief Asynchronous version of H5Topen2(). * - * \todo Create an example for H5Topen_async(). - * */ H5_DLL hid_t H5Topen_async(const char *app_file, const char *app_func, unsigned app_line, hid_t loc_id, const char *name, hid_t tapl_id, hid_t es_id); @@ -1503,8 +1512,6 @@ H5_DLL herr_t H5Trefresh(hid_t type_id); * * \since 1.2.0 * - * \todo Create example for H5Tinsert - * */ H5_DLL herr_t H5Tinsert(hid_t parent_id, const char *name, size_t offset, hid_t member_id); /** @@ -1663,10 +1670,7 @@ H5_DLL herr_t H5Tenum_valueof(hid_t type, const char *name, void *value /*out*/) * character base type creates a variable-length sequence of strings * (a variable-length, 1-dimensional array), with each element of * the array being of the string or character base type.\n - * To create a variable-length string datatype, see "Creating - * variable-length string datatypes." - * - * \todo Fix the reference. + * To create a variable-length string datatype, see \ref_vlen_strings. * */ H5_DLL hid_t H5Tvlen_create(hid_t base_id); @@ -1870,13 +1874,9 @@ H5_DLL htri_t H5Tdetect_class(hid_t type_id, H5T_class_t cls); * actual data and a size value. 
This function does not return the * size of actual variable-length sequence data. * - * \since 1.2.0 - * * \see H5Tset_size() * - * \todo Original has a reference to “Creating variable-length string datatypes”. - * \todo Create an example for H5Tget_size(). - * + * \since 1.2.0 */ H5_DLL size_t H5Tget_size(hid_t type_id); /** @@ -2090,7 +2090,7 @@ H5_DLL H5T_pad_t H5Tget_inpad(hid_t type_id); */ H5_DLL H5T_str_t H5Tget_strpad(hid_t type_id); /** - * \ingroup COMPOUND + * \ingroup COMPOUND ENUM * * \brief Retrieves the number of elements in a compound or enumeration datatype * @@ -2107,7 +2107,7 @@ H5_DLL H5T_str_t H5Tget_strpad(hid_t type_id); */ H5_DLL int H5Tget_nmembers(hid_t type_id); /** - * \ingroup COMPOUND + * \ingroup COMPOUND ENUM * * \brief Retrieves the name of a compound or enumeration datatype member * @@ -2134,7 +2134,7 @@ H5_DLL int H5Tget_nmembers(hid_t type_id); */ H5_DLL char *H5Tget_member_name(hid_t type_id, unsigned membno); /** - * \ingroup COMPOUND + * \ingroup COMPOUND ENUM * * \brief Retrieves the index of a compound or enumeration datatype member * @@ -2406,6 +2406,7 @@ H5_DLL hid_t H5Tget_native_type(hid_t type_id, H5T_direction_t direction); * * \li Variable-length string datatypes: If \p dtype_id is a * variable-length string, size must normally be set to #H5T_VARIABLE. + * See \ref_vlen_strings. * * \li Compound datatypes: This function may be used to increase or * decrease the size of a compound datatype, but the function will @@ -2416,12 +2417,9 @@ H5_DLL hid_t H5Tget_native_type(hid_t type_id, H5T_direction_t direction); * variable-length array datatypes (#H5T_VLEN), or reference datatypes * (#H5T_REFERENCE). * - * \since 1.2.0 - * * \see H5Tget_size() * - *\todo Create an example for H5Tset_size(). - *\todo Original has a reference to “Creating variable-length string datatypes”. 
+ * \since 1.2.0 * */ H5_DLL herr_t H5Tset_size(hid_t type_id, size_t size); @@ -2873,7 +2871,7 @@ H5_DLL htri_t H5Tcompiler_conv(hid_t src_id, hid_t dst_id); * enough to hold the larger of the input and output data. * * \version 1.6.3 \p nelmts parameter type changed to size_t. - * \version 1.4.0 \p nelmts parameter type changed to \ref hsize_t. + * \version 1.4.0 \p nelmts parameter type changed to hsize_t. * */ H5_DLL herr_t H5Tconvert(hid_t src_id, hid_t dst_id, size_t nelmts, void *buf, void *background, diff --git a/src/H5VLconnector.h b/src/H5VLconnector.h index ac40e70..2a8ef6e 100644 --- a/src/H5VLconnector.h +++ b/src/H5VLconnector.h @@ -458,7 +458,7 @@ typedef struct H5VL_token_class_t { * \ingroup H5VLDEV * Class information for each VOL connector */ -//! [H5VL_class_t_snip] +//! typedef struct H5VL_class_t { /* Overall connector fields & callbacks */ unsigned version; /**< VOL connector class struct version # */ @@ -492,7 +492,7 @@ typedef struct H5VL_class_t { herr_t (*optional)(void *obj, int op_type, hid_t dxpl_id, void **req, va_list arguments); /**< Optional callback */ } H5VL_class_t; -//! [H5VL_class_t_snip] +//! /********************/ /* Public Variables */ @@ -529,15 +529,14 @@ extern "C" { * uncommon, as most VOL-specific properties are added to the file * access property list via the connector's API calls which set the * VOL connector for the file open/create. For more information, see - * the VOL documentation. + * the \ref_vol_doc. * * H5VL_class_t is defined in H5VLconnector.h in the source code. It * contains class information for each VOL connector: * \snippet this H5VL_class_t_snip * - * \todo Fix the reference to VOL documentation. 
- * * \since 1.12.0 + * */ H5_DLL hid_t H5VLregister_connector(const H5VL_class_t *cls, hid_t vipl_id); /** diff --git a/src/H5VLmodule.h b/src/H5VLmodule.h index 78c5986..009c0e5 100644 --- a/src/H5VLmodule.h +++ b/src/H5VLmodule.h @@ -32,6 +32,10 @@ * \brief Virtual Object Layer Interface * \todo Describe concisely what the functions in this module are about. * + * \defgroup ASYNC Asynchronous Functions + * \brief Asynchronous Functions + * \todo Describe concisely what the functions in this module are about. + * * \defgroup H5VLDEF Definitions * \ingroup H5VL * \defgroup H5VLDEV VOL Developer diff --git a/src/H5VLpublic.h b/src/H5VLpublic.h index ce50653..c5f85dc 100644 --- a/src/H5VLpublic.h +++ b/src/H5VLpublic.h @@ -93,9 +93,9 @@ * connectors. Subsequent values should be obtained from the HDF5 * development team at mailto:help@hdfgroup.org. */ -//! [H5VL_class_value_t_snip] +//! typedef int H5VL_class_value_t; -//! [H5VL_class_value_t_snip] +//! /** * \ingroup H5VLDEF @@ -153,11 +153,10 @@ extern "C" { * uncommon, as most VOL-specific properties are added to the file * access property list via the connector's API calls which set the * VOL connector for the file open/create. For more information, see - * the VOL documentation. - * - * \todo Fix the reference to VOL documentation. + * \ref_vol_doc. * * \since 1.12.0 + * */ H5_DLL hid_t H5VLregister_connector_by_name(const char *connector_name, hid_t vipl_id); /** @@ -192,11 +191,10 @@ H5_DLL hid_t H5VLregister_connector_by_name(const char *connector_name, hid_t vi * uncommon, as most VOL-specific properties are added to the file * access property list via the connector's API calls which set the * VOL connector for the file open/create. For more information, see - * the VOL documentation. - * - * \todo Fix the reference to VOL documentation. + * the \ref_vol_doc. 
* * \since 1.12.0 + * */ H5_DLL hid_t H5VLregister_connector_by_value(H5VL_class_value_t connector_value, hid_t vipl_id); /** diff --git a/src/H5Zmodule.h b/src/H5Zmodule.h index 25007b3..76a2380 100644 --- a/src/H5Zmodule.h +++ b/src/H5Zmodule.h @@ -77,11 +77,9 @@ * Custom filters that have been registered with the library will have * additional unique identifiers. * - * See \Emph{HDF5 Dynamically Loaded Filters} for more information on - * how an HDF5 application can apply a filter that is not registered - * with the HDF5 library. - * - * \todo Fix the reference. + * See \ref_dld_filters for more information on how an HDF5 + * application can apply a filter that is not registered with the HDF5 + * library. * * \defgroup H5ZPRE Predefined Filters * \ingroup H5Z diff --git a/src/H5Zpublic.h b/src/H5Zpublic.h index 4c9b006..90277cf 100644 --- a/src/H5Zpublic.h +++ b/src/H5Zpublic.h @@ -230,16 +230,18 @@ typedef enum H5Z_EDC_t { * Return values for filter callback function */ typedef enum H5Z_cb_return_t { - H5Z_CB_ERROR = -1, - H5Z_CB_FAIL = 0, /**< I/O should fail if filter fails. */ - H5Z_CB_CONT = 1, /**< I/O continues if filter fails. */ - H5Z_CB_NO = 2 + H5Z_CB_ERROR = -1, /**< error value */ + H5Z_CB_FAIL = 0, /**< I/O should fail if filter fails. */ + H5Z_CB_CONT = 1, /**< I/O continues if filter fails. */ + H5Z_CB_NO = 2 /**< sentinel */ } H5Z_cb_return_t; +//! /** * Filter callback function definition */ typedef H5Z_cb_return_t (*H5Z_filter_func_t)(H5Z_filter_t filter, void *buf, size_t buf_size, void *op_data); +//! /** * Structure for filter callback property @@ -254,58 +256,93 @@ extern "C" { #endif /** - * \details Before a dataset gets created, the \c can_apply callbacks for any - * filters used in the dataset creation property list are called with - * the dataset's dataset creation property list, the dataset's - * datatype and a dataspace describing a chunk (for chunked dataset - * storage). 
+ * \brief This callback determines if a filter can be applied to the dataset + * with the characteristics provided * - * The \c can_apply callback must determine if the combination of the - * dataset creation property list setting, the datatype and the - * dataspace represent a valid combination to apply this filter to. - * For example, some cases of invalid combinations may involve the - * filter not operating correctly on certain datatypes (or certain - * datatype sizes), or certain sizes of the chunk dataspace. + * \dcpl_id + * \type_id + * \space_id * - * The \c can_apply callback can be the NULL pointer, in which case, - * the library will assume that it can apply to any combination of - * dataset creation property list values, datatypes and dataspaces. + * \return \htri_t * - * The \c can_apply callback returns positive a valid combination, - * zero for an invalid combination and negative for an error. - */ -//! [H5Z_can_apply_func_t_snip] + * \details Before a dataset gets created, the \ref H5Z_can_apply_func_t + * callbacks for any filters used in the dataset creation property list + * are called with the dataset's dataset creation property list, the + * dataset's datatype and a dataspace describing a chunk (for chunked + * dataset storage). + * + * The \ref H5Z_can_apply_func_t callback must determine if the + * combination of the dataset creation property list setting, the + * datatype and the dataspace represent a valid combination to apply + * this filter to. For example, some cases of invalid combinations may + * involve the filter not operating correctly on certain datatypes (or + * certain datatype sizes), or certain sizes of the chunk dataspace. + * + * The \ref H5Z_can_apply_func_t callback can be the NULL pointer, in + * which case, the library will assume that it can apply to any + * combination of dataset creation property list values, datatypes and + * dataspaces. 
+ * + * The \ref H5Z_can_apply_func_t callback returns positive for a valid + * combination, zero for an invalid combination and negative for an + * error. + */ +//! typedef htri_t (*H5Z_can_apply_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id); -//! [H5Z_can_apply_func_t_snip] -/** - * \details After the "can_apply" callbacks are checked for new datasets, the - * \c set_local callbacks for any filters used in the dataset creation - * property list are called. These callbacks receive the dataset's - * private copy of the dataset creation property list passed in to - * H5Dcreate() (i.e. not the actual property list passed in to - * H5Dcreate()) and the datatype ID passed in to H5Dcreate() (which is - * not copied and should not be modified) and a dataspace describing - * the chunk (for chunked dataset storage) (which should also not be - * modified). - * - * The \c set_local callback must set any parameters that are specific - * to this dataset, based on the combination of the dataset creation - * property list values, the datatype and the dataspace. For example, - * some filters perform different actions based on different datatypes - * (or datatype sizes) or different number of dimensions or dataspace - * sizes. +//! +/** + * \brief The callback that sets any dataset-specific parameters for this + * filter * - * The \c set_local callback can be the NULL pointer, in which case, - * the library will assume that there are no dataset-specific settings - * for this filter. + * \dcpl_id + * \type_id + * \space_id * - * The \c set_local callback must return non-negative on success and - * negative for an error. - */ -//! [H5Z_set_local_func_t_snip] + * \return \herr_t + * + * \details After the \ref H5Z_can_apply_func_t callbacks are checked for new + * datasets, the \ref H5Z_set_local_func_t callbacks for any filters + * used in the dataset creation property list are called. 
These + * callbacks receive the dataset's private copy of the dataset creation + * property list passed in to H5Dcreate() (i.e. not the actual property + * list passed in to H5Dcreate()) and the datatype ID passed in to + * H5Dcreate() (which is not copied and should not be modified) and a + * dataspace describing the chunk (for chunked dataset storage) (which + * should also not be modified). + * + * The \ref H5Z_set_local_func_t callback must set any parameters that + * are specific to this dataset, based on the combination of the + * dataset creation property list values, the datatype and the + * dataspace. For example, some filters perform different actions based + * on different datatypes (or datatype sizes) or different number of + * dimensions or dataspace sizes. + * + * The \ref H5Z_set_local_func_t callback can be the NULL pointer, in + * which case, the library will assume that there are no + * dataset-specific settings for this filter. + * + * The \ref H5Z_set_local_func_t callback must return non-negative on + * success and negative for an error. + */ +//! typedef herr_t (*H5Z_set_local_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space_id); -//! [H5Z_set_local_func_t_snip] +//! + /** + * \brief The filter operation callback function, defining a filter's operation + * on data + * + * \param[in] flags Bit vector specifying certain general properties of the filter + * \param[in] cd_nelmts Number of elements in \p cd_values + * \param[in] cd_values Auxiliary data for the filter + * \param[in] nbytes The number of valid bytes in \p buf to be filtered + * \param[in,out] buf_size The size of \p buf + * \param[in,out] buf The filter buffer + * + * \return Returns the number of valid bytes of data contained in \p buf. In the + * case of failure, the return value is 0 (zero) and all pointer + * arguments are left unchanged. 
+ * * \details A filter gets definition flags and invocation flags (defined * above), the client data array and size defined when the filter was * added to the pipeline, the size in bytes of the data on which to @@ -321,15 +358,15 @@ typedef herr_t (*H5Z_set_local_func_t)(hid_t dcpl_id, hid_t type_id, hid_t space * output buffer. If an error occurs then the function should return * zero and leave all pointer arguments unchanged. */ -//! [H5Z_func_t_snip] +//! typedef size_t (*H5Z_func_t)(unsigned int flags, size_t cd_nelmts, const unsigned int cd_values[], size_t nbytes, size_t *buf_size, void **buf); -//! [H5Z_func_t_snip] +//! /** * The filter table maps filter identification numbers to structs that * contain a pointers to the filter function and timing statistics. */ -//! [H5Z_class2_t_snip] +//! typedef struct H5Z_class2_t { int version; /**< Version number of the H5Z_class_t struct */ H5Z_filter_t id; /**< Filter ID number */ @@ -340,7 +377,7 @@ typedef struct H5Z_class2_t { H5Z_set_local_func_t set_local; /**< The "set local" callback for a filter */ H5Z_func_t filter; /**< The actual filter function */ } H5Z_class2_t; -//! [H5Z_class2_t_snip] +//! /** * \ingroup H5Z @@ -635,7 +672,7 @@ H5_DLL herr_t H5Zget_filter_info(H5Z_filter_t filter, unsigned int *filter_confi * The filter table maps filter identification numbers to structs that * contain a pointers to the filter function and timing statistics. */ -//! [H5Z_class1_t_snip] +//! typedef struct H5Z_class1_t { H5Z_filter_t id; /**< Filter ID number */ const char * name; /**< Comment for debugging */ @@ -643,7 +680,7 @@ typedef struct H5Z_class1_t { H5Z_set_local_func_t set_local; /**< The "set local" callback for a filter */ H5Z_func_t filter; /**< The actual filter function */ } H5Z_class1_t; -//! [H5Z_class1_t_snip] +//! 
#endif /* H5_NO_DEPRECATED_SYMBOLS */ diff --git a/src/H5public.h b/src/H5public.h index b319550..751abbe 100644 --- a/src/H5public.h +++ b/src/H5public.h @@ -95,50 +95,130 @@ extern "C" { #define H5_NO_EXPAND(x) (x) /* Version numbers */ -#define H5_VERS_MAJOR 1 /* For major interface/format changes */ -#define H5_VERS_MINOR 13 /* For minor interface/format changes */ -#define H5_VERS_RELEASE 0 /* For tweaks, bug-fixes, or development */ -#define H5_VERS_SUBRELEASE "" /* For pre-releases like snap0 */ -/* Empty string for real releases. */ -#define H5_VERS_INFO "HDF5 library version: 1.13.0" /* Full version string */ +/** + * For major interface/format changes + */ +#define H5_VERS_MAJOR 1 +/** + * For minor interface/format changes + */ +#define H5_VERS_MINOR 13 +/** + * For tweaks, bug-fixes, or development + */ +#define H5_VERS_RELEASE 0 +/** + * For pre-releases like \c snap0. Empty string for official releases. + */ +#define H5_VERS_SUBRELEASE "" +/** + * Full version string + */ +#define H5_VERS_INFO "HDF5 library version: 1.13.0" #define H5check() H5check_version(H5_VERS_MAJOR, H5_VERS_MINOR, H5_VERS_RELEASE) /* macros for comparing the version */ +/** + * \brief Determines whether the version of the library being used is greater + * than or equal to the specified version + * + * \param[in] Maj Major version number - A non-negative integer value + * \param[in] Min Minor version number - A non-negative integer value + * \param[in] Rel Release version number - A non-negative integer value + * \returns A value of 1 is returned if the library version is greater than + * or equal to the version number specified.\n + * A value of 0 is returned if the library version is less than the + * version number specified.\n + * A library version is greater than the specified version number if + * its major version is larger than the specified major version + * number. 
If the major version numbers are the same, it is greater + * than the specified version number if its minor version is larger + * than the specified minor version number. If the minor version + * numbers are the same, then a library version would be greater than + * the specified version number if its release number is larger than + * the specified release number. + * + * \details The #H5_VERSION_GE and #H5_VERSION_LE macros are used at compile + * time to conditionally include or exclude code based on the version + * of the HDF5 library against which an application will be linked. + * + * The #H5_VERSION_GE macro compares the version of the HDF5 library + * being used against the version number specified in the parameters. + * + * For more information about release versioning, see \ref_h5lib_relver. + * + * \since 1.8.7 + * + */ #define H5_VERSION_GE(Maj, Min, Rel) \ (((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR == Min) && (H5_VERS_RELEASE >= Rel)) || \ ((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR > Min)) || (H5_VERS_MAJOR > Maj)) +/** + * \brief Determines whether the version of the library being used is less + * than or equal to the specified version + * + * \param[in] Maj Major version number - A non-negative integer value + * \param[in] Min Minor version number - A non-negative integer value + * \param[in] Rel Release version number - A non-negative integer value + * \returns A value of 1 is returned if the library version is less than + * or equal to the version number specified.\n + * A value of 0 is returned if the library version is greater than the + * version number specified.\n + * A library version is less than the specified version number if + * its major version is smaller than the specified major version + * number. If the major version numbers are the same, it is smaller + * than the specified version number if its minor version is smaller + * than the specified minor version number. 
If the minor version + * numbers are the same, then a library version would be smaller than + * the specified version number if its release number is smaller than + * the specified release number. + * + * \details The #H5_VERSION_GE and #H5_VERSION_LE macros are used at compile + * time to conditionally include or exclude code based on the version + * of the HDF5 library against which an application will be linked. + * + * The #H5_VERSION_LE macro compares the version of the HDF5 library + * being used against the version number specified in the parameters. + * + * For more information about release versioning, see \ref_h5lib_relver. + * + * \since 1.8.7 + * + */ #define H5_VERSION_LE(Maj, Min, Rel) \ (((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR == Min) && (H5_VERS_RELEASE <= Rel)) || \ ((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR < Min)) || (H5_VERS_MAJOR < Maj)) -/* +/** * Status return values. Failed integer functions in HDF5 result almost * always in a negative value (unsigned failing functions sometimes return * zero for failure) while successful return is non-negative (often zero). * The negative failure value is most commonly -1, but don't bet on it. The * proper way to detect failure is something like: - * - * if((dset = H5Dopen2(file, name)) < 0) - * fprintf(stderr, "unable to open the requested dataset\n"); + * \code + * if((dset = H5Dopen2(file, name, H5P_DEFAULT)) < 0) + * fprintf(stderr, "unable to open the requested dataset\n"); + * \endcode */ typedef int herr_t; -/* +/** * Boolean type. Successful return values are zero (false) or positive * (true). The typical true value is 1 but don't bet on it. Boolean - * functions cannot fail. Functions that return `htri_t' however return zero + * functions cannot fail. Functions that return #htri_t however return zero * (false), positive (true), or negative (failure). 
The proper way to test - * for truth from a htri_t function is: - * - * if ((retval = H5Tcommitted(type)) > 0) { - * printf("data type is committed\n"); - * } else if (!retval) { - * printf("data type is not committed\n"); - * } else { - * printf("error determining whether data type is committed\n"); - * } + * for truth from a #htri_t function is: + * \code + * if ((retval = H5Tcommitted(type)) > 0) { + * printf("data type is committed\n"); + * } else if (!retval) { + * printf("data type is not committed\n"); + * } else { + * printf("error determining whether data type is committed\n"); + * } + * \endcode */ #ifdef H5_HAVE_STDBOOL_H #include <stdbool.h> @@ -307,8 +387,7 @@ typedef unsigned long uint32_t; #error "nothing appropriate for uint32_t" #endif -//! [H5_iter_order_t_snip] - +//! /** * Common iteration orders */ @@ -319,8 +398,7 @@ typedef enum { H5_ITER_NATIVE, /**< No particular order, whatever is fastest */ H5_ITER_N /**< Number of iteration orders */ } H5_iter_order_t; - -//! [H5_iter_order_t_snip] +//! /* Iteration callback values */ /* (Actually, any positive value will cause the iterator to stop and pass back @@ -330,8 +408,7 @@ typedef enum { #define H5_ITER_CONT (0) #define H5_ITER_STOP (1) -//! [H5_index_t_snip] - +//! /** * The types of indices on links in groups/attributes on objects. * Primarily used for " by index" routines and for iterating over @@ -343,18 +420,17 @@ typedef enum H5_index_t { H5_INDEX_CRT_ORDER, /**< Index on creation order */ H5_INDEX_N /**< Number of indices defined */ } H5_index_t; - -//! [H5_index_t_snip] +//! /** * Storage info struct used by H5O_info_t and H5F_info_t */ -//! [H5_ih_info_t_snip] +//! typedef struct H5_ih_info_t { hsize_t index_size; /**< btree and/or list */ hsize_t heap_size; } H5_ih_info_t; -//! [H5_ih_info_t_snip] +//! /** * The maximum size allowed for tokens @@ -364,17 +440,17 @@ typedef struct H5_ih_info_t { */ #define H5O_MAX_TOKEN_SIZE (16) -//! [H5O_token_t_snip] - +//! 
/** + * Type for object tokens + * * \internal (Hoisted here, since it's used by both the - * H5Lpublic.h and H5Opublic.h headers) */ -/* Type for object tokens */ + * H5Lpublic.h and H5Opublic.h headers) + */ typedef struct H5O_token_t { uint8_t __data[H5O_MAX_TOKEN_SIZE]; } H5O_token_t; - -//! [H5O_token_t_snip] +//! /** * Allocation statistics info struct -- cgit v0.12
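Reviewer note (illustrative only, not part of the commit): the H5public.h hunk documents two idioms, compile-time version gating with H5_VERSION_GE / H5_VERSION_LE and the three-way test on htri_t results. The sketch below shows both under stated assumptions. The macro bodies are copied from the hunk so the example builds without the HDF5 headers. The names demo_htri_t, demo_has_1_10_api, demo_classify_htri and demo_require_version_macros are hypothetical and do not exist in the library. A real program would include "hdf5.h" and use htri_t directly.

```c
#include <assert.h>

/* Assumption: these definitions mirror the H5public.h hunk in this patch.
 * They are repeated here only so the sketch compiles without HDF5 installed. */
#ifndef H5_VERS_MAJOR
#define H5_VERS_MAJOR   1
#define H5_VERS_MINOR   13
#define H5_VERS_RELEASE 0
#define H5_VERSION_GE(Maj, Min, Rel)                                                   \
    (((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR == Min) && (H5_VERS_RELEASE >= Rel)) || \
     ((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR > Min)) || (H5_VERS_MAJOR > Maj))
#define H5_VERSION_LE(Maj, Min, Rel)                                                   \
    (((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR == Min) && (H5_VERS_RELEASE <= Rel)) || \
     ((H5_VERS_MAJOR == Maj) && (H5_VERS_MINOR < Min)) || (H5_VERS_MAJOR < Maj))
#endif

/* Hypothetical stand-in for htri_t, which is an int in the library. */
typedef int demo_htri_t;

/* Compile-time gating: the preprocessor keeps exactly one branch, depending
 * on the version constants the program is built against. */
int demo_has_1_10_api(void)
{
#if H5_VERSION_GE(1, 10, 0)
    return 1; /* code that relies on 1.10+ features would go here */
#else
    return 0; /* fallback path for older headers */
#endif
}

/* Three-way test for an htri_t-style result: positive means true, zero means
 * false, negative means failure. Never compare against 1 or -1 directly. */
int demo_classify_htri(demo_htri_t status)
{
    if (status > 0)
        return 1;  /* true */
    else if (status == 0)
        return 0;  /* false */
    return -1;     /* error */
}

/* The macros are documented as available since 1.8.7, so any headers that
 * define them should satisfy this check. */
void demo_require_version_macros(void)
{
    assert(H5_VERSION_GE(1, 8, 7));
}
```

With the version constants from this patch (1.13.0), demo_has_1_10_api() takes the first branch. In application code the argument to a helper like demo_classify_htri would be the return value of a call such as H5Tcommitted(type).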