/* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* Copyright by The HDF Group. *
* All rights reserved. *
* *
* This file is part of HDF5. The full HDF5 copyright notice, including *
* terms governing use, modification, and redistribution, is contained in *
* the COPYING file, which can be found at the root of the source code *
* distribution tree, or in https://www.hdfgroup.org/licenses. *
* If you do not have access to either file, you may request a copy from *
* help@hdfgroup.org. *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * */
/*
* Programmer: Quincey Koziol
* Saturday, September 12, 2015
*
* Purpose: This file contains declarations which define macros for the
* H5D package. Including this header means that the source file
* is part of the H5D package.
*/
#ifndef H5Dmodule_H
#define H5Dmodule_H
/* Define the proper control macros for the generic FUNC_ENTER/LEAVE and error
* reporting macros.
*/
#define H5D_MODULE
#define H5_MY_PKG H5D
#define H5_MY_PKG_ERR H5E_DATASET
/** \page H5D_UG HDF5 Datasets
*
* \section sec_dataset HDF5 Datasets
*
* \subsection subsec_dataset_intro Introduction
*
* An HDF5 dataset is an object composed of a collection of data elements, or raw data, and
* metadata that stores a description of the data elements, data layout, and all other information
* necessary to write, read, and interpret the stored data. From the viewpoint of the application, the
* raw data is stored as a one-dimensional or multi-dimensional array of elements. Those elements
* can be any of several numerical or character types, small arrays, or even compound types
* similar to C structs. The dataset object may have attribute objects. See the figure below.
*
*
* <table>
* <caption>Dataset functions</caption>
* <tr><th>Function</th><th>Purpose</th></tr>
* <tr><td>#H5Dcreate</td><td>Creates a dataset at the specified location. The
* C function is a macro: \see \ref api-compat-macros.</td></tr>
* <tr><td>#H5Dcreate_anon</td><td>Creates a dataset in a file without linking it into the file
* structure.</td></tr>
* <tr><td>#H5Dopen</td><td>Opens an existing dataset. The C function is a macro:
* \see \ref api-compat-macros.</td></tr>
* <tr><td>#H5Dclose</td><td>Closes the specified dataset.</td></tr>
* <tr><td>#H5Dget_space</td><td>Returns an identifier for a copy of the dataspace for a
* dataset.</td></tr>
* <tr><td>#H5Dget_space_status</td><td>Determines whether space has been allocated for a
* dataset.</td></tr>
* <tr><td>#H5Dget_type</td><td>Returns an identifier for a copy of the datatype for a
* dataset.</td></tr>
* <tr><td>#H5Dget_create_plist</td><td>Returns an identifier for a copy of the dataset creation
* property list for a dataset.</td></tr>
* <tr><td>#H5Dget_access_plist</td><td>Returns the dataset access property list associated with a
* dataset.</td></tr>
* <tr><td>#H5Dget_offset</td><td>Returns the dataset address in a file.</td></tr>
* <tr><td>#H5Dget_storage_size</td><td>Returns the amount of storage required for a
* dataset.</td></tr>
* <tr><td>#H5Dvlen_get_buf_size</td><td>Determines the number of bytes required to store
* variable-length (VL) data.</td></tr>
* <tr><td>#H5Dvlen_reclaim</td><td>Reclaims VL datatype memory buffers.</td></tr>
* <tr><td>#H5Dread</td><td>Reads raw data from a dataset into a buffer.</td></tr>
* <tr><td>#H5Dwrite</td><td>Writes raw data from a buffer to a dataset.</td></tr>
* <tr><td>#H5Diterate</td><td>Iterates over all selected elements in a dataspace.</td></tr>
* <tr><td>#H5Dgather</td><td>Gathers data from a selection within a memory buffer.</td></tr>
* <tr><td>#H5Dscatter</td><td>Scatters data into a selection within a memory buffer.</td></tr>
* <tr><td>#H5Dfill</td><td>Fills dataspace elements with a fill value in a memory
* buffer.</td></tr>
* <tr><td>#H5Dset_extent</td><td>Changes the sizes of a dataset’s dimensions.</td></tr>
* </table>
*
*
*
*
* <table>
* <caption>Dataset creation property list functions (H5P)</caption>
* <tr><th>Function</th><th>Purpose</th></tr>
* <tr><td>#H5Pset_layout</td><td>Sets the type of storage used to store the raw data for a
* dataset.</td></tr>
* <tr><td>#H5Pget_layout</td><td>Returns the layout of the raw data for a dataset.</td></tr>
* <tr><td>#H5Pset_chunk</td><td>Sets the size of the chunks used to store a chunked layout
* dataset.</td></tr>
* <tr><td>#H5Pget_chunk</td><td>Retrieves the size of chunks for the raw data of a chunked layout
* dataset.</td></tr>
* <tr><td>#H5Pset_deflate</td><td>Sets compression method and compression level.</td></tr>
* <tr><td>#H5Pset_fill_value</td><td>Sets the fill value for a dataset.</td></tr>
* <tr><td>#H5Pget_fill_value</td><td>Retrieves a dataset fill value.</td></tr>
* <tr><td>#H5Pfill_value_defined</td><td>Determines whether the fill value is defined.</td></tr>
* <tr><td>#H5Pset_fill_time</td><td>Sets the time when fill values are written to a
* dataset.</td></tr>
* <tr><td>#H5Pget_fill_time</td><td>Retrieves the time when fill values are written to a
* dataset.</td></tr>
* <tr><td>#H5Pset_alloc_time</td><td>Sets the timing for storage space allocation.</td></tr>
* <tr><td>#H5Pget_alloc_time</td><td>Retrieves the timing for storage space allocation.</td></tr>
* <tr><td>#H5Pset_filter</td><td>Adds a filter to the filter pipeline.</td></tr>
* <tr><td>#H5Pall_filters_avail</td><td>Verifies that all required filters are
* available.</td></tr>
* <tr><td>#H5Pget_nfilters</td><td>Returns the number of filters in the pipeline.</td></tr>
* <tr><td>#H5Pget_filter</td><td>Returns information about a filter in a pipeline.
* The C function is a macro: \see \ref api-compat-macros.</td></tr>
* <tr><td>#H5Pget_filter_by_id</td><td>Returns information about the specified filter.
* The C function is a macro: \see \ref api-compat-macros.</td></tr>
* <tr><td>#H5Pmodify_filter</td><td>Modifies a filter in the filter pipeline.</td></tr>
* <tr><td>#H5Premove_filter</td><td>Deletes one or more filters in the filter pipeline.</td></tr>
* <tr><td>#H5Pset_fletcher32</td><td>Sets up use of the Fletcher32 checksum filter.</td></tr>
* <tr><td>#H5Pset_nbit</td><td>Sets up use of the n-bit filter.</td></tr>
* <tr><td>#H5Pset_scaleoffset</td><td>Sets up use of the scale-offset filter.</td></tr>
* <tr><td>#H5Pset_shuffle</td><td>Sets up use of the shuffle filter.</td></tr>
* <tr><td>#H5Pset_szip</td><td>Sets up use of the Szip compression filter.</td></tr>
* <tr><td>#H5Pset_external</td><td>Adds an external file to the list of external files.</td></tr>
* <tr><td>#H5Pget_external_count</td><td>Returns the number of external files for a
* dataset.</td></tr>
* <tr><td>#H5Pget_external</td><td>Returns information about an external file.</td></tr>
* <tr><td>#H5Pset_char_encoding</td><td>Sets the character encoding used to encode a string. Use
* to set ASCII or UTF-8 character encoding for object names.</td></tr>
* <tr><td>#H5Pget_char_encoding</td><td>Retrieves the character encoding used to create a
* string.</td></tr>
* </table>
*
*
*
*
* <table>
* <caption>Dataset access property list functions (H5P)</caption>
* <tr><th>Function</th><th>Purpose</th></tr>
* <tr><td>#H5Pset_buffer</td><td>Sets type conversion and background buffers.</td></tr>
* <tr><td>#H5Pget_buffer</td><td>Reads buffer settings.</td></tr>
* <tr><td>#H5Pset_chunk_cache</td><td>Sets the raw data chunk cache parameters.</td></tr>
* <tr><td>#H5Pget_chunk_cache</td><td>Retrieves the raw data chunk cache parameters.</td></tr>
* <tr><td>#H5Pset_edc_check</td><td>Sets whether to enable error-detection when reading a
* dataset.</td></tr>
* <tr><td>#H5Pget_edc_check</td><td>Determines whether error-detection is enabled for dataset
* reads.</td></tr>
* <tr><td>#H5Pset_filter_callback</td><td>Sets user-defined filter callback function.</td></tr>
* <tr><td>#H5Pset_data_transform</td><td>Sets a data transform expression.</td></tr>
* <tr><td>#H5Pget_data_transform</td><td>Retrieves a data transform expression.</td></tr>
* <tr><td>#H5Pset_type_conv_cb</td><td>Sets user-defined datatype conversion callback
* function.</td></tr>
* <tr><td>#H5Pget_type_conv_cb</td><td>Gets user-defined datatype conversion callback
* function.</td></tr>
* <tr><td>#H5Pset_hyper_vector_size</td><td>Sets number of I/O vectors to be read/written in
* hyperslab I/O.</td></tr>
* <tr><td>#H5Pget_hyper_vector_size</td><td>Retrieves number of I/O vectors to be read/written in
* hyperslab I/O.</td></tr>
* <tr><td>#H5Pset_btree_ratios</td><td>Sets B-tree split ratios for a dataset transfer property
* list.</td></tr>
* <tr><td>#H5Pget_btree_ratios</td><td>Gets B-tree split ratios for a dataset transfer property
* list.</td></tr>
* <tr><td>#H5Pset_vlen_mem_manager</td><td>Sets the memory manager for variable-length datatype
* allocation in #H5Dread and #H5Dvlen_reclaim.</td></tr>
* <tr><td>#H5Pget_vlen_mem_manager</td><td>Gets the memory manager for variable-length datatype
* allocation in #H5Dread and #H5Dvlen_reclaim.</td></tr>
* <tr><td>#H5Pset_dxpl_mpio</td><td>Sets data transfer mode.</td></tr>
* <tr><td>#H5Pget_dxpl_mpio</td><td>Returns the data transfer mode.</td></tr>
* <tr><td>#H5Pset_dxpl_mpio_chunk_opt</td><td>Sets a flag specifying linked-chunk I/O or
* multi-chunk I/O.</td></tr>
* <tr><td>#H5Pset_dxpl_mpio_chunk_opt_num</td><td>Sets a numeric threshold for linked-chunk
* I/O.</td></tr>
* <tr><td>#H5Pset_dxpl_mpio_chunk_opt_ratio</td><td>Sets a ratio threshold for collective
* I/O.</td></tr>
* <tr><td>#H5Pset_dxpl_mpio_collective_opt</td><td>Sets a flag governing the use of independent
* versus collective I/O.</td></tr>
* <tr><td>#H5Pset_multi_type</td><td>Sets the type of data property for the MULTI
* driver.</td></tr>
* <tr><td>#H5Pget_multi_type</td><td>Retrieves the type of data property for the MULTI
* driver.</td></tr>
* <tr><td>#H5Pset_small_data_block_size</td><td>Sets the size of a contiguous block reserved for
* small data.</td></tr>
* <tr><td>#H5Pget_small_data_block_size</td><td>Retrieves the current small data block size
* setting.</td></tr>
* </table>
*
*
*
* \subsection subsec_dataset_program Programming Model for Datasets
* This section explains the programming model for datasets.
*
* \subsubsection subsubsec_dataset_program_general General Model
*
* The programming model for using a dataset has three main phases:
* \li Obtain access to the dataset
* \li Operate on the dataset using the dataset identifier returned at access
* \li Release the dataset
*
* These three phases or steps are described in more detail below the figure.
*
* A dataset may be opened several times, and operations may be performed through several different
* identifiers for the same dataset. All of these operations affect the same dataset, so the calling
* program must synchronize them itself if the accesses need to be serialized.
*
* Note that the dataset remains open until every identifier is closed. The figure below shows the
* basic sequence of operations.
*
*
* <table>
* <caption>Stages of the data pipeline</caption>
* <tr><th>Layers</th><th>Description</th></tr>
* <tr><td>I/O initiation</td><td>Initiation of HDF5 I/O activities (#H5Dwrite and #H5Dread) in a
* user’s application program.</td></tr>
* <tr><td>Memory hyperslab operation</td><td>Data is scattered to (for read), or gathered from
* (for write) the application’s memory buffer (bypassed if no datatype conversion is
* needed).</td></tr>
* <tr><td>Datatype conversion</td><td>Datatype is converted if it is different between memory and
* storage (bypassed if no datatype conversion is needed).</td></tr>
* <tr><td>File hyperslab operation</td><td>Data is gathered from (for read), or scattered to (for
* write) file space in memory (bypassed if no datatype conversion is needed).</td></tr>
* <tr><td>Filter pipeline</td><td>Data is processed by filters when it passes. Data can be
* modified and restored here (bypassed if no datatype conversion is needed, no filter is enabled,
* or dataset is not chunked).</td></tr>
* <tr><td>Virtual File Layer</td><td>Facilitates easy plug-in of file drivers such as MPIO or
* POSIX I/O.</td></tr>
* <tr><td>Actual I/O</td><td>Actual file driver used by the library such as MPIO or
* STDIO.</td></tr>
* </table>
*
*
*
*
* <table>
* <caption>Data pipeline filters</caption>
* <tr><th>Filter</th><th>Description</th></tr>
* <tr><td>gzip compression</td><td>Data compression using zlib.</td></tr>
* <tr><td>Szip compression</td><td>Data compression using the Szip library. See The HDF Group
* website for more information regarding the Szip filter.</td></tr>
* <tr><td>N-bit compression</td><td>Data compression using an algorithm specialized for n-bit
* datatypes.</td></tr>
* <tr><td>Scale-offset compression</td><td>Data compression using a “scale and offset”
* algorithm.</td></tr>
* <tr><td>Shuffling</td><td>To improve compression performance, data is regrouped by its byte
* position in the data unit. For 4-byte integers, for example, all of the 1st bytes are stored
* together, followed by all of the 2nd, 3rd, and 4th bytes in turn.</td></tr>
* <tr><td>Fletcher32</td><td>Fletcher32 checksum for error-detection.</td></tr>
* </table>
*
*
*
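* The byte regrouping performed by the shuffling filter in the table above can be illustrated with
* the short sketch below. It is a plain C illustration of the idea, not the library's
* implementation; the function names shuffle_bytes and unshuffle_bytes are hypothetical. When
* the elements are small integers, most high-order bytes are zero, so grouping them produces
* long runs that a compressor such as deflate handles well.
*
```c
#include <stddef.h>

// Illustrative byte shuffle: src holds nelem elements of elem_size bytes
// each. dst receives byte 0 of every element, then byte 1 of every
// element, and so on.
void shuffle_bytes(const unsigned char *src, unsigned char *dst,
                   size_t nelem, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelem; e++)
            dst[b * nelem + e] = src[e * elem_size + b];
}

// Inverse operation: restores the original element-by-element layout.
void unshuffle_bytes(const unsigned char *src, unsigned char *dst,
                     size_t nelem, size_t elem_size)
{
    for (size_t b = 0; b < elem_size; b++)
        for (size_t e = 0; e < nelem; e++)
            dst[e * elem_size + b] = src[b * nelem + e];
}
```
*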
* Filters may be used only for chunked data and are applied to chunks of data between the file
* hyperslab stage and the virtual file layer. At this stage in the pipeline, the data is organized as
* fixed-size blocks of elements, and the filter stage processes each chunk separately.
*
* Filters are selected by dataset creation properties, and some behavior may be controlled by data
* transfer properties. The library determines what filters must be applied and applies them in the
* order in which they were set by the application. That is, if an application calls
* #H5Pset_shuffle and then #H5Pset_deflate when creating a dataset’s creation property list, the
* library will apply the shuffle filter first and then the deflate filter.
*
* For more information,
* \li @see @ref subsubsec_dataset_filters_nbit
* \li @see @ref subsubsec_dataset_filters_scale
*
* \subsubsection subsubsec_dataset_transfer_drive File Drivers
* I/O is performed by the HDF5 virtual file layer. The file driver interface writes and reads blocks
* of data; each driver module implements the interface using different I/O mechanisms. The table
* below lists the file drivers currently supported. Note that the I/O mechanisms are separated from
* the pipeline processing: the pipeline and filter operations are identical no matter what data access
* mechanism is used.
*
*
* <table>
* <caption>Storage allocation and fill summary</caption>
* <tr><th>When to allocate space</th><th>When to write fill value</th>
* <th>What fill value to write</th><th>Library create-write-close behavior</th></tr>
* <tr><td>Early</td><td>Never</td><td>-</td>
* <td>Library allocates space when dataset is created, but never writes a fill value to dataset. A
* read of unwritten data returns undefined values.</td></tr>
* <tr><td>Late</td><td>Never</td><td>-</td>
* <td>Library allocates space when dataset is written to, but never writes a fill value to the
* dataset. A read of unwritten data returns undefined values.</td></tr>
* <tr><td>Incremental</td><td>Never</td><td>-</td>
* <td>Library allocates space when a dataset or chunk (whichever is the smaller unit of space)
* is written to, but it never writes a fill value to a dataset or a chunk. A read of unwritten
* data returns undefined values.</td></tr>
* <tr><td>-</td><td>Allocation</td><td>Undefined</td>
* <td>Error on creating the dataset. The dataset is not created.</td></tr>
* <tr><td>Early</td><td>Allocation</td><td>Default or User-defined</td>
* <td>Allocate space for the dataset when the dataset is created. Write the fill value (default or
* user-defined) to the entire dataset when the dataset is created.</td></tr>
* <tr><td>Late</td><td>Allocation</td><td>Default or User-defined</td>
* <td>Allocate space for the dataset when the application first writes data values to the dataset.
* Write the fill value to the entire dataset before writing application data values.</td></tr>
* <tr><td>Incremental</td><td>Allocation</td><td>Default or User-defined</td>
* <td>Allocate space for the dataset when the application first writes data values to the dataset
* or chunk (whichever is the smaller unit of space). Write the fill value to the entire dataset
* or chunk before writing application data values.</td></tr>
* </table>
*
*
*
* During the #H5Dread function call, the library behavior depends on whether space has been
* allocated, whether the fill value has been written to storage, how the fill value is defined, and
* when to write the fill value. The table below summarizes the different behaviors.
*
*