HDF5  1.15.0.112f445
API Reference
 
HDF5 Optimizations APIs (H5DO)

Detailed Description

Bypassing default HDF5 behavior in order to optimize for specific use cases (H5DO)

The HDF5 functions described in this section are implemented in the HDF5 High-level library as optimized functions. These functions generally require careful setup and testing, as they allow an application to bypass portions of the HDF5 library's I/O pipeline for performance reasons.

These functions are included in the standard HDF5 distribution and are available whenever the HDF5 High-level library is available.

Functions

H5_HLDLL herr_t H5DOappend (hid_t dset_id, hid_t dxpl_id, unsigned axis, size_t extension, hid_t memtype, const void *buf)
 Appends data to a dataset along a specified dimension.
 
H5_HLDLL herr_t H5DOwrite_chunk (hid_t dset_id, hid_t dxpl_id, uint32_t filters, const hsize_t *offset, size_t data_size, const void *buf)
 Writes a raw data chunk from a buffer directly to a dataset in a file.
 
H5_HLDLL herr_t H5DOread_chunk (hid_t dset_id, hid_t dxpl_id, const hsize_t *offset, uint32_t *filters, void *buf)
 Reads a raw data chunk directly from a dataset in a file into a buffer.
 

Function Documentation

◆ H5DOappend()

H5_HLDLL herr_t H5DOappend ( hid_t dset_id,
hid_t dxpl_id,
unsigned axis,
size_t extension,
hid_t memtype,
const void * buf )

Appends data to a dataset along a specified dimension.


Parameters
[in]  dset_id    Dataset identifier
[in]  dxpl_id    Dataset transfer property list identifier
[in]  axis       Dataset dimension (0-based) for the append
[in]  extension  Number of elements to append along the axis-th dimension
[in]  memtype    The memory datatype identifier
[in]  buf        Buffer with the data for the append
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

The H5DOappend() routine extends a dataset by extension elements along the dimension specified by axis and writes the elements in buf to that newly extended region of the dataset. The axis value is 0-based. The type of the elements is described by memtype.

This routine combines calling H5Dset_extent(), H5Sselect_hyperslab(), and H5Dwrite() into a single routine that simplifies application development for the common case of appending elements to an existing dataset.

For a multi-dimensional dataset, appending to one dimension will write a contiguous hyperslab over the other dimensions. For example, if a 3-D dataset has dimension sizes (3, 5, 8), extending the 0th dimension (currently of size 3) by 3 will append 3*5*8 = 120 elements (which must be pointed to by the buffer parameter) to the dataset, making its final dimension sizes (6, 5, 8).

If a dataset has more than one unlimited dimension, any of those dimensions may be appended to, although only along one dimension per call to H5DOappend().

Since
1.10.0

◆ H5DOread_chunk()

H5_HLDLL herr_t H5DOread_chunk ( hid_t dset_id,
hid_t dxpl_id,
const hsize_t * offset,
uint32_t * filters,
void * buf )

Reads a raw data chunk directly from a dataset in a file into a buffer.


Parameters
[in]      dset_id  Identifier for the dataset to be read
[in]      dxpl_id  Transfer property list identifier for this I/O operation
[in]      offset   Logical position of the chunk's first element in the dataspace
[in,out]  filters  Mask for identifying the filters used with the chunk
[out]     buf      Buffer containing the chunk read from the dataset
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Deprecated

This function was deprecated as of HDF5 1.10.3, when its functionality was moved to H5Dread_chunk().

For compatibility, this API call has been left as a stub which simply calls H5Dread_chunk(). New code should use H5Dread_chunk().

The H5DOread_chunk() routine reads a raw data chunk, as specified by its logical offset in a chunked dataset dset_id, from the file into the application memory buffer buf. The data in buf is read directly from the file, bypassing the library's internal data transfer pipeline, including filters.

dxpl_id is a data transfer property list identifier.

On return, the filters mask indicates which filters were used when the chunk was written. A zero value indicates that all enabled filters were applied to the chunk. A filter was skipped if the bit corresponding to the filter's position in the pipeline (0 ≤ position < 32) is turned on.

offset is an array specifying the logical position of the first element of the chunk in the dataset's dataspace. The length of the offset array must equal the number of dimensions, or rank, of the dataspace. The values in offset must not exceed the dimension limits and must specify a point that falls on a dataset chunk boundary.

buf is the memory buffer containing the chunk read from the dataset in the file.

Example
The following code illustrates the use of H5DOread_chunk() to read a chunk from a dataset:
#include <zlib.h>
#include <math.h>
#define DEFLATE_SIZE_ADJUST(s) (ceil(((double)(s)) * 1.001) + 12)
:
:
size_t buf_size = CHUNK_NX * CHUNK_NY * sizeof(int);
const Bytef *z_src = (const Bytef *)(direct_buf);
Bytef *z_dst; /* Destination buffer */
uLongf z_dst_nbytes = (uLongf)DEFLATE_SIZE_ADJUST(buf_size);
uLong z_src_nbytes = (uLong)buf_size;
int aggression = 9; /* Compression aggression setting */
uint32_t filter_mask = 0;
/* For H5DOread_chunk() */
void *readbuf = NULL; /* Buffer for reading data */
const Bytef *pt_readbuf; /* Pointer to the buffer for the data read */
uint32_t read_filter_mask = 0; /* Filter mask returned for the chunk read */
hsize_t read_chunk_nbytes; /* Size of the chunk on disk */
int read_dst_buf[CHUNK_NX][CHUNK_NY]; /* Buffer to hold the uncompressed data */
/* Create the dataspace */
if ((dataspace = H5Screate_simple(RANK, dims, maxdims)) < 0)
    goto error;
/* Create a new file */
if ((file = H5Fcreate(FILE_NAME5, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)) < 0)
    goto error;
/* Modify dataset creation properties, i.e. enable chunking and compression */
if ((cparms = H5Pcreate(H5P_DATASET_CREATE)) < 0)
    goto error;
if ((status = H5Pset_chunk(cparms, RANK, chunk_dims)) < 0)
    goto error;
if ((status = H5Pset_deflate(cparms, aggression)) < 0)
    goto error;
/* Create a new dataset within the file using the cparms creation properties */
if ((dset_id = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT, dataspace, H5P_DEFAULT, cparms,
                          H5P_DEFAULT)) < 0)
    goto error;
/* Initialize data for one chunk */
for (i = n = 0; i < CHUNK_NX; i++)
    for (j = 0; j < CHUNK_NY; j++)
        direct_buf[i][j] = n++;
/* Allocate the output (compressed) buffer */
outbuf = malloc(z_dst_nbytes);
z_dst  = (Bytef *)outbuf;
/* Perform compression from the source to the destination buffer */
ret = compress2(z_dst, &z_dst_nbytes, z_src, z_src_nbytes, aggression);
/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "overflow");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "deflate memory error");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "other deflate error");
    goto error;
}
/* Write the compressed chunk data repeatedly to cover all the
 * chunks in the dataset, using the direct write function. */
offset[0] = offset[1] = 0;
for (i = 0; i < NX / CHUNK_NX; i++) {
    for (j = 0; j < NY / CHUNK_NY; j++) {
        status = H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, z_dst_nbytes, outbuf);
        offset[1] += CHUNK_NY;
    }
    offset[0] += CHUNK_NX;
    offset[1] = 0;
}
if (H5Fflush(dset_id, H5F_SCOPE_LOCAL) < 0)
    goto error;
if (H5Dclose(dset_id) < 0)
    goto error;
if ((dset_id = H5Dopen2(file, DATASETNAME, H5P_DEFAULT)) < 0)
    goto error;
/* Select the chunk whose logical position starts at (CHUNK_NX, CHUNK_NY) */
offset[0] = CHUNK_NX;
offset[1] = CHUNK_NY;
/* Get the size of the compressed chunk */
ret = H5Dget_chunk_storage_size(dset_id, offset, &read_chunk_nbytes);
readbuf    = malloc(read_chunk_nbytes);
pt_readbuf = (const Bytef *)readbuf;
/* Use H5DOread_chunk() to read the chunk back */
if ((status = H5DOread_chunk(dset_id, H5P_DEFAULT, offset, &read_filter_mask, readbuf)) < 0)
    goto error;
ret = uncompress((Bytef *)read_dst_buf, (uLongf *)&buf_size, pt_readbuf, (uLong)read_chunk_nbytes);
/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "error: not enough room in output buffer");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "error: not enough memory");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "error: corrupted input data");
    goto error;
}
/* Data verification here */
:
:
Version
1.10.3 Function deprecated in favor of H5Dread_chunk.
Since
1.10.2, 1.8.19

◆ H5DOwrite_chunk()

H5_HLDLL herr_t H5DOwrite_chunk ( hid_t dset_id,
hid_t dxpl_id,
uint32_t filters,
const hsize_t * offset,
size_t data_size,
const void * buf )

Writes a raw data chunk from a buffer directly to a dataset in a file.


Parameters
[in]  dset_id    Identifier for the dataset to write to
[in]  dxpl_id    Transfer property list identifier for this I/O operation
[in]  filters    Mask for identifying the filters in use
[in]  offset     Logical position of the chunk's first element in the dataspace
[in]  data_size  Size in bytes of the actual data to be written
[in]  buf        Buffer containing the data to be written to the chunk
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Deprecated

This function was deprecated as of HDF5 1.10.3, when its functionality was moved to H5Dwrite_chunk().

For compatibility, this API call has been left as a stub which simply calls H5Dwrite_chunk(). New code should use H5Dwrite_chunk().

The H5DOwrite_chunk() routine writes a raw data chunk, as specified by its logical offset in a chunked dataset dset_id, from the application memory buffer buf to the dataset in the file. Typically, the data in buf has been preprocessed in memory by a custom transformation, such as compression. The chunk bypasses the library's internal data transfer pipeline, including filters, and is written directly to the file.

dxpl_id is a data transfer property list identifier.

filters is a mask providing a record of which filters are used with the chunk. The default value of the mask is zero (0), indicating that all enabled filters are applied. A filter is skipped if the bit corresponding to the filter's position in the pipeline (0 ≤ position < 32) is turned on. This mask is saved with the chunk in the file.

offset is an array specifying the logical position of the first element of the chunk in the dataset's dataspace. The length of the offset array must equal the number of dimensions, or rank, of the dataspace. The values in offset must not exceed the dimension limits and must specify a point that falls on a dataset chunk boundary.

data_size is the size in bytes of the chunk, representing the number of bytes to be read from the buffer buf. If the data chunk has been precompressed, data_size should be the size of the compressed data.

buf is the memory buffer containing data to be written to the chunk in the file.

Attention
Exercise caution when using H5DOread_chunk() and H5DOwrite_chunk(), as they read and write data chunks directly in a file. H5DOwrite_chunk() bypasses hyperslab selection, the conversion of data from one datatype to another, and the filter pipeline to write the chunk. Developers should have experience with these processes before using this function. Please see Using the Direct Chunk Write Function for more information.
Note
H5DOread_chunk() and H5DOwrite_chunk() are not supported under parallel HDF5 and do not support variable-length datatypes.
Example
The following code illustrates the use of H5DOwrite_chunk() to write an entire dataset, chunk by chunk:
#include <zlib.h>
#include <math.h>
#define DEFLATE_SIZE_ADJUST(s) (ceil(((double)(s)) * 1.001) + 12)
:
:
size_t buf_size = CHUNK_NX * CHUNK_NY * sizeof(int);
const Bytef *z_src = (const Bytef *)(direct_buf);
Bytef *z_dst; /* Destination buffer */
uLongf z_dst_nbytes = (uLongf)DEFLATE_SIZE_ADJUST(buf_size);
uLong z_src_nbytes = (uLong)buf_size;
int aggression = 9; /* Compression aggression setting */
uint32_t filter_mask = 0;
/* Create the dataspace */
if ((dataspace = H5Screate_simple(RANK, dims, maxdims)) < 0)
    goto error;
/* Create a new file */
if ((file = H5Fcreate(FILE_NAME5, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)) < 0)
    goto error;
/* Modify dataset creation properties, i.e. enable chunking and compression */
if ((cparms = H5Pcreate(H5P_DATASET_CREATE)) < 0)
    goto error;
if ((status = H5Pset_chunk(cparms, RANK, chunk_dims)) < 0)
    goto error;
if ((status = H5Pset_deflate(cparms, aggression)) < 0)
    goto error;
/* Create a new dataset within the file using the cparms creation properties */
if ((dset_id = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT, dataspace, H5P_DEFAULT, cparms,
                          H5P_DEFAULT)) < 0)
    goto error;
/* Initialize data for one chunk */
for (i = n = 0; i < CHUNK_NX; i++)
    for (j = 0; j < CHUNK_NY; j++)
        direct_buf[i][j] = n++;
/* Allocate the output (compressed) buffer */
outbuf = malloc(z_dst_nbytes);
z_dst  = (Bytef *)outbuf;
/* Perform compression from the source to the destination buffer */
ret = compress2(z_dst, &z_dst_nbytes, z_src, z_src_nbytes, aggression);
/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "overflow");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "deflate memory error");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "other deflate error");
    goto error;
}
/* Write the compressed chunk data repeatedly to cover all the
 * chunks in the dataset, using the direct write function. */
offset[0] = offset[1] = 0;
for (i = 0; i < NX / CHUNK_NX; i++) {
    for (j = 0; j < NY / CHUNK_NY; j++) {
        status = H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, z_dst_nbytes, outbuf);
        offset[1] += CHUNK_NY;
    }
    offset[0] += CHUNK_NX;
    offset[1] = 0;
}
/* Overwrite the first chunk with uncompressed data. Set the filter mask to
 * indicate that the compression filter is skipped. */
filter_mask = 0x00000001;
offset[0] = offset[1] = 0;
if (H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, buf_size, direct_buf) < 0)
    goto error;
/* Read the entire dataset back for data verification, converting ints to longs */
if (H5Dread(dset_id, H5T_NATIVE_LONG, H5S_ALL, H5S_ALL, H5P_DEFAULT, outbuf_long) < 0)
    goto error;
/* Data verification here */
:
:
Version
1.10.3 Function deprecated in favor of H5Dwrite_chunk.
Since
1.8.11