Parallel HDF5 Design

 

1. Design Overview

In this section, I first describe the function requirements of the Parallel HDF5 (PHDF5) software and the assumed system requirements. Section 2 describes the programming model of the PHDF5 interface. Section 3 shows an example PHDF5 program.

1.1. Function requirements

1.2. System requirements

2. Programming Model

HDF5 uses an optional access template object to control the file access mechanism. The general model for accessing an HDF5 file in parallel consists of the following steps:

2.1. Setup access template

Each process in the MPI communicator creates an access template and sets it up with the MPI parallel access information (communicator, info object, access mode).

2.1. File open

All processes in the MPI communicator open an HDF5 file with a collective call (H5Fcreate or H5Fopen), passing the access template.

2.2. Dataset open

All processes in the MPI communicator open a dataset with a collective call (H5Dcreate or H5Dopen).  This version supports only collective dataset open.  A future version may allow a dataset to be opened by a subset of the processes that have opened the file.

2.3. Dataset access

2.3.1. Independent dataset access

Each process may perform any number of independent data I/O accesses through independent calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for independent access.  (The default transfer mode is independent.)  If the dataset has an unlimited dimension and an H5Dwrite would write data beyond the current dimension size of the dataset, all processes that have opened the dataset must make a collective call (H5Dallocate) to allocate more space for the dataset BEFORE the independent H5Dwrite call.
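The pre-allocation rule above depends on knowing whether a planned write reaches past the current dataset size. The following is a minimal sketch of that check in plain C. It makes no HDF5 calls. The helper names write_exceeds_extent and grow_to_cover are hypothetical and are not part of the PHDF5 interface, and dimension sizes are assumed to fit in unsigned long long.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a PHDF5 call): true if writing count[i] elements
 * starting at start[i] would pass the current size cur_dims[i] in any
 * dimension. Arithmetic overflow is not checked in this sketch. */
bool write_exceeds_extent(size_t ndims,
                          const unsigned long long *cur_dims,
                          const unsigned long long *start,
                          const unsigned long long *count)
{
    assert(ndims == 0 || (cur_dims && start && count));
    for (size_t i = 0; i < ndims; i++) {
        if (start[i] + count[i] > cur_dims[i])
            return true;
    }
    return false;
}

/* Hypothetical helper: enlarge dims[] in place, where needed, so that it
 * covers the planned write. Dimensions that are already large enough are
 * left unchanged. */
void grow_to_cover(size_t ndims,
                   unsigned long long *dims,
                   const unsigned long long *start,
                   const unsigned long long *count)
{
    assert(ndims == 0 || (dims && start && count));
    for (size_t i = 0; i < ndims; i++) {
        unsigned long long end = start[i] + count[i];
        if (end > dims[i])
            dims[i] = end;
    }
}
```

Because the allocation call is collective, every process would presumably need to arrive at the same new size before making it, for example by combining the per-process results of grow_to_cover with a maximum reduction such as MPI_Allreduce with MPI_MAX.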

2.3.2. Collective dataset access

All processes that have opened the dataset may perform collective data I/O through collective calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for collective access.  Pre-allocation (H5Dallocate) is not needed for unlimited-dimension datasets, because the collective data access call performs the allocation internally when it is needed.
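This document does not prescribe how the processes divide a dataset among themselves for a collective call. As an illustration only, the sketch below assumes one common arrangement: the rows of the dataset are split into contiguous blocks, one block per process, and each process would pass its own block to the collective H5Dread or H5Dwrite. The helper name row_block is hypothetical, and the code is plain C with no HDF5 or MPI calls.

```c
#include <assert.h>

/* Hypothetical helper: split nrows rows among nprocs processes as evenly as
 * possible. The first (nrows % nprocs) ranks receive one extra row. On
 * return, *start is the first row owned by `rank` and *count is the number
 * of rows it owns (possibly zero when nrows < nprocs). */
void row_block(unsigned long long nrows, int nprocs, int rank,
               unsigned long long *start, unsigned long long *count)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(start && count);

    unsigned long long p = (unsigned long long)nprocs;
    unsigned long long r = (unsigned long long)rank;
    unsigned long long base = nrows / p;   /* rows every rank receives */
    unsigned long long extra = nrows % p;  /* ranks that receive one more */

    *count = base + (r < extra ? 1ULL : 0ULL);
    *start = r * base + (r < extra ? r : extra);
}
```

With 10 rows and 4 processes, ranks 0 and 1 each own three rows (starting at rows 0 and 3) and ranks 2 and 3 each own two rows (starting at rows 6 and 8), so the blocks are disjoint and together cover the dataset.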

2.3.3. Dataset attributes access

Changes to attributes can occur only at the "main process" (process 0).  Read-only access to attributes can occur independently in each process that has opened the dataset.  (API to be defined later.)
 

2.4. Dataset close

All processes that have opened the dataset must close the dataset by a collective call (H5Dclose).

2.5. File close

All processes that have opened the file must close the file by a collective call (H5Fclose).
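The steps of the programming model differ mainly in whether every process must take part in a call. The sketch below restates that classification as a small lookup in plain C, together with the attribute rule from the dataset attributes section. All type and function names here are hypothetical and exist only for this illustration. Treating the access template setup as a local (non-collective) step is an assumption, since the text says only that each process creates its own template.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the transfer modes described in this section. */
typedef enum { XFER_INDEPENDENT, XFER_COLLECTIVE } xfer_mode;

/* Hypothetical names for the steps of the parallel access model. */
typedef enum {
    STEP_SETUP_TEMPLATE,   /* create and fill the access template      */
    STEP_FILE_OPEN,        /* H5Fcreate or H5Fopen                      */
    STEP_DATASET_OPEN,     /* H5Dcreate or H5Dopen                      */
    STEP_DATASET_ALLOCATE, /* H5Dallocate before an independent write   */
    STEP_DATASET_RW,       /* H5Dread or H5Dwrite                       */
    STEP_DATASET_CLOSE,    /* H5Dclose                                  */
    STEP_FILE_CLOSE        /* H5Fclose                                  */
} phdf5_step;

/* True if all participating processes must make the call together.
 * Only the data transfer step depends on the transfer mode. */
bool step_is_collective(phdf5_step step, xfer_mode mode)
{
    switch (step) {
    case STEP_SETUP_TEMPLATE:
        return false;                    /* assumed local to each process */
    case STEP_DATASET_RW:
        return mode == XFER_COLLECTIVE;  /* set by the transfer template  */
    case STEP_FILE_OPEN:
    case STEP_DATASET_OPEN:
    case STEP_DATASET_ALLOCATE:
    case STEP_DATASET_CLOSE:
    case STEP_FILE_CLOSE:
        return true;
    }
    assert(!"unknown step");
    return false;
}

/* Attribute rule: any process may read, only process 0 may modify. */
bool attribute_access_allowed(int rank, bool modifies)
{
    return !modifies || rank == 0;
}
```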
 

3. Parallel HDF5 Example


Example code


Send comments to
hdfparallel@ncsa.uiuc.edu

Last Modified: Feb 16, 1998