Parallel HDF5 Design

1. Design Overview

This section first describes the functional requirements of the Parallel HDF5 (PHDF5) software and the system requirements it assumes. Section 2 describes the programming model of the PHDF5 interface. Section 3 shows an example PHDF5 program.

1.1. Functional requirements

An API to support parallel file access for HDF5 files in a message passing environment.
Fast parallel I/O to large datasets through a standard parallel I/O interface.
Processes are required to do collective API calls only when structural changes are needed for the HDF5 file.
Each process may do independent I/O requests to different datasets in the same or different HDF5 files.
Support for collective I/O requests on datasets (to be included in the next version).
Minimize deviation from the HDF5 interface.

1.2. System requirements

A C language interface is the initial requirement. A Fortran 77 interface will be added later.
Use Message Passing Interface (MPI) for interprocess communication.
Use MPI-IO calls for parallel file accesses.
Initial platforms: IBM SP2, Intel TFLOPS, and SGI Origin 2000.

2. Programming Model

HDF5 uses an optional access template object to control the file access mechanism. The general model for accessing an HDF5 file in parallel consists of the following steps:

Setup access template
File open
Dataset open
Dataset data access (zero or more)
Dataset close
File close
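
The ordering of the steps above can be sketched as a small checker. This is only an illustration of the model, not part of the PHDF5 API: the enum phdf5_step and the function valid_step_order are hypothetical names, and the sketch assumes a single file containing a single dataset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the six steps of the model (not PHDF5 API names). */
typedef enum {
    STEP_SETUP_TEMPLATE,
    STEP_FILE_OPEN,
    STEP_DATASET_OPEN,
    STEP_DATA_ACCESS,
    STEP_DATASET_CLOSE,
    STEP_FILE_CLOSE
} phdf5_step;

/* Returns true when the sequence is: setup, file open, dataset open,
 * zero or more data accesses, dataset close, file close. */
bool valid_step_order(const phdf5_step *steps, size_t n)
{
    assert(steps != NULL || n == 0);
    size_t i = 0;

    if (i >= n || steps[i++] != STEP_SETUP_TEMPLATE) return false;
    if (i >= n || steps[i++] != STEP_FILE_OPEN) return false;
    if (i >= n || steps[i++] != STEP_DATASET_OPEN) return false;

    /* The data access step may occur any number of times, including zero. */
    while (i < n && steps[i] == STEP_DATA_ACCESS) {
        i++;
    }

    if (i >= n || steps[i++] != STEP_DATASET_CLOSE) return false;
    if (i >= n || steps[i++] != STEP_FILE_CLOSE) return false;

    /* Nothing may follow the file close. */
    return i == n;
}
```

A sequence with no data access step is accepted, which matches the "zero or more" wording in the list.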

2.1. Setup access template

Each process of the MPI communicator creates an access template and sets it up with MPI parallel access information (communicator, info object, access mode).

2.1. File open

All processes of the MPI communicator open an HDF5 file with a collective call (H5Fcreate or H5Fopen), passing the access template.

2.2. Dataset open

All processes of the MPI communicator open a dataset with a collective call (H5Dcreate or H5Dopen). This version supports only collective dataset opens. A future version may support dataset opens by a subset of the processes that have opened the file.

2.3. Dataset access

2.3.1. Independent dataset access

Each process may perform an arbitrary number of independent data I/O accesses through independent calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for independent access. (Independent transfer is the default transfer mode.) If the dataset has an unlimited dimension and an H5Dwrite would write data beyond the current dimension size of the dataset, all processes that have opened the dataset must make a collective call (H5Dallocate) to allocate more space for the dataset before the independent H5Dwrite call.
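
The rule above on unlimited dimensions can be illustrated with a minimal sketch. It assumes a dataset that is partitioned by rows, with each process writing one contiguous block; that layout is an assumption of this example, not something the design requires. The type row_range and the functions block_rows and needs_collective_allocate are hypothetical helpers, not PHDF5 calls. The sketch only decides whether a process's planned write would pass the current dimension size, in which case the collective allocation has to come first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper type: the contiguous block of rows one process writes. */
typedef struct {
    unsigned long long start; /* index of the first row in the block */
    unsigned long long count; /* number of rows in the block */
} row_range;

/* Split total_rows across nprocs processes as evenly as possible.
 * The first (total_rows % nprocs) ranks each receive one extra row. */
row_range block_rows(unsigned long long total_rows, int nprocs, int rank)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);

    unsigned long long n = (unsigned long long)nprocs;
    unsigned long long r = (unsigned long long)rank;
    unsigned long long base = total_rows / n;
    unsigned long long extra = total_rows % n;

    row_range out;
    out.count = base + (r < extra ? 1ULL : 0ULL);
    out.start = r * base + (r < extra ? r : extra);
    return out;
}

/* True when writing the block would go beyond the current dimension size,
 * so the collective allocation must be made before the independent write. */
bool needs_collective_allocate(row_range block, unsigned long long current_rows)
{
    return block.count > 0 && block.start + block.count > current_rows;
}
```

Because the allocation is collective, every process that has opened the dataset takes part in it whenever the check is true for any one of them, including processes whose own block still fits.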

2.3.2. Collective dataset access

All processes that have opened the dataset may perform collective data I/O through collective calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for collective access. Pre-allocation (H5Dallocate) is not needed for unlimited-dimension datasets, because the collective data access call performs the H5Dallocate call internally if it is needed.
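
One way to picture the internal allocation is as a reduction over the requests of all processes. The sketch below is an assumption about how the new size could be determined, not a description of the PHDF5 implementation. The function required_rows is a hypothetical name, and plain arrays stand in for the per-process values that an MPI program would typically combine with a reduction such as MPI_Allreduce with MPI_MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Return the dimension size needed to hold every process's write.
 * starts[i] and counts[i] describe the rows that process i writes.
 * The result is never smaller than current_rows. */
unsigned long long required_rows(const unsigned long long *starts,
                                 const unsigned long long *counts,
                                 int nprocs,
                                 unsigned long long current_rows)
{
    assert(nprocs >= 0);
    assert(nprocs == 0 || (starts != NULL && counts != NULL));

    unsigned long long needed = current_rows;
    for (int i = 0; i < nprocs; i++) {
        if (counts[i] == 0) {
            continue; /* a process that writes nothing cannot force growth */
        }
        unsigned long long end = starts[i] + counts[i];
        if (end > needed) {
            needed = end;
        }
    }
    return needed;
}
```

If the result equals current_rows, no allocation is needed for that access.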

2.3.3. Dataset attributes access

Changes to attributes can occur only at the "main process" (process 0). Read-only access to attributes can occur independently in each process that has opened the dataset. (The API is to be defined later.)

2.4. Dataset close

All processes that have opened the dataset must close the dataset by a collective call (H5Dclose).

2.5. File close

All processes that have opened the file must close the file by a collective call (H5Fclose).

3. Parallel HDF5 Example


Example code

Send comments to hdfparallel@ncsa.uiuc.edu