Parallel HDF5 Design
In this section, I first describe the functional requirements of the Parallel HDF5 (PHDF5) software and the assumed system requirements. Section 2 describes the programming model of the PHDF5 interface. Section 3 shows an example PHDF5 program.
HDF5 uses an optional access template object to control the file access mechanism. The general model for accessing an HDF5 file in parallel consists of the following steps:
Each process of the MPI communicator creates an access template and sets it up with the MPI parallel access information (communicator, info object, access mode).
All processes of the MPI communicator open an HDF5 file with a collective call (H5Fcreate or H5Fopen), passing the access template.
All processes of the MPI communicator open a dataset with a collective call (H5Dcreate or H5Dopen). This version supports only collective dataset opens. Future versions may support dataset opens by a subset of the processes that have opened the file.
Each process may perform an arbitrary number of independent data I/O accesses by independent calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for independent access. (The default transfer mode is independent transfer.) If the dataset has an unlimited dimension and the H5Dwrite would write data beyond the current dimension size of the dataset, all processes that have opened the dataset must make a collective call (H5Dallocate) to allocate more space for the dataset BEFORE the independent H5Dwrite call.
All processes that have opened the dataset may perform collective data I/O accesses by collective calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for collective access. Pre-allocation (H5Dallocate) is not needed for unlimited-dimension datasets, because the collective data access call performs the H5Dallocate internally when it is needed.
Changes to attributes can occur only at the "main process" (process 0). Read-only access to attributes can occur independently in each process that has opened the dataset. (API to be defined later.)
All processes that have opened the dataset must close the dataset by a collective call (H5Dclose).
All processes that have opened the file must close the file by a collective call (H5Fclose).
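The allocation rule for unlimited-dimension datasets in the steps above can be illustrated with a small model. The following is only a sketch in plain C and does not call HDF5 or MPI: the type model_dataset and the functions model_allocate, model_write_independent and model_write_collective are hypothetical names invented for this illustration, standing in for the dataset, the collective H5Dallocate call, and the independent and collective forms of H5Dwrite. It assumes a one-dimensional dataset and represents all processes inside a single program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a one-dimensional unlimited-dimension dataset.
 * This is not an HDF5 type; it only tracks the current dimension size. */
typedef struct {
    size_t cur_size; /* current size of the unlimited dimension */
} model_dataset;

/* Models the collective allocation step: all processes agree on new_size.
 * The dataset only grows; a smaller new_size leaves it unchanged. */
void model_allocate(model_dataset *d, size_t new_size)
{
    assert(d != NULL);
    if (new_size > d->cur_size)
        d->cur_size = new_size;
}

/* Models an independent write of `count` elements starting at `offset`.
 * It never extends the dataset, so a write past the current size fails. */
bool model_write_independent(const model_dataset *d, size_t offset, size_t count)
{
    assert(d != NULL);
    return offset + count <= d->cur_size;
}

/* Models a collective write: offsets[r] and counts[r] describe the region
 * written by process r. The needed allocation is done internally first. */
bool model_write_collective(model_dataset *d, const size_t *offsets,
                            const size_t *counts, size_t nprocs)
{
    assert(d != NULL);
    size_t needed = d->cur_size;
    for (size_t r = 0; r < nprocs; r++) {
        size_t end = offsets[r] + counts[r];
        if (end > needed)
            needed = end;
    }
    model_allocate(d, needed); /* internal allocation on behalf of all processes */
    for (size_t r = 0; r < nprocs; r++) {
        if (!model_write_independent(d, offsets[r], counts[r]))
            return false;
    }
    return true;
}
```

In this model, an independent write that ends past cur_size is rejected until model_allocate has been called with a large enough size, whereas model_write_collective succeeds without a separate allocation step because it extends the dataset to cover every process's region before writing.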
Example code
Send comments to
hdfparallel@ncsa.uiuc.edu