Diffstat (limited to 'doxygen/dox/IntroParHDF5.dox')
-rw-r--r--  doxygen/dox/IntroParHDF5.dox  271
1 files changed, 271 insertions, 0 deletions
diff --git a/doxygen/dox/IntroParHDF5.dox b/doxygen/dox/IntroParHDF5.dox
new file mode 100644
index 0000000..1f04e96
--- /dev/null
+++ b/doxygen/dox/IntroParHDF5.dox
@@ -0,0 +1,271 @@
+/** @page IntroParHDF5 A Brief Introduction to Parallel HDF5
+
+Navigate back: \ref index "Main" / \ref GettingStarted
+<hr>
+
+If you are new to HDF5, please see the @ref LearnBasics topic first.
+
+\section sec_pintro_overview Overview of Parallel HDF5 (PHDF5) Design
+We had several requirements for Parallel HDF5 (PHDF5):
+\li Parallel HDF5 files had to be compatible with serial HDF5 files and sharable
+between different serial and parallel platforms.
+\li Parallel HDF5 had to present a single file image to all processes,
+rather than one file per process. Having one file per process can require expensive
+post-processing, and such files are not usable across different numbers of processes.
+\li A standard parallel I/O interface had to be portable to different platforms.
+
+With these requirements in mind, our initial target was to support MPI programming, but not
+shared memory programming. We had experimented with thread-safe support for Pthreads and
+for OpenMP, and decided not to use them.
+
+Implementation requirements were to:
+\li Not use threads, since they were not commonly supported in 1998 when this work began.
+\li Not have a reserved process, as this might interfere with parallel algorithms.
+\li Not spawn any processes, as this is still not commonly supported.
+
+Parallel HDF5 is layered as follows: a parallel application calls the HDF5 library,
+which performs its I/O through the MPI-IO layer of the MPI library, which in turn
+accesses the underlying parallel file system.
+
+\subsection subsec_pintro_prog Parallel Programming with HDF5
+This tutorial assumes that you are somewhat familiar with parallel programming with MPI (Message Passing Interface).
+
+If you are not familiar with parallel programming, here is a tutorial that may be of interest:
+<a href="http://www.nersc.gov/users/training/online-tutorials/introduction-to-scientific-i-o/?show_all=1">Tutorial on HDF5 I/O tuning at NERSC</a>
+
+Terms that you must understand for this tutorial are:
+<ul>
+<li>
+<strong>MPI Communicator</strong>:
+Allows a group of processes to communicate with each other.
+
+Following are the MPI routines for initializing MPI and the communicator and finalizing a session with MPI:
+<table>
+<tr>
+<th>C</th>
+<th>Fortran</th>
+<th>Description</th>
+</tr>
+<tr>
+<td>MPI_Init</td>
+<td>MPI_INIT</td>
+<td>Initialize MPI (making the default communicator MPI_COMM_WORLD available)</td>
+</tr>
+<tr>
+<td>MPI_Comm_size</td>
+<td>MPI_COMM_SIZE</td>
+<td>Determine how many processes are contained in the communicator</td>
+</tr>
+<tr>
+<td>MPI_Comm_rank</td>
+<td>MPI_COMM_RANK</td>
+<td>Determine the process ID number within the communicator (from 0 to n-1)</td>
+</tr>
+<tr>
+<td>MPI_Finalize</td>
+<td>MPI_FINALIZE</td>
+<td>Exit MPI</td>
+</tr>
+</table>
+</li>
+<li>
+<strong>Collective</strong>:
+MPI defines this to mean that all processes of the communicator must participate in the right order.
+</li>
+</ul>
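+
+Putting these routines together, a minimal MPI skeleton looks like the following sketch
+(C; error checking omitted for brevity):
+\code
+#include <mpi.h>
+#include <stdio.h>
+
+int main(int argc, char *argv[])
+{
+    int mpi_size, mpi_rank;
+
+    /* Initialize MPI; the default communicator MPI_COMM_WORLD becomes available */
+    MPI_Init(&argc, &argv);
+
+    /* How many processes are in the communicator, and which one is this? */
+    MPI_Comm_size(MPI_COMM_WORLD, &mpi_size);
+    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);
+
+    printf("Process %d of %d\n", mpi_rank, mpi_size);
+
+    /* Exit MPI */
+    MPI_Finalize();
+    return 0;
+}
+\endcode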
+
+Parallel HDF5 opens a parallel file with a communicator. It returns a file handle to be used for future access to the file.
+
+All processes are required to participate in the collective Parallel HDF5 API. Different files can be opened using different communicators.
+
+Examples of what you can do with the Parallel HDF5 collective API:
+\li File Operation: Create, open and close a file
+\li Object Creation: Create, open, and close a dataset
+\li Object Structure: Extend a dataset (increase dimension sizes)
+\li Dataset Operations: Write to or read from a dataset
+(Array data transfer can be collective or independent.)
+
+Once a file is opened by the processes of a communicator:
+\li All parts of the file are accessible by all processes.
+\li All objects in the file are accessible by all processes.
+\li Multiple processes may write to the same dataset.
+\li Each process may write to an individual dataset.
+
+Please refer to the Supported Configuration Features Summary in the release notes for the current release
+of HDF5 for an up-to-date list of the platforms on which Parallel HDF5 is supported.
+
+
+\subsection subsec_pintro_create_file Creating and Accessing a File with PHDF5
+The programming model for creating and accessing a file is as follows:
+<ol>
+<li>Set up an access template object to control the file access mechanism.</li>
+<li>Open the file.</li>
+<li>Close the file.</li>
+</ol>
+
+Each process of the MPI communicator creates an access template and sets it up with MPI parallel
+access information. This is done with the #H5Pcreate call to obtain the file access property list
+and the #H5Pset_fapl_mpio call to set up parallel I/O access.
+
+Following is example code for creating an access template in HDF5:
+<em>C</em>
+\code
+MPI_Comm comm = MPI_COMM_WORLD;
+MPI_Info info = MPI_INFO_NULL;
+int      mpi_size, mpi_rank;
+hid_t    plist_id;    /* file access property list */
+
+/*
+ * Initialize MPI
+ */
+MPI_Init(&argc, &argv);
+MPI_Comm_size(comm, &mpi_size);
+MPI_Comm_rank(comm, &mpi_rank);
+
+/*
+ * Set up file access property list with parallel I/O access
+ */
+plist_id = H5Pcreate(H5P_FILE_ACCESS);
+H5Pset_fapl_mpio(plist_id, comm, info);
+\endcode
+
+<em>Fortran</em>
+\code
+comm = MPI_COMM_WORLD
+info = MPI_INFO_NULL
+
+CALL MPI_INIT(mpierror)
+CALL MPI_COMM_SIZE(comm, mpi_size, mpierror)
+CALL MPI_COMM_RANK(comm, mpi_rank, mpierror)
+!
+! Initialize FORTRAN interface
+!
+CALL h5open_f(error)
+
+!
+! Set up file access property list with parallel I/O access.
+!
+CALL h5pcreate_f(H5P_FILE_ACCESS_F, plist_id, error)
+CALL h5pset_fapl_mpio_f(plist_id, comm, info, error)
+\endcode
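+
+Once the access template is set up, the file is created with a collective call to #H5Fcreate
+and the property list can be released. The following is a minimal C sketch continuing the C
+example above; the file name ph5_example.h5 is an illustrative assumption:
+\code
+/* Create the file collectively, using the parallel file access property list */
+hid_t file_id = H5Fcreate("ph5_example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, plist_id);
+
+/* The property list is no longer needed once the file is open */
+H5Pclose(plist_id);
+
+/* ... collective object creation and I/O go here ... */
+
+/* Close the file collectively, then shut down MPI */
+H5Fclose(file_id);
+MPI_Finalize();
+\endcode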
+
+The following example programs create an HDF5 file using Parallel HDF5:
+\li <a href="https://github.com/HDFGroup/hdf5-examples/blob/master/C/H5Parallel/ph5_file_create.c">C: ph5_file_create.c</a>
+\li <a href="https://github.com/HDFGroup/hdf5-examples/blob/master/Fortran/H5Parallel/ph5_f90_file_create.F90">F90: ph5_f90_file_create.F90</a>
+
+
+\subsection subsec_pintro_create_dset Creating and Accessing a Dataset with PHDF5
+The programming model for creating and accessing a dataset is as follows:
+<ol>
+<li>
+Create or open a dataset in the Parallel HDF5 file with a collective call to:
+#H5Dcreate
+#H5Dopen
+</li>
+<li>
+Obtain a copy of the data transfer property list and set it to use collective or independent I/O.
+<ul>
+<li>
+Do this by first passing a data transfer property list class type to: #H5Pcreate
+</li>
+<li>
+Then set the data transfer mode to either use independent I/O access or to use collective I/O, with a call to: #H5Pset_dxpl_mpio
+
+Following are the parameters required by this call:
+<em>C</em>
+\code
+herr_t H5Pset_dxpl_mpio(hid_t dxpl_id, H5FD_mpio_xfer_t xfer_mode)
+    dxpl_id   IN: Data transfer property list identifier
+    xfer_mode IN: Transfer mode:
+                  H5FD_MPIO_INDEPENDENT - use independent I/O access (default)
+                  H5FD_MPIO_COLLECTIVE  - use collective I/O access
+\endcode
+
+<em>Fortran</em>
+\code
+h5pset_dxpl_mpio_f(prp_id, data_xfer_mode, hdferr)
+    prp_id         IN:  Property list identifier (INTEGER(HID_T))
+    data_xfer_mode IN:  Data transfer mode (INTEGER)
+                        H5FD_MPIO_INDEPENDENT_F (0)
+                        H5FD_MPIO_COLLECTIVE_F (1)
+    hdferr         OUT: Error code (INTEGER)
+\endcode
+</li>
+</ul>
+</li>
+<li>
+Access the dataset with the defined transfer property list.
+All processes that have opened the dataset may do collective I/O. Each process may also make
+an arbitrary number of independent data I/O calls, using:
+#H5Dwrite
+#H5Dread
+
+If a dataset is unlimited, you can extend it with a collective call to #H5Dextend.
+</li>
+</ol>
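+
+Step 1 of this model, creating the dataset collectively, might look like the following
+minimal C sketch (assuming file_id is the file opened earlier; the dataset name "IntArray"
+and the dimensions are illustrative):
+\code
+/* All processes must participate in creating the dataset */
+hsize_t dimsf[2]  = {64, 128};    /* illustrative dataset dimensions */
+hid_t   filespace = H5Screate_simple(2, dimsf, NULL);
+hid_t   dset_id   = H5Dcreate(file_id, "IntArray", H5T_NATIVE_INT, filespace,
+                              H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
+\endcode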
+
+The following code demonstrates a collective write using Parallel HDF5:
+<em>C</em>
+\code
+/*
+ * Create property list for collective dataset write.
+ */
+plist_id = H5Pcreate(H5P_DATASET_XFER);
+H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_COLLECTIVE);
+
+status = H5Dwrite(dset_id, H5T_NATIVE_INT, memspace, filespace,
+                  plist_id, data);
+\endcode
+
+<em>Fortran</em>
+\code
+! Create property list for collective dataset write
+!
+CALL h5pcreate_f(H5P_DATASET_XFER_F, plist_id, error)
+CALL h5pset_dxpl_mpio_f(plist_id, H5FD_MPIO_COLLECTIVE_F, error)
+
+!
+! Write the dataset collectively.
+!
+CALL h5dwrite_f(dset_id, H5T_NATIVE_INTEGER, data, dimsfi, error, &
+                file_space_id = filespace, mem_space_id = memspace, xfer_prp = plist_id)
+\endcode
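+
+For independent transfers, the same sequence applies with only the transfer mode changed,
+for example in C:
+\code
+/* Variant: each process performs its transfers independently */
+H5Pset_dxpl_mpio(plist_id, H5FD_MPIO_INDEPENDENT);
+\endcode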
+
+The following example programs create an HDF5 dataset using Parallel HDF5:
+\li <a href="https://github.com/HDFGroup/hdf5-examples/blob/master/C/H5Parallel/ph5_dataset.c">C: ph5_dataset.c</a>
+\li <a href="https://github.com/HDFGroup/hdf5-examples/blob/master/Fortran/H5Parallel/ph5_f90_dataset.F90">F90: ph5_f90_dataset.F90</a>
+
+
+\subsubsection subsec_pintro_hyperslabs Hyperslabs
+The programming model for writing and reading hyperslabs is:
+\li Each process defines the memory and file hyperslabs.
+\li Each process executes a partial write/read call, which is either collective or independent.
+
+The memory and file hyperslabs in the first step are defined with #H5Sselect_hyperslab.
+
+The start (or offset), count, stride, and block parameters define the portion of the dataset
+to write to. By changing the values of these parameters you can write hyperslabs with Parallel
+HDF5 by contiguous hyperslab, by regularly spaced data in a column or row, by pattern, and by
+chunk; a minimal selection sketch follows the table below:
+
+<table>
+<tr>
+<td>
+\li @subpage IntroParContHyperslab
+</td>
+</tr>
+<tr>
+<td>
+\li @subpage IntroParRegularSpaced
+</td>
+</tr>
+<tr>
+<td>
+\li @subpage IntroParPattern
+</td>
+</tr>
+<tr>
+<td>
+\li @subpage IntroParChunk
+</td>
+</tr>
+</table>
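+
+As a concrete illustration, the following minimal C sketch selects the hyperslab for the
+contiguous case, assuming each process writes a block of rows of a 2D dataset (NROWS and
+NCOLS are illustrative):
+\code
+#define NROWS 8     /* rows owned by each process (illustrative) */
+#define NCOLS 16    /* columns in the dataset (illustrative)     */
+
+/* Full dataset is (mpi_size * NROWS) x NCOLS */
+hsize_t dimsf[2]  = {(hsize_t)mpi_size * NROWS, NCOLS};
+hsize_t count[2]  = {NROWS, NCOLS};                  /* size of each process's block */
+hsize_t offset[2] = {(hsize_t)mpi_rank * NROWS, 0};  /* where this block starts      */
+
+hid_t filespace = H5Screate_simple(2, dimsf, NULL);
+hid_t memspace  = H5Screate_simple(2, count, NULL);
+
+/* Select this process's rows in the file dataspace
+ * (NULL stride and block default to 1) */
+H5Sselect_hyperslab(filespace, H5S_SELECT_SET, offset, NULL, count, NULL);
+\endcode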
+
+
+<hr>
+Navigate back: \ref index "Main" / \ref GettingStarted
+
+*/