Diffstat (limited to 'release_docs/INSTALL_parallel')
 release_docs/INSTALL_parallel | 211 +++++++++++++-----------------
 1 file changed, 74 insertions(+), 137 deletions(-)
diff --git a/release_docs/INSTALL_parallel b/release_docs/INSTALL_parallel
index d771c0b..d3d7830 100644
--- a/release_docs/INSTALL_parallel
+++ b/release_docs/INSTALL_parallel
@@ -1,12 +1,24 @@
Installation instructions for Parallel HDF5
-------------------------------------------
+0. Use Build Scripts
+--------------------
+The HDF Group maintains a collection of build scripts for building parallel
+HDF5 on various platforms (Cray, IBM, SGI, etc.). These scripts are updated
+continuously for current and future systems. The reader is strongly
+encouraged to consult the repository at:
+
+https://github.com/HDFGroup/build_hdf5
+
+for building parallel HDF5 on these systems. All contributions, additions,
+and fixes to the repository are welcome and encouraged.
+
1. Overview
-----------
This file contains instructions for the installation of parallel HDF5 (PHDF5).
It is assumed that you are familiar with the general installation steps as
-described in the INSATLL file. Get familiar with that file before trying
+described in the INSTALL file. Get familiar with that file before trying
the parallel HDF5 installation.
The remainder of this section explains the requirements for running PHDF5.
@@ -17,19 +29,22 @@ of running the parallel test suites.
1.1. Requirements
-----------------
-PHDF5 requires an MPI compiler with MPI-IO support and a parallel file system.
-If you don't know yet, you should first consult with your system support staff
-of information how to compile an MPI program, how to run an MPI application,
-and how to access the parallel file system. There are sample MPI-IO C and
-Fortran programs in the appendix section of "Sample programs". You can use
-them to run simple tests of your MPI compilers and the parallel file system.
+PHDF5 requires an MPI compiler with MPI-IO support and a POSIX compliant
+(Ref. 1) parallel file system. If you are unsure, first consult your system
+support staff for information on how to compile an MPI program, how to run
+an MPI application, and how to access the parallel file system. There are
+sample MPI-IO C and Fortran programs in the appendix section "Sample
+programs". You can use them to run simple tests of your MPI compilers and
+the parallel file system.
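+
+For example, a quick compile-and-run test of one of those programs would
+typically look like this (the program name here is illustrative; the exact
+commands vary by site):
+
+  $ mpicc Sample_mpio.c -o Sample_mpio
+  $ mpiexec -n 4 ./Sample_mpio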
1.2. Further Help
-----------------
-If you still have difficulties installing PHDF5 in your system, please send
-mail to
- help@hdfgroup.org
+
+For help with installation, questions can be posted to the HDF Forum or
+sent to the HDF Helpdesk:
+
+ HDF Forum: https://forum.hdfgroup.org/
+ HDF Helpdesk: https://portal.hdfgroup.org/display/support/The+HDF+Help+Desk
In your mail, please include the output of "uname -a". If you have run the
"configure" command, attach the output of the command and the content of
@@ -48,54 +63,15 @@ more detailed explanations.
----------------------------
HDF5 knows several parallel compilers: mpicc, hcc, mpcc, mpcc_r. To build
parallel HDF5 with one of the above, just set CC to it and configure.
-The "--enable-parallel" is optional in this case.
- $ CC=/usr/local/mpi/bin/mpicc ./configure --prefix=<install-directory>
+ $ CC=/usr/local/mpi/bin/mpicc ./configure --enable-parallel --prefix=<install-directory>
$ make # build the library
$ make check # verify the correctness
# Read the Details section about parallel tests.
$ make install
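+
+After installation, you can verify that a parallel library was indeed built
+by checking the configuration summary the build records; it should report
+something like:
+
+  $ grep "Parallel HDF5" <install-directory>/lib/libhdf5.settings
+  Parallel HDF5: yes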
-2.2. IBM SP
------------
-During the build stage, the H5detect is compiled and executed to generate
-the source file H5Tinit.c which is compiled as part of the HDF5 library. In
-parallel mode, make sure your environment variables are set correctly to
-execute a single process mpi application. Otherwise, multiple processes
-attempt to write to the same H5Tinit.c file, resulting in a scrambled
-source file. Unfortunately, the setting varies from machine to machine.
-E.g., the following works for the IBM SP machine at LLNL.
-
- setenv MP_PROCS 1
- setenv MP_NODES 1
- setenv MP_LABELIO no
- setenv MP_RMPOOL 0
- setenv LLNL_COMPILE_SINGLE_THREADED TRUE # for LLNL site only
-
-The shared library configuration is problematic. So, only static library
-is supported.
-
-Then do the following steps:
-
- $ ./configure --disable-shared --prefix=<install-directory>
- $ make # build the library
- $ make check # verify the correctness
- # Read the Details section about parallel tests.
- $ make install
-
-We also suggest that you add "-qxlf90=autodealloc" to FFLAGS when building
-parallel with fortran enabled. This can be done by invoking:
-
- setenv FFLAGS -qxlf90=autodealloc # 32 bit build
-or
- setenv FFLAGS "-q64 -qxlf90=autodealloc" # 64 bit build
-
-prior to running configure. Recall that the "-q64" is necessary for 64
-bit builds.
-
-
-2.3. Linux 2.4 and greater
+2.2. Linux 2.4 and greater
--------------------------
Be sure that your installation of MPICH was configured with the following
configuration command-line option:
@@ -106,83 +82,39 @@ This allows for >2GB sized files on Linux systems and is only available with
Linux kernels 2.4 and greater.
-2.4. Red Storm (Cray XT3) (for v1.8 and later)
+2.3. Hopper (Cray XE6) (for v1.8 and later)
-------------------------
-Both serial and parallel HDF5 are supported in Red Storm.
-2.4.1 Building serial HDF5 for Red Storm
-------------------------------------------
-The following steps are for building the serial HDF5 for the Red Storm
-compute nodes. They would probably work for other Cray XT3 systems but have
+The following steps are for building HDF5 for the Hopper compute
+nodes. They would probably work for other Cray systems but have
not been verified.
-# Assume you already have a copy of HDF5 source code in directory `hdf5' and
-# want to install the binary in directory `/project/hdf5/hdf5'.
+Obtain the HDF5 source code:
+ https://portal.hdfgroup.org/display/support/Downloads
-$ cd hdf5
-$ bin/yodconfigure configure
-$ env RUNSERIAL="yod -sz 1" \
- CC=cc FC=ftn CXX=CC \
- ./configure --prefix=/project/hdf5/hdf5
-$ make
-$ make check
+The entire build process should be done on a MOM node in an interactive
+allocation and on a file system accessible by all compute nodes.
+Request an interactive allocation with qsub:
+
+  qsub -I -q debug -l mppwidth=8
-# if all is well, install the binary.
-$ make install
+- Create a build directory build-hdf5:
+
+  mkdir build-hdf5; cd build-hdf5/
-2.4.2 Building parallel HDF5 for Red Storm
-------------------------------------------
-The following steps are for building the Parallel HDF5 for the Red Storm
-compute nodes. They would probably work for other Cray XT3 systems but have
-not been verified.
+- Configure HDF5:
+
+  RUNSERIAL="aprun -q -n 1" RUNPARALLEL="aprun -q -n 6" FC=ftn CC=cc \
+      /path/to/source/configure --enable-fortran --enable-parallel \
+      --disable-shared
+
+  RUNSERIAL and RUNPARALLEL tell the library how it should launch programs
+  that are part of the build procedure.
+
+- Compile HDF5:
+
+  gmake
+
+- Check HDF5:
+
+  gmake check
-# Assume you already have a copy of HDF5 source code in directory `hdf5' and
-# want to install the binary in directory `/project/hdf5/phdf5'. You also
-# have done the proper setup to have mpicc and mpif90 as the compiler commands.
-
-$ cd hdf5
-$ bin/yodconfigure configure
-$ env RUNSERIAL="yod -sz 1" RUNPARALLEL="yod -sz 3" \
- CC=cc FC=ftn \
- ./configure --enable-parallel --prefix=/project/hdf5/phdf5
-$ make
-$ make check
-
-# if all is well, install the binary.
-$ make install
-
-2.4.3 Red Storm known problems
-------------------------------
-For Red Storm, a Cray XT3 system, the yod command sometimes gives the
-message, "yod allocation delayed for node recovery". This interferes with
-test suites that do not expect seeing this message. To bypass this problem,
-I launch the executables with a command shell script called "myyod" which
-consists of the following lines. (You should set $RUNSERIAL and $RUNPARALLEL
-to use myyod instead of yod.)
-==== myyod =======
-#!/bin/sh
-# sleep 2 seconds to allow time for the node recovery else it pops the
-# message,
-# yod allocation delayed for node recovery
-sleep 2
-yod $*
-==== end of myyod =======
-
-For Red Storm, a Cray XT3 system, the tools/h5ls/testh5ls.sh will fail on
-the test "Testing h5ls -w80 -r -g tgroup.h5" fails. This test is
-expected to fail and exit with a non-zero code but the yod command does
-not propagate the exit code of the executables. Yod always returns 0 if it
-can launch the executable. The test suite shell expects a non-zero for
-this particular test, therefore it concludes the test has failed when it
-receives 0 from yod. To bypass this problem for now, change the following
-lines in the tools/h5ls/testh5ls.sh.
-======== Original =========
-# The following combination of arguments is expected to return an error message
-# and return value 1
-TOOLTEST tgroup-1.ls 1 -w80 -r -g tgroup.h5
-======== Skip the test =========
-echo SKIP TOOLTEST tgroup-1.ls 1 -w80 -r -g tgroup.h5
-======== end of bypass ========
+- Install HDF5:
+
+  gmake install
+
+The build will be in build-hdf5/hdf5/ (or whatever you specify in --prefix).
+To compile other HDF5 applications, use the wrappers created by the build
+(build-hdf5/hdf5/bin/h5pcc or h5pfc).
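+
+For example, to build and run an MPI application against this installation
+(the program name and process count are illustrative):
+
+  build-hdf5/hdf5/bin/h5pcc -o my_app my_app.c
+  aprun -n 6 ./my_app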
3. Detailed explanation
@@ -215,28 +147,24 @@ a properly installed parallel compiler (e.g., MPICH's mpicc or IBM's mpcc_r)
and supply the compiler name as the value of the CC environment variable.
For example:
- $ CC=mpcc_r ./configure
- $ CC=/usr/local/mpi/bin/mpicc ./configure
-
-If no such a compiler command is available then you must use your normal
-C compiler along with the location(s) of MPI/MPI-IO files to be used.
-For example,
-
- $ CPPFLAGS=-I/usr/local/mpi/include \
- LDFLAGS=-L/usr/local/mpi/lib/LINUX/ch_p4 \
- ./configure --enable-parallel=mpich
+ $ CC=mpcc_r ./configure --enable-parallel
+ $ CC=/usr/local/mpi/bin/mpicc ./configure --enable-parallel
If a parallel library is being built then configure attempts to determine how
to run a parallel application on one processor and on many processors. If the
compiler is `mpicc' and the user hasn't specified values for RUNSERIAL and
RUNPARALLEL then configure chooses `mpiexec' from the same directory as `mpicc':
- RUNSERIAL: /usr/local/mpi/bin/mpiexec -np 1
- RUNPARALLEL: /usr/local/mpi/bin/mpiexec -np $${NPROCS:=6}
+ RUNSERIAL: mpiexec -n 1
+ RUNPARALLEL: mpiexec -n $${NPROCS:=6}
The `$${NPROCS:=6}' will be substituted with the value of the NPROCS
environment variable at the time `make check' is run (or the value 6).
+Note that some MPI implementations (e.g., OpenMPI 4.0) disallow
+oversubscribing nodes by default, so you'll have to either set NPROCS equal
+to the number of processors available (or fewer) or redefine RUNPARALLEL
+with appropriate flag(s) (--oversubscribe in OpenMPI).
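+
+For example, with OpenMPI the override could be given at configure time,
+something like:
+
+  $ RUNPARALLEL="mpiexec --oversubscribe -n 6" CC=mpicc \
+        ./configure --enable-parallel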
4. Parallel test suite
----------------------
@@ -253,11 +181,6 @@ non-zero code. Failure to support file size greater than 2GB is not a fatal
error for HDF5 because HDF5 can use other file-drivers such as families of
files to bypass the file size limit.
-The t_posix_compliant tests if the file system is POSIX compliant when POSIX
-and MPI IO APIs are used. This is for information only and it always exits
-with 0 even when non-compliance errors have occurred. This is to prevent
-the test from aborting the remaining parallel HDF5 tests unnecessarily.
-
The t_cache test makes many small I/O requests and may not run well on a
slow file system such as an NFS disk. If it takes a long time to run, try
setting the environment variable $HDF5_PARAPREFIX to a file system more suitable
@@ -274,6 +197,20 @@ if the tests should use directory /PFS/user/me, do
shell initial files like .profile, .cshrc, etc.)
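+
+For example, in sh/bash syntax:
+
+  $ export HDF5_PARAPREFIX=/PFS/user/me
+  $ make check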
+Reference
+---------
+1. POSIX Compliant. A good explanation is given by Donald Lewin:
+    After a write() to a regular file has successfully returned, any
+    successful read() from each byte position on the file that was modified
+    by that write() will return the data that was written by the write(). A
+    subsequent write() to the same byte will overwrite the file data. If a
+    read() of file data can be proven by any means [e.g., MPI_Barrier()]
+    to occur after a write() of that data, it must reflect that write(),
+    even if the calls are made by a different process.
+    Lewin, D. (1994). POSIX Programmer's Guide (pp. 513-514). O'Reilly
+    & Associates.
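+
+As a minimal sketch of this property (illustrative only, not part of the
+HDF5 test suite): rank 0 write()s a file, all ranks synchronize with
+MPI_Barrier(), and rank 1 then read()s the same bytes. On a POSIX compliant
+parallel file system the read must return the data rank 0 wrote.
+
+  /* Run on a shared file system with, e.g.: mpiexec -n 2 ./ptest */
+  #include <mpi.h>
+  #include <fcntl.h>
+  #include <stdio.h>
+  #include <unistd.h>
+
+  int main(int argc, char *argv[])
+  {
+      int  rank;
+      char buf[4];
+
+      MPI_Init(&argc, &argv);
+      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
+
+      if (rank == 0) {                /* writer */
+          int fd = open("ptest.dat", O_CREAT | O_WRONLY | O_TRUNC, 0644);
+          write(fd, "data", 4);
+          close(fd);
+      }
+
+      /* The barrier proves the read happens after the write. */
+      MPI_Barrier(MPI_COMM_WORLD);
+
+      if (rank == 1) {                /* reader */
+          int fd = open("ptest.dat", O_RDONLY);
+          read(fd, buf, 4);
+          close(fd);
+          printf("rank 1 read: %.4s\n", buf);   /* expect "data" */
+      }
+
+      MPI_Finalize();
+      return 0;
+  }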
+
+
Appendix A. Sample programs
---------------------------
Here are sample MPI-IO C and Fortran programs. You may use them to run simple