<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
<META NAME="Generator" CONTENT="Microsoft Word 97">
<TITLE>Parallel HDF5 Design</TITLE>
<META NAME="Template" CONTENT="C:\PROGRAM FILES\MICROSOFT OFFICE\OFFICE\html.dot">
</HEAD>
<BODY LINK="#0000ff" VLINK="#800080">

<P ALIGN="CENTER"><B><FONT SIZE=6>Parallel HDF5 Design</FONT></B></P>
<P ALIGN="CENTER">&nbsp;</P>
<H1>1. Design Overview</H1>
<P>In this section, I first describe the functional requirements of the Parallel HDF5 (PHDF5) software and the system requirements it assumes. Section 2 describes the programming model of the PHDF5 interface. Section 3 shows an example PHDF5 program. </P>
<H2>1.1. Functional requirements</H2>

<UL>
<LI>An API to support parallel file access for HDF5 files in a message passing environment. </LI>
<LI>Fast parallel I/O to large datasets through a standard parallel I/O interface.</LI>
<LI>Processes are required to do collective API calls only when structural changes are needed for the HDF5 file. </LI>
<LI>Each process may do independent I/O requests to different datasets in the same or different HDF5 files. </LI>
<LI>Support collective I/O requests for datasets (to be included in the next version). </LI>
<LI>Minimize deviation from the HDF5 interface.</LI>
</UL>

<H2>1.2. System requirements</H2>

<UL>
<LI>A C language interface is the initial requirement. A Fortran 77 interface will be added later. </LI>
<LI>Use Message Passing Interface (MPI) for interprocess communication. </LI>
<LI>Use MPI-IO calls for parallel file accesses. </LI>
<LI>Initial platforms: IBM SP2, Intel TFLOPS, and SGI Origin 2000. </LI></UL>

<H1>2. Programming Model</H1>
<P>HDF5 uses an optional access template object to control the file access
mechanism.  The general model for accessing an HDF5 file in parallel
consists of the following steps: </P>

<UL>
<LI>Setup access template</LI>
<LI>File open </LI>
<LI>Dataset open </LI>
<LI>Dataset data access (zero or more) </LI>
<LI>Dataset close </LI>
<LI>File close </LI></UL>
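<P>The ordering of these steps can be sketched as a small checker. This is only an illustration for one file and one dataset: the enum labels and the function ph5_follows_model are hypothetical names introduced here, not PHDF5 or HDF5 identifiers, and the code performs no I/O.</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the six steps listed above; not HDF5 identifiers. */
enum ph5_step {
    PH5_SETUP_TEMPLATE,
    PH5_FILE_OPEN,
    PH5_DATASET_OPEN,
    PH5_DATA_ACCESS,
    PH5_DATASET_CLOSE,
    PH5_FILE_CLOSE
};

/* Return 1 if steps[0..n-1] follows the model for one file and one dataset:
 * setup, file open, dataset open, zero or more data accesses,
 * dataset close, file close.  Return 0 otherwise. */
int ph5_follows_model(const enum ph5_step *steps, size_t n)
{
    size_t i = 0;

    assert(steps != NULL || n == 0);

    /* The shortest valid sequence has five steps (no data access). */
    if (n < 5) return 0;

    if (steps[i++] != PH5_SETUP_TEMPLATE) return 0;
    if (steps[i++] != PH5_FILE_OPEN) return 0;
    if (steps[i++] != PH5_DATASET_OPEN) return 0;

    /* "Zero or more" data accesses. */
    while (i < n && steps[i] == PH5_DATA_ACCESS) i++;

    if (i >= n || steps[i++] != PH5_DATASET_CLOSE) return 0;
    if (i >= n || steps[i++] != PH5_FILE_CLOSE) return 0;

    /* Nothing may follow the file close. */
    return i == n;
}
```

<P>A sequence with no data access at all is accepted, which matches the "zero or more" wording of the list; a data access that comes before the dataset open is rejected.</P>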

<H2>2.1. Setup access template</H2>
<P>Each process of the MPI communicator creates an access template and sets
it up with MPI parallel access information (communicator, info object,
access-mode).  </P>
<H2>2.1. File open</H2>
<P>All processes of the MPI communicator open an HDF5 file by a collective call
(H5Fcreate or H5Fopen) with the access template. </P>
<H2>2.2. Dataset open</H2>
<P>All processes of the MPI communicator open a dataset by a collective call (H5Dcreate or H5Dopen).&nbsp; This version supports only collective dataset open.&nbsp; A future version may support opening a dataset by a subset of the processes that have opened the file. </P>
<H2>2.3. Dataset access</H2>
<H3>2.3.1. Independent dataset access</H3>
<P>Each process may perform an arbitrary number of independent data I/O accesses through independent calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for independent access.&nbsp; (The default transfer mode is independent transfer.)&nbsp; If the dataset has an unlimited dimension and an H5Dwrite would write data beyond the current dimension size of the dataset, all processes that have opened the dataset must make a collective call (H5Dallocate) to allocate more space for the dataset BEFORE the independent H5Dwrite call. </P>
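<P>The allocation rule for unlimited-dimension datasets can be illustrated without any I/O. The following C fragment is a sketch only. It assumes that each process writes one contiguous range [start, start + count) along the unlimited dimension; the function ph5_required_size and its parameters are hypothetical and not part of the PHDF5 API. It computes the size that must be allocated collectively before the independent writes may proceed.</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PHDF5 API.
 *
 * cur_size  current size of the unlimited dimension
 * start[i]  first element process i intends to write (assumed layout)
 * count[i]  number of elements process i intends to write
 * nprocs    number of processes that have opened the dataset
 *
 * Returns the smallest dimension size that covers every planned write.
 * A result larger than cur_size means the collective allocation call
 * must be made before any of the independent writes. */
unsigned long ph5_required_size(unsigned long cur_size,
                                const unsigned long *start,
                                const unsigned long *count,
                                size_t nprocs)
{
    unsigned long needed = cur_size;
    size_t i;

    assert(nprocs == 0 || (start != NULL && count != NULL));

    for (i = 0; i < nprocs; i++) {
        unsigned long end = start[i] + count[i];

        /* An empty write never forces the dataset to grow. */
        if (count[i] > 0 && end > needed)
            needed = end;
    }
    return needed;
}
```

<P>In an actual MPI program each process knows only its own range, so the maximum would more likely be obtained with a reduction such as MPI_Allreduce using MPI_MAX, after which every process passes the same size to the collective allocation call.</P>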
<H3>2.3.2. Collective dataset access</H3>
<P>All processes that have opened the dataset may perform collective data I/O through collective calls (H5Dread or H5Dwrite) to the dataset, with the transfer template set for collective access.&nbsp; Pre-allocation (H5Dallocate) is not needed for unlimited-dimension datasets, since the H5Dallocate call, if needed, is done internally by the collective data access call. </P>
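<P>In a collective call every process that has opened the dataset takes part, and each process needs to know which part of the dataset it will transfer. As an illustration only, the sketch below assumes a simple row-wise split that this document does not prescribe. The function ph5_row_block is a hypothetical helper, not part of the PHDF5 API. It assigns each rank a contiguous block of rows, with the sizes differing by at most one.</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PHDF5 API.
 * Split nrows rows among nprocs processes as evenly as possible.
 * The first (nrows % nprocs) ranks receive one extra row.
 * On return, *start is the first row of this rank's block and
 * *count is the number of rows in it (possibly zero). */
void ph5_row_block(unsigned long nrows, int nprocs, int rank,
                   unsigned long *start, unsigned long *count)
{
    unsigned long base, extra, r;

    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(start != NULL && count != NULL);

    base  = nrows / (unsigned long)nprocs;   /* rows every rank gets  */
    extra = nrows % (unsigned long)nprocs;   /* leftover rows to share */
    r     = (unsigned long)rank;

    *count = base + (r < extra ? 1UL : 0UL);
    /* Ranks before this one hold r * base rows plus one extra row
     * for each of them that is among the first `extra` ranks. */
    *start = r * base + (r < extra ? r : extra);
}
```

<P>With 10 rows and 4 processes, for example, the blocks are rows 0-2, 3-5, 6-7 and 8-9. When there are fewer rows than processes, some ranks receive an empty block but would still have to take part in the collective call.</P>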
<H3>2.3.3. Dataset attributes access</H3>
<P>Changes to attributes can only occur at the <I>"main process" </I>(process 0).&nbsp; Read-only access to attributes can occur independently in each process that has opened the dataset.&nbsp; (API to be defined later.) <BR>
&nbsp; </P>
<H2>2.4. Dataset close</H2>
<P>All processes that have opened the dataset must close the dataset by a collective call (H5Dclose). </P>
<H2>2.5. File close</H2>
<P>All processes that have opened the file must close the file by a collective call (H5Fclose). <BR>
&nbsp; </P>
<H1>3. Parallel HDF5 Example</H1>
<PRE>
<CODE>
</CODE><A HREF="ph5example.c">Example code</A>
</PRE>
<P><HR></P>
<P>Send comments to <BR>
<A HREF="mailto:hdfparallel@ncsa.uiuc.edu">hdfparallel@ncsa.uiuc.edu</A> </P>
<H6>Last Modified: Feb 16, 1998</H6></BODY>
</HTML>