<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<title>HDF5 Files</title>
</head>
<body>
<h1>Files</h1>
<h2>1. Introduction</h2>
<p>HDF5 files are composed of a "boot block" describing information
required to portably access files on multiple platforms, followed
by information about the groups in a file and the datasets in the
file. The boot block contains information about the size of offsets
and lengths of objects, the number of entries in symbol tables
(used to store groups) and additional version information for the
file.
<h2>2. File access modes</h2>
<p>The HDF5 library assumes that all files are implicitly opened for read
access at all times. Passing the <code>H5F_ACC_RDWR</code>
parameter to <code>H5Fopen()</code> also allows write access to the
file. <code>H5Fcreate()</code> assumes write access as
well as read access. Passing <code>H5F_ACC_TRUNC</code> forces
the truncation of an existing file; otherwise <code>H5Fcreate()</code>
fails rather than overwrite an existing file.
<h2>3. Creating, Opening, and Closing Files</h2>
<p>Files are created with the <code>H5Fcreate()</code> function,
and existing files can be accessed with <code>H5Fopen()</code>. Both
functions return an object ID which should eventually be released by
calling <code>H5Fclose()</code>.
<dl>
<dt><code>hid_t H5Fcreate (const char *<em>name</em>, uintn
<em>flags</em>, hid_t <em>create_properties</em>, hid_t
<em>access_properties</em>)</code>
<dd>This function creates a new file with the specified name in
the current directory. The file is opened with read and write
permission, and if the <code>H5F_ACC_TRUNC</code> flag is set,
any current file is truncated when the new file is created.
If a file of the same name exists and the
<code>H5F_ACC_TRUNC</code> flag is not set (or the
<code>H5F_ACC_EXCL</code> bit is set), this function will
fail. Passing <code>H5P_DEFAULT</code> for the creation
and/or access property lists uses the library's default
values for those properties. Creating and changing the
values of a property list is documented further below. The
return value is an ID for the open file and it should be
closed by calling <code>H5Fclose()</code> when it's no longer
needed. A negative value is returned for failure.
<br><br>
<dt><code>hid_t H5Fopen (const char *<em>name</em>, uintn
<em>flags</em>, hid_t <em>access_properties</em>)</code>
<dd>This function opens an existing file with read permission
and write permission if the <code>H5F_ACC_RDWR</code> flag is
set. The <em>access_properties</em> argument is a file access property
list ID or <code>H5P_DEFAULT</code> for the default I/O access
parameters. Creating and changing the parameters of access property
lists is documented further below. Files which are opened
more than once return a unique identifier for each
<code>H5Fopen()</code> call and can be accessed through all
file IDs. The return value is an ID for the open file and it
should be closed by calling <code>H5Fclose()</code> when it's
no longer needed. A negative value is returned for failure.
<br><br>
<dt><code>herr_t H5Fclose (hid_t <em>file_id</em>)</code>
<dd>This function releases resources used by a file which was
opened by <code>H5Fcreate()</code> or <code>H5Fopen()</code>. After
closing a file the <em>file_id</em> should not be used again. This
function returns zero for success or a negative value for failure.
</dl>
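<p>The creation rules above can be summarized as a small decision
function. This is only a sketch of the behavior as described in the text,
not library code: the flag macros, the enum, and
<code>model_create()</code> are hypothetical names, and the flag values
are stand-ins rather than the values defined in the HDF5 headers. It
assumes that <code>H5F_ACC_EXCL</code> wins when both flags are given,
which is how the parenthetical above reads.</p>

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in flag bits for this sketch only; they are NOT the values the
 * HDF5 headers assign to H5F_ACC_TRUNC and H5F_ACC_EXCL. */
#define SKETCH_ACC_TRUNC 0x1u
#define SKETCH_ACC_EXCL  0x2u

enum create_outcome {
    CREATE_NEW_FILE,    /* no file of that name existed */
    CREATE_TRUNCATED,   /* existing file is truncated and reused */
    CREATE_FAILS        /* existing file is left untouched */
};

/* Model of the documented rules for creating a file. */
static enum create_outcome model_create(bool file_exists, unsigned flags)
{
    /* Only the two sketch flags are understood here. */
    assert((flags & ~(SKETCH_ACC_TRUNC | SKETCH_ACC_EXCL)) == 0);

    if (!file_exists)
        return CREATE_NEW_FILE;
    if ((flags & SKETCH_ACC_EXCL) || !(flags & SKETCH_ACC_TRUNC))
        return CREATE_FAILS;
    return CREATE_TRUNCATED;
}
```

<p>For instance, <code>model_create(true, SKETCH_ACC_TRUNC)</code>
yields <code>CREATE_TRUNCATED</code>, while the same call with no flags
yields <code>CREATE_FAILS</code>.</p>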
<h2>4. File Property Lists</h2>
<p>Additional parameters to <code>H5Fcreate()</code> or
<code>H5Fopen()</code> are passed through property list
objects, which are created with the <code>H5Pcreate()</code>
function. These objects allow many parameters of a file's
creation or access to be changed from the default values.
Property lists are used as a portable and extensible method of
modifying multiple parameter values with simple API functions.
There are two kinds of file-related property lists,
namely file creation properties and file access properties.
<h3>4.1. File Creation Properties</h3>
<P>File creation property lists apply to <code>H5Fcreate()</code> only
and are used to control the file meta-data which is maintained
in the boot block of the file. The parameters which can be
modified are:
<dl>
<dt>User-Block Size <dd>The "user-block" is a fixed-length block of
data located at the beginning of the file which is ignored by the
HDF5 library and may be used to store any data that
applications find useful. This value may be set to any power
of two equal to 512 or greater (i.e. 512, 1024, 2048, etc). This
parameter is set and queried with the
<code>H5Pset_userblock()</code> and
<code>H5Pget_userblock()</code> calls.
<br><br>
<dt>Offset and Length Sizes
<dd>The number of bytes used to store the offset and length of
objects in the HDF5 file can be controlled with this
parameter. Values of 2, 4 and 8 bytes are currently
supported to allow 16-bit, 32-bit and 64-bit files to
be addressed. These parameters are set and queried
with the <code>H5Pset_sizes()</code> and
<code>H5Pget_sizes()</code> calls.
<br><br>
<dt>Symbol Table Parameters
<dd>The size of symbol table B-trees can be controlled by setting
the 1/2 rank and 1/2 node size parameters of the B-tree. These
parameters are set and queried with the
<code>H5Pset_sym_k()</code> and <code>H5Pget_sym_k()</code> calls.
<br><br>
<dt>Indexed Storage Parameters
<dd>The size of indexed storage B-trees can be controlled by
setting the 1/2 rank and 1/2 node size parameters of the B-tree.
These parameters are set and queried with the
<code>H5Pset_istore_k()</code> and <code>H5Pget_istore_k()</code>
calls.
</dl>
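<p>The constraints on the first two creation parameters are easy to
check before building a property list. The helpers below are a minimal
sketch with hypothetical names; they encode only the rules stated above
(a power of two of at least 512 for the user block, and 2, 4 or 8 bytes
for offsets and lengths) and do not call the library. Whether a size of
zero, meaning no user block, is also accepted is not covered by the rule
and is not modeled here.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if `size` is a power of two that is at least
 * 512, the rule given for user-block sizes. */
static bool valid_userblock_size(uint64_t size)
{
    return size >= 512 && (size & (size - 1)) == 0;
}

/* Hypothetical helper: true if `nbytes` is one of the supported
 * offset/length sizes. */
static bool valid_offset_size(size_t nbytes)
{
    return nbytes == 2 || nbytes == 4 || nbytes == 8;
}

/* Largest offset that fits in an unsigned field of `nbytes` bytes. */
static uint64_t max_offset(size_t nbytes)
{
    assert(valid_offset_size(nbytes));
    if (nbytes == 8)
        return UINT64_MAX;      /* shifting a 64-bit value by 64 is undefined */
    return (UINT64_C(1) << (8 * nbytes)) - 1;
}
```

<p>With 2-byte offsets the largest representable offset is 65535, which
is why the larger sizes are needed for files of any realistic size.</p>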
<h3>4.2. File Access Property Lists</h3>
<p>File access property lists apply to <code>H5Fcreate()</code> or
<code>H5Fopen()</code> and are used to control different methods of
performing I/O on files.
<dl>
<dt>Unbuffered I/O
<dd>Local permanent files can be accessed with the functions described
in Section 2 of the POSIX manual, namely <code>open()</code>,
<code>lseek()</code>, <code>read()</code>, <code>write()</code>, and
<code>close()</code>. The <code>lseek64()</code> function is used
on operating systems that support it. This driver is enabled and
configured with <code>H5Pset_sec2()</code>, and queried with
<code>H5Pget_sec2()</code>.
<br><br>
<dt>Buffered I/O
<dd>Local permanent files can be accessed with the functions declared
in the <code>stdio.h</code> header file, namely
<code>fopen()</code>, <code>fseek()</code>, <code>fread()</code>,
<code>fwrite()</code>, and <code>fclose()</code>. The
<code>fseek64()</code> function is used on operating systems that
support it. This driver is enabled and configured with
<code>H5Pset_stdio()</code>, and queried with
<code>H5Pget_stdio()</code>.
<br><br>
<dt>Memory I/O
<dd>Local temporary files can be created and accessed directly from
memory without ever creating permanent storage. The library uses
<code>malloc()</code> and <code>free()</code> to create storage
space for the file. The total size of the file must be small enough
to fit in virtual memory. The name supplied to
<code>H5Fcreate()</code> is irrelevant, and <code>H5Fopen()</code>
will always fail.
<br><br>
<dt>Parallel Files using MPI I/O
<dd>This driver allows parallel access to a file through the MPI I/O
library. The parameters which can be modified are the MPI
communicator, the info object, and the access mode.
The communicator and info object are saved and then
passed to <code>MPI_File_open()</code> during file creation or open.
The access_mode controls the kind of parallel access the application
intends. (Note that it is likely that the next API revision will
remove the access_mode parameter and have access control specified
via the raw data transfer property list of <code>H5Dread()</code>
and <code>H5Dwrite()</code>.) These parameters are set and queried
with the <code>H5Pset_mpi()</code> and <code>H5Pget_mpi()</code>
calls.
<br><br>
<dt>Data Alignment
<dd>File access is sometimes faster if objects are
aligned on file blocks. This can be controlled by setting
alignment properties of a file access property list with the
<code>H5Pset_alignment()</code> function. Any allocation
request at least as large as a given threshold will be aligned on
an address which is a multiple of a given alignment value.
</dl>
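<p>The data alignment rule can be illustrated with a few lines of
arithmetic. The function below is a sketch, not the library's allocator:
<code>aligned_address()</code> and its parameter names are hypothetical,
and it assumes that an aligned request is simply moved up to the next
multiple of the alignment value.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the address at which an allocation of
 * `size` bytes would start, given the next free address `addr`.
 * Requests of at least `threshold` bytes are moved up to the next
 * multiple of `alignment`; smaller requests stay at `addr`. */
static uint64_t aligned_address(uint64_t addr, uint64_t size,
                                uint64_t threshold, uint64_t alignment)
{
    assert(alignment > 0);
    if (size < threshold)
        return addr;

    uint64_t rem = addr % alignment;
    return rem == 0 ? addr : addr + (alignment - rem);
}
```

<p>With a threshold of 4096 and an alignment of 512, a 100-byte request
at address 1000 stays at 1000, while a 4096-byte request at the same
address is placed at 1024.</p>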
<h2>5. Examples of using file templates</h2>
<h3>5.1. Example of using file creation templates</h3>
<p>The following example shows how to create a file with 64-bit object
offsets and lengths:<br>
<pre>
hid_t create_template;
hid_t file_id;
create_template = H5Pcreate(H5P_FILE_CREATE);
H5Pset_sizes(create_template, 8, 8);
file_id = H5Fcreate("test.h5", H5F_ACC_TRUNC,
create_template, H5P_DEFAULT);
.
.
.
H5Fclose(file_id);
</pre>
<h3>5.2. Example of using file access templates</h3>
<p>The following example shows how to open an existing file for
independent dataset access by MPI parallel I/O:<br>
<pre>
hid_t access_template;
hid_t file_id;
access_template = H5Pcreate(H5P_FILE_ACCESS);
H5Pset_mpi(access_template, MPI_COMM_WORLD, MPI_INFO_NULL);
/* H5Fopen must be called collectively */
file_id = H5Fopen("test.h5", H5F_ACC_RDWR, access_template);
.
.
.
/* H5Fclose must be called collectively */
H5Fclose(file_id);
</pre>
<h2>6. Low-level File Drivers</h2>
<p>HDF5 is able to access its address space through various types of
low-level <em>file drivers</em>. For instance, an address space might
correspond to a single file on a Unix file system, multiple files on a
Unix file system, multiple files on a parallel file system, or a block
of memory within the application. Generally, an HDF5 address space is
referred to as an "HDF5 file" regardless of how the space is organized
at the storage level.
<h3>6.1 Unbuffered Permanent Files</h3>
<p>The <em>sec2</em> driver uses functions from section 2 of the
POSIX manual to access files stored on a local file system. These are
the <code>open()</code>, <code>close()</code>, <code>read()</code>,
<code>write()</code>, and <code>lseek()</code> functions. If the
operating system supports <code>lseek64()</code> then it is used instead
of <code>lseek()</code>. The library buffers meta data regardless of
the low-level driver, but using this driver prevents data from being
buffered again by the lowest layers of the HDF5 library.
<dl>
<dt><code>H5F_driver_t H5Pget_driver (hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_SEC2</code> if the
<em>sec2</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_sec2 (hid_t <em>access_properties</em>)</code>
<dd>The file access properties are set to use the <em>sec2</em>
driver. Any previously defined driver properties are erased from the
property list. Additional parameters may be added to this function in
the future.
<br><br>
<dt><code>herr_t H5Pget_sec2 (hid_t <em>access_properties</em>)</code>
<dd>If the file access property list is set to the <em>sec2</em> driver
then this function returns zero; otherwise it returns a negative
value. In the future, additional arguments may be added to this
function to match those added to <code>H5Pset_sec2()</code>.
</dl>
<h3>6.2 Buffered Permanent Files</h3>
<p>The <em>stdio</em> driver uses the functions declared in the
<code>stdio.h</code> header file to access permanent files in a local
file system. These are the <code>fopen()</code>, <code>fclose()</code>,
<code>fread()</code>, <code>fwrite()</code>, and <code>fseek()</code>
functions. If the operating system supports <code>fseek64()</code> then
it is used instead of <code>fseek()</code>. Use of this driver
introduces an additional layer of buffering beneath the HDF5 library.
<dl>
<dt><code>H5F_driver_t H5Pget_driver(hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_STDIO</code> if the
<em>stdio</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_stdio (hid_t <em>access_properties</em>)</code>
<dd>The file access properties are set to use the <em>stdio</em>
driver. Any previously defined driver properties are erased from the
property list. Additional parameters may be added to this function in
the future.
<br><br>
<dt><code>herr_t H5Pget_stdio (hid_t <em>access_properties</em>)</code>
<dd>If the file access property list is set to the <em>stdio</em> driver
then this function returns zero; otherwise it returns a negative
value. In the future, additional arguments may be added to this
function to match those added to <code>H5Pset_stdio()</code>.
</dl>
<h3>6.3 Buffered Temporary Files</h3>
<p>The <em>core</em> driver uses <code>malloc()</code> and
<code>free()</code> to allocate space for a file in the heap. Reading
and writing to a file of this type results in mem-to-mem copies instead
of disk I/O and as a result is somewhat faster. However, the total file
size must not exceed the amount of available virtual memory, and only
one HDF5 file handle can access the file (because the name of such a
file is insignificant and <code>H5Fopen()</code> always fails).
<dl>
<dt><code>H5F_driver_t H5Pget_driver (hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_CORE</code> if the
<em>core</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_core (hid_t <em>access_properties</em>, size_t
<em>block_size</em>)</code>
<dd>The file access properties are set to use the <em>core</em>
driver and any previously defined driver properties are erased from
the property list. Memory for the file will always be allocated in
units of the specified <em>block_size</em>. Additional parameters may
be added to this function in the future.
<br><br>
<dt><code>herr_t H5Pget_core (hid_t <em>access_properties</em>, size_t
*<em>block_size</em>)</code>
<dd>If the file access property list is set to the <em>core</em> driver
then this function returns zero and <em>block_size</em> is set to the
block size used for the file; otherwise it returns a negative
value. In the future, additional arguments may be added to this
function to match those added to <code>H5Pset_core()</code>.
</dl>
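<p>Because memory for a <em>core</em> file is allocated in units of
<em>block_size</em>, the memory in use is generally somewhat larger than
the file itself. The sketch below shows the arithmetic, assuming the
driver rounds the file size up to a whole number of blocks;
<code>core_allocation()</code> is a hypothetical helper, not a library
function.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes of heap needed to hold a file of
 * `file_size` bytes when memory is obtained in units of `block_size`. */
static size_t core_allocation(size_t file_size, size_t block_size)
{
    assert(block_size > 0);
    /* Number of whole blocks, plus one more if there is a partial block. */
    size_t blocks = file_size / block_size + (file_size % block_size != 0);
    return blocks * block_size;
}
```

<p>A 1025-byte file with a 1024-byte block size would therefore occupy
2048 bytes under this assumption.</p>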
<h3>6.4 Parallel Files</h3>
<p>This driver uses MPI I/O to provide parallel access to a file.
<dl>
<dt><code>H5F_driver_t H5Pget_driver (hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_MPI</code> if the
<em>mpi</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_mpi (hid_t <em>access_properties</em>, MPI_Comm
<em>comm</em>, MPI_Info <em>info</em>)</code>
<dd>The file access properties are set to use the <em>mpi</em>
driver and any previously defined driver properties are erased from
the property list. Additional parameters may be added to this
function in the future.
<br><br>
<dt><code>herr_t H5Pget_mpi (hid_t <em>access_properties</em>, MPI_Comm
*<em>comm</em>, MPI_Info *<em>info</em>)</code>
<dd>If the file access property list is set to the <em>mpi</em> driver
then this function returns zero and <em>comm</em> and <em>info</em>
are set to the values stored in the property
list; otherwise the function returns a negative value. In the future,
additional arguments may be added to this function to match those
added to <code>H5Pset_mpi()</code>.
</dl>
<h3>6.5 File Families</h3>
<p>A single HDF5 address space may be split into multiple files which,
together, form a file family. Each member of the family must be the
same logical size although the size and disk storage reported by
<code>ls</code>(1) may be substantially smaller. The name passed to
<code>H5Fcreate()</code> or <code>H5Fopen()</code> should include a
<code>printf(3c)</code> style integer format specifier which will be
replaced with the family member number (the first family member is
zero).
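<p>As an illustration of the naming scheme, the sketch below formats a
member name from a family name and locates an address within the family.
Both helpers are hypothetical. The second assumes that members are laid
out back to back in the address space, each covering
<em>memb_size</em> bytes, which follows from all members having the same
logical size but is not spelled out above.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format the name of family member `member` from a
 * family name containing one printf-style integer specifier such as
 * "%d".  The family name is used as a format string, so it must come
 * from a trusted source.  Returns what snprintf() returns. */
static int family_member_name(char *buf, size_t buf_size,
                              const char *family_name, int member)
{
    assert(member >= 0);        /* the first family member is zero */
    return snprintf(buf, buf_size, family_name, member);
}

/* Hypothetical helper: split an address in the family's address space
 * into a member number and a byte offset within that member. */
static void family_locate(uint64_t addr, uint64_t memb_size,
                          uint64_t *member, uint64_t *offset)
{
    assert(memb_size > 0);
    *member = addr / memb_size;
    *offset = addr % memb_size;
}
```

<p>With the family name <code>data-%05d.h5</code>, member 3 is named
<code>data-00003.h5</code>; with a member size of 1000 bytes, address
2500 falls at offset 500 of member 2.</p>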
<p>Any HDF5 file can be split into a family of files by running
the file through <code>split</code>(1) and numbering the output
files. However, because HDF5 is lazy about extending the size
of family members, a valid file cannot generally be created by
concatenation of the family members. Additionally,
<code>split</code> and <code>cat</code> don't attempt to
generate files with holes. The <code>h5repart</code> program
can be used to repartition an HDF5 file or family into another
file or family and preserves holes in the files.
<dl>
<dt><code>h5repart</code> [<code>-v</code>] [<code>-b</code>
<em>block_size</em>[<em>suffix</em>]] [<code>-m</code>
<em>member_size</em>[<em>suffix</em>]] <em>source
destination</em>
<dd>This program repartitions an HDF5 file by copying the source
file or family to the destination file or family preserving
holes in the underlying Unix files. Families are used for the
source and/or destination if the name includes a
<code>printf</code>-style integer format such as "%d". The
<code>-v</code> switch prints input and output file names on
the standard error stream for progress monitoring,
<code>-b</code> sets the I/O block size (the default is 1kB),
and <code>-m</code> sets the output member size if the
destination is a family name (the default is 1GB). The block
and member sizes may be suffixed with the letters
<code>g</code>, <code>m</code>, or <code>k</code> for GB, MB,
or kB respectively.
<br><br>
<dt><code>H5F_driver_t H5Pget_driver (hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_FAMILY</code> if
the <em>family</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_family (hid_t <em>access_properties</em>,
hsize_t <em>memb_size</em>, hid_t <em>member_properties</em>)</code>
<dd>The file access properties are set to use the <em>family</em>
driver and any previously defined driver properties are erased
from the property list. Each member of the file family will
use <em>member_properties</em> as its file access property
list. The <em>memb_size</em> argument gives the logical size
in bytes of each family member but the actual size could be
smaller depending on whether the file contains holes. The
member size is only used when creating a new file or
truncating an existing file; otherwise the member size comes
from the size of the first member of the family being
opened. Note: if the size of the <code>off_t</code> type is
four bytes then the maximum family member size is usually
2^31-1 because the byte at offset 2,147,483,647 is generally
inaccessible. Additional parameters may be added to this
function in the future.
<br><br>
<dt><code>herr_t H5Pget_family (hid_t <em>access_properties</em>,
hsize_t *<em>memb_size</em>, hid_t
*<em>member_properties</em>)</code>
<dd>If the file access property list is set to the <em>family</em>
driver then this function returns zero; otherwise the function
returns a negative value. On successful return,
<em>member_properties</em> will point to a copy of the member
access property list which should be closed by calling
<code>H5Pclose()</code> when the application is finished with
it. If <em>memb_size</em> is non-null then it will contain
the logical size in bytes of each family member. In the
future, additional arguments may be added to this function to
match those added to <code>H5Pset_family()</code>.
</dl>
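<p>The size arguments accepted by <code>h5repart</code> can be read with
a small parser such as the one sketched below. This is not the program's
own code: <code>parse_size()</code> is a hypothetical helper, it assumes
binary multiples (1 kB = 1024 bytes), which the description above does
not state, and it does not guard against overflow or a leading sign.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser for sizes such as "64k" or "2g".  Assumes binary
 * multiples (k = 2^10, m = 2^20, g = 2^30).  Returns 0 on malformed
 * input. */
static uint64_t parse_size(const char *text)
{
    assert(text != NULL);

    char *end;
    unsigned long long value = strtoull(text, &end, 10);
    if (end == text)
        return 0;                       /* no digits at all */

    switch (*end) {
    case '\0': return value;            /* plain byte count */
    case 'k':  value <<= 10; break;
    case 'm':  value <<= 20; break;
    case 'g':  value <<= 30; break;
    default:   return 0;                /* unknown suffix */
    }
    return end[1] == '\0' ? value : 0;  /* reject trailing characters */
}
```

<p>Under these assumptions the documented defaults correspond to
<code>parse_size("1k")</code> for the block size and
<code>parse_size("1g")</code> for the member size.</p>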
<h3>6.6 Split Meta/Raw Files</h3>
<p>On occasion, it might be useful to separate meta data from raw
data. The <em>split</em> driver does this by creating two files: one for
meta data and another for raw data. The application provides a base
file name to <code>H5Fcreate()</code> or <code>H5Fopen()</code> and this
driver appends a file extension which defaults to ".meta" for the meta
data file and ".raw" for the raw data file. Each file can have its own
file access property list which allows, for instance, a split file with
meta data stored with the <em>core</em> driver and raw data stored with
the <em>sec2</em> driver.
<dl>
<dt><code>H5F_driver_t H5Pget_driver (hid_t
<em>access_properties</em>)</code>
<dd>This function returns the constant <code>H5F_LOW_SPLIT</code> if
the <em>split</em> driver is defined as the low-level driver for the
specified access property list.
<br><br>
<dt><code>herr_t H5Pset_split (hid_t <em>access_properties</em>,
const char *<em>meta_extension</em>, hid_t
<em>meta_properties</em>, const char *<em>raw_extension</em>, hid_t
<em>raw_properties</em>)</code>
<dd>The file access properties are set to use the <em>split</em>
driver and any previously defined driver properties are erased from
the property list. The meta file will have a name which is formed by
adding <em>meta_extension</em> (or ".meta") to the end of the base
name and will be accessed according to the
<em>meta_properties</em>. The raw file will have a name which is
formed by appending <em>raw_extension</em> (or ".raw") to the base
name and will be accessed according to the <em>raw_properties</em>.
Additional parameters may be added to this function in the future.
<br><br>
<dt><code>herr_t H5Pget_split (hid_t <em>access_properties</em>,
size_t <em>meta_ext_size</em>, char *<em>meta_extension</em>,
hid_t *<em>meta_properties</em>, size_t <em>raw_ext_size</em>,
char *<em>raw_extension</em>, hid_t *<em>raw_properties</em>)</code>
<dd>If the file access property list is set to the <em>split</em>
driver then this function returns zero; otherwise the function
returns a negative value. On successful return,
<em>meta_properties</em> and <em>raw_properties</em> will
point to copies of the meta and raw access property lists
which should be closed by calling <code>H5Pclose()</code> when
the application is finished with them, but if the meta and/or
raw file has no property list then a negative value is
returned for that property list handle. Also, if
<em>meta_extension</em> and/or <em>raw_extension</em> are
non-null pointers, at most <em>meta_ext_size</em> or
<em>raw_ext_size</em> characters of the meta or raw file name
extension will be copied to the specified buffer. If the
actual name is longer than what was requested then the result
will not be null terminated (similar to
<code>strncpy()</code>). In the future, additional arguments
may be added to this function to match those added to
<code>H5Pset_split()</code>.
</dl>
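<p>Two details of the <em>split</em> driver lend themselves to a short
sketch: how the two file names are formed from the base name, and how a
caller can cope with an extension buffer that may not be null
terminated. Both helpers are hypothetical and do not call the library.
The first assumes that a null extension selects the default, and the
second shows one defensive pattern: copy at most one character less than
the buffer holds and terminate the result explicitly.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "<base><extension>", falling back to
 * `default_ext` when `extension` is NULL.  Returns 0 on success or -1
 * if the result does not fit in `buf`. */
static int split_member_name(char *buf, size_t buf_size, const char *base,
                             const char *extension, const char *default_ext)
{
    assert(buf != NULL && base != NULL && default_ext != NULL);
    const char *ext = extension ? extension : default_ext;
    int n = snprintf(buf, buf_size, "%s%s", base, ext);
    return (n >= 0 && (size_t)n < buf_size) ? 0 : -1;
}

/* Hypothetical helper: copy an extension into a fixed-size buffer and
 * always terminate it, truncating the extension if necessary. */
static void copy_extension(char *dst, size_t dst_size, const char *ext)
{
    assert(dst != NULL && ext != NULL && dst_size > 0);
    strncpy(dst, ext, dst_size - 1);    /* may stop before the terminator */
    dst[dst_size - 1] = '\0';           /* so add one explicitly */
}
```

<p>For the base name <code>run1</code> and no explicit extensions, the
two files would be <code>run1.meta</code> and <code>run1.raw</code>.</p>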
<hr>
<address><a href="mailto:koziol@ncsa.uiuc.edu">Quincey Koziol</a></address>
<address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
<!-- Created: Tue Jan 27 09:11:27 EST 1998 -->
<!-- hhmts start -->
Last modified: Tue Jun 9 15:03:44 EDT 1998
<!-- hhmts end -->
</body>
</html>