<HTML><HEAD>
<TITLE>HDF5 Tutorial - Creating a Dataset
</TITLE>
</HEAD>
<body bgcolor="#ffffff">
<!-- BEGIN MAIN BODY -->
<A HREF="http://www.ncsa.uiuc.edu/"><img border=0
src="http://www.ncsa.uiuc.edu/Images/NCSAhome/footerlogo.gif"
width=78 height=27 alt="NCSA"><P></A>
[ <A HREF="title.html"><I>HDF5 Tutorial Top</I></A> ]
<H1>
<BIG><BIG><BIG><FONT COLOR="#c101cd">Creating a Dataset</FONT>
</BIG></BIG></BIG></H1>
<hr noshade size=1>
<BODY>
<H2>Contents:</H2>
<UL>
<LI> <A HREF="#def">What is a Dataset</A>?
<LI> Programming Example
<UL>
<LI> <A HREF="#desc">Description</A>
<LI> <A HREF="#rem">Remarks</A>
<LI> <A HREF="#fc">File Contents</A>
<LI> <A HREF="#ddl">Dataset Definition in DDL</A>
</UL>
</UL>
<HR>
<A NAME="def">
<H2>What is a Dataset?</h2>
<P>
A dataset is a multidimensional array of data elements, together with
supporting metadata. To create a dataset, the application program must specify
the location at which to create the dataset, the dataset name, the data type
and dataspace of the data array, and the dataset creation properties.
<P>
<H3> Data Types</H3>
A data type is a collection of data type properties, all of which can
be stored on disk and which, taken as a whole, provide complete
information for converting data to or from that data type.
<P>
There are two categories of data types in HDF5: atomic and compound data
types. An atomic type is a type which cannot be decomposed into smaller
units at the API level. A compound data type is a collection of one or more
atomic types or small arrays of such types.
<P>
Atomic types include integer, float, date and time, string, bit field, and
opaque. Figure 5.1 shows the HDF5 data types. Some of the HDF5 predefined
atomic data types are listed in Figure 5.2. In this tutorial, we consider
only HDF5 predefined integers. For more information on data types, see the
HDF5 User's Guide.
<P>
<B>Fig. 5.1</B> <I>HDF5 data types</I>
<PRE>
                                  +-- integer
                                  +-- floating point
                 +---- atomic ----+-- date and time
                 |                +-- character string
HDF5 datatypes --|                +-- bit field
                 |                +-- opaque
                 |
                 +---- compound
</PRE>
<B>Fig. 5.2</B> <I>Examples of HDF5 predefined data types</I>
<table width="52%" border="1" cellpadding="4">
<tr bgcolor="#ffcc99" bordercolor="#FFFFFF">
<td width="20%"><b>Data Type</b></td>
<td width="80%"><b>Description</b></td>
</tr>
<tr bordercolor="#FFFFFF">
<td bgcolor="#99cccc" width="20%">H5T_STD_I32LE</td>
<td width="80%">Four-byte, little-endian, signed two's complement integer</td>
</tr>
<tr bordercolor="#FFFFFF">
<td bgcolor="#99cccc" width="20%">H5T_STD_U16BE</td>
<td width="80%">Two-byte, big-endian, unsigned integer</td>
</tr>
<tr bordercolor="#FFFFFF">
<td bgcolor="#99cccc" width="20%">H5T_IEEE_F32BE</td>
<td width="80%">Four-byte, big-endian, IEEE floating point</td>
</tr>
<tr bordercolor="#FFFFFF">
<td bgcolor="#99cccc" width="20%">H5T_IEEE_F64LE</td>
<td width="80%">Eight-byte, little-endian, IEEE floating point</td>
</tr>
<tr bordercolor="#FFFFFF">
<td bgcolor="#99cccc" width="20%">H5T_C_S1</td>
<td width="80%">One-byte, null-terminated string of eight-bit characters</td>
</tr>
</table>
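<P>
To make the byte-order part of these names concrete, the following minimal
sketch shows how a signed 32-bit value could be laid out in big-endian order
(as in H5T_STD_I32BE) and in little-endian order (as in H5T_STD_I32LE). The
functions encode_i32_be and encode_i32_le are hypothetical helpers written for
this illustration only; they are not part of the HDF5 API.

```c
#include <stdint.h>

/* Hypothetical helper: store a signed 32-bit value in big-endian order,
 * most significant byte first (the layout named by H5T_STD_I32BE). */
void encode_i32_be(int32_t value, unsigned char out[4])
{
    uint32_t u = (uint32_t)value;   /* two's complement bit pattern */
    out[0] = (unsigned char)(u >> 24);
    out[1] = (unsigned char)(u >> 16);
    out[2] = (unsigned char)(u >> 8);
    out[3] = (unsigned char)u;
}

/* Hypothetical helper: store the same value in little-endian order,
 * least significant byte first (the layout named by H5T_STD_I32LE). */
void encode_i32_le(int32_t value, unsigned char out[4])
{
    uint32_t u = (uint32_t)value;
    out[0] = (unsigned char)u;
    out[1] = (unsigned char)(u >> 8);
    out[2] = (unsigned char)(u >> 16);
    out[3] = (unsigned char)(u >> 24);
}
```

For the value 1, the big-endian bytes are 00 00 00 01 and the little-endian
bytes are 01 00 00 00.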
<H3>Dataspaces</H3>
A dataspace describes the dimensionality of the data array. A dataspace
is either a regular N-dimensional array of data points, called a simple
dataspace, or a more general collection of data points organized in
another manner, called a complex dataspace. Figure 5.3 shows HDF5 dataspaces.
In this tutorial, we only consider simple dataspaces.
<P>
<B>Fig. 5.3</B> <I>HDF5 dataspaces</I>
<PRE>
                  +-- simple
HDF5 dataspaces --|
                  +-- complex
</PRE>
The dimensions of a dataset can be fixed (unchanging), or they may be
unlimited, which means that they are extendible. A dataspace can also
describe portions of a dataset, making it possible to do partial I/O
operations on selections.
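<P>
As a rough illustration of these ideas, the sketch below computes the number
of elements in a simple dataspace from its rank and dimension sizes, and checks
whether a proposed new size fits within the maximum dimensions. It is plain C
and does not call the HDF5 library: the names sketch_npoints,
sketch_can_extend, and SKETCH_UNLIMITED are assumptions made for this example,
and uint64_t stands in for the library's own size type.

```c
#include <stdint.h>

/* Hypothetical marker for an unlimited maximum dimension in this sketch. */
#define SKETCH_UNLIMITED UINT64_MAX

/* Number of elements in a simple dataspace: the product of its dimension sizes. */
uint64_t sketch_npoints(int rank, const uint64_t *dims)
{
    uint64_t n = 1;
    for (int i = 0; i < rank; i++)
        n *= dims[i];
    return n;
}

/* Returns 1 if new_dims stays within maxdims in every dimension, 0 otherwise.
 * A dimension marked SKETCH_UNLIMITED accepts any size. */
int sketch_can_extend(int rank, const uint64_t *new_dims, const uint64_t *maxdims)
{
    for (int i = 0; i < rank; i++) {
        if (maxdims[i] != SKETCH_UNLIMITED && new_dims[i] > maxdims[i])
            return 0;
    }
    return 1;
}
```

For the 4x6 array used later in this tutorial, sketch_npoints returns 24.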
<h3>Dataset creation properties</H3>
When creating a dataset, HDF5 allows users to specify how raw data is
organized on disk and how the raw data is compressed. This information is
stored in a dataset creation property list and passed to the dataset
interface. The raw data on disk can be stored contiguously (in the same
linear way that it is organized in memory), partitioned into chunks,
stored externally, etc. In this tutorial, we use the default creation
property list; that is, a contiguous storage layout with no compression.
For more information about the creation properties, see the HDF5 User's Guide.
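<P>
To illustrate what a contiguous, linear layout means for a multidimensional
array, the following sketch computes the position of an element in the linear
sequence, assuming C (row-major) ordering in which the last dimension varies
fastest. The function sketch_row_major_offset is a hypothetical helper for
this illustration, not an HDF5 call.

```c
#include <stddef.h>

/* Hypothetical helper: linear element offset of coords[] in a row-major
 * array whose dimension sizes are dims[]. */
size_t sketch_row_major_offset(int rank, const size_t *dims, const size_t *coords)
{
    size_t offset = 0;
    for (int i = 0; i < rank; i++)
        offset = offset * dims[i] + coords[i];
    return offset;
}
```

In a 4x6 array, element (2, 3) lies at offset 2 * 6 + 3 = 15, and the last
element (3, 5) lies at offset 23.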
<P>
In HDF5, data types and dataspaces are independent objects, which are created
separately from any dataset that they might be attached to. Because of this,
the creation of a dataset requires definitions of the data type and dataspace.
In this tutorial, we use HDF5 predefined data types (integer) and consider
only simple dataspaces. Hence, only the creation of dataspace objects is
needed.
<P>
To create an empty dataset (no data written) the following steps need to be
taken:
<OL>
<LI> Obtain the location id where the dataset is to be created.
<LI> Define the dataset characteristics and creation properties.
<UL>
<LI> define a data type
<LI> define a dataspace
<LI> specify dataset creation properties
</UL>
<LI> Create the dataset.
<LI> Close the data type, dataspace, and the property list if necessary.
<LI> Close the dataset.
</OL>
To create a simple dataspace, the calling program must contain the following
calls:
<PRE>
    dataspace_id = H5Screate_simple(rank, dims, maxdims);
    H5Sclose(dataspace_id);
</PRE>
To create a dataset, the calling program must contain the following calls:
<PRE>
    dataset_id = H5Dcreate(loc_id, name, type_id, space_id, create_plist_id);
    H5Dclose(dataset_id);
</PRE>
<P>
<H2> Programming Example</H2>
<A NAME="desc">
<H3><U>Description</U></H3>
The following example shows how to create an empty dataset.
It creates a file called 'dset.h5', defines the dataset dataspace, creates a
dataset which is a 4x6 integer array, and then closes the dataspace,
the dataset, and the file. <BR>
[ <A HREF="examples/h5_crtdat.c">Download h5_crtdat.c</A> ]
<PRE>
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#include &lt;hdf5.h&gt;
#define FILE "dset.h5"

int main(void) {
   hid_t   file_id, dataset_id, dataspace_id;  /* identifiers */
   hsize_t dims[2];
   herr_t  status;

   /* Create a new file using default properties. */
   file_id = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

   /* Create the dataspace for the dataset: a 4x6 array. */
   dims[0] = 4;
   dims[1] = 6;
   dataspace_id = H5Screate_simple(2, dims, NULL);

   /* Create the dataset. */
   dataset_id = H5Dcreate(file_id, "/dset", H5T_STD_I32BE, dataspace_id,
                          H5P_DEFAULT);

   /* End access to the dataset and release resources used by it. */
   status = H5Dclose(dataset_id);

   /* Terminate access to the dataspace. */
   status = H5Sclose(dataspace_id);

   /* Close the file. */
   status = H5Fclose(file_id);

   return 0;
}
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
</PRE>
<A NAME="rem">
<H3><U>Remarks</U></H3>
<UL>
<LI> H5Screate_simple creates a new simple dataspace and returns a dataspace
identifier.
<PRE>
hid_t H5Screate_simple (int rank, const hsize_t * dims,
const hsize_t * maxdims)
</PRE>
<UL>
<LI> The first parameter specifies the rank of the dataset, that is, its number of dimensions.
<LI> The second parameter specifies the size of each dimension of the dataset.
<LI> The third parameter gives the upper limit on the size of each dimension.
If it is NULL, the upper limit is the same as the dimension
sizes specified by the second parameter.
</UL>
<P>
<LI> H5Dcreate creates a dataset at the specified location and returns a
dataset identifier.
<PRE>
hid_t H5Dcreate (hid_t loc_id, const char *name, hid_t type_id,
hid_t space_id, hid_t create_plist_id)
</PRE>
<UL>
<LI> The first parameter is the location identifier.
<LI> The second parameter is the name of the dataset to create.
<LI> The third parameter is the data type identifier. H5T_STD_I32BE, a
32-bit big-endian integer, is an HDF5 atomic data type.
<LI> The fourth parameter is the dataspace identifier.
<LI> The last parameter specifies the dataset creation property list.
H5P_DEFAULT specifies the default dataset creation property list.
</UL>
<P>
<LI>H5Dcreate creates an empty array and initializes the data to 0.
<P>
<LI> When a dataset is no longer accessed by a program, H5Dclose must be
called to release the resources used by the dataset. This call is mandatory.
<PRE>
herr_t H5Dclose (hid_t dataset_id)
</PRE>
</UL>
<A NAME="fc">
<H3><U>File Contents</U></H3>
The file contents of 'dset.h5' are shown in <B>Figure 5.4</B> and <B>Figure 5.5</B>.
<table width="73%" border="1" cellspacing="4" bordercolor="#FFFFFF">
<tr bordercolor="#FFFFFF">
<td width="37%"><b>Figure 5.4</b> <i>The Contents of 'dset.h5'</i>
</td>
<td width="63%"><b>Figure 5.5</b> <i>'dset.h5' in DDL</i> </td>
</tr>
<tr bordercolor="#000000">
<!-- <td width="37%"><IMG src="dseth5.jpg" width="206" height="333"></td> -->
<td width="37%"><IMG src="img002.gif"></td>
<td width="63%">
<pre>
HDF5 "dset.h5" {
GROUP "/" {
   DATASET "dset" {
      DATATYPE { H5T_STD_I32BE }
      DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
      DATA {
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0,
         0, 0, 0, 0, 0, 0
      }
   }
}
}
</pre>
</td>
</tr>
</table>
<A NAME="ddl">
<h3><U>Dataset Definition in DDL</U></H3>
The following is the simplified DDL dataset definition:
<P>
<B>Fig. 5.6</B> <I>HDF5 Dataset Definition</I>
<PRE>
&lt;dataset&gt; ::= DATASET "&lt;dataset_name&gt;" { &lt;data type&gt;
                                        &lt;dataspace&gt;
                                        &lt;data&gt;
                                        &lt;dataset_attribute&gt;* }
&lt;data type&gt; ::= DATATYPE { &lt;atomic_type&gt; }
&lt;dataspace&gt; ::= DATASPACE { SIMPLE &lt;current_dims&gt; / &lt;max_dims&gt; }
&lt;dataset_attribute&gt; ::= &lt;attribute&gt;
</PRE>
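<P>
As a small illustration of the dataspace rule in this grammar, the sketch
below formats a DATASPACE line from the current and maximum dimension sizes.
It is plain C written for this tutorial page: the function names
sketch_append, sketch_append_dims, and sketch_dataspace_ddl are hypothetical,
and the spacing simply imitates the listing in Figure 5.5.

```c
#include <stdio.h>
#include <string.h>

/* Appends text to the NUL-terminated string in buf, truncating if needed. */
void sketch_append(char *buf, size_t size, const char *text)
{
    size_t used = strlen(buf);
    if (used + 1 < size)
        snprintf(buf + used, size - used, "%s", text);
}

/* Appends a dimension list such as "( 4, 6 )". */
void sketch_append_dims(char *buf, size_t size, int rank, const unsigned long *dims)
{
    char number[32];
    sketch_append(buf, size, "( ");
    for (int i = 0; i < rank; i++) {
        snprintf(number, sizeof number, "%s%lu", i ? ", " : "", dims[i]);
        sketch_append(buf, size, number);
    }
    sketch_append(buf, size, " )");
}

/* Writes "DATASPACE { SIMPLE <current_dims> / <max_dims> }" into buf. */
void sketch_dataspace_ddl(char *buf, size_t size, int rank,
                          const unsigned long *dims, const unsigned long *maxdims)
{
    if (size == 0)
        return;
    buf[0] = '\0';
    sketch_append(buf, size, "DATASPACE { SIMPLE ");
    sketch_append_dims(buf, size, rank, dims);
    sketch_append(buf, size, " / ");
    sketch_append_dims(buf, size, rank, maxdims);
    sketch_append(buf, size, " }");
}
```

With rank 2 and both dimension arrays set to 4 and 6, the result is the
DATASPACE line shown in Figure 5.5.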
<!-- BEGIN FOOTER INFO -->
<P><hr noshade size=1>
<font face="arial,helvetica" size="-1">
<a href="http://www.ncsa.uiuc.edu/"><img border=0
src="http://www.ncsa.uiuc.edu/Images/NCSAhome/footerlogo.gif"
width=78 height=27 alt="NCSA"><br>
The National Center for Supercomputing Applications</A><br>
<a href="http://www.uiuc.edu/">University of Illinois
at Urbana-Champaign</a><br>
<br>
<!-- <A HREF="helpdesk.mail.html"> -->
<A HREF="mailto:hdfhelp@ncsa.uiuc.edu">
hdfhelp@ncsa.uiuc.edu</A>
<BR> <H6>Last Modified: August 27, 1999</H6><BR>
<!-- modified by Barbara Jones - bljones@@ncsa.uiuc.edu -->
</FONT>
<BR>
<!-- <A HREF="mailto:hdfhelp@@ncsa.uiuc.edu"> -->
</BODY>
</HTML>