<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
<title>Ragged Arrays</title>
</head>
<body>
<h1>Ragged Arrays</h1>
<h2>1. Introduction</h2>
<p><b>Ragged arrays should be considered alpha quality. They were
added to HDF5 to satisfy the needs of the ASCI/DMF vector
bundle project; the interface and storage methods are likely
to change in the future in ways that are not backward
compatible.</b>
    <p>A two-dimensional ragged array has been added to the library,
      built on top of existing functionality. A ragged
array is a one-dimensional array of <em>rows</em> where the
length of any row is independent of the lengths of the other
rows. The number of rows and the length of each row can be
changed at any time (the current version does not support
truncating an array by removing rows). All elements of the
ragged array have the same data type and, as with datasets, the
data is type-converted between memory buffers and files.
    <p>The current implementation works best when most of the rows are
      approximately the same length, since a two-dimensional dataset
      can be created to hold a nominal number of elements from each
      row, with the additional elements stored in a separate dataset
      that implements a heap.
<p>A ragged array is a composite object implemented as a group
with three datasets. The name of the group is the name of the
ragged array. The <em>raw</em> dataset is a two-dimensional
array that contains the first <em>N</em> elements of each row
where <em>N</em> is determined by the application when the array
      is created. If most rows have fewer than <em>N</em> elements,
      much of the space allocated to the <em>raw</em> dataset goes
      unused (internal fragmentation).
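    <p>The trade-off in choosing <em>N</em> can be estimated before
      an array is created. The following is a minimal sketch in plain
      C that does not call the library; the helper names
      <code>raw_unused_slots</code> and <code>overflow_elements</code>
      are hypothetical and are used here only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of HDF5: count the slots of the raw
 * dataset that hold no data when every row reserves nominal_width
 * slots. */
size_t raw_unused_slots(const size_t *row_len, size_t nrows,
                        size_t nominal_width)
{
    size_t unused = 0;

    assert(row_len != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++) {
        if (row_len[i] < nominal_width)
            unused += nominal_width - row_len[i];
    }
    return unused;
}

/* Hypothetical helper, not part of HDF5: count the elements that do
 * not fit in the raw dataset and would be placed in the over dataset. */
size_t overflow_elements(const size_t *row_len, size_t nrows,
                         size_t nominal_width)
{
    size_t overflow = 0;

    assert(row_len != NULL || nrows == 0);
    for (size_t i = 0; i < nrows; i++) {
        if (row_len[i] > nominal_width)
            overflow += row_len[i] - nominal_width;
    }
    return overflow;
}
```

    <p>Evaluating both functions for several candidate widths against
      a sample of expected row lengths gives a rough picture of how
      much space is wasted in <em>raw</em> versus how much data spills
      into <em>over</em>.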
<p>The <em>over</em> dataset is a one-dimensional array that
contains elements from each row that don't fit in the
<em>raw</em> dataset.
<p>The <em>meta</em> dataset maintains information about each row
such as the number of elements in the row, the location of the
overflow elements in the <em>over</em> dataset (if any), and the
amount of space reserved in <em>over</em> for the row. The
<em>meta</em> dataset has one entry per row and is where most of
the storage overhead is concentrated when rows are relatively
short.
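    <p>The relationship between the three datasets can be sketched
      as a lookup that decides where a given element of a row lives.
      This is an illustrative model only: the type
      <code>row_meta_t</code>, its field names, and the function
      <code>locate_element</code> are hypothetical, and the sketch
      assumes that a row's overflow elements are stored contiguously
      in <em>over</em>, which the description above suggests but does
      not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-row record modeled on the description of the meta
 * dataset; the real on-disk layout may differ. */
typedef struct {
    size_t nelmts;    /* number of elements in the row              */
    size_t over_off;  /* index of the first overflow element in over */
    size_t over_rsvd; /* slots reserved in over for this row         */
} row_meta_t;

/* Locate element `col` of a row.
 * Returns the column within the row of the raw dataset and sets
 * *in_over to 0, or returns the index into the over dataset and sets
 * *in_over to 1. Returns -1 if `col` is past the end of the row. */
long locate_element(const row_meta_t *m, size_t nominal_width,
                    size_t col, int *in_over)
{
    assert(m != NULL && in_over != NULL);

    if (col >= m->nelmts)
        return -1;                 /* no such element in this row */
    if (col < nominal_width) {
        *in_over = 0;              /* element lives in raw */
        return (long)col;
    }
    *in_over = 1;                  /* element lives in over */
    return (long)(m->over_off + (col - nominal_width));
}
```

    <p>With a nominal width of 4, a row of 6 elements keeps its first
      four elements in <em>raw</em> and the remaining two in
      <em>over</em>, starting at the offset recorded for that row.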
<h2>2. Opening and Closing</h2>
<dl>
<dt><code>hid_t H5Rcreate (hid_t <em>location</em>, const char
*<em>name</em>, hid_t <em>type</em>, hid_t
<em>plist</em>)</code>
<dd>This function creates a new ragged array by creating the
group with the specified name and populating it with the
component datasets (which should not be accessed
independently). The dataset creation property list
<em>plist</em> defines the width of the <em>raw</em> dataset;
	the nominal row length is taken to be the width of a chunk. The
<em>type</em> argument defines the data type which will be
stored in the file. A negative value is returned if the array
cannot be created.
<br><br>
<dt><code>hid_t H5Ropen (hid_t <em>location</em>, const char
*<em>name</em>)</code>
<dd>This function opens a ragged array by opening the specified
group and the component datasets (which should not be accessed
	independently). A negative value is returned if the array cannot
be opened.
<br><br>
<dt><code>herr_t H5Rclose (hid_t <em>array</em>)</code>
<dd>All ragged arrays should be closed by calling this
function. The group and component datasets will be closed
automatically by the library.
</dl>
<h2>3. Reading and Writing</h2>
<p>In order to be as efficient as possible the ragged array layer
operates on sets of contiguous rows and it is to the
application's advantage to perform I/O on as many rows at a time
as possible. These functions take a starting row number and the
number of rows on which to operate.
<dl>
<dt><code>herr_t H5Rwrite (hid_t <em>array_id</em>, hssize_t
<em>start_row</em>, hsize_t <em>nrows</em>, hid_t
<em>type</em>, hsize_t <em>size</em>[], void
*<em>buf</em>[])</code>
<dd>A set of ragged array rows beginning at <em>start_row</em>
and continuing for <em>nrows</em> is written to the file,
converting the memory data type <em>type</em> to the file data
type which was defined when the array was created. The number
of elements to write from each row is specified in the
<em>size</em> array and the data for each row is pointed to
	from the <em>buf</em> array. The <em>size</em> and
	<em>buf</em> arrays are indexed so that their first elements
	correspond to the first row on which to operate.
<br><br>
<dt><code>herr_t H5Rread (hid_t <em>array_id</em>, hssize_t
<em>start_row</em>, hsize_t <em>nrows</em>, hid_t
<em>type</em>, hsize_t <em>size</em>[], void
*<em>buf</em>[])</code>
<dd>A set of ragged array rows beginning at <em>start_row</em>
and continuing for <em>nrows</em> is read from the file,
converting from the file data type which was defined when the
array was created to the memory data type <em>type</em>. The
number of elements to read from each row is specified in the
<em>size</em> array and the buffers in which to place the
	results are pointed to by the <em>buf</em> array. On return,
	the <em>size</em> array will contain the actual size of each
	row, which may be different from the requested size. When the
	requested size is smaller than the actual size, the row will be
	truncated; otherwise the remainder of the output buffer will
	be zero filled. If a pointer in the <em>buf</em> array is
	null then the library will ignore the corresponding
	<em>size</em> value and allocate a buffer large enough to hold
	the entire row. This function returns a negative value on
	failure, in which case <em>buf</em> contains its original
	input values.
</dl>
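    <p>The per-row behavior described for <code>H5Rread</code>
      (truncation, zero filling, reporting the actual size, and
      allocating when the buffer pointer is null) can be modeled in
      memory as follows. This is a sketch of the documented semantics
      for a single row of <code>int</code> elements, not the library's
      implementation; <code>model_read_row</code> is a hypothetical
      name, and the use of <code>malloc</code> for the allocated
      buffer is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory model of reading one row.
 *   row, actual : the stored row and its actual length
 *   *size       : in: requested element count; out: actual length
 *   *buf        : destination; if NULL, a buffer holding the whole
 *                 row is allocated and the request size is ignored
 *                 (the caller releases it with free())
 * Returns 0 on success, or -1 on failure with *buf left unchanged. */
int model_read_row(const int *row, size_t actual, size_t *size, int **buf)
{
    assert(size != NULL && buf != NULL);
    assert(row != NULL || actual == 0);

    if (*buf == NULL) {
        /* Allocate at least one element so the result is never NULL. */
        int *tmp = malloc((actual ? actual : 1) * sizeof *tmp);
        if (tmp == NULL)
            return -1;
        if (actual)
            memcpy(tmp, row, actual * sizeof *tmp);
        *buf = tmp;
    } else {
        size_t requested = *size;
        size_t ncopy = requested < actual ? requested : actual;

        /* Copy what fits; a short request truncates the row. */
        if (ncopy)
            memcpy(*buf, row, ncopy * sizeof **buf);
        /* A long request zero fills the rest of the buffer. */
        if (requested > ncopy)
            memset(*buf + ncopy, 0, (requested - ncopy) * sizeof **buf);
    }
    *size = actual; /* report the actual row length */
    return 0;
}
```

    <p>A caller that does not know the row lengths in advance can
      pass null buffer pointers and rely on the returned
      <em>size</em> values, at the cost of one allocation per row.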
<hr>
<address><a href="mailto:matzke@llnl.gov">Robb Matzke</a></address>
<!-- Created: Wed Aug 26 14:10:32 EDT 1998 -->
<!-- hhmts start -->
Last modified: Fri Aug 28 14:27:19 EDT 1998
<!-- hhmts end -->
</body>
</html>