/** @page ViewToolsEdit Command-line Tools For Editing HDF5 Files
Navigate back: \ref index "Main" / \ref GettingStarted / \ref ViewToolsCommand
\section secViewToolsEditTOC Contents
\li \ref secViewToolsEditRemove
\li \ref secViewToolsEditChange
\li \ref secViewToolsEditApply
\li \ref secViewToolsEditCopy
\li \ref secViewToolsEditAdd
\section secViewToolsEditRemove Remove Inaccessible Objects and Unused Space in a File
HDF5 files may accumulate unused space when they are repeatedly read and rewritten, or when objects are
deleted from them. With many edits and deletions, this unused space can add up to a sizable amount.

The h5repack tool can be used to remove unused space in an HDF5 file. If no options other than the
input and output HDF5 files are specified on the h5repack command line, it writes the input file to the
new output file, getting rid of the unused space:
\code
h5repack <input file> <output file>
\endcode
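As a rough way to check how much space a repack recovered, the sizes of the input and output files can be compared. The sketch below is only illustrative: bytes_saved is a hypothetical helper, and in.h5 and out.h5 are placeholder file names that do not come from this page.

```shell
# bytes_saved OLD NEW: print how many bytes smaller NEW is than OLD.
# (Hypothetical helper, not part of the HDF5 tools.)
bytes_saved() {
    old_size=$(wc -c < "$1")
    new_size=$(wc -c < "$2")
    echo $((old_size - new_size))
}

# Repack only if h5repack is installed and the placeholder input file exists.
if command -v h5repack >/dev/null 2>&1 && [ -f in.h5 ]; then
    h5repack in.h5 out.h5 && bytes_saved in.h5 out.h5
fi
```

A result at or near 0 suggests that the input file held little unused space to begin with.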
\section secViewToolsEditChange Change a Dataset's Storage Layout
The h5repack utility can be used to change a dataset's storage layout. By default, the storage layout of
a dataset is defined at creation time and cannot be changed afterward. However, with h5repack you can
write an HDF5 file to a new file and change the layout for objects in the new file.

The -l option in h5repack is used to change the layout for an object. The string following the -l
option defines the layout type and parameters for the specified objects (or for all objects):
\code
h5repack -l [list of objects:]<layout type>=<layout parameters> <input file> <output file>
\endcode
If no object is specified, then everything in the input file will be written to the output file with the specified
layout type and parameters. If objects are specified then everything in the input file will be written to the
output file as is, except for those specified objects. They will be written to the output file with the given
layout type and parameters.

The following table describes the dataset layouts and the h5repack options used to change a dataset's layout:
<table>
<tr><th>Storage Layout</th><th>h5repack Option</th><th>Description</th></tr>
<tr><td>Contiguous</td><td>CONTI</td><td>Data is stored physically together</td></tr>
<tr><td>Chunked</td><td>CHUNK=DIM[xDIM...xDIM]</td><td>Data is stored in DIM[xDIM...xDIM] chunks</td></tr>
<tr><td>Compact</td><td>COMPA</td><td>Data is stored in the header of the object (less I/O)</td></tr>
</table>
If you type h5repack -h on the command line, you will see a detailed usage statement with examples of
modifying the layout.

In the following example, the dataset /dset in the file dset.h5 is contiguous, as shown by the
h5dump -pH command. The h5repack utility writes dset.h5 to a new file, dsetrpk.h5, where the dataset
dset is chunked. This can be seen by examining the resulting dsetrpk.h5 file with h5dump, as shown:
\code
$ h5dump -pH dset.h5
HDF5 "dset.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
STORAGE_LAYOUT {
CONTIGUOUS
SIZE 96
OFFSET 1400
}
FILTERS {
NONE
}
FILLVALUE {
FILL_TIME H5D_FILL_TIME_IFSET
VALUE 0
}
ALLOCATION_TIME {
H5D_ALLOC_TIME_LATE
}
}
}
}
$ h5repack -l dset:CHUNK=4x6 dset.h5 dsetrpk.h5
$ h5dump -pH dsetrpk.h5
HDF5 "dsetrpk.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
STORAGE_LAYOUT {
CHUNKED ( 4, 6 )
SIZE 96
}
FILTERS {
NONE
}
FILLVALUE {
FILL_TIME H5D_FILL_TIME_IFSET
VALUE 0
}
ALLOCATION_TIME {
H5D_ALLOC_TIME_INCR
}
}
}
}
\endcode
There can be many reasons that the storage layout needs to be changed for a dataset. For example,
there may be a performance issue with a dataset due to a small chunk size.
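
To see why chunk size matters, it can help to count how many chunks a given CHUNK setting produces, since each chunk is stored and looked up separately. The following sketch is plain shell arithmetic and does not call any HDF5 tool. chunk_count is a hypothetical helper, and the 12x18 dataset with 5x6 chunks is an illustrative input; partial chunks at the edges are counted as whole chunks.

```shell
# chunk_count DIMS CHUNK: number of chunks needed to cover a dataset.
# Both arguments use the DIM[xDIM...] form accepted by the CHUNK= option.
chunk_count() {
    dims=$1 chunk=$2 total=1
    while [ -n "$dims" ]; do
        d=${dims%%x*}   # leading dataset dimension
        c=${chunk%%x*}  # leading chunk dimension
        # Ceiling division: chunks needed along this dimension.
        total=$((total * ((d + c - 1) / c)))
        # Drop the leading dimension, or stop when none are left.
        case $dims in
            *x*) dims=${dims#*x}; chunk=${chunk#*x} ;;
            *)   dims= ;;
        esac
    done
    echo "$total"
}

chunk_count 4x6 4x6     # prints 1
chunk_count 12x18 5x6   # prints 9
```

A very large count for a modest dataset is one hint that the chunk size may be too small.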
\section secViewToolsEditApply Apply Compression Filter to a Dataset
The h5repack utility can be used to compress or remove compression from a dataset in a file. By default,
compression cannot be added to or removed from a dataset once it has been created. However, with
h5repack you can write a file to a new file and specify a compression filter to apply to a dataset or
datasets in the new file.

To apply a filter to an object in an HDF5 file, specify the -f option, where the string following the
-f option defines the filter and its parameters (if there are any) to apply to a given object or objects:
\code
h5repack -f [list of objects:]<name of filter>=<filter parameters> <input file> <output file>
\endcode
If no objects are specified then everything in the input file will be written to the output file with
the filter and parameters specified. If objects are specified, then everything in the input file will
be written to the output file as is, except for the specified objects. They will be written to the
output file with the filter and parameters specified.
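
If you type h5repack --help on the command line, you will see a detailed usage statement with examples
of modifying a filter. Numerous filters can be applied to a dataset:
<table>
<tr><th>Filter</th><th>Options</th></tr>
<tr><td>GZIP compression (levels 1-9)</td><td>GZIP=&lt;deflation level&gt;</td></tr>
<tr><td>SZIP compression</td><td>SZIP=&lt;pixels per block,coding&gt;</td></tr>
<tr><td>Shuffle filter</td><td>SHUF</td></tr>
<tr><td>Checksum filter</td><td>FLET</td></tr>
<tr><td>NBIT compression</td><td>NBIT</td></tr>
<tr><td>HDF5 Scale/Offset filter</td><td>SOFF=&lt;scale_factor,scale_type&gt;</td></tr>
<tr><td>User defined filter</td><td>UD=&lt;filter_number,filter_flag,cd_value_count,value1[,value2,...,valueN]&gt;</td></tr>
<tr><td>Remove ALL filters</td><td>NONE</td></tr>
</table>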

Be aware that a dataset must be chunked to apply compression to it. If the dataset is not already chunked,
then h5repack will apply chunking to it. Chunking and compression cannot both be applied to a dataset
at the same time with h5repack.

In the following example:
\li h5dump lists the properties for the objects in dset.h5. Note that the dataset dset is contiguous.
\li h5repack writes dset.h5 into a new file, dsetrpk.h5, applying GZIP level 5 compression to the dataset /dset in dsetrpk.h5.
\li h5dump lists the properties for the new dsetrpk.h5 file. Note that /dset is both compressed and chunked.

\code
$ h5dump -pH dset.h5
HDF5 "dset.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 12, 18 ) / ( 12, 18 ) }
STORAGE_LAYOUT {
CONTIGUOUS
SIZE 864
OFFSET 1400
}
FILTERS {
NONE
}
FILLVALUE {
FILL_TIME H5D_FILL_TIME_IFSET
VALUE 0
}
ALLOCATION_TIME {
H5D_ALLOC_TIME_LATE
}
}
}
}
$ h5repack -f dset:GZIP=5 dset.h5 dsetrpk.h5
$ h5dump -pH dsetrpk.h5
HDF5 "dsetrpk.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 12, 18 ) / ( 12, 18 ) }
STORAGE_LAYOUT {
CHUNKED ( 12, 18 )
SIZE 160 (5.400:1 COMPRESSION)
}
FILTERS {
COMPRESSION DEFLATE { LEVEL 5 }
}
FILLVALUE {
FILL_TIME H5D_FILL_TIME_IFSET
VALUE 0
}
ALLOCATION_TIME {
H5D_ALLOC_TIME_INCR
}
}
}
}
\endcode
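The compression ratio that h5dump reports in the example above can be reproduced by hand: the raw size is the number of elements times the element size, and the ratio is the raw size divided by the stored size. A small sketch of that arithmetic, where ratio is a hypothetical helper name:

```shell
# ratio RAW STORED: print RAW/STORED to three decimals, in the same
# "N:1" style that h5dump shows next to SIZE.
ratio() {
    awk -v raw="$1" -v stored="$2" 'BEGIN { printf "%.3f:1\n", raw / stored }'
}

# 12 x 18 elements of a 4-byte integer type (H5T_STD_I32BE) = 864 bytes
raw=$((12 * 18 * 4))
echo "$raw"        # prints 864
ratio "$raw" 160   # prints 5.400:1
```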
\section secViewToolsEditCopy Copy Objects to Another File
The h5copy utility can be used to copy an object or objects from one HDF5 file to another or to a
different location in the same file. It uses the #H5Ocopy and #H5Lcopy APIs in HDF5.

Following are some of the options that can be used with h5copy.
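<table>
<tr><th>h5copy Options</th><th>Description</th></tr>
<tr><td>-i, --input</td><td>Input file name</td></tr>
<tr><td>-o, --output</td><td>Output file name</td></tr>
<tr><td>-s, --source</td><td>Source object name</td></tr>
<tr><td>-d, --destination</td><td>Destination object name</td></tr>
<tr><td>-p, --parents</td><td>Make parent groups as needed</td></tr>
<tr><td>-v, --verbose</td><td>Verbose mode</td></tr>
<tr><td>-f, --flag</td><td>Flag type</td></tr>
</table>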

For a complete list of options and information on using h5copy, type:
\code
h5copy --help
\endcode
In the example below, the dataset /MyGroup/Group_A/dset2 in groups.h5 gets copied to the root ("/")
group of a new file, newgroup.h5, with the name dset3:
\code
$ h5dump -H groups.h5
HDF5 "groups.h5" {
GROUP "/" {
GROUP "MyGroup" {
GROUP "Group_A" {
DATASET "dset2" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 2, 10 ) / ( 2, 10 ) }
}
}
GROUP "Group_B" {
}
DATASET "dset1" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 3, 3 ) / ( 3, 3 ) }
}
}
}
}
$ h5copy -i groups.h5 -o newgroup.h5 -s /MyGroup/Group_A/dset2 -d /dset3
$ h5dump -H newgroup.h5
HDF5 "newgroup.h5" {
GROUP "/" {
DATASET "dset3" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 2, 10 ) / ( 2, 10 ) }
}
}
}
\endcode
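When several objects need to be copied, the same h5copy invocation can be generated once per object. The sketch below only prints the commands (a dry run) so they can be reviewed first. plan_copies is a hypothetical helper, and placing each object at the root of the output file under its own name is an illustrative choice, not something h5copy requires.

```shell
# plan_copies INPUT OUTPUT SRC...: print one h5copy command per source
# object. The destination is the last component of the source path,
# placed in the root group of the output file.
plan_copies() {
    infile=$1 outfile=$2
    shift 2
    for src in "$@"; do
        echo "h5copy -i $infile -o $outfile -s $src -d /${src##*/}"
    done
}

plan_copies groups.h5 newgroup.h5 /MyGroup/Group_A/dset2 /MyGroup/dset1
# prints:
#   h5copy -i groups.h5 -o newgroup.h5 -s /MyGroup/Group_A/dset2 -d /dset2
#   h5copy -i groups.h5 -o newgroup.h5 -s /MyGroup/dset1 -d /dset1
```

Once the printed commands look right, the output can be piped to sh to run them.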
There are also h5copy flags that can be specified with the -f option. In the example below, the
-f shallow option specifies that only the immediate members of the group /MyGroup are copied from
the groups.h5 file mentioned above to a new file, mygrouponly.h5:
\code
h5copy -v -i groups.h5 -o mygrouponly.h5 -s /MyGroup -d /MyGroup -f shallow
\endcode
The output of the above command is shown below. The verbose option -v describes the action that was
taken, as shown in the first two lines of the output.
\code
Copying file <groups.h5> and object </MyGroup> to file <mygrouponly.h5> and object </MyGroup>
Using shallow flag
$ h5dump -H mygrouponly.h5
HDF5 "mygrouponly.h5" {
GROUP "/" {
GROUP "MyGroup" {
GROUP "Group_A" {
}
GROUP "Group_B" {
}
DATASET "dset1" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 3, 3 ) / ( 3, 3 ) }
}
}
}
}
\endcode
\section secViewToolsEditAdd Add or Remove User Block from File
The user block is a space in an HDF5 file that is not interpreted by the HDF5 library. It is a file
creation property that can be set when creating a file. See the #H5Pset_userblock API in the \ref RM
for more information regarding this property.

Once created in a file, the user block cannot be removed. However, you can use the h5jam and h5unjam
utilities to write a copy of the file with a user block added or removed.

These two utilities work similarly, except that h5jam adds a user block to a file and h5unjam removes
the user block. You can also overwrite or delete a user block in a file.

Specify the -h option to see a complete list of options that can be used with h5jam and h5unjam.
For example:
\code
h5jam -h
h5unjam -h
\endcode
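Below are the basic options for adding or removing a user block with h5jam and h5unjam:
<table>
<tr><th>h5jam/h5unjam Options</th><th>Description</th></tr>
<tr><td>-i</td><td>Input file</td></tr>
<tr><td>-o</td><td>Output file</td></tr>
<tr><td>-u</td><td>File to add to the user block (h5jam) or file to receive the removed user block (h5unjam)</td></tr>
</table>
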
Let's say you wanted to add the program that creates an HDF5 file to its user block. As an example, you
can take the h5_crtgrpar.c program from the \ref LBExamples and add it to the file it creates,
groups.h5. This can be done with h5jam, as follows:
\code
h5jam -i groups.h5 -u h5_crtgrpar.c -o groupsub.h5
\endcode
You can view the file with more groupsub.h5 to see that the h5_crtgrpar.c file is indeed included.

To remove the user block that was just added, type:
\code
h5unjam -i groupsub.h5 -u h5_crtgrparNEW.c -o groups-noub.h5
\endcode
This writes the user block in the file groupsub.h5 into h5_crtgrparNEW.c. The new HDF5 file,
groups-noub.h5, will not contain a user block.
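
#H5Pset_userblock accepts only sizes that are 0 or a power of 2 that is at least 512 bytes, so a file placed in the user block generally occupies more space than its own length. The sketch below estimates the smallest such size for a given byte count. userblock_size is a hypothetical helper, and whether h5jam picks exactly this size for a given input is an assumption here, not something stated on this page.

```shell
# userblock_size NBYTES: smallest power of 2 that is at least 512 and
# large enough to hold NBYTES bytes of user data.
userblock_size() {
    size=512
    while [ "$size" -lt "$1" ]; do
        size=$((size * 2))
    done
    echo "$size"
}

userblock_size 300    # prints 512
userblock_size 1500   # prints 2048

# Illustrative use with the file from the example above:
#   userblock_size "$(wc -c < h5_crtgrpar.c)"
```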
Navigate back: \ref index "Main" / \ref GettingStarted / \ref ViewToolsCommand
*/
/** @page ViewToolsConvert Command-line Tools For Converting HDF5 Files
Navigate back: \ref index "Main" / \ref GettingStarted / \ref ViewToolsCommand
\section secViewToolsConvertTOC Contents
\li \ref secViewToolsConvertASCII
\li \ref secViewToolsConvertBinary
\li \ref secViewToolsConvertExport
\section secViewToolsConvertASCII Output HDF5 Dataset into an ASCII File (to Import into Excel and Other Applications)
The h5dump utility can be used to convert an HDF5 dataset into an ASCII file, which can then be imported
into Excel and other applications. The following options are used:
<table>
<tr><th>Options</th><th>Description</th></tr>
<tr><td>-d D, --dataset=D</td><td>Display dataset D</td></tr>
<tr><td>-o F, --output=F</td><td>Output raw data into file F</td></tr>
<tr><td>-y, --noindex</td><td>Suppress printing of array indices with the data</td></tr>
<tr><td>-w N, --width=N</td><td>Set N number of columns of output. A value of 0 sets the number to 65535 (the maximum)</td></tr>
</table>

As an example, h5_crtdat.c from the \ref LBDsetCreate HDF5 Tutorial topic creates the file dset.h5
with a dataset /dset that is a 4 x 6 integer array. The following is displayed when viewing dset.h5
with h5dump:
\code
$ h5dump dset.h5
HDF5 "dset.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
(0,0): 1, 2, 3, 4, 5, 6,
(1,0): 7, 8, 9, 10, 11, 12,
(2,0): 13, 14, 15, 16, 17, 18,
(3,0): 19, 20, 21, 22, 23, 24
}
}
}
}
\endcode
The following command will output the values of the /dset dataset to the ASCII file dset.asci:
\code
h5dump -d /dset -o dset.asci -y -w 50 dset.h5
\endcode
In particular, note that:
\li The default behavior of h5dump is to print indices, and the -y option suppresses this.
\li The -w 50 option tells h5dump to allow 50 columns for outputting the data. The value specified
    must be large enough to accommodate the dimension size of the dataset multiplied by the number of
    positions and spaces needed to print each value. If the value is not large enough, the output will
    wrap to the next line, and the data will not display as expected in Excel or other applications. To
    ensure that the output does not wrap to the next line, you can also specify 0 (zero) for the -w option.
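
A rough lower bound for the -w value can be estimated from the rule above. The sketch assumes each value is printed as its digits followed by a comma and a space. min_width is a hypothetical helper, and the result is an estimate, not the exact formula h5dump uses.

```shell
# min_width NCOLS MAXDIGITS: estimated characters needed for one row,
# counting MAXDIGITS digits plus a comma and a space for each value.
min_width() {
    echo $(( $1 * ($2 + 2) ))
}

min_width 6 2   # prints 24 (six values of at most two digits each)
```

For the 4 x 6 dataset in this example the estimate is 24, so the value 50 used above leaves ample room.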

In addition to creating the ASCII file dset.asci, the above command outputs the metadata of the
specified dataset:
\code
HDF5 "dset.h5" {
DATASET "/dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
}
}
}
\endcode
The dset.asci file will contain the values for the dataset:
\code
1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24
\endcode
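Applications that expect strict comma-separated values may need the spaces and the trailing comma on each line removed first. The following is a minimal sketch using sed, with two lines of the output above fed in as sample input; asci_to_csv is a hypothetical helper name.

```shell
# asci_to_csv: read h5dump -y style rows on stdin, drop all whitespace
# and any trailing comma, and write plain CSV rows to stdout.
asci_to_csv() {
    sed -e 's/[[:space:]]//g' -e 's/,$//'
}

printf '%s\n' '1, 2, 3, 4, 5, 6,' '19, 20, 21, 22, 23, 24' | asci_to_csv
# prints:
#   1,2,3,4,5,6
#   19,20,21,22,23,24
```

With a real file, the same filter could be applied as asci_to_csv < dset.asci > dset.csv, where dset.csv is an illustrative output name.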
\section secViewToolsConvertBinary Output HDF5 Dataset into Binary File
The h5dump utility can be used to convert an HDF5 dataset to a binary file with the following options:
<table>
<tr><th>Options</th><th>Description</th></tr>
<tr><td>-d D, --dataset=D</td><td>Display dataset D</td></tr>
<tr><td>-o F, --output=F</td><td>Output raw data into file F</td></tr>
<tr><td>-b B, --binary=B</td><td>Binary file output of form B. Valid values are: LE, BE, NATIVE, FILE</td></tr>
</table>

As an example, h5_crtdat.c from the \ref LBDsetCreate HDF5 Tutorial topic creates the file dset.h5
with a dataset /dset that is a 4 x 6 integer array. The following is displayed when viewing dset.h5
with h5dump:
\code
$ h5dump -d /dset/ dset.h5
HDF5 "dset.h5" {
DATASET "/dset/" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
(0,0): 1, 2, 3, 4, 5, 6,
(1,0): 7, 8, 9, 10, 11, 12,
(2,0): 13, 14, 15, 16, 17, 18,
(3,0): 19, 20, 21, 22, 23, 24
}
}
}
\endcode
As specified by the -d and -o options, the following h5dump command will output the values of the
dataset /dset to a file called dset.bin. The -b option specifies that the output will be binary in
Little Endian format (LE).
\code
h5dump -d /dset -b LE -o dset.bin dset.h5
\endcode
This command outputs the metadata for the dataset, as well as creating the binary file dset.bin:
\code
HDF5 "dset.h5" {
DATASET "/dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
}
}
}
\endcode
If you look at the resulting dset.bin file with a binary editor, you will see that it contains the
dataset's values. For example, on Linux you will see:
\code
$ od -t d dset.bin
0000000 1 2 3 4
0000020 5 6 7 8
0000040 9 10 11 12
0000060 13 14 15 16
0000100 17 18 19 20
0000120 21 22 23 24
0000140
\endcode
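The final offset in the od listing can be checked by hand: od prints offsets in octal, and a 4 x 6 array of 4-byte integers occupies 96 bytes. A small sketch of that arithmetic in shell:

```shell
# 4 rows x 6 columns x 4 bytes per 32-bit integer value
total_bytes=$((4 * 6 * 4))
echo "$total_bytes"              # prints 96

# od shows offsets as 7-digit octal numbers; 96 decimal is 140 octal.
printf '%07o\n' "$total_bytes"   # prints 0000140
```

If dset.bin has a different size than expected, wc -c dset.bin is a quick way to confirm it.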
\section secViewToolsConvertExport Export from h5dump and Import into HDF5
The h5import utility can use the output of h5dump as input to create a dataset or file.

The h5dump utility must first create two files:
\li A DDL file, which will be used as an h5import configuration file
\li A raw data file containing the data to be imported

The DDL file must be generated with the h5dump -p option, to generate properties.

The raw data file that can be imported into HDF5 using this method may contain either numeric or string
data with the following restrictions:
\li Numeric data requires the use of the h5dump -b option to produce a binary data file.
\li String data must be written with the h5dump -y and --width=1 options, generating a single column
    of strings without indices.

Two examples follow. The first imports a dataset with a numeric datatype. Note that numeric data requires
the use of the h5dump -b option to produce a binary data file. The example program (h5_crtdat.c) that
creates this file is included with the \ref IntroHDF5 tutorial and can be obtained from the
\ref LBExamples page:
\code
h5dump -p -d "/dset" --ddl=dsetbin.dmp -o dset.bin -b dset.h5
h5import dset.bin -c dsetbin.dmp -o new-dset.h5
\endcode
The output before and after running these commands is shown below:
\code
$ h5dump dset.h5
HDF5 "dset.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
(0,0): 1, 2, 3, 4, 5, 6,
(1,0): 7, 8, 9, 10, 11, 12,
(2,0): 13, 14, 15, 16, 17, 18,
(3,0): 19, 20, 21, 22, 23, 24
}
}
}
}
$ h5dump -p -d "/dset" --ddl=dsetbin.dmp -o dset.bin -b dset.h5
$ h5import dset.bin -c dsetbin.dmp -o new-dset.h5
$ h5dump new-dset.h5
HDF5 "new-dset.h5" {
GROUP "/" {
DATASET "dset" {
DATATYPE H5T_STD_I32BE
DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
DATA {
(0,0): 1, 2, 3, 4, 5, 6,
(1,0): 7, 8, 9, 10, 11, 12,
(2,0): 13, 14, 15, 16, 17, 18,
(3,0): 19, 20, 21, 22, 23, 24
}
}
}
}
\endcode
The second example imports string data. The example program that creates this file can be downloaded
from the \ref ExAPI page.

Note that string data requires use of the h5dump -y option to exclude indices and the h5dump --width=1
option to generate a single column of strings. The -o option outputs the data into an ASCII file.
\code
h5dump -p -d "/DS1" -O vlstring.dmp -o vlstring.ascii -y --width=1 h5ex_t_vlstring.h5
h5import vlstring.ascii -c vlstring.dmp -o new-vlstring.h5
\endcode
The output before and after running these commands is shown below:
\code
$ h5dump h5ex_t_vlstring.h5
HDF5 "h5ex_t_vlstring.h5" {
GROUP "/" {
DATASET "DS1" {
DATATYPE H5T_STRING {
STRSIZE H5T_VARIABLE;
STRPAD H5T_STR_SPACEPAD;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 4 ) / ( 4 ) }
DATA {
(0): "Parting", "is such", "sweet", "sorrow."
}
}
}
}
$ h5dump -p -d "/DS1" -O vlstring.dmp -o vlstring.ascii -y --width=1 h5ex_t_vlstring.h5
$ h5import vlstring.ascii -c vlstring.dmp -o new-vlstring.h5
$ h5dump new-vlstring.h5
HDF5 "new-vlstring.h5" {
GROUP "/" {
DATASET "DS1" {
DATATYPE H5T_STRING {
STRSIZE H5T_VARIABLE;
STRPAD H5T_STR_NULLTERM;
CSET H5T_CSET_ASCII;
CTYPE H5T_C_S1;
}
DATASPACE SIMPLE { ( 4 ) / ( 4 ) }
DATA {
(0): "Parting", "is such", "sweet", "sorrow."
}
}
}
}
\endcode
Navigate back: \ref index "Main" / \ref GettingStarted / \ref ViewToolsCommand
*/