<HTML><HEAD> <TITLE>HDF5 Tutorial - Glossary </TITLE> </HEAD> <body bgcolor="#ffffff"> <!-- BEGIN MAIN BODY --> <A HREF="http://www.ncsa.uiuc.edu/"><img border=0 src="http://www.ncsa.uiuc.edu/Images/NCSAhome/footerlogo.gif" width=78 height=27 alt="NCSA"><P></A> [ <A HREF="title.html"><I>HDF5 Tutorial Top</I></A> ] <H1> <BIG><BIG><BIG><FONT COLOR="#c101cd">Glossary</FONT> </BIG></BIG></BIG></H1> <hr noshade size=1> <BODY> <DL>
<DT><B>ATTRIBUTE</B> <DD>An HDF5 attribute is a small dataset that can be used to describe the nature and/or the intended usage of the object it is attached to. <P>
<DT><B>BOOT BLOCK</B> <DD>An HDF5 file consists of a "boot block", which holds the information required to access the file portably on multiple platforms, followed by information about the groups and the datasets in the file. The boot block contains information about the size of offsets and lengths of objects, the number of entries in symbol tables (used to store groups), and additional version information for the file. <P>
<DT><B>DATASET</B> <DD>An HDF5 dataset is a multi-dimensional array of data elements, together with supporting metadata. <P>
<DT><B>DATASPACE</B> <DD>An HDF5 dataspace is an object that describes the dimensionality of the data array. A dataspace is either a regular N-dimensional array of data points, called a simple dataspace, or a more general collection of data points organized in another manner, called a complex dataspace. <P>
<DT><B>DATATYPE</B> <DD>An HDF5 datatype is an object that describes the type of the elements in an HDF5 multi-dimensional array. There are two categories of datatypes: atomic and compound. An atomic datatype is one that cannot be decomposed into smaller units at the API level. A compound datatype is a collection of one or more atomic datatypes or small arrays of such datatypes.
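<P> The difference between atomic and compound datatypes can be sketched with Python's standard <code>struct</code> module. This is only an analogy and does not use the HDF5 API: the record layout, the member names (id, temperature, position), and the helper member_offsets() are hypothetical, chosen to show a compound built from two atomic members and one small array of an atomic type.

```python
import struct

# Hypothetical compound record: (member name, struct format code).
# The "=" prefix selects standard sizes with no padding, so the
# numbers below do not depend on the platform.
MEMBERS = [
    ("id", "i"),           # atomic: 32-bit integer
    ("temperature", "d"),  # atomic: 64-bit float
    ("position", "3f"),    # small array of an atomic type: three 32-bit floats
]


def member_offsets(members):
    """Return ({name: byte offset}, total size) for an unpadded record."""
    offsets = {}
    position = 0
    for name, code in members:
        offsets[name] = position
        position += struct.calcsize("=" + code)
    return offsets, position


OFFSETS, RECORD_SIZE = member_offsets(MEMBERS)

# One Struct object describes the whole compound element.
RECORD = struct.Struct("=" + "".join(code for _, code in MEMBERS))

# Pack one element into bytes and read it back.
packed = RECORD.pack(7, 21.5, 1.0, 2.0, 3.0)
unpacked = RECORD.unpack(packed)
```

Each member keeps its own atomic type and byte offset inside the element, which is the property the compound datatype definition above describes.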
<P> <DT><B>DATASET CREATION PROPERTY LIST</B> <DD> The dataset creation property list contains information on how raw data is organized on disk and how the raw data is compressed. The dataset API partitions these properties into layout, compression, and external storage: <UL> <B> Layout:</B> <UL>
<LI> H5D_COMPACT: The data is small and can be stored in the object header (not implemented yet). This eliminates disk seek/read requests.
<LI> H5D_CONTIGUOUS: (<B>default</B>) The data is large, non-extendible, non-compressible, non-sparse, and can be stored externally.
<LI> H5D_CHUNKED: The data is large and can be extended in any dimension. It is partitioned into chunks so that each chunk is the same logical size. </UL>
<B>Compression:</B> (gzip compression)<BR>
<B>External Storage Properties:</B> The data must be contiguous to be stored externally. External storage allows you to store the data in one or more non-HDF5 files. </UL> <P>
<DT><B>DATA TRANSFER PROPERTY LIST</B> <DD> The data transfer property list is used to control various aspects of the I/O, such as caching hints or collective I/O information. <P>
<DT><B>DDL</B> <DD>DDL is a Data Description Language that describes HDF5 objects in Backus-Naur Form. <P>
<DT><B>FILE ACCESS MODES</B> <DD>The file access modes determine whether an existing file will be overwritten. All newly created files are opened for both reading and writing. Possible values are: <PRE>
H5F_ACC_RDWR:   Allow read and write access to file.
H5F_ACC_RDONLY: Allow read-only access to file.
H5F_ACC_TRUNC:  Truncate file, if it already exists, erasing all data
                previously stored in the file.
H5F_ACC_EXCL:   Fail if file already exists.
H5F_ACC_DEBUG:  Print debug information.
H5P_DEFAULT:    Apply default file access and creation properties.
</PRE> <P> <DT><B>FILE ACCESS PROPERTY LIST</B> <DD> File access property lists are used to control different methods of performing I/O on files: <UL>
<B>Unbuffered I/O:</B> Local permanent files can be accessed with the functions described in Section 2 of the POSIX manual, namely open(), lseek(), read(), write(), and close(). <BR>
<B>Buffered I/O:</B> Local permanent files can be accessed with the functions declared in the stdio.h header file, namely fopen(), fseek(), fread(), fwrite(), and fclose().<BR>
<B>Memory I/O:</B> Local temporary files can be created and accessed directly from memory without ever creating permanent storage. The library uses malloc() and free() to manage storage space for the file.<BR>
<B>Parallel Files using MPI I/O:</B> This driver allows parallel access to a file through the MPI I/O library. The parameters that can be modified are the MPI communicator, the info object, and the access mode. The communicator and info object are saved and then passed to MPI_File_open() when the file is created or opened. The access mode controls the kind of parallel access the application intends.<BR>
<B>Data Alignment:</B> Sometimes file access is faster if certain objects are aligned on file block boundaries. This can be controlled by setting the alignment properties of a file access property list with the H5Pset_alignment() function. </UL> <P>
<DT><B>FILE CREATION PROPERTY LIST</B> <DD> The file creation property list is used to control the file metadata. The parameters that can be modified are: <UL>
<B>User-Block Size:</B> The "user-block" is a fixed-length block of data located at the beginning of the file. It is ignored by the HDF5 library and may be used to store any data or information useful to applications. <BR>
<B> Offset and Length Sizes:</B> The number of bytes used to store the offset and length of objects in the HDF5 file can be controlled with this parameter.
<BR> <B> Symbol Table Parameters:</B> The size of symbol table B-trees can be controlled by setting the 1/2 rank and 1/2 node size parameters of the B-tree. <BR>
<B> Indexed Storage Parameters:</B> The size of indexed storage B-trees can be controlled by setting the 1/2 rank and 1/2 node size parameters of the B-tree. </UL> <P>
<DT><B>GROUP</B> <DD>A group is a structure containing zero or more HDF5 objects, together with supporting metadata. The two primary HDF5 objects are datasets and groups. <P>
<DT><B>HDF5</B> <DD>HDF5 is an abbreviation for Hierarchical Data Format Version 5. This file format is intended to make it easy to write and read scientific data: <P> <UL>
<LI> by including the information needed to understand the data within the file <P>
<LI> by providing a library of C, FORTRAN, and other language routines that reduces the work required for efficient writing and reading, even with parallel I/O </UL> <P>
<DT><B>HDF5 FILE</B> <DD>An HDF5 file is a container for storing grouped collections of multi-dimensional arrays containing scientific data. <P>
<DT><B>H5DUMP</B> <DD>h5dump is an HDF5 tool that describes the HDF5 file contents in DDL. <P>
<DT><B>HYPERSLAB</B> <DD> A hyperslab is a portion of a dataset. A hyperslab selection can be a logically contiguous collection of points in a dataspace, or it can be a regular pattern of points or blocks in a dataspace. <P>
<DT><B>MOUNTING FILES</B> <DD> HDF5 allows you to combine two or more HDF5 files in a manner similar to mounting file systems in UNIX. The group structure and metadata from one file appear as though they exist in another file. <P>
<DT><B>NAMES</B> <DD>An HDF5 object name is a slash-separated list of components. A name that begins with a slash is an absolute name and is resolved starting from the root group of the file. All other names are relative and are resolved starting from the specified group.
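<P> The absolute and relative name rule above can be sketched in a few lines of Python. This is a minimal illustration, not HDF5 library code: the function resolve_name() and the group names used with it are hypothetical, and dropping empty components (so repeated slashes collapse) is a simplification made for this sketch.

```python
def resolve_name(name, current_group="/"):
    """Return the absolute path that an HDF5-style object name refers to.

    name          -- slash-separated object name, absolute or relative
    current_group -- absolute path of the group a relative name starts from
    """
    if name.startswith("/"):
        path = name                          # absolute: start at the root group
    else:
        path = current_group + "/" + name    # relative: start at the given group
    # Split into components and drop empty ones left by extra slashes.
    components = [part for part in path.split("/") if part]
    return "/" + "/".join(components)
```

For example, resolve_name("Temperature", "/Data") and resolve_name("/Data/Temperature", "/Results") both name the same object, because the second name is absolute and ignores the starting group.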
<P> <DT><B>PARALLEL I/O (HDF5)</B> <DD>The parallel I/O version of HDF5 supports parallel file access using MPI (Message Passing Interface). <P>
<DT><B>REFERENCE</B> <DD> <B>OBJECT REFERENCE:</B><BR> A reference to an entire object in the current HDF5 file. <P> An object reference points to an entire object in the current HDF5 file by storing the relative file address (OID) of the header of the object pointed to. The relative file address of an object header is constant for the life of the object. An object reference is of a fixed size in the file. <P>
<B>DATASET REGION REFERENCE:</B><BR> A reference to a specific dataset region. <P> A dataset region reference points to a region of a dataset in the current HDF5 file by storing the OID of the dataset and the global heap offset of the region referenced. The region referenced is located by retrieving the coordinates of the areas in the region from the global heap. A dataset region reference is of a variable size in the file. <P>
<DT><B>THREADSAFE (HDF5)</B> <DD>A "thread-safe" version of HDF5 (TSHDF5) is one that can be called from any thread of a multi-threaded program. Calls to HDF5 can be made in any order, and each individual HDF5 call will perform correctly. A calling program does not have to explicitly lock the HDF5 library in order to do I/O. Application programmers may assume that TSHDF5 guarantees the following: <UL>
<LI> The HDF5 library does not create or destroy threads.
<LI> The HDF5 library uses modest amounts of per-thread private memory.
<LI> The HDF5 library only locks and unlocks its own locks (no locks are passed in or returned from HDF5), and the internal locking is guaranteed to be deadlock free. </UL> <P>
These properties mean that the TSHDF5 library will not interfere with an application's use of threads. The TSHDF5 library is the same library as the regular HDF5 library, with additional code to synchronize access to the HDF5 library's internal data structures.
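<P> The idea of a library that synchronizes access to its own internal data structures, so that callers never lock it themselves, can be sketched with Python's standard <code>threading</code> module. This is an illustration of the general pattern only and says nothing about how TSHDF5 is actually implemented: the class TinyLibrary and the helper run_writers() are hypothetical names invented for this sketch.

```python
import threading


class TinyLibrary:
    """Hypothetical stand-in for a library with internal locking."""

    def __init__(self):
        self._lock = threading.Lock()  # owned and used only by the library
        self._records = []             # internal data structure

    def write(self, value):
        # The library takes its own lock; callers never see or pass locks.
        with self._lock:
            self._records.append(value)

    def count(self):
        with self._lock:
            return len(self._records)


def run_writers(library, n_threads=4, writes_per_thread=1000):
    """Call the library from several application threads without external locking."""

    def worker(thread_id):
        for i in range(writes_per_thread):
            library.write((thread_id, i))

    # Threads are created and joined by the application, not by the library.
    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return library.count()


TOTAL = run_writers(TinyLibrary())
```

The application threads call write() in any order and no record is lost, which mirrors the guarantees listed above: the library creates no threads and only ever touches its own lock.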
</DL> <!-- BEGIN FOOTER INFO --> <P><hr noshade size=1> <font face="arial,helvetica" size="-1"> <a href="http://www.ncsa.uiuc.edu/"><img border=0 src="http://www.ncsa.uiuc.edu/Images/NCSAhome/footerlogo.gif" width=78 height=27 alt="NCSA"><br> The National Center for Supercomputing Applications</A><br> <a href="http://www.uiuc.edu/">University of Illinois at Urbana-Champaign</a><br> <br> <!-- <A HREF="helpdesk.mail.html"> --> <A HREF="mailto:hdfhelp@ncsa.uiuc.edu"> hdfhelp@ncsa.uiuc.edu</A> <br> <BR> <H6>Last Modified: March 16, 2001</H6><BR> <!-- modified by Barbara Jones - bljones@ncsa.uiuc.edu --> </FONT> <BR> <!-- <A HREF="mailto:hdfhelp@ncsa.uiuc.edu"> --> </BODY> </HTML>