The HDF5 specification defines the standard HDF5 objects and how they are stored. (For information about the HDF5 library, model and specification, see the HDF documentation.)  This document is an additional specification that defines a standard profile for storing image data in HDF5. Image data in HDF5 is stored as HDF5 datasets with standard attributes that define the properties of the image.

This specification is primarily concerned with two dimensional raster data similar to HDF4 Raster Images.  Specifications for storing other types of imagery will be covered in other documents.

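This specification defines the Image dataset (Section 1), the Palette dataset (Section 2), and the consistency and correlation of Image and Palette attributes (Section 3).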

1. HDF5 Image Specification

1.1 Overview

Image data is stored as an HDF5 dataset with values of HDF5 class Integer or Float.  A common example would be a two dimensional dataset, with elements of class Integer, e.g., a two dimensional array of unsigned 8 bit integers.  However, this specification does not limit the dimensions or number type that may be used for an Image.

The dataset for an image is distinguished from other datasets by giving it an attribute "CLASS=IMAGE".  In addition, the Image dataset may have an optional attribute "PALETTE" that is an array of object references for zero or more palettes. The Image dataset may have additional attributes to describe the image data, as defined in Section 1.2.

A Palette is an HDF5 dataset which contains color map information.  A Palette dataset has an attribute "CLASS=PALETTE" and other attributes indicating the type and size of the palette, as defined in Section 2.2.  A Palette is an independent object, which can be shared among several Image datasets.

1.2  Image Attributes

The attributes for the Image are scalars unless otherwise noted.  The length of a String valued attribute should be at least the number of characters in its value. Optionally, a String valued attribute may be stored in a String longer than the minimum, in which case it must be zero terminated or null padded.  "Required" attributes must always be used. "Optional" attributes must be used when the conditions noted in Table 1 require them.
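The string-size rule above can be sketched in a few lines of Python. The helper names `pad_string_attribute` and `unpad_string_attribute` are hypothetical and belong to no HDF5 API; they only illustrate the byte layout of a null padded value, assuming ASCII text.

```python
def pad_string_attribute(value: str, size: int) -> bytes:
    """Encode value as ASCII and null pad it to exactly `size` bytes."""
    raw = value.encode("ascii")
    if size < len(raw):
        raise ValueError("size must be at least the number of characters")
    return raw.ljust(size, b"\x00")


def unpad_string_attribute(raw: bytes) -> str:
    """Return the text up to the first null byte (or the whole buffer)."""
    return raw.split(b"\x00", 1)[0].decode("ascii")


# "IMAGE" stored in a String longer than its minimum size of 5.
stored = pad_string_attribute("IMAGE", 8)
```

A reader that strips everything from the first null byte onward recovers the same value whether the writer used the minimum size or a longer, padded one.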
 

Attributes

Attribute name="CLASS" (Required)
This attribute is type H5T_C_S1, with size 5.
For all Images, the value of this attribute is "IMAGE".
This attribute identifies this data set as intended to be interpreted as an image that conforms to the specifications on this page.
Attribute name="PALETTE"
An Image dataset within an HDF5 file may optionally specify an array of palettes with which it may be viewed. The dataset then has an attribute called "PALETTE" that contains a one-dimensional array of object references (HDF5 datatype H5T_STD_REF_OBJ) to palettes in the file. The palette datasets must conform to the Palette specification in Section 2 below. The first palette in this array is the default palette with which the data may be viewed.
Attribute name="IMAGE_SUBCLASS"
If present, the value of this attribute indicates the type of Palette that should be used with the Image.  This attribute is a scalar of type H5T_C_S1, with size according to the string plus one.  The values are:
"IMAGE_GRAYSCALE" (length 15)
A grayscale image
"IMAGE_BITMAP" (length 12)
A bit map image
"IMAGE_TRUECOLOR" (length 15)
A truecolor image
"IMAGE_INDEXED" (length 13)
An indexed image
Attribute name="INTERLACE_MODE"
For images with more than one component for each pixel, this optional attribute specifies the layout of the data. The values are type H5T_C_S1 of length 15. See Section 1.3 for information about the storage layout for data.
"INTERLACE_PIXEL" (default): the component values for a pixel are contiguous.
"INTERLACE_PLANE": each component is stored as a plane.
Attribute name="DISPLAY_ORIGIN"
This optional attribute indicates the intended orientation of the data on a two-dimensional raster display.  The value indicates the corner at which the pixel at (0,0) should be displayed.  The values are type H5T_C_S1 of length 2. If DISPLAY_ORIGIN is not set, the orientation is undefined.
"UL": (0,0) is at the upper left.
"LL": (0,0) is at the lower left.
"UR": (0,0) is at the upper right.
"LR": (0,0) is at the lower right.
Attribute name="IMAGE_WHITE_IS_ZERO"
This attribute is of type H5T_NATIVE_UCHAR.  0 = false, 1 = true.  This is used for images with IMAGE_SUBCLASS="IMAGE_GRAYSCALE" or "IMAGE_BITMAP".
Attribute name="IMAGE_MINMAXRANGE"
If present, this attribute is an array of two numbers, of the same HDF5 datatype as the data.  The first element is the minimum value of the data, and the second is the maximum.  This is used for images with IMAGE_SUBCLASS="IMAGE_GRAYSCALE", "IMAGE_BITMAP" or "IMAGE_INDEXED".
Attribute name="IMAGE_BACKGROUNDINDEX"
If set, this attribute indicates the index value that should be interpreted as the "background color".  This attribute is HDF5 type H5T_NATIVE_UINT.
Attribute name="IMAGE_TRANSPARENCY"
If set, this attribute indicates the index value that should be interpreted as the "transparent color".  This attribute is HDF5 type H5T_NATIVE_UINT.  This attribute may not be used for IMAGE_SUBCLASS="IMAGE_TRUECOLOR".
Attribute name="IMAGE_ASPECTRATIO"
If set, this attribute indicates the aspect ratio.
Attribute name="IMAGE_COLORMODEL"
If set, this attribute indicates the color model of the Palette that should be used with the Image.  This attribute is of type H5T_C_S1, with size 3, 4, or 5.  The value is one of the color models described in the Palette specification in Section 2.2 below.  This attribute may be used only for IMAGE_SUBCLASS="IMAGE_TRUECOLOR" or "IMAGE_INDEXED".
Attribute name="IMAGE_GAMMACORRECTION"
If set, this attribute gives the Gamma correction.  The attribute is type H5T_NATIVE_FLOAT.  This attribute may be used only for IMAGE_SUBCLASS="IMAGE_TRUECOLOR" or "IMAGE_INDEXED".
Attribute name="IMAGE_VERSION" (Required)
This attribute is of type H5T_C_S1, with size corresponding to the length of the version string.  This attribute identifies the version number of this specification to which it conforms.  The current version number is "1.2".
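As an illustration of the attribute definitions above, the following sketch checks a plain Python dictionary of attribute values against the required attributes and the enumerated string values. The dictionary representation and the function name `check_image_attributes` are assumptions made for this example; the HDF5 library itself performs no such check.

```python
# Enumerated values taken from the attribute definitions above.
ALLOWED_VALUES = {
    "IMAGE_SUBCLASS": {"IMAGE_GRAYSCALE", "IMAGE_BITMAP",
                       "IMAGE_TRUECOLOR", "IMAGE_INDEXED"},
    "INTERLACE_MODE": {"INTERLACE_PIXEL", "INTERLACE_PLANE"},
    "DISPLAY_ORIGIN": {"UL", "LL", "UR", "LR"},
}


def check_image_attributes(attrs: dict) -> list:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if attrs.get("CLASS") != "IMAGE":
        problems.append('CLASS must be "IMAGE"')
    if "IMAGE_VERSION" not in attrs:
        problems.append("IMAGE_VERSION is required")
    for name, allowed in ALLOWED_VALUES.items():
        if name in attrs and attrs[name] not in allowed:
            problems.append(f"{name} has unsupported value {attrs[name]!r}")
    return problems


good = {"CLASS": "IMAGE", "IMAGE_VERSION": "1.2",
        "IMAGE_SUBCLASS": "IMAGE_TRUECOLOR"}
bad = {"CLASS": "IMAGE", "IMAGE_VERSION": "1.2",
       "IMAGE_SUBCLASS": "IMAGE_TRUE_COLOR"}  # misspelled subclass
```

Here `check_image_attributes(good)` reports no problems, while the misspelled subclass in `bad` is flagged.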

 

 
 
 

Table 1. Attributes of an Image Dataset

| Attribute Name | (R = Required, O = Optional) | Type | String Size | Value |
|---|---|---|---|---|
| CLASS | R | String | 5 | "IMAGE" |
| PALETTE | O | Array of Object References | | <references to Palette datasets> (1) |
| IMAGE_SUBCLASS | O (2) | String | 15, 12, 15, 13 | "IMAGE_GRAYSCALE", "IMAGE_BITMAP", "IMAGE_TRUECOLOR", "IMAGE_INDEXED" |
| INTERLACE_MODE | O (3,6) | String | 15 | The layout of components if more than one component per pixel. |
| DISPLAY_ORIGIN | O | String | 2 | If set, indicates the intended location of the pixel (0,0). |
| IMAGE_WHITE_IS_ZERO | O (3,4) | Unsigned Integer | | 0 = false, 1 = true |
| IMAGE_MINMAXRANGE | O (3,5) | Array [2] <same datatype as data values> | | The (<minimum>, <maximum>) value of the data. |
| IMAGE_BACKGROUNDINDEX | O (3) | Unsigned Integer | | The index of the background color. |
| IMAGE_TRANSPARENCY | O (3,5) | Unsigned Integer | | The index of the transparent color. |
| IMAGE_ASPECTRATIO | O (3,4) | Unsigned Integer | | The aspect ratio. |
| IMAGE_COLORMODEL | O (3,6) | String | 3, 4, or 5 | The color model, as defined below in the Palette specification for attribute PAL_COLORMODEL. |
| IMAGE_GAMMACORRECTION | O (3,6) | Float | | The gamma correction. |
| IMAGE_VERSION | R | String | 3 | "1.2" |

1.  The first element of the array is the default Palette.
2.  This attribute is required for images that use one of the standard color map types listed.
3.  This attribute is required if set for the source image, in the case that the image is translated from another file into HDF5.
4.  This applies to:  IMAGE_SUBCLASS="IMAGE_GRAYSCALE" or "IMAGE_BITMAP".
5.  This applies to:  IMAGE_SUBCLASS="IMAGE_GRAYSCALE", "IMAGE_BITMAP", or "IMAGE_INDEXED".
6.  This applies to:  IMAGE_SUBCLASS="IMAGE_TRUECOLOR" or "IMAGE_INDEXED".
Table 2 summarizes the standard attributes for an Image dataset using the common sub-classes. R means that the attribute listed in the leftmost column is Required for the image subclass in the first row, O means that the attribute is Optional for that subclass, and N means that the attribute cannot be applied to that subclass. The first two rows show the only attributes that are required for all subclasses.
 
Table 2a. Applicability of Attributes to IMAGE sub-classes

| IMAGE_SUBCLASS | IMAGE_GRAYSCALE | IMAGE_BITMAP |
|---|---|---|
| CLASS | R | R |
| IMAGE_VERSION | R | R |
| INTERLACE_MODE | N | N |
| IMAGE_WHITE_IS_ZERO | R | R |
| IMAGE_MINMAXRANGE | O | O |
| IMAGE_BACKGROUNDINDEX | O | O |
| IMAGE_TRANSPARENCY | O | O |
| IMAGE_ASPECTRATIO | O | O |
| IMAGE_COLORMODEL | N | N |
| IMAGE_GAMMACORRECTION | N | N |
| PALETTE | O | O |
| DISPLAY_ORIGIN | O | O |
 
Table 2b. Applicability of Attributes to IMAGE sub-classes

| IMAGE_SUBCLASS | IMAGE_TRUECOLOR | IMAGE_INDEXED |
|---|---|---|
| CLASS | R | R |
| IMAGE_VERSION | R | R |
| INTERLACE_MODE | R | N |
| IMAGE_WHITE_IS_ZERO | N | N |
| IMAGE_MINMAXRANGE | N | O |
| IMAGE_BACKGROUNDINDEX | N | O |
| IMAGE_TRANSPARENCY | N | O |
| IMAGE_ASPECTRATIO | O | O |
| IMAGE_COLORMODEL | O | O |
| IMAGE_GAMMACORRECTION | O | O |
| PALETTE | O | O |
| DISPLAY_ORIGIN | O | O |
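The applicability rules in Tables 2a and 2b can be expressed as a small lookup. The sketch below transcribes the R/O/N codes from the two tables into one string per attribute; the function name `applicability_violations` and the use of a set of attribute names are assumptions for illustration only.

```python
# Column order: GRAYSCALE, BITMAP (Table 2a), TRUECOLOR, INDEXED (Table 2b).
SUBCLASSES = ("IMAGE_GRAYSCALE", "IMAGE_BITMAP",
              "IMAGE_TRUECOLOR", "IMAGE_INDEXED")
APPLICABILITY = {
    "CLASS": "RRRR",
    "IMAGE_VERSION": "RRRR",
    "INTERLACE_MODE": "NNRN",
    "IMAGE_WHITE_IS_ZERO": "RRNN",
    "IMAGE_MINMAXRANGE": "OONO",
    "IMAGE_BACKGROUNDINDEX": "OONO",
    "IMAGE_TRANSPARENCY": "OONO",
    "IMAGE_ASPECTRATIO": "OOOO",
    "IMAGE_COLORMODEL": "NNOO",
    "IMAGE_GAMMACORRECTION": "NNOO",
    "PALETTE": "OOOO",
    "DISPLAY_ORIGIN": "OOOO",
}


def applicability_violations(subclass: str, present: set) -> list:
    """List attributes that are missing but Required, or present but N."""
    column = SUBCLASSES.index(subclass)
    violations = []
    for name, codes in APPLICABILITY.items():
        code = codes[column]
        if code == "R" and name not in present:
            violations.append(f"{name} is required for {subclass}")
        elif code == "N" and name in present:
            violations.append(f"{name} does not apply to {subclass}")
    return violations
```

For example, a truecolor image carrying only CLASS, IMAGE_VERSION and INTERLACE_MODE passes, while a grayscale image without IMAGE_WHITE_IS_ZERO is reported.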

1.3 Storage Layout and Properties for Images

In the case of an image with more than one component per pixel (e.g., Red, Green, and Blue), the data may be arranged in one of two ways.  Following HDF4 terminology, the data may be interlaced by pixel or by plane, which should be indicated by the INTERLACE_MODE  attribute.  In both cases, the dataset will have a dataspace with three dimensions, height, width, and components.  The interlace modes specify different orders for the dimensions.
 
Table 3. Storage of multiple component image data.

| Interlace Mode | Dimensions in the Dataspace |
|---|---|
| INTERLACE_PIXEL | [height][width][pixel components] |
| INTERLACE_PLANE | [pixel components][height][width] |

For example, consider an image that is 5 pixels wide by 10 pixels high (10 rows by 5 columns), with Red, Green, and Blue components.  Each component is an unsigned byte. In HDF5, the datatype would be declared as an unsigned 8 bit integer.  For pixel interlace, the dataspace would be a three dimensional array with dimensions [10][5][3].  For plane interlace, the dataspace would be a three dimensional array with dimensions [3][10][5].
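The dimension orders of Table 3 can be sketched as a small helper. The function name `image_dataspace_dims` is hypothetical; the example values are a height of 10 and a width of 5, which yield the dimensions quoted in the example above.

```python
def image_dataspace_dims(height, width, components,
                         interlace_mode="INTERLACE_PIXEL"):
    """Return the dataspace dimensions for a multi-component image."""
    if interlace_mode == "INTERLACE_PIXEL":
        return (height, width, components)
    if interlace_mode == "INTERLACE_PLANE":
        return (components, height, width)
    raise ValueError(f"unknown interlace mode: {interlace_mode!r}")


# An RGB image with height 10 and width 5.
pixel_dims = image_dataspace_dims(10, 5, 3, "INTERLACE_PIXEL")
plane_dims = image_dataspace_dims(10, 5, 3, "INTERLACE_PLANE")
```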

In the case of images with only one component, the dataspace may be either a two dimensional array, or a three dimensional array in which the component dimension has size 1.  For example, a 5 (width) by 10 (height) image with 8 bit color indexes would be an HDF5 dataset with type unsigned 8 bit integer.  The dataspace could be either a two dimensional array, with dimensions [10][5], or a three dimensional array, with dimensions either [10][5][1] or [1][10][5].
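Going the other way, a reader has to recover height, width and component count from whichever dataspace it finds. The following sketch assumes the interlace mode is known (defaulting to INTERLACE_PIXEL, as the attribute definition does); `split_image_dims` is a hypothetical name.

```python
def split_image_dims(dims, interlace_mode="INTERLACE_PIXEL"):
    """Return (height, width, components) for a 2-D or 3-D dataspace."""
    if len(dims) == 2:
        height, width = dims
        return (height, width, 1)
    if len(dims) != 3:
        raise ValueError("an Image dataspace here has 2 or 3 dimensions")
    if interlace_mode == "INTERLACE_PLANE":
        components, height, width = dims
    else:
        height, width, components = dims
    return (height, width, components)
```

All three layouts of the single-component example above reduce to the same height, width and component count, provided the [1][10][5] form is read as plane interlaced.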

Image datasets may be stored with any chunking or compression properties supported by HDF5.

A note concerning compatibility with the HDF4 GR interface: An Image dataset is stored as an HDF5 dataset.  It is important to note that the order of the dimensions is the same as for any other HDF5 dataset.  For a two dimensional image that is to be stored as a series of horizontal scan lines, with the scan lines contiguous (i.e., the fastest changing dimension is 'width'), the image will have a dataspace with dim[0] = height and dim[1] = width.  This is completely consistent with all other HDF5 datasets.

Users familiar with HDF4 should be cautioned that this is not the same as HDF4, and specifically is not consistent with what the HDF4 GR interface does.
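To make the dimension order concrete, the sketch below computes the position of an element in a row-major (C order) array, in which the last dimension varies fastest, as is the case for HDF5 datasets accessed through the C library. The function name `flat_offset` is hypothetical.

```python
def flat_offset(index, dims):
    """Row-major offset of `index` in an array with dimensions `dims`."""
    if len(index) != len(dims):
        raise ValueError("index and dims must have the same rank")
    offset = 0
    for i, n in zip(index, dims):
        if not 0 <= i < n:
            raise IndexError("index out of range")
        offset = offset * n + i
    return offset


# dim[0] = height, dim[1] = width: pixels on one scan line are adjacent.
height, width = 3, 4
dims = (height, width)
```

With these dimensions, moving one step along the width changes the offset by 1, while moving one step along the height changes it by `width`.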
 

2.  HDF5 Palette Specification

2.1 Overview

A palette is the means by which color is applied to an image and is also referred to as a color lookup table. It is a table in which every row contains the numerical representation of a particular color. In the example of an 8 bit standard RGB color model palette, this numerical representation of a color is presented as a triplet specifying the intensity of red, green, and blue components that make up each color.

In this example, the color component numeric type is an 8 bit unsigned integer. While this is most common and recommended for general use, other component numeric datatypes, such as a 16 bit unsigned integer, may be used. This type is the datatype of the palette dataset (see H5Dget_type()).

The minimum and maximum values of the component color numeric are specified as an attribute of the palette dataset (see attribute PAL_MINMAXNUMERIC below). If this attribute does not exist, it is assumed that the range of values fills the space of the color numeric type. For example, with an 8 bit unsigned integer, the valid range would be 0 to 255 for each color component.
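The default-range rule can be sketched as follows. The sketch covers only unsigned integer components; the helper names and the normalization to the interval 0.0 to 1.0 are assumptions made for illustration and are not prescribed by this specification.

```python
def component_range(bits, minmax=None):
    """Return (minimum, maximum) for a color component.

    `minmax` stands for the value of PAL_MINMAXNUMERIC if present;
    otherwise the full range of an unsigned `bits`-bit integer is used.
    """
    if minmax is not None:
        low, high = minmax
        if low > high:
            raise ValueError("minimum must not exceed maximum")
        return (low, high)
    return (0, 2 ** bits - 1)


def normalize(value, bits, minmax=None):
    """Scale a component value to 0.0..1.0 (an illustrative convention)."""
    low, high = component_range(bits, minmax)
    if high == low:
        return 0.0
    return (value - low) / (high - low)
```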

The HDF5 palette specification additionally allows for color models beyond RGB. YUV, HSV, CMY, CMYK, YCbCr color models are supported, and may be specified as a color model attribute of the palette dataset. (see "Palette Attributes" for details).

In HDF4 and earlier, palettes were limited to 256 colors. The HDF5 palette specification allows for palettes of varying length. The length is specified as the number of rows of the palette dataset.
 
 
Important Note: The specification of the Indexed Palette will change substantially in the next version.  The Palette described here is deprecated and is not supported.

Deprecated

In a standard palette, the color entries are indexed directly. HDF5 supports the notion of a range index table. Such a table defines an ascending ordered list of ranges that map dataset values to the palette. If a range index table exists for the palette, the PAL_TYPE attribute will be set to "RANGEINDEX", and the PAL_RANGEINDEX attribute will contain an object reference to a range index table array. If not, the PAL_TYPE attribute either does not exist, or will be set to "STANDARD8".

The range index table array is a one dimensional array whose length is one less than the length of the palette dataset. Ideally, the range index would be of the same type as the dataset it refers to; however, this is not a requirement.

Example 2: A range index array of type floating point

The range index array defines the upper bound (the "to") of each range. Notice that the range index array has one entry fewer than the palette. The first entry, 0.1259, specifies that all values up to and including 0.1259 map to the first palette entry. The second entry signifies that all values greater than 0.1259 and up to 0.3278 inclusive map to the second palette entry, and so on. All values greater than the last range index entry (100000) map to the last entry in the palette.
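The lookup described above amounts to a binary search over the ascending range index array. The sketch below uses a shortened, hypothetical array that contains only the three values named in the text; the function name `palette_entry_for` is also an assumption.

```python
from bisect import bisect_left


def palette_entry_for(value, range_index):
    """Return the palette index that `value` maps to.

    Each range index entry is the inclusive upper bound of its range;
    values above the last entry map to the last palette entry.
    """
    return bisect_left(range_index, value)


# Shortened, hypothetical range index for a palette of 4 entries.
range_index = [0.1259, 0.3278, 100000.0]
```

`bisect_left` returns the position of the first entry that is greater than or equal to the value, which makes each upper bound inclusive, and it returns `len(range_index)` for values beyond the last bound, which is the index of the last palette entry.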

2.2. Palette Attributes

A palette exists in an HDF5 file as an independent dataset with accompanying attributes.  The Palette attributes are scalars except where noted otherwise.  String values should have a size equal to the length of the string value plus one.  "Required" attributes must be used.  "Optional" attributes must be used when the conditions noted in Table 4 require them.

These attributes are defined as follows:

Attribute name="CLASS" (Required)
This attribute is of type H5T_C_S1, with size 7.
For all palettes, the value of this attribute is "PALETTE". This attribute identifies this palette data set as a palette that conforms to the specifications on this page.
Attribute name="PAL_COLORMODEL" (Required)
This attribute is of type H5T_C_S1, with size 3, 4, or 5.
Possible values for this are "RGB", "YUV", "CMY", "CMYK", "YCbCr", "HSV".
This defines the color model that the entries in the palette data set represent.
"RGB"
Each color index contains a triplet where the first value defines the red component, the second defines the green component, and the third the blue component.
"CMY"
Each color index contains a triplet where the first value defines the cyan component, the second defines the magenta component, and the third the yellow component.
"CMYK"
Each color index contains a quadruplet where the first value defines the cyan component, the second defines the magenta component, the third the yellow component, and the fourth the black component.
"YCbCr"
Class Y encoding model. Each color index contains a triplet where the first value defines the luminance, the second defines the Cb chrominance, and the third the Cr chrominance.
"YUV"
Composite encoding color model. Each color index contains a triplet where the first value defines the luminance component (Y), and the second and third define the two chrominance components (U and V).
"HSV"
Each color index contains a triplet where the first value defines the hue component, the second defines the saturation component, and the third the value component. The hue component defines the hue spectrum, with a low value representing magenta/red and progressing to a high value representing blue/magenta, passing through yellow, green, and cyan. A low value for the saturation component means less color saturation than a high value. A low value for the value component is darker than a high value.
Attribute name="PAL_TYPE" (Required)
This attribute is of type H5T_C_S1, with size 9 or 10.
The currently supported values for this attribute are "STANDARD8" or "RANGEINDEX".
A PAL_TYPE of "STANDARD8" defines a palette dataset such that the first entry defines index 0, the second entry defines index 1, and so on, up to index (length of the palette - 1). This assumes an image dataset with direct indexes into the palette.
 
Deprecated

If the PAL_TYPE is set to "RANGEINDEX", there will be an additional attribute named "PAL_RANGEINDEX" (see Example 2 for more details).

Attribute name="PAL_RANGEINDEX"   (Deprecated)
The PAL_RANGEINDEX attribute contains an HDF object reference (HDF5 datatype H5T_STD_REF_OBJ) pointer which specifies a range index array in the file to be used for color lookups for the palette.  (Only for PAL_TYPE="RANGEINDEX")
Attribute name="PAL_MINMAXNUMERIC"
If present, this attribute is an array of two numbers, of the same HDF5 datatype as the palette elements or color numerics. The two numbers specify the minimum and maximum values of the color numeric components. For example, if the palette is an RGB palette of type Float, the color numeric range for Red, Green, and Blue could be set to be between 0.0 and 1.0. The intensity of the color guns would then be scaled accordingly to lie between this minimum and maximum.
Attribute name="PAL_VERSION"  (Required)
This attribute is of type H5T_C_S1, with size corresponding to the length of the version string.  This attribute identifies the version number of this specification to which it conforms.  The current version is "1.2".
Table 4. Attributes of a Palette Dataset

| Attribute Name | (R = Required, O = Optional) | Type | String Size | Value |
|---|---|---|---|---|
| CLASS | R | String | 7 | "PALETTE" |
| PAL_COLORMODEL | R | String | 3, 4, or 5 | Color Model:  "RGB", "YUV", "CMY", "CMYK", "YCbCr", or "HSV" |
| PAL_TYPE | R | String | 9 or 10 | "STANDARD8" or "RANGEINDEX" (Deprecated) |
| RANGE_INDEX (Deprecated) | (1) | Object Reference | | <Object Reference to Dataset of range index values> |
| PAL_MINMAXNUMERIC | O | Array[2] of <same datatype as palette> | | The first value is the <Minimum value for color values>, the second value is <Maximum value for color values> (2) |
| PAL_VERSION | R | String | 4 | "1.2" |

1.  The RANGE_INDEX attribute is required if the PAL_TYPE is "RANGEINDEX".  Otherwise, the RANGE_INDEX attribute should be omitted. (Range index is deprecated.)
2.  The minimum and maximum are optional.  If not set, the range is assumed to be the maximum range of the number type.  If one of these attributes is set, then both should be set.  The value of the minimum must be less than or equal to the value of the maximum.
Table 5 summarizes the use of the standard attributes for a palette dataset. R means that the attribute listed in the leftmost column is Required for the palette type in the first row, O means that the attribute is Optional for that type, and N means that the attribute cannot be applied to that type. The first four rows show the attributes that are always required for the two palette types.
 
 
Table 5. Applicability of Attributes

| PAL_TYPE | STANDARD8 | RANGEINDEX |
|---|---|---|
| CLASS | R | R |
| PAL_VERSION | R | R |
| PAL_COLORMODEL | R | R |
| RANGE_INDEX | N | R |
| PAL_MINMAXNUMERIC | O | O |

2.3. Storage Layout for Palettes

The values of the Palette are stored as a dataset.  The datatype can be any HDF5 atomic numeric type.  The dataset will have dimensions (nentries by ncomponents), where 'nentries' is the number of colors (usually 256) and 'ncomponents' is the number of values per color (3 for RGB, 4 for CMYK, etc.).
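As an illustration of the (nentries by ncomponents) layout, the sketch below builds a grayscale RGB palette as a list of rows and applies it to a few directly indexed pixel values, in the manner of a "STANDARD8" palette. Nested Python lists stand in for the dataset, and both function names are hypothetical.

```python
def grayscale_rgb_palette(nentries=256):
    """Build an nentries-by-3 palette whose row i is a gray level."""
    top = max(nentries - 1, 1)  # avoid dividing by zero for one entry
    return [[i * 255 // top] * 3 for i in range(nentries)]


def apply_palette(indexes, palette):
    """Replace each direct index with its palette row."""
    return [palette[i] for i in indexes]


palette = grayscale_rgb_palette()
pixels = apply_palette([0, 128, 255], palette)
```

The number of rows gives the palette length and the row length gives the number of components, so a CMYK palette would differ only in having 4 values per row.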
 

3.  Consistency and Correlation of Image and Palette Attributes

The objects in this specification are an extension to the base HDF5 specification and library.  They are accessible with the standard HDF5 library, but the semantics of the objects are not enforced by the base library.  For example, it is perfectly possible to add the attribute CLASS="IMAGE" to any dataset, or to include an object reference to any HDF5 dataset in a PALETTE attribute.  This would be a valid HDF5 file, but not conformant to this specification.  The rules defined in this specification must be implemented with appropriate software, and applications must use conforming software to assure correctness.

The Image and Palette specifications include several redundant standard attributes, such as the IMAGE_COLORMODEL and the PAL_COLORMODEL.  These attributes are informative, not normative, in that it is acceptable to attach a Palette to an Image dataset even if their attributes do not match.  Software is not required to enforce consistency, and files may contain mismatched associations of Images and Palettes.  In all cases, it is up to applications to determine what kinds of images and color models can be supported.

For example, an Image that was created from a file with an "RGB" color model may have a "YUV" Palette in its PALETTE attribute array.  This would be a legal HDF5 file that also conforms to this specification, although it may or may not be correct for a given application.
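Because these attributes are informative, an application that cares about consistency has to check for it itself. The following sketch reports mismatched color models without rejecting the file; the dictionary representation of attributes and the name `colormodel_mismatches` are assumptions made for this example.

```python
def colormodel_mismatches(image_attrs, palette_attrs_list):
    """Return the positions of attached palettes whose PAL_COLORMODEL
    differs from the image's IMAGE_COLORMODEL.

    An empty list is returned when the image declares no color model,
    since there is then nothing to compare against.
    """
    image_model = image_attrs.get("IMAGE_COLORMODEL")
    if image_model is None:
        return []
    return [
        position
        for position, attrs in enumerate(palette_attrs_list)
        if attrs.get("PAL_COLORMODEL") != image_model
    ]


image = {"CLASS": "IMAGE", "IMAGE_COLORMODEL": "RGB"}
palettes = [{"PAL_COLORMODEL": "YUV"}, {"PAL_COLORMODEL": "RGB"}]
mismatched = colormodel_mismatches(image, palettes)
```

An application might use such a report to pick the first matching palette, or to warn the user, while still treating the file as conforming.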