HDF5: API Mapping to legacy APIs

| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Open existing file for read/write | ncopen | SDstart | AIO_open | H5Fopen | |
| Create new file for read/write | nccreate | | | H5Fcreate | The SD API handles this with SDstart. |
| Close file | ncclose | SDend | AIO_close | H5Fclose | |
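To make the file-lifecycle rows concrete, here is a minimal sketch using h5py, the Python binding for the released HDF5 library (its availability is an assumption of this example, and the file and dataset names are made up for illustration); the H5F* calls in the table are what h5py invokes underneath.

```python
import h5py  # assumed available; wraps the HDF5 C library

# Create a new file for read/write (nccreate / H5Fcreate); "w" truncates.
with h5py.File("apimap_files.h5", "w") as f:
    f.create_dataset("temperature", data=[1.0, 2.0, 3.0])
# Leaving the "with" block closes the file (ncclose / SDend / H5Fclose).

# Open the existing file for read/write (ncopen / SDstart / H5Fopen).
with h5py.File("apimap_files.h5", "r+") as f:
    names = list(f.keys())  # ['temperature']
```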
| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Redefine parameters | ncredef | | | | Unnecessary under the SD & HDF5 data models. |
| End "define" mode | ncendef | | | | Unnecessary under the SD & HDF5 data models. |
| Query the number of datasets, dimensions, and attributes in a file | ncinquire | SDfileinfo | | H5Dget_info, H5Rget_num_relations, H5Gget_num_contents | The HDF5 interface is more granular and flexible. |
| Update a writable file with current changes | ncsync | | AIO_flush | H5Mflush | The HDF5 interface is more flexible because it can be applied to parts of the file hierarchy instead of the whole file at once. The SD interface does not have this feature, although most of the lower HDF library supports it. |
| Close file access without applying recent changes | ncabort | | | | How useful is this feature? |
| Create new dimension | ncdimdef | SDsetdimname | | H5Mcreate | The SD interface actually creates dimensions along with datasets; this call just names them. |
| Get ID of existing dimension | ncdimid | SDgetdimid | | H5Maccess | The SD interface looks up dimensions by index and the netCDF interface uses names, but they are close enough. The HDF5 interface does not currently allow access to particular dimensions, only to the dataspace as a whole. |
| Get size & name of dimension | ncdiminq | SDdiminfo | | H5Mget_name, H5Sget_lrank | Only a rough match. |
| Rename dimension | ncdimrename | SDsetdimname | | H5Mset_name | |
| Create a new dataset | ncvardef | SDcreate | AIO_mkarray | H5Mcreate | |
| Attach to an existing dataset | ncvarid | SDselect | AIO_arr_load | H5Maccess | |
| Get basic information about a dataset | ncvarinq | SDgetinfo | AIO_arr_get_btype, AIO_arr_get_nelmts, AIO_arr_get_nbdims, AIO_arr_get_bdims, AIO_arr_get_slab | H5Dget_info | Each interface returns a different amount of information; some auxiliary functions are needed to obtain an equivalent amount. |
| Write a single value to a dataset | ncvarput1 | SDwritedata | AIO_write | H5Dwrite | What is this useful for? |
| Read a single value from a dataset | ncvarget1 | SDreaddata | AIO_read | H5Dread | What is this useful for? |
| Write a solid hyperslab of data (i.e. a contiguous subset) to a dataset | ncvarput | SDwritedata | AIO_write | H5Dwrite | |
| Read a solid hyperslab of data (i.e. a contiguous subset) from a dataset | ncvarget | SDreaddata | AIO_read | H5Dread | |
| Write a general hyperslab of data (i.e. possibly subsampled) to a dataset | ncvarputg | SDwritedata | AIO_write | H5Dwrite | |
| Read a general hyperslab of data (i.e. possibly subsampled) from a dataset | ncvargetg | SDreaddata | AIO_read | H5Dread | |
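In the released HDF5 library, the single-value, solid-hyperslab, and general-hyperslab cases all collapse onto H5Dread/H5Dwrite plus a dataspace selection. A sketch via h5py (assumed available), where the selection is expressed as NumPy-style slicing:

```python
import numpy as np
import h5py  # assumed available

with h5py.File("apimap_slab.h5", "w") as f:
    dset = f.create_dataset("grid", data=np.arange(36).reshape(6, 6))

    # Solid hyperslab (contiguous subset), cf. ncvarput / SDwritedata:
    dset[1:3, 1:3] = -1

    # General hyperslab (strided subsampling), cf. ncvarputg:
    dset[::2, ::2] = 0

    # Single value, cf. ncvarget1 (a one-element H5Dread underneath):
    corner = dset[5, 5]  # 35: row 5 is odd, so untouched by the writes above
```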
| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Rename a dataset variable | ncvarrename | | | H5Mset_name | |
| Add an attribute to a dataset | ncattput | SDsetattr | | H5Rattach_oid | HDF5 requires creating a separate object to attach to a dataset, but it also allows objects to be attributes of any other object, even nested. |
| Get attribute information | ncattinq | SDattrinfo | | H5Dget_info | HDF5 has no attribute-specific functions; attributes are treated like all other objects in the file. |
| Retrieve attribute for a dataset | ncattget | SDreadattr | | H5Dread | HDF5 uses general dataset I/O for attributes. |
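The attribute rows above describe a draft design (attributes as general nested objects); the released library instead grew a dedicated H5A interface. A sketch of the equivalent operations via h5py (assumed available; names are illustrative):

```python
import h5py  # assumed available

with h5py.File("apimap_attr.h5", "w") as f:
    dset = f.create_dataset("pressure", data=[101.3, 99.8])

    # Attach an attribute to the dataset (ncattput / SDsetattr):
    dset.attrs["units"] = "kPa"

    # Read it back (ncattget / SDreadattr):
    units = dset.attrs["units"]

    # Enumerate attribute names (ncattname / SDattrinfo):
    names = list(dset.attrs)  # ['units']
```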
| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Copy attribute from one dataset to another | ncattcopy | | | | What is this used for? |
| Get name of attribute | ncattname | SDattrinfo | | H5Mget_name | |
| Rename attribute | ncattrename | | | H5Mset_name | |
| Delete attribute | ncattdel | | | H5Mdelete | This can be faked in the current HDF interface with lower-level calls. |
| Compute # of bytes to store a number-type | nctypelen | DFKNTsize | | | Hmm, the HDF5 datatype interface needs this functionality. |
| Indicate that fill-values are to be written to a dataset | ncsetfill | SDsetfillmode | | | The HDF5 datatype interface should work on this functionality. |
| Get information about "record" variables (those datasets which share the same unlimited dimension) | ncrecinq | | | | This should probably be wrapped in a higher-layer interface, if it's needed for HDF5. |
| Get a record from each dataset sharing the unlimited dimension | ncrecget | | | | This is somewhat equivalent to reading a vdata with non-interlaced fields, only in a dataset-oriented way. This should also be wrapped in a higher-layer interface if it's necessary for HDF5. |
| Put a record to each dataset sharing the unlimited dimension | ncrecput | | | | This is somewhat equivalent to writing a vdata with non-interlaced fields, only in a dataset-oriented way. This should also be wrapped in a higher-layer interface if it's necessary for HDF5. |
| Map a dataset's name to an index to reference it with | | SDnametoindex | | H5Mfind_name | Equivalent functionality, except the HDF5 call returns an OID instead of an index. |
| Get the valid range of values for data in a dataset | | SDgetrange | | | Easily implemented with attributes at a higher level for HDF5. |
| Release access to a dataset | | SDendaccess | AIO_arr_destroy | H5Mrelease | Odd that the netCDF API doesn't have this... |
| Set the valid range of data in a dataset | | SDsetrange | | | Easily implemented with attributes at a higher level for HDF5. |
| Set the label, units, format, etc. of the data values in a dataset | | SDsetdatastrs | | | Easily implemented with attributes at a higher level for HDF5. |
| Get the label, units, format, etc. of the data values in a dataset | | SDgetdatastrs | | | Easily implemented with attributes at a higher level for HDF5. |
| Set the label, units, format, etc. of the dimensions in a dataset | | SDsetdimstrs | | | Easily implemented with attributes at a higher level for HDF5. |
| Get the label, units, format, etc. of the dimensions in a dataset | | SDgetdimstrs | | | Easily implemented with attributes at a higher level for HDF5. |
| Set the scale of the dimensions in a dataset | | SDsetdimscale | | | Easily implemented with attributes at a higher level for HDF5. |
| Get the scale of the dimensions in a dataset | | SDgetdimscale | | | Easily implemented with attributes at a higher level for HDF5. |
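As the comments predict, released HDF5 did implement dimension scales "with attributes at a higher level": the HDF5 Dimension Scales API, exposed in h5py as make_scale/attach_scale. A sketch of the SDsetdimscale/SDgetdimscale rows in those terms (h5py is assumed available; names are illustrative):

```python
import numpy as np
import h5py  # assumed available; make_scale needs h5py >= 2.10

with h5py.File("apimap_dimscale.h5", "w") as f:
    data = f.create_dataset("rainfall", data=np.zeros((3, 4)))
    time = f.create_dataset("time", data=np.arange(3.0))

    # Mark "time" as a dimension scale and attach it (cf. SDsetdimscale):
    time.make_scale("time")
    data.dims[0].attach_scale(time)

    # Query the attached scale back (cf. SDgetdimscale / SDdiminfo):
    scale_path = data.dims[0][0].name  # "/time"
```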
| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Set the calibration parameters of the data values in a dataset | | SDsetcal | | | Easily implemented with attributes at a higher level for HDF5. |
| Get the calibration parameters of the data values in a dataset | | SDgetcal | | | Easily implemented with attributes at a higher level for HDF5. |
| Set the fill value for the data values in a dataset | | SDsetfillvalue | | | HDF5 needs something like this; I'm not certain where to put it. |
| Get the fill value for the data values in a dataset | | SDgetfillvalue | | | HDF5 needs something like this; I'm not certain where to put it. |
| Move/Set the dataset to be in an 'external' file | | SDsetexternalfile | | H5Dset_storage | HDF5 has simple functions for this, but needs an API for setting up the storage flow. |
| Move/Set the dataset to be stored using only certain bits from the dataset | | SDsetnbitdataset | | H5Dset_storage | HDF5 has simple functions for this, but needs an API for setting up the storage flow. |
| Move/Set the dataset to be stored in compressed form | | SDsetcompress | | H5Dset_storage | HDF5 has simple functions for this, but needs an API for setting up the storage flow. |
| Search for a dataset attribute with a particular name | | SDfindattr | | H5Mfind_name, H5Mwild_search | HDF5 can handle wildcard searches for this feature. |
| Map a run-time dataset handle to a persistent disk reference | | SDidtoref | | | I'm not certain this is needed for HDF5. |
| Map a persistent disk reference for a dataset to an index in a group | | SDreftoindex | | | I'm not certain this is needed for HDF5. |
| Determine if a dataset is a 'record' variable (i.e. it has an unlimited dimension) | | SDisrecord | | | Easily implemented by querying the dimensionality at a higher level for HDF5. |
| Determine if a dataset is a 'coordinate' variable (i.e. it is used as a dimension) | | SDiscoord | | | I'm not certain this is needed for HDF5. |
| Set the access type (i.e. parallel or serial) for dataset I/O | | SDsetaccesstype | | | HDF5 has functions for reading this information, but needs a better API for setting up the storage flow. |
| Set the size of blocks used to store a dataset with unlimited dimensions | | SDsetblocksize | | | HDF5 has functions for reading this information, but needs a better API for setting up the storage flow. |
| Set backward compatibility of dimensions created | | SDsetdimval_comp | | | Unnecessary in HDF5. |
| Check backward compatibility of dimensions created | | SDisdimval_comp | | | Unnecessary in HDF5. |
| Move/Set the dataset to be stored in chunked form | | SDsetchunk | | H5Dset_storage | HDF5 has simple functions for this, but needs an API for setting up the storage flow. |
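The "storage flow" API the comments call for became dataset creation properties in released HDF5. The chunking, compression, and fill-value rows can all be covered in a single dataset-creation call; a sketch via h5py (assumed available; names and sizes are illustrative):

```python
import h5py  # assumed available

with h5py.File("apimap_storage.h5", "w") as f:
    dset = f.create_dataset(
        "measurements",
        shape=(1000, 1000),
        dtype="f4",
        chunks=(100, 100),    # cf. SDsetchunk
        compression="gzip",   # cf. SDsetcompress
        fillvalue=-999.0,     # cf. SDsetfillvalue
    )
    # Nothing has been written, so reads return the fill value:
    corner = dset[0, 0]  # -999.0

    # Query the chunk layout back (cf. SDgetchunkinfo):
    chunk_shape = dset.chunks  # (100, 100)
```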
| Functionality | netCDF | SD | AIO | HDF5 | Comments |
|---|---|---|---|---|---|
| Get the chunking information for a dataset stored in chunked form | | SDgetchunkinfo | | H5Dstorage_detail | |
| Read/Write chunks of a dataset using a chunk index | | SDreadchunk, SDwritechunk | | | I'm not certain that HDF5 needs something like this. |
| Tune chunk caching parameters for chunked datasets | | SDsetchunkcache | | | HDF5 needs something like this. |
| Change some default behavior of the library | | | AIO_defaults | | Something like this would be useful in HDF5, to tune I/O pipelines, etc. |
| Flush and close all open files | | | AIO_exit | | Something like this might be useful in HDF5, although it could be encapsulated in a higher-level function. |
| Target an architecture for data-type storage | | | AIO_target | | There are some rough parallels with using the datatype interface in HDF5 to create datatype objects which can be used to write out future datasets. |
| Map a filename to a file ID | | | AIO_filename | H5Mget_name | |
| Get the active directory (where new datasets are created) | | | AIO_getcwd | | HDF5 allows multiple directories (groups) to be attached to, any of which can have new datasets created within it. |
| Change active directory | | | AIO_chdir | | Since HDF5 has a slightly different access method for directories (groups), this functionality can be wrapped around calls to H5Gget_oid_by_name. |
| Create directory | | | AIO_mkdir | H5Mcreate | |
| Return detailed information about an object | | | AIO_stat | H5Dget_info, H5Dstorage_detail | Perhaps more information should be provided through another function in HDF5? |
| Get "flag" information | | | AIO_getflags | | Not required in HDF5. |
| Set "flag" information | | | AIO_setflags | | Not required in HDF5. |
| Get detailed information about all objects in a directory | | | AIO_ls | H5Gget_content_info_mult, H5Dget_info, H5Dstorage_detail | Only roughly equivalent functionality in HDF5; perhaps more should be added? |
| Get base type of object | | | AIO_BASIC | H5Gget_content_info | |
| Set base type of dataset | | | AIO_arr_set_btype | H5Mcreate(DATATYPE) | |
| Set dimensionality of dataset | | | AIO_arr_set_bdims | H5Mcreate(DATASPACE) | |
| Set slab of dataset to write | | | AIO_arr_set_slab | | This is similar to the process of creating a dataspace for use when performing I/O on an HDF5 dataset. |
| Describe chunking of dataset to write | | | AIO_arr_set_chunk | H5Dset_storage | |
| Describe array index permutation of dataset to write | | | AIO_arr_set_perm | H5Dset_storage | |
| Create a new dataset with dataspace and datatype information from an existing dataset | | | AIO_arr_copy | | This can be mimicked in HDF5 by attaching to the datatype and dataspace of an existing dataset and using their IDs to create new datasets. |
| Create a new directory to group objects within | | | AIO_mkgroup | H5Mcreate(GROUP) | |
| Read names of objects in a directory | | | AIO_read_group | H5Gget_content_info_mult | |
| Add objects to a directory | | | AIO_write_group | H5Ginsert_item_mult | |
| Combine an architecture and numeric type to derive the format's datatype | | | AIO_COMBINE | | This would be a nice feature to add to HDF5. |
| Derive an architecture from the format's datatype | | | AIO_ARCH | | This would be a nice feature to add to HDF5. |
| Derive a numeric type from the format's datatype | | | AIO_PNT | | This would be a nice feature to add to HDF5. |
| Register an error-handling function for the library to call when errors occur | | | AIO_error_handler | | This should be added to HDF5. |