FunFiles: Funtools Data Files

Summary

This document describes the data file formats (FITS, array, raw events) as well as the file types (gzip, socket, etc.) supported by Funtools.

Description

Funtools supports FITS images and binary tables, and binary files containing array (homogeneous) data or event (heterogeneous) data. IRAF-style brackets are appended to the filename to specify various kinds of information needed to characterize these data:

  file[ext|ind|ARRAY()|EVENTS(),section][filters]
  or
  file[ext|ind|ARRAY()|EVENTS(),section,filters]
where:

Supported Data Formats

Funtools programs (and the underlying libraries) support the following data file formats:

Information needed to identify and characterize the event or image data can be specified on the command line using IRAF-style bracket notation appended to the filename:
  foo.fits                              # open FITS default extension
  image.fits[3]                         # open FITS extension #3
  events.fits[EVENTS]                   # open EVENTS extension
  array.file[ARRAY(s1024)]              # open 1024x1024 short array
  events.file[EVENTS(x:1024,y:1024...)] # open non-FITS event list
Note that in many Unix shells (e.g., csh and tcsh), filenames must be enclosed in quotes to protect the brackets from shell processing.

FITS Images and Binary Tables

When FunOpen() opens a FITS file without a bracket specifier, the default behavior is to look for a valid image in the primary HDU. In the absence of a primary image, Funtools will try to open an extension named either EVENTS or STDEVT, if one of these exists. This default behavior supports both FITS image processing and standard X-ray event list processing (which, after all, is what we at SAO/HEAD do).

In order to open a FITS binary table or image extension explicitly, it is necessary to specify either the extension name or the extension number in brackets:

  foo.fits[1]                      # open extension #1: the primary HDU
  foo.fits[3]                      # open extension #3 of a FITS file
  foo.fits[GTI]                    # open GTI extension of a FITS file
The ext argument specifies the name of the FITS extension (i.e. the value of the EXTENSION header parameter in a FITS extension), while the index specifies the value of the FITS EXTVER header parameter. Following FITS conventions, extension numbers start at 1.

When a FITS data file is opened for reading using FunOpen(), the specified extension is automatically located and is used to initialize the Funtools internal data structures.

Non-FITS Raw Event Files

In addition to FITS tables, Funtools programs and libraries can operate on non-FITS files containing heterogeneous event records. To specify such an event file, use: where event-spec is a string that specified the names, data types, and optional image dimensions for each element of the event record:

Data types follow standard conventions for FITS binary tables, but include two extra unsigned types ('U' and 'V'):

An optional integer value n can be prefixed to the type to indicate that the element is an array of n values. For example:
  foo.fits[EVENTS(x:I,y:I,status:4J)]
defines x and y as 16-bit ints and status as an array of 4 32-bit ints.

Furthermore, image dimensions can be attached to the event specification in order to tell Funtools how to bin the events into an image. They follow the conventions for the FITS TLMIN/TLMAX keywords. If the low image dimension is not specified, it defaults to 1. Thus:

both specify that the dimension of this column runs from 1 to 100.

NB: it is required that all padding be specified in the record definition. Thus, when writing out whole C structs instead of individual record elements, great care must be taken to include the compiler-added padding in the event definition.

For example, suppose a FITS binary table has the following set of column definitions:

  TTYPE1  = 'X                 ' / Label for field
  TFORM1  = '1I                ' / Data type for field
  TLMIN1  =                    1 / Min. axis value
  TLMAX1  =                   10 / Max. axis value
  TTYPE2  = 'Y                 ' / Label for field
  TFORM2  = '1I                ' / Data type for field
  TLMIN2  =                    2 / Min. axis value
  TLMAX2  =                   11 / Max. axis value
  TTYPE3  = 'PHA               ' / Label for field
  TFORM3  = '1I                ' / Data type for field
  TTYPE4  = 'PI                ' / Label for field
  TFORM4  = '1J                ' / Data type for field
  TTYPE5  = 'TIME              ' / Label for field
  TFORM5  = '1D                ' / Data type for field 
  TTYPE6  = 'DX                ' / Label for field
  TFORM6  = '1E                ' / Data type for field
  TLMIN6  =                    1 / Min. axis value
  TLMAX6  =                   10 / Max. axis value
  TTYPE7  = 'DY                ' / Label for field
  TFORM7  = '1E                ' / Data type for field
  TLMIN7  =                    3 / Min. axis value
  TLMAX7  =                   12 / Max. axis value
An raw event file containing these same data would have the event specification:
  EVENTS(X:I:10,Y:I:2:11,PHA:I,PI:J,TIME:D,DX:E:10,DY:E:3:12)

If no event specification string is included within the EVENTS() operator, then the event specification is taken from the EVENTS environment variable:

  setenv EVENTS "X:I:10,Y:I:10,PHA:I,PI:J,TIME:D,DX:E:10,DY:E:10"

In addition to knowing the data structure, it is necessary to know the endian ordering of the data, i.e., whether or not the data is in bigendian format, so that we can convert to the native format for this platform. This issue does not arise for FITS Binary Tables because all FITS files use big-endian ordering, regardless of platform. But for non-FITS data, big-endian data produced on a Sun workstation but read on a Linux PC needs to be byte-swapped, since PCs use little-endian ordering. To specify an ordering, use the bigendian= or endian= keywords on the command-line or the EVENTS_BIGENDIAN or EVENTS_ENDIAN environment variables. The value of the bigendian variables should be "true" or "false", while the value of the endian variables should be "little" or "big".

For example, a PC can access data produced by a Sun using:

  hrc.nepr[EVENTS(),bigendian=true]
or
  hrc.nepr[EVENTS(),endian=big]
or
  setenv EVENTS_BIGENDIAN true
or
  setenv EVENTS_ENDIAN big
If none of these are specified, the data are assumed to follow the format for that platform and no byte-swapping is performed.

Non-FITS Array Files

In addition to FITS images, Funtools programs and libraries can operate on non-FITS files containing arrays of homogeneous data. To specify an array file, use: where array-spec is of the form: and where [type] is:

The dim1 specification is required, but dim2 is optional and defaults to dim1. The skip specification is optional and defaults to 0. The optional endian specification can be 'l' or 'b' and defaults to the endian type for the current machine.

If no array specification is included within the ARRAY() operator, then the array specification is taken from the ARRAY environment variable. For example:

  foo.arr[ARRAY(r512)]          # bitpix=-32 dim1=512 dim2=512
  foo.arr[ARRAY(r512.400)]      # bitpix=-32 dim1=512 dim2=400
  foo.arr[ARRAY(r512.400])      # bitpix=-32 dim1=512 dim2=400
  foo.arr[ARRAY(r512.400:2880)] # bitpix=-32 dim1=512 dim2=400 skip=2880
  foo.arr[ARRAY(r512l)]         # bitpix=-32 dim1=512 dim2=512 endian=little
  setenv ARRAY "r512.400:2880"
  foo.arr[ARRAY()]              # bitpix=-32 dim1=512 dim2=400 skip=2880

Specifying Image Sections

Once a data file (and possibly, a FITS extension) has been specified, the next (optional) part of a bracket specification can be used to select image section information, i.e., to specify the x,y limits of an image section, as well as the blocking factor to apply to that section. This information can be added to any file specification but only is used by Funtools image processing routines.

The format of the image section specification is one of the following:

where the limit values can be ints or "*" for default. A single "*" can be used instead of val:val, as shown. Note that blocking is applied to the section after it is extracted.

In addition to image sections specified by the lo and hi x,y limits, image sections using center positions can be specified:

Note that the (float) values for dim, dim1, dim2, xcen, ycen must be specified or else the expression does not make sense!

In all cases, block is optional and defaults to 1. An 's' or 'a' can be appended to signify "sum" or "average" blocking (default is "sum"). Section specifications are given in image coordinates by default. If you wish to specify physical coordinates, add a 'p' as the last character of the section specification, before the closing bracket. For example:


A section can be specified in any Funtools file name. If the operation to be applied to that file is an imaging operation, then the specification will be utilized. If the operation is purely a table operation, then the section specification is ignored.

Do not be confused by:

  foo.fits[2]
  foo.fits[*,2]
The former specifies opening the second extension of the FITS file. The latter specifies application of block 2 to the image section.

Note that the section specification must come after any of FITS ext name or ind number, but all sensible defaults are supported:

Binning FITS Binary Tables and Non-FITS Event Files

If a FITS binary table or a non-FITS raw event file is to be binned into a 2D image (e.g., using the funimage program), it is necessary to specify the two columns to be used for the binning, as well as the dimensions of the image. Funtools first looks for a specifier of the form:
 bincols=([xnam[:tlmin[:tlmax:[binsiz]]]],[ynam[:tlmin[:tlmax[:binsiz]]]])
in bracket syntax, and uses the column names thus specified. The tlmin, tlmax, and binsiz specifiers determine the image binning dimensions using:
  dim = (tlmax - tlmin)/binsiz     (floating point data)
  dim = (tlmax - tlmin)/binsiz + 1 (integer data)
These tlmin, tlmax, and binsiz specifiers can be omitted if TLMIN, TLMAX, and TDBIN header parameters are present in the FITS binary table header, respectively. If only one parameter is specified, it is assumed to be tlmax, and tlmin defaults to 1. If two parameters are specified, they are assumed to be tlmin and tlmax. For example, to bin an HRC event list columns "VPOS" and "UPOS", use:
  hrc.nepr[bincols=(VPOS,UPOS)]
or
  hrc.nepr[bincols=(VPOS:49152,UPOS:4096)]
Note that you can optionally specify the dimensions of these columns to cover cases where neither TLMAX keywords are defined in the header. If either dimension is specified, then both must be specified.

You can set the FITS_BINCOLS or EVENTS_BINCOLS environment variable as an alternative to adding the "bincols=" specifier to each file name for FITS binary tables and raw event files, respectively. If no binning keywords or environment variables are specified, or if the specified columns are not in the binary table, the Chandra parameters CPREF (or PREFX) are searched for in the FITS binary table header. Failing this, columns named "X" and "Y" are sought. If these are not found, the code looks for columns containing the characters "X" and "Y". Thus, you can bin on "DETX" and "DETX" columns without specifying them, if these are the only column names containing the "X" and "Y" characters.

Ordinarily, each event or row contributes one count to an image pixel during the 2D binning process. Thus, if five events all have the same (x,y) position, the image pixel value for that position will have a value of five. It is possible to specify a variable contribution for each event by using the vcol=[colname] filter spec:

 vcol=[colname]
The vcol colname is a column containing a numeric value in each event row that will be used as the contribution of the given event to its image pixel. For example, consider an event file that has the following content:
  x:e:4    y:e:4    v:e
  ------   ------   ----
  1        1        1.0
  2        2        2.0
  3        3        3.0
  4        4        0.0
  1        1        1.0
  2        2        2.0
  3        3        3.0
  4        4        4.0
There are two events with x,y value of (1,1) so ordinarily a 2D image will have a value of 2 in the (1,1) pixel. If the v column is specified as the value column:
  foo.fits'[vcol=v]'
then each pixel will contain the additive sum of the associated (x,y) column values from the v column. For example, image pixel (1,1) will contain 1. + 1. = 2, image pixel (2,2) will contain (2 + 2) = 4, etc.

An important variation on the use of a value column to specify the contribution an event makes to an image pixel is when the value column contains the reciprocal of the event contribution. For this case, the column name should be prefixed with a / (divide sign) thus:

  foo.fits'[vcol=/v]'
Each image pixel value will then be the sum of the reciprocals of the value column. A zero in the value column results in NaN (not a number). Thus, in the above example, image pixel (1.1) will contain 1/1 + 1/1 = 2, image pixel (2,2) will contain (1/2 + 1/2) = 1, etc. Image pixel (4,4) will contain (1/0 + 1/4) = NaN.

You can set the FITS_VCOL or EVENTS_VCOL environment variable as an alternative to adding the "vcol=" specifier to each file name for FITS binary tables and raw event files, respectively.

Finally, when binning events, the data type of the resulting 2D image must be specified. This can be done with the "bitpix=[n]" keyword in the bracket specification. For example:

  events.fits[bincols=(VPOS,UPOS),bitpix=-32]
will create a floating point image binned on columns VPOS and UPOS. If no bitpix keyword is specified, bitpix=32 is assumed. As with bincols values, you also can use the FITS_BITPIX and EVENTS_BITPIX environment variables to set this value for FITS binary tables and raw event files, respectively.

The funimage program also allows you to create a 1D image projection along any column of a table by using the bincols=[column] filter specification and specifying a single column. For example, the following command projects a 1D image along the chipx column of a table:

  funimage ev.fits'[bincols=chipx]' im.fits
See funimage for more information about creating 1D and 2D images.

Finally, please note that Funtools supports most FITS standards. We will add missing support as required by the community. In general, however, we do not support non-standard extensions. For example, we sense the presence of the binary table 'variable length array' proposed extension and we pass it along when copying and filtering files, but we do not process it. We will add support for new standards as they become official.

Table and Spatial Region Filters

Note that, in addition extensions and image sections, Funtools bracket notation can be used to specify table and spatial region filters. These filters are always placed after the image section information. They can be specified in the same bracket or in a separate bracket immediately following:

where: The topics of table and region filtering are covered in detail in:

Disk Files and Other Supported File Types

The specified file usually is an ordinary disk file. In addition, gzip'ed files are supported in Funtools: gzip'ed input files are automatically uncompressed as they are read, and gzip'ed output files are compressed as they are written. NB: if a FITS binary table is written in gzip format, the number of rows in the table will be set to -1. Such a file will work with Funtools programs but will not work with other FITS programs such as ds9.

The special keywords "stdin" and "stdout" designate Unix standard input and standard output, respectively. The string "-" (hyphen) will be taken to mean "stdin" if the file is opened for reading and "stdout" if the file is opened for writing.

A file also can be an INET socket on the same or another machine using the syntax:

  machine:port
Thus, for example:
  karapet:1428
specifies that I/O should be performed to/from port 1428 on the machine karapet. If no machine name is specified, the default is to use the current machine:
  :1428
This means to open port 1428 on the current machine. Socket support allows you to generate a distributed pipe:
  on karapet:       funtask1 in.fits bynars:1428
  on bynars:        funtask2 :1428 out.fits
The socket mechanism thus supports simple parallel processing using process decomposition. Note that parallel processing using data decomposition is supported via the section specifier (see below), and the row# specifier, which is part of Table Filtering.

A file also can be a pointer to shared memory using the syntax:

  shm:[id|@key][:size]
A shared memory segment is specified with a shm: prefix, followed by either the shared memory id or the shared memory key (where the latter is prefixed by the '@' character). The size (in bytes) of the shared memory segment can then be appended (preceded by the ':' character). If the size specification is absent, the code will attempt to determine the length automatically. If the open mode contains the string "w+", then the memory segment will be created if it does not exist. (It also will be released and deleted when the file is closed.) In the case where a memory segment is being created, the length of the segment is required.

A file also can be Unix piped command (i.e. a program to run) using the syntax:

  "pipe: command arg1 ... argn"
The output from the command must be a valid FITS file. It is important to use quotes to protect spaces so that command arguments are passed correctly. A silly example is:
  fundisp "pipe: funtable 'foo.fits[cir 512 512 .1]' stdout"
This seemed like a good idea at the time ...

Lists of Files

Funtools also will process a list of files as a single file using the syntax:

  "list: file1 file2 ... filen"
The files in the list are separated by whitespace. Any of the above file types can be used. For example, if two files, foo1.fits and foo2.fits, are part of the same observation, they can be processed as a single file (using their own filters):
  fundisp "list: foo1.fits[cir(512,512,10)] foo2.fits[cir(511,511,10)]"
         X        Y      PHA       PI                  TIME       DX       DY
  -------- -------- -------- -------- --------------------- -------- --------
       512      512        6        7     79493997.45854475      578      574
       512      512        8        9     79494575.58943175      579      573
       512      512        5        6     79493631.03866175      578      575
       512      512        5        5     79493290.86521725      578      575
       512      512        8        9     79493432.00990875      579      573
       511      511        5        5     79488631.09462625      580      575
       511      511       10       11     79488780.60006675      580      573
       511      511        4        4     79494562.35474326      580      575
       511      511        6        6     79488203.01561825      580      575
       511      511        6        6     79488017.99730176      580      575
       511      511        4        4     79494332.45355175      580      575
       511      511        9       10     79492685.94014275      581      574
       511      511        5        5     79487708.71298325      580      575
       511      511        8        9     79493719.00160225      581      573
Again, note that it is important to avoid spaces in the filters because the list separator also is whitespace. To protect whitespace in a filter, enclose the file specification in quotes:
  fundisp "list: 'foo1.fits[cir 512 512 .1]' foo2.fits[cir(511,511,.1)]" 

Go to Funtools Help Index

Last updated: February 15, 2006