summaryrefslogtreecommitdiffstats
path: root/funtools/doc/filters.html
blob: bd78fea68008a6c50f7e2753e29cb6bf006f2b24 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
<!-- =defdoc funfilters funfilters n -->
<HTML>
<HEAD>
<TITLE>Table Filtering</TITLE>
</HEAD>
<BODY>

<!-- =section funfilters NAME -->
<H2><A NAME="funfilters">Funfilters: Filtering Rows in a Table</A></H2>

<!-- =section funfilters SYNOPSIS -->
<H2>Summary</H2>
<P>
This document contains a summary of the user interface for 
filtering rows in binary tables.

<!-- =section funfilters DESCRIPTION -->
<H2>Description</H2>
<P>
Table filtering allows a program to select rows from an table (e.g.,
X-ray event list) by checking each row against one or more expressions
involving the columns in the table. When a table is filtered, only
valid rows satisfying these expressions are passed through for processing.

<P>
A filter expression is specified using bracket notation appended to
the filename of the data being processed:
<PRE>
  foo.fits[pha==1&&pi==2]
</PRE>
It is also possible to put region specification inside a file and
then pass the filename in bracket notation:
<PRE>
  foo.fits[@my.reg]
</PRE>
Filters must be placed after the extension and image section
information, when such information is present. The correct order is:
<UL>
<LI> file[fileinfo,sectioninfo][filters]
<LI> file[fileinfo,sectioninfo,filters]
</UL>
where:
<UL>
<LI> <B>file</B> is the Funtools file name
<LI> <B>fileinfo</B> is an ARRAY, EVENT, FITS extension, or FITS index
<LI> <B>sectioninfo</B> is the image section to extract
<LI> <B>filters</B> are spatial region and table (row) filters to apply
</UL>
See <A HREF="./files.html">Funtools Files</A> for more information
on file and image section specifications.

<H2>Filter Expressions</H2>

<P>
Table filtering can be performed on columns of data in a FITS
binary table or a raw event file.  Table filtering is accomplished by
means of <B>table filter specifications</B>.  An table filter
specification consists of one or more <B>filter expressions</B> Filter
specifications also can contain comments and local/global processing
directives.

<P>
More specifically, a filter specification consist of one or more lines
containing:
<PRE>
  # comment until end of line
  # include the following file in the table descriptor
  @file
  # each row expression can contain filters separated by operators
  [filter_expression] BOOLOP [filter_expression2], ...
  # each row expression can contain filters separated by the comma operator
  [filter_expression1], [filter_expression2], ...
  # the special row# keyword allows a range of rows to be processed
  row#=m:n
  # or a single row
  row#=m
  # regions are supported -- but are described elsewhere
  [spatial_region_expression]
</PRE>

<P>
A single filter expression consists of an arithmetic, logical, or
other operations involving one or more column values from a
table. Columns can be compared to other columns, to header values,
or to numeric constants. Standard math functions can be applied to
columns. Separate filter expressions can be combined using boolean operators.
Standard C semantics can be used when constructing expressions, with
the usual precedence and associativity rules holding sway:
<PRE>
  Operator                                Associativity
  --------                                -------------
  ()                                      left to right
  !! (logical not)                        right to left
  !  (bitwise not) - (unary minus)        right to left
  *  /                                    left to right
  +  -                                    left to right
  &lt; &lt;= &gt; &gt;=                               left to right
  == !=                                   left to right
  &  (bitwise and)                        left to right
  ^  (bitwise exclusive or)               left to right
  |  (bitwise inclusive or)               left to right
  && (logical and)                        left to right
  || (logical or)                         left to right
  =                                       right to left
</PRE>
For example, if energy and pha are columns in a table, 
then the following are valid expressions:
<PRE>
  pha>1
  energy == pha
  (pha>1) && (energy<=2)
  max(pha,energy)>=2.5
</PRE>

<P>
Comparison values can be integers or floats. Integer comparison values can be
specified in decimal, octal (using '0' as prefix), hex (using '0x' as prefix)
or binary (using '0b' as prefix). Thus, the following all specify the same
comparison test of a status mask:
<PRE>
  (status & 15) == 8           # decimal
  (status & 017) == 010        # octal
  (status & 0xf) == 0x8        # hex
  (status & 0b1111) == 0b1000  # binary
</PRE>
<P>
The special keyword row# allows you to process a range of rows.
When row# is specified, the filter code skips to the designated
row  and only processes the specified number of rows. The
"*" character can be utilized as the high limit value to denote
processing of the remaining rows. Thus:
<PRE>
  row#=100:109
</PRE>
processes 10 rows, starting with row 100 (counting from 1),
while:
<PRE>
  row#=100:*
</PRE>
specifies that all but the first 99 rows are to be processed.

<P>
Spatial region filtering allows a program to select regions of an
image or rows of a table (e.g., X-ray events) using simple geometric
shapes and boolean combinations of shapes.  For a complete description
of regions, see <A HREF="./regions.html">Spatial Region Filtering</A>.

<H2><A NAME="separators">Separators Also Are Operators</A></H2>
<P>
As mentioned previously, multiple filter expressions can be specified
in a filter descriptor, separated by commas or new-lines.
When such a comma or new-line separator is used, the boolean AND operator
is automatically generated in its place. Thus and expression such as:
<PRE>
  pha==1,pi=2:4
</PRE>
is equivalent to:
<PRE>
  (pha==1) && (pi>=2&&pi<=4)
</PRE>
<P>
[Note that the behavior of separators is different for filter expressions
and spatial region expressions.  The former uses AND as the operator, while
the latter user OR. See
<A HREF="./combo.html">Combining Region and Table Filters</A>
for more information about these conventions and how they are treated
when combined.]

<H2><A NAME="range">Range Lists</A></H2><P> 
<P>
Aside from the standard C syntax, filter expressions can make use of
IRAF-style <B>range lists</B> which specify a range of values. The
syntax requires that the column name be followed by an '=' sign, which
is followed by one or more comma-delimited range expressions of the form:
<PRE>
  col = vv              # col == vv in range
  col = :vv             # col <= vv in range
  col = vv:             # col >= vv in range
  col = vv1:vv2         # vv1 <= col <= vv2 in range
</PRE>
The vv's above must be numeric constants; the right hand side of a
range list cannot contain a column name or header value.
<P>
Note that, unlike an ordinary comma separator, the comma separator used
between two or more range expressions denotes OR.  Thus, when two or
more range expressions are combined with a comma separator, the resulting
expression is a shortcut for more complicated boolean logic. For example:
<PRE>
  col = :3,6:8,10:
</PRE>
is equivalent to:
<PRE>
  (col<=3) || (col>=6 && col <=8) || (col >=10)
</PRE>
Note also that the single-valued rangelist:
<PRE>
  col = val
</PRE>
is equivalent to the C-based filter expression:
<PRE>
  col == val
</PRE>
assuming, of course, that val is a numeric constant.

<H2><A NAME="math">Math Operations and Functions</A></H2><P> 
<P>
It is permissible to specify C math functions as part of the filter syntax.
When the filter parser recognizes a function call, it automatically
includes the math.h and links in the C math library.  Thus, it is
possible to filter rows by expressions such as these:
<UL>
<LI>(pi+pha)>(2+log(pi)-pha)
<LI> min(pi,pha)*14>x
<LI> max(pi,pha)==(pi+1)
<LI> feq(pi,pha)
<LI> div(pi,pha)>0
</UL>
The function feq(a,b) returns true (1) if the difference between a and b
(taken as double precision values) is less than approximately 10E-15.
The function div(a,b) divides a by b, but returns NaN (not a number)
if b is 0. It is a safe way to avoid floating point errors when
dividing one column by another.

<H2><A NAME="include">Include Files</A></H2><P> 
<P>
The special <B>@filename</B> directive specifies an include file
containing filter expressions. This file is processed as part of
the overall filter descriptor:
<PRE>
  foo.fits[pha==1,@foo]
</PRE>

<H2><A NAME="header">Header Parameters</A></H2><P> 
<P>
The filter syntax supports comparison between a column value and a
header parameter value of a FITS binary tables (raw event files have no
such header).  The header parameters can be taken from the binary
table header or the primary header.  For example, assuming there is a
header value MEAN_PHA in one of these headers, you can select photons
having exactly this value using:

<UL>
<LI> pha==MEAN_PHA
</UL>

<H2>Examples</H2>
<P>
Table filtering is more easily described by means of examples.
Consider data containing the following table structure:
<UL>
<LI> double TIME
<LI> int X
<LI> int Y
<LI> short PI
<LI> short PHA
<LI> int DX
<LI> int DY
</UL>

<P>
Tables can be filtered on these columns using IRAF/QPOE range syntax or
any valid C syntax.  The following examples illustrate the possibilities:
<DL>

<P>
<DT> pha=10
<DT> pha==10
<DD> select rows whose pha value is exactly 10

<P>
<DT> pha=10:50
<DD> select rows whose pha value is in the range of 10 to 50

<P>
<DT> pha=10:50,100
<DD> select rows whose pha value is in the range of 10 to 50 or is
equal to 100

<P>
<DT> pha>=10 && pha<=50
<DD> select rows whose pha value is in the range of 10 to 50

<P>
<DT> pi=1,2&&pha>3
<DD> select rows whose pha value is 1 or 2 and whose pi value is 3

<P>
<DT> pi=1,2 || pha>3
<DD> select rows whose pha value is 1 or 2 or whose pi value is 3

<P>
<DT> pha==pi+1
<DD> select rows whose pha value is 1 less than the pi value

<P>
<DT> (pha==pi+1) && (time>50000.0)
<DD> select rows whose pha value is 1 less than the pi value
and whose time value is greater than 50000

<P>
<DT>(pi+pha)>20
<DD> select rows in which the sum of the pi and pha values is greater
than 20

<P>
<DT> pi%2==1
<DD> select rows in which the pi value is odd
</DL>

<P>
Currently, integer range list limits cannot be specified in binary
notation (use decimal, hex, or octal instead). Please contact us if
this is a problem.

<!-- =section funfilters SEE ALSO -->
<!-- =text See funtools(n) for a list of Funtools help pages -->
<!-- =stop -->

<P>
<A HREF="./help.html">Go to Funtools Help Index</A>

<H5>Last updated: November 17, 2005</H5>

</BODY>
</HTML>