<?xml version='1.0'?>
<!DOCTYPE sconsdoc [
    <!ENTITY % scons SYSTEM "../scons.mod">
    %scons;
    
    <!ENTITY % builders-mod SYSTEM "../generated/builders.mod">
    %builders-mod;
    <!ENTITY % functions-mod SYSTEM "../generated/functions.mod">
    %functions-mod;
    <!ENTITY % tools-mod SYSTEM "../generated/tools.mod">
    %tools-mod;
    <!ENTITY % variables-mod SYSTEM "../generated/variables.mod">
    %variables-mod;
]>

<chapter id="chap-scanners"
         xmlns="http://www.scons.org/dbxsd/v1.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.scons.org/dbxsd/v1.0/scons.xsd scons.xsd">
<title>Writing Scanners</title>

<!--

  __COPYRIGHT__

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
  "Software"), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish,
  distribute, sublicense, and/or sell copies of the Software, and to
  permit persons to whom the Software is furnished to do so, subject to
  the following conditions:

  The above copyright notice and this permission notice shall be included
  in all copies or substantial portions of the Software.

  THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY
  KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
  WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
  NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
  LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
  OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
  WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

-->

<!--

=head1 Using and writing dependency scanners

QuickScan allows simple target-independent scanners to be set up for
source files. Only one QuickScan scanner may be associated with any given
source file and environment, although the same scanner may (and should)
be used for multiple files of a given type.

A QuickScan scanner is only ever invoked once for a given source file,
and it is only invoked if the file is used by some target in the tree
(i.e., there is a dependency on the source file).

QuickScan is invoked as follows:

  QuickScan CONSENV CODEREF, FILENAME [, PATH]

The subroutine referenced by CODEREF is expected to return a list of
filenames included directly by FILE. These filenames will, in turn, be
scanned. The optional PATH argument supplies a lookup path for finding
FILENAME and/or files returned by the user-supplied subroutine.  The PATH
may be a reference to an array of lookup-directory names, or a string of
names separated by the system's separator character (':' on UNIX systems,
';' on Windows NT).

The subroutine is called once for each line in the file, with $_ set to the
current line. If the subroutine needs to look at additional lines, or, for
that matter, the entire file, then it may read them itself, from the
filehandle SCAN. It may also terminate the loop, if it knows that no further
include information is available, by closing the filehandle.

Whether or not a lookup path is provided, QuickScan first tries to lookup
the file relative to the current directory (for the top-level file
supplied directly to QuickScan), or from the directory containing the
file which referenced the file. This is not very general, but seems good
enough, especially if you have the luxury of writing your own utilities
and can control the use of the search path in a standard way.

Here's a real example, taken from a F<Construct> file here:

  sub cons::SMFgen {
      my($env, @tables) = @_;
      foreach $t (@tables) {
	  $env->QuickScan(sub { /\b\S*?\.smf\b/g }, "$t.smf",
			  $env->{SMF_INCLUDE_PATH});
	  $env->Command(["$t.smdb.cc","$t.smdb.h","$t.snmp.cc",
			 "$t.ami.cc", "$t.http.cc"], "$t.smf",
			q(smfgen %( %SMF_INCLUDE_OPT %) %<));
      }
  }

The subroutine above finds all names of the form <name>.smf in the
file. It will return the names even if they're found within comments,
but that's OK (the mechanism is forgiving of extra files; they're just
ignored on the assumption that the missing file will be noticed when
the program, in this example, smfgen, is actually invoked).

[NOTE that the form C<$env-E<gt>QuickScan ...>  and C<$env-E<gt>Command
...> should not be necessary, but, for some reason, is required
for this particular invocation. This appears to be a bug in Perl or
a misunderstanding on my part; this invocation style does not always
appear to be necessary.]

Here is another way to build the same scanner. This one uses an
explicit code reference, and also (unnecessarily, in this case) reads
the whole file itself:

  sub myscan {
      my(@includes);
      do {
	  push(@includes, /\b\S*?\.smf\b/g);
      } while <SCAN>;
      @includes
  }

Note that the order of the loop is reversed, with the loop test at the
end. This is because the first line is already read for you. This scanner
can be attached to a source file by:

  QuickScan $env \&myscan, "$_.smf";

This final example, which scans a different type of input file, takes
over the file scanning rather than being called for each input line:

  $env->QuickScan(
      sub { my(@includes) = ();
	  do {
	     push(@includes, $3)
		 if /^(#include|import)\s+(\")(.+)(\")/ && $3
	  } while <SCAN>;
	  @includes
      },
      "$idlFileName",
      "$env->{CPPPATH};$BUILD/ActiveContext/ACSCLientInterfaces"
  );

-->

  <para>

    &SCons; has built-in scanners that know how to look in
    C, Fortran and IDL source files for information about
    other files that targets built from those files depend on.
    For example, in the case of files that use the C preprocessor,
    these are the <filename>.h</filename> files that are specified
    using <literal>#include</literal> lines in the source.
    You can use the same mechanisms that &SCons; uses to create
    its built-in scanners to write scanners of your own for file types
    that &SCons; does not know how to scan "out of the box."

  </para>

  <section>
  <title>A Simple Scanner Example</title>

    <para>

      Suppose, for example, that we want to create a simple scanner
      for <filename>.k</filename> files.
      A <filename>.k</filename> file contains some text that
      will be processed,
      and can include other files on lines that begin
      with <literal>include</literal>
      followed by a file name:

    </para>

    <programlisting>
include filename.k
    </programlisting>

    <para>

      Scanning a file will be handled by a Python function
      that you must supply.
      Here is a function that uses the Python
      <literal>re</literal> module
      to scan for the <literal>include</literal> lines in our example:

    </para>

    <programlisting>
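import re

# Match lines of the form "include FILENAME" anywhere in the file.
include_re = re.compile(r'^include\s+(\S+)$', re.M)

# arg defaults to None because SCons passes it only when the
# Scanner was created with an argument.
def kfile_scan(node, env, path, arg=None):
    contents = node.get_text_contents()
    # Convert the file names found into File nodes.
    return env.File(include_re.findall(contents))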
    </programlisting>
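
    <para>

      As a rough illustration of what the regular expression matches,
      the following standalone sketch applies the same pattern to some
      made-up file contents.
      The <literal>sample_contents</literal> text is hypothetical
      and exists only for this illustration;
      no &SCons; objects are involved.

    </para>

    <programlisting>
```python
import re

# Same pattern as in the scanner function above.
include_re = re.compile(r'^include\s+(\S+)$', re.M)

# Hypothetical contents of a .k file, used only for illustration.
sample_contents = """\
some text to process
include other_file
more text
include sub/second.k
  include indented_is_ignored
"""

# re.M makes ^ and $ match at every line, so each include line
# is found; the indented line does not start with "include"
# and is therefore skipped.
found = include_re.findall(sample_contents)
print(found)  # ['other_file', 'sub/second.k']
```
    </programlisting>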

    <para>
    
      Note that the scanner function must return
      a list of File nodes; simple
      strings for the file names won't do. As in the examples shown here,
      you can use the &File;
      function of your current Environment to create nodes on the fly from
      a sequence of file names with relative paths.
      
    </para>

    <para>

      The scanner function must
      accept the four specified arguments
      and return a list of implicit dependencies.
      Typically, these are dependencies found
      by examining the contents of the file,
      although the function can use any
      method at all to generate the list of
      dependencies.

    </para>

    <variablelist>

      <varlistentry>
      <term>node</term>

      <listitem>
      <para>

      An &SCons; node object representing the file being scanned.
      The path name to the file can be
      obtained by converting the node to a string
      using the <literal>str()</literal> function,
      and the contents can be fetched with the node's
      <literal>get_text_contents()</literal> method.

      </para>
      </listitem>
      </varlistentry>

      <varlistentry>
      <term>env</term>

      <listitem>
      <para>

      The construction environment in effect for this scan.
      The scanner function may choose to use construction
      variables from this environment to affect its behavior.

      </para>
      </listitem>
      </varlistentry>

      <varlistentry>
      <term>path</term>

      <listitem>
      <para>

      A list of directories that form the search path for included files
      for this scanner.
      This is how &SCons; handles the &cv-link-CPPPATH; and &cv-link-LIBPATH;
      variables.

      </para>
      </listitem>
      </varlistentry>

      <varlistentry>
      <term>arg</term>

      <listitem>
      <para>

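      An optional argument that is passed to this scanner function
      only when the Scanner object was created with
      an <literal>argument</literal> keyword argument,
      so different scanner instances can share one function.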

      </para>
      </listitem>
      </varlistentry>

    </variablelist>
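
    <para>

      To make the calling convention concrete,
      the sketch below calls a scanner function by hand.
      <literal>FakeNode</literal> and <literal>FakeEnv</literal>
      are hypothetical stand-ins written for this illustration,
      not &SCons; classes;
      in a real build &SCons; supplies the node, the environment
      and the path itself.
      The default value for <literal>arg</literal> is an assumption
      made here so that the function can also be called
      with only three arguments.

    </para>

    <programlisting>
```python
import re

include_re = re.compile(r'^include\s+(\S+)$', re.M)

def kfile_scan(node, env, path, arg=None):
    contents = node.get_text_contents()
    return env.File(include_re.findall(contents))

class FakeNode:
    """Stand-in for an SCons file node (NOT the real class)."""
    def __init__(self, name, text):
        self.name = name
        self.text = text
    def __str__(self):
        # str(node) gives the path name of the file.
        return self.name
    def get_text_contents(self):
        return self.text

class FakeEnv:
    """Stand-in for a construction environment (NOT the real class)."""
    def File(self, names):
        # The real env.File returns File nodes; tagged tuples
        # stand in for them here.
        return [('File', name) for name in names]

node = FakeNode('foo.k', 'include other_file\n')
deps = kfile_scan(node, FakeEnv(), path=())
print(str(node), deps)
```
    </programlisting>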

    <para>

    A Scanner object is created using the &Scanner; function,
    which typically takes an <literal>skeys</literal> argument
    listing the file suffixes that this scanner handles.
    The Scanner object must then be added to the
    &cv-link-SCANNERS; construction variable of a construction environment,
    typically by using the &Append; method:

    </para>

    <programlisting>
kscan = Scanner(function = kfile_scan,
                skeys = ['.k'])
env.Append(SCANNERS = kscan)
    </programlisting>

    <para>

    When we put it all together, it looks like:

    </para>

    <scons_example name="scanners_scan">
      <file name="SConstruct" printme="1">
  import re

  include_re = re.compile(r'^include\s+(\S+)$', re.M)

  def kfile_scan(node, env, path, arg=None):
      contents = node.get_text_contents()
      includes = include_re.findall(contents)
      return env.File(includes)

  kscan = Scanner(function = kfile_scan,
                  skeys = ['.k'])

  env = Environment(ENV = {'PATH' : '__ROOT__/usr/local/bin'})
  env.Append(SCANNERS = kscan)

  env.Command('foo', 'foo.k', 'kprocess &lt; $SOURCES &gt; $TARGET')
      </file>
      <file name="foo.k">
include other_file
      </file>
      <file name="other_file">
other_file
      </file>
      <directory name="__ROOT__/usr"></directory>
      <directory name="__ROOT__/usr/local"></directory>
      <directory name="__ROOT__/usr/local/bin"></directory>
      <file name="__ROOT__/usr/local/bin/kprocess" chmod="755">
cat
      </file>
    </scons_example>

    <!--

    <para>

      Now if we run &scons;
      and then re-run it after changing the contents of
      <filename>other_file</filename>,
      the <filename>foo</filename>
      target file will be automatically rebuilt:

    </para>

    <scons_output example="scanners_scan" suffix="1">
      <scons_output_command>scons -Q</scons_output_command>
      <scons_output_command output="    [CHANGE THE CONTENTS OF other_file]">edit other_file</scons_output_command>
      <scons_output_command>scons -Q</scons_output_command>
      <scons_output_command>scons -Q</scons_output_command>
    </scons_output>

    -->

  </section>

  <section>
  <title>Adding a search path to a scanner: &FindPathDirs;</title>

    <para>

    Many scanners need to search for included files or dependencies
    using a path variable; this is how &cv-link-CPPPATH; and
    &cv-link-LIBPATH; work.  The path to search is passed to your
    scanner as the <literal>path</literal> argument.  Path variables
    may be lists of nodes, semicolon-separated strings, or even
    contain &SCons; variables which need to be expanded.  Fortunately,
    &SCons; provides the &FindPathDirs; function, which returns
    a function that expands a given path (given as the name of a
    construction variable) to a list of paths at the time the scanner is
    called.  Deferring evaluation until that point allows, for
    instance, the path to contain <literal>$TARGET</literal> references which differ for
    each file scanned.

    </para>
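
    <para>

    The idea of a function that returns a function, so that the path
    is expanded only when the scanner is called, can be sketched in
    plain Python.  This is an illustration of deferred evaluation
    only, not the implementation of &FindPathDirs;: the name
    <literal>make_path_finder</literal> is hypothetical, a plain
    dictionary stands in for the construction environment, and
    strings are split on the platform's path separator as a
    simplifying assumption.

    </para>

    <programlisting>
```python
import os

def make_path_finder(variable):
    """Return a callable that expands env[variable] when it is called."""
    def find_path_dirs(env):
        value = env.get(variable, [])
        if isinstance(value, str):
            # Assumption: a string holds several directories
            # joined by the platform's path separator.
            value = value.split(os.pathsep)
        # Return a tuple of non-empty directory names.
        return tuple(str(item) for item in value if str(item))
    return find_path_dirs

# Creating the finder looks nothing up yet.
finder = make_path_finder('KPATH')

# A plain dict stands in for a construction environment here.
env = {'KPATH': ['include', 'lib/k']}
before = finder(env)

# Changing the variable later is picked up, because the lookup
# happens at call time rather than when the finder was created.
env['KPATH'] = ['other']
after = finder(env)
print(before, after)
```
    </programlisting>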

    <para>

    Using &FindPathDirs; is quite easy.  Continuing the above example,
    using <literal>KPATH</literal> as the construction variable with the search path
    (analogous to &cv-link-CPPPATH;), we just modify the &Scanner;
    constructor call to include a <literal>path_function</literal> keyword argument:

    </para>
    
    <scons_example name="scanners_findpathdirs">
      <file name="SConstruct" printme="1">
kscan = Scanner(function = kfile_scan,
                skeys = ['.k'],
                path_function = FindPathDirs('KPATH'))
      </file>
    </scons_example>
    
    <para>
    
    &FindPathDirs; returns a callable object that, when called,
    expands the elements in <literal>env['KPATH']</literal> and tells the
    scanner to search in those directories.  It also adds
    related repository and variant directories to the search list.  As a side
    note, the returned object stores the path in an efficient way so
    lookups are fast even when variable substitutions may be needed.
    This is important since many files get scanned in a typical build.
    
    </para>
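
    <para>

    Once the directories arrive in the <literal>path</literal>
    argument, the scanner function still has to look for each
    included name in them.  The following is a minimal sketch of
    such a first-match search using only the Python standard
    library.  The helper name <literal>find_in_path</literal> is
    hypothetical, and treating each path entry as something that
    <literal>str()</literal> turns into a directory name is an
    assumption; the built-in &SCons; scanners use their own node
    lookups instead.

    </para>

    <programlisting>
```python
import os
import tempfile

def find_in_path(name, path):
    """Return the first existing 'dir/name' for dir in path, else None."""
    for directory in path:
        candidate = os.path.join(str(directory), name)
        if os.path.isfile(candidate):
            return candidate
    return None

# Build a throwaway directory tree to search in.
with tempfile.TemporaryDirectory() as root:
    first = os.path.join(root, 'first')
    second = os.path.join(root, 'second')
    os.mkdir(first)
    os.mkdir(second)
    with open(os.path.join(second, 'other_file'), 'w') as f:
        f.write('other_file\n')

    # 'first' is searched before 'second'; only 'second' has the file.
    hit = find_in_path('other_file', (first, second))
    miss = find_in_path('no_such_file', (first, second))
    hit_is_in_second = hit == os.path.join(second, 'other_file')

print(hit_is_in_second, miss)
```
    </programlisting>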
    </section>

</chapter>