author    Tim Peters <tim.peters@gmail.com>    2001-06-29 23:51:08 (GMT)
committer Tim Peters <tim.peters@gmail.com>    2001-06-29 23:51:08 (GMT)
commit    4efb6e964376a46aaa3acf365a6627a37af236bf (patch)
tree      dc61305039a9561bec3693a76b56804984f87ddc /Doc/lib/libtokenize.tex
parent    88e66254f90dcfd8287775bb0caee7916fb958b2 (diff)
Turns out Neil didn't intend for *all* of his gen-branch work to get
committed.
tokenize.py: I like these changes, and have tested them extensively
without even realizing it, so I just updated the docstring and the docs.
tabnanny.py: Also liked this, but did a little code fiddling. I should
really rewrite this to *exploit* generators, but that's near the bottom
of my effort/benefit scale so doubt I'll get to it anytime soon (it
would be most useful as a non-trivial example of ideal use of generators;
but test_generators.py has already grown plenty of food-for-thought
examples).
inspect.py: I'm sure Ping intended for this to continue running even
under 1.5.2, so I reverted this to the last pre-gen-branch version. The
"bugfix" I checked in in-between was actually repairing a bug *introduced*
by the conversion to generators, so it's OK that the reverted version
doesn't reflect that checkin.
Diffstat (limited to 'Doc/lib/libtokenize.tex')
-rw-r--r--  Doc/lib/libtokenize.tex  37
1 files changed, 27 insertions, 10 deletions
diff --git a/Doc/lib/libtokenize.tex b/Doc/lib/libtokenize.tex
index 205407c..6cd9348 100644
--- a/Doc/lib/libtokenize.tex
+++ b/Doc/lib/libtokenize.tex
@@ -12,12 +12,33 @@
 source code, implemented in Python.  The scanner in this module
 returns comments as tokens as well, making it useful for implementing
 ``pretty-printers,'' including colorizers for on-screen displays.
 
-The scanner is exposed by a single function:
+The primary entry point is a generator:
+
+\begin{funcdesc}{generate_tokens}{readline}
+  The \function{generate_tokens()} generator requires one argument,
+  \var{readline}, which must be a callable object which
+  provides the same interface as the \method{readline()} method of
+  built-in file objects (see section~\ref{bltin-file-objects}).  Each
+  call to the function should return one line of input as a string.
+
+  The generator produces 5-tuples with these members:
+  the token type;
+  the token string;
+  a 2-tuple \code{(\var{srow}, \var{scol})} of ints specifying the
+  row and column where the token begins in the source;
+  a 2-tuple \code{(\var{erow}, \var{ecol})} of ints specifying the
+  row and column where the token ends in the source;
+  and the line on which the token was found.
+  The line passed is the \emph{logical} line;
+  continuation lines are included.
+  \versionadded{2.2}
+\end{funcdesc}
+
+An older entry point is retained for backward compatibility:
 
 \begin{funcdesc}{tokenize}{readline\optional{, tokeneater}}
   The \function{tokenize()} function accepts two parameters: one
-  representing the input stream, and one providing an output mechanism
+  representing the input stream, and one providing an output mechanism
   for \function{tokenize()}.
 
   The first parameter, \var{readline}, must be a callable object which
@@ -26,17 +47,13 @@
   call to the function should return one line of input as a string.
 
   The second parameter, \var{tokeneater}, must also be a callable
-  object.  It is called with five parameters: the token type, the
-  token string, a tuple \code{(\var{srow}, \var{scol})} specifying the
-  row and column where the token begins in the source, a tuple
-  \code{(\var{erow}, \var{ecol})} giving the ending position of the
-  token, and the line on which the token was found.  The line passed
-  is the \emph{logical} line; continuation lines are included.
+  object.  It is called once for each token, with five arguments,
+  corresponding to the tuples generated by \function{generate_tokens()}.
 \end{funcdesc}
 
-All constants from the \refmodule{token} module are also exported from
-\module{tokenize}, as are two additional token type values that might be
+All constants from the \refmodule{token} module are also exported from
+\module{tokenize}, as are two additional token type values that might be
 passed to the \var{tokeneater} function by \function{tokenize()}:
 
 \begin{datadesc}{COMMENT}
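As a quick illustration of the generator interface documented in the diff
above, here is a minimal usage sketch.  It is not part of the commit; the
sample source string and output format are invented, and the code uses
Python 2-era syntax to match the period.  generate_tokens() is fed the
readline method of a StringIO object, and each yielded 5-tuple is unpacked:

    import tokenize
    import token
    import StringIO

    source = "x = 1  # a comment\n"

    # generate_tokens() takes any readline-style callable; the bound
    # readline method of a StringIO object works as well as a file's.
    for tok in tokenize.generate_tokens(StringIO.StringIO(source).readline):
        tok_type, tok_string, (srow, scol), (erow, ecol), line = tok
        print "%d,%d-%d,%d:\t%s\t%r" % (srow, scol, erow, ecol,
                                        token.tok_name[tok_type], tok_string)

Running this prints one line per token, including the COMMENT token for
"# a comment", which is what makes the module handy for pretty-printers.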
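The older tokenize() entry point retained for backward compatibility can be
exercised the same way.  Again a hedged sketch rather than anything from the
commit: the tokeneater callback receives the same five values that
generate_tokens() yields as a tuple.

    import tokenize
    import token
    import StringIO

    def tokeneater(tok_type, tok_string, start, end, line):
        # Called once per token; start and end are (row, col) 2-tuples.
        print token.tok_name[tok_type], repr(tok_string)

    tokenize.tokenize(StringIO.StringIO("x = 1\n").readline, tokeneater)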