author | Meador Inge <meadori@gmail.com> | 2012-01-19 06:44:45 (GMT) |
---|---|---|
committer | Meador Inge <meadori@gmail.com> | 2012-01-19 06:44:45 (GMT) |
commit | 00c7f85298b9803371b4a0019ce8732ed8a2dd3b | |
tree | abe8b6c7ba263370c515b3d307122b1b2cc2e6b7 /Doc/library/tokenize.rst | |
parent | 3f67ec1afd9103211854037f2b269ff46545ffe3 | |
Issue #2134: Add support for tokenize.TokenInfo.exact_type.
Diffstat (limited to 'Doc/library/tokenize.rst')
-rw-r--r-- | Doc/library/tokenize.rst | 53 |
1 file changed, 52 insertions, 1 deletion
```diff
diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index 050d74c..37d9f41 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -15,6 +15,11 @@ implemented in Python. The scanner in this module returns comments as
 tokens as well, making it useful for implementing "pretty-printers,"
 including colorizers for on-screen displays.
 
+To simplify token stream handling, all :ref:`operators` and :ref:`delimiters`
+tokens are returned using the generic :data:`token.OP` token type. The exact
+type can be determined by checking the ``exact_type`` property on the
+:term:`named tuple` returned from :func:`tokenize.tokenize`.
+
 Tokenizing Input
 ----------------
 
@@ -36,9 +41,17 @@ The primary entry point is a :term:`generator`:
    returned as a :term:`named tuple` with the field names:
    ``type string start end line``.
 
+   The returned :term:`named tuple` has an additional property named
+   ``exact_type`` that contains the exact operator type for
+   :data:`token.OP` tokens. For all other token types ``exact_type``
+   equals the named tuple ``type`` field.
+
    .. versionchanged:: 3.1
       Added support for named tuples.
 
+   .. versionchanged:: 3.3
+      Added support for ``exact_type``.
+
    :func:`tokenize` determines the source encoding of the file by looking for a
    UTF-8 BOM or encoding cookie, according to :pep:`263`.
 
@@ -131,7 +144,19 @@ It is as simple as:
 
 .. code-block:: sh
 
-   python -m tokenize [filename.py]
+   python -m tokenize [-e] [filename.py]
+
+The following options are accepted:
+
+.. program:: tokenize
+
+.. cmdoption:: -h, --help
+
+   show this help message and exit
+
+.. cmdoption:: -e, --exact
+
+   display token names using the exact type
 
 If :file:`filename.py` is specified its contents are tokenized to stdout.
 Otherwise, tokenization is performed on stdin.
@@ -215,3 +240,29 @@ the name of the token, and the final column is the value of the token (if any)
     4,10-4,11:          OP             ')'
     4,11-4,12:          NEWLINE        '\n'
     5,0-5,0:            ENDMARKER      ''
+
+The exact token type names can be displayed using the ``-e`` option:
+
+.. code-block:: sh
+
+    $ python -m tokenize -e hello.py
+    0,0-0,0:            ENCODING       'utf-8'
+    1,0-1,3:            NAME           'def'
+    1,4-1,13:           NAME           'say_hello'
+    1,13-1,14:          LPAR           '('
+    1,14-1,15:          RPAR           ')'
+    1,15-1,16:          COLON          ':'
+    1,16-1,17:          NEWLINE        '\n'
+    2,0-2,4:            INDENT         '    '
+    2,4-2,9:            NAME           'print'
+    2,9-2,10:           LPAR           '('
+    2,10-2,25:          STRING         '"Hello, World!"'
+    2,25-2,26:          RPAR           ')'
+    2,26-2,27:          NEWLINE        '\n'
+    3,0-3,1:            NL             '\n'
+    4,0-4,0:            DEDENT         ''
+    4,0-4,9:            NAME           'say_hello'
+    4,9-4,10:           LPAR           '('
+    4,10-4,11:          RPAR           ')'
+    4,11-4,12:          NEWLINE        '\n'
+    5,0-5,0:            ENDMARKER      ''
```
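
The documentation added above describes the new `exact_type` attribute only in prose; the following is a minimal sketch (not part of the commit, assuming Python 3.3 or later) showing the same behaviour from Python code:

```python
# Minimal sketch (not part of this commit): compare each token's generic
# type with the exact_type attribute this patch documents (Python 3.3+).
import io
import tokenize

source = b"say_hello()\n"
for tok in tokenize.tokenize(io.BytesIO(source).readline):
    # OP tokens report the generic token.OP type; exact_type resolves the
    # specific operator/delimiter (LPAR, RPAR, ...). Elsewhere they match.
    print(tokenize.tok_name[tok.type],
          tokenize.tok_name[tok.exact_type],
          repr(tok.string))
```

For the two parentheses the first column shows `OP` while the second shows `LPAR` and `RPAR`; every other row prints the same name twice, matching the `exact_type` semantics described in the patch.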