 Doc/library/tokenize.rst | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index 7017045..577d7cc 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -6,6 +6,9 @@
 .. moduleauthor:: Ka Ping Yee
 .. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
 
+**Source code:** :source:`Lib/tokenize.py`
+
+--------------
 
 The :mod:`tokenize` module provides a lexical scanner for Python source code,
 implemented in Python. The scanner in this module returns comments as tokens
@@ -95,12 +98,25 @@ function it uses to do this is available:
 
    It detects the encoding from the presence of a UTF-8 BOM or an encoding
    cookie as specified in :pep:`263`. If both a BOM and a cookie are present,
-   but disagree, a SyntaxError will be raised.
+   but disagree, a SyntaxError will be raised. Note that if the BOM is found,
+   ``'utf-8-sig'`` will be returned as an encoding.
+
+   If no encoding is specified, then the default of ``'utf-8'`` will be
+   returned.
+
+   Use :func:`open` to open Python source files: it uses
+   :func:`detect_encoding` to detect the file encoding.
+
 
-   If no encoding is specified, then the default of ``'utf-8'`` will be returned.
+.. function:: open(filename)
 
+   Open a file in read only mode using the encoding detected by
+   :func:`detect_encoding`.
 
-Example of a script re-writer that transforms float literals into Decimal
+   .. versionadded:: 3.2
+
+
+Example of a script rewriter that transforms float literals into Decimal
 objects::
 
     from tokenize import tokenize, untokenize, NUMBER, STRING, NAME, OP
@@ -142,4 +158,3 @@ objects::
                 result.append((toknum, tokval))
         return untokenize(result).decode('utf-8')
 
-
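
For reference, here is a minimal sketch of how the two functions documented by
this patch fit together. The file name ``example.py`` and its contents are
invented for the illustration and are not part of the patch::

    import tokenize

    # Create a throwaway source file to inspect (hypothetical name and content).
    with open("example.py", "w", encoding="utf-8") as f:
        f.write("# -*- coding: utf-8 -*-\nx = 3.14\n")

    # detect_encoding() reads at most two lines through the readline callable
    # and returns the encoding plus the raw lines it consumed.
    with open("example.py", "rb") as f:
        encoding, consumed = tokenize.detect_encoding(f.readline)
    print(encoding)   # 'utf-8' here; 'utf-8-sig' if the file began with a BOM

    # tokenize.open() (new in 3.2) performs the same detection and returns a
    # read-only text file object that decodes with the detected encoding.
    with tokenize.open("example.py") as f:
        print(f.read())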