author: Benjamin Peterson <benjamin@python.org> 2010-03-18 22:43:41 (GMT)
committer: Benjamin Peterson <benjamin@python.org> 2010-03-18 22:43:41 (GMT)
commit: b3a482962d12871db122dbc843476eea054b2782
tree: 9ffc6315cd63635cd24ec31913cc0f61c6abe1e4
parent: 00888934476bf21fc7b0b76b96adf08cee7db57e
show a common usage of detect_encoding
Diffstat (limited to 'Doc/library/tokenize.rst')
-rw-r--r-- Doc/library/tokenize.rst | 12
1 files changed, 11 insertions, 1 deletions
diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index ac6ae36..446d3bb 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -98,7 +98,17 @@ function it uses to do this is available:
but disagree, a SyntaxError will be raised. Note that if the BOM is found,
``'utf-8-sig'`` will be returned as an encoding.
- If no encoding is specified, then the default of ``'utf-8'`` will be returned.
+ If no encoding is specified, then the default of ``'utf-8'`` will be
+ returned.
+
+ :func:`detect_encoding` is useful for robustly reading Python source files.
+ A common pattern for this follows::
+
+     def read_python_source(file_name):
+         with open(file_name, "rb") as fp:
+             encoding = tokenize.detect_encoding(fp.readline)[0]
+         with open(file_name, "r", encoding=encoding) as fp:
+             return fp.read()
Example of a script re-writer that transforms float literals into Decimal
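
The behaviour described in the patched paragraph can be checked without touching the filesystem. The following is a minimal sketch, not part of the commit: the sample byte strings and variable names are hypothetical, and it assumes a Python 3 interpreter where `tokenize.detect_encoding` normalizes a `latin-1` cookie to `'iso-8859-1'`.

```python
import io
import tokenize

# Hypothetical in-memory source with a PEP 263 coding cookie on its first line.
source = "# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n".encode("latin-1")

# detect_encoding() takes a readline callable and returns a tuple of
# (encoding name, list of the raw lines it had to read).
encoding, consumed = tokenize.detect_encoding(io.BytesIO(source).readline)

# Decode the complete byte string with the detected encoding.
text = source.decode(encoding)

# With no cookie and no BOM, the default 'utf-8' is reported.
default_encoding, _ = tokenize.detect_encoding(io.BytesIO(b"x = 1\n").readline)

# With a UTF-8 BOM and no cookie, 'utf-8-sig' is reported.
bom_encoding, _ = tokenize.detect_encoding(
    io.BytesIO(b"\xef\xbb\xbfx = 1\n").readline
)

print(encoding, default_encoding, bom_encoding)
```

Because `detect_encoding` only consumes the first line or two, the sketch decodes the original bytes in full, just as the documented pattern reopens the file in text mode rather than continuing to read from the binary handle.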