bpo-12486: Document tokenize.generate_tokens() as public API (#6957)

* Document tokenize.generate_tokens()
* Add news file
* Add test for generate_tokens
* Document behaviour around ENCODING token
* Add generate_tokens to __all__
author: Thomas Kluyver <takowl@gmail.com> 2018-06-05 17:26:39 (GMT)
committer: Carol Willing <carolcode@willingconsulting.com> 2018-06-05 17:26:39 (GMT)
commit: c56b17bd8c7a3fd03859822246633d2c9586f8bd (patch)
tree: 346fb8b3a6614679232792b3f46398b33e5f3c0e /Doc/library/tokenize.rst
parent: c2745d2d05546d76f655ab450eb23d1af39e0b1c (diff)
download: cpython-c56b17bd8c7a3fd03859822246633d2c9586f8bd.zip
cpython-c56b17bd8c7a3fd03859822246633d2c9586f8bd.tar.gz
cpython-c56b17bd8c7a3fd03859822246633d2c9586f8bd.tar.bz2
1 files changed, 12 insertions, 1 deletions
diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst
index 4c0a0ce..111289c 100644
--- a/Doc/library/tokenize.rst
+++ b/Doc/library/tokenize.rst
@@ -57,6 +57,16 @@ The primary entry point is a :term:`generator`:
    :func:`.tokenize` determines the source encoding of the file by looking for a
    UTF-8 BOM or encoding cookie, according to :pep:`263`.
 
+.. function:: generate_tokens(readline)
+
+   Tokenize a source reading unicode strings instead of bytes.
+
+   Like :func:`.tokenize`, the *readline* argument is a callable returning
+   a single line of input. However, :func:`generate_tokens` expects *readline*
+   to return a str object rather than bytes.
+
+   The result is an iterator yielding named tuples, exactly like
+   :func:`.tokenize`. It does not yield an :data:`~token.ENCODING` token.
 
 All constants from the :mod:`token` module are also exported from
 :mod:`tokenize`.
@@ -79,7 +89,8 @@ write back the modified script.
     positions) may change.
 
     It returns bytes, encoded using the :data:`~token.ENCODING` token, which
-    is the first token sequence output by :func:`.tokenize`.
+    is the first token sequence output by :func:`.tokenize`. If there is no
+    encoding token in the input, it returns a str instead.
 
 
 :func:`.tokenize` needs to detect the encoding of source files it tokenizes. The
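
The behaviour documented by this change can be checked with a short script. This is a minimal sketch, not part of the commit: the sample source string and all variable names are illustrative assumptions. It compares the token streams from :func:`generate_tokens` (str input) and :func:`.tokenize` (bytes input), then shows that ``untokenize()`` returns a str when no ENCODING token is present and bytes when one is.

```python
import io
import tokenize

# Illustrative input; any valid Python source would do.
source = "x = 1\n"

# generate_tokens() expects a readline callable that returns str.
str_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# tokenize() expects a readline callable that returns bytes.
byte_tokens = list(
    tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline)
)

str_types = [tok.type for tok in str_tokens]
byte_types = [tok.type for tok in byte_tokens]

# Only the bytes-based stream starts with an ENCODING token.
print(tokenize.ENCODING in str_types)
print(byte_types[0] == tokenize.ENCODING)

# untokenize() returns str without an ENCODING token, bytes with one.
as_text = tokenize.untokenize(str_tokens)
as_bytes = tokenize.untokenize(byte_tokens)
print(type(as_text).__name__, type(as_bytes).__name__)
```

Apart from the leading ENCODING token, the two streams should carry the same token types, so code that already holds decoded text can use :func:`generate_tokens` without encoding the text first.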