path: root/Lib/tokenize.py
Commit message · Author · Age · Files · Lines
* [3.12] gh-105390: Correctly raise TokenError instead of SyntaxError for tokenize errors (GH-105399) (#105439) · Miss Islington (bot) · 2023-06-07 · 1 · -2/+18
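A minimal sketch of the behaviour this entry describes, assuming an unclosed parenthesis as the malformed input; the helper name `tokenize_or_error` is made up for illustration and is not part of the module.

```python
import io
import tokenize


def tokenize_or_error(source):
    """Return the token list, or the TokenError raised while tokenizing."""
    try:
        return list(tokenize.generate_tokens(io.StringIO(source).readline))
    except tokenize.TokenError as exc:
        return exc


# The parenthesis is never closed, so the tokenizer reaches end-of-file
# in the middle of a multi-line statement.
result = tokenize_or_error("total = (1 + 2\n")
print(type(result).__name__, result)
```

Callers that already catch `tokenize.TokenError` should therefore not need a separate `SyntaxError` handler for this kind of input.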
* [3.12] gh-105324: Fix tokenize module main function for stdin (GH-105325) (#105330) · Miss Islington (bot) · 2023-06-05 · 1 · -2/+1
* [3.12] gh-105069: Add a readline-like callable to the tokenizer to consume input iteratively (GH-105070) (#105119) · Miss Islington (bot) · 2023-05-31 · 1 · -21/+11
|   (cherry picked from commit 9216e69a87d16d871625721ed5a8aa302511f367)
|
|   Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
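As a rough illustration of consuming input iteratively, the sketch below feeds `tokenize.generate_tokens()` from a hand-written readline-like callable instead of a file object. The `readline` function and the sample lines are assumptions made for the example.

```python
import tokenize

source_lines = ["x = 1\n", "y = x + 2\n"]
line_iter = iter(source_lines)


def readline():
    """Return the next source line, or '' once the input is exhausted."""
    return next(line_iter, "")


# Lines are pulled from readline() on demand while tokens are produced.
names = [
    tok.string
    for tok in tokenize.generate_tokens(readline)
    if tok.type == tokenize.NAME
]
print(names)
```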
* [3.12] gh-104976: Ensure trailing dedent tokens are emitted as the previous tokenizer (GH-104980) (#105000) · Miss Islington (bot) · 2023-05-26 · 1 · -5/+0
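To make the trailing-dedent behaviour concrete, here is a small sketch (the sample source is an assumption) that lists the token kinds for an indented block and shows the `DEDENT` arriving just before `ENDMARKER`.

```python
import io
import tokenize

source = "if x:\n    y\n"

# Map each token to its symbolic name to make the ordering easy to read.
kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(kinds)
```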
* gh-102856: Tokenize performance improvement (#104731) · Marta Gómez Macías · 2023-05-22 · 1 · -12/+1
|
* gh-104719: Restore Tokenize module constants (#104722) · Marta Gómez Macías · 2023-05-21 · 1 · -0/+101
|
* gh-102856: Python tokenizer implementation for PEP 701 (#104323) · Marta Gómez Macías · 2023-05-21 · 1 · -288/+51
|   This commit replaces the Python implementation of the tokenize module with an
|   implementation that reuses the real C tokenizer via a private extension module.
|   The tokenize module now implements a compatibility layer that transforms tokens
|   from the C tokenizer into Python tokenize tokens for backward compatibility.
|
|   As the C tokenizer does not emit some tokens that the Python tokenizer provides
|   (such as comments and non-semantic newlines), a new special mode has been added
|   to the C tokenizer mode that currently is only used via the extension module
|   that exposes it to the Python layer. This new mode forces the C tokenizer to
|   emit these new extra tokens and add the appropriate metadata that is needed to
|   match the old Python implementation.
|
|   Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
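The extra tokens mentioned above (comments and non-semantic newlines) are visible through the public API. The following sketch, with an assumed sample source, prints the token kinds and collects the comment text.

```python
import io
import tokenize

source = "# setup\n\nx = 1  # trailing\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# COMMENT and NL never reach the parser, but tokenize reports them.
kinds = [tokenize.tok_name[tok.type] for tok in tokens]
comments = [tok.string for tok in tokens if tok.type == tokenize.COMMENT]
print(kinds)
print(comments)
```

Tools such as formatters and linters rely on these tokens, which is presumably why the compatibility layer has to preserve them.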
* bpo-46565: `del` loop vars that are leaking into module namespaces (GH-30993) · Nikita Sobolev · 2022-02-03 · 1 · -0/+2
|
* Add tests for the C tokenizer and expose it as a private module (GH-27924) · Pablo Galindo Salgado · 2021-08-24 · 1 · -0/+8
|
* bpo-44667: Treat correctly lines ending with comments and no newlines in the Python tokenizer (GH-27499) · Pablo Galindo Salgado · 2021-07-31 · 1 · -1/+1
* bpo-43014: Improve performance of tokenize.tokenize by 20-30% · Anthony Sottile · 2021-01-24 · 1 · -0/+2
|
* bpo-5028: Fix up rest of documentation for tokenize documenting line (GH-13686) · Anthony Sottile · 2019-05-30 · 1 · -1/+1
|   https://bugs.python.org/issue5028
* bpo-5028: fix doc bug for tokenize (GH-11683) · Andrew Carr · 2019-05-30 · 1 · -1/+1
|   https://bugs.python.org/issue5028
* bpo-36766: Typos in docs and code comments (GH-13116) · penguindustin · 2019-05-06 · 1 · -1/+1
|
* bpo-30455: Generate all token related code and docs from Grammar/Tokens. (GH-10370) · Serhiy Storchaka · 2018-12-22 · 1 · -60/+6
|   "Include/token.h", "Lib/token.py" (containing now some data moved from
|   "Lib/tokenize.py") and new files "Parser/token.c" (containing the code moved
|   from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included in
|   "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
|   "Tools/scripts/generate_token.py". The script overwrites files only if needed
|   and can be used on the read-only sources tree.
|
|   "Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
|   instead of being executable itself.
|
|   Added new make targets "regen-token" and "regen-symbol" which are now
|   dependencies of "regen-all".
|
|   The documentation contains now strings for operators and punctuation tokens.
* bpo-33899: Make tokenize module mirror end-of-file is end-of-line behavior (GH-7891) · Ammar Askar · 2018-07-06 · 1 · -0/+10
|   Most of the change involves fixing up the test suite, which previously made
|   the assumption that there wouldn't be a new line if the input didn't end in
|   one.
|
|   Contributed by Ammar Askar.
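A short sketch of the end-of-file behaviour described here, assuming a one-line source with no trailing newline: the token stream still ends with a `NEWLINE` token before `ENDMARKER`.

```python
import io
import tokenize

source = "x = 1"  # deliberately no trailing newline

kinds = [
    tokenize.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(kinds)
```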
* bpo-12486: Document tokenize.generate_tokens() as public API (#6957) · Thomas Kluyver · 2018-06-05 · 1 · -3/+6
|   * Document tokenize.generate_tokens()
|   * Add news file
|   * Add test for generate_tokens
|   * Document behaviour around ENCODING token
|   * Add generate_tokens to __all__
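The difference around the ENCODING token can be sketched as follows, using an assumed one-line source: `tokenize.tokenize()` reads bytes and reports the detected encoding first, while `tokenize.generate_tokens()` reads text and emits no such token.

```python
import io
import tokenize

source = "x = 1\n"

# Bytes in: the first token is ENCODING.
from_bytes = list(tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline))
# Text in: the stream starts directly with the first real token.
from_text = list(tokenize.generate_tokens(io.StringIO(source).readline))

print(tokenize.tok_name[from_bytes[0].type], from_bytes[0].string)
print(tokenize.tok_name[from_text[0].type], from_text[0].string)
```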
* bpo-33338: [tokenize] Minor code cleanup (#6573) · Łukasz Langa · 2018-04-23 · 1 · -11/+8
|   This change contains minor things that make diffing between Lib/tokenize.py
|   and Lib/lib2to3/pgen2/tokenize.py cleaner.
* bpo-33260: Regenerate token.py after removing ASYNC and AWAIT. (GH-6447) · Serhiy Storchaka · 2018-04-11 · 1 · -1/+1
|
* bpo-30406: Make async and await proper keywords (#1669) · Jelle Zijlstra · 2017-10-06 · 1 · -61/+1
|   Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.
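A small check of the outcome, assuming Python 3.7 or later and a made-up coroutine as sample input: both words are reported by the `keyword` module and come out of tokenize as ordinary `NAME` tokens.

```python
import io
import keyword
import tokenize

source = "async def fetch():\n    return await load()\n"
tokens = tokenize.generate_tokens(io.StringIO(source).readline)

# Keep only the two words of interest, with their token kind.
async_await = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokens
    if tok.string in ("async", "await")
]
print(async_await)
print(keyword.iskeyword("async"), keyword.iskeyword("await"))
```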
* bpo-25324: copy tok_name before changing it (#1608) · Albert-Jan Nijburg · 2017-05-31 · 1 · -9/+2
|   * add test to check if we're modifying token
|   * copy list so import tokenize doesn't have side effects on token
|   * shorten line
|   * add tokenize tokens to token.h to get them to show up in token
|   * move ERRORTOKEN back to its previous location, and fix nitpick
|   * copy comments from token.h automatically
|   * fix whitespace and make more pythonic
|   * change to fix comments from @haypo
|   * update token.rst and Misc/NEWS
|   * change wording
|   * some more wording changes
* bpo-30377: Simplify handling of COMMENT and NL in tokenize.py (#1607) · Albert-Jan Nijburg · 2017-05-24 · 1 · -5/+3
|
* bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489) · Jon Dufresne · 2017-05-18 · 1 · -1/+1
|   * Replaced list(<generator expression>) with list comprehension
|   * Replaced dict(<generator expression>) with dict comprehension
|   * Replaced set(<list literal>) with set literal
|   * Replaced builtin func(<list comprehension>) with func(<generator
|     expression>) when supported (e.g. any(), all(), tuple(), min(), &
|     max())
* Add ELLIPSIS and RARROW. Add tests (#666) · Jim Fasarakis-Hilliard · 2017-03-14 · 1 · -1/+3
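For illustration, the sketch below looks up the `exact_type` of each operator token in an assumed one-line function definition; `->` and `...` resolve to `RARROW` and `ELLIPSIS`.

```python
import io
import tokenize

source = "def f() -> None: ...\n"

# Operator tokens share the generic OP type; exact_type tells them apart.
exact = {
    tok.string: tokenize.tok_name[tok.exact_type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP
}
print(exact)
```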
|
* Issue #26331: Implement the parsing part of PEP 515. · Brett Cannon · 2016-09-09 · 1 · -8/+9
|   Thanks to Georg Brandl for the patch.
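PEP 515 allows underscores inside numeric literals. A minimal sketch with assumed sample values shows each such literal arriving as a single `NUMBER` token.

```python
import io
import tokenize

source = "budget = 1_000_000 + 0x_FF\n"

numbers = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.NUMBER
]
print(numbers)
```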
* Issue #26581: Use the first coding cookie on a line, not the last one. · Serhiy Storchaka · 2016-03-20 · 1 · -1/+1
|\
| * Issue #26581: Use the first coding cookie on a line, not the last one. · Serhiy Storchaka · 2016-03-20 · 1 · -1/+1
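A sketch of the rule in the subject line, using an artificial comment line that carries two cookies: `tokenize.detect_encoding()` picks up the first one. The returned name is the normalized form of `latin-1`.

```python
import io
import tokenize

# Artificial first line with two competing coding cookies.
line = b"# coding: latin-1 coding: utf-8\n"

encoding, consumed = tokenize.detect_encoding(io.BytesIO(line).readline)
print(encoding)
```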
| |
* | Issue #25977: Fix typos in Lib/tokenize.py · Berker Peksag · 2015-12-29 · 1 · -5/+5
|\ \
| |/
| |   Patch by John Walker.
| * Issue #25977: Fix typos in Lib/tokenize.py · Berker Peksag · 2015-12-29 · 1 · -4/+4
| |   Patch by John Walker.
* | Issue 25311: Add support for f-strings to tokenize.py. Also added some comments to explain what's happening, since it's not so obvious. · Eric V. Smith · 2015-10-26 · 1 · -51/+67
|/
* Issue #24619: Simplify async/await tokenization. · Yury Selivanov · 2015-07-23 · 1 · -16/+23
|   This commit simplifies async/await tokenization in tokenizer.c,
|   tokenize.py & lib2to3/tokenize.py. Previous solution was to keep
|   a stack of async-def & def blocks, whereas the new approach is just
|   to remember position of the outermost async-def block.
|
|   This change won't bring any parsing performance improvements, but
|   it makes the code much easier to read and validate.
* Issue #24619: New approach for tokenizing async/await. · Yury Selivanov · 2015-07-22 · 1 · -1/+6
|   This commit fixes how one-line async-defs and defs are tracked
|   by tokenizer.
|
|   It allows to correctly parse invalid code such as:
|
|       >>> async def f():
|       ...     def g(): pass
|       ...     async = 10
|
|   and valid code such as:
|
|       >>> async def f():
|       ...     async def g(): pass
|       ...     await z
|
|   As a consequence, it is now possible to have one-line
|   'async def foo(): await ..' functions:
|
|       >>> async def foo(): return await bar()
* Issue #20387: Merge test and patch from 3.4.4 · Jason R. Coombs · 2015-06-28 · 1 · -0/+17
|\
| * Issue #20387: Restore retention of indentation during untokenize. · Dingyuan Wang · 2015-06-22 · 1 · -0/+17
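As a hedged illustration of indentation being retained, the sketch below round-trips an assumed tab-indented snippet through `generate_tokens()` and `untokenize()` using full token tuples.

```python
import io
import tokenize

source = "if x:\n\ty = 1\n"  # body indented with a tab, not spaces

tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
rebuilt = tokenize.untokenize(tokens)
print(repr(rebuilt))
```

With position information available, the rebuilt text is expected to keep the original tab rather than substitute spaces.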
| |
* | (Merge 3.5) Issue #23840: tokenize.open() now closes the temporary binary file on error to fix a resource warning. · Victor Stinner · 2015-05-25 · 1 · -5/+9
|\ \
| |/
| * Issue #23840: tokenize.open() now closes the temporary binary file on error to fix a resource warning. · Victor Stinner · 2015-05-25 · 1 · -5/+9
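For context, a minimal sketch of ordinary `tokenize.open()` use, assuming a temporary file with a latin-1 coding cookie. It shows the success path only; the error path this entry fixes is not exercised.

```python
import os
import tempfile
import tokenize

source = "# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"

# Write the sample file in the encoding its cookie declares.
with tempfile.NamedTemporaryFile("wb", suffix=".py", delete=False) as handle:
    handle.write(source.encode("latin-1"))
    path = handle.name

try:
    # tokenize.open() detects the encoding and returns a text-mode file.
    with tokenize.open(path) as stream:
        detected = stream.encoding
        text = stream.read()
finally:
    os.remove(path)

print(detected)
```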
* | PEP 0492 -- Coroutines with async and await syntax. Issue #24017. · Yury Selivanov · 2015-05-12 · 1 · -2/+54
| |
* | Issue #23615: Modules bz2, tarfile and tokenize now can be reloaded with imp.reload(). Patch by Thomas Kluyver. · Serhiy Storchaka · 2015-03-11 · 1 · -2/+1
|\ \
| |/
| * Issue #23615: Modules bz2, tarfile and tokenize now can be reloaded with imp.reload(). Patch by Thomas Kluyver. · Serhiy Storchaka · 2015-03-11 · 1 · -2/+1
* | Removed duplicated dict entries. · Serhiy Storchaka · 2015-01-11 · 1 · -1/+0
| |
* | (Merge 3.4) Issue #22599: Enhance tokenize.open() to be able to call it during Python finalization. · Victor Stinner · 2014-12-05 · 1 · -3/+4
|\ \
| |/
| |   Before the module kept a reference to the builtins module, but the module
| |   attributes are cleared during Python finalization. Instead, keep directly a
| |   reference to the open() function.
| |
| |   This enhancement is not perfect, calling tokenize.open() can still fail if
| |   called very late during Python finalization. Usually, the function is called
| |   by the linecache module which is called to display a traceback or emit a
| |   warning.
| * Issue #22599: Enhance tokenize.open() to be able to call it during Python finalization. · Victor Stinner · 2014-12-05 · 1 · -3/+4
| |   Before the module kept a reference to the builtins module, but the module
| |   attributes are cleared during Python finalization. Instead, keep directly a
| |   reference to the open() function.
| |
| |   This enhancement is not perfect, calling tokenize.open() can still fail if
| |   called very late during Python finalization. Usually, the function is called
| |   by the linecache module which is called to display a traceback or emit a
| |   warning.
* | PEP 465: a dedicated infix operator for matrix multiplication (closes #21176) · Benjamin Peterson · 2014-04-10 · 1 · -2/+3
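A brief sketch of how the new operator surfaces in the token stream, using assumed operand names: `@` and `@=` are reported as `OP` tokens whose `exact_type` is `AT` and `ATEQUAL`.

```python
import io
import tokenize

source = "c = a @ b\nc @= a\n"

operators = [
    (tok.string, tokenize.tok_name[tok.exact_type])
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.OP
]
print(operators)
```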
|/
* Merge with 3.3 · Terry Jan Reedy · 2014-02-24 · 1 · -1/+1
|\
| * whitespace · Terry Jan Reedy · 2014-02-24 · 1 · -1/+1
| |
* | Merge with 3.3 · Terry Jan Reedy · 2014-02-24 · 1 · -0/+6
|\ \
| |/
| * Issue #9974: When untokenizing, use row info to insert backslash+newline. · Terry Jan Reedy · 2014-02-24 · 1 · -0/+6
| |   Original patches by A. Kuchling and G. Rees (#12691).
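To illustrate the use of row information, this sketch untokenizes an assumed statement that spans two physical lines via a backslash continuation, then re-tokenizes the result to confirm the token sequence is unchanged. Exact whitespace around the backslash may differ from the input.

```python
import io
import tokenize


def token_pairs(text):
    """Return (type, string) pairs for every token in text."""
    return [tok[:2] for tok in tokenize.generate_tokens(io.StringIO(text).readline)]


source = "total = 1 + \\\n    2\n"
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# The jump from row 1 to row 2 inside one logical line is rebuilt
# as a backslash followed by a newline.
rebuilt = tokenize.untokenize(tokens)
print(repr(rebuilt))
```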
* | Merge with 3.3 · Terry Jan Reedy · 2014-02-18 · 1 · -13/+11
|\ \
| |/
| * Issue #8478: Untokenizer.compat now processes first token from iterator input. · Terry Jan Reedy · 2014-02-18 · 1 · -13/+11
| |   Patch based on lines from Georg Brandl, Eric Snow, and Gareth Rees.
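A sketch of the compatibility path with iterator input, assuming a one-line sample: `untokenize()` receives an iterator of bare (type, string) pairs, and the first token still shows up in the output. Spacing in this mode is approximate, so the check compares tokens rather than text.

```python
import io
import tokenize

source = "x = 1\n"
pairs = [tok[:2] for tok in tokenize.generate_tokens(io.StringIO(source).readline)]

# Pass an iterator, not a list, so the first token is consumed lazily.
rebuilt = tokenize.untokenize(iter(pairs))
reparsed = [tok[:2] for tok in tokenize.generate_tokens(io.StringIO(rebuilt).readline)]
print(repr(rebuilt))
```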
* | Untokenize, bad assert: Merge with 3.3 · Terry Jan Reedy · 2014-02-17 · 1 · -1/+3
|\ \
| |/