| Commit message (Collapse) | Author | Age | Files | Lines |
|
tokenize errors (GH-105399) (#105439)
|
(#105330)
|
input iteratively (GH-105070) (#105119)
gh-105069: Add a readline-like callable to the tokenizer to consume input iteratively (GH-105070)
(cherry picked from commit 9216e69a87d16d871625721ed5a8aa302511f367)
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
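A minimal sketch of the readline-style interface as seen from the public `tokenize` API. It does not touch the private extension module the commit changes; the sample `source` string is only an illustration.

```python
import io
import tokenize

source = "x = 1\ny = x + 2\n"

# generate_tokens() accepts any callable that returns one str line per call
# and an empty string at end of input, so input is consumed line by line.
readline = io.StringIO(source).readline

names = [tok.string
         for tok in tokenize.generate_tokens(readline)
         if tok.type == tokenize.NAME]
print(names)
```

Any object with the same calling convention (for example a bound `readline` of an open text file) can be passed instead of the `StringIO` method.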
|
tokenizer (GH-104980) (#105000)
|
This commit replaces the Python implementation of the tokenize module with an implementation
that reuses the real C tokenizer via a private extension module. The tokenize module now implements
a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward
compatibility.
As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer. It is currently only used via
the extension module that exposes it to the Python layer. This mode forces the C tokenizer to emit these extra tokens and to add the metadata needed to match the old Python implementation.
Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
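The extra tokens mentioned above can be observed through the public `tokenize` module. This is a small sketch using only documented names; the sample source is an assumption chosen to contain a comment line, a blank line, and a trailing comment.

```python
import io
import tokenize

source = "# a comment\n\nx = 1  # trailing\n"

# Map each token to its symbolic name. COMMENT and NL (non-semantic newline)
# are the token kinds that the parser itself does not need.
kinds = [tokenize.tok_name[tok.type]
         for tok in tokenize.generate_tokens(io.StringIO(source).readline)]
print(kinds)
```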
|
Python tokenizer (GH-27499)
|
https://bugs.python.org/issue5028
|
https://bugs.python.org/issue5028
|
(GH-10370)
"Include/token.h", "Lib/token.py" (which now contains some data moved from
"Lib/tokenize.py") and the new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on a read-only source tree.
"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of being executable itself.
Added new make targets "regen-token" and "regen-symbol", which are now
dependencies of "regen-all".
The documentation now contains strings for operator and punctuation tokens.
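A short sketch of the operator-string data that lives in the `token` module. It assumes Python 3.8 or later, where `token.EXACT_TOKEN_TYPES` is available; it does not run the generator script itself.

```python
import token

# EXACT_TOKEN_TYPES maps the string of an operator or punctuation token
# to its numeric token type; tok_name maps the number back to a name.
plus_type = token.EXACT_TOKEN_TYPES["+"]
plus_name = token.tok_name[plus_type]
arrow_name = token.tok_name[token.EXACT_TOKEN_TYPES["->"]]
print(plus_name, arrow_name)
```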
|
(GH-7891)
Most of the change involves fixing up the test suite, which previously made
the assumption that there wouldn't be a new line if the input didn't end in
one.
Contributed by Ammar Askar.
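A minimal sketch of the behaviour described above, assuming a Python version that includes this change: input that does not end in a newline still yields a NEWLINE token before ENDMARKER.

```python
import io
import tokenize

source = "x = 1"  # deliberately no trailing newline

toks = list(tokenize.generate_tokens(io.StringIO(source).readline))
kinds = [tokenize.tok_name[tok.type] for tok in toks]
print(kinds)
```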
|
* Document tokenize.generate_tokens()
* Add news file
* Add test for generate_tokens
* Document behaviour around ENCODING token
* Add generate_tokens to __all__
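The items above can be illustrated with a small comparison of the two entry points. This is a sketch based on the documented behaviour: `tokenize.tokenize()` reads bytes and emits an ENCODING token first, while `tokenize.generate_tokens()` reads str and emits none.

```python
import io
import tokenize

source = "x = 1\n"

# Bytes interface: the first token reports the detected encoding.
byte_tokens = list(tokenize.tokenize(io.BytesIO(source.encode("utf-8")).readline))

# str interface: no ENCODING token, since the input is already decoded.
str_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

first_kind = tokenize.tok_name[byte_tokens[0].type]
has_encoding_in_str = any(t.type == tokenize.ENCODING for t in str_tokens)
print(first_kind, byte_tokens[0].string, has_encoding_in_str)
```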
|
This change contains minor things that make diffing between Lib/tokenize.py and
Lib/lib2to3/pgen2/tokenize.py cleaner.
|
Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.
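A quick check of the keyword status, assuming Python 3.7 or later. The `compiles` helper is a hypothetical name introduced only for this sketch.

```python
import keyword


def compiles(source):
    """Return True if the source compiles, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


async_is_keyword = keyword.iskeyword("async")
await_is_keyword = keyword.iskeyword("await")

# As a proper keyword, 'async' can no longer be used as a variable name.
assignment_ok = compiles("async = 10\n")
one_liner_ok = compiles("async def foo(): return await bar()\n")
print(async_is_keyword, await_is_keyword, assignment_ok, one_liner_ok)
```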
|
* add test to check if we're modifying token
* copy list so import tokenize doesn't have side effects on token
* shorten line
* add tokenize tokens to token.h to get them to show up in token
* move ERRORTOKEN back to its previous location, and fix nitpick
* copy comments from token.h automatically
* fix whitespace and make more pythonic
* change to fix comments from @haypo
* update token.rst and Misc/NEWS
* change wording
* some more wording changes
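The result of the changes listed above can be seen from Python: the tokenize-only token types are also defined in the `token` module. A minimal sketch, assuming Python 3.7 or later:

```python
import token
import tokenize

# COMMENT, NL and ENCODING are defined in token and re-exported by tokenize,
# so both modules agree on the numeric values.
shared = {name: getattr(token, name) == getattr(tokenize, name)
          for name in ("COMMENT", "NL", "ENCODING")}
nl_name = token.tok_name[token.NL]
print(shared, nl_name)
```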
|
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
expression>) when supported (e.g. any(), all(), tuple(), min(), &
max())
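The replacements listed above can be sketched side by side. The variable names and sample data are illustrative only and are not taken from the changed files.

```python
values = [3, 1, 2]

# list(<generator expression>) -> list comprehension
squares = [v * v for v in values]          # was: list(v * v for v in values)

# dict(<generator expression>) -> dict comprehension
positions = {v: i for i, v in enumerate(values)}  # was: dict((v, i) for ...)

# set(<list literal>) -> set literal
vowels = {"a", "e", "i"}                   # was: set(["a", "e", "i"])

# func(<list comprehension>) -> func(<generator expression>)
has_even = any(v % 2 == 0 for v in values)  # was: any([v % 2 == 0 for ...])
largest = max(v * v for v in values)        # was: max([v * v for ...])
print(squares, positions, vowels, has_even, largest)
```

The generator forms avoid building a temporary list when the consuming function only iterates once.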
|
Thanks to Georg Brandl for the patch.
|
Patch by John Walker.
|
Patch by John Walker.
|
comments to explain what's happening, since it's not so obvious.
|
This commit simplifies async/await tokenization in tokenizer.c,
tokenize.py & lib2to3/tokenize.py. The previous solution was to keep
a stack of async-def & def blocks, whereas the new approach is just
to remember the position of the outermost async-def block.
This change won't bring any parsing performance improvements, but
it makes the code much easier to read and validate.
|
This commit fixes how one-line async-defs and defs are tracked
by the tokenizer. It makes it possible to correctly parse invalid code such
as:
>>> async def f():
... def g(): pass
... async = 10
and valid code such as:
>>> async def f():
... async def g(): pass
... await z
As a consequence, it is now possible to have one-line
'async def foo(): await ..' functions:
>>> async def foo(): return await bar()
|
on error to fix a resource warning.
|
fix a resource warning.
|
imp.reload(). Patch by Thomas Kluyver.
|
imp.reload(). Patch by Thomas Kluyver.
|
Python finalization.
Previously, the module kept a reference to the builtins module, but the module
attributes are cleared during Python finalization. Instead, keep a direct
reference to the open() function.
This enhancement is not perfect: calling tokenize.open() can still fail if
called very late during Python finalization. Usually, the function is called
by the linecache module which is called to display a traceback or emit a
warning.
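The difference between holding a module reference and holding a direct function reference can be sketched with a stand-in namespace. `fake_builtins`, `via_module` and `via_reference` are hypothetical names for illustration; the sketch simulates attribute clearing and does not reproduce real interpreter finalization.

```python
import types

# Stand-in for a module whose attributes get cleared at shutdown (assumption).
fake_builtins = types.SimpleNamespace(open=open)

# Direct reference captured up front, as the commit message describes.
_saved_open = fake_builtins.open


def via_module():
    # Looks the attribute up on the namespace at call time.
    return fake_builtins.open


def via_reference():
    # Uses the reference captured earlier.
    return _saved_open


# Simulate the attribute being cleared.
fake_builtins.open = None

module_lookup = via_module()
direct_lookup = via_reference()
print(module_lookup, direct_lookup is open)
```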
|
finalization.
Previously, the module kept a reference to the builtins module, but the module
attributes are cleared during Python finalization. Instead, keep a direct
reference to the open() function.
This enhancement is not perfect: calling tokenize.open() can still fail if
called very late during Python finalization. Usually, the function is called
by the linecache module which is called to display a traceback or emit a
warning.
|
Original patches by A. Kuchling and G. Rees (#12691).
|
Patch based on lines from Georg Brandl, Eric Snow, and Gareth Rees.
|