Commit message log
Python tokenizer (GH-27499)
https://bugs.python.org/issue5028
https://bugs.python.org/issue5028
(GH-10370)
"Include/token.h", "Lib/token.py" (now containing some data moved from
"Lib/tokenize.py"), the new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only when
needed and can be used on a read-only source tree.
"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of being executable itself.
Added new make targets "regen-token" and "regen-symbol", which are now
dependencies of "regen-all".
The documentation now contains the strings for operator and punctuation tokens.
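
A minimal sketch of the generated data in use (assuming Python 3.8+, where
token.EXACT_TOKEN_TYPES is part of the public token module):

    import token

    # The generated Lib/token.py includes the operator/punctuation strings;
    # EXACT_TOKEN_TYPES maps each string to its token type.
    print(token.EXACT_TOKEN_TYPES["+"] == token.PLUS)     # True
    print(token.tok_name[token.EXACT_TOKEN_TYPES["->"]])  # 'RARROW'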
(GH-7891)
Most of the change involves fixing up the test suite, which previously
assumed that there wouldn't be a trailing NEWLINE token if the input didn't
end in one.
Contributed by Ammar Askar.
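
A minimal sketch of the new behaviour (on an interpreter that includes this
change):

    import io
    import tokenize

    # A NEWLINE token is now emitted even though the source has no
    # trailing newline, mirroring the C tokenizer.
    toks = tokenize.generate_tokens(io.StringIO("x = 1").readline)
    print([tokenize.tok_name[t.type] for t in toks])
    # ['NAME', 'OP', 'NUMBER', 'NEWLINE', 'ENDMARKER']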
* Document tokenize.generate_tokens()
* Add news file
* Add test for generate_tokens
* Document behaviour around ENCODING token
* Add generate_tokens to __all__
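
A minimal usage sketch of the documented API: generate_tokens() takes a
readline callable returning str (unlike tokenize.tokenize(), which works on
bytes), so no ENCODING token is emitted.

    import io
    import tokenize

    source = "x = 1 + 2\n"
    # generate_tokens() works on already-decoded text.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        print(tokenize.tok_name[tok.type], repr(tok.string))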
This change makes minor adjustments so that diffing Lib/tokenize.py against
Lib/lib2to3/pgen2/tokenize.py is cleaner.
Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.
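
A quick sketch of the resulting keyword status, using the stdlib keyword
module:

    import keyword

    # Hard keywords on Python 3.7+; on 3.5/3.6 they were context-sensitive
    # and iskeyword() returned False for them.
    print(keyword.iskeyword("async"))  # True on 3.7+
    print(keyword.iskeyword("await"))  # True on 3.7+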
* add a test to check whether we're modifying token
* copy the list so that importing tokenize doesn't have side effects on token
* shorten a line
* add the tokenize-only tokens to token.h so they show up in token
* move ERRORTOKEN back to its previous location, and fix a nitpick
* copy comments from token.h automatically
* fix whitespace and make the code more Pythonic
* address review comments from @haypo
* update token.rst and Misc/NEWS
* change wording
* some more wording changes
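
The visible effect, as a minimal sketch (assuming Python 3.7+, where the
tokenize-only token types are exposed in the token module):

    import token
    import tokenize

    # Both modules now share the same numbering for COMMENT, NL, ENCODING.
    print(token.COMMENT == tokenize.COMMENT)  # True
    print(token.tok_name[token.NL])           # 'NL'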
* Replaced list(<generator expression>) with a list comprehension
* Replaced dict(<generator expression>) with a dict comprehension
* Replaced set(<list literal>) with a set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), and max())
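
Illustrative before/after pairs for the patterns listed above (the data is
made up for the example):

    names = ["NAME", "NUMBER", "STRING"]

    result = list(n.lower() for n in names)     # before
    result = [n.lower() for n in names]         # after

    mapping = dict((n, len(n)) for n in names)  # before
    mapping = {n: len(n) for n in names}        # after

    chars = set(["(", ")", "[", "]"])           # before
    chars = {"(", ")", "[", "]"}                # after

    ok = any([n.isupper() for n in names])      # before: builds a throwaway list
    ok = any(n.isupper() for n in names)        # after: short-circuits lazily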
Thanks to Georg Brandl for the patch.
Patch by John Walker.
|
| |
| |
| |
| | |
Patch by John Walker.
|
|/
|
|
| |
comments to explain what's happening, since it's not so obvious.
This commit simplifies async/await tokenization in tokenizer.c,
tokenize.py and lib2to3/tokenize.py. The previous solution was to keep
a stack of async-def and def blocks, whereas the new approach is simply
to remember the position of the outermost async-def block.
This change won't bring any parsing performance improvements, but
it makes the code much easier to read and validate.
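
A minimal sketch of the bookkeeping described above (the names are
illustrative, not the actual tokenizer code): rather than a stack with one
entry per nested def, only the indentation of the outermost async-def is
remembered.

    async_def = False     # inside an outermost 'async def'?
    async_def_indent = 0  # indentation level where it started

    def enter_async_def(indent):
        """Record only the outermost 'async def'."""
        global async_def, async_def_indent
        if not async_def:
            async_def = True
            async_def_indent = indent

    def on_dedent(indent):
        """Dedenting to or past the recorded level leaves the block."""
        global async_def
        if async_def and indent <= async_def_indent:
            async_def = False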
This commit fixes how one-line async-defs and defs are tracked
by the tokenizer. It makes it possible to correctly parse invalid code
such as:
>>> async def f():
...     def g(): pass
...     async = 10
and valid code such as:
>>> async def f():
...     async def g(): pass
...     await z
As a consequence, it is now possible to have one-line
'async def foo(): await ..' functions:
>>> async def foo(): return await bar()
on error to fix a resource warning.
fix a resource warning.
imp.reload(). Patch by Thomas Kluyver.
Python finalization.
Previously the module kept a reference to the builtins module, but module
attributes are cleared during Python finalization. Instead, keep a direct
reference to the open() function.
This enhancement is not perfect: calling tokenize.open() can still fail if
called very late during Python finalization. Usually the function is called
by the linecache module, which is invoked to display a traceback or emit a
warning.
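
A sketch of the pattern described above, applicable to any module that must
keep working late in finalization:

    # Bind the builtin at import time; the captured reference survives
    # the clearing of the builtins module's attributes at shutdown.
    from builtins import open as _builtin_open

    def read_bytes(filename):
        # No lookup through the possibly-cleared builtins module.
        with _builtin_open(filename, "rb") as f:
            return f.read()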
Original patches by A. Kuchling and G. Rees (#12691).
Patch based on lines from Georg Brandl, Eric Snow, and Gareth Rees.
Replace it with correct logic that raises ValueError for bad input.
Issues #8478 and #12691 reported the incorrect logic.
Add an Untokenize test case and an initial test method.
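
A round-trip sketch of the repaired mode: with full five-tuples,
untokenize() reproduces the input, and, per the message above, bad input
now raises ValueError.

    import io
    import tokenize

    src = "a = 1\n"
    toks = list(tokenize.generate_tokens(io.StringIO(src).readline))
    print(tokenize.untokenize(toks) == src)  # True for simple sources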
* The first line of a Python script could be executed twice when the source
  encoding (not equal to 'utf-8') was specified on the second line.
* Now the source encoding declaration on the second line isn't effective if
  the first line contains anything except a comment.
* As a consequence, 'python -x' now works again with files whose source
  encoding declaration is specified on the second line, and can again be
  used to make Python batch files on Windows.
* The tokenize module now ignores the source encoding declaration on the
  second line if the first line contains anything except a comment.
* IDLE now ignores the source encoding declaration on the second line if the
  first line contains anything except a comment.
* 2to3 and the findnocoding.py script now ignore the source encoding
  declaration on the second line if the first line contains anything except
  a comment.
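
A sketch of the new rule, observable through tokenize.detect_encoding():
the cookie on line two only counts when line one is blank or a comment.

    import io
    import tokenize

    good = b"#!/usr/bin/env python\n# -*- coding: latin-1 -*-\n"
    bad = b"import os\n# -*- coding: latin-1 -*-\n"
    # Comment-only first line: the second-line cookie is honoured.
    print(tokenize.detect_encoding(io.BytesIO(good).readline)[0])  # 'iso-8859-1'
    # Code on the first line: the cookie is ignored.
    print(tokenize.detect_encoding(io.BytesIO(bad).readline)[0])   # 'utf-8'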
now detect Python source code encoding only in comment lines.
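
A sketch of the stricter detection: the same cookie text outside a comment
no longer sets the encoding.

    import io
    import tokenize

    comment = b"# -*- coding: latin-1 -*-\n"
    code = b's = "coding: latin-1"\n'
    print(tokenize.detect_encoding(io.BytesIO(comment).readline)[0])  # 'iso-8859-1'
    print(tokenize.detect_encoding(io.BytesIO(code).readline)[0])     # 'utf-8'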