| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Windows, #include "pyerrors.h" no longer defines "snprintf" and
"vsnprintf" macros.
PyOS_snprintf() and PyOS_vsnprintf() should be used to get portable
behavior.
Replace snprintf() calls with PyOS_snprintf() and replace vsnprintf()
calls with PyOS_vsnprintf().
(cherry picked from commit e822e37946f27c09953bb5733acf3b07c2db690f)
Co-authored-by: Victor Stinner <vstinner@python.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A line with only a line continuation character should be considered
a blank line at tokenizer level so that only a single NEWLINE token
gets emitted. The old parser was working around the issue, but the
new parser threw a `SyntaxError` for valid input. For example,
an empty line following a line continuation character was interpreted
as a `SyntaxError`.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
(cherry picked from commit 896f4cf63f9ab93e30572d879a5719d5aa2499fb)
Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
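A minimal sketch of the valid input described above, using the built-in `compile()`. The source string and the helper name `compiles` are assumptions chosen for illustration: a line holding only a continuation character, followed by an empty line, sits between two ordinary statements.

```python
# Line 2 holds only a line continuation character; line 3 is empty.
source = "pass\n        \\\n\npass\n"


def compiles(src):
    """Return True if src compiles without raising SyntaxError."""
    try:
        compile(src, "<string>", "exec")
    except SyntaxError:
        return False
    return True


print(compiles(source))
```

On an interpreter that includes this fix, the call should report that the source compiles.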
|
|
|
|
|
| |
(cherry picked from commit a2bbedc8b18c001d2f9e702e6e678efbb2990daa)
Co-authored-by: Ammar Askar <ammar@ammaraskar.com>
|
|
|
|
| |
(GH-20033)
|
|
|
|
| |
Commit 41d5b94af44e34ac05d4cd57460ed104ccf96628 has to be reverted because of backwards compatibility concerns: with it, a keyword immediately followed by a string without whitespace between them
(as in `bg="#d00" if clear else"#fca"`) fails to parse.
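The construct in question can be tried directly. This is a small sketch, with the helper name `pick_colour` made up for illustration; the expression mirrors the one quoted in the message.

```python
# No whitespace separates the keyword `else` from the string that follows it.
expression = '"#d00" if clear else"#fca"'


def pick_colour(clear):
    """Evaluate the conditional expression with the given value of `clear`."""
    return eval(expression, {"clear": clear})


print(pick_colour(True), pick_colour(False))
```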
|
|
|
|
|
| |
(GH-19619)
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Py_FatalError() function is replaced with a macro which
automatically logs the name of the current function, unless the
Py_LIMITED_API macro is defined.
Changes:
* Add _Py_FatalErrorFunc() function.
* Remove the function name from the message of Py_FatalError() calls
which included the function name.
* Update tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment:
/* XXX: constify members. */
This patch addresses that.
In the tok_state struct:
* end and start were non-const but could be made const
* str and input were const but should have been non-const
Changes to support this include:
* decode_str() now returns a char * since it is allocated.
* PyTokenizer_FromString() and PyTokenizer_FromUTF8() each create a
new char * for an allocated string instead of reusing the input
const char *.
* PyTokenizer_Get() and tok_get() now take const char ** arguments.
* Various local vars are const or non-const accordingly.
I was able to remove five casts that cast away constness.
|
|
|
|
| |
* Always set the text attribute.
* Correct the offset attribute for non-ascii sources.
|
|
|
|
| |
PyUnicode_IsIdentifier() does not call Py_FatalError() anymore if the
string is not ready.
|
| |
|
|
|
|
|
| |
(GH-17421)
https://bugs.python.org/issue38673
|
|
|
| |
Without indentation, the strcpy line seems to be parallel to the `if` condition.
|
|
|
|
| |
(GH-14433)
|
|
|
|
|
|
|
|
|
|
|
| |
(GH-13504)
This disallows things like `# type: ignoreé`, which seems wrong.
Also switch to using Py_ISALNUM for the alnum check, for consistency
with other code (and maybe correctness re: locale issues?).
https://bugs.python.org/issue36878
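A rough way to observe the rule from Python, assuming an interpreter where `ast.parse()` accepts `type_comments=True`. The helper name `ignored_lines` is hypothetical; it lists the lines the parser reports as type-ignored.

```python
import ast


def ignored_lines(source):
    """Return the line numbers the parser records as `# type: ignore` lines."""
    tree = ast.parse(source, type_comments=True)
    return [ignore.lineno for ignore in tree.type_ignores]


plain = "x = 1  # type: ignore\n"
# The same comment with a non-ASCII letter glued to the end of `ignore`.
accented = "x = 1  # type: ignore\u00e9\n"

print(ignored_lines(plain))
print(ignored_lines(accented))
```

The accented variant should not be counted as an ignore; it is treated as an ordinary type comment instead.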
|
|
|
|
|
| |
GH-13238 made extra text after a # type: ignore accepted by the parser.
This finishes the job and actually plumbs the extra text through the
parser and makes it available in the AST.
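As a sketch of what "available in the AST" looks like from Python: each entry in `type_ignores` carries the text that followed `ignore` in its `tag` attribute. The sample source line is an assumption chosen for illustration.

```python
import ast

# Parse a line whose ignore comment carries extra text in brackets.
tree = ast.parse("x = 1  # type: ignore[E1000]\n", type_comments=True)

# Collect (line number, extra text) pairs from the module's type_ignores list.
tags = [(ignore.lineno, ignore.tag) for ignore in tree.type_ignores]
print(tags)
```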
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the parser consistent with the tokenize module (already the case
in `pypy`).
sample
------
```python
x = 5\
```
before
------
```console
$ python3 t.py
$ python3 -mtokenize t.py
t.py:2:0: error: EOF in multi-line statement
```
after
-----
```console
$ ./python t.py
File "t.py", line 3
x = 5\
^
SyntaxError: unexpected EOF while parsing
$ ./python -m tokenize t.py
t.py:2:0: error: EOF in multi-line statement
```
https://bugs.python.org/issue2180
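The "after" behaviour can also be checked without a file on disk. This is a minimal sketch using `compile()`; the helper name `rejects_trailing_continuation` is made up for illustration.

```python
def rejects_trailing_continuation(src):
    """Return True if compiling src raises SyntaxError."""
    try:
        compile(src, "t.py", "exec")
    except SyntaxError:
        return True
    return False


# The sample file: a statement whose last line ends in a backslash.
print(rejects_trailing_continuation("x = 5\\\n"))
# The same statement without the trailing backslash.
print(rejects_trailing_continuation("x = 5\n"))
```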
|
|
|
|
|
|
|
| |
In the parser, when using the type_comments=True option, recognize
a TYPE_IGNORE as anything containing `# type: ignore` followed by
a non-alphanumeric character. This is to allow ignores such as
`# type: ignore[E1000]`.
|
|
|
| |
After the removal of pgen, multiple headers and function prototypes that lack an implementation or are unused are still lying around.
|
|
|
|
|
|
| |
tok_nextc() (12601)
Remove the PyMem_FREE() call added in cb90c89. The buffer will be
freed when PyTokenizer_Free() is called on the tokenizer state.
|
| |
|
|
|
|
|
|
|
| |
This adds a `feature_version` flag to `ast.parse()` (documented) and `compile()` (hidden) that allows tweaking the parser to support older versions of the grammar. In particular, if `feature_version` is 5 or 6, the hacks for the `async` and `await` keywords from PEP 492 are reinstated. (For 7 or higher, these are unconditionally treated as keywords, but they are still special tokens rather than `NAME` tokens that the parser driver recognizes.)
https://bugs.python.org/issue35975
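A hedged sketch of how the flag might be exercised. It assumes the `(major, minor)` tuple form that `ast.parse()` documents, rather than the bare minor numbers mentioned above, and the helper name `parses` is hypothetical. Later interpreters may no longer support feature versions this old, so the second result depends on the running Python.

```python
import ast


def parses(source, **options):
    """Return True if ast.parse() accepts source with the given options."""
    try:
        ast.parse(source, **options)
    except (SyntaxError, ValueError):
        return False
    return True


# With the default grammar, `async` is a keyword and cannot be assigned to.
print(parses("async = 1"))

# Requesting the 3.6 grammar asks for the old PEP 492 hacks; whether this is
# still honoured depends on the interpreter version.
print(parses("async = 1", feature_version=(3, 6)))
```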
|
|
|
|
|
| |
Pgen is the oldest piece of technology in the CPython repository. Building it requires various #if[n]def PGEN hacks in other parts of the code, and it depends more and more on CPython internals. This commit removes the old pgen C code and replaces it with a new version implemented in pure Python. This is a modified and adapted version of lib2to3/pgen2 that can generate grammar files compatible with the current parser.
This commit also eliminates all the #ifdef and code branches related to pgen, simplifying the code and making it more maintainable. The regen-grammar step now uses $(PYTHON_FOR_REGEN), which can be any version of the interpreter, so the new pgen code maintains compatibility with older versions of the interpreter (this also allows regenerating the grammar with the current CI solution, which uses Python 3.5). The new pgen Python module also makes use of the Grammar/Tokens file that holds the token specification, so it is always kept in sync and avoids having to maintain duplicate token definitions.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(GH-10370)
"Include/token.h", "Lib/token.py" (now containing some data moved from
"Lib/tokenize.py") and new files "Parser/token.c" (containing the code
moved from "Parser/tokenizer.c") and "Doc/library/token-list.inc" (included
in "Doc/library/token.rst") are now generated from "Grammar/Tokens" by
"Tools/scripts/generate_token.py". The script overwrites files only if
needed and can be used on a read-only source tree.
"Lib/symbol.py" is now generated by "Tools/scripts/generate_symbol_py.py"
instead of being executable itself.
Added new make targets "regen-token" and "regen-symbol" which are now
dependencies of "regen-all".
The documentation now contains strings for operators and punctuation tokens.
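A short sketch of the generated data as seen from Python. It assumes that the mapping moved into "Lib/token.py" is the one exposed as `token.EXACT_TOKEN_TYPES`, which maps operator and punctuation strings to token numbers.

```python
import token

# Look up the token number for the "+" operator string.
plus = token.EXACT_TOKEN_TYPES["+"]

# tok_name maps the number back to its symbolic name.
print(plus == token.PLUS, token.tok_name[plus])
```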
|
| |
|
|
|
|
|
|
| |
(GH-11015)
Set MemoryError when appropriate, add missing failure checks,
and fix some potential leaks.
|
| |
|
|
|
|
|
|
| |
Fix the following warning on Windows:
parser\tokenizer.c(1297): warning C4244: 'function': conversion from
'__int64' to 'int', possible loss of data.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Remove the following fields from the tok_state structure, which are now
unused:
* altwarning: "Issue warning if alternate tabs don't match"
* alterror: "Issue error if alternate tabs don't match"
* alttabsize: "Alternate tab spacing"
Replace alttabsize variable with ALTTABSIZE define.
|
|
|
| |
Per PEP 492, 'async' and 'await' should become proper keywords in 3.7.
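One quick way to confirm the keyword status on a given interpreter is the standard `keyword` module; a minimal sketch:

```python
import keyword

# Collect whichever of the two names the running interpreter treats as keywords.
reserved = [name for name in ("async", "await") if keyword.iskeyword(name)]
print(reserved)
```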
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* add test to check if we're modifying token
* copy list so import tokenize doesn't have side effects on token
* shorten line
* add tokenize tokens to token.h to get them to show up in token
* move ERRORTOKEN back to its previous location, and fix nitpick
* copy comments from token.h automatically
* fix whitespace and make more pythonic
* change to fix comments from @haypo
* update token.rst and Misc/NEWS
* change wording
* some more wording changes
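A small sketch of the effect, assuming the "tokenize tokens" referred to above include `COMMENT` and `NL`: their names can be looked up through the `token` module while tokenizing with `tokenize`. The sample source string is made up for illustration.

```python
import io
import token
import tokenize

source = "x = 1  # note\n\n"

# Tokenize the sample and translate each token type through token.tok_name.
names = [
    token.tok_name[tok.type]
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]
print(names)
```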
|
|\ |
|
| |
| |
| |
| | |
Patch by Ryan Gonzalez.
|
|/
|
|
|
|
|
| |
Replace:
PyObject_CallObject(callable, NULL)
with:
_PyObject_CallNoArg(callable)
|
|
|
|
| |
with PyUnicode_AsUTF8 and PyUnicode_AsUTF8AndSize.
|
|\ |
|
| |\ |
|
| | | |
|
|\ \ \
| |/ / |
|
| | | |
|
| | |
| | |
| | |
| | | |
Thanks to Georg Brandl for the patch.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
In the case of an escape character, c is never read. tok_nextc() is
used to advance the pointer.
CID 1225097
|
|\ \ \
| |/ / |
|
| | | |
|