| Commit message | Author | Age | Files | Lines |
|
continuation characters (GH-30130)
|
PyImport_ImportModule (GH-30046)
|
files (GH-30068)
|
characters (GH-29654)
|
characters in the parser
|
It used to be like this:
<img width="1232" alt="Screenshot 2021-10-22 at 23 07 40" src="https://user-images.githubusercontent.com/4660275/138516608-fef6ec01-a96a-40f4-81ef-52265b0f536b.png">
A quick `grep` shows that it is used in just one place, under `Py_DEBUG`: https://github.com/python/cpython/blame/f6e8b80d20159596cf641305bad3a833bedd2f4f/Parser/tokenizer.c#L1047-L1051
<img width="752" alt="Screenshot 2021-10-22 at 23 08 09" src="https://user-images.githubusercontent.com/4660275/138516684-ea503136-1e92-48a5-95bb-419e190d5866.png">
I am not sure, but it also looks like a private thing; it should not affect other users.
Automerge-Triggered-By: GH:pablogsal
|
Rename PyTokenize functions to mark them as private:
* PyTokenizer_FindEncodingFilename() => _PyTokenizer_FindEncodingFilename()
* PyTokenizer_FromString() => _PyTokenizer_FromString()
* PyTokenizer_FromFile() => _PyTokenizer_FromFile()
* PyTokenizer_FromUTF8() => _PyTokenizer_FromUTF8()
* PyTokenizer_Free() => _PyTokenizer_Free()
* PyTokenizer_Get() => _PyTokenizer_Get()
Remove the unused PyTokenizer_FindEncoding() function.
import.c: remove unused #include "errcode.h".
|
* Move _PyObject_CallNoArgs() to pycore_call.h (internal C API).
* _ssl, _sqlite and _testcapi extensions now call the public
PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs().
* _lsprof extension is now built with Py_BUILD_CORE_MODULE macro
defined to get access to internal _PyObject_CallNoArgs().
|
Fix typo in the private _PyObject_CallNoArg() function name: rename
it to _PyObject_CallNoArgs() to be consistent with the public
function PyObject_CallNoArgs().
|
Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu>
|
buffers (GH-26676)
Automerge-Triggered-By: GH:pablogsal
|
Emit a deprecation warning if the numeric literal is immediately followed by
one of the keywords: and, else, for, if, in, is, or. Raise a syntax error with
a more informative message if it is immediately followed by any other keyword
or an identifier.
Automerge-Triggered-By: GH:pablogsal
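A minimal sketch of the new behavior from the Python side (the exact warning category and error message are assumptions):
```python
import warnings

# "1in [1]" is still valid syntax, so it compiles but now warns:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    compile("1in [1]", "<string>", "eval")
print(caught[0].category.__name__)  # DeprecationWarning

# A literal immediately followed by any other keyword or an identifier
# is rejected outright:
try:
    compile("1spam", "<string>", "eval")
except SyntaxError as err:
    print(err.msg)  # e.g. "invalid decimal literal"
```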
|
When the parser does a second pass to check for errors, these rules can
have small side effects, as they may advance the parser further than the
point reached in the first pass. This can cause the tokenizer to ask for
extra tokens in interactive mode, making the tokenizer show the prompt
instead of failing instantly.
To avoid this, add a new mode to the tokenizer that is activated in the
second pass and deactivates asking for new tokens when the interactive
line is finished. As the parsing should have reached the last line in
the first pass, the second pass should not need to ask for more tokens.
|
from stdin (GH-24763)
|
Automerge-Triggered-By: GH:isidentical
|
for column numbers (GH-24266)
|
When trying to extract the error line for the error message there
are two distinct cases:
1. The input comes from a file, which means that we can extract the
   error line by using `PyErr_ProgramTextObject`, which we already do.
2. The input does not come from a file, at which point we need to get
   the source code from the tokenizer:
   * If the tokenizer's current line number is the same as the line of
     the error, we get the line from `tok->buf` and we're ready.
   * Otherwise, we can extract the error line from the source code in
     the following two ways:
     * If the input comes from a string, we have all the input in
       `tok->str` and can extract the error line from it.
     * If the input comes from stdin, i.e. the interactive prompt, we
       do not have access to the previous line. That's why a new field
       `tok->stdin_content` is added, which holds the whole input for
       the current (multiline) statement or expression. We can then
       extract the error line from `tok->stdin_content` like we do in
       the string case above.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
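A sketch of case 2 from the Python side, where the input comes from a string rather than a file (the exact output is what one would expect with this change):
```python
# The source does not come from a file, so the error line has to be
# recovered from the tokenizer; it still ends up on the exception:
try:
    compile("x = 1\ny = 1 +\n", "<string>", "exec")
except SyntaxError as err:
    print(err.lineno, repr(err.text))  # expected: 2 'y = 1 +\n'
```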
|
No longer use deprecated aliases to functions:
* Replace PyMem_MALLOC() with PyMem_Malloc()
* Replace PyMem_REALLOC() with PyMem_Realloc()
* Replace PyMem_FREE() with PyMem_Free()
* Replace PyMem_Del() with PyMem_Free()
* Replace PyMem_DEL() with PyMem_Free()
Also modify the PyMem_DEL() macro to use PyMem_Free() directly.
|
On Windows, #include "pyerrors.h" no longer defines "snprintf" and
"vsnprintf" macros.
PyOS_snprintf() and PyOS_vsnprintf() should be used to get portable
behavior.
Replace snprintf() calls with PyOS_snprintf() and replace vsnprintf()
calls with PyOS_vsnprintf().
|
A line with only a line continuation character should be considered
a blank line at tokenizer level so that only a single NEWLINE token
gets emitted. The old parser was working around the issue, but the
new parser raised a `SyntaxError` for valid input. For example,
an empty line following a line continuation character was rejected
with a `SyntaxError`.
Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
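A small sanity check of the fixed behavior, assuming a build that includes this change:
```python
# Line 2 holds only a continuation character and line 3 is empty; the
# tokenizer now treats the continuation-only line as a blank line, so
# this compiles instead of raising a SyntaxError under the new parser:
compile("x = 5\n\\\n\n", "<string>", "exec")
```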
|
(GH-20033)
|
Due to backwards compatibility concerns, code with a keyword immediately
followed by a string with no whitespace between them (like in
`bg="#d00" if clear else"#fca"`) must keep parsing, so commit
41d5b94af44e34ac05d4cd57460ed104ccf96628 has to be reverted.
|
(GH-19619)
Co-authored-by: Guido van Rossum <gvanrossum@gmail.com>
|
The Py_FatalError() function is replaced with a macro which automatically
logs the name of the current function, unless the Py_LIMITED_API macro
is defined.
Changes:
* Add _Py_FatalErrorFunc() function.
* Remove the function name from the message of Py_FatalError() calls
which included the function name.
* Update tests.
|
The function PyTokenizer_FromUTF8 from Parser/tokenizer.c had a comment:
/* XXX: constify members. */
This patch addresses that.
In the tok_state struct:
* end and start were non-const but could be made const
* str and input were const but should have been non-const
Changes to support this include:
* decode_str() now returns a char * since it is allocated.
* PyTokenizer_FromString() and PyTokenizer_FromUTF8() each creates a
  new char * for an allocated string instead of reusing the input
  const char *.
* PyTokenizer_Get() and tok_get() now take const char ** arguments.
* Various local vars are const or non-const accordingly.
I was able to remove five casts that cast away constness.
|
* Always set the text attribute.
* Correct the offset attribute for non-ASCII sources.
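A sketch of what this means at the Python level; '€' is not valid in an identifier, and the exact offset value shown is an assumption:
```python
# The error occurs after non-ASCII characters; with this change the
# SyntaxError always carries .text, and .offset lines up with the
# source even though the prefix is not ASCII:
try:
    compile("ααα€ = 1", "<string>", "exec")
except SyntaxError as err:
    print(repr(err.text), err.offset)  # offset expected to point at '€'
```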
|
PyUnicode_IsIdentifier() no longer calls Py_FatalError() if the
string is not ready.
|
(GH-17421)
https://bugs.python.org/issue38673
|
Without indentation, it seems like the strcpy line is parallel to the `if` condition.
|
(GH-14433)
|
(GH-13504)
This disallows things like `# type: ignoreé`, which seems wrong.
Also switch to using Py_ISALNUM for the alnum check, for consistency
with other code (and maybe correctness re: locale issues?).
https://bugs.python.org/issue36878
|
GH-13238 made extra text after a # type: ignore accepted by the parser.
This finishes the job and actually plumbs the extra text through the
parser and makes it available in the AST.
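A short illustration with the `ast` module; the printed tag value is what this change is expected to expose:
```python
import ast

# With type_comments=True, the ignore, including its trailing text,
# lands on the module's type_ignores list:
tree = ast.parse("x = 1  # type: ignore[E1000]\n", type_comments=True)
ignore = tree.type_ignores[0]
print(ignore.lineno, ignore.tag)  # expected: 1 [E1000]
```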
|
This makes the parser consistent with the tokenize module (already the case
in `pypy`).
sample
------
```python
x = 5\
```
before
------
```console
$ python3 t.py
$ python3 -mtokenize t.py
t.py:2:0: error: EOF in multi-line statement
```
after
-----
```console
$ ./python t.py
File "t.py", line 3
x = 5\
^
SyntaxError: unexpected EOF while parsing
$ ./python -m tokenize t.py
t.py:2:0: error: EOF in multi-line statement
```
https://bugs.python.org/issue2180
|
In the parser, when using the type_comments=True option, recognize
a TYPE_IGNORE as anything containing `# type: ignore` followed by
a non-alphanumeric character. This is to allow ignores such as
`# type: ignore[E1000]`.
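A sketch of the recognition rule, assuming `ast.parse` surfaces it as described:
```python
import ast

# "ignore" followed by a non-alphanumeric character is a TYPE_IGNORE:
src = "x = 1  # type: ignore[E1000]\n"
print(len(ast.parse(src, type_comments=True).type_ignores))  # 1

# "ignore" followed by more alphanumeric characters is not; it is an
# ordinary type comment attached to the statement instead:
src = "x = 1  # type: ignoreme\n"
tree = ast.parse(src, type_comments=True)
print(len(tree.type_ignores), tree.body[0].type_comment)  # 0 ignoreme
```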
|
After the removal of pgen, multiple header files and function prototypes that lack an implementation or are unused are still lying around.
|
tok_nextc() (12601)
Remove the PyMem_FREE() call added in cb90c89. The buffer will be
freed when PyTokenizer_Free() is called on the tokenizer state.
|