summaryrefslogtreecommitdiffstats
path: root/Parser
Commit message (Collapse)AuthorAgeFilesLines
* bpo-46820: Fix a SyntaxError in a numeric literal followed by "not in" ↵Serhiy Storchaka2022-02-221-0/+3
| | | | | | | (GH-31479) Fix parsing a numeric literal immediately (without spaces) followed by "not in" keywords, like in "1not in x". Now the parser only emits a warning, not a syntax error.
* bpo-46762: Fix an assert failure in f-strings where > or < is the last ↵Eric V. Smith2022-02-161-10/+10
| | | | character if the f-string is missing a trailing right brace. (#31365)
* Don't print rejected tokens when using the debug flags in the parser (GH-31258)Pablo Galindo Salgado2022-02-101-1/+0
|
* Allow the parser to avoid nested processing of invalid rules (GH-31252)Pablo Galindo Salgado2022-02-103-1044/+1033
|
* bpo-46707: Avoid potential exponential backtracking in some syntax errors ↵Pablo Galindo Salgado2022-02-101-1015/+1020
| | | | (GH-31241)
* bpo-46541: Replace core use of _Py_IDENTIFIER() with statically initialized ↵Eric Snow2022-02-081-9/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | global objects. (gh-30928) We're no longer using _Py_IDENTIFIER() (or _Py_static_string()) in any core CPython code. It is still used in a number of non-builtin stdlib modules. The replacement is: PyUnicodeObject (not pointer) fields under _PyRuntimeState, statically initialized as part of _PyRuntime. A new _Py_GET_GLOBAL_IDENTIFIER() macro facilitates lookup of the fields (along with _Py_GET_GLOBAL_STRING() for non-identifier strings). https://bugs.python.org/issue46541#msg411799 explains the rationale for this change. The core of the change is in: * (new) Include/internal/pycore_global_strings.h - the declarations for the global strings, along with the macros * Include/internal/pycore_runtime_init.h - added the static initializers for the global strings * Include/internal/pycore_global_objects.h - where the struct in pycore_global_strings.h is hooked into _PyRuntimeState * Tools/scripts/generate_global_objects.py - added generation of the global string declarations and static initializers I've also added a --check flag to generate_global_objects.py (along with make check-global-objects) to check for unused global strings. That check is added to the PR CI config. The remainder of this change updates the core code to use _Py_GET_GLOBAL_IDENTIFIER() instead of _Py_IDENTIFIER() and the related _Py*Id functions (likewise for _Py_GET_GLOBAL_STRING() instead of _Py_static_string()). This includes adding a few functions where there wasn't already an alternative to _Py*Id(), replacing the _Py_Identifier * parameter with PyObject *. The following are not changed (yet): * stop using _Py_IDENTIFIER() in the stdlib modules * (maybe) get rid of _Py_IDENTIFIER(), etc. entirely -- this may not be doable as at least one package on PyPI using this (private) API * (maybe) intern the strings during runtime init https://bugs.python.org/issue46541
* bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010)Pablo Galindo Salgado2022-02-083-11/+31
|
* bpo-14916: use specified tokenizer fd for file input (GH-31006)Paul m. p. P2022-02-011-1/+1
| | | | | | | | | | @pablogsal, sorry i failed to rebase to main, so i recreated https://github.com/python/cpython/pull/22190#issuecomment-1024633392 > PyRun_InteractiveOne\*() functions allow to explicitily set fd instead of stdin. but stdin was hardcoded in readline call. > This patch does not fix target file for prompt unlike original bpo one : prompt fd is unrelated to tokenizer source which could be read only. It is more of a bugfix regarding the docs : actual documentation say "prompt the user" so one would expect prompt to go on stdout not a file for both PyRun_InteractiveOne\*() and PyRun_InteractiveLoop\*(). Automerge-Triggered-By: GH:pablogsal
* bpo-46091: Correctly calculate indentation levels for whitespace lines with ↵Pablo Galindo Salgado2022-01-251-13/+33
| | | | continuation characters (GH-30130)
* bpo-46503: Prevent an assert from firing when parsing some invalid \N ↵Eric V. Smith2022-01-251-2/+14
| | | | | | | sequences in f-strings. (GH-30865) * bpo-46503: Prevent an assert from firing. Also fix one nearby tiny PEP-7 nit. * Added blurb.
* Fix the caret position in some syntax errors in interactive mode (GH-30718)Pablo Galindo Salgado2022-01-201-2/+3
|
* bpo-46339: Include clarification on assert in ↵Pablo Galindo Salgado2022-01-181-0/+3
| | | | 'get_error_line_from_tokenizer_buffers' (#30545)
* bpo-46339: Fix crash in the parser when computing error text for multi-line ↵Pablo Galindo Salgado2022-01-111-2/+9
| | | | | f-strings (GH-30529) Automerge-Triggered-By: GH:pablogsal
* bpo-46237: Fix the line number of tokenizer errors inside f-strings (GH-30463)Pablo Galindo Salgado2022-01-082-5/+8
|
* bpo-46289: Make conversion of FormattedValue not optional on ASDL (GH-30467)Batuhan Taskaya2022-01-071-1/+1
| | | Automerge-Triggered-By: GH:isidentical
* bpo-46240: Correct the error for unclosed parentheses when the tokenizer is ↵Pablo Galindo Salgado2022-01-041-1/+2
| | | | not finished (GH-30378)
* bpo-46110: Restore commit e9898bf153d26059261ffef11f7643ae991e2a4cPablo Galindo Salgado2022-01-032-3193/+4581
| | | This restores commit e9898bf153d26059261ffef11f7643ae991e2a4c .
* Revert "bpo-46110: Add a recursion check to avoid stack overflow in the PEG ↵Pablo Galindo Salgado2022-01-032-4581/+3193
| | | | | parser (GH-30177)" (GH-30363) This reverts commit e9898bf153d26059261ffef11f7643ae991e2a4c temporarily as we want to confirm if this commit is the cause of a slowdown at startup time.
* bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser ↵Pablo Galindo Salgado2021-12-202-3193/+4581
| | | | | (GH-30177) Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
* bpo-45292: [PEP-654] add except* (GH-29581)Irit Katriel2021-12-142-1776/+2721
|
* bpo-45855: Replaced deprecated `PyImport_ImportModuleNoBlock` with ↵Kumar Aditya2021-12-122-2/+2
| | | | PyImport_ImportModule (GH-30046)
* bpo-46054: Fix parsing error when parsing non-utf8 characters in source ↵Pablo Galindo Salgado2021-12-121-8/+5
| | | | files (GH-30068)
* bpo-42918: Improve build-in function compile() in mode 'single' (GH-29934)Weipeng Hong2021-12-101-19/+1
| | | Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
* bpo-46004: Fix error location for loops with invalid targets (GH-29959)Pablo Galindo Salgado2021-12-071-1/+2
|
* bpo-45866: pegen strips directory of "generated from" header (GH-29777)Victor Stinner2021-11-261-1/+1
| | | | | "make regen-all" now produces the same output when run from a directory other than the source tree: when building Python out of the source tree.
* bpo-45727: Only trigger the 'did you forgot a comma' error suggestion if ↵Pablo Galindo Salgado2021-11-244-4/+7
| | | | inside parentheses (GH-29757)
* Ensure the str member of the tokenizer is always initialised (GH-29681)Pablo Galindo Salgado2021-11-213-3/+3
|
* Refactor parser compilation units into specific components (GH-29676)Pablo Galindo Salgado2021-11-214-1868/+1894
|
* bpo-45811: Improve error message when source code contains invisible control ↵Pablo Galindo Salgado2021-11-201-0/+6
| | | | characters (GH-29654)
* bpo-45450: Improve syntax error for parenthesized arguments (GH-28906)Pablo Galindo Salgado2021-11-201-445/+849
|
* bpo-45494: Fix error location in EOF tokenizer errors (GH-29108)Pablo Galindo Salgado2021-11-201-2/+7
|
* bpo-45848: Allow the parser to get error lines from encoded files (GH-29646)Pablo Galindo Salgado2021-11-201-7/+8
|
* bpo-45727: Make the syntax error for missing comma more consistent (GH-29427)Pablo Galindo Salgado2021-11-192-22/+24
|
* bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are ↵Pablo Galindo Salgado2021-11-161-1/+1
| | | | not provided (GH-29582)
* bpo-45820: Fix a segfault when the parser fails without reading any input ↵Pablo Galindo Salgado2021-11-161-0/+8
| | | | (GH-29580)
* bpo-45738: Fix computation of error location for invalid continuation (GH-29550)Pablo Galindo Salgado2021-11-142-11/+5
| | | characters in the parser
* bpo-45764: improve error message when missing '(' after 'def' (GH-29484)Carl Friedrich Bolz-Tereick2021-11-091-12/+12
| | | | | to achieve this, change the grammar to expect the '(' token after 'def' NAME. Automerge-Triggered-By: GH:pablogsal
* bpo-45716: Improve the error message when using True/False/None as keywords ↵Pablo Galindo Salgado2021-11-051-689/+795
| | | | in a call (GH-29413)
* bpo-44257: fix "assigment_expr" typo + regenerate the grammar, and remove ↵wim glenn2021-11-031-60/+60
| | | | | | unused imports (GH-29393) Co-authored-by: Wim Glenn <wglenn@jumptrading.com>
* bpo-45562: Ensure all tokenizer debug messages are printed to stderr (GH-29270)Pablo Galindo Salgado2021-10-281-1/+1
|
* bpo-45562: Print tokenizer debug messages to stderr (GH-29250)Pablo Galindo Salgado2021-10-271-4/+4
|
* bpo-45574: fix warning about `print_escape` being unused (GH-29172)Nikita Sobolev2021-10-221-0/+2
| | | | | | | | | | | It used to be like this: <img width="1232" alt="Снимок экрана 2021-10-22 в 23 07 40" src="https://user-images.githubusercontent.com/4660275/138516608-fef6ec01-a96a-40f4-81ef-52265b0f536b.png"> Quick `grep` tells that it is just used in one place under `Py_DEBUG`: https://github.com/python/cpython/blame/f6e8b80d20159596cf641305bad3a833bedd2f4f/Parser/tokenizer.c#L1047-L1051 <img width="752" alt="Снимок экрана 2021-10-22 в 23 08 09" src="https://user-images.githubusercontent.com/4660275/138516684-ea503136-1e92-48a5-95bb-419e190d5866.png"> I am not sure, but it also looks like a private thing, it should not affect other users. Automerge-Triggered-By: GH:pablogsal
* bpo-45562: Only show debug output from the parser in debug builds (GH-29140)Pablo Galindo Salgado2021-10-221-0/+2
|
* bpo-45494: Fix parser crash when reporting errors involving invalid ↵Pablo Galindo Salgado2021-10-192-122/+130
| | | | | | | | | | | | continuation characters (GH-28993) There are two errors that this commit fixes: * The parser was not correctly computing the offset and the string source for E_LINECONT errors due to the incorrect usage of strtok(). * The parser was not correctly unwinding the call stack when a tokenizer exception happened in rules involving optionals ('?', [...]) as we always make them return valid results by using the comma operator. We need to check first if we don't have an error before continuing.
* bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" ↵Serhiy Storchaka2021-10-141-1/+1
| | | | | | | | | codec (GH-28939) They support now splitting escape sequences between input chunks. Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior.
* bpo-45434: Mark the PyTokenizer C API as private (GH-28924)Victor Stinner2021-10-134-42/+35
| | | | | | | | | | | | | | Rename PyTokenize functions to mark them as private: * PyTokenizer_FindEncodingFilename() => _PyTokenizer_FindEncodingFilename() * PyTokenizer_FromString() => _PyTokenizer_FromString() * PyTokenizer_FromFile() => _PyTokenizer_FromFile() * PyTokenizer_FromUTF8() => _PyTokenizer_FromUTF8() * PyTokenizer_Free() => _PyTokenizer_Free() * PyTokenizer_Get() => _PyTokenizer_Get() Remove the unused PyTokenizer_FindEncoding() function. import.c: remove unused #include "errcode.h".
* bpo-45434: Move _Py_BEGIN_SUPPRESS_IPH to pycore_fileutils.h (GH-28922)Victor Stinner2021-10-131-0/+1
|
* bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895)Victor Stinner2021-10-121-0/+1
| | | | | | | * Move _PyObject_CallNoArgs() to pycore_call.h (internal C API). * _ssl, _sqlite and _testcapi extensions now call the public PyObject_CallNoArgs() function, rather than _PyObject_CallNoArgs(). * _lsprof extension is now built with Py_BUILD_CORE_MODULE macro defined to get access to internal _PyObject_CallNoArgs().
* bpo-45439: Rename _PyObject_CallNoArg() to _PyObject_CallNoArgs() (GH-28891)Victor Stinner2021-10-111-1/+1
| | | | | Fix typo in the private _PyObject_CallNoArg() function name: rename it to _PyObject_CallNoArgs() to be consistent with the public function PyObject_CallNoArgs().
* bpo-45408: Don't override previous tokenizer errors in the second parser ↵Pablo Galindo Salgado2021-10-071-1/+4
| | | | pass (GH-28812)