summaryrefslogtreecommitdiffstats
path: root/Parser/pegen.h
Commit message (Collapse)AuthorAgeFilesLines
* gh-112943: Correctly compute end offsets for multiline tokens in the ↵Pablo Galindo Salgado2023-12-111-0/+1
| | | | tokenize module (#112949)
* gh-110805: Allow the repl to show source code and complete tracebacks (#110775)Pablo Galindo Salgado2023-10-131-1/+2
|
* gh-108179: Add error message for parser stack overflows (#108256)Dennis Sweeney2023-08-221-0/+2
|
* gh-107015: Remove async_hacks from the tokenizer (#107018)Pablo Galindo Salgado2023-07-261-1/+0
|
* gh-104922: remove PY_SSIZE_T_CLEAN (#106315)Inada Naoki2023-07-021-1/+0
|
* gh-105194: Fix format specifier escaped characters in f-strings (#105231)Pablo Galindo Salgado2023-06-021-0/+1
|
* gh-103656: Transfer f-string buffers to parser to avoid use-after-free ↵Lysandros Nikolaou2023-04-271-2/+11
| | | | | (GH-103896) Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
* gh-102856: Initial implementation of PEP 701 (#102855)Pablo Galindo Salgado2023-04-191-4/+14
| | | | | | Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> Co-authored-by: Batuhan Taskaya <isidentical@gmail.com> Co-authored-by: Marta Gómez Macías <mgmacias@google.com> Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
* gh-100940: Change "char *str" to "const char *str" in KeywordToken: It is ↵Stepfen Shawn2023-01-181-1/+1
| | | | an immutable string. (#100936)
* gh-93103: Parser uses PyConfig.parser_debug instead of Py_DebugFlag (#93106)Victor Stinner2022-05-241-0/+1
| | | | | | | * Replace deprecated Py_DebugFlag with PyConfig.parser_debug in the parser. * Add Parser.debug member. * Add tok_state.debug member. * Py_FrozenMain(): Replace Py_VerboseFlag with PyConfig.verbose.
* gh-92651: Remove the Include/token.h header file (#92652)Victor Stinner2022-05-111-1/+1
| | | | | | | | | | | | | | | Remove the token.h header file. There was never any public tokenizer C API. The token.h header file was only designed to be used by Python internals. Move Include/token.h to Include/internal/pycore_token.h. Including this header file now requires that the Py_BUILD_CORE macro is defined. It no longer checks for the Py_LIMITED_API macro. Rename functions: * PyToken_OneChar() => _PyToken_OneChar() * PyToken_TwoChars() => _PyToken_TwoChars() * PyToken_ThreeChars() => _PyToken_ThreeChars()
* bpo-47212: Improve error messages for un-parenthesized generator expressions ↵Matthieu Dartiailh2022-04-051-0/+1
| | | | (GH-32302)
* Allow the parser to avoid nested processing of invalid rules (GH-31252)Pablo Galindo Salgado2022-02-101-1/+0
|
* bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010)Pablo Galindo Salgado2022-02-081-0/+1
|
* bpo-46004: Fix error location for loops with invalid targets (GH-29959)Pablo Galindo Salgado2021-12-071-1/+2
|
* bpo-45727: Only trigger the 'did you forgot a comma' error suggestion if ↵Pablo Galindo Salgado2021-11-241-0/+1
| | | | inside parentheses (GH-29757)
* Refactor parser compilation units into specific components (GH-29676)Pablo Galindo Salgado2021-11-211-54/+58
|
* bpo-43914: Correctly highlight SyntaxError exceptions for invalid generator ↵Pablo Galindo Salgado2021-09-271-1/+1
| | | | expression in function calls (GH-28576)
* Update pegen to use the latest upstream developments (GH-27586)Pablo Galindo Salgado2021-08-121-0/+1
|
* bpo-34013: Generalize the invalid legacy statement error message (GH-27389)Pablo Galindo Salgado2021-07-271-0/+1
|
* bpo-43950: Print columns in tracebacks (PEP 657) (GH-26958)Ammar Askar2021-07-041-0/+1
| | | | | | | | The traceback.c and traceback.py mechanisms now utilize the newly added code.co_positions and PyCode_Addr2Location to print carets on the specific expressions involved in a traceback. Co-authored-by: Pablo Galindo <Pablogsal@gmail.com> Co-authored-by: Ammar Askar <ammar@ammaraskar.com> Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
* bpo-44456: Improve the syntax error when mixing keyword and positional ↵Pablo Galindo2021-06-241-0/+3
| | | | patterns (GH-26793)
* Add more const modifiers. (GH-26691)Serhiy Storchaka2021-06-121-3/+3
|
* bpo-44180: Fix edge cases in invalid assigment rules in the parser (GH-26283)Pablo Galindo2021-05-211-0/+1
| | | | | | | | | | | | | | | The invalid assignment rules are very delicate since the parser can easily raise an invalid assignment when a keyword argument is provided. As they are very deep into the grammar tree, is very difficult to specify in which contexts these rules can be used and in which don't. For that, we need to use a different version of the rule that doesn't do error checking in those situations where we don't want the rule to raise (keyword arguments and generator expressions). We also need to check if we are in left-recursive rule, as those can try to eagerly advance the parser even if the parse will fail at the end of the expression. Failing to do this allows the parser to start parsing a call as a tuple and incorrectly identify a keyword argument as an invalid assignment, before it realizes that it was not a tuple after all.
* bpo-43892: Validate the first term of complex literal value patterns (GH-25735)Brandt Bucher2021-04-301-1/+2
|
* bpo-43892: Make match patterns explicit in the AST (GH-25585)Nick Coghlan2021-04-291-0/+9
| | | Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
* bpo-43914: Highlight invalid ranges in SyntaxErrors (#25525)Pablo Galindo2021-04-231-4/+15
| | | | | | | | | | | | | | | | | To improve the user experience understanding what part of the error messages associated with SyntaxErrors is wrong, we can highlight the whole error range and not only place the caret at the first character. In this way: >>> foo(x, z for z in range(10), t, w) File "<stdin>", line 1 foo(x, z for z in range(10), t, w) ^ SyntaxError: Generator expression must be parenthesized becomes >>> foo(x, z for z in range(10), t, w) File "<stdin>", line 1 foo(x, z for z in range(10), t, w) ^^^^^^^^^^^^^^^^^^^^ SyntaxError: Generator expression must be parenthesized
* bpo-43822: Improve syntax errors for missing commas (GH-25377)Pablo Galindo2021-04-151-0/+2
|
* bpo-43798: Add source location attributes to alias (GH-25324)Matthew Suozzo2021-04-101-1/+1
| | | | | | | * Add source location attributes to alias. * Move alias star construction to pegen helper. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* Sanitize macros and debug functions in pegen.c (GH-25291)Pablo Galindo2021-04-091-1/+3
|
* bpo-43244: Remove parser_interface.h header file (GH-25001)Victor Stinner2021-03-241-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | Remove parser functions using the "struct _mod" type, because the AST C API was removed: * PyParser_ASTFromFile() * PyParser_ASTFromFileObject() * PyParser_ASTFromFilename() * PyParser_ASTFromString() * PyParser_ASTFromStringObject() These functions were undocumented and excluded from the limited C API. Add pycore_parser.h internal header file. Rename functions: * PyParser_ASTFromFileObject() => _PyParser_ASTFromFile() * PyParser_ASTFromStringObject() => _PyParser_ASTFromString() These functions are no longer exported (replace PyAPI_FUNC() with extern). Remove also _PyPegen_run_parser_from_file() function. Update test_peg_generator to use _PyPegen_run_parser_from_file_pointer() instead.
* bpo-43244: Remove ast.h, asdl.h, Python-ast.h headers (GH-24933)Victor Stinner2021-03-231-2/+2
| | | | | | | | | | | | | | | | These functions were undocumented and excluded from the limited C API. Most names defined by these header files were not prefixed by "Py" and so could create names conflicts. For example, Python-ast.h defined a "Yield" macro which was conflict with the "Yield" name used by the Windows <winbase.h> header. Use the Python ast module instead. * Move Include/asdl.h to Include/internal/pycore_asdl.h. * Move Include/Python-ast.h to Include/internal/pycore_ast.h. * Remove ast.h header file. * pycore_symtable.h no longer includes Python-ast.h.
* bpo-43555: Report the column offset for invalid line continuation character ↵Pablo Galindo2021-03-221-2/+3
| | | | (GH-24939)
* bpo-35134: Move Include/{pyarena.h,pyctype.h} to Include/cpython/ (GH-24550)Nicholas Sim2021-02-171-1/+0
| | | | Move non-limited C API headers pyarena.h and pyctype.h into Include/cpython/ directory.
* bpo-42997: Improve error message for missing : before suites (GH-24292)Pablo Galindo2021-02-021-4/+3
| | | | | | | | * Add to the peg generator a new directive ('&&') that allows to expect a token and hard fail the parsing if the token is not found. This allows to quickly emmit syntax errors for missing tokens. * Use the new grammar element to hard-fail if the ':' is missing before suites.
* bpo-42214: Fix check for NOTEQUAL token in the PEG parser for the ↵Pablo Galindo2020-10-301-1/+1
| | | | barry_as_flufl rule (GH-23048)
* bpo-42123: Run the parser two times and only enable invalid rules on the ↵Lysandros Nikolaou2020-10-261-0/+1
| | | | | | | | | | second run (GH-22111) * Implement running the parser a second time for the errors messages The first parser run is only responsible for detecting whether there is a `SyntaxError` or not. If there isn't the AST gets returned. Otherwise, the parser is run a second time with all the `invalid_*` rules enabled so that all the customized error messages get produced.
* bpo-41746: Cast to typed seqs in CHECK macros to avoid type erasure (GH-22864)Lysandros Nikolaou2020-10-211-4/+4
|
* bpo-41746: Add type information to asdl_seq objects (GH-22223)Pablo Galindo2020-09-161-15/+15
| | | | | | | | | | | | | * Add new capability to the PEG parser to type variable assignments. For instance: ``` | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a } ``` * Add new sequence types from the asdl definition (automatically generated) * Make `asdl_seq` type a generic aliasing pointer type. * Create a new `asdl_generic_seq` for the generic case using `void*`. * The old `asdl_seq_GET`/`ast_seq_SET` macros now are typed. * New `asdl_seq_GET_UNTYPED`/`ast_seq_SET_UNTYPED` macros for dealing with generic sequences. * Changes all possible `asdl_seq` types to use specific versions everywhere.
* bpo-41697: Correctly handle KeywordOrStarred when parsing arguments in the ↵Pablo Galindo2020-09-031-1/+3
| | | | parser (GH-22077)
* bpo-41690: Use a loop to collect args in the parser instead of recursion ↵Pablo Galindo2020-09-021-0/+1
| | | | | | | | | | | | | | | | | | | | | (GH-22053) This program can segfault the parser by stack overflow: ``` import ast code = "f(" + ",".join(['a' for _ in range(100000)]) + ")" print("Ready!") ast.parse(code) ``` the reason is that the rule for arguments has a simple recursion when collecting args: args[expr_ty]: [...] | a=named_expression b=[',' c=args { c }] { [...] }
* bpo-41060: Avoid SEGFAULT when calling GET_INVALID_TARGET in the grammar ↵Lysandros Nikolaou2020-06-211-3/+22
| | | | | | | | | (GH-21020) `GET_INVALID_TARGET` might unexpectedly return `NULL`, which if not caught will cause a SEGFAULT. Therefore, this commit introduces a new inline function `RAISE_SYNTAX_ERROR_INVALID_TARGET` that always checks for `GET_INVALID_TARGET` returning NULL and can be used in the grammar, replacing the long C ternary operation used till now.
* bpo-40958: Avoid 'possible loss of data' warning on Windows (GH-20970)Lysandros Nikolaou2020-06-201-1/+1
|
* bpo-40334: Produce better error messages on invalid targets (GH-20106)Lysandros Nikolaou2020-06-181-1/+11
| | | | | | | | | | | | | | The following error messages get produced: - `cannot delete ...` for invalid `del` targets - `... is an illegal 'for' target` for invalid targets in for statements - `... is an illegal 'with' target` for invalid targets in with statements Additionally, a few `cut`s were added in various places before the invocation of the `invalid_*` rule, in order to speed things up. Co-authored-by: Pablo Galindo <Pablogsal@gmail.com>
* bpo-40958: Avoid buffer overflow in the parser when indexing the current ↵Pablo Galindo2020-06-161-2/+2
| | | | line (GH-20875)
* bpo-40939: Remove the old parser (GH-20768)Pablo Galindo2020-06-111-0/+273
This commit removes the old parser, the deprecated parser module, the old parser compatibility flags and environment variables and all associated support code and documentation.