path: root/Parser
Commit message | Author | Age | Files | Lines
* [3.12] gh-105017: Fix including additional NL token when using CRLF (GH-105022) (#105023) | Miss Islington (bot) | 2023-05-27 | 1 | -1/+1
  Co-authored-by: Marta Gómez Macías <mgmacias@google.com>
  Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
* [3.12] Fix typo in the tokenizer (GH-104950) (#104953) | Miss Islington (bot) | 2023-05-26 | 1 | -1/+1
  (cherry picked from commit 705e387dd81b971cb1ee5727da54adfb565f61d0)
  Co-authored-by: Stepfen Shawn <m18824909883@163.com>
* [3.12] gh-104866: Tokenize should emit NEWLINE after exiting block with comment (GH-104870) (#104872) | Miss Islington (bot) | 2023-05-24 | 1 | -3/+6
  gh-104866: Tokenize should emit NEWLINE after exiting block with comment (GH-104870)
  (cherry picked from commit c90a862cdcf55dc1753c6466e5fa4a467a13ae24)
  Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
* [3.12] GH-104668: Don't call PyOS_* hooks in subinterpreters (GH-104760) | Miss Islington (bot) | 2023-05-23 | 1 | -7/+25
  GH-104668: Don't call PyOS_* hooks in subinterpreters (GH-104674)
  (cherry picked from commit 357bed0bcd3c5d7c4a8caad451754a9a172aca3e)
  Co-authored-by: Brandt Bucher <brandtbucher@microsoft.com>
* gh-102856: Allow comments inside multi-line f-string expresions (#104006) | Cristián Maureira-Fredes | 2023-05-22 | 1 | -4/+0
* gh-104656: Rename typeparams AST node to type_params (#104657) | Jelle Zijlstra | 2023-05-22 | 3 | -26/+26
* gh-98836: Extend PyUnicode_FromFormat() (GH-98838) | Serhiy Storchaka | 2023-05-21 | 1 | -8/+3
  * Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
  * Support for length modifiers j (intmax_t) and t (ptrdiff_t).
  * Length modifiers are now applied to all integer conversions.
  * Support for wchar_t C strings (%ls and %lV).
  * Support for variable width and precision (*).
  * Support for flag - (left alignment).
* gh-102856: Python tokenizer implementation for PEP 701 (#104323) | Marta Gómez Macías | 2023-05-21 | 5 | -8/+65
  This commit replaces the Python implementation of the tokenize module with an implementation that reuses the real C tokenizer via a private extension module. The tokenize module now implements a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward compatibility.
  As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation.
  Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
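The extra tokens mentioned in the commit above (comments and non-semantic newlines) can be seen from the public `tokenize` module. The following is a minimal sketch, not taken from the commit: the sample source string and the helper name `token_names` are illustrative assumptions, and only the documented `tokenize.generate_tokens` and `tokenize.tok_name` are used.

```python
import io
import tokenize

# Hypothetical sample: a statement with a trailing comment, a blank line,
# and a second statement.
SOURCE = "x = 1  # set x\n\ny = 2\n"


def token_names(source):
    """Return the token type names that tokenize produces for `source`."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)
print(names)
```

The printed list includes `COMMENT` for the trailing comment and `NL` for the blank line, which are the token kinds the compiler's tokenizer does not normally need.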
* gh-104658: Fix location of unclosed quote error for multiline f-strings (#104660) | Pablo Galindo Salgado | 2023-05-20 | 2 | -2/+6
* gh-103763: Implement PEP 695 (#103764) | Jelle Zijlstra | 2023-05-16 | 3 | -1964/+2573
  This implements PEP 695, Type Parameter Syntax. It adds support for:
  - Generic functions (def func[T](): ...)
  - Generic classes (class X[T](): ...)
  - Type aliases (type X = ...)
  - New scoping when the new syntax is used within a class body
  - Compiler and interpreter changes to support the new syntax and scoping rules
  Co-authored-by: Marc Mueller <30130371+cdce8p@users.noreply.github.com>
  Co-authored-by: Eric Traut <eric@traut.com>
  Co-authored-by: Larry Hastings <larry@hastings.org>
  Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
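As a rough illustration of the generic-function syntax listed above, the sketch below parses a small generic function and reads its type parameters from the AST. The function `first`, the helper `type_param_names`, and the version guard are assumptions made for this example; the source is kept in a string so the file still loads on interpreters older than 3.12, where the helper returns `None`.

```python
import ast
import sys

# Hypothetical generic function using the PEP 695 syntax.
GENERIC_SOURCE = "def first[T](items: list[T]) -> T:\n    return items[0]\n"


def type_param_names(source):
    """Return the type parameter names of the first function in `source`.

    Returns None on interpreters older than 3.12, which cannot parse the
    new syntax.
    """
    if sys.version_info < (3, 12):
        return None
    func = ast.parse(source).body[0]
    # The AST field is named type_params (see gh-104656 in this log).
    return [param.name for param in func.type_params]


result = type_param_names(GENERIC_SOURCE)
print(result)
```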
* Trim trailing whitespace and test on CI (#104275) | Hugo van Kemenade | 2023-05-08 | 1 | -1/+1
  Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
* gh-99113: Add Py_MOD_PER_INTERPRETER_GIL_SUPPORTED (gh-104205) | Eric Snow | 2023-05-05 | 1 | -0/+1
  Here we are doing no more than adding the value for Py_mod_multiple_interpreters and using it for stdlib modules. We will start checking for it in gh-104206 (once PyInterpreterState.ceval.own_gil is added in gh-104204).
* gh-104169: Ensure the tokenizer doesn't overwrite previous errors (#104170) | Pablo Galindo Salgado | 2023-05-04 | 1 | -0/+6
* gh-97556: Raise null bytes syntax error upon null in multiline string (GH-104136) | Lysandros Nikolaou | 2023-05-04 | 1 | -1/+8
* gh-104016: Fixed off by 1 error in f string tokenizer (#104047) | jx124 | 2023-05-01 | 2 | -5/+9
  Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
  Co-authored-by: Ken Jin <kenjin@python.org>
  Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
* gh-103824: fix use-after-free error in Parser/tokenizer.c (#103993) | chgnrdv | 2023-05-01 | 1 | -0/+4
* gh-103656: Transfer f-string buffers to parser to avoid use-after-free (GH-103896) | Lysandros Nikolaou | 2023-04-27 | 7 | -60/+127
  Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
* gh-103718: Correctly set f-string buffers in all cases (GH-103815) | Lysandros Nikolaou | 2023-04-25 | 1 | -8/+6
  Turns out we always need to remember/restore fstring buffers in all of the stack of tokenizer modes, cause they might change to `TOK_REGULAR_MODE` and have newlines inside the braces (which is when we need to reallocate the buffer and restore the fstring ones).
* GH-103727: Avoid advancing tokenizer too far in f-string mode (GH-103775) | Lysandros Nikolaou | 2023-04-24 | 1 | -8/+10
* GH-103718: Correctly cache and restore f-string buffers when needed (GH-103719) | Lysandros Nikolaou | 2023-04-23 | 2 | -11/+30
* gh-102310: Change error range for invalid bytes literals (#103663) | Nikita Sobolev | 2023-04-23 | 1 | -1/+2
* gh-102856: Clean some of the PEP 701 tokenizer implementation (#103634) | Pablo Galindo Salgado | 2023-04-19 | 2 | -74/+67
* gh-102856: Initial implementation of PEP 701 (#102855) | Pablo Galindo Salgado | 2023-04-19 | 10 | -3630/+5666
  Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
  Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
  Co-authored-by: Marta Gómez Macías <mgmacias@google.com>
  Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
* GH-102711: Fix warnings found by clang (#102712) | Chenxi Mao | 2023-03-28 | 1 | -2/+2
  Building Python with clang produces these warnings:

    Parser/pegen.c:812:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
    _PyPegen_clear_memo_statistics()
                                  ^
                                   void
    Parser/pegen.c:820:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
    _PyPegen_get_memo_statistics()
                                ^
                                 void

  Fix it to make clang happy.
  Signed-off-by: Chenxi Mao <chenxi.mao@suse.com>
* gh-102255: Improve build support for Windows API partitions (GH-102256) | Max Bachmann | 2023-03-09 | 1 | -3/+5
  Add `MS_WINDOWS_DESKTOP`, `MS_WINDOWS_APPS`, `MS_WINDOWS_SYSTEM` and `MS_WINDOWS_GAMES` preprocessor definitions to allow switching off functionality missing from particular API partitions ("partitions" are used in Windows to identify overlapping subsets of APIs).
  CPython only officially supports `MS_WINDOWS_DESKTOP` and `MS_WINDOWS_SYSTEM` (APPS is included by normal desktop builds, but APPS without DESKTOP is not covered). Other configurations are a convenience for people building their own runtimes.
  `MS_WINDOWS_GAMES` is for the Xbox subset of the Windows API, which is also available on client OS, but is restricted compared to `MS_WINDOWS_DESKTOP`. These restrictions may change over time, as they relate to the build headers rather than the OS support, and so we assume that Xbox builds will use the latest available version of the GDK.
* gh-102416: Do not memoize incorrectly loop rules in the parser (#102467) | Pablo Galindo Salgado | 2023-03-06 | 1 | -216/+0
* gh-100227: Move _str_replace_inf to PyInterpreterState (gh-102333) | Eric Snow | 2023-02-28 | 1 | -3/+1
  https://github.com/python/cpython/issues/100227
* Fix some typos in asdl_c.py (GH-101757) | abel1502 | 2023-02-10 | 1 | -2/+2
* GH-101578: Normalize the current exception (GH-101607) | Mark Shannon | 2023-02-08 | 1 | -9/+6
  * Make sure that the current exception is always normalized.
  * Remove redundant type and traceback fields for the current exception.
  * Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException
  * Add new API functions: PyException_GetArgs, PyException_SetArgs
* gh-100940: Change "char *str" to "const char *str" in KeywordToken: It is an immutable string. (#100936) | Stepfen Shawn | 2023-01-18 | 1 | -1/+1
* gh-101046: Fix a potential memory leak in the parser when raising MemoryError (#101051) | Pablo Galindo Salgado | 2023-01-16 | 1 | -0/+108
* gh-81057: Move the Cached Parser Dummy Name to _PyRuntimeState (#100277) | Eric Snow | 2022-12-16 | 1 | -20/+2
* gh-81057: Move More Globals to _PyRuntimeState (gh-100092) | Eric Snow | 2022-12-07 | 1 | -2/+2
  https://github.com/python/cpython/issues/81057
* gh-90110: Clean Up the C-analyzer Globals Lists (gh-100091) | Eric Snow | 2022-12-07 | 1 | -1/+2
  https://github.com/python/cpython/issues/90110
* gh-100050: Fix an assertion error when raising unclosed parenthesis errors in the tokenizer (GH-100065) | Pablo Galindo Salgado | 2022-12-06 | 1 | -0/+4
  Automerge-Triggered-By: GH:pablogsal
* gh-99891: Fix infinite recursion in the tokenizer when showing warnings (GH-99893) | Pablo Galindo Salgado | 2022-11-30 | 2 | -0/+9
  Automerge-Triggered-By: GH:pablogsal
* gh-90994: Improve error messages upon call arguments syntax errors (GH-96893) | Lysandros Nikolaou | 2022-11-20 | 1 | -1149/+1395
* gh-99581: Fix a buffer overflow in the tokenizer when copying lines that fill the available buffer (#99605) | Pablo Galindo Salgado | 2022-11-20 | 1 | -1/+6
* gh-99211: Point to except/except* on syntax errors when mixing them (GH-99215) | Lysandros Nikolaou | 2022-11-20 | 1 | -671/+713
  Automerge-Triggered-By: GH:lysnikolaou
* gh-81057: Move Globals in Core Code to _PyRuntimeState (gh-99496) | Eric Snow | 2022-11-15 | 1 | -0/+1
  This is the first of several changes to consolidate non-object globals in core code.
  https://github.com/python/cpython/issues/81057
* gh-99300: Use Py_NewRef() in Python/Python-ast.c (#99499) | Victor Stinner | 2022-11-15 | 1 | -4/+5
  Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in Python/Python-ast.c.
  Update Parser/asdl_c.py to regenerate code.
* gh-81057: Move Global Variables Holding Objects to _PyRuntimeState. (gh-99487) | Eric Snow | 2022-11-14 | 1 | -0/+4
  This moves nearly all remaining object-holding globals in core code (other than static types).
  https://github.com/python/cpython/issues/81057
* gh-99300: Use Py_NewRef() in Parser/ directory (#99330) | Victor Stinner | 2022-11-10 | 3 | -9/+4
  Replace Py_INCREF() with Py_NewRef() in C files of the Parser/ directory and in the PEG generator.
* gh-99300: Use Py_NewRef() in Python/ directory (#99317) | Victor Stinner | 2022-11-10 | 1 | -8/+7
  Replace Py_INCREF() and Py_XINCREF() with Py_NewRef() and Py_XNewRef() in C files of the Python/ directory.
  Update Parser/asdl_c.py to regenerate Python/Python-ast.c.
* gh-99153: set location on SyntaxError for try with both except and except* (GH-99160) | Irit Katriel | 2022-11-06 | 1 | -3/+3
* gh-98401: Invalid escape sequences emits SyntaxWarning (#99011) | Victor Stinner | 2022-11-03 | 1 | -2/+9
  A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning, instead of DeprecationWarning. For example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an invalid escape sequence), use raw strings for regular expression: re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will eventually be raised, instead of SyntaxWarning.
  Octal escapes with value larger than 0o377 (ex: "\477"), deprecated in Python 3.11, now produce a SyntaxWarning, instead of DeprecationWarning. In a future Python version they will be eventually a SyntaxError.
  codecs.escape_decode() and codecs.unicode_escape_decode() are left unchanged: they still emit DeprecationWarning.
  * The parser only emits SyntaxWarning for Python 3.12 (feature version), and still emits DeprecationWarning on older Python versions.
  * Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and wasm_build.py.
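The warning behaviour described in the commit above can be checked by compiling a small snippet and recording what the compiler emits. This is a sketch under assumptions: the helper name `escape_warning_categories` and the sample sources are invented for illustration. As the commit message states, the category is SyntaxWarning on Python 3.12 and DeprecationWarning on older versions, so the example reports the category instead of assuming one.

```python
import warnings

# The text handed to compile() contains the literal "\d+\.\d+" (non-raw),
# which holds invalid escape sequences.
BAD_SOURCE = 'pattern = "\\d+\\.\\d+"\n'
# Same pattern written as a raw string, as the commit message recommends.
RAW_SOURCE = 'pattern = r"\\d+\\.\\d+"\n'


def escape_warning_categories(source):
    """Compile `source` and return the categories of all warnings emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # do not let default filters hide anything
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


bad_categories = escape_warning_categories(BAD_SOURCE)
raw_categories = escape_warning_categories(RAW_SOURCE)
print(bad_categories, raw_categories)
```

The raw-string variant compiles without any warning, which is why the commit itself switched Tools/c-analyzer/ and wasm_build.py to raw strings.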
* gh-98931: Improve error message when the user types 'import x from y' instead of 'from y import x' (#98932) | Pablo Galindo Salgado | 2022-11-01 | 1 | -391/+465
* gh-97669: Create Tools/build/ directory (#97963) | Victor Stinner | 2022-10-17 | 1 | -1/+1
  Create Tools/build/ directory. Move the following scripts from Tools/scripts/ to Tools/build/:
  * check_extension_modules.py
  * deepfreeze.py
  * freeze_modules.py
  * generate_global_objects.py
  * generate_levenshtein_examples.py
  * generate_opcode_h.py
  * generate_re_casefix.py
  * generate_sre_constants.py
  * generate_stdlib_module_names.py
  * generate_token.py
  * parse_html5_entities.py
  * smelly.py
  * stable_abi.py
  * umarshal.py
  * update_file.py
  * verify_ensurepip_wheels.py
  Update references to these scripts.
* gh-97997: Add col_offset field to tokenizer and use that for AST nodes (#98000) | Lysandros Nikolaou | 2022-10-07 | 2 | -11/+43
* gh-97973: Return all necessary information from the tokenizer (GH-97984) | Lysandros Nikolaou | 2022-10-06 | 4 | -137/+150
  Right now, the tokenizer only returns type and two pointers to the start and end of the token. This PR modifies the tokenizer to return the type and set all of the necessary information, so that the parser does not have to do this.