path: root/Parser/string_parser.c
Commit message | Author | Age | Files | Lines
* gh-111380: Show SyntaxWarnings only once when parsing if invalid syntax is encountered (#111381) | Pablo Galindo Salgado | 2023-10-27 | 1 | -0/+5
* gh-104169: Fix test_peg_generator after tokenizer refactoring (#110727) | Lysandros Nikolaou | 2023-10-12 | 1 | -2/+3
    * Fix test_peg_generator after tokenizer refactoring
    * Remove references to tokenizer.c in comments etc.
* gh-104169: Refactor tokenizer into lexer and wrappers (#110684) | Lysandros Nikolaou | 2023-10-11 | 1 | -1/+1
    * The lexer, which includes the actual lexeme-producing logic, goes into the `lexer` directory.
    * The wrappers, one wrapper per input mode (file, string, utf-8, and readline), go into the `tokenizer` directory and include logic for creating a lexer instance and managing the buffer for different modes.

    ---------

    Co-authored-by: Pablo Galindo <pablogsal@gmail.com>
    Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* gh-106320: Remove _PyBytes_Join() C API (#107144) | Victor Stinner | 2023-07-23 | 1 | -0/+1
    Move private _PyBytes functions to the internal C API (pycore_bytesobject.h):

    * _PyBytes_DecodeEscape()
    * _PyBytes_FormatEx()
    * _PyBytes_FromHex()
    * _PyBytes_Join()

    No longer export these functions.
* gh-106320: Remove private _PyUnicode codecs C API functions (#106385) | Victor Stinner | 2023-07-04 | 1 | -0/+1
    Remove private _PyUnicode codecs C API functions: move them to the internal C API (pycore_unicodeobject.h). No longer export most of these functions.
* gh-105938: Emit a SyntaxWarning for escaped braces in an f-string (#105939) | Lysandros Nikolaou | 2023-06-20 | 1 | -1/+6
* gh-102310: Change error range for invalid bytes literals (#103663) | Nikita Sobolev | 2023-04-23 | 1 | -1/+2
* gh-102856: Initial implementation of PEP 701 (#102855) | Pablo Galindo Salgado | 2023-04-19 | 1 | -1054/+35
    Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com>
    Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
    Co-authored-by: Marta Gómez Macías <mgmacias@google.com>
    Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>
* gh-99300: Use Py_NewRef() in Parser/ directory (#99330) | Victor Stinner | 2022-11-10 | 1 | -3/+1
    Replace Py_INCREF() with Py_NewRef() in C files of the Parser/ directory and in the PEG generator.
* gh-98401: Invalid escape sequences emit SyntaxWarning (#99011) | Victor Stinner | 2022-11-03 | 1 | -2/+9
    A backslash-character pair that is not a valid escape sequence now generates a SyntaxWarning instead of a DeprecationWarning. For example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an invalid escape sequence); use raw strings for regular expressions: re.compile(r"\d+\.\d+"). In a future Python version, SyntaxError will eventually be raised instead of SyntaxWarning.

    Octal escapes with a value larger than 0o377 (ex: "\477"), deprecated in Python 3.11, now produce a SyntaxWarning instead of a DeprecationWarning. In a future Python version they will eventually be a SyntaxError.

    codecs.escape_decode() and codecs.unicode_escape_decode() are left unchanged: they still emit DeprecationWarning.

    * The parser only emits SyntaxWarning for Python 3.12 (feature version), and still emits DeprecationWarning on older Python versions.
    * Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and wasm_build.py.
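The warning behavior described in this commit can be observed from Python itself. The following is a minimal sketch, not code from the commit: `escape_warnings` is a hypothetical helper name, and the filename `<example>` is arbitrary. It compiles a small source string and records which warning categories are raised. Per the commit message, the category should be SyntaxWarning on Python 3.12 and later and DeprecationWarning on older versions.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the warning categories recorded while parsing it.

    Hypothetical helper for illustration; not part of CPython.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


# The compiled source text is: pattern = "\d+\.\d+"   (invalid escape sequences)
bad = escape_warnings('pattern = "\\d+\\.\\d+"')

# The compiled source text is: pattern = r"\d+\.\d+"  (raw string, nothing to warn about)
good = escape_warnings('pattern = r"\\d+\\.\\d+"')

print("non-raw string:", [c.__name__ for c in bad])
print("raw string:    ", [c.__name__ for c in good])
```

Recording with the "always" filter avoids depending on the interpreter's default warning filters, which may hide DeprecationWarning on older versions.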
* gh-94869: Fix the location in some expressions for multi-line f-string ast nodes (#94895) | Pablo Galindo Salgado | 2022-07-16 | 1 | -1/+4
* gh-93418: Fix an assert when an f-string expression is followed by an '=', but no closing brace. (gh-93419) | Eric V. Smith | 2022-06-01 | 1 | -1/+3
* gh-93283: Improve error message for f-string with invalid conversion character (GH-93349) | Serhiy Storchaka | 2022-05-31 | 1 | -12/+28
* gh-81548: Deprecate octal escape sequences with value larger than 0o377 (GH-91668) | Serhiy Storchaka | 2022-04-30 | 1 | -7/+18
* bpo-47129: Add more informative messages to f-string syntax errors (32127) | Maciej Górski | 2022-03-28 | 1 | -0/+5
    * Add more informative messages to f-string syntax errors
    * 📜🤖 Added by blurb_it.
    * Fix whitespaces
    * Change error message
    * Remove the 'else' statement (as suggested in review)

    Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
* bpo-46762: Fix an assert failure in f-strings where > or < is the last character if the f-string is missing a trailing right brace. (#31365) | Eric V. Smith | 2022-02-16 | 1 | -10/+10
* bpo-46503: Prevent an assert from firing when parsing some invalid \N sequences in f-strings. (GH-30865) | Eric V. Smith | 2022-01-25 | 1 | -2/+14
    * bpo-46503: Prevent an assert from firing. Also fix one nearby tiny PEP-7 nit.
    * Added blurb.
* bpo-46237: Fix the line number of tokenizer errors inside f-strings (GH-30463) | Pablo Galindo Salgado | 2022-01-08 | 1 | -1/+4
* bpo-45461: Fix IncrementalDecoder and StreamReader in the "unicode-escape" codec (GH-28939) | Serhiy Storchaka | 2021-10-14 | 1 | -1/+1
    They now support splitting escape sequences between input chunks.

    Add the third parameter "final" in codecs.unicode_escape_decode(). It is True by default to match the former behavior.
* bpo-45434: Mark the PyTokenizer C API as private (GH-28924) | Victor Stinner | 2021-10-13 | 1 | -2/+2
    Rename PyTokenizer functions to mark them as private:

    * PyTokenizer_FindEncodingFilename() => _PyTokenizer_FindEncodingFilename()
    * PyTokenizer_FromString() => _PyTokenizer_FromString()
    * PyTokenizer_FromFile() => _PyTokenizer_FromFile()
    * PyTokenizer_FromUTF8() => _PyTokenizer_FromUTF8()
    * PyTokenizer_Free() => _PyTokenizer_Free()
    * PyTokenizer_Get() => _PyTokenizer_Get()

    Remove the unused PyTokenizer_FindEncoding() function.

    import.c: remove unused #include "errcode.h".
* Optimized code format (GH-28599) | Rajendra arora | 2021-09-28 | 1 | -3/+1
    Automerge-Triggered-By: GH:pablogsal
* bpo-44885: Correct the ast locations of f-strings with format specs and repeated expressions (GH-27729) | Pablo Galindo Salgado | 2021-08-12 | 1 | -38/+30
* Add more const modifiers. (GH-26691) | Serhiy Storchaka | 2021-06-12 | 1 | -5/+5
* fix: use unambiguous punctuation in 'invalid escape sequence' message (GH-26582) | Ned Batchelder | 2021-06-08 | 1 | -2/+2
* bpo-43244: Rename pycore_ast.h functions to _PyAST_xxx() (GH-25252) | Victor Stinner | 2021-04-07 | 1 | -8/+11
    Rename AST functions of pycore_ast.h to use the "_PyAST_" prefix. Remove macros creating aliases without prefix. For example, Module() becomes _PyAST_Module(). Update Grammar/python.gram to use _PyAST_xxx() functions.
* bpo-43244: Remove the pyarena.h header (GH-25007) | Victor Stinner | 2021-03-24 | 1 | -1/+1
    Remove the pyarena.h header file with functions:

    * PyArena_New()
    * PyArena_Free()
    * PyArena_Malloc()
    * PyArena_AddPyObject()

    These functions were undocumented, excluded from the limited C API, and were only used internally by the compiler.

    Add pycore_pyarena.h header. Rename functions:

    * PyArena_New() => _PyArena_New()
    * PyArena_Free() => _PyArena_Free()
    * PyArena_Malloc() => _PyArena_Malloc()
    * PyArena_AddPyObject() => _PyArena_AddPyObject()
* Remove full stop from a bytes-related SyntaxError message (GH-24300) | numbermaniac | 2021-01-23 | 1 | -1/+1
* bpo-42806: Fix ast locations of f-strings inside parentheses (GH-24067) | Pablo Galindo | 2021-01-03 | 1 | -1/+1
* bpo-42519: Replace PyMem_MALLOC() with PyMem_Malloc() (GH-23586) | Victor Stinner | 2020-12-01 | 1 | -1/+1
    No longer use deprecated aliases to functions:

    * Replace PyMem_MALLOC() with PyMem_Malloc()
    * Replace PyMem_REALLOC() with PyMem_Realloc()
    * Replace PyMem_FREE() with PyMem_Free()
    * Replace PyMem_Del() with PyMem_Free()
    * Replace PyMem_DEL() with PyMem_Free()

    Also modify the PyMem_DEL() macro to use PyMem_Free() directly.
* bpo-40998: Address compiler warnings found by ubsan (GH-20929) | Christian Heimes | 2020-11-18 | 1 | -0/+3
    Signed-off-by: Christian Heimes <christian@python.org>
    Automerge-Triggered-By: GH:tiran
* bpo-41746: Add type information to asdl_seq objects (GH-22223) | Pablo Galindo | 2020-09-16 | 1 | -4/+4
    * Add new capability to the PEG parser to type variable assignments. For instance:
      ```
      | a[asdl_stmt_seq*]=';'.small_stmt+ [';'] NEWLINE { a }
      ```
    * Add new sequence types from the asdl definition (automatically generated)
    * Make `asdl_seq` type a generic aliasing pointer type.
    * Create a new `asdl_generic_seq` for the generic case using `void*`.
    * The old `asdl_seq_GET`/`asdl_seq_SET` macros now are typed.
    * New `asdl_seq_GET_UNTYPED`/`asdl_seq_SET_UNTYPED` macros for dealing with generic sequences.
    * Changes all possible `asdl_seq` types to use specific versions everywhere.
* Fix trivial typo in the PEG string parser (GH-21508) | Eric V. Smith | 2020-07-16 | 1 | -1/+1
* Fix possibly-uninitialized warning in string_parser.c. (GH-21503) | Benjamin Peterson | 2020-07-16 | 1 | -15/+16
    GCC says
    ```
    ../cpython/Parser/string_parser.c: In function ‘fstring_find_expr’:
    ../cpython/Parser/string_parser.c:404:93: warning: ‘cols’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      404 | p2->starting_col_offset = p->tok->first_lineno == p->tok->lineno ? t->col_offset + cols : cols;
          | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
    ../cpython/Parser/string_parser.c:384:16: note: ‘cols’ was declared here
      384 | int lines, cols;
          | ^~~~
    ../cpython/Parser/string_parser.c:403:45: warning: ‘lines’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      403 | p2->starting_lineno = t->lineno + lines - 1;
          | ~~~~~~~~~~~~~~~~~~^~~
    ../cpython/Parser/string_parser.c:384:9: note: ‘lines’ was declared here
      384 | int lines, cols;
          | ^~~~~
    ```
    and, indeed, if `PyBytes_AsString` somehow fails, lines & cols will not be initialized.
* bpo-41076: Pre-feed the parser with the f-string expression location (GH-21054) | Lysandros Nikolaou | 2020-06-27 | 1 | -242/+22
    This commit changes the parsing of f-string expressions with the new parser. The parser gets pre-fed with the location of the expression itself (not the f-string, which was what we were doing before). This allows us to completely skip the shifting of the AST nodes after the parsing is completed.
* bpo-41132: Use pymalloc allocator in the f-string parser (GH-21173) | Lysandros Nikolaou | 2020-06-27 | 1 | -7/+7
* Remove old comment in string_parser.c (GH-20906) | Pablo Galindo | 2020-06-16 | 1 | -5/+0
* Improve readability and style in parser files (GH-20884) | Pablo Galindo | 2020-06-15 | 1 | -98/+130
* bpo-40939: Remove the old parser (GH-20768) | Pablo Galindo | 2020-06-11 | 1 | -0/+1421
    This commit removes the old parser, the deprecated parser module, the old parser compatibility flags and environment variables and all associated support code and documentation.