path: root/Parser/tokenizer.c
Commit message | Author | Age | Files | Lines
* bpo-36459: Fix a possible double PyMem_FREE() due to tokenizer.c's tok_nextc() (12601) | Miss Islington (bot) | 2019-03-28 | 1 | -1/+0
    Remove the PyMem_FREE() call added in cb90c89. The buffer will be freed when PyTokenizer_Free() is called on the tokenizer state.
    (cherry picked from commit cda139d1ded6708665b53e4ed32ccc1d2627e1da)
    Co-authored-by: Zackery Spytz <zspytz@gmail.com>
* bpo-36367: Free buffer if realloc fails in tokenize.c (GH-12442) (GH-12470) | Victor Stinner | 2019-03-20 | 1 | -2/+8
    (cherry picked from commit cb90c89de14aab636739b3e810cf949e47b54a0c)
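Read together, bpo-36367 and bpo-36459 turn on one ownership rule: when a realloc fails, the original buffer is still valid and still owned by the tokenizer state, so it should be released exactly once, when the state itself is freed. The following is a minimal sketch of that rule and not CPython's code. The names `demo_state`, `demo_grow` and `demo_free` are hypothetical, and plain `realloc`/`free` stand in for the PyMem_* calls.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the tokenizer state; not CPython's struct. */
typedef struct {
    char *buf;
    size_t size;
} demo_state;

/* Grow the buffer. On failure the old buffer is left untouched and is
   still owned by the state, so it must NOT be freed here: freeing it
   now and again in demo_free() would be a double free. */
static int demo_grow(demo_state *st, size_t new_size)
{
    char *newbuf;
    assert(st != NULL);
    newbuf = realloc(st->buf, new_size);
    if (newbuf == NULL) {
        return -1;          /* st->buf is still valid */
    }
    st->buf = newbuf;
    st->size = new_size;
    return 0;
}

/* Single point of release for the buffer (assumed to play the role
   that PyTokenizer_Free() plays for the real tokenizer state). */
static void demo_free(demo_state *st)
{
    assert(st != NULL);
    free(st->buf);
    st->buf = NULL;
    st->size = 0;
}
```

Keeping the release in one place is the design choice the later commit settles on: the error path only reports failure and leaves cleanup to the owner.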
* bpo-25083: Python can sometimes create incorrect .pyc files (GH-8449) | tzickel | 2018-09-10 | 1 | -0/+5
    Python 2 never checked for I/O errors when reading .py files and thus could mistake an I/O error for EOF and create incorrect .pyc files. This adds a check for this and aborts on an error.
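The distinction this commit describes can be sketched in standard C: `fgets` returns NULL both at end of file and on a read error, and only `ferror` tells the two apart. This is an illustration under assumptions, not the actual patch; `read_status` and `read_line_checked` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

typedef enum { READ_OK, READ_EOF, READ_ERROR } read_status;

/* Read one line and report whether a NULL result from fgets() means
   a clean end of file or an I/O error. */
static read_status read_line_checked(FILE *fp, char *buf, int size)
{
    assert(fp != NULL && buf != NULL && size > 0);
    if (fgets(buf, size, fp) != NULL) {
        return READ_OK;
    }
    /* NULL alone is ambiguous; the stream's error flag disambiguates. */
    return ferror(fp) ? READ_ERROR : READ_EOF;
}
```

A caller that treats `READ_ERROR` like `READ_EOF` would compile a truncated source file as if it were complete, which is the failure mode the commit message names.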
* bpo-33645: Fix an "unknown parsing error" in the parser. (GH-7119) | Serhiy Storchaka | 2018-05-31 | 1 | -0/+2
    It is reproduced when parsing the "<>" operator and running Python with both the -3 and -We options.
* properly handle the single null-byte file (closes #24022) | Benjamin Peterson | 2016-09-19 | 1 | -1/+1
* Issue #25388: Fixed tokenizer hang when processing undecodable source code with a null byte. | Serhiy Storchaka | 2015-11-14 | 1 | -3/+6
* add missing NULL checks to get_coding_spec (closes #24854) | Benjamin Peterson | 2015-08-14 | 1 | -1/+4
* Issue #22221: Backported fixes from Python 3 (issue #18960). | Serhiy Storchaka | 2014-09-05 | 1 | -3/+17
    * Now the source encoding declaration on the second line isn't effective if the first line contains anything except a comment. This affects compile(), eval() and exec() too.
    * IDLE now ignores the source encoding declaration on the second line if the first line contains anything except a comment.
    * 2to3 and the findnocoding.py script now ignore the source encoding declaration on the second line if the first line contains anything except a comment.
* Issue #21789: fix broken link (reported by Jan Varho) | Ned Deily | 2014-06-17 | 1 | -1/+1
* allow the keyword else immediately after (no space) an integer (closes #21642) | Benjamin Peterson | 2014-06-07 | 1 | -5/+14
* complain if the codec doesn't return unicode | Benjamin Peterson | 2013-12-28 | 1 | -0/+6
* Issue #18038: SyntaxError raised during compilation of sources with an illegal encoding now always contains an encoding name. | Serhiy Storchaka | 2013-06-09 | 1 | -7/+7
* Issue #9020: The Py_IS* macros from pyctype.h should generally only be used with signed/unsigned char arguments. | Stefan Krah | 2010-06-24 | 1 | -1/+1
    For integer arguments, EOF has to be handled separately.
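The rule in this commit message can be sketched as follows: test for EOF first, and only then hand the value to a character-class check as an unsigned char. The sketch uses the standard `isalpha` as a stand-in for the Py_IS* macros, and `demo_is_ident_start` is a hypothetical name, so this shows the pattern and not the code that was changed.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Decide whether c, as returned by a getc()-style reader, can start
   an identifier. c is either EOF or a value in 0..UCHAR_MAX. */
static int demo_is_ident_start(int c)
{
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    if (c == EOF) {
        return 0;           /* EOF is not a character; handle it first */
    }
    /* Only genuine character values reach the classification call. */
    return isalpha((unsigned char)c) || c == '_';
}
```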
* Untabify C files. Will watch buildbots. | Antoine Pitrou | 2010-05-09 | 1 | -1370/+1370
* use our own locale independent ctype macros | Benjamin Peterson | 2010-04-03 | 1 | -19/+3
    requires building pyctype.o into pgen
* ensure that the locale does not affect the tokenization of identifiers | Benjamin Peterson | 2010-04-03 | 1 | -4/+18
* Issue #3137: Don't ignore errors at startup, especially a keyboard interrupt (SIGINT). | Victor Stinner | 2010-03-10 | 1 | -1/+5
    If an error occurs while importing the site module, the error is printed and Python exits. Initialize the GIL before importing the site module.
* Issue #7820: The parser tokenizer restores all bytes in the right if the BOM check fails. | Victor Stinner | 2010-03-02 | 1 | -22/+32
    Fix an assertion in pydebug mode.
* rewrite translate_newlines for clarity | Benjamin Peterson | 2009-12-06 | 1 | -12/+11
* fix several compile() issues by translating newlines in the tokenizer | Benjamin Peterson | 2009-11-12 | 1 | -16/+66
* spelling | Benjamin Peterson | 2009-11-07 | 1 | -1/+1
* fix some coding style | Benjamin Peterson | 2009-10-09 | 1 | -13/+30
* don't mask encoding errors when decoding a string #6289 | Benjamin Peterson | 2009-06-16 | 1 | -4/+1
* #3367: revert rev. 65539: this change causes test_parser to fail | Andrew M. Kuchling | 2008-08-05 | 1 | -1/+1
* #3367 from Kristjan Valur Jonsson: | Andrew M. Kuchling | 2008-08-05 | 1 | -1/+1
    If PyTokenizer_FromString() is called with an empty string, the tokenizer's line_start member never gets initialized. Later, it is compared with the token pointer 'a' in parsetok.c:193, and that comparison can result in undefined behavior.
* This reverts r63675 based on the discussion in this thread: | Gregory P. Smith | 2008-06-09 | 1 | -16/+16
    http://mail.python.org/pipermail/python-dev/2008-June/079988.html
    Python 2.6 should stick with PyString_* in its codebase. The PyBytes_* names in the spirit of 3.0 are available via a #define only. See the email thread.
* Renamed PyString to PyBytes | Christian Heimes | 2008-05-26 | 1 | -16/+16
* Issue2681: the literal 0o8 was wrongly accepted, and evaluated as float(0.0). | Amaury Forgeot d'Arc | 2008-04-24 | 1 | -1/+1
    This happened only when 8 is the first digit. Credits go to Lukas Meuser.
* Revert r61969 which added casts to Py_CHARMASK to avoid compiler warnings. | Neal Norwitz | 2008-03-28 | 1 | -8/+0
    Rather than sprinkle casts throughout the code, change Py_CHARMASK to always cast its result to an unsigned char. This should ensure we do the right thing when accessing an array with the result.
* Make Py3k warnings consistent w.r.t. punctuation; also respect the EOL 80 limit and supply more alternatives in warning messages. | Georg Brandl | 2008-03-25 | 1 | -1/+1
* Finished backporting PEP 3127, Integer Literal Support and Syntax. | Eric Smith | 2008-03-17 | 1 | -1/+25
    Added 0b and 0o literals to tokenizer.
    Modified PyOS_strtoul to support 0b and 0o inputs.
    Modified PyLong_FromString to support guessing 0b and 0o inputs.
    Renamed test_hexoct.py to test_int_literal.py and added binary tests.
    Added upper and lower case 0b, 0O, and 0X tests to test_int_literal.py
* Add assertion that we do not blow out newl | Neal Norwitz | 2008-01-27 | 1 | -0/+1
* Fixed bug #1915: Python compiles with --enable-unicode=no again. However several extension methods and modules do not work without unicode support. | Christian Heimes | 2008-01-23 | 1 | -2/+1
* Add a "const" to make gcc happy. | Georg Brandl | 2008-01-21 | 1 | -1/+1
* Issue #1882: when compiling code from a string, encoding cookies in the second line of code were not always recognized correctly. | Georg Brandl | 2008-01-21 | 1 | -2/+13
* Fix #1679: "0x" was taken as a valid integer literal. | Georg Brandl | 2008-01-19 | 1 | -0/+7
    Fixes the tokenizer, tokenize.py and int() to reject this. Patches by Malte Helmert.
* Added bytes and b'' as aliases for str and '' | Christian Heimes | 2008-01-18 | 1 | -0/+8
* Fix #define ordering. | Georg Brandl | 2008-01-07 | 1 | -3/+2
* Make Python compile with --disable-unicode. | Georg Brandl | 2008-01-07 | 1 | -0/+2
* Warning "<> not supported in 3.x" should be enabled only when the -3 option is set. | Amaury Forgeot d'Arc | 2007-11-24 | 1 | -1/+1
* Fixed problems in the last commit. Filenames and line numbers weren't reported correctly. | Christian Heimes | 2007-11-23 | 1 | -9/+11
    Backquotes still don't report the correct file. The AST nodes only contain the line number but not the file name.
* Applied patch #1754273 and #1754271 from Thomas Glee | Christian Heimes | 2007-11-23 | 1 | -1/+10
    The patches add deprecation warnings for back ticks and <>
* Change a PyErr_Print() into a PyErr_Clear(), per discussion in issue 1031213. | Guido van Rossum | 2007-10-15 | 1 | -1/+1
* Patch #1031213: Decode source line in SyntaxErrors back to its original source encoding. Will backport to 2.5. | Martin v. Löwis | 2007-09-04 | 1 | -0/+62
* Comment grammar | Andrew M. Kuchling | 2006-10-06 | 1 | -1/+1
* Don't truncate if size_t is bigger than uint | Neal Norwitz | 2006-06-12 | 1 | -1/+1
* Patch #1357836: | Neal Norwitz | 2006-06-02 | 1 | -9/+11
    Prevent an invalid memory read from test_coding in case the done flag is set. In that case, the loop isn't entered. I wonder if rather than setting the done flag in the cases before the loop, if they should just exit early.
    This code looks like it should be refactored.
    Backport candidate (also the early break above if decoding_fgets fails)
* C++ compiler cleanup: cast signed to unsigned | Skip Montanaro | 2006-04-18 | 1 | -1/+1
* As discussed on python-dev, really fix the PyMem_*/PyObject_* memory API mismatches. | Neal Norwitz | 2006-04-11 | 1 | -22/+22
    At least I hope this fixes them all.
    This reverts part of my change from yesterday that converted everything in Parser/*.c to use PyObject_* API. The encoding doesn't really need to use PyMem_*, however, it uses new_string() which must return PyMem_* for handling the result of PyOS_Readline() which returns PyMem_* memory.
    If there were 2 versions of new_string(), one that returned PyMem_* for tokens and one that returned PyObject_* for encodings, that could also fix this problem. I'm not sure which version would be clearer.
    This seems to fix both Guido's and Phillip's problems, so it's good enough for now. After this change, it would be good to review Parser/*.c for consistent use of the 2 memory APIs.
* Fix the code in Parser/ to also compile with C++. | Anthony Baxter | 2006-04-11 | 1 | -12/+13
    This was mostly casts for malloc/realloc type functions, as well as renaming one variable called 'new' in tokenizer.c. Still lots more to be done, going to be checking in one chunk at a time or the patch will be massively huge. Still compiles ok with gcc.