path: root/Parser/tokenizer.c
Commit message | Author | Age | Files | Lines
* Don't truncate if size_t is bigger than uint | Neal Norwitz | 2006-06-12 | 1 | -1/+1
* Patch #1357836: | Neal Norwitz | 2006-06-02 | 1 | -9/+11
  Prevent an invalid memory read from test_coding in case the done flag is set. In that case, the loop isn't entered.
  I wonder whether, rather than setting the done flag in the cases before the loop, they should just exit early. This code looks like it should be refactored.
  Backport candidate (also the early break above if decoding_fgets fails).
* C++ compiler cleanup: cast signed to unsigned | Skip Montanaro | 2006-04-18 | 1 | -1/+1
* As discussed on python-dev, really fix the PyMem_*/PyObject_* memory API mismatches. | Neal Norwitz | 2006-04-11 | 1 | -22/+22
  At least I hope this fixes them all.
  This reverts part of my change from yesterday that converted everything in Parser/*.c to use the PyObject_* API. The encoding doesn't really need to use PyMem_*; however, it uses new_string(), which must return PyMem_* memory for handling the result of PyOS_Readline(), which returns PyMem_* memory. If there were 2 versions of new_string(), one that returned PyMem_* for tokens and one that returned PyObject_* for encodings, that could also fix this problem. I'm not sure which version would be clearer.
  This seems to fix both Guido's and Phillip's problems, so it's good enough for now. After this change, it would be good to review Parser/*.c for consistent use of the 2 memory APIs.
* Fix the code in Parser/ to also compile with C++. | Anthony Baxter | 2006-04-11 | 1 | -12/+13
  This was mostly casts for malloc/realloc type functions, as well as renaming one variable called 'new' in tokenizer.c. Still lots more to be done, going to be checking in one chunk at a time or the patch will be massively huge. Still compiles ok with gcc.
* SF patch #1467512, fix double free with triple quoted string in standard build. | Neal Norwitz | 2006-04-10 | 1 | -6/+6
  This was the result of inconsistent use of PyMem_* and PyObject_* allocators. By changing to use PyObject_* allocator almost everywhere, this removes the inconsistency.
* Years in the making. | Tim Peters | 2006-03-26 | 1 | -39/+44
  objimpl.h, pymem.h: Stop mapping PyMem_{Del, DEL} and PyMem_{Free, FREE} to PyObject_{Free, FREE} in a release build. They're aliases for the system free() now.
  _subprocess.c/sp_handle_dealloc(): Since the memory was originally obtained via PyObject_NEW, it must be released via PyObject_FREE (or _DEL).
  pythonrun.c, tokenizer.c, parsermodule.c: I lost count of the number of PyObject vs PyMem mismatches in these -- it's like the specific function called at each site was picked at random, sometimes even with memory obtained via PyMem getting released via PyObject. Changed most to use PyObject uniformly, since the blobs allocated are predictably small in most cases, and obmalloc is generally faster than system mallocs then.
  If extension modules in real life prove as sloppy as Python's front end, we'll have to revert the objimpl.h + pymem.h part of this patch. Note that no problems will show up in a debug build (all calls still go thru obmalloc then). Problems will show up only in a release build, most likely segfaults.
* Use macro versions instead of function versions when we already know the type. | Neal Norwitz | 2006-03-20 | 1 | -1/+3
  This will hopefully get rid of some Coverity warnings, be a hint to developers, and be marginally faster.
  Some asserts were added when the type is currently known, but depends on values from another function.
* Fix crashing bug in tokenizer, when tokenizing files with non-ASCII bytes but without a specified encoding | Thomas Wouters | 2006-03-02 | 1 | -0/+5
  decoding_fgets() (and decoding_feof()) can return NULL and fiddle with the 'tok' struct, making tok->buf NULL. This is okay in the other cases of calls to decoding_*(), it seems, but not in this one.
  This should get a test added, somewhere, but the testsuite doesn't seem to test encoding anywhere (although plenty of tests use it.)
  It seems to me that decoding errors in other places in the code (like at the start of a token, instead of in the middle of one) make the code end up adding small integers to NULL pointers, but happen to check for error states before using the calculated new pointers. I haven't been able to trigger any other crashes, in any case.
  I would nominate this file for a complete rewrite for Py3k. The whole decoding trick is too bolted-on for my tastes.
* Patch #1440601: Add col_offset attribute to AST nodes. | Martin v. Löwis | 2006-03-01 | 1 | -0/+5
* Change non-ASCII warning into a SyntaxError. | Martin v. Löwis | 2006-02-28 | 1 | -10/+6
* Use Py_ssize_t to count the length. | Martin v. Löwis | 2006-02-16 | 1 | -1/+1
* Merge ssize_t branch. | Martin v. Löwis | 2006-02-15 | 1 | -10/+10
* Fix SF bug #1072182, problems with signed characters. | Neal Norwitz | 2005-12-19 | 1 | -1/+1
  Most of these can be backported.
* Fix Bug #1378022, UTF-8 files with a leading BOM crashed the interpreter. | Neal Norwitz | 2005-12-18 | 1 | -0/+6
  Needs backport.
* Fix some more memory leaks. | Neal Norwitz | 2005-11-16 | 1 | -6/+11
  Call error_ret() in decode_str(). It was called in some other places, but seemed inconsistent.
  It is safe to call PyTokenizer_Free() after calling error_ret().
* Free coding spec (cs) if there was an error to prevent mem leak. | Neal Norwitz | 2005-10-21 | 1 | -0/+3
  Maybe backport candidate.
* - Fix segfault with invalid coding. | Neal Norwitz | 2005-10-02 | 1 | -1/+4
  - SF Bug #772896, unknown encoding results in MemoryError, which is not helpful
  I will only backport the segfault fix. I'll let Anthony decide if he wants the other changes backported. I will do the backport if asked.
* Apply SF patch #1101726: Fix buffer overrun in tokenizer.c when a source file with a PEP 263 encoding declaration results in a long decoded line. | Walter Dörwald | 2005-07-12 | 1 | -27/+45
* Patch #802188: better parser error message for non-EOL following line cont. | Martin v. Löwis | 2005-03-03 | 1 | -1/+1
* SF #941229: Decode source code with sys.stdin.encoding in interactive modes like non-interactive modes. | Hye-Shik Chang | 2004-08-04 | 1 | -0/+61
  This allows for non-latin-1 users to write unicode strings directly and sets Japanese users free from weird manual escaping <wink> in shift_jis environments. (Reviewed by Martin v. Loewis)
* PEP-0318, @decorator-style. In Guido's words: | Anthony Baxter | 2004-08-02 | 1 | -0/+2
  "@ seems the syntax that everybody can hate equally"
  Implementation by Mark Russell, from SF #979728.
* Getting rid of all the code inside #ifdef macintosh too. | Jack Jansen | 2003-11-20 | 1 | -11/+0
* Add URL for PEP to the source code encoding warning. | Marc-André Lemburg | 2003-02-17 | 1 | -6/+12
  Remove the usage of PyErr_WarnExplicit() since this could cause sensitive information from the source files to appear in e.g. log files.
* Patch 680474 that fixes bug 679880: compile/eval/exec refused UTF-8 BOM mark. | Just van Rossum | 2003-02-09 | 1 | -2/+2
  Added unit test.
* Fix [ 665014 ] files with long lines and an encoding crash. | Mark Hammond | 2003-01-14 | 1 | -1/+2
  Ensure that the 'size' arg is correctly passed to the encoding reader to prevent buffer overflows.
* Constify filenames and scripts. Fixes #651362. | Martin v. Löwis | 2002-12-11 | 1 | -3/+5
* Fix compiler warning on HP-UX. | Neal Norwitz | 2002-11-02 | 1 | -2/+2
  Cast param to isalnum() to int.
* Patch #512981: Update readline input stream on sys.stdin/out change. | Martin v. Löwis | 2002-10-26 | 1 | -2/+2
* Removed reliance on gcc/C99 extension. | Tim Peters | 2002-09-03 | 1 | -1/+3
* Ignore encoding declarations inside strings. Fixes #603509. | Martin v. Löwis | 2002-09-03 | 1 | -1/+16
* Squash a few calls to the hideously expensive PyObject_CallObject(o,a) -- replace them with slightly faster PyObject_Call(o,a,NULL). | Guido van Rossum | 2002-08-16 | 1 | -3/+14
  (The difference is that the latter requires a to be a tuple; the former allows other values and wraps them in a tuple if necessary; it involves two more levels of C function calls to accomplish all that.)
* Provide less mysterious error messages when seeing end-of-line in single-quoted strings or end-of-file in triple-quoted strings. | Skip Montanaro | 2002-08-15 | 1 | -3/+6
  Closes patch 586561.
* Use Py_FatalError instead of abort. | Martin v. Löwis | 2002-08-07 | 1 | -2/+3
* Fix PEP 263 code --without-unicode. Fixes #591943. | Martin v. Löwis | 2002-08-07 | 1 | -0/+18
* Added a cast to shut up a compiler warning. | Jack Jansen | 2002-08-05 | 1 | -1/+1
* Add 1 to lineno in deprecation warning. Fixes #590888. | Martin v. Löwis | 2002-08-05 | 1 | -1/+3
* Make pgen compile with pydebug. | Martin v. Löwis | 2002-08-04 | 1 | -2/+6
  Duplicate normalized names, as they may be longer than the old string.
* Group statements properly. | Martin v. Löwis | 2002-08-04 | 1 | -6/+12
* Repaired a fatal compiler error in the debug build: it's not clear what this was trying to assert, but the name it referenced didn't exist. | Tim Peters | 2002-08-04 | 1 | -1/+1
* Squash compiler warning about signed-vs-unsigned mismatch. | Tim Peters | 2002-08-04 | 1 | -1/+1
* Patch #534304: Implement phase 1 of PEP 263. | Martin v. Löwis | 2002-08-04 | 1 | -8/+440
* Mass checkin of universal newline support. | Jack Jansen | 2002-04-14 | 1 | -4/+5
  Highlights: import and friends will understand any of \r, \n and \r\n as end of line. Python file input will do the same if you use mode 'U'.
  Everything can be disabled by configuring with --without-universal-newlines.
  See PEP278 for details.
* SF patch #455966: Allow leading 0 in float/imag literals. | Tim Peters | 2001-08-30 | 1 | -3/+22
  Consequences for Jython still unknown (but raised on Jython-Dev).
* SF bug [#455775] float parsing discrepancy. | Tim Peters | 2001-08-27 | 1 | -5/+8
  PyTokenizer_Get: error if exponent contains no digits (3e, 2.0e+, ...).
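  The rule this entry describes can be illustrated with a small sketch. This is not the C code from PyTokenizer_Get; `exponent_has_digits` is a hypothetical helper written only to show which decimal float literals the check rejects, and it assumes the input is a decimal literal (not hex).

```python
def exponent_has_digits(literal: str) -> bool:
    """Return False when an 'e'/'E' exponent marker is not followed by digits.

    Hypothetical helper for illustration; assumes a decimal literal.
    """
    lowered = literal.lower()
    if "e" not in lowered:
        return True  # no exponent part, nothing to check
    exponent = lowered.split("e", 1)[1]
    # An optional sign may follow the exponent marker.
    if exponent[:1] in ("+", "-"):
        exponent = exponent[1:]
    # At least one ASCII digit is required after the marker and sign.
    return exponent != "" and all(ch in "0123456789" for ch in exponent)


for text in ("3e", "2.0e+", "1e10", "2.0e-3", "3.14"):
    print(text, exponent_has_digits(text))
```

  The two literals named in the commit message (3e and 2.0e+) are the ones the sketch flags as invalid.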
* Implement PEP 238 in its (almost) full glory. | Guido van Rossum | 2001-08-08 | 1 | -0/+13
  This introduces:
  - A new operator // that means floor division (the kind of division where 1/2 is 0).
  - The "future division" statement ("from __future__ import division") which changes the meaning of the / operator to implement "true division" (where 1/2 is 0.5).
  - New overloadable operators __truediv__ and __floordiv__.
  - New slots in the PyNumberMethods struct for true and floor division, new abstract APIs for them, new opcodes, and so on.
  I emphasize that without the future division statement, the semantics of / will remain unchanged until Python 3.0.
  Not yet implemented are warnings (default off) when / is used with int or long arguments.
  This has been on display since 7/31 as SF patch #443474.
  Flames to /dev/null.
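  The two division operators and the two overloadable methods listed above can be sketched as follows. The example assumes Python 3, where true division is already the default for /; on the 2.x interpreters this commit targeted, the same behaviour would need the future division statement at the top of the module. The `Meters` class is a hypothetical type introduced only for illustration.

```python
def divide_both_ways(a, b):
    """Return (true quotient, floor quotient) of a and b."""
    return a / b, a // b


class Meters:
    """Hypothetical value type that overloads both division operators."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Called for: Meters(...) / other
        return Meters(self.value / other)

    def __floordiv__(self, other):
        # Called for: Meters(...) // other
        return Meters(self.value // other)


print(divide_both_ways(1, 2))
print((Meters(7) / 2).value, (Meters(7) // 2).value)
```

  Floor division rounds toward negative infinity rather than toward zero, so -7 // 2 evaluates to -4.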
* SF bug #417587: compiler warnings compiling 2.1. | Tim Peters | 2001-04-21 | 1 | -3/+0
  Repaired *some* of the SGI compiler warnings Sjoerd Mullender reported.
* REMOVED all CWI, CNRI and BeOpen copyright markings. | Guido van Rossum | 2000-09-01 | 1 | -9/+0
  This should match the situation in the 1.6b1 tree.
* Support for three-token characters (**=, >>=, <<=) which was written by Michael Hudson, and support in general for the augmented assignment syntax. | Thomas Wouters | 2000-08-24 | 1 | -0/+94
  The graminit.c patch is large!
* Mass ANSIfication. | Thomas Wouters | 2000-07-22 | 1 | -25/+12
  Work around intrcheck.c's desire to pass 'PyErr_CheckSignals' to 'Py_AddPendingCall' by providing a (static) wrapper function that has the right number of arguments.