path: root/Python/codecs.c
Commit message [Author, Age, Files, Lines]
* Close #20404: blacklist non-text encodings in io.TextIOWrapper [Nick Coghlan, 2014-02-04, 1 file, -21/+63]
|   - io.TextIOWrapper (and hence the open() builtin) now use the internal codec marking system added for issue #19619
|   - also tweaked the C code to only look up the encoding once, rather than multiple times
|   - the existing output type checks remain in place to deal with unmarked third party codecs.
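As an illustration of the behaviour described in this entry, the sketch below probes whether `io.TextIOWrapper` accepts a given encoding name. It assumes a CPython release that includes this change (3.4 or later); `opens_as_text` is a hypothetical helper written for this example, not part of the io module.

```python
import io


def opens_as_text(encoding):
    """Return True if io.TextIOWrapper accepts `encoding`.

    Hypothetical helper for illustration only.
    """
    try:
        io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    except LookupError:
        # Raised for unknown codecs and for codecs marked as non-text.
        return False
    return True


print(opens_as_text("utf-8"))  # a text encoding
print(opens_as_text("hex"))    # a bytes-to-bytes codec
```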
* Issue #19619: Blacklist non-text codecs in method API [Nick Coghlan, 2013-11-22, 1 file, -16/+122]
|   str.encode, bytes.decode and bytearray.decode now use an internal API to throw LookupError for known non-text encodings, rather than attempting the encoding or decoding operation and then throwing a TypeError for an unexpected output type.
|   The latter mechanism remains in place for third party non-text encodings.
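A minimal sketch of the user-visible effect, assuming Python 3.4 or later: the method API rejects a known non-text codec with LookupError, while the type-neutral functions in the codecs module still accept it. The variable names are illustrative only.

```python
import codecs

# str.encode() refuses codecs that are marked as non-text encodings.
try:
    "abc".encode("rot_13")
except LookupError as exc:
    rejected = str(exc)
else:
    rejected = None

# The type-neutral convenience functions still handle these codecs.
rot13_text = codecs.encode("abc", "rot_13")
hex_bytes = codecs.encode(b"abc", "hex")

print(rejected)
print(rot13_text, hex_bytes)
```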
* Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. [Serhiy Storchaka, 2013-11-19, 1 file, -17/+146]
|   The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded.
|   The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points.
|   The surrogatepass error handler now works with the utf-16* and utf-32* codecs.
|   Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.
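The following sketch shows what this entry means at the Python level, assuming a release that includes the change (3.4 or later). `check` is a hypothetical helper: it reports whether strict encoding of a lone surrogate succeeds, the bytes produced with surrogatepass, and whether those bytes decode back to the original string.

```python
LONE = "\ud800"  # a lone high surrogate


def check(codec):
    """Hypothetical helper: (strict_ok, surrogatepass bytes, round trip ok)."""
    try:
        LONE.encode(codec)
        strict_ok = True
    except UnicodeEncodeError:
        strict_ok = False
    raw = LONE.encode(codec, "surrogatepass")
    back = raw.decode(codec, "surrogatepass")
    return strict_ok, raw, back == LONE


for name in ("utf-16-le", "utf-32-le"):
    print(name, check(name))
```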
* Close 19609: narrow scope of codec exc chaining [Nick Coghlan, 2013-11-15, 1 file, -4/+6]
|
* Close #17828: better handling of codec errors [Nick Coghlan, 2013-11-13, 1 file, -0/+18]
|   - output type errors now redirect users to the type-neutral convenience functions in the codecs module
|   - stateless errors that occur during encoding and decoding will now be automatically wrapped in exceptions that give the name of the codec involved
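A hedged sketch of the second point: a failing stateless decode reports which codec was involved. How the codec name is attached is an assumption that varies by release (a wrapping exception in the versions this entry targets, exception notes in some later ones), so the hypothetical helper `codec_error_text` gathers both the message and any notes.

```python
import codecs


def codec_error_text(data, encoding):
    """Hypothetical helper: text of the error raised by codecs.decode()."""
    try:
        codecs.decode(data, encoding)
    except Exception as exc:
        # Newer releases may attach the codec name as a note instead.
        notes = getattr(exc, "__notes__", [])
        return " ".join([str(exc), *notes])
    return None


print(codec_error_text(b"zz", "hex"))
```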
* Issue #1772673: The type of `char*` arguments now changed to `const char*`. [Serhiy Storchaka, 2013-10-19, 1 file, -2/+2]
|
* Issue #18722: Remove uses of the "register" keyword in C code. [Antoine Pitrou, 2013-08-13, 1 file, -2/+2]
|
* Issue #18408: normalizestring() now raises MemoryError on memory allocation failure [Victor Stinner, 2013-07-11, 1 file, -1/+1]
|
* Issue #15422: get rid of PyCFunction_New macro [Andrew Svetlov, 2012-12-25, 1 file, -1/+1]
|
* #16336: merge with 3.3. [Ezio Melotti, 2012-11-03, 1 file, -4/+4]
|\
| * #16336: merge with 3.2. [Ezio Melotti, 2012-11-03, 1 file, -4/+4]
| |\
| | * #16336: fix input checking in the surrogatepass error handler. Patch by Serhiy Storchaka. [Ezio Melotti, 2012-11-03, 1 file, -4/+4]
| | |
* | | Issue #16330: Use surrogate-related macros [Victor Stinner, 2012-10-30, 1 file, -2/+2]
|/ /
| |   Patch written by Serhiy Storchaka.
* | merge with 3.2 [Philip Jenvey, 2012-10-27, 1 file, -3/+4]
|\ \
| |/
| * bounds check for bad data (thanks amaury) [Philip Jenvey, 2012-10-27, 1 file, -3/+4]
| |
* | Check newly created consistency using _PyUnicode_CheckConsistency(str, 1) [Victor Stinner, 2012-04-27, 1 file, -4/+6]
| |   * In debug mode, fill the string data with invalid characters
| |   * Simplify also reference counting in PyCodec_BackslashReplaceErrors() and PyCodec_XMLCharRefReplaceError()
* | Issue #13722: Avoid silencing ImportErrors when initializing the codecs registry. [Antoine Pitrou, 2012-01-18, 1 file, -9/+0]
|\ \
| |/
| * Issue #13722: Avoid silencing ImportErrors when initializing the codecs registry. [Antoine Pitrou, 2012-01-18, 1 file, -9/+0]
| |
* | PyCodec_IgnoreErrors() avoids the deprecated "u#" format [Victor Stinner, 2011-12-01, 1 file, -2/+1]
| |
* | Avoid the Py_UNICODE type in codecs.c [Victor Stinner, 2011-11-04, 1 file, -4/+11]
| |
* | PyCodec_XMLCharRefReplaceError(): Remove unused variable [Victor Stinner, 2011-11-04, 1 file, -2/+2]
| |
* | Fix C89 incompatibility. [Martin v. Löwis, 2011-11-04, 1 file, -1/+1]
| |
* | Port error handlers from Py_UNICODE indexing to code point indexing. [Martin v. Löwis, 2011-11-04, 1 file, -77/+46]
| |
* | Rename _Py_identifier to _Py_IDENTIFIER. [Martin v. Löwis, 2011-10-14, 1 file, -2/+2]
| |
* | Issue #13088: Add shared Py_hexdigits constant to format a number into base 16 [Victor Stinner, 2011-10-14, 1 file, -12/+12]
| |
* | Use identifier API for PyObject_GetAttrString. [Martin v. Löwis, 2011-10-10, 1 file, -2/+4]
| |
* | PyCodec_ReplaceErrors() uses "C" format instead of "u#" to build result [Victor Stinner, 2011-10-02, 1 file, -2/+3]
| |
* | Use the new Py_ARRAY_LENGTH macro [Victor Stinner, 2011-09-28, 1 file, -1/+1]
| |
* | Implement PEP 393. [Martin v. Löwis, 2011-09-28, 1 file, -24/+20]
|/
* Issue #1813: Fix codec lookup under Turkish locales. [Antoine Pitrou, 2011-07-24, 1 file, -1/+1]
|
* Issue #9804: ascii() now always represents unicode surrogate pairs as a single `\UXXXXXXXX`, regardless of whether the character is printable or not. [Antoine Pitrou, 2010-09-09, 1 file, -6/+20]
|   Also, the "backslashreplace" error handler now joins surrogate pairs into a single character on UCS-2 builds.
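A small sketch of the output form this entry describes. It assumes a current CPython, where (since PEP 393) there are no UCS-2 builds and a non-BMP character is always a single code point, so it only shows the resulting `\UXXXXXXXX` escape rather than the pair-joining logic itself.

```python
ch = "\U0001F600"  # one non-BMP character

ascii_repr = ascii(ch)
escaped = ch.encode("ascii", "backslashreplace")

print(ascii_repr)
print(escaped)
```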
* Recorded merge of revisions 81029 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk [Antoine Pitrou, 2010-05-09, 1 file, -522/+522]
|   ........
|   r81029 | antoine.pitrou | 2010-05-09 16:46:46 +0200 (dim., 09 mai 2010) | 3 lines
|   Untabify C files. Will watch buildbots.
|   ........
* Merged revisions 75365,75394,75402-75403,75418,75459,75484,75592-75596,75600,75602-75607,75610-75613,75616-75617,75623,75627,75640,75647,75696,75795 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk [Georg Brandl, 2009-10-27, 1 file, -5/+17]
|   ........
|   r75365 | georg.brandl | 2009-10-11 22:16:16 +0200 (So, 11 Okt 2009) | 1 line
|   Fix broken links found by "make linkcheck". scipy.org seems to be done right now, so I could not verify links going there.
|   ........
|   r75394 | georg.brandl | 2009-10-13 20:10:59 +0200 (Di, 13 Okt 2009) | 1 line
|   Fix markup.
|   ........
|   r75402 | georg.brandl | 2009-10-14 17:51:48 +0200 (Mi, 14 Okt 2009) | 1 line
|   #7125: fix typo.
|   ........
|   r75403 | georg.brandl | 2009-10-14 17:57:46 +0200 (Mi, 14 Okt 2009) | 1 line
|   #7126: os.environ changes *do* take effect in subprocesses started with os.system().
|   ........
|   r75418 | georg.brandl | 2009-10-14 20:48:32 +0200 (Mi, 14 Okt 2009) | 1 line
|   #7116: str.join() takes an iterable.
|   ........
|   r75459 | georg.brandl | 2009-10-17 10:57:43 +0200 (Sa, 17 Okt 2009) | 1 line
|   Fix refleaks in _ctypes PyCSimpleType_New, which fixes the refleak seen in test___all__.
|   ........
|   r75484 | georg.brandl | 2009-10-18 09:58:12 +0200 (So, 18 Okt 2009) | 1 line
|   Fix missing word.
|   ........
|   r75592 | georg.brandl | 2009-10-22 09:05:48 +0200 (Do, 22 Okt 2009) | 1 line
|   Fix punctuation.
|   ........
|   r75593 | georg.brandl | 2009-10-22 09:06:49 +0200 (Do, 22 Okt 2009) | 1 line
|   Revert unintended change.
|   ........
|   r75594 | georg.brandl | 2009-10-22 09:56:02 +0200 (Do, 22 Okt 2009) | 1 line
|   Fix markup.
|   ........
|   r75595 | georg.brandl | 2009-10-22 09:56:56 +0200 (Do, 22 Okt 2009) | 1 line
|   Fix duplicate target.
|   ........
|   r75596 | georg.brandl | 2009-10-22 10:05:04 +0200 (Do, 22 Okt 2009) | 1 line
|   Add a new directive marking up implementation details and start using it.
|   ........
|   r75600 | georg.brandl | 2009-10-22 13:01:46 +0200 (Do, 22 Okt 2009) | 1 line
|   Make it more robust.
|   ........
|   r75602 | georg.brandl | 2009-10-22 13:28:06 +0200 (Do, 22 Okt 2009) | 1 line
|   Document new directive.
|   ........
|   r75603 | georg.brandl | 2009-10-22 13:28:23 +0200 (Do, 22 Okt 2009) | 1 line
|   Allow short form with text as argument.
|   ........
|   r75604 | georg.brandl | 2009-10-22 13:36:50 +0200 (Do, 22 Okt 2009) | 1 line
|   Fix stylesheet for multi-paragraph impl-details.
|   ........
|   r75605 | georg.brandl | 2009-10-22 13:48:10 +0200 (Do, 22 Okt 2009) | 1 line
|   Use "impl-detail" directive where applicable.
|   ........
|   r75606 | georg.brandl | 2009-10-22 17:00:06 +0200 (Do, 22 Okt 2009) | 1 line
|   #6324: membership test tries iteration via __iter__.
|   ........
|   r75607 | georg.brandl | 2009-10-22 17:04:09 +0200 (Do, 22 Okt 2009) | 1 line
|   #7088: document new functions in signal as Unix-only.
|   ........
|   r75610 | georg.brandl | 2009-10-22 17:27:24 +0200 (Do, 22 Okt 2009) | 1 line
|   Reorder __slots__ fine print and add a clarification.
|   ........
|   r75611 | georg.brandl | 2009-10-22 17:42:32 +0200 (Do, 22 Okt 2009) | 1 line
|   #7035: improve docs of the various <method>_errors() functions, and give them docstrings.
|   ........
|   r75612 | georg.brandl | 2009-10-22 17:52:15 +0200 (Do, 22 Okt 2009) | 1 line
|   #7156: document curses as Unix-only.
|   ........
|   r75613 | georg.brandl | 2009-10-22 17:54:35 +0200 (Do, 22 Okt 2009) | 1 line
|   #6977: getopt does not support optional option arguments.
|   ........
|   r75616 | georg.brandl | 2009-10-22 18:17:05 +0200 (Do, 22 Okt 2009) | 1 line
|   Add proper references.
|   ........
|   r75617 | georg.brandl | 2009-10-22 18:20:55 +0200 (Do, 22 Okt 2009) | 1 line
|   Make printout margin important.
|   ........
|   r75623 | georg.brandl | 2009-10-23 10:14:44 +0200 (Fr, 23 Okt 2009) | 1 line
|   #7188: fix optionxform() docs.
|   ........
|   r75627 | fred.drake | 2009-10-23 15:04:51 +0200 (Fr, 23 Okt 2009) | 2 lines
|   add further note about what's passed to optionxform
|   ........
|   r75640 | neil.schemenauer | 2009-10-23 21:58:17 +0200 (Fr, 23 Okt 2009) | 2 lines
|   Improve some docstrings in the 'warnings' module.
|   ........
|   r75647 | georg.brandl | 2009-10-24 12:04:19 +0200 (Sa, 24 Okt 2009) | 1 line
|   Fix markup.
|   ........
|   r75696 | georg.brandl | 2009-10-25 21:25:43 +0100 (So, 25 Okt 2009) | 1 line
|   Fix a demo.
|   ........
|   r75795 | georg.brandl | 2009-10-27 16:10:22 +0100 (Di, 27 Okt 2009) | 1 line
|   Fix a strange mis-edit.
|   ........
* Rename utf8b error handler to surrogateescape. [Martin v. Löwis, 2009-05-10, 1 file, -6/+6]
|
* Rename the surrogates error handler to surrogatepass. [Martin v. Löwis, 2009-05-10, 1 file, -6/+6]
|
* Issue #5915: Implement PEP 383, Non-decodable Bytes in System Character Interfaces. [Martin v. Löwis, 2009-05-05, 1 file, -0/+89]
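As a brief illustration of what PEP 383 provides, the sketch below uses the handler under its later name, surrogateescape (per the rename entry above). An undecodable byte is mapped to a lone surrogate on decoding and restored on encoding, so arbitrary bytes survive a round trip through str. Variable names are illustrative.

```python
raw = b"abc\xff"  # 0xff can never appear in valid UTF-8

# The undecodable byte becomes the lone surrogate U+DCFF.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original byte.
round_trip = text.encode("utf-8", "surrogateescape")

print(ascii(text), round_trip)
```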
* Make PyCodec_SurrogateErrors static. [Martin v. Löwis, 2009-05-02, 1 file, -1/+4]
|
* Issue #3672: Reject surrogates in utf-8 codec; add surrogates error handler. [Martin v. Löwis, 2009-05-02, 1 file, -0/+92]
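A minimal sketch of the resulting behaviour, using the handler's later name, surrogatepass (the "surrogates" handler added here is renamed in a later entry of this log). Strict UTF-8 refuses a lone surrogate in both directions, while the handler lets it through. Variable names are illustrative.

```python
lone = "\udc80"  # a lone low surrogate

try:
    lone.encode("utf-8")
    strict_encode_ok = True
except UnicodeEncodeError:
    strict_encode_ok = False

# With the handler, the surrogate is encoded as a 3-byte sequence.
passed = lone.encode("utf-8", "surrogatepass")

try:
    passed.decode("utf-8")
    strict_decode_ok = True
except UnicodeDecodeError:
    strict_decode_ok = False

restored = passed.decode("utf-8", "surrogatepass")
print(strict_encode_ok, passed, strict_decode_ok, restored == lone)
```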
* Issue 3723: Fixed initialization of subinterpreters [Christian Heimes, 2008-10-30, 1 file, -0/+1]
|   The patch fixes several issues with Py_NewInterpreter as well as the demo for multiple subinterpreters.
|   Most of the patch was written by MvL with help from Benjamin, Amaury and me. Graham Dumpleton has verified that this patch fixes an issue with mod_wsgi.
* Move the codec decode type checks to bytes/bytearray.decode(). [Marc-André Lemburg, 2008-06-06, 1 file, -20/+25]
|   Use faster PyUnicode_FromEncodedObject() for bytes/bytearray.decode().
|   Add new PyCodec_KnownEncoding() API.
|   Add new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() APIs.
|   Add missing PyUnicode_AsDecodedObject() to unicodeobject.h
|   Fix punicode codec to also work on memoryviews.
* Renamed PyString to PyBytes [Christian Heimes, 2008-05-26, 1 file, -2/+2]
|
* Renamed PyBytes to PyByteArray [Christian Heimes, 2008-05-26, 1 file, -2/+2]
|
* More PyImport_ImportModule -> PyImport_ImportModuleNoBlock [Christian Heimes, 2008-01-03, 1 file, -1/+1]
|
* #1629: Renamed Py_Size, Py_Type and Py_Refcnt to Py_SIZE, Py_TYPE and Py_REFCNT. [Christian Heimes, 2007-12-19, 1 file, -1/+1]
|
* Merging the py3k-pep3137 branch back into the py3k branch. [Guido van Rossum, 2007-11-06, 1 file, -25/+37]
|   No detailed change log; just check out the change log for the py3k-pep3137 branch. The most obvious changes:
|   - str8 renamed to bytes (PyString at the C level);
|   - bytes renamed to buffer (PyBytes at the C level);
|   - PyString and PyUnicode are no longer compatible.
|   I.e. we now have an immutable bytes type and a mutable bytes type.
|   The behavior of PyString was modified quite a bit, to make it more bytes-like. Some changes are still on the to-do list.
* This is the uncontroversial half of patch 1263 by Thomas Lee: changes to codecs.c and structmember.c to use PyUnicode instead of PyString. [Guido van Rossum, 2007-10-19, 1 file, -8/+12]
* Handle error [Neal Norwitz, 2007-08-11, 1 file, -1/+4]
|
* Revert 55876. Use PyUnicode_AsEncodedString instead. [Martin v. Löwis, 2007-06-12, 1 file, -17/+0]
|
* Short-cut lookup of utf-8 codec, to make import work on OSX. [Martin v. Löwis, 2007-06-11, 1 file, -0/+17]
* Change PyErr_Format() to generate a unicode string (by using PyUnicode_FromFormatV() instead of PyString_FromFormatV()). [Walter Dörwald, 2007-05-25, 1 file, -12/+7]
|   Change calls to PyErr_Format() to benefit from the new format specifiers: Using %S, object instead of %s, PyString_AS_STRING(object) with will work with unicode objects too.