summaryrefslogtreecommitdiffstats
path: root/Include/unicodeobject.h
Commit message (Collapse)AuthorAgeFilesLines
* Merged revisions ↵Georg Brandl2009-01-141-10/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 68292,68344,68361,68378,68424,68426,68429-68430,68450,68457,68480-68481,68493,68495,68499,68501,68512,68514-68515 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r68292 | skip.montanaro | 2009-01-04 11:36:58 +0100 (So, 04 Jan 2009) | 3 lines If user configures --without-gcc give preference to $CC instead of blindly assuming the compiler will be "cc". ........ r68344 | marc-andre.lemburg | 2009-01-05 20:43:35 +0100 (Mo, 05 Jan 2009) | 7 lines Fix #4846 (Py_UNICODE_ISSPACE causes linker error) by moving the declaration into the extern "C" section. Add a few more comments and apply some minor edits to make the file contents fit the original structure again. ........ r68361 | antoine.pitrou | 2009-01-06 19:34:08 +0100 (Di, 06 Jan 2009) | 3 lines Use shutil.rmtree rather than os.rmdir. ........ r68378 | mark.dickinson | 2009-01-07 18:48:33 +0100 (Mi, 07 Jan 2009) | 2 lines Issue #4869: clarify documentation for random.expovariate. ........ r68424 | benjamin.peterson | 2009-01-09 03:53:35 +0100 (Fr, 09 Jan 2009) | 1 line specify what -3 warnings are about ........ r68426 | benjamin.peterson | 2009-01-09 04:03:05 +0100 (Fr, 09 Jan 2009) | 1 line fix spelling ........ r68429 | benjamin.peterson | 2009-01-09 04:05:14 +0100 (Fr, 09 Jan 2009) | 1 line add -3 to manpage ........ r68430 | benjamin.peterson | 2009-01-09 04:07:27 +0100 (Fr, 09 Jan 2009) | 1 line be more specific in -3 option help ........ r68450 | jeffrey.yasskin | 2009-01-09 17:47:07 +0100 (Fr, 09 Jan 2009) | 3 lines Fix issue 4884, preventing a crash in the socket code when python is compiled with llvm-gcc and run with a glibc <2.10. ........ r68457 | kristjan.jonsson | 2009-01-09 21:10:59 +0100 (Fr, 09 Jan 2009) | 1 line Issue 3677: Fix import from UNC paths on Windows. ........ r68480 | vinay.sajip | 2009-01-10 14:38:04 +0100 (Sa, 10 Jan 2009) | 1 line Minor documentation changes cross-referencing NullHandler to the documentation on configuring logging in a library. ........ r68481 | vinay.sajip | 2009-01-10 14:42:04 +0100 (Sa, 10 Jan 2009) | 1 line Corrected an incorrect self-reference. ........ r68493 | benjamin.peterson | 2009-01-10 18:18:55 +0100 (Sa, 10 Jan 2009) | 1 line rewrite verbose conditionals ........ r68495 | benjamin.peterson | 2009-01-10 18:36:44 +0100 (Sa, 10 Jan 2009) | 1 line tp_iter only exists with Py_TPFLAGS_HAVE_ITER #4901 ........ r68499 | mark.dickinson | 2009-01-10 20:14:55 +0100 (Sa, 10 Jan 2009) | 2 lines Remove an unnecessary check from test_decimal. ........ r68501 | vinay.sajip | 2009-01-10 20:22:57 +0100 (Sa, 10 Jan 2009) | 1 line Corrected minor typo and added .currentmodule directives to fix missing cross-references. ........ r68512 | benjamin.peterson | 2009-01-10 23:42:10 +0100 (Sa, 10 Jan 2009) | 1 line make tests fail if they can't be imported ........ r68514 | benjamin.peterson | 2009-01-11 00:41:59 +0100 (So, 11 Jan 2009) | 1 line move seealso to a more appropiate place ........ r68515 | benjamin.peterson | 2009-01-11 00:49:08 +0100 (So, 11 Jan 2009) | 1 line macos 9 isn't supported ........
* Merged revisions ↵Georg Brandl2009-01-011-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 67952-67953,67955,67957-67958,67960-67961,67963,67965,67967,67970-67971,67973,67982,67988,67990,67995,68014,68016,68030,68057,68061,68112,68115-68118,68120-68121,68123-68128 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk ........ r67952 | georg.brandl | 2008-12-27 18:42:40 +0100 (Sat, 27 Dec 2008) | 2 lines #4752: actually use custom handler in example. ........ r67953 | georg.brandl | 2008-12-27 19:20:04 +0100 (Sat, 27 Dec 2008) | 3 lines Patch #4739 by David Laban: add symbols to pydoc help topics, so that ``help('@')`` works as expected. ........ r67955 | georg.brandl | 2008-12-27 19:27:53 +0100 (Sat, 27 Dec 2008) | 3 lines Follow-up to r67746 in order to restore backwards-compatibility for those who (monkey-)patch TextWrapper.wordsep_re with a custom RE. ........ r67957 | georg.brandl | 2008-12-27 19:49:19 +0100 (Sat, 27 Dec 2008) | 2 lines #4754: improve winsound documentation. ........ r67958 | georg.brandl | 2008-12-27 20:02:59 +0100 (Sat, 27 Dec 2008) | 2 lines #4682: 'b' is actually unsigned char. ........ r67960 | georg.brandl | 2008-12-27 20:04:44 +0100 (Sat, 27 Dec 2008) | 2 lines #4695: fix backslashery. ........ r67961 | georg.brandl | 2008-12-27 20:06:04 +0100 (Sat, 27 Dec 2008) | 2 lines Use :samp: role. ........ r67963 | georg.brandl | 2008-12-27 20:11:15 +0100 (Sat, 27 Dec 2008) | 2 lines #4671: document that pydoc imports modules. ........ r67965 | antoine.pitrou | 2008-12-27 21:34:52 +0100 (Sat, 27 Dec 2008) | 3 lines Issue #4677: add two list comprehension tests to pybench. ........ r67967 | benjamin.peterson | 2008-12-27 23:18:58 +0100 (Sat, 27 Dec 2008) | 1 line fix markup ........ r67970 | alexandre.vassalotti | 2008-12-28 02:52:58 +0100 (Sun, 28 Dec 2008) | 2 lines Fix name mangling of PyUnicode_ClearFreeList. ........ r67971 | alexandre.vassalotti | 2008-12-28 03:10:35 +0100 (Sun, 28 Dec 2008) | 2 lines Sort UCS-2/UCS-4 name mangling list. ........ r67973 | alexandre.vassalotti | 2008-12-28 03:58:22 +0100 (Sun, 28 Dec 2008) | 2 lines Document Py_VaBuildValue. ........ r67982 | benjamin.peterson | 2008-12-28 16:37:31 +0100 (Sun, 28 Dec 2008) | 1 line fix WORD_BIGEDIAN declaration in Universal builds; fixes #4060 and #4728 ........ r67988 | ronald.oussoren | 2008-12-28 20:40:56 +0100 (Sun, 28 Dec 2008) | 1 line Issue4064: architecture string for universal builds on OSX ........ r67990 | ronald.oussoren | 2008-12-28 20:50:40 +0100 (Sun, 28 Dec 2008) | 3 lines Update the fix for issue4064 to deal correctly with all three variants of universal builds that are presented by the configure script. ........ r67995 | benjamin.peterson | 2008-12-28 22:16:07 +0100 (Sun, 28 Dec 2008) | 1 line #4763 PyErr_ExceptionMatches won't blow up with NULL arguments ........ r68014 | benjamin.peterson | 2008-12-29 18:47:42 +0100 (Mon, 29 Dec 2008) | 1 line #4764 set IOError.filename when trying to open a directory on POSIX platforms ........ r68016 | benjamin.peterson | 2008-12-29 18:56:58 +0100 (Mon, 29 Dec 2008) | 1 line #4764 in io.open, set IOError.filename when trying to open a directory on POSIX platforms ........ r68030 | benjamin.peterson | 2008-12-29 22:38:14 +0100 (Mon, 29 Dec 2008) | 1 line fix French ........ r68057 | vinay.sajip | 2008-12-30 08:01:25 +0100 (Tue, 30 Dec 2008) | 1 line Minor documentation change relating to NullHandler. ........ r68061 | georg.brandl | 2008-12-30 11:15:49 +0100 (Tue, 30 Dec 2008) | 2 lines #4778: attributes can't be called. ........ r68112 | benjamin.peterson | 2009-01-01 00:48:39 +0100 (Thu, 01 Jan 2009) | 1 line #4795 inspect.isgeneratorfunction() should return False instead of None ........ r68115 | benjamin.peterson | 2009-01-01 05:04:41 +0100 (Thu, 01 Jan 2009) | 1 line simplfy code ........ r68116 | georg.brandl | 2009-01-01 12:46:51 +0100 (Thu, 01 Jan 2009) | 2 lines #4100: note that element children are not necessarily present on "start" events. ........ r68117 | georg.brandl | 2009-01-01 12:53:55 +0100 (Thu, 01 Jan 2009) | 2 lines #4156: make clear that "protocol" is to be replaced with the protocol name. ........ r68118 | georg.brandl | 2009-01-01 13:00:19 +0100 (Thu, 01 Jan 2009) | 2 lines #4185: clarify escape behavior of replacement strings. ........ r68120 | georg.brandl | 2009-01-01 13:15:31 +0100 (Thu, 01 Jan 2009) | 4 lines #4228: Pack negative values the same way as 2.4 in struct's L format. ........ r68121 | georg.brandl | 2009-01-01 13:43:33 +0100 (Thu, 01 Jan 2009) | 2 lines Point to types module in new module deprecation notice. ........ r68123 | georg.brandl | 2009-01-01 13:52:29 +0100 (Thu, 01 Jan 2009) | 2 lines #4784: ... on three counts ... ........ r68124 | georg.brandl | 2009-01-01 13:53:19 +0100 (Thu, 01 Jan 2009) | 2 lines #4782: Fix markup error that hid load() and loads(). ........ r68125 | georg.brandl | 2009-01-01 14:02:09 +0100 (Thu, 01 Jan 2009) | 2 lines #4776: add data_files and package_dir arguments. ........ r68126 | georg.brandl | 2009-01-01 14:05:13 +0100 (Thu, 01 Jan 2009) | 2 lines Handlers are in the `logging.handlers` module. ........ r68127 | georg.brandl | 2009-01-01 14:14:49 +0100 (Thu, 01 Jan 2009) | 2 lines #4767: Use correct submodules for all MIME classes. ........ r68128 | antoine.pitrou | 2009-01-01 15:11:22 +0100 (Thu, 01 Jan 2009) | 3 lines Issue #3680: Reference cycles created through a dict, set or deque iterator did not get collected. ........
* Merged revisions 66891 via svnmerge fromAmaury Forgeot d'Arc2008-10-141-1/+1
| | | | | | | | | | | | | svn+ssh://pythondev@svn.python.org/python/trunk ........ r66891 | amaury.forgeotdarc | 2008-10-14 23:47:22 +0200 (mar., 14 oct. 2008) | 5 lines #4122: On Windows, Py_UNICODE_ISSPACE cannot be used in an extension module: compilation fails with "undefined reference to _Py_ascii_whitespace" Will backport to 2.6. ........
* Refactor and clean up str.format() code (and helpers) in advance of ↵Eric Smith2008-05-301-0/+6
| | | | optimizations.
* Implemented Martin's suggestion to clear the free lists during the garbage ↵Christian Heimes2008-02-141-0/+4
| | | | collection of the highest generation.
* Patch #1970 by Antoine Pitrou: Speedup unicode whitespace and linebreak ↵Christian Heimes2008-01-301-1/+8
| | | | detection. The speedup is about 25% for split() (571 / 457 usec) and 35% (175 / 127 usec )for splitlines()
* Add stdarg include for va_list to get this to compile on cygwinNeal Norwitz2008-01-271-0/+2
|
* Backport of several functions from Python 3.0 to 2.6 including ↵Christian Heimes2008-01-251-0/+23
| | | | | | | PyUnicode_FromString, PyUnicode_Format and PyLong_From/AsSsize_t. The functions are partly required for the backport of the bytearray type and _fileio module. They should also make it easier to port C to 3.0. First chapter of the Python 3.0 io framework back port: _fileio The next step depends on a working bytearray type which itself depends on a backport of the nwe buffer API.
* #1629: Renamed Py_Size, Py_Type and Py_Refcnt to Py_SIZE, Py_TYPE and ↵Christian Heimes2007-12-191-2/+2
| | | | Py_REFCNT. Macros for b/w compatibility are available.
* The incremental decoder for utf-7 must preserve its state between calls.Amaury Forgeot d'Arc2007-11-201-0/+7
| | | | | | | Solves issue1460. Might not be a backport candidate: a new API function was added, and some code may rely on details in utf-7.py.
* Backport r57105 and r57145 from the py3k branch: UTF-32 codecs.Walter Dörwald2007-08-171-0/+82
|
* PEP 3123: Provide forward compatibility with Python 3.0, while keepingMartin v. Löwis2007-07-211-2/+2
| | | | | backwards compatibility. Add Py_Refcnt, Py_Type, Py_Size, and PyVarObject_HEAD_INIT.
* Variation of patch # 1624059 to speed up checking if an object is a subclassNeal Norwitz2007-02-251-1/+2
| | | | | | | | | | | | | | | | | | of some of the common builtin types. Use a bit in tp_flags for each common builtin type. Check the bit to determine if any instance is a subclass of these common types. The check avoids a function call and O(n) search of the base classes. The check is done in the various Py*_Check macros rather than calling PyType_IsSubtype(). All the bits are set in tp_flags when the type is declared in the Objects/*object.c files because PyType_Ready() is not called for all the types. Should PyType_Ready() be called for all types? If so and the change is made, the changes to the Objects/*object.c files can be reverted (remove setting the tp_flags). Objects/typeobject.c would also have to be modified to add conditions for Py*_CheckExact() in addition to each the PyType_IsSubtype check.
* Slightly revised version of patch #1538956:Marc-André Lemburg2006-08-141-0/+24
| | | | | | | | | | Replace UnicodeDecodeErrors raised during == and != compares of Unicode and other objects with a new UnicodeWarning. All other comparisons continue to raise exceptions. Exceptions other than UnicodeDecodeErrors are also left untouched.
* Patch #1455898: Incremental mode for "mbcs" codec.Martin v. Löwis2006-06-141-0/+7
|
* Patch #1359618: Speed-up charmap encoder.Martin v. Löwis2006-06-041-0/+5
|
* needforspeed: added Py_MEMCPY macro (currently tuned for Visual C only),Fredrik Lundh2006-05-281-9/+2
| | | | | and use it for string copy operations. this gives a 20% speedup on some string benchmarks.
* needforspeed: added rpartition implementationFredrik Lundh2006-05-261-1/+12
|
* needforspeed: partition implementation, part two.Fredrik Lundh2006-05-261-0/+9
| | | | feel free to improve the documentation and the docstrings.
* needforspeed: check first *and* last character before doing a full memcmpFredrik Lundh2006-05-231-4/+6
|
* needforspeed: use memcpy for "long" strings; use a better algorithmFredrik Lundh2006-05-221-4/+9
| | | | for long repeats.
* needforspeed: speed up unicode repeat, unicode string copyFredrik Lundh2006-05-221-4/+7
|
* Merge ssize_t branch.Martin v. Löwis2006-02-151-46/+46
|
* _PyUnicode_IsWhitespace(),Tim Peters2005-10-291-2/+2
| | | | | | | | | _PyUnicode_IsLinebreak(): Changed the declarations to match the definitions. Don't know why they differed; MSVC warned about it; don't know why only these two functions use "const". Someone who does may want to do something saner ;-).
* SF bug #1251300: On UCS-4 builds the "unicode-internal" codec will now complainWalter Dörwald2005-08-301-0/+10
| | | | | about illegal code points. The codec now supports PEP 293 style error handlers. (This is a variant of the Nik Haldimann's patch that detects truncated data)
* Correct the handling of 0-termination of PyUnicode_AsWideChar()Marc-André Lemburg2004-11-221-2/+8
| | | | | | | | and its usage in PyLocale_strcoll(). Clarify the documentation on this. Thanks to Andreas Degert for pointing this out.
* SF patch #1056231: typo in comment (unicodeobject.h)Raymond Hettinger2004-10-311-1/+1
|
* SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now supportWalter Dörwald2004-09-071-0/+21
| | | | | | | | | | | decoding incomplete input (when the input stream is temporarily exhausted). codecs.StreamReader now implements buffering, which enables proper readline support for the UTF-16 decoders. codecs.StreamReader.read() has a new argument chars which specifies the number of characters to return. codecs.StreamReader.readline() and codecs.StreamReader.readlines() have a new argument keepends. Trailing "\n"s will be stripped from the lines if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and PyUnicode_DecodeUTF16Stateful.
* SF #989185: Drop unicode.iswide() and unicode.width() and addHye-Shik Chang2004-08-041-18/+0
| | | | | | | | | | | | unicodedata.east_asian_width(). You can still implement your own simple width() function using it like this: def width(u): w = 0 for c in unicodedata.normalize('NFC', u): cwidth = unicodedata.east_asian_width(c) if cwidth in ('W', 'F'): w += 2 else: w += 1 return w
* Allow string and unicode return types from .encode()/.decode()Marc-André Lemburg2004-07-081-0/+11
| | | | | methods on string and unicode objects. Added unicode.decode() which was missing for no apparent reason.
* - SF #962502: Add two more methods for unicode type; width() andHye-Shik Chang2004-06-021-0/+18
| | | | | | | iswide() for east asian width manipulation. (Inspired by David Goodger, Reviewed by Martin v. Loewis) - Move _PyUnicode_TypeRecord.flags to the end of the struct so that no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
* Add rsplit method for str and unicode builtin types.Hye-Shik Chang2003-12-151-0/+20
| | | | | SF feature request #801847. Original patch is written by Sean Reifschneider.
* Add name mangling for new PyUnicode_FromOrdinal() and fix declarationMarc-André Lemburg2002-08-121-1/+3
| | | | to use new extern macro.
* Excise DL_EXPORT from Include.Mark Hammond2002-08-121-72/+72
| | | | Thanks to Skip Montanaro and Kalle Svensson for the patches.
* Add C API PyUnicode_FromOrdinal() which exposes unichr() at C level.Marc-André Lemburg2002-08-111-0/+12
| | | | | | | u'%c' will now raise a ValueError in case the argument is an integer outside the valid range of Unicode code point ordinals. Closes SF bug #593581.
* Fix for bug [ 561796 ] string.find causes lazy errorMarc-André Lemburg2002-05-291-1/+2
|
* Apply patch diff.txt from SF feature requestWalter Dörwald2002-04-221-0/+7
| | | | | | | | | http://www.python.org/sf/444708 This adds the optional argument for str.strip to unicode.strip too and makes it possible to call str.strip with a unicode argument and unicode.strip with a str argument.
* SF patch #470578: Fixes to synchronize unicode() and str()Guido van Rossum2001-10-191-10/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements what we have discussed on python-dev late in September: str(obj) and unicode(obj) should behave similar, while the old behaviour is retained for unicode(obj, encoding, errors). The patch also adds a new feature with which objects can provide unicode(obj) with input data: the __unicode__ method. Currently no new tp_unicode slot is implemented; this is left as option for the future. Note that PyUnicode_FromEncodedObject() no longer accepts Unicode objects as input. The API name already suggests that Unicode objects do not belong in the list of acceptable objects and the functionality was only needed because PyUnicode_FromEncodedObject() was being used directly by unicode(). The latter was changed in the discussed way: * unicode(obj) calls PyObject_Unicode() * unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject() One thing left open to discussion is whether to leave the PyUnicode_FromObject() API as a thin API extension on top of PyUnicode_FromEncodedObject() or to turn it into a (macro) alias for PyObject_Unicode() and deprecate it. Doing so would have some surprising consequences though, e.g. u"abc" + 123 would turn out as u"abc123"... [Marc-Andre didn't have time to check this in before the deadline. I hope this is OK, Marc-Andre! You can still make changes and commit them on the trunk after the branch has been made, but then please mail Barry a context diff if you want the change to be merged into the 2.2b1 release branch. GvR]
* Patch #435971: UTF-7 codec by Brian Quinlan.Marc-André Lemburg2001-09-201-0/+18
|
* Fix for bug #462737.Marc-André Lemburg2001-09-191-3/+3
|
* Possibly the end of SF [#460020] bug or feature: unicode() and subclasses.Tim Peters2001-09-111-0/+2
| | | | | Changed unicode(i) to return a true Unicode object when i is an instance of a unicode subclass. Added PyUnicode_CheckExact macro.
* Make the Py<type>_Check() macro use PyObject_TypeCheck().Guido van Rossum2001-08-301-1/+1
|
* Patch #445762: Support --disable-unicodeMartin v. Löwis2001-08-171-0/+7
| | | | | | | | - Do not compile unicodeobject, unicodectype, and unicodedata if Unicode is disabled - check for Py_USING_UNICODE in all places that use Unicode functions - disables unicode literals, and the builtin functions - add the types.StringTypes list - remove Unicode literals from most tests.
* SF patch #438013 Remove 2-byte Py_UCS2 assumptionsTim Peters2001-08-091-6/+0
| | | | | | | | Removed all instances of Py_UCS2 from the codebase, and so also (I hope) the last remaining reliance on the platform having an integral type with exactly 16 bits. PyUnicode_DecodeUTF16() and PyUnicode_EncodeUTF16() now read and write one byte at a time.
* As discussed on python-dev: this patch adds name mangling toMarc-André Lemburg2001-07-311-0/+150
| | | | | assure that extensions and interpreters using the Unicode APIs were compiled using the same Unicode width.
* Add _PyUnicode_AsDefaultEncodedString to unicodeobject.h.Jeremy Hylton2001-07-301-0/+17
| | | | | | | And remove all the extern decls in the middle of .c files. Apparently, it was excluded from the header file because it is intended for internal use by the interpreter. It's still intended for internal use and documented as such in the header file.
* removed "register const" from scalar arguments to the unicodeFredrik Lundh2001-06-271-15/+15
| | | | predicates
* use Py_UNICODE_WIDE instead of USE_UCS4_STORAGE and Py_UNICODE_SIZEFredrik Lundh2001-06-271-6/+7
| | | | tests.
* Encode surrogates in UTF-8 even for a wide Py_UNICODE.Martin v. Löwis2001-06-271-0/+3
| | | | | | | Implement sys.maxunicode. Explicitly wrap around upper/lower computations for wide Py_UNICODE. When decoding large characters with UTF-8, represent expected test results using the \U notation.
* Make Unicode work a bit better on Windows...Fredrik Lundh2001-06-261-0/+8
|