path: root/Objects/unicodeobject.c
Commit message | Author | Age | Files | Lines
* Make Py_BuildValue, PyObject_CallFunction and | Martin v. Löwis | 2006-04-14 | 1 | -0/+1
  PyObject_CallMethod aware of PY_SSIZE_T_CLEAN.
* Remove another INT_MAX limitation | Martin v. Löwis | 2006-04-13 | 1 | -2/+1
* Change more ints to Py_ssize_t. | Martin v. Löwis | 2006-04-13 | 1 | -40/+39
* Revert 34153: Py_UNICODE should not be signed. | Martin v. Löwis | 2006-04-13 | 1 | -8/+3
* spread the extern "C" { } magic pixie dust around. Python itself builds now | Anthony Baxter | 2006-04-13 | 1 | -0/+10
  using a C++ compiler. Still lots and lots of errors in the modules built by
  setup.py, and a bunch of warnings from g++ in the core.
* More low-hanging fruit. Still need to re-arrange some code (or find a better | Anthony Baxter | 2006-04-11 | 1 | -5/+6
  solution) in the same way as listobject.c got changed. Hoping for a better
  solution.
* That one was a mistake. | Georg Brandl | 2006-03-30 | 1 | -1/+1
* Remove unnecessary casts in type object initializers. | Georg Brandl | 2006-03-30 | 1 | -4/+4
* - Reindent a confusingly indented piece of code (no intended code changes | Thomas Wouters | 2006-03-12 | 1 | -13/+16
    there)
  - Add missing DECREFs of inner-scope 'temp' variable
  - Add various missing DECREFs by changing 'return NULL' into 'goto onError'
  - Avoid double DECREF when last _PyUnicode_Resize() fails

  Coverity found one of the missing DECREFs, but oddly enough not the others.
* Update Unicode database to Unicode 4.1. | Martin v. Löwis | 2006-03-09 | 1 | -1/+1
* Checking in the code for PEP 357. | Guido van Rossum | 2006-03-07 | 1 | -2/+5
  This was mostly written by Travis Oliphant. I've inspected it all; Neal
  Norwitz and MvL have also looked at it (in an earlier incarnation).
* SF #1444030: Fix several potential defects found by Coverity. | Hye-Shik Chang | 2006-03-07 | 1 | -8/+14
  (reviewed by Neal Norwitz)
* Revert backwards-incompatible const changes. | Martin v. Löwis | 2006-02-27 | 1 | -1/+1
* Use correct PyArg_Parse format char for Py_ssize_t in unicode.center(). | Thomas Wouters | 2006-02-16 | 1 | -1/+1
  Fixes:

    >>> u"".center(10)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    MemoryError

  on 64-bit systems.
* Use Py_ssize_t for counts and sizes. | Martin v. Löwis | 2006-02-16 | 1 | -1/+1
  Convert Py_ssize_t using PyInt_FromSsize_t
* Support %zd in PyErr_Format and PyString_FromFormat. | Martin v. Löwis | 2006-02-16 | 1 | -6/+3
* doubletounicode(), longtounicode(): | Tim Peters | 2006-02-16 | 1 | -4/+8
  Py_SAFE_DOWNCAST can evaluate its first argument multiple times in a
  debug build. This caused two distinct assert-failures in test_unicode
  run under a debug build. Rewrote the code in trivial ways so that
  multiple evaluation of the first argument doesn't hurt.
* Remove two unused Py_ssize_t variables (merge glitches, looks like.) | Thomas Wouters | 2006-02-15 | 1 | -2/+0
* Merge ssize_t branch. | Martin v. Löwis | 2006-02-15 | 1 | -287/+299
* - Patch #1400181, fix unicode string formatting to not use the locale. | Neal Norwitz | 2006-01-10 | 1 | -16/+21
  This is how string objects work. u'%f' could use , instead of .
  for the decimal point. Now both strings and unicode always use periods.

  This is the code that would break:

    import locale
    locale.setlocale(locale.LC_NUMERIC, 'de_DE')
    u'%.1f' % 1.0
    assert '1.0' == u'%.1f' % 1.0

  I couldn't create a test case which fails, but this fixes the problem.

  Will backport.
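  Editorial sketch, not part of the commit: the check from the message above, rewritten for Python 3, where `%` formatting of floats is assumed not to consult LC_NUMERIC. The helper name `format_one_decimal` is hypothetical, and the `de_DE` locale may not be installed, so a failure to set it is tolerated.

```python
import locale


def format_one_decimal(value):
    """Format a float with the % operator (expected to always use a period)."""
    return '%.1f' % value


try:
    # 'de_DE' is the locale named in the commit message; it may be missing here.
    locale.setlocale(locale.LC_NUMERIC, 'de_DE')
except locale.Error:
    pass  # keep whatever numeric locale is currently active

result = format_one_decimal(1.0)

# Restore a neutral numeric locale so later code is unaffected.
locale.setlocale(locale.LC_NUMERIC, 'C')
print(result)
```

  If the formatting ignores the locale as intended, the printed value uses a period whether or not `de_DE` could be activated.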
* Fix icc warnings: remove (sometimes) unused variable conditionally | Neal Norwitz | 2006-01-08 | 1 | -2/+4
* Stop maintaining the buildno file. | Martin v. Löwis | 2006-01-05 | 1 | -12/+15
  Also, stop determining Unicode sizes with PyString_GET_SIZE.
* Bug #1379994: Fix *unicode_escape codecs to encode r'\' as r'\\' | Hye-Shik Chang | 2005-12-17 | 1 | -3/+3
  just like string codecs.
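  Editorial sketch, not part of the commit: the escaping behaviour described above, checked against the `unicode_escape` codec in Python 3. Only `unicode_escape` is shown; the variable names are illustrative.

```python
# A single backslash character (written here as an escaped literal).
backslash = '\\'

# Encoding should double the backslash so the output can be decoded again.
encoded = backslash.encode('unicode_escape')

# Round trip: decoding the escaped bytes should return the original character.
decoded = encoded.decode('unicode_escape')

print(len(backslash), len(encoded), decoded == backslash)
```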
* Add const to several API functions that take char *. | Jeremy Hylton | 2005-12-10 | 1 | -1/+1
  In C++, it's an error to pass a string literal to a char* function
  without a const_cast(). Rather than require every C++ extension
  module to put a cast around string literals, fix the API to state the
  const-ness.

  I focused on parts of the API where people usually pass literals:
  PyArg_ParseTuple() and friends, Py_BuildValue(), PyMethodDef, the type
  slots, etc. Predictably, there was a large set of functions that
  needed to be fixed as a result of these changes. The most pervasive
  change was to make the keyword args list passed to
  PyArg_ParseTupleAndKeywords() a const char *kwlist[].

  One cast was required as a result of the changes: A type object
  mallocs the memory for its tp_doc slot and later frees it.
  PyTypeObject says that tp_doc is const char *; but if the type was
  created by type_new(), we know it is safe to cast to char *.
* Fix leaked reference to None. | Walter Dörwald | 2005-11-28 | 1 | -0/+1
* Another comment typo fix | Andrew M. Kuchling | 2005-11-02 | 1 | -1/+1
* Fix typo in comment. | Walter Dörwald | 2005-11-02 | 1 | -1/+1
* fix typos, mostly in comments | Fred Drake | 2005-10-28 | 1 | -2/+2
* Fix bug: | Michael W. Hudson | 2005-10-21 | 1 | -4/+0
  [ 1327110 ] wrong TypeError traceback in generator expressions

  by removing the code that can stomp on the users' TypeError raised by the
  iterable argument to ''.join() -- PySequence_Fast (now?) gives a perfectly
  reasonable message itself. Also, a couple of tests.
* Whitespace corrections. | Marc-André Lemburg | 2005-10-19 | 1 | -19/+19
* Bug fix for [ 1331062 ] utf 7 codec broken. | Marc-André Lemburg | 2005-10-19 | 1 | -8/+16
  Backport candidate.
* Part of SF patch #1313939: Speedup charmap decoding by extending | Walter Dörwald | 2005-10-06 | 1 | -75/+107
  PyUnicode_DecodeCharmap() to accept a unicode string as the mapping
  argument which is used as a mapping table.

  This code isn't used by any of the codecs yet.
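  Editorial sketch, not part of the commit: what a string used as a charmap decoding table looks like from Python 3, via `codecs.charmap_decode` (the call the standard `encodings` modules make). The three-entry table is a made-up example; real codecs use 256-entry tables.

```python
import codecs

# Hypothetical table: byte 0 -> 'x', byte 1 -> 'y', byte 2 -> 'z'.
# The byte value is used as an index into the string.
decoding_table = 'xyz'

# Returns the decoded text and the number of input bytes consumed.
text, consumed = codecs.charmap_decode(b'\x02\x00\x01', 'strict', decoding_table)

print(text, consumed)
```

  Indexing into a string avoids a dictionary lookup per byte, which is the speedup the commit subject refers to.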
* SF bug #1251300: On UCS-4 builds the "unicode-internal" codec will now complain | Walter Dörwald | 2005-08-30 | 1 | -0/+75
  about illegal code points. The codec now supports PEP 293 style error
  handlers. (This is a variant of Nik Haldimann's patch that detects
  truncated data)
* Correct the handling of 0-termination of PyUnicode_AsWideChar() | Marc-André Lemburg | 2004-11-22 | 1 | -1/+7
  and its usage in PyLocale_strcoll().

  Clarify the documentation on this.

  Thanks to Andreas Degert for pointing this out.
* Applied patch for [ 1047269 ] Buffer overwrite in PyUnicode_AsWideChar. | Marc-André Lemburg | 2004-10-15 | 1 | -2/+2
  Python 2.3.x candidate.
* Initialize sep and seplen to suppress warning from gcc. | Skip Montanaro | 2004-09-16 | 1 | -3/+3
* Add a missing line continuation character. | Thomas Heller | 2004-09-15 | 1 | -1/+1
* Make the hint about the None default less ambiguous. | Walter Dörwald | 2004-09-14 | 1 | -1/+1
* Enhance the docstrings for unicode.split() and string.split() | Walter Dörwald | 2004-09-14 | 1 | -2/+2
  to make it clear that it is possible to pass None as the separator
  argument to get the default "any whitespace" separator.
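  Editorial sketch, not part of the commit: the behaviour those docstrings describe, shown with Python 3's `str.split()`. Passing `None` explicitly is mainly useful when a maximum split count must be given as the second positional argument.

```python
text = '  alpha \t beta\ngamma  '

# No separator: split on runs of any whitespace, ignoring leading/trailing runs.
default_split = text.split()

# An explicit None selects the same "any whitespace" behaviour.
none_split = text.split(None)

# None as the separator plus a maximum of one split.
limited = text.split(None, 1)

print(default_split, none_split, limited)
```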
* SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support | Walter Dörwald | 2004-09-07 | 1 | -23/+57
  decoding incomplete input (when the input stream is temporarily exhausted).
  codecs.StreamReader now implements buffering, which enables proper
  readline support for the UTF-16 decoders. codecs.StreamReader.read()
  has a new argument chars which specifies the number of characters to
  return. codecs.StreamReader.readline() and codecs.StreamReader.readlines()
  have a new argument keepends. Trailing "\n"s will be stripped from the lines
  if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and
  PyUnicode_DecodeUTF16Stateful.
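  Editorial sketch, not part of the commit: decoding incomplete input, illustrated with Python 3's `codecs.getincrementaldecoder` rather than the C functions named above. It assumes the incremental decoder buffers a truncated multi-byte sequence until the remaining bytes arrive.

```python
import codecs

# Create a stateful UTF-8 decoder that remembers undecoded trailing bytes.
decoder = codecs.getincrementaldecoder('utf-8')()

euro = '€'.encode('utf-8')  # three bytes: E2 82 AC

# Feed only the first two bytes: the sequence is incomplete, so nothing
# is returned yet and the bytes are kept in the decoder's state.
first = decoder.decode(euro[:2])

# Feed the final byte: the buffered sequence is completed and decoded.
second = decoder.decode(euro[2:])

print(repr(first), repr(second))
```

  A stateless `bytes.decode('utf-8')` call on the first two bytes alone would raise `UnicodeDecodeError`, which is the situation the stateful variants are meant to avoid.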
* PyUnicode_Join(): Bozo Alert. While this is chugging along, it may | Tim Peters | 2004-08-27 | 1 | -0/+12
  need to convert str objects from the iterable to unicode. So, if someone
  set the system default encoding to something nasty enough, the conversion
  process could mutate the input iterable as a side effect, and
  PySequence_Fast doesn't hide that from us if the input was a list. IOW,
  can't assume the size of PySequence_Fast's result is invariant across
  PyUnicode_FromObject() calls.
* PyUnicode_Join(): Rewrote to use PySequence_Fast(). This doesn't do | Tim Peters | 2004-08-27 | 1 | -126/+96
  much to reduce the size of the code, but greatly improves its clarity. It's
  also quicker in what's probably the most common case (the argument iterable
  is a list). Against it, if the iterable isn't a list or a tuple, a temp
  tuple is materialized containing the entire input sequence, and that's a
  bigger temp memory burden. Yawn.
* PyUnicode_Join(): Missed a spot where I intended a cast from size_t to | Tim Peters | 2004-08-27 | 1 | -1/+1
  int. I sure wish MS would gripe about that! Whatever, note that the
  statement above it guarantees that the cast loses no info.
* PyUnicode_Join(): Two primary aims: | Tim Peters | 2004-08-27 | 1 | -40/+120
  1. u1.join([u2]) is u2
  2. Be more careful about C-level int overflow.

  Since PySequence_Fast() isn't needed to achieve #1, it's not used -- but
  the code could sure be simpler if it were.
* SF #989185: Drop unicode.iswide() and unicode.width() and add | Hye-Shik Chang | 2004-08-04 | 1 | -67/+0
  unicodedata.east_asian_width(). You can still implement your own
  simple width() function using it like this:

    def width(u):
        w = 0
        for c in unicodedata.normalize('NFC', u):
            cwidth = unicodedata.east_asian_width(c)
            if cwidth in ('W', 'F'): w += 2
            else: w += 1
        return w
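  Editorial sketch, not part of the commit: the same width() recipe as a self-contained Python 3 function with its import. As in the commit message, only Wide ('W') and Fullwidth ('F') characters count as two columns; how to treat ambiguous-width or combining characters is left out here.

```python
import unicodedata


def width(text):
    """Approximate display width: 2 columns for wide/fullwidth, else 1."""
    total = 0
    # Normalize first so precomposed forms are measured where they exist.
    for ch in unicodedata.normalize('NFC', text):
        if unicodedata.east_asian_width(ch) in ('W', 'F'):
            total += 2
        else:
            total += 1
    return total


print(width('abc'), width('日本語'))
```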
* Let u'%s' % obj try obj.__unicode__() first and fallback to obj.__str__(). | Marc-André Lemburg | 2004-07-23 | 1 | -10/+12
* Moved SunPro warning suppression into pyport.h and out of individual | Nicholas Bastin | 2004-07-15 | 1 | -4/+0
  modules and objects.
* Fix a copy&paste typo. | Marc-André Lemburg | 2004-07-10 | 1 | -1/+1
* .encode()/.decode() patch part 2. | Marc-André Lemburg | 2004-07-08 | 1 | -0/+10
* Allow string and unicode return types from .encode()/.decode() | Marc-André Lemburg | 2004-07-08 | 1 | -5/+95
  methods on string and unicode objects. Added unicode.decode()
  which was missing for no apparent reason.