path: root/Include/unicodeobject.h
Commit message [Author, Age, Files, Lines]
* Added #ifndef's to avoid compiler errors. [Marc-André Lemburg, 2000-08-11, 1 file, -1/+3]
* This patch finalizes the move from UTF-8 to a default encoding in the Python Unicode implementation. [Marc-André Lemburg, 2000-08-03, 1 file, -2/+3]
  The internal buffer used for implementing the buffer protocol is renamed to defenc to make this change visible. It now holds the default encoded version of the Unicode object and is calculated on demand (NULL otherwise).
  Since the default encoding defaults to ASCII, this will mean that Unicode objects which hold non-ASCII characters will no longer work on C APIs using the "s" or "t" parser markers. C APIs must now explicitly provide Unicode support via the "u", "U" or "es"/"es#" parser markers in order to work with non-ASCII Unicode strings.
  (Note: this patch will also have to be applied to the 1.6 branch of the CVS tree.)
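The effect of an ASCII default encoding can be sketched in modern Python 3. This is an illustration only, not the 2000-era C API: the helper name `encode_with_default` is hypothetical, and `str.encode` merely stands in for the on-demand default-encoded buffer described in the commit message.

```python
# Illustration in modern Python 3; encode_with_default is a hypothetical
# helper, not part of the patch or of any Python C API.
def encode_with_default(text, default_encoding="ascii"):
    """Return text encoded with default_encoding, or None if it cannot be encoded."""
    try:
        return text.encode(default_encoding)
    except UnicodeEncodeError:
        # Non-ASCII characters have no representation in an ASCII default encoding.
        return None


ascii_only = encode_with_default("spam")       # pure ASCII text encodes fine
non_ascii = encode_with_default("sp\u00e4m")   # contains U+00E4, not ASCII
print(ascii_only)
print(non_ascii)
```

Under this sketch, text that fails to encode corresponds to the case where a C API relying on the "s" or "t" markers can no longer accept the object.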
* Changing the CNRI copyright notice according to CNRI's instructions. [Guido van Rossum, 2000-08-03, 1 file, -1/+1]
  This is a notice without a date, which apparently is not a claim to copyright but only advice to the reader. IANAL. :-)
* ANSIfications: fix empty arglists, and remove the checks for 'HAVE_STDARG_PROTOTYPES' (consider it true, remove false branch) [Thomas Wouters, 2000-07-22, 1 file, -1/+1]
* Spelling fixes supplied by Rob W. W. Hooft. [Thomas Wouters, 2000-07-16, 1 file, -2/+2]
  All these are fixes in either comments, docstrings or error messages. I fixed two minor things in test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't").
  There is a minor style issue involved: Guido seems to have preferred English grammar (behaviour, honour) in a couple places. This patch changes that to American, which is the more prominent style in the source. I prefer English myself, so if English is preferred, I'd be happy to supply a patch myself ;)
* Added new API PyUnicode_FromEncodedObject() which supports decoding objects including instance objects. [Marc-André Lemburg, 2000-07-07, 1 file, -0/+18]
  The old API PyUnicode_FromObject() is still available as shortcut.
* Bill Tutt: Added Py_UCS4 typedef to hold UCS4 values (these need at least 32 bits as opposed to Py_UNICODE which relies on having 16 bits). [Marc-André Lemburg, 2000-07-07, 1 file, -0/+11]
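Why a 16-bit unit is not enough for every UCS4 value can be sketched in modern Python 3. This is only an illustration: the helper `utf16_code_units` is hypothetical and does not appear in the header.

```python
# Illustration in modern Python 3; utf16_code_units is a hypothetical helper.
def utf16_code_units(ch):
    """Count how many 16-bit units are needed to store ch in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


bmp_char = "\u20ac"         # EURO SIGN, fits in one 16-bit unit
astral_char = "\U00010348"  # GOTHIC LETTER HWAIR, code point above 0xFFFF

print(hex(ord(bmp_char)), utf16_code_units(bmp_char))
print(hex(ord(astral_char)), utf16_code_units(astral_char))
```

A code point above 0xFFFF needs two 16-bit units (a surrogate pair) but fits in a single value of a type that is at least 32 bits wide.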
* Modified the ISALPHA and ISALNUM macros to use the new lookup APIs from unicodectype.c [Marc-André Lemburg, 2000-07-05, 1 file, -5/+8]
* Added new Py_UNICODE_ISALPHA() and Py_UNICODE_ISALNUM() macros which are true for alphabetic and alphanumeric characters resp. [Marc-André Lemburg, 2000-07-03, 1 file, -0/+11]
  The macros are currently implemented using the existing is* tables but will have to be updated to meet the Unicode standard definitions (add tables for non-cased letters and letter modifiers).
* Marc-Andre Lemburg <mal@lemburg.com>: [Marc-André Lemburg, 2000-06-18, 1 file, -1/+2]
  Added optimization proposed by Andrew Kuchling to the Unicode matching macro.
* M.-A. Lemburg <mal@lemburg.com>: [Fred Drake, 2000-05-09, 1 file, -4/+26]
  Added PyUnicode_GetDefaultEncoding() and PyUnicode_SetDefaultEncoding() APIs.
* Marc-Andre Lemburg: [Guido van Rossum, 2000-04-11, 1 file, -1/+1]
  Changed PyUnicode_Splitlines() maxsplit argument to keepends. The maxsplit functionality was replaced by the keepends functionality which allows keeping the line end markers together with the string.
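The keepends behaviour can be seen at the Python level. The sketch below uses `str.splitlines()` from modern Python 3 as a stand-in for the C API named above; the sample text is made up for illustration.

```python
# Illustration in modern Python 3 using str.splitlines(); the sample text is
# an assumption chosen to show two different line end markers.
text = "first\nsecond\r\nthird"

without_ends = text.splitlines()               # line end markers dropped
with_ends = text.splitlines(keepends=True)     # line end markers kept

print(without_ends)
print(with_ends)
```

With the markers kept, joining the pieces reproduces the original string exactly, which is not possible once the markers are dropped.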
* Marc-Andre Lemburg: New exported API PyUnicode_Resize(). [Guido van Rossum, 2000-04-10, 1 file, -0/+19]
* Marc-Andre's third try at this bulk patch seems to work (except that his copy of test_contains.py seems to be broken -- the lines he deleted were already absent). [Guido van Rossum, 2000-04-05, 1 file, -2/+33]
  Checkin messages:

  New Unicode support for int(), float(), complex() and long().
  - new APIs PyInt_FromUnicode() and PyLong_FromUnicode()
  - added support for Unicode to PyFloat_FromString()
  - new encoding API PyUnicode_EncodeDecimal() which converts Unicode to a decimal char* string (used in the above new APIs)
  - shortcuts for calls like int(<int object>) and float(<float obj>)
  - tests for all of the above

  Unicode compares and contains checks:
  - comparing Unicode and non-string types now works; TypeErrors are masked, all other errors such as ValueError during Unicode coercion are passed through (note that PyUnicode_Compare does not implement the masking -- PyObject_Compare does this)
  - contains now works for non-string types too; TypeErrors are masked and 0 returned; all other errors are passed through

  Better testing support for the standard codecs.

  Misc minor enhancements, such as an alias dbcs for the mbcs codec.

  Changes:
  - PyLong_FromString() now applies the same error checks as does PyInt_FromString(): trailing garbage is reported as error and no longer silently ignored. The only characters which may be trailing the digits are 'L' and 'l' -- these are still silently ignored.
  - string.ato?() now directly interface to int(), long() and float(). The error strings are now a little different, but the type still remains the same. These functions are now ready to get declared obsolete ;-)
  - PyNumber_Int() now also does a check for embedded NULL chars in the input string; PyNumber_Long() already did this (and still does)

  Followed by:

  Looks like I've gone a step too far there... (and test_contains.py seems to have a bug too). I've changed back to reporting all errors in PyUnicode_Contains() and added a few more test cases to test_contains.py (plus corrected the join() NameError).
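The Unicode-aware number parsing and the trailing-garbage check can be sketched in modern Python 3. This is an illustration only: `parse_int_or_none` is a hypothetical helper, and modern Python differs from the 2000 behaviour described above (there is no long(), and a trailing 'L' is rejected rather than ignored).

```python
# Illustration in modern Python 3; parse_int_or_none is a hypothetical helper
# and does not reproduce the 2000-era handling of a trailing 'L'.
def parse_int_or_none(text):
    """Return int(text), or None when the text is not a clean integer literal."""
    try:
        return int(text)
    except ValueError:
        # Trailing garbage is reported as an error instead of being ignored.
        return None


arabic_indic = parse_int_or_none("\u0661\u0662\u0663")  # ARABIC-INDIC DIGITS 1, 2, 3
trailing_garbage = parse_int_or_none("12abc")
trailing_l = parse_int_or_none("12L")  # accepted in 2000, rejected in Python 3

print(arabic_indic, trailing_garbage, trailing_l)
```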
* Marc-Andre Lemburg: [Guido van Rossum, 2000-03-28, 1 file, -1/+7]
  The attached patch set includes a workaround to get Python with Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause is a bug in the BSDI wchar.h header file) and Python interfaces for the MBCS codec donated by Mark Hammond.
  Also included are some minor corrections w/r to the docs of the new "es" and "es#" parser markers (use PyMem_Free() instead of free(); thanks to Mark Hammond for finding these).
  The unicodedata tests are now in a separate file (test_unicodedata.py) to avoid problems if the module cannot be found.
* Prototypes added for MBCS codecs. (Win32 only.) [Guido van Rossum, 2000-03-28, 1 file, -0/+20]
* On 17-Mar-2000, Marc-Andre Lemburg said: [Barry Warsaw, 2000-03-20, 1 file, -7/+9]
  Attached you find an update of the Unicode implementation. The patch is against the current CVS version. I would appreciate if someone with CVS checkin permissions could check the changes in.
  The patch contains all bugs and patches sent this week and also fixes a leak in the codecs code and a bug in the free list code for Unicode objects (which only shows up when compiling Python with Py_DEBUG; thanks to MarkH for spotting this one).
* Marc-Andre Lemburg: add declaration for PyUnicode_Contains(). [Guido van Rossum, 2000-03-13, 1 file, -0/+11]
* Unicode implementation by Marc-Andre Lemburg based on original code by Fredrik Lundh. [Guido van Rossum, 2000-03-10, 1 file, -0/+754]