summaryrefslogtreecommitdiffstats
path: root/Lib/encodings
Commit message (Collapse)AuthorAgeFilesLines
* This patch changes the default behaviour of the builtin charmapMarc-André Lemburg2001-01-0352-267/+359
| | | | | | | | | | | | | | | | codec to not apply Latin-1 mappings for keys which are not found in the mapping dictionaries, but instead treat them as undefined mappings. The patch was originally written by Martin v. Loewis with some additional (cosmetic) changes and an updated test script by Marc-Andre Lemburg. The standard codecs were recreated from the most current files available at the Unicode.org site using the Tools/scripts/gencodec.py tool. This patch closes the bugs #116285 and #119960.
* Changed .getaliases() support to register the new aliases in theMarc-André Lemburg2000-12-121-4/+12
| | | | | | | | | | | encodings package aliases mapping dictionary rather than in the internal cache used by the search function. This enables aliases to take advantage of the full normalization process applied to encoding names which was previously not available. The patch restricts alias registration to new aliases. Existing aliases cannot be overridden anymore.
* Spelling fixes supplied by Rob W. W. Hooft. All these are fixes in eitherThomas Wouters2000-07-161-2/+2
| | | | | | | | | | comments, docstrings or error messages. I fixed two minor things in test_winreg.py ("didn't" -> "Didn't" and "Didnt" -> "Didn't"). There is a minor style issue involved: Guido seems to have preferred English grammar (behaviour, honour) in a couple places. This patch changes that to American, which is the more prominent style in the source. I prefer English myself, so if English is preferred, I'd be happy to supply a patch myself ;)
* Marc-Andre Lemburg <mal@lemburg.com>:Marc-André Lemburg2000-06-131-2/+2
| | | | | Removed import of string module -- use string methods directly. Thanks to Finn Bock.
* Marc-Andre Lemburg <mal@lemburg.com>:Marc-André Lemburg2000-06-071-0/+22
| | | | | Added some more codec aliases. Some of them are needed by the new locale.py encoding support.
* New codec which always raises an exception when used. ThisMarc-André Lemburg2000-06-071-0/+34
| | | | | codec can be used to effectively switch off string coercion to Unicode.
* Marc-Andre's third try at this bulk patch seems to work (except thatGuido van Rossum2000-04-052-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | his copy of test_contains.py seems to be broken -- the lines he deleted were already absent). Checkin messages: New Unicode support for int(), float(), complex() and long(). - new APIs PyInt_FromUnicode() and PyLong_FromUnicode() - added support for Unicode to PyFloat_FromString() - new encoding API PyUnicode_EncodeDecimal() which converts Unicode to a decimal char* string (used in the above new APIs) - shortcuts for calls like int(<int object>) and float(<float obj>) - tests for all of the above Unicode compares and contains checks: - comparing Unicode and non-string types now works; TypeErrors are masked, all other errors such as ValueError during Unicode coercion are passed through (note that PyUnicode_Compare does not implement the masking -- PyObject_Compare does this) - contains now works for non-string types too; TypeErrors are masked and 0 returned; all other errors are passed through Better testing support for the standard codecs. Misc minor enhancements, such as an alias dbcs for the mbcs codec. Changes: - PyLong_FromString() now applies the same error checks as does PyInt_FromString(): trailing garbage is reported as error and not longer silently ignored. The only characters which may be trailing the digits are 'L' and 'l' -- these are still silently ignored. - string.ato?() now directly interface to int(), long() and float(). The error strings are now a little different, but the type still remains the same. These functions are now ready to get declared obsolete ;-) - PyNumber_Int() now also does a check for embedded NULL chars in the input string; PyNumber_Long() already did this (and still does) Followed by: Looks like I've gone a step too far there... (and test_contains.py seem to have a bug too). I've changed back to reporting all errors in PyUnicode_Contains() and added a few more test cases to test_contains.py (plus corrected the join() NameError).
* Marc-Andre Lemburg: use all lowercase names.Guido van Rossum2000-03-311-8/+8
|
* Marc-Andre Lemburg:Guido van Rossum2000-03-281-1/+0
| | | | | | | | | | | | | | | The attached patch set includes a workaround to get Python with Unicode compile on BSDI 4.x (courtesy Thomas Wouters; the cause is a bug in the BSDI wchar.h header file) and Python interfaces for the MBCS codec donated by Mark Hammond. Also included are some minor corrections w/r to the docs of the new "es" and "es#" parser markers (use PyMem_Free() instead of free(); thanks to Mark Hammond for finding these). The unicodedata tests are now in a separate file (test_unicodedata.py) to avoid problems if the module cannot be found.
* MBCS codecs for Windows. Contributed by Mark Hammond.Guido van Rossum2000-03-281-0/+37
|
* On 17-Mar-2000, Marc-Andre Lemburg said:Barry Warsaw2000-03-201-3/+3
| | | | | | | | | | | | | Attached you find an update of the Unicode implementation. The patch is against the current CVS version. I would appreciate if someone with CVS checkin permissions could check the changes in. The patch contains all bugs and patches sent this week and also fixes a leak in the codecs code and a bug in the free list code for Unicode objects (which only shows up when compiling Python with Py_DEBUG; thanks to MarkH for spotting this one).
* Marc-Andre Lemburg: Unicode encodings.Guido van Rossum2000-03-1064-0/+8398