path: root/Lib/encodings
Commit message | Author | Age | Files | Lines
* Fix the encodings package codec search function to only search inside its own package. Fixes problem reported in patch #1433198. Add codec search function for codec test codec. | Marc-André Lemburg | 2006-02-19 | 1 | -1/+1
* Patch #1177307: UTF-8-Sig codec. | Martin v. Löwis | 2006-01-08 | 1 | -0/+57
* Whitespace normalization. | Tim Peters | 2005-12-25 | 61 | -32998/+32953
* Cosmetic change: make all hex literals use upper case hex so that they look more like the Unicode Consortium files. Add ending new-line to all source files. | Marc-André Lemburg | 2005-10-24 | 45 | -14236/+14281
* Removed the decoding_map from the codecs where this is possible. Replaced the tis_620, cp1140 and koi8_u codecs with new ones based on custom mapping files. | Marc-André Lemburg | 2005-10-24 | 45 | -25663/+22734
* Replace the old EBCDIC codecs with new ones using the decoding table. | Marc-André Lemburg | 2005-10-21 | 4 | -981/+3027
* Alias iso8859_1 to latin_1 which is the same encoding, but has a much faster codec implementation. | Marc-André Lemburg | 2005-10-21 | 1 | -0/+7
* Add a few more Mac OS encodings. The mapping tables for these are available at ftp.unicode.org. | Marc-André Lemburg | 2005-10-21 | 5 | -0/+3414
* Replace the old charmap codecs with new ones generated from the current mapping tables available at ftp.unicode.org. These new codecs include and use character decoding tables which speeds up decoding by a few factors. | Marc-André Lemburg | 2005-10-21 | 49 | -5129/+29964
* Bug #1245379: Add "unicode-1-1-utf-7" as an alias for "utf-7" as specified by RFC 1642. | Walter Dörwald | 2005-10-09 | 1 | -0/+1
* No need to import exceptions, they are builtins | Neal Norwitz | 2005-09-01 | 1 | -3/+2
* Make IDNA return an empty string when the input is empty. Fixes #1163178. Will backport to 2.4. | Martin v. Löwis | 2005-08-25 | 1 | -0/+6
* Reset internal buffers when seek() is called. This fixes SF bug #1156259. | Walter Dörwald | 2005-03-14 | 1 | -0/+7
* Fix wrong variable name. | Walter Dörwald | 2004-12-29 | 1 | -1/+1
* Rearranged mappings to value sorting order. | Marc-André Lemburg | 2004-12-10 | 1 | -13/+13
* SF patch #998993: The UTF-8 and the UTF-16 stateful decoders now support decoding incomplete input (when the input stream is temporarily exhausted). codecs.StreamReader now implements buffering, which enables proper readline support for the UTF-16 decoders. codecs.StreamReader.read() has a new argument chars which specifies the number of characters to return. codecs.StreamReader.readline() and codecs.StreamReader.readlines() have a new argument keepends. Trailing "\n"s will be stripped from the lines if keepends is false. Added C APIs PyUnicode_DecodeUTF8Stateful and PyUnicode_DecodeUTF16Stateful. | Walter Dörwald | 2004-09-07 | 4 | -72/+49
* Whitespace normalization. | Tim Peters | 2004-08-07 | 3 | -136/+136
* Added new codecs and aliases for ISO_8859-11, ISO_8859-16 and TIS-620. Closes SF bug #1001895: Adding missing ISO 8859 codecs, especially Thai. | Marc-André Lemburg | 2004-08-05 | 4 | -1/+285
* Whitespace normalization. | Tim Peters | 2004-07-31 | 1 | -95/+95
* New codec: [ 996067 ] hp-roman8 codec | Marc-André Lemburg | 2004-07-28 | 1 | -0/+139
* Added new codec hp-roman8 submitted as patch [ 996067 ] hp-roman8 codec. | Marc-André Lemburg | 2004-07-28 | 1 | -0/+7
* Bring CJKCodecs 1.1 into trunk. This completely reorganizes source and installed layouts to make maintenance simple and easy. And it also adds four new codecs; big5hkscs, euc-jis-2004, shift-jis-2004 and iso2022-jp-2004. | Hye-Shik Chang | 2004-07-18 | 25 | -87/+260
* Whitespace normalization. | Tim Peters | 2004-07-07 | 21 | -138/+118
* Convert input to a string object. Fixes #909230. Backported 2.3. | Martin v. Löwis | 2004-03-23 | 1 | -0/+1
* Add a new unicode codec: ptcp154 (Kazakh) | Hye-Shik Chang | 2004-03-19 | 2 | -0/+168
* Fix wrong character mapping in koi8_u: SF bug #902501. | Marc-André Lemburg | 2004-02-23 | 1 | -1/+1
* Let the default encodings search function lookup aliases before trying the codec import. This allows applications to install codecs which override (non-special-cased) builtin codecs. | Marc-André Lemburg | 2004-01-20 | 1 | -18/+26
* Add some more code page aliases needed for completeness. | Marc-André Lemburg | 2004-01-20 | 1 | -0/+16
* Fix a typo: s/iso_3022/iso2022/ | Hye-Shik Chang | 2004-01-20 | 1 | -1/+1
* Add CJK codecs support as discussed on python-dev. (SF #873597) Several style fixes are suggested by Martin v. Loewis and Marc-Andre Lemburg. Thanks! | Hye-Shik Chang | 2004-01-17 | 21 | -9/+780
* Revert previous change. MAL preferred the old version. | Raymond Hettinger | 2003-12-01 | 1 | -4/+41
* Simplified the code. | Raymond Hettinger | 2003-12-01 | 1 | -41/+4
* Fix typo in the comments. | Raymond Hettinger | 2003-09-24 | 1 | -1/+1
* Added codec for bz2 compression. | Raymond Hettinger | 2003-09-23 | 2 | -0/+67
* Support trailing dots in DNS names. Fixes #782510. Will backport to 2.3. | Martin v. Löwis | 2003-08-05 | 1 | -3/+15
* more generic reference to python interpreter | Skip Montanaro | 2003-07-22 | 1 | -1/+1
* Remove usage of re module from encodings package search function. | Marc-André Lemburg | 2003-05-16 | 1 | -4/+19
* Whitespace normalization. | Tim Peters | 2003-04-24 | 3 | -10/+9
* Implement IDNA (Internationalized Domain Names in Applications). | Martin v. Löwis | 2003-04-18 | 2 | -0/+409
* Revert Patch #670715: iconv support. | Martin v. Löwis | 2003-04-03 | 2 | -39/+0
* Handle iconv initialization errors | Neal Norwitz | 2003-02-28 | 1 | -1/+1
* Patch #670715: Universal Unicode Codec for POSIX iconv. | Martin v. Löwis | 2003-01-26 | 2 | -0/+40
* Whitespace normalization. | Tim Peters | 2002-12-24 | 1 | -1/+1
* Add new encoding for Ukrainian Cyrillic | Neal Norwitz | 2002-10-17 | 1 | -0/+54
* When looking for an alias, first look for the normalized name (which still may contain dots), then if that doesn't exist look for the name with dots replaced by underscores. This is a little more forgiving. | Guido van Rossum | 2002-10-04 | 1 | -1/+3
* Undo the removal. Guido mentioned that the encoding name is in active use by some email headers. | Marc-André Lemburg | 2002-10-04 | 1 | -0/+1
* Remove unneeded alias. | Marc-André Lemburg | 2002-10-04 | 1 | -1/+0
* Fix doc-string. | Marc-André Lemburg | 2002-10-04 | 1 | -3/+3
* Adapt lookup names to new more general encoding name normalization scheme. | Marc-André Lemburg | 2002-10-04 | 1 | -14/+14
* Extending the encoding name normalization to handle more non-alphanumeric characters. | Marc-André Lemburg | 2002-10-04 | 1 | -8/+20
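
Several entries in this log describe behaviour that can still be observed through the standard `codecs` API. The sketch below is an illustration only, written against a current Python 3 interpreter rather than the 2002-2006 trees these commits touched, so exact behaviour in those older versions may differ. The variable names and sample inputs are made up for the example. It exercises the iso8859_1/latin_1 alias, the "unicode-1-1-utf-7" alias, the UTF-8-Sig codec, IDNA handling of empty input and trailing dots, and the `chars` and `keepends` arguments of `codecs.StreamReader`.

```python
import codecs
import io

# Alias handling: iso8859_1 and latin_1 should resolve to the same codec.
same_codec = codecs.lookup("iso8859_1").name == codecs.lookup("latin_1").name

# The RFC 1642 name is registered as an alias for utf-7.
utf7_name = codecs.lookup("unicode-1-1-utf-7").name

# UTF-8-Sig: the signature (BOM) is stripped on decode and written on encode.
decoded = b"\xef\xbb\xbfhi".decode("utf-8-sig")
encoded = "hi".encode("utf-8-sig")

# IDNA: empty input gives an empty result; a trailing dot is accepted.
empty_idna = "".encode("idna")
dotted_idna = "python.org.".encode("idna")

# StreamReader: `chars` limits the number of decoded characters returned,
# and keepends=False strips line endings from the returned lines.
data = "ab\ncd\n".encode("utf-16")
first_char = codecs.getreader("utf-16")(io.BytesIO(data)).read(chars=1)
lines = codecs.getreader("utf-16")(io.BytesIO(data)).readlines(keepends=False)
```

Each value can be printed or inspected in an interactive session; separate readers are used for `read()` and `readlines()` so that neither call depends on the other's buffered state.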