path: root/Modules/unicodedata_db.h
Commit message | Author | Age | Files | Lines
* Upgrade to Unicode 6.0.0. | Martin v. Löwis | 2010-10-11 | 1 | -2370/+2559
    makeunicodedata.py: download all data files from unicode.org, switch to
    extracting Unihan data from zip file. Read linebreakprops and
    derivednormalizationprops even for old versions, even though they are
    not used in delta records.
    test_unicode.py: U+11000 is now assigned, use U+14000 instead.
* Fixed a failure in test_bigmem. | Florent Xicluna | 2010-03-19 | 1 | -2933/+3285
    Merged revision 79059 via svnmerge from
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (Thu, 18 Mar 2010) | 2 lines
    Issue #8024: Update the Unicode database to 5.2
    ........
* Revert Unicode UCD 5.2 upgrade in 3.x. It broke repr() for unicode objects, | Florent Xicluna | 2010-03-19 | 1 | -3285/+2933
    and gave failures in test_bigmem. Revert 79062, 79065 and 79083.
* Merged revisions 79059 via svnmerge from | Florent Xicluna | 2010-03-18 | 1 | -2933/+3285
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r79059 | florent.xicluna | 2010-03-18 22:50:06 +0100 (Thu, 18 Mar 2010) | 2 lines
    Issue #8024: Update the Unicode database to 5.2
    ........
* Merged revisions 75272-75273 via svnmerge from | Amaury Forgeot d'Arc | 2009-10-06 | 1 | -130/+244
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r75272 | amaury.forgeotdarc | 2009-10-06 21:56:32 +0200 (Tue, 06 Oct 2009) | 5 lines
    #1571184: makeunicodedata.py now generates the functions
    _PyUnicode_ToNumeric, _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
    It now also parses the Unihan.txt for numeric values.
    ........
    r75273 | amaury.forgeotdarc | 2009-10-06 22:02:09 +0200 (Tue, 06 Oct 2009) | 2 lines
    Add Anders Chrigstrom to Misc/ACKS for his work on unicodedata.
    ........
* Merged revisions 72054 via svnmerge from | Antoine Pitrou | 2009-04-27 | 1 | -1736/+1961
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r72054 | antoine.pitrou | 2009-04-27 23:53:26 +0200 (Mon, 27 Apr 2009) | 5 lines
    Issue #1734234: Massively speed up `unicodedata.normalize()` when the
    string is already in normalized form, by performing a quick check
    beforehand. Original patch by Rauli Ruohonen.
    ........
* Merged revisions 66362 via svnmerge from | Martin v. Löwis | 2008-09-10 | 1 | -2631/+3007
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r66362 | martin.v.loewis | 2008-09-10 15:38:12 +0200 (Wed, 10 Sep 2008) | 3 lines
    Issue #3811: The Unicode database was updated to 5.1.
    Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
    ........
* Merged revisions 64226 via svnmerge from | Martin v. Löwis | 2008-06-13 | 1 | -2/+2
    svn+ssh://pythondev@svn.python.org/python/trunk
    ........
    r64226 | martin.v.loewis | 2008-06-13 09:47:47 +0200 (Fri, 13 Jun 2008) | 2 lines
    Make more symbols static.
    ........
* Update Unicode database to Unicode 4.1. | Martin v. Löwis | 2006-03-09 | 1 | -2334/+3507
* SF #989185: Drop unicode.iswide() and unicode.width() and add | Hye-Shik Chang | 2004-08-04 | 1 | -1091/+1272
    unicodedata.east_asian_width(). You can still implement your own
    simple width() function using it like this:

        def width(u):
            w = 0
            for c in unicodedata.normalize('NFC', u):
                cwidth = unicodedata.east_asian_width(c)
                if cwidth in ('W', 'F'):
                    w += 2
                else:
                    w += 1
            return w
* - SF #962502: Add two more methods for unicode type; width() and | Hye-Shik Chang | 2004-06-02 | 1 | -1/+1
      iswide() for east asian width manipulation. (Inspired by David
      Goodger, Reviewed by Martin v. Loewis)
    - Move _PyUnicode_TypeRecord.flags to the end of the struct so that
      no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
* Add unidata_version. Bump generator version number. | Martin v. Löwis | 2002-11-25 | 1 | -1/+2
* Regenerate from Unicode 3.2.0 to include all First/Last ranges. | Martin v. Löwis | 2002-11-24 | 1 | -139/+131
* Patch #626485: Support Unicode normalization. | Martin v. Löwis | 2002-11-23 | 1 | -0/+577
* Update to Unicode 3.2 database. | Martin v. Löwis | 2002-10-18 | 1 | -1497/+2769
* compress unicode decomposition tables (this saves another 55k) | Fredrik Lundh | 2001-01-21 | 1 | -3822/+1118
* forgot to check in the new makeunicodedata.py script | Fredrik Lundh | 2001-01-21 | 1 | -1/+1
* Added 38,642 missing characters to the Unicode database (first-last | Fredrik Lundh | 2000-11-03 | 1 | -96/+114
    ranges) -- but thanks to the 2.0 compression scheme, this doesn't add
    a single byte to the resulting binaries (!)
    Closes bug #117524
* unicode database compression, step 2: | Fredrik Lundh | 2000-09-25 | 1 | -4277/+4522
    - fixed attributions
    - moved decomposition data to a separate table, in preparation
      for step 3 (which won't happen before 2.0 final, promise!)
    - use relative paths in the generator script
    I have a lot more stuff in the works for 2.1, but let's leave that
    for another day...
* unicode database compression, step 1: | Fredrik Lundh | 2000-09-24 | 1 | -0/+4380
    - use unidb compression for the unicodedata module. on Windows,
      the new unidatabase module is 120k, down from nearly 600k.
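
Note on the 2004-08-04 entry (SF #989185): the width() helper quoted in that commit message can be tried out as a standalone script. The sketch below is an assumption-laden adaptation, not part of the commit: it assumes Python 3 (where `str` replaces the old `unicode` type), adds the `import` the message leaves implicit, and prints `unicodedata.unidata_version` (the attribute added in the 2002-11-25 entry) to show which Unicode release the tables were generated from.

```python
import unicodedata


def width(u):
    """Approximate display width: 2 columns for Wide/Fullwidth, otherwise 1."""
    w = 0
    # Normalize to NFC first so that sequences with a precomposed form
    # are counted as a single character.
    for c in unicodedata.normalize('NFC', u):
        cwidth = unicodedata.east_asian_width(c)
        if cwidth in ('W', 'F'):
            w += 2
        else:
            w += 1
    return w


if __name__ == '__main__':
    # Unicode version of the database compiled into this interpreter.
    print(unicodedata.unidata_version)
    print(width('abc'), width('日本語'))
```

As in the original message, this is a simple approximation: combining marks that have no precomposed form are still counted as one column each, and characters of Ambiguous width ('A') are treated as narrow.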