path: root/Modules/unicodedata_db.h
Commit message  [Author, Age, Files, Lines]
* Issue #8024: Update the Unicode database to 5.2  [Florent Xicluna, 2010-03-18, 1 file, -2933/+3285]
|
* #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,  [Amaury Forgeot d'Arc, 2009-10-06, 1 file, -130/+244]
| _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace.
| It now also parses the Unihan.txt for numeric values.
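The C functions named in this commit are internal, but their effect can be observed from Python. The sketch below assumes a current Python 3 interpreter and assumes that `unicodedata.numeric()`, `str.isspace()` and `str.splitlines()` are reasonable Python-level views of the generated numeric, whitespace and line-break data; that mapping is an illustration, not something the commit message states.

```python
import unicodedata

# Numeric value taken from Unihan data: U+4E94 is the CJK ideograph for five.
five = unicodedata.numeric("\u4e94")

# Whitespace classification: NO-BREAK SPACE (U+00A0) counts as whitespace.
nbsp_is_space = "\u00a0".isspace()

# Line-break classification: LINE SEPARATOR (U+2028) splits lines.
parts = "a\u2028b".splitlines()

print(five, nbsp_is_space, parts)
```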
* Issue #1734234: Massively speedup `unicodedata.normalize()` when the  [Antoine Pitrou, 2009-04-27, 1 file, -1736/+1961]
| string is already in normalized form, by performing a quick check beforehand.
| Original patch by Rauli Ruohonen.
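The idea of checking before normalizing can be sketched at the Python level. This is only an illustration of the pattern, not the C implementation from the commit: `normalize_with_quick_check` is a hypothetical helper, and `unicodedata.is_normalized()` is assumed to be available (it was added in Python 3.8, long after this change).

```python
import unicodedata

def normalize_with_quick_check(form, text):
    """Return text in the given normalization form, skipping work if possible."""
    # Cheap test first: if the string is already normalized, return it as-is.
    if unicodedata.is_normalized(form, text):
        return text
    # Otherwise fall back to the full normalization pass.
    return unicodedata.normalize(form, text)

already = normalize_with_quick_check("NFC", "abc")
composed = normalize_with_quick_check("NFC", "e\u0301")  # e + COMBINING ACUTE
```

Modern `unicodedata.normalize()` performs this kind of check internally, so the wrapper is useful only to make the control flow visible.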
* Issue #3811: The Unicode database was updated to 5.1.  [Martin v. Löwis, 2008-09-10, 1 file, -2631/+3007]
| Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
* Make more symbols static.  [Martin v. Löwis, 2008-06-13, 1 file, -2/+2]
|
* Update Unicode database to Unicode 4.1.  [Martin v. Löwis, 2006-03-09, 1 file, -2334/+3507]
|
* SF #989185: Drop unicode.iswide() and unicode.width() and add  [Hye-Shik Chang, 2004-08-04, 1 file, -1091/+1272]
| unicodedata.east_asian_width(). You can still implement your own
| simple width() function using it like this:
|
|     def width(u):
|         w = 0
|         for c in unicodedata.normalize('NFC', u):
|             cwidth = unicodedata.east_asian_width(c)
|             if cwidth in ('W', 'F'):
|                 w += 2
|             else:
|                 w += 1
|         return w
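The width() recipe from this commit message can be run on Python 3 once the import is added. The version below is a minimal sketch that keeps the same logic (wide and fullwidth characters count as two columns, everything else as one); the sample strings are illustrative and not taken from the commit.

```python
import unicodedata

def width(u):
    """Approximate display width: 2 columns for wide/fullwidth, else 1."""
    w = 0
    for c in unicodedata.normalize("NFC", u):
        # 'W' is Wide, 'F' is Fullwidth in the East Asian Width property.
        if unicodedata.east_asian_width(c) in ("W", "F"):
            w += 2
        else:
            w += 1
    return w

ascii_width = width("abc")
cjk_width = width("\u65e5\u672c\u8a9e")   # three CJK ideographs
fullwidth_width = width("\uff21\uff22")   # FULLWIDTH LATIN CAPITAL A and B
```

Like the original recipe, this counts combining marks that survive NFC as one column each, so it is a rough estimate rather than a terminal-accurate measure.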
* - SF #962502: Add two more methods for unicode type; width() and  [Hye-Shik Chang, 2004-06-02, 1 file, -1/+1]
|   iswide() for east asian width manipulation. (Inspired by David
|   Goodger, Reviewed by Martin v. Loewis)
| - Move _PyUnicode_TypeRecord.flags to the end of the struct so that
|   no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
* Add unidata_version. Bump generator version number.  [Martin v. Löwis, 2002-11-25, 1 file, -1/+2]
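The attribute added here is still exposed as `unicodedata.unidata_version`. A short sketch of reading it follows; the parsing into integers is an illustrative convenience and assumes the usual dotted numeric form such as "5.2.0".

```python
import unicodedata

# Version of the Unicode Character Database compiled into this interpreter.
version = unicodedata.unidata_version

# Split into numeric components for comparisons (assumes a dotted numeric form).
version_parts = tuple(int(part) for part in version.split("."))

print(version, version_parts)
```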
|
* Regenerate from Unicode 3.2.0 to include all First/Last ranges.  [Martin v. Löwis, 2002-11-24, 1 file, -139/+131]
|
* Patch #626485: Support Unicode normalization.  [Martin v. Löwis, 2002-11-23, 1 file, -0/+577]
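For readers unfamiliar with what normalization support provides, here is a brief sketch using `unicodedata.normalize()` as it exists in current Python 3. The sample characters are illustrative choices, not taken from the patch.

```python
import unicodedata

# NFD decomposes U+00E9 (e with acute) into 'e' plus COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", "\u00e9")

# NFC recomposes the pair back into the single precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

# NFKC also applies compatibility mappings, e.g. the 'fi' ligature U+FB01.
compat = unicodedata.normalize("NFKC", "\ufb01")
```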
|
* Update to Unicode 3.2 database.  [Martin v. Löwis, 2002-10-18, 1 file, -1497/+2769]
|
* compress unicode decomposition tables (this saves another 55k)  [Fredrik Lundh, 2001-01-21, 1 file, -3822/+1118]
|
* forgot to check in the new makeunicodedata.py script  [Fredrik Lundh, 2001-01-21, 1 file, -1/+1]
|
* Added 38,642 missing characters to the Unicode database (first-last  [Fredrik Lundh, 2000-11-03, 1 file, -96/+114]
| ranges) -- but thanks to the 2.0 compression scheme, this doesn't add
| a single byte to the resulting binaries (!)
|
| Closes bug #117524
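The "first-last ranges" refer to UnicodeData.txt entries that give only the first and last code point of a large block. The sketch below shows one way such pairs could be expanded; it is not the code from makeunicodedata.py, `expand_ranges` is a hypothetical helper, and the two sample lines follow the UnicodeData.txt layout for the Hangul Syllable block.

```python
SAMPLE = """\
AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
"""

def expand_ranges(text):
    """Map every code point to its general category, filling First/Last ranges."""
    categories = {}
    range_start = None
    for line in text.splitlines():
        fields = line.split(";")
        code, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            # Remember where the range begins; the matching Last line follows.
            range_start = code
        elif name.endswith(", Last>") and range_start is not None:
            # Every code point in the range shares the same properties.
            for cp in range(range_start, code + 1):
                categories[cp] = category
            range_start = None
        else:
            categories[code] = category
    return categories

hangul = expand_ranges(SAMPLE)
```

Because every character in such a range has identical properties, the expanded entries are highly repetitive, which is consistent with the commit's remark that they cost nothing after compression.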
* unicode database compression, step 2:  [Fredrik Lundh, 2000-09-25, 1 file, -4277/+4522]
| - fixed attributions
| - moved decomposition data to a separate table, in preparation
|   for step 3 (which won't happen before 2.0 final, promise!)
| - use relative paths in the generator script
|
| I have a lot more stuff in the works for 2.1, but let's leave that
| for another day...
* unicode database compression, step 1:  [Fredrik Lundh, 2000-09-24, 1 file, -0/+4380]
| - use unidb compression for the unicodedata module. on Windows, the
|   new unidatabase module is 120k, down from nearly 600k.
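The commit messages do not describe the compression scheme itself. One common approach for tables like this, and an assumption here rather than a statement about what these commits did, is a two-level lookup in which identical fixed-size blocks are stored once. The helpers `split_bins` and `lookup` below are hypothetical names for a minimal sketch of that idea.

```python
def split_bins(table, shift):
    """Split a flat table into (index1, index2) using blocks of 2**shift entries."""
    size = 1 << shift
    # Pad so the table length is a whole number of blocks.
    padded = list(table) + [0] * (-len(table) % size)
    index1, index2, seen = [], [], {}
    for start in range(0, len(padded), size):
        block = tuple(padded[start:start + size])
        if block not in seen:
            # First time this block appears: store it and remember its number.
            seen[block] = len(index2) >> shift
            index2.extend(block)
        index1.append(seen[block])
    return index1, index2

def lookup(index1, index2, code, shift):
    """Fetch the value for a code point from the two-level table."""
    mask = (1 << shift) - 1
    return index2[(index1[code >> shift] << shift) + (code & mask)]

# Illustrative data: long runs of identical values compress well.
flat = [0] * 1000 + [1] * 24 + [0] * 1000
index1, index2 = split_bins(flat, shift=4)
```

With long uniform runs, most blocks are duplicates, so the two index tables together are much smaller than the flat table while each lookup still costs only two array reads.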