path: root/Modules/unicodedata.c
Commit message | Author | Age | Files | Lines
* fix possible overflow bugs in unicodedata (closes #23367) | Benjamin Peterson | 2015-03-02 | 1 | -1/+8
|
* #18803: fix more typos. Patch by Févry Thibault. | Ezio Melotti | 2013-08-25 | 1 | -1/+1
|
* #18466: fix more typos. Patch by Févry Thibault. | Ezio Melotti | 2013-08-17 | 1 | -1/+1
|
* #16681: use "bidirectional class" instead of "bidirectional category" in the docstring too. | Ezio Melotti | 2012-12-14 | 1 | -1/+1
* Remove all other uses of the C tolower()/toupper() which could break with a Turkish locale. | Antoine Pitrou | 2011-10-04 | 1 | -2/+2
|   (except in the strop module, which is deprecated anyway)
* Merged revisions 87442 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k | Alexander Belopolsky | 2010-12-28 | 1 | -4/+9
|   ........
|   r87442 | alexander.belopolsky | 2010-12-22 21:27:37 -0500 (Wed, 22 Dec 2010) | 1 line
|   Issue #10254: Fixed a crash and a regression introduced by the implementation of PRI 29.
|   ........
* Issue #10459: Update CJK character names to Unicode 5.2. | Martin v. Löwis | 2010-11-22 | 1 | -2/+3
|
* Untabify C files. Will watch buildbots. | Antoine Pitrou | 2010-05-09 | 1 | -133/+133
|
* Backported PyCapsule from 3.1, and converted most uses of CObject to PyCapsule. | Larry Hastings | 2010-03-25 | 1 | -1/+1
* Link specifically to the UCD version 5.2.0. | Ezio Melotti | 2010-03-23 | 1 | -1/+2
|
* Update the version number of the Unicode Database in a few more places. | Ezio Melotti | 2010-03-22 | 1 | -5/+4
|
* Issue #1054943: Fix unicodedata.normalize('NFC', text) for the Public Review Issue #29. | Victor Stinner | 2010-03-04 | 1 | -1/+1
|   PR #29 was released in february 2004!
* #1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric, _PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace. | Amaury Forgeot d'Arc | 2009-10-06 | 1 | -1/+1
|   It now also parses the Unihan.txt for numeric values.
* Issue #1734234: Massively speedup `unicodedata.normalize()` when the string is already in normalized form, by performing a quick check beforehand. | Antoine Pitrou | 2009-04-27 | 1 | -5/+58
|   Original patch by Rauli Ruohonen.
* Issue #3811: The Unicode database was updated to 5.1. | Martin v. Löwis | 2008-09-10 | 1 | -5/+8
|   Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
* This reverts r63675 based on the discussion in this thread: | Gregory P. Smith | 2008-06-09 | 1 | -6/+6
|   http://mail.python.org/pipermail/python-dev/2008-June/079988.html
|   Python 2.6 should stick with PyString_* in its codebase. The PyBytes_* names in the spirit of 3.0 are available via a #define only.
|   See the email thread.
* Change all functions that expect one unicode character to accept a pair of surrogates in narrow builds. | Walter Dörwald | 2008-06-02 | 1 | -73/+74
|   Fixes issue #1706460.
* Renamed PyString to PyBytes | Christian Heimes | 2008-05-26 | 1 | -6/+6
|
* #1629: Renamed Py_Size, Py_Type and Py_Refcnt to Py_SIZE, Py_TYPE and Py_REFCNT. | Christian Heimes | 2007-12-19 | 1 | -1/+1
|   Macros for b/w compatibility are available.
* Bug #1704793: Return UTF-16 pair if unicodedata.lookup cannot represent the result in a single character. | Martin v. Löwis | 2007-07-28 | 1 | -16/+11
* PEP 3123: Provide forward compatibility with Python 3.0, while keeping backwards compatibility. | Martin v. Löwis | 2007-07-21 | 1 | -3/+2
|   Add Py_Refcnt, Py_Type, Py_Size, and PyVarObject_HEAD_INIT.
* Replace C++ comment with C comment (fixes SF bug #1593525). | Walter Dörwald | 2006-11-09 | 1 | -1/+1
|
* I'm not sure why this code allocates this string for the error message. | Neal Norwitz | 2006-08-12 | 1 | -2/+11
|   I think it would be better to always use snprintf and have the format limit the size of the name appropriately (like %.200s).
|   Klocwork #340
* Update dangling references to the 3.2 database to mention that this is UCD 4.1 now. | Martin v. Löwis | 2006-08-10 | 1 | -5/+5
* No functional change. Add comment and assert to describe why there cannot be overflow which was reported by Klocwork. | Neal Norwitz | 2006-07-27 | 1 | -2/+9
|   Discussed on python-dev
* Patch 1494554: Update numeric properties to Unicode 4.1. | Martin v. Löwis | 2006-05-27 | 1 | -2/+2
|
* No reason to export get_decomp_record, make static | Neal Norwitz | 2006-04-17 | 1 | -1/+1
|
* Support NFD of very long strings. | Martin v. Löwis | 2006-04-13 | 1 | -3/+3
|
* Get rid of warnings about using chars as subscripts on Alpha (and possibly other platforms) by using Py_CHARMASK(). | Neal Norwitz | 2006-04-10 | 1 | -2/+2
* Adjust CJK Ideograph range to Unicode 4.1. | Martin v. Löwis | 2006-03-11 | 1 | -13/+12
|
* Fix refcounting bug. | Martin v. Löwis | 2006-03-10 | 1 | -0/+1
|
* Avoid forward-declaring the methods array. | Martin v. Löwis | 2006-03-10 | 1 | -52/+53
|   Rename unicodedata.db* to unicodedata.ucd*
* Update Unicode database to Unicode 4.1. | Martin v. Löwis | 2006-03-09 | 1 | -28/+213
|
* Remove gcc (4.0.x) warning about uninitialized value by explicitly setting the sentinel value in the main function, rather than the helper. | Thomas Wouters | 2006-03-01 | 1 | -2/+1
|   This function could possibly do with an early-out if any of the helper calls ends up with a len of 0, but I doubt it really matters (how common are malformed hangul syllables, really?)
* Patch #1213831: Fix typo in unicodedata._getcode. | Martin v. Löwis | 2005-09-18 | 1 | -1/+1
|   Will backport to Python 2.4.
* Correct URL to the official UnicodeData 3.2.0 resource. (Reported by Darek Suchojad) | Hye-Shik Chang | 2005-06-04 | 1 | -1/+1
* Fill docstrings for module and functions, extracted from the tex documentation. (Patch #1173245, Contributed by Jeremy Yallop) | Hye-Shik Chang | 2005-04-04 | 1 | -13/+108
* SF #989185: Drop unicode.iswide() and unicode.width() and add unicodedata.east_asian_width(). | Hye-Shik Chang | 2004-08-04 | 1 | -0/+21
|   You can still implement your own simple width() function using it like this:
|
|       def width(u):
|           w = 0
|           for c in unicodedata.normalize('NFC', u):
|               cwidth = unicodedata.east_asian_width(c)
|               if cwidth in ('W', 'F'): w += 2
|               else: w += 1
|           return w
* Fix typo. | Hye-Shik Chang | 2004-07-15 | 1 | -1/+1
|
* Special case normalization of empty strings. Fixes #924361. | Martin v. Löwis | 2004-04-17 | 1 | -0/+7
|   Backported to 2.3.
* Overallocate target buffer for normalization more early. Fixes #834676. | Martin v. Löwis | 2003-11-06 | 1 | -5/+7
|   Backported to 2.3.
* Fix SF bug #694816, remove comparison of unsigned value < 0 | Neal Norwitz | 2003-02-28 | 1 | -2/+2
|
* Remove C++ comment. | Martin v. Löwis | 2002-12-07 | 1 | -1/+1
|
* Add unidata_version. Bump generator version number. | Martin v. Löwis | 2002-11-25 | 1 | -0/+2
|
* Verify that the code in CJK UNIFIED IDEOGRAPH- actually denotes an ideograph. | Martin v. Löwis | 2002-11-23 | 1 | -3/+12
|
* Patch #626485: Support Unicode normalization. | Martin v. Löwis | 2002-11-23 | 1 | -15/+279
|
* Implement names for CJK unified ideographs. Add name to KeyError output. | Martin v. Löwis | 2002-11-23 | 1 | -1/+39
|   Verify that the lookup for an existing name succeeds.
* Fix off-by-one error. | Martin v. Löwis | 2002-11-23 | 1 | -1/+1
|
* Patch #626548: Support Hangul syllable names. | Martin v. Löwis | 2002-11-23 | 1 | -2/+109
|
* Update to Unicode 3.2 database. | Martin v. Löwis | 2002-10-18 | 1 | -3/+3
|
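
The 2004-08-04 entry (SF #989185) describes building a simple width() helper on top of unicodedata.east_asian_width(). A minimal Python 3 adaptation of that idea might look like the sketch below. The function name display_width and the sample strings are illustrative assumptions, not part of the module, and the result is only a rough estimate because combining marks and control characters are not special-cased.

```python
import unicodedata


def display_width(text):
    """Estimate how many terminal columns `text` occupies.

    Characters whose East Asian Width property is 'W' (wide) or
    'F' (fullwidth) count as two columns; everything else counts as one.
    """
    width = 0
    # Normalize to NFC first so that sequences with a precomposed form
    # (for example 'e' followed by U+0301) are counted as one character.
    for ch in unicodedata.normalize('NFC', text):
        if unicodedata.east_asian_width(ch) in ('W', 'F'):
            width += 2
        else:
            width += 1
    return width


if __name__ == '__main__':
    print(display_width('abc'))            # three narrow characters
    print(display_width('\u4e2d\u6587'))   # two wide CJK ideographs
```

Normalizing before counting follows the commit message's own example; a caller that needs exact terminal layout would likely have to handle zero-width and ambiguous-width characters separately.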