cpython.git - https://github.com/python/cpython.git

	Commit message	Author	Age	Files	Lines
*	#7112: Fix compilation warning in unicodetype_db.h	Amaury Forgeot d'Arc	2009-10-13	1	-0/+5
	makeunicodedata now generates double literals
*	#1571184: makeunicodedata.py now generates the functions _PyUnicode_ToNumeric,	Amaury Forgeot d'Arc	2009-10-06	1	-8/+123
	_PyUnicode_IsLinebreak and _PyUnicode_IsWhitespace. It now also parses the Unihan.txt for numeric values.
*	#1616979: Add the cp720 (Arabic DOS) encoding.	Amaury Forgeot d'Arc	2009-07-13	2	-0/+68
	Since there is no official mapping file from unicode.org, the codec file is generated on Windows with the new genwincodec.py script.
*	Issue #1734234: Massively speedup `unicodedata.normalize()` when the	Antoine Pitrou	2009-04-27	1	-5/+32
	string is already in normalized form, by performing a quick check beforehand. Original patch by Rauli Ruohonen.
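The quick-check idea in this entry can be illustrated at the Python level. The sketch below is not the C code from the commit: `normalize_with_quick_check` is a hypothetical helper, and it assumes Python 3.8 or later, where `unicodedata.is_normalized()` is available.

```python
import unicodedata


def normalize_with_quick_check(form, text):
    """Hypothetical helper: return `text` unchanged when it is already normalized."""
    # unicodedata.is_normalized() exists in Python 3.8 and later.
    if unicodedata.is_normalized(form, text):
        return text
    # Fall back to the full normalization pass only when it is needed.
    return unicodedata.normalize(form, text)


already_nfc = "caf\u00e9"   # precomposed e-acute
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent

print(normalize_with_quick_check("NFC", already_nfc) == already_nfc)
print(normalize_with_quick_check("NFC", decomposed) == already_nfc)
```

The wrapper only makes the control flow visible; `unicodedata.normalize()` itself may already take a similar shortcut internally.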
*	Issue #5828 (Invalid behavior of unicode.lower): Fixed bogus logic in	Walter Dörwald	2009-04-25	1	-22/+21
	makeunicodedata.py and regenerated the Unicode database (This fixes u'\u1d79'.lower() == '\x00').
*	Issue #3811: The Unicode database was updated to 5.1.	Martin v. Löwis	2008-09-10	1	-10/+30
	Reviewed by Fredrik Lundh and Marc-Andre Lemburg.
*	Make more symbols static.	Martin v. Löwis	2008-06-13	1	-2/+2
*	Patch #2167 from calvin: Remove unused imports	Christian Heimes	2008-02-23	1	-1/+1
*	Patch #1359618: Speed-up charmap encoder.	Martin v. Löwis	2006-06-04	2	-26/+27
*	when generating python code prefer to generate valid python code	Jack Diederich	2006-05-26	1	-3/+3
*	Don't add multiple empty lines at the end of the codec. With this a	Walter Dörwald	2006-03-31	1	-1/+1
	regenerated codec should survive reindent.py unchanged.
*	Whitespace for generated code.	Walter Dörwald	2006-03-27	1	-0/+3
*	Patch #1443155: Add the incremental codecs support for CJK codecs.	Hye-Shik Chang	2006-03-26	2	-1/+69
	(reviewed by Walter Dörwald)
*	Patch #1436130: codecs.lookup() now returns a CodecInfo object (a subclass	Walter Dörwald	2006-03-15	2	-22/+43
	of tuple) that provides incremental decoders and encoders (a way to use stateful codecs without the stream API). Functions codecs.getincrementaldecoder() and codecs.getincrementalencoder() have been added.
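As a rough illustration of the API this entry describes, the sketch below looks up a codec and decodes a multi-byte character that arrives split across two chunks. It is written for Python 3 rather than the Python 2 of the commit's era, and the variable names are chosen only for this example.

```python
import codecs

# codecs.lookup() returns a CodecInfo object, which is a subclass of tuple.
info = codecs.lookup("utf-8")

# getincrementaldecoder() returns a factory; calling it creates a stateful decoder.
decoder = codecs.getincrementaldecoder("utf-8")()

data = "\u00e9".encode("utf-8")  # e-acute encodes to two bytes in UTF-8

# Feed the bytes one chunk at a time; the decoder buffers the incomplete sequence.
first = decoder.decode(data[:1])
second = decoder.decode(data[1:], final=True)

print(repr(first), repr(second))
```

The decoder keeps the partial byte sequence between calls, which is what allows stateful decoding without going through the stream API.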
*	Add changelog entry.	Martin v. Löwis	2006-03-11	1	-0/+1
*	Whitespace normalization.	Tim Peters	2006-03-10	1	-1/+1
*	Update Unicode database to Unicode 4.1.	Martin v. Löwis	2006-03-09	1	-11/+141
*	Whitespace normalization.	Tim Peters	2005-12-25	1	-3/+3
*	Add Makefile which allows easily rebuilding the charmap codecs.	Marc-André Lemburg	2005-10-25	1	-0/+81
*	Add custom mapping files used for generating some of the charmap	Marc-André Lemburg	2005-10-25	3	-0/+873
	codecs.
*	Apply some cosmetic fixes to the output of the script.	Marc-André Lemburg	2005-10-25	1	-15/+28
	Only include the decoding map if no table can be generated.
*	Add two new tools to compare codecs and show differences and to	Marc-André Lemburg	2005-10-21	2	-0/+94
	list all installed codecs.
*	Moved gencodec.py to the Tools/unicode/ directory.	Marc-André Lemburg	2005-10-21	1	-0/+391
	Added new support for decoding tables. Cleaned up the implementation a bit.
*	SF #989185: Drop unicode.iswide() and unicode.width() and add	Hye-Shik Chang	2004-08-04	1	-6/+12
	unicodedata.east_asian_width(). You can still implement your own simple width() function using it like this:
	    def width(u):
	        w = 0
	        for c in unicodedata.normalize('NFC', u):
	            cwidth = unicodedata.east_asian_width(c)
	            if cwidth in ('W', 'F'):
	                w += 2
	            else:
	                w += 1
	        return w
*	Whitespace normalization, via reindent.py.	Tim Peters	2004-07-18	1	-1/+0
*	- SF #962502: Add two more methods for unicode type; width() and	Hye-Shik Chang	2004-06-02	1	-4/+29
	iswide() for east asian width manipulation. (Inspired by David Goodger, Reviewed by Martin v. Loewis)
	- Move _PyUnicode_TypeRecord.flags to the end of the struct so that no padding is added for UCS-4 builds. (Suggested by Martin v. Loewis)
*	Applying SF patch #949329 on behalf of Raymond Hettinger.	Armin Rigo	2004-05-19	1	-27/+26
*	Implement IDNA (Internationalized Domain Names in Applications).	Martin v. Löwis	2003-04-18	1	-0/+433
*	Add unidata_version. Bump generator version number.	Martin v. Löwis	2002-11-25	1	-2/+6
*	Sort names independent of the Python version. Fix hex constant warning.	Martin v. Löwis	2002-11-24	1	-7/+11
	Include all First/Last blocks.
*	Patch #626485: Support Unicode normalization.	Martin v. Löwis	2002-11-23	1	-3/+90
*	Verify that lower-higher case delta are 16-bit.	Martin v. Löwis	2002-10-18	1	-3/+11
*	Update to Unicode 3.2 database.	Martin v. Löwis	2002-10-18	1	-2/+2
*	Apply diff2.txt from SF patch http://www.python.org/sf/572113	Walter Dörwald	2002-09-11	1	-7/+7
	(with one small bugfix in bgen/bgen/scantools.py)
	This replaces string module functions with string methods for the stuff in the Tools directory. Several uses of string.letters etc. are still remaining.
*	Unicode nits: Don't include unicodedatabase.h no more. And make sure	Fredrik Lundh	2001-01-21	1	-2/+2
	to build all tables in makeunicodedata.py.
*	compress unicode decomposition tables (this saves another 55k)	Fredrik Lundh	2001-01-21	1	-41/+76
*	forgot to check in the new makeunicodedata.py script	Fredrik Lundh	2001-01-21	1	-17/+271
*	Added 38,642 missing characters to the Unicode database (first-last	Fredrik Lundh	2000-11-03	1	-11/+39
	ranges) -- but thanks to the 2.0 compression scheme, this doesn't add a single byte to the resulting binaries (!) Closes bug #117524
*	Remove bogus stdout redirection and use of sys.__stdout__; use	Fred Drake	2000-10-26	1	-46/+42
	augmented print statement instead.
*	- don't set the titlecase flag for uppercase letters (sorry, tim)	Fredrik Lundh	2000-09-25	1	-2/+2
*	unicode database compression, step 3:	Fredrik Lundh	2000-09-25	1	-4/+19
	- added decimal digit and digit properties to the unidb tables
*	unicode database compression, step 3:	Fredrik Lundh	2000-09-25	1	-9/+97
	- use unidb compression for the unicodectype module. smaller, faster, and slightly more portable...
	- also mention the unicode directory in Tools/README
*	unicode database compression, step 2:	Fredrik Lundh	2000-09-25	1	-15/+47
	- fixed attributions
	- moved decomposition data to a separate table, in preparation for step 3 (which won't happen before 2.0 final, promise!)
	- use relative paths in the generator script
	I have a lot more stuff in the works for 2.1, but let's leave that for another day...
*	Fiddled w/ /F's cool new splitbins function: documented it, generalized it	Tim Peters	2000-09-25	1	-26/+54
	a bit, sped it a lot primarily by removing the unused assumption that None was a legit bin entry (the function doesn't really need to assume that there's anything special about 0), added an optional "trace" argument, and in __debug__ mode added exhaustive verification that the decomposition is both correct and doesn't overstep any array bounds (which wasn't obvious to me from staring at the generated C code -- now I feel safe!). Did not commit a new unicodedata_db.h, as the one produced by this version is identical to the one already checked in.
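The kind of two-level table split this entry refers to can be sketched roughly as follows. This is not the splitbins code from makeunicodedata.py: `split_two_level` and `lookup` are hypothetical names, the block size is fixed by a caller-supplied `shift`, and padding the last block with 0 is an arbitrary choice made for this illustration.

```python
def split_two_level(table, shift):
    """Split `table` into (index, blocks) with shared, deduplicated blocks.

    Afterwards, for every valid i:
        table[i] == blocks[(index[i >> shift] << shift) + (i & mask)]
    where mask == (1 << shift) - 1.
    """
    size = 1 << shift
    # Pad to a whole number of blocks (0 is an arbitrary filler value here).
    padded = list(table) + [0] * (-len(table) % size)
    index, blocks, seen = [], [], {}
    for start in range(0, len(padded), size):
        block = tuple(padded[start:start + size])
        if block not in seen:
            # Remember the block number where this block's content is stored.
            seen[block] = len(blocks) >> shift
            blocks.extend(block)
        index.append(seen[block])
    return index, blocks


def lookup(index, blocks, shift, i):
    """Read entry i back through the two-level structure."""
    mask = (1 << shift) - 1
    return blocks[(index[i >> shift] << shift) + (i & mask)]


# A sparse table with long runs of identical values compresses well.
table = [0] * 100 + [1, 2, 3] + [0] * 100 + [1, 2, 3]
index, blocks = split_two_level(table, shift=3)

# Exhaustive check that the decomposition reproduces the original table.
assert all(lookup(index, blocks, 3, i) == table[i] for i in range(len(table)))
print(len(table), len(index), len(blocks))
```

Identical blocks are stored once and shared through the index, which is why long uniform runs cost almost nothing in the second-level array.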
*	unicode database compression, step 1:	Fredrik Lundh	2000-09-24	1	-0/+202
	- use unidb compression for the unicodedata module. on Windows, the new unidatabase module is 120k, down from nearly 600k.