author | Victor Stinner <victor.stinner@haypocalc.com> | 2011-09-29 00:56:16 (GMT) |
---|---|---|
committer | Victor Stinner <victor.stinner@haypocalc.com> | 2011-09-29 00:56:16 (GMT) |
commit | 7d637ab8704b3101b2cefaa667c087118afcc4c1 (patch) | |
tree | e0e1cb1a609ce3fd7b4a64595a127a6043c7ba2d | |
parent | f503673c4dfc38d652eabeed44c21484fc697c4f (diff) | |
Complete What's New in 3.3 about PEP 393
-rw-r--r-- | Doc/whatsnew/3.3.rst | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst
index 3cd4dd1..32d7a3e 100644
--- a/Doc/whatsnew/3.3.rst
+++ b/Doc/whatsnew/3.3.rst
@@ -65,6 +65,28 @@ XXX Add list of changes introduced by :pep:`393` here:
   either ``0xFFFF`` or ``0x10FFFF`` for backward compatibility, and it should
   not be used with the new Unicode API (see :issue:`13054`).
 
+* Non-BMP characters (U+10000-U+10FFFF range) are no more special cases.
+  ``'\U0010FFFF'[0]`` is now ``'\U0010FFFF'`` on any platform, instead of
+  ``'\uDFFF'`` on narrow build or ``'\U0010FFFF'`` on wide build. And
+  ``len('\U0010FFFF')`` is now ``1`` on any platform, instead of ``2`` on
+  narrow build or ``1`` on wide build. More generally, most bugs related to
+  non-BMP characters are now fixed. For example, :func:`unicodedata.normalize`
+  handles correctly non-BMP characters on all platforms.
+
+* The storage of Unicode string is now adapted on the content of the string.
+  Pure ASCII and Latin1 strings (U+0000-U+00FF) use 1 byte per character, BMP
+  strings (U+0000-U+FFFF) use 2 bytes per character, and non-BMP characters
+  (U+10000-U+10FFFF range) use 4 bytes per characters. The memory usage of
+  Python 3.3 is two to three times smaller than Python 3.2, and a little bit
+  better than Python 2.7, on a `Django benchmark
+  <http://mail.python.org/pipermail/python-dev/2011-September/113714.html>`_.
+
+* The PEP 393 is fully backward compatible. The legacy API should remain
+  available at least five years. Applications using the legacy API will not
+  fully benefit of the memory reduction, or worse may use a little bit more
+  memory, because Python may have to maintain two versions of each string (in
+  the legacy format and in the new efficient storage).
+
 Other Language Changes
 ======================
 
@@ -334,3 +356,9 @@ that may require changes to your code:
 .. Issue #10998: -Q command-line flags are related artifacts have been
    removed. Code checking sys.flags.division_warning will need updating.
    Contributed by Éric Araujo.
+
+* :pep:`393`: The :c:type:`Py_UNICODE` type and all functions using this type
+  are deprecated. To fully benefit of the memory footprint reduction provided
+  by the PEP 393, you have to convert your code to the new Unicode API. Read
+  the porting guide: XXX.
+
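
The behaviour described in the added What's New entries can be checked from Python itself. The following is a minimal sketch, not part of the commit, and it assumes CPython 3.3 or later. The helper names `describe_non_bmp` and `bytes_per_char` are hypothetical, and `sys.getsizeof` reports a CPython implementation detail, so the per-character figures may not hold on other interpreters.

```python
import sys


def describe_non_bmp():
    """Return (length, first character) for a single non-BMP code point."""
    text = '\U0010FFFF'
    # With PEP 393 the string holds one code point, so indexing returns
    # the full character rather than half of a surrogate pair.
    return len(text), text[0]


def bytes_per_char(sample_char, n=1000):
    """Estimate storage per character for strings made of sample_char.

    Comparing two strings of different lengths cancels out the fixed
    object header, leaving only the per-character cost.
    """
    short = sample_char * n
    longer = sample_char * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) // n


if __name__ == '__main__':
    print(describe_non_bmp())
    # One sample character from each range named in the entry:
    # ASCII, Latin1, BMP, and non-BMP.
    for ch in ('a', '\xe9', '\u20ac', '\U0010FFFF'):
        print('U+%04X' % ord(ch), bytes_per_char(ch), 'byte(s) per character')
```

The width is chosen per string from its widest character, so a single non-BMP character in an otherwise ASCII string moves the whole string to 4 bytes per character.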