Document new and deprecated Unicode functions

author: Victor Stinner <victor.stinner@haypocalc.com> 2011-11-20 17:27:55 (GMT)
committer: Victor Stinner <victor.stinner@haypocalc.com> 2011-11-20 17:27:55 (GMT)
commit: 46606ce870588f7c7606cfcaa0ed192dd30aba17 (patch)
tree: 72079b967b6f30bd57588ec2e8ad7958e955e93f /Doc/whatsnew/3.3.rst
parent: b4938aaf158286bde8f8e2ba426fb8debdea1cbd (diff)
download: cpython-46606ce870588f7c7606cfcaa0ed192dd30aba17.zip
cpython-46606ce870588f7c7606cfcaa0ed192dd30aba17.tar.gz
cpython-46606ce870588f7c7606cfcaa0ed192dd30aba17.tar.bz2
1 files changed, 92 insertions, 11 deletions
diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst
index a0ce3e4..6b07cb9 100644
--- a/Doc/whatsnew/3.3.rst
+++ b/Doc/whatsnew/3.3.rst
@@ -209,10 +209,8 @@ the equality of the underlying sequences generated by those range objects.
 (:issue:`13021`)
 
 
-New, Improved, and Deprecated Modules
-=====================================
-
-* Stub
+New and Improved Modules
+========================
 
 array
 -----
@@ -579,7 +577,11 @@ Optimizations
 
 Major performance enhancements have been added:
 
-* Stub
+* Thanks to :pep:`393`, some operations on Unicode strings have been optimized:
+
+  * the memory footprint is divided by 2 to 4 depending on the text
+  * getting a substring of a latin1 string is 4 times faster
+  * TODO
 
 
 Build and C API Changes
@@ -587,7 +589,27 @@ Build and C API Changes
 
 Changes to Python's build process and to the C API include:
 
-* Stub
+* :pep:`393` added new Unicode types, macros and functions:
+
+  * Py_UCS1, Py_UCS2, Py_UCS4 types
+  * PyASCIIObject and PyCompactUnicodeObject structures
+  * :c:func:`PyUnicode_New`
+  * :c:macro:`PyUnicode_READY`
+  * :c:func:`PyUnicode_FromKindAndData`
+  * :c:func:`PyUnicode_GetLength`, :c:macro:`PyUnicode_GET_LENGTH`
+  * :c:func:`PyUnicode_CopyCharacters`
+  * :c:func:`PyUnicode_ReadChar`, :c:func:`PyUnicode_WriteChar`
+  * :c:func:`PyUnicode_AsUCS4`, :c:func:`PyUnicode_AsUCS4Copy`
+  * :c:func:`PyUnicode_FindChar`
+  * :c:func:`PyUnicode_Substring`
+  * :c:macro:`PyUnicode_1BYTE_DATA`, :c:macro:`PyUnicode_2BYTE_DATA`,
+    :c:macro:`PyUnicode_4BYTE_DATA`
+  * :c:macro:`PyUnicode_KIND` with :c:type:`PyUnicode_Kind` enum:
+    :c:data:`PyUnicode_WCHAR_KIND`, :c:data:`PyUnicode_1BYTE_KIND`,
+    :c:data:`PyUnicode_2BYTE_KIND`, :c:data:`PyUnicode_4BYTE_KIND`
+  * :c:macro:`PyUnicode_DATA`
+  * :c:macro:`PyUnicode_READ`, :c:macro:`PyUnicode_READ_CHAR`, :c:macro:`PyUnicode_WRITE`
+  * :c:macro:`PyUnicode_MAX_CHAR_VALUE`
 
 
 Unsupported Operating Systems
@@ -599,22 +621,81 @@ Windows 2000 and Windows platforms which set ``COMSPEC`` to ``command.com``
 are no longer supported due to maintenance burden.
 
 
-Deprecated modules, functions and methods
-=========================================
+Deprecated Python modules, functions and methods
+================================================
 
 * The :mod:`packaging` module replaces the :mod:`distutils` module
 * The ``unicode_internal`` codec has been deprecated because of the
   :pep:`393`, use UTF-8, UTF-16 (``utf-16-le`` or ``utf-16-be``), or UTF-32
-  (``utf-32-le`` or ``utf-32-be``) instead.
+  (``utf-32-le`` or ``utf-32-be``)
 * :meth:`ftplib.FTP.nlst` and :meth:`ftplib.FTP.dir`: use
-  :meth:`ftplib.FTP.mlsd` instead.
+  :meth:`ftplib.FTP.mlsd`
 * :func:`platform.popen`: use the :mod:`subprocess` module. Check especially
   the :ref:`subprocess-replacements` section.
 * :issue:`13374`: The Windows bytes API has been deprecated in the :mod:`os`
-  module. Use Unicode filenames instead of bytes filenames to not depend on
+  module. Use Unicode filenames, instead of bytes filenames, to not depend on
   the ANSI code page anymore and to support any filename.
 
 
+Deprecated functions and types of the C API
+===========================================
+
+The :c:type:`Py_UNICODE` type has been deprecated by :pep:`393` and will be
+removed in Python 4. All functions using this type are deprecated:
+
+Functions and macros manipulating Py_UNICODE* strings:
+
+ * :c:macro:`Py_UNICODE_strlen`: use :c:func:`PyUnicode_GetLength` or
+   :c:macro:`PyUnicode_GET_LENGTH`
+ * :c:macro:`Py_UNICODE_strcat`: use :c:func:`PyUnicode_CopyCharacters` or
+   :c:func:`PyUnicode_FromFormat`
+ * :c:macro:`Py_UNICODE_strcpy`, :c:macro:`Py_UNICODE_strncpy`,
+   :c:macro:`Py_UNICODE_COPY`: use :c:func:`PyUnicode_CopyCharacters` or
+   :c:func:`PyUnicode_Substring`
+ * :c:macro:`Py_UNICODE_strcmp`: use :c:func:`PyUnicode_Compare`
+ * :c:macro:`Py_UNICODE_strncmp`: use :c:func:`PyUnicode_Tailmatch`
+ * :c:macro:`Py_UNICODE_strchr`, :c:macro:`Py_UNICODE_strrchr`: use
+   :c:func:`PyUnicode_FindChar`
+ * :c:macro:`Py_UNICODE_FILL`
+
+Unicode functions and methods using :c:type:`Py_UNICODE` and
+:c:type:`Py_UNICODE*` types:
+
+ * :c:func:`PyUnicode_FromUnicode`: use :c:func:`PyUnicode_FromWideChar` or
+   :c:func:`PyUnicode_FromKindAndData`
+ * :c:macro:`PyUnicode_AS_UNICODE`, :c:func:`PyUnicode_AsUnicode`,
+   :c:func:`PyUnicode_AsUnicodeAndSize`: use :c:func:`PyUnicode_AsWideCharString`
+ * :c:macro:`PyUnicode_AS_DATA`: use :c:macro:`PyUnicode_DATA` with
+   :c:macro:`PyUnicode_READ` and :c:macro:`PyUnicode_WRITE`
+ * :c:macro:`PyUnicode_GET_SIZE`, :c:func:`PyUnicode_GetSize`: use
+   :c:macro:`PyUnicode_GET_LENGTH` or :c:func:`PyUnicode_GetLength`
+ * :c:macro:`PyUnicode_GET_DATA_SIZE`: use
+   ``PyUnicode_GET_LENGTH(str) * PyUnicode_KIND(str)`` (only works on ready
+   strings)
+ * :c:func:`PyUnicode_AsUnicodeCopy`: use :c:func:`PyUnicode_AsUCS4Copy`,
+   :c:func:`PyUnicode_AsWideCharString` or :c:func:`PyUnicode_Copy`
+
+Encoders:
+
+ * :c:func:`PyUnicode_Encode`: use :c:func:`PyUnicode_AsEncodedObject`
+ * :c:func:`PyUnicode_EncodeUTF7`
+ * :c:func:`PyUnicode_EncodeUTF8`: use :c:func:`PyUnicode_AsUTF8String`
+ * :c:func:`PyUnicode_EncodeUTF32`
+ * :c:func:`PyUnicode_EncodeUTF16`
+ * :c:func:`PyUnicode_EncodeUnicodeEscape`: use
+   :c:func:`PyUnicode_AsUnicodeEscapeString`
+ * :c:func:`PyUnicode_EncodeRawUnicodeEscape`: use
+   :c:func:`PyUnicode_AsRawUnicodeEscapeString`
+ * :c:func:`PyUnicode_EncodeLatin1`: use :c:func:`PyUnicode_AsLatin1String`
+ * :c:func:`PyUnicode_EncodeASCII`: use :c:func:`PyUnicode_AsASCIIString`
+ * :c:func:`PyUnicode_EncodeCharmap`
+ * :c:func:`PyUnicode_TranslateCharmap`
+ * :c:func:`PyUnicode_EncodeMBCS`: use :c:func:`PyUnicode_AsMBCSString` or
+   :c:func:`PyUnicode_EncodeCodePage` (with ``CP_ACP`` code_page)
+ * :c:func:`PyUnicode_EncodeDecimal`,
+   :c:func:`PyUnicode_TransformDecimalToASCII`
+
+
 Porting to Python 3.3
 =====================
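
Note on the Optimizations hunk above: the claim that the memory footprint depends on the text can be observed from Python itself. The following is a minimal sketch, not part of the commit, and it assumes CPython 3.3 or later, where PEP 393 stores each string with 1, 2 or 4 bytes per character depending on its widest character. The helper name `bytes_per_char` and the sample characters are illustrative choices.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate the per-character storage cost of `ch`.

    Comparing two strings of different lengths cancels out the fixed
    object header, leaving only the cost of the extra characters.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# One sample character per storage class (illustrative choices).
samples = {
    "ASCII": "a",
    "Latin-1": "\xe9",         # e with acute accent, fits in one byte
    "BMP": "\u20ac",           # euro sign, needs two bytes
    "non-BMP": "\U0001F600",   # emoji, needs four bytes
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}

for name, width in widths.items():
    print(name, width)
```

On a PEP 393 build this should report 1, 1, 2 and 4 bytes per character respectively. Exact header sizes differ between versions, which is why the sketch measures a difference rather than an absolute size.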