author: Antoine Pitrou <solipsis@pitrou.net> 2011-10-22 20:08:05 (GMT)
committer: Antoine Pitrou <solipsis@pitrou.net> 2011-10-22 20:08:05 (GMT)
commit: b965b3938a95a7758b5f0c381f2baaf80db16495
tree: 4cf6c4c45a28745a324761e34f4ec25ee4e113d0 /Doc/c-api
parent: e6b99a1832cc860371c53b0cb89f7649926d103e
Elaborate on representations and canonical/legacy unicode objects
Diffstat (limited to 'Doc/c-api')
-rw-r--r-- Doc/c-api/unicode.rst | 16
1 files changed, 15 insertions, 1 deletions
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index fb5f38c..e982be0 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -18,7 +18,21 @@ for strings where all code points are below 128, 256, or 65536; otherwise, code
points must be below 1114112 (which is the full Unicode range).
:c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached
-in the Unicode object.
+in the Unicode object. The :c:type:`Py_UNICODE*` representation is deprecated
+and inefficient; it should be avoided in performance- or memory-sensitive
+situations.
+
+Due to the transition between the old APIs and the new APIs, unicode objects
+can internally be in two states depending on how they were created:
+
+* "canonical" unicode objects are all objects created by a non-deprecated
+ unicode API. They use the most efficient representation allowed by the
+ implementation.
+
+* "legacy" unicode objects have been created through one of the deprecated
+ APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the
+ :c:type:`Py_UNICODE*` representation; you will have to call
+ :c:func:`PyUnicode_READY` on them before calling any other API.
Unicode Type
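
The representation rule described in the patched paragraph (special cases for code points below 128, 256, or 65536, with 1114112 as the upper bound) can be illustrated with a small standalone model. This is only a sketch and does not use or reproduce the CPython API: the function names `storage_width` and `string_width` are hypothetical, and the byte widths of 1, 2, and 4 are an assumption based on the flexible string representation, in which the below-128 and below-256 cases both take one byte per code point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CPython function): number of bytes per code
   point needed when the largest code point in a string is max_cp.
   Returns -1 if max_cp lies outside the Unicode range. */
int storage_width(uint32_t max_cp)
{
    if (max_cp < 256)
        return 1;   /* covers both the "< 128" and "< 256" cases */
    if (max_cp < 65536)
        return 2;
    if (max_cp < 1114112)
        return 4;   /* full Unicode range */
    return -1;      /* not a valid code point */
}

/* Hypothetical helper: scan an array of code points and report the
   narrowest width that can hold all of them. An empty array yields 1. */
int string_width(const uint32_t *cps, size_t n)
{
    uint32_t max_cp = 0;
    for (size_t i = 0; i < n; i++) {
        if (cps[i] > max_cp)
            max_cp = cps[i];
    }
    return storage_width(max_cp);
}
```

In this model, a string containing only `0x41` and `0xE9` fits in one byte per code point, adding `0x20AC` widens it to two bytes, and any code point at or above `0x10000` requires four. A "legacy" object, as described in the patch, would be one for which this choice has not been made yet and is deferred until :c:func:`PyUnicode_READY` is called.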