author Victor Stinner <victor.stinner@gmail.com> 2014-08-01 10:28:48 (GMT)
committer Victor Stinner <victor.stinner@gmail.com> 2014-08-01 10:28:48 (GMT)
commit f6a271ae980d2f3fb450f745b8f87624378156c4 (patch)
tree ae0c09042455826ae38875945dadd4919ca8f235 /Doc
parent c6f8c0a1de448e7ca62ece1d21f089194d31f0d9 (diff)
Issue #18395: Rename ``_Py_char2wchar()`` to :c:func:`Py_DecodeLocale`, rename
``_Py_wchar2char()`` to :c:func:`Py_EncodeLocale`, and document these functions.
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/c-api/sys.rst 54
-rw-r--r-- Doc/c-api/unicode.rst 35
-rw-r--r-- Doc/library/codecs.rst 1
-rw-r--r-- Doc/library/os.rst 7
4 files changed, 80 insertions, 17 deletions
diff --git a/Doc/c-api/sys.rst b/Doc/c-api/sys.rst
index 9760dca..a6a939c 100644
--- a/Doc/c-api/sys.rst
+++ b/Doc/c-api/sys.rst
@@ -47,6 +47,60 @@ Operating System Utilities
not call those functions directly! :c:type:`PyOS_sighandler_t` is a typedef
alias for :c:type:`void (\*)(int)`.
+.. c:function:: wchar_t* Py_DecodeLocale(const char* arg, size_t *size)
+
+ Decode a byte string from the locale encoding with the :ref:`surrogateescape
+ error handler <surrogateescape>`: undecodable bytes are decoded as
+ characters in range U+DC80..U+DCFF. If a byte sequence can be decoded as a
+ surrogate character, escape the bytes using the surrogateescape error
+ handler instead of decoding them.
+
+ Return a pointer to a newly allocated wide character string; use
+ :c:func:`PyMem_RawFree` to free the memory. If *size* is not ``NULL``, write
+ the number of wide characters excluding the null character into ``*size``.
+
+ Return ``NULL`` on decoding error or memory allocation error. If *size* is
+ not ``NULL``, ``*size`` is set to ``(size_t)-1`` on memory error or set to
+ ``(size_t)-2`` on decoding error.
+
+ Decoding errors should never happen, unless there is a bug in the C
+ library.
+
+ Use the :c:func:`Py_EncodeLocale` function to encode the character string
+ back to a byte string.
+
+ .. seealso::
+
+ The :c:func:`PyUnicode_DecodeFSDefaultAndSize` and
+ :c:func:`PyUnicode_DecodeLocaleAndSize` functions.
+
+ .. versionadded:: 3.5
+
+
+.. c:function:: char* Py_EncodeLocale(const wchar_t *text, size_t *error_pos)
+
+ Encode a wide character string to the locale encoding with the
+ :ref:`surrogateescape error handler <surrogateescape>`: surrogate characters
+ in the range U+DC80..U+DCFF are converted to bytes 0x80..0xFF.
+
+ Return a pointer to a newly allocated byte string; use :c:func:`PyMem_Free`
+ to free the memory. Return ``NULL`` on encoding error or memory allocation
+ error.
+
+ If *error_pos* is not ``NULL``, ``*error_pos`` is set to the index of the
+ invalid character on encoding error, or set to ``(size_t)-1`` otherwise.
+
+ Use the :c:func:`Py_DecodeLocale` function to decode the byte string back
+ to a wide character string.
+
+ .. seealso::
+
+ The :c:func:`PyUnicode_EncodeFSDefault` and
+ :c:func:`PyUnicode_EncodeLocale` functions.
+
+ .. versionadded:: 3.5
+
+
.. _systemfunctions:
System Functions
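
The surrogateescape mapping described in the two entries above can be illustrated from Python, where the same error handler is available on ``bytes.decode()`` and ``str.encode()``. This is only a sketch: the helper names ``decode_escaped`` and ``encode_escaped`` are hypothetical, and the encoding is fixed to UTF-8 as an assumption, whereas the C functions use the locale encoding.

```python
def decode_escaped(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes; undecodable bytes 0x80..0xFF become U+DC80..U+DCFF."""
    return raw.decode(encoding, errors="surrogateescape")


def encode_escaped(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text; surrogates U+DC80..U+DCFF become bytes 0x80..0xFF again."""
    return text.encode(encoding, errors="surrogateescape")


# Two bytes that are not valid UTF-8 are escaped instead of raising an error.
raw = b"abc\x80\xff"
text = decode_escaped(raw)
assert text == "abc\udc80\udcff"

# Encoding with the same handler restores the original bytes.
assert encode_escaped(text) == raw

# A UTF-8 byte sequence that would encode a surrogate (here U+DC80) is not
# decoded as that surrogate; its bytes are escaped one by one instead.
encoded_surrogate = b"\xed\xb2\x80"
escaped = decode_escaped(encoded_surrogate)
assert all(0xDC80 <= ord(ch) <= 0xDCFF for ch in escaped)
assert encode_escaped(escaped) == encoded_surrogate

# Without the handler, a lone surrogate cannot be encoded.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

The last step shows why the encoder needs the handler as well: a string produced by a surrogateescape decode is not encodable under the default ``"strict"`` handler.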
diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 4352351..2d1bae1 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -758,11 +758,13 @@ system.
*errors* is ``NULL``. *str* must end with a null character but
cannot contain embedded null characters.
+ Use :c:func:`PyUnicode_DecodeFSDefaultAndSize` to decode a string from
+ :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
+ Python startup).
+
.. seealso::
- Use :c:func:`PyUnicode_DecodeFSDefaultAndSize` to decode a string from
- :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
- Python startup).
+ The :c:func:`Py_DecodeLocale` function.
.. versionadded:: 3.3
@@ -783,11 +785,13 @@ system.
*errors* is ``NULL``. Return a :class:`bytes` object. *str* cannot
contain embedded null characters.
+ Use :c:func:`PyUnicode_EncodeFSDefault` to encode a string to
+ :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
+ Python startup).
+
.. seealso::
- Use :c:func:`PyUnicode_EncodeFSDefault` to encode a string to
- :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
- Python startup).
+ The :c:func:`Py_EncodeLocale` function.
.. versionadded:: 3.3
@@ -832,12 +836,14 @@ used, passing :c:func:`PyUnicode_FSDecoder` as the conversion function:
If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the
locale encoding.
+ :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the
+ locale encoding and cannot be modified later. If you need to decode a string
+ from the current locale encoding, use
+ :c:func:`PyUnicode_DecodeLocaleAndSize`.
+
.. seealso::
- :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the
- locale encoding and cannot be modified later. If you need to decode a
- string from the current locale encoding, use
- :c:func:`PyUnicode_DecodeLocaleAndSize`.
+ The :c:func:`Py_DecodeLocale` function.
.. versionchanged:: 3.2
Use ``"strict"`` error handler on Windows.
@@ -867,12 +873,13 @@ used, passing :c:func:`PyUnicode_FSDecoder` as the conversion function:
If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the
locale encoding.
+ :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the
+ locale encoding and cannot be modified later. If you need to encode a string
+ to the current locale encoding, use :c:func:`PyUnicode_EncodeLocale`.
+
.. seealso::
- :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the
- locale encoding and cannot be modified later. If you need to encode a
- string to the current locale encoding, use
- :c:func:`PyUnicode_EncodeLocale`.
+ The :c:func:`Py_EncodeLocale` function.
.. versionadded:: 3.2
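
At the Python level, ``os.fsdecode()`` and ``os.fsencode()`` convert with the file system encoding and its error handler, which is the same conversion the ``FSDefault`` functions in the hunks above are documented to perform. The sketch below is an illustration under assumptions: ``roundtrip_filename`` is a hypothetical helper, the sample file names are made up, and ``sys.getfilesystemencodeerrors()`` requires Python 3.6 or later. The non-ASCII check only runs where the handler is ``surrogateescape``, since other platforms may use a different handler.

```python
import os
import sys


def roundtrip_filename(raw: bytes) -> bytes:
    """Decode raw with the file system encoding, then encode it back."""
    text = os.fsdecode(raw)
    return os.fsencode(text)


# An ASCII name round-trips on the usual ASCII-compatible file system encodings.
plain = b"report-2014.txt"
plain_ok = roundtrip_filename(plain) == plain

# Bytes that are invalid in the file system encoding survive only because the
# surrogateescape handler escapes them on decoding and restores them on encoding.
odd = b"caf\xe9\xff.txt"
if sys.getfilesystemencodeerrors() == "surrogateescape":
    odd_ok = roundtrip_filename(odd) == odd
else:
    odd_ok = None  # handler differs on this platform; nothing is claimed
```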
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 36144e9..4c2a023 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -318,6 +318,7 @@ and writing to platform dependent files:
encodings.
+.. _surrogateescape:
.. _codec-base-classes:
Codec Base Classes
diff --git a/Doc/library/os.rst b/Doc/library/os.rst
index 9cfc472..bf3a8d5 100644
--- a/Doc/library/os.rst
+++ b/Doc/library/os.rst
@@ -78,9 +78,10 @@ uses the file system encoding to perform this conversion (see
.. versionchanged:: 3.1
On some systems, conversion using the file system encoding may fail. In this
- case, Python uses the ``surrogateescape`` encoding error handler, which means
- that undecodable bytes are replaced by a Unicode character U+DCxx on
- decoding, and these are again translated to the original byte on encoding.
+ case, Python uses the :ref:`surrogateescape encoding error handler
+ <surrogateescape>`, which means that undecodable bytes are replaced by a
+ Unicode character U+DCxx on decoding, and these are again translated to the
+ original byte on encoding.
The file system encoding must guarantee to successfully decode all bytes