author: Serhiy Storchaka <storchaka@gmail.com>, 2014-11-25 11:57:17 (GMT)
committer: Serhiy Storchaka <storchaka@gmail.com>, 2014-11-25 11:57:17 (GMT)
commit: 166ebc4e5dd09f005c6144b7568da83728b8b893
tree: f6b9deb3cb72095ef55bcef31637f4aaafe95248 /Doc
parent: 6cecf68c7b51390429a2488846b1d0c29581987a
Issue #19676: Added the "namereplace" error handler.
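
As a minimal sketch of what the new handler does, the following compares it with the two older escaping handlers on the same string used in the Unicode HOWTO hunk of this commit. It assumes an interpreter that ships the handler (Python 3.5 or later is assumed here); the variable names are illustrative only.

```python
# Assumption: Python 3.5 or later, where the 'namereplace' handler exists.
text = chr(40960) + "abcd" + chr(1972)  # U+A000, 'abcd', U+07B4

# XML character references: decimal code points.
encoded_xml = text.encode("ascii", "xmlcharrefreplace")

# Backslash escapes: \uNNNN for code points below U+10000.
encoded_backslash = text.encode("ascii", "backslashreplace")

# Named escapes: \N{...} when the character has a Unicode name;
# characters without a name fall back to a backslash escape.
encoded_named = text.encode("ascii", "namereplace")

print(encoded_xml)        # b'&#40960;abcd&#1972;'
print(encoded_backslash)  # b'\\ua000abcd\\u07b4'
print(encoded_named)
```

U+A000 is named ``YI SYLLABLE IT``, so the last result begins with ``\N{YI SYLLABLE IT}``; the final character in this string has no name in the Unicode versions the HOWTO example was written against, which is why the documented output ends in a ``\u`` escape instead.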
Diffstat (limited to 'Doc')
-rw-r--r--  Doc/c-api/codec.rst        |  5
-rw-r--r--  Doc/howto/unicode.rst      |  7
-rw-r--r--  Doc/library/codecs.rst     | 17
-rw-r--r--  Doc/library/functions.rst  |  3
-rw-r--r--  Doc/library/io.rst         |  7
5 files changed, 34 insertions, 5 deletions
diff --git a/Doc/c-api/codec.rst b/Doc/c-api/codec.rst
index 83252af..5bb56e3 100644
--- a/Doc/c-api/codec.rst
+++ b/Doc/c-api/codec.rst
@@ -116,3 +116,8 @@ Registry API for Unicode encoding error handlers
    Replace the unicode encode error with backslash escapes (``\x``, ``\u`` and
    ``\U``).
 
+.. c:function:: PyObject* PyCodec_NameReplaceErrors(PyObject *exc)
+
+   Replace the unicode encode error with `\N{...}` escapes.
+
+   .. versionadded: 3.4
diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst
index 50bca5a..aac2373 100644
--- a/Doc/howto/unicode.rst
+++ b/Doc/howto/unicode.rst
@@ -325,8 +325,9 @@ The *errors* parameter is the same as the parameter of the
 :meth:`~bytes.decode` method but supports a few more possible handlers. As well as
 ``'strict'``, ``'ignore'``, and ``'replace'`` (which in this case
 inserts a question mark instead of the unencodable character), there is
-also ``'xmlcharrefreplace'`` (inserts an XML character reference) and
-``backslashreplace`` (inserts a ``\uNNNN`` escape sequence).
+also ``'xmlcharrefreplace'`` (inserts an XML character reference),
+``backslashreplace`` (inserts a ``\uNNNN`` escape sequence) and
+``namereplace`` (inserts a ``\N{...}`` escape sequence).
 
 The following example shows the different results::
 
@@ -346,6 +347,8 @@ The following example shows the different results::
     b'&#40960;abcd&#1972;'
     >>> u.encode('ascii', 'backslashreplace')
     b'\\ua000abcd\\u07b4'
+    >>> u.encode('ascii', 'namereplace')
+    b'\\N{YI SYLLABLE IT}abcd\\u07b4'
 
 The low-level routines for registering and accessing the available
 encodings are found in the :mod:`codecs` module. Implementing new
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 4c2a023..ea4c450 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -98,6 +98,8 @@ It defines the following functions:
      reference (for encoding only)
    * ``'backslashreplace'``: replace with backslashed escape sequences (for
      encoding only)
+   * ``'namereplace'``: replace with ``\N{...}`` escape sequences (for
+     encoding only)
    * ``'surrogateescape'``: on decoding, replace with code points in the Unicode
      Private Use Area ranging from U+DC80 to U+DCFF. These private code
      points will then be turned back into the same bytes when the
@@ -232,6 +234,11 @@ functions which use :func:`lookup` for the codec lookup:
    Implements the ``backslashreplace`` error handling (for encoding only): the
    unencodable character is replaced by a backslashed escape sequence.
 
+.. function:: namereplace_errors(exception)
+
+   Implements the ``namereplace`` error handling (for encoding only): the
+   unencodable character is replaced by a ``\N{...}`` escape sequence.
+
 To simplify working with encoded files or stream, the module also defines these
 utility functions:
 
@@ -363,6 +370,9 @@ and implemented by all standard Python codecs:
 | ``'backslashreplace'``  | Replace with backslashed escape sequences     |
 |                         | (only for encoding).                          |
 +-------------------------+-----------------------------------------------+
+| ``'namereplace'``       | Replace with ``\N{...}`` escape sequences     |
+|                         | (only for encoding).                          |
++-------------------------+-----------------------------------------------+
 | ``'surrogateescape'``   | Replace byte with surrogate U+DCxx, as defined|
 |                         | in :pep:`383`.                                |
 +-------------------------+-----------------------------------------------+
@@ -384,6 +394,9 @@ schemes:
 .. versionchanged:: 3.4
    The ``'surrogatepass'`` error handlers now works with utf-16\* and utf-32\* codecs.
 
+.. versionadded:: 3.4
+   The ``'namereplace'`` error handler.
+
 The set of allowed values can be extended via :meth:`register_error`.
 
 
@@ -477,6 +490,8 @@ define in order to be compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`IncrementalEncoder`
@@ -625,6 +640,8 @@ compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`StreamWriter` object.
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index 9e38d6f..d1e3407 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -975,6 +975,9 @@ are always available. They are listed here in alphabetical order.
      replaces unsupported characters with Python's backslashed escape
      sequences.
 
+   * ``'namereplace'`` (also only supported when writing)
+     replaces unsupported characters with ``\N{...}`` escape sequences.
+
    .. index::
       single: universal newlines; open() built-in function
 
diff --git a/Doc/library/io.rst b/Doc/library/io.rst
index 0054286..c77db90 100644
--- a/Doc/library/io.rst
+++ b/Doc/library/io.rst
@@ -827,9 +827,10 @@ Text I/O
    errors can lead to data loss.) ``'replace'`` causes a replacement marker
    (such as ``'?'``) to be inserted where there is malformed data. When
    writing, ``'xmlcharrefreplace'`` (replace with the appropriate XML character
-   reference) or ``'backslashreplace'`` (replace with backslashed escape
-   sequences) can be used. Any other error handling name that has been
-   registered with :func:`codecs.register_error` is also valid.
+   reference), ``'backslashreplace'`` (replace with backslashed escape
+   sequences) or ``'namereplace'`` (replace with ``\N{...}`` escape sequences)
+   can be used. Any other error handling name that has been registered with
+   :func:`codecs.register_error` is also valid.
 
    .. index::
       single: universal newlines; io.TextIOWrapper class
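
The ``codecs`` and ``io`` hunks above describe the handler both as a callable (``codecs.namereplace_errors``) and as an *errors* value accepted when writing text. The sketch below exercises both paths; it assumes a Python version that includes this change (3.5 or later is assumed here), and the sample string and variable names are illustrative, not taken from the commit.

```python
import codecs
import io

# Assumption: Python 3.5 or later, where 'namereplace' is registered.

# 1. Writing through a text wrapper with errors='namereplace'.
buffer = io.BytesIO()
writer = io.TextIOWrapper(buffer, encoding="ascii", errors="namereplace")
writer.write("caf\u00e9")  # U+00E9 cannot be encoded as ASCII
writer.flush()
written = buffer.getvalue()
print(written)  # b'caf\\N{LATIN SMALL LETTER E WITH ACUTE}'

# 2. Calling the handler function directly with an encode error.
#    The handler returns (replacement string, position to resume at).
error = UnicodeEncodeError("ascii", "\u00e9", 0, 1, "ordinal not in range(128)")
replacement, resume_at = codecs.namereplace_errors(error)
print(replacement, resume_at)

# 3. The same callable is what the registry returns for the handler name.
handler = codecs.lookup_error("namereplace")
```

Because the handler is defined for encoding only, it is useful for the writing side; the reading side still needs one of the decoding handlers such as ``'strict'`` or ``'replace'``.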