Issue #19676: Added the "namereplace" error handler.
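
As a rough illustration of the handler this change documents, the sketch below encodes a string to ASCII with ``errors='namereplace'`` and also calls ``codecs.namereplace_errors`` directly. The sample strings are assumptions chosen for the example, not taken from the patch, and the code assumes an interpreter that already ships the handler.

```python
import codecs

# Characters that have a Unicode name become \N{...} escapes.
named = "café €".encode("ascii", "namereplace")
print(named)

# A character without a name (here a private-use code point, chosen
# only for illustration) falls back to a backslash escape.
unnamed = "\ue000".encode("ascii", "namereplace")
print(unnamed)

# The handler can also be called directly with a UnicodeEncodeError;
# it returns the replacement text and the position to resume encoding at.
exc = UnicodeEncodeError("ascii", "a€b", 1, 2, "ordinal not in range(128)")
replacement, resume_at = codecs.namereplace_errors(exc)
print(replacement, resume_at)
```

The handler applies to encoding only, which matches the "(for encoding only)" wording in the documentation changes below.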

author: Serhiy Storchaka <storchaka@gmail.com> 2014-11-25 11:57:17 (GMT)
committer: Serhiy Storchaka <storchaka@gmail.com> 2014-11-25 11:57:17 (GMT)
commit: 166ebc4e5dd09f005c6144b7568da83728b8b893 (patch)
tree: f6b9deb3cb72095ef55bcef31637f4aaafe95248 /Doc
parent: 6cecf68c7b51390429a2488846b1d0c29581987a (diff)
download: cpython-166ebc4e5dd09f005c6144b7568da83728b8b893.zip
cpython-166ebc4e5dd09f005c6144b7568da83728b8b893.tar.gz
cpython-166ebc4e5dd09f005c6144b7568da83728b8b893.tar.bz2
5 files changed, 34 insertions, 5 deletions
diff --git a/Doc/c-api/codec.rst b/Doc/c-api/codec.rst
index 83252af..5bb56e3 100644
--- a/Doc/c-api/codec.rst
+++ b/Doc/c-api/codec.rst
@@ -116,3 +116,8 @@ Registry API for Unicode encoding error handlers
    Replace the unicode encode error with backslash escapes (``\x``, ``\u`` and
    ``\U``).
 
+.. c:function:: PyObject* PyCodec_NameReplaceErrors(PyObject *exc)
+
+   Replace the unicode encode error with ``\N{...}`` escapes.
+
+   .. versionadded:: 3.4
diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst
index 50bca5a..aac2373 100644
--- a/Doc/howto/unicode.rst
+++ b/Doc/howto/unicode.rst
@@ -325,8 +325,9 @@ The *errors* parameter is the same as the parameter of the
 :meth:`~bytes.decode` method but supports a few more possible handlers. As well as
 ``'strict'``, ``'ignore'``, and ``'replace'`` (which in this case
 inserts a question mark instead of the unencodable character), there is
-also ``'xmlcharrefreplace'`` (inserts an XML character reference) and
-``backslashreplace`` (inserts a ``\uNNNN`` escape sequence).
+also ``'xmlcharrefreplace'`` (inserts an XML character reference),
+``backslashreplace`` (inserts a ``\uNNNN`` escape sequence) and
+``namereplace`` (inserts a ``\N{...}`` escape sequence).
 
 The following example shows the different results::
 
@@ -346,6 +347,8 @@ The following example shows the different results::
     b'&#40960;abcd&#1972;'
     >>> u.encode('ascii', 'backslashreplace')
     b'\\ua000abcd\\u07b4'
+    >>> u.encode('ascii', 'namereplace')
+    b'\\N{YI SYLLABLE IT}abcd\\u07b4'
 
 The low-level routines for registering and accessing the available
 encodings are found in the :mod:`codecs` module.  Implementing new
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 4c2a023..ea4c450 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -98,6 +98,8 @@ It defines the following functions:
      reference (for encoding only)
    * ``'backslashreplace'``: replace with backslashed escape sequences (for
      encoding only)
+   * ``'namereplace'``: replace with ``\N{...}`` escape sequences (for
+     encoding only)
    * ``'surrogateescape'``: on decoding, replace with code points in the Unicode
      Private Use Area ranging from U+DC80 to U+DCFF.  These private code
      points will then be turned back into the same bytes when the
@@ -232,6 +234,11 @@ functions which use :func:`lookup` for the codec lookup:
    Implements the ``backslashreplace`` error handling (for encoding only): the
    unencodable character is replaced by a backslashed escape sequence.
 
+.. function:: namereplace_errors(exception)
+
+   Implements the ``namereplace`` error handling (for encoding only): the
+   unencodable character is replaced by a ``\N{...}`` escape sequence.
+
 To simplify working with encoded files or stream, the module also defines these
 utility functions:
 
@@ -363,6 +370,9 @@ and implemented by all standard Python codecs:
 | ``'backslashreplace'``  | Replace with backslashed escape sequences     |
 |                         | (only for encoding).                          |
 +-------------------------+-----------------------------------------------+
+| ``'namereplace'``       | Replace with ``\N{...}`` escape sequences     |
+|                         | (only for encoding).                          |
++-------------------------+-----------------------------------------------+
 | ``'surrogateescape'``   | Replace byte with surrogate U+DCxx, as defined|
 |                         | in :pep:`383`.                                |
 +-------------------------+-----------------------------------------------+
@@ -384,6 +394,9 @@ schemes:
 .. versionchanged:: 3.4
    The ``'surrogatepass'`` error handlers now works with utf-16\* and utf-32\* codecs.
 
+.. versionadded:: 3.4
+   The ``'namereplace'`` error handler.
+
 The set of allowed values can be extended via :meth:`register_error`.
 
 
@@ -477,6 +490,8 @@ define in order to be compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`IncrementalEncoder`
@@ -625,6 +640,8 @@ compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`StreamWriter` object.
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index 9e38d6f..d1e3407 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -975,6 +975,9 @@ are always available.  They are listed here in alphabetical order.
      replaces unsupported characters with Python's backslashed escape
      sequences.
 
+   * ``'namereplace'`` (also only supported when writing)
+     replaces unsupported characters with ``\N{...}`` escape sequences.
+
    .. index::
       single: universal newlines; open() built-in function
 
diff --git a/Doc/library/io.rst b/Doc/library/io.rst
index 0054286..c77db90 100644
--- a/Doc/library/io.rst
+++ b/Doc/library/io.rst
@@ -827,9 +827,10 @@ Text I/O
    errors can lead to data loss.)  ``'replace'`` causes a replacement marker
    (such as ``'?'``) to be inserted where there is malformed data.  When
    writing, ``'xmlcharrefreplace'`` (replace with the appropriate XML character
-   reference) or ``'backslashreplace'`` (replace with backslashed escape
-   sequences) can be used.  Any other error handling name that has been
-   registered with :func:`codecs.register_error` is also valid.
+   reference), ``'backslashreplace'`` (replace with backslashed escape
+   sequences) or ``'namereplace'`` (replace with ``\N{...}`` escape sequences)
+   can be used.  Any other error handling name that has been registered with
+   :func:`codecs.register_error` is also valid.
 
    .. index::
       single: universal newlines; io.TextIOWrapper class
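
The text I/O change above can be tried out with a small sketch like the following, which writes through an ``io.TextIOWrapper`` configured with ``errors='namereplace'``. The in-memory buffer and the sample text are assumptions made for the example; the same *errors* value should work with ``open()`` when writing.

```python
import io

# An in-memory byte buffer stands in for a real file (illustrative only).
buffer = io.BytesIO()

# newline="\n" keeps the output identical across platforms.
wrapper = io.TextIOWrapper(
    buffer, encoding="ascii", errors="namereplace", newline="\n"
)

# The euro sign cannot be encoded as ASCII, so the handler substitutes
# a \N{...} escape instead of raising UnicodeEncodeError.
wrapper.write("price: 5 €\n")
wrapper.flush()

data = buffer.getvalue()
print(data)
```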