summaryrefslogtreecommitdiffstats
path: root/Doc/library/codecs.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Doc/library/codecs.rst')
-rw-r--r--Doc/library/codecs.rst32
1 files changed, 25 insertions, 7 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index fb3af3b..06bce84 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -7,6 +7,7 @@
.. sectionauthor:: Marc-André Lemburg <mal@lemburg.com>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
+**Source code:** :source:`Lib/codecs.py`
.. index::
single: Unicode
@@ -22,10 +23,9 @@ manages the codec and error handling lookup process.
It defines the following functions:
-.. function:: encode(obj, [encoding[, errors]])
+.. function:: encode(obj, encoding='utf-8', errors='strict')
- Encodes *obj* using the codec registered for *encoding*. The default
- encoding is ``utf-8``.
+ Encodes *obj* using the codec registered for *encoding*.
*Errors* may be given to set the desired error handling scheme. The
default error handler is ``strict`` meaning that encoding errors raise
@@ -33,10 +33,9 @@ It defines the following functions:
:exc:`UnicodeEncodeError`). Refer to :ref:`codec-base-classes` for more
information on codec error handling.
-.. function:: decode(obj, [encoding[, errors]])
+.. function:: decode(obj, encoding='utf-8', errors='strict')
- Decodes *obj* using the codec registered for *encoding*. The default
- encoding is ``utf-8``.
+ Decodes *obj* using the codec registered for *encoding*.
*Errors* may be given to set the desired error handling scheme. The
default error handler is ``strict`` meaning that decoding errors raise
@@ -99,6 +98,8 @@ It defines the following functions:
reference (for encoding only)
* ``'backslashreplace'``: replace with backslashed escape sequences (for
encoding only)
+ * ``'namereplace'``: replace with ``\N{...}`` escape sequences (for
+ encoding only)
* ``'surrogateescape'``: on decoding, replace with code points in the Unicode
Private Use Area ranging from U+DC80 to U+DCFF. These private code
points will then be turned back into the same bytes when the
@@ -233,6 +234,13 @@ functions which use :func:`lookup` for the codec lookup:
Implements the ``backslashreplace`` error handling (for encoding only): the
unencodable character is replaced by a backslashed escape sequence.
+.. function:: namereplace_errors(exception)
+
+ Implements the ``namereplace`` error handling (for encoding only): the
+ unencodable character is replaced by a ``\N{...}`` escape sequence.
+
+ .. versionadded:: 3.5
+
To simplify working with encoded files or stream, the module also defines these
utility functions:
@@ -319,6 +327,7 @@ and writing to platform dependent files:
encodings.
+.. _surrogateescape:
.. _codec-base-classes:
Codec Base Classes
@@ -363,6 +372,9 @@ and implemented by all standard Python codecs:
| ``'backslashreplace'`` | Replace with backslashed escape sequences |
| | (only for encoding). |
+-------------------------+-----------------------------------------------+
+| ``'namereplace'`` | Replace with ``\N{...}`` escape sequences |
+| | (only for encoding). |
++-------------------------+-----------------------------------------------+
| ``'surrogateescape'`` | Replace byte with surrogate U+DCxx, as defined|
| | in :pep:`383`. |
+-------------------------+-----------------------------------------------+
@@ -384,6 +396,9 @@ schemes:
.. versionchanged:: 3.4
The ``'surrogatepass'`` error handlers now works with utf-16\* and utf-32\* codecs.
+.. versionadded:: 3.5
+ The ``'namereplace'`` error handler.
+
The set of allowed values can be extended via :meth:`register_error`.
@@ -477,6 +492,8 @@ define in order to be compatible with the Python codec registry.
* ``'backslashreplace'`` Replace with backslashed escape sequences.
+ * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
The *errors* argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the :class:`IncrementalEncoder`
@@ -625,6 +642,8 @@ compatible with the Python codec registry.
* ``'backslashreplace'`` Replace with backslashed escape sequences.
+ * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
The *errors* argument will be assigned to an attribute of the same name.
Assigning to this attribute makes it possible to switch between different error
handling strategies during the lifetime of the :class:`StreamWriter` object.
@@ -1420,4 +1439,3 @@ This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded
BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this
is only done once (on the first write to the byte stream). For decoding an
optional UTF-8 encoded BOM at the start of the data will be skipped.
-