Diffstat (limited to 'Doc/library/codecs.rst')
| -rw-r--r-- | Doc/library/codecs.rst | 32 |
1 files changed, 25 insertions, 7 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index fb3af3b..06bce84 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -7,6 +7,7 @@
 .. sectionauthor:: Marc-André Lemburg <mal@lemburg.com>
 .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>
 
+**Source code:** :source:`Lib/codecs.py`
 
 .. index::
    single: Unicode
@@ -22,10 +23,9 @@ manages the codec and error handling lookup process.
 
 It defines the following functions:
 
-.. function:: encode(obj, [encoding[, errors]])
+.. function:: encode(obj, encoding='utf-8', errors='strict')
 
-   Encodes *obj* using the codec registered for *encoding*. The default
-   encoding is ``utf-8``.
+   Encodes *obj* using the codec registered for *encoding*.
 
    *Errors* may be given to set the desired error handling scheme. The
    default error handler is ``strict`` meaning that encoding errors raise
@@ -33,10 +33,9 @@ It defines the following functions:
    :exc:`UnicodeEncodeError`). Refer to :ref:`codec-base-classes` for more
    information on codec error handling.
 
-.. function:: decode(obj, [encoding[, errors]])
+.. function:: decode(obj, encoding='utf-8', errors='strict')
 
-   Decodes *obj* using the codec registered for *encoding*. The default
-   encoding is ``utf-8``.
+   Decodes *obj* using the codec registered for *encoding*.
 
    *Errors* may be given to set the desired error handling scheme. The
    default error handler is ``strict`` meaning that decoding errors raise
@@ -99,6 +98,8 @@ It defines the following functions:
      reference (for encoding only)
    * ``'backslashreplace'``: replace with backslashed escape sequences (for
      encoding only)
+   * ``'namereplace'``: replace with ``\N{...}`` escape sequences (for
+     encoding only)
    * ``'surrogateescape'``: on decoding, replace with code points in the Unicode
      Private Use Area ranging from U+DC80 to U+DCFF. These private code
      points will then be turned back into the same bytes when the
@@ -233,6 +234,13 @@ functions which use :func:`lookup` for the codec lookup:
    Implements the ``backslashreplace`` error handling (for encoding only): the
    unencodable character is replaced by a backslashed escape sequence.
 
+.. function:: namereplace_errors(exception)
+
+   Implements the ``namereplace`` error handling (for encoding only): the
+   unencodable character is replaced by a ``\N{...}`` escape sequence.
+
+   .. versionadded:: 3.5
+
 To simplify working with encoded files or stream, the module also defines these
 utility functions:
 
@@ -319,6 +327,7 @@ and writing to platform dependent files:
    encodings.
 
 
+.. _surrogateescape:
 .. _codec-base-classes:
 
 Codec Base Classes
@@ -363,6 +372,9 @@ and implemented by all standard Python codecs:
 | ``'backslashreplace'``  | Replace with backslashed escape sequences     |
 |                         | (only for encoding).                          |
 +-------------------------+-----------------------------------------------+
+| ``'namereplace'``       | Replace with ``\N{...}`` escape sequences     |
+|                         | (only for encoding).                          |
++-------------------------+-----------------------------------------------+
 | ``'surrogateescape'``   | Replace byte with surrogate U+DCxx, as defined|
 |                         | in :pep:`383`.                                |
 +-------------------------+-----------------------------------------------+
@@ -384,6 +396,9 @@ schemes:
 .. versionchanged:: 3.4
    The ``'surrogatepass'`` error handlers now works with utf-16\* and utf-32\* codecs.
 
+.. versionadded:: 3.5
+   The ``'namereplace'`` error handler.
+
 The set of allowed values can be extended via :meth:`register_error`.
 
 
@@ -477,6 +492,8 @@ define in order to be compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`IncrementalEncoder`
@@ -625,6 +642,8 @@ compatible with the Python codec registry.
 
    * ``'backslashreplace'`` Replace with backslashed escape sequences.
 
+   * ``'namereplace'`` Replace with ``\N{...}`` escape sequences.
+
    The *errors* argument will be assigned to an attribute of the same name.
    Assigning to this attribute makes it possible to switch between different error
    handling strategies during the lifetime of the :class:`StreamWriter` object.
@@ -1420,4 +1439,3 @@ This module implements a variant of the UTF-8 codec: On encoding a UTF-8 encoded
 BOM will be prepended to the UTF-8 encoded bytes. For the stateful encoder this
 is only done once (on the first write to the byte stream). For decoding an
 optional UTF-8 encoded BOM at the start of the data will be skipped.
-
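
For illustration only (not part of the patch): a minimal sketch of the behaviour the changed documentation describes, assuming Python 3.5 or later, where the ``'namereplace'`` handler is available. The sample string and variable names are hypothetical.

```python
import codecs

text = "caf\u00e9"  # hypothetical sample containing one non-ASCII character

# 'namereplace' replaces each unencodable character with a \N{...} escape
# built from its Unicode name (encoding only).
encoded = codecs.encode(text, "ascii", "namereplace")

# With encoding and errors omitted, the documented defaults apply:
# encoding='utf-8' and errors='strict'.
default_encoded = codecs.encode(text)
round_tripped = codecs.decode(default_encoded)

print(encoded)
print(default_encoded)
print(round_tripped)
```

Arguments are passed positionally here so that the sketch relies only on the parameter order shown in the new signatures.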
