author: Andrew Kuchling <amk@amk.ca> 2013-06-16 16:58:48 (GMT)
committer: Andrew Kuchling <amk@amk.ca> 2013-06-16 16:58:48 (GMT)
commit: c7b6c50f29fac4971e7271ac649ee3b7ef3deac7
tree: 84ce5a8e35e7d3da929f85c9f025beb2ddc05d7a /Doc/library
parent: 893f2ffc7c8a110f069bb05c66e60632cc49cbef
Describe 'surrogateescape' in the documentation.
Also, improve some docstring descriptions of the 'errors' parameter. Closes #14015.
Diffstat (limited to 'Doc/library')
-rw-r--r-- Doc/library/codecs.rst | 6
-rw-r--r-- Doc/library/functions.rst | 40
2 files changed, 35 insertions, 11 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 0d38253..e80fc3a 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -78,7 +78,11 @@ It defines the following functions:
reference (for encoding only)
* ``'backslashreplace'``: replace with backslashed escape sequences (for
encoding only)
- * ``'surrogateescape'``: replace with surrogate U+DCxx, see :pep:`383`
+ * ``'surrogateescape'``: on decoding, replace with code points in the Unicode
+ Private Use Area ranging from U+DC80 to U+DCFF. These private code
+ points will then be turned back into the same bytes when the
+ ``surrogateescape`` error handler is used when encoding the data.
+ (See :pep:`383` for more.)
as well as any other error handling name defined via :func:`register_error`.
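
Note on the hunk above: the round trip it describes can be checked with a minimal sketch like the following. The sample bytes and variable names are illustrative assumptions, not part of the commit. The byte 0xE9 is not valid UTF-8 in this position, so the decoder maps it to the code point U+DCE9 (a lone surrogate in the U+DC80 to U+DCFF range), and encoding with the same handler restores the original byte.

```python
# Hypothetical sample: Latin-1 encoded bytes that are not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding: the undecodable byte 0xE9 becomes the code point U+DCE9.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler turns U+DCE9 back into byte 0xE9.
restored = text.encode("utf-8", errors="surrogateescape")

# Without the handler, the escaped code point cannot be encoded.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

The last step shows why the handler must be used on both sides: a string carrying these escaped code points is rejected by a strict UTF-8 encode.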
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index 3059e17..04fb95e 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -895,16 +895,36 @@ are always available. They are listed here in alphabetical order.
the list of supported encodings.
*errors* is an optional string that specifies how encoding and decoding
- errors are to be handled--this cannot be used in binary mode. Pass
- ``'strict'`` to raise a :exc:`ValueError` exception if there is an encoding
- error (the default of ``None`` has the same effect), or pass ``'ignore'`` to
- ignore errors. (Note that ignoring encoding errors can lead to data loss.)
- ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
- where there is malformed data. When writing, ``'xmlcharrefreplace'``
- (replace with the appropriate XML character reference) or
- ``'backslashreplace'`` (replace with backslashed escape sequences) can be
- used. Any other error handling name that has been registered with
- :func:`codecs.register_error` is also valid.
+ errors are to be handled--this cannot be used in binary mode.
+ A variety of standard error handlers are available, though any
+ error handling name that has been registered with
+ :func:`codecs.register_error` is also valid. The standard names
+ are:
+
+ * ``'strict'`` to raise a :exc:`ValueError` exception if there is
+ an encoding error. The default value of ``None`` has the same
+ effect.
+
+ * ``'ignore'`` ignores errors. Note that ignoring encoding errors
+ can lead to data loss.
+
+ * ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted
+ where there is malformed data.
+
+ * ``'surrogateescape'`` will represent any incorrect bytes as code
+ points in the Unicode Private Use Area ranging from U+DC80 to
+ U+DCFF. These private code points will then be turned back into
+ the same bytes when the ``surrogateescape`` error handler is used
+ when writing data. This is useful for processing files in an
+ unknown encoding.
+
+ * ``'xmlcharrefreplace'`` is only supported when writing to a file.
+ Characters not supported by the encoding are replaced with the
+ appropriate XML character reference ``&#nnn;``.
+
+ * ``'backslashreplace'`` (also only supported when writing)
+ replaces unsupported characters with Python's backslashed escape
+ sequences.
.. index::
single: universal newlines; open() built-in function
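
Note on the ``open()`` hunk above: the following is a small sketch of how the handlers listed there behave when writing. The helper name ``written_bytes``, the sample string, and the choice of the ASCII encoding are assumptions made for illustration only; they do not come from the commit.

```python
import os
import tempfile


def written_bytes(text, errors):
    """Write text to a temporary ASCII file with the given error handler
    and return the bytes that ended up on disk (hypothetical helper)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding="ascii", errors=errors, newline="") as f:
            f.write(text)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# U+00EF and U+2603 have no ASCII representation.
sample = "na\xefve \u2603"

results = {
    handler: written_bytes(sample, handler)
    for handler in ("ignore", "replace", "xmlcharrefreplace", "backslashreplace")
}

# 'strict' (the effect of the default None) raises instead of writing.
try:
    written_bytes(sample, "strict")
    strict_raised = False
except ValueError:  # UnicodeEncodeError is a subclass of ValueError
    strict_raised = True
```

Comparing the entries of ``results`` makes the trade-off visible: ``'ignore'`` and ``'replace'`` lose information, while the two write-only handlers keep the original code points recoverable from the output.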