author Benjamin Peterson <benjamin@python.org> 2012-12-02 16:33:14 (GMT)
committer Benjamin Peterson <benjamin@python.org> 2012-12-02 16:33:14 (GMT)
commit c77dd206989c9f8641f5e128f29599383526e645
tree c4e9c1e7b4696c92f2384d874cb79d6a72a36d9b
parent 26e5335a46f34944da7fd20ab8b1574fae6a5585
parent 78f7e3a8dc999114c6863754b0c72ad5a9ec93eb
merge 3.3
-rw-r--r-- Doc/library/codecs.rst 17
-rw-r--r-- Doc/library/exceptions.rst 24
2 files changed, 34 insertions, 7 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 071fc23..28ea89d 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -155,13 +155,16 @@ functions which use :func:`lookup` for the codec lookup:
when *name* is specified as the errors parameter.
For encoding *error_handler* will be called with a :exc:`UnicodeEncodeError`
- instance, which contains information about the location of the error. The error
- handler must either raise this or a different exception or return a tuple with a
- replacement for the unencodable part of the input and a position where encoding
- should continue. The encoder will encode the replacement and continue encoding
- the original input at the specified position. Negative position values will be
- treated as being relative to the end of the input string. If the resulting
- position is out of bound an :exc:`IndexError` will be raised.
+ instance, which contains information about the location of the error. The
+ error handler must either raise this or a different exception or return a
+ tuple with a replacement for the unencodable part of the input and a position
+ where encoding should continue. The replacement may be either :class:`str` or
+ :class:`bytes`. If the replacement is bytes, the encoder will simply copy
+ them into the output buffer. If the replacement is a string, the encoder will
+ encode the replacement. Encoding continues on original input at the
+ specified position. Negative position values will be treated as being
+ relative to the end of the input string. If the resulting position is out of
+ bound an :exc:`IndexError` will be raised.
Decoding and translating works similar, except :exc:`UnicodeDecodeError` or
:exc:`UnicodeTranslateError` will be passed to the handler and that the
diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst
index ccc6005..624ba75 100644
--- a/Doc/library/exceptions.rst
+++ b/Doc/library/exceptions.rst
@@ -377,6 +377,30 @@ The following exceptions are the exceptions that are usually raised.
Raised when a Unicode-related encoding or decoding error occurs. It is a
subclass of :exc:`ValueError`.
+ :exc:`UnicodeError` has attributes that describe the encoding or decoding
+ error. For example, ``err.object[err.start:err.end]`` gives the particular
+ invalid input that the codec failed on.
+
+ .. attribute:: encoding
+
+ The name of the encoding that raised the error.
+
+ .. attribute:: reason
+
+ A string describing the specific codec error.
+
+ .. attribute:: object
+
+ The object the codec was attempting to encode or decode.
+
+ .. attribute:: start
+
+ The first index of invalid data in :attr:`object`.
+
+ .. attribute:: end
+
+ The index after the last invalid data in :attr:`object`.
+
.. exception:: UnicodeEncodeError
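
The two documentation changes in this diff can be exercised together. The following is a minimal sketch, not part of the commit: the handler name ``example.hexbytes`` and the function ``hex_escape_bytes`` are hypothetical names chosen for illustration. It reads the invalid input via ``err.object[err.start:err.end]`` and returns a :class:`bytes` replacement, which the encoder copies into the output without re-encoding.

```python
import codecs


def hex_escape_bytes(err):
    """Hypothetical error handler: emit a bytes marker per unencodable character."""
    # This sketch handles encoding errors only; anything else is re-raised.
    if not isinstance(err, UnicodeEncodeError):
        raise err
    # The slice described in the exceptions.rst change: the invalid input itself.
    bad = err.object[err.start:err.end]
    # Build a bytes replacement; it is copied to the output as-is.
    replacement = "".join("[U+%04X]" % ord(ch) for ch in bad).encode("ascii")
    # Resume encoding right after the invalid span.
    return replacement, err.end


# "example.hexbytes" is an illustrative name, not one defined by the standard library.
codecs.register_error("example.hexbytes", hex_escape_bytes)

encoded = "caf\u00e9 ok".encode("ascii", errors="example.hexbytes")

# Inspect the UnicodeError attributes directly on a strict-mode failure.
try:
    "caf\u00e9".encode("ascii")
except UnicodeEncodeError as exc:
    info = (exc.encoding, exc.start, exc.end, exc.object[exc.start:exc.end])
```

Returning ``err.end`` as the resume position skips exactly the span the codec reported as invalid; a handler could instead return a :class:`str`, in which case the encoder encodes the replacement itself.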