author | Benjamin Peterson <benjamin@python.org> | 2012-12-02 16:33:14 (GMT) |
---|---|---|
committer | Benjamin Peterson <benjamin@python.org> | 2012-12-02 16:33:14 (GMT) |
commit | c77dd206989c9f8641f5e128f29599383526e645 (patch) | |
tree | c4e9c1e7b4696c92f2384d874cb79d6a72a36d9b | |
parent | 26e5335a46f34944da7fd20ab8b1574fae6a5585 (diff) | |
parent | 78f7e3a8dc999114c6863754b0c72ad5a9ec93eb (diff) | |
merge 3.3
-rw-r--r-- | Doc/library/codecs.rst | 17 |
-rw-r--r-- | Doc/library/exceptions.rst | 24 |
2 files changed, 34 insertions, 7 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 071fc23..28ea89d 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -155,13 +155,16 @@ functions which use :func:`lookup` for the codec lookup:
    when *name* is specified as the errors parameter.
 
    For encoding *error_handler* will be called with a :exc:`UnicodeEncodeError`
-   instance, which contains information about the location of the error. The error
-   handler must either raise this or a different exception or return a tuple with a
-   replacement for the unencodable part of the input and a position where encoding
-   should continue. The encoder will encode the replacement and continue encoding
-   the original input at the specified position. Negative position values will be
-   treated as being relative to the end of the input string. If the resulting
-   position is out of bound an :exc:`IndexError` will be raised.
+   instance, which contains information about the location of the error. The
+   error handler must either raise this or a different exception or return a
+   tuple with a replacement for the unencodable part of the input and a position
+   where encoding should continue. The replacement may be either :class:`str` or
+   :class:`bytes`. If the replacement is bytes, the encoder will simply copy
+   them into the output buffer. If the replacement is a string, the encoder will
+   encode the replacement. Encoding continues on original input at the
+   specified position. Negative position values will be treated as being
+   relative to the end of the input string. If the resulting position is out of
+   bound an :exc:`IndexError` will be raised.
 
    Decoding and translating works similar, except :exc:`UnicodeDecodeError` or
    :exc:`UnicodeTranslateError` will be passed to the handler and that the
diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst
index ccc6005..624ba75 100644
--- a/Doc/library/exceptions.rst
+++ b/Doc/library/exceptions.rst
@@ -377,6 +377,30 @@ The following exceptions are the exceptions that are usually raised.
    Raised when a Unicode-related encoding or decoding error occurs. It is a
    subclass of :exc:`ValueError`.
 
+   :exc:`UnicodeError` has attributes that describe the encoding or decoding
+   error. For example, ``err.object[err.start:err.end]`` gives the particular
+   invalid input that the codec failed on.
+
+   .. attribute:: encoding
+
+       The name of the encoding that raised the error.
+
+   .. attribute:: reason
+
+       A string describing the specific codec error.
+
+   .. attribute:: object
+
+       The object the codec was attempting to encode or decode.
+
+   .. attribute:: start
+
+       The first index of invalid data in :attr:`object`.
+
+   .. attribute:: end
+
+       The index after the last invalid data in :attr:`object`.
+
 
 .. exception:: UnicodeEncodeError
 
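
The behaviour documented by this change can be tried out with a short script. The following is a minimal sketch, not part of the commit: the handler names `demo_bytes` and `demo_str` and both handler functions are made up for illustration. One handler returns a `bytes` replacement, which the encoder copies into the output as-is. The other returns a `str` replacement, which the encoder encodes itself. Both use the `object`, `start` and `end` attributes described in the `exceptions.rst` hunk.

```python
import codecs


def hex_bytes_handler(err):
    """Hypothetical handler: replace unencodable text with raw bytes."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    # The slice object[start:end] is the input the codec failed on.
    bad = err.object[err.start:err.end]
    replacement = "".join("<U+%04X>" % ord(ch) for ch in bad).encode("ascii")
    # A bytes replacement is copied straight into the output buffer.
    return replacement, err.end


def question_handler(err):
    """Hypothetical handler: replace unencodable text with a str."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    # A str replacement is encoded by the encoder before being written.
    return "?" * (err.end - err.start), err.end


# Register both handlers under illustrative names.
codecs.register_error("demo_bytes", hex_bytes_handler)
codecs.register_error("demo_str", question_handler)

encoded_bytes = "caf\u00e9".encode("ascii", errors="demo_bytes")
encoded_str = "caf\u00e9".encode("ascii", errors="demo_str")

# Without a custom handler, the same attributes are available on the
# exception that the default "strict" handler raises.
try:
    "caf\u00e9".encode("ascii")
except UnicodeEncodeError as exc:
    info = (exc.encoding, exc.object[exc.start:exc.end], exc.start, exc.end)

print(encoded_bytes, encoded_str, info)
```

Both handlers return `err.end` as the resume position, so encoding continues immediately after the invalid span.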