author | Georg Brandl <georg@python.org> | 2010-11-19 22:09:04 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2010-11-19 22:09:04 (GMT) |
commit | c8c60c22845177f419e4de7305102310e336b1f0 (patch) | |
tree | 442d527038f5d230feb7daa30903eb1b90f8807b /Doc/howto/unicode.rst | |
parent | c5b0ec0a8326483bb90d9e339e8bf72c63958b7d (diff) | |
Do not put a raw REPLACEMENT CHARACTER in the document.
Diffstat (limited to 'Doc/howto/unicode.rst')
-rw-r--r-- | Doc/howto/unicode.rst | 5 |
1 files changed, 4 insertions, 1 deletions
diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst
index b809182..77fcd26 100644
--- a/Doc/howto/unicode.rst
+++ b/Doc/howto/unicode.rst
@@ -263,10 +263,13 @@ Unicode result). The following examples show the differences::
     UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0:
                         unexpected code byte
     >>> b'\x80abc'.decode("utf-8", "replace")
-    '�abc'
+    '?abc'
     >>> b'\x80abc'.decode("utf-8", "ignore")
     'abc'
 
+(In this code example, the Unicode replacement character has been replaced by
+a question mark because it may not be displayed on some systems.)
+
 Encodings are specified as strings containing the encoding's name. Python 3.2
 comes with roughly 100 different encodings; see the Python Library Reference at
 :ref:`standard-encodings` for a list. Some encodings have multiple names; for
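For readers of this change: the `'?abc'` in the patched doctest is only a stand-in, and the `"replace"` handler still returns U+FFFD REPLACEMENT CHARACTER. The sketch below is illustrative and not part of the commit. It runs the three error handlers from the example and uses `ascii()` so the result prints as an escape instead of a raw character; the variable names are assumptions chosen for the illustration.

```python
# Byte 0x80 is not a valid start byte in UTF-8, so decoding must handle an error.
data = b'\x80abc'

# "strict" (the default) raises UnicodeDecodeError.
strict_error = None
try:
    data.decode("utf-8", "strict")
except UnicodeDecodeError as exc:
    strict_error = exc

# "replace" substitutes U+FFFD for the undecodable byte.
replaced = data.decode("utf-8", "replace")

# "ignore" silently drops the undecodable byte.
ignored = data.decode("utf-8", "ignore")

# ascii() escapes non-ASCII characters, so nothing depends on the
# terminal being able to display U+FFFD.
print(ascii(replaced))  # '\ufffdabc'
print(ascii(ignored))   # 'abc'
```

Printing with `ascii()` is one way to show the real result without embedding the raw character in a document, which is the problem this commit works around.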