author: Victor Stinner <victor.stinner@haypocalc.com> 2011-10-26 23:38:56 (GMT)
committer: Victor Stinner <victor.stinner@haypocalc.com> 2011-10-26 23:38:56 (GMT)
commit: 2f3ca9f20efad37aad479d557c282e08481602d0
tree: 1830560daa4865d29734aa3c538eacd3199e110e /Doc
parent: cc9695643fc40780f51719d5e9a272283a743077
Close #13247: Add cp65001 codec, the Windows UTF-8 (CP_UTF8)
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/codecs.rst | 5
-rw-r--r-- Doc/whatsnew/3.3.rst | 5
2 files changed, 10 insertions, 0 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 4b33c61..4523c7f 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1011,6 +1011,11 @@ particular, the following variants typically exist:
+-----------------+--------------------------------+--------------------------------+
| cp1258          | windows-1258                   | Vietnamese                     |
+-----------------+--------------------------------+--------------------------------+
+| cp65001         |                                | Windows only: Windows UTF-8    |
+|                 |                                | (``CP_UTF8``)                  |
+|                 |                                |                                |
+|                 |                                | .. versionadded:: 3.3          |
++-----------------+--------------------------------+--------------------------------+
| euc_jp          | eucjp, ujis, u-jis             | Japanese                       |
+-----------------+--------------------------------+--------------------------------+
| euc_jis_2004    | jisx0213, eucjis2004           | Japanese                       |
diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst
index 6ae8315..1ee9c1b 100644
--- a/Doc/whatsnew/3.3.rst
+++ b/Doc/whatsnew/3.3.rst
@@ -225,6 +225,11 @@ The :mod:`~encodings.mbcs` codec has be rewritten to handle correclty
:mod:`~encodings.mbcs` codec is now supporting all error handlers, instead of
only ``replace`` to encode and ``ignore`` to decode.
+A new Windows-only codec has been added: ``cp65001`` (:issue:`13247`). It is
+the Windows code page 65001 (Windows UTF-8, ``CP_UTF8``). For example, it is
+used by ``sys.stdout`` if the console output code page is set to cp65001 (e.g.
+using ``chcp 65001`` command).
+
Multibyte CJK decoders now resynchronize faster. They only ignore the first
byte of an invalid byte sequence. For example, ``b'\xff\n'.decode('gb2312',
'replace')`` now returns a ``\n`` after the replacement character.
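
The ``cp65001`` entry added by this patch is documented as Windows-only, so portable code may want to check whether the codec is registered before using it. The following is a minimal sketch, not part of the commit: the helper names ``find_cp65001`` and ``encode_cp65001`` are hypothetical, and the fallback to ``utf-8`` is an assumption about what a caller would want. Depending on the Python version and platform, the lookup may succeed, resolve to an alias, or raise ``LookupError``.

```python
import codecs


def find_cp65001():
    """Return the CodecInfo for cp65001, or None if it is not registered.

    Hypothetical helper: codecs.lookup() raises LookupError for
    encodings that the running interpreter does not know about.
    """
    try:
        return codecs.lookup("cp65001")
    except LookupError:
        return None


def encode_cp65001(text):
    """Encode text with cp65001 when available, else with utf-8.

    The utf-8 fallback is an assumption made for this sketch.
    """
    encoding = "cp65001" if find_cp65001() is not None else "utf-8"
    return text.encode(encoding)


if __name__ == "__main__":
    info = find_cp65001()
    print("cp65001 available:", info is not None)
    # The bytes repr is printed, so this does not depend on the console encoding.
    print(encode_cp65001("h\u00e9llo"))
```

For text without surrogate code points, both branches should yield the same bytes, since code page 65001 is the Windows identifier for UTF-8.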