From 54f70923a3a50971e726b62e63250ec05df21be4 Mon Sep 17 00:00:00 2001
From: Serhiy Storchaka
Date: Wed, 22 May 2013 15:28:30 +0300
Subject: Issue #17844: Refactor a documentation of Python specific encodings.

Add links to encoders and decoders for binary-to-binary codecs.
---
 Doc/library/codecs.rst | 180 ++++++++++++++++++++++++++++---------------------
 Misc/NEWS              |   7 +-
 2 files changed, 108 insertions(+), 79 deletions(-)

diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 460af13..39a3c5d 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1098,88 +1098,112 @@ particular, the following variants typically exist:
 | utf_8_sig       |                                | all languages                  |
 +-----------------+--------------------------------+--------------------------------+
 
-A number of codecs are specific to Python, so their codec names have no meaning
-outside Python. Some of them don't convert from Unicode strings to byte strings,
-but instead use the property of the Python codecs machinery that any bijective
-function with one argument can be considered as an encoding.
-
-For the codecs listed below, the result in the "encoding" direction is always a
-byte string. The result of the "decoding" direction is listed as operand type in
-the table.
-
-..
tabularcolumns:: |l|p{0.3\linewidth}|l|p{0.3\linewidth}|
-
-+--------------------+---------------------------+----------------+---------------------------+
-| Codec              | Aliases                   | Operand type   | Purpose                   |
-+====================+===========================+================+===========================+
-| base64_codec       | base64, base-64           | byte string    | Convert operand to MIME   |
-|                    |                           |                | base64 (the result always |
-|                    |                           |                | includes a trailing       |
-|                    |                           |                | ``'\n'``)                 |
-+--------------------+---------------------------+----------------+---------------------------+
-| bz2_codec          | bz2                       | byte string    | Compress the operand      |
-|                    |                           |                | using bz2                 |
-+--------------------+---------------------------+----------------+---------------------------+
-| hex_codec          | hex                       | byte string    | Convert operand to        |
-|                    |                           |                | hexadecimal               |
-|                    |                           |                | representation, with two  |
-|                    |                           |                | digits per byte           |
-+--------------------+---------------------------+----------------+---------------------------+
-| idna               |                           | Unicode string | Implements :rfc:`3490`,   |
-|                    |                           |                | see also                  |
-|                    |                           |                | :mod:`encodings.idna`     |
-+--------------------+---------------------------+----------------+---------------------------+
-| mbcs               | dbcs                      | Unicode string | Windows only: Encode      |
-|                    |                           |                | operand according to the  |
-|                    |                           |                | ANSI codepage (CP_ACP)    |
-+--------------------+---------------------------+----------------+---------------------------+
-| palmos             |                           | Unicode string | Encoding of PalmOS 3.5    |
-+--------------------+---------------------------+----------------+---------------------------+
-| punycode           |                           | Unicode string | Implements :rfc:`3492`    |
-+--------------------+---------------------------+----------------+---------------------------+
-| quopri_codec       | quopri, quoted-printable, | byte string    | Convert operand to MIME   |
-|                    | quotedprintable           |                | quoted printable          |
-+--------------------+---------------------------+----------------+---------------------------+
-| raw_unicode_escape |                           | Unicode string | Produce a string
that is  |
-|                    |                           |                | suitable as raw Unicode   |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| rot_13             | rot13                     | Unicode string | Returns the Caesar-cypher |
-|                    |                           |                | encryption of the operand |
-+--------------------+---------------------------+----------------+---------------------------+
-| string_escape      |                           | byte string    | Produce a string that is  |
-|                    |                           |                | suitable as string        |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| undefined          |                           | any            | Raise an exception for    |
-|                    |                           |                | all conversions. Can be   |
-|                    |                           |                | used as the system        |
-|                    |                           |                | encoding if no automatic  |
-|                    |                           |                | :term:`coercion` between  |
-|                    |                           |                | byte and Unicode strings  |
-|                    |                           |                | is desired.               |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_escape     |                           | Unicode string | Produce a string that is  |
-|                    |                           |                | suitable as Unicode       |
-|                    |                           |                | literal in Python source  |
-|                    |                           |                | code                      |
-+--------------------+---------------------------+----------------+---------------------------+
-| unicode_internal   |                           | Unicode string | Return the internal       |
-|                    |                           |                | representation of the     |
-|                    |                           |                | operand                   |
-+--------------------+---------------------------+----------------+---------------------------+
-| uu_codec           | uu                        | byte string    | Convert the operand using |
-|                    |                           |                | uuencode                  |
-+--------------------+---------------------------+----------------+---------------------------+
-| zlib_codec         | zip, zlib                 | byte string    | Compress the operand      |
-|                    |                           |                | using gzip                |
-+--------------------+---------------------------+----------------+---------------------------+
+Python Specific Encodings
+-------------------------
+
+A number of predefined codecs are specific to Python, so their codec names have
+no meaning outside Python.
These are listed in the tables below based on the
+expected input and output types (note that while text encodings are the most
+common use case for codecs, the underlying codec infrastructure supports
+arbitrary data transforms rather than just text encodings). For asymmetric
+codecs, the stated purpose describes the encoding direction.
+
+The following codecs provide unicode-to-str encoding [#encoding-note]_ and
+str-to-unicode decoding [#decoding-note]_, similar to the Unicode text
+encodings.
+
+.. tabularcolumns:: |l|L|L|
+
++--------------------+---------------------------+---------------------------+
+| Codec              | Aliases                   | Purpose                   |
++====================+===========================+===========================+
+| idna               |                           | Implements :rfc:`3490`,   |
+|                    |                           | see also                  |
+|                    |                           | :mod:`encodings.idna`     |
++--------------------+---------------------------+---------------------------+
+| mbcs               | dbcs                      | Windows only: Encode      |
+|                    |                           | operand according to the  |
+|                    |                           | ANSI codepage (CP_ACP)    |
++--------------------+---------------------------+---------------------------+
+| palmos             |                           | Encoding of PalmOS 3.5    |
++--------------------+---------------------------+---------------------------+
+| punycode           |                           | Implements :rfc:`3492`    |
++--------------------+---------------------------+---------------------------+
+| raw_unicode_escape |                           | Produce a string that is  |
+|                    |                           | suitable as raw Unicode   |
+|                    |                           | literal in Python source  |
+|                    |                           | code                      |
++--------------------+---------------------------+---------------------------+
+| rot_13             | rot13                     | Returns the Caesar-cypher |
+|                    |                           | encryption of the operand |
++--------------------+---------------------------+---------------------------+
+| undefined          |                           | Raise an exception for    |
+|                    |                           | all conversions. Can be   |
+|                    |                           | used as the system        |
+|                    |                           | encoding if no automatic  |
+|                    |                           | :term:`coercion` between  |
+|                    |                           | byte and Unicode strings  |
+|                    |                           | is desired.
|
++--------------------+---------------------------+---------------------------+
+| unicode_escape     |                           | Produce a string that is  |
+|                    |                           | suitable as Unicode       |
+|                    |                           | literal in Python source  |
+|                    |                           | code                      |
++--------------------+---------------------------+---------------------------+
+| unicode_internal   |                           | Return the internal       |
+|                    |                           | representation of the     |
+|                    |                           | operand                   |
++--------------------+---------------------------+---------------------------+
 
 .. versionadded:: 2.3
    The ``idna`` and ``punycode`` encodings.
 
+The following codecs provide str-to-str encoding and decoding
+[#decoding-note]_.
+
+.. tabularcolumns:: |l|L|L|L|
+
++--------------------+---------------------------+---------------------------+------------------------------+
+| Codec              | Aliases                   | Purpose                   | Encoder/decoder              |
++====================+===========================+===========================+==============================+
+| base64_codec       | base64, base-64           | Convert operand to MIME   | :meth:`base64.b64encode`,    |
+|                    |                           | base64 (the result always | :meth:`base64.b64decode`     |
+|                    |                           | includes a trailing       |                              |
+|                    |                           | ``'\n'``)                 |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| bz2_codec          | bz2                       | Compress the operand      | :meth:`bz2.compress`,        |
+|                    |                           | using bz2                 | :meth:`bz2.decompress`       |
++--------------------+---------------------------+---------------------------+------------------------------+
+| hex_codec          | hex                       | Convert operand to        | :meth:`base64.b16encode`,    |
+|                    |                           | hexadecimal               | :meth:`base64.b16decode`     |
+|                    |                           | representation, with two  |                              |
+|                    |                           | digits per byte           |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| quopri_codec       | quopri, quoted-printable, | Convert operand to MIME   | :meth:`quopri.encodestring`, |
+|                    | quotedprintable           | quoted printable          | :meth:`quopri.decodestring`  |
++--------------------+---------------------------+---------------------------+------------------------------+
+| string_escape      |                           | Produce a string that is  |                              |
+|                    |                           | suitable as string        |                              |
+|                    |                           | literal in Python source  |                              |
+|                    |                           | code                      |                              |
++--------------------+---------------------------+---------------------------+------------------------------+
+| uu_codec           | uu                        | Convert the operand using | :meth:`uu.encode`,           |
+|                    |                           | uuencode                  | :meth:`uu.decode`            |
++--------------------+---------------------------+---------------------------+------------------------------+
+| zlib_codec         | zip, zlib                 | Compress the operand      | :meth:`zlib.compress`,       |
+|                    |                           | using gzip                | :meth:`zlib.decompress`      |
++--------------------+---------------------------+---------------------------+------------------------------+
+
+.. [#encoding-note] str objects are also accepted as input in place of unicode
+   objects. They are implicitly converted to unicode by decoding them using
+   the default encoding. If this conversion fails, it may lead to encoding
+   operations raising :exc:`UnicodeDecodeError`.
+
+.. [#decoding-note] unicode objects are also accepted as input in place of str
+   objects. They are implicitly converted to str by encoding them using the
+   default encoding. If this conversion fails, it may lead to decoding
+   operations raising :exc:`UnicodeEncodeError`.
+
 
 :mod:`encodings.idna` --- Internationalized Domain Names in Applications
 ------------------------------------------------------------------------
diff --git a/Misc/NEWS b/Misc/NEWS
index 9367bd5..391ecc5 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -26,12 +26,17 @@ IDLE
 
 - Issue #14146: Highlight source line while debugging on Windows.
 
-
 Tests
 -----
 
 - Issue #11995: test_pydoc doesn't import all sys.path modules anymore.
 
+Documentation
+-------------
+
+- Issue #17844: Refactor a documentation of Python specific encodings.
+  Add links to encoders and decoders for binary-to-binary codecs.
+
 
 What's New in Python 2.7.5?
 ===========================