From a198645fa0f9a9c6183c211955083765dc8ab3a8 Mon Sep 17 00:00:00 2001
From: R David Murray
Date: Wed, 23 Dec 2015 21:17:17 -0500
Subject: #1753718: clarify RFC compliance and bytes/string argument types.

Patch includes contributions by Isobel Hooper, incorporating suggestions from
Paul Winkler. Reviewed by Martin Panter.

In addition to accepting the corrections for the RFC compliance wording,
I went through and corrected all the argument and return types, and
made the pattern of how the arguments and return types are documented
consistent. So, this patch also addresses #20782, though I had
forgotten about that issue and its patch.
---
 Doc/library/base64.rst | 128 +++++++++++++++++++++++++++----------------------
 1 file changed, 71 insertions(+), 57 deletions(-)

diff --git a/Doc/library/base64.rst b/Doc/library/base64.rst
index eba4b36..3f0161f 100644
--- a/Doc/library/base64.rst
+++ b/Doc/library/base64.rst
@@ -21,13 +21,19 @@ safely sent by email, used as parts of URLs, or included as part of an HTTP
 POST request. The encoding algorithm is not the same as the
 :program:`uuencode` program.
 
-There are two :rfc:`3548` interfaces provided by this module. The modern
-interface supports encoding and decoding ASCII byte string objects using all
-three :rfc:`3548` defined alphabets (normal, URL-safe, and filesystem-safe).
-Additionally, the decoding functions of the modern interface also accept
-Unicode strings containing only ASCII characters. The legacy interface provides
-for encoding and decoding to and from file-like objects as well as byte
-strings, but only using the Base64 standard alphabet.
+There are two interfaces provided by this module. The modern interface
+supports encoding :term:`bytes-like objects <bytes-like object>` to ASCII
+:class:`bytes`, and decoding :term:`bytes-like objects <bytes-like object>` or
+strings containing ASCII to :class:`bytes`. All three :rfc:`3548` defined
+alphabets (normal, URL-safe, and filesystem-safe) are supported.
+
+The legacy interface does not support decoding from strings, but it does
+provide functions for encoding and decoding to and from :term:`file objects
+<file object>`. It only supports the Base64 standard alphabet, and it adds
+newlines every 76 characters as per :rfc:`2045`. Note that if you are looking
+for :rfc:`2045` support you probably want to be looking at the :mod:`email`
+package instead.
+
 
 .. versionchanged:: 3.3
    ASCII-only Unicode strings are now accepted by the decoding functions of
@@ -41,26 +47,26 @@ The modern interface provides:
 
 .. function:: b64encode(s, altchars=None)
 
-   Encode a byte string using Base64.
+   Encode the :term:`bytes-like object` *s* using Base64 and return the encoded
+   :class:`bytes`.
 
-   *s* is the string to encode. Optional *altchars* must be a string of at least
+   Optional *altchars* must be a :term:`bytes-like object` of at least
    length 2 (additional characters are ignored) which specifies an alternative
    alphabet for the ``+`` and ``/`` characters. This allows an application to e.g.
    generate URL or filesystem safe Base64 strings. The default is ``None``, for
    which the standard Base64 alphabet is used.
 
-   The encoded byte string is returned.
-
 
 .. function:: b64decode(s, altchars=None, validate=False)
 
-   Decode a Base64 encoded byte string.
+   Decode the Base64 encoded :term:`bytes-like object` or ASCII string
+   *s* and return the decoded :class:`bytes`.
 
-   *s* is the byte string to decode. Optional *altchars* must be a string of
+   Optional *altchars* must be a :term:`bytes-like object` or ASCII string of
    at least length 2 (additional characters are ignored) which specifies the
    alternative alphabet used instead of the ``+`` and ``/`` characters.
 
-   The decoded string is returned. A :exc:`binascii.Error` exception is raised
+   A :exc:`binascii.Error` exception is raised
    if *s* is incorrectly padded.
 
    If *validate* is ``False`` (the default), non-base64-alphabet characters are
@@ -71,38 +77,44 @@ The modern interface provides:
 
 .. function:: standard_b64encode(s)
 
-   Encode byte string *s* using the standard Base64 alphabet.
+   Encode :term:`bytes-like object` *s* using the standard Base64 alphabet
+   and return the encoded :class:`bytes`.
 
 
 .. function:: standard_b64decode(s)
 
-   Decode byte string *s* using the standard Base64 alphabet.
+   Decode :term:`bytes-like object` or ASCII string *s* using the standard
+   Base64 alphabet and return the decoded :class:`bytes`.
 
 
 .. function:: urlsafe_b64encode(s)
 
-   Encode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-   ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet. The result
+   Encode :term:`bytes-like object` *s* using a URL-safe alphabet, which
+   substitutes ``-`` instead of ``+`` and ``_`` instead of ``/`` in the
+   standard Base64 alphabet, and return the encoded :class:`bytes`. The result
    can still contain ``=``.
 
 
 .. function:: urlsafe_b64decode(s)
 
-   Decode byte string *s* using a URL-safe alphabet, which substitutes ``-`` instead of
-   ``+`` and ``_`` instead of ``/`` in the standard Base64 alphabet.
+   Decode :term:`bytes-like object` or ASCII string *s* using a URL-safe
+   alphabet, which substitutes ``-`` instead of ``+`` and ``_`` instead of
+   ``/`` in the standard Base64 alphabet, and return the decoded
+   :class:`bytes`.
 
 
 .. function:: b32encode(s)
 
-   Encode a byte string using Base32. *s* is the string to encode. The encoded string
-   is returned.
+   Encode the :term:`bytes-like object` *s* using Base32 and return the
+   encoded :class:`bytes`.
 
 
 .. function:: b32decode(s, casefold=False, map01=None)
 
-   Decode a Base32 encoded byte string.
+   Decode the Base32 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.
 
-   *s* is the byte string to decode. Optional *casefold* is a flag specifying
+   Optional *casefold* is a flag specifying
    whether a lowercase alphabet is acceptable as input. For security purposes,
    the default is ``False``.
 
@@ -113,46 +125,45 @@ The modern interface provides:
    digit 0 is always mapped to the letter O). For security purposes the default is
    ``None``, so that 0 and 1 are not allowed in the input.
 
-   The decoded byte string is returned. A :exc:`binascii.Error` is raised if *s* is
+   A :exc:`binascii.Error` is raised if *s* is
    incorrectly padded or if there are non-alphabet characters present in the
-   string.
+   input.
 
 
 .. function:: b16encode(s)
 
-   Encode a byte string using Base16.
-
-   *s* is the string to encode. The encoded byte string is returned.
+   Encode the :term:`bytes-like object` *s* using Base16 and return the
+   encoded :class:`bytes`.
 
 
 .. function:: b16decode(s, casefold=False)
 
-   Decode a Base16 encoded byte string.
+   Decode the Base16 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.
 
-   *s* is the string to decode. Optional *casefold* is a flag specifying whether a
+   Optional *casefold* is a flag specifying whether a
    lowercase alphabet is acceptable as input. For security purposes, the default
    is ``False``.
 
-   The decoded byte string is returned. A :exc:`TypeError` is raised if *s* were
+   A :exc:`TypeError` is raised if *s* is
    incorrectly padded or if there are non-alphabet characters present in the
-   string.
+   input.
 
 
 .. function:: a85encode(s, *, foldspaces=False, wrapcol=0, pad=False, adobe=False)
 
-   Encode a byte string using Ascii85.
-
-   *s* is the string to encode. The encoded byte string is returned.
+   Encode the :term:`bytes-like object` *s* using Ascii85 and return the
+   encoded :class:`bytes`.
 
    *foldspaces* is an optional flag that uses the special short sequence 'y'
    instead of 4 consecutive spaces (ASCII 0x20) as supported by 'btoa'. This
    feature is not supported by the "standard" Ascii85 encoding.
 
-   *wrapcol* controls whether the output should have newline (``'\n'``)
+   *wrapcol* controls whether the output should have newline (``b'\n'``)
    characters added to it. If this is non-zero, each output line will be
    at most this many characters long.
 
-   *pad* controls whether the input string is padded to a multiple of 4
+   *pad* controls whether the input is padded to a multiple of 4
    before encoding. Note that the ``btoa`` implementation always pads.
 
    *adobe* controls whether the encoded byte sequence is framed with ``<~``
@@ -163,9 +174,8 @@ The modern interface provides:
 
 .. function:: a85decode(s, *, foldspaces=False, adobe=False, ignorechars=b' \\t\\n\\r\\v')
 
-   Decode an Ascii85 encoded byte string.
-
-   *s* is the byte string to decode.
+   Decode the Ascii85 encoded :term:`bytes-like object` or ASCII string *s* and
+   return the decoded :class:`bytes`.
 
    *foldspaces* is a flag that specifies whether the 'y' short sequence
    should be accepted as shorthand for 4 consecutive spaces (ASCII 0x20).
@@ -174,7 +184,8 @@ The modern interface provides:
    *adobe* controls whether the input sequence is in Adobe Ascii85 format
    (i.e. is framed with <~ and ~>).
 
-   *ignorechars* should be a byte string containing characters to ignore
+   *ignorechars* should be a :term:`bytes-like object` or ASCII string
+   containing characters to ignore
    from the input. This should only contain whitespace characters, and by
    default contains all whitespace characters in ASCII.
 
@@ -183,18 +194,19 @@ The modern interface provides:
 
 .. function:: b85encode(s, pad=False)
 
-   Encode a byte string using base85, as used in e.g. git-style binary
-   diffs.
+   Encode the :term:`bytes-like object` *s* using base85 (as used in e.g.
+   git-style binary diffs) and return the encoded :class:`bytes`.
 
-   If *pad* is true, the input is padded with "\\0" so its length is a
-   multiple of 4 characters before encoding.
+   If *pad* is true, the input is padded with ``b'\0'`` so its length is a
+   multiple of 4 bytes before encoding.
 
    .. versionadded:: 3.4
 
 
 .. function:: b85decode(b)
 
-   Decode base85-encoded byte string. Padding is implicitly removed, if
+   Decode the base85-encoded :term:`bytes-like object` or ASCII string *b* and
+   return the decoded :class:`bytes`. Padding is implicitly removed, if
    necessary.
 
    .. versionadded:: 3.4
@@ -214,15 +226,15 @@ The legacy interface:
 
    Decode the contents of the binary *input* file and write the resulting binary
    data to the *output* file. *input* and *output* must be :term:`file objects
-   <file object>`. *input* will be read until ``input.read()`` returns an empty
-   bytes object.
+   <file object>`. *input* will be read until ``input.readline()`` returns an
+   empty bytes object.
 
 
 .. function:: decodebytes(s)
               decodestring(s)
 
-   Decode the byte string *s*, which must contain one or more lines of base64
-   encoded data, and return a byte string containing the resulting binary data.
+   Decode the :term:`bytes-like object` *s*, which must contain one or more
+   lines of base64 encoded data, and return the decoded :class:`bytes`.
    ``decodestring`` is a deprecated alias.
 
    .. versionadded:: 3.1
@@ -233,17 +245,19 @@ The legacy interface:
    Encode the contents of the binary *input* file and write the resulting base64
    encoded data to the *output* file. *input* and *output* must be :term:`file
    objects <file object>`. *input* will be read until ``input.read()`` returns
-   an empty bytes object. :func:`encode` returns the encoded data plus a trailing
-   newline character (``b'\n'``).
+   an empty bytes object. :func:`encode` inserts a newline character (``b'\n'``)
+   after every 76 bytes of the output, as well as ensuring that the output
+   always ends with a newline, as per :rfc:`2045` (MIME).
 
 
 .. function:: encodebytes(s)
               encodestring(s)
 
-   Encode the byte string *s*, which can contain arbitrary binary data, and
-   return a byte string containing one or more lines of base64-encoded data.
-   :func:`encodebytes` returns a string containing one or more lines of
-   base64-encoded data always including an extra trailing newline (``b'\n'``).
+   Encode the :term:`bytes-like object` *s*, which can contain arbitrary binary
+   data, and return :class:`bytes` containing the base64-encoded data, with newlines
+   (``b'\n'``) inserted after every 76 bytes of output, and ensuring that
+   there is a trailing newline, as per :rfc:`2045` (MIME).
+   ``encodestring`` is a deprecated alias.
 
--
cgit v0.12
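
The argument and return types described in the revised wording can be checked with a short Python session. The following is an illustrative sketch only, not part of the commit: the variable names are invented for the example, and it assumes a Python 3 interpreter whose ``base64`` module behaves as the patched documentation describes.

```python
import base64

raw = b"\xfb\xff"  # chosen so the standard encoding contains both '+' and '/'

# Modern interface: any bytes-like object in, bytes out.
standard = base64.b64encode(raw)                    # b'+/8='
urlsafe = base64.urlsafe_b64encode(bytearray(raw))  # b'-_8='
custom = base64.b64encode(raw, altchars=b"-_")      # same alphabet via altchars

# Modern decoders accept bytes-like objects or ASCII-only str.
from_str = base64.b64decode("+/8=")
from_bytes = base64.urlsafe_b64decode(b"-_8=")

# Legacy interface: RFC 2045 style output, a newline after every
# 76 bytes of output plus a guaranteed trailing newline.
wrapped = base64.encodebytes(b"x" * 100)
line_lengths = [len(line) for line in wrapped.split(b"\n")]  # [76, 60, 0]

# Legacy decoder: bytes-like objects only, str is rejected.
try:
    base64.decodebytes("+/8=")
    legacy_accepts_str = True
except TypeError:
    legacy_accepts_str = False
```

The trailing ``0`` in ``line_lengths`` comes from splitting on the final newline that ``encodebytes`` always appends.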