Add bytes methods documentation.

author: Georg Brandl <georg@python.org> 2007-08-31 10:15:37 (GMT)
committer: Georg Brandl <georg@python.org> 2007-08-31 10:15:37 (GMT)
commit: 226878cba507cff4b6ce094063682d0b0b53cbb9 (patch)
tree: cf95b7b7cf86226c9f79ec6292c0f3187b234979 /Doc
parent: 283e35f606a52fe3ad3cb8006a41155476118b7c (diff)
download: cpython-226878cba507cff4b6ce094063682d0b0b53cbb9.zip
cpython-226878cba507cff4b6ce094063682d0b0b53cbb9.tar.gz
cpython-226878cba507cff4b6ce094063682d0b0b53cbb9.tar.bz2
3 files changed, 243 insertions, 61 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 7a035c2..aa6bc98 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1117,6 +1117,8 @@ For the codecs listed below, the result in the "encoding" direction is always a
 byte string. The result of the "decoding" direction is listed as operand type in
 the table.
 
+.. XXX fix here, should be in above table
+
 +--------------------+---------+----------------+---------------------------+
 | Codec              | Aliases | Operand type   | Purpose                   |
 +====================+=========+================+===========================+
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index e7569ad..dab3476 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -504,7 +504,7 @@ described in the :ref:`string-methods` section.  Bytes objects can be
 constructed from literals too; use a ``b`` prefix with normal string syntax:
 ``b'xyzzy'``.
 
-.. caveat::
+.. warning::
 
    While string objects are sequences of characters (represented by strings of
    length 1), bytes objects are sequences of *integers* (between 0 and 255),
@@ -649,8 +649,6 @@ Notes:
       Formerly, string concatenation never occurred in-place.
 
 
-.. XXX add bytes methods
-
 .. _string-methods:
 
 String Methods
@@ -687,7 +685,7 @@ the :mod:`re` module for string functions based on regular expressions.
 .. XXX what about str.decode???
 .. method:: str.decode([encoding[, errors]])
 
-   Decodes the string using the codec registered for *encoding*. *encoding*
+   Decode the string using the codec registered for *encoding*. *encoding*
    defaults to the default string encoding.  *errors* may be given to set a
    different error handling scheme.  The default is ``'strict'``, meaning that
    encoding errors raise :exc:`UnicodeError`.  Other possible values are
@@ -700,7 +698,7 @@ the :mod:`re` module for string functions based on regular expressions.
       Support for other error handling schemes added.
 
 
-.. method:: str.encode([encoding[,errors]])
+.. method:: str.encode([encoding[, errors]])
 
    Return an encoded version of the string.  Default encoding is the current
    default string encoding.  *errors* may be given to set a different error
@@ -869,7 +867,7 @@ the :mod:`re` module for string functions based on regular expressions.
    occurrences are replaced.
 
 
-.. method:: str.rfind(sub [,start [,end]])
+.. method:: str.rfind(sub[, start[, end]])
 
    Return the highest index in the string where substring *sub* is found, such that
    *sub* is contained within s[start,end].  Optional arguments *start* and *end*
@@ -902,7 +900,7 @@ the :mod:`re` module for string functions based on regular expressions.
    .. versionadded:: 2.5
 
 
-.. method:: str.rsplit([sep [,maxsplit]])
+.. method:: str.rsplit([sep[, maxsplit]])
 
    Return a list of the words in the string, using *sep* as the delimiter string.
    If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost*
@@ -929,17 +927,17 @@ the :mod:`re` module for string functions based on regular expressions.
       Support for the *chars* argument.
 
 
-.. method:: str.split([sep [,maxsplit]])
+.. method:: str.split([sep[, maxsplit]])
 
-   Return a list of the words in the string, using *sep* as the delimiter string.
-   If *maxsplit* is given, at most *maxsplit* splits are done. (thus, the list will
-   have at most ``maxsplit+1`` elements).  If *maxsplit* is not specified, then
-   there is no limit on the number of splits (all possible splits are made).
-   Consecutive delimiters are not grouped together and are deemed to delimit empty
-   strings (for example, ``'1,,2'.split(',')`` returns ``['1', '', '2']``).  The
-   *sep* argument may consist of multiple characters (for example, ``'1, 2,
-   3'.split(', ')`` returns ``['1', '2', '3']``).  Splitting an empty string with a
-   specified separator returns ``['']``.
+   Return a list of the words in the string, using *sep* as the delimiter
+   string.  If *maxsplit* is given, at most *maxsplit* splits are done (thus,
+   the list will have at most ``maxsplit+1`` elements).  If *maxsplit* is not
+   specified, then there is no limit on the number of splits (all possible
+   splits are made).  Consecutive delimiters are not grouped together and are
+   deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
+   ``['1', '', '2']``).  The *sep* argument may consist of multiple characters
+   (for example, ``'1, 2, 3'.split(', ')`` returns ``['1', '2', '3']``).
+   Splitting an empty string with a specified separator returns ``['']``.
 
    If *sep* is not specified or is ``None``, a different splitting algorithm is
    applied.  First, whitespace characters (spaces, tabs, newlines, returns, and
@@ -999,7 +997,7 @@ the :mod:`re` module for string functions based on regular expressions.
 
 .. method:: str.translate(map)
 
-   Returns a copy of the *s* where all characters have been mapped through the
+   Return a copy of the *s* where all characters have been mapped through the
    *map* which must be a mapping of Unicode ordinals (integers) to Unicode
    ordinals, strings or ``None``.  Unmapped characters are left
    untouched. Characters mapped to ``None`` are deleted.
@@ -1043,7 +1041,7 @@ Old String Formatting Operations
 
 .. note::
 
-   The formatting operations described here are obsolete and my go away in future
+   The formatting operations described here are obsolete and may go away in future
    versions of Python.  Use the new :ref:`string-formatting` in new code.
 
 String objects have one unique built-in operation: the ``%`` operator (modulo).
@@ -1238,12 +1236,17 @@ Mutable Sequence Types
 .. index::
    triple: mutable; sequence; types
    object: list
+   object: bytes
 
-List objects support additional operations that allow in-place modification of
-the object. Other mutable sequence types (when added to the language) should
-also support these operations. Strings and tuples are immutable sequence types:
-such objects cannot be modified once created. The following operations are
-defined on mutable sequence types (where *x* is an arbitrary object):
+List and bytes objects support additional operations that allow in-place
+modification of the object.  Other mutable sequence types (when added to the
+language) should also support these operations.  Strings and tuples are
+immutable sequence types: such objects cannot be modified once created. The
+following operations are defined on mutable sequence types (where *x* is an
+arbitrary object).
+
+Note that while lists allow their items to be of any type, the "items" of a
+bytes object are all integers in the range 0 <= x < 256.
 
 +------------------------------+--------------------------------+---------------------+
 | Operation                    | Result                         | Notes               |
@@ -1263,30 +1266,30 @@ defined on mutable sequence types (where *x* is an arbitrary object):
 | ``del s[i:j:k]``             | removes the elements of        |                     |
 |                              | ``s[i:j:k]`` from the list     |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.append(x)``              | same as ``s[len(s):len(s)] =   | \(2)                |
+| ``s.append(x)``              | same as ``s[len(s):len(s)] =   |                     |
 |                              | [x]``                          |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.extend(x)``              | same as ``s[len(s):len(s)] =   | \(3)                |
+| ``s.extend(x)``              | same as ``s[len(s):len(s)] =   | \(2)                |
 |                              | x``                            |                     |
 +------------------------------+--------------------------------+---------------------+
 | ``s.count(x)``               | return number of *i*'s for     |                     |
 |                              | which ``s[i] == x``            |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.index(x[, i[, j]])``     | return smallest *k* such that  | \(4)                |
+| ``s.index(x[, i[, j]])``     | return smallest *k* such that  | \(3)                |
 |                              | ``s[k] == x`` and ``i <= k <   |                     |
 |                              | j``                            |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.insert(i, x)``           | same as ``s[i:i] = [x]``       | \(5)                |
+| ``s.insert(i, x)``           | same as ``s[i:i] = [x]``       | \(4)                |
 +------------------------------+--------------------------------+---------------------+
-| ``s.pop([i])``               | same as ``x = s[i]; del s[i];  | \(6)                |
+| ``s.pop([i])``               | same as ``x = s[i]; del s[i];  | \(5)                |
 |                              | return x``                     |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.remove(x)``              | same as ``del s[s.index(x)]``  | \(4)                |
+| ``s.remove(x)``              | same as ``del s[s.index(x)]``  | \(3)                |
 +------------------------------+--------------------------------+---------------------+
-| ``s.reverse()``              | reverses the items of *s* in   | \(7)                |
+| ``s.reverse()``              | reverses the items of *s* in   | \(6)                |
 |                              | place                          |                     |
 +------------------------------+--------------------------------+---------------------+
-| ``s.sort([cmp[, key[,        | sort the items of *s* in place | (7), (8), (9), (10) |
+| ``s.sort([cmp[, key[,        | sort the items of *s* in place | (6), (7)            |
 | reverse]]])``                |                                |                     |
 +------------------------------+--------------------------------+---------------------+
 
@@ -1297,32 +1300,27 @@ defined on mutable sequence types (where *x* is an arbitrary object):
    pair: slice; assignment
    pair: extended slice; assignment
    statement: del
-   single: append() (list method)
-   single: extend() (list method)
-   single: count() (list method)
-   single: index() (list method)
-   single: insert() (list method)
-   single: pop() (list method)
-   single: remove() (list method)
-   single: reverse() (list method)
-   single: sort() (list method)
+   single: append() (sequence method)
+   single: extend() (sequence method)
+   single: count() (sequence method)
+   single: index() (sequence method)
+   single: insert() (sequence method)
+   single: pop() (sequence method)
+   single: remove() (sequence method)
+   single: reverse() (sequence method)
+   single: sort() (sequence method)
 
 Notes:
 
 (1)
-   *t* must have the same length as the slice it is  replacing.
+   *t* must have the same length as the slice it is replacing.
 
 (2)
-   The C implementation of Python has historically accepted multiple parameters and
-   implicitly joined them into a tuple; this no longer works in Python 2.0.  Use of
-   this misfeature has been deprecated since Python 1.4.
-
-(3)
    *x* can be any iterable object.
 
-(4)
+(3)
    Raises :exc:`ValueError` when *x* is not found in *s*. When a negative index is
-   passed as the second or third parameter to the :meth:`index` method, the list
+   passed as the second or third parameter to the :meth:`index` method, the sequence
    length is added, as for slice indices.  If it is still negative, it is truncated
    to zero, as for slice indices.
 
@@ -1330,25 +1328,27 @@ Notes:
       Previously, :meth:`index` didn't have arguments for specifying start and stop
       positions.
 
-(5)
+(4)
    When a negative index is passed as the first parameter to the :meth:`insert`
-   method, the list length is added, as for slice indices.  If it is still
+   method, the sequence length is added, as for slice indices.  If it is still
    negative, it is truncated to zero, as for slice indices.
 
    .. versionchanged:: 2.3
       Previously, all negative indices were truncated to zero.
 
+(5)
+   The optional argument *i* defaults to ``-1``, so that by default the last
+   item is removed and returned.
+
 (6)
-   The :meth:`pop` method is only supported by the list and array types.  The
-   optional argument *i* defaults to ``-1``, so that by default the last item is
-   removed and returned.
+   The :meth:`sort` and :meth:`reverse` methods modify the sequence in place for
+   economy of space when sorting or reversing a large sequence.  To remind you
+   that they operate by side effect, they don't return the sorted or reversed
+   sequence.
 
 (7)
-   The :meth:`sort` and :meth:`reverse` methods modify the list in place for
-   economy of space when sorting or reversing a large list.  To remind you that
-   they operate by side effect, they don't return the sorted or reversed list.
+   :meth:`sort` is not supported by bytes objects.
 
-(8)
    The :meth:`sort` method takes optional arguments for controlling the
    comparisons.
 
@@ -1374,19 +1374,199 @@ Notes:
    .. versionchanged:: 2.4
       Support for *key* and *reverse* was added.
 
-(9)
    Starting with Python 2.3, the :meth:`sort` method is guaranteed to be stable.  A
    sort is stable if it guarantees not to change the relative order of elements
    that compare equal --- this is helpful for sorting in multiple passes (for
    example, sort by department, then by salary grade).
 
-(10)
    While a list is being sorted, the effect of attempting to mutate, or even
    inspect, the list is undefined.  The C implementation of Python 2.3 and newer
    makes the list appear empty for the duration, and raises :exc:`ValueError` if it
    can detect that the list has been mutated during a sort.
 
 
+.. _bytes-methods:
+
+Bytes Methods
+-------------
+
+.. index:: pair: bytes; methods
+
+In addition to the operations on mutable sequence types (see
+:ref:`typesseq-mutable`), bytes objects, being "mutable ASCII strings", have
+further useful methods also found on strings.
+
+.. XXX documented "count" differently above
+
+.. method:: bytes.count(sub[, start[, end]])
+
+   In contrast to the standard sequence ``count`` method, this returns the
+   number of occurrences of substring (not item) *sub* in the slice
+   ``[start:end]``.  Optional arguments *start* and *end* are interpreted as in
+   slice notation.
+
+
+.. method:: bytes.decode([encoding[, errors]])
+
+   Decode the bytes using the codec registered for *encoding*. *encoding*
+   defaults to the default string encoding.  *errors* may be given to set a
+   different error handling scheme.  The default is ``'strict'``, meaning that
+   encoding errors raise :exc:`UnicodeError`.  Other possible values are
+   ``'ignore'``, ``'replace'`` and any other name registered via
+   :func:`codecs.register_error`, see section :ref:`codec-base-classes`.
+
+
+.. method:: bytes.endswith(suffix[, start[, end]])
+
+   Return ``True`` if the bytes object ends with the specified *suffix*,
+   otherwise return ``False``.  *suffix* can also be a tuple of suffixes to look
+   for.  With optional *start*, test beginning at that position.  With optional
+   *end*, stop comparing at that position.
+
+
+.. method:: bytes.find(sub[, start[, end]])
+
+   Return the lowest index in the bytes object where substring *sub* is found,
+   such that *sub* is contained in the slice ``[start:end]``.  Optional arguments
+   *start* and *end* are interpreted as in slice notation.  Return ``-1`` if *sub*
+   is not found.
+
+
+.. method:: bytes.fromhex(string)
+
+   This :class:`bytes` class method returns a bytes object, decoding the given
+   string object.  The string must contain two hexadecimal digits per byte;
+   spaces are ignored.
+
+   Example::
+   
+      >>> bytes.fromhex('f0 f1f2  ')
+      b'\xf0\xf1\xf2'
+
+
+.. method:: bytes.index(sub[, start[, end]])
+
+   Like :meth:`find`, but raise :exc:`ValueError` when the substring is not found.
+
+
+.. method:: bytes.join(seq)
+
+   Return a bytes object which is the concatenation of the bytes objects in the
+   sequence *seq*.  The separator between elements is the bytes object providing
+   this method.
+
+
+.. method:: bytes.lstrip(which)
+
+   Return a copy of the bytes object with leading bytes removed.  The *which*
+   argument is a bytes object specifying the set of bytes to be removed.  As
+   with :meth:`str.lstrip`, the *which* argument is not a prefix; rather, all
+   combinations of its values are stripped.
+
+
+.. method:: bytes.partition(sep)
+
+   Split the bytes object at the first occurrence of *sep*, and return a 3-tuple
+   containing the part before the separator, the separator itself, and the part
+   after the separator.  If the separator is not found, return a 3-tuple
+   containing the bytes object itself, followed by two empty bytes objects.
+
+
+.. method:: bytes.replace(old, new[, count])
+
+   Return a copy of the bytes object with all occurrences of substring *old*
+   replaced by *new*.  If the optional argument *count* is given, only the first
+   *count* occurrences are replaced.
+
+
+.. method:: bytes.rfind(sub[, start[, end]])
+
+   Return the highest index in the bytes object where substring *sub* is found,
+   such that *sub* is contained within the slice ``[start:end]``.  Optional
+   arguments *start* and *end* are interpreted as in slice notation.  Return
+   ``-1`` on failure.
+
+
+.. method:: bytes.rindex(sub[, start[, end]])
+
+   Like :meth:`rfind`, but raise :exc:`ValueError` when the substring *sub* is
+   not found.
+
+
+.. method:: bytes.rpartition(sep)
+
+   Split the bytes object at the last occurrence of *sep*, and return a 3-tuple
+   containing the part before the separator, the separator itself, and the part
+   after the separator.  If the separator is not found, return a 3-tuple
+   containing two empty bytes objects, followed by the bytes object itself.
+
+
+.. method:: bytes.rsplit(sep[, maxsplit])
+
+   Return a list of substrings, using *sep* as the delimiter.  If *maxsplit* is
+   given, at most *maxsplit* splits are done, the *rightmost* ones.  Except for
+   splitting from the right, :meth:`rsplit` behaves like :meth:`split` which is
+   described in detail below.
+
+
+.. method:: bytes.rstrip(which)
+
+   Return a copy of the bytes object with trailing bytes removed.  The *which*
+   argument is a bytes object specifying the set of bytes to be removed.  As
+   with :meth:`str.rstrip`, the *which* argument is not a suffix; rather, all
+   combinations of its values are stripped.
+
+
+.. method:: bytes.split(sep[, maxsplit])
+
+   Return a list of substrings, using *sep* as the delimiter.  If *maxsplit* is
+   given, at most *maxsplit* splits are done (thus, the list will have at most
+   ``maxsplit+1`` elements).  If *maxsplit* is not specified, then there is no
+   limit on the number of splits (all possible splits are made).  Consecutive
+   delimiters are not grouped together and are deemed to delimit empty strings
+   (for example, ``b'1,,2'.split(b',')`` returns ``[b'1', b'', b'2']``).  The
+   *sep* argument may consist of multiple bytes (for example, ``b'1, 2,
+   3'.split(b', ')`` returns ``[b'1', b'2', b'3']``).  Splitting an empty bytes
+   object with a specified separator returns ``[b'']``.
+
+
+.. method:: bytes.startswith(prefix[, start[, end]])
+
+   Return ``True`` if the bytes object starts with the *prefix*, otherwise
+   return ``False``.  *prefix* can also be a tuple of prefixes to look for.
+   With optional *start*, test beginning at that position.  With optional
+   *end*, stop comparing at that position.
+
+
+.. method:: bytes.strip(which)
+
+   Return a copy of the bytes object with leading and trailing bytes found in
+   *which* removed.  The *which* argument is a bytes object specifying the set
+   of bytes to be removed.  The *which* argument is not a prefix or suffix;
+   rather, all combinations of its values are stripped::
+
+      >>> b'www.example.com'.strip(b'cmowz.')
+      b'example'
+
+
+.. method:: bytes.translate(table[, deletechars])
+
+   Return a copy of the bytes object where all bytes occurring in the optional
+   argument *deletechars* are removed, and the remaining bytes have been mapped
+   through the given translation table, which must be a bytes object of length
+   256.
+
+   You can use the :func:`maketrans` helper function in the :mod:`string` module to
+   create a translation table.
+
+   .. XXX a None table doesn't seem to be supported
+      For string objects, set the *table* argument to
+      ``None`` for translations that only delete characters::
+
+         >>> 'read this short text'.translate(None, 'aeiou')
+         'rd ths shrt txt'
+
+
 .. _types-set:
 
 Set Types --- :class:`set`, :class:`frozenset`
diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst
index 8dbdc31..f45b311 100644
--- a/Doc/reference/expressions.rst
+++ b/Doc/reference/expressions.rst
@@ -1272,7 +1272,7 @@ groups from right to left).
 
 .. [#] While comparisons between strings make sense at the byte
    level, they may be counter-intuitive to users. For example, the
-   strings ``u"\u00C7"`` and ``u"\u0327\u0043"`` compare differently,
+   strings ``"\u00C7"`` and ``"\u0327\u0043"`` compare differently,
    even though they both represent the same unicode character (LATIN
    CAPTITAL LETTER C WITH CEDILLA).
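
The search and split methods documented in the ``stdtypes.rst`` hunk above can be tried out directly. The following is a minimal sketch, not part of the commit. It assumes a current Python 3 interpreter, where the mutable type this patch calls ``bytes`` is spelled ``bytearray`` and ``bytes`` itself is immutable; the variable names are illustrative only.

```python
# Assumption: modern Python 3, where the mutable "bytes" type described in
# this 2007 patch is available as bytearray.
data = bytearray(b"www.example.com")

# count() counts occurrences of a sub-sequence, not of a single item.
dots = data.count(b".")

# find() returns -1 when nothing matches; index() raises ValueError instead.
found = data.find(b"example")
missing = data.find(b"python")
try:
    data.index(b"python")
    index_raised = False
except ValueError:
    index_raised = True

# strip() treats its argument as a set of byte values, not a prefix or suffix.
stripped = data.strip(b"cmowz.")

# Consecutive delimiters are not grouped, so an empty element appears.
parts = bytearray(b"1,,2").split(b",")

# partition() splits at the first occurrence and always returns a 3-tuple.
head, sep, tail = data.partition(b".")

# When the separator is missing, rpartition() puts the whole object last.
no_sep = bytearray(b"abc").rpartition(b"-")
```

The comparisons in any follow-up checks can be written against ``bytes`` literals, since a ``bytearray`` compares equal to a ``bytes`` object with the same contents.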
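
The mutable sequence operations and the ``fromhex`` and ``translate`` methods covered by this patch can be sketched in the same way. Again this is an illustration rather than part of the commit, and it assumes a current Python 3 interpreter: the mutable type is ``bytearray`` there, and the translation table comes from ``bytes.maketrans`` rather than the :mod:`string` module helper the patch text mentions.

```python
# Assumption: modern Python 3, with bytearray standing in for the mutable
# "bytes" type of this patch.
buf = bytearray.fromhex("f0 f1 f2")   # two hex digits per byte, spaces ignored

buf.append(0x41)        # items are integers in range(0, 256)
buf.extend(b"BC")       # extend() accepts any iterable of such integers
buf.insert(0, 0x30)
last = buf.pop()        # defaults to index -1, the last item
buf.reverse()           # reverses in place and returns None

# Values outside 0 <= x < 256 are rejected.
try:
    buf.append(256)
    rejected = False
except ValueError:
    rejected = True

# sort() is not supported on this type.
has_sort = hasattr(buf, "sort")

# translate() first deletes the listed bytes, then maps the rest through a
# 256-byte table.
table = bytes.maketrans(b"abc", b"xyz")
translated = bytearray(b"aabbcc").translate(table, b"b")
```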