author Georg Brandl <georg@python.org> 2007-08-31 10:15:37 (GMT)
committer Georg Brandl <georg@python.org> 2007-08-31 10:15:37 (GMT)
commit 226878cba507cff4b6ce094063682d0b0b53cbb9 (patch)
tree cf95b7b7cf86226c9f79ec6292c0f3187b234979 /Doc
parent 283e35f606a52fe3ad3cb8006a41155476118b7c (diff)
Add bytes methods documentation.
Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/codecs.rst | 2
-rw-r--r-- Doc/library/stdtypes.rst | 300
-rw-r--r-- Doc/reference/expressions.rst | 2
3 files changed, 243 insertions, 61 deletions
diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst
index 7a035c2..aa6bc98 100644
--- a/Doc/library/codecs.rst
+++ b/Doc/library/codecs.rst
@@ -1117,6 +1117,8 @@ For the codecs listed below, the result in the "encoding" direction is always a
byte string. The result of the "decoding" direction is listed as operand type in
the table.
+.. XXX fix here, should be in above table
+
+--------------------+---------+----------------+---------------------------+
| Codec | Aliases | Operand type | Purpose |
+====================+=========+================+===========================+
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index e7569ad..dab3476 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -504,7 +504,7 @@ described in the :ref:`string-methods` section. Bytes objects can be
constructed from literals too; use a ``b`` prefix with normal string syntax:
``b'xyzzy'``.
-.. caveat::
+.. warning::
While string objects are sequences of characters (represented by strings of
length 1), bytes objects are sequences of *integers* (between 0 and 255),
@@ -649,8 +649,6 @@ Notes:
Formerly, string concatenation never occurred in-place.
-.. XXX add bytes methods
-
.. _string-methods:
String Methods
@@ -687,7 +685,7 @@ the :mod:`re` module for string functions based on regular expressions.
.. XXX what about str.decode???
.. method:: str.decode([encoding[, errors]])
- Decodes the string using the codec registered for *encoding*. *encoding*
+ Decode the string using the codec registered for *encoding*. *encoding*
defaults to the default string encoding. *errors* may be given to set a
different error handling scheme. The default is ``'strict'``, meaning that
encoding errors raise :exc:`UnicodeError`. Other possible values are
@@ -700,7 +698,7 @@ the :mod:`re` module for string functions based on regular expressions.
Support for other error handling schemes added.
-.. method:: str.encode([encoding[,errors]])
+.. method:: str.encode([encoding[, errors]])
Return an encoded version of the string. Default encoding is the current
default string encoding. *errors* may be given to set a different error
@@ -869,7 +867,7 @@ the :mod:`re` module for string functions based on regular expressions.
occurrences are replaced.
-.. method:: str.rfind(sub [,start [,end]])
+.. method:: str.rfind(sub[, start[, end]])
Return the highest index in the string where substring *sub* is found, such that
*sub* is contained within s[start,end]. Optional arguments *start* and *end*
@@ -902,7 +900,7 @@ the :mod:`re` module for string functions based on regular expressions.
.. versionadded:: 2.5
-.. method:: str.rsplit([sep [,maxsplit]])
+.. method:: str.rsplit([sep[, maxsplit]])
Return a list of the words in the string, using *sep* as the delimiter string.
If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost*
@@ -929,17 +927,17 @@ the :mod:`re` module for string functions based on regular expressions.
Support for the *chars* argument.
-.. method:: str.split([sep [,maxsplit]])
+.. method:: str.split([sep[, maxsplit]])
- Return a list of the words in the string, using *sep* as the delimiter string.
- If *maxsplit* is given, at most *maxsplit* splits are done. (thus, the list will
- have at most ``maxsplit+1`` elements). If *maxsplit* is not specified, then
- there is no limit on the number of splits (all possible splits are made).
- Consecutive delimiters are not grouped together and are deemed to delimit empty
- strings (for example, ``'1,,2'.split(',')`` returns ``['1', '', '2']``). The
- *sep* argument may consist of multiple characters (for example, ``'1, 2,
- 3'.split(', ')`` returns ``['1', '2', '3']``). Splitting an empty string with a
- specified separator returns ``['']``.
+ Return a list of the words in the string, using *sep* as the delimiter
+ string. If *maxsplit* is given, at most *maxsplit* splits are done (thus,
+ the list will have at most ``maxsplit+1`` elements). If *maxsplit* is not
+ specified, then there is no limit on the number of splits (all possible
+ splits are made). Consecutive delimiters are not grouped together and are
+ deemed to delimit empty strings (for example, ``'1,,2'.split(',')`` returns
+ ``['1', '', '2']``). The *sep* argument may consist of multiple characters
+ (for example, ``'1, 2, 3'.split(', ')`` returns ``['1', '2', '3']``).
+ Splitting an empty string with a specified separator returns ``['']``.
If *sep* is not specified or is ``None``, a different splitting algorithm is
applied. First, whitespace characters (spaces, tabs, newlines, returns, and
@@ -999,7 +997,7 @@ the :mod:`re` module for string functions based on regular expressions.
.. method:: str.translate(map)
- Returns a copy of the *s* where all characters have been mapped through the
+ Return a copy of *s* where all characters have been mapped through the
*map* which must be a mapping of Unicode ordinals (integers) to Unicode
ordinals, strings or ``None``. Unmapped characters are left
untouched. Characters mapped to ``None`` are deleted.
@@ -1043,7 +1041,7 @@ Old String Formatting Operations
.. note::
- The formatting operations described here are obsolete and my go away in future
+ The formatting operations described here are obsolete and may go away in future
versions of Python. Use the new :ref:`string-formatting` in new code.
String objects have one unique built-in operation: the ``%`` operator (modulo).
@@ -1238,12 +1236,17 @@ Mutable Sequence Types
.. index::
triple: mutable; sequence; types
object: list
+ object: bytes
-List objects support additional operations that allow in-place modification of
-the object. Other mutable sequence types (when added to the language) should
-also support these operations. Strings and tuples are immutable sequence types:
-such objects cannot be modified once created. The following operations are
-defined on mutable sequence types (where *x* is an arbitrary object):
+List and bytes objects support additional operations that allow in-place
+modification of the object. Other mutable sequence types (when added to the
+language) should also support these operations. Strings and tuples are
+immutable sequence types: such objects cannot be modified once created. The
+following operations are defined on mutable sequence types (where *x* is an
+arbitrary object).
+
+Note that while lists allow their items to be of any type, bytes object
+"items" are all integers in the range 0 <= x < 256.
+------------------------------+--------------------------------+---------------------+
| Operation | Result | Notes |
@@ -1263,30 +1266,30 @@ defined on mutable sequence types (where *x* is an arbitrary object):
| ``del s[i:j:k]`` | removes the elements of | |
| | ``s[i:j:k]`` from the list | |
+------------------------------+--------------------------------+---------------------+
-| ``s.append(x)`` | same as ``s[len(s):len(s)] = | \(2) |
+| ``s.append(x)`` | same as ``s[len(s):len(s)] = | |
| | [x]`` | |
+------------------------------+--------------------------------+---------------------+
-| ``s.extend(x)`` | same as ``s[len(s):len(s)] = | \(3) |
+| ``s.extend(x)`` | same as ``s[len(s):len(s)] = | \(2) |
| | x`` | |
+------------------------------+--------------------------------+---------------------+
| ``s.count(x)`` | return number of *i*'s for | |
| | which ``s[i] == x`` | |
+------------------------------+--------------------------------+---------------------+
-| ``s.index(x[, i[, j]])`` | return smallest *k* such that | \(4) |
+| ``s.index(x[, i[, j]])`` | return smallest *k* such that | \(3) |
| | ``s[k] == x`` and ``i <= k < | |
| | j`` | |
+------------------------------+--------------------------------+---------------------+
-| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | \(5) |
+| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | \(4) |
+------------------------------+--------------------------------+---------------------+
-| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(6) |
+| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(5) |
| | return x`` | |
+------------------------------+--------------------------------+---------------------+
-| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(4) |
+| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(3) |
+------------------------------+--------------------------------+---------------------+
-| ``s.reverse()`` | reverses the items of *s* in | \(7) |
+| ``s.reverse()`` | reverses the items of *s* in | \(6) |
| | place | |
+------------------------------+--------------------------------+---------------------+
-| ``s.sort([cmp[, key[, | sort the items of *s* in place | (7), (8), (9), (10) |
+| ``s.sort([cmp[, key[, | sort the items of *s* in place | (6), (7) |
| reverse]]])`` | | |
+------------------------------+--------------------------------+---------------------+
@@ -1297,32 +1300,27 @@ defined on mutable sequence types (where *x* is an arbitrary object):
pair: slice; assignment
pair: extended slice; assignment
statement: del
- single: append() (list method)
- single: extend() (list method)
- single: count() (list method)
- single: index() (list method)
- single: insert() (list method)
- single: pop() (list method)
- single: remove() (list method)
- single: reverse() (list method)
- single: sort() (list method)
+ single: append() (sequence method)
+ single: extend() (sequence method)
+ single: count() (sequence method)
+ single: index() (sequence method)
+ single: insert() (sequence method)
+ single: pop() (sequence method)
+ single: remove() (sequence method)
+ single: reverse() (sequence method)
+ single: sort() (sequence method)
Notes:
(1)
- *t* must have the same length as the slice it is replacing.
+ *t* must have the same length as the slice it is replacing.
(2)
- The C implementation of Python has historically accepted multiple parameters and
- implicitly joined them into a tuple; this no longer works in Python 2.0. Use of
- this misfeature has been deprecated since Python 1.4.
-
-(3)
*x* can be any iterable object.
-(4)
+(3)
Raises :exc:`ValueError` when *x* is not found in *s*. When a negative index is
- passed as the second or third parameter to the :meth:`index` method, the list
+ passed as the second or third parameter to the :meth:`index` method, the sequence
length is added, as for slice indices. If it is still negative, it is truncated
to zero, as for slice indices.
@@ -1330,25 +1328,27 @@ Notes:
Previously, :meth:`index` didn't have arguments for specifying start and stop
positions.
-(5)
+(4)
When a negative index is passed as the first parameter to the :meth:`insert`
- method, the list length is added, as for slice indices. If it is still
+ method, the sequence length is added, as for slice indices. If it is still
negative, it is truncated to zero, as for slice indices.
.. versionchanged:: 2.3
Previously, all negative indices were truncated to zero.
+(5)
+ The optional argument *i* defaults to ``-1``, so that by default the last
+ item is removed and returned.
+
(6)
- The :meth:`pop` method is only supported by the list and array types. The
- optional argument *i* defaults to ``-1``, so that by default the last item is
- removed and returned.
+ The :meth:`sort` and :meth:`reverse` methods modify the sequence in place for
+ economy of space when sorting or reversing a large sequence. To remind you
+ that they operate by side effect, they don't return the sorted or reversed
+ sequence.
(7)
- The :meth:`sort` and :meth:`reverse` methods modify the list in place for
- economy of space when sorting or reversing a large list. To remind you that
- they operate by side effect, they don't return the sorted or reversed list.
+ :meth:`sort` is not supported by bytes objects.
-(8)
The :meth:`sort` method takes optional arguments for controlling the
comparisons.
@@ -1374,19 +1374,199 @@ Notes:
.. versionchanged:: 2.4
Support for *key* and *reverse* was added.
-(9)
Starting with Python 2.3, the :meth:`sort` method is guaranteed to be stable. A
sort is stable if it guarantees not to change the relative order of elements
that compare equal --- this is helpful for sorting in multiple passes (for
example, sort by department, then by salary grade).
-(10)
While a list is being sorted, the effect of attempting to mutate, or even
inspect, the list is undefined. The C implementation of Python 2.3 and newer
makes the list appear empty for the duration, and raises :exc:`ValueError` if it
can detect that the list has been mutated during a sort.
+.. _bytes-methods:
+
+Bytes Methods
+-------------
+
+.. index:: pair: bytes; methods
+
+In addition to the operations on mutable sequence types (see
+:ref:`typesseq-mutable`), bytes objects, being "mutable ASCII strings", have
+further useful methods also found on strings.
+
+.. XXX documented "count" differently above
+
+.. method:: bytes.count(sub[, start[, end]])
+
+ In contrast to the standard sequence ``count`` method, this returns the
+ number of occurrences of substring (not item) *sub* in the slice
+ ``[start:end]``. Optional arguments *start* and *end* are interpreted as in
+ slice notation.
+
+
+.. method:: bytes.decode([encoding[, errors]])
+
+ Decode the bytes using the codec registered for *encoding*. *encoding*
+ defaults to the default string encoding. *errors* may be given to set a
+ different error handling scheme. The default is ``'strict'``, meaning that
+ encoding errors raise :exc:`UnicodeError`. Other possible values are
+ ``'ignore'``, ``'replace'`` and any other name registered via
+ :func:`codecs.register_error`, see section :ref:`codec-base-classes`.
+
+
+.. method:: bytes.endswith(suffix[, start[, end]])
+
+ Return ``True`` if the bytes object ends with the specified *suffix*,
+ otherwise return ``False``. *suffix* can also be a tuple of suffixes to look
+ for. With optional *start*, test beginning at that position. With optional
+ *end*, stop comparing at that position.
+
+
+.. method:: bytes.find(sub[, start[, end]])
+
+ Return the lowest index in the bytes object where substring *sub* is found,
+ such that *sub* is contained within the slice ``[start:end]``. Optional
+ arguments *start* and *end* are interpreted as in slice notation. Return
+ ``-1`` if *sub* is not found.
+
+
+.. method:: bytes.fromhex(string)
+
+ This :class:`bytes` class method returns a bytes object, decoding the given
+ string object. The string must contain two hexadecimal digits per byte;
+ spaces are ignored.
+
+ Example::
+
+ >>> bytes.fromhex('f0 f1f2 ')
+ b'\xf0\xf1\xf2'
+
+
+.. method:: bytes.index(sub[, start[, end]])
+
+ Like :meth:`find`, but raise :exc:`ValueError` when the substring is not found.
+
+
+.. method:: bytes.join(seq)
+
+ Return a bytes object which is the concatenation of the bytes objects in the
+ sequence *seq*. The separator between elements is the bytes object providing
+ this method.
+
+
+.. method:: bytes.lstrip(which)
+
+ Return a copy of the bytes object with leading bytes removed. The *which*
+ argument is a bytes object specifying the set of bytes to be removed. As
+ with :meth:`str.lstrip`, the *which* argument is not a prefix; rather, all
+ combinations of its values are stripped.
+
+
+.. method:: bytes.partition(sep)
+
+ Split the bytes object at the first occurrence of *sep*, and return a 3-tuple
+ containing the part before the separator, the separator itself, and the part
+ after the separator. If the separator is not found, return a 3-tuple
+ containing the bytes object itself, followed by two empty bytes objects.
+
+
+.. method:: bytes.replace(old, new[, count])
+
+ Return a copy of the bytes object with all occurrences of substring *old*
+ replaced by *new*. If the optional argument *count* is given, only the first
+ *count* occurrences are replaced.
+
+
+.. method:: bytes.rfind(sub[, start[, end]])
+
+ Return the highest index in the bytes object where substring *sub* is found, such
+ that *sub* is contained within the slice ``[start:end]``. Optional arguments
+ *start* and *end* are interpreted as in slice notation. Return ``-1`` on
+ failure.
+
+
+.. method:: bytes.rindex(sub[, start[, end]])
+
+ Like :meth:`rfind`, but raise :exc:`ValueError` when the substring *sub* is
+ not found.
+
+
+.. method:: bytes.rpartition(sep)
+
+ Split the bytes object at the last occurrence of *sep*, and return a 3-tuple
+ containing the part before the separator, the separator itself, and the part
+ after the separator. If the separator is not found, return a 3-tuple
+ containing two empty bytes objects, followed by the bytes object itself.
+
+
+.. method:: bytes.rsplit(sep[, maxsplit])
+
+ Return a list of substrings, using *sep* as the delimiter. If *maxsplit* is
+ given, at most *maxsplit* splits are done, the *rightmost* ones. Except for
+ splitting from the right, :meth:`rsplit` behaves like :meth:`split` which is
+ described in detail below.
+
+
+.. method:: bytes.rstrip(which)
+
+ Return a copy of the bytes object with trailing bytes removed. The *which*
+ argument is a bytes object specifying the set of bytes to be removed. As
+ with :meth:`str.rstrip`, the *which* argument is not a suffix; rather, all
+ combinations of its values are stripped.
+
+
+.. method:: bytes.split(sep[, maxsplit])
+
+ Return a list of substrings, using *sep* as the delimiter. If *maxsplit* is
+ given, at most *maxsplit* splits are done (thus, the list will have at most
+ ``maxsplit+1`` elements). If *maxsplit* is not specified, then there is no
+ limit on the number of splits (all possible splits are made). Consecutive
+ delimiters are not grouped together and are deemed to delimit empty strings
+ (for example, ``b'1,,2'.split(b',')`` returns ``[b'1', b'', b'2']``). The
+ *sep* argument may consist of multiple bytes (for example, ``b'1, 2,
+ 3'.split(b', ')`` returns ``[b'1', b'2', b'3']``). Splitting an empty bytes object
+ with a specified separator returns ``[b'']``.
+
+
+.. method:: bytes.startswith(prefix[, start[, end]])
+
+ Return ``True`` if the bytes object starts with the *prefix*, otherwise
+ return ``False``. *prefix* can also be a tuple of prefixes to look for.
+ With optional *start*, test beginning at that position. With optional
+ *end*, stop comparing at that position.
+
+
+.. method:: bytes.strip(which)
+
+ Return a copy of the bytes object with leading and trailing bytes found in
+ *which* removed. The *which* argument is a bytes object specifying the set
+ of bytes to be removed. The *which* argument is not a prefix or suffix;
+ rather, all combinations of its values are stripped::
+
+ >>> b'www.example.com'.strip(b'cmowz.')
+ b'example'
+
+
+.. method:: bytes.translate(table[, deletechars])
+
+ Return a copy of the bytes object where all bytes occurring in the optional
+ argument *deletechars* are removed, and the remaining bytes have been mapped
+ through the given translation table, which must be a bytes object of length
+ 256.
+
+ You can use the :func:`maketrans` helper function in the :mod:`string` module to
+ create a translation table.
+
+ .. XXX a None table doesn't seem to be supported
+ For string objects, set the *table* argument to
+ ``None`` for translations that only delete characters::
+
+ >>> 'read this short text'.translate(None, 'aeiou')
+ 'rd ths shrt txt'
+
+
.. _types-set:
Set Types --- :class:`set`, :class:`frozenset`
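
Reviewer sketch (not part of the commit): the behaviours documented in the new "Bytes Methods" section and in the renumbered mutable-sequence notes can be checked interactively. The example below assumes a current Python 3 interpreter, which is an assumption on my part: there the mutable type this patch calls ``bytes`` is spelled ``bytearray`` and ``bytes`` itself is immutable, so the in-place operations are shown on ``bytearray`` and the substring methods on ``bytes`` literals. All variable names are illustrative.

```python
# Assumes a current Python 3 interpreter, not the 2007 py3k branch
# that this patch documents.

# Mutable-sequence operations (shown on bytearray, the modern mutable type).
buf = bytearray(b'abc')
buf.append(0x64)                  # items are integers with 0 <= x < 256
buf.extend(b'ef')                 # note (2): any iterable of such integers
last = buf.pop()                  # note (5): i defaults to -1
buf.reverse()                     # note (6): in place, returns None
has_sort = hasattr(buf, 'sort')   # note (7): sort() is not supported

# Substring-oriented methods documented in the new section.
counted = b'1,,2'.count(b',')     # counts sub-sequences, not single items
found = (b'abcabc'.find(b'bc'),   # lowest index
         b'abcabc'.rfind(b'bc'),  # highest index
         b'abcabc'.find(b'x'))    # -1 when not found
parts = b'1,,2'.split(b',')       # consecutive delimiters give empty items
head = b'key=value'.partition(b'=')
missing = b'novalue'.rpartition(b'=')
stripped = b'www.example.com'.strip(b'cmowz.')
joined = b', '.join([b'a', b'b'])
decoded = bytes.fromhex('f0 f1f2 ')

print(buf, last, has_sort)
print(counted, found, parts)
print(head, missing, stripped, joined, decoded)
```

Using ``find`` rather than ``index`` in the sketch avoids having to wrap the not-found case in a ``try`` block; ``index`` and ``rindex`` raise :exc:`ValueError` instead of returning ``-1``.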
diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst
index 8dbdc31..f45b311 100644
--- a/Doc/reference/expressions.rst
+++ b/Doc/reference/expressions.rst
@@ -1272,7 +1272,7 @@ groups from right to left).
.. [#] While comparisons between strings make sense at the byte
level, they may be counter-intuitive to users. For example, the
- strings ``u"\u00C7"`` and ``u"\u0327\u0043"`` compare differently,
+ strings ``"\u00C7"`` and ``"\u0327\u0043"`` compare differently,
even though they both represent the same unicode character (LATIN
CAPITAL LETTER C WITH CEDILLA).
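
Reviewer sketch (not part of the commit): the footnote's point, that two strings for the same abstract character compare unequal code point by code point, can be illustrated with the standard :mod:`unicodedata` module. This assumes a current Python 3 interpreter. The decomposed form is written here as the base letter followed by the combining cedilla (``"\u0043\u0327"``), since that is the order that canonically composes to U+00C7; the variable names are illustrative.

```python
import unicodedata

composed = "\u00C7"           # LATIN CAPITAL LETTER C WITH CEDILLA
decomposed = "\u0043\u0327"   # 'C' followed by COMBINING CEDILLA

# Comparison works on code points, so the two spellings differ.
equal_raw = composed == decomposed

# Normalizing both sides to the same form (NFC here) makes them comparable.
equal_nfc = unicodedata.normalize("NFC", decomposed) == composed

print(equal_raw, equal_nfc)
```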