author     Guido van Rossum <guido@python.org>  2007-11-06 21:34:58 (GMT)
committer  Guido van Rossum <guido@python.org>  2007-11-06 21:34:58 (GMT)
commit     98297ee7815939b124156e438b22bd652d67b5db
tree       a9d239ebd87c73af2571ab48003984c4e18e27e5 /Doc
parent     a19f80c6df2df5e8a5d0cff37131097835ef971e
Merging the py3k-pep3137 branch back into the py3k branch.
No detailed change log; just check out the change log for the py3k-pep3137
branch. The most obvious changes:
- str8 renamed to bytes (PyString at the C level);
- bytes renamed to buffer (PyBytes at the C level);
- PyString and PyUnicode are no longer compatible.
I.e. we now have an immutable bytes type and a mutable bytes type.
The behavior of PyString was modified quite a bit, to make it more
bytes-like. Some changes are still on the to-do list.
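
As a rough illustration of the immutable/mutable split described in the message above (a sketch, not part of the commit): it assumes a released Python 3 interpreter, where the mutable type this commit calls "buffer" is spelled bytearray. The variable names are made up for the example.

```python
# Assumption: a released Python 3 interpreter, where the mutable type that
# this commit calls "buffer" is named bytearray.
immutable = b"abc"              # immutable bytes object
mutable = bytearray(b"abc")     # mutable counterpart

# Indexing either type yields an integer in range(256).
first = immutable[0]            # ord("a")

# Item assignment is rejected on bytes ...
try:
    immutable[0] = 120
    bytes_accepts_write = True
except TypeError:
    bytes_accepts_write = False

# ... and allowed on the mutable type.
mutable[0] = 120                # ord("x")

# bytes and str no longer mix implicitly.
try:
    b"abc" + "abc"
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

Both types share the non-mutating methods and the same indexing behavior; only the mutable one accepts in-place changes.
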
Diffstat (limited to 'Doc')
-rw-r--r--  Doc/library/array.rst        9
-rw-r--r--  Doc/library/exceptions.rst   6
-rw-r--r--  Doc/library/functions.rst   27
-rw-r--r--  Doc/library/stdtypes.rst     8
-rw-r--r--  Doc/library/warnings.rst     4
-rw-r--r--  Doc/whatsnew/3.0.rst        11
6 files changed, 43 insertions, 22 deletions
diff --git a/Doc/library/array.rst b/Doc/library/array.rst
index c2b7a44..4747b63 100644
--- a/Doc/library/array.rst
+++ b/Doc/library/array.rst
@@ -56,8 +56,9 @@ The module defines the following type:
 .. function:: array(typecode[, initializer])

    Return a new array whose items are restricted by *typecode*, and initialized
-   from the optional *initializer* value, which must be a list, string, or iterable
-   over elements of the appropriate type.
+   from the optional *initializer* value, which must be a list, object
+   supporting the buffer interface, or iterable over elements of the
+   appropriate type.

    If given a list or string, the initializer is passed to the new array's
    :meth:`fromlist`, :meth:`fromstring`, or :meth:`fromunicode` method (see below)
@@ -69,6 +70,10 @@ The module defines the following type:

    Obsolete alias for :func:`array`.

+.. data:: typecodes
+
+   A string with all available type codes.
+
 Array objects support the ordinary sequence operations of indexing, slicing,
 concatenation, and multiplication. When using slice assignment, the assigned
 value must be an array object with the same type code; in all other cases,
diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst
index 34fb429..9453b7a 100644
--- a/Doc/library/exceptions.rst
+++ b/Doc/library/exceptions.rst
@@ -405,7 +405,11 @@ module for more information.

    Base class for warnings related to Unicode.

-The class hierarchy for built-in exceptions is:
+.. exception:: BytesWarning
+
+   Base class for warnings related to :class:`bytes` and :class:`buffer`.

+The class hierarchy for built-in exceptions is:
+

 .. literalinclude:: ../../Lib/test/exception_hierarchy.txt
diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst
index 63f2c33..d554a08 100644
--- a/Doc/library/functions.rst
+++ b/Doc/library/functions.rst
@@ -118,18 +118,19 @@ available. They are listed here in alphabetical order.
    .. index:: pair: Boolean; type


-.. function:: bytes([arg[, encoding[, errors]]])
+.. function:: buffer([arg[, encoding[, errors]]])

-   Return a new array of bytes. The :class:`bytes` type is a mutable sequence
+   Return a new array of bytes. The :class:`buffer` type is an immutable sequence
    of integers in the range 0 <= x < 256. It has most of the usual methods of
-   mutable sequences, described in :ref:`typesseq-mutable`, as well as a few
-   methods borrowed from strings, described in :ref:`bytes-methods`.
+   mutable sequences, described in :ref:`typesseq-mutable`, as well as most methods
+   that the :class:`str` type has, see :ref:`bytes-methods`.

    The optional *arg* parameter can be used to initialize the array in a few
    different ways:

    * If it is a *string*, you must also give the *encoding* (and optionally,
-     *errors*) parameters; :func:`bytes` then acts like :meth:`str.encode`.
+     *errors*) parameters; :func:`buffer` then converts the Unicode string to
+     bytes using :meth:`str.encode`.

    * If it is an *integer*, the array will have that size and will be
      initialized with null bytes.
@@ -137,12 +138,24 @@ available. They are listed here in alphabetical order.
    * If it is an object conforming to the *buffer* interface, a read-only buffer
      of the object will be used to initialize the bytes array.

-   * If it is an *iterable*, it must be an iterable of integers in the range 0
-     <= x < 256, which are used as the initial contents of the array.
+   * If it is an *iterable*, it must be an iterable of integers in the range
+     ``0 <= x < 256``, which are used as the initial contents of the array.

    Without an argument, an array of size 0 is created.


+.. function:: bytes([arg[, encoding[, errors]]])
+
+   Return a new "bytes" object, which is an immutable sequence of integers in
+   the range ``0 <= x < 256``. :class:`bytes` is an immutable version of
+   :class:`buffer` -- it has the same non-mutating methods and the same indexing
+   and slicing behavior.
+
+   Accordingly, constructor arguments are interpreted as for :func:`buffer`.
+
+   Bytes objects can also be created with literals, see :ref:`strings`.
+
+
 .. function:: chr(i)

    Return the string of one character whose Unicode codepoint is the integer
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
index f557b1f..9073bca 100644
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -1313,9 +1313,11 @@ Bytes and Buffer Methods

 Bytes and buffer objects, being "strings of bytes", have all methods found on
 strings, with the exception of :func:`encode`, :func:`format` and
-:func:`isidentifier`, which do not make sense with these types. Wherever one of
-these methods needs to interpret the bytes as characters (e.g. the :func:`is...`
-methods), the ASCII character set is assumed.
+:func:`isidentifier`, which do not make sense with these types. For converting
+the objects to strings, they have a :func:`decode` method.
+
+Wherever one of these methods needs to interpret the bytes as characters
+(e.g. the :func:`is...` methods), the ASCII character set is assumed.

 .. note::

diff --git a/Doc/library/warnings.rst b/Doc/library/warnings.rst
index 684209f..9a10385 100644
--- a/Doc/library/warnings.rst
+++ b/Doc/library/warnings.rst
@@ -80,6 +80,10 @@ following warnings category classes are currently defined:
 | :exc:`UnicodeWarning`            | Base category for warnings related to         |
 |                                  | Unicode.                                      |
 +----------------------------------+-----------------------------------------------+
+| :exc:`BytesWarning`              | Base category for warnings related to         |
+|                                  | :class:`bytes` and :class:`buffer`.           |
++----------------------------------+-----------------------------------------------+
+

 While these are technically built-in exceptions, they are documented here,
 because conceptually they belong to the warnings mechanism.
diff --git a/Doc/whatsnew/3.0.rst b/Doc/whatsnew/3.0.rst
index afe842d..8d6babd 100644
--- a/Doc/whatsnew/3.0.rst
+++ b/Doc/whatsnew/3.0.rst
@@ -131,11 +131,6 @@ changes to rarely used features.)
   that if a file is opened using an incorrect mode or encoding, I/O
   will likely fail.

-* Bytes aren't hashable, and don't support certain operations like
-  ``b.lower()``, ``b.strip()`` or ``b.split()``.
-  For the latter two, use ``b.strip(b" \t\r\n\f")`` or
-  ``b.split(b" \t\r\n\f")``.
-
 * ``map()`` and ``filter()`` return iterators. A quick fix is e.g.
   ``list(map(...))``, but a better fix is often to use a list
   comprehension (especially when the original code uses ``lambda``).
@@ -158,13 +153,11 @@ Strings and Bytes
 * There is only one string type; its name is ``str`` but its behavior
   and implementation are more like ``unicode`` in 2.x.

-* PEP 358: There is a new type, ``bytes``, to represent binary data
+* PEP 3137: There is a new type, ``bytes``, to represent binary data
   (and encoded text, which is treated as binary data until you decide
   to decode it). The ``str`` and ``bytes`` types cannot be mixed; you
   must always explicitly convert between them, using the ``.encode()``
-  (str -> bytes) or ``.decode()`` (bytes -> str) methods. Comparing a
-  bytes and a str instance for equality raises a TypeError; this
-  catches common mistakes.
+  (str -> bytes) or ``.decode()`` (bytes -> str) methods.

 * PEP 3112: Bytes literals. E.g. b"abc".

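
The constructor forms and string-like methods documented in the diff above can be tried out roughly as follows. This is a sketch under one loud assumption: it targets a released Python 3, where the type documented here as "buffer" is named bytearray. The variable names are illustrative only.

```python
# Assumption: released Python 3, where "buffer" from this diff is bytearray.

# A str argument must come with an encoding; the text is then encoded.
from_text = bytearray("h\u00e9llo", "utf-8")

# Omitting the encoding for a str argument is an error.
try:
    bytearray("abc")
    needs_encoding = False
except TypeError:
    needs_encoding = True

# An integer argument gives that many null bytes.
zeros = bytearray(3)

# An iterable of integers in range(256) supplies the initial contents.
from_ints = bytearray([72, 105])

# bytes() interprets its arguments the same way but returns an immutable object.
frozen = bytes(from_ints)

# decode() converts back to str; str.encode() goes the other way.
text = frozen.decode("ascii")
round_trip = text.encode("ascii")

# String-like methods assume ASCII when they interpret bytes as characters.
upper = b"abc".upper()
non_ascii_is_alpha = b"\xe9".isalpha()
```

The last line shows the ASCII assumption in practice: the byte 0xE9 is a letter in Latin-1, yet it is not treated as alphabetic here.
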