Add bytes/remove unicode from the data model.

author: Georg Brandl <georg@python.org> 2007-08-31 16:41:12 (GMT)
committer: Georg Brandl <georg@python.org> 2007-08-31 16:41:12 (GMT)
commit: dcc56f8bf668965ef5f3cfe2fd6a07b7b84b08a8 (patch)
tree: 5e0dcd6178f9f1fb626f406cb580fbe3e47efd7c /Doc
parent: 85eb8c103c9e460917911b43c6be302c30d75efb (diff)
download: cpython-dcc56f8bf668965ef5f3cfe2fd6a07b7b84b08a8.zip
cpython-dcc56f8bf668965ef5f3cfe2fd6a07b7b84b08a8.tar.gz
cpython-dcc56f8bf668965ef5f3cfe2fd6a07b7b84b08a8.tar.bz2
1 files changed, 36 insertions, 64 deletions
diff --git a/Doc/reference/datamodel.rst b/Doc/reference/datamodel.rst
index c2d3aee..6f6f42c 100644
--- a/Doc/reference/datamodel.rst
+++ b/Doc/reference/datamodel.rst
@@ -289,52 +289,21 @@ Sequences
          .. index::
             builtin: chr
             builtin: ord
-            object: string
-            single: character
-            single: byte
-            single: ASCII@ASCII
-
-         The items of a string are characters.  There is no separate character type; a
-         character is represented by a string of one item. Characters represent (at
-         least) 8-bit bytes.  The built-in functions :func:`chr` and :func:`ord` convert
-         between characters and nonnegative integers representing the byte values.  Bytes
-         with the values 0-127 usually represent the corresponding ASCII values, but the
-         interpretation of values is up to the program.  The string data type is also
-         used to represent arrays of bytes, e.g., to hold data read from a file.
-
-         .. index::
-            single: ASCII@ASCII
-            single: EBCDIC
-            single: character set
-            pair: string; comparison
-            builtin: chr
-            builtin: ord
-
-         (On systems whose native character set is not ASCII, strings may use EBCDIC in
-         their internal representation, provided the functions :func:`chr` and
-         :func:`ord` implement a mapping between ASCII and EBCDIC, and string comparison
-         preserves the ASCII order. Or perhaps someone can propose a better rule?)
-
-      Unicode
-         .. index::
-            builtin: unichr
-            builtin: ord
-            builtin: unicode
-            object: unicode
+            builtin: str
             single: character
             single: integer
             single: Unicode
 
-         The items of a Unicode object are Unicode code units.  A Unicode code unit is
-         represented by a Unicode object of one item and can hold either a 16-bit or
-         32-bit value representing a Unicode ordinal (the maximum value for the ordinal
-         is given in ``sys.maxunicode``, and depends on how Python is configured at
-         compile time).  Surrogate pairs may be present in the Unicode object, and will
-         be reported as two separate items.  The built-in functions :func:`unichr` and
-         :func:`ord` convert between code units and nonnegative integers representing the
-         Unicode ordinals as defined in the Unicode Standard 3.0. Conversion from and to
-         other encodings are possible through the Unicode method :meth:`encode` and the
-         built-in function :func:`unicode`.
+         The items of a string object are Unicode code units.  A Unicode code
+         unit is represented by a string object of one item and can hold either
+         a 16-bit or 32-bit value representing a Unicode ordinal (the maximum
+         value for the ordinal is given in ``sys.maxunicode``, and depends on
+         how Python is configured at compile time).  Surrogate pairs may be
+         present in the string object, and will be reported as two separate
+         items.  The built-in functions :func:`chr` and :func:`ord` convert
+         between code units and nonnegative integers representing the Unicode
+         ordinals as defined in the Unicode Standard 3.0.  Conversion to and from
+         other encodings is possible through the string method :meth:`encode`.
 
       Tuples
          .. index::
@@ -342,11 +311,12 @@ Sequences
             pair: singleton; tuple
             pair: empty; tuple
 
-         The items of a tuple are arbitrary Python objects. Tuples of two or more items
-         are formed by comma-separated lists of expressions.  A tuple of one item (a
-         'singleton') can be formed by affixing a comma to an expression (an expression
-         by itself does not create a tuple, since parentheses must be usable for grouping
-         of expressions).  An empty tuple can be formed by an empty pair of parentheses.
+         The items of a tuple are arbitrary Python objects. Tuples of two or
+         more items are formed by comma-separated lists of expressions.  A tuple
+         of one item (a 'singleton') can be formed by affixing a comma to an
+         expression (an expression by itself does not create a tuple, since
+         parentheses must be usable for grouping of expressions).  An empty
+         tuple can be formed by an empty pair of parentheses.
 
       .. % Immutable sequences
 
@@ -369,14 +339,23 @@ Sequences
       Lists
          .. index:: object: list
 
-         The items of a list are arbitrary Python objects.  Lists are formed by placing a
-         comma-separated list of expressions in square brackets. (Note that there are no
-         special cases needed to form lists of length 0 or 1.)
+         The items of a list are arbitrary Python objects.  Lists are formed by
+         placing a comma-separated list of expressions in square brackets. (Note
+         that there are no special cases needed to form lists of length 0 or 1.)
+
+      Bytes
+         .. index:: bytes, byte
+
+         A bytes object is a mutable array.  The items are 8-bit bytes,
+         represented by integers in the range 0 <= x < 256.  Bytes literals
+         (like ``b'abc'``) and the built-in function :func:`bytes` can be used to
+         construct bytes objects.  Also, bytes objects can be decoded to strings
+         via the :meth:`decode` method.
 
       .. index:: module: array
 
-      The extension module :mod:`array` provides an additional example of a mutable
-      sequence type.
+      The extension module :mod:`array` provides an additional example of a
+      mutable sequence type.
 
       .. % Mutable sequences
 
@@ -1230,12 +1209,14 @@ Basic customization
       builtin: str
       builtin: print
 
-   Called by the :func:`str` built-in function and by the :func:`print`
-   function to compute the "informal" string representation of an object.  This
-   differs from :meth:`__repr__` in that it does not have to be a valid Python
+   Called by the :func:`str` built-in function and by the :func:`print` function
+   to compute the "informal" string representation of an object.  This differs
+   from :meth:`__repr__` in that it does not have to be a valid Python
    expression: a more convenient or concise representation may be used instead.
    The return value must be a string object.
 
+   .. XXX what about subclasses of string?
+
 
 .. method:: object.__format__(self, format_spec)
 
@@ -1355,15 +1336,6 @@ Basic customization
    :meth:`__bool__`, all its instances are considered true.
 
 
-.. method:: object.__unicode__(self)
-
-   .. index:: builtin: unicode
-
-   Called to implement :func:`unicode` builtin; should return a Unicode object.
-   When this method is not defined, string conversion is attempted, and the result
-   of string conversion is converted to Unicode using the system default encoding.
-
-
 .. _attribute-access:
 
 Customizing attribute access
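
Illustration (not part of the patch): the behaviour described in the added text can be sketched as follows. This is a minimal example that assumes a current Python 3 interpreter rather than the 2007 py3k branch this commit targets. On current releases the mutable byte array described above is spelled ``bytearray`` and ``bytes`` itself is immutable, so the sketch uses ``bytearray`` for the mutation step. The ``Point`` class is hypothetical and exists only to contrast ``__str__`` with ``__repr__``.

```python
# Sketch only: assumes a current Python 3 interpreter, not the 2007 branch.

# An item of a string is itself a string of length one; chr() and ord()
# convert between such items and their integer ordinals.
s = "héllo"
first = s[0]
assert isinstance(first, str) and len(first) == 1
assert chr(ord("é")) == "é"

# str.encode() produces a bytes object; decode() turns it back into a string.
encoded = s.encode("utf-8")
decoded = encoded.decode("utf-8")

# The items of a bytes object are integers in the range 0 <= x < 256.
data = b"abc"
items = list(data)

# On current releases, bytearray is the mutable sequence of bytes.
buf = bytearray(b"abc")
buf[0] = ord("x")


class Point:
    """Hypothetical class illustrating __str__ versus __repr__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # "Formal" representation: looks like a valid Python expression.
        return f"Point({self.x!r}, {self.y!r})"

    def __str__(self):
        # "Informal" representation: need not be a valid expression,
        # but must be a string object.
        return f"({self.x}, {self.y})"


p = Point(1, 2)
informal = str(p)   # used by str() and print()
formal = repr(p)
```

Calling ``print(p)`` goes through ``__str__``, while the interactive prompt and ``repr(p)`` use ``__repr__``.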