author | Georg Brandl <georg@python.org> | 2007-11-02 20:06:17 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-11-02 20:06:17 (GMT) |
commit | 7c3e79f67f6191fb500144db1d484d8bb8619f5a | |
tree | 36ae7c99ee5a7d66b8145f9005237e5897aae2d3 | |
parent | 03fd077482fd5d88a40a8f0d04e0566b9738098a | |
Make "hashable" a glossary entry and clarify docs on __cmp__, __eq__ and __hash__.
I hope the concept of hashability is easier to understand now.
Thanks to Tim Hatch for pointing out the flaws here.
-rw-r--r-- | Doc/c-api/concrete.rst | 4 |
-rw-r--r-- | Doc/glossary.rst | 14 |
-rw-r--r-- | Doc/library/datetime.rst | 2 |
-rw-r--r-- | Doc/library/difflib.rst | 4 |
-rw-r--r-- | Doc/library/random.rst | 4 |
-rw-r--r-- | Doc/library/stdtypes.rst | 36 |
-rw-r--r-- | Doc/library/weakref.rst | 4 |
-rw-r--r-- | Doc/reference/datamodel.rst | 75 |
-rw-r--r-- | Doc/reference/expressions.rst | 2 |
9 files changed, 84 insertions, 61 deletions
diff --git a/Doc/c-api/concrete.rst b/Doc/c-api/concrete.rst index 209f3e6..a02332a 100644 --- a/Doc/c-api/concrete.rst +++ b/Doc/c-api/concrete.rst @@ -2231,8 +2231,8 @@ Dictionary Objects .. cfunction:: int PyDict_SetItem(PyObject *p, PyObject *key, PyObject *val) Insert *value* into the dictionary *p* with a key of *key*. *key* must be - hashable; if it isn't, :exc:`TypeError` will be raised. Return ``0`` on success - or ``-1`` on failure. + :term:`hashable`; if it isn't, :exc:`TypeError` will be raised. Return ``0`` + on success or ``-1`` on failure. .. cfunction:: int PyDict_SetItemString(PyObject *p, const char *key, PyObject *val) diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 65a47f1..03484de 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -153,6 +153,20 @@ Glossary in the past to create a "free-threaded" interpreter (one which locks shared data at a much finer granularity), but performance suffered in the common single-processor case. + + hashable + An object is *hashable* if it has a hash value that never changes during + its lifetime (it needs a :meth:`__hash__` method), and can be compared to + other objects (it needs an :meth:`__eq__` or :meth:`__cmp__` method). + Hashable objects that compare equal must have the same hash value. + + Hashability makes an object usable as a dictionary key and a set member, + because these data structures use the hash value internally. + + All of Python's immutable built-in objects are hashable, while all mutable + containers (such as lists or dictionaries) are not. Objects that are + instances of user-defined classes are hashable by default; they all + compare unequal, and their hash value is their :func:`id`. IDLE An Integrated Development Environment for Python. 
IDLE is a basic editor diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst index 24d4f69..9442d29 100644 --- a/Doc/library/datetime.rst +++ b/Doc/library/datetime.rst @@ -262,7 +262,7 @@ compared to an object of a different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. -:class:`timedelta` objects are hashable (usable as dictionary keys), support +:class:`timedelta` objects are :term:`hashable` (usable as dictionary keys), support efficient pickling, and in Boolean contexts, a :class:`timedelta` object is considered to be true if and only if it isn't equal to ``timedelta(0)``. diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst index 4da3be9..baea5d4 100644 --- a/Doc/library/difflib.rst +++ b/Doc/library/difflib.rst @@ -20,7 +20,7 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module. .. class:: SequenceMatcher This is a flexible class for comparing pairs of sequences of any type, so long - as the sequence elements are hashable. The basic algorithm predates, and is a + as the sequence elements are :term:`hashable`. The basic algorithm predates, and is a little fancier than, an algorithm published in the late 1980's by Ratcliff and Obershelp under the hyperbolic name "gestalt pattern matching." The idea is to find the longest contiguous matching subsequence that contains no "junk" @@ -313,7 +313,7 @@ The :class:`SequenceMatcher` class has this constructor: on blanks or hard tabs. The optional arguments *a* and *b* are sequences to be compared; both default to - empty strings. The elements of both sequences must be hashable. + empty strings. The elements of both sequences must be :term:`hashable`. 
:class:`SequenceMatcher` objects have the following methods: diff --git a/Doc/library/random.rst b/Doc/library/random.rst index 02adf7a..e19d07e 100644 --- a/Doc/library/random.rst +++ b/Doc/library/random.rst @@ -60,7 +60,7 @@ Bookkeeping functions: .. function:: seed([x]) Initialize the basic random number generator. Optional argument *x* can be any - hashable object. If *x* is omitted or ``None``, current system time is used; + :term:`hashable` object. If *x* is omitted or ``None``, current system time is used; current system time is also used to initialize the generator when the module is first imported. If randomness sources are provided by the operating system, they are used instead of the system time (see the :func:`os.urandom` function @@ -165,7 +165,7 @@ Functions for sequences: (the sample) to be partitioned into grand prize and second place winners (the subslices). - Members of the population need not be hashable or unique. If the population + Members of the population need not be :term:`hashable` or unique. If the population contains repeats, then each occurrence is a possible selection in the sample. To choose a sample from a range of integers, use an :func:`xrange` object as an diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 046d494..a74758b 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -1419,7 +1419,7 @@ Set Types --- :class:`set`, :class:`frozenset` .. index:: object: set -A :dfn:`set` object is an unordered collection of distinct hashable objects. +A :dfn:`set` object is an unordered collection of distinct :term:`hashable` objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. @@ -1438,7 +1438,7 @@ There are currently two builtin set types, :class:`set` and :class:`frozenset`. 
The :class:`set` type is mutable --- the contents can be changed using methods like :meth:`add` and :meth:`remove`. Since it is mutable, it has no hash value and cannot be used as either a dictionary key or as an element of another set. -The :class:`frozenset` type is immutable and hashable --- its contents cannot be +The :class:`frozenset` type is immutable and :term:`hashable` --- its contents cannot be altered after it is created; it can therefore be used as a dictionary key or as an element of another set. @@ -1538,8 +1538,7 @@ or ``a>b``. Accordingly, sets do not implement the :meth:`__cmp__` method. Since sets only define partial ordering (subset relationships), the output of the :meth:`list.sort` method is undefined for lists of sets. -Set elements are like dictionary keys; they need to define both :meth:`__hash__` -and :meth:`__eq__` methods. +Set elements, like dictionary keys, must be :term:`hashable`. Binary operations that mix :class:`set` instances with :class:`frozenset` return the type of the first operand. For example: ``frozenset('ab') | set('bc')`` @@ -1619,21 +1618,20 @@ Mapping Types --- :class:`dict` statement: del builtin: len -A :dfn:`mapping` object maps immutable values to arbitrary objects. Mappings -are mutable objects. There is currently only one standard mapping type, the -:dfn:`dictionary`. -(For other containers see the built in :class:`list`, -:class:`set`, and :class:`tuple` classes, and the :mod:`collections` -module.) - -A dictionary's keys are *almost* arbitrary values. Only -values containing lists, dictionaries or other mutable types (that are compared -by value rather than by object identity) may not be used as keys. Numeric types -used for keys obey the normal rules for numeric comparison: if two numbers -compare equal (such as ``1`` and ``1.0``) then they can be used interchangeably -to index the same dictionary entry. 
(Note however, that since computers -store floating-point numbers as approximations it is usually unwise to -use them as dictionary keys.) +A :dfn:`mapping` object maps :term:`hashable` values to arbitrary objects. +Mappings are mutable objects. There is currently only one standard mapping +type, the :dfn:`dictionary`. (For other containers see the built in +:class:`list`, :class:`set`, and :class:`tuple` classes, and the +:mod:`collections` module.) + +A dictionary's keys are *almost* arbitrary values. Values that are not +:term:`hashable`, that is, values containing lists, dictionaries or other +mutable types (that are compared by value rather than by object identity) may +not be used as keys. Numeric types used for keys obey the normal rules for +numeric comparison: if two numbers compare equal (such as ``1`` and ``1.0``) +then they can be used interchangeably to index the same dictionary entry. (Note +however, that since computers store floating-point numbers as approximations it +is usually unwise to use them as dictionary keys.) Dictionaries can be created by placing a comma-separated list of ``key: value`` pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: diff --git a/Doc/library/weakref.rst b/Doc/library/weakref.rst index 21007d9..225991a 100644 --- a/Doc/library/weakref.rst +++ b/Doc/library/weakref.rst @@ -87,7 +87,7 @@ Extension types can easily be made to support weak references; see but cannot be propagated; they are handled in exactly the same way as exceptions raised from an object's :meth:`__del__` method. - Weak references are hashable if the *object* is hashable. They will maintain + Weak references are :term:`hashable` if the *object* is hashable. They will maintain their hash value even after the *object* was deleted. If :func:`hash` is called the first time only after the *object* was deleted, the call will raise :exc:`TypeError`. 
@@ -108,7 +108,7 @@ Extension types can easily be made to support weak references; see the proxy in most contexts instead of requiring the explicit dereferencing used with weak reference objects. The returned object will have a type of either ``ProxyType`` or ``CallableProxyType``, depending on whether *object* is - callable. Proxy objects are not hashable regardless of the referent; this + callable. Proxy objects are not :term:`hashable` regardless of the referent; this avoids a number of problems related to their fundamentally mutable nature, and prevent their use as dictionary keys. *callback* is the same as the parameter of the same name to the :func:`ref` function. diff --git a/Doc/reference/datamodel.rst b/Doc/reference/datamodel.rst index 078293c..dc2fbd8 100644 --- a/Doc/reference/datamodel.rst +++ b/Doc/reference/datamodel.rst @@ -409,9 +409,10 @@ Set types Frozen sets .. index:: object: frozenset - These represent an immutable set. They are created by the built-in - :func:`frozenset` constructor. As a frozenset is immutable and hashable, it can - be used again as an element of another set, or as a dictionary key. + These represent an immutable set. They are created by the built-in + :func:`frozenset` constructor. As a frozenset is immutable and + :term:`hashable`, it can be used again as an element of another set, or as + a dictionary key. .. % Set types @@ -1315,6 +1316,9 @@ Basic customization .. versionadded:: 2.1 + .. index:: + single: comparisons + These are the so-called "rich comparison" methods, and are called for comparison operators in preference to :meth:`__cmp__` below. The correspondence between operator symbols and method names is as follows: ``x<y`` calls ``x.__lt__(y)``, @@ -1329,14 +1333,16 @@ Basic customization context (e.g., in the condition of an ``if`` statement), Python will call :func:`bool` on the value to determine if the result is true or false. - There are no implied relationships among the comparison operators. 
The truth of - ``x==y`` does not imply that ``x!=y`` is false. Accordingly, when defining - :meth:`__eq__`, one should also define :meth:`__ne__` so that the operators will - behave as expected. + There are no implied relationships among the comparison operators. The truth + of ``x==y`` does not imply that ``x!=y`` is false. Accordingly, when + defining :meth:`__eq__`, one should also define :meth:`__ne__` so that the + operators will behave as expected. See the paragraph on :meth:`__hash__` for + some important notes on creating :term:`hashable` objects which support + custom comparison operations and are usable as dictionary keys. - There are no reflected (swapped-argument) versions of these methods (to be used - when the left argument does not support the operation but the right argument - does); rather, :meth:`__lt__` and :meth:`__gt__` are each other's reflection, + There are no swapped-argument versions of these methods (to be used when the + left argument does not support the operation but the right argument does); + rather, :meth:`__lt__` and :meth:`__gt__` are each other's reflection, :meth:`__le__` and :meth:`__ge__` are each other's reflection, and :meth:`__eq__` and :meth:`__ne__` are their own reflection. @@ -1349,14 +1355,15 @@ Basic customization builtin: cmp single: comparisons - Called by comparison operations if rich comparison (see above) is not defined. - Should return a negative integer if ``self < other``, zero if ``self == other``, - a positive integer if ``self > other``. If no :meth:`__cmp__`, :meth:`__eq__` - or :meth:`__ne__` operation is defined, class instances are compared by object - identity ("address"). See also the description of :meth:`__hash__` for some - important notes on creating objects which support custom comparison operations - and are usable as dictionary keys. (Note: the restriction that exceptions are - not propagated by :meth:`__cmp__` has been removed since Python 1.5.) 
+ Called by comparison operations if rich comparison (see above) is not + defined. Should return a negative integer if ``self < other``, zero if + ``self == other``, a positive integer if ``self > other``. If no + :meth:`__cmp__`, :meth:`__eq__` or :meth:`__ne__` operation is defined, class + instances are compared by object identity ("address"). See also the + description of :meth:`__hash__` for some important notes on creating + :term:`hashable` objects which support custom comparison operations and are + usable as dictionary keys. (Note: the restriction that exceptions are not + propagated by :meth:`__cmp__` has been removed since Python 1.5.) .. method:: object.__rcmp__(self, other) @@ -1371,25 +1378,29 @@ Basic customization object: dictionary builtin: hash - Called for the key object for dictionary operations, and by the built-in - function :func:`hash`. Should return a 32-bit integer usable as a hash value + Called for the key object for dictionary operations, and by the built-in + function :func:`hash`. Should return an integer usable as a hash value for dictionary operations. The only required property is that objects which compare equal have the same hash value; it is advised to somehow mix together (e.g., using exclusive or) the hash values for the components of the object that - also play a part in comparison of objects. If a class does not define a - :meth:`__cmp__` method it should not define a :meth:`__hash__` operation either; - if it defines :meth:`__cmp__` or :meth:`__eq__` but not :meth:`__hash__`, its - instances will not be usable as dictionary keys. If a class defines mutable - objects and implements a :meth:`__cmp__` or :meth:`__eq__` method, it should not - implement :meth:`__hash__`, since the dictionary implementation requires that a - key's hash value is immutable (if the object's hash value changes, it will be in - the wrong hash bucket). + also play a part in comparison of objects. - .. 
versionchanged:: 2.5 - :meth:`__hash__` may now also return a long integer object; the 32-bit integer - is then derived from the hash of that object. + If a class does not define a :meth:`__cmp__` or :meth:`__eq__` method it + should not define a :meth:`__hash__` operation either; if it defines + :meth:`__cmp__` or :meth:`__eq__` but not :meth:`__hash__`, its instances + will not be usable as dictionary keys. If a class defines mutable objects + and implements a :meth:`__cmp__` or :meth:`__eq__` method, it should not + implement :meth:`__hash__`, since the dictionary implementation requires that + a key's hash value is immutable (if the object's hash value changes, it will + be in the wrong hash bucket). + + User-defined classes have :meth:`__cmp__` and :meth:`__hash__` methods + by default; with them, all objects compare unequal and ``x.__hash__()`` + returns ``id(x)``. - .. index:: single: __cmp__() (object method) + .. versionchanged:: 2.5 + :meth:`__hash__` may now also return a long integer object; the 32-bit + integer is then derived from the hash of that object. .. method:: object.__nonzero__(self) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 488f090..706d0f1 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -276,7 +276,7 @@ the corresponding datum. .. index:: pair: immutable; object Restrictions on the types of the key values are listed earlier in section -:ref:`types`. (To summarize, the key type should be hashable, which excludes +:ref:`types`. (To summarize, the key type should be :term:`hashable`, which excludes all mutable objects.) Clashes between duplicate keys are not detected; the last datum (textually rightmost in the display) stored for a given key value prevails. |
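The hashability rules described in the patched glossary and data model text can be illustrated with a short sketch. It is not part of the commit and assumes a current Python 3 interpreter, where :meth:`__cmp__` no longer exists and a class that defines :meth:`__eq__` without :meth:`__hash__` becomes unhashable. The names ``Point``, ``EqOnly`` and ``is_hashable`` are hypothetical and chosen only for illustration. The hash is computed from a tuple of the compared components instead of the exclusive-or mixing the text suggests; either approach satisfies the one required property, that objects which compare equal have the same hash value.

```python
class Point:
    """Hypothetical value type whose fields are not changed after creation."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __ne__(self, other):
        # Defined explicitly, as the patched text advises for Python 2.x;
        # Python 3 would derive this from __eq__ automatically.
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Hash exactly the components that take part in the comparison.
        return hash((self._x, self._y))


class EqOnly:
    """Hypothetical class that defines __eq__ but no __hash__."""

    def __eq__(self, other):
        return isinstance(other, EqOnly)


def is_hashable(obj):
    """Return True if hash(obj) succeeds (illustrative helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Equal objects share a hash, so either one finds the same dictionary entry.
lookup = {Point(1, 2): "a"}
found = lookup[Point(1, 2)]

# Numbers that compare equal index the same entry.
numeric = {1: "one"}
same_entry = numeric[1.0]

# Immutable built-ins are hashable; mutable containers are not.
checks = {
    "tuple": is_hashable((1, 2)),
    "frozenset": is_hashable(frozenset("ab")),
    "list": is_hashable([1, 2]),
    "set": is_hashable(set("ab")),
    "dict": is_hashable({}),
    "eq_only": is_hashable(EqOnly()),
}
```

In this sketch ``lookup[Point(1, 2)]`` succeeds only because :meth:`__eq__` and :meth:`__hash__` are based on the same fields; changing ``_x`` or ``_y`` after the object has been stored as a key would leave it in the wrong hash bucket, which is the situation the patched :meth:`__hash__` paragraph warns about.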