| author | Tim Peters <tim.peters@gmail.com> | 2001-05-27 07:39:22 (GMT) |
|---|---|---|
| committer | Tim Peters <tim.peters@gmail.com> | 2001-05-27 07:39:22 (GMT) |
| commit | 15d4929ae4ad5206e2f7fa947d632bfba9b8bfd1 (patch) | |
| tree | 97abc7e1d28bcb8da6d31d3578759c70b0e2bf41 /Objects/dictobject.c | |
| parent | dac238bd46e95d1736cd6bee11df850a508f433b (diff) | |
Implement an old idea of Christian Tismer's: use polynomial division
instead of multiplication to generate the probe sequence. The idea is
recorded in Python-Dev for Dec 2000, but that version is prone to rare
infinite loops.
The value is in getting *all* the bits of the hash code to participate;
and, e.g., this speeds up querying every key in a dict with keys
[i << 16 for i in range(20000)] by a factor of 500. Should be equally
valuable in any bad case where the high-order hash bits were getting
ignored.
Also wrote up some of the motivations behind Python's ever-more-subtle
hash table strategy.
Diffstat (limited to 'Objects/dictobject.c')
| -rw-r--r-- | Objects/dictobject.c | 90 |

1 file changed, 72 insertions(+), 18 deletions(-)
```diff
diff --git a/Objects/dictobject.c b/Objects/dictobject.c
index d5700c9..5944b6e 100644
--- a/Objects/dictobject.c
+++ b/Objects/dictobject.c
@@ -31,6 +31,58 @@
 k a second time.  Theory can be used to find such polys efficiently, but the
 operational defn. of "works" is sufficient to find them in reasonable time
 via brute force program (hint: any poly that has an even number of 1 bits
 cannot work; ditto any poly with low bit 0; exploit those).
+
+Some major subtleties:  Most hash schemes depend on having a "good" hash
+function, in the sense of simulating randomness.  Python doesn't:  some of
+its hash functions are trivial, such as hash(i) == i for ints i (excepting
+i == -1, because -1 is the "error occurred" return value from tp_hash).
+
+This isn't necessarily bad!  To the contrary, that our hash tables are powers
+of 2 in size, and that we take the low-order bits as the initial table index,
+means that there are no collisions at all for dicts indexed by a contiguous
+range of ints.  This is "better than random" behavior, and that's very
+desirable.
+
+On the other hand, when collisions occur, the tendency to fill contiguous
+slices of the hash table makes a good collision resolution strategy crucial;
+e.g., linear probing is right out.
+
+Reimer Behrends contributed the idea of using a polynomial-based approach,
+using repeated multiplication by x in GF(2**n) where a polynomial is chosen
+such that x is a primitive root.  This visits every table location exactly
+once, and the sequence of locations probed is highly non-linear.
+
+The same is also largely true of quadratic probing for power-of-2 tables, of
+the specific
+
+    (i + comb(1, 2)) mod size
+    (i + comb(2, 2)) mod size
+    (i + comb(3, 2)) mod size
+    (i + comb(4, 2)) mod size
+    ...
+    (i + comb(j, 2)) mod size
+
+flavor.  The polynomial approach "scrambles" the probe indices better, but
+more importantly allows to get *some* additional bits of the hash code into
+play via computing the initial increment, thus giving a weak form of double
+hashing.  Quadratic probing cannot be extended that way (the first probe
+offset must be 1, the second 3, the third 6, etc).
+
+Christian Tismer later contributed the idea of using polynomial division
+instead of multiplication.  The problem is that the multiplicative method
+can't get *all* the bits of the hash code into play without expensive
+computations that slow down the initial index and/or initial increment
+computation.  For a set of keys like [i << 16 for i in range(20000)], under
+the multiplicative method the initial index and increment were the same for
+all keys, so every key followed exactly the same probe sequence, and so
+this degenerated into a (very slow) linear search.  The division method uses
+all the bits of the hash code naturally in the increment, although it *may*
+visit locations more than once until such time as all the high bits of the
+increment have been shifted away.  It's also impossible to tell in advance
+whether incr is congruent to 0 modulo poly, so each iteration of the loop has
+to guard against incr becoming 0.  These are minor costs, as we usually don't
+get into the probe loop, and when we do we usually get out on its first
+iteration.
 */
 
 static long polys[] = {
@@ -204,7 +256,7 @@ static dictentry *
 lookdict(dictobject *mp, PyObject *key, register long hash)
 {
         register int i;
-        register unsigned incr;
+        register unsigned int incr;
         register dictentry *freeslot;
         register unsigned int mask = mp->ma_size-1;
         dictentry *ep0 = mp->ma_table;
@@ -244,13 +296,14 @@ lookdict(dictobject *mp, PyObject *key, register long hash)
         }
         /* Derive incr from hash, just to make it more arbitrary.  Note that
            incr must not be 0, or we will get into an infinite loop.*/
-        incr = (hash ^ ((unsigned long)hash >> 3)) & mask;
-        if (!incr)
-                incr = mask;
+        incr = hash ^ ((unsigned long)hash >> 3);
+        /* In the loop, me_key == dummy is by far (factor of 100s) the least
+           likely outcome, so test for that last. */
         for (;;) {
-                ep = &ep0[(i+incr)&mask];
+                if (!incr)
+                        incr = 1;       /* and incr will never be 0 again */
+                ep = &ep0[(i + incr) & mask];
                 if (ep->me_key == NULL) {
                         if (restore_error)
                                 PyErr_Restore(err_type, err_value, err_tb);
@@ -282,10 +335,10 @@ lookdict(dictobject *mp, PyObject *key, register long hash)
                 }
                 else if (ep->me_key == dummy && freeslot == NULL)
                         freeslot = ep;
-                /* Cycle through GF(2^n)-{0} */
-                incr <<= 1;
-                if (incr > mask)
-                        incr ^= mp->ma_poly;    /* clears the highest bit */
+                /* Cycle through GF(2**n). */
+                if (incr & 1)
+                        incr ^= mp->ma_poly;    /* clears the lowest bit */
+                incr >>= 1;
         }
 }
@@ -303,7 +356,7 @@ static dictentry *
 lookdict_string(dictobject *mp, PyObject *key, register long hash)
 {
         register int i;
-        register unsigned incr;
+        register unsigned int incr;
         register dictentry *freeslot;
         register unsigned int mask = mp->ma_size-1;
         dictentry *ep0 = mp->ma_table;
@@ -334,13 +387,14 @@ lookdict_string(dictobject *mp, PyObject *key, register long hash)
         }
         /* Derive incr from hash, just to make it more arbitrary.  Note that
            incr must not be 0, or we will get into an infinite loop.*/
-        incr = (hash ^ ((unsigned long)hash >> 3)) & mask;
-        if (!incr)
-                incr = mask;
+        incr = hash ^ ((unsigned long)hash >> 3);
+        /* In the loop, me_key == dummy is by far (factor of 100s) the least
+           likely outcome, so test for that last. */
         for (;;) {
-                ep = &ep0[(i+incr)&mask];
+                if (!incr)
+                        incr = 1;       /* and incr will never be 0 again */
+                ep = &ep0[(i + incr) & mask];
                 if (ep->me_key == NULL)
                         return freeslot == NULL ? ep : freeslot;
                 if (ep->me_key == key
@@ -350,10 +404,10 @@ lookdict_string(dictobject *mp, PyObject *key, register long hash)
                         return ep;
                 if (ep->me_key == dummy && freeslot == NULL)
                         freeslot = ep;
-                /* Cycle through GF(2^n)-{0} */
-                incr <<= 1;
-                if (incr > mask)
-                        incr ^= mp->ma_poly;    /* clears the highest bit */
+                /* Cycle through GF(2**n). */
+                if (incr & 1)
+                        incr ^= mp->ma_poly;    /* clears the lowest bit */
+                incr >>= 1;
         }
 }
```