path: root/Objects/dictnotes.txt
author    Raymond Hettinger <python@rcn.com>  2004-03-15 15:52:22 (GMT)
committer Raymond Hettinger <python@rcn.com>  2004-03-15 15:52:22 (GMT)
commit 9d5c44307ae85e130f1fc956a9b5b1b52c681ada (patch)
tree   fa09e4f110bae4144da96c5ea02bbc83a4453db3 /Objects/dictnotes.txt
parent cd1e8a948516a8a05d4e682d793a49d80bbdf8f8 (diff)
Fix typos and add some elaborations
Diffstat (limited to 'Objects/dictnotes.txt')
-rw-r--r--  Objects/dictnotes.txt | 13
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/Objects/dictnotes.txt b/Objects/dictnotes.txt
index 63b06e5..cb46cb1 100644
--- a/Objects/dictnotes.txt
+++ b/Objects/dictnotes.txt
@@ -94,7 +94,7 @@ Tunable Dictionary Parameters
* Growth rate upon hitting maximum load. Currently set to *2.
Raising this to *4 results in half the number of resizes,
less effort to resize, better sparseness for some (but not
- all dict sizes), and potentially double memory consumption
+ all dict sizes), and potentially doubles memory consumption
depending on the size of the dictionary. Setting to *4
eliminates every other resize step.
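The halving of resize steps mentioned in this hunk can be sketched with a toy model (pure Python, not CPython's actual implementation; the growth factor, initial size of 8, and 2/3 maximum load mirror the parameters the notes describe):

```python
# Toy model: count how many resizes an open-addressing table performs
# while growing to n used entries, under a given growth factor and a
# 2/3 maximum load ratio.  Not CPython's real code -- just the policy.

def count_resizes(n, growth=2, initial=8, max_load=2/3):
    """Return the number of resizes triggered while inserting n entries."""
    size = initial
    resizes = 0
    for used in range(1, n + 1):
        if used > size * max_load:        # load factor exceeded -> grow
            while used > size * max_load:
                size *= growth
            resizes += 1
    return resizes

# For 100,000 insertions, growth *2 triggers 15 resizes while growth *4
# triggers 8 -- roughly "half the number of resizes", as the note says.
```

The trade-off the note mentions (potentially doubled memory) shows up here too: the *4 table can end up much larger than the *2 table for the same number of entries.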
@@ -112,6 +112,8 @@ iteration and key listing. Those methods loop over every potential
entry. Doubling the size of dictionary results in twice as many
non-overlapping memory accesses for keys(), items(), values(),
__iter__(), iterkeys(), iteritems(), itervalues(), and update().
+Also, every dictionary iterates at least twice, once for the memset()
+when it is created and once by dealloc().
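The added sentence about memset() and dealloc() can be modeled with a small counting sketch (a toy, not CPython internals): creation and teardown each make one full pass over the slot array, and every keys()-style listing makes another, so even a dict that is never read is swept twice.

```python
# Toy model of full-table sweeps over a dict's slot array.  The memset
# at creation and the scan at deallocation each count as one sweep;
# listing methods scan every potential slot, not just the live entries.

class ToyTable:
    def __init__(self, size=8):
        self.slots = [None] * size    # models the memset at creation
        self.sweeps = 1               # that memset is the first full pass

    def keys(self):
        self.sweeps += 1              # listing scans all size slots
        return [s for s in self.slots if s is not None]

    def dealloc(self):
        self.sweeps += 1              # freeing scans the slots again
        self.slots = None
        return self.sweeps
```

An unused table still reports two sweeps after dealloc(); each call to keys(), items(), values(), or iteration adds one more, which is why doubling the table size doubles the cost of those methods.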
Results of Cache Locality Experiments
@@ -191,6 +193,8 @@ sizes and access patterns, the user may be able to provide useful hints.
is not at a premium, the user may benefit from setting the maximum load
ratio at 5% or 10% instead of the usual 66.7%. This will sharply
curtail the number of collisions but will increase iteration time.
+ The builtin namespace is a prime example of a dictionary that can
+ benefit from being highly sparse.
2) Dictionary creation time can be shortened in cases where the ultimate
size of the dictionary is known in advance. The dictionary can be
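The benefit of a low load ratio described in this hunk can be seen with a toy linear-probing table (CPython uses a different probe sequence and perturbed hashing; this sketch only illustrates why sparseness curtails collisions):

```python
# Toy open-addressing table with linear probing: insert the given keys
# and count how many occupied slots are probed past before each key
# finds a home.  A sparser table yields fewer collisions.

def probe_collisions(keys, table_size):
    table = [None] * table_size
    collisions = 0
    for k in keys:
        i = hash(k) % table_size
        while table[i] is not None:   # occupied slot: keep probing
            collisions += 1
            i = (i + 1) % table_size
        table[i] = k
    return collisions

# Keys 0, 32, 64 all land on slot 0 of a 32-slot table (small ints hash
# to themselves in CPython), costing 3 collisions; in a sparser 128-slot
# table the same keys spread out and collide zero times.
```

This is the effect the note describes for the builtin namespace: a heavily-read, rarely-resized dict can trade memory for a sharply lower collision count.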
@@ -199,7 +203,7 @@ sizes and access patterns, the user may be able to provide useful hints.
more quickly because the first half of the keys will be inserted into
a more sparse environment than before. The preconditions for this
strategy arise whenever a dictionary is created from a key or item
- sequence and the number of unique keys is known.
+ sequence and the number of *unique* keys is known.
3) If the key space is large and the access pattern is known to be random,
then search strategies exploiting cache locality can be fruitful.
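The presizing strategy in item 2 can be sketched in pure Python (CPython exposes no public presize API; this toy just models choosing the final table size up front when the number of *unique* keys is known):

```python
# Toy model of presizing: pick the smallest power-of-two table that
# holds n_unique keys below the maximum load ratio.  Building into this
# table performs zero resizes, and every insertion -- including the
# first half of the keys -- lands in an already-sparse table.

def presized_capacity(n_unique, max_load=2/3, initial=8):
    """Smallest table size (power-of-two growth) holding n_unique keys."""
    size = initial
    while n_unique > size * max_load:
        size *= 2
    return size

# 100,000 unique keys fit a 262,144-slot table chosen in one step,
# versus the 15 incremental resizes of growing from 8 by doubling.
```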
@@ -228,11 +232,12 @@ The dictionary can be immediately rebuilt (eliminating dummy entries),
resized (to an appropriate level of sparseness), and the keys can be
jostled (to minimize collisions). The lookdict() routine can then
eliminate the test for dummy entries (saving about 1/4 of the time
-spend in the collision resolution loop).
+spent in the collision resolution loop).
An additional possibility is to insert links into the empty spaces
so that dictionary iteration can proceed in len(d) steps instead of
-(mp->mask + 1) steps.
+(mp->mask + 1) steps. Alternatively, a separate tuple of keys can be
+kept just for iteration.
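The "separate tuple of keys kept just for iteration" alternative can be sketched as a hypothetical wrapper (not a CPython API; deletion is omitted to keep the sketch short -- a real version would also have to handle removed keys):

```python
# Sketch of keeping a side list of keys so iteration takes len(d) steps
# instead of scanning all (mp->mask + 1) potential slots of the table.

class IterableDict:
    def __init__(self):
        self._d = {}
        self._keys = []               # side list: one entry per live key

    def __setitem__(self, key, value):
        if key not in self._d:
            self._keys.append(key)    # record each key exactly once
        self._d[key] = value

    def __getitem__(self, key):
        return self._d[key]

    def __iter__(self):
        return iter(self._keys)       # len(d) steps, no empty-slot scans
```

The rebuilding idea from the same hunk has a one-line analogue at the Python level: copying with dict(d) produces a fresh table with no dummy entries.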
Caching Lookups