From 6c511e6d1cde2bc20669b9efe0aa32517884deb6 Mon Sep 17 00:00:00 2001
From: Tim Peters
Date: Thu, 8 Aug 2002 01:55:16 +0000
Subject: Added info about highwater heap-memory use for the sortperf.py tests; + a couple of minor edits elsewhere.

---
 Objects/listsort.txt | 34 +++++++++++++++++++++++++++++++---
 1 file changed, 31 insertions(+), 3 deletions(-)

diff --git a/Objects/listsort.txt b/Objects/listsort.txt
index 1137beb..d80f221 100644
--- a/Objects/listsort.txt
+++ b/Objects/listsort.txt
@@ -167,6 +167,34 @@ Comparison with Python's Samplesort Hybrid
   But timsort "should be" slower than samplesort on ~sort, so it's hard to
   count that it isn't on some boxes as a strike against it .

++ Here's the highwater mark for the number of heap-based temp slots (4
+  bytes each on this box) needed by each test, again with arguments
+  "15 20 1":
+
+
+   2**i  *sort  \sort  /sort  3sort  +sort  %sort  ~sort  =sort  !sort
+  32768  16384      0      0   6256      0  10821  12288      0  16383
+  65536  32766      0      0  21652      0  31276  24576      0  32767
+ 131072  65534      0      0  17258      0  58112  49152      0  65535
+ 262144 131072      0      0  35660      0 123561  98304      0 131071
+ 524288 262142      0      0  31302      0 212057 196608      0 262143
+1048576 524286      0      0 312438      0 484942 393216      0 524287
+
+  Discussion: The tests that end up doing (close to) perfectly balanced
+  merges (*sort, !sort) need all N//2 temp slots (or almost all). ~sort
+  also ends up doing balanced merges, but systematically benefits a lot from
+  the preliminary pre-merge searches described under "Merge Memory" later.
+  %sort approaches having a balanced merge at the end because the random
+  selection of elements to replace is expected to produce an out-of-order
+  element near the midpoint. \sort, /sort, =sort are the trivial one-run
+  cases, needing no merging at all. +sort ends up having one very long run
+  and one very short, and so gets all the temp space it needs from the small
+  temparray member of the MergeState struct (note that the same would be
+  true if the new random elements were prefixed to the sorted list instead,
+  but not if they appeared "in the middle"). 3sort approaches N//3 temp
+  slots twice, but the run lengths that remain after 3 random exchanges
+  clearly has very high variance.
+
 A detailed description of timsort follows.


@@ -460,13 +488,13 @@ Galloping with a Broken Leg
 ---------------------------
 So why don't we always gallop? Because it can lose, on two counts:

-1. While we're willing to endure small per-run overheads, per-comparison
+1. While we're willing to endure small per-merge overheads, per-comparison
    overheads are a different story. Calling Yet Another Function per
    comparison is expensive, and gallop_left() and gallop_right() are
    too long-winded for sane inlining.

-2. Ignoring function-call overhead, galloping can-- alas --require more
-   comparisons than linear one-at-time search, depending on the data.
+2. Galloping can-- alas --require more comparisons than linear one-at-time
+   search, depending on the data.

 #2 requires details. If A[0] belongs before B[0], galloping requires 1
 compare to determine that, same as linear search, except it costs more
-- 
cgit v0.12
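
Item 2 in the second hunk says galloping can need more comparisons than a
linear scan, depending on the data. A small comparison-counting sketch makes
that easy to check. This is an illustration in Python, not the gallop_left()
or gallop_right() code the text refers to: linear_count and gallop_count are
hypothetical helpers, and the probe sequence 0, 1, 3, 7, ... followed by a
binary search is an assumed simplification of the real routine.

```python
def linear_count(a, key):
    """Scan left to right; return (pos, compares), where pos is the
    number of elements in sorted list `a` that are < key."""
    compares = 0
    for i, x in enumerate(a):
        compares += 1
        if not (x < key):
            return i, compares
    return len(a), compares


def gallop_count(a, key):
    """Same result as linear_count, but probe indices 0, 1, 3, 7, ...
    (an assumed, simplified gallop) and then binary-search the gap."""
    n = len(a)
    compares = 0
    last = -1  # largest probed index whose element is known to be < key
    ofs = 0    # next index to probe
    while ofs < n:
        compares += 1
        if not (a[ofs] < key):
            break
        last = ofs
        ofs = 2 * ofs + 1
    # The answer now lies in [last + 1, min(ofs, n)].
    lo, hi = last + 1, min(ofs, n)
    while lo < hi:
        mid = (lo + hi) // 2
        compares += 1
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return lo, compares


if __name__ == "__main__":
    a = list(range(0, 200, 2))  # 100 sorted even numbers
    for key in (-1, 7, 199):
        print(key, linear_count(a, key), gallop_count(a, key))
```

With this input, a key that belongs before a[0] costs one compare either way,
a key that lands a few slots in costs the gallop more compares than the scan,
and a key that lands far along the run costs the gallop far fewer.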