GH-116554: Relax list.sort()'s notion of "descending" runs (#116578)

* GH-116554: Relax list.sort()'s notion of "descending" run Rewrote `count_run()` so that sub-runs of equal elements no longer end a descending run. Both ascending and descending runs can have arbitrarily many sub-runs of arbitrarily many equal elements now. This is tricky, because we only use ``<`` comparisons, so checking for equality doesn't come "for free". Surprisingly, it turned out there's a very cheap (one comparison) way to determine whether an ascending run consisted of all-equal elements. That sealed the deal. In addition, after a descending run is reversed in-place, we now go on to see whether it can be extended by an ascending run that just happens to be adjacent. This succeeds in finding at least one additional element to append about half the time, and so appears to more than repay its cost (the savings come from getting to skip a binary search, when a short run is artificially forced to length MIINRUN later, for each new element `count_run()` can add to the initial run). While these have been in the back of my mind for years, a question on StackOverflow pushed it to action: https://stackoverflow.com/questions/78108792/ They were wondering why it took about 4x longer to sort a list like: [999_999, 999_999, ..., 2, 2, 1, 1, 0, 0] than "similar" lists. Of course that runs very much faster after this patch. Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Pieter Eendebak <pieter.eendebak@gmail.com>
author: Tim Peters <tim.peters@gmail.com> 2024-03-13 00:59:42 (GMT)
committer: GitHub <noreply@github.com> 2024-03-13 00:59:42 (GMT)
commit: bf121d6a694bea4fe9864a19879fe0c70c4e0656 (patch)
tree: 24954c37d350938872453294dbed2755eddf1e4b /Lib
parent: 7d1abe9502641a3602e9773aebc29ee56d8f40ae (diff)
download: cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.zip
cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.tar.gz
cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.tar.bz2
1 files changed, 21 insertions, 0 deletions
diff --git a/Lib/test/test_sort.py b/Lib/test/test_sort.py
index 3b6ad4d..2a7cfb7 100644
--- a/Lib/test/test_sort.py
+++ b/Lib/test/test_sort.py
@@ -128,6 +128,27 @@ class TestBase(unittest.TestCase):
             x = [e for e, i in augmented] # a stable sort of s
             check("stability", x, s)
 
+    def test_small_stability(self):
+        from itertools import product
+        from operator import itemgetter
+
+        # Exhaustively test stability across all lists of small lengths
+        # and only a few distinct elements.
+        # This can provoke edge cases that randomization is unlikely to find.
+        # But it can grow very expensive quickly, so don't overdo it.
+        NELTS = 3
+        MAXSIZE = 9
+
+        pick0 = itemgetter(0)
+        for length in range(MAXSIZE + 1):
+            # There are NELTS ** length distinct lists.
+            for t in product(range(NELTS), repeat=length):
+                xs = list(zip(t, range(length)))
+                # Stability forced by index in each element.
+                forced = sorted(xs)
+                # Use key= to hide the index from compares.
+                native = sorted(xs, key=pick0)
+                self.assertEqual(forced, native)
 #==============================================================================
 
 class TestBugs(unittest.TestCase):
author	Tim Peters <tim.peters@gmail.com>	2024-03-13 00:59:42 (GMT)
committer	GitHub <noreply@github.com>	2024-03-13 00:59:42 (GMT)
commit	bf121d6a694bea4fe9864a19879fe0c70c4e0656 (patch)
tree	24954c37d350938872453294dbed2755eddf1e4b /Lib
parent	7d1abe9502641a3602e9773aebc29ee56d8f40ae (diff)
download	cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.zip cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.tar.gz cpython-bf121d6a694bea4fe9864a19879fe0c70c4e0656.tar.bz2