summaryrefslogtreecommitdiffstats
path: root/Doc
diff options
context:
space:
mode:
authorRaymond Hettinger <rhettinger@users.noreply.github.com>2024-09-28 00:19:44 (GMT)
committerGitHub <noreply@github.com>2024-09-28 00:19:44 (GMT)
commit165ed68c26759b817388add52a7aa2d26755d451 (patch)
tree30e9abc9a9f14fdd21ce28a9727ab38b677c49a2 /Doc
parent02b49c51501f5eeef3ab5d74fb9eace1151a1359 (diff)
downloadcpython-165ed68c26759b817388add52a7aa2d26755d451.zip
cpython-165ed68c26759b817388add52a7aa2d26755d451.tar.gz
cpython-165ed68c26759b817388add52a7aa2d26755d451.tar.bz2
Sorting techniques edits (#124701)
Diffstat (limited to 'Doc')
-rw-r--r--Doc/howto/sorting.rst73
1 files changed, 70 insertions, 3 deletions
diff --git a/Doc/howto/sorting.rst b/Doc/howto/sorting.rst
index b98f91e..70c34cd 100644
--- a/Doc/howto/sorting.rst
+++ b/Doc/howto/sorting.rst
@@ -47,11 +47,14 @@ lists. In contrast, the :func:`sorted` function accepts any iterable.
Key Functions
=============
-Both :meth:`list.sort` and :func:`sorted` have a *key* parameter to specify a
-function (or other callable) to be called on each list element prior to making
+The :meth:`list.sort` method and the functions :func:`sorted`,
+:func:`min`, :func:`max`, :func:`heapq.nsmallest`, and
+:func:`heapq.nlargest` have a *key* parameter to specify a function (or
+other callable) to be called on each list element prior to making
comparisons.
-For example, here's a case-insensitive string comparison:
+For example, here's a case-insensitive string comparison using
+:meth:`str.casefold`:
.. doctest::
@@ -272,6 +275,70 @@ to make it usable as a key function::
sorted(words, key=cmp_to_key(strcoll)) # locale-aware sort order
+Strategies For Unorderable Types and Values
+===========================================
+
+A number of type and value issues can arise when sorting.
+Here are some strategies that can help:
+
+* Convert non-comparable input types to strings prior to sorting:
+
+.. doctest::
+
+ >>> data = ['twelve', '11', 10]
+ >>> sorted(map(str, data))
+ ['10', '11', 'twelve']
+
+This is needed because most cross-type comparisons raise a
+:exc:`TypeError`.
+
+* Remove special values prior to sorting:
+
+.. doctest::
+
+ >>> from math import isnan
+ >>> from itertools import filterfalse
+ >>> data = [3.3, float('nan'), 1.1, 2.2]
+ >>> sorted(filterfalse(isnan, data))
+ [1.1, 2.2, 3.3]
+
+This is needed because the `IEEE-754 standard
+<https://en.wikipedia.org/wiki/IEEE_754>`_ specifies that, "Every NaN
+shall compare unordered with everything, including itself."
+
+Likewise, ``None`` can be stripped from datasets as well:
+
+.. doctest::
+
+ >>> data = [3.3, None, 1.1, 2.2]
+ >>> sorted(x for x in data if x is not None)
+ [1.1, 2.2, 3.3]
+
+This is needed because ``None`` is not comparable to other types.
+
+* Convert mapping types into sorted item lists before sorting:
+
+.. doctest::
+
+ >>> data = [{'a': 1}, {'b': 2}]
+ >>> sorted(data, key=lambda d: sorted(d.items()))
+ [{'a': 1}, {'b': 2}]
+
+This is needed because dict-to-dict comparisons raise a
+:exc:`TypeError`.
+
+* Convert set types into sorted lists before sorting:
+
+.. doctest::
+
+ >>> data = [{'a', 'b', 'c'}, {'b', 'c', 'd'}]
+ >>> sorted(map(sorted, data))
+ [['a', 'b', 'c'], ['b', 'c', 'd']]
+
+This is needed because the elements contained in set types do not have a
+deterministic order. For example, ``list({'a', 'b'})`` may produce
+either ``['a', 'b']`` or ``['b', 'a']``.
+
Odds and Ends
=============