diff options
author | Raymond Hettinger <rhettinger@users.noreply.github.com> | 2024-02-20 03:22:07 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2024-02-20 03:22:07 (GMT) |
commit | 9f8a9e8ac74d588a2958dfa04085a0d78f9dcba9 (patch) | |
tree | e6d14eda8b30e772df2aa82f610483261581a352 | |
parent | c2cb31bbe1262213085c425bc853d6587c66cae9 (diff) | |
download | cpython-9f8a9e8ac74d588a2958dfa04085a0d78f9dcba9.zip cpython-9f8a9e8ac74d588a2958dfa04085a0d78f9dcba9.tar.gz cpython-9f8a9e8ac74d588a2958dfa04085a0d78f9dcba9.tar.bz2 |
Modernize the Sorting HowTo guide (gh-115479)
-rw-r--r-- | Doc/howto/sorting.rst | 60 |
1 files changed, 54 insertions, 6 deletions
diff --git a/Doc/howto/sorting.rst b/Doc/howto/sorting.rst index 38dd09f..fffef48 100644 --- a/Doc/howto/sorting.rst +++ b/Doc/howto/sorting.rst @@ -4,7 +4,6 @@ Sorting HOW TO ************** :Author: Andrew Dalke and Raymond Hettinger -:Release: 0.1 Python lists have a built-in :meth:`list.sort` method that modifies the list @@ -56,7 +55,7 @@ For example, here's a case-insensitive string comparison: .. doctest:: - >>> sorted("This is a test string from Andrew".split(), key=str.lower) + >>> sorted("This is a test string from Andrew".split(), key=str.casefold) ['a', 'Andrew', 'from', 'is', 'string', 'test', 'This'] The value of the *key* parameter should be a function (or other callable) that @@ -97,10 +96,14 @@ The same technique works for objects with named attributes. For example: >>> sorted(student_objects, key=lambda student: student.age) # sort by age [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] -Operator Module Functions -========================= +Objects with named attributes can be made by a regular class as shown +above, or they can be instances of :class:`~dataclasses.dataclass` or +a :term:`named tuple`. -The key-function patterns shown above are very common, so Python provides +Operator Module Functions and Partial Function Evaluation +========================================================= + +The :term:`key function` patterns shown above are very common, so Python provides convenience functions to make accessor functions easier and faster. The :mod:`operator` module has :func:`~operator.itemgetter`, :func:`~operator.attrgetter`, and a :func:`~operator.methodcaller` function. @@ -128,6 +131,24 @@ sort by *grade* then by *age*: >>> sorted(student_objects, key=attrgetter('grade', 'age')) [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)] +The :mod:`functools` module provides another helpful tool for making +key-functions. The :func:`~functools.partial` function can reduce the +`arity <https://en.wikipedia.org/wiki/Arity>`_ of a multi-argument +function making it suitable for use as a key-function. + +.. doctest:: + + >>> from functools import partial + >>> from unicodedata import normalize + + >>> names = 'Zoë Åbjørn Núñez Élana Zeke Abe Nubia Eloise'.split() + + >>> sorted(names, key=partial(normalize, 'NFD')) + ['Abe', 'Åbjørn', 'Eloise', 'Élana', 'Nubia', 'Núñez', 'Zeke', 'Zoë'] + + >>> sorted(names, key=partial(normalize, 'NFC')) + ['Abe', 'Eloise', 'Nubia', 'Núñez', 'Zeke', 'Zoë', 'Åbjørn', 'Élana'] + Ascending and Descending ======================== @@ -200,6 +221,8 @@ This idiom is called Decorate-Sort-Undecorate after its three steps: For example, to sort the student data by *grade* using the DSU approach: +.. doctest:: + >>> decorated = [(student.grade, i, student) for i, student in enumerate(student_objects)] >>> decorated.sort() >>> [student for grade, i, student in decorated] # undecorate @@ -282,7 +305,11 @@ Odds and Ends [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] However, note that ``<`` can fall back to using :meth:`~object.__gt__` if - :meth:`~object.__lt__` is not implemented (see :func:`object.__lt__`). + :meth:`~object.__lt__` is not implemented (see :func:`object.__lt__` + for details on the mechanics). To avoid surprises, :pep:`8` + recommends that all six comparison methods be implemented. + The :func:`~functools.total_ordering` decorator is provided to make that + task easier. * Key functions need not depend directly on the objects being sorted. A key function can also access external resources. For instance, if the student grades @@ -295,3 +322,24 @@ Odds and Ends >>> newgrades = {'john': 'F', 'jane':'A', 'dave': 'C'} >>> sorted(students, key=newgrades.__getitem__) ['jane', 'dave', 'john'] + +Partial Sorts +============= + +Some applications require only some of the data to be ordered. The standard +library provides several tools that do less work than a full sort: + +* :func:`min` and :func:`max` return the smallest and largest values, + respectively. These functions make a single pass over the input data and + require almost no auxiliary memory. + +* :func:`heapq.nsmallest` and :func:`heapq.nlargest` return + the *n* smallest and largest values, respectively. These functions + make a single pass over the data keeping only *n* elements in memory + at a time. For values of *n* that are small relative to the number of + inputs, these functions make far fewer comparisons than a full sort. + +* :func:`heapq.heappush` and :func:`heapq.heappop` create and maintain a + partially sorted arrangement of data that keeps the smallest element + at position ``0``. These functions are suitable for implementing + priority queues which are commonly used for task scheduling. |