| author | Terry Reedy <tjreedy@udel.edu> | 2010-11-11 23:22:19 (GMT) |
|---|---|---|
| committer | Terry Reedy <tjreedy@udel.edu> | 2010-11-11 23:22:19 (GMT) |
| commit | d2d2ae91c5ea2d226a6aae7afc8c8b05200c4eef (patch) | |
| tree | 3d0e6661b6a54c6257786c700b9ba36664642365 /Doc/library/difflib.rst | |
| parent | 6c2e0224ffb739a3db3984aa8beb68db3338b3f1 (diff) | |
#2986 Add autojunk parameter to SequenceMatcher to optionally disable 'popular == junk' heuristic.
Diffstat (limited to 'Doc/library/difflib.rst')
| -rw-r--r-- | Doc/library/difflib.rst | 15 |
1 files changed, 14 insertions, 1 deletions
diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst
index 8556e1d..4d19b40 100644
--- a/Doc/library/difflib.rst
+++ b/Doc/library/difflib.rst
@@ -37,6 +37,16 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module.
    complicated way on how many elements the sequences have in common; best case
    time is linear.
 
+   **Automatic junk heuristic:** :class:`SequenceMatcher` supports a heuristic that
+   automatically treats certain sequence items as junk. The heuristic counts how many
+   times each individual item appears in the sequence. If an item's duplicates (after
+   the first one) account for more than 1% of the sequence and the sequence is at least
+   200 items long, this item is marked as "popular" and is treated as junk for
+   the purpose of sequence matching. This heuristic can be turned off by setting
+   the ``autojunk`` argument to ``False`` when creating the :class:`SequenceMatcher`.
+
+   .. versionadded:: 2.7
+      The *autojunk* parameter.
 
 .. class:: Differ
 
@@ -334,7 +344,7 @@ SequenceMatcher Objects
 The :class:`SequenceMatcher` class has this constructor:
 
 
-.. class:: SequenceMatcher([isjunk[, a[, b]]])
+.. class:: SequenceMatcher([isjunk[, a[, b[, autojunk=True]]]])
 
    Optional argument *isjunk* must be ``None`` (the default) or a one-argument
    function that takes a sequence element and returns true if and only if the
@@ -350,6 +360,9 @@ The :class:`SequenceMatcher` class has this constructor:
    The optional arguments *a* and *b* are sequences to be compared; both default to
    empty strings. The elements of both sequences must be :term:`hashable`.
 
+   The optional argument *autojunk* can be used to disable the automatic junk
+   heuristic.
+
    :class:`SequenceMatcher` objects have the following methods:
 
 
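The effect of the documented heuristic can be seen in a small experiment. The sketch below is illustrative only and is not part of this commit. It assumes a Python version (3.2 or later) where `SequenceMatcher` exposes the `bpopular` attribute, and it assumes the heuristic is applied to the second sequence. The sequences `a` and `b` are made-up test data.

```python
from difflib import SequenceMatcher

# Hypothetical data: 150 unique items followed by one item repeated 50 times.
# The second sequence has 200 items, so the heuristic is eligible to run,
# and "x" is well above the roughly-1% popularity threshold.
b = [str(i) for i in range(150)] + ["x"] * 50
a = ["x"]

with_heuristic = SequenceMatcher(None, a, b)  # autojunk defaults to True
without_heuristic = SequenceMatcher(None, a, b, autojunk=False)

# bpopular (assumed available: Python 3.2+) holds the items that the
# heuristic set aside as "popular".
popular_on = with_heuristic.bpopular
popular_off = without_heuristic.bpopular

# ratio() reflects how much of the two sequences was matched.
ratio_on = with_heuristic.ratio()
ratio_off = without_heuristic.ratio()

print("popular with heuristic:   ", popular_on)
print("popular without heuristic:", popular_off)
print("ratio with heuristic:     ", ratio_on)
print("ratio without heuristic:  ", ratio_off)
```

With the heuristic on, the repeated item is dropped from the match index, so the lone `"x"` in `a` finds no partner. Passing `autojunk=False` restores the match at the cost of more work on sequences with many repeated items.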
