Diffstat (limited to 'Doc/library/difflib.rst')
-rw-r--r-- Doc/library/difflib.rst 235
1 files changed, 120 insertions, 115 deletions
diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst
index 585fa74..01c9c85 100644
--- a/Doc/library/difflib.rst
+++ b/Doc/library/difflib.rst
@@ -343,166 +343,171 @@ The :class:`SequenceMatcher` class has this constructor:
The optional arguments *a* and *b* are sequences to be compared; both default to
empty strings. The elements of both sequences must be :term:`hashable`.
-:class:`SequenceMatcher` objects have the following methods:
+ :class:`SequenceMatcher` objects have the following methods:
-.. method:: SequenceMatcher.set_seqs(a, b)
+ .. method:: set_seqs(a, b)
- Set the two sequences to be compared.
+ Set the two sequences to be compared.
-:class:`SequenceMatcher` computes and caches detailed information about the
-second sequence, so if you want to compare one sequence against many sequences,
-use :meth:`set_seq2` to set the commonly used sequence once and call
-:meth:`set_seq1` repeatedly, once for each of the other sequences.
+ :class:`SequenceMatcher` computes and caches detailed information about the
+ second sequence, so if you want to compare one sequence against many
+ sequences, use :meth:`set_seq2` to set the commonly used sequence once and
+ call :meth:`set_seq1` repeatedly, once for each of the other sequences.
-.. method:: SequenceMatcher.set_seq1(a)
+ .. method:: set_seq1(a)
- Set the first sequence to be compared. The second sequence to be compared is
- not changed.
+ Set the first sequence to be compared. The second sequence to be compared
+ is not changed.
-.. method:: SequenceMatcher.set_seq2(b)
+ .. method:: set_seq2(b)
- Set the second sequence to be compared. The first sequence to be compared is
- not changed.
+ Set the second sequence to be compared. The first sequence to be compared
+ is not changed.
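The caching advice above can be sketched as follows. This is only an illustration: the reference word and the candidate list are made-up data, and the variable names are not part of the module.

```python
from difflib import SequenceMatcher

# Hypothetical data: one reference word compared against several candidates.
reference = "apple"
candidates = ["ape", "apple", "peach"]

matcher = SequenceMatcher()
matcher.set_seq2(reference)       # the commonly used sequence is set once

scores = {}
for word in candidates:
    matcher.set_seq1(word)        # only the first sequence changes per pass
    scores[word] = matcher.ratio()
```

Each pass swaps in a new first sequence while the second sequence, and whatever was cached about it, stays in place.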
-.. method:: SequenceMatcher.find_longest_match(alo, ahi, blo, bhi)
+ .. method:: find_longest_match(alo, ahi, blo, bhi)
- Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``.
+ Find longest matching block in ``a[alo:ahi]`` and ``b[blo:bhi]``.
- If *isjunk* was omitted or ``None``, :meth:`find_longest_match` returns ``(i, j,
- k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where ``alo <= i <= i+k <=
- ahi`` and ``blo <= j <= j+k <= bhi``. For all ``(i', j', k')`` meeting those
- conditions, the additional conditions ``k >= k'``, ``i <= i'``, and if ``i ==
- i'``, ``j <= j'`` are also met. In other words, of all maximal matching blocks,
- return one that starts earliest in *a*, and of all those maximal matching blocks
- that start earliest in *a*, return the one that starts earliest in *b*.
+ If *isjunk* was omitted or ``None``, :meth:`find_longest_match` returns
+ ``(i, j, k)`` such that ``a[i:i+k]`` is equal to ``b[j:j+k]``, where ``alo
+ <= i <= i+k <= ahi`` and ``blo <= j <= j+k <= bhi``. For all ``(i', j',
+ k')`` meeting those conditions, the additional conditions ``k >= k'``, ``i
+ <= i'``, and if ``i == i'``, ``j <= j'`` are also met. In other words, of
+ all maximal matching blocks, return one that starts earliest in *a*, and
+ of all those maximal matching blocks that start earliest in *a*, return
+ the one that starts earliest in *b*.
- >>> s = SequenceMatcher(None, " abcd", "abcd abcd")
- >>> s.find_longest_match(0, 5, 0, 9)
- Match(a=0, b=4, size=5)
+ >>> s = SequenceMatcher(None, " abcd", "abcd abcd")
+ >>> s.find_longest_match(0, 5, 0, 9)
+ Match(a=0, b=4, size=5)
- If *isjunk* was provided, first the longest matching block is determined as
- above, but with the additional restriction that no junk element appears in the
- block. Then that block is extended as far as possible by matching (only) junk
- elements on both sides. So the resulting block never matches on junk except as
- identical junk happens to be adjacent to an interesting match.
+ If *isjunk* was provided, first the longest matching block is determined
+ as above, but with the additional restriction that no junk element appears
+ in the block. Then that block is extended as far as possible by matching
+ (only) junk elements on both sides. So the resulting block never matches
+ on junk except as identical junk happens to be adjacent to an interesting
+ match.
- Here's the same example as before, but considering blanks to be junk. That
- prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the second
- sequence directly. Instead only the ``'abcd'`` can match, and matches the
- leftmost ``'abcd'`` in the second sequence:
+ Here's the same example as before, but considering blanks to be junk. That
+ prevents ``' abcd'`` from matching the ``' abcd'`` at the tail end of the
+ second sequence directly. Instead only the ``'abcd'`` can match, and
+ matches the leftmost ``'abcd'`` in the second sequence:
- >>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
- >>> s.find_longest_match(0, 5, 0, 9)
- Match(a=1, b=0, size=4)
+ >>> s = SequenceMatcher(lambda x: x==" ", " abcd", "abcd abcd")
+ >>> s.find_longest_match(0, 5, 0, 9)
+ Match(a=1, b=0, size=4)
- If no blocks match, this returns ``(alo, blo, 0)``.
+ If no blocks match, this returns ``(alo, blo, 0)``.
- This method returns a :term:`named tuple` ``Match(a, b, size)``.
+ This method returns a :term:`named tuple` ``Match(a, b, size)``.
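As a small sketch of how the returned named tuple might be consumed, the fields can be read by name and used to slice the matching block out of either sequence. The second call uses two strings chosen here because they share no elements, to show the no-match result.

```python
from difflib import SequenceMatcher

a, b = " abcd", "abcd abcd"
s = SequenceMatcher(None, a, b)
match = s.find_longest_match(0, len(a), 0, len(b))

# The fields are available by name as well as by position.
block_a = a[match.a:match.a + match.size]
block_b = b[match.b:match.b + match.size]

# Illustrative inputs with nothing in common: the result is (alo, blo, 0).
no_match = SequenceMatcher(None, "abc", "xyz").find_longest_match(0, 3, 0, 3)
```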
-.. method:: SequenceMatcher.get_matching_blocks()
+ .. method:: get_matching_blocks()
- Return list of triples describing matching subsequences. Each triple is of the
- form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The triples are
- monotonically increasing in *i* and *j*.
+ Return list of triples describing matching subsequences. Each triple is of
+ the form ``(i, j, n)``, and means that ``a[i:i+n] == b[j:j+n]``. The
+ triples are monotonically increasing in *i* and *j*.
- The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It is
- the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')`` are
- adjacent triples in the list, and the second is not the last triple in the list,
- then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent triples always
- describe non-adjacent equal blocks.
+ The last triple is a dummy, and has the value ``(len(a), len(b), 0)``. It
+ is the only triple with ``n == 0``. If ``(i, j, n)`` and ``(i', j', n')``
+ are adjacent triples in the list, and the second is not the last triple in
+ the list, then ``i+n != i'`` or ``j+n != j'``; in other words, adjacent
+ triples always describe non-adjacent equal blocks.
- .. XXX Explain why a dummy is used!
+ .. XXX Explain why a dummy is used!
- .. doctest::
+ .. doctest::
- >>> s = SequenceMatcher(None, "abxcd", "abcd")
- >>> s.get_matching_blocks()
- [Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]
+ >>> s = SequenceMatcher(None, "abxcd", "abcd")
+ >>> s.get_matching_blocks()
+ [Match(a=0, b=0, size=2), Match(a=3, b=2, size=2), Match(a=5, b=4, size=0)]
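The properties described above can be checked directly. The following is a minimal sketch using the same two strings as the doctest; the variable names are illustrative.

```python
from difflib import SequenceMatcher

a, b = "abxcd", "abcd"
s = SequenceMatcher(None, a, b)
blocks = s.get_matching_blocks()

# Every triple (i, j, n) describes equal slices of a and b.
slices_equal = all(a[i:i + n] == b[j:j + n] for i, j, n in blocks)

# The final triple is the zero-length dummy at the end of both sequences.
last = blocks[-1]

# Because the dummy has n == 0, summing n over all triples counts only
# the elements that really matched.
matched = sum(n for _, _, n in blocks)
```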
-.. method:: SequenceMatcher.get_opcodes()
+ .. method:: get_opcodes()
- Return list of 5-tuples describing how to turn *a* into *b*. Each tuple is of
- the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 == 0``, and
- remaining tuples have *i1* equal to the *i2* from the preceding tuple, and,
- likewise, *j1* equal to the previous *j2*.
+ Return list of 5-tuples describing how to turn *a* into *b*. Each tuple is
+ of the form ``(tag, i1, i2, j1, j2)``. The first tuple has ``i1 == j1 ==
+ 0``, and remaining tuples have *i1* equal to the *i2* from the preceding
+ tuple, and, likewise, *j1* equal to the previous *j2*.
- The *tag* values are strings, with these meanings:
+ The *tag* values are strings, with these meanings:
- +---------------+---------------------------------------------+
- | Value | Meaning |
- +===============+=============================================+
- | ``'replace'`` | ``a[i1:i2]`` should be replaced by |
- | | ``b[j1:j2]``. |
- +---------------+---------------------------------------------+
- | ``'delete'`` | ``a[i1:i2]`` should be deleted. Note that |
- | | ``j1 == j2`` in this case. |
- +---------------+---------------------------------------------+
- | ``'insert'`` | ``b[j1:j2]`` should be inserted at |
- | | ``a[i1:i1]``. Note that ``i1 == i2`` in |
- | | this case. |
- +---------------+---------------------------------------------+
- | ``'equal'`` | ``a[i1:i2] == b[j1:j2]`` (the sub-sequences |
- | | are equal). |
- +---------------+---------------------------------------------+
+ +---------------+---------------------------------------------+
+ | Value | Meaning |
+ +===============+=============================================+
+ | ``'replace'`` | ``a[i1:i2]`` should be replaced by |
+ | | ``b[j1:j2]``. |
+ +---------------+---------------------------------------------+
+ | ``'delete'`` | ``a[i1:i2]`` should be deleted. Note that |
+ | | ``j1 == j2`` in this case. |
+ +---------------+---------------------------------------------+
+ | ``'insert'`` | ``b[j1:j2]`` should be inserted at |
+ | | ``a[i1:i1]``. Note that ``i1 == i2`` in |
+ | | this case. |
+ +---------------+---------------------------------------------+
+ | ``'equal'`` | ``a[i1:i2] == b[j1:j2]`` (the sub-sequences |
+ | | are equal). |
+ +---------------+---------------------------------------------+
- For example:
+ For example:
- >>> a = "qabxcd"
- >>> b = "abycdf"
- >>> s = SequenceMatcher(None, a, b)
- >>> for tag, i1, i2, j1, j2 in s.get_opcodes():
- ... print(("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
- ... (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2])))
- delete a[0:1] (q) b[0:0] ()
- equal a[1:3] (ab) b[0:2] (ab)
- replace a[3:4] (x) b[2:3] (y)
- equal a[4:6] (cd) b[3:5] (cd)
- insert a[6:6] () b[5:6] (f)
+ >>> a = "qabxcd"
+ >>> b = "abycdf"
+ >>> s = SequenceMatcher(None, a, b)
+ >>> for tag, i1, i2, j1, j2 in s.get_opcodes():
+ ... print(("%7s a[%d:%d] (%s) b[%d:%d] (%s)" %
+ ... (tag, i1, i2, a[i1:i2], j1, j2, b[j1:j2])))
+ delete a[0:1] (q) b[0:0] ()
+ equal a[1:3] (ab) b[0:2] (ab)
+ replace a[3:4] (x) b[2:3] (y)
+ equal a[4:6] (cd) b[3:5] (cd)
+ insert a[6:6] () b[5:6] (f)
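One way to read the table is as a recipe for rebuilding *b* from *a*. The helper below, ``apply_opcodes``, is a hypothetical name introduced for this sketch and is not part of :mod:`difflib`.

```python
from difflib import SequenceMatcher

def apply_opcodes(a, b):
    """Rebuild b by walking the opcodes that turn a into b (strings only)."""
    s = SequenceMatcher(None, a, b)
    pieces = []
    for tag, i1, i2, j1, j2 in s.get_opcodes():
        if tag == "equal":
            pieces.append(a[i1:i2])      # unchanged run, taken from a
        elif tag in ("replace", "insert"):
            pieces.append(b[j1:j2])      # new material, taken from b
        # 'delete' adds nothing: a[i1:i2] is simply dropped
    return "".join(pieces)

rebuilt = apply_opcodes("qabxcd", "abycdf")
```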
-.. method:: SequenceMatcher.get_grouped_opcodes([n])
+ .. method:: get_grouped_opcodes([n])
- Return a :term:`generator` of groups with up to *n* lines of context.
+ Return a :term:`generator` of groups with up to *n* lines of context.
- Starting with the groups returned by :meth:`get_opcodes`, this method splits out
- smaller change clusters and eliminates intervening ranges which have no changes.
+ Starting with the groups returned by :meth:`get_opcodes`, this method
+ splits out smaller change clusters and eliminates intervening ranges which
+ have no changes.
- The groups are returned in the same format as :meth:`get_opcodes`.
+ The groups are returned in the same format as :meth:`get_opcodes`.
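A short sketch of the grouping behaviour, assuming two small lists of one-character "lines" with changes far enough apart that, with one line of context, they fall into separate groups. The data is invented for illustration.

```python
from difflib import SequenceMatcher

a = list("abcdefghij")      # ten one-character "lines"
b = a[:]
b[1] = "X"                  # a change near the start
b[8] = "Y"                  # a change near the end

s = SequenceMatcher(None, a, b)
groups = list(s.get_grouped_opcodes(1))   # one line of context

# Each group is a list of opcodes in the get_opcodes() format.
tags_per_group = [[tag for tag, i1, i2, j1, j2 in group] for group in groups]
```

The long unchanged run between the two edits is dropped, leaving only one line of context around each change.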
-.. method:: SequenceMatcher.ratio()
+ .. method:: ratio()
- Return a measure of the sequences' similarity as a float in the range [0, 1].
+ Return a measure of the sequences' similarity as a float in the range [0,
+ 1].
- Where T is the total number of elements in both sequences, and M is the number
- of matches, this is 2.0\*M / T. Note that this is ``1.0`` if the sequences are
- identical, and ``0.0`` if they have nothing in common.
+ Where T is the total number of elements in both sequences, and M is the
+ number of matches, this is 2.0\*M / T. Note that this is ``1.0`` if the
+ sequences are identical, and ``0.0`` if they have nothing in common.
- This is expensive to compute if :meth:`get_matching_blocks` or
- :meth:`get_opcodes` hasn't already been called, in which case you may want to
- try :meth:`quick_ratio` or :meth:`real_quick_ratio` first to get an upper bound.
+ This is expensive to compute if :meth:`get_matching_blocks` or
+ :meth:`get_opcodes` hasn't already been called, in which case you may want
+ to try :meth:`quick_ratio` or :meth:`real_quick_ratio` first to get an
+ upper bound.
-.. method:: SequenceMatcher.quick_ratio()
+ .. method:: quick_ratio()
- Return an upper bound on :meth:`ratio` relatively quickly.
+ Return an upper bound on :meth:`ratio` relatively quickly.
- This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
- faster to compute.
+ This isn't defined beyond that it is an upper bound on :meth:`ratio`, and
+ is faster to compute.
-.. method:: SequenceMatcher.real_quick_ratio()
+ .. method:: real_quick_ratio()
- Return an upper bound on :meth:`ratio` very quickly.
+ Return an upper bound on :meth:`ratio` very quickly.
- This isn't defined beyond that it is an upper bound on :meth:`ratio`, and is
- faster to compute than either :meth:`ratio` or :meth:`quick_ratio`.
+ This isn't defined beyond that it is an upper bound on :meth:`ratio`, and
+ is faster to compute than either :meth:`ratio` or :meth:`quick_ratio`.
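The relationship between the three methods can be illustrated as below. This sketch takes M to be the sum of the matching block sizes, which is an assumption about how "number of matches" is counted; the input strings are arbitrary.

```python
from difflib import SequenceMatcher

a, b = "abcd", "bcde"
s = SequenceMatcher(None, a, b)

# M: matched elements (assumed here to be the total size of matching blocks).
matches = sum(n for _, _, n in s.get_matching_blocks())
# T: total number of elements in both sequences.
total = len(a) + len(b)
by_formula = 2.0 * matches / total

exact = s.ratio()
quick = s.quick_ratio()             # upper bound on ratio()
very_quick = s.real_quick_ratio()   # upper bound on ratio(), cheapest to compute
```

A caller could test the cheap bounds first and compute :meth:`ratio` only when they exceed some threshold.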
The three methods that return the ratio of matching to total characters can give
different results due to differing levels of approximation, although
@@ -589,17 +594,17 @@ The :class:`Differ` class has this constructor:
length 1), and returns true if the character is junk. The default is ``None``,
meaning that no character is considered junk.
-:class:`Differ` objects are used (deltas generated) via a single method:
+ :class:`Differ` objects are used (deltas generated) via a single method:
-.. method:: Differ.compare(a, b)
+ .. method:: Differ.compare(a, b)
- Compare two sequences of lines, and generate the delta (a sequence of lines).
+ Compare two sequences of lines, and generate the delta (a sequence of lines).
- Each sequence must contain individual single-line strings ending with newlines.
- Such sequences can be obtained from the :meth:`readlines` method of file-like
- objects. The delta generated also consists of newline-terminated strings, ready
- to be printed as-is via the :meth:`writelines` method of a file-like object.
+ Each sequence must contain individual single-line strings ending with newlines.
+ Such sequences can be obtained from the :meth:`readlines` method of file-like
+ objects. The delta generated also consists of newline-terminated strings, ready
+ to be printed as-is via the :meth:`writelines` method of a file-like object.
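A minimal usage sketch, with two invented lists of newline-terminated lines standing in for the result of ``readlines()``:

```python
import sys
from difflib import Differ

# Hypothetical inputs; each element is a single line ending with a newline.
a = ["one\n", "two\n", "three\n"]
b = ["one\n", "three\n"]

delta = list(Differ().compare(a, b))

# The delta lines are already newline-terminated, so they can be
# written out as-is.
sys.stdout.writelines(delta)
```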
.. _differ-examples: