:mod:`filecmp` --- File and Directory Comparisons
=================================================

.. module:: filecmp
   :synopsis: Compare files efficiently.
.. sectionauthor:: Moshe Zadka <moshez@zadka.site.co.il>

**Source code:** :source:`Lib/filecmp.py`

--------------

The :mod:`filecmp` module defines functions to compare files and directories,
with various optional time/correctness trade-offs. For comparing files,
see also the :mod:`difflib` module.

The :mod:`filecmp` module defines the following functions:


.. function:: cmp(f1, f2, shallow=True)

   Compare the files named *f1* and *f2*, returning ``True`` if they seem equal,
   ``False`` otherwise.

   If *shallow* is true, files with identical :func:`os.stat` signatures are
   taken to be equal.  Otherwise, the contents of the files are compared.

   Note that no external programs are called from this function, giving it
   portability and efficiency.

   This function caches the results of past comparisons.  A cached entry is
   no longer used once the :func:`os.stat` signature of either file changes,
   and the whole cache can be discarded by calling :func:`clear_cache`.
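
   As a minimal sketch of the difference between the two modes, the following
   creates two temporary files of equal size but different contents and forces
   their modification times to match.  The file names and the timestamp are
   arbitrary choices made for illustration, and the example assumes that the
   signature consists of file type, size, and modification time::

      import filecmp
      import os
      import tempfile

      FIXED_TIME = 1_000_000_000  # arbitrary timestamp, for illustration only

      with tempfile.TemporaryDirectory() as tmp:
          first = os.path.join(tmp, "first.txt")
          second = os.path.join(tmp, "second.txt")
          for path, text in ((first, "spam\n"), (second, "eggs\n")):
              with open(path, "w") as f:
                  f.write(text)
              # Same type, same size, same mtime: identical os.stat signatures.
              os.utime(path, (FIXED_TIME, FIXED_TIME))

          # Only the signatures are consulted, so the files "seem equal".
          shallow_equal = filecmp.cmp(first, second)
          # The contents are read and found to differ.
          deep_equal = filecmp.cmp(first, second, shallow=False)

   Here ``shallow_equal`` is ``True`` and ``deep_equal`` is ``False``, which is
   why ``shallow=False`` is the safer choice when modification times cannot be
   trusted.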


.. function:: cmpfiles(dir1, dir2, common, shallow=True)

   Compare the files in the two directories *dir1* and *dir2* whose names are
   given by *common*.

   Returns three lists of file names: *match*, *mismatch*, and *errors*.
   *match* contains the names of files that match, *mismatch* contains the
   names of those that don't, and *errors* lists the names of files which
   could not be compared.  Files are listed in *errors* if they don't exist in
   one of the directories, if the user lacks permission to read them, or if
   the comparison could not be done for some other reason.

   The *shallow* parameter has the same meaning and default value as for
   :func:`filecmp.cmp`.

   For example, ``cmpfiles('a', 'b', ['c', 'd/e'])`` will compare ``a/c`` with
   ``b/c`` and ``a/d/e`` with ``b/d/e``.  ``'c'`` and ``'d/e'`` will each be in
   one of the three returned lists.
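
   The three result lists can be illustrated with a small sketch that builds
   two temporary directories.  The directory and file names, and the ``write``
   helper, are hypothetical and exist only for this example::

      import filecmp
      import os
      import tempfile

      def write(path, text):
          """Hypothetical helper: create a small text file."""
          with open(path, "w") as f:
              f.write(text)

      with tempfile.TemporaryDirectory() as tmp:
          dir1 = os.path.join(tmp, "a")
          dir2 = os.path.join(tmp, "b")
          os.mkdir(dir1)
          os.mkdir(dir2)

          write(os.path.join(dir1, "same.txt"), "same\n")
          write(os.path.join(dir2, "same.txt"), "same\n")
          write(os.path.join(dir1, "changed.txt"), "old\n")
          write(os.path.join(dir2, "changed.txt"), "new content\n")
          write(os.path.join(dir1, "only_left.txt"), "x\n")  # absent from dir2

          names = ["same.txt", "changed.txt", "only_left.txt"]
          match, mismatch, errors = filecmp.cmpfiles(
              dir1, dir2, names, shallow=False)

   Each name in ``names`` ends up in exactly one list: ``same.txt`` in
   ``match``, ``changed.txt`` in ``mismatch``, and ``only_left.txt`` in
   ``errors`` because it is missing from the second directory.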


.. function:: clear_cache()

   .. versionadded:: 3.4

   Clear the filecmp cache. This may be useful if a file is compared so quickly
   after it is modified that it is within the mtime resolution of
   the underlying filesystem.
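
   The situation described above can be simulated with a short sketch.  It
   rewrites a file with contents of the same size and then resets the
   modification time with :func:`os.utime`, standing in for a change that
   falls within the filesystem's mtime resolution.  The file names, the
   timestamp, and the ``write`` helper are assumptions made for
   illustration::

      import filecmp
      import os
      import tempfile

      FIXED_TIME = 1_000_000_000  # arbitrary timestamp, for illustration only

      def write(path, text):
          """Hypothetical helper: write text and pin the modification time."""
          with open(path, "w") as f:
              f.write(text)
          os.utime(path, (FIXED_TIME, FIXED_TIME))

      with tempfile.TemporaryDirectory() as tmp:
          f1 = os.path.join(tmp, "f1.txt")
          f2 = os.path.join(tmp, "f2.txt")
          write(f1, "spam\n")
          write(f2, "spam\n")
          # Contents match; the result is stored in the cache.
          before = filecmp.cmp(f1, f2, shallow=False)

          # Same size and mtime as before, but different contents.
          write(f2, "eggs\n")
          # The signatures are unchanged, so the cached result is reused.
          stale = filecmp.cmp(f1, f2, shallow=False)

          filecmp.clear_cache()
          # With the cache emptied, the files are read again.
          fresh = filecmp.cmp(f1, f2, shallow=False)

   In this sketch ``before`` and ``stale`` are both ``True`` even though the
   second file has changed, while ``fresh`` is ``False`` once the cache has
   been cleared.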


.. _dircmp-objects:

The :class:`dircmp` class
-------------------------

.. class:: dircmp(a, b, ignore=None, hide=None)

   Construct a new directory comparison object, to compare the directories *a*
   and *b*.  *ignore* is a list of names to ignore, and defaults to
   :attr:`filecmp.DEFAULT_IGNORES`.  *hide* is a list of names to hide, and
   defaults to ``[os.curdir, os.pardir]``.

   The :class:`dircmp` class compares files by doing *shallow* comparisons
   as described for :func:`filecmp.cmp`.

   The :class:`dircmp` class provides the following methods:

   .. method:: report()

      Print (to :data:`sys.stdout`) a comparison between *a* and *b*.

   .. method:: report_partial_closure()

      Print a comparison between *a* and *b* and common immediate
      subdirectories.

   .. method:: report_full_closure()

      Print a comparison between *a* and *b* and common subdirectories
      (recursively).
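
   Because the report methods print rather than return their results, one way
   to obtain the text is to redirect :data:`sys.stdout` temporarily.  The
   following is a sketch under that assumption; the directory and file names
   are hypothetical, and the exact wording of the report is not shown because
   it is not specified here::

      import contextlib
      import filecmp
      import io
      import os
      import tempfile

      with tempfile.TemporaryDirectory() as tmp:
          left = os.path.join(tmp, "left")
          right = os.path.join(tmp, "right")
          os.mkdir(left)
          os.mkdir(right)
          with open(os.path.join(left, "only_here.txt"), "w") as f:
              f.write("left side\n")

          # Collect everything report() prints into an in-memory buffer.
          buffer = io.StringIO()
          with contextlib.redirect_stdout(buffer):
              filecmp.dircmp(left, right).report()
          report_text = buffer.getvalue()

   Afterwards ``report_text`` holds the printed comparison, which mentions
   ``only_here.txt`` since that file exists in only one directory.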

   The :class:`dircmp` class offers a number of attributes that describe the
   directory trees being compared.

   Note that via :meth:`__getattr__` hooks, all attributes are computed lazily,
   so there is no speed penalty if only those attributes which are lightweight
   to compute are used.


   .. attribute:: left

      The directory *a*.


   .. attribute:: right

      The directory *b*.


   .. attribute:: left_list

      Files and subdirectories in *a*, filtered by *hide* and *ignore*.


   .. attribute:: right_list

      Files and subdirectories in *b*, filtered by *hide* and *ignore*.


   .. attribute:: common

      Files and subdirectories in both *a* and *b*.


   .. attribute:: left_only

      Files and subdirectories only in *a*.


   .. attribute:: right_only

      Files and subdirectories only in *b*.


   .. attribute:: common_dirs

      Subdirectories in both *a* and *b*.


   .. attribute:: common_files

      Files in both *a* and *b*.


   .. attribute:: common_funny

      Names in both *a* and *b*, such that the type differs between the
      directories, or names for which :func:`os.stat` reports an error.


   .. attribute:: same_files

      Files which are identical in both *a* and *b*, using the class's
      file comparison operator.


   .. attribute:: diff_files

      Files which are in both *a* and *b*, whose contents differ according
      to the class's file comparison operator.


   .. attribute:: funny_files

      Files which are in both *a* and *b*, but could not be compared.


   .. attribute:: subdirs

      A dictionary mapping names in :attr:`common_dirs` to :class:`dircmp`
      objects.
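
   To make the attributes above concrete, the following sketch compares two
   small temporary trees.  All directory and file names, and the ``write``
   helper, are hypothetical.  The attributes are read while the directories
   still exist because they are computed lazily::

      import filecmp
      import os
      import tempfile

      def write(path, text):
          """Hypothetical helper: create a small text file."""
          with open(path, "w") as f:
              f.write(text)

      with tempfile.TemporaryDirectory() as tmp:
          left = os.path.join(tmp, "left")
          right = os.path.join(tmp, "right")
          os.makedirs(os.path.join(left, "shared"))
          os.makedirs(os.path.join(right, "shared"))

          write(os.path.join(left, "both.txt"), "same\n")
          write(os.path.join(right, "both.txt"), "same\n")
          write(os.path.join(left, "changed.txt"), "short\n")
          write(os.path.join(right, "changed.txt"), "much longer\n")
          write(os.path.join(left, "old.txt"), "left only\n")
          write(os.path.join(right, "new.txt"), "right only\n")

          dcmp = filecmp.dircmp(left, right)
          left_only = sorted(dcmp.left_only)       # names only in left
          right_only = sorted(dcmp.right_only)     # names only in right
          common_dirs = sorted(dcmp.common_dirs)   # subdirectories in both
          same_files = sorted(dcmp.same_files)     # common files that match
          diff_files = sorted(dcmp.diff_files)     # common files that differ
          subdir_names = sorted(dcmp.subdirs)      # keys of the subdirs mapping

   In this sketch ``left_only`` is ``['old.txt']``, ``right_only`` is
   ``['new.txt']``, ``same_files`` is ``['both.txt']``, ``diff_files`` is
   ``['changed.txt']``, and both ``common_dirs`` and ``subdir_names`` are
   ``['shared']``.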

.. attribute:: DEFAULT_IGNORES

   .. versionadded:: 3.4

   List of directories ignored by :class:`dircmp` by default.


Here is a simplified example of using the ``subdirs`` attribute to search
recursively through two directories and report the files that are present in
both but differ::

    >>> from filecmp import dircmp
    >>> def print_diff_files(dcmp):
    ...     for name in dcmp.diff_files:
    ...         print("diff_file %s found in %s and %s" % (name, dcmp.left,
    ...               dcmp.right))
    ...     for sub_dcmp in dcmp.subdirs.values():
    ...         print_diff_files(sub_dcmp)
    ...
    >>> dcmp = dircmp('dir1', 'dir2') # doctest: +SKIP
    >>> print_diff_files(dcmp) # doctest: +SKIP