Doc/lib/libdoctest.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432

\section{\module{doctest} ---
         Test docstrings represent reality}

\declaremodule{standard}{doctest}
\moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Moshe Zadka}{moshez@debian.org}

\modulesynopsis{A framework for verifying examples in docstrings.}

The \module{doctest} module searches a module's docstrings for text that looks
like an interactive Python session, then executes all such sessions to verify
they still work exactly as shown.  Here's a complete but small example:

\begin{verbatim}
"""
This is module example.

Example supplies one function, factorial.  For example,

>>> factorial(5)
120
"""

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000L

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
        ...
    OverflowError: n too large
    """

\end{verbatim}
% allow LaTeX to break here.
\begin{verbatim}

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # e.g., 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        try:
            result *= factor
        except OverflowError:
            result *= long(factor)
        factor += 1
    return result

def _test():
    import doctest, example
    return doctest.testmod(example)

if __name__ == "__main__":
    _test()
\end{verbatim}

If you run \file{example.py} directly from the command line, doctest works
its magic:

\begin{verbatim}
$ python example.py
$
\end{verbatim}

There's no output!  That's normal, and it means all the examples worked.
Pass \programopt{-v} to the script, and doctest prints a detailed log
of what it's trying, and prints a summary at the end:

\begin{verbatim}
$ python example.py -v
Running example.__doc__
Trying: factorial(5)
Expecting: 120
ok
0 of 1 examples failed in example.__doc__
Running example.factorial.__doc__
Trying: [factorial(n) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok
Trying: [factorial(long(n)) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok
Trying: factorial(30)
Expecting: 265252859812191058636308480000000L
ok
\end{verbatim}

And so on, eventually ending with:

\begin{verbatim}
Trying: factorial(1e100)
Expecting:
Traceback (most recent call last):
    ...
OverflowError: n too large
ok
0 of 8 examples failed in example.factorial.__doc__
2 items passed all tests:
   1 tests in example
   8 tests in example.factorial
9 tests in 2 items.
9 passed and 0 failed.
Test passed.
$
\end{verbatim}

That's all you need to know to start making productive use of doctest!  Jump
in.  The docstrings in doctest.py contain detailed information about all
aspects of doctest, and we'll just cover the more important points here.

\subsection{Normal Usage}

In normal use, end each module \module{M} with:

\begin{verbatim}
def _test():
    import doctest, M           # replace M with your module's name
    return doctest.testmod(M)   # ditto

if __name__ == "__main__":
    _test()
\end{verbatim}

Then running the module as a script causes the examples in the docstrings
to get executed and verified:

\begin{verbatim}
python M.py
\end{verbatim}

This won't display anything unless an example fails, in which case the
failing example(s) and the cause(s) of the failure(s) are printed to stdout,
and the final line of output is \code{'Test failed.'}.

Run it with the \programopt{-v} switch instead:

\begin{verbatim}
python M.py -v
\end{verbatim}

and a detailed report of all examples tried is printed to \code{stdout},
along with assorted summaries at the end.

You can force verbose mode by passing \code{verbose=1} to testmod, or
prohibit it by passing \code{verbose=0}.  In either of those cases,
\code{sys.argv} is not examined by testmod.

In any case, testmod returns a 2-tuple of ints \code{(\var{f},
\var{t})}, where \var{f} is the number of docstring examples that
failed and \var{t} is the total number of docstring examples
attempted.

\subsection{Which Docstrings Are Examined?}

See \file{docstring.py} for all the details.  They're unsurprising:  the
module docstring, and all function, class and method docstrings are
searched, with the exception of docstrings attached to objects with private
names.

In addition, if \code{M.__test__} exists and "is true", it must be a
dict, and each entry maps a (string) name to a function object, class
object, or string.  Function and class object docstrings found from
\code{M.__test__} are searched even if the name is private, and
strings are searched directly as if they were docstrings.  In output,
a key \code{K} in \code{M.__test__} appears with name

\begin{verbatim}
      <name of M>.__test__.K
\end{verbatim}

Any classes found are recursively searched similarly, to test docstrings in
their contained methods and nested classes.  While private names reached
from \module{M}'s globals are skipped, all names reached from
\code{M.__test__} are searched.

\subsection{What's the Execution Context?}

By default, each time testmod finds a docstring to test, it uses a
{\em copy} of \module{M}'s globals, so that running tests on a module
doesn't change the module's real globals, and so that one test in
\module{M} can't leave behind crumbs that accidentally allow another test
to work.  This means examples can freely use any names defined at top-level
in \module{M}, and names defined earlier in the docstring being run.  It
also means that sloppy imports (see below) can cause examples in external
docstrings to use globals inappropriate for them.

You can force use of your own dict as the execution context by passing
\code{globs=your_dict} to \function{testmod()} instead.  Presumably this
would be a copy of \code{M.__dict__} merged with the globals from other
imported modules.

\subsection{What About Exceptions?}

No problem, as long as the only output generated by the example is the
traceback itself.  For example:

\begin{verbatim}
    >>> [1, 2, 3].remove(42)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: list.remove(x): x not in list
    >>>
\end{verbatim}

Note that only the exception type and value are compared (specifically,
only the last line in the traceback).  The various ``File'' lines in
between can be left out (unless they add significantly to the documentation
value of the example).

\subsection{Advanced Usage}

\function{testmod()} actually creates a local instance of class
\class{Tester}, runs appropriate methods of that class, and merges
the results into global \class{Tester} instance \code{master}.

You can create your own instances of \class{Tester}, and so build your
own policies, or even run methods of \code{master} directly.  See
\code{Tester.__doc__} for details.


\subsection{How are Docstring Examples Recognized?}

In most cases a copy-and-paste of an interactive console session works fine
--- just make sure the leading whitespace is rigidly consistent (you can mix
tabs and spaces if you're too lazy to do it right, but doctest is not in
the business of guessing what you think a tab means).

\begin{verbatim}
    >>> # comments are ignored
    >>> x = 12
    >>> x
    12
    >>> if x == 13:
    ...     print "yes"
    ... else:
    ...     print "no"
    ...     print "NO"
    ...     print "NO!!!"
    ...
    no
    NO
    NO!!!
    >>>
\end{verbatim}

Any expected output must immediately follow the final \code{">>>"} or
\code{"..."} line containing the code, and the expected output (if any)
extends to the next \code{">>>"} or all-whitespace line.

The fine print:

\begin{itemize}

\item Expected output cannot contain an all-whitespace line, since such a
  line is taken to signal the end of expected output.

\item Output to stdout is captured, but not output to stderr (exception
  tracebacks are captured via a different means).

\item If you continue a line via backslashing in an interactive session, or
  for any other reason use a backslash, you need to double the backslash in
  the docstring version.  This is simply because you're in a string, and so
  the backslash must be escaped for it to survive intact.  Like:

\begin{verbatim}
>>> if "yes" == \\
...     "y" +   \\
...     "es":
...     print 'yes'
yes
\end{verbatim}

The starting column doesn't matter:

\begin{verbatim}
>>> assert "Easy!"
>>> import math
>>> math.floor(1.9)
1.0
\end{verbatim}

and as many leading whitespace characters are stripped from the expected
output as appeared in the initial ">>>" line that triggered it.
\end{itemize}

\subsection{Warnings}

\begin{enumerate}

\item Sloppy imports can cause trouble; e.g., if you do

\begin{verbatim}
from XYZ import XYZclass
\end{verbatim}

then \class{XYZclass} is a name in \code{M.__dict__} too, and doctest
has no way to know that \class{XYZclass} wasn't \emph{defined} in
\module{M}.  So it may try to execute the examples in
\class{XYZclass}'s docstring, and those in turn may require a
different set of globals to work correctly.  I prefer to do
``\code{import *}''-friendly imports, a la

\begin{verbatim}
from XYZ import XYZclass as _XYZclass
\end{verbatim}

and then the leading underscore makes \class{_XYZclass} a private name so
testmod skips it by default.  Other approaches are described in
\file{doctest.py}.

\item \module{doctest} is serious about requiring exact matches in expected
  output.  If even a single character doesn't match, the test fails.  This
  will probably surprise you a few times, as you learn exactly what Python
  does and doesn't guarantee about output.  For example, when printing a
  dict, Python doesn't guarantee that the key-value pairs will be printed
  in any particular order, so a test like

% Hey! What happened to Monty Python examples?
\begin{verbatim}
    >>> foo()
    {"Hermione": "hippogryph", "Harry": "broomstick"}
    >>>
\end{verbatim}

is vulnerable!  One workaround is to do

\begin{verbatim}
    >>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
    1
    >>>
\end{verbatim}

instead.  Another is to do

\begin{verbatim}
    >>> d = foo().items()
    >>> d.sort()
    >>> d
    [('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
\end{verbatim}

There are others, but you get the idea.

Another bad idea is to print things that embed an object address, like

\begin{verbatim}
    >>> id(1.0) # certain to fail some of the time
    7948648
    >>>
\end{verbatim}

Floating-point numbers are also subject to small output variations across
platforms, because Python defers to the platform C library for float
formatting, and C libraries vary widely in quality here.

\begin{verbatim}
    >>> 1./7  # risky
    0.14285714285714285
    >>> print 1./7 # safer
    0.142857142857
    >>> print round(1./7, 6) # much safer
    0.142857
\end{verbatim}

Numbers of the form \code{I/2.**J} are safe across all platforms, and I
often contrive doctest examples to produce numbers of that form:

\begin{verbatim}
    >>> 3./4  # utterly safe
    0.75
\end{verbatim}

Simple fractions are also easier for people to understand, and that makes
for better documentation.
\end{enumerate}


\subsection{Soapbox}

The first word in doctest is "doc", and that's why the author wrote
doctest:  to keep documentation up to date.  It so happens that doctest
makes a pleasant unit testing environment, but that's not its primary
purpose.

Choose docstring examples with care.  There's an art to this that needs to
be learned --- it may not be natural at first.  Examples should add genuine
value to the documentation.  A good example can often be worth many words.
If possible, show just a few normal cases, show endcases, show interesting
subtle cases, and show an example of each kind of exception that can be
raised.  You're probably testing for endcases and subtle cases anyway in an
interactive shell:  doctest wants to make it as easy as possible to capture
those sessions, and will verify they continue to work as designed forever
after.

If done with care, the examples will be invaluable for your users, and will
pay back the time it takes to collect them many times over as the years go
by and "things change".  I'm still amazed at how often one of my doctest
examples stops working after a "harmless" change.

For exhaustive testing, or testing boring cases that add no value to the
docs, define a \code{__test__} dict instead.  That's what it's for.