Doc/lib/libdoctest.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645

\section{\module{doctest} ---
         Test docstrings represent reality}

\declaremodule{standard}{doctest}
\moduleauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
\sectionauthor{Moshe Zadka}{moshez@debian.org}

\modulesynopsis{A framework for verifying examples in docstrings.}

The \module{doctest} module searches a module's docstrings for text that looks
like an interactive Python session, then executes all such sessions to verify
they still work exactly as shown.  Here's a complete but small example:

\begin{verbatim}
"""
This is module example.

Example supplies one function, factorial.  For example,

>>> factorial(5)
120
"""

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> factorial(30)
    265252859812191058636308480000000L
    >>> factorial(30L)
    265252859812191058636308480000000L
    >>> factorial(-1)
    Traceback (most recent call last):
        ...
    ValueError: n must be >= 0

    Factorials of floats are OK, but the float must be an exact integer:
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    >>> factorial(30.0)
    265252859812191058636308480000000L

    It must also not be ridiculously large:
    >>> factorial(1e100)
    Traceback (most recent call last):
        ...
    OverflowError: n too large
    """

\end{verbatim}
% allow LaTeX to break here.
\begin{verbatim}

    import math
    if not n >= 0:
        raise ValueError("n must be >= 0")
    if math.floor(n) != n:
        raise ValueError("n must be exact integer")
    if n+1 == n:  # catch a value like 1e300
        raise OverflowError("n too large")
    result = 1
    factor = 2
    while factor <= n:
        try:
            result *= factor
        except OverflowError:
            result *= long(factor)
        factor += 1
    return result

def _test():
    import doctest
    return doctest.testmod()

if __name__ == "__main__":
    _test()
\end{verbatim}

If you run \file{example.py} directly from the command line,
\module{doctest} works its magic:

\begin{verbatim}
$ python example.py
$
\end{verbatim}

There's no output!  That's normal, and it means all the examples
worked.  Pass \programopt{-v} to the script, and \module{doctest}
prints a detailed log of what it's trying, and prints a summary at the
end:

\begin{verbatim}
$ python example.py -v
Trying: factorial(5)
Expecting: 120
ok
Trying: [factorial(n) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok
Trying: [factorial(long(n)) for n in range(6)]
Expecting: [1, 1, 2, 6, 24, 120]
ok\end{verbatim}

And so on, eventually ending with:

\begin{verbatim}
Trying: factorial(1e100)
Expecting:
    Traceback (most recent call last):
        ...
    OverflowError: n too large
ok
2 items passed all tests:
   1 tests in example
   8 tests in example.factorial
9 tests in 2 items.
9 passed and 0 failed.
Test passed.
$
\end{verbatim}

That's all you need to know to start making productive use of
\module{doctest}!  Jump in.

\subsection{Simple Usage}

The simplest (not necessarily the best) way to start using doctest is to
end each module \module{M} with:

\begin{verbatim}
def _test():
    import doctest
    return doctest.testmod()

if __name__ == "__main__":
    _test()
\end{verbatim}

\module{doctest} then examines docstrings in the module calling
\function{testmod()}.  If you want to test a different module, you can
pass that module object to \function{testmod()}.

Running the module as a script causes the examples in the docstrings
to get executed and verified:

\begin{verbatim}
python M.py
\end{verbatim}

This won't display anything unless an example fails, in which case the
failing example(s) and the cause(s) of the failure(s) are printed to stdout,
and the final line of output is
\\code{'***Test Failed*** \var{N} failures.'}, where \var{N} is the
number of examples that failed.

Run it with the \programopt{-v} switch instead:

\begin{verbatim}
python M.py -v
\end{verbatim}

and a detailed report of all examples tried is printed to standard
output, along with assorted summaries at the end.

You can force verbose mode by passing \code{verbose=True} to
\function{testmod()}, or
prohibit it by passing \code{verbose=False}.  In either of those cases,
\code{sys.argv} is not examined by \function{testmod()}.

In any case, \function{testmod()} returns a 2-tuple of ints \code{(\var{f},
\var{t})}, where \var{f} is the number of docstring examples that
failed and \var{t} is the total number of docstring examples
attempted.

\begin{funcdesc}{testmod}{\optional{m}\optional{, name}\optional{,
                          globs}\optional{, verbose}\optional{,
                          isprivate}\optional{, report}\optional{,
                          optionflags}\optional{, extraglobs}\optional{,
                          raise_on_error}}

  All arguments are optional, and all except for \var{m} should be
  specified in keyword form.

  Test examples in docstrings in functions and classes reachable
  from module \var{m} (or the current module if \var{m} is not supplied
  or is \code{None}), starting with \code{\var{m}.__doc__}.

  Also test examples reachable from dict \code{\var{m}.__test__}, if it
  exists and is not \code{None}.  \code{\var{m}.__test__} maps
  names (strings) to functions, classes and strings; function and class
  docstrings are searched for examples; strings are searched directly,
  as if they were docstrings.

  Only docstrings attached to objects belonging to module \var{m} are
  searched.

  Return \code{(\var{failure_count}, \var{test_count})}.

  Optional argument \var{name} gives the name of the module; by default,
  or if \code{None}, \code{\var{m}.__name__} is used.

  Optional argument \var{globs} gives a dict to be used as the globals
  when executing examples; by default, or if \code{None},
  \code{\var{m}.__dict__} is used.  A new shallow copy of this dict is
  created for each docstring with examples, so that each docstring's
  examples start with a clean slate.

  Optional argument \var{extraglobs} gives a dicti merged into the
  globals used to execute examples.  This works like
  \method{dict.update()}:  if \var{globs} and \var{extraglobs} have a
  common key, the associated value in \var{extraglobs} appears in the
  combined dict.  By default, or if \code{None}, no extra globals are
  used.  This is an advanced feature that allows parameterization of
  doctests.  For example, a doctest can be written for a base class, using
  a generic name for the class, then reused to test any number of
  subclasses by passing an \var{extraglobs} dict mapping the generic
  name to the subclass to be tested.

  Optional argument \var{verbose} prints lots of stuff if true, and prints
  only failures if false; by default, or if \code{None}, it's true
  if and only if \code{'-v'} is in \code{\module{sys}.argv}.

  Optional argument \var{report} prints a summary at the end when true,
  else prints nothing at the end.  In verbose mode, the summary is
  detailed, else the summary is very brief (in fact, empty if all tests
  passed).

  Optional argument \var{optionflags} or's together module constants,
  and defaults to 0.

%  Possible values:
%
%      DONT_ACCEPT_TRUE_FOR_1
%          By default, if an expected output block contains just "1",
%          an actual output block containing just "True" is considered
%          to be a match, and similarly for "0" versus "False".  When
%          DONT_ACCEPT_TRUE_FOR_1 is specified, neither substitution
%          is allowed.
%
%      DONT_ACCEPT_BLANKLINE
%          By default, if an expected output block contains a line
%          containing only the string "<BLANKLINE>", then that line
%          will match a blank line in the actual output.  When
%          DONT_ACCEPT_BLANKLINE is specified, this substitution is
%          not allowed.
%
%      NORMALIZE_WHITESPACE
%          When NORMALIZE_WHITESPACE is specified, all sequences of
%          whitespace are treated as equal.  I.e., any sequence of
%          whitespace within the expected output will match any
%          sequence of whitespace within the actual output.
%
%      ELLIPSIS
%          When ELLIPSIS is specified, then an ellipsis marker
%          ("...") in the expected output can match any substring in
%          the actual output.
%
%      UNIFIED_DIFF
%          When UNIFIED_DIFF is specified, failures that involve
%          multi-line expected and actual outputs will be displayed
%          using a unified diff.
%
%      CONTEXT_DIFF
%          When CONTEXT_DIFF is specified, failures that involve
%          multi-line expected and actual outputs will be displayed
%          using a context diff.

  Optional argument \var{raise_on_error} defaults to false.  If true,
  an exception is raised upon the first failure or unexpected exception
  in an example.  This allows failures to be post-mortem debugged.
  Default behavior is to continue running examples.

  Optional argument \var{isprivate} specifies a function used to
  determine whether a name is private.  The default function treats
  all names as public.  \var{isprivate} can be set to
  \code{\module{doctest}.is_private} to skip over names that are
  private according to Python's underscore naming convention.
  \deprecated{2.4}{\var{isprivate} was a stupid idea -- don't use it.
  If you need to skip tests based on name, filter the list returned by
  \code{DocTestFinder.find()} instead.}

%  """ [XX] This is no longer true:
%  Advanced tomfoolery:  testmod runs methods of a local instance of
%  class doctest.Tester, then merges the results into (or creates)
%  global Tester instance doctest.master.  Methods of doctest.master
%  can be called directly too, if you want to do something unusual.
%  Passing report=0 to testmod is especially useful then, to delay
%  displaying a summary.  Invoke doctest.master.summarize(verbose)
%  when you're done fiddling.

  \versionchanged[The parameter \var{optionflags} was added]{2.3}

  \versionchanged[Many new module constants for use with \var{optionflags}
                  were added]{2.4}

  \versionchanged[The parameters \var{extraglobs} and \var{raise_on_error}
                  were added]{2.4}
\end{funcdesc}


\subsection{Which Docstrings Are Examined?}

See the docstrings in \file{doctest.py} for all the details.  They're
unsurprising: the module docstring, and all function, class and method
docstrings are searched.  Optionally, the tester can be directed to
exclude docstrings attached to objects with private names.  Objects
imported into the module are not searched.

In addition, if \code{M.__test__} exists and "is true", it must be a
dict, and each entry maps a (string) name to a function object, class
object, or string.  Function and class object docstrings found from
\code{M.__test__} are searched even if the tester has been
directed to skip over private names in the rest of the module.
In output, a key \code{K} in \code{M.__test__} appears with name

\begin{verbatim}
<name of M>.__test__.K
\end{verbatim}

Any classes found are recursively searched similarly, to test docstrings in
their contained methods and nested classes.  While private names reached
from \module{M}'s globals can be optionally skipped, all names reached from
\code{M.__test__} are searched.

\subsection{What's the Execution Context?}

By default, each time \function{testmod()} finds a docstring to test, it uses
a \emph{copy} of \module{M}'s globals, so that running tests on a module
doesn't change the module's real globals, and so that one test in
\module{M} can't leave behind crumbs that accidentally allow another test
to work.  This means examples can freely use any names defined at top-level
in \module{M}, and names defined earlier in the docstring being run.

You can force use of your own dict as the execution context by passing
\code{globs=your_dict} to \function{testmod()} instead.  Presumably this
would be a copy of \code{M.__dict__} merged with the globals from other
imported modules.

\subsection{What About Exceptions?}

No problem, as long as the only output generated by the example is the
traceback itself.  For example:

\begin{verbatim}
>>> [1, 2, 3].remove(42)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: list.remove(x): x not in list
>>>
\end{verbatim}

Note that only the exception type and value are compared (specifically,
only the last line in the traceback).  The various ``File'' lines in
between can be left out (unless they add significantly to the documentation
value of the example).

\subsection{Advanced Usage}

Several module level functions are available for controlling how doctests
are run.

\begin{funcdesc}{debug}{module, name}
  Debug a single docstring containing doctests.

  Provide the \var{module} (or dotted name of the module) containing the
  docstring to be debugged and the \var{name} (within the module) of the
  object with the docstring to be debugged.

  The doctest examples are extracted (see function \function{testsource()}),
  and written to a temporary file.  The Python debugger, \refmodule{pdb},
  is then invoked on that file.
  \versionadded{2.3}
\end{funcdesc}

\begin{funcdesc}{testmod}{}
  This function provides the most basic interface to the doctests.
  It creates a local instance of class \class{Tester}, runs appropriate
  methods of that class, and merges the results into the global \class{Tester}
  instance, \code{master}.

  To get finer control than \function{testmod()} offers, create an instance
  of \class{Tester} with custom policies, or run methods of \code{master}
  directly.  See \code{Tester.__doc__} for details.
\end{funcdesc}

\begin{funcdesc}{testsource}{module, name}
  Extract the doctest examples from a docstring.

  Provide the \var{module} (or dotted name of the module) containing the
  tests to be extracted and the \var{name} (within the module) of the object
  with the docstring containing the tests to be extracted.

  The doctest examples are returned as a string containing Python
  code.  The expected output blocks in the examples are converted
  to Python comments.
  \versionadded{2.3}
\end{funcdesc}

\begin{funcdesc}{DocTestSuite}{\optional{module}}
  Convert doctest tests for a module to a
  \class{\refmodule{unittest}.TestSuite}.

  The returned \class{TestSuite} is to be run by the unittest framework
  and runs each doctest in the module.  If any of the doctests fail,
  then the synthesized unit test fails, and a \exception{DocTestTestFailure}
  exception is raised showing the name of the file containing the test and a
  (sometimes approximate) line number.

  The optional \var{module} argument provides the module to be tested.  It
  can be a module object or a (possibly dotted) module name.  If not
  specified, the module calling this function is used.

  Example using one of the many ways that the \refmodule{unittest} module
  can use a \class{TestSuite}:

  \begin{verbatim}
    import unittest
    import doctest
    import my_module_with_doctests

    suite = doctest.DocTestSuite(my_module_with_doctests)
    runner = unittest.TextTestRunner()
    runner.run(suite)
  \end{verbatim}

  \versionadded{2.3}
  \warning{This function does not currently search \code{M.__test__}
  and its search technique does not exactly match \function{testmod()} in
  every detail.  Future versions will bring the two into convergence.}
\end{funcdesc}


\subsection{How are Docstring Examples Recognized?}

In most cases a copy-and-paste of an interactive console session works
fine---just make sure the leading whitespace is rigidly consistent
(you can mix tabs and spaces if you're too lazy to do it right, but
\module{doctest} is not in the business of guessing what you think a tab
means).

\begin{verbatim}
>>> # comments are ignored
>>> x = 12
>>> x
12
>>> if x == 13:
...     print "yes"
... else:
...     print "no"
...     print "NO"
...     print "NO!!!"
...
no
NO
NO!!!
>>>
\end{verbatim}

Any expected output must immediately follow the final
\code{'>\code{>}>~'} or \code{'...~'} line containing the code, and
the expected output (if any) extends to the next \code{'>\code{>}>~'}
or all-whitespace line.

The fine print:

\begin{itemize}

\item Expected output cannot contain an all-whitespace line, since such a
  line is taken to signal the end of expected output.

\item Output to stdout is captured, but not output to stderr (exception
  tracebacks are captured via a different means).

\item If you continue a line via backslashing in an interactive session,
  or for any other reason use a backslash, you should use a raw
  docstring, which will preserve your backslahses exactly as you type
  them:

\begin{verbatim}
>>> def f(x):
...     r'''Backslashes in a raw docstring: m\n'''
>>> print f.__doc__
Backslashes in a raw docstring: m\n
\end{verbatim}

  Otherwise, the backslash will be interpreted as part of the string.
  E.g., the "\textbackslash" above would be interpreted as a newline
  character.  Alternatively, you can double each backslash in the
  doctest version (and not use a raw string):

\begin{verbatim}
>>> def f(x):
...     '''Backslashes in a raw docstring: m\\n'''
>>> print f.__doc__
Backslashes in a raw docstring: m\n
\end{verbatim}

\item The starting column doesn't matter:

\begin{verbatim}
  >>> assert "Easy!"
        >>> import math
            >>> math.floor(1.9)
            1.0
\end{verbatim}

and as many leading whitespace characters are stripped from the
expected output as appeared in the initial \code{'>\code{>}>~'} line
that triggered it.
\end{itemize}

\subsection{Warnings}

\begin{enumerate}

\item \module{doctest} is serious about requiring exact matches in expected
  output.  If even a single character doesn't match, the test fails.  This
  will probably surprise you a few times, as you learn exactly what Python
  does and doesn't guarantee about output.  For example, when printing a
  dict, Python doesn't guarantee that the key-value pairs will be printed
  in any particular order, so a test like

% Hey! What happened to Monty Python examples?
% Tim: ask Guido -- it's his example!
\begin{verbatim}
>>> foo()
{"Hermione": "hippogryph", "Harry": "broomstick"}
>>>
\end{verbatim}

is vulnerable!  One workaround is to do

\begin{verbatim}
>>> foo() == {"Hermione": "hippogryph", "Harry": "broomstick"}
True
>>>
\end{verbatim}

instead.  Another is to do

\begin{verbatim}
>>> d = foo().items()
>>> d.sort()
>>> d
[('Harry', 'broomstick'), ('Hermione', 'hippogryph')]
\end{verbatim}

There are others, but you get the idea.

Another bad idea is to print things that embed an object address, like

\begin{verbatim}
>>> id(1.0) # certain to fail some of the time
7948648
>>>
\end{verbatim}

Floating-point numbers are also subject to small output variations across
platforms, because Python defers to the platform C library for float
formatting, and C libraries vary widely in quality here.

\begin{verbatim}
>>> 1./7  # risky
0.14285714285714285
>>> print 1./7 # safer
0.142857142857
>>> print round(1./7, 6) # much safer
0.142857
\end{verbatim}

Numbers of the form \code{I/2.**J} are safe across all platforms, and I
often contrive doctest examples to produce numbers of that form:

\begin{verbatim}
>>> 3./4  # utterly safe
0.75
\end{verbatim}

Simple fractions are also easier for people to understand, and that makes
for better documentation.

\item Be careful if you have code that must only execute once.

If you have module-level code that must only execute once, a more foolproof
definition of \function{_test()} is

\begin{verbatim}
def _test():
    import doctest, sys
    doctest.testmod()
\end{verbatim}

\item WYSIWYG isn't always the case, starting in Python 2.3.  The
  string form of boolean results changed from \code{'0'} and
  \code{'1'} to \code{'False'} and \code{'True'} in Python 2.3.
  This makes it clumsy to write a doctest showing boolean results that
  passes under multiple versions of Python.  In Python 2.3, by default,
  and as a special case, if an expected output block consists solely
  of \code{'0'} and the actual output block consists solely of
  \code{'False'}, that's accepted as an exact match, and similarly for
  \code{'1'} versus \code{'True'}.  This behavior can be turned off by
  passing the new (in 2.3) module constant
  \constant{DONT_ACCEPT_TRUE_FOR_1} as the value of \function{testmod()}'s
  new (in 2.3) optional \var{optionflags} argument.  Some years after
  the integer spellings of booleans are history, this hack will
  probably be removed again.

\end{enumerate}


\subsection{Soapbox}

The first word in ``doctest'' is ``doc,'' and that's why the author
wrote \refmodule{doctest}: to keep documentation up to date.  It so
happens that \refmodule{doctest} makes a pleasant unit testing
environment, but that's not its primary purpose.

Choose docstring examples with care.  There's an art to this that
needs to be learned---it may not be natural at first.  Examples should
add genuine value to the documentation.  A good example can often be
worth many words.  If possible, show just a few normal cases, show
endcases, show interesting subtle cases, and show an example of each
kind of exception that can be raised.  You're probably testing for
endcases and subtle cases anyway in an interactive shell:
\refmodule{doctest} wants to make it as easy as possible to capture
those sessions, and will verify they continue to work as designed
forever after.

If done with care, the examples will be invaluable for your users, and
will pay back the time it takes to collect them many times over as the
years go by and things change.  I'm still amazed at how often one of
my \refmodule{doctest} examples stops working after a ``harmless''
change.

For exhaustive testing, or testing boring cases that add no value to the
docs, define a \code{__test__} dict instead.  That's what it's for.