:mod:`__main__` --- Top-level code environment
==============================================

.. module:: __main__
   :synopsis: The environment where top-level code is run. Covers command-line
              interfaces, import-time behavior, and ``__name__ == '__main__'``.

--------------

In Python, the special name ``__main__`` is used for two important constructs:

1. the name of the top-level environment of the program, which can be
   checked using the ``__name__ == '__main__'`` expression; and
2. the ``__main__.py`` file in Python packages.

Both of these mechanisms are related to Python modules: how users interact with
them and how they interact with each other.  They are explained in detail
below.  If you're new to Python modules, see the tutorial section
:ref:`tut-modules` for an introduction.


.. _name_equals_main:

``__name__ == '__main__'``
---------------------------

When a Python module or package is imported, ``__name__`` is set to the
module's name.  Usually, this is the name of the Python file itself without the
``.py`` extension::

    >>> import configparser
    >>> configparser.__name__
    'configparser'

If the file is part of a package, ``__name__`` will also include the parent
package's path::

    >>> from concurrent.futures import process
    >>> process.__name__
    'concurrent.futures.process'

However, if the module is executed in the top-level code environment,
its ``__name__`` is set to the string ``'__main__'``.

What is the "top-level code environment"?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``__main__`` is the name of the environment where top-level code is run.
"Top-level code" is the first user-specified Python module that starts running.
It's "top-level" because it imports all other modules that the program needs.
Sometimes "top-level code" is called an *entry point* to the application.

The top-level code environment can be:

* the scope of an interactive prompt::

    >>> __name__
    '__main__'

* the Python module passed to the Python interpreter as a file argument:

    .. code-block:: shell-session

       $ python3 helloworld.py
       Hello, world!

* the Python module or package passed to the Python interpreter with the
  :option:`-m` argument:

    .. code-block:: shell-session

       $ python3 -m tarfile
       usage: tarfile.py [-h] [-v] (...)

* Python code read by the Python interpreter from standard input:

    .. code-block:: shell-session

       $ echo "import this" | python3
       The Zen of Python, by Tim Peters

       Beautiful is better than ugly.
       Explicit is better than implicit.
       ...

* Python code passed to the Python interpreter with the :option:`-c` argument:

    .. code-block:: shell-session

       $ python3 -c "import this"
       The Zen of Python, by Tim Peters

       Beautiful is better than ugly.
       Explicit is better than implicit.
       ...

In each of these situations, the top-level module's ``__name__`` is set to
``'__main__'``.

As a result, a module can discover whether or not it is running in the
top-level environment by checking its own ``__name__``, which allows a common
idiom for conditionally executing code when the module is not initialized from
an import statement::

    if __name__ == '__main__':
        # Execute when the module is not initialized from an import statement.
        ...
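
The launch modes listed above can be compared with a small experiment.  The
following is only a sketch under stated assumptions: the file name
``whoami.py`` and the helpers ``run_python`` and ``collect_names`` are made up
for this illustration.  It writes a one-line module to a temporary directory
and starts a fresh interpreter through :mod:`subprocess` in each mode:

.. code-block:: python

   import subprocess
   import sys
   import tempfile
   from pathlib import Path

   # Hypothetical one-line module used only for this demonstration.
   SOURCE = "print(__name__)\n"


   def run_python(args, cwd, stdin=None):
       """Start a fresh interpreter in *cwd* and return its stripped stdout."""
       result = subprocess.run(
           [sys.executable, *args],
           cwd=cwd,
           input=stdin,
           capture_output=True,
           text=True,
           check=True,
       )
       return result.stdout.strip()


   def collect_names():
       """Return the ``__name__`` seen by whoami.py under each launch mode."""
       with tempfile.TemporaryDirectory() as tmp:
           Path(tmp, "whoami.py").write_text(SOURCE)
           return {
               "file argument": run_python(["whoami.py"], tmp),
               "-m option": run_python(["-m", "whoami"], tmp),
               "standard input": run_python([], tmp, stdin=SOURCE),
               "-c option": run_python(["-c", SOURCE], tmp),
               "import": run_python(["-c", "import whoami"], tmp),
           }


   if __name__ == "__main__":
       for mode, name in collect_names().items():
           print(f"{mode}: {name}")

Every mode should report ``__main__`` except the last one, where the module is
imported by other top-level code and therefore reports ``whoami``.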

.. seealso::

   For a more detailed look at how ``__name__`` is set in all situations, see
   the tutorial section :ref:`tut-modules`.


Idiomatic Usage
^^^^^^^^^^^^^^^

Some modules contain code that is intended for script use only, like parsing
command-line arguments or fetching data from standard input.  If a module
like this was imported from a different module, for example to unit test
it, the script code would unintentionally execute as well.

This is where using the ``if __name__ == '__main__'`` code block comes in
handy. Code within this block won't run unless the module is executed in the
top-level environment.

Putting as few statements as possible in the block below ``if __name__ ==
'__main__'`` can improve code clarity and correctness. Most often, a function
named ``main`` encapsulates the program's primary behavior::

    # echo.py

    import shlex
    import sys

    def echo(phrase: str) -> None:
        """A dummy wrapper around print."""
        # for demonstration purposes, you can imagine that there is some
        # valuable and reusable logic inside this function
        print(phrase)

    def main() -> int:
        """Echo the input arguments to standard output"""
        phrase = shlex.join(sys.argv)
        echo(phrase)
        return 0

    if __name__ == '__main__':
        sys.exit(main())  # next section explains the use of sys.exit

Note that if the module didn't encapsulate code inside the ``main`` function
but instead put it directly within the ``if __name__ == '__main__'`` block,
the ``phrase`` variable would be global to the entire module.  This is
error-prone as other functions within the module could be unintentionally using
the global variable instead of a local name.  A ``main`` function solves this
problem.

Using a ``main`` function has the added benefit of the ``echo`` function itself
being isolated and importable elsewhere. When ``echo.py`` is imported, the
``echo`` and ``main`` functions will be defined, but neither of them will be
called, because ``__name__ != '__main__'``.
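
The effect of the guard on importers can be checked without a real ``echo.py``
on disk.  The sketch below is an illustration under assumptions: it writes a
trimmed copy of the example to a temporary directory, loads it with
:mod:`importlib`, and captures standard output.  The helper name
``load_echo_module`` and the marker text printed by ``main`` are hypothetical:

.. code-block:: python

   import contextlib
   import importlib.util
   import io
   import tempfile
   from pathlib import Path

   # Trimmed copy of the echo.py example; main() prints a marker so that an
   # accidental call during import would be visible.
   ECHO_SOURCE = """\
   import sys

   def echo(phrase: str) -> None:
       print(phrase)

   def main() -> int:
       echo('main was called')
       return 0

   if __name__ == '__main__':
       sys.exit(main())
   """


   def load_echo_module():
       """Import the temporary echo.py; return (module, output during import)."""
       with tempfile.TemporaryDirectory() as tmp:
           path = Path(tmp, "echo.py")
           path.write_text(ECHO_SOURCE)
           spec = importlib.util.spec_from_file_location("echo", path)
           module = importlib.util.module_from_spec(spec)
           captured = io.StringIO()
           with contextlib.redirect_stdout(captured):
               spec.loader.exec_module(module)  # runs the top-level code
           return module, captured.getvalue()


   module, import_output = load_echo_module()
   print(repr(module.__name__))   # 'echo', not '__main__'
   print(repr(import_output))     # '' because the guarded block was skipped

   captured = io.StringIO()
   with contextlib.redirect_stdout(captured):
       module.echo("hello")       # the reusable function is still callable
   print(repr(captured.getvalue()))  # 'hello\n'

This is the same property that lets a unit test import the module and call
``echo`` directly without triggering the command-line behavior.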


Packaging Considerations
^^^^^^^^^^^^^^^^^^^^^^^^

``main`` functions are often used to create command-line tools by specifying
them as entry points for console scripts.  When this is done,
`pip <https://pip.pypa.io/>`_ inserts the function call into a template script,
where the return value of ``main`` is passed into :func:`sys.exit`.
For example::

    sys.exit(main())

Since the call to ``main`` is wrapped in :func:`sys.exit`, the expectation is
that your function will return some value acceptable as an input to
:func:`sys.exit`; typically, an integer or ``None`` (which is implicitly
returned if your function does not have a return statement).

By proactively following this convention ourselves, our module will have the
same behavior when run directly (i.e. ``python3 echo.py``) as it will have if
we later package it as a console script entry-point in a pip-installable
package.

In particular, be careful about returning strings from your ``main`` function.
:func:`sys.exit` will interpret a string argument as a failure message, so
your program will have an exit code of ``1``, indicating failure, and the
string will be written to :data:`sys.stderr`.  The ``echo.py`` example from
earlier exemplifies using the ``sys.exit(main())`` convention.
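
How :func:`sys.exit` treats different return values can be observed from a
parent process.  The following is a minimal sketch, not a recommended
pattern: the helper ``run_exit`` is a name invented for this illustration, and
it runs each call in a child interpreter so the exit status and the text on
standard error can be inspected:

.. code-block:: python

   import subprocess
   import sys


   def run_exit(argument_source):
       """Run ``sys.exit(<argument_source>)`` in a child interpreter.

       Returns the child's exit status and whatever it wrote to stderr.
       """
       code = f"import sys; sys.exit({argument_source})"
       result = subprocess.run(
           [sys.executable, "-c", code],
           capture_output=True,
           text=True,
       )
       return result.returncode, result.stderr.strip()


   print(run_exit("None"))            # (0, '')
   print(run_exit("0"))               # (0, '')
   print(run_exit("2"))               # (2, '')
   print(run_exit("'no such file'"))  # (1, 'no such file')

``None`` and ``0`` both signal success, another integer is used as the exit
status unchanged, and a string produces status ``1`` with the string written
to standard error.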

.. seealso::

   `Python Packaging User Guide <https://packaging.python.org/>`_
   contains a collection of tutorials and references on how to distribute and
   install Python packages with modern tools.


``__main__.py`` in Python Packages
----------------------------------

If you are not familiar with Python packages, see section :ref:`tut-packages`
of the tutorial.  Most commonly, the ``__main__.py`` file is used to provide
a command-line interface for a package. Consider the following hypothetical
package, "bandclass":

.. code-block:: text

   bandclass
     ├── __init__.py
     ├── __main__.py
     └── student.py

``__main__.py`` will be executed when the package itself is invoked
directly from the command line using the :option:`-m` flag. For example:

.. code-block:: shell-session

   $ python3 -m bandclass

This command will cause ``__main__.py`` to run. How you utilize this mechanism
will depend on the nature of the package you are writing, but in this
hypothetical case, it might make sense to allow the teacher to search for
students::

    # bandclass/__main__.py

    import sys
    from .student import search_students

    student_name = sys.argv[1] if len(sys.argv) >= 2 else ''
    print(f'Found student: {search_students(student_name)}')

Note that ``from .student import search_students`` is an example of a relative
import.  This import style can be used when referencing modules within a
package.  For more details, see :ref:`intra-package-references` in the
:ref:`tut-modules` section of the tutorial.

Idiomatic Usage
^^^^^^^^^^^^^^^

The content of ``__main__.py`` typically isn't fenced with an
``if __name__ == '__main__'`` block.  Instead, those files are kept short and
import functions to execute from other modules.  Those other modules can then be
easily unit-tested and are properly reusable.
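
One way to picture such a thin ``__main__.py`` is the following sketch.  It is
an illustration only: the package name ``greeter``, its ``cli`` module, and
the helper ``run_package`` are all hypothetical.  The script builds the
package in a temporary directory and runs it with ``-m`` in a child
interpreter:

.. code-block:: python

   import subprocess
   import sys
   import tempfile
   from pathlib import Path

   # Hypothetical package layout used only for this sketch:
   #   greeter/__init__.py   (empty)
   #   greeter/cli.py        (holds the testable logic)
   #   greeter/__main__.py   (thin launcher, no __name__ check)
   FILES = {
       "__init__.py": "",
       "cli.py": (
           "import sys\n"
           "\n"
           "def main(argv=None):\n"
           "    args = sys.argv[1:] if argv is None else argv\n"
           "    print('Hello,', args[0] if args else 'world')\n"
           "    return 0\n"
       ),
       "__main__.py": (
           "import sys\n"
           "from .cli import main\n"
           "\n"
           "sys.exit(main())\n"
       ),
   }


   def run_package(*args):
       """Create the package in a temporary directory and run it with -m."""
       with tempfile.TemporaryDirectory() as tmp:
           package_dir = Path(tmp, "greeter")
           package_dir.mkdir()
           for name, source in FILES.items():
               (package_dir / name).write_text(source)
           result = subprocess.run(
               [sys.executable, "-m", "greeter", *args],
               cwd=tmp,
               capture_output=True,
               text=True,
           )
           return result.returncode, result.stdout.strip()


   print(run_package())            # (0, 'Hello, world')
   print(run_package("Dinsdale"))  # (0, 'Hello, Dinsdale')

Because all logic lives in ``cli.main``, a test can import ``greeter.cli`` and
call ``main(['Dinsdale'])`` directly, while ``__main__.py`` stays a short
launcher.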

If used, an ``if __name__ == '__main__'`` block will still work as expected
for a ``__main__.py`` file within a package, because its ``__name__``
attribute will include the package's path if imported::

    >>> import asyncio.__main__
    >>> asyncio.__main__.__name__
    'asyncio.__main__'

This won't work for ``__main__.py`` files in the root directory of a ``.zip``
file though.  Hence, for consistency, a minimal ``__main__.py`` without a
``__name__`` check, like the :mod:`venv` one mentioned below, is preferred.

.. seealso::

   See :mod:`venv` for an example of a package with a minimal ``__main__.py``
   in the standard library. It doesn't contain an ``if __name__ == '__main__'``
   block. You can invoke it with ``python -m venv [directory]``.

   See :mod:`runpy` for more details on the :option:`-m` flag to the
   interpreter executable.

   See :mod:`zipapp` for how to run applications packaged as *.zip* files. In
   this case Python looks for a ``__main__.py`` file in the root directory of
   the archive.



``import __main__``
-------------------

Regardless of which module a Python program was started with, other modules
running within that same program can import the top-level environment's scope
(:term:`namespace`) by importing the ``__main__`` module.  This doesn't import
a ``__main__.py`` file but rather whichever module received the special
name ``'__main__'``.

Here is an example module that consumes the ``__main__`` namespace::

    # namely.py

    import __main__

    def did_user_define_their_name():
        return 'my_name' in dir(__main__)

    def print_user_name():
        if not did_user_define_their_name():
            raise ValueError('Define the variable `my_name`!')

        if '__file__' in dir(__main__):
            print(__main__.my_name, "found in file", __main__.__file__)
        else:
            print(__main__.my_name)

Example usage of this module could be as follows::

    # start.py

    import sys

    from namely import print_user_name

    # my_name = "Dinsdale"

    def main():
        try:
            print_user_name()
        except ValueError as ve:
            return str(ve)

    if __name__ == "__main__":
        sys.exit(main())

Now, if we started our program, the result would look like this:

.. code-block:: shell-session

   $ python3 start.py
   Define the variable `my_name`!

The exit code of the program would be 1, indicating an error. Uncommenting the
line with ``my_name = "Dinsdale"`` fixes the program and now it exits with
status code 0, indicating success:

.. code-block:: shell-session

   $ python3 start.py
   Dinsdale found in file /path/to/start.py

Note that importing ``__main__`` doesn't cause any issues with unintentionally
running top-level code meant for script use which is put in the
``if __name__ == "__main__"`` block of the ``start`` module. Why does this work?

Python inserts an empty ``__main__`` module in :data:`sys.modules` at
interpreter startup, and populates it by running top-level code. In our example
this is the ``start`` module which runs line by line and imports ``namely``.
In turn, ``namely`` imports ``__main__`` (which is really ``start``). That's an
import cycle! Fortunately, since the partially populated ``__main__``
module is present in :data:`sys.modules`, Python passes that to ``namely``.
See :ref:`Special considerations for __main__ <import-dunder-main>` in the
import system's reference for details on how this works.
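
The role of :data:`sys.modules` described above can be verified directly.  The
snippet below is a small sketch rather than a description of the import
system's internals; the helper ``defined_in_main`` is a name invented for the
illustration:

.. code-block:: python

   import sys
   import __main__

   # The name ``__main__`` resolves to the module object already stored in
   # sys.modules, so importing it does not re-execute the top-level code.
   print(sys.modules["__main__"] is __main__)  # True
   print(__main__.__name__)                    # __main__


   def defined_in_main(name):
       """Report whether the top-level environment defines *name*."""
       return hasattr(__main__, name)


   print(defined_in_main("__name__"))  # True

The same check works from any imported module, which is how ``namely`` can
inspect names defined by ``start`` while ``start`` is still running.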

The Python REPL is another example of a "top-level environment", so anything
defined in the REPL becomes part of the ``__main__`` scope::

    >>> import namely
    >>> namely.did_user_define_their_name()
    False
    >>> namely.print_user_name()
    Traceback (most recent call last):
    ...
    ValueError: Define the variable `my_name`!
    >>> my_name = 'Jabberwocky'
    >>> namely.did_user_define_their_name()
    True
    >>> namely.print_user_name()
    Jabberwocky

Note that in this case the ``__main__`` scope doesn't contain a ``__file__``
attribute as it's interactive.

The ``__main__`` scope is used in the implementation of :mod:`pdb` and
:mod:`rlcompleter`.