summaryrefslogtreecommitdiffstats
path: root/Doc/library/__main__.rst
diff options
context:
space:
mode:
authorJack DeVries <58614260+jdevries3133@users.noreply.github.com>2021-08-24 17:01:41 (GMT)
committerGitHub <noreply@github.com>2021-08-24 17:01:41 (GMT)
commit7cba23164cf82f6619db002cd30021b5dfb1f809 (patch)
tree21b0324f3a913384ca8c88578ff25bca3119a392 /Doc/library/__main__.rst
parenta24676bedcd332dd7e6fa5521d0449206391d190 (diff)
downloadcpython-7cba23164cf82f6619db002cd30021b5dfb1f809.zip
cpython-7cba23164cf82f6619db002cd30021b5dfb1f809.tar.gz
cpython-7cba23164cf82f6619db002cd30021b5dfb1f809.tar.bz2
bpo-39452: Rewrite and expand __main__.rst (#26883)
Broadened scope of the document to explicitly discuss and differentiate between ``__main__.py`` in packages versus the ``__name__ == '__main__'`` expression (and the idioms that surround it), as well as ``import __main__``. Co-authored-by: Géry Ogam <gery.ogam@gmail.com> Co-authored-by: Éric Araujo <merwok@netwok.org> Co-authored-by: Łukasz Langa <lukasz@langa.pl>
Diffstat (limited to 'Doc/library/__main__.rst')
-rw-r--r--Doc/library/__main__.rst377
1 files changed, 360 insertions, 17 deletions
diff --git a/Doc/library/__main__.rst b/Doc/library/__main__.rst
index a64faf1..116a9a9 100644
--- a/Doc/library/__main__.rst
+++ b/Doc/library/__main__.rst
@@ -1,25 +1,368 @@
-
-:mod:`__main__` --- Top-level script environment
-================================================
+:mod:`__main__` --- Top-level code environment
+==============================================
.. module:: __main__
- :synopsis: The environment where the top-level script is run.
+ :synopsis: The environment where top-level code is run. Covers command-line
+ interfaces, import-time behavior, and ``__name__ == '__main__'``.
--------------
-``'__main__'`` is the name of the scope in which top-level code executes.
-A module's __name__ is set equal to ``'__main__'`` when read from
-standard input, a script, or from an interactive prompt.
+In Python, the special name ``__main__`` is used for two important constructs:
+
+1. the name of the top-level environment of the program, which can be
+ checked using the ``__name__ == '__main__'`` expression; and
+2. the ``__main__.py`` file in Python packages.
+
+Both of these mechanisms are related to Python modules; how users interact with
+them and how they interact with each other. They are explained in detail
+below. If you're new to Python modules, see the tutorial section
+:ref:`tut-modules` for an introduction.
+
+
+.. _name_equals_main:
+
+``__name__ == '__main__'``
+---------------------------
+
+When a Python module or package is imported, ``__name__`` is set to the
+module's name. Usually, this is the name of the Python file itself without the
+``.py`` extension::
+
+ >>> import configparser
+ >>> configparser.__name__
+ 'configparser'
+
+If the file is part of a package, ``__name__`` will also include the parent
+package's path::
+
+ >>> from concurrent.futures import process
+ >>> process.__name__
+ 'concurrent.futures.process'
+
+However, if the module is executed in the top-level code environment,
+its ``__name__`` is set to the string ``'__main__'``.
+
+What is the "top-level code environment"?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``__main__`` is the name of the environment where top-level code is run.
+"Top-level code" is the first user-specified Python module that starts running.
+It's "top-level" because it imports all other modules that the program needs.
+Sometimes "top-level code" is called an *entry point* to the application.
+
+The top-level code environment can be:
+
+* the scope of an interactive prompt::
+
+ >>> __name__
+ '__main__'
+
+* the Python module passed to the Python interpreter as a file argument:
+
+ .. code-block:: shell-session
+
+ $ python3 helloworld.py
+ Hello, world!
+
+* the Python module or package passed to the Python interpreter with the
+ :option:`-m` argument:
+
+ .. code-block:: shell-session
+
+ $ python3 -m tarfile
+ usage: tarfile.py [-h] [-v] (...)
+
+* Python code read by the Python interpreter from standard input:
+
+ .. code-block:: shell-session
+
+ $ echo "import this" | python3
+ The Zen of Python, by Tim Peters
+
+ Beautiful is better than ugly.
+ Explicit is better than implicit.
+ ...
+
+* Python code passed to the Python interpreter with the :option:`-c` argument:
+
+ .. code-block:: shell-session
+
+ $ python3 -c "import this"
+ The Zen of Python, by Tim Peters
+
+ Beautiful is better than ugly.
+ Explicit is better than implicit.
+ ...
+
+In each of these situations, the top-level module's ``__name__`` is set to
+``'__main__'``.
+
+As a result, a module can discover whether or not it is running in the
+top-level environment by checking its own ``__name__``, which allows a common
+idiom for conditionally executing code when the module is not initialized from
+an import statement::
+
+ if __name__ == '__main__':
+ # Execute when the module is not initialized from an import statement.
+ ...
+
+.. seealso::
+
+ For a more detailed look at how ``__name__`` is set in all situations, see
+ the tutorial section :ref:`tut-modules`.
+
+
+Idiomatic Usage
+^^^^^^^^^^^^^^^
+
+Some modules contain code that is intended for script use only, like parsing
+command-line arguments or fetching data from standard input. When a module
+like this were to be imported from a different module, for example to unit test
+it, the script code would unintentionally execute as well.
+
+This is where using the ``if __name__ == '__main__'`` code block comes in
+handy. Code within this block won't run unless the module is executed in the
+top-level environment.
+
+Putting as few statements as possible in the block below ``if __name___ ==
+'__main__'`` can improve code clarity and correctness. Most often, a function
+named ``main`` encapsulates the program's primary behavior::
+
+ # echo.py
+
+ import shlex
+ import sys
+
+ def echo(phrase: str) -> None:
+ """A dummy wrapper around print."""
+ # for demonstration purposes, you can imagine that there is some
+ # valuable and reusable logic inside this function
+ print(phrase)
+
+ def main() -> int:
+ """Echo the input arguments to standard output"""
+ phrase = shlex.join(sys.argv)
+ echo(phrase)
+ return 0
+
+ if __name__ == '__main__':
+ sys.exit(main()) # next section explains the use of sys.exit
+
+Note that if the module didn't encapsulate code inside the ``main`` function
+but instead put it directly within the ``if __name__ == '__main__'`` block,
+the ``phrase`` variable would be global to the entire module. This is
+error-prone as other functions within the module could be unintentionally using
+the global variable instead of a local name. A ``main`` function solves this
+problem.
+
+Using a ``main`` function has the added benefit of the ``echo`` function itself
+being isolated and importable elsewhere. When ``echo.py`` is imported, the
+``echo`` and ``main`` functions will be defined, but neither of them will be
+called, because ``__name__ != '__main__'``.
+
+
+Packaging Considerations
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+``main`` functions are often used to create command-line tools by specifying
+them as entry points for console scripts. When this is done,
+`pip <https://pip.pypa.io/>`_ inserts the function call into a template script,
+where the return value of ``main`` is passed into :func:`sys.exit`.
+For example::
+
+ sys.exit(main())
+
+Since the call to ``main`` is wrapped in :func:`sys.exit`, the expectation is
+that your function will return some value acceptable as an input to
+:func:`sys.exit`; typically, an integer or ``None`` (which is implicitly
+returned if your function does not have a return statement).
+
+By proactively following this convention ourselves, our module will have the
+same behavior when run directly (i.e. ``python3 echo.py``) as it will have if
+we later package it as a console script entry-point in a pip-installable
+package.
+
+In particular, be careful about returning strings from your ``main`` function.
+:func:`sys.exit` will interpret a string argument as a failure message, so
+your program will have an exit code of ``1``, indicating failure, and the
+string will be written to :data:`sys.stderr`. The ``echo.py`` example from
+earlier exemplifies using the ``sys.exit(main())`` convention.
+
+.. seealso::
+
+ `Python Packaging User Guide <https://packaging.python.org/>`_
+ contains a collection of tutorials and references on how to distribute and
+ install Python packages with modern tools.
+
+
+``__main__.py`` in Python Packages
+----------------------------------
+
+If you are not familiar with Python packages, see section :ref:`tut-packages`
+of the tutorial. Most commonly, the ``__main__.py`` file is used to provide
+a command-line interface for a package. Consider the following hypothetical
+package, "bandclass":
+
+.. code-block:: text
+
+ bandclass
+ ├── __init__.py
+ ├── __main__.py
+ └── student.py
+
+``__main__.py`` will be executed when the package itself is invoked
+directly from the command line using the :option:`-m` flag. For example:
+
+.. code-block:: shell-session
+
+ $ python3 -m bandclass
+
+This command will cause ``__main__.py`` to run. How you utilize this mechanism
+will depend on the nature of the package you are writing, but in this
+hypothetical case, it might make sense to allow the teacher to search for
+students::
+
+ # bandclass/__main__.py
+
+ import sys
+ from .student import search_students
+
+ student_name = sys.argv[2] if len(sys.argv) >= 2 else ''
+ print(f'Found student: {search_students(student_name)}')
+
+Note that ``from .student import search_students`` is an example of a relative
+import. This import style must be used when referencing modules within a
+package. For more details, see :ref:`intra-package-references` in the
+:ref:`tut-modules` section of the tutorial.
+
+Idiomatic Usage
+^^^^^^^^^^^^^^^
+
+The contents of ``__main__.py`` typically isn't fenced with
+``if __name__ == '__main__'`` blocks. Instead, those files are kept short,
+functions to execute from other modules. Those other modules can then be
+easily unit-tested and are properly reusable.
+
+If used, an ``if __name__ == '__main__'`` block will still work as expected
+for a ``__main__.py`` file within a package, because its ``__name__``
+attribute will include the package's path if imported::
+
+ >>> import asyncio.__main__
+ >>> asyncio.__main__.__name__
+ 'asyncio.__main__'
+
+This won't work for ``__main__.py`` files in the root directory of a .zip file
+though. Hence, for consistency, minimal ``__main__.py`` like the :mod:`venv`
+one mentioned above are preferred.
+
+.. seealso::
+
+ See :mod:`venv` for an example of a package with a minimal ``__main__.py``
+ in the standard library. It doesn't contain a ``if __name__ == '__main__'``
+ block. You can invoke it with ``python3 -m venv [directory]``.
+
+ See :mod:`runpy` for more details on the :option:`-m` flag to the
+ interpreter executable.
+
+ See :mod:`zipapp` for how to run applications packaged as *.zip* files. In
+ this case Python looks for a ``__main__.py`` file in the root directory of
+ the archive.
+
+
+
+``import __main__``
+-------------------
+
+Regardless of which module a Python program was started with, other modules
+running within that same program can import the top-level environment's scope
+(:term:`namespace`) by importing the ``__main__`` module. This doesn't import
+a ``__main__.py`` file but rather whichever module that received the special
+name ``'__main__'``.
+
+Here is an example module that consumes the ``__main__`` namespace::
+
+ # namely.py
+
+ import __main__
+
+ def did_user_define_their_name():
+ return 'my_name' in dir(__main__)
+
+ def print_user_name():
+ if not did_user_define_their_name():
+ raise ValueError('Define the variable `my_name`!')
+
+ if '__file__' in dir(__main__):
+ print(__main__.my_name, "found in file", __main__.__file__)
+ else:
+ print(__main__.my_name)
+
+Example usage of this module could be as follows::
+
+ # start.py
+
+ import sys
+
+ from namely import print_user_name
+
+ # my_name = "Dinsdale"
+
+ def main():
+ try:
+ print_user_name()
+ except ValueError as ve:
+ return str(ve)
+
+ if __name__ == "__main__":
+ sys.exit(main())
+
+Now, if we started our program, the result would look like this:
+
+.. code-block:: shell-session
+
+ $ python3 start.py
+ Define the variable `my_name`!
+
+The exit code of the program would be 1, indicating an error. Uncommenting the
+line with ``my_name = "Dinsdale"`` fixes the program and now it exits with
+status code 0, indicating success:
+
+.. code-block:: shell-session
+
+ $ python3 start.py
+ Dinsdale found in file /path/to/start.py
+
+Note that importing ``__main__`` doesn't cause any issues with unintentionally
+running top-level code meant for script use which is put in the
+``if __name__ == "__main__"`` block of the ``start`` module. Why does this work?
+
+Python inserts an empty ``__main__`` module in :attr:`sys.modules` at
+interpreter startup, and populates it by running top-level code. In our example
+this is the ``start`` module which runs line by line and imports ``namely``.
+In turn, ``namely`` imports ``__main__`` (which is really ``start``). That's an
+import cycle! Fortunately, since the partially populated ``__main__``
+module is present in :attr:`sys.modules`, Python passes that to ``namely``.
+See :ref:`Special considerations for __main__ <import-dunder-main>` in the
+import system's reference for details on how this works.
+
+The Python REPL is another example of a "top-level environment", so anything
+defined in the REPL becomes part of the ``__main__`` scope::
-A module can discover whether or not it is running in the main scope by
-checking its own ``__name__``, which allows a common idiom for conditionally
-executing code in a module when it is run as a script or with ``python
--m`` but not when it is imported::
+ >>> import namely
+ >>> namely.did_user_define_their_name()
+ False
+ >>> namely.print_user_name()
+ Traceback (most recent call last):
+ ...
+ ValueError: Define the variable `my_name`!
+ >>> my_name = 'Jabberwocky'
+ >>> namely.did_user_define_their_name()
+ True
+ >>> namely.print_user_name()
+ Jabberwocky
- if __name__ == "__main__":
- # execute only if run as a script
- main()
+Note that in this case the ``__main__`` scope doesn't contain a ``__file__``
+attribute as it's interactive.
-For a package, the same effect can be achieved by including a
-``__main__.py`` module, the contents of which will be executed when the
-module is run with ``-m``.
+The ``__main__`` scope is used in the implementation of :mod:`pdb` and
+:mod:`rlcompleter`.