author | Erlend E. Aasland <erlend.aasland@protonmail.com> | 2022-10-27 13:06:48 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-10-27 13:06:48 (GMT) |
commit | 723ebe76e787cfa6b08cc9587dd679f3234a1025 (patch) | |
tree | 6e46cce9595e7411b5dfdf83fb05b6536baf325b /Doc/howto | |
parent | 22863df7ca5f9cd01a40ab3dce3d067ec5666081 (diff) | |
gh-96143: Improve perf profiler docs (#96445)
Diffstat (limited to 'Doc/howto')
-rw-r--r-- | Doc/howto/perf_profiling.rst | 69 |
1 files changed, 38 insertions, 31 deletions
diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index 387fb3f..ad2eb7b 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler

 :author: Pablo Galindo

-The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
-obtain information about the performance of your application. ``perf`` also has
-a very vibrant ecosystem of tools that aid with the analysis of the data that it
-produces.
+`The Linux perf profiler <https://perf.wiki.kernel.org>`_
+is a very powerful tool that allows you to profile and obtain
+information about the performance of your application.
+``perf`` also has a very vibrant ecosystem of tools
+that aid with the analysis of the data that it produces.

 The main problem with using the ``perf`` profiler with Python applications is that
 ``perf`` only allows to get information about native symbols, this is, the names of
@@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
 relationship between this piece of code and the associated Python function using
 `perf map files`_.

-.. warning::
+.. note::

     Support for the ``perf`` profiler is only currently available for Linux on
     selected architectures. Check the output of the configure build step or
@@ -51,11 +52,11 @@ For example, consider the following script:
     if __name__ == "__main__":
         baz(1000000)

-We can run perf to sample CPU stack traces at 9999 Hertz:
+We can run ``perf`` to sample CPU stack traces at 9999 Hertz::

     $ perf record -F 9999 -g -o perf.data python my_script.py

-Then we can use perf report to analyze the data:
+Then we can use ``perf`` report to analyze the data:

 .. code-block:: shell-session

@@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
 functions use the same C function to evaluate bytecode so we cannot know which
 Python function corresponds to which bytecode-evaluating function.

-Instead, if we run the same experiment with perf support activated we get:
+Instead, if we run the same experiment with ``perf`` support enabled we get:

 .. code-block:: shell-session

@@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:



-Enabling perf profiling mode
----------------------------
+How to enable ``perf`` profiling support
+----------------------------------------

-There are two main ways to activate the perf profiling mode. If you want it to be
-active since the start of the Python interpreter, you can use the ``-Xperf`` option:
+``perf`` profiling support can either be enabled from the start using
+the environment variable :envvar:`PYTHONPERFSUPPORT` or the
+:option:`-X perf <-X>` option,
+or dynamically using :func:`sys.activate_stack_trampoline` and
+:func:`sys.deactivate_stack_trampoline`.

-    $ python -Xperf my_script.py
+The :mod:`!sys` functions take precedence over the :option:`!-X` option,
+the :option:`!-X` option takes precedence over the environment variable.

-You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
-profiling mode globally.
+Example, using the environment variable::

-There is also support for dynamically activating and deactivating the perf
-profiling mode by using the APIs in the :mod:`sys` module:
+    $ PYTHONPERFSUPPORT=1
+    $ python script.py
+    $ perf report -g -i perf.data

-.. code-block:: python
-
-    import sys
-    sys.activate_stack_trampoline("perf")
+Example, using the :option:`!-X` option::

-    # Run some code with Perf profiling active
+    $ python -X perf script.py
+    $ perf report -g -i perf.data

-    sys.deactivate_stack_trampoline()
+Example, using the :mod:`sys` APIs in file :file:`example.py`:

-    # Perf profiling is not active anymore
+.. code-block:: python

-These APIs can be handy if you want to activate/deactivate profiling mode in
-response to a signal or other communication mechanism with your process.
+    import sys

+    sys.activate_stack_trampoline("perf")
+    do_profiled_stuff()
+    sys.deactivate_stack_trampoline()

-Now we can analyze the data with ``perf report``:
+    non_profiled_stuff()

-    $ perf report -g -i perf.data
+...then::

+    $ python ./example.py
+    $ perf report -g -i perf.data



 How to obtain the best results
--------------------------------
+------------------------------

 For the best results, Python should be compiled with
 ``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
 profilers to unwind using only the frame pointer and not on DWARF debug
-information. This is because as the code that is interposed to allow perf
+information. This is because as the code that is interposed to allow ``perf``
 support is dynamically generated it doesn't have any DWARF debugging information
 available.

-You can check if you system has been compiled with this flag by running:
+You can check if your system has been compiled with this flag by running::

     $ python -m sysconfig | grep 'no-omit-frame-pointer'
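
Note on the ``sys`` API example in this patch: the activate/deactivate pair can be wrapped so that the trampoline is always switched off again, even if the profiled code raises. The following is a minimal sketch, not part of the commit. It assumes an interpreter that provides ``sys.activate_stack_trampoline`` and ``sys.deactivate_stack_trampoline``, and it falls back to running unprofiled when they are missing or the build lacks ``perf`` support. The names ``perf_trampoline`` and ``busy_work`` are hypothetical helpers introduced only for illustration.

```python
import contextlib
import sys


@contextlib.contextmanager
def perf_trampoline():
    """Hypothetical helper: enable the perf trampoline for one block.

    Yields True if the trampoline was activated, False otherwise.
    """
    activate = getattr(sys, "activate_stack_trampoline", None)
    deactivate = getattr(sys, "deactivate_stack_trampoline", None)
    enabled = False
    if activate is not None and deactivate is not None:
        try:
            activate("perf")
            enabled = True
        except Exception:
            # Assumption: activation fails on builds or platforms without
            # perf support. The exact exception type is not relied upon.
            enabled = False
    try:
        yield enabled
    finally:
        # Only deactivate what this helper activated itself.
        if enabled:
            deactivate()


def busy_work(n):
    """Stand-in for the code to be profiled."""
    return sum(i * i for i in range(n))


with perf_trampoline() as enabled:
    result = busy_work(1000)

print("trampoline enabled:", enabled)
print("result:", result)
```

As in the patch's own example, the script would still need to be run under ``perf record`` for any samples to be collected; the helper only controls when Python function names are visible to ``perf``.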