author: Erlend E. Aasland <erlend.aasland@protonmail.com>, 2022-10-27 13:06:48 (GMT)
committer: GitHub <noreply@github.com>, 2022-10-27 13:06:48 (GMT)
commit: 723ebe76e787cfa6b08cc9587dd679f3234a1025
tree: 6e46cce9595e7411b5dfdf83fb05b6536baf325b /Doc/howto
parent: 22863df7ca5f9cd01a40ab3dce3d067ec5666081
gh-96143: Improve perf profiler docs (#96445)
Diffstat (limited to 'Doc/howto')
-rw-r--r-- Doc/howto/perf_profiling.rst 69
1 files changed, 38 insertions, 31 deletions
diff --git a/Doc/howto/perf_profiling.rst b/Doc/howto/perf_profiling.rst
index 387fb3f..ad2eb7b 100644
--- a/Doc/howto/perf_profiling.rst
+++ b/Doc/howto/perf_profiling.rst
@@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler
:author: Pablo Galindo
-The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
-obtain information about the performance of your application. ``perf`` also has
-a very vibrant ecosystem of tools that aid with the analysis of the data that it
-produces.
+`The Linux perf profiler <https://perf.wiki.kernel.org>`_
+is a very powerful tool that allows you to profile and obtain
+information about the performance of your application.
+``perf`` also has a very vibrant ecosystem of tools
+that aid with the analysis of the data that it produces.
The main problem with using the ``perf`` profiler with Python applications is that
``perf`` only allows to get information about native symbols, this is, the names of
@@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
relationship between this piece of code and the associated Python function using
`perf map files`_.
-.. warning::
+.. note::
Support for the ``perf`` profiler is only currently available for Linux on
selected architectures. Check the output of the configure build step or
@@ -51,11 +52,11 @@ For example, consider the following script:
if __name__ == "__main__":
baz(1000000)
-We can run perf to sample CPU stack traces at 9999 Hertz:
+We can run ``perf`` to sample CPU stack traces at 9999 Hertz::
$ perf record -F 9999 -g -o perf.data python my_script.py
-Then we can use perf report to analyze the data:
+Then we can use ``perf report`` to analyze the data:
.. code-block:: shell-session
@@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
bytecode-evaluating function.
-Instead, if we run the same experiment with perf support activated we get:
+Instead, if we run the same experiment with ``perf`` support enabled we get:
.. code-block:: shell-session
@@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:
-Enabling perf profiling mode
-----------------------------
+How to enable ``perf`` profiling support
+----------------------------------------
-There are two main ways to activate the perf profiling mode. If you want it to be
-active since the start of the Python interpreter, you can use the ``-Xperf`` option:
+``perf`` profiling support can either be enabled from the start using
+the environment variable :envvar:`PYTHONPERFSUPPORT` or the
+:option:`-X perf <-X>` option,
+or dynamically using :func:`sys.activate_stack_trampoline` and
+:func:`sys.deactivate_stack_trampoline`.
- $ python -Xperf my_script.py
+The :mod:`!sys` functions take precedence over the :option:`!-X` option,
+and the :option:`!-X` option takes precedence over the environment variable.
-You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
-profiling mode globally.
+Example, using the environment variable::
-There is also support for dynamically activating and deactivating the perf
-profiling mode by using the APIs in the :mod:`sys` module:
+ $ export PYTHONPERFSUPPORT=1
+ $ python script.py
+ $ perf report -g -i perf.data
-.. code-block:: python
-
- import sys
- sys.activate_stack_trampoline("perf")
+Example, using the :option:`!-X` option::
- # Run some code with Perf profiling active
+ $ python -X perf script.py
+ $ perf report -g -i perf.data
- sys.deactivate_stack_trampoline()
+Example, using the :mod:`sys` APIs in file :file:`example.py`:
- # Perf profiling is not active anymore
+.. code-block:: python
-These APIs can be handy if you want to activate/deactivate profiling mode in
-response to a signal or other communication mechanism with your process.
+ import sys
+ sys.activate_stack_trampoline("perf")
+ do_profiled_stuff()
+ sys.deactivate_stack_trampoline()
+ non_profiled_stuff()
-Now we can analyze the data with ``perf report``:
+...then::
- $ perf report -g -i perf.data
+ $ python ./example.py
+ $ perf report -g -i perf.data
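Editorial sketch (not part of the patch): the dynamic activation shown in the ``example.py`` hunk can be wrapped in a context manager so that deactivation happens even if the profiled code raises. The helper name ``perf_trampoline`` is hypothetical. The sketch assumes ``sys.activate_stack_trampoline`` may be missing on older interpreters or may fail on unsupported platforms; because the text does not say which exception is raised in that case, it catches ``Exception`` broadly and simply reports that profiling is inactive.

```python
import contextlib
import sys


@contextlib.contextmanager
def perf_trampoline():
    """Enable perf support for the enclosed block when the interpreter allows it.

    Yields True if the trampoline was activated, False otherwise.
    """
    # The function is looked up defensively: it may not exist on this interpreter.
    activate = getattr(sys, "activate_stack_trampoline", None)
    activated = False
    if activate is not None:
        try:
            activate("perf")
            activated = True
        except Exception:
            # Assumption: unsupported builds or platforms signal failure by raising.
            activated = False
    try:
        yield activated
    finally:
        # Only undo what this helper actually turned on.
        if activated:
            sys.deactivate_stack_trampoline()


if __name__ == "__main__":
    with perf_trampoline() as active:
        total = sum(range(1000))  # stand-in for do_profiled_stuff()
    print("perf trampoline active during block:", active)
```

As with the patch's own example, the script still has to be run under ``perf record`` for any samples to be collected; the helper only controls when Python function names are made visible to ``perf``.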
How to obtain the best results
--------------------------------
+------------------------------
For the best results, Python should be compiled with
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
profilers to unwind using only the frame pointer and not on DWARF debug
-information. This is because as the code that is interposed to allow perf
+information. This is because as the code that is interposed to allow ``perf``
support is dynamically generated it doesn't have any DWARF debugging information
available.
-You can check if you system has been compiled with this flag by running:
+You can check if your system has been compiled with this flag by running::
$ python -m sysconfig | grep 'no-omit-frame-pointer'
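Editorial sketch (not part of the patch): the same check can be done from Python with the :mod:`sysconfig` module instead of ``grep``. The helper names below are hypothetical, and the list of build variables inspected (``CFLAGS``, ``PY_CFLAGS``, ``PY_CORE_CFLAGS``, ``CONFIGURE_CFLAGS``) is an assumption about where the flag is likely to be recorded; variables that are not defined on a given platform are treated as empty.

```python
import sysconfig


def has_frame_pointer_flags(cflags):
    """Return True if the given flag string contains -fno-omit-frame-pointer."""
    flags = (cflags or "").split()
    return "-fno-omit-frame-pointer" in flags


def interpreter_cflags():
    """Join the compiler-flag variables recorded when this interpreter was built."""
    # Assumption: these are the variables worth checking; missing ones yield None.
    names = ("CFLAGS", "PY_CFLAGS", "PY_CORE_CFLAGS", "CONFIGURE_CFLAGS")
    return " ".join(str(sysconfig.get_config_var(name) or "") for name in names)


if __name__ == "__main__":
    if has_frame_pointer_flags(interpreter_cflags()):
        print("Interpreter appears to be built with frame pointers.")
    else:
        print("No -fno-omit-frame-pointer flag found in the build configuration.")
```

This only reports what the build configuration recorded; it does not prove that every loaded extension module was compiled with the same flags.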