author | Benjamin Peterson <benjamin@python.org> | 2017-12-09 18:26:52 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2017-12-09 18:26:52 (GMT) |
commit | 42aa93b8ff2f7879282b06efc73a31ec7785e602 (patch) | |
tree | 92ee301e1f487a7f5aa8ec78a36ebc50d21d6ec9 /Doc/reference/import.rst | |
parent | 28d8d14013ade0657fed4673f5fa3c08eb2b1944 (diff) | |
closes bpo-31650: PEP 552 (Deterministic pycs) implementation (#4575)

Python now supports checking bytecode cache up-to-dateness with a hash of the
source contents rather than volatile source metadata. See the PEP for details.

While a fairly straightforward idea, quite a lot of code had to be modified due
to the pervasiveness of pyc implementation details in the codebase. Changes in
this commit include:

- The core changes to importlib to understand how to read, validate, and
  regenerate hash-based pycs.
- Support for generating hash-based pycs in py_compile and compileall.
- Modifications to our siphash implementation to support passing a custom
  key. We then expose it to importlib through _imp.
- Updates to all places in the interpreter, standard library, and tests that
  manually generate or parse pyc files to grok the new format.
- Support in the interpreter command line code for long options like
  --check-hash-based-pycs.
- Tests and documentation for all of the above.
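
As a rough illustration of the py_compile support listed above, the sketch
below compiles a source file to a hash-based pyc and reads back the 16-byte
header that PEP 552 describes (magic number, flags word, then 8 bytes holding
the source hash). It assumes Python 3.7 or later. The helper names
`compile_hash_based` and `read_pyc_header` are hypothetical and are not part
of this commit.

```python
import importlib.util
import py_compile


def compile_hash_based(source_path, checked=True):
    """Compile source_path to a hash-based pyc and return the pyc path.

    Hypothetical helper: it only wraps py_compile.compile() with one of
    the two hash-based invalidation modes.
    """
    if checked:
        mode = py_compile.PycInvalidationMode.CHECKED_HASH
    else:
        mode = py_compile.PycInvalidationMode.UNCHECKED_HASH
    return py_compile.compile(source_path, doraise=True, invalidation_mode=mode)


def read_pyc_header(pyc_path):
    """Return (magic, flags, payload) from the 16-byte pyc header."""
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    magic = header[:4]
    # Bit 0 of the flags word marks a hash-based pyc; bit 1 marks it as checked.
    flags = int.from_bytes(header[4:8], "little")
    # For hash-based pycs the last 8 bytes are the hash of the source bytes.
    payload = header[8:16]
    return magic, flags, payload
```

For a checked hash-based pyc, the flags word should have both low bits set,
and the payload should equal `importlib.util.source_hash()` applied to the
source bytes.
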
Diffstat (limited to 'Doc/reference/import.rst')
-rw-r--r-- | Doc/reference/import.rst | 27 |
1 files changed, 27 insertions, 0 deletions
diff --git a/Doc/reference/import.rst b/Doc/reference/import.rst
index 881e0ae..45d4172 100644
--- a/Doc/reference/import.rst
+++ b/Doc/reference/import.rst
@@ -675,6 +675,33 @@ Here are the exact rules used:
    :meth:`~importlib.abc.Loader.module_repr` method, if defined, before
    trying either approach described above. However, the method is deprecated.
 
+.. _pyc-invalidation:
+
+Cached bytecode invalidation
+----------------------------
+
+Before Python loads cached bytecode from ``.pyc`` file, it checks whether the
+cache is up-to-date with the source ``.py`` file. By default, Python does this
+by storing the source's last-modified timestamp and size in the cache file when
+writing it. At runtime, the import system then validates the cache file by
+checking the stored metadata in the cache file against at source's
+metadata.
+
+Python also supports "hash-based" cache files, which store a hash of the source
+file's contents rather than its metadata. There are two variants of hash-based
+``.pyc`` files: checked and unchecked. For checked hash-based ``.pyc`` files,
+Python validates the cache file by hashing the source file and comparing the
+resulting hash with the hash in the cache file. If a checked hash-based cache
+file is found to be invalid, Python regenerates it and writes a new checked
+hash-based cache file. For unchecked hash-based ``.pyc`` files, Python simply
+assumes the cache file is valid if it exists. Hash-based ``.pyc`` files
+validation behavior may be overridden with the :option:`--check-hash-based-pycs`
+flag.
+
+.. versionchanged:: 3.7
+   Added hash-based ``.pyc`` files. Previously, Python only supported
+   timestamp-based invalidation of bytecode caches.
+
 The Path Based Finder
 =====================
 
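
The validation rules that the new documentation section describes can be
sketched in a few lines of Python. This is a simplified illustration, not the
importlib implementation: it assumes Python 3.7 or later and the 16-byte
header layout from PEP 552. The function name `pyc_is_valid` and its
`check_unchecked` parameter are hypothetical. The parameter only loosely
mirrors forcing validation of unchecked pycs, and the "never validate" setting
is not modelled.

```python
import importlib.util
import os


def pyc_is_valid(pyc_path, source_path, check_unchecked=False):
    """Return True if the pyc header is consistent with the source file.

    Simplified sketch of the rules described in the documentation text.
    """
    with open(pyc_path, "rb") as f:
        header = f.read(16)
    if len(header) < 16 or header[:4] != importlib.util.MAGIC_NUMBER:
        return False

    flags = int.from_bytes(header[4:8], "little")
    hash_based = bool(flags & 0b01)
    check_source = bool(flags & 0b10)

    if hash_based:
        if not (check_source or check_unchecked):
            # Unchecked hash-based pyc: assumed valid because it exists.
            return True
        with open(source_path, "rb") as f:
            source_bytes = f.read()
        # Checked: compare the stored hash with a fresh hash of the source.
        return header[8:16] == importlib.util.source_hash(source_bytes)

    # Timestamp-based pyc: compare stored mtime and size with the source's.
    st = os.stat(source_path)
    stored_mtime = int.from_bytes(header[8:12], "little")
    stored_size = int.from_bytes(header[12:16], "little")
    return (stored_mtime == (int(st.st_mtime) & 0xFFFFFFFF)
            and stored_size == (st.st_size & 0xFFFFFFFF))
```

With this sketch, editing the source makes a checked hash-based pyc invalid,
while an unchecked one is still accepted unless `check_unchecked=True` is
passed.
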