author | Mark Shannon <mark@hotpy.org> | 2022-03-11 14:29:10 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-03-11 14:29:10 (GMT) |
commit | 4f74ffc5e333a9ca931153f4844c8c785b1362a3 (patch) | |
tree | b09cfe5b6fe61e5e9dccee204e26a18c3499dba9 /Python/adaptive.md | |
parent | 54ab9ad312ea53db40e31712454272e1d4c0315f (diff) | |
Update adaptive.md for inline caching (GH-31817)
Diffstat (limited to 'Python/adaptive.md')
-rw-r--r-- | Python/adaptive.md | 29 |
1 files changed, 20 insertions, 9 deletions
diff --git a/Python/adaptive.md b/Python/adaptive.md
index 81880ce..e8161bc 100644
--- a/Python/adaptive.md
+++ b/Python/adaptive.md
@@ -14,16 +14,16 @@ A family of instructions has the following fundamental properties:
   it executes the non-adaptive instruction.
 * It has at least one specialized form of the instruction that is tailored
   for a particular value or set of values at runtime.
-* All members of the family have access to the same number of cache entries.
-  Individual family members do not need to use all of the entries.
+* All members of the family must have the same number of inline cache entries,
+  to ensure correct execution.
+  Individual family members do not need to use all of the entries,
+  but must skip over any unused entries when executing.
 
 The current implementation also requires the following,
 although these are not fundamental and may change:
 
-* If a family uses one or more entries, then the first entry must be a
-  `_PyAdaptiveEntry` entry.
-* If a family uses no cache entries, then the `oparg` is used as the
-  counter for the adaptive instruction.
+* All families uses one or more inline cache entries,
+  the first entry is always the counter.
 * All instruction names should start with the name of the non-adaptive
   instruction.
 * The adaptive instruction should end in `_ADAPTIVE`.
@@ -76,6 +76,10 @@ keeping `Ti` low which means minimizing branches and dependent memory
 accesses (pointer chasing). These two objectives may be in conflict,
 requiring judgement and experimentation to design the family of instructions.
 
+The size of the inline cache should as small as possible,
+without impairing performance, to reduce the number of
+`EXTENDED_ARG` jumps, and to reduce pressure on the CPU's data cache.
+
 ### Gathering data
 
 Before choosing how to specialize an instruction, it is important to gather
@@ -106,7 +110,7 @@ This can be tested quickly:
 * `globals->keys->dk_version == expected_version`
 
 and the operation can be performed quickly:
-* `value = globals->keys->entries[index].value`.
+* `value = entries[cache->index].me_value;`.
 
 Because it is impossible to measure the performance of an instruction without
 also measuring unrelated factors, the assessment of the quality of a
@@ -119,8 +123,7 @@ base instruction.
 
 In general, specialized instructions should be implemented in two parts:
 1. A sequence of guards, each of the form
-  `DEOPT_IF(guard-condition-is-false, BASE_NAME)`,
-  followed by a `record_cache_hit()`.
+  `DEOPT_IF(guard-condition-is-false, BASE_NAME)`.
 2. The operation, which should ideally have no branches and
   a minimum number of dependent memory accesses.
 
@@ -129,3 +132,11 @@ can be re-used in the operation.
 
 If there are branches in the operation, then consider further specialization
 to eliminate the branches.
+
+### Maintaining stats
+
+Finally, take care that stats are gather correctly.
+After the last `DEOPT_IF` has passed, a hit should be recorded with
+`STAT_INC(BASE_INSTRUCTION, hit)`.
+After a optimization has been deferred in the `ADAPTIVE` form,
+that should be recorded with `STAT_INC(BASE_INSTRUCTION, deferred)`.
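
Illustrative note, not part of the patch: the rules this change documents (a fixed number of inline cache entries per family, the counter in the first entry, guards before the operation, and `hit`/`deferred` stats) can be modelled in a few lines of Python. This is a toy sketch under stated assumptions, not CPython's implementation. The `Namespace` class, the list-based code layout, the handler functions, and the `BACKOFF` value are all hypothetical, and a failed guard here switches straight back to the adaptive form, which simplifies whatever the real miss handling does.

```python
from collections import Counter

# Toy model only: every name below is illustrative, not CPython's C API.
CACHE_ENTRIES = 3                    # same for every member of the family
COUNTER, VERSION, INDEX = 0, 1, 2    # entry 0 is always the counter
INSTR_SIZE = 2 + CACHE_ENTRIES       # opcode, oparg, then the inline cache
BACKOFF = 2                          # executions to wait before specializing

stats = Counter()


class Namespace:
    """Stand-in for a globals dict: a value table guarded by a version tag."""

    def __init__(self, **values):
        self.names = list(values)
        self.entries = list(values.values())
        self.version = 1

    def add(self, name, value):
        self.names.append(name)
        self.entries.append(value)
        self.version += 1            # any layout change invalidates caches


def load_global(ns, code, pc):
    """Base form: slow lookup by name; ignores its cache entries."""
    return ns.entries[ns.names.index(code[pc + 1])]


def load_global_adaptive(ns, code, pc):
    """Adaptive form: count down, then try to specialize in place."""
    cache = pc + 2
    if code[cache + COUNTER] > 0:
        code[cache + COUNTER] -= 1
        stats["deferred"] += 1       # optimization deferred
    else:
        code[cache + VERSION] = ns.version
        code[cache + INDEX] = ns.names.index(code[pc + 1])
        code[pc] = "LOAD_GLOBAL_MODULE"
    return load_global(ns, code, pc)


def load_global_module(ns, code, pc):
    """Specialized form: guards first, then a branch-free operation."""
    cache = pc + 2
    if ns.version != code[cache + VERSION]:      # plays the role of DEOPT_IF
        code[pc] = "LOAD_GLOBAL_ADAPTIVE"
        code[cache + COUNTER] = BACKOFF
        return load_global(ns, code, pc)
    stats["hit"] += 1                            # recorded after the last guard
    return ns.entries[code[cache + INDEX]]


HANDLERS = {
    "LOAD_GLOBAL": load_global,
    "LOAD_GLOBAL_ADAPTIVE": load_global_adaptive,
    "LOAD_GLOBAL_MODULE": load_global_module,
}


def run(ns, code):
    """Execute every instruction, stepping over the cache whether used or not."""
    results, pc = [], 0
    while pc < len(code):
        results.append(HANDLERS[code[pc]](ns, code, pc))
        pc += INSTR_SIZE
    return results


ns = Namespace(x=10, y=20)
code = ["LOAD_GLOBAL_ADAPTIVE", "y", BACKOFF, 0, 0]
values = [run(ns, code)[0] for _ in range(5)]
```

In this model the base form reserves the same three cache entries as the specialized form even though it reads none of them, so `run` can advance by a fixed `INSTR_SIZE` regardless of which family member currently occupies the slot. Calling `ns.add(...)` bumps the version, so the next execution fails the guard and falls back to the adaptive form.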