author     Pablo Galindo <Pablogsal@gmail.com>   2019-12-27 21:55:56 (GMT)
committer  GitHub <noreply@github.com>           2019-12-27 21:55:56 (GMT)
commit     90913985b62845a58f6b9e815121bcf614bd107f (patch)
tree       3bc2594d17089262818928ab10445b9dd7a367e8 /Modules/gcmodule.c
parent     91874bb07161bb481b6f5ea18ffafe69cb8cac30 (diff)
Move comment about permanent generation to gcmodule.c (GH-17718)
The comment about the collection rules for the permanent generation was
incorrectly referenced by a comment in gcmodule.c: the referenced text was
moved into a header file long ago. Moving the comment into the relevant
code improves readability and avoids broken references.
Diffstat (limited to 'Modules/gcmodule.c')
-rw-r--r--  Modules/gcmodule.c | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/Modules/gcmodule.c b/Modules/gcmodule.c
index 64afe83..b11ae84 100644
--- a/Modules/gcmodule.c
+++ b/Modules/gcmodule.c
@@ -1381,8 +1381,40 @@ collect_generations(PyThreadState *tstate)
     for (int i = NUM_GENERATIONS-1; i >= 0; i--) {
         if (gcstate->generations[i].count > gcstate->generations[i].threshold) {
             /* Avoid quadratic performance degradation in number
-               of tracked objects. See comments at the beginning
-               of this file, and issue #4074.
+               of tracked objects (see also issue #4074):
+
+               To limit the cost of garbage collection, there are two strategies;
+                 - make each collection faster, e.g. by scanning fewer objects
+                 - do less collections
+               This heuristic is about the latter strategy.
+
+               In addition to the various configurable thresholds, we only trigger a
+               full collection if the ratio
+
+                long_lived_pending / long_lived_total
+
+               is above a given value (hardwired to 25%).
+
+               The reason is that, while "non-full" collections (i.e., collections of
+               the young and middle generations) will always examine roughly the same
+               number of objects -- determined by the aforementioned thresholds --,
+               the cost of a full collection is proportional to the total number of
+               long-lived objects, which is virtually unbounded.
+
+               Indeed, it has been remarked that doing a full collection every
+               <constant number> of object creations entails a dramatic performance
+               degradation in workloads which consist in creating and storing lots of
+               long-lived objects (e.g. building a large list of GC-tracked objects would
+               show quadratic performance, instead of linear as expected: see issue #4074).
+
+               Using the above ratio, instead, yields amortized linear performance in
+               the total number of objects (the effect of which can be summarized
+               thusly: "each full garbage collection is more and more costly as the
+               number of objects grows, but we do fewer and fewer of them").
+
+               This heuristic was suggested by Martin von Löwis on python-dev in
+               June 2008. His original analysis and proposal can be found at:
+               http://mail.python.org/pipermail/python-dev/2008-June/080579.html
             */
             if (i == NUM_GENERATIONS - 1
                 && gcstate->long_lived_pending < gcstate->long_lived_total / 4)
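
To make the hardwired 25% rule concrete, here is a minimal standalone C sketch.
It is not CPython's implementation: sim_gcstate, should_run_full_collection, and
the fixed promotion batch are hypothetical stand-ins for the gcstate counters
named in the diff. It simulates a workload that keeps creating long-lived
objects and prints each point at which the ratio rule would permit a full
collection:

#include <stdio.h>

/* Hypothetical stand-ins for the two gcstate counters named in the diff:
 * long_lived_total   - objects that survived the last full collection
 * long_lived_pending - survivors of non-full collections since then */
typedef struct {
    long long long_lived_total;
    long long long_lived_pending;
} sim_gcstate;

/* The 25% rule from the diff, written as the same integer comparison the
 * real check uses (pending >= total / 4), which also avoids any division
 * before the first full collection. */
static int should_run_full_collection(const sim_gcstate *st)
{
    return st->long_lived_pending >= st->long_lived_total / 4;
}

int main(void)
{
    sim_gcstate st = {0, 0};
    const long long batch = 700;  /* survivors promoted per young collection */
    long long full_collections = 0;

    for (int tick = 1; tick <= 200; tick++) {
        st.long_lived_pending += batch;
        if (should_run_full_collection(&st)) {
            /* A full collection examines every long-lived object; afterwards
             * the pending survivors join the long-lived total. */
            st.long_lived_total += st.long_lived_pending;
            st.long_lived_pending = 0;
            full_collections++;
            printf("tick %3d: full collection #%lld, long_lived_total=%lld\n",
                   tick, full_collections, st.long_lived_total);
        }
    }
    return 0;
}

Running the sketch shows the behavior the comment summarizes: each full
collection grows long_lived_total by at least a factor of 1.25, so the gaps
between full collections grow geometrically while each one gets costlier,
keeping the total collection work amortized linear in the number of objects
promoted.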