summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorTim Peters <tim.peters@gmail.com>2003-04-07 19:21:15 (GMT)
committerTim Peters <tim.peters@gmail.com>2003-04-07 19:21:15 (GMT)
commitf6b8045ca5262f0d83eea3a279828a3f136f167f (patch)
treecf4eb25fc1d278acff4487a3e2aa9b2422d93c8a
parentdf875b99fcb69c18168fb761ddaa722a034175dd (diff)
downloadcpython-f6b8045ca5262f0d83eea3a279828a3f136f167f.zip
cpython-f6b8045ca5262f0d83eea3a279828a3f136f167f.tar.gz
cpython-f6b8045ca5262f0d83eea3a279828a3f136f167f.tar.bz2
Reworked has_finalizer() to use the new _PyObject_Lookup() instead
of PyObject_HasAttr(); the former promises never to execute arbitrary Python code. Undid many of the changes recently made to worm around the worst consequences of that PyObject_HasAttr() could execute arbitrary Python code. Compatibility is hard to discuss, because the dangerous cases are so perverse, and much of this appears to rely on implementation accidents. To start with, using hasattr() to check for __del__ wasn't only dangerous, in some cases it was wrong: if an instance of an old- style class didn't have "__del__" in its instance dict or in any base class dict, but a getattr hook said __del__ existed, then hasattr() said "yes, this object has a __del__". But instance_dealloc() ignores the possibility of getattr hooks when looking for a __del__, so while object.__del__ succeeds, no __del__ method is called when the object is deleted. gc was therefore incorrect in believing that the object had a finalizer. The new method doesn't suffer that problem (like instance_dealloc(), _PyObject_Lookup() doesn't believe __del__ exists in that case), but does suffer a somewhat opposite-- and even more obscure --oddity: if an instance of an old-style class doesn't have "__del__" in its instance dict, and a base class does have "__del__" in its dict, and the first base class with a "__del__" associates it with a descriptor (an object with a __get__ method), *and* if that descriptor raises an exception when __get__ is called, then (a) the current method believes the instance does have a __del__, but (b) hasattr() does not believe the instance has a __del__. While these disagree, I believe the new method is "more correct": because the descriptor *will* be called when the object is destructed, it can execute arbitrary Python code at the time the object is destructed, and that's really what gc means by "has a finalizer": not specifically a __del__ method, but more generally the possibility of executing arbitrary Python code at object destruction time. Code in a descriptor's __get__() executed at destruction time can be just as problematic as code in a __del__() executed then. So I believe the new method is better on all counts. Bugfix candidate, but it's unclear to me how all this differs in the 2.2 branch (e.g., new-style and old-style classes already took different gc paths in 2.3 before this last round of patches, but don't in the 2.2 branch).
-rw-r--r--Lib/test/test_gc.py5
-rw-r--r--Misc/NEWS14
-rw-r--r--Modules/gcmodule.c118
3 files changed, 59 insertions, 78 deletions
diff --git a/Lib/test/test_gc.py b/Lib/test/test_gc.py
index e225881..1fbd508 100644
--- a/Lib/test/test_gc.py
+++ b/Lib/test/test_gc.py
@@ -272,8 +272,9 @@ def test_boom():
# the internal "attr" attributes as a side effect. That causes the
# trash cycle to get reclaimed via refcounts falling to 0, thus mutating
# the trash graph as a side effect of merely asking whether __del__
- # exists. This used to (before 2.3b1) crash Python.
- expect(gc.collect(), 0, "boom")
+ # exists. This used to (before 2.3b1) crash Python. Now __getattr__
+ # isn't called.
+ expect(gc.collect(), 4, "boom")
expect(len(gc.garbage), garbagelen, "boom")
class Boom2:
diff --git a/Misc/NEWS b/Misc/NEWS
index a58f363..667c89e 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -12,6 +12,14 @@ What's New in Python 2.3 beta 1?
Core and builtins
-----------------
+- Some horridly obscure problems were fixed involving interaction
+ between garbage collection and old-style classes with "ambitious"
+ getattr hooks. If an old-style instance didn't have a __del__ method,
+ but did have a __getattr__ hook, and the instance became reachable
+ only from an unreachable cycle, and the hook resurrected or deleted
+ unreachable objects when asked to resolve "__del__", anything up to
+ a segfault could happen. That's been repaired.
+
- dict.pop now takes an optional argument specifying a default
value to return if the key is not in the dict. If a default is not
given and the key is not found, a KeyError will still be raised.
@@ -77,7 +85,7 @@ Library
return 'not a == b' rather than 'a != b'. This gives the desired
result for classes that define __eq__ without defining __ne__.
-- sgmllib now supports SGML marked sections, in particular the
+- sgmllib now supports SGML marked sections, in particular the
MS Office extensions.
- The urllib module now offers support for the iterator protocol.
@@ -154,10 +162,10 @@ Mac
- EasyDialogs dialogs are now movable-modal, and if the application is
currently in the background they will ask to be moved to the foreground
before displaying.
-
+
- OSA Scripting support has improved a lot, and gensuitemodule.py can now
be used by mere mortals.
-
+
- The IDE (in a framework build) now includes introductory documentation
in Apple Help Viewer format.
diff --git a/Modules/gcmodule.c b/Modules/gcmodule.c
index bf56cd4..464786b 100644
--- a/Modules/gcmodule.c
+++ b/Modules/gcmodule.c
@@ -359,19 +359,17 @@ move_unreachable(PyGC_Head *young, PyGC_Head *unreachable)
/* Return true if object has a finalization method.
* CAUTION: An instance of an old-style class has to be checked for a
- *__del__ method, and that can cause arbitrary Python code to get executed
- * via the class's __getattr__ hook (if any). This function can therefore
- * mutate the object graph, and that's been the source of subtle bugs.
+ *__del__ method, and earlier versions of this used to call PyObject_HasAttr,
+ * which in turn could call the class's __getattr__ hook (if any). That
+ * could invoke arbitrary Python code, mutating the object graph in arbitrary
+ * ways, and that was the source of some excruciatingly subtle bugs.
*/
static int
has_finalizer(PyObject *op)
{
if (PyInstance_Check(op)) {
- /* This is the dangerous path: hasattr can invoke
- * the class __getattr__(), and that can do anything.
- */
assert(delstr != NULL);
- return PyObject_HasAttr(op, delstr);
+ return _PyInstance_Lookup(op, delstr) != NULL;
}
else if (PyType_HasFeature(op->ob_type, Py_TPFLAGS_HEAPTYPE))
return op->ob_type->tp_del != NULL;
@@ -379,38 +377,28 @@ has_finalizer(PyObject *op)
return 0;
}
-/* Move all objects out of unreachable, into collectable or finalizers.
- * It's possible that some objects will get collected (via refcount falling
- * to 0), or resurrected, as a side effect of checking for __del__ methods.
- * After, finalizers contains all the objects from unreachable that haven't
- * been collected by magic, and that have a finalizer. gc_refs is
- * GC_REACHABLE for all of those. collectable contains all the remaining
- * objects from unreachable, and gc_refs remains GC_TENTATIVELY_UNREACHABLE
- * for those (we're still not sure they're reclaimable after this! Some
- * may yet by reachable *from* the objects in finalizers).
+/* Move the objects in unreachable with __del__ methods into finalizers.
+ * The objects remaining in unreachable do not have __del__ methods, and
+ * gc_refs remains GC_TENTATIVELY_UNREACHABLE for them. The objects
+ * moved into finalizers have gc_refs changed to GC_REACHABLE.
*/
static void
-move_finalizers(PyGC_Head *unreachable, PyGC_Head *collectable,
- PyGC_Head *finalizers)
+move_finalizers(PyGC_Head *unreachable, PyGC_Head *finalizers)
{
- while (!gc_list_is_empty(unreachable)) {
- PyGC_Head *gc = unreachable->gc.gc_next;
+ PyGC_Head *gc = unreachable->gc.gc_next;
+
+ while (gc != unreachable) {
PyObject *op = FROM_GC(gc);
- int finalizer;
+ PyGC_Head *next = gc->gc.gc_next;
assert(IS_TENTATIVELY_UNREACHABLE(op));
- finalizer = has_finalizer(op);
- if (unreachable->gc.gc_next == gc) {
+ if (has_finalizer(op)) {
gc_list_remove(gc);
- if (finalizer) {
- gc_list_append(gc, finalizers);
- gc->gc.gc_refs = GC_REACHABLE;
- }
- else
- gc_list_append(gc, collectable);
+ gc_list_append(gc, finalizers);
+ gc->gc.gc_refs = GC_REACHABLE;
}
- /* else has_finalizer() deleted op via side effect */
+ gc = next;
}
}
@@ -430,11 +418,10 @@ visit_move(PyObject *op, PyGC_Head *tolist)
}
/* Move objects that are reachable from finalizers, from the unreachable set
- * into the reachable_from_finalizers set.
+ * into finalizers set.
*/
static void
-move_finalizer_reachable(PyGC_Head *finalizers,
- PyGC_Head *reachable_from_finalizers)
+move_finalizer_reachable(PyGC_Head *finalizers)
{
traverseproc traverse;
PyGC_Head *gc = finalizers->gc.gc_next;
@@ -443,7 +430,7 @@ move_finalizer_reachable(PyGC_Head *finalizers,
traverse = FROM_GC(gc)->ob_type->tp_traverse;
(void) traverse(FROM_GC(gc),
(visitproc)visit_move,
- (void *)reachable_from_finalizers);
+ (void *)finalizers);
}
}
@@ -475,24 +462,32 @@ debug_cycle(char *msg, PyObject *op)
/* Handle uncollectable garbage (cycles with finalizers, and stuff reachable
* only from such cycles).
- * If DEBUG_SAVEALL or hasfinalizer, the objects in finalizers are appended
- * to the module garbage list (a Python list). The objects in finalizers
- * are merged into the old list regardless.
+ * If DEBUG_SAVEALL, all objects in finalizers are appended to the module
+ * garbage list (a Python list), else only the objects in finalizers with
+ * __del__ methods are appended to garbage. All objects in finalizers are
+ * merged into the old list regardless.
* Returns 0 if all OK, <0 on error (out of memory to grow the garbage list).
* The finalizers list is made empty on a successful return.
*/
static int
-handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old, int hasfinalizer)
+handle_finalizers(PyGC_Head *finalizers, PyGC_Head *old)
{
+ PyGC_Head *gc = finalizers->gc.gc_next;
+
if (garbage == NULL) {
garbage = PyList_New(0);
if (garbage == NULL)
Py_FatalError("gc couldn't create gc.garbage list");
}
- if ((debug & DEBUG_SAVEALL) || hasfinalizer) {
- if (append_objects(garbage, finalizers) < 0)
- return -1;
+ for (; gc != finalizers; gc = gc->gc.gc_next) {
+ PyObject *op = FROM_GC(gc);
+
+ if ((debug & DEBUG_SAVEALL) || has_finalizer(op)) {
+ if (PyList_Append(garbage, op) < 0)
+ return -1;
+ }
}
+
gc_list_merge(finalizers, old);
return 0;
}
@@ -541,9 +536,7 @@ collect(int generation)
PyGC_Head *young; /* the generation we are examining */
PyGC_Head *old; /* next older generation */
PyGC_Head unreachable;
- PyGC_Head collectable;
PyGC_Head finalizers;
- PyGC_Head reachable_from_finalizers;
PyGC_Head *gc;
if (delstr == NULL) {
@@ -606,31 +599,19 @@ collect(int generation)
* care not to create such things. For Python, finalizers means
* instance objects with __del__ methods.
*
- * Move each unreachable object into the collectable set or the
- * finalizers set. Because we need to check for __del__ methods on
- * instances of classic classes, arbitrary Python code may get
- * executed by getattr hooks: that may resurrect or deallocate (via
- * refcount falling to 0) unreachable objects, so this is very
- * delicate.
- */
- gc_list_init(&collectable);
+ * Move unreachable objects with finalizers into a different list.
+ */
gc_list_init(&finalizers);
- move_finalizers(&unreachable, &collectable, &finalizers);
+ move_finalizers(&unreachable, &finalizers);
/* finalizers contains the unreachable objects with a finalizer;
* unreachable objects reachable only *from* those are also
- * uncollectable; we move those into a separate list
- * (reachable_from_finalizers) so we don't have to do the dangerous
- * has_finalizer() test again later.
+ * uncollectable, and we move those into the finalizers list too.
*/
- gc_list_init(&reachable_from_finalizers);
- move_finalizer_reachable(&finalizers, &reachable_from_finalizers);
- /* And move everything only reachable from the reachable stuff. */
- move_finalizer_reachable(&reachable_from_finalizers,
- &reachable_from_finalizers);
+ move_finalizer_reachable(&finalizers);
/* Collect statistics on collectable objects found and print
* debugging information. */
- for (gc = collectable.gc.gc_next; gc != &collectable;
+ for (gc = unreachable.gc.gc_next; gc != &unreachable;
gc = gc->gc.gc_next) {
m++;
if (debug & DEBUG_COLLECTABLE) {
@@ -640,7 +621,7 @@ collect(int generation)
/* Call tp_clear on objects in the collectable set. This will cause
* the reference cycles to be broken. It may also cause some objects
* in finalizers and/or reachable_from_finalizers to be freed */
- delete_garbage(&collectable, old);
+ delete_garbage(&unreachable, old);
/* Collect statistics on uncollectable objects found and print
* debugging information. */
@@ -651,13 +632,6 @@ collect(int generation)
if (debug & DEBUG_UNCOLLECTABLE)
debug_cycle("uncollectable", FROM_GC(gc));
}
- for (gc = reachable_from_finalizers.gc.gc_next;
- gc != &reachable_from_finalizers;
- gc = gc->gc.gc_next) {
- n++;
- if (debug & DEBUG_UNCOLLECTABLE)
- debug_cycle("uncollectable", FROM_GC(gc));
- }
if (debug & DEBUG_STATS) {
if (m == 0 && n == 0) {
PySys_WriteStderr("gc: done.\n");
@@ -673,13 +647,11 @@ collect(int generation)
* reachable list of garbage. The programmer has to deal with
* this if they insist on creating this type of structure.
*/
- if (handle_finalizers(&finalizers, old, 1) == 0)
- (void)handle_finalizers(&reachable_from_finalizers, old, 0);
+ (void)handle_finalizers(&finalizers, old);
if (PyErr_Occurred()) {
- if (gc_str == NULL) {
+ if (gc_str == NULL)
gc_str = PyString_FromString("garbage collection");
- }
PyErr_WriteUnraisable(gc_str);
Py_FatalError("unexpected exception during garbage collection");
}