summaryrefslogtreecommitdiffstats
path: root/src/engine/SCons/Memoize.py
diff options
context:
space:
mode:
authorSteven Knight <knight@baldmt.com>2006-12-16 01:43:01 (GMT)
committerSteven Knight <knight@baldmt.com>2006-12-16 01:43:01 (GMT)
commitc4d04b3b45e7b71a1b28053b90084bcf2fdf9c0e (patch)
tree8a0d07c078ac21bf1ab689eacf06577069bb9231 /src/engine/SCons/Memoize.py
parentb32cd624a5ad9526d28584b8e6c4a7958f436424 (diff)
downloadSCons-c4d04b3b45e7b71a1b28053b90084bcf2fdf9c0e.zip
SCons-c4d04b3b45e7b71a1b28053b90084bcf2fdf9c0e.tar.gz
SCons-c4d04b3b45e7b71a1b28053b90084bcf2fdf9c0e.tar.bz2
Merged revisions 1675-1736 via svnmerge from
http://scons.tigris.org/svn/scons/branches/core ........ r1689 | stevenknight | 2006-11-06 20:56:29 -0600 (Mon, 06 Nov 2006) | 1 line 0.96.D483 - Merge changes for 0.96.93 packaging from the subsidiary branch. ........ r1690 | stevenknight | 2006-11-06 20:59:30 -0600 (Mon, 06 Nov 2006) | 1 line 0.96.D484 - Update HOWTO for releases. Fix name type in src/CHANGES.txt. ........ r1691 | stevenknight | 2006-11-08 13:55:36 -0600 (Wed, 08 Nov 2006) | 1 line 0.96.D485 - Fix MergeFlags() handling of None values. (John Pye) ........ r1692 | stevenknight | 2006-11-08 17:15:05 -0600 (Wed, 08 Nov 2006) | 1 line 0.96.D486 - Directly execute commands on Windows when possible. (Jay Kint) ........ r1693 | stevenknight | 2006-11-08 18:54:49 -0600 (Wed, 08 Nov 2006) | 1 line 0.96.D487 - Remove the semi-colon from the list of characters that determine when we use cmd ........ r1694 | stevenknight | 2006-11-09 01:34:06 -0600 (Thu, 09 Nov 2006) | 1 line 0.96.D488 - Pick up latex/bibtex 'Rerun to get citations correct' messages. (Dmitry Mikhin) ........ r1695 | stevenknight | 2006-11-11 08:36:33 -0600 (Sat, 11 Nov 2006) | 1 line 0.96.D489 - Back out the direct-execution-on-Windows change until we solve a corner case. ........ r1696 | stevenknight | 2006-11-15 10:33:10 -0600 (Wed, 15 Nov 2006) | 1 line 0.96.D490 - Fix the sconsign script when the .sconsign.dblite file is specified with its suf ........ r1697 | stevenknight | 2006-11-18 10:45:50 -0600 (Sat, 18 Nov 2006) | 4 lines Complete move of test/sconsign/script.py to underneath test/sconsign/script/. (This got left out of the previous checkin due to an error in the script that resubmits Aegis changes to Subversion.) ........ r1698 | stevenknight | 2006-11-18 11:05:26 -0600 (Sat, 18 Nov 2006) | 1 line 0.96.D491 - Allow an Options converter to take the construction environment as a parameter. ........ r1699 | stevenknight | 2006-11-30 15:34:37 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D492 - Reverse the order in which we try the arguments Options converters, first a sing ........ r1700 | stevenknight | 2006-11-30 16:03:09 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D493 - Speed up rel_path() by avoiding recomputation of intermediate directory relative ........ r1701 | stevenknight | 2006-11-30 16:14:16 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D494 - More efficient get_suffix(): compute it once when we set the name. ........ r1702 | stevenknight | 2006-11-30 16:22:55 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D495 - Fix missing XML end tags. ........ r1703 | stevenknight | 2006-11-30 17:15:25 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D496 - Turn Memoizer into a simple counter for --debug=memoizer, not something that doe ........ r1704 | stevenknight | 2006-11-30 20:30:50 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D497 - Add the scons-time script, with doc and tests. ........ r1705 | stevenknight | 2006-11-30 23:28:20 -0600 (Thu, 30 Nov 2006) | 1 line 0.96.D498 - Update the copyright years string. ........ r1706 | stevenknight | 2006-12-01 11:54:22 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D499 - Fix _do_Lookup => _doLookup value-caching misspellings. (Ben Leslie) ........ r1707 | stevenknight | 2006-12-01 12:03:46 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D500 - Fix copyright test against debian build. (Walter Franzini) ........ r1708 | stevenknight | 2006-12-01 14:23:29 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D501 - Add #include lines for test portability. (Gary Oberbrunner) ........ r1709 | stevenknight | 2006-12-01 14:51:12 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D502 - Fix tests under Python versions with no profiler (pstats module). ........ r1710 | stevenknight | 2006-12-01 20:04:49 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D503 - Remove unnecessary os.path.normpath() calls. (Gary Oberbrunner) ........ r1711 | stevenknight | 2006-12-01 20:34:31 -0600 (Fri, 01 Dec 2006) | 1 line 0.96.D504 - Accomodate arbitray white space after a SWIG %module keyword. (Anonymous) ........ r1712 | stevenknight | 2006-12-05 14:49:54 -0600 (Tue, 05 Dec 2006) | 1 line 0.96.D506 - Cache substitutions of of Builder source suffixes. Use a new PathList module, and a refactor Node.FS.Rfindalldirs() method, to cache calculations of values like CPPPATH. ........ r1713 | stevenknight | 2006-12-05 18:43:36 -0600 (Tue, 05 Dec 2006) | 1 line 0.96.D507 - Use cached stat() values in diskchecks. ........ r1714 | stevenknight | 2006-12-05 21:11:24 -0600 (Tue, 05 Dec 2006) | 1 line 0.96.D508 - Fix Memoizer hit counts for methods memoizing simple values. Clean up the code for memoizing return values in a dictionary. Fix comments. ........ r1715 | stevenknight | 2006-12-06 07:23:18 -0600 (Wed, 06 Dec 2006) | 1 line 0.96.D369 - More efficient Node.FS.Dir.current() check. Fix some Windows test portability issues. ........ r1716 | stevenknight | 2006-12-06 12:24:32 -0600 (Wed, 06 Dec 2006) | 2 lines Undo previous checkin (distributed incorrect Aegis change number). ........ r1717 | stevenknight | 2006-12-06 12:34:53 -0600 (Wed, 06 Dec 2006) | 1 line 0.96.D505 - Update ae-{cvs,svn}-ci for newer versions of aetar, and to not truncate descriptions. ........ r1718 | stevenknight | 2006-12-07 23:01:41 -0600 (Thu, 07 Dec 2006) | 1 line 0.96.D509 - Only look for mslink on Windows systems. (Sohail Somani) ........ r1719 | stevenknight | 2006-12-07 23:18:33 -0600 (Thu, 07 Dec 2006) | 1 line 0.96.D510 - Have the D compiler Tool use the same logic for shared libraries, too. (Paolo Invernizzi) ........ r1720 | stevenknight | 2006-12-07 23:29:47 -0600 (Thu, 07 Dec 2006) | 1 line 0.96.D511 - Generalize a JobTests.py test so it doesn't assume a specific order in which the operating system executes the threads. ........ r1721 | stevenknight | 2006-12-07 23:39:37 -0600 (Thu, 07 Dec 2006) | 1 line 0.96.D512 - Back out the Tool/dmd.py change; it breaks shared library linking for other lanuages beside D in the construction environment. ........ r1722 | stevenknight | 2006-12-07 23:47:11 -0600 (Thu, 07 Dec 2006) | 1 line 0.96.D513 - Test fixes: Windows portability, handle changes to Python 2.5 messages. ........ r1723 | stevenknight | 2006-12-08 00:00:13 -0600 (Fri, 08 Dec 2006) | 1 line 0.96.D514 - Change how the 'as' Tool is imported to accomodate the Python 2.6 'as' keyword. ........ r1724 | stevenknight | 2006-12-08 11:19:27 -0600 (Fri, 08 Dec 2006) | 1 line 0.96.D515 - Cache both Node.FS.find_file() and Node.FS.Dri.srcdir_find_file(). ........ r1725 | stevenknight | 2006-12-08 17:27:35 -0600 (Fri, 08 Dec 2006) | 1 line 0.96.D516 - Better error when we try to fetch contents from an Entry that doesn't exist. (Tom Parker) ........ r1726 | stevenknight | 2006-12-08 23:28:55 -0600 (Fri, 08 Dec 2006) | 1 line 0.96.D517 - Make sure we pick up the scons-local directory regardless of where we chdir internally. ........ r1727 | stevenknight | 2006-12-11 16:25:53 -0600 (Mon, 11 Dec 2006) | 1 line 0.96.D518 - Cache results of Executor.get_unignored_sources() and Executor.process_sources(). Eliminate some map() and disambiguate() calls when scanning for implicit dependencies. ........ r1728 | stevenknight | 2006-12-12 14:32:22 -0600 (Tue, 12 Dec 2006) | 1 line 0.96.D519 - Fix SideEffect() when -j is used. ........ r1729 | stevenknight | 2006-12-12 16:58:15 -0600 (Tue, 12 Dec 2006) | 1 line 0.96.D520 - Add a srcdir keyword to Builder calls. ........ r1730 | stevenknight | 2006-12-12 21:40:59 -0600 (Tue, 12 Dec 2006) | 1 line 0.96.D521 - TeX/LaTeX updates, including handling files in subdirectories. (Joel B. Mohler, Rob Managan, Dmitry Mikhin) ........ r1731 | stevenknight | 2006-12-14 15:01:02 -0600 (Thu, 14 Dec 2006) | 1 line 0.96.D522 - Propogate TypeErrors during variable substitution for display to the user. ........ r1732 | stevenknight | 2006-12-14 20:01:49 -0600 (Thu, 14 Dec 2006) | 1 line 0.96.D523 - Fix the os.path.join() calls in EnvironmentTests.py. ........ r1733 | stevenknight | 2006-12-15 07:48:22 -0600 (Fri, 15 Dec 2006) | 1 line 0.96.D524 - Fix source directories as dependencies of an Alias (0.96.93 problem found by LilyPond). ........ r1735 | stevenknight | 2006-12-15 12:43:45 -0600 (Fri, 15 Dec 2006) | 1 line 0.96.D525 - Allow printing Debug.caller() output (or other end-of-run debugging info) when using -h. ........ r1736 | stevenknight | 2006-12-15 16:30:08 -0600 (Fri, 15 Dec 2006) | 1 line 0.96.D526 - Add an option to debug IndexError and NameError exceptions during variable substitution. ........
Diffstat (limited to 'src/engine/SCons/Memoize.py')
-rw-r--r--src/engine/SCons/Memoize.py948
1 files changed, 190 insertions, 758 deletions
diff --git a/src/engine/SCons/Memoize.py b/src/engine/SCons/Memoize.py
index c2a3027..6a46350 100644
--- a/src/engine/SCons/Memoize.py
+++ b/src/engine/SCons/Memoize.py
@@ -1,66 +1,3 @@
-"""Memoizer
-
-Memoizer -- base class to provide automatic, optimized caching of
-method return values for subclassed objects. Caching is activated by
-the presence of "__cacheable__" in the doc of a method (acts like a
-decorator). The presence of "__cache_reset__" or "__reset_cache__"
-in the doc string instead indicates a method that should reset the
-cache, discarding any currently cached values.
-
-Note: current implementation is optimized for speed, not space. The
-cache reset operation does not actually discard older results, and in
-fact, all cached results (and keys) are held indefinitely.
-
-Most of the work for this is done by copying and modifying the class
-definition itself, rather than the object instances. This will
-therefore allow all instances of a class to get caching activated
-without requiring lengthy initialization or other management of the
-instance.
-
-[This could also be done using metaclassing (which would require
-Python 2.2) and decorators (which would require Python 2.4). Current
-implementation is used due to Python 1.5.2 compatability requirement
-contraint.]
-
-A few notes:
-
- * All local methods/attributes use a prefix of "_MeMoIZeR" to avoid
- namespace collisions with the attributes of the objects
- being cached.
-
- * Based on performance evaluations of dictionaries, caching is
- done by providing each object with a unique key attribute and
- using the value of that attribute as an index for dictionary
- lookup. If an object doesn't have one of these attributes,
- fallbacks are utilized (although they will be somewhat slower).
-
- * To support this unique-value attribute correctly, it must be
- removed whenever a __cmp__ operation is performed, and it must
- be updated whenever a copy.copy or copy.deepcopy is performed,
- so appropriate manipulation is provided by the Caching code
- below.
-
- * Cached values are stored in the class (indexed by the caching
- key attribute, then by the name of the method called and the
- constructed key of the arguments passed). By storing them here
- rather than on the instance, the instance can be compared,
- copied, and pickled much easier.
-
-Some advantages:
-
- * The method by which caching is implemented can be changed in a
- single location and it will apply globally.
-
- * Greatly simplified client code: remove lots of try...except or
- similar handling of cached lookup. Also usually more correct in
- that it based caching on all input arguments whereas many
- hand-implemented caching operations often miss arguments that
- might affect results.
-
- * Caching can be globally disabled very easily (for testing, etc.)
-
-"""
-
#
# __COPYRIGHT__
#
@@ -86,752 +23,247 @@ Some advantages:
__revision__ = "__FILE__ __REVISION__ __DATE__ __DEVELOPER__"
-#TBD: for pickling, should probably revert object to unclassed state...
+__doc__ = """Memoizer
-import copy
-import os
-import string
-import sys
+A metaclass implementation to count hits and misses of the computed
+values that various methods cache in memory.
-# A flag controlling whether or not we actually use memoization.
-use_memoizer = 1
+Use of this modules assumes that wrapped methods be coded to cache their
+values in a consistent way. Here is an example of wrapping a method
+that returns a computed value, with no input parameters:
-#
-# Generate a key for an object that is to be used as the caching key
-# for that object.
-#
-# Current implementation: singleton generating a monotonically
-# increasing integer
+ memoizer_counters = [] # Memoization
-class MemoizerKey:
- def __init__(self):
- self._next_keyval = 0
- def __call__(self):
- r = self._next_keyval
- self._next_keyval = self._next_keyval + 1
- return str(r)
-Next_Memoize_Key = MemoizerKey()
+ memoizer_counters.append(SCons.Memoize.CountValue('foo')) # Memoization
+ def foo(self):
-#
-# Memoized Class management.
-#
-# Classes can be manipulated just like object instances; we are going
-# to do some of that here, without the benefit of metaclassing
-# introduced in Python 2.2 (it would be nice to use that, but this
-# attempts to maintain backward compatibility to Python 1.5.2).
-#
-# The basic implementation therefore is to update the class definition
-# for any objects that we want to enable caching for. The updated
-# definition performs caching activities for those methods
-# appropriately marked in the original class.
-#
-# When an object is created, its class is switched to this updated,
-# cache-enabled class definition, thereby enabling caching operations.
-#
-# To get an instance to used the updated, caching class, the instance
-# must declare the Memoizer as a base class and make sure to call the
-# Memoizer's __init__ during the instance's __init__. The Memoizer's
-# __init__ will perform the class updating.
-
-# For Python 2.2 and later, where metaclassing is supported, it is
-# sufficient to provide a "__metaclass__ = Memoized_Metaclass" as part
-# of the class definition; the metaclassing will automatically invoke
-# the code herein properly.
-
-##import cPickle
-##def ALT0_MeMoIZeR_gen_key(argtuple, kwdict):
-## return cPickle.dumps( (argtuple,kwdict) )
-
-def ALT1_MeMoIZeR_gen_key(argtuple, kwdict):
- return repr(argtuple) + '|' + repr(kwdict)
-
-def ALT2_MeMoIZeR_gen_key(argtuple, kwdict):
- return string.join(map(lambda A:
- getattr(A, '_MeMoIZeR_Key', str(A)),
- argtuple) + \
- map(lambda D:
- str(D[0])+
- getattr(D[1], '_MeMoIZeR_Key', str(D[1])),
- kwdict.items()),
- '|')
-
-def ALT3_MeMoIZeR_gen_key(argtuple, kwdict):
- ret = []
- for A in argtuple:
- X = getattr(A, '_MeMoIZeR_Key', None)
- if X:
- ret.append(X)
- else:
- ret.append(str(A))
- for K,V in kwdict.items():
- ret.append(str(K))
- X = getattr(V, '_MeMoIZeR_Key', None)
- if X:
- ret.append(X)
- else:
- ret.append(str(V))
- return string.join(ret, '|')
-
-def ALT4_MeMoIZeR_gen_key(argtuple, kwdict):
- if kwdict:
- return string.join(map(lambda A:
- getattr(A, '_MeMoIZeR_Key', None) or str(A),
- argtuple) + \
- map(lambda D:
- str(D[0])+
- (getattr(D[1], '_MeMoIZeR_Key', None) or str(D[1])),
- kwdict.items()),
- '|')
- return string.join(map(lambda A:
- getattr(A, '_MeMoIZeR_Key', None) or str(A),
- argtuple),
- '!')
-
-def ALT5_MeMoIZeR_gen_key(argtuple, kwdict):
- A = string.join(map(str, argtuple), '|')
- K = ''
- if kwdict:
- I = map(lambda K,D=kwdict: str(K)+'='+str(D[K]), kwdict.keys())
- K = string.join(I, '|')
- return string.join([A,K], '!')
-
-def ALT6_MeMoIZeR_gen_key(argtuple, kwdict):
- A = string.join(map(str, map(id, argtuple)), '|')
- K = ''
- if kwdict:
- I = map(lambda K,D=kwdict: str(K)+'='+str(id(D[K])), kwdict.keys())
- K = string.join(I, '|')
- return string.join([A,K], '!')
-
-def ALT7_MeMoIZeR_gen_key(argtuple, kwdict):
- A = string.join(map(repr, argtuple), '|')
- K = ''
- if kwdict:
- I = map(lambda K,D=kwdict: repr(K)+'='+repr(D[K]), kwdict.keys())
- K = string.join(I, '|')
- return string.join([A,K], '!')
-
-def ALT8_MeMoIZeR_gen_key(argtuple, kwdict):
- ret = []
- for A in argtuple:
- X = getattr(A, '_MeMoIZeR_Key', None)
- if X:
- ret.append(X)
- else:
- ret.append(repr(A))
- for K,V in kwdict.items():
- ret.append(str(K))
- X = getattr(V, '_MeMoIZeR_Key', None)
- if X:
- ret.append(X)
- else:
- ret.append(repr(V))
- return string.join(ret, '|')
+ try: # Memoization
+ return self._memo['foo'] # Memoization
+ except KeyError: # Memoization
+ pass # Memoization
-def ALT9_MeMoIZeR_gen_key(argtuple, kwdict):
- ret = []
- for A in argtuple:
- try:
- X = A.__dict__.get('_MeMoIZeR_Key', None) or repr(A)
- except (AttributeError, KeyError):
- X = repr(A)
- ret.append(X)
- for K,V in kwdict.items():
- ret.append(str(K))
- ret.append('=')
- try:
- X = V.__dict__.get('_MeMoIZeR_Key', None) or repr(V)
- except (AttributeError, KeyError):
- X = repr(V)
- ret.append(X)
- return string.join(ret, '|')
-
-#_MeMoIZeR_gen_key = ALT9_MeMoIZeR_gen_key # 8.8, 0.20
-_MeMoIZeR_gen_key = ALT8_MeMoIZeR_gen_key # 8.5, 0.18
-#_MeMoIZeR_gen_key = ALT7_MeMoIZeR_gen_key # 8.7, 0.17
-#_MeMoIZeR_gen_key = ALT6_MeMoIZeR_gen_key #
-#_MeMoIZeR_gen_key = ALT5_MeMoIZeR_gen_key # 9.7, 0.20
-#_MeMoIZeR_gen_key = ALT4_MeMoIZeR_gen_key # 8.6, 0.19
-#_MeMoIZeR_gen_key = ALT3_MeMoIZeR_gen_key # 8.5, 0.20
-#_MeMoIZeR_gen_key = ALT2_MeMoIZeR_gen_key # 10.1, 0.22
-#_MeMoIZeR_gen_key = ALT1_MeMoIZeR_gen_key # 8.6 0.18
-
-
-
-## This is really the core worker of the Memoize module. Any
-## __cacheable__ method ends up calling this function which tries to
-## return a previously cached value if it exists, and which calls the
-## actual function and caches the return value if it doesn't already
-## exist.
-##
-## This function should be VERY efficient: it will get called a lot
-## and its job is to be faster than what would be called.
-
-def Memoizer_cache_get(func, cdict, args, kw):
- """Called instead of name to see if this method call's return
- value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return."""
-
- obj = args[0]
-
- ckey = obj._MeMoIZeR_Key + ':' + _MeMoIZeR_gen_key(args, kw)
-
-## try:
-## rval = cdict[ckey]
-## except KeyError:
-## rval = cdict[ckey] = apply(func, args, kw)
-
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = apply(func, args, kw)
-
-## rval = cdict.setdefault(ckey, apply(func, args, kw))
-
-## if cdict.has_key(ckey):
-## rval = cdict[ckey]
-## else:
-## rval = cdict[ckey] = apply(func, args, kw)
-
- return rval
-
-def Memoizer_cache_get_self(func, cdict, self):
- """Called instead of func(self) to see if this method call's
- return value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return.
- Optimized version of Memoizer_cache_get for methods that take the
- object instance as the only argument."""
-
- ckey = self._MeMoIZeR_Key
-
-## try:
-## rval = cdict[ckey]
-## except KeyError:
-## rval = cdict[ckey] = func(self)
-
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = func(self)
-
-## rval = cdict.setdefault(ckey, func(self)))
-
-## if cdict.has_key(ckey):
-## rval = cdict[ckey]
-## else:
-## rval = cdict[ckey] = func(self)
-
- return rval
-
-def Memoizer_cache_get_one(func, cdict, self, arg):
- """Called instead of func(self, arg) to see if this method call's
- return value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return.
- Optimized version of Memoizer_cache_get for methods that take the
- object instance and one other argument only."""
-
-## X = getattr(arg, "_MeMoIZeR_Key", None)
-## if X:
-## ckey = self._MeMoIZeR_Key +':'+ X
-## else:
-## ckey = self._MeMoIZeR_Key +':'+ str(arg)
- ckey = self._MeMoIZeR_Key + ':' + \
- (getattr(arg, "_MeMoIZeR_Key", None) or repr(arg))
-
-## try:
-## rval = cdict[ckey]
-## except KeyError:
-## rval = cdict[ckey] = func(self, arg)
-
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = func(self, arg)
+ result = self.compute_foo_value()
-## rval = cdict.setdefault(ckey, func(self, arg)))
-
-## if cdict.has_key(ckey):
-## rval = cdict[ckey]
-## else:
-## rval = cdict[ckey] = func(self, arg)
+ self._memo['foo'] = result # Memoization
- return rval
+ return result
-#
-# Caching stuff is tricky, because the tradeoffs involved are often so
-# non-obvious, so we're going to support an alternate set of functions
-# that also count the hits and misses, to try to get a concrete idea of
-# which Memoizations seem to pay off.
-#
-# Because different configurations can have such radically different
-# performance tradeoffs, interpreting the hit/miss results will likely be
-# more of an art than a science. In other words, don't assume that just
-# because you see no hits in one configuration that it's not worthwhile
-# Memoizing that method.
-#
-# Note that these are essentially cut-and-paste copies of the above
-# Memozer_cache_get*() implementations, with the addition of the
-# counting logic. If the above implementations change, the
-# corresponding change should probably be made down below as well,
-# just to try to keep things in sync.
-#
+Here is an example of wrapping a method that will return different values
+based on one or more input arguments:
-class CounterEntry:
- def __init__(self):
- self.hit = 0
- self.miss = 0
+ def _bar_key(self, argument): # Memoization
+ return argument # Memoization
-import UserDict
-class Counter(UserDict.UserDict):
- def __call__(self, obj, methname):
- k = obj.__class__.__name__ + '.' + methname
- try:
- return self[k]
- except KeyError:
- c = self[k] = CounterEntry()
- return c
-
-CacheCount = Counter()
-CacheCountSelf = Counter()
-CacheCountOne = Counter()
-
-def Dump():
- items = CacheCount.items() + CacheCountSelf.items() + CacheCountOne.items()
- items.sort()
- for k, v in items:
- print " %7d hits %7d misses %s()" % (v.hit, v.miss, k)
-
-def Count_cache_get(name, func, cdict, args, kw):
- """Called instead of name to see if this method call's return
- value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return."""
-
- obj = args[0]
-
- ckey = obj._MeMoIZeR_Key + ':' + _MeMoIZeR_gen_key(args, kw)
-
- c = CacheCount(obj, name)
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = apply(func, args, kw)
- c.miss = c.miss + 1
- else:
- c.hit = c.hit + 1
-
- return rval
-
-def Count_cache_get_self(name, func, cdict, self):
- """Called instead of func(self) to see if this method call's
- return value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return.
- Optimized version of Memoizer_cache_get for methods that take the
- object instance as the only argument."""
-
- ckey = self._MeMoIZeR_Key
-
- c = CacheCountSelf(self, name)
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = func(self)
- c.miss = c.miss + 1
- else:
- c.hit = c.hit + 1
-
- return rval
-
-def Count_cache_get_one(name, func, cdict, self, arg):
- """Called instead of func(self, arg) to see if this method call's
- return value has been cached. If it has, just return the cached
- value; if not, call the actual method and cache the return.
- Optimized version of Memoizer_cache_get for methods that take the
- object instance and one other argument only."""
-
- ckey = self._MeMoIZeR_Key + ':' + \
- (getattr(arg, "_MeMoIZeR_Key", None) or repr(arg))
-
- c = CacheCountOne(self, name)
- rval = cdict.get(ckey, "_MeMoIZeR")
- if rval is "_MeMoIZeR":
- rval = cdict[ckey] = func(self, arg)
- c.miss = c.miss + 1
- else:
- c.hit = c.hit + 1
-
- return rval
-
-MCG_dict = {
- 'MCG' : Memoizer_cache_get,
- 'MCGS' : Memoizer_cache_get_self,
- 'MCGO' : Memoizer_cache_get_one,
-}
-
-MCG_lambda = "lambda *args, **kw: MCG(methcode, methcached, args, kw)"
-MCGS_lambda = "lambda self: MCGS(methcode, methcached, self)"
-MCGO_lambda = "lambda self, arg: MCGO(methcode, methcached, self, arg)"
-
-def EnableCounting():
- """Enable counting of Memoizer hits and misses by overriding the
- globals that hold the non-counting versions of the functions and
- lambdas we call with the counting versions.
- """
- global MCG_dict
- global MCG_lambda
- global MCGS_lambda
- global MCGO_lambda
+ memoizer_counters.append(SCons.Memoize.CountDict('bar', _bar_key)) # Memoization
- MCG_dict = {
- 'MCG' : Count_cache_get,
- 'MCGS' : Count_cache_get_self,
- 'MCGO' : Count_cache_get_one,
- }
+ def bar(self, argument):
- MCG_lambda = "lambda *args, **kw: MCG(methname, methcode, methcached, args, kw)"
- MCGS_lambda = "lambda self: MCGS(methname, methcode, methcached, self)"
- MCGO_lambda = "lambda self, arg: MCGO(methname, methcode, methcached, self, arg)"
+ memo_key = argument # Memoization
+ try: # Memoization
+ memo_dict = self._memo['bar'] # Memoization
+ except KeyError: # Memoization
+ memo_dict = {} # Memoization
+ self._memo['dict'] = memo_dict # Memoization
+ else: # Memoization
+ try: # Memoization
+ return memo_dict[memo_key] # Memoization
+ except KeyError: # Memoization
+ pass # Memoization
+ result = self.compute_bar_value(argument)
+ memo_dict[memo_key] = result # Memoization
-class _Memoizer_Simple:
+ return result
- def __setstate__(self, state):
- self.__dict__.update(state)
- self.__dict__['_MeMoIZeR_Key'] = Next_Memoize_Key()
- #kwq: need to call original's setstate if it had one...
+At one point we avoided replicating this sort of logic in all the methods
+by putting it right into this module, but we've moved away from that at
+present (see the "Historical Note," below.).
- def _MeMoIZeR_reset(self):
- self.__dict__['_MeMoIZeR_Key'] = Next_Memoize_Key()
- return 1
+Deciding what to cache is tricky, because different configurations
+can have radically different performance tradeoffs, and because the
+tradeoffs involved are often so non-obvious. Consequently, deciding
+whether or not to cache a given method will likely be more of an art than
+a science, but should still be based on available data from this module.
+Here are some VERY GENERAL guidelines about deciding whether or not to
+cache return values from a method that's being called a lot:
+ -- The first question to ask is, "Can we change the calling code
+ so this method isn't called so often?" Sometimes this can be
+ done by changing the algorithm. Sometimes the *caller* should
+ be memoized, not the method you're looking at.
-class _Memoizer_Comparable:
+ -- The memoized function should be timed with multiple configurations
+ to make sure it doesn't inadvertently slow down some other
+ configuration.
- def __setstate__(self, state):
- self.__dict__.update(state)
- self.__dict__['_MeMoIZeR_Key'] = Next_Memoize_Key()
- #kwq: need to call original's setstate if it had one...
+ -- When memoizing values based on a dictionary key composed of
+ input arguments, you don't need to use all of the arguments
+ if some of them don't affect the return values.
- def _MeMoIZeR_reset(self):
- self.__dict__['_MeMoIZeR_Key'] = Next_Memoize_Key()
- return 1
+Historical Note: The initial Memoizer implementation actually handled
+the caching of values for the wrapped methods, based on a set of generic
+algorithms for computing hashable values based on the method's arguments.
+This collected caching logic nicely, but had two drawbacks:
- def __cmp__(self, other):
- """A comparison might use the object dictionaries to
- compare, so the dictionaries should contain caching
- entries. Make new dictionaries without those entries
- to use with the underlying comparison."""
+ Running arguments through a generic key-conversion mechanism is slower
+ (and less flexible) than just coding these things directly. Since the
+ methods that need memoized values are generally performance-critical,
+ slowing them down in order to collect the logic isn't the right
+ tradeoff.
- if self is other:
- return 0
+ Use of the memoizer really obscured what was being called, because
+ all the memoized methods were wrapped with re-used generic methods.
+ This made it more difficult, for example, to use the Python profiler
+ to figure out how to optimize the underlying methods.
+"""
- # We are here as a cached object, but cmp will flip its
- # arguments back and forth and recurse attempting to get base
- # arguments for the comparison, so we might have already been
- # stripped.
+import new
- try:
- saved_d1 = self.__dict__
- d1 = copy.copy(saved_d1)
- del d1['_MeMoIZeR_Key']
- except KeyError:
- return self._MeMoIZeR_cmp(other)
- self.__dict__ = d1
+# A flag controlling whether or not we actually use memoization.
+use_memoizer = None
- # Same thing for the other, but we should try to convert it
- # here in case the _MeMoIZeR_cmp compares __dict__ objects
- # directly.
+CounterList = []
- saved_other = None
- try:
- if other.__dict__.has_key('_MeMoIZeR_Key'):
- saved_other = other.__dict__
- d2 = copy.copy(saved_other)
- del d2['_MeMoIZeR_Key']
- other.__dict__ = d2
- except (AttributeError, KeyError):
- pass
-
- # Both self and other have been prepared: perform the test,
- # then restore the original dictionaries and exit
-
- rval = self._MeMoIZeR_cmp(other)
-
- self.__dict__ = saved_d1
- if saved_other:
- other.__dict__ = saved_other
-
- return rval
-
-
-def Analyze_Class(klass):
- if klass.__dict__.has_key('_MeMoIZeR_converted'): return klass
-
- original_name = str(klass)
-
- D,R,C = _analyze_classmethods(klass.__dict__, klass.__bases__)
-
- if C:
- modelklass = _Memoizer_Comparable
- lcldict = {'_MeMoIZeR_cmp':C}
- else:
- modelklass = _Memoizer_Simple
- lcldict = {}
-
- klass.__dict__.update(memoize_classdict(klass, modelklass, lcldict, D, R))
-
- return klass
-
-
-# Note that each eval("lambda...") has a few \n's prepended to the
-# lambda, and furthermore that each of these evals has a different
-# number of \n's prepended. This is to provide a little bit of info
-# for traceback or profile output, which generate things like 'File
-# "<string>", line X'. X will be the number of \n's plus 1.
-
-# Also use the following routine to specify the "filename" portion so
-# that it provides useful information. In addition, make sure it
-# contains 'os.sep + "SCons" + os.sep' for the
-# SCons.Script.find_deepest_user_frame operation.
-
-def whoami(memoizer_funcname, real_funcname):
- return '...'+os.sep+'SCons'+os.sep+'Memoizer-'+ \
- memoizer_funcname+'-lambda<'+real_funcname+'>'
-
-def memoize_classdict(klass, modelklass, new_klassdict, cacheable, resetting):
- new_klassdict.update(modelklass.__dict__)
- new_klassdict['_MeMoIZeR_converted'] = 1
-
- for name,code in cacheable.items():
- eval_dict = {
- 'methname' : name,
- 'methcode' : code,
- 'methcached' : {},
- }
- eval_dict.update(MCG_dict)
- fc = code.func_code
- if fc.co_argcount == 1 and not fc.co_flags & 0xC:
- compiled = compile("\n"*1 + MCGS_lambda,
- whoami('cache_get_self', name),
- "eval")
- elif fc.co_argcount == 2 and not fc.co_flags & 0xC:
- compiled = compile("\n"*2 + MCGO_lambda,
- whoami('cache_get_one', name),
- "eval")
+class Counter:
+ """
+ Base class for counting memoization hits and misses.
+
+ We expect that the metaclass initialization will have filled in
+ the .name attribute that represents the name of the function
+ being counted.
+ """
+ def __init__(self, method_name):
+ """
+ """
+ self.method_name = method_name
+ self.hit = 0
+ self.miss = 0
+ CounterList.append(self)
+ def display(self):
+ fmt = " %7d hits %7d misses %s()"
+ print fmt % (self.hit, self.miss, self.name)
+ def __cmp__(self, other):
+ return cmp(self.name, other.name)
+
+class CountValue(Counter):
+ """
+ A counter class for simple, atomic memoized values.
+
+ A CountValue object should be instantiated in a class for each of
+ the class's methods that memoizes its return value by simply storing
+ the return value in its _memo dictionary.
+
+ We expect that the metaclass initialization will fill in the
+ .underlying_method attribute with the method that we're wrapping.
+ We then call the underlying_method method after counting whether
+ its memoized value has already been set (a hit) or not (a miss).
+ """
+ def __call__(self, *args, **kw):
+ obj = args[0]
+ if obj._memo.has_key(self.method_name):
+ self.hit = self.hit + 1
else:
- compiled = compile("\n"*3 + MCG_lambda,
- whoami('cache_get', name),
- "eval")
- newmethod = eval(compiled, eval_dict, {})
- new_klassdict[name] = newmethod
-
- for name,code in resetting.items():
- newmethod = eval(
- compile(
- "lambda obj_self, *args, **kw: (obj_self._MeMoIZeR_reset(), apply(rmethcode, (obj_self,)+args, kw))[1]",
- whoami('cache_reset', name),
- 'eval'),
- {'rmethcode':code}, {})
- new_klassdict[name] = newmethod
-
- return new_klassdict
-
-def _analyze_classmethods(klassdict, klassbases):
- """Given a class, performs a scan of methods for that class and
- all its base classes (recursively). Returns aggregated results of
- _scan_classdict calls where subclass methods are superimposed over
- base class methods of the same name (emulating instance->class
- method lookup)."""
-
- D = {}
- R = {}
- C = None
-
- # Get cache/reset/cmp methods from subclasses
-
- for K in klassbases:
- if K.__dict__.has_key('_MeMoIZeR_converted'): continue
- d,r,c = _analyze_classmethods(K.__dict__, K.__bases__)
- D.update(d)
- R.update(r)
- C = c or C
-
- # Delete base method info if current class has an override
-
- for M in D.keys():
- if M == '__cmp__': continue
- if klassdict.has_key(M):
- del D[M]
- for M in R.keys():
- if M == '__cmp__': continue
- if klassdict.has_key(M):
- del R[M]
-
- # Get cache/reset/cmp from current class
-
- d,r,c = _scan_classdict(klassdict)
-
- # Update accumulated cache/reset/cmp methods
-
- D.update(d)
- R.update(r)
- C = c or C
-
- return D,R,C
-
-
-def _scan_classdict(klassdict):
- """Scans the method dictionary of a class to find all methods
- interesting to caching operations. Returns a tuple of these
- interesting methods:
-
- ( dict-of-cachable-methods,
- dict-of-cache-resetting-methods,
- cmp_method_val or None)
-
- Each dict has the name of the method as a key and the corresponding
- value is the method body."""
-
- cache_setters = {}
- cache_resetters = {}
- cmp_if_exists = None
- already_cache_modified = 0
-
- for attr,val in klassdict.items():
- if not callable(val): continue
- if attr == '__cmp__':
- cmp_if_exists = val
- continue # cmp can't be cached and can't reset cache
- if attr == '_MeMoIZeR_cmp':
- already_cache_modified = 1
- continue
- if not val.__doc__: continue
- if string.find(val.__doc__, '__cache_reset__') > -1:
- cache_resetters[attr] = val
- continue
- if string.find(val.__doc__, '__reset_cache__') > -1:
- cache_resetters[attr] = val
- continue
- if string.find(val.__doc__, '__cacheable__') > -1:
- cache_setters[attr] = val
- continue
- if already_cache_modified: cmp_if_exists = 'already_cache_modified'
- return cache_setters, cache_resetters, cmp_if_exists
+ self.miss = self.miss + 1
+ return apply(self.underlying_method, args, kw)
-#
-# Primary Memoizer class. This should be a base-class for any class
-# that wants method call results to be cached. The sub-class should
-# call this parent class's __init__ method, but no other requirements
-# are made on the subclass (other than appropriate decoration).
+class CountDict(Counter):
+ """
+ A counter class for memoized values stored in a dictionary, with
+ keys based on the method's input arguments.
+
+ A CountDict object is instantiated in a class for each of the
+ class's methods that memoizes its return value in a dictionary,
+ indexed by some key that can be computed from one or more of
+ its input arguments.
+
+ We expect that the metaclass initialization will fill in the
+ .underlying_method attribute with the method that we're wrapping.
+ We then call the underlying_method method after counting whether the
+ computed key value is already present in the memoization dictionary
+ (a hit) or not (a miss).
+ """
+ def __init__(self, method_name, keymaker):
+ """
+ """
+ Counter.__init__(self, method_name)
+ self.keymaker = keymaker
+ def __call__(self, *args, **kw):
+ obj = args[0]
+ try:
+ memo_dict = obj._memo[self.method_name]
+ except KeyError:
+ self.miss = self.miss + 1
+ else:
+ key = apply(self.keymaker, args, kw)
+ if memo_dict.has_key(key):
+ self.hit = self.hit + 1
+ else:
+ self.miss = self.miss + 1
+ return apply(self.underlying_method, args, kw)
class Memoizer:
"""Object which performs caching of method calls for its 'primary'
instance."""
def __init__(self):
- self.__class__ = Analyze_Class(self.__class__)
- self._MeMoIZeR_Key = Next_Memoize_Key()
+ pass
+
+# Find out if we support metaclasses (Python 2.2 and later).
-# Find out if we are pre-2.2
+class M:
+ def __init__(cls, name, bases, cls_dict):
+ cls.has_metaclass = 1
+
+class A:
+ __metaclass__ = M
try:
- vinfo = sys.version_info
+ has_metaclass = A.has_metaclass
except AttributeError:
- """Split an old-style version string into major and minor parts. This
- is complicated by the fact that a version string can be something
- like 3.2b1."""
- import re
- version = string.split(string.split(sys.version, ' ')[0], '.')
- vinfo = (int(version[0]), int(re.match('\d+', version[1]).group()))
- del re
-
-need_version = (2, 2) # actual
-#need_version = (33, 0) # always
-#need_version = (0, 0) # never
-
-has_metaclass = (vinfo[0] > need_version[0] or \
- (vinfo[0] == need_version[0] and
- vinfo[1] >= need_version[1]))
+ has_metaclass = None
+
+del M
+del A
if not has_metaclass:
+ def Dump(title):
+ pass
+
class Memoized_Metaclass:
# Just a place-holder so pre-metaclass Python versions don't
# have to have special code for the Memoized classes.
pass
+ def EnableMemoization():
+ import SCons.Warnings
+ msg = 'memoization is not supported in this version of Python (no metaclasses)'
+ raise SCons.Warnings.NoMetaclassSupportWarning, msg
+
else:
- # Initialization is a wee bit of a hassle. We want to do some of
- # our own work for initialization, then pass on to the actual
- # initialization function. However, we have to be careful we
- # don't interfere with (a) the super()'s initialization call of
- # it's superclass's __init__, and (b) classes we are Memoizing
- # that don't have their own __init__ but which have a super that
- # has an __init__. To do (a), we eval a lambda below where the
- # actual init code is locally bound and the __init__ entry in the
- # class's dictionary is replaced with the _MeMoIZeR_init call. To
- # do (b), we use _MeMoIZeR_superinit as a fallback if the class
- # doesn't have it's own __init__. Note that we don't use getattr
- # to obtain the __init__ because we don't want to re-instrument
- # parent-class __init__ operations (and we want to avoid the
- # Object object's slot init if the class has no __init__).
-
- def _MeMoIZeR_init(actual_init, self, args, kw):
- self.__dict__['_MeMoIZeR_Key'] = Next_Memoize_Key()
- apply(actual_init, (self,)+args, kw)
-
- def _MeMoIZeR_superinit(self, cls, args, kw):
- apply(super(cls, self).__init__, args, kw)
+ def Dump(title=None):
+ if title:
+ print title
+ CounterList.sort()
+ for counter in CounterList:
+ counter.display()
class Memoized_Metaclass(type):
def __init__(cls, name, bases, cls_dict):
- # Note that cls_dict apparently contains a *copy* of the
- # attribute dictionary of the class; modifying cls_dict
- # has no effect on the actual class itself.
- D,R,C = _analyze_classmethods(cls_dict, bases)
- if C:
- modelklass = _Memoizer_Comparable
- cls_dict['_MeMoIZeR_cmp'] = C
- else:
- modelklass = _Memoizer_Simple
- klassdict = memoize_classdict(cls, modelklass, cls_dict, D, R)
-
- init = klassdict.get('__init__', None)
- if not init:
- # Make sure filename has os.sep+'SCons'+os.sep so that
- # SCons.Script.find_deepest_user_frame doesn't stop here
- import inspect # It's OK, can't get here for Python < 2.1
- filename = inspect.getsourcefile(_MeMoIZeR_superinit)
- if not filename:
- # This file was compiled at a path name different from
- # how it's invoked now, so just make up something.
- filename = whoami('superinit', '???')
- superinitcode = compile(
- "lambda self, *args, **kw: MPI(self, cls, args, kw)",
- filename,
- "eval")
- superinit = eval(superinitcode,
- {'cls':cls,
- 'MPI':_MeMoIZeR_superinit})
- init = superinit
-
- newinitcode = compile(
- "\n"*(init.func_code.co_firstlineno-1) +
- "lambda self, args, kw: _MeMoIZeR_init(real_init, self, args, kw)",
- whoami('init', init.func_code.co_filename),
- 'eval')
- newinit = eval(newinitcode,
- {'real_init':init,
- '_MeMoIZeR_init':_MeMoIZeR_init},
- {})
- klassdict['__init__'] = lambda self, *args, **kw: newinit(self, args, kw)
-
- super(Memoized_Metaclass, cls).__init__(name, bases, klassdict)
- # Now, since klassdict doesn't seem to have affected the class
- # definition itself, apply klassdict.
- for attr in klassdict.keys():
- setattr(cls, attr, klassdict[attr])
-
-def DisableMemoization():
- global use_memoizer
- use_memoizer = None
-
-def use_old_memoization():
- return use_memoizer and not has_metaclass
+ super(Memoized_Metaclass, cls).__init__(name, bases, cls_dict)
+
+ for counter in cls_dict.get('memoizer_counters', []):
+ method_name = counter.method_name
+
+ counter.name = cls.__name__ + '.' + method_name
+ counter.underlying_method = cls_dict[method_name]
+
+ replacement_method = new.instancemethod(counter, None, cls)
+ setattr(cls, method_name, replacement_method)
+
+ def EnableMemoization():
+ global use_memoizer
+ use_memoizer = 1