Diffstat (limited to 'src/3rdparty/ptmalloc/README')
-rw-r--r-- | src/3rdparty/ptmalloc/README | 186 |
1 files changed, 186 insertions, 0 deletions
diff --git a/src/3rdparty/ptmalloc/README b/src/3rdparty/ptmalloc/README
new file mode 100644
index 0000000..914c745
--- /dev/null
+++ b/src/3rdparty/ptmalloc/README
@@ -0,0 +1,186 @@
+ptmalloc3 - a multi-thread malloc implementation
+================================================
+
+Wolfram Gloger (wg@malloc.de)
+
+Jan 2006
+
+
+Thanks
+======
+
+This release was partly funded by Pixar Animation Studios. I would
+like to thank David Baraff of Pixar for his support and Doug Lea
+(dl@cs.oswego.edu) for the great original malloc implementation.
+
+
+Introduction
+============
+
+This package is a modified version of Doug Lea's malloc-2.8.3
+implementation (available separately from ftp://g.oswego.edu/pub/misc)
+that I adapted for multiple threads, while trying to avoid lock
+contention as much as possible.
+
+As part of the GNU C library, the source files may be available under
+the GNU Library General Public License (see the comments in the
+files). But as part of this stand-alone package, the code is also
+available under the (probably less restrictive) conditions described
+in the file 'COPYRIGHT'. In any case, there is no warranty whatsoever
+for this package.
+
+The current distribution should be available from:
+
+http://www.malloc.de/malloc/ptmalloc3.tar.gz
+
+
+Compilation
+===========
+
+It should be possible to build ptmalloc3 on any UN*X-like system that
+implements the sbrk(), mmap(), munmap() and mprotect() calls. Since
+there are now several source files, a library (libptmalloc3.a) is
+generated. See the Makefile for examples of the compile-time options.
+
+Note that support for non-ANSI compilers is no longer there.
+
+Several example targets are provided in the Makefile:
+
+ o Posix threads (pthreads), compile with "make posix"
+
+ o Posix threads with explicit initialization, compile with
+   "make posix-explicit" (known to be required on HPUX)
+
+ o Posix threads without "tsd data hack" (see below), compile with
+   "make posix-with-tsd"
+
+ o Solaris threads, compile with "make solaris"
+
+ o SGI sproc() threads, compile with "make sproc"
+
+ o no threads, compile with "make nothreads" (currently out of order?)
+
+For Linux:
+
+ o make "linux-pthread" (almost the same as "make posix") or
+   make "linux-shared"
+
+Note that some compilers need special flags for multi-threaded code,
+e.g. with Solaris cc with Posix threads, one should use:
+
+% make posix SYS_FLAGS='-mt'
+
+Some additional targets, ending in `-libc', are also provided in the
+Makefile, to compare performance of the test programs to the case when
+linking with the standard malloc implementation in libc.
+
+A potential problem remains: If any of the system-specific functions
+for getting/setting thread-specific data or for locking a mutex call
+one of the malloc-related functions internally, the implementation
+cannot work at all due to infinite recursion. One example seems to be
+Solaris 2.4. I would like to hear if this problem occurs on other
+systems, and whether similar workarounds could be applied.
+
+For Posix threads, too, an optional hack like that has been integrated
+(activated when defining USE_TSD_DATA_HACK) which depends on
+`pthread_t' being convertible to an integral type (which is of course
+not generally guaranteed). USE_TSD_DATA_HACK is now the default
+because I haven't yet found a non-glibc pthreads system where this
+hack is _not_ needed.
+
+*NEW* and _important_: In (currently) one place in the ptmalloc3
+source, a write memory barrier is needed, named
+atomic_write_barrier(). This macro needs to be defined at the end of
+malloc-machine.h. For gcc, a fallback in the form of a full memory
+barrier is already defined, but you may need to add another definition
+if you don't use gcc.
+
+Usage
+=====
+
+Just link libptmalloc3 into your application.
+
+Some wicked systems (e.g. HPUX apparently) won't let malloc call _any_
+thread-related functions before main(). On these systems,
+USE_STARTER=2 must be defined during compilation (see "make
+posix-explicit" above) and the global initialization function
+ptmalloc_init() must be called explicitly, preferably at the start of
+main().
+
+Otherwise, when using ptmalloc3, no special precautions are necessary.
+
+Link order is important
+=======================
+
+On some systems, when overriding malloc and linking against shared
+libraries, the link order becomes very important. E.g., when linking
+C++ programs on Solaris with Solaris threads [this is probably now
+obsolete], don't rely on libC being included by default, but instead
+put `-lthread' behind `-lC' on the command line:
+
+  CC ... libptmalloc3.a -lC -lthread
+
+This is because there are global constructors in libC that need
+malloc/ptmalloc, which in turn needs to have the thread library to be
+already initialized.
+
+Debugging hooks
+===============
+
+All calls to malloc(), realloc(), free() and memalign() are routed
+through the global function pointers __malloc_hook, __realloc_hook,
+__free_hook and __memalign_hook if they are not NULL (see the malloc.h
+header file for declarations of these pointers). Therefore the malloc
+implementation can be changed at runtime, if care is taken not to call
+free() or realloc() on pointers obtained with a different
+implementation than the one currently in effect. (The easiest way to
+guarantee this is to set up the hooks before any malloc call, e.g.
+with a function pointed to by the global variable
+__malloc_initialize_hook).
+
+You can now also tune other malloc parameters (normally adjusted via
+mallopt() calls from the application) with environment variables:
+
+    MALLOC_TRIM_THRESHOLD_    for deciding to shrink the heap (in bytes)
+
+    MALLOC_GRANULARITY_       The unit for allocating and deallocating
+    MALLOC_TOP_PAD_           memory from the system. The default
+                              is 64k and this parameter _must_ be a
+                              power of 2.
+
+    MALLOC_MMAP_THRESHOLD_    min. size for chunks allocated via
+                              mmap() (in bytes)
+
+Tests
+=====
+
+Two testing applications, t-test1 and t-test2, are included in this
+source distribution. Both perform pseudo-random sequences of
+allocations/frees, and can be given numeric arguments (all arguments
+are optional):
+
+% t-test[12] <n-total> <n-parallel> <n-allocs> <size-max> <bins>
+
+    n-total = total number of threads executed (default 10)
+    n-parallel = number of threads running in parallel (2)
+    n-allocs = number of malloc()'s / free()'s per thread (10000)
+    size-max = max. size requested with malloc() in bytes (10000)
+    bins = number of bins to maintain
+
+The first test `t-test1' maintains a completely separate pool of
+allocated bins for each thread, and should therefore show full
+parallelism. On the other hand, `t-test2' creates only a single pool
+of bins, and each thread randomly allocates/frees any bin. Some lock
+contention is to be expected in this case, as the threads frequently
+cross each other's arenas.
+
+Performance results from t-test1 should be quite repeatable, while the
+behaviour of t-test2 depends on scheduling variations.
+
+Conclusion
+==========
+
+I'm always interested in performance data and feedback; just send mail
+to ptmalloc@malloc.de.
+
+Good luck!