path: root/jemalloc/src/tcache.c
Commit message (Author, Date; Files changed, Lines -/+)
* Dynamically adjust tcache fill count. (Jason Evans, 2011-03-21; 1 file, -2/+3)

  Dynamically adjust tcache fill count (number of objects allocated per
  tcache refill) such that if GC has to flush inactive objects, the fill
  count gradually decreases. Conversely, if refills occur while the fill
  count is depressed, the fill count gradually increases back to its
  maximum value.
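  A minimal sketch of that policy, assuming a power-of-two divisor
  (lg_fill_div and all other names here are illustrative, not
  necessarily jemalloc's actual internals):

      #include <stdbool.h>

      typedef struct {
          unsigned ncached_max; /* Upper bound on cached objects. */
          unsigned lg_fill_div; /* Fill count == ncached_max >> lg_fill_div. */
      } tbin_sketch_t;

      static unsigned
      tbin_fill_count(const tbin_sketch_t *tbin)
      {
          return tbin->ncached_max >> tbin->lg_fill_div;
      }

      /* Called once per GC pass (hypothetical hook). */
      static void
      tbin_adjust(tbin_sketch_t *tbin, bool gc_flushed_inactive, bool refilled)
      {
          if (gc_flushed_inactive) {
              /* GC had to flush idle objects: halve the fill count. */
              if (tbin_fill_count(tbin) > 1)
                  tbin->lg_fill_div++;
          } else if (refilled && tbin->lg_fill_div > 0) {
              /* Refills while depressed: grow back toward the maximum. */
              tbin->lg_fill_div--;
          }
      }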
* Use bitmaps to track small regions. (Jason Evans, 2011-03-17; 1 file, -43/+86)

  The previous free list implementation, which embedded singly linked
  lists in available regions, had the unfortunate side effect of causing
  many cache misses during thread cache fills. Fix this in two places:

  - arena_run_t: Use a new bitmap implementation to track which regions
    are available. Furthermore, revert to preferring the lowest
    available region (as jemalloc did with its old bitmap-based
    approach).

  - tcache_t: Move read-only tcache_bin_t metadata into
    tcache_bin_info_t, and add a contiguous array of pointers to
    tcache_t in order to track cached objects. This substantially
    increases the size of tcache_t, but results in much higher data
    locality for common tcache operations. As a side benefit, it is
    again possible to efficiently flush the least recently used cached
    objects, so this change switches flushing from MRU to LRU.

  The new bitmap implementation uses a multi-level summary approach to
  make finding the lowest available region very fast. In practice,
  bitmaps only have one or two levels, though the implementation is
  general enough to handle extremely large bitmaps, mainly so that large
  page sizes can still be entertained.

  Fix tcache_bin_flush_large() to always flush statistics, in the same
  way that tcache_bin_flush_small() was recently fixed.

  Use JEMALLOC_DEBUG rather than NDEBUG. Add dassert(), and use it for
  debug-only asserts.
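  To make the summary idea concrete, here is a self-contained two-level
  toy (not jemalloc's bitmap.c; names and sizes are invented): a summary
  word records which group words contain set bits, so the lowest
  available region is found with two find-first-set operations instead
  of a linear scan.

      #include <stdint.h>

      #define NGROUPS 64 /* 64 groups x 64 bits = up to 4096 regions. */

      typedef struct {
          uint64_t summary;         /* Bit g set iff groups[g] != 0. */
          uint64_t groups[NGROUPS]; /* One bit per region; set == available. */
      } bitmap2_t;

      static void
      bitmap2_set(bitmap2_t *b, unsigned bit)
      {
          unsigned g = bit / 64;
          b->groups[g] |= (uint64_t)1 << (bit % 64);
          b->summary |= (uint64_t)1 << g;
      }

      static void
      bitmap2_clear(bitmap2_t *b, unsigned bit)
      {
          unsigned g = bit / 64;
          b->groups[g] &= ~((uint64_t)1 << (bit % 64));
          if (b->groups[g] == 0)
              b->summary &= ~((uint64_t)1 << g);
      }

      /* Lowest available region, or -1 (uses a GCC/Clang builtin). */
      static int
      bitmap2_ffu(const bitmap2_t *b)
      {
          if (b->summary == 0)
              return -1;
          unsigned g = (unsigned)__builtin_ctzll(b->summary);
          return (int)(g * 64 + (unsigned)__builtin_ctzll(b->groups[g]));
      }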
* Create arena_bin_info_t. (Jason Evans, 2011-03-15; 1 file, -3/+3)

  Move read-only fields from arena_bin_t into arena_bin_info_t,
  primarily in order to avoid false cacheline sharing.
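  The shape of such a split, sketched with simplified, hypothetical
  fields: immutable per-size-class parameters live in a shared const
  table, so the frequently written, lock-protected bin state no longer
  shares cachelines with data every thread reads.

      #include <stddef.h>
      #include <stdint.h>

      typedef struct {
          size_t reg_size;   /* Region size served by this bin. */
          size_t run_size;   /* Total run size. */
          uint32_t nregs;    /* Regions per run. */
      } bin_info_sketch_t;   /* One const table entry per size class. */

      typedef struct {
          void *runcur;                /* Current run; mutated often. */
          uint64_t nmalloc, ndalloc;   /* Stats; mutated often. */
      } bin_state_sketch_t;            /* Per arena, per size class. */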
* Fix a thread cache stats merging bug. (Jason Evans, 2011-03-14; 1 file, -0/+19)

  When a thread cache flushes objects to their arenas due to an
  abundance of cached objects, it merges the allocation request count
  for the associated size class, and increments a flush counter. If none
  of the flushed objects came from the thread's assigned arena, then the
  merging wouldn't happen (though the counter would typically eventually
  be merged), nor would the flush counter be incremented (a hard bug).
  Fix this via extra conditional code just after the flush loop.
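  A compilable sketch of that fix (all types and fields here are
  hypothetical stand-ins for the real tcache/arena structures):

      #include <stdbool.h>
      #include <stdint.h>

      typedef struct { uint64_t nrequests, nflushes; } bin_stats_t;
      typedef struct { bin_stats_t stats; } arena_t;
      typedef struct { arena_t *arena; uint64_t nrequests; } tcache_sketch_t;

      /* Flush batches owned by arenas[0..nbatches); the fix is the
       * post-loop merge for the case where no batch belonged to
       * tcache->arena. */
      static void
      flush_sketch(tcache_sketch_t *tcache, arena_t **arenas,
          unsigned nbatches)
      {
          bool merged_stats = false;
          for (unsigned i = 0; i < nbatches; i++) {
              if (arenas[i] == tcache->arena) {
                  arenas[i]->stats.nflushes++;
                  arenas[i]->stats.nrequests += tcache->nrequests;
                  tcache->nrequests = 0;
                  merged_stats = true;
              }
              /* ... return this batch's objects to arenas[i] ... */
          }
          if (!merged_stats) {
              /* Previously skipped entirely: the hard bug. */
              tcache->arena->stats.nflushes++;
              tcache->arena->stats.nrequests += tcache->nrequests;
              tcache->nrequests = 0;
          }
      }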
* Replace JEMALLOC_OPTIONS with MALLOC_CONF. (Jason Evans, 2010-10-24; 1 file, -6/+6)

  Replace the single-character run-time flags with key/value pairs,
  which can be set via the malloc_conf global, /etc/malloc.conf, and the
  MALLOC_CONF environment variable.

  Replace the JEMALLOC_PROF_PREFIX environment variable with the
  "opt.prof_prefix" option.

  Replace umax2s() with u2s().
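  For example, the compiled-in global takes a comma-separated option
  string (the specific keys shown are illustrative; valid keys depend on
  the jemalloc version):

      /* jemalloc parses this global before the first allocation. */
      const char *malloc_conf = "tcache:false,prof_prefix:jeprof.out";

      /* The same string can come from the environment instead:
       *   MALLOC_CONF="tcache:false,prof_prefix:jeprof.out" ./a.out
       * or from the target of the /etc/malloc.conf symlink. */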
* Use offsetof() when sizing dynamic structures. (Jason Evans, 2010-10-02; 1 file, -1/+1)

  Base dynamic structure size on offsetof(), rather than subtracting the
  size of the dynamic structure member. Results could differ on systems
  with strict data structure alignment requirements.
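  A standalone illustration of why the two differ (the struct is made
  up): trailing padding is counted by sizeof() but not by offsetof(), so
  the subtraction over-allocates whenever the dynamic member is followed
  by alignment padding.

      #include <stddef.h>
      #include <stdio.h>

      typedef struct {
          char tag;     /* Padding follows before `value`. */
          double value;
          char tail[1]; /* Dynamically sized in practice. */
      } demo_t;

      int
      main(void)
      {
          size_t n = 100; /* Desired tail length. */
          size_t by_subtract = sizeof(demo_t)
              - sizeof(((demo_t *)0)->tail) + n;
          size_t by_offsetof = offsetof(demo_t, tail) + n;
          /* Typically prints 123 vs. 116 on LP64: sizeof(demo_t) is 24
           * (with trailing padding), but tail starts at offset 16. */
          printf("subtract: %zu, offsetof: %zu\n", by_subtract, by_offsetof);
          return 0;
      }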
* Omit chunk header in arena chunk map. (Jason Evans, 2010-10-02; 1 file, -6/+6)

  Omit the first map_bias elements of the map in arena_chunk_t. This
  avoids barely spilling over into an extra chunk header page for common
  chunk sizes.
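  The indexing discipline this implies, in sketch form (names and the
  bias value are invented): the map needs one entry per page, but the
  first map_bias pages hold the chunk header itself and are never handed
  out, so their entries are not stored and every lookup subtracts the
  bias.

      #include <stddef.h>

      enum { MAP_BIAS = 1 }; /* Computed from header size in practice. */

      typedef struct { unsigned bits; } map_elm_t;

      typedef struct {
          /* ... chunk header fields ... */
          map_elm_t map[1]; /* Really (npages - MAP_BIAS) entries. */
      } chunk_sketch_t;

      static map_elm_t *
      chunk_map_get(chunk_sketch_t *chunk, size_t pageind)
      {
          /* pageind >= MAP_BIAS by construction. */
          return &chunk->map[pageind - MAP_BIAS];
      }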
* Add {,r,s,d}allocm(). (Jason Evans, 2010-09-17; 1 file, -1/+3)

  Add allocm(), rallocm(), sallocm(), and dallocm(), which are a
  functional superset of malloc(), calloc(), posix_memalign(),
  malloc_usable_size(), and free().
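  A usage sketch of the shape of this API; the prototypes below follow
  the manual of this era, but treat the exact header, flags, and return
  codes as version-dependent.

      #include <stddef.h>

      int allocm(void **ptr, size_t *rsize, size_t size, int flags);
      int sallocm(const void *ptr, size_t *rsize, int flags);
      int dallocm(void *ptr, int flags);

      int
      demo(void)
      {
          void *p;
          size_t usable;

          /* flags can request alignment or zeroing (e.g. ALLOCM_ZERO);
           * 0 is the plain case, and a 0 return signals success. */
          if (allocm(&p, &usable, 4096, 0) != 0)
              return -1;
          sallocm(p, &usable, 0); /* Like malloc_usable_size(). */
          return dallocm(p, 0);
      }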
* Port to Mac OS X. (Jason Evans, 2010-09-12; 1 file, -6/+20)

  Add Mac OS X support, based in large part on the OS X support in
  Mozilla's version of jemalloc.
* Fix tcache crash during thread cleanup. (Jason Evans, 2010-04-14; 1 file, -14/+12)

  Properly maintain tcache_bin_t's avail pointer such that it is NULL if
  no objects are cached. This only caused problems during thread cache
  destruction, since cache flushing otherwise never occurs on an empty
  bin.
* Track dirty and clean runs separately. (Jason Evans, 2010-03-19; 1 file, -2/+2)

  Split arena->runs_avail into arena->runs_avail_{clean,dirty}, and
  preferentially allocate dirty runs.
* Remove medium size classes. (Jason Evans, 2010-03-17; 1 file, -12/+131)

  Remove medium size classes, because concurrent dirty page purging is
  no longer capable of purging inactive dirty pages inside active runs
  (due to recent arena/bin locking changes).

  Enhance tcache to support caching large objects, so that the same
  range of size classes is still cached, despite the removal of medium
  size class support.
* Fix a run initialization race condition. (Jason Evans, 2010-03-16; 1 file, -6/+7)

  Initialize the small run header before dropping arena->lock;
  arena_chunk_purge() relies on valid small run headers during run
  iteration.

  Add some assertions.
* Push locks into arena bins. (Jason Evans, 2010-03-15; 1 file, -45/+36)

  For bin-related allocation, protect data structures with bin locks
  rather than arena locks. Arena locks remain for run
  allocation/deallocation and other miscellaneous operations.

  Restructure statistics counters to maintain per bin
  allocated/nmalloc/ndalloc, but continue to provide arena-wide
  statistics via aggregation in the ctl code.
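  Structurally, the split looks roughly like this (layouts simplified
  and the bin count invented):

      #include <pthread.h>
      #include <stdint.h>

      typedef struct {
          pthread_mutex_t lock; /* Protects runcur and this bin's stats. */
          void *runcur;         /* Current run for the size class. */
          uint64_t allocated, nmalloc, ndalloc; /* Aggregated by ctl. */
      } bin_lock_sketch_t;

      typedef struct {
          pthread_mutex_t lock;       /* Run alloc/dealloc, misc ops. */
          bin_lock_sketch_t bins[32]; /* One lock per small size class. */
      } arena_lock_sketch_t;

  Contention on a hot size class then serializes only the threads using
  that bin, not the whole arena.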
* Simplify tcache object caching. (Jason Evans, 2010-03-14; 1 file, -120/+77)

  Use chains of cached objects, rather than using arrays of pointers.

  Since tcache_bin_t is no longer dynamically sized, convert tcache_t's
  tbin to an array of structures, rather than an array of pointers. This
  implicitly removes tcache_bin_{create,destroy}(), which further
  simplifies the fast path for malloc/free.

  Use cacheline alignment for tcache_t allocations.

  Remove runtime configuration option for number of tcache bin slots,
  and replace it with a boolean option for enabling/disabling tcache.

  Limit the number of tcache objects to the lesser of TCACHE_NSLOTS_MAX
  and 2X the number of regions per run for the size class.

  For GC-triggered flush, discard 3/4 of the objects below the low water
  mark, rather than 1/2.
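  The chaining and the GC policy, sketched with hypothetical types: a
  freed object's own first word links to the next cached object, so no
  side array of pointers is needed.

      typedef struct tcache_obj_s {
          struct tcache_obj_s *next; /* Stored inside the cached object. */
      } tcache_obj_t;

      typedef struct {
          tcache_obj_t *head; /* Chain of cached objects. */
          unsigned ncached;
          unsigned low_water; /* Fewest ncached since the last GC pass. */
      } tbin_chain_sketch_t;

      /* Objects a GC pass discards: 3/4 of those below low water. */
      static unsigned
      tbin_gc_nflush(const tbin_chain_sketch_t *tbin)
      {
          return tbin->low_water - (tbin->low_water / 4);
      }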
* Simplify malloc_message(). (Jason Evans, 2010-03-04; 1 file, -2/+2)

  Rather than passing four strings to malloc_message(), malloc_write4(),
  and all the functions that use them, only pass one string.
* Restructure source tree. (Jason Evans, 2010-02-11; 1 file, -0/+335)