path: root/src/jemalloc.c
Commit log (subject; author, date; files changed, lines -/+):
* Fix rallocx() sampling code to not eagerly commit sampler update. (Jason Evans, 2016-06-08; 1 file, -3/+3)
  rallocx() for an alignment-constrained request may end up with a smaller-than-worst-case size if in-place reallocation succeeds due to serendipitous alignment. In such cases, sampling may not happen.
* Fix a Valgrind regression in calloc(). (Elliot Ronaghan, 2016-06-07; 1 file, -1/+1)
  This regression was caused by 3ef51d7f733ac6432e80fa902a779ab5b98d74f6 (Optimize the fast paths of calloc() and [m,d,sd]allocx().).
* Resolve bootstrapping issues when embedded in FreeBSD libc. (Jason Evans, 2016-05-11; 1 file, -244/+270)
  b2c0d6322d2307458ae2b28545f8a5c9903d7ef5 (Add witness, a simple online locking validator.) caused a broad propagation of tsd throughout the internal API, but tsd_fetch() was designed to fail prior to tsd bootstrapping. Fix this by splitting tsd_t into non-nullable tsd_t and nullable tsdn_t, and modifying all internal APIs that do not critically rely on tsd to take nullable pointers. Furthermore, add the tsd_booted_get() function so that tsdn_fetch() can probe whether tsd bootstrapping is complete and return NULL if not. All dangerous conversions of nullable pointers are tsdn_tsd() calls that assert-fail on invalid conversion.
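  In outline, the split works as below; this is a simplified sketch (the real tsdn_t is a distinct type defined in jemalloc's internal headers), not the actual implementation:

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      typedef struct tsd_s tsd_t;   /* non-nullable once bootstrapped */
      typedef struct tsd_s tsdn_t;  /* nullable before bootstrapping */

      extern bool tsd_booted;       /* set once bootstrapping completes */
      tsd_t *tsd_fetch(void);       /* must not run before bootstrapping */

      static bool
      tsd_booted_get(void) {
          return tsd_booted;
      }

      /* Nullable fetch: probe bootstrapping state instead of failing. */
      static tsdn_t *
      tsdn_fetch(void) {
          if (!tsd_booted_get())
              return NULL;
          return (tsdn_t *)tsd_fetch();
      }

      /* The only dangerous conversion; assert-fails on NULL. */
      static tsd_t *
      tsdn_tsd(tsdn_t *tsdn) {
          assert(tsdn != NULL);
          return (tsd_t *)tsdn;
      }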
* Fix tsd bootstrapping for a0malloc(). (Jason Evans, 2016-05-07; 1 file, -27/+31)
* Optimize the fast paths of calloc() and [m,d,sd]allocx(). (Jason Evans, 2016-05-06; 1 file, -186/+114)
  This is a broader application of optimizations to malloc() and free() in f4a0f32d340985de477bbe329ecdaecd69ed1055 (Fast-path improvement: reduce # of branches and unnecessary operations.). This resolves #321.
* Modify pages_map() to support mapping uncommitted virtual memory. (Jason Evans, 2016-05-06; 1 file, -0/+1)
  If the OS overcommits:
  - Commit all mappings in pages_map() regardless of whether the caller requested committed memory.
  - Linux-specific: Specify MAP_NORESERVE to avoid unfortunate interactions with heuristic overcommit mode during fork(2).
  This resolves #193.
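  A rough illustration of that policy, assuming a Linux host (os_overcommits stands in for however overcommit is detected; this is not the actual pages_map() code):

      #include <stdbool.h>
      #include <stddef.h>
      #include <sys/mman.h>

      extern bool os_overcommits;  /* hypothetical: OS overcommit detected */

      static void *
      map_pages(size_t size, bool *commit) {
          int prot = (*commit || os_overcommits) ?
              (PROT_READ | PROT_WRITE) : PROT_NONE;
          int flags = MAP_PRIVATE | MAP_ANONYMOUS;
      #ifdef MAP_NORESERVE
          if (os_overcommits) {
              /* Avoid heuristic-overcommit trouble during fork(2). */
              flags |= MAP_NORESERVE;
              *commit = true;  /* mappings are committed regardless */
          }
      #endif
          void *p = mmap(NULL, size, prot, flags, -1, 0);
          return (p == MAP_FAILED) ? NULL : p;
      }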
* Fix witness/fork() interactions. (Jason Evans, 2016-04-26; 1 file, -3/+3)
  Fix witness to clear its list of owned mutexes in the child if platform-specific malloc_mutex code re-initializes mutexes rather than unlocking them.
* Fix fork()-related lock rank ordering reversals. (Jason Evans, 2016-04-26; 1 file, -12/+29)
* Fix arena_choose_hard() regression. (Jason Evans, 2016-04-23; 1 file, -1/+1)
  This regression was caused by 66cd953514a18477eb49732e40d5c2ab5f1b12c5 (Do not allocate metadata via non-auto arenas, nor tcaches.).
* Do not allocate metadata via non-auto arenas, nor tcaches. (Jason Evans, 2016-04-22; 1 file, -37/+75)
  This ensures that all internally allocated metadata comes from the first opt_narenas arenas, i.e. the automatically multiplexed arenas.
* Add witness, a simple online locking validator. (Jason Evans, 2016-04-14; 1 file, -147/+230)
  This resolves #358.
* Fix a potential tsd cleanup leak. (Jason Evans, 2016-02-28; 1 file, -0/+3)
  Prior to 767d85061a6fb88ec977bbcd9b429a43aff391e6 (Refactor arenas array (fixes deadlock).), it was possible under some circumstances for arena_get() to trigger recreation of the arenas cache during tsd cleanup, and the arenas cache would then be leaked. In principle a similar issue could still occur as a side effect of decay-based purging, which calls arena_tdata_get(). Fix arenas_tdata_cleanup() by setting tsd->arenas_tdata_bypass to true, so that arena_tdata_get() will gracefully fail (an expected behavior) rather than recreating tsd->arena_tdata. Reported by Christopher Ferris <cferris@google.com>.
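  The bypass pattern in miniature (the type and field layout here are assumptions drawn from the message above, not jemalloc's definitions):

      #include <stdbool.h>
      #include <stdlib.h>

      typedef struct {
          bool arenas_tdata_bypass;
          void *arenas_tdata;
      } tsd_t;

      void
      arenas_tdata_cleanup(tsd_t *tsd) {
          /* Later arena_tdata_get() calls now fail gracefully instead of
           * recreating (and leaking) tsd->arenas_tdata. */
          tsd->arenas_tdata_bypass = true;
          free(tsd->arenas_tdata);
          tsd->arenas_tdata = NULL;
      }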
* Add more HUGE_MAXCLASS overflow checks. (Jason Evans, 2016-02-26; 1 file, -23/+34)
  Add HUGE_MAXCLASS overflow checks that are specific to heap profiling code paths. This fixes test failures that were introduced by 0c516a00c4cb28cff55ce0995f756b5aae074c9e (Make *allocx() size class overflow behavior defined.).
* Make *allocx() size class overflow behavior defined. (Jason Evans, 2016-02-25; 1 file, -24/+44)
  Limit supported size and alignment to HUGE_MAXCLASS, which in turn is now limited to be less than PTRDIFF_MAX. This resolves #278 and #295.
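  The limit amounts to a simple entry-point check, sketched here with a placeholder HUGE_MAXCLASS (allocx_size_ok is an illustrative helper, not jemalloc code):

      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Placeholder: the real value comes from the size-class tables, but
       * is guaranteed to be less than PTRDIFF_MAX. */
      #define HUGE_MAXCLASS ((size_t)PTRDIFF_MAX - 4095)

      static bool
      allocx_size_ok(size_t size, size_t alignment) {
          if (size == 0 || size > HUGE_MAXCLASS)
              return false;  /* overflow behavior is defined: fail */
          if (alignment != 0 && alignment > HUGE_MAXCLASS)
              return false;
          return true;
      }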
* Refactor arenas array (fixes deadlock). (Jason Evans, 2016-02-25; 1 file, -151/+90)
  Refactor the arenas array, which contains pointers to all extant arenas, such that it starts out as a sparse array of maximum size, and use double-checked atomics-based reads as the basis for a fast and simple arena_get(). Additionally, reduce arenas_lock's role such that it only protects against arena initialization races. These changes remove the possibility for arena lookups to trigger locking, which resolves at least one known (fork-related) deadlock. This resolves #315.
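  A sketch of the double-checked read using C11 atomics (jemalloc uses its own atomics shims; the names and the maximum count here are illustrative):

      #include <stdatomic.h>
      #include <stdbool.h>
      #include <stddef.h>

      #define ARENAS_MAX 4096  /* placeholder maximum arena count */

      typedef struct arena_s arena_t;

      /* Sparse, maximum-size array; slots are populated on demand. */
      static _Atomic(arena_t *) arenas[ARENAS_MAX];

      arena_t *arena_init_locked(unsigned ind);  /* re-checks under arenas_lock */

      static arena_t *
      arena_get(unsigned ind, bool init_if_missing) {
          /* Fast path: one atomic load, no locking. */
          arena_t *ret = atomic_load_explicit(&arenas[ind],
              memory_order_acquire);
          if (ret == NULL && init_if_missing)
              ret = arena_init_locked(ind);  /* slow path takes the lock */
          return ret;
      }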
* Silence miscellaneous 64-to-32-bit data loss warnings. (Jason Evans, 2016-02-24; 1 file, -1/+1)
* Use ssize_t for readlink() rather than int. (Jason Evans, 2016-02-24; 1 file, -1/+1)
* Make opt_narenas unsigned rather than size_t. (Jason Evans, 2016-02-24; 1 file, -8/+12)
* Refactor time_* into nstime_*. (Jason Evans, 2016-02-22; 1 file, -1/+1)
  Use a single uint64_t in nstime_t to store nanoseconds rather than using struct timespec. This reduces fragility around conversions between long and uint64_t, especially missing casts that only cause problems on 32-bit platforms.
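  The representation in outline (a sketch with an assumed conversion helper; the real nstime API carries many more operations):

      #include <stdint.h>
      #include <time.h>

      typedef struct {
          uint64_t ns;  /* nanoseconds since an arbitrary epoch */
      } nstime_t;

      static void
      nstime_init_timespec(nstime_t *time, const struct timespec *ts) {
          /* All arithmetic is explicitly 64-bit; no long/uint64_t mixing. */
          time->ns = (uint64_t)ts->tv_sec * 1000000000ULL +
              (uint64_t)ts->tv_nsec;
      }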
* Implement decay-based unused dirty page purging. (Jason Evans, 2016-02-20; 1 file, -11/+42)
  This is an alternative to the existing ratio-based unused dirty page purging, and is intended to eventually become the sole purging mechanism.
  Add mallctls:
  - opt.purge
  - opt.decay_time
  - arena.<i>.decay
  - arena.<i>.decay_time
  - arenas.decay_time
  - stats.arenas.<i>.decay_time
  This resolves #325.
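  The new knobs are reachable through the usual mallctl() interface; a sketch assuming a build without a symbol prefix and an ssize_t-typed decay time in seconds:

      #include <jemalloc/jemalloc.h>
      #include <stdio.h>
      #include <sys/types.h>

      int
      main(void) {
          ssize_t decay_time;
          size_t sz = sizeof(decay_time);

          /* Read the compiled-in default (read-only). */
          if (mallctl("opt.decay_time", &decay_time, &sz, NULL, 0) == 0)
              printf("default decay time: %zd s\n", decay_time);

          /* Set a 10-second decay time as the default for arenas. */
          decay_time = 10;
          if (mallctl("arenas.decay_time", NULL, NULL, &decay_time,
              sizeof(decay_time)) != 0)
              fprintf(stderr, "arenas.decay_time not available\n");
          return 0;
      }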
* Refactor arenas_cache tsd. (Jason Evans, 2016-02-20; 1 file, -62/+87)
  Refactor arenas_cache tsd into arenas_tdata, which is a structure of type arena_tdata_t.
* Add --with-malloc-conf. (Jason Evans, 2016-02-20; 1 file, -3/+6)
  Add --with-malloc-conf, which makes it possible to embed a default options string during configuration.
* Call malloc_tsd_boot0() from malloc_init_hard_recursible(). (Cosmin Paraschiv, 2016-01-11; 1 file, -5/+16)
  When using LinuxThreads, malloc bootstrapping deadlocks, since malloc_tsd_boot0() ends up calling pthread_setspecific(), which causes recursive allocation. Fix it by moving the malloc_tsd_boot0() call to malloc_init_hard_recursible(). The deadlock was introduced by 8bb3198f72fc7587dc93527f9f19fb5be52fa553 (Refactor/fix arenas manipulation.), when tsd_boot() was split and the top half, tsd_boot0(), got an extra tsd_wrapper_set() call.
* Fast-path improvement: reduce # of branches and unnecessary operations. (Qi Wang, 2015-11-10; 1 file, -53/+133)
  - Combine multiple runtime branches into a single malloc_slow check.
  - Avoid calling arena_choose / size2index / index2size on the fast path.
  - A few micro-optimizations.
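  The single-check idea, sketched with illustrative flag names (not jemalloc's exact option set) and assuming a GCC-compatible compiler:

      #include <stdbool.h>
      #include <stddef.h>

      #define unlikely(x) __builtin_expect(!!(x), 0)

      /* Options that force the slow path; detection code not shown. */
      extern bool opt_junk, opt_zero, opt_utrace, opt_xmalloc, in_valgrind;

      static bool malloc_slow;  /* computed once at boot */

      static void
      malloc_slow_flag_init(void) {
          malloc_slow = opt_junk || opt_zero || opt_utrace || opt_xmalloc ||
              in_valgrind;
      }

      void *malloc_fast_path(size_t size);
      void *malloc_slow_path(size_t size);

      void *
      je_malloc_sketch(size_t size) {
          /* One predictable branch replaces several per-option branches. */
          if (unlikely(malloc_slow))
              return malloc_slow_path(size);
          return malloc_fast_path(size);
      }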
* Add mallocx() OOM tests. (Jason Evans, 2015-09-17; 1 file, -0/+2)
* Simplify imallocx_prof_sample(). (Jason Evans, 2015-09-17; 1 file, -26/+13)
  Simplify imallocx_prof_sample() to always operate on usize rather than sometimes using size. This avoids redundant usize computations and more closely fits the style adopted by i[rx]allocx_prof_sample() to fix sampling bugs.
* Fix irallocx_prof_sample(). (Jason Evans, 2015-09-17; 1 file, -5/+5)
  Fix irallocx_prof_sample() to always allocate large regions, even when alignment is non-zero.
* Fix ixallocx_prof_sample(). (Jason Evans, 2015-09-17; 1 file, -17/+4)
  Fix ixallocx_prof_sample() to never modify nor create sampled small allocations. xallocx() is in general incapable of moving small allocations, so this fix removes buggy code without loss of generality.
* Centralize xallocx() size[+extra] overflow checks. (Jason Evans, 2015-09-15; 1 file, -7/+11)
* Fix ixallocx_prof() to check for size greater than HUGE_MAXCLASS. (Jason Evans, 2015-09-15; 1 file, -1/+5)
* Resolve an unsupported special case in arena_prof_tctx_set(). (Jason Evans, 2015-09-15; 1 file, -3/+3)
  Add arena_prof_tctx_reset() and use it instead of arena_prof_tctx_set() when resetting the tctx pointer during reallocation, which happens whenever an originally sampled reallocated object is not sampled during reallocation. This regression was introduced by 594c759f37c301d0245dc2accf4d4aaf9d202819 (Optimize arena_prof_tctx_set().).
* Fix ixallocx_prof_sample() argument order reversal. (Jason Evans, 2015-09-15; 1 file, -1/+1)
  Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample() in the correct order.
* s/max_usize/usize_max/g (Jason Evans, 2015-09-15; 1 file, -6/+6)
* s/oldptr/old_ptr/g (Jason Evans, 2015-09-15; 1 file, -15/+15)
* Make one call to prof_active_get_unlocked() per allocation event. (Jason Evans, 2015-09-15; 1 file, -10/+19)
  Make one call to prof_active_get_unlocked() per allocation event, and use the result throughout the relevant functions that handle an allocation event. Also add a missing check in prof_realloc(). These fixes protect allocation events against concurrent prof_active changes.
* Fix irealloc_prof() to prof_alloc_rollback() on OOM. (Jason Evans, 2015-09-15; 1 file, -1/+3)
* Optimize irallocx_prof() to optimistically update the sampler state. (Jason Evans, 2015-09-15; 1 file, -3/+3)
* Fix ixallocx_prof() size+extra overflow. (Jason Evans, 2015-09-15; 1 file, -0/+3)
  Fix ixallocx_prof() to clamp the extra parameter if size+extra would overflow HUGE_MAXCLASS.
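  The clamp is one line of arithmetic; a sketch with a placeholder HUGE_MAXCLASS (clamp_extra is an illustrative helper, assuming size <= HUGE_MAXCLASS has already been validated):

      #include <stddef.h>
      #include <stdint.h>

      #define HUGE_MAXCLASS ((size_t)PTRDIFF_MAX - 4095)  /* placeholder */

      static size_t
      clamp_extra(size_t size, size_t extra) {
          /* Shrink extra so that size + extra cannot exceed HUGE_MAXCLASS. */
          if (extra > HUGE_MAXCLASS - size)
              extra = HUGE_MAXCLASS - size;
          return extra;
      }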
* Force initialization of the init_lock in malloc_init_hard on Windows XP. (Mike Hommey, 2015-09-04; 1 file, -1/+15)
  This resolves #269.
* Fix arenas_cache_cleanup() and arena_get_hard(). (Jason Evans, 2015-08-28; 1 file, -6/+5)
  Fix arenas_cache_cleanup() and arena_get_hard() to handle allocation/deallocation within the application's thread-specific data cleanup functions even after arenas_cache is torn down. This is a more general fix that complements 45e9f66c280e1ba8bebf7bed387a43bc9e45536d (Fix arenas_cache_cleanup().).
* Fix arenas_cache_cleanup(). (Christopher Ferris, 2015-08-21; 1 file, -1/+5)
  Fix arenas_cache_cleanup() to handle allocation/deallocation within the application's thread-specific data cleanup functions even after arenas_cache is torn down.
* MSVC compatibility changes. (Matthijs, 2015-08-04; 1 file, -8/+16)
  - Decorate public functions with __declspec(allocator) and __declspec(restrict), as MSVC 1900 does.
  - Support JEMALLOC_HAS_RESTRICT by defining the restrict keyword.
  - Move __declspec(nothrow) between 'void' and '*' so it compiles once more.
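  What a decorated declaration looks like (the JE_* macro names are illustrative, not jemalloc's actual macros):

      #include <stddef.h>

      #if defined(_MSC_VER) && _MSC_VER >= 1900
      #  define JE_ALLOCATOR    __declspec(allocator)
      #  define JE_RESTRICT_RET __declspec(restrict)
      #else
      #  define JE_ALLOCATOR
      #  define JE_RESTRICT_RET
      #endif
      #if defined(_MSC_VER)
      #  define JE_NOTHROW __declspec(nothrow)
      #else
      #  define JE_NOTHROW
      #endif

      /* Note __declspec(nothrow) sitting between 'void' and '*'. */
      JE_ALLOCATOR JE_RESTRICT_RET void JE_NOTHROW *je_malloc(size_t size);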
* Move JEMALLOC_NOTHROW just after return type. (Jason Evans, 2015-07-21; 1 file, -36/+27)
  Only use __declspec(nothrow) in C++ mode. This resolves #244.
* Remove JEMALLOC_ALLOC_SIZE annotations on functions not returning pointers. (Mike Hommey, 2015-07-21; 1 file, -2/+2)
  As per the gcc documentation: "The alloc_size attribute is used to tell the compiler that the function return value points to memory (...)". This resolves #245.
* Avoid function prototype incompatibilities. (Jason Evans, 2015-07-10; 1 file, -20/+40)
  Add various function attributes to the exported functions to give the compiler more information to work with during optimization, and also specify throw() when compiling with C++ on Linux, in order to adequately match what __THROW does in glibc. This resolves #237.
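  The throw() shim might look like this, assuming a GCC-compatible compiler (JE_THROW is an illustrative name; glibc's own macro is __THROW):

      #include <stddef.h>

      #if defined(__cplusplus) && defined(__GNUC__)
      #  define JE_THROW throw()
      #else
      #  define JE_THROW
      #endif

      /* Matches glibc's 'extern void *malloc(size_t) __THROW ...' shape. */
      void *je_malloc(size_t size) JE_THROW
          __attribute__((malloc, alloc_size(1)));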
* Optimizations for Windows. (Matthijs, 2015-06-25; 1 file, -1/+4)
  - Set opt_lg_chunk based on the run-time OS setting.
  - Verify LG_PAGE is compatible with the run-time OS setting.
  - When targeting Windows Vista or newer, use SRWLOCK instead of CRITICAL_SECTION.
  - When targeting Windows Vista or newer, statically initialize init_lock.
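  The lock swap in miniature (the init_lock_* wrapper names are illustrative):

      #include <windows.h>

      #if _WIN32_WINNT >= 0x0600  /* Windows Vista or newer */
      /* SRWLOCK supports static initialization; no boot-time setup needed. */
      static SRWLOCK init_lock = SRWLOCK_INIT;
      #  define init_lock_acquire() AcquireSRWLockExclusive(&init_lock)
      #  define init_lock_release() ReleaseSRWLockExclusive(&init_lock)
      #else
      /* CRITICAL_SECTION must be initialized at runtime before first use. */
      static CRITICAL_SECTION init_lock;
      #  define init_lock_acquire() EnterCriticalSection(&init_lock)
      #  define init_lock_release() LeaveCriticalSection(&init_lock)
      #endif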
* Fix size class overflow handling when profiling is enabled. (Jason Evans, 2015-06-24; 1 file, -4/+12)
  Fix size class overflow handling for malloc(), posix_memalign(), memalign(), calloc(), and realloc() when profiling is enabled. Remove an assertion that erroneously caused arena_sdalloc() to fail when profiling was enabled. This resolves #232.
* Add alignment assertions to public aligned allocation functions. (Jason Evans, 2015-06-23; 1 file, -28/+33)
* Implement cache index randomization for large allocations. (Jason Evans, 2015-05-06; 1 file, -1/+2)
  Extract szad size quantization into {extent,run}_quantize(), and quantize szad run sizes to the union of valid small region run sizes and large run sizes. Refactor iteration in arena_run_first_fit() to use run_quantize{,_first,_next}(), and add support for padded large runs. For large allocations that have no specified alignment constraints, compute a pseudo-random offset from the beginning of the first backing page that is a multiple of the cache line size. Under typical configurations with 4-KiB pages and 64-byte cache lines this results in a uniform distribution among 64 page boundary offsets. Add the --disable-cache-oblivious option, primarily intended for performance testing. This resolves #13.
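  The offset computation in outline (PAGE, CACHELINE, and the helper name are illustrative; the PRNG value is assumed to come from elsewhere):

      #include <stddef.h>
      #include <stdint.h>

      #define PAGE      4096  /* typical 4-KiB page */
      #define CACHELINE 64    /* typical 64-byte cache line */

      /* Map PRNG output to one of PAGE/CACHELINE = 64 cache-line-aligned
       * offsets within the first backing page of a large run. */
      static size_t
      large_run_offset(uint64_t prng_value) {
          return (size_t)((prng_value % (PAGE / CACHELINE)) * CACHELINE);
      }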
* Make the JEMALLOC_HAVE_ISSETUGID case in secure_getenv() more concise. (Igor Podlesny, 2015-04-30; 1 file, -11/+3)