path: root/src

Commit log for src/ (most recent first). Each entry lists the subject, author, date, and diffstat (files changed, -deleted/+added lines).
* Fix rallocx() sampling code to not eagerly commit sampler update.
  Jason Evans, 2016-06-08 (1 file, -3/+3)

  rallocx() for an alignment-constrained request may end up with a
  smaller-than-worst-case size if in-place reallocation succeeds due to
  serendipitous alignment. In such cases, sampling may not happen.

* Fix opt_zero-triggered in-place huge reallocation zeroing.
  Jason Evans, 2016-06-08 (1 file, -5/+5)

  Fix huge_ralloc_no_move_expand() to update the extent's zeroed attribute
  based on the intersection of the previous value and that of the newly
  merged trailing extent.

* Fix a Valgrind regression in chunk_alloc_wrapper().
  Elliot Ronaghan, 2016-06-07 (1 file, -2/+4)

  This regression was caused by d412624b25eed2b5c52b7d94a71070d3aab03cb4
  (Move retaining out of default chunk hooks).

* Fix a Valgrind regression in calloc().
  Elliot Ronaghan, 2016-06-07 (1 file, -1/+1)

  This regression was caused by 3ef51d7f733ac6432e80fa902a779ab5b98d74f6
  (Optimize the fast paths of calloc() and [m,d,sd]allocx().).

* Fix potential VM map fragmentation regression.
  Jason Evans, 2016-06-07 (2 files, -2/+2)

  Revert 245ae6036c09cc11a72fab4335495d95cddd5beb (Support --with-lg-page
  values larger than actual page size.), because it could cause VM map
  fragmentation if the kernel grows mmap()ed memory downward.

  This resolves #391.

* Fix mixed decl in nstime.c.
  Elliot Ronaghan, 2016-06-07 (1 file, -3/+5)

  Fix a mixed declaration in the gettimeofday() branch of nstime_update().
* Propagate tsdn to default chunk hooks.
  Jason Evans, 2016-06-07 (1 file, -20/+62)

  This avoids bootstrapping issues for configurations that require
  allocation during tsd initialization.

  This resolves #390.

* Guard tsdn_tsd() call with tsdn_null() check.
  Jason Evans, 2016-05-11 (1 file, -2/+2)

* Mangle tested functions as n_witness_* rather than witness_*_impl.
  Jason Evans, 2016-05-11 (1 file, -9/+8)

* Optimize witness fast path.
  Jason Evans, 2016-05-11 (1 file, -118/+4)

  Short-circuit commonly called witness functions so that they only execute
  in debug builds, and remove equivalent guards from mutex functions. This
  avoids pointless code execution in witness_assert_lockless(), which is
  typically called twice per allocation/deallocation function invocation.

  Inline commonly called witness functions so that optimized builds can
  completely remove calls as dead code.
* Fix chunk accounting related to triggering gdump profiles.
  Jason Evans, 2016-05-11 (1 file, -0/+15)

  Fix in-place huge reallocation to update the chunk counters that are used
  for triggering gdump profiles.

* Resolve bootstrapping issues when embedded in FreeBSD libc.
  Jason Evans, 2016-05-11 (14 files, -1196/+1257)

  b2c0d6322d2307458ae2b28545f8a5c9903d7ef5 (Add witness, a simple online
  locking validator.) caused a broad propagation of tsd throughout the
  internal API, but tsd_fetch() was designed to fail prior to tsd
  bootstrapping. Fix this by splitting tsd_t into non-nullable tsd_t and
  nullable tsdn_t, and modifying all internal APIs that do not critically
  rely on tsd to take nullable pointers. Furthermore, add the
  tsd_booted_get() function so that tsdn_fetch() can probe whether tsd
  bootstrapping is complete and return NULL if not. All dangerous
  conversions of nullable pointers are tsdn_tsd() calls that assert-fail on
  invalid conversion.
* Fix tsd bootstrapping for a0malloc().
  Jason Evans, 2016-05-07 (1 file, -27/+31)

* Optimize the fast paths of calloc() and [m,d,sd]allocx().
  Jason Evans, 2016-05-06 (3 files, -188/+116)

  This is a broader application of optimizations to malloc() and free() in
  f4a0f32d340985de477bbe329ecdaecd69ed1055 (Fast-path improvement: reduce
  # of branches and unnecessary operations.).

  This resolves #321.

* Modify pages_map() to support mapping uncommitted virtual memory.
  Jason Evans, 2016-05-06 (3 files, -25/+102)

  If the OS overcommits:
  - Commit all mappings in pages_map() regardless of whether the caller
    requested committed memory.
  - Linux-specific: Specify MAP_NORESERVE to avoid unfortunate interactions
    with heuristic overcommit mode during fork(2).

  This resolves #193.
* Scale leak report summary according to sampling probability.
  Jason Evans, 2016-05-04 (1 file, -18/+38)

  This makes the numbers reported in the leak report summary closely match
  those reported by jeprof.

  This resolves #356.
* Add the stats.retained and stats.arenas.<i>.retained statistics.
  Jason Evans, 2016-05-04 (4 files, -6/+30)

  This resolves #367.

* Fix huge_palloc() regression.
  Jason Evans, 2016-05-04 (6 files, -14/+15)

  Split arena_choose() into arena_[i]choose() and use arena_ichoose() for
  arena lookup during internal allocation. This fixes huge_palloc() so that
  it always succeeds during extent node allocation.

  This regression was introduced by 66cd953514a18477eb49732e40d5c2ab5f1b12c5
  (Do not allocate metadata via non-auto arenas, nor tcaches.).

* Fix witness/fork() interactions.
  Jason Evans, 2016-04-26 (2 files, -4/+16)

  Fix witness to clear its list of owned mutexes in the child if
  platform-specific malloc_mutex code re-initializes mutexes rather than
  unlocking them.

* Fix fork()-related lock rank ordering reversals.
  Jason Evans, 2016-04-26 (4 files, -35/+123)

* Fix arena reset effects on large/huge stats.
  Jason Evans, 2016-04-25 (1 file, -5/+24)

  Reset large curruns to 0 during arena reset. Do not increase huge ndalloc
  stats during arena reset.

* Fix arena_choose_hard() regression.
  Jason Evans, 2016-04-23 (1 file, -1/+1)

  This regression was caused by 66cd953514a18477eb49732e40d5c2ab5f1b12c5
  (Do not allocate metadata via non-auto arenas, nor tcaches.).

* Implement the arena.<i>.reset mallctl.
  Jason Evans, 2016-04-22 (2 files, -37/+224)

  This makes it possible to discard all of an arena's allocations in a
  single operation.

  This resolves #146.

* Do not allocate metadata via non-auto arenas, nor tcaches.
  Jason Evans, 2016-04-22 (8 files, -112/+145)

  This ensures that all internally allocated metadata come from the first
  opt_narenas arenas, i.e. the automatically multiplexed arenas.

* Reduce a variable scope.
  Jason Evans, 2016-04-22 (1 file, -2/+1)

* Update private_symbols.txt.
  Jason Evans, 2016-04-18 (2 files, -14/+14)

  Change test-related mangling to simplify symbol filtering.

  The following commands can be used to detect missing/obsolete symbol
  mangling, with the caveat that the full set of symbols is based on the
  union of symbols generated by all configurations, some of which are
  platform-specific:

      ./autogen.sh --enable-debug --enable-prof --enable-lazy-lock
      make all tests
      nm -a lib/libjemalloc.a src/*.jet.o \
        |grep " [TDBCR] " \
        |awk '{print $3}' \
        |sed -e 's/^\(je_\|jet_\(n_\)\?\)\([a-zA-Z0-9_]*\)/\3/g' \
        |LC_COLLATE=C sort -u \
        |grep -v \
          -e '^\(malloc\|calloc\|posix_memalign\|aligned_alloc\|realloc\|free\)$' \
          -e '^\(m\|r\|x\|s\|d\|sd\|n\)allocx$' \
          -e '^mallctl\(\|nametomib\|bymib\)$' \
          -e '^malloc_\(stats_print\|usable_size\|message\)$' \
          -e '^\(memalign\|valloc\)$' \
          -e '^__\(malloc\|memalign\|realloc\|free\)_hook$' \
          -e '^pthread_create$' \
        > /tmp/private_symbols.txt
* Fix style nits.
  Jason Evans, 2016-04-17 (1 file, -1/+1)

* Fix malloc_mutex_[un]lock() to conditionally check witness.
  Jason Evans, 2016-04-17 (1 file, -10/+0)

  Also remove tautological cassert(config_debug) calls.

* Convert base_mtx locking protocol comments to assertions.
  Jason Evans, 2016-04-17 (1 file, -10/+12)

* Add witness, a simple online locking validator.
  Jason Evans, 2016-04-14 (14 files, -1083/+1452)

  This resolves #358.

* Fix 64-to-32 conversion warnings in 32-bit mode.
  rustyx, 2016-04-12 (1 file, -11/+15)

* Fix malloc_stats_print() to print correct opt.narenas value.
  Jason Evans, 2016-04-12 (1 file, -1/+1)

  This regression was caused by 8f683b94a751c65af8f9fa25970ccf2917b96bb8
  (Make opt_narenas unsigned rather than size_t.).

* Support --with-lg-page values larger than actual page size.
  Jason Evans, 2016-04-11 (2 files, -2/+2)

  During over-allocation in preparation for creating aligned mappings,
  allocate one more page than necessary if PAGE is the actual page size, so
  that trimming still succeeds even if the system returns a mapping that
  has less than PAGE alignment. This allows compiling with e.g. 64 KiB
  "pages" on systems that actually use 4 KiB pages.

  Note that for e.g. --with-lg-page=21, it is also necessary to increase
  the chunk size (e.g. --with-malloc-conf=lg_chunk:22) so that there are at
  least two "pages" per chunk. In practice this isn't a particularly
  compelling configuration because so much (unusable) virtual memory is
  dedicated to chunk headers.
* Refactor/fix ph.
  Jason Evans, 2016-04-11 (2 files, -50/+47)

  Refactor ph to support configurable comparison functions. Use a cpp macro
  code generation form equivalent to the rb macros so that pairing heaps
  can be used for both run heaps and chunk heaps.

  Remove per-node parent pointers, and instead use leftmost siblings' prev
  pointers to track parents.

  Fix multi-pass sibling merging to iterate over intermediate results using
  a FIFO, rather than a LIFO. Use this fixed sibling merging implementation
  for both merge phases of the auxiliary twopass algorithm (first merging
  the aux list, then replacing the root with its merged children). This
  fixes both degenerate merge behavior and the potential for deep
  recursion.

  This regression was introduced by 6bafa6678fc36483e638f1c3a0a9bf79fb89bfc9
  (Pairing heap).

  This resolves #371.

* Reduce differences between alternative bitmap implementations.
  Jason Evans, 2016-04-06 (1 file, -7/+4)

* Add JEMALLOC_ALLOC_JUNK and JEMALLOC_FREE_JUNK macros.
  Chris Peterson, 2016-03-31 (4 files, -26/+29)

  Replace hardcoded 0xa5 and 0x5a junk values with JEMALLOC_ALLOC_JUNK and
  JEMALLOC_FREE_JUNK macros, respectively.
* Update a comment.
  Jason Evans, 2016-03-31 (1 file, -2/+2)

* Fix potential chunk leaks.
  Jason Evans, 2016-03-31 (3 files, -44/+25)

  Move chunk_dalloc_arena()'s implementation into chunk_dalloc_wrapper(),
  so that if the dalloc hook fails, proper decommit/purge/retain cascading
  occurs. This fixes three potential chunk leaks on OOM paths, one during
  dss-based chunk allocation, one during chunk header commit (currently
  relevant only on Windows), and one during rtree write (e.g. if rtree node
  allocation fails).

  Merge chunk_purge_arena() into chunk_purge_default() (refactor, no change
  to functionality).
* Fix -Wunreachable-code warning in malloc_vsnprintf().
  Chris Peterson, 2016-03-27 (1 file, -2/+2)

  Variables s and slen are declared inside a switch statement, but outside
  a case scope. clang reports these variable definitions as "unreachable",
  though this is not really meaningful in this case. This is the only
  -Wunreachable-code warning in jemalloc.

      src/util.c:501:5 [-Wunreachable-code]
          code will never be executed

  This resolves #364.

* Constify various internal arena APIs.
  Jason Evans, 2016-03-23 (2 files, -24/+29)

* Code formatting fixes.
  Jason Evans, 2016-03-23 (1 file, -1/+2)

* Optimize rtree_get().
  Jason Evans, 2016-03-23 (2 files, -0/+3)

  Specialize the fast path to avoid code that cannot execute for dependent
  loads. Manually unroll.

* Refactor out signed/unsigned comparisons.
  Jason Evans, 2016-03-15 (1 file, -7/+4)

* Convert arena_bin_t's runs from a tree to a heap.
  Jason Evans, 2016-03-08 (1 file, -35/+15)

* Use pairing heap for arena->runs_avail.
  Dave Watson, 2016-03-08 (1 file, -13/+15)

  Use a pairing heap instead of a red-black tree for arena runs_avail. The
  extra links are unioned with the bitmap_t, so this change doesn't use any
  extra memory.

  Canary testing shows this change to be a 1% CPU win and a 2% latency win.
  In particular, large free()s and small bin frees are now O(1) (barring
  coalescing).

  I also tested changing bin->runs to be a pairing heap, but saw a much
  smaller win, and it would mean increasing the size of arena_run_s by two
  pointers, so I left that as an rb-tree for now.

* Pairing heap.
  Dave Watson, 2016-03-08 (1 file, -0/+2)

  Initial implementation of a twopass pairing heap with aux list. Research
  papers linked in comments.

  Where search/nsearch/last aren't needed, this gives much faster first(),
  delete(), and insert(). Insert is O(1), and first/delete don't have to
  walk the whole tree.

  Also tested rb_old with parent pointers - it was better than the current
  rb.h for memory loads, but still much worse than a pairing heap. An
  array-based heap would be much faster if everything fits in memory, but
  on a cold cache it has many more memory loads for most operations.
* Avoid a potential innocuous compiler warning.
  Jason Evans, 2016-03-03 (1 file, -1/+5)

  Add a cast to avoid comparing a ssize_t value to a uint64_t value that is
  always larger than a 32-bit ssize_t. This silences an innocuous compiler
  warning from e.g. gcc 4.2.1 about the comparison always having the same
  result.
* Fix stack corruption and an uninitialized variable warning.
  Dmitri Smirnov, 2016-02-29 (1 file, -1/+1)

  The stack corruption occurs in x64 builds.

  This resolves #347.

* Fix a potential tsd cleanup leak.
  Jason Evans, 2016-02-28 (1 file, -0/+3)

  Prior to 767d85061a6fb88ec977bbcd9b429a43aff391e6 (Refactor arenas array
  (fixes deadlock).), it was possible under some circumstances for
  arena_get() to trigger recreation of the arenas cache during tsd cleanup,
  and the arenas cache would then be leaked. In principle a similar issue
  could still occur as a side effect of decay-based purging, which calls
  arena_tdata_get().

  Fix arenas_tdata_cleanup() by setting tsd->arenas_tdata_bypass to true,
  so that arena_tdata_get() will gracefully fail (an expected behavior)
  rather than recreating tsd->arena_tdata.

  Reported by Christopher Ferris <cferris@google.com>.

* Fix stats.arenas.<i>.[...] for --disable-stats case.
  Jason Evans, 2016-02-28 (2 files, -84/+109)

  Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time}
  initialization.

  Fix stats.arenas.<i>.{pactive,pdirty} to read under the protection of the
  arena mutex.