path: root/include
Commit message (Author, Date, Files, Lines)
...
* Determine rtree levels at compile time. (Jason Evans, 2017-02-09, 5 files, -23/+49)
  Rather than dynamically building a table to aid per-level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.
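  A minimal sketch of the scheme, with illustrative values rather than jemalloc's actual table: LG_VADDR, LG_PAGE, the three-way split, and rtree_subkey() below are all assumptions made up for this example.

```c
#include <stdint.h>

#define LG_VADDR  48  /* assumed significant virtual-address bits */
#define LG_PAGE   12  /* low insignificant bits (page-internal offset) */
#define RTREE_NSB (LG_VADDR - LG_PAGE)  /* 36 significant key bits */

typedef struct {
    unsigned bits;    /* key bits consumed at this level */
    unsigned cumbits; /* cumulative bits consumed through this level */
} rtree_level_t;

/* Constant per-level table; three levels for 36 significant bits. */
static const rtree_level_t rtree_levels[] = {
    {RTREE_NSB / 3,                   RTREE_NSB / 3},
    {RTREE_NSB / 3,                   (RTREE_NSB / 3) * 2},
    {RTREE_NSB - (RTREE_NSB / 3) * 2, RTREE_NSB}
};

/* Extracting a level's subkey is then pure constant arithmetic. */
static inline uintptr_t
rtree_subkey(uintptr_t key, unsigned level) {
    unsigned shift = LG_VADDR - rtree_levels[level].cumbits;
    uintptr_t mask = ((uintptr_t)1 << rtree_levels[level].bits) - 1;
    return (key >> shift) & mask;
}
```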
* Remove rtree leading 0 bit optimization. (Jason Evans, 2017-02-09, 1 file, -31/+4)
  A subsequent change instead ignores insignificant high bits.
* Make non-essential inline rtree functions static functions. (Jason Evans, 2017-02-09, 3 files, -111/+16)
* Split rtree_elm_lookup_hard() out of rtree_elm_lookup(). (Jason Evans, 2017-02-09, 3 files, -101/+6)
  Anything but a hit in the first element of the lookup cache is expensive enough to negate the benefits of inlining.
* Replace rtree path cache with LRU cache. (Jason Evans, 2017-02-09, 4 files, -124/+108)
  Rework rtree_ctx_t to encapsulate an rtree leaf LRU lookup cache rather than a single-path element lookup cache. The replacement is logically much simpler, as well as slightly faster in the fast-path case and less prone to degraded performance during non-trivial sequences of lookups.
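  A hedged sketch of such an LRU leaf cache; the type names, the cache size, and the move-toward-front policy below are assumptions for illustration, not the actual rtree_ctx_t layout.

```c
#include <stddef.h>
#include <stdint.h>

#define RTREE_CTX_NCACHE 8

typedef struct rtree_leaf_s rtree_leaf_t; /* opaque leaf node type */

typedef struct {
    uintptr_t     leafkey; /* key bits that select a leaf */
    rtree_leaf_t *leaf;    /* cached leaf, or NULL if slot unused */
} rtree_ctx_cache_elm_t;

typedef struct {
    rtree_ctx_cache_elm_t cache[RTREE_CTX_NCACHE];
} rtree_ctx_t;

static rtree_leaf_t *
rtree_leaf_lookup(rtree_ctx_t *ctx, uintptr_t leafkey) {
    /* Fast path: most recently used slot. */
    if (ctx->cache[0].leafkey == leafkey) {
        return ctx->cache[0].leaf;
    }
    for (size_t i = 1; i < RTREE_CTX_NCACHE; i++) {
        if (ctx->cache[i].leafkey == leafkey) {
            /* Bubble the hit one slot toward the front (cheap LRU). */
            rtree_ctx_cache_elm_t tmp = ctx->cache[i];
            ctx->cache[i] = ctx->cache[i - 1];
            ctx->cache[i - 1] = tmp;
            return tmp.leaf;
        }
    }
    return NULL; /* miss: caller falls back to a full tree walk */
}
```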
* Optimize a branch out of rtree_read() if !dependent. (Jason Evans, 2017-02-09, 1 file, -1/+1)
* Disentangle arena and extent locking. (Jason Evans, 2017-02-02, 11 files, -111/+213)
  Refactor arena and extent locking protocols such that arena and extent locks are never held when calling into the extent_*_wrapper() API. This requires extra care during purging since the arena lock no longer protects the inner purging logic. It also requires extra care to protect extents from being merged with adjacent extents.

  Convert extent_t's 'active' flag to an enumerated 'state', so that retained extents are explicitly marked as such, rather than depending on ring linkage state.

  Refactor the extent collections (and their synchronization) for cached and retained extents into extents_t. Incorporate LRU functionality to support purging. Incorporate page count accounting, which replaces arena->ndirty and arena->stats.retained.

  Assert that no core locks are held when entering any internal [de]allocation functions. This is in addition to existing assertions that no locks are held when entering external [de]allocation functions.

  Audit and document synchronization protocols for all arena_t fields.

  This fixes a potential deadlock due to recursive allocation during gdump, in a similar fashion to b49c649bc18fff4bd10a1c8adbaf1f25f6453cb6 (Fix lock order reversal during gdump.), but with a necessarily much broader code impact.
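  A sketch of the 'active'-flag-to-state conversion described above; the exact enumerator names are assumptions, but the three states mirror the commit's description.

```c
/* Illustrative only: explicit extent states instead of a boolean
 * 'active' flag plus ring-linkage inference. */
typedef enum {
    extent_state_active   = 0, /* in use by the application */
    extent_state_dirty    = 1, /* cached; contents possibly dirty */
    extent_state_retained = 2  /* unmapped from use, kept for reuse */
} extent_state_t;
```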
* Fix/refactor tcaches synchronization. (Jason Evans, 2017-02-02, 3 files, -15/+22)
  Synchronize tcaches with tcaches_mtx rather than ctl_mtx. Add missing synchronization for tcache flushing. This bug was introduced by 1cb181ed632e7573fb4eab194e4d216867222d27 (Implement explicit tcache support.), which was first released in 4.0.0.
* Add witness_assert_depth[_to_rank](). (Jason Evans, 2017-02-02, 4 files, -6/+36)
  This makes it possible to make lock state assertions about precisely which locks are held.
* Replace tabs following #define with spaces. (Jason Evans, 2017-01-21, 36 files, -305/+304)
  This resolves #564.
* Remove extraneous parens around return arguments. (Jason Evans, 2017-01-21, 20 files, -346/+344)
  This resolves #540.
* Update brace style. (Jason Evans, 2017-01-21, 26 files, -758/+582)
  Add braces around single-line blocks, and remove line breaks before function-opening braces. This resolves #537.
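  An illustration of the two rules on a made-up function (the names are hypothetical; only the brace placement matters):

```c
#include <stdbool.h>

/* Old style: unbraced single-line block, opening brace on its own line. */
static bool
example_is_empty_old(unsigned count)
{
    if (count == 0)
        return true;
    return false;
}

/* New style: braces around single-line blocks, and the opening brace
 * on the function-signature line. */
static bool
example_is_empty_new(unsigned count) {
    if (count == 0) {
        return true;
    }
    return false;
}
```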
* Fix --disable-stats support. (Jason Evans, 2017-01-20, 2 files, -19/+29)
  Fix numerous regressions that were exposed by --disable-stats, both in the core library and in the tests.
* Add a stat for the number of bytes currently cached in tcache. (Qi Wang, 2017-01-18, 1 file, -0/+3)
* Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions. (Mike Hommey, 2017-01-18, 2 files, -2/+0)
  The SDK jemalloc is built against might not be the latest for various reasons, but the resulting binary ought to work on newer versions of OSX. In order to ensure this, we need the fullest definitions possible, so copy what we need from the latest version of malloc/malloc.h available on opensource.apple.com.
* Fix prof_realloc() regression. (Jason Evans, 2017-01-17, 3 files, -14/+25)
  Mostly revert the prof_realloc() changes in 498856f44a30b31fe713a18eb2fc7c6ecf3a9f63 (Move slabs out of chunks.) so that prof_free_sampled_object() is called when appropriate. Leave the prof_tctx_[re]set() optimization in place, but add an assertion to verify that all eight cases are correctly handled. Add a comment to make clear the code ordering, so that the regression originally fixed by ea8d97b8978a0c0423f0ed64332463a25b787c3d (Fix prof_{malloc,free}_sample_object() call order in prof_realloc().) is not repeated.

  This resolves #499.
* Remove leading blank lines from function bodies. (Jason Evans, 2017-01-13, 20 files, -200/+0)
  This resolves #535.
* Break up headers into constituent parts. (David Goldblatt, 2017-01-12, 93 files, -3604/+3449)
  This is part of a broader change to make header files better represent the dependencies between one another (see https://github.com/jemalloc/jemalloc/issues/533). It breaks up component headers into smaller parts that can be made to have a simpler dependency graph.

  For the autogenerated headers (smoothstep.h and size_classes.h), no splitting was necessary, so I didn't add support to emit multiple headers.
* Remove mb.h, which is unused. (David Goldblatt, 2017-01-11, 2 files, -119/+0)
* Use better preprocessor defines for sparc64. (John Paul Adrian Glaubitz, 2017-01-11, 1 file, -1/+1)
  Currently, jemalloc detects sparc64 targets by checking whether __sparc64__ is defined. However, this definition is used on BSD targets only. Linux targets define both __sparc__ and __arch64__ for sparc64. Since this combination also works on BSD, use __sparc__ and __arch64__ rather than __sparc64__ to detect sparc64 targets.
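  The resulting detection reduces to a check like the following sketch; the macro name is hypothetical, and jemalloc's actual test lives in its platform-detection machinery:

```c
/* Detect sparc64 in a way that works on both Linux and BSD. */
#if defined(__sparc__) && defined(__arch64__)
#  define EXAMPLE_IS_SPARC64 1
#else
#  define EXAMPLE_IS_SPARC64 0
#endif
```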
* Implement arena.<i>.destroy. (Jason Evans, 2017-01-07, 6 files, -1/+24)
  Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an analogue to MALLCTL_ARENAS_ALL.

  This resolves #382.
* Range-check mib[1] --> arena_ind casts. (Jason Evans, 2017-01-07, 1 file, -1/+1)
* Move static ctl_epoch variable into ctl_stats_t (as epoch). (Jason Evans, 2017-01-07, 1 file, -0/+1)
* Refactor ctl_stats_t. (Jason Evans, 2017-01-07, 2 files, -8/+15)
  Refactor ctl_stats_t to be a demand-zeroed non-growing data structure. To keep the size from being onerous (~60 MiB) on 32-bit systems, convert the arenas field to contain pointers rather than directly embedded ctl_arena_stats_t elements.
* Rename the arenas.extend mallctl to arenas.create. (Jason Evans, 2017-01-07, 1 file, -3/+3)
* Add MALLCTL_ARENAS_ALL. (Jason Evans, 2017-01-07, 2 files, -0/+18)
  Add the MALLCTL_ARENAS_ALL cpp macro as a fixed index for use in accessing the arena.<i>.{purge,decay,dss} and stats.arenas.<i>.* mallctls, and deprecate access via the arenas.narenas index (to be removed in 6.0.0).
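  A usage sketch against jemalloc's public mallctl*() API (MALLCTL_ARENAS_ALL shipped in release 5.0); error handling is elided for brevity:

```c
#include <jemalloc/jemalloc.h>

/* Purge every arena by substituting the fixed index into the mib. */
void
purge_all_arenas(void) {
    size_t mib[3];
    size_t miblen = sizeof(mib) / sizeof(mib[0]);

    mallctlnametomib("arena.0.purge", mib, &miblen);
    mib[1] = (size_t)MALLCTL_ARENAS_ALL; /* instead of one arena index */
    mallctlbymib(mib, miblen, NULL, NULL, NULL, 0);
}
```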
* Reindent. (Jason Evans, 2017-01-07, 1 file, -12/+12)
* Implement per-arena base allocators. (Jason Evans, 2016-12-27, 5 files, -56/+133)
  Add/rename related mallctls:
  - Add stats.arenas.<i>.base.
  - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal.
  - Add stats.arenas.<i>.resident.

  Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks.

  This resolves #463.
* Refactor purging and splitting/merging. (Jason Evans, 2016-12-27, 5 files, -7/+46)
  Split purging into lazy and forced variants. Use the forced variant for zeroing dss.

  Add support for NULL function pointers as an opt-out mechanism for the dalloc, commit, decommit, purge_lazy, purge_forced, split, and merge fields of extent_hooks_t.

  Add short-circuiting checks in large_ralloc_no_move_{shrink,expand}() so that no attempt is made if splitting/merging is not supported.

  This resolves #268.
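  A hedged sketch of opting out via NULL hooks, using designated initializers so unset fields default to NULL; the alloc signature matches jemalloc's public extent_hooks_t, but treat the details as assumptions for this era of the API:

```c
#include <stdbool.h>
#include <stddef.h>
#include <jemalloc/jemalloc.h>

/* Custom allocation hook; the body is a placeholder for a real mapping. */
static void *
my_extent_alloc(extent_hooks_t *hooks, void *new_addr, size_t size,
    size_t alignment, bool *zero, bool *commit, unsigned arena_ind) {
    return NULL; /* NULL signals failure; a real hook would map memory */
}

/* Every hook left NULL is one jemalloc treats as unsupported and
 * works around (e.g., retaining extents instead of dalloc'ing them,
 * short-circuiting ralloc paths that would need split/merge). */
static extent_hooks_t my_hooks = {
    .alloc  = my_extent_alloc,
    .dalloc = NULL, /* opt out: extents are retained instead */
    .split  = NULL, /* opt out: no-move shrink is short-circuited */
    .merge  = NULL  /* opt out: no-move expand is short-circuited */
};
```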
* Rename arena_decay_t's ndirty to nunpurged. (Jason Evans, 2016-12-27, 1 file, -1/+1)
* Use exponential series to size extents. (Jason Evans, 2016-12-27, 1 file, -0/+9)
  If virtual memory is retained, allocate extents such that their sizes form an exponentially growing series. This limits the number of disjoint virtual memory ranges so that extent merging can be effective even if multiple arenas' extent allocation requests are highly interleaved.

  This resolves #462.
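  The arithmetic, as a hedged sketch (jemalloc's actual growth policy and names differ): each retained-VM extent request rounds up to the next term of a doubling series, so heavily interleaved requests still produce only logarithmically many disjoint ranges.

```c
#include <stddef.h>

/* Illustrative doubling series: 2^(lg_base + step) bytes, e.g.
 * 4 MiB, 8 MiB, 16 MiB, ... for lg_base == 22. */
static size_t
extent_grow_size(size_t lg_base, unsigned grow_step) {
    return (size_t)1 << (lg_base + grow_step);
}
```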
* Add huge page configuration and pages_[no]huge(). (Jason Evans, 2016-12-27, 4 files, -2/+34)
  Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified.

  Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.
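  A hedged sketch of the described toggle; jemalloc's real functions live in its pages layer and follow its convention of returning true on failure:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

static bool
pages_huge_sketch(void *addr, size_t size) {
#ifdef MADV_HUGEPAGE
    return madvise(addr, size, MADV_HUGEPAGE) != 0;
#else
    return true; /* failure: transparent huge pages unsupported */
#endif
}

static bool
pages_nohuge_sketch(void *addr, size_t size) {
#ifdef MADV_NOHUGEPAGE
    return madvise(addr, size, MADV_NOHUGEPAGE) != 0;
#else
    return true;
#endif
}
```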
* Simplify arena_slab_regind(). (Jason Evans, 2016-12-23, 2 files, -0/+4)
  Rewrite arena_slab_regind() to provide sufficient constant data for the compiler to perform division strength reduction. This replaces more general manual strength reduction that was implemented before arena_bin_info was compile-time-constant. It would be possible to slightly improve on the compiler-generated division code by taking advantage of range limits that the compiler doesn't know about.
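  A hedged sketch of the strength-reduction idea (not jemalloc's actual code): switching on the size class makes every divisor a compile-time constant, which the compiler lowers to multiply-and-shift sequences instead of hardware divides.

```c
#include <stddef.h>
#include <stdint.h>

/* diff is the byte offset of a region within its slab; binind selects
 * the size class. Region sizes here are illustrative. */
static size_t
slab_regind_sketch(uintptr_t diff, unsigned binind) {
    switch (binind) {
    case 0:  return diff / 8;
    case 1:  return diff / 16;
    case 2:  return diff / 24; /* non-power-of-2 classes benefit most */
    case 3:  return diff / 32;
    case 4:  return diff / 40;
    /* ... one case per size class ... */
    default: return diff / 48;
    }
}
```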
* Add some missing explicit casts. (Jason Evans, 2016-12-13, 1 file, -3/+4)
* jemalloc cpp new/delete bindings. (Dave Watson, 2016-12-13, 3 files, -6/+17)
  Adds C++ bindings for jemalloc, along with the necessary autoconf settings. This is mostly to add sized deallocation support, which can't be added from C directly. Sized deallocation is a ~10% microbenchmark improvement.

  - Import ax_cxx_compile_stdcxx.m4 from the autoconf repo; this seems like the easiest way to get C++14 detection.
  - Add various other changes, like CXXFLAGS, to configure.ac.
  - Add new rules to Makefile.in for src/jemalloc-cpp.cpp, plus a basic unit test.
  - Override both new and delete, to ensure jemalloc is used for both.
  - TODO, future enhancement: avoid extra PLT thunks for new and delete. sdallocx and malloc are publicly exported jemalloc symbols, so using an alias would link them directly; unfortunately, that was hard to get to play nicely with jemalloc's namespace support.

  Testing: Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized deallocation support; verified that the rest build correctly. Tested Mac OSX and CentOS. Tested --with-jemalloc-prefix and --without-export.

  This resolves #202.
* Add a_type parameter to qr_{meld,split}(). (Jason Evans, 2016-12-13, 2 files, -5/+5)
* Add --disable-syscall. (Jason Evans, 2016-12-04, 1 file, -2/+2)
  This resolves #517.
* Enable overriding JEMALLOC_{ALLOC,FREE}_JUNK. (Jason Evans, 2016-11-22, 1 file, -2/+6)
  This resolves #509.
* Style fixes. (Jason Evans, 2016-11-22, 1 file, -6/+6)
* Add pthread_atfork(3) feature test. (Jason Evans, 2016-11-17, 1 file, -0/+3)
  Some versions of Android provide a pthreads library without providing pthread_atfork(), so in practice a separate feature test is necessary for the latter.
* Update a comment. (Jason Evans, 2016-11-17, 1 file, -1/+1)
* Refactor madvise(2) configuration. (Jason Evans, 2016-11-17, 1 file, -12/+9)
  Add feature tests for the MADV_FREE and MADV_DONTNEED flags to madvise(2), so that MADV_FREE is detected and used for Linux kernel versions 4.5 and newer. Refactor pages_purge() so that on systems which support both flags, MADV_FREE is preferred over MADV_DONTNEED.

  This resolves #387.
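  A hedged sketch of the preference order described above (the real pages_purge() also handles Windows and other configurations):

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

static bool
pages_purge_sketch(void *addr, size_t size) {
#if defined(MADV_FREE)
    /* Preferred when available (Linux >= 4.5): pages are reclaimed
     * lazily, only under memory pressure, which is cheaper than
     * immediate reclamation. */
    return madvise(addr, size, MADV_FREE) != 0;
#elif defined(MADV_DONTNEED)
    return madvise(addr, size, MADV_DONTNEED) != 0;
#else
    return true; /* failure: no purging mechanism available */
#endif
}
```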
* Add extent serial numbers. (Jason Evans, 2016-11-15, 3 files, -7/+84)
  Add extent serial numbers and use them where appropriate as a sort key that is higher priority than address, so that the allocation policy prefers older extents.

  This resolves #147.
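  A hedged sketch of the resulting sort key; the struct and function names are assumptions, and jemalloc's actual comparator operates on extent_t:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t    sn;   /* serial number: lower means allocated earlier */
    uintptr_t addr; /* base address, used only as a tiebreaker */
} extent_key_t;

/* Serial number first, address second, so older extents sort first. */
static int
extent_sn_comp(const extent_key_t *a, const extent_key_t *b) {
    if (a->sn != b->sn) {
        return (a->sn < b->sn) ? -1 : 1;
    }
    return (a->addr > b->addr) - (a->addr < b->addr);
}
```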
* Rename atomic_*_{uint32,uint64,u}() to atomic_*_{u32,u64,zu}(). (Jason Evans, 2016-11-07, 5 files, -119/+119)
  This change conforms to naming conventions throughout the codebase.
* Revert "Define 64-bits atomics unconditionally"Jason Evans2016-11-071-8/+10
| | | | | | This reverts commit c2942e2c0e097e7c75a3addd0b9c87758f91692e. This resolves #495.
* Refactor prng to not use 64-bit atomics on 32-bit platforms. (Jason Evans, 2016-11-07, 4 files, -22/+139)
  This resolves #495.
* Fix psz/pind edge cases. (Jason Evans, 2016-11-04, 3 files, -25/+9)
  Add an "over-size" extent heap in which to store extents which exceed the maximum size class (plus cache-oblivious padding, if enabled). Remove psz2ind_clamp() and use psz2ind() instead so that trying to allocate the maximum size class can in principle succeed. In practice, this allows assertions to hold so that OOM errors can be successfully generated.
* Fix extent_alloc_cache[_locked]() to support decommitted allocation. (Jason Evans, 2016-11-04, 1 file, -2/+2)
  Fix extent_alloc_cache[_locked]() to support decommitted allocation, and use this ability in arena_stash_dirty(), so that decommitted extents are not needlessly committed during purging. In practice this does not happen on any currently supported systems, because both extent merging and decommit must be implemented; all supported systems implement one xor the other.
* Update symbol mangling. (Jason Evans, 2016-11-03, 1 file, -0/+2)
* Fix long spinning in rtree_node_init. (Dave Watson, 2016-11-03, 2 files, -5/+4)
  rtree_node_init spinlocks the node, allocates, and then sets the node. This is under heavy contention at the top of the tree if many threads start to allocate at the same time.

  Instead, take a per-rtree sleeping mutex to reduce spinning. Tested both pthreads and OSX OSSpinLock; both reduce spinning adequately.

  Previous benchmark time: ./ttest1 500 100 took ~15s.
  New benchmark time: ./ttest1 500 100 took 0.57s.
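  A hedged sketch of the fix (all names are assumptions): contending threads block on one per-rtree mutex instead of spinning on per-node locks while another thread allocates.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct rtree_node_s {
    struct rtree_node_s *children[256]; /* fanout is illustrative */
} rtree_node_t;

typedef struct {
    pthread_mutex_t init_lock; /* one sleeping mutex per rtree */
    rtree_node_t   *root;
} rtree_t;

static rtree_node_t *
rtree_node_init_sketch(rtree_t *rtree, rtree_node_t **elmp) {
    pthread_mutex_lock(&rtree->init_lock);
    rtree_node_t *node = *elmp;
    if (node == NULL) {
        /* Either we are first, or another thread initialized the node
         * while we slept on the mutex; only the first thread allocates. */
        node = calloc(1, sizeof(rtree_node_t));
        *elmp = node;
    }
    pthread_mutex_unlock(&rtree->init_lock);
    return node;
}
```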