path: root/src/arena.c
Commit log for src/arena.c, most recent first. Each entry shows the commit message, author, date, and diffstat (files changed, lines removed/added).
* Silence miscellaneous 64-to-32-bit data loss warnings. (Jason Evans, 2016-02-26; 1 file, -2/+2)
  This resolves #341.
* Remove a superfluous comment. (Jason Evans, 2016-02-26; 1 file, -1/+0)
* Make *allocx() size class overflow behavior defined. (Jason Evans, 2016-02-25; 1 file, -15/+21)
  Limit supported size and alignment to HUGE_MAXCLASS, which in turn is now limited to be less than PTRDIFF_MAX. This resolves #278 and #295.
* Refactor arenas array (fixes deadlock). (Jason Evans, 2016-02-25; 1 file, -0/+21)
  Refactor the arenas array, which contains pointers to all extant arenas, such that it starts out as a sparse array of maximum size, and use double-checked atomics-based reads as the basis for fast and simple arena_get(). Additionally, reduce arenas_lock's role such that it only protects against arena initialization races. These changes remove the possibility for arena lookups to trigger locking, which resolves at least one known (fork-related) deadlock. This resolves #315.
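  A minimal sketch of the double-checked read pattern described above (not jemalloc's actual code; arena_t, arenas, arenas_lock, NARENAS_MAX, and arena_init_locked() are illustrative stand-ins, and C11 atomics plus pthreads are assumed):

      #include <stdatomic.h>
      #include <pthread.h>

      typedef struct arena_s arena_t;
      arena_t *arena_init_locked(unsigned ind);        /* hypothetical slow-path initializer */

      #define NARENAS_MAX 4096                         /* hypothetical fixed maximum */
      static _Atomic(arena_t *) arenas[NARENAS_MAX];   /* sparse array of maximum size */
      static pthread_mutex_t arenas_lock = PTHREAD_MUTEX_INITIALIZER;

      arena_t *
      arena_get(unsigned ind)
      {
          /* First check: one lock-free atomic load on the fast path. */
          arena_t *arena = atomic_load_explicit(&arenas[ind], memory_order_acquire);
          if (arena != NULL)
              return arena;

          /* Second check under the lock, which now only guards initialization races. */
          pthread_mutex_lock(&arenas_lock);
          arena = atomic_load_explicit(&arenas[ind], memory_order_acquire);
          if (arena == NULL) {
              arena = arena_init_locked(ind);
              atomic_store_explicit(&arenas[ind], arena, memory_order_release);
          }
          pthread_mutex_unlock(&arenas_lock);
          return arena;
      }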
* Fix arena_size computation. (Dave Watson, 2016-02-25; 1 file, -1/+1)
  Fix the arena_size computation in arena_new() to incorporate runs_avail_nclasses elements for runs_avail, rather than (runs_avail_nclasses - 1) elements. Since offsetof(arena_t, runs_avail) is used rather than sizeof(arena_t) for the first term of the computation, all of the runs_avail elements must be added into the second term. This bug was introduced (by Jason Evans) while merging pull request #330 as 3417a304ccde61ac1f68b436ec22c03f1d6824ec (Separate arena_avail trees).
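  The corrected arithmetic can be illustrated in isolation as follows (the struct members here are stand-ins for the real arena_t; only the offsetof() + n * sizeof() shape matters):

      #include <stddef.h>

      typedef struct { void *dummy; } arena_run_tree_t;   /* stand-in element type */
      typedef struct {
          int              other_fields;                  /* everything before runs_avail */
          arena_run_tree_t runs_avail[1];                 /* actually sized at arena_new() time */
      } arena_t;

      size_t
      arena_size_compute(size_t runs_avail_nclasses)
      {
          /*
           * offsetof() excludes runs_avail entirely, so the second term must
           * count all runs_avail_nclasses elements, not (runs_avail_nclasses - 1).
           */
          return offsetof(arena_t, runs_avail) +
              runs_avail_nclasses * sizeof(arena_run_tree_t);
      }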
* Fix arena_run_first_best_fit. (Dave Watson, 2016-02-25; 1 file, -1/+1)
  The merge of 3417a304ccde61ac1f68b436ec22c03f1d6824ec introduced a small bug: first_best_fit doesn't scan through all the classes, since ind is offset from runs_avail_nclasses by run_avail_bias.
* Silence miscellaneous 64-to-32-bit data loss warnings. (Jason Evans, 2016-02-24; 1 file, -10/+11)
* Refactor jemalloc_ffs*() into ffs_*(). (Jason Evans, 2016-02-24; 1 file, -2/+1)
  Use appropriate versions to resolve 64-to-32-bit data loss warnings.
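  A sketch of the kind of width-matched wrappers this implies, assuming GCC/Clang builtins (jemalloc's real ffs_*() definitions differ in detail):

      #include <stdint.h>
      #include <stddef.h>

      /* Width-matched wrappers (1-based index of least significant set bit, 0 if none). */
      static inline unsigned
      ffs_u32(uint32_t bitmap)
      {
          return (unsigned)__builtin_ffs((int)bitmap);
      }

      static inline unsigned
      ffs_u64(uint64_t bitmap)
      {
          return (unsigned)__builtin_ffsll((long long)bitmap);
      }

      static inline unsigned
      ffs_zu(size_t bitmap)
      {
      #if SIZE_MAX == UINT32_MAX
          return ffs_u32(bitmap);       /* no narrowing on 32-bit platforms */
      #else
          return ffs_u64(bitmap);       /* no truncation warning on 64-bit platforms */
      #endif
      }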
* Collapse arena_avail_tree_* into arena_run_tree_*. (Jason Evans, 2016-02-24; 1 file, -11/+7)
  These tree types converged to become identical, yet they still had independently generated red-black tree implementations.
* Separate arena_avail trees. (Dave Watson, 2016-02-24; 1 file, -88/+50)
  Separate run trees by index, replacing the previous quantize logic. Quantization by index is now performed only on insertion/removal from the tree, and not on node comparison, saving some CPU. This also means we don't have to dereference the miscelm* pointers, saving half of the memory loads from miscelms/mapbits that have fallen out of cache. A linear scan of the indices appears to be fast enough. The only cost of this is an extra tree array in each arena.
* Use table lookup for run_quantize_{floor,ceil}(). (Jason Evans, 2016-02-23; 1 file, -21/+86)
  Reduce run quantization overhead by generating lookup tables during bootstrapping, and using the tables for all subsequent run quantization.
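  The approach is roughly: run the expensive quantization once per page count at bootstrap, then index by page count afterwards. A hedged sketch, with a hypothetical bound and hypothetical *_compute() helpers:

      #include <stddef.h>

      #define LG_PAGE       12                     /* assumed 4 KiB pages */
      #define MAX_RUN_PAGES 1024                   /* hypothetical bound on quantized run sizes */

      /* Slow-path computations, executed only during bootstrap (hypothetical). */
      size_t run_quantize_floor_compute(size_t size);
      size_t run_quantize_ceil_compute(size_t size);

      static size_t run_quantize_floor_tab[MAX_RUN_PAGES + 1];
      static size_t run_quantize_ceil_tab[MAX_RUN_PAGES + 1];

      void
      run_quantize_init(void)
      {
          size_t i;
          for (i = 1; i <= MAX_RUN_PAGES; i++) {
              run_quantize_floor_tab[i] = run_quantize_floor_compute(i << LG_PAGE);
              run_quantize_ceil_tab[i]  = run_quantize_ceil_compute(i << LG_PAGE);
          }
      }

      /* size must be a positive multiple of the page size, at most MAX_RUN_PAGES pages. */
      size_t
      run_quantize_floor(size_t size)
      {
          return run_quantize_floor_tab[size >> LG_PAGE];
      }

      size_t
      run_quantize_ceil(size_t size)
      {
          return run_quantize_ceil_tab[size >> LG_PAGE];
      }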
* Fix run_quantize_ceil(). (Jason Evans, 2016-02-23; 1 file, -1/+1)
  In practice this bug had limited impact (and then only by increasing chunk fragmentation) because run_quantize_ceil() returned correct results except for inputs that could only arise from aligned allocation requests that required more than page alignment. This bug existed in the original run quantization implementation, which was introduced by 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.).
* Test run quantization. (Jason Evans, 2016-02-22; 1 file, -10/+28)
  Also rename run_quantize_*() to improve clarity. These tests demonstrate that run_quantize_ceil() is flawed.
* Refactor time_* into nstime_*. (Jason Evans, 2016-02-22; 1 file, -28/+25)
  Use a single uint64_t in nstime_t to store nanoseconds rather than using struct timespec. This reduces fragility around conversions between long and uint64_t, especially missing casts that only cause problems on 32-bit platforms.
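  A sketch of what a single-uint64_t nanosecond representation looks like (the function names follow the commit message, but the bodies are illustrative rather than jemalloc's exact implementation):

      #include <stdint.h>

      #define BILLION UINT64_C(1000000000)

      typedef struct {
          uint64_t ns;      /* nanoseconds since an arbitrary epoch */
      } nstime_t;

      void
      nstime_init2(nstime_t *time, uint64_t sec, uint64_t nsec)
      {
          time->ns = sec * BILLION + nsec;
      }

      uint64_t
      nstime_sec(const nstime_t *time)
      {
          return time->ns / BILLION;
      }

      uint64_t
      nstime_nsec(const nstime_t *time)
      {
          return time->ns % BILLION;
      }

      void
      nstime_subtract(nstime_t *time, const nstime_t *subtrahend)
      {
          /* Pure uint64_t arithmetic: no long/uint64_t conversions to get wrong on 32-bit. */
          time->ns -= subtrahend->ns;
      }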
* Implement decay-based unused dirty page purging. (Jason Evans, 2016-02-20; 1 file, -20/+307)
  This is an alternative to the existing ratio-based unused dirty page purging, and is intended to eventually become the sole purging mechanism. Add mallctls (see the usage sketch below):
  - opt.purge
  - opt.decay_time
  - arena.<i>.decay
  - arena.<i>.decay_time
  - arenas.decay_time
  - stats.arenas.<i>.decay_time
  This resolves #325.
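  These controls are driven through jemalloc's public mallctl() interface; a usage example (error handling omitted; arena index 0 and the 10-second value are arbitrary):

      #include <jemalloc/jemalloc.h>
      #include <stdio.h>

      int
      main(void)
      {
          ssize_t decay_time;
          size_t sz = sizeof(decay_time);

          /* Default decay time (seconds) applied to newly created arenas. */
          mallctl("arenas.decay_time", &decay_time, &sz, NULL, 0);
          printf("arenas.decay_time: %zd\n", decay_time);

          /* Set arena 0's decay time to 10 seconds. */
          decay_time = 10;
          mallctl("arena.0.decay_time", NULL, NULL, &decay_time, sizeof(decay_time));

          /* Trigger decay-based purging of arena 0's unused dirty pages. */
          mallctl("arena.0.decay", NULL, NULL, NULL, 0);
          return 0;
      }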
* Refactor out arena_compute_npurge(). (Jason Evans, 2016-02-20; 1 file, -43/+37)
  Refactor out arena_compute_npurge() by integrating its logic into arena_stash_dirty() as an incremental computation.
* Refactor arena_ralloc_no_move(). (Jason Evans, 2016-02-20; 1 file, -11/+10)
  Refactor early return logic in arena_ralloc_no_move() to return early on failure rather than on success.
* Refactor arena_malloc_hard() out of arena_malloc(). (Jason Evans, 2016-02-20; 1 file, -1/+17)
* Refactor prng* from cpp macros into inline functions. (Jason Evans, 2016-02-20; 1 file, -3/+1)
  Remove 32-bit variant, convert prng64() to prng_lg_range(), and add prng_range().
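  A hedged sketch of the two inline functions, using the common 64-bit LCG constants and a GCC/Clang builtin for the lg computation (jemalloc's actual implementation differs in detail):

      #include <stdint.h>
      #include <assert.h>

      #define PRNG_A UINT64_C(6364136223846793005)
      #define PRNG_C UINT64_C(1442695040888963407)

      /* Advance the LCG state and return the high lg_range bits, i.e. a value in [0, 2^lg_range). */
      static inline uint64_t
      prng_lg_range(uint64_t *state, unsigned lg_range)
      {
          assert(lg_range > 0 && lg_range <= 64);
          *state = (*state * PRNG_A) + PRNG_C;
          return *state >> (64 - lg_range);
      }

      /* Uniform value in [0, range) via rejection sampling over the next power of two. */
      static inline uint64_t
      prng_range(uint64_t *state, uint64_t range)
      {
          uint64_t ret;
          unsigned lg_range;

          assert(range > 1);
          lg_range = 64 - (unsigned)__builtin_clzll(range - 1);   /* ceil(lg(range)) */
          do {
              ret = prng_lg_range(state, lg_range);
          } while (ret >= range);
          return ret;
      }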
* Fast-path improvement: reduce # of branches and unnecessary operations. (Qi Wang, 2015-11-10; 1 file, -13/+13)
  - Combine multiple runtime branches into a single malloc_slow check (sketched below).
  - Avoid calling arena_choose / size2index / index2size on the fast path.
  - A few micro optimizations.
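  A minimal sketch of the malloc_slow idea from the first bullet; every name below is an illustrative stand-in, not jemalloc's actual fast path:

      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical option flags, fixed after initialization. */
      static bool opt_junk, opt_zero, opt_quarantine, opt_utrace, opt_xmalloc;
      static bool malloc_slow;              /* single flag consulted on the fast path */

      void *fast_path_alloc(size_t size);   /* hypothetical */
      void *slow_path_alloc(size_t size);   /* hypothetical */

      static void
      malloc_slow_flag_init(void)
      {
          /* Fold every slow-path condition into one boolean, once, at startup. */
          malloc_slow = opt_junk || opt_zero || opt_quarantine || opt_utrace || opt_xmalloc;
      }

      void *
      malloc_sketch(size_t size)
      {
          /* One predictable branch instead of re-testing each option per call. */
          if (!malloc_slow)
              return fast_path_alloc(size);
          return slow_path_alloc(size);
      }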
* Allow const keys for lookup. (Joshua Kahn, 2015-11-09; 1 file, -3/+4)
  Signed-off-by: Steve Dougherty <sdougherty@barracuda.com>
  This resolves #281.
* Remove arena_run_dalloc_decommit(). (Mike Hommey, 2015-11-09; 1 file, -23/+2)
  This resolves #284.
* Fix a xallocx(..., MALLOCX_ZERO) bug. (Jason Evans, 2015-09-25; 1 file, -3/+9)
  Fix xallocx(..., MALLOCX_ZERO) to zero the last full trailing page of large allocations that have been randomly assigned an offset of 0 when the --enable-cache-oblivious configure option is enabled. This addresses a special case missed in d260f442ce693de4351229027b37b3293fcbfd7d (Fix xallocx(..., MALLOCX_ZERO) bugs.).
* Fix xallocx(..., MALLOCX_ZERO) bugs. (Jason Evans, 2015-09-24; 1 file, -0/+10)
  Zero all trailing bytes of large allocations when the --enable-cache-oblivious configure option is enabled. This regression was introduced by 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.).
  Zero trailing bytes of huge allocations when resizing from/to a size class that is not a multiple of the chunk size.
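  For context, xallocx() and MALLOCX_ZERO are part of jemalloc's public API; the guarantee at stake is that bytes gained by an in-place resize come back zeroed (illustrative sizes, error handling trimmed):

      #include <jemalloc/jemalloc.h>
      #include <assert.h>

      int
      main(void)
      {
          size_t usize;
          unsigned char *p = mallocx(64 * 1024, MALLOCX_ZERO);   /* large allocation */
          if (p == NULL)
              return 1;

          /* Try to grow in place; any trailing bytes gained must come back zeroed. */
          usize = xallocx(p, 80 * 1024, 0, MALLOCX_ZERO);
          if (usize >= 80 * 1024)
              assert(p[usize - 1] == 0);

          dallocx(p, 0);
          return 0;
      }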
* Make arena_dalloc_large_locked_impl() static. (Jason Evans, 2015-09-20; 1 file, -1/+1)
* Centralize xallocx() size[+extra] overflow checks. (Jason Evans, 2015-09-15; 1 file, -7/+0)
* Rename arena_maxclass to large_maxclass. (Jason Evans, 2015-09-12; 1 file, -10/+10)
  arena_maxclass is no longer an appropriate name, because arenas also manage huge allocations.
* Fix xallocx() bugs. (Jason Evans, 2015-09-12; 1 file, -108/+94)
  Fix xallocx() bugs related to the 'extra' parameter when specified as non-zero.
* Reduce variables scope. (Dmitry-Me, 2015-09-04; 1 file, -9/+10)
* Rename index_t to szind_t to avoid an existing type on Solaris. (Jason Evans, 2015-08-19; 1 file, -23/+23)
  This resolves #256.
* Don't bitshift by negative amounts. (Jason Evans, 2015-08-19; 1 file, -4/+3)
  Don't bitshift by negative amounts when encoding/decoding run sizes in chunk header maps. This affected systems with page sizes greater than 8 KiB. Reported by Ingvar Hagelund <ingvar@redpill-linpro.com>.
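  The underlying hazard is plain C: shifting by a negative count is undefined behavior, so the encode/decode shift has to be split by direction. A self-contained illustration (not the actual mapbits code):

      #include <stddef.h>

      /* shift is the (possibly negative) distance between two bit layouts. */
      static size_t
      shift_by(size_t value, int shift)
      {
          /* value << shift with a negative shift is undefined behavior in C. */
          if (shift >= 0)
              return value << shift;
          return value >> -shift;
      }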
* Refactor arena_mapbits_{small,large}_set() to not preserve unzeroed. (Jason Evans, 2015-08-11; 1 file, -42/+66)
  Fix arena_run_split_large_helper() to treat newly committed memory as zeroed.
* Refactor arena_mapbits unzeroed flag management. (Jason Evans, 2015-08-11; 1 file, -21/+22)
  Only set the unzeroed flag when initializing the entire mapbits entry, rather than mutating just the unzeroed bit. This simplifies the possible mapbits state transitions.
* Arena chunk decommit cleanups and fixes. (Jason Evans, 2015-08-11; 1 file, -25/+49)
  Decommit the arena chunk header during chunk deallocation if the rest of the chunk is decommitted.
* Implement chunk hook support for page run commit/decommit. (Jason Evans, 2015-08-07; 1 file, -98/+259)
  Cascade from decommit to purge when purging unused dirty pages, so that it is possible to decommit cleaned memory rather than just purging. For non-Windows debug builds, decommit runs rather than purging them, since this causes access of deallocated runs to segfault. This resolves #251.
* Fix an in-place growing large reallocation regression. (Jason Evans, 2015-08-07; 1 file, -5/+6)
  Fix arena_ralloc_large_grow() to properly account for large_pad, so that in-place large reallocation succeeds when possible, rather than always failing. This regression was introduced by 8a03cf039cd06f9fa6972711195055d865673966 (Implement cache index randomization for large allocations.).
* Generalize chunk management hooks. (Jason Evans, 2015-08-04; 1 file, -101/+83)
  Add the "arena.<i>.chunk_hooks" mallctl, which replaces and expands on the "arena.<i>.chunk.{alloc,dalloc,purge}" mallctls. The chunk hooks allow control over chunk allocation/deallocation, decommit/commit, purging, and splitting/merging, such that the application can rely on jemalloc's internal chunk caching and retaining functionality, yet implement a variety of chunk management mechanisms and policies (see the sketch below).
  Merge the chunks_[sz]ad_{mmap,dss} red-black trees into chunks_[sz]ad_retained. This slightly reduces how hard jemalloc tries to honor the dss precedence setting; prior to this change the precedence setting was also consulted when recycling chunks.
  Fix chunk purging: don't purge chunks in arena_purge_stashed(); instead deallocate them in arena_unstash_purged(), so that the dirty memory linkage remains valid until after the last time it is used.
  This resolves #176 and #201.
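  The hook table travels through the same mallctl() interface; reading arena 0's current hooks looks roughly like this (chunk_hooks_t is the documented jemalloc 4.x structure, but treat the rest as a sketch):

      #include <jemalloc/jemalloc.h>
      #include <stdio.h>

      int
      main(void)
      {
          chunk_hooks_t hooks;
          size_t sz = sizeof(hooks);

          /*
           * Read arena 0's current chunk hooks (alloc, dalloc, commit, decommit,
           * purge, split, merge); writing a modified table back through the same
           * mallctl installs custom hooks.
           */
          if (mallctl("arena.0.chunk_hooks", &hooks, &sz, NULL, 0) != 0) {
              fprintf(stderr, "arena.0.chunk_hooks not available\n");
              return 1;
          }
          printf("default chunk alloc hook: %p\n", (void *)hooks.alloc);
          return 0;
      }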
* Change arena_palloc_large() parameter from size to usize. (Jason Evans, 2015-07-24; 1 file, -12/+12)
  This change merely documents that arena_palloc_large() always receives usize as its argument.
* Fix MinGW-related portability issues. (Jason Evans, 2015-07-23; 1 file, -2/+2)
  Create and use FMT* macros that are equivalent to the PRI* macros that inttypes.h defines. This allows uniform use of the Unix-specific format specifiers, e.g. "%zu", as well as avoiding Windows-specific definitions of e.g. PRIu64.
  Add ffs()/ffsl() support for compiling with gcc.
  Extract compatibility definitions of ENOENT, EINVAL, EAGAIN, EPERM, ENOMEM, and ENORANGE into include/msvc_compat/windows_extra.h and use the file for tests as well as for core jemalloc code.
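  The FMT* idea is simply to choose Windows-style or C99 format specifiers in one place; a stand-alone sketch (jemalloc's real definitions are generated into its headers):

      #include <inttypes.h>
      #include <stdio.h>

      /* Pick Windows-style or C99 length modifiers once, in one place. */
      #ifdef _WIN32
      #  define FMTu64 "I64u"
      #  define FMTzu  "Iu"
      #else
      #  define FMTu64 PRIu64
      #  define FMTzu  "zu"
      #endif

      int
      main(void)
      {
          uint64_t n = 42;
          size_t   z = 7;
          printf("n=%" FMTu64 " z=%" FMTzu "\n", n, z);
          return 0;
      }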
* Revert to first-best-fit run/chunk allocation. (Jason Evans, 2015-07-16; 1 file, -42/+17)
  This effectively reverts 97c04a93838c4001688fe31bf018972b4696efe2 (Use first-fit rather than first-best-fit run/chunk allocation.). In some pathological cases, first-fit search dominates allocation time, and it also tends not to converge as readily on a steady state of memory layout, since precise allocation order has a bigger effect than for first-best-fit.
* Fix MinGW build warnings. (Jason Evans, 2015-07-08; 1 file, -2/+2)
  Conditionally define ENOENT, EINVAL, etc. (was unconditional).
  Add/use PRIzu, PRIzd, and PRIzx for use in malloc_printf() calls. gcc issued (harmless) warnings since e.g. "%zu" should be "%Iu" on Windows, and the alternative to this workaround would have been to disable the function attributes which cause gcc to look for type mismatches in formatted printing function calls.
* Move a variable declaration closer to its use. (Jason Evans, 2015-07-07; 1 file, -1/+2)
* Convert arena_maybe_purge() recursion to iteration. (Jason Evans, 2015-06-23; 1 file, -10/+24)
  This resolves #235.
* Fix performance regression in arena_palloc(). (Jason Evans, 2015-05-20; 1 file, -2/+13)
  Pass large allocation requests to arena_malloc() when possible. This regression was introduced by 155bfa7da18cab0d21d87aa2dce4554166836f5d (Normalize size classes.).
* Implement cache index randomization for large allocations. (Jason Evans, 2015-05-06; 1 file, -42/+174)
  Extract szad size quantization into {extent,run}_quantize(), and quantize szad run sizes to the union of valid small region run sizes and large run sizes. Refactor iteration in arena_run_first_fit() to use run_quantize{,_first,_next}(), and add support for padded large runs.
  For large allocations that have no specified alignment constraints, compute a pseudo-random offset from the beginning of the first backing page that is a multiple of the cache line size. Under typical configurations with 4-KiB pages and 64-byte cache lines this results in a uniform distribution among 64 page boundary offsets (see the sketch below).
  Add the --disable-cache-oblivious option, primarily intended for performance testing.
  This resolves #13.
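  The offset computation reduces to something like the following, under the commit's stated 4-KiB-page / 64-byte-cache-line assumptions (the PRNG helper is hypothetical):

      #include <stdint.h>
      #include <stddef.h>

      #define LG_PAGE   12
      #define PAGE      ((size_t)1 << LG_PAGE)
      #define CACHELINE ((size_t)64)

      /* Hypothetical PRNG draw: returns a value in [0, 2^lg_range). */
      uint64_t prng_lg_range(uint64_t *state, unsigned lg_range);

      /*
       * Choose a cache-line-aligned offset within the first backing page of a
       * large run: one of PAGE / CACHELINE = 64 possible offsets, chosen
       * uniformly, so same-size allocations stop mapping to the same cache indices.
       */
      size_t
      large_random_offset(uint64_t *prng_state)
      {
          return (size_t)prng_lg_range(prng_state, 6) * CACHELINE;   /* 2^6 == PAGE / CACHELINE */
      }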
* Fix in-place shrinking huge reallocation purging bugs. (Jason Evans, 2015-03-26; 1 file, -6/+1)
  Fix the shrinking case of huge_ralloc_no_move_similar() to purge the correct number of pages, at the correct offset. This regression was introduced by 8d6a3e8321a7767cb2ca0930b85d5d488a8cc659 (Implement dynamic per arena control over dirty page purging.).
  Fix huge_ralloc_no_move_shrink() to purge the correct number of pages. This bug was introduced by 9673983443a0782d975fbcb5d8457cfd411b8b56 (Purge/zero sub-chunk huge allocations as necessary.).
* Add the "stats.arenas.<i>.lg_dirty_mult" mallctl. (Jason Evans, 2015-03-24; 1 file, -3/+5)
* Fix signed/unsigned comparison in arena_lg_dirty_mult_valid(). (Jason Evans, 2015-03-24; 1 file, -1/+2)
* Implement dynamic per arena control over dirty page purging. (Jason Evans, 2015-03-19; 1 file, -12/+75)
  Add mallctls (usage sketched below):
  - arenas.lg_dirty_mult is initialized via opt.lg_dirty_mult, and can be modified to change the initial lg_dirty_mult setting for newly created arenas.
  - arena.<i>.lg_dirty_mult controls an individual arena's dirty page purging threshold, and synchronously triggers any purging that may be necessary to maintain the constraint.
  - arena.<i>.chunk.purge allows the per arena dirty page purging function to be replaced.
  This resolves #93.
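  As with the other arena controls, these are set through mallctl(); for example (error handling omitted; arena index 0 and the value 5 are arbitrary):

      #include <jemalloc/jemalloc.h>

      int
      main(void)
      {
          ssize_t lg_dirty_mult = 5;   /* allow dirty pages up to (active memory >> 5) */

          /* Change the default applied to newly created arenas. */
          mallctl("arenas.lg_dirty_mult", NULL, NULL, &lg_dirty_mult, sizeof(lg_dirty_mult));

          /* Tighten arena 0's threshold; this may trigger synchronous purging. */
          mallctl("arena.0.lg_dirty_mult", NULL, NULL, &lg_dirty_mult, sizeof(lg_dirty_mult));
          return 0;
      }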
* Fix a declaration-after-statement regression. (Jason Evans, 2015-03-11; 1 file, -3/+2)