path: root/include/jemalloc
...
* Refactor ph_merge_ordered() out of ph_merge(). (Jason Evans, 2016-03-08; 2 files, +23/-17)
* Pairing heap (Dave Watson, 2016-03-08; 3 files, +267/-0)

  Initial implementation of a two-pass pairing heap with aux list;
  research papers are linked in comments. Where search/nsearch/last
  aren't needed, this gives much faster first(), delete(), and insert().
  Insert is O(1), and first/delete don't have to walk the whole tree.

  Also tested rb_old with parent pointers - it was better than the
  current rb.h for memory loads, but still much worse than a pairing
  heap. An array-based heap would be much faster if everything fits in
  memory, but on a cold cache it has many more memory loads for most
  operations.
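For illustration, here is a minimal two-pass pairing heap in C - a sketch
of the technique described above, not jemalloc's ph.h (which adds the aux
list and macro-based code generation); all names are hypothetical:

    #include <stddef.h>

    /* Intrusive node: leftmost-child / next-sibling layout. */
    typedef struct ph_node_s ph_node_t;
    struct ph_node_s {
        ph_node_t *child;   /* leftmost child */
        ph_node_t *sibling; /* next sibling in the child list */
        int key;
    };

    /* Meld two heaps: the root with the larger key becomes a child. */
    static ph_node_t *
    ph_meld(ph_node_t *a, ph_node_t *b) {
        if (a == NULL) return b;
        if (b == NULL) return a;
        if (b->key < a->key) { ph_node_t *t = a; a = b; b = t; }
        b->sibling = a->child;
        a->child = b;
        return a;
    }

    /* O(1) insert: meld a singleton with the root. */
    static ph_node_t *
    ph_insert(ph_node_t *root, ph_node_t *n) {
        n->child = n->sibling = NULL;
        return ph_meld(root, n);
    }

    /* first() is just the root. Delete-min unlinks the root and does
     * the two-pass merge: meld children pairwise left to right, then
     * meld the pair results right to left. */
    static ph_node_t *
    ph_delete_min(ph_node_t *root) {
        ph_node_t *stack = NULL;
        ph_node_t *cur = root->child;
        while (cur != NULL) {                  /* pass 1: pairwise meld */
            ph_node_t *next = cur->sibling;
            ph_node_t *rest = (next != NULL) ? next->sibling : NULL;
            cur = ph_meld(cur, next);
            cur->sibling = stack;
            stack = cur;
            cur = rest;
        }
        root = NULL;
        while (stack != NULL) {                /* pass 2: right to left */
            ph_node_t *next = stack->sibling;
            root = ph_meld(root, stack);
            stack = next;
        }
        return root;                 /* old root is unlinked, not freed */
    }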
* Avoid a potential innocuous compiler warning. (Jason Evans, 2016-03-03; 1 file, +1/-1)

  Add a cast to avoid comparing an ssize_t value to a uint64_t value
  that is always larger than a 32-bit ssize_t. This silences an
  innocuous compiler warning from e.g. gcc 4.2.1 about the comparison
  always having the same result.
* Fix stats.arenas.<i>.[...] for --disable-stats case. (Jason Evans, 2016-02-28; 3 files, +8/-1)

  Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time}
  initialization. Fix stats.arenas.<i>.{pactive,pdirty} to read under
  the protection of the arena mutex.
* Add/alphabetize private symbols. (Jason Evans, 2016-02-27; 1 file, +15/-15)
* Fix stats.cactive accounting regression. (Jason Evans, 2016-02-27; 1 file, +12/-2)

  Fix stats.cactive accounting to always increase/decrease by multiples
  of the chunk size, even for huge size classes that are not multiples
  of the chunk size, e.g. {2.5, 3, 3.5, 5, 7} MiB with a 2 MiB chunk
  size. This regression was introduced by
  155bfa7da18cab0d21d87aa2dce4554166836f5d (Normalize size classes.) and
  first released in 4.0.0. This resolves #336.
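The fix amounts to rounding each huge size up to chunk granularity before
adjusting the counter; a sketch of that rounding, assuming a power-of-two
chunk size as jemalloc uses (e.g. a 2.5 MiB class with 2 MiB chunks
accounts as 4 MiB):

    #include <stddef.h>

    /* Round size up to a multiple of chunksize (chunksize must be 2^k). */
    static size_t
    chunk_ceiling(size_t size, size_t chunksize) {
        return (size + chunksize - 1) & ~(chunksize - 1);
    }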
* Refactor some bitmap cpp logic. (Jason Evans, 2016-02-26; 1 file, +2/-3)
* Use linear scan for small bitmaps (Dave Watson, 2016-02-26; 1 file, +48/-2)

  For small bitmaps, a linear scan of the bitmap is slightly faster than
  a tree search - bitmap_t is more compact, and there are fewer writes
  since we don't have to propagate state transitions up the tree. On
  x86_64 with the current settings, I'm seeing a ~0.5-1% CPU improvement
  in production canaries with this change.

  The old tree code is left in, since 32-bit sizes are much larger (and
  ffsl smaller), and maybe the run sizes will change in the future. This
  resolves #339.
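A sketch of the linear-scan idea (hypothetical names; jemalloc's bitmap_t
carries more metadata): scan whole words and use ffsl() to locate the
first set bit, with no tree levels to update on the way out:

    #include <limits.h>
    #include <stddef.h>
    #include <strings.h> /* ffsl() */

    #define GROUP_NBITS (sizeof(unsigned long) * CHAR_BIT)

    /* Find and clear the first set bit; returns its index, or
     * (size_t)-1 if the bitmap is empty. One pass, one write. */
    static size_t
    bitmap_sfu_linear(unsigned long *bits, size_t ngroups) {
        for (size_t i = 0; i < ngroups; i++) {
            if (bits[i] != 0) {
                unsigned bit = (unsigned)ffsl((long)bits[i]) - 1;
                bits[i] &= ~(1UL << bit);
                return i * GROUP_NBITS + bit;
            }
        }
        return (size_t)-1;
    }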
* Miscellaneous bitmap refactoring. (Jason Evans, 2016-02-26; 2 files, +10/-12)
* Silence miscellaneous 64-to-32-bit data loss warnings. (Jason Evans, 2016-02-26; 1 file, +9/-10)

  This resolves #341.
* Make *allocx() size class overflow behavior defined. (Jason Evans, 2016-02-25; 6 files, +19/-20)

  Limit supported size and alignment to HUGE_MAXCLASS, which in turn is
  now limited to be less than PTRDIFF_MAX. This resolves #278 and #295.
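The shape of the check is roughly the sketch below (MAXCLASS stands in
for HUGE_MAXCLASS; the constant is illustrative). Keeping every request
below PTRDIFF_MAX means pointer differences over the allocation cannot
overflow:

    #include <stdint.h>
    #include <stddef.h>

    /* Illustrative limit; the real HUGE_MAXCLASS is derived from the
     * size-class data but is likewise < PTRDIFF_MAX. */
    #define MAXCLASS ((size_t)PTRDIFF_MAX >> 1)

    static int
    allocx_request_valid(size_t size, size_t alignment) {
        if (size == 0 || size > MAXCLASS)
            return 0;
        /* Worst case, an aligned request consumes size + alignment - 1. */
        if (alignment != 0 && size > MAXCLASS - (alignment - 1))
            return 0;
        return 1;
    }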
* Refactor arenas array (fixes deadlock). (Jason Evans, 2016-02-25; 4 files, +30/-26)

  Refactor the arenas array, which contains pointers to all extant
  arenas, such that it starts out as a sparse array of maximum size, and
  use double-checked atomics-based reads as the basis for fast and
  simple arena_get(). Additionally, reduce arenas_lock's role such that
  it only protects against arena initialization races. These changes
  remove the possibility for arena lookups to trigger locking, which
  resolves at least one known (fork-related) deadlock. This resolves
  #315.
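The double-checked pattern looks roughly like this sketch, written with
C11 atomics and pthreads rather than jemalloc's own atomic and mutex
wrappers; arena_t and arena_init_locked() are stand-ins:

    #include <stdatomic.h>
    #include <stdlib.h>
    #include <pthread.h>

    #define NARENAS_MAX 256 /* illustrative fixed maximum */

    typedef struct { unsigned ind; } arena_t; /* stand-in */

    static _Atomic(arena_t *) arenas[NARENAS_MAX]; /* sparse, full size */
    static pthread_mutex_t arenas_lock = PTHREAD_MUTEX_INITIALIZER;

    static arena_t *
    arena_init_locked(unsigned ind) {
        arena_t *a = malloc(sizeof(*a));
        if (a != NULL)
            a->ind = ind;
        return a;
    }

    static arena_t *
    arena_get(unsigned ind) {
        /* Fast path: one atomic load, no lock. */
        arena_t *a = atomic_load_explicit(&arenas[ind],
            memory_order_acquire);
        if (a != NULL)
            return a;
        /* Slow path: lock only to initialize, then re-check. */
        pthread_mutex_lock(&arenas_lock);
        a = atomic_load_explicit(&arenas[ind], memory_order_acquire);
        if (a == NULL) {
            a = arena_init_locked(ind);
            atomic_store_explicit(&arenas[ind], a, memory_order_release);
        }
        pthread_mutex_unlock(&arenas_lock);
        return a;
    }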
* Attempt mmap-based in-place huge reallocation. (Jason Evans, 2016-02-25; 1 file, +2/-2)

  Attempt mmap-based in-place huge reallocation by plumbing new_addr
  into chunk_alloc_mmap(). This can dramatically speed up incremental
  huge reallocation. This resolves #335.
* Fix ffs_zu() compilation error on MinGW. (Jason Evans, 2016-02-24; 1 file, +5/-3)

  This regression was caused by 9f4ee6034c3ac6a8c8b5f9a0d76822fb2fd90c41
  (Refactor jemalloc_ffs*() into ffs_*().).
* Silence miscellaneous 64-to-32-bit data loss warnings. (Jason Evans, 2016-02-24; 4 files, +16/-11)
* Change lg_floor() return type from size_t to unsigned. (Jason Evans, 2016-02-24; 2 files, +18/-17)
* Make opt_narenas unsigned rather than size_t. (Jason Evans, 2016-02-24; 1 file, +1/-1)
* Make nhbins unsigned rather than size_t. (Jason Evans, 2016-02-24; 1 file, +1/-1)
* Refactor jemalloc_ffs*() into ffs_*(). (Jason Evans, 2016-02-24; 6 files, +68/-37)

  Use appropriate versions to resolve 64-to-32-bit data loss warnings.
* Fix Windows build issues (Dmitri Smirnov, 2016-02-24; 1 file, +1/-2)

  This resolves #333.
* Collapse arena_avail_tree_* into arena_run_tree_*. (Jason Evans, 2016-02-24; 1 file, +1/-2)

  These tree types converged to become identical, yet they still had
  independently generated red-black tree implementations.
* Separate arena_avail trees (Dave Watson, 2016-02-24; 1 file, +6/-6)

  Separate run trees by index, replacing the previous quantize logic.
  Quantization by index is now performed only on insertion/removal from
  the tree, and not on node comparison, saving some CPU. This also means
  we don't have to dereference the miscelm* pointers, saving half of the
  memory loads from miscelms/mapbits that have fallen out of cache. A
  linear scan of the indices appears to be fast enough. The only cost of
  this is an extra tree array in each arena.
* Remove rbt_nil (Dave Watson, 2016-02-24; 1 file, +64/-90)

  Since this is an intrusive tree, rbt_nil is the whole size of the node
  and can be quite large. For example, miscelm is ~100 bytes.
* Use table lookup for run_quantize_{floor,ceil}(). (Jason Evans, 2016-02-23; 2 files, +2/-1)

  Reduce run quantization overhead by generating lookup tables during
  bootstrapping, and using the tables for all subsequent run
  quantization.
* Test run quantization. (Jason Evans, 2016-02-22; 2 files, +8/-0)

  Also rename run_quantize_*() to improve clarity. These tests
  demonstrate that run_quantize_ceil() is flawed.
* Indentation style cleanup. (Jason Evans, 2016-02-22; 1 file, +13/-13)
* Refactor time_* into nstime_*. (Jason Evans, 2016-02-22; 5 files, +68/-59)

  Use a single uint64_t in nstime_t to store nanoseconds rather than
  using struct timespec. This reduces fragility around conversions
  between long and uint64_t, especially missing casts that only cause
  problems on 32-bit platforms.
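The representation is essentially one 64-bit nanosecond counter; a
simplified sketch of the core accessors (close in spirit to nstime.h,
but not a copy of it):

    #include <stdint.h>

    #define NS_PER_SEC UINT64_C(1000000000)

    typedef struct { uint64_t ns; } nstime_t;

    /* No separate seconds field, so no long <-> uint64_t conversions. */
    static void
    nstime_init2(nstime_t *t, uint64_t sec, uint64_t nsec) {
        t->ns = sec * NS_PER_SEC + nsec;
    }
    static uint64_t nstime_sec(const nstime_t *t)  { return t->ns / NS_PER_SEC; }
    static uint64_t nstime_nsec(const nstime_t *t) { return t->ns % NS_PER_SEC; }
    static void nstime_add(nstime_t *t, const nstime_t *u) { t->ns += u->ns; }
    static void nstime_subtract(nstime_t *t, const nstime_t *u) { t->ns -= u->ns; }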
* Remove _WIN32-specific struct timespec declaration. (Jason Evans, 2016-02-21; 1 file, +0/-6)

  struct timespec is already defined by the system (at least on MinGW).
* Fix overflow in prng_range(). (Jason Evans, 2016-02-21; 4 files, +27/-5)

  Add jemalloc_ffs64() and use it instead of jemalloc_ffsl() in
  prng_range(), since long is not guaranteed to be a 64-bit type.
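The pitfall is that long is 32 bits on LLP64 targets (e.g. 64-bit
Windows), so ffsl() silently truncates a 64-bit operand. A sketch of a
width-correct ffs64:

    #include <stdint.h>

    /* Returns the 1-based index of the least significant set bit, or 0
     * if x == 0 - the same convention as ffs(). */
    static unsigned
    ffs64(uint64_t x) {
    #if defined(__GNUC__) || defined(__clang__)
        return (unsigned)__builtin_ffsll((long long)x);
    #else
        if (x == 0)
            return 0;
        unsigned bit = 1;
        while ((x & 1) == 0) { x >>= 1; bit++; }
        return bit;
    #endif
    }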
* Add symbol mangling for prng_[lg_]range(). (Jason Evans, 2016-02-20; 1 file, +2/-0)
* Fix warning in ipalloc (rustyx, 2016-02-20; 1 file, +2/-2)
* Detect LG_SIZEOF_PTR depending on MSVC platform target (rustyx, 2016-02-20; 1 file, +8/-0)
* Fix a typo in the ckh_search() prototype. (Christopher Ferris, 2016-02-20; 1 file, +1/-1)
* Handle unaligned keys in hash(). (Jason Evans, 2016-02-20; 1 file, +17/-1)

  Reported by Christopher Ferris <cferris@google.com>.
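The standard remedy, and roughly the shape of the fix, is to read
potentially unaligned key bytes through memcpy() instead of a cast
pointer; the compiler still emits a single load where alignment allows:

    #include <stdint.h>
    #include <string.h>

    /* Alignment-agnostic 32-bit load; avoids undefined behavior on
     * strict-alignment targets. */
    static uint32_t
    load_u32(const void *p) {
        uint32_t v;
        memcpy(&v, p, sizeof(v));
        return v;
    }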
* Implement decay-based unused dirty page purging. (Jason Evans, 2016-02-20; 7 files, +142/-23)

  This is an alternative to the existing ratio-based unused dirty page
  purging, and is intended to eventually become the sole purging
  mechanism. Add mallctls:
  - opt.purge
  - opt.decay_time
  - arena.<i>.decay
  - arena.<i>.decay_time
  - arenas.decay_time
  - stats.arenas.<i>.decay_time

  This resolves #325.
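A usage example against the public API of that era (jemalloc 4.1; later
releases renamed decay_time to dirty_decay_ms), reading the option and
setting the default for subsequently created arenas:

    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    int
    main(void) {
        ssize_t decay_time;
        size_t len = sizeof(decay_time);
        if (mallctl("opt.decay_time", &decay_time, &len, NULL, 0) == 0)
            printf("opt.decay_time: %zd s\n", decay_time);

        /* Default decay time for arenas created from here on. */
        ssize_t new_time = 30;
        mallctl("arenas.decay_time", NULL, NULL, &new_time,
            sizeof(new_time));
        return 0;
    }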
* Implement smoothstep table generation. (Jason Evans, 2016-02-20; 3 files, +365/-0)

  Check in a generated smootherstep table as smoothstep.h rather than
  generating it at configure time, since not all systems (e.g. Windows)
  have dc.
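The table is just smootherstep(x) = 6x^5 - 15x^4 + 10x^3 sampled at fixed
steps and scaled to fixed point. A sketch of an equivalent C generator in
place of the dc script (step count and fixed-point scale here are
illustrative, not the checked-in constants):

    #include <stdint.h>
    #include <stdio.h>

    #define NSTEPS 200 /* samples over (0, 1] */
    #define BFP    24  /* binary fixed-point fraction bits */

    int
    main(void) {
        for (unsigned i = 1; i <= NSTEPS; i++) {
            double x = (double)i / NSTEPS;
            /* Horner form of 6x^5 - 15x^4 + 10x^3. */
            double s = x * x * x * (x * (x * 6.0 - 15.0) + 10.0);
            printf("0x%08jxULL,\n", (uintmax_t)(s * (double)(1 << BFP)));
        }
        return 0;
    }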
* Refactor arenas_cache tsd. (Jason Evans, 2016-02-20; 4 files, +53/-28)

  Refactor arenas_cache tsd into arenas_tdata, which is a structure of
  type arena_tdata_t.
* Refactor arena_malloc_hard() out of arena_malloc(). (Jason Evans, 2016-02-20; 2 files, +8/-16)
* Refactor prng* from cpp macros into inline functions. (Jason Evans, 2016-02-20; 5 files, +80/-36)

  Remove 32-bit variant, convert prng64() to prng_lg_range(), and add
  prng_range().
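A sketch of the two functions: a 64-bit linear congruential generator
whose high bits feed prng_lg_range(), with prng_range() rejecting draws
>= range to avoid modulo bias (the LCG constants are Knuth's MMIX ones;
the overall shape is illustrative):

    #include <stdint.h>

    /* Advance the LCG and return its top lg_range bits
     * (1 <= lg_range <= 64); an LCG's high bits are its best ones. */
    static uint64_t
    prng_lg_range(uint64_t *state, unsigned lg_range) {
        *state = *state * UINT64_C(6364136223846793005)
            + UINT64_C(1442695040888963407);
        return *state >> (64 - lg_range);
    }

    /* Uniform draw in [0, range) via rejection sampling. */
    static uint64_t
    prng_range(uint64_t *state, uint64_t range) {
        if (range < 2)
            return 0;
        unsigned lg = 1; /* ceil(lg(range)) */
        while (lg < 64 && (UINT64_C(1) << lg) < range)
            lg++;
        uint64_t r;
        do {
            r = prng_lg_range(state, lg);
        } while (r >= range);
        return r;
    }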
* Use ticker for incremental tcache GC. (Jason Evans, 2016-02-20; 1 file, +2/-4)
* Implement ticker. (Jason Evans, 2016-02-20; 3 files, +82/-0)

  Implement ticker, which provides a simple API for ticking off some
  number of events before indicating that the ticker has hit its limit.
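The whole API fits in a few lines; a sketch close in spirit to what
ticker.h provides (details simplified):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        int32_t tick;   /* events remaining in this cycle */
        int32_t nticks; /* cycle length */
    } ticker_t;

    static void
    ticker_init(ticker_t *t, int32_t nticks) {
        t->tick = nticks;
        t->nticks = nticks;
    }

    /* Returns true once every nticks calls, rearming itself. */
    static bool
    ticker_tick(ticker_t *t) {
        if (--t->tick == 0) {
            t->tick = t->nticks;
            return true;
        }
        return false;
    }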
* Flesh out time_*() API. (Jason Evans, 2016-02-20; 2 files, +28/-2)
* Add time_update(). (Cameron Evans, 2016-02-20; 4 files, +35/-0)
* Add --with-malloc-conf. (Jason Evans, 2016-02-20; 2 files, +4/-0)

  Add --with-malloc-conf, which makes it possible to embed a default
  options string during configuration.
* Fix arena_sdalloc() line wrapping. (Jason Evans, 2016-02-20; 1 file, +8/-5)
* Tweak code to allow compilation of concatenated src/*.c sources. (Jason Evans, 2015-11-12; 2 files, +46/-43)

  This resolves #294.
* Fix a comment. (Jason Evans, 2015-11-12; 1 file, +1/-1)
* Fast-path improvement: reduce # of branches and unnecessary operations. (Qi Wang, 2015-11-10; 4 files, +150/-97)

  - Combine multiple runtime branches into a single malloc_slow check.
  - Avoid calling arena_choose / size2index / index2size on fast path.
  - A few micro optimizations.
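The first bullet above, in sketch form: OR all rarely-true options into
one flag at init so the hot path pays a single branch (names and the set
of options here are illustrative, not jemalloc's actual internals):

    #include <stdbool.h>
    #include <stdlib.h>

    static bool malloc_slow; /* OR of every slow-path trigger */

    static void
    malloc_slow_init(bool opt_junk, bool opt_zero, bool opt_xmalloc) {
        malloc_slow = opt_junk | opt_zero | opt_xmalloc;
    }

    static void *slow_alloc(size_t n) { /* junk/zero/... */ return malloc(n); }
    static void *fast_alloc(size_t n) { return malloc(n); }

    static void *
    my_malloc(size_t size) {
        if (malloc_slow) /* one branch instead of several */
            return slow_alloc(size);
        return fast_alloc(size);
    }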
* Add function to destroy tree (Joshua Kahn, 2015-11-09; 1 file, +40/-1)

  ex_destroy iterates over the tree using post-order traversal so nodes
  can be removed and processed by the callback function without paying
  the cost to rebalance the tree. The destruction process cannot be
  stopped once started.
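A sketch of post-order destruction for a binary tree: children are
processed before their parent and nothing is rebalanced, so teardown is
O(n) with no rotations (names are hypothetical; the callback may free
each node):

    typedef struct node_s node_t;
    struct node_s {
        node_t *left, *right;
    };

    static void
    tree_destroy(node_t *node, void (*cb)(node_t *, void *), void *arg) {
        if (node == NULL)
            return;
        tree_destroy(node->left, cb, arg);
        tree_destroy(node->right, cb, arg);
        cb(node, arg); /* safe to free: no later traversal touches it */
    }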
* Allow const keys for lookup (Joshua Kahn, 2015-11-09; 2 files, +11/-11)

  Signed-off-by: Steve Dougherty <sdougherty@barracuda.com>

  This resolves #281.