path: root/include
Commit message (Author, Date, Files, Lines)
* Workaround to address g++ unused variable warnings (Yinan Zhang, 2019-07-30, 1 file, -2/+4)
  g++ 5.5.0+ complained `parameter ‘expected’ set but not used [-Werror=unused-but-set-parameter]`, even though `expected` is in fact used.
* Revert "Refactor prof log" (Qi Wang, 2019-07-29, 1 file, -8/+0)
  This reverts commit 7618b0b8e458d9c0db6e4b05ccbe6c6308952890.
* Revert "Refactor profiling" (Qi Wang, 2019-07-29, 1 file, -14/+0)
  This reverts commit 0b462407ae84a62b3c097f0e9f18df487a47d9a7.
* Refactor profiling (Yinan Zhang, 2019-07-29, 1 file, -0/+14)
  Refactored the core profiling codebase into two logical parts: (a) `prof_data.c`: core internal data structure management and dumping; (b) `prof.c`: mutexes and outward-facing APIs. Some internal functions had to be exposed, but there are not that many of them if the modularization is (hopefully) clean enough.
* Refactor prof log (Yinan Zhang, 2019-07-29, 1 file, -0/+8)
  `prof.c` is growing too long, so this starts modularizing it. A few internal functions had to be exposed, but that seems a fair trade-off.
* Refactor arena_dalloc() / _sdalloc(). (Qi Wang, 2019-07-25, 1 file, -24/+18)
* Invoke arena_dalloc_promoted() properly w/o tcache. (Qi Wang, 2019-07-25, 1 file, -4/+12)
  When tcache was disabled, the dalloc promoted case was missing.
* Track the leaked VM space via the abandoned_vm counter. (Qi Wang, 2019-07-24, 1 file, -0/+3)
  The counter is 0 unless metadata allocation fails (which indicates OOM), and is mainly for sanity checking.
* Implement retain on Windows. (Qi Wang, 2019-07-24, 3 files, -2/+35)
  The VirtualAlloc and VirtualFree APIs are different in that MEM_DECOMMIT cannot be used across multiple VirtualAlloc regions. To properly support decommit, only allow merge / split within the same region: track an "is_head" state on extents and never merge across regions. The new is_head state (only relevant for retain && !maps_coalesce) is true for the first extent in each VirtualAlloc region. Whether two extents can be merged is determined from the head state, with serial numbers used for sanity checks.
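The merge rule described above can be sketched as follows. This is a minimal illustration under assumed names (`extent_sk_t` and `extent_can_merge` are not jemalloc's actual API): an extent flagged `is_head` begins a VirtualAlloc region, so a merge that would cross into it must be refused.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified extent record; jemalloc's real extent_t
 * carries far more state. */
typedef struct {
	char *addr;    /* base address of the extent */
	size_t size;   /* extent size in bytes */
	bool is_head;  /* first extent of its VirtualAlloc region? */
} extent_sk_t;

/* Two extents may merge only if they are contiguous and the second one
 * does not start a new VirtualAlloc region. */
static bool
extent_can_merge(const extent_sk_t *a, const extent_sk_t *b) {
	return a->addr + a->size == b->addr && !b->is_head;
}
```

With this rule, a region-initial extent can absorb its in-region successors, but two adjacent regions never coalesce, so VirtualFree / MEM_DECOMMIT is always applied within a single VirtualAlloc allocation.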
* Remove prof_accumbytes in arena (Yinan Zhang, 2019-07-16, 1 file, -1/+0)
  `prof_accumbytes` was supposed to have been replaced by `prof_accum` in https://github.com/jemalloc/jemalloc/pull/623.
* Fix logic in printing (Yinan Zhang, 2019-07-16, 1 file, -1/+1)
  `cbopaque` can now be overridden without overriding `write_cb` in the first place. (Otherwise there would be no need to have the `cbopaque` parameter in `malloc_message`.)
* Fix redzone setting and checking (Yinan Zhang, 2019-07-12, 1 file, -2/+2)
* Add confirm_conf option (Yinan Zhang, 2019-05-22, 1 file, -0/+1)
  If the confirm_conf option is set, each of the four malloc_conf strings is printed at program start, and each option is printed as it is set.
* Track nfills and nflushes for arenas.i.small / large. (Qi Wang, 2019-05-15, 2 files, -3/+15)
  Small is added purely for convenience. Large flushes weren't tracked before and can be useful in analysis. Large fill simply reports nmalloc, since there is currently no batch fill for large.
* Add nonfull_slabs to bin_stats_t. (Doron Roberts-Kedes, 2019-04-29, 2 files, -0/+4)
  When config_stats is enabled, track the size of bin->slabs_nonfull in the new nonfull_slabs counter in bin_stats_t. This metric should be useful for establishing an upper ceiling on the savings possible by meshing.
* Improve size class header (Yinan Zhang, 2019-04-24, 1 file, -8/+21)
  Mainly fixes typos. The only non-trivial change is in the computation of SC_NPSIZES, though the result is no different when SC_NGROUP = 4, as is always the case at the moment.
* Enforce TLS_MODEL attribute. (Qi Wang, 2019-04-16, 2 files, -3/+7)
  Caught by @zoulasc in #1460. The attribute needs to be added in the headers as well.
* Safety checks: Add a redzoning feature. (David Goldblatt, 2019-04-15, 5 files, -4/+25)
* Safety checks: Indirect through a function. (David Goldblatt, 2019-04-15, 1 file, -0/+6)
  This will let us share code on failure pathways.
* Move extra size checks behind a config flag. (David Goldblatt, 2019-04-15, 2 files, -2/+21)
  This will let us turn that flag into a generic "turn on runtime checks" flag that guards other functionality we have planned.
* Add an autoconf feature test for format_arg and a jemalloc-specific macro for it. (zoulasc, 2019-04-15, 3 files, -1/+10)
* Fix incorrect macro use. (zoulasc, 2019-04-15, 1 file, -1/+1)
  Compiling with warnings enabled produced missing-prototype warnings.
* Convert the format generator function to an annotated format function, so that the generated formats can be checked by the compiler. (zoulasc, 2019-04-15, 1 file, -15/+18)
* remove compare and branch in fast path for c++ operator delete[] (mgrice, 2019-04-08, 1 file, -0/+1)
  Summary: sdallocx checks a flag that will never be set (at least in the provided C++ destructor implementation). This branch will probably only rarely be mispredicted, but removing it saves two instructions in sdallocx and one at the callsite (to zero out flags).
* Add memory utilization analytics to mallctl (Yinan Zhang, 2019-04-04, 3 files, -0/+30)
  The analytics tool is put under the experimental.utilization namespace in mallctl. Input is one pointer or an array of pointers; the output is a list of memory utilization statistics.
* Eagerly purge oversized merged extents. (Qi Wang, 2019-03-15, 1 file, -0/+20)
  This change improves memory usage slightly, at virtually no CPU cost.
* Remove some unused comments. (Qi Wang, 2019-03-15, 1 file, -3/+0)
* Fallback to 32-bit when 8-bit atomics are missing for TSD. (Qi Wang, 2019-03-09, 1 file, -2/+17)
  When this happens, it might cause a slowdown on the fast-path operations; however, such cases are very rare.
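The shape of such a fallback can be sketched as follows. This is an illustration, not jemalloc's actual code: `JEMALLOC_SKETCH_HAVE_ATOMIC_U8` is a hypothetical feature-test macro, and the function names are invented. The one-byte TSD state normally lives in an 8-bit atomic; on platforms without 8-bit atomics the same value is simply kept in a wider 32-bit atomic.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical feature-test macro standing in for the configure check. */
#ifdef JEMALLOC_SKETCH_HAVE_ATOMIC_U8
typedef _Atomic uint8_t tsd_state_atomic_t;
#else
/* Fallback: store the one-byte state in a 32-bit atomic. */
typedef _Atomic uint32_t tsd_state_atomic_t;
#endif

static uint8_t
tsd_state_load(tsd_state_atomic_t *state) {
	/* The value always fits in a byte, whatever the storage width. */
	return (uint8_t)atomic_load_explicit(state, memory_order_acquire);
}

static void
tsd_state_store(tsd_state_atomic_t *state, uint8_t value) {
	atomic_store_explicit(state, value, memory_order_release);
}
```

The callers are unchanged either way; only the storage width (and possibly the cost of the atomic ops) differs.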
* Detect if 8-bit atomics are available. (Qi Wang, 2019-03-09, 2 files, -0/+14)
  In some rare cases (older compilers, e.g. gcc 4.2 on MIPS), 8-bit atomics might be unavailable. Detect such cases so that we can work around them.
* Do not use #pragma GCC diagnostic with gcc < 4.6. (Jason Evans, 2019-03-09, 1 file, -10/+12)
  This regression was introduced by 3d29d11ac2c1583b9959f73c0548545018d31c8a (Clean compilation -Wextra).
* Remove JE_FORCE_SYNC_COMPARE_AND_SWAP_[48]. (Jason Evans, 2019-02-22, 1 file, -16/+0)
  These macros have been unused since d4ac7582f32f506d5203bea2f0115076202add38 (Introduce a backport of C11 atomics).
* Avoid redefining tsd_t. (Jason Evans, 2019-02-21, 1 file, -1/+1)
  This fixes a build failure when integrating with FreeBSD's libc. This regression was introduced by d1e11d48d4c706e17ef3508e2ddb910f109b779f (Move tsd link and in_hook after tcache.).
* Disable muzzy decay by default. (Qi Wang, 2019-02-04, 1 file, -1/+1)
* Sanity check szind on tcache flush. (Qi Wang, 2019-02-01, 1 file, -0/+3)
  This adds some overhead to the tcache flush path (which is one of the hot paths), so guard it behind a config option.
* Rename huge_threshold to oversize_threshold. (Qi Wang, 2019-01-25, 3 files, -6/+6)
  The keyword "huge" tends to remind people of huge pages, which are not relevant to this feature.
* Set huge_threshold to 8M by default. (Qi Wang, 2019-01-24, 1 file, -1/+1)
  This feature uses a dedicated arena to handle huge requests, which significantly improves VM fragmentation. In the production workloads we tested, it often reduces VM size by >30%.
* Avoid creating bg thds for huge arena alone. (Qi Wang, 2019-01-16, 1 file, -0/+1)
  For low arena count settings, the huge threshold feature may trigger an unwanted bg thd creation. Given that the huge arena does eager purging by default, bypass bg thd creation when initializing the huge arena.
* Avoid potential issues on extent zero-out. (Qi Wang, 2019-01-12, 1 file, -0/+5)
  When custom extent_hooks or transparent huge pages are in use, the purging semantics may change, which means we may not get zeroed pages on repopulating. Fix the issue by manually memset()ing in such cases.
* implement malloc_getcpu for windows (Leonardo Santagada, 2019-01-08, 2 files, -2/+4)
* Only read arena index from extent on the tcache flush path. (Qi Wang, 2018-12-18, 1 file, -9/+10)
  Add extent_arena_ind_get() to avoid loading the actual arena pointer when we only need to check for an arena match.
* Add rate counters to stats (Alexander Zinoviev, 2018-12-18, 2 files, -8/+19)
* Store the bin shard selection in TSD. (Qi Wang, 2018-12-04, 3 files, -5/+21)
  This avoids having to choose a bin shard on the fly, and will also allow flexible bin binding for each thread.
* Add opt.bin_shards to specify number of bin shards. (Qi Wang, 2018-12-04, 2 files, -9/+5)
  The option uses the same format as "slab_sizes" to specify the number of shards for each bin size.
* Add support for sharded bins within an arena. (Qi Wang, 2018-12-04, 7 files, -7/+87)
  This makes it possible to have multiple sets of bins in an arena, which improves arena scalability, because the bins (especially the small ones) are always the limiting factor in production workloads. A bin shard is picked on allocation; each extent tracks the bin shard id for deallocation. The shard count will be determined by runtime options.
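The shard-picking side (later cached in TSD, per the "Store the bin shard selection in TSD" commit) can be sketched like this. All names here are illustrative, not jemalloc's: a thread binds itself to one shard on its first allocation and reuses it afterwards, so per-thread traffic stays on one shard while different threads spread across shards.

```c
#include <stdint.h>

/* Assumed fixed shard count; in jemalloc this comes from opt.bin_shards. */
enum { N_BIN_SHARDS = 4 };

/* Per-thread cached shard choice; UINT32_MAX means "not chosen yet".
 * This stands in for the TSD slot in the real code. */
static _Thread_local uint32_t tls_bin_shard = UINT32_MAX;

static uint32_t
bin_shard_pick(uint32_t thread_seed) {
	if (tls_bin_shard == UINT32_MAX) {
		/* First allocation on this thread: bind it to a shard. */
		tls_bin_shard = thread_seed % N_BIN_SHARDS;
	}
	return tls_bin_shard;
}
```

Because the choice is cached, the per-allocation cost is a single thread-local load; the modulo only runs once per thread.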
* mutex: fix trylock spin wait contention (Dave Watson, 2018-11-28, 1 file, -6/+15)
  If there are 3 or more threads spin-waiting on the same mutex, there will be excessive exclusive cacheline contention, because pthread_mutex_trylock() immediately tries to CAS in a new value instead of first checking whether the lock is already held. This diff adds a "locked" hint flag; while it is set, we spin-wait without trylock()ing. I don't know of any other portable way to get the same behavior as pthread_mutex_lock(). This is easy to test via ttest, e.g. ./ttest1 500 3 10000 1 100; throughput is nearly 3x as fast. This blames to the mutex profiling changes; however, we almost never have 3 or more threads contending in properly configured production workloads, but it is still worth fixing.
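The spin-then-trylock idea is the classic test-and-test-and-set pattern; here is a minimal sketch with invented names (this is not jemalloc's actual mutex, which wraps pthread mutexes and tracks profiling state): waiters spin on a plain load of the "locked" hint, a shared read that causes no cacheline ownership transfer, and attempt the expensive CAS only once the lock looks free.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_bool locked; /* the "locked" hint flag */
} hint_mutex_t;

static bool
hint_mutex_trylock(hint_mutex_t *m) {
	bool expected = false;
	/* CAS false -> true; pulls the cacheline in exclusive mode. */
	return atomic_compare_exchange_strong(&m->locked, &expected, true);
}

static void
hint_mutex_lock(hint_mutex_t *m) {
	for (;;) {
		/* Test: wait while the hint says "locked" (cheap shared read). */
		while (atomic_load_explicit(&m->locked, memory_order_relaxed)) {
			/* spin; a real implementation would pause/yield here */
		}
		/* Test-and-set: only now contend for exclusive ownership. */
		if (hint_mutex_trylock(m)) {
			return;
		}
	}
}

static void
hint_mutex_unlock(hint_mutex_t *m) {
	atomic_store_explicit(&m->locked, false, memory_order_release);
}
```

With three or more spinners, the inner read-only loop keeps the line shared among waiters, and only the winner's CAS invalidates it, which is where the observed throughput improvement comes from.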
* Set the default number of background threads to 4. (Qi Wang, 2018-11-16, 1 file, -0/+1)
  The setting has been tested in production for a while: no negative effects, while we were able to reduce the number of threads per process.
* Deprecate OSSpinLock. (Qi Wang, 2018-11-14, 3 files, -17/+1)
* Add a fastpath for arena_slab_reg_alloc_batch (Dave Watson, 2018-11-14, 2 files, -0/+25)
  Also adds a configure.ac check for __builtin_popcount, which is used in the new fastpath.
* add extent_nfree_sub (Dave Watson, 2018-11-14, 1 file, -0/+6)
* Fix tcache_flush (follow-up to cd2931a). (Qi Wang, 2018-11-13, 2 files, -0/+6)
  Also catch invalid tcache ids.