rallocx() for an alignment-constrained request may end up with a
smaller-than-worst-case size if in-place reallocation succeeds due to
serendipitous alignment. In such cases, sampling may not happen.
This regression was caused by 3ef51d7f733ac6432e80fa902a779ab5b98d74f6
(Optimize the fast paths of calloc() and [m,d,sd]allocx().).
b2c0d6322d2307458ae2b28545f8a5c9903d7ef5 (Add witness, a simple online
locking validator.) caused a broad propagation of tsd throughout the
internal API, but tsd_fetch() was designed to fail prior to tsd
bootstrapping. Fix this by splitting tsd_t into non-nullable tsd_t and
nullable tsdn_t, and modifying all internal APIs that do not critically
rely on tsd to take nullable pointers. Furthermore, add the
tsd_booted_get() function so that tsdn_fetch() can probe whether tsd
bootstrapping is complete and return NULL if not. All dangerous
conversions of nullable pointers are tsdn_tsd() calls that assert-fail
on invalid conversion.
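
A minimal sketch of the nullable/non-nullable split described above, with
the two types collapsed onto one struct for illustration (the real tsd_t
carries the full per-thread state):

    #include <assert.h>
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct tsd_s tsd_t;  /* non-nullable */
    typedef struct tsd_s tsdn_t; /* nullable */

    bool tsd_booted_get(void); /* probe: is tsd bootstrapped yet? */
    tsd_t *tsd_fetch(void);    /* must not be called pre-boot */

    tsdn_t *
    tsdn_fetch(void)
    {
        if (!tsd_booted_get())
            return NULL; /* tsd not yet bootstrapped */
        return (tsdn_t *)tsd_fetch();
    }

    /* The only dangerous conversion; assert-fails on NULL. */
    tsd_t *
    tsdn_tsd(tsdn_t *tsdn)
    {
        assert(tsdn != NULL);
        return (tsd_t *)tsdn;
    }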
This is a broader application of optimizations to malloc() and free() in
f4a0f32d340985de477bbe329ecdaecd69ed1055 (Fast-path improvement:
reduce # of branches and unnecessary operations.).
This resolves #321.
If the OS overcommits:
- Commit all mappings in pages_map() regardless of whether the caller
requested committed memory.
- Linux-specific: Specify MAP_NORESERVE to avoid
unfortunate interactions with heuristic overcommit mode during
fork(2).
This resolves #193.
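
A sketch of the mapping path under overcommit, assuming an os_overcommits
flag probed once at startup (the flag name and surrounding structure are
illustrative, not the exact jemalloc code):

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/mman.h>

    static bool os_overcommits; /* assumed: detected at startup */

    void *
    pages_map(void *addr, size_t size, bool *commit)
    {
        int flags = MAP_PRIVATE | MAP_ANONYMOUS;
        void *ret;

        if (os_overcommits) {
            *commit = true; /* commit regardless of the request */
    #ifdef MAP_NORESERVE
            /* Avoid fork(2) trouble under heuristic overcommit. */
            flags |= MAP_NORESERVE;
    #endif
        }
        ret = mmap(addr, size, PROT_READ | PROT_WRITE, flags, -1, 0);
        return (ret == MAP_FAILED) ? NULL : ret;
    }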
Fix witness to clear its list of owned mutexes in the child if
platform-specific malloc_mutex code re-initializes mutexes rather than
unlocking them.
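
A sketch of the child-side reset (the list helpers and tsd accessor
mirror jemalloc's internal style but are assumptions here):

    /* After fork(2), in the child: if the platform re-initializes
     * mutexes instead of unlocking them, the ownership records in
     * the witness list are stale and must be dropped, not popped. */
    void
    witness_postfork_child(tsd_t *tsd)
    {
        witness_list_t *witnesses = tsd_witnessesp_get(tsd);
        ql_new(witnesses); /* clear the owned-mutex list outright */
    }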
This regression was caused by 66cd953514a18477eb49732e40d5c2ab5f1b12c5
(Do not allocate metadata via non-auto arenas, nor tcaches.).
This ensures that all internally allocated metadata comes from the
first opt_narenas arenas, i.e. the automatically multiplexed arenas.
This resolves #358.
Prior to 767d85061a6fb88ec977bbcd9b429a43aff391e6 (Refactor arenas array
(fixes deadlock).), it was possible under some circumstances for
arena_get() to trigger recreation of the arenas cache during tsd
cleanup, and the arenas cache would then be leaked. In principle a
similar issue could still occur as a side effect of decay-based purging,
which calls arena_tdata_get(). Fix arenas_tdata_cleanup() by setting
tsd->arenas_tdata_bypass to true, so that arena_tdata_get() will
gracefully fail (an expected behavior) rather than recreating
tsd->arena_tdata.
Reported by Christopher Ferris <cferris@google.com>.
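
A sketch of the fixed cleanup, using the names from the message (the tsd
accessors and a0dalloc() follow jemalloc's internal style):

    void
    arenas_tdata_cleanup(tsd_t *tsd)
    {
        arena_tdata_t *arenas_tdata;

        /* From here on, arena_tdata_get() fails gracefully instead
         * of recreating tsd->arenas_tdata. */
        tsd_arenas_tdata_bypass_set(tsd, true);

        arenas_tdata = tsd_arenas_tdata_get(tsd);
        if (arenas_tdata != NULL) {
            tsd_arenas_tdata_set(tsd, NULL);
            a0dalloc(arenas_tdata);
        }
    }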
Add HUGE_MAXCLASS overflow checks that are specific to heap profiling
code paths. This fixes test failures that were introduced by
0c516a00c4cb28cff55ce0995f756b5aae074c9e (Make *allocx() size class
overflow behavior defined.).
Limit supported size and alignment to HUGE_MAXCLASS, which in turn is
now limited to be less than PTRDIFF_MAX.
This resolves #278 and #295.
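
A sketch of the resulting guard on the request path (the wrapper and
helper are hypothetical; HUGE_MAXCLASS is the constant named above):

    #include <stddef.h>

    void *
    alloc_guarded(size_t size, size_t alignment)
    {
        /* HUGE_MAXCLASS < PTRDIFF_MAX, so pointer differences over
         * any live allocation cannot overflow ptrdiff_t. */
        if (size > HUGE_MAXCLASS || alignment > HUGE_MAXCLASS)
            return NULL; /* request cannot be satisfied */
        return alloc_impl(size, alignment); /* hypothetical */
    }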
Refactor the arenas array, which contains pointers to all extant arenas,
such that it starts out as a sparse array of maximum size, and use
double-checked atomics-based reads as the basis for fast and simple
arena_get(). Additionally, reduce arenas_lock's role such that it only
protects against arena initialization races. These changes remove the
possibility for arena lookups to trigger locking, which resolves at
least one known (fork-related) deadlock.
This resolves #315.
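
A sketch of the double-checked read (the array bound and atomic helper
follow jemalloc's internal style; details are illustrative):

    /* Sparse array of maximum size; slots start out NULL. */
    static arena_t *arenas[MALLOCX_ARENA_MAX + 1];

    arena_t *
    arena_get(unsigned ind, bool init_if_missing)
    {
        /* Fast path: one atomic load, no locking. */
        arena_t *ret = atomic_read_p((void **)&arenas[ind]);

        if (ret != NULL || !init_if_missing)
            return ret;
        /* Slow path: arenas_lock only covers initialization races. */
        malloc_mutex_lock(&arenas_lock);
        ret = atomic_read_p((void **)&arenas[ind]); /* double-check */
        if (ret == NULL)
            ret = arena_init_locked(ind);
        malloc_mutex_unlock(&arenas_lock);
        return ret;
    }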
Use a single uint64_t in nstime_t to store nanoseconds rather than using
struct timespec. This reduces fragility around conversions between long
and uint64_t, especially missing casts that only cause problems on
32-bit platforms.
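
A sketch of the representation and accessors (matching the description
above; BILLION is a local convenience constant):

    #include <stdint.h>

    #define BILLION ((uint64_t)1000000000)

    /* One field, one width: no long <-> uint64_t conversions to get
     * wrong on 32-bit platforms. */
    typedef struct {
        uint64_t ns;
    } nstime_t;

    uint64_t
    nstime_sec(const nstime_t *time)
    {
        return time->ns / BILLION;
    }

    uint64_t
    nstime_nsec(const nstime_t *time)
    {
        return time->ns % BILLION;
    }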
This is an alternative to the existing ratio-based unused dirty page
purging, and is intended to eventually become the sole purging
mechanism.
Add mallctls:
- opt.purge
- opt.decay_time
- arena.<i>.decay
- arena.<i>.decay_time
- arenas.decay_time
- stats.arenas.<i>.decay_time
This resolves #325.
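
Decay mode itself is selected via opt.purge (e.g. MALLOC_CONF=purge:decay,
since opt.* is read-only); the decay time can then be tuned at run time
through the mallctls listed above. A usage sketch:

    #include <sys/types.h>
    #include <jemalloc/jemalloc.h>

    /* Set the default decay time (seconds) used to initialize
     * arena.<i>.decay_time for subsequently created arenas; -1
     * disables decay-based purging. */
    int
    set_default_decay_time(ssize_t decay_time)
    {
        return mallctl("arenas.decay_time", NULL, NULL, &decay_time,
            sizeof(decay_time));
    }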
Refactor arenas_cache tsd into arenas_tdata, which is a structure of
type arena_tdata_t.
Add --with-malloc-conf, which makes it possible to embed a default
options string during configuration.
When using LinuxThreads, malloc bootstrapping deadlocks, since
malloc_tsd_boot0() ends up calling pthread_setspecific(), which causes
recursive allocation. Fix it by moving the malloc_tsd_boot0() call to
malloc_init_hard_recursible().
The deadlock was introduced by 8bb3198f72fc7587dc93527f9f19fb5be52fa553
(Refactor/fix arenas manipulation.), when tsd_boot() was split and the
top half, tsd_boot0(), got an extra tsd_wrapper_set() call.
- Combine multiple runtime branches into a single malloc_slow check
(sketched below).
- Avoid calling arena_choose / size2index / index2size on the fast path.
- A few micro-optimizations.
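
A sketch of the malloc_slow consolidation (the exact option set folded
into the flag and the helper names are illustrative):

    /* Computed once at init: true iff any rarely-enabled feature
     * forces the slow path. */
    static bool malloc_slow;

    static void
    malloc_slow_flag_init(void)
    {
        malloc_slow = opt_junk || opt_zero || opt_utrace ||
            opt_xmalloc || opt_prof;
    }

    void *
    je_malloc(size_t size)
    {
        if (likely(!malloc_slow))
            return malloc_fast(size);  /* no per-option branches */
        return malloc_slow_path(size); /* junk/zero/utrace/... here */
    }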
Simplify imallocx_prof_sample() to always operate on usize rather than
sometimes using size. This avoids redundant usize computations and
more closely fits the style adopted by i[rx]allocx_prof_sample() to fix
sampling bugs.
Fix irallocx_prof_sample() to always allocate large regions, even when
alignment is non-zero.
Fix ixallocx_prof_sample() to never modify nor create sampled small
allocations. xallocx() is in general incapable of moving small
allocations, so this fix removes buggy code without loss of generality.
Add arena_prof_tctx_reset() and use it instead of arena_prof_tctx_set()
when resetting the tctx pointer during reallocation, which happens
whenever an originally sampled reallocated object is not sampled during
reallocation.
This regression was introduced by
594c759f37c301d0245dc2accf4d4aaf9d202819 (Optimize
arena_prof_tctx_set().).
Fix ixallocx_prof() to pass usize_max and zero to ixallocx_prof_sample()
in the correct order.
Make one call to prof_active_get_unlocked() per allocation event, and
use the result throughout the relevant functions that handle an
allocation event. Also add a missing check in prof_realloc(). These
fixes protect allocation events against concurrent prof_active changes.
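
A sketch of the single-read pattern for the reallocation event (helper
names follow the internal style described; do_realloc() is hypothetical):

    static void *
    realloc_event(tsd_t *tsd, void *old_ptr, size_t old_usize,
        size_t usize)
    {
        /* Read prof_active once; the same value governs both the
         * prep and the commit, so a concurrent toggle cannot split
         * the event. */
        bool prof_active = prof_active_get_unlocked();
        prof_tctx_t *old_tctx = prof_tctx_get(old_ptr);
        prof_tctx_t *tctx = prof_alloc_prep(tsd, usize, prof_active,
            true);
        void *p = do_realloc(old_ptr, usize); /* hypothetical */

        prof_realloc(tsd, p, usize, tctx, prof_active, true, old_ptr,
            old_usize, old_tctx);
        return p;
    }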
Fix ixallocx_prof() to clamp the extra parameter if size+extra would
overflow HUGE_MAXCLASS.
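
A sketch of the clamp, written in subtraction form so the check itself
cannot overflow (placement inside ixallocx_prof() elided):

    #include <stddef.h>

    /* Clamp extra so that size + extra <= HUGE_MAXCLASS. */
    static size_t
    clamp_extra(size_t size, size_t extra)
    {
        if (size >= HUGE_MAXCLASS)
            return 0; /* no room to grow at all */
        if (extra > HUGE_MAXCLASS - size)
            return HUGE_MAXCLASS - size;
        return extra;
    }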
This resolves #269.
Fix arenas_cache_cleanup() and arena_get_hard() to handle
allocation/deallocation within the application's thread-specific data
cleanup functions even after arenas_cache is torn down.
This is a more general fix that complements
45e9f66c280e1ba8bebf7bed387a43bc9e45536d (Fix arenas_cache_cleanup().).
Fix arenas_cache_cleanup() to handle allocation/deallocation within the
application's thread-specific data cleanup functions even after
arenas_cache is torn down.
- Decorate public functions with __declspec(allocator) and
__declspec(restrict), just like MSVC 1900.
- Support JEMALLOC_HAS_RESTRICT by defining the restrict keyword.
- Move __declspec(nothrow) between 'void' and '*' so it compiles once
more (see the sketch below).
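
A sketch of the resulting prototype shape under MSVC (function name
illustrative):

    #include <stddef.h>

    /* allocator/restrict precede the return type; nothrow sits
     * between 'void' and '*' so MSVC accepts it. */
    __declspec(allocator) __declspec(restrict)
    void __declspec(nothrow) *je_malloc(size_t size);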
Only use __declspec(nothrow) in C++ mode.
This resolves #244.
As per gcc documentation:
The alloc_size attribute is used to tell the compiler that the function
return value points to memory (...)
This resolves #245.
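
For instance, under gcc the attribute names which parameter(s) carry the
allocation size (prototypes illustrative):

    #include <stddef.h>

    void *je_malloc(size_t size)
        __attribute__((alloc_size(1)));
    void *je_calloc(size_t num, size_t size)
        __attribute__((alloc_size(1, 2)));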
Add various function attributes to the exported functions to give the
compiler more information to work with during optimization, and also
specify throw() when compiling with C++ on Linux, in order to adequately
match what __THROW does in glibc.
This resolves #237.
- Set opt_lg_chunk based on run-time OS setting
- Verify LG_PAGE is compatible with run-time OS setting
- When targeting Windows Vista or newer, use SRWLOCK instead of
CRITICAL_SECTION (see the sketch below)
- When targeting Windows Vista or newer, statically initialize init_lock
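
A sketch of the Vista-or-newer lock selection (the wrapper names are
illustrative):

    #include <windows.h>

    #if _WIN32_WINNT >= 0x0600 /* Vista or newer */
    typedef SRWLOCK lock_t;
    /* SRWLOCKs can be statically initialized, unlike
     * CRITICAL_SECTIONs. */
    #  define LOCK_INITIALIZER SRWLOCK_INIT
    #  define lock_acquire(l)  AcquireSRWLockExclusive(l)
    #  define lock_release(l)  ReleaseSRWLockExclusive(l)
    #else
    typedef CRITICAL_SECTION lock_t; /* requires runtime init */
    #  define lock_acquire(l)  EnterCriticalSection(l)
    #  define lock_release(l)  LeaveCriticalSection(l)
    #endif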
Fix size class overflow handling for malloc(), posix_memalign(),
memalign(), calloc(), and realloc() when profiling is enabled.
Remove an assertion that erroneously caused arena_sdalloc() to fail when
profiling was enabled.
This resolves #232.
Extract szad size quantization into {extent,run}_quantize(), and
quantize szad run sizes to the union of valid small region run sizes and
large run sizes.
Refactor iteration in arena_run_first_fit() to use
run_quantize{,_first,_next}(), and add support for padded large runs.
For large allocations that have no specified alignment constraints,
compute a pseudo-random offset from the beginning of the first backing
page that is a multiple of the cache line size. Under typical
configurations with 4-KiB pages and 64-byte cache lines this results in
a uniform distribution among 64 page boundary offsets.
Add the --disable-cache-oblivious option, primarily intended for
performance testing.
This resolves #13.
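
A sketch of the offset computation for the typical 4-KiB-page,
64-byte-line configuration (the PRNG shown is an illustrative LCG, not
jemalloc's):

    #include <stddef.h>
    #include <stdint.h>

    #define PAGE      4096
    #define CACHELINE 64

    /* Uniform over the 64 cache-line-aligned offsets in a page. */
    static size_t
    large_pad_offset(uint64_t *state)
    {
        *state = *state * 6364136223846793005ULL +
            1442695040888963407ULL;
        uint64_t r = *state >> (64 - 6); /* top 6 bits: 0..63 */
        return (size_t)(r * CACHELINE);  /* multiple of 64, < PAGE */
    }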