path: root/src
Commit message | Author | Age | Files | Lines
* malloc_stats_print() fixes/cleanups. | Jason Evans | 2016-11-01 | 1 | -18/+3
  Fix and clean up various malloc_stats_print() issues caused by 0ba5b9b6189e16a983d8922d8c5cb6ab421906e8 (Add "J" (JSON) support to malloc_stats_print().).
* Add "J" (JSON) support to malloc_stats_print(). | Jason Evans | 2016-11-01 | 1 | -313/+716
  This resolves #474.
* Fix extent_rtree acquire() to release element on error. | Jason Evans | 2016-10-31 | 1 | -1/+3
  This resolves #480.
* Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW. | Jason Evans | 2016-10-30 | 1 | -2/+2
  The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes.

  This resolves #479.
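The clock choice described above can be sketched as follows. This is a minimal illustration, not jemalloc's actual nstime code: the `sketch_` names are hypothetical, and the fallback to plain CLOCK_MONOTONIC when the coarse clock is not defined is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* expose clock_gettime() under strict ISO C modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Prefer the cheap coarse clock when the platform headers define it. */
#if defined(CLOCK_MONOTONIC_COARSE)
#define SKETCH_CLOCK CLOCK_MONOTONIC_COARSE
#else
#define SKETCH_CLOCK CLOCK_MONOTONIC
#endif

/* Return the current monotonic time in nanoseconds, or 0 on failure. */
static uint64_t sketch_now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(SKETCH_CLOCK, &ts) != 0)
        return 0;
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Return the advertised resolution of the chosen clock in nanoseconds. */
static uint64_t sketch_clock_res_ns(void)
{
    struct timespec ts;

    if (clock_getres(SKETCH_CLOCK, &ts) != 0)
        return 0;
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}
```

Calling `sketch_clock_res_ns()` on a given machine shows the resolution the coarse clock actually offers there, which is the trade-off the commit message weighs against call cost.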
* Use syscall(2) rather than {open,read,close}(2) during boot. | Jason Evans | 2016-10-30 | 1 | -0/+19
  Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common.

  This resolves #443.
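A rough sketch of the idea, assuming Linux: going through syscall(2) bypasses any application-provided open/read/close wrappers. The helper name is hypothetical, and the SYS_openat fallback for architectures without SYS_open is an assumption of this sketch rather than something the commit message states.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration in <unistd.h> */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to len bytes of path into buf using raw system calls only.
 * Returns the number of bytes read, or -1 on failure.
 */
static ssize_t sketch_raw_read_file(const char *path, void *buf, size_t len)
{
    long fd;
    long nread;

    assert(path != NULL && buf != NULL);
#if defined(SYS_open)
    fd = syscall(SYS_open, path, O_RDONLY);
#else
    fd = syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
#endif
    if (fd < 0)
        return -1;
    nread = syscall(SYS_read, (int)fd, buf, len);
    syscall(SYS_close, (int)fd); /* always release the descriptor */
    return nread < 0 ? -1 : (ssize_t)nread;
}
```

For instance, `sketch_raw_read_file("/proc/sys/vm/overcommit_memory", buf, sizeof(buf))` would read the overcommit setting without touching libc's open/read/close entry points.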
* Do not mark malloc_conf as weak on Windows. | Jason Evans | 2016-10-29 | 1 | -1/+1
  This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).
* Do not mark malloc_conf as weak for unit tests. | Jason Evans | 2016-10-29 | 1 | -1/+5
  This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently uninitialized non-NULL malloc_conf) when using cygwin with gcc.
* Support static linking of jemalloc with glibc | Dave Watson | 2016-10-28 | 1 | -0/+31
  glibc defines its malloc implementation with several weak and strong symbols:

    strong_alias (__libc_calloc, __calloc)
    weak_alias (__libc_calloc, calloc)
    strong_alias (__libc_free, __cfree)
    weak_alias (__libc_free, cfree)
    strong_alias (__libc_free, __free)
    strong_alias (__libc_free, free)
    strong_alias (__libc_malloc, __malloc)
    strong_alias (__libc_malloc, malloc)

  The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc APIs allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions.

  Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPUs. glibc allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with glibc, which seems to work.

  This resolves #442.
* Fix over-sized allocation of rtree leaf nodes. | Jason Evans | 2016-10-28 | 1 | -1/+1
  Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).
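The arithmetic behind this fix can be illustrated with a small sketch. The two-level layout below (16 key bits at the root, 4 at the leaf) is a hypothetical configuration chosen to match the numbers in the message; the real rtree metadata and function names differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-level metadata: key bits consumed at each tree level. */
static const unsigned sketch_level_bits[] = {16, 4};
#define SKETCH_HEIGHT (sizeof(sketch_level_bits) / sizeof(sketch_level_bits[0]))

/* Number of elements in a node that lives at the given level. */
static size_t sketch_node_nelms(unsigned level)
{
    assert(level < SKETCH_HEIGHT);
    return (size_t)1 << sketch_level_bits[level];
}

/*
 * Size a child node from the child's own level metadata. Using the
 * parent's level here instead would yield 2^16 elements for the leaf.
 */
static size_t sketch_child_nelms(unsigned parent_level)
{
    return sketch_node_nelms(parent_level + 1);
}
```

With this layout, `sketch_child_nelms(0)` gives the 2^4 leaf size, whereas indexing the metadata with the parent's level gives the over-sized 2^16.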
* Uniformly cast mallctl[bymib]() oldp/newp arguments to (void *). | Jason Evans | 2016-10-28 | 3 | -21/+29
  This avoids warnings in some cases, and is otherwise generally good hygiene.
* Do not (recursively) allocate within tsd_fetch(). | Jason Evans | 2016-10-21 | 5 | -75/+77
  Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion.

  This resolves #458.
* Make dss operations lockless. | Jason Evans | 2016-10-13 | 6 | -127/+121
  Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for extent_in_dss() and the newly added extent_dss_mergeable(), which can be called multiple times during extent deallocations.

  This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection.

  This resolves #425.
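To illustrate the general technique of replacing a mutex with atomic operations, here is a minimal sketch of a bump allocator over a fixed region that advances its high-water mark with compare-and-swap, plus a lock-free membership check. It stands in for the idea only: the static region, the `sketch_` names, and the use of C11 `<stdatomic.h>` are assumptions, and the real dss code works on the sbrk() break.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REGION_SIZE 4096

static _Alignas(16) unsigned char sketch_region[SKETCH_REGION_SIZE];
static _Atomic size_t sketch_next = 0; /* offset of the first unused byte */

/* Carve size bytes off the region without taking a lock; NULL if exhausted. */
static void *sketch_bump_alloc(size_t size)
{
    size_t old = atomic_load(&sketch_next);
    size_t new_off;

    assert(size != 0);
    do {
        if (size > SKETCH_REGION_SIZE - old)
            return NULL;
        new_off = old + size;
        /* On failure, old is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(&sketch_next, &old, new_off));
    return &sketch_region[old];
}

/* Lock-free check whether ptr lies within the allocated part of the region. */
static int sketch_in_region(const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t base = (uintptr_t)sketch_region;

    return p >= base && p < base + atomic_load(&sketch_next);
}
```

The membership check needs only a single atomic load, which is the kind of operation that benefits most when it is called repeatedly on deallocation paths.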
* Add/use adaptive spinning. | Jason Evans | 2016-10-13 | 3 | -2/+10
  Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning.

  Adaptively spin during busy waits in bootstrapping and rtree node initialization.
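One plausible shape for such an abstraction is exponential backoff: each call busy-waits twice as long as the previous one. The sketch below uses hypothetical `sketch_` names and an assumed cap of 2^10 iterations per call; it is not taken from the jemalloc source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SPIN_MAX_SHIFT 10 /* assumption: cap each wait at 2^10 iterations */

typedef struct {
    unsigned iteration; /* number of waits so far, capped at the max shift */
} sketch_spin_t;

static void sketch_spin_init(sketch_spin_t *spin)
{
    assert(spin != NULL);
    spin->iteration = 0;
}

/* Busy-wait for 2^iteration loop turns, then lengthen the next wait. */
static void sketch_spin_adaptive(sketch_spin_t *spin)
{
    volatile uint64_t i; /* volatile keeps the empty loop from being elided */
    uint64_t limit = UINT64_C(1) << spin->iteration;

    for (i = 0; i < limit; i++) {
        /* spin */
    }
    if (spin->iteration < SKETCH_SPIN_MAX_SHIFT)
        spin->iteration++;
}
```

A caller would initialize one `sketch_spin_t` before a busy-wait loop and call `sketch_spin_adaptive()` on each failed check of the awaited condition.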
* Remove all vestiges of chunks. | Jason Evans | 2016-10-12 | 7 | -91/+8
  Remove mallctls:
  - opt.lg_chunk
  - stats.cactive

  This resolves #464.
* Remove ratio-based purging. | Jason Evans | 2016-10-12 | 4 | -279/+29
  Make decay-based purging the default (and only) mode.

  Remove associated mallctls:
  - opt.purge
  - opt.lg_dirty_mult
  - arena.<i>.lg_dirty_mult
  - arenas.lg_dirty_mult
  - stats.arenas.<i>.lg_dirty_mult

  This resolves #385.
* Fix and simplify decay-based purging. | Jason Evans | 2016-10-11 | 1 | -51/+58
  Simplify decay-based purging attempts to only be triggered when the epoch is advanced, rather than every time purgeable memory increases. In a correctly functioning system (not previously the case; see below), this only causes a behavior difference if during subsequent purge attempts the least recently used (LRU) purgeable memory extent is initially too large to be purged, but that memory is reused between attempts and one or more of the next LRU purgeable memory extents are small enough to be purged. In practice this is an arbitrary behavior change that is within the set of acceptable behaviors.

  As for the purging fix, assure that arena->decay.ndirty is recorded *after* the epoch advance and associated purging occurs. Prior to this fix, it was possible for purging during epoch advance to cause a substantially underrepresentative (arena->ndirty - arena->decay.ndirty), i.e. the number of dirty pages attributed to the current epoch was too low, and a series of unintended purges could result. This fix is also relevant in the context of the simplification described above, but the bug's impact would be limited to over-purging at epoch advances.
* Do not advance decay epoch when time goes backwards. | Jason Evans | 2016-10-11 | 2 | -4/+39
  Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.
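The epoch handling described above might look roughly like this. The struct, field names, and the whole-interval advance policy are assumptions made for illustration; only the "rewind instead of advance" rule comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t epoch_ns;    /* start of the current epoch */
    uint64_t interval_ns; /* length of one epoch */
} sketch_decay_t;

/*
 * Bring the epoch up to date with now_ns. Returns the number of whole
 * epochs advanced, which is 0 when the clock has gone backward.
 */
static uint64_t sketch_epoch_update(sketch_decay_t *decay, uint64_t now_ns)
{
    uint64_t nadvance;

    assert(decay->interval_ns > 0);
    if (now_ns < decay->epoch_ns) {
        /* Time went backward: rewind the epoch rather than advancing it. */
        decay->epoch_ns = now_ns;
        return 0;
    }
    nadvance = (now_ns - decay->epoch_ns) / decay->interval_ns;
    decay->epoch_ns += nadvance * decay->interval_ns;
    return nadvance;
}
```

Rewinding keeps the invariant that the epoch never lies in the future relative to the latest time reading, so later elapsed-time computations cannot underflow.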
* Refactor arena->decay_* into arena->decay.* (arena_decay_t). | Jason Evans | 2016-10-11 | 1 | -38/+38
* Refine nstime_update(). | Jason Evans | 2016-10-10 | 1 | -27/+49
  Add missing #include <time.h>. The critical time facilities appear to have been transitively included via unistd.h and sys/time.h, but in principle this omission was capable of having caused clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of gettimeofday(), which in turn could cause spurious non-monotonic time updates.

  Refactor nstime_get() out of nstime_update() and add configure tests for all variants.

  Add CLOCK_MONOTONIC_RAW support (Linux-specific) and mach_absolute_time() support (OS X-specific).

  Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a fragile Linux-specific workaround, which we're unlikely to use at all now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we have no choice besides non-monotonic clocks, gettimeofday() is only incrementally worse.
* Reduce "thread.arena" mallctl contention. | Jason Evans | 2016-10-04 | 1 | -3/+1
  This resolves #460.
* Remove a size class assertion from extent_size_quantize_floor(). | Jason Evans | 2016-10-03 | 1 | -1/+0
  Extent coalescence can result in legitimate calls to extent_size_quantize_floor() with size larger than LARGE_MAXCLASS.
* Fix size class overflow bugs. | Jason Evans | 2016-10-03 | 2 | -5/+9
  Avoid calling s2u() on raw extent sizes in extent_recycle(). Clamp psz2ind() (implemented as psz2ind_clamp()) when inserting/removing into/from size-segregated extent heaps.
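The clamping idea can be sketched with a simplified size-class scheme. Everything here is hypothetical: the power-of-two page classes, the count of 8 heaps, and the `sketch_` names are stand-ins, since jemalloc's real size classes are more finely spaced.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LG_PAGE 12
#define SKETCH_PAGE_MASK (((size_t)1 << SKETCH_LG_PAGE) - 1)
#define SKETCH_NPSIZES 8 /* hypothetical number of size-segregated heaps */

/* Index of the smallest power-of-two page class that holds psz bytes. */
static unsigned sketch_psz2ind(size_t psz)
{
    size_t npages;
    unsigned ind = 0;

    assert(psz > 0);
    /* Round up to whole pages without overflowing on very large sizes. */
    npages = (psz >> SKETCH_LG_PAGE) + ((psz & SKETCH_PAGE_MASK) != 0);
    while (((size_t)1 << ind) < npages)
        ind++;
    return ind; /* may be >= SKETCH_NPSIZES for oversized extents */
}

/* Same as above, but always a valid heap index. */
static unsigned sketch_psz2ind_clamp(size_t psz)
{
    unsigned ind = sketch_psz2ind(psz);

    return ind < SKETCH_NPSIZES ? ind : SKETCH_NPSIZES - 1;
}
```

An extent larger than the biggest class, for example one produced by coalescing, then lands in the last heap instead of indexing past the end of the heap array.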
* Close file descriptor after reading "/proc/sys/vm/overcommit_memory". | Jason Evans | 2016-09-26 | 1 | -0/+1
  This bug was introduced by c2f970c32b527660a33fa513a76d913c812dcf7c (Modify pages_map() to support mapping uncommitted virtual memory.).

  This resolves #399.
* Formatting fixes. | Jason Evans | 2016-09-26 | 1 | -9/+12
* Add various mutex ownership assertions. | Jason Evans | 2016-09-23 | 2 | -6/+12
* Fix large_dalloc_impl() to always lock large_mtx. | Jason Evans | 2016-09-23 | 1 | -4/+7
* Add new_addr validation in extent_recycle(). | Jason Evans | 2016-09-23 | 1 | -6/+28
* Protect extents_dirty access with extents_mtx. | Jason Evans | 2016-09-22 | 2 | -47/+98
  This fixes race conditions during purging.
* Fix extent_recycle() to exclude other arenas' extents. | Jason Evans | 2016-09-22 | 1 | -1/+2
  When attempting to recycle an extent at a specified address, check that the extent belongs to the correct arena.
* Fix arena_bind(). | Qi Wang | 2016-09-22 | 1 | -6/+7
  When tsd is not in nominal state (e.g. during thread termination), we should not increment nthreads.
* Change how the default zone is found | Mike Hommey | 2016-07-08 | 1 | -2/+29
  On OSX 10.12, malloc_default_zone returns a special zone that is not present in the list of registered zones. That zone uses a "lite zone" if one is present (apparently enabled when malloc stack logging is enabled), or the first registered zone otherwise.

  In practice this means unless malloc stack logging is enabled, the first registered zone is the default. So get the list of zones to get the first one, instead of relying on malloc_default_zone.
* Avoid getting the same default zone twice in a row. | Mike Hommey | 2016-07-08 | 1 | -2/+3
  847ff22 added a call to malloc_default_zone() before the main loop in register_zone, so malloc_default_zone() is effectively called twice in a row with no expected difference in the returned result.

  It is also called once at the beginning of the loop block and a second time at the end. Instead, call it only once per iteration.
* Fix potential VM map fragmentation regression. | Jason Evans | 2016-06-07 | 1 | -1/+1
  Revert 245ae6036c09cc11a72fab4335495d95cddd5beb (Support --with-lg-page values larger than actual page size.), because it could cause VM map fragmentation if the kernel grows mmap()ed memory downward.

  This resolves #391.
* Fix mixed decl in nstime.c | Elliot Ronaghan | 2016-06-07 | 1 | -3/+5
  Fix mixed declarations and code in the gettimeofday() branch of nstime_update().
* Propagate tsdn to default extent hooks. | Jason Evans | 2016-06-07 | 1 | -25/+78
  This avoids bootstrapping issues for configurations that require allocation during tsd initialization.

  This resolves #390.
* Use extent_commit_wrapper() rather than directly calling commit hook. | Jason Evans | 2016-06-06 | 1 | -3/+2
  As a side effect this causes the extent's 'committed' flag to be updated.
* Set 'committed' in extent_[de]commit_wrapper(). | Jason Evans | 2016-06-06 | 1 | -8/+13
* Fix regressions related to extent splitting failures. | Jason Evans | 2016-06-06 | 1 | -1/+3
  Fix a fundamental extent_split_wrapper() bug in an error path.

  Fix extent_recycle() to deregister unsplittable extents before leaking them.

  Relax xallocx() test assertions so that unsplittable extents don't cause test failures.
* Fix an extent [de]allocation/[de]registration race. | Jason Evans | 2016-06-06 | 1 | -4/+17
  Deregister extents before deallocation, so that subsequent reallocation/registration doesn't race with deregistration.
* Fix extent_alloc_dss() regressions. | Jason Evans | 2016-06-06 | 2 | -22/+31
  Page-align the gap, if any, and add/use extent_dalloc_gap(), which registers the gap extent before deallocation.
* Fix gdump triggering regression. | Jason Evans | 2016-06-06 | 1 | -13/+11
  Now that extents are not multiples of chunksize, it's necessary to track pages rather than chunks.
* Remove a stray memset(), and fix a junk filling test regression. | Jason Evans | 2016-06-06 | 1 | -2/+11
* Silence a bogus compiler warning. | Jason Evans | 2016-06-06 | 1 | -1/+3
* Fix locking order reversal in arena_reset(). | Jason Evans | 2016-06-06 | 1 | -5/+13
* Modify extent hook functions to take an (extent_t *) argument. | Jason Evans | 2016-06-06 | 5 | -204/+173
  This facilitates the application accessing its own extent allocator metadata during hook invocations.

  This resolves #259.
* Add rtree lookup path caching. | Jason Evans | 2016-06-06 | 2 | -24/+44
  rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).
* Make tsd cleanup functions optional, remove noop cleanup functions. | Jason Evans | 2016-06-06 | 5 | -50/+6
* Remove some unnecessary locking. | Jason Evans | 2016-06-06 | 1 | -20/+2
* Fix rallocx() sampling code to not eagerly commit sampler update. | Jason Evans | 2016-06-06 | 1 | -3/+3
  rallocx() for an alignment-constrained request may end up with a smaller-than-worst-case size if in-place reallocation succeeds due to serendipitous alignment. In such cases, sampling may not happen.
* Miscellaneous s/chunk/extent/ updates. | Jason Evans | 2016-06-06 | 1 | -1/+1