jemalloc.git - jemalloc is a general purpose malloc(3) implementation that emphasizes fragmentation avoidance and scalable concurrency support.

	Commit message	Author	Age	Files	Lines
*	Implementing opt.background_thread.	Qi Wang	2017-05-23	1	-8/+24
\| \| \| \| \| \| \| \| \| \| \|	Added opt.background_thread to enable background threads, which currently handle purging. When enabled, decay ticks will not trigger purging (which will be left to the background threads). We limit the max number of threads to NCPUs. When percpu arena is enabled, set CPU affinity for the background threads as well. The sleep interval of background threads is dynamic and determined by computing the number of pages to purge in the future (based on backlog).
*	Automatically generate private symbol name mangling macros.	Jason Evans	2017-05-12	1	-21/+89
\| \| \| \| \| \| \| \|	Rather than using a manually maintained list of internal symbols to drive name mangling, add a compilation phase to automatically extract the list of internal symbols. This resolves #677.
*	Remove unused private_unnamespace infrastructure.	Jason Evans	2017-05-12	1	-9/+0
\|
*	Add --with-version=VERSION .	Jason Evans	2017-05-03	1	-3/+7
\| \| \| \| \| \| \|	This simplifies configuration when embedding a jemalloc release into another project's git repository. This resolves #811.
*	Add extent_destroy_t and use it during arena destruction.	Jason Evans	2017-04-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Add the extent_destroy_t extent destruction hook to extent_hooks_t, and use it during arena destruction. This hook explicitly communicates to the callee that the extent must be destroyed or tracked for later reuse, lest it be permanently leaked. Prior to this change, retained extents could unintentionally be leaked if extent retention was enabled. This resolves #560.
*	Refactor !opt.munmap to opt.retain.	Jason Evans	2017-04-29	1	-6/+6
\|
*	Replace --disable-munmap with opt.munmap.	Jason Evans	2017-04-25	1	-14/+2
\| \| \| \| \| \| \| \| \|	Control use of munmap(2) via a run-time option rather than a compile-time option (with the same per platform default). The old behavior of --disable-munmap can be achieved with --with-malloc-conf=munmap:false. This partially resolves #580.
*	Remove --enable-code-coverage.	Jason Evans	2017-04-24	1	-26/+0
\| \| \| \| \| \| \|	This option hasn't been particularly useful since the original pre-3.0.0 push to broaden test coverage. This partially resolves #580.
*	Remove --disable-cc-silence.	Jason Evans	2017-04-24	1	-23/+4
\| \| \| \| \| \| \|	The explicit compiler warning suppression controlled by this option is universally desirable, so remove the ability to disable suppression. This partially resolves #580.
*	Remove --with-lg-tiny-min.	Jason Evans	2017-04-24	1	-9/+1
\| \| \| \| \| \|	This option isn't useful in practice. This partially resolves #580.
*	Remove --with-lg-size-class-group.	Jason Evans	2017-04-24	1	-8/+1
\| \| \| \| \| \| \| \|	Four size classes per size doubling has proven to be a universally good choice for the entire 4.x release series, so there's little point to preserving this configurability. This partially resolves #580.
*	Add missing 'test' to LG_SIZEOF_PTR tests.	Jason Evans	2017-04-24	1	-3/+3
\| \| \| \| \| \|	This fixes a bug/regression introduced by a01f99307719dcc8ca27cc70f0f0011beff914fa (Only disable munmap(2) by default on 64-bit Linux.).
*	Enable -Wundef, when supported.	David Goldblatt	2017-04-22	1	-1/+4
\| \| \| \| \| \|	This can catch bugs in which one header defines a numeric constant, and another uses it without including the defining header. Undefined preprocessor symbols expand to '0', so that this will compile fine, silently doing the math wrong.
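The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not code from the jemalloc tree: `HYPOTHETICAL_LG_QUANTUM` and `quantum_is_large()` are made-up names, and the macro is deliberately left undefined to stand in for a constant whose defining header was never included.

```c
/*
 * HYPOTHETICAL_LG_QUANTUM is an illustrative name. It stands in for a
 * numeric constant that another header was supposed to define, and it is
 * intentionally left undefined in this translation unit.
 */
#if HYPOTHETICAL_LG_QUANTUM > 3
#define QUANTUM_IS_LARGE 1
#else
/* An undefined identifier in #if evaluates as 0, so this branch is taken. */
#define QUANTUM_IS_LARGE 0
#endif

/* Returns the value the preprocessor silently settled on. */
int quantum_is_large(void) {
    return QUANTUM_IS_LARGE;
}
```

Without -Wundef this compiles silently; with -Wundef, GCC and Clang warn at the `#if` line that the identifier is not defined.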
*	Remove --enable-ivsalloc.	Jason Evans	2017-04-21	1	-18/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Continue to use ivsalloc() when --enable-debug is specified (and add assertions to guard against 0 size), but stop providing a documented explicit semantics-changing band-aid to dodge undefined behavior in sallocx() and malloc_usable_size(). ivsalloc() remains compiled in, unlike when #211 restored --enable-ivsalloc, and if JEMALLOC_FORCE_IVSALLOC is defined during compilation, sallocx() and malloc_usable_size() will still use ivsalloc(). This partially resolves #580.
*	Remove --disable-tls.	Jason Evans	2017-04-21	1	-23/+4
\| \| \| \| \| \| \|	This option is no longer useful, because TLS is correctly configured automatically on all supported platforms. This partially resolves #580.
*	Remove --disable-tcache.	Jason Evans	2017-04-21	1	-17/+0
\| \| \| \| \| \| \| \| \| \| \|	Simplify configuration by removing the --disable-tcache option, but replace the testing for that configuration with --with-malloc-conf=tcache:false. Fix the thread.arena and thread.tcache.flush mallctls to work correctly if tcache is disabled. This partially resolves #580.
*	Only disable munmap(2) by default on 64-bit Linux.	Jason Evans	2017-04-17	1	-2/+6
\| \| \| \| \| \| \|	This reduces the likelihood of address space exhaustion on 32-bit systems. This resolves #350.
*	Fix LD_PRELOAD_VAR configuration logic for 64-bit AIX.	Jason Evans	2017-04-17	1	-1/+1
\|
*	Header refactoring: Split up jemalloc_internal.h	David Goldblatt	2017-04-11	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a biggy. jemalloc_internal.h has been doing multiple jobs for a while now: the source of system-wide definitions, the catch-all include file, and the module header file for jemalloc.c. This commit splits up this functionality. The system-wide definitions responsibility has moved to jemalloc_preamble.h. The catch-all include file is now jemalloc_internal_includes.h. The module headers for jemalloc.c are now in jemalloc_internal_[externs\|inlines\|types].h, just as they are for the other modules.
*	Port CPU_SPINWAIT to __powerpc64__	Rafael Folco	2017-04-10	1	-1/+2
\| \| \| \| \| \| \| \| \|	Hyper-threaded CPUs may need a special instruction inside spin loops in order to yield to another virtual CPU. The 'pause' instruction that is available for x86 is not supported on Power. Apparently the extended mnemonics like yield, mdoio, and mdoom are not actually implemented on POWER8, although mentioned in the ISA 2.07 document. The recommended magic bits are an 'or 31,31,31'.
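A per-architecture spin hint of the kind described above might be selected like this. The sketch assumes GCC/Clang inline assembly; the macro name `SPIN_HINT`, the function `spin_bounded()`, and the no-op fallback for other architectures are illustrative choices, not the macro from the jemalloc sources.

```c
/* Pick a spin-loop hint for the target architecture. */
#if defined(__i386__) || defined(__x86_64__)
#define SPIN_HINT() __asm__ volatile("pause")
#elif defined(__powerpc64__)
/* The form recommended in the commit message above. */
#define SPIN_HINT() __asm__ volatile("or 31,31,31")
#else
/* Assumed fallback: no hint at all on other architectures. */
#define SPIN_HINT() ((void)0)
#endif

/* Spin for a bounded number of iterations, hinting the CPU on each pass. */
unsigned spin_bounded(unsigned iterations) {
    unsigned done = 0;
    while (done < iterations) {
        SPIN_HINT();
        done++;
    }
    return done;
}
```

A real spin loop would re-check a lock word or flag between hints; the bounded counter here only keeps the example self-contained.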
*	Clamp LG_VADDR for 32-bit builds on x64.	Jason Evans	2017-03-23	1	-0/+3
\|
*	Fix pages_purge_forced() to discard pages on non-Linux systems.	Jason Evans	2017-03-14	1	-0/+2
\| \| \| \| \|	madvise(..., MADV_DONTNEED) only causes demand-zeroing on Linux, so fall back to overlaying a new mapping.
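The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the body of pages_purge_forced(): the names `purge_forced_sketch()` and `purge_demo()` are hypothetical, the range is assumed to be a private anonymous read/write mapping, and the Linux check is a plain `__linux__` test rather than a configure-time feature test.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MADV_DONTNEED and MAP_ANON on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Discard [addr, addr+len) so the next read sees zeros. Returns 0 on success. */
int purge_forced_sketch(void *addr, size_t len) {
    assert(addr != NULL && len > 0);
#if defined(__linux__)
    /* On Linux the pages are dropped and later touches are demand-zeroed. */
    return madvise(addr, len, MADV_DONTNEED);
#else
    /* Elsewhere, overlay the range with a fresh anonymous mapping. */
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANON | MAP_FIXED, -1, 0);
    return (p == MAP_FAILED) ? -1 : 0;
#endif
}

/* Dirty a mapping, purge it, and report 1 if it reads back as zero. */
int purge_demo(void) {
    size_t len = (size_t)1 << 16;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    p[0] = 0xa5;
    int ok = (purge_forced_sketch(p, len) == 0) && (p[0] == 0);
    munmap(p, len);
    return ok;
}
```

The overlay keeps the address range reserved, which is why it uses MAP_FIXED at the same address rather than unmapping and remapping.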
*	Implement per-CPU arena.	Qi Wang	2017-03-09	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new feature, opt.percpu_arena, determines thread-arena association dynamically based on CPU id. Three modes are supported: "percpu", "phycpu" and disabled. "percpu" uses the current core id (with help from sched_getcpu()) directly as the arena index, while "phycpu" will assign threads on the same physical CPU to the same arena. In other words, "percpu" means # of arenas == # of CPUs, while "phycpu" has # of arenas == 1/2 * (# of CPUs). Note that no runtime check on whether hyper threading is enabled is added yet. When enabled, threads will be migrated between arenas when a CPU change is detected. In the current design, to reduce overhead from reading CPU id, each arena tracks the thread that accessed it most recently. When a new thread comes in, we will read CPU id and update arena if necessary.
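The two mappings can be written as a small pure function. This is a sketch only: the enum and function names are hypothetical, and it assumes that CPU ids i and i + ncpus/2 are hyper-threads of the same physical core, which the commit message notes is not verified at run time. The caller is assumed to obtain the CPU id separately (for example from sched_getcpu()).

```c
#include <assert.h>

/* Illustrative names; not the identifiers used in the jemalloc sources. */
enum percpu_mode { MODE_PERCPU, MODE_PHYCPU };

/*
 * Map a CPU id to an arena index.
 * MODE_PERCPU: one arena per logical CPU.
 * MODE_PHYCPU: the upper half of the CPU ids folds onto the lower half,
 * giving ncpus / 2 arenas under the sibling-numbering assumption above.
 */
unsigned arena_index_for_cpu(enum percpu_mode mode, unsigned cpuid,
    unsigned ncpus) {
    assert(ncpus > 0 && cpuid < ncpus);
    if (mode == MODE_PERCPU || cpuid < ncpus / 2) {
        return cpuid;
    }
    return cpuid - ncpus / 2;
}
```

On a hypothetical 8-CPU machine, CPU 5 maps to arena 5 in "percpu" mode and to arena 1 in "phycpu" mode, sharing it with CPU 1.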
*	Introduce a backport of C11 atomics	David Goldblatt	2017-03-03	1	-22/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are: - GCC/Clang __atomic builtins - GCC/Clang __sync builtins - MSVC _Interlocked builtins - C11 atomics, from <stdatomic.h> The primary advantages are: - Close adherence to the standard API gives us a defined memory model. - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of). - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store). This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal. Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on FreeBSD with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.
*	fix typo sytem -> system	charsyam	2017-03-01	1	-1/+1
\|
*	Put -D_REENTRANT in CPPFLAGS rather than CFLAGS.	Jason Evans	2017-02-28	1	-1/+1
\| \| \| \| \| \|	This regression was introduced by 194d6f9de8ff92841b67f38a2a6a06818e3240dd (Restructure CFLAGS/CXXFLAGS configuration.).
*	Avoid -lgcc for heap profiling if unwind.h is missing.	Jason Evans	2017-02-21	1	-1/+3
\| \| \| \| \| \|	This removes an unneeded library dependency when falling back to intrinsics-based backtracing (or failing to enable heap profiling at all).
*	Determine rtree levels at compile time.	Jason Evans	2017-02-09	1	-0/+68
\| \| \| \| \| \| \|	Rather than dynamically building a table to aid per level computations, define a constant table at compile time. Omit both high and low insignificant bits. Use one to three tree levels, depending on the number of significant bits.
*	Remove extraneous parens around return arguments.	Jason Evans	2017-01-21	1	-2/+2
\| \| \| \|	This resolves #540.
*	Remove -Werror=declaration-after-statement.	Jason Evans	2017-01-19	1	-1/+0
\| \| \| \|	This partially resolves #536.
*	Don't rely on OSX SDK malloc/malloc.h for malloc_zone struct definitions	Mike Hommey	2017-01-18	1	-31/+0
\| \| \| \| \| \| \| \| \| \|	The SDK jemalloc is built against might not be the latest for various reasons, but the resulting binary ought to work on newer versions of OSX. In order to ensure this, we need the fullest definitions possible, so copy what we need from the latest version of malloc/malloc.h available on opensource.apple.com.
*	Add huge page configuration and pages_[no]huge().	Jason Evans	2016-12-27	1	-3/+41
\| \| \| \| \| \| \| \|	Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.
*	Restructure CFLAGS/CXXFLAGS configuration.	Jason Evans	2016-12-16	1	-122/+152
\| \| \| \| \| \| \| \| \| \| \| \| \|	Convert CFLAGS/CXXFLAGS to be concatenations: CFLAGS := CONFIGURE_CFLAGS SPECIFIED_CFLAGS EXTRA_CFLAGS; CXXFLAGS := CONFIGURE_CXXFLAGS SPECIFIED_CXXFLAGS EXTRA_CXXFLAGS. This ordering makes it possible to override the flags set by the configure script both during and after configuration, with CFLAGS/CXXFLAGS and EXTRA_CFLAGS/EXTRA_CXXFLAGS, respectively. This resolves #504.
*	jemalloc cpp new/delete bindings	Dave Watson	2016-12-13	1	-0/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds cpp bindings for jemalloc, along with necessary autoconf settings. This is mostly to add sized deallocation support, which can't be added from C directly. Sized deallocation is ~10% microbench improvement. * Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, seems like the easiest way to get c++14 detection. * Adds various other changes, like CXXFLAGS, to configure.ac. * Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic unittest. * Both new and delete are overridden, to ensure jemalloc is used for both. * TODO future enhancement of avoiding extra PLT thunks for new and delete - sdallocx and malloc are publicly exported jemalloc symbols, using an alias would link them directly. Unfortunately, was having trouble getting it to play nice with jemalloc's namespace support. Testing: Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized deallocation support, verified that the rest build correctly. Tested mac osx and Centos. Tested --with-jemalloc-prefix and --without-export. This resolves #202.
*	Add --disable-syscall.	Jason Evans	2016-12-04	1	-9/+22
\| \| \| \|	This resolves #517.
*	Implement a more reliable detection scheme for os_unfair_lock.	John Szakmeister	2016-11-23	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \|	The core issue here is the weak linking of the symbol, and in certain environments--for instance, using the latest Xcode (8.1) with the latest SDK (10.12)--os_unfair_lock may resolve even though you're compiling on a host that doesn't support it (10.11). We can use the availability macros to circumvent this problem, and detect that we're not compiling for a target that is going to support them and error out at compile time. The other alternative is to do a runtime check, but that presents issues for cross-compiling.
*	Add pthread_atfork(3) feature test.	Jason Evans	2016-11-17	1	-0/+8
\| \| \| \| \| \|	Some versions of Android provide a pthreads library without providing pthread_atfork(), so in practice a separate feature test is necessary for the latter.
*	Refactor madvise(2) configuration.	Jason Evans	2016-11-17	1	-13/+25
\| \| \| \| \| \| \| \| \|	Add feature tests for the MADV_FREE and MADV_DONTNEED flags to madvise(2), so that MADV_FREE is detected and used for Linux kernel versions 4.5 and newer. Refactor pages_purge() so that on systems which support both flags, MADV_FREE is preferred over MADV_DONTNEED. This resolves #387.
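The preference order described above might look like this in C. This is a sketch, not the refactored pages_purge(): the function names are hypothetical, the `#if defined(...)` checks stand in for the configure-time feature tests, and the run-time fallback to MADV_DONTNEED when MADV_FREE is rejected (for example on a kernel older than 4.5) is an added assumption rather than something the commit message states.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes the MADV_* flags and MAP_ANON on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hint that [addr, addr+len) is unused. Returns 0 on success. */
int purge_lazy_sketch(void *addr, size_t len) {
    assert(addr != NULL && len > 0);
#if defined(MADV_FREE)
    /* Preferred: the kernel may reclaim the pages lazily. */
    if (madvise(addr, len, MADV_FREE) == 0) {
        return 0;
    }
    /* Assumed fallback: the running kernel may not support MADV_FREE. */
#endif
#if defined(MADV_DONTNEED)
    return madvise(addr, len, MADV_DONTNEED);
#else
    return -1;
#endif
}

/* Map a private anonymous region, dirty it, and purge it. Returns 1 on success. */
int purge_lazy_demo(void) {
    size_t len = (size_t)1 << 16;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    p[0] = 1;
    int ok = (purge_lazy_sketch(p, len) == 0);
    munmap(p, len);
    return ok;
}
```

Unlike MADV_DONTNEED on Linux, MADV_FREE does not guarantee that the pages read back as zero, so the demo checks only the return value.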
*	Remove a residual comment.	Jason Evans	2016-11-17	1	-1/+0
\|
*	Revert "Add JE_RUNNABLE() and use it for os_unfair_lock_*() test."	Jason Evans	2016-11-16	1	-16/+1
\| \| \| \| \| \|	This reverts commit a2e601a2236315fb6f994ff364ea442ed0aed07b. JE_RUNNABLE() causes general cross-compilation issues.
*	Add JE_RUNNABLE() and use it for os_unfair_lock_*() test.	Jason Evans	2016-11-12	1	-1/+16
\| \| \| \|	This resolves #494.
*	Add configure support for --linux-android.	Jason Evans	2016-11-10	1	-0/+12
\| \| \| \| \| \| \|	This is tailored to Android, i.e. more specific than the --linux* configuration. This resolves #471.
*	Use -std=gnu11 if available.	Jason Evans	2016-11-04	1	-2/+8
\| \| \| \|	This supersedes -std=gnu99, and enables C11 atomics.
*	Support Debian GNU/kFreeBSD.	Samuel Moritz	2016-11-03	1	-1/+1
\| \| \| \|	Treat it exactly like Linux since they both use GNU libc.
*	Fix syscall(2) configure test for Linux.	Jason Evans	2016-11-03	1	-2/+1
\|
*	Do not use syscall(2) on OS X 10.12 (deprecated).	Jason Evans	2016-11-03	1	-0/+17
\|
*	Add os_unfair_lock support.	Jason Evans	2016-11-03	1	-0/+14
\| \| \| \| \|	OS X 10.12 deprecated OSSpinLock; os_unfair_lock is the recommended replacement.
*	Force no lazy-lock on Windows.	Jason Evans	2016-11-02	1	-5/+11
\| \| \| \| \| \| \|	Monitoring thread creation is unimplemented for Windows, which means lazy-lock cannot function correctly. This resolves #310.
*	Use CLOCK_MONOTONIC_COARSE rather than CLOCK_MONOTONIC_RAW.	Jason Evans	2016-10-30	1	-6/+6
\| \| \| \| \| \| \| \|	The raw clock variant is slow (even relative to plain CLOCK_MONOTONIC), whereas the coarse clock variant is faster than CLOCK_MONOTONIC, but still has resolution (~1ms) that is adequate for our purposes. This resolves #479.
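Reading the coarse clock when it is available can be sketched as below. The function names are hypothetical, and the fallback to CLOCK_MONOTONIC (both at compile time and if the coarse read fails) is an assumption made for portability, not a description of jemalloc's timing code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes clock_gettime() and the Linux clock ids on glibc */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds, preferring the cheaper coarse clock. */
uint64_t monotonic_ns(void) {
    struct timespec ts;
    int rc;
#if defined(CLOCK_MONOTONIC_COARSE)
    rc = clock_gettime(CLOCK_MONOTONIC_COARSE, &ts);
    if (rc != 0) {
        rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    }
#else
    rc = clock_gettime(CLOCK_MONOTONIC, &ts);
#endif
    assert(rc == 0);
    (void)rc; /* silence the unused warning when NDEBUG is defined */
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Take two readings and report 1 if time did not go backwards. */
int monotonic_demo(void) {
    uint64_t a = monotonic_ns();
    uint64_t b = monotonic_ns();
    return b >= a;
}
```

Two back-to-back coarse readings are often identical, because the coarse clock advances only about once per millisecond, as the commit message notes.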
*	Fix EXTRA_CFLAGS to not affect configuration.	Jason Evans	2016-10-30	1	-4/+2
\|