path: root/Makefile.in
Commit message | Author | Age | Files | Lines
* Remove --enable-code-coverage. | Jason Evans | 2017-04-24 | 1 | -50/+0
  This option hasn't been particularly useful since the original pre-3.0.0 push to broaden test coverage. This partially resolves #580.
* Add hooking functionality | David Goldblatt | 2017-04-07 | 1 | -0/+2
  This allows us to hook chosen functions and do interesting things there (in particular: reentrancy checking).
* Implement two-phase decay-based purging. | Jason Evans | 2017-03-15 | 1 | -2/+2
  Split decay-based purging into two phases, the first of which uses lazy purging to convert dirty pages to "muzzy", and the second of which uses forced purging, decommit, or unmapping to convert pages to clean or destroy them altogether. Not all operating systems support lazy purging, yet the application may provide extent hooks that implement lazy purging, so care must be taken to dynamically omit the first phase when necessary.

  The mallctl interfaces change as follows:
  - opt.decay_time --> opt.{dirty,muzzy}_decay_time
  - arena.<i>.decay_time --> arena.<i>.{dirty,muzzy}_decay_time
  - arenas.decay_time --> arenas.{dirty,muzzy}_decay_time
  - stats.arenas.<i>.pdirty --> stats.arenas.<i>.p{dirty,muzzy}
  - stats.arenas.<i>.{npurge,nmadvise,purged} --> stats.arenas.<i>.{dirty,muzzy}_{npurge,nmadvise,purged}

  This resolves #521.
* Disentangle assert and util | David Goldblatt | 2017-03-06 | 1 | -4/+4
  This is the first header refactoring diff, #533. It splits the assert and util components into separate, hermetic, header files. In the process, it splits out two of the large sub-components of util (the stdio.h replacement, and bit manipulation routines) into their own components (malloc_io.h and bit_util.h). This is mostly to break up cyclic dependencies, but it also breaks off a good chunk of the catch-all-ness of util, which is nice.
* Introduce a backport of C11 atomics | David Goldblatt | 2017-03-03 | 1 | -1/+0
  This introduces a backport of C11 atomics. It has four implementations; ranked in order of preference, they are:
  - GCC/Clang __atomic builtins
  - GCC/Clang __sync builtins
  - MSVC _Interlocked builtins
  - C11 atomics, from <stdatomic.h>

  The primary advantages are:
  - Close adherence to the standard API gives us a defined memory model.
  - Type safety: atomic objects are now separate types from non-atomic ones, so that it's impossible to mix up atomic and non-atomic updates (which is undefined behavior that compilers are starting to take advantage of).
  - Efficiency: we can specify ordering for operations, avoiding fences and atomic operations on strongly ordered architectures (example: `atomic_write_u32(ptr, val);` involves a CAS loop, whereas `atomic_store(ptr, val, ATOMIC_RELEASE);` is a plain store).

  This diff leaves in the current atomics API (implementing them in terms of the backport). This lets us transition uses over piecemeal.

  Testing: This is by nature hard to test. I've manually tested the first three options on Linux on gcc by futzing with the #defines manually, on FreeBSD with gcc and clang, on MSVC, and on OS X with clang. All of these were x86 machines though, and we don't have any test infrastructure set up for non-x86 platforms.
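  The efficiency point in this message can be sketched with the standard C11 `<stdatomic.h>` API. This is an illustration only: it uses the standard spellings (`atomic_store_explicit`, `memory_order_release`) rather than the backport's `atomic_store(..., ATOMIC_RELEASE)`, and the helper names `write_u32_cas`, `write_u32_release`, and `read_u32_acquire` are hypothetical, not jemalloc functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write through a compare-and-swap loop.
 * This is the costlier pattern the commit message contrasts against. */
static void
write_u32_cas(_Atomic uint32_t *p, uint32_t val) {
    assert(p != NULL);
    uint32_t old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, old is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(p, &old, val,
        memory_order_seq_cst, memory_order_relaxed)) {
    }
}

/* Hypothetical helper: a store with explicit release ordering. On strongly
 * ordered architectures such as x86 this compiles to a plain store. */
static void
write_u32_release(_Atomic uint32_t *p, uint32_t val) {
    assert(p != NULL);
    atomic_store_explicit(p, val, memory_order_release);
}

/* Hypothetical helper: the matching acquire load. */
static uint32_t
read_u32_acquire(_Atomic uint32_t *p) {
    assert(p != NULL);
    return atomic_load_explicit(p, memory_order_acquire);
}
```

  Both writers leave the same value in memory; the difference is only in the ordering guarantees requested and therefore in the instructions the compiler has to emit.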
* Remove remainder of mb (memory barrier). | Jason Evans | 2017-02-22 | 1 | -1/+0
  This complements 94c5d22a4da7844d0bdc5b370e47b1ba14268af2 (Remove mb.h, which is unused).
* Enhance spin_adaptive() to yield after several iterations. | Jason Evans | 2017-02-09 | 1 | -0/+1
  This avoids worst case behavior if e.g. another thread is preempted while owning the resource the spinning thread is waiting for.
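  A minimal sketch of the idea, assuming a POSIX system with `sched_yield()`: spin with exponentially growing busy-waits for a few rounds, then start yielding the CPU. The type `spin_sketch_t`, the function names, and the threshold of 5 are assumptions made for illustration and are not taken from the jemalloc source.

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold: number of busy-wait rounds before yielding. */
#define SPIN_SKETCH_YIELD_THRESHOLD 5U

typedef struct {
    unsigned iteration;
} spin_sketch_t;

static void
spin_sketch_init(spin_sketch_t *spin) {
    assert(spin != NULL);
    spin->iteration = 0;
}

/* One backoff step. Returns true if this call yielded the CPU, false if
 * it only busy-waited. */
static bool
spin_sketch_adaptive(spin_sketch_t *spin) {
    assert(spin != NULL);
    if (spin->iteration < SPIN_SKETCH_YIELD_THRESHOLD) {
        /* Busy-wait for 2^iteration rounds; the wait doubles each call. */
        for (volatile unsigned i = 0; i < (1U << spin->iteration); i++) {
        }
        spin->iteration++;
        return false;
    }
    /* The lock holder may have been preempted; let another thread run. */
    sched_yield();
    return true;
}
```

  A caller would invoke `spin_sketch_adaptive()` in a loop while polling the resource it is waiting for.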
* Test JSON output of malloc_stats_print() and fix bugs. | Jason Evans | 2017-01-19 | 1 | -0/+1
  Implement and test a JSON validation parser. Use the parser to validate JSON output from malloc_stats_print(), with a significant subset of supported output options. This resolves #551.
* Fix prof_realloc() regression. | Jason Evans | 2017-01-17 | 1 | -0/+1
  Mostly revert the prof_realloc() changes in 498856f44a30b31fe713a18eb2fc7c6ecf3a9f63 (Move slabs out of chunks.) so that prof_free_sampled_object() is called when appropriate. Leave the prof_tctx_[re]set() optimization in place, but add an assertion to verify that all eight cases are correctly handled. Add a comment to make clear the code ordering, so that the regression originally fixed by ea8d97b8978a0c0423f0ed64332463a25b787c3d (Fix prof_{malloc,free}_sample_object() call order in prof_realloc().) is not repeated. This resolves #499.
* Implement arena.<i>.destroy . | Jason Evans | 2017-01-07 | 1 | -0/+4
  Add MALLCTL_ARENAS_DESTROYED for accessing destroyed arena stats as an analogue to MALLCTL_ARENAS_ALL. This resolves #382.
* Implement per arena base allocators. | Jason Evans | 2016-12-27 | 1 | -0/+1
  Add/rename related mallctls:
  - Add stats.arenas.<i>.base .
  - Rename stats.arenas.<i>.metadata to stats.arenas.<i>.internal .
  - Add stats.arenas.<i>.resident .

  Modify the arenas.extend mallctl to take an optional (extent_hooks_t *) argument so that it is possible for all base allocations to be serviced by the specified extent hooks. This resolves #463.
* Add huge page configuration and pages_[no]huge(). | Jason Evans | 2016-12-27 | 1 | -0/+1
  Add the --with-lg-hugepage configure option, but automatically configure LG_HUGEPAGE even if it isn't specified. Add the pages_[no]huge() functions, which toggle huge page state via madvise(..., MADV_[NO]HUGEPAGE) calls.
* Simplify arena_slab_regind(). | Jason Evans | 2016-12-23 | 1 | -0/+1
  Rewrite arena_slab_regind() to provide sufficient constant data for the compiler to perform division strength reduction. This replaces more general manual strength reduction that was implemented before arena_bin_info was compile-time-constant. It would be possible to slightly improve on the compiler-generated division code by taking advantage of range limits that the compiler doesn't know about.
* Restructure *CFLAGS/*CXXFLAGS configuration. | Jason Evans | 2016-12-16 | 1 | -2/+6
  Convert CFLAGS/CXXFLAGS to be concatenations:

    CFLAGS := CONFIGURE_CFLAGS SPECIFIED_CFLAGS EXTRA_CFLAGS
    CXXFLAGS := CONFIGURE_CXXFLAGS SPECIFIED_CXXFLAGS EXTRA_CXXFLAGS

  This ordering makes it possible to override the flags set by the configure script both during and after configuration, with CFLAGS/CXXFLAGS and EXTRA_CFLAGS/EXTRA_CXXFLAGS, respectively. This resolves #504.
* jemalloc cpp new/delete bindings | Dave Watson | 2016-12-13 | 1 | -16/+69
  Adds cpp bindings for jemalloc, along with necessary autoconf settings. This is mostly to add sized deallocation support, which can't be added from C directly. Sized deallocation is a ~10% microbench improvement.
  - Import ax_cxx_compile_stdcxx.m4 from the autoconf repo, which seems like the easiest way to get c++14 detection.
  - Adds various other changes, like CXXFLAGS, to configure.ac.
  - Adds new rules to Makefile.in for src/jemalloc-cpp.cpp, and a basic unittest.
  - Both new and delete are overridden, to ensure jemalloc is used for both.
  - TODO: future enhancement of avoiding extra PLT thunks for new and delete. sdallocx and malloc are publicly exported jemalloc symbols, so using an alias would link them directly. Unfortunately, I was having trouble getting it to play nice with jemalloc's namespace support.

  Testing: Tested gcc 4.8, gcc 5, gcc 5.2, clang 4.0. Only gcc >= 5 has sized deallocation support; verified that the rest build correctly. Tested Mac OS X and CentOS. Tested --with-jemalloc-prefix and --without-export.

  This resolves #202.
* Add packing test, which verifies stable layout policy. | Jason Evans | 2016-11-15 | 1 | -0/+1
* Fix EXTRA_CFLAGS to not affect configuration. | Jason Evans | 2016-10-30 | 1 | -1/+2
* Only link with libm (-lm) if necessary. | Jason Evans | 2016-10-28 | 1 | -3/+4
  This fixes warnings when building with MSVC.
* Only use --whole-archive with gcc. | Jason Evans | 2016-10-28 | 1 | -2/+3
  Conditionalize use of --whole-archive on the platform plus compiler, rather than on the ABI. This fixes a regression caused by 7b24c6e5570062495243f1e55131b395adb31e33 (Use --whole-archive when linking integration tests on MinGW.).
* Use --whole-archive when linking integration tests on MinGW. | Jason Evans | 2016-10-26 | 1 | -1/+10
  Prior to this change, the malloc_conf weak symbol provided by the jemalloc dynamic library is always used, even if the application provides a malloc_conf symbol. Use the --whole-archive linker option to allow the weak symbol to be overridden.
* Add/use adaptive spinning. | Jason Evans | 2016-10-13 | 1 | -0/+1
  Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning. Adaptively spin during busy waits in bootstrapping and rtree node initialization.
* Remove all vestiges of chunks. | Jason Evans | 2016-10-12 | 1 | -2/+0
  Remove mallctls:
  - opt.lg_chunk
  - stats.cactive

  This resolves #464.
* Remove ratio-based purging. | Jason Evans | 2016-10-12 | 1 | -5/+3
  Make decay-based purging the default (and only) mode. Remove associated mallctls:
  - opt.purge
  - opt.lg_dirty_mult
  - arena.<i>.lg_dirty_mult
  - arenas.lg_dirty_mult
  - stats.arenas.<i>.lg_dirty_mult

  This resolves #385.
* use install command determined by configure | Thomas Köckerbauer | 2016-09-26 | 1 | -20/+21
* Fix librt detection when using a Cray compiler wrapper | Elliot Ronaghan | 2016-07-07 | 1 | -1/+1
  The Cray compiler wrappers will often add `-lrt` to the base compiler with `-static` linking (the default at most sites.) However, `-lrt` isn't automatically added with `-dynamic`. This means that if jemalloc was built with `-static`, but then used in a program with `-dynamic` jemalloc won't have detected that librt is a dependency.

  The integration and stress tests use -dynamic, which is causing undefined references to clock_gettime(). This just adds an extra check for librt (ignoring the autoconf cache) with `-dynamic` thrown. It also stops filtering librt from the integration tests.

  With this `make check` passes for:
  - PrgEnv-gnu
  - PrgEnv-intel
  - PrgEnv-pgi

  PrgEnv-cray still needs more work (will be in a separate patch.)
* Add -dynamic for integration and stress tests with Cray compiler wrappers | Elliot Ronaghan | 2016-07-07 | 1 | -2/+3
  Cray systems come with compiler wrappers to simplify building parallel applications. CC is the C++ wrapper, and cc is the C wrapper. The wrappers call the base {Cray, Intel, PGI, or GNU} compiler with vendor specific flags. The "Programming Environment" (prgenv) that's currently loaded determines the base compiler. e.g. compiling with gnu looks something like:

    module load PrgEnv-gnu
    cc hello.c

  On most systems the wrappers default to `-static` mode, which causes them to only look for static libraries, and not for any dynamic ones (even if the dynamic version was explicitly listed.) The integration and stress tests expect to be using the .so, so we have to run them with -dynamic so that the wrapper will find/use the .so.
* Rename most remaining *chunk* APIs to *extent*. | Jason Evans | 2016-06-06 | 1 | -5/+5
* Rename huge to large. | Jason Evans | 2016-06-06 | 1 | -1/+1
* Use huge size class infrastructure for large size classes. | Jason Evans | 2016-06-06 | 1 | -1/+0
* Replace extent_tree_szad_* with extent_heap_*. | Jason Evans | 2016-06-03 | 1 | -0/+1
* Remove quarantine support. | Jason Evans | 2016-05-13 | 1 | -2/+0
* Remove Valgrind support. | Jason Evans | 2016-05-13 | 1 | -4/+0
* Fix tsd bootstrapping for a0malloc(). | Jason Evans | 2016-05-07 | 1 | -0/+1
* Link against librt for clock_gettime(2) if glibc < 2.17. | Jason Evans | 2016-05-04 | 1 | -4/+3
  Link libjemalloc against librt if clock_gettime(2) is in librt rather than libc, as for versions of glibc prior to 2.17. This resolves #349.
* Fix fork()-related lock rank ordering reversals. | Jason Evans | 2016-04-26 | 1 | -0/+1
* Implement the arena.<i>.reset mallctl. | Jason Evans | 2016-04-22 | 1 | -1/+3
  This makes it possible to discard all of an arena's allocations in a single operation. This resolves #146.
* Add witness, a simple online locking validator. | Jason Evans | 2016-04-14 | 1 | -1/+3
  This resolves #358.
* Refactor/fix ph. | Jason Evans | 2016-04-11 | 1 | -1/+0
  Refactor ph to support configurable comparison functions. Use a cpp macro code generation form equivalent to the rb macros so that pairing heaps can be used for both run heaps and chunk heaps.

  Remove per node parent pointers, and instead use leftmost siblings' prev pointers to track parents.

  Fix multi-pass sibling merging to iterate over intermediate results using a FIFO, rather than a LIFO. Use this fixed sibling merging implementation for both merge phases of the auxiliary twopass algorithm (first merging the aux list, then replacing the root with its merged children). This fixes both degenerate merge behavior and the potential for deep recursion.

  This regression was introduced by 6bafa6678fc36483e638f1c3a0a9bf79fb89bfc9 (Pairing heap). This resolves #371.
* Unittest for pairing heap | Dave Watson | 2016-03-08 | 1 | -0/+1
* Pairing heap | Dave Watson | 2016-03-08 | 1 | -0/+1
  Initial implementation of a twopass pairing heap with aux list. Research papers linked in comments.

  Where search/nsearch/last aren't needed, this gives much faster first(), delete(), and insert(). Insert is O(1), and first/delete don't have to walk the whole tree.

  Also tested rb_old with parent pointers - it was better than the current rb.h for memory loads, but still much worse than a pairing heap.

  An array-based heap would be much faster if everything fits in memory, but on a cold cache it has many more memory loads for most operations.
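  To make the O(1) insert and the two-pass merge concrete, here is a minimal sketch of a textbook two-pass pairing heap over integer keys. It deliberately omits the aux list and the macro-generated, comparator-driven form the commit describes; `ph_node_t`, `ph_meld`, `ph_insert`, `ph_merge_pairs`, and `ph_remove_first` are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ph_node_s ph_node_t;
struct ph_node_s {
    int key;
    ph_node_t *child;   /* Leftmost child. */
    ph_node_t *sibling; /* Next sibling to the right. */
};

/* Meld two heaps in O(1): the root with the larger key becomes the
 * leftmost child of the other root. */
static ph_node_t *
ph_meld(ph_node_t *a, ph_node_t *b) {
    if (a == NULL) {
        return b;
    }
    if (b == NULL) {
        return a;
    }
    if (b->key < a->key) {
        ph_node_t *tmp = a;
        a = b;
        b = tmp;
    }
    b->sibling = a->child;
    a->child = b;
    return a;
}

/* Insert is a single meld, hence O(1). Returns the new root. */
static ph_node_t *
ph_insert(ph_node_t *root, ph_node_t *node, int key) {
    assert(node != NULL);
    node->key = key;
    node->child = NULL;
    node->sibling = NULL;
    return ph_meld(root, node);
}

/* Two-pass merge of a sibling list: meld adjacent pairs left to right,
 * then fold the results right to left. Iterative, so no deep recursion. */
static ph_node_t *
ph_merge_pairs(ph_node_t *first) {
    ph_node_t *pairs = NULL; /* Stack of melded pairs, linked via sibling. */
    while (first != NULL) {
        ph_node_t *a = first;
        ph_node_t *b = a->sibling;
        first = (b != NULL) ? b->sibling : NULL;
        a->sibling = NULL;
        if (b != NULL) {
            b->sibling = NULL;
        }
        ph_node_t *merged = ph_meld(a, b);
        merged->sibling = pairs;
        pairs = merged;
    }
    ph_node_t *root = NULL;
    while (pairs != NULL) {
        ph_node_t *next = pairs->sibling;
        pairs->sibling = NULL;
        root = ph_meld(root, pairs);
        pairs = next;
    }
    return root;
}

/* Remove and return the minimum node; *rootp is updated in place. */
static ph_node_t *
ph_remove_first(ph_node_t **rootp) {
    assert(rootp != NULL && *rootp != NULL);
    ph_node_t *min = *rootp;
    *rootp = ph_merge_pairs(min->child);
    min->child = NULL;
    return min;
}
```

  The minimum is always at the root, so first() is a pointer read, and only removal pays for restructuring.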
* Test run quantization. | Jason Evans | 2016-02-22 | 1 | -0/+1
  Also rename run_quantize_*() to improve clarity. These tests demonstrate that run_quantize_ceil() is flawed.
* Refactor time_* into nstime_*. | Jason Evans | 2016-02-22 | 1 | -11/+27
  Use a single uint64_t in nstime_t to store nanoseconds rather than using struct timespec. This reduces fragility around conversions between long and uint64_t, especially missing casts that only cause problems on 32-bit platforms.
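  A minimal sketch of the representation described here: one `uint64_t` nanosecond count, with the widening casts applied once at the conversion boundary. The type and function names (`nstime_sketch_t` and friends) are hypothetical and only illustrate the approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NSTIME_SKETCH_NS_PER_SEC UINT64_C(1000000000)

/* Hypothetical type: a point in time as a single nanosecond count. */
typedef struct {
    uint64_t ns;
} nstime_sketch_t;

static void
nstime_sketch_init2(nstime_sketch_t *t, uint64_t sec, uint64_t nsec) {
    assert(t != NULL);
    t->ns = sec * NSTIME_SKETCH_NS_PER_SEC + nsec;
}

/* Convert from struct timespec. The casts happen before the multiply, so
 * the arithmetic is 64-bit even where time_t or long is 32 bits wide. */
static void
nstime_sketch_from_timespec(nstime_sketch_t *t, const struct timespec *ts) {
    assert(t != NULL && ts != NULL);
    assert(ts->tv_sec >= 0 && ts->tv_nsec >= 0);
    t->ns = (uint64_t)ts->tv_sec * NSTIME_SKETCH_NS_PER_SEC +
        (uint64_t)ts->tv_nsec;
}

static uint64_t
nstime_sketch_sec(const nstime_sketch_t *t) {
    return t->ns / NSTIME_SKETCH_NS_PER_SEC;
}

static uint64_t
nstime_sketch_nsec(const nstime_sketch_t *t) {
    return t->ns % NSTIME_SKETCH_NS_PER_SEC;
}

/* a -= b; the caller guarantees a >= b, so there is no borrow to track
 * between separate second and nanosecond fields. */
static void
nstime_sketch_subtract(nstime_sketch_t *a, const nstime_sketch_t *b) {
    assert(a->ns >= b->ns);
    a->ns -= b->ns;
}
```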
* Implement decay-based unused dirty page purging. | Jason Evans | 2016-02-20 | 1 | -3/+8
  This is an alternative to the existing ratio-based unused dirty page purging, and is intended to eventually become the sole purging mechanism.

  Add mallctls:
  - opt.purge
  - opt.decay_time
  - arena.<i>.decay
  - arena.<i>.decay_time
  - arenas.decay_time
  - stats.arenas.<i>.decay_time

  This resolves #325.
* Implement smoothstep table generation. | Jason Evans | 2016-02-20 | 1 | -0/+1
  Check in a generated smootherstep table as smoothstep.h rather than generating it at configure time, since not all systems (e.g. Windows) have dc.
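  For orientation, the smootherstep polynomial is 6x^5 - 15x^4 + 10x^3 on [0, 1]. The sketch below fills a small fixed-point table from it; the step count of 8, the 24 fractional bits, and all identifiers are assumptions for illustration, not the values or names used in the checked-in smoothstep.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameters: table size and binary fixed-point precision. */
#define SMOOTH_SKETCH_NSTEPS 8
#define SMOOTH_SKETCH_BFP 24

/* smootherstep(x) = 6x^5 - 15x^4 + 10x^3, evaluated in Horner form. */
static double
smootherstep_sketch(double x) {
    assert(x >= 0.0 && x <= 1.0);
    return x * x * x * (x * (x * 6.0 - 15.0) + 10.0);
}

/* Fill table[i] with smootherstep((i+1)/NSTEPS) as a fixed-point value
 * with SMOOTH_SKETCH_BFP fractional bits, rounded to nearest. */
static void
smootherstep_sketch_fill(uint64_t table[SMOOTH_SKETCH_NSTEPS]) {
    const double scale = (double)(UINT64_C(1) << SMOOTH_SKETCH_BFP);
    for (unsigned i = 0; i < SMOOTH_SKETCH_NSTEPS; i++) {
        double x = (double)(i + 1) / SMOOTH_SKETCH_NSTEPS;
        table[i] = (uint64_t)(smootherstep_sketch(x) * scale + 0.5);
    }
}
```

  Generating such a table once and checking it in removes the build-time dependency on an external calculator, which is the motivation the commit gives.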
* Refactor prng* from cpp macros into inline functions. | Jason Evans | 2016-02-20 | 1 | -3/+5
  Remove 32-bit variant, convert prng64() to prng_lg_range(), and add prng_range().
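  One plausible shape for such a pair of functions is sketched below: a 64-bit linear congruential step that returns the top `lg_range` bits, plus a bounded variant built on rejection sampling. The multiplier and increment are the well-known Knuth MMIX LCG constants, chosen here as an assumption; the function names and the rejection loop are illustrative and are not claimed to match jemalloc's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed constants: Knuth's MMIX linear congruential generator. */
#define PRNG_SKETCH_A UINT64_C(6364136223846793005)
#define PRNG_SKETCH_C UINT64_C(1442695040888963407)

/* Advance the state and return a value in [0, 2^lg_range). The high bits
 * of an LCG are the better-distributed ones, so shift them down. */
static uint64_t
prng_sketch_lg_range(uint64_t *state, unsigned lg_range) {
    assert(state != NULL);
    assert(lg_range > 0 && lg_range <= 64);
    *state = *state * PRNG_SKETCH_A + PRNG_SKETCH_C;
    return *state >> (64 - lg_range);
}

/* Return a value in [0, range) without modulo bias: draw
 * ceil(log2(range)) bits and retry until the draw is below range. */
static uint64_t
prng_sketch_range(uint64_t *state, uint64_t range) {
    assert(range > 1);
    unsigned lg_range = 0;
    while (lg_range < 64 && (UINT64_C(1) << lg_range) < range) {
        lg_range++;
    }
    uint64_t ret;
    do {
        ret = prng_sketch_lg_range(state, lg_range);
    } while (ret >= range);
    return ret;
}
```

  Writing these as inline functions rather than macros gives the arguments real types and evaluates each of them exactly once.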
* Implement ticker. | Jason Evans | 2016-02-20 | 1 | -2/+3
  Implement ticker, which provides a simple API for ticking off some number of events before indicating that the ticker has hit its limit.
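  A minimal sketch of what such an API could look like: a countdown that is decremented per event and reports when its budget is exhausted, re-arming itself at that point. The names (`ticker_sketch_t`, `ticker_sketch_tick`, and so on) and the exact firing rule are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int32_t tick;   /* Events remaining before the limit is hit. */
    int32_t nticks; /* Budget restored each time the ticker fires. */
} ticker_sketch_t;

static void
ticker_sketch_init(ticker_sketch_t *ticker, int32_t nticks) {
    assert(ticker != NULL && nticks > 0);
    ticker->tick = nticks;
    ticker->nticks = nticks;
}

/* Tick off n events. Returns true on the call that finds fewer than n
 * events left in the budget, and re-arms the ticker in that case. */
static bool
ticker_sketch_ticks(ticker_sketch_t *ticker, int32_t n) {
    assert(ticker != NULL && n > 0);
    if (ticker->tick < n) {
        ticker->tick = ticker->nticks;
        return true;
    }
    ticker->tick -= n;
    return false;
}

/* Tick off a single event. */
static bool
ticker_sketch_tick(ticker_sketch_t *ticker) {
    return ticker_sketch_ticks(ticker, 1);
}
```

  A caller would typically tick on a hot path and run some periodic, more expensive work only when the function returns true.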
* Flesh out time_*() API. | Jason Evans | 2016-02-20 | 1 | -1/+1
* Add time_update(). | Cameron Evans | 2016-02-20 | 1 | -2/+3
* Expand check_integration_prof testing. | Jason Evans | 2015-09-17 | 1 | -0/+1
  Run integration tests with MALLOC_CONF="prof:true,prof_active:false" in addition to MALLOC_CONF="prof:true".
* Link test to librt if it contains clock_gettime(2). | Jason Evans | 2015-09-15 | 1 | -3/+4
  This resolves #257.