summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
| * | Use syscall(2) rather than {open,read,close}(2) during boot.Jason Evans2016-10-301-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Some applications wrap various system calls, and if they call the allocator in their wrappers, unexpected reentry can result. This is not a general solution (many other syscalls are spread throughout the code), but this resolves a bootstrapping issue that is apparently common. This resolves #443.
| * | Fix EXTRA_CFLAGS to not affect configuration.Jason Evans2016-10-302-5/+4
| | |
| * | Do not mark malloc_conf as weak on Windows.Jason Evans2016-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | This works around malloc_conf not being properly initialized by at least the cygwin toolchain. Prior build system changes to use -Wl,--[no-]whole-archive may be necessary for malloc_conf resolution to work properly as a non-weak symbol (not tested).
| * | Do not mark malloc_conf as weak for unit tests.Jason Evans2016-10-291-1/+5
| | | | | | | | | | | | | | | | | | | | | This is generally correct (no need for weak symbols since no jemalloc library is involved in the link phase), and avoids linking problems (apparently unininitialized non-NULL malloc_conf) when using cygwin with gcc.
| * | Support static linking of jemalloc with glibcDave Watson2016-10-282-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | glibc defines its malloc implementation with several weak and strong symbols: strong_alias (__libc_calloc, __calloc) weak_alias (__libc_calloc, calloc) strong_alias (__libc_free, __cfree) weak_alias (__libc_free, cfree) strong_alias (__libc_free, __free) strong_alias (__libc_free, free) strong_alias (__libc_malloc, __malloc) strong_alias (__libc_malloc, malloc) The issue is not with the weak symbols, but that other parts of glibc depend on __libc_malloc explicitly. Defining them in terms of jemalloc API's allows the linker to drop glibc's malloc.o completely from the link, and static linking no longer results in symbol collisions. Another wrinkle: jemalloc during initialization calls sysconf to get the number of CPU's. GLIBC allocates for the first time before setting up isspace (and other related) tables, which are used by sysconf. Instead, use the pthread API to get the number of CPUs with GLIBC, which seems to work. This resolves #442.
| * | Reduce memory requirements for regression tests.Jason Evans2016-10-283-35/+55
| | | | | | | | | | | | | | | | | | | | | This is intended to drop memory usage to a level that AppVeyor test instances can handle. This resolves #393.
| * | Periodically purge in memory-intensive integration tests.Jason Evans2016-10-281-0/+7
| | | | | | | | | | | | This resolves #393.
| * | Periodically purge in memory-intensive integration tests.Jason Evans2016-10-283-6/+27
| | | | | | | | | | | | This resolves #393.
| * | Only link with libm (-lm) if necessary.Jason Evans2016-10-282-6/+16
| | | | | | | | | | | | This fixes warnings when building with MSVC.
| * | Only use --whole-archive with gcc.Jason Evans2016-10-283-3/+7
| | | | | | | | | | | | | | | | | | | | | Conditionalize use of --whole-archive on the platform plus compiler, rather than on the ABI. This fixes a regression caused by 7b24c6e5570062495243f1e55131b395adb31e33 (Use --whole-archive when linking integration tests on MinGW.).
| * | Do not force lazy lock on Windows.Jason Evans2016-10-281-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | This reverts 13473c7c66a81a4dc1cf11a97e9c8b1dbb785b64, which was intended to work around bootstrapping issues when linking statically. However, this actually causes problems in various other configurations, so this reversion may force a future fix for the underlying problem, if it still exists.
| * | Fix over-sized allocation of rtree leaf nodes.Jason Evans2016-10-281-1/+1
| | | | | | | | | | | | | | | Use the correct level metadata when allocating child nodes so that leaf nodes don't end up over-sized (2^16 elements vs 2^4 elements).
| * | Use --whole-archive when linking integration tests on MinGW.Jason Evans2016-10-261-1/+10
| | | | | | | | | | | | | | | | | | | | | Prior to this change, the malloc_conf weak symbol provided by the jemalloc dynamic library is always used, even if the application provides a malloc_conf symbol. Use the --whole-archive linker option to allow the weak symbol to be overridden.
| * | Do not (recursively) allocate within tsd_fetch().Jason Evans2016-10-2114-130/+176
| | | | | | | | | | | | | | | | | | | | | Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.
| * | Make dss operations lockless.Jason Evans2016-10-1311-146/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than protecting dss operations with a mutex, use atomic operations. This has negligible impact on synchronization overhead during typical dss allocation, but is a substantial improvement for chunk_in_dss() and the newly added chunk_dss_mergeable(), which can be called multiple times during chunk deallocations. This change also has the advantage of avoiding tsd in deallocation paths associated with purging, which resolves potential deadlocks during thread exit due to attempted tsd resurrection. This resolves #425.
| * | Add/use adaptive spinning.Jason Evans2016-10-136-2/+66
| | | | | | | | | | | | | | | | | | | | | | | | Add spin_t and spin_{init,adaptive}(), which provide a simple abstraction for adaptive spinning. Adaptively spin during busy waits in bootstrapping and rtree node initialization.
| * | Disallow 0x5a junk filling when running in Valgrind.Jason Evans2016-10-131-6/+28
| | | | | | | | | | | | | | | | | | | | | | | | Explicitly disallow junk:true and junk:free runtime settings when running in Valgrind, since deallocation-time junk filling and redzone validation cause false positive Valgrind reports. This resolves #470.
| * | Fix and simplify decay-based purging.Jason Evans2016-10-112-69/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify decay-based purging attempts to only be triggered when the epoch is advanced, rather than every time purgeable memory increases. In a correctly functioning system (not previously the case; see below), this only causes a behavior difference if during subsequent purge attempts the least recently used (LRU) purgeable memory extent is initially too large to be purged, but that memory is reused between attempts and one or more of the next LRU purgeable memory extents are small enough to be purged. In practice this is an arbitrary behavior change that is within the set of acceptable behaviors. As for the purging fix, assure that arena->decay.ndirty is recorded *after* the epoch advance and associated purging occurs. Prior to this fix, it was possible for purging during epoch advance to cause a substantially underrepresentative (arena->ndirty - arena->decay.ndirty), i.e. the number of dirty pages attributed to the current epoch was too low, and a series of unintended purges could result. This fix is also relevant in the context of the simplification described above, but the bug's impact would be limited to over-purging at epoch advances.
| * | Fix decay tests to all adapt to nstime_monotonic().Jason Evans2016-10-111-6/+9
| | |
| * | Do not advance decay epoch when time goes backwards.Jason Evans2016-10-116-6/+63
| | | | | | | | | | | | | | | | | | Instead, move the epoch backward in time. Additionally, add nstime_monotonic() and use it in debug builds to assert that time only goes backward if nstime_update() is using a non-monotonic time source.
| * | Refactor arena->decay_* into arena->decay.* (arena_decay_t).Jason Evans2016-10-112-84/+91
| | |
| * | Refine nstime_update().Jason Evans2016-10-105-38/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add missing #include <time.h>. The critical time facilities appear to have been transitively included via unistd.h and sys/time.h, but in principle this omission was capable of having caused clock_gettime(CLOCK_MONOTONIC, ...) to have been overlooked in favor of gettimeofday(), which in turn could cause spurious non-monotonic time updates. Refactor nstime_get() out of nstime_update() and add configure tests for all variants. Add CLOCK_MONOTONIC_RAW support (Linux-specific) and mach_absolute_time() support (OS X-specific). Do not fall back to clock_gettime(CLOCK_REALTIME, ...). This was a fragile Linux-specific workaround, which we're unlikely to use at all now that clock_gettime(CLOCK_MONOTONIC_RAW, ...) is supported, and if we have no choice besides non-monotonic clocks, gettimeofday() is only incrementally worse.
| * | Simplify run quantization.Jason Evans2016-10-063-154/+31
| | |
| * | Refactor runs_avail.Jason Evans2016-10-056-49/+70
| | | | | | | | | | | | | | | | | | | | | | | | Use pszind_t size classes rather than szind_t size classes, and always reserve space for NPSIZES elements. This removes unused heaps that are not multiples of the page size, and adds (currently) unused heaps for all huge size classes, with the immediate benefit that the size of arena_t allocations is constant (no longer dependent on chunk size).
| * | Implement pz2ind(), pind2sz(), and psz2u().Jason Evans2016-10-046-28/+203
| | | | | | | | | | | | | | | | | | | | | These compute size classes and indices similarly to size2index(), index2size() and s2u(), respectively, but using the subset of size classes that are multiples of the page size. Note that pszind_t and szind_t are not interchangeable.
| * | Use TSDN_NULL rather than NULL as appropriate.Jason Evans2016-10-043-9/+9
| | |
| * | Fix a typo.Jason Evans2016-10-041-1/+1
| | |
| * | Define 64-bits atomics unconditionallyMike Hommey2016-10-041-10/+8
| | | | | | | | | | | | They are used on all platforms in prng.h.
| * | Close file descriptor after reading "/proc/sys/vm/overcommit_memory".Jason Evans2016-09-261-0/+1
| | | | | | | | | | | | | | | | | | | | | This bug was introduced by c2f970c32b527660a33fa513a76d913c812dcf7c (Modify pages_map() to support mapping uncommitted virtual memory.). This resolves #399.
| * | Add an AppVeyor configMike Hommey2016-09-261-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This builds jemalloc and runs all checks with: - MSVC 2015 64-bits - MSVC 2015 32-bits - MINGW64 (from msys2) - MINGW32 (from msys2) Normally, AppVeyor configs are named appveyor.yml, but it is possible to configure the .yml file name in the AppVeyor project settings such that the file stays "hidden", like typical travis configs.
| * | Add Travis-CI configurationMike Hommey2016-09-261-0/+29
| | |
| * | use install command determined by configureThomas Köckerbauer2016-09-261-20/+21
| | |
| * | Readme.txt error for building in the WindowsBai2016-09-261-1/+1
| | | | | | | | | | | | The command can't work using sh -C sh -c "./autogen.sh CC=cl --enable-lazy-lock=no". Change the position of the colon, the command of autogen work.
| * | Fix LG_QUANTUM definition for sparc64Eric Le Bihan2016-09-261-1/+1
| | | | | | | | | | | | | | | | | | GCC 4.9.3 cross-compiled for sparc64 defines __sparc_v9__, not __sparc64__ nor __sparcv9. This prevents LG_QUANTUM from being defined properly. Adding this new value to the check solves the issue.
| * | Disable irrelevant Cray compiler warnings if cc-silence is enabledElliot Ronaghan2016-09-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cray is pretty warning-happy, so disable ones that aren't helpful. Each warning has a numeric value instead of having named flags to disable specific warnings. Disable warnings 128 and 1357. 128: Ignore unreachable code warning. Cray warns about `not_reached()` not being reachable in a couple of places because it detects that some loops will never terminate. 1357: Ignore warning about redefinition of malloc and friends With this patch, Cray 8.4.0 and 8.5.1 build cleanly and pass `make check`
| * | Add Cray compiler's equivalent of -Werror before __attribute__ checksElliot Ronaghan2016-09-261-0/+4
| | | | | | | | | | | | | | | | | | Cray uses -herror_on_warning instead of -Werror. Use it everywhere -Werror is currently used for __attribute__ checks so configure actually detects they're not supported.
| * | Disable automatic dependency generation for the Cray compilerElliot Ronaghan2016-09-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | Cray only supports `-M` for generating dependency files. It does not support `-MM` or `-MT`, so don't try to use them. I just reused the existing mechanism for turning auto-dependency generation off (`CC_MM=`), but it might be more principled to add a configure test to check if the compiler supports `-MM` and `-MT`, instead of manually tracking which compilers don't support those flags.
| * | Add initial support for building with the cray compilerElliot Ronaghan2016-09-261-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Get jemalloc building and passing `make check_unit` with cray 8.4. An inlining bug in 8.4 results in internal errors while trying to build jemalloc. This has already been reported and fixed for the 8.5 release. In order to work around the inlining bug, disable gnu compatibility and limit ipa optimizations. I copied the msvc compiler check for cray, but note that we perform the test even if we think we're using gcc because cray pretends to be gcc if `-hgnu` (which is enabled by default) is used. I couldn't come up with a principled way to check for the inlining bug, so instead I just checked compiler versions. The build had lots of warnings I need to address and cray doesn't support -MM or -MT for dependency tracking, so I had to do `make CC_MM=`.
| * | Fix librt detection when using a Cray compiler wrapperElliot Ronaghan2016-09-262-1/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Cray compiler wrappers will often add `-lrt` to the base compiler with `-static` linking (the default at most sites.) However, `-lrt` isn't automatically added with `-dynamic`. This means that if jemalloc was built with `-static`, but then used in a program with `-dynamic` jemalloc won't have detected that librt is a dependency. The integration and stress tests use -dynamic, which is causing undefined references to clock_gettime(). This just adds an extra check for librt (ignoring the autoconf cache) with `-dynamic` thrown. It also stops filtering librt from the integration tests. With this `make check` passes for: - PrgEnv-gnu - PrgEnv-intel - PrgEnv-pgi PrgEnv-cray still needs more work (will be in a separate patch.)
| * | Add -dynamic for integration and stress tests with Cray compiler wrappersElliot Ronaghan2016-09-262-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cray systems come with compiler wrappers to simplify building parallel applications. CC is the C++ wrapper, and cc is the C wrapper. The wrappers call the base {Cray, Intel, PGI, or GNU} compiler with vendor specific flags. The "Programming Environment" (prgenv) that's currently loaded determines the base compiler. e.g. compiling with gnu looks something like: module load PrgEnv-gnu cc hello.c On most systems the wrappers defaults to `-static` mode, which causes them to only look for static libraries, and not for any dynamic ones (even if the dynamic version was explicitly listed.) The integration and stress tests expect to be using the .so, so we have to run the with -dynamic so that wrapper will find/use the .so.
| * | Don't use compact red-black trees with the pgi compilerElliot Ronaghan2016-09-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some bug (either in the red-black tree code, or in the pgi compiler) seems to cause red-black trees to become unbalanced. This issue seems to go away if we don't use compact red-black trees. Since red-black trees don't seem to be used much anymore, I opted for what seems to be an easy fix here instead of digging in and trying to find the root cause of the bug. Some context in case it's helpful: I experienced a ton of segfaults while using pgi as Chapel's target compiler with jemalloc 4.0.4. The little bit of debugging I did pointed me somewhere deep in red-black tree manipulation, but I didn't get a chance to investigate further. It looks like 4.2.0 replaced most uses of red-black trees with pairing-heaps, which seems to avoid whatever bug I was hitting. However, `make check_unit` was still failing on the rb test, so I figured the core issue was just being masked. Here's the `make check_unit` failure: ```sh === test/unit/rb === test_rb_empty: pass tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced tree_recurse:test/unit/rb.c:90: Failed assertion: (((_Bool) (((uintptr_t) (left_node)->link.rbn_right_red) & ((size_t)1)))) == (false) --> true != false: Node should be black test_rb_random:test/unit/rb.c:274: Failed assertion: (imbalances) == (0) --> 1 != 0: Tree is unbalanced node_remove:test/unit/rb.c:190: Failed assertion: (imbalances) == (0) --> 2 != 0: Tree is unbalanced <jemalloc>: test/unit/rb.c:43: Failed assertion: "pathp[-1].cmp < 0" test/test.sh: line 22: 12926 Aborted Test harness error ``` While starting to debug I saw the RB_COMPACT option and decided to check if turning that off resolved the bug. It seems to have fixed it (`make check_unit` passes and the segfaults under Chapel are gone) so it seems like on okay work-around. I'd imagine this has performance implications for red-black trees under pgi, but if they're not going to be used much anymore it's probably not a big deal.
| * | Work around a weird pgi bug in test/unit/math.cElliot Ronaghan2016-09-261-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pgi fails to compile math.c, reporting that `-INFINITY` in `pt_norm_expected[]` is a "Non-constant" expression. A simplified version of this failure is: ```c #include <math.h> static double inf1, inf2 = INFINITY; // no complaints static double inf3 = INFINITY; // suddenly INFINITY is "Non-constant" int main() { } ``` ```sh PGC-S-0074-Non-constant expression in initializer (t.c: 4) ``` pgi errors on the declaration of inf3, and will compile fine if that line is removed. I've reported this bug to pgi, but in the meantime I just switched to using (DBL_MAX + DBL_MAX) to work around this bug.
| * | Formatting fixes.Jason Evans2016-09-261-9/+12
| | |
| * | Change how the default zone is foundMike Hommey2016-09-261-2/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On OSX 10.12, malloc_default_zone returns a special zone that is not present in the list of registered zones. That zone uses a "lite zone" if one is present (apparently enabled when malloc stack logging is enabled), or the first registered zone otherwise. In practice this means unless malloc stack logging is enabled, the first registered zone is the default. So get the list of zones to get the first one, instead of relying on malloc_default_zone.
| * | Fix a bug in __builtin_unreachable configure checkElliot Ronaghan2016-09-261-1/+1
| | | | | | | | | | | | | | | | | | | | | In 1167e9e, I accidentally tested je_cv_gcc_builtin_ffsl instead of je_cv_gcc_builtin_unreachable (copy-paste error), which meant that JEMALLOC_INTERNAL_UNREACHABLE was always getting defined as abort even if __builtin_unreachable support was detected.
| * | Check for __builtin_unreachable at configure timeElliot Ronaghan2016-09-263-16/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a configure check for __builtin_unreachable instead of basing its availability on the __GNUC__ version. On OS X using gcc (a real gcc, not the bundled version that's just a gcc front-end) leads to a linker assertion: https://github.com/jemalloc/jemalloc/issues/266 It turns out that this is caused by a gcc bug resulting from the use of __builtin_unreachable(): https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438 To work around this bug, check that __builtin_unreachable() actually works at configure time, and if it doesn't use abort() instead. The check is based on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57438#c21. With this `make check` passes with a homebrew installed gcc-5 and gcc-6.
| * | Fix a valgrind regression in chunk_recycle()Elliot Ronaghan2016-09-261-1/+2
| | | | | | | | | | | | | | | Fix a latent valgrind bug exposed by d412624b25eed2b5c52b7d94a71070d3aab03cb4 (Move retaining out of default chunk hooks).
| * | Fix arena_bind().Qi Wang2016-09-231-6/+7
| | | | | | | | | | | | | | | When tsd is not in nominal state (e.g. during thread termination), we should not increment nthreads.
| * | Change html manual encoding to UTF-8.Jason Evans2016-09-123-88/+92
|/ / | | | | | | | | | | | | | | | | | | | | This works around GitHub's broken automatic reformatting from ISO-8859-1 to UTF-8 when serving static html. Remove <parameter/> from e.g. <function>malloc<parameter/></function>, add a custom template that does not append parentheses, and manually specify them, e.g. <function>malloc()</function>. This works around apparently broken XSL formatting that causes <code/> to be emitted in html (rather than <code></code>, or better yet, nothing).
* | Merge branch.4.2.1Jason Evans2016-06-089-38/+96
|\ \ | |/ |/|