path: root/include
Commit message (Author, Date, Files changed, Lines -/+)
* Maintain all the dirty runs in a linked list for each arena (Qinfan Wu, 2014-08-12; 1 file, -0/+6)

* Add atomic operations tests and fix latent bugs (Jason Evans, 2014-08-07; 1 file, -12/+29)

* Add OpenRISC/or1k LG_QUANTUM size definition (Manuel A. Fernandez Montecelo, 2014-07-29; 1 file, -0/+3)

* Allow building with clang-cl (Mike Hommey, 2014-06-12; 1 file, -0/+4)
* Add check for madvise(2) to configure.ac (Richard Diamond, 2014-06-03; 1 file, -0/+5)
  Some platforms, such as Google's Portable Native Client, use Newlib and thus
  lack access to madvise(2). In those instances, pages_purge() is transformed
  into a no-op.
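  A minimal sketch of what such a guarded purge path can look like, assuming a
  configure-defined JEMALLOC_HAVE_MADVISE symbol (the macro name and the exact
  function signature here are illustrative, not verbatim jemalloc source):

    #include <stddef.h>
    #include <stdbool.h>
    #ifdef JEMALLOC_HAVE_MADVISE
    #include <sys/mman.h>
    #endif

    static bool
    pages_purge(void *addr, size_t length)
    {
        bool unzeroed;

    #ifdef JEMALLOC_HAVE_MADVISE
        /* On Linux, MADV_DONTNEED also zeroes the purged pages. */
        madvise(addr, length, MADV_DONTNEED);
        unzeroed = false;
    #else
        /* No madvise(2) (e.g. Newlib/PNaCl): purging becomes a no-op. */
        (void)addr;
        (void)length;
        unzeroed = true;
    #endif
        return unzeroed;
    }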
* Try to use __builtin_ffsl if ffsl is unavailable (Richard Diamond, 2014-06-02; 6 files, -9/+43)
  Some platforms (like those using Newlib) don't have ffs/ffsl. This commit
  adds a check to configure.ac for __builtin_ffsl if ffsl isn't found.
  __builtin_ffsl performs the same function as ffsl, and has the added benefit
  of being available on any platform that uses a GCC-compatible compiler. This
  change does not address the use of ffs in the MALLOCX_ARENA() macro.
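  The fallback can be as simple as a macro shim; a sketch, with illustrative
  configure symbols:

    /* If configure found no ffsl(3), fall back to the builtin, which has
     * the same contract: 1-based index of the least significant set bit,
     * or 0 when no bit is set. */
    #ifndef JEMALLOC_HAVE_FFSL
    #  ifdef JEMALLOC_HAVE_BUILTIN_FFSL
    #    define ffsl(x) __builtin_ffsl(x)
    #  else
    #    error "No ffsl(3) or __builtin_ffsl() available."
    #  endif
    #endif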
* Fix fallback lg_floor() implementations (Jason Evans, 2014-06-02; 1 file, -10/+16)

* Don't use msvc_compat's C99 headers with MSVC versions that have (some) C99
  support (Mike Hommey, 2014-06-02; 3 files, -0/+0)
* Use KQU() rather than QU() where applicable (Jason Evans, 2014-05-29; 2 files, -6/+6)
  Fix KZI() and KQI() to append LL rather than ULL.
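  For context, a hedged reconstruction of these helpers (the shapes are
  inferred from the message above, not copied from the source): the K*
  variants paste an integer-literal suffix on before casting, so 64-bit
  constants can be written portably.

    #define QU(q)  ((uint64_t)(q))
    #define QI(q)  ((int64_t)(q))
    #define KQU(q) QU(q##ULL)  /* unsigned 64-bit literal */
    #define KQI(q) QI(q##LL)   /* signed: LL, not ULL (the fix above) */

  Usage: KQU(18446744073709551615) rather than spelling out the ULL suffix at
  every call site.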
* Add size class computation capability (Jason Evans, 2014-05-29; 7 files, -62/+406)
  Add size class computation capability, currently used only as validation of
  the size class lookup tables. Generalize the size class spacing used for
  bins, for eventual use throughout the full range of allocation sizes.

* Move platform headers and tricks from jemalloc_internal.h.in to a new
  jemalloc_internal_decls.h header (Mike Hommey, 2014-05-28; 3 files, -56/+59)

* Move __func__ to jemalloc_internal_macros.h (Mike Hommey, 2014-05-27; 2 files, -1/+4)
  test/integration/aligned_alloc.c needs it.

* Use the ULL suffix instead of LLU for unsigned long longs (Mike Hommey, 2014-05-27; 1 file, -4/+4)
  MSVC only supports the former.
* Refactor huge allocation to be managed by arenas (Jason Evans, 2014-05-16; 11 files, -59/+35)
  Refactor huge allocation to be managed by arenas (though the global
  red-black tree of huge allocations remains for lookup during deallocation).
  This is the logical conclusion of recent changes that 1) made per-arena dss
  precedence apply to huge allocation, and 2) made it possible to replace the
  per-arena chunk allocation/deallocation functions.

  Remove the top-level huge stats, and replace them with per-arena huge stats.

  Normalize function names and types to *dalloc* (some were *dealloc*).

  Remove the --enable-mremap option. As jemalloc currently operates, this is
  a performance regression for some applications, but planned work to
  logarithmically space huge size classes should provide similar amortized
  performance. The motivation for this change was that mremap-based huge
  reallocation forced leaky abstractions that prevented refactoring.
* Add support for user-specified chunk allocators/deallocators (aravind, 2014-05-12; 7 files, -12/+33)
  Add new mallctl endpoints "arena.<i>.chunk.alloc" and
  "arena.<i>.chunk.dealloc" to allow userspace to configure jemalloc's chunk
  allocator and deallocator on a per-arena basis.
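  A hedged sketch of wiring in a custom allocator through the new endpoint;
  the hook's exact signature is an assumption based on the jemalloc API of
  this era, and my_chunk_alloc/install_hook are illustrative names:

    #include <stdbool.h>
    #include <stdio.h>
    #include <jemalloc/jemalloc.h>

    /* Assumed hook shape: return a chunk of at least `size` bytes with the
     * requested alignment, or NULL on failure. */
    static void *
    my_chunk_alloc(size_t size, size_t alignment, bool *zero, unsigned arena_ind)
    {
        /* e.g. carve the chunk out of a pre-reserved region... */
        (void)size; (void)alignment; (void)zero; (void)arena_ind;
        return NULL;  /* NULL tells jemalloc the hook failed */
    }

    static void
    install_hook(unsigned arena_ind)
    {
        char name[64];
        void *hook = (void *)my_chunk_alloc;

        snprintf(name, sizeof(name), "arena.%u.chunk.alloc", arena_ind);
        mallctl(name, NULL, NULL, &hook, sizeof(hook));
    }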
* Simplify backtracing (Jason Evans, 2014-04-23; 1 file, -4/+3)
  Simplify backtracing to not ignore any frames, and compensate for this in
  pprof in order to increase flexibility with respect to function-based
  refactoring even in the presence of non-deterministic inlining. Modify
  pprof to blacklist all jemalloc allocation entry points, including
  non-standard ones like mallocx(), and to ignore all allocator-internal
  frames. Prior to this change, pprof excluded the specifically blacklisted
  functions from backtraces, but it left allocator-internal frames intact.
* prof_backtrace: use unw_backtrace (Lucian Adrian Grijincu, 2014-04-23; 1 file, -2/+2)
  unw_backtrace:
  - does internal per-thread caching
  - doesn't acquire an internal lock
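  A minimal sketch of collecting frames this way (the buffer size and the
  surrounding function are illustrative):

    #define UNW_LOCAL_ONLY
    #include <libunwind.h>

    static void
    record_backtrace(void)
    {
        void *frames[128];
        int nframes = unw_backtrace(frames, 128);

        /* hand frames[0..nframes) to the profiler's bookkeeping */
        (void)nframes;
    }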
* Refactor small_size2bin and small_bin2size (Jason Evans, 2014-04-17; 4 files, -20/+52)
  Refactor small_size2bin and small_bin2size to be inline functions rather
  than directly accessed arrays.
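  The shape of such a refactor, sketched under assumptions (the table names
  and the 8-byte quantum are placeholders, not jemalloc's actual layout):

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    extern const uint8_t small_size2bin_tab[];
    extern const uint32_t small_bin2size_tab[];

    /* Callers now go through functions, so the backing representation can
     * change without touching call sites. */
    static inline size_t
    small_size2bin(size_t size)
    {
        assert(size > 0);
        return small_size2bin_tab[(size - 1) >> 3];
    }

    static inline size_t
    small_bin2size(size_t binind)
    {
        return small_bin2size_tab[binind];
    }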
* Fix debug-only compilation failures (Jason Evans, 2014-04-16; 1 file, -3/+2)
  Fix debug-only compilation failures introduced by changes to
  prof_sample_accum_update() in:

    6c39f9e059d0825f4c29d8cec9f318b798912c3c
    refactor profiling. only use a bytes till next sample variable.

* Merge pull request #73 from bmaurer/smallmalloc (Jason Evans, 2014-04-16; 5 files, -188/+81)
  Smaller malloc hot path. The following three commits came in via this merge:

  * Create a const array with only a small bin to size map (Ben Maurer, 2014-04-16; 4 files, -6/+8)

  * Refactor profiling: only use a bytes-till-next-sample variable (Ben Maurer, 2014-04-16; 2 files, -149/+70)

  * Outline rare tcache_get() codepaths (Ben Maurer, 2014-04-16; 2 files, -33/+3);
    a sketch of the outlining technique follows below.
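  A generic sketch of that outlining technique: keep the common case inline
  and push the miss path into a non-inlined helper so the hot function stays
  small. Names here are illustrative, not jemalloc's actual code.

    typedef struct tcache_s tcache_t;

    /* Out-of-line (defined in a .c file): lazy creation, teardown, etc. */
    tcache_t *tcache_get_hard(void);

    extern __thread tcache_t *tcache_tls;

    static inline tcache_t *
    tcache_get(void)
    {
        tcache_t *tcache = tcache_tls;

        if (__builtin_expect(tcache != NULL, 1))
            return tcache;          /* hot path: one TLS load, one branch */
        return tcache_get_hard();   /* rare path stays out of the icache */
    }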
* Optimize Valgrind integration (Jason Evans, 2014-04-15; 4 files, -86/+121)
  Forcefully disable tcache if running inside Valgrind, and remove Valgrind
  calls in tcache-specific code. Restructure Valgrind-related code to move
  most Valgrind calls out of the fast path functions. Take advantage of
  static knowledge to elide some branches in JEMALLOC_VALGRIND_REALLOC().

* Remove the "opt.valgrind" mallctl (Jason Evans, 2014-04-15; 2 files, -5/+6)
  Remove the "opt.valgrind" mallctl because it is unnecessary -- jemalloc
  automatically detects whether it is running inside Valgrind.

* Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug (Jason Evans, 2014-04-15; 3 files, -5/+2)
  Make dss non-optional on all platforms which support sbrk(2). Fix the
  "arena.<i>.dss" mallctl to return an error if "primary" or "secondary"
  precedence is specified but sbrk(2) is not supported.
* Remove the *allocm() API, which is superseded by the *allocx() API (Jason Evans, 2014-04-15; 5 files, -34/+0)
* Remove support for non-prof-promote heap profiling metadata (Jason Evans, 2014-04-11; 5 files, -76/+18)
  Make promotion of sampled small objects to large objects mandatory, so that
  profiling metadata can always be stored in the chunk map, rather than
  requiring one pointer per small region in each small-region page run. In
  practice the non-prof-promote code was only useful when using jemalloc to
  track all objects and report them as leaks at program exit. However,
  Valgrind is at least as good a tool for this particular use case.
  Furthermore, the non-prof-promote code is getting in the way of some
  optimizations that will make heap profiling much cheaper for the
  predominant use case (sampling a small representative proportion of all
  allocations).
* Don't dereference chunk->arena in free() hot path (Ben Maurer, 2014-04-05; 2 files, -8/+5)
  When you call free(), chunk->arena is loaded even though that data isn't
  used on the tcache hot path. In profiling some FB applications, I found
  that ~30% of the dTLB misses in the free() function come from this line.
  With 4 MB chunks, arena_chunk_t->map is ~32 KB (1024 pages in the chunk,
  four 8-byte pointers per arena_chunk_map_t). This means there's only a 1/8
  chance of the page containing chunk->arena also containing the map bits.
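  The arithmetic behind those numbers, worked out (assuming 4 KiB pages):
  4 MB / 4 KiB = 1024 map entries; 1024 * (4 * 8) bytes = 32 KB of map, which
  spans 8 pages. The chunk header (including chunk->arena) shares a page with
  only the first of those 8 map pages, hence the 1/8 chance.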
* Add private namespace mangling for huge_dss_prec_get() (Jason Evans, 2014-03-31; 1 file, -0/+1)

* Adapt hash tests to big-endian systems (Jason Evans, 2014-03-30; 2 files, -1/+4)
  The hash code, which has MurmurHash3 at its core, generates different
  output depending on system endianness, so adapt the expected output on
  big-endian systems. The MurmurHash3 code also assumes that unaligned access
  is okay (not true on all systems), but jemalloc only hashes data structures
  that have sufficient alignment to dodge this limitation.

* Use arena dss prec instead of default for huge allocs (Max Wang, 2014-03-28; 2 files, -8/+10)
  Pass a dss_prec_t parameter to huge_{m,p,r}alloc instead of defaulting to
  the chunk dss prec.
* Add workaround for missing 'restrict' keyword (Jason Evans, 2014-02-25; 2 files, -0/+7)
  Add a cpp #define that removes 'restrict' keyword usage unless the compiler
  definitely supports C99. As written, 'restrict' is only enabled if the
  compiler supports the -std=gnu99 option (e.g. gcc and llvm).

  Reported by Tobias Hieta.
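  A plausible shape for that workaround (JEMALLOC_HAS_RESTRICT stands in for
  whatever symbol configure actually defines):

    /* If the toolchain was not confirmed to accept C99, make 'restrict'
     * expand to nothing so declarations using it still compile. */
    #ifndef JEMALLOC_HAS_RESTRICT
    #  define restrict
    #endif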
* Avoid a compiler warning (Jason Evans, 2014-01-29; 1 file, -1/+5)
  Avoid copying "jeprof" to a 1-byte buffer within prof_boot0() when heap
  profiling is disabled. Although this is dead code under such conditions,
  the compiler doesn't figure that part out.

  Reported by Eduardo Silva.

* Remove __FBSDID from rb.h (Jason Evans, 2014-01-22; 1 file, -4/+0)

* Add heap profiling tests (Jason Evans, 2014-01-17; 2 files, -0/+7)
  Fix a regression in prof_dump_ctx() due to an uninitialized variable. This
  was caused by revision 4f37ef693e3d5903ce07dc0b61c0da320b35e3d9, so no
  releases are affected.
* Fix a variable prototype/definition mismatch (Jason Evans, 2014-01-17; 1 file, -1/+6)

* Fix name mangling for stress tests (Jason Evans, 2014-01-17; 9 files, -145/+76)
  Fix stress tests such that testlib code uses the jet_ allocator, but test
  code uses libjemalloc.

  Generate jemalloc_{rename,mangle}.h, the former because it's needed for the
  stress test name mangling fix, and the latter for consistency.

  As an artifact of this change, some (but not all) definitions related to
  the experimental API are absent from the headers unless the feature is
  enabled at configure time.

* Refactor prof_dump() to reduce contention (Jason Evans, 2014-01-16; 1 file, -0/+5)
  Refactor prof_dump() to use a two-pass algorithm, and prof_leave() prior to
  the second pass. This avoids write(2) system calls while holding critical
  prof resources.

  Fix prof_dump() to close the dump file descriptor for all relevant error
  paths.

  Minimize the size of prof-related static buffers when prof is disabled.
  This saves roughly 65 KiB of application memory for non-prof builds.

  Refactor prof_ctx_init() out of prof_lookup_global().

* Refactor overly large/complex functions (Jason Evans, 2014-01-15; 1 file, -0/+1)
  Refactor overly large functions by breaking out helper functions. Refactor
  overly complex multi-purpose functions into separate, more specific
  functions.

* Extract profiling code from [re]allocation functions (Jason Evans, 2014-01-12; 4 files, -56/+76)
  Extract profiling code from malloc(), imemalign(), calloc(), realloc(),
  mallocx(), rallocx(), and xallocx(). This slightly reduces the amount of
  code compiled into the fast paths, but the primary benefit is the
  combinatorial complexity reduction.

  Simplify iralloc[t]() by creating a separate ixalloc() that handles the
  no-move cases.

  Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to make
  request size overflows due to size class and/or alignment constraints
  trigger undefined behavior (detected by debug-only assertions). Report
  ENOMEM rather than EINVAL if an OOM occurs during heap profiling backtrace
  creation in imemalign(). This bug impacted posix_memalign() and
  aligned_alloc().

* Add junk/zero filling unit tests, and fix discovered bugs (Jason Evans, 2014-01-08; 3 files, -2/+20)
  Fix growing large reallocation to junk fill new space. Fix huge
  deallocation to junk fill when munmap is disabled.

* Add util unit tests, and fix discovered bugs (Jason Evans, 2014-01-07; 1 file, -1/+2)
  Add unit tests for pow2_ceil(), malloc_strtoumax(), and malloc_snprintf().

  Fix numerous bugs in malloc_strtoumax() error handling/reporting. These
  bugs could have caused application-visible issues for some seldom used
  (0X... and 0... prefixes) or malformed MALLOC_CONF or mallctl() argument
  strings, but otherwise they had no impact.

  Fix numerous bugs in malloc_snprintf(). These bugs were not exercised by
  existing malloc_*printf() calls, so they had no impact.
* Convert rtree from (void *) to (uint8_t) storage (Jason Evans, 2014-01-03; 2 files, -19/+21)
  Reduce rtree memory usage by storing booleans (1 byte each) rather than
  pointers. The rtree code is only used to record whether jemalloc manages a
  chunk of memory, so there's no need to store pointers in the rtree.

  Increase rtree node size to 64 KiB in order to reduce tree depth from 13 to
  3 on 64-bit systems. The conversion to more compact leaf nodes was enough
  by itself to make the rtree depth 1 on 32-bit systems; because root nodes
  are made smaller than the specified node size where possible, the node size
  change has no impact on 32-bit systems (assuming default chunk size).
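  Why depth 3 works out on 64-bit (a hedged derivation, assuming the default
  4 MiB chunk size): keys are chunk addresses, so there are 64 - lg(4 MiB) =
  64 - 22 = 42 key bits to consume. A 64 KiB interior node of 8-byte pointers
  resolves 2^13 entries (13 bits), while a 64 KiB leaf of 1-byte booleans
  resolves 2^16 entries (16 bits); 13 + 13 + 16 = 42 bits, i.e. two interior
  levels plus one leaf level.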
* Add rtree unit tests (Jason Evans, 2014-01-03; 2 files, -4/+11)

* Add missing prototypes (Jason Evans, 2013-12-17; 1 file, -2/+7)

* Add quarantine unit tests (Jason Evans, 2013-12-17; 2 files, -0/+8)
  Verify that freed regions are quarantined, and that redzone corruption is
  detected. Introduce a testing idiom for intercepting/replacing internal
  functions. In this case the replaced function is ordinarily a static
  function, but the idiom should work similarly for library-private
  functions.
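  A generic sketch of that interception idiom (names are illustrative): the
  build routes an internal call through a writable function pointer, and the
  test swaps in a wrapper.

    typedef void (quarantine_fn)(void *ptr);

    static void
    quarantine_impl(void *ptr)
    {
        /* real quarantine behavior lives here */
        (void)ptr;
    }

    /* Library-private hook; ordinarily nothing ever reassigns it. */
    quarantine_fn *quarantine = quarantine_impl;

    /* In the test binary: */
    static int ncalls;

    static void
    quarantine_intercept(void *ptr)
    {
        ncalls++;              /* record that the call happened... */
        quarantine_impl(ptr);  /* ...then defer to the real code */
    }

    /* test setup:    quarantine = quarantine_intercept;  */
    /* test teardown: quarantine = quarantine_impl;       */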
* Add hash (MurmurHash3) tests (Jason Evans, 2013-12-17; 1 file, -1/+0)
  Add hash tests that are based on SMHasher's VerificationTest() function.

* Finish arena_prof_ctx_set() optimization (Jason Evans, 2013-12-16; 1 file, -7/+7)
  Delay reading the mapbits until it's unavoidable.
* Don't junk-fill reallocations unless usize changes (Jason Evans, 2013-12-16; 1 file, -0/+1)
  Don't junk fill reallocations for which the request size is less than the
  current usable size, but not enough smaller to cause a size class change.
  Unlike malloc()/calloc()/realloc(), the *allocx() functions contractually
  treat the full usize as the allocation: a caller can ask for zeroed memory
  via mallocx() and a series of rallocx() calls that all specify
  MALLOCX_ZERO, and be assured that all newly allocated bytes will be zeroed
  and made available to the application, without danger of allocator
  mutation, until the size class decreases enough to cause a usize reduction.
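  What that contract permits, as a usage sketch (error handling elided for
  brevity):

    #include <jemalloc/jemalloc.h>

    int
    main(void)
    {
        /* Every byte of the full usable size arrives zeroed... */
        void *p = mallocx(100, MALLOCX_ZERO);

        /* ...and stays zeroed across growth, including the new bytes. */
        p = rallocx(p, 200, MALLOCX_ZERO);
        p = rallocx(p, 4096, MALLOCX_ZERO);

        dallocx(p, 0);
        return 0;
    }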