path: root/doc
Commit message | Author | Age | Files | Lines
...
* Add missing header includes in jemalloc/jemalloc.h. | Jason Evans | 2014-10-05 | 1 | -2/+1
  Add stdlib.h, stdbool.h, and stdint.h to jemalloc/jemalloc.h so that applications only have to #include <jemalloc/jemalloc.h>. This resolves #132.
* Implement/test/fix prof-related mallctl's. | Jason Evans | 2014-10-04 | 1 | -6/+46
  Implement/test/fix the opt.prof_thread_active_init, prof.thread_active_init, and thread.prof.active mallctl's. Test/fix the thread.prof.name mallctl. Refactor opt_prof_active to be read-only and move mutable state into the prof_active variable. Stop leaning on ctl-related locking for protection.
* Test prof.reset mallctl and fix numerous discovered bugs. | Jason Evans | 2014-10-03 | 1 | -2/+3
* Add support for sized deallocation. | Daniel Micay | 2014-09-09 | 1 | -1/+18
  This adds a new `sdallocx` function to the external API, allowing the size to be passed by the caller. It avoids some extra reads in the thread cache fast path. In the case where stats are enabled, this avoids the work of calculating the size from the pointer.

  An assertion validates the size that's passed in, so enabling debugging will allow users of the API to debug cases where an incorrect size is passed in.

  The performance win for a contrived microbenchmark doing an allocation and immediately freeing it is ~10%. It may have a different impact on a real workload.

  Closes #28
* Implement per thread heap profiling. | Jason Evans | 2014-08-20 | 1 | -1/+55
  Rename data structures (prof_thr_cnt_t-->prof_tctx_t, prof_ctx_t-->prof_gctx_t), and convert to storing a prof_tctx_t for sampled objects.

  Convert PROF_ALLOC_PREP() to prof_alloc_prep(), since precise backtrace depth within jemalloc functions is no longer an issue (pprof prunes irrelevant frames).

  Implement mallctl's:
  - prof.reset implements full sample data reset, and optional change of sample interval.
  - prof.lg_sample reads the current sample interval (opt.lg_prof_sample was the permanent source of truth prior to prof.reset).
  - thread.prof.name provides naming capability for threads within heap profile dumps.
  - thread.prof.active makes it possible to activate/deactivate heap profiling for individual threads.

  Modify the heap dump files to contain per thread heap profile data. This change is incompatible with the existing pprof, which will require enhancements to read and process the enriched data.
* Minor doc edit. | Jason Evans | 2014-05-16 | 1 | -4/+4
* Refactor huge allocation to be managed by arenas. | Jason Evans | 2014-05-16 | 1 | -63/+65
  Refactor huge allocation to be managed by arenas (though the global red-black tree of huge allocations remains for lookup during deallocation). This is the logical conclusion of recent changes that 1) made per arena dss precedence apply to huge allocation, and 2) made it possible to replace the per arena chunk allocation/deallocation functions.

  Remove the top level huge stats, and replace them with per arena huge stats.

  Normalize function names and types to *dalloc* (some were *dealloc*).

  Remove the --enable-mremap option. As jemalloc currently operates, this is a performance regression for some applications, but planned work to logarithmically space huge size classes should provide similar amortized performance. The motivation for this change was that mremap-based huge reallocation forced leaky abstractions that prevented refactoring.
* Add support for user-specified chunk allocators/deallocators. | aravind | 2014-05-12 | 1 | -0/+63
  Add new mallctl endpoints "arena.<i>.chunk.alloc" and "arena.<i>.chunk.dealloc" to allow userspace to configure jemalloc's chunk allocator and deallocator on a per-arena basis.
* Optimize Valgrind integration. | Jason Evans | 2014-04-15 | 1 | -1/+2
  Forcefully disable tcache if running inside Valgrind, and remove Valgrind calls in tcache-specific code. Restructure Valgrind-related code to move most Valgrind calls out of the fast path functions. Take advantage of static knowledge to elide some branches in JEMALLOC_VALGRIND_REALLOC().
* Remove the "opt.valgrind" mallctl. | Jason Evans | 2014-04-15 | 1 | -13/+0
  Remove the "opt.valgrind" mallctl because it is unnecessary: jemalloc automatically detects whether it is running inside Valgrind.
* Remove the "arenas.purge" mallctl. | Jason Evans | 2014-04-15 | 1 | -11/+1
  Remove the "arenas.purge" mallctl, which was obsoleted by the "arena.<i>.purge" mallctl in 3.1.0.
* Make dss non-optional, and fix an "arena.<i>.dss" mallctl bug. | Jason Evans | 2014-04-15 | 1 | -16/+13
  Make dss non-optional on all platforms which support sbrk(2). Fix the "arena.<i>.dss" mallctl to return an error if "primary" or "secondary" precedence is specified, but sbrk(2) is not supported.
* Update MALLOCX_ARENA() documentation. | Jason Evans | 2014-04-15 | 1 | -4/+4
  Update MALLOCX_ARENA() documentation to no longer claim that it has no effect for huge region allocations.
* Remove the *allocm() API, which is superseded by the *allocx() API. | Jason Evans | 2014-04-15 | 1 | -189/+2
* Document how dss precedence affects huge allocation. | Jason Evans | 2014-03-31 | 1 | -2/+6
* Extract profiling code from [re]allocation functions. | Jason Evans | 2014-01-12 | 1 | -10/+16
  Extract profiling code from malloc(), imemalign(), calloc(), realloc(), mallocx(), rallocx(), and xallocx(). This slightly reduces the amount of code compiled into the fast paths, but the primary benefit is the combinatorial complexity reduction.

  Simplify iralloc[t]() by creating a separate ixalloc() that handles the no-move cases.

  Further simplify [mrxn]allocx() (and by implication [mrn]allocm()) to make request size overflows due to size class and/or alignment constraints trigger undefined behavior (detected by debug-only assertions).

  Report ENOMEM rather than EINVAL if an OOM occurs during heap profiling backtrace creation in imemalign(). This bug impacted posix_memalign() and aligned_alloc().
* Fix a few mallctl() documentation errors. | Jason Evans | 2013-12-20 | 1 | -17/+20
  Normalize mallctl() order (code and documentation).
* Add mallctl*() unit tests. | Jason Evans | 2013-12-20 | 1 | -3/+2
* Remove ENOMEM from the documented set of *mallctl() errors. | Jason Evans | 2013-12-18 | 1 | -6/+0
  *mallctl() always returns EINVAL and does partial result copying when *oldlenp is too short to hold the requested value, rather than returning ENOMEM. Therefore remove ENOMEM from the documented set of possible errors.
* Implement the *allocx() API. | Jason Evans | 2013-12-13 | 1 | -47/+201
  Implement the *allocx() API, which is a successor to the *allocm() API. The *allocx() functions are slightly simpler to use because they have fewer parameters, they directly return the results of primary interest, and mallocx()/rallocx() avoid the strict aliasing pitfall that allocm()/rallocm() share with posix_memalign(). The following code violates strict aliasing rules:

      foo_t *foo;
      allocm((void **)&foo, NULL, 42, 0);

  whereas the following is safe:

      foo_t *foo;
      void *p;
      allocm(&p, NULL, 42, 0);
      foo = (foo_t *)p;

  mallocx() does not have this problem:

      foo_t *foo = (foo_t *)mallocx(42, 0);
* Fix ALLOCM_ARENA(a) handling in rallocm(). | Jason Evans | 2013-11-26 | 1 | -4/+6
  Fix rallocm() to use the specified arena for allocation, not just deallocation. Clarify ALLOCM_ARENA(a) documentation.
* Add ids for all mallctl entries. | Jason Evans | 2013-10-30 | 1 | -69/+69
  Add ids for all mallctl entries, so that external documents can link to arbitrary mallctl entries.
* Clarify how to use malloc_conf. | Jason Evans | 2013-03-19 | 1 | -1/+8
  Clarify that malloc_conf is intended only for compile-time configuration, since jemalloc may be initialized before main() is entered.
* Add clipping support to lg_chunk option processing. | Jason Evans | 2012-12-23 | 1 | -2/+5
  Modify processing of the lg_chunk option so that it clips an out-of-range input to the edge of the valid range. This makes it possible to request the minimum possible chunk size without intimate knowledge of allocator internals.

  Submitted by Ian Lepore (see FreeBSD PR bin/174641).
* document what stats.active does not track | Jan Beich | 2012-11-07 | 1 | -2/+4
  Based on http://www.canonware.com/pipermail/jemalloc-discuss/2012-March/000164.html
* Purge unused dirty pages in a fragmentation-reducing order. | Jason Evans | 2012-11-06 | 1 | -1/+1
  Purge unused dirty pages in an order that first performs clean/dirty run defragmentation, in order to mitigate available run fragmentation.

  Remove the limitation that prevented purging unless at least one chunk worth of dirty pages had accumulated in an arena. This limitation was intended to avoid excessive purging for small applications, but the threshold was arbitrary, and the effect of questionable utility.

  Relax opt_lg_dirty_mult from 5 to 3. This compensates for increased likelihood of allocating clean runs, given the same ratio of clean:dirty runs, and reduces the potential for repeated purging in pathological large malloc/free loops that push the active:dirty page ratio just over the purge threshold.
* Add arena-specific and selective dss allocation. | Jason Evans | 2012-10-13 | 1 | -9/+80
  Add the "arenas.extend" mallctl, so that it is possible to create new arenas that are outside the set that jemalloc automatically multiplexes threads onto.

  Add the ALLOCM_ARENA() flag for {,r,d}allocm(), so that it is possible to explicitly allocate from a particular arena.

  Add the "opt.dss" mallctl, which controls the default precedence of dss allocation relative to mmap allocation.

  Add the "arena.<i>.dss" mallctl, which makes it possible to set the default dss precedence on a per arena or global basis.

  Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".

  Add the "stats.arenas.<i>.dss" mallctl.
* Disable tcache by default if running inside Valgrind. | Jason Evans | 2012-05-16 | 1 | -1/+2
  Disable tcache by default if running inside Valgrind, in order to avoid making unallocated objects appear reachable to Valgrind.
* Auto-detect whether running inside Valgrind. | Jason Evans | 2012-05-15 | 1 | -16/+11
  Auto-detect whether running inside Valgrind, thus removing the need to manually specify MALLOC_CONF=valgrind:true.
* Generalize "stats.mapped" documentation. | Jason Evans | 2012-05-10 | 1 | -2/+2
  Generalize "stats.mapped" documentation to state that all inactive chunks are omitted, now that it is possible for mmap'ed chunks to be omitted in addition to DSS chunks.
* Add the --enable-mremap option. | Jason Evans | 2012-05-09 | 1 | -0/+10
  Add the --enable-mremap option, and disable the use of mremap(2) by default, for the same reason that freeing chunks via munmap(2) is disabled by default on Linux: semi-permanent VM map fragmentation.
* Fix Valgrind URL in documentation. | Jason Evans | 2012-04-26 | 1 | -20/+20
  Reported by Daichi GOTO.
* Fix a memory corruption bug in chunk_alloc_dss(). | Jason Evans | 2012-04-21 | 1 | -2/+2
  Fix a memory corruption bug in chunk_alloc_dss() that was due to claiming newly allocated memory is zeroed.

  Reverse order of preference between mmap() and sbrk() to prefer mmap().

  Clean up management of 'zero' parameter in chunk_alloc*().
* Update prof defaults to match common usage. | Jason Evans | 2012-04-17 | 1 | -17/+28
  Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).

  Change the "opt.prof_accum" default from true to false.

  Add the "opt.prof_final" mallctl, so that "opt.prof_prefix" need not be abused to disable final profile dumping.
* Update pprof (from gperftools 2.0). | Jason Evans | 2012-04-17 | 1 | -1/+1
* Add the --disable-munmap option. | Jason Evans | 2012-04-17 | 1 | -0/+10
  Add the --disable-munmap option, remove the configure test that attempted to detect the VM allocation quirk known to exist on Linux x86[_64], and make --disable-munmap implicit on Linux.
* Always disable redzone by default. | Jason Evans | 2012-04-13 | 1 | -3/+1
  Always disable redzone by default, even when --enable-debug is specified. The memory overhead for redzones can be substantial, which makes this feature something that should only be opted into.
* Implement Valgrind support, redzones, and quarantine. | Jason Evans | 2012-04-11 | 1 | -4/+75
  Implement Valgrind support, as well as the redzone and quarantine features, which help Valgrind detect memory errors. Redzones are only implemented for small objects because the changes necessary to support redzones around large and huge objects are complicated by in-place reallocation, to the point that it isn't clear that the maintenance burden is worth the incremental improvement to Valgrind support.

  Merge arena_salloc() and arena_salloc_demote().

  Refactor i[v]salloc() to expose the 'demote' option.
* Add utrace(2)-based tracing (--enable-utrace). | Jason Evans | 2012-04-05 | 1 | -0/+25
* Remove obsolete "config.dynamic_page_shift" mallctl documentation. | Jason Evans | 2012-04-03 | 1 | -10/+0
* Clean up *PAGE* macros. | Jason Evans | 2012-04-02 | 1 | -10/+1
  s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g.

  Remove remnants of the dynamic-page-shift code.

  Rename the "arenas.pagesize" mallctl to "arenas.page".

  Remove the "arenas.chunksize" mallctl, which is redundant with "opt.lg_chunk".
* Add the "thread.tcache.enabled" mallctl. | Jason Evans | 2012-03-27 | 1 | -0/+14
* Fix various documentation formatting regressions. | Jason Evans | 2012-03-19 | 1 | -18/+20
* Rename the "tcache.flush" mallctl to "thread.tcache.flush". | Jason Evans | 2012-03-17 | 1 | -18/+18
* Implement aligned_alloc(). | Jason Evans | 2012-03-13 | 1 | -0/+35
  Implement aligned_alloc(), which was added in the C11 standard. The function is weakly specified to the point that a minimally compliant implementation would be painful to use (size must be an integral multiple of alignment!), which in practice makes posix_memalign() a safer choice.
* Remove the lg_tcache_gc_sweep option. | Jason Evans | 2012-03-05 | 1 | -19/+1
  Remove the lg_tcache_gc_sweep option, because it is no longer very useful. Prior to the addition of dynamic adjustment of tcache fill count, it was possible for fill/flush overhead to be a problem, but this problem no longer occurs.
* Add the --disable-experimental option. | Jason Evans | 2012-03-03 | 1 | -1/+3
* Add nallocm(). | Jason Evans | 2012-02-29 | 1 | -8/+30
  Add nallocm(), which computes the real allocation size that would result from the corresponding allocm() call. nallocm() is a functional superset of OS X's malloc_good_size(), in that it takes alignment constraints into account.
* Remove the sysv option. | Jason Evans | 2012-02-29 | 1 | -26/+0
* Simplify small size class infrastructure. | Jason Evans | 2012-02-29 | 1 | -175/+23
  Program-generate small size class tables for all valid combinations of LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the appropriate table to generate all relevant data structures, and remove the distinction between tiny/quantum/cacheline/subpage bins.

  Remove --enable-dynamic-page-shift. This option didn't prove useful in practice, and it prevented optimizations.

  Add Tilera architecture support.