path: root/src/stats.c
Commit message | Author | Age | Files | Lines
* Fix MinGW-related portability issues. | Jason Evans | 2015-07-23 | 1 | -45/+44
  Create and use FMT* macros that are equivalent to the PRI* macros that inttypes.h defines. This allows uniform use of the Unix-specific format specifiers, e.g. "%zu", as well as avoiding Windows-specific definitions of e.g. PRIu64.

  Add ffs()/ffsl() support for compiling with gcc.

  Extract compatibility definitions of ENOENT, EINVAL, EAGAIN, EPERM, ENOMEM, and ERANGE into include/msvc_compat/windows_extra.h and use the file for tests as well as for core jemalloc code.
* Fix MinGW build warnings. | Jason Evans | 2015-07-08 | 1 | -46/+49
  Conditionally define ENOENT, EINVAL, etc. (was unconditional).

  Add/use PRIzu, PRIzd, and PRIzx for use in malloc_printf() calls. gcc issued (harmless) warnings since e.g. "%zu" should be "%Iu" on Windows, and the alternative to this workaround would have been to disable the function attributes which cause gcc to look for type mismatches in formatted printing function calls.
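The format-macro workaround described above can be sketched roughly as follows. This is only an illustration, not jemalloc's definition: the macro name `EX_PRIzu` and the helper `ex_format_size()` are hypothetical, and the sketch uses the C library's snprintf() rather than malloc_printf().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical macro name. "Iu" is the Windows spelling the commit message
 * cites for size_t; "zu" is the C99 spelling used elsewhere.
 */
#ifdef _WIN32
#  define EX_PRIzu "Iu"
#else
#  define EX_PRIzu "zu"
#endif

/* Writes nbytes as decimal text into buf; returns snprintf()'s result. */
static int
ex_format_size(char *buf, size_t buflen, size_t nbytes)
{
	return snprintf(buf, buflen, "%" EX_PRIzu, nbytes);
}
```

Because the specifier is assembled by string-literal concatenation, the call site stays the same on every platform and the compiler's format checking can remain enabled.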
* Add the "stats.arenas.<i>.lg_dirty_mult" mallctl. | Jason Evans | 2015-03-24 | 1 | -10/+1
* Add the "stats.allocated" mallctl. | Jason Evans | 2015-03-24 | 1 | -3/+5
* Fix a compile error caused by mixed declarations and code. | Qinfan Wu | 2015-03-21 | 1 | -2/+3
* Fix lg_dirty_mult-related stats printing. | Jason Evans | 2015-03-21 | 1 | -66/+82
  This regression was introduced by 8d6a3e8321a7767cb2ca0930b85d5d488a8cc659 (Implement dynamic per arena control over dirty page purging.).

  This resolves #215.
* Implement dynamic per arena control over dirty page purging. | Jason Evans | 2015-03-19 | 1 | -0/+10
  Add mallctls:
  - arenas.lg_dirty_mult is initialized via opt.lg_dirty_mult, and can be modified to change the initial lg_dirty_mult setting for newly created arenas.
  - arena.<i>.lg_dirty_mult controls an individual arena's dirty page purging threshold, and synchronously triggers any purging that may be necessary to maintain the constraint.
  - arena.<i>.chunk.purge allows the per arena dirty page purging function to be replaced.

  This resolves #93.
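As a rough illustration of the purging threshold mentioned above, the check might look like the sketch below. It is not taken from jemalloc's source: `ex_needs_purge()` is a hypothetical helper, and it assumes that lg_dirty_mult is the base-2 logarithm of the minimum active-to-dirty page ratio and that a negative value disables purging. Any additional floor the allocator may apply to the threshold is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not jemalloc's code. Assumes lg_dirty_mult is the
 * base-2 log of the minimum active:dirty page ratio, that a negative value
 * disables purging, and that lg_dirty_mult is smaller than the bit width
 * of size_t.
 */
static bool
ex_needs_purge(size_t nactive, size_t ndirty, int lg_dirty_mult)
{
	size_t threshold;

	if (lg_dirty_mult < 0)
		return false;
	/* Dirty pages tolerated before purging: nactive / 2^lg_dirty_mult. */
	threshold = nactive >> lg_dirty_mult;
	return ndirty > threshold;
}
```

Under these assumptions, an arena with 1024 active pages and lg_dirty_mult of 3 tolerates up to 128 dirty pages before a purge is due.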
* Move centralized chunk management into arenas. | Jason Evans | 2015-02-12 | 1 | -12/+0
  Migrate all centralized data structures related to huge allocations and recyclable chunks into arena_t, so that each arena can manage huge allocations and recyclable virtual memory completely independently of other arenas.

  Add chunk node caching to arenas, in order to avoid contention on the base allocator.

  Use chunks_rtree to look up huge allocations rather than a red-black tree. Maintain a per arena unsorted list of huge allocations (which will be needed to enumerate huge allocations during arena reset).

  Remove the --enable-ivsalloc option, make ivsalloc() always available, and use it for size queries if --enable-debug is enabled. The only practical implications to this removal are that 1) ivsalloc() is now always available during live debugging (and the underlying radix tree is available during core-based debugging), and 2) size query validation can no longer be enabled independent of --enable-debug.

  Remove the stats.chunks.{current,total,high} mallctls, and replace their underlying statistics with simpler atomically updated counters used exclusively for gdump triggering. These statistics are no longer very useful because each arena manages chunks independently, and per arena statistics provide similar information.

  Simplify chunk synchronization code, now that base chunk allocation cannot cause recursive lock acquisition.
* Implement metadata statistics. | Jason Evans | 2015-01-24 | 1 | -3/+11
  There are three categories of metadata:
  - Base allocations are used for bootstrap-sensitive internal allocator data structures.
  - Arena chunk headers comprise pages which track the states of the non-metadata pages.
  - Internal allocations differ from application-originated allocations in that they are for internal use, and that they are omitted from heap profiles.

  The metadata statistics comprise the metadata categories as follows:
  - stats.metadata: All metadata -- base + arena chunk headers + internal allocations.
  - stats.arenas.<i>.metadata.mapped: Arena chunk headers.
  - stats.arenas.<i>.metadata.allocated: Internal allocations. This is reported separately from the other metadata statistics because it overlaps with the allocated and active statistics, whereas the other metadata statistics do not.

  Base allocations are not reported separately, though their magnitude can be computed by subtracting the arena-specific metadata.

  This resolves #163.
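The subtraction described above can be written out as a small sketch. The struct and function names are hypothetical, and the values are assumed to have already been read from the mallctls named in the commit message; the mallctl calls themselves are not shown.

```c
#include <stddef.h>

/* Per-arena metadata figures (hypothetical container type). */
typedef struct {
	size_t mapped;		/* stats.arenas.<i>.metadata.mapped */
	size_t allocated;	/* stats.arenas.<i>.metadata.allocated */
} ex_arena_metadata_t;

/*
 * Base allocations = stats.metadata minus all arena-specific metadata
 * (arena chunk headers plus internal allocations).
 */
static size_t
ex_base_metadata(size_t total_metadata, const ex_arena_metadata_t *arenas,
    unsigned narenas)
{
	size_t arena_sum = 0;
	unsigned i;

	for (i = 0; i < narenas; i++)
		arena_sum += arenas[i].mapped + arenas[i].allocated;
	return total_metadata - arena_sum;
}
```

For example, with a total of 20000 bytes and two arenas reporting 4096+512 and 8192+1024 bytes, the base share comes out to 6176 bytes.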
* Use the correct type for opt.junk when printing stats. | Guilherme Goncalves | 2015-01-23 | 1 | -1/+1
* Add small run utilization to stats output. | Jason Evans | 2014-10-15 | 1 | -16/+34
  Add the 'util' column, which reports the proportion of available regions that are currently in use for each small size class. Small run utilization is the complement of external fragmentation. For example, utilization of 0.75 indicates that 25% of small run memory is consumed by external fragmentation, in other (more obtuse) words, 33% external fragmentation overhead.

  This resolves #27.
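The arithmetic behind the 0.75 / 25% / 33% example can be checked with a short sketch. The function names are hypothetical, and treating a size class with no available regions as fully utilized is an assumption made here only to avoid dividing by zero.

```c
#include <stddef.h>

/* Proportion of available regions currently in use. */
static double
ex_utilization(size_t curregs, size_t availregs)
{
	if (availregs == 0)
		return 1.0;	/* Assumption: nothing available, nothing wasted. */
	return (double)curregs / (double)availregs;
}

/* Share of small run memory lost to external fragmentation. */
static double
ex_fragmentation_share(double util)
{
	return 1.0 - util;
}

/* Fragmentation expressed relative to the memory in use; util must be > 0. */
static double
ex_fragmentation_overhead(double util)
{
	return (1.0 - util) / util;
}
```

With 75 of 100 regions in use, utilization is 0.75, the fragmentation share is 0.25, and the overhead is 0.25 / 0.75, or about 0.33.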
* Add per size class huge allocation statistics. | Jason Evans | 2014-10-13 | 1 | -81/+134
  Add per size class huge allocation statistics, and normalize various stats:
  - Change the arenas.nlruns type from size_t to unsigned.
  - Add the arenas.nhchunks and arenas.hchunks.<i>.size mallctl's.
  - Replace the stats.arenas.<i>.bins.<j>.allocated mallctl with stats.arenas.<i>.bins.<j>.curregs.
  - Add the stats.arenas.<i>.hchunks.<j>.nmalloc, stats.arenas.<i>.hchunks.<j>.ndalloc, stats.arenas.<i>.hchunks.<j>.nrequests, and stats.arenas.<i>.hchunks.<j>.curhchunks mallctl's.
* Implement/test/fix prof-related mallctl's. | Jason Evans | 2014-10-04 | 1 | -14/+19
  Implement/test/fix the opt.prof_thread_active_init, prof.thread_active_init, and thread.prof.active mallctl's.

  Test/fix the thread.prof.name mallctl.

  Refactor opt_prof_active to be read-only and move mutable state into the prof_active variable. Stop leaning on ctl-related locking for protection.
* Convert to uniform style: cond == false --> !cond | Jason Evans | 2014-10-03 | 1 | -1/+1
* Implement per thread heap profiling. | Jason Evans | 2014-08-20 | 1 | -1/+1
  Rename data structures (prof_thr_cnt_t-->prof_tctx_t, prof_ctx_t-->prof_gctx_t), and convert to storing a prof_tctx_t for sampled objects.

  Convert PROF_ALLOC_PREP() to prof_alloc_prep(), since precise backtrace depth within jemalloc functions is no longer an issue (pprof prunes irrelevant frames).

  Implement mallctl's:
  - prof.reset implements full sample data reset, and optional change of sample interval.
  - prof.lg_sample reads the current sample interval (opt.lg_prof_sample was the permanent source of truth prior to prof.reset).
  - thread.prof.name provides naming capability for threads within heap profile dumps.
  - thread.prof.active makes it possible to activate/deactivate heap profiling for individual threads.

  Modify the heap dump files to contain per thread heap profile data. This change is incompatible with the existing pprof, which will require enhancements to read and process the enriched data.
* Refactor huge allocation to be managed by arenas. | Jason Evans | 2014-05-16 | 1 | -16/+13
  Refactor huge allocation to be managed by arenas (though the global red-black tree of huge allocations remains for lookup during deallocation). This is the logical conclusion of recent changes that 1) made per arena dss precedence apply to huge allocation, and 2) made it possible to replace the per arena chunk allocation/deallocation functions.

  Remove the top level huge stats, and replace them with per arena huge stats.

  Normalize function names and types to *dalloc* (some were *dealloc*).

  Remove the --enable-mremap option. As jemalloc currently operates, this is a performance regression for some applications, but planned work to logarithmically space huge size classes should provide similar amortized performance. The motivation for this change was that mremap-based huge reallocation forced leaky abstractions that prevented refactoring.
* Normalize #define whitespace. | Jason Evans | 2013-12-09 | 1 | -4/+4
  Consistently use a tab rather than a space following #define.
* Add arena-specific and selective dss allocation. | Jason Evans | 2012-10-13 | 1 | -2/+8
  Add the "arenas.extend" mallctl, so that it is possible to create new arenas that are outside the set that jemalloc automatically multiplexes threads onto.

  Add the ALLOCM_ARENA() flag for {,r,d}allocm(), so that it is possible to explicitly allocate from a particular arena.

  Add the "opt.dss" mallctl, which controls the default precedence of dss allocation relative to mmap allocation.

  Add the "arena.<i>.dss" mallctl, which makes it possible to set the default dss precedence on a per arena or global basis.

  Add the "arena.<i>.purge" mallctl, which obsoletes "arenas.purge".

  Add the "stats.arenas.<i>.dss" mallctl.
* Don't use sizeof() on a VARIABLE_ARRAY | Mike Hommey | 2012-05-02 | 1 | -2/+2
  In the alloca() case, this fails to be the right size.
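The pitfall noted above comes from the fact that an alloca()-backed "array" is really a pointer, so sizeof on its name yields the pointer size rather than the buffer size. A minimal sketch of the safer pattern, using a hypothetical helper `ex_clear()` that takes the byte count explicitly:

```c
#include <stddef.h>
#include <string.h>

/*
 * buf stands in for an alloca()-backed "array". sizeof(buf) here would only
 * be sizeof(unsigned char *), so the caller passes the tracked byte count.
 */
static void
ex_clear(unsigned char *buf, size_t nbytes)
{
	memset(buf, 0, nbytes);
}
```

Computing the size as `sizeof(type) * count` at the point of use gives the same result whether the storage is a true variable length array or an alloca() allocation.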
* Allow je_malloc_message to be overridden when linking statically | Mike Hommey | 2012-05-02 | 1 | -15/+7
  If an application wants to override je_malloc_message, it is better to define the symbol locally than to change its value in main(), which might be too late for various reasons.

  Due to je_malloc_message being initialized in util.c, statically linking jemalloc with an application defining je_malloc_message fails due to "multiple definition of" the symbol. Defining it without a value (like je_malloc_conf) makes it more easily overridable.
* Avoid variable length arrays and remove declarations within code | Mike Hommey | 2012-04-29 | 1 | -2/+2
  MSVC doesn't support C99, and building as C++ to be able to use them is dangerous, as C++ and C99 are incompatible.

  Introduce a VARIABLE_ARRAY macro that either uses VLA when supported, or alloca() otherwise. Note that using alloca() inside loops doesn't quite work like VLAs, thus the use of VARIABLE_ARRAY there is discouraged. It might be worth investigating ways to check whether VARIABLE_ARRAY is used in such context at runtime in debug builds and bail out if that happens.
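A macro of the kind described above might be sketched as follows. The name `EX_VARIABLE_ARRAY` and the helper `ex_sum_squares()` are hypothetical, and the real macro in jemalloc may be defined differently; the sketch only shows the VLA-or-alloca() switch.

```c
#include <stddef.h>

#if defined(_MSC_VER)
#  include <malloc.h>
/* No VLA support: fall back to stack allocation, which yields a pointer. */
#  define EX_VARIABLE_ARRAY(type, name, count) \
	type *name = (type *)_alloca(sizeof(type) * (count))
#else
/* C99 variable length array. */
#  define EX_VARIABLE_ARRAY(type, name, count) type name[(count)]
#endif

/* Sums i*i for i in [0, n) using a scratch array of n elements. */
static size_t
ex_sum_squares(size_t n)
{
	size_t total = 0;

	if (n == 0)
		return 0;	/* Zero-length VLAs are not valid C. */
	{
		EX_VARIABLE_ARRAY(size_t, squares, n);
		size_t i;

		for (i = 0; i < n; i++)
			squares[i] = i * i;
		for (i = 0; i < n; i++)
			total += squares[i];
	}
	return total;
}
```

The array is declared once at the top of a block rather than inside a loop body, since alloca() storage is not released until the function returns, whereas a VLA is released at the end of its scope.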
* Update prof defaults to match common usage. | Jason Evans | 2012-04-17 | 1 | -0/+1
  Change the "opt.lg_prof_sample" default from 0 to 19 (1 B to 512 KiB).

  Change the "opt.prof_accum" default from true to false.

  Add the "opt.prof_final" mallctl, so that "opt.prof_prefix" need not be abused to disable final profile dumping.
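The "1 B to 512 KiB" figures follow from reading the option as a base-2 logarithm of the sample interval in bytes. A small sketch of that conversion, with a hypothetical helper name:

```c
#include <stdint.h>

/* Sample interval in bytes for a given lg value; lg_sample must be < 64. */
static uint64_t
ex_sample_interval_bytes(unsigned lg_sample)
{
	return (uint64_t)1 << lg_sample;
}
```

A value of 0 gives 1 byte, and 19 gives 524288 bytes, which is 512 KiB.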
* Implement Valgrind support, redzones, and quarantine. | Jason Evans | 2012-04-11 | 1 | -0/+3
  Implement Valgrind support, as well as the redzone and quarantine features, which help Valgrind detect memory errors. Redzones are only implemented for small objects because the changes necessary to support redzones around large and huge objects are complicated by in-place reallocation, to the point that it isn't clear that the maintenance burden is worth the incremental improvement to Valgrind support.

  Merge arena_salloc() and arena_salloc_demote().

  Refactor i[v]salloc() to expose the 'demote' option.
* Add utrace(2)-based tracing (--enable-utrace). | Jason Evans | 2012-04-05 | 1 | -0/+1
* Finish renaming "arenas.pagesize" to "arenas.page". | Jason Evans | 2012-04-02 | 1 | -11/+10
* Clean up *PAGE* macros. | Jason Evans | 2012-04-02 | 1 | -0/+3
  s/PAGE_SHIFT/LG_PAGE/g and s/PAGE_SIZE/PAGE/g.

  Remove remnants of the dynamic-page-shift code.

  Rename the "arenas.pagesize" mallctl to "arenas.page".

  Remove the "arenas.chunksize" mallctl, which is redundant with "opt.lg_chunk".
* Fix malloc_stats_print() option support. | Jason Evans | 2012-03-13 | 1 | -6/+8
  Fix malloc_stats_print() to honor 'b' and 'l' in the opts parameter.
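One way to picture the opts handling is the sketch below. It is not the code from stats.c: the type and function names are hypothetical, and it assumes, following jemalloc's documented description of the opts string, that each flag character suppresses one section of the report ('g' general, 'm' merged arena stats, 'a' per arena stats, 'b' bins, 'l' large objects).

```c
#include <stdbool.h>
#include <stddef.h>

/* Which sections of the report to print (hypothetical type). */
typedef struct {
	bool general;
	bool merged;
	bool unmerged;
	bool bins;
	bool large;
} ex_stats_sections_t;

/* Every section starts enabled; each recognized flag turns one off. */
static ex_stats_sections_t
ex_parse_opts(const char *opts)
{
	ex_stats_sections_t s = {true, true, true, true, true};
	size_t i;

	if (opts == NULL)
		return s;
	for (i = 0; opts[i] != '\0'; i++) {
		switch (opts[i]) {
		case 'g': s.general = false; break;
		case 'm': s.merged = false; break;
		case 'a': s.unmerged = false; break;
		case 'b': s.bins = false; break;
		case 'l': s.large = false; break;
		default: break;	/* Unknown flags are ignored in this sketch. */
		}
	}
	return s;
}
```

Passing "bl" under this reading would keep the general and arena sections but omit the per size class tables for bins and large objects.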
* Implement malloc_vsnprintf(). | Jason Evans | 2012-03-08 | 1 | -172/+50
  Implement malloc_vsnprintf() (a subset of vsnprintf(3)) as well as several other printing functions based on it, so that formatted printing can be relied upon without concern for inducing a dependency on floating point runtime support. Replace malloc_write() calls with malloc_*printf() where doing so simplifies the code.

  Add name mangling for library-private symbols in the data and BSS sections. Adjust CONF_HANDLE_*() macros in malloc_conf_init() to expose all opt_* variable use to cpp so that proper mangling occurs.
* Remove the lg_tcache_gc_sweep option. | Jason Evans | 2012-03-05 | 1 | -11/+0
  Remove the lg_tcache_gc_sweep option, because it is no longer very useful. Prior to the addition of dynamic adjustment of tcache fill count, it was possible for fill/flush overhead to be a problem, but this problem no longer occurs.
* Add --with-mangling. | Jason Evans | 2012-03-02 | 1 | -18/+17
  Add the --with-mangling configure option, which can be used to specify name mangling on a per public symbol basis that takes precedence over --with-jemalloc-prefix.

  Expose the memalign() and valloc() overrides even if --with-jemalloc-prefix is specified. This change does no real harm, and simplifies the code.
* Remove unused variables in stats_print(). | Jason Evans | 2012-02-29 | 1 | -4/+0
  Submitted by Mike Hommey.
* Remove the sysv option. | Jason Evans | 2012-02-29 | 1 | -1/+0
* Simplify small size class infrastructure. | Jason Evans | 2012-02-29 | 1 | -66/+6
  Program-generate small size class tables for all valid combinations of LG_TINY_MIN, LG_QUANTUM, and PAGE_SHIFT. Use the appropriate table to generate all relevant data structures, and remove the distinction between tiny/quantum/cacheline/subpage bins.

  Remove --enable-dynamic-page-shift. This option didn't prove useful in practice, and it prevented optimizations.

  Add Tilera architecture support.
* Remove the opt.lg_prof_bt_max option. | Jason Evans | 2012-02-14 | 1 | -6/+0
  Remove opt.lg_prof_bt_max, and hard code it to 7. The original intention of this option was to enable faster backtracing by limiting backtrace depth. However, this makes graphical pprof output very difficult to interpret. In practice, decreasing sampling frequency is a better mechanism for limiting profiling overhead.
* Remove the opt.lg_prof_tcmax option. | Jason Evans | 2012-02-14 | 1 | -12/+0
  Remove the opt.lg_prof_tcmax option and hard-code a cache size of 1024. This setting is something that users just shouldn't have to worry about. If lock contention actually ends up being a problem, the simple solution available to the user is to reduce sampling frequency.
* Remove highruns statistics. | Jason Evans | 2012-02-13 | 1 | -17/+11
* Remove the swap feature. | Jason Evans | 2012-02-13 | 1 | -21/+5
  Remove the swap feature, which enabled per application swap files. In practice this feature has not proven itself useful to users.
* Reduce cpp conditional logic complexity. | Jason Evans | 2012-02-11 | 1 | -11/+2
  Convert configuration-related cpp conditional logic to use static constant variables, e.g.:

    #ifdef JEMALLOC_DEBUG
      [...]
    #endif

  becomes:

    if (config_debug) {
      [...]
    }

  The advantage is clearer, more concise code. The main disadvantage is that data structures no longer have conditionally defined fields, so they pay the cost of all fields regardless of whether they are used. In practice, this is only a minor concern; config_stats will go away in an upcoming change, and config_prof is the only other major feature that depends on more than a few special-purpose fields.
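The conversion described above can be illustrated with a self-contained sketch. The names `EX_DEBUG`, `ex_config_debug`, and `ex_lookup()` are hypothetical stand-ins; only the overall pattern (one #ifdef that sets a static constant, ordinary `if` statements everywhere else) reflects the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* The only preprocessor conditional: it selects the constant's value. */
#ifdef EX_DEBUG
static const bool ex_config_debug = true;
#else
static const bool ex_config_debug = false;
#endif

/*
 * Returns table[index]. When debug checks are configured in, also counts
 * the check in *nchecks and returns -1 for an out-of-range index.
 */
static int
ex_lookup(const int *table, size_t len, size_t index, unsigned *nchecks)
{
	if (ex_config_debug) {
		/* Always parsed and type-checked; removable as dead code when false. */
		(*nchecks)++;
		if (index >= len)
			return -1;
	}
	return table[index];
}
```

Because the guarded block is compiled in every configuration, mistakes in rarely built code paths surface as compile errors instead of going unnoticed behind an #ifdef.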
* Fix malloc_stats_print(..., "a") output. | Jason Evans | 2011-11-11 | 1 | -1/+1
  Fix the logic in stats_print() such that if the "a" flag is passed in without the "m" flag, merged statistics will be printed even if only one arena is initialized.
* Move repo contents in jemalloc/ to top level. | Jason Evans | 2011-04-01 | 1 | -0/+790