Commit message log
This resolves #358.
This regression was caused by 8f683b94a751c65af8f9fa25970ccf2917b96bb8
(Make opt_narenas unsigned rather than size_t.).
Use 1U rather than ZU(1) in macro definitions, so that the preprocessor
can evaluate the resulting expressions.
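As an illustration of the constraint (a sketch; ZU() here is a stand-in, though jemalloc does define it as a size_t cast):

    #include <stddef.h>

    #define ZU(z)      ((size_t)z)       /* stand-in for the real macro */
    #define NBITS_OK   (1U << 4)
    #define NBITS_BAD  (ZU(1) << 4)

    #if NBITS_OK > 8                     /* works: 1U is plain preprocessor arithmetic */
    static const int ok = 1;
    #endif
    /*
     * #if NBITS_BAD > 8                 -- preprocessor error: inside #if,
     * ...                                  "size_t" expands to 0, leaving
     * #endif                               "((0)1 << 4)", which won't parse.
     */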
During over-allocation in preparation for creating aligned mappings,
allocate one more page than would be necessary if PAGE were the actual
system page size, so that trimming still succeeds even if the system
returns a mapping with less than PAGE alignment. This allows compiling
with e.g. 64 KiB "pages" on systems that actually use 4 KiB pages.
Note that for e.g. --with-lg-page=21, it is also necessary to increase
the chunk size (e.g. --with-malloc-conf=lg_chunk:22) so that there are
at least two "pages" per chunk. In practice this isn't a particularly
compelling configuration, because so much (unusable) virtual memory is
dedicated to chunk headers.
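A rough sketch of the arithmetic described above, with illustrative names rather than the actual mapping code:

    #include <stddef.h>

    #define LG_PAGE 16                    /* e.g. --with-lg-page=16: 64 KiB "pages" */
    #define PAGE    ((size_t)1 << LG_PAGE)

    /* Bytes to map so that an alignment-aligned, size-byte region can
     * always be trimmed out of the result.  The previous formula,
     * size + alignment - PAGE, assumed the mapping itself is at least
     * PAGE-aligned; adding one more "page" tolerates mappings aligned
     * only to the system's real (smaller) page size. */
    static size_t
    overalloc_size(size_t size, size_t alignment)
    {
        return ((size + alignment - PAGE) + PAGE);
    }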
Consistently use uint8_t rather than char for junk filling code.
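A minimal illustration of why the type matters (the 0xa5 value is one of the junk bytes mentioned elsewhere in this log):

    #include <stdint.h>

    /* Whether plain char is signed is implementation-defined.  Where it is
     * signed, (char)0xa5 is negative, so comparing a char byte against the
     * int literal 0xa5 (165) is always false; uint8_t makes it exact. */
    static int
    byte_is_junk_char(char b)    { return (b == 0xa5); }   /* may never match */
    static int
    byte_is_junk_u8(uint8_t b)   { return (b == 0xa5); }   /* always correct  */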
Refactor ph to support configurable comparison functions. Use a cpp
macro code generation form equivalent to the rb macros so that pairing
heaps can be used for both run heaps and chunk heaps.
Remove per-node parent pointers, and instead use leftmost siblings' prev
pointers to track parents.
Fix multi-pass sibling merging to iterate over intermediate results
using a FIFO, rather than a LIFO. Use this fixed sibling merging
implementation for both merge phases of the auxiliary twopass algorithm
(first merging the aux list, then replacing the root with its merged
children). This fixes both degenerate merge behavior and the potential
for deep recursion.
This regression was introduced by
6bafa6678fc36483e638f1c3a0a9bf79fb89bfc9 (Pairing heap).
This resolves #371.
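A hedged sketch of the FIFO merge pass on a simplified node type (names and layout are illustrative, not the ph.h macros):

    #include <stddef.h>

    typedef struct node_s node_t;
    struct node_s {
        int     key;
        node_t  *child;                  /* leftmost child */
        node_t  *next;                   /* right sibling  */
    };

    static node_t *
    node_merge(node_t *a, node_t *b)
    {
        if (a == NULL) return (b);
        if (b == NULL) return (a);
        if (b->key < a->key) { node_t *t = a; a = b; b = t; }
        b->next = a->child;              /* b becomes a's leftmost child */
        a->child = b;
        return (a);
    }

    /* Merge a sibling list pairwise, appending each merged pair to the
     * *tail* of a FIFO, and repeat until one tree remains.  Merging each
     * fresh result immediately with the next sibling (LIFO order) instead
     * degenerates into one long chain of merges. */
    static node_t *
    merge_siblings(node_t *head)
    {
        while (head != NULL && head->next != NULL) {
            node_t *fifo = NULL, *tail = NULL;
            while (head != NULL) {
                node_t *a = head, *b = a->next;
                head = (b != NULL) ? b->next : NULL;
                a->next = NULL;
                if (b != NULL) b->next = NULL;
                a = node_merge(a, b);
                if (tail == NULL) fifo = a; else tail->next = a;
                tail = a;
            }
            head = fifo;
        }
        return (head);
    }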
Fix bitmap_sfu() to shift by LG_BITMAP_GROUP_NBITS rather than
hard-coded 6 when using linear (non-USE_TREE) bitmap search. In
practice this affects only 64-bit systems for which sizeof(long) is not
8 (i.e. Windows), since USE_TREE is defined for 32-bit systems.
This regression was caused by b8823ab02607d6f03febd32ac504bb6188c54047
(Use linear scan for small bitmaps).
This resolves #368.
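An illustrative reduction of the fix (types and the scan helper are placeholders, not bitmap.h):

    #include <stddef.h>
    #include <strings.h>                  /* ffsl(), POSIX */

    typedef unsigned long bitmap_t;       /* 32 bits on LLP64 Windows, 64 on LP64 */
    #define LG_SIZEOF_BITMAP      ((sizeof(bitmap_t) == 8) ? 3 : 2)
    #define LG_BITMAP_GROUP_NBITS (LG_SIZEOF_BITMAP + 3)

    /* Linear scan for the first unset bit: the group index must be scaled
     * by the group's actual width.  Shifting by a hard-coded 6 assumes
     * 64-bit groups and computes wrong bit indices when sizeof(long) == 4. */
    static size_t
    first_unset(const bitmap_t *groups, size_t ngroups)
    {
        size_t i;
        for (i = 0; i < ngroups; i++) {
            bitmap_t g = ~groups[i];
            if (g != 0)
                return ((i << LG_BITMAP_GROUP_NBITS) + (size_t)(ffsl((long)g) - 1));
        }
        return ((size_t)-1);              /* every bit is set */
    }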
Document that the maximum size class is limited by PTRDIFF_MAX, rather
than the full address space. This reflects changes that were part of
0c516a00c4cb28cff55ce0995f756b5aae074c9e (Make *allocx() size class
overflow behavior defined.).
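A usage sketch of the documented behavior, assuming a program linked against jemalloc for mallocx():

    #include <assert.h>
    #include <stdint.h>
    #include <jemalloc/jemalloc.h>

    int
    main(void)
    {
        /* Requests exceeding PTRDIFF_MAX fail cleanly rather than being
         * undefined; the largest size class is bounded accordingly. */
        void *p = mallocx((size_t)PTRDIFF_MAX + 1, 0);
        assert(p == NULL);
        return (0);
    }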
Replace hardcoded 0xa5 and 0x5a junk values with JEMALLOC_ALLOC_JUNK and
JEMALLOC_FREE_JUNK macros, respectively.
Move chunk_dalloc_arena()'s implementation into chunk_dalloc_wrapper(),
so that if the dalloc hook fails, proper decommit/purge/retain cascading
occurs. This fixes three potential chunk leaks on OOM paths, one during
dss-based chunk allocation, one during chunk header commit (currently
relevant only on Windows), and one during rtree write (e.g. if rtree
node allocation fails).
Merge chunk_purge_arena() into chunk_purge_default() (refactor, no
change to functionality).
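A hedged sketch of the cascading teardown (hook names and signatures are simplified placeholders, not the exact chunk_hooks_t API):

    #include <stdbool.h>
    #include <stddef.h>

    /* Simplified stand-ins: each hook returns true on failure/opt-out. */
    typedef struct {
        bool (*dalloc)(void *chunk, size_t size, bool committed);
        bool (*decommit)(void *chunk, size_t size);
        bool (*purge)(void *chunk, size_t size);
    } hooks_t;

    /* Stub for the sketch: a real version records the chunk for reuse. */
    static void
    chunk_record(void *chunk, size_t size, bool committed)
    {
        (void)chunk; (void)size; (void)committed;
    }

    /* If the dalloc hook cannot fully deallocate the chunk, fall back to
     * decommit, then purge, and finally retain it; never drop it on the
     * floor, which is how the OOM-path leaks arose. */
    static void
    chunk_dalloc_cascade(hooks_t *hooks, void *chunk, size_t size, bool committed)
    {
        if (!hooks->dalloc(chunk, size, committed))
            return;                               /* fully deallocated */
        if (committed)
            committed = hooks->decommit(chunk, size);
        if (committed)
            hooks->purge(chunk, size);            /* best effort       */
        chunk_record(chunk, size, committed);     /* retain for reuse  */
    }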
Variables s and slen are declared inside a switch statement, but outside
a case scope. clang reports these variable definitions as "unreachable",
though this is not really meaningful in this case. This is the only
-Wunreachable-code warning in jemalloc.
src/util.c:501:5 [-Wunreachable-code] code will never be executed
This resolves #364.
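A reduced example of the pattern and of the fix (the actual util.c code differs):

    #include <stddef.h>

    int
    classify(int c)
    {
        /* Fix: declare before the switch.  The original declared s and
         * slen after `switch (...) {` but before the first case label,
         * which clang's -Wunreachable-code flags even though skipping a
         * plain (uninitialized) declaration is harmless. */
        const char *s;
        size_t slen;

        switch (c) {
        case 'a':
            s = "alpha";
            slen = 5;
            break;
        default:
            s = "";
            slen = 0;
            break;
        }
        return ((int)slen + (s[0] != '\0'));
    }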
The arenas_extend() function was renamed to arenas_init() in commit
8bb3198f72fc7587dc93527f9f19fb5be52fa553, but its function declaration
was not removed from jemalloc_internal.h.in.
Restructure the test program master header to avoid blindly enabling
assertions. Prior to this change, assertion code in e.g. arena.h was
always enabled for tests, which could skew performance-related testing.
Specialize fast path to avoid code that cannot execute for dependent
loads.
Manually unroll.
Also avoid deleting the VERSION file while trying to (re)generate it.
This resolves #305.
Add (size_t) casts to MALLOCX_ALIGN() macros so that passing the integer
constant 0x80000000 does not cause a compiler warning about invalid
shift amount.
This resolves #354.
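A reduced illustration of why the cast matters, assuming an LP64 target (not the real macro):

    #include <stddef.h>

    unsigned long long
    high_bits(unsigned int a)                /* e.g. a = 0x80000000 */
    {
        /*
         * return (a >> 32);                 -- warning: shift count >= width
         *                                      of type (a is only 32 bits)
         */
        return ((size_t)a >> 32);            /* widened first; on LP64 this is
                                                well-defined (and simply 0)   */
    }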
Use pairing heap instead of red-black tree in arena runs_avail. The
extra links are unioned with the bitmap_t, so this change doesn't use
any extra memory.
Canaries show this change to be a 1% CPU win and a 2% latency win. In
particular, large free()s and small bin frees are now O(1) (barring
coalescing).
I also tested changing bin->runs to be a pairing heap, but saw a much
smaller win, and it would mean increasing the size of arena_run_s by two
pointers, so I left that as an rb-tree for now.
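A hedged sketch of the space trick (field names and sizes invented for illustration):

    typedef unsigned long bitmap_t;

    /* The heap link fields overlay bitmap storage that is only meaningful
     * while a run is allocated, so keeping available runs in a pairing
     * heap adds no bytes to the per-run metadata. */
    typedef struct run_meta_s run_meta_t;
    struct run_meta_s {
        union {
            bitmap_t        bitmap[4];    /* while the run is in use   */
            struct {
                run_meta_t  *child;
                run_meta_t  *sibling;
                run_meta_t  *prev;
            } ph_link;                    /* while the run is available */
        } u;
    };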
Initial implementation of a twopass pairing heap with aux list.
Research papers linked in comments.
Where search/nsearch/last aren't needed, this gives much faster first(),
delete(), and insert(). Insert is O(1), and first/delete don't have to
walk the whole tree.
Also tested rb_old with parent pointers - it was better than the current
rb.h for memory loads, but still much worse than a pairing heap.
An array-based heap would be much faster if everything fits in memory,
but on a cold cache it has many more memory loads for most operations.
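A minimal sketch of the aux-list idea behind O(1) insert (simplified; not the ph.h macros):

    #include <stddef.h>

    typedef struct pnode_s pnode_t;
    struct pnode_s {
        int      key;
        pnode_t  *child;
        pnode_t  *next;
    };

    typedef struct {
        pnode_t *root;   /* half-ordered tree                          */
        pnode_t *aux;    /* unordered list, merged lazily on delete()  */
    } pheap_t;

    /* Insert costs O(1) and performs no comparisons: the node is pushed
     * onto the aux list and only folded into the tree (two-pass merge)
     * the next time first()/delete() needs a well-formed root. */
    static void
    pheap_insert(pheap_t *h, pnode_t *n)
    {
        n->child = NULL;
        n->next = h->aux;
        h->aux = n;
    }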
Add a cast to avoid comparing a ssize_t value to a uint64_t value that
is always larger than a 32-bit ssize_t. This silences an innocuous
compiler warning from e.g. gcc 4.2.1 about the comparison always having
the same result.
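A reduced example of the warning and the fix (names invented; the flagged jemalloc expression differs):

    #include <stdint.h>
    #include <sys/types.h>

    int
    time_valid(ssize_t t)                     /* caller ensures t >= 0 here */
    {
        uint64_t max = UINT64_C(1) << 40;     /* larger than any 32-bit ssize_t */
        /*
         * return (t <= max);                 -- old gcc (e.g. 4.2.1) warns that
         *                                       the comparison always has the
         *                                       same result on 32-bit targets
         */
        return ((uint64_t)t <= max);          /* explicit cast, warning-free */
    }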
Fix stack corruption on x64.
This resolves #347.
Prior to 767d85061a6fb88ec977bbcd9b429a43aff391e6 (Refactor arenas array
(fixes deadlock).), it was possible under some circumstances for
arena_get() to trigger recreation of the arenas cache during tsd
cleanup, and the arenas cache would then be leaked. In principle a
similar issue could still occur as a side effect of decay-based purging,
which calls arena_tdata_get(). Fix arenas_tdata_cleanup() by setting
tsd->arenas_tdata_bypass to true, so that arena_tdata_get() will
gracefully fail (an expected behavior) rather than recreating
tsd->arena_tdata.
Reported by Christopher Ferris <cferris@google.com>.
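A hedged sketch of the ordering fix (structure simplified; only the arenas_tdata_bypass field name comes from the message above):

    #include <stdbool.h>
    #include <stdlib.h>

    typedef struct {
        bool  arenas_tdata_bypass;
        void  *arenas_tdata;         /* per-thread cache being torn down */
    } tsd_t;

    /* Set the bypass flag *before* freeing, so that any arena_tdata_get()
     * reached from the teardown path (e.g. via decay-based purging) fails
     * gracefully instead of recreating, and thus leaking, the cache. */
    static void
    arenas_tdata_cleanup(tsd_t *tsd)
    {
        tsd->arenas_tdata_bypass = true;
        free(tsd->arenas_tdata);
        tsd->arenas_tdata = NULL;
    }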
Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time}
initialization.
Fix stats.arenas.<i>.{pactive,pdirty} to be read under the protection of
the arena mutex.
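A sketch of the locked read (mutex type and field names are simplified stand-ins):

    #include <pthread.h>
    #include <stddef.h>

    typedef struct {
        pthread_mutex_t lock;
        size_t          nactive;
        size_t          ndirty;
    } arena_t;

    /* Read the two counters under the arena mutex so the stats snapshot
     * is consistent, rather than racing with concurrent page activity. */
    static void
    arena_basic_stats(arena_t *arena, size_t *pactive, size_t *pdirty)
    {
        pthread_mutex_lock(&arena->lock);
        *pactive = arena->nactive;
        *pdirty = arena->ndirty;
        pthread_mutex_unlock(&arena->lock);
    }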