Initial implementation of a two-pass pairing heap with an aux list.
Research papers are linked in comments.
Where search/nsearch/last aren't needed, this gives much faster first(),
delete(), and insert(). Insert is O(1), and first/delete don't have to
walk the whole tree.
Also tested rb_old with parent pointers - it was better than the current
rb.h for memory loads, but still much worse than a pairing heap.
An array-based heap would be much faster if everything fits in memory,
but on a cold cache it has many more memory loads for most operations.
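
For reference, a minimal sketch of the two-pass idea (hypothetical
names, not the actual ph.h implementation): insert() melds a singleton
with the root in O(1), and delete-min pairs the children left to right,
then melds the pairs back right to left.

    #include <stddef.h>

    /* Hypothetical node layout; ph.h embeds its links differently. */
    typedef struct pnode_s pnode_t;
    struct pnode_s {
        int      key;
        pnode_t *child;   /* leftmost child */
        pnode_t *sibling; /* next sibling */
    };

    /* Meld two heaps: the larger root becomes the leftmost child. */
    static pnode_t *
    meld(pnode_t *a, pnode_t *b)
    {
        if (a == NULL) return b;
        if (b == NULL) return a;
        if (b->key < a->key) { pnode_t *t = a; a = b; b = t; }
        b->sibling = a->child;
        a->child = b;
        return a;
    }

    /* O(1) insert: meld a singleton with the root. */
    static pnode_t *
    heap_insert(pnode_t *root, pnode_t *n)
    {
        n->child = n->sibling = NULL;
        return meld(root, n);
    }

    /* Two-pass delete-min (root must be non-NULL): pair the children
     * left to right, then meld the pairs back right to left. */
    static pnode_t *
    heap_delete_min(pnode_t *root)
    {
        pnode_t *pairs = NULL, *c = root->child;
        while (c != NULL) {
            pnode_t *a = c, *b = a->sibling;
            c = (b != NULL) ? b->sibling : NULL;
            a = meld(a, b);      /* meld(a, NULL) == a */
            a->sibling = pairs;  /* push pair onto a reversed list */
            pairs = a;
        }
        root = NULL;
        while (pairs != NULL) {  /* reversed list => right-to-left */
            pnode_t *next = pairs->sibling;
            pairs->sibling = NULL;
            root = meld(root, pairs);
            pairs = next;
        }
        return root;
    }

The aux-list variant mentioned above roughly batches freshly inserted
nodes on a side list and melds them in lazily at the next delete-min.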
|
Add a cast to avoid comparing an ssize_t value to a uint64_t value that
is always larger than a 32-bit ssize_t can represent. This silences an
innocuous compiler warning from e.g. gcc 4.2.1 about the comparison
always having the same result.
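
Illustratively (hypothetical values and names, not the actual call
site), the pattern is:

    #include <sys/types.h>
    #include <stdint.h>

    /* A 32-bit ssize_t can never reach 1ULL << 32, so comparing the
     * two directly always yields the same result and gcc 4.2.1 warns.
     * The cast states the intent explicitly; it is safe for len >= 0. */
    static int
    fits_in_window(ssize_t len, uint64_t window)
    {
        return ((uint64_t)len < window);
    }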
|
Fix stack corruption on 64-bit (x64) builds.
This resolves #347.
|
Prior to 767d85061a6fb88ec977bbcd9b429a43aff391e6 (Refactor arenas array
(fixes deadlock).), it was possible under some circumstances for
arena_get() to trigger recreation of the arenas cache during tsd
cleanup, and the arenas cache would then be leaked. In principle a
similar issue could still occur as a side effect of decay-based purging,
which calls arena_tdata_get(). Fix arenas_tdata_cleanup() by setting
tsd->arenas_tdata_bypass to true, so that arena_tdata_get() will
gracefully fail (an expected behavior) rather than recreating
tsd->arenas_tdata.
Reported by Christopher Ferris <cferris@google.com>.
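
The shape of the fix, as a schematic sketch (stand-in types and a
placeholder allocator, not jemalloc's actual tsd machinery):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Stand-ins for jemalloc's thread-specific-data fields. */
    typedef struct {
        void *arenas_tdata;        /* per-thread arenas cache */
        bool  arenas_tdata_bypass; /* when true, never (re)create it */
    } tsd_t;

    static void *
    arenas_tdata_create(void)
    {
        return calloc(1, 64);      /* placeholder for the real tdata */
    }

    /* Lookup that honors the bypass flag: during tsd cleanup it fails
     * gracefully instead of recreating (and then leaking) the cache. */
    static void *
    arena_tdata_get(tsd_t *tsd)
    {
        if (tsd->arenas_tdata == NULL) {
            if (tsd->arenas_tdata_bypass)
                return NULL;       /* expected failure path */
            tsd->arenas_tdata = arenas_tdata_create();
        }
        return tsd->arenas_tdata;
    }

    static void
    arenas_tdata_cleanup(tsd_t *tsd)
    {
        /* Bar recreation for the rest of this thread's teardown. */
        tsd->arenas_tdata_bypass = true;
        free(tsd->arenas_tdata);
        tsd->arenas_tdata = NULL;
    }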

Add missing stats.arenas.<i>.{dss,lg_dirty_mult,decay_time}
initialization.
Fix stats.arenas.<i>.{pactive,pdirty} to be read under the protection of
the arena mutex.
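
A schematic of the second fix (illustrative names, with a pthread mutex
standing in for jemalloc's lock type): snapshot both counters while
holding the arena's lock so the pair is mutually consistent.

    #include <pthread.h>
    #include <stddef.h>

    typedef struct {
        pthread_mutex_t lock;
        size_t pactive;  /* pages active */
        size_t pdirty;   /* pages dirty */
    } arena_t;

    /* Read both counters under the arena mutex; unlocked reads could
     * observe an inconsistent pair mid-update. */
    static void
    arena_stats_read(arena_t *arena, size_t *pactive, size_t *pdirty)
    {
        pthread_mutex_lock(&arena->lock);
        *pactive = arena->pactive;
        *pdirty = arena->pdirty;
        pthread_mutex_unlock(&arena->lock);
    }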
|
Fix stats.cactive accounting to always increase/decrease by multiples of
the chunk size, even for huge size classes that are not multiples of the
chunk size, e.g. {2.5, 3, 3.5, 5, 7} MiB with 2 MiB chunk size. This
regression was introduced by 155bfa7da18cab0d21d87aa2dce4554166836f5d
(Normalize size classes.) and first released in 4.0.0.
This resolves #336.
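
Concretely, with 2 MiB chunks a 2.5 MiB huge allocation occupies two
chunks, so cactive should move by 4 MiB, not 2.5 MiB. A rounding sketch
(the constant and macro mirror jemalloc's CHUNK_CEILING idea but are
illustrative here):

    #include <stdio.h>
    #include <stddef.h>

    #define CHUNKSIZE ((size_t)2 << 20)  /* 2 MiB, illustrative */
    #define CHUNK_CEILING(s) (((s) + CHUNKSIZE - 1) & ~(CHUNKSIZE - 1))

    int
    main(void)
    {
        size_t huge = (size_t)5 << 19;   /* a 2.5 MiB huge class */
        /* cactive must move by the chunk-rounded size: 4 MiB here. */
        printf("cactive delta: %zu\n", CHUNK_CEILING(huge));
        return 0;
    }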
|
This removes an implicit conversion from size_t to ssize_t. For cactive
decreases, the size_t value was intentionally underflowed to generate
"negative" values (actually positive values above the positive range of
ssize_t), and the conversion back to ssize_t is implementation-defined
(and thus non-portable) according to C language semantics.
This regression was perpetuated by
1522937e9cbcfa24c881dc439cc454f9a34a7e88 (Fix the cactive statistic.)
and first released in 4.0.0, which in retrospect only fixed one of two
problems introduced by aa5113b1fdafd1129c22512837c6c3d66c295fc8
(Refactor overly large/complex functions) and first released in 3.5.0.
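
A portable pattern, sketched (illustrative, not the exact jemalloc
fix): carry the magnitude and an explicit direction instead of
round-tripping a deliberately underflowed size_t through ssize_t.

    #include <stdbool.h>
    #include <stddef.h>

    static size_t cactive;  /* illustrative global statistic */

    /* Adjust cactive by a magnitude plus a direction flag, so no
     * value ever leaves the representable range of its type. */
    static void
    cactive_adjust(size_t size, bool increase)
    {
        if (increase)
            cactive += size;
        else
            cactive -= size;
    }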

Remove invalid tests that were intended to exercise (hugemax+1) OOM;
tests for that case already exist.

This fixes chunk allocation to reuse retained memory even if an
application-provided chunk allocation function is in use.
This resolves #307.
|
For small bitmaps, a linear scan of the bitmap is slightly faster than
a tree search - bitmap_t is more compact, and there are fewer writes
since we don't have to propagate state transitions up the tree.
On x86_64 with the current settings, I'm seeing ~0.5%-1% CPU improvement
in production canaries with this change.
The old tree code is left in place, since 32-bit sizes are much larger
(and ffsl smaller), and the run sizes may change in the future.
This resolves #339.
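
The scan boils down to this (a self-contained sketch; jemalloc's
bitmap_t and group layout differ in detail): complement each word and
let ffsl find the first zero bit.

    #define _GNU_SOURCE     /* for ffsl on glibc */
    #include <limits.h>
    #include <stddef.h>
    #include <strings.h>
    #include <sys/types.h>

    #define NBITS   512     /* illustrative bitmap size */
    #define NGROUPS (NBITS / (sizeof(long) * CHAR_BIT))

    /* Find the first unset bit by linear scan: a few cache lines at
     * most, and no tree state to update on the way back up. */
    static ssize_t
    bitmap_first_unset(const unsigned long bits[NGROUPS])
    {
        size_t i;
        for (i = 0; i < NGROUPS; i++) {
            unsigned long g = ~bits[i];  /* zeros become ones */
            if (g != 0) {
                return (ssize_t)(i * sizeof(long) * CHAR_BIT)
                    + (ffsl((long)g) - 1);
            }
        }
        return -1;                       /* bitmap is full */
    }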
|
This resolves #341.
|
Add HUGE_MAXCLASS overflow checks that are specific to heap profiling
code paths. This fixes test failures that were introduced by
0c516a00c4cb28cff55ce0995f756b5aae074c9e (Make *allocx() size class
overflow behavior defined.).

This fixes compilation warnings regarding integer overflow that were
introduced by 0c516a00c4cb28cff55ce0995f756b5aae074c9e (Make *allocx()
size class overflow behavior defined.).

Limit supported size and alignment to HUGE_MAXCLASS, which in turn is
now limited to be less than PTRDIFF_MAX.
This resolves #278 and #295.
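
The checks take roughly this shape (a sketch with a stand-in bound; the
real HUGE_MAXCLASS comes from the size-class machinery): reject
oversized requests up front so later size arithmetic cannot overflow.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Stand-in for HUGE_MAXCLASS; the invariant that matters is that
     * it is strictly less than PTRDIFF_MAX. */
    static const size_t huge_maxclass = (size_t)PTRDIFF_MAX >> 1;

    static void *
    checked_mallocx(size_t size, size_t alignment)
    {
        /* Fail early: anything above the bound could overflow the
         * class/alignment computations that follow. */
        if (size == 0 || size > huge_maxclass
            || alignment > huge_maxclass) {
            return NULL;
        }
        return malloc(size);  /* placeholder for the real path */
    }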

Refactor the arenas array, which contains pointers to all extant arenas,
such that it starts out as a sparse array of maximum size, and use
double-checked atomics-based reads as the basis for fast and simple
arena_get(). Additionally, reduce arenas_lock's role such that it only
protects against arena initialization races. These changes remove the
possibility for arena lookups to trigger locking, which resolves at
least one known (fork-related) deadlock.
This resolves #315.
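
The lookup pattern, sketched with C11 atomics and a pthread mutex
(jemalloc uses its own atomic and mutex wrappers; arena_init_locked()
here is a placeholder): a lock-free load on the fast path, with the
lock taken only to resolve initialization races.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdlib.h>

    #define MAX_ARENAS 4096  /* sparse array of maximum size */

    typedef struct { unsigned ind; } arena_t;

    static _Atomic(arena_t *) arenas[MAX_ARENAS];
    static pthread_mutex_t arenas_lock = PTHREAD_MUTEX_INITIALIZER;

    static arena_t *
    arena_init_locked(unsigned ind)
    {
        arena_t *a = calloc(1, sizeof(*a));  /* placeholder init */
        a->ind = ind;
        return a;
    }

    /* Double-checked read: one atomic load on the fast path; the lock
     * only serializes slot initialization. */
    static arena_t *
    arena_get(unsigned ind)
    {
        arena_t *ret = atomic_load_explicit(&arenas[ind],
            memory_order_acquire);
        if (ret == NULL) {
            pthread_mutex_lock(&arenas_lock);
            ret = atomic_load_explicit(&arenas[ind],
                memory_order_relaxed);
            if (ret == NULL) {
                ret = arena_init_locked(ind);
                atomic_store_explicit(&arenas[ind], ret,
                    memory_order_release);
            }
            pthread_mutex_unlock(&arenas_lock);
        }
        return ret;
    }

The release store pairs with the acquire load, so a reader that sees a
non-NULL slot also sees the fully initialized arena behind it.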

Fix the arena_size computation in arena_new() to incorporate
runs_avail_nclasses elements for runs_avail, rather than
(runs_avail_nclasses - 1) elements. Since offsetof(arena_t, runs_avail)
is used rather than sizeof(arena_t) for the first term of the
computation, all of the runs_avail elements must be added into the
second term.
This bug was introduced (by Jason Evans) while merging pull request #330
as 3417a304ccde61ac1f68b436ec22c03f1d6824ec (Separate arena_avail
trees).
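
The sizing idiom, in a generic sketch (illustrative types): when a
struct's allocation size starts from offsetof() of a trailing array,
the full element count must be added back.

    #include <stddef.h>
    #include <stdlib.h>

    typedef struct { void *root; } run_tree_t;

    typedef struct {
        int        other_fields;
        run_tree_t runs_avail[1];  /* trailing array of n elements */
    } arena_t;

    static arena_t *
    arena_new(size_t nclasses)
    {
        /* offsetof() excludes the trailing array entirely, so all
         * nclasses elements go in the second term - using
         * (nclasses - 1) undersizes the allocation by one element. */
        size_t arena_size = offsetof(arena_t, runs_avail)
            + nclasses * sizeof(run_tree_t);
        return calloc(1, arena_size);
    }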

The merge of 3417a304ccde61ac1f68b436ec22c03f1d6824ec introduced a
small bug: first_best_fit doesn't scan through all the classes, since
ind is offset from runs_avail_nclasses by run_avail_bias.
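
Schematically (stand-in constants and a placeholder per-class lookup;
the real scan lives in jemalloc's arena code), the loop bound must
carry the same bias as the index:

    #include <stddef.h>

    #define RUN_AVAIL_BIAS      4   /* illustrative values */
    #define RUNS_AVAIL_NCLASSES 40

    static void *
    run_tree_first(size_t i)        /* placeholder per-class lookup */
    {
        (void)i;
        return NULL;
    }

    static void *
    first_best_fit(size_t ind)      /* ind already includes the bias */
    {
        size_t i;
        /* The buggy bound stopped at RUNS_AVAIL_NCLASSES alone, which
         * skips the last RUN_AVAIL_BIAS classes once ind is biased. */
        for (i = ind; i < RUN_AVAIL_BIAS + RUNS_AVAIL_NCLASSES; i++) {
            void *run = run_tree_first(i);
            if (run != NULL)
                return run;
        }
        return NULL;
    }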

Attempt mmap-based in-place huge reallocation by plumbing new_addr into
chunk_alloc_mmap(). This can dramatically speed up incremental huge
reallocation.
This resolves #335.
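
The underlying mechanism, in a minimal sketch using plain mmap()
(chunk_alloc_mmap() layers alignment and retry logic on top of this):
hint at the address just past the existing mapping and verify the
kernel honored it.

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    /* Try to grow a mapping in place.  Without MAP_FIXED the hint is
     * advisory, so the result must be checked. */
    static void *
    extend_in_place(void *old_addr, size_t old_size, size_t grow)
    {
        void *hint = (void *)((uintptr_t)old_addr + old_size);
        void *p = mmap(hint, grow, PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        if (p != hint) {
            munmap(p, grow);  /* placed elsewhere; not in place */
            return NULL;
        }
        return old_addr;      /* now old_size + grow contiguous bytes */
    }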

This resolves #258.

This resolves #323.

This regression was caused by 9f4ee6034c3ac6a8c8b5f9a0d76822fb2fd90c41
(Refactor jemalloc_ffs*() into ffs_*().).
|
This prevents accidentally introducing integer truncation bugs when
developing on LP64 systems.
|
Use appropriate versions to resolve 64-to-32-bit data loss warnings.

This resolves #333.

These tree types converged to become identical, yet they still had
independently generated red-black tree implementations.

Separate run trees by index, replacing the previous quantize logic.
Quantization by index is now performed only on insertion / removal from
the tree, and not on node comparison, saving some CPU. This also means
we don't have to dereference the miscelm* pointers, saving half of the
memory loads from miscelms/mapbits that have fallen out of cache. A
linear scan of the indices appears to be fast enough.
The only cost of this is an extra tree array in each arena.
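
The structural change, sketched (illustrative types and a placeholder
insert; the real trees hold run metadata): one tree per size-class
index, so quantization happens once at insert/remove and node
comparisons never touch sizes.

    #include <stddef.h>

    #define NCLASSES 40  /* illustrative class count */

    typedef struct run_s { struct run_s *link; } run_t;
    typedef struct { run_t *root; } run_tree_t;

    typedef struct {
        run_tree_t runs_avail[NCLASSES];  /* one tree per index */
    } arena_t;

    static void
    run_tree_insert(run_tree_t *tree, run_t *run)
    {
        run->link = tree->root;  /* placeholder for the tree insert */
        tree->root = run;
    }

    /* Quantize to ind once, here; comparisons inside the tree no
     * longer consult run sizes at all. */
    static void
    runs_avail_insert(arena_t *arena, run_t *run, size_t ind)
    {
        run_tree_insert(&arena->runs_avail[ind], run);
    }

    /* First fit: a linear scan of indices >= ind, which is cheap for
     * a small number of classes. */
    static run_t *
    runs_avail_first(arena_t *arena, size_t ind)
    {
        for (; ind < NCLASSES; ind++) {
            if (arena->runs_avail[ind].root != NULL)
                return arena->runs_avail[ind].root;
        }
        return NULL;
    }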