path: root/include/jemalloc/internal/tsd.h
Commit log (message, author, date, files changed, lines -/+):
* Fallback to 32-bit when 8-bit atomics are missing for TSD. (Qi Wang, 2019-03-09, 1 file, -2/+17)
  When this happens, it might cause a slowdown on the fast-path operations; however, such cases are very rare.
* Store the bin shard selection in TSD. (Qi Wang, 2018-12-04, 1 file, -2/+3)
  This avoids having to choose a bin shard on the fly, and will also allow flexible bin binding for each thread.
* Add support for sharded bins within an arena. (Qi Wang, 2018-12-04, 1 file, -0/+2)
  This makes it possible to have multiple sets of bins in an arena, which improves arena scalability because the bins (especially the small ones) are always the limiting factor in production workloads. A bin shard is picked on allocation; each extent tracks the bin shard id for deallocation. The shard size will be determined using runtime options.
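  A minimal sketch of the mechanism (names, types, and the shard count are illustrative assumptions, not jemalloc's actual code): each thread keeps a shard index in TSD, allocation uses it to pick one of several bin instances, and the extent records the shard id so deallocation can find the same bin.

      #include <stdint.h>

      #define N_SHARDS 4                       /* assumed; set via runtime options */

      typedef struct { int lock_placeholder; } bin_t;        /* per-shard bin state */
      typedef struct { bin_t shards[N_SHARDS]; } sharded_bin_t;
      typedef struct { uint8_t binshard; } tsd_like_t;        /* shard choice kept in TSD */
      typedef struct { uint8_t binshard; } extent_like_t;     /* shard id kept per extent */

      /* Allocation path: use the shard cached in TSD, remember it in the extent. */
      static bin_t *
      bin_for_alloc(tsd_like_t *tsd, sharded_bin_t *sb, extent_like_t *extent) {
          unsigned shard = tsd->binshard % N_SHARDS;
          extent->binshard = (uint8_t)shard;
          return &sb->shards[shard];
      }

      /* Deallocation path: the extent tells us which shard the memory belongs to. */
      static bin_t *
      bin_for_dalloc(sharded_bin_t *sb, const extent_like_t *extent) {
          return &sb->shards[extent->binshard % N_SHARDS];
      }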
* Restrict bytes_until_sample to int64_t. (Dave Watson, 2018-10-15, 1 file, -1/+1)
  This allows optimal asm generation of "sub bytes_until_sample, usize; je" for the x86 arch: the subtraction is unconditional, only the flags are checked for the jump, and no extra compare is necessary. This also reduces register pressure.
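  A hedged sketch of the pattern this enables (the thread-local variable and the exact comparison are illustrative, not jemalloc's actual profiling code): the counter is decremented unconditionally and the branch tests only the flags the subtraction already produced.

      #include <stdbool.h>
      #include <stdint.h>

      /* Signed 64-bit countdown toward the next profiling sample. */
      static __thread int64_t bytes_until_sample_sketch;

      static bool
      should_sample(uint64_t usize) {
          bytes_until_sample_sketch -= (int64_t)usize;   /* unconditional sub */
          return bytes_until_sample_sketch == 0;         /* je on x86: no extra cmp */
      }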
* Move bytes until sample to tsd. (Dave Watson, 2018-10-15, 1 file, -0/+2)
  Fast-path allocation does not need to load tdata now, avoiding several branches.
* TSD: Add fork support to tsd_nominal_tsds. (David Goldblatt, 2018-07-27, 1 file, -0/+3)
  In case of a multithreaded fork, we want to leave the child in a reasonable state, in which tsd_nominal_tsds is either empty or contains only the forking thread.
* Move tsd link and in_hook after tcache. (David Goldblatt, 2018-06-27, 1 file, -7/+0)
  This can lead to better cache utilization down the common paths where we don't touch the link.
* Hooks: Protect against reentrancy. (David Goldblatt, 2018-05-18, 1 file, -0/+2)
  Previously, we made the user deal with this themselves, but that's not good enough; if hooks may allocate, we should test the allocation pathways down hooks. If we're doing that, we might as well actually implement the protection for the user.
* Tests: Shouldn't be able to change global slowness. (David Goldblatt, 2018-05-18, 1 file, -0/+1)
  This can help ensure that we don't leave slowness changes behind in case of resource exhaustion.
* TSD: Add the ability to enter a global slow path. (David Goldblatt, 2018-05-18, 1 file, -26/+74)
  This gives any thread the ability to send other threads down slow paths the next time they fetch tsd.
* TSD: Pull name mangling into a macro. (David Goldblatt, 2018-05-18, 1 file, -2/+9)
* TSD: Make state atomic. (David Goldblatt, 2018-05-18, 1 file, -4/+10)
  This will let us change the state of another thread remotely, eventually.
* TSD: Make all state access happen through a function. (David Goldblatt, 2018-05-18, 1 file, -14/+23)
  Shortly, tsd state will be atomic and will have complicated enough logic down the state-setting path that we should be aware of it.
* Use tsd offset_state instead of atomic. (Dave Watson, 2017-11-14, 1 file, -0/+2)
  While working on #852, I noticed the prng state is atomic. This is the only atomic use of prng in all of jemalloc. Instead, use a thread-local prng state if possible to avoid unnecessary cache-line contention.
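  A sketch of the idea (the LCG constants and variable names are illustrative, not jemalloc's prng implementation): keeping the state in thread-local storage means concurrent threads no longer bounce a shared cache line when advancing the generator.

      #include <stdint.h>

      /* Per-thread PRNG state instead of one shared atomic. */
      static __thread uint64_t offset_state_sketch = 1;

      static uint64_t
      prng_next_sketch(void) {
          /* Plain 64-bit LCG step; no atomics, no contention. */
          offset_state_sketch = offset_state_sketch * 6364136223846793005ULL
              + 1442695040888963407ULL;
          return offset_state_sketch;
      }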
* Add minimal initialized TSD. (Qi Wang, 2017-06-16, 1 file, -8/+22)
  We use the minimal_initialized tsd (which requires no cleanup) for free() specifically, if tsd hasn't been initialized yet. Any other activity will transition the state from minimal to normal. This is to work around the case where a thread has no malloc calls during its lifetime, and then at thread termination free() happens after the TLS destructors.
* Remove redundant typedefs. (Jason Evans, 2017-06-08, 1 file, -2/+0)
  Pre-C11 compilers do not support typedef redefinition.
* Add internal tsd for background_thread. (Qi Wang, 2017-06-08, 1 file, -5/+10)
* Make tsd no-cleanup during tsd reincarnation. (Qi Wang, 2017-06-07, 1 file, -1/+1)
  Since tsd cleanup isn't guaranteed when reincarnated, we set up tsd in a way that needs no cleanup, by making it go through the slow path instead.
* Header refactoring: unify and de-catchall rtree module. (David Goldblatt, 2017-05-31, 1 file, -1/+1)
* Header refactoring: unify and de-catchall witness code. (David Goldblatt, 2017-05-24, 1 file, -43/+55)
* Protect the rtree/extent interactions with a mutex pool. (David Goldblatt, 2017-05-19, 1 file, -3/+0)
  Instead of embedding a lock bit in rtree leaf elements, we associate extents with a small set of mutexes. This gets us two things:
  - We can use the system mutexes. This (hypothetically) protects us from priority inversion, and lets us stop doing a backoff/sleep loop, instead opting for precise wakeups from the mutex.
  - It cuts down on the number of mutex acquisitions we have to do (from four in the worst case to two).
  We end up simplifying most of the rtree code (which no longer has to deal with locking or concurrency at all), at the cost of additional complexity in the extent code: since the mutex protecting the rtree leaf elements is determined by reading the extent out of those elements, the initial read is racy, so we may acquire an out-of-date mutex. We re-check the extent in the leaf after acquiring the mutex to protect us from this race.
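  A sketch of that lock-then-recheck pattern (pool size, hashing, and function names are illustrative assumptions, not jemalloc's actual code): the extent is read racily, its mutex is taken from a small pool, and the read is repeated under the mutex to detect a lost race.

      #include <pthread.h>
      #include <stdatomic.h>
      #include <stdint.h>

      #define MUTEX_POOL_SIZE 256
      /* Assume each pool entry is initialized with pthread_mutex_init() at startup. */
      static pthread_mutex_t mutex_pool[MUTEX_POOL_SIZE];

      static pthread_mutex_t *
      mutex_for_extent(const void *extent) {
          uintptr_t h = (uintptr_t)extent >> 4;      /* hash the pointer into the pool */
          return &mutex_pool[h % MUTEX_POOL_SIZE];
      }

      /* leaf: an rtree-leaf-like slot that other threads may rewrite concurrently. */
      static pthread_mutex_t *
      lock_leaf_extent(_Atomic(void *) *leaf) {
          for (;;) {
              void *extent = atomic_load(leaf);      /* initial, possibly stale read */
              pthread_mutex_t *m = mutex_for_extent(extent);
              pthread_mutex_lock(m);
              if (atomic_load(leaf) == extent) {     /* re-check under the mutex */
                  return m;                          /* caller unlocks when done */
              }
              pthread_mutex_unlock(m);               /* lost the race; retry */
          }
      }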
* Header refactoring: tsd - cleanup and dependency breaking. (David Goldblatt, 2017-05-01, 1 file, -0/+298)
  This removes the tsd macros (which are used only for tsd_t in real builds). We break up the circular dependencies involving tsd. We also move all tsd access through getters and setters. This allows us to assert that we only touch data when tsd is in a valid state. We simplify the usages of the x macro trick, removing all the customizability (get/set, init, cleanup), moving the lifetime logic to tsd_init and tsd_cleanup. This lets us make initialization order independent of order within tsd_t.
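  A small sketch of the x-macro technique referred to here (the field names are illustrative, not the real tsd_t layout): one field list expands both into struct members and into getter/setter definitions, so the two can never drift apart.

      #include <stdbool.h>
      #include <stdint.h>

      #define TSD_FIELDS_SKETCH(O)       \
          O(thread_allocated, uint64_t)  \
          O(in_hook,          bool)

      typedef struct {
      #define O(name, type) type name;
          TSD_FIELDS_SKETCH(O)
      #undef O
      } tsd_fields_sketch_t;

      /* Expand the same list again to generate accessors. */
      #define O(name, type)                                                       \
          static inline type tsd_##name##_get(tsd_fields_sketch_t *tsd) {         \
              return tsd->name;                                                   \
          }                                                                       \
          static inline void tsd_##name##_set(tsd_fields_sketch_t *tsd, type v) { \
              tsd->name = v;                                                      \
          }
      TSD_FIELDS_SKETCH(O)
      #undef O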
* Break up headers into constituent parts. (David Goldblatt, 2017-01-12, 1 file, -811/+0)
  This is part of a broader change to make header files better represent the dependencies between one another (see https://github.com/jemalloc/jemalloc/issues/533). It breaks up component headers into smaller parts that can be made to have a simpler dependency graph. For the autogenerated headers (smoothstep.h and size_classes.h), no splitting was necessary, so I didn't add support to emit multiple headers.
* Add some missing explicit casts. (Jason Evans, 2016-12-13, 1 file, -3/+4)
* Do not (recursively) allocate within tsd_fetch(). (Jason Evans, 2016-10-21, 1 file, -17/+57)
  Refactor tsd so that tsdn_fetch() does not trigger allocation, since allocation could cause infinite recursion. This resolves #458.
* Avoid self assignment in tsd_set(). (Jason Evans, 2016-09-23, 1 file, -4/+8)
* Add rtree lookup path caching. (Jason Evans, 2016-06-06, 1 file, -0/+19)
  rtree-based extent lookups remain more expensive than chunk-based run lookups, but with this optimization the fast path slowdown is ~3 CPU cycles per metadata lookup (on Intel Core i7-4980HQ), versus ~11 cycles prior. The path caching speedup tends to degrade gracefully unless allocated memory is spread far apart (as is the case when using a mixture of sbrk() and mmap()).
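  A sketch of a lookup cache in this spirit (cache size, key derivation, page size, and the stand-in slow-path function are all illustrative assumptions): a tiny per-thread direct-mapped cache answers repeated lookups for nearby addresses without walking the radix tree.

      #include <stddef.h>
      #include <stdint.h>

      #define RTREE_CACHE_SLOTS 16

      typedef struct { void *extent; } rtree_leaf_sketch_t;

      typedef struct {
          uintptr_t key;                 /* page-aligned address cached in this slot */
          rtree_leaf_sketch_t *leaf;     /* result of the last full walk */
      } rtree_cache_slot_t;

      static __thread rtree_cache_slot_t rtree_cache[RTREE_CACHE_SLOTS];

      /* Stand-in for the real radix-tree walk. */
      static rtree_leaf_sketch_t *
      rtree_full_lookup(uintptr_t key) {
          (void)key;
          static rtree_leaf_sketch_t leaf;
          return &leaf;
      }

      static rtree_leaf_sketch_t *
      rtree_cached_lookup(uintptr_t addr) {
          uintptr_t key = addr & ~((uintptr_t)0xfff);          /* assume 4 KiB pages */
          unsigned slot = (unsigned)((key >> 12) % RTREE_CACHE_SLOTS);
          if (rtree_cache[slot].key == key && rtree_cache[slot].leaf != NULL) {
              return rtree_cache[slot].leaf;                   /* fast path */
          }
          rtree_leaf_sketch_t *leaf = rtree_full_lookup(key);  /* slow path */
          rtree_cache[slot].key = key;
          rtree_cache[slot].leaf = leaf;
          return leaf;
      }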
* Make tsd cleanup functions optional, remove noop cleanup functions. (Jason Evans, 2016-06-06, 1 file, -17/+17)
* Add rtree element witnesses. (Jason Evans, 2016-06-03, 1 file, -0/+2)
* Remove quarantine support. (Jason Evans, 2016-05-13, 1 file, -2/+0)
* Resolve bootstrapping issues when embedded in FreeBSD libc. (Jason Evans, 2016-05-11, 1 file, -0/+76)
  b2c0d6322d2307458ae2b28545f8a5c9903d7ef5 (Add witness, a simple online locking validator.) caused a broad propagation of tsd throughout the internal API, but tsd_fetch() was designed to fail prior to tsd bootstrapping. Fix this by splitting tsd_t into non-nullable tsd_t and nullable tsdn_t, and modifying all internal APIs that do not critically rely on tsd to take nullable pointers. Furthermore, add the tsd_booted_get() function so that tsdn_fetch() can probe whether tsd bootstrapping is complete and return NULL if not. All dangerous conversions of nullable pointers are tsdn_tsd() calls that assert-fail on invalid conversion.
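  A sketch of the resulting calling convention (struct contents, the bootstrapping flag, and the _sk suffixes are illustrative stand-ins, not jemalloc's definitions): the nullable fetch may return NULL before bootstrapping, the non-nullable fetch is only legal after it, and converting a nullable pointer back asserts non-NULL.

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      typedef struct { long thread_allocated; } tsd_sk_t;   /* never NULL once booted */
      typedef struct { tsd_sk_t tsd; } tsdn_sk_t;           /* pointer to this may be NULL */

      static bool tsd_booted_sk = false;                    /* set when bootstrapping finishes */
      static __thread tsd_sk_t tsd_storage_sk;

      static tsd_sk_t *
      tsd_fetch_sk(void) {
          return &tsd_storage_sk;          /* legal only after bootstrapping */
      }

      static tsdn_sk_t *
      tsdn_fetch_sk(void) {
          if (!tsd_booted_sk) {
              return NULL;                 /* caller must handle the NULL case */
          }
          return (tsdn_sk_t *)tsd_fetch_sk();
      }

      static tsd_sk_t *
      tsdn_tsd_sk(tsdn_sk_t *tsdn) {
          assert(tsdn != NULL);            /* the only dangerous conversion */
          return &tsdn->tsd;
      }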
* Fix fork()-related lock rank ordering reversals. (Jason Evans, 2016-04-26, 1 file, -1/+3)
* Do not allocate metadata via non-auto arenas, nor tcaches. (Jason Evans, 2016-04-22, 1 file, -0/+2)
  This ensures that all internally allocated metadata comes from the first opt_narenas arenas, i.e. the automatically multiplexed arenas.
* Add witness, a simple online locking validator. (Jason Evans, 2016-04-14, 1 file, -2/+4)
  This resolves #358.
* Refactor arenas_cache tsd. (Jason Evans, 2016-02-20, 1 file, -3/+3)
  Refactor arenas_cache tsd into arenas_tdata, which is a structure of type arena_tdata_t.
* Fix tsd_boot1() to use explicit 'void' parameter list. (Craig Rodrigues, 2015-09-21, 1 file, -4/+4)
* Preserve LastError when calling TlsGetValue. (Mike Hommey, 2015-03-04, 1 file, -2/+6)
  TlsGetValue has a semantic difference from pthread_getspecific, in that it can return a non-error NULL value, so it always sets LastError. But allocator callers may not expect a call to e.g. free() to change the value of the last error, so preserve it.
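  A sketch of the fix on Windows (the TLS index name and wrapper function are illustrative, not jemalloc's actual code): the thread's last-error value is saved before TlsGetValue and restored afterward, so the caller's GetLastError() result is unchanged.

      #ifdef _WIN32
      #include <windows.h>

      static DWORD tsd_tls_index;                  /* assume allocated with TlsAlloc() earlier */

      static void *
      tsd_wrapper_get_sketch(void) {
          DWORD saved_error = GetLastError();      /* preserve the caller's LastError */
          void *val = TlsGetValue(tsd_tls_index);  /* may set LastError even on success */
          SetLastError(saved_error);               /* restore before returning */
          return val;
      }
      #endif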
* Refactor bootstrapping to delay tsd initialization. (Jason Evans, 2015-01-22, 1 file, -1/+1)
  Refactor bootstrapping to delay tsd initialization, primarily to support integration with FreeBSD's libc. Refactor a0*() for internal-only use, and add the bootstrap_{malloc,calloc,free}() API for use by FreeBSD's libc. This separation limits use of the a0*() functions to metadata allocation, which doesn't require malloc/calloc/free API compatibility. This resolves #170.
* Remove extra definition of je_tsd_boot on win32. (Guilherme Goncalves, 2014-11-18, 1 file, -6/+0)
* Refactor/fix arenas manipulation. (Jason Evans, 2014-10-08, 1 file, -57/+182)
  Abstract arenas access to use arena_get() (or a0get() where appropriate) rather than directly reading e.g. arenas[ind]. Prior to the addition of the arenas.extend mallctl, the worst possible outcome of directly accessing arenas was a stale read, but arenas.extend may allocate and assign a new array to arenas.
  Add a tsd-based arenas_cache, which amortizes arenas reads. This introduces some subtle bootstrapping issues, with tsd_boot() now being split into tsd_boot[01]() to support tsd wrapper allocation bootstrapping, as well as an arenas_cache_bypass tsd variable which dynamically terminates allocation of arenas_cache itself.
  Promote a0malloc(), a0calloc(), and a0free() to be generally useful for internal allocation, and use them in several places (more may be appropriate).
  Abstract arena->nthreads management and fix a missing decrement during thread destruction (recent tsd refactoring left arenas_cleanup() unused).
  Change arena_choose() to propagate OOM, and handle OOM in all callers. This is important for providing consistent allocation behavior when the MALLOCX_ARENA() flag is being used. Prior to this fix, it was possible for an OOM to result in allocation silently allocating from a different arena than the one specified.
* Fix tsd cleanup regressions. (Jason Evans, 2014-10-04, 1 file, -29/+36)
  Fix tsd cleanup regressions that were introduced in 5460aa6f6676c7f253bfcb75c028dfd38cae8aaf (Convert all tsd variables to reside in a single tsd structure.). These regressions were twofold:
  1) tsd_tryget() should never (and need never) return NULL. Rename it to tsd_fetch() and simplify all callers.
  2) tsd_*_set() must only be called when tsd is in the nominal state, because cleanup happens during the nominal-->purgatory transition, and re-initialization must not happen while in the purgatory state. Add tsd_nominal() and use it as needed.
  Note that tsd_*{p,}_get() can still be used as long as no re-initialization that would require cleanup occurs. This means that e.g. the thread_allocated counter can be updated unconditionally.
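  A sketch of the resulting usage (struct layout, the state constant, and the _example names are illustrative, not the real API): the fetch never returns NULL, plain counters such as thread_allocated can be updated unconditionally, and anything that would later need cleanup is guarded by a nominal-state check.

      #include <stdbool.h>
      #include <stdint.h>

      #define TSD_STATE_NOMINAL 0

      typedef struct {
          int state;                       /* nominal / purgatory / ... */
          uint64_t thread_allocated;       /* plain counter */
      } tsd_example_t;

      static __thread tsd_example_t tsd_instance = { TSD_STATE_NOMINAL, 0 };

      static tsd_example_t *
      tsd_fetch_example(void) {
          return &tsd_instance;            /* never NULL */
      }

      static bool
      tsd_nominal_example(const tsd_example_t *tsd) {
          return tsd->state == TSD_STATE_NOMINAL;
      }

      static void
      on_allocation_example(uint64_t usize) {
          tsd_example_t *tsd = tsd_fetch_example();
          tsd->thread_allocated += usize;  /* safe in any state */
          if (tsd_nominal_example(tsd)) {
              /* Only (re)initialize cleanup-requiring tsd state here. */
          }
      }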
* Convert all tsd variables to reside in a single tsd structure. (Jason Evans, 2014-09-23, 1 file, -120/+221)
* Add mq (message queue) to test infrastructure. (Jason Evans, 2013-12-12, 1 file, -1/+1)
  Add mtx (mutex) to test infrastructure, in order to avoid bootstrapping complications that would result from directly using malloc_mutex. Rename test infrastructure's thread abstraction from je_thread to thd. Fix some header ordering issues.
* Normalize #define whitespace. (Jason Evans, 2013-12-09, 1 file, -1/+1)
  Consistently use a tab rather than a space following #define.
* Add support for LinuxThreads. (Leonard Crestez, 2013-10-25, 1 file, -0/+37)
  When using LinuxThreads, pthread_setspecific triggers recursive allocation on all threads. Work around this by creating a global linked list of in-progress tsd initializations.
  This modifies the _tsd_get_wrapper macro-generated function. When it has to initialize a TSD object, it will push the item to the linked list first. If this causes a recursive allocation, then the _get_wrapper request is satisfied from the list. When pthread_setspecific returns, the item is removed from the list.
  This effectively adds a very poor substitute for real TLS, used only during pthread_setspecific allocation recursion.
  Signed-off-by: Crestez Dan Leonard <lcrestez@ixiacom.com>
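  A sketch of the workaround (the list node layout, locking, and function names are illustrative rather than the actual macro-generated code): before calling pthread_setspecific, the thread pushes its in-progress wrapper onto a global list; a recursive allocation that re-enters the getter finds its own entry there and uses it instead of recursing further.

      #include <pthread.h>
      #include <stddef.h>

      typedef struct tsd_init_block_s {
          struct tsd_init_block_s *next;
          pthread_t thread;                    /* thread doing the initialization */
          void *data;                          /* wrapper being installed */
      } tsd_init_block_t;

      static pthread_mutex_t tsd_init_lock = PTHREAD_MUTEX_INITIALIZER;
      static tsd_init_block_t *tsd_init_head = NULL;

      /* On entry to the getter: if this thread is already mid-initialization,
       * satisfy the request from the list instead of recursing. */
      static void *
      tsd_init_check_recursion_sketch(tsd_init_block_t *block, void *data) {
          pthread_mutex_lock(&tsd_init_lock);
          for (tsd_init_block_t *b = tsd_init_head; b != NULL; b = b->next) {
              if (pthread_equal(b->thread, pthread_self())) {
                  pthread_mutex_unlock(&tsd_init_lock);
                  return b->data;              /* recursion: reuse the pending wrapper */
              }
          }
          block->thread = pthread_self();
          block->data = data;
          block->next = tsd_init_head;         /* publish our in-progress init */
          tsd_init_head = block;
          pthread_mutex_unlock(&tsd_init_lock);
          return NULL;                         /* no recursion detected */
      }

      /* After pthread_setspecific() returns: unlink our block again. */
      static void
      tsd_init_finish_sketch(tsd_init_block_t *block) {
          pthread_mutex_lock(&tsd_init_lock);
          for (tsd_init_block_t **pb = &tsd_init_head; *pb != NULL; pb = &(*pb)->next) {
              if (*pb == block) {
                  *pb = block->next;
                  break;
              }
          }
          pthread_mutex_unlock(&tsd_init_lock);
      }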
* Add support for Mingw. (Mike Hommey, 2012-04-22, 1 file, -0/+101)
* Fix chunk allocation/deallocation bugs. (Jason Evans, 2012-04-21, 1 file, -1/+1)
  Fix chunk_alloc_dss() to zero memory when requested. Fix chunk_dealloc() to avoid chunk_dealloc_mmap() for dss-allocated memory. Fix huge_palloc() to always junk fill when requested. Improve chunk_recycle() to report that memory is zeroed as a side effect of pages_purge().
* Remove extra argument for malloc_tsd_cleanup_register. (Mike Hommey, 2012-04-19, 1 file, -10/+5)
  Keeping an extra argument that only stores a function pointer for a function we already have is not very useful.
* Remove initialization of the non-TLS tsd wrapper from static memory. (Mike Hommey, 2012-04-19, 1 file, -12/+3)
  Using static memory when malloc_tsd_malloc fails means all threads share the same wrapper and thus the same wrapped value. This defeats the purpose of TSD.
* Initialize all members of the non-TLS tsd wrapper when creating it. (Mike Hommey, 2012-04-19, 1 file, -0/+1)
  Not setting the initialized member leads to randomly calling the cleanup function in cases where it shouldn't be called (and isn't called in other implementations).