lib/rcuref.c
Source file repositories/reference/linux-study-clean/lib/rcuref.c
File Facts
- System
- Linux kernel
- Corpus path
lib/rcuref.c- Extension
.c- Size
- 9770 bytes
- Lines
- 281
- Domain
- Kernel Services
- Bucket
- lib
- Inferred role
- Kernel Services: exported/initcall integration point
- Status
- integration implementation candidate
Why This File Exists
Shared kernel service surface used by multiple subsystems, including helpers, cryptography, virtualization support, and async I/O infrastructure.
- Shared kernel service surface used by multiple subsystems, including helpers, cryptography, virtualization support, and async I/O infrastructure.
- Exports symbols or registers init work; inspect boot/module ordering and who consumes the exported contract.
- Uses kernel synchronization; read lock ordering, sleepability, and interrupt context assumptions before translating.
- Defines or uses C structs; map object ownership, embedded links, reference counts, and lock ownership.
Dependency Surface
linux/export.hlinux/rcuref.h
Detected Declarations
function putfunction rcuref_getfunction putfunction putexport rcuref_get_slowpathexport rcuref_put_slowpath
Annotated Snippet
* if (!atomic_dec_return(&->refcnt)) {
* remove_ptr(p);
* kfree_rcu((p, rcu);
* }
*
* atomic_inc_not_zero() is implemented with a try_cmpxchg() loop which has
* O(N^2) behaviour under contention with N concurrent operations.
*
* rcuref uses atomic_add_negative_relaxed() for the fast path, which scales
* better under contention.
*
* Why not refcount?
* =================
*
* In principle it should be possible to make refcount use the rcuref
* scheme, but the destruction race described below cannot be prevented
* unless the protected object is RCU managed.
*
* Theory of operation
* ===================
*
* rcuref uses an unsigned integer reference counter. As long as the
* counter value is greater than or equal to RCUREF_ONEREF and not larger
* than RCUREF_MAXREF the reference is alive:
*
* ONEREF MAXREF SATURATED RELEASED DEAD NOREF
* 0 0x7FFFFFFF 0x8000000 0xA0000000 0xBFFFFFFF 0xC0000000 0xE0000000 0xFFFFFFFF
* <---valid --------> <-------saturation zone-------> <-----dead zone----->
*
* The get() and put() operations do unconditional increments and
* decrements. The result is checked after the operation. This optimizes
* for the fast path.
*
* If the reference count is saturated or dead, then the increments and
* decrements are not harmful as the reference count still stays in the
* respective zones and is always set back to STATURATED resp. DEAD. The
* zones have room for 2^28 racing operations in each direction, which
* makes it practically impossible to escape the zones.
*
* Once the last reference is dropped the reference count becomes
* RCUREF_NOREF which forces rcuref_put() into the slowpath operation. The
* slowpath then tries to set the reference count from RCUREF_NOREF to
* RCUREF_DEAD via a cmpxchg(). This opens a small window where a
* concurrent rcuref_get() can acquire the reference count and bring it
* back to RCUREF_ONEREF or even drop the reference again and mark it DEAD.
*
* If the cmpxchg() succeeds then a concurrent rcuref_get() will result in
* DEAD + 1, which is inside the dead zone. If that happens the reference
* count is put back to DEAD.
*
* The actual race is possible due to the unconditional increment and
* decrements in rcuref_get() and rcuref_put():
*
* T1 T2
* get() put()
* if (atomic_add_negative(-1, &ref->refcnt))
* succeeds-> atomic_cmpxchg(&ref->refcnt, NOREF, DEAD);
*
* atomic_add_negative(1, &ref->refcnt); <- Elevates refcount to DEAD + 1
*
* As the result of T1's add is negative, the get() goes into the slow path
* and observes refcnt being in the dead zone which makes the operation fail.
*
* Possible critical states:
*
* Context Counter References Operation
* T1 0 1 init()
* T2 1 2 get()
* T1 0 1 put()
* T2 -1 0 put() tries to mark dead
* T1 0 1 get()
* T2 0 1 put() mark dead fails
* T1 -1 0 put() tries to mark dead
* T1 DEAD 0 put() mark dead succeeds
* T2 DEAD+1 0 get() fails and puts it back to DEAD
*
* Of course there are more complex scenarios, but the above illustrates
* the working principle. The rest is left to the imagination of the
* reader.
*
* Deconstruction race
* ===================
*
* The release operation must be protected by prohibiting a grace period in
* order to prevent a possible use after free:
*
* T1 T2
* put() get()
* // ref->refcnt = ONEREF
* if (!atomic_add_negative(-1, &ref->refcnt))
Annotation
- Immediate include surface: `linux/export.h`, `linux/rcuref.h`.
- Detected declarations: `function put`, `function rcuref_get`, `function put`, `function put`, `export rcuref_get_slowpath`, `export rcuref_put_slowpath`.
- Atlas domain: Kernel Services / lib.
- Implementation status: integration implementation candidate.
- Synchronization appears in or near this file; preserve lock ordering, sleepability, and interrupt-context constraints.
Implementation Notes
- This generated page is the file-by-file coverage layer; curated subsystem chapters should link here when they synthesize a multi-file control flow.
- Core OS pages should be promoted from atlas-only to deep-reviewed when they explain data structures, invariants, locking, lifecycle, and C implementation snippets.
- Driver-family pages are intentionally pattern-oriented unless they are part of the selected PCIe/NVMe representative device path.