tools/memory-model/Documentation/simple.txt
Source file repositories/reference/linux-study-clean/tools/memory-model/Documentation/simple.txt
File Facts
- System: Linux kernel
- Corpus path: tools/memory-model/Documentation/simple.txt
- Extension: .txt
- Size: 12581 bytes
- Lines: 271
- Domain: Support Tooling And Documentation
- Bucket: tools
- Inferred role: Support Tooling And Documentation: documentation
- Status: atlas-only
Why This File Exists
Repository support layer: documentation, build tooling, samples, user-space helper tools, generated initramfs support, licenses, and validation utilities.
- Uses kernel synchronization; read lock ordering, sleepability, and interrupt context assumptions before translating.
Dependency Surface
- No C-style include directives detected by the generator.
Detected Declarations
- No top-level syscall, struct, function, initcall, or export declaration detected by the generator.
Annotated Snippet
This document provides options for those wishing to keep their
memory-ordering lives simple, as is necessary for those whose domain
is complex. After all, there are bugs other than memory-ordering bugs,
and the time spent gaining memory-ordering knowledge is not available
for gaining domain knowledge. Furthermore, the Linux-kernel memory model
(LKMM) is quite complex, with subtle differences in code often having
dramatic effects on correctness.
The options near the beginning of this list are quite simple. The idea
is not that kernel hackers don't already know about them, but rather
that they might need the occasional reminder.
Please note that this is a generic guide, and that specific subsystems
will often have special requirements or idioms. For example, developers
of MMIO-based device drivers will often need to use mb(), rmb(), and
wmb(), and therefore might find smp_mb(), smp_rmb(), and smp_wmb()
to be more natural than smp_load_acquire() and smp_store_release().
On the other hand, those coming in from other environments will likely
be more familiar with these last two.
Single-threaded code
====================
In single-threaded code, there is no reordering, at least assuming
that your toolchain and hardware are working correctly. In addition,
it is generally a mistake to assume that your code will run only in a
single-threaded context, because the kernel can enter the same code path
on multiple CPUs at the same time. One important exception is a function
that makes no external data references.
In the general case, you will need to take explicit steps to ensure that
your code really is executed within a single thread that does not access
shared variables. A simple way to achieve this is to define a global lock
that you acquire at the beginning of your code and release at the end,
taking care to ensure that all references to your code's shared data are
also carried out under that same lock. Because only one thread can hold
this lock at a given time, your code will be executed single-threaded.
This approach is called "code locking".
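The code-locking pattern described above can be sketched in user space.
This is only an analogy, not kernel code: it assumes POSIX threads in
place of the kernel's own locking primitives, and the names code_lock,
shared_counter, locked_increment(), locked_read(), and run_workers()
are hypothetical and chosen for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical global "code lock" guarding all of this code's shared data. */
static pthread_mutex_t code_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shared data: accessed only while code_lock is held. */
static long shared_counter;

/* Everything between lock and unlock executes one thread at a time. */
static void locked_increment(void)
{
	pthread_mutex_lock(&code_lock);
	shared_counter++;
	pthread_mutex_unlock(&code_lock);
}

/* Readers of the shared data must take the same lock. */
static long locked_read(void)
{
	long v;

	pthread_mutex_lock(&code_lock);
	v = shared_counter;
	pthread_mutex_unlock(&code_lock);
	return v;
}

static void *worker(void *p)
{
	const int *iterations = p;

	for (int i = 0; i < *iterations; i++)
		locked_increment();
	return NULL;
}

/*
 * Start up to 16 workers, wait for them, and return how much the
 * counter grew.  Illustration only: thread-creation failure simply
 * reduces the number of workers.
 */
static long run_workers(int nthreads, int iterations)
{
	pthread_t tids[16];
	long before = locked_read();
	int started;

	if (nthreads > 16)
		nthreads = 16;
	for (started = 0; started < nthreads; started++)
		if (pthread_create(&tids[started], NULL, worker, &iterations) != 0)
			break;
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	return locked_read() - before;
}
```

Because every access to shared_counter happens under code_lock, no
increment is lost however the threads interleave. In kernel code the
same shape would use the kernel's standard locking primitives rather
than pthreads.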
Code locking can severely limit both performance and scalability, so it
should be used with caution, and only on code paths that execute rarely.
After all, a huge amount of effort was required to remove the Linux
kernel's old "Big Kernel Lock", so let's please be very careful about
adding new "little kernel locks".
One of the advantages of locking is that, in happy contrast with the
year 1981, almost all kernel developers are very familiar with locking.
The Linux kernel's lockdep (CONFIG_PROVE_LOCKING=y) is very helpful with
the formerly feared deadlock scenarios.
Please use the standard locking primitives provided by the kernel rather
than rolling your own. For one thing, the standard primitives interact
properly with lockdep. For another thing, these primitives have been
tuned to deal better with high contention. And for one final thing, it is
surprisingly hard to correctly code production-quality lock acquisition
and release functions. After all, even simple non-production-quality
locking functions must carefully prevent both the CPU and the compiler
from moving code in either direction across the locking function.
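To make the last point concrete, here is a deliberately minimal
test-and-set lock written with C11 atomics. It is a user-space sketch,
not the kernel's implementation, and the names toy_lock,
toy_lock_acquire(), toy_lock_try(), and toy_lock_release() are
hypothetical. It shows only where the ordering constraints go; it has
no fairness, no backoff, and no lockdep integration, which is part of
why it is not production quality.

```c
#include <stdatomic.h>

/* Toy test-and-set lock: hypothetical, for illustration only. */
struct toy_lock {
	atomic_flag held;
};

#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

/*
 * Acquire ordering keeps the critical section's accesses from being
 * moved, by the compiler or the CPU, to before the lock is obtained.
 */
static void toy_lock_acquire(struct toy_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the previous holder clears the flag */
}

/* Returns 1 if the lock was obtained, 0 if it was already held. */
static int toy_lock_try(struct toy_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/*
 * Release ordering keeps the critical section's accesses from being
 * moved to after the point where the lock is handed back.
 */
static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Weakening either ordering argument to memory_order_relaxed would still
compile and would often appear to work, which is exactly the kind of
subtle breakage that the standard primitives spare you.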
Despite the scalability limitations of single-threaded code, RCU
takes this approach for much of its grace-period processing and also
for early-boot operation. The reason RCU is able to scale despite
single-threaded grace-period processing is the use of batching, where all
updates that accumulated during one grace period are handled by the
next one. In other words, slowing down grace-period processing makes
it more efficient, because each grace period then covers a larger batch
of updates. Nor is RCU unique: similar batching optimizations
are used in many I/O operations.
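The batching effect can be illustrated with a toy cost model. This is
not how RCU is implemented; it is a sketch under stated assumptions:
updates arrive at a fixed rate, a single-threaded pass runs once per
period and handles everything that has accumulated, and each pass costs
a fixed overhead plus one unit per update. The names batch_stats and
simulate() and all of the parameters are hypothetical.

```c
/* Totals gathered by the toy model below (hypothetical, not RCU's). */
struct batch_stats {
	unsigned long passes;	/* single-threaded passes executed */
	unsigned long handled;	/* updates handled across all passes */
	unsigned long cost;	/* total cost units spent */
};

/*
 * Simulate "ticks" time units.  "rate" updates arrive per tick, and a
 * pass runs every "period" ticks, handling all pending updates at a
 * cost of "pass_cost" units plus one unit per update handled.
 */
static struct batch_stats simulate(unsigned long ticks, unsigned long rate,
				   unsigned long period,
				   unsigned long pass_cost)
{
	struct batch_stats s = { 0, 0, 0 };
	unsigned long pending = 0;

	if (period == 0)
		return s;	/* no passes ever run */
	for (unsigned long t = 1; t <= ticks; t++) {
		pending += rate;
		if (t % period == 0) {
			s.passes++;
			s.handled += pending;
			s.cost += pass_cost + pending;
			pending = 0;
		}
	}
	return s;
}
```

In this model, simulate(100, 2, 1, 10) and simulate(100, 2, 10, 10)
both handle 200 updates, but the first spends 1200 cost units over 100
passes while the second spends 300 over 10 passes. The slower cadence
amortizes the fixed per-pass overhead across a larger batch, at the
price of each update waiting longer to be handled.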
Annotation
- Atlas domain: Support Tooling And Documentation / tools.
- Implementation status: atlas-only.
- Synchronization appears in or near this file; preserve lock ordering, sleepability, and interrupt-context constraints.
Implementation Notes
- This generated page is the file-by-file coverage layer; curated subsystem chapters should link here when they synthesize a multi-file control flow.
- Core OS pages should be promoted from atlas-only to deep-reviewed when they explain data structures, invariants, locking, lifecycle, and C implementation snippets.
- Driver-family pages are intentionally pattern-oriented unless they are part of the selected PCIe/NVMe representative device path.