tools/memory-model/Documentation/simple.txt

Source file repositories/reference/linux-study-clean/tools/memory-model/Documentation/simple.txt

File Facts

System
Linux kernel
Corpus path
tools/memory-model/Documentation/simple.txt
Extension
.txt
Size
12581 bytes
Lines
271
Domain
Support Tooling And Documentation
Bucket
tools
Inferred role
Support Tooling And Documentation: documentation
Status
atlas-only

Why This File Exists

Repository support layer: documentation, build tooling, samples, user-space helper tools, generated initramfs support, licenses, and validation utilities.

Annotated Snippet

This document provides options for those wishing to keep their
memory-ordering lives simple, as is necessary for those whose domain
is complex.  After all, there are bugs other than memory-ordering bugs,
and the time spent gaining memory-ordering knowledge is not available
for gaining domain knowledge.  Furthermore, the Linux-kernel memory model
(LKMM) is quite complex, with subtle differences in code often having
dramatic effects on correctness.

The options near the beginning of this list are quite simple.  The idea
is not that kernel hackers don't already know about them, but rather
that they might need the occasional reminder.

Please note that this is a generic guide, and that specific subsystems
will often have special requirements or idioms.  For example, developers
of MMIO-based device drivers will often need to use mb(), rmb(), and
wmb(), and therefore might find smp_mb(), smp_rmb(), and smp_wmb()
to be more natural than smp_load_acquire() and smp_store_release().
On the other hand, those coming in from other environments will likely
be more familiar with these last two.


Single-threaded code
====================

In single-threaded code, there is no reordering, at least assuming
that your toolchain and hardware are working correctly.  However,
it is generally a mistake to assume your code will only run in a
single-threaded context, as the kernel can enter the same code path on
multiple CPUs at the same time.  One important exception is a function
that makes no external data references.

In the general case, you will need to take explicit steps to ensure that
your code really is executed within a single thread that does not access
shared variables.  A simple way to achieve this is to define a global lock
that you acquire at the beginning of your code and release at the end,
taking care to ensure that all references to your code's shared data are
also carried out under that same lock.  Because only one thread can hold
this lock at a given time, your code will be executed single-threaded.
This approach is called "code locking".
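
As a rough illustration of code locking, the following is a user-space
sketch that uses a POSIX mutex as a stand-in for a kernel lock.  All of
the names (my_code_lock, my_shared_count, my_update(), and so on) are
hypothetical.  In the kernel itself, the analogous primitives would be
something like DEFINE_MUTEX() with mutex_lock() and mutex_unlock(), or
DEFINE_SPINLOCK() with spin_lock() and spin_unlock(), depending on context.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical global lock guarding this code and all of its shared data. */
static pthread_mutex_t my_code_lock = PTHREAD_MUTEX_INITIALIZER;
static long my_shared_count;	/* Accessed only while holding my_code_lock. */

/* The whole body runs under the lock, so only one thread is ever in it. */
static void my_update(void)
{
	pthread_mutex_lock(&my_code_lock);
	my_shared_count++;
	pthread_mutex_unlock(&my_code_lock);
}

/* Readers of the shared data must take the same lock. */
static long my_read(void)
{
	long ret;

	pthread_mutex_lock(&my_code_lock);
	ret = my_shared_count;
	pthread_mutex_unlock(&my_code_lock);
	return ret;
}

#define NR_THREADS 4
#define NR_UPDATES 100000

static void *my_worker(void *unused)
{
	(void)unused;
	for (int i = 0; i < NR_UPDATES; i++)
		my_update();
	return NULL;
}

/* Run several threads through the locked code and return the final count. */
static long run_code_locking_demo(void)
{
	pthread_t tid[NR_THREADS];
	int rc;

	for (int i = 0; i < NR_THREADS; i++) {
		rc = pthread_create(&tid[i], NULL, my_worker, NULL);
		assert(rc == 0);
	}
	for (int i = 0; i < NR_THREADS; i++) {
		rc = pthread_join(tid[i], NULL);
		assert(rc == 0);
	}
	return my_read();
}
```

Because every access to my_shared_count happens under my_code_lock, no
increments are lost, but the threads also make no progress in parallel
while inside the locked region.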

Code locking can severely limit both performance and scalability, so it
should be used with caution, and only on code paths that execute rarely.
After all, a huge amount of effort was required to remove the Linux
kernel's old "Big Kernel Lock", so let's please be very careful about
adding new "little kernel locks".

One of the advantages of locking is that, in happy contrast with the
year 1981, almost all kernel developers are very familiar with locking.
The Linux kernel's lockdep (CONFIG_PROVE_LOCKING=y) is very helpful with
the formerly feared deadlock scenarios.

Please use the standard locking primitives provided by the kernel rather
than rolling your own.  For one thing, the standard primitives interact
properly with lockdep.  For another thing, these primitives have been
tuned to deal better with high contention.  And for one final thing, it is
surprisingly hard to correctly code production-quality lock acquisition
and release functions.  After all, even simple non-production-quality
locking functions must carefully prevent both the CPU and the compiler
from moving code in either direction across the locking function.
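
To make that last point concrete, here is a deliberately
non-production-quality user-space sketch built on C11 atomics rather than
on any kernel primitive.  The names toy_lock, toy_lock_acquire(),
toy_lock_try(), and toy_lock_release() are hypothetical, and this is not
how the kernel implements its locks.  The sketch only shows where the
ordering constraints must go:  acquire ordering when taking the lock and
release ordering when dropping it, so that neither the compiler nor the
CPU can move critical-section accesses out of the critical section.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock for illustration only:  no fairness, backoff, or lockdep. */
struct toy_lock {
	atomic_flag held;
};

#define TOY_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Returns true if the lock was acquired, false if it was already held. */
static bool toy_lock_try(struct toy_lock *l)
{
	/* Acquire ordering:  later accesses cannot move before this point. */
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

/* Spin until the lock is acquired. */
static void toy_lock_acquire(struct toy_lock *l)
{
	while (!toy_lock_try(l))
		;	/* Busy-wait; a real lock would do far better. */
}

/* Release ordering:  earlier accesses cannot move after this point. */
static void toy_lock_release(struct toy_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Weakening either ordering to memory_order_relaxed would still compile,
but accesses could then leak out of the critical section, which is one
reason to prefer the kernel's standard primitives.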

Despite the scalability limitations of single-threaded code, RCU
takes this approach for much of its grace-period processing and also
for early-boot operation.  The reason RCU is able to scale despite
single-threaded grace-period processing is the use of batching, where all
updates that accumulated during one grace period are handled by the
next one.  In other words, slowing down grace-period processing makes
it more efficient.  Nor is RCU unique:  Similar batching optimizations
are used in many I/O operations.
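
The batching effect can be sketched with a toy model, which is in no
way RCU's actual implementation.  The model assumes that exactly one
update arrives per tick, that each grace period lasts a fixed number of
ticks, and that the updates arriving during one grace period are handled
when the next one ends.  The names batch_stats and simulate_batching()
are hypothetical.

```c
#include <assert.h>

struct batch_stats {
	unsigned long grace_periods;	/* Grace periods completed. */
	unsigned long updates_handled;	/* Updates processed so far. */
};

/*
 * Toy model:  one update arrives per tick, and each grace period lasts
 * gp_ticks ticks.  Updates that arrive during one grace period are
 * handled at the end of the following one.
 */
static struct batch_stats simulate_batching(unsigned long total_ticks,
					    unsigned long gp_ticks)
{
	struct batch_stats s = { 0, 0 };
	unsigned long arriving = 0;	/* Arrived during current grace period. */
	unsigned long waiting = 0;	/* Arrived during previous grace period. */

	assert(gp_ticks > 0);
	for (unsigned long tick = 1; tick <= total_ticks; tick++) {
		arriving++;
		if (tick % gp_ticks == 0) {
			/* Grace period ends:  handle the previous batch. */
			s.grace_periods++;
			s.updates_handled += waiting;
			waiting = arriving;
			arriving = 0;
		}
	}
	return s;
}
```

In this model, simulate_batching(1000, 10) completes 100 grace periods
to handle 990 updates, whereas simulate_batching(1000, 100) completes
only 10 grace periods to handle 900 updates.  Each of the longer grace
periods therefore covers ten times as many updates, at the cost of each
update waiting longer to be handled.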

Annotation

Implementation Notes