Architecture Map
Imported from _research/manual-study-linux/architecture-map.md.
Linux Architecture Map
This document accumulates confirmed architecture facts from reviewed source
notes. Do not add speculative design ideas here; put Rust translation in
rust-translation.md and AI-native interpretation in ai-native-systems.md.
Confirmed System Shape
Rust Kernel API Membrane
The in-kernel Rust surface is intentionally centralized in the kernel crate.
Rust modules depend on this crate, and missing C APIs are expected to be
wrapped there before use. The crate is no_std, checks for CONFIG_RUST, and
feature-gates subsystem wrappers at the root module boundary.
Evidence: file-notes/linux__rust__kernel__lib.rs.md.
Refcounted And Pinned Ownership
Linux’s Rust ownership layer does not treat Arc as a generic userspace
utility. Kernel Arc is backed by kernel Refcount, has no weak references,
saturates counts, pins data implicitly, and uses companion types for borrowed
and unique states. ArcBorrow avoids extra indirection where a refcount-capable
borrow is enough; UniqueArc models exclusive pre-publication ownership.
Evidence: file-notes/linux__rust__kernel__sync__arc.rs.md.
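The saturating behaviour is easier to see in a toy model. The sketch below is illustrative Python, not the kernel API: the class name, the `limit` parameter, and the return convention of `dec` are assumptions made for the example. It shows only the property noted above, that a count which hits its ceiling stays pinned there and the object is leaked rather than freed.

```python
class SaturatingRefcount:
    """Toy saturating counter (hypothetical; not the kernel Refcount type)."""

    def __init__(self, limit=2**31):
        self.limit = limit  # hypothetical saturation point for this sketch
        self.count = 1      # the creator holds the first reference

    def inc(self):
        if self.count < self.limit:
            self.count += 1
        # At the limit the count stays pinned instead of wrapping around.

    def dec(self):
        """Drop one reference; return True when the object may be freed."""
        if self.count >= self.limit:
            return False  # saturated: never reaches zero, so the object leaks
        if self.count == 0:
            raise RuntimeError("refcount underflow")
        self.count -= 1
        return self.count == 0
```

Leaking on saturation trades a bounded memory loss for the guarantee that an overflowed count can never wrap to zero and trigger a premature free.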
Current Context Versus Durable Task Handle
The Rust task wrapper splits current-context access from long-lived task
handles. CurrentTask is scoped to the active task context and made
non-thread-safe; durable task references use ARef<Task> and the C
get_task_struct / put_task_struct lifetime model.
Evidence: file-notes/linux__rust__kernel__task.rs.md.
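As a rough illustration of the split, the Python sketch below models a scoped current-task view next to a durable counted handle. All names here (`Task`, `CurrentView`, `TaskRef`, `current`, `to_durable`) are hypothetical, and the `usage` counter only stands in for the get_task_struct / put_task_struct pairing; Python cannot express the non-thread-safe marker, so the sketch enforces only the scoping.

```python
from contextlib import contextmanager


class Task:
    def __init__(self, pid):
        self.pid = pid
        self.usage = 1  # reference held by the toy "kernel" itself


class TaskRef:
    """Durable handle: takes a reference on creation, drops it on release."""

    def __init__(self, task):
        task.usage += 1  # stands in for get_task_struct
        self._task = task

    @property
    def pid(self):
        if self._task is None:
            raise RuntimeError("handle already released")
        return self._task.pid

    def release(self):
        if self._task is not None:
            self._task.usage -= 1  # stands in for put_task_struct
            self._task = None


class CurrentView:
    """Scoped view of the running task: takes no reference of its own."""

    def __init__(self, task):
        self._task = task

    @property
    def pid(self):
        if self._task is None:
            raise RuntimeError("current-task view used outside its scope")
        return self._task.pid

    def to_durable(self):
        return TaskRef(self._task)


@contextmanager
def current(task):
    view = CurrentView(task)
    try:
        yield view
    finally:
        view._task = None  # the view is invalid once the scope ends
```

A caller that needs the task beyond the current scope converts the view with `to_durable()` and later calls `release()`; the view itself stops working as soon as the `with` block exits.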
Staged Process Creation
Linux task creation is a staged pipeline: validate clone flags and namespace constraints, duplicate task state, copy or share resources, reserve pid and scheduler/cgroup state, make the task visible, then wake it. Error labels unwind initialized subsystems before visibility; after the no-failure point, publication proceeds through locks and post-fork hooks.
Evidence: file-notes/linux__kernel__fork.c.md.
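The unwind discipline can be sketched as a list of stages, each paired with its own cleanup. This is a minimal Python model under stated assumptions: `create_task`, the stage tuples, and the `publish` callback are hypothetical names, and the real code uses goto-style error labels rather than a list of callbacks.

```python
def create_task(stages, publish):
    """Run (name, setup, cleanup) stages in order.

    If a setup step raises, the cleanups of the stages that already
    completed run in reverse order and the error propagates. `publish`
    runs only after every stage has succeeded.
    """
    done = []
    for name, setup, cleanup in stages:
        try:
            setup()
        except Exception:
            for _, undo in reversed(done):
                undo()  # unwind only what was actually initialized
            raise
        done.append((name, cleanup))
    publish()  # past this point the sketch treats failure as impossible
```

Keeping every fallible step ahead of `publish` means no other part of the system can observe a half-built task.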
Per-CPU Runqueues And Scheduler Classes
Linux scheduler state is centered on per-CPU struct rq instances. Each
runqueue embeds fair, RT, and deadline queues, tracks hot current/runnable
state, and is protected by explicit runqueue locking rules. Scheduling policy is
factored through struct sched_class, a C operation table whose callbacks also
document lock preconditions.
Evidence: file-notes/linux__kernel__sched__sched.h.md,
file-notes/linux__kernel__sched__core.c.md.
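The operation-table shape can be illustrated with a small Python model. Everything here is a simplification: `SchedClass`, `RunQueue`, and the shared FIFO callbacks are hypothetical, locking is omitted, and the real classes use very different internal structures. The sketch shows only two things: one runqueue per CPU, and class dispatch that walks the table from highest to lowest priority.

```python
from collections import deque


class SchedClass:
    """Operation table: one set of callbacks per scheduling policy."""

    def __init__(self, name, enqueue, pick_next):
        self.name = name
        self.enqueue = enqueue
        self.pick_next = pick_next


def fifo_enqueue(queue, task):
    queue.append(task)


def fifo_pick(queue):
    return queue.popleft() if queue else None


# Highest-priority class first. All three share FIFO callbacks here only to
# keep the sketch short.
SCHED_CLASSES = [
    SchedClass("deadline", fifo_enqueue, fifo_pick),
    SchedClass("rt", fifo_enqueue, fifo_pick),
    SchedClass("fair", fifo_enqueue, fifo_pick),
]


class RunQueue:
    """Per-CPU runqueue embedding one queue per scheduling class."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.queues = {cls.name: deque() for cls in SCHED_CLASSES}

    def enqueue(self, class_name, task):
        cls = next(c for c in SCHED_CLASSES if c.name == class_name)
        cls.enqueue(self.queues[cls.name], task)

    def pick_next(self):
        # Ask each class in priority order; the first one with work wins.
        for cls in SCHED_CLASSES:
            task = cls.pick_next(self.queues[cls.name])
            if task is not None:
                return task
        return None


runqueues = [RunQueue(cpu) for cpu in range(2)]
```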
Scheduler Lifecycle Pipeline
The core scheduler owns task lifecycle transitions. Fork-time setup keeps a
task in TASK_NEW; cgroup fork setup attaches task-group and CPU state;
wake_up_new_task() publishes the first runnable state; schedule() enters
__schedule() under preemption discipline; and context_switch() performs
memory-map and architecture register/stack handoff before post-switch cleanup.
Evidence: file-notes/linux__kernel__sched__core.c.md.
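A toy state model makes the publication step explicit. The Python below borrows the names TASK_NEW and wake_up_new_task from the notes above, but the bodies are stand-ins: `attach_cpu` is a hypothetical helper, the runqueue is a plain list, and the context switch itself is not modelled.

```python
TASK_NEW = "TASK_NEW"
TASK_RUNNING = "TASK_RUNNING"


class Task:
    def __init__(self, name):
        self.name = name
        self.state = TASK_NEW  # fork-time setup leaves the task unpublished
        self.cpu = None


def attach_cpu(task, cpu):
    """Hypothetical stand-in for the fork-time CPU/task-group attachment."""
    task.cpu = cpu


def wake_up_new_task(task, runqueues):
    """Publish the first runnable state and place the task on a runqueue."""
    if task.state != TASK_NEW:
        raise ValueError("task was already published")
    task.state = TASK_RUNNING
    runqueues[task.cpu].append(task)


def schedule(runqueues, cpu):
    """Pick the next task on this CPU, or None when the queue is empty."""
    rq = runqueues[cpu]
    return rq.pop(0) if rq else None
```

Until `wake_up_new_task` runs, `schedule` cannot return the new task, which is the point of holding it in TASK_NEW.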
Fair Scheduling Uses EEVDF Accounting
The fair class tracks virtual runtime, lag, and virtual deadlines. A fair task is eligible when its lag is non-negative, meaning it is still owed service, and among eligible tasks the earliest virtual deadline wins. Linux implements this with augmented RB-tree state plus runtime/deadline accounting on the current scheduling entity.
Evidence: file-notes/linux__kernel__sched__fair.c.md,
file-notes/linux__Documentation__scheduler__sched-eevdf.rst.md.
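The selection rule can be written down in a few lines. This is a sketch under stated assumptions: `Entity` and `pick_eevdf` are hypothetical names, values are plain integers, and a linear scan replaces the augmented RB-tree. Eligibility is tested by comparing an entity's vruntime against the weighted average vruntime, cross-multiplied so that no division is needed.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    name: str
    weight: int
    vruntime: int  # virtual runtime consumed so far
    deadline: int  # virtual deadline


def pick_eevdf(entities):
    """Return the eligible entity with the earliest virtual deadline."""
    total_weight = sum(e.weight for e in entities)
    weighted_sum = sum(e.weight * e.vruntime for e in entities)
    # vruntime <= weighted average  <=>  vruntime * total_weight <= weighted_sum
    eligible = [e for e in entities if e.vruntime * total_weight <= weighted_sum]
    return min(eligible, key=lambda e: e.deadline)


tasks = [
    Entity("a", weight=1, vruntime=10, deadline=30),
    Entity("b", weight=1, vruntime=20, deadline=22),
    Entity("c", weight=2, vruntime=12, deadline=25),
]
```

With these numbers the weighted average vruntime is 13.5, so `b` is ineligible despite having the earliest deadline, and `pick_eevdf(tasks)` selects `c`.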
RT And Deadline Classes Are Distinct Policy Engines
Real-time scheduling uses priority arrays, bitmaps, FIFO lists, and runtime throttling. Deadline scheduling uses EDF plus Constant Bandwidth Server, deadline-ordered RB trees, bandwidth counters, throttling, replenishment, and parameter validation.
Evidence: file-notes/linux__kernel__sched__rt.c.md,
file-notes/linux__kernel__sched__deadline.c.md,
file-notes/linux__Documentation__scheduler__sched-rt-group.rst.md,
file-notes/linux__Documentation__scheduler__sched-deadline.rst.md.
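For the real-time half, the bitmap-plus-FIFO structure can be sketched as follows. This is illustrative Python, not kernel code: `RtQueue` is a hypothetical name, a lower index means higher priority, picking also dequeues to keep the model short, and throttling is left out. The deadline class is not modelled here.

```python
from collections import deque

MAX_RT_PRIO = 100  # number of priority levels in this sketch


class RtQueue:
    def __init__(self):
        self.bitmap = 0  # bit p is set while lists[p] is non-empty
        self.lists = [deque() for _ in range(MAX_RT_PRIO)]

    def enqueue(self, prio, task):
        self.lists[prio].append(task)  # FIFO order within one priority
        self.bitmap |= 1 << prio

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit to find the highest-priority level.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.lists[prio].popleft()
        if not self.lists[prio]:
            self.bitmap &= ~(1 << prio)
        return task
```

The bitmap lets the pick step skip empty priority levels without scanning every list.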
Address Spaces, VMAs, And Fault Descriptors
Linux memory management separates process address-space state (mm_struct),
virtual memory region state (vm_area_struct), and per-fault state
(struct vm_fault). VMAs carry range, permissions, backing object, callbacks,
anonymous state, and lock/refcount state. Fault results are bitmasks so the
fault path can distinguish OOM, retry, SIGBUS/SIGSEGV, COW completion, and
lock-release completion.
Evidence: file-notes/linux__include__linux__mm_types.h.md,
file-notes/linux__include__linux__mm.h.md.
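A bitmask result lets one return value carry several independent outcomes. The Python sketch below mirrors that idea with `enum.IntFlag`; the flag names echo the outcomes listed above, but the numeric values, the `ERROR_MASK` grouping, and the `classify` helper are assumptions made for illustration.

```python
from enum import IntFlag, auto


class FaultResult(IntFlag):
    OOM = auto()
    SIGBUS = auto()
    SIGSEGV = auto()
    RETRY = auto()
    DONE_COW = auto()
    COMPLETED = auto()  # handled, and the lock was already released


# Hypothetical grouping of the outcomes that the caller treats as failures.
ERROR_MASK = FaultResult.OOM | FaultResult.SIGBUS | FaultResult.SIGSEGV


def classify(result):
    """Map a fault-result bitmask to the caller's next action."""
    if result & ERROR_MASK:
        return "error"
    if result & FaultResult.RETRY:
        return "retry"
    if result & FaultResult.COMPLETED:
        return "done, lock released"
    return "done"
```

Because the bits are independent, a caller can test the error group first and still see that, for example, a retry was also requested.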
Page Faults Are Typed Control Flow
The page-fault path is a staged control flow: validate access, enter memcg fault context, walk/allocate page-table levels, attempt huge-page paths, fall back to PTE dispatch, then handle missing, swap, NUMA, write-protect, anonymous, file-backed, shared, and COW cases. PTE mutation is protected by page-table locks and race rechecks.
Evidence: file-notes/linux__mm__memory.c.md.
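The PTE-level dispatch at the end of that pipeline can be sketched as a decision function. This is a simplified Python model: the `Pte` fields, the `dispatch` function, and the returned labels are hypothetical, and page-table locking, race rechecks, and the huge-page paths are omitted.

```python
from dataclasses import dataclass


@dataclass
class Pte:
    none: bool = False      # no entry installed at all
    present: bool = False   # entry maps a resident page
    protnone: bool = False  # present but marked for NUMA hinting
    writable: bool = False


def dispatch(pte, vma_is_anonymous, is_write):
    """Choose a handler label for one fault, checking cases in order."""
    if pte.none:
        # Nothing mapped yet: the kind of VMA decides how to fill it.
        return "anonymous" if vma_is_anonymous else "file-backed"
    if not pte.present:
        return "swap"  # entry exists but the page is not resident
    if pte.protnone:
        return "numa"
    if is_write and not pte.writable:
        return "write-protect"  # may end in a copy-on-write
    return "update-access-bits"
```

The order of the checks matters: a missing entry is resolved before residency, and residency before permission.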
Slab Caches Are Validated Object Allocators
Linux uses named slab caches for repeated fixed-size kernel objects. Cache
creation validates context, flags, usercopy ranges, merge policy, alignment,
and aliases under slab_mutex. Cache destruction waits for deferred RCU/free
activity and verifies object teardown before releasing cache metadata.
Evidence: file-notes/linux__mm__slab_common.c.md,
file-notes/linux__Documentation__core-api__memory-allocation.rst.md.
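A toy registry shows the validate, merge, and verified-teardown steps together. The Python below is a sketch, not the slab API: `CacheRegistry`, `ObjectCache`, the merge key, and the error types are assumptions, and locking, usercopy ranges, and deferred RCU frees are not modelled.

```python
class ObjectCache:
    def __init__(self, name, object_size, align):
        self.name = name
        self.align = align
        self.object_size = -(-object_size // align) * align  # round up
        self.refcount = 1  # one per create() call that resolved to this cache
        self.live = 0      # objects currently allocated

    def alloc(self):
        self.live += 1
        return bytearray(self.object_size)

    def free(self, obj):
        assert len(obj) == self.object_size
        self.live -= 1


class CacheRegistry:
    def __init__(self):
        self._caches = {}  # (object_size, align) -> ObjectCache

    def create(self, name, size, align=8):
        if not name or size <= 0:
            raise ValueError("cache needs a name and a positive size")
        if align <= 0 or align & (align - 1):
            raise ValueError("alignment must be a power of two")
        candidate = ObjectCache(name, size, align)
        key = (candidate.object_size, align)
        existing = self._caches.get(key)
        if existing is not None:
            existing.refcount += 1  # compatible cache: hand out an alias
            return existing
        self._caches[key] = candidate
        return candidate

    def destroy(self, cache):
        """Return True once the cache metadata has actually been released."""
        if cache.refcount > 1:
            cache.refcount -= 1  # other aliases still use this cache
            return False
        if cache.live:
            raise RuntimeError("objects still allocated from " + cache.name)
        cache.refcount = 0
        del self._caches[(cache.object_size, cache.align)]
        return True
```

For example, creating a 20-byte cache and then a 24-byte cache with the same 8-byte alignment yields one shared cache with two references, and the final `destroy` refuses to proceed while any object is still live.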
AI Contribution Provenance
Linux process guidance treats AI assistance as provenance, not authorship certification. AI tools follow normal kernel development process, humans retain DCO responsibility, and AI assistance is attributed with agent/model/tool metadata.
Evidence: file-notes/linux__Documentation__process__coding-assistants.rst.md.
Cross-Cutting Patterns To Track
- Table-driven subsystem registration.
- Operation tables and late binding.
- Reference-counted object lifetimes.
- Per-CPU state and locality.
- Locking hierarchy and RCU read-side behavior.
- User/kernel boundary validation.
- Capability and namespace checks.
- Build-time feature selection.
- Runtime observability hooks.
Evidence Index
- file-notes/linux__rust__kernel__lib.rs.md
- file-notes/linux__rust__kernel__sync__arc.rs.md
- file-notes/linux__rust__kernel__task.rs.md
- file-notes/linux__kernel__fork.c.md
- file-notes/linux__kernel__sched__sched.h.md
- file-notes/linux__kernel__sched__core.c.md
- file-notes/linux__kernel__sched__fair.c.md
- file-notes/linux__kernel__sched__rt.c.md
- file-notes/linux__kernel__sched__deadline.c.md
- file-notes/linux__Documentation__scheduler__sched-design-CFS.rst.md
- file-notes/linux__Documentation__scheduler__sched-eevdf.rst.md
- file-notes/linux__Documentation__scheduler__sched-rt-group.rst.md
- file-notes/linux__Documentation__scheduler__sched-deadline.rst.md
- file-notes/linux__include__linux__mm_types.h.md
- file-notes/linux__include__linux__mm.h.md
- file-notes/linux__mm__memory.c.md
- file-notes/linux__mm__slab_common.c.md
- file-notes/linux__Documentation__admin-guide__mm__concepts.rst.md
- file-notes/linux__Documentation__core-api__memory-allocation.rst.md
- file-notes/linux__Documentation__process__coding-assistants.rst.md