linux/kernel/fork.c
Imported from
_research/manual-study-linux/file-notes/linux__kernel__fork.c.md.
File Notes: kernel/fork.c
Status: reviewed.
Purpose
kernel/fork.c implements Linux task/process creation and related copy,
sharing, validation, publication, and cleanup paths. It is the practical
center of task lifecycle construction: fork, vfork, legacy clone,
clone3, kernel threads, user-mode threads, and io worker threads converge on
kernel_clone / copy_process.
Key Types And Functions
- fork_init: initializes task allocation caches and global task limits.
- dup_task_struct: allocates and prepares a new task_struct plus stack.
- dup_mm / copy_mm: share or duplicate memory context.
- copy_fs, copy_files, copy_sighand, copy_signal, copy_seccomp: clone or share process resources.
- copy_process: validates clone flags, allocates task state, performs subsystem setup, publishes the task, and contains the unwind path.
- kernel_clone: common routine that calls copy_process, wakes the task, and handles vfork/ptrace details.
- kernel_thread, user_mode_thread, fork, vfork, legacy clone, and clone3: entrypoints that package arguments for kernel_clone.
Data Flow
Task creation starts by validating clone flag combinations and namespace/thread
constraints. copy_process duplicates the task structure, copies credentials,
checks process limits, initializes task-local state, calls each subsystem’s
fork hook, copies or shares resources according to clone flags, allocates a
pid, performs cgroup and scheduler checks before visibility, then publishes the
task under tasklist and signal locks.
kernel_clone wraps that construction, records tracing metadata, optionally
sets vfork completion state, wakes the new task, reports ptrace events, waits
for vfork completion if needed, then returns the visible pid.
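The ordering in kernel_clone can be modelled in user space as a minimal sketch. Everything here is an assumption made for illustration: the function name clone_with_vfork_wait, the use of a thread as the "child", and a channel standing in for the vfork completion. It shows only that the completion is set up before the child is woken and that the parent blocks on it before returning the pid.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical model of the kernel_clone ordering: construct, arm the
/// vfork completion, wake the child, wait if requested, return the pid.
/// Returns a log of the steps in the order they happened.
pub fn clone_with_vfork_wait(pid: u32, vfork: bool) -> Vec<String> {
    let mut log = vec![format!("constructed task {}", pid)];

    // The completion must exist before the child can run, otherwise the
    // child could signal it before the parent is able to wait.
    let (done_tx, done_rx) = mpsc::channel::<&'static str>();
    let completion = if vfork { Some(done_rx) } else { None };

    // "Wake" the child: here a thread that signals once it no longer
    // needs the parent's memory context.
    log.push("woke child".to_string());
    let child = thread::spawn(move || {
        // The send fails harmlessly if nobody is waiting (non-vfork case).
        let _ = done_tx.send("child released mm");
    });

    // Only a vfork-style parent blocks until the child signals.
    if let Some(rx) = completion {
        let msg = rx.recv().expect("child dropped the completion");
        log.push(msg.to_string());
    }

    // Joining is thread hygiene for this sketch, not part of the protocol.
    child.join().expect("child thread panicked");

    log.push(format!("returned pid {}", pid));
    log
}
```

With vfork set, the "child released mm" step always appears before the pid is returned; without it, the parent returns without waiting on the completion.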
Invariants And Safety Contracts
- Some clone flag combinations are invalid because they would share state across incompatible namespaces or process/thread boundaries.
- Thread groups imply shared signal handlers and shared VM: CLONE_THREAD requires CLONE_SIGHAND, which in turn requires CLONE_VM.
- New tasks are heavily initialized before they become visible.
- cgroup and scheduler placement happen before the task is externally visible.
- After the “no more failure paths” point, publication proceeds through pid attachment, process tree insertion, thread counters, and tracing/post-fork hooks.
- The error path unwinds the initialized subsystems in roughly the reverse order of their setup.
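The first two invariants can be illustrated with a small flag validator. This is a sketch covering only a subset of the checks in copy_process, not a full port: the flag values match the Linux UAPI definitions, while the FlagError type and the validate_clone_flags name are hypothetical.

```rust
// Flag values as defined in the Linux UAPI headers.
pub const CLONE_VM: u64 = 0x0000_0100;
pub const CLONE_FS: u64 = 0x0000_0200;
pub const CLONE_SIGHAND: u64 = 0x0000_0800;
pub const CLONE_THREAD: u64 = 0x0001_0000;
pub const CLONE_NEWNS: u64 = 0x0002_0000;
pub const CLONE_NEWUSER: u64 = 0x1000_0000;

/// Hypothetical error type; the kernel reports all of these as EINVAL.
#[derive(Debug, PartialEq, Eq)]
pub enum FlagError {
    /// A new mount or user namespace cannot share filesystem info.
    NewNamespaceWithSharedFs,
    /// A thread group must share signal handlers.
    ThreadWithoutSighand,
    /// Shared signal handlers require a shared address space.
    SighandWithoutVm,
}

/// Validate a subset of clone flag combinations before any allocation.
pub fn validate_clone_flags(flags: u64) -> Result<(), FlagError> {
    // True if any bit of `mask` is set in `flags`.
    let has = |mask: u64| (flags & mask) != 0;

    if has(CLONE_NEWNS | CLONE_NEWUSER) && has(CLONE_FS) {
        return Err(FlagError::NewNamespaceWithSharedFs);
    }
    if has(CLONE_THREAD) && !has(CLONE_SIGHAND) {
        return Err(FlagError::ThreadWithoutSighand);
    }
    if has(CLONE_SIGHAND) && !has(CLONE_VM) {
        return Err(FlagError::SighandWithoutVm);
    }
    Ok(())
}
```

Because the validator takes only the flags, it can run before anything is allocated, so a rejected request never needs an unwind path.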
Rust Translation Guidance
A Rust reimplementation should model process creation as a staged builder with typestate transitions:
- validated clone request;
- allocated but unpublished task;
- resources copied/shared;
- pid and scheduler/cgroup placement reserved;
- visible but not running;
- running or fully unwound.
The current C function is explicit about every subsystem hook and cleanup label. In Rust, that should become RAII guards and commit points, not a single large function with gotos. Clone flags should become typed capability/namespace requests with validation before allocation.
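A minimal sketch of the guard-and-commit idea follows. It collapses the stages above into two typestates, and every name in it (Unpublished, Published, Guard, CleanupLog, setup, publish) is hypothetical. The subsystem "setup" is simulated by a boolean, and cleanup is recorded in a log so the unwind order is observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared record of which cleanups ran; exists only to make the sketch observable.
pub type CleanupLog = Rc<RefCell<Vec<&'static str>>>;

/// RAII guard for one subsystem's setup; records its cleanup on drop unless disarmed.
struct Guard {
    name: &'static str,
    log: CleanupLog,
    armed: bool,
}

impl Drop for Guard {
    fn drop(&mut self) {
        if self.armed {
            self.log.borrow_mut().push(self.name);
        }
    }
}

/// Typestate: allocated but not yet visible. Failure is still allowed here.
pub struct Unpublished {
    guards: Vec<Guard>,
    log: CleanupLog,
}

/// Typestate: visible. It can only be obtained through `publish`.
pub struct Published {
    pub pid: u32,
}

impl Unpublished {
    pub fn new(log: CleanupLog) -> Self {
        Unpublished { guards: Vec::new(), log }
    }

    /// Simulated subsystem hook: `ok == false` stands in for a setup failure.
    pub fn setup(mut self, name: &'static str, ok: bool) -> Result<Self, &'static str> {
        if !ok {
            return Err(name); // `self` drops here and unwinds earlier guards
        }
        let log = Rc::clone(&self.log);
        self.guards.push(Guard { name, log, armed: true });
        Ok(self)
    }

    /// Commit point: disarm every guard, after which nothing unwinds.
    pub fn publish(mut self, pid: u32) -> Published {
        for guard in &mut self.guards {
            guard.armed = false;
        }
        Published { pid }
    }
}

impl Drop for Unpublished {
    fn drop(&mut self) {
        // Pop so that guards are released in reverse order of setup.
        while let Some(guard) = self.guards.pop() {
            drop(guard);
        }
    }
}
```

Chaining setup calls with and_then gives the early-exit behavior of the C cleanup labels: the first failing step drops the builder, and the guards installed so far are released in reverse order. A Vec drops its elements front to back by default, which is why the explicit Drop implementation pops from the end.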
AI-Native Systems Guidance
This file is a model for safe agent job creation. AI jobs should not become visible/runnable until policy, resource limits, namespace/capability checks, scheduler placement, observability hooks, and cleanup guards are installed. The fork path also shows why cancellation points must be explicit: before publication, failures unwind; after publication, the system must use runtime termination/cleanup protocols.
Evidence
- kernel/fork.c:854-897: fork_init task cache, limits, and init setup.
- kernel/fork.c:914-1018: dup_task_struct allocation, stack setup, refcount setup, and cleanup labels.
- kernel/fork.c:1522-1598: dup_mm and copy_mm memory sharing/duplication.
- kernel/fork.c:1616-1665: copy_fs and copy_files share-or-copy logic.
- kernel/fork.c:1667-1770: signal handler and signal state setup.
- kernel/fork.c:1772-1795: seccomp copy timing and lock requirement.
- kernel/fork.c:1989-2089: clone flag, namespace, thread, pidfd, and privilege validation.
- kernel/fork.c:2100-2259: signal ordering, task duplication, credentials, limits, cgroup fork setup, and scheduler setup.
- kernel/fork.c:2262-2305: subsystem alloc/copy hooks and pid allocation.
- kernel/fork.c:2386-2407: cgroup permission and scheduler-cgroup placement before visibility.
- kernel/fork.c:2420-2558: visibility, pid attachment, final no-failure phase, post-fork hooks, and return.
- kernel/fork.c:2560-2618: cleanup/unwind labels.
- kernel/fork.c:2671-3048: io thread, kernel_clone, kernel/user threads, fork, vfork, legacy clone, and clone3 entrypoints.