
linux/kernel/fork.c

Imported from _research/manual-study-linux/file-notes/linux__kernel__fork.c.md.

File Notes: kernel/fork.c

Status: reviewed.

Purpose

kernel/fork.c implements Linux task/process creation and the related copy, sharing, validation, publication, and cleanup paths. It is the practical center of task lifecycle construction: fork, vfork, legacy clone, clone3, kernel threads, and user-mode threads all go through kernel_clone, which calls copy_process; io worker threads call copy_process directly.

Key Types And Functions

  • fork_init: initializes task allocation caches and global task limits.
  • dup_task_struct: allocates and prepares a new task_struct plus stack.
  • dup_mm / copy_mm: copy_mm shares the parent's memory context when CLONE_VM is set and otherwise calls dup_mm to duplicate it.
  • copy_fs, copy_files, copy_sighand, copy_signal, copy_seccomp: clone or share process resources.
  • copy_process: validates clone flags, allocates task state, performs subsystem setup, publishes the task, and contains the unwind path.
  • kernel_clone: common routine that calls copy_process, wakes the task, and handles vfork/ptrace details.
  • kernel_thread, user_mode_thread, fork, vfork, legacy clone, and clone3: entrypoints that package arguments for kernel_clone.

Data Flow

Task creation starts by validating clone flag combinations and namespace/thread constraints. copy_process duplicates the task structure, copies credentials, checks process limits, initializes task-local state, calls each subsystem’s fork hook, copies or shares resources according to clone flags, allocates a pid, performs cgroup and scheduler checks before visibility, then publishes the task under tasklist and signal locks.

kernel_clone wraps that construction: it records tracing metadata, optionally sets up vfork completion state, wakes the new task, reports ptrace events, waits for vfork completion if needed, and then returns the new task's pid as seen from the caller's pid namespace.
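
The vfork wait described above can be modeled in user space as a one-shot completion. The following is a minimal sketch, not kernel code: `Completion` and `clone_with_vfork_wait` are hypothetical names, a thread stands in for the child task, and a `Mutex`/`Condvar` pair stands in for the kernel's completion mechanism.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// One-shot completion (hypothetical user-space stand-in).
pub struct Completion {
    done: Mutex<bool>,
    cv: Condvar,
}

impl Completion {
    pub fn new() -> Self {
        Completion { done: Mutex::new(false), cv: Condvar::new() }
    }

    /// Called by the child when the parent may resume.
    pub fn complete(&self) {
        *self.done.lock().unwrap() = true;
        self.cv.notify_all();
    }

    /// Called by the parent; blocks until `complete` has run.
    pub fn wait(&self) {
        let mut done = self.done.lock().unwrap();
        while !*done {
            done = self.cv.wait(done).unwrap();
        }
    }
}

/// Starts a "child" and, if `vfork` is set, blocks the "parent"
/// until the child signals completion. Returns the event order.
pub fn clone_with_vfork_wait(vfork: bool) -> Vec<&'static str> {
    let log = Arc::new(Mutex::new(Vec::new()));
    let completion = Arc::new(Completion::new());

    let child_log = Arc::clone(&log);
    let child_done = Arc::clone(&completion);
    // "Wake the new task": the child starts running here.
    let child = thread::spawn(move || {
        child_log.lock().unwrap().push("child signalled completion");
        child_done.complete();
    });

    if vfork {
        // The parent does not proceed until the child has completed.
        completion.wait();
    }
    log.lock().unwrap().push("parent resumed");

    child.join().unwrap();
    let events = log.lock().unwrap().clone();
    events
}
```

With `vfork` set, the parent's entry always follows the child's. Without it, the two entries may appear in either order, which is the difference the completion state exists to remove.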

Invariants And Safety Contracts

  • Some clone flag combinations are invalid because they would share state across incompatible namespaces or process/thread boundaries.
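  • Thread groups imply shared signal handlers, and shared signal handlers imply shared VM: CLONE_THREAD requires CLONE_SIGHAND, which in turn requires CLONE_VM.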
  • New tasks are heavily initialized before they become visible.
  • cgroup and scheduler placement happen before the task is externally visible.
  • After the “no more failure paths” point, publication proceeds through pid attachment, process tree insertion, thread counters, and tracing/post-fork hooks.
  • The error path unwinds each initialized subsystem in roughly reverse dependency order.
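
The first two invariants can be illustrated with a small validator. This is a sketch covering only a subset of the checks. The flag values match the Linux UAPI definitions; `FlagError` and `validate_clone_flags` are hypothetical names, and the kernel itself reports these cases as an invalid-argument error rather than a typed enum.

```rust
// Flag values as defined in the Linux UAPI headers.
pub const CLONE_VM: u64 = 0x0000_0100;
pub const CLONE_FS: u64 = 0x0000_0200;
pub const CLONE_SIGHAND: u64 = 0x0000_0800;
pub const CLONE_THREAD: u64 = 0x0001_0000;
pub const CLONE_NEWNS: u64 = 0x0002_0000;
pub const CLONE_NEWUSER: u64 = 0x1000_0000;

/// Hypothetical typed errors for the rejected combinations.
#[derive(Debug, PartialEq, Eq)]
pub enum FlagError {
    /// A new mount or user namespace cannot share fs state.
    NewNamespaceWithSharedFs,
    /// A thread group must share signal handlers.
    ThreadWithoutSighand,
    /// Shared signal handlers require a shared address space.
    SighandWithoutVm,
}

/// Checks a subset of the flag-combination rules before any allocation.
pub fn validate_clone_flags(flags: u64) -> Result<(), FlagError> {
    let has = |f: u64| (flags & f) != 0;

    if (has(CLONE_NEWNS) || has(CLONE_NEWUSER)) && has(CLONE_FS) {
        return Err(FlagError::NewNamespaceWithSharedFs);
    }
    if has(CLONE_THREAD) && !has(CLONE_SIGHAND) {
        return Err(FlagError::ThreadWithoutSighand);
    }
    if has(CLONE_SIGHAND) && !has(CLONE_VM) {
        return Err(FlagError::SighandWithoutVm);
    }
    Ok(())
}
```

For example, `validate_clone_flags(CLONE_VM | CLONE_SIGHAND | CLONE_THREAD)` is accepted, while `validate_clone_flags(CLONE_THREAD)` alone is rejected because the thread would not share signal handlers.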

Rust Translation Guidance

A Rust reimplementation should model process creation as a staged builder with typestate transitions:

  • validated clone request;
  • allocated but unpublished task;
  • resources copied/shared;
  • pid and scheduler/cgroup placement reserved;
  • visible but not running;
  • running or fully unwound.

The current C function is explicit about every subsystem hook and cleanup label. In Rust, that should become RAII guards and commit points, not a single large function with gotos. Clone flags should become typed capability/namespace requests with validation before allocation.
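
One possible shape for the staged builder is sketched below. It is an assumption-laden illustration, not a proposed design: `Validated`, `Unpublished`, `Visible`, `Guard`, `create_task`, the three resource names, and the fixed pid 42 are all hypothetical, and the log exists only to make the unwind order observable.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<String>>>;

/// Guard for one subsystem's setup; dropping it while armed undoes the setup.
struct Guard {
    name: &'static str,
    log: Log,
    armed: bool,
}

impl Guard {
    fn acquire(name: &'static str, log: &Log) -> Guard {
        log.borrow_mut().push(format!("setup {}", name));
        Guard { name, log: Rc::clone(log), armed: true }
    }
}

impl Drop for Guard {
    fn drop(&mut self) {
        if self.armed {
            self.log.borrow_mut().push(format!("undo {}", self.name));
        }
    }
}

/// Typestates: each stage is a distinct type, so stages cannot be skipped.
pub struct Validated { log: Log }
pub struct Unpublished { log: Log, guards: Vec<Guard> }
pub struct Visible { pub pid: u32 }

impl Validated {
    pub fn allocate(self) -> Unpublished {
        let guards = vec![Guard::acquire("task_struct", &self.log)];
        Unpublished { log: self.log, guards }
    }
}

impl Unpublished {
    /// Copies or shares one resource; on failure `self` is dropped and unwinds.
    pub fn copy_resource(mut self, name: &'static str, ok: bool) -> Result<Unpublished, String> {
        if !ok {
            self.log.borrow_mut().push(format!("fail {}", name));
            return Err(format!("{} failed", name));
        }
        let guard = Guard::acquire(name, &self.log);
        self.guards.push(guard);
        Ok(self)
    }

    /// Commit point: disarm every guard, then make the task visible.
    pub fn publish(mut self, pid: u32) -> Visible {
        for guard in self.guards.iter_mut() {
            guard.armed = false;
        }
        self.log.borrow_mut().push(format!("publish pid {}", pid));
        Visible { pid }
    }
}

impl Drop for Unpublished {
    fn drop(&mut self) {
        // Pop from the back so cleanup runs in reverse setup order.
        while let Some(guard) = self.guards.pop() {
            drop(guard);
        }
    }
}

fn build(start: Validated, fail_at: Option<&str>) -> Result<Visible, String> {
    let mut task = start.allocate();
    for name in ["mm", "files", "signal"] {
        task = task.copy_resource(name, fail_at != Some(name))?;
    }
    Ok(task.publish(42))
}

/// Runs the pipeline and returns the pid (if published) plus the event log.
pub fn create_task(fail_at: Option<&str>) -> (Option<u32>, Vec<String>) {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let result = build(Validated { log: Rc::clone(&log) }, fail_at);
    let events = log.borrow().clone();
    (result.ok().map(|visible| visible.pid), events)
}
```

Calling `create_task(Some("signal"))` logs the three successful setups, the failure, and then the undo steps for files, mm, and task_struct in that order. The explicit `Drop` on `Unpublished` is a deliberate choice here: a plain `Vec` drops its elements front to back, which would undo the oldest setup first.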

AI-Native Systems Guidance

This file is a model for safe agent job creation. AI jobs should not become visible/runnable until policy, resource limits, namespace/capability checks, scheduler placement, observability hooks, and cleanup guards are installed. The fork path also shows why cancellation points must be explicit: before publication, failures unwind; after publication, the system must use runtime termination/cleanup protocols.
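
The two cancellation regimes can be sketched with a small job registry. Everything here is hypothetical (`Registry`, `JobSpec`, `PendingJob`, `JobState`, and the single memory limit standing in for all resource checks); the point is only that a job holding a reservation is not visible until `publish`, and that cancellation takes a different path on each side of that call.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobState {
    Runnable,
    Terminating,
}

pub struct JobSpec {
    pub memory_mb: u64,
    pub policy_ok: bool,
}

/// A job that passed its checks and holds a reservation but is not yet visible.
pub struct PendingJob {
    memory_mb: u64,
}

pub struct Registry {
    limit_mb: u64,
    reserved_mb: u64,
    next_id: u32,
    visible: HashMap<u32, JobState>,
}

impl Registry {
    pub fn new(limit_mb: u64) -> Self {
        Registry { limit_mb, reserved_mb: 0, next_id: 1, visible: HashMap::new() }
    }

    /// Policy and limit checks run before anything becomes visible.
    pub fn prepare(&mut self, spec: JobSpec) -> Result<PendingJob, &'static str> {
        if !spec.policy_ok {
            return Err("policy denied");
        }
        if self.reserved_mb + spec.memory_mb > self.limit_mb {
            return Err("over memory limit");
        }
        self.reserved_mb += spec.memory_mb;
        Ok(PendingJob { memory_mb: spec.memory_mb })
    }

    /// Pre-publication cancellation: a plain unwind of the reservation.
    pub fn abandon(&mut self, job: PendingJob) {
        self.reserved_mb -= job.memory_mb;
    }

    /// Commit point: after this the job is visible and keeps its reservation.
    pub fn publish(&mut self, _job: PendingJob) -> u32 {
        let id = self.next_id;
        self.next_id += 1;
        self.visible.insert(id, JobState::Runnable);
        id
    }

    /// Post-publication cancellation: a runtime state change, not an unwind.
    /// Releasing the reservation after termination is out of scope here.
    pub fn terminate(&mut self, id: u32) -> bool {
        match self.visible.get_mut(&id) {
            Some(state) => {
                *state = JobState::Terminating;
                true
            }
            None => false,
        }
    }

    pub fn state(&self, id: u32) -> Option<JobState> {
        self.visible.get(&id).copied()
    }

    pub fn reserved_mb(&self) -> u64 {
        self.reserved_mb
    }
}
```

Because `abandon` and `publish` both consume the `PendingJob`, a caller cannot unwind a job that has already been published; the only remaining option is `terminate`.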

Evidence

  • kernel/fork.c:854-897: fork_init task cache, limits, and init setup.
  • kernel/fork.c:914-1018: dup_task_struct allocation, stack setup, refcount setup, and cleanup labels.
  • kernel/fork.c:1522-1598: dup_mm and copy_mm memory sharing/duplication.
  • kernel/fork.c:1616-1665: copy_fs and copy_files share-or-copy logic.
  • kernel/fork.c:1667-1770: signal handler and signal state setup.
  • kernel/fork.c:1772-1795: seccomp copy timing and lock requirement.
  • kernel/fork.c:1989-2089: clone flag, namespace, thread, pidfd, and privilege validation.
  • kernel/fork.c:2100-2259: signal ordering, task duplication, credentials, limits, cgroup fork setup, and scheduler setup.
  • kernel/fork.c:2262-2305: subsystem alloc/copy hooks and pid allocation.
  • kernel/fork.c:2386-2407: cgroup permission and scheduler-cgroup placement before visibility.
  • kernel/fork.c:2420-2558: visibility, pid attachment, final no-failure phase, post-fork hooks, and return.
  • kernel/fork.c:2560-2618: cleanup/unwind labels.
  • kernel/fork.c:2671-3048: io thread, kernel_clone, kernel/user threads, fork, vfork, legacy clone, and clone3 entrypoints.