linux/kernel/sched/rt.c
Imported from _research/manual-study-linux/file-notes/linux__kernel__sched__rt.c.md.
File Notes: kernel/sched/rt.c
Status: reviewed.
Purpose
Implements the real-time scheduling class, which provides SCHED_FIFO/SCHED_RR-style static-priority scheduling. The file covers priority-array queues, runtime throttling, group scheduling, push logic, and the RT class callbacks.
Key Types And Functions
- init_rt_rq(): initializes priority queues and bitmap state.
- init_rt_bandwidth() / RT period timer: runtime budget machinery.
- sched_rt_runtime_exceeded(): throttling check.
- enqueue_task_rt() / dequeue_task_rt(): class queue operations.
- pick_next_rt_entity() / pick_task_rt(): priority selection.
- put_prev_task_rt() / set_next_task_rt(): switch hooks.
- DEFINE_SCHED_CLASS(rt): RT class callback table.
Data Flow
RT runqueues use a priority array: one list per RT priority plus a bitmap for
finding the first active priority. init_rt_rq() initializes the lists, clears
the bitmap, sets a delimiter bit, resets highest-priority state, initializes
pushable task tracking, and starts with no queued RT tasks.
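
The initialization steps above can be sketched in Rust as follows. This is an illustrative model, not the kernel code: `TaskId`, the `Vec`-based pushable list, and the exact field set are assumptions made for the sketch, and `MAX_RT_PRIO` is assumed to be 100.

```rust
use std::collections::VecDeque;

/// Number of RT priority levels (assumed to be 100 for this sketch).
pub const MAX_RT_PRIO: usize = 100;
/// Words needed for MAX_RT_PRIO priority bits plus one delimiter bit.
const BITMAP_WORDS: usize = (MAX_RT_PRIO + 1 + 63) / 64;

/// Hypothetical task handle; the kernel links sched_rt_entity structures instead.
pub type TaskId = u32;

pub struct RtPrioArray {
    pub bitmap: [u64; BITMAP_WORDS],
    pub queues: Vec<VecDeque<TaskId>>,
}

impl RtPrioArray {
    pub fn new() -> Self {
        let mut array = RtPrioArray {
            // All priority bits start cleared.
            bitmap: [0; BITMAP_WORDS],
            // One empty FIFO list per priority level.
            queues: (0..MAX_RT_PRIO).map(|_| VecDeque::new()).collect(),
        };
        // Delimiter bit just past the last priority: the bit search
        // below always finds something, even when no task is queued.
        array.bitmap[MAX_RT_PRIO / 64] |= 1u64 << (MAX_RT_PRIO % 64);
        array
    }

    /// Index of the first set bit; MAX_RT_PRIO means "no active priority".
    pub fn first_set_bit(&self) -> usize {
        for (i, word) in self.bitmap.iter().enumerate() {
            if *word != 0 {
                return i * 64 + word.trailing_zeros() as usize;
            }
        }
        unreachable!("the delimiter bit is always set")
    }
}

pub struct RtRq {
    pub active: RtPrioArray,
    pub highest_prio: usize,
    pub pushable_tasks: Vec<TaskId>,
    pub rt_nr_running: usize,
    pub rt_queued: bool,
    pub rt_throttled: bool,
}

pub fn init_rt_rq() -> RtRq {
    RtRq {
        active: RtPrioArray::new(),
        // Reset to the lowest RT level (numerically largest index).
        highest_prio: MAX_RT_PRIO - 1,
        pushable_tasks: Vec::new(),
        // Start dequeued: no RT tasks are queued yet.
        rt_nr_running: 0,
        rt_queued: false,
        rt_throttled: false,
    }
}
```

On a freshly initialized queue, `first_set_bit()` returns `MAX_RT_PRIO`, which callers can treat as "nothing runnable" without a separate emptiness check.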
Enqueue/dequeue operations manipulate sched_rt_entity membership in these
priority lists. Group scheduling walks nested RT entities so parent groups
reflect child activity. The top RT runqueue contributes its task count to the
generic rq->nr_running only when it is queued and not throttled.
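
A minimal sketch of the top-level accounting rule, assuming simplified `Rq` and `RtRq` structs that carry only the counters and flags named above. The function names mirror the kernel's top-level helpers, but the signatures and bodies are a reduced model for illustration.

```rust
/// Reduced model of the generic per-CPU runqueue.
pub struct Rq {
    pub nr_running: usize,
}

/// Reduced model of the top RT runqueue.
pub struct RtRq {
    pub rt_nr_running: usize,
    pub rt_queued: bool,
    pub rt_throttled: bool,
}

/// Make the RT tasks visible in rq.nr_running, unless the RT runqueue
/// is already counted, throttled, or empty.
pub fn enqueue_top_rt_rq(rq: &mut Rq, rt_rq: &mut RtRq) {
    if rt_rq.rt_queued || rt_rq.rt_throttled {
        return;
    }
    if rt_rq.rt_nr_running > 0 {
        rq.nr_running += rt_rq.rt_nr_running;
        rt_rq.rt_queued = true;
    }
}

/// Remove `count` RT tasks from rq.nr_running if they were counted.
pub fn dequeue_top_rt_rq(rq: &mut Rq, rt_rq: &mut RtRq, count: usize) {
    if !rt_rq.rt_queued {
        return;
    }
    rq.nr_running -= count;
    rt_rq.rt_queued = false;
}
```

The `rt_queued` flag makes both operations idempotent, so repeated calls cannot double-count RT tasks in `nr_running`.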
Runtime accounting uses per-group bandwidth settings. update_curr_rt()
charges elapsed execution through common scheduler accounting, while
sched_rt_runtime_exceeded() checks budget, balances runtime, marks throttled
queues, and dequeues throttled queues from the top runqueue.
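
The budget check can be modeled roughly as below. This is a simplified sketch under stated assumptions: runtime balancing between queues is omitted, `queued_at_top` stands in for the top-level dequeue, and `charge()` and `replenish()` are hypothetical helpers representing the accounting path and the period timer.

```rust
/// Sentinel meaning "no runtime limit" in this sketch.
pub const RUNTIME_INF: u64 = u64::MAX;

pub struct RtRq {
    pub rt_time: u64,     // nanoseconds consumed in the current period
    pub rt_runtime: u64,  // budget per period, in nanoseconds
    pub rt_period: u64,   // period length, in nanoseconds
    pub rt_throttled: bool,
    pub queued_at_top: bool,
}

/// Returns true when the queue is (or has just become) throttled.
pub fn runtime_exceeded(rt_rq: &mut RtRq) -> bool {
    if rt_rq.rt_throttled {
        return true;
    }
    // An unlimited budget, or one covering the whole period, never throttles.
    if rt_rq.rt_runtime == RUNTIME_INF || rt_rq.rt_runtime >= rt_rq.rt_period {
        return false;
    }
    // Runtime balancing would happen here; it is omitted from this sketch.
    if rt_rq.rt_time > rt_rq.rt_runtime {
        rt_rq.rt_throttled = true;
        rt_rq.queued_at_top = false; // stands in for the top-level dequeue
        return true;
    }
    false
}

/// Hypothetical accounting hook: charge elapsed time, then check the budget.
pub fn charge(rt_rq: &mut RtRq, delta_ns: u64) -> bool {
    rt_rq.rt_time = rt_rq.rt_time.saturating_add(delta_ns);
    runtime_exceeded(rt_rq)
}

/// Hypothetical period-timer hook: refund one period of budget and
/// report whether the queue should be made runnable again.
pub fn replenish(rt_rq: &mut RtRq) -> bool {
    rt_rq.rt_time = rt_rq.rt_time.saturating_sub(rt_rq.rt_runtime);
    if rt_rq.rt_throttled && rt_rq.rt_time < rt_rq.rt_runtime {
        rt_rq.rt_throttled = false;
        return true;
    }
    false
}
```

With a budget of 950 ms per 1 s period, for example, charging 900 ms leaves the queue runnable, a further 100 ms throttles it, and the next `replenish()` call clears the throttle with 50 ms carried over.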
Selection is direct: pick_next_rt_entity() finds the first set priority bit
and returns the first entity in that priority list; nested group queues are
walked until a task entity is reached.
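
The selection walk might look like the following sketch. The `RtEntity` enum, the `u128` bitmap, and the owned `Box` for child queues are modeling choices for illustration, not the kernel's layout. As in the priority array, a lower index means a higher priority.

```rust
use std::collections::VecDeque;

const MAX_RT_PRIO: usize = 100; // assumed value for this sketch

/// An entity is either a task or a group that owns a nested RT runqueue.
pub enum RtEntity {
    Task(u32), // hypothetical task id
    Group(Box<RtRq>),
}

pub struct RtRq {
    bitmap: u128, // bit p set means list p is non-empty; bit MAX_RT_PRIO is the delimiter
    queues: Vec<VecDeque<RtEntity>>,
}

impl RtRq {
    pub fn new() -> Self {
        RtRq {
            bitmap: 1u128 << MAX_RT_PRIO,
            queues: (0..MAX_RT_PRIO).map(|_| VecDeque::new()).collect(),
        }
    }

    pub fn enqueue(&mut self, prio: usize, entity: RtEntity) {
        assert!(prio < MAX_RT_PRIO);
        self.queues[prio].push_back(entity);
        self.bitmap |= 1u128 << prio;
    }

    /// First entity of the first active priority list, if any.
    fn pick_next_entity(&self) -> Option<&RtEntity> {
        let idx = self.bitmap.trailing_zeros() as usize;
        if idx >= MAX_RT_PRIO {
            return None; // only the delimiter bit is set
        }
        self.queues[idx].front()
    }

    /// Descend through group entities until a task entity is reached.
    pub fn pick_task(&self) -> Option<u32> {
        let mut rq = self;
        loop {
            match rq.pick_next_entity()? {
                RtEntity::Task(id) => return Some(*id),
                RtEntity::Group(child) => rq = &**child,
            }
        }
    }
}
```

In this model an empty group queued in a parent makes `pick_task()` return `None`; a fuller version would keep empty groups out of the parent's lists.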
Invariants And Safety Contracts
- A throttled RT runqueue must not be enqueued at the top level.
- Priority bitmap and per-priority lists must remain consistent.
- Group dequeue occurs top-down because parent priority depends on child entries.
- RT runtime controls are a safety feature: unlimited RT can starve normal progress.
Rust Translation Guidance
Use a fixed priority-array abstraction with internal consistency checks between bitmap and lists. Runtime throttling should be modeled as a state on the queue, not as a side flag ignored by enqueue. Group scheduling needs explicit parent updates or a tree walk that cannot be skipped by callers.
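
One possible shape for this guidance is sketched below, assuming a small eight-level array for readability. `PrioQueue`, `QueueState`, and `top_level_count()` are hypothetical names. Group parent updates are not covered here.

```rust
use std::collections::VecDeque;

const LEVELS: usize = 8; // deliberately small for illustration

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueueState {
    Runnable,
    Throttled,
}

pub struct PrioQueue {
    bitmap: u32,
    lists: Vec<VecDeque<u32>>,
    state: QueueState,
}

impl PrioQueue {
    pub fn new() -> Self {
        PrioQueue {
            bitmap: 0,
            lists: (0..LEVELS).map(|_| VecDeque::new()).collect(),
            state: QueueState::Runnable,
        }
    }

    /// Invariant: a bit is set exactly when its list is non-empty.
    pub fn is_consistent(&self) -> bool {
        (0..LEVELS).all(|p| (((self.bitmap >> p) & 1) == 1) != self.lists[p].is_empty())
    }

    pub fn enqueue(&mut self, prio: usize, task: u32) {
        self.lists[prio].push_back(task);
        self.bitmap |= 1 << prio;
        debug_assert!(self.is_consistent());
    }

    pub fn dequeue(&mut self, prio: usize) -> Option<u32> {
        let task = self.lists[prio].pop_front();
        if self.lists[prio].is_empty() {
            self.bitmap &= !(1 << prio);
        }
        debug_assert!(self.is_consistent());
        task
    }

    pub fn throttle(&mut self) {
        self.state = QueueState::Throttled;
    }

    pub fn unthrottle(&mut self) {
        self.state = QueueState::Runnable;
    }

    /// Task count visible to the parent. It is derived from the state,
    /// so a throttled queue cannot be counted by mistake.
    pub fn top_level_count(&self) -> usize {
        match self.state {
            QueueState::Runnable => self.lists.iter().map(|l| l.len()).sum(),
            QueueState::Throttled => 0,
        }
    }
}
```

Because the visible count is computed from `state` rather than cached behind a separate flag, an enqueue on a throttled queue still records the task but leaves the top-level count at zero until `unthrottle()` is called.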
AI-Native Systems Guidance
Privileged AI jobs that claim low latency should still have RT-like budget throttles. A system should make high-priority classes deterministic but bounded: priority alone cannot override the need for forward progress and recovery.
Evidence
- init_rt_rq() initializes per-priority queues, bitmap delimiter, highest priority, pushable tasks, and throttling fields at kernel/sched/rt.c:68-95.
- RT bandwidth is timer-backed through init_rt_bandwidth() at kernel/sched/rt.c:125-134.
- sched_rt_runtime_exceeded() throttles over-budget RT queues and dequeues them at kernel/sched/rt.c:863-904.
- update_curr_rt() skips non-RT current tasks and charges RT execution at kernel/sched/rt.c:970-990.
- Top-level RT enqueue/dequeue updates rq->nr_running and avoids throttled queues at kernel/sched/rt.c:1010-1047.
- RT entity enqueue/dequeue manipulates priority lists and group stacks at kernel/sched/rt.c:1331-1429; enqueue_task_rt() starts at kernel/sched/rt.c:1432-1445.
- pick_next_rt_entity() uses the first bitmap bit and FIFO list head at kernel/sched/rt.c:1682-1698; pick_task_rt() wraps it at kernel/sched/rt.c:1715-1725.
- RT switch and callback table logic appears at kernel/sched/rt.c:1727-1745 and kernel/sched/rt.c:2601-2637.