AI-Native Systems
Imported from _research/manual-study-linux/ai-native-systems.md.
AI-Native Systems Notes
This document captures design ideas for AI-aware systems inspired by Linux implementation patterns. Each entry must state its evidence level: confirmed (source-backed), interpretive, or speculative.
Initial Design Themes
- Agent-visible runtime state should be structured like kernel observability, not only scraped from logs.
- Agent operations need policy hooks similar in spirit to LSM and capability checks.
- Long-running AI work needs scheduler/resource boundaries similar to cgroups and namespaces.
- Code-editing agents need auditable operation tables: what action was requested, what authority allowed it, what state changed, and what rollback path exists.
- AI-readable telemetry should be planned as a subsystem, not bolted onto UI logs.
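The auditable operation table from the themes above can be sketched as a small record type. This is a minimal illustration only: `OperationRecord`, `AuditLog`, and the in-memory `files` dict are hypothetical names, not taken from any reviewed source.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class OperationRecord:
    """One row of a hypothetical audit table for an agent edit."""
    action: str                   # what action was requested
    authority: str                # what authority allowed it
    state_changed: str            # what state changed
    rollback: Callable[[], None]  # what rollback path exists


@dataclass
class AuditLog:
    records: List[OperationRecord] = field(default_factory=list)

    def record(self, rec: OperationRecord) -> None:
        self.records.append(rec)

    def rollback_all(self) -> None:
        # Undo newest change first so earlier states are restored in order.
        while self.records:
            self.records.pop().rollback()


# Hypothetical workspace: a dict standing in for files on disk.
files = {"a.txt": "old"}
log = AuditLog()


def edit(path: str, new_text: str, authority: str) -> None:
    previous = files[path]
    files[path] = new_text
    log.record(OperationRecord(
        action=f"write {path}",
        authority=authority,
        state_changed=f"{path}: {previous!r} -> {new_text!r}",
        rollback=lambda: files.__setitem__(path, previous),
    ))
```

Every edit carries its own undo closure, so rollback does not depend on reconstructing state from logs.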
Evidence Levels
- confirmed: directly supported by reviewed Linux source or documentation.
- interpretive: design conclusion based on reviewed source.
- speculative: useful idea not yet source-backed.
Pending Track Notes
API Membrane For Agents
Evidence level: interpretive.
The Linux Rust crate boundary implies that AI systems should expose stable, audited capability wrappers to agents rather than raw system operations. Agents should request a capability through a typed wrapper and policy layer, not reach directly into unrestricted runtime internals.
Evidence: file-notes/linux__rust__kernel__lib.rs.md.
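As a rough sketch of the membrane idea, the following routes every agent request through a named capability and a policy check. `Membrane`, `AgentContext`, and `CapabilityDenied` are hypothetical names chosen for illustration, and the policy here is just a set of granted capability names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


class CapabilityDenied(Exception):
    """Raised when policy refuses a requested capability."""


@dataclass(frozen=True)
class AgentContext:
    agent_id: str
    granted: FrozenSet[str]  # capability names this agent may use


class Membrane:
    """Only registered wrappers are reachable; raw internals are not exposed."""

    def __init__(self) -> None:
        self._ops: Dict[str, Callable[..., object]] = {}
        self.audit: List[Tuple[str, str, bool]] = []

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self._ops[name] = fn

    def call(self, ctx: AgentContext, name: str, *args: object) -> object:
        allowed = name in self._ops and name in ctx.granted
        # Record every attempt, allowed or not, before acting on it.
        self.audit.append((ctx.agent_id, name, allowed))
        if not allowed:
            raise CapabilityDenied(f"{ctx.agent_id} may not call {name}")
        return self._ops[name](*args)
```

Because denied requests are logged too, the audit trail shows what an agent attempted as well as what it did.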
Ownership-Aware Agent Handles
Evidence level: interpretive.
Agent resource handles should encode whether the agent has a borrowed view, a
unique editable object, or a shared published object. This mirrors the
ArcBorrow / UniqueArc / Arc separation and gives safer semantics for
edits, rollback, and publication.
Evidence: file-notes/linux__rust__kernel__sync__arc.rs.md.
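The three handle kinds can be approximated in Python with runtime checks. This is only a sketch: Python cannot enforce ownership at compile time the way the Rust types do, and `UniqueDraft` and `Published` are hypothetical names.

```python
from types import MappingProxyType


class UniqueDraft:
    """Sole editable handle; it is consumed when published."""

    def __init__(self, data):
        self._data = dict(data)
        self._live = True

    def set(self, key, value):
        self._check()
        self._data[key] = value

    def publish(self):
        self._check()
        self._live = False  # the draft may not be edited after publication
        return Published(self._data)

    def _check(self):
        if not self._live:
            raise RuntimeError("draft already published")


class Published:
    """Shared object that is immutable through this handle."""

    def __init__(self, data):
        self._data = MappingProxyType(dict(data))

    def borrow(self):
        # Borrowed view: read-only, item assignment raises TypeError.
        return self._data

    def fork(self):
        # Editing a published object means making a new unique draft.
        return UniqueDraft(self._data)
```

Rollback then amounts to discarding a draft, and publication is the only path from editable to shared state.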
Scoped Authority Versus Durable Authority
Evidence level: interpretive.
Ambient authority should be scoped like CurrentTask; durable authority should
require explicit conversion into a refcounted/audited handle. This maps to
agent leases, session permissions, and detached background jobs.
Evidence: file-notes/linux__rust__kernel__task.rs.md.
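One way to picture the split is a session-scoped authority object that stops working when its scope ends, plus an explicit conversion into a registered lease. The names `session`, `ScopedAuthority`, and `Lease` are hypothetical, and the registry is a plain list standing in for an audited store.

```python
from contextlib import contextmanager


class Lease:
    """Durable authority: outlives the session and can be revoked."""

    def __init__(self, agent_id, permission):
        self.agent_id = agent_id
        self.permission = permission
        self.revoked = False


class ScopedAuthority:
    """Ambient authority: only usable inside the session that created it."""

    def __init__(self, agent_id, permissions):
        self.agent_id = agent_id
        self.permissions = frozenset(permissions)
        self.valid = True

    def check(self, permission):
        if not self.valid:
            raise RuntimeError("scoped authority used outside its scope")
        return permission in self.permissions

    def to_lease(self, registry, permission):
        # Explicit conversion: the lease is recorded where it can be audited.
        if not self.check(permission):
            raise PermissionError(permission)
        lease = Lease(self.agent_id, permission)
        registry.append(lease)
        return lease


@contextmanager
def session(agent_id, permissions):
    auth = ScopedAuthority(agent_id, permissions)
    try:
        yield auth
    finally:
        auth.valid = False  # ambient authority ends with the scope
```

A detached background job would hold a `Lease`, never the `ScopedAuthority` itself.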
Agent Job Creation Pipeline
Evidence level: interpretive.
Agent jobs should follow a process-creation pipeline: validate policy and namespace/resource constraints, allocate job state, install observability and cleanup guards, publish the job, then start execution. Before publication, failures unwind; after publication, cancellation uses runtime protocols.
Evidence: file-notes/linux__kernel__fork.c.md.
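The pipeline stages can be sketched as below. This is a simplified model under stated assumptions: `JobTable`, `create_job`, `cancel_job`, and the `allowed` and `fail_before_publish` spec keys are hypothetical, and event tuples stand in for real allocation and tracing.

```python
class JobTable:
    def __init__(self, max_jobs):
        self.max_jobs = max_jobs
        self.published = {}  # jobs visible to the rest of the runtime
        self.events = []     # stand-in for allocation and tracing side effects


def create_job(table, job_id, spec):
    cleanup = []  # undo actions for the steps completed so far
    try:
        # 1. Validate policy and resource constraints.
        if len(table.published) >= table.max_jobs:
            raise RuntimeError("resource limit reached")
        if not spec.get("allowed", False):
            raise PermissionError("policy denied job")
        # 2. Allocate job state.
        job = {"id": job_id, "spec": spec, "state": "new"}
        table.events.append(("alloc", job_id))
        cleanup.append(lambda: table.events.append(("free", job_id)))
        # 3. Install observability and cleanup guards.
        table.events.append(("trace_on", job_id))
        cleanup.append(lambda: table.events.append(("trace_off", job_id)))
        if spec.get("fail_before_publish"):
            raise RuntimeError("injected failure")
    except Exception:
        # Before publication, failures unwind in reverse order.
        for undo in reversed(cleanup):
            undo()
        raise
    # 4. Publish, then 5. start execution.
    table.published[job_id] = job
    job["state"] = "running"
    return job


def cancel_job(table, job_id):
    # After publication, stopping a job is a runtime protocol, not an unwind.
    table.published[job_id]["state"] = "cancel_requested"
```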
Agent Runqueues And Policy Classes
Evidence level: interpretive.
Agent work should be scheduled through explicit runqueues and policy classes, not only through ad hoc async task queues. A policy class can define enqueue, dequeue, pick, preempt, account, and completion hooks. That makes scheduling decisions inspectable and enforceable.
Evidence: file-notes/linux__kernel__sched__sched.h.md,
file-notes/linux__kernel__sched__core.c.md.
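A minimal sketch of policy classes follows, assuming each class owns its own queue and the scheduler consults classes in priority order. The class names and the `decisions` log are hypothetical, and only the enqueue, dequeue, pick, and account hooks are modeled; preempt and completion hooks are left out.

```python
from collections import deque


class PolicyClass:
    """Hooks a policy class implements; this default is plain FIFO."""
    name = "base"

    def __init__(self):
        self.queue = deque()
        self.serviced = 0

    def enqueue(self, job):
        self.queue.append(job)

    def dequeue(self, job):
        self.queue.remove(job)

    def pick(self):
        return self.queue.popleft() if self.queue else None

    def account(self, job, units):
        self.serviced += units


class InteractiveClass(PolicyClass):
    name = "interactive"


class BatchClass(PolicyClass):
    name = "batch"


class Scheduler:
    def __init__(self, classes):
        self.classes = classes  # ordered, highest priority first
        self.decisions = []     # inspectable record of every pick

    def pick_next(self):
        for cls in self.classes:
            job = cls.pick()
            if job is not None:
                self.decisions.append((cls.name, job))
                return cls, job
        return None
```

Keeping the decision log on the scheduler is what makes each pick inspectable after the fact.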
Fair Agent Scheduling With Lag And Deadlines
Evidence level: interpretive.
EEVDF suggests a practical AI runtime policy: track service received by each agent job, identify jobs owed service by lag, and let latency-sensitive jobs request shorter slices through virtual deadlines. This avoids pure FIFO queues and avoids giving interactive jobs unlimited priority.
Evidence: file-notes/linux__kernel__sched__fair.c.md,
file-notes/linux__Documentation__scheduler__sched-eevdf.rst.md.
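The lag and virtual-deadline idea can be illustrated with a heavily simplified model. This is not the kernel's algorithm: `Job`, `pick`, and `run` are hypothetical, each job always consumes its full slice, and a job counts as eligible when its weighted service is at or below the weighted average.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    weight: float = 1.0
    slice: float = 4.0     # requested slice; shorter means more latency-sensitive
    vruntime: float = 0.0  # weighted service received so far

    @property
    def deadline(self):
        # Virtual deadline: when the next requested slice would complete.
        return self.vruntime + self.slice / self.weight


def average_vruntime(jobs):
    total_weight = sum(j.weight for j in jobs)
    return sum(j.vruntime * j.weight for j in jobs) / total_weight


def pick(jobs):
    avg = average_vruntime(jobs)
    # Eligible jobs have non-negative lag: they are owed service.
    eligible = [j for j in jobs if j.vruntime <= avg + 1e-9]
    # Among eligible jobs, the earliest virtual deadline runs first.
    return min(eligible, key=lambda j: j.deadline)


def run(jobs, steps):
    order = []
    for _ in range(steps):
        job = pick(jobs)
        job.vruntime += job.slice / job.weight  # assume the full slice is used
        order.append(job.name)
    return order
```

In this model a job with a short slice runs more often but in smaller pieces, so it gains latency without gaining total service.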
Bounded Privileged Agents
Evidence level: interpretive.
Real-time Linux scheduling shows why privileged classes still need runtime budgets. High-priority agent jobs should have explicit period/runtime controls and leave recovery capacity for control-plane work.
Evidence: file-notes/linux__kernel__sched__rt.c.md,
file-notes/linux__Documentation__scheduler__sched-rt-group.rst.md.
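A period/runtime control can be sketched as a budget that refuses work once the privileged class has used its share of the current period. `RuntimeBudget` and its abstract time units are hypothetical; the remainder of each period is what stays available for control-plane work.

```python
class RuntimeBudget:
    """Allow at most `runtime` units of privileged work per `period`."""

    def __init__(self, period, runtime):
        if not 0 < runtime <= period:
            raise ValueError("need 0 < runtime <= period")
        self.period = period
        self.runtime = runtime
        self.window_start = 0
        self.used = 0

    def try_consume(self, now, units):
        elapsed = now - self.window_start
        if elapsed >= self.period:
            # A new period has begun: align the window and replenish.
            self.window_start = now - elapsed % self.period
            self.used = 0
        if self.used + units > self.runtime:
            return False  # throttled until the next period
        self.used += units
        return True
```

With `period=100` and `runtime=95`, five units of every period are never available to the privileged class, no matter how much work it has queued.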
Deadline-Based Agent SLAs
Evidence level: interpretive.
Deadline scheduling maps to AI jobs with service-level objectives: runtime, deadline, period, admission control, throttling on overrun, and replenishment. This is a better model for bounded inference or automation windows than a single global priority queue.
Evidence: file-notes/linux__kernel__sched__deadline.c.md,
file-notes/linux__Documentation__scheduler__sched-deadline.rst.md.
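Admission control for such reservations can be sketched as a utilization test: a new job is admitted only if the sum of runtime/period across admitted jobs stays within capacity. This is a simplified model, and `Reservation` and `Admission` are hypothetical names. Throttling and replenishment are not shown.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Reservation:
    name: str
    runtime: int   # budget per period
    deadline: int  # relative deadline within the period
    period: int

    def __post_init__(self):
        if not 0 < self.runtime <= self.deadline <= self.period:
            raise ValueError("need 0 < runtime <= deadline <= period")

    @property
    def utilization(self):
        return Fraction(self.runtime, self.period)


class Admission:
    def __init__(self, capacity=Fraction(1)):
        self.capacity = capacity
        self.admitted = []

    def admit(self, res):
        total = sum((r.utilization for r in self.admitted), Fraction(0))
        if total + res.utilization > self.capacity:
            return False  # rejecting up front protects existing guarantees
        self.admitted.append(res)
        return True
```

Using `Fraction` keeps the comparison exact, so a reservation that exactly fills the remaining capacity is admitted rather than lost to rounding.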
Lazy Agent State Faults
Evidence level: interpretive.
Linux page faults show how a runtime can materialize state only when accessed. Agent runtimes can apply the same model to long contexts, workspace snapshots, retrieval chunks, and generated artifacts: missing state can become zero-fill, backing-store fetch, copy-on-write clone, retry, or typed failure.
Evidence: file-notes/linux__mm__memory.c.md,
file-notes/linux__include__linux__mm.h.md.
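As an illustration of fault-driven materialization, the sketch below resolves a missing key by backing-store fetch, zero-fill, copy-on-write clone, or typed failure. `LazyStore` and `FaultError` are hypothetical, values are lists so that clones are visible, and the retry outcome is omitted for brevity.

```python
class FaultError(Exception):
    """Typed failure: the state cannot be materialized."""


class LazyStore:
    def __init__(self, backing=None, parent=None, zero_fill=False):
        self.resident = {}            # state already materialized
        self.backing = backing or {}  # stand-in for a slower backing store
        self.parent = parent          # source for copy-on-write sharing
        self.zero_fill = zero_fill
        self.faults = []              # record of how each fault was resolved

    def read(self, key):
        if key in self.resident:
            return self.resident[key]
        if self.parent is not None:
            return self.parent.read(key)  # shared with parent until written
        return self._materialize(key)

    def append(self, key, item):
        if key not in self.resident:
            if self.parent is not None:
                # Copy-on-write: clone the parent's value before mutating.
                self.resident[key] = list(self.parent.read(key))
                self.faults.append(("cow", key))
            else:
                self._materialize(key)
        self.resident[key].append(item)

    def _materialize(self, key):
        if key in self.backing:
            kind, value = "fetch", list(self.backing[key])
        elif self.zero_fill:
            kind, value = "zero", []
        else:
            raise FaultError(f"no state for {key!r}")
        self.faults.append((kind, key))
        self.resident[key] = value
        return value
```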
Region Permissions And Drop-Lock Outcomes
Evidence level: interpretive.
Agent-accessible state should be divided into regions with permissions,
backing store, callbacks, and fault results. If a fault operation can drop a
lock or retry, stale region handles must be invalidated just like Linux warns
against dereferencing a VMA after mmap_lock may have been dropped.
Evidence: file-notes/linux__include__linux__mm_types.h.md,
file-notes/linux__mm__memory.c.md.
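Stale-handle invalidation can be sketched with a generation counter: any fault that may have dropped the lock bumps the generation, and handles taken earlier refuse to dereference. `RegionMap`, `RegionHandle`, `StaleHandle`, and the `"RETRY"`/`"DONE"` results are hypothetical names for illustration.

```python
class StaleHandle(Exception):
    """The region map may have changed since this handle was looked up."""


class RegionMap:
    def __init__(self):
        self.generation = 0
        self.regions = {}

    def add(self, name, perms, data):
        self.regions[name] = {"perms": set(perms), "data": data}
        self.generation += 1

    def lookup(self, name):
        return RegionHandle(self, name, self.generation)

    def fault(self, name, may_drop_lock):
        if may_drop_lock:
            # The map could have been modified while unlocked, so every
            # outstanding handle must be treated as invalid.
            self.generation += 1
            return "RETRY"
        return "DONE"


class RegionHandle:
    def __init__(self, rmap, name, generation):
        self.rmap = rmap
        self.name = name
        self.generation = generation

    def read(self):
        if self.generation != self.rmap.generation:
            raise StaleHandle(self.name)
        region = self.rmap.regions[self.name]
        if "r" not in region["perms"]:
            raise PermissionError(self.name)
        return region["data"]
```

After a `"RETRY"` result the caller looks the region up again instead of reusing the old handle.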
Allocator Classes For Agent Runtimes
Evidence level: interpretive.
Repeated AI runtime objects should use allocator classes: fixed-size prompt segments, trace spans, tool-call records, embedding chunks, and job descriptors can be cached with accounting and export/usercopy policy attached.
Evidence: file-notes/linux__mm__slab_common.c.md,
file-notes/linux__Documentation__core-api__memory-allocation.rst.md.
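A per-type object cache with accounting might look like the sketch below. It is loosely modeled on the idea of named caches and does not mirror any kernel API: `ObjectCache`, its counters, and the `exportable` flag are hypothetical, and objects are plain dicts.

```python
class ObjectCache:
    """Named cache for one kind of runtime object, with usage accounting."""

    def __init__(self, name, factory, exportable=False):
        self.name = name
        self.factory = factory
        self.exportable = exportable  # policy: may objects leave the runtime?
        self.free = []
        self.active = 0
        self.total_allocated = 0

    def alloc(self):
        if self.free:
            obj = self.free.pop()  # reuse a released object
        else:
            obj = self.factory()
            self.total_allocated += 1
        self.active += 1
        return obj

    def release(self, obj):
        obj.clear()  # scrub contents so reuse cannot leak earlier data
        self.free.append(obj)
        self.active -= 1

    def export(self, obj):
        if not self.exportable:
            raise PermissionError(f"{self.name} objects may not be exported")
        return dict(obj)
```

Attaching the export policy to the cache rather than to each object means one declaration covers every tool-call record or trace span of that class.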
Provenance As System Metadata
Evidence level: confirmed (process guidance) plus interpretive (design conclusion).
Agent identity, model version, tool use, review status, and human acceptance should be first-class fields on changes. Linux’s process guidance separates AI assistance attribution from human DCO certification.
Evidence: file-notes/linux__Documentation__process__coding-assistants.rst.md.
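A change record carrying these fields could be sketched as follows. The field names and the `mergeable` rule are hypothetical illustrations, not the format the Linux guidance prescribes. The point shown is that AI attribution and human acceptance are stored as separate fields.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Provenance:
    agent_id: str                      # which agent produced the change
    model_version: str
    tools_used: Tuple[str, ...]
    review_status: str = "unreviewed"  # "unreviewed" or "reviewed"
    accepted_by: Optional[str] = None  # human who takes responsibility

    def mergeable(self) -> bool:
        # AI attribution alone is never enough; a human must accept the change.
        return self.review_status == "reviewed" and self.accepted_by is not None


def accept(record: Provenance, human: str) -> Provenance:
    # Acceptance creates a new record, leaving the original attribution intact.
    return replace(record, review_status="reviewed", accepted_by=human)
```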