Trace, Perf, BPF, And Observability
Imported from _research/manual-study-linux/observability-bpf-trace.md.
Status: implemented source-backed volume.
Source Surface
- kernel/trace/trace.c: tracing core, trace arrays, ring-buffer routing, tracer registration, and runtime controls.
- kernel/bpf/syscall.c: BPF syscall boundary, program/map/link operations, and attachment surfaces.
Trace Core
kernel/trace/trace.c identifies itself as the ring-buffer-based tracer at the
top of the file. It defines tracer registration structures around line 95 and
the global trace array around line 523. Runtime selection flows through
tracing_set_tracer() around line 213.
The trace subsystem is not just logging. It is a low-overhead event capture pipeline with per-tracer policy, buffers, enable/disable controls, and interfaces for readers.
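The registration and selection flow described above can be sketched in a few lines of Rust. This is only an illustrative model, not the kernel's data layout: the `Tracer` trait, `TraceArray` struct, and `set_tracer` method are hypothetical names chosen to mirror the roles of tracer registration, a trace array, and tracing_set_tracer(), and a plain `Vec` stands in for the ring buffer.

```rust
use std::collections::HashMap;

/// Hypothetical per-tracer policy: decides which events get recorded.
pub trait Tracer {
    fn name(&self) -> &'static str;
    fn wants(&self, event: &str) -> bool;
}

/// Records every event.
pub struct AllTracer;
impl Tracer for AllTracer {
    fn name(&self) -> &'static str { "all" }
    fn wants(&self, _event: &str) -> bool { true }
}

/// Records nothing; stands in for "no tracer selected".
pub struct NopTracer;
impl Tracer for NopTracer {
    fn name(&self) -> &'static str { "nop" }
    fn wants(&self, _event: &str) -> bool { false }
}

/// One trace instance: registered tracers, the current selection, a buffer.
pub struct TraceArray {
    tracers: HashMap<&'static str, Box<dyn Tracer>>,
    current: &'static str,
    buffer: Vec<String>, // assumption: a Vec stands in for the ring buffer
}

impl TraceArray {
    pub fn new() -> Self {
        let mut t = TraceArray {
            tracers: HashMap::new(),
            current: "nop",
            buffer: Vec::new(),
        };
        t.register(Box::new(NopTracer));
        t
    }

    /// Registration makes a tracer selectable; it does not activate it.
    pub fn register(&mut self, tracer: Box<dyn Tracer>) {
        self.tracers.insert(tracer.name(), tracer);
    }

    /// Runtime selection by name; unknown names leave the selection unchanged.
    pub fn set_tracer(&mut self, name: &str) -> Result<(), String> {
        match self.tracers.get_key_value(name) {
            Some((key, _)) => {
                self.current = *key;
                Ok(())
            }
            None => Err(format!("unknown tracer: {}", name)),
        }
    }

    /// The current tracer's policy decides whether the event is captured.
    pub fn emit(&mut self, event: &str) {
        if self.tracers[self.current].wants(event) {
            self.buffer.push(event.to_string());
        }
    }

    pub fn events(&self) -> &[String] {
        &self.buffer
    }
}
```

With the default "nop" selection, `emit` records nothing. After `register(Box::new(AllTracer))` and `set_tracer("all")`, the same call appends to the buffer, which is the "per-tracer policy" point in miniature.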
BPF Boundary
kernel/bpf/syscall.c is the programmable extension boundary. It handles
program loading, map creation, link creation, perf/kprobe/uprobe/tracepoint
attachments, and file-descriptor-backed object lifetimes.
Important implementation idea: BPF is not arbitrary kernel code. The syscall boundary constructs typed kernel objects, checks capabilities and verifier constraints, and exposes handles through descriptors and links.
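A minimal sketch of that boundary, under loud assumptions: the `Insn` instruction set, the `Boundary` type, and the toy `verify` rule (program must end in `Exit`, no backward jumps) are all invented for illustration. The real verifier is far more elaborate, and nothing here reflects the actual bpf() syscall interface. The sketch only shows the ordering of checks and the idea that callers receive a handle rather than the object.

```rust
use std::collections::HashMap;

/// Hypothetical instruction set, small enough to check exhaustively.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Insn {
    LoadConst(i64),
    Add(i64),
    /// Backward jump; the toy verifier rejects it to keep runs bounded.
    JumpBack(usize),
    Exit,
}

#[derive(Debug, PartialEq)]
pub enum LoadError {
    PermissionDenied,
    Rejected(&'static str),
}

/// Typed objects live behind the boundary; callers only hold an integer handle.
#[derive(Debug)]
pub enum Object {
    Program(Vec<Insn>),
    Map { capacity: usize },
}

#[derive(Default)]
pub struct Boundary {
    next_fd: u32,
    objects: HashMap<u32, Object>,
}

impl Boundary {
    /// Toy stand-in for verification: must end in Exit, no backward jumps.
    fn verify(prog: &[Insn]) -> Result<(), LoadError> {
        if prog.last() != Some(&Insn::Exit) {
            return Err(LoadError::Rejected("program must end with Exit"));
        }
        if prog.iter().any(|i| matches!(i, Insn::JumpBack(_))) {
            return Err(LoadError::Rejected("backward jumps are not allowed"));
        }
        Ok(())
    }

    fn install(&mut self, obj: Object) -> u32 {
        let fd = self.next_fd;
        self.next_fd += 1;
        self.objects.insert(fd, obj);
        fd
    }

    /// Capability check first, then verification, then a descriptor is issued.
    pub fn load_program(&mut self, privileged: bool, prog: Vec<Insn>) -> Result<u32, LoadError> {
        if !privileged {
            return Err(LoadError::PermissionDenied);
        }
        Self::verify(&prog)?;
        Ok(self.install(Object::Program(prog)))
    }

    pub fn create_map(&mut self, capacity: usize) -> Result<u32, LoadError> {
        if capacity == 0 {
            return Err(LoadError::Rejected("map capacity must be nonzero"));
        }
        Ok(self.install(Object::Map { capacity }))
    }

    /// Closing the descriptor drops the object; returns whether it existed.
    pub fn close(&mut self, fd: u32) -> bool {
        self.objects.remove(&fd).is_some()
    }
}
```

The design point is that a rejected program never becomes an object at all, and an accepted one is reachable only through the descriptor, so its lifetime ends when the descriptor is closed.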
Control Flow
Trace and BPF meet around attach points. Static kernel trace infrastructure captures events; BPF programs can be loaded and attached to controlled hook sites; perf and tracing infrastructure can consume or expose the resulting events.
This creates a layered observability stack:
- Kernel code emits events or exposes hook points.
- Tracing/perf infrastructure buffers and routes events.
- BPF provides constrained programmable filtering/aggregation.
- User space reads results through explicit handles.
Concurrency And Safety
Observability must avoid perturbing the system too much. Ring buffers, static keys, per-CPU data, and verifier constraints exist because tracing a hot path can itself become a hot path. The safe design is to make disabled tracing cheap and enabled tracing bounded.
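The "cheap when disabled, bounded when enabled" rule can be illustrated with a small sketch. It rests on assumptions: an atomic flag is only a rough stand-in for a static key, and a mutex-protected queue is a simplification of the kernel's per-CPU ring buffers. `GatedTrace` and its methods are hypothetical names.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Mutex;

pub struct GatedTrace {
    enabled: AtomicBool,
    dropped: AtomicU64,
    capacity: usize,
    events: Mutex<VecDeque<u64>>,
}

impl GatedTrace {
    pub fn new(capacity: usize) -> Self {
        GatedTrace {
            enabled: AtomicBool::new(false),
            dropped: AtomicU64::new(0),
            capacity,
            events: Mutex::new(VecDeque::with_capacity(capacity)),
        }
    }

    pub fn set_enabled(&self, on: bool) {
        self.enabled.store(on, Ordering::Release);
    }

    /// Hot-path entry point.
    #[inline]
    pub fn record(&self, value: u64) {
        // Disabled path: one relaxed load and a branch; no lock, no allocation.
        if !self.enabled.load(Ordering::Relaxed) {
            return;
        }
        let mut queue = self.events.lock().unwrap();
        if queue.len() >= self.capacity {
            // Enabled path stays bounded: drop the new event and count the loss.
            self.dropped.fetch_add(1, Ordering::Relaxed);
            return;
        }
        queue.push_back(value);
    }

    /// Reader side: take everything captured so far.
    pub fn drain(&self) -> Vec<u64> {
        self.events.lock().unwrap().drain(..).collect()
    }

    pub fn dropped(&self) -> u64 {
        self.dropped.load(Ordering::Relaxed)
    }
}
```

A per-CPU or per-thread buffer would remove the shared lock from the enabled path. The sketch keeps a single queue only to stay short.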
Rust Translation
A Rust equivalent should define:
- Typed event schemas.
- TracePoint<T> with static payload shape.
- Bounded ring buffers with explicit loss/backpressure semantics.
- Program/plugin attachment through verified interfaces, not raw callbacks.
- Descriptor-like links that own attachment lifetime.
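One possible shape for the items above, as a single-threaded sketch. Every name here (`SchedSwitch`, `Ring`, `Probe`, `TracePoint`, `Link`, `SwitchesTo`) is hypothetical, the overwrite-oldest loss policy is just one choice among several, and a trait bound stands in for a "verified interface" without doing any real verification.

```rust
use std::cell::RefCell;
use std::collections::VecDeque;
use std::rc::{Rc, Weak};

/// Typed event schema: the payload shape is fixed at compile time.
#[derive(Debug, Clone, PartialEq)]
pub struct SchedSwitch {
    pub prev_pid: u32,
    pub next_pid: u32,
}

/// Bounded buffer; when full it overwrites the oldest item and counts the loss.
pub struct Ring<T> {
    cap: usize,
    items: VecDeque<T>,
    lost: u64,
}

impl<T> Ring<T> {
    pub fn new(cap: usize) -> Self {
        assert!(cap > 0, "capacity must be nonzero");
        Ring { cap, items: VecDeque::with_capacity(cap), lost: 0 }
    }

    pub fn push(&mut self, item: T) {
        if self.items.len() >= self.cap {
            self.items.pop_front();
            self.lost += 1;
        }
        self.items.push_back(item);
    }

    pub fn lost(&self) -> u64 {
        self.lost
    }

    pub fn drain(&mut self) -> Vec<T> {
        self.items.drain(..).collect()
    }
}

/// Attachments implement a narrow trait rather than being raw callbacks.
pub trait Probe<T> {
    fn on_event(&mut self, event: &T);
}

type ProbeList<T> = RefCell<Vec<(u64, Box<dyn Probe<T>>)>>;

/// Hook site with a static payload type `T`.
pub struct TracePoint<T> {
    probes: Rc<ProbeList<T>>,
    next_id: u64,
}

impl<T> TracePoint<T> {
    pub fn new() -> Self {
        TracePoint { probes: Rc::new(RefCell::new(Vec::new())), next_id: 0 }
    }

    /// The attachment lives exactly as long as the returned link.
    pub fn attach(&mut self, probe: Box<dyn Probe<T>>) -> Link<T> {
        let id = self.next_id;
        self.next_id += 1;
        self.probes.borrow_mut().push((id, probe));
        Link { id, probes: Rc::downgrade(&self.probes) }
    }

    pub fn emit(&self, event: &T) {
        for (_, probe) in self.probes.borrow_mut().iter_mut() {
            probe.on_event(event);
        }
    }
}

/// Descriptor-like handle: dropping it detaches the probe.
pub struct Link<T> {
    id: u64,
    probes: Weak<ProbeList<T>>,
}

impl<T> Drop for Link<T> {
    fn drop(&mut self) {
        if let Some(probes) = self.probes.upgrade() {
            probes.borrow_mut().retain(|(id, _)| *id != self.id);
        }
    }
}

/// Example probe: records switches into one pid.
pub struct SwitchesTo {
    pub pid: u32,
    pub out: Rc<RefCell<Ring<SchedSwitch>>>,
}

impl Probe<SchedSwitch> for SwitchesTo {
    fn on_event(&mut self, event: &SchedSwitch) {
        if event.next_pid == self.pid {
            self.out.borrow_mut().push(event.clone());
        }
    }
}
```

A caller would create a `TracePoint<SchedSwitch>`, attach a boxed `SwitchesTo`, and keep the returned `Link` for as long as the probe should run. Dropping the link detaches the probe without any explicit unregister call, which mirrors the descriptor-owned lifetime described for BPF links.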
AI-Native Translation
AI agents need observability that is structured and policy-aware. The Linux lesson is to put capture points into the runtime itself, then allow constrained programs or rules to summarize events. Agents should receive audited telemetry streams, not unrestricted runtime memory access.
Evidence Links
- file-notes/linux__kernel__trace__trace.c.md
- file-notes/linux__kernel__bpf__syscall.c.md