VFS And Filesystems
Imported from
_research/manual-study-linux/vfs-filesystems.md.
Status: implemented source-backed volume.
Source Surface
- fs/open.c: open/openat/openat2 and close syscall path, open flag normalization, file creation, and struct file activation.
- fs/read_write.c: read/write syscall path, vfs_read(), vfs_write(), and file-operation dispatch.
- include/linux/fs.h: shared VFS object contracts such as struct file, inode/superblock types, and operation tables.
Entry Points
User space enters the VFS through syscall wrappers. fs/open.c exposes
SYSCALL_DEFINE3(open) at line 1405, SYSCALL_DEFINE4(openat) at line 1412,
and SYSCALL_DEFINE4(openat2) at line 1420. Those wrappers converge on
do_sys_openat2() at line 1386, which normalizes the open flags through
build_open_flags() at line 1179 before resolving the path.
Read/write uses the same layered style. fs/read_write.c exposes
SYSCALL_DEFINE3(read) at line 723 and SYSCALL_DEFINE3(write) at line 747.
They delegate to ksys_read() and ksys_write(), which in turn call
vfs_read() at line 554 and vfs_write() at line 667.
Core Data Model
The VFS acts as an object membrane between user space and filesystem
implementations. User space receives file descriptors, but the kernel
internally moves through paths, dentries, inodes, superblocks, mounts,
struct file, and filesystem-specific operation tables.
The key contract is that generic VFS code performs common checks and lifetime management, then calls operation-table methods supplied by the filesystem or device. This lets ext4, procfs, tmpfs, sockets, pipes, and device files share the same syscall shape without sharing implementation internals.
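The contract above can be illustrated with a small user-space sketch. Everything in it is hypothetical: the FileOps trait, the Pipe and Null types, and generic_write() are illustration names, not the kernel's struct file_operations or any real kernel API. It only shows the shape of "common check first, then dispatch through the table".

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for an operation table: each file kind
/// supplies its own methods behind one shared interface.
trait FileOps {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, &'static str>;
    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str>;
}

/// A pipe-like object: writes queue bytes, reads drain them.
struct Pipe {
    queue: VecDeque<u8>,
}

/// A /dev/null-like object: writes are discarded, reads return 0 bytes.
struct Null;

impl FileOps for Pipe {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, &'static str> {
        let n = buf.len().min(self.queue.len());
        for slot in buf.iter_mut().take(n) {
            *slot = self.queue.pop_front().expect("length checked above");
        }
        Ok(n)
    }

    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str> {
        self.queue.extend(data.iter().copied());
        Ok(data.len())
    }
}

impl FileOps for Null {
    fn read(&mut self, _buf: &mut [u8]) -> Result<usize, &'static str> {
        Ok(0)
    }

    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str> {
        Ok(data.len())
    }
}

/// Generic layer: one common check, then dispatch through the table.
/// The caller never learns which concrete type handled the request.
fn generic_write(file: &mut dyn FileOps, data: &[u8]) -> Result<usize, &'static str> {
    if data.is_empty() {
        return Ok(0);
    }
    file.write(data)
}
```

Calling generic_write() on a Pipe and on a Null takes the same path up to the final method call, which is the property the paragraph above attributes to the VFS.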
Open Flow
The open path can be read as a staged constructor:
- Copy and validate user arguments.
- Normalize open flags with build_open_flags().
- Resolve the path and permissions.
- Allocate or activate a struct file.
- Bind file operations.
- Install the file into the caller’s descriptor table.
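The stages above can be sketched as a toy user-space model. All names here (Process, OpenFile, normalize_flags(), known_paths) are hypothetical and chosen for illustration. Path lookup is reduced to set membership and permission checks are omitted. The access-mode values 0, 1, and 2 follow the Linux O_RDONLY, O_WRONLY, and O_RDWR encoding; rejecting the value 3 is a simplification made by this sketch.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Access {
    ReadOnly,
    WriteOnly,
    ReadWrite,
}

#[derive(Debug, PartialEq)]
enum OpenError {
    BadFlags,
    NotFound,
}

#[derive(Debug)]
struct OpenFile {
    path: String,
    access: Access,
}

/// Flag normalization: turn a raw integer into a validated access mode.
fn normalize_flags(raw: u32) -> Result<Access, OpenError> {
    match raw & 0b11 {
        0 => Ok(Access::ReadOnly),
        1 => Ok(Access::WriteOnly),
        2 => Ok(Access::ReadWrite),
        _ => Err(OpenError::BadFlags), // simplification of this sketch
    }
}

struct Process {
    known_paths: HashSet<String>, // hypothetical stand-in for path lookup
    fds: Vec<Option<OpenFile>>,   // descriptor table, indexed by fd
}

impl Process {
    fn new(paths: &[&str]) -> Self {
        Process {
            known_paths: paths.iter().map(|p| p.to_string()).collect(),
            fds: Vec::new(),
        }
    }

    fn open(&mut self, path: &str, raw_flags: u32) -> Result<usize, OpenError> {
        // Normalize flags before touching any other state.
        let access = normalize_flags(raw_flags)?;
        // Resolve the path (permission checks omitted here).
        if !self.known_paths.contains(path) {
            return Err(OpenError::NotFound);
        }
        // Allocate the open-file object.
        let file = OpenFile { path: path.to_string(), access };
        // Install it into the lowest free descriptor slot.
        let fd = match self.fds.iter().position(|slot| slot.is_none()) {
            Some(free) => free,
            None => {
                self.fds.push(None);
                self.fds.len() - 1
            }
        };
        self.fds[fd] = Some(file);
        Ok(fd)
    }
}
```

Because every early return happens before the descriptor table is modified, a failed open leaves no partially installed file behind, which is the point of reading the path as a staged constructor.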
do_dentry_open() at line 885 is the important internal activation point:
after path lookup chooses the target object, the VFS turns that target into an
open file with operations and mode bits. vfs_open() appears around lines
1070-1074 as the common helper for opening a resolved path. Close unwinds the
descriptor and file lifetime through filp_close() at line 1507 and
SYSCALL_DEFINE1(close) at line 1523.
Read/Write Flow
The read/write path is policy plus dispatch. rw_verify_area() at line 453
checks offset/count/permission constraints before the operation reaches the
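object-specific implementation. vfs_read() and vfs_write() then route to
the file’s operations. When a file supplies only the iterator-based
read_iter or write_iter methods, the synchronous helpers new_sync_read() at
line 483 and new_sync_write() at line 585 wrap the caller’s buffer in a kiocb
and iov_iter and call those methods instead.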
The lesson is that Linux does not special-case every file kind in the syscall handler. It normalizes the request into a file-object operation and then lets the operation table carry the concrete behavior.
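A minimal sketch of that "verify, then dispatch" shape follows. It is a user-space model, not kernel code: OpTable, File, verify_area(), and generic_read() are hypothetical names, and the two optional function pointers only imitate the idea that a file may offer a plain read method, an iterator-style one, or neither.

```rust
#[derive(Debug, PartialEq)]
enum RwError {
    Invalid,
    NotReadable,
    NoMethod,
}

/// Hypothetical miniature operation table with two optional read shapes.
struct OpTable {
    read: Option<fn(&File, u64, &mut [u8]) -> usize>,
    read_iter: Option<fn(&File, u64, &mut [&mut [u8]]) -> usize>,
}

struct File {
    data: Vec<u8>,
    readable: bool,
    pos: u64,
    ops: OpTable,
}

/// Policy step: reject unreadable files and offset/count overflow.
fn verify_area(file: &File, pos: u64, count: usize) -> Result<(), RwError> {
    if !file.readable {
        return Err(RwError::NotReadable);
    }
    pos.checked_add(count as u64).map(|_| ()).ok_or(RwError::Invalid)
}

/// Plain read shape: copy from `pos` into one buffer.
fn plain_read(file: &File, pos: u64, buf: &mut [u8]) -> usize {
    let start = (pos as usize).min(file.data.len());
    let n = buf.len().min(file.data.len() - start);
    buf[..n].copy_from_slice(&file.data[start..start + n]);
    n
}

/// Iterator shape: fill a list of buffer segments in order.
fn iter_read(file: &File, pos: u64, segments: &mut [&mut [u8]]) -> usize {
    let mut total = 0;
    for seg in segments.iter_mut() {
        total += plain_read(file, pos + total as u64, seg);
    }
    total
}

/// Dispatch step: prefer `read`, otherwise wrap the single buffer
/// as a one-segment list and call `read_iter`.
fn generic_read(file: &mut File, buf: &mut [u8]) -> Result<usize, RwError> {
    verify_area(file, file.pos, buf.len())?;
    let n = if let Some(read) = file.ops.read {
        read(file, file.pos, buf)
    } else if let Some(read_iter) = file.ops.read_iter {
        read_iter(file, file.pos, &mut [buf])
    } else {
        return Err(RwError::NoMethod);
    };
    file.pos += n as u64;
    Ok(n)
}
```

The caller of generic_read() sees the same result whichever method the table provides; only the bridging step inside the generic layer differs.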
Concurrency And Lifetime
The VFS boundary depends on reference ownership rather than global serialization. Descriptors reference files; files reference paths/inodes; inodes and dentries have their own cache and reclaim rules. The syscall layer must not hold transient user pointers across deeper operations; it converts them into kernel objects, validated lengths, and explicit offsets.
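The ownership chain "descriptors reference files" can be modeled with reference counting. The sketch below is a user-space analogy using std::sync::Arc. FdTable and OpenFile are hypothetical names, and the kernel's own reference counting works differently in detail. It shows two descriptors sharing one open file, and the file outliving both descriptors while an in-flight user still holds a reference.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical open-file object; the offset is shared by every
/// descriptor that refers to it.
struct OpenFile {
    offset: Mutex<u64>,
}

/// Hypothetical per-process descriptor table.
struct FdTable {
    slots: Vec<Option<Arc<OpenFile>>>,
}

impl FdTable {
    /// Install a file into the lowest free slot and return its number.
    fn install(&mut self, file: Arc<OpenFile>) -> usize {
        let fd = match self.slots.iter().position(|s| s.is_none()) {
            Some(free) => free,
            None => {
                self.slots.push(None);
                self.slots.len() - 1
            }
        };
        self.slots[fd] = Some(file);
        fd
    }

    /// Take a temporary reference for the duration of one operation.
    fn get(&self, fd: usize) -> Option<Arc<OpenFile>> {
        self.slots.get(fd)?.clone()
    }

    /// Duplicate a descriptor: a new slot, the same underlying file.
    fn dup(&mut self, fd: usize) -> Option<usize> {
        let file = self.get(fd)?;
        Some(self.install(file))
    }

    /// Drop the table's reference; the file is freed only when the
    /// last holder lets go.
    fn close(&mut self, fd: usize) -> bool {
        self.slots.get_mut(fd).and_then(|s| s.take()).is_some()
    }
}
```

No global lock is involved: closing a descriptor only removes one reference, and an operation that obtained its own reference through get() keeps the object alive until it finishes.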
Rust Translation
A clean Rust design should model this as:
- FileDescriptor as a process-table handle, not the file object itself.
- OpenFile or FileRef as an owned/refcounted kernel object.
- FileOps trait for read/write/ioctl/mmap-style dispatch.
- OpenFlags as a validated builder output, not raw integers everywhere.
- Path lookup returning typed resolved objects before open activation.
The unsafe boundary is not “reading bytes”; it is converting user memory, integer flags, and descriptor numbers into typed kernel capabilities.
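One way to express that boundary is to make the raw-to-typed conversion the only place where raw integers are accepted. The sketch below is an assumption-laden illustration: FileDescriptor::from_raw(), OpenFlagsBuilder, and ConvertError are hypothetical names, and the rule that truncation requires write access is a policy chosen for this example rather than a statement about Linux behavior.

```rust
#[derive(Debug, PartialEq, Eq)]
enum ConvertError {
    NegativeDescriptor,
    NoAccessRequested,
    TruncateWithoutWrite,
}

/// A descriptor number that has already been range-checked.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileDescriptor(u32);

impl FileDescriptor {
    /// The only way to obtain a FileDescriptor from an untrusted integer.
    fn from_raw(raw: i32) -> Result<Self, ConvertError> {
        if raw < 0 {
            Err(ConvertError::NegativeDescriptor)
        } else {
            Ok(FileDescriptor(raw as u32))
        }
    }
}

/// Validated flags; code past the boundary never sees raw bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct OpenFlags {
    read: bool,
    write: bool,
    truncate: bool,
}

#[derive(Default)]
struct OpenFlagsBuilder {
    read: bool,
    write: bool,
    truncate: bool,
}

impl OpenFlagsBuilder {
    fn read(mut self, on: bool) -> Self {
        self.read = on;
        self
    }

    fn write(mut self, on: bool) -> Self {
        self.write = on;
        self
    }

    fn truncate(mut self, on: bool) -> Self {
        self.truncate = on;
        self
    }

    /// Validation happens once, here; OpenFlags has no other constructor
    /// in this sketch.
    fn build(self) -> Result<OpenFlags, ConvertError> {
        if !self.read && !self.write {
            return Err(ConvertError::NoAccessRequested);
        }
        if self.truncate && !self.write {
            // Policy of this sketch only.
            return Err(ConvertError::TruncateWithoutWrite);
        }
        Ok(OpenFlags {
            read: self.read,
            write: self.write,
            truncate: self.truncate,
        })
    }
}
```

With this split, functions deeper in the design can take FileDescriptor and OpenFlags parameters and rely on the checks having already run, instead of re-validating integers at each layer.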
AI-Native Translation
For an AI runtime, the VFS pattern maps to tool handles. Agents should not hold ambient filesystem authority. They should receive typed handles with operation tables, explicit capability bits, audit metadata, and lifetime revocation.
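As a rough illustration of such a handle, the sketch below combines capability bits, a shared revocation flag, and an audit log. ToolHandle, CAP_READ, CAP_WRITE, and invoke() are hypothetical names invented for this example and do not describe any existing agent runtime.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

const CAP_READ: u8 = 0b01;
const CAP_WRITE: u8 = 0b10;

#[derive(Debug, PartialEq)]
enum ToolError {
    Revoked,
    MissingCapability,
}

/// Hypothetical handle given to an agent instead of ambient authority.
struct ToolHandle {
    name: String,
    caps: u8,                       // explicit capability bits
    revoked: Arc<AtomicBool>,       // shared with whoever issued the handle
    audit: Arc<Mutex<Vec<String>>>, // append-only record of every attempt
}

impl ToolHandle {
    /// Check revocation and capabilities, then record the attempt.
    /// A real handle would dispatch to an operation table on success.
    fn invoke(&self, needed: u8, action: &str) -> Result<(), ToolError> {
        let outcome = if self.revoked.load(Ordering::SeqCst) {
            Err(ToolError::Revoked)
        } else if (self.caps & needed) != needed {
            Err(ToolError::MissingCapability)
        } else {
            Ok(())
        };
        self.audit
            .lock()
            .unwrap()
            .push(format!("{} {} -> {:?}", self.name, action, outcome));
        outcome
    }
}
```

The issuer keeps its own clone of the revocation flag, so it can cut off a handle that is already in the agent's possession, and denied attempts are logged alongside successful ones.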
Evidence Links
- file-notes/linux__fs__open.c.md
- file-notes/linux__fs__read_write.c.md