VFS And Filesystems

Imported from _research/manual-study-linux/vfs-filesystems.md.

Status: implemented source-backed volume.

Source Surface

  • fs/open.c: open/openat/openat2 and close syscall path, open flag normalization, file creation, and struct file activation.
  • fs/read_write.c: read/write syscall path, vfs_read(), vfs_write(), and file-operation dispatch.
  • include/linux/fs.h: shared VFS object contracts such as struct file, inode/superblock types, and operation tables.

Entry Points

User space enters the VFS through syscall wrappers. fs/open.c exposes SYSCALL_DEFINE3(open) at line 1405, SYSCALL_DEFINE4(openat) at line 1412, and SYSCALL_DEFINE4(openat2) at line 1420. All three wrappers converge on do_sys_openat2() at line 1386, which normalizes the open flags through build_open_flags() at line 1179 before it resolves the path.
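
The convergence of several thin wrappers on one shared routine can be sketched in Rust. This is a minimal illustration, not kernel code: OpenHow and OpenRequest are hypothetical stand-ins, the stub core only records what it was asked to do, and routing sys_open through sys_openat is a simplification made for this sketch. AT_FDCWD uses the Linux value of -100.

```rust
/// Sentinel meaning "resolve relative to the current working directory".
pub const AT_FDCWD: i32 = -100;

/// Hypothetical stand-in for the extended open arguments.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OpenHow {
    pub flags: u64,
    pub mode: u64,
    pub resolve: u64,
}

/// What the shared core received (hypothetical, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub struct OpenRequest {
    pub dfd: i32,
    pub path: String,
    pub how: OpenHow,
}

/// Shared core that every wrapper reaches. A real implementation would
/// normalize flags, resolve the path, and install a descriptor; this stub
/// only reports the request it was given.
pub fn do_sys_openat2(dfd: i32, path: &str, how: OpenHow) -> OpenRequest {
    OpenRequest { dfd, path: path.to_string(), how }
}

/// open(path, flags, mode): no directory fd, so AT_FDCWD is supplied.
pub fn sys_open(path: &str, flags: u32, mode: u32) -> OpenRequest {
    sys_openat(AT_FDCWD, path, flags, mode)
}

/// openat(dfd, path, flags, mode): widen flags and mode into an OpenHow.
pub fn sys_openat(dfd: i32, path: &str, flags: u32, mode: u32) -> OpenRequest {
    let how = OpenHow { flags: flags as u64, mode: mode as u64, resolve: 0 };
    do_sys_openat2(dfd, path, how)
}

/// openat2(dfd, path, how): the caller already supplies the full OpenHow.
pub fn sys_openat2(dfd: i32, path: &str, how: OpenHow) -> OpenRequest {
    do_sys_openat2(dfd, path, how)
}
```

The point of the shape is that argument differences are absorbed at the edge, so only one function has to implement the open policy.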

Read and write follow the same layered style. fs/read_write.c exposes SYSCALL_DEFINE3(read) at line 723 and SYSCALL_DEFINE3(write) at line 747. These delegate to ksys_read() and ksys_write() respectively, which resolve the descriptor to a struct file and then call vfs_read() at line 554 and vfs_write() at line 667.

Core Data Model

The VFS is an object membrane between user-visible descriptors and kernel objects. User space receives file descriptors, but the kernel internally moves through paths, dentries, inodes, superblocks, mounts, struct file, and filesystem-specific operation tables.

The key contract is that generic VFS code performs common checks and lifetime management, then calls operation-table methods supplied by the filesystem or device. This lets ext4, procfs, tmpfs, sockets, pipes, and device files share the same syscall shape without sharing implementation internals.
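
A minimal Rust sketch of that contract follows. The trait FileOps and the types MemFile and NullDevice are hypothetical names chosen for illustration; they stand in for a regular file and a /dev/null-like device that share one generic entry point without sharing internals.

```rust
/// Hypothetical operation table: each file kind supplies its own methods.
pub trait FileOps {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, &'static str>;
    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str>;
}

/// A regular-file-like object backed by memory, with a read cursor.
pub struct MemFile {
    pub data: Vec<u8>,
    pub pos: usize,
}

impl FileOps for MemFile {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, &'static str> {
        let remaining = &self.data[self.pos..];
        let n = remaining.len().min(buf.len());
        buf[..n].copy_from_slice(&remaining[..n]);
        self.pos += n;
        Ok(n)
    }

    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str> {
        self.data.extend_from_slice(data);
        Ok(data.len())
    }
}

/// A /dev/null-like object: reads hit end-of-file, writes are discarded.
pub struct NullDevice;

impl FileOps for NullDevice {
    fn read(&mut self, _buf: &mut [u8]) -> Result<usize, &'static str> {
        Ok(0)
    }

    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str> {
        Ok(data.len())
    }
}

/// Generic layer: common checks first, then dispatch through the table.
pub fn vfs_read(file: &mut dyn FileOps, buf: &mut [u8]) -> Result<usize, &'static str> {
    if buf.is_empty() {
        return Ok(0); // nothing to do, no need to reach the implementation
    }
    file.read(buf)
}
```

The generic function never names a concrete file type, which is the property that lets new file kinds be added without touching the syscall-facing code.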

Open Flow

The open path can be read as a staged constructor:

  1. Copy and validate user arguments.
  2. Normalize open flags with build_open_flags().
  3. Resolve the path and permissions.
  4. Allocate or activate a struct file.
  5. Bind file operations.
  6. Install the file into the caller’s descriptor table.

do_dentry_open() at line 885 is the important internal activation point: after path lookup chooses the target object, the VFS turns that target into an open file with operations and mode bits. vfs_open() appears around lines 1070-1074 as the common helper for opening a resolved path. Close unwinds the descriptor and file lifetime through filp_close() at line 1507 and SYSCALL_DEFINE1(close) at line 1523.
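
The staged constructor and its matching close can be sketched as a toy model in Rust. Everything here is an assumption made for illustration: Process, its namespace map, and the error strings are hypothetical, permission checks are omitted, and the function build_open_flags only borrows the kernel function's name while decoding nothing but the access mode.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access { ReadOnly, WriteOnly, ReadWrite }

/// Activated open file: the resolved target plus its mode.
#[derive(Debug, PartialEq, Eq)]
pub struct OpenFile { pub inode: u64, pub access: Access }

/// Toy namespace plus descriptor table (both hypothetical).
pub struct Process {
    pub namespace: HashMap<String, u64>, // path -> inode number
    pub fds: Vec<Option<OpenFile>>,      // index is the descriptor number
}

/// Stage 2: turn raw bits into a checked value.
pub fn build_open_flags(raw: u32) -> Result<Access, &'static str> {
    match raw & 0b11 {
        0 => Ok(Access::ReadOnly),
        1 => Ok(Access::WriteOnly),
        2 => Ok(Access::ReadWrite),
        _ => Err("invalid access mode"),
    }
}

impl Process {
    pub fn new() -> Self {
        Process { namespace: HashMap::new(), fds: Vec::new() }
    }

    pub fn open(&mut self, path: &str, raw_flags: u32) -> Result<usize, &'static str> {
        // 1. Validate arguments.
        if path.is_empty() {
            return Err("empty path");
        }
        // 2. Normalize flags.
        let access = build_open_flags(raw_flags)?;
        // 3. Resolve the path (permission checks omitted in this sketch).
        let inode = *self.namespace.get(path).ok_or("no such file")?;
        // 4-5. Activate the open file; a real VFS would bind operations here.
        let file = OpenFile { inode, access };
        // 6. Install into the lowest free descriptor slot.
        let free = self.fds.iter().position(|slot| slot.is_none());
        let fd = match free {
            Some(i) => i,
            None => {
                self.fds.push(None);
                self.fds.len() - 1
            }
        };
        self.fds[fd] = Some(file);
        Ok(fd)
    }

    /// Close clears the slot; dropping the OpenFile ends its lifetime here.
    pub fn close(&mut self, fd: usize) -> Result<(), &'static str> {
        if fd < self.fds.len() && self.fds[fd].is_some() {
            self.fds[fd] = None;
            Ok(())
        } else {
            Err("bad descriptor")
        }
    }
}
```

Because each stage returns early on failure, a descriptor slot is only consumed once every earlier stage has succeeded.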

Read/Write Flow

The read/write path is policy plus dispatch. rw_verify_area() at line 453 checks offset/count/permission constraints before the operation reaches the object-specific implementation. vfs_read() and vfs_write() then route to the file's operations. Synchronous helpers such as new_sync_read() at line 483 and new_sync_write() at line 585 adapt the buffer-based call to files that implement only the iterator-based read_iter/write_iter methods.
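
The "verify, then dispatch, with a fallback adapter" shape can be sketched in Rust. This is a simplified model under stated assumptions: the File and FileOps structs, the Errno enum, and the slice-of-slices stand-in for an I/O iterator are hypothetical, and the checks are reduced to a readable bit plus range validation. The function names are borrowed from the kernel only to show which role each one plays.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Errno { Inval, BadF }

pub type ReadFn = fn(&mut [u8], &mut u64) -> Result<usize, Errno>;
pub type ReadIterFn = fn(&mut [&mut [u8]], &mut u64) -> Result<usize, Errno>;

/// Operation table with optional entries, like a C table with unset slots.
pub struct FileOps {
    pub read: Option<ReadFn>,
    pub read_iter: Option<ReadIterFn>,
}

pub struct File {
    pub readable: bool,
    pub pos: u64,
    pub ops: FileOps,
}

/// Range policy: count and offset must both fit a signed 64-bit value,
/// and so must their sum.
pub fn rw_verify_area(pos: u64, count: usize) -> Result<(), Errno> {
    const MAX: u64 = i64::MAX as u64;
    let count = count as u64;
    if count > MAX || pos > MAX || pos + count > MAX {
        Err(Errno::Inval)
    } else {
        Ok(())
    }
}

/// Adapter: wrap a single buffer as a one-segment iterator request.
fn new_sync_read(read_iter: ReadIterFn, buf: &mut [u8], pos: &mut u64) -> Result<usize, Errno> {
    let mut segments = [buf];
    read_iter(&mut segments, pos)
}

pub fn vfs_read(file: &mut File, buf: &mut [u8]) -> Result<usize, Errno> {
    if !file.readable {
        return Err(Errno::BadF);
    }
    rw_verify_area(file.pos, buf.len())?;
    if let Some(read) = file.ops.read {
        read(buf, &mut file.pos)
    } else if let Some(read_iter) = file.ops.read_iter {
        new_sync_read(read_iter, buf, &mut file.pos)
    } else {
        Err(Errno::Inval)
    }
}

/// Example object that only implements the iterator form: fills with zeros.
pub fn zero_read_iter(segs: &mut [&mut [u8]], pos: &mut u64) -> Result<usize, Errno> {
    let mut total = 0usize;
    for seg in segs.iter_mut() {
        seg.fill(0);
        total += seg.len();
    }
    *pos += total as u64;
    Ok(total)
}
```

A file that supplies neither method is rejected by the generic layer, so implementations never see a request they cannot serve.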

The lesson is that Linux does not special-case every file kind in the syscall handler. It normalizes the request into a file-object operation and then lets the operation table carry the concrete behavior.

Concurrency And Lifetime

The VFS boundary depends on reference ownership rather than global serialization. Descriptors reference files; files reference paths and inodes; inodes and dentries follow their own cache and reclaim rules. The syscall layer does not carry raw user values into deeper layers unchecked: it resolves descriptor numbers to file references, copies path names into kernel memory, and passes buffers along with validated lengths and explicit offsets.
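
Reference ownership of this kind maps naturally onto Arc in Rust. The sketch below is an analogy rather than a model of the kernel's actual refcounting: FdTable and OpenFile are hypothetical, and the Arc strong count plays the role of the file's reference count.

```rust
use std::sync::Arc;

pub struct OpenFile {
    pub inode: u64,
}

/// Descriptor table: each occupied slot owns one reference to a file.
pub struct FdTable {
    slots: Vec<Option<Arc<OpenFile>>>,
}

impl FdTable {
    pub fn new() -> Self {
        FdTable { slots: Vec::new() }
    }

    /// Store a reference in the lowest free slot and return its index.
    pub fn install(&mut self, file: Arc<OpenFile>) -> usize {
        let free = self.slots.iter().position(|slot| slot.is_none());
        match free {
            Some(i) => {
                self.slots[i] = Some(file);
                i
            }
            None => {
                self.slots.push(Some(file));
                self.slots.len() - 1
            }
        }
    }

    /// Take an extra reference for the duration of one operation.
    pub fn get(&self, fd: usize) -> Option<Arc<OpenFile>> {
        self.slots.get(fd)?.clone()
    }

    /// A duplicated descriptor shares the same open file.
    pub fn dup(&mut self, fd: usize) -> Option<usize> {
        let file = self.get(fd)?;
        Some(self.install(file))
    }

    /// Drop the slot's reference; the file is freed when the last one goes.
    pub fn close(&mut self, fd: usize) -> bool {
        self.slots.get_mut(fd).and_then(|slot| slot.take()).is_some()
    }
}
```

An operation that took its reference through get() keeps the file alive even if another thread closes the descriptor in the meantime, which is why no global lock is needed around close.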

Rust Translation

A clean Rust design should model this as:

  • FileDescriptor as a process-table handle, not the file object itself.
  • OpenFile or FileRef as an owned/refcounted kernel object.
  • FileOps trait for read/write/ioctl/mmap-style dispatch.
  • OpenFlags as a validated builder output, not raw integers everywhere.
  • Path lookup returning typed resolved objects before open activation.

The unsafe boundary is not “reading bytes”; it is converting user memory, integer flags, and descriptor numbers into typed kernel capabilities.
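
One way to keep raw integers at the edge is a fallible conversion into a typed flag set. The sketch below is illustrative only: the OpenFlags struct and FlagError enum are hypothetical, the bit values follow the generic Linux encoding (some architectures differ), and rejecting truncate without write access is a policy chosen for this sketch rather than a statement about Linux behaviour.

```rust
use std::convert::TryFrom;

pub const O_ACCMODE: u32 = 0o3;
pub const O_CREAT: u32 = 0o100;
pub const O_TRUNC: u32 = 0o1000;
pub const O_APPEND: u32 = 0o2000;
const KNOWN: u32 = O_ACCMODE | O_CREAT | O_TRUNC | O_APPEND;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Access { ReadOnly, WriteOnly, ReadWrite }

/// Validated flags: code past the boundary never sees the raw integer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct OpenFlags {
    pub access: Access,
    pub create: bool,
    pub truncate: bool,
    pub append: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FlagError {
    UnknownBits(u32),
    BadAccessMode,
    TruncateWithoutWrite,
}

impl TryFrom<u32> for OpenFlags {
    type Error = FlagError;

    fn try_from(raw: u32) -> Result<Self, FlagError> {
        // Reject anything this sketch does not understand.
        let unknown = raw & !KNOWN;
        if unknown != 0 {
            return Err(FlagError::UnknownBits(unknown));
        }
        let access = match raw & O_ACCMODE {
            0 => Access::ReadOnly,
            1 => Access::WriteOnly,
            2 => Access::ReadWrite,
            _ => return Err(FlagError::BadAccessMode),
        };
        let truncate = (raw & O_TRUNC) != 0;
        // Sketch-level policy, not a claim about Linux semantics.
        if truncate && access == Access::ReadOnly {
            return Err(FlagError::TruncateWithoutWrite);
        }
        Ok(OpenFlags {
            access,
            create: (raw & O_CREAT) != 0,
            truncate,
            append: (raw & O_APPEND) != 0,
        })
    }
}

impl OpenFlags {
    pub fn can_write(&self) -> bool {
        self.access != Access::ReadOnly
    }
}
```

Once the conversion succeeds, later stages can match on Access instead of masking bits, so an invalid combination cannot reappear deeper in the call chain.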

AI-Native Translation

For an AI runtime, the VFS pattern maps to tool handles. Agents should not hold ambient filesystem authority. They should receive typed handles with operation tables, explicit capability bits, audit metadata, and lifetime revocation.
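
A tool handle along these lines might look like the following Rust sketch. All names (ToolHandle, ToolOps, Revoker, Scratchpad, the CAP_ constants) are hypothetical, and the audit log is a plain in-memory list; the sketch only shows how capability bits, an operation table, audit records, and revocation can sit behind one typed handle.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

pub const CAP_READ: u32 = 1 << 0;
pub const CAP_WRITE: u32 = 1 << 1;

/// Operation table for a tool, analogous to file operations.
pub trait ToolOps {
    fn read(&self) -> String;
    fn write(&mut self, data: &str);
}

/// Example tool: an in-memory scratchpad.
pub struct Scratchpad(pub String);

impl ToolOps for Scratchpad {
    fn read(&self) -> String {
        self.0.clone()
    }
    fn write(&mut self, data: &str) {
        self.0.push_str(data);
    }
}

/// Held by the runtime so it can cut off a handle it has granted.
pub struct Revoker(Arc<AtomicBool>);

impl Revoker {
    pub fn revoke(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

pub struct ToolHandle {
    agent: String,
    ops: Box<dyn ToolOps>,
    caps: u32,
    revoked: Arc<AtomicBool>,
    pub audit: Vec<String>,
}

impl ToolHandle {
    pub fn grant(agent: &str, ops: Box<dyn ToolOps>, caps: u32) -> (ToolHandle, Revoker) {
        let flag = Arc::new(AtomicBool::new(false));
        let handle = ToolHandle {
            agent: agent.to_string(),
            ops,
            caps,
            revoked: Arc::clone(&flag),
            audit: Vec::new(),
        };
        (handle, Revoker(flag))
    }

    /// Common check run before every dispatch; every attempt is recorded.
    fn check(&mut self, cap: u32, op: &str) -> Result<(), &'static str> {
        let verdict = if self.revoked.load(Ordering::SeqCst) {
            Err("revoked")
        } else if (self.caps & cap) == 0 {
            Err("missing capability")
        } else {
            Ok(())
        };
        let outcome = if verdict.is_ok() { "ok" } else { "denied" };
        self.audit.push(format!("{} {} {}", self.agent, op, outcome));
        verdict
    }

    pub fn read(&mut self) -> Result<String, &'static str> {
        self.check(CAP_READ, "read")?;
        Ok(self.ops.read())
    }

    pub fn write(&mut self, data: &str) -> Result<(), &'static str> {
        self.check(CAP_WRITE, "write")?;
        self.ops.write(data);
        Ok(())
    }
}
```

The agent never touches the underlying tool directly, so authority is limited to what the handle's capability bits allow and ends when the runtime calls revoke().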

  • file-notes/linux__fs__open.c.md
  • file-notes/linux__fs__read_write.c.md