include/linux/iversion.h
Source file repositories/reference/linux-study-clean/include/linux/iversion.h
File Facts
- System: Linux kernel
- Corpus path: include/linux/iversion.h
- Extension: .h
- Size: 11220 bytes
- Lines: 301
- Domain: Core OS
- Bucket: Core Kernel Interface
- Inferred role: Core OS: implementation source
- Status: source implementation candidate
Why This File Exists
Core operating-system implementation surface: boot, tasks, memory, VFS, syscall-facing interfaces, synchronization, credentials, and isolation.
- Defines or uses C structs; map object ownership, embedded links, reference counts, and lock ownership.
Dependency Surface
linux/fs.h
Detected Declarations
- function attribute
- function inode_peek_iversion_raw
- function inode_set_max_iversion_raw
- function inode_set_iversion
- function inode_set_iversion_queried
- function inode_inc_iversion
- function inode_iversion_need_inc
- function inode_inc_iversion_raw
- function inode_peek_iversion
- function time_to_chattr
- function inode_eq_iversion_raw
- function inode_eq_iversion
Annotated Snippet
#ifndef _LINUX_IVERSION_H
#define _LINUX_IVERSION_H
#include <linux/fs.h>
/*
* The inode->i_version field:
* ---------------------------
* The change attribute (i_version) is mandated by NFSv4 and is mostly for
* knfsd, but is also used for other purposes (e.g. IMA). The i_version must
* appear larger to observers if there was an explicit change to the inode's
* data or metadata since it was last queried.
*
* An explicit change is one that would ordinarily result in a change to the
* inode status change time (aka ctime). i_version must appear to change, even
* if the ctime does not (since the whole point is to avoid missing updates due
* to timestamp granularity). If POSIX or other relevant spec mandates that the
* ctime must change due to an operation, then the i_version counter must be
* incremented as well.
*
* Making the i_version update completely atomic with the operation itself would
* be prohibitively expensive. Traditionally the kernel has updated the times on
* directories after an operation that changes its contents. For regular files,
* the ctime is usually updated before the data is copied into the cache for a
* write. This means that there is a window of time when an observer can
* associate a new timestamp with old file contents. Since the purpose of the
* i_version is to allow for better cache coherency, the i_version must always
* be updated after the results of the operation are visible. Updating it before
* and after a change is also permitted. (Note that no filesystems currently do
* this. Fixing that is a work-in-progress).
*
* Observers see the i_version as a 64-bit number that never decreases. If it
* remains the same since it was last checked, then nothing has changed in the
* inode. If it's different then something has changed. Observers cannot infer
* anything about the nature or magnitude of the changes from the value, only
* that the inode has changed in some fashion.
*
* Not all filesystems properly implement the i_version counter. Subsystems that
* want to use i_version field on an inode should first check whether the
* filesystem sets the SB_I_VERSION flag (usually via the IS_I_VERSION macro).
*
* Those that set SB_I_VERSION will automatically have their i_version counter
* incremented on writes to normal files. If the SB_I_VERSION is not set, then
* the VFS will not touch it on writes, and the filesystem can use it how it
* wishes. Note that the filesystem is always responsible for updating the
* i_version on namespace changes in directories (mkdir, rmdir, unlink, etc.).
* We consider these sorts of filesystems to have a kernel-managed i_version.
*
* It may be impractical for filesystems to keep i_version updates atomic with
* respect to the changes that cause them. They should, however, guarantee
* that i_version updates are never visible before the changes that caused
* them. Also, i_version updates should never be delayed longer than it takes
* the original change to reach disk.
*
* This implementation uses the low bit in the i_version field as a flag to
* track when the value has been queried. If it has not been queried since it
* was last incremented, we can skip the increment in most cases.
*
* In the event that we're updating the ctime, we will usually go ahead and
* bump the i_version anyway. Since that has to go to stable storage in some
* fashion, we might as well increment it as well.
*
* With this implementation, the value should always appear to observers to
* increase over time if the file has changed. It's recommended to use
* inode_eq_iversion() helper to compare values.
*
* Note that some filesystems (e.g. NFS and AFS) just use the field to store
* a server-provided value (for the most part). For that reason, those
* filesystems do not set SB_I_VERSION. These filesystems are considered to
* have a self-managed i_version.
*
* Persistently storing the i_version
* ----------------------------------
* Queries of the i_version field are not gated on them hitting the backing
* store. It's always possible that the host could crash after allowing
* a query of the value but before it has made it to disk.
*
* To mitigate this problem, filesystems should always use
* inode_set_iversion_queried when loading an existing inode from disk. This
* ensures that the next attempted inode increment will result in the value
* changing.
*
* Storing the value to disk therefore does not count as a query, so those
* filesystems should use inode_peek_iversion to grab the value to be stored.
* There is no need to flag the value as having been queried in that case.
*/
/*
* We borrow the lowest bit in the i_version to use as a flag to tell whether
* it has been queried since we last incremented it. If it has, then we must
Annotation
- Immediate include surface: `linux/fs.h`.
- Detected declarations: `function attribute`, `function inode_peek_iversion_raw`, `function inode_set_max_iversion_raw`, `function inode_set_iversion`, `function inode_set_iversion_queried`, `function inode_inc_iversion`, `function inode_iversion_need_inc`, `function inode_inc_iversion_raw`, `function inode_peek_iversion`, `function time_to_chattr`.
- Atlas domain: Core OS / Core Kernel Interface.
- Implementation status: source implementation candidate.
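The header comment in the snippet describes using the low bit of i_version as a "queried" flag so that increments can be skipped when nobody has looked at the value since the last bump. The following is a minimal userspace sketch of that idea, not the kernel implementation: every `model_` and `MODEL_` name is hypothetical, and the bit layout (flag in bit 0, counter in the upper 63 bits) and the use of C11 atomics are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: bit 0 is the "queried" flag,
 * the remaining 63 bits hold the counter. */
#define MODEL_QUERIED   1ULL
#define MODEL_INCREMENT 2ULL

struct model_inode {
    _Atomic uint64_t i_version;
};

/* Bump the counter if it was queried since the last bump, or if the
 * caller forces it (e.g. because ctime is being updated anyway).
 * Returns true when the stored value changed. */
bool model_maybe_inc(struct model_inode *ino, bool force)
{
    uint64_t cur = atomic_load(&ino->i_version);

    for (;;) {
        if (!force && !(cur & MODEL_QUERIED))
            return false;               /* nobody looked: skip the bump */

        /* Clear the flag and advance the counter by one step. */
        uint64_t next = (cur & ~MODEL_QUERIED) + MODEL_INCREMENT;

        if (atomic_compare_exchange_weak(&ino->i_version, &cur, next))
            return true;
        /* A failed exchange reloaded cur; retry with the fresh value. */
    }
}

/* Return the counter to an observer and mark the value as queried. */
uint64_t model_query(struct model_inode *ino)
{
    uint64_t cur = atomic_load(&ino->i_version);

    while (!(cur & MODEL_QUERIED)) {
        if (atomic_compare_exchange_weak(&ino->i_version, &cur,
                                         cur | MODEL_QUERIED))
            break;
    }
    return cur >> 1;                    /* strip the flag bit */
}

/* Walk through the skip-then-bump behaviour. */
void model_demo(void)
{
    struct model_inode ino;

    atomic_init(&ino.i_version, 0);
    assert(!model_maybe_inc(&ino, false)); /* never queried: skipped */
    assert(model_query(&ino) == 0);        /* observer sees 0, flag set */
    assert(model_maybe_inc(&ino, false));  /* queried: must bump */
    assert(model_query(&ino) == 1);
}
```

In this model the observer-visible value only grows, and a change after a query always produces a different value, which matches the guarantee the comment describes. Whether the kernel helpers use the same loop structure is not shown in the truncated snippet.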
Implementation Notes
- This generated page is the file-by-file coverage layer; curated subsystem chapters should link here when they synthesize a multi-file control flow.
- Core OS pages should be promoted from atlas-only to deep-reviewed when they explain data structures, invariants, locking, lifecycle, and C implementation snippets.
- Driver-family pages are intentionally pattern-oriented unless they are part of the selected PCIe/NVMe representative device path.
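The "Persistently storing the i_version" part of the snippet says that loading an inode from disk should mark the value as queried, while reading it back for storage should not. A small single-threaded sketch of that asymmetry follows. It is an illustrative model only: the `pmodel_` names are hypothetical, the low-bit layout is assumed, and atomics are omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: raw = (counter << 1) | queried_flag. */
struct pmodel_inode {
    uint64_t raw;
};

/* Load from disk: set the flag so the next increment is never skipped.
 * This models the role the comment assigns to inode_set_iversion_queried. */
void pmodel_load(struct pmodel_inode *ino, uint64_t disk_val)
{
    ino->raw = (disk_val << 1) | 1ULL;
}

/* Read the counter for writeback without flagging it as queried.
 * This models the role the comment assigns to inode_peek_iversion. */
uint64_t pmodel_peek(const struct pmodel_inode *ino)
{
    return ino->raw >> 1;
}

/* Increment only if the value was queried; returns 1 if it changed. */
int pmodel_inc(struct pmodel_inode *ino)
{
    if (!(ino->raw & 1ULL))
        return 0;
    ino->raw = (ino->raw & ~1ULL) + 2ULL;
    return 1;
}

void pmodel_demo(void)
{
    struct pmodel_inode ino;

    pmodel_load(&ino, 41);              /* value may have been seen pre-crash */
    assert(pmodel_inc(&ino) == 1);      /* first change after load bumps */
    assert(pmodel_peek(&ino) == 42);    /* writeback reads without flagging */
    assert(pmodel_inc(&ino) == 0);      /* so the next bump can be skipped */
}
```

Treating a load as a query is the conservative choice: an observer may have seen the on-disk value before a crash, so the first change afterwards must produce a new number.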