Documentation/admin-guide/device-mapper/vdo-design.rst

Source file repositories/reference/linux-study-clean/Documentation/admin-guide/device-mapper/vdo-design.rst

File Facts

System
Linux kernel
Corpus path
Documentation/admin-guide/device-mapper/vdo-design.rst
Extension
.rst
Size
36294 bytes
Lines
634
Domain
Support Tooling And Documentation
Bucket
Documentation
Inferred role
Support Tooling And Documentation: documentation
Status
atlas-only

Why This File Exists

Repository support layer: documentation, build tooling, samples, user-space helper tools, generated initramfs support, licenses, and validation utilities.

Annotated Snippet

.. SPDX-License-Identifier: GPL-2.0-only

================
Design of dm-vdo
================

The dm-vdo (virtual data optimizer) target provides inline deduplication,
compression, zero-block elimination, and thin provisioning. A dm-vdo target
can be backed by up to 256TB of storage, and can present a logical size of
up to 4PB. This target was originally developed at Permabit Technology
Corp. starting in 2009. It was first released in 2013 and has been used in
production environments ever since. It was made open-source in 2017 after
Permabit was acquired by Red Hat. This document describes the design of
dm-vdo. For usage, see vdo.rst in the same directory as this file.

Because deduplication rates fall drastically as the block size increases, a
vdo target has a maximum block size of 4K. However, it can achieve
deduplication rates of 254:1, i.e., up to 254 copies of a given 4K block can
reference a single 4K block of actual storage. It can achieve compression
rates of 14:1. Blocks consisting entirely of zeros consume no storage at all.
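
As a rough illustration of the arithmetic above, the following sketch
estimates how many physical 4K blocks a run of identical blocks would
occupy. The helper names are hypothetical, and the sketch assumes that a new
physical copy is needed once a block already has 254 references; it ignores
compression and metadata overhead entirely.

```python
import math

BLOCK_SIZE = 4096       # the 4K block size mentioned in the text
MAX_REFERENCES = 254    # copies that may share one physical block (254:1)


def is_zero_block(block: bytes) -> bool:
    """Return True if every byte in the block is zero."""
    return not any(block)


def physical_blocks_needed(block: bytes, copies: int) -> int:
    """Estimate the physical blocks used by `copies` identical 4K blocks.

    Hypothetical helper for illustration only: it models zero-block
    elimination and the 254-reference limit, nothing else.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("expected a 4K block")
    if copies == 0 or is_zero_block(block):
        return 0  # zero blocks consume no storage at all
    return math.ceil(copies / MAX_REFERENCES)
```

Under these assumptions, 254 copies of a non-zero block fit in one physical
block, while the 255th copy would require a second one.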

Theory of Operation
===================

The design of dm-vdo is based on the idea that deduplication is a two-part
problem. The first part is to recognize duplicate data. The second is to
avoid storing multiple copies of those duplicates. Therefore, dm-vdo has two
main parts: a deduplication index (called UDS) that is used to discover
duplicate data, and a data store with a reference-counted block map that
maps from logical block addresses to the actual storage location of the
data.
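
The two-part split can be sketched with a toy in-memory model. This is not
the vdo implementation: the class and attribute names are hypothetical,
SHA-256 stands in for whatever hash the real index uses, and the sketch
assumes the index is only a hint that is verified against the stored data
before it is trusted.

```python
import hashlib


class TinyDedupStore:
    """Toy model: an index to find duplicates plus a reference-counted map."""

    def __init__(self):
        self.index = {}       # block hash -> physical block number (the "index")
        self.block_map = {}   # logical block address -> physical block number
        self.ref_counts = {}  # physical block number -> number of references
        self.storage = {}     # physical block number -> block contents
        self.next_physical = 0

    def write(self, lba: int, data: bytes) -> None:
        name = hashlib.sha256(data).digest()
        pbn = self.index.get(name)
        # Part one: recognize a duplicate. The index entry may be stale,
        # so confirm the stored data really matches before sharing it.
        if pbn is None or self.storage.get(pbn) != data:
            pbn = self.next_physical
            self.next_physical += 1
            self.storage[pbn] = data
            self.ref_counts[pbn] = 0
            self.index[name] = pbn
        # Part two: avoid a second copy by mapping the logical address
        # to the shared physical block and counting the reference.
        self.ref_counts[pbn] += 1
        old = self.block_map.get(lba)
        self.block_map[lba] = pbn
        if old is not None:
            self._drop_reference(old)

    def _drop_reference(self, pbn: int) -> None:
        self.ref_counts[pbn] -= 1
        if self.ref_counts[pbn] == 0:
            # No logical address uses this block any more; free it.
            del self.ref_counts[pbn]
            del self.storage[pbn]

    def read(self, lba: int) -> bytes:
        return self.storage[self.block_map[lba]]
```

Writing the same contents to two logical addresses leaves a single entry in
``storage`` with a reference count of two; overwriting both addresses drops
the count to zero and frees the block.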

Zones and Threading
-------------------

Due to the complexity of data optimization, the number of metadata
structures involved in a single write operation to a vdo target is larger
than for most other targets. Furthermore, because vdo must operate on small
block sizes in order to achieve good deduplication rates, acceptable
performance can only be achieved through parallelism. Therefore, vdo's
design attempts to be lock-free.

Most of vdo's main data structures are designed to be easily divided into
"zones" such that any given bio needs to access only a single zone of any
zoned structure. Safety with minimal locking is achieved by ensuring that
during normal operation, each zone is assigned to a specific thread, and
only that thread will access the portion of the data structure in that zone.
Associated with each thread is a work queue. Each bio is associated with a
request object (the "data_vio") which will be added to a work queue when
the next phase of its operation requires access to the structures in the
zone associated with that queue.
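
The zone-per-thread arrangement described above can be sketched in user
space as follows. This is an illustration only: the ``Zone`` class, the
``process_request`` function, the two kinds of zones, and the modulo-based
zone selection are all assumptions made for the example, and a plain dict
stands in for the data_vio.

```python
import queue
import threading


class Zone:
    """One zone of a zoned structure, serviced by exactly one thread."""

    def __init__(self, name: str):
        self.name = name
        self.state = {}                  # touched only by this zone's thread
        self.work_queue = queue.Queue()  # the work queue tied to the thread
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self) -> None:
        while True:
            work = self.work_queue.get()
            if work is None:             # sentinel: shut the zone down
                break
            work(self)                   # runs on the owning thread, no lock

    def enqueue(self, work) -> None:
        self.work_queue.put(work)

    def stop(self) -> None:
        self.work_queue.put(None)
        self.thread.join()


def process_request(request, map_zones, count_zones, done) -> None:
    """Move one request through two phases, each in the zone it needs."""
    map_zone = map_zones[request["lba"] % len(map_zones)]
    count_zone = count_zones[request["pbn"] % len(count_zones)]

    def mapping_phase(zone: Zone) -> None:
        zone.state[request["lba"]] = request["pbn"]
        request["visited"].append(zone.name)
        count_zone.enqueue(counting_phase)   # hand off to the next zone

    def counting_phase(zone: Zone) -> None:
        zone.state[request["pbn"]] = zone.state.get(request["pbn"], 0) + 1
        request["visited"].append(zone.name)
        done.set()

    map_zone.enqueue(mapping_phase)
```

Each phase mutates only the state of the zone whose thread is running it, so
no explicit lock is taken on ``state``. The only synchronization is the queue
hand-off between threads.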

Another way of thinking about this arrangement is that the work queue for
each zone has an implicit lock on the structures it manages for all its
operations, because vdo guarantees that no other thread will alter those
structures.

Although each structure is divided into zones, this division is not
reflected in the on-disk representation of each data structure. Therefore,
the number of zones for each structure, and hence the number of threads,
can be reconfigured each time a vdo target is started.
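
One way to picture why the zone count can change between starts: if the
persistent form is flat and the zone of an entry is derived only from its
key and the current zone count, then repartitioning is purely an in-memory
matter. The sketch below assumes a modulo assignment and uses hypothetical
function names; the real on-disk formats and the real assignment function
are not described in this excerpt.

```python
def load_into_zones(on_disk: dict, zone_count: int) -> list:
    """Partition a flat, zone-agnostic mapping into per-zone dicts."""
    zones = [dict() for _ in range(zone_count)]
    for key, value in on_disk.items():
        zones[key % zone_count][key] = value  # assumed assignment rule
    return zones


def save_from_zones(zones: list) -> dict:
    """Merge per-zone dicts back into the flat form; zones leave no trace."""
    flat = {}
    for zone in zones:
        flat.update(zone)
    return dict(sorted(flat.items()))
```

Loading the same flat mapping with two zones or with three produces
different in-memory partitions, but saving either one yields the same flat
contents.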

The Deduplication Index
-----------------------

In order to identify duplicate data efficiently, vdo was designed to
leverage some common characteristics of duplicate data. From empirical
observations, we gathered two key insights. The first is that in most data
sets with significant amounts of duplicate data, the duplicates tend to
have temporal locality. When a duplicate appears, it is more likely that
