Linux/dm-cache
dm-cache
Device Mapper Cache (dm-cache) is a Linux kernel feature that enhances storage performance by adding a block-level cache on a separate, faster device. This setup can significantly boost random read/write performance on slower primary storage devices (such as HDDs) by caching frequently accessed blocks on a fast SSD or NVMe device.
The dm-cache tool is a collection of scripts designed to make setting up a Device Mapper Cache device straightforward. Unlike alternatives such as LVM cache and Bcache, dm-cache can be configured on devices with pre-existing filesystems, making it a safer and more convenient choice for systems with existing data.
Setup
dm-cache can be configured either with the dmcache.sh script or through OpenRC init scripts for a persistent setup.
The source code is available at https://git.tnonline.net/Forza/dm-cache and is released under the GPLv3 license.
Requirements
dm-cache utilises the dmsetup utility, which can usually be found in the lvm2 or device-mapper packages.
Three devices are required to set up a cache:
- origin: The slow device.
- cache: A fast SSD or NVMe device, which can vary in size.
- meta: A small device that stores dm-cache metadata.
The metadata device size depends on how many cache blocks fit on the cache device. With the default settings, it should be at least 0.01% of the cache device size. For a 50 GiB cache device with a cache block size of 128 KiB, a metadata device of about 5 MiB is enough. Smaller block sizes require more metadata and memory, while larger block sizes may reduce dm-cache's effectiveness.
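The rule of thumb above can be turned into a quick calculation. The following is a minimal sketch and not part of the dm-cache scripts: the function names cache_blocks and min_meta_bytes are hypothetical, and the second one simply applies the 0.01% figure quoted above.

```shell
#!/bin/sh
# Sketch: estimate the cache block count and a minimum metadata size.
# Function names are illustrative, not part of dm-cache.

# Number of cache blocks that fit on the cache device.
# $1 = cache device size in bytes, $2 = cache block size in bytes
cache_blocks() {
    echo $(( $1 / $2 ))
}

# Minimum metadata size in bytes, using the 0.01% rule of thumb.
# $1 = cache device size in bytes
min_meta_bytes() {
    echo $(( $1 / 10000 ))
}

cache_size=$(( 50 * 1024 * 1024 * 1024 ))   # 50 GiB
block_size=$(( 128 * 1024 ))                # 128 KiB

echo "cache blocks:     $(cache_blocks "$cache_size" "$block_size")"
echo "metadata (bytes): $(min_meta_bytes "$cache_size")"
```

For a real device, the size in bytes can be read with blockdev --getsize64 and passed in place of the example value.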
It is important to mount the filesystem using the /dev/mapper/dm-name path and not the filesystem UUID, as is commonly done. The kernel might associate the UUID with the origin device instead of the dm-cache device, and this can cause data loss!
# dmesg
BTRFS warning: duplicate device /dev/sdj1 devid 1 generation 182261 scanned by mount (13706)
The provided 90-dmcache.rules udev rule prevents this issue by removing the /dev/disk/by-uuid/ symlink to the origin device.
Configuration
The following options are available:
- dmname: Name for the assembled dm-cache device. It is exposed as the block device `/dev/mapper/dmname`.
- origindev: Path to the slow device that should be accelerated with dm-cache. Use a stable device ID, not the filesystem UUID.
- cachedev: The fast cache device, usually an SSD or NVMe disk.
- metadev: A small device to hold cache metadata.
- cachemode: Choose writethrough or writeback cache.
  - writethrough cache (default): Write-through caching prevents cachedev content from differing from origindev content. This mode only accelerates reads, but should allow the origin device to be used without the cache device after a crash.
  - writeback cache: With write-back caching, writes go to the cachedev first and are synced to the origin device in the background. If the system crashes, the dm-cache must be assembled again before use to avoid serious filesystem damage. If the cachedev fails, the filesystem can be irrevocably damaged!
- cacheblock: The size of cache blocks in sectors. dm-cache promotes and demotes only whole blocks. Too large a block size wastes cache space, reducing its effectiveness, while too small a block size increases memory and metadata overhead.
- cachepolicy: The cache policy affects how dm-cache promotes and demotes data on the cachedev. This is an advanced option. Leave it as default.
- readahead: Linux block device read-ahead value in sectors. The kernel calculates a suitable default if this is unset.
The Linux kernel documentation has more details on possible configuration options.
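For orientation, the options above correspond roughly to the fields of the kernel's cache target table line. The sketch below builds such a line. The helper name cache_table is hypothetical, the device paths are placeholders, and the way the dm-cache scripts assemble the table internally may differ.

```shell
#!/bin/sh
# Sketch: build a device-mapper table line for the kernel "cache" target.
# Field order per the kernel cache target documentation:
#   <start> <length> cache <metadev> <cachedev> <origindev> <block size>
#   <#feature args> <feature args> <policy> <#policy args>
# cache_table is an illustrative helper, not part of the dm-cache scripts.

# $1 = origin size in 512-byte sectors   $2 = metadev   $3 = cachedev
# $4 = origindev   $5 = cacheblock (sectors)   $6 = cachemode   $7 = cachepolicy
cache_table() {
    echo "0 $1 cache $2 $3 $4 $5 1 $6 $7 0"
}

# Placeholder devices; on a real system the origin size would come from
#   blockdev --getsz /dev/origindev
cache_table 41943040 /dev/mapper/meta /dev/mapper/ssd /dev/sdX1 256 writethrough default

# The resulting line could then be loaded with, for example:
#   dmsetup create dmname --table "$(cache_table ...)"
```

Passing the cache mode explicitly as a feature argument avoids relying on the kernel's own default mode, which may not match the default used by the scripts.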
Mount the filesystem using the /dev/mapper/dm-name path. Using the filesystem UUID, as is commonly done, can result in the kernel seeing the UUID from the origin device, potentially leading to data loss.
udev rules
To avoid the risk of accessing the filesystem via the origin device instead of the dm-cache device, the following udev rule can be used. It removes the UUID symlink pointing to the origin device.
File: /etc/udev/rules.d/90-dmcache.rules
ENV{ID_FS_UUID_ENC}=="df68a30d-d26e-4b9c-9606-a130e66ce63d", KERNEL=="sd*", SUBSYSTEM=="block", ACTION=="add|change", SYMLINK-="disk/by-uuid/$env{ID_FS_UUID_ENC}"
- ID_FS_UUID_ENC is the filesystem's UUID.
- sd* means the rule should match any /dev/sd* devices. Adjust if you use other names such as vd*, nvme*, etc.
The filesystem UUID can be found using blkid /dev/origindev.
# blkid /dev/sdj1
/dev/sdj1: LABEL="usb-backup" UUID="df68a30d-d26e-4b9c-9606-a130e66ce63d" UUID_SUB="254fe753-d4d6-4ad1-9cc3-cd9f4c1bfa67" BLOCK_SIZE="4096" TYPE="btrfs" PARTLABEL="Basic data partition" PARTUUID="ac0ae9b1-8e32-4e33-b641-998bc0298d14"
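Since the rule differs only by UUID and device name pattern, generating it can be scripted. This is a minimal sketch that assumes the rule format shown above. The helper name dmcache_udev_rule is hypothetical and not part of the dm-cache scripts.

```shell
#!/bin/sh
# Sketch: print a udev rule that drops the by-uuid symlink for one filesystem UUID.
# dmcache_udev_rule is an illustrative name; the rule text mirrors the example above.

# $1 = filesystem UUID   $2 = kernel device name pattern (defaults to sd*)
dmcache_udev_rule() {
    uuid=$1
    kernel=${2:-sd*}
    # Single quotes keep $env{...} literal so that udev, not the shell, expands it.
    printf 'ENV{ID_FS_UUID_ENC}=="%s", KERNEL=="%s", SUBSYSTEM=="block", ACTION=="add|change", SYMLINK-="disk/by-uuid/$env{ID_FS_UUID_ENC}"\n' "$uuid" "$kernel"
}

# On a real system the UUID could be read with:
#   blkid -s UUID -o value /dev/sdj1
dmcache_udev_rule "df68a30d-d26e-4b9c-9606-a130e66ce63d"
```

The output can be redirected to /etc/udev/rules.d/90-dmcache.rules. A second argument such as 'nvme*' (quoted, so the shell does not expand it) selects a different device name pattern.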
mdev rules
Alpine Linux uses mdev instead of udev by default. The setup with mdev is slightly more complicated because it does not support removing existing symlinks. A workaround is to use a shell script hook in /etc/mdev.conf.
- Install dmcache.mdev to /lib/mdev/dmcache. Make sure it has the executable bit set.
- Install dmcache-uuids to /etc/dmcache-uuids.
- Add the dmcache hook /lib/mdev/dmcache to mdev.conf in the persistent storage section.
File: /etc/mdev.conf
# persistent storage
dasd.* root:disk 0660 */lib/mdev/persistent-storage
mmcblk.* root:disk 0660 */lib/mdev/persistent-storage
nbd.* root:disk 0660 */lib/mdev/persistent-storage
nvme.* root:disk 0660 */lib/mdev/persistent-storage
sd[a-z].* root:disk 0660 */lib/mdev/persistent-storage; /lib/mdev/dmcache
sr[0-9]+ root:cdrom 0660 */lib/mdev/persistent-storage
vd[a-z].* root:disk 0660 */lib/mdev/persistent-storage
xvd[a-z].* root:disk 0660 */lib/mdev/persistent-storage
Using OpenRC
The OpenRC init script can automate setting up and stopping dm-cache during boot.
- Install conf.d/dmcache and init.d/dmcache
- Modify conf.d/dmcache to suit your setup
- Add a udev rule to block FS UUID device symlinks
- Add dmcache to the boot runlevel:
rc-update add dmcache boot
Multiple devices
If you have several devices, copy the conf.d file and symlink the init.d script under a new name. The filenames in init.d and conf.d must be the same.
cp /etc/conf.d/dmcache /etc/conf.d/dmcache.new
ln -s /etc/init.d/dmcache /etc/init.d/dmcache.new
- Update /etc/conf.d/dmcache.new
- Update udev rules
rc-service dmcache.new start
rc-update add dmcache.new boot
Using dmcache.sh
Edit dmcache.sh and add the devices and configuration options you need.
After starting dm-cache, remove the UUID symlink in /dev/disk/by-uuid/ that points to your origin device before attempting to mount the filesystem. Use the udev rule to remove the symlink persistently.
The dm-cache mapping created by dmcache.sh is not persistent. After a reboot, the dm-cache must be assembled again before the filesystem can be safely mounted.
To stop dm-cache manually, use dmsetup remove <dmname>.
Cache Statistics
Use cachestats.sh to get statistics on dm-cache performance.
# cachestats.sh --help
Usage: cachestats [-v|--verbose] [DEVICE_NAME or PATH] [DEVICE_NAME or PATH] ...
Options:
  -h, --help     Display this help message
  -v, --verbose  Display detailed information
# cachestats.sh -v data2
DEVICE
========
Device-mapper name: /dev/mapper/data2
Origin size: 9 TiB
Discards: no_discard_passdown

CACHE
========
Size / Usage: 100 GiB / 100 GiB (100 %)
Read Hit Rate: 335116714 / 520317915 (64 %)
Write Hit Rate: 24739679 / 31858340 (77 %)
Dirty: 0 bytes
Block Size: 128 KiB
Promotions / Demotions: 646797 / 646796
Migration Threshold: 1 MiB
Read-Write mode: rw
Type: writeback
Policy: smq
Status: OK

METADATA
========
Size / Usage: 256 MiB / 10 MiB (3 %)
cachestats.sh data*
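Similar hit rates can also be derived directly from dmsetup status. The following is a minimal sketch, assuming the status field order described in the kernel's cache target documentation (read hits, read misses, write hits, write misses following the block usage fields) and assuming that the totals shown by cachestats.sh are hits plus misses. The function name cache_hit_rates is hypothetical, and the sample line is constructed to match the figures above rather than captured from a real system.

```shell
#!/bin/sh
# Sketch: compute read/write hit rates from one "dmsetup status" line of a cache target.
# Assumed field layout (after "<start> <length> cache"):
#   $4 metadata block size   $5 used/total metadata blocks
#   $6 cache block size      $7 used/total cache blocks
#   $8 read hits  $9 read misses  $10 write hits  $11 write misses
cache_hit_rates() {
    awk '$3 == "cache" {
        reads  = $8 + $9
        writes = $10 + $11
        if (reads  > 0) printf "read hit rate: %d %%\n",  int($8  * 100 / reads)
        if (writes > 0) printf "write hit rate: %d %%\n", int($10 * 100 / writes)
    }'
}

# Illustrative input only; on a real system use:
#   dmsetup status data2 | cache_hit_rates
echo "0 19327352832 cache 8 2560/65536 256 819200/819200 335116714 185201201 24739679 7118661 646796 646797 0" | cache_hit_rates
```

Percentages are truncated to whole numbers, which is consistent with the 64 % and 77 % figures in the sample output.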
List all cache devices
The dmsetup utility can be used to list all known device-mapper devices.
List only cache devices.
# dmsetup ls --target cache
3t_backup (254, 23)
data1 (254, 24)
data2 (254, 25)
data3 (254, 26)
usb_backup (254, 27)
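When scripting against several cache devices, the names can be extracted from this listing. This is a minimal sketch. The helper name cache_names is hypothetical, and it assumes the "name (major, minor)" output format shown above.

```shell
#!/bin/sh
# Sketch: extract device names from "dmsetup ls --target cache" output.
# cache_names is an illustrative helper; it keeps only lines whose second
# field starts with "(", which skips messages such as "No devices found".
cache_names() {
    awk '$2 ~ /^\(/ { print $1 }'
}

# Illustrative input; on a real system use:
#   dmsetup ls --target cache | cache_names
printf '%s\n' "data1 (254, 24)" "data2 (254, 25)" | cache_names
```

The resulting names can then be passed one at a time to commands such as dmsetup status or dmsetup info.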
List all device-mapper devices and their relationships in a tree layout.
# dmsetup ls --tree
3t_backup (254:23)
 ├─ (8:114)
 ├─vg_800g-lv_cache_3t_backup_cache (254:3)
 │  └─ (8:0)
 └─vg_800g-lv_cache_3t_backup_meta (254:2)
    └─ (8:16)
data1 (254:24)
 ├─ (8:98)
 ├─vg_800g-lv_cache_data1_cache (254:6)
 │  └─ (8:0)
 └─vg_800g-lv_cache_data1_meta (254:4)
    └─ (8:16)
data2 (254:25)
 ├─ (8:130)
 ├─vg_800g-lv_cache_data2_cache (254:7)
 │  └─ (8:16)
 └─vg_800g-lv_cache_data2_meta (254:5)
    └─ (8:0)
data3 (254:26)
 ├─ (8:49)
 ├─vg_800g-lv_cache_data3_cache (254:14)
 │  └─ (8:16)
 └─vg_800g-lv_cache_data3_meta (254:15)
    └─ (8:0)
usb_backup (254:27)
 ├─ (8:145)
 ├─vg_800g-lv_cache_usb_backup_cache (254:1)
 │  └─ (8:0)
 └─vg_800g-lv_cache_usb_backup_meta (254:0)
    └─ (8:16)
vg_800g-3TB_meta1 (254:10)
 └─ (8:0)
vg_800g-3TB_meta2 (254:11)
 └─ (8:16)
vg_800g-6TB_meta1 (254:8)
 └─ (8:0)
vg_800g-6TB_meta2 (254:9)
 └─ (8:16)
vg_800g-virtiofs_meta1 (254:12)
 └─ (8:0)
vg_800g-virtiofs_meta2 (254:13)
 └─ (8:16)
dmsetup info can be used to display additional information for a cache device.
# dmsetup info /dev/mapper/data1
Name:              data1
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      4480
Major, minor:      254, 24
Number of targets: 1