Linux/dm-cache
dm-cache
![Logo. Illustration in black and white drawing of stacked books, ink and a brush](/mediawiki/images/thumb/5/54/Dm-cache_logo.jpg/300px-Dm-cache_logo.jpg)
Device Mapper Cache (dm-cache) is a Linux kernel feature designed to enhance storage performance by implementing a block-level cache on a separate cache device. dm-cache is a tool that helps the user set up a cache device.
The goal of dm-cache is to improve the random read/write performance of a slow HDD by using a small but fast SSD or NVMe device.
The main advantage of dm-cache over LVM and Bcache is that it can be set up on devices that already have a filesystem with data on them. Both LVM and Bcache require unformatted, empty devices (there are workarounds, but they can be risky).
Setup
dm-cache can be set up using dmcache.sh or with OpenRC init scripts for persistent configuration.
Source code can be downloaded from https://git.tnonline.net/Forza/dm-cache and is released under the GPLv3 license.
Requirements
dm-cache utilises the dmsetup utility, which can usually be found in the lvm2 or device-mapper packages.
dm-cache requires three devices:
- origin: The slow device.
- cache: A fast SSD or NVMe device. Can be of any size.
- meta: A small device that holds dm-cache metadata.
The metadata device size depends on how many cache blocks fit on the cache device. With the default settings it should be at least 0.01% of the cache device size. For example, with a 50 GiB cache device and a cache block size of 128 KiB, a metadata device of 5 MiB is enough. Smaller block sizes require more metadata and memory, while larger block sizes may reduce the effectiveness of the cache by storing cold data.
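The sizing rule above can be checked with a little shell arithmetic. This is a minimal sketch that applies only the 0.01% rule of thumb stated in the text; the variable names are illustrative and the values are the ones from the example.

```shell
#!/bin/sh
# Rule-of-thumb sizing from the text: metadata >= 0.01 % of the cache device.
# All values are examples; adjust them to your own devices.

cache_bytes=$((50 * 1024 * 1024 * 1024))   # 50 GiB cache device
block_bytes=$((128 * 1024))                # 128 KiB cache block

# Number of cache blocks that fit on the cache device.
cache_blocks=$((cache_bytes / block_bytes))

# Block size in 512-byte sectors, the unit used by device-mapper.
block_sectors=$((block_bytes / 512))

# 0.01 % of the cache device size, expressed in KiB (integer division).
meta_kib=$((cache_bytes / 10000 / 1024))

echo "cache blocks:       $cache_blocks"
echo "block size:         $block_sectors sectors"
echo "metadata (minimum): $meta_kib KiB"
```

For the 50 GiB example this gives 409600 cache blocks and a minimum of roughly 5 MiB of metadata, in line with the figure quoted above.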
It is important to mount the filesystem on the dm-cache using the /dev/mapper/dmname path and not the filesystem UUID, as is commonly done. This is because the kernel might still see the UUID from the origin device, and this can cause data loss!
If you are using Btrfs, a message like the following may appear in the kernel log:
# dmesg
BTRFS warning: duplicate device /dev/sdj1 devid 1 generation 182261 scanned by mount (13706)
There is a udev rule that prevents this issue by removing the /dev/disk/by-uuid/ symlink to the origin device.
Configuration
The following options are available:
- dmname: Choose a new name for the assembled dm-cache. It will be exposed as a block device at `/dev/mapper/dmname`.
- origindev: Path to the slow device that should be accelerated with dm-cache. Use a stable device ID, not the FS UUID.
- cachedev: The fast cache device, usually an SSD or NVMe disk.
- metadev: A small device to hold cache metadata.
- cachemode: Choose writethrough or writeback cache.
  - writethrough cache (default): Writethrough caching prohibits cachedev content from being different from origindev content. This mode only accelerates reads, but should allow the origin device to be used without the cache dev after a crash.
  - writeback cache: Writes go to the cachedev first and are then synced to the origin dev in the background. If the system crashes, the dm-cache must be assembled again before use to avoid serious filesystem damage. If the cachedev fails, the filesystem can be irrevocably damaged!
- cacheblock: The size of cache blocks in sectors. dm-cache promotes and demotes only whole blocks. Too large a block size wastes cache space, reducing its effectiveness, while too small a block size has more memory and metadata overhead.
- cachepolicy: Cache policy affects how dm-cache promotes and demotes data from the cachedev. This is an advanced option. Leave it as default.
- readahead: Linux block device read-ahead value in sectors. The kernel calculates a suitable default if this is unset.
The Linux kernel documentation has more details on possible configuration options.
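To show how these options fit together, the sketch below builds a device-mapper table line for the kernel's cache target and prints the matching dmsetup command instead of running it. It is a hedged illustration, not necessarily what dmcache.sh or the init script does internally. The device paths and the `build_cache_table` helper are assumptions made for this example.

```shell
#!/bin/sh
# Sketch: build a device-mapper "cache" table line from the options above.
# Variable names mirror the option list; the device paths are placeholders.

dmname="data1"
origindev="/dev/disk/by-id/example-origin-part1"   # hypothetical path
cachedev="/dev/vg_fast/cache"                      # hypothetical path
metadev="/dev/vg_fast/meta"                        # hypothetical path
cachemode="writethrough"
cacheblock=256            # sectors (256 * 512 B = 128 KiB)
cachepolicy="default"

# build_cache_table ORIGIN_SECTORS
# Table layout of the cache target:
#   <start> <length> cache <metadata dev> <cache dev> <origin dev>
#   <block size> <#feature args> <feature arg> <policy> <#policy args>
build_cache_table() {
    printf '0 %s cache %s %s %s %s 1 %s %s 0\n' \
        "$1" "$metadev" "$cachedev" "$origindev" \
        "$cacheblock" "$cachemode" "$cachepolicy"
}

# On a real system the origin size in 512-byte sectors would come from:
#   origin_sectors=$(blockdev --getsz "$origindev")
origin_sectors=41943040   # example value: 20 GiB

table=$(build_cache_table "$origin_sectors")

# Print the command rather than executing it, so nothing is modified.
echo "dmsetup create $dmname --table '$table'"
```

Printing the command first makes it easy to review the device order (metadata, cache, origin) before anything touches real disks.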
udev rules
To avoid risk of accessing the filesystem via the origin device instead of via the dm-cache device, the following udev rule can be used. It removes the UUID symlink pointing to the origin device.
File: /etc/udev/rules.d/90-dmcache.rules
ENV{ID_FS_UUID_ENC}=="df68a30d-d26e-4b9c-9606-a130e66ce63d", KERNEL=="sd*", SUBSYSTEM=="block", ACTION=="add|change", SYMLINK-="disk/by-uuid/$env{ID_FS_UUID_ENC}"
- ID_FS_UUID_ENC is the filesystem's UUID.
- sd* means the rule should match any /dev/sd* devices. Adjust if you use other names such as vd*, nvme*, etc.
The filesystem UUID can be found using blkid /dev/origindev.
# blkid /dev/sdj1
/dev/sdj1: LABEL="usb-backup" UUID="df68a30d-d26e-4b9c-9606-a130e66ce63d" UUID_SUB="254fe753-d4d6-4ad1-9cc3-cd9f4c1bfa67" BLOCK_SIZE="4096" TYPE="btrfs" PARTLABEL="Basic data partition" PARTUUID="ac0ae9b1-8e32-4e33-b641-998bc0298d14"
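The rule can also be generated from a UUID rather than typed by hand. The following is a minimal sketch; `make_rule` is a hypothetical helper that is not part of dm-cache, and it only prints the rule text.

```shell
#!/bin/sh
# Sketch: print a udev rule like the one above for a given filesystem UUID.
# make_rule is a hypothetical helper, not part of dm-cache.

# make_rule UUID [KERNEL_PATTERN]
make_rule() {
    uuid=$1
    kernel=${2:-sd*}
    # Single quotes keep $env{ID_FS_UUID_ENC} literal for udev to expand.
    printf 'ENV{ID_FS_UUID_ENC}=="%s", KERNEL=="%s", SUBSYSTEM=="block", ACTION=="add|change", SYMLINK-="disk/by-uuid/$env{ID_FS_UUID_ENC}"\n' \
        "$uuid" "$kernel"
}

# On a real system the UUID could be read with:
#   uuid=$(blkid -s UUID -o value /dev/sdj1)
uuid="df68a30d-d26e-4b9c-9606-a130e66ce63d"   # value from the example above

make_rule "$uuid" 'sd*'

# To install (as root), write the output to
# /etc/udev/rules.d/90-dmcache.rules and reload the rules:
#   udevadm control --reload
#   udevadm trigger --subsystem-match=block
```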
mdev rules
Alpine Linux uses mdev instead of udev by default. The setup with mdev is slightly more complicated because it does not support removing existing symlinks. A workaround is to use a shell script hook in /etc/mdev.conf.
- Install dmcache.mdev to /lib/mdev/dmcache. Make sure it has the executable bit set.
- Install dmcache-uuids to /etc/dmcache-uuids.
- Add the dmcache hook /lib/mdev/dmcache to mdev.conf at the persistent storage section.
File: /etc/mdev.conf
# persistent storage
dasd.* root:disk 0660 */lib/mdev/persistent-storage
mmcblk.* root:disk 0660 */lib/mdev/persistent-storage
nbd.* root:disk 0660 */lib/mdev/persistent-storage
nvme.* root:disk 0660 */lib/mdev/persistent-storage
sd[a-z].* root:disk 0660 */lib/mdev/persistent-storage; /lib/mdev/dmcache
sr[0-9]+ root:cdrom 0660 */lib/mdev/persistent-storage
vd[a-z].* root:disk 0660 */lib/mdev/persistent-storage
xvd[a-z].* root:disk 0660 */lib/mdev/persistent-storage
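To give an idea of what such a hook might do, here is a purely hypothetical sketch. It is not the dmcache.mdev script from the repository. It assumes that /etc/dmcache-uuids lists one filesystem UUID per line, and that by-uuid links resolving to device-mapper nodes should be kept. Use the shipped script for real setups.

```shell
#!/bin/sh
# Hypothetical mdev hook: drop by-uuid symlinks that point at origin devices.
# Assumptions (not confirmed by the source): one UUID per line in UUID_FILE,
# "#" starts a comment, and links to device-mapper nodes must be preserved.

UUID_FILE="${UUID_FILE:-/etc/dmcache-uuids}"
BYUUID_DIR="${BYUUID_DIR:-/dev/disk/by-uuid}"

remove_origin_links() {
    # Nothing to do when no UUID list is installed.
    [ -r "$UUID_FILE" ] || return 0
    while IFS= read -r uuid; do
        # Skip empty lines and comments.
        case "$uuid" in ''|'#'*) continue ;; esac
        link="$BYUUID_DIR/$uuid"
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        case "$target" in
            */mapper/*|*/dm-*) ;;    # keep links to device-mapper nodes
            *) rm -f "$link" ;;      # remove links to the origin device
        esac
    done < "$UUID_FILE"
}

remove_origin_links
```

The paths are read from variables so the function can be tried against a scratch directory before it is pointed at /dev.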
Using OpenRC
The OpenRC init script can automate setting up and stopping dm-cache during boot.
- Install conf.d/dmcache and init.d/dmscript
- Modify conf.d/dmcache to suit your setup
- Add a udev rule to block FS UUID device symlinks
- Add dmcache to boot runlevel: rc-update add dmcache boot
Multiple devices
If you have several devices, you can simply copy the conf.d file and symlink the init.d script to a new name. The filenames in init.d and conf.d must be the same.
cp /etc/conf.d/dmcache /etc/conf.d/dmcache.new
ln -s /etc/init.d/dmcache /etc/init.d/dmcache.new
- Update /etc/conf.d/dmcache.new
- Update udev rules
rc-service dmcache.new start
rc-update add dmcache.new boot
Using dmcache.sh
Edit dmcache.sh and add the devices and configuration options you need.
After starting dm-cache, you should remove the UUID symlink in /dev/disk/by-uuid/ that points to your origin device. The udev rule can also be used to achieve this.
The dm-cache mapping is not persistent. After a reboot, the dm-cache must be assembled before the filesystem can be safely mounted.
dm-cache is stopped manually with dmsetup remove <dmname>.
Cache Statistics
Use cachestats.sh to get some statistics on the dm-cache performance.
# cachestats.sh --help
Usage: cachestats.sh [DEVICE_NAME or PATH] [DEVICE_NAME or PATH] ...
  -h, --help    Display this help message
# cachestats.sh data2
DEVICE
========
Device-mapper name: /dev/mapper/data2
Origin size: 9 TiB
Discards: no_discard_passdown

CACHE
========
Size / Usage: 100 GiB / 100 GiB (100 %)
Read Hit Rate: 335116714 / 520317915 (64 %)
Write Hit Rate: 24739679 / 31858340 (77 %)
Dirty: 0 bytes
Block Size: 128 KiB
Promotions / Demotions: 646797 / 646796
Migration Threshold: 1 MiB
Read-Write mode: rw
Type: writeback
Policy: smq
Status: OK

METADATA
========
Size / Usage: 256 MiB / 10 MiB (3 %)
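The percentages in the sample output appear consistent with the first number divided by the second, truncated to a whole percent. A minimal sketch of that arithmetic follows; `pct` is a hypothetical helper and is not taken from cachestats.sh.

```shell
#!/bin/sh
# Sketch: integer percentage as it seems to be shown in the sample output.
# pct is a hypothetical helper, not part of cachestats.sh.

# pct PART TOTAL -> prints PART as a truncated percentage of TOTAL
pct() {
    # Guard against division by zero on a cache that has seen no I/O yet.
    if [ "$2" -eq 0 ]; then
        echo 0
    else
        echo $(( $1 * 100 / $2 ))
    fi
}

# Values copied from the sample output above.
echo "Read hit rate:  $(pct 335116714 520317915) %"
echo "Write hit rate: $(pct 24739679 31858340) %"
echo "Metadata usage: $(pct 10 256) %"
```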
List all cache devices
The dmsetup utility can be used to list all known device-mapper devices.
List only cache devices.
# dmsetup ls --target cache
3t_backup (254, 23)
data1 (254, 24)
data2 (254, 25)
data3 (254, 26)
usb_backup (254, 27)
List all device-mapper devices and their relationships in a tree layout.
# dmsetup ls --tree
3t_backup (254:23)
 ├─ (8:114)
 ├─vg_800g-lv_cache_3t_backup_cache (254:3)
 │  └─ (8:0)
 └─vg_800g-lv_cache_3t_backup_meta (254:2)
    └─ (8:16)
data1 (254:24)
 ├─ (8:98)
 ├─vg_800g-lv_cache_data1_cache (254:6)
 │  └─ (8:0)
 └─vg_800g-lv_cache_data1_meta (254:4)
    └─ (8:16)
data2 (254:25)
 ├─ (8:130)
 ├─vg_800g-lv_cache_data2_cache (254:7)
 │  └─ (8:16)
 └─vg_800g-lv_cache_data2_meta (254:5)
    └─ (8:0)
data3 (254:26)
 ├─ (8:49)
 ├─vg_800g-lv_cache_data3_cache (254:14)
 │  └─ (8:16)
 └─vg_800g-lv_cache_data3_meta (254:15)
    └─ (8:0)
usb_backup (254:27)
 ├─ (8:145)
 ├─vg_800g-lv_cache_usb_backup_cache (254:1)
 │  └─ (8:0)
 └─vg_800g-lv_cache_usb_backup_meta (254:0)
    └─ (8:16)
vg_800g-3TB_meta1 (254:10)
 └─ (8:0)
vg_800g-3TB_meta2 (254:11)
 └─ (8:16)
vg_800g-6TB_meta1 (254:8)
 └─ (8:0)
vg_800g-6TB_meta2 (254:9)
 └─ (8:16)
vg_800g-virtiofs_meta1 (254:12)
 └─ (8:0)
vg_800g-virtiofs_meta2 (254:13)
 └─ (8:16)
dmsetup info can be used to display additional information for a cache device.
# dmsetup info /dev/mapper/data1
Name:              data1
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      4480
Major, minor:      254, 24
Number of targets: 1