Btrfs/Send

From Forza's ramblings

Snapshots and Subvolumes

Subvolumes are useful for separating data with different backup retention needs. Consider the root '/' filesystem and the /home directory: it may be reasonable to keep long-term backups of the home directory, while the root filesystem only needs a short retention period. Putting the two in separate subvolumes makes it easy to manage different backup schedules.

A snapshot is a point-in-time copy of a subvolume. A read-write snapshot is basically a normal subvolume that can be used independently of its source. Read-only snapshots, on the other hand, are static and cannot be modified, which is the prerequisite for btrfs send.

Btrfs Send

btrfs send is a command that outputs a stream of data representing the contents of a snapshot, or the changes between two snapshots. This stream can be sent to btrfs receive or to another process for archiving.

btrfs send <subvol> | btrfs receive /mnt/target

The pipe '|' symbol is used to send the standard output from one program to another, in this case directly to btrfs receive. It is also possible to 'pipe' the transfer across SSH connections or combine with tools like pv or mbuffer to get a nice progress report.
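A minimal sketch of the SSH and pv variant mentioned above. To stay safe to paste, the helper only prints the pipeline instead of running it. The function name print_send_pipeline, the host name backuphost and the paths are placeholders chosen for illustration, and the remote user is assumed to have permission to run btrfs receive.

```shell
# Print (do not run) a pipeline that sends a read-only snapshot through pv
# for a progress display and receives it on a remote host over SSH.
# Arguments: <snapshot> <ssh host> <target directory on the remote host>
print_send_pipeline() {
    printf 'btrfs send %s | pv | ssh %s btrfs receive %s\n' "$1" "$2" "$3"
}

# Hypothetical host and paths, for illustration only.
print_send_pipeline /mnt/rootvol/snapshots/home.2024-01 backuphost /media/backup
```

Once the printed command looks right, it can be run as root on the sending machine.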

Btrfs send usage:

btrfs send [options] <subvol> [<subvol>...]

Options:

-e
when sending multiple subvolumes at once, use the new format and omit the end-cmd marker between the subvolumes.
-p <parent>
send an incremental stream using <parent> as the basis.
-f <outfile>
write the stream to <outfile> instead of standard output. Standard output is the default so that the stream can be piped, for example, to btrfs receive.

Full documentation is available at https://btrfs.readthedocs.io/en/latest/btrfs-send.html

Btrfs Receive

btrfs receive is a command used to apply the changes received from `btrfs send` to a destination filesystem. It reconstructs the snapshots on the receiving side.

btrfs receive [options] <mount>

Options:

-f <FILE>
read the stream from FILE instead of stdin.

See https://btrfs.readthedocs.io/en/latest/btrfs-receive.html for further details.

Incremental Snapshots

Btrfs supports incremental snapshots, meaning that multiple read-only snapshots of the same source subvolume are taken over time.

Incremental snapshots are incredibly efficient on storage space as they only store changes since the previous snapshot. If no changes have happened, no additional space is used at all (except for a small metadata reference to it).

These incremental snapshots can also be sent to a backup location using btrfs send.

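It is not necessary to keep the whole chain of incremental snapshots, and it is fine to prune and delete snapshots in any order. For incremental sends, keep at least one snapshot that exists on both the source and the backup so that it can serve as the parent.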

A snapshot schedule could look like this:

  • make hourly snapshots
  • after 24 hours, keep one snapshot per day for 7 days.
  • after 7 days, keep one weekly snapshot for 6 months.
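As a rough illustration of the "one snapshot per day" step in such a schedule, the sketch below filters a list of snapshot names and keeps the earliest one of each day. It only decides which names to keep and deletes nothing. The helper name daily_keepers and the naming pattern (a name ending in a YYYYMMDDTHHMM timestamp) are assumptions made for this example.

```shell
# Read snapshot names (one per line) on standard input and print the
# earliest snapshot of each day; the remaining names are prune candidates.
# Assumes names end in a YYYYMMDDTHHMM timestamp, e.g. home.20240103T1701.
daily_keepers() {
    # Dropping the last 5 characters (THHMM) leaves "<name>.<YYYYMMDD>".
    sort | awk '{ day = substr($0, 1, length($0) - 5) } !seen[day]++'
}

# Example with three made-up snapshot names.
printf '%s\n' home.20240103T1701 home.20240103T1801 home.20240104T0001 |
    daily_keepers
```

A real pruning job would also need to compare the timestamps against the current date before deleting anything, which is left out here.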

Examples

Let's consider the following filesystem. It is the root filesystem of one of my Gentoo Linux machines.

# tree -d -L 2 /mnt/rootvol/
/mnt/rootvol          # Btrfs top level volume
    ├── snapshots     # Directory to keep snapshots
    └── volume        # A flat subvolume layout with all filesystem subvolumes. 
        ├── home      # User homes
        ├── root      # System root mount point 
        ├── var_log   # Logfiles
        └── www       # Websites data
Subvolume 'volume/root' is mounted as the system root '/' mount point.
Subvolume 'volume/home' is mounted at /home.
Subvolume 'volume/www' is mounted at /var/www.
Subvolume 'volume/var_log' is mounted at /var/log.

Make a read-only snapshot of home and root subvols

# btrfs subvolume snapshot -r /mnt/rootvol/volume/root /mnt/rootvol/snapshots/root.2024-01
# btrfs subvolume snapshot -r /mnt/rootvol/volume/home /mnt/rootvol/snapshots/home.2024-01

Removing a snapshot

Snapshots are normal subvolumes and can be deleted in the same way.

# btrfs subvolume delete /mnt/rootvol/snapshots/home.2024-01

Sending a snapshot to a backup location

This sends a full copy of the snapshot to a backup disk with a Btrfs filesystem mounted at /media/backup.

# btrfs send /mnt/rootvol/snapshots/home.2024-01 | btrfs receive /media/backup

The snapshot will be created as /media/backup/home.2024-01.

Note: Only read-only snapshots can be sent.

If the target location is not Btrfs, it is possible to send the subvolume to a file that can be stored on any media or uploaded to cloud storage.

Here we are sending the snapshot through zstd and storing it as a compressed archive at /media/backup/www.20240103T0001.zst.

# btrfs send /mnt/rootvol/snapshots/www.20240103T0001/ | zstd -o /media/backup/www.20240103T0001.zst
At subvol /mnt/rootvol/snapshots/www.20240103T0001/
/*stdin*\            : 74.93%   (  17.4 GiB =>   13.0 GiB, /media/backup/www.20240103T0001.zst)
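The reverse direction is not shown above. As a sketch, a compressed stream can be decompressed and piped back into btrfs receive on a machine with a Btrfs filesystem. The function name restore_archive, the DRY_RUN variable and the /mnt/restore path are assumptions made for this example; by default the function only prints what it would run.

```shell
# Restore a zstd-compressed send stream into a Btrfs filesystem.
# Arguments: <archive.zst> <directory on a mounted Btrfs filesystem>
# Prints the pipeline unless DRY_RUN=0 is set.
restore_archive() {
    if [ "${DRY_RUN:-1}" = 0 ]; then
        zstd -dc "$1" | btrfs receive "$2"
    else
        echo "zstd -dc $1 | btrfs receive $2"
    fi
}

# /mnt/restore is a hypothetical target used for illustration.
restore_archive /media/backup/www.20240103T0001.zst /mnt/restore
```

Setting DRY_RUN=0 before calling the function (as root) runs the pipeline for real.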

Send incremental snapshots

Using btrfs send | receive with incremental snapshots enables efficient data transfer. By sending only the changes, it minimises data transfer size, reduces the backup time and maintains efficient storage usage on the target.

Btrfs doesn't require any specific naming scheme for incremental snapshots because it uses internal UUID references.

Using a datetime format like ISO 8601 is popular and works with third party tools like btrbk and Samba.
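A small sketch of how such names could be generated. The helper name snapshot_name is made up for this example, and the format string is an assumption that mirrors the compact date and time style used in the snapshot names on this page.

```shell
# Build a snapshot name such as home.20240103T1701 from a subvolume name
# and the current local time (compact ISO 8601 style, minute precision).
snapshot_name() {
    printf '%s.%s\n' "$1" "$(date +%Y%m%dT%H%M)"
}

snapshot_name home

# It could then be used like this (as root):
# btrfs subvolume snapshot -r /mnt/rootvol/volume/home \
#     "/mnt/rootvol/snapshots/$(snapshot_name home)"
```

Because the timestamp is zero-padded and ordered from year to minute, the names sort chronologically in a plain directory listing.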

Let's consider the following snapshots:

# ls /mnt/rootvol/snapshots
home.20240103T1701
home.20240103T1801
home.20240103T1901

To make incremental backups possible, the same initial snapshot has to exist on both sides. Creating it is usually referred to as a full send/backup.

# btrfs send /mnt/rootvol/snapshots/home.20240103T1701 | btrfs receive /media/backup

Now it is possible to use this base copy as a common reference for both the send and receive sides when sending the next snapshot.

# btrfs send -p /mnt/rootvol/snapshots/home.20240103T1701 /mnt/rootvol/snapshots/home.20240103T1801 | btrfs receive /media/backup

You will notice that the incremental send is very quick. Here, the original full copy took 2 minutes, while this second incremental send only took a few seconds because not much had actually changed between the two snapshots.
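When this is scripted, the parent given to -p has to be a snapshot that exists on both sides. The sketch below picks the newest snapshot name present in both directories. It is an illustration under assumptions: latest_common is a made-up helper, it relies on timestamped names sorting chronologically, and it only compares names, whereas Btrfs itself matches snapshots by UUID, so a same-named but different subvolume on the target would fool it.

```shell
# Print the newest snapshot name that exists in both directories.
# Arguments: <source snapshot dir> <backup dir> <subvolume name prefix>
# Returns non-zero if there is no common snapshot (a full send is needed).
latest_common() (
    parent=
    # The glob expands in sorted order, so the last match is the newest.
    for path in "$1/$3".*; do
        name=${path##*/}
        if [ -d "$2/$name" ]; then
            parent=$name
        fi
    done
    [ -n "$parent" ] && printf '%s\n' "$parent"
)

# Possible use (as root), with the paths from the example above:
# parent=$(latest_common /mnt/rootvol/snapshots /media/backup home) &&
#     btrfs send -p "/mnt/rootvol/snapshots/$parent" \
#         /mnt/rootvol/snapshots/home.20240103T1901 | btrfs receive /media/backup
```

The function body runs in a subshell, so its variables do not leak into the calling script.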

Checking with du, we can see that both snapshots have all data.

# du -sh /media/backup/home.*
5,1G    /media/backup/home.20240103T1701
5,1G    /media/backup/home.20240103T1801

Let's send the third snapshot. This time we can use the more recent incremental snapshot that we sent earlier.

# btrfs send -p /mnt/rootvol/snapshots/home.20240103T1801 /mnt/rootvol/snapshots/home.20240103T1901 | btrfs receive /media/backup

As seen above, du does not understand Btrfs snapshots, so it thinks the disk space usage doubled, when in fact it hardly changed at all. compsize is similar to du, but gives much more accurate details:

# compsize  /media/backup/home.*
Processed 569084 files, 192666 regular extents (582168 refs), 37786 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       52%      2.5G         4.8G          15G
none       100%      543M         543M         2.2G
zstd        46%      2.0G         4.3G          12G

Here we can see several important facts. The total amount of referenced data is 15GiB (each of the three snapshots holds about 5GiB), but the unique data size is only 4.8GiB. This in turn was compressed using zstd, so the actual disk usage is 2.5GiB.

I am using hourly snapshots and backups. If I look at the last 7 months of snapshots of my home directory, the total disk usage is only 4.2GiB, only a little more than the 2.5GiB the home subvolume uses. Had I used rsync to store full copies instead, the space needed would have amounted to 100x more!

Limitations

  • When sending incremental snapshots, it is important that a common ancestor exists on both the sending side and the receiving side.
  • While defrag can defragment read-write snapshots, it would also break the reflinks between the snapshots, potentially increasing disk usage by a lot.
  • Avoid performing data deduplication tasks on the sending side while a btrfs send is running.
WARNING! Never use btrfs property to change a snapshot from read-only to read-write and back. send | receive depends on snapshots being immutable. Changing a snapshot to read-write breaks this assumption and will eventually lead to data loss!

Restoring Snapshots

There is no concept of rolling back snapshots in Btrfs. Snapshots are independent subvolumes, and the way to restore them is to make a read-write snapshot of the snapshot.

Let's consider the subvolume layout from before. I've added three snapshots of the www subvolume.

# tree -d -L 2 /mnt/rootvol/
/mnt/rootvol          # Btrfs top level volume
    ├── snapshots     # Directory to keep snapshots
    │   ├── www.20240103T2001
    │   ├── www.20240103T2101
    │   └── www.20240103T2201
    └── volume        # A flat subvolume layout with all filesystem subvolumes.
        ├── home      # User homes
        ├── root      # System root mount point 
        ├── var_log   # Logfiles
        └── www       # Websites data

The www subvolume is mounted at /var/www.

To restore the www.20240103T2001 snapshot, we need to rename or delete the original subvolume, then make a read-write snapshot in its place. This part can be done even if the subvolume is mounted.

# enter the volume directory 
cd /mnt/rootvol/volume

# rename the existing subvolume
mv www www.bak

# make a read-write snapshot
btrfs sub snap ../snapshots/www.20240103T2001 www

Since the original (now 'www.bak') is mounted, we need to either restart the computer or unmount + remount it.

This assumes that /var/www is in /etc/fstab and mounted using the subvol=volume/www option.

File: /etc/fstab
UUID=32234b01-c599-4eaf-a6b2-fafd35034062       /var/www           btrfs   noatime,subvol=volume/www   0 0

# unmount the existing mount
umount /var/www

# mounting again will use the restored snapshot as it now has the same subvolume name
mount /var/www

If you are using the subvolid= mount option instead of subvol=, it is important to use the restored snapshot's subvolume ID.

# btrfs subvolume list /mnt/rootvol
ID 293 gen 2115373 top level 5 path volume/home
ID 302 gen 2115373 top level 5 path volume/root
ID 307 gen 2115373 top level 5 path volume/var_log
ID 309 gen 2115373 top level 5 path volume/www
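For the subvolid= case, the ID can be picked out of that listing. A small sketch, assuming subvolume paths without spaces; subvol_id is a hypothetical helper name, and the sample input is just the www line from the listing above.

```shell
# Read `btrfs subvolume list` output on standard input and print the ID
# of the subvolume whose path (the last field) equals the argument.
subvol_id() {
    awk -v path="$1" '$NF == path { print $2 }'
}

# Sample line taken from the listing above; on a real system one would run:
# btrfs subvolume list /mnt/rootvol | subvol_id volume/www
echo 'ID 309 gen 2115373 top level 5 path volume/www' | subvol_id volume/www
```

After a restore, the listing should be read again, since the new read-write snapshot is a separate subvolume with its own ID.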

Existing Backup Tools

btrbk is a backup tool for Btrfs subvolumes, taking advantage of Btrfs-specific capabilities to create atomic snapshots and transfer them incrementally to various backup locations. Key features include separate and flexible retention periods for both the source system and backup targets, as well as support for offline backups (USB drives, laptops, etc).

snapper is a snapshot management tool with support for Btrfs. It is primarily a command-line tool, supports automatic snapshot management and is available in many Linux distributions.

Suitable Snapshot Structure

For optimal use with backup tools like btrbk and snapper, it's recommended to organize snapshots in a structured manner. Consider using a naming convention that reflects the purpose and timing of the snapshots, making it easier to manage and restore data.