2020-09-12: Deduplication With Btrfs
A rather unique feature of Btrfs is the concept of cloning files, or parts of files. We usually refer to this as reflinking.
Reflinking allows a user to make an instant copy of a file. It is similar to a hard link, with one big difference: when the original file or the copy is modified, Copy-on-Write (CoW) ensures that the two files remain independent of each other.
In other words, if you reflink a to b and then write new data to a, b keeps its original contents. If you had instead made a hard link with ln a b, any writes to a would also show up in b, because they are in fact the same file.
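The difference is easy to see in a scratch directory. The following is a minimal sketch, assuming GNU coreutils cp; the file names a and b follow the example above, and c is a name added here for the hard link. It uses --reflink=auto so that it also runs on filesystems without reflink support, where cp falls back to an ordinary copy (the observable result is the same, only the space sharing is lost).

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

printf 'original\n' > a

# b: reflink copy of a (ordinary copy if the filesystem cannot reflink).
cp --reflink=auto a b

# c: hard link to a, i.e. a second name for the same file.
ln a c

# Overwrite a in place. The redirection truncates the existing file,
# so the inode shared with c is the one being rewritten.
printf 'modified\n' > a

reflink_copy=$(cat b)
hard_link=$(cat c)

echo "b (reflink copy): $reflink_copy"
echo "c (hard link):    $hard_link"
```

The reflink copy b still reads "original", while the hard link c reads "modified".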
The easiest way to clone a file (reflink, copy-on-write) is with
cp --reflink <source file> <destination file>
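With GNU cp, a bare --reflink means --reflink=always, which fails if the filesystem cannot clone the file, while --reflink=auto silently falls back to a normal copy. If you want to know which of the two actually happened, something like the sketch below may help. The helper name clone_or_copy is hypothetical and not part of any tool; the script only assumes GNU coreutils.

```shell
#!/bin/sh

# clone_or_copy SRC DST (hypothetical helper):
# try a real reflink first, fall back to a plain copy, and report
# which one was used.
clone_or_copy() {
    if cp --reflink=always -- "$1" "$2" 2>/dev/null; then
        echo "reflinked"
    else
        # The failed attempt may leave an empty DST behind;
        # the plain copy simply overwrites it.
        cp -- "$1" "$2" && echo "copied"
    fi
}

tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/src"

result=$(clone_or_copy "$tmp/src" "$tmp/dst")
echo "$result"

# Either way, the destination must have the same contents as the source.
if cmp -s "$tmp/src" "$tmp/dst"; then same=yes; else same=no; fi
echo "identical contents: $same"
```

On Btrfs this should report "reflinked"; on a filesystem without reflink support it reports "copied".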
I wrote a guide today on how to use Bees to automatically deduplicate your filesystem. Head over to Btrfs/Deduplication/Bees! There are several other tools that also support deduplication. Take a look at Btrfs/Deduplication for a short list.