Btrfs/ENOSPC

From Forza's ramblings

ENOSPC - Out of disk space

Graph showing btrfs usage divided by block groups.

Unlike conventional filesystems, Btrfs uses a two-stage allocator. The first stage allocates large regions of space known as chunks for specific types of data, then the second stage allocates blocks like a regular (old-fashioned) filesystem within these larger regions.

Btrfs combines chunks into three types of block groups. Space that has not yet been assigned to any block group is reported as unallocated:

Type         Description
DATA         Stores normal user file data
METADATA     Stores internal metadata. Small files can also be stored inline
SYSTEM       Stores the mapping between physical devices and the logical space representing the filesystem
UNALLOCATED  Space not yet assigned to any block group

A block group can only store the type of data that its chunks were allocated for.

The most common reason for an -ENOSPC error on Btrfs these days is that the filesystem has run out of room for data or metadata in the existing block groups, and there is not enough unallocated space to allocate a new block group of the correct type.

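You can verify that this is the case by running btrfs filesystem usage on the filesystem that returned the error. Check the Unallocated section first: if a device that the profile needs has too little unallocated space for a new chunk, the allocator cannot create new block groups. If, in addition, the Data or Metadata line shows a Size value that is significantly larger than its Used value, space is tied up in partly filled chunks and this is probably the cause.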

# btrfs fi us /mnt/btrfs_vol
 Overall:
     Device size:          14.01GiB
     Device allocated:      1.38GiB
     Device unallocated:   12.63GiB
     Device missing:        0.00B
     Used:                901.58MiB
     Free (estimated)      12.64GiB    (min: 6.33GiB)
     Data ratio:            1.00
     Metadata ratio:        2.00 
     Global reserve:        3.25MiB    (used: 0.00B)
     Multiple profiles:       no
 
 Data,RAID0: Size:890.00MiB, Used:870.08MiB (97.76%)
    /dev/sdb1     445.00MiB
    /dev/sdc1     445.00MiB
 
 Metadata,RAID1: Size:256.00MiB, Used:15.73MiB (6.15%)
    /dev/sdb1     256.00MiB
    /dev/sdc1     256.00MiB
 
 System,RAID1: Size:8.00MiB, Used:16.00KiB (0.20%)
    /dev/sdb1       8.00MiB
    /dev/sdc1       8.00MiB
 
 Unallocated:
    /dev/sdb1      21.00MiB  <== Not enough space for another data or metadata chunk in this RAID profile.
    /dev/sdc1      12.61GiB
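
The check on the Unallocated section can be scripted. The following is a minimal sketch, not an official tool: the function name check_unallocated and the 1 GiB default threshold are assumptions made for illustration, and it assumes the raw byte output of btrfs filesystem usage -b keeps the same "device, value" layout shown above.

```shell
# Flag devices whose unallocated space is below a threshold.
# Reads the output of `btrfs filesystem usage -b <mountpoint>` on stdin.
# check_unallocated is a hypothetical helper name, not part of btrfs-progs.
check_unallocated() {
    min_bytes="${1:-1073741824}"   # assumed threshold: 1 GiB
    awk -v min="$min_bytes" '
        # Start collecting once the "Unallocated:" section begins.
        /^[[:space:]]*Unallocated:/ { in_unalloc = 1; next }
        # Device lines have exactly two fields: path and byte count.
        in_unalloc && NF == 2 {
            status = ($2 + 0 < min + 0) ? "LOW" : "ok"
            print $1, $2, status
        }
        # A blank line ends the section.
        in_unalloc && NF == 0 { in_unalloc = 0 }
    '
}
```

It could be used as btrfs filesystem usage -b /mnt/btrfs_vol | check_unallocated, which prints one line per device with LOW or ok at the end.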

Preventing ENOSPC - Btrfs Balance

A btrfs balance sends existing data back through the allocator, which compacts the space used within the chunks. For example, if you have two data chunks that are both 40% full, a balance will merge their contents into one chunk that is 80% full. By compacting chunks, the balance operation converts the emptied chunks back into unallocated space that can be used for new allocations.

It is important to run a btrfs balance before you run out of unallocated space. A common way is to set up a scheduled maintenance task that regularly runs a limited balance.

NOTE! Only balance DATA chunks, never METADATA chunks
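
A limited, data-only balance could look like the sketch below. It uses the -dusage filter of btrfs balance start, which only touches data chunks filled below the given percentage. The function name limited_data_balance, the step values 5, 10, 25 and 50, and the RUN dry-run switch are assumptions chosen for illustration, not settings recommended by the source.

```shell
# Run a data-only balance in increasing usage steps.
# limited_data_balance and RUN are hypothetical names for this sketch.
# Set RUN=echo to print the commands instead of executing them.
limited_data_balance() {
    mnt="$1"
    for pct in 5 10 25 50; do
        # -dusage=N: only relocate DATA chunks that are less than N% full.
        # No -m filter is given, so METADATA chunks are left alone.
        ${RUN:-} btrfs balance start -dusage="$pct" "$mnt" || return 1
    done
}
```

Starting with a low percentage keeps each pass cheap, because nearly empty chunks need little free space and little I/O to relocate.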

Fixing ENOSPC errors

There are a few options to correct ENOSPC errors, depending on what caused them.

  • ENOSPC caused by full or unbalanced filesystem.
  • ENOSPC caused by btrfs balance.

Always perform the recovery steps from a root shell, not from a GUI, file manager or similar.

Recovering from a full filesystem

Even if you balance data regularly, you can end up with a full disk by simply writing too many files to it. Writing files requires additional metadata, and Btrfs may need to allocate additional metadata chunks to store it, which is not possible on a full filesystem.

When ENOSPC happens, Btrfs will change your filesystem to read-only to protect itself. This may seem counter-intuitive as it prevents you from deleting stuff.

  1. First, unmount your filesystem. We cannot operate on a read-only filesystem.
  2. Mount your filesystem again. It should now be read-write.
  3. Now you should delete enough files to create at least 1GiB free space for metadata allocation.

Please note that deleting snapshots, reflinked files and deduplicated files can itself require additional metadata allocation, which can bring you back to ENOSPC again.
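
Before and after remounting, it can help to confirm whether the filesystem is actually read-only. The sketch below checks the mount options recorded in /proc/mounts. The function name is_mounted_ro and the optional second argument are assumptions for illustration, and the device and mount point in the comments are placeholders.

```shell
# Succeed (exit 0) if the given mount point is mounted read-only.
# Reads /proc/mounts by default; another file in the same format can be
# passed as the second argument. is_mounted_ro is a hypothetical helper.
is_mounted_ro() {
    awk -v mp="$1" '
        # Field 2 is the mount point, field 4 the comma-separated options.
        $2 == mp { opts = $4 }
        END {
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++) if (o[i] == "ro") exit 0
            exit 1
        }
    ' "${2:-/proc/mounts}"
}

# Possible use during recovery (placeholder device and mount point):
#   is_mounted_ro /mnt/btrfs_vol && umount /mnt/btrfs_vol
#   mount /dev/sdb1 /mnt/btrfs_vol
#   is_mounted_ro /mnt/btrfs_vol || echo "read-write again"
```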

When you can't delete files without going ENOSPC

First you can try to truncate some large files to 0 length. This reduces the space used without requiring the extra metadata that is causing the ENOSPC situation.

ISO files, for example the Fedora Live DVDs, are usually several GiB in size. Swapfiles are also large, but they may be in use by the running system, so take care to disable swap before truncating a swapfile.

# ls -l Fedora-f34.iso
-rw-r--r-- 1 root root 4294967296 Jan 25 10:35 Fedora-f34.iso
# truncate -s 0 Fedora-f34.iso
# ls -l Fedora-f34.iso
-rw-r--r-- 1 root root 0 Jan 25 10:38 Fedora-f34.iso
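
To find candidates for truncation, a search for large files on the affected filesystem can help. This is a minimal sketch that assumes GNU find (the k size suffix is not in POSIX). The function name find_large_files and the 1 GiB default are assumptions for illustration.

```shell
# List regular files larger than a minimum size (in KiB) under a directory,
# without descending into other mounted filesystems (-xdev).
# find_large_files is a hypothetical helper name.
find_large_files() {
    dir="$1"
    min_kib="${2:-1048576}"   # assumed default: 1 GiB expressed in KiB
    find "$dir" -xdev -type f -size "+${min_kib}k" -print
}
```

For example, find_large_files /mnt/your-btrfs-mountpoint lists files above 1 GiB. If one of them is an active swapfile, run swapoff on it before truncating.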

If this doesn't work, the only way to recover now is to add additional devices to your Btrfs filesystem.

Same steps as before:

  1. First, unmount your filesystem. We cannot operate on a read-only filesystem.
  2. Mount your filesystem again. It should now be read-write.

Now you need to find another device that you can add to your filesystem for a short period, for example a USB-stick.

WARNING! Do not use a ramdisk as this will lead to data loss if you have to reboot before your filesystem is fixed

Here, I assume your USB device is /dev/sdc1

  1. Add the USB stick to your filesystem: btrfs device add /dev/sdc1 /mnt/your-btrfs-mountpoint
  2. Now you should be able to delete files. Remove several GiB of files.
  3. When you have removed several GiB of files you can remove the USB-stick by doing btrfs device delete /dev/sdc1 /mnt/your-btrfs-mountpoint.

Important:

If you have a RAID1 filesystem, you will need to add two new devices.
If you have a RAID10 filesystem, you will need to add four new devices.

This is because data and metadata chunks are allocated according to the profile used.
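
The add and remove steps can be wrapped so that several temporary devices are handled at once, which is useful for the RAID1 and RAID10 cases. This is only a sketch: the function names and the RUN dry-run switch are assumptions for illustration. btrfs device remove is the same operation as the btrfs device delete used above.

```shell
# Add one or more temporary devices to a mounted Btrfs filesystem.
# usage: add_temp_devices MOUNTPOINT DEVICE...
# Hypothetical helpers; set RUN=echo to print commands instead of running them.
add_temp_devices() {
    mnt="$1"; shift
    ${RUN:-} btrfs device add "$@" "$mnt"
}

# Remove the temporary devices again once enough files have been deleted.
# usage: remove_temp_devices MOUNTPOINT DEVICE...
remove_temp_devices() {
    mnt="$1"; shift
    ${RUN:-} btrfs device remove "$@" "$mnt"
}
```

Running the functions with RUN=echo first shows the exact commands, which is a cheap safeguard against passing the wrong device.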

Recovering from balance

Sometimes a balance operation can get stuck or cause ENOSPC errors itself. This can lead to the filesystem turning read-only.

The kernel resumes an interrupted balance operation every time the filesystem is mounted. To prevent this, you have to mount it with mount -o skip_balance.
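
A possible sequence is sketched below. Cancelling the paused balance afterwards with btrfs balance cancel is an addition not described above; it is included on the assumption that you do not want the balance to resume on the next normal mount. The function name and the RUN dry-run switch are hypothetical.

```shell
# Remount with skip_balance, then inspect and cancel the paused balance.
# usage: cancel_resumed_balance DEVICE MOUNTPOINT
# Hypothetical helper; set RUN=echo to print commands instead of running them.
cancel_resumed_balance() {
    dev="$1"; mnt="$2"
    ${RUN:-} umount "$mnt" || return 1
    ${RUN:-} mount -o skip_balance "$dev" "$mnt" || return 1
    # status exits non-zero while a balance exists, so its result is not checked.
    ${RUN:-} btrfs balance status "$mnt"
    ${RUN:-} btrfs balance cancel "$mnt"
}
```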