Btrfs
From Wikipedia:Btrfs
- Btrfs (B-tree file system, variously pronounced: "Butter F S", "Better F S", "B-tree F S", or simply "Bee Tee Arr Eff Ess") is a GPL-licensed experimental copy-on-write file system for Linux. Development began at Oracle Corporation in 2007.
From Btrfs Wiki
- Btrfs is a new copy on write (CoW) filesystem for Linux aimed at implementing advanced features while focusing on fault tolerance, repair and easy administration. Jointly developed at Oracle, Red Hat, Fujitsu, Intel, SUSE, STRATO and many others, Btrfs is licensed under the GPL and open for contribution from anyone.
Contents
- 1 Installation
- 2 General administration of Btrfs
- 3 Limitations
- 4 Features
- 5 Troubleshooting
- 6 See also
Installation
The official kernels linux and linux-lts include support for Btrfs. User space utilities are available in btrfs-progs.
GRUB, mkinitcpio, and Syslinux have support for Btrfs and require no additional configuration.
Additional packages
- btrfs-progs includes btrfsck, a tool that can fix errors on Btrfs filesystems.
- btrfs-progs-git (AUR) for nightly builds.
General administration of Btrfs
Creating a new file system
A Btrfs file system can either be created from scratch or converted from an existing ext3/4 file system.
To format a partition do:
# mkfs.btrfs -L mylabel /dev/partition
To use a larger block size for metadata, specify a value for the nodesize via the -n switch, as shown in this example using 16KB blocks:
# mkfs.btrfs -L mylabel -n 16k /dev/partition
Multiple devices can be given to create a RAID. Supported RAID levels include RAID 0, RAID 1, RAID 10, RAID 5 and RAID 6. The RAID levels can be configured separately for data and metadata using the -d and -m options respectively. By default the metadata is mirrored and data is striped. See Using Btrfs with Multiple Devices for more information about how to create a Btrfs RAID volume.
# mkfs.btrfs [options] /dev/part1 /dev/part2 ...
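As an illustration of the -d and -m options, the following minimal sketch builds the command for a volume that mirrors both data and metadata. The function names run and make_btrfs_raid1 and the DRY_RUN variable are made up for this example. Since mkfs.btrfs destroys existing data on the given devices, the sketch can print the command for review instead of running it.

```sh
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a Btrfs volume with raid1 for both data (-d) and metadata (-m).
# Usage: make_btrfs_raid1 LABEL DEVICE DEVICE [DEVICE...]
make_btrfs_raid1() {
    label=$1
    shift
    if [ "$#" -lt 2 ]; then
        echo "raid1 needs at least two devices" >&2
        return 1
    fi
    run mkfs.btrfs -L "$label" -d raid1 -m raid1 "$@"
}
```

With DRY_RUN=1 set, make_btrfs_raid1 mylabel /dev/part1 /dev/part2 only prints the mkfs.btrfs command line. Unset DRY_RUN and run as root to actually format the devices.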
Convert from Ext3/4
Boot from an install CD, then convert by doing:
# btrfs-convert /dev/partition
Mount the partition and test the conversion by checking the files. Be sure to change /etc/fstab to reflect the change (type to btrfs and fs_passno [the last field] to 0, as Btrfs does not do a file system check on boot). Also note that the UUID of the partition will have changed, so update fstab accordingly when using UUIDs. chroot into the system and rebuild the GRUB menu list (see the Install from existing Linux and GRUB articles). If converting a root filesystem, run mkinitcpio -p linux while still chrooted to regenerate the initramfs, or the system will not boot. If you get stuck in GRUB with 'unknown filesystem', try reinstalling GRUB with grub-install /dev/partition and regenerate the config with grub-mkconfig -o /boot/grub/grub.cfg.
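The fstab changes described above (new UUID, type btrfs, fs_passno 0) can be sketched as a small filter. This is only an illustration: fstab_to_btrfs is a made-up helper name. It reads fstab text on standard input and prints the result, so nothing is changed in place, and it leaves the mount options field alone, so ext3/4-specific options still have to be reviewed by hand.

```sh
# Rewrite the fstab entry of one mount point after btrfs-convert.
# Usage: fstab_to_btrfs MOUNT_POINT NEW_UUID < fstab
fstab_to_btrfs() {
    mount_point=$1
    new_uuid=$2
    awk -v mp="$mount_point" -v uuid="$new_uuid" '
        # Leave comments, short lines and other mount points untouched.
        /^[ \t]*#/ || NF < 6 || $2 != mp { print; next }
        {
            $1 = "UUID=" uuid   # the UUID changes during the conversion
            $3 = "btrfs"        # file system type
            $6 = 0              # fs_passno: no file system check on boot
            print
        }
    '
}

# Possible use from the chroot (writes a copy for review, not /etc/fstab):
#   fstab_to_btrfs / "$(blkid -s UUID -o value /dev/partition)" \
#       < /etc/fstab > /tmp/fstab.new
```

The rewritten entry is printed with single spaces between fields. Entries that omit the last two fields are not modified and need to be edited manually.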
After confirming that there are no problems, complete the conversion by deleting the backup ext2_saved sub-volume. Note that you cannot revert to ext3/4 without it.
# btrfs subvolume delete /ext2_saved
Finally balance the file system to reclaim the space.
Mount options
See Btrfs Wiki Mount options and Btrfs Wiki Gotchas for more information.
In addition to configurations that can be made during or after file system creation, the various mount options for Btrfs can drastically change its performance characteristics.
As this file system is still in active development, changes and regressions should be expected. See links in the #See also section for some benchmarks.
Examples
- Linux 3.15
- Btrfs on an SSD for system installation with an emphasis on maximizing performance (also read #SSD TRIM):
noatime,discard,ssd,compress=lzo,space_cache
- Btrfs on an HDD for archival purposes with an emphasis on maximizing space:
noatime,autodefrag,compress-force=lzo,space_cache
Displaying used/free space
General Linux userspace tools such as /usr/bin/df will inaccurately report free space on a Btrfs partition since they do not take into account space allocated for and used by the metadata. It is recommended to use /usr/bin/btrfs to query a Btrfs partition. Below is an illustration of this effect, first querying using df -h, and then using btrfs filesystem df:
$ df -h /
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3       119G  3.0G  116G   3% /
$ btrfs filesystem df /
Data: total=3.01GB, used=2.73GB
System: total=4.00MB, used=16.00KB
Metadata: total=1.01GB, used=181.83MB
Notice that df -h reports 3.0GB used but btrfs filesystem df reports 2.73GB for the data. This is due to the way Btrfs allocates space into the pool. The true disk usage is the sum of all three 'used' values, about 2.91GB here, which is less than the 3.0GB reported by df -h.
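The sum of the 'used' values can be checked mechanically. The following is a minimal sketch, not part of btrfs-progs: sum_btrfs_used is a made-up helper name, and the script assumes the unit suffixes shown in the output above (KB, MB, GB; the pattern also accepts the KiB/MiB/GiB spelling).

```sh
# Sum the "used=" values printed by `btrfs filesystem df` and report GB.
sum_btrfs_used() {
    # Put every "used=<number><unit>" token on its own line, then add them up.
    grep -oE 'used=[0-9.]+[KMGT]i?B' |
    awk -F'=' '
        {
            value = $2 + 0                  # numeric part, e.g. 181.83
            unit = $2
            sub(/^[0-9.]+/, "", unit)       # remaining suffix, e.g. MB
            prefix = substr(unit, 1, 1)
            if (prefix == "K") value = value / (1024 * 1024)
            else if (prefix == "M") value = value / 1024
            else if (prefix == "T") value = value * 1024
            total += value
        }
        END { printf "%.2fGB\n", total }
    '
}

# Sample taken from the output above; on a live system use:
#   btrfs filesystem df / | sum_btrfs_used
printf '%s\n' \
    'Data: total=3.01GB, used=2.73GB' \
    'System: total=4.00MB, used=16.00KB' \
    'Metadata: total=1.01GB, used=181.83MB' | sum_btrfs_used
```

For the sample, the three values add up to roughly 2.91GB.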
Another useful command to show a less verbose readout of used space is btrfs filesystem show:
# btrfs filesystem show /dev/sda3
failed to open /dev/sr0: No medium found
Label: 'arch64' uuid: 02ad2ea2-be12-2233-8765-9e0a48e9303a
Total devices 1 FS bytes used 2.91GB
devid 1 size 118.95GB used 4.02GB path /dev/sda2
Btrfs v0.20-rc1-358-g194aa4a-dirty
Limitations
A few limitations should be known before trying.
Encryption
Btrfs has no built-in encryption support (this may come in the future); users can encrypt the partition before running mkfs.btrfs. See dm-crypt.
For an existing Btrfs file system, a stacked solution such as EncFS or TrueCrypt can be used, though possibly without some of Btrfs' features.
Swap file
Btrfs does not yet support swap files. Swap files rely on the bmap operation, which Btrfs stopped providing because its use could corrupt the file system [1]. Patches for swap file support are already available [2] and may be included in an upcoming kernel release. As an alternative, a swap file can be set up on a loop device, at the cost of poorer performance and without the ability to hibernate. Install the package systemd-swap from the official repositories to automate this.
Linux-rt kernel
As of version 3.14.12_rt9, the linux-rt kernel does not boot with the Btrfs file system. This is due to the slow development of the rt patchset.
Features
Various features are available and can be adjusted.
Commit interval settings
The resolution at which data are written to the filesystem is dictated by Btrfs itself and by system-wide settings. Btrfs defaults to a 30-second checkpoint interval in which new data are committed to the filesystem. This is tunable using mount options (see #Checkpoint interval).
System-wide settings also affect commit intervals. They include the files under /proc/sys/vm/* and are outside the scope of this wiki article. The kernel documentation for them resides in Documentation/sysctl/vm.txt.
Copy-On-Write (CoW)
By default, Btrfs performs copy-on-write for all files, at all times: If you write a file that did not exist before, then the data is written to empty space, and some of the metadata blocks that make up the filesystem are copied-on-write. In a traditional filesystem, if you then go back and overwrite a piece of that file, then the piece you are writing is put directly over the data it is replacing. In a CoW filesystem, the new data is written to a piece of free space on the disk, and only then is the file's metadata changed to refer to the new data. The old data that was replaced can then be freed up if nothing points to it any more.
Copy-on-write comes with some advantages, but can negatively affect performance with large files that have small random writes because it will fragment them (even if no "copy" is ever performed!). It is recommended to disable copy-on-write for database files and virtual machine images.
Copy-on-write can be disabled by mounting with the nodatacow option. However, this disables copy-on-write for the entire file system.
To disable copy-on-write for single files/directories do:
$ chattr +C /dir/file
This will disable copy-on-write for those operations in which there is only one reference to the file. If there is more than one reference (e.g. through cp --reflink=always or because of a filesystem snapshot), copy-on-write still occurs.
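For database files and virtual machine images it is often more practical to set the attribute on a directory. According to the chattr man page, the C attribute should be set on new or empty files, and files created inside a directory carrying it inherit it. The following is a minimal sketch under that assumption; run, make_nocow_dir and DRY_RUN are made-up names, and /var/lib/vm in the usage note is only a placeholder path.

```sh
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create a directory in which newly created files are not copy-on-write.
# Usage: make_nocow_dir DIRECTORY
make_nocow_dir() {
    nocow_dir=$1
    run mkdir -p "$nocow_dir"
    run chattr +C "$nocow_dir"     # new files in here inherit the attribute
    run lsattr -d "$nocow_dir"     # a "C" among the flags confirms it
}
```

For example, DRY_RUN=1 make_nocow_dir /var/lib/vm prints the three commands without touching the file system. Files that already contain data have to be copied into the directory as new files to pick up the attribute.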
Likewise, to save space by forcing copy-on-write when copying files use:
$ cp --reflink source dest
As the dest file is changed, only those blocks that differ from source will be written to the disk. One might consider aliasing cp to cp --reflink=auto.
Multi-device filesystem and RAID feature
See Using Btrfs with Multiple Devices for suggestions.
Multi-device filesystem
When creating a Btrfs filesystem, one can pass many partitions or disk devices to mkfs.btrfs. The filesystem will be created across these devices. In this way, multiple partitions or devices can be "pooled" into a single Btrfs filesystem.
One can also add devices to or remove devices from an existing Btrfs filesystem (caution is mandatory).
RAID features
When creating a multi-device filesystem, one can also specify to use RAID0, RAID1, RAID10, RAID5 or RAID6 across the devices comprising the filesystem. RAID levels can be applied independently to data and metadata. By default, metadata is duplicated on single volumes or RAID1 on multi-disk sets.
Btrfs works in block-pairs for raid0, raid1, and raid10. This means:
raid0 - block-pair striped across 2 devices
raid1 - block-pair written to 2 devices
The raid level can be changed while the disks are online using the btrfs balance command:
# btrfs balance start -mconvert=RAIDLEVEL -dconvert=RAIDLEVEL /path/to/mount
For 2 disk sets, this matches raid levels as defined in md-raid (mdadm). For 3+ disk-sets, the result is entirely different than md-raid.
For example:
- Three 1TB disks in an md based raid1 yield a /dev/md0 with 1TB free space and the ability to safely lose 2 disks without losing data.
- Three 1TB disks in a Btrfs volume with data=raid1 will allow the storage of approximately 1.5TB of data before reporting full. Only 1 disk can safely be lost without losing data.
Btrfs uses a round-robin scheme to decide how block-pairs are spread among disks. As of Linux 3.0, a quasi-round-robin scheme is used which prefers larger disks when distributing block pairs. This allows raid0 and raid1 to take advantage of most (and sometimes all) space in a disk set made of multiple disks. For example, a set consisting of one 1TB disk and two 500GB disks with data=raid1 will place a copy of every block on the 1TB disk and alternate (round-robin) placing blocks on each of the 500GB disks. Full space utilization will be made. A set made from a 1TB disk, a 750GB disk, and a 500GB disk will work the same, but the filesystem will report full with 250GB unusable on the 750GB disk. To always take advantage of the full space (even in the last example), use data=single, which is akin to JBOD as defined by some raid controllers. See the Btrfs FAQ for more info.
Sub-volumes
See the following links for more details:
- Btrfs Wiki SysadminGuide#Subvolumes
- Btrfs Wiki Getting started#Basic Filesystem Commands
- Btrfs Wiki Trees
Creating sub-volumes
To create a sub-volume:
# btrfs subvolume create /path/to/subvolume
Listing sub-volumes
To see a list of current sub-volumes:
# btrfs subvolume list -p .
Setting a default sub-volume
The default sub-volume is mounted if no subvol= mount option is provided.
# btrfs subvolume set-default subvolume-id /.
Example:
# btrfs subvolume list .
ID 258 gen 9512 top level 5 path root_subvolume
ID 259 gen 9512 top level 258 path home
ID 260 gen 9512 top level 258 path var
ID 261 gen 9512 top level 258 path usr
# btrfs subvolume set-default 258 .
Reset:
# btrfs subvolume set-default 0 .
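The ID passed to set-default can also be looked up from the list output by path. The following is a minimal sketch: subvol_id is a made-up helper name, and it assumes the output format shown in the example and sub-volume paths without spaces.

```sh
# Print the ID of the sub-volume whose path matches exactly.
# Reads `btrfs subvolume list` output on standard input.
# Usage: btrfs subvolume list / | subvol_id PATH
subvol_id() {
    awk -v path="$1" '$1 == "ID" && $NF == path { print $2 }'
}

# Possible use, as root, with the sub-volume name from the example:
#   btrfs subvolume set-default \
#       "$(btrfs subvolume list / | subvol_id root_subvolume)" /
```

The last field of each line is the path and the second field is the ID, so only an exact path match selects a line; home does not match a sub-volume named home_old.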
Snapshots
See Btrfs Wiki SysadminGuide#Snapshots for details.
To create a snapshot:
# btrfs subvolume snapshot source [dest/]name
Snapshots are not recursive. Every sub-volume nested inside the source sub-volume will appear as an empty directory inside the snapshot.
Defragmentation
Btrfs supports online defragmentation. To defragment the metadata of the root folder:
# btrfs filesystem defragment /
This will not defragment the entire file system. For more information read this page on the Btrfs wiki.
To defragment the entire file system verbosely:
# btrfs filesystem defragment -r -v /
Compression
Btrfs supports transparent compression, meaning every file on the partition is automatically compressed. This not only reduces the size of files but can also improve performance in some specific use cases (e.g. a single thread with heavy file I/O), in particular with the lzo algorithm, while harming performance in others (e.g. multithreaded and/or CPU-intensive tasks with large file I/O).
Compression is enabled using the compress=zlib or compress=lzo mount options. Only files created or modified after the mount option is added will be compressed. However, it can be applied quite easily to existing files (e.g. after a conversion from ext3/4) using the btrfs filesystem defragment -calg command, where alg is either zlib or lzo. In order to re-compress the whole file system with lzo, run the following command:
# btrfs filesystem defragment -r -v -clzo /
When installing Arch to an empty Btrfs partition, set the compress option after preparing the storage drive. Simply switch to another terminal (Ctrl+Alt+number), and run the following command:
# mount -o remount,compress=lzo /mnt/target
After the installation is finished, add compress=lzo to the mount options of the root file system in fstab.
Checkpoint interval
Starting with Linux 3.12, users are able to change the checkpoint interval from the default 30 s to any value by appending the commit mount option in /etc/fstab for the btrfs partition.
LABEL=arch64 / btrfs defaults,noatime,ssd,compress=lzo,commit=120 0 0
Partitioning
Btrfs can occupy an entire data storage device, replacing the MBR or GPT partitioning schemes. One can use subvolumes to simulate partitions. There are some limitations to this approach in single disk setups:
- Cannot use different file systems for different mount points.
- Cannot use swap area as Btrfs does not support swap files and there is no place to create swap partition. This also limits the use of hibernation/resume, which needs a swap area to store the hibernation image.
- Cannot use UEFI to boot.
To overwrite the existing partition table with Btrfs, run the following command:
# mkfs.btrfs /dev/sdX
Do not specify a partition such as /dev/sdXY, or it will format that existing partition instead of replacing the entire partitioning scheme.
Install the boot loader as you would for a data storage device with a Master Boot Record. For example, for GRUB:
# grub-install --recheck /dev/sdX
Scrub
See Btrfs Wiki Glossary.
# btrfs scrub start /
# btrfs scrub status /
If running the scrub as a systemd service, use Type=forking. Alternatively, you can pass the -B flag to btrfs scrub start to run it in the foreground and use the default Type value.
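The foreground variant can be sketched as a small script that writes a service unit. This is only an illustration: the function name write_scrub_unit and the unit name btrfs-scrub-root.service are made up for this example, and the unit deliberately sets no Type so that the default applies.

```sh
# Write a minimal service unit that scrubs / in the foreground.
# Usage: write_scrub_unit [UNIT_DIRECTORY]
write_scrub_unit() {
    unit_dir=${1:-/etc/systemd/system}
    cat > "$unit_dir/btrfs-scrub-root.service" <<'EOF'
[Unit]
Description=Btrfs scrub of /

[Service]
# -B keeps the scrub in the foreground, so the default Type is sufficient
ExecStart=/usr/bin/btrfs scrub start -B /
EOF
}
```

After running write_scrub_unit as root, systemctl daemon-reload followed by systemctl start btrfs-scrub-root.service starts a scrub, and btrfs scrub status / shows its progress.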
Balance
See Upstream FAQ page.
Since btrfs-progs-3.12, balancing is a background process; see man 8 btrfs-balance for a full description.
# btrfs balance start /
# btrfs balance status /
SSD TRIM
If mounted with the discard option, a Btrfs filesystem will automatically free unused blocks from an SSD drive supporting the TRIM command.
Note that before SATA 3.1, TRIM commands are synchronous and will block all I/O while running. This may cause short freezes while this happens, for example during a filesystem sync. You may not want to use discard in that case but enable regular trims instead:
# systemctl enable fstrim.timer
One way to check your SATA version is with:
# smartctl --info /dev/sdX
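The version check can be scripted. The following is a minimal sketch that assumes smartctl prints a line of the form "SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)"; the helper names sata_version and sata_at_least_3_1 are made up for this example. It only encodes the rule of thumb above and says nothing about how a particular drive behaves.

```sh
# Extract the SATA revision from `smartctl --info` output on standard input.
sata_version() {
    sed -n 's/^SATA Version is:[[:space:]]*SATA \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when the given revision is 3.1 or newer.
sata_at_least_3_1() {
    awk -v v="$1" 'BEGIN { exit !(v + 0 >= 3.1) }'
}

# Possible use, as root:
#   version=$(smartctl --info /dev/sdX | sata_version)
#   if sata_at_least_3_1 "$version"; then
#       echo "SATA $version: discard may be an option"
#   else
#       echo "SATA $version: prefer fstrim.timer"
#   fi
```

If the line is missing (for example on non-SATA devices), sata_version prints nothing and sata_at_least_3_1 fails, which errs on the side of periodic trims.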
Troubleshooting
See the Btrfs Problem FAQ for general troubleshooting.
GRUB
Partition offset
GRUB can boot from Btrfs partitions; however, the Btrfs module may be larger than the modules for other file systems, and the core.img file made by grub-install may not fit in the first 63 sectors (31.5KiB) of the drive between the MBR and the first partition. Up-to-date partitioning tools such as fdisk and gdisk avoid this issue by offsetting the first partition by roughly 1MiB or 2MiB.
Missing root
Users who see the error "error no such device: root" when booting from a RAID style setup should edit /usr/share/grub/grub-mkconfig_lib and remove both quotes from the line echo " search --no-floppy --fs-uuid --set=root ${hints} ${fs_uuid}". Then regenerate the GRUB config and the system should boot without the error.
BTRFS: open_ctree failed
As of November 2014 there seems to be a bug in systemd or mkinitcpio causing the following error on systems with a multi-device Btrfs filesystem using the btrfs hook in mkinitcpio.conf:
BTRFS: open_ctree failed
mount: wrong fs type, bad option, bad superblock on /dev/sdb2,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg|tail or so.
You are now being dropped into an emergency shell.
A workaround is to remove btrfs from the HOOKS array in /etc/mkinitcpio.conf and instead add btrfs to the MODULES array. Then regenerate the initramfs with mkinitcpio -p linux (adjust the preset if needed) and reboot.
See the original forums thread and FS#42884 for further information and discussion.
The same error appears if you try to mount a RAID array with one device missing. A workaround is to use the degraded mount option. To make sure the system can still boot from a Btrfs RAID array in that state, add the option in /etc/fstab and /etc/grub.d/10_linux, and add rootflags=degraded to GRUB_CMDLINE_LINUX.
btrfs check
The btrfs check command can be used to check or repair an unmounted Btrfs filesystem. However, this repair tool is still immature and cannot repair certain filesystem errors, even some that do not render the filesystem unmountable.
btrfs check example
- Methodology
- First, boot from an Arch Live USB.
- Then open the encrypted backup image backUp.iso (skip this step if the image is not encrypted) and check the filesystem:
# cryptsetup open /path/backUp.iso name
# btrfs check /dev/mapper/name
- If the previous steps were successful, execute the following:
# btrfs check --repair /dev/mapper/name
Finally, try to mount the filesystem to see if the problem is fixed. If the mount was successful, the process can be repeated on a real partition.
See also
- Official site
- Official FAQs
- Btrfs pull requests
- Performance related
- Miscellaneous
- Funtoo Wiki Btrfs Fun
- Avi Miller presenting Btrfs at SCALE 10x, January 2012.
- Summary of Chris Mason's talk from LFCS 2012
- Btrfs: stop providing a bmap operation to avoid swapfile corruptions 2009-01-21
- Doing Fast Incremental Backups With Btrfs Send and Receive