In association with heise online

Getting Buttered Up

Btrfs is still experimental, but the recent first RC of kernel version 2.6.31 contains the latest version of Btrfs (0.19). This version has modified disk data structures and therefore promises a number of performance improvements over previous versions.

Two in-development distributions already ship the 2.6.31 kernel: the Ubuntu Karmic Koala alpha and Fedora's Rawhide. If you build your own kernel, upgrade to a 2.6.31 release candidate by setting up the 2.6.30 sources and then applying the current 2.6.31 patch with patch -p1.
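For those building their own kernel, the patching step might look roughly like the following sketch. The file names linux-2.6.30.tar.bz2 and patch-2.6.31-rc1.bz2 are assumptions based on the usual kernel.org naming, and both files are assumed to sit in the current directory; substitute whichever release candidate is current. With DRY_RUN left at its default of 1, the script only prints the commands.

```shell
#!/bin/sh
# Sketch: turn a 2.6.30 source tree into a 2.6.31 release candidate.
# The archive and patch names used below are assumptions, not fixed values.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command in dry-run mode, execute it otherwise.
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

patch_kernel() {
    tarball=$1                  # e.g. linux-2.6.30.tar.bz2
    rcpatch=$2                  # e.g. patch-2.6.31-rc1.bz2
    srcdir=${tarball%.tar.bz2}  # directory the tarball unpacks into

    run tar xjf "$tarball" &&
    run bunzip2 -k "$rcpatch" &&
    # patch changes into $srcdir first, so the patch file is one level up;
    # -p1 strips the leading directory component from the paths in the patch.
    run patch -d "$srcdir" -p1 -i "../${rcpatch%.bz2}"
}

patch_kernel linux-2.6.30.tar.bz2 patch-2.6.31-rc1.bz2
```

Setting DRY_RUN=0 in the environment makes the script execute the three steps instead of printing them.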

With the current Fedora Rawhide, the Btrfs-tools package is already at version 0.19. On Ubuntu 9.10 (Karmic Koala), you need to install the Btrfs-tools package yourself, and the repository currently only carries version 0.18 of the tools.

To build the latest version of the tools, 0.19, you first need to download the userland tools and extract them. For the core set of tools, you can simply run make followed by make install. To compile successfully, the uuid-dev library headers need to be installed (Ubuntu: sudo apt-get install uuid-dev).

The default build of the tools omits a number of other potentially useful programs in the userland Btrfs tools: btrfs-image, btrfstune and btrfs-convert.

btrfs-image is a utility which creates an image of a Btrfs file system containing all the metadata, but not the data itself. These images can be sent to the Btrfs developers for analysis in the case of errors.

btrfstune enables or disables a rather exotic feature called seeding which is used to create a new Btrfs file system which contains a read-only copy of the contents of another Btrfs. This could, for example, be useful when setting up several virtual machines with identical basic configurations on various different file systems.

To build these tools, first ensure that the zlib development package is installed (Ubuntu: sudo apt-get install zlib1g-dev) and then add "btrfs-image btrfstune" to the end of the "progs" line in the Makefile for btrfs-tools. Finally, run make followed by make install.
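The Makefile edit can also be scripted. The following is a minimal sketch that assumes the program list sits on a single line beginning with "progs ="; if the line in your copy of the sources is continued with a backslash, edit it by hand instead. The program names in the demonstration file are illustrative only.

```shell
#!/bin/sh
# Sketch: append btrfs-image and btrfstune to the "progs" line of a Makefile.
# Assumes the whole list is on one line that starts with "progs =".
add_extra_progs() {
    makefile=$1
    # A backup is kept as <makefile>.bak; "&" re-inserts the matched line.
    sed -i.bak -e 's/^progs[[:space:]]*=.*$/& btrfs-image btrfstune/' "$makefile"
}

# Demonstration on a throwaway file rather than the real Makefile.
demo=$(mktemp)
echo 'progs = btrfsctl mkfs.btrfs btrfs-show btrfs-vol' > "$demo"
add_extra_progs "$demo"
cat "$demo"
rm -f "$demo" "$demo.bak"
```

In the extracted source directory, the call would be add_extra_progs Makefile, followed by make and make install as described above.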

btrfs-convert is used for converting an Ext3 file system to Btrfs. The tool creates a copy of the Ext3 metadata in Btrfs format, and initially both the Ext3 and the Btrfs metadata point to the same data blocks. Due to the Copy-on-Write mechanism, write operations in Btrfs will occupy new data blocks and the Ext3 file system remains consistent (the Btrfs wiki has the details). To compile this program, you will need to install the libacl and e2fs development library headers (Ubuntu: sudo apt-get install libacl1-dev e2fslibs-dev) and then run make convert followed by make install.

With the tools to manage Btrfs installed, we can now look at using the Btrfs file system. A final reminder, though: Btrfs is experimental and its disk layout could well change in the future. Do not use it on production systems.

And they're away!

The simple command

mkfs.btrfs /dev/sda5

is all that's required for creating a Btrfs file system. There are set-up options for adjusting the size of nodes and leaves – the default (same size as the memory pages, which is 4 KBytes) seems a reasonable starting point.

A Btrfs file system can also be set up across several volumes:

mkfs.btrfs /dev/sdb /dev/sdc

By default, Btrfs mirrors the metadata across the named volumes, while the data is distributed across the volumes (striping). This can be fine-tuned via the -m (metadata) and -d (data) options; possible values in each case include raid0 (striping), raid1 (mirroring) and raid10. An additional single option is available for the metadata – this prevents the metadata from being duplicated (by default, Btrfs also saves two copies of the metadata on a single device).
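As a rough illustration of the -m and -d options, the sketch below composes a mkfs.btrfs call with explicit profiles. The helper name mkfs_cmd is hypothetical, and the accepted values simply mirror the list in the text; the tool itself may accept more. The function only prints the command, so nothing is formatted by accident.

```shell
#!/bin/sh
# Sketch: compose a mkfs.btrfs call with explicit metadata and data profiles.
# Only prints the command; run the printed line as root to create the file system.
mkfs_cmd() {
    meta=$1
    data=$2
    shift 2   # the remaining arguments are the devices

    case $meta in
        raid0|raid1|raid10|single) ;;
        *) echo "unknown metadata profile: $meta" >&2; return 1 ;;
    esac
    case $data in
        raid0|raid1|raid10) ;;
        *) echo "unknown data profile: $data" >&2; return 1 ;;
    esac

    echo "mkfs.btrfs -m $meta -d $data $*"
}

# Mirror both metadata and data across two disks.
mkfs_cmd raid1 raid1 /dev/sdb /dev/sdc
```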

The file system's integrated volume management not only allows data and metadata to be verified via checksums, but also lets Btrfs correct data errors on the fly, provided a RAID-1 or RAID-10 is in place. If Btrfs instead relied on the Linux device mapper for the sake of clean layer separation, the RAID layer would know nothing about the checksums, and repairing gradual data corruption in this way would be much more complicated.

So how do we mount such a RAID? After all, there is no dedicated device for the RAID. It's another simple job: When mounting either one of the devices, Btrfs includes both devices in the respective configuration. After rebooting (or unloading the Btrfs module), however,

btrfsctl -a

needs to run once first. The command scans all the devices and establishes which of them are used for a Btrfs in which configuration. For those in need of an overview,

btrfs-show

displays this information in plain text.
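Put together, bringing a multi-device Btrfs back after a reboot could look like the sketch below. The helper name remount_plan is hypothetical, and the device and mount point are the ones used in the examples above. The function only prints the sequence; pipe its output to sh as root to execute it.

```shell
#!/bin/sh
# Sketch: commands to bring a multi-device Btrfs back after a reboot.
# Prints the sequence instead of running it.
remount_plan() {
    dev=$1   # any one member device of the file system
    mnt=$2   # mount point
    echo "btrfsctl -a"               # scan all devices for Btrfs members
    echo "mount -t btrfs $dev $mnt"  # mounting one member brings in the others
    echo "btrfs-show"                # list the file systems and their devices
}

remount_plan /dev/sdb /mnt/btrfs
```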

The btrfs-vol command adds or removes additional volumes. It has to be executed on a mounted (!) file system (in our example it is /mnt/btrfs):

btrfs-vol -a /dev/sdd /mnt/btrfs

To distribute the file system data across the new device, use the

btrfs-vol -b /mnt/btrfs

command. Use

btrfs-vol -r /dev/sdd /mnt/btrfs

to remove a device. This tool makes sure that any data on the device to be removed is copied to the other volumes first.
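Because btrfs-vol only works on a mounted file system, a wrapper script might check this first. The following is a minimal sketch: the function names are hypothetical, the check reads /proc/mounts and therefore assumes Linux, and the mount point has to be given exactly as it appears there (no trailing slash).

```shell
#!/bin/sh
# Sketch: refuse to call btrfs-vol unless the target is a mounted Btrfs.
is_btrfs_mounted() {
    # /proc/mounts fields: device, mount point, file system type, options, ...
    awk -v m="$1" '$2 == m && $3 == "btrfs" { found = 1 } END { exit !found }' \
        /proc/mounts 2>/dev/null
}

add_and_balance() {
    dev=$1
    mnt=$2
    if ! is_btrfs_mounted "$mnt"; then
        echo "$mnt is not a mounted Btrfs file system" >&2
        return 1
    fi
    btrfs-vol -a "$dev" "$mnt" &&   # add the new device
    btrfs-vol -b "$mnt"             # then spread the data across all devices
}

# Usage (as root): add_and_balance /dev/sdd /mnt/btrfs
```

The balance step only runs if adding the device succeeded, so a failed add does not trigger a pointless redistribution.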

And what if a disk in the RAID-1 goes bust? Btrfs will initially refuse to mount the file system. A mount option makes it happen anyway:

mount -o degraded /dev/sdb /mnt/btrfs

Now, the corrupted disk can be removed with btrfs-vol -r, and a new one can be added.

A file system can be de-fragmented with

btrfsctl -d /mnt/btrfs

This command once again requires the mount point, not the device, to be stated. De-fragmenting the complete file system isn't always required, however: individual files or subdirectories can be de-fragmented in the same way.
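To de-fragment only selected files, the per-file calls can be generated with find. The sketch below uses a hypothetical helper, defrag_plan, and only prints one btrfsctl -d call per matching file; paths containing whitespace would need quoting before the output is fed to a shell.

```shell
#!/bin/sh
# Sketch: print one "btrfsctl -d" call per regular file that matches a
# name pattern below a directory on the mounted Btrfs.
defrag_plan() {
    dir=$1       # directory on the mounted file system
    pattern=$2   # shell pattern for the file names, e.g. '*.img'
    find "$dir" -type f -name "$pattern" -exec printf 'btrfsctl -d %s\n' {} \;
}

# Usage: defrag_plan /mnt/btrfs '*.img'
```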

