In association with heise online

Changes to the Ext3 and Ext4 file systems

In this latest development cycle, the kernel hackers were particularly active in changing the code of the still-evolving Ext4 file system and that of its predecessor, Ext3. Right at the beginning, for instance, they made several changes to considerably reduce the risk of data loss caused by delayed allocation, a problem which came to light in early March and sparked a lot of discussion. These changes were to be expected, although they may reduce performance in certain situations.

However, a large number of further changes weren't planned from the start and tended to originate from the sometimes rather heated discussions that eventually spanned 650 emails. One old and previously much-discussed general file system topic flared up again at a relatively early stage of these discussions: when and how often should the kernel update a file's atime (last access time)?

The background to this rather complex subject, and to many other file system changes mentioned only briefly here, is discussed in detail in the second part of the "What's coming in 2.6.30" mini series of the Kernel Log.

While similar discussions have produced no results in the past, Linus Torvalds' spirited but controversially received intervention did produce a result this time round: in its standard configuration, kernel version 2.6.30 now uses relative atime (relatime), which updates the access time only if it is older than the file's last modification or status change, or more than a day old. The old behaviour can be reactivated via strictatime, which is significant for a few programs, such as the Mutt mail client.
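The relatime rule can be illustrated with a short Python sketch. It is a simplified model of the decision as commonly described, not the kernel's actual code; the function name and the use of plain Unix timestamps in seconds are assumptions made for this example.

```python
DAY_SECONDS = 24 * 60 * 60


def relatime_needs_update(atime, mtime, ctime, now):
    """Decide whether a relatime-style policy would rewrite the access time.

    All arguments are Unix timestamps in seconds (an assumption of this sketch).
    """
    # The file was modified after it was last read: record the new access.
    if mtime >= atime:
        return True
    # The file's status (inode) changed after it was last read.
    if ctime >= atime:
        return True
    # The stored access time is at least a day old: refresh it.
    if now - atime >= DAY_SECONDS:
        return True
    # Otherwise skip the update and save a write to the disk.
    return False
```

With strictatime, by contrast, every read access would update the atime, which corresponds to a function that always returns True.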

Latencies

Several other changes aim to shorten the latencies, sometimes of a few seconds, that occur with Ext3 when applications flush the write buffer via fsync() while the kernel is processing major read accesses. This problem, which sometimes results in noticeable stuttering, was caused not only by the Ext3 file system's code, but also by the widely used CFQ I/O scheduler in the block layer. The kernel developers therefore also corrected and optimised the CFQ code; together, these changes should make some desktop systems not only measurably, but also noticeably, faster.
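For readers unfamiliar with the call, the following Python sketch shows the usual pattern by which an application forces its written data to disk with fsync(). The helper name write_durably is invented for this example; on an affected Ext3 system, the os.fsync() call is the point at which such an application could stall.

```python
import os


def write_durably(path, data):
    """Write bytes to path and block until the kernel reports them on disk."""
    with open(path, "wb") as f:
        f.write(data)
        # Move the data from Python's userspace buffer into the kernel.
        f.flush()
        # Ask the kernel to write the file's data to the storage device.
        # This call returns only once the flush has completed.
        os.fsync(f.fileno())
```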

The actual latency problem, however, is largely caused by Ext3 file systems being mounted with "data=ordered" by default. While this mode offers a high level of data safety, it doesn't yield optimum performance. Much discussion eventually resulted in the inclusion of a similarly controversial alteration which allows the kernel to mount Ext3 file systems with "data=writeback" by default. While this should improve performance, it increases the risk of data loss in the event of a crash or if a computer is turned off without shutting down, and also has other disadvantages. As a compromise between safety and performance, Btrfs chief developer Chris Mason developed the "data=guarded" mode, which is likely to be included with 2.6.31. Until then, users are advised to continue using "data=ordered" for desktop PCs and notebooks; this can be configured in /etc/fstab or during kernel configuration.
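In /etc/fstab, the journalling mode is set as a mount option in the fourth field of a file system's entry. The Python sketch below extracts that option from an fstab-style line so the configured mode can be checked; the sample entry (device name and mount point) and the function name are hypothetical and only serve as illustration.

```python
from typing import Optional

# Hypothetical /etc/fstab entry; device and mount point are made up.
SAMPLE_FSTAB_LINE = "/dev/sda1  /  ext3  defaults,data=ordered  0  1"


def journal_data_mode(fstab_line: str) -> Optional[str]:
    """Return the value of the data= mount option, or None if it is not set."""
    stripped = fstab_line.strip()
    # Ignore blank lines and comments.
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    # Field 4 holds the comma-separated mount options.
    if len(fields) < 4:
        return None
    for option in fields[3].split(","):
        if option.startswith("data="):
            return option[len("data="):]
    return None


print(journal_data_mode(SAMPLE_FSTAB_LINE))  # prints: ordered
```

If the function returns None for an Ext3 entry, no mode is set explicitly and the kernel's built-in default applies.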

Newbies

Having included Btrfs and SquashFS with Linux 2.6.29, the kernel developers incorporated another two new file systems, Nilfs2 and EXOFS, in version 2.6.30. The complete name of Nilfs2 is New Implementation of a Log-structured File System version 2; it is a log-structured file system (LFS) with continuous snapshotting and is mainly optimised for solid state disks (SSDs) without wear levelling. A detailed description of its operation can be found on the Nilfs2 homepage and in the kernel documentation about Nilfs2.

EXOFS is short for Extended Object File System and used to be known as OSDFS (Object-Based Storage Devices File System). As indicated by the old name, it is intended for the rather exotic OSDs (Object-based Storage Devices) first supported by the SCSI subsystem in kernel version 2.6.30.

Permalink: http://h-online.com/-746581