Kernel Log: What's coming in 2.6.30 - Storage: RAID improvements, optimised CFQ Scheduler, SAS drivers
by Thorsten Leemhuis
The next kernel version is to provide all that's necessary to convert, for example, a RAID 5 into a RAID 6 and vice versa. There are changes to the block layer designed to speed up the system, and new and improved drivers will offer better SAS support.
With the fifth release candidate of Linux 2.6.30 out a few days ago, the development of the next kernel version in the main development line continues to progress. As indicated by Linus Torvalds in his release email, the changes are slowly decreasing in number and size, which is what usually happens at this development stage.
As every development cycle tends to have eight or nine pre-release versions which often appear on a weekly basis, it will be another few weeks before Torvalds releases Linux 2.6.30. Nevertheless, the Kernel Log will continue its review of the major changes in Linux 2.6.30 with the following overview of the new developments in the storage infrastructure and driver areas.
Flexible cooperation
The developers have made quite a few changes to the code for MD (Multiple Device) software RAIDs (see the list at the end of this article). The number of devices in a RAID 5 can now be reduced; as a result, a RAID 5 array can be converted into a RAID 6 and back again. In addition, the kernel can convert a RAID 1 into a RAID 5. However, the current version of the mdadm userland tool is unable to handle these extensive conversions.
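Since mdadm cannot drive these conversions yet, the kernel side would have to be addressed directly. The following is a minimal sketch, assuming the array exposes its personality through the md sysfs attribute `level` and that writing a new level name there asks the kernel to take over the array. The helper name `request_md_level` and the `sysfs_root` parameter are made up for illustration; the parameter lets the function be tried against a scratch directory rather than a live array.

```python
from pathlib import Path


def request_md_level(array, new_level, sysfs_root="/sys/block"):
    """Ask the kernel to change the RAID level of an MD array.

    Assumption: the level is exposed as <sysfs_root>/<array>/md/level and
    accepts names such as "raid5" or "raid6". Returns the previous level.
    """
    level_file = Path(sysfs_root) / array / "md" / "level"
    # Remember the current personality so the caller can log or restore it.
    previous = level_file.read_text().strip()
    # Writing the new name is what would trigger the takeover on a real array.
    level_file.write_text(new_level + "\n")
    return previous
```

On a real system a call such as `request_md_level("md0", "raid6")` would need root privileges, and it should only be tried on an array whose data is backed up.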
The MD code now also offers the "data integrity support" first included in 2.6.27, provided that all the devices in an MD array and their respective controllers are capable of it. Thanks to various changes, the same applies to the Device Mapper (DM), which now also supports barriers better than it did before.
Numerous changes have been made to the block layer and its CFQ I/O scheduler. While running tests in connection with the lengthy discussions around Ext3 and Ext4, the developers found several performance issues in this area, some of them so serious that Torvalds even threatened to replace the standard I/O scheduler. However, the developers were able to track down and eliminate the causes. In some cases this results in performance improvements that are not only measurable but also noticeable (see also the LWN.net article "Solving the ext3 latency problem").
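To check whether a given disk actually uses CFQ, the scheduler file in sysfs can be read. The sketch below assumes the usual format of `/sys/block/<device>/queue/scheduler`, which lists the available schedulers on one line and marks the active one with square brackets. The function names and the `sysfs_root` parameter are illustrative only.

```python
from pathlib import Path


def parse_scheduler_line(line):
    """Split a line such as 'noop deadline [cfq]' into (active, available)."""
    active = None
    available = []
    for name in line.split():
        # The active scheduler is the entry wrapped in square brackets.
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device, sysfs_root="/sys/block"):
    """Read and parse the scheduler file of one block device, e.g. 'sda'."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())
```

For example, `read_scheduler("sda")` returns the active scheduler together with all schedulers built into the running kernel, provided a device of that name exists.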
Substructure
As already mentioned in the second part of the Kernel Log mini series "What's coming in 2.6.30", the SCSI subsystem now also supports the rather exotic OSDs (Object-based Storage Devices). The mpt2sas SCSI driver for LSI's SAS2004, SAS2008, SAS2108 and SAS2116 SAS-2.0 controllers has been included in the kernel for the first time. The stex driver now also supports some of the SAS-6G controllers by Promise. Fibre Channel over Ethernet, which was new in 2.6.29, can now also handle FIP (FCoE Initialization Protocol), which is used to detect and integrate Fibre Channel Forwarders (FCFs). Having already moved the firmware code of numerous SCSI drivers into separate files in versions 2.6.27, 2.6.28 and 2.6.29, the kernel developers have continued this work in 2.6.30 and split up a few more drivers.
Bartlomiej Zolnierkiewicz announced a few weeks ago that he has completed all of the planned IDE subsystem restructuring tasks and will from now on focus mainly on the subsystem's maintenance and on minor extensions. This has already been noticeable during the development of 2.6.30, which saw considerably less activity than the sometimes comprehensive changes to the IDE code in the previous few kernel versions. Comparatively few major changes were also made to the Libata subsystem, whose PATA drivers are used for the IDE adapters in most mainstream distributions; among them is the support of ATAPI devices in the sata_mv driver, which handles Marvell ATA chips.
Kernel Log – What's coming in 2.6.30
Other parts of the "What's coming in 2.6.30" mini series of the Kernel Log:
1. Network: New Wi-Fi drivers and other network novelties
2. File systems: New and revamped file systems
The article "Steady growth: What's new in Linux 2.6.29" describes the new features of the kernel version that was current in the main development line at the time of writing.
Further background and information about developments in the Linux kernel and its environment can also be found in previous issues of the Kernel Log at The H Open Source.
Minor gems
The kernel developers also discussed extensions to incorporate ATA TRIM, a relatively recent ATA command with which the kernel can tell SSDs that support it which areas no longer hold any data ("discard"). So far, however, only one patch has been included in the kernel; it provides the basis for further changes that are likely to be integrated in 2.6.31.
What we have mentioned so far only describes the most important and recent changes the kernel hackers have made to the storage-related code of Linux. Many further changes can be found in the following list containing the respective commits in the main development branch; the links directly display the changes in a web frontend, where the commit comment and the patch itself offer more information about the perhaps minor, but by no means unimportant changes.
Block Layer
- as-iosched: get rid of private REQ_SYNC/REQ_ASYNC defines
- block: Add flag for telling the IO schedulers NOT to anticipate more IO
- block: update biodoc.txt on plugging
- brd: support barriers
- cfq-iosched: add close cooperator code
- cfq-iosched: change dispatch logic to deal with single requests at the time
- cfq-iosched: don't delay queue kick for a merged request
- cfq-iosched: don't let idling interfere with plugging
- cfq-iosched: tweak kick logic a bit more
- Document and move the various READ/WRITE types
- loop: add ioctl to resize a loop device
- loop: support barrier writes
Device Mapper (DM)
IDE
Libata
- ahci: force CAP_NCQ for earlier NV MCPs
- ata_piix: ICH7 does not support correct MWDMA timings
- ata: Report 16/32bit PIO as best we can
- libata: ahci enclosure management bios workaround
- pata_hpt37x: fix HPT370 DMA timeouts
- sata_mv: introduce support for ATAPI devices
Multiple Device (MD)
- Documentation/md.txt update
- md: add explicit method to signal the end of a reshape.
- md: add ->takeover method for raid5 to be able to take over raid1
- md: add ->takeover method to support changing the personality managing an array
- md: add takeover support for converting raid6 back into raid5
- md: add takeover support for raid4 -> raid5 conversion.
- md: allow number of drives in raid5 to be reduced
- md: enable suspend/resume of md devices.
- md: occasionally checkpoint drive recovery to reduce duplicate effort after a crash
- md/raid5: Add support for new layouts for raid5 and raid6.
- md/raid5: allow layout and chunksize to be changed on active array.
- md/raid5: allow layout/chunksize to be changed on an active 2-drive raid5.
- md/raid5: finish support for DDF/raid6
- md/raid6: move raid6 data processing to raid6_pq.ko
- md: remove CONFIG_MD_RAID_RESHAPE config option.
- md: support bitmaps on RAID10 arrays larger then 2 terabytes
MMC
- mmc: add MODALIAS linkage for MMC/SD devices
- mmc: SDIO driver for Marvell SoCs
- sdhci: Add support for bus-specific IO memory accessors
- sdhci: Add support for card-detection polling
MTD
- MTD-CHIPS: Add JEDEC probe support for the SST 39VF3201 flash chip
- MTD-NAND: Add parent info for CAFÉ controller
- MTD-NAND: Add support for 4KiB pages.
- MTD-NAND: Add support for NAND on the Socrates board
- MTD-NAND: davinci_nand driver
- MTD-NAND: FSL-UPM: add multi chip support
- MTD-NAND: FSL-UPM: Add wait flags to support board/chip specific delays
- MTD-NAND: pxa3xx_nand: add ability to keep controller settings defined by OBM/bootloader
- MTD-NAND: TXx9: add NDFMC support
- MTD-NOR: Add device parent info to physmap_of
- MTD-OneNAND: Add write-while-program support
- MTD: RBTX4939: add MTD support
- MTD: RBTX4939 map driver
- MTD: TXx9 SoC NAND Flash Memory Controller driver
- NOMMU: Make it possible for RomFS to use MTD devices directly
SCSI
- SCSI: 3w-9xxx: add power management support
- SCSI: aacraid driver update
- SCSI: advansys: use request_firmware
- SCSI: fcoe, libfc: add libfcoe module
- SCSI: libosd: attributes Support
- SCSI: libosd: SCSI/OSD Sense decoding support
- SCSI: major.h: char-major number for OSD device driver
- SCSI: mpt2sas : Identify Dell series-7 adapters at driver load time
- SCSI: osd: Kconfig file for in-tree builds
- SCSI: osd_uld: OSD scsi ULD
- SCSI: qla1280: use request_firmware
- SCSI: qla2xxx: Add Flash-Access-Control support for recent ISPs.
- SCSI: qla2xxx: Add reset capabilities for application support.
- SCSI: qlogicpti: use request_firmware
- SCSI: scsi: Add osd library to build system
- SCSI: stex: add MSI support
- SCSI: stex: Add new device id
Various
Further background and information about developments in the Linux kernel and its environment can also be found in previous issues of the Kernel Log at The H Open Source:
- Kernel Log: X.org 7.5 coming in summer, re-write for Intel's graphics driver
- Kernel Log: What's coming in 2.6.30 - File systems: New and revamped file systems
- Kernel Log: 3D support for the new Radeon driver; new Intel drivers
- Kernel Log: What's coming in 2.6.30 - Network: New Wi-Fi drivers and other network novelties
- Kernel Log: Linux 2.6.30 is taking shape
- Kernel Log: Development of 2.6.30 is under way
Older Kernel Logs can be found in the archives or by using the search function at The H Open Source.
(thl/c't)
(djwm)