Kernel Log: Coming in Linux 3.5 (Part 1) - Networking
by Thorsten Leemhuis
A new packet scheduler is designed to help avoid buffer bloat and "Early Retransmit" offers faster connection recovery after TCP packet loss. The E1000e driver already supports the network chip for Intel's next-generation desktop and notebook platform.
To start the week, Linus Torvalds published the fourth release candidate for Linux 3.5. In the release announcement, Torvalds says that it has more than 200 commits, but notes that "they really are all pretty tiny and insignificant".
As usual, the kernel developers integrated all of the major new features for Linux 3.5 at the beginning of its development cycle. The Kernel Log can, therefore, already provide a comprehensive overview of the most important new features of Linux 3.5 – the kernel developers rarely add, or revert, any major changes during the stabilising phase.
The Kernel Log overview will be presented in the usual series of articles that will successively cover the various kernel areas. The first article below describes the most important new features in the kernel's network infrastructure and drivers; subsequent articles will discuss the kernel's graphics drivers, filesystems, storage support, architecture code and other hardware drivers.
Avoiding buffer bloat
The network subsystem now includes "Codel", an implementation of the "Controlled Delay Active Queue Management (AQM)" packet scheduler that was conceived by Kathleen Nichols and Van Jacobson. The "Fair Queue Codel AQM" scheduler, which is also based on Codel mechanisms but works in a different way, has been integrated into Linux 3.5 as well. Both schedulers take a slightly different approach from previous schedulers when prioritising sent or forwarded network packets and are designed to avoid the "buffer bloat" problem, in which excessive buffering in modern network chips causes high network latency and connection problems. LWN.net provides some background on this problem and on the approach used in Codel.
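To give a rough idea of how Codel paces its packet drops, the following Python sketch models two pieces of the algorithm as published by Nichols and Jacobson: the test for a persistently high queue delay and the "control law" that spaces successive drops. It is a simplified illustration and not the kernel's code; the function names are hypothetical, and the 5 ms target and 100 ms interval are Codel's published defaults rather than figures taken from this article.

```python
import math

TARGET = 0.005    # acceptable standing queue delay in seconds (Codel default: 5 ms)
INTERVAL = 0.100  # observation window in seconds (Codel default: 100 ms)


def should_start_dropping(sojourn_times, target=TARGET):
    """Return True if every packet seen during one interval waited at least `target`.

    `sojourn_times` holds the queueing delay, in seconds, of each packet
    dequeued during the last interval (hypothetical input for this sketch).
    """
    return bool(sojourn_times) and min(sojourn_times) >= target


def next_drop_time(now, count, interval=INTERVAL):
    """Codel control law: schedule the next drop while in the dropping state.

    `count` is the number of drops since the dropping state was entered;
    the gap between drops shrinks with the square root of that count.
    """
    return now + interval / math.sqrt(count)


if __name__ == "__main__":
    # Show how the gap between successive drops shrinks as the count grows.
    for count in range(1, 5):
        gap_ms = (next_drop_time(0.0, count) - 0.0) * 1000
        print(f"drop #{count}: next drop after {gap_ms:.1f} ms")
```

The point of the square-root spacing is that drops start gently and become more frequent only if the queue delay refuses to fall back below the target.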
A Google developer has added an "Early Retransmit" (ER) feature to the TCP stack that can accelerate connection recovery when packets are lost, as described in RFC 5827; however, the integrated implementation has been slightly modified to avoid some problems with the algorithm described in the RFC. Early Retransmit can be enabled via the "tcp_early_retrans" sysctl value and has managed to reduce TCP latencies by up to 8.5% in tests, as described in section 6 of "Proportional Rate Reduction for TCP", an article Google employees wrote for IMC 2011. The Proportional Rate Reduction (PRR) feature described in this article has been part of the kernel since version 3.2.
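As a small illustration, the current setting of the "tcp_early_retrans" sysctl can be read from procfs. The sketch below assumes the value is exposed at /proc/sys/net/ipv4/tcp_early_retrans, the usual location for net.ipv4 sysctls; the helper name is hypothetical, and the meaning of the individual values should be taken from the kernel's own sysctl documentation rather than from this example.

```python
from pathlib import Path

# Assumed procfs location of the sysctl (net.ipv4.tcp_early_retrans).
DEFAULT_PATH = Path("/proc/sys/net/ipv4/tcp_early_retrans")


def read_early_retrans(path=DEFAULT_PATH):
    """Return the sysctl's integer value, or None if the file does not exist.

    The file is missing on kernels that lack the feature and on
    non-Linux systems, so that case is reported instead of raised.
    """
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    value = read_early_retrans()
    if value is None:
        print("tcp_early_retrans is not available on this system")
    else:
        print(f"tcp_early_retrans = {value}")
```

Changing the value requires root privileges and is normally done with the sysctl tool or by writing to the same file.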
The E1000e driver has been extended to support the i217 PHY, which is said to work with Intel's Lynx Point Platform Controller Hub (PCH) – this line of motherboard chipsets is mainly intended for Intel's Haswell processors and will probably be released roughly in parallel with these processors in the first half of 2013. The R8169 driver for Gigabit Ethernet chips by Realtek can now communicate with the RTL8402 and RTL8411 chips, and the Mwifiex Wi-Fi driver can now address Marvell's USB8797 USB chip. Further changes that add support for LAN and Wi-Fi components or offer extended features can be found in the "Minor Gems" section of this article.
- The new "TCP connection repair" feature is designed to avoid network traffic problems that can occur when a container has been relocated to a different host; details on this can be found in an article on LWN.net.
- The NFC (Near Field Communication) code now supports NFC components that use the Host Controller Interface (HCI).
- The developers have removed the support for network hardware that complies with the Token Ring or Econet standards; both of these technologies can now be found almost exclusively in IT museums.
- Network subsystem maintainer David Miller mentions various other changes in the email that accompanies his main Git-Pull request for Linux 3.5; among them are various improvements for the use of Splice in network communication.