Kernel Log - Coming in 3.6 (Part 3): Architecture
by Thorsten Leemhuis
Linux 3.6 can cut off the power to PCIe chips and ATA ports. A new userspace driver framework is designed to provide faster access to individual PCI/PCIe devices for virtualised systems.
Linux kernel version 3.6, expected to be released in about one to two weeks, can send PCIe devices into the "D3cold" deep sleep state, in which certain modern computers can completely power down individual PCIe devices (1, 2, 3). The libata subsystem can now put individual ATA ports into such a sleep state; this is to be the basis for code, being prepared for future kernel versions, that will support deep sleep states for optical drives, a technology called ZPODD (Zero-Power Optical Disk Drive).
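Whether a given device may enter D3cold can be inspected from userspace. The following is a minimal sketch, assuming the kernel exposes a per-device `d3cold_allowed` attribute below `/sys/bus/pci/devices`; the function name and the `pci_root` parameter are made up for this illustration, and devices without the attribute are reported as `None`.

```python
import os


def d3cold_policy(pci_root="/sys/bus/pci/devices"):
    """Map each PCI address to True/False, or None if the attribute is missing.

    Assumption: each device directory may contain a 'd3cold_allowed' file
    holding "1" (allowed) or "0" (not allowed).
    """
    policy = {}
    if not os.path.isdir(pci_root):
        return policy
    for addr in sorted(os.listdir(pci_root)):
        attr = os.path.join(pci_root, addr, "d3cold_allowed")
        try:
            with open(attr) as fh:
                policy[addr] = fh.read().strip() == "1"
        except OSError:
            # Attribute absent or unreadable on this kernel or device.
            policy[addr] = None
    return policy


if __name__ == "__main__":
    for addr, allowed in d3cold_policy().items():
        print(addr, allowed)
```

Passing a different `pci_root` lets the function be tried against a copied or mocked directory tree instead of the live system.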
The intel_idle driver can now control sleep state access in Intel's current Ivy Bridge generation of processors; this can be relevant when a system's firmware doesn't support CPU sleep states very well. The turbostat tool, which is included with the kernel and analyses how modern processors use their turbo and sleep states, has been thoroughly revised in Linux 3.6 to fix various problems and improve the support of current processors.
Another addition to the kernel is IOMMU Groups, which improve the isolation of PCI and PCIe devices when using I/O virtualisation technologies such as AMD-Vi and Intel's VT-d. The IOMMU Groups feature is also the basis for the VFIO (Virtual Function I/O) userspace driver framework (1, 2, 3). Mainly intended for KVM, VFIO is designed to pass PCI and PCIe devices through to guests, allowing them to access these devices at low latency and high data throughput, without any risk to the host. Details on VFIO are available in the documentation and in an article on LWN.net. Extensions to provide VFIO via QEMU are still in development.
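The grouping can be examined through sysfs. The sketch below assumes the groups appear as numbered directories under `/sys/kernel/iommu_groups`, each with a `devices` subdirectory that holds one entry per member device; the function name and the `root` parameter are hypothetical and exist only for this example.

```python
import os


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of device addresses}.

    Assumption: every group is a numeric directory that contains a
    'devices' subdirectory with one entry per device in the group.
    """
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (feature disabled or older kernel).
        return groups
    for name in os.listdir(root):
        dev_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(dev_dir):
            groups[int(name)] = sorted(os.listdir(dev_dir))
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    for number, devices in iommu_groups().items():
        print(number, " ".join(devices))
```

Devices that share a group cannot be isolated from one another by the IOMMU, so a group rather than a single device is the natural unit to hand to a guest.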
KVM now includes various modifications which reduce the workload for interrupt handling and therefore enhance performance (1, 2, 3). Xen now supports the logging of machine errors via mcelog. A new sysfs interface allows Xen guests to offline individual CPU cores, which can be relevant in terms of power management.
On Sunday, Linus Torvalds released the sixth release candidate of Linux 3.6. In the announcement, he asked people to test it, as he'd "really like to be able to do the final 3.6 soonish..."
On Intel's Nehalem and Sandy Bridge EP processors, the perf infrastructure now offers performance information on the behaviour of the uncore area, which includes the memory controller and L3 cache (1, 2). Ingo Molnar summarises various other new features of perf and its associated tools in an email with his main git pull request for this subsystem.
The developers have made various processor-specific optimisations to a number of crypto drivers. On x86-64 systems, for example, the modules for the Serpent and Twofish algorithms can now use AVX assembler implementations that in certain cases work much faster than the previous code, as demonstrated by the measurements mentioned in the commit comments (1, 2).
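Which implementations of an algorithm a running kernel has registered can be read from `/proc/crypto`. The sketch below assumes the usual layout of that file: blocks of `key : value` lines separated by blank lines, with `name`, `driver` and `priority` fields. The helper names are hypothetical, and the driver names that actually appear depend on the kernel configuration and the loaded modules.

```python
def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per registered algorithm."""
    entries = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if fields:
            entries.append(fields)
    return entries


def drivers_for(entries, algorithm):
    """Return the entries for one algorithm name, highest priority first."""
    matches = [e for e in entries if e.get("name") == algorithm]
    return sorted(matches, key=lambda e: int(e.get("priority", 0)), reverse=True)


if __name__ == "__main__":
    try:
        with open("/proc/crypto") as fh:
            parsed = parse_proc_crypto(fh.read())
    except OSError:
        parsed = []
    for entry in drivers_for(parsed, "ecb(serpent)"):
        print(entry.get("driver"), entry.get("priority"))
```

When several drivers provide the same algorithm, the kernel prefers the one with the highest priority, which is why the helper sorts in that order.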