Kernel Log: BIOS bugs behind greater power use
by Thorsten Leemhuis
Since version 2.6.38, kernels have used more power because, in certain situations, they disable the power-saving ASPM feature. New stable and long-term kernels offer corrections; however, one of them is conspicuously lagging behind.
In April, the Phoronix web site reported that some systems require more power with Linux 2.6.35 and 2.6.38 than they did with the respective predecessor versions. Phoronix says it has now found the reason for the increased power consumption in 2.6.38.
The culprit is said to be the "PCI: Disable ASPM if BIOS asks us to" change that was integrated in 2.6.38. According to its commit comment, the change is designed to fix problems that occur when the BIOS activates the ASPM (Active State Power Management) power-saving feature for certain PCIe chips but declares in the FADT (Fixed ACPI Description Table), which Linux consults, that ASPM is not supported. With the modification, the kernel takes the information in the ACPI tables into account and attempts to disable ASPM for all PCIe devices if the BIOS permits it to do so.
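To make the FADT declaration more tangible, here is a minimal Python sketch of how such a flag could be read from a raw FADT dump. It assumes the layout given in the ACPI specification, where the 16-bit IA-PC boot architecture flags sit at byte offset 109 and bit 4 signals that ASPM must not be enabled; the function name `fadt_declares_no_aspm` is purely illustrative, and the code is not the kernel's implementation.

```python
import struct

# Assumptions taken from the ACPI specification, not from the kernel commit:
# the IA-PC boot architecture flags are a little-endian 16-bit field at
# byte offset 109 of the FADT, and bit 4 means "do not enable ASPM".
IAPC_BOOT_ARCH_OFFSET = 109
NO_ASPM_BIT = 1 << 4


def fadt_declares_no_aspm(fadt: bytes) -> bool:
    """Return True if a raw FADT dump says that ASPM is not supported.

    Hypothetical helper for illustration only.
    """
    # The FADT carries the table signature "FACP" in its first four bytes.
    if fadt[:4] != b"FACP":
        raise ValueError("not a FADT: signature is not 'FACP'")
    if len(fadt) < IAPC_BOOT_ARCH_OFFSET + 2:
        raise ValueError("table too short to contain the boot flags")
    (boot_flags,) = struct.unpack_from("<H", fadt, IAPC_BOOT_ARCH_OFFSET)
    return bool(boot_flags & NO_ASPM_BIT)
```

On many Linux systems the raw table can be read by root from /sys/firmware/acpi/tables/FACP and passed to the function as bytes.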
However, in a large number of systems, the BIOS apparently provides incorrect ASPM support information via the FADT. From kernel version 2.6.38, all the PCIe lanes for communicating with PCIe chips are, therefore, permanently active in many of these systems, and it is this that has caused the increased power consumption many users have noticed. The kernel can be forced to enable ASPM via the "pcie_aspm=force" kernel parameter. However, this can cause system crashes, as explained, for example, by Red Hat in the RHEL 6 documentation. On a ThinkPad notebook tested by Phoronix, enabling ASPM reportedly caused power consumption to drop from 24.8 to 21.6 watts, which is the level that was required with kernel version 2.6.37.
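Those who want to check whether a system was booted with the parameter, and which ASPM policy is active, could use something along the lines of the following Python sketch. It assumes that the kernel command line is available in /proc/cmdline and that the policy file /sys/module/pcie_aspm/parameters/policy marks the active policy with square brackets; the helper names are made up for this example.

```python
import re
from typing import Optional


def aspm_cmdline_option(cmdline: str) -> Optional[str]:
    """Return the value of a pcie_aspm= boot parameter, or None if absent."""
    for token in cmdline.split():
        if token.startswith("pcie_aspm="):
            return token.split("=", 1)[1]
    return None


def active_aspm_policy(policy_text: str) -> Optional[str]:
    """Return the bracketed entry, e.g. 'default' from '[default] performance powersave'."""
    match = re.search(r"\[(\w+)\]", policy_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Both paths are assumptions about a typical Linux system.
    with open("/proc/cmdline") as f:
        print("pcie_aspm option:", aspm_cmdline_option(f.read()))
    try:
        with open("/sys/module/pcie_aspm/parameters/policy") as f:
            print("active policy:", active_aspm_policy(f.read()))
    except OSError:
        print("ASPM policy file not available")
```

The sketch only reads the current state; it does not change any ASPM setting, so it avoids the crash risk associated with forcing ASPM.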
Windows appears to handle such situations differently; whether it looks for ASPM support information elsewhere or simply uses ASPM if the BIOS makes it available is unclear. Ultimately, the scenario is very reminiscent of the reboot and UEFI problems in Linux that have recently attracted a certain amount of attention: Linux and Windows address the hardware differently, which causes Linux to go down untested avenues, and problems to occur, because hardware manufacturers have usually only tested their products with Windows. That Linux actually approaches some situations more correctly than Windows is irrelevant: highly specification-compliant operational procedures coupled with inaccurate notebook BIOS ACPI tables have been among the causes of many ACPI problems that have troubled Linux users over the years. Things got better when the Linux ACPI interpreter began to imitate that of Windows, making some of the same mistakes ("bug compatibility").
The coming days and weeks will show how the kernel hackers tackle the problem; however, a correction to re-enable ASPM in a larger number of systems will probably be part of Linux 3.1 at the earliest. Phoronix is still searching for the modification that has increased the kernel's power consumption since Linux 2.6.35.
Kernel version status
Greg Kroah-Hartman has released stable kernel 2.6.39.2 as well as long-term kernels 2.6.32.42 and 2.6.33.15. The release emails for the first two versions contain the usual strong recommendation to update, without going into detail about any potential security holes the developers may have fixed.
Shortly afterwards, Paul Gortmaker released long-term kernel 2.6.34.10, a version that offers 240 modifications. Unlike Kroah-Hartman, Gortmaker does mention in his release email that security holes have been fixed. However, more than ten weeks have passed between 2.6.34.9 and 2.6.34.10, and Kroah-Hartman released four new series .32 long-term kernels during that time. It is likely that these series .32 versions have already corrected various security holes which have only now been fixed in series .34.