Performance problem in AMD's Phenom II X6 under Linux
In many current Linux distributions, AMD's recently introduced desktop processors such as the hexa-core Phenom II X6 1090T don't reach their full performance potential because the kernel doesn't retrieve the correct processor frequencies from the ACPI tables – this confuses the kernel to such a degree that in many configurations the processor doesn't reach its nominal clock frequency.
Introduced with the hexa-core processors and identified via the "T" in the processor name, the Turbo Core feature is similar to Intel's "Turbo Boost": When individual processor cores are idle, other processor cores can run a bit faster to get their work done quicker without overtaxing the cooling system. The 1090T, which usually operates at 3.2 GHz, for instance, can temporarily run up to three of its cores at 3.6 GHz while the other three cores are idle.
Turbo Core and Turbo Boost interact with a system's power saving features, which clock down individual cores or the whole processor and decrease the voltage to reduce power consumption when a system is idle. Herein lies the problem with AMD's new processors in current Linux kernels: when their "Cool'n'Quiet" feature is enabled, Turbo Core processors no longer step up to their nominal speed, but operate at a slightly slower frequency.
The simplest workaround is to disable the "Cool'n'Quiet" feature. This requires users to fully disable the function in the board's BIOS set-up, or to instruct the kernel not to adapt clock speeds via cpufreq – the Fedora Linux distribution allows users to do this by stopping the "cpuspeed" daemon, while other distributions require users to blacklist the powernow-k8 kernel module, which is responsible for cpufreq in modern AMD CPUs. However, disabling the power saving features can increase an idle system's power consumption by 10 to 20 watts.
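Before choosing a workaround, it can help to see which maximum frequency the kernel actually reports for each core. The following is a minimal sketch rather than an official diagnostic: it assumes the standard Linux cpufreq sysfs layout under /sys/devices/system/cpu, and the helper names and the 3.2 GHz nominal value (taken from the 1090T figures above) are illustrative choices. Whether the affected kernels expose the discrepancy in exactly this sysfs file is also an assumption.

```python
from pathlib import Path


def read_khz(path):
    """Return the integer kHz value stored in a sysfs file, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def cores_below_nominal(nominal_khz, cpu_root="/sys/devices/system/cpu"):
    """Map core name -> reported maximum (kHz) for cores whose maximum is below nominal."""
    slow = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not helper directories such as "cpufreq".
    for cpu_dir in sorted(Path(cpu_root).glob("cpu[0-9]*")):
        max_khz = read_khz(cpu_dir / "cpufreq" / "cpuinfo_max_freq")
        if max_khz is not None and max_khz < nominal_khz:
            slow[cpu_dir.name] = max_khz
    return slow


if __name__ == "__main__":
    NOMINAL_KHZ = 3_200_000  # assumed nominal clock: 3.2 GHz for the 1090T
    for core, khz in cores_below_nominal(NOMINAL_KHZ).items():
        print(f"{core}: maximum reported as {khz / 1e6:.2f} GHz")
```

An empty result only means that no core reports a maximum below the assumed nominal value; cores without a cpufreq directory (for example when powernow-k8 is blacklisted) are skipped.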
AMD's in-house kernel developers have recently become aware of this problem. It is to be solved by changes that were already submitted to the kernel developers for review in March. These changes are intended to provide adequate Turbo Core support under Linux and will probably be integrated into Linux 2.6.35. According to the AMD employees, however, only one of the patches is actually required to fix the performance problem. As a result, this patch, which modifies only one line of source code, has been scheduled for prompt integration into Linux kernel 2.6.34, which is currently still in development. Integration of the patch into the currently maintained stable series kernels is also planned, but it is hard to say when new versions in the 2.6.32.x and 2.6.33.x series containing this modification will be released. AMD's developers are also in touch with the developers of various distributions so that the patch can be integrated into the kernels deployed via these distributions' update mechanisms.
Update: Linus Torvalds has now integrated the one-line patch to fix frequency reporting into the main development branch of the 2.6.34 Linux kernel.
Several tests with a release candidate of Fedora 13 and various kernels based on Linux kernel 18.104.22.168-57.fc13.x86_64, which ships with the distribution by default, demonstrate the effects of the problem as well as those of the kernel patches. Using the "kcbench" benchmark to compile Linux 2.6.25 in its standard configuration ("make defconfig") with a maximum of twelve processes ("make -j 12 bzImage"), a test system with a Phenom II X6 1090T required approximately 75 seconds. Compilation was about 20 seconds faster after Cool'n'Quiet was disabled via the BIOS set-up or by stopping cpuspeed. A self-compiled kernel containing only the aforementioned one-line patch required 59 seconds with cpuspeed enabled, which is four seconds more than with Cool'n'Quiet disabled. With a kernel that included all six changes in the patch series created for 2.6.34 and cpuspeed enabled, however, compilation took only about 52.5 seconds. (thl)
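To put these timings side by side, the short sketch below computes how much faster each configuration is relative to the roughly 75-second baseline. The figures are the approximate values quoted above; the 55-second entry is derived from "about 20 seconds faster", and the variable and function names are purely illustrative.

```python
# Approximate kcbench compile times in seconds, as quoted in the text.
baseline = 75.0  # default kernel with Cool'n'Quiet active

times = {
    "Cool'n'Quiet disabled": baseline - 20.0,  # "about 20 seconds faster"
    "one-line patch, cpuspeed enabled": 59.0,
    "six-patch series, cpuspeed enabled": 52.5,
}


def percent_faster(seconds, reference=baseline):
    """Return the reduction in compile time relative to the reference, in percent."""
    return round((reference - seconds) / reference * 100, 1)


summary = {name: percent_faster(seconds) for name, seconds in times.items()}

for name, pct in summary.items():
    print(f"{name}: {times[name]:.1f} s, {pct}% less time than the baseline")
```

Because the input values are rounded measurements from a single test system, the resulting percentages are only rough indications.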