Kernel Log: Coming in 3.10 (Part 3)
by Thorsten Leemhuis
Kernel developers have toned down an over-eager feature for protecting against the Samsung UEFI bug and added a function for reducing timer interrupt overhead. Improvements have also been made to Hyper-V support and instructions for reporting errors.
For Linux 3.10, the kernel developers have modified the code designed to protect Samsung laptops from faults caused by a problem with garbage collection in the UEFI firmware. The protection feature had previously been over-eager at times and had occasionally prevented machines from other manufacturers from setting or modifying UEFI variables; with the changes, it will swing into action less frequently. A recent test by c't magazine, for example, found that several test machines were missing the UEFI entries for booting the installed Linux OS after a UEFI Linux installation, because this feature had blocked the creation of the entries. Users can resolve situations like this by booting a live Linux that doesn't contain the code to protect Samsung notebooks, such as Fedora 18 or Ubuntu 12.10, and using efibootmgr to create a UEFI boot entry manually; obviously this option should not be used with a Samsung device. Details of the approach adopted by the revised protection feature can be found in a blog entry by kernel developer Matthew Garrett.
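As a rough illustration of the manual workaround, the following Python sketch assembles an efibootmgr call that creates a boot entry. The disk, partition number, label and loader path are assumptions chosen for the example and will differ between installations; the helper names are hypothetical. Actually running the command requires root privileges on a system booted in UEFI mode and, as noted above, should not be attempted on a Samsung device.

```python
import subprocess


def efibootmgr_create_args(disk, partition, label, loader):
    """Build the argument list for creating a UEFI boot entry.

    Only long options documented by efibootmgr are used:
    --create, --disk, --part, --label and --loader.
    """
    return [
        "efibootmgr",
        "--create",
        "--disk", disk,            # disk holding the EFI system partition
        "--part", str(partition),  # partition number of the EFI system partition
        "--label", label,          # name shown in the firmware boot menu
        "--loader", loader,        # loader path, relative to the partition root
    ]


def create_boot_entry(disk, partition, label, loader):
    """Run efibootmgr for real (needs root and a UEFI-booted system)."""
    subprocess.run(efibootmgr_create_args(disk, partition, label, loader),
                   check=True)


# Assumed example values; check the actual layout before running anything.
example = efibootmgr_create_args("/dev/sda", 1, "Fedora", r"\EFI\fedora\shim.efi")
print(" ".join(example))
```

The sketch only prints the command; calling create_boot_entry() would modify the firmware's boot variables.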
Linux can now slow the timer interrupt for individual CPU cores, which normally fires 100, 250 or 1000 times per second, to just one interrupt per second. This should prevent response time jitter on real-time systems and provide a small performance boost for high-performance computing setups (1, 2, 3, 4, 5, and others). Timer interrupts cannot, however, be slowed on the CPU core used during booting (bootstrap processor/BSP/CPU#0). In addition, the feature only takes effect on cores that are running just a single process. The change, which has been several years in the making, will see further enhancements in future, including the ability to disable the timer interrupt completely. This and other planned enhancements, which are being developed under the code names 'full dynamic ticks' and 'nohz', will also benefit desktop systems. More information can be found in the documentation and in this LWN.net article.
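To get a feel for the numbers, the following sketch adds up the timer interrupts per second across all cores with and without the new feature. It assumes the best case described above: the boot CPU keeps ticking at the full rate and every other core runs exactly one process, so that it drops to one interrupt per second. The function name and the eight-core example are made up for illustration.

```python
def timer_interrupts_per_second(cores, hz, full_dynticks):
    """Total timer interrupts per second across all cores (best case)."""
    if cores < 1 or hz < 1:
        raise ValueError("cores and hz must be positive")
    if not full_dynticks:
        # Classic behaviour: every core is interrupted hz times per second.
        return cores * hz
    # The boot CPU keeps its full tick rate; each of the remaining cores,
    # assumed to run a single process, is interrupted once per second.
    return hz + (cores - 1) * 1


for hz in (100, 250, 1000):
    before = timer_interrupts_per_second(8, hz, full_dynticks=False)
    after = timer_interrupts_per_second(8, hz, full_dynticks=True)
    print(f"HZ={hz}: {before} -> {after} timer interrupts per second")
```

On a hypothetical eight-core machine with a 1000 Hz tick, this comes to 8000 interrupts per second before and 1007 afterwards.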
The cpufreq subsystem, which controls changes to processor clock speed, now has basic support for ARM's big.LITTLE concept. This involves processors containing both fast, power-hungry cores and much slower, more power efficient cores. Further details on Linux support for the concept can be found in three LWN.net articles (1, 2, 3). Full support for big.LITTLE will require some major revisions to areas such as the kernel's process scheduler. This will be delivered by the in-kernel switcher (IKS), currently being developed by Linaro developers.
Over the weekend, Linus Torvalds released Linux 3.10-rc7. He seemed confident that the seventh pre-release version should be the last, meaning that Linux 3.10 will probably be released at the turn of the month – though it's far from unheard of for such pronouncements to be derailed by the arrival of major problems, leading to a further RC and a further week of development.
Linux now includes a framebuffer graphics driver for Hyper-V synthetic video, the virtual graphics device that the virtualisation solutions in Microsoft's Windows Server present to guests. The kernel guest driver for Microsoft's Hyper-V hypervisor now supports enlarging the amount of memory at run-time (memory hot add). A new driver enables Windows hosts to instruct Linux guests to quickly get all filesystems into a consistent state so that the host can create a snapshot of the disks used by the guest (host initiated backup).
Changes to KVM include improvements to nested virtualisation (where a VM runs inside another VM) on Intel processors (1, 2, 3 and others). Support for Intel's APIC virtualisation and posted interrupts should reduce overhead when processing interrupts intended for guest systems (1, 2, 3, 4, 5, 6). KVM can also now be used for virtualisation on some MIPS32 processors (1, 2, 3 and others). Xen on ARM now supports SMP. The new pvpanic driver lets a guest kernel notify Qemu when it panics, so that the host can learn that the guest has crashed. Qemu is used for KVM and Xen virtualisation.
Changes to the perf subsystem include the addition of uretprobes, which allow the kernel to insert breakpoints into the return path for userspace code (1, 2, 3, 4, 5, documentation). This makes it easier for perf to determine when a specific program function was exited. The new perf subcommand 'mem' and corresponding support for 'perf record' and 'perf report' enable memory access profiling on processors with PEBS (precise event based sampling) (1, 2, 3, 4, 5 and others).
The function tracer ftrace now supports multiple buffers. This can be useful in situations in which rare events would otherwise be submerged in the flood of more frequently occurring events. Improvements have also been made to tracer triggers, which are now able to enable or disable trace events when a specific function is executed. Details of these and other changes can be found in the updated documentation.
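The additional buffers are exposed as 'instances' below the tracing directory, where creating a subdirectory allocates a separate buffer with its own set of control files. The sketch below shows one way a rare event might be routed into its own buffer. It assumes that debugfs is mounted at /sys/kernel/debug and that the script runs as root; the helper names, the instance name and the choice of event are hypothetical.

```python
import os

# Assumption: debugfs is mounted in the usual place.
TRACING_DIR = "/sys/kernel/debug/tracing"


def instance_dir(name, tracing_dir=TRACING_DIR):
    """Directory of a named trace buffer instance."""
    if not name or "/" in name:
        raise ValueError("instance name must be a single path component")
    return os.path.join(tracing_dir, "instances", name)


def event_enable_file(instance, subsystem, event, tracing_dir=TRACING_DIR):
    """Control file that switches one trace event on or off for an instance."""
    return os.path.join(instance_dir(instance, tracing_dir),
                        "events", subsystem, event, "enable")


def trace_event_in_own_buffer(instance, subsystem, event):
    """Create the instance if needed and enable a single event in it (root only)."""
    try:
        os.mkdir(instance_dir(instance))  # the kernel fills in the control files
    except FileExistsError:
        pass
    with open(event_enable_file(instance, subsystem, event), "w") as f:
        f.write("1")


# Only prints the path; nothing is written to the tracing directory here.
print(event_enable_file("rare", "sched", "sched_process_exit"))
```

The events recorded this way can then be read from the 'trace' file inside the instance directory, separately from the main buffer.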
A group of developers has further tidied and redesigned the control groups (cgroups) code as part of efforts to slowly eliminate a number of known problems with the code. The code now offers a new mount option, which can be used to force the code to behave in what is eventually intended to be the normal manner. In the wake of these changes, numerous changes aimed at giving the init system responsibility for controlling cgroups have also been made to cgroup support in systemd. Other programs will be able to influence the configuration via a systemd API.
The new 'memory.pressure_level' events in the memory cgroup controller (memcg) mean that applications can now be notified when memory is running short. Details can be found in the relevant documentation and this LWN.net article.
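Registration follows the event mechanism described in the kernel's memcg documentation: an application creates an eventfd, opens memory.pressure_level and writes both file descriptors together with the desired level ('low', 'medium' or 'critical') to cgroup.event_control. The following sketch illustrates this under a number of assumptions: the cgroup directory is passed in by the caller, the process is allowed to write to it, and Python 3.10 or later is used for os.eventfd. The function names are made up for the example.

```python
import os

LEVELS = ("low", "medium", "critical")


def registration_line(event_fd, pressure_fd, level):
    """String to write to cgroup.event_control: '<event_fd> <pressure_fd> <level>'."""
    if level not in LEVELS:
        raise ValueError(f"unknown pressure level: {level!r}")
    return f"{event_fd} {pressure_fd} {level}"


def wait_for_memory_pressure(cgroup_dir, level="low"):
    """Block until the kernel reports memory pressure of at least the given level.

    cgroup_dir is assumed to be a directory of the memory controller,
    for example a group below /sys/fs/cgroup/memory.
    """
    event_fd = os.eventfd(0)  # requires Python 3.10+ on Linux
    pressure_fd = os.open(os.path.join(cgroup_dir, "memory.pressure_level"),
                          os.O_RDONLY)
    try:
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(registration_line(event_fd, pressure_fd, level))
        # Reading the eventfd blocks until a notification has been delivered.
        return os.eventfd_read(event_fd)
    finally:
        os.close(pressure_fd)
        os.close(event_fd)


print(registration_line(5, 7, "low"))
```

An application could call wait_for_memory_pressure() in a helper thread and release caches whenever it returns.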
Sarah Sharp has made major changes to the REPORTING BUGS document, which provides information on how to report kernel bugs. The text file, which had long been left largely untouched, now contains step-by-step instructions and explains how to find the right person to report to.
Changes to the rwsem (reader-writer semaphore) locking mechanism should improve performance in specific cases (1, 2, 3, 4, 5 and others). In one test setup, they doubled the speed of the PostgreSQL benchmark pgbench. There have also been changes to the mutex code which should significantly improve performance or scalability in certain cases (1, 2).
The ARM platforms bcm2835, cns3xxx, sirf, nomadik, mxs, spear, tegra and ux500 are now supported by a multi-platform kernel. The code for Samsung's Exynos should be converted in time for 3.11. ARM maintainer Arnd Bergmann details further plans for multi-platform support in a comment on LWN.net. A summary of other changes to the code for ARM SoCs can be found in the LKML threads covering the first, second and third tranches of changes for this area.
The kernel can now be compiled for the microMIPS instruction set architecture (ISA). Kernel images thus created should be at least 20 per cent smaller than the corresponding image for the MIPS32R2 ISA.
Kernels compiled for Tilera processors can now run at privilege level 2 under Tilera hypervisors running at level 1.