In association with heise online

Continued focus on virtualization techniques

Linux kernel development lines

Kernel hackers are constantly developing the 2.6 series Linux kernel. They are not afraid to make comprehensive changes in the process. That is why we are unlikely to see a 2.7 developer branch in the near future, from which Linux version 2.8 or 3.0 might evolve the way Linux 2.6 came from 2.5. Instead, developers are maintaining several kernel series in parallel for different groups of users. Versions with three numbers separated by points (2.6.x, such as 2.6.26) make up the main development line. Parallel to the main development line, the administrators maintain the stable kernel series of the two most recent versions of the main development line, indicating the updates with an additional number (2.6.x.y, for instance 2.6.24.10 or 2.6.25.8).
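The numbering scheme described above can be illustrated with a short Python sketch. The helper name `classify` and the labels "mainline" and "stable" are chosen here for illustration only; the sketch assumes plain release strings without suffixes such as release-candidate tags.

```python
import re

# Hypothetical helper: tell the two 2.6 series apart by the number of fields.
_VERSION_RE = re.compile(r"^2\.6\.(\d+)(?:\.(\d+))?$")

def classify(version):
    """Return ('mainline', x) or ('stable', x, y); None if the string does not fit."""
    match = _VERSION_RE.match(version)
    if match is None:
        return None
    x = int(match.group(1))
    if match.group(2) is None:
        # Three numbers: a release of the main development line (2.6.x).
        return ("mainline", x)
    # Four numbers: an update from the stable series (2.6.x.y).
    return ("stable", x, int(match.group(2)))

for v in ("2.6.26", "2.6.24.10", "2.6.25.8"):
    print(v, classify(v))
```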

Once again the KVM (Kernel-based Virtual Machine) virtualization solution, which makes Linux itself into the hypervisor, has been thoroughly expanded. The KVM version contained in the 2.6.26 kernel now supports IA64 (1, 2, documentation) and s390 (1, 2, documentation) architectures, as well as 44x series PowerPC processors; previously, KVM ran only on 32 and 64 bit x86 processors. There were also numerous improvements for x86, for instance the hardware task switching emulation needed for FreeDOS, support for the nested page tables (1, 2, 3) virtualization technology in the new AMD processors, and support for Intel's virtual processor identification (VPID) and extended page tables (EPT; 1, 2, 3).
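Whether a given x86 machine offers the hardware features named above can be estimated from the CPU flags the kernel reports. The following Python sketch is an illustration only: it assumes the flag names `vmx`, `svm`, `npt`, `ept` and `vpid` as they appear in /proc/cpuinfo on recent kernels, and older kernels may not export all of them. The function name is hypothetical.

```python
# Flag names as found in /proc/cpuinfo on recent kernels (an assumption;
# older kernels may not export every one of them).
FEATURES = {
    "vmx": "Intel VT-x",
    "svm": "AMD-V",
    "npt": "AMD nested page tables",
    "ept": "Intel extended page tables",
    "vpid": "Intel virtual processor identification",
}

def virtualization_features(cpuinfo_text):
    """Return descriptions of the features listed in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            flags = set(line.split(":", 1)[1].split())
            return [desc for name, desc in FEATURES.items() if name in flags]
    return []

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(virtualization_features(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```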

As with previous kernel versions, there was also a whole range of improvements to the KVM infrastructure aimed at improving compatibility and performance. Among these are basic paravirt support (e.g. 1, 2, 3, 4), large page support, PIT emulation in the kernel, and the kvmtrace (1, 2) performance tracing system. KVM is now no longer considered experimental. Once again, the KVM maintainer faced harsh words from Torvalds after submitting a large correction at the eleventh hour.

KVM is not the only thing that was improved. Support for running as a Xen guest, which has been in the kernel since Linux 2.6.23, was also a focus of developers' efforts. They integrated, among other things, the Balloon Driver, which allows the memory assigned to the guest system to be reduced. Kernel developers also added a paravirtual framebuffer, keyboard, and mouse driver (xen pvfb). A change integrated shortly before the end of the development cycle requires Xen to use a kernel compiled with PAE support.

The virtual filesystem (VFS) can now make a read-only version of a mounted filesystem available at a different location (r/o bind mounts; some related commits: 1, 2). This is especially interesting with respect to container virtualisation, as it prevents a guest system from changing files in a shared root file system. As with previous Linux versions, there was a host of improvements for containers (among them, a whitelist for devices) in various parts of the kernel. This is moving container virtualisation to a point where software like Linux VServer and OpenVZ will soon be equipped to deliver all of the important functions without any additional kernel patches.
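A read-only bind mount of this kind is typically set up in two steps with mount(8): first the bind mount, then a remount that makes it read-only. The Python sketch below only builds and prints the two command lines; the function names and the paths are hypothetical, and actually running the commands requires root privileges and existing directories.

```python
import subprocess

def ro_bind_mount_commands(source, target):
    """Build the two mount(8) invocations for a read-only bind mount."""
    return [
        # Step 1: make the tree at 'source' visible at 'target' as well.
        ["mount", "--bind", source, target],
        # Step 2: remount the bind mount read-only.
        ["mount", "-o", "remount,bind,ro", source, target],
    ]

def ro_bind_mount(source, target):
    """Run both commands; needs root, and both directories must exist."""
    for argv in ro_bind_mount_commands(source, target):
        subprocess.run(argv, check=True)

# Hypothetical paths: a shared root exposed read-only to one container.
for argv in ro_bind_mount_commands("/srv/shared-root", "/srv/guest1/root"):
    print(" ".join(argv))
```

Keeping command construction separate from execution makes it possible to inspect the sketch without touching the mount table.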

Tuning under the hood

In the inner workings of the kernel a number of changes have been made. Users are not likely to notice these at all, but should profit from them indirectly. Support for PAT (Page Attribute Table) on x86 processors, for instance, is new in 2.6.26 (documentation). That will allow the kernel in future to have more influence on the caching used in modern processors.

A generic semaphore implementation merges previously architecture-specific code for locking with semaphores, considerably shrinking its size in the process (1, 2, documentation, pull request). This is intended to make the code easier to maintain and less vulnerable to errors. Locking with the generic semaphores is, however, also a bit slower, since the code is not optimised for the peculiarities of the various architectures. This disadvantage is not considered serious, since current kernel versions use mutexes far more than semaphores.

In practice, however, generic semaphores triggered significant performance problems with some pre-releases of 2.6.26 in various tests. Torvalds himself fixed one of the most severe problems. Torvalds' fix means that the Big Kernel Lock (BKL) will continue to be implemented with a spinlock, as it has been in earlier versions in the 2.6 series, rather than a semaphore; the result was that the PREEMPT_BKL ("preempt the Big Kernel Lock") configuration option had to be deleted.

To find the best remedy for the performance problem, some developers have focused on the performance-critical BKL and have started work in different parts of the kernel aimed at removing the BKL completely in the long run. Another approach is finer-grained locking, which should boost the performance of systems with several CPUs.
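The difference between the semaphores and mutexes mentioned above can be shown with a userspace analogy. This Python sketch uses the standard `threading` module and is not kernel code: a counting semaphore admits a fixed number of holders at once, while a mutex admits exactly one.

```python
import threading

sem = threading.Semaphore(2)   # counting semaphore: up to two holders at once
mutex = threading.Lock()       # mutex: exactly one holder

# A non-blocking acquire returns True on success and False if it would block.
sem_results = [sem.acquire(blocking=False) for _ in range(3)]
mutex_results = [mutex.acquire(blocking=False) for _ in range(2)]

print(sem_results)    # [True, True, False]
print(mutex_results)  # [True, False]

# Release what was actually acquired.
sem.release()
sem.release()
mutex.release()
```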

The basic driver framework (driver core) also saw big changes, which should make it easier for kernel hackers to use it correctly. Alan Cox is currently working on a general overhaul of the TTY code in the kernel and introduced various significant changes. Using programming tricks, developers were also able to shrink the size of the Tcrypt kernel module immensely. There were comprehensive changes as well to caching techniques for FUSE (Filesystem in Userspace). Access restrictions to /dev/mem (1, 2) close back doors to attackers. The choice between SELinux, SMACK, or another LSM based security framework can now be specified with a new kernel parameter. Changes in the module loader prevent unintentional loading of modules that are incompatible with the kernel (1, 2, 3, 4).

Following the kernel hackers' successful merging of source code directories for 32 and 64 bit x86 architectures in Linux 2.6.24 using scripts, programmers have continued to manually merge many files as they did in 2.6.25. In the process they did a lot of tidying up to simplify code and make it more robust. As with previous versions, there was also improvement and tidying up done in the ACPI S3/Suspend-to-RAM and Hibernate/Suspend-to-Disk power management modes.

More: Cornucopia of new and updated drivers; miscellany

Permalink: http://h-online.com/-746492