Kernel Log - Coming in 3.7 (Part 3): Infrastructure
by Thorsten Leemhuis
Linux 3.7 can use signatures to verify the integrity of kernel modules, while the new integrity appraisal extension helps to detect malicious software from a third party. The new kernel loads firmware files without udev and includes important container improvements.
Linux 3.7 can sign kernel modules and verify those signatures and, therefore, the integrity of the modules before loading them (1, 2, 3, 4, 5, 6, 7). Some enterprise distributions have had similar features for a while – for example, to ensure that the modules used for troubleshooting are really from the distribution kernel. Developers have been working on integrating the functionality into Linux as some distributions want to load only signed kernel modules when booted with UEFI secure boot – this is now possible with the integrated code.
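As a rough illustration of what "signed" means for a module file, the following minimal sketch checks whether a module image carries a trailing signature marker. The marker string and its position at the very end of the file are assumptions about the on-disk format rather than something this article specifies, and the helper names are hypothetical. The sketch only detects the presence of the marker; it does not verify the signature itself, which the kernel does against its own keys.

```python
# Assumption: a signed module ends with this marker string.
MODULE_SIG_MARKER = b"~Module signature appended~\n"


def has_signature_marker(module_bytes: bytes) -> bool:
    """Return True if the module image ends with the assumed signature marker."""
    return module_bytes.endswith(MODULE_SIG_MARKER)


def looks_signed(path: str) -> bool:
    """Read a module file from disk and report whether it carries the marker."""
    with open(path, "rb") as module_file:
        return has_signature_marker(module_file.read())
```

A check like this can tell a signed module from an unsigned one, but says nothing about whether the signature is valid.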
Another new feature is the integrity appraisal extension for the Integrity Measurement Architecture (IMA), which the kernel has supported for quite some time now (1, 2). IMA can store signed hashes for files and use them to recognise when binaries from the Linux installation have been changed. The integrity appraisal extension can now check those hashes and deny access to files whose contents no longer match them. The behaviour of the mechanism can be configured with a new boot parameter. IMA can use a trusted platform module (TPM) to sign and thus securely save the hashes; the kernel developers have continued to work on support for TPMs (1, 2).
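The appraisal step can be pictured as a simple comparison between a stored reference hash and a freshly computed one. The sketch below is a conceptual model only: the function name, the mode names "off" and "enforce", and the choice of SHA-256 are assumptions made for illustration, and the signing of the stored hashes is not modelled at all.

```python
import hashlib


def appraise(content: bytes, stored_digest: str, mode: str = "enforce") -> bool:
    """Decide whether access to a file should be allowed.

    mode "off" skips the check entirely; "enforce" allows access only if the
    freshly computed digest matches the stored reference digest.
    Both mode names are hypothetical.
    """
    if mode == "off":
        return True
    actual_digest = hashlib.sha256(content).hexdigest()
    return actual_digest == stored_digest


# Usage: record a reference digest once, then appraise on every access.
reference = hashlib.sha256(b"trusted binary contents").hexdigest()
allowed = appraise(b"trusted binary contents", reference)
denied = appraise(b"tampered binary contents", reference)
```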
The kernel can now load firmware files from the file system on its own and no longer depends on udev (1, 2). This is a consequence of a recent change to udev's firmware handling that delayed system start-up under certain circumstances. The behaviour led to a long discussion in which Linus Torvalds criticised the udev developers; that discussion prompted the development of a firmware loader in the kernel itself and was also among the reasons for the udev fork eudev. A change has since been made in udev that also fixes the problem. In addition, udev maintainer Kay Sievers wrote, even before the kernel was changed, that he thought the kernel should be able to load firmware files itself; he told us that this would be easier and much more reliable than having udev handle it in userspace.
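Loading firmware without a userspace helper essentially comes down to trying a fixed list of directories in order. The following sketch models that lookup; the directory list is an assumption about the kernel's built-in search path, not something stated in this article, and both function names are hypothetical.

```python
import os
from typing import List, Optional, Sequence


def firmware_search_dirs(kernel_release: str) -> List[str]:
    """Return the assumed search order, most specific directory first."""
    return [
        f"/lib/firmware/updates/{kernel_release}",
        "/lib/firmware/updates",
        f"/lib/firmware/{kernel_release}",
        "/lib/firmware",
    ]


def find_firmware(name: str, search_dirs: Sequence[str]) -> Optional[str]:
    """Return the first existing file called `name`, or None if none is found."""
    for directory in search_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Passing the directory list in as a parameter keeps the lookup testable without touching the real /lib/firmware tree.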
Eric W. Biederman integrated a large patch series that improves support for namespaces, which is useful for stricter separation of user and group IDs between the host and containers (1, 2, 3, 4). The changes improve upon the "user namespace enhancements" that Biederman contributed to Linux 3.5, describing them at the time as a course correction for user namespaces. These changes should mean that everything now works according to the new approach – except under more complex filesystems such as CIFS, NFS, OCFS2 and XFS. Support for those is currently being developed and may be included in Linux 3.8. Biederman is working on another patch series that should wrap up the major reconstruction of the namespace code; the new infrastructure will then be complete and allow users to simply set up and use a namespace.
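The separation of IDs between host and container can be thought of as a table of ranges that translates an ID seen inside the namespace into an ID on the host. The sketch below models such a table; the three-column text format ("first ID inside, first ID outside, count") is assumed here for illustration, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# One mapping range: (first ID inside, first ID outside, number of IDs)
Extent = Tuple[int, int, int]


def parse_id_map(text: str) -> List[Extent]:
    """Parse lines of three whitespace-separated integers into extents."""
    extents: List[Extent] = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(field) for field in fields)
            extents.append((inside, outside, count))
    return extents


def to_host_id(extents: List[Extent], inside_id: int) -> Optional[int]:
    """Translate an ID from inside the namespace; None means it is unmapped."""
    for inside, outside, count in extents:
        if inside <= inside_id < inside + count:
            return outside + (inside_id - inside)
    return None
```

With a mapping of "0 100000 65536", root inside the container (ID 0) corresponds to the unprivileged ID 100000 on the host.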
The cgroups infrastructure, which is often used for containers but can also be used in other situations, now warns users when they create a nested hierarchy in a control group whose controller doesn't properly support nesting; this issue is just one of the many problems in the cgroups code that cgroups maintainer Tejun Heo is currently working to fix. Another addition is support for extended attributes in virtual cgroup filesystems (1, 2), a feature that systemd developer Lennart Poettering had put on the "linux plumber's wish list" so that metadata for background services can be stored there.
- Linus Torvalds removed the Documentation/feature-removal.txt file used to announce the removal of kernel features. In his commit comment, Torvalds described the document as idiotic, saying that if no one used a feature, then it could just be removed; otherwise, there was no reason to remove it.
- The make target for setting all unused kernel configuration options to their default values is now called "olddefconfig" – the developer who made the change said that the name previously used, "oldnoconfig", confused users because the command doesn't do what the name suggests.
- The Yama security extension can now be used with other LSMs (Linux Security Modules), so that Yama's security functions can be combined with, say, SELinux's.
- Because of changes to RCU (read-copy update), which coordinates data access, kernel threads such as rcu_sched, rcu_preempt and rcu_bh now show up in the process list and, according to programs like ps and top, consume quite a lot of CPU time on some systems. In fact, the RCU code required just as much time before, but the work was done in the ksoftirqd kernel thread. RCU maintainer Paul McKenney explains this situation and other background information in an LWN.net article on the changes.
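The behaviour behind the "olddefconfig" target mentioned in the list above can be modelled as a merge: options already set in the old configuration keep their values, and every other known option receives its default. This is a simplified sketch with hypothetical option names; it ignores dependencies between options, which the real configuration tooling has to resolve.

```python
from typing import Dict


def fill_with_defaults(old_config: Dict[str, str],
                       defaults: Dict[str, str]) -> Dict[str, str]:
    """Keep old values for options that still exist; default everything else."""
    merged = dict(defaults)
    for option, value in old_config.items():
        if option in defaults:  # options that no longer exist are dropped
            merged[option] = value
    return merged


# Hypothetical option names, used purely for illustration.
old = {"CONFIG_EXAMPLE_A": "y", "CONFIG_EXAMPLE_GONE": "m"}
known_defaults = {"CONFIG_EXAMPLE_A": "n", "CONFIG_EXAMPLE_B": "m"}
new_config = fill_with_defaults(old, known_defaults)
```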
For kernel hackers
- The kernel now mounts debugfs, which is typically found under /debug/ or /sys/kernel/debug/, with 0700 permissions by default, so that generally only root can read the information there.
- A generic implementation of a simple hash table was added to the kernel. In later kernel versions, it is supposed to replace a number of other implementations that have popped up in various subsystems of the kernel and that implement the same functionality in multiple ways.
- Workqueues are now "non-reentrant" by default: the same work item is no longer executed on multiple processors at the same time. Many developers were not aware that concurrent execution was possible and were therefore surprised when their workqueue code didn't behave as they expected.
- The kernel developers have begun to store header files with data type definitions used by userland software in directories called "uapi" located under include/ and arch/<arch>/include/ (1, 2, 3). This change should make it easier to differentiate them from definitions used only within the kernel and should resolve some problems with the inclusion of header files. This restructuring is one of the main reasons why a diffstat shows such a large number of changes compared to Linux 3.6.
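To give an idea of what a "simple hash table" like the one mentioned in the list above involves, here is a minimal model: a fixed number of buckets, chosen as a power of two, with colliding entries chained in a list. It is a conceptual sketch only; the class and method names are hypothetical and do not mirror the kernel's C interface.

```python
from typing import Any, Iterator, List, Tuple


class SimpleHashTable:
    """Fixed-size hash table with 2**bits buckets and chained entries."""

    def __init__(self, bits: int) -> None:
        self._mask = (1 << bits) - 1
        self._buckets: List[List[Tuple[int, Any]]] = [
            [] for _ in range(1 << bits)
        ]

    def _index(self, key: int) -> int:
        # Reduce the key's hash to a bucket index.
        return hash(key) & self._mask

    def add(self, key: int, value: Any) -> None:
        """Append an entry to its bucket; duplicate keys are allowed."""
        self._buckets[self._index(key)].append((key, value))

    def possible(self, key: int) -> Iterator[Any]:
        """Yield every value stored under `key`, skipping colliding entries."""
        for stored_key, value in self._buckets[self._index(key)]:
            if stored_key == key:
                yield value

    def remove(self, key: int, value: Any) -> bool:
        """Remove one matching entry; return False if it was not present."""
        bucket = self._buckets[self._index(key)]
        try:
            bucket.remove((key, value))
            return True
        except ValueError:
            return False
```

Because the table never resizes, the caller has to pick a suitable number of buckets up front, which is what keeps an implementation like this small.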