Kernel Log: Who’s responsible for troubleshooting and quality assurance?
by Thorsten Leemhuis
Recently, a Red Hat developer fixed a flaw in an Intel graphics driver, probably at the request of a corporate customer. For years, the flaw had been a thorn in the side of numerous users of systems with the 945GM chipset. Now, Theodore 'tytso' Tso has stated in a discussion on LKML that users have to accept more responsibility for troubleshooting and quality assurance.
A change recently implemented in kernel 2.6.35 shows how developers from different companies collaborate on the kernel, what good support contracts with Linux distributors are worth, and how commercial interests influence Linux development and troubleshooting. The change was made by Red Hat's DRM subsystem maintainer Dave Airlie to solve stability problems and prevent crashes that reportedly occurred on many systems with Intel's 945GM chipset, which was launched in 2006 and is mainly used in notebooks.
Developers have known about the problem for years because a large number of users reported it (1). As Airlie writes in the git pull request, there was an "enterprise reason" for this change and other corrections made to the DRM subsystem; in all likelihood, a Red Hat customer stumbled across the problem and insisted on a remedy. The commit comment seems to indicate that it took Airlie only "a week of digging and hair ripping" to solve the problem. Intel was not completely inactive: Intel's Keith Packard provided a patch documenting some of the registers used by Airlie's new code.
When you can't wait
In a recent LKML discussion, Linux guru Theodore 'tytso' Tso concluded that Linux is by no means free of cost: its users often have to invest a lot of time and money. He was responding to a complaint about a lack of quality and quality assurance (1). With suggestions like "Scratch your own itch" and "There is no such thing as a free lunch" (TANSTAAFL), Tso tries to make it clear that those who own a certain type of hardware have to make sure that it works.
For instance, he cannot say whether a certain eSATA-PCMCIA card works in a ThinkPad T23 because he does not have that hardware. It follows that users have to test for themselves if they want to know whether changes to the kernel work on their particular systems. Ideally, he says, such tests would be performed with the third or fourth release candidate (RC) of a new kernel version, because the later hardware is tested and flaws are reported, the less opportunity there is to correct them before the final release. He adds that you can also pay someone to ensure that everything works. Finally, he argues that "demanding that kernel.org become ‘more stable’ when it is supported by purely volunteers is simply not reasonable."
Linux version status
In the course of work on the five stable kernels released at the beginning of July, Greg Kroah-Hartman indicated that a number of further changes were in his inbox and that he would be providing additional stable kernels at a later date, though we have not seen anything yet.
In the main development branch, the current development cycle seems to be drawing to a close: when releasing the sixth release candidate of 2.6.35, Torvalds indicated that it might be the last RC. Publication of 2.6.35 is therefore quite possible at the end of this week or sometime next week. The recently updated "regression reports" will probably not change that; they list 25 flaws in 2.6.34.x and 2.6.35-rc6 that did not affect 2.6.33 and 2.6.34 (1, 2).