Kernel Comment: Taking a partial view
by Thorsten Leemhuis
In the latest study by the Linux Foundation, Microsoft only just misses out on a spot among the top 20 groups and companies contributing to the Linux kernel. It has, however, achieved this only by dint of delivering bad code and then slowly improving it.
The Linux Foundation has published another study – its fourth – of the latest changes merged into the Linux kernel. The section on how much different companies have contributed to Linux often attracts particular attention, and top-ranking companies are rarely shy about boasting of the results. Sometimes the list even manages to attract attention on its own merits, particularly when it includes companies that are among the last one would expect to see. Microsoft, for example, made headlines when a similar analysis by LWN.net found that a Microsoft employee headed the list of developers who had contributed the greatest number of changes to Linux kernel 3.0.
Users stumbling across these numbers would, however, do well to remember the old adage that you should never trust statistics you didn't make up yourself. Even if the figures are built on firm numerical foundations, the analysis merely presents an outline from a specific point of view – readers would be ill-advised to read too much into the figures.
Much as we might prefer to avoid courting controversy by choosing a different example, Microsoft is in fact the perfect illustration of the problem with these figures. In the Linux Foundation's latest study, Microsoft is recorded as making the 21st largest contribution to the Linux kernel in terms of the percentage of changes merged between Linux 2.6.36 and 3.2.
This is down to work on the Hyper-V drivers aimed at optimising Linux's ability to run under the Windows Server virtualisation solution. Microsoft released these drivers in July 2009 under the GPLv2. They were merged into Linux kernel 2.6.32 shortly thereafter, though only into the staging area for drivers which do not meet kernel quality standards.
In other words, Microsoft chalked up its first set of changes by merging bad code. This was followed by many further changes, as, in line with the wishes of the kernel development team, Microsoft's programmers improved the staging drivers in small, easily comprehensible increments. Because there was so much that needed fixing, this resulted in 688 changes (one per cent of all commits) between Linux 2.6.36 and 3.2. In the process, the code for the Hyper-V drivers shrank by around sixty per cent. According to the Hyper-V drivers' maintainers, they now also offer better performance, are more stable and represent a solid basis for further improvement.
Finally, following many changes and with the support of experienced kernel developers, the quality of the Hyper-V drivers has reached the kind of standard which most drivers attain before they are even merged into the kernel. Many companies and volunteer developers are, in fact, perfectly able to deliver decent code from the off, so that the vast majority of drivers never see service in the staging area at all.
The stopover in the staging area got Microsoft's development team up the Linux Foundation's rankings, since these are based on the number of commits made by a company – a single line change to the documentation carries the same weight as a patch which adds a properly written driver to the kernel and consists of thousands of lines of code. Jonathan Corbet, who was involved in putting together the Linux Foundation study, noted this problem when he published his analysis of recent kernel versions on LWN.net (see, for example, analyses of 3.0, 3.1, 3.2 and 3.3). Next to the table ranking companies by the number of changes, he also showed a table which ranked them on the basis of the number of lines changed – in which Microsoft found itself in a far less prominent position.
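The gap between the two ranking methods can be made concrete with a small sketch. The contributor names and commit sizes below are invented purely for illustration and are not taken from the study or from LWN.net's tables; the sketch only shows that counting commits and counting lines changed can put the same contributors in a different order.

```python
from collections import Counter

# Hypothetical commit log: one (contributor, lines_changed) entry per commit.
# All names and figures are invented for illustration only.
commits = (
    [("staging_cleanup_co", 10)] * 50   # many small, incremental fixes
    + [("new_driver_co", 4000)] * 2     # a few large, complete patches
)


def rank_by_commits(log):
    """Rank contributors by number of commits; every commit counts as one."""
    counts = Counter(name for name, _ in log)
    return [name for name, _ in counts.most_common()]


def rank_by_lines(log):
    """Rank contributors by the total number of lines their commits changed."""
    totals = Counter()
    for name, lines in log:
        totals[name] += lines
    return [name for name, _ in totals.most_common()]


print(rank_by_commits(commits))
print(rank_by_lines(commits))
```

In this made-up log, the contributor with fifty ten-line fixes leads the commit ranking, while the contributor with two large patches leads the ranking by lines changed. Neither measure says anything about how many users benefit from the code.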
Drivers which are already of a high quality when they are merged into the kernel do not, therefore, make much of a splash in the kind of study performed by the Linux Foundation – in fact it's much easier to get ahead with bad code which requires incremental improvement. This is just one of the many problems with the method of analysis. It also appears, for example, to count patches which update proprietary firmware bundled with the kernel and commits which revert a just-merged change.
So just in case you missed it – don't read too much into these figures!
Unless you're running Linux as a guest on the Windows Server virtualisation solution, you're not getting anything out of the one per cent of changes contributed by Microsoft between 2.6.36 and 3.2 anyway. By contrast, some of the contributions made by volunteer developers and other companies involve major changes from which many more Linux users benefit. The Linux Foundation study does not reflect this at all.
In the interests of balance, we should reiterate that Microsoft is just one example of this problem – other staging drivers have caused similar statistical distortions. And we'd also like to thank Microsoft for understanding that it is to everyone's benefit when drivers such as those for Hyper-V become a proper component of the Linux kernel, as they are set to in Linux 3.4. There are still some companies that have yet to make that intellectual leap...