Monday, October 22, 2007

From ZDNet:

How Linux Is Testing The Limits Of Open Source Development

The community's pushing a breakneck pace to add new kernel features, while struggling to keep up with bug fixes. Slowing down doesn't look like much of an option.

By Charles Babcock, InformationWeek
Oct. 20, 2007
URL: http://www.informationweek.com/story/showArticle.jhtml?articleID=202404635

As the latest release of the Linux kernel emerged this month, it reflected a dizzying number of changes. Kernel 2.6.23, coming just three months after the last update, incorporated business-friendly features, including better virtualization support and an update to the all-important scheduler, as well as the usual new device drivers and bug fixes.

The sheer number of changes coming every two to three months from Linus Torvalds' "code tree" is a sign of accelerating kernel development. The process so far has produced undeniably high-quality, reliable code.

But make no mistake: Torvalds is pushing open source development tactics to new extremes. As the kernel grows in size and complexity, the rapid-fire iterations are straining the capacity of the community of volunteers who test and debug them.

Yet Torvalds can't let off the gas, for two reasons. First, Linux can't afford to fall behind technically, or it'll lose ever-demanding business users. The new kernel, for example, has hooks to take advantage of the latest virtualization capabilities embedded in Intel and Advanced Micro Devices processors. Second, Linux needs to feed its developer community. New features keep coders from getting bored and moving on to other projects, and they attract new talent as coders age or drop out of the process.

The road map of new Linux features, informal and unpredictable as it is, springs from this tension, this constant drive to add features while maintaining quality and stability. Can this 16-year-old open source project be sustained for another 16 years on this scale? "No other open source project has gotten this large or moved this fast," says Dan Frye, an IBM VP who tracks the kernel process. "It's a first-of-a-kind developer community."

Business users depend on this pell-mell process to improve Linux on many fronts beyond virtualization, including power management and security. It can take as long as two years for these rapid kernel changes to find their way into the systems put out by Red Hat and Novell, which most businesses that run Linux use, so there's something of a buffer to the kernel's frantic development process. Still, as the kernel goes, so goes the future of Linux.

SPEED VS. QUALITY
Linux is gaining code at an average of 2,000 lines per day, despite Torvalds' goal of limiting the amount of code that gets into the kernel to keep it as efficient as possible. Linux's modular kernel is the core of the operating system, handling all general-purpose tasks, such as memory management, requests to the CPU, and input/output. It's surrounded by hundreds of add-on packages that do more specialized things, such as translate files between Linux and Windows and configure files for display on an Apache Web server. But the kernel must grow to handle more functionality, more hardware, more users. What started in 1991 as a hobbyist's 10,250 lines of code is now more than 8 million lines.

Some think the kernel, clocked at 86 lines of new code per hour, is exceeding the software development speed limit. A key maintainer, Alan Cox, has warned that some device driver changes should get more testing before being incorporated into the kernel. Andrew Morton, a skilled programmer dubbed "the colonel of the kernel" after Torvalds tapped him as a general manager, has been outspoken on the problem of unfixed bugs in Linux. "I would like to see people spending more time fixing bugs and less time on new features," Morton says. "That's my personal opinion."
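The two growth figures quoted here, 2,000 lines per day and 86 lines per hour, can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes growth is spread evenly around the clock, which the article does not state.

```python
HOURS_PER_DAY = 24


def hourly_to_daily(lines_per_hour: float) -> float:
    """Convert an hourly growth rate into a daily one."""
    return lines_per_hour * HOURS_PER_DAY


def daily_to_hourly(lines_per_day: float) -> float:
    """Convert a daily growth rate into an hourly one."""
    return lines_per_day / HOURS_PER_DAY


if __name__ == "__main__":
    # 86 lines per hour, the rate quoted for the kernel's "speed"
    print(hourly_to_daily(86))              # 2064
    # 2,000 lines per day, the quoted daily average
    print(round(daily_to_hourly(2000), 1))  # 83.3
```

Under that assumption the two figures agree to within about 3 percent, so they appear to describe the same growth rate at different levels of rounding.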

But Torvalds indicated at the recent Linux Kernel Summit in Cambridge, England, that he thinks he has erred in the past on the side of caution. Slow kernel releases cause logjams upstream as additions await their chance to get into the kernel. Contributors lose interest without immediate feedback from kernel maintainers and their trusted expert developers. (Torvalds didn't respond to an interview request.)


More bug fixes, fewer new features, Andrew Morton requests.

By erring on the side of speeding Linux's development, Torvalds is counting on the basic open source principle that many users testing frequent releases of the code are more likely to catch defects than a more structured testing process. Linux bugs crop up constantly as additions to the kernel are found not to work on certain hardware or to clash with other software, either inside or outside the kernel. Developers who submit code are expected to troubleshoot bugs as they crop up, but often they don't.

At the summit, Morton said he wanted to appoint "a nasty person" to be kernel bugmaster, someone to identify bug sources and "beat up on developers who do not fix bugs," according to kernel developer Jonathan Corbet's account, published by the Linux Foundation. Natalie Protasevich was named bugmaster, and Morton says she has brought more discipline to bug clean-up, even if she falls short of his description of preferred temperament. There were more than 1,500 bugs in the kernel's Bugzilla database; it's down to 1,400.

"It's become a very sophisticated balancing act between rapid development and complete code review," says Dirk Hohndel, who heads Linux and open source technology at Intel. Yet even in its breakneck pace, not every feature that developers want in--or businesses demand--sails right into the kernel.

The process can be frustrating for Linux business customers. At the European travel service Amadeus, Linux is central to its strategy of reducing its infrastructure costs by a factor of 10, says Fred Bessis, VP of technology and strategic planning, by phasing out mainframe systems and running Linux "on cheap hardware." With 10 years of experience using Linux already, it knows what it's getting into, including watching potentially useful new features creep toward business editions.

Holger Weisbrodt, Amadeus' senior systems programmer, says new hardware and drivers get quickly worked into the kernel, but new diagnostic and debugging tools are "taking pretty long to get there." He'd like to see more emphasis put on debugging tools in general.

The latest Linux version shows this unpredictable process at work with two new features, a new scheduler and improved virtualization, that took much different tracks to the kernel, each with its own risks and complications.

THICK SKIN REQUIRED
The process to get a feature approved can be unrestrained and bruising. That was the case with one of the kernel's most important recent gains, the scheduler. The kernel scheduler strives to combine the even-handed, time-sharing characteristics of Unix, so that it can deal with many tasks and users, with the pre-emptive interrupts of a real-time operating system that can respond swiftly to unscheduled events. In commercial operating systems, these tend to be distinct functions. Linux wants to do both.

Contributors have been working on the scheduler for years, but one, Australian doctor Con Kolivas, made a splash in the open source community this summer by airing out, in a widely discussed Australian Personal Computer magazine article, the reasons he quit Linux development in frustration.

Kolivas' code for kernel 2.6.23, which he dubbed "-ck patchset," was reviewed by Ingo Molnar, a developer employed by Red Hat who has become one of the trusted Linux experts on schedulers, having previously contributed several of them. Molnar found the submission wanting when it came to the real-time aspects of scheduling but used it to produce his own version of a multipurpose scheduler. Such borrowing and grafting of other people's code is what Linux's General Public License was meant to encourage, and kernel maintainers try to pay tribute to their sources. Kolivas, who had been getting rejection slips on his proposed code, found the process aggravating.

Kolivas ran into something that can be a barrier to developers. He envisioned different schedulers being used depending on the task. Torvalds and his associates philosophically want basic functions to do things once and do them well, as opposed to generating alternative ways of doing them. That keeps maintenance simpler and interactions between the different kernel subsystems easier to predict. Torvalds imposes the discipline of that architecture, as do participants on the mailing lists where new code gets discussed--where Kolivas' code took a not-uncommon drubbing. "Some of the things said on the Linux kernel mailing list [about other developers' code] would probably get you fired at a commercial company," says Joel Berman, who watches the list as Red Hat's product management director.

What emerged in 2.6.23 was the Completely Fair Scheduler, a name that's in part an ironic comment on the need to make trade-offs in a scheduler. Just as Kolivas was displeased, so were those who want better real-time performance. They're hoping improvements on that front will be added next year.

VIRTUALIZATION ON A FAST TRACK
Contrast the years of jostling around the scheduler with the experience of Avi Kivity, an Israeli developer who submitted a large, 12,000-line batch of code called the KVM virtualization engine. It helps to be known to kernel developers and maintainers when submitting a patch, but "KVM came out of the blue," says Morton. "I had never heard of him or his company [Qumranet] before."

Kivity describes himself as a "longtime lurker" on the Linux kernel mailing list, reading it avidly and noting its expert personalities and debates without submitting much code himself. He designed KVM to what he considered kernel standards, kept the kernel's file system expert abreast of progress on the code, and responded immediately to questions and comments from kernel maintainers. KVM addressed an important need in Linux given the rush of interest in virtualization, giving the kernel its first features to exploit the latest virtualization hooks in the Intel and AMD chips. It also artfully made use of the kernel's scheduler and memory manager and affected little else in the operating system. The result was that KVM sailed into the kernel less than three months after its submission last fall.

Adding code from a little-known author and a fledgling company was a risk, Morton says, since both could fade away, leaving no one with expertise in the code. But given the code's standalone approach, developers could just as simply remove it if it withered.

Even when code like KVM zips into the kernel, there can be a lag of a year or two before it's picked up by one of the top two enterprise distributions, Red Hat Enterprise Linux and SUSE Linux Enterprise Server. ("Community releases" such as Red Hat Fedora and Novell OpenSUSE are updated quickly.) That's to allow for extensive testing and support materials. Many businesses are happy to have that stability and hold off on having the latest and greatest kernel.

And yet Linux races ahead, with developers pouring new features into the kernel in the name of fame, curiosity, and sometimes salary. Over a recent 28-month period in which 11 new kernels appeared, the number of identifiable individual contributors went from 479 to 838. And for every person who gets his or her name on a block of code, there are probably three or four people who helped that person, goes the common estimate. That means 3,000 or so people are involved in each iteration of the kernel.
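The figures in this paragraph can be checked with a back-of-the-envelope calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the "3,000 or so" estimate counts helpers at three to four per named contributor, applied to the 838 figure. The article does not say exactly how the estimate was derived.

```python
def months_per_release(months: int, releases: int) -> float:
    """Average gap between kernel releases over a period."""
    return months / releases


def percent_growth(before: int, after: int) -> float:
    """Percentage increase from one head count to another."""
    return (after - before) / before * 100


def helper_estimate(named: int, helpers_per_named: int) -> int:
    """People who helped, given a helper ratio per named contributor."""
    return named * helpers_per_named


if __name__ == "__main__":
    # 11 kernels in 28 months
    print(round(months_per_release(28, 11), 1))  # 2.5
    # named contributors grew from 479 to 838
    print(round(percent_growth(479, 838)))       # 75
    # three to four helpers for each of 838 named contributors
    print(helper_estimate(838, 3))               # 2514
    print(helper_estimate(838, 4))               # 3352
```

On this reading, the helper range of roughly 2,500 to 3,350 brackets the quoted 3,000. Counting the named contributors as well would push the total higher, to between 3,352 and 4,190.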

It's that volunteer community that the Linux movement's still counting on, even as the kernel maintainers, the skilled developers who lead Linux subsystems, are paid by companies such as Google, Hewlett-Packard, IBM, Novell, and Red Hat. That community is why Morton says there's not a "direct trade-off" between speed of development and reliability, since getting features out sooner gets them hammered on long before they'd show up in a business.

Still, there's a drawback, compared with commercial code. "I don't want to call it unpredictability, but you can't guarantee a delivery date," Intel's Hohndel says. "Linux code is delivered when it's ready."

In another two or three months, Torvalds will issue kernel 2.6.24, with a dozen or so new features produced and tested with the help of hundreds of developers who weren't involved in this month's release. There's no way to know how much of it, if any, will eventually make it into business-tested versions. It's not really what anyone would call a product "road map." But so far, it hasn't steered businesses wrong.

Illustration by Dale Stephanos
