NewsForge Interview with BitMover CEO Larry McVoy
Original Articles On NewsForge
BitKeeper after the storm - Part 1 By Joe Barr
BitKeeper after the storm - Part 2 By Joe Barr
BitKeeper after the storm - Part 1
Tuesday May 11, 2004 (12:00 PM GMT)
Topics: Linux, Software, Operating Systems
By: Joe Barr
It has been a couple of years since the Linux kernel mailing list
was debating the issues of Linus Torvalds' scalability and the use
of a proprietary source management tool called BitKeeper to handle
kernel patches. Now that the dust has settled, and intrigued by a press release
from BitKeeper author Larry McVoy that claimed impressive productivity
gains for Linus Torvalds and other kernel hackers using BitKeeper,
NewsForge decided it was time to talk with McVoy on the current state
of affairs between the free software hackers and his proprietary code.
This is part one of that interview; part two will appear tomorrow.
NF: It has been about a month since the press release with the startling
news that BitKeeper more than doubled Linus Torvalds' productivity.
What was the reaction to the news by other kernel hackers?
McVoy: It wasn't really news to the senior developers. They already knew.
Here's how that announcement came about. I asked someone we were
considering hiring why he wanted to come work for us. His response was,
"I hang out on the kernel list and it is obvious that Linus is ten times
more effective since he switched to BitKeeper." That sounded pretty nice,
but I didn't believe it. I knew things were better, but ten times better?
That sounded a little too good to be true.
I know some of the senior kernel people personally so I started asking
around. I spoke with Dave Miller, Jeff Garzik, Greg Kroah-Hartman,
Andrew Morton, and Linus about this. Dave was the first person I spoke
with and he said that he thought that 10x wasn't at all unlikely, and
it was certainly 8x. Interesting. So I talked to Jeff and
his comment was, "Oh, man, it's so much better, it has to be 10x."
Greg had a fairly similar reaction. I was having lunch with Linus,
Andrew and Ted Ts'o to talk about digital signatures for the kernel (those
are implemented now, by the way) and I brought this up as a question.
Andrew thought that anything would have been an improvement over what
Linus was doing before and he agreed that BitKeeper was a lot better
than CVS. But his take was that just a move to CVS would have been an
improvement. Linus disagreed. Linus was adamant that if he had moved
to CVS it would have slowed him down. So in Linus's mind, whatever
improvement had happened was due to BitKeeper.
Greg has written a paper about the rate of change since the switch to
BitKeeper. He has a lot to say about how BitKeeper has helped -- you
might ping him for details. Some of the things I remember are:
- It's a lot easier to track what Linus is doing because you can see
his tree long before an initial release. Linus pushes close to daily to linus.bkbits.net.
- Independent development works much better. You can just use BK to
do the merges and track what is in Linus's tree.
- It's trivial to see if Linus has merged your changes.
- The whole system is asynchronous; you can do work while someone else does work and BK will merge it for you when you sync up.
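The merge-on-sync behavior in the last point can be illustrated with a minimal three-way merge. This is a sketch of the general technique only, not BitKeeper's actual algorithm; the function name, the dict-of-files representation, and the sample file names are all assumptions made for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge two trees that diverged from a common base.

    Each tree is a dict mapping a file path to its content (hypothetical
    representation). Returns (merged_tree, conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o      # both sides agree, or neither touched the file
        elif o == b:
            result = t      # only the other side changed it
        elif t == b:
            result = o      # only our side changed it
        else:
            conflicts.append(path)  # both changed it differently
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts


# Two developers work independently from the same starting tree.
base = {"Makefile": "v1", "net.c": "v1"}
ours = {"Makefile": "v2", "net.c": "v1"}
theirs = {"Makefile": "v1", "net.c": "v3", "usb.c": "v1"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged, conflicts)
```

Because each side touched different files, both sets of changes survive and nothing is flagged as a conflict; only a file changed differently on both sides would need a human decision.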
The senior developers were well aware that things are better. The 2x
announcement wasn't news at all. From their point of view, the 2x claim
is an understatement because for them the improvement is bigger than that.
But any claim is likely to be challenged, so what we did to arrive at that
number was to simply measure the amount of change over the two-year period
in BitKeeper and contrast that with the two-year period before BitKeeper.
It worked out to about 2.5x more change. The metric I'd love to have
is the number of patches integrated. We're all in agreement that it is
far more than 2.5x more than it was before. Linus is processing around
50 patches a day, 365 days a year. That's an amazingly high number.
Nobody in the software industry has ever processed that much change to
my knowledge and I have worked at SCO, Sun, SGI, and Google as well as
a few smaller companies.
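The throughput figure quoted above is easy to put on an annual scale. The sketch below only multiplies out the numbers given in the interview; the variable names are illustrative.

```python
# Figures quoted in the interview: about 50 patches a day, every day of the year.
PATCHES_PER_DAY = 50
DAYS_PER_YEAR = 365

patches_per_year = PATCHES_PER_DAY * DAYS_PER_YEAR
print(patches_per_year)  # 18250
```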
Before we made the productivity announcement we talked it over with Linus.
It was Greg who suggested the idea of measuring the diffs as a way of
getting a quantitative handle on the problem. I showed the numbers to
Linus and asked him if he agreed and he did. So that's how it happened.
NF: What is the level of acceptance for BitKeeper now as opposed to when
Torvalds first announced he was going to use it?
McVoy: Most people use it so the acceptance level seems high. There was concern
at the beginning that maybe we were trying to exploit the kernel team
somehow. Our position has always been that we were and are sincere
in our desire to help the community. Nobody believes in a free lunch,
so many people try to figure out "what's the catch?" The more vocal
we were regarding our sincerity, the more suspicious people became.
That's human nature and there isn't much we can do about it other than
continue to demonstrate that we will do the right thing. I used to
work at Google and their "do no evil" motto is something that I took
away from them. It's a good way to run a business, but it makes people
wonder a bit. People expect corporations to be "evil," but not all of
the corporations are evil. Google is a very visible company trying to
do the right thing; we're a far less visible company but we are also
trying to do the right thing. It is possible to do the right thing and
make money and maybe Google's example will inspire other companies to
follow suit.
I believe that a lot of the concerns have faded away because it is
years later and we are still here and still supporting the free use
of BitKeeper. Linus has used BK for more than two years, but the PowerPC
folks have been in BK since 1999, so we have been supporting kernel
people in BK for at least four years. The MySQL folks have been using BK
for about the same amount of time, so it ought to be clear that we are
committed to helping the free software community.
It's worth pointing out that we are profitable and have no outside
investors. That means that we, the employees and owners of BitKeeper,
decide if it is a good idea to support the kernel and the other free
users. We, not some outside money-focused investors, decide if what
we are doing is a good thing. And we like the free software community.
There are some people who will always be worked up about any
infrastructure that isn't GPLed. We understand their concerns and
that's why we built the BK2CVS gateway. That way people know that, no
matter what, they have the history in a GPLed tool. We do the export
nightly and mirror the CVS root to master.kernel.org so there is simply
no question that the data is available in a free form.
Along with BitKeeper itself, we provide bkbits.net, a free hosting service
for BK repositories (Linux is there, MySQL is there, so are lots of
other projects), and we provide a free public server (kernel.bkbits.net)
that anyone can use if they are working on the kernel, BK user or not.
The amount of service that we provide for free should, in theory, help
convince people that our intentions are good and we are really trying to
help the community of free software developers. People didn't trust that
initially, but the longer we keep helping the more people tend to trust us.
NF: Is your pro bono work for Linux kernel development paying off in
sales of your proprietary product?
McVoy: Absolutely. People look at how the kernel is being managed in BK and
they believe that if BK can do that then it can handle their problems
as well. A big marketing win for us is bkbits.net, our free hosting
service. Managers look at that and at the sheer volume of the data
(6 million files in 55GB of data) and when they learn that we spend
less than a man-week a year on supporting bkbits.net, they are sold.
That's a good thing for everyone; we're providing a useful service and
we get some marketing value from it.
We derive benefit from the pro bono work in other ways as well. When we
are testing out a new release we can put it on bkbits.net and we know in
seconds if we have broken something important; people use old versions
of BK to talk to bkbits.net every few seconds.
We are strongly committed to helping the Linux kernel community and
other open source projects. Not everyone may believe this, but we'd be
doing it even if there was no benefit to us. It is our way of giving
back some value for all the great free software we use every day.
We run our business on free software, we develop our product with free
software, the free software community has been great for our business.
All companies who benefit from free software ought to find a way to help
the people who are producing that software.
I'm aware that some in the community would prefer that we gave back by
adding to the pool of free software, but our product space doesn't seem to
work well in that model. So we give back in other ways. The majority
of the people in the community have come to trust that we will continue
to do so.
BitKeeper after the storm - Part 2
Wednesday May 12, 2004 (10:00 AM GMT)
Topics: Linux, Software, Operating Systems
By: Joe Barr
In Part 1 of this interview, we learned just how much Linus Torvalds and others have increased their productivity through the use of BitKeeper to handle kernel patches. In this conclusion to the interview, we examine the consequences of that increase. Is it good or bad for the Linux kernel that more patches than ever are being applied? Both Larry McVoy, author of BitKeeper, and Linus Torvalds, creator of Linux, offer their opinions.
NF: Thinking back to the chant "Linus doesn't scale," and given the clear demonstration that with the right tools he has scaled, is there any concern on your part that the accelerated pace of Linux development we're seeing today might be taking too great a toll on Linus, or that the quality of Linux might suffer?
McVoy: Good questions. I'm going to answer in opposite order because the first
one is a longer answer.
I don't think that the quality is suffering; we run our company on Linux
and we see Linux steadily improving. There are definitely things going into
the kernel that I don't agree with (mostly fake realtime stuff or
fine-grained threading that less than .001% of the machines in the world
will ever use) but I'm not the guy who gets to choose. So if I leave
my personal views aside and try to be objective about it, it certainly
seems to me that the kernel keeps getting better. 2.6 looks pretty good
and the rate of change is dramatically higher. If the faster pace was
going to cause problems, I suspect it would have done so by now.
The first question is more involved. The short answer is that I think
that rather than taking a toll, Linus is more relaxed and able to spend
more time doing what he should do: educating people, teaching them good taste,
acting as a filter, etc. He and I talk periodically and he certainly seems
more relaxed to me. I've seen him take interest in people issues that he
would have let slide when he was under more pressure.
The longer answer, which addresses why the increased pace is not
taking a toll on Linus, requires some background. If you look at software
development, there are two common models, each optimizing one thing at
the expense of the other. I call the two models "maintainer" and
"commercial."
Development models
The maintainer model is one where all the code goes through one person who
acts as a filter. This model is used by many open source projects where
there is an acknowledged leader who asserts control over the source base.
The advantage of this model is that the source base doesn't turn into
a mess. The bad changes are filtered out. The disadvantage is that it
is slow; you are going only as fast as the maintainer can filter.
The commercial model is one where changes are pushed into the tree as
fast as possible. This could be called the "time to market model."
Many commercial efforts start out in maintainer mode but then switch
to commercial mode because in the commercial world, time to market
is critical. The advantage of the commercial model is speed (gets to
market first) and the disadvantage is a loss of quality control.
- Commercial model: Very fast, lower quality
- Maintainer model: Slow, higher quality
- Maintainer+BK: Fast, higher quality
Scaling development
Everyone knows that small team development works well but problems
emerge as the team grows. With a team of five or six people, filtering all
changes works fine -- one person can handle the load.
What happens when you try to grow the team? Commercial and open source
efforts diverge at this point, but both have growing pains.
The commercial approach is to abandon the filtering process and move
quickly to get something out the door. It's simply not effective to
try and filter the work of a few hundred developers through one person;
nobody can keep up with that load. The commercial world has tried many
different ways to have their cake and eat it too. Management would love
to have speed and quality, but the reality is that if they get speed then
they sacrifice quality.
The maintainer-model process has scaling problems as well. It works as long
as the maintainer can keep up and then it starts to fall apart. For a
lot of open source projects, it works really well because the projects
never get above five or six people. That may seem small, but the reality is
that most good things have come from small teams. But some projects are
bigger than that: the Linux kernel, X11, KDE, Gnome, etc. Some projects
are much larger -- the 2.5/2.6 branch of the Linux kernel shows more than 1,400 different people who have committed using BitKeeper.
It is obvious that trying to keep up with the efforts of more than
1,000 people is impossible for one person, so how do maintainer-model
projects scale? They divide and conquer. Imagine a basic building block
consisting of a set of workers and a maintainer. I think of these as
triangles with the maintainer at the top and the workers along the bottom.
You can start out with a maintainer and a couple of workers and you keep
adding until you can't fit any more in the triangle. When the triangle is
full you create another layer of maintainers. The top triangle is filled
with the ultimate maintainer who then delegates to sub-maintainers.
So what were workers are now the first line of maintainers. Each of
those sub-maintainers is leader of a second level triangle, and there are
several of those below the top triangle. All I'm describing is a log(N)
fan-in model where the same process of filtering is applied in layers.
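The log(N) growth of the layered triangles can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the fan-in of six per maintainer is borrowed from the "five or six people" figure mentioned earlier, not from any measurement of the kernel hierarchy.

```python
def maintainer_layers(contributors: int, fan_in: int) -> int:
    """Smallest number of maintainer layers such that fan_in ** layers
    is at least the number of contributors at the bottom of the tree."""
    if contributors < 1 or fan_in < 2:
        raise ValueError("need at least 1 contributor and a fan-in of 2 or more")
    layers, capacity = 1, fan_in
    while capacity < contributors:
        capacity *= fan_in   # adding a layer multiplies capacity by the fan-in
        layers += 1
    return layers


# Assumed fan-in of 6; 1,400 is the committer count quoted in the interview.
print(maintainer_layers(6, 6))          # 1
print(maintainer_layers(1400, 6))       # 5
print(maintainer_layers(1_000_000, 6))  # 8
```

Going from 1,400 contributors to a million adds only three layers under this assumption, which is the sense in which a log(N) structure handles large N.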
The Linux kernel had moved to this model before they started using
BitKeeper and it was troublesome. What is not explicitly stated in the
layered maintainer model is that as you add these layers the workers are
farther away from the authoritative version of the tree and all versions
of the tree are changing. The farther away from the tree the more merging
is required to put all the versions together. The sub-maintainers
of Linux, who are the usual suspects like Dave Miller, Greg KH, Jeff
Garzik, etc., were in "merge hell" every time Linus did a new release.
Maintainer mode worked quite well for small teams but as it scaled up,
the divide and conquer solution forced the sub-maintainers to pay the price
in repeated and difficult merging.
Scaling maintainer mode with BitKeeper
BitKeeper was designed with the maintainer model in mind, to enable that model
(among others) by removing some of the repeated work such as merging.
We knew that the maintainer model would be dominated by trees with various
differences being merged and remerged constantly, so good merging had to
be a key BitKeeper feature. BitKeeper is sufficiently better at merging that it
allows the model described above to work and to scale into the hundreds
or thousands of developers. The fact that BitKeeper works well in this
model is a big part of why the sub-maintainers all thought things were
ten times better. For them, it was easily ten times better because
BitKeeper was doing all the merging for them, so they were doing much less
work. The sub-maintainers had been doing more of that work, and BitKeeper
made most of it go away, so the improvement for them was dramatic.
The fan-in/fan-out variation of the maintainer model is the way that
Linus reduces his load. A sub-maintainer emerges as someone who can be
trusted, a sub-section of the kernel is spun off as a somewhat autonomous
sub-project, Linus works with that person to make sure that the filtering
is done well, and the development scales a little further.
The point of this long-winded response to your question is to explain
why the increased rate of change hasn't taken a toll on Linus. If a
tool can support the maintainer plus multiple sub-maintainers (and even
sub-sub-maintainers and so on) then the top-level maintainer can learn
over time which of his sub-maintainers can be trusted to do a good job
of filtering. There are some people from whom Linus pulls changes
and more or less trusts the changes without review. He's counting on
those sub-maintainers to have filtered out the bad changes and he has
learned which ones actually do it. There are other people who send
in patches and Linus reads every line of the patch carefully.
If I've done a good job explaining, then you can see how this model can
scale. It's log(N), and log(N) approaches can handle very big Ns easily.
The goal of the model is to make sure that changes can happen quickly
but be carefully filtered even with a large number of developers.
Without BitKeeper doing a lot of the grunt work, a project has to choose
between the faster commercial model and the more careful maintainer model,
but with BitKeeper you get to have your cake and eat it too. The process
moves fast, close to as fast as the commercial model, but without losing
the quality control that is so crucial to any source base, large or small.
To some extent, Linus's job becomes one of working with sub-maintainers
to make sure they are as good as he is at filtering. He still does a
lot of "real work" himself but he is scaling by enabling other people
to do part of his job.
NF: Linus, since the number of patches handled has gone up so dramatically, do you still have time to give them the same sort of attention you did the old way?
Torvalds: Larry already answered, I'll just throw in my 2 cents.
To me, the big thing BK allows me to do is to not worry about the people I
trust, and who actively maintain their own subsystems. Merging with them
is easier on both sides, without losing any history or commentary.
So the answer to your question is that to a large degree BK makes it much
easier to give attention to those patches that need it, by allowing me
to not have to care about every single patch. That, in turn, is what makes
it possible for me to take many more patches.
So in that sense, I don't give the "same sort of attention" that I did in
the old way. But that's the whole point -- allowing me (and others, for
that matter) to scale better, exactly because I can direct the attention.
A lot of my time used to be taken up by the "obvious" patches -- patches
that came in from major subsystem maintainers that I trusted. That has
always been the bulk of the work, and the patches that require attention
are comparatively few. But when everything was done with patches, I
basically needed to do the same thing for the "hard" cases as for the
"easy" ones. And a fair amount of the work was just looking at the email
to decide into which category it fell.
That's where BK helps.
There is another part to it too -- BK allows me to give much more control
to the people I trust, without losing track of what is going on.
Traditionally, when you have multiple people working on the same source
tree, they all have "write access" to whatever source control management
system they use. That in turn leads to having to have strict "commit"
rules, since nobody wants anybody else to actually make changes without
having had a chance to review the changes. That in turn tends to mean that
the limiting point becomes the "commit review" process. Either the
process is very lax ("we'll fix the problems later," which never works),
or the process is so strict that it puts a brake on everybody.
In contrast, the distributed nature of BK means that I don't give any
"write access" to anybody up-front, but once they are done and explain
what they did, we can both just synchronize, and there is no issue of
patches being "stuck" in the review process.
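The contrast with shared "write access" can be sketched as a toy model in which every developer owns a repository and changes move only when the receiving side pulls. This is an illustrative assumption about the general shape of a distributed workflow, not BitKeeper's data model; the Repo class, its methods, and the changeset names are all hypothetical.

```python
class Repo:
    """A repository modeled as a named set of changeset IDs (toy model)."""

    def __init__(self, name, changesets=()):
        self.name = name
        self.changesets = set(changesets)

    def commit(self, cset_id):
        # Only the owner commits to their own repository.
        self.changesets.add(cset_id)

    def pull(self, other):
        """Copy the changesets we lack from `other`; return them sorted."""
        missing = other.changesets - self.changesets
        self.changesets |= missing
        return sorted(missing)


# A sub-maintainer clones the top-level tree, then both sides keep working.
linus = Repo("linus", {"base"})
dave = Repo("dave", linus.changesets)
dave.commit("net-fix")
linus.commit("sched-fix")

# Nobody writes into anyone else's tree; each side pulls when ready.
print(linus.pull(dave))   # ['net-fix']
print(dave.pull(linus))   # ['sched-fix']
print(linus.changesets == dave.changesets)  # True
```

Neither developer ever held write access to the other's tree, yet after one pull in each direction both hold the same set of changes.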
So not only does BK allow me to concentrate my attention on stuff I feel I
need to think about (or the other side asks me to think about, which is
actually more common), but it also allows me to literally give people more
control. That makes it much easier to pass the maintenance around more,
which is, after all, what it's all about.