A T-shirt once offered this wisdom to the world: "If you love someone, set them free. If they come back to you, it was meant to be. If they don't come back, hunt them down and kill them." The world of free software revolves around letting your source code go off into the world. If things go well, others will love the source code, shower it with bug fixes, and send all of this hard work flowing back to you. It will be a shining example of harmony and another reason why the free software world is great. But if things don't work out, someone might fork you and there's nothing you can do about it.
"Fork" is a UNIX command that allows you to split a job in half. UNIX is an operating system that allows several people to use the same computer to do different tasks, and the operating system pretends to run them simultaneously by quickly jumping from task to task. A typical UNIX computer has at least 100 different tasks running. Some watch the network for incoming data, some run programs for the user, some watch over the file system, and others do many menial tasks.
If you "fork a job," you arrange to split it into two parts that the computer treats as two separate jobs. This can be quite useful if both jobs are often interrupted, because one can continue while the other one stalls. This solution is great if two tasks, A and B, need to be accomplished independently of each other. If you use one task and try to accomplish A first, then B won't start until A finishes. This can be quite inefficient if A stalls. A better solution is to fork the job and treat A and B as two separate tasks.
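For readers who want to see the idea in code: on UNIX, this kind of split is performed by the fork() system call, which Python exposes as os.fork() on UNIX-like systems. The sketch below is only an illustration. The helper name run_forked and the two task arguments are made up for this example. Task A runs in a new child process while task B carries on in the parent, so a stall in A does not hold up B.

```python
import os

def run_forked(task_a, task_b):
    """Run task_a in a forked child process and task_b in the parent.

    Returns (result of task_b, exit code of the child running task_a).
    Hypothetical helper for illustration; UNIX-like systems only.
    """
    pid = os.fork()  # returns 0 in the child, the child's pid in the parent
    if pid == 0:
        # Child: do task A, then exit immediately without returning
        # into the caller's code.
        status = 1
        try:
            task_a()
            status = 0
        finally:
            os._exit(status)

    # Parent: task B proceeds without waiting for A to finish first.
    result_b = task_b()

    # Collect the child so it does not linger as a zombie process.
    _, wait_status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(wait_status) if os.WIFEXITED(wait_status) else -1
    return result_b, exit_code
```

A call such as run_forked(fetch_mail, redraw_screen), with both function names again hypothetical, would let the screen redraw even while the mail fetch waits on the network.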
Most programmers don't spend much time talking about these kinds of forks. They're mainly concerned about forks in the political process.
Programmers use "fork" to describe a similar process in the organization of a project, but the meaning is quite different. Forks of a team mean that the group splits and goes in different directions. One part might concentrate on adding support for buzzword Alpha while the other might aim for full buzzword Beta compatibility.
In some cases, there are deep divisions behind the decision to fork. One group thinks buzzword Alpha is a sloppy, brain-dead kludge job that's going to blow up in a few years. The other group hates buzzword Beta with a passion. Disputes like this happen all the time. They often get resolved peacefully when someone comes up with buzzword Gamma, which eclipses them both. When no Gamma arrives, people start talking about going their separate ways and forking the source. Once the dust settles, two different versions start appearing on the Net competing with each other for the hearts and CPUs of the folks out there. Sometimes the differences between the versions are great and sometimes they're small. But there's now a fork in the evolution of the source code, and people have to start making choices.
The free software community has a strange attitude toward forks. On one hand, forking is the whole reason Stallman wrote the free software manifesto. He wanted the right and the ability to mess around with the software on his computer. He wanted to be free to change it, modify it, and tear it to shreds if he felt like doing it one afternoon. No one should be able to stop him from doing that. He wanted to be totally free.
On the other hand, forking can hurt the community by duplicating efforts, splitting alliances, and sowing confusion in the minds of users. If Bob starts writing and publishing his own version of Linux out of his house, then he's taking some energy away from the main version. People start wondering if the version they're running is the Missouri Synod version of Emacs or the Christian Baptist version. Where do they send bug fixes? Who's in charge? Distribution groups like Debian or Red Hat have to spend a few moments trying to decide whether they want to include one version or the other. If they include both, they have to choose one as the default. Sometimes they just throw up their hands and forget about both. It's a civil war, and those are always worse than a plain old war.
Some forks evolve out of personalities that just rub each other the wrong way. I've heard time and time again, "Oh, we had to kick him out of the group because he was offending people." Many members of the community consider this kind of forking bad. They use the same tone of voice to describe a fork of the source code as they use to describe the breakup of two lovers. It is sad, unfortunate, unpleasant, and something we'll never really understand because we weren't there. Sometimes people take sides because they have a strong opinion about who is right. They'll usually go off and start contributing to that code fork. In other cases, people don't know which to pick and they just close their eyes and join the one with the cutest logo.
Eric Raymond once got in a big fight with Richard Stallman about the structure of Emacs Lisp. Raymond said, "The Lisp libraries were in bad shape in a number of ways. They were poorly documented. There was a lot of work that had gone on outside the FSF that should be integrated and I wanted to merge in the best work from outside."
The problem is that Stallman didn't want any part of Raymond's work. "He just said, 'I won't take those changes into the distribution.' That's his privilege to do," Raymond said.
That put Raymond in an awkward position. He could continue to do the work, create his own distribution of Emacs, and publicly break with Stallman. If he were right and the Lisp code really needed work, then he would probably find more than a few folks who would cheer his work. They might start following him by downloading his distribution and sending their bug fixes his way. Of course, if he were wrong, he would set up his own web server, do all the work, put his Lisp fixes out there, and find that no one would show up. He would be ignored because people found it easier to just download Stallman's version of Emacs, which everyone thought was sort of the official version, if one could be said to exist. They didn't use the Lisp feature too much so it wasn't worth thinking about how some guy in Pennsylvania had fixed it. They were getting the real thing from the big man himself.
Of course, something in between would probably happen. Some folks who cared about Lisp would make a point of downloading Raymond's version. The rest of the world would just go on using the regular version. In time, Stallman might soften and embrace the changes, but he might not. Perhaps someone would come along and create a third distribution that melded Raymond's changes with Stallman's into a harmonious version. That would be a great thing, except that it would force everyone to choose from among three different versions.
In the end, Raymond decided to forget about his improvements. "Emacs is too large and too complicated and forking is bad. There was in fact one group that got so fed up with working with him that they did fork Emacs. That's why XEmacs exists. But major forks like that are rare events and I didn't want to be part of perpetrating another one," he said. Someone else was going to have to start the civil war by firing those shots at Fort Sumter.
Some forks aren't so bad. There often comes a time when people have legitimate reasons to go down different paths. What's legitimate and what's not is often decided after a big argument, but the standard reasons are the same ones that drive programming projects. A good fork should make a computer run software a gazillion times faster. Or it might make the code much easier to port to a new platform. Or it might make the code more secure. There are a thousand different reasons, and it's impossible to really measure which is the right one. The only true measure is the number of people who follow each branch of the fork. If a project has a number of good disciples and the bug fixes are coming quickly, then people tend to assume it is legitimate.
The various versions of the BSD software distribution are some of the more famous splits around. All are descended, in one way or another, from the original versions of UNIX that came out of Berkeley. Most of the current ones evolved from the 4.3BSD version and the Network Release 2 and some integrated code from the 4.4BSD release after it became free. All benefited from the work of the hundreds of folks who spent their free time cloning the features controlled by AT&T. All of them are controlled by the same loose BSD license that gives people the right to do pretty much anything they want to the code. All of them share the same cute daemon as a mascot.
That's where the similarities end. The FreeBSD project is arguably the most successful version. It gets a fairly wide distribution because its developers have a good deal with Walnut Creek CD-ROM Distributors, a company that packages up large bundles of freeware and shareware on the Net and then sells them on CD-ROM. The system is well known and widely used because the FreeBSD team concentrates on making the software easy to use and install on Intel computers. Lately, they've created an Alpha version, but most of the users run the software on x86 chips. Yahoo! uses FreeBSD.
FreeBSD, of course, began as a fork of an earlier project known as 386BSD, started by Bill Jolitz. This version of BSD was more of an academic example or a proof-of-concept than a big open source project designed to take over the world.
Jordan Hubbard, someone who would come along later to create a fork of 386BSD, said of Jolitz's decision to create a 386-based fork of BSD, "Bill's real contribution was working with the 386 port. He was kind of an outsider. No one else saw the 386 as interesting. Berkeley had a myopic attitude toward PCs. They were just toys. No one would support Intel. That was the climate at the time. No one really took PCs seriously. Bill's contribution was to realize that PCs were going places."
From the beginning, Hubbard and several others saw the genius in creating a 386 version of BSD that ran on the cheapest hardware available. They started adding features and gluing in bug fixes, which they distributed as a file that modified the main 386BSD distribution from Jolitz. This was practical at the beginning when the changes were few, but it continued out of respect for the original creator, even after the patches grew complicated.
Finally, a tussle flared up in 1993. Jordan Hubbard, one of the forkers, writes in his history of the project,
386BSD was Bill Jolitz's operating system, which had been up to that point suffering rather severely from almost a year's worth of neglect. As the patchkit swelled ever more uncomfortably with each passing day, we were in unanimous agreement that something had to be done and decided to try and assist Bill by providing this interim "cleanup" snapshot. Those plans came to a rude halt when Bill Jolitz suddenly decided to withdraw his sanction from the project and without any clear indication of what would be done instead.
The FreeBSD team pressed on despite the denial. They decided to fork. Today, 386BSD is largely part of the history of computing while FreeBSD is a living, current OS, at least at the time this book was written. The FreeBSD team has done a good job distributing bug-free versions, and they've been paid off in loyalty, disciples, and money and computers from Walnut Creek. Forking can often be good for society because it prevents one person or clique from thwarting another group. The free software world is filled with many of the same stories of politics that float across the watercoolers of corporations, but the stories don't have to end the same way. If one boss or group tries to shut down a free software project, it really can't. The source code is freely available, and people are free to carry on. The FreeBSD project is one example.
Of course, good software can have anti-forking effects. Linus Torvalds said in one interview, "Actually, I have never even checked 386BSD out; when I started on Linux it wasn't available (although Bill Jolitz's series on it in Dr. Dobbs Journal had started and were interesting), and when 386BSD finally came out, Linux was already in a state where it was so usable that I never really thought about switching. If 386BSD had been available when I started on Linux, Linux would probably never have happened." So if 386BSD had been easier to find on the Net and better supported, Linux might never have begun.
Once someone starts forking BSD, one fork is rarely enough. Another group known as NetBSD also grew fed up with the progress of 386BSD in 1993. While the FreeBSD folks concentrated on doing a good job on Intel boxes, the NetBSD team wanted to build a platform that ran well on many different machines, not just the Intel 386. Their slogan became "Of course it runs NetBSD."
NetBSD runs on practically every machine you can imagine, including older, less up-to-date machines like the Amiga and the Atari. It has also been embraced by companies like NeXT, which bundled parts of it into the version of the OS for the Macintosh known as Rhapsody. Of course, the most common chips like the Intel line and the Alpha are also well supported.
The NetBSD community emerged at the same time as the FreeBSD world. They didn't realize that each team was working on the same project at the same time. But once they started releasing their own versions, they stayed apart.
"The NetBSD group has always been the purest. They saw it as an OS research vehicle. That was what CSRG was doing. Their only mandate was to do interesting research," said Hubbard. "It's a very different set of goals than we concentrated on for the 386. The important thing for us was to polish it up. We put all of our efforts into polishing, not porting. This was part of our bringing BSD to the masses kind of thing. We're going for numbers. We're going for mass penetration."
This orientation meant that NetBSD never really achieved the same market domination as FreeBSD. The group only recently began shipping versions of NetBSD on CD-ROM. FreeBSD, on the other hand, has always excelled at attracting new and curious users thanks to their relationship with Walnut Creek. Many experimenters and open-minded users picked up one of the disks, and a few became excited enough to actually make some contributions. The Walnut Creek partnership also helped the FreeBSD team understand what it needed to do to make their distribution easier to install and simpler to use. That was Walnut Creek's business, after all.
The forking did not stop with NetBSD. Soon one member of the NetBSD world, Theo de Raadt, began to rub some people the wrong way. One member of the OpenBSD team told me, "The reason for the split from NetBSD was that Theo got kicked out. I don't understand it completely. More or less they say he was treating users on the mailing list badly. He does tend to be short and terse, but there's nothing wrong with that. He was one of the founding members of NetBSD and they asked him to resign."
Now, four years after the split began in 1995, de Raadt is still a bit hurt by their decision. He says about his decision to fork BSD again, "I had no choice. I really like what I do. I really like working with a community. At the time it all happened, I was the second most active developer in their source tree. They took the second most active developer and kicked him off."
Well, they didn't kick him out completely, but they did take away his ability to "commit" changes to the source tree and make them permanent. After the split, de Raadt had to e-mail his contributions to a member of the team so they could check them in. This didn't sit well with de Raadt, who saw it as both a demotion and a real impediment to doing work.
The root of the split is easy to see. De Raadt is energetic. He thinks and speaks quickly about everything. He has a clear view about most free software and isn't afraid to share it. While some BSD members are charitable and conciliatory to Richard Stallman, de Raadt doesn't bother to hide his contempt for Stallman's organization. "The Free Software Foundation is one of the most misnamed organizations," he says, explaining that only BSD-style licensees have the true freedom to do whatever they want with the software. The GNU General Public License is a pair of handcuffs to him.
De Raadt lives in Calgary and dresses up his personal web page with a picture of himself on top of a mountain wearing a bandanna. If you want to send him a pizza for any reason, he's posted the phone number of his favorite local shop (403/531-3131). Unfortunately, he reports that they don't take foreign credit card numbers anymore.
He even manages to come up with strong opinions about simple things that he ostensibly loves. Mountain biking is a big obsession, but, he says, "I like mud and despise 'wooded back-alleys' (what most people call logging roads)." That's not the best way to make friends with less extreme folks who enjoy a Sunday ride down logging roads.
If you like cats, don't read what he had to say about his pets: "I own cats. Their names are Galileo and Kepler--they're still kittens. Kepler--the little bitch--can apparently teleport through walls. Galileo is a rather cool monster. When they become full-grown cats I will make stew & soup out of them. (Kepler is only good for soup)."
Throwaway comments like this have strange effects on the Net, where text is the only way people can communicate. There are no facial gestures or tonal clues to tell people someone is joking around, and some people don't have well-developed scanners for irony or sarcasm. Some love the sniping and baiting, while others just get annoyed. They can't let snide comments slide off their back. Eventually, the good gentlefolk who feel that personal kindness and politeness should still count for something in this world get annoyed and start trying to do something.
It's easy to see how this affected the NetBSD folks, who conduct their business in a much more proper way. Charles Hannum, for instance, refused to talk to me about the schism unless I promised that he would be able to review the parts of the book that mentioned NetBSD. He also suggested that forks weren't particularly interesting and shouldn't be part of the book. Others begged off the questions with more polite letters saying that the split happened a long time ago and wasn't worth talking about anymore. Some pointed out that most of the members of the current NetBSD team weren't even around when the split happened.
While their silence may be quite prudent and a better way to spend a life, it certainly didn't help me get both sides of the story. I pointed out that they wouldn't accept code into the NetBSD tree if the author demanded the right to review the final distribution. I said they could issue a statement or conduct the interview by e-mail. One argued that there was no great problem if a few paragraphs had to be deleted from the book in the end. I pointed out that I couldn't give the hundreds of people I spoke with veto power over the manuscript. It would be impossible to complete. The book wasn't being written by a committee. No one at NetBSD budged.
De Raadt, on the other hand, spoke quite freely with no preconditions or limitations. He still keeps a log file with a good number of the e-mail letters exchanged during the separation and makes it easy to read them on his personal website. That's about as open as you can get. The NetBSD folks who refused to talk to me, by contrast, seemed intent on keeping control of the story. Their silence came from a different world than the website offering the phone number of the local pizza place as a hint. They were Dragnet; de Raadt was Politically Incorrect.
When the NetBSD folks decided to do something, they took away de Raadt's access to the source tree. He couldn't just poke around the code making changes as he went along. Well, he could poke around and make changes, but not to the official tree with the latest version. The project was open source, after all. He could download the latest release and start fiddling, but he couldn't make quasi-official decisions about what source was part of the latest official unreleased version.
De Raadt thought this was a real barrier to work. He couldn't see the latest version of the code because the official tree was closed to him. He was stuck with the last release, which might be several months old. That put him at an extreme disadvantage because he might start working on a problem only to discover that someone had either fixed it or changed it.
Chris Demetriou found himself with the task of kicking de Raadt off of the team. His letter, which can still be found on the OpenBSD site, said that de Raadt's rough behavior and abusive messages had driven away people who might have contributed to the project. Demetriou also refused to talk about NetBSD unless he could review the sections of the book that contained his comments. He also threatened to take all possible action against anyone who even quoted his letters in a commercial book without his permission.
De Raadt collected this note from Demetriou and the firestorm that followed in a 300k file that he keeps on his website. The NetBSD core tried to be polite and firm, but the matter soon degenerated into a seven-month-long flame war. After some time, people started having meta-arguments, debating whether the real argument was more or less like the bickering of a husband and wife who happen to work at the same company. Husbands and wives should keep their personal fights out of the workplace, they argued. And so they bickered over whether de Raadt's nastygrams were part of his "job" or just part of his social time.
Through it all, de Raadt tried to get back his access to the source tree of NetBSD and the group tried to propose all sorts of mechanisms for making sure he was making a "positive" contribution and getting along with everyone. At one time, they offered him a letter to sign. These negotiations went nowhere, as de Raadt objected to being forced to make promises that other contributors didn't have to.
De Raadt wrote free software because he wanted to be free to make changes or write code the way he wanted to do it. If he had wanted to wear the happy-face of a positive contributor, he could have gotten a job at a corporation. Giving up the right to get in flame wars and speak at will may not be that much of a trade-off for normal people with fulltime jobs. Normal folks swallow their pride daily. Normal people don't joke about turning their cats into soup. But de Raadt figured it was like losing a bit of his humanity and signing up willingly for a set of manacles. It just wasn't livable.
The argument lasted months. De Raadt felt that he tried and tried to rejoin the project without giving away his honor. The core NetBSD team argued that they just wanted to make sure he would be positive. They wanted to make sure he wouldn't drive away perfectly good contributors with brash antics. No one ever gained any ground in the negotiations and in the end, de Raadt was gone.
The good news is that the fork didn't end badly. De Raadt decided he wasn't going to take the demotion. He just couldn't do good work if he had to run all of his changes by one of the team that kicked him off the project. It took too long to ask "Mother, may I?" to fix every little bug. If he was going to have to run his own tree, he might as well go whole hog and start his own version of BSD. He called it OpenBSD. It was going to be completely open. There were going to be relatively few controls on the members. If the NetBSD core ran its world like the Puritan villagers in a Nathaniel Hawthorne story, then de Raadt was going to run his like Club Med.
OpenBSD struggled for several months as de Raadt tried to attract more designers and coders to his project. It was a battle for popularity in many ways, not unlike high school. When the cliques split, everyone had to pick and choose. De Raadt had to get some folks in his camp if he was going to make some lemonade.
The inspiration came to de Raadt one day when he discovered that the flame war archive on his web page was missing a few letters. He says that someone broke into his machine and made a few subtle deletions. Someone who had an intimate knowledge of the NetBSD system. Someone who cared about the image portrayed by the raw emotions in the supposedly private letters.
He clarifies his comments to make it clear that he's not sure it was someone from the NetBSD core. "I never pursued it. If it happens, it's your own fault. It's not their fault," he said. Of course, the folks from NetBSD refused to discuss this matter or answer questions unless they could review the chapter.
This break-in gave him a focus. De Raadt looked at NetBSD and decided that it was too insecure. He gathered a group of like-minded people and began to comb the code for potential insecurities.
"About the same time, I got involved with a company that wrote a network security scanner. Three of the people over there started playing with the source tree and searching for security holes. We started finding problems all over the place, so we started a comprehensive security audit. We started from the beginning. Our task load increased massively. At one time, I had five pieces of paper on my desk full of things to look for," he said.
Security holes in operating systems are strange beasts that usually appear by mistake when the programmer makes an unfounded assumption. One of the best-known holes is the buffer overflow, which became famous in 1988 after Robert Morris, then a graduate student at Cornell, unleashed a program that used the loophole to bring several important parts of the Internet to a crawl.
In this case, the programmer creates a buffer to hold all of the information that someone on the net might send. Web browsers, for instance, send requests like "GET http://www.nytimes.com" to ask for the home page of the New York Times website. The programmer must set aside some chunk of memory to hold this request, usually a block that is about 512 bytes long. The programmer chooses an amount that should be more than enough for all requests, including the strangest and most complicated.
Before the attack became well known, programmers would often ignore the length of the request and assume that 512 bytes was more than enough for anything. Who would ever type a URL that long? Who had an e-mail address that long?
Attackers soon figured out that they could send more than 512 bytes and started writing over the rest of the computer's memory. The program would dutifully take in 100,000 bytes and keep writing it to memory. An attacker could download any software and start it running. And attackers did this.
De Raadt and many others started combing the code for loopholes. They made sure every program that used a buffer included a bit of code that would check to ensure that no hacker was trying to sneak in more than the buffer could hold. They checked thousands of other possibilities. Every line was checked and changes were made even if there was no practical way for someone to get at the potential hole. Many buffers, for instance, only accept information from the person sitting at the terminal. The OpenBSD folks changed them, too.
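The difference between the old habit and the audited version can be sketched in a few lines. This is a simulation in Python, not actual OpenBSD code: Python checks its own memory accesses, so the "memory" here is a plain bytearray in which a 512-byte request buffer sits directly in front of some other program data. All of the names (make_memory, unchecked_copy, checked_copy) are invented for the illustration.

```python
BUFFER_SIZE = 512  # the block size used as the example in the text
PROGRAM_STATE = b"IMPORTANT PROGRAM STATE"

def make_memory():
    """Simulated memory: the request buffer, followed by other program data."""
    return bytearray(BUFFER_SIZE) + bytearray(PROGRAM_STATE)

def unchecked_copy(memory, request):
    """Copy the request without looking at its length, like the old code.

    Anything past BUFFER_SIZE lands on top of the neighboring data.
    (The simulation simply stops at the end of our pretend memory.)
    """
    n = min(len(request), len(memory))
    memory[:n] = request[:n]

def checked_copy(memory, request):
    """Refuse any request that would not fit in the buffer."""
    if len(request) > BUFFER_SIZE:
        raise ValueError("request is longer than the buffer")
    memory[:len(request)] = request
```

With the unchecked version, a 520-byte request overwrites the first eight bytes of the neighboring data. The checked version rejects the same request and leaves that data untouched, which is the kind of test the auditors added around each buffer.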
This audit began soon after the fork in 1995 and continues to this day. Most of the major work is done and the group likes to brag that they haven't had a hole that could be exploited remotely to gain root access in over two years. The latest logo boasts the tag line "Sending kiddies to /dev/null since 1995." That is, any attacker is going to go nowhere with OpenBSD because all of the extra information from the attacks would be routed to /dev/null, a UNIX conceit for being erased, ignored, and forgotten.
The OpenBSD fork is a good example of how bad political battles can end up solving some important technical problems. Everyone fretted and worried when de Raadt announced that he was forking the BSD world one more time. This would further dilute the resources and sow confusion among users. The concentration on security, however, gave OpenBSD a brand identity, and the other BSD distributions keep at least one eye on the bug fixes distributed by the OpenBSD team. These often lead to surreptitious fixes in their own distribution.
The focus also helped him attract new coders who were interested in security. "Some of them used to be crackers and they were really cool people. When they become eighteen, it becomes a federal offense, you know," de Raadt says.
This fork may have made the BSD community stronger because it effectively elevated the focus on security and cryptography to the highest level. In the corporate world, it's like taking the leader of the development team responsible for security and promoting him from senior manager to senior executive vice president of a separate division. The autonomy also gave the OpenBSD team the ability to make bold technical decisions for their own reasons. If they saw a potential security problem that might hurt usability or portability, the OpenBSD team could make the change without worrying that other team members would complain. OpenBSD was about security. If you wanted to work on portability, go to NetBSD. If you cared about ease-of-use on Intel boxes, go to FreeBSD. Creating a separate OpenBSD world made it possible to give security a strong focus.
It's a mistake to see these forks as absolute splits that never intermingle again. While NetBSD and OpenBSD continue to glower at each other across the Internet ether, the groups share code frequently because the licenses prevent one group from freezing out another.
Jason Wright, one of the OpenBSD developers, says, "We do watch each other's source trees. One of the things I do for fun is take drivers out of FreeBSD and port them to OpenBSD. Then we have support for a new piece of hardware."
He says he often looks for drivers written by Bill Paul, because "I've gotten used to his style. So I know what to change when I receive his code. I can do it in about five to six hours. That is, at least a rough port to test if it works."
Still, the work is not always simple. He says some device drivers are much harder to handle because the two groups have taken different approaches to the problem. "SCSI drivers are harder," he says. "There's been some divergence in the layering for SCSI. They're using something called CAM. We've got an older implementation that we've stuck to." That is, the FreeBSD team has reworked the way that SCSI information is shipped to the parts of the system asking for it. The OpenBSD team hasn't adopted those changes, perhaps because of security reasons or perhaps because of inertia or perhaps because no one has gotten around to thinking about it. The intermingling isn't perfect.
Both NetBSD and FreeBSD work on security, too. They also watch the change logs of OpenBSD and note when security holes are fixed. They also discover their own holes, and OpenBSD may use them as an inspiration to plug their own code. The discoveries and plugs go both ways as the groups compete to make a perfect OS.
Kirk McKusick says, "The NetBSD and the OpenBSD have extremely strong personalities. Each one is absolutely terrified the other will gain an inch."
While the three forks of BSD may cooperate more than they compete, the Linux world still likes to look at the BSD world with a bit of contempt. All of the forks look somewhat messy, even if having the freedom to fork is what Stallman and GNU are ostensibly fighting to achieve. The Linux enthusiasts seem to think, "We've got our ducks in a single row. What's your problem?" It's sort of like the Army mentality. If it's green, uniform, and the same everywhere, then it must be good.
The BSD world lacks the monomaniacal cohesion of Linux, and this seems to hurt its image. The BSD community has always felt that Linux is stealing the limelight that should be shared at least equally between the groups. Linux is really built around a cult of Linus Torvalds, and that makes great press. It's very easy for the press to take photos of one man and put him on the cover of a magazine. It's simple, clean, neat, and perfectly amenable to a 30-second sound bite. Explaining that there's FreeBSD, NetBSD, OpenBSD, and who knows what smaller versions waiting in the wings just isn't as manageable.
Eric Raymond, a true disciple of Linus Torvalds and Linux, sees it in technical terms. The BSD community is proud of the fact that each distribution is built out of one big source tree. They get all the source code for all the parts of the kernel, the utilities, the editors, and whatnot together in one place. Then they push the compile button and let people work. This is a crisp, effective, well-managed approach to the project.
The Linux groups, however, are not that coordinated at all. Torvalds only really worries about the kernel, which is his baby. Someone else worries about GCC. Everyone comes up with their own source trees for the parts. The distribution companies like Red Hat worry about gluing the mess together. It's not unusual to find version 2.0 of the kernel in one distribution while another is sporting version 2.2.
"In BSD, you can do a unified make. They're fairly proud of that," says Raymond. "But this creates rigidities that give people incentives to fork. The BSD things that are built that way develop new spin-off groups each week, while Linux, which is more loosely coupled, doesn't fork."
He elaborates, "Somebody pointed out that there's a parallel of politics. Rigid political and social institutions tend to change violently if they change at all, while ones with more play in them tend to change peacefully."
But this distinction may be semantic. Forking does occur in the Linux realm, but it happens as small diversions that get explained away with other words. Red Hat may choose to use GNOME, while another distribution like SuSE might choose KDE. The users will see a big difference because both tools create virtual desktop environments. You can't miss them. But people won't label this a fork. Both distributions are using the same Linux kernel and no one has gone off and said, "To hell with Linus, I'm going to build my own version of Linux." Everyone's technically still calling themselves Linux, even if they're building something that looks fairly different on the surface.
Jason Wright, one of the developers on the OpenBSD team, sees the organization as a good thing. "The one thing that all of the BSDs have over Linux is a unified source tree. We don't have Joe Blow's tree or Bob's tree," he says. In other words, when they fork, they do it officially, with great ceremony, and make sure the world knows of their separate creations. They make a clear break, and this makes it easier for developers.
Wright says that this single source tree made it much easier for them to turn OpenBSD into a very secure OS. "We've got the security over Linux. They've recently been doing a security audit for Linux, but they're going to have a lot more trouble. There's not one place to go for the source code."
To extend this to political terms, the Linux world is like the 1980s when Ronald Reagan ran the Republican party with the maxim that no one should ever criticize another Republican. Sure, people argued internally about taxes, abortion, crime, and the usual controversies, but they displayed a rare public cohesion. No one criticizes Torvalds, and everyone is careful to pay lip service to the importance of Linux cohesion even as they're essentially forking by choosing different packages.
The BSD world, on the other hand, is like the biblical realm in the Monty Python film Life of Brian. In it, one character enumerates the various splinter groups opposing the occupation by the Romans. There is the People's Front of Judea, the Judean People's Front, the Front of Judean People, and several others. All are after the same thing and all are manifestly separate. The BSD world may share a fair amount of code and it may share the same goals, but it presents its work as coming from three different camps.
John Gilmore, one of the founders of the free software company Cygnus and a firm believer in the advantages of the GNU General Public License, says, "In Linux, each package has a maintainer, and patches from all distributions go back through that maintainer. There is a sense of cohesion. People at each distribution work to reduce their differences from the version released by the maintainer. In the BSD world, each tree thinks they own each program--they don't send changes back to a central place because that violates the ego model."
Jordan Hubbard, the leader of FreeBSD, is critical of Raymond's characterization of the BSD world. "I've always had a special place in my heart for that paper because he painted positions that didn't exist," Hubbard said of Raymond's piece "The Cathedral and the Bazaar." "You could point to just the Linux community and decide which part was cathedral-oriented and which part was bazaar-oriented.
"Every single OS has cathedral parts and bazaar parts. There are some aspects of development that you leave deliberately unfocused and you let people contribute at their own pace. It's sort of a bubble-up model and that's the bazaar part. Then you have the organizational part of every project. That's the cathedral part. They're the gatekeepers and the standards setters. They're necessary, too," he said.
When it comes right down to it, there's even plenty of forking going on about the definition of a fork. When some of the Linux team point at the BSD world and start making fun about the forks, the BSD team gets defensive. The BSD guys always get defensive because their founder isn't on the cover of all the magazines. The Linux team hints that maybe, if they weren't forking, they would have someone with a name in lights, too.
Hubbard is right. Linux forks just as much; its developers just call the result a distribution or an experimental kernel or a patch kit. No one has the chutzpah to spin off their own rival political organization. No one has the political clout.
Now, after all of the nasty stories of backstabbing and bickering, it is important to realize that there are actually some happy stories of forks that merge back together. One of the best stories comes from the halls of an Internet security company, C2Net, that dealt with a fork in a very peaceful way.
C2Net is a Berkeley-based company run by some hard-core advocates of online privacy and anonymity. The company began by offering a remailing service that allowed people to send anonymous e-mails to one another. Their site would strip off the return address and pass it along to the recipient with no trace of who sent it. They aimed to fulfill the need of people like whistleblowers, leakers, and other people in positions of weakness who wanted to use anonymity to avoid reprisals.
The company soon took on a bigger goal when it decided to modify the popular Apache web server by adding strong encryption to make it possible for people to process credit cards over the web. The technology, known as SSL for "secure sockets layer," automatically arranged for all of the traffic between a remote web server and the user to be scrambled so that no one could eavesdrop. SSL is a very popular technology on the web today because many companies use it to scramble credit card numbers to defeat eavesdroppers.
C2Net drew a great deal of attention when one of its founders, Sameer Parekh, appeared on the cover of Forbes magazine with a headline teasing that he wanted to "overthrow the government." In reality, C2Net wanted to move development operations overseas, where there were no regulations on the creation of cryptographically secure software. C2Net went where the talent was available and priced right.
In this case, C2Net chose a free version of SSL written by Eric Young known as SSLeay. Young's work is another of the open source success stories. He wrote the original version as a hobby and released it with a BSD-like license. Everyone liked his code, downloaded it, experimented with it, and used it to explore the boundaries of the protocol. Young was just swapping code with the Net and having a good time.
Parekh and C2Net saw an opportunity. They would merge two free products, the Apache web server and Young's SSLeay, and make a secure version so people could easily set up secure commerce sites for the Internet. They called this product Stronghold and put it on the market commercially.
C2Net's decision to charge for the software rubbed some folks the wrong way. The company was taking two free software packages and making something commercial out of them. This wasn't just a fork; to some, it seemed like robbery. Of course, these complaints weren't really fair. Both collections of code were released with a BSD-style license that gave everyone the right to create and sell commercial additions to the product. There wasn't any GPL-like requirement that they give back to the community. If the authors didn't want a commercial version, they shouldn't have released the code with a very open license in the first place.
Parekh understands these objections and says that he has weathered plenty of criticism on the internal mailing lists. Still, he feels that the Stronghold product contributed a great deal to the strength of Apache by legitimizing it.
"I don't feel guilty about it. I don't think we've contributed a whole lot of source code, which is one of the key metrics that the people in the Apache group are using. In my perspective, the greatest contribution we've made is market acceptance," he said.
Parekh doesn't mean that he had to build market acceptance among web developers. The Apache group was doing a good job of accomplishing that through their guerrilla tactics, excellent product, and free price tag. But no one was sending a message to the higher levels of the computer industry, where long-term plans were being made and corporate deals were being cut. Parekh feels that he built first-class respectability for the Apache name by creating and supporting a first-class product that big corporations could use successfully. He made sure that everyone knew that Apache was at the core of Stronghold, and people took notice.
Parekh's first job was getting a patent license from RSA Data Security. Secure software like SSL relies on the RSA algorithm, an idea that was patented by three MIT professors in the 1970s. This patent is controlled by RSA Data Security. While the company publicized some of its licensing terms and went out of its way to market the technology, negotiating a license was not a trivial detail that could be handled by some free software team. Who's going to pay the license? Who's going to compute what some percentage of free is? Who's going to come up with the money? These questions are much easier to answer if you're a corporation charging customers to buy a product. C2Net was doing that. People who bought Stronghold got a license from RSA that ensured they could use the method without being sued.
The patent was only the first hurdle. SSL is a technology that tries to bring some security to web connections by encrypting the connections between the browser and the server. Netscape added one feature that allows a connection to be established only if the server has a digital certificate that identifies it. These certificates are only issued to a company after it pays a fee to a registered certificate agent like Verisign.
In the beginning, certificate agents like Verisign would issue the certificates only for servers created by big companies like Netscape or Microsoft. Apache was just an amorphous group on the Net. Verisign and the other authorities weren't paying attention to it.
Parekh went to them and convinced them to start issuing the certificates so he could start selling Stronghold.
"We became number three, right behind Microsoft and Netscape. Then they saw how much money they were making from us, so they started signing certificates for everyone," he said. Other Apache projects that used SSL found life much easier once Parekh showed Verisign that there was plenty of money to be made from folks using free software.
Parekh concedes that C2Net has not made many contributions to the code base of Apache, but he doesn't feel that this is the best measure. The political and marketing work of establishing Apache as a worthwhile tool is something that he feels may have been more crucial to its long-term health. When he started putting money in the hands of Verisign, he got those folks to realize that Apache had a real market share. That cash talked.
The Stronghold fork, however, did not make everyone happy. SSL is an important tool and someone was going to start creating another free version. C2Net hired Eric Young and his collaborator Tim Hudson and paid them to do some work for Stronghold. The core version of Young's original SSLeay stayed open, and both continued to add bug fixes and other enhancements over time. Parekh felt comfortable with this relationship. Although Stronghold was paying the salaries of Young and Hudson, they were also spending some of their spare time keeping their SSLeay toolkit up to date.
Still, the notion of a free version of SSL was a tempting project for someone to undertake. Many people wanted it. Secure digital commerce demanded it. There were plenty of economic incentives pushing for it to happen. Eventually, a German named Ralf S. Engelschall stepped up and wrote a new version he called mod_SSL. Engelschall is a well-regarded contributor to the Apache effort, and he has written or contributed to a number of different modules that could be added to Apache. He calls one the "all-dancing-all-singing mod_rewrite module" for handling URLs easily.
Suddenly, Engelschall's new version meant that there were dueling forks. One version came out of Australia, where the creators worked for a company selling a proprietary version of the code. C2Net distributed the Australian version and concentrated on making their product easy to install. The other came out of Europe, distributed for free by someone committed to an open source license. The interface may have been a bit rougher, but it didn't cost any money and it came with the source code. The potential for battle between SSLeay and mod_SSL could have been great.
The two sides reviewed their options. Parekh must have felt a bit frustrated and at a disadvantage. He had a company that was making a good product with repeat buyers. Then an open source solution came along. C2Net's Stronghold cost money and didn't come with source code, while Engelschall's mod_SSL cost nothing and came with code. Those were major negatives that he could combat only by increasing service. When Engelschall was asked whether his free version was pushing C2Net, he sent back the e-mail with the typed message, "[grin]."
In essence, C2Net faced the same situation that many major companies like Microsoft and Apple face today. The customers now had a viable open source solution to their problems. No one had to pay C2Net for the software. The users in the United States needed a patent license, but that patent would expire in late 2000. Luckily, Parekh is a true devotee of the open source world, even though he has been running a proprietary source company for the last several years. He looked at the problem and decided that the only way to stay alive was to join forces and mend the fork.
To make matters worse, Hudson and Young left C2Net to work for RSA Data Security. Parekh lost two important members of his team, and he faced intense competition. Luckily, his devotion to open source came to the rescue. Hudson and Young couldn't take back any of the work they did on SSLeay. It was open source and available to everyone.
Parekh, Engelschall, several C2Net employees, and several others sat down (via e-mail) and created a new project they called OpenSSL. This group would carry the torch of SSLeay and keep it up-to-date. Young and Hudson stopped contributing and devoted their time to creating a commercial version for RSA Data Security.
Parekh says of the time, "Even though it was a serious setback for C2Net to have RSA pirate our people, it was good for the public. Development really accelerated when we started OpenSSL. More people became involved and control became less centralized. It became more like the Apache group. It's a lot bigger than it was before and it's much easier for anyone to contribute."
Parekh also worked on mending fences with Engelschall. C2Net began to adopt some of the mod_SSL code and blend it into their latest version of Stronghold. To make this blending easier, C2Net began sending some of their formerly proprietary code back to Engelschall so he could mix it with mod_SSL by releasing it as open source. In essence, C2Net was averting a disastrous competition by making nice and sharing with this competitor. It is a surprising move that might not happen in regular business.
Parekh's decision seems open and beneficent, but it has a certain amount of self-interest behind it. He explains, "We just decided to contribute all of the features we had into mod_SSL so we could start using mod_SSL internally, because it makes our maintenance of that easier. We don't have to maintain our own proprietary version of mod_SSL. Granted, we've made the public version better, but those features weren't significant."
This mixing wasn't particularly complicated--most of it focused on the structure of the parts of the source code that handle the interface. Programmers call these the "hooks" or the "API." If Stronghold and mod_SSL use the same hook structure, then connecting them is a piece of cake. If Engelschall had changed the hook structure of mod_SSL, then C2Net would have had to do more work.
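For readers who want to see what a shared hook structure means in practice, here is a toy sketch in Python. It is only an illustration under loud assumptions: the class `Server`, the hook name `pre_send`, and the functions `scramble` and `install_ssl_module` are all invented for this example and have nothing to do with the real code in Apache, Stronghold, or mod_SSL.

```python
from typing import Callable, Dict, List

# Hypothetical callback type: takes the outgoing text, returns the changed text.
Hook = Callable[[str], str]


class Server:
    """A toy server that lets modules attach callbacks at named hook points."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.hooks: Dict[str, List[Hook]] = {}

    def register(self, hook_name: str, callback: Hook) -> None:
        # Remember the callback under the given hook name.
        self.hooks.setdefault(hook_name, []).append(callback)

    def handle(self, request: str) -> str:
        # Run every callback registered for the "pre_send" hook, in order.
        for callback in self.hooks.get("pre_send", []):
            request = callback(request)
        return request


def scramble(data: str) -> str:
    """Stand-in for encryption: reverses the text. Not real cryptography."""
    return data[::-1]


def install_ssl_module(server: Server) -> None:
    # The module only needs to know the hook name and the callback signature.
    server.register("pre_send", scramble)


# Two separate products that agree on the same hook structure.
free_server = Server("free")
commercial_server = Server("commercial")
for server in (free_server, commercial_server):
    install_ssl_module(server)
```

Because both toy servers agree on the hook name and on what a callback looks like, one module plugs into either without changes. If one of them looked for its callbacks under a different name, the same module would install without complaint and then never run.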
The decision to contribute the code stopped Engelschall from doing the work himself in a way that might have caused more grief for C2Net. "He was actually planning on implementing them himself, so we were better off contributing ours to avoid compatibility issues," says Parekh. That is to say, Parekh was worried that Engelschall was going to go off and implement all the features C2Net used, and there was a very real danger that Engelschall would implement them in a way that was unusable to Parekh. Then there would be a more serious fork that would further split the two groups. C2Net wouldn't be able to borrow code from the free version of OpenSSL very easily. So it decided to contribute its own code. It was easier to give away its code and guarantee that OpenSSL fit neatly into Stronghold. In essence, C2Net chose to give a little so it could continue to get all of the future improvements.
It's not much different from the car industry. There's nothing inherently better or worse about cars that have their steering wheel on the right-hand side. They're much easier to use in England. But if some free car engineering development team emerged in England, it might make sense for a U.S. company to donate work early to ensure that the final product could have the steering wheel on either side of the car without extensive redesign. If Ford just sat by and hoped to grab the final free product, it might find that the British engineers happily designed for the only roads they knew.
Engelschall is happy about this change. He wrote in an e-mail message, "They do the only reasonable approach: They base their server on mod_SSL because they know they cannot survive against the Open Source solution with their old proprietary code. And by contributing stuff to mod_SSL they implicitly make their own product better. This way both sides benefit."
Parekh and C2Net now have a challenge. They must continue to make the Stronghold package better than the free version to justify the cost people are paying.
Not all forks end with such a happy-faced story of mutual cooperation. Nor do all stories in the free software world end with the moneymaking corporation turning around and giving back its proprietary code to the general effort. But the C2Net/OpenSSL case illustrates how the nature of software development encourages companies and people to give and cooperate to satisfy their own selfish needs. Software can do a variety of wonderful things, but its structure often governs how easy it is for some of us to use. It makes sense to spend some extra time and make donations to a free software project if you want to make sure that the final product fits your specs.
The good news is that most people don't have much incentive to break off and fork their own project. If you stay on the same team, then you can easily use all the results produced by the other members. Cooperating is so much easier than fighting that people have a big incentive to stay together. If it weren't so selfish, it would be heartwarming.