The Business Strategy of Open-Source Software


http://www.linuxsir.org/bbs/archive/index.php/t-54757.html

 

This piece appears to be a talk given by the lead of the Apache project. It explains the place of open-source projects in a company's development strategy and their prospects, using projects such as MySQL and Apache as examples, so that readers can get a general sense of the road to commercializing open-source software.
Open Source as a Business Strategy

Brian Behlendorf

Over 1997 and 1998, open-source software such as Linux, FreeBSD, Apache, and Perl started to attract widespread attention from a new audience: engineering managers, executives, industry analysts, and investors.


Most of the developers of such software welcomed this attention: not only does it boost the pride of developers, it also allows them to justify their efforts (now increasingly related to their salaried positions) to upper management and their peers.


But this new audience has hard questions:

Is this really a new way of building software?
Is each of the successes in open-source software a fluke of circumstance, or is there a repeatable methodology to all this?
Why on earth would I allocate scarce financial resources to a project where my competitor would get to use the same code, for free?
How reliant is this whole development model upon the hobbyist hacker or computer science student who just happens to put the right bits together to make something work well?
Does this threaten or obsolesce my company's current methods for building software and doing business?

I suggest that the open-source model is indeed a reliable model for conducting software development for commercial purposes. I will attempt to lay out the preconditions for such a project, what types of projects make sense to pursue in this model, and the steps a company should go through to launch such a project. This essay is intended for companies that either release, sell, and support software commercially, or for technology companies that use a given piece of software as a core component of their business processes.

 

It's All About Platforms


While I'm indeed a big fan of the open-source approach to software development, there are definitely situations where an open-source approach would not benefit the parties involved. There are strong tradeoffs to this model, and returns are never guaranteed. A proper analysis requires asking yourself what your goals as a company are in the long term, as well as what your competitive advantages are today.


Let's start first with a discussion about Application Programming Interfaces (APIs), platforms, and standards. For the purposes of this essay, I'll wrap APIs (such as the Apache server API for building custom modules), on-the-wire protocols like HTTP, and operating system conventions (such as the way Linux organizes system files, or the way NT servers are administered) into the generic term ``platform.''


Win32, the collection of routines and facilities provided and defined by Microsoft for all Windows 95 and NT application developers, is a platform. If you intend to write an application for people to use on Windows, you must use this API. If you intend, as IBM once did with OS/2, to write an operating system which can run programs intended for MS Windows, you must implement the Win32 API in its entirety, as that's what Windows applications expect to be able to use.


Likewise, the Common Gateway Interface, or ``CGI,'' is a platform. The CGI specification allows web server developers to write scripts and programs that run behind a web server. CGI is a much, much simpler platform than Win32, and of course does much less, but its existence was important to the web server market because it allowed application developers to write portable code, programs that would run behind any web server. Besides a few orders of magnitude in complexity, a key difference between CGI and Win32 was that no one really owned the CGI specification; it was simply something the major web servers implemented so that they could run each other's CGI scripts. Only after several years of use was it deemed worthwhile to define the CGI specification as an informational Request for Comments (RFC) at the Internet Engineering Task Force (IETF).
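
To make the shape of that platform concrete, here is a minimal CGI program, written in Python purely for illustration (scripts of the era were more often Perl or shell, and the script name and form field below are hypothetical). The contract itself is what the CGI specification defines: the server passes request data in environment variables and on standard input, and the script writes headers, a blank line, and a body to standard output. Because that contract is all there is, the same script runs behind any CGI-capable web server.

    #!/usr/bin/env python3
    # hello.cgi -- a minimal CGI program (hypothetical name).
    # The web server sets environment variables such as REQUEST_METHOD and
    # QUERY_STRING, hands any POST body to us on stdin, and relays whatever
    # we write to stdout back to the browser.
    import os
    import sys
    from urllib.parse import parse_qs

    method = os.environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(os.environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]   # "name" is a hypothetical form field

    # Headers first, then a blank line, then the body -- that is the whole contract.
    sys.stdout.write("Content-Type: text/plain\r\n\r\n")
    sys.stdout.write("Hello, %s! (handled via %s)\n" % (name, method))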


A platform is what essentially defines a piece of software, any software, be it a web browser like Netscape, or be it Apache. Platforms enable people to build or use one piece of software on top of another, and are thus essential not just for the Internet space, where common platforms like HTTP and TCP/IP are what really facilitated the Internet's explosive growth, but are becoming more and more essential to consider within a computer environment, both in a server context and in an end-user client context.


In the Apache project, we were fortunate in that early on we developed an internal API to allow us to distinguish between the core server functionality (that of handling the TCP connections, child process management, and basic HTTP request handling) and almost all other higher-level functionality like logging, a module for CGI, server-side includes, security configuration, etc. Having a really powerful API has also allowed us to hand off other big pieces of functionality, such as mod_perl (an Apache module that bundles a Perl interpreter into Apache) and mod_jserv (which implements the Java Servlet API), to separate groups of committed developers. This freed the core development group from having to worry about building a ``monster'' to support these large efforts in addition to maintaining and improving the core of the server.
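
The real Apache module API is a set of C hook functions, but the architectural idea -- a small core that drives the request cycle and delegates each phase to whichever modules registered for it -- can be sketched in a few lines. Everything below (the phase names, the register and run_request helpers) is hypothetical: a Python-flavored sketch of the shape of the separation, not the actual API.

    # Toy sketch of a phase-hook architecture; not the real Apache module API.
    from collections import defaultdict

    _hooks = defaultdict(list)

    def register(phase, handler):
        """Called by a module at load time to attach itself to one request phase."""
        _hooks[phase].append(handler)

    def run_request(request):
        """The core: walk the fixed phases, letting every registered module run."""
        for phase in ("check_access", "handle_content", "log_request"):
            for handler in _hooks[phase]:
                handler(request)

    # A "module" lives entirely outside the core:
    def simple_logger(request):
        print("served", request["path"])

    register("log_request", simple_logger)
    run_request({"path": "/index.html"})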


There are businesses built upon the model of owning software platforms. Such a business can charge for all use of this platform, whether on a standard software installation basis, or a pay-per-use basis, or perhaps some other model. Sometimes platforms are enforced by copyright; other times platforms are obfuscated by the lack of a written description for public consumption; other times they are evolved so quickly, sometimes for other than technical reasons, that others who attempt to provide such a platform fail to keep up and are perceived by the market as ``behind'' technologically speaking, even though it's not a matter of programming.


Such a business model, while potentially beneficial in the short term for the company who owns such a platform, works against the interests of every other company in the industry, and against the overall rate of technological evolution. Competitors might have better technology, better services, or lower costs, but are unable to use those benefits because they don't have access to the platform. On the flip side, customers can become reliant upon a platform and, when prices rise, be forced to decide between paying a little more in the short run to stick with the platform, or spending a large quantity of money to change to a different platform, which may save them money in the long run.


Computers and automation have become so ingrained in and essential to day-to-day business that a sensible business should not rely on a single vendor to provide essential services. Having a choice of service means not just having the freedom to choose; a choice must also be affordable. The switching cost is an important aspect of this freedom to choose. Switching costs can be minimized if switching software does not necessitate switching platforms. Thus it is always in customers' interests to demand that the software they deploy be based on non-proprietary platforms.


This is difficult to visualize for many people because classic economics, the supply and demand curves we were all taught in high school, is based on the notion that products for sale have a relatively scalable cost -- that to sell ten times as much product, the cost of raw goods to a vendor typically rises somewhere on the order of ten times as well. No one could have foreseen the dramatic economy of scale that software exhibits, the almost complete lack of any direct correlation between the amount of effort it takes to produce a software product and the number of people who can thus purchase and use it.


A reference body of open-source software that implements a wire protocol or API is more important to the long-term health of that platform than even two or three independent non-open-source implementations. Why is this? Because a commercial implementation can always be bought by a competitor, removing it from the market as an alternative, and thus destroying the notion that the standard was independent. It can also serve as an academic frame of reference for comparing implementations and behaviors.


There are organizations like the IETF and the W3C who do a more-or-less excellent job of providing a forum for multiparty standards development. They are, overall, effective in producing high-quality architectures for the way things should work over the Internet. However, the long-term success of a given standard, and the widespread use of such a standard, are outside of their jurisdiction. They have no power to force member organizations to create software that faithfully implements the protocols they define. Sometimes, the only recourse is a body of work that shows why a specific implementation is correct.


For example, in December of 1996, AOL made a slight change to their custom HTTP proxy servers, which their customers use to access web sites. This ``upgrade'' had a cute little political twist to it: when AOL users accessed a web site using the Apache 1.2 server, at that time only a few months old and implementing the new HTTP/1.1 specification, they were welcomed with this rather informative message:

UNSUPPORTED WEB VERSION
The Web address you requested is not available in a version supported by AOL. This is an issue with the Web site, and not with AOL. The owner of this site is using an unsupported HTTP language. If you receive this message frequently, you may want to set your web graphics preferences to COMPRESSED at Keyword: PREFERENCES
Alarmed at this ``upgrade,'' Apache core developers circled the wagons and analyzed the situation. A query to AOL's technical team came back with the following explanation:
New HTTP/1.1 web servers are starting to generate HTTP/1.1 responses to HTTP/1.0 requests when they should be generating only HTTP/1.0 responses. We wanted to stem the tide of those faults proliferating and becoming a de facto standard by blocking them now. Hopefully the authors of those web servers will change their software to only generate HTTP/1.1 responses when an HTTP/1.1 request is submitted.
Unfortunately, AOL engineers were under the mistaken assumption that HTTP/1.1 responses were not backward-compatible with HTTP/1.0 clients or proxies. They are; HTTP was designed to be backward-compatible within minor-number revisions. But the specification for HTTP/1.1 is so complex that a less than thorough reading could lead one to conclude that this was not the case, especially with the HTTP/1.1 document that existed at the end of 1996.

So we Apache developers had a choice -- we could back down and give HTTP/1.0 responses to HTTP/1.0 requests, or we could follow the specification. Roy Fielding, the ``HTTP cop'' in the group, was able to clearly show us how the software's behavior at the time was correct and beneficial; there would be cases where HTTP/1.0 clients might wish to upgrade to an HTTP/1.1 conversation upon discovering that a server supported 1.1. It was also important to tell proxy servers that even if the first request they proxied to an origin server was HTTP/1.0, the origin server could still support 1.1.
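
As a rough illustration of the exchange in dispute, the sketch below (Python; the hostname is a placeholder) sends a bare HTTP/1.0 request and prints the status line that comes back. A server in Apache 1.2's position may legitimately answer with an ``HTTP/1.1'' status line; because minor versions are defined to be backward-compatible, a conforming 1.0 client or proxy should simply parse it and carry on.

    import socket

    HOST = "www.example.org"   # placeholder host; any HTTP/1.1-capable server will do

    request = (
        "GET / HTTP/1.0\r\n"   # the client only speaks HTTP/1.0
        "Host: " + HOST + "\r\n"
        "\r\n"
    )

    with socket.create_connection((HOST, 80)) as conn:
        conn.sendall(request.encode("ascii"))
        status_line = conn.makefile("rb").readline().decode("ascii").rstrip()

    # A reply such as "HTTP/1.1 200 OK" to this request is legal: the version in
    # the status line advertises what the server supports, not a new wire format.
    print(status_line)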


It was decided that we'd stick to our guns and ask AOL to fix their software. We suspected that the HTTP/1.1 response was actually causing a problem with their software that was due more to sloppy programming practices on their part than to bad protocol design. We had the science behind our decision. What mattered most was that Apache was at that point on 40% of the web servers on the Net, and Apache 1.2 was on a very healthy portion of those, so they had to decide whether it was easier to fix their programming mistakes or to tell their users that some 20% or more of the web sites on the Internet were inaccessible through their proxies. On December 26th, we published a web page detailing the dispute, and publicized its existence not just to our own user base, but to several major news outlets as well, such as C|Net and Wired, to justify our actions.


AOL decided to fix their software. Around the same time, we announced the availability of a ``patch'' for sites that wanted to work around the AOL problem until it was rectified, a patch that degraded responses to HTTP/1.0 for AOL. We were resolute that this was to remain an ``unofficial'' patch, with no support, and that it would not be made a default setting in the official distribution.


There have been several other instances where vendors of other HTTP products (including both Netscape and Microsoft) had interoperability issues with Apache; in many of those cases, there was a choice the vendor had to make between expending the effort to fix their bug, or writing off any sites which would become inoperable because of it. In many cases a vendor would implement the protocol improperly but consistently on their clients and servers. The result was an implementation that worked fine for them, but imperfectly at best with either a client or server from another vendor. This is much more subtle than even the AOL situation, as the bug may not be apparent or even significant to the majority of people using this software -- and thus the long-term ramifications of such a bug (or additional bugs compounding the problem) may not be seen until it's too late.


Were there not an open-source and widely used reference web server like Apache, it's entirely conceivable that these subtle incompatibilities could have grown and built upon each other, covered up by mutual blame or Jedi mind tricks (``We can't repeat that in the lab...''), where the response to ``I'm having a problem when I connect vendor X's browser to vendor Y's server'' is, ``Well, use vendor Y's client and it'll be all better.'' At the end of this process we would have ended up with two (or more) World Wide Webs -- one built on vendor X's web servers, the other on vendor Y's servers, and each would only work with its respective vendor's clients. There is ample historical precedent for this type of anti-standard activity, a policy (``locking in'') which is encoded as a basic business practice of many software companies.


Of course this would have been a disaster for everyone else out there -- the content providers, service providers, software developers, and everyone who needed to use HTTP to communicate would have had to maintain two separate servers for their offerings. While there may have been technical customer pressure to ``get along together,'' the contrary marketing pressure to ``innovate, differentiate, lead the industry, define the platform'' would have kept either party from attempting to commodify their protocols.


We did, in fact, see such a disaster with client-side JavaScript. There was such a big difference in behavior between different browsers, even within different beta versions of the same browser, that developers had to create code that would detect different revisions and give different behavior -- something that added significantly more development time to interactive pages using JavaScript. It wasn't until the W3C stepped in and laid the groundwork for a Document Object Model (DOM) that we actually saw a serious attempt at creating a multiparty standard around JavaScript.


There are natural forces in today's business world that drive for deviation when a specification is implemented by closed software. Even an accidental misreading of a common specification can cause a deviation if not corrected quickly.


Thus, I argue that building your services or products on top of a standards-based platform is good for the stability of your business processes. The success of the Internet has not only shown how common platforms help facilitate communication, it has also forced companies to think more about how to create value in what gets communicated, rather than trying to take value out of the network itself.

 

Analyzing Your Goals for an Open-Source Project


What you need to ask yourself, as a company, is to what degree your products implement a new platform, and to what extent it is in your business interests to maintain ownership of that platform. How much of your overall product and service set, and thus how much of your revenue, is above that platform, or below it? This is probably something you can even apply numbers to.


Let's say you're a database company. You sell a database that runs on multiple OSes; you separately sell packages for graphical administration, rapid development tools, a library of common stored procedures people can use, etc. You sell support on a yearly basis. Upgrades require a new purchase. You also offer classes. And finally, you've got a growing but healthy consulting group who implement your database for customers.


Let's say your revenue balance looks something like this:

40% -- Sales of the database software
15% -- Support
10% -- Consulting work
10% -- Rapid development tools
10% -- Graphical administration tools
10% -- Library of stored procedures/applications on top of this DB
5% -- Manuals/classes

At first glance, the suggestion that you give away your database software for free would be ludicrous. That's 40% of your revenue gone. If you're lucky as a company you're profitable, and if you're even luckier you've got maybe a 20% profit margin. 40% wipes that out completely.


This of course assumes nothing else changes in the equation. But the chances are, if you pull this off right, things will change. Databases are the type of application that companies don't just pull off the shelf at CompUSA, throw the CD into their machine, and then forget about. All of the other categories of revenue are still valid and necessary no matter how much was charged for the database itself. In fact, there is now more freedom to charge more for these other services than before, when the cost of the software ate up the bulk of what a customer typically paid for when they bought database software.


So, very superficially speaking, if the free or low-cost nature of the database were to cause it to be used on twice as many systems, and users were just as motivated as before to purchase consulting and support and development tools and libraries and such from your company, you'd see a 20% gain in the overall amount of revenue. What's more likely is that three to four times as many new users are introduced to your software, and while the take-up rate of your other services is lower (either because people are happy just using the free version, or because you have competitors now offering these services for your product), so long as that take-up rate doesn't go too low, you've probably increased overall revenue into the company.
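
To put rough numbers behind that reasoning, here is a small back-of-the-envelope sketch using the hypothetical revenue split above. The multipliers are illustrative assumptions, not data.

    # Hypothetical revenue split from the example above (fractions of current revenue).
    revenue = {
        "database licenses":        0.40,
        "support":                  0.15,
        "consulting":               0.10,
        "rapid development tools":  0.10,
        "graphical admin tools":    0.10,
        "stored-procedure library": 0.10,
        "manuals/classes":          0.05,
    }

    def projected_revenue(user_multiplier, takeup_rate):
        """Revenue relative to today if the database itself becomes free.

        user_multiplier: how many times larger the installed base becomes.
        takeup_rate: fraction of the old per-user spend on the other lines that survives.
        """
        other = sum(v for k, v in revenue.items() if k != "database licenses")
        return other * user_multiplier * takeup_rate

    # Twice the users, same take-up of the other services: 0.60 * 2 = 1.20, a 20% gain.
    print(projected_revenue(2.0, 1.0))   # 1.2
    # Four times the users at only half the old take-up rate: still 1.20.
    print(projected_revenue(4.0, 0.5))   # 1.2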


Furthermore, depending on the license applied, you may see lower costs involved in development of your software. You're likely to see bugs fixed by motivated customers, for example. You're also likely to see new innovations in your software by customers who contribute their code to the project because they want to see it maintained as a standard part of the overall distribution. So overall, your development costs could go down.


It's also likely that, given a product/services mix like the above example, releasing this product for free does little to help your competitors compete against you in your other revenue spaces. There are probably already consultants who do integration work with your tools; already independent authors of books; already libraries of code you've encouraged other companies to build. The availability of source code will marginally help competitors be able to provide support for your code, but as the original developers, you'll have a cachet to your brand that the others will have to compete against.


Not all is wine and roses, of course. There are costs involved in this process that are going to be difficult to tie to revenue directly. For example, the cost of infrastructure to support such an endeavor, while not significant, can consume systems administration and support staff. There's also the cost of having developers communicating with others outside the company, and the extra overhead of developing the code in a public way. There may be significant cost involved in preparing the source code for public inspection. And after all this work, there may simply not be the ``market need'' for your product as freeware. I'll address all these points in the rest of this essay.

Evaluating the Market Need for Your Project


It may be very tempting for a company to look to Open Source as a way to save a particular project, to gain notoriety, or simply to have a good story with which to end a product category. These are not good reasons to launch an open-source project. If a company is serious about pursuing this model, it needs to do its research in determining exactly what the product needs to be for an open-source strategy to be successful.


The first step is to conduct a competitive analysis of the space, both for the commercial competitors and the freeware competitors, no matter how small. Be very careful to determine exactly what your product offers by componentizing your offering into separable ``chunks'' that could be potentially bundled or sold or open-sourced separately. Similarly, don't exclude combinations of freeware and commercialware that offer the same functionality.


Let's continue with the database vendor example above. Let's say there are actually three components to the vendor's database product: a core SQL server, a backup/transaction logging manager, and a developer library. Such a vendor should compare their product's offering not only to the big guys like Oracle and Sybase, and not only to the smaller but growing commercial competitors like Solid and Velocis, but also to the free databases like MySQL and Postgres. Such an analysis may conclude that the company's core SQL server provides only a little more functionality than MySQL, and in an area that was never considered a competitive advantage but merely a necessary feature to keep up with the other DB vendors. The backup/transaction logging manager has no freeware competition, and the developer library is surpassed by the Perl DBI utilities but has little Java or C competition.


This company could then consider the following strategies:

1.
Replace the core SQL server with MySQL, then package up the extra core SQL server functionality and the backup/transaction logging manager, and sell Java/C libraries while providing and supporting the free Perl library. This would ride upon the momentum generated by the MySQL package, and the incredible library of add-on code and plug-in modules out there for it; it would also allow you to keep private any pieces of code you believe are patented or patentable, or code you simply think is cool enough that it's a competitive advantage. Market yourself as a company that can scale MySQL up to larger deployments.
2.
Contribute the ``extra core SQL server functionality'' to MySQL, then design the backup/transaction logger to be sold as a separate product that works with a wider variety of databases, with a clear preference for MySQL. This has smaller revenue potential, but allows you as a company to be more focused and potentially reach a broader base of customers. Such a product may be easier to support as well.
3.
Go in the other direction: stick with a commercial product strategy for the core SQL server and libraries, but open-source the backup/transaction logger as a general utility for a wide array of databases. This would cut down on your development costs for this component, and be a marketing lead generator for your commercial database. It would also remove a competitive advantage some of your commercial competitors would have over open source, even though it would also remove some of yours too.

All of these are valid approaches to take. Another approach:

4.
Open-source the entire core server as its own product, separate from MySQL or Postgres or any of the other existing packages, and provide commercial support for it. Sell the backup/logging tool as a standard, non-open-source product, but open-source the development libraries to encourage new users. Such a strategy carries more risk, as a popular package like MySQL or Postgres tends to have been around for quite some time, and there's inherently much developer aversion to swapping out a database if their current one is working fine. To do this, you'd have to prove significant benefit over what people are currently using. Either it has to be dramatically faster, more flexible, easier to administer or program with, or contain sufficiently new features that users are motivated to try it out. You also have to spend much more time soliciting interest in the project, and you will probably have to find a way to pull developers away from competing products.

I wouldn't advocate the fourth approach in this exact circumstance, as MySQL actually has a very healthy head start here, lots and lots of add-on programs, and a rather large existing user base.


However, from time to time an open-source project loses momentum, either because the core development team is not actively doing development, or the software runs into core architectural challenges that keep it from meeting new demands, or the environment that created this demand simply dries up or changes focus. When that happens, and it becomes clear people are looking for alternatives, there is the possibility of introducing a replacement that will attract attention, even if it does not immediately present a significant advance over the status quo.


Analyzing demand is essential. In fact, it's demand that usually creates new open-source projects. Apache started with a group of webmasters sharing patches to the NCSA web server, deciding that swapping patches like so many baseball cards was inefficient and error-prone, and electing to do a separate distribution of the NCSA server with their patches built in. None of the principals involved in the early days got involved because they wanted to sell a commercial server with Apache as its base, though that's certainly a valid reason for being involved.


So an analysis of the market demand for a particular open-source project also involves joining relevant mailing lists and discussion forums, cruising discussion archives, and interviewing your customers and their peers; only then can you realistically determine if there are people out there willing to help make the project bear fruit.


Going back to Apache's early days: those of us who were sharing patches around were also sending them back to NCSA, hoping they'd be incorporated, or at the very least acknowledged, so that we could be somewhat assured that we could upgrade easily when the next release came out. NCSA had been hit when its previous server programmers were snatched away by Netscape, and the flood of email was too much for the remaining developers. So building our own server was more an act of self-preservation than an attempt to build the next great web server. It's important to start out with limited goals that can be accomplished quite easily, and not have to rely upon your project dominating a market before you realize benefits from the approach.

Open Source's Position in the Spectrum of Software


To determine which parts of your product line or components of a given product to open-source, it may be helpful to conduct a simple exercise. First, draw a line representing a spectrum. On the left-hand side, put ``Infrastructural,'' representing software that implements frameworks and platforms, all the way down to TCP/IP and the kernel and even hardware. On the right-hand side, put ``End-user applications,'' representing the tools and applications that the average, non-technical user will use. Along this line, place dots representing, in relative terms, where you think each of the components of your product offering lies. From the above example, the GUI front-ends and administrative tools lie on the far right-hand side, while code that manages backups is off to the far left. Development libraries are somewhat to the right of center, while the core SQL facilities are somewhat to the left. Then, you may want to throw in your competitors' products as well, also separating them out by component, and if you're really creative, using a different color pen to distinguish the free offerings from the commercial offerings. What you are likely to find is that the free offerings tend to clump towards the left-hand side, and the commercial offerings towards the right.
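
If a whiteboard isn't handy, the same exercise can be roughed out in a few lines of code. The placements below are illustrative guesses for the database example, not measurements; 0.0 is fully infrastructural and 1.0 is a pure end-user application.

    # Sketch of the spectrum exercise; positions are guesses, for illustration only.
    components = {
        "backup/transaction logger": 0.10,
        "core SQL server":           0.35,
        "developer libraries":       0.60,
        "graphical admin tools":     0.90,
        "GUI front-ends":            0.95,
    }

    # Print the components from "infrastructural" (left) to "end-user" (right).
    for name, position in sorted(components.items(), key=lambda item: item[1]):
        bar = "-" * int(position * 40)
        print("%-28s |%s*" % (name, bar))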


Open-source software has tended to be slanted towards the infrastructural/back-end side of the software spectrum represented here. There are several reasons for this:

End-user applications are hard to write, not only because a programmer has to deal with a graphical, windowed environment which is constantly changing, nonstandard, and buggy simply because of its complexity, but also because most programmers are not good graphical interface designers, with notable exceptions.
Culturally, open-source software development has been conducted in the networking-code and operating-system space for years.
Open-source tends to thrive where incremental change is rewarded, and historically that has meant back-end systems more than front-ends.
Much open-source software was written by engineers to solve a task they had to do while developing commercial software or services; so the primary audience was, early on, other engineers.

This is why we see solid open-source offerings in the operating system and network services space, but very few offerings in the desktop application space.


There are certainly counterexamples to this. A great example is the GIMP, or GNU Image Manipulation Program, an X11 program comparable in feature set to Adobe Photoshop. Yet in some ways, this product is also an ``infrastructure'' tool, a platform, since it owes its success to its wonderful plug-in architecture, and the dozens and dozens of plug-ins that have been developed that allow it to import and export many different file formats and which implement hundreds of filter effects.


Look again at the spectrum you've drawn out. At some point, you can look at your offering in the context of these competitors, and draw a vertical line. This line denotes the separation between what you open-source and what you may choose to keep proprietary. That line itself represents your true platform, your interface between the public code you're trying to establish as a standard on the left, and your private code you want to drive demand for on the right.

Nature Abhors a Vacuum


Any commercial-software gaps in an otherwise open-source infrastructural framework are a strong motivating force for redevelopment in the public space. Like some force of nature, when a commercial wall exists between two strong pieces of open-source software, there's pressure to bridge that gap with a public solution. This is because every gap can be crossed given enough resources, and if that gap is small enough for your company to cross with your own development team, it's likely to be small enough for a set of motivated developers to also cross.


Let's return to the database example: say you decide to open-source your core SQL server (or your advanced code on top of MySQL), but decide to make money by building a commercial, non-source-available driver for plugging that database into a web server to create dynamic content. You decide the database will be a loss leader for this product, and therefore you'll charge far higher than normal margins on this component.


Since hooking up databases to web servers is a very common and desirable thing, developers will either have to go through you, or find another way to access the database from the web site. Each developer will be motivated by the idea of saving the money they'd otherwise have to pay you. If enough developers pool their resources to make it worth their while, or a single talented individual simply can't pay for the plug-in but still wants to use that database, it's possible you could wake up one morning to find an open-source competitor to your commercial offering, completely eliminating the advantage of having the only solution for that task.


This is a piece of a larger picture: relying upon proprietary source code in strategic places as your way of making money has become a risky business venture. If you can make money by supporting the web server + plug-in + database combination, or by providing an interface to managing that system as a whole, you can protect yourself against these types of surprises.


Not all commercial software has this vulnerability -- it is specifically a characteristic of commercial software that tries to slot itself into a niche directly between two well-established open-source offerings. Positioning your commercial offering as an addition to the current set of open-source offerings is a more solid strategy.

Donate, or Go It Alone?


Open-source software exists in many of the standard software categories, particularly those focused on the server side. Obviously we have operating systems; web servers; mail (SMTP, POP, IMAP), news (NNTP), and DNS servers; programming languages (the ``glue'' for dynamic content on the Web); databases; networking code of all kinds. On the desktop you have text editors like Emacs, Nedit, and Jove; windowing systems like Gnome and KDE; web browsers like Mozilla; and screen savers, calculators, checkbook programs, PIMs, mail clients, image tools -- the list goes on. While not every category has category-killers like Apache or Bind, there are probably very few commercial niches that don't have at least the beginnings of a decent open-source alternative available. This is much less true for the Win32 platform than for the Unix or Mac platforms, primarily because the open-source culture has not adopted the Win32 platform as ``open'' enough to really build upon.


There is a compelling argument for taking advantage of whatever momentum an existing open-source package has in a category that overlaps with your potential offering, by contributing your additional code or enhancements to the existing project and then aiming for a return in the form of higher-quality code overall, marketing lead generation, or common platform establishment. In evaluating whether this is an acceptable strategy, one needs to look at licensing terms:

Are the terms on the existing package copacetic to your long-term goals?
Can you legally contribute your code under that license?
Does it incent future developers sufficiently? If not, would the developers be willing to accommodate you by changing the license?
Are your contributions general enough that they would be of value to the developers and users of the existing project? If all they do is implement an API to your proprietary code, they probably won't be accepted.
If your contributions are hefty, can you have ``peer'' status with the other developers, so that you can directly apply bug fixes and enhancements you make later?
Are the other developers people you can actually work with?
Are your developers people who can work with others in a collaborative setting?

Satisfying developers is probably the biggest challenge to the open-source development model, one which no amount of technology or even money can really address. Each developer has to feel like they are making a positive contribution to the project, that their concerns are being addressed, their comments on architecture and design questions acknowledged and respected, and their code efforts rewarded with integration into the distribution or a really good reason why not.


People mistakenly say, ``open-source software works because the whole Internet becomes your R&D and QA departments!'' In fact, the amount of talented programmer effort available for a given set of tasks is usually limited. Thus, it is usually in everyone's interest if parallel development efforts are not undertaken simply because of semantic disputes between developers. On the other hand, evolution works best when alternatives compete for resources, so it's not a bad thing to have two competing solutions in the same niche if there's enough of a talent pool for critical mass -- some real innovation may be tried in one that wasn't considered in the other.


There is strong evidence for competition as a healthy trait in the SMTP server space. For a long time, Eric Allman's ``Sendmail'' program was the standard SMTP daemon every OS shipped with. There were other open-source competitors that came up, like Smail or Zmailer, but the first to really crack the usage base was Dan Bernstein's Qmail package. When Qmail came on the scene, Sendmail was 20 years old, and had started to show its age; it was also not designed for the Internet of the late 90s, where buffer overflows and denial-of-service attacks are as common as rainfall in Seattle. Qmail was a radical break in many ways -- program design, administration, even in its definition of what good ``network behavior'' for an SMTP server is. It was an evolution that would have been exceedingly unlikely to have been made within Allman's Sendmail package. Not because Allman and his team weren't good programmers or because there weren't motivated third-party contributors; it's just that sometimes a radical departure is needed to really try something new and see if it works. For similar reasons, IBM funded the development of Wietse Venema's ``SecureMailer'' SMTP daemon, which as of this writing also appears likely to become rather popular. The SMTP daemon space is well-defined enough and important enough that it can support multiple open-source projects; time will tell which will survive.

Bootstrapping


Essential to the health of an open-source project is that the project have sufficient momentum to be able to evolve and respond to new challenges. Nothing is static in the software world, and each major component requires maintenance and new enhancements continually. One of the big selling points of this model is that it cuts down on the amount of development any single party must do, so for that theory to become fact, you need other active developers.


In the process of determining demand for your project, you probably ran into a set of other companies and individuals with enough interest here to form a core set of developers. Once you've decided on a strategy, shop it to this core set even more heavily; perhaps start a simple discussion mailing list for this purpose, with nothing set in stone. Chances are this group will have some significant ideas for how to make this a successful project, and will list their own set of resources they could apply to make it happen.


For the simplest of projects, a commitment from this group that they'll give your product a try and, if they're happy, stay on the development mailing list is probably enough. However, for something more significant, you should try and size up just how big the total resource base is.


Here is what I would consider a minimum resource set for a project of moderate complexity, say a project to build a common shopping cart plug-in for a web server, or a new type of network daemon implementing a simple protocol. In the process I'll describe the various roles needed and the types of skills necessary to fill them.

Role 1: Infrastructure support: Someone to set up and maintain the mailing list aliases, the web server, the CVS (Concurrent Versioning System) code server, the bug database, etc.
Startup: 100 hours
Maintenance: 20 hrs/week.
Role 2: Code ``captain'': Someone who watches all commits and has overall responsibility for the quality of the implemented code. Integrates patches contributed by third parties, fixing any bugs or incompatibilities in these contributions. This is outside of whatever new development work they are also responsible for.
Startup: 40-200 hours (depends on how long it takes to clean up the code for public consumption!)
Maintenance: 20 hrs/week
Role 3: Bug database maintenance: While this is not free ``support,'' it is important that the public have an organized way of communicating bug reports and issues to the server developers. In a free setting, the developers are of course not even obliged to answer all the mail they get, but they should make reasonable efforts to respond to valid issues. The bug database maintainer would be the first line of support, someone who goes through the submissions on a regular basis and weeds out the simple questions, tosses the clueless ones, and forwards the real issues on to the developers.
Startup: just enough to learn their way around the code
Maintenance: 10-15 hrs/week
Role 4: Documentation/web site content maintenance: This position is often left unattended in open-source projects and left to the engineers or to people who really want to contribute but aren't star programmers; all too often it's simply left undone. So long as we're going about this process deliberately, locating dedicated resources to make sure that non-technical people can understand and appreciate the tools they are deploying is essential to widespread usage. It helps cut down on having to answer bug reports which are really just misunderstandings, and it also helps encourage new people to learn their way around the code and become future contributors. A document that describes at a high level the internal architecture of the software is essential; documentation that explains major procedures or classes within the code is almost as important.
Startup: 60 hours (presuming little code has been documented)
Maintenance: 10 hrs/week
Role 5: Cheerleader/zealot/evangelist/strategist: Someone who can work to build momentum for the project by finding other developers, pushing specific potential customers to give it a try, finding other companies who could be candidates for adopting this new platform, etc. Not quite a marketer or salesperson, as they need to stay close to the technology; but the ability to clearly see the role of the project in a larger perspective is essential.
Startup: enough to learn the project
Maintenance: 20 hrs/week

So here we have five roles representing almost three full-time people. In reality, some of these roles get handled by groups of people sharing responsibility, and some projects can survive with the average core participant spending less than 5 hrs/week after the first set of release humps are passed. But for the early days of the project it is essential that developers have the time and focus they would have if the project were a regular development effort at the company.


These five roles also do not cover any resources that could be put towards new development; this is purely maintenance. In the end, if you cannot find enough resources from peers and partners to cover these bases, plus enough extra developers to do some basic new development (until new recruits are attracted), you may want to reconsider open-sourcing your project.

What License to Use?


Determining which license to use for your project can be a fairly complex task; it's the kind of task you probably don't enjoy but your legal team will. There are other papers and web sites that cover copyright issues in finer detail; I'll provide an overview, though, of what I see as the business considerations of each style of license.



The BSD-Style Copyright



This is the copyright used by Apache and by the BSD-based operating systems projects (FreeBSD, OpenBSD, NetBSD), and by and large it can be summed up as, ``Here's this code, do what you like with it, we don't care, just give us credit if you try and sell it.'' Usually that credit is demanded in different forms -- on advertising, or in a README file, or in the printed documentation, etc. It has been brought up that such a copyright may not scale -- that is, if someone ever released a bundle of software that included 40 different open-source modules, all BSD-based, one might argue that there'd be 40 different copyright notices that would be necessary to display. In practice this has not been a problem, and in fact it's been seen as a positive force in spreading awareness of the use of open-source software.


From a business perspective, this is the best type of license for jumping into an existing project, as there are no worries about licenses or restrictions on future use or redistribution. You can mix and match this software with your own proprietary code, and only release what you feel might help the project and thus help you in return. This is one reason why we chose it for the Apache group -- unlike many free software projects, Apache was started largely by commercial webmasters in search of a better web server for their own commercial needs. While probably none of the original team had a goal of creating a commercial server on top of Apache, none of us knew what our futures would hold, and felt that limiting our options at the beginning wasn't very smart.


This type of license is ideal for promoting the use of a reference body of code that implements a protocol or common service. This is another reason why we chose it for the Apache group -- many of us wanted to see HTTP survive and become a true multiparty standard, and would not have minded in the slightest if Microsoft or Netscape chose to incorporate our HTTP engine or any other component of our code into their products, if it helped further the goal of keeping HTTP common.


This degree of openness has risks. No incentive is built into the license to encourage companies to contribute their code enhancements back to the project. There have certainly been cases in Apache's history where companies have developed technology around it that we would have liked to see offered back to the project. But had we had a license which mandated that code enhancements be made available back to the project, such enhancements would perhaps never have been made in the first place.


All this means that, strategically speaking, the project needs to maintain sufficient momentum, and that participants realize greater value by contributing their code to the project, even code that would have had value if kept proprietary. This is a tricky ratio to maintain, particularly if one company decides to dramatically increase the amount of coding they do on a derivative project, and begins to doubt the potential return in proportion to their contribution to the project, e.g., ``We're doing all this work, more than anyone else combined, why should we share it?'' The author has no magic bullet for that scenario, other than to say that such a company probably has not figured out the best way to inspire contributions from third parties to help meet their engineering goals most efficiently.



The Mozilla Public License


The Mozilla Public License (MPL) was developed by the Netscape Mozilla team for use on their project. It was the first new license in several years when it was released, and really addressed some key issues not addressed by the BSD or GNU licenses. It is adjacent to the BSD-style license in the spectrum of open-source software licenses. It has two key differences:


It mandates that changes to the ``distribution'' also be released under the same copyright as the MPL, which thus makes them available back to the project. The ``distribution'' is defined as the files as distributed in the source code. This is important, because it allows a company to add an interface to a proprietary library of code without mandating that the other library of code also be made MPL -- only the interface. Thus, this software can more or less be combined into a commercial software environment.


It has several provisions protecting both the project as a whole and its developers against patent issues in contributed code. It mandates that the company or individual contributing code back to the project release any and all claims to patent rights that may be exposed by the code.


This second provision is really important; it also, at the time of this writing, contains a big flaw.


Taking care of the patent issue is a Very Good Thing. There is always the risk that a company could innocently offer code to a project, and then once that code has been implemented thoroughly, try and demand some sort of patent fee for its use. Such a business strategy would be laughably bad PR and very ugly, but unfortunately not all companies see this yet. So, this second provision prevents the case of anyone surreptitiously providing code they know is patented and liable to cause headaches for everyone down the road.


Of course it doesn't block the possibility that someone else owns a patent that would apply; there is no legal instrument that does provide that type of protection. I would actually advocate that this is an appropriate service for the U.S. Patent and Trademark Office to perform; they seem to have the authority to declare certain ideas or algorithms as property someone owns, so shouldn't they also be required to do the opposite and certify my submitted code as patent-free, granting me some protection from patent lawsuits?


As I said earlier, though, there is a flaw in the current MPL, as of December 1998. In essence, Section 2.2 mandates (through its definition of ``Contributor Version'') that the contributor waive patent claims on any part of Mozilla, not just on the code they contribute. Maybe that doesn't seem like a bug. It would be nice to get the whole package waived by a number of large companies.


Unfortunately, a certain large company with one of the world's largest patent portfolios has a rather specific, large issue with this quirk. Not because they intend to go after Mozilla some day and demand royalties -- that would be foolhardy. They are concerned because there are parts of Mozilla that implement processes they have patents on and receive rather large numbers of dollars for every year -- and were they to waive patent claims over the Mozilla code, those companies who pay them dollars for those patents could simply take the code from Mozilla that implements those same patents and shove it into their own products, removing the need to license the patent from said large company. Were Section 2.2 to simply refer to the contributed patches rather than the whole browser when it comes to waiving patents, this would not be a problem.


Aside from this quirk, the MPL is a remarkably solid license. Mandating back the changes to the ``core'' means that essential bug fixes and portability enhancements will flow back to the project, while value-added features can still be developed by commercial entities. It is perhaps the best license to use to develop an end-user application, where patents are more likely to be an issue, and the drive to branch the project may be greater. In contrast, the BSD license is perhaps more ideal for projects intended to be ``invisible'' or essentially library functions, like an operating system or a web server.



The GNU Public License


While the GPL is not obviously a business-friendly license, there are certain aspects of it which are attractive, believe it or not, for commercial purposes.


Fundamentally, the GPL mandates that enhancements, derivatives, and even code that incorporates GPL'd code are also themselves released as source code under the GPL. This ``viral'' behavior has been trumpeted widely by open-source advocates as a way to ensure that code that begins free remains free -- that there is no chance of a commercial interest forking their own development version from the available code and committing resources that are not made public. In the eyes of those who put a GPL on their software, they would much rather have no contribution than have a contribution they couldn't use as freely as the original. There is an academic appeal to this, of course, and there are advocates who claim that Linux would have never gotten as large as it has unless it was GPL'd, as the lure of forking for commercial purposes would have been too great, keeping the critical mass of unified development effort from being reached.


So at first glance, it may appear that the GPL would not have a happy coexistence with a commercial intent related to open-source software. The traditional models of making money through software value-add are not really possible here. However, the GPL could be an extraordinarily effective means to establish a platform that discourages competitive platforms from being created, and which protects your claim to fame as the ``premier'' provider of products and services that sit upon this platform.


An example of this is Cygnus and GCC. Cygnus makes a very healthy chunk of change every year by porting GCC to various different types of hardware, and maintaining those ports. The vast majority of that work, in compliance with the GPL, gets contributed to the GCC distribution, and made available for free. Cygnus charges for the effort involved in the port and maintenance, not for the code itself. Cygnus's history and leadership in this space make it the reference company to approach for this type of service.


If a competitor were to start up and compete against Cygnus, it too would be forced to redistribute its changes under the GPL. This means that there is no chance for a competitor to find a commercial technical niche on top of the GCC framework that could be exploited without giving Cygnus the same opportunity to also take advantage of that technology. Cygnus has created a situation where competitors can't compete on technology differentiation, unless a competitor were to spend a very large amount of time and money and use a platform other than GCC altogether.


Another way in which the GPL could be used for business purposes is as a technology ``sentinel,'' with a non-GPL'd version of the same code available for a price. For example, you may have a great program for encrypting TCP/IP connections over the Internet. You don't care if people use it non-commercially, or even commercially -- your interest is in getting the people who want to embed it in a product or redistribute it for profit to pay you for the right to do that. If you put a GPL license on the code, this second group of users can't do what they want without making their entire product GPL as well, something many of them may be unwilling to do. However, if you maintain a separate branch of your project, one which is not under the GPL, you can commercially license the separate branch of code any way you like. You have to be very careful, though, to make sure that any code volunteered to you by third parties is explicitly available for this non-free branch; you ensure this by either declaring that only you (or people employed by you) will write code for this project, or that (in addition) you'll get explicit clearance from each contributor to take whatever they contribute into a non-free version.


There are companies for whom this is a viable business model -- an example is Transvirtual in Berkeley, who are applying this model to a commercial lightweight Java virtual machine and class library project. Some may claim that the number of contributors who would be turned off by such a model would be high, and that the GPL and non-GPL versions may branch; I would claim that if you treat your contributors right, perhaps even offer them money or other compensation for their contributions (it is, after all, helping your commercial bottom line), this model could work.


The open-source license space is sure to evolve over the next few years as people discover what does and does not work. The simple fact is that you are free to invent a new license that exactly describes where on the spectrum (represented by BSD on the right and GPL on the left) you wish to place it. Just remember, the more freedoms you grant those who use and extend your code, the more incented they will be to contribute.

Tools for Launching Open Source Projects


We have a nice set of available, well-maintained tools used in the Apache Project for allowing our distributed development process to work.


Most important among these is CVS, or the Concurrent Versioning System. It is a collection of programs that implement a shared code repository, maintaining a database of changes with names and dates attached to each change. It is extremely effective for allowing multiple people to simultaneously be the ``authors'' of a program without stepping on each other's toes. It also helps in the debugging process, as it is possible to roll back changes one by one to find out exactly where a certain bug may have been introduced. There are clients for every major platform, and it works just fine over dial-up lines or across long-distance connections. It can also be secured by tunneling it over an encrypted connection using SSH.


The Apache project uses CVS not just for maintaining the actual software, but also for maintaining our ``STATUS'' file, in which we place all major outstanding issues, with comments, opinions, and even votes attached to each issue. We also use it to register votes for decisions we make as a group, maintain our web site documents with it, manage development documents, etc. In short, it is the asset and knowledge management software for the project. Its simplicity may seem like a drawback -- most software in this space is expensive and full-featured -- but in reality simplicity is a very strong virtue of CVS. Every component of CVS is free -- the server and the clients.


Another essential element to an open-source project is a solid set of discussion forums for developers and for users. The software to use here is largely inconsequential -- we use Majordomo, but ezmlm or Smartlist or any of the others would probably be fine. The important thing is to give each development effort their own list, so that developers can self-select their interests and reasonably keep up with development. It's also smart to create a separate list for each project to which the CVS server emails changes that get made to the CVS repository, to allow for a type of passive peer review of changes. Such a model is actually very effective in maintaining code standards and discovering bugs. It may also make sense to have different lists for users and developers, and perhaps even distinguish between all developers and core developers if your project is large enough. Finally, it is important to have archives of the lists publicly available so that new users can search to see if a particular issue has been brought up in the past, or how something was addressed in the past.


Bug and issue tracking is also essential to a well-run project. On the Apache Project we use a GNU tool called GNATS, which has served us very well through 3,000+ bug reports. You want to find a tool that allows multiple people to answer bug reports, allows people to specialize on bugs in one particular component of the project, and allows people to read bug reports by email and reply to them by email rather than exclusively by a web form. The overriding goal for the bug database is that it should be as easy and automated as possible both for developers to answer bugs (because this is really a chore to most developers), and to search to see if a particular bug has already been reported. In essence, your bug database will become your repository for anecdotal knowledge about the project and its capabilities. Why is a particular behavior a feature and not a bug? Is anyone addressing a known problem? These are the types of questions a good bug database should seek to answer.


The open-source approach is not a magic bullet for every type of software development project. Not only do the conditions have to be right for conducting such a project, but there is a tremendous amount of real work that has to go into launching a successful project that has a life of its own. In many ways you, as the advocate for a new project, have to act a little like Dr. Frankenstein, mixing chemicals here, applying voltage there, to bring your monster to life. Good luck.
