Software Wars
Version 027, 11/07/2009
Words: 97474, Pages: 304
Keith Curtis
[email protected]
twitter: @keithccurtis
ISBN 978-0-578-01189-9
TABLE OF CONTENTS

Biotechnology Patents
Openness in Health Care
The Scope of Copyright
Length of Copyright
Fair Use
Digital Rights Management (DRM)
Music versus Drivers
Tools
Brief History of Programming
Lisp and Garbage Collection
Reliability
Portability
Efficiency
Maintainability
Functionality and Usability
Conclusion
The Java Mess
Sun locked up the code
Sun obsessed over specs
Sun locked up the design
Sun fragmented Java
Sun sued Microsoft
Java as GPL from Day 0
Pouring Java down the drain
Let's Start Today
The OS Battle
IBM
Red Hat
Novell
Debian
Ubuntu
Should Ubuntu Have Been Created?
One Linux Distro?
Apple
Windows Vista
Challenges for Free Software
More Free Software
Cash Donations
Devices
Reverse Engineering
PC Hardware
Fix the F'ing Hardware Bugs!
Metrics
Volunteers Leading Volunteers
Must PC vendors ship Linux?
The Desktop
Approachability
Monoculture
Linux Dev Tools
Backward Compatibility
Standards & Web
Digital Images
Digital Audio
The Next-Gen DVD Mess
MS's Support of Standards
OpenDocument Format (ODF)
Web
Da Future
Phase II of Bill Gates' Career
Space, or How Man Got His Groove Back
The Space Elevator
21st Century Renaissance
Warning Signs From the Future
Afterword
US v. Microsoft
Microsoft as a GPL Software Company
The Outside World
How to try Linux
Dedication
Acknowledgments
FREE SOFTWARE BATTLE
Some people think much faster computers are required for Artificial Intelligence, as well as new ideas. My own opinion is that
the computers of 30 years ago were fast enough if only we
knew how to program them.
John McCarthy, computer scientist, 2004
This IBM 305 RAMAC Computer, introduced in 1956, was the first computer
containing a (5 MB) hard drive on 24 huge spinning platters. Today you can
get 1000 times more memory in something the size of your thumb.
The digital version of this book has a number of hyperlinked words that take you
to references, like this video of writer Cory Doctorow at a Red Hat Summit.
body of work will lead not just to something that works, but eventually to the best that the world can achieve! Better cooperation among our scientists will lead to robot-driven cars, pervasive robotics, artificial intelligence, and much faster progress in biology, all of which rely heavily on software.
A later chapter will describe the software freedoms in more
detail, and the motivations for programmers to use and write free
software, but it is important to clarify here that free software generally means that the source code is made available to its users.
Microsoft's Internet Explorer is not free because it requires a Windows license, but more importantly, you cannot download the source
code to learn how it works.
Today, proprietary software is considered more valuable than free
software because its owners charge for a black box, but that thinking is exactly backwards. Proprietary software is less valuable
because you cannot learn how it works, let alone improve it. It cannot make you better, and you cannot make it better. It is true that
not everyone will exercise the right to read and change their software, just as not everyone exercises their right to freedom of the press, but that doesn't make the freedom any less valuable!
The most important piece of free software is the Linux (pronounced Lin-ex) operating system, named after its founder Linus Torvalds, who started coding it in college. While Linux is generally not used on desktops today, it and other free software run on 60% of all websites, an increasing number of cellphones, and 75% of the world's top 500 fastest supercomputers.
For its part, Microsoft has fiercely fought against Linux and the
trend towards free software by pretending it is just another proprietary competitor. With $28 billion in cash, dominant market share in
Windows, Office and Internet Explorer, and an army of thousands of
experienced programmers, Microsoft is a focused and enduring
competitor.
Microsoft is the largest proprietary software company, but others
have adopted its philosophy of hoarding all knowledge, no matter
how irrelevant to their bottom line or useful to others. Google, the dominant player in Internet search, relies heavily on free software and considers it an important part of its success, but it is very secretive and protects nearly all the software it produces. It is a black hole of free software: innovation enters but never leaves.
This is all perfectly legal and ethical, and the free market gives
everyone an unfettered right to innovate in any way, create any
license agreement, and charge anything for a product. But free software is not just a competitor; it is a different way of creating software.
The free software community has long threatened to take over the
world. Evangelist Eric Raymond once growled to a Microsoft VIP
that he was their worst nightmare. That was in the mid-1990s, when Microsoft's stock price chart was one long, steep climb.
iBio
I first met Bill Gates at the age of twenty. He stood in the yard of
his Washington lake-front home, Diet Coke in hand, a tastefully
small ketchup stain on his shirt, which no one had the courage to
point out, and answered our questions, in-turn, like a savant. As a
Writing software is a craft, like carpentry. While you can read books on programming languages and software algorithms, you can't learn the countless
details of a craft from a book. You must work with experts on real-world
problems. Before free software, you had to join a company like Microsoft.
GLOSSARY
Bit: A piece of information that can hold 2 values: 1 and 0. Bits are grouped into bytes of 8; a character takes 2 bytes (Unicode), a number typically 4 bytes, and a picture many more.1
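As a rough illustration of those sizes, here is a small C program of mine (not from the book); the byte counts it prints are typical values, not guarantees made by the C standard.

/* Prints a few typical sizes of data on a modern machine. */
#include <stdio.h>

int main(void)
{
    printf("char: %zu byte (8 bits)\n", sizeof(char));     /* always 1 byte   */
    printf("int:  %zu bytes\n", sizeof(int));              /* usually 4 bytes */
    printf("bytes in an uncompressed 1024x768 24-bit picture: %d\n",
           1024 * 768 * 3);                                /* lots: 2,359,296 */
    return 0;
}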
Digitize: Process of converting something into 1s and 0s. Once something
is in a digital format, it can be infinitely manipulated by a computer.
Software: General term used to describe a collection of computer programs, procedures and documentation that perform tasks on a computer.
Function: The basic building block of software is a function, which is a discrete piece of code which accomplishes a task:
int SquareNumber (int n)
{
return n * n;
}
Machine language: At the lowest level, software is a bunch of bits that
represent an ordered sequence of processor-specific instructions to change
the state of the computer.
High-level language: A programming language that looks more like English.
Compiler: Software that (typically) converts a high-level language into a
machine language.
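To tie these three definitions together, here is the SquareNumber function from above with the sort of machine instructions a compiler might produce for it; the x86-64 instructions in the comment are typical output, not an exact quote from any particular compiler.

int SquareNumber (int n)
{
    return n * n;
    /* A compiler might translate this high-level code into machine
       language roughly like:
           mov  eax, edi   ; copy n into the register used for return values
           imul eax, edi   ; multiply it by itself
           ret             ; hand control back to the caller              */
}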
Kernel: The lowest level of an operating system that initializes and manages hardware.
Hardware: Physical interconnections and devices required to store and run
software.
Processor: Hardware that executes the programmer's instructions.
Hard drive: Spinning magnetic platters where bits are stored even after
the computer is turned off.
Memory: Hardware which provides fast access to bits of code and data for
the processor. A processor can only manipulate data after it has loaded
them into memory from the hard drive or network.
URL (Uniform Resource Locator): The textual location of a webpage, picture, etc. on the Internet. You can hand a URL to any computer in the world that understands the Internet and it will return the same thing. (It might notice that you prefer a version of the page in your language.) An email address is also a URL. The one thing everything on the Internet has is a URL.
1. Like a number of places in this book, some of this text was taken from Wikipedia.
WIKIPEDIA
A good friend of mine teaches High School in Bed-Stuy, Brooklyn, pretty much the hood. Try to imagine this classroom; it
involves a lot of true stereotypes. But what does NOT fit the
stereotype is that he started a class wiki, and has all his students contribute to it. Instead of a total mess, instead of abuse,
graffiti and sludge, it's raised the level of ALL the students. It's
a peer environment: once it becomes cool to do it right, to be
right, abuse and problems dry up almost completely.
Slashdot.org commentator
My school blocks Wikipedia entirely. When asked why, the
answer is anybody can edit it. As opposed to the rest of the
Internet which is chock-full of nothing but the highest quality,
peer-reviewed content, written universally by the finest
experts, hand selected from across the world?
Slashdot.org commentator
One of the great movements in my lifetime among educated
people is the need to commit themselves to action. Most people
are not satisfied with giving money; we also feel we need to
work. That is why there is an enormous surge in the number of
unpaid staff, volunteers. The needs are not going to go away.
Business is not going to take up the slack, and government cannot.
Peter Drucker, father of modern management
Compared to a paper encyclopedia, a digital edition has significant advantages. The biggest is cost, as printing and shipping a
50,000-page document represents an enormous expense in the production of an encyclopedia. The digital realm has other significant
advantages: the content can be constantly updated and multimedia
features can be incorporated. Why read about the phases of a 4-stroke internal combustion engine when you can watch one in action?
In the mid-1990s, Microsoft created Encarta, the first CD-ROM
based digital encyclopedia. CDs were a natural evolution for Microsoft because it was shipping its ever-growing software on an
increasingly large number of floppy disks. (Windows NT 3.1,
released in 1993, required 22 floppies. CDs quickly became more
cost-effective, as they hold 500 times more data, and are more reliable and faster, and Microsoft played an important role in introducing CD-ROM drives as a standard feature of computers.)
While CDs hold more data than floppies and are an important
technological advancement, this development was soon eclipsed by
the arrival of the web. Users could connect to a constantly-updated
encyclopedia of unlimited size from any computer without installing
it first.
Unfortunately for Microsoft, the Encarta team was slow in adopting the Internet because they felt some of the richness of its encyclopedia was lost on the web. However, with pictures, animations
and text, even the early web was good enough and had substantial
advantages over a CD-ROM version. In the Internet realm, you only
need one Wikipedia, albeit running on hundreds of servers, for the
entire world; you don't even need to worry about the cost to make a
copy of an encyclopedia.
However, the biggest mistake the Encarta team made was not
realizing that the Internet could introduce feedback loops. The users
of an Internet encyclopedia can also become enhancers of it. If I
have a question about what I've read, or I think I've found a problem, I can post a question or fix the problem and report what I've
accomplished.
We will discuss later whether the ability for anyone to edit, enhance or add data will hurt quality, but it is important to remember that it was the creation of the Internet that allows people in all corners of the world to work together and learn from each other, a completely new capability for man.
For all its faults, Wikipedia became larger than the Encyclopedia Britannica in just 2.5 years. The database now contains more than
15 times as many articles, and is already the best compendium of
human knowledge ever created. No corporation invested millions of
dollars in engineering or marketing either; it happened seemingly
on its own. Even if some of those articles are fluff about Star Trek
characters, many are not: Wikipedia's articles on carbon nanotubes and many other scientific topics are more detailed and more up to date than Encyclopedia Britannica's. The vast depth of Wikipedia is
also a living refutation of perhaps the biggest criticism of free software and free content: that no one will work on the most arcane and
boring stuff. The lesson here is that different things are interesting
to different people.
Wikipedia is one of the 10 most popular websites on the Internet,
receiving 450 times the traffic of Encyclopedia Britannica, and with
an article collection that continues to grow at an exponential rate.
As Wikipedia has advanced, it has also added a multimedia collection, a dictionary, a compendium of quotes, textbooks, and a news aggregator, and they are just getting started.
In some ways, access to a search engine might seem to obviate
the need for an encyclopedia. But while search engines provide a
keyword index to the Internet, they do not replace the importance of
an encyclopedia: a comprehensive, coherent, neutral compendium of human knowledge.
LINUX
Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect.
Linus Torvalds, 2003
dows kernels. Microsoft has said that it has bet the company on
Windows, and this is not an understatement! If the Windows kernel
loses to Linux, then Windows, and Microsoft, is also lost.1
The Linux kernel is not popular on desktops yet, but it is widely
used on servers and embedded devices because it supports thousands of devices and is reliable, clean, and fast. Those qualities are
even more impressive when you consider its size: printing out the
Linux kernel's 8,000,000 lines of code would create a stack of paper
30 feet tall! The Linux kernel represents 4,000 man-years of engineering; 80 different companies and 3,000 programmers have contributed to Linux over just the last couple of years.
That 30-foot stack of code is just the basic kernel. If you include a
media player, web browser, word processor, etc., the amount of free
software on a computer running Linux might be 10 times the kernel,
requiring 40,000 man-years and a printout as tall as a 30-story
building.
These 40 man-millennia don't even include the work of users reporting bugs, writing documentation, creating artwork, translating strings, and performing other non-coding tasks. The resulting Linux-based free software stack is an effort comparable in complexity to the Space Shuttle. We can argue about whether there are any motivations to write free software, but we can't argue about whether it already exists!
One of the primary reasons I joined Microsoft was that I believed their Windows NT (New Technology) kernel, which is still alive in Windows Vista today, was going to dominate the brains of computers, and eventually even robots. One of Bill Gates' greatest coups was recognizing that the original Microsoft DOS kernel, the source of most of its profits, and which became the Windows 9x kernel, was not a noteworthy engineering effort. In 1988, Gates recruited David Cutler from Digital Equipment Corporation, a veteran of ten operating systems, to design the product and lead the team that built the Windows NT kernel, which was released as I joined in 1993.
The kernel Cutler and his team developed looks like this:
are people all over the world, from Sony to Cray, who tweaked it to
get it to run on their hardware. If Windows NT had been free from
the beginning, there would have been no reason to create Linux.
However, now that there is the free and powerful Linux kernel,
there is no longer any reason but inertia to use a proprietary kernel.
There are a number of reasons for the superiority of the Linux
kernel. But first, I want to describe the software development
process. When you understand how the Linux kernel is built, its
technical achievements are both more impressive and completely
logical.
Distributed Development
In Linux we reject lots of code, and that's the only way to create
a quality kernel. It's a bit like evolutionary selection: breathtakingly wasteful and incredibly efficient at the same time.
Ingo Molnar, Linux kernel developer
A portion of the speaker list for the 2006 Linux Kernel Symposium, which the author attended. Linux kernel development is a distributed effort, which greatly enhances its perspective.
Every 20th century management book I've read assumes that team
members work in the same building and speak the same language.
Layers of the Linux kernel onion; the labeled layers include security, network and file systems, init and memory manager, and crypto. The Linux kernel is 50% device drivers, and 25% CPU-specific code. The two inner layers are very generic.
Notice that it is built as an onion and is composed of many discrete components. The outermost layer of the diagram is the device drivers, which are 50% of the code; more than 75% of the kernel's code is hardware-specific. The Microsoft Windows NT kernel diagram,
shown several pages back, puts all the device drivers into a little
box in the lower left-hand corner, illustrating the difference between
theory and reality. In fact, if Microsoft had drawn the kernel mode
drivers box as 50% of the Windows NT diagram, they might have
understood how a kernel is mostly hardware-specific code, and
reconsidered whether it was a business they wanted to get into.
Refactoring (smoothing, refining, simplifying, polishing) is done
continuously in Linux. If many drivers have similar tasks, duplicate
logic can be pulled out and put into a new subsystem that can then
be used by all drivers. In many cases, it isn't clear until a lot of code is written that this new subsystem is even worthwhile. There are a number of components in the Linux kernel that evolved out of duplicate logic in multiple places. This flexible but practical approach to writing software has led Linus Torvalds to describe Linux as "evolution, not intelligent design."
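Here is a contrived sketch in C, not taken from the kernel, of the kind of refactoring described above: retry logic that two drivers each duplicated gets pulled out into a shared helper both can call.

#include <stdbool.h>

/* New shared subsystem: retry a hardware operation a few times. */
static bool retry_io(bool (*op)(void *dev), void *dev, int attempts)
{
    for (int i = 0; i < attempts; i++)
        if (op(dev))
            return true;    /* the operation succeeded */
    return false;           /* give up after 'attempts' tries */
}

/* Each driver now supplies only its device-specific piece. */
static bool sata_read_block(void *dev) { (void)dev; /* ... touch the hardware ... */ return true; }
static bool usb_read_packet(void *dev) { (void)dev; /* ... touch the hardware ... */ return true; }

bool sata_read(void *dev) { return retry_io(sata_read_block, dev, 3); }
bool usb_read(void *dev)  { return retry_io(usb_read_packet, dev, 5); }

Once enough drivers contain a loop like this, moving it into one place is exactly the sort of cleanup that makes up so many kernel changes.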
One could argue that evolution is a sign of bad design, but evolution of Linux only happens when there is a need unmet by the current software. Linux initially supported only the Intel 80386 processor because that was what Linus owned. Linux evolved, via the work of many programmers, to support more processors than Windows, and more than any other operating system ever has.
There is also a virtuous cycle here: the more code gets refactored,
the less likely it is that a code change will cause a regression; the
more code changes don't cause regressions, the more code can be
refactored. You can think about this virtuous cycle two different
ways: clean code will lead to even cleaner code, and the cleaner the
code, the easier it is for the system to evolve, yet still be stable.
Andrew Morton has said that the Linux codebase is steadily improving in quality, even as it has tripled in size.
Greg Kroah-Hartman, maintainer of the USB subsystem in Linux,
has told me that as USB hardware design has evolved from version
1.0 to 1.1 to 2.0 over the last decade, the device drivers and internal
kernel architecture have also dramatically changed. Because all of
the drivers live within the kernel, when the architecture is altered to
support the new hardware requirements, the drivers can be
adjusted at the same time.
Microsoft doesn't have a single tree with all the device drivers.
Because many hardware companies have their own drivers floating
around, Microsoft is obligated to keep the old architecture around
so that old code will still run. This increases the size and complexity
of the Windows kernel, slows down its development, and in some
cases reveals bugs or design flaws that can't even be fixed. These
Many of the Linux kernel's code changes are polish and cleanup. Clean
code is more reliable and maintainable, and reflects the pride of the free
software community.
2. This should arguably be expressed as XML, but because there is common code
that reads these values and provides them to applications, and because each file
contains only one value, this problem isn't very significant; the kernel's configuration information will never be a part of a web mashup.
These studies have limited value because their tools usually analyze just a few types of coding errors. Then the bugs they find make the IT news and get fixed quickly because of the publicity, which then makes the study meaningless. However, these tools do allow for comparisons between codebases. I believe the best analysis of the number of Linux bugs is the 1,400 bugs in its bug database, which for 8.2 million lines of code is 0.17 bugs per 1,000 lines of code. This is a tiny number, though it could easily be another 100 times smaller. Here is a link to the Linux kernel's active bugs: http://tinyurl.com/LinuxBugs.
System call graph to return a picture in the free web server Apache.
This is a way to add additional security because the operating system can say, for example: because a media player has no reason to write files to disk, the system can take away this permission. Before the kernel tries to do anything interesting, it will ask the Mandatory Access Control (MAC) system whether such an operation is allowed. The security checks in most other operating systems simply ask if the person is allowed to do something.
Creating a default policy adds additional work for application writers, and by
itself doesn't entirely solve the problem. A word processor needs complete read
and write access, so how do you solve the problem of a virus in a document macro
opening all of your files and writing junk? SELinux doesn't deal with this situation
because it doesn't have this information. In GC programming languages, it is possible to walk the stack and determine more information about whether a macro,
or the word processor itself, is asking to open a file.
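A toy sketch of that idea in C follows; the names and types are made up for illustration and are not the real SELinux or kernel API. The point is only that the policy is consulted based on which program is acting, not just which user.

#include <stdbool.h>
#include <string.h>

enum mac_op { MAC_READ_FILE, MAC_WRITE_FILE, MAC_OPEN_SOCKET };

/* The policy question: may this program perform this operation? */
static bool mac_allowed(const char *program, enum mac_op op)
{
    /* Example policy: a media player has no reason to write files. */
    if (strcmp(program, "media_player") == 0 && op == MAC_WRITE_FILE)
        return false;
    return true;
}

/* Every interesting kernel operation would ask the policy first. */
int write_file_checked(const char *program, const char *path)
{
    if (!mac_allowed(program, MAC_WRITE_FILE))
        return -1;          /* denied by the mandatory policy */
    (void)path;             /* ... perform the actual write here ... */
    return 0;
}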
The alternative is for each component to use the previous version of all of its
dependent components, which means that the features in the latest Internet
Explorer wouldn't show up in various places that are using the old version. However, does the Help system need the latest version?
ware has two. Presumably they weren't satisfied with the features
Windows provided, and weren't able to fix them. And so they had to
build new applets from scratch! This is also what gives Windows a
feeling of a jumble of components slapped together.
Here are five of the 100 applets IBM adds to Windows:
Windows XP with 5 of IBM's 100 extra applets. Notice the large number
of status icons on this almost-virgin installation.
Building all of these applets, designing multilingual user interfaces, providing the means to install and configure, etc. is ten times
more work than merely writing the device driver, leveraging other
shipping drivers, and uploading it to the official codebase.
The reason my Photodesk printer driver didn't work on Windows Server 2003 was a crash in the installation code, which HP shouldn't even have been bothering with in the first place.
In Windows Vista, Microsoft moved some of the device drivers to user mode, but they should have kept the device drivers small, simple, and in the kernel, and instead moved the widgets and fluff to user mode.
long to ship, the free software community has often added features
before Microsoft. The website http://kernelnewbies.org displays the
latest list of the Linux features added since the previous release 3-4
months before, and it is typically 15 pages long! For example, here
is just the list of driver features added to the 2.6.26 version of the
Linux kernel, which had a 3-month dev cycle.
Linux 2.6.26 driver work items
4.1. IDE/SATA
IDE
Add warm-plug support for IDE devices
Mark "idebus=" kernel parameter as obsoleted (take 2)
Remove ide=reverse IDE core
Add "vlb|pci_clock=" parameter
Add "noacpi" / "acpigtf" / "acpionboot" parameters
Add "cdrom=" and "chs=" parameters
Add "nodma|noflush|noprobe|nowerr=" parameters
Add Intel SCH PATA driver
Add ide-4drives host driver (take 3)
gayle: add "doubler" parameter
Remove the broken ETRAX_IDE driver
SATA
sata_inic162x: add cardbus support
libata: prefer hardreset
ata: SWNCQ should be enabled by default
Make SFF support optional
libata: make PMP support optional
sata_mv: disable hotplug for now, enable NCQ on SOC, add basic
port multiplier support
sata_fsl: Fix broken driver, add port multiplier (PMP) support
4.2. Networking
ssb: add a new Gigabit Ethernet driver to the ssb core
Add new qeth device driver,
Add new ctcm driver that replaces the old ctc one,
New driver "sfc" for Solarstorm SFC4000 controller.
Driver for IXP4xx built-in Ethernet ports
Add support the Korina (IDT RC32434) Ethernet MAC
iwlwifi: Support the HT (802.11n) improvements, add default WEP key host command, add 1X HW WEP support, add default WEP HW encryption, use HW acceleration decryption by default, hook iwlwifi with Linux rfkill, add TX/RX statistics to driver, add debugfs to iwl core, enable HW TKIP encryption, add LED support, enable RX TKIP decryption in HW, remove IWL{4965,3945}_QOS
ath5k: Add RF2413 srev values, add RF2413 initial settings,
identify RF2413 and deal with PHY_SPENDING, more RF2413
stuff, port to new bitrate/channel API, use software encryption for
now
pasemi_mac: jumbo frame support, enable GSO by default, basic
ethtool support, netpoll support
rt2x00: Add per-interface structure, enable master and adhoc
mode again, enable LED class support for rt2500usb/rt73usb
e1000e: Add interrupt moderation run-time ethtool interface, add
support for BM PHYs on ICH9
niu: Add support for Neptune FEM/NEM cards for C10 server
blades, add Support for Sun ATCA Blade Server.
gianfar: Support NAPI for TX Frames
ehea: Add DLPAR memory remove support
sfc: Add TSO support
b43: Add QOS support, add HostFlags HI support, use SSB block I/O to do PIO
S2io: Multiqueue network device support implementation, enable multi ring support, added napi support when MSIX is enabled.
ixgbe: Introduce MSI-X queue vector code, introduce Multiqueue
TX, add optional DCA infrastructure, introduce adaptive interrupt
moderation
uli526x: add support for netpoll
fmvj18x_cs: add NextCom NC5310 rev B support
zd1211rw: support for mesh interface and beaconing
libertas: implement SSID scanning for SIOCSIWSCAN
ethtool: Add support for large eeproms
The scheduled bcm43xx removal
4.6. Video
cx88: Add support for the Dvico PCI Nano, add xc2028/3028
boards, add support for tuner-xc3028
saa7134: add support for the MSI TV@nywhere A/D v1.1 card,
add support for the Creatix CTX953_V.1.4.3 Hybrid
saa717x: add new audio/video decoder i2c driver
Support DVB-T tuning on the DViCO FusionHDTV DVB-T Pro
Add support for xc3028-based boards
ivtv: add support for Japanese variant of the Adaptec AVC-2410
Add basic support for Prolink Pixelview MPEG 8000GT
bttv: added support for Kozumi KTV-01C card
Add support for Kworld ATSC 120
CX24123: preparing support for CX24113 tuner
Added support for Terratec Cinergy T USB XXS
budget: Add support for Fujitsu Siemens DVB-T Activy Budget
Support for DVB-S demod PN1010 (clone of S5H1420) added
Added support for SkyStar2 rev2.7 and ITD1000 DVB-S tuner
em28xx-dvb: Add support for HVR950, add support for the HVR900
Add support for Hauppauge HVR950Q/HVR850/FusioHDTV7-USB
HVR950Q Hauppauge eeprom support
Adding support for the NXP TDA10048HN DVB OFDM
demodulator
Add support for the Hauppauge HVR-1200
pvrusb2-dvb: add DVB-T support for Hauppauge pvrusb2 model
73xxx
Add support for Beholder BeholdTV H6
cx18: new driver for the Conexant CX23418 MPEG encoder chip
s5h1411: Adding support for this ATSC/QAM demodulator
4.7. SCSI
zfcp: Add trace records for recovery thread and its queues, add traces for state changes, trace all triggers of error recovery activity, register new recovery trace, remove obsolete erp_dbf trace, add trace records for recovery actions
qla2xxx: Add support for host supported speeds FC transport attribute, add FC-transport Asynchronous Event Notification support, add hardware trace-logging support, add Flash Descriptor Table layout support, add ISP84XX support, add midlayer target/device reset support
iscsi: extended cdb support, bidi support at the generic libiscsi
level, bidi support for iscsi_tcp
scsi_debug: support large non-fake virtual disk
gdth: convert to PCI hotplug API
st: add option to use SILI in variable block reads
megaraid_sas: Add the new controller(1078DE) support to the
driver
m68k: new mac_esp scsi driver
bsg: add large command support
Add support for variable length extended commands
aacraid: Add Power Management support
dpt_i2o: 64 bit support, sysfs
Firmware: add iSCSI iBFT Support
4.8. WATCHDOG
Add a watchdog driver based on the CS5535/CS5536 MFGPT
timers
Add ICH9DO into the iTCO_wdt.c driver
4.9. HWMON
thermal: add hwmon sysfs I/F
ibmaem: new driver for power/energy/temp meters in IBM System
X hardware
i5k_amb: support Intel 5400 chipset
4.10. USB
ISP1760 HCD driver
The scheduled ieee80211 softmac removal
The scheduled rc80211-simple.c removal
Remove obsolete driver sk98lin
Remove the obsolete xircom_tulip_cb driver
4.3. Graphics
radeon: Initial r500 support
intel_agp: Add support for Intel 4 series chipsets
i915: Add support for Intel series 4 chipsets
Add support for Radeon Mobility 9000 chipset
fb: add support for foreign endianness
pxafb: preliminary smart panel interface support,
Driver for Freescale 8610 and 5121 DIU
intelfb: add support for the Intel Integrated Graphics Controller
965G/965GM
Add support for Blackfin/Linux logo for framebuffer console
4.4. Sound
hda-codec - Allow multiple SPDIF devices, add SI HDMI codec
support, add support for the OQO Model 2, add support of Zepto
laptops, support RV7xx HDMI Audio, add model=mobile for
AD1884A & co, add support of AD1883/1884A/1984A/1984B, add
model for cx20549 to support laptop HP530, add model for alc883
to support FUJITSU Pi2515, add support for Toshiba Equium L30,
Map 3stack-6ch-dig ALC662 model for Asus P5GC-MX, support of
Lenovo Thinkpad X300, add Quanta IL1 ALC267 model, add
support of AD1989A/AD1989B, add model for alc262 to support
Lenovo 3000, add model for ASUS P5K-E/WIFI-AP, added support
for Foxconn P35AX-S mainboard, add drivers for the Texas
Instruments OMAP processors, add support of Medion RIM 2150,
support IDT 92HD206 codec
ice1724 - Enable AK4114 support for Audiophile192
ice1712: Added support for Delta1010E (newer revisions of
Delta1010), added support for M-Audio Delta 66E, add Terrasoniq
TS88 support
Davinci ASoC support
intel8x0 - Add support of 8 channel sound
ASoC: WM9713 driver
Emagic Audiowerk 2 ALSA driver.
Add PC-speaker sound driver
oxygen: add monitor controls
virtuoso: add Xonar DX support
soc - Support PXA3xx AC97
pxa2xx-ac97: Support PXA3xx AC97
4.5. Input
Add support for WM97xx family touchscreens
WM97xx - add chip driver for WM9705 touchscreen, add chip
driver for WM9712 touchscreen, add chip driver for WM97123
touchscreen, add support for streaming mode on Mainstone
wacom: add support for Cintiq 20WSX
xpad: add support for wireless xbox360 controllers
Add PS/2 serio driver for AVR32 devices
aiptek: add support for Genius G-PEN 560 tablet
Add Zhen Hua driver
HID: force feedback driver for Logitech Rumblepad 2, Logitech
diNovo Mini pad support
4.6. Video
V4L2 soc_camera driver for PXA270
Add support for the MT9M001 camera
Add support for the MT9V022 camera
Add support for the ISL6405 dual LNB supply chip
Initial DVB-S support for MD8800 /CTX948
cx23885: Add support for the Hauppauge HVR1400, add generic
cx23417 hardware encoder support
Add mxl5505s driver for MaxiLinear 5505 chipsets, basic digital
support.
pxa27x_udc driver
CDC WDM driver
Add Cypress c67x00 OTG controller core driver
Add HP hs2300 Broadband Wireless Module to sierra.c
Partial USB embedded host support
Add usb-serial spcp8x5 driver
r8a66597-hcd: Add support for SH7366 USB host
Add Zoom Telephonics Model 3095F V.92 USB Mini External
modem to cdc-acm
Support for the ET502HS HDSPA modem
atmel_usba_udc: Add support for AT91CAP9 UDPHS
4.11. FireWire
release notes at linux1394-user
4.12. Infiniband
IPoIB: Use checksum offload support if available, add LSO
support, add basic ethtool support, support modifying IPoIB CQ
event moderation, handle 4K IB MTU for UD (datagram) mode
ipath: Enable 4KB MTU, add code to support multiple link speeds
and widths, EEPROM support for 7220 devices, robustness
improvements, cleanup, add support for IBTA 1.2 Heartbeat
Add support for IBA7220
mthca: Add checksum offload support
mlx4: Add checksum offload support, add IPoIB LSO support to
mlx4,
RDMA/cxgb3: Support peer-2-peer connection setup
4.13. ACPI and Power Management
ACPICA: Disassembler support for new ACPI tables
eeepc-laptop: add base driver, add backlight, add hwmon fan
control
thinkpad-acpi: add sysfs led class support for thinklight (v3.1),
add sysfs led class support to thinkpad leds (v3.2)
Remove legacy PM
4.14. MTD
m25p80: add FAST_READ access support to M25Pxx, add Support
for ATMEL AT25DF641 64-Megabit SPI Flash
JEDEC: add support for the ST M29W400DB flash chip
NAND: support for pxa3xx
NOR: Add JEDEC support for the SST 36VF3203 flash chip
NAND: FSL UPM NAND driver
AR7 mtd partition map
NAND: S3C2410 Large page NAND support
NAND: Hardware ECC controller on at91sam9263 / at91sam9260
4.15. I2C
Add support for device alias names
Convert most new-style drivers to use module aliasing
Renesas SH7760 I2C master driver
New driver for the SuperH Mobile I2C bus controller
Convert remaining new-style drivers to use module aliasing
4.16. Various
MMC: OMAP: Add back cover switch support
MMC: OMAP: Introduce new multislot structure and change
driver to use it
mmc: mmc host test driver
4981/1: [KS8695] Simple LED driver
leds: Add mail LED support for "Clevo D400P"
leds: Add support to leds with readable status
leds: Add new driver for the LEDs on the Freecom FSG-3
RAPIDIO:
Add RapidIO multi mport support
Add OF-tree support to RapidIO controller driver
Add serial RapidIO controller support, which includes MPC8548,
MPC8641
edac: new support for Intel 3100 chipset
Basic braille screen reader support
ntp: support for TAI
RTC: Ramtron FM3130 RTC support
Don't worry if you don't understand what these things mean; I don't either. It is just important to understand that even the hardware of a computer is too big and complicated for one company to oversee its development.
Two of the biggest differences in strategy are User Mode Linux (UML), which changes Linux to run as an application on top of another instance of Linux, and standard virtualization, which runs the guest OS in kernel mode, though it doesn't actually talk to the hardware. The Linux kernel is evolving towards figuring out the architecture, and what is shared between the different strategies.
Charging for an OS
A Linux operating system is an entirely different beast compared
to a Microsoft operating system. Microsoft was constantly torn
about how much value to invest in Windows, and how much to set
aside for extra licensing revenue in other products. Windows Vista has five different versions (originally they announced eight!), each with basically the same code, but with dramatically different prices:
Windows Vista prices at Amazon.com, one row per edition:

Upgrade    New
$243       $350
$176       $260
$140       $219
$85        $157
Microsoft charges $85 to $350 for Windows Vista, but the code in each version is 99% the same.
This was all before the invention of the Internet, whereby the number of users of a server could easily be in the thousands, which made many usages of CALs (Client Access Licenses) expensive and unsustainable.
Therefore, Microsoft moved towards a model where the cost was
based on the number of processors in the computer so that little
boxes would cost less than big boxes.
This model worked until Intel introduced the concept of hyperthreading, which fools a computer into thinking there are two processors inside the computer, but which adds only 15-30% more
performance. Microsoft's customers would of course be unhappy at
the thought of purchasing a bunch of new licenses for such a small
performance improvement, so Microsoft ended up giving free
licenses for hyperthreaded processors.
Then, virtualization was created:
Recently, I saw an ad for a trivial software applet capable of converting a DVD to the iPod video format. Microsoft managed to convince everybody that each little software product is worth selling.
Now you see why Stallman's analogy of proprietary software as tollbooths is apt: they serve as permanent, pervasive obstacles holding up progress for software that was written years earlier. Free
software flows with much less friction. In fact, in the free software
world, the definition of a PC operating system completely changes.
The One Laptop Per Child has as much CPU power as a workstation of 15
years ago, but would be just a shiny box without free software and content.
saging, but it also includes tools for making pictures and music, children's applications, server software, the Bible, development tools,
and much more.
Audacity is the most popular free audio editor on Linux. It doesn't have a talking paper clip ("It looks like you're trying to add echo. Would you like some help?"), but it does provide a well-rounded set of features, and has many effects for the manipulation of sound.
Editing
Easy editing with Cut, Copy, Paste, and Delete.
Use unlimited Undo (and Redo) to go back any number of steps.
Very fast editing of large files.
Edit and mix an unlimited number of tracks.
Use the Drawing tool to alter individual sample points.
Fade the volume up or down smoothly with the Envelope tool.

Effects
Change the pitch without altering the tempo.

Sound Quality
Record and edit 16-bit, 24-bit, and 32-bit samples.

Plug-Ins
Add new effects with LADSPA plug-ins.
Audacity includes some sample plug-ins by Steve Harris.
Load VST plug-ins for Windows and Mac, with the optional VST Enabler.
Write new effects with the built-in Nyquist programming language.

Analysis
Spectrogram mode for visualizing frequencies.
"Plot Spectrum" command for detailed frequency analysis.
Developers of free software applications tend to build extensibility plugins as a fundamental way of writing their software because
they know their tool will never by itself be able to do all the things
people will want. A plugin provides a boundary between things that
manage data, and things that manipulate it. The most popular plugins eventually become a part of the base system, but by being built
separately, they have forced clean boundaries and modularity.9
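As an illustration of that boundary, here is a minimal plugin interface in C; it is my sketch of the general pattern, not Audacity's actual API. The host owns and manages the audio data, and a plugin only manipulates the samples handed to it.

#include <stddef.h>

/* What the host hands to every plugin: the data to chew on. */
struct audio_buffer {
    float  *samples;
    size_t  count;
    int     sample_rate;
};

/* The entire contract between the host and a plugin. */
struct effect_plugin {
    const char *name;
    void (*process)(struct audio_buffer *buf, void *state);
    void *state;
};

/* Example plugin: halve the volume of every sample. */
static void halve_volume(struct audio_buffer *buf, void *state)
{
    (void)state;                        /* this effect keeps no state */
    for (size_t i = 0; i < buf->count; i++)
        buf->samples[i] *= 0.5f;
}

static struct effect_plugin gain_plugin = { "Halve volume", halve_volume, NULL };

/* The host can apply any plugin without knowing what it does inside. */
void host_apply(struct effect_plugin *p, struct audio_buffer *buf)
{
    p->process(buf, p->state);
}

The host never needs to change when a new effect is written; it just calls process() through the same boundary.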
Every application that Linux has that Windows doesn't is a feature Windows is missing:
Richard Stallman's free software vision realized: A free Linux operating system contains an entire store of free applications available with one click,
and built to work together. Having so many tools at your disposal makes
computers more personal, powerful, productive, and enjoyable. Your computing experience becomes limited only by your creativity.
I argue in another place in the book that software has no clear boundaries. What I
meant was that one never really knows precisely what the interface between manager and manipulator should be. For audio files, the boundary seems clear: here is
some audio data, chew on it. However even there you must ask: what DSP APIs
are available to the plugins? Otherwise, each plugin will need lots of duplicate
code that the manager already likely has! It is the new hardware capabilities that
create a need for a change at this boundary. The lesson here is to keep your
boundaries simple, but assume you may need to change them.
From: Bill Gates
Sent: Wednesday, January 15, 2003 10:05 AM
To: Jim Allchin
Cc: Chris Jones (WINDOWS); Bharat Shah (NT); Joe Peterson; Will Poole; Brian Valentine; Anoop Gupta (RESEARCH)
Subject: Windows Usability degradation flame
I am quite disappointed at how Windows Usability has been
going backwards and the program management groups don't
drive usability issues.
Let me give you my experience from yesterday.
I decided to download (Moviemaker) and buy the Digital Plus
pack ... so I went to Microsoft.com. They have a download place
so I went there.
The first 5 times I used the site it timed out while trying to
bring up the download page. Then after an 8 second delay I got
it to come up.
This site is so slow it is unusable.
It wasn't in the top 5 so I expanded the other 45.
These 45 names are totally confusing. These names make stuff
like: C:\Documents and Settings\billg\My Documents\My Pictures seem clear.
They are not filtered by the system ... and so many of the things
are strange.
I tried scoping to Media stuff. Still no moviemaker. I typed in
movie. Nothing. I typed in movie maker. Nothing.
So I gave up and sent mail to Amir saying - where is this
Moviemaker download? Does it exist?
So they told me that using the download page to download
something was not something they anticipated.
They told me to go to the main page search button and type
movie maker (not moviemaker!)
I tried that. The site was pathetically slow but after 6 seconds
of waiting up it came.
I thought for sure now I would see a button to just go do the
download.
In fact it is more like a puzzle that you get to solve. It told me to
go to Windows Update and do a bunch of incantations.
This struck me as completely odd. Why should I have to go
somewhere else and do a scan to download moviemaker?
So I went to Windows update. Windows Update decides I need
to download a bunch of controls. (Not) just once but multiple
times where I get to see weird dialog boxes.
Doesn't Windows update know some key to talk to Windows?
Then I did the scan. This took quite some time and I was told it
was critical for me to download 17megs of stuff.
This is after I was told we were doing delta patches to things
but instead just to get 6 things that are labeled in the SCARIEST possible way I had to download 17meg.
So I did the download. That part was fast. Then it wanted to do
an install. This took 6 minutes and the machine was so slow I
couldn't use it for anything else during this time.
What the heck is going on during those 6 minutes? That is
crazy. This is after the download was finished.
Then it told me to reboot my machine. Why should I do that? I
reboot every night why should I reboot at that time?
So I did the reboot because it INSISTED on it. Of course that
meant completely getting rid of all my Outlook state.
So I got back up and running and went to Windows Update
again. I forgot why I was in Windows Update at all since all I
wanted was to get Moviemaker.
So I went back to Microsoft.com and looked at the instructions.
I have to click on a folder called WindowsXP. Why should I do
that? Windows Update knows I am on Windows XP.
What does it mean to have to click on that folder? So I get a
bunch of confusing stuff but sure enough one of them is
Moviemaker.
So I do the download. The download is fast but the Install takes
many minutes. Amazing how slow this thing is.
At some point I get told I need to go get Windows Media Series
9 to download.
So I decide I will go do that. This time I get dialogs saying
things like "Open" or "Save". No guidance in the instructions
which to do. I have no clue which to do.
The download is fast and the install takes 7 minutes for this
thing.
So now I think I am going to have Moviemaker. I go to my
add/remove programs place to make sure it is there.
It is not there.
Linux Distributions
With Linux, each OS distribution carves out a niche to meet its
users' needs. There are specialized versions of Linux containing educational software, tools for musicians, versions dedicated to embedded or low-end hardware, and regional versions of Linux produced
in places like Spain and China.
The various distributions have much in common, including the
Linux kernel, but use different free software and installation mechanisms. One distribution called Gentoo downloads only one binary, a
Damn Small Linux is the most popular Linux for old computers and ships on
80x60 mm CDs.
Part of a top process listing (79128k buffers, 177528k cached):

USER    TIME+      COMMAND
root    0:11.74    usb-storage
root    0:01.09    init
root    0:00.01    migration/0
root    0:14.63    ksoftirqd/0
root    0:00.00    watchdog/0
The following map shows the Debian branch of the Linux distribution family tree; they derive from each other, just like in a biological
ecosystem:
A portion of the Linux family tree showing the Debian branch, the biggest
free software distribution.
Linux distributions, sorted by popularity. The line shows the divide between
both halves of the popularity curve.
What you see here is an almost perfectly smooth curve that illustrates a relatively new idea called the Long Tail. One way to think
about this idea is to look at the English language. Words like "the" are used with great frequency, but many more words, like "teabag," are used infrequently. There is a long tail of infrequently used English words, and to just ignore them would be to throw away much of
what makes our language so special.
The lesson of the long tail in business is the importance of catering to customers with special interests. The long tail of Linux distributions means that the creation of a free software ecosystem
doesn't mean the end of the free market, or of competition.
Wikipedia and the Linux kernel are two of the best examples of
the fact that free software and the free exchange of ideas can create
superior products without licensing fees. The mere existence of
these premier products, without a gigantic company behind them, is
proof that the proprietary development model is doomed.
AI AND GOOGLE
The source code for IBM's Deep Blue, the first chess machine to beat then-reigning World Champion Garry Kasparov, was built by a
team of about five people. That code has been languishing in a vault
at IBM ever since because it was not created under a license that
would enable further use by anyone, even though IBM is not
attempting to make money from the code or using it for anything.
The second best chess engine in the world, Deep Junior, is also
not free, and is therefore being worked on by a very small team. If
One website documents 60 pieces of source code that perform Fourier transforms, an important software building block. The situation is the same for neural networks, computer vision, and many other advanced technologies.
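Here is what such a building block looks like: a naive discrete Fourier transform fits in a dozen lines of C. This is my sketch, and an O(n²) one at that; the libraries being re-implemented 60 times over are optimized FFTs.

#include <complex.h>

static const double PI = 3.14159265358979323846;

/* Naive DFT: out[k] is the sum of in[t] * e^(-2*pi*i*k*t/n) over all t. */
void dft(const double complex *in, double complex *out, int n)
{
    for (int k = 0; k < n; k++) {
        out[k] = 0;
        for (int t = 0; t < n; t++)
            out[k] += in[t] * cexp(-2.0 * I * PI * k * t / n);
    }
}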
The hardest computing challenges we face are man-made: language, roads and spam. Take, for instance, robot-driven cars. We
could do this without a vision system, and modify every road on the
planet by adding driving rails or other guides for robot-driven cars,
but it is much cheaper and safer to build software for cars to travel
on roads as they exist today a chaotic mess.
At the annual American Association for the Advancement of Science (AAAS) conference in February 2007, the consensus among
the scientists was that we will have driverless cars by 2030. This
prediction is meaningless because those working on the problem are
not working together, just as those working on the best chess soft-
Stanley website and could find no link to the source code, or even
information such as the programming language it was written in.
Some might wonder why people should work together in a contest, but if all the cars used rubber tires, Intel processors and the
Linux kernel, would you say they were not competing? It is a race,
with the fastest hardware and driving style winning in the end. By
working together on some of the software, engineers can focus more
on the hardware, which is the fun stuff.
The following is a description of the computer vision pipeline
required to successfully operate a driverless car. Whereas Stanley's
entire software team involved only 12 part-time people, the vision
software alone is a problem so complicated it will take an effort
comparable in complexity to the Linux kernel to build it:
Image acquisition: Converting sensor inputs from 2 or more
cameras, radar, heat, etc. into a 3-dimensional image sequence
Pre-processing: Noise reduction, contrast enhancement
Feature extraction: lines, edges, shape, motion
Detection/Segmentation: Find portions of the images that
need further analysis (highway signs)
High-level processing: Data verification, text recognition,
object analysis and categorization
The 5 stages of an image recognition pipeline.
The vision pipeline is the hardest part of creating a robot-driven car, but
even such diagnostic software is non-trivial.
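The five stages above map naturally onto a chain of functions. Here is a skeletal sketch in C; every type and stage name is a placeholder of mine, not code from Stanley or any real system.

/* Placeholder types for what flows between the stages. */
typedef struct { int frames;  } scene_t;      /* raw sensor data fused into a 3-D scene */
typedef struct { int edges;   } features_t;   /* lines, edges, shapes, motion vectors   */
typedef struct { int objects; } objects_t;    /* recognized signs, lanes, obstacles     */

static scene_t    acquire_images(void)          { scene_t s = {0}; return s; }            /* cameras, radar, heat        */
static scene_t    preprocess(scene_t s)         { return s; }                             /* noise reduction, contrast   */
static features_t extract_features(scene_t s)   { features_t f = {0}; (void)s; return f; }
static objects_t  detect_segments(features_t f) { objects_t o = {0}; (void)f; return o; } /* regions worth a closer look */
static objects_t  interpret(objects_t o)        { return o; }                             /* verify, read text, categorize */

/* One tick of the pipeline: sensor input goes in, understood objects come out. */
objects_t vision_tick(void)
{
    return interpret(detect_segments(extract_features(preprocess(acquire_images()))));
}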
In 2007, there was a new DARPA Urban challenge. This is a sample of the information given to the contestants:
Constructing a vision pipeline that can drive in an urban environment presents a much harder software problem. However, if you
look at the vision requirements needed to solve the Urban Challenge, it is clear that recognizing shapes and motion is all that is
required, and those are the same requirements that existed in the
2004 challenge! But even in the 2007 contest, there was no more
sharing than in the previous contest.
Once we develop the vision system, everything else is technically
easy. Video games contain computer-controlled drivers that can race
you while shooting and swearing at you. Their trick is that they
already have detailed information about all of the objects in their
simulated world.
After we've built a vision system, there are still many fun challenges to tackle: preparing for Congressional hearings to argue that
these cars should have a speed limit controlled by the computer, or
telling your car not to drive aggressively and spill your champagne,
or testing and building confidence in such a system.2
2. There are various privacy issues inherent in robot-driven cars. When computers know their location, it becomes easy to build a black box that would record all this information and even transmit it to the government. We need to make sure that machines owned by a human stay under his control, and do not become controlled by the government without a court order and a compelling burden of proof.
Eventually, our roads will get smart. Once we have traffic information, we can have computers efficiently route vehicles around any
congestion. A study found that traffic jams cost the average large city $1 billion a year.
No organization today, including Microsoft and Google, contains
hundreds of computer vision experts. Do you think GM would be
gutsy enough to fund a team of 100 vision experts even if they
thought they could corner this market?
There are enough people worldwide working on the vision problem right now. If we could pool their efforts into one codebase, written in a modern programming language, we could have robot-driven
cars in five years. It is not a matter of invention, it is a matter of
engineering. Perhaps the world simply needs a Linus Torvalds of
computer vision to step up and lead these efforts.
His prediction is that the number of computers, times their computational capacity, will surpass the number of humans, times their computational capacity, in
2045. Therefore, the world will be amazing then.
This calculation is flawed for several reasons:
1. We will be swimming in computational capacity long before 2040. Today, my
computer is typically running at 2% CPU when I am using it, and therefore
has 50 times more computational capacity than I need. An intelligent agent
twice as fast as the previous one is not necessarily more useful.
2. Many of the neurons of the brain are not spent on reason, and so shouldn't
be in the calculations.
3. Billions of humans are merely subsisting, and are not plugged into the global
grid, and so shouldn't be measured.
4. There is no amount of continuous learning built into today's software.
Each of these would tend to push Singularity forward and support the argument
that the benefits of singularity are not waiting on hardware. Humans make computers smarter, and computers make humans smarter, and this feedback loop
makes 2045 a meaningless moment.
Who in the past fretted: When will man build a device that is better at carrying
things than me? Computers will do anything we want, at any hour, on our command. A computer plays chess or music because we want it to. Robotic firemen
will run into a burning building to save our pets. Computers have no purpose
without us. We should worry about robots killing humans as much as we worry
about someone stealing an Apache helicopter and killing humans today.
Most computers today contain a dual-core CPU, and processor makers promise that chips with 10 or more cores are coming. Intel's processors also have limited 4-way parallel processing capabilities known as MMX and SSE. Intel could add even more of this parallel processing support if applications put it to better use. Furthermore, graphics cards exist to do work in parallel, and this hardware could also be adapted to AI if it is not usable already.
Google
One of the problems faced by the monopoly, as its leadership
now well understands, is that any community that it can buy is
weaker than the community that we have built.
Eben Moglen
In 1950, Alan Turing proposed a thought experiment as a definition of AI in which a computer's responses (presumed to be textual)
were so life-like that, after questioning, you could not tell whether
they were made by a human or a computer. Right now the search
experience is rather primitive, but eventually, your search engine's
response will be able to pass the Turing Test. Instead of simply
doing glorified keyword matching, you could ask it to do things like:
Plot the population and GDP of the United States from 1900 to 2000.5 Today, if you see such a chart, you know a human did a lot of work to make it.
The creation of machines that can pass the Turing Test will make
the challenge of outsourcing seem like small potatoes. Why outsource work to humans in other countries when computers nearby
can do the task?
AI is a meaningless term in a sense because building a piece of
software that will never lose at Tic-Tac-Toe is a version of AI, but it
is a very primitive type of AI, entirely specified by a human and executed by a computer that is just following simple rules.
Fortunately, the same primitive logic that can play Tic-Tac-Toe can
be used to build arbitrarily smart software, like chess computers
and robot-driven cars. We simply need to build systems with enough
intelligence to fake it. This is known as Weak AI, as opposed to
Strong AI, which is what we think about when we imagine robots
that can pass the Turing Test, compose music, or get depressed.
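To make the Tic-Tac-Toe example concrete, here is a minimal sketch (an illustration only, not anyone's production code) of a player that never loses. It simply tries every possible future board using the classic minimax rules; every line is ordinary, human-specified logic, with no understanding behind it.

WIN_LINES = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    # Score every possible game from X's point of view: +1 win, -1 loss, 0 draw.
    won = winner(board)
    if won == 'X':
        return 1, None
    if won == 'O':
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    if not moves:
        return 0, None                      # board is full: a draw
    best = None
    for i in moves:
        board[i] = player
        score, _ = minimax(board, 'O' if player == 'X' else 'X')
        board[i] = ' '
        if best is None or (player == 'X' and score > best[0]) or (player == 'O' and score < best[0]):
            best = (score, i)
    return best

board = list(' ' * 9)                       # 3x3 board, squares numbered 0-8
score, move = minimax(board, 'X')
print("X's best opening move is square", move, "with expected outcome", score)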
In Strong AI, you wouldn't give this machine a software program
to play chess, just the rules. The first application of Strong AI is
Search; the pennies for web clicks will pay for the creation of intelligent computers.
The most important and interesting service on the Internet is
search. Without an index, a database is useless: imagine a phone directory where the names were in random order. There is an enormous turf war taking place between Google, Yahoo!, and Microsoft
for the search business. Google has 200,000 servers, which at 200
5
Of course, there are some interesting complexities to the GDP aspect, like
whether to plot the GDP in constant dollars and per person.
hits per second gives them the potential for three trillion transactions per day. Even at fractions of pennies per ad, the potential revenue is huge. Right now, Google has 65% of the search business,
with Yahoo! at 20% and Microsoft at 7%. Bill Gates has said that
Microsoft is working merely to keep Google honest, which reveals
his acceptance that, unlike Windows and Office, MSN is not the
leader. (Note that Microsoft's search and other online efforts have
an inherent advantage because they get as much software as they
want for free. Any other company which wanted to build services
using Microsoft's software would have much higher costs.)
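As a rough sanity check of the arithmetic above, using only the numbers quoted (no inside knowledge of Google's hardware):

servers = 200_000                 # figure quoted above
hits_per_second = 200             # per server, also quoted above
seconds_per_day = 24 * 60 * 60

per_day = servers * hits_per_second * seconds_per_day
print(per_day)                    # 3456000000000, i.e. roughly three and a half trillion per day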
Furthermore, to supplant an incumbent, being 10% better is
insufficient. It will take a major breakthrough by one of Google's competitors to change the game; Microsoft's Bing is not one of those. I use Google because I find its results good enough and
because it keeps a search history, so that I can go back in time and
retrieve past searches. If I started using a different search provider,
I would lose this archive.
Google depends heavily on free software, but very little of their
code is released to outsiders. One can use many of Google's services
for free, as Google makes most of its money on advertising, but you
cannot download any of their code to learn from it or improve it or
re-use it in ways not envisioned by them. Probably 99% of the code
on a typical server at Google is free software, but 99% of the code
Google itself creates is not free software.6 Google's source code is
not only not freely available, it is not for sale.
In fact, Google is an extremely secretive and opaque company.
Even in casual conversation at conferences, its engineers quickly
retreat to statements about how everything is confidential. Curiously, a paper explaining PageRank, written in 1998 by Google cofounders Sergey Brin and Larry Page, says, "With Google, we have a strong goal to push more development and understanding into the academic realm." It seems they have since had a change of heart.
6
Although Google doesn't give away or sell their source code, they do sell an appliance for those who want a search engine for the documents on an internal
intranet. This appliance is a black box and is, by definition, managed separately from the other hardware and software in a datacenter.
It also doesn't allow tight integration with internal applications. An example of a
feature important to Intranets is to have the search engine index all documents I
have access to. The Internet doesn't really have this problem as basically everything is public. Applications are the only things that know who has access to all
the data. It isn't clear that Google has attacked this problem, and because the appliance is not extensible, no one other than Google can fix it either. This is one reason why search engines should be exposed as part of an application.
Google is applying Metcalfe's law to the web: Gmail is a good product, but
being a part of the Google brand is half of its reason for success.
Even with all that Google is doing, search is its most important
business. Google has tweaked its patented PageRank algorithm
extensively and privately since it was first introduced in 1998, but
the core logic remains intact: the most popular web pages that match your search are the ones pushed to the top.
Today, PageRank can only rank what it understands. If the database contains words, it ranks the words. PageRank lets the wisdom in millions of web sites decide what is most popular, and therefore the best search result, because the computer cannot make that decision on its own today. PageRank is an excellent stopgap measure for the problem of returning relevant information, but the focus should be on putting richer information into the database.
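The core, published idea is small enough to sketch in a few lines. What follows is a toy version of the 1998 algorithm (power iteration with a damping factor) on a made-up four-page web; Google's production system is, of course, vastly more elaborate and private.

# Minimal PageRank: a page is important if important pages link to it.
links = {                      # a tiny made-up web: page -> pages it links to
    'A': ['B', 'C'],
    'B': ['C'],
    'C': ['A'],
    'D': ['C'],
}
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}
damping = 0.85

for _ in range(50):            # power iteration until the ranks settle
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for page, outgoing in links.items():
        share = damping * rank[page] / len(outgoing)
        for target in outgoing:
            new_rank[target] += share
    rank = new_rank

for page in sorted(rank, key=rank.get, reverse=True):
    print(page, round(rank[page], 3))      # C comes out on top: everyone links to it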
I believe software intelligence will get put into web spiders, those
programs that crawl the Internet and process the pages. Right now,
they mostly just save text, but eventually they will start to understand it, and build a database of knowledge, rather than a database
of words. The rest is mostly a parsing issue. (Some early search engines treated digits as words: searching for 1972 would find any reference to 1, 9, 7 or 2; this is clearly not a smart search algorithm.) The spiders that understand the information, because they put it there, also become the librarians who take the search string you give them and compare it to their knowledge. You need a librarian to build a library, and a librarian needs the library she built to help you.
Today, web spiders are not getting a lot of attention in the search
industry. Wikipedia documents 37 web crawlers, and it appears that
the major focus for them is on performance and on discovering spam websites containing only links that are used to distort rank.
The case for why a free search engine is better is a difficult one to make, so I will start with a simpler example: Google's blogging software.
Blogger
While Google has 65% of the billion-dollar search business, it has
10% or less of the blog business. There exists an enormous number
of blog sites, the code for which is basically all the same. The technology involved in running Instapundit.com, one of the most influential current-events blogs, is little different from that running
Myspace, the most popular diary and chatboard for Jay-Z-listening
teenage girls.
Google purchased the proprietary blogging engine Blogger in 2003 for an undisclosed amount. Google doesn't release how many
users they have because they consider that knowledge proprietary,
but we do know that no community of hundreds of third party developers is working to extend Blogger to make it better and more useful.
The most popular free blogging engine is WordPress, a core of
only 40,000 lines of code. It has no formal organization behind it,
yet we find that just like Wikipedia and the Linux kernel, WordPress
is reliable, rich, and polished:
There are hundreds of add-ons for WordPress that demonstrate the health
of the developer community and which make it suitable for building even
very complicated websites. This might look like a boring set of components,
but if you broke apart MySpace or CNN's website, you would find much of
the same functionality.
Google acquired only six people when it purchased Pyra Labs, the
original creators of Blogger, a number dwarfed by WordPress's hundreds of contributors. As with any thriving ecosystem, the success of
WordPress traces back to many different people tweaking, extending and improving shared code. Like everything else in the free software community, it is being built seemingly by accident.10
10 In fact, WordPress's biggest problem is that its third-party development is so rich that it is chaotic. There are hundreds of themes and plugins, many duplicating each other's functionality. But grocery stores offer countless types of toothpaste, and this has not been an insurmountable problem for consumers. I talk more about this topic in a later chapter.
Search
Google tells us what words mean, what things look like, where
to buy things, and who or what is most important to us.
Google's control over results constitutes an awesome ability
to set the course of human knowledge.
Greg Lastowka, Professor of Law, Rutgers University
And I, for one, welcome our new Insect Overlords.
News Anchorman Kent Brockman, The Simpsons
Why Google should have built Blogger as free software is an easier case to make because it isn't strategic to Google's business or profits; the search engine is a different question. Should Google have freed their search engine? I think a related and more important question is this: Will it take the resources of the global software
community to solve Strong AI and build intelligent search engines
that pass the Turing Test?
Because search is an entire software platform, the best way to
look at it is by examining its individual components. One of the most
fundamental responsibilities for the Google web farm is to provide a
distributed file system. The file system which manages the data
blocks on one hard drive doesn't know how to scale across machines
to something the size of Google's data. In fact, in the early days of
Google, this was likely one of its biggest engineering efforts. There
are (today) a number of free distributed file systems, but Google is
not working with the free software community on this problem. One
cannot imagine that a proprietary file system would provide Google any meaningful competitive advantage; nevertheless, they have built one.
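To see why this is its own engineering problem, here is a toy sketch of the bookkeeping a distributed file system must do: split files into large chunks, spread the chunks across many machines, and remember where every replica lives. (This is purely illustrative; it is not how Google's file system, or any real one, works in detail.)

import hashlib

SERVERS = ['server-%02d' % i for i in range(8)]   # hypothetical machines
CHUNK_SIZE = 64 * 1024 * 1024                     # 64 MB chunks, an assumed size
REPLICAS = 3

def place_chunks(filename, filesize):
    """Return a metadata table: chunk index -> list of servers holding a copy."""
    table = {}
    n_chunks = (filesize + CHUNK_SIZE - 1) // CHUNK_SIZE
    for i in range(n_chunks):
        # Hash the (file, chunk) pair to pick a deterministic starting server,
        # then place replicas on the next machines in the ring.
        h = int(hashlib.md5(f"{filename}:{i}".encode()).hexdigest(), 16)
        start = h % len(SERVERS)
        table[i] = [SERVERS[(start + r) % len(SERVERS)] for r in range(REPLICAS)]
    return table

print(place_chunks("crawl-2009-11.log", 200 * 1024 * 1024))   # a 200 MB file becomes 4 chunks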
Another nontrivial task for a search engine is the parsing of PDFs,
DOCs, and various other types of files in order to pull out the text to
index them. It appears that this is also proprietary code that Google
has written.
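This piece, at least, can be assembled today from freely available parts. A minimal sketch, assuming the pdftotext utility from the free poppler project is installed (an illustration, not Google's code):

import subprocess

def extract_text(pdf_path):
    """Pull the plain text out of a PDF so it can be indexed.
    Relies on the free pdftotext tool; '-' tells it to write the text to stdout."""
    result = subprocess.run(['pdftotext', pdf_path, '-'],
                            capture_output=True, text=True, check=True)
    return result.stdout

# words = extract_text('whitepaper.pdf').split()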
It is a lot easier to create a Google-scaled datacenter with all of
its functionality using free software today than it was when Google
was formed in 1998. Not only is Google not working with the free
Conclusion
FREE SOFTWARE
If you have an apple and I have an apple and we exchange
these apples then you and I will still each have one apple. But if
you have an idea and I have an idea and we exchange these
ideas, then each of us will have two ideas.
George Bernard Shaw
If you could feed the world for free, would you? Likewise, if you could provide every child access to a library of human knowledge they would never outgrow, would you? It is the Internet that makes this question possible to ask, and necessary to answer!
Without the right software to decode and manipulate it, a digital
idea is just a blob of bits to your computer. With the Internet, we can
exchange bits, but with free software, we can exchange ideas. While
free knowledge and free software are not any direct goal of the free
market, they provide tremendous benefits to a free market because
they allow anyone to create further value. If the larger goal is to encourage as many programmers as possible to write software, then
the free software approach has already demonstrated its superiority,
even though it is only on 1% of desktops. In fact, the proprietary
world was always destined to have just one company dominate; a
clone of Bill Gates who came along later would have been unable to
learn from and improve upon the innovations of the first.
Free software brings the libertarian benefit of allowing information to be used in unlimited new ways, combined with the communitarian benefit of ensuring that no one is left behind by the access
cost of knowledge. Because free software is better for the free
market than proprietary software, and an important element of a
society characterized by the free exchange of ideas, I think it is a
better name than open source, although both represent the same
basic idea. (Another reason to call it free software is that there is
an academic tradition that the person who discovers or defines
something has the right to give it a name, and Richard Stallman
defined free software long before others created open source.)
Software as a Science
In any intellectual field, one can reach greater heights by
standing on the shoulders of others. But that is no longer generally allowed in the proprietary software field: you can only stand on the shoulders of the other people in your own company.
The associated psychosocial harm affects the spirit of scientific
cooperation, which used to be so strong that scientists would
cooperate even when their countries were at war. In this spirit,
Japanese oceanographers abandoning their lab on an island in
the Pacific carefully preserved their work for the invading U.S.
Marines, and left a note asking them to take good care of it.
Richard Stallman
Even the word university, man's place for shared study, derives
from the Latin universitas magistrorum et scholarium, meaning a
community of teachers and scholars. Universities were long understood to be places where people were placed together to learn from
each other.
Unfortunately, today, proprietary software has spread from the
corporate world to universities and other public institutions. If corporations want to hoard their scientific advancements, that is fine,
albeit short-sighted, but our public institutions should not be following suit! Not only Stanford's robot-driven car, Stanley, but also a ton
of other proprietary software is written by public institutions today.
Simply directing our public institutions toward free software would greatly increase the pace of progress, without even accounting for the software funded by corporations.
Some think of free software as a Marxist idea, but science has always been public because it was understood that giving the knowledge away would spur further innovation, and because the scientist needed some shoulders to stand on in the first place.
Corporations were created not to hoard knowledge but to take the
advancements in science and apply them to practical uses. There
still is plenty of opportunity for competition and free markets, even
if all of the advancements in science are freely available.
Because software is a science, we need to create license agreements which allow, and even encourage, cooperation among programmers. Computer scientists need software to be freely available for them to do their work.
Richard Stallman has defined the four basic software freedoms:
1. The freedom to run the program, for any purpose. (You, not your software, are in control of what is happening.)
2. The freedom to study how the program works and adapt it to your needs.
3. The freedom to give a copy of the program to your neighbor. Sharing ideas is impossible without sharing the program to create and display them.
4. The freedom to improve the program, and release your improvements to the public, so that the whole community benefits.
The GNU General Public License (GPL) is the copyright mechanism he came up with to protect those freedoms, which will allow
maximal re-use of advancements. The goal is to protect the freedom
of the user of free software. Without the ability to study and manipulate the software, you are greatly limited in your ability to use it and
further advance it.
Copyright was created to protect the creators from the publishers (those with the means to make copies) by granting the creator exclusive rights. The GNU GPL is sometimes called copyleft because it grants the same expansive rights to everyone, creator and user alike.
It sounds backwards to protect users rather than creators, but
protecting users also helps creators. All Linux programmers except
Linus started off as users. Linus was the first user of Linux as well
as being the first contributor to Linux. He fully owned his code and
could fix any problem he found. Copyleft ensures that code protected with this license provides those same guarantees to future
users and creators.
Copyleft helps Linus because it encourages, even requires users
and programmers of Linux to give back to his invention. Linux only
ran on an 80386 CPU when first released because that is what Linus
owned. All the improvements that it took to run on the other processors got put into Linux, to the benefit of Linus, and everyone else.
Whatever you think of free software today, using it is a choice. In
fact, creating it is a charitable act, and we should be grateful to
Linus Torvalds for releasing his work, just like we should be grateful
to Einstein for making his theory E = mc² publicly available.
Microsoft, Apple, Google, and many of the other blue-chip computer companies do not yet accept the idea that software should be
free in the ways that Stallman defines above. According to them,
you can generally run the code for whatever purpose, but not copy,
study, or enhance it.
Necessary
The reason it is necessary to have copyleft is that only 100% free
software is something someone else can further improve. Improving
a piece of free software, and making the enhancements proprietary,
effectively makes that entire piece of software proprietary. You need
access to an entire codebase to be able to make changes anywhere in it. Denying free availability of code enhancements creates a new
boundary between science and alchemy.
Not Expensive
Free software is not expensive because, in practical terms,
advancements in software are nearly always based on existing
knowledge and are very small in scope. What you must give back is
much smaller than what you have freely received.
In fact, source code is considered free by today's software community if it supports the first three freedoms (run, study, copy), but not copyleft (making enhancements freely available to all).
Two very popular licenses, the MIT and BSD licenses, are considered free but simply say: please include this copyright notice at the top of the source code. You can use this code, copy it and study it, but you can also make it proprietary again. This sort of
free software does not require anyone to contribute anything back if
they enhance it.
Stallman considers these lax licenses; while they sound reasonable and are, strictly speaking, more free than copyleft, the problem is that this once-free software frequently becomes proprietary again. Keith Packard has told the story of how the Unix windowing system was initially created under a lax license, but had to be rewritten multiple times because it got hijacked and made proprietary multiple times. An enormous amount of programming work was wasted because the codebase was not GPL right from the beginning.
One of the reasons why Unix was never much competition for
Windows is that many of its vendors did not work together. Linux's
GPL license nudges people into working together to save money in
total development costs and speed progress.
Some argue the lax licenses are less scary to organizations
which don't understand or truly appreciate free software. This issue
can be solved by better education of the computing community, not
by encouraging license agreements for ignorant people. As Eben Moglen points out, things in the public domain can be appropriated in freedom-disrespecting ways. In general, once people understand that software is a science, the idea of enabling proprietary science will not be interesting.
Further adoption of copyleft will increase the efficiency of the
free software community and help it achieve world domination
faster. Software protected by copyleft stays totally free software, and that is why it is taking off.
Conventional models of economics do not apply to products with zero marginal cost. The standard supply-and-demand model assumes there can be supply shortages that would tend to increase prices. This analysis doesn't consider the elasticity of demand, etc., but while those considerations add complications, they do not overturn the basic laws.
When you assume the computational cost is zero, you can no longer have supply shortages of bits. The law of demand dictates that as prices are lowered, demand will increase. Consequently, a product with zero cost should, in principle, have infinite demand. One answer to why people will write free software in the future is that there will be infinite demand for it. Wikipedia, Linux, Firefox, and many other free software products have user bases which are growing every year, and are even taking market share away from proprietary products, as predicted.2
The law of supply says that higher prices give producers an incentive to supply more in the hope of making greater revenue. The supply curve seems to suggest that if the price of a product is zero,
producers will have no incentive to make anything. However, the
supply curve never envisioned that the marginal cost of production
would be zero. It is this difference which upends the conventional
economic rules.
Something which requires no marginal cost to make, and is
acquired by consumers for no cost, should have infinite supply and
infinite demand. Free software will take off because the most basic
laws of economics say it should. If Wikipedia charged $50 for the
privilege of browsing their encyclopedia, they would not have had
the millions of contributors and tens of millions of enhancements
they have received so far. Wikipedia's infinite supply of free bits is
creating an infinite demand.
There is the total cost of ownership (TCO), but that is a comparative measure of
software quality. If you use some free software to help you make a website, the
cost of that software is the amount of time you spend with it to accomplish your
task. If one tool builds a website in half the time, or builds one twice as nice, then
the TCO of those software packages is different, even if both are free.
If a car company gave away their top of the line cars for free, the demand would
be very high, even though the owners still had to purchase gas and had other
ongoing costs.
Even if two pieces of software have the same TCO, there is an additional cost: the
cost of switching from one to the next. Software is not interchangeable in the way
that the laws of supply and demand envision. In fact, the switching costs between
software often dwarf the difference in their TCO.
There are many opportunities for volunteers and public institutions to create free software, but commercial enterprises need software to run their business, and they are an enormous potential
source of funding. For-profit service and support organizations will
provide the persistence that corporations will demand of free software.
As with other sciences, there should be many avenues for corporations to make money via the use and production of freely-available
advances in computer science. Service companies will write free
software when a customer pays them, just as a physicist or lawyer
does work whenever a customer pays them. In fact, by some estimates, 75% of software is written for internal use inside a corporation, without any thought of selling it to others. This corporate
software is free to its customers inside the corporation. Software companies' licensing revenue is already a very small part of the software industry today.
Free software is much more conducive to creating a robust software services business because all of the relevant information is publicly available. In the world of proprietary software, the company
that wrote the software is typically the only one capable of providing
support. Microsoft has created a huge software ecosystem, but the
free software service ecosystem has the potential to be much larger.
Of course, there is no guarantee of quality of service providers, but
this same issue exists today with car mechanics.
Today, many free software projects have thriving service and support communities around them. While others are not particularly
healthy yet, this is a function of the small overall marketshare of
free software, not any fundamental flaw in the business model.
In fact, the proprietary model creates fundamental limitations in
the service business. When I left Microsoft, I took on a consulting
job helping a team build a website which used Microsoft Passport as
the authentication mechanism. However, as I ran into problems,
even Google wasn't able to help because the knowledge I needed to
Category: Features

Manufacturing: Engineering, Bills of Material, Scheduling, Capacity, Workflow Management, Quality Control, Cost Management, Manufacturing Process, Manufacturing Projects, Manufacturing Flow

Supply Chain Management: Inventory, Order Entry, Purchasing, Product Configuration, Supply Chain Planning, Supplier Scheduling, Inspection of goods, Claim Processing, Commission Calculation

Financials

Projects

Human Resources

Customer Relationship Management: Sales and Marketing, Commissions, Service, Customer Contact and Call Center support

Data Warehouse
Business software systems today are not only large and complicated, they are also heterogeneous. In the 1970s, companies like IBM provided all the hardware, software, and services necessary to run a business, but today's computing environments are very different. Even in a homogeneous Microsoft shop where all of the servers are running Windows, SQL Server, and .Net, you still might use HP hardware and administration tools, Intel chips and drivers, an EMC disk farm, etc.
Computer software is not smart yet, but don't let that fool you
into thinking that it is not large and complicated. A trained engineer
can become an expert at 100,000 lines of code, but because a modern computer contains 100 million lines of code, you need at least
1,000 different people to help with all possible software problems.
In other words, it is a fact of life in a modern IT department that,
whether using free or proprietary software, support will require
relationships with multiple organizations.
In fact, free software can allow service and support teams to better help you: they can build expertise in more than one area because all of the code and other relevant information is out there.
Service companies can even build a hierarchy of relationships. You
might call HP to get help with your server, and HP might have an
escalation relationship with MySQL if they track it down to a database problem they can't fix. These hierarchies can provide one
throat to choke.
3.
All hardware companies have a compelling reason to use and support free software: it lowers their costs. IBM and Cray are happy to
give you a Linux OS for free, so you can put your money toward the
supercomputer they are selling. The Playstation 3 runs Linux, with
Sony's blessing, because it is another reason to buy their hardware
and take their product to new places they have yet to exploit.
Free software lowers the cost of hardware, and its greater usage
will stimulate new demand for computers and embedded devices. If
a complete, free software stack were magically available today that
enabled computer vision and speech, our toys would have them
tomorrow. A world of rich free software is a world with amazing
hardware.
Free software levels the playing field and makes the hardware
market richer and more competitive. One of the reasons an MRI
machine is expensive is because the software is proprietary. When a
hardware vendor controls the software, it can set the price at the
cost of the hardware plus the cost to develop the software, rather
than something approximating their hardware costs. If MRI software were free, the hardware cost would drop, more people could
afford an MRI, and the quality would increase faster.
In fact, there already is free, high-quality scientific software suitable for building an MRI machine (in products like SciPy), but the
current manufacturers build their products using proprietary software. They aren't colluding with each other, but it reminds me of the
old days of databases where your only choice was whether to pay
many thousands for Oracle or DB2. The healthcare hardware companies had better watch their backs!
Even proprietary software companies have an incentive to use
free software, to lower their costs. It is ironic that Microsoft could
make higher profits, and build better products, by using free software.
4. Educational uses
I once asked some of my computer science lecturers why they
didn't get students to do something useful, like work on free
software, instead of assigning them pointless busy work
projects. Two main answers:
1. It's too hard to grade. (Why?)
2. It's seen by many to be exploitative. (As opposed to
busy-work?)
Slashdot.org commentator
Dear Ken Starks (founder of Helios Project), I am sure you
strongly believe in what you are doing but I cannot either support your efforts or allow them to happen in my classroom. At
this point, I am not sure what you are doing is legal. No software is free and spreading that misconception is harmful. I
admire your attempts in getting computers in the hands of disadvantaged people but putting Linux on these machines is holding our kids back. This is a world where Windows runs on virtually every computer and putting on a carnival show for an
operating system is not helping these children at all. I am sure
if you contacted Microsoft, they would be more than happy to
supply you with copies of an older version of Windows and that
way, your computers would actually be of service to those
receiving them.
Karen, middle school teacher
Given that science advances by sharing ideas, you'd think that researchers making their code freely available would be a part of the computer science research culture today, but it isn't, even in universities. There is a paper for Stanford's Stanley, but no code. Releasing software with the paper is the exception rather than the rule today.
Even though all the key ideas are available in a paper, re-using
the ideas in such a document takes a lot more time than working
with the software directly. You can reuse software without fully
understanding it, but you can't re-implement software without fully
understanding it!
At Microsoft, both a researcher's time and his code could get allocated to a product group if anyone found their work interesting.
Researchers at Microsoft wanted to ship their code and many PhDs
joined Microsoft because they knew their work had the potential to
become widely used.3
In the future, when we get a complete set of GPL codebases, it
will get interesting very fast because researchers will realize that
the most popular free codebase is also the best one for their
research.
5.
6. Fame
The self-satisfaction and adulation that people receive from producing things that others use and enjoy should not be misunderestimated, even in the world of software. It was a common feeling
among my peers at Microsoft that we should pinch ourselves
because we were getting paid to write code that millions of people
used. It is almost as much fun to be appreciated in the software
world as it is in other endeavors. I once played music for 200 people
and I didn't even care that I wasn't getting paid when a girl who
liked my music put my business card in her bra.
3
Ideally, researchers would do work directly in a product group's codebase. Unfortunately, too many product group codebases were so big, old, and complicated
that researchers typically couldn't work in them directly.
Goom is the only software project I know admired by myself, the wife, my
three-year-old son, and the mother-in-law. Dave Prince
The Linux kernel and the rest of the free software stack have
spots of brilliance in many places, but not many you can visualize.
My epiphany that Linux was going to win on the desktop happened
when I cranked up the default Linux media player. Totem doesn't
contain a web browser or try to sell you anything, and it plays a very
wide variety of formats, but what impressed me was its elegant visualization engine, not-so-elegantly named Goom.
Goom is visual proof of the proposition that with the free flow of ideas, a clever engineer from Toulouse whom you've never heard of, who was never hired by anyone and never signed a non-disclosure agreement, was able to write some beautiful code that now graces millions of computers. For all we know, he did this work in his pajamas
like the bloggers who took out Dan Rather in the Rathergate4 scandal. If you want to read the code to learn its secrets, post a question
in a support forum, or send a donation to say thanks, the software
repository SourceForge enables this, and Goom is one of its 100,000
projects.
No one edits Wikipedia for fame or swooning girls. But this
energy spent is an example of the surplus intelligence of millions of
people that can be harnessed and put to work for interesting things.
The surplus intelligence of computer scientists is enough to write all
of the software we need. Programmers employed by businesses are
just icing on the cake, or to answer the phone at 3AM when a computer is sick.
8.
4
The Rathergate scandal was sometimes written that way because the documents
that Dan Rather broadcast, which were supposedly from the 1970s, had centered
text, proportional fonts, and the letters 187th written with superscript. Typewriters did not have superscript back then, so they were clearly forged!
Making game engines free will allow for much more user-created content. id Software, creators of the popular game Doom, did release their game engine, though it is a very old version of their code which they no longer use. Even so, a free software community has grown up around it! This version of Doom has never been used in any Xbox 360 game, but you can run it on an iPod.
Game creators today keep their game engines private because they consider them a competitive advantage, but they are also a huge part
of their costs. The Halo series of video games, the most popular on
the Xbox platform, spends three years between releases. A big part
of the holdup is the software engine development.
A game engine is a piece of technology comparable in size to the Linux kernel because developers need lots of features: the ability to model a world, create the rules of the game, and efficiently send updates between computers; plus text to speech, means to deal with cheaters, and other forms of AI.
If the thousands of game programmers around the world started
working together on the software, I shudder to think what they
might build! Hollywood might increase their investments in the
game business if there were a free standard codebase: movies, video
games and virtual reality are closely-related technologies.
Pride of Ownership
While much free software is not paid for or organized by a company, the quality of it is just as good because the programmers work on it with the same pride as they would at a day job.
When I joined Microsoft, I casually signed away the right to use
my work anywhere else. Despite the fact that I no longer owned my
work, and would never see the code again after I left the company, I
didn't treat it with any less importance. It is in the nature of man to
do his best whether or not he exclusively owns an idea.
In fact, unpaid volunteer programmers might be more tempted to
do it right because there isn't a deadline. A lot of lesser-known free
software in use today was started purely for the enjoyment of the
programmer. Linus wrote Linux to learn about his computer, and
even today, 25% of contributors to the kernel do not work for a company.
A programmer might have a boring job building websites, something that pays the bills, but writing fun code in his free time might
I did check to verify that Bruno had children, but like changing diapers, voluntarily writing free educational software for children is something only a parent would
do!
Math applied during encryption of data. The security of encryption algorithms is guaranteed by the many smart eyeballs that have analyzed them. Ultimately, if the mathematicians prove an encryption algorithm is secure, there are no back doors, and a password is the only key to the data.
Would the GPL require that the U.S. government give away its top
secret code? The GPL's goal is to ensure that all users of software
have the right to inspect and make changes. The user of military
software is the military itself, so the conditions are met.
The U.S. government might not want to give away certain source
code to other countries, but this is also solvable. Since the power to grant copyrights rests with the U.S. Congress, it can create a law that says any GPL software stamped Top Secret is exempt from copyleft obligations, something that should apply only to a small amount of code.
A question to ponder: If the military were to create a vision system to help it build more discriminate bombs less likely to hit civilians, would the military be inclined to give this away or would they
be afraid that the enemy would use it to build more discriminate
If it weren't for piracy, Linux would have likely taken over the
world already. I have been told that more than 90% of users in China
run pirated software, and as no one can truly know, it could very
well be closer to 99%. If it weren't for these illegal copies, China
would have been forced to develop a suitable free software stack.
And once something existed for a country such as China, it would
have been usable for the rest of the world as well.
The United States needs to move rapidly towards free software if
it is to be relevant in building the future. The U.S. invented the transistor, but it has also spawned a ton of old, proprietary systems that are a drag on future productivity. America is the most widespread purveyor of non-free software, which means its transition to free software will be much more difficult than for other countries which don't have this baggage.
2. CodePlex
Microsoft has also created a number of websites where developers can use free code and collaborate, and the latest is called CodePlex. While it does demonstrate that Microsoft understands the
benefits of free software, this website mostly contains tiny add-ons
to proprietary Microsoft products. CodePlex may serve mostly to kill
off Microsoft's community of partners who previously sold add-ons
to Visual Basic and other products. While these sites serve as a bulwark against the free software movement and give Microsoft a way to claim that it gets this new way of developing software, they ultimately undermine its own business.
3. Interop
Likewise, the best way to make sure that a new audio format
becomes available on every device is to make freely available the
code to read and write it. If the details are public, why not make
them useful to a computer?
4. Shared Source
Microsoft has also released some software under various shared
source licenses. One of the first products Microsoft released under
this license was C# and a portion of the .Net runtime. The language
spec was always free, and Microsoft decided to release some of the code as well in 2002. In addition to releasing the code,
Microsoft seeded research efforts by sponsoring 80 projects in universities around the world. However, there is little activity today,
and one reason is that the license is very restrictive:
You may not use or distribute this Software or any derivative
works in any form for commercial purposes. Examples of commercial purposes would be running business operations, licensing, leasing, or selling the Software, or distributing the
Software for use with commercial products.
Microsoft Shared Source CLI License
Just a Stab
Richard Stallman, who started the free software movement in
1985, might be right when he says that the freer intellectual property is, the better off society is.
However, I don't think this should automatically apply to everything, even software. Furthermore, it is a choice for every creator to
make. While it might make economic and moral sense for some
ideas to be given away, that doesn't mean ideas are no longer
owned. The GPL says that software is owned by its users.
Stallman reminds us that the concept of a free market is an idea
that has taken a long time for the general public to understand, and
free software and the other intellectual property issues we grapple
with today will also take time for society to grasp.
Computer pundit Tim O'Reilly makes the point that the GPL could
become irrelevant in the coming cloud computing world. Nowadays,
people focus on free code running on their own computer, but what
about if you are using GPL code which is doing work on your behalf,
but running on another processor? Currently, the GPL does not consider this scenario, but I think this is a loophole not within the spirit
of copyleft. Perhaps this will be the battle for GPL v4, some years
hence.
which was the biggest enhancement of that release and some say
this was the most useful feature ever added to Word. In the end,
Alex received U. S. patent #5,787,451, but was this feature truly
worthy of a patent? These are the major elements of this patent:
Red underlines of misspelled words
Spell checking happens as you type, removing the need to
launch a dialog box as a separate step.
While adding this feature was a huge time-saving device, it isn't
something so unique that other word processors wouldn't have
eventually implemented it. Teachers have been circling misspelled words with red pens since time immemorial; this is just the digital version.
version. Fortunately for the world, Microsoft has not enforced this
patent and squiggly underlines can now show up almost everywhere
you can type text.
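The whole "invention" can be expressed in a few lines: keep a word list in memory and, after every keystroke, re-check the text on screen, flagging anything not in the list. A minimal sketch, with a deliberately tiny made-up dictionary:

DICTIONARY = {"the", "quick", "fox"}     # a real checker loads tens of thousands of words

def misspelled(text):
    """Return the words that should get a red squiggly underline."""
    return [w for w in text.lower().split() if w.strip('.,!?') not in DICTIONARY]

# Re-run on every keystroke instead of waiting for a dialog box:
print(misspelled("The quikc fox"))       # ['quikc']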
For several years, British Telecom attempted to assert ownership
on the concept of the hyperlink in patent #4,873,662. Thankfully,
that patent was eventually invalidated on a technicality, but a lot of
money was spent on lawyers in the meanwhile.
One of Amazon's first patents was for 1-Click ordering. Once
Amazon has your payment and shipping information on file, you are
able to purchase a book with literally one click. However, isn't this
an obvious innovation for anyone building an e-commerce website?
Zero-click ordering would be an innovation worth patenting!
Amazon's patent didn't encourage innovation, it simply became a
stick their lawyers could use to beat up Barnes & Noble. We are told
that patents protect the little guy, but they actually create a complicated minefield that helps incumbents.
There are an infinite number of ways of converting sound to and from bits,
but they are mathematically very similar. (The differences between codecs have to do merely with their efficiency and their cleverness in removing data you cannot perceive.)
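A crude illustration of that point: underneath, encoding audio is just arithmetic on a list of sample values. The toy "codec" below throws away the low bits of each 16-bit sample, a primitive form of discarding detail the ear is least likely to notice. Real codecs are enormously more clever, but they are not a different kind of thing.

def encode(samples, keep_bits=8):
    """Lossy 'compression': drop the least significant bits of each 16-bit sample."""
    shift = 16 - keep_bits
    return [s >> shift for s in samples]

def decode(coarse, keep_bits=8):
    """Reverse the math; the discarded detail is gone for good."""
    shift = 16 - keep_bits
    return [c << shift for c in coarse]

original = [1200, -3400, 15000, -27000]          # signed 16-bit samples
print(decode(encode(original)))                  # close to, but not exactly, the input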
There might be a new type of compression algorithm that is innovative, but the point of codecs is to enable the easy exchange of
video and audio bits. Patents, therefore, only serve as a hindrance to
this. The reason digital audio and video are such a hassle today is the mess of proprietary formats, patents, and licensing fees. These obstacles encourage the creation of even more formats,
which just makes the problem worse.
In the mid-90s, Apple, Microsoft, Real, and others were out there
hawking their proprietary audio and video formats, touting their
advantages over the others. We have not recovered from this. Microsoft employee Ben Waggoner wrote:
Microsoft (well before my time) went down the codec standard
route before with MPEG-4 part 2, which turns out to be a
profound disappointment across the industry: it didn't offer
that much of a compression advantage over MPEG-2, and the
protracted license agreement discussions scared off a lot of
adoption. I was involved in many digital media projects that
wouldn't even touch MPEG-4 in the late '90s to early '00s
because there was going to be a 'content fee' that hadn't been
fully defined yet.
And even when they created standards like MPEG, certain companies would assert patent control over certain aspects. MPEG isn't a
codec so much as a system of codecs, a land mine of proprietary and
non-proprietary specifications that makes supporting MPEG very
difficult. The reason many websites do their video using the proprietary Flash control is because the various interests didn't come
together to produce a standard.
Many times in this industry, someone has invented a compression
mechanism, patented it, implemented the code for their own use,
but did not document it or give away code to encode and decode the
format. Then they asked everyone to use their new format. This
strategy is totally the wrong approach to making formats universally
usable by computers and devices.
What is important is that we pick a simple and efficient algorithm,
standardize it and then make the software to read and write it freely
available. That way, every device and every application will work
with every piece of sound or video. Today, there is nothing but chaos
and incompatibility.
The most popular audio format today is MP3. Here, not just one but a number of different companies have patent claims that do not expire until 2015! The core logic of a codec is
Software is math
Software is math. In the 1930s, Alonzo Church created a mathematical system known as lambda (λ) calculus, an early programming language that used math as its foundation, and which could express any program written today.
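Python's lambda keyword is a direct descendant of Church's notation. A tiny illustration of his idea that numbers and arithmetic can be built out of nothing but functions:

# Church numerals: a number n is "apply a function n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
plus = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):                       # convert back to an ordinary integer
    return n(lambda k: k + 1)(0)

two = succ(succ(zero))
three = succ(two)
print(to_int(plus(two)(three)))      # 5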
A patent on software is therefore a patent on math, something
that historically has not been patentable. Donald Knuth, one of
America's most preeminent computer scientists, wrote in a letter to
the U. S. Patent Office in 2003:
I am told that the courts are trying to make a distinction between mathematical algorithms and nonmathematical algorithms. To a computer scientist, this makes no sense, because
every algorithm is as mathematical as anything could be. An
algorithm is an abstract concept unrelated to the physical laws
of the universe.
Nor is it possible to distinguish between numerical and nonnumerical algorithms, as if numbers were somehow different
from other kinds of precise information. All data are numbers,
and all numbers are data.
Congress wisely decided long ago that mathematical things
cannot be patented. Surely nobody could apply mathematics if
it were necessary to pay a license fee whenever the theorem of
Pythagoras is employed. The basic algorithmic ideas that people are now rushing to patent are so fundamental, the result
threatens to be like what would happen if we allowed authors
to have patents on individual words and concepts.
I strongly believe that the recent trend to patenting algorithms
is of benefit only to a very small number of attorneys and inventors, while it is seriously harmful to the vast majority of people
who want to do useful things with computers.
Software doesn't look like math, but it is built up from just a few
primitive operations that have a mathematical foundation. Allowing
people to patent particular algorithms just means that a section of
our math is now owned by someone. If all of these patents become
owned by many different entities, then math could become unusable
by anyone. Then where would we be?
The free Vorbis decoder for embedded devices is less than 7,000 lines of code,
and much of that code is infrastructure logic to do, for example, all math using
integers because low-end processors often do not have floating point capabilities.
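The usual trick is fixed-point arithmetic: scale every value by a power of two, do all the work with plain integers, and shift back at the end. A minimal sketch of the idea in a Q16.16 format (an illustration, not the actual Vorbis code):

FRAC_BITS = 16                      # Q16.16: 16 integer bits, 16 fractional bits
ONE = 1 << FRAC_BITS

def to_fixed(x):  return int(round(x * ONE))
def to_float(q):  return q / ONE
def fx_mul(a, b): return (a * b) >> FRAC_BITS   # multiply, then rescale

a = to_fixed(0.75)
b = to_fixed(2.5)
print(to_float(fx_mul(a, b)))       # 1.875, computed using only integer operations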
Software is big
Beyond software being math, software also differs from things
that were patented previously. In the biotechnology world, and even
the world of Edison, a patent typically covers one product.
anyone who happens to run across it, but not using it for their own
purposes. Many times, patents are created merely as a defensive
measure against other companies. A company will patent things
with the hope it can trip up anyone who might come calling with
claims against them.
In a world filled with free software, it is the copyleft mechanism,
not the patent mechanism, that will provide protection for software.
Even proprietary software would not stop improving if software
patents disappeared, though their lawyers would scream like stuck
pigs.
Conclusion
In the early days of cars, there were patent lawsuits sent between
the lawyers, on horseback of course:
George Selden, the man who patented the car in 1895, didn't sell one until
14 years later. Was he just a squatter on this idea? The magic of the early
car was the internal combustion engine, which Selden did not invent, and
which required only the science of fire, something man exploited long
before he understood it.
Biotechnology Patents
There are more than 50 proteins possibly involved in cancer
that the company is not working on because the patent holders
either would not allow it or were demanding unreasonable royalties.
Peter Ringrose, Chief Scientific Officer, Bristol-Myers Squibb
Pharmaceutical TV Ads
This open model is now being used in a federally funded international effort to create a map of haplotypes (HapMap) which
describes variations in the human genome that tend to occur
together in neighborhoods or haplotypes. Data about the
genotype of the individual haplotypes is being released publicly
as soon as it is identified. The openness of the HapMap effort
is reinforced by its use of a licensing system that is self-consciously modeled on the copyleft system of open-source software licensing and which prevents those who utilize the data
from attempting to close it to others via patents.
Utilizing the results of the HapMap process, a public-private
partnership, the SNP Consortium, is identifying panels of a few
hundred thousand single-nucleotide polymorphisms (SNPs) that
can be used to identify common variants in an individual's
entire 3-billion base-pair genome that might be associated with
a disease. As with the HapMap project, participants in the consortium have agreed to put the data they produce into the public domain.
In the reasonably near future, according to Dr. Francis Collins,
leader of the National Human Genome Research Institute
(NHGRI) in the National Institutes of Health (NIH), the
HapMap should help make practical case-controlled studies
using SNPs to identify gene variants that contribute to diabetes, heart disease, Alzheimer disease, common cancers, mental illness, hypertension, asthma, and a host of other common
disorders. That future seems nearer than ever today with scientists finding correlations between diseases such as multiple
sclerosis and breast cancer and specific genetic variations.
Length of Copyright
Our Founding Fathers had it right, once again, when they determined that authors and inventors should have exclusive rights to
their creations for limited times. Their thinking suggests a presumption that ideas will eventually flow into the public domain
because, at some point, increasing the years of protection isn't guaranteed to promote further progress; instead, it may serve as an
impediment.
When copyright was first created in the U. S., the term was 14
years, renewable once if the author was still alive. Today, the time
frame that has been chosen for U.S. copyright law is the life of the
author plus 70 years. This value was extended from the life of the
Fair Use
The fair use clause of copyright allows you to use copyrighted
materials for limited purposes without getting permission. This right
was recognized by the courts in common law and was incorporated
into U. S. Copyright law in 1976:
The fair use of a copyrighted work, including such use by
reproduction in copies or phonorecords or by any other means
specified by that section, for purposes such as criticism,
comment, news reporting, teaching (including multiple copies
for classroom use), scholarship, or research, is not an
infringement of copyright. In determining whether the use
made of a work in any particular case is a fair use the factors to
be considered shall include
1. the purpose and character of the use, including whether
such use is of a commercial nature or is for nonprofit
educational purposes;
2. the nature of the copyrighted work;
3. the amount and substantiality of the portion used in
relation to the copyrighted work as a whole; and
4. the effect of the use upon the potential market for, or value
of, the copyrighted work.
DRM has so far had mixed results. The first major use of DRM is
that which exists in DVD players. Most people don't realize that DVD
players have DRM because DVDs can be very easily exchanged,
unlike an iTunes song, whose hassles are well-documented.
In fact, the only time you notice the DRM is when dealing with region encoding.
The DVD key is 40 bits, which is very small; it was made small because of the U.S. export restrictions which existed in 1996. It has been said that computers powerful enough to play a DVD are powerful enough to crack the password!
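The arithmetic behind that remark, using an assumed and deliberately modest guessing rate (the real attacks on the DVD encryption were cleverer and far faster than brute force):

keys = 2 ** 40                       # every possible 40-bit key
guesses_per_second = 10_000_000      # assumed rate; pick your own number

seconds = keys / guesses_per_second
print(f"{keys:,} keys, about {seconds / 86_400:.1f} days to try them all")
# roughly 1.1 trillion keys, on the order of a day at this rate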
because the information has already leaked out. In fact, the unencrypted version of the data is more useful because it doesn't have
the restrictions on it.
The key to security in any system is to have multiple layers of protection, a concept known as defense in depth. Inside a locked house,
you might have a safe to store important documents and jewelry. An
alarm system and cameras add additional levels of protection.
If you want to truly secure the contents of a DVD, you should put
it in a safe, encrypt the contents with a key stored away from the
DVD, secure access to the machine playing the DVD, make sure no
third-party applications are running on the computer playing the
DVD, etc.
One can obviously see that these mechanisms don't make sense
for the content we own. While I understand the interest in protecting copyrighted materials, we should not add complexity that serves
no purpose.
Because DVD decoding is protected by patents with per-copy
licensing fees, Microsoft decided not to include a means of playing
DVDs in Windows XP. (Microsoft likes to charge per-copy license
fees for its software but never likes to sign such license agreements
itself! DVD playback is one of the applications that hardware vendors must include with Windows to make it fully-functional.)
In fact, the DRM mechanisms create obstacles for the proprietary
software vendors, who try to legally jump through hoops that provide no real
security, even more than for the free software guys, who have simply written
code to do the job and been prosecuted for it.
If you sat down to write code to play your DVDs on your own
computer (not an easy task, I admit) you would be breaking the law!
Jon Lech Johansen, one of three people who first wrote free code to
play DVDs, suffered years of legal harassment.
The Digital Millennium Copyright Act (DMCA) of 1998 says that
writing or distributing DVD decoding software is illegal because it
circumvents a technological measure that effectively controls
access to a work. It is the DVD industry that has appointed itself the
exclusive provider of technology to play your DVDs. Even if you
wrote this software just to play your own DVDs, you are breaking
the law, and Universal v. Reimerdes, the first lawsuit testing the
DMCA, upheld this. According to Judge Lewis Kaplan:
In the final analysis, the dispute between these parties is simply
put if not necessarily simply resolved. Plaintiffs have invested
huge sums over the years in producing motion pictures in
reliance upon a legal framework that, through the law of
Music versus Drivers
I downloaded a World War II flying simulator for my Xbox 360, but I couldn't
configure the controls like the controls on an RC model airplane. Don't the
programmers of that game recognize the crossover market? Relearning muscle
memory is hard and pointless.
When I do enable their nascent 3-D features, the computer no longer suspends
and resumes properly, and the cursor sometimes becomes malformed.
Bugs like this would not survive long if the code were free. In fact, many teams
are re-engineering proprietary drivers from scratch just to be able to fix bugs.
TOOLS
You can tell a craftsman by his tools.
Socrates
The major cause of the software crisis is that the machines
have become several orders of magnitude more powerful! To
put it quite bluntly: as long as there were no machines, programming was no problem at all; when we had a few weak computers, programming became a mild problem, and now we have
gigantic computers, programming has become an equally
gigantic problem.
Edsger Dijkstra, 1972
In fact, the last hardware bug anyone remembers was the Intel
floating point division bug in 1994; nearly all the rest are software
bugs.1
The problem is this: the vast majority of today's code is written in
C, a programming language created in the early 1970s, or C++, whose
development began in 1979. Computers always execute machine language,
but programming in 1s and 0s is inconvenient in the extreme, so
programmers create high-level languages and compilers
that convert meaningful statements to machine code. We are asking
the same tools that were used to program the distant ancestors of our
computers to also program our iPods, laptops, and supercomputers, none of which
were even conceived of back then.
Imagine building a modern car with the tools of Henry Ford. A
tool ultimately defines your ability to approach a problem. While the
importance of cooperation in solving big problems is a major theme
of this book, the current set of tools is stifling progress. A complete,
unified set of libraries will change computing even more than
the worldwide adoption of the Linux kernel.
Metcalfe's law has unfortunately applied to C and C++; people
use them because others use them. These languages became so
dominant that we never moved on from our 1970s heritage. Programmers know many of the arguments for why newer languages
are better, but paradoxically, they just don't think it makes sense for
their codebase right now. They are so busy that they think they
don't have time to invest in things that will make their work more
productive!
Corporations today are struggling with the high costs of IT, and
many feel pressure to outsource to places like India to reduce the
costs of maintaining their applications. However, outsourcing doesn't
decrease the manpower required to add a feature; it only reduces the
cost of that manpower, so it is merely a band-aid.
In fairness to Intel, that bug happened on average once in every nine million division operations, and would return results that were off by less than 0.000061.
Testing a 64-bit processor's math capabilities involves 2^128, or about 10^38, test cases! Intel
does continuously release errata lists about their processors. For example, initializing a computer is a very complicated process with plenty of room for ambiguity,
yet because user data isn't even in memory yet, there is no risk of a hardware bug
causing a newsworthy problem.
For the first 60 years, computers were programmed, and input and
output was done, using punch cards.2 Punch cards were invented for
the 1890 census by Herman Hollerith, founder of the firm that would
eventually become IBM. Punch card machines operate on the
same principles as modern computers but have one-millionth the
memory. Code reuse was impossible because one couldn't copy and
paste punch cards.
Herman Hollerith's tabulating machine, used for the 1890 census. Punch cards
had been around for 110 years before the world heard of pregnant chads.
This history ignores Fortran, Cobol, Algol, and other important languages, but this
book is PC-focused.
C is more comprehensible, but more importantly, by simply swapping the compiler, it enables software written in C to run on every
computer in existence today with few changes required. In building
the standard replacement for assembly language, Bell Labs changed
computing.
Assembly language actually still exists in the nooks and crannies
of modern computers: deep in the code of the operating system kernel, inside game engines, and in code to program specialized graphics or math processors. However, most processor-specific knowledge
has moved out of the code a programmer writes and into the compiler, which is a huge step up.
While developers at Microsoft used assembly language in the very
early years, when I joined in 1993, nearly all code was written in C
or C++, and that is still true today.
Bill Gates and Paul Allen at Microsoft, 1978. Bill Gates quit coding somewhere
in this timeframe.
The world spent the first 1,000 man-years of software development primarily in assembly language, but most programs today are written in C or C++. Humanity has spent roughly 400,000 man-years in those two
languages, the same amount of time that the Egyptians spent building the Great Pyramid of Giza:
Man spent as much time programming in C and C++ as building the pyramids of Giza. Unfortunately, the software used to build our software looks
as old and cracked as those pyramids do today.
Lisp and Garbage Collection
John McCarthy created garbage collection (GC) in 1959 when he created Lisp, a language invented more than a decade before C, but one that never became
accepted into the mainstream:
(defun finder (obj vec start end)
  (let ((range (- end start)))
    (if (zerop range)
        (if (eql obj (aref vec start))
            obj
            nil)
        (let ((mid (+ start (round (/ range 2)))))
          (let ((obj2 (aref vec mid)))
            (if (< obj obj2)
                (finder obj vec start (- mid 1))
                (if (> obj obj2)
                    (finder obj vec (+ mid 1) end)
                    obj)))))))
The binary search function written in Lisp is a simple algorithm for quickly
finding values in a sorted list. It runs in O(log n) time because at each step it
divides the size of the array in half, similar to how we look up words in a
dictionary. There are many faster and more complicated algorithms, and
search is a very interesting and large topic in computer science, but 99% of
the time, ye olde binary search is good enough.
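For comparison, here is the same idea sketched in C#, a GC language; this is an illustrative version (returning the index, or -1 if the value is absent) rather than a transcription of any particular library's code:

static int Finder(int[] vec, int obj)
{
    // Binary search over a sorted array.
    int low = 0, high = vec.Length - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2; // midpoint, written to avoid overflow
        if (vec[mid] == obj)
            return mid;                   // found it
        if (vec[mid] < obj)
            low = mid + 1;                // search the upper half
        else
            high = mid - 1;               // search the lower half
    }
    return -1;                            // not present
}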
C and C++ were based on BCPL and other languages before them,
none of which had garbage collection. Lisp is a language built by
mathematicians rather than operating systems hackers. Lisp pioneered
GC, but it was also clean and powerful, and had a number of
innovations that even C# and Java don't have today.3
Wikipedia's web page doesn't explain why Lisp never became
accepted for mainstream applications, but perhaps the biggest
answer is performance.4 So instead, people looked at other, more
primitive, but compiled languages. The most expensive mistake in
the history of computing is that the industry adopted the non-GC
language C, rather than Lisp.
3 Perhaps the next most important innovation of Lisp over C is functional programming. Functional programming is a focus on writing code which has no side
effects; the behavior of a function depends only on its parameters, and the only
result of a function is its return value. Nowadays, in object-oriented programming,
people tend to create classes with lots of mutable state, so that the behavior of a
function depends on so many things that it is very hard to write correct code,
prove it is correct, support multiple processors manipulating that object at the
same time, etc. Functional programming is a philosophy, and Lisp made it natural.
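A tiny sketch of the difference in C# (the class and method names here are invented for illustration):

// A pure function: the result depends only on the parameter, and nothing
// outside the function is read or changed, so it is easy to test and safe
// to call from many threads at once.
static double CircleArea(double radius)
{
    return System.Math.PI * radius * radius;
}

// A method with side effects: its result depends on hidden mutable state,
// so its behavior changes from call to call and is harder to reason about.
class Counter
{
    private int total;                 // mutable state
    public int AddToTotal(int amount)
    {
        total += amount;               // the side effect
        return total;
    }
}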
4 Most Lisp implementations ran 10x slower than C because Lisp was interpreted
rather than compiled to machine code. It is possible to compile Lisp, but unfortunately, almost no one bothered. If someone complained about Lisp performance,
the standard answer was that they were considering putting the interpreter into
hardware, i.e. a Lisp computer. This never happened because it would have
sucked Lisp into the expensive silicon race.
While Lisp had many innovations, its most important was garbage
collection. Garbage collection requires significant infrastructure on
the part of the system and is a threshold test for an intelligent programming language.5
Because so few of the most important codebases today have
adopted GC, I must explain how it improves software so my geek
brethren start using it.
The six factors of software quality are: reliability, portability, efficiency, maintainability, functionality, and usability; I will discuss how
GC affects all of these factors. The most important factor is reliability, the sine qua non of software.
Some argue that static type checking (declaring all types in the code so the compiler can flag mismatch errors) is an alternative way of making software more
reliable, but while it can catch certain classes of bugs, it doesn't prevent memory
leaks or buffer overruns.
Likewise, there are smart pointers, which can emulate some of the features of
garbage collection, but they are not a standard part of the languages and don't provide
many of the benefits.
Apple's Objective-C 2.0 has added support for GC, but it is
optional, and therefore doesn't provide many of the benefits of a fully GC language, like enabling reflection or preventing buffer overruns.
Reliability
Therefore everyone who hears these words of mine and puts
them into practice is like a wise man who built his house on the
rock. The rain came down, the streams rose, and the winds
blew and beat against that house; yet it did not fall, because it
had its foundation on the rock. But everyone who hears these
words of mine and does not put them into practice is like a foolish man who built his house on sand. The rain came down, the
streams rose, and the winds blew and beat against that house,
and it fell with a great crash.
Matthew 7:24-27
A tiny bug in its software caused the crash of one of the European
Space Agency's Ariane 5 rockets, costing $370 million:6
The rocket's software was written in Ada, an old language, but one with many of the
features of garbage collection. Code which converted a 64-bit floating-point number to a 16-bit
integer received a number too big to fit into 16 bits, and so the conversion
threw an exception. The code to handle this exception had been disabled, and therefore
the computer crashed. When this computer crashed, it started sending confusing
diagnostic information to the flight control computer, causing the rocket to fly in a crazy
way and break apart, triggering the internal self-destruct mechanisms.
Many blame management, but this was a simple design bug (you should be very
careful when throwing away data). This was compounded because they were
using a specialized embedded system with a non-mainstream programming language which allowed them the capability of disabling certain exceptions. This bug
could have been caught in testing, but they didn't use accurate trajectory information in the simulations. Perhaps clumsy tools made it hard to modify test cases,
and so they never got updated.
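The Ariane code was Ada, but the same kind of narrowing conversion can be sketched in a language like C#; the variable names here are invented for illustration:

long horizontalBias = 40000;                    // too large to fit in 16 bits

short truncated = (short)horizontalBias;        // unchecked: silently wraps to a
                                                // meaningless negative number

short guarded = checked((short)horizontalBias); // checked: throws
                                                // System.OverflowException; if no
                                                // handler catches it, the program
                                                // dies, much as the Ariane computer
                                                // did once its handler was disabled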
When your computer crashes, you can reboot it; when your rocket
crashes, there is nothing to reboot. The Mars Spirit and Opportunity
rovers had a file system bug which made the rovers unresponsive,
nearly ending the project before they even landed!7
While it isn't usually the case that a software bug will cause a
rocket to crash, it is typically the case that all of the software layers
depending on that buggy code will also fail. Software reliability is
even trickier than that because an error in one place can lead to failures far away; this is known in engineering as a cascading failure. If an application gets confused and writes invalid data to the
disk, other code which reads that info on startup will crash because
it wasn't expecting invalid data. Now, your application is crashing on
startup. In software, perhaps more than in any other type of intellectual property, a bug anywhere can cause problems everywhere,
which is why reliability is the ultimate challenge for the software
engineer.
Perfect reliability is extremely hard to achieve because software
has to deal with the complexities of the real world. Ultimately, a key
to reliable software is not to let complexity get out of hand. Languages cannot remove the complexity of the world we choose to
model inside a computer. However, they can remove many classes of
reliability issues. I'm going to talk about two of the most common
and expensive reliability challenges of computers: memory leaks and
buffer overruns, and how garbage collection prevents these from
happening.
Memory Leaks
Web banner ad for a tool to find memory leaks. There is a cottage industry
of tools to fix problems which exists only because the programming language is broken in the first place.
The rover file system used RAM for every file. The rovers created a lot of system
logs on their trip from Earth to Mars, and so ran out of memory just as they arrived!
The problem is even worse because every big C/C++ application has its own
memory allocators. They grab the memory from the OS in large chunks and manage it themselves. Now, when you are done with a piece of memory, you need to
return it to whoever gave it to you.
Losing the address of your memory is like the sign outside a Chinese dry-cleaner: No tickie, no laundry. To prevent leaks, memory
should be kept track of carefully. Unfortunately, C and C++ do not
provide this basic feature, as you can allocate and lose track of
memory in two lines of code:
char* p = new char[100]; // p points to 100 bytes of memory
p = NULL;                // p now points to NULL; the reference
                         // to the 100 bytes is lost
new returns the location of the newly allocated memory, which is stored into
the variable p. If you overwrite that variable, the address of your memory is lost,
and you can't free it.
our ability to fix them. To date, there is no non-trivial codebase written in C or C++ which is able to solve all of these error conditions,
and every codebase I saw at Microsoft had bugs which occurred
when the computer ran out of memory.9
MySQL, an otherwise highly reliable database which powers popular websites for Yahoo! and the Associated Press, still has several
memory leaks (and buffer overruns). Firefox's bug list contains several hundred, though most are obscure now.10
Let's take a look at why a memory leak can't happen when running on a GC language:
byte[] p = new byte[100]; // Variable "p" points to 100 bytes.
p = null;                 // p now points to null.
                          // The system can deduce that no variables
                          // are referencing the memory, and therefore
                          // free it.
You don't have to call delete because the system can infer what memory is
in use.
Often near the end of a development cycle, after fixing our feature bugs, we
would focus on some of the out-of-memory bugs. While we never fixed them all,
we'd make it better and feel good about it.
It is true that when you run out of memory, it is hard to do anything for the user,
but not causing a crash or a further memory leak is the goal.
10 Here is a link to all active MySQL bugs containing leak:
http://tinyurl.com/2v95vu. Here is a link to all active Firefox bugs containing
memory leak: http://tinyurl.com/2tt5fw.
11 GC makes it easy for programmers to freely pass around objects that more than
one piece of code is using at the same time, and the memory will be cleaned up
only when every piece of code is finished with it. C and C++ do not enable this
and many other common scenarios.
To write code which allows two pieces of software to share memory and to return
it to the operating system only when both are finished is complicated and onerous. The simplest way to implement this feature in C/C++ is to do reference
counting: have a count of the number of users of a piece of memory. COM, and the
Linux and Windows kernels have reference counting. When the last user is finished, the count turns to zero and that last user is responsible for returning the
memory to the OS. Unfortunately, this feature requires complicated nonstandard
code (to handle multiple processors) and places additional burdens on the programmer because he now needs to keep track of what things are referencing each
It is also quite interesting that GC enables a bunch of infrastructure that Microsoft's OLE/COM component system tried to enable,
but COM did it in a very complicated way because it was built on top of
C and C++, rather than adding the features directly into the language:
COM Feature Name          .Net / GC Equivalent
Reference Counting        Garbage Collection
BSTR                      Unicode strings
Type Libraries            metadata + bytecodes / IL
IUnknown                  Everything is an Object
IDispatch                 Reflection
DCOM                      Remoting
COM contains a lot of the same infrastructure that GC systems have, which
suggests a deep similarity of some kind. Doing these features outside the
language, however, made writing COM more tedious, difficult, and error-prone.
.Net completely supersedes COM, in much simpler packaging,
so you will not hear Microsoft talk about COM again, but it will live on for
many years in nearly every Microsoft codebase, and many external ones.
Buffer Overruns
As bad as memory leaks are because they often cause crashes,
buffer overruns are worse because your machine can get hijacked!
Buffer overruns are the most common type of security bug, and a
major nemesis of the computer industry. Microsoft's code has fallen
prey to a number of buffer overruns; the Code Red worm, which
infected Microsoft's web server and caused rolling outages on the
Internet, is estimated to have cost the industry two billion dollars.
Free software is certainly not immune to this either; on a daily basis
my Ubuntu operating system downloads fixes to newly discovered
buffer overruns.12
As with memory leaks, you can create and overrun a buffer with
just two lines of code:
int* p = new int[50]; // Allocate 50 entries, indexed 0-49
p[50] = 7;            // Write to a 51st entry: an off-by-one bug
C and C++ do not validate memory access, so a programmer can intentionally or unintentionally read or write to memory he shouldn't have access to.
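For contrast, here is the same mistake in a GC language; a minimal C# sketch:

int[] p = new int[50]; // Allocate 50 entries, indexed 0-49
p[50] = 7;             // The runtime checks the index and throws
                       // System.IndexOutOfRangeException; memory outside
                       // the array is never touched, so the bug cannot be
                       // used to hijack the machine.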
other. Handing out references, which should be simple, is now error-prone. Even
so, reference counting is insufficient: if two objects point to each other, but no one
points to them, they will keep each other alive.
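A two-line cycle shows the problem; in this C# sketch (the class name is invented for illustration), a reference count for each object would never drop to zero, yet a tracing garbage collector reclaims both objects once nothing else can reach them:

class Node { public Node Next; }

Node a = new Node();
Node b = new Node();
a.Next = b;   // a references b
b.Next = a;   // b references a: a cycle
a = null;     // nothing outside the cycle refers to these objects now,
b = null;     // so a tracing GC frees both, even though each still
              // holds a reference to the other.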
12 One of the security fixes awaiting me right now is: SECURITY UPDATE: arbitrary
code execution via heap overflow, from CVE-2007-3106.
13 The Linux kernel is an example of reliable C code. Many might use this as proof
that it is possible to write reliable code without garbage collection. However,
there are several reasons why this lesson may not be valid:
The kernel's primary job is to provide glue code to make the hardware work. Seventy-five percent of the kernel's code is hardware specific, and much of the code
is tiny, simple components. Much of the remaining code implements decades-old
operating system concepts like threads. This makes the feature requirements and
design relatively stable over time. All of the things we consider smart software,
like grammar checkers and speech and handwriting recognition, involve writing
code which has no clear design and would never be part of a kernel.
The kernel doesn't need to interoperate with as much other software
as applications do. Software written by many disparate and interdependent
teams makes GC more important.
The Linux kernel has a big team per line of code compared to other codebases.
This gives it the luxury of using inefficient tools.
The Linux kernel does have a few memory leaks even as I write this. (Bugs 5029,
6335, 8332, 8580)
The kernel has a number of specialized requirements which make it harder to do
garbage collection, but it would benefit from it, and it wouldn't surprise me if
most of the kernel's code which is not written in an assembly language is written
in a GC language one day.
For now, we should focus on porting application code and leave the Linux kernel
as the very last piece of code written in C. Once the world has collaborated on a
new software stack, written in our new programming language, we can knock on
Linus's door with our now mature tools and figure out what his requirements are.
Portability
Fifteen years ago, Intel, with HP's help, decided to create a new
processor architecture which would enable them to incorporate all
they had learned in creating their x86 processors, first introduced in
the 1970s. This chip, known as Itanium or IA-64, is a 64-bit chip
which removed all of the previous compounded ugliness, leaving a
simpler and therefore potentially faster design in its place. Microsoft was already working on porting its C tools and code to IA-64
when I joined in 1993, though the chip wasn't released until 2001:
A beautiful new chip, but even though Intel included x86 compatibility hardware, it was still incompatible with existing compiled C and C++ code.14 Code written in a
GC language is already portable to chips that have yet to be created.
The biggest obstacle Intel faced was the fact that our pyramid of
C and C++ code running on PCs today is compiled for the x86 processor.
Such programs won't run without at least re-compiling the
source for another processor, and they might even require changes,
because it is easy to write non-portable C/C++. Consequently, the
adoption of new hardware is significantly limited by the
fact that you have to find every piece of code out there and recompile
it. In practice, while a lot of your stuff works, some of it doesn't.
Itanium Linux had no Flash player or Adobe Reader until very recently,
two significant stumbling blocks for desktop deployments, and even
one obstacle can be too many.16
GC solves portability issues because programs written in languages such as Java, C#, and Python are no longer compiled for
any specific processor. By comparison, a C/C++ executable program
is just a blob of processor-specific code containing no information
about what functions and other metadata are inside it. Its contents
are completely opaque to the system, and the processor just starts
blindly executing it. To a GC system, a blob of binary code is insufficient.
Like stores in the real world, a GC system in principle needs to
close to take inventory. However, unlike a store, it cannot just kick
all the existing customers out, or wait for them to leave. So it does
the equivalent of locking the doors, not letting any customers in or
out (it halts execution of code), and tabulating what is on the shelves
and in the shopping carts (what memory is in use). Once it has an
accurate account, it can re-open the doors and let existing
customers leave and new ones enter (program execution resumes).17
When GC pauses code, it needs to know what function the processor is currently executing, and even where in that function it is:
void ExampleFunction()
{
    int x = SquareNum(3);    // If execution stops here, no memory
                             // allocated yet.
    object o = new object(); // This allocates memory into 'o'.
    DoStuffWithObject(o);    // If execution stops here, 'o' is in use.
    int y = SquareNum(4);    // If execution stops here, 'o' is no
                             // longer in use, and can be cleaned up.
}
Hello, World! in .Net's bytecode. This is similar to the original C#, though
more verbose.
If all of the software written for the Macintosh had been written in a GC programming language, it would have been zero work for Apple and third parties to
switch to the Intel processor once the GC runtime was ported!20
In fact, an application written in a GC programming language is
automatically portable to chips that haven't even been created yet.
We impede our freedom to create new processors when software is
not written in portable languages.
Portability is one of the holy grails of computing, and while GC
code doesn't completely solve cross-operating system portability, it
does solve the situation of running the same code on different processors, which is itself an enormous step.21
With the source code or bytecode, the GC system has all the information it needs to figure out exactly what is going on when it stops
execution. In fact, it also has a lot of information that enables other
cool features like reflection, which allows code to query information
about an object at runtime. These features create a more dynamic
system.
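A small taste of what reflection looks like in C#; this sketch simply asks an object, at runtime, what it is and what it can do:

using System;

class ReflectionDemo
{
    static void Main()
    {
        object o = "hello";                    // could be any object at runtime
        Type t = o.GetType();                  // ask the object for its type
        Console.WriteLine(t.FullName);         // prints "System.String"
        foreach (var method in t.GetMethods()) // enumerate its public methods
            Console.WriteLine(method.Name);
    }
}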
We've discussed two advantages of GC: greater reliability and
portability. The next topic is code performance, which is the biggest
worry when using modern tools. I have had many discussions with
smart geeks who insisted that languages such as C# simply weren't
suitable for their fast code.
Efficiency
It doesn't matter how fast your code runs if it doesn't give the correct result, but processing power is still an important resource. In
fact, code efficiency is even less important than memory usage
because if you waste memory, your computer will eventually crash,
but if you waste processor cycles, your computer will just be sluggish and annoying. Microsoft has always focused on performance,
and has often promoted it as a competitive advantage.22
If you walk up to a programmer on the street and ask what they
think of Java, one of the answers you will get is: slow. At one time,
20 Today, Mac users have to worry about whether a program is a PowerPC binary or
an Intel binary. There is even a rumor that one day there will be four binaries:
32-bit and 64-bit versions for each of the two processors!
21 If we all use the same OS, then this OS cross-platform problem disappears :-)
22 GC code can give better performance because the runtime compiles it with the exact hardware in front
of it. C compilers are forced to generate generic code which will run on all models
of a processor. New processors aren't just faster than previous versions; they add
new instructions and other capabilities, which often go unused.
do {
    while ((a[i] < x) && (i < r)) i++;
    while ((x < a[j]) && (j > l)) j--;
    if (i <= j) {
        y = a[i];
        a[i] = a[j];
        a[j] = y;
        i++; j--;
    }
} while (i <= j);
Quicksort's inner loop. Algorithm analysis teaches you that Quicksort should be
about 50,000 times faster than Bubblesort at sorting one million numbers. The
speed of the code, not the speed of the language, is what matters.24
23 In principle, you only need to pause a program while doing GC. The good news is
that in many types of applications, from word processors to web servers, a brief
pause is acceptable. The pause is proportional to the amount of memory in use
by the application, and therefore obeys the "only pay for what you use" rule of
engineering.
However, in certain situations, pausing is not acceptable, such as in video games or
code interacting with hardware. There are many solutions, such as implementing
GC via reference counting, or making it incremental and scheduling it proactively
during idle moments, etc.
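As one concrete illustration, newer versions of the .Net runtime let a program hint that it is entering a pause-sensitive stretch of code; a sketch, in which RenderFrames is an invented stand-in for the latency-sensitive work:

using System.Runtime;

static void RenderLatencySensitiveSection()
{
    GCLatencyMode previous = GCSettings.LatencyMode;
    GCSettings.LatencyMode = GCLatencyMode.LowLatency; // ask the GC to avoid full,
                                                       // blocking collections here
    try
    {
        RenderFrames();                                // latency-sensitive work
    }
    finally
    {
        GCSettings.LatencyMode = previous;             // restore the normal policy
    }
}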
24 To sort an array of n numbers, Quicksort will do it in a time proportional to
n * log2(n) and Bubblesort will do it in time proportional to n^2. If you plug in one
million for n, Quicksort needs on the order of 20 million operations while
Bubblesort needs on the order of one trillion, a ratio of roughly 50,000.
Maintainability
The major incentive to productivity and efficiency are social
and moral rather than financial.
Peter Drucker
During the years we worked on Viaweb I read a lot of job
descriptions. A new competitor seemed to emerge out of the
woodwork every month or so. The first thing I would do, after
checking to see if they had a live online demo, was look at their
job listings. After a couple years of this I could tell which companies to worry about and which not to. The more of an IT flavor the job descriptions had, the less dangerous the company
was. The safest kind were the ones that wanted Oracle experience. You never had to worry about those. You were also safe if
they said they wanted C++ or Java developers. If they wanted
Perl or Python programmers, that would be a bit frightening;
that's starting to sound like a company where the technical
side, at least, is run by real hackers.
We were always very secretive about our competitive advantage of Lisp. Robert Morris says that I needn't be because even
if our competitors had known, they wouldn't have understood
why: If they were that smart they'd already be programming in
Lisp.
Paul Graham, Hackers and Painters
on the aggressive optimizations performed by the Java HotSpot
compiler, we were pleased to find that FreeTTS runs two to
four times faster than its native-C counterpart, Flite.
Clearly, it would be possible for us to roll some of these optimizations back into Flite with the likely result of improving
Flite's performance to levels similar to FreeTTS. The lack of
Java platform features such as garbage collection and high-performance collection utilities, however, makes performing these
optimizations in Flite much more time consuming from a programming point of view.
At the same time, few argue that assembly language had close to
the productivity of C, so this contradiction is unresolved in the
minds of many. Many computer geeks like to argue about C versus
C++, but compared to a modern and elegant language like C#, this
is like choosing between Britney and Paris.
While C++ added object-orientation features to C, it had a fatal
flaw: it was a superset of C. In fact, the early compilers just converted
C++ code to C, which was a great way to bootstrap the new
language in the many places where C was already used, but being a
superset tied the designers to the baggage of C. C++ also added
significant complexity.29 I used C++ for many years, and I liked some of the
29 Another way to analyze the maintainability of code is to analyze the maintainability of the compilers for that code. One finds that C++ compilers are big, ugly and
improvements over C, but the language is mind-numbingly complicated and generally provides many more ways to screw up than C does.
That C++ didn't start with a clean slate is the second biggest mistake in the history of computing.
If you could double developer productivity at the cost of half of
your current performance, would you take it? There isn't any universal agreement on the answer to this question among computer engineers today. However, Moore's law says that Intel's computers take
merely 18 months to become twice as fast, so in ten years computers will be another 100 times faster than they are today. A 20% drop
in performance to enable garbage collection would take Intel's hardware progress only about six months to counterbalance, and we would pay it
only once. Anders Hejlsberg, architect of C#, has said it is the best
use of Moore's law to come around in years.
Reliability issues also add a variability to the engineering cost of a
project because developers spend an unpredictable amount of time
fixing bugs. In adopting GC, developers would pay a fixed performance cost in exchange for decreased engineering costs and variability.
was reliable, rich, and easy to use. We were able to tweak the UI
based on feedback right up until the end simply because we could
without breaking things. Modern tools can go everywhere, even to
devices constrained by cost and size, and their greater use will
make consumer devices easier to use.
Conclusion
We can only see a short distance ahead, but we can see plenty
there that needs to be done.
Alan Turing, father of modern computer science
Their policy on the problem of maintenance was but a game
they seemed to be playing with a piece of rubber that could be
stretched a little, then a little more.
Ayn Rand, Atlas Shrugged
THE JAVA MESS
Java is relatively elegant, and should have replaced C and C++. The highlighted portions show the very few places that would need changing to port
to C#.
In 1995, Sun Microsystems created a next-generation programming language called Java, something that could have been the
most significant part of their legacy, more important than their
Sparc processor, Solaris (their flavor of Unix), or anything else.
Sun locked up the code
Sun's first mistake was that they failed to do what Bell Labs did
with C and C++: create freely available compilers and runtimes for
people to experiment with and extend. Instead, Sun locked up the
Java codebase, letting few see it, and letting even fewer improve it,
so that today, there exists only a small community of people, outside
of Sun itself, improving Java.
For example, because Sun locked up their code, no one was able
to port it to other processors. As I mentioned in the Linux chapter,
Debian calls itself The Universal Operating System because it contains 18,000 software components that run on 15 different processor architectures:
Intel x86 / IA-32
AMD64
Motorola 68k
Sun SPARC
Alpha
Motorola/IBM PowerPC
PowerPC 64-bit
ARM
MIPS CPUs
HP PA-RISC
IA-64
S/390
SuperH
Big-Endian ARM
Renesas's 32-bit RISC
Debian supports 15 processors, but Sun's Java web page lists just four.
Unlike the PC environment, where processors and operating systems have similar functionality, the software must be customized for embedding on low-end
hardware, something that Sun did not enable.
For its first five years, when everyone was seriously considering
using it, Java ran ten times slower than C because it was interpreted
rather than compiled, exactly like Lisp. This would have been fixed a
lot faster if Sun had involved and encouraged the existing free compiler development community.
Even in 2008, Java programs on my computer don't look like they
should. For example, when a program asks to display a file chooser
dialog box to the user, below are the results for a Java and a native
application:
Java had many significant limitations for many years because all
progress was held up by Sun.
Sun obsessed over specs
One piece of feedback Sun would have received, but did not, was how insanely complicated
their Java specs were. As an example, here is what a menu item
looks like on the screen:
and here are the 456 functions a Java MenuItem class implements:1
action, actionPropertyChanged, add, addActionListener, addAncestorListener, addChangeListener, addComponentListener,
addContainerListener, addFocusListener, addHierarchyBoundsListener, addHierarchyListener, addImpl, addInputMethodListener,
addItemListener, addKeyListener, addMenuDragMouseListener, addMenuKeyListener, addMouseListener, addMouseMotionListener,
addMouseWheelListener, addNotify, addPropertyChangeListener, addVetoableChangeListener, applyComponentOrientation,
areFocusTraversalKeysSet, bounds, checkHorizontalKey, checkImage, checkVerticalKey, clone, coalesceEvents, computeVisibleRect,
configurePropertiesFromAction, contains, countComponents, createActionListener, createActionPropertyChangeListener,
createChangeListener, createImage, createItemListener, createToolTip, createVolatileImage, deliverEvent, disable,
disableEvents, dispatchEvent, doClick, doLayout, enable, enableEvents, enableInputMethods, equals, finalize, findComponentAt,
fireActionPerformed, fireItemStateChanged, fireMenuDragMouseDragged, fireMenuDragMouseEntered, fireMenuDragMouseExited,
fireMenuKeyPressed, fireMenuKeyReleased, fireMenuKeyTyped, firePropertyChange, fireStateChanged, fireVetoableChange,
getAccelerator, getAccessibleContext, getAction, getActionCommand, getActionForKeyStroke, getActionListeners, getActionMap,
getAlignmentX, getAlignmentY, getAncestorListeners, getAutoscrolls, getBackground, getBaseline, getBaselineResizeBehavior,
getBorder, getBounds, getChangeListeners, getClass, getClientProperty, getColorModel, getComponent, getComponentAt,
getComponentCount, getComponentGraphics, getComponentListeners, getComponentOrientation, getComponentPopupMenu, getComponents,
getComponentZOrder, getConditionForKeyStroke, getContainerListeners, getCursor, getDebugGraphicsOptions, getDefaultLocale,
getDisabledIcon, getDisabledSelectedIcon, getDisplayedMnemonicIndex, getDropTarget, getFocusCycleRootAncestor,
getFocusListeners, getFocusTraversalKeys, getFocusTraversalKeysEnabled, getFocusTraversalPolicy, getFont, getFontMetrics,
getForeground, getGraphics, getGraphicsConfiguration, getHeight, getHideActionText, getHierarchyBoundsListeners,
getHierarchyListeners, getHorizontalAlignment, getHorizontalTextPosition, getIcon, getIconTextGap, getIgnoreRepaint,
getInheritsPopupMenu, getInputContext, getInputMap, getInputMethodListeners, getInputMethodRequests, getInputVerifier,
getInsets, getItemListeners, getKeyListeners, getLabel, getLayout, getListeners, getLocale, getLocation, getLocationOnScreen,
getMargin, getMaximumSize, getMenuDragMouseListeners, getMenuKeyListeners, getMinimumSize, getMnemonic, getModel,
getMouseListeners, getMouseMotionListeners, getMousePosition, getMouseWheelListeners, getMultiClickThreshhold, getName,
getNextFocusableComponent, getParent, getPeer, getPopupLocation, getPreferredSize, getPressedIcon, getPropertyChangeListeners,
getRegisteredKeyStrokes, getRolloverIcon, getRolloverSelectedIcon, getRootPane, getSelectedIcon, getSelectedObjects, getSize,
getSubElements, getText, getToolkit, getToolTipLocation, getToolTipText, getTopLevelAncestor, getTransferHandler, getTreeLock,
getUI, getUIClassID, getVerifyInputWhenFocusTarget, getVerticalAlignment, getVerticalTextPosition, getVetoableChangeListeners,
getVisibleRect, getWidth, getX, getY, gotFocus, grabFocus, handleEvent, hasFocus, hashCode, hide,
imageUpdate, fireMenuDragMouseReleased, isBorderPainted, init, insets, inside, invalidate, isAncestorOf, isArmed,
isBackgroundSet, isContentAreaFilled, isCursorSet, isDisplayable, isDoubleBuffered, isEnabled, isFocusable, isFocusCycleRoot,
isFocusOwner, isFocusPainted, isFocusTraversable, isFocusTraversalPolicyProvider, isFocusTraversalPolicySet, isFontSet,
isForegroundSet, isLightweight, isLightweightComponent, isManagingFocus, isMaximumSizeSet, isMinimumSizeSet, isOpaque,
isOptimizedDrawingEnabled, isPaintingForPrint, isPaintingTile, isPreferredSizeSet, isRequestFocusEnabled, isRolloverEnabled,
isSelected, isShowing, isValid, isValidateRoot, isVisible, keyDown, keyUp, layout, list, locate, location, lostFocus,
menuSelectionChanged, minimumSize, mouseDown, mouseDrag, mouseEnter, mouseExit, mouseMove, mouseUp, move, nextFocus, notify,
notifyAll, paint, paintAll, paintBorder, paintChildren, paintComponent, paintComponents, paintImmediately, paramString,
postEvent, preferredSize, prepareImage, print, printAll, printBorder, printChildren, printComponent, printComponents,
processComponentEvent, processComponentKeyEvent, processContainerEvent, processEvent, processFocusEvent,
processHierarchyBoundsEvent, processHierarchyEvent, processInputMethodEvent, processKeyBinding, processKeyEvent,
processMenuDragMouseEvent, processMenuKeyEvent, processMouseEvent, processMouseMotionEvent, processMouseWheelEvent,
putClientProperty, registerKeyboardAction, remove, removeActionListener, removeAll, removeAncestorListener,
removeChangeListener, removeComponentListener, removeContainerListener, removeFocusListener, removeHierarchyBoundsListener,
removeHierarchyListener, removeInputMethodListener, removeItemListener, removeKeyListener, removeMenuDragMouseListener,
removeMenuKeyListener, removeMouseListener, removeMouseMotionListener, removeMouseWheelListener, removeNotify,
removePropertyChangeListener, removeVetoableChangeListener, repaint, requestDefaultFocus, requestFocus, requestFocusInWindow,
resetKeyboardActions, reshape, resize, revalidate, scrollRectToVisible, setAccelerator, setAction, setActionCommand,
setActionMap, setAlignmentX, setAlignmentY, setArmed, setAutoscrolls, setBackground, setBorder, setBorderPainted, setBounds,
setComponentOrientation, setComponentPopupMenu, setComponentZOrder, setContentAreaFilled, setCursor, setDebugGraphicsOptions,
setDefaultLocale, setDisabledIcon, setDisabledSelectedIcon, setDisplayedMnemonicIndex, setDoubleBuffered, setDropTarget,
setEnabled, setFocusable, setFocusCycleRoot, setFocusPainted, setFocusTraversalKeys, setFocusTraversalKeysEnabled,
setFocusTraversalPolicy, setFocusTraversalPolicyProvider, setFont, setForeground, setHideActionText, setHorizontalAlignment,
setHorizontalTextPosition, setIcon, setIconTextGap, setIgnoreRepaint, setInheritsPopupMenu, setInputMap, setInputVerifier,
setLabel, setLayout, setLocale, setLocation, setMargin, setMaximumSize, setMinimumSize, setMnemonic, setModel,
setMultiClickThreshhold, setName, setNextFocusableComponent, setOpaque, setPreferredSize, setPressedIcon,
setRequestFocusEnabled, setRolloverEnabled, setRolloverIcon, setRolloverSelectedIcon, setSelected, setSelectedIcon, setSize,
setText, setToolTipText, setTransferHandler, setUI, setVerifyInputWhenFocusTarget, setVerticalAlignment,
setVerticalTextPosition, setVisible, show, size, toString, transferFocus, transferFocusBackward, transferFocusDownCycle,
transferFocusUpCycle, unregisterKeyboardAction, update, updateUI, validate, validateTree, wait
The list of functions the Swing Java MenuItem class implements, with
bizarro names like isFocusTraversalPolicyProvider and addVetoableChangeListener. Imagine if you needed to become familiar with 456
things to use your oven.
Other widget libraries, like Gtk#'s MenuItem, have hundreds of functions as well,
but many do not, and Java's is the worst. The wxWidgets MenuItem class has 50
members, and it uses native widgets on Mac, Windows, and Linux. Apple's NSMenuItem has 55.
One of the things Microsoft changed was to simplify and improve the performance
of how a Java developer called in to native operating system functionality. Java has
always allowed developers to write non-portable, operating system-specific code
when the operating system provided a feature that a developer wanted access to
but that Java did not support. By definition this code was not cross-platform, so
one would imagine that Microsoft-specific syntax to access Windows-specific features would not have been a big deal. In fact, the way a Java programmer would
call the native functionality on Windows had the same syntax between Microsoft's
RNI and Sun's JNI. It was only the way the native method was declared that was
different. Microsoft did make other changes, such as adding a keyword, delegate, but if you search the web you can find many people who wished Sun had
added that feature and code samples to enable that feature in Java.
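For readers who have never seen one, a delegate is essentially a type-safe reference to a method; a minimal C# sketch:

using System;

class DelegateDemo
{
    // A delegate type: any method taking a string and returning void fits.
    delegate void Logger(string message);

    static void Main()
    {
        Logger log = Console.WriteLine;  // point the delegate at a method
        log("Hello from a delegate");    // call it like an ordinary function
    }
}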
Sharing code across languages has historically been very difficult. The search
engine Lucene is a recent example of something which started in Java but has
been forked into versions in PHP and C#.
There are many reasons for it being so difficult to share code between different
languages. They have different ways of deploying the code, different naming conventions, and different low-level details. Interoperating layers usually add significant performance costs because everything from the strings on up the pyramid
has to be marshaled, which usually involves making a copy of the data. If you
can't agree on what a string is, then you will have difficulty sharing code. (Going
from C# to C/C++ is fast because a copy of the data doesn't need to be created;
a C Unicode string is a subset of a C# string.)
Fortress's syntax is arbitrarily different from Java in ways not at all related to high
performance computing (HPC). A programming language is by definition extensible by creating new functions and classes, and it is also possible to alter the runtime, without changing the language, to optimize it for specialized hardware, etc.
The big challenge in HPC is to robustly divide up the work amongst a farm of
servers, pass messages, and maintain and monitor the system: new software, not
new language features.
Scott McNealy, co-founder and CEO of Sun Microsystems from 1984 to 2006,
and an impediment to free software even though his company made most
of its money on hardware.
Sun's first major decision after co-founder Scott McNealy stepped down as CEO in mid-2006 was a promise to make much of its software,
including Java, free. This announcement was big news and some
6 My bible says God scattered the Babylonians because of their pride. I can't
know what they thought, but I do believe that learning about ourselves, using
technology to save and improve lives, and attempting to understand the true
nature of the world, only gives us more appreciation for what He has created. I
believe the creator of liberty and science wants us to understand these concepts,
as long as we don't lose track of our relationship with the creator somehow in the
process.
think it marks the potential for a new day in the life of Java. However, this effort is starting more than a decade after Java's creation,
arguably too late to fix most of the problems.
Even with Sun's new GPL license, there is a lot of Java code out
there that is not free and cannot be made so by Sun. Apple, for
example, took Sun's PC implementation and worked extensively to
port it to the Macintosh. This code isn't free, and it doesn't seem likely
that Apple will ever make it such. There is a company making Java
run on mainframes, and their enhancements are proprietary as well.
Even within the code Sun has recently made free, there are pieces
Sun licensed from third parties, which Sun is having to rewrite
because it can't get permission to change the license.
Even a free Java on its current trajectory will not absorb other
communities. C# is considered better than Java, so those programmers are not making a switch. There are millions of web pages written in PHP, and I don't see them rushing to switch to Sun's Java
either. Sun's code has been locked up for so long, there are few people in the outside world able to contribute. Why should the community fix a mess that was created only because Sun didn't work with
the community in the first place?
C# and .Net is the dominant platform on Windows, PHP is the
dominant language on the Web, and Python and Mono are the most
popular GC runtimes on the Linux desktop. So while Java is taught
in universities, used for custom applications in enterprises, and has
some success on mobile phones, these niches aren't enough to
ensure its long-term viability. Java has accumulated so much baggage, only some of which I have discussed here, that I think the software community should abandon it. This would also take a big step
in lessening the problem of too many programming languages. In
earlier drafts of my book, I proposed Sun create a next generation
programming language, but I now believe there already are suitable
codebases: Mono and Python.
One other concern about Mono is that it doesn't use a copyleft license. Excerpt
from the FAQ on the Mono website:
When a developer contributes code to the C# compiler or the Mono runtime
engine, we require that the author grants Novell the right to relicense his/her
contribution under other licensing terms. This allows Novell to re-distribute the
Mono source code to parties that might not want to use the GPL or LGPL versions
of the code. Particularly: embedded system vendors would obtain grants to the
Mono runtime engine and modify it for their own purposes without having to
release those changes back.
Python's advantage is that it is created with input from programmers all over the world, and has a larger set of libraries than any
other GC language:
A computational fluid dynamics (CFD) visualization of a combustion chamber. The Python community has created a wide variety of libraries for everything from gaming to scientific computing.
tool which would have made their job easier: their own language.
There are efforts such as the PyPy project, which has built a Python
compiler, in Python, which outputs C. Unfortunately, this piece of
elegance is not yet the mainline codebase and is not being considered for
that role.9 And as mentioned earlier, the language doesn't have a standard graphical programming and debugging environment. So
Python today has impediments for both casual programmers looking
for an easy way to get into programming, and professionals who
care about building high-performance applications.
There are other interesting languages and runtimes out there, but
I believe the Linux desktop community should focus on these two.
Some have proposed merging the Python language with the Mono
runtime. Mono already supports other languages in addition to C#,
some that even look like Python.
There are many good programming languages, in fact there are too
many, but I also think they are mostly good enough. One could further tweak the letters of the English alphabet to make it easier for
your eyes to distinguish them, but it isn't necessary because
what we have is good enough. Likewise, it is much more important
to build a complete set of libraries for all aspects of computing, a
Wikipedia of free code, than to worry that further language innovation is the gating factor for future progress in software.
9 One of the biggest missing features is that PyPy doesn't support Python extension modules.
THE OS BATTLE
Free software works well in a complex environment. Maybe
nobody at all understands the big picture, but evolution doesn't
require global understanding, it just requires small local
improvements and an open market (survival of the fittest).
I've been a big proponent of Microsoft Windows Vista over the
past few months, even going so far as loading it onto most of
my computers and spending hours tweaking and optimizing it.
So why, nine months after launch, am I so frustrated? The litany
of what doesn't work and what still frustrates me stretches on
endlessly.
Take sleep mode, for example. Vista promised a new low-power
sleep mode that would save energy yet enable nearly instantaneous resume. Poppycock. The brand-new dual-core system I
built a few months ago totters off to sleep but never returns. I
have to cold-start it to bring it back. This after replacing virtually every driver inside.
Take my media center PC, for example. It's supposed to serve
up photos, videos, and music. Instead, it often simply drops off
the network for absolutely no reason.
I could go on and on about the lack of drivers, the bizarre wakeup rituals, the strange and nonreproducible system quirks, and
more. But I won't bore you with the details.
Jim Louderback, Editor in Chief of PC Magazine.
IBM
IBM was the first, and is still the biggest, computer company, and
it has built many operating systems over the decades. However, it has
yet to exhibit an interest in producing a Linux distribution for PCs.
In fact, you cannot get Linux pre-installed on any of their computers
today, even though IBM/Lenovo laptops are extremely popular in the
Linux community. It is possible they gave up hope after their
antitrust lawsuit in the 1970s, the distraction of mainframes and
minicomputers, and the expensive and humiliating defeat of OS/2 by
Windows in the early 1990s.
IBM has touted its support of free software and Linux for eight
years, but has done very little to even ensure its hardware runs
smoothly on Linux. I met an Intel employee whose job was to write
device drivers for IBM hardware because IBM wasn't working on them.
These are the most notable devices that don't work: the fingerprint reader, the
broadband modem and the accelerometer which can signal to the hard drive to
park itself if the computer senses it is being dropped.
There are also missing utilities, like the ability to enable an external monitor, the
ability to recondition the laptop battery, or set the maximum charge to only 95%
of capacity to lengthen the lifetime of the battery. All of these exist on Windows.
Then there is the gray area of support: my old laptop contained an ATI graphics
card that generally worked, but because the driver was proprietary, it didn't
support 3-D, and it was buggy.
Red Hat
The first Linux distribution I used was Red Hat, the largest commercial Linux producer. One might conclude, therefore, that a big
part of the reason Linux hasn't taken off with desktops is because
Red Hat didn't really focus on building a user-friendly experience.
Red Hat built a platform usable by Google but not by our moms.
Instead, they focused on developers, servers, grid computing, the
web, etc. For many years, certain basic user scenarios, like setting
up a shared printer, have been cumbersome on Linux.
Red Hat's chief technology officer, Brian Stevens, was recently
asked, "When is Red Hat getting into the desktop space?" His
response:
To us, the desktop metaphor is dead. It's a dinosaur. Today's
users aren't sitting at home, sitting at a desk in isolation anymore. They are collaborative. They are sharing. They work and
play online. We don't believe that re-creating the Windows paradigm with just pure open source models does anything to
advance the productivity or the lives of the users.
$349 - $1300 per year, depending on whether you get two-day web response or
24/7 phone response.
Novell
The kernel also had fatal flaws for server scenarios: it didn't support preemptive
multitasking and ran application code in the kernel, therefore hurting reliability.
Debian
Large organizations cannot be versatile. A large organization is
effective through its mass rather than through its agility. Fleas
can jump many times their own height, but not an elephant.
Peter Drucker
Welcome banner for the 2007 Debconf. Debian is one of the largest engineering teams you've never heard of.
Team size is a very important metric, in fact one of the most critical ones. All other things, like productivity, being equal, the team that
has the most engineers will win. The biggest reason why Internet
Explorer beat Netscape is that Microsoft created a bigger team.4 A
larger team can do more, including absorbing new people faster and
building up institutional expertise in more areas. The lesson of Metcalfe's law is that the first to achieve critical mass wins: Google,
YouTube, Wikipedia.
In 2007, I went to the annual Debian Conference and was very
impressed with the strength of the team; many of the attendees
were of a similar caliber to my former co-workers at Microsoft, even
though they weren't screened via a day-long intensive interview
process.
In fact, many of Debian's components are packaged, updated, and
maintained by people who are using that component for their personal or professional use. Hewlett-Packard and other companies
contribute to Debian to make it work better for their customers, and
all have a voice at the table. This perspective, which transcends companies and geographies, can lead to a healthy state of affairs.
Debian has governance structures, although its leaders play a
very small role in guiding the team in any particular direction. A feature is added simply because someone decides it is a good idea.
Details are hashed out in e-mail discussions, blogs, and conferences.
The person doing the work makes the final decision, which is why
free software has been called a "do-ocracy," but the Internet allows
him to get questions answered and leverage the expertise of others.
Debian's collective expertise in understanding all of the software on
its DVD is its greatest asset.
A Debian developer's primary job is to update the many free software components to the latest version and then find and fix interaction bugs between components in the system. Debian does write its
own code, but that is a small yet important part of what is actually
on a Debian CD.
Debian is a spiritual leader of the free software community and
has a very ambitious goal embodied in its motto: to be "The Universal Operating System." While they have yet to achieve this objective, they have come very close. Debian contains 18,200 software
applications that run on 15 hardware platforms and support hundreds of languages.
Ubuntu
Perhaps for the first time, a Linux "for human beings" (Ubuntu's motto) was created.
Today, Linux (and computing in general) is mostly for male human beings.
speed up progress, but they could have done their work inside of
Debian if Mark had told them to. In addition, there is an argument
to be made that both Ubuntu and Debian are hurt by the split.
It is widely accepted in the free software community that Ubuntu
and Debian have a special relationship. Ubuntu's website says that
Debian is the rock that Ubuntu is built upon. Given that Debian is
installed on millions of machines, has been around for 15 years, and
has 1,000 developers, this analogy is apt.
While everyone agrees that Ubuntu's hurting Debian is bad for
the free software movement, no one knows the extent of damage to
Debian. There are no accepted and published metrics that geeks can
use to help analyze the problem, such as the number of Debian users
who have switched to Ubuntu.5 The Debian developer community is
growing linearly, while Ubuntu and other free software efforts are
growing exponentially.6 There certainly must have been a slowdown
at Debian around the time of Ubuntu's creation in early 2004.
We know the changes in Ubuntu could have been achieved
by Debian because Shuttleworth hired ten of the best Debian volunteer developers, and they started work in a 100% Debian codebase.
Debian is very highly respected within the Linux community, and the
pre-Ubuntu consensus was that it was just missing a little polish and
dynamism. This could have been easily fixed, especially with the
shot in the arm of a few volunteers transitioning to full-time developers. Other computer companies have done their work directly in
the Debian codebase, and Shuttleworth has never given justification
as to why he couldn't adopt a similar strategy.7
Geoffrey Moore's recent book Dealing with Darwin talks about companies getting eaten up by "context": irrelevant things not core or important to the business. With Ubuntu, Shuttleworth cre-
5 Other good ones are the rate of growth of Debian users versus other, non-Ubuntu distros. Also useful is the number of Debian developer-hours per person per week.
6 Based on a conversation with former Debian leader Anthony Towns, who said that the number of developers joining Debian has been constant over the last few years.
7 One of the biggest challenges would be for Debian to have two release cycles: one every six months, and one when it is ready, which is Debian's current modus operandi. This is non-trivial, but doable.
Former Debian leader Martin Michlmayr argues in his PhD thesis that Debian should switch to time-based releases. Debian believes they have, but it is an 18-month release cycle, and they still allow themselves to slip. I think a yearly release, perhaps on Debian's birthday, would be a good thing.
Wider use of Debian Testing would be another possibility. Debian Testing contains the latest tested versions of all the applications all the time. New versions of the applications are pushed to Testing after sitting in the Unstable branch for a few days without any major bugs being found. The package manager even supports a way to install older versions of packages, so you are never stuck.
who made it. Ubuntu publishes its source code on a website, but if a
Debian developer grabs it, and runs into problems, he is not an
expert in this code yet because it was the Ubuntu developer who
first made the change. Therefore, he will need to spend time getting
up to speed.
The time to get up to speed is comparable to the time to do the
work in the first place. In fact, the Debian developer who integrated
a huge set of X.Org patches from Ubuntu told me that they were
just a starting point, unsurprisingly providing little more help than
if he had done the work from scratch. I believe that if Shuttleworth
understood this concept, he would not have created a separate
Ubuntu.
If a different codebase had never been created and all the Debian
and Ubuntu developers were working in the same one, they would
automatically work more efficiently. They wouldn't need to redo, and
therefore re-learn, what someone else has just done. This would
enforce a division of labor and would increase the pace of progress.
Having separate teams is inefficient, but it also hurts Ubuntu's
quality. Whenever a Debian developer is re-learning about a software change first made in Ubuntu, he isn't using that time to move
forward on new things.
Furthermore, Ubuntu isn't the beneficiary of Debian's greater
expertise, which means their code is buggier than it could and
should be. Debian and Ubuntu's buglist is one of the best metrics
today for the set of obstacles preventing world domination. Ubuntu's
user base has grown dramatically, but their small and young team
has shown no ability to keep up with the new issues that have come
piling in along with the new users. In May 2006, Ubuntu had 10,000
active bugs, and in February 2008, Ubuntu had 40,000.
I discuss more about the challenge of bugs in the next chapter,
but the fact that Ubuntu has so many bugs means that there are
unsatisfied users, and this is stunting Ubuntu's growth. Debian's
much larger and more experienced team could provide great assistance, but because the team's release cycles and bug list are separate, there is no unified effort being made to resolve this challenge.
Because the Ubuntu team is smaller than the Debian team, they
argue that they are too busy to take ownership of their work inside
Debian. This idea is flawed because if someone else is redoing your
work, then you aren't actually accomplishing anything. Ya can't
change the laws of physics, Captain Kirk! If you are not
accomplishing anything, it doesn't matter how busy you think you are.
Smaller organizations should actually be more sensitive to wasted
work because they have fewer employees.
Shuttleworth claims that Ubuntu and Debian are going after different markets, but he can give no examples of features Ubuntu
wants that Debian doesn't want. If you consider the areas in which
Ubuntu has already made engineering investments: simple menus,
3-D graphics, faster startup, and educational software, it should be
obvious that Debian wants all of these features as well.8
Many of the features that Ubuntu has added, like better suspend
and resume for laptops, Debian is no longer motivated to add
because almost any Debian user who wanted this feature is now
using Ubuntu. Even if Debian does the work, they might not find the
bugs because it doesn't have that many users testing out the feature. Debian is being consigned to servers and embedded systems, which have always been their strength; however, these are areas now being targeted by Ubuntu.
In a recent blog post, Shuttleworth wrote that he admires
Debian's goal of building a universal operating system, but he also
said in the same post that he believes its objectives are unrealistic.
Mark should trust his idealistic side and realize that because software is infinitely malleable, all of his software innovations can be
put directly into Debian. There are strongly unified teams building
Wikipedia and the Linux kernel, and their success stories can be
applied here.
There is understandably a fair amount of bitterness around,
which itself decreases the morale and productivity of the community. Debian has spent over a decade doing foundational work, but
Ubuntu has made just a few improvements and grabbed all the
excitement and new volunteers. I believe Debian has been terminally damaged by the split.
A separate user community is inefficient, but this is dwarfed by
the inefficiency of the separate developer community. The greatest
long-term threat to Debian is that they stop accumulating institutional knowledge. The best way to prevent this is to encourage
Ubuntu users to join the Debian community as well. Debian is filled
8 Some argue that supporting as many processor platforms as Debian does is more work than supporting the three that Ubuntu supports, but there is very little architecture-specific code in Debian; most of it lives in the Linux kernel and the C compiler. Additionally, Debian has platform maintainers who are constantly watching for anything that breaks. As with many things, Debian already has the infrastructure and is already doing the work.
9 There is a gaming team that recently decided to do all their work in Debian and just let the changes flow downstream. If all patches flowed in both directions, as everybody claims to want, and Debian and Ubuntu shipped on the same day, how would someone decide which distro to install?
10 The Unicode support in the US release of Windows '95 was minimal and buggy.
Every non-trivial Windows application queries the version of the operating system and varies its behavior accordingly.
11 If a distribution finds a bug in the Linux kernel, it will put a bug into the bug database. The fix usually goes into the next release of the software, but a distro can
backport into their current version. The difference between the kernel in different
distros is the version and the set of backports, which are usually not noticeable to
applications.
FreeDesktop.org Projects
Avahi is a multicast DNS network service discovery library.
cairo is a vector graphics library with cross-device output support.
CJK-Unifonts are open source CJK Unicode TrueType fonts with additional support for the Minnan and Hakka languages.
Clipart is an open source clipart repository.
D-Bus is a message bus system.
Desktop VFS is a virtual file system aimed at message-loop (GUI) applications.
desktop-file-utils contains command line utilities for working with desktop entries and .menu files.
DRI is a framework for allowing direct access to graphics hardware in a safe and efficient manner.
Enchant is a new cross-platform abstraction layer for spellchecking.
Enlightenment is a desktop environment and application toolkit suite with lots of pretty pixels.
Eventuality is an "application automation meets cron" style D-Bus-based framework for scheduling arbitrary "actions" performed by conforming apps.
Fontconfig is a library for configuring and customizing font access.
GNU FriBidi is a library implementing the Unicode Bidirectional Algorithm and Arabic Joining/Shaping.
Galago is a desktop-neutral presence system.
glitz is an OpenGL 2D graphics library and a backend for GL output in cairo.
GStreamer is a streaming media framework.
GTK-Qt Theme Engine is a project to unify the GTK and Qt theming engines.
HAL is a specification and an implementation of a hardware abstraction layer.
HarfBuzz is the common OpenType Layout engine shared by Pango, Qt, and possibly others.
Hieroglyph is a PostScript rendering library.
icon-slicer is a utility for generating icon themes and libXcursor cursor themes.
icon-theme contains the standard and also references the default icon theme, called hicolor.
IMBUS is a common tier-1 architecture of IM frameworks for connecting input method engine containers and client application libraries.
immodule for Qt is a modular, extensible input method subsystem for Qt.
IPCF is an inter-personal communication framework.
LDTP is the Linux Desktop Testing Project.
libburn is an open source library suite for reading, mastering and writing optical discs.
libmimetype is a simple implementation for accessing the shared-mime-database, included in PCManFM, a lightweight graphical file manager featuring speed, low resource usage, and tabbed browsing. This small GPL'd library can be used for mime-type handling as a lightweight replacement for xdgmime.
liboil is a library that makes it easier to develop and maintain code written for MMX/SSE/Altivec extensions.
Apple
After Woz hooked his haywire rig up to the living-room TV, he
turned it on, and there on the screen I saw a crude Breakout
game in full color! Now I was really amazed. This was much
better than the crude color graphics from the Cromemco Dazzler. ... "How do you like that?" said Jobs, smiling. "We're going to dump the Apple I and only work on the Apple II." "Steve," I said, "if you do that you will never sell another computer. You promised BASIC for the Apple I, and most dealers haven't sold the boards they bought from you. If you come out with an improved Model II they will be stuck. Put it on the back burner until you deliver on your promises."
Stan Veit, former Editor-in-Chief, Computer Shopper
Apple's iPod and iPhone may be sexy and profitable, but these
small devices are specialized in function, so there isn't a lot to say
about them. An iPod is busy when playing music, whereas when
your computer plays music, it uses less than 1% of its computing
power, which is not even noticeable.
For most of Apple's existence, they never really grasped the relationship between market share and a developer community. For example, Macs have historically not been allowed in enterprises because no one added the necessary features and applications; the Mac never got the requisite market share to make anyone want to bother.
Microsoft understood the virtuous cycle between users and developers, and knew that making it easy to build applications would
make Microsoft's ecosystems successful. Bill Gates brags that
Microsoft has ten times as many partners as Apple, and tools like
Visual Basic and FrontPage were important reasons why.
This internal focus that has limited the Mac's potential marketshare is now playing itself out with their new devices. Symptoms of
this mindset are noticeable in the most basic scenarios: you cannot
drag and drop music on and off an iPod, as you can with a digital
camera. Even if you could copy over your files, unless it is in one of
the few formats Apple can be bothered to support, you would still
not be able to play it.
While Moore's law will push these new devices further up the
computing value chain, it isn't clear Steve Jobs understands the
value of having a developer community extend his platforms
do it right, we can store our music in one format for decades, even
forever.
A digital format is something not tied to a hardware medium in
the way that the VHS format was tied to VHS tapes; everything digital can be copied to a hard drive or a USB key. (Iraqi terrorist Zarqawi's death was a big setback for al-Qaeda because we recovered a
USB key containing his terrorist documents and music. One day, the
only thing we will ever need to waterboard terrorists for is their
passwords.)
We have been able to create basically one format for digital cameras, JPEG, which is free and efficient. We should have been able
to do this for audio as well because the underlying math is similar!
Apple made the digital audio format problem worse by endorsing
one that only they use, AAC. And, by adding DRM, they only allow
you to play your music on their one device and in their one application. Apple has added hassles and created doubt about whether you
will ever control, and therefore truly own, your music.
Steve Jobs is ecstatic that iTunes has sold 2.5 billion songs in five years, but when you consider that the music business is a $40-billion-per-year industry, and that Apple has no serious competitors in digital music, that number is modest.
I've met many people who have told me that they won't buy an
iPod again because of these and other issues. In fact, I run an alternative OS on my iPod, called Rockbox.14 It supports more audio formats, lets me copy files back and forth like you can with a digital
camera, and it even sounds better because of a feature known as crossfeed. (When you listen to music on room speakers, each ear can
hear music, slightly delayed, from both the left and right channel.
However, this does not happen with headphones. Rockbox uses
clever algorithms to simulate the sound from the opposite channel,
slightly delayed, to make headphones sound more natural.)
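The effect is simple enough to sketch in a few lines of Python. This is only a rough illustration of crossfeed, not Rockbox's actual code, and the delay and gain values are guesses rather than its real settings:

    import numpy as np

    def crossfeed(left, right, sample_rate=44100, delay_ms=0.3, gain=0.6):
        # Convert the delay to a whole number of samples.
        delay = max(1, int(sample_rate * delay_ms / 1000.0))
        pad = np.zeros(delay)
        left_delayed = np.concatenate([pad, left])[:len(left)]
        right_delayed = np.concatenate([pad, right])[:len(right)]
        # Each ear hears its own channel plus a quieter, delayed copy of the other.
        return left + gain * right_delayed, right + gain * left_delayed

    # Example: one second of a stereo test tone at 44.1 kHz.
    t = np.linspace(0, 1, 44100, endpoint=False)
    left, right = np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 660 * t)
    mixed_left, mixed_right = crossfeed(left, right)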
Rockbox also comes with more software and other advantages. It
is perfectly logical that the free software community can do a better
job than Apple because Apple likely had 20 software developers
writing code for the iPod device, whereas Rockbox has had more
than 300 contributors, and itself reuses a lot of other free code.15
14 iPodLinux doesn't quite work yet. The installer repeatedly crashed; after installation I couldn't find a way to play music; I had to reboot into Apple's iPod OS to copy music over; and so on. The momentum is with Rockbox.
15 My biggest complaint with Rockbox is that they haven't standardized the back
button.
Even worse for the iPod's future, I suspect that Apple scavenged some of the best people from the iPod team when staffing up the iPhone; Microsoft would have done so.
Apple is now reversing course and providing more music DRM-free, but they are still making it difficult to put music onto your
device, not letting other devices play the songs in your music library,
and are now preventing the installation of third-party software on
their new hardware. It is only because Apple has so little overall
marketshare that they can get away with this sort of behavior.
Mac OS X Kernel
Apple giving about as much attention to the iPod as the Mac is like Ford
focusing their R&D on bling because they suddenly started making half of
their profits on chrome rims.
Mac OS 9, the last release of the original Apple kernel and the official version until 2001.
Mac OS X, the Macintosh's first release based on a kernel that is free software.
Portion of the Unix family tree. The biggest reason why Unix and Linux haven't beaten Windows yet is that the workstation vendors didn't work together on the software, and so kept re-implementing each other's features.
16 Apple's kernel also contains pieces of the Mach kernel, but their OS contains
much more code from BSD. In fact, I can see little to no benefit from using any
Mach code as opposed to its equivalent from BSD. Some have described Apple's
combined kernel as a software Frankenstein.
17 When Apple started shipping computers with four and more processors, BSD
Unix's performance was worse than Linux's, and in 2003, BSD added scalability
optimizations that Linux had added four years earlier. Linux kernel guru Greg
Kroah-Hartman has said that Linux runs faster inside a virtual machine than the
BSD-based Mac OS hosting it.
Software
Nobody has ever had more contempt for customers than Steve
Jobs.
Eben Moglen
With less of an internal focus, Apple could even have done a few
things to make their OS more compatible with Windows, to increase
sales and to better tempt Windows users into switching. There is
plenty of free software out there to enable interoperation with Windows technologies. Out of the box, Windows Media is treated by the
Mac like a text file; there is code out there to fix this, but Apple
doesn't provide it. It's as if supporting Windows Media is a concession that weakens Apple. We aren't even talking about having music-creating software support that format, simply the ability to play these files!
Their iChat program doesn't support MSN messenger, though it
does support AOL and Yahoo. Given the unnecessary hurdles Apple
has created for Windows users, it is not surprising that their market
share remains so low. There is even a free Win32 implementation
known as WINE that would allow Windows software to run on the
Mac; yet another way to storm Microsoft's beachfront that Apple
hasn't adopted.
18 WINE is one of the biggest pieces of Windows compatibility code out there, and it
alone is much bigger than all the Apple compatibility code one would need to convert someone from a Mac to Linux.
multimedia and other tools. They are neither using a fraction of the
free software they could, nor releasing much of their code as free
software.
In general, other free software should be looked at as an opportunity for Apple. But what about making all of Apple's software free? If
Apple's software were made free, Apple could work more closely
with the global free software community and create a better product
for their customers. However, the downside is that this software
could get ported to Windows and Linux, and create fewer reasons to
purchase a Mac.
However, if a Linux desktop takes off, Apple's OS will suffer the
same fate as Windows and force Apple into being a hardware-only
company like Dell and the rest; all the value they have built up will
be gone. Once people are satisfied using free software, they don't
usually go back to proprietary software. For example, free code to
convert from proprietary formats has much more demand than code
to convert to them.
If Apple's free software is used on other OSes and hardware, they
risk becoming a company competing primarily on hardware features, but they have that risk already, and at least would be in control of their own destiny. Apple could be leading the free software
movement! In addition, there are many ways to monetize users of
Apple's free software who are running it on other hardware platforms. It seems unlikely that Apple's free software would be widely used while their hardware was not.
Windows Vista
When I turned on the computer for the first time, it spent five
minutes checking its performance before I could do anything;
it's rather ironic that this task is slowing down my computer.
Internet Explorer provided two toolbar textboxes to search
the web, both using MSN's search. When I create a new tab
using the shortcut key after I launch IE, it would say it could
not connect, even though the computer had long ago established an Internet connection. Apparently, the typing part of
IE was initialized before the networking part.
Windows Update wanted to install an Office 2003 service
pack even though I could find no Office 2003 installed. Office
2007 was installed as demoware, even though I chose not to
purchase it.
There was a tool to resize the hard drive to make room for
Linux, but it insisted on reserving 44 gigs of free space for
Vista!
The disk defragmentation tool said my hard drive contained
3055 fragmented files and 13,265 excess fragments and
that the drive was heavily fragmented. I guess when Vista
checked the computer's performance, it didn't notice the
filesystem fragmentation issues. The defrag tool was nagware and kept encouraging me to purchase a better version.
(In Linux, the file systems have much less fragmentation
because the code is smarter.19)
I found seven copies of system files like i8042prt.sys.
I wanted to get Windows Media to play my OGG audio files. I
told Windows Media I want to install plugins and it took me
to a Microsoft-maintained web page with links to 3rd party
plugins. I find one and install it. Once installed, I cannot see
the plugin in the media player's list of installed plugins.
I downloaded an OGG file in Internet Explorer, and Windows
Media finds and plays the file, but I can't see where it was
put. I use the search feature to look for it, but search doesn't
find it. I try to refresh the index, and it warns me that performance will be slow because it is a background task, which is
true because it doesn't even attempt to rebuild it.
I checked on the Windows Media settings for ripping music.
The default is to rip into 128 kbps WMA, which has never
been good enough for archival purposes. It doesn't allow me
to rip into anything higher than 192 kbps. I then discover
that there are four types of Windows Media, including an
option to use variable bit rate. (VBR encoding uses more or
less bits depending on how much the music needs at that
moment to maintain a certain level of quality.) The default is
19 My ext3 user partition is 67% full and has 4% non-contiguous inodes. My system
partition is 52% full and only 1.6%. You would expect the system partition, where
not much writing happens, to not be very fragmented. In Windows, it was the system files that were fragmented because it had no user files on the computer yet.
20 Microsoft isn't able to just give you a new kernel with updated hardware support
without giving you a whole new OS. Driver writers building code for the next OS
don't want to support the old OS in addition, especially if that next version of Windows is just around the corner.
Microsoft has to threaten and plead with hardware vendors to get them to do all
the work that MS creates for them. In order to get the 'Designed for Windows
2008 Logo' you need to do this. I heard a Microsoft Windows Server evangelist
say that their 32-bit operating system was going to be retired after Vista, which
was why everyone should just go ahead and start work on the 64-bit drivers right
now; this coming from a company that shipped remnants of 16-bit DOS and Windows 1.0 in Windows Me. A 64-bit Windows simply requires that all hardware vendors recompile their drivers, but because there is no unified tree, this is a hard
problem involving a lot of coaxing.
CHALLENGES FOR
FREE SOFTWARE
Bill doesn't really want to review your spec, a colleague told
me. He just wants to make sure you've got it under control.
His standard MO is to ask harder and harder questions until
you admit that you don't know, and then he can yell at you for
being unprepared. Nobody was really sure what happens if you
answer the hardest question he can come up with, because it's
never happened before.
Watching nonprogrammers trying to run software companies is
like watching someone who doesn't know how to surf trying to
surf. Even if he has great advisers standing on the shore telling
him what to do, he still falls off the board again and again. The
cult of the MBA likes to believe that you can run organizations
that do things that you don't understand. But often, you can't.
Joel Spolsky
The mode by which the inevitable is reached is effort.
Felix Frankfurter, US Supreme Court Justice
Free software has been around since 1985, and yet has only
1% marketshare on the desktop today. Free software has
tremendous potential, but the community needs to execute
better to win. In fact, until free software succeeds on the desktop,
its last and biggest challenge, many will continue to question
whether it is even viable.
As a side note, while Microsoft doesn't have a great reputation
right now, especially because of Vista and malware, there is a lot of
clever code and brilliant engineers inside the company. I was fortunate to learn from a bunch of great people in groups that fostered
cultures of very high quality software.
When Microsoft got serious about Internet Explorer, Netscape's rag-tag team of kids just out of college didn't have a chance. MS took some of their top engineers in text engines, networking, forms, internationalization, performance, and object models, and put together a large, world-class team. Microsoft's institutional knowledge of so many areas of software could be applied to any effort. People forget all that, but this is why the reviewers consistently would say that Windows was better than OS/2, Microsoft Word was easier to use than WordPerfect, IE was faster than Netscape, Excel was richer than Lotus 1-2-3, ad infinitum. PC Magazine wrote in 1997:
One of the best ways to cure the problems of free software is simply to create more of it. If you talk to someone about their Linux
experience, they might complain that they can't play DVDs out of
the box, that some proprietary drivers are missing, or that iTunes or
other proprietary software does not natively run on Linux.
Getting vendors to provide their proprietary software on Linux
might sound like a great help, but building a completely free software stack should be the principal focus. Richard Stallman wrote:
Adding non-free software to the GNU/Linux system may
increase the popularity, if by popularity we mean the number of
people using some of GNU/Linux in combination with non-free
software. But at the same time, it implicitly encourages the
community to accept non-free software as a good thing, and forget the goal of freedom. It is no use driving faster if you can't
stay on the road.
Some GNU/Linux operating system distributions add proprietary packages to the basic free system, and they invite users
to consider this an advantage, rather than a step backwards
from freedom.
Cash Donations
It is every man's obligation to put back into the world at least
the equivalent of what he takes out of it.
Albert Einstein
The people at Sun who are responsible for OpenOffice must not realize its importance, or they would not put so few developers on it. In spite of the
fact that Sun has over 30,000 employees, Microsoft has twice as
many programmers working on Internet Explorer as Sun has working on all of OpenOffice! I wouldn't work on something as big and
complicated as OpenOffice without getting paid, which is why I
donated money.
We are used to paying hundreds of dollars for proprietary software applications, and so we should be willing to invest smaller
amounts for free software. I went on a guided tour of Mt. Saint
Helens, and at the end the Forest Service Guide said that if you
liked the tour, you should donate $5, but if you didn't, you should
donate $20 so they could improve the tour. That should be the
attitude of the free software community: if you want more, contribute more. The nice thing about this model is that people can donate
what they can afford. If you're a student or a business in India, the
amount you can donate back for use of a free database is much less
than if you are the Amazon corporation and you have a farm of servers
running the same software.
Devices
The inside of a computer is dumb as hell but it goes like mad.
Richard Feynman, 1984 (Moore's law says this statement is
65,536 times more true today.)
There are enormous markets for free software below the PC for
which Microsoft has not built a dominant ecosystem, nor one with
strong ties to the desktop. These markets should be much easier for
free software developers to infiltrate and dominate:
Chart: the embedded operating system market is split among Windows, Linux, VxWorks, other operating systems, and designs with no formal OS, with shares of 37%, 32%, 12%, 12%, and 7%.
The fragmented state of embedded operating systems. This survey also
showed that planned use of Linux is expected to increase 280% for the
next wave of projects. However, that would still only give it 33% of the
market.
be enabled with Linux, up from 8.1 million in 2007. Many of the proprietary embedded OSes enjoy their success primarily because of
inertia.
Moore's law doesn't just apply to our computers, it will also
ensure that small new devices will proliferate like mice in springtime. And those mice will need software to make them interesting.
One device I'm looking for is a robotic mouse for my cat:
He's regal and mellow and while he happily enjoys lots of quality
time on da cowsche and refrigerator, his hunting instincts are prolific.1 I've purchased many toys for him to discover how he learns
and plays, but while each one has provided a few minutes to a few
hours of entertainment, none of the toys provides him the thrill and
challenge of stalking and catching live prey.
One of Davis's favorite toys is a ping-pong ball. However, while it
bounces around and can quickly disappear, it isn't alive and doesn't
1 As a bachelor who has spent too much time with computers, I've relied on Davis
to teach me lessons about how to better interact with humanoids. When I come
home at night, tired and ready to relax, my cat is just waking from his day-long
slumber and is ready to play. If I ignore him, he will persistently meow at me, letting me know he is bored and wants to be entertained; living inside an apartment
simply doesn't provide enough stimulation for him.
change direction under its own power. I can watch him calculate a
path to pounce on a toy, realize it isn't a challenge for his prowess,
sneer, and lose interest.
I'd like a robotic mouse that is quiet, agile and fast. Maybe when
he catches it, the mouse could open a hatch and release some food.
It would be nice to come home from work and be able to sit on the
couch and commiserate with my cat about how exhausted we both
are from earning our meal.
Better toys are in our future, and so are safer ones. My sister has
a beautiful Husky, an interesting animal because, like my sister, it
has no sense of awareness of its location. Huskies are bred for
endurance and strength and can head in any given direction for
great distances but are never able to retrace their steps. I'd like to
be able to give him a collar with GPS and a cellphone-like transmitter that could report back on his location. Anyone who has lost a pet
understands the frustration. A similar thing could also be used for
young children. A GPS-device would allow you to sleep soundly at
night knowing you can always find your vulnerable family members.
These are just two of the many innovative devices I can imagine,
and these devices require little new hardware technology beyond
what we already have at our fingertips. Building such devices would
cost hundreds of dollars today, but the relentless march of Moore's
law suggests that the price will keep halving every 18 months; this
combined with free software will make the future very interesting!
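As a back-of-the-envelope sketch of that assumption, here is the arithmetic in Python; the $300 starting price is an arbitrary illustration, not a quote for any real device:

    # One halving of the price every 18 months, starting from $300.
    price = 300.0
    months = 0
    while months <= 90:  # roughly seven and a half years out
        print(f"after {months:2d} months: about ${price:.2f}")
        months += 18
        price /= 2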
Reverse Engineering
Imagine trying to fix a car without having any of the documentation produced by the car manufacturer. Much technology in the
computing world today is undocumented, which means that a lot of
what free software developers have to do is reverse-engineering, a
tedious and time-consuming task. I don't believe the Publisher format is publicly documented, so someone would need to create lots of
little files and look through the on-disk binary to figure out the particulars of how formatting is stored on disk. It is sort of amazing that
millions of people create files in formats that are not publicly
described, yet this is the state of the industry today. Unfortunately,
this means the user is enslaved to his software vendor, and arguably doesn't even truly own his own documents.
If everyone used free standards, let alone free code, software
progress would be faster. Today, if a standard is created, the community almost never gets any code to go with it. It might be best to
have code be the official reference, rather than a written specification, but today we often have neither. If the computing world ever
breaks out of this vicious cycle, maybe we could focus our collective
efforts on the truly hard challenges.
PC Hardware
We love Linux, and we're doing our best to support the Linux
community. We see the Linux desktop as a customer-driven
activity. If customers want it, well, Dell will give it to them.
Michael Dell
You might not believe that, but I have the early bad reviews to prove it!
Microsoft's software is considered unreliable today not because the developers
don't care, but because they are burdened by old code, and have a development
organization far too small for the vast scope of technology they release. They have
perhaps 10,000 developers in total, whereas the Linux kernel alone has 3,000
developers.
Fixing these bugs might be tedious, especially when the developer doesn't have access to the hardware he is trying to debug, but
everything about hardware is tedious so they may as well just do it
now. PC hardware contains 10,000 devices, along with a tremendous
amount of unnecessary complexity and undocumented designs, but
the messiness of the PC has existed from the moment Linus first created his kernel. Linus, however, has the means to make all PC hardware work on Linux: lean on hardware vendors to support free
software, and crack the whip on his kernel developers!
1,400 bugs for something as big and actively evolving as the
Linux kernel is not a disaster at all. But if Linux kernel developers
focused on the buglist for just a few months, the bug total could be
brought down to less than 50. Linux's goal should be to fix 90% of
new bugs in the next release, and 99% within two releases. This
accomplishment would represent yet another breakthrough for a
project of this size and complexity. Linux is a superior kernel to Windows today, but it needs a bit more work on the mess that is today's
PC hardware.7 The whip Linus must crack on his kernel developers
is actually nothing more than a feather boa.
Metrics
One of the great mistakes is to judge policies and programs by
their intentions rather than their results.
Milton Friedman
What's measured improves.
Peter Drucker
Microsoft conducts yearly polls that give them statistically significant data on the state of the company. If they want to know whether
developers are happier than they were five years before, they
can get hard numbers on this. By gathering the right data and
watching it over time, management can fix any organizational
trends heading in the wrong direction.
This data was collected and managed by the Human Resources
Department, but every other Microsoft team had metrics as well,
and by far the most important was the bug count. The Microsoft
buglist almost isn't even considered a metric because it is the driving force in the development process.
7 In fact, if hardware companies did not finalize their hardware until their Linux driver was written, hardware itself would become more reliable. Most hardware today is designed before the software is written. When the drivers finally get written, they can expose hardware bugs. Free drivers allow for better hardware and simpler software.
Once the drivers are in the kernel tree, there is no need for a
hardware vendor to pre-install Linux because the user can do it
themselves. Windows is installed by hardware manufacturers
because the retail version is missing drivers, but this need not be
the case with Linux. (In fact, if your hardware is supported,
installing Linux takes less than an hour.)
Eric Raymond, in his recent essay World Domination 201,
argues that the free software community should only purchase PCs
from vendors that ship Linux. However, it is much simpler just to
convince hardware vendors to support Linux than to create new
multinational corporations.
Even if vendors don't provide support for Linux yet, they could
make sure it runs. Again, while people might debate whether free
software is good for software companies, it is inarguably a benefit
for hardware companies because it lowers the cost of a computer. In
fact, PC hardware vendors don't even need to do any work other
than demand Linux drivers from their component vendors. The
power easily lies within the PC hardware vendors to apply the necessary pressure.
Shipping computers with Linux is not sufficient for world domination because hardware vendors don't have the means to solve some
of Linux's problems. Dell cannot ensure that Apple's iTunes works
on Linux, or that OpenOffice can read your Publisher files.
The Desktop
The only strategy in getting people to switch to your product is
to eliminate barriers. Imagine that it's 1991. The dominant
spreadsheet, with 100% market share, is Lotus 123. You're the
product manager for Microsoft Excel. Ask yourself: what are
the barriers to switching? What keeps users from becoming
Excel customers tomorrow? Think of these barriers as an obstacle course that people have to run before you can count them
as your customers. If you start out with a field of 1000 runners,
about half of them will trip on the tires; half of the survivors
won't be strong enough to jump the wall; half of those survivors
will fall off the rope ladder into the mud, and so on, until only 1
or 2 people actually overcome all the hurdles. With 8 or 9 barriers, everybody will have one non-negotiable deal killer.
This calculus means that eliminating barriers to switching is
the most important thing you have to do if you want to take
over an existing market, because eliminating just one barrier
will likely double your sales. Eliminate two barriers, and you'll
double your sales again. Microsoft looked at the list of Lotus
123 barriers and worked on all of them:
(A table listing each barrier and Microsoft's solution for it followed.)
And it worked pretty well. By incessant pounding on eliminating barriers, they slowly pried some market share away from
Lotus.
Joel Spolsky, Joel on Software
Free software developers have had a long row to hoe. As proprietary software has been the dominant model for decades, many
geeks have had to live with one foot in each world; Linux programmers today are often forced to use Macs, Windows and other proprietary software.
Approachability
The adversary she found herself forced to fight was not worth
matching or beating; it was not a superior ability which she
would have found honor in challenging; it was ineptitude: a gray spread of cotton that seemed soft and shapeless, that
could offer no resistance to anything or anybody, yet managed
to be a barrier in her way. She stood, disarmed, before the riddle of what made this possible. She could find no answer.
Ayn Rand, Atlas Shrugged
Assuming you don't run into a situation where you need a network card driver, but
you can't download it until you have a functioning network card!
great one for every free software organization. A big challenge for
the next few years is to make the entire free software stack easier to
use and more approachable for new users and programmers.
Many wonder whether Linux needs a killer application to beat
Windows. Linux's status today of being mostly compatible with, and
slightly more reliable than, Windows might not be enough to overcome the inertia required for a worldwide switch, even to something
that is free to acquire.
It is true that if Linux were to enable robust continuous speech
recognition, or some other transformational feature, Linux could
more quickly take over, but even that OS would also need to support
all your hardware and file formats as well. However, Linux already
has what I think is a killer feature: the rich set of programs that
come with it, although they could still use a bit more work. For
example, the most popular high-end graphics editor in Linux, poorly
named the GIMP, is comparable to Photoshop, suitable for professionals, and free! However, like Photoshop, it isn't particularly
approachable for a new person to jump in and start using. The first
time you use GIMP, it might take 20 minutes to figure out how to
crop an image. It takes time to learn any powerful tool, but a lesson
in engineering is: You only pay for what you use. Simple things
should be simple to do, and today this is not always the case. While
many of the free applications are good enough to convert people
from Windows, and as good as competing proprietary products, they
could still be dramatically better.
While the free software community is already doing good business, especially for websites and embedded applications, PC hardware and software are the biggest and final challenge. The minimal
requirements, the Web and Office, are almost met today, yet its true
potential, its Wikipedia-scale potential, is something much larger
and several years of work away.
Once we start to coalesce into a few very good codebases, then
the PhDs will be motivated to jump in. Nowadays, a graphics
researcher might use GIMP to manipulate images, but he would definitely not use GIMP's source code as the basis for his research.
Eventually computer scientists around the world will instinctively
realize the place to contribute is in the free software code, just as
there are linguists and computer scientists today using Wikipedia as
the basis for their research. Once the PhDs get involved, software
will get very interesting.
Monoculture
Next to the problem of code written in old programming languages, the second biggest challenge the free software community
faces is the amount of duplicate code that exists. There are several
dictionaries in Linux, and so when you add a custom word to one,
only some applications notice it. In Linux there are too many paint
programs, media players, download managers, RSS readers, programming environments, source control systems, etc. Each piece of
duplicate code is a missed opportunity for sharing and slows
progress. On the other hand, the free software community has gotten together on primarily one kernel, Linux, and one web server,
Apache.
There is an interesting debate in the software community about
the worry of a software monoculture. Like a monoculture in the biological world, one virus or other problem could destroy the entire
ecosystem, and some argue that we run similar risks in the software
world.
While a monoculture may be a risk, having different codebases
doesn't necessarily help: any new, previously unforeseen exploit is
able to cause damage to all codebases because none of them
was designed for it. A castle's walls may stop men, but they were not
designed to stop cannonballs or helicopters.
While the little differences between codebases add an extra level
of variability that makes building a virus harder, software is different from DNA (today at least!) because of our ability to infinitely
recombine it. It is possible to make a product easy to use, and powerful, rich and reliable, fast and maintainable. Because software is
infinitely malleable, all the best ideas can be incorporated into a single product.
In fact, the monoculture risk only applies to the proprietary software world because in that model we are unable to infinitely recombine ideas. There are hundreds of millions of Windows users, but
there are only a few thousand people at Microsoft who can make
changes to it.
If the monoculture risk doesn't apply to the free software world,
we should focus more on working together. Progress in the free software world is slow where people are not working on the same codebases. As described in the tools chapter, switching to modern tools
will bring a 2x productivity improvement, and consolidating various
efforts into unified codebases will bring an additional 5x productivity improvement.
Parody of Eclipse packaging, the most popular free developer tool, from the
funny folks at FarOutShirts.com. I took a few pages of notes of my frustrating experiences but decided that the picture above did a better job.
Backward Compatibility
As described in the Linux chapter, all of the source for the entire
software stack is publicly available so fixes can be made in the
proper place. This is an important part of what makes free software
simpler, more maintainable and more reliable over time.
The free software stack has traded backward compatibility for simplicity, and so there is a downside. For example, Windows lets you
download drivers off the web that will work on multiple versions of
its OS, but Linux does not support this scenario because the internal
kernel architecture is constantly changing. If hardware vendors did
make Linux drivers available on the web, they would potentially
have different drivers for every version of the kernel. The best way
to get a new driver in Linux is to grab the latest kernel, which is
where all the new drivers go.
It is possible that if you buy a new piece of hardware, your old
kernel might not have support for the new driver. What then?
To be clear, this scenario is contrived in several ways:
To use the Internet, you need software that supports two big
standards: TCP/HTTP and HTML. There is no HTML standard competing with an HTMM standard, as the idea is silly
on its face, yet such redundancies exist in many other areas in the
world of bits today. When you can't agree on a file format, your ability to exchange information goes from 1 to 0.
Free software has been a part of the Internet since the beginning.
In fact, a website needs to send you software, in the form of HTML
and JavaScript, in order for you to have something to look at and
interact with. It is easy to learn how a website does its magic as the
tags of an HTML document are self-describing and, on top of that,
there is an organization called W3C whose job is to fully describe
them.
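As a small illustration of how approachable the format is, Python's standard library can list the tags of a page in a few lines; the snippet being parsed here is made up:

    from html.parser import HTMLParser

    class TagLister(HTMLParser):
        # Print every opening tag and its attributes as the parser sees them.
        def handle_starttag(self, tag, attrs):
            print("tag:", tag, dict(attrs))

    TagLister().feed('<html><body><p class="intro">Hello, <a href="/about">world</a>!</p></body></html>')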
In contrast, to display a Word or WordPerfect document, you had
to reverse-engineer a complicated binary file format! This prevented
their widespread use as the standard document format of the Internet.
Digital Images
Unlike audio and video, in the realm of still images things are in
good shape. JPEG is an efficient, free, and widely-supported standard for compression of images.1 There might be a couple of standards better than JPEG out there, but the end is nigh. (There is a
JPEG 2000 standard based on wavelets2 which is 20% better than
JPEG, but it has higher memory and processing demands. Sometimes to get a little more compression, you have to do a lot more
work, and so you reach a point of diminishing returns.)
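JPEG's own quality setting already shows how quickly the returns diminish. Here is a minimal sketch using the third-party Pillow library; the file name and quality values are arbitrary assumptions:

    import os
    from PIL import Image

    # "photo.png" is a hypothetical input file; JPEG has no alpha channel,
    # hence the convert("RGB").
    img = Image.open("photo.png").convert("RGB")
    for quality in (50, 75, 90, 95):
        name = f"photo_q{quality}.jpg"
        img.save(name, "JPEG", quality=quality)
        print(quality, os.path.getsize(name), "bytes")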
Microsoft in early 2007 announced a new Windows Media Photo
format, which is also 20% better than JPEG, but without higher
memory and processing demands like those required by JPEG 2000.
The new format is based on JPEG, but with nine small tweaks. The
spec is public, and there is even free public source code, but the
license explicitly excludes it from being used in combination with
copyleft licenses:
2. c. Distribution Restrictions. You may not modify or distribute the source code of any Distributable Code so that any part
of it becomes subject to an Excluded License. An Excluded
License is one that requires, as a condition of use, modification
or distribution, that the code be disclosed or distributed in
source code form; or others have the right to modify it.
Digital Audio
It is the mess of proprietary standards and patent restrictions that is impeding progress in digital audio. Proprietary software companies have been pushing their standards down our throats. If
1 There is also PNG and GIF for lossless compression, but they are not suitable for real-world images with continuous gradients like clouds. I took a high-quality 1.9 MB JPEG and converted it to PNG, and it became 2.9 times bigger.
Interestingly, these lossless formats do a better job than JPEG for certain images like screenshots, because JPEG doesn't handle well the sharp transitions from black to white that you find on a computer screen. A JPEG of a screenshot is 2.2 times larger than an equivalent PNG. With a JPEG of the same size, you find it has added gray display artifacts at the black/white transitions.
PNG was only created because, after GIF became popular, Unisys began enforcing its patent on GIF's LZW compression.
2 Wikipedia: A wavelet is a kind of mathematical function used to divide a given function or continuous-time signal into different frequency components and study each component with a resolution that matches its scale.
in the first place? The best way for faster adoption would be to create a service where you mail in your old VHS or DVDs, and they
mailed you back HD versions of them for a few dollars each. That
could be a huge, if low-margin, business.
Consumers would adopt new standards faster if they could immediately enjoy everything they already own in that new format at
something approaching the actual cost of producing a disc, instead
of the full retail price. This is another area where shrinking the
copyright expiration will help. It will make nearly free all those
things we paid a license fee for, but which we now only have low-quality reproductions of.
At some point, the industry should remove the Interpol and DHS
warnings about why we shouldn't have stolen what we are about to
watch. When you stick in a disc, there should be two buttons that
show up within 5 seconds: Play and Menu. I have a boxed set of
DVDs that display four minutes of introductory warnings and self-promotion that I cannot even skip through, before playing the actual
content. I once put in the wrong disc and so had to repeat the
hassle. This may seem petty, but these encroachments get worse. I
have to accept the license agreement of my navigation system every
time I turn on my car!4 When people feel ripped off and treated like
a sucker, a culture is created where people decide not to pay for
what is sold. Respect is a two-way street.
4 Suppose I don't activate my nav system one day because I can't be bothered. Then
I get into an accident and die because the system wasn't helping me. It could
therefore be argued that the need to accept the license agreement played a factor
in the cause of my death. The good news is that it would create standing to sue!
Now, if only we could come up with a way for someone to die because they had to
sit through all that stuff at the beginning of a video so we could get standing for
that!
The shapes of the Latin alphabet are mostly happenstance. Many details in this world do not matter; it is only that we agree to them.
In a binary stream, a code like 01 might mean on and 00 might mean off, but an undocumented code such as 06 leaves the reader stuck: ?? ?!
If you don't know what '06' means, you don't know what is coming next, and so you can't continue and must abort. Even if you
wanted to skip over it, you can't because you don't know how far to
advance.
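A small sketch makes the contrast concrete. The record layout below is hypothetical, not any real Office format: the first parser must abort at an undocumented tag, while a self-describing tag-length-value layout lets a reader skip what it does not understand.

    # Hypothetical format: tag 0x01 means "on", 0x00 means "off"; neither carries a payload.
    KNOWN_PAYLOAD_SIZES = {0x01: 0, 0x00: 0}

    def parse_opaque(stream: bytes):
        i = 0
        while i < len(stream):
            tag = stream[i]
            if tag not in KNOWN_PAYLOAD_SIZES:
                # We don't know how many bytes belong to this record,
                # so we cannot even skip it; all we can do is give up.
                raise ValueError(f"unknown tag 0x{tag:02x}, cannot continue")
            i += 1 + KNOWN_PAYLOAD_SIZES[tag]
            print(f"record 0x{tag:02x}")

    def parse_self_describing(stream: bytes):
        # A (tag, length, value) layout: unknown tags can simply be skipped.
        i = 0
        while i + 2 <= len(stream):
            tag, length = stream[i], stream[i + 1]
            print(f"record 0x{tag:02x} with {length} payload byte(s)")
            i += 2 + length

    parse_self_describing(bytes([0x01, 0x00, 0x06, 0x02, 0xAA, 0xBB, 0x00, 0x00]))
    parse_opaque(bytes([0x01, 0x00, 0x06]))  # raises on the undocumented 0x06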
Microsoft did not create binary formats to lock out other vendors.
The formats of all the word processors were binary for many years, because binary is efficient and because a better solution hadn't been invented.
The Office binary formats were not documented for many years,
and the license agreement for the documentation today says that
you can only use the information for products that complement
Microsoft Office. Is "supplant" the same thing as "complement"?!
Microsoft is not particularly interested in building an open standard because it will never perfectly represent their features, and
because an open standard makes it easy to switch tools. Right now
everyone buys Office because that is what you need to read the documents you receive today. The adoption of an open format for productivity tools is a mortal threat to Microsoft's Office profit margin.
Regrettably, there is a battle going on in the XML office document
space. Microsoft has for many years ignored and then resisted the
ISO standard called OpenDocument Format (ODF), and now they
have created their own competing standard called Office OpenXML
(OOXML). However, the whole point of a standard is to not have two
of them.
XML provides the structure for your files and guarantees that
applications should be able to parse everything, even parts they
don't understand, without crashing. Given that baseline, it should be
possible to create a format that can represent the features of office
productivity tools. Microsoft's OOXML specification, which provides
100% compatibility with Microsoft Office, is 6,000 pages, while the
ODF specification is only 1,000 pages because it re-uses many existing standards like SVG, SMIL, MathML and XForms.7
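A short sketch of that guarantee: an XML consumer can walk a document and simply skip elements it does not recognize. The element names here are made up, not actual ODF markup:

    import xml.etree.ElementTree as ET

    doc = """<document>
      <paragraph>Hello</paragraph>
      <hologram depth="3d">a feature from the future</hologram>
      <paragraph>World</paragraph>
    </document>"""

    known_tags = {"paragraph"}
    for element in ET.fromstring(doc):
        if element.tag in known_tags:
            print("text:", element.text)
        else:
            print("skipping unknown element:", element.tag)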
OpenXML is also filled with legacy bloat. At the top of 600 pages
of the VML specification is the following text:
Note: The VML format is a legacy format originally introduced
with Office 2000 and is included and fully defined in this Standard for backwards compatibility reasons. The DrawingML format is a newer and richer format created with the goal of eventually replacing any uses of VML in the Office Open XML
formats. VML should be considered a deprecated format
included in Office Open XML for legacy reasons only and new
applications that need a file format for drawings are strongly
encouraged to use preferentially DrawingML.
their open-mindedness. For example, I was amazed, and slightly dismayed, to find references inside the spec to DDE, an obscure and
now mostly dead Microsoft-only technology. However, this technology became a part of the Microsoft OLE moniker format, which
specified how documents would embed portions of spreadsheets,
and became an important part of Microsoft's documents, thus ODF
supports it.
A robust, standard, self-describing file format for productivity
tools will allow people to archive their documents, confident that the
format will be readable many years into the future.8 In addition, like
everything we build in computers, standards can become platforms
for other standards. When ODF incorporates scenarios to encrypt
and digitally sign documents, to notarize and transmit legal documents, and support cross-company workflow, e-forms and e-government, it could become a lingua franca in a way that Microsoft
Word's DOC and PDF combined have never been.
Competing standards is a misnomer in my opinion, so perhaps it
would be best if everyone were to adopt ODF. Sun is building extensions to Office to support this format, although if people start using
the free OpenOffice, which uses this format as its native format,
there is no real need for Office. This book was written using OpenOffice, and while the application is far from perfect, it is far beyond
good enough for most users.
The state of Massachusetts was forward-looking in nearly adopting ODF, but after tons of lobbying by Microsoft, it reversed course
and now endorses either OpenXML or ODF. They either caved on the
idea of creating a standard, or they didn't really understand the
issue. Microsoft has attempted to confuse many on the importance
The tricky thing about building a standard is that new requirements can cause
ripples throughout the design of a system. Imagine two people sitting in different
countries collaborating on one file. The challenge is that the ODF file format was
not made to be incremental; it was meant to represent an entire document. Do
you send a new copy of the document over every time the user makes a change?
This is very inefficient and yet doesn't tell the user what changed. Another
solution is to use the undo stack, but it doesn't look like OpenDocument stores an
undo stack with the document. They could just send around XML diffs, but the
XML is not usually the in-memory representation of a document, in which case
XML isn't easily usable! One solution is to have an object model (with functions like CreateTable) on top of the file format, whose operations can be sent between computers, but the OpenDocument committee has not attacked this yet.
I look forward to seeing how they solve this problem, or whether they decide that
while it is a doable feature, it is too hard and outside the scope of the
OpenDocument standard. Software is infinitely malleable, but that doesn't mean
you'll like what requirements have forced upon you!
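To picture the object-model idea in the footnote above, here is a hypothetical sketch in Python; the Document class and the operation names (InsertText, CreateTable) are invented for illustration and are not part of any OpenDocument specification:

import json

class Document:
    def __init__(self):
        self.paragraphs = []
        self.tables = []

    def apply(self, op):
        # Replay one editing operation against the local copy of the document.
        if op["type"] == "InsertText":
            self.paragraphs.append(op["text"])
        elif op["type"] == "CreateTable":
            self.tables.append({"rows": op["rows"], "cols": op["cols"]})

# Each collaborator sends small, serializable operations over the wire...
ops = [
    {"type": "InsertText", "text": "Quarterly results"},
    {"type": "CreateTable", "rows": 4, "cols": 3},
]
wire = json.dumps(ops)              # a few dozen bytes, not the entire document

# ...and the other side replays them to stay in sync.
doc = Document()
for op in json.loads(wire):
    doc.apply(op)
print(len(doc.paragraphs), "paragraphs,", len(doc.tables), "table")

The point is only that a stream of small operations is cheap to send and also tells the other side exactly what changed.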
Web
The first message ever to be sent over the ARPANET occurred
at 10:30 PM on October 29, 1969. The message itself was
simply the word "login." The "l" and the "o" transmitted without problem, but then the system crashed. Hence, the first message on the ARPANET was "lo." They were able to do the
full login about one hour later.
Wikipedia article on Internet precursor ARPANET
One could write an entire book about the web, but I did want to
include some ideas here.
From a technical perspective, HTML has always been lacking as a
text-formatting standard. It is important for two reasons:
1. It is widely used: it's famous because it's famous.
2. It has an easy deployment model.
HTML was not designed in a rigorous way by text processing
experts and so it never had the sort of awe-inspiring respect that
Donald Knuth's TeX typesetting system has had. HTML was mature
before it even added support for the concept of a page, which is why
printing does not work well yet. Text processing is a hard problem,
but not incorporating basic features like styles for seven years
demonstrates that those guys were in over their heads and they
shouldn't have reinvented the wheel.
In addition, the uptake of the JavaScript language outside of the
web has been minimal. There is nothing web-specific about the language, and so it didn't even need to be created. It has just made the
programming language tower of Babel situation worse.
In spite of its limitations, HTML today is the best cross-platform
widget set and is supported by a vast array of tools, so enterprises
should be using it for as many corporate applications as possible. A
Boeing airplane could even have its entire cockpit UI be a web site.
You can build something simple, reliable and pretty enough if you
pick the right subset of HTML. Right now an airplane has a mess of
buttons and knobs because each subsystem of the airplane has its
own set; the guys who build the flaps don't want to share any buttons with those who control the overhead lights.
The web today is still far from being an appropriate tool for building rich applications, and nearly everything about the web is harder
than building an equivalent functionality in a rich client application.
Google has recently announced the Chrome operating system in a
vision where all the apps run on the web, but even they have created a number of applications that presumably couldn't have been
built using HTML, such as Google Earth, Picasa, and Google Desktop. Google Docs is an impressive web-based engineering effort, but
it is slow, clumsy, feature-limited, doesn't work offline, has its own
authentication mechanism, is hamstrung by the web's limited printing capabilities, and poses no threat to Microsoft's Office business
any time soon. The most interesting capability of Google Docs is its
support for real-time collaboration, but you don't need to re-build an
entire application in HTML and Javascript to add this relatively
small feature.9
The missing features of HTML are reasons why Adobe's Flash has
become so popular. Flash started off as a failed client-based programming environment, but had a rebirth as a web plugin as a way
to fix limitations in HTML.
Adobe Flash
Flash is a primitive garbage-collected (GC) runtime whose primary advantage is that it is cross-platform like the web. Its programming language is known as ActionScript, which is based on the ECMAScript standard, and which is similar to but incompatible with JavaScript and JScript. (It is a mess.)
Flash is interpreted, somewhat buggy and not a standard, nor is it
available as free code. We should minimize the use of Flash because
it is a big black box to the web server, web browser, search engines,
and all existing HTML tools for building and managing websites.
Flash also doesn't enforce UI standards. Every Flash site works and looks different, and sometimes I can't even tell what is a clickable button! Creators of the website, who myopically live in their application, don't notice it, but users visiting the website for the first time do.
9 In fact, this could be a feature of the operating system that would let you share an application with any number of other people.
Many non-technical people hire programmers to build their websites in Flash because of the UI candy it allows, but then
they don't update their site for years because it would require hiring
a programmer again. It is possible to build pretty, interactive web
pages using only HTML and Javascript, as modern mapping websites have demonstrated. If you want a pretty website, use pretty
pictures!
Limiting Flash to specific portions of specific pages, as YouTube
does with its video player, is quite reasonable given certain limitations of the web and the mess of video standards, but building whole
websites in Flash is a mistake and a threat to the web.
Web Etc.
The web needs more personalized content. I live in Seattle, and I
am a huge fan of the Seahawks football team, but this is the news
10 Even if they could call into them, they couldn't install them if they weren't already
on your computer.
Can you read the text on this box, like you could in a real store?
make it look prettier than a web page, that would be very easy to do.
A French company called Free is pioneering this, using free software as part of this effort. (A cable provider is a good example of a
company which doesn't want to maintain a bunch of proprietary
software.)
Everyone who produces television shows should also publish an XML schema with information such as who the guests are.
This extra information allows for a more personalized experience
similar to what I get with the web: I could set my cable box to
record whenever comedian Dennis Miller is on any channel. If a
show runs over, which often happens with sports, the cable box
would be smart enough to not stop recording; it needs just a tiny bit of
extra information to do this. Interactive TV is simply waiting for
someone to create, and everyone to get behind, two simple standards, each of which would be less than 50 pages if they re-used
HTML and XML. (It would also be nice to be able to watch the Seahawks no matter what city I live in. Comcast only offers me 4 football games per week. People complain: "There are 500 channels with nothing on." In truth, we aren't to that point yet!)
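To illustrate the kind of per-show metadata I have in mind, here is a hypothetical sketch in Python; every tag and attribute name is invented, and no broadcaster publishes this exact format:

import xml.etree.ElementTree as ET

# Build a made-up listing entry for one show.
show = ET.Element("show", {"title": "Tonight's Talk", "channel": "5"})
ET.SubElement(show, "guest").text = "Dennis Miller"
ET.SubElement(show, "scheduled-start").text = "2009-11-07T21:00:00"
ET.SubElement(show, "may-run-over").text = "true"   # a hint for the recorder
listing = ET.tostring(show, encoding="unicode")

# A cable box could scan listings like this and record any show featuring a
# particular guest, on any channel, and keep recording if the show runs over.
parsed = ET.fromstring(listing)
guests = [g.text for g in parsed.findall("guest")]
if "Dennis Miller" in guests:
    print("Recording", parsed.get("title"), "on channel", parsed.get("channel"))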
Hardware
Lots of people worry about running out of Internet bandwidth, but
they are just being nattering nabobs of negativism. Nippon Telegraph and Telephone of Japan demonstrated sending 14 trillion bits per second down a single strand of fiber, or about 2,660 music CDs in one second. We are not running up against the limits of the laws of
physics yet!
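The arithmetic behind that comparison is easy to check; the figure of roughly 650 MB per music CD below is my own assumption, not a number from NTT:

# Back-of-the-envelope check of the CD comparison.
bits_per_second = 14e12                   # 14 trillion bits per second
bytes_per_cd = 650 * 1024 * 1024          # assume roughly 650 MB per music CD
cds_per_second = bits_per_second / 8 / bytes_per_cd
print(round(cds_per_second))              # roughly 2,600 CDs every second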
Metering data sounds logical, although it would seem that the distance would provide a better measure. However, I counted the number of router hops required to get data from my home in Seattle to
various destinations:
Seattle to:                        Router hops
www.tmobile.de (Germany)           30
www.latviatourism.lv (Latvia)      17
www.google.co.uk (UK)              15
www.gws.com.tw (Taiwan)            23
DA FUTURE
Phase II of Bill Gates' Career
The royalty paid to us, minus our expenses of the manual, the
tape and the overhead make Microsoft a break-even operation
today.
Bill Gates, Open Letter to Hobbyists, 1976
In June 2006, we learned that Bill Gates would step down from
Microsoft in June 2008. This gave plenty of warning for the markets, but it also means that Steve Ballmer will remain as CEO
for a decade or more, the rumors of his demise having been greatly
exaggerated.
One can presume that Bill Gates doesn't believe his legacy is furthered by spending any more time at Microsoft. Given the twin
threats of free software and Google, history may judge Bill Gates
more as an Andrew Carnegie than a Michelangelo.
Microsoft succeeded because it was the company that exploited
Metcalfe's law to its greatest advantage. Microsoft got everyone
Why is Bill not smiling? Perhaps because the Xbox 360 is a PC minus a keyboard, web browser and lots of other software.
SpaceShipOne was the first privately funded aircraft to go into space, and it
set a number of important firsts, including being the first privately funded
aircraft to exceed Mach 2 and Mach 3, the first privately funded manned
spacecraft to exceed 100km altitude, and the first privately funded reusable
spacecraft. The project is estimated to have cost $25 million and was built by 25 people. It now hangs in the Smithsonian because it serves no commercial purpose, and because getting into space has never been the challenge; it has always been the expense.
In the 21st century, more cooperation, better software, and nanotechnology will bring profound benefits to our world, and we will
put the Baby Boomers to shame. I focus only on information technology in this book, but materials sciences will be one of the biggest
tasks occupying our minds in the 21st century and many futurists say
that nanotech is the next (and last?) big challenge after infotech.
I'd like to end this book with one more big idea: how we can
jump-start the nanotechnology revolution and use it to colonize
space. Space, perhaps more than any other endeavor, has the ability
to harness our imagination and give everyone hope for the future.
When man is exploring new horizons, there is a swagger in his step.
Colonizing space will change man's perspective. Hoarding is a
very natural instinct. If you give a well-fed dog a bone, he will bury
it to save it for a leaner day. Every animal hoards. Humans hoard
money, jewelry, clothes, friends, art, credit, books, music, movies,
stamps, beer bottles, baseball statistics, etc. We become very
attached to these hoards. Whether fighting over $5,000 or
$5,000,000, the emotions have the exact same intensity.
When we feel crammed onto this pale blue dot, we forget that any
resource we could possibly want is out there in incomparably big
numbers. If we allocate merely the resources of our solar system to
all 6 billion people equally, then this is what we each get:
Resource                 Amount
Hydrogen
Iron
Oxygen                   34 billion tons
Carbon                   34 billion tons
Energy production
The Europeans aren't providing great leadership either. One of the big investments of their space agencies, besides the ISS, is to build a duplicate GPS satellite constellation, which they are doing primarily because of anti-Americanism! Too bad they don't realize that their emotions are causing them to re-implement 35-year-old technology, instead of spending that $5 billion on a truly new advancement. Cloning GPS in 2013: quite an achievement, Europe!
A NASA depiction of the space elevator. A space elevator will make it hundreds of times cheaper to put a pound into space. It is an efficiency difference comparable to that between the horse and the locomotive.
One of the best ways to cheaply get back into space is kicking
around NASA's research labs:
Scale picture of the space elevator relative to the size of Earth. The moon is
30 Earth-diameters away, but once you are at GEO, it requires relatively little energy to get to the moon, or anywhere else.
Carbon Nanotubes
Why?
William Bradford, speaking in 1630 of the founding of the Plymouth Bay Colony, said that all great and honorable actions are
accompanied with great difficulties, and both must be enterprised and overcome with answerable courage.
There is no strife, no prejudice, no national conflict in outer
space as yet. Its hazards are hostile to us all. Its conquest
deserves the best of all mankind, and its opportunity for peaceful cooperation may never come again. But why, some say, the
moon? Why choose this as our goal? And they may well ask why
climb the highest mountain? Why, 35 years ago, fly the
Atlantic? Why does Rice play Texas?
We choose to go to the moon. We choose to go to the moon in
this decade and do the other things, not because they are easy,
but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because
that challenge is one that we are willing to accept, one we are
unwilling to postpone, and one which we intend to win, and the
others, too.
It is for these reasons that I regard the decision last year to
shift our efforts in space from low to high gear as among the
most important decisions that will be made during my incumbency in the office of the Presidency.
In the last 24 hours we have seen facilities now being created
for the greatest and most complex exploration in man's history.
We have felt the ground shake and the air shattered by the testing of a Saturn C-1 booster rocket, many times as powerful as
the Atlas which launched John Glenn, generating power equivalent to 10,000 automobiles with their accelerators on the floor.
We have seen the site where five F-1 rocket engines, each one
as powerful as all eight engines of the Saturn combined, will be
clustered together to make the advanced Saturn missile, assembled in a new building to be built at Cape Canaveral as tall as a
48 story structure, as wide as a city block, and as long as two
lengths of this field.
The growth of our science and education will be enriched by
new knowledge of our universe and environment, by new techniques of learning and mapping and observation, by new tools
and computers for industry, medicine, the home as well as the
school.
I do not say that we should or will go unprotected against the
hostile misuse of space any more than we go unprotected
against the hostile use of land or sea, but I do say that space
can be explored and mastered without feeding the fires of war,
without repeating the mistakes that man has made in extending
his writ around this globe of ours.
We have given this program a high national priority even
though I realize that this is in some measure an act of faith and
vision, for we do not now know what benefits await us. But if I
were to say, my fellow citizens, that we shall send to the moon,
240,000 miles away from the control station in Houston, a giant
rocket more than 300 feet tall, the length of this football field,
made of new metal alloys, some of which have not yet been
invented, capable of standing heat and stresses several times
more than have ever been experienced, fitted together with a
precision better than the finest watch, carrying all the equipment needed for propulsion, guidance, control, communications, food and survival, on an untried mission, to an unknown
celestial body, and then return it safely to earth, re-entering the
atmosphere at speeds of over 25,000 miles per hour, causing
heat about half that of the temperature of the sun, almost as hot as it is here today, and do all this, and do it right, and do it first before this decade is out, then we must be bold.
John F. Kennedy, September 12, 1962
Lunar Lander at the top of a rocket. Rockets are expensive and impose significant design constraints on space-faring cargo.
make it $10 per pound to put something into space. This will open
many doors for scientists and engineers around the globe: bigger
and better observatories, a spaceport at GEO, and so forth.
Surprisingly, one of the biggest incentives for space exploration is
likely to be tourism. From Hawaii to Africa to Las Vegas, the primary
revenue in many exotic places is tourism. We will go to the stars
because man is driven to explore and see new things.
Space is an extremely harsh place, which is why it is such a miracle that there is life on Earth to begin with. The moon is too small to
have an atmosphere, but we can terraform Mars to create one, and
make it safe from radiation and pleasant to visit. This will also teach
us a lot about climate change, and in fact, until we have terraformed
Mars, I am going to assume the global warming alarmists don't
really know what they are talking about yet.2 One of the lessons in
engineering is that you don't know how something works until
you've done it once.
Terraforming Mars may sound like a silly idea today, but it is simply another engineering task.3 I worked in several different groups
at Microsoft, and even though the algorithms surrounding databases are completely different from those for text engines, they
are all engineering problems and the approach is the same: break a
problem down and analyze each piece. (One of the interesting
lessons I learned at Microsoft was the difference between real life
and standardized tests. In a standardized test, if a question looks
hard, you should skip it and move on so as not to waste precious
time. At Microsoft, we would skip past the easy problems and focus
our time on the hard ones.)
Engineering teaches you that there are an infinite number of
ways to attack a problem, each with various trade-offs; it might take
1,000 years to terraform Mars if we were to send one ton of material, but only 20 years if we could send 1,000 tons of material. Whatever we finally end up doing, the first humans to visit Mars will be
happy that we turned it green for them. This is another way our generation can make its mark.
2 Carbon is not a pollutant and is valuable. It is 18% of the mass of the human body, but only .03% of the mass of the Earth. If carbon were more widespread, diamonds would be cheaper. Driving very fast cars is the best way to unlock the carbon we need. Anyone who thinks we are running out of energy doesn't understand the algebra in E = mc².
Mars' moon, Phobos, is only 3,700 miles above Mars, and if we create an atmosphere, it will slow down and crash. We will need to find a place to crash the fragments; I suggest one of the largest canyons we can find. We could put them next to a cross dipped in urine and call it the largest man-made art.
There are many interesting details surrounding a space elevator, and for those interested, I recommend The
Space Elevator, co-authored by Brad Edwards.
The size of the first elevator is one of the biggest questions to resolve.
If you were going to lay fiber optic cables across the Atlantic ocean,
you'd set aside a ton of bandwidth capacity. Likewise, the most
important metric for our first space elevator is its size.
The one other limitation with current designs is that they assume
climbers which travel hundreds of miles per hour. This is a fine
speed for cargo, but it means that it will take days to get into orbit.
If we want to send humans into space in an elevator, we need to
build climbers which can travel at 10,000 miles per hour. While this
seems ridiculously fast, if you accelerate to this speed over a period
of minutes, it will not be jarring. Perhaps this should be the challenge for version two if they can't get it done the first time.
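A rough calculation backs up the claim that such a ramp-up would be gentle; the ten-minute figure below is my own assumption rather than a number from any climber design:

# Back-of-the-envelope: accelerating to 10,000 miles per hour over ten minutes.
top_speed_ms = 10_000 * 0.44704             # 10,000 mph in meters per second (~4,470 m/s)
ramp_seconds = 10 * 60                      # assume a ten-minute ramp-up
acceleration = top_speed_ms / ramp_seconds  # about 7.5 m/s^2
print(round(acceleration / 9.81, 2), "g")   # about 0.76 g, gentler than hard braking in a car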
The conventional wisdom amongst those who think it is even possible is that it will take between 20 and 50 years to build a space
elevator. However, anyone who makes such predictions doesn't
understand that engineering is a fungible commodity. Two people
will, in general, accomplish something twice as fast as one person.4
How can you say something will unequivocally take a certain
amount of time when you don't specify how many resources it will
require or how many people you plan to assign to the task?
Efficiency drops as teams get larger, but with the Internet, a
superb tool of communication and collaboration, thousands of people can work together efficiently. Manufacturers are using the Internet to reinvent how they interact with their suppliers during the
design process, shaving years off the design. Boeing's newest 787
Dreamliner airplane will go from project start to take-off in just
four years, which is half the time it took them to design the 777.
(Why Boeing offered their 30-year-old 767 instead of the 787 to the
US Military, as the workhorse of their new refueling fleet, was an
insanely backward-looking mistake. If Boeing doesn't think their
improvements are worthwhile, why are they making them?)
Furthermore, predictions are usually way off. If you asked someone how long it would take unpaid volunteers to make Wikipedia as
big as the Encyclopedia Britannica, no one would have guessed the
correct answer of two and a half years. From creating a space elevator to world domination by Linux, anything can happen in far less
time than we think is possible if everyone simply steps up to play
their part. The way to be a part of the future is to invent it, by
unleashing our scientific and creative energy towards big, shared
goals. Wikipedia, as our encyclopedia, was an inspiration to millions
of people, and so the resources have come piling in. The way to get
help is to create a vision that inspires people.
In a period of 75 years, man went from using horses and wagons
to landing on the moon. Why should it take 30 years to build something that is 99% doable today?
Many of the components of a space elevator are simple enough
that college kids are building prototype elevators in their free time.
The Elevator:2010 contest is sponsored by NASA, but while these
Fred Brooks' The Mythical Man-Month argues that adding engineers late to a
project makes a project later, but ramp-up time is just noise in the management of
an engineering project. Also, wikis, search engines, and other technologies
invented since his book have lowered the overhead of collaboration.
Perhaps the Europeans could build the station at GEO. Russia could build the
shuttle craft to move cargo between the space elevator and the moon. The Middle
East could provide an electrical grid for the moon. China could take on the problem of cleaning up the orbital space debris and build the first moon base. Africa
could attack the problem of terraforming Mars, etc.
As I see it, I still live in the 20th century. History will remember
the 21st century as the time when man entered a new Renaissance,
and when this was not just a photo shoot.
If we could replace journalists with robots, the issue of media bias would
disappear, and they would become better looking.
Even in our still-primitive world of today, I would rather be making $30,000 a year than be promised $100 million if I walked
through a door back into 1986, and I say that not just because of the
big hair. The Internet was still seven years away from its first web
page in 1986. We might not be able to go back in time, but in general we wouldn't want to.
Many people don't appreciate how much faster the world is moving every day with the creation of the Internet and other modern
This is our scary future, and unfortunately mankind is crawling towards it.6
From http://lifeboat.com/ex/warning.signs.for.tomorrow, created by Anders Sandberg. I sent multiple e-mails to various e-mail addresses at the Lifeboat Foundation to try to get permission to use these images, but I never received a response.
Perhaps they are too busy with their mission of encouraging scientific advancements to help humanity survive existential risks to respond. So I just donated
$200 for their use here :-)
We go forward with complete confidence in the eventual triumph of freedom. Not because history runs on the wheels of
inevitability; it is human choices that move events. Not because
we consider ourselves a chosen nation; God moves and chooses
as He wills. We have confidence because freedom is the permanent hope of mankind.
When our Founders declared a new order of the ages; when soldiers died in wave upon wave for a union based on liberty;
when citizens marched in peaceful outrage under the banner
"Freedom Now," they were acting on an ancient hope that is
meant to be fulfilled. History has an ebb and flow of justice, but
history also has a visible direction, set by liberty and the Author
of Liberty.
US Presidential Inaugural Address, 2005
Genius is eternal patience.
The greater danger for most of us lies not in setting our aim too
high and falling short; but in setting our aim too low, and
achieving our mark.
The true work of art is but a shadow of the Divine perfection.
Michelangelo
Most of the things worth doing in the world had been declared
impossible before they were done.
Louis Brandeis, US Supreme Court Justice
We have only one alternative: either to build a functioning
industrial society or see freedom itself disappear in anarchy
and tyranny.
Peter Drucker
The mind is not a vessel to be filled but a fire to be kindled.
Plutarch
AFTERWORD
US v. Microsoft
Microsoft got its tush handed to it in the DOJ trial, but that was
because it lost credibility. For example, Bill Gates argued that he
wasn't worried about Netscape. If so, why did Microsoft VIPs say
they wanted to smother Netscape and "cut off their air supply"? Judge Jackson wrote that Microsoft's witnesses in the trial "proved, time and time again, to be inaccurate, misleading, evasive, and transparently false." However, what they say at a trial has nothing
to do with how they behaved in the marketplace.
Fisher, the government's economist, argued that it was "a joke" that Windows was challenged by the Macintosh or Linux. With just a
few years hindsight, and the tremendous potential of free software,
it becomes very clear that their economist was wrong. Furthermore,
in the trial he admitted that he could find no specific harm that
Microsoft had done. Microsoft might be a hard-charging competitor,
but the software they wrote has been invaluable to the world, and
they provided it at a cost much lower than many of their competitors
like Sun, IBM and Oracle.
In addition to the Mac and Linux, the web was and is a huge
threat to Microsoft: the PC development community used to
revolve around Windows apps, but now it is all about web apps.
The government's major argument was that bundling a web
browser into the operating system was illegal. However, including a
browser with an operating system is a good idea for consumers.
Without a web browser, you wouldn't be able to surf the web or send
I wrote this book under the assumption that if Microsoft continues on its current path and never adopts copyleft licenses for its
code, what Wikipedia did to Encarta will likewise happen to Windows, Office, Internet Explorer, SQL Server, Visual Studio,
Exchange, MSN, etc.
A tiny fraction of the software Microsoft produces runs on Linux,
so Linux world domination means the end of Microsoft as we know
it. Free software will force Microsoft to choose between licensing
revenue and relevance.
A lot of people that I meet in the software industry hate Microsoft, but have valued its leadership. Many people recognize that
Office is more polished than OpenOffice and pioneered many innovations. Internet Explorer was universally recognized for a significant period of time as a better browser than Netscape's. Microsoft's
SQL Server is considered easier to use than Oracle's. Word beat Ami
Pro and WordPerfect in the reviews.
Let us presume that Microsoft were to embrace the proposition
that free software is a good idea. There would be no reason to call Microsoft evil anymore, because copyleft would prevent it. When something is free, no one can control or monopolize it anymore. In adopting free software, Microsoft would have two possibilities available to it; however, both involve a dramatic change in what its employees do, and a significant drop in revenue.
Microsoft could release all of its code as GPL, learn to build this
software as a globally distributed effort, and compete and better
interoperate with other existing free software. I think this would be
a mistake because Microsoft's codebases are too old, much bigger
and more complicated than existing free software codebases, and
the details today are not understood by the existing globally distributed software community.
One advantage of buying Novell is that Microsoft would get Mono, the free .Net
runtime, and it could use it as a base to build the next-generation programming
language, supplanting Sun. However, this would eventually involve merging it
with their own .Net codebase.
For example, SQL Server has almost an entire OS inside it with its own memory
manager, cache manager, file system, synchronization mechanisms, threads, interprocess communication, code loading, security subsystem, etc. and it still has
multithreading bugs. Huge portions of .Net are written in C++ because they
thought it would be a few percent faster, but this also has added complexity and
bugs and slows progress.
Eben Moglen says that one of the goals of the free software movement is to expand opportunities for billions more people out there
to quit throwing away most of the brains on earth. This book is
about free software, but I'd like to end this afterword with a few
ideas on a free press, free markets and several other issues. If it
weren't for our scientists and engineers, we'd still be picking our
noses in caves, and the lawyers in divorce proceedings would be fervently arguing about how to divide up the rocks. Our scientists can
tell us that one pound of Uranium generates the same amount of
first plant that had been built under the newest and safest regulations, and it took them years. It cost us well over a billion
dollars, and they just wrote it off.
Jim, Shoreham Nuclear Regulatory Commission Project Manager
The Alvin W. Vogtle Nuclear Generating Station in Burke
County, Georgia, has two 1,200-MW reactors sitting on the
Savannah River, directly across from the federal nuclear processing facilities in South Carolina. Now Southern Nuclear,
which owns Vogtle, wants to build two new 1,000-MW reactors
as part of the nuclear renaissance.
Environmental groups have immediately taken up the challenge, arguing that dredging the Savannah River to allow barge
delivery of reactor parts will damage the river. The Savannah
was dredged regularly for more than a century until the Army
Corps of Engineers gave up in 1980 because nothing much was
happening on the river. Now environmental groups say a
renewal will ruin the environment. The Nuclear Regulatory
Commission has nodded agreement and will require an environmental impact statement before early site clearance can begin.
That will probably add three years to the project.
William Tucker
An interesting article explaining how nuclear power got killed in the United
States is located at http://tinyurl.com/WhoKilledNuclearPower.
Every step that you take both solves today's problems, and sets
you up to take on new problems. At Microsoft, we had a phrase
"Crawl, Walk, Run," and it applies to other sectors as well as software:
Make TVs -> Make robots which make TVs -> Make robots
which make your bed
99 cent plastic toy -> materials sciences for toys -> materials
sciences changing all building materials
Food production -> genetically modified food -> genetically
modified everything
Free Markets
Congress creates the problem, blames the free market, and uses the crisis
as an excuse to create more government.
A man's admiration for absolute government is proportionate to
the contempt he feels for those around him.
Alexis de Tocqueville
Nobody spends somebody else's money as wisely as he spends
his own.
A major source of objection to a free economy is precisely that
it gives people what they want instead of what a particular
group thinks they ought to want. Underlying most arguments
against the free market is a lack of belief in freedom itself.
Everybody agrees that socialism has been a failure. Everybody
agrees that capitalism has been a success...yet everybody is
extending socialism!
Spending by government in 1989 amounts to about 45 percent
of national income. By that test, government owns 45 percent
of the means of production that produce the national income.
The U.S. is now 45 percent socialist.
Milton Friedman
I predict future happiness for Americans if they can prevent the
government from wasting the labors of the people under the
pretense of taking care of them.
Thomas Jefferson
While recovering from an emergency ruptured appendix operation in a Libyan state-run, universal health-care clinic in Tripoli,
an exasperated doctor lamented to me that the fellow who was
mopping the floor beside the bed by fiat made exactly what he
did.
Victor Davis Hanson
If I were designing a health care system from scratch, I would
probably go ahead with a single-payer system.
Barack Obama
So you think that money is the root of all evil? Have you ever
asked what is the root of money? Money is a tool of exchange,
which can't exist unless there are goods produced and men
able to produce them. Money is the material shape of the principle that men who wish to deal with one another must deal by
trade and give value for value.
Ayn Rand, Atlas Shrugged
Where freedom is real, equality is the passion of the masses.
Where equality is real, freedom is the passion of a small minority.
Eric Hoffer, American philosopher
ask their boss for some extra money so they can develop a new idea
to make more money. In fact, working for such organizations must
drain the humanity from its workers.
Many think that the free market is evil because the spoils are
unequally distributed. What they are missing is something that Milton Friedman wrote:
Industrial progress, mechanical improvement, all of the great wonders of the modern era have meant relatively little to the wealthy. The rich in Ancient Greece would have benefited hardly at all from modern plumbing: running servants replaced running water. Television and radio: the Patricians of Rome could enjoy the leading musicians and actors in their home, could have the leading actors as domestic retainers. Ready-to-wear clothing, supermarkets: all these and many other modern
developments would have added little to their life. The great
achievements of Western Capitalism have redounded primarily
to the benefit of the ordinary person. These achievements have
made available to the masses conveniences and amenities that
were previously the exclusive prerogative of the rich and powerful.
The Legislature
The curious task of economics is to demonstrate to men how little they really know about what they imagine they can design.
Before the obvious economic failure of Eastern European
socialism, it was widely thought that a centrally planned economy would deliver not only social justice but also a more efficient use of economic resources. This notion appears eminently
sensible at first glance. But it proves to overlook the fact that
the totality of resources that one could employ in such a plan is
simply not knowable to anybody, and therefore can hardly be
centrally controlled.
Friedrich Hayek, The Fatal Conceit
Knowledge is one of the scarcest of all resources in any economy. Even when leaders have much more knowledge and
insight than the average member of the society, they are
unlikely to have nearly as much knowledge and insight as exists
scattered among millions of people subject to their governance.
Thomas Sowell, Basic Economics
To provide for us in our necessities is not in the power of Government. It would be a vain presumption in statesmen to think
they can do it.
Edmund Burke, 1795
Of all tyrannies, a tyranny sincerely exercised for the good of
its victims may be the most oppressive. It would be better to
live under robber barons than under omnipotent moral busybodies. The robber baron's cruelty may sometimes sleep, his
cupidity may at some point be satiated; but those who torment
us for our own good will torment us without end for they do so
with the approval of their own conscience.
C.S. Lewis
Suppose you were an idiot. And suppose you were a member of
Congress. But I repeat myself.
Mark Twain
A nation trying to tax itself into prosperity is like a man standing in a bucket and trying to lift himself up by the handle.
Winston Churchill
A government big enough to give you everything you want, is
strong enough to take everything you have. My reading of history convinces me that most bad government results from too
much government.
Thomas Jefferson
The government solution to a problem is usually as bad as the
problem.
Milton Friedman
We don't have a trillion-dollar debt because we haven't taxed
enough; we have a trillion-dollar debt because we spend too
much.
Ronald Reagan, 1989
No government ever voluntarily reduces itself in size. Government programs, once launched, never disappear. Actually, a
government bureau is the nearest thing to eternal life we'll ever
see on this Earth.
A young man, 21 years of age, working at an average salary: his Social Security contribution would, in the open market, buy him an insurance policy that would guarantee $220 a month at age 65. The government promises $127. Now are we
so lacking in business sense that we can't put this program on a
sound basis? Barry Goldwater thinks we can.
Ronald Reagan, 1964
Only a crisis, real or perceived, produces real change. When
that crisis occurs, the actions that are taken depend on the
ideas that are lying around. That, I believe, is our basic function: to develop alternatives to existing policies, to keep them
alive and available until the politically impossible becomes
politically inevitable.
Milton Friedman
Unfortunately, subtle biases in how conservative students and
professors are treated in the classroom and in the job market
have very unsubtle effects on the ideological makeup of the professoriate. The resulting lack of intellectual diversity harms
academia by limiting the questions academics ask, the phenomena we study, and ultimately the conclusions we reach.
Robert Maranto, Associate Professor of Political Science, Villanova University
Sometime in the 1960s, higher education abandoned their role as advocates of American values, critical advocates who tried to advance freedom and equality further than Americans had yet succeeded in doing, and took on the role of adversaries of society.
English departments have been packed by deconstructionists
who insist that Shakespeare is no better than rap music, and
history departments with multiculturalists who insist that all
societies are morally equal except our own, which is morally
inferior.
This regnant campus culture helps to explain why Columbia
University, which bars ROTC from campus on the ground that
the military bars open homosexuals from service, welcomed
Iran's president Mahmoud Ahmadinejad, whose government
publicly executes homosexuals.
What it doesn't explain is why the rest of society is willing to
support such institutions by paying huge tuitions, providing tax
exemptions and making generous gifts.
Michael Barone, American political analyst
Thirty years from now the big university campuses will be
relics. Universities won't survive. It's as large a change as when
we first got the printed book. Do you realize that the cost of
higher education has risen as fast as the cost of health care?
And for the middle-class family, college education for their children is as much of a necessity as is medical care; without it the
kids have no future. Such totally uncontrollable expenditures,
without any visible improvement in either the content or the
quality of education, means that the system is rapidly becoming
untenable. Higher education is in deep crisis.
Peter Drucker
I would rather be governed by the first 2,000 names in the Boston telephone directory than by the 2,000 members of the Harvard faculty.
William F. Buckley Jr., Rumbles Left and Right, 1963
Free Press
Promote then as an object of primary importance, institutions
for the general diffusion of knowledge. In proportion as the
structure of a government gives force to public opinion, it is
essential that public opinion should be enlightened.
George Washington Farewell Address, 1796
Journalism naturally draws liberals; we like to change the
world.
Washington Post Ombudswoman
Tonight we have put the best child care system in the world on
the American Agenda. That is to say, the system which is
acknowledged to be the best outside the home. It's in Sweden.
The Swedish system is run and paid for by the Swedish government, something many Americans [such as me] would like to
see the U.S. government do as well.
Peter Jennings, Anchorman of ABC News, 1989
I thought from the outset that Reagan's supply-side theory was
just a disaster. I knew of no one who felt it was going to work.
Tom Brokaw, Anchorman of NBC News, 1983
The collapse of Fannie and Freddie was completely preventable. The party that blocked any attempt to prevent it was:
the Democrat Party. The party that tried to prevent it was: the
Republican Party.
I have no doubt that if these facts had pointed to the Republican Party or to John McCain as the guilty parties, you would be
treating it as a vast scandal. Housing-gate, no doubt. Or Fannie-gate.
Instead, it was Senator Christopher Dodd and Congressman
Barney Frank, both Democrats, who denied that there were any
problems, who refused Bush administration requests to set up a
regulatory agency to watch over Fannie Mae and Freddie Mac,
and who were still pushing for these agencies to go even further in promoting sub-prime mortgage loans almost up to the
minute they failed.
Yet when Nancy Pelosi accused the Bush administration and
Republican deregulation of causing the crisis, you in the press
did not hold her to account for her lie. Instead, you criticized
Republicans who took offense at this lie and refused to vote for
the bailout!
And after Franklin Raines, the CEO of Fannie Mae who made
$90 million while running it into the ground, was fired for his
incompetence, one presidential candidate's campaign actually
consulted him for advice on housing.
If that presidential candidate had been John McCain, you would
have called it a major scandal and we would be getting stories
in your paper every day about how incompetent and corrupt he
was.
But instead, that candidate was Barack Obama, and so you
have buried this story, and when the McCain campaign dared to
call Raines an "adviser" to the Obama campaign (because that campaign had sought his advice), you actually let Obama's people get away with accusing McCain of lying, merely
because Raines wasn't listed as an official adviser to the Obama
campaign. You would never tolerate such weasely nit-picking
from a Republican.
Orson Scott Card, science fiction writer (Democrat)
The Republican debate provided red meat for conservatives:
anti-gay, pro-Jesus, anti-abortion and no gray matter in
between.
Brian Williams, Anchorman of NBC News, 2000
In a world without truth, freedom loses its foundation.
Pope John Paul II
In America the President reigns for four years, and Journalism
governs forever and ever.
Oscar Wilde
There never was an age of conformity quite like this one, or a
camaraderie quite like the Liberals'. Drop a little itching powder in Jimmy Wechsler's bath and before he has scratched himself for the third time, Arthur Schlesinger will have denounced
you in a dozen books and speeches, Archibald MacLeish will
have written ten heroic cantos about our age of terror, Harper's
will have published them, and everyone in sight will have been
nominated for a Freedom Award.
William F. Buckley Jr., National Review, 1955
The skillful propagandist has the power to mold minds in any
direction he chooses, and even the most intelligent and independent people cannot entirely escape their influence if they
are long isolated from other sources of information.
Friedrich Hayek, The Road to Serfdom
The funniest (if you like sarcasm) and most convincing case for media bias is Ann
Coulter's best-selling Slander. She is undeservedly vilified, even by conservatives.
She might say controversial things, but doesn't Jon Stewart?
Note how a vacation day is defined as one not at the White House.
However, it leaves out the fact that a President takes the resources
of the White House with him wherever he goes, and President Bush
has had meetings with foreign leaders, and with his various national
security and economic teams at his Texas ranch. The media leave all
these facts out, and just report again and again that they have the
numbers to prove that Bush is lazy, even though he gets up at 5 a.m. and exercises every day.
It is with repetition and the ability to lie by omission that the
media can mold minds in any direction they choose. I have Russian
friends from Microsoft who lived in the Soviet Union during the days
of Pravda, and yet they don't seem to imagine that those same propaganda techniques could and do exist here. I believe the bias of the
media is one of the greatest ongoing scandals of our age.
How can the media be leftward in this center-right country? Why
doesn't the free market fix this? The answer is that the barriers to
entry for a newspaper or TV station are very high. If you live in New
York, you read The New York Times, and even those who don't like
its extreme political bent read it for other reasons such as its Arts
section.
Like creating a newspaper, creating a TV network is also very difficult; CBS, NBC and ABC have been around for almost 70 years.
Even though there is now cable news, the big 3 TV networks have
10 times as many viewers.
While Fox News has higher ratings than CNN and MSNBC combined, the rest of the media have colluded to discredit it as a mere
propaganda arm of the Republican party. I've talked to a fair number
of fellow software engineers who scoff at the idea of Fox as a legitimate news organization; the simplest way to squelch other viewpoints is to discredit them like this. Fox News was long-anchored by
Brit Hume, who worked for ABC News for 23 years without getting
fired for being a nutcase, but now it is presumed that he is.5 Pundit
Charles Krauthammer wrote: "Rupert Murdoch and Roger Ailes are geniuses: They found a niche market, half of America."
A video demonstrating how screwed up the news media in America is can be found at: http://bit.ly/BrokenMedia. In another era, this
interview would have been a scandal, but this sort of malfeasance
takes place every day. The election of Barack Obama is an excellent case study of media bias.
British journalist Nile Gardiner wrote: "Television news in America has for decades been dominated by a left-of-centre oligopoly that has not reflected public opinion. That smug arrangement was shattered when Fox opened for business in the mid-1990s.
Fox News has succeeded spectacularly in racing ahead of its rivals in the cable news market, notably CNN and MSNBC. Its evening shows such as the O'Reilly Factor, Glenn Beck and Hannity pull in several million viewers compared to just hundreds of thousands on Fox's competitors. Fox offers a highly opinionated, fast-paced and entertaining brand of political debate that includes all sides of the political aisle. The top hosts may be largely conservative (though not necessarily Republican), but the guests frequently are not, creating an adversarial and combative arena that until recently was a rarity in American news coverage. Fox is unashamedly pro-American, a breath of fresh air in an age when US foreign policy is increasingly weak, muddled and confused."
Barack Obama
As for the Democrats who sneered and howled that Palin was
unprepared to be a vice-presidential nominee, what navel-gazing hypocrisy! What protests were raised in the party or
mainstream media when John Edwards, with vastly less political experience than Palin, got John Kerry's nod for veep four
years ago?
Camille Paglia, liberal feminist
During this election, the media in the United States of America
was worse than the media in communist Russia. The anchormen and anchorwomen were reading from the same script.
They might have had different haircuts and they might have
had different outfits, but they were reading from the same
script.
Orly Taitz, attorney
The swooning frenzy over the choice of Barack Obama as President of the United States must be one of the most absurd waves
of self-deception and swirling fantasy ever to sweep through an
advanced civilization. At least Mandela-worship its nearest
equivalent is focused on a man who actually did something.
You may buy Obama picture books and Obama calendars and if
there isnt yet a childrens picture version of his story, there
soon will be. Proper books, recording his sordid associates, his
cowardly voting record, his astonishingly militant commitment
to unrestricted abortion and his blundering trip to Africa, are
little-read and hard to find.
Peter Hitchens, Daily Mail (conservative)
Do not blame Caesar, blame the people of Rome who have so
enthusiastically acclaimed and adored him and rejoiced in their
loss of freedom and danced in his path and gave him triumphal
processions. Blame the people who hail him when he speaks in
the Forum of the more money, more ease, more security, and
more living fatly at the expense of the industrious. Julius was
always an ambitious villain, but he is only one man.
Cicero
Barack Obama is intelligent and has some good ideas, but he won
the presidency because journalists fomented anger towards Bush
over his 8 years, and advocated for Obama's victory. The media
ignored Obama's failed efforts at the Annenberg foundation, his liberal voting record, lack of bipartisan accomplishments, evidence
that Ayers ghost-wrote Obama's first memoir, and more. (I am no fan
of John McCain!) I'm not arguing that these things are all necessarily true, simply that they weren't even discussed.
Conclusion
I can easily envision a world where free software has completely
taken over, but where The New York Times, et al, are still advocating against the policies of a free society. Let's build both!
HOW TO TRY LINUX
DEDICATION
Writing books is the closest men ever come to childbearing.
Norman Mailer, novelist
Art is never finished, only abandoned.
Leonardo Da Vinci
Acknowledgments
I would like to acknowledge my family, friends, teachers, colleagues, reviewers, etc. The total list would be very long and I would
likely leave out or misspell names that would mean nothing to my
readers, so I won't even try. I am not great about staying in touch,
but that doesn't mean I don't cherish our time together!
The cover art was created by Nils Seifert and the cover was
designed by Alex Randall.
If you enjoyed a free version of this book, and you want to send
me a donation as thanks for my two years of labor, that would be
appreciated! You could purchase a paper copy, or go to keithcu.com
and click on the PayPal button. Any amount donated over $5 will be
given to worthy efforts in free software. If you'd like to contribute
money towards a particular area of free software, but don't know
how, I can help!
Keith Curtis
[email protected]
twitter: @keithccurtis