Wednesday, October 12, 2005

How Hoary Is Your Hedgehog?

My colleague Robert forwarded me a link he saw on Tech.Memeorandum about a new Linux distro--Ubuntu 5.10, also known as "Breezy Badger." It replaces the previous release, Ubuntu 5.04, also known as "Hoary Hedgehog."

According to the article on linux.com, it's pretty good. It doesn't set up drivers for some proprietary hardware and doesn't have a GUI installer, but it gets a positive review. (The reviewer notes that if you want to use the KDE desktop, you should use the Kubuntu distro instead.)

Aside from giving me a chuckle over the naming conventions, the article got me thinking about one of the unsung dangers of Linux: the sheer volume of distros and their subtle differences. The conventional wisdom with Linux is that it's easier to use and safer than Windows, and that the costs are much lower (of course, there's no license fee, so that part is cut and dried). And, of course, by choosing an open-source OS, you're not locking yourself in to one vendor. What isn't generally discussed is how difficult it can be to make your particular software work with your particular flavor of Linux, and the hidden costs therein.

Distrowatch.com tracks distributions of 10 different flavors of Linux, and that's just the major ones. There are well over 100 flavors around. Why so many? Simply because it's open source. Anyone is free to create their own distribution of Linux, complete with their own modifications.

The problem is that not all flavors of Linux work the same. Frequently, applications that run fine under one flavor don't run under another--they need to be recompiled (sometimes recompiled and relinked differently on different flavors). Some people don't mind this a bit--heck, I know people who won't run any software on their boxes unless they compile it themselves. On the other hand, the vast majority of people out there don't know what a compiler is, and sure as heck aren't going to compile software themselves.
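To make that a little more concrete: one common reason a binary built on one flavor won't start on another is a shared library that's missing or is a different version. Here's a minimal sketch in Python of how you might spot that, assuming you've already captured the text output of `ldd` for the binary in question. The function name and the sample text are made up for illustration, and the sketch only looks for the "not found" lines; it doesn't catch every kind of incompatibility.

```python
from typing import List


def missing_libraries(ldd_output: str) -> List[str]:
    """Return the shared libraries that ldd reports as 'not found'.

    Hypothetical helper: it only parses text, it does not run ldd itself.
    """
    missing = []
    for line in ldd_output.splitlines():
        line = line.strip()
        # Unresolved dependencies look like: "libfoo.so.1 => not found"
        if line.endswith("=> not found"):
            missing.append(line.split("=>")[0].strip())
    return missing


# Illustrative sample text, not captured from a real system.
sample = """\
    libm.so.6 => /lib/libm.so.6 (0x00002aaa)
    libexample.so.2 => not found
    libc.so.6 => /lib/libc.so.6 (0x00002bbb)
"""
print(missing_libraries(sample))
```

An empty list doesn't prove the application will run, but a non-empty one is a pretty good hint that you're headed for a rebuild or a hunt for packages.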

So what does this have to do with distributed computing? Well, Linux has made huge inroads in distributed computing over the last few years. It has really become the operating system of choice for new clusters. But I think that some of the reasons behind choosing Linux aren't as strong as people think.

The overwhelming reason people choose Linux is to save the cost of the license for the operating system; fair enough, Windows isn't free. But what about the cost of hiring or keeping on staff someone whose job it is to recompile every cluster application so it runs on your particular flavor of Linux? Or, worse (and I've read stories of this occurring), the cost of changing the installed distro of Linux in order to run a different application on the cluster, and then changing it back when you want to run the previous application?

Of course, you could just decide to stick with one distribution. But in a world where not every piece of software will run on that distribution, you've just locked yourself in to one "vendor." Isn't that what you were trying to avoid by choosing Linux?

Now, I'm certainly not trying to say that no one should use Linux. It's a great OS with some great capabilities, and it makes sense in some situations. I'm also not saying that Microsoft has a perfect answer to this; heck, they haven't even released their Compute Cluster version of the OS yet. But one thing you can be assured of: there won't be 100 different distributions. And you won't have to recompile your applications with the correct flags to make them run on it.