Friday, June 16, 2006

Computers get slower every day

We have a saying around our office: "Computers get slower every day."

"What?" you ask! "What about Moore's Law? What about 3 GHz Opterons? What about 3.4 GHz Xeons?"

Well, ok, that's true. AMD and Intel are still innovating, and they're still churning out faster and faster chips. But the truth is, they're not keeping up.

As I wrote last fall, increases in processor speed just can't keep up with increases in networking speed and disk density (I referenced this Scientific American article). When you take into account the amount of data that we can effectively move to a computer, the processor itself is relatively slower than it used to be.

Now, combine this ever-increasing network speed with Jim Gray's observation about distributed computing: "Keep the computation near the data." But once you take increasing network speeds into account, the corollary is: "Your data is getting closer and closer every day." In other words, because networks are getting faster and faster, it makes more and more sense to move data out to more processors in order to work on it.

I write all of this because I had first-hand experience with it yesterday. Over on the Digipede Community boards, delcom5 had written to say that he had 15 GB files to zip, and asked if the Digipede Network could be used to speed up the zipping process.

I was curious, so I set out to try it. It was extremely simple to set up a job to zip files (it took me maybe one minute using Digipede Workbench). However, when I ran it, even though I had 10 machines working on zipping, I got barely any speedup. Why? Well, I was dealing with 100 MB files--and those take a while to move around my 100 Mbit network! I was pretty frustrated.
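The arithmetic behind that frustration is easy to sketch. Here's a back-of-envelope calculation--the file size matches my test, but the idealized link rates and the ten-file workload are illustrative assumptions, ignoring protocol overhead and contention:

```python
# Back-of-envelope: moving 100 MB files over Fast Ethernet vs Gigabit.
# Assumed numbers: ten 100 MB files; link rates are ideal, no overhead.

FILE_MB = 100
NUM_FILES = 10

def transfer_seconds(file_mb, link_mbit_per_s):
    """Seconds to move one file over the link (8 bits per byte)."""
    return file_mb * 8 / link_mbit_per_s

fast_ethernet = transfer_seconds(FILE_MB, 100)    # 100 Mbit/s
gigabit       = transfer_seconds(FILE_MB, 1000)   # 1 Gbit/s

print(f"100 Mbit: {fast_ethernet:.0f} s per file")   # 8 s
print(f"Gigabit:  {gigabit:.1f} s per file")         # 0.8 s

# All the files leave one submitting machine, so the transfers are
# serialized on its uplink; the whole fleet waits roughly:
print(f"Moving all 10 files on 100 Mbit: {NUM_FILES * fast_ethernet:.0f} s")  # 80 s
```

Eight seconds per file, serialized out of one machine, eats up most of the time the extra processors would have saved.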

[Digipede Workbench screenshot]

Then I decided to submit to a subnet that's wired with Gigabit Ethernet. Wow! What a difference. Zipping a gigabyte of files went from almost a minute and a half to fifteen seconds.

A couple of years ago, this wouldn't have made any sense as a distributed computing problem--100MBit networks just weren't fast enough to make it work. If you wanted to zip a bunch of files, you were forced to do it on one machine. But with the order-of-magnitude performance increase that Gigabit Ethernet gives us, you get tremendous improvement by distributing this problem.

The lesson here isn't just about zipping files, of course. As network speeds increase faster than chip speeds do, more and more problems like this become "eligible" for distributed computing every day.