Wednesday, February 28, 2007

It's a Delightfully Parallel World


Traveling every week so far in 2007 has had a major impact on my blogging frequency--I haven't had time to read my feeds in weeks, let alone write anything. But when I started this blog in 2005, I had a goal of writing at least one post a week, and I'm determined to get back to that.

And last night, after a 12-hour day, I returned to my hotel room to start a post--and how does Blogger greet me? By letting me know that Servlet NewFrontend is currently unavailable. Great. Can someone remind me why I use this service?

Now, back to our regularly scheduled program.

Spending a ton of time at customer sites this year has given me a vastly improved perspective on how much demand there is for distributed computing power. I've been at a customer site where people were asking for time on a brand-new development grid because they needed to get production analysis runs completed. I've seen people running from desktop machine to desktop machine, starting analysis software on each. And, of course, the most prevalent "grid" out there: using remote desktop to access many servers, starting processes on each.

And why exactly are people resorting to these slow, inefficient methods to get work done? Can distributed computing really help them? Of course it can. Because, after all, it's a delightfully parallel world.

Several years ago, long before Digipede had released a product but after we had decided on a feature set, one of the luminaries of distributed computing told us that the problem with a system like ours was that it could only solve "embarrassingly parallel" problems.

For those of you unfamiliar with the term, embarrassingly parallel refers to computing problems that are easy to segment into separate, parallel computations. Moreover, embarrassingly parallel problems require no communication between the various pieces of the problem.

There has long been a feeling in the academic computer science community that "embarrassingly parallel" problems aren't worth spending time on. Academics have been much more intent on solving those problems that can't be easily broken up, that require constant communication and direct memory access between processes. Fields like Finite Element Analysis and Computational Fluid Dynamics, for example, are enormously complex, require vast amounts of computing power, and have great computer science minds struggling to come up with new and innovative technologies.

While the academics have been solving these very difficult problems, they've been looking down their noses at embarrassingly parallel problems--the name itself is quite condescending.


But when you go out into corporate America, and you look at the problems most developers are trying to solve, and you look at the compute loads strangling overworked servers, you find a nasty little secret:

It's a delightfully parallel world.

That industry luminary told us that, in his estimation, perhaps 10% of computing problems might be considered embarrassingly parallel--everything else requires "real" distributed computing.

Having spent a bunch of time with customers, I think he had it exactly backwards. Why? Because it's a delightfully parallel world.

Most customers out there who are adapting their software to run on a grid or cluster aren't tearing apart their algorithms, rewriting every line of code with a complex toolkit so it functions across processors. Carving up an algorithm like that is amazingly difficult and requires enormous expertise.

No, customers do something far more efficient and practical: instead of trying to carve up their algorithms, they break up their data.

Say you've written a routine that analyzes the risk of a customer's portfolio, and it takes 5 seconds to run. If you have 1,000 customers, a full run takes nearly an hour and a half. Imagine you have 20 servers--you'd really like to spread that work around to get it done quicker. Now, you could rewrite the algorithm so that it uses multiple processors simultaneously, but that would mean adopting complex technology like MPI and completely rearchitecting your routine. Here's a much easier solution: leave your algorithm exactly the way it is, and break up the data instead. Each server analyzes 50 customers, and your analysis is done in about 4 minutes. Why was that possible?

Because it's a delightfully parallel world. Your customers' portfolios aren't dependent on each other--each can be analyzed independently.
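Here's a minimal sketch of that pattern in Python, assuming hypothetical load_portfolios and analyze_portfolio functions as stand-ins for your existing code; a multiprocessing pool plays the role of the 20 servers:

```python
from multiprocessing import Pool

# Hypothetical stand-ins: in a real system these would load the 1,000
# customer portfolios and run the existing ~5-second risk routine.
def load_portfolios():
    return [{"customer_id": i} for i in range(1000)]

def analyze_portfolio(portfolio):
    # The algorithm itself is untouched; each portfolio is analyzed
    # independently, with no communication between tasks.
    return {"customer_id": portfolio["customer_id"], "risk": 0.0}

if __name__ == "__main__":
    portfolios = load_portfolios()
    # 20 worker processes stand in for the 20 servers; each ends up
    # analyzing roughly 50 portfolios.
    with Pool(processes=20) as pool:
        results = pool.map(analyze_portfolio, portfolios)
    print(f"Analyzed {len(results)} portfolios")
```

Notice that the only new code is the plumbing that divides up the data; the analysis routine itself never changes.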

And when you venture into corporate America and look at the server loads, you see that most of the analysis being done falls into this category.

A special effects company needs to render 50,000 frames for a scene. An electric power company needs to generate 20,000 complex bills for their largest customers. A web application needs to generate PDFs for users on the website. A bioinformatician needs to check 300 different proteins to see how well they dock on a segment of DNA. A trader needs to try 50 different trading algorithms against the history of a stock's performance.

All of these are daunting problems in terms of computing capacity--and all can be solved in parallel by dividing up the data.
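To make the first of those concrete, here's a hypothetical sketch of slicing the 50,000-frame range across render servers (the host names and the commented-out submit_render call are made up for illustration):

```python
# Divide 50,000 frames among render servers by slicing the frame
# range--no frame depends on any other frame.
SERVERS = ["render01", "render02", "render03", "render04"]
TOTAL_FRAMES = 50_000

chunk = TOTAL_FRAMES // len(SERVERS)
for i, server in enumerate(SERVERS):
    start = i * chunk
    end = TOTAL_FRAMES if i == len(SERVERS) - 1 else start + chunk
    # submit_render(server, start, end) would hand this slice to the
    # server's existing, unmodified renderer.
    print(f"{server}: frames {start} through {end - 1}")
```

The same slicing works for the bills, the PDFs, the proteins, and the trading algorithms--divide the data, not the algorithm.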

Now, before the MPI-jockeys take me to task, some disclaimers: I don't pretend that every problem in the world can be divided like this, and I understand that dividing data can be a complex task in its own right. Moreover, what you guys do is really, really hard. I get that, and I'm glad you're out there solving those problems.

But for the other 90% of developers out there: don't rewrite your algorithms. Break up your data.

Because, as John Powers says, it's a delightfully parallel world!

Photo credit: Scott Liddell