Friday, October 12, 2007

RDB is a good idea, but should you write your own?


Dr. Dobb's, always worth reading, has a couple of interesting articles this month.

Matt Davey of Lab49 has a good read on WPF and Complex Event Processing (but where are the illustrations?). He blogs here.

However, even more exciting for me, there's an article on grid computing (regular readers will recall that Robert Anderson and I wrote an article on scaling an SOA on a grid for them last year).

In this month's article (entitled "Grid-Enabling Resource-Intensive Applications"), Timothy Hoehn and Bob Zeidman of Zeidman Consulting compare several different strategies and methods for grid-enabling an application. For the most part, I love the conclusions they draw:

Among the architectures they examined, Distributed Objects provided the most scalable, flexible solution. They preferred it over Client/Server ("Overhead of managing socket connections can be tedious. If server fails, clients are useless."), Peer to Peer ("Harder to manage from a central location"), and Clustering ("Needs homogeneous hardware. High Administrative overhead for processes and network.").

They also examine "communication strategies," and again I like the way they think: they prefer Remote Method Invocation to Sockets or Remote Procedure Calls.

Next, they examine the "Push" and "Pull" distribution models, and they conclude that Pull offers some obvious advantages.
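To make the Pull idea concrete, here is a minimal sketch of my own (it is not code from the article, and it runs in a single process rather than across a real grid). The names `run_pull_grid` and `worker_count` are hypothetical, and threads stand in for grid nodes. The point is only that each worker asks for its next task when it is ready, so no central scheduler has to guess which node is free.

```python
import queue
import threading


def run_pull_grid(tasks, worker_count=3):
    """Run each zero-argument callable in `tasks`; workers pull work when idle.

    Hypothetical helper for illustration: threads stand in for grid nodes.
    """
    work = queue.Queue()
    for index, task in enumerate(tasks):
        work.put((index, task))

    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                # The worker asks for work; nothing is pushed to it.
                index, task = work.get_nowait()
            except queue.Empty:
                return  # nothing left to pull, so this worker stops
            results[index] = task()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Five small jobs; results come back in submission order.
squares = run_pull_grid([lambda n=n: n * n for n in range(5)])
```

A fast worker simply comes back for more work sooner than a slow one, which is roughly the load-balancing advantage usually claimed for Pull. A Push design would instead need the dispatcher to track each node's capacity.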

Finally, they discuss three different "frameworks": .NET, JNI, and EJB. However, they weren't able to do a full comparison here. Tight timelines (and a lack of C# expertise) kept them from working in .NET; of the remaining two, they preferred EJB to JNI.

Timothy and Bob then implemented their solution, getting good speedup on large jobs (they don't offer actual numbers, but I certainly believe them).

So, the things I like: they validated many of the decisions we made in our own product, which has a pull model that distributes objects and invokes methods on them. That's very cool.

But the one thing that I didn't like about their methods: they wrote all of the grid infrastructure themselves!

I understand that they weren't trying to do a "vendor bake off," and I really do appreciate the research they did.

But essentially recommending that people write their own grid is a bit like saying "We've done some research into databases; if you need one, we recommend you write a relational database." In other words, the conclusion is perfectly valid right up until the "write it yourself" part. There are many vendors out there who have written very good tools here--it would be silly to write your own database. You'd spend far more on development time than you would on buying a far superior software product.

The same is true in distributed computing. You could write your own...but with the selection of high quality vendors out there, why would you?

And Tim and Bob, if you happen to come across this post, I really did like the article. I think you reached all the right conclusions. I'd love the chance to see what it would take to grid-enable CodeMatch using my favorite grid computing toolset!

Photo Credit: Emily Roesly
