Wednesday, August 08, 2007

Digipede Network and CCS: What's the Dif?

One of the questions I get frequently from potential customers is "How is the Digipede Network different from Windows Compute Cluster Server?" I've spent the last 16 months or so reciting the same answer over and over again; when John Powers suggested I write a blog entry about it, I was (frankly) surprised I hadn't already.

But, searching through my del.icio.us tags in the sidebar, I see that I haven't. So here goes. This will take a couple of posts; in this one, I'll concentrate on my favorite part of the answer: how the Digipede Network differs from CCS from the developer's perspective.

Let's start with why CCS exists in the first place: in order to compete in the scientific and technical computing space, Microsoft knew they had to revamp their OS.

Windows Compute Cluster Edition is basically Windows Server 2003 x64 Edition. Microsoft took Server 2003, and added support for the type of hardware frequently seen in clusters (high-bandwidth networking like Infiniband, for example), and additional support for Remote Direct Memory Access (necessary for MPI implementations).

Next, they added in the Compute Cluster Pack. CCP is a set of tools that sits on top of the OS and provides additional software support for technical computing: specifically, an MPI stack, a cluster job scheduler, and a set of management tools.

In other words, CCP sits on top of CCE to make the OS into a tool usable for scientific and technical computing. CCS (Compute Cluster Server) is simply CCE and CCP together.

Ok, that's the Microsoft product line. How does Digipede fit in?

Well, while CCP has plenty of tools for the scientific developer, it has almost nothing for the enterprise developer. If you're developing in .NET, it doesn't translate naturally to a cluster paradigm--in order to work with CCP, you'd have to compile your .NET down to a command-line executable, and do all of your data passing either in files or on the command line.

The Digipede Network, however, can put your CCE nodes at the disposal of a .NET developer. Without restructuring your application, without moving to a command-line paradigm, without deploying your EXE to the different nodes. By automatically deploying .NET assemblies (and related files), then distributing and executing .NET objects natively, the Digipede Network adds a layer of .NET support onto CCE.

We handle the .NET parts of things. They handle the OS. Want 64 new nodes on your grid? Buy a cluster of CCE nodes. You'll save a boatload of dough over the Windows Server 2003 license costs, and you'll get good deployment tools. Throw Digipede Agents on those nodes, and suddenly you've got a high-powered, .NET supercomputer on your hands.

Want to see the difference between CCS and CCS + Digipede? Watch these two MSDN webcasts. In the first, Ming Xu and Sanjay Kulkarni from Microsoft put a calculation on a CCS cluster behind a spreadsheet.

In the second, I do the same kind of thing, but I use the Digipede Network on top of CCS.

The difference? In the first demo in that webcast, Ming and Sanjay want to put some .NET logic behind a spreadsheet. They had to write a Web Service that the UDF in Excel talks to, then deploy that Web Service to the head node on the cluster. When that Web Service is called, it communicates with the CCS job scheduler to start a job. They wrote a command line executable that actually did the analysis, deployed that EXE on a file share, then had their CCS job invoke that command line executable. That command line executable in turn writes to stdout, and stdout from each task is redirected into a file on the file share. The web service polls the job scheduler until the job completes, collects the results from the command line executable by opening each file, then returns the results to the UDF (which, you recall, had originally invoked the web service).

Whew! That's a lot of moving parts. It uses technologies common to technical and scientific computing (e.g., using command line executables, handling data passing using command lines and files on a share), but perhaps not as familiar to enterprise developers working in Excel.
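To make those moving parts concrete, here is a minimal local sketch of the same pattern in Python. It is only an analogy and uses none of the CCS or Compute Cluster Pack APIs: ordinary child processes stand in for cluster tasks, a temporary directory stands in for the file share, and the names `WORKER_SOURCE` and `run_job_via_files` are made up for illustration.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# Stand-in for the command-line executable: input arrives on the command
# line, the result leaves on stdout. (Hypothetical; not the webcast's EXE.)
WORKER_SOURCE = "import sys; print(float(sys.argv[1]) ** 2)"


def run_job_via_files(inputs, share_dir):
    """Start one process per task, redirect stdout to a file, poll, collect."""
    share = Path(share_dir)
    tasks = []
    for i, value in enumerate(inputs):
        out_path = share / f"task_{i}.out"
        out_file = open(out_path, "w")
        # Data goes in as a command-line argument; stdout goes to the "share".
        proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(value)],
            stdout=out_file,
        )
        tasks.append((proc, out_file, out_path))

    # Poll until every task has exited, as the web service polls the scheduler.
    while any(proc.poll() is None for proc, _, _ in tasks):
        time.sleep(0.05)

    # Collect results by opening each output file and parsing its text.
    results = []
    for proc, out_file, out_path in tasks:
        out_file.close()
        if proc.returncode != 0:
            raise RuntimeError(f"task writing {out_path.name} failed")
        results.append(float(out_path.read_text().strip()))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as share_dir:
        print(run_job_via_files([1.0, 2.0, 3.0], share_dir))
```

Even in this toy version, the caller has to format inputs as strings, manage output files, poll for completion, and parse text back into numbers.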

In the second demo in this webcast, I implement a similar pattern, but I do it behind Excel Services (the same thing can happen behind Excel). The big difference is that it's much, much simpler. My User Defined Function in Excel Services simply passes .NET objects to the Digipede Network. Those objects are automatically distributed around the cluster, executed, and returned to the UDF. There are many fewer moving parts. I didn't have to predeploy or prestage my executables or DLLs, and I didn't have to mess with web services, command lines, command line EXEs, or getting data into and out of files.
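For contrast, the object-passing shape can be sketched in the same way. This is not the Digipede API: a local thread pool stands in for the grid, and `SquareTask`, `run`, and `run_on_pool` are hypothetical names. A real grid would also serialize each object and ship it to another machine, which this sketch does not attempt.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Optional


@dataclass
class SquareTask:
    """Hypothetical work item: holds its input and, once run, its result."""
    value: float
    result: Optional[float] = None

    def run(self):
        # The analysis lives in a method on the object, not in a separate EXE.
        self.result = self.value ** 2
        return self


def _execute(task):
    # What a worker does with a task it receives: run it and hand it back.
    return task.run()


def run_on_pool(tasks, workers=4):
    """Hand objects to the pool and get the completed objects back, in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_execute, tasks))


if __name__ == "__main__":
    done = run_on_pool([SquareTask(1.0), SquareTask(2.0), SquareTask(3.0)])
    print([task.result for task in done])
```

The caller works with typed objects from start to finish, with no command-line formatting, output files, or polling loop in its own code.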

To be clear, I am definitely not slamming CCS--it's a very good product, and it's making inroads in exactly the market it was aimed at: scientific and technical computing. What we've done is what good partners do: extend that product so it can be used in another way entirely, as a plank in a grid computing platform.

More later on other differences between CCE (the OS) and the Digipede Network (the grid computing platform).

Update 2008-01-03: See my follow-up post for more differences...

