I had a Digipede customer ask me last week about how to optimize his processes for running on the Digipede Network. He is grid-enabling an existing application--he's got a class that does a lot of calculation, and he wants to have a bunch of those objects calculating in parallel. Each object is already independent.
Sounds perfect for distributed execution, right? "Twenty lines of code" away from being Digipede-enabled?
Well, not quite.
See, the objects that he does his calculation on are pretty darn big--on the order of 20-25 megs each. He actually had no idea they were so big; but the class has lots of members, and many of the members are large classes themselves. Now, the Digipede Network can certainly move objects of that size--but those objects have to be moved in order to calculate on them, and moving data is often more time-consuming than the computation you need to do on it. (See Jim Gray's Distributed Computing Economics to get an idea of what that means, but bear in mind that we are only discussing LAN-wide computation here.)
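To see why object size matters, here's a quick back-of-the-envelope calculation (the 100 Mbit/s link speed is my assumption for illustration, not a number from the customer's network):

```csharp
using System;

public class MoveCost
{
    // Seconds of pure transfer time for one object of the given size,
    // on a link of the given speed.
    public static double TransferSeconds(double megabytes, double megabitsPerSec)
    {
        return megabytes * 8 / megabitsPerSec;
    }

    public static void Main()
    {
        // One 20 MB object on a 100 Mbit/s LAN: ~1.6 seconds of transfer
        // time before any computation even begins.
        Console.WriteLine(TransferSeconds(20.0, 100.0));
    }
}
```

Multiply that by hundreds of objects, and the network quickly becomes the bottleneck rather than the CPUs.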
The answer can frequently be to create a class that contains only the data relevant to the distributed calculation. In the case of this customer, he was "retrofitting" the application to work on the Digipede Network. His class wasn't designed for distribution and, as a result, had a lot of data in it that wasn't necessary for the calculation that he wanted to happen remotely. In other words, his objects did a lot in their lifetime, only a portion of which was going to be distributed.
The customer needed to create a class that contained only the data that was relevant to his distributed process, and use that as a member in his huge class. Only this new, smaller class gets distributed across the network. Instead of moving 20MB per object, he was now moving only a few kilobytes. When the small class returns from its journey across the network, its result data is then copied back into the main object.
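A minimal sketch of that pattern (the class and member names here are hypothetical--the post doesn't show the customer's actual classes, and the calculation is a stand-in):

```csharp
using System;

// Only the data the remote calculation needs crosses the network.
[Serializable]
public class CalcPayload
{
    public double[] Inputs;   // inputs the remote calculation actually uses
    public double Result;     // slot for the result that comes back

    public void Calculate()
    {
        // Stand-in for the real computation.
        double sum = 0;
        foreach (double x in Inputs)
            sum += x;
        Result = sum;
    }
}

// The big 20-25 MB class stays on the local machine.
public class BigCalculationObject
{
    public byte[] LotsOfOtherState;   // never crosses the wire
    public CalcPayload Payload = new CalcPayload();

    // When the small class returns from its journey across the network,
    // copy its result data back into the main object.
    public void ApplyResult(CalcPayload returned)
    {
        Payload.Result = returned.Result;
    }
}
```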
Our customer needed to do a bit more work than the fabled "twenty lines of code"--but he ended up with a more structured application and vastly improved performance.
Wednesday, December 21, 2005
Thursday, December 15, 2005
[Update: looking for my discussion of Microsoft's tools? It's down here]
Here are a few links for your perusal...
Don Dodge, always an interesting read, echoes Dan'l Lewin's list of hot startups using Microsoft tools here:
Digipede—Its "many legs make light work" and turn any combination of servers and desktops into a grid for .NET apps.

Over at the Science Library Pad, they point out that SearchCIO has published Gartner's 2006 Top 10 Strategic Technologies:
Grid computing. Who's doing grid computing? Charles Schwab, Royal Dutch Shell and Sony are among the companies tapping the technology. Definitions of grid computing vary, but its popularity continues to rise.
And, my favorite read of the day, Software Development Magazine has just published a review of our software, the Digipede Network. Paid registration required for the article, but suffice it to say: Four stars! Thanks guys!
It's a great review, and not just because he liked our product. As it turns out, the author (Rick Wayne) isn't just a tech writer--he also does soil analysis. He knows his way around compute-intensive simulations. So he converted his office into a mini-grid, using nothing but the Digipede Network software (and included documentation, of course)--with not a drop of help from us. He got it working in a snap, and he was one happy camper.
Posted by Dan Ciruli at 3:43 PM
There are startups using .NET, but they aren’t the majority, and those who chose to do so are buying themselves into a trap with expensive licenses and a locked-in platform.
Jeremy Wright from b5Media, who has a great blog called Ensight, contradicts him:
The startup can grab ISV packs which’ll cost about 2500$ to get the company up and running with all the dev tools and server bits they’ll need. Toss in another 2500$ and they’ll get all the MSDN stuff they need. 5000$ is not that much to get a 5-10 man shop up and running, even when bootstrapping.
Jeremy has a great point, but he's off by an order of magnitude!
I work at a startup. We joined Microsoft's Empower program, which exists to help startups with initial costs. It cost us $375. It included a universal MSDN subscription with 5 user licenses, as well as 5 licenses for Office, and "the full array of server products including Windows Server 2003, Exchange 2003 Server and SQL Server."
Um, does $375 seem like an exorbitant amount for that?
Also included: technical support and training. Individuals at Microsoft were assigned to help us with technical details, our marketing, and even our sales.
As soon as we could, we became Partners, then Certified Partners, then Gold Certified Partners. The benefits are enormous: Microsoft helps with marketing, they help with architectural issues, we get early releases of software, we get direct access to the product teams and their roadmaps. Oh, and we get GREAT license benefits.
If you haven't worked with Microsoft, then you just don't understand this: they work very, very hard to create an ecosystem that actually fosters innovation. They want startups like mine to choose their tools, so they do a tremendous amount of work to make it a good choice. And with programs like Empower, the cost of all those tools, operating systems, and support, is virtually nothing.
Windows may not be your OS of choice, and if your users are all running Linux boxes, obviously these tools and programs aren't for you.
But if you think it's too expensive for a startup to use Microsoft products, you just haven't done the research.
From our perspective: it's a slam dunk.
Posted by Dan Ciruli at 11:33 AM
Tuesday, December 13, 2005
Thursday, December 08, 2005
Which is why designing for scale is so important. I don’t believe any startup “needs” to achieve anything more than around 2 9s of uptime, which is what a properly configured server should do for you. However, even at the beginning, you need to be coding and planning for growth. (emphasis mine)
Small things like managing how transactions occur, having separate database connections for reading and writing, making your app able to handle variable state sessions, etc are key.
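The quoted advice about separate database connections for reading and writing can be sketched like this (the server names, connection strings, and use of SqlClient are all placeholders of mine, not anything from the quoted post):

```csharp
using System.Data;
using System.Data.SqlClient;

// Route reads and writes through separate connection factories so that,
// when the time comes to scale, reads can be pointed at a replica
// without touching any calling code.
public static class Db
{
    public const string ReadConnectionString =
        "Server=read-replica;Database=App;Integrated Security=true";

    public const string WriteConnectionString =
        "Server=primary;Database=App;Integrated Security=true";

    public static IDbConnection OpenForRead()
    {
        SqlConnection c = new SqlConnection(ReadConnectionString);
        c.Open();
        return c;
    }

    public static IDbConnection OpenForWrite()
    {
        // Writes always go to the primary.
        SqlConnection c = new SqlConnection(WriteConnectionString);
        c.Open();
        return c;
    }
}
```

On day one both strings can point at the same server; the point is that the seam exists before you need it.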
One of his posters, Ian Holsman, responds:
The reasoning behind not worrying about scaling is that in a lot of cases people worry about the wrong things. They will spend hours getting that code tuned just so, and have it running in 10ms less time, only not to realize that the code is only run once a day.
Scott Sanders from Feedlounge also responded:
The FeedLounge development process was more along the lines of (again, emphasis mine):
1. Build a webapp, see if the features are compelling to a set of users, keeping a design in mind that is capable of scaling
2. Overrun the shared server that you are using, switch to dedicated server, so you can properly measure the effects of the application.
3. Add more users, adding requested features from the users, measuring the load in a fixed, known environment, and start work on the “Distributed” part of the ladder. This is where the build portion of the scalability starts.
4. Now that you believe you have something that has value, invest in the hardware and software development necessary to scale. Continue working on priority based tasks towards release of your product.
Scott's point is valid, and the italicized portion makes it valid: you need to be designing your software so it's capable of scaling. No, Ian, this doesn't mean spending weeks optimizing the code to trim every last microsecond off of every transaction. It means designing your software well from the beginning.
Most importantly, it means acknowledging the possibility, however remote, that you may actually succeed and build something that people eventually use. Many people.
This point applies equally to those designing web sites and those planning on deploying SaaS. If you are going to make it available on the web, and you're not designing for scalability, then you just aren't planning for success: you're planning for failure.
Ian does make one valid point: in the beginning, you shouldn't spend too much time on scalability. You need to make sure you get the <content, features, service, whatever> right. But you need to be prepared for scaling; that's why it's important to choose a toolset in the beginning that makes scaling later as easy as possible.
Plan on succeeding.
As a product manager, I'd be remiss if I didn't point out that that is exactly what we designed our SDK for--so developers can spend their time in their area of expertise, but have a framework underpinning their software that will scale when the time comes.
Posted by Dan Ciruli at 4:25 PM
Wednesday, December 07, 2005
I was starting to prepare a post on how to buy a grid--what are the steps you can take, and what's involved.
Then I remembered this post by my colleague Kim. It's a great place to start: 10 easy steps, and guaranteed success (well, they may not all be easy, and no one can guarantee success). But it's a very good list of things to consider when you are educating yourself about a purchase.
I'd add one more thing to watch out for: don't be swayed by features that you don't need (and won't be able to take advantage of). If someone tells you that their software is the most powerful on the planet because it works on 12 different operating systems and can tie together PCs in Idaho with mainframes in Irkutsk, but you have 200 PCs in one office building, how does that help you? It doesn't. So look for something that will help you the way you work.
So read Kim's post. It's a good one. Plus, she used it to invent the word gridified, which is my new favorite word.
Posted by Dan Ciruli at 3:19 PM
Tuesday, December 06, 2005
I tend not to use spell-checkers; I try to read my writing carefully (although anyone who has been paying attention knows that I certainly don't always do that successfully).
Today, for the first time, I used Blogger's built-in spell checker.
It choked on a word that, well, I thought it would know...
Technorati tags: blogger
Posted by Dan Ciruli at 2:42 PM
Sometimes I read a blog post and it just makes me smile.
Today Jeremy Wright talks about the need for scaleability for Web 2.0 companies (he's specifically talking about an announcement from FeedLounge). He says:
Listen up. If your company relies on the web to stay alive, you’d damn well better be using at least some of the following “ladder to high availability”:
Backups, Redundant, Failover, Cluster, Distributed, Grid and finally Mesh
Each step up is a massive increase in cost, but it’s also a massive increase in uptime and such. I hate it when companies say they want 99.9% uptime (or even worse 5 9s of uptime) without thinking about what that’ll cost them.
Distributed/Grid computing should be the kind of thing that these companies are thinking about from the time they begin planning their architecture. They have to plan for success. They have to plan on hundreds of thousands or millions of users, right?
And I'll go a step further than Jeremy--the other thing Web2.0 companies shouldn't do is write that portion of their applications from scratch. I mean, no one in their right mind writes their own database to sit under web apps, right? You go get one--SQL Server, MySQL, PostgreSQL, whatever is right for you--but it would be a complete waste of your time to sit down and write one from scratch.
So why do people try to do that with a distributed/grid system? Most do. It's too bad; it's a waste of their valuable time and money. And in all likelihood, they'll end up with a solution that isn't nearly as scalable as they think it is.
I'll let Jeremy finish this one up for me:
If your business depends on your website being up, look at your code, look at your infrastructure and for your users sake figure out what you actually need and build the damn thing properly!
Posted by Dan Ciruli at 2:26 PM
A couple of weeks ago, I wrote a post about my difficulties in upgrading a VSTO project from VSTO 2003 to VSTO 2005 (for those of you unfamiliar with VSTO, it stands for Visual Studio Tools for Office; it's replacing VBA as the method to put code behind spreadsheets, Word documents, etc.).
I had some difficulties upgrading a VSTO project from Visual Studio 2003 to Visual Studio 2005; they've changed some of the architecture, and it was a pain (read that post if you want the gory details).
However, my post was remiss--I neglected to mention any of the good changes in VSTO.
First and foremost: Integrated development environment. In the past, if I wanted to put buttons on my Excel spreadsheet, I had to put Excel into "Design mode", create buttons using the Control Toolbox, then switch over to Visual Studio to wire those to methods in my C# code. It was pretty cumbersome, and dealing with modality in Excel was a pain ("No, I meant to select the button, not click it!"). Now, all of the design work is done directly in Visual Studio 2005--I open my Excel spreadsheet there, and I work on the UI using the regular toolbox in Visual Studio. It's a much more cohesive experience.
Secondly: now the controls look better! The buttons that appear on the Excel spreadsheet just plain look better than they used to.
Third: this is the part I don't quite understand. For some reason, the .NET code in the spreadsheet seems to load much faster. I can't tell if this is my imagination or not--but I've done this a bunch, and I don't think it is. It seemed in the past that when I made my first call into .NET code behind my spreadsheet, I had a several-second delay (disclaimer: I have the slowest laptop on the planet). For whatever reason, that seems to have disappeared. Now, when I make my first call, it happens immediately. This must be a .NET 2.0 change (because I'm still running the same version of Excel 2003), but it's certainly welcome.
The change to the object model took some getting used to, and it was annoying to have to rewrite some of my code. But I'm coming around to it.
Last, the methodology of associating the binary with the spreadsheet has been improved. In the past, there were two custom file properties in the spreadsheet--one which gave the assembly location, and one which gave the assembly name. The location defaulted to NameOfSpreadsheet_bin, and the name of the DLL was NameOfSpreadsheet.dll. It always looked pretty unwieldy--you'd have your NameOfSpreadsheet.xls, and next to it a NameOfSpreadsheet_bin folder with a NameOfSpreadsheet.dll in it. And if you ever moved anything, it was a pain to tell .NET 1.1 how to give permission to the new DLL.
Now, the custom property of the spreadsheet has the GUID of the DLL in it, and .NET 2.0 gives permission to that GUID. This means that you no longer need to have a *_bin folder, and you have a much easier time if you need to move/deploy your spreadsheet.
So the upgrade process is a pain. But once you get in there, VSTO for Visual Studio 2005 is definitely better to work with than VSTO for Visual Studio 2003. I haven't built anything extravagant yet (unless you consider a grid-enabled, supercomputing spreadsheet extravagant--come to my webinar in half an hour to hear more about that), but I think the product is much improved.
[Updated 13:17 adding Technorati tags] vsto visual studio excel
Posted by Dan Ciruli at 9:00 AM
Monday, December 05, 2005
I'm giving a Developer webinar tomorrow at 10:00 AM, Pacific time.
I'll talk a bit about the Digipede Network, I'll use Visual Studio 2005 to grid-enable a .NET application, and show that application running faster by running on a cluster of Windows boxes.
If you haven't seen a demo of the Digipede Framework SDK yet, you'll be amazed at how little I have to modify an existing application in order to make it run on the Digipede Network. Click here to register and join in on the fun.
Posted by Dan Ciruli at 9:27 AM
Friday, December 02, 2005
One of the questions I hear most frequently about Windows clusters is "Windows clusters?"
The answer: "Yes!"
People do use Windows for clusters and HPC. Here are some links to valuable resources for anyone considering a Windows Cluster.
And, of course, there is Microsoft's HPC Partners Page, featuring this little company!
Posted by Dan Ciruli at 12:33 PM
Thursday, December 01, 2005
Wednesday, November 30, 2005
The InfoWorld TechWatch blog keeps track of, well, all things tech. It's a good all-around gatherer of information. Today there's a post on OCC: Outbound Content Compliance.
Corporations increasingly need to monitor their employees' communications to ensure that they are compliant with any regulatory issues as well as any internal guidelines.
One great product that handles this is OutBoxer by Audiotrieve. How do I know these guys? Well, I first met them at the Demo conference in Phoenix last January. They were demoing OutBoxer, and we were demoing the Digipede Network.
As it turns out, they "train" their product by running millions of messages through it, using Bayesian analysis to make their algorithms more accurate. It's the kind of thing that scales linearly. We struck up a conversation with their CTO Sean True at the conference, and they were one of our early beta customers. They became a commercial customer as soon as we released. Their analysis runs went from overnight to under an hour. Faster analysis has two huge benefits for them: more accuracy and better use of their employees' time.
It was a great proof-of-concept sale for the Digipede Network. We never visited their offices; they were able to download and install the software by themselves. After seeing it work, they bought new, dedicated hardware to scale their solution even more. They were the first users of our COM API (which they were calling from Python!), and Sean got into .NET programming and even ported one of our samples to IronPython for the Digipede Community Boards.
If you're interested more in the implementation, we've got a case study here.
Posted by Dan Ciruli at 2:36 PM
Tuesday, November 29, 2005
If you're not reading Nicholas Carr's Rough Type blog, you're missing one of the best writers in the blogosphere.
His post today is called "Kill All Screensavers," and he tells an interesting story about screensavers in corporate America--he's talked to at least two CIOs who have been prevented from successfully implementing grids on their PCs because of the official corporate screensaver.
He said that while grids were theoretically attractive as a cheap means of harnessing lots of processing power, he faced a big roadblock: his company's official screensaver. It turns out that the corporate communications department created an elaborate screensaver, complete with video clips featuring the CEO, to promulgate a “corporate values” program. Installed on all the company’s PCs, the screensaver sucks up the processing cycles that might otherwise be put to a productive use – like finding a cure for cancer.

When we were first designing the Digipede Network, we looked a lot at what had been done by other distributed computing networks. Many of them implemented a fancy screen saver (SETI@Home is a good example, but there are many others).
Without thinking twice, we decided not to implement one. Why? Because it's an enormous waste of resources! (See Nicholas's post for some staggering statistics on the power wasted alone). It's bad enough for a pharma company to willingly waste power and CPU hours; it's ludicrous for a grid company to do it. They're wasting the very commodity they're supposed to be saving!
Needless to say: don't look for a Digipede screensaver when you implement the Digipede Network.
Technorati Tags: .NET, green, grid, net, screensavers
Posted by Dan Ciruli at 9:25 AM
Monday, November 28, 2005
I've spent a lot of time over the last couple of months thinking about software as a service and service oriented architecture, especially how they relate to distributed computing.
Today my colleague Kim sent me a link to the Threeminds blog on Digital Marketing. In this post, 3minds makes an interesting point that I hadn't considered very much: distributed collaborative development (SourceLabs, SWiK, Sourceforge, et al.) enables "extreme acceleration" in collaborative development by distributing the development, accomplishing more work by developing in parallel. Distributed computing offers extreme acceleration in the work that can be done by that software, accomplishing more work by computing in parallel.
The networks (social and computing) are becoming more powerful. The tools available on both sides are becoming more sophisticated every day.
Posted by Dan Ciruli at 2:20 PM
Thursday, November 17, 2005
While I work on a post about Supercomputing 2005, feel free to chew on this interview with Digipede CEO John Powers on WindowsHPC.org.
Posted by Dan Ciruli at 12:05 PM
Tuesday, November 15, 2005
I had intended to blog directly from Supercomputing 2005. However, our days ended up being so full, I couldn't find a free 15 minutes to write. So I'm back in Oakland and ready to pour the thoughts out of my head. We had a pretty huge week here at Digipede, so it's been tough to find time to blog.
This was my first trip to Supercomputing. There were some things that struck me as odd, some things that wowed me, and other things that just struck me.
I was amazed at the amount of computing power there. John overheard someone say that they basically loaded as much computing power into the building as the power grid could handle. There were huge racks of servers. There were supercomputers. The vendors brought computers. The national labs brought computers. It was an amazing amount of compute (and networking) power. They put up some pretty good information about the network they built for the show here.
One thing that surprised me about the show was the incredibly large presence of the not-for-profit agencies that do supercomputing. The national laboratories (Argonne, Brookhaven, Idaho, Lawrence-Berkeley, Los Alamos, et al.) were there in force. They had large booths (20x20 or 30x30, with many displays, multi-story structures, etc.). I was surprised to see them because I think of conferences like this (and exhibits, in particular) as a way to attract customers. Clearly these institutions use this conference as a way to keep the industry abreast of what they've done in the last year. It was all very informative, and it made the exhibit hall a little more tolerable because not everyone there was trying to sell their wares.
One thing that made this show very different than Supercomputings past was Microsoft's large presence. They probably had the most floor space of any exhibitor: one large booth for themselves, and another devoted to Microsoft partners. In addition, Bill Gates gave one of the keynote addresses.
From speaking with other attendees, I gathered that Bill's speech went better than expected. I think a lot of people were expecting him to give a "Rah! Rah! Microsoft!" speech; he didn't. Instead, he talked about the benefits of supercomputing, and the dawning of the "personal supercomputing" era (with many CPUs on each desktop). Kyril Faenov did a demo of Microsoft Compute Cluster Solution in action. I think the thing that impressed people the most is that they showed CCS working on a real-world problem and interoperating with a Linux cluster.
As with most conferences, the best part for us was meeting with the people who attended. Going to a conference like this saves you 15 business trips because so many people are in the same place at the same time. The place was just full of partners, potential partners, and potential customers. We had meetings nearly all day both days and got some terrific leads out of it.
Posted by Dan Ciruli at 4:24 PM
Monday, November 14, 2005
Friday, November 11, 2005
One of the coolest things about being at the Global Launch of Visual Studio 2005 and SQL Server 2005 was being able to demo a product that utilizes both: our SDK plugs directly into VS 2005, and our server runs on top of SQL Server 2005.
Not only that, but we're also running on the "not really announced loudly but it's out there now, too" product: .NET 2.0. (As an aside--I wonder why they didn't make a bigger deal about this? I guess developers know about it, and non-developers don't understand or care)
So how many companies can say they have a single product that plugs in to VS 2005, runs on top of SQL Server 2005, and takes advantage of .NET 2.0?
I'm giving a webinar on Thursday next week to show it all off. Sign up here if you're interested.
Posted by Dan Ciruli at 4:33 PM
Thursday, November 10, 2005
WARNING! This post will only be interesting to anyone upgrading Visual Studio Tools for Office projects from VSTO 2003 to VSTO 2005. All fans of distributed computing, you may want to skip this one!
Last week I wrote an entry about some minor difficulties installing VSTO 2005--it was not as painless and trivial as it should have been.
After getting things installed, I immediately tried to upgrade a VSTO project that I had written for Excel. I needed to get it done quickly, because we needed to demonstrate it at the VS 2005 Launch. The upgrade of the VSTO 2003 project did not go as easily as I had hoped.
It seems Microsoft has made some pretty significant changes to both the mechanics and the object model of VSTO projects. Most importantly, they've changed things from a workbook-centric model to a worksheet-centric model. In practical terms, this means you may need to re-architect parts of your solution before it will work.
I opened up my old project in VS 2005, and it offered to upgrade (making a backup, of course). After a few changes I was able to get my project to compile; however, when I tried to run it I got the dreaded "Office document customization is not available" error. Of course, I had no idea why I was getting it; my first assumption was that my project is dependent on two different libraries, and that one of them didn't have permission to load (this was a difficulty in VSTO 2003).
So I went off to my Administrative Tools->Microsoft .NET 2.0 Configuration tool. When using VSTO 2003, I generally used the .NET 1.1 Wizard to give full trust to any libraries loaded by my assembly. However, .NET 2.0 doesn't have a wizard. Instead, I had to use the Configuration tool. The VSTO projects in the .NET 2.0 Configuration Tool are set up differently than they were in .NET 1.1. Formerly, I'd see either a Wizard entry under the Machine->Code Groups->All Code entries, or I'd see an entry under User->Code Groups->All_Code->Office_Projects that was based on the directory where I built my project. In the .NET 2.0 Configuration Tool, projects are listed under User->Code Groups->All_Code->VSTOProjects, and each is listed by its GUID rather than its folder.
Because I had had such difficulties with security when using VSTO 2003, I assumed that my problem here was the same. I was wrong. When I tried tweaking the settings in the .NET 2.0 Configuration Tool, I started getting a new error that made it clear that now I didn't have permission to load that assembly.
Eventually, I gave up and tried recreating the project. Once again, I upgraded my VSTO 2003 project. At Rob's suggestion, I followed the directions in the "How to Upgrade Solutions from Visual Studio Tools for Office" document (I was wondering why the upgrade hadn't told me to read that first, and also why that document is so hard to find when searching in MSDN). I copied and pasted code as directed; some of my code used to get called when the workbook opened, so I put it in the ThisWorkbook_Startup method. Again, it wouldn't open.
At this point I was pretty frustrated, having spent the better part of a day trying to get this working.
Finally, I "started from scratch." Rather than upgrading my old project, I created a new project. When the wizard asked if I wanted a new workbook or a copy of an existing workbook, I pointed to my old workbook. It created a brand new project, with files for each of my sheets (that's another change between 2003 and 2005--there are .cs files for each sheet of your workbook).
I then added references to the DLLs that I needed my project to load; I built and ran--it loaded! Now I just needed to add my code. I copied and pasted most of the code from my old project into Sheet1.cs (this time, I didn't put any code in ThisWorkbook.cs).
Almost everything was ready. The only problem I had was that I had no way to get values from multiple sheets--in the VSTO 2003 paradigm, your workbook had access to all of the sheets; the new object model seems to make that harder. I used to use code snippets like

mwksResultsWorksheet = (Excel.Worksheet)ThisWorkbook.Worksheets["Results"];

then I'd use that worksheet to get values. Now I didn't have access to the other sheets, because I was working within a sheet rather than within a workbook.
Eventually I came upon the Globals class. This class gave me access to all of the sheets, using a construction like this:
Excel.Range cellBetas = Globals.Sheet3.get_Range("cellBeta1", "cellBeta30");
The most inconvenient thing about it is that I have to refer to the sheets by sheet number rather than by their name.
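One possible workaround for getting sheets by name again (an untested sketch on my part--it assumes the standard generated Globals.ThisWorkbook member and reuses the "Results" sheet name from my old code):

```csharp
// Go through the workbook rather than the Sheet1/Sheet2/Sheet3 members,
// so sheets can be looked up by name. Assumes the usual
// "Excel = Microsoft.Office.Interop.Excel" alias.
Excel.Worksheet results =
    (Excel.Worksheet)Globals.ThisWorkbook.Worksheets["Results"];
Excel.Range cellBetas = results.get_Range("cellBeta1", "cellBeta30");
```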
Anyway, at long last, I had my VSTO sample working. It's awesome. Using distributed code behind my Excel spreadsheet, I can have 10 machines sharing in the 700,000,000 calculations necessary for my retirement simulator. All in all, I get 3 minutes of work done in about 20 seconds.
Other than the difficult upgrade process, I actually like some of the new aspects of VSTO 2005. But I'll have more on that later...
Posted by Dan Ciruli at 9:06 AM
Wednesday, November 09, 2005
I may give away my age if I say that one of my favorite parts of the Microsoft Visual Studio/SQL Server/Biztalk Launch was that Cheap Trick played both the keynote session and the party. After 30 years, I have to say, they sound great. I've seen over-the-hill bands before, and this is not one of them: Cheap Trick still rocks. (Rob has some pictures of them here).
But there was so much else going on, it's hard to pick a favorite part. Steve Ballmer gave his customary high energy keynote (available here). There were tons of Microsoft partners there, and it was great to catch up with people from Softagon, Allin, and many others. All of the major hardware vendors were there.
The development communities were there in force, too. INETA, the International .NET Association, was there, as was its local affiliate Bay.NET. Oliver and Bennett from Bay.NET put together tons of great informational programs; I think that both Kim and Rob will be presenting for them soon.
But, as always, the best thing about a Microsoft event is the chance to meet with and talk to the Microsoftians themselves. No, Steve Ballmer didn't stop by the Digipede kiosk. But Kari Martin and the ISV Connect team was there (is it true, Kari, that you put the "kari" in "karaoke?"). Sam Ramji stopped by for a few minutes--we are very excited about the work he's doing; we think that Microsoft is building a great platform for SaaS, and we think we have an important piece of the puzzle for SaaS providers. Dan'l Lewin spent some time with our CEO, John Powers. And Matt Pease and Diana Beckman from the partner program both came by at some point. Many Microsoftians made it a point to come by, say hi, and ask what they could do to help.
And at the party, I let Jason Mauer school me in air hockey ;-). I should have known better than to play against anyone who uses the word "pimpiest" in a blog entry.
So what did I take away from the show? Well, I already knew all about the great new technologies, and how fast SQL Server 2005 is, and how powerful Visual Studio 2005 is. But it's not what you know, it's who you know. Spending a day talking to the folks in charge of Microsoft's partner programs, the Emerging Business Team, and the ISV Connect team--oh, and showing off our latest release to 3,000 potential customers--made for a darn good day.
Posted by Dan Ciruli at 2:46 PM
Friday, November 04, 2005
Just found out I've been accepted to the "MSDN Architects Bloggers" community--because I signed up to be part of the Visual Studio/SQL Server Launch blogging community.
I was part of the PDC Blogger community, and I enjoyed reading other people's posts. It was impossible to go to every session and meet every interesting person there, so having the ability to experience the PDC through other attendees' eyes was really useful. I hope this leads to the same thing.
For those of you who are reading that feed: welcome! And if you're interested in grid/distributed computing on the Microsoft platform, add this feed to your aggregator.
And if you're interested in hearing about my experience upgrading a VSTO project to VS2005, stick around: more on that adventure later today. The good news is that we will have our demo ready for the VS launch on Monday!
Posted by Dan Ciruli at 11:46 AM
At Digipede, we embraced the Microsoft partner program from the day we decided on building a product on their platform. We immediately enrolled in the Empower Program, which was great. It saved us thousands of dollars in license fees.
As soon as we qualified, we became a Certified Partner and then a Gold Certified Partner. Really, the programs are invaluable. Access to lots of software. Access to great people within Microsoft. And, best of all, access to lots and lots of other ISVs and Microsoft partners. As I've blogged about before, they put lots and lots of effort into their partner programs.
But you know what the most useful thing we've found is? It's not the programs at Microsoft--it's the people at Microsoft. There are certain people there who absolutely bend over backwards to help partners. In particular, we've had amazing help from Suzanne Lavine here in the Bay Area, and Kate Bothell up in Redmond. They both do a wonderful job helping partners. They take the time to understand what individual partners' needs are, and to see how they can help. They're willing to step outside of their defined roles and just lend a helping hand. I get lots and lots of e-mails from the different partner programs that I'm in, but 100 of those aren't worth as much as an e-mail from Kate telling me about a program we should be in or an introduction from Suzanne to a vendor she knows we could partner with.
Of course, not everyone at Microsoft is this outward oriented--I've met plenty of people who would barely give me the time of day (or maybe just terse e-mail now and again). Microsoft tries to emphasize the importance of its partners to all employees (walk the halls in Redmond and you'll see posters reminding employees that 96% of Microsoft's revenue comes through partners), but clearly some people grok that better than others.
For us, finding those people who truly embrace partners has been critical. So: thanks Suzanne and Kate!
Posted by Dan Ciruli at 9:14 AM
Wednesday, November 02, 2005
I'm preparing my machine to be our demo machine at the Visual Studio launch event on Monday.
One of the very effective demos that we've done in the past involves Digipede-enabling some .NET code behind a spreadsheet in Excel. It drastically reduces the time it takes to run, and only takes about 20 lines of code to do it. Really, really cool stuff.
So I installed VS 2005 on my machine, only to learn that VSTO isn't part of it. So I downloaded the 400MB VSTO install and installed it. It's a little strange, because it seems to be its own installation (not just an "add on" to Visual Studio).
However, now when I try to open an Office project, I get a message telling me that I need to install Microsoft Office Professional Edition 2003 SP 1. This is a problem, because I have Office Professional Edition 2003 installed already! I have SP 2, though, not SP 1. In any case, I can't actually create or open an Office project.
Has anyone else seen this problem?
Posted by Dan Ciruli at 9:51 AM
There are millions of people out there writing blogs; the variety in terms of both content and quality is staggering. Many people do little more than provide links to other blogs in most of their entries; now that great aggregation services exist, those blogs are mostly a waste of space.
There's one blog I read that stands out in its insight and incisiveness: Nicholas Carr's Rough Type. He writes effectively, and each and every post has good analysis and content. I bring him up today because he has an interesting take on Microsoft's announcements from yesterday about Live Software and live.com.
So add Nicholas to your subscription list.
Oh, look. I just became one of those people who writes a post that just links to someone else's post! More content later; I promise.
Posted by Dan Ciruli at 9:18 AM
Friday, October 28, 2005
For a few months now, I've been hosting a webinar every other Tuesday to talk about the Digipede Network. It's been pretty fun; we've had a variety of people attend (and some have become customers!). I usually open the floor to questions at the end of the session (they run about 30 minutes).
This coming Tuesday I'm hosting a "Grid Computing for Financial Applications" webinar. A lot of the content is Digipede specific (to give people a general understanding of what the Digipede Network is and how it works); however, I will talk specifically about some finance applications, and show one in action.
If you've been reading this blog and would like to see some Windows grid computing in action, register for the webinar here. I'd love to have some attendees from the blogosphere!
Details: Tuesday, 11/1. 10:00 AM PST. Click here to register.
Posted by Dan Ciruli at 2:20 PM
Thursday, October 27, 2005
Hey - have I mentioned that we're going to be showing off the Digipede Network at the global launch for Visual Studio 2005 in San Francisco on November 7th?
I'd tell you to register, but it's sold out...
If you were lucky enough to register early, come find us demonstrating the greatest .NET grid computing solution on the planet--running on .NET 2.0!
Posted by Dan Ciruli at 11:53 AM
Given that Sun, DataSynapse, Platform, etc. all have grid solutions, Microsoft is definitely starting from the back of the pack. From an investment banking perspective, almost every major tier-1 bank created its own cluster-compute-grid application during the last 10 years. With a number of these tier-1 investment banks having already migrated from their home-grown solutions to commercial grid software, it's hard to envisage a bank migrating again to Microsoft's Compute Cluster Solution. If we assume there are a few Microsoft-centric investment banks still with home-grown grids, then Microsoft should be able to sell their product in a few investment banks.
It seems to me that Matt is missing the point about what CCS is all about.
Remember, CCS is an operating system with a set of low-level tools that enable some of the things necessary for scientific, technical, and high-performance computing. (MPI, support for high-bandwidth interconnects, etc.).
Microsoft has a stranglehold on the desktop OS market, but they're making a concerted effort to increase their market share on servers (where they're probably doing better than you think; according to IDC, Microsoft is still shipping OSs on about three times as many servers as Linux). CCS is their move to counter the strong growth that Linux has been making in that space, and to make sure that Windows is a viable alternative to UNIX for scientific and technical applications.
Is an OS enough to do this? Of course not. Microsoft, as always, needs partners to help them in this space. From the beginning, they have fostered partnerships and encouraged developers to work on their platform. They build the foundation, and we do the rest. I'm glad they continue to improve that OS (and the awesome development tools), making it a great platform to develop software on!
Posted by Dan Ciruli at 10:26 AM
Wednesday, October 26, 2005
I'm changing my feed structure today.
For those of you who are subscribing to my atom feed (http://westcoastgrid.blogspot.com/atom.xml), I'm turning it off (unless I get virulently angry comments telling me that you NEED atom or your head will explode)!
Please use my RSS feed (http://feeds.feedburner.com/WestCoastGrid) instead. This'll do a couple of things for me--including making it easier for me to move to my own domain someday.
I'll watch for comments on this, and make the change later this afternoon.
[Update 2:19pm] Robert tells me that Feedburner will work for Atom feeds as well. So change those feeds!
Posted by Dan Ciruli at 11:30 AM
Tuesday, October 25, 2005
Robert and I went to the Geek Dinner organized by Dave Winer last night in Berkeley. Thanks, Dave, for organizing it (and I hope you didn't get stuck with a huge tab!).
It was a huge crowd (well, many more than the 25 who were supposed to attend!), and it was a blast. An interesting crew, including young entrepreneurs and old entrepreneurs, tech analysts and tech writers (and if you ever want to have an entertaining 5 minutes, ask Marc Canter what he's up to; I'm still in awe of that guy's personality).
I loved listening to the patter between Scoble and Steve Gillmor. About 10 of us stood outside the restaurant in the cold--not quite arguing, but not quite agreeing on anything either. Everyone was contributing, but Scoble and Gillmor were the main attractions. Steve has been a respected voice in this industry forever, and he sure doesn't lack for opinions (especially when it comes to Microsoft). Scoble is not a Microsoft apologist by any means (he has no problems saying things like "MSN search sucks"), but he'll stick up for the Redmondonians when they're getting something right.
What impresses me most about Scoble each time I meet him is how much he respects the audience. He's keenly aware that he has an audience because he's honest, and he truly believes that only being honest with his audience will keep them subscribing to his feed. He's also incredibly enthusiastic: he really, really loves technology, and seems to love the fact that he gets to write about it every day. He's not the kind of guy who loves to bash technologies and companies just to show how smart he is. Quite the opposite: I think Robert likes nothing better than finding something cool and sharing it with the world. And that love of technology is what has given him millions of readers.
Scoble (by the way, does that guy sleep? He was blogging until 4 am, then back at it before 8!) said this morning that he and Steve went and sat in a cafe 'til 1:00am, and Scoble finally started getting the gist of what Steve was saying. I would love to have been a fly on the wall there; they were both on fire with ideas.
Posted by Dan Ciruli at 4:42 PM
[Update 4/13/2006]: Fixed egregious spelling error. Can someone start proofreading these for me?
Sam Ramji, always insightful, has a post today on what Microsoft should be doing with regards to Software-as-a-Service companies.
He got together with some major players (Intacct, Echopass, Blue Roads, and Newsgator, among others) and had a good discussion about SaaS and how Microsoft can help these companies succeed.
After describing his meeting, Sam has a call to action: he wants to know how Microsoft can provide broad customer reach for SaaS partners, help with sales and marketing, and provide ways for SaaS ISVs and VARs to connect.
Sam, at least one part of the answer is simple: expand the existing Microsoft Partner programs to embrace SaaS. Add a competency specifically for SaaS. Make it available to ISVs (who will be providing SaaS) and AppDevs (who may be called upon to build SaaS applications for their clients). Let your ISVs and AppDevs know about the third party tools available on your platform that can help SaaS developers--the "picks and shovels" that will help people build innovative, scalable software on your platform. You're right: ISVs, AppDevs, and VARs in this space need to be able to connect. Continue to grow and foster the partner programs, and they'll be able to do that.
And, as an aside: keep your Servers and Tools people innovating! If you want people developing SaaS to choose your platform, you've got to make sure that the development tools and server tools that you're producing are hands-down the best in the business.
Posted by Dan Ciruli at 4:03 PM
Monday, October 24, 2005
Other obligations (my wife's college reunion) prevented me from attending the first day of Code Camp. Although I was happy to accompany my wife on her stroll down memory lane, I missed a ton of good content (the SOA through WS-Policy talk by Derek Harmon, for example, and I would love to have seen Richard Crandall's talk on Apple's ACG).
I flew in Sunday morning, expecting to see some of the sessions before mine (Jason Mauer's talk on rendering and Tim Shakarian's talk on LINQ were promising, in addition to a lot of other great content).
Unfortunately, I was beset with technical problems when I arrived. My machine had trouble with the proxy server and their network, and it took a bunch of help from Robert to get me through it. We both have Skype, which was really useful for technical support.
As a result, I missed the sessions before mine. Although I thought my session went great, I was really disappointed not to be able to see other folks' talks. That's the point of Code Camp, right? I already knew the stuff that I was going to say; I wanted to hear what other people had to say!
I really hope that the Bay Area gets something like this going. We've got plenty of IT-related user groups (E-Big and the Bay.NET User Group, to name a couple, plus a ton of Linux groups). I think the overwhelming feeling from Code Camp Seattle is that it really benefits the developer community as a whole to have these "non-denominational" community meetings.
[Update 11:18] I just realized that Brad Abrams was there, too. Bummed that I missed his talk on reusable class libraries.
Posted by Dan Ciruli at 10:11 AM
Friday, October 21, 2005
As I noted before, I'm off to Code Camp this weekend.
I practiced my talk on my wife last night and my co-workers today over lunch. My wife fell asleep; my co-workers peppered me with questions and helped me a bunch.
Lots of folks are attending; a quick trip around the blogosphere found references here, here, here and here before I got tired of clicking. I'm excited to be in a place with a lot of great thinkers--one of the reasons I loved PDC this year is that the attendees tend to be really smart people, and every conversation I had was interesting. I expect more of the same at Code Camp.
I'm also really appreciative of the opportunity (thanks, Steve and Jason). It was a very useful exercise for me to put together a 65 minute (+/-) presentation on Grid Object Oriented Programming, because it forced me to think about it from the perspective of someone who hasn't done it before.
We've been working hard on the Digipede Network and the Digipede Framework (the development tools) for over two years now. I think about grid and distributed computing all day every day. So I rarely put myself in the shoes of someone who is beginning to think about these concepts for the first time. It's always a useful marketing and educational exercise to do that periodically.
As a result, I feel like I've got a really good introductory talk for developers. It introduces the key concepts, and lets them see some of the fundamentals of designing software for distributed computing.
I got some help with my talk from Kim; she'll be giving talks like this a lot in the coming months so I was glad to get her input.
Posted by Dan Ciruli at 1:54 PM
Tuesday, October 18, 2005
If you have a Google news feed like mine, you already saw this somewhere:
Digipede is proud to announce the release of the Digipede Network Professional Edition! The feature set is largely the same as Team Edition, just more-more-more. More agents, more users, more pools.
It's getting routine around here: Another month, another release (the Digipede Framework SDK came out in September)!
And, coming next month, the Digipede Network 1.2!
Posted by Dan Ciruli at 12:45 PM
Monday, October 17, 2005
A couple of weeks ago, Steve Borg attended one of our webinars (if you're interested, we're having another one tomorrow at 10:00 AM PDT). Steve is one of the talented guys at Accentient; I love the way they describe themselves, because I totally identify with it: "All of our trainers are developers by trade - with one small exception: we can communicate!"
Anyway, Steve called us right after the webinar. He had a few questions, and a suggestion: that I give a session at Seattle Code Camp v1.0. Code Camp is a non-denominational event by coders for coders (in other words, not devoted to a particular language or platform). As they describe it in their FAQ,
The Code Camp Manifesto consists of six points: (1) by and for the developer community; (2) always free; (3) community developed material; (4) no fluff – only code; (5) community ownership; and (6) never occur during working hours.
I'm going to give a 75-minute talk on what we at Digipede call G.O.O.P.: Grid Object Oriented Programming. How is GOOP different from OOP? Well, it's definitely still OOP. But it allows you to take advantage of the grid--your objects will execute on different machines simultaneously.
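The idea is easiest to see in code. Here's a minimal sketch of a GOOP-style worker class--to be clear, this is a hypothetical illustration of the pattern, not the actual Digipede Framework API. The class is serializable, carries only the data its calculation needs, and stores its result in a field so the completed object can be shipped back when it finishes:

```csharp
using System;

// A hypothetical GOOP-style worker class (NOT the Digipede Framework API;
// just a sketch of the pattern). Keep the class small: only the inputs the
// remote calculation needs, plus a field to carry the result home.
[Serializable]
public class PriceTask
{
    // Inputs: set these before the object leaves your machine.
    public double Spot;
    public double Strike;

    // Output: filled in on the remote machine, read back on return.
    public double Result;

    // A grid framework would invoke this on whichever machine
    // receives the serialized object.
    public void Execute()
    {
        Result = Math.Max(Spot - Strike, 0.0);
    }
}

public class Program
{
    public static void Main()
    {
        // On a real grid, each task object would run on a different agent
        // simultaneously; here we just run them in a loop to show the flow.
        for (int i = 0; i < 5; i++)
        {
            PriceTask task = new PriceTask();
            task.Spot = 100.0 + i;
            task.Strike = 100.0;
            task.Execute();
            Console.WriteLine("Task {0}: {1}", i, task.Result);
        }
    }
}
```

The design choice that matters is the same one I keep telling customers about: the object that travels across the network should contain only the data relevant to the distributed calculation, so you're moving kilobytes instead of megabytes.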
My talk is Sunday afternoon at 3:00. If you're in the area (it's being held at the DeVry University campus in Federal Way, WA), come on by! I'll be giving away free copies of the Developer Edition of the Digipede Network.
Posted by Dan Ciruli at 8:15 PM
Wednesday, October 12, 2005
My colleague Robert forwarded me a link he saw on Tech.Memeorandum about a new Linux distro--Ubuntu 5.10, also known as "Breezy Badger." This is replacing the previous Ubuntu 5.04, also known as "Hoary Hedgehog."
According to the article on linux.com, it's pretty good. It doesn't set up drivers for some proprietary hardware and doesn't have a GUI installer, but it gets a positive review. (The reviewer notes that if you want to use the KDE desktop you should use the Kubuntu distro instead).
Aside from having a chuckle over the naming conventions, it got me thinking about one of the unsung dangers of Linux: the sheer volume of distros and their subtle differences. The conventional wisdom with Linux is that it's easier to use and safer than Windows, and that the costs are much lower (of course, there's no license fee, so that part is cut and dried). And, of course, by choosing an open-source OS, you're not locking yourself in to one vendor. What isn't generally spoken about is the difficulty of making your particular software work with your particular flavor of Linux, and the hidden costs therein.
Distrowatch.com tracks 10 major flavors of Linux, and those are just the big ones--there are well over 100 flavors around. Why so many? Simply because it's open source. Anyone is free to create their own distribution of Linux, complete with their own modifications.
The problem is that not all flavors of Linux work the same. Frequently, applications that run fine under one flavor don't run under another--they need to be recompiled (sometimes recompiled and relinked differently on different flavors). Some people don't mind this a bit--heck, I know people who won't run any software on their boxes unless they compile it themselves. On the other hand, the vast majority of people out there don't know what a compiler is, and sure as heck aren't going to compile software themselves.
So what does this have to do with distributed computing? Well, Linux has made huge inroads in distributed computing over the last few years. It has really become the operating system of choice for new clusters. But I think that some of the reasons behind choosing Linux aren't as strong as people think.
The overwhelming reason people choose Linux is to save the cost of the license for the operating system; fair enough, Windows isn't free. But what about the cost of hiring or keeping on staff someone whose job it is to recompile every cluster application so it runs on your particular flavor of Linux? Or, worse (and I've read stories of this happening): changing the installed distro of Linux in order to run different applications on the cluster, then changing it back when you want to run the previous application.
Of course, you could just decide to stick with one distribution. But in a world where not every piece of software will run on that, you've just locked yourself in to one "vendor." Isn't that what you were trying to avoid by choosing Linux?
Now, I'm certainly not trying to say that no one should use Linux. It's a great OS with some great capabilities, and it makes sense in some situations. I'm also not saying that Microsoft has a perfect answer to this; heck, they haven't even released their Compute Cluster version of the OS yet. But one thing you can be assured of: there won't be 100 different distributions. And you won't have to recompile your applications with the correct flags to make them run on it.
Posted by Dan Ciruli at 1:35 PM
Tuesday, October 11, 2005
There are lots of reasons we chose the Microsoft platform for building a distributed (grid?) computing product: .NET is a great tool, there are tons of opportunities out there, and no one else is in the space. But another reason is the great support that Microsoft gives to ISVs.
They have a great Partner Program (including the Empower program for startup ISVs). They actively foster an independent partner group (IAMCP). They have a portal to promote the ISV community at ISV Connect. They have lots of employees who have technical blogs, but they also have blogs aimed at helping ISVs with non-technical issues, too: ISV Chalk Talk and Kari Martin's .NET Blogette help ISVs.
But the other good thing is how much help they give to startups. Microsoft helps startups? That's right! Microsoft has a team that does nothing but help startup ISVs. And they're writing about it, too: just in the last couple of days, I've found good blog entries from Don Dodge and Sam Ramji, and read a good column by Dan'l Lewin. The long and short of it? As a startup ISV, you need all the help you can get. It's great to have such a good partner.
When you look at all those benefits, the decision to build on the Microsoft platform just looks easier and easier. I'm glad we made it.
Posted by Dan Ciruli at 9:41 AM
Monday, October 10, 2005
Over at ADTMag there's an interesting article today on the use of the word "cluster" versus the use of the word "grid."
John K. Waters interviews Donald Becker, co-founder of the original Beowulf project. Becker points out that a lot of people use the word "grid" when describing something a lot narrower than "grid" may imply.
Grid is a concept that involves working with a large number of separately administered machines. With grid, you don’t control the configuration, the operating systems, the libraries installed—anything.
He makes a good, albeit tardy, point. He's tardy because the word "grid" now means so many things to so many people, it's impossible to define. I have yet to attend a grid event that doesn't start with hours of discussion over what "grid" means.
For a long time here at Digipede we avoided the term "grid" entirely, using "distributed computing" instead. And we defined it like this: "combining multiple computers to deliver increased performance on compute-, data-, and transaction-intensive applications." We avoided the term "grid" because we are only working on one OS. After all, our software doesn't run on all operating systems and it doesn't try to hide that fact. So why did we switch and start saying "grid?"
Simply because more people understand what "grid computing" means than understand what "distributed computing" means. A lot of people think that .NET remoting in and of itself is "distributed computing." The terminology becomes more confusing when you add in a term like "utility computing." Becker says:
So-called grid computing solutions for small to midsize businesses are more likely to be utility computing or clustering solutions.
That's just the opposite of how I generally hear "utility computing" used; generally, people use it to mean "just plug into the network and get computing cycles, just like you get electrons." It's the ultimate in transparency, and it's years and years away. Clearly, that's not what Becker means. We have different ideas about "utility."
So what is grid? I've stopped trying to define it. Using multiple CPUs in multiple boxes, you can get more work done faster. Call it grid, call it utility, call it cluster, call it distributed computing, call it what you will. It doesn't matter to me.
All I know is that no matter what you call it: if you're not doing it, your software is running too slowly.
Posted by Dan Ciruli at 3:02 PM
Friday, October 07, 2005
I said "enough Google talk for me...until they announce a grid computing product." And I changed my mind; here's one final(?) post on them.
It's Greg Nawrocki's fault; he has a post today about Service Level Agreements (SLAs), and how important they may become for grid computing.
Greg's having a bad day because of some connectivity problems; this caused him to reflect on the concept of SLAs:
While I cannot currently exchange e-mail, I can use applications local to my computer. If I were part of a grid based SOA environment where my applications may not be local to me I'd be putting pen to paper right now.
Which again leads me to believe that the folks at Sun and Google can't possibly be thinking about making StarOffice available as a service. It just doesn't make sense for an application like that. I do lots of work in office applications when I don't have connectivity.
However, what does make sense?
Well, you've got a company that's a pioneer at selling CPU time by the hour. You've got a company that runs a grid of over 100,000 servers. And, by some accounts, you've got an operating system that has been enhanced to allow ease of distributed computing.
To me, that adds up to, perhaps, the largest grid-for-hire in the world. That may be what Google and Sun are hiding behind the curtain.
Now, as I said here, this doesn't necessarily make the Googlegrid the best thing since sliced bread. First of all, if you're going to use it, you have to be comfortable with your code, your data, your all-important IP leaving your building. And second, you're going to have to rewrite at least some of your code to run on a different OS.
For some people, those obstacles won't be too unpalatable. But for many people, the idea of porting your code just so it can run on someone else's machines won't be quite so attractive.
Posted by Dan Ciruli at 10:53 AM
I saw a post on the GridTech blog that reminded me of an article I meant to write about.
Martin LaMonica had a good article that made the rounds over the last couple of days (and sparked a good discussion on ZDNet).
The interesting thing to me was that Tony Hey, VP of Technical Computing for Microsoft, describes their effort as being focused on data grids rather than compute grids. It's not clear what that means with regard to the Compute Cluster Edition--that seems to be a computing product.
John and Robert are up at Microsoft's eScience Workshop; maybe they'll run into him and get the scoop.
I certainly think that Vista, WPF, and WCF will provide fantastic ways to move and visualize data. I can't wait to release some of the technologies that we're going to be able to provide with that foundation under the Digipede Network!
Posted by Dan Ciruli at 9:23 AM
Wednesday, October 05, 2005
With this news that Apache has released the Beehive, I decided to catch up on some Beehive-related reading.
Although this post from Ashith Raj is a couple of months old, he does a good job of looking at Beehive and the other vendors' extensions to J2EE. Beehive (which was donated to Apache by BEA) is one of many extensions to J2EE that exist today--each vendor-specific.
He goes on to make a good point:
While the JCP continues to bicker over standards, innovation is continuing at the vendor level. As a result, customers will commit to a vendor-specific set of platform technologies or they will pay a huge cost in lost productivity.
It's a great point. And he doesn't mention the other all-important advantage .NET has over Java: you can write it using the world's most productive developer tools.
Seriously: if you're writing software, the quality of your tools is critical. And no matter what you say about the rest of their platform (and, I might add, that platform compares favorably with any other you can come up with), you can't deny that Microsoft's development platform is fantastic.
Posted by Dan Ciruli at 4:34 PM
Brad Feld had a quick post about the Google/Sun agreement. He seems to agree with something I said a couple of weeks ago: why would Google want to compete in that market? It's not what they're set up to do. He said it a lot more succinctly than I did:
I’d think a more effective strategy would be to simply sidestep the whole desktop OS / app thing and just continue to innovate like crazy. Why pick a fight when you don’t need to?
I think people are wrong about what's going on between Sun and Google.
And that's enough Google talk for me. No more on that until they announce a grid computing product.
Posted by Dan Ciruli at 2:57 PM
Monday, October 03, 2005
The next two years of funding will determine a lot of the outcome of grids. More innovation tends to occur outside the established big vendors, and small vendors could get funded and push grid adoption internally at their large enterprise customers.
Exactly! It is great to see someone put grid in this perspective. We're working hard to innovate, and the Digipede Framework SDK is a great example of that innovation. There are certainly no big vendors out there who have anything of the sort.
Not much more to add. It certainly was a positive bit of perspective to start a Monday morning!
Posted by Dan Ciruli at 9:54 AM
Friday, September 30, 2005
Anyone who has ever studied UI (or UX, as they're calling it these days) knows the acronym K.I.S.S.: Keep It Simple, Stupid!
Grid computing guru Greg Nawrocki notes this in his latest post: Complexity...
is one of the primary barriers of widespread adoption. Quite simply, Grid computing needs to be a transparent technology before it is widespread. How many of us would be browsing the web if we had to hand assemble http queries in a telnet window?
All I can say to that is, "And how!"
With all of the attention that has been paid to flexibility, interoperability, multi-OS support, and a lot of the other great features that distributed computing systems have today, it's clear that one thing got dropped off the list: usability. I read a lot of blogs and newsgroups, and I am continually amazed at the number of people who spend their time messing with Perl scripts in order to run distributed processes (and this is on top of the grid software they're running). Perl scripts? That's the moral equivalent of hand-assembling your HTTP queries!
When we started designing the Digipede Network, we had a mantra: "Radically easier to Buy, Install, Learn and Use." We repeated it to ourselves over and over again. We concentrated on ease-of-use in every phase of the product--from how it would be sold, how it would be installed, how developers can work with it, and how it would work for people who have never written a line of code in their lives.
Why? Because we believe that distributed computing has the potential for a much, much wider audience than it has gained so far. Greg points out that "It's refreshing to see the discussion expand beyond the traditional (pharma, financial services, energy) markets" by talking about the 451 Group's new report on the use of grid in the digital media industry. But at Digipede, we see potential beyond specific verticals. We think that distributed computing is a tool that can be used by anyone in enterprise computing.
Anyone can use distributed computing?
If it's radically easier to BILU.
Posted by Dan Ciruli at 3:18 PM
Thursday, September 29, 2005
I made a video for the Show Off event at PDC--you could submit a video of up to 5 minutes in length, showing yourself doing something cool. (I blogged about the video process here, here, and here).
What did I do for my video? I took a spreadsheet that had some .NET code running behind it, and I grid-enabled it. It took about 20 lines of code using the Digipede Framework SDK in Visual Studio. Pretty sweet, actually. It went from running for over 4 minutes to running in about 20 or 30 seconds.
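The SDK calls themselves aren't shown here, so here's a rough analogue of the change (in Python rather than .NET, with a made-up `slow_calc` standing in for the spreadsheet's calculation): the grid-enabling work boils down to swapping a serial loop for a parallel map.

```python
from concurrent.futures import ThreadPoolExecutor

def slow_calc(row):
    # stand-in for the real per-row .NET calculation behind the spreadsheet
    return sum(x * x for x in row)

def run_serial(rows):
    # the original code path: one row at a time, one machine
    return [slow_calc(r) for r in rows]

def run_parallel(rows):
    # each independent row calculation becomes a task; this local
    # thread pool stands in for the grid's pool of agent machines
    with ThreadPoolExecutor() as pool:
        return list(pool.map(slow_calc, rows))
```

Because each row is independent, the parallel version returns results in the same order as the serial loop, with no coordination beyond the map itself -- which is why the real change amounted to so few lines.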
Channel 9 put the videos up; you can see a list of all of them here. If you want to jump straight to mine, it's over here.
Posted by Dan Ciruli at 10:15 AM
Wednesday, September 28, 2005
I saw this post on Grid Today about the network that Force10 set up for the iGrid 2005 conference. It's pretty darn impressive stuff: the Force10 Terascale E1200 supports 56 line-rate 10 Gigabit Ethernet ports and 1,260 Gigabit Ethernet ports.
It's a reminder of one reason why distributed computing continues to make better and better sense. As quickly as Moore's Law predicts CPU speeds will increase (via increased transistor density), bandwidth increases faster still. So does storage density (measured in bits per square inch).
This Scientific American article details how Moore's Law just can't keep up with the relative increases in bandwidth and storage.
This image shows a graph of the performance improvements of CPU speed (doubles every 18 months), data storage density (doubles every 12 months), and bandwidth (doubles every 9 months).
What does this tell us? CPUs are losing. Even though they keep getting faster and faster, the amount of data they can hold locally and the speed with which they can receive data are increasing much faster than their ability to process it.
And one thing we've learned about data is that the more we can store, the more we do store. Today's databases are several orders of magnitude larger than those of just a couple of years ago. No matter what field you are in, you are gathering, storing, analyzing and reporting on much more data now than you ever were. Sensor networks have more sensors. Supply chains have RFID tracking each individual item. Megabyte databases have become gigabyte databases and gigabyte databases have become terabyte databases.
And the rates of improvement in storage and bandwidth so far outstrip CPU speeds that even multi-core technology won't help in the long run. It will provide an extra doubling or two, but that will only make up for a year or two of growth on the storage/bandwidth side.
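Those doubling periods compound quickly. A back-of-the-envelope calculation, using only the doubling times quoted above, shows how wide the gap grows over a single decade:

```python
def growth(months, doubling_period_months):
    # improvement factor after `months`, given a doubling period
    return 2 ** (months / doubling_period_months)

decade = 120  # months

cpu = growth(decade, 18)        # roughly 100x faster
storage = growth(decade, 12)    # roughly 1,000x denser
bandwidth = growth(decade, 9)   # roughly 10,000x more capacity

# bandwidth outpaces CPU speed by about two orders of magnitude
assert bandwidth > storage > cpu
```

An extra doubling or two from multi-core shifts the CPU curve up, but it doesn't change its slope -- which is the whole point.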
So, where does that leave us? Distributing our processing, of course. The answer isn't faster chips: it's more chips brought to bear on the problem. It's the ability to coordinate many machines to work on those huge datasets. The increasing speed of the network and storage make it more practical than ever to move bits before acting on them so that more work can be done in parallel.
And all indications are that these trends will just continue. More storage, more bandwidth, and the need for more computes.
Posted by Dan Ciruli at 3:15 PM
Sunday, September 25, 2005
Several of my colleagues and I will be off at the Microsoft Technology Center in Mountain View over the next few days, putting the Digipede Network on lots and lots of machines. It should be fun. Posts might be fewer and farther between, because I'll have less time to spend reading and pondering.
Technical note: I've tweaked my blog a bit; the center column is now wider (I'm wordier than I ever thought I'd be!). I also changed the rules for posting comments; you don't have to be a Blogger member anymore, but you do have to type a word for verification. Maybe that'll shake loose a few comments from a reader or two!
Posted by Dan Ciruli at 12:13 PM
Friday, September 23, 2005
Looking over the lists of Google's innovations that Stephen E. Arnold details in The Google Legacy, I found this quote:
Another key notion of speed at Google concerns writing computer programs to deploy to Google users. Google has developed short cuts to programming. An example is Google's creating a library of canned functions to make it easy for a programmer to optimize a program to run on the Googleplex computer. At Microsoft or Yahoo, a programmer must write some code or fiddle with code to get different pieces of a program to execute simultaneously using multiple processors. Not at Google. A programmer writes a program, uses a function from a Google bundle of canned routines, and lets the Googleplex handle the details. Google's programmers are freed from much of the tedium associated with writing software for a distributed, parallel computer.
A great idea!
We had the same idea when we created the Digipede Framework SDK. A developer who needs to scale or speed up his application (whether it's a web app, an n-tiered app, or something that handles many transactions) doesn't want to become a master of distributed computing. Sure, it's not rocket science these days to start a process on another machine. But what happens when machines go down? What happens when new machines come online? How do you install the right software, guarantee execution, reassign tasks as necessary?
Quickly, this becomes much more complicated than the program the developer was improving in the first place.
This is exactly why the Digipede Framework SDK is so valuable. It frees developers from thinking about the vagaries and subtleties of moving processes around the network. It lets them spend their time working on their software. And it gives them the speedup or scalability they need in a fraction of the time it would have taken to do it themselves.
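None of the names below come from the Digipede API -- this is just a sketch, in Python, of the kind of bookkeeping those questions imply: when a machine dies mid-task, the task gets reassigned rather than lost.

```python
def run_with_reassignment(tasks, workers, max_attempts=3):
    """Run each task; on a worker failure, reassign the task elsewhere."""
    results = {}
    for task_id, work in tasks.items():
        for attempt in range(max_attempts):
            # pick a worker; after a failure, rotate to the next one
            worker = workers[(task_id + attempt) % len(workers)]
            try:
                results[task_id] = worker(work)
                break  # task succeeded; move on to the next task
            except Exception:
                continue  # worker died mid-task; try the task elsewhere
        else:
            raise RuntimeError(f"task {task_id} failed on every worker")
    return results

def good_worker(x):
    return x * 2

def flaky_worker(x):
    raise ConnectionError("machine went down")

# task 0 lands on the flaky worker first and is quietly reassigned
results = run_with_reassignment({0: 10, 1: 20}, [flaky_worker, good_worker])
```

Even this toy version needs retry limits, worker rotation, and failure accounting -- and it still ignores installation, data movement, and machines joining mid-run. That's the tedium the SDK is meant to absorb.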
Posted by Dan Ciruli at 3:12 PM
Thursday, September 22, 2005
Any numbers hound (and Rob's four-year-old son) knows that a googol is a 1 with a hundred zeroes after it (10 to the hundredth power), and a googolplex is a one with a googol zeroes after it (10 to the googolth power).
After thinking about buying The Google Legacy by Stephen E. Arnold, I now know what a Googleplex is.
A dollar sign with a 180 after it.
$180? For a PDF? Wow. Um. Thanks for the free chapter.
I see lots of people who hit this post after searching on "What is a googleplex?". To you people, I have two things to say:
1. A googol is 10 to the hundredth power:
10,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000
A googolplex is 10 to the googolth power; I'm not going to write that number here. It's huge.
2. Don't type "what is a" into Google. It doesn't help your search. Just type "googleplex".
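For the curious, the arithmetic is easy to check (Python's integers are arbitrary precision, so a googol fits comfortably):

```python
googol = 10 ** 100
assert len(str(googol)) == 101          # a 1 followed by 100 zeroes
assert str(googol) == "1" + "0" * 100

# a googolplex is 10 ** googol: its decimal form has a googol + 1
# digits, far more than could ever be stored, let alone printed
```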
Posted by Dan Ciruli at 9:29 AM
- Google's computing platform -- named the Googleplex by Arnold after the name given by the company to its Mountain View headquarters complex -- is a better (faster, cheaper and simpler to operate) computer processor and operating system than systems now available from competitors. Its price advantage is five or six to one over other hardware. Massively parallelized and distributed, its processing capability can be expanded indefinitely. As a virtual system or network utility, the user simply faces no need for backup or setup or restore.
- Google has re-coded Linux to meet its needs. This recoding enables Google to deploy numerous current and future applications -- 50 or more -- without degrading performance.
- Google products have the potential to be assembled into a version of MS Office -- including word processing -- and many other applications.
I'm only one chapter in, but I see flaws in his arguments (along with some great points and fascinating insight).
Google is a very interesting company that has always done things very differently than others, and they have consistently created fantastic technology.
However, this doesn't make them "about to unseat Microsoft from its throne."
Google has deployed its own version of Linux. That's great. But what they have is a purpose-built operating system. It may be built to accept many different applications (they're running dozens of public applications and who knows how many secret ones), but, in all likelihood, it was not built to run on every desktop, server and cluster node in an enterprise. It would surprise me greatly to find out that Google has any interest at all in making an all-purpose operating system. One of the hardest things about making any public OS is the hundreds of thousands of drivers that have to be written to handle peripherals; does Google want to get in the business of making sure every printer on earth works with their OS?
While Google has been making an increasing number of applications available to the public (and allowing webservices to get at some of the data), they have not made a general application framework available to the public. And here is where I think Arnold goes too far in his appraisal of Google's reach. For an enterprise distributed computing system, enterprises want an OS that is deployed within their enterprise. They want an OS that they develop on all the time. And they want their IP (their software and their data) to stay within their walls.
Google is creating an architecture and OS that are stretching the capabilities of distributed computing and, without a doubt, proving that the power of commodity machines is immense and scalable. But I wouldn't go looking for enterprises to replace their existing platform operating systems with Googleplex anytime soon.
Posted by Dan Ciruli at 8:26 AM
Wednesday, September 21, 2005
Paul Strong of Sun has a very good article in the most recent issue of ACM Queue. His subject (and the subject of the entire issue) is enterprise grid.
He makes some great points. He starts by looking at the enterprise data center.
Today’s data center could be viewed as a primordial enterprise grid. The resources form a networked fabric and the applications are disaggregated and distributed. Thus, the innate performance, scaling, resilience, and availability attributes of a grid are in some sense realized. The economies of scale, latent efficiency, and agility remain untapped, however, because of management silos. How can this be changed? And what is the difference between an enterprise grid and a traditional data center?
This is exactly right. The data center has great potential for grid-enablement. He points to Microsoft's Dynamic Systems Initiative and Sun's N1 Software as examples of how the data center is evolving. But Microsoft is quick to point out that DSI is an initiative, not a product:
The Dynamic Systems Initiative (DSI) is a commitment from Microsoft and its partners to help IT teams capture and use knowledge to design more manageable systems and automate ongoing operations, reducing costs and freeing up their time so they can proactively focus on what is truly important.
Similarly, N1 is software that helps run a datacenter.
What neither of them does is dynamically take advantage of the compute resources in a datacenter to ensure that the tasks that need compute power are receiving it, and that otherwise-idle machines are lending compute power when they can.
Paul's article spends a lot of time talking about why people might want a grid, then extols the benefits of virtualization, seamless use of heterogeneous resources, "holistic architecture," and abstraction. These are all long term benefits that people are looking forward to; organizations like the Enterprise Grid Alliance and the Global Grid Forum are creating standards to ensure that these goals become reality.
However, those goals are fairly abstract and, for most organizations, not the place to start experimenting with grid computing. Above all, the promise of grid is what Paul Strong notes on the first page of his article:
You can apply far more resources to a particular workload across a network than you ever could within a single, traditional computer. This may result in greater throughput for a transactional application or perhaps a shorter completion time for a computationally intensive workload.
Oddly enough, this is in the section of the article entitled "Hype." Strange, because this is the promise of grid that can actually be realized today. Grid computing is about making things go faster, and people are doing this now. It isn't hype. It's been happening for a few years with a few different Linux and UNIX solutions; it's happening now with Windows (because of solutions like the Digipede Network).
I look forward to the hype of grid computing--millions of PCs available just by plugging a computer in the wall, massive virtualization, seamless repurposing of millions of heterogeneous machines. But I also like where grid is now: tapping dozens, hundreds, or thousands of machines in the enterprise, making slow things happen faster.
Posted by Dan Ciruli at 9:57 PM