Thursday, July 27, 2006

All's Well that Ends Well, and Why Put a Grid behind Your Spreadsheet in the First Place

All's well that ends well, said Wm. Shakespeare, and I'm pretty sure that if he were around today he would have agreed that my Excel Services demo ended well. For those of you who have followed my continuing series, I had a sometimes-difficult but always-interesting time taking an interactive spreadsheet with .NET code behind and porting it to an Excel Services spreadsheet with a managed UDF.

In the end, the spreadsheet looked great, the UDF was easy to run on a grid (or cluster), and we got a snazzy looking demo out of it (not to mention terrific use of our own product).

But the question is: why do you need to put a grid behind Excel Services in the first place?

Well, there are a couple of reasons. First, you may simply have something that can be broken up and run faster. In the sample I did, the UDF (User Defined Function) was pricing bonds for different interest rates--each bond takes somewhere between 1 and 5 seconds to price. Because each bond can be priced individually, they can be done in parallel. I simply wrote a class that prices one bond, then I instantiated one object for each bond I needed to price and the Digipede Network took care of executing them in parallel. On my cluster of 10 machines I routinely saw a speedup of at least a factor of 10 (a couple of my machines are dual proc, so they were shouldering twice the load of the other machines).
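The pattern is embarrassingly parallel: one pricing routine, many independent inputs. Here's a minimal sketch of that structure in Python (the demo itself was .NET code submitting tasks to the Digipede Network; the simple discounted-cash-flow pricer and the thread pool here are illustrative stand-ins, not the actual demo code):

```python
from concurrent.futures import ThreadPoolExecutor

def price_bond(coupon_rate, years, market_rate, face=100.0):
    """Price one bond as the present value of its cash flows.
    (A stand-in for the 1-to-5-second pricing model in the demo.)"""
    coupon = face * coupon_rate
    pv = sum(coupon / (1 + market_rate) ** t for t in range(1, years + 1))
    return pv + face / (1 + market_rate) ** years

# One independent work item per bond: (coupon rate, years to maturity).
bonds = [(0.05, 10), (0.06, 5), (0.04, 30)]
market_rate = 0.05

# Because each bond prices independently, the work items can be handed
# to a pool of workers -- or, on the Digipede Network, to grid nodes.
with ThreadPoolExecutor() as pool:
    prices = list(pool.map(lambda b: price_bond(b[0], b[1], market_rate), bonds))
```

A bond whose coupon equals the market rate prices at par, which makes a handy sanity check; the grid version simply replaces the local pool with task submission to remote machines.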

But there's another, more broadly applicable reason: moving compute load off of your SharePoint server.

Excel Services and SharePoint Server do a wonderful thing: they make it really easy to make powerful spreadsheets accessible to many people. It's the same conundrum we see in SOA and in SaaS: if you make something available as a service, you subject yourself to widely varied levels of usage. What happens if many people use that service at the same time?

Imagine this scenario: A developer in a brokerage house has written a spreadsheet that has a really cool UDF: the user types in an account number, and the UDF calls up the portfolio for that account, then does an optimization on that portfolio, then Excel Services renders a spreadsheet with numbers, charts, etc. The whole thing runs for about 20 seconds.

That spreadsheet is published up to the SharePoint server so all of the brokers in the firm can access it when their customers call. This all sounds great so far--but then the brokers start using it.

All of a sudden you have many portfolio optimizations running on the SharePoint server simultaneously. Everything slows down, and everyone's frustrated. In addition to the normal load that the SharePoint server is running, it is now running 20 seconds of portfolio optimization for each render of that spreadsheet. Performance for the entire system goes downhill.

Now, imagine taking that code in the UDF and running it on the Digipede Network. Simply by adding about 20 lines of code, that UDF calculation can run on a cluster (or grid) instead of on the SharePoint server. It lets the SharePoint server keep doing what it's doing (serving web pages, running Excel Services), moves the compute-intensive process onto a cluster or grid, and thus creates a very scalable solution.

Bear in mind, this doesn't involve "breaking up" your UDF or figuring out how to parallelize your code: each UDF call is running on one machine. But many UDF calls are running in parallel on different machines.

It's very similar to what I wrote in February about web services. When you make something available as a service, you need to be prepared to address the scalability issues involved.

It all adds up: Digipede + Excel Services UDFs = Crazy Delicious

Tuesday, July 25, 2006

Why Beta Software Is Hard To Use, and How A Huge Company Still Listens to the Little Guy

This is the umpteenth post in my series on porting a .NET-laden, grid-enabled spreadsheet to run on Excel Services. In my last post (Adapting a Spreadsheet to Excel Services), I discussed the process that took me from a complex spreadsheet with .NET code-behind to an Excel Services spreadsheet calculating in SharePoint 2007.

After I had calculation happening (including running a job on the Digipede Network, gathering results, and bringing the data back into Excel), I just had to dress up my spreadsheet with some pretty colors and some charts. It sounded easy enough, but it took two agonizing days.

Betas Have Bugs
The first problem I had turned out to be an honest-to-Pete bug in Excel Services. As I detailed last time, my User Defined Function was returning an 8x264 array of data into my spreadsheet. My plan was to have 2 charts, just like the spreadsheet I was modeling. In my original spreadsheet the 2 charts were each dynamically updated while the job ran on the grid. Unfortunately, due to the architecture of Excel Services, I couldn't have dynamically updating charts, but that was no big deal.

Using Excel 2007, I made a spreadsheet look just like my original spreadsheet--the charts used the return data from my User Defined Method as their source data. When my spreadsheet looked passable, I published it to SharePoint. It looked pretty good! I entered a number of bonds and pressed the "Apply" button. This launched a job on the grid as expected.

I waited for the results to come back. It should have taken 20 or 30 seconds, and I had been waiting for several minutes. I checked Digipede Control to see the status of the job--and I noticed that there were several jobs in the system. There was only one job running, but as soon as it completed, another job started. I cancelled it--and another job started. I cancelled it--and another job started.

My spreadsheet was out of control! It was starting jobs over and over again, as if it were stuck in a loop. I had to bounce IIS on the SharePoint server to get it to stop submitting jobs.

I was flummoxed. The logic in my spreadsheet had been working fine--I had just added some charts and a "pretty" front page. Why was it launching jobs repeatedly?

I dashed an e-mail off to Shahar Prish, a member of the Excel Services team whose Cum Grano Salis blog had been very informative when I was designing this whole thing. As I tried to get more information, I realized that my User Defined Method was getting called exactly 2,112 times. What? Some Rush fan's idea of an Easter Egg? No--I was returning an 8x264 array: exactly 2,112 values. My method was getting called for each cell in the array. Definitely bad.

To my surprise, Shahar e-mailed me back within minutes asking for more details and a sample. I couldn't believe it! This is a 60,000 employee company, and a product that's used by millions of people! I had e-mailed a developer and within minutes he was corresponding with me. Very, very impressive. I credit Scoble for helping to create such an open, blog-friendly, customer-friendly environment.

I packaged up a sample that exhibited the behavior (pulling out all Digipede code so he could run it on his machine). While I was pulling this together, I realized something very strange: my method was getting called multiple times only if a cell that preceded it in the spreadsheet was dependent on the results.

Let me repeat that. If cell A1 was dependent on my results (say they were in A2:H264), my method was called repeatedly. If cell J10 were dependent on my results, the method would only get called once (as expected). This was very strange behavior.

I sent the sample to Shahar for him to investigate. But, in packaging up the sample, I had found the workaround to my problem: don't have anything before the array be dependent on the array! Easy enough--I dragged my "results" sheet to the front of my workbook. Like magic, everything was working again!

[Note: I received an e-mail from Shahar the next morning. I had indeed found a bug in Excel 2007 beta 2! Funny thing, though: the developers had found that bug the same night I had. Full story here on Shahar's blog.]

Not all bugs are easy bugs
From there on out, I used an iterative process of tweaking my spreadsheet (just UI stuff), publishing it to SharePoint, and tweaking it again. And I soon experienced more bad Excel Services behavior.

Periodically, when I published my workbook to Excel Services, I would get the following message:

Unable to open Workbook
The file you selected cannot be opened because it is corrupt, protected by Information Rights Management, or in a file format not supported by Excel Services. Excel 2007 may be able to open this file. Would you like to try and open this file in Excel 2007?
Similarly, at periodic intervals when publishing, I would see this message:
Excel Web Access
An error has occurred. Please contact your system administrator if this problem persists.
Over time, this grew infuriating. I was only editing these files in Excel--there was no reason for them to be "corrupt." Each time, they could still be edited in Excel 2007, but Excel Services/SharePoint would balk. I would have to start over from scratch designing my workbook.

Eventually, I kind of gave up on Excel Services' ability to serve as a repository for my workbook. Instead, I worked from my local copy. I would make an edit, then publish it up to Excel Services. I'd look to see what needed tweaking. But instead of editing the copy in SharePoint, I'd edit the local copy on my hard disk (making sure I made a copy first, every single time). Then, I'd publish it to Excel Services, overwriting the previous version. If I published it and got an error, I'd just go back one revision.

It was tedious, and it took me 50 copies to get things right, but in the end the spreadsheet looked terrific. I even threw in some conditional formatting to show off some of the Excel 2007 features.

All in all, it was a great experience. I knew I was working with beta software, so having problems didn't really bother me. The demo looked great, and I got it to my partner on time. I loved Excel 2007. I was fairly impressed with Excel Services, although there are clearly still some bumps in the road.

And I was very impressed with the responsiveness of the Excel team!

Monday, July 24, 2006

Adapting a Spreadsheet to Excel Services

As part of my continuing series of posts on Excel Services, I'm going to relate my experiences with taking a (fairly complicated) spreadsheet and adapting it to run on Excel Services.

First, you should know that this was not a run-of-the-mill, typical spreadsheet. As far as I can tell, taking a "normal" spreadsheet and publishing it to Excel Services would be simple and effective. However, this was far from a simple spreadsheet.

This is the spreadsheet I wrote about in my Kicking a Half-KLOC post: a complicated spreadsheet that takes a portfolio of callable bonds, calculates each of their values under a variety of interest rates (using a grid to do so), then takes the results of those computations and calculates total portfolio value and some value at risk numbers. It's several steps of computation, one of which iterates through many bonds (starting a job on the grid in the process), and the later calculations depend on the results of the earlier calculations.

Whew.

As I learned about Excel Services, I realized that my spreadsheet would need some serious rearchitecture--both from a logic-flow perspective as well as a physical (which information lives on which sheet in the spreadsheet) perspective.

This spreadsheet had originally been written with .NET code behind (hurray, VSTO!). The user clicked a button to start the job, and the .NET code behind would pull values out of certain cells in the spreadsheet, then launch a job on the grid. As results came back from the grid, the data from those results were posted in different cells on different sheets (there were results for each individual bond, as well as efficiency numbers that monitored the efficiency of each node on the grid). After the entire job completed, the calculation moved into a second phase that took results from the cells that had been filled in the first phase and calculated final portfolio value and value at risk.

I learned about using User Defined Functions in Excel Services, and I verified that it would be simple to start a job on the Digipede Network from within a UDF. Great--I thought that might be the biggest hurdle.

However, I quickly realized that I was going to have to make significant changes to the way the code behind the spreadsheet works. When working with a spreadsheet in Excel Services, the spreadsheet is only evaluated one time. The user is presented with an HTML-rendered version of your spreadsheet, fills out some input values, then clicks an Apply button. When the Apply button is clicked, if you have User Defined Functions in your spreadsheet, those functions get called. However, it isn't an iterative process. Whereas I had had a very iterative process (events were getting raised, values were being set in cells, other events were getting raised, values were being pulled from cells, other calculations were started), I realized I needed a much more rigorously defined process.

I needed to put everything into one User Defined Method: calculate the value for each bond, then total up the portfolio, then calculate the value at risk. And I needed to return all of that data at once: the information for each bond, the total portfolio information, the value at risk, and all of the efficiency information.

This was a significant change for me. Rather than have a step-by-step, event-driven process that stored intermediate results in the Excel spreadsheet itself, I needed to create a single method that did all of my calculation, then returned all of my data.

That wasn't too much work. I basically had to take three methods in my original code-behind and invoke them sequentially (storing the return values myself rather than using Excel cells to store them along the way). Pretty easy, actually. At the end of my method, I created a huge two-dimensional array of objects (8x264, actually) and returned those. The first row of results had any messages for the user. The second row had the final values for my computation. The next 32 rows had data on the first 32 nodes that worked on the job (efficiency, number of tasks completed), and the next 30 rows had an array of interest-rate-to-portfolio-values. The last 200 rows had data on each bond in the portfolio.
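Conceptually the return value is just rows stacked in a fixed order. A sketch of that layout (in Python, with made-up placeholder rows; the real code built a .NET object[,]) shows how the row counts add up to 264:

```python
NUM_COLS = 8  # every row is padded to the same width

def pad(row):
    """Pad or trim a row to exactly NUM_COLS cells."""
    return (list(row) + [None] * NUM_COLS)[:NUM_COLS]

def build_results(messages, finals, node_stats, rate_curve, bond_data):
    # Row 0: user messages; row 1: final portfolio values;
    # rows 2-33: per-node efficiency data; rows 34-63: rate/value pairs;
    # rows 64-263: per-bond results.
    rows = [pad(messages), pad(finals)]
    rows += [pad(r) for r in node_stats]   # 32 node rows
    rows += [pad(r) for r in rate_curve]   # 30 interest-rate rows
    rows += [pad(r) for r in bond_data]    # 200 bond rows
    return rows

results = build_results(
    messages=["OK"],
    finals=[1_234_567.89, 0.042],                     # e.g. value, VaR
    node_stats=[["node%d" % i, 0.95, 12] for i in range(32)],
    rate_curve=[[0.03 + 0.001 * i, 1.0e6] for i in range(30)],
    bond_data=[["bond%d" % i, 101.5] for i in range(200)],
)
```

1 + 1 + 32 + 30 + 200 = 264 rows of 8 columns, matching the array-formula region in the spreadsheet.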

Next, I had to deal with getting the data into that method and then back to the spreadsheet.

Creating the method was easy: I knew the inputs to my algorithm. So my method call looked like this:
public object[,] PerformCallableBondsCalcs(int numBonds, int numSimulations, double r0, double vasicek_a, double vasicek_rbar, double vasicek_sigr, string url, string username, string password, object[,] bondInfo)
Ten arguments, including a two-dimensional array of bond information.

To invoke that method from a single cell is easy:
=PerformCallableBondsCalcs(A1, A2, A3, A4, A5, A6, A7, A8, A9, Bonds!A1:C200)


But I didn't need the answer in a single cell! My method was going to return two-dimensional data, not zero-dimensional data. Some quick research into Excel taught me how to return multi-dimensional (well, one- or two-dimensional) data from a function call.

First, highlight the group of cells where you want the data to end up (in my case, an 8 by 264 cell region). Then, in the formula bar, type your method call (see above). Then, hit CTRL-SHIFT-ENTER. Your formula is entered in all of those cells, with curly braces around it (like this):
{=PerformCallableBondsCalcs(A1, A2, A3, A4, A5, A6, A7, A8, A9, Bonds!A1:C200)}


Now, I had the architecture I needed:
  1. All of my logic was going to happen sequentially in my method call
  2. I wasn't storing intermediate results in Excel; I was keeping them locally in my code
  3. I had all of my inputs going into a single method call
  4. My method was returning all of my results in a two-dimensional array
  5. The two-dimensional array was populating a region of my spreadsheet


I put it all together and...it worked!

This was exciting: I had allocated several days for this, and it took me less than one.

I thought that the hard part was over. All that was left was to format my spreadsheet to take all of that returned data and make it look good. Little did I know that all of the hard parts still lay in front of me...

More in the next installment: Why Beta Software Is Hard To Use, and How A Huge Company Still Listens to the Little Guy.


Friday, July 21, 2006

What Is Excel Services 2007, and What Is a User Defined Function and Why Should I Care?

As I said in my previous post, I've been up to my ears in Excel 2007 and Excel Services for the last week or so. Here's the first in my series of posts describing my experiences.

First of all, what is Excel Services? If you really want to hear this from experts, go read Shahar (Cum Grano Salis) and David's (Excel 2007) blogs. They work on the team, so they both have great insight. Shahar also has very good programming tips, and photos of a soccer ball about to be born.

Here's my outsider's take on what's going on with Excel Services. Microsoft has done a bunch of work to add capability to SharePoint Server 2007 to integrate with Excel. You can now take a spreadsheet, publish it to SharePoint, and have SharePoint make that spreadsheet available in 3 ways:
  1. Share the document itself (SharePoint has been able to do this for a while)
  2. Make the spreadsheet available as a web service. You designate certain fields in the spreadsheet as inputs, certain fields as outputs, and you publish it to SharePoint. The logic behind your spreadsheet is now available as a web service. Cool stuff.
  3. SharePoint can now dynamically render that spreadsheet as HTML in a browser. Again, you can specify certain fields as inputs. When your users view the spreadsheet in their browser, they can enter the inputs. Excel Services recalculates the spreadsheet based on their inputs and re-renders it to HTML.
This last thing is really cool stuff. It means that you can now give anyone in your organization access to a spreadsheet without e-mailing it around, while maintaining control over it and without worrying that people will mess it up.

Imagine you're a loan company, and you have different loans you can offer. You write a spreadsheet that models your loans, and you publish it up to your SharePoint server. Now, when any of your CSRs is on the phone with a potential client, s/he can use a browser, open the loan spreadsheet, and fill in the principal, term and interest rate of the loan. Heck, you could get fancy and write a UDF (see below). Excel Services would calculate the loan, providing instant numbers for payment size, total interest paid, total cost of the loan, etc. And that whole thing was accomplished without a programmer getting involved.
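The arithmetic behind such a loan sheet is the standard annuity formula, which is easy to sanity-check outside Excel. Here's a quick sketch (the function name and numbers are mine, not from any actual loan spreadsheet):

```python
def loan_summary(principal, annual_rate, years, payments_per_year=12):
    """Payment size, total cost, and total interest for an amortized loan."""
    n = years * payments_per_year          # number of payments
    r = annual_rate / payments_per_year    # per-period interest rate
    payment = principal * r / (1 - (1 + r) ** -n)  # annuity payment formula
    total_cost = payment * n
    return {"payment": payment,
            "total_cost": total_cost,
            "total_interest": total_cost - principal}

# A $300,000 loan at 6% for 30 years -> roughly $1,798.65/month.
summary = loan_summary(principal=300_000, annual_rate=0.06, years=30)
```

These are exactly the figures (payment, total cost, total interest) the CSR's browser view of the published spreadsheet would display.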

Better yet, you still have total control over that spreadsheet. If your company's loan fee structure changes, you can modify your spreadsheet and republish it. There's no need to worry that everyone has the latest version, there's no e-mail that clogs your server with new spreadsheets, there's no frantic deleting of old versions. Again, all without a programmer. It's very cool.

So, that's Excel Services. What's all this about User Defined Functions?

Well, as powerful as it is, Excel 2007 can't do everything. Some things require a programmer. A User Defined Function allows a programmer to write a function that is available within formulae in Excel. In other words, if I write a UDF called MyCoolFunction, I can type "=MyCoolFunction(A1:B3)" into a cell, and it will evaluate MyCoolFunction (which exists in a DLL) when that spreadsheet is calculated in Excel Services.

This takes that previous functionality and makes it even more valuable. If you need a developer to write a function that gets used in your spreadsheet, he can do it. If that method already exists in a DLL, it may be as simple as adding an attribute ([UdfMethod]) to a particular method in order to make it accessible from Excel. The developer doesn't have to mess with putting VBA behind your spreadsheet (in fact, you can't put VBA behind a spreadsheet that will be rendered by Excel Services). The developer doesn't have to mess with your spreadsheet at all. Any spreadsheet on the server can use the method. It's very powerful.

So, back to the loan company example, let's say your company has developed an algorithm for credit approval. It's too complicated to model in Excel, and it involves a web service call to a credit bureau. You have a developer write a method that takes in some info (maybe a social security number, income, total asset value, debt value, etc)--and now you can use that method in the loan calculator spreadsheet that every one of your CSRs uses. I'm telling you, this is powerful stuff.

Ok, the stage is set. One of my good friends at Microsoft asked me to take one of their existing demonstrations (one that runs a financial model behind Excel 2007 on a cluster using the Digipede Network) and port it to run on Excel Services. I'm an optimist, and I'm burdened with overconfidence. I told him, "No problem," even though I had only a week to prepare it and had never so much as seen a SharePoint site, had no idea what a UDF was, and had never worked with Excel Services.

That's a big promise. I delivered it to him yesterday, but it was a stretch. Coming next: what it takes to go from Excel to Excel Services.


Wednesday, July 19, 2006

Gettin' It On with Excel Services


Ever since I returned from Microsoft's Worldwide Partner Conference, there have been two constants in my life: I've had lunch every day (a welcome change), and I've been working almost exclusively in Excel 2007 and Excel Services.

I had promised one of my friends at Microsoft that I would take one of his demos and convert it to Excel Services. (Why was I doing this rather than someone at Microsoft? This demo happens to be a grid-enabled spreadsheet that runs on the Digipede Network.) I had to learn a lot (I had never used SharePoint before, and Excel Services and SharePoint are inextricably intertwined). I also had to rearchitect the spreadsheet significantly.

It was an adventurous week that, unfortunately, left me no time for blogging. But I've learned a lot, and I plan on documenting it all. Look for three upcoming posts:
  • What Is Excel Services 2007, and What Is a User Defined Function and Why Should I Care?
  • Adapting a Spreadsheet to Excel Services
  • Why Beta Software Is Hard To Use, and How A Huge Company Still Listens to the Little Guy
  • All's Well that Ends Well, and Why Put a Grid behind Your Spreadsheet in the First Place


  • Update 2007-05-09: Added links to the later posts for easier navigation!

    Photo credit: andysteel

    Thursday, July 13, 2006

    Facts and Figures from WPC06

    As I've been attending keynotes and breakout sessions, I've been jotting down notes. As I sit here on my final morning, a few figures jump out at me.

    First and foremost, .NET market penetration. I've been hoping to hear some numbers surrounding this, because recently someone expressed to me that he didn't think a distributed computing solution could ever succeed on the Microsoft platform because no one does "serious" development in .NET—he thought that .NET work was all GUI work, while everything behind the UI is written in Java. Yesterday Sanjay Parthasarathy put up a lot of numbers in his slides, but the one I wrote down was: Middleware Solutions Technology: .NET 60%, Java 36%. This was from a large IDC study last year (and those are global numbers; .NET is even more successful in North America and APAC). I'm not trying to say that .NET is a Java-killer; I'm just saying that anyone who thinks that .NET isn't being used for serious development needs to look at what's happening out there and re-evaluate their opinion.

    During his keynote, Andy Lees gave us some numbers related to servers, clusters, and OS penetration. As I wrote about in an earlier post, Microsoft is doing very well in the server market. However (and not surprisingly), they're losing badly in HPC and clusters. According to their own numbers, Microsoft currently has around 6% of the HPC market. Even more interesting was the fact that 40% of the servers that are sold with Linux on them go into clusters—that's a huge number. It will be interesting to see what Kyril Faenov and the HPC group at Microsoft can do to try to gain share there.

    Well, that's it from WPC06. I'm off to Logan to get home. By the way, those of you who have been closely following the "meals Dan got in Boston," here's the final scorecard:
    • Number of times I didn't get breakfast because they ran out: 1
    • Number of times I didn't get lunch because they ran out: 2
    • Number of times I had appetizers for dinner because that's all they gave us: 2
    • Number of Gold Certified Partner lunches I attended where they ran out of food: 2
    You'll be happy to know that at today's Gold Certified Partner lunch, I arrived early and was happy to sit down to a lunch. Thanks, Allison!


    Wednesday, July 12, 2006

    Allison Watson owes me lunch


    Two, actually. Microsoft is 0 for 2 when it comes to having enough lunch for attendees here at the Worldwide Partner Conference. Short version of the story: lunch was scheduled from 1:00 to 2:30 today; I, like many people, was in session from 1 to 2. I went directly from my session to lunch. I got down to the main floor to find the lunch station completely bereft of food. It was 2:05; I double-checked my schedule. Was lunch over at 2? Nope, 2:30. There were still 25 minutes of lunch time left!

    I scooted to another food station – same situation. I asked an employee, and she told me she heard there still may be food at station 3. Or station 4. Wherever those are.

    I scooted again, to stations 3 and 4. No luck, and no luck. By this time, I was with a group of other attendees who were in the same boat: hungry and frustrated.

    I asked to talk to someone in charge. A kind employee, Teresa, went to fetch her boss, Richard. He came down, sat down with me and talked for a few minutes. He placed the blame for the situation clearly in Microsoft's court. He said that they had ordered 5,400 lunches for Thursday (there are over 7,000 partners here, and I believe the number of attendees including Microsoft employees is close to 10,000). Of course they ran out! After yesterday's fiasco, they bumped up the order for today to 7,400 lunches—an order the kitchen had trouble filling, because they couldn't get the additional supply on such short notice. It clearly was still inadequate.

    I hate to fill up a grid computing blog with whining about a conference that I'm attending. But this is the only medium I have. This blog is about partnering with Microsoft, and the WPC is the event of the year for Microsoft partners.

    Allison, this is no way to treat the partners who bring in 96% of your revenue. Last year WPC ran like a Swiss watch—it was all around a terrific experience. The content was good, and that's important. But what's also important is making it a good experience for all of us. This year has been a much worse experience, and all of your partners are going to leave with that taste in their mouths. We're glad you gave us a "double thank you" for your great year.

    But, personally, I would rather have had lunch.


    Logistical problems mar WPC06

    How hard is it to run a good conference? Apparently, it's pretty difficult. Microsoft and the Boston Convention Center (and EventPoint and CRG Events, who both seem to have some hand in running this thing) had a tough day yesterday at the Worldwide Partner Conference.

    On one hand, the content at the breakout sessions I've been to was very good (especially Gianpaolo's session on SaaS). And there were a couple of good demonstrations in the keynotes. But the list of things that have gone wrong is far longer than the list of things gone right:
  • They ran out of breakfast. How do you run out of breakfast at something like this? Do you not know how many people are coming? We arrived just after 8:00 this morning; the keynote was supposed to begin at 8:30, but because so many people were delayed because of the Big Dig accident, they announced that everything was going to be pushed back 30 minutes all day. Except, apparently, breakfast. As we walked in, they were removing all of the food.
  • The keynotes ran long. VERY long. I know they were put in a bad position by having to delay the start by 30 minutes. So how did they respond? The opening keynote (Ballmer and Allison Watson) started 30 minutes late, but finished an hour and ten minutes late. It ran forty minutes long! So now instead of being a half-hour behind schedule, they were an hour behind schedule.
  • They ran out of lunch. How do you run out of lunch at something like this? The keynotes were scheduled to end at 11:45. With the 30 minute schedule bump, they should have ended at 12:15. I hung in there until 12:40, but I had set up structured networking appointments, so I had 30 minutes of meetings before I could nose over to the food tables…which were, by then, completely empty. I actually had a convention center employee tell me that my best shot at food was grabbing a cab to go into town.
  • The Gold Certified Partner lunches (there were two of them) both ran out of food.
  • The food isn't in one place—it's at various stations throughout the event hall. This is a problem, because when one station runs out of food (or is shut down), I can't see if there is any food anywhere else!
  • No coffee available throughout the day. There was coffee at breakfast (although the first station I went to was out), but later in the morning and through the afternoon, there was no coffee available. There's a reason we call it a "coffee" break, Microsoft: it's because we drink coffee then.
  • Not enough tables for structured networking. Through CRG Events, Microsoft has enabled conference attendees to find each other and arrange meetings at a designated table for a 15 minute meeting. The format is actually conducive to very focused meetings; I like it a lot. But, as of last week, it was impossible to book tables for most of Wednesday—all the tables were booked!
  • No VPN access through their wireless. I can understand the need for high security when I'm at a Microsoft facility. But when providing wireless at an event like this, why go to the trouble to put so much security in place that I can't get on my VPN? It's always like this at Microsoft events, but not at other conferences. It means I can't check my e-mail during the day, which is a serious pain.

    Ok, so that's the good and the bad so far. Now, on to the great. Last night, at the US Partner party, the entertainment was supplied by the GoGos. I'm not going to go on about it, but I'll say this: 25 years after Beauty and the Beat came out, they still rock, and they look like they are having a blast on stage. And that crush I had on Belinda Carlisle two decades ago? Still going strong.


    Tuesday, July 11, 2006

    Ballmer: People Ready Software

    This week I'm at the Microsoft Worldwide Partner Conference in Boston. (No, I wasn't in the tunnel in the Big Dig that fell apart last night--I had gone through those tunnels earlier in the day, though. I'm still wondering how long it'll take me to get to the airport on Thursday.)

    Once again, Steve Ballmer gave the keynote this year. One of the topics that he hit hard on this morning was "People Ready Software." It's the driver that they use when designing their own software, and it's what they want partners to do.

    He described People Ready Software as having these characteristics:

  • Familiar and easy to use
  • Easier to integrate and connect
  • Innovative and evolves to meet your needs
  • Widely used and supported

    For those of us who have been writing software for this platform for two decades, we didn't need to hear it. We already do it. We live and breathe "easy to use." We think about UI first--not as an afterthought. We think about the user the whole time we're going through design. "Familiar and easy to use" was one of the tag phrases we used over and over when designing our Workbench tool.

    To me, it's a major distinction between software that was designed for experts and software that was designed for everyone. A couple of weeks ago I engaged in a bit of a debate with Joe Landman (of Scalable Informatics) about the usability of differing cluster/distributed computing products. While Joe had some points that were definitely correct (he predicted that Microsoft would be making an announcement about entering the Top 500, for example), I still disagree about usability. Traditional distributed computing solutions just weren't designed as "People Ready Software."

    That's not to say that we haven't designed our software for experts. Take our development patterns as an example: we wanted to make developing for our platform as simple as possible. As anyone who has attended one of my webcasts knows, it can take as few as 20 lines of code to grid enable software using our Worker pattern.

    It's very powerful, but very easy to use. But we don't stop at "simple." The Worker library pattern is just one of seven development patterns that we offer developers (and we ship code samples for each of them).
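    The shape of the Worker pattern--write a class that does one unit of work, create one instance per unit, and let the platform run them in parallel--can be sketched in plain Python using the standard multiprocessing module. This is an illustrative analogue, not the actual Digipede API; the class, numbers, and pricing formula here are hypothetical stand-ins.

```python
# Hypothetical sketch of the "worker object" pattern: one object per
# unit of work, with the framework handling distribution. Not the
# Digipede API -- multiprocessing.Pool stands in for the grid here.
from multiprocessing import Pool

class BondPricer:
    """Prices a single bond; one instance is created per bond."""
    def __init__(self, face_value, coupon, years):
        self.face_value = face_value
        self.coupon = coupon
        self.years = years

    def price(self, rate):
        # Present value of the coupon stream plus the principal.
        coupons = sum(self.face_value * self.coupon / (1 + rate) ** t
                      for t in range(1, self.years + 1))
        principal = self.face_value / (1 + rate) ** self.years
        return coupons + principal

def price_one(args):
    pricer, rate = args
    return pricer.price(rate)

if __name__ == "__main__":
    # One worker object per bond; the pool executes them in parallel,
    # much as the grid executed one object per bond in the demo.
    bonds = [BondPricer(1000, 0.05, 10) for _ in range(8)]
    with Pool() as pool:
        prices = pool.map(price_one, [(b, 0.04) for b in bonds])
    print(prices)
```

    The point of the pattern is visible in the structure: the pricing class knows nothing about machines, threads, or scheduling, so the same class runs unchanged on one box or on fifty.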

    Building on Web services and .NET has made it easy for us to offer integration, and we made sure we have full COM interoperability as well. Indeed, we have customers using everything from COM interfaces (C++, VBA, and VB6) to .NET interfaces (C# and VB.NET) to non-Microsoft technologies (PHP, Python). We look forward to the releases of PowerShell and WCF, because those will allow us to continue to enhance our integration capabilities.

    Innovative and evolving? We're doing everything we can. We'll release v1.3 later this year (look for announcements soon), and we're already planning the 2.0 release that will follow that.

    Widely used and supported? Well, we're working on that. Our customer list continues to grow; more importantly, we're continuing to work with partners. Part of the reason we're here at the Worldwide Partner Conference is to talk to software vendors.


    Thursday, July 06, 2006

    Kicking a Half-KLOC


    Kevin Burton (of TailRank) posted yesterday saying "Number of blogs is the new KLOC." KLOC stands for thousand lines of code; his post makes a very good point that the number of blogs that a site indexes is not necessarily the best measure of how good that index is--and draws a parallel to Steve Ballmer noting that tracking a developer's KLOC fails to track how useful it can be to eliminate lines of code.

    I experienced that yesterday when porting a partner's application to run on the grid.

    This was yet another very cool grid app that had been written behind Excel. It values a portfolio of callable bonds under a variety of interest rates, and had been written to run the analysis on a cluster. Like most cluster applications, it was pretty hardcoded to work on the cluster. It was well written, but it was extremely complicated: it had different threads starting tasks on each node, at least one thread for monitoring tasks, and a thread for reassigning tasks gone awry. It needed to know the name of every machine on the cluster, and, of course, it relied on its computation algorithm being pre-installed on each node on the cluster in a standard fashion. Pretty normal stuff.

    And, in fact, it was very fragile. Our partner had attempted to move this from a 4-node cluster to an 8-node cluster and found that it ran much slower. Why? It's not clear--my guess is that trying to write a complicated multi-threaded application to run behind Excel just isn't reliable. The submitting machine was responsible for monitoring everything as well as processing the results, so it got bogged down. Debugging that was going to be an absolute nightmare: with so many different threads happening simultaneously, finding the inefficiency could take days or weeks.

    I made a wiser choice. In a couple of hours, I ported it to run on the Digipede Network. Result: now the spreadsheet has none of the extremely complicated code in it--it makes simple API calls. It now has guaranteed execution of the tasks across the cluster without having to manually monitor each one. The user no longer has to pre-stage anything on the cluster--all of that happens automatically. The cluster is used more efficiently, and the whole thing runs faster (and scales much, much better).
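    The manual monitoring the original app did--one thread watching tasks, another reassigning tasks gone awry--is exactly what a scheduler's guaranteed execution replaces. A toy sketch of that retry behavior, with hypothetical names (nothing here is the real scheduler, just the concept):

```python
def run_with_retries(task, max_attempts=3):
    """Toy stand-in for guaranteed execution: a task that fails on one
    'node' is simply re-run, up to max_attempts times."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return task()
        except RuntimeError as err:
            last_error = err  # a real scheduler would reassign to another node
    raise last_error

def make_flaky_task(failures):
    """Build a task that fails `failures` times before succeeding."""
    state = {"left": failures}
    def task():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("node died")
        return "portfolio valued"
    return task

print(run_with_retries(make_flaky_task(2)))  # succeeds on the third attempt
```

    When this logic lives in the platform rather than in threads behind the spreadsheet, the submitting machine is freed to do nothing but submit work and collect results--which is why the ported version both shrank and sped up.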

    The best part? I eliminated over 500 lines of code in doing so. That's right: I made the whole thing faster and simpler, and I kicked a half-KLOC in the process.

    [Update 7/6/2006 2:15] I should have given a hat-tip to my good friend Robert (who loves to delete code) for coming up with the phrase "kicking a half-KLOC." Hat tip.


    Photo credits: jeltovski, rosevita

    Wednesday, July 05, 2006

    Bizarre schedule at Worldwide Partner Conference

    Last year I attended Microsoft's Worldwide Partner Conference in Minneapolis and loved every minute of it. It was remarkably well attended. It had good technical content. It featured lots of hands-on labs. The networking opportunities (both with Microsoft employees and Microsoft partners) are unparalleled: where else can you have 7,000 employees and partners in the same city at the same time?

    They also take care of you well. Decent food the whole time, and a good party -- actually, some of the best networking of the whole conference happened at the party; last year Hootie and the Blowfish played it.

    This year, I was looking forward to the party not just for the networking: local boys done good Train are playing it! Wow - a band I'd actually like to see.

    One big problem: I just started setting my schedule for next week's events, and guess what I noticed? Train is playing Thursday night--after the conference has ended. Yup. The conference runs July 11-13, and Train is playing the night of the 13th. So for those attendees who happen to live in Boston, what a great opportunity! For those who are taking Friday off of work, how nice! For the 90% of us who are travelling Thursday night so we can work on Friday...um, thanks for thinking of us, Microsoft.
