Monday, July 24, 2006

Adapting a Spreadsheet to Excel Services

A
s part of my continuing series of posts on Excel Services, I'm going to relate my experiences with taking a (fairly complicated) spreadsheet and adapting it to run on Excel Services.

First, you should know that this was not a run-of-the mill, typical spreadsheet. As far as I can tell, taking a "normal" spreadsheet and publishing it to Excel Services would be simple and effective. However, this was far from a simple spreadsheet.

This is the spreadsheet I wrote about in my Kicking a Half-KLOC post: a complicated spreadsheet that takes a portfolio of callable bonds, calculates each of their values under a variety of interest rates (using a grid to do so), then takes the results of those computations and calculates total portfolio value and some value at risk numbers. It's several steps of computation, one of which iterates through many bonds (starting a job on the grid in the process), and the later calculations depend on the results of the earlier calculations.

Whew.

As I learned about Excel Services, I realized that my spreadsheet would need some serious rearchitecture--both from a logic-flow perspective as well as a physical (which information lives on which sheet in the spreadsheet) perspective.

This spreadsheet had originally been written with .NET code behind (hurray, VSTO!). The user clicked a button to start the job, and the .NET code behind would pull values out of certain cells in the spreadsheet, then launch a job on the grid. As results came back from the grid, the data from those results were posted in different cells on different sheets (there were results for each individual bond, as well as efficiency numbers that monitored the efficiency of each node on the grid). After the entire job completed, the calculation moved into a second phase that took results from the cells that had been filled in the first phase and calculated final portfolio value and value at risk.

I learned about using User Defined Functions in Excel Services, and I verified that it would be simple to start a job on the Digipede Network from within a UDF. Great - I thought that may be the biggest hurdle.

However, I quickly realized that I was going to have to make significant changes to the way the code behind the spreadsheet works. When working with a spreadsheet in Excel Services, the spreadsheet is only evaluated one time. The user is presented with an HTML-rendered version of your spreadsheet, fills out some input values, then clicks an Apply button. When the Apply button is clicked, if you have User Defined Functions in your spreadsheet, those function gets called. However, it isn't an iterative process. Whereas I had had a very iterative process (events were getting raised, values were being set in cells, other events were getting raised, values were being pulled from cells, other calculations were started), I realized I needed a much more rigorously defined process.

I needed to put everything into one User Defined Method: calculate the value for each bond, then total up the portfolio, then calculate the value at risk. And I needed to return all of that data at once: the information for each bond, the total portfolio information, the value at risk, and all of the efficiency information.

This was a significant change for me. Rather than have a step-by-step, event-driven process that stored intermediate results in the Excel spreadsheet itself, I needed to create a single method that did all of my calculation, then returned all of my data.

That wasn't too much work. I basically had to take three methods in my original code-behind and invoke them sequentially (storing the return values myself rather than using Excel cells to store them along the way). Pretty easy, actually. At the end of my method, I created a huge two-dimensional array of objects (8x264, actually) and returned those. The first row of results had any messages for the user. The second row had the final values for my computation. The next 32 rows had data on the first 32 nodes that worked on the job (efficiency, number of tasks completed), and the next 30 rows had an array of interest-rate-to-portfolio-values. The last 200 rows had data on each bond in the portfolio.

Next, I had to deal with getting the data into that method and then back to the spreadsheet.

Creating the method was easy: I knew the inputs to my algorithm. So my method call looked like this:
public object[,]
PerformCallableBondsCalcs(int numBonds, int numSimulations, double r0, double vasicek_a, double vasicek_rbar, double vasicek_sigr, string url, string username, string password, object[,] bondInfo)
Ten arguments, including a two-dimensional array of bond information.
To invoke that method from a single cell is easy:
=PerformCallableBondsCalcs(A1, A2, A3, A4, A5, A6, A7, A8, A9, Bonds!A1:C200)


But I didn't need the answer in a single cell! My method was going to return two-dimensional data, not zero dimensional data. Some quick research into Excel taught me how to return multi-dimensional (well, that is, one- or two-dimensional) data from a function call.

First, highlight the group of cells where you want the data to end up (in my case, an 8 by 264 cell region). Then, in the formula bar, type your method call (see above). Then, hit CTRL-SHIFT-ENTER. Your formula is entered in all of those cells, with curly braces around it (like this):
{=PerformCallableBondsCalcs(A1, A2, A3, A4, A5, A6, A7, A8, A9, Bonds!A1:C200)}


Now, I had the architecture I needed:
  1. All of my logic was going to happen sequentially in my method call
  2. I wasn't storing intermediate results in Excel; I was keeping them locally in my code
  3. I had all of my inputs going into a single method call
  4. My method was returning all of my results in a two-dimensional array
  5. The two-dimensional array was populating a region of my spreadsheet


I put it all together and...it worked!

This was exciting. I had allocated several days for this, and it took me less than one. I was very excited.

I thought that the hard part was over. All that was left was to format my spreadsheet to take all of that returned data and make it look good. Little did I know that all of the hard parts still lay in front of me...

More in the next installment: Why Beta Software Is Hard To Use, and How A Huge Company Still Listens to the Little Guy.

Technorati tags: ,