Return to Contents

On this page:

  1. Setting Up
  2. Generating Random Values
  3. Learning From Data

Setting Up

(↩)

Invrea Scenarios is a plugin that sits within and alongside Excel, augmenting Excel with the power of Bayesian statistics. After installation, Invrea Scenarios should be functioning as an add-in within Excel automatically - no setup required.

The system requirements for Scenarios are as follows. First of all, you need a Windows computer with Microsoft Office installed. More specifically, you need any Windows operating system released after XP, including XP. You need a Microsoft Office distribution from 2007 at the earliest. You will need a recent Java installation (Java 8), but Scenarios will install this for you if you don't have it already.

Once you've downloaded the installer from the Invrea website, double-click the .exe file you've obtained and let the installer do its work. (You might have to manually allow the installer to run, which you'll need administrator privileges on your computer to do.) The installer will prompt you to exit Excel while it is working.

Once installation has finished, after reopening Excel, you should see that a new Scenarios tab has appeared:

Probably the most salient feature of this tab is the green 'Run' button. Pressing this extracts the contents of the spreadsheet and sends them to Scenarios, which then sends back a large number of scenarios for your perusal. You can choose the number of scenarios that will be generated by typing a number into the text box just to the left of the 'Run' button. We'll discuss this process more in a little bit, of course.

You should also see, when you right-click in Excel, that the resulting menu has been augmented with a few Scenarios-specific functions:

Note also the 'About' button on the right of the tab, the one featuring Invrea's logo. Pressing this creates a window including links to a contact form and a form for reporting bugs. The bug-report form we provide while hoping you won't need to make use of it.

Generating Random Values

(↩)

The first and most immediate application of the Scenarios plugin is as a simple stochastic simulator. This means that, given a description of some unknown quantities, possible scenarios consistent with that description can be generated.

Unknown quantities are described using probability distributions. To add an unknown quantity to your spreadsheet, simply use one of the statistical primitives Invrea has added to Excel. For an initial example, type into cell A1 the following formula: =BETWEEN(1, 10). As should be apparent, this tells Scenarios that contained in the cell A1 is an unknown quantity between 1 and 10.

After entering this formula, you should see the cell A1 immediately take a possible value between 1 and 10. By pressing F9, you can generate other possible values, each of them different values between 1 and 10.

Instead of pressing F9, you can also click the 'Next' button in the Scenarios tab, which will generate another possible value. To return to the previous value, click the 'Previous' button in the Scenarios tab (the same can be accomplished with Shift+F9). If you spend a bit of time searching through possible scenarios, you should see that the cell A1 takes any value between 1 and 10 with equal probability - that is, it is evenly distributed over all the values between 1 and 10.

The values generated by Scenarios are values that sit inside a cell just like normal Excel formulas. That is, if you write in cell A2 the formula =A1+10, cell A2 will take a value somewhere between 11 and 20. Do this as well.

So by pressing F9 and using the 'Next' and 'Previous' buttons, you've generated a few possible scenarios. Now let's generate a much larger number of them. Open the Scenarios tab and select cell A1, the one with the BETWEEN formula in it. Type 10,000 into the text box to the left of the 'Run' button. Your spreadsheet, with the formulas shown, should now look like this:

Hold off on pressing the 'Run' button for just a bit. Once you press this button, Scenarios will extract the formulas in your spreadsheet, and generate 10,000 possible scenarios. A 'scenario' in our parlance, by the way, is a possible set of values for the entire spreadsheet. So Scenarios will generate 10,000 possible values for each of the cells in the spreadsheet.

Crucially, these values will be consistent. That is, remember that we set cell A2 to be 10 more than the cell A1. In each of the scenarios, values are given to cells A1 and A2 such that A2=A1+10.

Now press the 'Run' button. When you do this, 10,000 scenarios will be generated - 10,000 possible values for cells A1 and A2. You should briefly see a progress bar. Then, the spreadsheet you're looking at should split into two windows, and finally you should see results that look something like the following:

On the left is the original spreadsheet, showing one possible scenario. On the right, you should see a graph with the name of cell A1 in the title. And in the middle, you should see a new scroll bar. We'll focus on the graph first.

After running Scenarios while one cell is selected, the graph on the right is generated by tallying the values of that cell over all the scenarios that were just generated. It's a histogram representing the frequency of each scenario. So in this case, it's a histogram including the values of cell A1 in the 10,000 scenarios we just generated. The horizontal axis of this graph represents the value of cell A1; the vertical axis represents the probability of this value occurring.

Remember that cell A1 contains the formula =BETWEEN(1, 10), which specifies only that the value of cell A1 is somewhere between 1 and 10, and nothing else. As a result, it should make sense that the graph of possible values of A1 is essentially flat over the interval from 1 to 10. This tells us that cell A1 is likely to be any number between 1 and 10 with equal probability.

The numbers to the left of the graph are simply the data plotted in the graph provided for your convenience. As the graph on the right sits inside Microsoft Excel, you could clean the data or improve the graph's appearance as you wish. Also, in the 'Statistics' tab, you should see a section labelled 'Expected Value'. This section contains the average value of the cell currently being plotted, taken over all scenarios. So in this case, the average value should be approximately 5.5, because A1 can be anything between 1 and 10 with equal probability, and (1 + 10) / 2 = 5.5.

Before we move on from this graph, we'll go through how to change the cell that is currently being graphed. In the Scenarios tab, select the button underneath the text box containing the number of scenarios, the button labelled 'Choose cell to explore'. Selecting this button should bring up a 'Selecting Cells' window. Simply click on a different cell - try A2, the cell containing the formula =A1+10 - and press 'Select', and a graph of this cell will be plotted on the right, replacing the old one. You should see that just as A1's plot was a flat histogram over the range from 1 to 10, A2's plot is a flat histogram over the range from 11 to 20.

Okay, now we'll move on to the left side of the screen and the scenario scroller. Remember earlier that Invrea's view of a 'scenario' is a set of values of the entire spreadsheet, a set that obeys the rules of the formulas laid out. On the left you see one such set of values, consistent in the sense that A2 is exactly ten more than A1.

The scenario scroller in the middle allows you to change which of the 10,000 possible spreadsheets just generated you wish to look at. Scroll left and right, and you will see the spreadsheet on the left shift and take different values in each new scenario. You will also see that the red bar in the histogram moves as you scroll. The red bar tells you where in the histogram the scenario being displayed on the left sits.

Cell A2 is currently being plotted. This means that the 10,000 scenarios are sorted from the lowest value of A2 to the highest value of A2. That is, if you scroll the scenario scroller all the way to the left, you should see first that the red bar on the histogram has moved all the way to the left, and second, that A2 has taken its lowest possible value, which should be slightly larger than 11:

Okay, now let's make this spreadsheet slightly more complicated. Exit inference by pressing the red 'Stop' button above the 'Run' button. (We have to exit inference because we're about to change the formulas on the spreadsheet, which you shouldn't do while viewing scenarios.)

Now, add another cell in cell B1 with the following formula: =GAUSSIAN(0, 1). This specifies that the cell B1 is a random variable distributed according to a normal distribution with mean 0 and standard deviation 1. (The Gaussian distribution, or normal distribution, is a statistical distribution also known as the bell curve.) Now that we've made this change, select cell B1, type in 20,000 scenarios, and press the 'Run' button again:

Hopefully, if you didn't know what a Gaussian distribution was before, you have at least some idea what it is now. The normal distribution is described by two numbers, the mean and standard deviation. The mean specifies the center of the distribution (0 in this case), and the standard deviation specifies how wide the distribution is.

Unlike the BETWEEN distribution from earlier, which bounded cell A1 between the values of 1 and 10, cell B1, for which we made a Gaussian assumption, may technically be any number at all. However, the plot on the right shows what numbers B1 is likely to be. As a general rule, a variable distributed according to a normal distribution is very unlikely to be any more than three or four standard deviations from the mean. Therefore, in this case, almost all of the scenarios generated for the value of B1 are between the values of -4 and 4.

Finally, before we move on, we'll plot the possible values of cell A1 against the possible values of cell B1. Press the 'Choose cell to explore' button on the Scenarios tab, but this time, we'll select two cells, A1 and B1. (To select two cells, you can either click and drag, hold SHIFT and click two cells, or hold CONTROL and click two cells). After pressing 'Select' again, you should see a new type of graph on the right:

This is a heatmap, with the values of cell A1 plotted on the vertical axis and the values of cell B1 plotted on the horizontal axis. The color of each entry in the heatmap specifies how likely this combination of values is; the likelihood of this combination occuring is also written in each entry. Unlikely scenarios are drawn as blue, very likely scenarios are drawn as red, and scenarios in the middle are green.

Cell B1 is most likely to be close to zero, because it's a Gaussian distribution with mean zero. Therefore, scroll to the right to find the center of the graph. You should see that the graph resembles a 'ridge' centred on zero. This represents the fact that cell B1 is likely to be near zero, but cell A1 - the vertical dimension - is flat, equally likely to be anything from one to ten. Again, the data for this graph is within Excel, so its appearance can easily be matched to your taste and relevant stylistic needs.

Now, let's move on to discussing Scenarios' true unique ability.

Learning From Data

(↩)

On the same spreadsheet, enter the formula =GAUSSIAN(A1, 2) into cell C1, select cell C1, and run 20,000 scenarios. You should see a plot that looks something like the following:

Cell C1 now is approximately equal to cell A1, the cell that is between 1 and 10, but may be off by as much as six to eight in either direction. Think of cell C1 as a measurement of cell A1. Every measurement has some noise, some degree of error. Here, we assume that the error is described by a bell curve.

Now that we have a random variable that could represent a measurement, we have to actually constrain it to some value. So right-click cell C1 and press 'Record Actual'. In the prompt that appears, type 7.

In performing this action, the modeller tells Scenarios that they are only interested in situations in which the cell C1 ends up as 7. Remember that the cell C1 is itself a noisy measurement of the cell A1. Therefore, if C1 is about 7, A1 is likely to be about 7. However, it is very likely that it is anywhere in the range from 3 to 11, and somewhat more unlikely that it is a little further away.

Yet: we already specified that A1 is between 1 and 10, and anywhere in that interval with equal probability, by writing =BETWEEN(1, 10). How you weight these two constraints against each other, and come up with an optimal estimate of A1, is the fundamental question of Bayesian probability that Scenarios will solve for you.

Before we get the solution by generating scenarios, what do you think the answer is? Intuitively, you would expect the best guess for the value of A1 to be somewhere between 5.5 (the expected value of =BETWEEN(1, 10), the original formula) and 7, but how much of each and why?

To find the answer, select cell A1 and run a large number of scenarios, say 50,000. You should see a graph that looks something like the following:

From this graph, it is apparent that the Gaussian observation in cell C1 that we just added had a gigantic effect on the value of A1. Where before cell A1 had a flat, even likelihood over the numbers from 1 to 10, now it has a peak around approximately 6.7, and has a low chance of being numbers between 1 and 3. The Scenarios plugin has learned from the data provided to it, and generated an updated distribution over the random variable A1.

To wrap up, we discuss how to save the results you've just generated for later presentation. This is done using the 'Save scenario' button just to the right of the 'Stop' button and to the left of the 'Statistics' panel.

In order to save the results obtained from running Scenarios, it is crucial that your workbook first be saved. So stop viewing scenarios (press the 'Stop' button), and save your workbook. After that, once scenarios have been generated, you can save them by pressing the 'Save scenario' button. This will create a workbook where every cell has the values that you are currently viewing, and that contains the plot shown on the right.

This has been an introduction to but the simplest application of true Bayesian conditioning using Scenarios. But hopefully, it has given you the basics you need to begin considering how you might use Scenarios to solve a modelling problem that faces you.

For example models, further explanation, and regular updates, visit our blog. To learn more about the statistical functions Scenarios adds to Excel, consult the rest of the documentation. Oh, and make sure to update to the latest version of Scenarios.