Invrea *Scenarios* comes packaged with the following distributions, all of which can be sampled by calling them directly in Excel formulas (e.g. `=GAUSSIAN(1, 1)`), and used in `ACTUAL` statements to learn from data (e.g. `=ACTUAL(3, "gaussian", A1, 1)`).

We think we've achieved a pretty broad range of distributions for general use. However, if there are any distributions you'd particularly like us to add support for, please contact us here!

- GAUSSIAN
- BETWEEN
- RAND
- RANDBETWEEN
- TRIANGULAR
- PERT
- CHOICE
- FLIP
- BINOMIAL
- GEOMETRIC
- POISSONIAN
- NEG_BINOMIAL
- HYPERGEOMETRIC
- MULTINOM
- EXPONENTIAL
- CHI_SQUARE
- ERLANG
- INV_GAMMA
- F
- NAKAGAMI
- LEVY
- LOG_NORMAL
- RAYLEIGH
- LAPLACE
- LOGISTIC
- FISK
- STUDENT_T
- CAUCHY
- NEAR
- LOG_CAUCHY
- BETA
- PARETO
- WALD
- RECIPROCAL
- GUMBEL
- GOMPERTZ
- FRECHET
- GEV

**Parameters:** Two: a mean and a standard deviation

**Return Type:** Any real number

**Description:** The Gaussian distribution, also known as the normal distribution and the bell curve, is ubiquitous in the fields of probability and statistics.

**Error Conditions:** The mean and standard deviation of a Gaussian distribution must be scalar numbers. The standard deviation of a Gaussian distribution must be a number higher than zero.

**Parameters:** Two: a lower bound (inclusive) and an upper bound (exclusive)

**Return Type:** Any number between the lower and upper bounds

**Description:** `BETWEEN` is *Scenarios*' name for the uniform continuous distribution. Any number between the lower bound and upper bound is equally likely.

**Error Conditions:** The lower bound and upper bound must be numbers. The lower bound must be strictly less than the upper bound.

**Parameters:** None

**Return Type:** Any floating-point number between 0 (inclusive) and 1 (exclusive)

**Description:** `RAND` is a native Excel function that *Scenarios* wraps as a primitive distribution. It is simply `BETWEEN` but with lower bound 0 and upper bound 1; any number between 0 and 1 is equally likely. That is, `=RAND()` is equivalent to `=BETWEEN(0, 1)`, except that certain *Scenarios* functions will not be available for `RAND` - see here for more information.

**Error Conditions:** None

**Parameters:** Two: a lower bound (inclusive) and an upper bound (also inclusive)

**Return Type:** Integers between the lower bound and upper bound (inclusive on both sides)

**Description:** `RANDBETWEEN` is a native Excel function that *Scenarios* wraps as a primitive distribution. It generates a random integer between two specified bounds. Like `RAND`, certain *Scenarios* functionality is unavailable for this distribution - see here for more information.

**Error Conditions:** The lower bound and upper bound must be numbers. The lower bound must be less than or equal to the upper bound.

**Parameters:** Three: a minimum value `a`, a mode `b`, and a maximum value `c`

**Return Type:** Floating-point numbers between the minimum value and maximum value

**Description:** `TRIANGULAR` is *Scenarios*' implementation of the triangular distribution. The triangular distribution is commonly used in project management and finance as a simple way to model uncertainty around expert estimates.

**Error Conditions:** All three arguments `a, b, c` must be numbers. The maximum `c` must be strictly greater than the minimum `a`. The mode `b` must lie between the minimum and maximum values (inclusive).

**Parameters:** Four: a minimum value `a`, a mode `b`, a maximum value `c`, and a shape parameter `k`

**Return Type:** Floating-point numbers between the minimum value and the maximum value

**Description:** `PERT` is the *Scenarios* implementation of the PERT distribution. The PERT distribution, which is a shifted version of the beta distribution, is widely used in project management as a smoother and more customisable alternative to the triangular distribution. The mode is the most likely value that the random value is estimated to take, and the shape parameter determines how likely the random value is to be close to the mode. A shape parameter close to zero indicates "low confidence" in the estimate provided in the form of the mode, i.e. a high likelihood for anything between the minimum and the maximum, while a large shape parameter indicates high confidence that the random value will be close to the mode. A shape parameter of zero simply represents a uniform random variable, such that `=PERT(a, b, c, 0)` is equivalent to `=BETWEEN(a, c)`; as the shape parameter approaches infinity, the `PERT` distribution approaches a `GAUSSIAN` distribution centred on the mode.

**Error Conditions:** All four arguments `a, b, c, k` must be numbers. The maximum `c` must be strictly greater than the minimum `a`. The mode `b` must lie between the minimum and maximum values (inclusive). The shape `k` cannot be negative.
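The relationship between `PERT` and the beta distribution can be sketched in Python. This is a minimal illustration that assumes the common "modified PERT" parameterization, in which the two beta shape parameters are derived from `a`, `b`, `c` and `k`; the source does not state the exact formula *Scenarios* uses, and the function name `pert_sample` is hypothetical.

```python
import random

def pert_sample(a, b, c, k, rng=random):
    """Draw one PERT-style sample: a beta variate rescaled from [0, 1] to [a, c].

    Assumption: the 'modified PERT' shape formulas below, not confirmed
    to be what Scenarios uses internally.
    """
    if not (a < c and a <= b <= c and k >= 0):
        raise ValueError("need a < c, a <= b <= c and k >= 0")
    alpha = 1 + k * (b - a) / (c - a)
    beta = 1 + k * (c - b) / (c - a)
    # With k = 0 both shapes equal 1, i.e. a uniform draw on [a, c].
    return a + (c - a) * rng.betavariate(alpha, beta)
```

Under this parameterization, `k = 4` gives the classic PERT estimate with mean `(a + 4b + c) / 6`, and larger `k` concentrates the samples around the mode.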

**Parameters:** Two: the first a range of selections, and the second (optional) a range of probabilities.

**Return Type:** An element of the range of selections provided

**Description:** `CHOICE` is the *Scenarios* implementation of the categorical distribution. Given only a range of selections as the sole parameter, it returns one of those selections randomly with equal probability. Given a range of selections and an equally-sized range of probabilities, one of the selections will be sampled according to the given probabilities.

**Error Conditions:** The arguments to `CHOICE` must be ranges or arrays. If both arguments are supplied, they must have the same number of elements. If a range of probabilities is given, all its elements must be numbers.
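The two calling modes of `CHOICE` can be sketched with Python's standard library. This only illustrates the categorical distribution, not the *Scenarios* implementation; the function name `choice` and its `rng` argument are hypothetical.

```python
import random

def choice(selections, probabilities=None, rng=random):
    """Pick one element, uniformly or according to the given probabilities."""
    selections = list(selections)
    if probabilities is None:
        # One argument: every selection is equally likely.
        return rng.choice(selections)
    probabilities = list(probabilities)
    if len(probabilities) != len(selections):
        raise ValueError("selections and probabilities must have the same length")
    # random.choices treats the weights as relative weights.
    return rng.choices(selections, weights=probabilities, k=1)[0]
```

For example, `choice(["low", "mid", "high"], [0.2, 0.5, 0.3])` returns `"mid"` about half of the time.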

**Parameters:** One: a probability `P` representing the probability of success

**Return Type:** `TRUE` or `FALSE`

**Description:** `FLIP` is *Scenarios*' name for the Bernoulli distribution. This distribution returns `TRUE` with probability `P`, or `FALSE` with probability `1-P`. It is a special case of the `CHOICE` distribution with only two selections, `TRUE` and `FALSE`.

**Error Conditions:** The sole argument to `FLIP` must be a number.

**Parameters:** Two: a number of trials `N` and a probability `P`

**Return Type:** Any integer ranging from 0 to the number of trials `N` (inclusive on both ends)

**Description:** The `BINOMIAL` distribution is essentially a generalization of the `FLIP` distribution, only with multiple flips. If `P` represents the probability of flipping heads, then the binomial distribution returns the total number of heads after flipping `N` coins.

**Error Conditions:** The arguments to `BINOMIAL` must both be numbers. The number of trials must be greater than zero.
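The coin-flipping description above translates directly into a short Python sketch: a binomial draw is the number of successful flips out of `N`. The names `flip` and `binomial` are illustrative stand-ins, not the *Scenarios* functions themselves.

```python
import random

def flip(p, rng=random):
    """Bernoulli draw: True with probability p, False otherwise."""
    return rng.random() < p

def binomial(n, p, rng=random):
    """Number of heads after flipping n coins that each land heads with probability p."""
    return sum(flip(p, rng) for _ in range(n))
```

Counting flips one by one is easy to follow but slow for large `N`; it is meant to show the definition rather than an efficient sampler.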

**Parameters:** One: a probability `P` representing the probability of success

**Return Type:** Any nonnegative integer

**Description:** The `GEOMETRIC` distribution is a distribution over nonnegative integers. If the parameter `P` represents the probability of flipping heads, then the numbers returned by the geometric distribution represent how many coins might land tails in a row before getting one that lands on heads.

**Error Conditions:** The argument to `GEOMETRIC` must be a number greater than zero and less than one (exclusive on both ends).
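As a sketch of the definition above, the following Python function counts tails until the first head appears. The name `geometric` is illustrative only; this is not how *Scenarios* necessarily generates its samples.

```python
import random

def geometric(p, rng=random):
    """Count how many flips land tails before the first heads (probability p)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    tails = 0
    while rng.random() >= p:  # this flip landed tails
        tails += 1
    return tails
```

With this convention the smallest possible result is 0 (heads on the first flip), and the long-run average is `(1 - p) / p`.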

**Parameters:** One: a rate `r` that is also the mean and variance of the random variable

**Return Type:** Any nonnegative integer

**Description:** `POISSONIAN` is the *Scenarios* name for the Poisson distribution. The Poisson distribution is generally used to describe the number of occurrences of a certain event within a fixed amount of time, where events are assumed to occur independently of one another at an average rate given by the rate parameter `r`. The Poisson distribution is used as a model within fields as diverse as telecommunications, finance, project management, insurance, and the physical sciences.

**Error Conditions:** The rate parameter `r` must be a number greater than zero.
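One way to picture the rate parameter is to simulate the waiting times between events and count how many events fit into one unit of time. The sketch below does this with exponentially distributed gaps, which is the standard construction of a Poisson count; the function name `poissonian` is illustrative and the approach is not claimed to be the one *Scenarios* uses.

```python
import random

def poissonian(r, rng=random):
    """Count events in one unit of time when gaps between events are exponential with rate r."""
    if r <= 0:
        raise ValueError("rate must be greater than zero")
    count = 0
    elapsed = rng.expovariate(r)  # time of the first event
    while elapsed <= 1.0:
        count += 1
        elapsed += rng.expovariate(r)  # independent gap until the next event
    return count
```

Averaged over many calls, the result approaches `r`, matching the statement that the rate is also the mean.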

**Parameters:** Two: a number of failures `r`, and a probability `P` representing the probability of success

**Return Type:** Any nonnegative integer

**Description:** `NEG_BINOMIAL` is the *Scenarios* name for the negative binomial distribution. The negative binomial distribution is a distribution over nonnegative integers closely related to the geometric and binomial distributions. If the parameter `P` represents the probability of flipping heads, then the numbers returned by the negative binomial distribution represent how many flips of a coin might land heads before getting `r` coins that land tails. The negative binomial distribution (also sometimes known as the Polya or Pascal distribution) is commonly used in engineering and statistics to model phenomena that occur over time.

**Error Conditions:** The number of failures `r` must be a number greater than zero. The probability of success `P` must be a number between 0 and 1 (exclusive on both ends).
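The coin-flipping interpretation can be sketched in Python by counting heads until the `r`-th tail appears. This sketch only handles a whole number of failures, which is a simplification made here for illustration; the name `neg_binomial` is hypothetical.

```python
import random

def neg_binomial(r, p, rng=random):
    """Count heads (probability p each flip) seen before the r-th tails."""
    if r <= 0 or int(r) != r:
        raise ValueError("this sketch needs a positive whole number of failures")
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    heads = tails = 0
    while tails < r:
        if rng.random() < p:
            heads += 1
        else:
            tails += 1
    return heads
```

The long-run average of this count is `r * p / (1 - p)`.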

**Parameters:** Three: the number of successes available `A`, the number of failures available `B`, and the number of draws to be taken `T`

**Return Type:** An integer between 0 and the total number of successes `A` (inclusive on both ends)

**Description:** The `HYPERGEOMETRIC` distribution is best understood as an alternate version of the binomial distribution in which draws are taken without replacement. For example, consider a bag with `A` red marbles ("successes") and `B` blue ones ("failures"), from which we draw `T` marbles without replacement; the number of red marbles withdrawn at the end is described by the hypergeometric distribution. The hypergeometric distribution is very frequently used in classical statistics to understand the procedure of unbiased sampling.

**Error Conditions:** The number of successes and failures available, `A` and `B`, must each be real numbers greater than zero (they will be converted to integers). The number of draws to be taken `T` must be less than or equal to the total population of draws available (the sum of the successes and failures). That is, `T` must be less than or equal to `A` plus `B`.
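The marble example can be written out literally in Python: build the bag, draw without replacement, and count the red marbles. The function name `hypergeometric` is illustrative, and the sketch assumes whole-number arguments.

```python
import random

def hypergeometric(a, b, t, rng=random):
    """Draw t marbles without replacement from a red and b blue; return the red count."""
    if not 0 <= t <= a + b:
        raise ValueError("cannot draw more marbles than the bag contains")
    bag = [True] * a + [False] * b  # True marks a red marble (a "success")
    return sum(rng.sample(bag, t))
```

Drawing every marble in the bag always returns `a`, which shows the difference from the binomial distribution, where draws are independent.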

**Parameters:** Three: a number of trials `N`, a set of probabilities, and an optional model identifier

**Return Type:** A set of integers, each corresponding to one of the given probabilities

**Description:** `MULTINOM` is *Scenarios*' name for the multinomial distribution. The multinomial distribution is a generalization of the categorical distribution (`CHOICE`) and also of the `BINOMIAL` distribution. A set of probabilities specifies the likelihoods of certain outcomes, and `N` trials are performed. The number of times each separate outcome occurs is returned, in the form of an array. As `MULTINOM` is a multivariable probability distribution, separate calls with the same parameter values on the same spreadsheet return the same value, unless a different model identifier is specified.

**Error Conditions:** The number of trials `N` must be a number greater than zero. The set of probabilities must be an array consisting of numbers greater than zero. The model identifier, if specified, must be an integer.
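As a rough sketch of what a multinomial draw returns, the Python function below performs `N` categorical trials and tallies how often each outcome occurred. It ignores the model identifier, whose behaviour is specific to *Scenarios*; the name `multinom` is used here only for illustration.

```python
import random

def multinom(n, probabilities, rng=random):
    """Run n categorical trials and return the count of each outcome as a list."""
    probabilities = list(probabilities)
    counts = [0] * len(probabilities)
    outcomes = range(len(probabilities))
    for outcome in rng.choices(outcomes, weights=probabilities, k=n):
        counts[outcome] += 1
    return counts
```

The returned counts always sum to `n`. With two probabilities the first count behaves like a binomial draw, and with `n = 1` the result marks a single categorical choice.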

**Parameters:** One: a rate `r`

**Return Type:** A strictly-positive floating-point number

**Description:** The `EXPONENTIAL` distribution is similar to the geometric distribution, but includes non-integer values. It is commonly used to model the timing of real-valued processes that can be assumed to have a constant rate over time.

**Error Conditions:** The argument to `EXPONENTIAL` must be a scalar number that is greater than zero.

**Parameters:** One: a number of degrees of freedom `f`, which in contrast to most implementations of the chi-squared distribution is allowed to be any floating-point number greater than zero.

**Return Type:** Any nonnegative floating-point number

**Description:** The chi-squared distribution is commonly used in frequentist hypothesis testing, and is closely related to the gamma distribution, which is widely used in Bayesian statistics.

**Error Conditions:** The number of degrees of freedom must be a number greater than zero.

**Parameters:** Two: a shape parameter `a` and a scale parameter `b`

**Return Type:** Any positive floating-point number

**Description:** `ERLANG` is the *Scenarios* implementation of the Erlang and gamma distributions. (It's both because the Erlang distribution is a special case of the gamma distribution where the shape parameter `a` is an integer.) Contrary to what the name suggests, the shape parameter `a` is not converted into an integer in `ERLANG`, because it is intended to function as a gamma distribution as well. (If desired, this conversion can be done manually, e.g. `=ERLANG(INT(a), b)`.) The gamma distribution is commonly used in Bayesian statistics, as are its special cases the chi-square and exponential distributions.

**Error Conditions:** The shape and scale parameters must be positive scalar numbers.

**Parameters:** Two: a shape parameter `a` and a scale parameter `b`

**Return Type:** Any positive real number

**Description:** `INV_GAMMA` is the *Scenarios* implementation of the inverse-gamma distribution. The inverse-gamma distribution is commonly used in Bayesian statistics as a conjugate prior for certain other distributions. An inverse-gamma random variable is the reciprocal of a gamma-distributed (Erlang-distributed) random variable.

**Error Conditions:** The shape and scale parameters of the inverse-gamma distribution must both be positive scalar numbers.

**Parameters:** Two: a shape parameter `m` and a spread parameter `w`

**Return Type:** Any positive real number

**Description:** `NAKAGAMI` is the *Scenarios* implementation of the Nakagami distribution. The Nakagami distribution is closely related to the gamma and the chi-squared distribution. The Nakagami distribution is commonly used to describe the attenuation of wireless signals over space.

**Error Conditions:** The Nakagami shape parameter must be greater than 1/2; the spread parameter must be greater than zero.

**Parameters:** Two: one number of degrees of freedom `d`, and a second number of degrees of freedom `e`

**Return Type:** Any positive real number

**Description:** The `F` distribution is very frequently used in frequentist hypothesis tests; an F-distributed random variable is essentially the ratio of two chi-squared random variables with degrees of freedom `d` and `e`.

**Error Conditions:** The degrees of freedom `d` and `e` must be positive scalar numbers.

**Parameters:** Two: a minimum `m` and a scale parameter `c`

**Return Type:** Any floating-point number greater than or equal to the minimum value

**Description:** The `LEVY` distribution is closely related to the inverse-gamma distribution. The Levy distribution is notable for being one of the few 'stable' continuous distribution types (along with the Gaussian distribution and the Cauchy distribution), and is extremely heavy-tailed.

**Error Conditions:** The minimum and scale parameters of the Levy distribution must be numbers; the scale parameter must be positive.

**Parameters:** Two: a location parameter `m` and a scale parameter `s`

**Return Type:** Any positive floating-point value

**Description:** `LOG_NORMAL` is the *Scenarios* implementation of the log-normal distribution. The log-normal distribution is a skewed, heavy-tailed distribution that ranges over the positive real numbers, commonly used in finance to describe prices and in engineering to describe measurements. A log-normal random variable with location parameter `m` and scale parameter `s` is the exponential of a normal random variable with mean `m` and standard deviation `s`; that is, `=LOG_NORMAL(m, s)` is equivalent to `=EXP(GAUSSIAN(m, s))`.

**Error Conditions:** Both arguments to `LOG_NORMAL` must be scalar numbers. The scale parameter `s` must be greater than zero.

**Parameters:** One: a scale parameter `s`

**Return Type:** A nonnegative floating-point value

**Description:** The `RAYLEIGH` distribution is a skewed distribution that ranges over the positive real numbers, equal to the square root of the sum of the squares of two normal random variables, each with mean zero and standard deviation `s`. That is, `=RAYLEIGH(s)` is equivalent to `=SQRT(GAUSSIAN(0, s)^2 + GAUSSIAN(0, s)^2)`. A Rayleigh random variable can therefore be understood as the norm of a 2-vector, or complex number, whose components are homoscedastic normal random variables. For this reason, the Rayleigh distribution is often used in physics and signal analysis.

**Error Conditions:** The scale parameter `s` must be a number greater than zero.
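The equivalence stated above can be mirrored in a few lines of Python, where a Rayleigh draw is the length of a 2-vector with independent normal components. The function name `rayleigh` is illustrative and is not the *Scenarios* implementation.

```python
import math
import random

def rayleigh(s, rng=random):
    """Norm of a 2-vector whose components are independent Gaussian(0, s) draws."""
    if s <= 0:
        raise ValueError("scale must be greater than zero")
    x = rng.gauss(0.0, s)
    y = rng.gauss(0.0, s)
    return math.hypot(x, y)  # sqrt(x**2 + y**2)
```

The long-run average of these draws is `s * sqrt(pi / 2)`, roughly `1.25 * s`.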

**Parameters:** Two: a mean `m` and a scale parameter `s`

**Return Type:** A floating-point value

**Description:** The `LAPLACE` distribution is a symmetric distribution that ranges over all real numbers, essentially a double-sided exponential distribution centred at its mean `m`. The Laplace distribution is often applied to describe errors for which the Gaussian distribution is inappropriate. Using the Laplace distribution rather than the Gaussian distribution to describe errors results in models, such as **lasso regression**, which are basically more likely to assume errors are zero.

**Error Conditions:** The mean `m` and scale parameter `s` must be scalar numbers; the scale `s` must be greater than zero.

**Parameters:** Two: a mean `m` and a scale parameter `s`

**Return Type:** A floating-point value

**Description:** The `LOGISTIC` distribution is a symmetric distribution centred at its mean `m` that ranges over all real numbers. The logistic distribution is often used as a heavier-tailed alternative to the normal distribution.

**Error Conditions:** The mean `m` and scale parameter `s` must be numbers; the scale `s` must be greater than zero.

**Parameters:** Two: a location parameter `a` and a shape parameter `b`

**Return Type:** A nonnegative floating-point value

**Description:** The `FISK` distribution, also called the log-logistic distribution, is a positive-valued heavy-tailed distribution closely related to the logistic distribution. Specifically, `=FISK(a, b)` is shorthand for `=EXP(LOGISTIC(a, 1/b))`. The Fisk distribution is often applied in survival analysis and economics.

**Error Conditions:** The scale and shape parameters must both be numbers greater than zero.
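The shorthand `=EXP(LOGISTIC(a, 1/b))` can be traced in Python. The sketch below samples the logistic distribution by inverting its cumulative distribution function, which is a standard technique chosen here for illustration and not necessarily what *Scenarios* does; the names `logistic` and `fisk` are stand-ins.

```python
import math
import random

def logistic(m, s, rng=random):
    """Logistic draw with mean m and scale s, via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() can return exactly 0.0
        u = rng.random()
    return m + s * math.log(u / (1.0 - u))

def fisk(a, b, rng=random):
    """Log-logistic draw: the exponential of a logistic draw with scale 1/b."""
    if b <= 0:
        raise ValueError("shape must be greater than zero")
    return math.exp(logistic(a, 1.0 / b, rng))
```

Because the logistic distribution is symmetric around `a`, half of the `fisk` draws fall below `exp(a)`, which is therefore the median.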

**Parameters:** Three: a location parameter, a scale parameter, and a number of degrees of freedom (`f`)

**Return Type:** Any real number

**Description:** `STUDENT_T` is the *Scenarios* implementation of the unstandardised Student's T distribution. The more common standardised Student's T distribution with `f` degrees of freedom can be recovered simply by writing `=STUDENT_T(0, 1, f)`. The Student's T distribution is commonly used in statistics as a wider-tailed alternative to the Gaussian distribution, where the heaviness of the tail is controlled by the number of degrees of freedom.

**Error Conditions:** All three parameters must be numbers. The scale and degrees of freedom must be greater than zero.

**Parameters:** Two: a location parameter `m` and a scale parameter `s`

**Return Type:** Any floating-point number

**Description:** The `CAUCHY` distribution is a special case of the Student's T distribution, where the number of degrees of freedom is one. That is, `=CAUCHY(m, s)` is equivalent to `=STUDENT_T(m, s, 1)`. The Cauchy distribution is an extremely heavy-tailed distribution that is notable for having no defined mean, variance, or higher moments (however, its median is the location parameter `m`).

**Error Conditions:** The scale parameter must be a number greater than zero.

**Parameters:** One: a median

**Return Type:** Any real number

**Description:** The `NEAR` distribution in *Scenarios* is a variant of the Cauchy distribution. Specifically, `=NEAR(X)` is shorthand for `=STUDENT_T(X, ABS(X/10), 1)`, or alternately `=CAUCHY(X, ABS(X/10))`, for any nonzero `X`. This distribution is intended primarily for use in `ACTUAL` statements, as it is sufficiently wide-tailed to capture the vague human intuition that goes along with the word "near".

**Error Conditions:** The median of the `NEAR` distribution must be a nonzero number.
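To make the shorthand concrete, the following Python sketch builds `NEAR` on top of a Cauchy sampler, using a scale equal to one tenth of the absolute value of the median. The Cauchy draw uses inverse-CDF sampling, a standard method assumed here for illustration; the names `cauchy` and `near` are stand-ins for the *Scenarios* functions.

```python
import math
import random

def cauchy(m, s, rng=random):
    """Cauchy draw with location (median) m and scale s, via the inverse CDF."""
    if s <= 0:
        raise ValueError("scale must be greater than zero")
    return m + s * math.tan(math.pi * (rng.random() - 0.5))

def near(x, rng=random):
    """Cauchy draw centred on x whose scale is one tenth of |x|."""
    if x == 0:
        raise ValueError("the median must be nonzero, otherwise the scale would be zero")
    return cauchy(x, abs(x / 10.0), rng)
```

For example, half of the draws from `near(50)` land between 45 and 55, while the heavy tails still allow occasional values far from 50.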

**Parameters:** Two: a location parameter `m` and a scale parameter `s`

**Return Type:** Any positive floating-point number

**Description:** A `LOG_CAUCHY` random variable is the exponential of a Cauchy random variable with the corresponding location parameter `m` and scale parameter `s`. That is, `=LOG_CAUCHY(m, s)` is shorthand for `=EXP(CAUCHY(m, s))`. The log-Cauchy distribution is an extremely heavy-tailed distribution, often used in Bayesian statistics as an extremely broad prior on a positive value.

**Error Conditions:** Both arguments must be numbers. The scale parameter `s` must be greater than zero.

**Parameters:** Two: a number of success pseudocounts `a` and a number of failure pseudocounts `b`

**Return Type:** A real number between zero and one, inclusive

**Description:** The `BETA` distribution is often used to make assumptions about an unknown probability, and then to learn that probability from data. This leads to the interpretation of its parameters as 'pseudocounts' - the higher the pseudo-sample-size `n=a+b`, the more concentrated the beta distribution will become around its mean, which is determined by the relative sizes of `a` and `b`. The uniform continuous distribution from 0 to 1 is a special case of the beta distribution: `=BETA(1, 1)` is equivalent to `=BETWEEN(0, 1)`, which is equivalent to `=RAND()`.

**Error Conditions:** The arguments to `BETA` must be scalar numbers greater than zero.
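The pseudocount interpretation can be made concrete with the textbook formulas for the beta distribution's mean and variance, together with the standard conjugate update for coin-flip data. The helper names below are hypothetical, and the update rule is the usual Beta-Bernoulli result rather than a description of how `ACTUAL` works internally.

```python
def beta_mean(a, b):
    """Mean of a beta distribution with pseudocounts a and b."""
    return a / (a + b)

def beta_variance(a, b):
    """Variance of a beta distribution; it shrinks as n = a + b grows."""
    n = a + b
    return a * b / (n * n * (n + 1))

def beta_update(a, b, successes, failures):
    """Pseudocounts after observing data (standard Beta-Bernoulli update)."""
    return a + successes, b + failures
```

Starting from the uniform prior with `a = 1` and `b = 1` and observing 7 successes and 3 failures gives pseudocounts `(8, 4)`, whose mean is 2/3. Scaling both pseudocounts by ten keeps the mean but makes the distribution much narrower.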

**Parameters:** Three: a minimum value `m`, a shape parameter `k`, and a tail thickness parameter `a`

**Return Type:** A floating-point number greater than the minimum value `m`

**Description:** The `PARETO` distribution is a heavy-tailed power-law distribution commonly used in finance and the social sciences, among others, to model phenomena with significant skew. The implementation provided by *Scenarios* is commonly known as the Type-II Pareto distribution. To obtain a random variable distributed according to the often-used and simpler Type-I Pareto distribution with minimum value `m` and tail thickness `a`, simply use a Type-II Pareto distribution whose shape parameter is equal to its minimum value: `=PARETO(m, m, a)`. The Type-II Pareto distribution is also a generalization of the Lomax distribution: to obtain a Lomax random variable with scale `k` and tail thickness `a`, simply use a Type-II Pareto distribution whose minimum value is zero: `=PARETO(0, k, a)`.

**Error Conditions:** The arguments to `PARETO` must all be numbers. Additionally, the shape parameter `k` and the tail thickness parameter `a` must be greater than zero.
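The Type-I and Lomax special cases can be checked with a small Python sketch. It assumes the usual textbook form of the Type-II Pareto distribution, whose survival function is `(1 + (x - m) / k) ** (-a)` for `x >= m`; the source does not spell out the formula, and the function names are hypothetical.

```python
import random

def pareto_quantile(q, m, k, a):
    """Value below which a fraction q of Type-II Pareto draws fall (assumed textbook form)."""
    if k <= 0 or a <= 0:
        raise ValueError("k and a must be greater than zero")
    if not 0 <= q < 1:
        raise ValueError("q must be in [0, 1)")
    return m + k * ((1.0 - q) ** (-1.0 / a) - 1.0)

def pareto(m, k, a, rng=random):
    """One Type-II Pareto draw, obtained by feeding a uniform draw into the quantile function."""
    return pareto_quantile(rng.random(), m, k, a)
```

Setting `k` equal to `m` collapses the quantile to `m * (1 - q) ** (-1 / a)`, which is the Type-I Pareto quantile, and setting `m` to zero leaves the Lomax quantile `k * ((1 - q) ** (-1 / a) - 1)`.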

**Parameters:** Two: a mean and a scale parameter

**Return Type:** A floating-point number greater than zero

**Description:** The `WALD` distribution, also called the inverse-Gaussian distribution, is a heavy-tailed positive-valued distribution. Despite being called the inverse-Gaussian distribution, it is not the inverse of a Gaussian random variable with the same parameters; it is called this for more indirect and complex reasons.

**Error Conditions:** The mean value and scale parameter must be positive floating-point numbers.

**Parameters:** Two: a minimum and a maximum value

**Return Type:** Any floating-point number between the minimum and maximum values

**Description:** The `RECIPROCAL` distribution is a bounded distribution characterized by a minimum and maximum value. The minimum value is its mode. The most common application of the reciprocal distribution is in describing floating-point arithmetic within computers.

**Error Conditions:** The minimum and maximum values must be greater than zero. The maximum value must be strictly greater than the minimum value.

**Parameters:** Two: a location parameter and a scale parameter

**Return Type:** Any real number

**Description:** The `GUMBEL` distribution is an extreme-value distribution often used by statisticians to describe the minima or maxima of large datasets, and in machine learning to train smooth approximations of categorical distributions. Like the Frechet distribution, it is a special case of the generalized extreme value (GEV) distribution. The variant used in *Scenarios* is the **max-stable** Gumbel distribution, rather than the less common **min-stable** Gumbel variant.

**Error Conditions:** The location parameter and scale parameter must both be numbers; the scale parameter must additionally be positive.

**Parameters:** Two: a scale parameter and a shape parameter

**Return Type:** Any nonnegative floating-point number

**Description:** The `GOMPERTZ` distribution is closely related to the Gumbel distribution - essentially, it is the Gumbel distribution restricted to positive numbers. The Gompertz distribution is most famously used by actuaries and demographers to model human lifespans, with appropriate scale and shape parameters.

**Error Conditions:** Both arguments to `GOMPERTZ` must be nonnegative numbers.

**Parameters:** Three: a minimum value `m`, a scale parameter `s`, and a shape parameter `a`

**Return Type:** Any floating-point number greater than the minimum `m`

**Description:** The `FRECHET` distribution is an extreme-value distribution often used by statisticians to describe minima and maxima of various datasets, in the case that a "minimum possible minimum" `m` can be said to exist. Both it and the Gumbel distribution are special cases of the generalized extreme value (GEV) distribution.

**Error Conditions:** All three arguments to `FRECHET` must be numbers. Additionally, the scale parameter `s` and shape parameter `a` must be positive.

**Parameters:** Three: a location parameter `m`, a scale parameter `s`, and a shape parameter `a`

**Return Type:** Complicated. Always a floating-point number, but depending on the value of the shape parameter `a`, the range will change.

**Description:** `GEV` is the *Scenarios* name for the Generalized Extreme Value (GEV) distribution. The GEV distribution is a family of distributions that includes the Gumbel distribution, Frechet distribution, and Weibull distribution as special cases, depending on the value of the shape parameter (zero, positive, and negative, essentially; consult Wikipedia for more information).

**Error Conditions:** All three arguments to `GEV` must be numbers. The scale parameter `s` must be positive.
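How the range depends on the shape parameter can be sketched as follows. This assumes that `a` plays the role of the shape parameter in the usual textbook parameterization of the GEV distribution, which the source does not confirm; the function name `gev_support` is hypothetical.

```python
import math

def gev_support(m, s, a):
    """Return (lower, upper) bounds of the GEV range under the textbook parameterization."""
    if s <= 0:
        raise ValueError("scale must be positive")
    if a == 0:
        # Gumbel case: every real number is possible.
        return (-math.inf, math.inf)
    bound = m - s / a
    if a > 0:
        # Frechet case: bounded below.
        return (bound, math.inf)
    # Weibull case: bounded above.
    return (-math.inf, bound)
```

Under that assumption, a location of 0 and scale of 1 give a range of all reals for `a = 0`, values above -2 for `a = 0.5`, and values below 2 for `a = -0.5`.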