# `Statistex`
[🔗](https://github.com/bencheeorg/statistex/blob/1.1.1/lib/statistex.ex#L1)

Calculate all the statistics for given samples.

Compute everything at once with `statistics/1`, or call the individual functions one by one.

To avoid wasting computation, functions can be given the values they depend on as optional keyword arguments, so that these values are used instead of being recalculated. For an example see `average/2`.

Most statistics don't really make sense when there are no samples. For that reason, all functions except `sample_size/1` raise `ArgumentError` when handed an empty list.
If your program might hand Statistex an empty list, it is suggested that you check for that before calling Statistex and handle the "no reasonable statistics" path entirely separately.
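One way to keep the empty case separate is a small wrapper that pattern matches on `[]` before any statistics are computed. The following is only a sketch: `SafeStats`, `summarize/2` and the `:no_samples` return value are hypothetical names for illustration, and the statistics function is passed in as an argument (defaulting to `Statistex.statistics/1`) so the pattern itself does not depend on how you call the library.

```elixir
defmodule SafeStats do
  # Hypothetical wrapper, not part of Statistex.
  # `stats_fun` defaults to Statistex.statistics/1 but can be any 1-arity function.
  def summarize(samples, stats_fun \\ &Statistex.statistics/1)

  # The "no reasonable statistics" path: nothing is computed, nothing raises.
  def summarize([], _stats_fun), do: :no_samples

  # Non-empty lists are handed on to the statistics function.
  def summarize(samples, stats_fun) when is_list(samples) do
    {:ok, stats_fun.(samples)}
  end
end

SafeStats.summarize([])
# => :no_samples
```

Callers can then match on `:no_samples` and `{:ok, statistics}` instead of rescuing `ArgumentError`.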

Limitations of the Erlang standard library apply (particularly `:math.pow/2` raises for VERY large numbers).

# `configuration`

```elixir
@type configuration() :: keyword()
```

The optional configuration handed to a lot of functions.

Keys used are function dependent and are documented there.

# `mode`

```elixir
@type mode() :: [sample()] | sample() | nil
```

Careful with the mode: it might be multiple values, one value, or nothing. 😱 See `mode/1`.

# `percentiles`

```elixir
@type percentiles() :: %{required(number()) => float()}
```

The percentiles map returned by `percentiles/2`.

# `sample`

```elixir
@type sample() :: number()
```

A single sample.

# `samples`

```elixir
@type samples() :: [sample(), ...]
```

The samples to compute statistics from.

Importantly, this list must not be empty: it has to include at least one sample, otherwise an `ArgumentError` will be raised.

# `t`

```elixir
@type t() :: %Statistex{
  average: float(),
  frequency_distribution: %{required(sample()) => pos_integer()},
  lower_outlier_bound: number(),
  maximum: number(),
  median: number(),
  minimum: number(),
  mode: mode(),
  outliers: [number()],
  percentiles: percentiles(),
  sample_size: non_neg_integer(),
  standard_deviation: float(),
  standard_deviation_ratio: float(),
  total: number(),
  upper_outlier_bound: number(),
  variance: float()
}
```

All the statistics `statistics/1` computes from the samples.

For a description of what a given value means, check out the function of the same name. It will have an explanation.

# `average`

```elixir
@spec average(
  samples(),
  keyword()
) :: float()
```

Calculate the average.

It's... well, the average.
When the given samples are empty there is no average.

`ArgumentError` is raised if the given list is empty.

## Options
If you already have these values, you can provide both `:total` and `:sample_size`. Should you provide both, the provided samples are wholly ignored.
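Why the samples can be ignored entirely is easiest to see in a small sketch of the pattern. This is not Statistex's actual source: `AverageSketch` is a hypothetical module, and it assumes the options are looked up lazily with `Keyword.get_lazy/3` so the fallback computation only runs when an option is missing.

```elixir
defmodule AverageSketch do
  # Hypothetical module for illustration; not Statistex's implementation.
  def average(samples, options \\ []) do
    # Each fallback is wrapped in a function, so it only runs when the
    # corresponding option was not provided.
    total = Keyword.get_lazy(options, :total, fn -> Enum.sum(samples) end)
    sample_size = Keyword.get_lazy(options, :sample_size, fn -> length(samples) end)

    total / sample_size
  end
end

AverageSketch.average([2, 3, 4], sample_size: 3)
# => 3.0

# With both options given, `samples` is never touched.
AverageSketch.average(:ignored, total: 100, sample_size: 5)
# => 20.0
```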

## Examples

    iex> Statistex.average([5])
    5.0

    iex> Statistex.average([600, 470, 170, 430, 300])
    394.0

    iex> Statistex.average([-1, 1])
    0.0

    iex> Statistex.average([2, 3, 4], sample_size: 3)
    3.0

    iex> Statistex.average([20, 20, 20, 20, 20], total: 100, sample_size: 5)
    20.0

    iex> Statistex.average(:ignored, total: 100, sample_size: 5)
    20.0

    iex> Statistex.average([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `frequency_distribution`

```elixir
@spec frequency_distribution(samples()) :: %{required(sample()) => pos_integer()}
```

A map showing which sample occurs how often in the samples.

Maps each concrete occurrence of a sample to the number of times it was observed in the samples.

`ArgumentError` is raised if the given list is empty.

## Examples

    iex> Statistex.frequency_distribution([1, 2, 4.23, 7, 2, 99])
    %{
      2 => 2,
      1 => 1,
      4.23 => 1,
      7 => 1,
      99 => 1
    }

    iex> Statistex.frequency_distribution([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `maximum`

```elixir
@spec maximum(samples()) :: sample()
```

The biggest sample.

`ArgumentError` is raised if the given list is empty.

## Examples

    iex> Statistex.maximum([1, 100, 24])
    100

    iex> Statistex.maximum([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `median`

```elixir
@spec median(
  samples(),
  keyword()
) :: number()
```

Calculates the median of the given samples.

The median can be thought of as separating the higher half from the lower half of the samples.
When all samples are sorted, this is the middle value (or the average of the two middle values when the number of samples is even).
More stable than the average, as it is less affected by extreme values.
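The definition above can be sketched in a few lines of plain Elixir. `MedianSketch` is a hypothetical module that follows the textbook definition directly (sort, then take the middle value or the average of the two middle values); Statistex itself may arrive at the same number differently, for instance via the 50th percentile.

```elixir
defmodule MedianSketch do
  # Hypothetical helper following the textbook definition of the median.
  def median(samples) when is_list(samples) and samples != [] do
    sorted = Enum.sort(samples)
    count = length(sorted)
    middle = div(count, 2)

    if rem(count, 2) == 1 do
      # Odd number of samples: the single middle value, as a float.
      Enum.at(sorted, middle) * 1.0
    else
      # Even number of samples: average of the two middle values.
      (Enum.at(sorted, middle - 1) + Enum.at(sorted, middle)) / 2
    end
  end
end

MedianSketch.median([1, 3, 4, 6, 7, 8, 9])
# => 6.0

MedianSketch.median([1, 2, 3, 4, 5, 6, 8, 9])
# => 4.5
```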

`ArgumentError` is raised if the given list is empty.

## Options
* `:percentiles` - a map of already calculated percentiles to fetch the median from (it is the 50th percentile). If the map doesn't include the 50th percentile, it is still computed.
* `:sorted?` - indicates that the samples you're passing in are already sorted. Defaults to `false`. Only set this if they are truly sorted, otherwise your results will be wrong. Sorting only occurs when percentiles aren't provided.

## Examples

    iex> Statistex.median([1, 3, 4, 6, 7, 8, 9])
    6.0

    iex> Statistex.median([1, 3, 4, 6, 7, 8, 9], percentiles: %{50 => 6.0})
    6.0

    iex> Statistex.median([1, 3, 4, 6, 7, 8, 9], percentiles: %{25 => 3.0})
    6.0

    iex> Statistex.median([1, 3, 4, 6, 7, 8, 9], sorted?: true)
    6.0

    iex> Statistex.median([1, 2, 3, 4, 5, 6, 8, 9])
    4.5

    iex> Statistex.median([0])
    0.0

    iex> Statistex.median([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `minimum`

```elixir
@spec minimum(samples()) :: sample()
```

The smallest sample.

`ArgumentError` is raised if the given list is empty.

## Examples

    iex> Statistex.minimum([1, 100, 24])
    1

    iex> Statistex.minimum([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `mode`

```elixir
@spec mode(
  samples(),
  keyword()
) :: mode()
```

Calculates the mode of the given samples.

The mode is the sample (or samples) that occurs most often. Often it is one value, but it can be multiple values if they occur the same number of times. If no value occurs at least twice, there is no mode and the function returns `nil`.
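The three possible shapes of the result (a single value, a list of values, or `nil`) can be illustrated with a short sketch built on `Enum.frequencies/1`. `ModeSketch` is a hypothetical module written only from the description above. It is not Statistex's source, and the order of a multi-value result is not guaranteed.

```elixir
defmodule ModeSketch do
  # Hypothetical helper illustrating the three result shapes of a mode.
  def mode(samples) when is_list(samples) and samples != [] do
    frequencies = Enum.frequencies(samples)
    highest_count = frequencies |> Map.values() |> Enum.max()

    # All samples that share the highest count.
    most_frequent =
      for {sample, count} <- frequencies, count == highest_count, do: sample

    case {highest_count, most_frequent} do
      # Nothing occurs at least twice: no mode.
      {count, _} when count < 2 -> nil
      # Exactly one most frequent sample: return it bare.
      {_, [single]} -> single
      # Several samples tie: return all of them.
      {_, several} -> several
    end
  end
end

ModeSketch.mode([5, 3, 4, 5, 1, 3, 1, 3])
# => 3

ModeSketch.mode([1, 2, 3, 4, 5])
# => nil
```

Because the return type varies, callers usually want to match on all three cases explicitly.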

`ArgumentError` is raised if the given list is empty.

## Options

If already calculated, the `:frequency_distribution` option can be provided to avoid recalculating it.

## Examples

    iex> Statistex.mode([5, 3, 4, 5, 1, 3, 1, 3])
    3

    iex> Statistex.mode([1, 2, 3, 4, 5])
    nil

    # When a measurement failed and nil is reported as the only value
    iex> Statistex.mode([nil])
    nil

    iex> Statistex.mode([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

    iex> mode = Statistex.mode([5, 3, 4, 5, 1, 3, 1])
    iex> Enum.sort(mode)
    [1, 3, 5]

# `outlier_bounds`

```elixir
@spec outlier_bounds(
  samples(),
  keyword()
) :: {lower :: number(), upper :: number()}
```

Calculates the lower and upper bound for outliers.

Any sample that is `<` the lower bound or `>` the upper bound is an outlier of
the given `samples`.

The list passed needs to be non-empty, otherwise an `ArgumentError` is raised.
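The text does not spell out how the bounds are derived from the percentiles. The documented examples are consistent with the common interquartile range rule, where the bounds lie 1.5 times the interquartile range below the 25th and above the 75th percentile. The sketch below assumes that rule. `OutlierBoundsSketch` and the factor `1.5` are assumptions for illustration, not taken from Statistex's source.

```elixir
defmodule OutlierBoundsSketch do
  # Assumption: the widely used 1.5 * IQR rule.
  @iqr_factor 1.5

  # Takes a percentiles map that contains at least the 25th and 75th percentile.
  def bounds(%{25 => first_quartile, 75 => third_quartile}) do
    interquartile_range = third_quartile - first_quartile

    {
      first_quartile - @iqr_factor * interquartile_range,
      third_quartile + @iqr_factor * interquartile_range
    }
  end
end

OutlierBoundsSketch.bounds(%{25 => 3.0, 75 => 5.0})
# => {0.0, 8.0}
```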

## Options
* `:percentiles` - a map of already calculated percentiles (the 25th and 75th are needed). If the map doesn't include them, they are still computed.
* `:sorted?` - indicates that the samples you're passing in are already sorted. Defaults to `false`. Only set this if they are truly sorted, otherwise your results will be wrong. Sorting only occurs when percentiles aren't provided.

## Examples

    iex> Statistex.outlier_bounds([3, 4, 5])
    {0.0, 8.0}

    iex> Statistex.outlier_bounds([4, 5, 3])
    {0.0, 8.0}

    iex> Statistex.outlier_bounds([3, 4, 5], sorted?: true)
    {0.0, 8.0}

    iex> Statistex.outlier_bounds([3, 4, 5], percentiles: %{25 => 3.0, 75 => 5.0})
    {0.0, 8.0}

    iex> Statistex.outlier_bounds([1, 2, 6, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50])
    {22.5, 66.5}

    iex> Statistex.outlier_bounds([50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 99, 99, 99])
    {31.625, 80.625}

    iex> Statistex.outlier_bounds([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `outliers`

```elixir
@spec outliers(
  samples(),
  keyword()
) :: {samples() | [], samples()}
```

Returns all outliers for the given `samples`, along with the remaining values.

Returns `{outliers, remaining_samples}`, where `remaining_samples` has the outliers removed.
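Given already computed bounds, producing a tuple of this shape is a single partition over the samples. The sketch below uses `Enum.split_with/2`. `OutliersSketch` is a hypothetical module, and the result is left in input order, which may differ from the order Statistex returns.

```elixir
defmodule OutliersSketch do
  # Hypothetical helper: partitions samples into {outliers, remaining_samples}
  # using bounds in the {lower, upper} shape.
  def split(samples, {lower, upper}) do
    Enum.split_with(samples, fn sample -> sample < lower or sample > upper end)
  end
end

OutliersSketch.split([3, 4, 5], {0.0, 8.0})
# => {[], [3, 4, 5]}

OutliersSketch.split([1, 50, 2, 50, 99], {22.5, 66.5})
# => {[1, 2, 99], [50, 50]}
```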

`ArgumentError` is raised if the given list is empty.

## Options
* `:outlier_bounds` - the outlier bounds, if you have already calculated them.
* `:percentiles` - a map of already calculated percentiles (the 25th and 75th are needed). If the map doesn't include them, they are still computed.
* `:sorted?` - indicates that the samples you're passing in are already sorted. Defaults to `false`. Only set this if they are truly sorted, otherwise your results will be wrong. Sorting only occurs when percentiles aren't provided.

## Examples

    iex> Statistex.outliers([3, 4, 5])
    {[], [3, 4, 5]}

    iex> Statistex.outliers([1, 2, 6, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50])
    {[1, 2, 6], [50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50]}

    iex> Statistex.outliers([50, 50, 1, 50, 50, 50, 50, 50, 2, 50, 50, 50, 50, 6])
    {[1, 2, 6], [50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50]}

    iex> Statistex.outliers([50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 99, 99, 99])
    {[99, 99, 99], [50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50]}

    iex> Statistex.outliers([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `percentiles`

```elixir
@spec percentiles(samples(), number() | [number(), ...]) :: percentiles()
```

See `percentiles/3`.

# `percentiles`

Calculates the value at the `percentile_rank`-th percentile.

Think of this as the value below which `percentile_rank` percent of the samples lie.
For example, if `Statistex.percentiles(samples, 99)` returns `%{99 => 123.45}`,
99% of samples are less than 123.45.

Passing a number for `percentile_rank` calculates a single percentile.
Passing a list of numbers calculates multiple percentiles. In both cases the result
is a map like `%{90 => 45.6, 99 => 78.9}`, where the keys are the percentile
numbers and the values are the percentile values.

Percentiles must be between 0 and 100 (excluding the boundaries).

The method used for interpolation is [described here and recommended by NIST](https://www.itl.nist.gov/div898/handbook/prc/section2/prc262.htm).
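As a rough illustration of that interpolation: the position in the sorted samples is computed as `rank / 100 * (n + 1)`, its integer part `k` selects a sample, and its fractional part `d` interpolates linearly towards the next one. The sketch below is written from that description only. `PercentileSketch` is a hypothetical module, not Statistex's source, and it returns a bare float rather than a map.

```elixir
defmodule PercentileSketch do
  # Hypothetical helper: one percentile via linear interpolation between
  # neighbouring ranks, with the position computed as rank / 100 * (n + 1).
  def percentile(samples, rank)
      when is_list(samples) and samples != [] and rank > 0 and rank < 100 do
    sorted = Enum.sort(samples)
    count = length(sorted)

    position = rank / 100 * (count + 1)
    k = trunc(position)
    d = position - k

    cond do
      # Position before the first sample: use the smallest sample.
      k < 1 ->
        hd(sorted) * 1.0

      # Position at or beyond the last sample: use the largest sample.
      k >= count ->
        List.last(sorted) * 1.0

      # Otherwise interpolate between the k-th and (k + 1)-th sample (1-based).
      true ->
        lower = Enum.at(sorted, k - 1)
        upper = Enum.at(sorted, k)
        lower + d * (upper - lower)
    end
  end
end

PercentileSketch.percentile([5, 3, 4, 5, 1, 3, 1, 3], 75)
# => 4.75
```

For the eight samples above, the position is `0.75 * 9 = 6.75`, so the result lies three quarters of the way from the 6th sorted sample (4) to the 7th (5).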

`ArgumentError` is raised if the given list is empty.

## Options

* `:sorted?` - indicates that the samples you're passing in are already sorted. Defaults to `false`. Only set this if they are truly sorted, otherwise your results will be wrong.

## Examples

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 12.5)
    %{12.5 => 1.0}

    iex> Statistex.percentiles([1, 1, 3, 3, 3, 4, 5, 5], 12.5, sorted?: true)
    %{12.5 => 1.0}

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [50])
    %{50 => 3.0}

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [75])
    %{75 => 4.75}

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 99)
    %{99 => 5.0}

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], [50, 75, 99])
    %{50 => 3.0, 75 => 4.75, 99 => 5.0}

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 100)
    ** (ArgumentError) percentile must be between 0 and 100, got: 100

    iex> Statistex.percentiles([5, 3, 4, 5, 1, 3, 1, 3], 0)
    ** (ArgumentError) percentile must be between 0 and 100, got: 0

    iex> Statistex.percentiles([], [50])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `sample_size`

```elixir
@spec sample_size([sample()]) :: non_neg_integer()
```

Number of samples in the given list.

Nothing too fancy here, this just calls `length(list)` and is only provided for completeness' sake.

## Examples

    iex> Statistex.sample_size([])
    0

    iex> Statistex.sample_size([1, 1, 1, 1, 1])
    5

# `standard_deviation`

```elixir
@spec standard_deviation(
  samples(),
  keyword()
) :: float()
```

Calculate the standard deviation.

A measure of how much the samples vary (the higher, the more the samples vary). It's the square root of the variance. Unlike the variance, its unit is the same as that of the samples (calculating the variance involves squaring).

## Options
If already calculated, the `:variance` option can be provided to avoid recalculating it.

`ArgumentError` is raised if the given list is empty.

## Examples

    iex> Statistex.standard_deviation([4, 9, 11, 12, 17, 5, 8, 12, 12])
    4.0

    iex> Statistex.standard_deviation(:dontcare, variance: 16.0)
    4.0

    iex> Statistex.standard_deviation([42])
    0.0

    iex> Statistex.standard_deviation([1, 1, 1, 1, 1, 1, 1])
    0.0

    iex> Statistex.standard_deviation([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `standard_deviation_ratio`

```elixir
@spec standard_deviation_ratio(
  samples(),
  keyword()
) :: float()
```

Calculate the standard deviation relative to the average.

This helps put the absolute standard deviation value into perspective by expressing it relative to the average. It is the standard deviation as a fraction of the absolute value of the average.

`ArgumentError` is raised if the given list is empty.

## Options
If already calculated, the `:average` and `:standard_deviation` options can be provided to avoid recalculating those values.

If both values are provided, the provided samples will be ignored.
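Put as arithmetic, the ratio is the standard deviation divided by the absolute value of the average. A minimal sketch follows. `RatioSketch` is a hypothetical module, and an average of zero is simply left unhandled here, since the behaviour for that case is not described above.

```elixir
defmodule RatioSketch do
  # Hypothetical helper: standard deviation relative to the absolute average.
  # An average of zero is deliberately not handled in this sketch.
  def standard_deviation_ratio(average, standard_deviation) when average != 0 do
    standard_deviation / abs(average)
  end
end

RatioSketch.standard_deviation_ratio(10.0, 4.0)
# => 0.4

# The sign of the average does not matter.
RatioSketch.standard_deviation_ratio(-10.0, 4.0)
# => 0.4
```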

## Examples

    iex> Statistex.standard_deviation_ratio([4, 9, 11, 12, 17, 5, 8, 12, 12])
    0.4

    iex> Statistex.standard_deviation_ratio([-4, -9, -11, -12, -17, -5, -8, -12, -12])
    0.4

    iex> Statistex.standard_deviation_ratio([4, 9, 11, 12, 17, 5, 8, 12, 12], average: 10.0, standard_deviation: 4.0)
    0.4

    iex> Statistex.standard_deviation_ratio(:ignored, average: 10.0, standard_deviation: 4.0)
    0.4

    iex> Statistex.standard_deviation_ratio([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `statistics`

```elixir
@spec statistics(samples(), configuration()) :: t()
```

Calculate all statistics Statistex offers for a given list of numbers.

The statistics themselves are described in the individual functions that can be used to calculate individual values.

`ArgumentError` is raised if the given list is empty.

## Options

* `:percentiles` - the percentiles to calculate (see `percentiles/2`). The 25th, 50th (median) and 75th percentiles are always calculated.
* `:exclude_outliers` - can be set to `true` or `false`. Defaults to `false`. If this option is set to `true`, the outliers are excluded from the calculation of the statistics.
* `:sorted?` - indicates that the samples you're passing in are already sorted. Defaults to `false`. Only set this if they are truly sorted, otherwise your results will be wrong.

## Examples

    iex> Statistex.statistics([50, 50, 450, 450, 450, 500, 500, 500, 600, 900])
    %Statistex{
      total: 4450,
      average: 445.0,
      variance: 61_361.11111111111,
      standard_deviation: 247.71175004652304,
      standard_deviation_ratio: 0.5566556180820742,
      median: 475.0,
      percentiles: %{25 => 350.0, 50 => 475.0, 75 => 525.0},
      frequency_distribution: %{50 => 2, 450 => 3, 500 => 3, 600 => 1, 900 => 1},
      mode: [500, 450],
      minimum: 50,
      maximum: 900,
      lower_outlier_bound: 87.5,
      upper_outlier_bound: 787.5,
      outliers: [50, 50, 900],
      sample_size: 10
    }

    # excluding outliers changes the results
    iex> Statistex.statistics([50, 50, 450, 450, 450, 500, 500, 500, 600, 900], exclude_outliers: true)
    %Statistex{
      total: 3450,
      average: 492.85714285714283,
      variance: 2857.142857142857,
      standard_deviation: 53.452248382484875,
      standard_deviation_ratio: 0.1084538372977954,
      median: 500.0,
      percentiles: %{25 => 450.0, 50 => 500.0, 75 => 500.0},
      frequency_distribution: %{450 => 3, 500 => 3, 600 => 1},
      mode: [500, 450],
      maximum: 600,
      minimum: 450,
      lower_outlier_bound: 87.5,
      upper_outlier_bound: 787.5,
      outliers: [50, 50, 900],
      sample_size: 7
    }

    iex> Statistex.statistics([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `total`

```elixir
@spec total(samples()) :: number()
```

The total of all samples added together.

`ArgumentError` is raised if the given list is empty.

## Examples

    iex> Statistex.total([1, 2, 3, 4, 5])
    15

    iex> Statistex.total([10, 10.5, 5])
    25.5

    iex> Statistex.total([-10, 5, 3, 2])
    0

    iex> Statistex.total([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

# `variance`

```elixir
@spec variance(
  samples(),
  keyword()
) :: float()
```

Calculate the variance.

A measure of how much the samples vary (the higher, the more the samples vary). This is the variance of a sample, so the calculation divides by `sample_size - 1` (Bessel's correction).

`ArgumentError` is raised if the given list is empty.

## Options
If already calculated, the `:average` and `:sample_size` options can be provided to avoid recalculating those values.
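To make Bessel's correction concrete, here is a minimal sketch of a sample variance: sum the squared deviations from the average, then divide by `sample_size - 1` rather than `sample_size`. `VarianceSketch` is a hypothetical module, not Statistex's source. Returning `0.0` for a single sample is an assumption of this sketch that avoids a division by zero.

```elixir
defmodule VarianceSketch do
  # Hypothetical helper: sample variance with Bessel's correction.

  # A single sample has no spread. Handling it separately avoids dividing by zero.
  def variance([_single_sample]), do: 0.0

  def variance(samples) when is_list(samples) and samples != [] do
    sample_size = length(samples)
    average = Enum.sum(samples) / sample_size

    sum_of_squared_deviations =
      samples
      |> Enum.map(fn sample -> (sample - average) * (sample - average) end)
      |> Enum.sum()

    # Bessel's correction: divide by sample_size - 1.
    sum_of_squared_deviations / (sample_size - 1)
  end
end

VarianceSketch.variance([4, 9, 11, 12, 17, 5, 8, 12, 12])
# => 16.0
```

In this example the average is 10.0 and the squared deviations sum to 128, which divided by 8 gives 16.0.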

## Examples

    iex> Statistex.variance([4, 9, 11, 12, 17, 5, 8, 12, 12])
    16.0

    iex> Statistex.variance([4, 9, 11, 12, 17, 5, 8, 12, 12], sample_size: 9, average: 10.0)
    16.0

    iex> Statistex.variance([42])
    0.0

    iex> Statistex.variance([1, 1, 1, 1, 1, 1, 1])
    0.0

    iex> Statistex.variance([])
    ** (ArgumentError) Passed an empty list ([]) to calculate statistics from, please pass a list containing at least one number.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
