# Statistical Metrics

See my previous post on random variables for more background. This post covers a few important statistical measures that help characterize the probability distribution of a random variable and the relationships between random variables.

The first is the expected value, or mean, denoted $E[X]$ or $\mu$ for a random variable $X$. For a discrete random variable the expected value is a probability-weighted sum of the possible values of the variable,

$$E[X] = \sum_x x \, P(X = x)$$

The sample mean, $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, is the average of a collected sample: the sum of the values divided by the number of values $n$. The law of large numbers states that the sample mean approaches the expected value as the number of samples grows. Note that the expected value may not be an actual value that the random variable can take on. For example, the expected value of a random variable that represents the roll of a fair six-sided die is

$$E[X] = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$$
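As a quick sketch of these two ideas, the snippet below computes the probability-weighted expected value of a die roll directly from the definition, then simulates rolls to show the sample mean converging toward it (the number of rolls and the seed are arbitrary choices for illustration):

```python
import random

# Expected value of a fair six-sided die: a probability-weighted sum,
# where each face has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
expected = sum(v * (1 / 6) for v in values)  # 3.5

# Sample mean of simulated rolls; by the law of large numbers it
# approaches the expected value as the number of rolls grows.
random.seed(0)
rolls = [random.choice(values) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(expected)     # 3.5 -- a value the die itself can never show
print(sample_mean)  # close to 3.5
```

Re-running with more rolls tightens the gap between `sample_mean` and `expected`, which is exactly what the law of large numbers predicts.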

The expected value function is linear, meaning that for random variables $X$ and $Y$ and constants $a$ and $b$ the following property holds:

$$E[aX + bY] = a\,E[X] + b\,E[Y]$$

Variance is a measure that describes how “spread out” a distribution is. It is denoted $\mathrm{Var}(X)$, or $\sigma^2$, and is defined as the expected value of the squared difference between the random variable and the mean of the distribution,

$$\mathrm{Var}(X) = E[(X - \mu)^2]$$

It can equivalently be defined as the difference between the expected value of the squared random variable and the squared expected value of the variable,

$$\mathrm{Var}(X) = E[X^2] - E[X]^2$$

Given a sample of $n$ values, the sample variance is the scaled sum of the squared differences between the samples and the sample mean,

$$s^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

The square root of the variance is called the standard deviation, $\sigma$, and is commonly used as well. Figure 1 shows the relationship between expected value and variance on a normal distribution.
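The two variance formulas above give the same result, which the sketch below checks for the die-roll distribution; it also computes the sample variance of a small made-up data set using the $n - 1$ denominator (the data values are arbitrary, chosen only for illustration):

```python
import math

# Fair six-sided die: each face has probability 1/6.
values = [1, 2, 3, 4, 5, 6]
p = 1 / len(values)

mean = sum(v * p for v in values)                      # E[X] = 3.5
var_def = sum((v - mean) ** 2 * p for v in values)     # E[(X - mu)^2]
var_alt = sum(v ** 2 * p for v in values) - mean ** 2  # E[X^2] - E[X]^2
# Both equal 35/12, about 2.9167.

std_dev = math.sqrt(var_def)  # standard deviation sigma

# Sample variance of observed data, scaled by n - 1.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
xbar = sum(data) / len(data)
s2 = sum((x - xbar) ** 2 for x in data) / (len(data) - 1)

print(var_def, var_alt)  # equal by the algebraic identity
print(std_dev)
print(s2)
```

The second formula is often more convenient computationally, since $E[X^2]$ and $E[X]$ can be accumulated in a single pass over the data.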