3.14: Where Does the N - 1 Come From?

If we know $$\mu$$ and we have a set of $$N$$ data points, the best estimate we can make of the variance is

$\sigma^2=\int^{u_{max}}_{u_{min}}{\left(u-\mu \right)}^2\left(\frac{df}{du}\right)du \approx \sum^N_{i=1}{\left(u_i-\mu \right)}^2\left(\frac{1}{N}\right)$

We have said that if we must use $$\overline{u}$$ to approximate the mean, the best estimate of $$\sigma^2$$, usually denoted $$s^2$$, is

$\text{estimated } \sigma^2=s^2 =\sum^N_{i=1}{\left(u_i-\overline{u}\right)}^2\left(\frac{1}{N-1}\right)$
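The two estimates can be compared directly in a few lines of Python. This is only a minimal sketch: the function names, the data values, and the assumed true mean `mu_assumed` are hypothetical and chosen for illustration, not taken from the text.

```python
import math
import statistics


def variance_known_mean(values, mu):
    """Mean squared deviation about a known true mean mu (divide by N)."""
    return sum((u - mu) ** 2 for u in values) / len(values)


def variance_sample_mean(values):
    """Estimate s^2 using the sample average u_bar (divide by N - 1)."""
    n = len(values)
    u_bar = sum(values) / n
    return sum((u - u_bar) ** 2 for u in values) / (n - 1)


# Illustrative data and an assumed "true" mean; neither comes from the text.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mu_assumed = 5.5

sigma2_known = variance_known_mean(data, mu_assumed)  # uses mu, divides by N
s2 = variance_sample_mean(data)                       # uses u_bar, divides by N - 1

print(sigma2_known)
print(s2)
# The standard library's sample variance also divides by N - 1.
print(math.isclose(s2, statistics.variance(data)))
```

For these six values the average is 5, the sum of squared deviations from the average is 24, and so $$s^2=24/5=4.8$$, whereas dividing by $$N$$ would give 4.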

The use of $$N-1$$, rather than $$N$$, in the denominator is distinctly non-intuitive; so much so that this equation often causes great irritation. Let us see how this equation comes about.

Suppose that we have a distribution whose mean is $$\mu$$ and variance is $$\sigma^2$$. Suppose that we draw $$N$$ values of the random variable, $$u$$, from the distribution. We want to think about the expected value of $${\left(u-\mu \right)}^2$$. Let us write $$\left(u-\mu \right)$$ as

$\left(u-\mu \right)=\left(u-\overline{u}\right)+\left(\overline{u}-\mu \right).$

Squaring this gives

${\left(u-\mu \right)}^2={\left(u-\overline{u}\right)}^2+{\left(\overline{u}-\mu \right)}^2+2\left(u-\overline{u}\right)\left(\overline{u}-\mu \right).$

From our definition of expected value, we can write:

$\begin{array}{l} \text{Expected value of } {\left(u-\mu \right)}^2 \\ ~~~~ = \text{expected value of } {\left(u-\overline{u}\right)}^2 \\ ~~~~ + \text{expected value of } {\left(\overline{u}-\mu \right)}^2 \\ ~~~~ + \text{expected value of } 2\left(u-\overline{u}\right)\left(\overline{u}-\mu \right) \end{array}$

From our discussion above, we can recognize each of these expected values:

• The expected value of $${\left(u-\mu \right)}^2$$ is the variance of the original distribution, which is $$\sigma^2$$. Since this is a definition, it is exact.
• The best possible estimate of the expected value of $${\left(u-\overline{u}\right)}^2$$ is $\sum^N_{i=1}{{\left(u_i-\overline{u}\right)}^2\left(\frac{1}{N}\right)}$
• The expected value of $${\left(\overline{u}-\mu \right)}^2$$ is the variance of averages of $$N$$ random variables drawn from the original distribution. That is, the expected value of $${\left(\overline{u}-\mu \right)}^2$$ is what we would get if we repeatedly drew $$N$$ values from the original distribution, computed the average of each set of $$N$$ values, and then found the variance of this new distribution of average values. As we saw in our discussion of the central limit theorem, this variance is $${\sigma^2}/{N}$$; this result requires only that the $$N$$ values be drawn independently. Thus, the expected value of $${\left(\overline{u}-\mu \right)}^2$$ is exactly $${\sigma^2}/{N}$$.
• Since $$\left(\overline{u}-\mu \right)$$ has the same value for every $$u_i$$ in a given set of $$N$$ values, we can take it outside the average over that set. The expected value of $$2\left(u-\overline{u}\right)\left(\overline{u}-\mu \right)$$ is then $2\left(\overline{u}-\mu \right)\left[\frac{1}{N}\sum^N_{i=1}{\left(u_i-\overline{u}\right)}\right]$, which is equal to zero, because $\sum^N_{i=1}{\left(u_i-\overline{u}\right)} = \left(\sum^N_{i=1}{u_i}\right)-N\overline{u}=0$ by the definition of $$\overline{u}$$.
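For any single set of $$N$$ values, the decomposition above can be checked numerically. The sketch below averages each of the three terms over one sample. The sample values, the assumed true mean `mu_assumed`, and the function name `decompose` are hypothetical choices made for illustration.

```python
import math


def decompose(values, mu):
    """Average each term of (u - mu)^2 = (u - u_bar)^2 + (u_bar - mu)^2 + cross term."""
    n = len(values)
    u_bar = sum(values) / n
    about_mu = sum((u - mu) ** 2 for u in values) / n        # left-hand side
    about_u_bar = sum((u - u_bar) ** 2 for u in values) / n  # first term
    mean_offset_sq = (u_bar - mu) ** 2                       # second term
    # Cross term: (u_bar - mu) is the same for every u_i, so it factors out.
    cross = 2 * (u_bar - mu) * sum(u - u_bar for u in values) / n
    return about_mu, about_u_bar, mean_offset_sq, cross


# Illustrative sample and assumed true mean; neither comes from the text.
sample = [1.0, 2.0, 4.0, 9.0]
mu_assumed = 3.0

about_mu, about_u_bar, mean_offset_sq, cross = decompose(sample, mu_assumed)
print(about_mu, about_u_bar, mean_offset_sq, cross)
print(math.isclose(about_mu, about_u_bar + mean_offset_sq + cross))
```

Here the average is 4, so the average squared deviation about the assumed mean (10.5) splits into the average squared deviation about the average (9.5) plus the squared offset of the average (1), and the cross term vanishes.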

Substituting, our expression for the expected value of $${\left(u-\mu \right)}^2$$ becomes:

$\sigma^2\approx \sum^N_{i=1} \left(u_i-\overline{u}\right)^2\left(\frac{1}{N}\right)+\frac{\sigma^2}{N}$

so that

$\sigma^2\left(1-\frac{1}{N}\right)=\sigma^2\left(\frac{N-1}{N}\right)\approx \sum^N_{i=1} \frac{\left(u_i-\overline{u}\right)^2}{N}$

and

$\sigma^2 \approx \sum^N_{i=1} \frac{\left(u_i-\overline{u}\right)^2}{N-1}$

That is, as originally stated, when we must use $$\overline{u}$$ rather than the true mean, $$\mu$$, in the sum of squared differences, the best possible estimate of $$\sigma^2$$, usually denoted $$s^2$$, is obtained by dividing by $$N-1$$, rather than by $$N$$.
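A simulation makes the result concrete. The sketch below repeatedly draws $$N$$ values from a normal distribution and averages three quantities over many repetitions: the sum of squared differences divided by $$N$$, the same sum divided by $$N-1$$, and $${\left(\overline{u}-\mu \right)}^2$$. The choice of a normal distribution, the parameter values, the seed, and the function name `simulate` are all assumptions made for illustration.

```python
import random


def simulate(n=5, trials=20000, mu=0.0, sigma=2.0, seed=12345):
    """Average the N and N - 1 variance estimates over many samples of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    total_mean_offset_sq = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        u_bar = sum(sample) / n
        ss = sum((u - u_bar) ** 2 for u in sample)  # squared differences from u_bar
        total_div_n += ss / n
        total_div_n_minus_1 += ss / (n - 1)
        total_mean_offset_sq += (u_bar - mu) ** 2
    return (total_div_n / trials,
            total_div_n_minus_1 / trials,
            total_mean_offset_sq / trials)


div_n, div_n_minus_1, mean_offset_sq = simulate()
print("divide by N:      ", div_n)           # expected near sigma^2 (N - 1)/N = 3.2
print("divide by N - 1:  ", div_n_minus_1)   # expected near sigma^2 = 4.0
print("(u_bar - mu)^2:   ", mean_offset_sq)  # expected near sigma^2 / N = 0.8
```

With $$\sigma^2=4$$ and $$N=5$$, dividing by $$N$$ should average close to $$\sigma^2\left(N-1\right)/N=3.2$$, while dividing by $$N-1$$ should average close to 4. The individual numbers fluctuate from run to run if the seed is changed.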