Variance of sample variance?

Here's a general derivation that does not assume normality.

Let's rewrite the sample variance $S^2$ as an average over all pairs of indices: $$S^2={1\over{n\choose 2}}\sum_{\{i,j\}} {1\over2}(X_i-X_j)^2.$$ Since $\mathbb{E}[(X_i-X_j)^2/2]=\sigma^2$, we see that $S^2$ is an unbiased estimator for $\sigma^2$.
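
To see why the pairwise form agrees with the usual definition $S^2={1\over n-1}\sum_i (X_i-\bar X)^2$, here is the short expansion (with $\bar X$ denoting the sample mean, a symbol not used elsewhere in this answer): $$\sum_{i<j}(X_i-X_j)^2={1\over2}\sum_{i,j}(X_i-X_j)^2=n\sum_i X_i^2-\Big(\sum_i X_i\Big)^2=n\sum_i (X_i-\bar X)^2=n(n-1)\,S^2.$$ Multiplying both sides by ${1\over2}\big/{n\choose 2}={1\over n(n-1)}$ recovers $S^2$.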

The variance of $S^2$ is the expected value of $$\left({1\over{n\choose 2}}\sum_{\{i,j\}} \left[{1\over2}(X_i-X_j)^2-\sigma^2\right]\right)^2.$$

When you expand the outer square, there are 3 types of cross product terms $$\left[{1\over2}(X_i-X_j)^2-\sigma^2\right] \left[{1\over2}(X_k-X_\ell)^2-\sigma^2\right]$$ depending on the size of the intersection $\{i,j\}\cap\{k,\ell\}$.

  1. When this intersection is empty, the factors are independent and the expected cross product is zero.

  2. There are $n(n-1)(n-2)$ terms where $|\{i,j\}\cap\{k,\ell\}|=1$ and each has an expected cross product of $(\mu_4-\sigma^4)/4$.

  3. There are ${n\choose 2}$ terms where $|\{i,j\}\cap\{k,\ell\}|=2$ and each has an expected cross product of $(\mu_4+\sigma^4)/2$.

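Putting it all together, and dividing by the ${n\choose 2}^2$ that comes from the outer square, shows that $$\mbox{Var}(S^2)={1\over{n\choose 2}^2}\left[n(n-1)(n-2)\,{\mu_4-\sigma^4\over 4}+{n\choose 2}\,{\mu_4+\sigma^4\over 2}\right]={\mu_4\over n}-{\sigma^4\,(n-3)\over n\,(n-1)}.$$ Here $\mu_4=\mathbb{E}[(X-\mu)^4]$ is the fourth central moment of $X$.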
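
For anyone who wants to check the formula on a non-normal example, here is a minimal sketch in Python. It is only an illustration, not part of the derivation: it enumerates every possible sample of size $n$ from a small discrete distribution using exact rational arithmetic and compares the result with $\mu_4/n-\sigma^4(n-3)/(n(n-1))$. The support, the probabilities, the choice $n=4$, and the function names are all arbitrary assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def sample_variance(xs):
    """Unbiased sample variance (divisor n - 1), computed exactly."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def enumerate_s2(values, probs, n):
    """Return (E[S^2], Var(S^2)) by summing over all len(values)**n samples."""
    e_s2 = Fraction(0)
    e_s4 = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]  # i.i.d. draws, so the probabilities multiply
        s2 = sample_variance([values[i] for i in idx])
        e_s2 += p * s2
        e_s4 += p * s2 ** 2
    return e_s2, e_s4 - e_s2 ** 2


def formula_s2(values, probs, n):
    """Return (sigma^2, mu_4/n - sigma^4 (n-3) / (n (n-1)))."""
    mu = sum(p * v for v, p in zip(values, probs))
    sigma2 = sum(p * (v - mu) ** 2 for v, p in zip(values, probs))
    mu4 = sum(p * (v - mu) ** 4 for v, p in zip(values, probs))
    return sigma2, mu4 / n - sigma2 ** 2 * (n - 3) / (n * (n - 1))


# Arbitrary skewed example: P(X=0)=1/2, P(X=1)=1/3, P(X=3)=1/6, with n=4.
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = 4

mean_s2, var_s2 = enumerate_s2(values, probs, n)
sigma2, predicted = formula_s2(values, probs, n)

print(mean_s2 == sigma2)    # checks that S^2 is unbiased
print(var_s2 == predicted)  # checks the variance formula
```

Because everything is done with `Fraction`, the comparison is an exact equality rather than a floating-point approximation. The enumeration grows like `len(values)**n`, so this only works for small supports and small $n$.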


Maybe this will help. Suppose the samples are taken from a normal distribution. Then, using the fact that $\frac{(n-1)S^2}{\sigma^2}$ is a chi-squared random variable with $(n-1)$ degrees of freedom, we get $$\begin{align*} \text{Var}~\frac{(n-1)S^2}{\sigma^2} & = \text{Var}~\chi^{2}_{n-1} \\ \frac{(n-1)^2}{\sigma^4}\text{Var}~S^2 & = 2(n-1) \\ \text{Var}~S^2 & = \frac{2(n-1)\sigma^4}{(n-1)^2}\\ & = \frac{2\sigma^4}{(n-1)}, \end{align*}$$

where we have used the fact that $\text{Var}~\chi^{2}_{n-1}=2(n-1)$.
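
As a consistency check, this agrees with the distribution-free formula $\text{Var}~S^2=\frac{\mu_4}{n}-\frac{\sigma^4(n-3)}{n(n-1)}$ from the general derivation. For a normal distribution the fourth central moment is $\mu_4=3\sigma^4$, so $$\begin{align*} \text{Var}~S^2 & = \frac{3\sigma^4}{n}-\frac{\sigma^4(n-3)}{n(n-1)} \\ & = \frac{\sigma^4\left[3(n-1)-(n-3)\right]}{n(n-1)} \\ & = \frac{2n\,\sigma^4}{n(n-1)} = \frac{2\sigma^4}{(n-1)}. \end{align*}$$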

Hope this helps.


There can be some confusion in defining the sample variance: 1/n vs 1/(n-1). The OP here is, I take it, using the sample variance with 1/(n-1), namely the unbiased estimator of the population variance, otherwise known as the second h-statistic:

h2 = HStatistic[2][[2]] 

These sorts of problems can now be solved by computer. Here is the solution using the mathStatica add-on to Mathematica. In particular, we seek Var[h2], where the variance is just the 2nd central moment, and express the answer in terms of central moments of the population:

CentralMomentToCentral[2, h2] 

[image: mathStatica output]

We could just as easily find, say, the 4th central moment of the sample variance, as:

CentralMomentToCentral[4, h2]

[image: mathStatica output]
