Solved – How to calculate the standard error of the variance of a data set

standard error, variance

Given a data set of N points with sample mean $\overline{x} \pm \Delta\overline{x}$ (here $\Delta\overline{x}$ is the standard error of the mean given by $s/\sqrt{N}$) and sample variance $s^2$, I am required to test the hypothesis that the data set is approximated by a Poisson distribution.

I consider the ratio $\frac{s^2}{\overline{x}}$. If the data set were distributed according to a Poisson distribution, we would expect that $\frac{s^2}{\overline{x}}$ is close to 1. Now, in general $\frac{s^2}{\overline{x}}$ is not going to be exactly 1, since my sample size is finite.

What I would like to do then, is to find the "standard error" associated with $s^2$ so that I may find the error associated with $\frac{s^2}{\overline{x}}$ via propagation of uncertainties:

$u(s^2/\overline{x}) = \sqrt{\left(\frac{1}{\overline{x}^2}\right)(\Delta s^2)^2 + \left(\frac{s^4}{\overline{x}^4}\right) \left(\frac{s}{\sqrt{N}}\right)^2} $

How would I do this? What would my $\Delta s^2$ need to be?

I have found a single simple paper on the topic, but the formula presented therein, $\Delta s^2 = s^2\sqrt{\frac{2}{N-1}}$, strikes me as overly simplistic:
https://web.eecs.umich.edu/~fessler/papers/files/tr/stderr.pdf
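
For concreteness, the calculation described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name and the example counts are made up, $\Delta s^2$ uses the formula quoted above (which holds for normally distributed data), and the propagation ignores any covariance between $\overline{x}$ and $s^2$.

```python
import math
import statistics


def dispersion_ratio_with_error(data):
    """Return (s^2 / mean, propagated uncertainty) for a list of numbers.

    Hypothetical helper for illustration only.
    """
    n = len(data)
    mean = statistics.mean(data)
    var = statistics.variance(data)          # sample variance s^2 (N - 1 denominator)
    d_mean = math.sqrt(var / n)              # standard error of the mean, s / sqrt(N)
    d_var = var * math.sqrt(2.0 / (n - 1))   # quoted formula; assumes near-normal data
    ratio = var / mean
    # First-order propagation of uncertainties for s^2 / mean,
    # ignoring any covariance between the mean and s^2.
    u = math.sqrt((d_var / mean) ** 2 + (var / mean ** 2) ** 2 * d_mean ** 2)
    return ratio, u


counts = [3, 5, 4, 6, 2, 4, 5, 3, 7, 4]      # made-up example counts
ratio, u = dispersion_ratio_with_error(counts)
print(f"s^2/mean = {ratio:.3f} +/- {u:.3f}")
```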

Best Answer

"A implies B" does not mean that "B implies A". For a Poisson distribution the mean equals the variance, but mean = variance does not imply that the distribution is Poisson. For example, the normal distribution $N(\mu, \mu)$ with $\mu > 0$ has its mean equal to its variance, yet it is a normal distribution, not a Poisson one. So your strategy has a problem: even if you find a way to test that mean = variance, you still cannot conclude that the data come from a Poisson distribution.
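
A quick simulation can illustrate this point. The sketch below uses only the Python standard library; the sampler, the seed, and the values of `mu` and `n` are arbitrary choices made for this example. It draws one sample from Poisson($\mu$) and one from $N(\mu, \mu)$ and computes $s^2/\overline{x}$ for each, which should come out close to 1 in both cases.

```python
import math
import random
import statistics


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for small lam; exp(-lam) underflows for very large lam.
    """
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def variance_to_mean(data):
    """Sample variance divided by sample mean."""
    return statistics.variance(data) / statistics.mean(data)


rng = random.Random(0)        # fixed seed so the run is repeatable
mu, n = 10.0, 20000           # arbitrary illustrative values

poisson_data = [poisson_sample(rng, mu) for _ in range(n)]
# N(mu, mu): the second parameter is the variance, so the standard deviation is sqrt(mu).
normal_data = [rng.gauss(mu, math.sqrt(mu)) for _ in range(n)]

ratio_poisson = variance_to_mean(poisson_data)
ratio_normal = variance_to_mean(normal_data)
print(f"Poisson: {ratio_poisson:.3f}, normal: {ratio_normal:.3f}")
```

Both ratios land near 1, so the ratio alone cannot distinguish the two distributions.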

So my suggestion is to use the Kolmogorov-Smirnov goodness-of-fit test, which you can find in a textbook or online. This method is not very powerful, though, which means a large sample is needed. If your sample size is below 50, then even when the data come from a distribution far from Poisson, the chance of rejecting the null hypothesis that the data come from a Poisson distribution is very low.
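
A rough sketch of such a test in plain Python is given below. Two things in it are assumptions of this example rather than part of the suggestion above: the rate $\lambda$ is estimated from the sample mean, and the p-value is obtained by simulation (a parametric bootstrap) instead of from standard Kolmogorov-Smirnov tables, since those tables assume a continuous, fully specified distribution. All function names and the example counts are made up.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate (Knuth's method; fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def ks_statistic(data, lam):
    """Largest gap between the empirical CDF and the Poisson(lam) CDF.

    Expects non-negative integers. Both CDFs are step functions that
    jump only at integers, so checking integers up to max(data) suffices.
    """
    n, k_max = len(data), max(data)
    counts = [0] * (k_max + 1)
    for x in data:
        counts[x] += 1
    pmf = math.exp(-lam)          # P(X = 0)
    cdf, seen, d = 0.0, 0, 0.0
    for k in range(k_max + 1):
        cdf += pmf                # P(X <= k)
        seen += counts[k]         # number of observations <= k
        d = max(d, abs(seen / n - cdf))
        pmf *= lam / (k + 1)      # P(X = k + 1)
    return d


def ks_poisson_pvalue(data, n_sim=500, seed=0):
    """Return (D, simulated p-value) for H0: data are Poisson distributed."""
    rng = random.Random(seed)
    n = len(data)
    lam_hat = sum(data) / n
    d_obs = ks_statistic(data, lam_hat)
    exceed = 0
    for _ in range(n_sim):
        sim = [poisson_sample(rng, lam_hat) for _ in range(n)]
        # Re-estimate lambda for each simulated sample, as for the real data.
        if ks_statistic(sim, sum(sim) / n) >= d_obs:
            exceed += 1
    return d_obs, (exceed + 1) / (n_sim + 1)


counts = [3, 5, 4, 6, 2, 4, 5, 3, 7, 4, 1, 6, 5, 2, 4]   # made-up counts
d_obs, p_value = ks_poisson_pvalue(counts)
print(f"D = {d_obs:.3f}, simulated p-value = {p_value:.3f}")
```

A small simulated p-value would count as evidence against the Poisson hypothesis; with a sample this small, a large p-value says very little either way.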
