Standard Error – Understanding Error Bars on Error Bars


Inspired by my recent attendance at an environmental toxicology conference, I have the following question about error bars:

Let's say that I'm drawing samples from some unknown distribution with finite mean and variance. I want to present the sample mean and add some error bars. Since I don't know much about the underlying distribution, I just add error bars showing +/- the standard deviation of the samples.

My question is, is there any way I could meaningfully indicate how certain I am of those error bars? Adding error bars to the error bars, so to speak.

As an example, I have drawn 5 samples from some distribution, and I have repeated this 5 times. The sample means, with error bars of +/- the sample standard deviations, are shown below.

[Figure: sample means with error bars of +/- one sample standard deviation, for five groups of 5 samples each]

We can see that by chance, these sample means and error bars look quite different, and not really mutually compatible. Of course 5 samples isn't very much, but if my samples are obtained via some convoluted experimental procedure (capturing a wild animal and taking a blood sample, for example), it might not be an easy option to get more samples.

Update:

Just to add some notes on how I was thinking:

Coming from a computational physics background myself, I'm used to Monte Carlo methods, and the $1/\sqrt{N}$-error which follows from the central limit theorem. So just like the error in the sample mean has an expected distribution, I thought perhaps it would make sense to ask about the expected error in the sample standard deviation. Of course, the problem is that the distribution of the error in the sample mean is expressed in terms of the (unknown) variance of the underlying distribution, and hence I am left taking the standard deviation of the sample, or something along those lines.

But still, I thought there ought to be some way of indicating that my sample standard deviation is itself quite uncertain, due to the small $N$. But perhaps the only way is simply to list $N$, and be explicit about what the error bars show.
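
For reference, a rough worked version of the "expected error in the sample standard deviation" idea. This is only a sketch, and it assumes the samples are normally distributed, which the question does not assume. Here $s$ is the sample standard deviation and $\sigma$ the unknown population standard deviation. Under that assumption $(N-1)s^2/\sigma^2$ follows a chi-square distribution with $N-1$ degrees of freedom, and a first-order (delta method) approximation gives

$$
\operatorname{Var}(s^2) = \frac{2\sigma^4}{N-1}
\quad\Longrightarrow\quad
\operatorname{SE}(s) \approx \frac{\sigma}{\sqrt{2(N-1)}},
\qquad
\frac{\operatorname{SE}(s)}{\sigma} \approx \frac{1}{\sqrt{8}} \approx 0.35 \ \text{ for } N = 5 .
$$

So with $N = 5$ the error bar length itself is uncertain by roughly a third. The approximation is crude at such small $N$, and for non-normal data the spread of $s$ can be noticeably larger.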

Best Answer

You are interested in standard errors, which describe the variability in a parameter estimate, and are related to your sampling approach. This is distinct from the parameters themselves (e.g. mean and standard deviation), which are functions of the underlying population only, and are not dependent on how large your sample is.

Your current plot shows two values per group, the sample mean and the sample standard deviation, about which there is no uncertainty (they are whatever you observe them to be). Assuming appropriate random sampling, you can use these values to make inferences about the unobservable population mean and population standard deviation for each group. You can use common tools such as the standard error or a 95% confidence interval to quantify the precision of your parameter estimates.
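
As a minimal sketch of what that could look like for one group of 5 samples: the values in `sample` are hypothetical (not the data from the question), and both intervals assume the underlying population is roughly normal, which is a strong assumption the question does not make. The critical values are the standard table values for 4 degrees of freedom, so they only apply when $N = 5$.

```python
import math
import statistics

# Hypothetical measurements for one group (N = 5); not data from the question.
sample = [4.1, 5.3, 3.8, 6.0, 4.9]

n = len(sample)
df = n - 1                          # degrees of freedom
mean = statistics.mean(sample)
sd = statistics.stdev(sample)       # sample SD (n - 1 in the denominator)
se_mean = sd / math.sqrt(n)         # standard error of the mean

# Table values for df = 4; they must be replaced for any other sample size.
T_975 = 2.776        # 97.5th percentile of Student's t
CHI2_025 = 0.4844    # 2.5th percentile of chi-square
CHI2_975 = 11.1433   # 97.5th percentile of chi-square

# 95% CI for the population mean (t interval, assumes normality).
mean_ci = (mean - T_975 * se_mean, mean + T_975 * se_mean)

# 95% CI for the population SD (chi-square interval, assumes normality).
sd_ci = (sd * math.sqrt(df / CHI2_975), sd * math.sqrt(df / CHI2_025))

print(f"mean = {mean:.2f}, 95% CI ({mean_ci[0]:.2f}, {mean_ci[1]:.2f})")
print(f"SD   = {sd:.2f}, 95% CI ({sd_ci[0]:.2f}, {sd_ci[1]:.2f})")
```

With $N = 5$ the interval for the standard deviation runs from about 0.6 to about 2.9 times the observed value, which is one way to put a number on how uncertain the error bars are.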

It would be odd to try to represent this as error bars on error bars, but it would be perfectly reasonable to list the mean and standard deviation for each group, along with the 95% CI of each parameter estimate. This can help you to decide if the means/standard deviations observed in Groups C and D, for example, represent true differences in the underlying population parameters, or if the apparent differences represent normal variation that would be expected with a small sample size.
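
To get a feel for how much variation is "expected with a small sample size", one option is a quick simulation. The sketch below is an illustration under stated assumptions, not a reconstruction of the question's data: it draws many groups of 5 from a single normal distribution with true standard deviation 1 (the function name `simulate_sample_sds` and all parameter values are made up for this example) and looks at how widely the sample standard deviation scatters.

```python
import random
import statistics

def simulate_sample_sds(n=5, reps=10000, mu=0.0, sigma=1.0, seed=0):
    """Return the sorted sample SDs of `reps` groups of `n` normal draws."""
    rng = random.Random(seed)       # fixed seed so the run is repeatable
    sds = []
    for _ in range(reps):
        draws = [rng.gauss(mu, sigma) for _ in range(n)]
        sds.append(statistics.stdev(draws))
    return sorted(sds)

sds = simulate_sample_sds()

# Empirical 2.5th and 97.5th percentiles of the sample SD.
lo = sds[int(0.025 * len(sds))]
hi = sds[int(0.975 * len(sds))]
print(f"95% of sample SDs fall between {lo:.2f} and {hi:.2f} (true SD = 1)")
```

For normal data, chi-square theory puts these percentiles at roughly 0.35 and 1.67 times the true standard deviation when $N = 5$, so two groups from the same population can easily show error bars that differ by a factor of three or more.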
