Solved – Meaning of “Overdispersion” in Statistics

Tags: distributions, normal distribution, probability, variance

I am trying to understand what "overdispersion" means in statistics.

Based on the Wikipedia page, "overdispersion" is defined as follows: "In statistics, overdispersion is the presence of greater variability (statistical dispersion) in a data set than would be expected based on a given statistical model."

However, I have heard other interpretations of "overdispersion" which suggest that "overdispersion refers to situations where the variance within the data is a function of the mean" – in other words, there is a non-constant relationship between the mean and variance within the data.

My question: can someone please explain how to define and measure "overdispersion" mathematically? For instance, I have heard that the Normal distribution and the Poisson distribution can both be described as "dispersion models". I have also heard that many models can be considered "dispersion models" so long as a "dispersion parameter" can be inserted into the model. Using these definitions, is the Normal distribution an example of "overdispersion"? For example, there is a lot more variation in a Normal distribution around the peak, and relatively less variation around the tails. Is all this correct?

Thanks!

Best Answer

In a Poisson$(\lambda)$ distribution, the mean $\mu$ and the variance $\sigma^2$ are both equal to $\lambda$:

$$ \mu = \lambda, \qquad \sigma^2 = \lambda \quad\implies\quad \mu = \sigma^2 $$

Consequently, when we believe the data come from a Poisson distribution, we expect the sample mean $\bar x$ and the sample variance $s^2$ to satisfy $\bar x \approx s^2$, since $\mu=\sigma^2$ in the suspected distribution.
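As a rough illustration of that expectation, here is a minimal simulation sketch using only the Python standard library. The rate `lam = 4.0`, the sample size, the seed, and the helper `poisson_draw` are illustrative assumptions, not part of the original answer. The helper uses Knuth's multiplication method, which is one simple way to draw Poisson variates for small rates.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    # Keep multiplying uniforms until the product drops below exp(-lam).
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(0)   # fixed seed so the run is repeatable
lam = 4.0                # hypothetical rate chosen for illustration
sample = [poisson_draw(lam, rng) for _ in range(20_000)]

x_bar = statistics.fmean(sample)    # sample mean
s2 = statistics.variance(sample)    # unbiased sample variance
print(f"mean = {x_bar:.3f}, variance = {s2:.3f}, ratio = {s2 / x_bar:.3f}")
```

For genuinely Poisson data the printed ratio $s^2/\bar x$ should land close to 1, up to sampling noise.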

If we see a gross violation, where $s^2 \gg \bar x$, then we would not find it believable that $\mu=\sigma^2$, and we describe the data as overdispersed relative to the Poisson model. That is, the dispersion is higher than the model led us to expect.
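To make the contrast concrete, the sketch below generates counts whose variance exceeds their mean and computes the variance-to-mean ratio (often called the index of dispersion). This is an illustration under stated assumptions, not the answerer's method. The counts come from a gamma-Poisson mixture, and the shape and scale values, the seed, and the helper `poisson_draw` are all hypothetical choices. With shape 2 and scale 2, the theoretical mean is 4 and the theoretical variance is $4 + 4^2/2 = 12$.

```python
import math
import random
import statistics

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(1)       # fixed seed so the run is repeatable
shape, scale = 2.0, 2.0      # hypothetical gamma parameters (mean rate = 4)

# Each observation gets its own random rate, which inflates the variance
# beyond what a single fixed-rate Poisson model would allow.
counts = [poisson_draw(rng.gammavariate(shape, scale), rng)
          for _ in range(20_000)]

x_bar = statistics.fmean(counts)
s2 = statistics.variance(counts)
dispersion_index = s2 / x_bar    # about 1 under Poisson, well above 1 here
print(f"mean = {x_bar:.3f}, variance = {s2:.3f}, index = {dispersion_index:.3f}")
```

An index well above 1 is the sample-level signature of overdispersion with respect to a Poisson model. How far above 1 counts as a "gross" violation depends on the sample size and is a judgment call or a matter for a formal test.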
