Solved – How to determine whether data is slightly or extremely non-normally distributed

distributions, normal distribution, normality-assumption

I'm a PhD student doing research on regression analysis.

My question is: how can I determine whether data are slightly, moderately, or extremely non-normally distributed?


Thanks for all the responses to my question, but maybe it was not clear enough.
OK, let's say I have some values of skewness and kurtosis (for example, skewness = 1.5 and kurtosis = 2.0). From those values, how should the distribution of the data be described? Is it moderately non-normal, extremely non-normal, or something else?

Best Answer

The sample skewness $$\gamma=\frac{\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^3}{\Big(\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2\Big)^{3/2}}$$ and the sample (excess) kurtosis $$\kappa=\frac{\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^4}{\Big(\frac{1}{n}\sum_{i=1}^n(x_i-\bar{x})^2\Big)^{2}}-3$$ are often used as measures of non-normality.
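As a minimal sketch, the two quantities can be computed directly from the central moments. It assumes the usual moment-based definitions, in which each sum is divided by $n$; the function names `sample_skewness` and `sample_excess_kurtosis` and the small data set are made up for illustration.

```python
def sample_skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (central moments with divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5


def sample_excess_kurtosis(xs):
    """Moment-based sample excess kurtosis: m4 / m2**2 - 3."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m4 / m2 ** 2 - 3.0


# Hypothetical data with one large value on the right, so the skewness is positive.
data = [1.0, 2.0, 2.0, 3.0, 10.0]
print(sample_skewness(data), sample_excess_kurtosis(data))
```

Statistical packages often apply small-sample bias corrections to these estimators, so their output may differ slightly from this plain version.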

The sample skewness measures the asymmetry of the empirical distribution. If it is far from $0$, the distribution is not very symmetric. Since the normal distribution is symmetric, the sample skewness of a sample from the normal distribution should be close to $0$.

The sample kurtosis measures the "peakedness" of the distribution. If it is much greater than $0$, then the distribution is more peaked than the normal distribution, which typically means that it has heavier tails. If it is less than $0$, it is less peaked, which typically means that the distribution is bimodal. The sample kurtosis is bounded from below by $-2$ (a value that is attained for a two-point distribution with equal mass on both points, which of course is extremely bimodal).
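The two extremes can be checked numerically. The sketch below is only an illustration with made-up samples, again assuming the moment-based estimator with divisor $n$; the helper `excess_kurtosis` is a hypothetical name.

```python
def excess_kurtosis(xs):
    """Moment-based sample excess kurtosis: m4 / m2**2 - 3 (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


# Half the points at -1 and half at +1: the most bimodal sample possible.
two_point = [-1.0] * 50 + [1.0] * 50
print(excess_kurtosis(two_point))  # -2.0, the lower bound

# Most points at 0 plus two far-out values: a crude heavy-tailed sample.
heavy_tailed = [0.0] * 98 + [-10.0, 10.0]
print(excess_kurtosis(heavy_tailed))  # 47.0, far above 0
```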

Here are two examples (normal distribution in grey, other distributions in red):

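[Figure: densities of a skewed distribution and a kurtotic distribution (red), each plotted against the normal distribution (grey)]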

The skewed distribution has theoretical skewness $1.6$, whereas the kurtotic distribution has theoretical (excess) kurtosis $1.5$. As you can see, the kurtotic distribution has heavier tails than the normal distribution.

So, why use skewness and kurtosis as quantifications of non-normality? The main reason is that they affect how quickly the approximation in the central limit theorem becomes accurate. As you may know, the theorem is often used to justify a normality-based statistical procedure even if the data do not come from a normal distribution, provided that you have a "large enough" sample. If either the skewness or the kurtosis is high, larger sample sizes are needed for such justifications to be valid.
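A small simulation can make this concrete. For independent observations with skewness $\gamma$, the mean of $n$ of them has skewness $\gamma/\sqrt{n}$, so a more skewed population needs a larger $n$ before the sample mean looks symmetric. The sketch below assumes Exponential(1) data (skewness $2$) purely as an example; the function names, the seed, and the sample sizes are arbitrary choices for illustration.

```python
import math
import random


def skewness(xs):
    """Moment-based sample skewness: m3 / m2**1.5 (divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


def skewness_of_sample_means(n, reps, rng):
    """Simulate `reps` means of n Exponential(1) draws and return their sample skewness."""
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]
    return skewness(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
results = {n: skewness_of_sample_means(n, reps=10_000, rng=rng) for n in (4, 16, 64)}

for n, s in results.items():
    # Exponential(1) has skewness 2, so the mean of n draws has skewness 2 / sqrt(n).
    print(f"n={n:3d}  simulated skewness={s:.3f}  theory={2 / math.sqrt(n):.3f}")
```

The simulated values should shrink roughly like $1/\sqrt{n}$, which is why quadrupling the sample size only halves the remaining asymmetry of the sample mean.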

For some inferential procedures you need to worry more about skewness, and for some you need to worry about heavy tails (kurtosis). I've written more about that elsewhere on this site.