Solved – Range of values of skewness and kurtosis for normal distribution

kurtosis, normal distribution, skewness

I want to know the range of skewness and kurtosis values for which data are considered to be normally distributed.

I have read many arguments and mostly got mixed answers. Some say that $(-1,1)$ for skewness and $(-2,2)$ for kurtosis are acceptable ranges for data to be considered normally distributed. Others say that $(-1.96,1.96)$ is an acceptable range for skewness. I found a detailed discussion of this issue here: "What is the acceptable range of skewness and kurtosis for normal distribution of data". But I couldn't find any decisive statement.

What is the basis for deciding such an interval? Is this a subjective choice? Or is there any mathematical explanation behind these intervals?

Best Answer

The original post misses a couple of major points: (1) No "data" can ever be normally distributed. Data are necessarily discrete. The valid question is, "is the process that produced the data a normally distributed process?" But (2) the answer to that question is always "no", regardless of what any statistical test or other assessment based on data gives you. Normally distributed processes produce data with infinite continuity, perfect symmetry, and precisely specified probabilities within standard deviation ranges (e.g., 68-95-99.7), none of which are ever precisely true for the processes that give rise to the data we can measure with whatever measurement devices we humans can use.

So you can never consider data to be normally distributed, and you can never consider the process that produced the data to be a precisely normally distributed process. But, as Glen_b indicated, it might not matter too much, depending on what it is that you are trying to do with the data.

Skewness and kurtosis statistics can help you assess certain kinds of deviations from normality of your data-generating process. They are highly variable statistics, though. The standard errors given above are not useful because they are only valid under normality, which means they are only useful as a test for normality, an essentially useless exercise. It would be better to use the bootstrap to find se's, although large samples would be needed to get accurate se's.
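As a rough illustration of the bootstrap suggestion, here is a minimal sketch using NumPy only. The helper names (`skewness`, `kurtosis`, `bootstrap_se`), the exponential test sample, the sample size, and the number of resamples are all assumptions made for the example, not anything prescribed above. The normal-theory values $\sqrt{6/n}$ and $\sqrt{24/n}$ are the usual large-sample approximations and are printed only for contrast.

```python
import numpy as np

def skewness(x):
    """Sample skewness: the average of the cubed Z values."""
    z = (x - x.mean()) / x.std()
    return np.mean(z**3)

def kurtosis(x):
    """Sample kurtosis (not excess): the average of Z^4, about 3 for normal data."""
    z = (x - x.mean()) / x.std()
    return np.mean(z**4)

def bootstrap_se(x, stat, n_boot=2000, seed=0):
    """Standard deviation of `stat` over resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        reps[b] = stat(x[rng.integers(0, n, size=n)])
    return reps.std(ddof=1)

# Illustrative, clearly non-normal data (an assumption for this sketch).
rng = np.random.default_rng(42)
sample = rng.exponential(size=500)
n = len(sample)

se_skew = bootstrap_se(sample, skewness)
se_kurt = bootstrap_se(sample, kurtosis)

print(f"skewness = {skewness(sample):.3f}, bootstrap SE = {se_skew:.3f}, "
      f"normal-theory SE ~ {np.sqrt(6 / n):.3f}")
print(f"kurtosis = {kurtosis(sample):.3f}, bootstrap SE = {se_kurt:.3f}, "
      f"normal-theory SE ~ {np.sqrt(24 / n):.3f}")
```

Because the bootstrap standard errors are computed from the data rather than from the normality assumption, they can differ substantially from the normal-theory values when the process is skewed or heavy-tailed. They are themselves noisy unless the sample is large.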

Also, kurtosis is very easy to interpret, contrary to the above post. It is the average (or expected value) of the $Z$ values, each taken to the fourth power. Large $|Z|$ values are outliers and contribute heavily to kurtosis. Small $|Z|$ values, where the "peak" of the distribution is, give $Z^4$ values that are tiny and contribute essentially nothing to kurtosis. I proved in my article https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4321753/ that kurtosis is very well approximated by the average of the $Z^4 \cdot I(|Z|>1)$ values. Hence kurtosis measures the propensity of the data-generating process to produce outliers.
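The decomposition can be checked numerically. The sketch below splits the average of $Z^4$ into the part from $|Z|>1$ and the part from $|Z|\le 1$. The function name `kurtosis_parts` and the three simulated samples are assumptions chosen for illustration. Since $Z^4 \le 1$ whenever $|Z| \le 1$, the center part can never exceed 1, so the tail part is always within 1 of the full kurtosis.

```python
import numpy as np

def kurtosis_parts(x):
    """Return (kurtosis, tail part), where the tail part is mean(Z^4 * I(|Z| > 1))."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    z4 = z**4
    total = z4.mean()                      # average of Z^4
    tail = (z4 * (np.abs(z) > 1)).mean()   # contribution from |Z| > 1 only
    return total, tail

# Simulated samples, purely illustrative.
rng = np.random.default_rng(0)
samples = {
    "normal": rng.normal(size=10_000),
    "t(5)": rng.standard_t(df=5, size=10_000),
    "uniform": rng.uniform(size=10_000),
}

for name, x in samples.items():
    total, tail = kurtosis_parts(x)
    print(f"{name:8s} kurtosis = {total:.3f}  "
          f"tail part = {tail:.3f}  center part = {total - tail:.3f}")
```

In runs like this, differences in kurtosis between the samples come almost entirely from the tail part, which is the sense in which kurtosis reflects outliers rather than the shape of the peak.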