Solved – How to interpret two standard deviations below the mean of a count variable being less than zero

count-data, standard deviation

I asked people how many times they visited their local pub in a 'normal' week.

The answer can be zero, one, two, three, four, or five or more.

  • The mean is 2 and the standard deviation is 1.3.
  • So two standard deviations above the mean is 4.6.
  • However, two standard deviations below the mean is -0.6.

Is this negative figure an error? How do I interpret it?

Best Answer

The short answer is no, it is not an error.

As @whuber notes, there is nothing surprising (at least to a statistician) about the fact that two standard deviations below the mean of a count variable could be a negative value. Thus, to answer your question, perhaps it would be more useful to ponder why you might find the result surprising.

Why you might be surprised

  • Many introductory statistics textbooks show how you can use the mean, the standard deviation, and the normal distribution to make claims such as "approximately 2.5% of the sample is expected to score more than two standard deviations below the mean". You may have generalised this idea to a variable for which the assumptions of such a procedure are invalid.
  • If you did this, you would be saying to yourself: "This is strange. How is it possible for 2.5% of the data to have counts below -0.6?"
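
To make the source of the surprise concrete, here is a minimal sketch of what the textbook normal-theory calculation would claim if it were (wrongly) applied to this variable. The mean and standard deviation are the ones from the question; treating the counts as normal is exactly the invalid assumption being illustrated, not a recommended method.

```python
from statistics import NormalDist

# Summary statistics reported in the question.
mean, sd = 2.0, 1.3

# ASSUMPTION (deliberately wrong for counts): visits are normally distributed.
normal = NormalDist(mu=mean, sigma=sd)

# Two standard deviations below the mean.
cutoff = mean - 2 * sd

# Share of people the normal model places below the cutoff, and below zero.
p_below_cutoff = normal.cdf(cutoff)
p_below_zero = normal.cdf(0.0)

print(f"cutoff = {cutoff:.1f}")
print(f"normal model, P(X < cutoff) = {p_below_cutoff:.3f}")
print(f"normal model, P(X < 0)      = {p_below_zero:.3f}")
```

The normal model assigns a small but non-zero share of people to a negative number of pub visits, which cannot happen. The negative cutoff itself is fine as arithmetic; it is the normal-based interpretation that breaks down.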

Estimating percentiles for counts

  • Your variable is not normally distributed; it is a count variable. It is discrete and can take only non-negative integer values. Thus, to estimate the percentage of people at or above (or below) a given value, you need an approach suited to counts. A basic approach is to estimate such percentages directly from the sample data. More sophisticated approaches could involve developing a model of the distribution that is suited to counts, justified by the data and by knowledge of the phenomenon, and estimated from the sample data.
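
Both approaches can be sketched in a few lines. The `visits` list below is invented purely for illustration (it happens to have a mean of 2 and a standard deviation of about 1.3, but it is not the asker's data), and the helper names are hypothetical. The Poisson distribution is used only as one example of a count model; whether it suits real data, especially with a top category of "five or more", would need checking.

```python
from collections import Counter
from math import exp, factorial
from statistics import mean, stdev

# HYPOTHETICAL responses: weekly pub visits for 20 people (not real data).
visits = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4]


def proportion_at_or_below(data, value):
    """Empirical share of observations less than or equal to value."""
    return sum(x <= value for x in data) / len(data)


def proportion_at_or_above(data, value):
    """Empirical share of observations greater than or equal to value."""
    return sum(x >= value for x in data) / len(data)


def poisson_pmf(k, rate):
    """Probability of exactly k events under a Poisson model with the given rate."""
    return exp(-rate) * rate**k / factorial(k)


print("mean:", mean(visits), "sd:", round(stdev(visits), 1))

# Basic approach: read the percentages straight off the sample.
frequencies = Counter(visits)
for count in sorted(frequencies):
    print(count, "visits:", frequencies[count] / len(visits))
print("share at or below -0.6:", proportion_at_or_below(visits, -0.6))
print("share at or above 4:", proportion_at_or_above(visits, 4))

# Model-based approach (ASSUMPTION: Poisson with rate equal to the sample mean).
rate = mean(visits)
p_at_least_four = 1 - sum(poisson_pmf(k, rate) for k in range(4))
print("Poisson P(X = 0):", round(poisson_pmf(0, rate), 3))
print("Poisson P(X >= 4):", round(p_at_least_four, 3))
```

Either way, the share of people below -0.6 is zero by construction, and the tail percentages come from the counts themselves rather than from a normal curve.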