Solved – Which Single Summary Statistic to use for Inverted Bell Curve (Bimodal Distribution)

descriptive-statistics, inverse-gaussian-distribution

I've collected some datasets for which I want to report a summary statistic.

I produced normal probability plots for the datasets, and the data does not conform to a Gaussian distribution – there are extremely long vertical tails for the lowest and highest values, and very few values correspond to the mean. Therefore, it appears the data follows an inverse bell curve (which I have come to learn is a speacial case of a bimodal distribution).

I need to select a single summary statistic to report the results of the datasets. Which summary statistic would be best to report results from an inverted bell curve?

Here is an image of the distribution via the Normal Probability Plot:

[Image: normal probability plot of the data]

Best Answer

There is no good single summary statistic for the type of distribution you have plotted, or, really, for any multimodal distribution.

That is, you can calculate anything you'd like: mean, median, mode, interquartile range, whatever. But none of these is a good representation of data that have multiple modes.

Even for data that are perfectly normally distributed, you need two numbers: mean and standard deviation. But let's assume you want a single measure of central tendency or location. For the normal, that's the mean. For highly skewed distributions, the usual choice is the median (although sometimes the mean is best, or even the mode). For distributions with a few extreme outliers, you might consider the trimmed or Winsorized mean.
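To make the trimmed and Winsorized means concrete, here is a minimal sketch using only the Python standard library. The helper names `trimmed_mean` and `winsorized_mean`, the 10% cut-off, and the toy data are all assumptions chosen for illustration; they are not taken from the question.

```python
from statistics import mean, median


def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the points from each tail."""
    xs = sorted(values)
    k = int(len(xs) * proportion)  # number of points cut from each end
    return mean(xs[k:len(xs) - k])


def winsorized_mean(values, proportion):
    """Mean after clamping each tail to the nearest retained value."""
    xs = sorted(values)
    k = int(len(xs) * proportion)  # number of points clamped at each end
    if k == 0:
        return mean(xs)
    low, high = xs[k], xs[-k - 1]
    return mean(min(max(x, low), high) for x in xs)


# Hypothetical data: a tight cluster plus one extreme outlier.
data = [9, 10, 10, 11, 11, 12, 12, 13, 14, 95]

print(mean(data))                  # 19.7, pulled upward by the outlier
print(median(data))                # 11.5
print(trimmed_mean(data, 0.1))     # 11.625
print(winsorized_mean(data, 0.1))  # 11.7
```

Both robust versions land close to the median here, because a single outlier is the only thing distorting the ordinary mean.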

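But for multimodal distributions, none of these really work. The math is fine, but the intuition fails: the mean and median of a bimodal sample typically fall in the trough between the modes, where few observations actually lie.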
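A quick simulation shows why the intuition fails. This is only a sketch on synthetic data: the two clusters (centred at 0 and 10, standard deviation 1, 500 points each) and the 2-unit window around the mean are assumptions made for illustration, not a model of the data in the question.

```python
import random
from statistics import mean, median

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic bimodal sample (an assumption, not the asker's data):
# two equally sized clusters centred at 0 and 10.
low = [random.gauss(0, 1) for _ in range(500)]
high = [random.gauss(10, 1) for _ in range(500)]
data = low + high

centre = mean(data)
mid = median(data)

# Share of observations that lie within 2 units of the mean.
near_centre = sum(abs(x - centre) <= 2 for x in data) / len(data)

print(f"mean   = {centre:.2f}")
print(f"median = {mid:.2f}")
print(f"share of data within 2 units of the mean = {near_centre:.1%}")
```

The mean comes out close to 5, yet almost none of the observations are anywhere near that value. The number is computed correctly; it just does not describe a typical data point.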