Statistical Analysis – Representing Population Data Mean/Median (BMI/Age Curve Example)

curves, mean, median, population

I want to represent my data (n = 150 samples, each a curve of a value over time), similar to the BMI representations in the figure.

The options are either mean/std or median/percentiles.

The questions are:

  • Can we choose the representation we want independently of the data distributions?
  • How can we justify the use of either (besides one being prettier than the other)?
  • Would it be better to use confidence intervals for either?

Thank you for your help.

Best Answer

Can we choose the representation we want independently of the data distributions?

Yes, in principle: neither method is restricted to a particular distribution. (For some distributions, such as the Cauchy distribution, the mean is not defined, but those are the exception.)

How can we justify the use of either (besides one being prettier than the other)?

A representation with just mean and std provides less information than one with median and percentiles, provided you show enough percentiles. Think of a presentation that gives the percentile for every integer percent value: it describes the distribution very accurately, probably in much more detail than necessary. So the answer is: it depends on how detailed you want the description of your data to be.
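To make the comparison concrete, here is a minimal sketch of both summaries computed at each time point. It assumes the data can be arranged as an array with one row per subject and one column per time point; the array `values`, the time grid `t`, and the simulated right-skewed data are all made up for illustration and are not from the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 150 subjects, each measured at the same 20 time points.
n_subjects, n_times = 150, 20
t = np.linspace(2, 20, n_times)
# Made-up trend with right-skewed (lognormal) scatter, purely for illustration.
values = (16 + 0.4 * t) * rng.lognormal(mean=0.0, sigma=0.15,
                                        size=(n_subjects, n_times))

# Representation 1: mean and sample standard deviation at each time point.
mean = values.mean(axis=0)
std = values.std(axis=0, ddof=1)

# Representation 2: median and a few percentiles at each time point.
percentiles = [5, 25, 50, 75, 95]
bands = np.percentile(values, percentiles, axis=0)  # shape (5, n_times)
median = bands[2]  # the 50th percentile is the median
```

The first representation yields two curves (`mean`, `std`), while the second yields as many curves as percentiles are requested, which is why it can describe skewed or otherwise asymmetric data in more detail.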

Would it be better to use confidence intervals for either?

Roughly speaking, the information provided by confidence intervals can also be gleaned from (appropriate) percentile presentations. Confidence intervals would provide a rather small amount of information, but maybe exactly what you need, so it really depends on what your requirements are.

Note that you cannot obtain confidence intervals from mean/std representations, but you also cannot obtain mean/std representations from a provided confidence interval.
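If a confidence interval is what you need, one common way to get it from the raw data without distributional assumptions is a percentile bootstrap. The sketch below is only an illustration: the helper `bootstrap_ci`, its parameters, and the simulated sample `x` are hypothetical, and it treats a single time point. Under the usual definition, such an interval describes the uncertainty of the estimated median, which is a different quantity from a percentile band describing the spread of the data.

```python
import numpy as np

def bootstrap_ci(samples, stat=np.median, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for stat(samples)."""
    rng = np.random.default_rng(seed)
    samples = np.asarray(samples)
    n = samples.shape[0]
    # Resample with replacement: each row is one bootstrap sample of size n.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_stats = stat(samples[idx], axis=1)
    alpha = 1.0 - level
    lo, hi = np.percentile(boot_stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

# Hypothetical values of 150 subjects at one time point.
rng = np.random.default_rng(1)
x = rng.lognormal(mean=3.0, sigma=0.2, size=150)

ci_lo, ci_hi = bootstrap_ci(x)         # uncertainty of the estimated median
p5, p95 = np.percentile(x, [5, 95])    # spread of the data itself
```

With 150 subjects the interval for the median is much narrower than the 5th to 95th percentile range, so the two should be labelled clearly if both appear on the same plot.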

In my experience, people without much statistical education can have a hard time interpreting mean and std correctly, while their interpretation of confidence intervals and percentiles is more likely to be correct.