[Math] What are the different types of “averages” and when to use each one

average means statistics

So, the concept of an average truly is somewhat abstract. Most statisticians define it as a "measure of central tendency." Others say it is the "center of gravity" for a set of numbers.

I personally prefer a slightly more concrete explanation: A statistic that describes the "typical", or better yet, "representative" value of a data set. We humans get to decide what "representative" actually means. Is it the most common number? The number that falls "in the middle" of a set of numbers? Etc.

I am going to list a few types of averages and am wondering if someone could provide a data set where that particular average is the best choice for describing a "typical" or "representative" value for that data set.


Types of averages:

  • Arithmetic Mean: The number that you could use in place of each of the values of a data set, and still have the same sum. Formula: $$\overline{X}=\frac1N\sum\limits_{i=1}^{N} X_i$$

  • Mode: The most common value in a data set. No formula that I know of.

  • Median: The literal middle number of a data set where the values are listed in ascending order. No formula that I know of.

  • Root Mean Square: Don't really know how to describe this average in a physical sense. Maybe the average that gives larger numbers in the data set more "weight" or "significance?" Formula: $$X_{rms}=\sqrt{\frac1N\sum\limits_{i=1}^{N} X_i^2}$$

  • Mean Root Square: I just made this one up, but it seems to work well when I applied it to some random data sets. It seems to do the opposite of the RMS and gives smaller numbers in the data set more "weight" and "significance." Formula:
    $$X_{mrs}=\frac1N\sqrt{\sum\limits_{i=1}^{N} X_i^2}$$
    EDIT: Turns out this should not be considered an average, because it fails to describe a data set in which every value is the same number: for $N$ copies of $c$ it returns $c/\sqrt{N}$ rather than $c$, unlike all the other averages.

  • Geometric mean: The number that you could use in place of each of the values of a data set, and still have the same product. Formula:
    $$GM=\sqrt[N]{\prod\limits_{i=1}^{N} X_i}$$
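
As a rough illustration of the formulas above, here is a minimal Python sketch that computes each listed average on a small made-up data set. The helper names `rms` and `mean_root_square` and the sample values are assumptions chosen for illustration; the rest comes from Python's standard `statistics` module (`geometric_mean` needs Python 3.8 or later).

```python
import math
import statistics


def rms(xs):
    """Root mean square: square root of the mean of the squares."""
    return math.sqrt(sum(x * x for x in xs) / len(xs))


def mean_root_square(xs):
    """The question's 'mean root square': sqrt of the sum of squares, divided by N."""
    return math.sqrt(sum(x * x for x in xs)) / len(xs)


# Made-up sample data, for illustration only.
data = [1, 2, 2, 4, 8]

print("arithmetic mean:", statistics.mean(data))
print("median:         ", statistics.median(data))
print("mode:           ", statistics.mode(data))
print("RMS:            ", rms(data))
print("geometric mean: ", statistics.geometric_mean(data))

# A constant data set: a sensible average should return the constant itself.
constant = [5, 5, 5, 5]
print("RMS of constant set:", rms(constant))               # 5.0
print("MRS of constant set:", mean_root_square(constant))  # 2.5
```

The last two lines show why the "mean root square" does not behave like an average: on four copies of 5 it reports 2.5, while the RMS reports 5.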

Feel free to add in any other popular or useful types of averages and when to use them!

Best Answer

I did some testing using what you call "MRS" but I call Square Mean Root (SMR). It looks useful for dealing with data where large measurements indicate the opposite of an effect. For example, in reaction time measurements, large reaction times indicate that the participant was not paying attention. Unfortunately, the "tyranny of large numbers" means those large reaction times have an inappropriate effect on the results.
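
The answer does not give a formula for the Square Mean Root, so the sketch below assumes one plausible reading: take the square root of each value, average those, then square the result. The function name `square_mean_root` and the reaction times are hypothetical, chosen only to show how a single long reaction time pulls the arithmetic mean more than it pulls this statistic.

```python
import math


def square_mean_root(xs):
    """Assumed definition: square of the mean of the square roots.

    Only meaningful for non-negative data such as reaction times.
    """
    return (sum(math.sqrt(x) for x in xs) / len(xs)) ** 2


# Hypothetical reaction times in milliseconds; the last one is a lapse of attention.
times = [400, 400, 400, 400, 1600]

arithmetic = sum(times) / len(times)
smr = square_mean_root(times)

print("arithmetic mean:", arithmetic)  # 640.0
print("square mean root:", smr)        # 576.0
```

Under this assumed definition the result can never exceed the arithmetic mean, so large values have less pull on it.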

OTHER AVERAGES

  1. Normalized Averages: a) Convert each data set to relative values. b) Find the mean relative value. c) Restore the mean to an actual value by multiplying by the mean of the divisors and adding back the mean of the subtracted constants. NB: this method can also produce realistic standard deviations.

  2. Kalman Averages: These weight means inversely according to their standard deviations, so less reliable measurements have less influence. This type of average can be found on the net and in some statistics books.

  3. Winsorized Means: These set limits on outliers but do not eliminate them. The 5% and 95% values of a set of measurements are taken as the minimum and maximum allowed values. Any measurements below or above these values are set to the minimum and maximum, respectively. A mean is then calculated from the modified measurements in the usual way. This is less questionable than simply eliminating the "outliers".
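
Two of the averages in this list can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reference implementation: the function names, parameter names, and sample data are hypothetical, the percentile cut points use the "inclusive" method of `statistics.quantiles`, and the weighted mean uses inverse-variance weights (1/σ²), which is the common convention but goes slightly beyond the answer's wording about standard deviations.

```python
import statistics


def winsorized_mean(xs, lower_pct=5, upper_pct=95):
    """Clamp values outside the given percentiles, then take the ordinary mean."""
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(xs, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    clamped = [min(max(x, low), high) for x in xs]
    return statistics.fmean(clamped)


def inverse_variance_mean(values, std_devs):
    """Weighted mean with weights 1/sigma^2 (assumed weighting scheme)."""
    weights = [1.0 / (s * s) for s in std_devs]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical data: nineteen readings of 10 plus one low and one high outlier.
readings = [1] + [10] * 19 + [100]
print("plain mean:     ", statistics.fmean(readings))
print("winsorized mean:", winsorized_mean(readings))  # 10.0

# Two hypothetical estimates of the same quantity; the second is less reliable.
print("weighted mean:  ", inverse_variance_mean([10.0, 20.0], [1.0, 2.0]))  # 12.0
```

In the first example the two outliers are pulled in to the 5th and 95th percentile values instead of being discarded, so all 21 readings still contribute to the mean.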

When using any "average" such as mean, median, RMS, ..., it is important to keep in mind what has been measured and what the size of the measured numbers means in terms of the effect producing those numerical values.
