Solved – Combining and ranking standard deviations

ranking, ranks, standard deviation, variance

  1. I am trying to take three independent indicators and combine them into a single composite measure.

  2. For discussion, let's say that this is a composite measure for 'economic performance', and that the measure contains three independent indicators: worklessness, qualifications and unemployment. Let's also assume that there is a complete data set, with scores for each of the 50 local areas.

  3. To compare and contrast those 50 local areas, we believe that the most robust way is to work out the standard deviation for each indicator (i.e. working out the standard deviation [s.d.] for each of the three indicators, for each of the 50 local areas).

  4. The problem we face, however, comes when we combine the indicators into a single composite measure. I understand from reading elsewhere (including on stats.stackexchange.com) that standard deviations cannot simply be added together, and that the correct method is to square each standard deviation to obtain its variance, add the variances together, divide the total by the number of variances (i.e. divide by three in this case), and then take the square root of the result.

  5. If it were okay to add standard deviations together, there would be no problem: we would simply add them, rank the resulting values, and split them into quintiles.

  6. However, if we follow (4.) above to combine the standard deviations, the majority of the results now fall within one standard deviation of the mean. The smallest level of granularity is now 1 s.d., making it impossible to rank or quintile the areas, except by assuming that anything within 1 s.d. belongs in the middle quintile.

  7. Is there any way I can justify simply adding together standard deviations, so that they can actually be ranked and quintiled, instead of following the 'proper' method?

Best Answer

The solution, it turns out, is to normalise the variables first by calculating standard scores (in this case, z-scores / z-values). Unlike standard deviations, z-scores can be added together, and that solves the problem in (4.) above.
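
In symbols, the normalisation and the composite could be written as below. The notation is introduced here for illustration and is not taken from the question: x_{ij} is the score of area i on indicator j, and \bar{x}_j and s_j are the mean and standard deviation of indicator j across the 50 areas.

```latex
z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \qquad
C_i = \sum_{j=1}^{3} z_{ij}, \qquad i = 1, \dots, 50
```

Each z_{ij} is unitless and says how many standard deviations area i lies above or below the mean on indicator j, which is why the three values can be summed into the composite C_i.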

Given a data set containing 50 observations of 3 variables, the 50 observations can be classified into equal-sized groups as follows:

  1. work out z-scores for each variable;
  2. sum z-scores;
  3. rank; and
  4. quintile.
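
The four steps above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names `z_scores` and `composite_quintiles` are hypothetical, the data are randomly generated rather than real, and the sketch assumes all three indicators are already oriented the same way (for example, higher always meaning worse). If they are not, the sign of the relevant z-scores would need to be flipped before summing.

```python
import random
from statistics import mean, stdev


def z_scores(values):
    """Standardise one indicator: subtract its mean, divide by its sample s.d."""
    # Note: a constant indicator (s.d. of 0) would cause a division by zero.
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite_quintiles(indicators):
    """indicators: dict of indicator name -> list of scores, one per area."""
    # 1. Work out z-scores for each variable.
    z_columns = [z_scores(scores) for scores in indicators.values()]
    # 2. Sum the z-scores for each area.
    composite = [sum(area_z) for area_z in zip(*z_columns)]
    # 3. Rank: 1 = lowest composite; ties are broken by original order.
    n = len(composite)
    order = sorted(range(n), key=lambda i: composite[i])
    ranks = [0] * n
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    # 4. Quintile: 1 = lowest fifth of ranks, 5 = highest fifth.
    quintiles = [(r - 1) * 5 // n + 1 for r in ranks]
    return composite, ranks, quintiles


# Made-up scores for 50 areas on three indicators (illustrative only).
random.seed(0)
indicators = {
    name: [random.gauss(0, 1) for _ in range(50)]
    for name in ("worklessness", "qualifications", "unemployment")
}
composite, ranks, quintiles = composite_quintiles(indicators)
```

With 50 areas, each quintile contains exactly 10 of them. The sketch uses the sample standard deviation (`stdev`); using the population version (`pstdev`) rescales each indicator's z-scores by a constant factor, so it should not change the ranking.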