Solved – Which robust correlation methods are actually used

Tags: correlation, r, robust, spearman-rho, winsorizing

I plan to run a simulation study comparing the performance of several robust correlation techniques under different distributions (skewed, with outliers, etc.). By robust, I mean ideally robust against a) skewed distributions, b) outliers, and c) heavy tails.

Along with the Pearson correlation as a baseline, I was thinking of including the following more robust measures:

  • Spearman's $\rho$
  • Percentage bend correlation (Wilcox, 1994, [1])
  • Minimum volume ellipsoid and minimum covariance determinant estimators (cov.mve / cov.mcd with the cor=TRUE option)
  • Possibly the winsorized correlation
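
To make the difference between some of these measures concrete, here is a minimal sketch in plain Python (standard library only) of Pearson, Spearman's $\rho$, and a winsorized correlation. It is an illustration under assumptions, not a reference implementation: the helper names are made up for this example, the winsorized version simply winsorizes each variable separately and then applies Pearson, and the 20% winsorizing proportion is an assumed default. The percentage bend and MVE/MCD estimators are not covered here.

```python
import math


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def winsorize(values, proportion=0.2):
    """Pull the lowest and highest `proportion` of values in to the
    nearest retained order statistic (proportion is an assumed default)."""
    n = len(values)
    g = int(math.floor(proportion * n))
    s = sorted(values)
    lo, hi = s[g], s[n - g - 1]
    return [min(max(v, lo), hi) for v in values]


def winsorized_correlation(x, y, proportion=0.2):
    """Pearson correlation of the separately winsorized variables."""
    return pearson(winsorize(x, proportion), winsorize(y, proportion))


if __name__ == "__main__":
    # Toy data: a perfect linear relation with one gross outlier in y.
    x = list(range(1, 11))
    y = [1, 2, 3, 4, 5, 6, 7, 8, 9, -30]
    print("Pearson:   ", pearson(x, y))
    print("Spearman:  ", spearman(x, y))
    print("Winsorized:", winsorized_correlation(x, y))
```

On this toy data the single outlier is enough to make the Pearson coefficient negative, while the rank-based and winsorized versions stay positive.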

Of course there are many more options (especially if you include robust regression techniques as well), but I want to restrict myself to the most commonly used or most promising approaches.

Now I have three questions (feel free to answer just one of them):

  1. Are there other robust correlation methods I could or should include?
  2. Which robust correlation techniques are actually used in your field?
    (Speaking for psychological research: apart from Spearman's $\rho$, I have never seen any robust correlation technique outside of a technical paper. Bootstrapping is becoming more and more popular, but other robust statistics are more or less non-existent so far.)
  3. Are there already systematic comparisons of multiple correlation techniques that you know of?

Also feel free to comment on the list of methods given above.


[1] Wilcox, R. R. (1994). The percentage bend correlation coefficient. Psychometrika, 59, 601-616.

Best Answer

Coming from a psychology perspective, Pearson's and Spearman's correlations do appear to be the most common. However, I think a lot of researchers in psychology apply various data manipulation procedures to the constituent variables before computing Pearson's correlation. I imagine any examination of robustness should consider the effects of:

  • transformations of one or both variables in order to make variables approximate a normal distribution
  • adjustment or deletion of outliers based on a statistical rule or knowledge of problems with an observation
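
As a rough illustration of how these two preprocessing steps can change a Pearson correlation, here is a small sketch in plain Python. Everything specific in it is an assumption made for the example rather than something prescribed above: the log transform, the median/MAD outlier rule, the cutoff of 3, and the helper name `drop_outliers` are all hypothetical choices.

```python
import math
import statistics


def pearson(x, y):
    """Plain Pearson product-moment correlation."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_outliers(x, y, threshold=3.0):
    """Remove pairs where either value lies more than `threshold`
    scaled MADs from its variable's median (an assumed, illustrative rule)."""

    def flags(values):
        med = statistics.median(values)
        mad = statistics.median(abs(v - med) for v in values)
        scale = 1.4826 * mad  # makes the MAD comparable to an SD under normality
        if scale == 0:
            # Degenerate case: no spread estimate, so flag nothing.
            return [False] * len(values)
        return [abs(v - med) / scale > threshold for v in values]

    keep = [not (a or b) for a, b in zip(flags(x), flags(y))]
    kept_x = [v for v, k in zip(x, keep) if k]
    kept_y = [v for v, k in zip(y, keep) if k]
    return kept_x, kept_y


if __name__ == "__main__":
    # 1) Transformation: y grows exponentially in x, so the raw relation
    #    is monotone but strongly non-linear; log(y) is exactly linear in x.
    x = list(range(1, 9))
    y = [math.exp(v) for v in x]
    print("raw:        ", pearson(x, y))
    print("log(y):     ", pearson(x, [math.log(v) for v in y]))

    # 2) Outlier deletion: one gross outlier in an otherwise linear relation.
    x2 = list(range(1, 11))
    y2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, -30]
    print("with outlier:", pearson(x2, y2))
    print("after rule:  ", pearson(*drop_outliers(x2, y2)))
```

In a simulation study, steps like these could be treated as part of the estimator being compared, since the combination "screen or transform, then Pearson" is what ends up being reported.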