Solved – Canonical correlation analysis with rank correlation

canonical-correlation, data transformation, kendall-tau, multivariate analysis, spearman-rho

Canonical correlation analysis (CCA) aims to maximize the usual Pearson product-moment correlation (i.e. the linear correlation coefficient) between linear combinations of the variables in two data sets.
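In symbols, this objective can be sketched as follows. The notation is introduced here for illustration and is not part of the original post: $X$ and $Y$ are the two centered data matrices, $a$ and $b$ are weight vectors, and $\Sigma_{XX}$, $\Sigma_{YY}$, $\Sigma_{XY}$ are the within-set and between-set covariance matrices. The first pair of canonical variates $(Xa, Yb)$ solves

$$
\max_{a,\,b}\ \operatorname{corr}(Xa,\ Yb)
= \max_{a,\,b}\ \frac{a^\top \Sigma_{XY}\, b}{\sqrt{a^\top \Sigma_{XX}\, a}\ \sqrt{b^\top \Sigma_{YY}\, b}} .
$$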

Now, consider that this correlation coefficient measures only linear association. This is the very reason why we also use, for example, Spearman-$\rho$ or Kendall-$\tau$ (rank) correlation coefficients, which measure arbitrary monotone (not necessarily linear) association between variables.

Hence, I was thinking of the following: one limitation of CCA is that it only tries to capture linear association between the formed linear combinations due to its objective function. Wouldn't it be possible to extend CCA in some sense by maximizing, say, Spearman-$\rho$ instead of Pearson-$r$?

Would such a procedure lead to anything statistically interpretable and meaningful? (Does it make sense, for example, to perform CCA on ranks?) I am wondering if it would help when we are dealing with non-normal data.
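To make the "CCA on ranks" idea concrete, here is a minimal NumPy sketch. It assumes that each original variable is replaced by its ranks before ordinary CCA is run. Note that this maximizes the Pearson correlation between linear combinations of ranks, which is not the same thing as maximizing Spearman-$\rho$ between linear combinations of the raw variables. The simulated data, the helper names `rank_columns` and `canonical_correlations`, and the QR/SVD route to the canonical correlations are all illustrative choices, not something taken from the question.

```python
import numpy as np

def rank_columns(a):
    """Replace each column by its ranks 1..n.

    Ties are broken by position, which is adequate for continuous data.
    """
    a = np.asarray(a, dtype=float)
    return np.argsort(np.argsort(a, axis=0), axis=0) + 1.0

def canonical_correlations(x, y):
    """Canonical correlations of x (n x p) and y (n x q).

    Uses orthonormal bases of the centered column spaces; the singular
    values of Qx' Qy are the canonical correlations. Assumes both
    matrices have full column rank.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Simulated example (an assumption for illustration): the two sets share a
# latent variable z through monotone but strongly nonlinear transformations.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
x = np.column_stack([np.exp(2 * (z + 0.1 * rng.normal(size=n))),
                     rng.normal(size=n)])
y = np.column_stack([(z + 0.1 * rng.normal(size=n)) ** 3,
                     rng.normal(size=n)])

raw = canonical_correlations(x, y)
ranked = canonical_correlations(rank_columns(x), rank_columns(y))

print("CCA on raw data:", raw)
print("CCA on ranks:   ", ranked)
```

In this setup the rank-based first canonical correlation comes out higher than the raw one, because ranking undoes the monotone distortions. The resulting canonical variates are combinations of ranks, so they have to be interpreted on the rank scale.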

Best Answer

I used restricted cubic spline expansions when computing canonical variates. You are adding nonlinear basis functions to the analysis exactly as you would be adding new features. This results in nonlinear principal component analysis. See the R Hmisc package's transcan function for an example. The R homals package takes this much further.
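The idea of adding spline basis functions as new features can be sketched in a few lines of NumPy. This is not the `transcan` or `homals` algorithm, only an illustration under stated assumptions: each variable is expanded into a restricted cubic spline basis with four knots at fixed quantiles (an illustrative choice), and ordinary CCA is then run on the expanded matrices. The helper names `rcs_basis`, `expand` and `canonical_correlations`, and the simulated data, are hypothetical.

```python
import numpy as np

def rcs_basis(v, quantiles=(0.05, 0.35, 0.65, 0.95)):
    """Restricted cubic spline basis for one variable.

    Returns v itself plus k-2 nonlinear terms for k knots; the fitted
    function is constrained to be linear beyond the outer knots.
    Knot locations at these quantiles are an illustrative assumption.
    """
    v = np.asarray(v, dtype=float)
    t = np.quantile(v, quantiles)
    k = len(t)
    d = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps the columns on a comparable scale

    def pos3(u):
        # Truncated cubic: max(u, 0) ** 3
        return np.clip(u, 0.0, None) ** 3

    cols = [v]
    for j in range(k - 2):
        term = (pos3(v - t[j])
                - pos3(v - t[-2]) * (t[-1] - t[j]) / d
                + pos3(v - t[-1]) * (t[-2] - t[j]) / d)
        cols.append(term / scale)
    return np.column_stack(cols)

def expand(a):
    """Spline-expand every column of a and stack the results side by side."""
    return np.hstack([rcs_basis(a[:, i]) for i in range(a.shape[1])])

def canonical_correlations(x, y):
    """Canonical correlations via QR + SVD (assumes full column rank)."""
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    qx, _ = np.linalg.qr(x)
    qy, _ = np.linalg.qr(y)
    s = np.linalg.svd(qx.T @ qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Simulated example (an assumption for illustration): y depends on x through
# a non-monotone function, which neither Pearson nor rank correlation detects.
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(-2.0, 2.0, size=(n, 2))
y = np.column_stack([x[:, 0] ** 2 + 0.1 * rng.normal(size=n),
                     rng.normal(size=n)])

linear = canonical_correlations(x, y)
spline = canonical_correlations(expand(x), expand(y))

print("linear CCA:         ", linear)
print("spline-expanded CCA:", spline)
```

Unlike a rank transformation, the spline expansion can also pick up non-monotone relationships, at the cost of more parameters per variable and therefore more risk of overfitting in small samples.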
