Correlation vs. Association – Understanding the Differences


My statistics professor claims that the word "correlation" applies strictly to linear relationships between variates, whereas the word "association" applies broadly to any type of relationship. In other words, he claims the term "non-linear correlation" is an oxymoron.

From what I can make of this section in the Wikipedia article on "Correlation and dependence", the Pearson correlation coefficient describes the degree of "linearity" in the relationship between two variates. This suggests that the term "correlation" does in fact apply exclusively to linear relationships.

On the other hand, a quick Google search for "non-linear correlation" turns up a number of published papers that use the term.

Is my professor correct, or is "correlation" simply a synonym of "association"?

Best Answer

No; correlation is not equivalent to association. However, the meaning of correlation is dependent upon context.

The classical statistics definition is, to quote from Kotz and Johnson's Encyclopedia of Statistical Sciences, "a measure of the strength of the linear relationship between two random variables". In mathematical statistics, "correlation" generally seems to have this interpretation.
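For reference, the usual formalisation of this linear sense is the Pearson coefficient. The symbols below (X, Y, their means and standard deviations) are my notation, not Kotz and Johnson's:

$$
\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}
           = \frac{\mathbb{E}\big[(X - \mu_X)(Y - \mu_Y)\big]}{\sigma_X \, \sigma_Y},
\qquad -1 \le \rho_{X,Y} \le 1 .
$$

Here $|\rho_{X,Y}| = 1$ holds exactly when $Y = a + bX$ (with $b \neq 0$) with probability one, which is the sense in which the coefficient measures linearity rather than dependence in general.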

In applied areas where data is commonly ordinal rather than numeric (e.g., psychometrics and market research), this definition is not so helpful, as the concept of linearity assumes data that has interval-scale properties. Consequently, in these fields correlation is instead interpreted as indicating a monotonically increasing or decreasing bivariate pattern, or a correlation of the ranks. A number of non-parametric correlation statistics have been developed specifically for this (e.g., Spearman's correlation and Kendall's tau-b). These are sometimes referred to as "non-linear correlations" because they are correlation statistics that do not assume linearity.
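To make the distinction between the linear and the rank-based readings concrete, here is a minimal sketch in plain Python. The data (y = exp(x) on the integers 1 to 10) and the helper names `pearson`, `average_ranks` and `spearman` are my own illustrative choices, not taken from any library. Spearman's coefficient is computed as the Pearson coefficient of the ranks, with tied values given their average rank.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative data: strictly increasing, but far from linear.
x = list(range(1, 11))
y = [math.exp(v) for v in x]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(f"Pearson:  {r_pearson:.3f}")
print(f"Spearman: {r_spearman:.3f}")
```

Because y increases strictly with x, the ranks agree perfectly and the Spearman coefficient is 1, while the Pearson coefficient falls noticeably short of 1 because the relationship is not linear.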

Amongst non-statisticians, correlation often means association (sometimes with and sometimes without a causal connotation). Irrespective of the etymology of correlation, the reality is that amongst non-statisticians it has this broader meaning, and no amount of chastising them for inappropriate usage is likely to change this. A quick Google search suggests that some uses of "non-linear correlation" are of this kind (in particular, some people use the term to denote a smoothish non-linear relationship between numeric variables).
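A small sketch of why the broad and the strict readings can disagree sharply: below, y is completely determined by x through a smooth curve, yet the linear (Pearson) coefficient is zero. The parabola y = x² on a symmetric grid is my own illustrative choice, and `pearson` is a hand-written helper rather than a library function.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data: y depends on x exactly, but not linearly or monotonically.
x = list(range(-5, 6))
y = [v ** 2 for v in x]

r = pearson(x, y)
print(f"Pearson correlation for y = x^2: {r:.3f}")
```

Someone using "correlation" to mean "association" would call these variables strongly correlated; someone using the classical definition would report a correlation of zero.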

The context-dependent nature of the term "non-linear correlation" perhaps means it is ambiguous and should not be used. As regards "correlation", you need to work out the context of the person using the term in order to know what they mean.