Correlation vs Covariance – Key Differences Explained

correlationcovariance

Following up on this question, How would you explain covariance to someone who understands only the mean?, which addresses the issue of explaining covariance to a lay person, brought up a similar question in my mind.

How would one explain to a statistics neophyte the difference between covariance and correlation? It seems that both refer to the change in one variable linked back to another variable.

Similar to the referred-to question, a lack of formulae would be preferable.

Best Answer

The problem with covariances is that they are hard to compare: when you calculate the covariance of a set of heights and weights, as expressed in (respectively) meters and kilograms, you will get a different covariance from when you do it in other units (which already gives a problem for people doing the same thing with or without the metric system!), but also, it will be hard to tell if (e.g.) height and weight 'covary more' than, say the length of your toes and fingers, simply because the 'scale' the covariance is calculated on is different.

The solution to this is to 'normalize' the covariance: you divide the covariance by something that represents the diversity and scale in both the covariates, and end up with a value that is assured to be between -1 and 1: the correlation. Whatever unit your original variables were in, you will always get the same result, and this will also ensure that you can, to a certain degree, compare whether two variables 'correlate' more than two others, simply by comparing their correlation.

Note: the above assumes that the reader already understands the concept of covariance.