Expected correlation coefficient given different ranges of values of one variable

correlation

I want to study Pearson's correlation between two variables, call them x and y. I know the values of y for a number of samples. I believe a higher correlation coefficient is more likely in samples that cover a larger range of the variable of interest than in samples restricted to a smaller range. Suppose I have pilot data from which I have calculated a correlation coefficient for a given range of y values, e.g. from 1 to 5 (a 5-fold change between minimum and maximum). Is there any way of estimating the expected correlation coefficient for samples with a different range, e.g. y values from 1 to 10 (a 10-fold change) or from 1 to 2 (only a 2-fold change)?

Best Answer

If you know the standard deviations rather than the ranges, there is a formula you can use. Let $\Sigma_x$ be the unrestricted standard deviation of $x$, $\sigma_x$ the restricted standard deviation, and $\rho_{xy}$ the correlation observed in the restricted sample, and suppose the restriction is made on $x$. Then the correlation corrected for range restriction is

$$ \frac{\rho_{xy}\frac{\Sigma_x}{\sigma_x}}{\sqrt{1-\rho_{xy}^2+\rho_{xy}^2\frac{\Sigma_x^2}{\sigma_x^2}}} $$

I copied the formula from a paper by Stauffer and Mendoza in Psychometrika, volume 66 (2001), pp. 63-68. You might want to double-check my typing before using it, but I believe it to be correct.
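As a rough illustration, the correction formula above can be written as a small Python function. This is only a sketch: the function name `correct_for_range_restriction` and the pilot numbers are made up for the example, and the formula is usually applied on the assumption that the relationship between the variables is linear with roughly constant scatter across the range.

```python
import math


def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Apply the range-restriction correction quoted in the answer.

    r_restricted    -- correlation observed in the restricted sample (rho_xy)
    sd_unrestricted -- standard deviation over the wider range (Sigma_x)
    sd_restricted   -- standard deviation over the restricted range (sigma_x)
    """
    if not -1.0 <= r_restricted <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    if sd_unrestricted <= 0 or sd_restricted <= 0:
        raise ValueError("standard deviations must be positive")

    u = sd_unrestricted / sd_restricted  # ratio Sigma_x / sigma_x
    return r_restricted * u / math.sqrt(
        1.0 - r_restricted ** 2 + (r_restricted * u) ** 2
    )


# Hypothetical pilot result: r = 0.30 in a sample whose standard deviation
# is half that of the wider-range sample being planned.
r_wide = correct_for_range_restriction(0.30, sd_unrestricted=2.0, sd_restricted=1.0)
print(round(r_wide, 3))
```

Only the ratio of the two standard deviations matters. Passing a ratio below 1 runs the same formula in the other direction, giving the correlation expected in a narrower range than the pilot sample. Because Pearson's correlation is symmetric in the two variables, the same function applies when the restriction is on y instead of x.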
