Solved – Correlation versus cause-effect regression

causality, correlation, regression

I know correlation does not imply causation. I have read it for the nth time (e.g., weight does not cause height, etc.).

However, to find the effect of a moderator variable on the X-Y relationship, a regression model is used, such as the GLM in SPSS or multiple regression, to test for an interaction. See here.

My Question: If the X-Y relationship is a correlation, then why is a cause-effect model used in this instance?

As far as I understand, it makes little sense to classify a variable as independent or dependent in a correlation analysis.

I apologise if this seems like a 'silly' question. In my past life, I had often told my students that there are no silly questions; just questions!

Best Answer

The calculations underlying the correlation and regression are not the same. They allow us to study the relation between two variables in different and complementary ways.

I don't think it makes sense to say "the X-Y relationship is a correlation" (any more than to say that the relationship is a regression). Correlation is not a characteristic of the nature of the relationship. It is just a coefficient that you decide to use in order to quantify... what?

  • The correlation coefficient is a measure of association between two variables that treats them symmetrically. We just want to know whether there is an association between the two variables and to quantify its strength. Thus, for variables X and Y, r(X,Y) = r(Y,X). In this case, it doesn't make sense to distinguish between an independent variable and a dependent variable.

  • Regression analysis allows us to study the association between two variables by studying the variations of one as a function of the values of the other, i.e., variations of a dependent variable according to the values of an independent variable. In this case, the model Y = aX + e is not the same as the model X = bY + e' (where a and b are constants and e and e' are error terms). The regression of Y on X is different from the regression of X on Y. In this case, we have to specify a dependent variable and an independent variable, according to our hypotheses.
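The contrast between the two bullets can be checked numerically. The following is a minimal sketch in plain Python, not part of the original answer: the data are made up for illustration, the helper names `pearson_r` and `ols_slope` are hypothetical, and the slope is the usual least-squares slope of a model with an intercept (the answer's Y = aX + e omits the intercept for brevity).

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient; symmetric in its two arguments."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ols_slope(predictor, outcome):
    """Least-squares slope of outcome regressed on predictor (with intercept)."""
    mp, mo = mean(predictor), mean(outcome)
    cov = sum((p - mp) * (o - mo) for p, o in zip(predictor, outcome))
    var = sum((p - mp) ** 2 for p in predictor)
    return cov / var


# Made-up illustrative data (an assumption, not taken from the question).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r_xy = pearson_r(x, y)          # same value whichever variable comes first
r_yx = pearson_r(y, x)
slope_y_on_x = ols_slope(x, y)  # regression of Y on X
slope_x_on_y = ols_slope(y, x)  # regression of X on Y: a different slope

print(r_xy, r_yx)
print(slope_y_on_x, slope_x_on_y)
```

Swapping the arguments leaves the correlation unchanged but changes the regression slope, which is why regression forces a choice of dependent variable. The two slopes are still linked to the correlation: their product equals r squared.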

Correlation is often a first step in descriptive analysis (if there is no correlation, there is usually little point in going on to regression analysis). In regression, we can test a hypothesis about the relation between a dependent variable and an independent variable (and also the moderating effect of a third variable).
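To make the moderation idea concrete, a moderator is commonly modelled by adding a product term to the regression, Y = b0 + b1·X + b2·M + b3·X·M, where a non-zero b3 means the X-Y slope depends on M. The sketch below is an illustration under stated assumptions rather than what SPSS does internally: the data are made up and noise-free, the helpers `solve_linear_system` and `fit_ols` are hypothetical, and it only estimates the coefficients without running a significance test.

```python
def solve_linear_system(a, b):
    """Solve a @ coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    coef = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * coef[k] for k in range(row + 1, n))
        coef[row] = (aug[row][n] - tail) / aug[row][row]
    return coef


def fit_ols(design, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yv for r, yv in zip(design, outcome)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up data: the slope of y on x is 1.0 when m = 0 and 3.0 when m = 1.
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
m = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
y = [2.0 + 1.0 * xi + 0.5 * mi + 2.0 * xi * mi for xi, mi in zip(x, m)]

# Design matrix columns: intercept, x, moderator m, interaction x * m.
design = [[1.0, xi, mi, xi * mi] for xi, mi in zip(x, m)]
b0, b1, b2, b3 = fit_ols(design, y)

print("slope of y on x when m = 0:", b1)
print("change in that slope per unit of m (interaction):", b3)
```

Here y must be designated the dependent variable before the model can even be written down, but nothing in the fit itself says that x causes y. The interaction coefficient only describes how the association between x and y changes across levels of m.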

In addition, when it comes to testing causal relations, the problem lies more in the nature of the variables (you can study the effect of gender on school achievement, but not vice versa) and in the study design (causality implies that the cause temporally precedes the effect).