Is the physical impact / effect necessarily the independent variable / dependent variable of the regression model?

causality, dependent variable, regression

A regression analysis (RA) is often explained as follows:
"…Regression analyses are statistical methods by which you can calculate whether an independent variable (IV) impacts a dependent variable (DV). So, in contrast to a correlation, it is possible to evaluate causal effects and not only relations. Within RAs the IV is also called predictor, cause or regressor, whereas the DV is often called target variable, effect or regressand…"

What confuses me is the equating of the terms independent variable and cause, and of the terms dependent variable and effect, when I compare the calculation direction of my regression model with the natural or physical cause-effect direction.

Let's take an example:
On a certain date we measure the water depth at several points (over the sloping bottom) of a large, clear lake. For the same points and the same date we also have the reflection values from satellite-borne imagery. Say we are interested in the reflection of the blue band. (Normally, for clear water we would assume a negative correlation, i.e. blue reflection decreases as depth increases.)

Now we want to know whether we can predict depth from the blue band reflection values.
In such a regression model depth is the DV and reflection is the IV, since the input is the reflection and the output is depth. Thus, if we equate the terms IV and cause (and likewise DV and effect), reflection values would impact water depth. But from a physical cause-effect point of view it is the complete opposite: water depth causes the measured reflection value (and reflection does not cause depth).

Can anybody resolve this contradiction or point me to a relevant post?
Or in other words:
Is it legitimate to use a regression like the one in the example, in which the calculation direction runs opposite to the 'physical' cause-effect direction?
This might be quite a basic question. Thanks in advance.

Best Answer

There are many words that carry a causal sense and are quite confusing for people trying to cross the bridge from correlation to causation. Reading your first paragraph, I can feel the confusion myself. Correlation, regression analysis, relation: all these words can have a causal interpretation, or not.

I like to say that causality is a new lens you put on to look at something. Correlation, without anything else, is just what it says it is: a correlation. If you add assumptions, it can mean something else. Only then do words like effect come to mean a practical effect, or a causal effect, which is what a layperson would expect when reading them.

It is indeed possible to estimate causal effects with regression, as long as the causal model is identifiable. Also, you may want to check "Under which assumptions a regression can be interpreted causally?". These two linked questions have good and exhaustive answers.

You can have a regression in which the dependent variable is caused by the independent variables. You can have a regression in which the dependent variable causes the independent variables. In both cases, it's possible that you can achieve reasonable predictions. You can even have a regression in which neither of the variables has a direct causal relationship and still achieve good predictions.

In some places, it's possible to estimate ice cream consumption from the drowning rate, that is, you can regress one on the other. However, neither of these variables is a direct cause or effect of the other. They're both caused by temperature/weather, a confounding variable. Things become trickier if you really want to intervene in real life, where it makes quite a difference whether A causes B or B causes A.
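The confounding situation described above can be illustrated with a small simulation. This is only a sketch on synthetic data: the coefficients, noise levels and variable names below are invented for illustration and do not come from any real dataset. It shows that drownings and ice cream consumption correlate strongly when temperature drives both, and that the association largely disappears once temperature is regressed out of each variable.

```python
import numpy as np

# Synthetic data; all numbers are assumptions chosen for illustration only.
rng = np.random.default_rng(0)
n = 5000
temperature = rng.normal(20.0, 5.0, n)                    # common cause
ice_cream = 2.0 * temperature + rng.normal(0.0, 3.0, n)   # driven by temperature
drownings = 0.5 * temperature + rng.normal(0.0, 1.0, n)   # driven by temperature

# Raw association: strong, although neither variable causes the other.
corr_raw = np.corrcoef(drownings, ice_cream)[0, 1]


def residuals(y, x):
    """Return what is left of y after a straight-line fit on x."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


# Association after removing the part explained by temperature.
corr_given_temperature = np.corrcoef(
    residuals(drownings, temperature),
    residuals(ice_cream, temperature),
)[0, 1]

print(f"correlation, raw:               {corr_raw:.3f}")
print(f"correlation, given temperature: {corr_given_temperature:.3f}")
```

In this setup a regression of ice cream consumption on drownings would predict well, yet intervening on either variable would not change the other, because the simulated data contain no causal link between them.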

So yes, in terms of being able to perform a prediction with regression, you can have a regression the way you mentioned in your question. If there is a strong statistical dependence between A and B, I can predict A with B and B with A, A~B, or B~A. It will not necessarily generalize, but you can do it.
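As a rough illustration of the lake example, the sketch below generates synthetic data in the physical direction (depth determines reflection, plus noise) and then fits the regression in both directions. The linear relationship, the coefficients and the noise level are assumptions made up for this example, not measured values, and real depth-reflection relationships need not be linear.

```python
import numpy as np

# Synthetic data; the linear form and all numbers are illustrative assumptions.
rng = np.random.default_rng(1)
n = 2000
depth = rng.uniform(0.5, 20.0, n)                                  # metres
# Physical direction: depth determines blue-band reflection (plus noise).
reflection = 0.30 - 0.01 * depth + rng.normal(0.0, 0.01, n)

# Causal direction: reflection ~ depth
slope_r_on_d, intercept_r_on_d = np.polyfit(depth, reflection, 1)

# Anti-causal direction, the one used for prediction: depth ~ reflection
slope_d_on_r, intercept_d_on_r = np.polyfit(reflection, depth, 1)

r = np.corrcoef(depth, reflection)[0, 1]
predicted_depth = slope_d_on_r * reflection + intercept_d_on_r
rmse = np.sqrt(np.mean((predicted_depth - depth) ** 2))

print(f"reflection ~ depth slope: {slope_r_on_d:.4f}")
print(f"depth ~ reflection slope: {slope_d_on_r:.2f}")
print(f"r^2 (same for both directions): {r ** 2:.3f}")
print(f"RMSE of predicted depth: {rmse:.2f} m")
```

Both fits share the same squared correlation, so the regression itself carries no information about which variable is the cause. The depth ~ reflection fit is a prediction tool only: changing the reflection value would not change the depth.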
