Solved – Control variables and other independent variables

multiple regression

I'm trying to do a multiple-regression analysis in Stata. I'm new to this subject, so I need someone to explain it to me in "simple words".

I ran a multiple-regression analysis: my control variables turned out to be "not significant", but I still want to include them in my analysis to show that I have controlled for them, because they are variables that are expected to matter. How can I explain the relationship between the control variables and my other, significant independent variables? What do the insignificant control variables say about the significant independent variables? And how do I know whether my significant independent variables are "good" or not, when all the control variables are insignificant?

LOJ = Dependent variable
DI, TI = Independent variables

Best Answer

When you say "control", I suspect you mean that you have a primary variable of interest, and then you have other variables that are potential confounders.

In the presence of a confounder, the effect size of the primary variable may appear larger or smaller than it actually is (Simpson's paradox / omitted-variable bias). To "control" for this effect (see also here), the confounder must be added to the multiple regression; otherwise you lose the ability to infer the causal effect of the primary variable.
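To make the omitted-variable bias concrete, here is a minimal simulated sketch in Stata (the confounder C and all coefficients are invented for illustration, not taken from your data): C drives both DI and LOJ, so leaving it out of the model distorts the estimated effect of DI.

```stata
* Hypothetical simulation of omitted-variable bias (illustrative, not your data)
clear
set seed 12345
set obs 1000
gen C   = rnormal()                     // confounder
gen DI  = 0.8*C + rnormal()             // confounder also drives the predictor
gen LOJ = 0.5*DI + 0.7*C + rnormal()    // true effect of DI is 0.5

regress LOJ DI       // C omitted: coefficient on DI is biased upward
regress LOJ DI C     // C included ("controlled for"): coefficient near 0.5
```

Whether C comes out "significant" in the second regression is beside the point; including it is what removes the bias in the DI coefficient, and that is exactly the justification for keeping expected control variables in the model.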

Note, however, that not all variables should be added to a regression. In some cases, adding a variable can even introduce bias (a so-called collider). The causal structure determines which variables belong in the regression, regardless of significance or of how they affect the estimates of other variables. See more comments here and in the excellent paper by Lederer et al., 2019.
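For contrast, here is the same kind of hypothetical sketch for a collider: K is an invented variable caused by both DI and LOJ, and adding it to the regression distorts the DI coefficient even though the simpler model was already correct.

```stata
* Hypothetical simulation of collider bias (illustrative, not your data)
clear
set seed 12345
set obs 1000
gen DI  = rnormal()
gen LOJ = 0.5*DI + rnormal()            // true effect of DI is 0.5
gen K   = 0.6*DI + 0.6*LOJ + rnormal()  // K is a collider: caused by both

regress LOJ DI       // correct model: coefficient on DI is near 0.5
regress LOJ DI K     // conditioning on the collider biases the DI coefficient
```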
