How to control for market return in an (SPSS) OLS

Tags: categorical data, controlling-for-a-variable, least squares, regression, spss

Please consider the following panel dataset:

comp  obs  industry  weekDay  ind10  ind15  day3  day4  day5  marketRet  tweets  stockRet
-----------------------------------------------------------------------------------------
1     1    15        3        0      1      1     0     0          0.10    5321     -0.90
1     2    15        4        0      1      0     1     0          1.30    4244     -0.30
1     3    15        5        0      1      0     0     1          0.90    5543      1.32
2     1    10        3        1      0      1     0     0          0.10     789      0.10
2     2    10        4        1      0      0     1     0          1.30     842      0.16
2     3    10        5        1      0      0     0     1          0.90     734      0.00

  • For a list of companies (comp), it gives the number of tweets and the stock return (stockRet) for a series of days (obs).
  • weekDay gives the day of the week (1 = Monday, 2 = Tuesday, …); this has been coded into the dummies day3 to day5.
  • industry gives the company's industry (15 = IT, 10 = banking, …); this has been coded into the dummies ind10 and ind15.
  • The final variable (marketRet) gives the average return of the stock market that day. Notice that for each day (obs), the market return is the same for every company.

Question 1: Say I am running an OLS regression with tweets as independent variable and stockRet as dependent. I'm also adding the dummies day3 to day5 and ind10 & ind15 to the model as independent variables. Does the model now include, as they say, "fixed effects for industry and day of week"?

Question 2: I have read articles with similar research to mine, and they say they have "added the market return as a control". In SPSS preferably, how do I enter the variable marketRet as a control to the model? Just by adding it as an independent variable?

Question 3: What is the difference between fixed effects and control variables?

These questions are probably very basic, but I have not been able to find a clear answer to them. For instance, articles mention they "control for market return" but make no mention of how and why they do so. Thus, any help is greatly appreciated 🙂

Best Answer

It'd be helpful if you told us what procedure you used. My answers rely on some guesses.

Question 1: If you're running the OLS regression using Analyze > Regression, then they cannot be random effects, because this module does not allow them. So they can be regarded as fixed effects. If you used the Mixed module instead, it would depend on where you put the variables: in the fixed slot or in the random slot.

We use fixed effects to estimate mean differences, and random effects to account for the variance introduced by a variable. If a variable yields a regression coefficient (or a set of coefficients, in the case of a categorical variable), it has been treated as a fixed effect; if it ends up in the variance/covariance output, it has been treated as a random effect.

Another way to check whether you have modeled the variables correctly is to ask: if you were to repeat the measurement, would the levels of that variable change? In your case, the levels behind all the dummies probably wouldn't change, so I would agree that they are fixed effects. My concern, as I stated in the comment, is the company ID. This is a repeated-measurement design, so you may want to consider using a Mixed model. In addition, if the companies were randomly chosen, you may want to consider allowing a random intercept and/or slope in your regression model. But again, this is just my crude guess. It would be best to discuss this with your statistical support.
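To make the Question 1 point concrete, here is a minimal sketch in Python with NumPy (not SPSS, and not the asker's actual analysis) that fits an OLS model on the six rows shown in the question, with industry and weekday entering as dummies. Two things are assumptions of the sketch: tweets is rescaled to thousands for readability, and ind10 and day3 are left out as reference categories, because in this six-row excerpt only industries 10 and 15 and weekdays 3 to 5 occur, so keeping all five dummies next to an intercept would be perfectly collinear. Six rows are far too few for any inference; the point is only the mechanics.

```python
import numpy as np

# The six rows from the table in the question.
tweets = np.array([5321, 4244, 5543, 789, 842, 734], dtype=float)
stock_ret = np.array([-0.90, -0.30, 1.32, 0.10, 0.16, 0.00])
ind15 = np.array([1, 1, 1, 0, 0, 0], dtype=float)
day4 = np.array([0, 1, 0, 0, 1, 0], dtype=float)
day5 = np.array([0, 0, 1, 0, 0, 1], dtype=float)

# Design matrix: intercept, tweets (in thousands), industry and weekday dummies.
# ind10 and day3 are omitted as reference categories (assumption of this sketch).
names = ["const", "tweets_k", "ind15", "day4", "day5"]
X = np.column_stack([np.ones(len(tweets)), tweets / 1000.0, ind15, day4, day5])

# Ordinary least squares via NumPy's least-squares solver.
coef, _, rank, _ = np.linalg.lstsq(X, stock_ret, rcond=None)
residuals = stock_ret - X @ coef

# Every dummy receives its own regression coefficient.
for name, b in zip(names, coef):
    print(f"{name:>9}: {b: .4f}")
```

Each dummy gets its own coefficient, which is what the "fixed effects for industry and day of week" wording refers to. In SPSS the rough equivalent would be listing the same columns as independent variables under Analyze > Regression > Linear.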

Question 2: Going by the wording alone, "added the market return as a control" usually just means that market return is entered as one of the independent variables. So yes, adding marketRet to the list of independent variables should do it.
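As a rough illustration of what "adding a control" does, the following Python/NumPy sketch (again, not SPSS) fits the six rows from the question twice: once with tweets only, and once with marketRet added as a second regressor. The helper name `ols` and the rescaling of tweets to thousands are choices made for this sketch. The weekday dummies are left out here because, in this six-row excerpt, marketRet is an exact linear function of them (0.10, 1.30 and 0.90 on days 3, 4 and 5); with more weeks of data that would presumably not be the case.

```python
import numpy as np

# The six rows from the table in the question.
tweets = np.array([5321, 4244, 5543, 789, 842, 734], dtype=float)
market_ret = np.array([0.10, 1.30, 0.90, 0.10, 1.30, 0.90])
stock_ret = np.array([-0.90, -0.30, 1.32, 0.10, 0.16, 0.00])


def ols(X, y):
    """Return OLS coefficients and the residual sum of squares (hypothetical helper)."""
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


const = np.ones(len(tweets))

# Model without the control: stockRet ~ const + tweets (in thousands).
X_base = np.column_stack([const, tweets / 1000.0])
b_base, rss_base = ols(X_base, stock_ret)

# Model with the control: marketRet is simply one more column.
X_ctrl = np.column_stack([const, tweets / 1000.0, market_ret])
b_ctrl, rss_ctrl = ols(X_ctrl, stock_ret)

print("tweets coefficient without control:", b_base[1])
print("tweets coefficient with marketRet: ", b_ctrl[1])
print("marketRet coefficient:             ", b_ctrl[2])
```

The tweets coefficient in the second model is the association with stockRet holding marketRet constant, which is what "controlling for market return" amounts to in a plain OLS.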

Question 3: Well, I am actually not entirely sure. My feeling is that a control variable can be either fixed or random, because there can be a need to control for a mean difference and a need to control for variance. For instance, you can "control" for the effect of gender by entering it as a fixed effect, and you can also "control" for clustering by state/province by treating it as a random effect. I have seen both of these wordings used.