Solved – How to control for market return in an (SPSS) OLS

categorical data, controlling-for-a-variable, least squares, regression, spss

Please consider the following panel dataset:

comp  obs  industry  weekDay  ind10  ind15  day3  day4  day5  marketRet  tweets  stockRet
-----------------------------------------------------------------------------------------
1     1    15        3        0        1    1     0     0     0.10       5321    -0.90
1     2    15        4        0        1    0     1     0     1.30       4244    -0.30
1     3    15        5        0        1    0     0     1     0.90       5543     1.32
2     1    10        3        1        0    1     0     0     0.10        789     0.10
2     2    10        4        1        0    0     1     0     1.30        842     0.16
2     3    10        5        1        0    0     0     1     0.90        734     0.00

For a list of companies (comp), it gives the number of tweets and the stock return (stockRet) for a series of days (obs).
weekDay gives the day of the week (1 = Monday, 2 = Tuesday, …); this has been expanded into the dummies day3 to day5.
industry gives the company's industry (15 = IT, 10 = banking, …); this has been expanded into the dummies ind10 and ind15.
The final variable (marketRet) gives the average return of the stock market that day. Notice that for each day (obs), the market return is the same for every company.

Question 1: Say I am running an OLS regression with tweets as independent variable and stockRet as dependent. I'm also adding the dummies day3 to day5 and ind10 & ind15 to the model as independent variables. Does the model now include, as they say, "fixed effects for industry and day of week"?

Question 2: I have read articles with similar research to mine, and they say they have "added the market return as a control". In SPSS preferably, how do I enter the variable marketRet as a control to the model? Just by adding it as an independent variable?

Question 3: What is the difference between fixed effects and control variables?

These questions are probably very basic, but I have not been able to find a clear answer to them. For instance, articles mention they "control for market return" but make no mention of how and why they do so. Thus, any help is greatly appreciated 🙂

Best Answer

It'd be helpful if you told us what procedure you used. My answers rely on some guesses.

Question 1: If you're running the OLS regression through Analyze > Regression, the dummies cannot be random effects because that procedure does not allow them, so they can be regarded as fixed effects. If you used the Mixed procedure instead, it depends on where you put the variables: in the fixed slot or the random slot.

We use fixed effects to detect mean differences, and random effects to account for the variance introduced by the variables. If a variable yields a regression coefficient (or a set of coefficients, in the case of a categorical variable), it has been treated as a fixed effect; if it ends up in the variance/covariance output, it has been treated as a random effect.

Another way to check whether you have modeled the variables correctly is to ask: if you repeated the measurement, would the categories within that variable change? In your case the categories of all the dummies probably wouldn't change, so I agree that they are fixed effects. My concern, as I stated in the comment, is the company ID. This is a repeated-measures design, so you may want to consider a mixed model. In addition, if the companies were randomly chosen, you may want to allow a random intercept or slope in your regression model. But again, this is just my rough guess; you had better discuss it with your statistical support.
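A minimal sketch of the random-intercept idea, assuming the question's variable names (comp, stockRet, tweets, industry, weekDay) and that the MIXED procedure is available in your installation. Here industry and weekDay enter as fixed factors after BY, so the hand-made dummies are not needed. The plain random intercept with COVTYPE(VC) is an illustrative assumption, not a recommendation for these particular data:

```spss
* Sketch only: random intercept per company, fixed effects for industry and weekday.
MIXED stockRet BY industry weekDay WITH tweets
  /FIXED = industry weekDay tweets | SSTYPE(3)
  /RANDOM = INTERCEPT | SUBJECT(comp) COVTYPE(VC)
  /METHOD = REML
  /PRINT = SOLUTION TESTCOV.
```

The coefficients for industry, weekDay and tweets appear in the fixed-effects table, while the company-level intercept variance appears under the covariance parameters, which mirrors the fixed/random distinction described above.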

Question 2: Just by the wording, "added the market return as a control" usually just means "market return" is treated as one of the independent variables.
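In syntax, that amounts to listing marketRet on the /METHOD=ENTER line alongside the other predictors. The following is a sketch using the question's variable names. It assumes one dummy per set is left out as the reference category (day3 and ind10 here), which matters if the dummies cover every category in the data. If further weekdays or industries exist without dummies, all the listed dummies can stay in:

```spss
* Sketch only: marketRet entered as an ordinary predictor, i.e. as a control.
* day3 and ind10 are omitted as reference categories, which is an assumption.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /NOORIGIN
  /DEPENDENT stockRet
  /METHOD=ENTER tweets marketRet day4 day5 ind15.
```

Note that in the six-row excerpt marketRet takes one value per weekday, so there it would be perfectly collinear with the day dummies. It can only be estimated separately once the data span several dates per weekday.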

Question 3: Well, I am actually not entirely sure. My feeling is that a control variable can be either fixed or random, because one may need to control for mean differences or for variance. For instance, you can "control" for the effect of gender as a fixed effect, and you can also "control" for the clustering due to state/province by treating it as a random effect. I have seen both of these wordings used.

Related Solutions

Solved – SPSS dummy variables in OLS

If you have the advanced statistics package that allows you to estimate generalized linear models (see the menus Analyze -> Generalized Linear Models or the GENLIN command), SPSS can generate the dummy variables on the fly. Given your data, it may also be worth checking whether some of the newer mixed model commands can estimate auto-regressive components for panel data.
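As a rough sketch of the on-the-fly approach, assuming the variable names from the example data further down (stockPrice, weekDay) and a normal distribution with identity link, so that the point estimates correspond to a linear regression on weekday dummies:

```spss
* Sketch only: weekDay after BY is treated as a factor, so no dummies are built by hand.
GENLIN stockPrice BY weekDay (ORDER=ASCENDING)
  /MODEL weekDay INTERCEPT=YES DISTRIBUTION=NORMAL LINK=IDENTITY
  /PRINT SOLUTION.
```

The parameter estimates table then lists one coefficient per weekday level, with one level set to zero as the reference.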

Alternatively, you can use the DO REPEAT syntax to efficiently generate your dummy variables for use in regression equations. For instance, for your weekDay variable it would be:

VECTOR weekDay_Dummy(7,F1.0).
DO REPEAT weekDay_Dummy = weekDay_Dummy1 to weekDay_Dummy7 /i = 1 to 7.
    DO IF weekDay = i. 
        COMPUTE weekDay_Dummy = 1.
    ELSE IF weekDay <> i.
        COMPUTE weekDay_Dummy = 0.
    END IF.
END REPEAT.

As long as your variable takes values from a sequential list of integers, the DO REPEAT command will work (if it doesn't, see the AUTORECODE command). Then, in the linear regression command, you can use the TO operator to specify a list of variables that are in sequential order in your dataset (note that this depends on the order of the variables in the dataset, not on their names).

Below I have an example.

data list free / company  sector  obsDay weekDay  stockPrice.
begin data
1        15      1      3        10.40
1        15      2      4         9.42
1        15      3      5         9.66
1        15      4      1        11.00
1        15      5      2        10.21
2        10      1      3        43.55
2        10      2      4        43.50
2        10      3      5        40.31
2        10      4      1        48.43
2        10      5      2        43.00
3        20      1      3        10.00
3        20      2      4        11.00
3        20      3      5        12.00
3        20      4      1        13.00
3        20      5      2        14.00
end data.
dataset name examp.

VECTOR weekDay_Dummy(7,F1.0).
DO REPEAT weekDay_Dummy = weekDay_Dummy1 to weekDay_Dummy7 /i = 1 to 7.
    DO IF weekDay = i. 
        COMPUTE weekDay_Dummy = 1.
    ELSE IF weekDay <> i.
        COMPUTE weekDay_Dummy = 0.
    END IF.
END REPEAT.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN 
  /DEPENDENT stockPrice
  /METHOD=ENTER weekDay_Dummy2 to weekDay_Dummy5.

Another extension command with more flexibility for creating dummy variables, SPSSINC_CREATE_DUMMIES (written in Python), is on the developerWorks site (though I have not used it). Also, one of the members here, ttnphns, has some tools for similar tasks on his site. Given your example, though, a few DO REPEAT commands should be sufficient.

Solved – Cannibalization of product sales

If you wish to determine the impact of sales of product B on product A, you must look at the conditional effect. The conditions that you might need to consider are: 1) day-of-the-week; 2) week-of-the-year; 3) month-of-the-year; 4) specific days-of-the-month; 5) lead and lag effects around each holiday/event; 6) Monday-after-a-Friday event; 7) Friday-before-a-Monday event; 8) particular weeks-in-the-month; 9) ARIMA structure; 10) level shifts, local time trends, seasonal pulses, pulses; 11) changes in parameters over time; 12) changes in variance over time; 13) impact of price/promotions, to name a few. I have written a paper on this subject; please see http://www.autobox.com/cms/index.php/news/131-102706-white-paper-on-cannibalization-qtesting-market-hypothesisq-by-john-c-pickett-david-p-reilly-view
