Solved – OLS estimate of a linear model with dummy variable

categorical data, least squares, regression, self-study

I know that a regression of y on x (a dummy variable) and a constant term can be represented in the following form:
[equation image not preserved]

On the other hand, the OLS estimator can be written in the following form:

[equation image not preserved]

I need to see how these two expressions for the estimate of β are related to one another.

Best Answer

The model that we have is: [equation image not preserved]

Knowing that X is a dummy variable, we can get the following: [equation image not preserved]

Using the above information, we can substitute for the components of the OLS estimator and, after simplifying, we get: [equation image not preserved]
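
The equation images did not survive on this page, so the following is only a sketch of the standard derivation under assumed notation, which may differ from the original images: $n_1 = \sum_i x_i$ is the number of observations with $x_i = 1$, $n_0 = n - n_1$, and $\bar{y}_1$ and $\bar{y}_0$ are the sample means of $y$ in the groups $x_i = 1$ and $x_i = 0$ (both groups assumed non-empty). The model is

$y_i = \alpha + \beta x_i + \varepsilon_i, \quad x_i \in \{0, 1\}$,

and the usual OLS slope is

$\hat{\beta} = \frac{\sum_{i=1}^n x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^n x_i^2 - n \bar{x}^2}$.

Because $x_i^2 = x_i$ for a dummy variable, we have

$\sum_{i=1}^n x_i^2 = n_1, \quad \bar{x} = \frac{n_1}{n}, \quad \sum_{i=1}^n x_i y_i = n_1 \bar{y}_1, \quad n \bar{y} = n_1 \bar{y}_1 + n_0 \bar{y}_0$.

Substituting into the numerator and the denominator separately gives

$\sum_{i=1}^n x_i y_i - n \bar{x} \bar{y} = n_1 \bar{y}_1 - \frac{n_1}{n} (n_1 \bar{y}_1 + n_0 \bar{y}_0) = \frac{n_1 n_0}{n} (\bar{y}_1 - \bar{y}_0)$,

$\sum_{i=1}^n x_i^2 - n \bar{x}^2 = n_1 - \frac{n_1^2}{n} = \frac{n_1 n_0}{n}$,

so that

$\hat{\beta} = \bar{y}_1 - \bar{y}_0, \quad \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} = \bar{y}_0$.

In words, under these assumptions the intercept is the mean of $y$ in the reference group and the slope is the difference between the two group means.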

Related Solutions

Solved – Estimating linear regression with OLS vs. ML

Using the usual notation, and assuming independent normal errors with variance $\sigma^2$, the log-likelihood for the ML method is

$l(\beta_0, \beta_1 ; y_1, \ldots, y_n) = \sum_{i=1}^n \left\{ -\frac{1}{2} \log (2\pi\sigma^2) - \frac{(y_{i} - (\beta_0 + \beta_1 x_{i}))^{2}}{2 \sigma^2} \right\}$.

It has to be maximised with respect to $\beta_0$ and $\beta_1$.

However, it is easy to see that this is equivalent to minimising

$\sum_{i=1}^{n} (y_{i} - (\beta_0 + \beta_1 x_{i}))^{2} $.

Hence, ML and OLS lead to the same estimates of $\beta_0$ and $\beta_1$.
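
To make the step explicit, here is a short sketch that treats $\sigma^2 > 0$ as fixed and writes $S(\beta_0, \beta_1)$ for the sum of squares (a symbol introduced here for convenience, not part of the original answer). The log-likelihood splits as

$l(\beta_0, \beta_1 ; y_1, \ldots, y_n) = -\frac{n}{2} \log (2\pi\sigma^2) - \frac{1}{2\sigma^2} S(\beta_0, \beta_1), \quad S(\beta_0, \beta_1) = \sum_{i=1}^{n} (y_{i} - (\beta_0 + \beta_1 x_{i}))^{2}$.

The first term does not involve $\beta_0$ or $\beta_1$, and the second is $S$ multiplied by the negative constant $-1/(2\sigma^2)$, so maximising $l$ over $(\beta_0, \beta_1)$ is the same as minimising $S$. The two methods do differ for the variance: setting $\partial l / \partial \sigma^2 = 0$ gives

$\hat{\sigma}^2_{ML} = \frac{1}{n} S(\hat{\beta}_0, \hat{\beta}_1)$,

which divides by $n$ rather than by the $n - 2$ used in the usual unbiased estimator.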

More details are provided in these nice lecture notes.

Solved – SPSS dummy variables in OLS

If you have the Advanced Statistics package, which allows you to estimate generalized linear models (see the menu Analyze -> Generalized Linear Models or the GENLIN command), SPSS can generate the dummy variables on the fly. Given your data, it may also be worth checking whether some of the newer mixed model commands can estimate autoregressive components for panel data.

Alternatively, you can use the DO REPEAT syntax to efficiently generate your dummy variables for use in regression equations. For instance, for your weekDay variable it would be:

VECTOR weekDay_Dummy(7,F1.0).
DO REPEAT weekDay_Dummy = weekDay_Dummy1 to weekDay_Dummy7 /i = 1 to 7.
    DO IF weekDay = i. 
        COMPUTE weekDay_Dummy = 1.
    ELSE IF weekDay <> i.
        COMPUTE weekDay_Dummy = 0.
    END IF.
END REPEAT.

As long as your variable takes sequential integer values, the DO REPEAT command will work (if it does not, see the AUTORECODE command). Then, in the linear regression command, you can use the TO operator to specify a list of variables that are in sequential order in your dataset (note that this depends on the order of the variables in the dataset, not on their names).

Below I have an example.

data list free / company  sector  obsDay weekDay  stockPrice.
begin data
1        15      1      3        10.40
1        15      2      4         9.42
1        15      3      5         9.66
1        15      4      1        11.00
1        15      5      2        10.21
2        10      1      3        43.55
2        10      2      4        43.50
2        10      3      5        40.31
2        10      4      1        48.43
2        10      5      2        43.00
3        20      1      3        10.00
3        20      2      4        11.00
3        20      3      5        12.00
3        20      4      1        13.00
3        20      5      2        14.00
end data.
dataset name examp.

VECTOR weekDay_Dummy(7,F1.0).
DO REPEAT weekDay_Dummy = weekDay_Dummy1 to weekDay_Dummy7 /i = 1 to 7.
    DO IF weekDay = i. 
        COMPUTE weekDay_Dummy = 1.
    ELSE IF weekDay <> i.
        COMPUTE weekDay_Dummy = 0.
    END IF.
END REPEAT.

REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN 
  /DEPENDENT stockPrice
  /METHOD=ENTER weekDay_Dummy2 to weekDay_Dummy5.
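
As a cross-check outside SPSS, here is a minimal NumPy sketch. Python is an assumption here (the answer itself uses only SPSS syntax), and names such as `week_day`, `dummies`, and `coef` are illustrative. It builds the same weekday dummies for the example data and fits the regression by least squares. With weekday 1 as the omitted reference category, the intercept should match the mean stockPrice on weekday 1, and each dummy coefficient should match the difference between that day's mean and the reference mean.

```python
import numpy as np

# weekDay and stockPrice columns copied from the example data above.
week_day = np.array([3, 4, 5, 1, 2] * 3)
stock_price = np.array([
    10.40, 9.42, 9.66, 11.00, 10.21,
    43.55, 43.50, 40.31, 48.43, 43.00,
    10.00, 11.00, 12.00, 13.00, 14.00,
])

# Dummies for weekdays 2..5; weekday 1 is the omitted reference category,
# mirroring /METHOD=ENTER weekDay_Dummy2 to weekDay_Dummy5.
days = np.arange(2, 6)
dummies = (week_day[:, None] == days[None, :]).astype(float)

# Design matrix: a constant term plus the four dummy columns.
X = np.column_stack([np.ones(len(week_day)), dummies])

# Least-squares fit; coef[0] is the intercept, coef[1:] the dummy coefficients.
coef, *_ = np.linalg.lstsq(X, stock_price, rcond=None)

# Mean stockPrice for each weekday 1..5, for comparison with the fit.
group_means = np.array([stock_price[week_day == d].mean() for d in range(1, 6)])

print("intercept:", coef[0], "| mean on weekday 1:", group_means[0])
print("dummy coefficients:", coef[1:])
print("mean differences vs weekday 1:", group_means[1:] - group_means[0])
```

Because the model has one parameter per weekday, the fitted values are exactly the weekday means, which is why the coefficients line up with the mean differences.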

Another extension command with more flexibility for creating dummy variables, SPSSINC_CREATE_DUMMIES (written in Python), is available on the developerWorks site (though I have not used it). Also, one of the members here, ttnphns, has some tools to accomplish similar tasks on his site. Given your example, though, a few DO REPEAT commands should be sufficient.
