Solved – Lagged dependent variable in linear regression

autoregressive, least squares, regression, time series

Recently I read a paper in which time series data were modelled according to the equation
$$
Y_t=\beta_1 Y_{t-1}+\beta_2 X+\varepsilon.
$$
OLS (via the lm() command in R) was used to obtain the coefficient of $Y_{t-1}$. Is this statistically valid?
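For concreteness, here is a minimal sketch (not the paper's actual code or data; the coefficient values and variable names are made up) of how such a model is typically fit with lm(), by regressing $Y_t$ on the lagged $Y$ and on $X$:

```r
# Minimal sketch: simulate a series and fit Y_t on Y_{t-1} and X_t with lm().
# The true coefficients (0.5 and 1) are arbitrary illustrative choices.
set.seed(1)
n <- 200
x <- rnorm(n)                          # exogenous regressor X_t
y <- numeric(n)
y[1] <- rnorm(1)
for (t in 2:n) y[t] <- 0.5 * y[t - 1] + 1 * x[t] + rnorm(1)

dat <- data.frame(y = y[2:n], y_lag = y[1:(n - 1)], x = x[2:n])
fit <- lm(y ~ y_lag + x, data = dat)   # OLS with the lagged dependent variable
summary(fit)                           # coefficient on y_lag estimates beta_1
```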

I understand that with time series data this is actually an ARX process, which can be represented as
$$
Y_t=\theta Y_{t-1}+\beta X + \varepsilon,
$$
where $\theta$ comes from the Yule-Walker equations.

Will $\theta$ and $\beta_1$ yield the same result? Won't the OLS estimator suffer from an autocorrelation problem, since $E[x_t \varepsilon_t] \ne 0$? My statistics knowledge is at a beginner level, so please help me understand this.

Best Answer

Hi: Your model is also called a Koyck distributed lag, and it can be difficult to estimate with small samples. With larger samples, my experience is that bias is not a problem (I used simulation to check this).
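Along those lines, here is a rough sketch of that kind of bias check (my own illustrative parameter values, not anything from the original post): simulate the model many times at different sample sizes and look at the average error of the OLS lag coefficient.

```r
# Rough sketch of a bias check: for each sample size n, simulate the model
# Y_t = theta*Y_{t-1} + beta*X_t + e_t many times, fit it by OLS, and
# report the average (estimate - theta). theta = 0.5 and beta = 1 are
# illustrative assumptions.
sim_bias <- function(n, theta = 0.5, beta = 1, reps = 2000) {
  est <- replicate(reps, {
    x <- rnorm(n)
    y <- numeric(n)
    y[1] <- rnorm(1)
    for (t in 2:n) y[t] <- theta * y[t - 1] + beta * x[t] + rnorm(1)
    coef(lm(y[2:n] ~ y[1:(n - 1)] + x[2:n]))[2]   # OLS lag coefficient
  })
  mean(est) - theta                               # average bias
}

sapply(c(20, 50, 200, 1000), sim_bias)  # downward bias shrinks as n grows
```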

The link below briefly discusses the statistical properties of the estimates on pages 12 and 13. Essentially, the problems are similar to those of the estimates of an AR(1).

https://www.reed.edu/economics/parker/312/tschapters/S13_Ch_3.pdf
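On the AR(1) analogy, a quick illustrative sketch comparing the OLS estimate from lm() with the Yule-Walker estimate from ar(): in moderate samples the two are very close, and both share the same small-sample downward bias.

```r
# Compare OLS and Yule-Walker estimates of an AR(1) coefficient
# on one simulated series (true coefficient 0.6, an arbitrary choice).
set.seed(2)
n <- 500
y <- arima.sim(model = list(ar = 0.6), n = n)

coef(lm(y[2:n] ~ y[1:(n - 1)]))[2]                             # OLS estimate
ar(y, order.max = 1, aic = FALSE, method = "yule-walker")$ar   # Yule-Walker estimate
```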

I would check out Hamilton or the little Koyck book (1954) for a more in-depth discussion, but hopefully the above helps some.
