Solved – Using a Lag Variable in Time Series Data

lags, regression, time series

I am new to time series data and am confused by this question: I have received conflicting advice and would appreciate some clarification.

I am attempting to test whether the creation of a particular type of school affects the number of rich or poor children in surrounding schools.

I have 6 years of data on this, and I was running a regression of the number of disadvantaged students on an indicator for whether one of these schools is nearby, along with a number of other control variables.

Because these are time series data, I was advised to include a lag of the dependent variable [L.] (since presumably the current number of students will affect next year's).

But others told me to use the D. operator in Stata instead, to account for any changes in the number of such groups between years, and my search of the web and textbooks has not been much help in deciding which to use in this situation, if either.

Best Answer

D, the difference operator, is statistical shorthand for the case where the contemporaneous effect is perfectly counterbalanced by a lag (delay) effect. Neither a lag nor differencing should be assumed necessary or useful. Analysis using cross-correlation procedures on suitable stationary series can often be useful in identifying an appropriate model. This page, http://empslocal.ex.ac.uk/people/staff/dbs202/cat/stats/corr.html, might help you understand the pitfalls of ordinary cross-correlation statistics on the original series. Also, http://www.autobox.com/cms/index.php/afs-university/intro-to-forecasting/doc_download/24-regression-vs-box-jenkins discusses the fatal flaw of equally weighting all observations, i.e. historical periods, in forming a model.
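
To make the first point concrete, here is a minimal Python sketch of what the two operators compute and why a differenced term is the same as a level term plus a lag term with an equal and opposite coefficient. It is an illustration only: the yearly counts, the coefficient `beta`, and the helper names `lag` and `diff` are made up for the example and do not come from the question's data or from Stata itself.

```python
def lag(series):
    """Previous period's value, in the spirit of Stata's L. operator."""
    # The first period has no predecessor, so it is missing (None).
    return [None] + list(series[:-1])


def diff(series):
    """Change since the previous period, in the spirit of Stata's D. operator."""
    return [None] + [cur - prev for prev, cur in zip(series, series[1:])]


# Hypothetical counts of disadvantaged students at one school over 6 years.
students = [120, 118, 125, 131, 128, 134]

lagged = lag(students)     # [None, 120, 118, 125, 131, 128]
changed = diff(students)   # [None, -2, 7, 6, -3, 6]

# A term beta * D.x is identical to beta * x_t + (-beta) * x_{t-1}:
# the lag coefficient exactly cancels the contemporaneous one.
beta = 2  # arbitrary illustrative coefficient
via_difference = [beta * d for d in changed[1:]]
via_levels = [beta * cur + (-beta) * prev
              for cur, prev in zip(students[1:], lagged[1:])]

print(via_difference == via_levels)
```

Seen this way, differencing is a restricted special case of including a lag, which is why neither should be adopted without checking that the data support it. Note also that with several schools observed over the same years, the lag or difference would have to be taken within each school rather than across the stacked rows.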
