How to determine the appropriate number of lags when using Newey-West (or HAC) standard errors

Tags: autocorrelation, heteroscedasticity, neweywest, panel data, stata

I have an unbalanced panel dataset where both autocorrelation and heteroskedasticity are present. I have read, in the Stata manual, that the newey command (see Newey-West, 1987) is one way in which these two problems may be addressed simultaneously. However, my understanding is that I must stipulate a lag(m) option, where autocorrelation at lags greater than m can be ignored. My question is how to determine what 'm' should be? Is there some way to determine how many lags I should be using?

I have found some discussion online about a type of stationarity test in which I would calculate first differences, second differences, etc. and then test for autocorrelation. Once the series is "stationary", I should know how many lags to use. However, I can't find any information on what this test is or how to use it.

Any help would be appreciated!

Best Answer

My answer is going to expand on what @Achim mentioned as "the growth rate of this lag length parameter". Newey & West (1987, Econometrica, p. 705) show that their estimator for the covariance matrix is consistent if the lag length $m$ fulfills the following two conditions:

  1. The lag length $m$ grows with the sample size $T$. Specifically,

$$\lim_{T\to\infty}m(T)=+\infty$$

  2. The lag length $m$ grows at a slower rate than $T^{1/4}$:

$$\lim_{T\to \infty}[m(T)/T^{1/4}]=0$$

(These are obviously not the only conditions for the consistency of the estimators, but the other conditions are not directly related to the lag length.)

In light of the above, many practitioners seem to simply set $m$ to the integer part of $T^{1/4}$, see e.g. Greene (Econometric Analysis, 7th edition, section 20.5.2, p. 960).
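As a rough illustration of that rule of thumb, here is a minimal Python sketch that computes the integer part of $T^{1/4}$. The function name `rule_of_thumb_lag` is made up for this example, and it assumes $T$ is the number of time periods in the sample; it is only one way to do the arithmetic, not a recommendation from Newey & West or Greene.

```python
import math


def rule_of_thumb_lag(T: int) -> int:
    """Return the integer part of T**(1/4) (hypothetical helper)."""
    if T < 1:
        raise ValueError("T must be a positive number of time periods")
    # floor(T**(1/4)) equals isqrt(isqrt(T)); integer square roots
    # avoid floating-point rounding at exact fourth powers.
    return math.isqrt(math.isqrt(T))


# Print the implied lag length for a few sample sizes.
for T in (50, 100, 500, 1000):
    print(T, rule_of_thumb_lag(T))
```

The resulting value is what would then be supplied as `m` in the `lag(m)` option of Stata's `newey` command.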
