Arellano-Bond is really about using moment conditions where differenced variables are instrumented with all available lags (the most important part of the paper is just equation (2) which lays this out). Even the "all available" part is for efficiency - the estimator would be valid if you just did the lazy thing and used only two lags to instrument for every period. In this sense the paper's really quite general. Suppose we have:
$$
y_{it}=\sum_{l=1}^p\rho_ly_{i,t-l}+x_{i,t}'\beta+\alpha_i+\epsilon_{it}
$$
where $\alpha_i$ is an individual effect, and $\epsilon_{it}$ is uncorrelated with all $y_{i,s},\,\,s=t-1,\dots,0$ and all $x_{i,s},\,\,s=t,\dots,0$. Then first differencing removes the fixed effect:
$$
y_{it}-y_{i,t-1}=\sum_{l=1}^p\rho_l(y_{i,t-l}-y_{i,t-l-1})+(x_{i,t}-x_{i,t-1})'\beta+\epsilon_{it}-\epsilon_{i,t-1}
$$
and all $y$'s dated $t-2$ or earlier and all $x$'s dated $t-1$ or earlier are available instruments. If $x$ is endogenous, $z$ can be used as an instrument in its place so long as it is uncorrelated with $\epsilon_{it}-\epsilon_{i,t-1}$. This allows for arbitrary lags and $x$'s. EDIT: Note that the number of lags you can use as valid instruments is determined by the error term $\epsilon_{it}-\epsilon_{i,t-1}$. Since $\epsilon_{i,t-1}$ is correlated with $y_{i,t-1}$, you cannot use $y_{i,t-1}$. However, you can use $y_{i,t-2}$ regardless of the lag length $p$, because it is always uncorrelated with both $\epsilon_{it}$ and $\epsilon_{i,t-1}$.
Again, difference to remove the fixed effect and lag in order to take moment conditions. The rest is all matrix algebra and careful stacking of vectors in order to work those moment conditions out. For this I would suggest following the directions in one of the links below.
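To make the "difference, then instrument with a lag" idea concrete, here is a minimal simulation sketch. It is not the full Arellano-Bond GMM estimator: it is the lazy, just-identified version with $p=1$, no $x$'s, and a single instrument $y_{i,t-2}$ for $\Delta y_{i,t-1}$. The function names, the sample sizes, and the standard normal draws for $\alpha_i$ and $\epsilon_{it}$ are all assumptions made for illustration.

```python
import numpy as np


def simulate_panel(n=20000, t_periods=8, rho=0.5, burn=50, seed=0):
    """Simulate y_it = rho * y_{i,t-1} + alpha_i + eps_it (illustrative DGP)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=n)                 # individual effects
    total = t_periods + burn
    y = np.zeros((n, total))
    y[:, 0] = alpha + rng.normal(size=n)
    for t in range(1, total):
        y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n)
    return y[:, burn:]                         # drop the burn-in periods


def _differences(y):
    dy = y[:, 2:] - y[:, 1:-1]                 # Delta y_it,      t = 2..T-1
    dy_lag = y[:, 1:-1] - y[:, :-2]            # Delta y_{i,t-1}
    z = y[:, :-2]                              # instrument y_{i,t-2}
    return dy, dy_lag, z


def ols_first_difference(y):
    """OLS on the differenced equation; inconsistent because
    Delta y_{i,t-1} is correlated with eps_{i,t-1}."""
    dy, dy_lag, _ = _differences(y)
    return (dy_lag * dy).sum() / (dy_lag ** 2).sum()


def iv_first_difference(y):
    """Simple IV: instrument Delta y_{i,t-1} with the level y_{i,t-2}."""
    dy, dy_lag, z = _differences(y)
    return (z * dy).sum() / (z * dy_lag).sum()


if __name__ == "__main__":
    panel = simulate_panel()
    print("true rho: 0.5")
    print("OLS on differences:", ols_first_difference(panel))
    print("IV with y_{t-2}:   ", iv_first_difference(panel))
```

Under this setup the OLS estimate on the differenced data should land well away from the true $\rho$, while the IV estimate should be close to it. Using all available lags rather than one, with an appropriate weighting matrix, is what the full estimator adds.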
So with that in mind:
- It will work with arbitrary lags.
- No, it would be the opposite - you should worry about the finite sample properties, not the properties in a large sample. Though I will admit to knowing less about this than other parts of your question.
- Arellano-Bond allows for multiple instruments. The original paper is somewhat dense but it will tell you how to do this. If you are using R or Stata, the canned routines can incorporate arbitrary instruments. If you look at page 290, footnote vi in their original Review of Economic Studies paper you can see how the instrument matrix would be laid out. The source I have below shows the same thing.
- They're not all that fresh - by econometrics standards I might almost call them old. If you want a nice presentation, try Behr (2003) (alternative link).
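As a rough illustration of how the instrument matrix is usually laid out, the sketch below builds the block-diagonal matrix for one individual in the simplest case: one lag of $y$, no $x$'s, and every available lag of $y$ used as an instrument. The function name `ab_instrument_matrix` is hypothetical, and this is my reading of the standard layout rather than code taken from the paper or from any package.

```python
import numpy as np


def ab_instrument_matrix(y_i):
    """Instrument matrix for one individual (AR(1), no x's).

    y_i holds the levels y_{i,0}, ..., y_{i,T-1}. The differenced equation
    exists for t = 2, ..., T-1, and the row for period t contains the
    instruments y_{i,0}, ..., y_{i,t-2} in its own block, zeros elsewhere.
    """
    y_i = np.asarray(y_i, dtype=float)
    T = len(y_i)
    n_rows = T - 2
    n_cols = (T - 2) * (T - 1) // 2    # 1 + 2 + ... + (T-2) instruments
    Z = np.zeros((n_rows, n_cols))
    col = 0
    for row, t in enumerate(range(2, T)):
        k = t - 1                      # number of valid lags at period t
        Z[row, col:col + k] = y_i[:k]
        col += k
    return Z


if __name__ == "__main__":
    print(ab_instrument_matrix([1, 2, 3, 4, 5]))
```

Each row corresponds to one differenced equation, so later periods get more instruments. Additional instruments (lags of $x$, or an outside $z$) would be appended as extra columns in the same row-by-row fashion.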
EDIT: If you're worried about weak instruments you might want to try the estimators presented in Blundell and Bond (1998) or Arellano and Bover (1995). These feature a larger set of moments including using lagged differences of $y$ as instruments for the current level. The papers were written specifically because of the concern that the original instruments might be too weak.
Best Answer
There wasn't enough space in my comment to explain it clearly but this should clarify. Take the Koyck distributed lag:
$y_t = \rho y_{t-1} + x_t + \epsilon_t$.
Now, using the lag operator $L$, this is $(1-\rho L)y_t = x_t + \epsilon_t$. Assuming $|\rho|<1$, we can invert $(1-\rho L)$ and re-write it as
$y_{t} = \sum_{i=0}^\infty \rho^{i}x_{t-i} + \sum_{i=0}^{\infty} \rho^{i} \epsilon_{t-i}$
Notice that, in the immediately previous equation, there is no longer a relationship between the LHS and the previous value of itself. The apparent dependence on $y_{t-1}$ is an illusion caused only by the exogenous regressor (and the error) having a lagged effect.
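A quick numerical check of the equivalence between the two forms, as a sketch. It assumes the process starts from $y_{-1}=0$, so the infinite sums truncate at $i=t$ and the match is exact; the function names and the simulated inputs are mine.

```python
import numpy as np


def koyck_recursive(x, eps, rho):
    """y_t = rho * y_{t-1} + x_t + eps_t, started from y_{-1} = 0."""
    y = np.zeros(len(x))
    prev = 0.0
    for t in range(len(x)):
        y[t] = rho * prev + x[t] + eps[t]
        prev = y[t]
    return y


def koyck_distributed_lag(x, eps, rho):
    """y_t = sum_{i=0}^{t} rho^i * (x_{t-i} + eps_{t-i}); no lagged y appears."""
    shocks = np.asarray(x, dtype=float) + np.asarray(eps, dtype=float)
    T = len(shocks)
    weights = rho ** np.arange(T)      # rho^0, rho^1, ..., rho^(T-1)
    # shocks[t::-1] is shocks[t], shocks[t-1], ..., shocks[0]
    return np.array([weights[: t + 1] @ shocks[t::-1] for t in range(T)])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    eps = rng.normal(size=200)
    same = np.allclose(koyck_recursive(x, eps, 0.7),
                       koyck_distributed_lag(x, eps, 0.7))
    print("two forms agree:", same)
```

The second function never touches a lagged $y$, yet it reproduces the same series, which is the point of the rewrite.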