While it seems quite common to calculate a lagged version of the dependent variable and use it on the right-hand side of a model (e.g., autoregressive models), I have rarely seen lagged versions of independent variables included in a model. Is there a reason for that?
Solved – Does using lagged independent variables make sense?
autocorrelation, autoregressive, lags, non-independent
Related Solutions
The decision to include a lagged dependent variable (DV) in your model is really a theoretical question. It makes sense to include a lagged DV if you expect the current level of the DV to be heavily determined by its past level. In that case, omitting the lagged DV will lead to omitted variable bias, and your results might be unreliable. In such a scenario, including the lagged DV will absorb much of the variance in your outcome and is likely to make the effects of your independent variables (IVs) less significant (that is, it will tend to both shrink the $\beta$s and inflate the standard errors). However, it allows you to say that those IVs that still influence your outcome have an effect controlling for the past value of the DV. An alternative approach is to use the difference between your outcome variable at periods $t$ and $t-1$ as your DV for period $t$.
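To make the omitted variable bias point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (the data-generating process, the coefficient values 0.7, 0.5 and 0.8, and the helper names `simulate` and `ols`); it is not taken from any particular study. Note that the independent variable is itself autocorrelated, which is what makes leaving out the lagged DV distort its coefficient.

```python
import numpy as np

def simulate(n=5000, rho=0.7, beta=0.5, phi=0.8, seed=0):
    """Simulate y_t = rho*y_{t-1} + beta*x_t + e_t with an autocorrelated x (assumed DGP)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal()
        y[t] = rho * y[t - 1] + beta * x[t] + rng.normal()
    return x, y

def ols(columns, target):
    """Least-squares coefficients of target on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(target))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

x, y = simulate()

# Regression of y_t on x_t only: the lagged DV is omitted.
beta_naive = ols([x[1:]], y[1:])[1]

# Regression of y_t on x_t and y_{t-1}: the lagged DV is included.
coef_lagged = ols([x[1:], y[:-1]], y[1:])
beta_lagged, rho_hat = coef_lagged[1], coef_lagged[2]

print(f"beta without lagged DV: {beta_naive:.3f}")
print(f"beta with lagged DV:    {beta_lagged:.3f} (rho estimate {rho_hat:.3f})")
```

In this setup the coefficient on `x` from the first regression should land well above the simulated value of 0.5, while the second regression should recover something close to it.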
However, either approach requires answering an important question: what is the right lag structure for your DV? You can get some information about this by examining the correlation of your outcome variable with itself at different lags (e.g., the correlation between $Y_t$ and $Y_{t-1}$, $Y_t$ and $Y_{t-2}$, etc.).
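A quick way to look at those lag correlations is sketched below. The helper name `autocorr` and the simulated AR(1) series (coefficient 0.6) are assumptions for illustration only; with real data you would pass in your own outcome series.

```python
import numpy as np

def autocorr(y, max_lag):
    """Sample correlation between y_t and y_{t-k} for k = 1..max_lag."""
    y = np.asarray(y, dtype=float)
    return [float(np.corrcoef(y[k:], y[:-k])[0, 1]) for k in range(1, max_lag + 1)]

# Illustrative series: an AR(1) process with an assumed coefficient of 0.6.
rng = np.random.default_rng(1)
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.6 * y[t - 1] + rng.normal()

acf = autocorr(y, 4)
for k, r in enumerate(acf, start=1):
    print(f"lag {k}: correlation {r:.3f}")
```

For a series like this one the correlations should decay steadily as the lag grows, which is the kind of pattern you would inspect when choosing how many lags to consider.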
There are many approaches to modeling integrated or nearly integrated time series data. Many of these models make more specific assumptions than more general model forms, and so can be considered special cases. De Boef and Keele (2008) do a nice job of spelling out various models and pointing out how they relate to one another. The single-equation generalized error correction model (GECM; Banerjee et al., 1993) is a nice one because it (a) is agnostic with respect to the stationarity/non-stationarity of the independent variables, (b) can accommodate multiple dependent variables, random effects, multiple lags, etc., and (c) has more stable estimation properties than two-stage error correction models (De Boef, 2001).
Of course the specifics of any given modeling choice will be particular to the researchers' needs, so your mileage may vary.
Simple example of GECM:
$$\Delta{y_{t}} = \beta_{0} + \beta_{\text{c}}\left(y_{t-1}-x_{t-1}\right) + \beta_{\Delta{x}}\Delta{x_{t}} + \beta_{x}x_{t-1} + \varepsilon_{t}$$
Where:
$\Delta$ is the change operator;
instantaneous short run effects of $x$ on $\Delta{y}$ are given by $\beta_{\Delta{x}}$;
lagged short run effects of $x$ on $\Delta{y}$ are given by $\beta_{x} - \beta_{\text{c}} - \beta_{\Delta{x}}$; and
long run equilibrium effects of $x$ on $y$ are given by $\left(\beta_{\text{c}} - \beta_{x}\right)/\beta_{\text{c}}$.
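For readers who want to see where these expressions come from, here is a short derivation sketch. It writes the dependent variable simply as $y_t$ and assumes $\beta_{\text{c}} \neq 0$. Substituting $\Delta{y_t} = y_t - y_{t-1}$ and $\Delta{x_t} = x_t - x_{t-1}$ into the GECM and collecting terms gives the equivalent form in levels:

$$y_{t} = \beta_{0} + \left(1 + \beta_{\text{c}}\right)y_{t-1} + \beta_{\Delta{x}}x_{t} + \left(\beta_{x} - \beta_{\text{c}} - \beta_{\Delta{x}}\right)x_{t-1} + \varepsilon$$

so the coefficient on $x_{t}$ is $\beta_{\Delta{x}}$ and the coefficient on $x_{t-1}$ is $\beta_{x} - \beta_{\text{c}} - \beta_{\Delta{x}}$. For the long run, set $y_{t} = y_{t-1} = y^{*}$, $x_{t} = x_{t-1} = x^{*}$ and $\varepsilon = 0$ in the GECM, so that both differences vanish:

$$0 = \beta_{0} + \beta_{\text{c}}\left(y^{*} - x^{*}\right) + \beta_{x}x^{*} \quad\Longrightarrow\quad y^{*} = -\frac{\beta_{0}}{\beta_{\text{c}}} + \frac{\beta_{\text{c}} - \beta_{x}}{\beta_{\text{c}}}\,x^{*}$$

The slope on $x^{*}$ is the long run expression given above.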
References
Banerjee, A., Dolado, J. J., Galbraith, J. W., and Hendry, D. F. (1993). Co-integration, error correction, and the econometric analysis of non-stationary data. Oxford University Press, USA.
De Boef, S. (2001). Modeling equilibrium relationships: Error correction models with strongly autoregressive data. Political Analysis, 9(1):78–94.
De Boef, S. and Keele, L. (2008). Taking time seriously. American Journal of Political Science, 52(1):184–200.
Best Answer
Models with lagged independent variables are called distributed lag models. Introductory econometrics texts usually have a section or chapter dedicated to them. They were more popular in the 1980s, and Christopher Sims in fact received his Nobel Prize in economics for work on such models (see the article "Money, Income, and Causality"). Nowadays they are used less frequently, but they can still be useful in time series regressions when it is clear that lags of the independent variable affect the dependent variable.
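As a rough illustration of what fitting a finite distributed lag model looks like, here is a minimal sketch using ordinary least squares. The helper name `lag_matrix`, the choice of two lags, and the simulated coefficients (1.0, 0.5, 0.25) are all assumptions made up for this example.

```python
import numpy as np

def lag_matrix(x, n_lags):
    """Columns x_t, x_{t-1}, ..., x_{t-n_lags}, aligned on t = n_lags .. len(x)-1."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.column_stack([x[n_lags - j : n - j] for j in range(n_lags + 1)])

# Simulated data (assumed): y_t = 1.0*x_t + 0.5*x_{t-1} + 0.25*x_{t-2} + noise.
rng = np.random.default_rng(2)
n, n_lags = 3000, 2
true_betas = np.array([1.0, 0.5, 0.25])
x = rng.normal(size=n)
lags = lag_matrix(x, n_lags)
y = lags @ true_betas + rng.normal(scale=0.5, size=n - n_lags)

# OLS of y_t on an intercept and the current and lagged values of x.
X = np.column_stack([np.ones(len(y)), lags])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
betas_hat = coef[1:]

for j, b in enumerate(betas_hat):
    print(f"estimated effect of x at lag {j}: {b:.3f}")
```

The first `n_lags` observations are dropped because their lagged values are not observed; that loss of data is one practical cost of adding more lags.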
Also note that autoregressive models can be written as infinite distributed lag models: provided $|\rho| < 1$, the model
$$y_t = \rho y_{t-1}+\beta x_t + u_t$$
is equivalent to
$$y_t = \beta \sum_{j=0}^\infty \rho^j x_{t-j}+ \sum_{j=0}^\infty \rho^j u_{t-j},$$
which means that autoregressive models actually use lagged independent variables, albeit not in a direct manner.
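The equivalence is easy to check numerically. The sketch below assumes, purely for convenience, that $y$, $x$ and $u$ are all zero before the sample starts, so the infinite sum truncates exactly at the first observation; the parameter values ($\rho = 0.6$, $\beta = 1.5$) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho, beta = 200, 0.6, 1.5
x = rng.normal(size=n)
u = rng.normal(size=n)

# Recursive (autoregressive) form, taking the pre-sample value of y as zero.
y_ar = np.zeros(n)
prev = 0.0
for t in range(n):
    y_ar[t] = rho * prev + beta * x[t] + u[t]
    prev = y_ar[t]

# Distributed lag form: y_t = sum_j rho^j * (beta * x_{t-j} + u_{t-j}),
# with the sum stopping at j = t because earlier values are assumed to be zero.
weights = rho ** np.arange(n)
y_dl = np.array([
    np.sum(weights[: t + 1] * (beta * x[t::-1] + u[t::-1]))
    for t in range(n)
])

print("max absolute difference:", np.max(np.abs(y_ar - y_dl)))
```

The two series should agree up to floating-point error, which is just the algebraic identity above evaluated on simulated data.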