The likelihood-ratio test is appropriate only if the two models you are comparing are nested, i.e., if one can be obtained from the other by constraining parameters (e.g., fixing them to zero). Models with more parameters will always fit better; the question the LR test answers is whether the improvement in fit is defensible given the number of added parameters.
If you want to compare non-nested models, you may use information criteria such as AIC or BIC (the smaller the better).
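To make the nested/non-nested distinction concrete, here is a minimal sketch (in Python with numpy/scipy rather than R's plm, purely for illustration; the variable names and the simulated data are assumptions) of an LR test between two nested OLS models, plus AIC/BIC for comparison:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

def ols_loglik(y, X):
    """Gaussian log-likelihood of an OLS fit (used by the LR test and AIC/BIC)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    m = len(y)
    sigma2 = resid @ resid / m  # ML estimate of the error variance
    return -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)

X_small = np.column_stack([np.ones(n), x1])      # restricted model
X_big = np.column_stack([np.ones(n), x1, x2])    # full model, which nests the restricted one

ll_small = ols_loglik(y, X_small)
ll_big = ols_loglik(y, X_big)

# LR test: valid here only because X_small is X_big with one coefficient fixed to 0
lr_stat = 2 * (ll_big - ll_small)
p_value = chi2.sf(lr_stat, df=1)  # df = number of restricted parameters

# Information criteria (smaller is better); k counts coefficients + error variance
def aic(ll, k):
    return 2 * k - 2 * ll

def bic(ll, k, m):
    return k * np.log(m) - 2 * ll
```

For non-nested models the LR statistic no longer has the chi-squared reference distribution, but the AIC/BIC values computed this way can still be compared directly.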
Whether R² = .3 is better than .25 also depends largely on your field, substantive theory, and the variables in the models. If m1 has only one parameter but m2 has 20, the first may be preferable. If m1 misses a theoretically very relevant predictor, you may choose m2 even though an increase of .05 may not sound like much.
Just to make sure: your predictedValues is just a dummy for one or more predictors, i.e., variables, right? Because otherwise m3 does not make much sense, to me at least.
Question 1
If your outcome variable is integrated, you might consider using a single-equation generalized error correction model (GECM) as per Banerjee et al. (1993) and De Boef (2001), as this model is agnostic to the stationarity of the predictors.
You might evaluate the stationarity of your outcome using:
$\log{(GDP/Labor)_{ti}} \sim \rho_{i}\log{(GDP/Labor)_{t-1i}} + \zeta_{ti} + \mu_{\rho_{i}}$,
where:
$\zeta_{ti}$ measures all disturbances to $\log{(GDP/Labor)_{ti}}$ in each time $t$ (assumed distributed normal), and
$\mu_{\rho_{i}}$ measures state-level variation in $\log{(GDP/Labor)_{ti}}$ (assumed distributed normal).
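The per-unit check above amounts to estimating an AR(1) coefficient for each panel unit. A minimal sketch (simulated data; the function name and the simulation are assumptions, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500

def ar1_rho(y):
    """OLS estimate of rho in y_t = const + rho * y_{t-1} + error, for one panel unit."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    beta = np.linalg.lstsq(X, y[1:], rcond=None)[0]
    return beta[1]

def simulate(rho):
    """Simulate one AR(1) series of length T with i.i.d. normal shocks."""
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = rho * y[t - 1] + rng.normal()
    return y

# A near-integrated series (true rho = 0.98) vs. a clearly stationary one (true rho = 0.3)
rho_near_unit = ar1_rho(simulate(0.98))
rho_stationary = ar1_rho(simulate(0.3))
```

In practice you would run this on each unit's $\log{(GDP/Labor)}$ series and look at how close the estimated $\rho_{i}$ sit to 1 (a formal panel unit-root test is of course preferable to eyeballing).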
If $|\rho_{i}| \approx 1$, then you've got nearly integrated data and the GECM is appropriate; it also has the attractive property of disentangling long-run effects from both instantaneous short-run effects and lagged short-run effects.
The general form of the single equation GECM is:
$\Delta y_{t} = \beta_{0} + \beta_{c}\left[y_{t-1}-\left(\mathbf{X}_{t-1}\right)\right] + \mathbf{B}_{\Delta\mathbf{X}}\Delta\mathbf{X}_{t} + \mathbf{B}_{\mathbf{X}}\mathbf{X}_{t-1} + \varepsilon$,
where:
$\Delta$ is the first difference operator (e.g. $\Delta y_{t} = y_{t} - y_{t-1}$), and $\varepsilon$ may be decomposed into mixed effects (e.g. by including $\beta_{0i}$, for country-level random intercepts).
instantaneous short-run effects are given by $\beta_{\Delta\mathbf{X}}$,
lagged short-run effects are given by $\beta_{\mathbf{X}} - \beta_{c} - \beta_{\Delta\mathbf{X}}$, and
long-run effects are given by $\left(\beta_{c}-\beta_{\mathbf{X}}\right)/\beta_{c}$.
This specification assumes a homogeneity of error correction processes. I haven't yet tried to derive a heterogeneous error correction specification...
In Stata you can perform Hadri's test for unit roots in panel data on the residuals of such a model, to check them for stationarity.
Question 2
I do not know that I can say much useful here.
Question 3
Time dummies can be included in the GECM (and presumably in other dynamic time series models); they are often used as indicators of, for example, policies going into effect. I have done something similar, but used time-varying proportions (rather than 0/1 indicator variables) to represent the portion of each time period during which a policy was in effect (e.g., some policies go into effect January 1, some July 1, some December 21, etc.). On the other hand, you don't have tons of data, so I suppose it depends on how many new variables you are adding.
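The proportion idea above can be computed in a few lines; this sketch (the function name is my own, hypothetical) gives the fraction of a calendar year during which a policy was in force:

```python
from datetime import date

def policy_fraction(effective: date, year: int) -> float:
    """Fraction of calendar `year` during which a policy effective on `effective`
    was in force -- a time-varying alternative to a 0/1 policy dummy."""
    start, end = date(year, 1, 1), date(year + 1, 1, 1)
    if effective >= end:
        return 0.0  # policy not yet in effect during this year
    if effective <= start:
        return 1.0  # policy in effect for the whole year
    return (end - effective).days / (end - start).days
```

A policy effective July 1 then contributes roughly 0.5 for that year, rather than forcing an all-or-nothing coding.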
References:
Banerjee, A., Dolado, J. J., Galbraith, J. W., and Hendry, D. F. (1993). Co-integration, error correction, and the econometric analysis of non-stationary data. Oxford University Press, USA.
De Boef, S. (2001). Modeling equilibrium relationships: Error correction models with strongly autoregressive data. Political Analysis, 9(1):78–94.
Best Answer
As far as I'm aware, plm fits a linear model, and I think it uses OLS estimation rather than maximum likelihood. If that's right, you have no likelihood to test in that way. If it is OLS-based, you could compare the adjusted R-squared values and use the better one, provided your model assumptions hold.
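For reference, adjusted R-squared is just R-squared with a degrees-of-freedom penalty for extra regressors. A minimal sketch in plain numpy (the simulated data are an assumption for illustration; in R you would simply read the value off summary()):

```python
import numpy as np

def adj_r2(y, X):
    """Adjusted R-squared of an OLS fit of y on X; penalizes additional regressors."""
    n, k = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1 - (ss_res / (n - k)) / (ss_tot / (n - 1))

rng = np.random.default_rng(3)
n = 150
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

r2_null = adj_r2(y, np.ones((n, 1)))                   # intercept only
r2_full = adj_r2(y, np.column_stack([np.ones(n), x]))  # intercept + predictor
```

Because of the penalty, adding a regressor raises adjusted R-squared only when it improves fit more than a degree of freedom's worth, which is what makes it a (rough) basis for comparing models of different sizes.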