I have data on 146 institutions over 15 years. Not all of them have data for every time point, so the panel is unbalanced. How can I run a unit-root test in Stata or EViews on this unbalanced panel dataset?
Solved – Unit-root test for unbalanced panel data
hypothesis testing, p-value, panel data, time series, unit root
Related Solutions
As of plm version 1.2-10 (2012-05-05), the unbalanced case does not appear to be supported. Edit: unbalanced panel data are handled as of version 2.2-2 of plm on CRAN (2020-02-21).
The rest of this answer assumes version 1.2-10:
I've looked at the code, and the final data-preparation line (whatever your initial argument is) is the following:
object <- as.data.frame(split(object, id))
If you pass an unbalanced panel, this line makes it balanced by repeating the values of the shorter series. If the time series lengths in your unbalanced panel divide one another, not even an error message is produced. Here is the example from the purtest help page:
> data(Grunfeld)
> purtest(inv ~ 1, data = Grunfeld, index = "firm", pmax = 4, test = "madwu")
Maddala-Wu Unit-Root Test (ex. var. : Individual Intercepts )
data: inv ~ 1
chisq = 47.5818, df = 20, p-value = 0.0004868
alternative hypothesis: stationarity
This panel is balanced:
> unique(table(Grunfeld$firm))
[1] 20
Unbalance it:
> gr <- subset(Grunfeld, !(firm %in% c(3,4,5) & year <1945))
Two different time series lengths in the panel:
> unique(table(gr$firm))
[1] 20 10
No error message:
> purtest(inv ~ 1, data = gr, index = "firm", pmax = 4, test = "madwu")
Maddala-Wu Unit-Root Test (ex. var. : Individual Intercepts )
data: inv ~ 1
chisq = 86.2132, df = 20, p-value = 3.379e-10
alternative hypothesis: stationarity
Another unbalanced panel:
> gr <- subset(Grunfeld, !(firm %in% c(3,4,5) & year <1940))
> unique(table(gr$firm))
[1] 20 15
And the error message:
> purtest(inv ~ 1, data = gr, index = "firm", pmax = 4, test = "madwu")
Error in data.frame(`1` = c(317.6, 391.8, 410.6, 257.7, 330.8, 461.2, :
arguments imply differing number of rows: 20, 15
The null hypothesis of this test is that all panels contain a unit root. Given your results, we reject this hypothesis. For each of the test statistics P, Z, L* and Pm, the output reports a value (77.8047, -7.2246, and so on) and, in the next column, its p-value. Since all of these p-values are smaller than 0.01, you can reject the null hypothesis at the 1% level of statistical significance. This means that at least one panel is stationary under the given test conditions (panel means and time trend included); it does not show that every panel is free of a unit root. This should also answer your second question, because the p-value tells you the smallest level of statistical significance at which you can reject the null. If you would like more details on p-values, have a look at these notes (lecture1, lecture2).
Best Answer
It depends on the type of unbalancedness. Are your time series of different lengths, but without missing values in between (often the case in macro panels)? Then there is no need for interpolation within any single time series.
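To see which kind of unbalancedness you have, you can check for each unit whether its observed periods form one unbroken run. The following is only a minimal Python sketch using the standard library; the function name `classify_unbalancedness`, the toy panel, and the assumption that periods are integers (e.g. years) are all made up for illustration and are not part of Stata, EViews, or plm.

```python
from collections import defaultdict


def classify_unbalancedness(records):
    """Label each unit 'contiguous' or 'gaps' based on its observed periods.

    `records` is an iterable of (unit, period) pairs with integer periods.
    A unit is 'contiguous' if it is observed in every period between its
    first and its last observation, whatever the length of that stretch.
    """
    periods = defaultdict(set)
    for unit, period in records:
        periods[unit].add(period)

    result = {}
    for unit, seen in periods.items():
        span = max(seen) - min(seen) + 1  # number of periods from first to last
        result[unit] = "contiguous" if len(seen) == span else "gaps"
    return result


# Hypothetical toy panel: A is complete, B starts late, C misses 2003.
toy = [("A", year) for year in range(2000, 2006)]
toy += [("B", year) for year in range(2003, 2006)]
toy += [("C", year) for year in (2000, 2001, 2002, 2004, 2005)]

print(classify_unbalancedness(toy))
```

Units labelled `contiguous` can be tested individually as they are, even if their lengths differ; only units labelled `gaps` raise the question of how to treat holes inside a series.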
As you basically want to pool evidence against a unit root from different time series in a panel unit root test, it is enough to combine test statistics or $p$-values, which need not have been computed from time series of identical lengths.
You should then consider, for instance, $p$-value combination tests such as Fisher's test as explored here, viz. $-2\sum_{i=1}^N\ln(p_i)$, which, under the null and under independence of the units (a strong assumption!), follows a $\chi^2$-distribution with $2N$ degrees of freedom.
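As a rough illustration of how little is needed once the unit-level $p$-values are available, here is a minimal Python sketch of Fisher's combination. It assumes you have already obtained one unit-root $p$-value per unit (e.g. from separate ADF tests run in your package of choice); the function names and the example $p$-values are hypothetical. Because the degrees of freedom $2N$ are always even, the $\chi^2$ tail probability has a closed form, so only the standard library is used.

```python
import math


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    For df = 2N the tail equals exp(-x/2) * sum_{k=0}^{N-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


def fisher_combination(p_values):
    """Return (statistic, df, combined p-value) for Fisher's test.

    Valid only if the unit-level p-values are independent.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one p-value, all in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    df = 2 * len(p_values)
    return statistic, df, chi2_sf_even(statistic, df)


# Hypothetical unit-level p-values from series of different lengths.
stat, df, p_combined = fisher_combination([0.04, 0.20, 0.01, 0.35, 0.08])
print(stat, df, p_combined)
```

Nothing in this calculation depends on the lengths of the individual series, which is why the approach carries over to unbalanced panels. As a sanity check on the tail formula, `chi2_sf_even(47.5818, 20)` gives roughly 0.00049, in line with the Maddala-Wu output quoted earlier for the balanced Grunfeld data.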
An alternative would be Simes' test, i.e. reject if there exists an ordered $p$-value $p_{(i)}$ such that $p_{(i)}<\alpha\cdot i/N$. It is also valid under certain types of dependence and has been investigated as a panel unit root test by, um, me.
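The Simes rule is equally short to write down. The sketch below is a minimal Python illustration of the rejection rule as stated above (with the strict inequality); the function name `simes_rejects` and the example $p$-values are hypothetical, and the unit-level $p$-values are again assumed to have been computed beforehand.

```python
def simes_rejects(p_values, alpha=0.05):
    """Return True if Simes' test rejects the joint null at level alpha.

    Rejects if some ordered p-value p_(i) satisfies p_(i) < alpha * i / N.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    n = len(p_values)
    ordered = sorted(p_values)
    return any(p < alpha * i / n for i, p in enumerate(ordered, start=1))


# Hypothetical unit-level p-values for N = 4 units.
print(simes_rejects([0.02, 0.024, 0.5, 0.9], alpha=0.05))
print(simes_rejects([0.02, 0.03, 0.5, 0.9], alpha=0.05))
```

In the first call the smallest $p$-value (0.02) does not fall below $\alpha/N = 0.0125$, but the second smallest (0.024) falls below $2\alpha/N = 0.025$, so the rule rejects. In the second call no ordered $p$-value falls below its threshold, so it does not.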