Consider two time series, $x_{1t}$ and $x_{2t}$, which are both $I\left(1\right)$, i.e. both series contain a unit root. If these two series cointegrate, then there exist coefficients $\mu$ and $\beta_{2}$ such that
$\\$
$x_{1t}=\mu+\beta_{2}x_{2t}+u_{t}\quad\left(1\right)$
$\\$
defines an equilibrium relationship, with $u_{t}$ stationary. To test for cointegration using the Engle-Granger two-step approach we would:
$\\$
1) Test the series $x_{1t}$ and $x_{2t}$ for unit roots. If both are $I\left(1\right)$ then proceed to step 2).
$\\$
2) Run the static regression in Eq. $\left(1\right)$ and save the residuals. Define the “error correction” term as these residuals, $\hat{u}_{t}=\hat{ecm}_{t}$.
$\\$
3) Test the residuals ($\hat{ecm}_{t}$) for a unit root. This is a test of the null of no cointegration, since under the null hypothesis the residuals are nonstationary; if there is cointegration, the residuals should be stationary. Remember that the distribution of the residual-based ADF test is not the same as the usual DF distributions: it depends on the number of estimated parameters in the static regression above, since additional variables in the static regression shift the DF distributions to the left. With one estimated slope parameter in the static regression, the 5% critical values are -3.34 when the regression includes a constant and -3.78 when it includes a constant and a trend.
$\\$
4) If you reject the null of a unit root in the residuals (the null of no cointegration), you conclude that the data support cointegration between the two variables.
$\\$
5) If you want to set up an error-correction model and investigate the long-run relationship between the two series, I would recommend estimating an ADL or ECM model instead. The Engle-Granger static regression suffers from small-sample bias, and we cannot say anything about the significance of its estimated parameters because their distribution depends on unknown parameters.
$\\$
To answer your questions:
$\\$
(1) As seen above, your method is correct. I just wanted to point out that the critical values of the residual-based test are not the same as the usual ADF critical values.
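
The Engle-Granger steps listed above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the data are simulated so that the two series cointegrate by construction, the unit-root pre-tests of step 1) are skipped, the residual test is a plain Dickey-Fuller regression without lag augmentation, and the helper names `ols` and `df_tstat` are made up for this example. The critical value is the -3.34 quoted above for a static regression with a constant.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals for y = X b + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

def df_tstat(u):
    """t-statistic on rho in du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    du, lag = np.diff(u), u[:-1]
    rho = (lag @ du) / (lag @ lag)
    resid = du - rho * lag
    s2 = (resid @ resid) / (len(du) - 1)
    return rho / np.sqrt(s2 / (lag @ lag))

rng = np.random.default_rng(0)
T = 500
x2 = np.cumsum(rng.standard_normal(T))        # I(1): a pure random walk
x1 = 1.0 + 0.5 * x2 + rng.standard_normal(T)  # cointegrated with x2 by construction

# Step 2: static regression x1 = mu + beta2 * x2 + u, keep the residuals
X = np.column_stack([np.ones(T), x2])
(mu_hat, beta2_hat), ecm_hat = ols(x1, X)

# Step 3: unit-root test on the residuals against the residual-based 5% value
t_stat = df_tstat(ecm_hat)
CRIT_5PCT_CONSTANT = -3.34  # value quoted in the text, not the usual DF value
reject_no_cointegration = t_stat < CRIT_5PCT_CONSTANT
print(mu_hat, beta2_hat, t_stat, reject_no_cointegration)
```

With real data you would normally add lagged differences to the residual regression (the "A" in ADF) and take the critical value from tables that match the number of regressors and the sample size.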
$\\$
$\\$
(2) If one of the series is stationary, i.e. $I\left(0\right)$, and the other one is $I\left(1\right)$, they cannot be cointegrated. Cointegration implies that the series share a common stochastic trend, so that a linear combination of them is stationary because the stochastic trends cancel. To see this consider the two equations:
$\\$
$x_{1t}=\mu+\beta_{2}x_{2t}+\varepsilon_{1t}\quad\left(2\right)$
$\\$
$\Delta x_{2t}=\varepsilon_{2t}\quad\left(3\right)$
$\\$
Note that $\varepsilon_{2t}\sim i.i.d.
$, $x_{1t}\sim I\left(1\right)
$, $x_{2t}\sim I\left(1\right)
$, $u_{t}=\beta\prime x_{t}\sim I\left(0\right)
$, $\varepsilon_{1t}\sim i.i.d.
$
$\\$
First we solve equation $\left(3\right)$ by recursive substitution and get
$\\$
$x_{2t}=x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}
$
$\\$
Plug this solution into equation $\left(2\right)
$ to get:
$\\$
$x_{1t}=\mu+\beta_{2}\left\{ x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}\right\} +\varepsilon_{1t}$
$\\$
$x_{1t}=\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}$
$\\$
We see that the two series share a common stochastic trend. We can then define a cointegrating vector $\beta=\left(1\;-\beta_{2}\right)\prime$ such that:
$\\$
$u_{t}=\beta\prime x_{t}=\left(1\;-\beta_{2}\right)\left(\begin{array}{c}
\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}\\
x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}
\end{array}\right)
$
$\\$
$u_{t}=\beta\prime x_{t}=\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}-\beta_{2} x_{0}-\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}
$
$\\$
$u_{t}=\beta\prime x_{t}=\mu+\varepsilon_{1t}
$
We see that by defining a correct cointegrating vector the two stochastic trends cancel and the relationship between them is stationary ($u_{t}=\beta\prime x_{t}\sim I\left(0\right)$). If $x_{1t}$ were $I\left(0\right)$, then the stochastic trend in $x_{2t}$ would not be cancelled by any cointegrating relationship. So yes, you need both your series to be $I\left(1\right)$!
$\\$
$\\$
(3) The last question. Yes, OLS is valid for the two stochastic series: it can be shown that the OLS estimator for the static regression (Eq. $\left(1\right)$) is super consistent (its variance converges to zero at rate $T^{-2}$) when both series are $I\left(1\right)$ and they cointegrate. So if you find cointegration and your series are $I\left(1\right)$, your estimates will be super consistent. If you do not find cointegration, the static regression will not be consistent. For further reading see the seminal paper by Engle and Granger (1987), Co-Integration and Error Correction: Representation, Estimation, and Testing.
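
Super consistency can be illustrated with a small Monte Carlo experiment. This is a rough sketch on simulated cointegrated pairs, with a made-up helper `slope_rmse` and arbitrary sample sizes. Quadrupling $T$ should cut the root-mean-square error of the slope by a factor in the vicinity of 4 if the estimator converges at rate $T$, compared with about 2 under the usual $\sqrt{T}$ rate.

```python
import numpy as np

def slope_rmse(T, reps, rng, beta2=0.5):
    """RMSE of the OLS slope in x1 = mu + beta2*x2 + u over simulated cointegrated pairs."""
    errs = np.empty(reps)
    for r in range(reps):
        x2 = np.cumsum(rng.standard_normal(T))          # I(1) regressor
        x1 = 1.0 + beta2 * x2 + rng.standard_normal(T)  # cointegrated by construction
        X = np.column_stack([np.ones(T), x2])
        b, *_ = np.linalg.lstsq(X, x1, rcond=None)
        errs[r] = b[1] - beta2
    return np.sqrt(np.mean(errs ** 2))

rng = np.random.default_rng(2)
rmse_small = slope_rmse(T=100, reps=500, rng=rng)
rmse_large = slope_rmse(T=400, reps=500, rng=rng)
ratio = rmse_small / rmse_large  # near 4 suggests rate T, near 2 suggests rate sqrt(T)
print(rmse_small, rmse_large, ratio)
```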
Best Answer
...is to find a stationary combination of the integrated variables at hand. If there are just two variables, then it will be the stationary combination as it must be unique.
Because when the stationary combination is unique, any other linear combination will be nonstationary regardless of whether the integrated variables at hand are cointegrated or not.
Yes. If $u_t$ obtained from the first step of the Engle-Granger procedure is nonstationary, then $x_t$ and $y_t$ are not cointegrated.
Because that is very restrictive and need not hold in presence of cointegration. If two series $y_t$ and $x_t$ are cointegrated, there exists a $\beta$ such that $y_t-\beta x_t=u_t$ where $u_t$ is stationary. But there is no requirement that $\beta=1$.
See the argument above. But when could $z_t$ be more useful than $u_t$? Say we have the following subject-matter knowledge of $\beta$: if cointegration is present (which we are not sure about and want to test for), then $\beta=1$. Then we can just take $\beta=1$, which implies using $z_t$ in place of $u_t$, and test $z_t$ for a unit root. If, on the other hand, we ignored this knowledge and used $u_t$ rather than $z_t$, the estimated $\beta$ could by chance fit too well in a small sample, so that $u_t$ appears stationary even though $y_t$ and $x_t$ are not cointegrated.
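
The risk of a "too good" fit can be illustrated by simulation. The sketch below rests on several assumptions: $z_t$ is taken to be $y_t-x_t$ (as implied by setting $\beta=1$), the series are independent random walks and hence not cointegrated, the unit-root test is a plain Dickey-Fuller regression without lags, and both statistics are compared against the same standard DF 5% value of -2.86. The helper `df_tstat` is made up for this example.

```python
import numpy as np

def df_tstat(u, constant):
    """Dickey-Fuller t-statistic on rho in du_t = [a +] rho*u_{t-1} + e_t (no lags)."""
    du, lag = np.diff(u), u[:-1]
    X = np.column_stack([np.ones_like(lag), lag]) if constant else lag[:, None]
    b, *_ = np.linalg.lstsq(X, du, rcond=None)
    resid = du - X @ b
    s2 = (resid @ resid) / (len(du) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return b[-1] / np.sqrt(cov[-1, -1])

rng = np.random.default_rng(3)
T, reps = 100, 1000
DF_CRIT_5PCT_CONSTANT = -2.86  # standard asymptotic DF 5% value with a constant

reject_z = reject_u = 0
for _ in range(reps):
    # Two independent random walks: NOT cointegrated by construction
    x = np.cumsum(rng.standard_normal(T))
    y = np.cumsum(rng.standard_normal(T))

    z = y - x                                  # beta fixed at 1 from prior knowledge
    X = np.column_stack([np.ones(T), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ b                          # beta estimated by OLS

    reject_z += df_tstat(z, constant=True) < DF_CRIT_5PCT_CONSTANT
    reject_u += df_tstat(u_hat, constant=False) < DF_CRIT_5PCT_CONSTANT

rate_z, rate_u = reject_z / reps, reject_u / reps
print(rate_z, rate_u)
```

Because OLS picks $\beta$ to make the residuals as small as possible, `rate_u` tends to exceed `rate_z` when the usual DF critical value is applied to both. This is why residual-based tests need their own, more negative critical values.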