These concepts were developed to deal with regressions (and, for instance, correlations) between non-stationary series.
Clive Granger is the key author you should read.
Cointegration was introduced in two steps:
1/ Granger, C., and P. Newbold (1974): "Spurious Regressions in Econometrics," Journal of Econometrics, 2(2), 111-120.
In this article, the authors point out that a regression among non-stationary variables should be run on the changes (or log changes) of the variables. Otherwise you may find a high correlation that has no real significance (a spurious regression).
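To make the spurious-regression point concrete, here is a minimal simulation sketch (not taken from the paper). It regresses one random walk on another, independent one, first in levels and then in changes. The sample sizes, the seed and the helper `r_squared` are my own illustrative choices.

```python
import numpy as np

def r_squared(y, x):
    """R^2 of an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(0)
n_obs, n_sims = 500, 200  # illustrative sizes, not from the paper

r2_levels, r2_changes = [], []
for _ in range(n_sims):
    # Two random walks built from independent shocks: no true relationship.
    x = np.cumsum(rng.standard_normal(n_obs))
    y = np.cumsum(rng.standard_normal(n_obs))
    r2_levels.append(r_squared(y, x))                      # regression in levels
    r2_changes.append(r_squared(np.diff(y), np.diff(x)))   # regression in changes

mean_r2_levels = float(np.mean(r2_levels))
mean_r2_changes = float(np.mean(r2_changes))
print(f"mean R^2 in levels:  {mean_r2_levels:.3f}")
print(f"mean R^2 in changes: {mean_r2_changes:.3f}")
```

The levels regressions should show a sizeable average R^2 even though the series are unrelated by construction, while the regressions in changes should show almost none.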
2/ Engle, Robert F., Granger, Clive W. J. (1987) "Co-integration and error correction: Representation, estimation and testing", Econometrica, 55(2), 251-276.
In this article (for which Granger was recognized by the Nobel committee in 2003), the authors go further and introduce cointegration as a way to study the error correction model that can exist between two non-stationary variables.
Basically, the 1974 advice to regress the changes in the time series may lead to misspecified regression models. You can indeed have variables whose changes are uncorrelated but which are connected through an "error correction model".
Hence, you can have correlation without cointegration, and cointegration without correlation. The two are complementary.
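A small simulation can illustrate both cases. This is only a sketch under my own assumptions: the adjustment speed `alpha`, the shock correlation of 0.9, the seed and the helper `lag1_autocorr` are hypothetical choices, and the lag-1 autocorrelation of the gap is used as a rough stand-in for a proper stationarity test.

```python
import numpy as np

def lag1_autocorr(z):
    """Sample correlation between z_t and z_{t-1}."""
    return np.corrcoef(z[:-1], z[1:])[0, 1]

rng = np.random.default_rng(1)
n_obs, alpha = 5000, 0.2  # alpha: hypothetical error-correction speed

e1, e2, e3 = (rng.standard_normal(n_obs) for _ in range(3))
x2 = np.cumsum(e2)  # a pure random walk

# Case A: cointegration without correlation of changes.
# x1 reacts only to LAST period's gap to x2, plus its own independent shock.
x1 = np.zeros(n_obs)
for t in range(1, n_obs):
    x1[t] = x1[t - 1] + alpha * (x2[t - 1] - x1[t - 1]) + e1[t]
corr_a = np.corrcoef(np.diff(x1), np.diff(x2))[0, 1]
gap_a = x2 - x1  # follows an AR(1) with coefficient 1 - alpha, so it is stationary

# Case B: correlation of changes without cointegration.
# y is another random walk whose shocks are correlated (0.9) with those of x2.
y = np.cumsum(0.9 * e2 + np.sqrt(1 - 0.9**2) * e3)
corr_b = np.corrcoef(np.diff(y), np.diff(x2))[0, 1]
gap_b = x2 - y   # still a random walk, so no stationary combination here

ac_a, ac_b = lag1_autocorr(gap_a), lag1_autocorr(gap_b)
print(f"A: corr of changes {corr_a:+.3f}, lag-1 autocorr of gap {ac_a:.3f}")
print(f"B: corr of changes {corr_b:+.3f}, lag-1 autocorr of gap {ac_b:.3f}")
```

In case A the changes are close to uncorrelated while the gap mean-reverts; in case B the changes are strongly correlated while the gap wanders like a random walk.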
If there is only one paper to read, I suggest you start with this one, which is a very good introduction:
Murray, M. P. (1994), "A Drunk and Her Dog: An Illustration of Cointegration and Error Correction", The American Statistician, 48(1), 37-39.
First of all, consider two time series, $x_{1t}$ and $x_{2t}$, which are both $I\left(1\right)$, i.e. both series contain a unit root. If these two series cointegrate, then there exist coefficients $\mu$ and $\beta_{2}$ such that
$\\$
$x_{1t}=\mu+\beta_{2}x_{2t}+u_{t}\quad\left(1\right)$
$\\$
defines an equilibrium, with the deviation $u_{t}$ being stationary. In order to test for cointegration using the Engle-Granger 2-step approach we would:
$\\$
1) Test the series $x_{1t}$ and $x_{2t}$ for unit roots. If both are $I\left(1\right)$, then proceed to step 2).
$\\$
2) Run the regression defined in equation $\left(1\right)$ and save the residuals. I define a new “error correction” term, $\hat{u}_{t}=\hat{ecm}_{t}$.
$\\$
3) Test the residuals ($\hat{ecm}_{t}$) for a unit root. Note that this test is the same as a test for no cointegration, since under the null hypothesis the residuals are not stationary. If, however, there is cointegration, then the residuals should be stationary. Remember that the distribution of the residual-based ADF test is not the same as the usual DF distributions and depends on the number of estimated parameters in the static regression above, since additional variables in the static regression shift the DF distributions to the left. The 5% critical values for one estimated parameter in the static regression are -3.34 with a constant and -3.78 with a constant and trend.
$\\$
4) If you reject the null of a unit root in the residuals (null of no-cointegration) then you cannot reject that the two variables cointegrate.
$\\$
5) If you want to set up an error-correction model and investigate the long-run relationship between the two series, I would recommend setting up an ADL or ECM model instead. There is a small-sample bias attached to the Engle-Granger static regression, and we cannot say anything about the significance of the estimated parameters in the static regression, since their distribution depends on unknown parameters.
$\\$
To answer your questions:
$\\$
(1) As seen above, your method is correct. I just wanted to point out that the critical values of the residual-based test are not the same as the usual ADF-test critical values.
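The residual-based part of the procedure (steps 2 to 4) can be sketched in a few lines of NumPy. This is a simplified illustration, not a full implementation: it uses a plain Dickey-Fuller regression without lag augmentation, skips the step 1 pre-tests because the simulated series are $I(1)$ by construction, and the helpers `ols` and `df_tstat` as well as the data-generating values are my own assumptions. The only number taken from the text is the 5% critical value of -3.34 (constant, one regressor).

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def df_tstat(u):
    """t-statistic on rho in  du_t = rho * u_{t-1} + e_t  (no constant, no lags)."""
    du, u_lag = np.diff(u), u[:-1]
    rho = (u_lag @ du) / (u_lag @ u_lag)
    e = du - rho * u_lag
    s2 = (e @ e) / (len(du) - 1)
    return rho / np.sqrt(s2 / (u_lag @ u_lag))

CRIT_5PCT = -3.34  # residual-based critical value quoted in step 3 (constant case)

rng = np.random.default_rng(2)
T = 500
x2 = np.cumsum(rng.standard_normal(T))             # I(1) by construction
x1_coint = 1.0 + 0.5 * x2 + rng.standard_normal(T) # cointegrated with x2 (illustrative mu, beta2)
x1_indep = np.cumsum(rng.standard_normal(T))       # independent random walk

X = np.column_stack([np.ones(T), x2])
results = {}
for name, x1 in [("cointegrated", x1_coint), ("independent", x1_indep)]:
    coef, ecm_hat = ols(x1, X)       # step 2: static regression, save residuals
    tstat = df_tstat(ecm_hat)        # step 3: unit-root test on the residuals
    results[name] = (coef[1], tstat)
    verdict = "reject" if tstat < CRIT_5PCT else "do not reject"
    print(f"{name}: beta2_hat={coef[1]:.3f}, DF t={tstat:.2f} -> {verdict} no-cointegration")
```

For real data you would normally add lagged differences to the test regression (the "A" in ADF) and use critical values that match your deterministic terms and number of regressors.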
$\\$
$\\$
(2) If one of the series is stationary, i.e. $I\left(0\right)$, and the other one is $I\left(1\right)$, they cannot be cointegrated. Cointegration implies that the series share a common stochastic trend and that a linear combination of them is stationary because the stochastic trends cancel. To see this, consider the two equations:
$\\$
$x_{1t}=\mu+\beta_{2}x_{2t}+\varepsilon_{1t}\quad\left(2\right)$
$\\$
$\Delta x_{2t}=\varepsilon_{2t}\quad\left(3\right)$
$\\$
Note that $\varepsilon_{1t}\sim i.i.d.$, $\varepsilon_{2t}\sim i.i.d.$, $x_{1t}\sim I\left(1\right)$, $x_{2t}\sim I\left(1\right)$ and $u_{t}=\beta^{\prime}x_{t}\sim I\left(0\right)$.
$\\$
First we solve equation $\left(3\right)$ and get
$\\$
$x_{2t}=x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}$
$\\$
Plug this solution into equation $\left(2\right)$ to get:
$\\$
$x_{1t}=\mu+\beta_{2}\left\{ x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}\right\} +\varepsilon_{1t}$
$\\$
$x_{1t}=\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}$
$\\$
We see that the two series share a common stochastic trend. We can then define a cointegrating vector $\beta=\left(1\;-\beta_{2}\right)^{\prime}$ such that:
$\\$
$u_{t}=\beta^{\prime}x_{t}=\left(1\;-\beta_{2}\right)\left(\begin{array}{c}
\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}\\
x_{0}+\sum_{i=0}^{t}\varepsilon_{2i}
\end{array}\right)$
$\\$
$u_{t}=\beta^{\prime}x_{t}=\mu+\beta_{2}x_{0}+\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}+\varepsilon_{1t}-\beta_{2}x_{0}-\beta_{2}\sum_{i=0}^{t}\varepsilon_{2i}$
$\\$
$u_{t}=\beta^{\prime}x_{t}=\mu+\varepsilon_{1t}$
We see that by defining a correct cointegrating vector the two stochastic trends cancel and the relationship between the series is stationary ($u_{t}=\beta^{\prime}x_{t}\sim I\left(0\right)$). If $x_{1t}$ were $I\left(0\right)$, then the stochastic trend in $x_{2t}$ would not be removed by any cointegrating relationship. So yes, you need both of your series to be $I\left(1\right)$!
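The cancellation can also be checked numerically. The sketch below simulates equations $(2)$ and $(3)$ and forms $u_t$ with the vector $(1\;-\beta_2)$. The values of `mu`, `beta2`, `x0`, the sample size and the seed are illustrative assumptions, and the series `z` is a hypothetical $I(0)$ replacement for $x_{1t}$.

```python
import numpy as np

rng = np.random.default_rng(3)
T, mu, beta2, x0 = 2000, 1.0, 0.5, 0.0  # illustrative values

eps1 = rng.standard_normal(T)
eps2 = rng.standard_normal(T)

x2 = x0 + np.cumsum(eps2)      # solution of equation (3): accumulated shocks
x1 = mu + beta2 * x2 + eps1    # equation (2)

# Apply the cointegrating vector (1, -beta2): the stochastic trend cancels.
u = x1 - beta2 * x2
print("u equals mu + eps1:", np.allclose(u, mu + eps1))
print(f"std of u: {u.std():.2f}")

# If the first series were I(0) instead, the same combination keeps the trend of x2.
z = rng.standard_normal(T)     # hypothetical stationary series
v = z - beta2 * x2
print(f"std of v: {v.std():.2f}")
```

The spread of `u` stays at the scale of the shocks, while `v` inherits the random walk in `x2` and its spread keeps growing with the sample length.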
$\\$
$\\$
(3) The last question. Yes, OLS is valid for the two stochastic series, since it can be shown that the OLS estimator for the static regression (Eq. $\left(1\right)$) is super-consistent (its variance converges to zero at rate $T^{-2}$) when both series are $I\left(1\right)$ and cointegrate. So if you find cointegration and your series are $I\left(1\right)$, your estimates will be super-consistent. If you do not find cointegration, then the static regression will not be consistent. For further reading, see the seminal paper by Engle and Granger (1987), "Co-integration and Error Correction: Representation, Estimation and Testing".
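Super-consistency can be seen in a small Monte Carlo experiment. This is a rough sketch under my own assumptions (the helper `slope_error`, the sample sizes, the number of replications and the true values are all illustrative). If the estimation error shrinks like $1/T$, multiplying $T$ by 16 should cut the typical error by roughly a factor of 16, rather than the factor of 4 implied by the usual $\sqrt{T}$ rate.

```python
import numpy as np

def slope_error(T, rng, mu=1.0, beta2=0.5):
    """Simulate a cointegrated pair and return beta2_hat - beta2 from the static OLS."""
    x2 = np.cumsum(rng.standard_normal(T))
    x1 = mu + beta2 * x2 + rng.standard_normal(T)
    X = np.column_stack([np.ones(T), x2])
    coef, *_ = np.linalg.lstsq(X, x1, rcond=None)
    return coef[1] - beta2

rng = np.random.default_rng(4)
n_sims = 500
median_abs_err = {}
for T in (100, 400, 1600):
    errs = np.array([slope_error(T, rng) for _ in range(n_sims)])
    median_abs_err[T] = float(np.median(np.abs(errs)))
    print(f"T={T:5d}: median |beta2_hat - beta2| = {median_abs_err[T]:.5f}")

ratio = median_abs_err[100] / median_abs_err[1600]
print(f"error ratio T=100 vs T=1600: {ratio:.1f}")
```

The median absolute error is used instead of the variance only to keep the summary robust to the occasional extreme draw.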
Best Answer
An $I(0)$ and an $I(1)$ time series cannot be cointegrated: no linear combination of the two is stationary (other than the trivial one that puts zero weight on the $I(1)$ series), and by definition two series are cointegrated only if such a stationary combination exists.
I think you should fit a VAR with the stationary variable in levels and the non-stationary variable in first differences.
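As a rough sketch of that suggestion, the snippet below fits a VAR(1) by equation-wise OLS on the stationary series in levels and the $I(1)$ series in first differences. The simulated data, the coefficients 0.5 and 0.3, and the variable names are hypothetical; with real data you would also need to choose the lag order.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2000
z = np.zeros(T)  # stationary AR(1) series, used in levels
x = np.zeros(T)  # I(1) series, used in first differences
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + rng.standard_normal()
    x[t] = x[t - 1] + 0.3 * z[t - 1] + rng.standard_normal()

# Stack the VAR variables: z_t in levels, x_t in first differences.
Y = np.column_stack([z[1:], np.diff(x)])

# VAR(1): Y_t = c + A @ Y_{t-1} + e_t, estimated by OLS equation by equation.
lhs = Y[1:]
rhs = np.column_stack([np.ones(len(Y) - 1), Y[:-1]])
coef, *_ = np.linalg.lstsq(rhs, lhs, rcond=None)
c_hat = coef[0]
A_hat = coef[1:].T  # A_hat[i, j]: effect of variable j at t-1 on variable i at t

print("intercepts:", np.round(c_hat, 3))
print("A_hat:\n", np.round(A_hat, 3))
```

Here `A_hat[0, 0]` should land near 0.5 and `A_hat[1, 0]` near 0.3, the values used to generate the data.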
Good luck!