Time Series – OLS vs ML Estimation of VECM

Tags: cointegration, estimation, time series, vector-error-correction-model

A vector error correction model (VECM) has an equivalent vector autoregression (VAR) representation.

(VECM) $\;\;\;\Delta y_t=\Pi y_{t-1}+\Gamma_1\Delta y_{t-1}+…+\Gamma_{p-1}\Delta y_{t-(p-1)}+\varepsilon_t$

(VAR) $\;\;\;\;\;\;\;\; y_t=A_1 y_{t-1}+…+A_p y_{t-p}+\varepsilon_t$

where on one hand

(A) $\;\;\Pi=-(I-A_1-…-A_p) \;$ and $\;\;\Gamma_i=-(A_{i+1}+…+A_p)$

while on the other hand

(B) $\;\; A_1=\Pi+I+\Gamma_1$, $\;A_i=\Gamma_i-\Gamma_{i-1}$ for $i=2,…,p-1$, and $A_p=-\Gamma_{p-1}$.
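
The mappings (A) and (B) can be checked numerically. The following is a minimal sketch, assuming the VAR matrices are given as a Python list `[A_1, ..., A_p]`; the function names `var_to_vecm` and `vecm_to_var` are hypothetical and chosen only for illustration.

```python
import numpy as np

def var_to_vecm(A):
    """Map VAR matrices [A_1, ..., A_p] to (Pi, [Gamma_1, ..., Gamma_{p-1}]) using (A)."""
    p = len(A)
    k = A[0].shape[0]
    Pi = -(np.eye(k) - sum(A))
    # A[j] holds A_{j+1}, so Gamma_i = -(A_{i+1} + ... + A_p) = -sum(A[i:])
    Gammas = [-sum(A[i:]) for i in range(1, p)]
    return Pi, Gammas

def vecm_to_var(Pi, Gammas):
    """Map (Pi, [Gamma_1, ..., Gamma_{p-1}]) back to [A_1, ..., A_p] using (B)."""
    k = Pi.shape[0]
    p = len(Gammas) + 1
    if p == 1:
        return [Pi + np.eye(k)]
    A = [Pi + np.eye(k) + Gammas[0]]              # A_1 = Pi + I + Gamma_1
    for i in range(2, p):                         # i = 2, ..., p-1
        A.append(Gammas[i - 1] - Gammas[i - 2])   # A_i = Gamma_i - Gamma_{i-1}
    A.append(-Gammas[-1])                         # A_p = -Gamma_{p-1}
    return A

# Round trip on arbitrary (made-up) coefficient matrices
rng = np.random.default_rng(0)
A = [rng.normal(size=(2, 2)) for _ in range(3)]
Pi, Gammas = var_to_vecm(A)
A_back = vecm_to_var(Pi, Gammas)
print(all(np.allclose(a, b) for a, b in zip(A, A_back)))
```

The round trip only confirms that (A) and (B) are inverse linear maps; it says nothing yet about the rank of $\Pi$.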

Compare the following estimation techniques (suppose the cointegration rank is given):

(1) Estimate the VECM representation by maximum likelihood (ML)
(2) Estimate the VECM representation by ordinary least squares (OLS)
(3) Estimate the equivalent VAR representation by OLS with linear restrictions due to (B), then algebraically convert it to the VECM representation.

I am assuming normally distributed disturbances for simplicity.

Questions:

  1. Is (1) more efficient than (2)?
  2. Will (2) and (3) give exactly the same estimates?
  3. Should any of the alternatives, or maybe yet another approach, be generally preferred?

My guesses:

  1. The knowledge of the cointegration rank amounts to nonlinear restrictions on the $\Pi$ matrix. This can be fully utilized in (1) but not in (2) as OLS will not handle nonlinear restrictions. Therefore, I guess the knowledge of the cointegration rank will be ignored by OLS and thus (1) should be more efficient than (2).

    Edit: $\Pi=\alpha \beta'$ factors the cointegration matrix $\Pi$ into a loading matrix $\alpha$ and a matrix of cointegrating vectors $\beta'$. Perhaps what OLS does is use the matrix of estimated cointegrating vectors $\beta'$ from the Johansen procedure, so that only $\alpha$ remains to be estimated and there is no problem of nonlinear restrictions (actually, there are no restrictions at all in this case). If so, OLS is still not completely efficient because the cointegrating vectors are estimated (via the Johansen procedure) without taking into account all the dynamics implied by the VECM. But what about ML estimation? Does it rely on the Johansen procedure as an initial step, too? If so, then (1) and (2) would be equivalent.
  2. I guess that (2) and (3) will give identical estimates as they both use equivalent representations of the same model with the same number of parameters to be estimated by the same estimation technique.
  3. (1) should be generally preferred to (2) and (3) due to its efficiency as per my guess in point 1. (2) and (3) must be computationally faster than (1) since the latter will likely use numerical optimization. Also, (2) and (3) will not have convergence problems possibly found in (1).
    What about other alternatives?
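
Guess 2 can be checked numerically in the special case where no rank restriction is placed on $\Pi$: the VECM is then a linear reparametrization of the VAR, so OLS on either form should yield the same coefficients after conversion through (B). The sketch below assumes $p=2$, no deterministic terms, and made-up simulated data; it does not cover the rank-restricted case.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 300, 2

# Hypothetical data: two series sharing one random-walk trend, plus noise
trend = np.cumsum(rng.normal(size=T))
y = np.column_stack([trend + rng.normal(size=T),
                     0.5 * trend + rng.normal(size=T)])
dy = np.diff(y, axis=0)                      # dy[j] = y[j+1] - y[j]

# VAR(2) in levels by OLS: y_t on (y_{t-1}, y_{t-2})
Y = y[2:]
X_var = np.hstack([y[1:-1], y[:-2]])
B_var = np.linalg.lstsq(X_var, Y, rcond=None)[0].T
A1, A2 = B_var[:, :k], B_var[:, k:]

# VECM by OLS with unrestricted Pi: dy_t on (y_{t-1}, dy_{t-1})
dY = dy[1:]
X_vecm = np.hstack([y[1:-1], dy[:-1]])
B_vecm = np.linalg.lstsq(X_vecm, dY, rcond=None)[0].T
Pi, G1 = B_vecm[:, :k], B_vecm[:, k:]

# Compare through (B): A_1 = Pi + I + Gamma_1 and A_2 = -Gamma_1
print(np.allclose(A1, Pi + np.eye(k) + G1), np.allclose(A2, -G1))
```

Once $\Pi$ is restricted to reduced rank, the VECM regression is no longer a plain linear reparametrization of the unrestricted VAR, so this check does not settle the question for a given cointegration rank.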

Questions (continued):

  4. Does the ML estimator of the VECM use the matrix of estimated cointegrating vectors $\beta'$ from the Johansen procedure?
  5. Does the OLS estimator of the VECM use the matrix of estimated cointegrating vectors $\beta'$ from the Johansen procedure?

Edit: questions 4. and 5. are asked here, and should preferably be answered there. But since they are intimately related to the topic of this post, I still keep them visible here.

Best Answer

You are asking complicated questions, to which there are no clear-cut answers.

  1. Is (1) more efficient than (2)? Note that the Johansen ML estimator has a strange finite-sample distribution with no finite moments, and hence can show a very large dispersion in small samples. So it is most likely not more efficient than the two-step OLS estimator ("Granger 2-SLS"). On the other hand, "Granger 2-SLS" has a large bias in the first stage, which contaminates the second stage. A simple correction for that involves adding leads and lags in the first stage, as done by Saikkonen's estimator.

  2. Will (2) and (3) give exactly the same estimates? Hmm... I do not think you can use this procedure, as the restrictions due to (B) can obviously only be imposed with knowledge of (B). So the two will be the same only if you know (B) in advance, but then the restricted OLS procedure is of little interest.

  3. Should any of the alternatives, or maybe yet another approach, be generally preferred? There is no final answer to that; as always in statistics, it depends. There have been quite a lot of theoretical and Monte Carlo comparisons of these estimators: see Maddala and Kim (1998) for a discussion or, as one among others, Gonzalo (1994).

  4. and 5. You asked these questions in a separate post (Estimation of VECM via ML and OLS), so maybe you could remove them from this one?
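
To make the two estimators compared in point 1 concrete, here is a minimal sketch of both on simulated data. It assumes a bivariate system with cointegration rank 1, no lagged differences and no deterministic terms; the data-generating process and the function names (`simulate`, `two_step`, `johansen_rank1`) are hypothetical. Saikkonen's leads-and-lags correction is not implemented.

```python
import numpy as np

def simulate(T, rng):
    """Made-up system with beta = (1, -1)' and alpha = (-0.5, 0)'."""
    e = rng.normal(size=(T, 2))
    y2 = np.cumsum(e[:, 1])                  # random walk (common trend)
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = 0.5 * u[t - 1] + e[t, 0]      # stationary equilibrium error
    return np.column_stack([y2 + u, y2])

def two_step(y):
    """Two-step OLS: static levels regression, then OLS for the loadings."""
    y1, y2 = y[:, 0], y[:, 1]
    b = (y2 @ y1) / (y2 @ y2)                # stage 1: y1 on y2 in levels
    ect = (y1 - b * y2)[:-1]                 # lagged error correction term
    dy = np.diff(y, axis=0)
    alpha = (ect @ dy) / (ect @ ect)         # stage 2: dy_t on ect_{t-1}
    return np.array([1.0, -b]), alpha

def johansen_rank1(y):
    """Gaussian ML for rank 1 via reduced-rank regression (eigenvalue problem)."""
    R0, R1 = np.diff(y, axis=0), y[:-1]      # dy_t and y_{t-1}
    n = len(R0)
    S00, S01, S11 = R0.T @ R0 / n, R0.T @ R1 / n, R1.T @ R1 / n
    # Symmetrize the problem |lambda*S11 - S10 S00^{-1} S01| = 0 via Cholesky
    Linv = np.linalg.inv(np.linalg.cholesky(S11))
    M = Linv @ S01.T @ np.linalg.solve(S00, S01) @ Linv.T
    eigvals, eigvecs = np.linalg.eigh(M)     # eigenvalues in ascending order
    beta = Linv.T @ eigvecs[:, -1]           # vector for the largest eigenvalue
    beta = beta / beta[0]                    # normalize first element to 1
    alpha = S01 @ beta / (beta @ S11 @ beta) # loadings given beta
    return beta, alpha

rng = np.random.default_rng(0)
y = simulate(2000, rng)
beta_eg, alpha_eg = two_step(y)
beta_j, alpha_j = johansen_rank1(y)
print("two-step:", beta_eg, alpha_eg)
print("Johansen:", beta_j, alpha_j)
```

A single long sample like this only shows that both procedures recover similar values. Comparing their finite-sample bias and dispersion would require repeating the simulation many times with short samples, which is what the Monte Carlo studies cited in point 3 do.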

Refs: