Regression – Relationship Between Cointegration and Linear Regression

cointegration, regression, time series

If two non-stationary processes are cointegrated, that means some linear combination of the two processes is stationary. In a simple linear regression, we have the model form:

$y = b_0 + b_1x + e$

If we rearrange, we get

$(y - b_1x) = b_0 + e$

And thus the linear combination $y - b_1x$ is stationary with mean $b_0$ and variance $\sigma^2$. If $y$ and $x$ are stock prices, then $b_1$ is the hedge ratio.

So what are the similarities and differences between cointegration and simple linear regression? I am not seeing the big picture for cointegration yet, or why it is useful. The typical example of cointegration involves stock prices. Why not just take any two stock prices, run a linear regression between them, and check that the residuals satisfy the usual SLR assumptions, which basically means the residuals are stationary? Then we could use standard regression methods instead of an entirely new suite of cointegration tests and methods.

Best Answer

Cointegration and regression are quite different categories.

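Cointegration is a phenomenon observed in a time series context. Several time series are cointegrated if there exists a linear combination of them that is integrated of a lower order than the series themselves. For example, two I(1) series are cointegrated if some linear combination of them is I(0), i.e. stationary. (See also the tag description for cointegration.)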
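To make the definition concrete, here is a minimal simulation sketch. The data-generating process, the coefficients 2.0 and 0.5, and the helper name `half_variances` are all assumptions chosen for illustration; they do not come from the question. Both series wander like random walks, but one particular linear combination of them does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# x is a random walk, i.e. integrated of order 1, I(1).
x = np.cumsum(rng.normal(size=n))
# y shares x's stochastic trend plus stationary noise, so y is also I(1).
y = 2.0 + 0.5 * x + rng.normal(size=n)

# The linear combination y - 0.5 * x removes the shared trend and is I(0).
spread = y - 0.5 * x


def half_variances(series):
    """Sample variance of the first and second half of a series."""
    half = len(series) // 2
    return series[:half].var(), series[half:].var()


# For the random walk the two halves typically differ a lot;
# for the spread both should be close to the noise variance of 1.
print("x:     ", half_variances(x))
print("spread:", half_variances(spread))
```

In this setup the spread is just the constant 2.0 plus the noise term, so its mean and variance stay put over time, while the level of `x` (and therefore `y`) drifts.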

Regression has several meanings. The most relevant is perhaps the one in the tag description for regression, which describes it as techniques for analyzing the relationship between one (or more) "dependent" variables and "independent" variables.

The relationship between cointegration and regression is that one can use regression to analyze the relationship between several cointegrated variables.
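The "regress, then check the residuals" idea from the question is essentially the residual-based (Engle-Granger style) approach. A rough sketch of it follows, using only NumPy. Several things here are assumptions made for illustration: the simulated data, the helper names `ols` and `residual_df_stat`, the use of a plain Dickey-Fuller regression with no lagged differences, and the approximate asymptotic 5% critical value of -3.34 for two variables with a constant. The key point is that the residuals come from an estimated regression, so the usual t or standard Dickey-Fuller critical values do not apply, which is one reason dedicated cointegration tests exist.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients and residuals for y = X @ beta + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


def residual_df_stat(y, x):
    """Step 1: regress y on a constant and x.
    Step 2: Dickey-Fuller t-statistic for rho in
    d(e_t) = rho * e_{t-1} + u_t, computed on the step-1 residuals."""
    X = np.column_stack([np.ones_like(x), x])
    beta, e = ols(X, y)
    de, lag = np.diff(e), e[:-1]
    rho = (lag @ de) / (lag @ lag)
    u = de - rho * lag
    se = np.sqrt((u @ u) / (len(de) - 1) / (lag @ lag))
    return beta, rho / se


rng = np.random.default_rng(1)
n = 1000
x = np.cumsum(rng.normal(size=n))               # random walk
y_coint = 2.0 + 0.5 * x + rng.normal(size=n)    # cointegrated with x
y_indep = np.cumsum(rng.normal(size=n))         # unrelated random walk

CRIT_5PCT = -3.34  # approximate value, assumed here; not a standard t cutoff

for label, y in [("cointegrated", y_coint), ("independent", y_indep)]:
    beta, t_stat = residual_df_stat(y, x)
    verdict = "reject" if t_stat < CRIT_5PCT else "do not reject"
    print(f"{label}: slope={beta[1]:.3f}, DF t={t_stat:.2f}, "
          f"{verdict} 'no cointegration'")
```

For the independent pair, OLS will still report a slope, and such spurious regressions often look significant by ordinary standards. Only the unit-root behaviour of the residuals distinguishes the two cases.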

(Unlike in the simple case of cross-sectional data, standard regression estimators such as OLS in a naive regression of cointegrated variables on one another have some unusual properties, e.g. superconsistency: the estimates converge to the cointegrating coefficients at rate $T$ rather than the usual $\sqrt{T}$. A helpful regression model for cointegrated time series is the cointegrated (restricted) VAR and its alternative representation, the VECM, which clearly exposes the short- and long-run relationships between the variables.)
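As an illustration of how a VECM separates the two horizons, here is the simplest bivariate case written in the question's notation. The adjustment coefficients $\alpha_y$, $\alpha_x$ and the error terms $\varepsilon_{y,t}$, $\varepsilon_{x,t}$ are symbols introduced here for the sketch, and lagged differences are omitted for brevity:

$\Delta y_t = \alpha_y (y_{t-1} - b_0 - b_1 x_{t-1}) + \varepsilon_{y,t}$

$\Delta x_t = \alpha_x (y_{t-1} - b_0 - b_1 x_{t-1}) + \varepsilon_{x,t}$

The term in parentheses is last period's deviation from the long-run relationship $y = b_0 + b_1 x$, and the $\alpha$ coefficients describe how quickly each variable adjusts to close that gap. In a fuller model, lagged terms $\Delta y_{t-i}$ and $\Delta x_{t-i}$ on the right-hand side would capture the short-run dynamics.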
