Non-Stationary Series Correlations – Is It Okay for Descriptive Purposes?

correlationdifferencingstationaritytime series

I have been obsessed with trying to conduct an analysis in the “correct” way. I read that no time series analysis should be conducted on non stationary series. I found both my series have a unit root and therefore are non stationary, so I’ve been running analyses on the first differenced data.

My final question here is the following: if I am purely interested in describing the past relationship, with no intent of predicting the future, is is “okay” if I run correlation analysis on two non stationary series to identify time lags? Or should I cross correlate the first differenced data and stop asking so many questions?

I have zero interest in inferring a causal “relationship.” I only want to explain the co-movement. I’m interested to know the lag which maximizes the correlation between two interest rates. I.e. interest rate A changes now, that is correlated most with changes 2 months from now for interest rate B.

Best Answer

That is an interesting and nontrivial question. Based on some recent related threads by the OP, let me focus on the case where the correlation is spurious and its population counterpart is ill defined. This would be the case with a pair of unit-root, noncointegrated time series.

Whether dealing with description, prediction or something else, we routinely use statistical estimates to infer something about the corresponding estimands / population counterparts, thus to generalize.$^{*}$,$^{**}$ That shapes how we react to statistical estimates when presented with them. Given a sample correlation, people would often implicitly think about generalization and picture themselves a well-defined population counterpart.$^{***}$

Therefore, I would avoid reporting sample correlation when its population counterpart is ill defined – unless I knew very well what I am doing and had ample opportunity to explain myself in detail to the audience. Otherwise, the message that gets through might be quite different from the one you may be trying to send.

$^{*}$An exception could be information compression where we only care about the sample, trying to compress it without much loss in fidelity.
$^{**}$In the case of prediction, we do not focus on the estimands directly but employ knowledge about them to obtain predictions.
$^{***}$Most people would probably instinctively consider causation, too, but that is another topic.

Related Question