Solved – Do we need to detrend when computing the cross-correlation between two time series?

autocorrelation, cross correlation, time series

I have a group of time series variables and I want to find out the relationships among them. My method is to calculate the pairwise correlation between every two time series and pick out the pairs with high correlation values and statistical significance (P<0.05 && Q <0.05).

However, when I checked the literature, some papers mention that if the time series are themselves autocorrelated, the P value of the cross-correlation becomes unreliable and the correlation coefficient is inflated. I have confirmed this inflation in my data set by fitting an ARIMA model to detrend each series and then calculating the cross-correlation between the residuals.

My question is this: my aim is to find the pairs of time series that are strongly correlated, and series that share the same trend are exactly the pairs I want to find. Should I still compute the cross-correlation after detrending? I am afraid that some pairs with a common trend would be detected by the correlation analysis on the raw series but not after detrending.

Best Answer

A trend can certainly mess up your correlation matrix. For instance, here is some MATLAB code that demonstrates it.

x = randn(100,2);              % two independent white-noise series
corr(x)

ans =

    1.0000    0.1099
    0.1099    1.0000

subplot(2,1,1)
plot(x)

y = x + repmat((1:100)',1,2);  % add the same linear trend to both series
corr(y)

ans =

    1.0000    0.9991
    0.9991    1.0000

subplot(2,1,2)
plot(y)

[Figure: the noise series x (top panel) and the trended series y (bottom panel)]

All I did was generate random noise with and without a deterministic trend.
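A rough back-of-the-envelope check shows why the trend dominates. This is only a sketch: it assumes the two noise columns are independent with unit variance (as randn produces), ignores sampling noise, and writes t for the shared trend 1, ..., n, a symbol that is not in the code above.

$$
\operatorname{corr}(y_1, y_2) \approx \frac{\operatorname{Var}(t)}{\operatorname{Var}(t) + 1},
\qquad
\operatorname{Var}(t) = \frac{n(n+1)}{12} \approx 841.7 \quad \text{for } n = 100,
$$

which gives roughly 0.9988, in line with the 0.9991 reported by corr(y). The shared trend contributes almost all of the variance of each series, so it also accounts for almost all of the correlation.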

This effect is even more evident in the cross-correlation plots for these two cases:

>> subplot(2,1,1)
>> crosscorr(x(:,1),x(:,2))
>> subplot(2,1,2)
>> crosscorr(y(:,1),y(:,2))

[Figure: sample cross-correlation of the two x columns (top panel) and of the two y columns (bottom panel)]

Depending on your situation, you might be interested in the trend or in the noise. If you're interested in the noise, then the correlation matrix in the second case is meaningless: it's a spurious correlation. In that case differencing will help, as shown below:

>> corr(diff(y))

ans =

    1.0000    0.0479
    0.0479    1.0000

You can see that differencing brings us back to a reasonable result: essentially no correlation, which is what independent noise should give. Differencing doesn't work in every case, of course.
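As one possible alternative when differencing is not a good fit, a linear trend can be removed directly. The following is only a minimal sketch under the same assumptions as the example above (independent noise plus a shared deterministic linear trend). The variable names r_trend, r_detrended and r_diff are made up for illustration, and rng(0) is there only to make the run repeatable.

```matlab
rng(0);                          % fix the seed so the run is repeatable
n = 100;
x = randn(n, 2);                 % two independent noise series
y = x + repmat((1:n)', 1, 2);    % same noise plus a shared linear trend

r_trend     = corr(y);           % dominated by the shared trend
r_detrended = corr(detrend(y));  % best-fit line removed from each column
r_diff      = corr(diff(y));     % first differences, as in the answer

% Off-diagonal entries: with trend, after detrending, after differencing
disp([r_trend(1,2), r_detrended(1,2), r_diff(1,2)])
```

Note that diff(y) equals diff(x) plus a constant here, so differencing the trended series gives exactly the same correlation as differencing the noise. If the shared trend is the thing you want to detect, as in the question, the correlation of the raw series (or of the fitted trends) is the quantity to look at, and the detrended correlation answers a different question.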
