Time Series Analysis – How to Calculate Coherence Between Two Time Series

rspectral analysistime series

I originally posted this on stackoverflow.com and then deleted it and moved it here

My question is similar to
Similarity of two discrete fourier tranforms (specifically the selected answer). I've also gleaned some helpful information from this R help thread.

I'm stuck on the actual implementation of this process, however.

I have two Fourier series; my ultimate goal is to calculate a sum of weighted coherence.
(In actuality I have millions of series corresponding to pixels, but, let's just call it two for now).

    TS1 <- c(5051.29, 5355.31, 5602.18, 5784.4, 5896.45, 5934.9, 5898.6, 
5788.64, 5608.37, 5363.27, 5060.78, 4710.09, 4321.86, 3907.89, 
3480.75, 3053.42, 2638.89, 2249.75, 1897.82, 1593.81, 1346.93, 
1164.71, 1052.67, 1014.21, 1050.52, 1160.47, 1340.74, 1585.84, 
1888.33, 2239.02, 2627.25, 3041.22, 3468.36, 3895.69, 4310.22, 
4699.36, 5051.29)
    TS2 <- c(4192.83, 4532.62, 4836.41, 5094.96, 5300.41, 5446.53, 5528.88, 
5544.95, 5494.25, 5378.33, 5200.71, 4966.78, 4683.65, 4359.93, 
4005.45, 3630.98, 3247.9, 2867.85, 2502.38, 2162.59, 1858.8, 
1600.25, 1394.8, 1248.67, 1166.33, 1150.26, 1200.95, 1316.87, 
1494.5, 1728.43, 2011.55, 2335.28, 2689.76, 3064.23, 3447.31, 
3827.36, 4192.83)

From the above links I see that I need to calculate coherence of the cross-spectrum at each frequency, and weight the frequencies that have higher power spectral density.

I know I can extract spectral density from

spectrum(data.frame(TS1, TS2))$spec

I know I can extract coherence using:

spectrum(data.frame(TS1, TS2))$coh

I can see that the value of coherence extracted is always equal to 1. It seems the "solution" to this is to lag the time series somehow.

I'm not sure how to 1) calculate optimal lag, especially because I'd like it to be comparable across all the coherence calculations, and 2) how to specify lag in the spectrum(). spec.pgram takes "spans", but those seem to have more to do with smoothing than offsetting.

I tried a maximum ccf function defined here, but in many cases (not in the example set I've provided, you'll have to trust me) the "optimal lag" returned is 0, which makes sense because the ccf is 1, but isn't actually what I'm looking for.

Should I be using acf() or ccf() to manually calculate coherence?
What steps (R functions would be really helpful) do I need to perform to get from where I am to having a single weighted coherence value?

Best Answer

The following are the issues that I summarize succinctly in sequence, as each of them are an interesting problem in itself and then follow it up with my solutions:

i) How to unlag two time-series so as to lose the least data due to the lagging, and at the same time to maximize the cross-correlation or spectral coherence.

ii) how would this be done when you are dealing with multivariate time-series. As it's greater than the two, its like solving a rubiks cube optimally while lagging various time-series in the dataset so as to preserve a summed up measure of (i).

iii) How to compute a weighted-coherence?

I will skip ii) and answer i) and iii) due to reasons of conflict of interest and research collaboration/confidential agreements on my work on ii). I understand that iii) happens to be the central question to start with followed by i).

My solution for iii)

Pre-Processing Steps: Say you had two time -series t1 and t2. Determine a CCF maximizing lag, say optLag and unlag t1 and t2 as unlaggedTS = lag(t2,optLag) and perform a time-series union of the dataset as unionTS = ts.union(t1,unlaggedTS). Following that, as the time-series unlagging would produce a few NA's as you lose entries by lagging, you need to set these values to zero in unionTS as required-meaning that nothing was observed at the NA points.

Step 2: Now, obtain an abinded time-series, with the span parameters as required.

bindTSPair= abind(((spectrum(unionTS,span=c(16,16),plot=F)$spec)[,1]),((spectrum(unionTS,span=c(16,16),plot=F)$spec)[,2])) 

Following it, just normalize the data in bindTSPair as say:

normalizedTS=round(data.Normalization(c(bindTSPair),type="n4"),6)

Then obtained the following two halves:

TSHalf1=normalizedTS[1:(length(bindTSPair)/2)]

TSHalf2=normalizedTS[((length(bindTSPair)/2)+1):(length(bindTSPair))]

and then finally obtain 'a weighted' version of spectral coherence as follows:

 weightedCoherence=sum((10*(TSHalf1 + TSHalf2)) * (spectrum(unionTS,span=c(10,10),plot=F)$coh))

You may change the weighting as required or using 'prior' info.

My solution for i) to obtain a CCF maximizing lag:

 for(i in 2:ncol(myTS))
{
    for(j in 2:ncol(myTS))
    {
r=ccf(myTS[,i],myTS[,j],lag.max=1000,plot=TRUE)
tp=sort((r$acf),index.return=TRUE)$ix
tn=sort((r$acf),index.return=TRUE,decreasing=T)$ix
cn=r$acf[tn[length(tn)]]
    cp=r$acf[tp[length(tp)]]
    if(abs(cp) > abs(cn))
    {
    temp=r$lag[tp[length(tp)]]
    	 tcc=cp
    	}
    		if(abs(cn) > abs(cp))
    		{
    		temp=r$lag[tn[length(tn)]]
    		tp=tn
    		 tcc=cn
    		}
    if(abs(cn) > abs(cp))
    {
    temp=r$lag[tp[length(tp)]]
}
show(i)
if(length(tp) != 0)
{

mlag[i,j]=temp
mcc[i,j]=tcc
}
    }

}

mlag=mlag[2:ncol(myTS),2:ncol(myTS)]
mlag=data.frame(mlag)