Solved – Testing significance of peaks in spectral density

Tags: hypothesis testing, r, spectral analysis, time series

We sometimes use a spectral density plot to analyze periodicity in a time series. Normally we inspect the plot visually and then try to draw a conclusion about the periodicity. But have statisticians developed any test to check whether spikes in the plot are statistically distinguishable from white noise? Have R experts developed any package for spectral density analysis and for that kind of test?

Best Answer

You should be aware that estimating power spectra using a raw periodogram is not recommended, and in fact has been bad practice since ~ 1896. It is an inconsistent estimator: its variance does not shrink as the number of samples grows, so even millions of data samples do not rescue it, and it is generally biased as well. The same applies to the standard estimates of autocorrelations (e.g. Bartlett's), since the two are Fourier transform pairs. Provided you are using a consistent estimator, there are some options available to you.
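A quick way to see the inconsistency for yourself, as a minimal sketch in base R on simulated white noise (the sample sizes and the seed are arbitrary choices for illustration, not anything from the original answer):

```r
# Sketch: the relative spread of raw periodogram ordinates does not shrink with n.
set.seed(42)
sizes <- c(256, 4096, 65536)

rel_sd <- sapply(sizes, function(n) {
  x <- rnorm(n)                 # Gaussian white noise, flat true spectrum
  I <- Mod(fft(x))^2 / n        # raw periodogram
  I <- I[2:(n / 2)]             # drop the zero and Nyquist frequencies
  sd(I) / mean(I)               # spread of the ordinates relative to their mean
})

print(data.frame(n = sizes, rel_sd = rel_sd))
```

If the periodogram were consistent, the relative spread would fall toward zero as n grows. For white noise it stays near 1 at every sample size, because each ordinate behaves like a scaled chi-square variable with 2 degrees of freedom no matter how long the series is.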

The best of these is a multiple window (or taper) estimate of the power spectra. In this case, by using the coefficients of each window at a frequency of interest, you can compute a Harmonic F Statistic against a null hypothesis of white noise. This is an excellent tool for detection of line components in noise, and is highly recommended. It is the default choice in the signal-processing community for detection of periodicities in noise under assumption of stationarity.

You can access both the multitaper method of spectrum estimation and the associated F-test via the multitaper package in R (available via CRAN). The documentation that comes with the package should be enough to get you going; the F-test is a simple option in the function call for spec.mtm.

The original reference that defines both of these techniques and gives the algorithms for them is D.J. Thomson, "Spectrum Estimation and Harmonic Analysis," Proceedings of the IEEE, vol. 70, pp. 1055-1096, 1982.

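Here is an example using a data set included with the multitaper package.

library(multitaper)   # install.packages("multitaper") if it is not yet installed
data(willamette)      # example series shipped with the package
# k = number of tapers, nw = time-bandwidth product; Ftest = TRUE adds the harmonic F-test.
# centreWithSlepians is the argument name from the package version this answer was
# written against; newer releases may spell it centre = "Slepian" (check ?spec.mtm).
resSpec <- spec.mtm(willamette, k=10, nw=5.0, nFFT = "default",
                    centreWithSlepians = TRUE, Ftest = TRUE,
                    jackknife = FALSE, maxAdaptiveIterations = 100,
                    plot = TRUE, na.action = na.fail)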

The parameters you should be aware of are k and nw: the number of windows (set to 10 above) and the time-bandwidth product (5.0 above). You can leave them at these quasi-default values for most applications. The centreWithSlepians argument removes a robust estimate of the mean of the time series using a projection onto Slepian windows. This is also recommended, as leaving the mean in produces a lot of power at the low frequencies.

I would also recommend plotting the spectrum output from 'spec.mtm' on a log scale, as it cleans things up significantly. If you need more information, just post and I'm happy to provide it.

Related Solutions

Solved – How to perform a spectral density analysis in R using the multitaper package

Just noticed this question. Here are responses to some of your questions.

The F-test is (and I quote)

an F variance-ratio test with 2 and 2K-2 degrees of freedom for the significance of the estimated line component.

In practice, the test compares the value of the background spectra with the power in a line component, resulting in the above test. High values, high(er) significance.
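For reference, this is the form in which the statistic is usually written following Thomson (1982). The notation is an assumption made here for illustration, not taken from the package documentation: $y_k(f)$ is the $k$-th eigencoefficient (the Fourier transform of the data multiplied by the $k$-th taper), $U_k(0)$ is the Fourier transform of the $k$-th taper at zero frequency, and $K$ is the number of tapers.

$$\hat{\mu}(f) = \frac{\sum_{k=0}^{K-1} U_k(0)\, y_k(f)}{\sum_{k=0}^{K-1} U_k(0)^2}, \qquad F(f) = \frac{(K-1)\,|\hat{\mu}(f)|^2 \sum_{k=0}^{K-1} U_k(0)^2}{\sum_{k=0}^{K-1} \left|y_k(f) - \hat{\mu}(f)\,U_k(0)\right|^2}$$

The numerator is the power explained by a line component of complex amplitude $\hat{\mu}(f)$, and the denominator is the residual (background) power left in the eigencoefficients once that line is removed, which is where the 2 and $2K-2$ degrees of freedom come from.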

As far as why the critical value is not included in the plot, I assume it is largely because which critical value you choose varies from problem to problem. The package does provide numerical output from the test (it is contained in the object returned from the function call, and you can pull it out and look at it as you wish). It even returns the two (numerator, denominator) degrees of freedom so you can compute the cutoff for any significance you want, as the distribution is essentially F, and has a known CDF.

To obtain the pieces you're looking for, pull the mtm object out of resSpec and take a look at it. For example:

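spect <- resSpec$mtm             # list of multitaper internals
attributes(spect)                # shows the names of the available components
dof1 <- 2                        # numerator degrees of freedom
dof2 <- 2 * (spect$k) - 2        # denominator degrees of freedom, 2K - 2
Fval <- spect$Ftest              # F statistic at each frequency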
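With the degrees of freedom in hand, the cutoff for any significance level comes straight from the F distribution. The following is a minimal sketch in base R. The Fval and freq vectors are made-up stand-ins for resSpec$mtm$Ftest and resSpec$freq so that the snippet runs without the package, and alpha = 0.05 is only an example level.

```r
# Sketch: turn harmonic F statistics into significance calls (base R only).
k <- 10                                  # number of tapers, as in the spec.mtm call
dof1 <- 2
dof2 <- 2 * k - 2

alpha <- 0.05                            # example significance level
crit <- qf(1 - alpha, dof1, dof2)        # critical value of F(2, 2K - 2)

# Hypothetical values standing in for resSpec$mtm$Ftest and resSpec$freq
Fval <- c(0.8, 1.9, 12.4, 2.2, 0.3)
freq <- c(0.10, 0.20, 0.30, 0.40, 0.50)

pval <- pf(Fval, dof1, dof2, lower.tail = FALSE)   # per-frequency p-values
significant <- freq[Fval > crit]                   # frequencies passing the test

print(crit)
print(significant)
```

Keep in mind that the test is applied at every frequency bin, so with many bins a 5% level will flag some bins by chance alone. A common rule of thumb is to use a stricter level, on the order of 1/N for N data points, but which level is appropriate depends on the problem.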

I'm not sure what you're asking when you say

Is the significance level in the F-test optional, for example 5 or 10%?

If you have data that is not sampled at $\delta t = 1s$, you can change the frequency delta by examining the spec class attached to resSpec. It is a time-series object, and can be modified in the usual way. Alternatively, if you know what you want, just create your own frequency vector and then plot(fr,resSpec$spec,etc).

The confidence interval that you can display on the spectrum is a jackknifed confidence interval. I can provide references if you'd like to read up on this a little more, but for most purposes, it doesn't provide directly applicable information for standard analysis.

As far as I know, there is no direct way to integrate the log spectrum command for the initial plot. However, I normally just turn auto-plot off and grab the resSpec$spec object and then manipulate it directly, fixing the $\delta f$ and setting the plot to plot(fr,log(resSpec$spec),etc).
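As a rough sketch of that workflow in base R: the spec vector below is a made-up stand-in for resSpec$spec, deltaT is a hypothetical sampling interval, and the code assumes the estimate covers an evenly spaced grid from zero to the Nyquist frequency, which you should check against your own output.

```r
# Sketch: build a frequency axis for a series sampled every deltaT seconds,
# then plot the spectrum on a log scale.
deltaT <- 0.5                       # hypothetical sampling interval, in seconds

# Hypothetical stand-in for resSpec$spec: flat background with one sharp line
spec <- rep(1, 513)
spec[101] <- 500

nyquist <- 1 / (2 * deltaT)         # highest resolvable frequency, in Hz
fr <- seq(0, nyquist, length.out = length(spec))
deltaF <- fr[2] - fr[1]             # frequency spacing (the delta f above)

plot(fr, log(spec), type = "l",
     xlab = "Frequency (Hz)", ylab = "log spectrum")

peakFreq <- fr[which.max(spec)]     # frequency of the largest ordinate
```

Building the axis from the sampling interval keeps the peak locations in physical units, so a line at peakFreq can be read off directly in Hz.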

Hope these answers helped.

Solved – Testing normality and independence of time series residuals

Notwithstanding IrishStat's comments, you could use a Breusch-Godfrey test. It is used to test for a lack of correlation among the residuals of a regression model.

First, you perform your regression. Get the residuals. Run a regression of the residuals on all the variables from your regression of interest from step 1 plus some number of lagged residuals. You can guess how many lags you should include by looking at the autocorrelation function. You can test for a lack of serial correlation by testing that the coefficients on the lags of the residuals are jointly 0 by using an F test or a version of a Lagrange multiplier test (the test statistic is the number of observations in the second, auxiliary regression times the $R^2$ from that regression; the test statistic is distributed as a $\chi^2_l$, where $l$ is the number of lags, under the null of no serial correlation).
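The steps above can be sketched by hand in base R. Everything here is simulated for illustration: the data, the AR(1) coefficient, and the choice of two lags are assumptions, and missing pre-sample residuals are filled with zeros so the auxiliary regression keeps all n observations.

```r
# Sketch: Breusch-Godfrey LM test computed by hand on simulated data.
set.seed(1)
n <- 200
x <- rnorm(n)
e <- as.numeric(arima.sim(list(ar = 0.6), n = n))  # serially correlated errors
y <- 1 + 2 * x + e

fit <- lm(y ~ x)                   # step 1: the regression of interest
u <- unname(resid(fit))            # its residuals

l <- 2                             # number of lagged residuals (illustrative)
# Columns are u lagged by 1..l, with pre-sample values set to 0
lags <- sapply(1:l, function(j) c(rep(0, j), u[1:(n - j)]))

aux <- lm(u ~ x + lags)            # step 2: auxiliary regression
LM <- n * summary(aux)$r.squared   # test statistic: n times R^2
pval <- pchisq(LM, df = l, lower.tail = FALSE)

print(c(LM = LM, p.value = pval))
```

Because the errors were generated with strong serial correlation, the p-value comes out very small and the null of no serial correlation is rejected. If you would rather not build the auxiliary regression yourself, the lmtest package provides a bgtest function for this test.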
