Solved – Spectral density of a time series R implementation explanation

rsignal processingspectral analysis

I am doing signal analysis for the fist time and am using the implementation to found in the stats package to obtain the spectral density of a time series, so I can know which frequencies of the signal have the most power.

Now although I understand the general propose of the analysis, (the plot meets the shape it was supposed to have in my dataset), I don't understand the scales obtrained through the use of Spectrum(). For example when I do the spectrum(lh) example, it gives me in the Frequency axis a scale of 0 to 0.5 if I say the time seris has a frequency 1. Could anybody explain to a new guy how do I obtain the values in the frequency and spectrum Axis?

Best Answer

The Fourier transform of discrete data with sampling rate $\Delta t = 1$ is periodic with period $2\pi$ (if we're talking about angular frequency, $\omega$) OR with period $1$ (if we're talking about linear frequency, $f$). The relationship between angular and linear frequency is $\omega = 2\pi f$. We'll assume for simplicity that $\Delta t = 1$ for now and I'll be using linear frequency, $f$.

In fact, with real-valued data the Fourier transform (which is complex) is conjugate-symmetric about $\omega = f = 0$. If $X(f)$ is the Fourier transform of $x(t) \in \mathbb{R}$, then $X(f) = X^{*}(-f)$. The spectrum is proportional to $|X(f)|^{2}$, so the spectrum is symmetric about $f = 0$.

So, with a discretely sampled data ("index limited") we end up with a "band limited" Fourier transform. The Nyquist frequency (the highest frequency which is not an alias of lower frequencies. The below is an example of aliasing (can't tell one sinusoid from another - see Aliasing on Wikipedia). aliasing example

If $\Delta t \neq 1$ then the Nyquist frequency is $1 / (2\Delta t)$ and the frequencies where the spectrum is calculated (if you have $N$ data points) is

seq(0, 1/(2*dt), by = 1/N)

Usually you want to zeropad for the FFT, so you add some zeroes on at the end.

One of the better options for estimating the spectrum is to use the Multitaper Method (package multitaper in R from CRAN).

Sorry, this was all over the place... An example is below. The plot that it creates demonstrates the bias (broadband bias) properties of the periodogram vs. the multitaper.

An example I thoroughly enjoy comes from Percival and Walden, page 46 - an order 4 autoregressive simulation (so we know the theoretical spectrum): $$x_{t} = 2.7607x_{t-1} - 3.8106x_{t-2} + 2.6535x_{t-3} - 0.9238x_{t-4}$$

library('TSA')
library('multitaper')
phi <- c(2.7607, -3.8106, 2.6535, -0.9238)

# our data
N <- 500 # number of data points
M <- 2048 # zeropadded length of series
freq <- seq(0, 0.5, by = 1/M)
x <- arima.sim(n = N, model = list(ar = phi))

# theoretical spectrum
spec.thry <- ARMAspec(model = list(ar = phi), plot = FALSE)

h.pgram <- rep(1/sqrt(N), N) #periodogram taper / window

# prepared data
xh.pgram <- x * h.pgram

# calculate the periodogram
spec.pgram <- abs(fft(c(xh.pgram, rep(0, M-N)))[1:(M/2+1)])^2
# multitaper - better solution very often
spec.mtm <- spec.mtm(x, nFFT = M, plot = FALSE)

# plotting
par(mar = c(4,4,1,1))
plot(spec.thry$freq, spec.thry$spec, type = 'l', log='y', ylab = "Spectrum", xlab = "Frequency", lwd = 2)
lines(freq, spec.pgram, col = rgb(1,0,0,0.7))
lines(spec.mtm$freq, spec.mtm$spec, col = rgb(0,0,1,0.7))
legend("topright", c("Theory", "Periodogram", "Multitaper"), col = c("black", rgb(1,0,0,0.7), rgb(0,0,1,0.7)), lwd = c(2,1,1))
Related Question