In my econometrics class, my teacher defined a stationary time series thus:
"Loosely speaking, a time series is stationary if its stochastic properties and its temporal dependence structure do not change over time." I am confused as to what some examples would be. Would temperature throughout the years be stationary, assuming that there isn't any trend? Does stationarity mean that the only movement in the data is attributed to random, white noise? What are some examples? I am at a loss for examples.
Solved – a stationary time series? What are some examples
Related Solutions
First of all, it is important to note that stationarity is a property of a process, not of a time series. You consider the ensemble of all time series generated by a process. If the statistical properties¹ of this ensemble (mean, variance, …) are constant over time, the process is called stationary. Strictly speaking, it is impossible to say whether a given time series was generated by a stationary process (however, with some assumptions, we can make a good guess).
More intuitively, stationarity means that there are no distinguished points in time for your process (influencing the statistical properties of your observation). Whether this applies to a given process depends crucially on what you consider as fixed or variable for your process, i.e., what is contained in your ensemble.
A typical cause of non-stationarity is time-dependent parameters, which allow us to distinguish time points by the values of the parameters. Another cause is fixed initial conditions.
Consider the following examples:
The noise reaching my house from a single car passing at a given time is not a stationary process. E.g., the average amplitude² is highest when the car is directly next to my house.
The noise reaching my house from street traffic in general is a stationary process, if we ignore the time dependency of the traffic intensity (e.g., less traffic at night or on weekends). There are no distinguished points in time anymore. While there may be strong fluctuations of individual time series, these vanish when I consider the ensemble of all realisations of the process.
If we include known impacts on traffic intensity, e.g., that there is less traffic at night, the process is non-stationary again: the average amplitude² varies with a daily rhythm. Every point in time is distinguished by the time of the day.
The position of a single peppercorn in a pot of boiling water is a stationary process (ignoring the loss of water due to evaporation). There are no distinguished points in time.
The position of a single peppercorn in a pot of boiling water dropped in the exact middle at $t=0$ is not a stationary process, as $t=0$ is a distinguished point in time. The average position of the peppercorn is always in the middle (assuming a symmetric pot without distinguished directions), but at $t=\epsilon$ (with $\epsilon$ small), we can be sure that the peppercorn is somewhere near the middle for every realisation of the process, while at a later time, it can also be closer to the border of the pot.
So, the distribution of positions changes over time. To give a specific example, the standard deviation grows. The distribution quickly converges to the respective distributions of the previous example and if we only take a look at this process for $t>T$ with a sufficiently high $T$, we can neglect the non-stationarity and approximate it as a stationary process for all purposes – the impact of the initial condition has faded away.
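The ensemble view in the peppercorn examples can be sketched numerically. The snippet below is a minimal illustration, not a physical model: it assumes a one-dimensional mean-reverting AR(1) recursion as a stand-in for the peppercorn's position, and the names `phi`, `sigma`, `simulate` and `ensemble_std` are invented for this sketch. It compares the spread across many realisations when every run starts in the exact middle with the spread when the starting point is itself drawn from the long-run distribution.

```python
import math
import random
import statistics


def simulate(phi, sigma, n_steps, x0, rng):
    """One realisation of x[t] = phi * x[t-1] + sigma * noise, started at x0."""
    path = [x0]
    for _ in range(n_steps):
        path.append(phi * path[-1] + sigma * rng.gauss(0.0, 1.0))
    return path


def ensemble_std(paths, t):
    """Standard deviation across realisations at one fixed time index t."""
    return statistics.pstdev([p[t] for p in paths])


rng = random.Random(0)
phi, sigma, n_steps, n_paths = 0.9, 1.0, 60, 4000  # assumed toy values
stationary_sd = sigma / math.sqrt(1.0 - phi ** 2)   # long-run spread of this AR(1)

# Case 1: every realisation starts in the exact middle (fixed initial condition).
fixed_start = [simulate(phi, sigma, n_steps, 0.0, rng) for _ in range(n_paths)]

# Case 2: the starting point is drawn from the long-run distribution.
random_start = [
    simulate(phi, sigma, n_steps, rng.gauss(0.0, stationary_sd), rng)
    for _ in range(n_paths)
]

print("t, spread with fixed start, spread with random start")
for t in (1, 5, 20, 60):
    print(t, round(ensemble_std(fixed_start, t), 2), round(ensemble_std(random_start, t), 2))
```

With the fixed start, the spread across realisations should grow from small values towards `stationary_sd`; with the random start it should stay near `stationary_sd` at every time index, which is what "no distinguished points in time" amounts to in this toy setting.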
¹ For practical purposes, this is sometimes reduced to the mean and the variance (weak stationarity), but I do not consider this helpful to understand the concept. Just ignore weak stationarity until you have understood stationarity.
² Which is the mean of the volume, but the standard deviation of the actual sound signal (do not worry too much about this here).
I think you're doing this correctly. The predictions show the deterministic component of the model, as intended. As a loose analogy, if you add a trend line to a simple scatter plot (`abline(lm(...))`, e.g.), you wouldn't expect the trend line to wobble around. Similarly, the forecast represents the best guess at the future temperatures.
Were you perhaps interested in a stochastic simulation, or bounding the forecast estimates with confidence intervals? Or maybe there is another seasonal component in the observed time series that is missing from your forecast? I'll elaborate on these three possibilities below.
- For the confidence intervals, look at `forecast$se`. Multiply that by the appropriate critical value, e.g., `qnorm(0.975)`, and plot using `lines()`, and you can add some confidence bands.
- If you want a stochastic simulation, I would simulate the time series manually using the equation from the model that you fitted. Basically, simulate that prediction 1 time step at a time in a `for()` loop, and each time step add a Wiener noise term. For an AR(1) process with a coefficient of 1.00, this could amount to overlaying Brownian motion (`diffinv(rnorm(40))`). Your model has more terms, so it's not so simple. Try something akin to `X[i] <- B1*X[i-1] + Sigma*rnorm(1)`, where Sigma is the s.d. of the noise that you want to add, B1 is the AR(1) coefficient, and X is the response variable. Adjust as necessary to include your seasonal and MA terms, etc. You could choose Sigma based on the residual variance of your fitted time series model.
- Perhaps what "looks wrong" to you is absence of a seasonality term with a multi-year period. For example, the effect of the North Atlantic Oscillation on temperature. You could try adding the NAO Index as a covariate to the model.
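As a rough illustration of the first two suggestions, here is the same idea written in Python rather than R, for a plain AR(1) model only. The coefficient `B1`, noise s.d. `SIGMA`, last observation `X_LAST` and `HORIZON` are made-up values, not taken from any fitted model, and the band ignores parameter-estimation uncertainty as well as any seasonal or MA terms.

```python
import math
import random
from statistics import NormalDist

# Hypothetical AR(1) setup; none of these numbers come from a fitted model.
B1, SIGMA, X_LAST, HORIZON = 0.8, 1.5, 4.0, 12


def simulate_path(rng):
    """One stochastic simulation: X[i] = B1 * X[i-1] + SIGMA * noise."""
    x, path = X_LAST, []
    for _ in range(HORIZON):
        x = B1 * x + SIGMA * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def forecast_with_band(level=0.95):
    """Point forecast with a normal-theory band, as (lower, mean, upper) per step."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # Python counterpart of qnorm(0.975)
    mean, var, rows = X_LAST, 0.0, []
    for _ in range(HORIZON):
        mean = B1 * mean                  # deterministic part of the forecast
        var = B1 ** 2 * var + SIGMA ** 2  # forecast-error variance grows with the horizon
        half_width = z * math.sqrt(var)
        rows.append((mean - half_width, mean, mean + half_width))
    return rows


rng = random.Random(4)
paths = [simulate_path(rng) for _ in range(2000)]
lower, mean, upper = forecast_with_band()[-1]
inside = sum(lower <= p[-1] <= upper for p in paths) / len(paths)
print("share of simulated paths inside the band at the last step:", round(inside, 3))
```

The point forecast is smooth while the individual simulated paths wobble; the band summarises how far those paths are expected to stray from the forecast.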
To directly answer your questions:
Do you need to add a white noise term? No. If it's noise you want to add, do a stochastic simulation, which would allow the noise to propagate naturally (a shock at time t-1 will have an effect at time t b/c X[t] is correlated with X[t-1]; also, the MA term carries the previous shock forward directly, b/c X[t] depends on Epsilon[t-1] as well as Epsilon[t]).
Is your method correct? Given the structure of your model, your forecast seems reasonable. If the forecast is lacking some deterministic pattern, try accounting for things like NAO. If you desire an explicit graphical representation of uncertainty (due to lack of additional terms, or other process/observation errors), I would suggest communicating that the model upon which your forecast is based has some residual variance: put s.e.'s or CI's on the forecast.
My hunch is that you're mostly interested in doing a stochastic simulation. That would be the way to add randomness.
Best Answer
Perhaps a simple example from finance might help intuition. Let $R_t$ be the interest rate for period $t$ (note this is a random variable).
Numerous interest rate models (e.g., Vasicek or Cox-Ingersoll-Ross) imply that the rate is a stationary process. If you earn the interest rate $R_t$ each period and start with $V_0$ dollars, then the quantity of dollars you have at time $t$ is given by:
$$V_t = V_0 \prod_{\tau=1}^t \left(1 + R_\tau \right)$$
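The process $\left\{ V_t \right\}$ is NOT stationary. Its mean and variance depend on $t$ (with a positive average rate, the expected balance typically keeps growing), so there's no time-invariant unconditional mean or variance.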
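A small simulation can make the contrast between $R_t$ and $V_t$ concrete. This is only a sketch under stated assumptions: the rate follows a discrete-time Gaussian AR(1) recursion as a stand-in for a Vasicek-type model, started from its long-run distribution, and the parameter values (`mean_rate`, `phi`, `sigma`, `v0`) are invented for illustration.

```python
import math
import random
import statistics


def simulate_account(n_periods, rng, mean_rate=0.03, phi=0.8, sigma=0.005, v0=100.0):
    """Return (rates, wealth), where rates[t-1] is R_t and wealth[t-1] is V_t."""
    long_run_sd = sigma / math.sqrt(1.0 - phi ** 2)
    r = rng.gauss(mean_rate, long_run_sd)  # start the rate in its long-run distribution
    v, rates, wealth = v0, [], []
    for _ in range(n_periods):
        r = mean_rate + phi * (r - mean_rate) + sigma * rng.gauss(0.0, 1.0)
        v *= 1.0 + r                       # V_t = V_{t-1} * (1 + R_t)
        rates.append(r)
        wealth.append(v)
    return rates, wealth


rng = random.Random(1)
n_paths, n_periods = 2000, 100
accounts = [simulate_account(n_periods, rng) for _ in range(n_paths)]


def ensemble_mean(series_index, t):
    """Mean across realisations of R_t (series_index 0) or V_t (series_index 1)."""
    return statistics.fmean([a[series_index][t - 1] for a in accounts])


for t in (10, 100):
    print("t =", t, "mean rate:", round(ensemble_mean(0, t), 4),
          "mean balance:", round(ensemble_mean(1, t), 1))
```

Across realisations, the average rate should look the same at $t=10$ and $t=100$, while the average balance is far larger at $t=100$: the distribution of $V_t$ moves with $t$ even though that of $R_t$ does not.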
Other examples from econ and finance:
Let $Y_t$ be aggregate output (i.e. GDP) of the economy at time $t$.
Let $S_t$ be the price of the overall market portfolio.
A random walk or a Wiener process (the continuous time analogue to a random walk) are canonical examples of non-stationary processes. On the other hand, increments of a random walk or a Wiener process are stationary processes.
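The random-walk claim is easy to check numerically. The sketch below assumes Gaussian unit-variance steps and an arbitrary number of paths; `walks`, `increments` and `var_at` are names chosen for this example. For such a walk, the variance across realisations after $n$ steps is $n$, while each increment has variance 1 regardless of when it occurs.

```python
import random
import statistics

rng = random.Random(2)
n_paths, n_steps = 3000, 200

# walks[i][k] is the position of realisation i after k + 1 unit-variance steps.
walks = []
for _ in range(n_paths):
    position, path = 0.0, []
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    walks.append(path)

# increments[i][k] is the k-th step of realisation i (first step measured from 0).
increments = [
    [path[0]] + [path[k] - path[k - 1] for k in range(1, n_steps)]
    for path in walks
]


def var_at(paths, k):
    """Variance across realisations at a fixed index k."""
    return statistics.pvariance([p[k] for p in paths])


for k in (9, 49, 199):
    print("steps:", k + 1,
          "walk variance:", round(var_at(walks, k), 1),
          "increment variance:", round(var_at(increments, k), 2))
```

The walk's variance keeps growing with the number of steps, so its distribution depends on time; the increments show no such dependence.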
Temperature
As @kjetil points out, temperature is not a stationary process. For example, the distribution over temperatures in January is not the same as the distribution over temperatures in June. The joint distribution changes when shifted in time.
On the other hand, let $\mathbf{y}_t$ be a 12 by 1 vector for year $t$ where each entry of the vector denotes the average temperature for a month. You might be able to argue that $\mathbf{y}_t$ is a stationary process.
Update: As @bright-star points out in the comments, this is the basic idea behind cyclostationarity. The temperature on a specific day as $t$ varies across years may be a stationary process.
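A toy version of the temperature argument, under loud assumptions: monthly averages are a fixed seasonal mean plus independent Gaussian noise, with no trend and no dependence between years. The seasonal curve in `seasonal_mean` and the noise level are invented, not real climate data.

```python
import math
import random
import statistics

rng = random.Random(3)
n_years = 400

# Hypothetical seasonal means in degrees C: coldest in January (index 0), warmest in July.
seasonal_mean = [10.0 - 12.0 * math.cos(2.0 * math.pi * m / 12) for m in range(12)]

# years[t] plays the role of the 12-by-1 vector y_t of monthly averages for year t.
years = [
    [seasonal_mean[m] + rng.gauss(0.0, 2.0) for m in range(12)]
    for _ in range(n_years)
]

jan = [y[0] for y in years]
jul = [y[6] for y in years]
half = n_years // 2

# Month by month, the distribution shifts with the calendar ...
print("mean January:", round(statistics.fmean(jan), 2),
      "mean July:", round(statistics.fmean(jul), 2))

# ... but a fixed month, followed across years, shows no drift.
print("January, first half of years:", round(statistics.fmean(jan[:half]), 2),
      "second half:", round(statistics.fmean(jan[half:]), 2))
```

Read as one long monthly series, the data are not stationary because January and July have different distributions. Read year by year as vectors, or one calendar month at a time, nothing distinguishes one year from another in this toy setup.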
Sunspots
One of the first time-series models was developed by Yule and Walker to model the 11-year sunspot cycle.
Let $y_t$ be the number of sunspots in year $t$. They modeled the number of sunspots in a year as a stationary process using the AR(2) model:
$$ y_t = a + b y_{t-1} + c y_{t-2} + \epsilon_t $$
A stationary process can have patterns, cycles, etc...
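To see how an AR(2) model can be stationary and still cyclical, one can inspect the roots of its characteristic equation $z^2 - b z - c = 0$: the process is stationary when both roots lie strictly inside the unit circle, and complex roots produce a pseudo-cycle. The helper names below are invented for this sketch, and the coefficients $b = 1.34$, $c = -0.65$ are illustrative values chosen to give a cycle of roughly the right length; they are not presented as the fitted sunspot estimates.

```python
import cmath
import math


def ar2_roots(b, c):
    """Roots of z**2 - b*z - c = 0 for y[t] = a + b*y[t-1] + c*y[t-2] + eps[t]."""
    disc = cmath.sqrt(b * b + 4.0 * c)
    return (b + disc) / 2.0, (b - disc) / 2.0


def is_stationary(b, c):
    """True when both characteristic roots lie strictly inside the unit circle."""
    return all(abs(z) < 1.0 for z in ar2_roots(b, c))


def cycle_period(b, c):
    """Period of the pseudo-cycle when the roots are complex, otherwise None."""
    z = ar2_roots(b, c)[0]
    if abs(z.imag) < 1e-12:
        return None
    return 2.0 * math.pi / abs(cmath.phase(z))


b, c = 1.34, -0.65  # illustrative coefficients only
print("stationary:", is_stationary(b, c))
print("pseudo-cycle period in years:", cycle_period(b, c))
```

The intercept $a$ does not enter the check: it shifts the mean of the process but has no effect on stationarity or on the cycle length.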
Be aware of the two common definitions of stationarity.
Somewhat loosely:
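In standard textbook notation (the symbols $F$, $\mu$ and $\gamma_k$ are notation assumed here, not taken from the post), the two definitions are usually written as follows.

Strict stationarity: the joint distribution of any finite set of observations is unchanged by a shift in time,

$$ F_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = F_{X_{t_1+h}, \ldots, X_{t_n+h}}(x_1, \ldots, x_n) \quad \text{for all } n,\ t_1, \ldots, t_n,\ h $$

Covariance (weak) stationarity: the mean and the autocovariances are finite and do not depend on $t$,

$$ E\left[X_t\right] = \mu, \qquad \mathrm{Cov}\left(X_t, X_{t-k}\right) = \gamma_k \quad \text{for all } t \text{ and } k $$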
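(Perhaps an obscure, technical remark, but strict stationarity does not imply covariance stationarity and covariance stationarity does not imply strict stationarity. A strictly stationary process may lack finite second moments, and covariance stationarity constrains only the first two moments.)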