Strict stationarity is the strongest form of stationarity: the joint distribution of any collection of the time series variates never depends on time, so the mean, the variance and every higher moment are the same for every variate. However, for day-to-day use strict stationarity is too strict, and the following weaker definitions are often used instead. Second-order stationarity (stationarity of order 2) requires a constant mean, a constant variance and an autocovariance that does not depend on time, only on the lag. A still weaker form is first-order stationarity, which requires only that the mean is a constant function of time; a series with a time-varying mean can be detrended to obtain one which is first-order stationary.
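As a rough illustration of what second-order stationarity means in practice (the helper name and block scheme below are my own, not from any package), you can compare the first two moments across blocks of a series:

```python
import numpy as np

rng = np.random.default_rng(0)

def moment_spread(x, n_blocks=4):
    """Split the series into equal blocks and return the spread (max - min)
    of the block means and block variances. For a second-order stationary
    series both spreads should be small; large spreads hint at a
    time-varying mean or variance."""
    blocks = np.array_split(np.asarray(x), n_blocks)
    means = np.array([b.mean() for b in blocks])
    variances = np.array([b.var(ddof=1) for b in blocks])
    return np.ptp(means), np.ptp(variances)

white_noise = rng.standard_normal(2000)              # weakly stationary
random_walk = np.cumsum(rng.standard_normal(2000))   # wandering mean: not stationary

print(moment_spread(white_noise))
print(moment_spread(random_walk))
```

This is only a diagnostic sketch, not a formal test; the tests discussed below put the same idea on a proper statistical footing.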
Traditional stationarity tests such as PP.test (Phillips-Perron unit root test), the KPSS test, or the augmented Dickey-Fuller test are not adequate if you perform regression via methods other than ARIMA (because in ARIMA the orders are fixed and no other factors that produce non-stationarity are included in the model). For non-ARIMA cases, stationarity tests in the frequency domain are more appropriate.
Tests in the frequency domain: the Priestley-Subba Rao (PSR) test for nonstationarity (fractal package). It is based on examining how homogeneous a set of spectral density function (SDF) estimates is across time, across frequency, or both.
The test you refer to (implemented in the locits package) is also a frequency-domain test, of second-order stationarity. It looks at a quantity $\beta_j(t)$ that is closely related to a wavelet-based time-varying spectrum of the time series (it is a linear transform of the evolutionary wavelet spectrum of the locally stationary wavelet processes of Nason, von Sachs and Kroisandt, 2000). One checks whether $\beta_j(t)$ varies over time or is constant by examining the Haar wavelet coefficients of its estimate: the series is judged stationary if all Haar coefficients are zero.
There are other concerns about stationarity, such as long-range dependence and fractionally integrated processes (ARFIMA), where the fractional differencing parameter d captures long-memory behaviour.
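To make the role of $d$ concrete, here is a small sketch (plain NumPy, helper name my own) of the fractional differencing filter $(1-B)^d$ that ARFIMA models apply, using the standard binomial-coefficient recursion:

```python
import numpy as np

def frac_diff_weights(d, n_weights):
    """Weights of the fractional differencing filter (1 - B)^d,
    via the recursion w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    w = np.empty(n_weights)
    w[0] = 1.0
    for k in range(1, n_weights):
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

# d = 1 recovers ordinary first differencing: weights (1, -1, 0, 0, ...)
print(frac_diff_weights(1.0, 5))

# For 0 < d < 0.5 the weights decay slowly (hyperbolically), which is
# what produces the long-memory autocorrelation of ARFIMA processes.
print(frac_diff_weights(0.4, 5))
```

The slow decay of the weights for fractional $d$ is exactly why such processes have "long memory": distant past values never quite stop mattering.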
The effect of higher-order non-stationarity and long-range dependence is that they are reflected systematically in the errors of a regression; however, their impact, and therefore the validity of the regression, is difficult to measure.
First, you have not told us what this series represents. Subject-matter knowledge is generally useful when trying to model the data.
the volatility of the series is increasing <...> The log returns exhibit this even more clearly
It would be more accurate to say that volatility was very low in the beginning, high for a while and then quite low again. There is no linear trend in the magnitude of volatility -- see your second figure.
- Which alternative model could I use to model this behavior of the volatility of the series <...> ?
You may want to reconsider the model for the conditional mean first. Perhaps you could model the apparent trend in the original series and the dip around 2009 by including some other explanatory variables, turning the AR model into an ARX model. Also, you could consider an ARCH-in-mean specification, as the volatility seems to have been highest when the series was going down.
Now to answer the question, there is a great variety of GARCH models suited for different features of the data. One survey (not necessarily the best) can be found here: Terasvirta "An introduction to univariate GARCH models" (2009). You could take a look at the stylized facts and the characteristics of the different models and see if there is one that suits you.
You could also have a GARCH model with exogenous variables. E.g. a GARCH model with a time trend could suit a case where conditional variance is increasing linearly (which does not seem to be the case here, though). Or you could have an IGARCH model where the conditional variance is an integrated process. That also answers your second question:
- When modelling the conditional mean of the series the classic approach would be to integrate the series. Is there a similar approach considerable for the volatility?
Sorry for being a bit vague about the choice of the model, but I think it would be careless to recommend a particular model just by having seen a graph of the data.
- I would be also very happy about any hint to a particular Stata or R command that seem promising to model the volatility of this time series.
In R I may recommend the rugarch package, with its main functions ugarchspec (for specifying a GARCH model) and ugarchfit (for estimating the specified model on a given data series). There is a nice vignette to consult.
P.S.
the estimated coefficients sum to a value greater than one <...> implying that volatility is growing without bounds which is clearly undesirable
It is not only undesirable, but it also does not seem to fit the data -- I do not see gradually exploding volatility in the second figure.
Best Answer
1. To second your concern, Wikipedia's article on the coefficient of variation ($c_v$) suggests that "[t]he coefficient of variation should be computed only for data measured on a ratio scale rather than interval scale, as it is meaningless in the latter case." So you probably should not use it.
Now, how would you define the conditional coefficient of variation? Would you have one value per time point, like $c_{v,t}:=\frac{\sigma_t}{\mu_t}$? If you then try to remedy the problem of negative values by taking the absolute value, you would end up with a new measure $c^*_{v,t}:=\frac{\sigma_t}{|\mu_t|}$. It looks alright to me, just think whether it is exactly what you need (how exactly you define volatility).
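The modified measure $c^*_{v,t}$ is straightforward to compute once you have conditional mean and volatility estimates; a minimal sketch (plain NumPy, helper name my own) that also guards against near-zero means, which is the measure's real weak spot:

```python
import numpy as np

def conditional_cv(mu, sigma, eps=1e-12):
    """Compute c*_{v,t} = sigma_t / |mu_t| elementwise.
    Returns NaN where |mu_t| is (numerically) zero, since the
    ratio is then undefined / explosive."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    out = np.full_like(sigma, np.nan)
    ok = np.abs(mu) > eps
    out[ok] = sigma[ok] / np.abs(mu[ok])
    return out

# |mu| in the denominator makes the sign of the mean irrelevant,
# but the measure still blows up when the conditional mean is near zero.
print(conditional_cv([2.0, -2.0, 0.0], [1.0, 1.0, 1.0]))
```

The NaN case is worth keeping in mind: whenever the conditional mean crosses zero, $c^*_{v,t}$ spikes for reasons that have nothing to do with volatility.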
2. Your second approach seems better suited for the task. You can estimate the unconditional (or long-run) variances of the series and compare them. Here are two threads that discuss the topic: "Estimating unconditional variance in time series" and "What is the long run variance?". The question is, how do we get the distribution of the estimator for the unconditional variance? We need it for deriving the distribution of the test statistics for testing the equality of the two variances.
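One pragmatic (if rough) route, since the series are serially dependent and a plain F-test's independence assumption fails, is a moving-block bootstrap of the sample variance; everything below (block length, replicate count, helper names) is an illustrative assumption, not a prescription:

```python
import numpy as np

rng = np.random.default_rng(7)

def block_bootstrap_var(x, block_len=20, n_boot=500):
    """Moving-block bootstrap replicates of the sample variance,
    approximating the estimator's sampling distribution while
    preserving short-range serial dependence within blocks."""
    x = np.asarray(x)
    n = len(x)
    starts = np.arange(n - block_len + 1)
    n_blocks = int(np.ceil(n / block_len))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        chosen = rng.choice(starts, size=n_blocks)
        resampled = np.concatenate([x[s:s + block_len] for s in chosen])[:n]
        stats[b] = resampled.var(ddof=1)
    return stats

x = rng.standard_normal(400)          # unconditional variance 1
y = 2.0 * rng.standard_normal(400)    # unconditional variance 4
diff = block_bootstrap_var(x) - block_bootstrap_var(y)
lo, hi = np.percentile(diff, [2.5, 97.5])
print(lo, hi)  # an interval excluding 0 points to unequal variances
```

This only sketches the idea; in a serious application you would tune the block length to the series' dependence structure and consider bootstrapping the two series jointly if they are correlated.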