Solved – AR(1) process with heteroscedastic measurement errors

Tags: autoregressive, likelihood, markov-chain-montecarlo, time-series

1. The problem

I have some measurements of a variable $y_t$, where $t=1,2,\ldots,n$, for which I have a distribution $f_{y_t}(y_t)$ obtained via MCMC, which for simplicity I'll assume is a Gaussian with mean $\mu_t$ and variance $\sigma_t^2$.

I have a physical model for those observations, say $g(t)$, but the residuals $r_t = \mu_t-g(t)$ appear to be correlated; in particular, I have physical reasons to think that an $AR(1)$ process will suffice to account for the correlation, and I plan on obtaining the coefficients of the fit via MCMC, for which I need the likelihood. I think the solution is rather simple, but I'm not entirely sure (it seems so simple that I suspect I'm missing something).

2. Deriving the likelihood

A zero-mean $AR(1)$ process can be written as:
$$X_t = \phi X_{t-1}+\varepsilon_t,\ \ \ (1)$$
where I'll assume $\varepsilon_t\sim N(0,\sigma_w^2)$. The parameters to be estimated are, therefore, $\theta = \{\phi,\sigma_w^2\}$ (in my case, I also have to add the parameters of the model $g(t)$, but that's not the problem). What I observe, however, is the variable
$$R_t = X_t+\eta_t,\ \ \ (2)$$
where I'm assuming $\eta_t\sim N(0,\sigma_t^2)$, and the $\sigma_t^2$ are known (the measurement errors). Because $X_t$ is a Gaussian process, so is $R_t$. In particular, I know that
$$X_1 \sim N(0,\sigma_w^2/[1-\phi^2]),$$
therefore,
$$R_1 \sim N(0,\sigma_w^2/[1-\phi^2]+\sigma_1^2).$$
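Since I check things by simulation, here is a minimal Python sketch of the model in eqs. $(1)$–$(2)$ (function and variable names are illustrative): it draws $X_1$ from the stationary distribution and then adds heteroscedastic measurement noise.

```python
import numpy as np

def simulate_ar1_noisy(n, phi, sigma_w, sigma_eta, seed=0):
    """Simulate X_t = phi*X_{t-1} + eps_t and R_t = X_t + eta_t,
    with eps_t ~ N(0, sigma_w^2) and eta_t ~ N(0, sigma_eta[t]^2)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    # draw X_1 from the stationary distribution N(0, sigma_w^2 / (1 - phi^2))
    x[0] = rng.normal(0.0, sigma_w / np.sqrt(1.0 - phi**2))
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.normal(0.0, sigma_w)
    # heteroscedastic measurement noise, one sigma per observation
    r = x + rng.normal(0.0, sigma_eta)
    return x, r

# example: per-observation error scales drawn between 0.5 and 1.5
sigma_eta = np.random.default_rng(1).uniform(0.5, 1.5, size=500)
x, r = simulate_ar1_noisy(500, phi=0.7, sigma_w=1.0, sigma_eta=sigma_eta)
```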
The next challenge is to obtain $R_t|R_{t-1}$ for $t\neq 1$. To derive the distribution of this random variable, note that, using eq. $(2)$ I can write
$$X_{t-1} = R_{t-1}-\eta_{t-1}.\ \ \ (3)$$
Using eq. $(2)$, and using the definition of eq. $(1)$, I can write,
$$R_{t} = X_t+\eta_t = \phi X_{t-1}+\varepsilon_{t}+\eta_t.$$
Substituting eq. $(3)$ into this last expression, I obtain
$$R_{t} = \phi (R_{t-1}-\eta_{t-1})+\varepsilon_{t}+\eta_t,$$
thus,
$$R_t|R_{t-1} = \phi (r_{t-1}-\eta_{t-1})+\varepsilon_{t}+\eta_t,$$
and, therefore,
$$R_t|R_{t-1} \sim N(\phi r_{t-1},\sigma_w^2+\sigma_t^2-\phi^2\sigma^2_{t-1}).$$ Finally, I can write the likelihood function as
$$L(\theta) = f_{R_1}(R_1=r_1) \prod_{t=2}^{n} f_{R_{t}|R_{t-1}}(R_t=r_t|R_{t-1}=r_{t-1}),$$
where the $f(\cdot)$ are the distributions of the variables that I just defined, i.e., defining $\sigma'^2 = \sigma_w^2/[1-\phi^2]+\sigma_1^2,$
$$f_{R_1}(R_1=r_1) = \frac{1}{\sqrt{2\pi \sigma'^2}}\text{exp}\left(-\frac{r_1^2}{2\sigma'^2}\right),$$
and defining $\sigma^2(t) = \sigma_w^2+\sigma_t^2-\phi^2\sigma^2_{t-1}$,
$$f_{R_{t}|R_{t-1}}(R_t=r_t|R_{t-1}=r_{t-1})=\frac{1}{\sqrt{2\pi \sigma^2(t)}}\text{exp}\left(-\frac{(r_t-\phi r_{t-1})^2}{2\sigma^2(t)}\right)$$
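This likelihood can be coded directly for the simulation checks; a minimal Python sketch as derived above (function name illustrative, and the measurement variances passed as an array):

```python
import numpy as np

def loglike_ar1_noisy(r, phi, sigma_w2, sigma_eta2):
    """Log-likelihood exactly as derived above: r are the residuals,
    sigma_eta2 the known per-observation measurement variances."""
    r = np.asarray(r, dtype=float)
    s2 = np.asarray(sigma_eta2, dtype=float)
    # first factor: R_1 ~ N(0, sigma_w^2/(1-phi^2) + sigma_1^2)
    v1 = sigma_w2 / (1.0 - phi**2) + s2[0]
    ll = -0.5 * (np.log(2 * np.pi * v1) + r[0]**2 / v1)
    # remaining factors:
    # R_t | R_{t-1} ~ N(phi*r_{t-1}, sigma_w^2 + sigma_t^2 - phi^2*sigma_{t-1}^2)
    vt = sigma_w2 + s2[1:] - phi**2 * s2[:-1]
    ll += np.sum(-0.5 * (np.log(2 * np.pi * vt)
                         + (r[1:] - phi * r[:-1])**2 / vt))
    return ll
```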

3. Questions

  1. Is my derivation OK? I don't have any resources to compare against other than simulations (which seem to agree), and I'm not a statistician!
  2. Are there any derivations of this kind in the literature for $MA(1)$ or $ARMA(1,1)$ processes? A study of general $ARMA(p,q)$ processes that could be particularized to this case would be nice.

Best Answer

  1. You're on the right track, but you've made a mistake in deriving the distribution of $R_t$ given $R_{t-1}$: the conditional mean isn't $\phi r_{t-1}$. It's $\phi \widehat{x}_{t-1}$, where $\widehat{x}_{t-1}$ is your best estimate of $X$ from the previous period. The value of $\widehat{x}_{t-1}$ includes information from previous observations as well as $r_{t-1}$. (To see this, consider a situation where $\sigma_w$ and $\phi$ are negligible, so you're effectively estimating a fixed mean. After lots of observations, your uncertainty about $X$ will be much smaller than $\sigma_{\eta}$.) This can be confusing at first, because you observe $R$ and not $X$. That just means that you're dealing with a state-space model.

  2. Yes, there is a very general framework for using linear-Gaussian models with noisy observations, called the Kalman filter. This applies to anything with an ARIMA structure and many more models too. Time-varying $\sigma_{\eta}$ is OK for the Kalman filter, provided it's not stochastic. Models with, e.g., stochastic volatility need more general methods. To see how the Kalman filter is derived, try Durbin-Koopman or chapter 3 of Harvey. In Harvey's notation, your model has $Z=1$, $d=c=0$, $H_t = \sigma_{\eta,t}^2$, $T=\phi$, $R=1$ and $Q=\sigma^2_w$.
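For this scalar model the filter reduces to a few lines. A minimal Python sketch (names are illustrative), initialising at the stationary distribution $X_1\sim N(0,\sigma_w^2/[1-\phi^2])$ and accumulating the log-likelihood via the prediction-error decomposition:

```python
import numpy as np

def kalman_loglike(r, phi, sigma_w2, sigma_eta2):
    """Exact log-likelihood of the AR(1)-plus-noise model via the Kalman
    filter; in Harvey's notation Z=1, T=phi, H_t=sigma_eta2[t], Q=sigma_w2."""
    r = np.asarray(r, dtype=float)
    h = np.asarray(sigma_eta2, dtype=float)
    # initialise at the stationary distribution of X_t
    a, p = 0.0, sigma_w2 / (1.0 - phi**2)
    ll = 0.0
    for t in range(len(r)):
        f = p + h[t]                      # innovation variance
        v = r[t] - a                      # innovation (prediction error)
        ll += -0.5 * (np.log(2 * np.pi * f) + v**2 / f)
        k = p / f                         # Kalman gain
        a_filt = a + k * v                # filtered state estimate x-hat_t
        p_filt = p * (1.0 - k)            # filtered state variance
        a = phi * a_filt                  # one-step-ahead state prediction
        p = phi**2 * p_filt + sigma_w2
    return ll
```

The filtered mean `a_filt` is the $\widehat{x}_t$ from point 1; note that with $\sigma_{\eta,t}=0$ the gain is $1$ and the filter collapses to the ordinary exact AR(1) likelihood. For anything beyond this scalar case, a library implementation (e.g. the state-space classes in `statsmodels`) is preferable.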
