Gaussian assumption in Kalman filter

kalman filter, multivariate normal distribution, normal distribution

I have a detailed question about the Gaussian assumption in the Kalman filter. I'll lay out some equations first.

Assuming $x_{t|t-1}$ is your prediction of the state at time $t$, the innovation is

$v_t = z_t - H_t\,x_{t|t-1}$, where $z_t$ is the measurement at time $t$ and $H_t$ is the mapping from the state to the measurement.

Then we do the actual update:

$x_{t|t} = x_{t|t-1} + K_t\,v_t$, where $K_t$ is the Kalman gain.

And my question is:

If I set up some parameters, run the Kalman filter, and check my assumption after the fact, should I check the distribution of the innovation $v_t$ or the distribution of $z_t - H_t\,x_{t|t}$ instead?

I haven't seen the expression $z_t - H_t\,x_{t|t}$ mentioned anywhere, but to me it should represent the measurement error, whereas the innovation should not.

Any comments are much appreciated.

Best Answer

The Gaussian assumption is used in the predict and update steps of the Kalman filter. It is the reason you only have to keep track of means and variances.

First, $Z_t|X_t$ is Normal. Second, $X_t|Z_{1:t-1}$ is Normal. Then, by Bayes' theorem, $$ p(x_t|z_{1:t}) \propto p(z_t|x_t)p(x_t|z_{1:t-1}) $$ is Normal. Then the process begins again. You don't need to bother with the normalizing constant, because you will recognize this as a Normal distribution. All you have to do is find the mean and variance of the distribution on the left hand side. For more information on connecting this idea to your Kalman Gain and updating arithmetic, see

Checking Assumptions:

There are probably many ways to check the normality assumption, but I would check the innovation, especially if your goal is prediction. Note that $$ p(z_t|z_{1:t-1}) = \text{Normal}(z_t;H_t x_{t|t-1}, P_{t|t-1}). \tag{1} $$ Here $P_{t|t-1}$ denotes the predictive covariance of the observation $z_t$ (the innovation covariance, which adds the measurement noise covariance to the projected state covariance), not the state covariance alone. So if the model is true, you can plug your observations into the predictive density and expect to get high densities more often than not. A number of techniques build on this idea. Here are a few I can think of off the top of my head.


Call $F_t(\cdot)$ the cdf of (1), taking the observation to be scalar. Then, if the model is true, $F_t(z_t) \sim \text{Uniform}(0,1)$. If the values $F_t(z_t)$ don't look uniform, you have a problem.

At every time step calculate $n_t = -\log p(z_t|z_{1:t-1})$. High values are "bad." If your model is true, $\sum_{t=1}^N{n_t}/N$ should approach the entropy of the normal distribution in (1) (averaged over $t$ if its covariance changes with time), by the law of large numbers.

Look at the square of the z-score of your prediction. Calculate $(z_t - H_t x_{t|t-1})'P_{t|t-1}^{-1}(z_t - H_tx_{t|t-1})$, where $P_{t|t-1}$ is the covariance appearing in (1). If your model is true, this quantity should follow a chi-square distribution with degrees of freedom equal to the dimension of your observations.
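As a rough illustration of the three checks above, here is a minimal Python sketch for the scalar case. Everything specific in it is an assumption made for the example and not part of the original answer: the local-level model (random-walk state, $H_t = 1$), the hypothetical function names `simulate_local_level` and `innovation_diagnostics`, and the noise variances `q` and `r`. The variable `s` plays the role of the covariance in (1).

```python
import math
import random


def simulate_local_level(n, q, r, seed=0):
    """Simulate x_t = x_{t-1} + w_t, z_t = x_t + e_t (scalar, H_t = 1).

    Assumed toy model: w_t ~ N(0, q), e_t ~ N(0, r), x_0 = 0 known exactly.
    """
    rng = random.Random(seed)
    x, zs = 0.0, []
    for _ in range(n):
        x += rng.gauss(0.0, math.sqrt(q))            # state noise
        zs.append(x + rng.gauss(0.0, math.sqrt(r)))  # measurement noise
    return zs


def innovation_diagnostics(zs, q, r, x0=0.0, p0=0.0):
    """Run a scalar Kalman filter and collect per-step innovation checks.

    Returns four lists: F_t(z_t), n_t = -log p(z_t | z_{1:t-1}),
    the squared z-score v_t^2 / s_t, and the entropy of each predictive normal.
    """
    x, p = x0, p0
    pit, nll, z2, entropy = [], [], [], []
    for z in zs:
        p_pred = p + q        # predict: random walk, so the mean is unchanged
        v = z - x             # innovation v_t = z_t - H_t x_{t|t-1}
        s = p_pred + r        # predictive variance of z_t, as in (1)
        pit.append(0.5 * (1.0 + math.erf(v / math.sqrt(2.0 * s))))
        nll.append(0.5 * (math.log(2.0 * math.pi * s) + v * v / s))
        z2.append(v * v / s)
        entropy.append(0.5 * math.log(2.0 * math.pi * math.e * s))
        k = p_pred / s        # Kalman gain
        x = x + k * v         # update the state mean
        p = (1.0 - k) * p_pred  # update the state variance
    return pit, nll, z2, entropy


def mean(values):
    return sum(values) / len(values)


zs = simulate_local_level(5000, q=0.1, r=1.0, seed=1)
pit, nll, z2, entropy = innovation_diagnostics(zs, q=0.1, r=1.0)
print("mean of F_t(z_t):      ", mean(pit))   # should be near 0.5
print("mean squared z-score:  ", mean(z2))    # should be near 1 (chi-square, 1 dof)
print("mean n_t minus entropy:", mean(nll) - mean(entropy))  # should be near 0
```

Because the filter here uses the same `q` and `r` that generated the data, all three summaries should sit close to their targets. Rerunning `innovation_diagnostics` with a deliberately wrong `r` is a quick way to see how the checks react to misspecification. Comparing means is only a crude summary; a histogram or a formal test on `pit` and `z2` would be the natural next step.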

Note: you might want to test residuals from filtering on out-of-sample data. Otherwise, I'm not sure these convergence arguments are valid, since you would be using the data twice.
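As for the other candidate in the question, $z_t - H_t\,x_{t|t}$, a short derivation suggests it is just a linear transform of the innovation. This assumes the gain has the standard form $K_t = \Sigma_{t|t-1} H_t' S_t^{-1}$. The symbols $\Sigma_{t|t-1}$ (state prediction covariance), $R_t$ (measurement noise covariance) and $S_t = H_t \Sigma_{t|t-1} H_t' + R_t$ (the covariance in (1)) are introduced here for illustration and do not appear in the original post.

$$ \begin{aligned} z_t - H_t\,x_{t|t} &= z_t - H_t\,x_{t|t-1} - H_t K_t v_t \\ &= (I - H_t K_t)\,v_t \\ &= (S_t - H_t \Sigma_{t|t-1} H_t')\,S_t^{-1} v_t \\ &= R_t S_t^{-1} v_t. \end{aligned} $$

So under the model this quantity is also zero-mean Normal, with covariance $R_t S_t^{-1} R_t$ instead of $S_t$. It is not the measurement error itself, which would need the true state in place of $x_{t|t}$. If $R_t$ is invertible, checking it for normality should carry the same information as checking $v_t$, provided you standardize with the right covariance.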
