Solved – Kalman filter equation derivation

control theory, gaussian process, kalman filter, mathematical-statistics

I'm studying the Kalman filter for tracking and smoothing. Even though I understand the Bayesian filtering concept and can efficiently use some Kalman filter implementations, I'm stuck on understanding the math behind it in an easy way.

So, I'm looking for an easy-to-understand derivation of the Kalman filter equations ((1) update step, (2) prediction step and (3) Kalman filter gain) from the Bayes rule and the Chapman-Kolmogorov formula, knowing that:

  • Temporal model is expressed by: $$ \textbf{X}_t = A\textbf{X}_{t-1} + \mu_p + \epsilon_p$$ where $A$ is the $D_\textbf{X} \times D_\textbf{X}$ transition matrix, $\mu_p$ is the $D_\textbf{X} \times 1$ control signal vector and $\epsilon_p$ is Gaussian transition noise with covariance $\Sigma_p$; in probabilistic terms it can be expressed as: $$ p(\textbf{X}_t | \textbf{X}_{t-1}) = Norm_{\textbf{X}_t}[A\textbf{X}_{t-1} + \mu_p, \Sigma_p] $$ and

  • Measurement model is expressed by: $$ \textbf{y}_t = H\textbf{X}_t + \mu_m + \epsilon_m $$ where $H$ is the $D_\textbf{y} \times D_\textbf{X}$ observation matrix, which maps the real state space to the observation space, $\mu_m$ is a $D_\textbf{y} \times 1$ mean vector, and $\epsilon_m$ is the observation noise with covariance $\Sigma_m$; in probabilistic terms it can be expressed as $$ p(\textbf{y}_t | \textbf{X}_t) = Norm_{\textbf{y}_t}[ H\textbf{X}_t + \mu_m, \Sigma_m] $$ (a small simulation sketch of both models follows this list).
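
To make the two models concrete, here is a minimal simulation sketch in Python/NumPy. The dimensions, the specific matrices, and the helper name `simulate` are illustrative assumptions, not something given in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D state (position, velocity) and 1-D measurement (position).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])           # transition matrix (D_X x D_X)
mu_p = np.zeros(2)                   # control signal vector (D_X x 1)
Sigma_p = 0.01 * np.eye(2)           # transition noise covariance

H = np.array([[1.0, 0.0]])           # observation matrix (D_y x D_X)
mu_m = np.zeros(1)                   # measurement mean vector (D_y x 1)
Sigma_m = 0.25 * np.eye(1)           # observation noise covariance

def simulate(T, x0):
    """Draw a state trajectory X_t and measurements y_t from the two models."""
    X, Y = [x0], []
    for _ in range(T):
        x_next = A @ X[-1] + mu_p + rng.multivariate_normal(np.zeros(2), Sigma_p)
        y = H @ x_next + mu_m + rng.multivariate_normal(np.zeros(1), Sigma_m)
        X.append(x_next)
        Y.append(y)
    return np.array(X[1:]), np.array(Y)

states, measurements = simulate(T=50, x0=np.zeros(2))
```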

Best Answer

There is a simple, straightforward derivation that starts with the assumptions of the Kalman Filter and requires only a little algebra to arrive at the update and extrapolation equations, as well as some properties of the measurement residuals (the difference between the predicted measurement and the observed measurement). To start, the Kalman Filter is a linear, unbiased estimator that uses a predictor/corrector process to estimate the state given a sequence of measurements. This means that the general process involves predicting the state and then correcting that prediction based upon the difference between the predicted measurement and the observed measurement (also known as the residual). The question becomes how to update the state prediction with the observed measurement such that the resulting state estimate is: (1) a linear combination of the predicted state "x" and the observed measurement "z" and (2) has an error with zero mean (unbiased). Based upon these assumptions, the Kalman Filter can be derived.

State and Measurement Model Notation and Assumptions

The state dynamics model for the state vector $\bar x_k$ at time $k$ is given by the state transition matrix $F_{k-1}$ and the state vector $\bar x_{k-1}$ at a previous time $k-1$. The state dynamics model also includes process noise given by $\bar v_{k-1}$ at time $k-1$.

$$\bar x_k = F_{k-1}\, \bar x_{k-1} + \bar v_{k-1}$$

The measurement model for the measurement vector $\bar z_k$ at time $k$ is given by the observation matrix $H_k$ and the state vector $\bar x_k$ at time $k$. The measurement model also includes measurement noise given by $\bar w_k$ at time $k$.

$$\bar z_k = H_k\, \bar x_k + \bar w_k$$

The Kalman Filter derivation is easier if we make the linear Gaussian assumptions, i.e., the process noise $\bar v_k$ and the measurement noise $\bar w_k$ are zero-mean Gaussian with covariances $Q_k$ and $R_k$ respectively, and assume that the two noise processes are statistically independent (uncorrelated):

$$\bar v_k \sim N(\bar 0, Q_k), \qquad \bar w_k \sim N(\bar 0, R_k)$$

$$E[\bar v_k\, \bar w_j^{\prime}] = 0 \quad \text{for all } j, k$$

State Estimation and Error Notations

Now, we wish to find the state estimate $\hat x$ given a time series of measurements and define the following notation:

$\hat x_{k|k}$ is the state estimate at time $k$ after updating the Kalman Filter with all measurements through time $k$. That is, it is the updated/filtered state estimate.

$\hat x_{k|k-1}$ is the state estimate at time $k$ after updating the Kalman Filter with all but the most recent measurement. That is, it is the predicted state estimate.

$\tilde x_{j|k}$ is the estimation error in the state, which is given by:

$\tilde x_{j|k} = x_j - \hat x_{j|k}$

$P_{k|k}$ is the state estimate error covariance matrix at time $k$ after updating the Kalman Filter with all measurements through time $k$. That is, it is the error covariance for the updated/filtered state estimate.

$P_{k|k-1}$ is the state estimate error covariance matrix at time $k$ after updating the Kalman Filter with all but the most recent measurement. That is, it is the error covariance for the predicted state estimate.

$P_{j|k}$ is the state estimate error covariance matrix, which is given by:

$P_{j|k} = E[\tilde x_{j|k} \tilde x_{j|k}^{\prime}]$

The measurement predicted by the Kalman Filter is found by taking the expectation of the measurement model under the zero-mean measurement noise assumption:

$\hat z_{k|k-1} = E[\bar z_k] = E[H_k \bar x_k + \bar w_k] = H_k E[\bar x_k] + E[\bar w_k] = H_k \hat x_{k|k-1}$

Finally, the residual vector is the difference between the observed measurement $\bar z_k$ at time $k$ and the predicted measurement:

$\eta_k = \bar z_k - \hat z_{k|k-1} = \bar z_k - H_k \hat x_{k|k-1}$

Kalman Filter Derivation

We assume that the updated state estimate is a linear combination of the predicted state estimate and the observed measurement as:

$$\hat x_{k|k} = K^{\prime}_k\, \hat x_{k|k-1} + K_k\, \bar z_k$$

and we wish to find the weights (gains) $K^{\prime}_k$ and $K_k$ that produce an unbiased estimate with a minimum state estimate error covariance.

Unbiased Estimate Assumption

Applying the unbiased estimation error assumption, we have that:

$$\tilde x_{k|k} = x_k - \hat x_{k|k} = x_k - K^{\prime}_k\, \hat x_{k|k-1} - K_k\, \bar z_k = \big(I - K^{\prime}_k - K_k H_k\big)\, x_k + K^{\prime}_k\, \tilde x_{k|k-1} - K_k\, \bar w_k$$

and with $E[\tilde x_{k|k}] = 0$, this results in:

$$E[\tilde x_{k|k}] = \big(I - K^{\prime}_k - K_k H_k\big)\, E[x_k] = 0 \qquad \text{(since } E[\tilde x_{k|k-1}] = 0 \text{ and } E[\bar w_k] = 0\text{)}$$

which results in:

$$K^{\prime}_k = I - K_k H_k$$

Substituting this relationship between $K^{\prime}_k$ and $K_k$ back into the linear combination assumption, we have:

$$\hat x_{k|k} = \big(I - K_k H_k\big)\, \hat x_{k|k-1} + K_k\, \bar z_k = \hat x_{k|k-1} + K_k\big(\bar z_k - H_k\, \hat x_{k|k-1}\big) = \hat x_{k|k-1} + K_k\, \eta_k$$

where $K_k$ is called the Kalman Gain.

Minimizing the State Estimate Error Covariance

We start by computing the algebraic form of the updated covariance matrix:

$$\tilde x_{k|k} = \big(I - K_k H_k\big)\, \tilde x_{k|k-1} - K_k\, \bar w_k$$

$$P_{k|k} = E[\tilde x_{k|k}\, \tilde x_{k|k}^{\prime}] = \big(I - K_k H_k\big)\, P_{k|k-1}\, \big(I - K_k H_k\big)^{\prime} + K_k R_k K_k^{\prime}$$

We then compute the trace of the error covariance $Tr[P_{k|k}]$ and minimize it by: (1) computing the matrix derivative with respect to the Kalman Gain $K_k$ and (2) setting this matrix equation to zero. The solution for the Kalman Gain $K_k$ is given by:

$\frac{\partial Tr[P_{k|k}]}{\partial K_k} = 0$

results in:

$$K_k = P_{k|k-1}\, H_k^{\prime}\, \big(H_k\, P_{k|k-1}\, H_k^{\prime} + R_k\big)^{-1}$$
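
As an intermediate step, expanding the updated covariance, taking the trace, and differentiating is a standard matrix-calculus computation using the identities $\partial\, Tr[K A]/\partial K = A^{\prime}$ and $\partial\, Tr[K A K^{\prime}]/\partial K = 2 K A$ for symmetric $A$:

$$Tr[P_{k|k}] = Tr[P_{k|k-1}] - 2\, Tr\big[K_k H_k P_{k|k-1}\big] + Tr\big[K_k \big(H_k P_{k|k-1} H_k^{\prime} + R_k\big) K_k^{\prime}\big]$$

$$\frac{\partial Tr[P_{k|k}]}{\partial K_k} = -2\, P_{k|k-1} H_k^{\prime} + 2\, K_k\big(H_k P_{k|k-1} H_k^{\prime} + R_k\big) = 0,$$

which rearranges to the Kalman Gain expression above.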

Kalman Update

From the above derivation, the Kalman Update equations are given as:

$$\hat x_{k|k} = \hat x_{k|k-1} + K_k\, \eta_k = \hat x_{k|k-1} + K_k\big(\bar z_k - H_k\, \hat x_{k|k-1}\big)$$

$$P_{k|k} = \big(I - K_k H_k\big)\, P_{k|k-1}$$

where

$$K_k = P_{k|k-1}\, H_k^{\prime}\, \big(H_k\, P_{k|k-1}\, H_k^{\prime} + R_k\big)^{-1}$$
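
As a concrete illustration of the update equations, here is a minimal NumPy sketch; the function name `kf_update` and the argument layout are my own choices, with `R` standing for the measurement noise covariance $R_k$:

```python
import numpy as np

def kf_update(x_pred, P_pred, z, H, R):
    """Kalman update: correct the predicted state estimate with measurement z."""
    eta = z - H @ x_pred                                # residual (innovation)
    S = H @ P_pred @ H.T + R                            # residual covariance
    K = P_pred @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x_upd = x_pred + K @ eta                            # updated state estimate
    P_upd = (np.eye(len(x_pred)) - K @ H) @ P_pred      # updated error covariance
    return x_upd, P_upd
```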

Kalman Extrapolation

The extrapolation equations are simply a result of applying the system dynamics model and applying the definition of the error covariance matrix:

$$\hat x_{k|k-1} = F_{k-1}\, \hat x_{k-1|k-1}$$

$$P_{k|k-1} = F_{k-1}\, P_{k-1|k-1}\, F_{k-1}^{\prime} + Q_{k-1}$$
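
Analogously, a minimal sketch of the extrapolation step (again, the name `kf_predict` and the use of `Q` for the process noise covariance $Q_{k-1}$ are my own naming):

```python
import numpy as np

def kf_predict(x_upd, P_upd, F, Q):
    """Kalman extrapolation: propagate the filtered estimate one step forward."""
    x_pred = F @ x_upd                   # predicted state estimate
    P_pred = F @ P_upd @ F.T + Q         # predicted error covariance
    return x_pred, P_pred
```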

Residual Covariance

The residual covariance is given by applying the formal definition of the expectation of the quadratic form of the residual vector $\eta_k$:

$$S_k = E[\eta_k\, \eta_k^{\prime}] = H_k\, P_{k|k-1}\, H_k^{\prime} + R_k$$
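
Tying the pieces together, here is a hedged end-to-end example of a few predict/update cycles, reusing the `kf_predict` and `kf_update` sketches above; the model matrices and measurements are purely illustrative:

```python
import numpy as np

# Illustrative constant-velocity model (these matrices are assumptions for the example).
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition matrix F
H = np.array([[1.0, 0.0]])               # observation matrix H
Q = 0.01 * np.eye(2)                     # process noise covariance Q
R = np.array([[0.25]])                   # measurement noise covariance R

x, P = np.zeros(2), np.eye(2)            # initial filtered estimate and covariance
for z in (np.array([1.1]), np.array([2.0]), np.array([2.8])):
    x, P = kf_predict(x, P, F, Q)        # extrapolation step
    S = H @ P @ H.T + R                  # residual covariance S_k of the predicted measurement
    x, P = kf_update(x, P, z, H, R)      # update step
```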