How does a Kalman Filter with Constant Velocity estimate the velocity?

kalman filter

I'm gonna describe the constant velocity example (without acceleration) used in many textbooks.

My state is defined as 2D vector s = [x, v_x], where x describes position and v_x the velocity in x direction. Measurements are only the position. State transition matrix is:

F = 1 1
    0 1

Assume fixed process noise covariance Q = diag(1) and measurement covariance R = diag(1). Now let's say my assumed initial state is s_0 = [0, 0] with covariance P_0 = diag(1), i.e. both position and velocity are 0.

Given the above setup, which part of the Kalman Filter algorithm updates the velocity? How is it possible that the velocity changes between iterations (i.e. after a prediction + update step)? I know for sure that in the prediction step, the state transition keeps constant velocity, so it cannot change there. Our measurements do not include velocity information, just position, so it shouldn't (?) happen in the update step either. So where then?

Sorry, this might be a stupid question. But I implemented it in Python using the filterpy framework (see below) and I cannot explain the behavior.

#!/usr/bin/python
from filterpy.kalman import KalmanFilter as KF
import numpy as np
import matplotlib.pyplot as plt

kf = KF(2,1)

kf.F = np.array([[1,1],    # State Transition Model
                 [0,1]])

kf.H = np.array([[1,0]])   # Measurement Model
kf.x = np.array([[0],[0]]) # Initial state
kf.P *= 1                  # State Uncertainty (diag) 
kf.R *= 1                  # Measurement Noise (diag)
kf.Q *= 1                  # Process Noise     (diag)

T = range(10)
M = []
X = []

for t in T:
    kf.predict()

    m = t + np.random.randn()
    M.append(m)

    kf.update(m)
    X.append(kf.x[0,0])


plt.scatter(T,M,c='red',s=10)
plt.plot(T,X)
plt.show()

Below are 3 iterations. I can see that the velocity DID change after the update step.

--------1. Iteration--------
predict
[[0]
 [0]]
update
[[0.49755657]
 [0.16585219]]                <--- velocity changed, how?
--------2. Iteration--------
predict
[[0.66340876]
 [0.16585219]]
update
[[-0.09533335]
 [-0.21351886]]
--------3. Iteration--------
predict
[[-0.30885221]
 [-0.21351886]]
update
[[1.27699875]
 [0.60554702]]

Best Answer

I understand your confusion, and I see it often when students are first introduced to the Kalman filter (KF): the velocity changes even though you only have position measurements. The short answer is that the position and velocity are correlated, so the velocity is updated indirectly from the position measurements. The correlation comes from the prediction step: the state transition adds the velocity to the position, so the predicted covariance picks up nonzero off-diagonal terms even when the initial covariance is diagonal.
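To make this concrete, here is a minimal NumPy sketch of the first predict and update step for the system in the question, written out by hand rather than through filterpy. The names `x_pred`, `P_pred`, and `x_new` are introduced here for illustration, and the measurement `z` is an assumption: it is back-computed from the first printed update in the question, not taken from the original run.

```python
import numpy as np

# Model from the question (time step of 1); Q, R and P_0 are identity matrices.
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])   # state transition: position += velocity
H = np.array([[1.0, 0.0]])   # measurement model: only position is observed
Q = np.eye(2)                # process noise covariance
R = np.eye(1)                # measurement noise covariance

x = np.zeros((2, 1))         # initial state [position, velocity]
P = np.eye(2)                # initial covariance: diagonal, no correlation yet

# Predict: F mixes the velocity uncertainty into the position,
# which creates the off-diagonal (position/velocity) covariance terms.
x_pred = F @ x
P_pred = F @ P @ F.T + Q     # [[3, 1], [1, 2]]

# Update: the velocity row of the gain is nonzero because P_pred[1, 0] != 0.
S = H @ P_pred @ H.T + R                 # innovation covariance, [[4]]
K = P_pred @ H.T @ np.linalg.inv(S)      # [[0.75], [0.25]]

z = 0.66340876               # ASSUMPTION: measurement back-computed from the question's output
x_new = x_pred + K @ (z - H @ x_pred)

print("K     =", K.ravel())
print("x_new =", x_new.ravel())
```

With these inputs the velocity component of the gain is 0.25, so a quarter of the position innovation is applied to the velocity even though the velocity is never measured.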

I encourage you to work out the equations symbolically for the update step, and you will understand why this occurs. However, I will provide a more general example for linear systems, and I hope you will be able to try working this out for your problem.

Illustration:

Recall that the Kalman gain is given by

$$ K_t = P_t^- H_t^T (H_t P_t^- H_t^T + R_t)^{-1} $$

where $K_t$ is the Kalman gain, $P_t^-$ is the covariance matrix before the measurement, $H_t$ is the measurement model, and $R_t$ is the measurement noise covariance. The updated state estimate is given by

$$ x_t^+ = x_t^- + K_t (z_t - H_t x_t^-) $$

where $x_t^-$ is the state estimate before the measurement, $x_t^+$ is the state estimate after the measurement, and $z_t$ is the measurement. Note that $x_t \in \mathbb{R}^n$ and $z_t \in \mathbb{R}^m$ are vectors, and $P_t \in \mathbb{R}^{n \times n}$ is a matrix with rows and columns equal to the number of states. The observation model $H_t \in \mathbb{R}^{m \times n}$ has one row per element of the measurement vector and one column per state.

Let us look at the Kalman gain in more detail. First, let $S_t = (H_t P_t^- H_t^T + R_t)$ where $S_t$ is the covariance of the innovation. In other words, $S_t$ is the covariance of $z_t - H_t x_t^-$, so rewriting $K_t$, we have

$$K_t = P_t^- H_t^TS_t^{-1}$$

where $S_t \in \mathbb{R}^{m \times m}$ is a square matrix with rows and columns each equal to the length of the measurement vector. Based on this, we know that $K_t \in \mathbb{R}^{n \times m}$ without working out the equations.

Let $p_i$ be the $i$th row of $P_t^-$, and let $h_j$ be the $j$th column of $H_t^T$. Now,

$$ P_t^- H_t^T = \begin{bmatrix} p_1 h_1 & \cdots & p_1 h_m \\ \vdots & \ddots & \vdots \\ p_n h_1 & \cdots & p_n h_m \end{bmatrix} $$

Now, if we let $s_{ij}$ represent the element of $S_t$ at the $i$th row and $j$th column, then the Kalman gain is given by

$$ K_t = \begin{bmatrix} p_1 h_1 & \cdots & p_1 h_m \\ \vdots & \ddots & \vdots \\ p_n h_1 & \cdots & p_n h_m \end{bmatrix} \begin{bmatrix} s_{11} & \cdots & s_{1m} \\ \vdots & \ddots & \vdots \\ s_{m1} & \cdots & s_{mm} \end{bmatrix}^{-1} $$

Now, if we go back and look at $x_t^+$, we can see that states can be updated even if we don't directly measure them: $K_t$ maps the innovation $z_t - H_t x_t^-$ back into the state space, and row $i$ of $P_t^- H_t^T$ is the cross-covariance between state $i$ and the predicted measurement. That row is zero only if state $i$ is uncorrelated with everything being measured, so whenever the variables are correlated, the unmeasured states are corrected as well. As I mentioned, I encourage you to work this out for your system to gain a deeper understanding.
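As a rough sketch of what that working-out looks like for the two-state system in the question, assume $H = \begin{bmatrix} 1 & 0 \end{bmatrix}$, a scalar measurement variance $r$, and write the entries of the predicted covariance as $p_{xx}$, $p_{xv}$, $p_{vv}$ (this entry notation is introduced here, not taken from the question). Then

$$ P_t^- H^T = \begin{bmatrix} p_{xx} \\ p_{xv} \end{bmatrix}, \qquad S_t = p_{xx} + r, \qquad K_t = \frac{1}{p_{xx} + r} \begin{bmatrix} p_{xx} \\ p_{xv} \end{bmatrix} $$

so the velocity component of the update is

$$ v_t^+ = v_t^- + \frac{p_{xv}}{p_{xx} + r} \, (z_t - x_t^-) $$

The velocity therefore changes whenever $p_{xv} \neq 0$. To see where $p_{xv}$ comes from, take a posterior covariance with entries $a$, $b$, $c$ and propagate it through the prediction step with the $F$ from the question:

$$ F \begin{bmatrix} a & b \\ b & c \end{bmatrix} F^T + Q = \begin{bmatrix} a + 2b + c & b + c \\ b + c & c \end{bmatrix} + Q $$

Even with $b = 0$, the predicted off-diagonal term is $c$ plus the off-diagonal of $Q$, which is nonzero as long as the velocity is uncertain. For the values in the question ($a = c = 1$, $b = 0$, $Q = I$, $r = 1$) this gives $P_1^- = \begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$ and $K_1 = \begin{bmatrix} 3/4 & 1/4 \end{bmatrix}^T$, which is consistent with the first printed update, where the velocity correction is one third of the position correction.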