Unscented Kalman Filter – How should I interpret the math of UKF

control theory, kalman filter, optimal control, parameter estimation

I need help understanding the Unscented Kalman Filter (UKF). I understand the regular Kalman Filter and the Extended Kalman Filter, which is the same as the regular Kalman Filter after linearization.

First I'm going to write down the UKF algorithm, and for every equation I write down I will ask a question.

Step 1:

First initialize the estimated state vector $\hat x[k-1]$ and covariance matrix $P[k-1]$ with:

$$\hat x[k-1] = E(x[k])$$
$$P[k-1] = E\left[(x[k] - \hat x[k-1])(x[k] - \hat x[k-1])^T\right]$$

Question 1:

  • Is it OK to assume that the average of the steady-state output measurement $\bar y[k]$ equals $E(x[k])$, and therefore to use $\hat x[k-1] = \bar y[k]$ at the beginning?
  • $\hat x[k-1]$ is described as best initial guess before measurement, so what should $x[k]$ be then?

Step 2:

Now we need to talk about the sigma points $\hat x^{(i)}[k-1]$. First, sigma point 0 is:

$$\hat x^{(0)}[k-1] = \hat x[k-1]$$

The rest of the sigma points $\hat x^{(i)}[k-1]$ get their values from:

$$\hat x^{(i)}[k-1] = \hat x[k-1] + \Delta x^{(i)}, \quad i = 1, \dots, 2M$$
$$\Delta x^{(i)} = (\sqrt{cP[k-1]})_{(i)}, \quad i = 1, \dots, M$$
$$\Delta x^{(i+M)} = -(\sqrt{cP[k-1]})_{(i)}, \quad i = 1, \dots, M$$

Here $c$ is the scalar $c = \alpha^2(M+\kappa)$, where $M$ is the number of states, i.e. the dimension of the state vector $x[k]$, and $\kappa$ and $\alpha$ are freely chosen tuning parameters. Normally $\alpha$ is a small positive number and $\kappa$ is usually 0.

Question 2:

  • Declaring the first sigma point as the estimated state vector $\hat x^{(0)}[k-1] = \hat x[k-1]$ is easy, but how do I get the rest of the sigma points? I'm talking about the expression $(\sqrt{cP[k-1]})_{(i)}$.
  • $P[k-1]$ is a matrix, and taking the square root of a matrix as in $(\sqrt{cP[k-1]})_{(i)}$ does not create a vector $\Delta x^{(i)}$... or does it?
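As an illustration of Step 2, here is a minimal NumPy sketch of sigma-point generation. It assumes the matrix square root is taken as the lower Cholesky factor of $cP$ (one common choice, not the only valid one) and that $(\cdot)_{(i)}$ means the $i$-th column of that factor. The function name `sigma_points` and the example numbers are hypothetical.

```python
import numpy as np

def sigma_points(x_hat, P, alpha=1e-3, kappa=0.0):
    """Return the 2M+1 sigma points as rows of an array of shape (2M+1, M)."""
    M = x_hat.size
    c = alpha**2 * (M + kappa)
    # Assumption: use the Cholesky factor S with S @ S.T == c * P as the square root.
    S = np.linalg.cholesky(c * P)
    points = np.empty((2 * M + 1, M))
    points[0] = x_hat                      # sigma point 0 is the estimate itself
    for i in range(M):
        delta = S[:, i]                    # i-th column is the vector Delta x^(i)
        points[1 + i] = x_hat + delta
        points[1 + M + i] = x_hat - delta
    return points

# Hypothetical two-state example
x_hat = np.array([1.0, 2.0])
P = np.array([[0.5, 0.1], [0.1, 0.3]])
pts = sigma_points(x_hat, P, alpha=1.0, kappa=0.0)
print(pts.shape)
```

Each column of the factor is an ordinary vector of length $M$, so the square root of the matrix yields $M$ vectors, and mirroring them around the estimate gives the remaining $2M$ sigma points.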

Step 3:

For all Kalman filters, we need a dynamical model. Assume that it is:

$$x[k+1] = f(x[k], u[k])$$
$$y[k] = h(x[k], u[k])$$

Here $u[k]$ is the input signal, $y[k]$ is the measured output, $h()$ is the measurement function and $f()$ is the dynamical system. So the task here must be to pass each sigma point through $h()$ as the state vector to compute the output sigma points $\hat y^{(i)}[k]$:

$$\hat y^{(i)}[k-1] = h(\hat x^{(i)}[k-1], u_m[k]), \quad i = 0, \dots, 2M$$

Question 3:

  • Most of the time, I don't have a measurement function $h()$, only a dynamical system $f()$. Is it OK to find the estimated output if my model looks like this instead? I'm using the SINDy algorithm, which can identify a nonlinear state space model $f()$ from data. I recommend it over linear system identification.
  • What is $u_m[k]$?

Step 4:

Now that I have the output sigma points $\hat y^{(i)}[k]$, I must find the estimated output $\hat y[k]$.

$$\hat y[k] = \sum^{2M}_{i=0}W^{(i)}_M \hat y^{(i)}[k-1]$$
$$W^{(0)}_M = 1 - \frac{M}{\alpha^2(M+\kappa)}$$
$$W^{(i)}_M = \frac{1}{2\alpha^2(M+\kappa)}, \quad i = 1, \dots, 2M$$

Question 4:

  • Where is the index of $\frac{M}{2\alpha^2(M+\kappa)}$ ? I don't see any $i$ here.

Step 5:

Now we need to find the covariance matrix for output $P_y$

$$P_y = \sum^{2M}_{i=0}W^{(i)}_c(\hat y^{(i)}[k-1] - \hat y[k])(\hat y^{(i)}[k-1] - \hat y[k])^T + R[k]$$
$$W^{(0)}_c = 2 - \alpha^2 + \beta - \frac{M}{\alpha^2(M+\kappa)}$$
$$W^{(i)}_c = \frac{1}{2\alpha^2(M+\kappa)}, \quad i = 1, \dots, 2M$$

Here $\beta$ is a tuning parameter that encodes prior knowledge about the distribution of the state vector; for a Gaussian distribution, $\beta = 2$ is optimal.
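The weights from Steps 4 and 5 can be sketched as follows. This is only an illustration of the formulas above; the function name `ukf_weights` and the parameter values in the example are hypothetical.

```python
import numpy as np

def ukf_weights(M, alpha=1e-3, kappa=0.0, beta=2.0):
    """Mean weights W_M and covariance weights W_c for 2M+1 sigma points."""
    c = alpha**2 * (M + kappa)
    # Points 1..2M all share the same weight, which is why no index i appears in it.
    W_m = np.full(2 * M + 1, 1.0 / (2.0 * c))
    W_c = W_m.copy()
    # Only the weight of sigma point 0 differs.
    W_m[0] = 1.0 - M / c
    W_c[0] = (2.0 - alpha**2 + beta) - M / c
    return W_m, W_c

# Hypothetical tuning: alpha = 1, kappa = 1, beta = 2 for a two-state system
W_m, W_c = ukf_weights(M=2, alpha=1.0, kappa=1.0, beta=2.0)
print(W_m.sum())
```

The mean weights sum to one for any valid choice of $\alpha$ and $\kappa$, since $1 - \frac{M}{c} + 2M \cdot \frac{1}{2c} = 1$ with $c = \alpha^2(M+\kappa)$.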

Question 5:

  • How can I find $R[k]$?

Step 6:

Compute a covariance matrix $P_{xy}$

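$$P_{xy} = \frac{1}{2\alpha^2(M+\kappa)} \sum^{2M}_{i=1}(\hat x^{(i)}[k-1] - \hat x[k-1])(\hat y^{(i)}[k-1] - \hat y[k])^T$$

Notice that the sum starts at $i = 1$: the $i = 0$ term vanishes because $\hat x^{(0)}[k-1] = \hat x[k-1]$. The prefactor $\frac{1}{2\alpha^2(M+\kappa)}$ is already the weight $W^{(i)}_c$ for $i \geq 1$, so it is applied only once.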

Question 6:

  • No question
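Steps 3 to 6 can be sketched together in NumPy. This is a rough illustration, not a reference implementation: it assumes a Cholesky factor as the matrix square root, and it applies the common weight $\frac{1}{2\alpha^2(M+\kappa)}$ once per term in the cross-covariance, which is an assumption about the intended form of Step 6. The function name `output_statistics`, the linear measurement matrix `H` and all numbers are hypothetical, chosen so the result can be checked by hand.

```python
import numpy as np

def output_statistics(X, x_hat, h, u, R, W_m, W_c):
    """Push sigma points (rows of X) through h and summarize the outputs."""
    Y = np.array([h(x, u) for x in X])       # output sigma points, one per row
    y_hat = W_m @ Y                           # weighted mean (Step 4)
    Dy = Y - y_hat
    Dx = X - x_hat
    P_y = (W_c[:, None] * Dy).T @ Dy + R      # output covariance (Step 5)
    # Step 6: the i = 0 term is zero because X[0] == x_hat, so start at i = 1.
    P_xy = W_c[1] * (Dx[1:].T @ Dy[1:])
    return y_hat, P_y, P_xy

# Hypothetical linear example: y = H x, measuring only the first state
M, alpha, kappa, beta = 2, 1.0, 1.0, 2.0
c = alpha**2 * (M + kappa)
x_hat = np.array([1.0, -1.0])
P = np.array([[0.4, 0.1], [0.1, 0.2]])
S = np.linalg.cholesky(c * P)                 # assumed square root of c * P
X = np.vstack([x_hat, x_hat + S.T, x_hat - S.T])
W_m = np.full(2 * M + 1, 1.0 / (2.0 * c))
W_m[0] = 1.0 - M / c
W_c = np.full(2 * M + 1, 1.0 / (2.0 * c))
W_c[0] = (2.0 - alpha**2 + beta) - M / c
H = np.array([[1.0, 0.0]])
R = np.array([[0.05]])
y_hat, P_y, P_xy = output_statistics(X, x_hat, lambda x, u: H @ x, None, R, W_m, W_c)
```

For a linear measurement function the sketch reproduces the ordinary Kalman filter quantities $H\hat x$, $HPH^T + R$ and $PH^T$, which is a convenient sanity check before plugging in a nonlinear $h()$.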

Step 7:

Now it's time to find the Kalman gain $K$.

$$K = P_{xy} P^{-1}_y$$
$$\hat x[k] = \hat x[k-1] + K(y[k] - \hat y[k])$$
$$P[k] = P[k-1] - KP_yK^T_k$$

Question 7:

  • What is $K^T_k$? I know that it's the transpose of $K_k$, but where does the $k$ come from?
  • Computing the inverse $P^{-1}_y$ is not efficient. Do you recommend an algorithm to solve for $K$ from $KP_y = P_{xy}$?
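One way to avoid forming the inverse explicitly is to transpose the equation and hand it to a linear solver. The sketch below uses `numpy.linalg.solve`, which factorizes $P_y$ internally. The function name `kalman_gain` and the matrices are hypothetical.

```python
import numpy as np

def kalman_gain(P_xy, P_y):
    """Solve K @ P_y = P_xy for K without computing inv(P_y)."""
    # K @ P_y = P_xy  is equivalent to  P_y.T @ K.T = P_xy.T
    return np.linalg.solve(P_y.T, P_xy.T).T

# Hypothetical example: three states, two outputs
P_y = np.array([[2.0, 0.5], [0.5, 1.0]])
P_xy = np.array([[1.0, 0.2], [0.3, 0.4], [0.0, 0.1]])
K = kalman_gain(P_xy, P_y)
print(K.shape)
```

Because $P_y$ is a covariance matrix, it is symmetric and normally positive definite, so a Cholesky-based solver would also be a reasonable choice here.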

Step 8:

Now that we have the current estimated state $\hat x[k]$ from the previous estimated state $\hat x[k-1]$, we are going to predict the future estimated state $\hat x[k+1]$.

$$\hat x^{(0)}[k] = \hat x[k]$$
$$\hat x^{(i)}[k] = \hat x[k] + \Delta x^{(i)}, \quad i = 1, \dots, 2M$$
$$\Delta x^{(i)} = (\sqrt{cP[k]})_{(i)}, \quad i = 1, \dots, M$$
$$\Delta x^{(i+M)} = -(\sqrt{cP[k]})_{(i)}, \quad i = 1, \dots, M$$

Question 8:

  • No question

Step 9:

Compute the predicted sigma points from our dynamical model.

$$\hat x^{(i)}[k+1] = f(\hat x^{(i)}[k], u_s[k]), \quad i = 0, \dots, 2M$$

Question 9:

  • What is $u_s[k]$?

Step 10:

Compute these. I don't know why…

$$\hat x[k+1] = \sum^{2M}_{i=0}W^{(i)}_M \hat x^{(i)}[k+1]$$
$$W^{(0)}_M = 1 - \frac{M}{\alpha^2(M+\kappa)}$$
$$W^{(i)}_M = \frac{1}{2\alpha^2(M+\kappa)}, \quad i = 1, \dots, 2M$$

Question 10:

  • No question

Step 11:

Compute this and then you're done.

$$P[k+1] = \sum^{2M}_{i=0}W^{(i)}_c(\hat x^{(i)}[k+1] - \hat x[k+1])(\hat x^{(i)}[k+1] - \hat x[k+1])^T + Q[k]$$
$$W^{(0)}_c = 2 - \alpha^2 + \beta - \frac{M}{\alpha^2(M+\kappa)}$$
$$W^{(i)}_c = \frac{1}{2\alpha^2(M+\kappa)}, \quad i = 1, \dots, 2M$$

Question 11:

  • How do I get $Q[k]$?
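The prediction in Steps 8 to 11 can be sketched as a single function. This is a minimal illustration under stated assumptions: the matrix square root is a Cholesky factor, each sigma point of the current estimate is propagated through $f()$, and the model used in the example is a hypothetical linear one so the output can be compared with the ordinary Kalman filter prediction. The names `ukf_predict`, `A`, `B` and all numbers are made up for the example.

```python
import numpy as np

def ukf_predict(x_hat, P, f, u, Q, alpha=1.0, kappa=0.0, beta=2.0):
    """Propagate the estimate (x_hat, P) one step through the model f."""
    M = x_hat.size
    c = alpha**2 * (M + kappa)
    S = np.linalg.cholesky(c * P)             # assumed square root of c * P
    # Step 8: the estimate, then estimate +/- each column of S (rows of S.T).
    X = np.vstack([x_hat, x_hat + S.T, x_hat - S.T])
    W_m = np.full(2 * M + 1, 1.0 / (2.0 * c))
    W_c = W_m.copy()
    W_m[0] = 1.0 - M / c
    W_c[0] = (2.0 - alpha**2 + beta) - M / c
    # Step 9: push every sigma point through the dynamical model.
    X_next = np.array([f(x, u) for x in X])
    # Step 10: weighted mean of the propagated points.
    x_next = W_m @ X_next
    # Step 11: weighted covariance of the propagated points plus process noise.
    D = X_next - x_next
    P_next = (W_c[:, None] * D).T @ D + Q
    return x_next, P_next

# Hypothetical linear model x[k+1] = A x[k] + B u[k]
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.0, 0.1])
x_hat = np.array([1.0, 2.0])
P = np.array([[0.5, 0.1], [0.1, 0.3]])
Q = 0.01 * np.eye(2)
u = 0.5
x_next, P_next = ukf_predict(x_hat, P, lambda x, u: A @ x + B * u, u, Q)
```

Steps 10 and 11 are the weighted sample mean and covariance of the propagated sigma points; for a linear model they reduce to $A\hat x + Bu$ and $APA^T + Q$, the same prediction the regular Kalman filter would give.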

Now we have the next covariance matrix $P[k+1]$ and the next estimated state vector $\hat x[k+1]$. That means when we jump back to Step 2, $P[k]$ and $\hat x[k]$ become $P[k-1]$ and $\hat x[k-1]$, and the current $P[k-1]$ and $\hat x[k-1]$ are no longer useful to us because they become $P[k-2]$ and $\hat x[k-2]$.

Our goal is to find the state $\hat x[k]$.

Equations can be found at MathWorks: Extended and Unscented Kalman Filter Algorithms for Online State Estimation

Please correct me if I'm wrong or wrote some errors in the math.

Best Answer

Question 1. You mean $\hat{x}[0]$ and $P[0]$? They are just guesses. You can use your first measurement for $\hat{x}[0]$ and give a possible error value $P[0]$ depending on your measurement accuracy. $k$ starts from 1.

Question 2. You use the $i$th column of the matrix.

Question 3. You have to have an $h$, maybe it is just the identity function and you measure all the states? $u_m[k]$ is just the input, they just give a different name for unknown reasons I think.

Question 4. This choice of weights is independent of the index.

Question 5. $R[k]$ is the covariance of the measurement error. You can use the accuracy knowledge of the measurement device to determine it.

Question 7. $k$ is just a typo. You can use something like the LU decomposition to solve for $K$. But generally the number of outputs is small, so $P_y$ is a small square matrix.

Question 9. $u_s[k]$ is just the input.

Question 11. $Q[k]$ is the process noise covariance, which is something you know as in the linear KF.

I believe Wikipedia has a good explanation: https://en.wikipedia.org/wiki/Kalman_filter#Unscented_Kalman_filter
