[Math] Nonlinear Least Squares vs. Extended Kalman Filter

Tags: bayesian-network, kalman-filter, least-squares, nonlinear-optimization, optimization

What is the relationship between nonlinear least squares and the Extended Kalman Filter (EKF)? I've learned both topics separately and thought I understood them, but I'm now in a class where the EKF (assuming no state dynamics/process model) is presented as a form of nonlinear least squares, and I'm getting confused.

Am I right in thinking that the EKF is like a recursive form of Gauss-Newton or Levenberg-Marquardt, where you update the state estimate with a single Newton step for each measurement? This makes it apparent that the EKF is a worse estimator than running Gauss-Newton/Levenberg-Marquardt over the full batch of data since the initial measurements are integrated with poor linearization points that are never updated.

If you did just run a batch Gauss-Newton/Levenberg-Marquardt, how would you obtain an uncertainty estimate as in the EKF?

Best Answer

Yes, the EKF can be understood as a recursive form of the Gauss-Newton update; see here.
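As a minimal sketch of that equivalence (the range-measurement model and all numbers below are illustrative assumptions, not from the original post): an EKF measurement update equals one Gauss-Newton step on the MAP cost, started at the prior mean.

```python
import numpy as np

# Toy setup: 2-D state observed through a nonlinear range measurement h(x) = ||x||.
def h(x):
    return np.array([np.linalg.norm(x)])

def H_jac(x):
    return (x / np.linalg.norm(x)).reshape(1, -1)  # Jacobian of h at x

x0 = np.array([1.0, 2.0])          # prior mean
P0 = np.diag([0.5, 0.5])           # prior covariance
R  = np.array([[0.1]])             # measurement noise covariance
z  = np.array([2.5])               # observed range

# --- EKF measurement update, linearized at the prior mean ---
H = H_jac(x0)
S = H @ P0 @ H.T + R               # innovation covariance
K = P0 @ H.T @ np.linalg.inv(S)    # Kalman gain
x_ekf = x0 + K @ (z - h(x0))

# --- One Gauss-Newton step on the MAP cost, started at x0 ---
# J(x) = (x-x0)^T P0^{-1} (x-x0) + (z-h(x))^T R^{-1} (z-h(x))
A = np.linalg.inv(P0) + H.T @ np.linalg.inv(R) @ H   # Gauss-Newton Hessian
b = H.T @ np.linalg.inv(R) @ (z - h(x0))             # gradient; prior term vanishes at x0
x_gn = x0 + np.linalg.solve(A, b)

print(np.allclose(x_ekf, x_gn))    # the two updates coincide
```

The two expressions are related by the matrix inversion lemma, which rewrites the information-form Gauss-Newton step as the covariance-form Kalman gain.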

The consequence is that we need to iterate at each time step to converge to the best solution, just as in Gauss-Newton optimization. This is why the Iterated EKF exists.
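A sketch of that iteration, reusing the same hypothetical range-measurement toy problem (all values are illustrative assumptions): the Iterated EKF relinearizes the measurement at the current iterate rather than at the prior mean, which is exactly a Gauss-Newton inner loop.

```python
import numpy as np

def h(x):
    return np.array([np.linalg.norm(x)])

def H_jac(x):
    return (x / np.linalg.norm(x)).reshape(1, -1)

x0 = np.array([1.0, 2.0])   # prior mean
P0 = np.diag([0.5, 0.5])    # prior covariance
R  = np.array([[0.1]])      # measurement noise covariance
z  = np.array([2.5])        # observed range

x = x0.copy()
for _ in range(20):                        # Gauss-Newton iterations
    H = H_jac(x)                           # Jacobian at the *current* iterate
    S = H @ P0 @ H.T + R
    K = P0 @ H.T @ np.linalg.inv(S)
    x_new = x0 + K @ (z - h(x) - H @ (x0 - x))
    converged = np.allclose(x_new, x, atol=1e-10)
    x = x_new
    if converged:
        break

P = (np.eye(2) - K @ H) @ P0               # covariance from the final linearization
```

At the fixed point the gradient of the MAP cost vanishes, so the IEKF converges to the same stationary point Gauss-Newton would.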

Full batch estimation is more powerful than an EKF because we can relinearize around the current working point. This is why the Iterated EKF with an augmented state vector exists: it allows changing the linearization point for all states in the augmented state vector before marginalizing them out. Augmenting the state vector over the entire set of states makes the filter algebraically equivalent to full batch estimation.
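The marginalization step an augmented-state filter performs after relinearization can be sketched as a Schur complement on the information (Hessian) matrix. The matrix and block partitioning below are illustrative assumptions:

```python
import numpy as np

# Marginalize old states out of an SPD information matrix via the Schur
# complement.  Block 'm' is marginalized away, block 'k' is kept.
rng = np.random.default_rng(1)
L = rng.standard_normal((5, 5))
A = L @ L.T + 5 * np.eye(5)            # SPD information matrix over 5 states
m, k = slice(0, 2), slice(2, 5)        # drop states 0-1, keep states 2-4

A_kk, A_km, A_mm = A[k, k], A[k, m], A[m, m]
A_marg = A_kk - A_km @ np.linalg.inv(A_mm) @ A_km.T   # Schur complement

# Consistency check: the covariance of the kept states is unchanged,
# by the block matrix inversion identity.
print(np.allclose(np.linalg.inv(A_marg), np.linalg.inv(A)[k, k]))
```

This is why marginalizing with a stale linearization point locks in the error the question describes: once the block is folded in, its Jacobians can no longer be re-evaluated.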

To your last question: if you run a batch Gauss-Newton or Levenberg-Marquardt estimation, you solve a normal-equation system whose system matrix is the (approximate) Hessian. The inverse of that Hessian is the covariance of the estimate, and the blocks on its diagonal are the marginal covariances of the individual states. (If you instead remove the rows and columns of a state you want to hold fixed from the Hessian and then invert, you obtain the covariances of the remaining states conditioned on that fixed state.)
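A small sketch of that recipe (the Jacobian and noise values are illustrative assumptions; the problem is linear so the covariances are exact):

```python
import numpy as np

# After a batch Gauss-Newton / Levenberg-Marquardt solve, the approximate
# Hessian at the optimum is A = J^T R^{-1} J; its inverse is the covariance
# of the estimate.
rng = np.random.default_rng(0)
J = rng.standard_normal((20, 3))           # stacked measurement Jacobian
R = 0.1 * np.eye(20)                       # measurement noise covariance

A = J.T @ np.linalg.inv(R) @ J             # Gauss-Newton Hessian
cov = np.linalg.inv(A)                     # full covariance of the solution
sigma = np.sqrt(np.diag(cov))              # marginal standard deviations

# Diagonal blocks of inv(A) are *marginal* covariances.  By contrast,
# inverting A with the rows/columns of one state removed yields the
# covariance of the remaining states *conditioned* on that state:
cond_cov_12 = np.linalg.inv(A[1:, 1:])     # cov of states 1,2 given state 0
```

Conditioning can only shrink a Gaussian covariance, so `cond_cov_12` is no larger (in the positive-semidefinite sense) than the corresponding marginal block `cov[1:, 1:]`.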
