Solved – Maximum Likelihood estimation and the Kalman filter

kalman-filter, likelihood, maximum-likelihood, state-space-model, time-series

I know the Kalman filter recursions and can derive them, but what I don't really get is how to estimate the hyperparameters by maximum likelihood.

I understand that running the Kalman filter yields the prediction errors and their variances, which can be used to construct the likelihood function.
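For concreteness, this is the prediction error decomposition: writing $v_t$ for the innovation (one-step prediction error) and $F_t$ for its variance, both produced by the filter recursions, the Gaussian log-likelihood of an $n$-dimensional observation series of length $T$ is

$$\ell(\theta) = -\frac{nT}{2}\log 2\pi - \frac{1}{2}\sum_{t=1}^{T}\left(\log|F_t| + v_t' F_t^{-1} v_t\right),$$

where every $v_t$ and $F_t$ depends on the hyperparameters $\theta$ through the filter, so evaluating $\ell(\theta)$ requires one full pass of the Kalman filter.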

What I don't really get is the order in which these steps are done. Am I supposed to:

Method 1

1) Run the Kalman filter given arbitrary starting values and obtain the likelihood function.

2) Maximize the likelihood function with respect to the hyperparameters of the model.

OR

Method 2

1) Estimate the hyperparameters of the state space model using maximum likelihood.

2) Run the Kalman filter with the hyperparameters set at these estimates.

I found this question, which answers what I need: LogLikelihood Parameter Estimation for Linear Gaussian Kalman Filter. There the hyperparameters are estimated from the likelihood function, which matches the algorithm at the top of p. 8 of these lecture notes. However, the same notes state (p. 10, middle): "Given a set of optimal parameter values, $\theta_{ML}$, it is now worth to explore the paths of unobserved components…".

Is the correct way to conduct the analysis the following?

Method 3

1) Run the Kalman filter given arbitrary starting values and obtain the likelihood function.

2) Maximize the likelihood function with respect to the hyperparameters of the model.

3) Run the Kalman filter again using the ML estimates obtained in step 2), and use the resulting state estimates in the subsequent analysis.

Best Answer

I could be wrong, but what makes sense to me is this:

  1. Define a function that runs the Kalman filtering and prediction recursions and outputs the log-likelihood, built from the innovations $v_t$ and their covariance matrices $F_t$. The log-likelihood in this case is the one described in the Stack Exchange post you refer to. Make sure $Q$, $R$, $\mu_0$ and $A$ are free parameters of that function.
  2. Optimize the function with respect to those parameters by maximizing the log-likelihood.

Essentially yes: the optimization routine starts from some initial parameter values and iterates from there toward parameters that fit the observations. I don't see how you could estimate these parameters first and then run the Kalman filter, since the likelihood you maximize is itself computed by running the filter.
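The two steps above can be sketched as follows. This is a minimal illustration, not the notes' exact algorithm: it assumes a univariate local level model $y_t = \mu_t + \varepsilon_t$, $\mu_t = \mu_{t-1} + \eta_t$, so the only hyperparameters are the two variances (the function name and all variable names are my own choices):

```python
import numpy as np
from scipy.optimize import minimize

def kalman_neg_loglik(log_params, y):
    """Run the Kalman filter for a local level model and return the
    negative log-likelihood via the prediction error decomposition."""
    sigma2_eps, sigma2_eta = np.exp(log_params)  # exp() keeps variances positive
    a, P = y[0], 1e6  # roughly diffuse initialisation of the state
    loglik = 0.0
    for t in range(1, len(y)):
        P = P + sigma2_eta          # prediction step
        v = y[t] - a                # innovation
        F = P + sigma2_eps          # innovation variance
        loglik += -0.5 * (np.log(2 * np.pi) + np.log(F) + v**2 / F)
        K = P / F                   # Kalman gain
        a = a + K * v               # update step
        P = P * (1 - K)
    return -loglik

# Simulate data with known variances, then recover them by ML.
rng = np.random.default_rng(0)
T = 500
mu = np.cumsum(rng.normal(0.0, 1.0, T))   # true sigma2_eta = 1
y = mu + rng.normal(0.0, 2.0, T)          # true sigma2_eps = 4
res = minimize(kalman_neg_loglik, x0=np.log([1.0, 1.0]), args=(y,))
sigma2_eps_hat, sigma2_eta_hat = np.exp(res.x)
```

Once the optimizer has found $\theta_{ML}$, you run the filter (and, typically, a smoother) one more time at `np.exp(res.x)` to obtain the state paths used in the subsequent analysis, which is exactly Method 3 from the question.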

Source: https://faculty.washington.edu/eeholmes/Files/Intro_to_kalman.pdf
