Solved – How to make a matrix positive definite

expectation-maximization, factor-analysis

I'm trying to implement an EM algorithm for the following factor analysis model:

$$W_j = \mu+B a_j+e_j \quad\text{for}\quad j=1,\ldots,n$$

where $W_j$ is a $p$-dimensional random vector, $a_j$ is a $q$-dimensional vector of latent variables, and $B$ is a $p\times q$ matrix of parameters.

As a result of other assumptions used for the model, I know that $W_j\sim N(\mu, BB'+D)$ where $D$ is the variance-covariance matrix of the error terms $e_j$, with $D = \operatorname{diag}(\sigma_1^2,\sigma_2^2,\ldots,\sigma_p^2)$.
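For concreteness, here is a minimal NumPy sketch of this generative model under the standard FA assumptions $a_j\sim N(0,I_q)$ and $e_j\sim N(0,D)$, with $a_j$ and $e_j$ independent (the dimensions, seed, and variable names below are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
p, q, n = 5, 2, 200                        # illustrative dimensions

B = rng.normal(size=(p, q))                # p x q loading matrix
mu = rng.normal(size=p)                    # mean vector
sigma2 = rng.uniform(0.5, 1.5, size=p)     # idiosyncratic variances
D = np.diag(sigma2)

a = rng.normal(size=(n, q))                      # latent factors a_j ~ N(0, I_q)
e = rng.normal(size=(n, p)) * np.sqrt(sigma2)    # errors e_j ~ N(0, D)
W = mu + a @ B.T + e                             # rows are W_j = mu + B a_j + e_j

# The sample covariance of the rows approximates BB' + D
print(np.round(np.cov(W, rowvar=False) - (B @ B.T + D), 1))
```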

The EM algorithm iterates between estimating the $B$ and $D$ matrices, and at each iteration I compute the inverse of $BB'+D$ using the new estimates of $B$ and $D$. Unfortunately, during the course of the iterations $BB'+D$ loses its positive definiteness (it shouldn't, because it is a variance-covariance matrix), and this ruins the convergence of the algorithm. My questions are:

  1. Does this situation show that there is something wrong with my algorithm since the likelihood should increase at every step of EM?

  2. What are the practical ways to make a matrix positive definite?

Edit: I'm computing the inverse using the matrix inversion lemma (the Woodbury identity), which states that:

$$(BB'+D)^{-1}=D^{-1}-D^{-1}B (I_q+B'D^{-1}B)^{-1} B'D^{-1}$$

where the right-hand side involves only the inverse of the diagonal matrix $D$ (which is computed elementwise) and the inverse of a single $q\times q$ matrix.
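For illustration, here is one way this identity could be coded in NumPy (the function name and test values are illustrative, not taken from my actual code):

```python
import numpy as np

def woodbury_inverse(B, d):
    """Invert BB' + D via the matrix inversion lemma, where D = diag(d).

    Only a q x q linear system is solved; D is inverted elementwise
    since it is diagonal.
    """
    q = B.shape[1]
    Dinv_B = B / d[:, None]                    # D^{-1} B, a p x q matrix
    core = np.eye(q) + B.T @ Dinv_B            # I_q + B' D^{-1} B
    middle = np.linalg.solve(core, Dinv_B.T)   # (I_q + B'D^{-1}B)^{-1} B'D^{-1}
    return np.diag(1.0 / d) - Dinv_B @ middle

# Quick check against the direct inverse on toy values
rng = np.random.default_rng(1)
B = rng.normal(size=(6, 2))
d = rng.uniform(0.5, 2.0, size=6)
print(np.allclose(woodbury_inverse(B, d),
                  np.linalg.inv(B @ B.T + np.diag(d))))
```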

Best Answer

OK, since you're doing FA I'm assuming that $B$ is of full column rank $q$ and $q<p$. We need a few more details though. This may be a numerical problem; it may also be a problem with your data.

How are you computing the inverse? Do you need the inverse explicitly, or can you re-express the calculation as the solution to a linear system? (i.e., to get $A^{-1}b$, solve $Ax=b$ for $x$, which is typically faster and more stable.)
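For example, in NumPy the two approaches look like this (a toy sketch; since $BB'+D$ should be symmetric positive definite when all is well, a Cholesky-based solve is the natural choice):

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
p = 50
A = rng.normal(size=(p, p))
A = A @ A.T + p * np.eye(p)        # a well-conditioned SPD stand-in for BB' + D
b = rng.normal(size=p)

x_inv = np.linalg.inv(A) @ b       # explicit inverse: avoid if possible
x_solve = np.linalg.solve(A, b)    # direct solve: faster, more stable

# For an SPD matrix a Cholesky solve is cheaper still; as a bonus,
# cho_factor raises LinAlgError exactly when the matrix is not
# positive definite, so it doubles as a check.
x_chol = cho_solve(cho_factor(A), b)

print(np.allclose(x_inv, x_solve), np.allclose(x_solve, x_chol))
```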

What is happening to $D$? Are the estimates really small/0/negative? In some sense $D$ is the critical link: $BB'$ is of course rank-deficient (rank $q<p$) and defines a singular covariance matrix before adding $D$, so you can't invert it. Adding the positive diagonal matrix $D$ technically makes it full rank, but $BB'+D$ could still be horribly ill-conditioned if $D$ is small.
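A quick numerical illustration of that point (toy numbers):

```python
import numpy as np

rng = np.random.default_rng(3)
p, q = 10, 2
B = rng.normal(size=(p, q))

# BB' alone has rank q < p, hence is singular; a shrinking D makes
# BB' + D technically full rank but increasingly ill-conditioned.
for eps in (1.0, 1e-4, 1e-8, 1e-12):
    cov = B @ B.T + eps * np.eye(p)
    print(f"eps={eps:g}  cond={np.linalg.cond(cov):.2e}")
```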

Oftentimes the estimate for the idiosyncratic variances (your $\sigma^2_i$, the diagonal elements of $D$) is near zero or even negative; these are called Heywood cases. See e.g. http://www.technion.ac.il/docs/sas/stat/chap26/sect21.htm (any FA text should discuss this as well; it's a very old and well-known problem). This can result from model misspecification, outliers, bad luck, solar flares... The MLE is particularly prone to this problem, so if your EM algorithm is designed to find the MLE, look out.

If your EM algorithm is approaching a mode with such estimates, it's possible for $BB'+D$ to lose its positive definiteness numerically, I think. There are various solutions; personally I'd prefer a Bayesian approach, but even then you need to be careful with your priors (improper priors, or even proper priors with too much mass near 0, can have the same problem for basically the same reason).
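On your question 2: one common ad hoc repair (not necessarily what I'd recommend over fixing the model itself) is to symmetrize the matrix and clip its eigenvalues at a small positive floor, in the spirit of Higham's nearest-positive-definite projection. A minimal sketch, where the floor value is an arbitrary tolerance and not something prescribed here:

```python
import numpy as np

def clip_to_pd(S, floor=1e-8):
    """Force a symmetric matrix to be positive definite by
    symmetrizing and flooring its eigenvalues at a small value.
    `floor` is an arbitrary tolerance chosen for illustration.
    """
    vals, vecs = np.linalg.eigh((S + S.T) / 2)   # symmetrize first
    return (vecs * np.maximum(vals, floor)) @ vecs.T
```

In the FA setting it is often simpler (and closer to the root cause) to floor the diagonal of $D$ away from zero, e.g. `d = np.maximum(d, 1e-8)` before forming $BB'+D$, though neither trick addresses the underlying Heywood problem.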