[Math] Analytical formula for numerical derivative of the matrix pseudo-inverse

matricesna.numerical-analysis

Is there a simple numerical procedure for obtaining the derivative (with respect to $x$) of the pseudo-inverse of a matrix $A(x)$, without approximations (except for the usual floating-point limitations)? The matrix $\frac{\mathrm{d}}{\mathrm{d}x}A(x)$ is supposed to be known.

In other words, are there analytical formulas that could be numerically evaluated so as to obtain the derivative of the pseudo-inverse? or, what formula would generalize
$$
\frac{\mathrm{d}}{\mathrm{d}x}A^{-1}(x) = -A^{-1}(x) \left(\frac{\mathrm{d}}{\mathrm{d}x}A(x)\right) A^{-1}(x)
$$
for the pseudo-inverse?

I would be happy if this were possible, as this would allow my uncertainty calculation programming package to precisely calculate uncertainties on the pseudo-inverse of matrices whose elements have uncertainties (currently, a numerical differentiation is performed, which may yield imprecise results in some cases).

Any idea would be much appreciated!

Best Answer

The answer is known since at least 1973: a formula for the derivative of the pseudo-inverse of a matrix $A(x)$ of constant rank can be found in

The Differentiation of Pseudo-Inverses and Nonlinear Least Squares Problems Whose Variables Separate. Author(s): G. H. Golub and V. Pereyra. Source: SIAM Journal on Numerical Analysis, Vol. 10, No. 2 (Apr., 1973), pp. 413-432

References 29 and 30 in the above paper contain an earlier formula that can also be used to obtain the same result (papers by P.A. Wedin).

The case of non-constant rank is simple: the pseudo-inverse is not continuous, in this case (see Corollary 3.5 in On the Perturbation of Pseudo-Inverses, Projections and Linear Least Squares Problems. G. W. Stewart. SIAM Review, Vol. 19, No. 4. (Oct., 1977), pp. 634-662).

Here is the formula for a matrix of constant rank (equation (4.12), in the Golub paper):

$$ \frac{\mathrm d}{\mathrm d x} A^+(x) = -A^+ \left( \frac{\mathrm d}{\mathrm d x} A \right) A^+ +A^+ A{^+}^T \left( \frac{\mathrm d}{\mathrm d x} A^T \right) (1-A A^+) + (1-A^+ A) \left( \frac{\mathrm d}{\mathrm d x} A^T \right) A{^+}^T A^+ $$

(for a real matrix).

For complex matrices, the above formula works if Hermitian conjugates are used instead of transposes. I don't have any reference on this (anyone?), but this is verified by all the numerical tests I did (with matrices of various shapes and ranks).