Functions of a Matrix

linear-algebra, matrices, matrix-equations, matrix-calculus, polynomials

I was trying to calculate functions of a matrix: for example, say $M$ is a matrix, and I'm trying to compute $f(M)$.

One of the most common approaches to this problem is diagonalizing the matrix: if $M = X\Lambda X^{-1}$, then $f(M) = X\,f(\Lambda)\,X^{-1}$, where $f(\Lambda)$ is the diagonal matrix whose diagonal elements are $f(\lambda)$, with $\lambda$ ranging over the eigenvalues of $M$.
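For concreteness, here is a small numpy sketch of this approach with $f = \exp$ (the example matrix is arbitrary, and scipy's `expm` is used only to check the result):

```python
import numpy as np
from scipy.linalg import expm    # used only to verify the result

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])       # symmetric, hence diagonalizable

lam, X = np.linalg.eig(M)        # M = X @ diag(lam) @ inv(X)
fM = X @ np.diag(np.exp(lam)) @ np.linalg.inv(X)   # apply f = exp on the diagonal

assert np.allclose(fM, expm(M))
```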

Another common approach is by using the Taylor series.
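For example, truncating the Taylor series of $f = \exp$ after finitely many terms gives a crude sketch like this (the 30-term cutoff is an arbitrary choice, only reasonable for small matrices):

```python
import numpy as np

def exp_taylor(M, terms=30):
    """Truncated Taylor series: exp(M) is approximately the sum of M^k / k! for k < terms."""
    result = np.zeros_like(M, dtype=float)
    term = np.eye(M.shape[0])        # current term, starts at M^0 / 0! = I
    for k in range(terms):
        result += term
        term = term @ M / (k + 1)    # next term: M^(k+1) / (k+1)!
    return result

M = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
print(exp_taylor(M))   # approximately [[cos 1, sin 1], [-sin 1, cos 1]]
```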

However, I came across a third approach, which I use most frequently, and gives me the right answer, but I don't understand the theory or the intuition behind it.

We start by writing the function in a form similar to the Taylor series, but with finitely many terms:

$f(M) = a_0I + a_1M + a_2M^2 + \cdots + a_{n-1}M^{n-1}$

Here, $n$ is the number of distinct (non-degenerate) eigenvalues of the matrix.
Using the Cayley–Hamilton theorem, we know that a matrix satisfies its own characteristic equation. So now we plug the eigenvalues of $M$ into the scalar equation $f(\lambda) = a_0 + a_1\lambda + \cdots + a_{n-1}\lambda^{n-1}$ to figure out the coefficients; this gives us $n$ simultaneous linear equations, which we solve.

Then we plug the values of the coefficients back into the original equation, and hence we have $f(M)$.
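Here is a small sketch of the whole procedure for a $2\times 2$ matrix with distinct eigenvalues, taking $f = \exp$ (the example matrix is arbitrary; scipy's `expm` only checks the answer). The simultaneous equations form a Vandermonde system:

```python
import numpy as np
from scipy.linalg import expm

M = np.array([[4.0, 1.0],
              [2.0, 3.0]])              # eigenvalues 2 and 5, distinct
lam = np.linalg.eigvals(M)              # n = 2 distinct eigenvalues

# f(M) = a0*I + a1*M, so each eigenvalue gives f(lam) = a0 + a1*lam
V = np.vander(lam, increasing=True)     # rows [1, lam_k]
a = np.linalg.solve(V, np.exp(lam))     # solve for a0, a1

fM = a[0] * np.eye(2) + a[1] * M
assert np.allclose(fM, expm(M))
```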

Now, I have quite a few questions about this method. Firstly, what is the intuition behind it, and where is it derived from? Secondly, suppose I have a matrix with 3 eigenvalues, out of which 2 are the same (degeneracy). In this case, the method produces a degree-$1$ polynomial; however, if there had been no such degeneracy, it would have been a degree-$2$ polynomial.

I've been told this is related to the minimal polynomial, but I know that even if there is degeneracy in the eigenvalues, the minimal polynomial can still have degree $n$ for an $n\times n$ matrix. For example, there are $3\times 3$ matrices with 3 eigenvalues, of which 2 are repeated, whose minimal polynomial nevertheless has degree 3 (one has to check each case, but it is possible).

However, in the above method we never repeat an eigenvalue, and the number of coefficients in the polynomial equals the number of distinct eigenvalues. So I don't see how it is related to the minimal polynomial of the matrix.

Can anyone tell me where this third method comes from, and what its limitations are, if any? So far, it has given me the correct answer for every problem I've encountered.

Best Answer

$ \def\e{\varepsilon} \def\p{\partial} \def\o{{\tt1}} $Sylvester Interpolation will work for any matrix, assuming that its eigenvalues are known.
However, the simplest technique I've found is due to Nicholas Higham.
It is based on the eigenvalue decomposition $$\eqalign{ M &= X\Lambda X^{-1} \\ f(M) &= X\;f(\Lambda)\;X^{-1} \\ }$$ But this method won't work when $M$ is defective, since $X$ becomes singular. Even when $M$ is not defective, $X$ may be so ill-conditioned that the computed function is inaccurate.

In that case, add a small random matrix $\big\{\|R\|\sim\e\big\}$ and calculate $f(M\!+\!R)$, whose elements will be within roughly machine epsilon of those of $f(M)$.
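A sketch of this trick on a defective matrix (the example, the perturbation size, and the comparison against scipy's `expm` are illustrative choices; the size of $R$ trades off splitting the eigenvalues against perturbing $f(M)$ itself):

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])                # defective: a single 2x2 Jordan block
R = 1e-8 * rng.standard_normal((2, 2))    # tiny random perturbation

lam, X = np.linalg.eig(M + R)             # M + R is (almost surely) diagonalizable
fM = (X @ np.diag(np.exp(lam)) @ np.linalg.inv(X)).real

print(np.abs(fM - expm(M)).max())         # tiny error, on the order of ||R||
```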


To answer your question about adapting the Cayley-Hamilton-Sylvester method $$\eqalign{ p(z) &= \sum_{j=0}^{n-1} a_j z^j \qquad&\big({\rm finite\;polynomial}\big) \\ f(M) &= p(M) \\ }$$ to handle eigenvalues $\{\lambda_k\}$ with algebraic multiplicities $m_k>\o$: set up and solve the following system of equations (one set of equations for each distinct eigenvalue $\lambda_k$) $$\eqalign{ \p^i f(\lambda_k) &= \p^i p(\lambda_k) \qquad&\big({\rm for}\;0\le i<m_k\big) \\ }$$ where $\p^i f(z)$ denotes the $i^{th}$ derivative of $f(z)\;$ and $\;\p^0 f(z) = f(z)$
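For a concrete instance of these confluent conditions, take $f=\exp$ and a $2\times2$ Jordan block with eigenvalue $\lambda_1=1$ of multiplicity $m_1=2$, so the system is $p(1)=f(1)$ and $p'(1)=f'(1)$ with $p(z)=a_0+a_1z$ (the numpy setup below is just one way to write it down):

```python
import numpy as np
from scipy.linalg import expm

M = np.array([[1.0, 1.0],
              [0.0, 1.0]])        # eigenvalue 1 with algebraic multiplicity 2

# Conditions on p(z) = a0 + a1*z at the repeated eigenvalue lam = 1:
#   p(1)  = a0 + a1 = exp(1)
#   p'(1) =      a1 = exp(1)      since (exp)' = exp
A = np.array([[1.0, 1.0],         # coefficients of (a0, a1) in p(1)
              [0.0, 1.0]])        # coefficients of (a0, a1) in p'(1)
b = np.array([np.e, np.e])
a0, a1 = np.linalg.solve(A, b)

fM = a0 * np.eye(2) + a1 * M      # p(M)
assert np.allclose(fM, expm(M))   # exp(M) = e * [[1, 1], [0, 1]]
```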