Let $A$ be an $n\times n$ matrix. Consider the infinite sum $$B=\sum_{k=1}^\infty\frac{A^kt^k}{k!}$$ Each term $\dfrac{A^kt^k}{k!}$ is an $n\times n$ matrix. Does the sum $B$ always converge? (i.e. does the sum for each of the $n^2$ entries always converge?)
[Math] Convergence of exponential matrix sum
convergence-divergence, exponentiation, matrices
Related Solutions
If the Jordan normal form of the matrix $A$ is $J$, then you have $A=PJP^{-1}$ and this yields $A^n=PJ^nP^{-1}$. So we only have to ask when powers of the Jordan blocks of the given matrix converge. The structure of powers of Jordan blocks is relatively simple.
It is relatively easy to see that the power $A^n$ converges to the zero matrix if $|\lambda|<1$ for all eigenvalues $\lambda$ of $A$. (See e.g. the result at the end of this text.)
If the only eigenvalue with absolute value 1 is 1 and the corresponding Jordan blocks have size 1, then it converges. If there is a Jordan block corresponding to 1 of size at least 2, then the power does not converge. (This was pointed out by Ted in the comments, thanks for the correction.)
If there are complex eigenvalues different from 1 with $|\lambda|=1$ then the power does not converge.
If it has an eigenvalue with $|\lambda|>1$, it does not converge.
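For concreteness, here is a quick numerical sketch of these cases (my own illustration, not part of the original argument), taking $2\times2$ Jordan blocks as stand-ins:

```python
import numpy as np

# Powers of a 2x2 Jordan block J(lam) = [[lam, 1], [0, lam]], one per case above.
def jordan_block(lam):
    return np.array([[lam, 1.0], [0.0, lam]])

for lam, label in [(0.5, "|lambda|<1: J^n -> 0"),
                   (1.0, "lambda=1, block size 2: diverges (superdiagonal grows)"),
                   (1.2, "|lambda|>1: diverges")]:
    J50 = np.linalg.matrix_power(jordan_block(lam), 50)
    print(label, "\n", J50)
```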
I've been able to prove only a special case. Here it is, together with a host of useful theorems and definitions used, directly or indirectly, in its proof (Theorem 6.2). Only a handful of results are proved (Theorem 6.2 being one of them), since the rest were deemed straightforward.
You may also wish to consult some of the sources listed in the following Math Stack Exchange post: The matrix exponential: Any good books?
Definition 1 We shall use $\mathbf{M}$ to denote the class of infinite-dimensional, real-valued matrices described in the original post. Unless explicitly stated otherwise, we shall use the words matrix and matrices exclusively to denote members of $\mathbf{M}$.
Conventions Capital English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate matrices. Lower-case English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate real numbers. Equalities will always imply existence; for instance, if we write "Let $AB=C$", we mean "Suppose the product of $A$ and $B$, in this order, is well defined and equals $C$", and if we write "... then $\sum_{n\in\mathbb{N}_0}A_n=B$", we mean "... then the series $\sum_{n\in\mathbb{N}_0}A_n$ converges and sums to $B$."
Theorem 1 $\mathbf{M}$ is a real vector space with respect to the operations of scalar multiplication and addition described in the original post and with $0$ (likewise described in the original post) serving as the neutral element w.r.t. matrix addition.
Definition 2
$|A|:=(|A_{i,j}|)_{i,j\in\mathbb{N}_0}$.
$0\leq A$ shall mean that $A=|A|$.
$A\leq B$ shall mean that $0\leq B-A$.
$A$ is bounded iff $\{A_{i,j} : i,j\in\mathbb{N}_0\}$ is bounded.
Theorem 2 Multiplication of matrices
Multiplicative neutral element. $I$ is the unique neutral element w.r.t. matrix multiplication.
Associativity w.r.t. scalar multiplication. If $a\neq0$, then $(aB)C=D\iff a(BC)=D\iff B(aC)=D$. If $a=0$, the implications whose hypothesis is $a(BC)=D$ remain valid, but their converses are true only when the product $BC$ is well defined.
Distributivity. (i) If $A_1B=C_1$ and $A_2B=C_2$, then $(A_1+A_2)B=C_1+C_2$, (ii) If $BA_1=C_1$ and $BA_2=C_2$, then $B(A_1+A_2)=C_1+C_2$
Associativity. If either $$|A||B|=P\mathrm{\, and\, }P|C|=Q$$ or $$|B||C|=S\mathrm{\, and\, }|A|S=T$$ then there's some $D$ such that $$(AB)C=D=A(BC)$$
The Binomial theorem. Let $m\in\mathbb{N}_0$. If $0\leq A,B$ commute (i.e. $AB=C=BA$ for some $C$), and for every $n,k\in\mathbb{N}_0$ such that $n+k\leq m$, $A^nB^k=C_{n,k}$, then $$(A+B)^{m}=\sum_{n=0}^m\binom{m}{n}C_{n,m-n}$$
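As a quick sanity check of the binomial theorem above, here is a sketch using small finite nonnegative matrices as stand-ins for members of $\mathbf{M}$ (the matrices `A` and `B` are hypothetical examples chosen to commute):

```python
import numpy as np
from math import comb

# Check (A+B)^m = sum_n C(m,n) A^n B^(m-n) for two commuting nonnegative matrices.
A = np.array([[0.5, 0.5], [0.5, 0.5]])
B = np.eye(2) * 0.3          # a scalar multiple of I commutes with everything
assert np.allclose(A @ B, B @ A)

m = 5
lhs = np.linalg.matrix_power(A + B, m)
rhs = sum(comb(m, n) * np.linalg.matrix_power(A, n)
          @ np.linalg.matrix_power(B, m - n) for n in range(m + 1))
print(np.allclose(lhs, rhs))  # True
```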
Theorem 3 Scalar multiplication of infinite sequences of matrices
If $\lim_{n\rightarrow\infty}a_n=b$, $\lim_{n\rightarrow\infty}(a_nC)=bC$.
If $\lim_{n\rightarrow\infty}A_n=B$, $\lim_{n\rightarrow\infty} (cA_n)=cB$.
If $\sum_{n=0}^\infty A_n=B$, $\sum_{n=0}^\infty cA_n=cB$.
Theorem 4 Absolute convergence and rearrangement of a series
If $\sum_{n=0}^\infty A_n=B$ and $A'=(A_0,0,A_1,0,\dots)$, $$\sum_{n=0}^\infty A'_n=B$$
If $\sum_{n=0}^\infty |A_n|=B$, $\sum_{n=0}^\infty A_n=C$ for some $C$.
If $\sum_{n=0}^\infty |A_n|=B$ and $(A_{n_i})_{i\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty A_{n_i}=\sum_{n=0}^\infty A_n$.
If $(A_{i,j})_{i,j\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty\sum_{j=0}^\infty|A_{i,j}|=B$ iff $\sum_{n=0}^\infty |A_n|=B$ and then $$\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=\sum_{j=0}^\infty\sum_{i=0}^\infty A_{i,j}=\sum_{n=0}^\infty A_n$$
Definition 3 Stochastic matrix. $A$ is stochastic iff $0\leq A$ and, for all $i\in\mathbb{N}_0$, $\sum_{j=0}^\infty A_{i,j}=1$.
Theorem 5 Stochastic matrices
If $A, B$ are stochastic, there's a stochastic matrix $C$, such that $AB=C$.
If $A$ is stochastic, $e^{tA}=B$ for some $B$ and for all $i,j\in\mathbb{N}_0$, $|B_{i,j}|\leq e^{|t|}$.
$e^{tI}=e^tI$
Proof of Theorem 5.1
We shall make use of the following lemma, whose proof is left for the reader.
Lemma Let $0\leq A$ and let $(a_n,b_n)_{n\in\mathbb{N}_0}$ be a sequence of pairs of natural numbers ($\mathbb{N}_0$) such that $\lim_{n\rightarrow\infty}a_n=\lim_{n\rightarrow\infty}b_n=\infty$. Then $\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=s$ for some $s\in\mathbb{R}$ iff $\lim_{n\rightarrow\infty}\sum_{i=0}^{a_n}\sum_{j=0}^{b_n}A_{i,j}=t$ for some $t\in\mathbb{R}$, and then $s=t$.
Now, let $A, B$ be stochastic. It is easy to see that $C:=AB$ is well defined and $0\leq C$. All that remains to show is that each of $C$'s rows sums to $1$. Let $i\in\mathbb{N}_0$ be a row index. We need to show that $\sum_{j=0}^\infty\sum_{k=0}^\infty A_{i,k}B_{k,j}=1$. Choose an ascending sequence $(a_n)_{n\in\mathbb{N}_0}$ in $\mathbb{N}_0$ such that $\sum_{j=0}^{a_n}B_{k,j}>1-\frac{1}{n+1}$ for every $n\in\mathbb{N}_0$ and every $k\leq n$ (possible, since each of the finitely many rows of $B$ with index $k\leq n$ sums to $1$), and set $b_n:=n$. Then $a_n,b_n\underset{n\rightarrow\infty}{\rightarrow}\infty$ and so, by the lemma, it is enough to show that $\lim_{n\rightarrow\infty}C_n=1$ with $C_n:=\sum_{j=0}^{a_n}\sum_{k=0}^{b_n}A_{i,k}B_{k,j}$. Indeed, $$\underbrace{(1-\frac{1}{n+1})\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}\leq C_n=\sum_{k=0}^{b_n}A_{i,k}\sum_{j=0}^{a_n}B_{k,j}\leq\underbrace{\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}$$ $\square$
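A small numerical sketch of Theorem 5.1, with random finite stochastic matrices standing in for the infinite-dimensional ones:

```python
import numpy as np

# The product of two (finite stand-in) stochastic matrices is stochastic:
# nonnegative entries, each row summing to 1.
rng = np.random.default_rng(0)
def random_stochastic(n):
    M = rng.random((n, n))
    return M / M.sum(axis=1, keepdims=True)  # normalize each row to sum to 1

A, B = random_stochastic(4), random_stochastic(4)
C = A @ B
print(np.all(C >= 0), np.allclose(C.sum(axis=1), 1.0))  # True True
```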
Definition 4 Exponential partial sums. For all $n\in\mathbb{N}_0, t\in\mathbb{R}$, define
$\exp_n(t):=\sum_{i=0}^n \frac{t^i}{i!}$
$\mathrm{Exp}_n(t):=\exp_n(t) I$
Theorem 6 Additivity of the matrix exponential
$\sum_{n=0}^\infty(\frac{t^n}{n!}A)=e^tA$
If $A$ is stochastic, $e^{sI}e^{tA}=e^{sI+tA}$
Proof of Theorem 6.2
Let $A$ be stochastic. Then
$$\begin{align}e^{sI}e^{tA}&=\sum_{n=0}^\infty\left(\frac{s^n}{n!}e^{tA}\right)\\ &=\sum_{n=0}^\infty\sum_{k=0}^\infty\frac{s^n}{n!}\frac{t^k}{k!}A^k\\ &=\sum_{m=0}^\infty\frac{1}{m!}\sum_{n,k\in\mathbb{N}_0\atop n+k=m}\binom{m}{n}(sI)^n(tA)^k\\ &=\sum_{m=0}^\infty\frac{1}{m!}\left(sI+tA\right)^m\\ &=e^{sI+tA}\end{align}$$
$\square$
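The identity in Theorem 6.2 can be spot-checked numerically; the sketch below uses a hypothetical finite stochastic matrix and SciPy's `expm` (here $sI$ and $tA$ commute, so the identity holds exactly):

```python
import numpy as np
from scipy.linalg import expm

# Check e^{sI} e^{tA} = e^{sI + tA} for a finite stochastic stand-in A.
A = np.array([[0.2, 0.8], [0.6, 0.4]])   # stochastic: rows sum to 1
s, t = 0.7, 1.3
lhs = expm(s * np.eye(2)) @ expm(t * A)
rhs = expm(s * np.eye(2) + t * A)
print(np.allclose(lhs, rhs))  # True
```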
Definition 5 The derivative of a function $\mathbb{R}\rightarrow\mathbf{M}$
Let $\phi:\mathbb{R}\rightarrow\mathbf{M}$ be a matrix-valued function on the real line. Define the function-valued matrix $\Phi$, whose entries are real-valued functions on the real line, by $$\Phi_{i,j}:\mathbb{R}\rightarrow\mathbb{R},\space\space\Phi_{i,j}(x):=(\phi(x))_{i,j}\qquad(i,j\in\mathbb{N}_0)$$
If $\Phi_{i,j}$ is differentiable at $x_0\in\mathbb{R}$ for all $i,j\in\mathbb{N}_0$, we say that $\phi$ is differentiable at $x_0$, and its derivative at this point is defined to be the matrix $$\phi'(x_0):=\left(\Phi'_{i,j}(x_0)\right)_{i,j\in\mathbb{N}_0}$$
Theorem 7 Properties of the matrix exponential
$e^0=I$
If $e^{tA}=B$ for some $0 < t$, then for every $s\in(-t,t)$ there's some $C_s$ such that $e^{sA}=C_s$.
If $0\leq A$ and $e^A=B$ for some $B$, $e^{-A}=C$ for some $C$.
If $e^{tA}=B$ for some $0 < t$, the matrix function $$\phi:\mathbb{R}\rightarrow\mathbf{M},\space\space \phi(s):=e^{sA}$$ is differentiable in the domain $(-t,t)$, and $\phi'(0)=A$.
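Theorem 7.4 can be illustrated numerically; the sketch below approximates $\phi'(0)$ by a central finite difference for a hypothetical finite matrix $A$:

```python
import numpy as np
from scipy.linalg import expm

# phi(s) = e^{sA} has phi'(0) = A; check with a central finite difference.
A = np.array([[-1.0, 1.0], [2.0, -2.0]])
h = 1e-6
derivative_at_0 = (expm(h * A) - expm(-h * A)) / (2 * h)
print(np.allclose(derivative_at_0, A, atol=1e-6))  # True
```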
Definition 6 Infinitesimal generators. An infinitesimal generator is an infinite real-valued matrix $A$ that satisfies the following two conditions:
i) $A_{i,j}\geq0$ for all $i,j\in\mathbb{N}_0$ such that $i \neq j$,
ii) $A_{i,i}=-\sum_{j\neq i}A_{i,j}$ for all $i\in\mathbb{N}_0$
(Infinitesimal generators arise in the theory of probability in the context of continuous-time/discrete-state-space Markov processes.)
Theorem 8 Infinitesimal generator. If $A$ is a bounded infinitesimal generator, there's some stochastic matrix $B$ and some number $0\leq c$ such that $$e^{tA}=e^{-ct}e^{ctB}$$ In fact, you may choose any $c\geq\sup_{i\in\mathbb{N}_0}|A_{i,i}|$ and
$$B = \begin{cases} c^{-1} A+I & \text{if } 0 < c \\ I & \text{otherwise} \end{cases}$$
In particular, $e^{tA}$ converges for all $t$.
Proof See the proof of Klenke's Theorem 17.25. $\square$
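A numerical sketch of the uniformization in Theorem 8, with a hypothetical finite bounded generator:

```python
import numpy as np
from scipy.linalg import expm

# Uniformization: e^{tA} = e^{-ct} e^{ctB} with B = A/c + I, c >= sup_i |A_{i,i}|.
A = np.array([[-2.0, 2.0], [1.0, -1.0]])   # generator: rows sum to 0
c = np.max(np.abs(np.diag(A)))             # c = 2 here
B = A / c + np.eye(2)                      # B is stochastic
t = 0.9
print(np.allclose(expm(t * A), np.exp(-c * t) * expm(c * t * B)))  # True
```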
Theorem 9 The product of an infinitesimal generator and a bounded matrix
$A$ is an infinitesimal generator with $|A_{i,i}| \leq 1$ for every $i \in \mathbb{N}_0$ iff $A+I$ is stochastic.
If $A$ is an infinitesimal generator and $0\leq r$, so is $rA$. If $A$ is bounded, so is $rA$ for every $r$.
If $A$ is an infinitesimal generator and $B$ is bounded, $AB=C$ for some $C$. If $A$ is bounded then $C$ is bounded.
Let $(A^{(n)})_{n\in\mathbb{N}_0}$ be a sequence of infinitesimal generators and let $B$ be bounded. If $\lim_{n\rightarrow\infty}A^{(n)}=C$ and $C$ is an infinitesimal generator, $\lim_{n\rightarrow\infty}(A^{(n)}B)=CB$.
Proof of Theorem 9.3
Let $i,j\in\mathbb{N}_0$. We need to show that $$\lim_{n\rightarrow\infty}\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}=\sum_{k=0}^\infty C_{i,k}B_{k,j}$$
Let $n,N\in\mathbb{N}_0$ be arbitrary, $i < N$. $$\begin{align}\left|\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}-\sum_{k=0}^\infty C_{i,k}B_{k,j}\right|&\leq\underbrace{\left|A_{i,i}^{(n)}-C_{i,i}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{i,j}\right|+\sum_{k=0\atop k\neq i}^N\underbrace{\left|A_{i,k}^{(n)}-C_{i,k}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{k,j}\right|\\&+\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|\end{align}$$
Now, if $0\leq\beta\in\mathbb{R}$ is an upper bound on $B$,
$$\begin{align}\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|&\leq\left(\sum_{k=N+1}^\infty A_{i,k}^{(n)}\right)\beta+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&=\left(-C_{i,i}-\underbrace{(A_{i,i}^{(n)}-C_{i,i})}_{\underset{n\rightarrow\infty}{\rightarrow}0}\right)\beta\\&-\left(\sum_{k=0\atop k\neq i}^N C_{i,k}+\sum_{k=0\atop k\neq i}^N(\underbrace{A_{i,k}^{(n)}-C_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}0})\right)\beta\\&+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&\leq2\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta+\varepsilon\end{align}$$
where both summands in the last expression can be made arbitrarily small by choosing first $N$ and then $n$ large enough.
$\square$
Definition 7 Component-wise limit of a matrix function
Given a set of numbers $\emptyset\neq S\subseteq\mathbb{R}$, a matrix function $$f:S\rightarrow\mathbf{M}$$ and an accumulation point of $S$, $s\in\mathbb{R}\cup\{\pm\infty\}$ [If $s=\infty$, $s$ is an accumulation point of $S$ iff $S$ has no upper bound. If $s=-\infty$, $s$ is an accumulation point of $S$ iff $S$ has no lower bound],
$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$ iff for all $i,j\in\mathbb{N}_0$, $$\lim_{t\rightarrow s\atop t\in S}\left(f(t)\right)_{i,j}=A_{i,j}$$
Theorem 10 Discretization of the limiting process
Under the assumptions of Definition 7,
$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$ iff for every sequence $(t_n)_{n\in\mathbb{N}_0}$ in $S$ that converges to $s$, $$\lim_{n\rightarrow\infty}f(t_n)=A$$
Definition 8 Markov semigroup
A matrix function $$f:[0,\infty)\rightarrow\mathbf{M}$$ is a Markov semigroup iff
$f(0)=I$
For all $t\in(0,\infty)$, $f(t)$ is stochastic
For all $s,t\in[0,\infty)$, $$f(s+t)=f(s)f(t)$$
(Markov semigroups arise in probability theory in the context of continuous-time/discrete-state-space Markov processes.)
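For a bounded infinitesimal generator $A$, the function $t\mapsto e^{tA}$ is the prototypical example (cf. Theorem 8); here is a sketch with a hypothetical finite generator:

```python
import numpy as np
from scipy.linalg import expm

# f(t) = e^{tA} for a bounded generator A: check the Markov semigroup properties.
A = np.array([[-1.0, 1.0], [3.0, -3.0]])
f = lambda t: expm(t * A)
s, t = 0.4, 1.1
print(np.allclose(f(s + t), f(s) @ f(t)),   # semigroup property f(s+t) = f(s) f(t)
      np.allclose(f(t).sum(axis=1), 1.0),   # rows sum to 1
      np.all(f(t) >= 0))                    # nonnegative entries, so f(t) stochastic
```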
Theorem 11 Right-hand derivative of a Markov semigroup
Let $f:[0,\infty)\rightarrow\mathbf{M}$ be a Markov semigroup and let $A$ be an infinitesimal generator, such that $$\lim_{t\downarrow0}\frac{1}{t}(f(t)-I)=A$$
Then $f$ is right-hand differentiable on $[0,\infty)$ and for all $t\in[0,\infty)$, $$\mathrm{D}_R(f, t)=Af(t)$$ where $\mathrm{D}_R(f, t)$ is the right-hand derivative of $f$ at $t$.
Proof
Let $t\in[0,\infty)$. Then $$\lim_{s\downarrow0}\frac{1}{s}(f(t+s)-f(t))=\lim_{s\downarrow0}\left(\frac{1}{s}(f(s)-I)\space f(t)\right)=Af(t)$$
$\square$
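A finite-difference sketch of Theorem 11 for the semigroup $f(t)=e^{tA}$ generated by a hypothetical finite bounded generator:

```python
import numpy as np
from scipy.linalg import expm

# Right-hand derivative of f(t) = e^{tA} at t should equal A f(t).
A = np.array([[-1.0, 1.0], [3.0, -3.0]])
t, h = 0.8, 1e-6
f = lambda u: expm(u * A)
forward_diff = (f(t + h) - f(t)) / h
print(np.allclose(forward_diff, A @ f(t), atol=1e-4))  # True
```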
Best Answer
This is just $\exp(tA)-I$, the matrix exponential, except that you forgot to start the summation at $0$. The sum always converges for the same reason the ordinary exponential does: the $k!$ in the denominator beats the at most exponential growth of each entry of $(tA)^k$. To be precise, each entry of $(tA)^k$ is bounded in absolute value by $(nm)^k$, where $m$ is the maximal absolute value of the entries of $tA$; this can be shown by an easy induction on $k$.
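A short numerical sketch of this argument (my own addition): accumulating the partial sums of $\sum_k (tA)^k/k!$ entrywise reproduces `expm(tA)`; subtracting $I$ then gives the questioner's $B$.

```python
import numpy as np
from scipy.linalg import expm

# The k! in the denominator beats the (at most) exponential entry growth of
# (tA)^k, so the partial sums converge entrywise to exp(tA).
rng = np.random.default_rng(1)
A, t = rng.random((3, 3)), 2.0
partial = np.zeros((3, 3))
term = np.eye(3)                     # (tA)^0 / 0!
for k in range(60):
    partial += term                  # add (tA)^k / k!
    term = term @ (t * A) / (k + 1)  # next term (tA)^{k+1} / (k+1)!
print(np.allclose(partial, expm(t * A)))  # True
```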