Matrices – Additivity of Matrix Exponential of Infinite Matrices


It is well known that the matrix exponential of finite-dimensional matrices is additive when the exponents commute: $AB=BA\implies e^Ae^B=e^{A+B}$ (cf. e.g. Bernstein, Corollary 11.1.6). Under what circumstances does this carry over to the infinite-dimensional case? I am particularly interested in the case where the matrices are stochastic.


Definitions

Let $A=(A_{i,j})_{i,j\in\mathbb{N}_0}$, $B=(B_{i,j})_{i,j\in\mathbb{N}_0}$ be real valued infinite dimensional matrices.
The scalar multiple, sum and product of infinite matrices are again infinite matrices, defined component-wise as follows
$$\begin{align}rA&:=(rA_{i,j})_{i,j\in\mathbb{N}_0}\\A+B&:=(A_{i,j}+B_{i,j})_{i,j\in\mathbb{N}_0}\\AB&:=(\sum_{k=0}^\infty A_{i,k}B_{k,j})_{i,j\in\mathbb{N}_0}\end{align}$$
Unlike the scalar multiple and the sum, the product may fail to be defined for some pairs of matrices: it is defined provided all the infinite series on the right-hand side converge to real numbers.
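As a quick numerical illustration of the entry-wise series defining the product (the example matrix $A_{i,j}=B_{i,j}:=2^{-(j+1)}$ and the number of terms are my own choices, not part of the definitions), here is a small Python sketch; for this pair every defining series converges and $(AB)_{i,j}=2^{-(j+1)}$:

```python
# Illustrative example only: A = B with entries A[i, j] = 2^{-(j+1)}.
# Each entry of AB is the series  sum_k A[i, k] * B[k, j], which here converges
# to 2^{-(j+1)}, so AB = A for this particular choice.
def entry(i, j):
    return 2.0 ** -(j + 1)

def product_entry(i, j, terms=60):
    """Partial sum of the series defining (AB)[i, j]."""
    return sum(entry(i, k) * entry(k, j) for k in range(terms))

print(product_entry(3, 5), entry(3, 5))   # both approximately 2^{-6} = 0.015625
```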

The limit of an infinite sequence of infinite, real-valued matrices $(C^{(n)})_{n\in\mathbb{N}_0}$ as well as the sum thereof are again infinite matrices that are defined component-wise as follows
$$\begin{align}\lim_{n\rightarrow\infty}C^{(n)}&:=(\lim_{n\rightarrow\infty}C_{i,j}^{(n)})_{i,j\in\mathbb{N}_0}\\\sum_{n=0}^\infty C^{(n)}&:=(\sum_{n=0}^\infty C_{i,j}^{(n)})_{i,j\in\mathbb{N}_0}\end{align}$$
provided all the limits/sums on the right hand side converge to real numbers.

We define
$$\begin{align}0&:=(0)_{i,j\in\mathbb{N}_0}\\I&:=(\delta_{i,j})_{i,j\in\mathbb{N}_0}\end{align}$$
where $\delta_{i,j}$ is Kronecker's Delta.

The powers of $A$ are defined recursively thus (for $n\in\mathbb{N}_0$)
$$\begin{align}A^0&:=I\quad\mathrm{(even\, if\, }A=0\mathrm{)}\\A^{n+1}&:=AA^n\end{align}$$
Not every power of every infinite matrix is defined, for the same reason that the product of some pairs of infinite matrices is not defined.

The exponential of $A$ is defined as
$$e^A:=\sum_{n=0}^\infty \frac{1}{n!} A^n$$
provided all the powers on the right-hand side are defined and the sum converges.
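As a sanity check of this definition, here is a small Python sketch that evaluates the partial sums on a finite $N\times N$ truncation (the truncation, its renormalisation to a stochastic matrix and the number of terms are my own illustrative choices; a truncation is of course only an approximation of an infinite matrix):

```python
import numpy as np

def exp_via_series(A, terms=60):
    """Partial sum  sum_{n=0}^{terms} A^n / n!  for a finite matrix A."""
    result = np.eye(A.shape[0])
    power = np.eye(A.shape[0])
    for n in range(1, terms + 1):
        power = power @ A / n        # power now equals A^n / n!
        result = result + power
    return result

# Illustrative truncation of a stochastic matrix (geometric rows, renormalised
# so that each truncated row sums to exactly 1).
N = 30
A = np.array([[2.0 ** -(j + 1) for j in range(N)] for _ in range(N)])
A /= A.sum(axis=1, keepdims=True)

E = exp_via_series(A)
# For stochastic A every row of A^n sums to 1, so every row of e^A sums to e.
print(np.allclose(E.sum(axis=1), np.e))
```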


Question

Suppose $e^A$ and $e^B$ converge and their product is well-defined. Under what circumstances does $e^{A+B}$ converge and satisfy
$$e^{A+B}=e^Ae^B\space?$$


A Special Case of Interest

If $A$ is a stochastic matrix (i.e. all components are non-negative and each row sums to $1$) and $s,t\in\mathbb{R}$, under what conditions do $e^{sI}$, $e^{tA}$, $e^{sI}e^{tA}$ and $e^{sI+tA}$ converge and satisfy
$$e^{sI}e^{tA}=e^{sI+tA}\space?$$

Best Answer

I've been able to prove only the special case. Here it is, together with a host of useful theorems and definitions used, directly or indirectly, in its proof (Theorem 6.2). Only a handful of results are proved (Theorem 6.2 being one of them), since the rest were deemed straightforward.

You may also wish to consult some of the sources listed in the following math stack exchange post: The matrix exponential: Any good books?


Definition 1 We shall use $\mathbf{M}$ to denote the class of infinite-dimensional, real-valued matrices as described in the original post. Unless explicitly stated otherwise, we shall use the words "matrix" and "matrices" exclusively to denote members of $\mathbf{M}$.

Conventions Capital English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate matrices. Lower-case English letters, possibly adorned with sub-/superscripts, will be used exclusively to designate real numbers. Equalities will always imply existence, for instance if we write: "Let $AB=C$", we mean "Suppose the product of $A$ and $B$, in this order, is well defined and equals $C$" and if we write "... then $\sum_{n\in\mathbb{N}_0}A_n=B$" we mean "... then the series $\sum_{n\in\mathbb{N}_0}A_n$ converges and sums to $B$."

Theorem 1 $\mathbf{M}$ is a real vector space with respect to the operations of scalar multiplication and addition described in the original post and with $0$ (likewise described in the original post) serving as the neutral element w.r.t. matrix addition.

Definition 2

  1. $|A|:=(|A_{i,j}|)_{i,j\in\mathbb{N}_0}$.

  2. $0\leq A$ shall mean that $A=|A|$.

  3. $A\leq B$ shall mean that $0\leq B-A$.

  4. $A$ is bounded iff $\{A_{i,j}\space:\space i,j\in\mathbb{N}_0\}$ is bounded.

Theorem 2 Multiplication of matrices

  1. Multiplicative neutral element. $I$ is the unique multiplicative neutral element w.r.t. matrix multiplication.

  2. Associativity w.r.t. scalar multiplication. If $a\neq0$, $(aB)C=D\iff a(BC)=D\iff B(aC)=D$. If $a=0$, the implications whose hypothesis is $a(BC)=D$ remain valid, but their converses are true only when the product $BC$ is well-defined.

  3. Distributivity. (i) If $A_1B=C_1$ and $A_2B=C_2$, then $(A_1+A_2)B=C_1+C_2$, (ii) If $BA_1=C_1$ and $BA_2=C_2$, then $B(A_1+A_2)=C_1+C_2$

  4. Associativity. If either $$|A||B|=P\mathrm{\, and\, }P|C|=Q$$ or $$|B||C|=S\mathrm{\, and\, }|A|S=T,$$ then there's some $D$ such that $$(AB)C=D=A(BC)$$

  5. The Binomial theorem. Let $m\in\mathbb{N}_0$. If $0\leq A,B$ commute (i.e. $AB=C=BA$ for some $C$), and for every $n,k\in\mathbb{N}_0$ such that $n+k\leq m$, $A^nB^k=C_{n,k}$, then $$(A+B)^{m}=\sum_{n=0}^m\binom{m}{n}C_{n,m-n}$$
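As a finite-dimensional sanity check of the binomial theorem above (a finite example cannot, of course, exercise the convergence hypotheses, which are the whole point in the infinite-dimensional setting; the matrices and the exponent $m$ below are arbitrary choices of mine):

```python
import numpy as np
from math import comb
from numpy.linalg import matrix_power

rng = np.random.default_rng(0)
N, m = 6, 4
A = rng.random((N, N))        # non-negative entries
B = A @ A                     # a power of A, hence AB = BA

lhs = matrix_power(A + B, m)
rhs = sum(comb(m, n) * matrix_power(A, n) @ matrix_power(B, m - n)
          for n in range(m + 1))
print(np.allclose(lhs, rhs))  # binomial theorem for commuting matrices
```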

Theorem 3 Scalar multiplication of infinite sequences of matrices

  1. If $\lim_{n\rightarrow\infty}a_n=b$, $\lim_{n\rightarrow\infty}(a_nC)=bC$.

  2. If $\lim_{n\rightarrow\infty}A_n=B$, $\lim_{n\rightarrow\infty} (cA_n)=cB$.

  3. If $\sum_{n=0}^\infty A_n=B$, $\sum_{n=0}^\infty cA_n=cB$.

Theorem 4 Absolute convergence and rearrangement of a series

  1. If $\sum_{n=0}^\infty A_n=B$ and $A'=(A_0,0,A_1,0,\dots)$, $$\sum_{n=0}^\infty A'_n=B$$

  2. If $\sum_{n=0}^\infty |A_n|=B$, $\sum_{n=0}^\infty A_n=C$ for some $C$.

  3. If $\sum_{n=0}^\infty |A_n|=B$ and $(A_{n_i})_{i\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty A_{n_i}=\sum_{n=0}^\infty A_n$.

  4. If $(A_{i,j})_{i,j\in\mathbb{N}_0}$ is a rearrangement of $(A_n)_{n\in\mathbb{N}_0}$, then $\sum_{i=0}^\infty\sum_{j=0}^\infty|A_{i,j}|=B$ iff $\sum_{n=0}^\infty |A_n|=B$ and then $$\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=\sum_{j=0}^\infty\sum_{i=0}^\infty A_{i,j}=\sum_{n=0}^\infty A_n$$

Definition 3 Stochastic Matrix. $A$ is stochastic iff $0\leq A$ and for all $i\in\mathbb{N}_0$, $\sum_{j=0}^\infty A_{i,j}=1$.

Theorem 5 Stochastic matrices

  1. If $A, B$ are stochastic, there's a stochastic matrix $C$, such that $AB=C$.

  2. If $A$ is stochastic, $e^{tA}=B$ for some $B$ and for all $i,j\in\mathbb{N}_0$, $|B_{i,j}|\leq e^{|t|}$.

  3. $e^{tI}=e^tI$

Proof of Theorem 5.1

We shall make use of the following lemma, whose proof is left for the reader.

Lemma Let $0\leq A$ and let $(a_n,b_n)$ be a sequence of pairs of natural ($\mathbb{N}_0$) numbers such that $\lim_{n\rightarrow\infty}a_n=\lim_{n\rightarrow\infty}b_n=\infty$. Then $\sum_{i=0}^\infty\sum_{j=0}^\infty A_{i,j}=s$ for some $s\in\mathbb{R}$ iff $\lim_{n\rightarrow\infty}\sum_{i=0}^{a_n}\sum_{j=0}^{b_n}A_{i,j}=t$ for some $t\in\mathbb{R}$, and then $s=t$.

Now, let $A, B$ be stochastic. It is easy to see that $C:=AB$ is well defined and $0\leq C$. All that remains to show is that each of $C$'s rows sums to $1$. Let $i\in\mathbb{N}_0$ be a row index. We need to show that $\sum_{j=0}^\infty\sum_{k=0}^\infty A_{i,k}B_{k,j}=1$. For every $n\in\mathbb{N}_0$, since each of the finitely many rows $0,\dots,n$ of $B$ sums to $1$, we may choose $a_n\in\mathbb{N}_0$ (with $(a_n)_{n\in\mathbb{N}_0}$ ascending) such that $\sum_{j=0}^{a_n}B_{k,j}>1-\frac{1}{n+1}$ for every $k\leq n$, and we set $b_n:=n$. Then $a_n,b_n\underset{n\rightarrow\infty}{\rightarrow}\infty$ and so, by the lemma, it is enough to show that $\lim_{n\rightarrow\infty}C_n=1$ with $C_n:=\sum_{j=0}^{a_n}\sum_{k=0}^{b_n}A_{i,k}B_{k,j}$. Indeed, $$\underbrace{(1-\frac{1}{n+1})\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}\leq C_n=\sum_{k=0}^{b_n}A_{i,k}\sum_{j=0}^{a_n}B_{k,j}\leq\underbrace{\sum_{k=0}^{b_n}A_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}1}$$ $\square$
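A numerical illustration of Theorem 5.1 (the concrete infinite stochastic matrix $A_{i,j}=B_{i,j}:=2^{-(j+1)}$ and the truncation sizes are my own choices): the row sums of a truncation of $AB$ fall short of $1$ only by the tails that the lemma controls, and they approach $1$ as the truncation grows.

```python
import numpy as np

def geometric_rows(N):
    """N x N truncation of the stochastic matrix with entries 2^{-(j+1)}."""
    return np.array([[2.0 ** -(j + 1) for j in range(N)] for _ in range(N)])

for N in (5, 10, 20, 40):
    C = geometric_rows(N) @ geometric_rows(N)   # truncation of the product AB
    print(N, C.sum(axis=1).min())               # row sums (1 - 2^{-N})^2 -> 1
```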

Definition 4 Exponential partial sums. For all $n\in\mathbb{N}_0, t\in\mathbb{R}$, define

  1. $\exp_n(t):=\sum_{i=0}^n \frac{t^i}{i!}$

  2. $\mathrm{Exp}_n(t):=\exp_n(t) I$

Theorem 6 Additivity of the matrix exponential

  1. $\sum_{n=0}^\infty(\frac{t^n}{n!}A)=e^tA$

  2. If $A$ is stochastic, $e^{sI}e^{tA}=e^{sI+tA}$

Proof of Theorem 6.2

Let $A$ be stochastic. Then

$$\begin{align}e^{sI}e^{tA}&=\sum_{n=0}^\infty\left(\frac{s^n}{n!}e^{tA}\right)\\ &=\sum_{n=0}^\infty\sum_{k=0}^\infty\frac{s^n}{n!}\frac{t^k}{k!}A^k\\ &=\sum_{m=0}^\infty\frac{1}{m!}\sum_{n,k\in\mathbb{N}_0\atop n+k=m}\binom{m}{n}(sI)^n(tA)^k\\ &=\sum_{m=0}^\infty\frac{1}{m!}\left(sI+tA\right)^m\\ &=e^{sI+tA}\end{align}$$

Here the regrouping of the double series by $m=n+k$ in the third equality is justified by Theorem 4.4: the double series converges absolutely, since $A^k$ is stochastic (Theorem 5.1 and induction) and hence $0\leq A^k_{i,j}\leq1$. The fourth equality is the binomial theorem (Theorem 2.5).

$\square$
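A numerical sanity check of Theorem 6.2 on a finite truncation (the truncated matrix, its renormalisation and the values of $s,t$ are illustrative choices of mine; `scipy.linalg.expm` stands in for the series definition, with which it agrees for finite matrices):

```python
import numpy as np
from scipy.linalg import expm

N = 25
A = np.array([[2.0 ** -(j + 1) for j in range(N)] for _ in range(N)])
A /= A.sum(axis=1, keepdims=True)      # truncated rows renormalised to sum to 1
I = np.eye(N)
s, t = 0.7, 1.3

print(np.allclose(expm(s * I) @ expm(t * A), expm(s * I + t * A)))  # Theorem 6.2
print(np.allclose(expm(s * I), np.exp(s) * I))                      # Theorem 5.3
```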

Definition 5 The derivative of a function $\mathbb{R}\rightarrow\mathbf{M}$

Let $\phi:\mathbb{R}\rightarrow\mathbf{M}$ be a matrix-valued function on the real line. Define the function-valued matrix $\Phi$, whose entries $\Phi_{i,j}$ ($i,j\in\mathbb{N}_0$) are real-valued functions on the real line, by $$\Phi_{i,j}:\mathbb{R}\rightarrow\mathbb{R},\space\space\Phi_{i,j}(x):=(\phi(x))_{i,j}$$

If $\Phi_{i,j}$ is differentiable at $x_0\in\mathbb{R}$ for all $i,j\in\mathbb{N}_0$, we say that $\phi$ is differentiable at $x_0$, and its derivative at this point is defined to be the matrix $$\phi'(x_0):=\left(\Phi'_{i,j}(x_0)\right)_{i,j\in\mathbb{N}_0}$$

Theorem 7 Properties of the matrix exponential

  1. $e^0=I$

  2. If $e^{tA}=B$ for some $0 < t$, then for every $s\in(-t,t)$ there's some $C_s$ such that $e^{sA}=C_s$.

  3. If $0\leq A$ and $e^A=B$ for some $B$, $e^{-A}=C$ for some $C$.

  4. If $e^{tA}=B$ for some $0 < t$, the matrix function $$\phi:\mathbb{R}\rightarrow\mathbf{M},\space\space \phi(s):=e^{sA}$$ is differentiable in the domain $(-t,t)$, and $\phi'(0)=A$.
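A finite-dimensional finite-difference check of items 1, 3 and 4 (in finite dimensions all the existence clauses are automatic, so only the identities themselves are being checked; the truncated stochastic matrix and the step size are my own choices):

```python
import numpy as np
from scipy.linalg import expm

N = 20
A = np.array([[2.0 ** -(j + 1) for j in range(N)] for _ in range(N)])
A /= A.sum(axis=1, keepdims=True)      # a truncated stochastic matrix

phi = lambda s: expm(s * A)

print(np.allclose(phi(0.0), np.eye(N)))            # item 1: e^0 = I
print(np.isfinite(expm(-A)).all())                 # item 3: e^{-A} is defined
h = 1e-6
print(np.abs((phi(h) - phi(0.0)) / h - A).max())   # item 4: phi'(0) = A, error ~ h
```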

Definition 6 Infinitesimal generators. An infinitesimal generator is an infinite real-valued matrix $A$ that satisfies the following two conditions:

i) $A_{i,j}\geq0$ for all $i,j\in\mathbb{N}_0$ such that $i \neq j$,

ii) $A_{i,i}=-\sum_{j\neq i}A_{i,j}$ for all $i\in\mathbb{N}_0$

(Infinitesimal generators arise in the theory of probability in the context of continuous-time/discrete-state-space Markov processes.)

Theorem 8 Infinitesimal generator. If $A$ is a bounded infinitesimal generator, there's some stochastic matrix $B$ and some number $0\leq c$ such that $$e^{tA}=e^{-ct}e^{ctB}$$ In fact, you may choose any $c\geq\sup_{i\in\mathbb{N}_0}|A_{i,i}|$ and

$$B = \begin{cases} c^{-1} A+I &, 0 < c \\ I&,\mathrm{otherwise} \end{cases}$$

In particular, $e^{tA}$ converges for all $t$.

Proof See the proof of Klenke's Theorem 17.25. $\square$
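A finite-dimensional numerical check of the recipe in Theorem 8 (the truncated birth-death generator and the value of $t$ are illustrative choices of mine; in finite dimensions convergence is automatic, so the check concerns only the identity and the stochasticity of $B$):

```python
import numpy as np
from scipy.linalg import expm

# Illustrative bounded infinitesimal generator: a truncated birth-death chain.
N = 20
A = np.zeros((N, N))
for i in range(N):
    if i + 1 < N:
        A[i, i + 1] = 1.0                # birth rate
    if i > 0:
        A[i, i - 1] = 2.0                # death rate
    A[i, i] = -A[i].sum()                # condition (ii): rows sum to 0

c = np.abs(np.diag(A)).max()             # any c >= sup_i |A_{i,i}| works
B = A / c + np.eye(N)                    # the recipe of Theorem 8 (case 0 < c)

t = 0.8
print(np.allclose(expm(t * A), np.exp(-c * t) * expm(c * t * B)))  # e^{tA} = e^{-ct} e^{ctB}
print((B >= 0).all(), np.allclose(B.sum(axis=1), 1.0))             # B is stochastic
```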

Theorem 9 The product of an infinitesimal generator and a bounded matrix

  1. $A$ is an infinitesimal generator with $|A_{i,i}| \leq 1$ for every $i \in \mathbb{N}_0$ iff $A+I$ is stochastic.

  2. If $A$ is an infinitesimal generator and $0\leq r$, so is $rA$. If $A$ is bounded, so is $rA$.

  3. If $A$ is an infinitesimal generator and $B$ is bounded, $AB=C$ for some $C$. If $A$ is bounded then $C$ is bounded.

  4. Let $(A^{(n)})_{n\in\mathbb{N}_0}$ be a sequence of infinitesimal generators and let $B$ be bounded. If $\lim_{n\rightarrow\infty}A^{(n)}=C$ and $C$ is an infinitesimal generator, $\lim_{n\rightarrow\infty}(A^{(n)}B)=CB$.

Proof of Theorem 9.4

Let $i,j\in\mathbb{N}_0$. We need to show that $$\lim_{n\rightarrow\infty}\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}=\sum_{k=0}^\infty C_{i,k}B_{k,j}$$

Let $n,N\in\mathbb{N}_0$ be arbitrary with $i < N$. $$\begin{align}\left|\sum_{k=0}^\infty A_{i,k}^{(n)}B_{k,j}-\sum_{k=0}^\infty C_{i,k}B_{k,j}\right|&\leq\underbrace{\left|A_{i,i}^{(n)}-C_{i,i}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{i,j}\right|+\sum_{k=0\atop k\neq i}^N\underbrace{\left|A_{i,k}^{(n)}-C_{i,k}\right|}_{\underset{n\rightarrow\infty}{\rightarrow}0}\left|B_{k,j}\right|\\&+\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|\end{align}$$

Now, if $0\leq\beta\in\mathbb{R}$ is an upper bound on $B$,

$$\begin{align}\left|\sum_{k=N+1}^\infty(A_{i,k}^{(n)}-C_{i,k})B_{k,j}\right|&\leq\left(\sum_{k=N+1}^\infty A_{i,k}^{(n)}\right)\beta+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&=-\left(C_{i,i}+\underbrace{A_{i,i}^{(n)}-C_{i,i}}_{\underset{n\rightarrow\infty}{\rightarrow}0}\right)\beta\\&-\left(\sum_{k=0\atop k\neq i}^N C_{i,k}+\sum_{k=0\atop k\neq i}^N(\underbrace{A_{i,k}^{(n)}-C_{i,k}}_{\underset{n\rightarrow\infty}{\rightarrow}0})\right)\beta\\&+\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta\\&\leq2\left(\sum_{k=N+1}^\infty C_{i,k}\right)\beta+\varepsilon\end{align}$$

where both summands in the last expression can be made arbitrarily small by first choosing $N$ and then $n$ large enough.

$\square$

Definition 7 Component-wise limit of a matrix function

Given a set of numbers $\emptyset\neq S\subseteq\mathbb{R}$, a matrix function $$f:S\rightarrow\mathbf{M}$$ and an accumulation point of $S$, $s\in\mathbb{R}\cup\{\pm\infty\}$ [If $s=\infty$, $s$ is an accumulation point of $S$ iff $S$ has no upper bound. If $s=-\infty$, $s$ is an accumulation point of $S$ iff $S$ has no lower bound],

$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$ iff for all $i,j\in\mathbb{N}_0$, $$\lim_{t\rightarrow s\atop t\in S}\left(f(t)\right)_{i,j}=A_{i,j}$$

Theorem 10 Discretization of the limiting process

Under the assumptions of Definition 7,

$$\lim_{t\rightarrow s\atop t\in S}f(t)=A$$ iff for every sequence $(t_n)_{n\in\mathbb{N}_0}$ in $S$ that converges to $s$, $$\lim_{n\rightarrow\infty}f(t_n)=A$$

Definition 8 Markov semigroup

A matrix function $$f:[0,\infty)\rightarrow\mathbf{M}$$ is a Markov semigroup iff

  1. $f(0)=I$

  2. For all $t\in(0,\infty)$, $f(t)$ is stochastic

  3. For all $s,t\in[0,\infty)$, $$f(s+t)=f(s)f(t)$$

(Markov semigroups arise in probability theory in the context of continuous-time/discrete-state-space Markov processes.)

Theorem 11 Right-hand derivative of a Markov semigroup

Let $f:[0,\infty)\rightarrow\mathbf{M}$ be a Markov semigroup and let $A$ be an infinitesimal generator, such that $$\lim_{t\downarrow0}\frac{1}{t}(f(t)-I)=A$$

Then $f$ is right-hand differentiable on $[0,\infty)$ and for all $t\in[0,\infty)$, $$\mathrm{D}_R(f, t)=Af(t)$$ where $\mathrm{D}_R(f, t)$ is the right-hand derivative of $f$ at $t$.

Proof

Let $t\in[0,\infty)$. Then $$\lim_{s\downarrow0}\frac{1}{s}(f(t+s)-f(t))=\lim_{s\downarrow0}\left(\frac{1}{s}(f(s)-I)\space f(t)\right)=Af(t)$$ where the first equality uses the semigroup property $f(t+s)=f(s)f(t)$ together with distributivity, and the second follows from Theorems 9.1, 9.2, 9.4 and 10, since each $\frac{1}{s}(f(s)-I)$ is an infinitesimal generator and $f(t)$ is bounded.

$\square$
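A finite-dimensional finite-difference check of the formula $\mathrm{D}_R(f,t)=Af(t)$, taking $f(t):=e^{tA}$ for a bounded generator (the generator, the point $t$ and the step size are illustrative choices of mine):

```python
import numpy as np
from scipy.linalg import expm

# Same kind of illustrative bounded generator as before: a truncated birth-death chain.
N = 20
A = np.zeros((N, N))
for i in range(N):
    if i + 1 < N:
        A[i, i + 1] = 1.0
    if i > 0:
        A[i, i - 1] = 2.0
    A[i, i] = -A[i].sum()                # each row sums to 0

f = lambda t: expm(t * A)                # f(0) = I and f(s + t) = f(s) f(t)

t, h = 0.5, 1e-6
right_difference = (f(t + h) - f(t)) / h
print(np.abs(right_difference - A @ f(t)).max())   # small: D_R(f, t) = A f(t)
```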
