It suffices to show the convexity in the domain $\Omega=\{x_i>0,\ i=1,\dots,n\}$.
Once this is done, to check that the standard definition of convexity holds for a generic pair of points $x$, $y$, choose $M>0$ large enough that, writing $\vec{M}=(M,\dots,M)$, both $x'=x+\vec{M}\in\Omega$ and $y'=y+\vec{M}\in\Omega$.
Then $F(z+\vec{M})=\exp\left(A(z+\vec{M})\right)=\exp(A(z)+MI)=e^M F(z)$ for any $z$ lying on the segment joining $x$ and $y$ (the last equality holds because $MI$ commutes with $A(z)$), and the inequality you want follows from the convexity inequality applied to $x'$, $y'$.
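If you want to sanity-check the identity $\exp(Z+MI)=e^M\exp(Z)$ numerically, here is a minimal Python sketch (the random matrix `Z` is just a stand-in for $A(z)$):

```python
# Minimal sanity check of exp(Z + M*I) = e^M * exp(Z); Z is a random stand-in
# for A(z).  The identity holds because M*I commutes with every matrix.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
Z = rng.standard_normal((4, 4))
M = 3.0

lhs = expm(Z + M * np.eye(4))
rhs = np.exp(M) * expm(Z)
print(np.allclose(lhs, rhs))  # True
```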
So let's show that $f_{ij}$ is convex in $\Omega$. It suffices to show that if a matrix $A$ has nonnegative coefficients and $D$ is a diagonal matrix, then $\exp(A+tD)+\exp(A-tD)-2\exp(A)=t^2P+O(t^3)$, where $P$ has nonnegative coefficients: indeed, taking $D$ to be the diagonal matrix corresponding to the perturbation direction $h$, we then deduce $\nabla^2 f_{ij}(x)[h,h]=\lim_{t\to 0}\frac{f_{ij}(x+th)+f_{ij}(x-th)-2f_{ij}(x)}{t^2}\ge 0$ for any vector $h$, so $\nabla^2 f_{ij}(x)$ is positive semidefinite and $f_{ij}$ is convex.
By the series definition of the exponential, it is enough to show the stronger statement that for any $k\ge 0$, $(A+tD)^k+(A-tD)^k-2A^k=t^2P+O(t^3)$, where again $P$ is some matrix with nonnegative coefficients.
Now notice that $(A+tD)^k+(A-tD)^k-2A^k=2t^2\sum_{0\le r<s\le k-1} A^rDA^{s-r-1}DA^{k-s-1}+O(t^3)$ because the terms of first order cancel out, while the term of second order in the Taylor expansion of $(A\pm tD)^k$ is $t^2\sum_{0\le r<s\le k-1}\Pi_{rs}$, where $\Pi_{rs}$ is the product of $k$ factors which are all equal to $A$ except for the $r$-th and $s$-th ones which are equal to $D$ (by the $0$-th factor we mean the first one, and so on). So $P=2\sum_{0\le r<s\le k-1} A^rDA^{s-r-1}DA^{k-s-1}$.
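As a quick numerical sanity check of this expansion (purely illustrative; the size $n=4$ and the choice $k=5$ below are mine), one can compare both sides and verify that $P$ is entrywise nonnegative:

```python
# Check that (A+tD)^k + (A-tD)^k - 2A^k ≈ t^2 * P for one k, with
# P = 2 * sum_{0<=r<s<=k-1} A^r D A^(s-r-1) D A^(k-s-1) entrywise nonnegative
# whenever A is.  Sizes and k are illustrative.
import numpy as np
from numpy.linalg import matrix_power as mpow

rng = np.random.default_rng(1)
n, k, t = 4, 5, 1e-4
A = rng.random((n, n))                # entrywise nonnegative
D = np.diag(rng.standard_normal(n))   # arbitrary diagonal matrix

P = 2 * sum(mpow(A, r) @ D @ mpow(A, s - r - 1) @ D @ mpow(A, k - s - 1)
            for r in range(k) for s in range(r + 1, k))

lhs = mpow(A + t * D, k) + mpow(A - t * D, k) - 2 * mpow(A, k)
print(np.allclose(lhs, t**2 * P, atol=1e-10))  # agreement up to O(t^3)
print((P >= 0).all())                          # P is entrywise nonnegative
```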
Calling $i_0:=i$, $i_k:=j$, we then discover that the $(i,j)$-th component of $\sum_{0\le r<s\le k-1} A^rDA^{s-r-1}DA^{k-s-1}$ is
$$\sum_{0\le r<s\le k-1}\sum_{i_1,\dots,i_{k-1}}a_{i_0i_1}\cdots a_{i_{r-1}i_r}d_{i_ri_{r+1}}a_{i_{r+1}i_{r+2}}\cdots a_{i_{s-1}i_s}d_{i_si_{s+1}}a_{i_{s+1}i_{s+2}}\cdots a_{i_{k-1}i_k}$$
but $D$ is diagonal and has the effect of "freezing the indices" for one step, i.e. in the sum we can take $i_r=i_{r+1}$ and $i_s=i_{s+1}$.
Thus, by shifting all the indices, the big sum becomes
$$\sum_{0\le r\le s\le k-2}\sum_{j_1,\dots,j_{k-3}}a_{j_0j_1}\cdots a_{j_{r-1}j_r}d_{j_r}a_{j_rj_{r+1}}\cdots a_{j_{s-1}j_s}d_{j_s}a_{j_sj_{s+1}}\cdots a_{j_{k-3}j_{k-2}}$$
where we have put $d_j:=d_{jj}$ and the extremal indices are $j_0:=i$, $j_{k-2}:=j$.
Now we exchange the two sums, and it suffices to show that
$$\sum_{0\le r\le s\le k-2}d_{j_r}d_{j_s}\,a_{j_0j_1}\cdots a_{j_{r-1}j_r}a_{j_rj_{r+1}}\cdots a_{j_{s-1}j_s}a_{j_sj_{s+1}}\cdots a_{j_{k-3}j_{k-2}}\ge 0$$
for any fixed choice of indices $j_0,\dots,j_{k-2}$. The product of the coefficients of $A$ is nonnegative and independent of $(r,s)$, so we just have to show that
$$\sum_{0\le r\le s\le k-2}d_{j_r}d_{j_s}\ge 0$$
which is clear as $2\sum_{0\le r\le s\le k-2}d_{j_r}d_{j_s}=2\sum_{0\le r\le k-2}d_{j_r}^2+2\sum_{0\le r<s\le k-2}d_{j_r}d_{j_s}=\sum_{0\le r\le k-2}d_{j_r}^2+\left(\sum_{0\le r\le k-2}d_{j_r}\right)^2\ge 0.$
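For concreteness, here is a small numerical experiment illustrating the convexity just proved. It assumes, as one possible reading of the question, that $A(x)=B+\operatorname{diag}(x)$ for a fixed entrywise-nonnegative $B$, so that $A(x)$ has nonnegative coefficients on $\Omega$; adapt it if your $A(x)$ is different:

```python
# Illustrative convexity check.  Assumption (mine, not part of the proof):
# A(x) = B + diag(x) with B entrywise nonnegative, so A(x) has nonnegative
# entries whenever x lies in the positive orthant Omega.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(2)
n = 4
B = rng.random((n, n))                       # fixed entrywise-nonnegative part

def F(x):
    return expm(B + np.diag(x))              # f_ij(x) = F(x)[i, j]

t = 1e-3
for _ in range(100):
    x = rng.uniform(0.5, 2.0, size=n)        # a point of Omega
    h = rng.standard_normal(n)               # an arbitrary direction
    second_diff = F(x + t * h) + F(x - t * h) - 2 * F(x)
    assert (second_diff >= -1e-12).all()     # ~ t^2 * Hessian form, entrywise >= 0
print("all second differences are nonnegative")
```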
Let $f : \mathbb R^{n \times n} \to \mathbb R^{n \times n}$ be defined by $f (X) = \exp(X)$. Then,
$\begin{array}{rl} f (X + h V) &= \exp(X + h V)\\ &= I_n + (X + h V) + \frac{1}{2!} (X + h V)^2 + \frac{1}{3!} (X + h V)^3 + \cdots\\ &= I_n + X + h V + \frac{1}{2!} X^2 + \frac{h}{2!} (X V + V X) + \frac{h^2}{2!} V^2 + \frac{1}{3!} X^3 + \\ &\,\,\,\,\,+ \frac{h}{3!} (X^2 V + X V X + V X^2) + \frac{h^2}{3!} (X V^2 + V X V + V^2 X) + \frac{h^3}{3!} V^3 + \cdots\\ &= f (X) + h \left( V + \frac{1}{2!} (X V + V X) + \frac{1}{3!} (X^2 V + X V X + V X^2) + \cdots\right) + \cdots\end{array}$
Thus, the directional derivative of $f$ in the direction of $V$ at $X$ is given by
$$\begin{array}{rl} D_V f (X) &= \displaystyle\lim_{h \to 0} \frac{1}{h} \left( f (X + h V) - f (X) \right)\\ &= V + \frac{1}{2!} (X V + V X) + \frac{1}{3!} (X^2 V + X V X + V X^2) + \cdots\end{array}$$
We then write
$$D_V f (X) = M_0 + \frac{1}{2!} M_1 + \frac{1}{3!} M_2 + \frac{1}{4!} M_3 + \cdots$$
where
$$\begin{array}{rl} M_0 &= V\\ M_1 &= X V + V X =: \{X,V\}\\ M_2 &= X^2 V + X V X + V X^2\\ &\vdots\\ M_k &= \displaystyle\sum_{i=0}^k X^{k-i} V X^i\end{array}$$
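If it helps, here is a short, purely illustrative check that the truncated series $\sum_{k\ge 0} \frac{1}{(k+1)!}M_k$ matches a central finite difference of $\exp$ (the matrix size and truncation order are arbitrary choices):

```python
# Compare the truncated series for D_V f(X) with a central finite difference
# of expm.  Sizes, seed and truncation order are arbitrary.
import math
import numpy as np
from numpy.linalg import matrix_power as mpow
from scipy.linalg import expm

rng = np.random.default_rng(3)
n = 4
X = rng.standard_normal((n, n))
V = rng.standard_normal((n, n))

def directional_derivative(X, V, terms=30):
    # sum_{k=0}^{terms-1} M_k / (k+1)!  with  M_k = sum_{i=0}^{k} X^(k-i) V X^i
    return sum(sum(mpow(X, k - i) @ V @ mpow(X, i) for i in range(k + 1))
               / math.factorial(k + 1) for k in range(terms))

h = 1e-5
fd = (expm(X + h * V) - expm(X - h * V)) / (2 * h)
print(np.allclose(directional_derivative(X, V), fd, atol=1e-7))  # True
```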
Best Answer
$ \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\vc#1{\op{vec}\LR{#1}} \def\Diag#1{\op{Diag}\LR{#1}} \def\qiq{\quad\implies\quad} \def\p{\partial} \def\grad#1#2{\frac{\p #1}{\p #2}} $A scalar function can be applied to a square matrix $$\eqalign{ f(x) = e^{tx} \qiq A &= f(X) \\ }$$ and a real symmetric matrix can be diagonalized with orthogonal factors $$\eqalign{ X &= QBQ^T,\quad B=\Diag b,\quad Q^TQ=I \\ }$$ The Daleckii-Krein theorem followed by vectorization yields $$\eqalign{ Q^T dA\:Q &= R\odot\LR{Q^T dX\:Q} \\ \LR{Q\otimes Q}^T\,da &= \vc{R}\odot\LR{Q\otimes Q}^Tdx \\ }$$ where $(\odot,\otimes)$ denote the Hadamard and Kronecker products, respectively.
By defining the matrices $$\eqalign{ D &= \Diag{\vc{R}},\qquad P = \LR{Q\otimes Q}\qiq P^T=P^{-1} \\ }$$ the gradient calculation simplifies to $$\eqalign{ P^T da &= DP^Tdx \qiq \grad{a}{x} = PDP^T \\ }$$ All that remains is to calculate the symmetric $R$ matrix which lies at the heart of the theorem $$\eqalign{ R_{ij} = \begin{cases} {\Large\frac{f(b_i)-f(b_j)}{b_i-b_j}} \qquad{\rm if}\;b_i\ne b_j \\ \\ \quad f'(b_j) \qquad\qquad {\rm otherwise} \\ \end{cases} }$$ NB:$\:$ For the current problem $\;f'(x) = te^{tx}$
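Here is a numerical illustration of this recipe (the test setup and variable names are mine, not part of the theorem): build $R$, $P=Q\otimes Q$ and $D=\operatorname{Diag}(\operatorname{vec}(R))$ for $f(x)=e^{tx}$, then check that $PDP^T$ maps $\operatorname{vec}(dX)$ to $\operatorname{vec}(dA)$ for a symmetric perturbation $dX$:

```python
# Numerical illustration: gradient of a = vec(exp(t*X)) w.r.t. x = vec(X) for
# symmetric X, via the Daleckii-Krein construction above.  Setup is arbitrary.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)
n, t = 4, 0.7
X = rng.standard_normal((n, n))
X = (X + X.T) / 2                                 # real symmetric X
dX = rng.standard_normal((n, n))
dX = (dX + dX.T) / 2                              # symmetric perturbation

b, Q = np.linalg.eigh(X)                          # X = Q Diag(b) Q^T
fb, dfb = np.exp(t * b), t * np.exp(t * b)        # f(b_i) and f'(b_i)

# R_ij = (f(b_i)-f(b_j))/(b_i-b_j), and f'(b_i) when the eigenvalues coincide
num = fb[:, None] - fb[None, :]
den = b[:, None] - b[None, :]
den_safe = np.where(den == 0, 1.0, den)           # avoid 0/0 on the diagonal
R = np.where(np.abs(den) > 1e-12, num / den_safe, dfb[:, None])

P = np.kron(Q, Q)                                 # so that vec(Q M Q^T) = P vec(M)
D = np.diag(R.reshape(-1, order="F"))             # Diag(vec(R))
grad = P @ D @ P.T                                # da/dx = P D P^T

eps = 1e-6                                        # finite-difference comparison
dA = (expm(t * (X + eps * dX)) - expm(t * (X - eps * dX))) / (2 * eps)
print(np.allclose(grad @ dX.reshape(-1, order="F"),
                  dA.reshape(-1, order="F"), atol=1e-6))  # True
```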