I found a possible answer to this in [1], Section 3.2. It goes back to [2].
Actually, the construction there is more general than what I asked for, but it turns out that I should have asked the more general question for my application anyway. However, this answer produces the same result as what I proposed in my original question. As to the answer by Robin Goodfellow, I still don't know whether it is correct; I suspect there might be an error in Petersen's book, but I might be wrong. Anyway, I cannot reconcile the construction there with any examples I know.
Let me summarize the construction given in [1]. Let $d(\cdot||\cdot):\mathcal{M}\times \mathcal{M}\rightarrow \mathbb{R}$ be a smooth function satisfying
$$d(p||q)\geq 0, \quad \text{and} \quad d(p||q)=0 \;\text{iff}\; p=q.$$
Such a function is called a contrast function, or divergence, if its Hessian is strictly positive definite. It may fail to satisfy the symmetry and triangle inequality that would make it a distance function; on the other hand, every smooth distance function is a contrast function, so the original question is answered by this as well.
The contrast function $d^{\ast}$ defined by $d^{\ast}(p||q)=d(q||p)$ is called the dual of $d$. A distance function is obviously a self-dual contrast function.
Every contrast function $d$ induces a Riemannian metric $g^{(d)}$ via the negative mixed second derivative, $g^{(d)}_{ij}(p)=-\partial_i\partial_{j'}\,d(p||q)\big|_{q=p}$ (where $\partial_i$ differentiates the first argument and $\partial_{j'}$ the second), and an affine connection $\nabla^{(d)}$ via an analogous negative third derivative (check the book for details). It turns out that $d$ and $d^{\ast}$ induce the same metric, i.e. $g^{(d)}=g^{(d^{\ast})}$, but the connections are dual to each other:
$$\nabla^{(d^{\ast})}=\left(\nabla^{(d)}\right)^{\ast}.$$
For a distance function $d$ we have $d^{\ast}=d$, hence $\nabla^{(d^{\ast})}=\nabla^{(d)}$, which means that $\nabla^{(d)}$ is a metric connection. Thus, for $d$ a distance function, we obtain the Levi-Civita connection. (The torsion vanishes in any case, even when $d$ is not a distance function, because the induced Christoffel symbols are built from partial derivatives of $d$, and mixed partial derivatives commute, so they are symmetric in the lower indices.)
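As a quick sanity check (my own, not from [1]), one can verify this numerically for the Kullback-Leibler divergence on the family of Bernoulli distributions: the induced metric should be the Fisher information $1/(p(1-p))$, and the dual contrast $d^{\ast}$ should induce the same metric. A minimal Python sketch (the names `kl` and `induced_metric` are just illustrative):

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p||q) between Bernoulli(p) and Bernoulli(q)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def induced_metric(d, p, h=1e-4):
    """Approximate g(p) = -d/dp d/dq D(p||q) at q = p by central finite differences."""
    return -(d(p + h, p + h) - d(p + h, p - h)
             - d(p - h, p + h) + d(p - h, p - h)) / (4 * h * h)

p = 0.3
fisher = 1 / (p * (1 - p))                           # Fisher information of Bernoulli(p)
g = induced_metric(kl, p)                            # metric induced by KL
g_dual = induced_metric(lambda a, b: kl(b, a), p)    # metric induced by the dual KL*
print(g, g_dual, fisher)                             # all three approximately equal
```

Both the divergence and its dual reproduce the Fisher metric, consistent with $g^{(d)}=g^{(d^{\ast})}$.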
Interestingly, [1] also mentions that the converse of the above construction was proved in [3]: given a smooth manifold with a Riemannian metric and two dual torsion-free connections, there is a contrast function inducing this structure.
[1] Amari, S.-I., & Nagaoka, H. (2007). Methods of Information Geometry. American Mathematical Soc.
[2] Eguchi, S. (1992). Geometry of minimum contrast. Hiroshima Mathematical Journal, 22, 631–647.
[3] Matumoto, T. (1993). Any statistical manifold has a contrast function. Hiroshima Mathematical Journal, 23, 327–332.
Best Answer
First, as mentioned in the comments, your definition of $m$ assumes that you can add, subtract, and square $s_i$ and $t_i$. These operations are not defined in a general metric space. A suitable definition would be $$m((s_1,t_1), (s_2,t_2)) = \sqrt{d(s_1,s_2)^2 + e(t_1,t_2)^2}.$$ The rest of this answer assumes this definition.
Note that for any nonnegative numbers $a$ and $b$, the inequalities $a \leq b$ and $a^2 \leq b^2$ are equivalent. Therefore, proving the triangle inequality $$m((s_1,t_1), (s_3,t_3)) \leq m((s_1,t_1), (s_2,t_2)) + m((s_2,t_2), (s_3,t_3))$$ is equivalent to proving $$m((s_1,t_1), (s_3,t_3))^2 \leq [m((s_1,t_1), (s_2,t_2)) + m((s_2,t_2), (s_3,t_3))]^2$$ or equivalently, $$d(s_1,s_3)^2 + e(t_1,t_3)^2 \leq d(s_1,s_2)^2 + e(t_1,t_2)^2 + 2\sqrt{d(s_1,s_2)^2 + e(t_1,t_2)^2}\sqrt{d(s_2,s_3)^2 + e(t_2,t_3)^2} + d(s_2,s_3)^2 + e(t_2,t_3)^2$$ Let us call this inequality 1.
Now, since $d$ and $e$ are metrics, we know that $$d(s_1,s_3)^2 \leq [d(s_1,s_2) + d(s_2,s_3)]^2 = d(s_1,s_2)^2 + 2 d(s_1,s_2)d(s_2,s_3) + d(s_2,s_3)^2$$ and $$e(t_1,t_3)^2 \leq [e(t_1,t_2) + e(t_2,t_3)]^2 = e(t_1,t_2)^2 + 2 e(t_1,t_2)e(t_2,t_3) + e(t_2,t_3)^2$$ Adding these two inequalities gives us $$d(s_1,s_3)^2 + e(t_1,t_3)^2 \leq d(s_1,s_2)^2 + 2 d(s_1,s_2)d(s_2,s_3) + d(s_2,s_3)^2 + e(t_1,t_2)^2 + 2 e(t_1,t_2)e(t_2,t_3) + e(t_2,t_3)^2$$ Comparing this with inequality 1, we see that it will suffice to prove that $$d(s_1,s_2)d(s_2,s_3) + e(t_1,t_2)e(t_2,t_3) \leq \sqrt{d(s_1,s_2)^2 + e(t_1,t_2)^2}\sqrt{d(s_2,s_3)^2 + e(t_2,t_3)^2}$$ Let us call this inequality 2.
The left hand side is the dot product between the 2-dimensional real vectors $v = (d(s_1,s_2), e(t_1,t_2))$ and $w = (d(s_2,s_3), e(t_2,t_3))$. Let's call this dot product $v \cdot w$ for short.
The right-hand side is the product of the lengths of these vectors: $$\|v\|\|w\| = \sqrt{d(s_1,s_2)^2 + e(t_1,t_2)^2}\sqrt{d(s_2,s_3)^2 + e(t_2,t_3)^2}$$
The Cauchy-Schwarz inequality is precisely the tool we need: it tells us that $$|v \cdot w| \leq \|v\|\|w\|$$ This, together with the fact that $v \cdot w \leq |v \cdot w|$ (a real number is no larger than its absolute value), tells us that $$v \cdot w \leq \|v\|\|w\|$$ Substituting the definitions of $v$ and $w$ gives exactly inequality 2, which, as argued above, suffices to prove inequality 1, which in turn is equivalent to the triangle inequality.
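As an empirical illustration (mine, not part of the proof; `product_metric` and the two sample metrics below are names I am introducing), the result can be confirmed by combining two concrete metrics and testing the triangle inequality on random triples:

```python
import itertools
import random

def product_metric(d, e):
    """m((s1,t1),(s2,t2)) = sqrt(d(s1,s2)^2 + e(t1,t2)^2), as defined above."""
    return lambda a, b: (d(a[0], b[0]) ** 2 + e(a[1], b[1]) ** 2) ** 0.5

d = lambda x, y: abs(x - y)               # usual metric on the reals
e = lambda x, y: 0.0 if x == y else 1.0   # discrete metric on any set
m = product_metric(d, e)

random.seed(0)
points = [(random.uniform(-5, 5), random.choice("abc")) for _ in range(40)]
violations = [
    (p1, p2, p3)
    for p1, p2, p3 in itertools.product(points, repeat=3)
    if m(p1, p3) > m(p1, p2) + m(p2, p3) + 1e-12  # small slack for rounding
]
print(len(violations))  # prints 0
```

The tiny slack `1e-12` only absorbs floating-point rounding; the inequality itself holds exactly by the argument above.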