I would argue that your statement
I know that for example in the Poincaré disc model we have a certain formula, another in the Klein model, and so on
has a correspondence in Euclidean geometry: There is one formula for Cartesian coordinates (the one you quoted), another for polar coordinates, and so on. So in a certain sense, the models of hyperbolic geometry are just different coordinate systems for the common hyperbolic space they model. Yes, they do come with different visualizations, but that is because we have to embed them in Euclidean space to visualize them.
There is the concept called Cayley-Klein metric which works almost the same for different kinds of planar geometries, including Euclidean, hyperbolic, elliptic, pseudo-Euclidean and Galilean. Of course, you'd most likely associate the resulting representation of hyperbolic space with the Klein model. Furthermore, in that setup, Euclidean distances will have absolute value zero, and the only thing you can do is express the ratio between two Euclidean distances. Which makes sense, since Euclidean geometry, in contrast to hyperbolic, does not have any intrinsic reference length. However, there are formulations which encode this reference length in the description of the geometry, and then end up with the common formula for Euclidean distances. So Cayley-Klein metrics do form a kind of common language for distance measurements, but not in as simple a way as you might hope for.
Here's a way to approach this problem from the linear algebra point of view. And in this answer I'm going to side-step the issue of orientation reversing isometries, which are not represented by Möbius transformation, instead by anti-Möbius transformations. So, I will explain how to use linear algebra to classify orientation preserving (o.p.) isometries, equivalently to classify Möbius transformations.
The classification of o.p. isometries is invariant under conjugacy. To see some examples of this from a synthetic point of view (i.e. no linear algebra), consider an o.p. isometry $\phi$, another o.p. isometry $\psi$, and the conjugate o.p. isometry $\psi \phi \psi^{-1}$.
First, if $\phi$ has a fixed point $x$ then $\psi \phi \psi^{-1}$ also has a fixed point, namely $\psi(x)$. Proof:
$$(\psi \phi \psi^{-1})(\psi(x)) = \psi \phi(x) = \psi(x)
$$
Furthermore, if $\phi$ rotates by an angle $\theta$ around $x$ then $\psi$ also rotates by an angle $\theta$ around $\phi(x)$. Proof: take a ray $R$ based at $x$, and the angle from $R$ to $\phi(R)$ at $x$ equals $\theta$; so the ray from $\psi(R)$ to $\psi(\phi(R))$ at $\psi(x)$ equals $\theta$.
Second, and with similar proofs, if $\phi$ fixes some geodesic line $L$ then $\psi \phi \psi^{-1}$ also fixes a line, namely $\psi(L)$. Furthermore, if $\phi$ translates a distance $d$ along $L$ then $\psi \phi \psi^{-1}$ translates a distance $d$ along $\psi(L)$.
Third, a parabolic transformation is one which fixes a single point on the circle at infinity and fixes no finite point.
Here's where things get interesting. Now you have to convince yourself of two further issues.
- Every o.p. isometry either fixes a point and rotates about it by some angle, or fixes some geodesic line and translates along that line by some distance, or is a parabolic isometry, or is the identity.
- Any two o.p. isometries which are rotations by the same angle, or are translations by the same distance, are conjugate. A similar statement holds for parabolic isometries, which I'll hold off until the end.
It is possible to prove this synthetically, but Möbius transformations certainly help. Here's an outline.
As usual we represent Möbius transformations by elements of $PSL(2,\mathbb{R}) = SL(2,\mathbb{R}) / \pm I$. Here's some algebra facts which you can easily verify.
- Given $M \in SL(2,\mathbb{R})$, $\text{trace}(M)$ is a conjugacy invariant in $SL(2,\mathbb{R})$. Also, its absolute trace $|\text{trace}(M)|$ is a conjugacy invariant in $PSL(2,\mathbb{R})$.
- If $|\text{trace}(M)|$ is not equal to 2 then it is a complete conjugacy invariant, i.e. if $M_1,M_2$ have the same absolute trace not equal to 2 then they are conjugate in $PSL(2,\mathbb{R})$. The case of absolute trace equal to $2$ corresponds to parabolic transformations and is not quite as clean; I'll skip that until the end.
Now, blend the geometry and the algebra: for each value of the absolute trace that is not equal to $2$, construct an example, verify that it is one of the isometries in the above list, and verify that every distinct isometry in the above list corresponds to exactly one value of the absolute trace. Having done that, the classification is complete! (Outside of the issue that I have ignored the parabolic case where absolute trace equals 2).
For example, absolute trace equal to zero is represented by $\pmatrix{0 & 1 \\ -1 & 0} : z \mapsto -\frac{1}{z}$ which is a rotation about the fixed point $0+1i$ in the upper half plane, with rotation angle $\pi$. More generally, absolute trace in the interval $[0,2)$ is always a rotation about a fixed point, with angles varying over $(0,\pi]$ by a simple strictly monotonic continuous function $[0,2) \mapsto (0,\pi]$ (whose formula some people remember but I never can).
For another example, absolute trace equal to $\frac{5}{2}$ is represented by $\pmatrix{2 & 0 \\ 0 & 1/2}$ which is a translation along the line $\text{RealPart}(z)=0$ by a distance $\ln(4)$. More generally, absolute trace in the inverval $(2,\infty)$ is always a translation along a line, with translation distance varying over $(0,\infty)$ by a simple strictly monotonic continuous function $(2,\infty) \mapsto (0,\infty)$ (again, with an explicit formula).
A few last words about absolute trace equal to $2$. Each of these represents either the identity or an o.p. parabolic isometry. In the full group of generalized Möbius transformations there is just one conjugacy class of parabolic isometry. But in $\text{PSL}(2,\mathbb{R})$ there are two conjugacy classes, one rotating "positively" around the fixed point at infinity and the other rotating "negatively".
Best Answer
Here's a proof, which uses the classification of isometries.
But first I need to address an issue that you raised in the comments.
Warning: When an isometry acts on the compactified hyperbolic plane $\overline{\mathbb H}^2 = \mathbb H^2 \cup S^1_\infty$ it always has at least one fixed point of $\overline{\mathbb H}^2$. But, points in $S^1_\infty$ are not involved in distance measurements; points in $S^1_\infty$ are "infinitely far" and are not part of the domain of the distance function; $d(p,q)$ is only defined for $p,q \in \mathbb H^2$.
So the problem asks to prove that if $f$ is not the identity then $d(p,f(p))$ is not constant for $p \in \mathbb H^2$.
Having said that, let me go through the cases of non-identity isometries. Let $Fix(f)$ be the fixed point set of $f$ on $\overline{\mathbb H}^2$, which breaks into two pieces: the finite fixed points $Fix_{fin}(f) = Fix(f) \cap \mathbb H^2$, and the infinite fixed points $Fix_{inf}(f) = Fix(f) \cap S^1_\infty$.
Every non-identity isometry falls into one of four cases, as follows:
Case 1: $Fix(f)$ consists of two infinite points $\xi,\eta \in {Fix}_{inf}(f)$. In this case, letting $L$ be the line with infinite endpoints $\xi,\eta$, one proves that $d(p,f(p))$ is constant for $p \in L$ and that $d(q,f(q)) > d(p,f(p))$ for $p \in L$ and $q \not\in L$. Geometrically $f$ is either a "translation" along the line $L$, or what is called a "glide reflection" which is a translation along $L$ composed with a reflection across $L$.
Case 2: $Fix(f)$ consists of an entire line $L = {Fix}_{fin}(f)$ and its two infinite endpoints $\partial L = {Fix}_{inf}(f)$. In this case then $d(p,f(p))=0$ for $p \in L$ but $d(p,f(p)) \ne 0$ for $p \not\in L$. Geometrically $f$ is a reflection across $L$.
Case 3: $Fix(f)$ consists of a single finite point $p = {Fix}_{fin}(f)$. In this case $d(p,f(p))=0$ and $d(q,f(q)) > 0$ for $q \ne p$. Geometrically $f$ is a rotation around $p$ of some angle $\theta \in (0,2\pi)$.
Case 4: $Fix(f)$ consists a single infinite point $\xi = {Fix}_{inf}(f)$. Since ${Fix}_{fin}(f) = \emptyset$, $d(p,f(p))>0$ for all $p$. However, given any line $L$ with endpoint $\xi$, $\lim_{q \in L, q \to \xi} d(q,f(q)) = 0$ where we let $q$ approach $\xi$ in $\overline{\mathbb H}^2$ along the line $L$. So $d(q,f(q))$ is not constant. Geometrically $f$ is a parabolic isometry.
There's still the question of how the conclusions are proved in these four cases. But if you know the classification of isometries in more detail, i.e. if you know how the classification is related to the fractional linear transformation formulas you provided in your question, then it's not too hard to justify those conclusions.