Edit: this doesn't work.
Original: I think that a compact curve in $M_g$ provides an example. In such a family, the length of the shortest geodesic is bounded below. In a family produced by ramifying over four points in $P^1$, by contrast, as two ramification points collide, there is a loop whose image downstairs stays near that pair of points, looping about them several times, and which thus has arbitrarily small length. There's probably a variant of this argument with Mumford-Tate groups.
Added:
Jason objects that branch points downstairs can collide without ramification points upstairs colliding. This is true, but easy to patch. There are four branch points and some pairs of them might collide without the ramification points colliding, but if all pairs collide without the ramification points colliding, then they must permute different sets of sheets and the curve is not (geometrically) connected.
Jordan objects that even if the ramification points collide, the curve may remain smooth. In particular, if the curve is a $d$-fold cover of $P^1$ and all the monodromy is a power of a fixed $d$-cycle, and if the two colliding points are labeled by $a$ and $b$ so that all three of $a$, $b$, and $a+b$ are relatively prime to $d$, then the family is smooth. If this is true for all colliding pairs, this gives a complete family of curves that map to $P^1$ with only four branch points, contradicting my claim. In particular, $d=5$ with labels $1,1,1,2$ gives a complete family of genus $4$ curves.
Such examples don't exist for smaller $d$. Thus a complete curve in $M_3$ (such curves exist, right?) is a good candidate for not having a map to $P^1$ with $4$ branch points. Collisions of branch points with other forms of monodromy are harder for me to understand, but they look more likely to result in degeneration.
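Here is a quick sanity check of the claim about small $d$, as a sketch under my reading of Jordan's criterion: the four labels $a_1,\dots,a_4$ are taken mod $d$, each $a_i$ and each pairwise sum $a_i+a_j$ must be prime to $d$, and the labels must sum to $0$ mod $d$ (the product of the local monodromies on $P^1$ is trivial). The function names are made up for this illustration.

```python
from itertools import combinations_with_replacement
from math import gcd

def smooth_collision_labels(d):
    """Sorted 4-tuples of labels mod d satisfying the assumed criterion."""
    found = []
    for labels in combinations_with_replacement(range(1, d), 4):
        # The product of the four local monodromies must be trivial.
        if sum(labels) % d != 0:
            continue
        # Each label must generate Z/d (total ramification at each point).
        if any(gcd(a, d) != 1 for a in labels):
            continue
        # Every pairwise sum must also be prime to d.
        pairs_ok = all(
            gcd(labels[i] + labels[j], d) == 1
            for i in range(4) for j in range(i + 1, 4)
        )
        if pairs_ok:
            found.append(labels)
    return found

def genus_totally_ramified(d, r=4):
    """Genus from Riemann-Hurwitz, 2g - 2 = -2d + r(d - 1),
    for a degree-d cover of P^1 totally ramified over r points."""
    return 1 + (r * (d - 1) - 2 * d) // 2

for d in range(2, 6):
    print(d, smooth_collision_labels(d))
```

Under these assumptions the search comes back empty for $d=2,3,4$, and for $d=5$ it returns $(1,1,1,2)$ together with its multiples by units mod $5$; the genus helper gives $4$ for $d=5$.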
If $X$ is a (smooth projective) curve over $\overline{\mathbb{Q}}$, we define the Belyi degree $\deg_B(X)$ of $X$ to be the minimum degree of a Belyi map $X\to \mathbb{P}^1_{\overline{\mathbb{Q}}}$.
The Belyi degree is a function on $\mathcal{M}_g(\overline{\mathbb{Q}})$ which satisfies the following Northcott-type finiteness property.
Proposition. (Strong Northcott) For every integer $d$, the set of $\overline{\mathbb{Q}}$-isomorphism classes of curves $X$ over $\overline{\mathbb{Q}}$ with $\deg_B(X)\leq d$ is finite.
Proof. Like all finiteness statements, this one also boils down to some "general" finiteness statement. In this case, the statement (seemingly arithmetic in nature) is a consequence of a (topological) finiteness property of the fundamental group of $\mathbb{P}^1\setminus \{0,1,\infty\}$. Indeed, connected covers of $\mathbb{P}^1$ of degree $d$ that are unramified outside $\{0,1,\infty\}$ correspond to conjugacy classes of subgroups of index $d$ in this fundamental group, so the proposition follows from the fact that the fundamental group of $\mathbb{P}^1\setminus \{0,1,\infty\}$ is finitely generated, and that a finitely generated group has only finitely many subgroups of index at most $d$. QED
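To make the count in the proof concrete, here is a small brute-force sketch (my own illustration, not taken from the references). It uses the fact that the fundamental group of $\mathbb{P}^1\setminus\{0,1,\infty\}$ is free on two generators, so a subgroup of index $d$ is the same thing as a transitive pair of permutations $(\sigma_0,\sigma_1)$ of $\{0,\dots,d-1\}$ with the point $0$ marked, up to relabelling the other $d-1$ points. The function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def is_transitive(s0, s1):
    """True if the group generated by the permutations s0, s1 has one orbit."""
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in (s0[i], s1[i]):
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(s0)

def count_index_d_subgroups(d):
    """Number of index-d subgroups of the free group on two generators.

    Counts transitive pairs in S_d and divides by (d - 1)!, the number of
    relabellings that fix the marked point (these act freely on such pairs).
    """
    perms = list(permutations(range(d)))
    transitive_pairs = sum(
        1 for s0 in perms for s1 in perms if is_transitive(s0, s1)
    )
    return transitive_pairs // factorial(d - 1)

for d in range(1, 5):
    print(d, count_index_d_subgroups(d))
```

Each degree-$d$ Belyi map gives a conjugacy class of such subgroups, so these (finite) counts bound the number of Belyi maps of degree $d$ from above. The brute force is only feasible for very small $d$.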
Note that this proposition can be used to enumerate all (isomorphism classes of) curves over $\overline{\mathbb{Q}}$. Simply "write" down the curves of Belyi degree at most $3$, then $4$, then $5$, etc.
The Northcott property satisfied by the Belyi degree is much stronger than that of any Weil height $h$. The Northcott property for a Weil height usually requires in addition a bound on the degree of the point.
The Strong Northcott property implies that, given a Weil height $h$ (or any function!) on $\mathcal{M}_g(\overline{\mathbb{Q}})$, there is a function $f$ such that, for every curve $X$,
$$ h(X) \leq f(\deg_B(X)).$$
Thus, any function on $\mathcal{M}_g(\overline{\mathbb{Q}})$ is bounded by a function of the Belyi degree, simply because of the above proposition: one can take $f(d)$ to be the maximum of $h$ over the finitely many curves $X$ with $\deg_B(X)\leq d$. For example, the genus of $X$ is bounded by $\deg_B(X)$. This follows from the Riemann-Hurwitz formula.
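To spell out the Riemann-Hurwitz step (a short sketch; the notation $\pi$, $e_P$ is introduced here): let $\pi\colon X\to\mathbb{P}^1$ be a Belyi map of degree $d=\deg_B(X)$, let $g$ be the genus of $X$, and let $e_P$ be the ramification index at $P\in X$. Since $\pi$ is unramified outside $\{0,1,\infty\}$ and each fibre is non-empty,

$$ 2g-2 \;=\; -2d+\sum_{P\in X}(e_P-1) \;=\; -2d+\sum_{q\in\{0,1,\infty\}}\bigl(d-\#\pi^{-1}(q)\bigr) \;\leq\; -2d+3(d-1) \;=\; d-3, $$

so that $g\leq (d-1)/2$, and in particular $g\leq \deg_B(X)$.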
There are a few natural (arithmetic) invariants on $\mathcal{M}_g(\overline{\mathbb{Q}})$ such as the Faltings height for which one can write down explicit bounds. For example:
Theorem. If $X$ is a curve over $\overline{\mathbb{Q}}$ with Faltings height $h_F(X)$, then $$h_F(X) \leq 10^8 \deg_B(X)^6.$$
This (with many more explicit inequalities) is proven in [1]. The motivation for proving such inequalities is that they can be used to control the running time of certain algorithms computing coefficients of modular forms.
The question of actually computing the Belyi degree of a curve is an interesting one. An algorithm (which I would not recommend trying to implement) for doing so is given in [2].
[1] A. Javanpeykar. Polynomial bounds for Arakelov invariants of Belyi curves, with an appendix by Peter Bruin. Algebra and Number Theory, Vol. 8 (2014), No. 1, 89–140.
[2] A. Javanpeykar and J. Voight. The Belyi degree of a curve is computable. Contemp. Math., Vol. 722 (2019), 43–57.
Best Answer
Another example, like JSE's, that comes already equipped with a Belyi map but is not as familiar as modular curves and Fermat curves: for any relatively prime integers $m,n$ with $0<m<n$, and any subgroup $G$ of $S_n$, take the curve that parametrizes trinomials $x^n + a x^m + b$ up to scaling with Galois group contained in $G$. The Belyi map is the invariant $a^n/b^{n-m}$ of the trinomial, and its degree is $d=[S_n:G]$; it is branched at $0$, $\infty$, and $(-n)^n/(m^m (n-m)^{n-m})$. One may assume $m \leq n/2$ (by symmetry with respect to $x \leftrightarrow 1/x$, $m \leftrightarrow n-m$). Some nontrivial examples with $n=5,7,8$ are given explicitly at http://www.math.harvard.edu/~elkies/trinomial.html; the subsequent paper with N. Bruin on the cases $(m,n) = (1,7)$ and $(1,8)$ with $d = 30$ is
Nils Bruin and Noam D. Elkies, Trinomials $ax^7+bx+c$ and $ax^8+bx+c$ with Galois Groups of Order 168 and $8 \cdot 168$, Lecture Notes in Computer Science 2369 (proceedings of ANTS-5, 2002; C. Fieker and D. R. Kohel, eds.), 172-188.
(These examples all have $G$ transitive, but the construction works for all subgroups $G$.)
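For readers who want to check the third branch value and the degree $d=30$, here is a small exact-arithmetic sketch. It rests on a normalization that is my assumption, not part of the construction above: since the invariant $a^n/b^{n-m}$ is unchanged by scaling $x$, a trinomial with a (nonzero) double root can be scaled so that the double root is at $x=1$, which forces $a=-n/m$ and $b=(n-m)/m$. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def branch_value(m, n):
    """The third branch value (-n)^n / (m^m (n-m)^(n-m)) of a^n / b^(n-m)."""
    return Fraction((-n) ** n, m ** m * (n - m) ** (n - m))

def double_root_check(m, n):
    """Check that x^n + a x^m + b with a double root at x = 1 (assumed
    normalization) has invariant a^n / b^(n-m) equal to branch_value(m, n)."""
    a = Fraction(-n, m)
    b = Fraction(n - m, m)
    value_at_one = 1 + a + b   # f(1)
    slope_at_one = n + m * a   # f'(1)
    invariant = a ** n / b ** (n - m)
    return (value_at_one == 0 and slope_at_one == 0
            and invariant == branch_value(m, n))

def index_in_symmetric_group(n, group_order):
    """The index [S_n : G] for a subgroup G of the given order."""
    return factorial(n) // group_order

print(branch_value(1, 3))                  # discriminant locus of x^3 + a x + b
print(double_root_check(1, 7), double_root_check(1, 8))
print(index_in_symmetric_group(7, 168), index_in_symmetric_group(8, 8 * 168))
```

For $(m,n)=(1,3)$ this recovers the familiar condition $4a^3+27b^2=0$ in the form $a^3/b^2=-27/4$, and the two indices agree with $d=30$ for the groups of order $168$ and $8\cdot 168$.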