If you are willing to stick to characteristic zero, then you can assume that there is actually a morphism $f\colon X\longrightarrow Y$ realizing the birational equivalence (reason: take the closure of the graph $\Gamma\subset X\times Y$ of the birational map, use resolution of singularities to resolve $\Gamma$, and then replace $X$ by the resolved $\Gamma$). In this case, $f_{*}\mathcal{O}_{X}=\mathcal{O}_{Y}$ and all higher direct images are zero; the Leray spectral sequence then implies that the Euler characteristics are equal.
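To spell out that last step (a sketch, assuming $X$ and $Y$ are proper so that the Euler characteristics are defined), the Leray spectral sequence reads

$$E_2^{p,q} = H^p(Y, R^q f_{*}\mathcal{O}_{X}) \Longrightarrow H^{p+q}(X, \mathcal{O}_{X}).$$

Since $R^q f_{*}\mathcal{O}_{X} = 0$ for $q > 0$ and $f_{*}\mathcal{O}_{X}=\mathcal{O}_{Y}$, only the row $q = 0$ is nonzero, so the sequence degenerates and

$$H^p(X, \mathcal{O}_{X}) \cong H^p(Y, \mathcal{O}_{Y}) \text{ for all } p, \qquad \chi(\mathcal{O}_{X}) = \sum_p (-1)^p \dim H^p(X, \mathcal{O}_{X}) = \chi(\mathcal{O}_{Y}).$$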
More generally, if $Y$ has rational singularities and $f\colon X\longrightarrow Y$ is a proper birational map, with $X$ smooth, then $f_{*}\mathcal{O}_{X}=\mathcal{O}_Y$ and all higher direct images are zero (this is the definition of rational singularities) and so the same conclusion follows. Smooth varieties have rational singularities! (The computation for smooth varieties is necessary to show that the definition makes sense, i.e., that checking that this property holds for one resolution $X$ implies that it holds for all resolutions).
Here is how I think about G-R-R in the context of moduli of curves. I realize now that I wrote something quite long.
Let me recall first the definition of the tautological ring. A consequence of the results on the birational geometry of $\overline M_g$ is that there is no hope of understanding the whole Chow ring of $\overline M_g$: for instance, unlike in misleading low genus examples, the Chow ring will in general be infinite-dimensional. In "Towards an enumerative geometry...", David Mumford introduces a finite-dimensional subring of the Chow ring which contains all "geometrically natural" classes and proposes studying it instead; this subring is called the tautological ring. Let me quote:
"Whenever a variety or topological space is defined by some universal property, one expects that by virtue of its defining property, it possesses certain cohomology classes called tautological classes. The standard example is a Grassmannian [...] by its very definition, there is a universal bundle $E$ on Grass of rank $k$, and this induces Chern classes $c_l(E)$ in both the cohomology ring and Chow ring of Grass."
Let me expand on the meaning of "geometrically natural". There are several possible definitions of the tautological ring. The one used by Mumford is that it is the subring generated by the so-called $\kappa$-classes, which is not really the right one: you should also for instance consider the boundary divisors as tautological classes (but this is implicit already in Mumford's paper). A nice definition is the one of Faber and Pandharipande, which defines the tautological ring for all spaces $\overline M_{g,n}$ simultaneously: it is the minimal system of subrings which contains all fundamental classes, is closed under all gluing morphisms, and is closed under all forgetting points-morphisms.
Morally what this means is that: (i) for any "natural" bundle you can write down directly in terms of the moduli functor, its Chern classes are going to be tautological; (ii) any sort of "natural" gluing procedure on curves is going to keep you inside of the tautological ring. For example, the $\lambda$-classes (Chern classes of the Hodge bundle) are tautological, the $\psi$-classes are tautological (the line bundles given by the cotangent line at a marked point), and the $\kappa$-classes are tautological.
OK, so let us return to G-R-R. Let $f \colon X \to Y$ be a proper morphism. On one side of the equation you have the Chern character of the derived pushforward $Rf_\ast F$. On the other side you have the pushforward of the product of the Chern character of $F$ and the Todd class of the relative tangent sheaf $T_f$. The point is that $F$, $Rf_\ast F$ and $T_f$ can all be made sense of by working locally/fiberwise: we don't need to know anything about the global structure of $Y$ to apply G-R-R to $f$ and $F$. But this is also how the tautological ring was set up: the classes in the tautological ring are exactly those that can be defined by pushing around classes of "fiberwise" defined bundles, which means that these are exactly the classes that can be defined without making any reference to any "global" structure of the moduli space.
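In symbols, and only as a sketch (assuming the relative form of the theorem applies, for instance when $X$ and $Y$ are smooth so that $T_f = T_X - f^\ast T_Y$ makes sense as a class in K-theory, and taking Chow groups with rational coefficients), the identity being described is

$$\mathrm{ch}(Rf_\ast F) = f_\ast\bigl(\mathrm{ch}(F)\cdot \mathrm{td}(T_f)\bigr), \qquad Rf_\ast F = \sum_i (-1)^i R^i f_\ast F.$$

Every ingredient on either side is built from $F$ and the fibers of $f$, which is the "fiberwise" feature used in the discussion above.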
So in hindsight Grothendieck-Riemann-Roch seems tailor made for the study of tautological rings. On the other hand, this is also a limitation of G-R-R: it will produce lots of relations and identities relating tautological classes to each other, but it will never prove any "global" statement about any of them.
As an example, it is possible to algorithmically compute the intersection number on $\overline M_{g,n}$ for any polynomial in boundary strata and $\lambda$-, $\psi$- and $\kappa$-classes. First you express the $\kappa$-classes as pushforwards of $\psi$-classes, then G-R-R can be used to express the $\lambda$-classes in terms of pushforwards of $\psi$-classes, which will finally reduce your computation to an intersection number only involving $\psi$-classes. All this was completely formal, but sooner or later you are going to need to use some global geometric property of $\overline M_{g,n}$ to find an actual number, and this is where the Witten conjecture (Kontsevich's theorem) comes in: it tells you how to compute any intersection number of $\psi$-classes.
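To give a feel for the final step, here is a minimal Python sketch that computes $\psi$-class intersection numbers $\langle \tau_{a_1}\cdots\tau_{a_n}\rangle_g$ in genus 0 and 1 only. It uses just the string and dilaton equations together with the base values $\langle\tau_0^3\rangle_0 = 1$ and $\langle\tau_1\rangle_1 = 1/24$, which happen to suffice in these two genera; it is not the full Witten-Kontsevich recursion, which is needed for $g \ge 2$. The function names `psi_number` and `_psi_sorted` are made up for this illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def _psi_sorted(genus, exponents):
    """Psi-class intersection number for a sorted tuple of exponents (genus 0 or 1)."""
    n = len(exponents)
    # Dimension constraint: the exponents must sum to dim M_{g,n} = 3g - 3 + n.
    if n == 0 or min(exponents) < 0 or sum(exponents) != 3 * genus - 3 + n:
        return Fraction(0)
    # Base cases: <tau_0^3>_0 = 1 and <tau_1>_1 = 1/24.
    if genus == 0 and exponents == (0, 0, 0):
        return Fraction(1)
    if genus == 1 and exponents == (1,):
        return Fraction(1, 24)
    if exponents[0] == 0:
        # String equation: drop one tau_0 and lower each remaining exponent in turn.
        rest = exponents[1:]
        total = Fraction(0)
        for i, a in enumerate(rest):
            if a > 0:
                lowered = rest[:i] + (a - 1,) + rest[i + 1:]
                total += _psi_sorted(genus, tuple(sorted(lowered)))
        return total
    # No tau_0 left: this only happens in genus 1 with all exponents equal to 1.
    # Dilaton equation: drop one tau_1 and multiply by 2g - 2 + (remaining points).
    rest = exponents[1:]
    return (2 * genus - 2 + len(rest)) * _psi_sorted(genus, rest)


def psi_number(genus, *exponents):
    """Return <tau_{a_1} ... tau_{a_n}>_g as a Fraction, for genus 0 or 1 only."""
    if genus not in (0, 1):
        raise ValueError("this sketch only handles genus 0 and 1")
    return _psi_sorted(genus, tuple(sorted(exponents)))
```

For instance, `psi_number(0, 0, 0, 0, 1, 1)` returns `Fraction(2, 1)`, in agreement with the genus 0 closed formula $(n-3)!/\prod a_i!$, and `psi_number(1, 0, 2)` returns `Fraction(1, 24)`.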
So let me finally talk a bit about the article of Harris and Mumford. The first application of G-R-R in their article is to derive the formula $K_{\overline{M}_g} = 13\lambda_1 - 2\delta_0 - 3\delta_{1} - 2\delta_{2} - \ldots - 2\delta_{\lfloor g/2 \rfloor}$ in the tautological ring. This is done by applying GRR to the projection from the universal curve and truncating after the first term. Incidentally, if you don't truncate after the first term, you get Mumford's formula (derived in "Towards an enumerative geometry...") expressing the Chern character of the Hodge bundle in terms of $\kappa$-classes and pushforwards of $\psi$-classes from the boundary strata.
But again, GRR will not give you any global geometric information, such as whether a class is big or ample. The idea is then to find an effective divisor $D$ such that $mK_{\overline M_g} = D + a\lambda_1$ with $a > 0$. It turns out that this is possible for $D$ equal to the locus of $k$-gonal curves, where they pick $g = 2k-1$. They describe in the article how they came up with this particular choice of $D$ by trying to generalize the work of Freitag on the Kodaira dimension of $A_g$ for $g$ large. In particular, I think that there should be a Siegel modular form whose pullback to $M_g$ conjecturally would have $D$ as its vanishing locus; I don't know if this was actually worked out in later work. Then $nK_{\overline M_g}$ for large enough $n$ defines a birational map, using the fact that $\lambda_1$ is ample on $A_g$, ultimately because the Satake compactification is the Proj of the ring of Siegel modular forms, i.e. the sections of powers of the determinant of the Hodge bundle. (However, $\lambda_1$ is not ample on $\overline M_g$!)
This part is clarified by the later article of Cornalba and Harris showing that a linear combination $a\lambda - b\delta$ is ample if and only if $a > 11b > 0$. The rational Picard group of $\overline M_g$ is generated by $\lambda_1$ and the boundary divisors, so any effective divisor has an expression of the form $a\lambda - \sum b_i \delta_i$. Estimating the Kodaira dimension of $\overline M_g$ therefore really comes down to finding effective divisors such that the slopes $a/b_i$ are small.
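To make "small" concrete, here is a sketch of the inequality involved. It assumes the canonical class formula quoted earlier in the form $K_{\overline M_g} = 13\lambda - 2\delta_0 - 3\delta_1 - 2\delta_2 - \ldots - 2\delta_{\lfloor g/2\rfloor}$ (writing $\lambda = \lambda_1$), that $D = a\lambda - \sum b_i\delta_i$ is effective with $a > 0$ and all $b_i > 0$, and that $\lambda$ is big; the rational number $c$ is notation introduced only for this illustration. For any $c > 0$,

$$K_{\overline M_g} - cD = (13 - ca)\lambda + (cb_0 - 2)\delta_0 + (cb_1 - 3)\delta_1 + \sum_{i \ge 2} (cb_i - 2)\delta_i.$$

The right-hand side is a positive multiple of $\lambda$ plus an effective boundary class as soon as $ca < 13$, $cb_1 \ge 3$ and $cb_i \ge 2$ for $i \ne 1$. Such a $c$ exists exactly when

$$\frac{a}{b_1} < \frac{13}{3} \quad\text{and}\quad \frac{a}{b_i} < \frac{13}{2} \ \text{ for } i \neq 1,$$

and in that case $K_{\overline M_g}$ is an effective class plus a positive multiple of $\lambda$, hence big.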
Anyway, the second application of GRR in their article is to show that on the open part $M_g$, the $k$-gonal locus is a multiple of $\lambda_1$. Actually, this part uses even more crucially Porteous's formula: once they express $k$-gonality in terms of a morphism of bundles having lower than expected rank, the class of the $k$-gonal locus can be expressed in terms of Chern classes of the two bundles, i.e. in terms of tautological classes. It follows then that $D$ is the sum of a multiple of $\lambda_1$ and an integral linear combination of boundary divisors. Finally these integers are determined by evaluating the divisors on suitable "test curves". They conclude that $\overline M_g$ is of general type for big $g$.
OK, here is another way to see it more in line with what you had in mind I think. Write your $L$ as $\mathcal O(D)$ for some divisor $D$ on $X$. Set $J_1$ to be the ideal sheaf defined by $\mathcal O(-D) \cap \mathcal O_X$ and $J_2$ to be the ideal sheaf defined by $\mathcal O(D) \cap \mathcal O_X$ (intersections taken inside of $K_X$). Let $Y_i$ be the closed subschemes of $X$ defined by these ideal sheaves (they have dimension smaller than that of $X$). Then we have the exact sequences
$$0 \to J_1(kD) \to \mathcal O(kD) \to \mathcal O_{Y_1}(kD) \to 0$$
$$0 \to J_2((k-1)D) \to \mathcal O((k-1)D) \to \mathcal O_{Y_2}((k-1)D) \to 0$$
The two left hand terms are equal by construction. Then by the induction hypothesis, and chasing the Euler characteristics, $\chi(kD) - \chi((k-1)D)$ is a numerical polynomial. This implies that $\chi(kD)$ itself is a numerical polynomial (Section 1.7 of Hartshorne's Algebraic Geometry).
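Spelled out (a sketch, writing $\chi(kD)$ for $\chi(X, \mathcal O(kD))$ and using additivity of $\chi$ on short exact sequences), the two sequences give

$$\chi(kD) = \chi(J_1(kD)) + \chi(\mathcal O_{Y_1}(kD)), \qquad \chi((k-1)D) = \chi(J_2((k-1)D)) + \chi(\mathcal O_{Y_2}((k-1)D)).$$

Since $J_1(kD) = J_2((k-1)D)$, subtracting yields

$$\chi(kD) - \chi((k-1)D) = \chi(\mathcal O_{Y_1}(kD)) - \chi(\mathcal O_{Y_2}((k-1)D)),$$

and each term on the right is a numerical polynomial in $k$ by induction on the dimension.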
(Here I swept something under the rug, because the subschemes $Y_i$ may not be as nice as $X$ was. But they are at least proper, so the statement we should really prove is that for a proper scheme $W$ and a divisor $D$ on it, $\chi(kD)$ is a polynomial in $k$. One then reduces to the case where $W$ is reduced by looking at the inclusion of $W_\mathrm{red}$ into $W$, and further to the case where $W$ is integral.)