A trick I have seen several times: If you want to show that some rational number is an integer (i.e., to prove a divisibility), show that it is an algebraic integer. Technically, this is an application of commutative algebra (the integral closedness of $\mathbb Z$, together with properties of integral closure such as: the sum of two algebraic integers is again an algebraic integer), but since you define algebraic number theory as the theory of algebraic numbers, you may be interested in this kind of application.
Example: Let $p$ be a prime such that $p\neq 2$. Prove that the $p$-th Fibonacci number $F_p$ satisfies $F_p\equiv 5^{\left(p-1\right)/2}\mod p$.
Proof: We can do the $p=5$ case by hand (indeed, $F_5=5\equiv 0\equiv 5^2\mod 5$), so let us assume that $p\neq 5$ for now. Then, $p$ is coprime to $5$ in $\mathbb Z$. Let $a=\frac{1+\sqrt5}{2}$ and $b=\frac{1-\sqrt5}{2}$. The Binet formula yields $F_p=\displaystyle\frac{a^p-b^p}{\sqrt5}$. Now, $a^p-b^p\equiv\left(a-b\right)^p\mod p\mathbb Z\left[a,b\right]$ (by the idiot's binomial formula $\left(x+y\right)^p\equiv x^p+y^p\mod p$, applied to $x=a$ and $y=-b$, since $p$ is an odd prime). Note that $p$ is coprime to $5$ in the ring $\mathbb Z\left[a,b\right]$ (since $p$ is coprime to $5$ in the ring $\mathbb Z$, and thus there exist integers $u$ and $v$ such that $pu+5v=1$). Now,
$\displaystyle F_p=\frac{a^p-b^p}{\sqrt5}\equiv\frac{\left(a-b\right)^p}{\sqrt5}$ (since $a^p-b^p\equiv\left(a-b\right)^p\mod p\mathbb Z\left[a,b\right]$ and since we can divide congruences modulo $p\mathbb Z\left[a,b\right]$ by $\sqrt5$, because $5=\sqrt5\cdot\sqrt5$ and $p$ is coprime to $5$ in $\mathbb Z\left[a,b\right]$)
$\displaystyle =\frac{\left(\sqrt5\right)^p}{\sqrt5}$ (since $a-b=\sqrt5$)
$=5^{\left(p-1\right)/2}\mod p\mathbb Z\left[a,b\right]$.
In other words, the number $F_p-5^{\left(p-1\right)/2}$ is divisible by $p$ in the ring $\mathbb Z\left[a,b\right]$. Hence, $\frac{F_p-5^{\left(p-1\right)/2}}{p}$ is an algebraic integer. But it is also a rational number. Thus, it is an integer, so that $p\mid F_p-5^{\left(p-1\right)/2}$ and thus $F_p\equiv 5^{\left(p-1\right)/2}\mod p$, qed.
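The congruence can also be sanity-checked numerically. The following is only a minimal sketch, not part of the proof: the helper names `fib`, `is_prime` and `congruence_holds` are made up for this illustration, and the bound 200 is an arbitrary choice.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for the small range used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def congruence_holds(p: int) -> bool:
    """Check whether p divides F_p - 5^((p-1)/2) for an odd prime p."""
    return (fib(p) - pow(5, (p - 1) // 2, p)) % p == 0


# All odd primes below an arbitrary bound (p = 5 is included).
odd_primes = [p for p in range(3, 200) if is_prime(p)]
failures = [p for p in odd_primes if not congruence_holds(p)]
print(failures)  # prints [] since the congruence holds for every odd prime
```

For instance, $p=7$ gives $F_7=13$ and $5^3=125\equiv 6\mod 7$, and indeed $13-6=7$.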
The BSD conjecture for an abelian variety $A$ over a function field holds if Ш$(A)[\ell^\infty]$ is finite for some prime $\ell$ ($\ell = p$ allowed). This is a theorem by Schneider, Bauer and Kato-Trihan. If $A$ is a constant abelian variety, Ш$(A)$ is finite by Milne's PhD thesis.
Edit: Since the analytic rank $\rho$ is always greater than or equal to the algebraic rank, one has BSD if $\rho = 0$ (by the equivalence of weak BSD and the finiteness of an $\ell$-primary component of Ш). I show this inequality even for Abelian schemes over higher-dimensional bases over finite fields in http://kellertimo.name/Height.pdf, Lemma 2.17.
Some comments, too extensive to fit into the comment box:
(1) There is a fairly recent reworking of at least some parts of the proof in the book "Heegner points and Rankin $L$-series", MSRI Publ. 49. (Brian Conrad in particular has a paper in there reworking the deformation theory arguments.)
(2) The theorem is a computation: one computes the height of the Heegner point, using Néron--Tate local heights, and relates the answer (a sum of contributions from each place) to a corresponding expression for the derivative of the $L$-function.
(3) It is Kolyvagin's work which shows that if the Heegner point is non-zero, then it generates the Mordell--Weil group (up to finite index); so if you want motivation for the truth of Gross--Zagier, you can think of it as being a consequence of BSD + Kolyvagin. (This may be ahistorical, though.)
(4) Historically, Birch was the one who computed Heegner points on elliptic curves, and found that they were generators of the Mordell--Weil group (up to finite index) precisely when the rank was one. This was a big source of encouragement for Gross (as he explained at one point when I was in grad school), because it meant that there should be a relation between the derivative at 1 and the height of the Heegner point, and one just had to find it.
(5) The arithmetico-geometric parts of Gross--Zagier are wonderful; I wouldn't at all think of it as futile to study them. I've not studied the analytic parts, but no doubt they're equally wonderful.
(6) You might start with the Crelle paper of Gross--Zagier, which essentially treats the case of level one. Since the modular curve of level one has genus 0, the height is necessarily zero, and so one gets a very nice formula relating the sum of the finite local heights to the archimedean local height. And one can prove the same formula another way, using a special case of the analytic arguments that in the general setting compute the derivative. The fact that the same formula is obtained these two different ways is a special case of the general Gross--Zagier formula; but it may be simpler to understand the two sides and the comparison between them in this level one setting.
(7) As far as I understand, Kato says nothing in the analytic rank one case. For BSD in this case, one needs Gross--Zagier plus Kolyvagin.