I'm at a stage where I'm quite comfortable with the appearance of local non-archimedean fields in the maths I encounter, having seen a fair bit of technology built upon their structure and applications to connected areas. Yet I still feel like I have an unsatisfying understanding of why their introduction is absolutely crucial to the study of algebraic number theory and arithmetic geometry, especially since all these applications are hugely advanced compared to the point at which $p$-adic numbers are usually introduced in a student's career. If I want to motivate their definition, I naturally go through the following implications in my head:
- to study a problem over the ring of integers $\mathbb{Z}$ (or more generally, the ring of integers of a number field) the strategy is to localise the problem at a prime $p$, so as to focus on it and worry about the problem 'one point at a time', so to speak;
- we (literally, now) localise at the prime $p$, just as one would do with the ring of regular functions on a variety, and replace $\mathbb{Z}$ with the ring $\mathbb{Z}_{(p)} := \{ \frac{x}{y} \in \mathbb{Q} \mid p\nmid y \}$ ;
- we can then complete at the maximal ideal $(p) \subseteq \mathbb{Z}_{(p)}$ to obtain the $p$-adic integers $\mathbb{Z}_p$, with the benefit that now there's access to approximation techniques such as Hensel's lemma, and the newly obtained ring has pretty much all the same algebraic properties as $\mathbb{Z}_{(p)}$ because of its identical valuation theory.
It's this last step which still confuses me… even though I can appreciate the utility of approximation techniques, it still feels extremely arbitrary why it turns out to be so inevitable with all the theory that builds upon this measly little step. Somehow, every time I've seen the study of formal neighbourhoods in algebraic geometry it always feels like it's a tool to tackle a problem, and not the main object of study, whereas in my head the $p$-adic numbers have turned out to be the main character in many areas of mathematics, which sort-of goes against this intuition of mine I find.
Since this intuition really doesn't come from any of my teachers and I'm not quite sure it makes much sense, I wanted to ask if it's an apt way to think about the use of non-archimedean fields in mathematics; I'd also be very interested in learning about their role in the history of algebraic number theory since I'm not quite sure I can properly place their use and introduction on a timeline in a coherent way with all of the theory I have in mind.
I apologise if my question is very hand-wavy, and I'd be super grateful for any sort of insight 🙂 Thank you very much for your time!!
Best Answer
Let me explain how I think about one way in which $\mathbb{Q}_p$ (or more to the point $\mathbb{Z}_p$) appears from an arithmetic-geometric perspective. Let me approach this from a vantage point that should be understandable to any advanced undergraduate. In particular, let me take 'solving Diophantine equations' as the goal.
(For a parallel and more in-depth discussion, you can read this extended version of a talk I gave. For a perspective more in line with that discussed by the user hunter you can look at these old notes of mine [Disclaimer: these are quite old and so I can't vouch for their correctness, mathematically or philosophically])
The general motivation
The basic idea stems from the following credo which one encounters ad nauseam in their mathematical education. For a 'geometric space' $X$
$$\left\{\text{Problems on }X\right\}=\left\{\left(\begin{matrix}\text{Problems locally}\\ \text{on }X\end{matrix} \,\,,\,\,\begin{matrix}\text{Gluing}\\ \text{data}\end{matrix}\right)\right\}.$$
This breaks questions on $X$ into two (hopefully) more manageable pieces.
Good examples of this are:
Remark 1:
Of course, this idea really only works with the following serious caveat: the local 'geometry' of $X$ is simple(r) enough to make solving the problem locally on $X$ (more) tenable.
A geometric perspective of solving equations
So, how does this relate to solving Diophantine equations or, for that matter, your question?
The beautiful idea of Grothendieck and his collaborators (building off of the work of many others) shows that solving equations can, in fact, be put into the context of (Sect). Namely, there is associated to any ring $R$ a geometric object $\mathrm{Spec}(R)$, a so-called locally ringed space (the category of which we shall denote $\mathbf{LRS}$), such that
$$\mathbf{Ring}\to \mathbf{LRS},\qquad R\mapsto \mathrm{Spec}(R)$$
is a (contravariant) fully faithful embedding. This is a bit abstract, so before we explain how this relates to solving equations, let us spell this out a little more precisely.
The object $\mathrm{Spec}(R)$, being a locally ringed space, means it comes with two bits of structure:
The underlying topological space $\mathrm{Spec}(R)$ can be described very explicitly. As a set we take
$$\mathrm{Spec}(R)=\{\mathfrak{p}\subseteq R\text{ a prime ideal}\}.$$ The topology on $\mathrm{Spec}(R)$, called the Zariski topology, has (a basis of) open subsets given by the non-vanishing loci
$$D(f):=\{\mathfrak{p}\in\mathrm{Spec}(R):f\notin\mathfrak{p}\}$$ for an element $f$ of $R$. To help understand the intuition behind the terminology 'non-vanishing locus', it is helpful to think about the defining property of $\mathrm{Spec}(R)$: $R=\mathcal{O}(\mathrm{Spec}(R))$; in other words, the ring of functions on $\mathrm{Spec}(R)$ is $R$.
To understand this, let us first think of the example $R=\mathbb{C}[x]$. Here we have $$ \{(x-p):p\in\mathbb{C}\}\cup\{(0)\}= \mathrm{Spec}(\mathbb{C}[x])\supseteq \mathrm{MaxSpec}(\mathbb{C}[x])=\{(x-p):p\in\mathbb{C}\},$$ where for a ring $R$ we write $\mathrm{MaxSpec}(R)$ for the set of maximal ideals of $R$. Note then that any element $f$ in $\mathbb{C}[x]$ can be thought of as a function $$f\colon\mathrm{MaxSpec}(\mathbb{C}[x])\to \mathbb{C},\qquad (x-p)\mapsto (f\bmod (x-p))=f(p),$$ where this last identification is using the isomorphism $\mathbb{C}[x]/(x-p)\cong \mathbb{C}$.
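As a small sanity check of the identification $(f\bmod (x-p))=f(p)$, here is a Python sketch (the function names are my own, purely for illustration). It divides $f$ by $x-p$ using synthetic division and confirms that $f=q\cdot(x-p)+r$ with $r$ a constant, so that the class of $f$ in $\mathbb{C}[x]/(x-p)$ is the number $r=f(p)$. I use integer points only to keep the arithmetic exact.

```python
def divide_by_linear(coeffs, p):
    """Divide f by (x - p) using synthetic division.

    coeffs lists the coefficients of f from highest degree to lowest.
    Returns (quotient_coeffs, remainder).
    """
    running = []
    acc = 0
    for c in coeffs:
        acc = acc * p + c
        running.append(acc)
    return running[:-1], running[-1]


def multiply_back(quotient, p, remainder):
    """Expand quotient * (x - p) + remainder, highest degree first."""
    result = [0] * (len(quotient) + 1)
    for i, q in enumerate(quotient):
        result[i] += q          # contribution of q * x
        result[i + 1] -= q * p  # contribution of q * (-p)
    result[-1] += remainder
    return result


# f(x) = x^3 - 2x + 5, reduced modulo (x - 2)
f = [1, 0, -2, 5]
quotient, remainder = divide_by_linear(f, 2)
assert multiply_back(quotient, 2, remainder) == f
assert remainder == 2**3 - 2 * 2 + 5  # the remainder is the value f(2)
```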
Now for a general ring $R$ (especially those rings that will show up in our study of Diophantine equations), we don't have the luxury of their maximal ideals admitting such a geometric description. In fact, in general, we don't even have the luxury of having enough maximal ideals for $\mathrm{MaxSpec}(R)$ to be a particularly useful set (e.g. for a local ring there is only one!). That said, we may use the above example as a hint of how to think of an element $f$ of $R$ as a function on $\mathrm{Spec}(R)$: it is the assignment
$$f\colon \mathrm{Spec}(R)\longrightarrow \bigsqcup_{\mathfrak{p}\in\mathrm{Spec}(R)}k(\mathfrak{p}),\qquad \mathfrak{p}\mapsto f(\mathfrak{p}):=(f\bmod \mathfrak{p})\qquad (\ast).$$
Here $k(\mathfrak{p}):=\mathrm{Frac}(R/\mathfrak{p})$ is the 'residue field' of $R$ at $\mathfrak{p}$.
This might look quite bizarre, especially the codomain being a big disjoint union of different fields. But, the fact that this didn't occur in the case $R=\mathbb{C}[x]$ is semi-coincidental: for every $\mathfrak{p}$ in $\mathrm{MaxSpec}(\mathbb{C}[x])$ there is an isomorphism $k(\mathfrak{p})\cong\mathbb{C}$ and so we (sort of carelessly) identified them all with each other. As intimated before, such happy coincidences don't happen in general, and so we're stuck with the above. But, if you buy this interpretation, then the phrase 'non-vanishing locus' makes sense, as $D(f)$ is exactly those $\mathfrak{p}$ for which $f(\mathfrak{p})\ne 0$.
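To make $(\ast)$ concrete for $R=\mathbb{Z}$, here is a hedged Python sketch (helper names are hypothetical) that treats an integer $f$ as a function on the closed points $(p)$ of $\mathrm{Spec}(\mathbb{Z})$, with value $f\bmod p$ in $\mathbb{F}_p$, and lists the primes below a bound that lie in $D(f)$. It ignores the generic point $(0)$, which lies in $D(f)$ whenever $f\ne 0$.

```python
def value_at(f, p):
    """Value of the 'function' f at the point (p): the class of f in F_p."""
    return f % p


def small_primes(bound):
    """Primes below bound, by trial division (fine for a toy example)."""
    return [
        n for n in range(2, bound)
        if all(n % d for d in range(2, int(n**0.5) + 1))
    ]


def nonvanishing_locus(f, bound):
    """The primes p < bound with (p) in D(f), i.e. f(p) != 0 in F_p."""
    return [p for p in small_primes(bound) if value_at(f, p) != 0]


# The function f = 12 vanishes exactly at the points (2) and (3).
print(nonvanishing_locus(12, 20))
```

Note that the values at different primes live in different fields, which is exactly why the codomain in $(\ast)$ is a disjoint union.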
But, I told you that a locally ringed space also came with a sheaf $\mathcal{O}$ of functions -- what is it for $\mathrm{Spec}(R)$? For formal reasons (because the sets $D(f)$ form a basis) it suffices to describe the value of $\mathcal{O}(D(f))$, and the guess is now clear: the only new function that should be introduced by considering the non-vanishing locus of $f$ is the function $f^{-1}$. Thus, one sets $\mathcal{O}(D(f))=R[\tfrac{1}{f}]$.
Then, at least in very broad terms, the promised (contravariant) fully faithful embedding is manifested as a natural bijection
$$\mathrm{Hom}_\mathbf{Ring}(R,S)=\mathrm{Hom}_\mathbf{LRS}(\mathrm{Spec}(S),\mathrm{Spec}(R)) \quad (\ast\ast).$$
This bijection can be made very explicit, but since I am not even telling you what a morphism of locally ringed spaces is, let's just take it as a black box.
Two important examples to keep in mind for the discussion below are:
OK, fine, how does this help us solve equations? Well, let us fix an $m$-tuple of polynomials
$$\mathbf{f}=(f_1,\ldots,f_m)\in R[x_1,\ldots,x_n]^m.$$
We may consider the solution set functor for $\mathbf{f}$, which takes an $R$-algebra $S$ as input:
$$X_\mathbf{f}\colon \{R\text{-algebras}\}\to \mathbf{Set},$$
given by
$$X_\mathbf{f}(S):=\{(y_1,\ldots,y_n)\in S^n: f_j(y_1,\ldots,y_n)=0\text{ for }j=1,\ldots,m\}.$$
In other words, $X_\mathbf{f}(S)$ just spits out the solution set to $\mathbf{f}=0$ over $S$. On the other hand, note that there is a natural bijection between $X_\mathbf{f}(S)$ and the $S$-algebra maps
$$\alpha\colon S[x_1,\ldots,x_n]/(f_1,\ldots,f_m)\to S$$
given by sending $\alpha$ to the $n$-tuple $(\alpha(x_1),\ldots,\alpha(x_n))$.
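For finite rings the solution set functor can simply be computed by brute force. The following Python sketch (my own illustration, with $R=\mathbb{Z}$, $S=\mathbb{Z}/N\mathbb{Z}$, and polynomials passed as Python functions, all of which are assumptions of the example) computes $X_\mathbf{f}(\mathbb{Z}/N\mathbb{Z})$ and checks functoriality: the ring map $\mathbb{Z}/9\mathbb{Z}\to\mathbb{Z}/3\mathbb{Z}$ sends solutions to solutions.

```python
from itertools import product


def solution_set(polys, num_vars, modulus):
    """X_f(Z/modulus): all tuples in (Z/modulus)^num_vars killing every polynomial.

    polys is a list of Python functions with integer coefficients,
    each taking num_vars integer arguments.
    """
    return [
        point
        for point in product(range(modulus), repeat=num_vars)
        if all(f(*point) % modulus == 0 for f in polys)
    ]


def reduce_solutions(solutions, modulus):
    """Map on solution sets induced by reduction Z/M -> Z/modulus (modulus divides M)."""
    return {tuple(c % modulus for c in point) for point in solutions}


# The 'circle' x^2 + y^2 - 1 = 0 over Z/9 and Z/3.
circle = [lambda x, y: x * x + y * y - 1]
mod9 = solution_set(circle, 2, 9)
mod3 = solution_set(circle, 2, 3)
assert reduce_solutions(mod9, 3) <= set(mod3)
```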
So, formally studying $(\ast\ast)$ shows the following beautiful fact: there is a natural bijection
$$X_\mathbf{f}(S)\longleftrightarrow \left\{\begin{matrix}\text{Sections of the map}\\ \mathrm{Spec}(S[x_i]/(f_j))\to \mathrm{Spec}(S)\end{matrix}\right\}.$$
If you buy the above, then the upshot is that, taking $R=S=\mathbb{Z}$, we can squarely (if surprisingly) place solving Diophantine equations in the setting of the archetypal problem (Sect) from above with $X=\mathrm{Spec}(\mathbb{Z})$.
The ring $\mathbb{Z}_{(p)}^h$
Before we get too excited though, we need to recall the aforementioned caveat: all of this formalism is useful only if the local geometry of $\mathrm{Spec}(\mathbb{Z})$ is simple enough. Well...what is the local geometry of $\mathrm{Spec}(\mathbb{Z})$? Or, an even more pressing question...what does local even mean?
Well, there is a natural guess. After all $\mathrm{Spec}(\mathbb{Z})$ is a topological space, so 'local' should mean studying a 'sufficiently small' open neighborhood. Now, we know what the set $\mathrm{Spec}(\mathbb{Z})$ looks like:
$$\mathrm{Spec}(\mathbb{Z})=\{(p):p\text{ is a prime}\}\cup\{(0)\}.$$
A little thought even allows us to describe the Zariski open sets. Namely, the proper non-empty open subsets of $\mathrm{Spec}(\mathbb{Z})$ are of the form
$$\mathrm{Spec}(\mathbb{Z})-\{(p_1),\ldots,(p_n)\}$$
for primes $p_1,\ldots,p_n$.
The Zariski topology is strange. It's quite feeble (coarse). These open subsets are all just too big to somehow serve as a 'sufficiently small' neighborhood of a point $(p)$. As an analogy, for the space $\mathbb{C}$ (with its usual topology) and a point $p\in\mathbb{C}$, we think of the 'sufficiently small' open neighborhoods of $p$ to be of the form $p\in\mathbb{D}\subseteq\mathbb{C}$ where $\mathbb{D}$ is an open disc. In particular, we very much do not think of the open subsets $\mathbb{C}-\{p_1,\ldots,p_m\}$ for points $p_1,\ldots,p_m\in\mathbb{C}$ as being small.
Again, let me emphasize, while $\mathrm{Spec}(\mathbb{Z})$ is a powerful object, its topology (the Zariski topology) is somewhat feeble (coarse).
Remark 2:
Well, if no one Zariski open neighborhood of $(p)$ is small enough, you can do something even more drastic: what if we intersect all the neighborhoods? In normal topology land this is not a very reasonable thing to do -- for a Hausdorff topological space you just get the point back! But, the upside of the feebleness of the Zariski topology of $\mathrm{Spec}(\mathbb{Z})$ is that this is a reasonable operation here. In fact, inspecting $(\ast\ast)$, and the fact that it reverses the direction of maps, we might guess that
$$\bigcap_{(p)\in D(f)}D(f)=\varprojlim_{(p)\in D(f)}D(f),$$
should be a geometric space with ring of functions
$$\varinjlim_{f\notin (p)}\mathbb{Z}[\tfrac{1}{f}]=\mathbb{Z}_{(p)}.$$
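The direct limit above can be read quite concretely: a rational number lies in $\mathbb{Z}_{(p)}$ exactly when it already lies in some $\mathbb{Z}[\tfrac{1}{f}]$ with $f\notin(p)$, and one can take $f$ to be its reduced denominator. Here is a minimal Python sketch of that check (the function names are hypothetical, and $p=5$ is just a sample choice).

```python
from fractions import Fraction
from math import gcd


def in_Z_inverting(x, f):
    """True if x lies in Z[1/f], i.e. its reduced denominator divides a power of f."""
    d = x.denominator
    g = gcd(d, f)
    while g > 1:
        d //= g
        g = gcd(d, f)
    return d == 1


def in_localization(x, p):
    """True if x lies in Z_(p), i.e. p does not divide the reduced denominator."""
    return x.denominator % p != 0


p = 5
for x in [Fraction(7, 12), Fraction(-3, 49), Fraction(2)]:
    f = x.denominator  # f is not in (p), and x already lives in Z[1/f]
    assert in_localization(x, p) and f % p != 0 and in_Z_inverting(x, f)

# 1/10 lies in Z[1/10], but 10 is in (5), so that ring is not part of the limit.
assert not in_localization(Fraction(1, 10), p)
assert in_Z_inverting(Fraction(1, 10), 10)
```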
And so, in fact, a good model for the 'intersection of all neighborhoods of $(p)$' is $\mathrm{Spec}(\mathbb{Z}_{(p)})$. Maybe this is a "neighborhood" which is 'sufficiently small'.
Unfortunately, this is still not small enough (a true indictment of the Zariski topology). The reason for this is more subtle, and takes a bit more of a leap-of-faith in terms of geometric philosophy. Namely, if $U$ is to be a 'sufficiently small neighborhood' of $(p)$ then we would imagine that the map
$$\{(p)\}=\mathrm{Spec}(\mathbb{F}_p)\to \mathrm{Spec}(\mathbb{Z}_{(p)}),$$
should be something like a (weak) homotopy equivalence. After all, one of the main reasons that a small open disk around $p$ in $\mathbb{C}$ is often 'sufficiently small' in complex analysis is that it's contractible. Unfortunately, the map above is not such an equivalence: in a way that one can make precise, the map on fundamental groups isn't even an isomorphism.
Remark 3:
The idea of Grothendieck and his collaborators/descendants is the following: because the Zariski topology on $\mathrm{Spec}(\mathbb{Z})$ is too feeble, one should replace it with a sort of "generalized topology" where one can take 'sufficiently small neighborhoods' of points. I won't attempt to define "generalized topology", but in our case what pops out is the Nisnevich topology on $\mathrm{Spec}(\mathbb{Z})$, which acts in a manner much more similar to the usual topology on $\mathbb{C}$.
Remark 4:
In particular, one can again 'intersect all neighborhoods' of $(p)$ in this more robust Nisnevich topology, arriving at the Henselization of $\mathbb{Z}_{(p)}$, denoted $\mathbb{Z}_{(p)}^h$. One may roughly think of this as adding in the minimal number of things to $\mathbb{Z}_{(p)}$ so that Hensel's lemma works (thus the name!). The map
$$\{(p)\}=\mathrm{Spec}(\mathbb{F}_p)\to \mathrm{Spec}(\mathbb{Z}_{(p)}^h),$$
is like a (weak) homotopy equivalence, and so from a 'topological perspective' this does serve as a 'sufficiently small' neighborhood of $(p)$.
Remark 5:
One 'equational' way this manifests itself is that if $\mathbf{f}$ defines a 'smooth family at $p$' (i.e. $\mathrm{Spec}(\mathbb{Z}[x_i]/(f_j))\to \mathrm{Spec}(\mathbb{Z})$ is smooth at $p$), and so is topologically well-behaved, then for any $n\geqslant 1$ the map $$X_\mathbf{f}(\mathbb{Z}_{(p)}^h)\to X_\mathbf{f}(\mathbb{Z}/p^n\mathbb{Z})$$ (note that $\mathbb{Z}_{(p)}^h/p^n\mathbb{Z}^h_{(p)}=\mathbb{Z}_{(p)}/p^n\mathbb{Z}_{(p)}=\mathbb{Z}/p^n\mathbb{Z}$) is surjective. In other words, for topologically well-behaved families, the only obstruction to having solutions is coming from the point $\mathrm{Spec}(\mathbb{F}_p)$ itself.
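One cannot compute in $\mathbb{Z}_{(p)}^h$ directly in a few lines, but the finite-level shadow of this surjectivity is the classical Hensel lifting procedure. Here is a hedged Python sketch for a single polynomial in one variable (the function name and the example $x^2-2$ at $p=7$ are my own choices): a root modulo $p$ at which the derivative is a unit, which is the one-variable version of smoothness, lifts step by step to a root modulo $p^n$.

```python
def hensel_lift(f, df, root, p, n):
    """Lift a root of f mod p, with df(root) a unit mod p, to a root mod p**n.

    f and df are Python functions for an integer polynomial and its derivative.
    Uses pow(a, -1, m) for modular inverses (Python 3.8+).
    """
    if f(root) % p != 0 or df(root) % p == 0:
        raise ValueError("need f(root) = 0 and df(root) != 0 modulo p")
    x = root % p
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        # Newton step: the correction term is divisible by the previous modulus.
        x = (x - f(x) * pow(df(x), -1, modulus)) % modulus
    return x


f = lambda x: x * x - 2
df = lambda x: 2 * x

# 3 is a root of x^2 - 2 modulo 7, and f'(3) = 6 is a unit modulo 7.
lifted = hensel_lift(f, df, 3, 7, 5)
assert f(lifted) % 7**5 == 0 and lifted % 7 == 3
```

Each Newton step only changes $x$ by a multiple of the previous modulus, so the successive lifts are compatible under reduction.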
The ring $\mathbb{Z}_p$
Now, this Henselization step does not occur in the situation of $\mathbb{C}$, essentially for the reason that its topology is much richer. More precisely, for a point $p$ in $\mathbb{C}$, one has the ring
$$\mathbb{C}\langle x-p\rangle :=\varinjlim_{p\in U}\left\{\begin{matrix}\text{holomorphic functions}\\ U\to \mathbb{C}\end{matrix}\right\}$$
of power series $\sum_{i=0}^\infty a_i(x-p)^i$ which converge in a neighborhood of $p$. This is a Henselian local ring, with maximal ideal $(x-p)$! Again, we already knew the topology was rich enough to get 'sufficiently small' topological neighborhoods in this case, in the form of small open disks. But, there is one way in which these small open disks are not sufficient, and this will foretell how we will finally alter $\mathbb{Z}_{(p)}^h$.
To understand this, let us go back to (Diff). Solving a differential equation $Df=0$, where $f$ is an entire function on $\mathbb{C}$, can be studied by first trying to solve this equation locally at $p$. One way of interpreting this is as solving $Df=0$ for $f$ in $\mathbb{C}\langle x-p\rangle$. But, in practice, we quite often actually break this step further into two easier pieces:
The hallmark advantage of solving differential equations in $\mathbb{C}[\![x-p]\!]$ is that it has a 'formal solvability criterion': a solution exists if and only if one can find a polynomial solution which approximates it arbitrarily well. We can represent this equationally as
$$\{f\in\mathbb{C}[\![x-p]\!] : Df=0\}=\varprojlim_n \{g_n\in\mathbb{C}[x]_n: Dg_n=0\},$$
where $\mathbb{C}[x]_n$ is the set of polynomials of degree at most $n-1$. This reduces us to the purely algebraic study of 'finite-like' objects $\mathbb{C}[x]_n$. Again, the motto here is that while $\mathbb{C}\langle x-p\rangle$ has no 'topological obstructions' to reducing study to $\mathbb{C}[x]_n$, it does have analytic obstructions and $\mathbb{C}[\![x-p]\!]$ does away with those.
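To see the formal solvability criterion in action, here is a small Python sketch for the sample operator $Df=f'-f$ at $p=0$ with $f(0)=1$ (the operator, the normalisation, and the function names are assumptions of this example). It also makes explicit one reading of the condition $Dg_n=0$ that I am assuming here: for a polynomial of degree at most $n-1$ it should be understood modulo $x^{n-1}$, since the top coefficient of $Dg_n$ cannot vanish.

```python
from fractions import Fraction


def truncated_solution(n):
    """Coefficients [a_0, ..., a_{n-1}] of the truncation solving f' = f, f(0) = 1."""
    coeffs = [Fraction(1)]
    for i in range(n - 1):
        # Comparing coefficients of x^i in f' = f gives (i + 1) * a_{i+1} = a_i.
        coeffs.append(coeffs[i] / (i + 1))
    return coeffs


def apply_D(coeffs):
    """Coefficients of g' - g for g given by coeffs (lowest degree first)."""
    derivative = [(i + 1) * c for i, c in enumerate(coeffs[1:])] + [Fraction(0)]
    return [d - c for d, c in zip(derivative, coeffs)]


for n in range(1, 8):
    g_n = truncated_solution(n)
    # D g_n vanishes modulo x^(n-1) ...
    assert all(c == 0 for c in apply_D(g_n)[: n - 1])
    # ... and the truncations are compatible, so they define a formal power series.
    assert truncated_solution(n + 1)[:n] == g_n
```

The compatible system of truncations is exactly an element of the inverse limit, namely the formal power series $\sum_i x^i/i!$.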
To understand what the analogous picture is for $\mathbb{Z}_{(p)}^h$, we observe that $\mathbb{C}[x]_n$ is nothing but $\mathbb{C}\langle x-p\rangle /(x-p)^n$: the quotient of this Henselian local ring by the $n^\text{th}$ power of its maximal ideal. Moreover, essentially by definition,
$$\mathbb{C}[\![x-p]\!]=\varprojlim_n \mathbb{C}\langle x-p\rangle/(x-p)^n,$$
(mirroring what showed up in the formal solvability criterion). Thus, the correct analogue for $\mathbb{Z}_{(p)}^h$, with maximal ideal $(p)$, is
$$\mathbb{Z}_p=\varprojlim_n \mathbb{Z}_{(p)}^h/p^n\mathbb{Z}^h_{(p)}=\varprojlim_n \mathbb{Z}_{(p)}/p^n\mathbb{Z}_{(p)}=\varprojlim_n \mathbb{Z}/p^n\mathbb{Z}.$$
The $p$-adic integers have appeared.
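Concretely, an element of this inverse limit is a sequence of residues modulo $p, p^2, p^3,\ldots$ which are compatible under reduction. The Python sketch below (function names are my own) computes the image of an element of $\mathbb{Z}_{(p)}$ in the first few levels of the tower and checks compatibility; it uses `pow(d, -1, m)` for modular inverses, which needs Python 3.8 or later.

```python
from fractions import Fraction


def image_in_tower(x, p, depth):
    """Images of x in Z_(p) inside Z/p, Z/p^2, ..., Z/p^depth."""
    if x.denominator % p == 0:
        raise ValueError("x is not in Z_(p)")
    return [
        x.numerator * pow(x.denominator, -1, p**n) % p**n
        for n in range(1, depth + 1)
    ]


def is_compatible(seq, p):
    """Check that seq[k] in Z/p^(k+1) reduces to seq[k-1] in Z/p^k for all k >= 1."""
    return all(seq[k] % p**k == seq[k - 1] for k in range(1, len(seq)))


one_third = image_in_tower(Fraction(1, 3), 5, 3)
minus_one = image_in_tower(Fraction(-1), 5, 3)
assert is_compatible(one_third, 5) and is_compatible(minus_one, 5)
assert not is_compatible([2, 3], 5)  # 3 mod 5 is not 2
```

Of course most elements of $\mathbb{Z}_p$ do not come from $\mathbb{Z}_{(p)}$ in this way; the sketch only illustrates what a compatible sequence looks like.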
What is the analogue of the formal solvability criterion here? Well, it is simply that one has the equality
$$X_\mathbf{f}(\mathbb{Z}_p)=\varprojlim_n X_\mathbf{f}(\mathbb{Z}/p^n\mathbb{Z}),$$
something that fails with $\mathbb{Z}_p$ replaced by $\mathbb{Z}_{(p)}^h$, essentially for 'convergence' reasons. Thus, our 'formal solvability criterion' again reduces us to studying equations in the 'finite-like' rings $\mathbb{Z}/p^n\mathbb{Z}$.
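The inverse-limit description also makes obstructions visible at a finite level. In the hedged Python sketch below (brute force over $\mathbb{Z}/p^n\mathbb{Z}$, with hypothetical function names and sample polynomials of my choosing), both roots of $x^2-2$ modulo $7$ are reductions of roots modulo $7^3$, whereas the root $0$ of $x^2-7$ modulo $7$ does not even lift modulo $7^2$. In the second case the inverse limit, and hence $X_\mathbf{f}(\mathbb{Z}_7)$, is empty.

```python
def roots_mod(f, modulus):
    """All roots of the integer polynomial f in Z/modulus, by brute force."""
    return [x for x in range(modulus) if f(x) % modulus == 0]


def liftable_roots(f, p, n):
    """Roots of f mod p that are reductions of roots mod p**n."""
    return sorted({x % p for x in roots_mod(f, p**n)})


smooth = lambda x: x * x - 2      # derivative 2x is a unit at the roots mod 7
singular = lambda x: x * x - 7    # derivative vanishes at the root 0 mod 7

assert roots_mod(smooth, 7) == liftable_roots(smooth, 7, 3)
assert roots_mod(singular, 7) == [0]
assert liftable_roots(singular, 7, 2) == []
```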
Summary and original question
To summarize the above, we see that by using algebraic geometry to view Diophantine equations geometrically, the guiding principle given in (Sect) led us to
Hopefully this is satisfying, and convinces you why rings like $\mathbb{Z}_{(p)}$ and $\mathbb{Z}_p$ are ubiquitous: they capture the 'local geometry' of $\mathbb{Z}$ at $p$ (for different meanings of 'local').
I think four questions remain though:
Appendix: a description of $\mathbb{Z}_{(p)}^h$
A natural question you might have while reading the above: what does $\mathbb{Z}_{(p)}^h$ even look like?
Let me give a description of $R^h$ in a much more general setting. Suppose that $R$ is a local, Noetherian, excellent integral domain with maximal ideal $\mathfrak{m}$. This includes many examples, but in particular includes the case when $R$ is a DVR with fraction field of characteristic $0$ (e.g. $\mathbb{Z}_{(p)}$).
For an $R$-algebra $S$, let us define the algebraic closure of $R$ in $S$ to be those $s$ in $S$ which satisfy a non-zero polynomial $p(x)$ in $R[x]$ (NB: there is no assumption that $p(x)$ is monic!).
Proof: The map $R\to R^h$ is local, and in fact $\mathfrak{m}R^h$ is the maximal ideal of $R^h$ and the map $R/\mathfrak{m}^n \to R^h/\mathfrak{m}^n R^h$ is an isomorphism for all $n$. Thus, the natural map $\widehat{R}\to \widehat{R^h}$ is an isomorphism, and so $R^h\to \widehat{R^h}=\widehat{R}$ is faithfully flat, and so injective. We claim that the image of $R^h\to \widehat{R}$ consists precisely of those elements of $\widehat{R}$ which are algebraic over $R$. By definition, if $r$ is an element of $R^h$, then there exists a factorization $R\to S\to R^h$, with $R\to S$ an etale map of local rings, and with $r$ in the image of $S$. Note that $S$ is a domain (combine this and this), and as $R\to S$ is faithfully flat and so injective, the generic point of $S$ maps to that of $R$ under $\mathrm{Spec}(S)\to\mathrm{Spec}(R)$. So the extension $\mathrm{Frac}(S)/\mathrm{Frac}(R)$ is finite, and so $r$ is algebraic. Conversely, suppose that $r$ in $\widehat{R}$ is algebraic and let $p(x)\in R[x]$ be a non-zero polynomial annihilating $r$. Since $\widehat{R}$ is a domain, the polynomial can have at most $\deg(p)$ many roots in $\widehat{R}$. Write this finite set of roots as $r=x_1,x_2,\ldots,x_m$. There exists some $N\gg 0$ such that $r\ne x_i\bmod \mathfrak{m}^N \widehat{R}$ for $i\ne 1$. By Artin approximation there exists some $s$ in $R^h$ which is also a root of $p(x)$ and such that $s=r\bmod \mathfrak{m}^M\widehat{R}$ for some $M\geqslant N$. But $s$ is then one of the roots $x_i$, and by the choice of $N$ the only root congruent to $r$ modulo $\mathfrak{m}^N\widehat{R}$ is $r$ itself. So $s=r$, and $r$ lies in the image of $R^h$. $\blacksquare$
Remark 6: