Why Are $p$-Adic Numbers Ubiquitous in Modern Number Theory?

algebraic-geometry algebraic-number-theory big-picture motivation p-adic-number-theory

I'm currently at a stage where I'm quite comfortable with the appearance of local non-archimedean fields in the maths I encounter, having seen a fair bit of technology built upon their structure and applications to connected areas. Yet I still have an unsatisfying understanding of why their introduction is absolutely crucial to the study of algebraic number theory and arithmetic geometry, especially since all these applications are, I think, hugely advanced compared to the point at which $p$-adic numbers are usually introduced in a student's career. If I want to motivate their definition, I naturally go through the following implications in my head:

  1. to study a problem over the ring of integers $\mathbb{Z}$ (or more generally, the ring of integers of a number field) the strategy is to localise the problem at a prime $p$, so as to focus on it and worry about the problem 'one point at a time', so to speak;
  2. we (literally, now) localise at the prime $p$, just as one would do with the ring of regular functions on a variety, and replace $\mathbb{Z}$ with the ring $\mathbb{Z}_{(p)} := \{ \frac{x}{y} \in \mathbb{Q} \mid p\nmid y \}$ ;
  3. we can then complete at the maximal ideal $(p) \subseteq \mathbb{Z}_{(p)}$ to obtain the $p$-adic integers $\mathbb{Z}_p$, with the benefit that now there's access to approximation techniques such as Hensel's lemma, and the newly obtained ring has pretty much all the same algebraic properties as $\mathbb{Z}_{(p)}$ because of its identical valuation theory.

It's this last step which still confuses me… even though I can appreciate the utility of approximation techniques, it still feels extremely arbitrary why it turns out to be so inevitable with all the theory that builds upon this measly little step. Somehow, every time I've seen the study of formal neighbourhoods in algebraic geometry it always feels like it's a tool to tackle a problem, and not the main object of study, whereas in my head the $p$-adic numbers have turned out to be the main character in many areas of mathematics, which sort-of goes against this intuition of mine I find.

Since this intuition really doesn't come from any of my teachers and I'm not quite sure it makes much sense, I wanted to ask if it's an apt way to think about the use of non-archimedean fields in mathematics; I'd also be very interested in learning about their role in the history of algebraic number theory since I'm not quite sure I can properly place their use and introduction on a timeline in a coherent way with all of the theory I have in mind.

I apologise if my question is very hand-wavy, and I'd be super grateful for any sort of insight 🙂 Thank you very much for your time!!

Best Answer

Let me explain how I think about one way in which $\mathbb{Q}_p$ (or more to the point $\mathbb{Z}_p$) appears from an arithmetic-geometric perspective. Let me approach this from a vantage point that should be understandable to any advanced undergraduate. In particular, let me take 'solving Diophantine equations' as the goal.

(For a parallel and more in-depth discussion, you can read this extended version of a talk I gave. For a perspective more in line with that discussed by the user hunter you can look at these old notes of mine [Disclaimer: these are quite old and so I can't vouch for their correctness, mathematically or philosophically].)


The general motivation

The basic idea stems from the following credo which one encounters ad nauseam in their mathematical education. For a 'geometric space' $X$

$$\left\{\text{Problems on }X\right\}=\left\{\left(\begin{matrix}\text{Problems locally}\\ \text{on }X\end{matrix} \,\,,\,\,\begin{matrix}\text{Gluing}\\ \text{data}\end{matrix}\right)\right\}.$$

This breaks questions on $X$ into two (hopefully) more manageable pieces.

Good examples of this are:

  • (Diff) if $X$ is some manifold and I want to solve the differential equation $Df=0$, I can try to find solutions to $Df=0$ locally on $X$, and then glue them together to a global solution,
  • (Sect) if $f\colon Y\to X$ is a map of 'geometric spaces' and I want to find a section $s\colon X\to Y$, I can try to find a section locally on $X$ and then glue these local sections together.

Remark 1:

The second example is really the representative one, and morally the notion of 'moduli spaces' should tell you that all examples should be of this form. For instance, it's a good exercise to think how one might put the first example in this form!

Of course, this idea really only works with the following serious caveat: the local 'geometry' of $X$ is simple(r) enough to make solving the problem locally on $X$ (more) tenable.

A geometric perspective of solving equations

So, how does this relate to solving Diophantine equations or, for that matter, your question?

The beautiful idea of Grothendieck and his collaborators (building off of the work of many others) shows that solving equations can, in fact, be put into the context of (Sect). Namely, there is associated to any ring $R$ a geometric object $\mathrm{Spec}(R)$, a so-called locally ringed space (the category of which we shall denote $\mathbf{LRS}$), such that

$$\mathbf{Ring}\to \mathbf{LRS},\qquad R\mapsto \mathrm{Spec}(R)$$

is a (contravariant) fully faithful embedding. This is a bit abstract, so before we explain how this relates to solving equations, let us spell this out a little more precisely.

The object $\mathrm{Spec}(R)$, being a locally ringed space, means it comes with two bits of structure:

  • an underlying topological space and,
  • a 'sheaf of functions' $\mathcal{O}$ which assigns to any open subset $U$ the ring $\mathcal{O}(U)$ of 'functions on $U$'.

The underlying topological space $\mathrm{Spec}(R)$ can be described very explicitly. As a set we take

$$\mathrm{Spec}(R)=\{\mathfrak{p}\subseteq R\text{ a prime ideal}\}.$$ The topology on $\mathrm{Spec}(R)$, called the Zariski topology, has (a basis of) open subsets given by the non-vanishing loci

$$D(f):=\{\mathfrak{p}\in\mathrm{Spec}(R):f\notin\mathfrak{p}\}$$ for an element $f$ of $R$. To help understand the intuition behind the terminology 'non-vanishing locus', it is helpful to think about the defining property of $\mathrm{Spec}(R)$: $R=\mathcal{O}(\mathrm{Spec}(R))$; in other words, the ring of functions on $\mathrm{Spec}(R)$ is $R$.

To understand this, let us first think of the example $R=\mathbb{C}[x]$. Here we have $$ \{(x-p):p\in\mathbb{C}\}\cup\{(0)\}= \mathrm{Spec}(\mathbb{C}[x])\supseteq \mathrm{MaxSpec}(\mathbb{C}[x])=\{(x-p):p\in\mathbb{C}\},$$ where for a ring $R$ we write $\mathrm{MaxSpec}(R)$ for the set of maximal ideals of $R$. Note then that any element $f$ in $\mathbb{C}[x]$ can be thought of as a function $$f\colon\mathrm{MaxSpec}(\mathbb{C}[x])\to \mathbb{C},\qquad (x-p)\mapsto (f\bmod (x-p))=f(p),$$ where this last identification is using the isomorphism $\mathbb{C}[x]/(x-p)\cong \mathbb{C}$.

Now for a general ring $R$ (especially those rings that will show up in our study of Diophantine equations), we don't have the luxury of their maximal ideals admitting such a geometric description. In fact, in general, we don't even have the luxury of having enough maximal ideals for $\mathrm{MaxSpec}(R)$ to be a particularly useful set (e.g. for a local ring there is only one!). That said, we may use the above example as a hint of how to think of an element $f$ of $R$ as a function on $\mathrm{Spec}(R)$: it is the assignment

$$f\colon \mathrm{Spec}(R)\longrightarrow \bigsqcup_{\mathfrak{p}\in\mathrm{Spec}(R)}k(\mathfrak{p}),\qquad \mathfrak{p}\mapsto f(\mathfrak{p}):=(f\bmod \mathfrak{p})\qquad (\ast).$$

Here $k(\mathfrak{p}):=\mathrm{Frac}(R/\mathfrak{p})$ is the 'residue field' of $R$ at $\mathfrak{p}$.

This might look quite bizarre, especially the codomain being a big disjoint union of different fields. But, the fact that this didn't occur in the case $R=\mathbb{C}[x]$ is semi-coincidental: for every $\mathfrak{p}$ in $\mathrm{MaxSpec}(\mathbb{C}[x])$ there is an isomorphism $k(\mathfrak{p})\cong\mathbb{C}$ and so we (sort of carelessly) identified them all with each other. As intimated before, such happy coincidences don't happen in general, and so we're stuck with the above. But, if you buy this interpretation, then the phrase 'non-vanishing locus' makes sense, as $D(f)$ is exactly those $\mathfrak{p}$ for which $f(\mathfrak{p})\ne 0$.
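
To make $(\ast)$ concrete for $R=\mathbb{Z}$, here is a minimal Python sketch. The function names and the cutoff `bound` are my own illustrative choices, not part of the discussion above. An integer $f$ takes the value $f\bmod p$ in $\mathbb{F}_p$ at the point $(p)$, and the closed points of $D(f)$ are the primes not dividing $f$ (the generic point $(0)$ is left out of the sketch).

```python
def primes_up_to(bound):
    """Return all primes <= bound by trial division (fine for small bounds)."""
    return [n for n in range(2, bound + 1)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]

def value_at(f, p):
    """The 'value' of the integer f at the point (p): its residue in F_p."""
    return f % p

def nonvanishing_locus(f, bound):
    """Closed points (p) with p <= bound lying in D(f), i.e. f(p) != 0."""
    return [p for p in primes_up_to(bound) if value_at(f, p) != 0]

# The function f = 30 vanishes exactly at the points (2), (3), (5).
print([(p, value_at(30, p)) for p in primes_up_to(13)])
print(nonvanishing_locus(30, 20))
```

Note that the values at different primes live in different fields, which is exactly the disjoint union in the codomain of $(\ast)$.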

But, I told you that a locally ringed space also came with a sheaf $\mathcal{O}$ of functions -- what is it for $\mathrm{Spec}(R)$? For formal reasons (because they form a basis) it suffices to describe the value of $\mathcal{O}(D(f))$, and the guess is now clear: the only new functions that should be introduced by considering the non-vanishing locus of $f$ is the function $f^{-1}$. Thus, one sets $\mathcal{O}(D(f))=R[\tfrac{1}{f}]$.

Then, at least in very broad terms, the promised (contravariant) fully faithful embedding is manifested as a natural bijection

$$\mathrm{Hom}_\mathbf{Ring}(R,S)=\mathrm{Hom}_\mathbf{LRS}(\mathrm{Spec}(S),\mathrm{Spec}(R)) \quad (\ast\ast).$$

This bijection can be made very explicit, but since I am not even telling you what a morphism of locally ringed spaces is, let's just take it as a black box.

Two important examples to keep in mind for the discussion below are:

  • the natural map $R\to R[\tfrac{1}{f}]$ corresponds to a map $\mathrm{Spec}(R[\tfrac{1}{f}])\to \mathrm{Spec}(R)$ which is an isomorphism onto the open subset $D(f)$ (an 'open immersion'),
  • for a prime $\mathfrak{p}$ the composition $R\to R/\mathfrak{p}\hookrightarrow k(\mathfrak{p})$ corresponds to a map $\mathrm{Spec}(k(\mathfrak{p}))\to \mathrm{Spec}(R)$ which one should think of as picking out the point $\mathfrak{p}$.

OK, fine, how does this help us solve equations? Well, let us fix an $m$-tuple of polynomials

$$\mathbf{f}=(f_1,\ldots,f_m)\in R[x_1,\ldots,x_n]^m.$$

We may consider the solution set functor for $\mathbf{f}$ over $R$:

$$X_\mathbf{f}\colon \{R\text{-algebras}\}\to \mathbf{Set},$$

given by

$$X_\mathbf{f}(S):=\{(y_1,\ldots,y_n)\in S^n: f_j(y_1,\ldots,y_n)=0\text{ for }j=1,\ldots,m\}.$$

In other words, $X_\mathbf{f}(S)$ just spits out the solution set to $\mathbf{f}=0$ over $S$. On the other hand, note that there is a natural bijection between $X_\mathbf{f}(S)$ and the $S$-algebra maps

$$\alpha\colon S[x_1,\ldots,x_n]/(f_1,\ldots,f_m)\to S$$

given by sending $\alpha$ to the $n$-tuple $(\alpha(x_1),\ldots,\alpha(x_n))$.
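
For finite rings such as $S=\mathbb{Z}/N\mathbb{Z}$ (with $R=\mathbb{Z}$) the set $X_\mathbf{f}(S)$ can simply be enumerated. The following is a rough brute-force sketch; the name `solution_set` and the example system (the 'circle' $x^2+y^2-1=0$) are hypothetical choices of mine for illustration.

```python
from itertools import product

def solution_set(polys, num_vars, modulus):
    """Brute-force X_f(Z/modulus): tuples y in (Z/modulus)^num_vars
    at which every polynomial in `polys` vanishes modulo `modulus`."""
    return [y for y in product(range(modulus), repeat=num_vars)
            if all(f(*y) % modulus == 0 for f in polys)]

# Hypothetical example system: the single equation x^2 + y^2 - 1 = 0.
circle = [lambda x, y: x * x + y * y - 1]

print(solution_set(circle, 2, 3))
print(len(solution_set(circle, 2, 5)))
```

Each tuple returned corresponds to the algebra map $\alpha$ sending $x_i$ to the $i$-th entry, which is the bijection described above.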

So, formally studying $(\ast\ast)$ shows the following beautiful fact: there is a natural bijection

$$X_\mathbf{f}(S)\longleftrightarrow \left\{\begin{matrix}\text{Sections of the map}\\ \mathrm{Spec}(S[x_i]/(f_j))\to \mathrm{Spec}(S)\end{matrix}\right\}.$$

If you buy the above, then the upshot is that by taking $R=S=\mathbb{Z}$ we can squarely (if surprisingly) place solving Diophantine equations in the setting of the archetypal problem (Sect) from above with $X=\mathrm{Spec}(\mathbb{Z})$.

The ring $\mathbb{Z}_{(p)}^h$

Before we get too excited though, we need to recall the aforementioned caveat: all of this formalism is useful if the local geometry of $\mathrm{Spec}(\mathbb{Z})$ is simple enough. Well...what is the local geometry of $\mathrm{Spec}(\mathbb{Z})$? Or, an even more pressing question...what does local even mean?

Well, there is a natural guess. After all $\mathrm{Spec}(\mathbb{Z})$ is a topological space, so 'local' should mean studying a 'sufficiently small' open neighborhood. Now, we know what the set $\mathrm{Spec}(\mathbb{Z})$ looks like:

$$\mathrm{Spec}(\mathbb{Z})=\{(p):p\text{ is a prime}\}\cup\{(0)\}.$$

A little thought even allows us to describe the Zariski open sets. Namely, the proper non-empty open subsets of $\mathrm{Spec}(\mathbb{Z})$ are of the form

$$\mathrm{Spec}(\mathbb{Z})-\{p_1,\ldots,p_n\}$$

for primes $p_1,\ldots,p_n$.

The Zariski topology is strange. It's quite feeble (coarse). These open subsets are all just too big to somehow serve as a 'sufficiently small' neighborhood of a point $(p)$. As an analogy, for the space $\mathbb{C}$ (with its usual topology) and a point $p\in\mathbb{C}$, we think of the 'sufficiently small' open neighborhoods of $p$ to be of the form $p\in\mathbb{D}\subseteq\mathbb{C}$ where $\mathbb{D}$ is an open disc. In particular, we very much do not think of the open subsets $\mathbb{C}-\{p_1,\ldots,p_m\}$ for points $p_1,\ldots,p_m\in\mathbb{C}$ as being small.

Again, let me emphasize, while $\mathrm{Spec}(\mathbb{Z})$ is a powerful object, its topology (the Zariski topology) is somewhat feeble(=coarse).

Remark 2:

One equational justification for why no neighborhood $D(f)$ of $(p)$ is sufficiently small: for any prime $q\ne p$ with $q\nmid f$, the equation $qx=1$ does NOT have a solution in $\mathbb{Z}[\tfrac{1}{f}]$. A 'sufficiently small' neighborhood of $(p)$ should not have equational obstructions coming from other points!
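
This obstruction is easy to test by machine. A nonzero integer $q$ is invertible in $\mathbb{Z}[\tfrac{1}{f}]$ exactly when $q$ divides some power of $f$, i.e. when every prime factor of $q$ divides $f$. Here is a small sketch of that check; the function name is my own and $f$ is assumed to be non-zero.

```python
from math import gcd

def is_unit_in_localization(q, f):
    """Decide whether the nonzero integer q is invertible in Z[1/f].

    q is a unit there exactly when q divides some power of f,
    i.e. when every prime factor of q also divides f (f nonzero)."""
    q = abs(q)
    g = gcd(q, f)
    while g > 1:
        # Strip from q every prime factor it shares with f.
        while q % g == 0:
            q //= g
        g = gcd(q, f)
    return q == 1

# (13) lies in D(30), but 7x = 1 is still unsolvable in Z[1/30].
print(is_unit_in_localization(7, 30))
print(is_unit_in_localization(12, 30))
```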

Well, if no one Zariski open neighborhood of $(p)$ is small enough, you can do something even more drastic: what if we intersect all the neighborhoods. In normal topology land this is not a very reasonable thing to do -- for a Hausdorff topological space you just get the point back! But, the upside of the feebleness of the Zariski topology of $\mathrm{Spec}(\mathbb{Z})$ is that this is a reasonable operation here. In fact, inspecting $(\ast\ast)$, and the fact that it reverses the direction of maps, we might guess that

$$\bigcap_{(p)\in D(f)}D(f)=\varprojlim_{(p)\in D(f)}D(f),$$

should be a geometric space with ring of functions

$$\varinjlim_{f\notin (p)}\mathbb{Z}[\tfrac{1}{f}]=\mathbb{Z}_{(p)}.$$

And so, in fact, a good model for the 'intersection of all neighborhoods of $(p)$' is $\mathrm{Spec}(\mathbb{Z}_{(p)})$. Maybe this is a "neighborhood" which is 'sufficiently small'.

Unfortunately, this is still not small enough (a true indictment of the Zariski topology). The reason for this is more subtle, and takes a bit more of a leap-of-faith in terms of geometric philosophy. Namely, if $U$ is to be a 'sufficiently small neighborhood' of $(p)$ then we would imagine that the map

$$\{(p)\}=\mathrm{Spec}(\mathbb{F}_p)\to \mathrm{Spec}(\mathbb{Z}_{(p)}),$$

should be something like a (weak) homotopy equivalence. One of the main reasons that a small open disk around $p$ in $\mathbb{C}$ is often 'sufficiently small' in complex analysis is because it's contractible. Unfortunately, this is not the case. In fact, in a way that one can make precise the map on fundamental groups isn't even an isomorphism.

Remark 3:

It takes a lot of faith that this topological wording can be made precise and...wait a minute...even if you can, what in the world does this have to do with solving Diophantine equations? Let me try to give an example which hopefully (mildly) addresses both of these points.

Let us consider the Diophantine equation $$\{y\in\mathbb{Z}:y^2+y+1=0\}.$$ From our above discussion we know that this should correspond to studying sections of the map $$\mathrm{Spec}(\mathbb{Z}[x]/(x^2+x+1))\to\mathrm{Spec}(\mathbb{Z}).$$ Moreover, what it should mean to study this 'locally at $(13)$', with $\mathbb{Z}_{(13)}$ as our meaning of local, is that we want to study sections of the map $$\mathrm{Spec}(\mathbb{Z}_{(13)}[x]/(x^2+x+1))\to\mathrm{Spec}(\mathbb{Z}_{(13)}).$$ That said, while this map does not have a section, it does have a section over the point $\mathrm{Spec}(\mathbb{F}_{13})$: the equation $$y^2+y+1=0\bmod 13$$ has two distinct solutions (namely $y=3$ and $y=9$).
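
Both halves of this claim can be checked in a few lines. The sketch below (with a helper name of my own choosing) finds the roots modulo $13$ by brute force, and computes the discriminant $b^2-4ac$, whose negativity rules out real roots and hence roots in $\mathbb{Z}_{(13)}\subseteq\mathbb{Q}$.

```python
def roots_mod(coeffs, modulus):
    """Roots in Z/modulus of the polynomial with integer coefficients
    `coeffs`, listed from the constant term upward."""
    def evaluate(y):
        return sum(c * y ** i for i, c in enumerate(coeffs)) % modulus
    return [y for y in range(modulus) if evaluate(y) == 0]

f = [1, 1, 1]                    # 1 + y + y^2
discriminant = 1 * 1 - 4 * 1 * 1   # b^2 - 4ac for y^2 + y + 1

print(roots_mod(f, 13))   # roots over the point Spec(F_13)
print(discriminant)       # negative, so no real (hence no rational) root
```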

Why is this indicative of the fact that $$\{(13)\}=\mathrm{Spec}(\mathbb{F}_{13})\to \mathrm{Spec}(\mathbb{Z}_{(13)})$$ is not a homotopy equivalence? Well, let us note that the map $$\mathrm{Spec}(\mathbb{Z}_{(13)}[x]/(x^2+x+1))\to\mathrm{Spec}(\mathbb{Z}_{(13)})$$ can/should be thought of as being smooth and proper: smooth because its derivative $2x+1$ vanishes nowhere on $\mathrm{Spec}(\mathbb{Z}_{(13)}[x]/(x^2+x+1))$, and proper because it's finite. That said, Ehresmann's theorem says that a smooth proper map is a locally trivial fibration. If $X'\to X$ is a (weak) homotopy equivalence and $Y\to X$ is a locally trivial fibration, then it should have a section if and only if $Y\times_X X'\to X'$ has a section. Thus, this indicates that $$\{(13)\}=\mathrm{Spec}(\mathbb{F}_{13})\to \mathrm{Spec}(\mathbb{Z}_{(13)})$$ is indeed not a homotopy equivalence.

As mentioned above this remark, for those in the know, what I am really pointing out here is that the map $$\pi_1^\mathrm{et}(\mathrm{Spec}(\mathbb{F}_p))\to\pi_1^\mathrm{et}(\mathrm{Spec}(\mathbb{Z}_{(p)}))$$ is not an isomorphism.

The idea of Grothendieck and his collaborators/descendants is the following: because the Zariski topology on $\mathrm{Spec}(\mathbb{Z})$ is too feeble, one should replace it with a sort of "generalized topology" where one can take 'sufficiently small neighborhoods' of points. I won't attempt to define "generalized topology", but in our case what pops out is the Nisnevich topology on $\mathrm{Spec}(\mathbb{Z})$, which acts in a manner much more similar to the usual topology on $\mathbb{C}$.

Remark 4:

For those more in the know, you might be surprised that what is showing up here is the Nisnevich topology, and not the more common etale topology. The reason is that unlike the case of $\mathbb{C}$, our points themselves have non-trivial topology. In fact, recall that $$\pi_1^\mathrm{et}(\mathrm{Spec}(\mathbb{F}_p))=\mathrm{Gal}(\overline{\mathbb{F}}_p/\mathbb{F}_p)\cong\widehat{\mathbb{Z}}.$$ The Nisnevich topology respects that, and allows you to zoom in on $(p)$ without altering the topology of the point. The etale neighborhood, on the other hand, pursues the logical conclusion by zooming in so far as to obtain a 'contractible' neighborhood, but at the cost of altering the topology of the point. Both are useful, but for our purposes here it is the Nisnevich topology playing the more crucial role.

In particular, one can again 'intersect all neighborhoods' of $(p)$ in this more robust Nisnevich topology, arriving at the Henselization of $\mathbb{Z}_{(p)}$, denoted $\mathbb{Z}_{(p)}^h$. One may roughly think of this as adding in the minimal number of things to $\mathbb{Z}_{(p)}$ so that Hensel's lemma works (thus the name!). The map

$$\{(p)\}=\mathrm{Spec}(\mathbb{F}_p)\to \mathrm{Spec}(\mathbb{Z}_{(p)}^h),$$

is like a (weak) homotopy equivalence, and so from a 'topological perspective' this does serve as a 'sufficiently small' neighborhood of $(p)$.

Remark 5:

My claim that $$\{(p)\}=\mathrm{Spec}(\mathbb{F}_p)\to \mathrm{Spec}(\mathbb{Z}_{(p)}^h),$$ is like a (weak) homotopy equivalence is meant to be mostly intuitive, and probably only should be taken seriously at the pro-finite level. That said, see this, this, and Theorem 2.1.6 of this for why $\mathrm{Spec}(A/I)$ and $\mathrm{Spec}(A)$ are quite 'topologically similar' for a Henselian pair $(A,I)$. It is also helpful to compare this with the strict Henselization $\mathbb{Z}_{(p)}^\mathrm{sh}$, the local ring in the etale topology, where the results are more clear-cut: see this and Proposition 8.6 of Friedlander's Etale homotopy of schemes.

One 'equational' way this manifests itself is that if $\mathbf{f}$ defines a 'smooth family at $p$' (i.e. that $\mathrm{Spec}(\mathbb{Z}[x_i]/(f_j))\to \mathrm{Spec}(\mathbb{Z})$ is smooth at $p$), and so topologically well-behaved, then for any $n\geqslant 1$ the map $$X_\mathbf{f}(\mathbb{Z}_{(p)}^h)\to X_\mathbf{f}(\mathbb{Z}/p^n\mathbb{Z}),$$ (note that $\mathbb{Z}_{(p)}^h/p^n\mathbb{Z}^h_{(p)}=\mathbb{Z}_{(p)}/p^n\mathbb{Z}_{(p)}=\mathbb{Z}/p^n\mathbb{Z}$ by this) is surjective. In other words, for topologically well-behaved families, the only obstruction to having solutions is coming from the point $\mathrm{Spec}(\mathbb{F}_p)$ itself.
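
In the simplest case (one polynomial in one variable, with a simple root modulo $p$) the mechanism behind this is the Newton iteration of Hensel's lemma. The following is only a sketch under that assumption: it does not construct $\mathbb{Z}_{(p)}^h$, it just computes the image of the lifted root in $\mathbb{Z}/p^n\mathbb{Z}$. The function name is mine, and `pow(x, -1, m)` for modular inverses needs Python 3.8 or later.

```python
def hensel_lift(f, df, root, p, n):
    """Lift a simple root of f modulo p to a root modulo p**n.

    f, df: callables for the polynomial and its derivative.
    Requires f(root) = 0 mod p and df(root) != 0 mod p (smoothness)."""
    if f(root) % p != 0 or df(root) % p == 0:
        raise ValueError("need a simple root modulo p")
    modulus = p
    for _ in range(n - 1):
        modulus *= p
        # Newton step; df(root) is a unit modulo p, hence modulo p**k.
        root = (root - f(root) * pow(df(root), -1, modulus)) % modulus
    return root

f = lambda y: y * y + y + 1
df = lambda y: 2 * y + 1

lifted = hensel_lift(f, df, 3, 13, 4)
print(lifted, f(lifted) % 13 ** 4)
```

Each step improves a root modulo $p^k$ to a root modulo $p^{k+1}$ without changing its residue modulo $p$, so the starting point over $\mathbb{F}_p$ is the only input needed.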

The ring $\mathbb{Z}_p$

Now, this Henselization step does not occur in the situation of $\mathbb{C}$, essentially for the reason that its topology is much richer. More precisely, for a point $p$ in $\mathbb{C}$, one has the ring

$$\mathbb{C}\langle x-p\rangle :=\varinjlim_{p\in U}\left\{\begin{matrix}\text{holomorphic functions}\\ U\to \mathbb{C}\end{matrix}\right\}$$

of power series $\sum_{i=0}^\infty a_i(x-p)^i$ which converge in a neighborhood of $p$. This is a Henselian local ring, with maximal ideal $(x-p)$! Again, we already knew the topology was rich enough to get 'sufficiently small' topological neighborhoods in this case, in the form of small open disks. But, there is one way in which these small open disks are not sufficient, and this will foretell how we will finally alter $\mathbb{Z}_{(p)}^h$.

To understand this, let us go back to (Diff). Solving a differential equation $Df=0$ where $f$ is an entire function on $\mathbb{C}$, can be studied by first trying to solve this equation locally at $p$. One way of interpreting this is as solving $Df=0$ for $f$ in $\mathbb{C}\langle x-p\rangle$. But, in practice, we quite often actually break this step further into two easier pieces:

  • first solve $Df=0$ for $f\in\mathbb{C}[\![x-p]\!]$ (i.e. arbitrary power series at $p$ with no convergence conditions),
  • then show that this solution has reasonable convergence properties (i.e. that the solution can be taken to lie in $\mathbb{C}\langle x-p\rangle$).

The hallmark advantage of solving differential equations in $\mathbb{C}[\![x-p]\!]$ is that it has a 'formal solvability criterion': a solution exists if and only if one can find a polynomial solution which approximates it arbitrarily well. We can represent this equationally as

$$\{f\in\mathbb{C}[\![x-p]\!] : Df=0\}=\varprojlim_n \{g_n\in\mathbb{C}[x]_n: Dg_n=0\},$$

where $\mathbb{C}[x]_n$ is the set of polynomials of degree at most $n-1$. This reduces us to the purely algebraic study of 'finite-like' objects $\mathbb{C}[x]_n$. Again, the motto here is that while $\mathbb{C}\langle x-p\rangle$ has no 'topological obstructions' to reducing study to $\mathbb{C}[x]_n$, it does have analytic obstructions and $\mathbb{C}[\![x-p]\!]$ does away with those.
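
As a toy instance, take the equation $f'=f$ (so $D=\tfrac{d}{dx}-1$; this particular $D$ is my choice for illustration). Comparing coefficients of $(x-p)^k$ determines a formal solution one coefficient at a time, and the truncations $g_n$ are automatically compatible, each solving the equation up to its truncation order. A minimal sketch with exact rational arithmetic:

```python
from fractions import Fraction

def truncated_solution(n, a0=1):
    """Coefficients of the degree < n truncation g_n of the formal
    solution to f' = f with f(p) = a0, in powers of (x - p)."""
    coeffs = [Fraction(a0)]
    for k in range(n - 1):
        # Comparing coefficients of (x - p)^k in f' = f gives
        # (k + 1) * a_{k+1} = a_k.
        coeffs.append(coeffs[-1] / (k + 1))
    return coeffs

g4 = truncated_solution(4)
g5 = truncated_solution(5)
print(g4)
print(g5[:4] == g4)   # g_5 reduces to g_4: the truncations are compatible
```

No convergence question is ever asked here; that is the separate second step described above.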

To understand what the analogous picture is for $\mathbb{Z}_{(p)}^h$, we observe that $\mathbb{C}[x]_n$ is nothing but $\mathbb{C}\langle x-p\rangle /(x-p)^n$: the quotient of this Henselian local ring by the $n^\text{th}$ power of its maximal ideal. Moreover, essentially by definition,

$$\mathbb{C}[\![x-p]\!]=\varprojlim_n \mathbb{C}\langle x-p\rangle/(x-p)^n,$$

mirroring what showed up in the formal solvability criterion. Thus, the correct analogue for $\mathbb{Z}_{(p)}^h$, with maximal ideal $(p)$, is

$$\mathbb{Z}_p=\varprojlim_n \mathbb{Z}_{(p)}^h/p^n\mathbb{Z}^h_{(p)}=\varprojlim_n \mathbb{Z}_{(p)}/p^n\mathbb{Z}_{(p)}=\varprojlim_n \mathbb{Z}/p^n\mathbb{Z}.$$

The $p$-adic integers have appeared.
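
Concretely, an element of this inverse limit is a sequence of residues $a_n\in\mathbb{Z}/p^n\mathbb{Z}$ with $a_{n+1}\equiv a_n \bmod p^n$. As a small sketch (function names mine), here is the compatible sequence attached to a fraction $a/b\in\mathbb{Z}_{(p)}$, i.e. the image of $\mathbb{Z}_{(p)}\to\mathbb{Z}_p$ truncated at a finite depth.

```python
def residues(a, b, p, depth):
    """Image of a/b (with p not dividing b) in Z/p^n for n = 1..depth."""
    if b % p == 0:
        raise ValueError("a/b is not in Z_(p)")
    return [(a * pow(b, -1, p ** n)) % p ** n for n in range(1, depth + 1)]

def is_compatible(seq, p):
    """Check the inverse-limit condition: seq[n] (in Z/p^(n+1))
    reduces to seq[n-1] (in Z/p^n)."""
    return all(seq[n] % p ** n == seq[n - 1] for n in range(1, len(seq)))

minus_one = residues(-1, 1, 5, 3)
one_third = residues(1, 3, 5, 3)
print(minus_one, is_compatible(minus_one, 5))
print(one_third, is_compatible(one_third, 5))
```

Of course most elements of $\mathbb{Z}_p$ do not arise from fractions; the point is only to see what the compatibility condition looks like.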

What is the analogue of the formal solvability criterion here? Well, it is simply that one has the equality

$$X_\mathbf{f}(\mathbb{Z}_p)=\varprojlim_n X_\mathbf{f}(\mathbb{Z}/p^n\mathbb{Z}),$$

something that fails with $\mathbb{Z}_p$ replaced by $\mathbb{Z}_{(p)}^h$, essentially for 'convergence' reasons. Thus, our 'formal solvability criterion' again reduces us to studying equations in the 'finite-like' rings $\mathbb{Z}/p^n\mathbb{Z}$.
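
To get a feel for the right-hand side, one can compute the first few levels of the tower by brute force and see which solutions modulo $p$ survive. This sketch (names mine, one variable only, and a finite tower only approximates the inverse limit) contrasts the smooth example $x^2+x+1$ at $p=13$ with the non-smooth $x^2-13$, whose solution modulo $13$ does not lift even to $\mathbb{Z}/13^2\mathbb{Z}$.

```python
def solution_tower(f, p, depth):
    """Solution sets of f(x) = 0 in Z/p^n for n = 1..depth (brute force)."""
    return [[x for x in range(p ** n) if f(x) % p ** n == 0]
            for n in range(1, depth + 1)]

def surviving_solutions(tower, p):
    """Solutions mod p that are reductions of a solution at the top
    level of the (finite) tower, hence at every level below it."""
    return sorted({x % p for x in tower[-1]})

smooth = solution_tower(lambda x: x * x + x + 1, 13, 2)
singular = solution_tower(lambda x: x * x - 13, 13, 2)

print(smooth, surviving_solutions(smooth, 13))
print(singular, surviving_solutions(singular, 13))
```

So $x^2=13$ has a solution over the point $\mathrm{Spec}(\mathbb{F}_{13})$ but the inverse limit, and hence $X_\mathbf{f}(\mathbb{Z}_{13})$, is empty.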

Summary and original question

To summarize the above, we see that by using algebraic geometry to view Diophantine equations geometrically, the guiding principle given in (Sect) led us to

  1. consider $\mathbb{Z}_{(p)}$ as a possible 'sufficiently small neighborhood' of $(p)$, and realize it's woefully 'too big',
  2. consider the Henselization $\mathbb{Z}_{(p)}^h$ to eliminate 'topological obstructions' (not coming from $\mathrm{Spec}(\mathbb{F}_p)$ itself) to finding solutions of equations, and realize that it does that job quite well,
  3. realize that 'analytic' obstructions remain, so that we can further replace it with $\mathbb{Z}_p$, which is a more 'formal' object (analogous to replacing convergent power series with all power series).

Hopefully this is satisfying, and convinces you why rings like $\mathbb{Z}_{(p)}$ and $\mathbb{Z}_p$ are ubiquitous: they capture the 'local geometry' at $p$ of $\mathbb{Z}$ (for different meanings of 'local').

I think four questions remain though:

  1. Can one actually work backward from $\mathbb{Z}_p$ to $\mathbb{Z}_{(p)}^h$? The answer, astoundingly, is almost always yes. This is the theory of Artin approximation.
  2. Can one actually work backward from $\mathbb{Z}_{(p)}^h$ to an honest Nisnevich neighborhood? The answer again is essentially yes -- see this.
  3. OK, but can you finally profitably glue these solutions together to an actual Diophantine solution? This is the million dollar question, and the hardest. The answer is a resounding yes...sometimes. As the gluing data can be quite abstract/complicated, it's hard to enact this in practice (although it theoretically always works). Instead people often consider 'local solutions' with some sort of 'weakened gluing data' (sometimes no gluing data at all!). These usually go under the heading of 'local-to-global principles', and generally have reasonable success.
  4. Why use $\mathbb{Z}_p$ instead of $\mathbb{Z}_{(p)}^h$? This is a fair question. Hopefully the above gives you some sort of idea of what the advantages of $\mathbb{Z}_p$ are, but from a more literal gluing perspective $\mathbb{Z}_{(p)}^h$ is superior. I can imagine a world, not so far from our own (maybe a future world?) where we do use $\mathbb{Z}_{(p)}^h$ instead. In fact, I think that some of the work of Fujiwara and Kato (on 'Henselian rigid geometry') can be seen to point in that direction.

Appendix: a description of $\mathbb{Z}_{(p)}^h$

A natural question you might have while reading the above: what does $\mathbb{Z}_{(p)}^h$ even look like?

Let me give a description of $R^h$ in a much more general setting. Suppose that $R$ is a local, Noetherian, excellent integral domain with maximal ideal $\mathfrak{m}$. This includes many examples, but in particular includes the case when $R$ is a DVR with fraction field of characteristic $0$ (e.g. $\mathbb{Z}_{(p)}$).

For an $R$-algebra $S$, let us define the algebraic closure of $R$ in $S$ to be those $s$ in $S$ which satisfy a non-zero polynomial $p(x)$ in $R[x]$ (NB: there is no assumption that $p(x)$ is monic!).

Fact: The Henselization of $R$ is the algebraic closure of $R$ in $\widehat{R}$.

Proof: The map $R\to R^h$ is local, and in fact $\mathfrak{m}R^h$ is the maximal ideal of $R^h$ and the map $R/\mathfrak{m}^nR \to R^h/\mathfrak{m}^n R^h$ is an isomorphism for all $n$. Thus, the natural map $\widehat{R}\to \widehat{R^h}$ is an isomorphism, and so $R^h\to \widehat{R^h}=\widehat{R}$ is faithfully flat, and so injective. We claim that the image of $R^h\to \widehat{R}$ consists precisely of those elements of $\widehat{R}$ which are algebraic over $R$. By definition, if $r$ is an element of $R^h$, then there exists a factorization $R\to S\to R^h$, with $R\to S$ an etale map of local rings, and with $r$ in the image of $S$. Note that $S$ is a domain (combine this and this), and as $R\to S$ is faithfully flat and so injective, the generic point of $S$ maps to that of $R$ under $\mathrm{Spec}(S)\to\mathrm{Spec}(R)$. So the extension $\mathrm{Frac}(S)/\mathrm{Frac}(R)$ is finite, and so $r$ is algebraic. Conversely, suppose that $r$ in $\widehat{R}$ is algebraic and let $p(x)\in R[x]$ be a non-zero polynomial annihilating $r$. Since $\widehat{R}$ is a domain, the polynomial can have at most $\deg(p)$ many roots in $\widehat{R}$. Write this finite set of roots as $r=x_1,x_2,\ldots,x_m$. There exists some $N\gg 0$ such that $r\ne x_i\bmod \mathfrak{m}^N \widehat{R}$ for $i\ne 1$. By Artin approximation there exists some $s$ in $R^h$ which is also a root of $p(x)$ and such that $s=r\bmod \mathfrak{m}^M\widehat{R}$ for some $M\geqslant N$. But, by set up, we then have that $s=r$. $\blacksquare$

Remark 6:

Let us give two applications of this.

First, it lets us give a simple example of an element of $\mathbb{Z}_p$ not in $\mathbb{Z}_{(p)}^h$. In particular, the element $\sum_{i=0}^\infty p^{i!}$. It suffices to show that it's not algebraic over $\mathbb{Z}_{(p)}$ and this is fairly elementary (e.g. see An elementary example of a transcendental $p$-adic number by Suter).

Second, it lets us observe that although $\mathbb{C}\langle x-p\rangle$ is Henselian, contains $\mathbb{C}[x]_{(x-p)}$, and has the same completion, it is not $\mathbb{C}[x]_{(x-p)}^h$. Indeed, it suffices to observe that $\mathbb{C}\langle x-p\rangle$ contains $\exp(x)$ which is not algebraic over $\mathbb{C}[x]_{(x-p)}$.