12 Equations in \(\mathbb{Z}_p\) and Hensel’s lemma

As in the previous chapter, \(p\) is a prime. For \(x \in \mathbb{Z}_p,\) we’ll write \(\bar{x}\) for its reduction mod \(p,\) so \(\bar{x} \in \mathbb{Z}/p\mathbb{Z}.\)

12.1 Roots of polynomials

Let’s consider a monic polynomial \(f(X) = X^n + a_{n-1} X^{n-1} + \dots + a_0,\) where the \(a_i\) are in \(\mathbb{Z}_p\) (or just in \(\mathbb{Z},\) if you prefer). What can we say about its roots in \(\mathbb{Z}_p\)?

Obviously, if \(\alpha \in \mathbb{Z}_p\) is a root, then its reduction \(\bar{\alpha} \in \mathbb{Z}/p\mathbb{Z}\) is a root of the mod \(p\) polynomial \(\bar{f} = \sum \bar{a}_i X^i.\) Amazingly, this is (in a sense) all the information we need to understand solutions in \(\mathbb{Z}_p\) as well!

We need the following abstract algebraic warm-up:

Proposition 12.1

Let \(\mathbb{K}\) be a field, and let \(f = \sum a_i X^i\) be a monic polynomial with coefficients in \(\mathbb{K}.\) Suppose \(r \in \mathbb{K}\) satisfies \(f(r) = 0.\) Then the following are equivalent:

  • \(r\) is a simple root of \(f,\) i.e. we can write \(f(X) = (X - r)g(X)\) for some \(g\) with \(g(r) \ne 0\);

  • \(f'(r) \ne 0,\) where \(f'\) is defined purely formally as \(\sum i a_i X^{i-1}.\)

Proof. We can always write \(f(X) = (X - r) g(X)\) for some \(g\) (the question is whether \(g(r) = 0\) or not). But the product rule for derivatives holds for polynomials over any field, so \(f'(X) = g(X) + (X - r) g'(X),\) and setting \(X = r\) gives \(f'(r) = g(r).\)
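As a small illustration of the criterion, here is a sketch in Python that finds the roots of a polynomial over \(\mathbb{Z}/p\mathbb{Z}\) and sorts them into simple and multiple roots using the formal derivative. The function names and the convention that `coeffs[i]` is the coefficient of \(X^i\) are choices made for this example only.

```python
def poly_eval(coeffs, x, m):
    """Evaluate sum(coeffs[i] * x**i) modulo m using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % m
    return result

def formal_derivative(coeffs):
    """Coefficients of the formal derivative sum(i * a_i * X**(i-1))."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def classify_roots(coeffs, p):
    """Return (simple, multiple): the roots of the polynomial over Z/pZ,
    split according to whether the formal derivative vanishes there."""
    deriv = formal_derivative(coeffs)
    simple, multiple = [], []
    for r in range(p):
        if poly_eval(coeffs, r, p) == 0:
            if poly_eval(deriv, r, p) != 0:
                simple.append(r)
            else:
                multiple.append(r)
    return simple, multiple

# X^2 - 11 over Z/5Z: the roots 1 and 4 are both simple.
print(classify_roots([-11, 0, 1], 5))      # ([1, 4], [])
# (X - 1)^2 (X + 1) = X^3 - X^2 - X + 1 over Z/5Z: 1 is a double root.
print(classify_roots([1, -1, -1, 1], 5))   # ([4], [1])
```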

Theorem 12.2 • Hensel’s Lemma

Let \(f \in \mathbb{Z}_p[X]\) be a monic polynomial, and let \(r \in \mathbb{Z}/ p\mathbb{Z}\) be a simple root of \(\bar{f}.\) Then there exists a unique \(\alpha \in \mathbb{Z}_p\) such that \(f(\alpha) = 0\) and \(\bar{\alpha} = r.\)

Proof. We claim that for each \(n \geqslant 1,\) there exists a unique \(\alpha_n \in \mathbb{Z}/ p^n \mathbb{Z}\) such that \(f(\alpha_n) = 0\) and \(\alpha_n = r \bmod p\) (so in particular \(\alpha_1 = r\)). This clearly suffices to prove the theorem, since the uniqueness implies that \((\alpha_n)_{n \geqslant 1}\) is a compatible sequence defining an element of \(\mathbb{Z}_p.\)

To prove the claim, we induct on \(n.\) The claim is obvious for \(n = 1\); so let us suppose \(\alpha_n\) exists, for some \(n \geqslant 1,\) and use it to construct \(\alpha_{n+1}.\) Clearly, if it exists, it must be one of the \(p\) elements of \(\mathbb{Z}/ p^{n + 1}\) which reduce to \(\alpha_n \bmod {p^n}\) (otherwise this would contradict the uniqueness of \(\alpha_n\)).

Let \(\beta\) be an arbitrary lifting of \(\alpha_n\) to \(\mathbb{Z}/ p^{n + 1}\); then all the other liftings look like \(\beta + p^n \epsilon,\) where \(\epsilon\) varies over \(\{0, \dots, p-1\}.\) Moreover, \(f(\beta)\) is in \(\mathbb{Z}/ p^{n+1}\) and is zero mod \(p^n,\) so it can be written as \(p^n \mu\) for some \(\mu \in \{0, \dots, p-1\}.\)

For each \(i,\) the binomial theorem gives \[a_i (\beta + p^n \epsilon)^i = a_i (\beta^i + i \beta^{i - 1} p^n \epsilon + \dots)\] where the \((\dots)\) denote terms which are divisible by \(p^{2n},\) hence are zero mod \(p^{n + 1}.\) Thus \[f(\beta + p^n \epsilon) = f(\beta) + p^n \epsilon f'(\beta) = p^n (\mu + \epsilon f'(\beta)).\] However, since we are working mod \(p^{n + 1},\) the expression in the brackets only matters modulo \(p.\) As \(\beta = r \bmod p,\) we have \(f'(\beta) \bmod p = \bar{f}'(r) \ne 0.\) Thus there is a unique choice of \(\epsilon\) which makes the bracket zero mod \(p.\)
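The induction step can be turned into a short program. The following Python sketch follows the proof line by line: at each stage it takes a lifting \(\beta,\) computes \(\mu,\) and solves \(\mu + \epsilon f'(\beta) = 0 \bmod p\) for \(\epsilon.\) It assumes integer coefficients (with `coeffs[i]` the coefficient of \(X^i\)) and Python 3.8 or later for modular inverses via `pow`; the function names are invented for this example.

```python
def poly_eval(coeffs, x, m):
    """Evaluate sum(coeffs[i] * x**i) modulo m using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % m
    return result

def hensel_lift(coeffs, r, p, n):
    """Lift a simple root r of f mod p to the root of f mod p**n reducing to r."""
    deriv = [i * c for i, c in enumerate(coeffs)][1:]
    if poly_eval(coeffs, r, p) != 0 or poly_eval(deriv, r, p) == 0:
        raise ValueError("r must be a simple root of f mod p")
    alpha = r % p
    for k in range(1, n):
        # alpha is a root mod p**k; use it as the lifting beta to Z/p**(k+1).
        beta = alpha
        # f(beta) = p**k * mu mod p**(k+1), with mu in {0, ..., p-1}.
        mu = (poly_eval(coeffs, beta, p ** (k + 1)) // p ** k) % p
        # Unique eps with mu + eps * f'(beta) = 0 mod p.
        eps = (-mu * pow(poly_eval(deriv, beta, p), -1, p)) % p
        alpha = beta + p ** k * eps
    return alpha

# The root of X^2 - 11 modulo 5^3 that reduces to 1 modulo 5.
print(hensel_lift([-11, 0, 1], 1, 5, 3))   # 56
```

Here \(56^2 - 11 = 3125 = 5^5,\) so \(56\) is indeed a root modulo \(5^3.\)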

Example 12.3

Suppose \(p \ne 2,\) and \(a \in \mathbb{Z}_p\) is such that \(a \bmod p\) is a non-zero quadratic residue. Then \(f(X) = X^2 - a\) has a root modulo \(p,\) and this root \(r\) must satisfy \(\bar{f}'(r) = 2r \ne 0\); so Hensel’s lemma says it has a root in \(\mathbb{Z}_p.\) Thus a unit in \(\mathbb{Z}_p\) is a square if and only if its image in \(\mathbb{Z}/p\mathbb{Z}\) is a square, and similarly for \(n\)-th powers as long as \(p \nmid n.\) (This generalises Proposition 6.2).
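A finite-level shadow of this statement can be checked by brute force: modulo \(p^n,\) the squares of units should be exactly the units whose reduction mod \(p\) is a square. The sketch below (Python, with function names invented for this example) tests this for a couple of small cases; it is a sanity check, not a proof.

```python
def unit_squares(p, n):
    """Set of squares of units in Z/p**n Z."""
    m = p ** n
    return {x * x % m for x in range(m) if x % p != 0}

def units_square_mod_p(p, n):
    """Units of Z/p**n Z whose reduction mod p is a non-zero square."""
    residues = {x * x % p for x in range(1, p)}
    return {a for a in range(p ** n) if a % p in residues}

for p, n in [(5, 4), (7, 3)]:
    print(p, n, unit_squares(p, n) == units_square_mod_p(p, n))
```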

12.2 Explicitly constructing solutions

The proof of Hensel’s lemma is constructive – it gives us a recipe for constructing the solution modulo higher and higher powers of \(p.\) This can be made even more explicit, as follows.

Proposition 12.4 • Newton’s iteration

In the situation of Hensel’s lemma, choose some \(x_1 \in \mathbb{Q}\) whose denominator is coprime to \(p\) and such that \(x_1 = r \bmod p.\)

Consider the sequence defined for \(n \geqslant 1\) by \[x_{n + 1} = x_n - \frac{f(x_n)}{f'(x_n)}.\] Then we have \(x_m = x_n \bmod p^n\) for all \(m \geqslant n \geqslant 1,\) and \(x_n\) is a root of \(f \bmod p^n.\)

Proof. This is a rephrasing of the proof of Hensel’s lemma above.

Example 12.5

For example, take \(f(X) = X^2 - 11,\) and start with \(x_1 = 1,\) which is a root of \(f\) modulo \(5.\) Then we get a sequence of rational numbers (which are quite complicated, e.g. \(x_7 = 5190932463129656526839199303553/1565125026570585114734624993088\)); and these are tending in the 5-adic metric to a root of \(f\) in \(\mathbb{Z}_5.\)

(Amazingly, they’re also tending in \(\mathbb{R}\) to a root in \(\mathbb{R}\)! So the same rational-number sequence is calculating the square root of 11 in two different completions of \(\mathbb{Q}\) at once. However, it diverges horribly in the \(p\)-adic topology for \(p \ne 5.\))
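This example is easy to reproduce with exact rational arithmetic. The Python sketch below computes the first few iterates with `fractions.Fraction`, reduces each \(x_n\) modulo \(5^n\) and checks that it is a root of \(f\) there, and prints a floating-point approximation of the last iterate for comparison with the real square root. The helper names are invented for this example, and `pow(..., -1, m)` needs Python 3.8 or later.

```python
from fractions import Fraction

def newton_iterates(x1, count):
    """First `count` Newton iterates for f(X) = X^2 - 11, as exact rationals."""
    xs = [Fraction(x1)]
    while len(xs) < count:
        x = xs[-1]
        xs.append(x - (x * x - 11) / (2 * x))
    return xs

def reduce_mod(q, m):
    """Reduce a rational whose denominator is coprime to m to an element of Z/mZ."""
    return q.numerator * pow(q.denominator, -1, m) % m

xs = newton_iterates(1, 7)
for n, x in enumerate(xs, start=1):
    root = reduce_mod(x, 5 ** n)
    # Second printed value should be 0: x_n is a root of f modulo 5^n.
    print(n, root, (root * root - 11) % 5 ** n)

print(float(xs[-1]))  # real approximation to the square root of 11
```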

Remark 12.6

The convergence is actually much better than the theorem claims: rather than just getting one more correct \(5\)-adic digit of \(\sqrt{11}\) with each step, we actually double the number of correct digits (on average). But this takes a little more work to show.
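For the polynomial \(f(X) = X^2 - 11\) of Example 12.5 the doubling can be observed directly by computing the \(5\)-adic valuation of \(f(x_n)\) along the Newton sequence. The sketch below is in Python, and the `valuation` helper is written just for this illustration. For this particular \(f\) one has \(f(x_{n+1}) = f(x_n)^2 / (4 x_n^2),\) which is why the valuations double exactly here.

```python
from fractions import Fraction

def valuation(q, p):
    """p-adic valuation of a non-zero rational q."""
    v, num, den = 0, q.numerator, q.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

x = Fraction(1)
vals = []
for _ in range(5):
    vals.append(valuation(x * x - 11, 5))   # 5-adic valuation of f(x_n)
    x = x - (x * x - 11) / (2 * x)          # Newton step
print(vals)   # [1, 2, 4, 8, 16]
```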

There are various generalisations of Hensel’s lemma; for instance, we can deal with non-simple roots, as long as we start with a root modulo \(p^k,\) for some large enough \(k.\) We can also consider systems of polynomials in several variables, assuming that a solution exists mod \(p\) and a suitable matrix of partial derivatives mod \(p\) is non-singular.

12.3 \(p\)-adic logarithms and the structure of \(\mathbb{Z}_p^\times\)

In this section we’ll suppose \(p \ne 2,\) for simplicity.

Recall that in real analysis the logarithm function has a Taylor series expansion around 1, \[\log x = \sum_{n = 1}^\infty \frac{(-1)^{n-1} (x - 1)^n}{n},\] convergent for \(|x-1| < 1.\) Amazingly the same thing works in the \(p\)-adics:

Proposition 12.7

For all \(x \in \mathbb{Q}_p\) with \(|x-1|_p < 1,\) the sum \(\sum_{n = 1}^\infty \frac{(-1)^{n-1} (x - 1)^n}{n}\) converges in \(\mathbb{Q}_p.\)

Proof. For any \(n \geqslant 1\) we have \(|\tfrac{1}{n}|_p \leqslant n\) (exercise). So if \(|x-1|_p = r < 1,\) then \[\left|\frac{(-1)^{n-1} (x - 1)^n}{n}\right|_p \leqslant n r^n,\] which tends (rather rapidly) to 0. The nonarchimedean property (and completeness) of \(\mathbb{Q}_p\) implies that any sum whose terms tend to 0 is convergent.

This defines a function \(\log_p : U \to \mathbb{Q}_p,\) where \(U\) denotes the subgroup \(\{x : |x - 1|_p < 1\}\) of \(\mathbb{Q}_p^\times.\) One can check that \[\log_p(x y) = \log_p x + \log_p y \quad \forall x, y \in U,\] and moreover, \(\log_p\) is a bijection from \(U\) to \(p\mathbb{Z}_p\) (which is clearly isomorphic to \(\mathbb{Z}_p\) as an additive group). So we have shown:

Proposition 12.8

There is an isomorphism of abelian groups \(L : (U, \times) \cong (\mathbb{Z}_p, +).\)
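The logarithm series can be evaluated to any finite \(p\)-adic precision. The following Python sketch computes \(\log_p x \bmod p^N\) for an integer \(x = 1 \bmod p,\) with \(p\) odd, by summing the first \(2N\) terms exactly and then reducing. The cut-off \(2N\) is a choice made for this example: for \(n > 2N\) the \(n\)-th term has valuation at least \(n - \log_p n \geqslant n/2 > N.\) The function name is invented here, and `pow(..., -1, m)` requires Python 3.8 or later.

```python
from fractions import Fraction

def log_p_mod(x, p, N):
    """log_p(x) modulo p**N, for an integer x = 1 mod p and an odd prime p."""
    if x % p != 1:
        raise ValueError("x must be congruent to 1 mod p")
    total = Fraction(0)
    for n in range(1, 2 * N + 1):
        # n-th term of the series; Fraction cancels the powers of p in n.
        total += Fraction((-1) ** (n - 1) * (x - 1) ** n, n)
    m = p ** N
    return total.numerator * pow(total.denominator, -1, m) % m

p, N = 5, 6
lhs = log_p_mod(6 * 11, p, N)
rhs = (log_p_mod(6, p, N) + log_p_mod(11, p, N)) % p ** N
print(lhs == rhs)                  # log_p(xy) = log_p(x) + log_p(y) mod p^N
print(log_p_mod(6, p, N) % p)      # 0, since log_p takes values in p Z_p
```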

Corollary 12.9

The group \(U\) does not contain any nontrivial root of unity (i.e. any \(u\) with \(u \ne 1,\) but \(u^k = 1\) for some \(k > 1\)).

Proof. Suppose \(u \in U\) has \(u \ne 1\) but \(u^k = 1\); and let \(t = L(u) \in \mathbb{Z}_p.\) Since \(L\) converts multiplication to addition, we have \(k t = L(u^k) = L(1) = 0\); but since \(u \ne 1,\) we have \(t \ne 0,\) and also \(k \ne 0,\) since \(k > 1\) and \(\mathbb{Z}\) injects into \(\mathbb{Z}_p.\) So this contradicts the fact that \(\mathbb{Z}_p\) is an integral domain.

Theorem 12.10

The group of roots of unity in \(\mathbb{Z}_p\) is finite and cyclic of order \(p - 1\); and for each \(a \in (\mathbb{Z}/ p\mathbb{Z})^\times,\) there is a unique \((p-1)\)-st root of unity in \(\mathbb{Z}_p\) mapping to \(a \bmod p\) (the “Teichmüller lift” of \(a\)).

Proof. Firstly, the polynomial \(X^{p-1} - 1\) has \(p - 1\) distinct roots mod \(p\) – every element of \((\mathbb{Z}/ p\mathbb{Z})^\times\) is a root, and clearly these must be simple (since the degree is \(p-1,\) there is no room for a repeated root). So Hensel’s lemma says that there is a unique root in \(\mathbb{Z}_p\) lifting each of these.

On the other hand, suppose \(N\) is any integer \(\geqslant 1,\) and \(\zeta \in \mathbb{Z}_p^\times\) is a root of unity of order \(N.\) Then there is some \((p-1)\)-st root of unity \(\omega\) such that \(\zeta = \omega \bmod p.\) Thus \(\zeta / \omega = 1 \bmod p,\) and \(\zeta / \omega\) is a root of unity lying in \(U.\) From the last corollary, \(\zeta/\omega = 1,\) so \(\zeta\) is in fact a \((p-1)\)-st root of unity.

Definition 12.11

The map \(\tau : (\mathbb{Z}/p\mathbb{Z})^\times \to \mathbb{Z}_p^\times,\) sending \(a\) to the unique \((p-1)\)-st root of unity in \(\mathbb{Z}_p\) that reduces mod \(p\) to \(a,\) is actually a group homomorphism; it is called the Teichmüller character.
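To see the Teichmüller character concretely, one can compute \(\tau(a)\) modulo \(p^N\) by running Newton's iteration on \(X^{p-1} - 1,\) as in the proof of Theorem 12.10. The Python sketch below does this with all arithmetic modulo \(p^N\); the function name is invented for this example. As a cross-check it compares the result with \(a^{p^{N-1}} \bmod p^N,\) which is also a \((p-1)\)-st root of unity modulo \(p^N\) congruent to \(a\) mod \(p\) (this closed formula is not used in the text above).

```python
def teichmuller(a, p, N):
    """Teichmuller lift of a unit a mod p, computed modulo p**N."""
    if a % p == 0:
        raise ValueError("a must be a unit mod p")
    m = p ** N
    x = a % p
    for _ in range(N):  # each Newton step gains at least one p-adic digit
        fx = (pow(x, p - 1, m) - 1) % m
        dfx = (p - 1) * pow(x, p - 2, m) % m   # a unit mod p, so invertible
        x = (x - fx * pow(dfx, -1, m)) % m
    return x

p, N = 7, 5
m = p ** N
for a in range(1, p):
    t = teichmuller(a, p, N)
    # t^(p-1) = 1 mod p^N, t reduces to a, and t agrees with a^(p^(N-1)).
    print(a, t, pow(t, p - 1, m), t == pow(a, p ** (N - 1), m))
```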

Theorem 12.12

There is an isomorphism \[\mathbb{Z}_p^\times \cong \mathbb{Z}_p \times C_{p-1},\] sending \(x\) to \(\left(L(\tfrac{x}{\tau(\bar{x})}), \tau(\bar{x})\right),\) where \(\bar{x}\) is the reduction of \(x\) mod \(p.\)

Proof. We already know that \(\mathbb{Z}_p^\times\) has a subgroup (namely \(U\)) isomorphic to \(\mathbb{Z}_p,\) such that the quotient (namely \((\mathbb{Z}/ p\mathbb{Z})^\times\)) is cyclic of order \(p - 1.\) So to show the above isomorphism, it suffices to find a subgroup of \(\mathbb{Z}_p^\times\) mapping isomorphically to \((\mathbb{Z}/ p\mathbb{Z})^\times,\) and the Teichmüller lifting construction does exactly this.
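The splitting underlying this proof can be checked modulo \(p^N.\) The Python sketch below factors a unit \(x\) as \(\omega \cdot u,\) with \(\omega\) a \((p-1)\)-st root of unity and \(u = 1 \bmod p\); it stops short of applying \(L\) to \(u.\) The function name is invented for this example, and \(\omega\) is computed via the shortcut \(x^{p^{N-1}} \bmod p^N,\) which is an assumption of this sketch rather than the construction used in the text.

```python
def decompose(x, p, N):
    """Split a unit x mod p**N as omega * u, omega**(p-1) = 1 and u = 1 mod p."""
    if x % p == 0:
        raise ValueError("x must be a unit")
    m = p ** N
    omega = pow(x, p ** (N - 1), m)   # Teichmuller lift of x mod p, mod p**N
    u = x * pow(omega, -1, m) % m     # component in U = 1 + p Z_p
    return omega, u

p, N = 5, 4
m = p ** N
omega, u = decompose(13, p, N)
print(omega, u, omega * u % m)        # the last value is 13 again
```

The decomposition is multiplicative: the components of \(xy\) are the products of the components of \(x\) and \(y,\) which is what makes the map in the theorem a group homomorphism.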

12.4 Local-to-global principles

Now let’s suppose we’re trying to solve a polynomial equation in \(\mathbb{Q}\) (or a system of many equations in many variables). Clearly, if it has a solution in \(\mathbb{Q},\) then it has a solution in \(\mathbb{Q}_p\) for every \(p,\) and also in \(\mathbb{R}.\) Often this gives us a cheap way of ruling out solutions in \(\mathbb{Q}\): for instance, \(X^2 - 37 Y^2 = 0\) has no solutions in \(\mathbb{Q}_5\) except \(X = Y = 0\) (exercise!) so it has no non-trivial rational solutions either.
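One possible route into this exercise is to look modulo \(5\): a non-trivial solution in \(\mathbb{Q}_5\) can be rescaled so that \(X, Y \in \mathbb{Z}_5\) are not both divisible by \(5,\) and reducing mod \(5\) would then give a non-zero solution of \(X^2 = 37 Y^2\) in \(\mathbb{Z}/5\mathbb{Z}.\) The short Python check below confirms that no such solution exists, because \(37 = 2 \bmod 5\) is not a square.

```python
p = 5
squares = {x * x % p for x in range(p)}
print(sorted(squares), 37 % p)   # [0, 1, 4] 2

# All pairs (x, y) mod 5 with x^2 - 37 y^2 = 0 mod 5.
solutions = [(x, y) for x in range(p) for y in range(p)
             if (x * x - 37 * y * y) % p == 0]
print(solutions)                 # [(0, 0)]
```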

It would be nice if this were the only obstruction to the existence of rational solutions. Unfortunately, this doesn’t always work. For instance, Gouvêa gives the following exercise:

Exercise 12.13

Show that the equation \((X^2 - 2) (X^2 - 17) (X^2 - 34) = 0\) has roots in \(\mathbb{Q}_p\) for every \(p,\) and in \(\mathbb{R},\) but no roots in \(\mathbb{Q}.\)
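A numerical sanity check (not a solution of the exercise) is to verify, for small odd primes \(p,\) that at least one of \(2,\) \(17,\) \(34\) is a non-zero square mod \(p\); by Example 12.3 such a square lifts to a root in \(\mathbb{Z}_p.\) The Python sketch below does this using Euler's criterion. The bound \(200\) and the helper names are arbitrary choices for this example, and the prime \(p = 2\) is not covered and needs a separate argument.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def is_qr(a, p):
    """Euler's criterion: is a a non-zero square modulo the odd prime p?"""
    return pow(a, (p - 1) // 2, p) == 1

odd_primes = [p for p in range(3, 200) if is_prime(p)]
ok = all(any(is_qr(a, p) for a in (2, 17, 34)) for p in odd_primes)
print(ok)
```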

However, in some cases – for nice classes of equations – we can deduce solvability in \(\mathbb{Q}\) from solvability in the completions.

Theorem 12.14 • Hasse–Minkowski

Let \[F(X_1, \dots, X_n) = \sum_{i, j} c_{i, j} X_i X_j\] be a homogeneous quadratic polynomial in \(n\) variables with rational coefficients. Then there exist non-trivial solutions of \(F(X_1, \dots, X_n) = 0\) in \(\mathbb{Q}\) if and only if there exist non-trivial solutions in \(\mathbb{R},\) and in \(\mathbb{Q}_p\) for every \(p.\)

(For a proof see Serre’s book A Course in Arithmetic.)

This is an example of a local-to-global principle, showing that (under suitable hypotheses) we can recover information in \(\mathbb{Q}\) (a “global” field) from information about its completions (“local” fields). Investigating when local-global principles hold – and if not, whether one can quantify “how badly” they fail – is a hugely important theme in number theory research today.
