Todd Trimble
Lagrange four squares theorem

(Should return to this later, but the clean conceptual route is of course to work with Hurwitz quaternions, the $\mathbb{Z}$-span of $1, h, i, j, k$ where $h = \frac{1}{2}(1 + i + j + k)$.)

The four squares theorem says that every natural number can be expressed as a sum of four integer squares. For this it is useful to recall that for quaternions $\alpha = a + b i + c j + d k$, the central elements are real and are precisely the fixed points of the anti-involution $\alpha \mapsto \widebar{\alpha} = a - b i - c j - d k$. It follows quickly that the norm map $\alpha \mapsto N(\alpha) = \alpha \widebar{\alpha} = a^2 + b^2 + c^2 + d^2$ is multiplicative: we have

$$N(\alpha \beta) = N(\alpha) N(\beta)$$

for all quaternions $\alpha, \beta$.
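The multiplicativity of the norm can be checked numerically on integer quaternions. The following is a minimal sketch, not part of the original argument: the helper names `qmul` and `norm` and the tuple encoding `(a, b, c, d)` for $a + b i + c j + d k$ are assumptions made for illustration.

```python
def qmul(p, q):
    """Hamilton product of quaternions encoded as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def norm(q):
    """N(q) = a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in q)


# Two sample quaternions (arbitrary choices for this sketch).
alpha = (1, 2, 3, 4)
beta = (2, -1, 0, 5)
product = qmul(alpha, beta)

# The components of the product give a four-square representation
# of N(alpha) * N(beta).
assert norm(product) == norm(alpha) * norm(beta)
```

Read componentwise, the assertion is an instance of Euler's four-square identity: the product of two sums of four squares is again a sum of four squares.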

This shows that sums of four integral squares are closed under multiplication, which reduces the four squares theorem to the statement that every prime $p$ is a sum of four squares. For $p = 2$ we have $2 = 1^2 + 1^2 + 0^2 + 0^2$.


Lemma. If a finite field $F$ has an odd number of elements $q$, then every element of $F$ is a sum of two squares. (The conclusion holds also in characteristic $2$, but we don't need this.)


Proof. Suppose some non-square $a$ is not a sum of two squares. Then the same is true of every non-square $b$ (since $a/b$ must be a square, using cyclicity of the group $F^\times$), in which case it follows that the squares are closed under addition and form a proper subfield $E$, with $\frac{q+1}{2}$ elements. If $\dim_E(F) = d$, then $q = \left(\frac{q+1}{2}\right)^d$, which is impossible: $d = 1$ would give $q = 1$, while for $d \geq 2$ we have $\left(\frac{q+1}{2}\right)^d \geq \left(\frac{q+1}{2}\right)^2 \gt q$.
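As a sanity check on the statement, one can enumerate sums of two squares in small prime fields. This sketch covers only the fields $\mathbb{Z}/(p)$, not general finite fields of odd order, and the function name `two_square_sums_mod` is a hypothetical helper introduced here.

```python
def two_square_sums_mod(p):
    """Return the set of residues mod p expressible as x^2 + y^2."""
    squares = {(x * x) % p for x in range(p)}
    return {(s + t) % p for s in squares for t in squares}


# For each of these odd primes, every residue class is a sum of two squares.
for p in (3, 5, 7, 11, 13, 101):
    assert two_square_sums_mod(p) == set(range(p))
```

A brute-force check like this illustrates the claim for a few primes; it is of course no substitute for the counting argument.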


Lemma. If $p$ is an odd prime, then there is some $m$ with $0 \lt m \lt p$ such that $m p$ is a sum of four squares.


Proof. We just saw that $-1$ is a sum of two squares in $F = \mathbb{Z}/(p)$, say $-1 \equiv u^2 + v^2$ where $u, v$ are represented in the range $-\frac{p-1}{2}, \ldots, 0, \ldots, \frac{p-1}{2}$. Then $p$ divides $u^2 + v^2 + 1^2 + 0^2 \leq 2\left(\frac{p-1}{2}\right)^2 + 1 \lt \frac{p^2}{2}$. It follows that some $m p$ with $0 \lt m \lt \frac{p}{2}$ is a sum of four squares.
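The existence argument can be turned into a small search. The sketch below assumes $p$ is an odd prime and looks for $u, v$ with $0 \leq u \leq v \leq \frac{p-1}{2}$; restricting to non-negative values loses nothing because only the squares matter. The name `small_multiple` is hypothetical.

```python
def small_multiple(p):
    """For an odd prime p, return (m, (u, v, 1, 0)) with u^2 + v^2 + 1 == m * p."""
    half = (p - 1) // 2
    for u in range(half + 1):
        for v in range(u, half + 1):
            total = u * u + v * v + 1
            if total % p == 0:
                return total // p, (u, v, 1, 0)
    raise ValueError("no representation found; is p an odd prime?")


# Check the bound 0 < m < p/2 from the proof on a few primes.
for p in (3, 5, 7, 11, 13, 23, 101):
    m, rep = small_multiple(p)
    assert sum(t * t for t in rep) == m * p
    assert 0 < 2 * m < p
```

The search is quadratic in $p$, which is fine for illustration but not a serious algorithm.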


Lemma. If $2 n$ is a sum of four squares, then so is $n$.


Proof. Writing $2 n = a^2 + b^2 + c^2 + d^2$, we have $a + b + c + d \equiv 0 \; \mathrm{mod}\; 2$, so we may assume WLOG that $a, b$ have the same parity, as then do $c, d$. Then $n = \left(\frac{a+b}{2}\right)^2 + \left(\frac{a-b}{2}\right)^2 + \left(\frac{c+d}{2}\right)^2 + \left(\frac{c-d}{2}\right)^2$.
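The halving step is easy to express as a function. In this sketch the name `halve` is an assumption, and the "WLOG" reordering is realised by sorting the four entries by parity, so that the first two and the last two each share a parity.

```python
def halve(rep):
    """Given (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = 2n, return four
    integers whose squares sum to n."""
    total = sum(t * t for t in rep)
    if total % 2 != 0:
        raise ValueError("sum of squares must be even")
    # An even sum of squares has an even number of odd entries, so after
    # sorting by parity, a and b have the same parity and so do c and d.
    a, b, c, d = sorted(rep, key=lambda t: t % 2)
    return ((a + b) // 2, (a - b) // 2, (c + d) // 2, (c - d) // 2)


# 14 = 3^2 + 2^2 + 1^2 + 0^2, so the result should represent 7.
assert sum(t * t for t in halve((3, 2, 1, 0))) == 7
```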


Theorem. Every odd prime $p$ is the sum of four squares.


Proof. Take $m \in \{1, \ldots, p-1\}$ to be the least element such that $m p$ is of the form $m p = a^2 + b^2 + c^2 + d^2$ (such an $m$ exists by the lemma that some multiple $m p$ with $0 \lt m \lt p$ is a sum of four squares). The claim is that $m = 1$; since $p$ is prime and $m \lt p$, it suffices to show $m | p$. By the lemma that $n$ is a sum of four squares whenever $2 n$ is, together with minimality, $m$ is odd. Write $\beta = a + b i + c j + d k$, so $N(\beta) = m p$. Take $w, x, y, z \in \{-\frac{m-1}{2}, \ldots, 0, \ldots, \frac{m-1}{2}\}$ so that

$$w + x i + y j + z k \equiv \widebar{\beta} \; \mathrm{mod}\; m.$$

Put $\alpha = w + x i + y j + z k$. Then $\alpha \beta \equiv \widebar{\beta}\beta \; \mathrm{mod}\; m$, and $\widebar{\beta} \beta = N(\beta) = m p$, so $\alpha \beta \equiv 0\; \mathrm{mod}\; m$. Thus we may write $\alpha \beta$ in the form $\alpha \beta = m(r + s i + t j + u k)$.

We have $0 \leq N(\alpha) \leq 4\left(\frac{m-1}{2}\right)^2 = (m-1)^2$, and $N(\alpha) \equiv N(\beta) \equiv 0\; \mathrm{mod}\; m$, so there is $n$ with $0 \leq n \lt m$ such that $N(\alpha) = m n$. Thus $m^2(r^2 + s^2 + t^2 + u^2) = N(\alpha \beta) = N(\alpha) N(\beta) = m^2 n p$, whence $r^2 + s^2 + t^2 + u^2 = n p$. Since $n \lt m$, this forces $n = 0$ by minimality of $m$. So $N(\alpha) = m n = 0$, forcing $w = x = y = z = 0$, and now we must conclude $a, b, c, d \equiv 0\; \mathrm{mod}\; m$. Thus $m^2$ divides $a^2 + b^2 + c^2 + d^2 = m p$, so $m$ divides $p$; as $m \lt p$, this gives $m = 1$ and we are done.
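Read constructively, the proof gives a descent: as long as $m \gt 1$, either halve (when $m$ is even) or replace $\beta$ by $\alpha \beta / m$ (when $m$ is odd), which strictly lowers $m$. The following sketch is one way to run that descent, not a transcription of the proof, which argues by minimality instead. All function names are hypothetical, and the starting representation is found by brute force.

```python
def qmul(p, q):
    """Hamilton product of quaternions encoded as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def norm(q):
    return sum(t * t for t in q)


def small_multiple(p):
    """Find (m, beta) with N(beta) = m * p and 0 < m < p/2, for an odd prime p."""
    half = (p - 1) // 2
    for u in range(half + 1):
        for v in range(u, half + 1):
            if (u * u + v * v + 1) % p == 0:
                return (u * u + v * v + 1) // p, (u, v, 1, 0)
    raise ValueError("no representation found; is p an odd prime?")


def halve(rep):
    """Turn a representation of 2n into a representation of n."""
    a, b, c, d = sorted(rep, key=lambda t: t % 2)
    return ((a + b) // 2, (a - b) // 2, (c + d) // 2, (c - d) // 2)


def four_squares_prime(p):
    """Return (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == p, assuming p is prime."""
    if p == 2:
        return (1, 1, 0, 0)
    m, beta = small_multiple(p)
    while m > 1:
        if m % 2 == 0:
            beta, m = halve(beta), m // 2
            continue
        half = (m - 1) // 2
        conj = (beta[0], -beta[1], -beta[2], -beta[3])
        # alpha is congruent to conj(beta) mod m, with entries in [-half, half].
        alpha = tuple(((t + half) % m) - half for t in conj)
        # alpha * beta is divisible by m componentwise; divide it out.
        beta = tuple(t // m for t in qmul(alpha, beta))
        m = norm(beta) // p  # the new multiplier n, with 0 < n < m
    return beta


for p in (2, 3, 5, 7, 23, 97, 1009):
    assert norm(four_squares_prime(p)) == p
```

Combined with the multiplicativity of the norm, representations of the prime factors of a natural number can then be multiplied together to represent the number itself.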

Revised on June 8, 2020 at 12:25:54 by Todd Trimble