nLab fundamental theorem of algebra

Redirected from "Fundamental Theorem of Algebra".

Context

Analysis

Algebra

Contents

Statement

Classically, the fundamental theorem of algebra states that

Since every non-zero polynomial could be made a monic polynomial by dividing by the leading coefficient, it could also be expressed as

Many proofs of this theorem are known (see the references below); some use complex analysis (the reciprocal of a polynomial function cannot be bounded), some use algebraic topology (the degree of a map is invariant with respect to homotopy), and some use advanced calculus (polynomial functions on the complex numbers are open mappings). All of these proofs involve, at some level, the fact that the real numbers are Dedekind complete, which has as a consequence the fact that the real numbers are archimedean.

Algebraic proof via real closed fields

Despite its name, the fundamental theorem of algebra makes reference to a concept from analysis (the field of complex numbers). However, the analytic part may be reduced to a minimum: that the field of real numbers is real closed. This has been known essentially forever, and is easily proved using (for example) the intermediate value theorem.

The rest of the proof is algebraic and, unlike the other proof methods, applies to all real closed fields, which need not be archimedean. It is due to Emil Artin, and forms a basic chapter in the Artin–Schreier theory of real closed fields.

We recall that a real closed field is an ordered field such that every positive element has a square root, and every polynomial function of odd degree has a root. Note that the polynomial function x 2+1x^2 + 1 cannot have any root in a real closed field — or in fact in any ordered field, since we always have x 20x^2\ge 0 and hence x 2+11x^2+1 \ge 1.

Theorem

If FF is real closed, then K=F[1]K = F[\sqrt{-1}] is algebraically closed.

Proof

We must show that any irreducible polynomial pp of degree greater than 00 with coefficients in KK has a root in KK. Since FF has characteristic 00, it is a perfect field.

Thus the splitting field of pp is a finite Galois extension LL of FF, with Galois group GG. If G(2)G(2) is the Sylow 2-group of GG, then the fixed field? EE of G(2)G(2) is an odd degree extension of FF. Any αE\alpha \in E must then have an irreducible polynomial function qF[x]q \in F[x] of odd degree. But since FF is real closed, qq has a root in FF; by irreducibility, deg(q)=1\deg(q) = 1 and αF\alpha \in F, forcing E=FE = F and G=G(2)G = G(2). We have |G|>1{|G|} \gt 1 since the splitting field contains KK.

So GG is a 22-primary group. But for any prime number pp, a nontrivial finite pp-group has nontrivial center (see here), and is therefore solvable by an inductive argument. Therefore the extension L/FL/F arises from a tower of non-trivial quadratic extension?s

FL 1L n=LF \subseteq L_1 \subseteq \ldots \subseteq L_n = L

By the quadratic formula, the first field L 1L_1 arises by adjoining roots to FF of a polynomial function x 2+ax+bx^2 + a x + b,

a±a 24b2,\frac{-a \pm \sqrt{a^2 - 4b}}{2},

where a 24ba^2 - 4b is negative. Since FF is real closed, the positive element 4ba 24b - a^2 has a square root in FF, so that the roots displayed above belong to K=F[1]K = F[\sqrt{-1}]. So L 1=KL_1 = K. But KK has no nontrivial quadratic extensions by the lemma that follows, so in fact L 1=L n=KL_1 = L_n = K and the theorem is proved.

Lemma

Every element of K=F[1]K = F[\sqrt{-1}] has a square root in KK.

Proof

The proof is most easily apprehended by analogy with polar coordinate representations of complex numbers and half-angle formulas, where a square root of re iθr e^{i\theta} is given by r 1/2e iθ/2r^{1/2}e^{i\theta /2}. Let ii be a fixed square root of 1-1, and let a+bia + b i be an arbitrary element of KK, with a,bFa, b \in F. We must solve (x+yi) 2=a+bi(x + y i)^2 = a + b i, i.e., find x,yFx, y \in F that solve

x 2y 2=a,2xy=bx^2 - y^2 = a, \qquad 2x y = b

Since a 2+b 2a^2 + b^2 has a square root in FF, we may assume by homogeneity in x,yx, y that (a,b)(a, b) is on the unit circle: a 2+b 2=1a^2 + b^2 = 1. By interchanging xx and yy if need be, we may assume 0a10 \leq a \leq 1; replacing yy by y-y if need be, we may assume b0b \geq 0. Taking x,y0x, y \geq 0 such that

x 2=1+a2,y 2=1a2,x^2 = \frac{1+a}{2}, \qquad y^2 = \frac{1-a}{2},

we obtain a solution (since x 2y 2=ax^2 - y^2 = a and 4x 2y 2=b 24 x^2 y^2 = b^2).

Classical FTA via advanced calculus

As noted above, many proofs of the fundamental theorem are known. The following proof, ultimately rooted in the fact that polynomial mappings on \mathbb{C} are open mappings, has the advantage that it requires very little machinery. From what I (Todd Trimble) understand, it is close to the method used by Argand to give his proof (1814)1.

Let f:f\colon \mathbb{C} \to \mathbb{C} be a nonconstant polynomial mapping, and suppose ff has no zero.

  1. Let ss be the infimum of values |f(z)|{|f(z)|}; choose a sequence z 1,z 2,z 3,z_1, z_2, z_3, \ldots such that |f(z n)|s{|f(z_n)|} \to s. Since lim zf(z)=\lim_{z \to \infty} f(z) = \infty, the sequence z nz_n must be bounded; by the Bolzano-Weierstrass theorem it has a subsequence z n kz_{n_k} that converges to some point z 0z_0. Then |f(z n k)|{|f(z_{n_k})|} converges to |f(z 0)|{|f(z_0)|} by continuity, and converges to ss as well, so |f(z)|{|f(z)|} attains an absolute minimum ss at z=z 0z = z_0. By supposition, f(z 0)0f(z_0) \neq 0.

  2. The polynomial function ff may be uniquely written in the form

    f(z)=f(z 0)+g(z)(zz 0) nf(z) = f(z_0) + g(z)(z - z_0)^n

    where gg is polynomial function and g(z 0)0g(z_0) \neq 0. Put

    F(z)=f(z 0)+g(z 0)(zz 0) nF(z) = f(z_0) + g(z_0)(z - z_0)^n

    and choose δ>0\delta \gt 0 small so that

    |zz 0|=δ|g(z)g(z 0)|<|g(z 0)|.{|z - z_0|} = \delta \Rightarrow {|g(z) - g(z_0)|} \lt {|g(z_0)|}.
  3. FF maps the circle C={z:|zz 0|=δ}C = \{z : {|z - z_0|} = \delta\} onto a circle of radius r=|g(z 0)|δ nr = {|g(z_0)|}\delta^n centered at f(z 0)f(z_0). (This uses the fact that any complex number has an n thn^{th} root, which one can prove using polar coordinate representations. We omit the details.) Choose zCz' \in C so that F(z)F(z') is on the line segment between the origin and f(z 0)f(z_0) (we can always choose δ\delta so that also r<|f(z 0)|r \lt {|f(z_0)|}). Then

    |F(z)|=|f(z 0)|r{|F(z')|} = {|f(z_0)|} - r

    We also have

    |f(z)F(z)|=|g(z)g(z 0)||zz 0| n<|g(z 0)|δ n=r{|f(z') - F(z')|} = {|g(z') - g(z_0)|} {|z' - z_0|^n} \lt {|g(z_0)|} \delta^n = r

    according to how we chose δ\delta in 2. We conclude by observing the strict inequality

    |f(z)||F(z)|+|f(z)F(z)|<|f(z 0)|r+r=|f(z 0)|,{|f(z')|} \leq {|F(z')|} + {|f(z') - F(z')|} \lt {|f(z_0)|} - r + r = {|f(z_0)|},

    which contradicts the fact that |f(z)|{|f(z)|} attains an absolute minimum at z=z 0z = z_0.

Via cohomology of complex projective space

The following expands out a proof indicated by Bruner 2009, of the fundamental theorem of algebra via the integral cohomology ring of complex projective space P n1\mathbb{C}P^{n-1}:

Bruner 2009: “I just noticed the following proof that the complex numbers \mathbb{C} are algebraically closed, and wonder if anyone has seen it in print anywhere.
An algebraic extension of C is a unital division algebra over \mathbb{C}, say of dimension nn, so induces

P n1×P n1P n1 \mathbb{C}P^{n-1} \times \mathbb{C}P^{n-1} \longrightarrow \mathbb{C}P^{n-1}

satisfying

y1y+y1 y \mapsto 1 \textstyle{\otimes} y + y \textstyle{\otimes} 1

in second integral cohomology. Since y n=0y^n = 0, we must have n=1n=1.“

Recall here that the integral cohomology of complex projective space ℂℙ n1\mathbb{CP}^{n-1} is (see there):

H i(ℂℙ n1;) a i,wherea i={1 ifi{2j|0jn1} 0 otherwise H^i\big( \mathbb{CP}^{n-1}; \mathbb{Z} \big) \cong \mathbb{Z}^{a_i} \mathrlap{\,,} \;\text{where}\; a_i = \left\{ \begin{array}{ll} 1 & \text{if}\; i \in \{ 2j \vert 0 \leq j \leq n-1 \} \\ 0 & \text{otherwise} \end{array} \right.

on which the cup product induces a ring isomorphism

H *(ℂℙ n,)[c]/c n[c], H^*\big( \mathbb{CP}^n, \mathbb{Z} \big) \simeq \mathbb{Z}[c] / c^n \mathbb{Z}[c] \mathrlap{\,,}

where cc denotes the integral first Chern class (cf. here).

Theorem

Let KK be a finite dimensional vector space over \mathbb{C} equipped

  1. (unit) a map η:*K\eta : \ast \rightarrow K

  2. (product) a linear map μ:K KK\mu \colon K \otimes_{\mathbb{C}} K \rightarrow K

satisfying for a,bKa,b \in K:

  1. (unitality) μ(η(1),a)=a=μ(a,η(1))\mu\big(\eta(1),a\big) = a = \mu\big(a,\eta(1)\big),

  2. (no zero-divisors) if μ(ab)=0\mu (a \otimes b) = 0, then a=0a = 0 or b=0b = 0.

(We do not need to require associativity!)

Then KK is isomorphic to \mathbb{C} itself as a \mathbb{C} -vector space, i.e. dim ( K ) \text{dim}(K) =1= 1.

Note that we obtain such a nonassociative \mathbb{C}-algebra as the splitting field of any complex irreducible polynomial.

Proof

The assumed absence of zero divisors implies that multiplication restricts to the complement of zero, and bilinearity of the multiplication then implies that it respects complex lines and hence descends to a continuous map between (products of) complex projective space

[μ| n{0}]:ℂℙ n1×ℂℙ n1ℂℙ n1. [ \mu|_{ \mathbb{C}^n \setminus \{0\} } ] \ : \ \mathbb{CP}^{n-1} \times \mathbb{CP}^{n-1} \longrightarrow \mathbb{CP}^{n-1} \mathrlap{\,.}

Recall that cH 2(ℂℙ n1)c \in H^2(\mathbb{CP}^{n-1}) is the generator of the cohomology ring H *(ℂℙ n1)H^* ( \mathbb{CP}^{n-1} ). We have:

H 2(ℂℙ n1×ℂℙ n1;)H 2(ℂℙ n1;)H 2(ℂℙ n1;). H^2\big( \mathbb{CP}^{n-1} \times \mathbb{CP}^{n-1}; \mathbb{Z} \big) \cong H^2\big( \mathbb{CP}^{n-1}; \mathbb{Z} \big) \oplus H^2\big( \mathbb{CP}^{n-1}; \mathbb{Z} \big) \mathrlap{\,.}

Therefore there must be a,ba, b \in \mathbb{Z} such that pullback of the generator along the product map is

H 2(μ,)(c)=a(1c)+b(c1). H^2(\mu,\mathbb{Z})(c) = a \cdot (1 \textstyle{\otimes} c) + b \cdot (c \textstyle{\otimes} 1) \mathrlap{\,.}

These coefficients must satisfy the two equations which are the image in degree-2 integral cohomology of the two unit laws (the first assumed property on KK):

H 2(μ(η×Id ℂℙ n1),)=Id H 2(ℂℙ n1,)=H 2(μ(Id ℂℙ n1×η),). H^2 \left( \mu \circ \left( \eta \times Id_{\mathbb{CP}^{n-1}} \right) , \mathbb{Z} \right) = Id_{H^2 \left( \mathbb{CP}^{n-1} , \mathbb{Z} \right) } = H^2 \left( \mu \circ \left( Id_{\mathbb{CP}^{n-1}} \times \eta \right) , \mathbb{Z} \right) \mathrlap{\,.}

This implies that:

a=1,andb=1. a = 1, \;\text{and}\; b = 1 \mathrlap{\,.}

There is an isomorphism of commutative algebras of

([c]/c n[c]) ([c]/c n[c])H (ℂℙ n1;) H (ℂℙ n1;)H (ℂℙ n1×ℂℙ n1;). \big( \mathbb{Z}[c]/c^n\mathbb{Z}[c] \big) \otimes_{\mathbb{Z}} \big( \mathbb{Z}[c]/c^n\mathbb{Z}[c] \big) \cong H^\bullet\big( \mathbb{CP}^{n-1}; \mathbb{Z} \big) \otimes_{\mathbb{Z}} H^\bullet\big( \mathbb{CP}^{n-1}; \mathbb{Z} \big) \cong H^\bullet\big( \mathbb{CP}^{n-1} \times \mathbb{CP}^{n-1}; \mathbb{Z} \big) \mathrlap{\,.}

Where H (;)H^\bullet(-; \mathbb{Z}) is the integral cohomology of a topological space.

H (μ;)H^\bullet(\mu;\mathbb{Z}) is in fact a ring homomorphism, hence we have

0= H (μ;)(c n) = (1c+c1) n = i=0 n(ni)(1c) i(c1) ni = i=0 n(ni)c nic i = 0<i<n(ni)c nic i \begin{aligned} 0 = & H^{\bullet}(\mu; \mathbb{Z})(c^n) \\ = & (1 \otimes c + c \otimes 1)^n \\ = & \sum_{i=0}^{n} \binom{n}{i} (1 \otimes c)^i (c \otimes 1)^{n-i} \\ = & \sum_{i=0}^{n} \binom{n}{i} c^{n-i} \otimes c^{i} \\ = & \sum_{0 \lt i \lt n} \binom{n}{i} c^{n-i} \otimes c^{i} \end{aligned}

As elements of the tensor product

H (ℂℙ n1;) H (ℂℙ n1;)([[c]]/c n[[c]]) ([[c]]/c n[[c]]). H^\bullet(\mathbb{CP}^{n-1}; \mathbb{Z}) \otimes_{\mathbb{Z}} H^\bullet(\mathbb{CP}^{n-1}; \mathbb{Z}) \;\cong\; \left( \mathbb{Z}[[c]]/c^n \mathbb{Z}[[c]] \right) \otimes_{\mathbb{Z}} \left( \mathbb{Z}[[c]]/c^n \mathbb{Z}[[c]] \right) \mathrlap{\,.}

of the algebra H (ℂℙ n1;)[[c]]/c n[[c]]H^\bullet(\mathbb{CP}^{n-1}; \mathbb{Z}) \cong \mathbb{Z}[[c]]/c^n \mathbb{Z}[[c]] with itself.

The right hand side is

i=1 n1(ni)c nic i. \sum_{i = 1}^{n-1} {\binom{n}{i}}\, c^{n-i} \otimes c^{i} \mathrlap{\,.}

By way of contradiction, suppose that n=dim(K)n = \text{dim}(K) is greater than 11. In this case, each of the terms c ic nic^i \otimes c^{n-i} is nonzero for 0<i<n0 \lt i \lt n, and their sum is as well.

This implies that

i=1 n1(ni)c nic i \sum_{i= 1}^{n-1} {\binom{n}{i}}\, c^{n-i} \otimes c^{i}

is not zero, which contradicts that c nc^n is zero using the above equation.

This implies n=1n = 1 and hence the claim.

A variant of this proof works with the real cohomology of infinite complex projective space.

There is an isomorphism of \mathbb{R}-algebras:

H (P ;)[[c]]. H^{\bullet}\big( \mathbb{C}P^{\infty}; \mathbb{R}\big) \cong \mathbb{R}[[c]] \mathrlap{\,.}

There is a comultiplication on H (P ;)H^{\bullet}(\mathbb{C}P^{\infty} ; \mathbb{R}) which is related to the complex orientation on the Eilenberg-MacLane spectrum HH\mathbb{R}. More precisely, there is a map of \mathbb{R}-algebras

Δ:H (P ;)H (P ;)^ H (P ;) \Delta \;\colon\; H^{\bullet} ( \mathbb{C}P^{\infty} ; \mathbb{R}) \rightarrow H^{\bullet} ( \mathbb{C}P^{\infty} ; \mathbb{R}) {\hat{\otimes}}_{\mathbb{R}} H^{\bullet} ( \mathbb{C}P^{\infty} ; \mathbb{R})

which is the unique map of \mathbb{R}-algebras such that

Δ(c)=c1+1c. \Delta(c) = c \otimes 1 + 1 \otimes c \mathrlap{\,.}

When [[c]]^ [[c]]\mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] is identified with [[x,y]]\mathbb{R}[[x,y]], this is the additive formal group law on \mathbb{R}.

Then we have the following facts:

  • H (P n;)[[c]]/c n[[c]]H^{\bullet} ( \mathbb{C}P^n ; \mathbb{R}) \cong \mathbb{R}[[c]] / c^n \mathbb{R}[[c]] as \mathbb{R}-algebras.

  • Any continuous binary operation on P n\mathbb{C}P^{n} with a unit (satisfying the left and right unit laws) induces a quotient of the \mathbb{R}-algebra [[c]]\mathbb{R}[[c]] which commutes with the comultiplication in the sense that the following diagram commutes

On the power series ring [[c]]\mathbb{R}[[c]] consider the multiplication

μ:[[c]]^ [[c]][[c]] \mu \;\colon\; \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] \longrightarrow \mathbb{R}[[c]]

given by

μ( i=0 a ic i i=0 b ic i)= i=0 i 1+i 2=i,i 1,i 2a i 1b i 2c i \mu \big( \sum_{i= 0}^{\infty} a_i c^i \otimes \sum_{i=0}^{\infty} b_i c^i \big) = \sum_{i=0}^{\infty} \sum_{\ i_1 + i_2 = i, i_1, i_2 \in \mathbb{N} } a_{i_1} b_{i_2} c^{i}

and the comultiplication

Δ:[[c]][[c]]^ [[c]] \Delta \;\colon\; \mathbb{R}[[c]] \longrightarrow \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]]

given by

Δ(c)=c1+1c. \Delta \left( c \right) = c \otimes 1 + 1 \otimes c \mathrlap{\,.}

Theorem

Let I[[c]]I \subset \mathbb{R}[[c]] be an ideal of [[c]]\mathbb{R}[[c]] such that

Δ(I)I^ [[c]]+[[c]]^ I \Delta (I) \subseteq I \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] + \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} I

Then II must be one of the following ideals:

  • {0}\{ 0 \}

  • c[[c]]c \cdot \mathbb{R}[[c]]

  • [[c]]\mathbb{R}[[c]]

Proof

The ring of formal power series [[c]]\mathbb{R}[[c]] over the field \mathbb{R} is a discrete valuation ring. Its unique maximal ideal is generated by cc. Consequently, every non-zero ideal in [[c]]\mathbb{R}[[c]] is principal and is generated by some power of cc (see there). Therefore, I={0}I = \{0\} or there exists a non-negative integer nn \in \mathbb{N} such that I=(c n)=c n[[c]]I = (c^n) = c^n \cdot \mathbb{R}[[c]].

If I={0}I = \{0\}, the condition Δ({0})={0}{0}+{0}\Delta(\{0\}) = \{0\} \subseteq \{0\} + \{0\} holds trivially. Therefore we now assume that II is a non-zero ideal, so that I=(c n)I = (c^n) for some n0n \ge 0.

To analyze the ideal condition, we use the canonical isomorphism of topological \mathbb{R}-algebras:

[[c]]^ [[c]] [[x,y]] c1 x 1c y. \begin{array}{ccc} \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] &\overset{\sim}{\longrightarrow}& \mathbb{R}[[x, y]] \\ c \otimes 1 &\mapsto& x \\ 1 \otimes c &\mapsto& y\mathrlap{\,.} \end{array}

Under this isomorphism:

  1. The comultiplication acts as Δ(c)=x+y\Delta(c) = x + y.

  2. The bounding ideal I^ [[c]]+[[c]]^ II \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] + \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} I corresponds precisely to the ideal generated by x nx^n and y ny^n, namely (x n,y n)[[x,y]](x^n, y^n) \subset \mathbb{R}[[x, y]].

The condition Δ(I)I^ [[c]]+[[c]]^ I\Delta(I) \subseteq I \hat{\otimes}_{\mathbb{R}} \mathbb{R}[[c]] + \mathbb{R}[[c]] \hat{\otimes}_{\mathbb{R}} I therefore requires that the image of the generator c nc^n belongs to this ideal:

(x+y) n(x n,y n). (x + y)^n \in \big( x^n, y^n \big) \mathrlap{\,.}

Now, the expression (x+y) n(x+y)^n is a homogeneous polynomial of total degree nn. Any element P(x,y)(x n,y n)P(x,y) \in (x^n, y^n) can be written as x nf(x,y)+y ng(x,y)x^n f(x,y) + y^n g(x,y) for some power series f,g[[x,y]]f, g \in \mathbb{R}[[x,y]]. Because formal power series expand into components of ascending degrees, the terms of total degree exactly nn in P(x,y)P(x,y) must come exclusively from the constant terms of ff and gg.

Thus, the degree-nn homogeneous part of any series in the ideal (x n,y n)(x^n, y^n) is necessarily of the form Ax n+By nA x^n + B y^n for some real constants A,BA, B \in \mathbb{R}.

Since (x+y) n(x+y)^n belongs to (x n,y n)(x^n, y^n) and is entirely homogeneous of degree nn, it must equal its own degree-nn component. Therefore, there exist A,BA, B \in \mathbb{R} such that:

(x+y) n=Ax n+By n. (x+y)^n \;=\; A x^n + B y^n \mathrlap{\,.}

However, by the binomial theorem, we know that:

(x+y) n= i=0 n(ni)x iy ni. (x+y)^n \;=\; \sum_{i=0}^n \binom{n}{i} x^i y^{n-i} \mathrlap{\,.}

For these two expressions to be identical, the coefficients of all cross-terms x iy nix^i y^{n-i} (where 0<i<n0 \lt i \lt n) must be zero, in particular (n1)\binom{n}{1} would need to be zero if n>1n\gt 1.

Now we use that we are working over \mathbb{R}, a field of characteristic zero. The binomial coefficient (n1)=n\binom{n}{1}=n is nonzero for all n1n\geq 1, so we are left with the only possible cases being n1n\leq 1. The case n=0n=0 holds by inspection, and corresponds to the improper ideal, and the case n=1n=1 obviously satisfies the required identity, and is the proper nonzero ideal in the statement.

In constructive mathematics

There are multiple notions used in the formulation of the fundamental theorem of algebra that bifurcate into different notions in constructive mathematics.

First of all, the notion of a field of real numbers bifurcates into multiple distinct notions of the real numbers, so one has a fundamental theorem of algebra for every field of complex numbers =[i]/(i 2+1)\mathbb{C} = \mathbb{R}[i]/(i^2 + 1) of a notion of real numbers \mathbb{R}, ranging from the Cauchy real numbers to any Cauchy complete Archimedean ordered field, examples of which include the HoTT book real numbers, which is the initial Cauchy complete Archimedean ordered field, and the Dedekind real numbers, which is the terminal Cauchy complete Archimedean ordered field.

Secondly, one has to decide what kind of polynomial functions to use for the FTA. In constructive mathematics, one usually considers three notions of polynomial functions, which in constructive mathematics are defined using the tight apartness relation on the real numbers rather than denial inequality:

  1. non-constant polynomial functions: a polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n is non-constant if one has at least one coefficient a ia_i in \mathbb{C} apart from zero, with 0<in0 \lt i \leq n

  2. polynomial function with positive degree: a polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n has a positive degree nn if a na_n is apart from zero, with n>0n \gt 0

  3. monic polynomial functions: a polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n is monic if a n=1a_n = 1, with n>0n \gt 0

However, the FTA that uses monic polynomial functions imply the other versions:

Theorem

Suppose that the FTA for monic polynomial functions hold. Then the FTA for non-constant polynomial functions hold.

Proof

Lemma 6 and Corollary 1 of Geuvers, Wiedijk, & Zwanenburg 2000 hold for any Heyting field FF as their proofs only require the field structure of FF, and Theorem 1 of Geuvers, Wiedijk, & Zwanenburg 2000 is the FTA for non-constant functions and depends on Lemma 6, Corollary 1, and the FTA for monic polynomial functions but otherwise only depends on the field structure of FF.

The converses come from the fact that every polynomial with positive degree is non-constant and every monic polynomial has positive degree. Thus, it suffices to only consider monic polynomials in the fundamental theorem of algebra.

Finally, one has to decide whether to use mere existence of a root in the sense of traditional first-order logic or constructive existence in the sense of the BHK interpretation in the formulation of the FTA. One can also consider, instead of exact roots, approximate roots of the polynomial function.

As a result, there are multiple different versions of the fundamental theorem of algebra which are equivalent in classical mathematics but are not equivalent in constructive mathematics. Different authors have ended up proving different versions of the fundamental theorem of algebra for different kinds of real numbers, without assuming any constructive taboos, while other versions of the fundamental theorem of algebra are unprovable without certain constructive taboos and may even be provably false from other constructive taboos.

Constructive FTA with mere existence of zeroes

The version of the fundamental theorem of algebra that uses mere existence for existence of zeroes is stated as follows:

Proposition

Every monic polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n with degree nn has a root pp in \mathbb{C}.

Constructive FTA with constructive existence of zeroes

The version of the fundamental theorem of algebra that uses the BHK interpretation for constructive existence of zeroes is stated as follows:

Proposition

For every [non-constant / positive degree / monic] polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n with degree nn, one can construct a specified root of pp in \mathbb{C}.

This version of the fundamental theorem of algebra is not provable in neutral constructive mathematics. To show that this is the case, we turn to reframing the fundamental theorems of algebra in terms of surjectivity and having right inverses of complex polynomial functions so that one can show that it is equivalent to a weak form of choice.

Let p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n be a polynomial function on the complex numbers. Every such p(z)p(z) can be written as the sum of a constant a 0a_0 and a polynomial function q(z)= 1ina nz nq(z) = \sum_{1 \leq i \leq n} a_n z^n with a fixed point at zero. As a result, the statement that there exists a complex number zz such that p(z)=0p(z) = 0 is equivalently the statement that there exists a complex number zz such that q(z)=bq(z) = b, where b=a 0b = -a_0. Thus, one can rewrite the fundamental theorem of algebra as follows:

Theorem

Given a complex number bb and a [non-constant / positive degree / monic] polynomial function q(z)q(z) on the complex numbers such that q(0)=0q(0) = 0, there exists a complex number cc such that q(c)=bq(c) = b.

Or equivalently

Theorem

Every [non-constant / positive degree / monic] polynomial function q(z)q(z) on the complex numbers such that q(0)=0q(0) = 0 is surjective.

One can do the same analysis with the BHK interpretation of the FTA, the statement that one can construct a specified complex number zz such that p(z)=0p(z) = 0 is equivalently the statement that one can construct a specified complex number zz such that q(z)=bq(z) = b, where b=a 0b = -a_0. Thus, one can rewrite the fundamental theorem of algebra as follows:

Theorem

Given a complex number bb and a [non-constant / positive degree / monic] polynomial function q(z)q(z) on the complex numbers such that q(0)=0q(0) = 0, one can construct a specified complex number cc such that q(c)=bq(c) = b.

Or equivalently

Theorem

Every [non-constant / positive degree / monic] polynomial function q(z)q(z) on the complex numbers such that q(0)=0q(0) = 0 has a section.

Thus, the gap between the version of the FTA using mere existence and the version using constructive existence is precisely this weak version of the axiom of choice:

Proposition

Every surjective polynomial function on the complex numbers, that is [non-constant / with positive degree / monic] and has a fixed point at zero, has a section.

This is the reason why some classical proofs of versions of the fundamental theorem of algebra that use mere existence fail to work constructively if they are reliant on a square root function on the complex numbers, such as the proof in Lemma which uses the quadratic formula. Such a square root function is a section of the squaring function zz 2z \mapsto z^2 and so cannot be proven to exist on the complex numbers from surjectivity of zz 2z \mapsto z^2, since the square root on the complex numbers, if it exists, is discontinuous at zero, which implies the constructive taboo analytic WLPO for the real numbers and decidable equality for the complex numbers. In fact, in certain topoi, such as sheaves over \mathbb{C}, one can prove that there are no square root functions because in those topoi all functions on the complex numbers are continuous.

The unprovability of this weak version of choice in neutral constructive mathematics also implies that one cannot factor every monic polynomial function p(z)p(z) of degree nn into nn distinct monomials (zb i)(z - b_i) for i<ni \lt n such that p(z)= i<n(zb i)p(z) = \prod_{i \lt n} (z - b_i), since one would need to first construct the specified complex roots b ib_i. Thus, the complex numbers are not provably an algebraically closed field.

In light of this, one can instead interpret the constructive FTA as a statement about sets of roots rather than about individual roots, an interpretation that dates from Richman 2000. He constructs a complete metric space M^ n()\hat{M}_n(\mathbb{C}) which, classically, is the space of nn-element multisets of complex numbers (and constructively is the completion of that space) and proves that every complex polynomial function pp of degree nn may be associated with a point in this space in such a way that the nn elements of that point (when viewed as a multiset, if possible, and morally in any case) are the nn roots of pp.

Approximate FTA

Alternatively, one can consider an approximate version of the fundamental theorem of algebra using constructive existence in the sense of the BHK interpretation:

Theorem

For every [non-constant / positive degree / monic] polynomial function p(z)= ina nz np(z) = \sum_{i \leq n} a_n z^n with degree nn, one can construct positive reals q<1q \lt 1 and c>|f(0)|c \gt \vert f(0) \vert and a Cauchy sequence w:w:\mathbb{N} \to \mathbb{C} in \mathbb{C} such that for all natural numbers nn, |p(w n)|<q nc\vert p(w_n) \vert \lt q^n c.

Proof

This is essentially most of lemma 5 of Geuvers, Wiedijk, & Zwanenburg 2000, only one stops in the proof before one considers whether the limit of the Cauchy sequence exists or not.

Arguably, this approximate version of the fundamental theorem of algebra is what is important in constructive numerical analysis.

In reverse mathematics

Assuming classical logic, but weak foundations, it can be shown that FTA is true in the reverse mathematics system RCA 0RCA_0 (Tanaka-Yamazaki 2005).

History

The proof was attempted many times before Gauss gave what is accepted as the first proof in his dissertation (Gauss 1799), although this was not without issues (Gauss ‘fixed’ this proof almost 50 years later, but the last gap was not filled until the 20th century).

All proofs of this fact (of which there are many) require something analytic, in the sense that ordinary algebra will not suffice: one needs to know that the real numbers (or the complex numbers) ‘have no algebraic gaps’. For instance, the rational numbers famously don’t contain the square root of 22. The cleanest proof I know, due to Artin, that isolates this analytic germ, uses the step-ladder result that the real numbers form what is called a real closed field. This is essentially saying that non-negative real numbers have square roots, and odd degree polynomial functions have roots (anyone who has plotted a cubic can appreciate this fact). Alternatively, one can characterise real closed fields as those for whom the Intermediate Value Theorem (IVT) holds for polynomial functions. Accepting this result (which does need proof), the FTA follows using pure algebra (although not of the high-school sort).

However, it is of interest, partly theoretical, partly for the sake of finding the bare minimum needed to prove the FTA, to know an elementary proof, namely one that minimises the use of analytic techniques (for instance, the IVT for polynomial functions follows from the IVT for continuous functions, but that is like killing a mosquito with a bazooka). Gauss’ second proof (Gauss 1866) is elementary (and predates Artin’s by a long time). Since Gauss lacked modern algebraic techniques, some of his proof is laborious, but (Taylor 85) gives a modern gloss. (With some amusing side notes: as Taylor puts it – ‘Gauss takes the opportunity [to] be rude to his inferior contemporaries’.) Gauss’ proof, in modern language, takes up less than a page and a half, but this presupposes familiarity with some of the theory of fields (but which is pure algebra). Artin’s proof, by comparison, drawing on major theorems can be given in half a page.

It should be noted, in the context of the last statement, that proofs of the FTA can be given, relying on analytic ‘bazooka’ theorems, that are one sentence. However, to spell out the proofs of the necessary theorems, one needs a course in analysis, of some variety, so one is merely sweeping a lot under a very small rug.

In constructive mathematics, there are multiple versions of the FTA which are inequivalent to each other, many of which are not formulated in algebraic terms. For example, the approximate version of the FTA listed above requires one to show that for all polynomials and for all neighborhoods around zero there exists a complex number zz such that the polynomial evaluated at zz is in the neighborhood, which is analytic and topological rather than algebraic.

References

  • Carl Gauss, Demonstratio nova theorematis functionem algebraicam rationalem integramunius variabilis in factores reales primi vel secundi gradus resolvi posse, Dissertation, Helmstedt (1799); Werke 3, 1–30 (1866) (English transl. pdf))

  • Carl Gauss, Demonstratio nova altera theorematis omnem functionem algebraicamrationalem integram unius variabilis in factores reales primi vel secundi gradus resolviposse, Comm. Soc. Reg. Sci. Göttingen 3, 107–142 (1816); Werke 3, 33–56 (1866)

  • Michael Eisermann. An Elementary Real-Algebraic Proof via Sturm Chains. pdf

    Another new proof of the theorem that every integral rational algebraic function of one variable can be resolved into real factors of the first or second degree translated by Paul Taylor and B. Leak (1983) (web)

  • Paul Taylor, Gauss’ Second Proof, Eureka 45 (1985) 42-47 (pdf)

A proof using differential topology:

A constructive proof of the fundamental theorem of algebra for any Cauchy complete Archimedean ordered field:

  • Herman Geuvers, Freek Wiedijk, Jan Zwanenburg, A Constructive Proof of the Fundamental Theorem of Algebra without using the Rationals, TYPES ‘00: Selected papers from the International Workshop on Types for Proofs and Programs, Pages 96 - 111, 08 December 2000 [web, pdf]

A constructive proof of a variant using finite multisets of complex numbers:

see also:

  • Fred Richman, Constructive Mathematics without Choice,

    in: Reuniting the Antipodes – Constructive and Nonstandard Views of the Continuum Synthese Library 306, Springer (2001) 199-206 [doi:10.1007/978-94-015-9757-9_17]

A constructive algebraic proof for the modulated Cauchy real numbers without choice princples such as weak countable choice:

  • Wim Ruitenburg: Constructing Roots of Polynomials over the Complex Numbers, Computational Aspects of Lie Group Representations and Related Topics, CWI Tract 84 Centre for Mathematics and Computer Science, Amsterdam (1991) 107–-128 [pdf, pdf]

A full formalization in the Rocq proof assistant:

  • Herman Geuvers, Freek Wiedijk, Jan Zwanenburg: A Constructive Proof of the Fundamental Theorem of Algebra without Using the Rationals [web]

The above proof using integral cohomology of complex projective space:

Discussion in the context of reverse mathematics:

  • Kazuyuki Tanaka, Takeshi Yamazaki: Manipulating the reals in RCA 0RCA_0, inL Reverse Mathematics 2001, Lecture Notes in Logic 21 (2005)

See also:


  1. Despite the credit given to Gauss for his demonstration of 1799, Argand’s proof is often credited as the first one that is fully rigorous. The proof given here also uses the Bolzano-Weierstrass theorem, first proven by Bolzano in 1817, making it somewhat contemporaneous. Argand is also widely credited as the one who introduced the cutting-edge idea of viewing complex numbers and their operations geometrically, which the proof here also uses (the complex plane \mathbb{C} being also known as the Argand plane).

Last revised on June 15, 2026 at 07:13:37. See the history of this page for a list of all contributions to it.