Proofs of Fermat's theorem on sums of two squares
Encyclopedia
Fermat's theorem on sums of two squares asserts that an odd prime number
p can be expressed as
with integer
x and y if and only if p is congruent
to 1 (mod 4). The statement was announced by Fermat
in 1640, but he supplied no proof.
The "only if" clause is easy: a perfect square
is congruent to 0 or 1 modulo 4, hence a sum of two squares is congruent to 0, 1, or 2. An odd prime number is congruent to either 1 or 3 modulo 4, and the second possibility has just been ruled out. The first proof that such a representation exists was given by Leonhard Euler
in 1747 and was quite complicated. Since then, many different proofs have been found. Among them, the proof using Minkowski's theorem
about convex set
s and Don Zagier
's stunningly short proof based on involutions especially stand out.
succeeded in proving Fermat's theorem on sums of two squares in 1747, when he was forty years old. He communicated this in a letter to Goldbach
dated 6 May 1747. The proof relies on infinite descent, and proceeds in five steps; the fifth step below is from another letter to Goldbach written in 1749, as the first letter was vague on that final step:
1. The product of two numbers, each of which is a sum of two squares, is itself a sum of two squares.
2. If a number which is a sum of two squares is divisible by a prime which is a sum of two squares, then the quotient is a sum of two squares.
3. If a number which can be written as a sum of two squares is divisible by a number which is not a sum of two squares, then the quotient has a factor which is not a sum of two squares.
4. If and are relatively prime then every factor of is a sum of two squares.
5. Every prime of the form is a sum of two squares.
gave a proof in 1770 based on his general theory of integral quadratic forms. The following is a slight simplification of his argument, due to Gauss
, which appears in article 182 of the Disquisitiones Arithmeticae
.
A (binary) quadratic form will be taken to be an expression of the form with integers. A number is said to be represented by the form if there exist integers such that . Fermat's theorem on sums of two squares is then equivalent to the statement that a prime is represented by the form (i.e., , ) exactly when is congruent to modulo .
The discriminant
of the quadratic form is defined to be (this is the definition due to Gauss; Lagrange did not require the term to have even coefficient, and defined the discriminant as ). The discriminant of is then equal to .
Two forms and are equivalent if and only if there exist substitutions with integer coefficients
with such that, when substituted into the first form, yield the second. Equivalent forms are readily seen to have the same discriminant. Moreover, it is clear that equivalent forms will represent exactly the same integers.
Lagrange proved that all forms of discriminant −1 and are equivalent (a form satisfying this conditions is said to be reduced). Thus, to prove Fermat's theorem it is enough to find any reduced form of discriminant −1 that represents . To do this, it suffices to find an integer such that divides . For, finding such an integer, we can consider the form
which has discriminant −1 and represents p by setting x = 1 and y = 0.
Suppose then that p = 4n + 1. Again we invoke Fermat's Little Theorem: for any z relatively prime to p, we know that p divides . Moreover, by a theorem of Lagrange, the number of solutions modulo p to a congruence of degree q modulo p is at most q (this follows since the integers modulo p form a field, and a polynomial of degree q has at most q roots). So the congruence has at most 2n solutions among the numbers 1, 2, …, p − 1 = 4n. Therefore, there exists some positive integer z strictly smaller than p (and so relatively prime to p) such that p does not divide . Since p divides , p must divide . Setting completes the proof.
gave at least two proofs of Fermat's theorem on sums of two squares, both using the arithmetical properties of the Gaussian integers, which are numbers of the form a + bi, where a and b are integers, and i is the square root of −1. One appears in section 27 of his exposition of ideals published in 1877; the second appeared in Supplement XI to Dirichlet's
Lectures on Number Theory, and was published in 1894.
1. First proof. If is an odd prime number
, then we have in the Gaussian integers. Consequently, writing a Gaussian integer ω = x + iy with x,y ∈ Z and applying the Frobenius automorphism in Z[i]/(p), one finds
since the automorphism fixes the elements of Z/(p). If p is congruent to 1 modulo 4, then the right hand side equals ω, so in this case the Frobenius endomorphism of Z[i]/(p) is the identity.
Kummer had already established that if } is the order
of the Frobenius automorphism of Z[i]/(p), then the ideal
in Z[i] would be a product of 2/f distinct prime ideal
s. (In fact, Kummer had established a much more general result for any extension of Z obtained by adjoining a primitive m-th root of unity
, where m was any positive integer; this is the case of that result.) Therefore in the current case the ideal (p) is the product of two different prime ideals in Z[i]. Since the Gaussian integers are a Euclidean domain
for the norm function , every ideal is principal and generated by a nonzero element of the ideal of minimal norm. Since the norm is multiplicative, the norm of a generator of one of the ideal factors of (p) must be a strict divisor of , so that we must have , which gives Fermat's theorem.
2. Second proof. This proof builds on Lagrange's result that if is a prime number, then there must be an integer m such that is divisible by p; it also uses the fact that the Gaussian integers are a unique factorization domain
(because they are a Euclidean domain). Since does not divide either of the Gaussian integers and (as it does not divide their imaginary parts), but it does divide their product , it follows that cannot be a prime element in the Gaussian integers. We must therefore have a nontrivial factorization of p in the Gaussian integers, which in view of the norm can have only two factors, so it must be of the form for some integers and . This immediately yields that .
Prime number
A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. A natural number greater than 1 that is not a prime number is called a composite number. For example 5 is prime, as only 1 and 5 divide it, whereas 6 is composite, since it has the divisors 2...
p can be expressed as
with integer
Integer
The integers are formed by the natural numbers together with the negatives of the non-zero natural numbers .They are known as Positive and Negative Integers respectively...
x and y if and only if p is congruent
Modular arithmetic
In mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" after they reach a certain value—the modulus....
to 1 (mod 4). The statement was announced by Fermat
Pierre de Fermat
Pierre de Fermat was a French lawyer at the Parlement of Toulouse, France, and an amateur mathematician who is given credit for early developments that led to infinitesimal calculus, including his adequality...
in 1640, but he supplied no proof.
The "only if" clause is easy: a perfect square
Square number
In mathematics, a square number, sometimes also called a perfect square, is an integer that is the square of an integer; in other words, it is the product of some integer with itself...
is congruent to 0 or 1 modulo 4, hence a sum of two squares is congruent to 0, 1, or 2. An odd prime number is congruent to either 1 or 3 modulo 4, and the second possibility has just been ruled out. The first proof that such a representation exists was given by Leonhard Euler
Leonhard Euler
Leonhard Euler was a pioneering Swiss mathematician and physicist. He made important discoveries in fields as diverse as infinitesimal calculus and graph theory. He also introduced much of the modern mathematical terminology and notation, particularly for mathematical analysis, such as the notion...
in 1747 and was quite complicated. Since then, many different proofs have been found. Among them, the proof using Minkowski's theorem
Minkowski's theorem
In mathematics, Minkowski's theorem is the statement that any convex set in Rn which is symmetric with respect to the origin and with volume greater than 2n d contains a non-zero lattice point...
about convex set
Convex set
In Euclidean space, an object is convex if for every pair of points within the object, every point on the straight line segment that joins them is also within the object...
s and Don Zagier
Don Zagier
Don Bernard Zagier is an American mathematician whose main area of work is number theory. He is currently one of the directors of the Max Planck Institute for Mathematics in Bonn, Germany, and a professor at the Collège de France in Paris, France.He was born in Heidelberg, Germany...
's stunningly short proof based on involutions especially stand out.
Euler's proof by infinite descent
EulerLeonhard Euler
Leonhard Euler was a pioneering Swiss mathematician and physicist. He made important discoveries in fields as diverse as infinitesimal calculus and graph theory. He also introduced much of the modern mathematical terminology and notation, particularly for mathematical analysis, such as the notion...
succeeded in proving Fermat's theorem on sums of two squares in 1747, when he was forty years old. He communicated this in a letter to Goldbach
Christian Goldbach
Christian Goldbach was a German mathematician who also studied law. He is remembered today for Goldbach's conjecture.-Biography:...
dated 6 May 1747. The proof relies on infinite descent, and proceeds in five steps; the fifth step below is from another letter to Goldbach written in 1749, as the first letter was vague on that final step:
1. The product of two numbers, each of which is a sum of two squares, is itself a sum of two squares.
-
- This is simply a restatement of the Brahmagupta–Fibonacci identity.
2. If a number which is a sum of two squares is divisible by a prime which is a sum of two squares, then the quotient is a sum of two squares.
-
- Indeed, suppose for example that is divisible by and that this latter is a prime. Then divides
-
- Since is a prime, it divides one of the two factors. Suppose that it divides . Since
-
- (Brahmagupta-Fibonacci identity) it follows that must divide . So the equation can be divided by the square of . Dividing the expression by yields:
-
- and thus expresses the quotient as a sum of two squares, as claimed.
-
- If divides , a similar argument holds by using
-
- (Brahmagupta-Fibonacci identity).
3. If a number which can be written as a sum of two squares is divisible by a number which is not a sum of two squares, then the quotient has a factor which is not a sum of two squares.
-
- Suppose divides and that the quotient, factored into its prime factors is . Then . If all factors can be written as sums of two squares, then we can divide successively by , , etc., and applying the previous step we deduce that each quotient is a sum of two squares. This until we get to , concluding that would have to be the sum of two squares. So, by contraposition, if is not the sum of two squares, then at least one of the primes is not the sum of two squares.
4. If and are relatively prime then every factor of is a sum of two squares.
-
- This is the step that uses infinite descent. Let be a factor of . We can write
- where and are at most half of in absolute value. This gives:
- Therefore, must be divisible by , say . If and are not relatively prime, then their gcd cannot divide (if it did, then it would divide and which we assume are relatively prime). Thus the square of the gcd divides (as it divides ), giving us an expression of the form for relatively prime and , and with no more than half of , since
- This is the step that uses infinite descent. Let be a factor of . We can write
-
- If and are relatively prime, then we can use them directly instead of switching to and .
-
- If is not the sum of two squares, then by the third step there must be a factor of which is not the sum of two squares; call it . This gives an infinite descent, going from to a smaller number , both not the sums of two squares but dividing a sum of two squares. Since an infinite descent is impossible, we conclude that must be expressible as a sum of two squares, as claimed.
5. Every prime of the form is a sum of two squares.
-
- If , then by Fermat's Little TheoremFermat's little theoremFermat's little theorem states that if p is a prime number, then for any integer a, a p − a will be evenly divisible by p...
each of the numbers is congruent to one modulo . The differences are therefore all divisible by . Each of these differences can be factored as - Since is prime, it must divide one of the two factors. If in any of the cases it divides the first factor, then by the previous step we conclude that is itself a sum of two squares (since and differ by , they are relatively prime). So it is enough to show that cannot always divide the second factor. If it divides all differences , then it would divide all differences of successive terms, all differences of the differences, and so forth. Since the th differences of the sequence are all equal to (Finite difference), the th differences would all be constant and equal to , which is certainly not divisible by . Therefore, cannot divide all the second factors which proves that is indeed the sum of two squares.
- If , then by Fermat's Little Theorem
Lagrange's proof through quadratic forms
LagrangeJoseph Louis Lagrange
Joseph-Louis Lagrange , born Giuseppe Lodovico Lagrangia, was a mathematician and astronomer, who was born in Turin, Piedmont, lived part of his life in Prussia and part in France, making significant contributions to all fields of analysis, to number theory, and to classical and celestial mechanics...
gave a proof in 1770 based on his general theory of integral quadratic forms. The following is a slight simplification of his argument, due to Gauss
Carl Friedrich Gauss
Johann Carl Friedrich Gauss was a German mathematician and scientist who contributed significantly to many fields, including number theory, statistics, analysis, differential geometry, geodesy, geophysics, electrostatics, astronomy and optics.Sometimes referred to as the Princeps mathematicorum...
, which appears in article 182 of the Disquisitiones Arithmeticae
Disquisitiones Arithmeticae
The Disquisitiones Arithmeticae is a textbook of number theory written in Latin by Carl Friedrich Gauss in 1798 when Gauss was 21 and first published in 1801 when he was 24...
.
A (binary) quadratic form will be taken to be an expression of the form with integers. A number is said to be represented by the form if there exist integers such that . Fermat's theorem on sums of two squares is then equivalent to the statement that a prime is represented by the form (i.e., , ) exactly when is congruent to modulo .
The discriminant
Discriminant
In algebra, the discriminant of a polynomial is an expression which gives information about the nature of the polynomial's roots. For example, the discriminant of the quadratic polynomialax^2+bx+c\,is\Delta = \,b^2-4ac....
of the quadratic form is defined to be (this is the definition due to Gauss; Lagrange did not require the term to have even coefficient, and defined the discriminant as ). The discriminant of is then equal to .
Two forms and are equivalent if and only if there exist substitutions with integer coefficients
with such that, when substituted into the first form, yield the second. Equivalent forms are readily seen to have the same discriminant. Moreover, it is clear that equivalent forms will represent exactly the same integers.
Lagrange proved that all forms of discriminant −1 and are equivalent (a form satisfying this conditions is said to be reduced). Thus, to prove Fermat's theorem it is enough to find any reduced form of discriminant −1 that represents . To do this, it suffices to find an integer such that divides . For, finding such an integer, we can consider the form
which has discriminant −1 and represents p by setting x = 1 and y = 0.
Suppose then that p = 4n + 1. Again we invoke Fermat's Little Theorem: for any z relatively prime to p, we know that p divides . Moreover, by a theorem of Lagrange, the number of solutions modulo p to a congruence of degree q modulo p is at most q (this follows since the integers modulo p form a field, and a polynomial of degree q has at most q roots). So the congruence has at most 2n solutions among the numbers 1, 2, …, p − 1 = 4n. Therefore, there exists some positive integer z strictly smaller than p (and so relatively prime to p) such that p does not divide . Since p divides , p must divide . Setting completes the proof.
Dedekind's two proofs using Gaussian integers
DedekindRichard Dedekind
Julius Wilhelm Richard Dedekind was a German mathematician who did important work in abstract algebra , algebraic number theory and the foundations of the real numbers.-Life:...
gave at least two proofs of Fermat's theorem on sums of two squares, both using the arithmetical properties of the Gaussian integers, which are numbers of the form a + bi, where a and b are integers, and i is the square root of −1. One appears in section 27 of his exposition of ideals published in 1877; the second appeared in Supplement XI to Dirichlet's
Johann Peter Gustav Lejeune Dirichlet
Johann Peter Gustav Lejeune Dirichlet was a German mathematician with deep contributions to number theory , as well as to the theory of Fourier series and other topics in mathematical analysis; he is credited with being one of the first mathematicians to give the modern formal definition of a...
Lectures on Number Theory, and was published in 1894.
1. First proof. If is an odd prime number
Prime number
A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. A natural number greater than 1 that is not a prime number is called a composite number. For example 5 is prime, as only 1 and 5 divide it, whereas 6 is composite, since it has the divisors 2...
, then we have in the Gaussian integers. Consequently, writing a Gaussian integer ω = x + iy with x,y ∈ Z and applying the Frobenius automorphism in Z[i]/(p), one finds
since the automorphism fixes the elements of Z/(p). If p is congruent to 1 modulo 4, then the right hand side equals ω, so in this case the Frobenius endomorphism of Z[i]/(p) is the identity.
Kummer had already established that if } is the order
Order (group theory)
In group theory, a branch of mathematics, the term order is used in two closely related senses:* The order of a group is its cardinality, i.e., the number of its elements....
of the Frobenius automorphism of Z[i]/(p), then the ideal
Ideal (ring theory)
In ring theory, a branch of abstract algebra, an ideal is a special subset of a ring. The ideal concept allows the generalization in an appropriate way of some important properties of integers like "even number" or "multiple of 3"....
in Z[i] would be a product of 2/f distinct prime ideal
Prime ideal
In algebra , a prime ideal is a subset of a ring which shares many important properties of a prime number in the ring of integers...
s. (In fact, Kummer had established a much more general result for any extension of Z obtained by adjoining a primitive m-th root of unity
Root of unity
In mathematics, a root of unity, or de Moivre number, is any complex number that equals 1 when raised to some integer power n. Roots of unity are used in many branches of mathematics, and are especially important in number theory, the theory of group characters, field theory, and the discrete...
, where m was any positive integer; this is the case of that result.) Therefore in the current case the ideal (p) is the product of two different prime ideals in Z[i]. Since the Gaussian integers are a Euclidean domain
Euclidean domain
In mathematics, more specifically in abstract algebra and ring theory, a Euclidean domain is a ring that can be endowed with a certain structure – namely a Euclidean function, to be described in detail below – which allows a suitable generalization of the Euclidean algorithm...
for the norm function , every ideal is principal and generated by a nonzero element of the ideal of minimal norm. Since the norm is multiplicative, the norm of a generator of one of the ideal factors of (p) must be a strict divisor of , so that we must have , which gives Fermat's theorem.
2. Second proof. This proof builds on Lagrange's result that if is a prime number, then there must be an integer m such that is divisible by p; it also uses the fact that the Gaussian integers are a unique factorization domain
Unique factorization domain
In mathematics, a unique factorization domain is, roughly speaking, a commutative ring in which every element, with special exceptions, can be uniquely written as a product of prime elements , analogous to the fundamental theorem of arithmetic for the integers...
(because they are a Euclidean domain). Since does not divide either of the Gaussian integers and (as it does not divide their imaginary parts), but it does divide their product , it follows that cannot be a prime element in the Gaussian integers. We must therefore have a nontrivial factorization of p in the Gaussian integers, which in view of the norm can have only two factors, so it must be of the form for some integers and . This immediately yields that .
Zagier's "one-sentence proof"
If p = 4k + 1 is prime, then the set S = {(x, y, z) ∈ N3: x2 + 4yz = p} is finite and has two involutions: an obvious one (x, y, z) → (x, z, y), whose fixed points correspond to representations of p as a sum of two squares, and a more complicated one,-
which has exactly one fixed point, (1, 1, k); however, the number of fixed points of an involution of a finite set S has the same parity as the cardinality of S, so this number is odd for the first involution as well, proving that p is a sum of two squares.
This proof, due to ZagierDon ZagierDon Bernard Zagier is an American mathematician whose main area of work is number theory. He is currently one of the directors of the Max Planck Institute for Mathematics in Bonn, Germany, and a professor at the Collège de France in Paris, France.He was born in Heidelberg, Germany...
, is a simplification of an earlier proof by Heath-BrownRoger Heath-BrownDavid Rodney "Roger" Heath-Brown F.R.S. , is a British mathematician working in the field of analytic number theory. He was an undergraduate and graduate student of Trinity College, Cambridge; his research supervisor was Alan Baker...
, which in turn was inspired by a proof of LiouvilleJoseph Liouville- Life and work :Liouville graduated from the École Polytechnique in 1827. After some years as an assistant at various institutions including the Ecole Centrale Paris, he was appointed as professor at the École Polytechnique in 1838...
. The technique of the proof is a combinatorial analogue of the topological principle that the Euler characteristicEuler characteristicIn mathematics, and more specifically in algebraic topology and polyhedral combinatorics, the Euler characteristic is a topological invariant, a number that describes a topological space's shape or structure regardless of the way it is bent...
s of a topological spaceTopological spaceTopological spaces are mathematical structures that allow the formal definition of concepts such as convergence, connectedness, and continuity. They appear in virtually every branch of modern mathematics and are a central unifying notion...
with an involution and of its fixed point set have the same parity and is reminiscent of the use of sign-reversing involutions in the proofs of combinatorial bijections.
External links