\chapter{Constructions using ruler and compass}
\section{Solving cubic equations using square roots}
Suppose you are given a cubic equation of the form $x^3+a x^2+b x+c=0$, where $a$, $b$ and $c$ are rational numbers. If the equation has a rational root $x=x_0$, then we can divide the polynomial $x^3+a x^2+b x+c$ by the linear polynomial $x-x_0$ to get a quadratic polynomial with rational coefficients. The roots of this quadratic are the remaining roots of the equation we started with. We can solve for these roots using the quadratic formula, which involves only field operations ($+$, $-$, $\times$, $\div$) and the extraction of a square root. Thus, if a cubic polynomial with rational coefficients has a rational root, then the other two roots can be expressed by a formula involving rational numbers and their square roots.
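As an illustration (the particular cubic below is chosen only as an example and does not appear elsewhere in the text), consider $x^3-x^2-2x+2=0$. It has the rational root $x_0=1$, and dividing by $x-1$ gives
\[
x^3-x^2-2x+2=(x-1)(x^2-2),
\]
so the remaining roots are $x=\pm\sqrt{2}$, which are expressed using a single square root of a rational number.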
Our goal for this section will be to show a strong version of the converse statement: if a cubic equation with rational coefficients does not have a rational root, then none of its roots can be expressed by a formula involving any number of extractions of square roots.
To prove such a result we will find the following concepts useful. The first is the concept of a field. A field is a set of elements that you can add, subtract, multiply and divide, and these operations behave exactly like the addition, subtraction, multiplication and division of rational numbers (e.g. adding $a$ to $b$ gives the same result as adding $b$ to $a$, computing $a(b-c)$ is the same as computing $ab-ac$, one can divide by any non-zero element and so on). We don't want to give a precise definition here, just to list a couple of important examples: the field of rational numbers $\mathbb{Q}$, the field of real numbers $\mathbb{R}$ and the field of complex numbers $\mathbb{C}$. The next concept gives us a way to construct a new field out of an old one.
\subsection{Adjoining a square root to a field}
Let $F$ be a field (like $\mathbb{Q}$) and let $d$ be any element in $F$, which is not a square (it is not of the form $a\cdot a$ for any $a$ in $F$, like $2$ in $\mathbb{Q}$). We now construct a new field $F(\sqrt{d})$, in which $d$ will be a square - the square of $\sqrt{d}$. The elements of this field are of the form $a+b\sqrt{d}$ for $a,b\in F$. Note that $\sqrt{d}$ is just the name of a new symbol; for now it doesn't have any meaning. In particular until we define how to multiply such elements, $\sqrt{d}\cdot \sqrt{d}$ is in no way related to $d$.
Before we define how to add, subtract, multiply and divide such elements, we want to tell our reader that he or she has done all this already. Indeed, how does one construct complex numbers? One takes all real numbers and chooses among them a number, $-1$, that has no square root. Then one introduces a new symbol, $\sqrt{-1}$ (which even receives a special name, $i$) and looks at ``numbers'' of the form $a+bi$, where $a$ and $b$ are real. One adds and subtracts such numbers in the obvious way, and one multiplies them in such a way as to guarantee that $(\sqrt{-1})^2=-1$. It turns out that one can then divide by any such nonzero ``number''. All one has to do is to multiply by $\frac{1}{a+bi}$, which one finds by multiplying both numerator and denominator by the conjugate of $a+bi$: $\frac{1}{a+bi}=\frac{a-bi}{(a+bi)(a-bi)}=\frac{a}{a^2+b^2}-\frac{b}{a^2+b^2}i$.
Now we do exactly the same thing, replacing $-1$ everywhere by $d$ and $i$ by $\sqrt{d}$. Thus we define $(a+b\sqrt{d})\pm(a'+b'\sqrt{d})=(a\pm a')+ (b\pm b')\sqrt{d}$ and $(a+b\sqrt{d})(a'+b'\sqrt{d})=(aa'+bb'd)+(ab'+ba')\sqrt{d}$ so that in particular $(\sqrt{d})^2=d$. We define the conjugate of $a+b\sqrt{d}$ to be $a-b\sqrt{d}$. It turns out that we can divide by any nonzero element $a+b\sqrt{d}$ by multiplying by $\frac{1}{a+b\sqrt{d}}$. We can find the latter fraction by multiplying both numerator and denominator by the conjugate of $a+b\sqrt{d}$: $\frac{1}{a+b\sqrt{d}}=\frac{a-b\sqrt{d}}{(a+b\sqrt{d})(a-b\sqrt{d})}=\frac{a}{a^2-b^2 d}-\frac{b}{a^2- b^2 d}\sqrt{d}$. Note that in this fraction the denominator $a^2-b^2 d$ is nonzero. Indeed, if it were zero with $b\neq 0$, then $d$ would be equal to $(a/b)^2$, contradicting the assumption that $d$ is not a square; and if it were zero with $b=0$, then $a^2=0$, so $a+b\sqrt{d}$ would be the zero element.
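To see these rules at work, here is a small worked example in $\mathbb{Q}(\sqrt{2})$, that is, with $F=\mathbb{Q}$ and $d=2$ (the specific elements are chosen only for illustration). Multiplying $1+\sqrt{2}$ by $3-\sqrt{2}$ according to the rule above, with $a=1$, $b=1$, $a'=3$, $b'=-1$, gives
\[
(1+\sqrt{2})(3-\sqrt{2})=\bigl(1\cdot 3+1\cdot(-1)\cdot 2\bigr)+\bigl(1\cdot(-1)+1\cdot 3\bigr)\sqrt{2}=1+2\sqrt{2}.
\]
The formula for the inverse, with $a^2-b^2d=1-2=-1$, gives
\[
\frac{1}{1+\sqrt{2}}=\frac{1}{-1}-\frac{1}{-1}\sqrt{2}=-1+\sqrt{2}.
\]
One can check directly that $(1+\sqrt{2})(-1+\sqrt{2})=-1+2=1$.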
\subsection{Strategy}
We can now go back to the question with which we started: whether we can solve a cubic equation that doesn't have any rational roots using formulas involving field operations and the extraction of square roots.
The main lemma we are going to use is the following:
\begin{lemma}
Suppose that an equation $x^3+a_1 x^2+a_2 x+a_3=0$ with coefficients in a field $F$ doesn't have any roots in $F$. Then it doesn't have any roots in the field $F(\sqrt{d})$ obtained by adjoining, in the way described above, the square root of an element $d\in F$ that is not a square in $F$.
\end{lemma}
Once we prove this lemma we can deduce that the roots of a cubic equation with rational coefficients that doesn't have any rational roots can't be expressed using formulae that use only field operations and extractions of square roots. Indeed, any number written using a formula with (perhaps nested) square roots belongs to a field that can be obtained from $\mathbb{Q}$ by adjoining square roots one at a time: first the innermost ones that appear in the formula, then the square roots of elements of the fields already obtained, and so on. However, successive applications of the lemma formulated above, once for each square root adjoined, show that none of the roots of the cubic equation can lie in such a field extension of $\mathbb{Q}$.
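For instance (the number below is only an illustrative choice), the number $\sqrt{3+\sqrt{2}}+\sqrt{5}$ lies in the last field of the chain
\[
\mathbb{Q}\subset F_1=\mathbb{Q}(\sqrt{2})\subset F_2=F_1\bigl(\sqrt{3+\sqrt{2}}\bigr)\subset F_3=F_2(\sqrt{5}),
\]
where the names $F_1$, $F_2$, $F_3$ are introduced here just for this example. If at some step the element whose square root is needed already happens to be a square in the current field, no extension is needed at that step. Applying the lemma first with $F=\mathbb{Q}$, then with $F=F_1$, then with $F=F_2$ shows that a cubic with rational coefficients and no rational roots has no roots in $F_1$, hence none in $F_2$, hence none in $F_3$.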
Now we prove the lemma.
\begin{proof}
The following observation is crucial to the proof: if some element $a+b\sqrt{d}$ is a root of the cubic equation $x^3+a_1 x^2+a_2 x+a_3=0$ with coefficients in $F$, then the element $a-b\sqrt{d}$ is also a root. Indeed, the map $\sigma:F(\sqrt{d})\rightarrow F(\sqrt{d})$ defined by $\sigma(a+b\sqrt{d})=a-b\sqrt{d}$ preserves all the field operations: $\sigma(x\pm y)=\sigma(x)\pm\sigma(y)$, $\sigma(xy)=\sigma(x)\sigma(y)$ and $\sigma(1/x)=1/\sigma(x)$ for any $x$ and $y$ in $F(\sqrt{d})$, with $x\neq 0$ in the last identity (in the complex numbers this $\sigma$ would just be complex conjugation, for which these properties are all familiar). Note that the elements of $F$ inside $F(\sqrt{d})$ are fixed by $\sigma$, and they are the only elements that are fixed by it.
Suppose now that some element $x\in F(\sqrt{d})$ satisfies $x^3+a_1 x^2+a_2 x+a_3=0$. Then we can apply $\sigma$ to this equation to get $\sigma(x^3+a_1 x^2+a_2 x+a_3)=0$. Because $\sigma$ preserves field operations and fixes the coefficients $a_1,a_2,a_3$, this implies that $\sigma(x)^3+a_1 \sigma(x)^2+a_2 \sigma(x)+a_3=0$, i.e. $\sigma(x)$ is also a root of the cubic.
To summarize, we showed that if our cubic has a root $x=a+b\sqrt{d}$ in the field $F(\sqrt{d})$, then it has $\sigma(x)=a-b\sqrt{d}$ as a root as well. We started by assuming that the cubic has no roots in $F$, so $x$ is not in $F$; hence $b\neq 0$ and $\sigma(x)\neq x$, and we have found two distinct roots. We can therefore divide the cubic by the product of the two corresponding linear factors, which, writing $t$ for the variable, is $(t-x)(t-\sigma(x))=t^2-2at+(a^2-b^2d)$, a quadratic with coefficients in $F$. Since the cubic also has coefficients in $F$, the quotient is a linear polynomial $t-r$ with $r$ in $F$, and $r$ is the third root of the cubic. This contradicts the assumption that the cubic has no roots in $F$. Thus the cubic can't have even a single root in $F(\sqrt{d})$.
\end{proof}
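The claim that $\sigma$ preserves multiplication, used in the proof above, can be verified directly from the multiplication rule. Writing $x=a+b\sqrt{d}$ and $y=a'+b'\sqrt{d}$, a short computation (a sketch, using only the definitions given earlier) shows that
\[
\sigma(xy)=\sigma\bigl((aa'+bb'd)+(ab'+ba')\sqrt{d}\bigr)=(aa'+bb'd)-(ab'+ba')\sqrt{d}
\]
and
\[
\sigma(x)\sigma(y)=(a-b\sqrt{d})(a'-b'\sqrt{d})=(aa'+bb'd)-(ab'+ba')\sqrt{d},
\]
so the two expressions agree. The identities for sums and differences are checked in the same way.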
\subsection{Applications to geometry}
In the previous subsections we proved an algebraic impossibility result: a cubic equation with rational coefficients and no rational roots (an irreducible cubic) cannot be solved using only field operations and square roots.
This algebraic result has some very interesting implications for ruler and compass constructions. In this section we will show how it implies that certain problems posed about two thousand years ago have no solution! These proofs were obtained only about two centuries ago.
\subsection{Coordinates}
First let us formalize what we mean by saying that some point is constructible using ruler and compass. We usually start with some data that is given to us, like two marked points in a plane, and ask whether we can construct some other point, e.g. the midpoint of the segment joining the two given ones. The constructions we are allowed to make are: drawing a line through two marked points; drawing a circle centered at one of the marked points and passing through some other marked point; and marking a point of intersection of figures that have already been drawn on the plane.
The connexion between algebra and geometry is a coordinate system on the plane. We look at the smallest field that contains the coordinates of all the points we can mark.
Our main claim is that by doing an allowed construction we can at most extend this field by adjoining a square root to it. First we note that any line passing through two points $(x_1,y_1)$, $(x_2,y_2)$ with coordinates in a field $F$ is the set of solutions of the equation $(x_2-x_1)(y-y_1)=(y_2-y_1)(x-x_1)$, which has coefficients in $F$. Similarly, a circle with center at a point $(x_1,y_1)$ with coordinates in $F$ and passing through a point $(x_2,y_2)$ with the same property is the set of solutions of the equation $(x-x_1)^2+(y-y_1)^2=(x_2-x_1)^2+(y_2-y_1)^2$, which also has coefficients in $F$. Next we note that a point of intersection of two lines given by equations with coefficients in $F$ has coordinates in $F$, because we can solve linear equations using only field operations. Similarly, to find the points of intersection of a line and a circle, the most we have to do is to solve a quadratic equation (the equation of the circle restricted to the line). Quadratic equations can be solved using formulae that require only field operations and the extraction of a square root, so a point of intersection of a line and a circle whose equations have coefficients in $F$ has coordinates either in $F$ itself or in an extension of the form $F(\sqrt{d})$. Finally, the problem of intersecting two circles given by equations $x^2+y^2+a x + b y + c=0$ and $x^2+y^2+a' x + b' y + c'=0$ can be reduced to the problem of intersecting the first circle with the line $(a'-a)x+(b'-b)y+(c'-c)=0$, obtained by subtracting the first equation from the second.
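As a concrete illustration (the points are chosen only as an example), start with the marked points $(0,0)$ and $(1,0)$, whose coordinates lie in $\mathbb{Q}$, and draw the circle centered at each of them passing through the other:
\[
x^2+y^2-1=0,\qquad x^2+y^2-2x=0.
\]
Subtracting the first equation from the second gives the line $-2x+1=0$, so $x=\frac{1}{2}$, and substituting this into the first equation gives $y^2=\frac{3}{4}$. The two intersection points are therefore
\[
\left(\frac{1}{2},\ \frac{\sqrt{3}}{2}\right)\quad\mbox{and}\quad\left(\frac{1}{2},\ -\frac{\sqrt{3}}{2}\right),
\]
whose coordinates lie in $\mathbb{Q}(\sqrt{3})$. This one construction step extends the field of coordinates by a single square root, as the claim predicts.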
Now we have everything we need to apply what we know about irreducible cubics to ancient questions about constructions using a ruler and a compass.
Consider the following problem: given a segment that represents the side of a cube, can you construct a segment which will be the side of a cube of twice the volume?
By choosing coordinates appropriately this amounts to the following question: given two marked points $(0,0)$ and $(1,0)$, can you construct a segment of length $\sqrt[3]{2}$? This length is a root of an irreducible cubic: $x^3-2=0$ over the field $\mathbb{Q}$. Hence it can't lie in any extension of $\mathbb{Q}$ made by adjoining successive square roots: it can't be constructed using ruler and compass! We have shown, with only simple algebra, that no matter how involved your method, you will never be able to construct a segment of length $\sqrt[3]{2}$ out of a given unit length segment. This is the basis for the aphoristic impossibility of ``doubling the cube.'' There is a similar explanation for the phrase ``squaring the circle.''
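The fact that $x^3-2=0$ has no rational root, which is what the argument requires, can be checked by a standard argument, sketched here. Suppose $x=p/q$ were a root, where $p$ and $q$ are integers with no common factor (the letters $p$, $q$, $r$ are introduced only for this check). Then
\[
p^3=2q^3,
\]
so $p$ is even, say $p=2r$. Substituting gives $8r^3=2q^3$, that is,
\[
q^3=4r^3,
\]
so $q$ is even as well. This contradicts the assumption that $p$ and $q$ have no common factor.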
Another problem in a similar spirit is that of trisecting an angle. Its solution is outlined in the exercises (which are still to be added!).
The mathematical tools we developed here are in fact very simple toy examples of what is known today as Galois theory. More sophisticated techniques from Galois theory enabled Gauss to answer the question ``which regular $n$-gons are constructible using a ruler and a compass?'' For instance, it is very easy to construct a regular triangle. It was known to the ancients how to construct a regular pentagon. But the problems of constructing a regular heptagon, 11-gon and 13-gon resisted solution. It was believed that, as with the problem of doubling the cube, no solution for the problem of constructing a regular $n$-gon existed unless $n$ was a product of powers of 2, 3 or 5. It was a complete surprise to the mathematical community when Gauss managed to construct a regular 17-gon, at the age of nineteen!