The Abel-Ruffini theorem
Table of Contents
Introduction
The Abel-Ruffini theorem states that the roots of a generic polynomial
\begin{align} P_n(x) = x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \end{align}of degree \(n \geq 5\) cannot be expressed in terms of radicals, i.e. that there exists no formula to write the roots \(r_0, \ldots, r_{n-1}\) in terms of the coefficients \(a_0, \ldots, a_{n-1}\) that only uses addition, subtraction, multplication, division and the extraction of \(k\)-th roots (radicals).
It is important that the polynomial is generic which simply means that the coefficients are treated as indeterminates. For special cases there can of course be formulas for the roots in terms of radicals. An example is the polynomial \(x^n - 1\) whose roots are the \(n\)-th roots of unity \(\exp\left(\frac{2\pi i k}{n}\right)\) for \(k = 0, \ldots, n - 1\).
The theorem is often proven using Galois theory, but there is also a more topological proof apparently due to Arnold. I first heard about this proof in this article by Bruce Bartlett and learned most of the details from this paper by Paul Ramond.
We will consider polynomials \(P_n(x)\) starting with the quadratic case \(n = 2\) and see how increasing the degree \(n\) leads to nested radicals in the formula for the roots. When the degree reaches \(n = 5\) there will be an obstruction that cannot be overcome by another level of nesting.
Quadratic
To begin, let's consider a monic quadratic polynomial
\begin{align} P_2(x) = x^2 + a_1 x + a_0, \quad a_0, a_1 \in \mathbb{C}. \end{align}Over the complex numbers, the polynomial has two (generally) complex roots and can be factorized as
\begin{align} P_2(x) = \left(x - r_0\right) \left(x - r_1\right). \end{align}It is well-known that the roots may be written as a function of the coefficients \(a_0\) and \(a_1\) as
\begin{align} \label{eq:roots-quadratic} r_{0,1} = \frac{-a_1 \pm \sqrt{a_1^2 - 4 a_0}}{2}. \end{align}More important for us is how the coefficients can be expressed in terms of the roots:
\begin{align} \label{eq:vieta-quadratic} a_0 = r_0 r_1, \quad a_1 = -\left(r_0 + r_1\right). \end{align}Note that while the solution \eqref{eq:roots-quadratic} for the quadratic equation contains a square roots, the coefficients \(a_0\) and \(a_1\) can be expressed as polynomials in terms of \(r_0\) and \(r_1\). This will still be true at higher degrees the corresponding generalizations of \eqref{eq:vieta-quadratic} are known as Vieta's formulas.
Now imagine that we swap the roots \(r_0\) and \(r_1\) by moving them along two paths \(\gamma_{01}(t)\) and \(\gamma_{10}(t)\) in the complex plane. The path \(\gamma_{01}\) is such that \(\gamma_{01}(0) = r_0\) and \(\gamma_{01}(1) = r_1\) and similarly for \(\gamma_{10}\). As the roots move along these paths, each of the coefficients \(a_0\) and \(a_1\) moves along an induced path that follows from their expression in equation \eqref{eq:vieta-quadratic}. In fact, because the polynomial \(P_2(x)\) is the same after the roots have swapped places, the coefficients move along a closed loop.
The following two plots represent complex planes. The roots and the path along which they move are shown on the left while the coefficients are shown on the right. Dragging the dots in one of the plots updates the data in the other. A click on the "swap" button swaps the roots along the indicated paths.
As the roots swap places, the coefficients \(a_0\) and \(a_1\) also move around but eventually return to their initial position. The same is true for any rational function \(F_0(a_0, a_1)\): An exchange of \(r_0\) and \(r_1\) induces the motion of \(F_0(a_0, a_1)\) along a closed loop. An example is the discriminant of the quadratic,
\begin{align} \Delta_2 = a_1^2 - 4 a_0 = (r_0 - r_1)^2, \end{align}shown as a red dot in the plot on the right hand side. It indeed moves along a closed loop inside the complex plane when we swap \(r_0\) and \(r_1\).
One immediate consequence is that a rational function \(F_0(a_0, a_1)\) can in general not be sufficient to express the roots \(r_0\) and \(r_1\), because it does not behave correctly when \(r_0\) and \(r_1\) are exchanged. In some sense, a rational function \(F_0\) is simply not complicated enough. In the quadratic case we of course know that the correct formula for the roots contains the square root of the discriminant \(\sqrt{\Delta_2}\) (see \eqref{eq:roots-quadratic}).
Since the loop of \(\Delta_2\) contains the origin in its interior, there is a point at which \(\Delta_2\) crosses the branch cut of the square root function along the negative real axis. Thus, while \(\Delta_2\) ends up where it started, \(\sqrt{\Delta_2}\) indeed changes sign as required to exchange the solutions \(r_0\) and \(r_1\).
Cubic
Consider now a cubic polynomial,
\begin{align} P_3(x) = x^3 + a_2 x^2 + a_1 x + a_0, \quad a_0, a_1, a_2 \in \mathbb{C}. \end{align}Over the complex numbers there are three complex roots \(r_0\), \(r_1\) and \(r_2\). In this case, Vieta's formulas formulas for the coefficients expressed in terms of the coefficients are
\begin{align} \label{eq:vieta-cubic} a_0 = -r_0 r_1 r_2, \quad a_1 = r_0 r_1 + r_1 r_2 + r_0 r_2, \quad a_2 = -(r_0 + r_1 + r_2). \end{align}The cubic equation \(P_3(x) = 0\) can be solved in terms of radicals, i.e. there is a formula that expresses the roots in terms of the coefficients (see e.g. here), but let's imagine we do not know that and instead see if it could even exist.
We first investigate what happens when we exchange two roots \(r_i\) and \(r_j\) by moving them along two paths in the complex plane while keeping the third root fixed. As in the quadratic case, Vieta's formulas \eqref{eq:vieta-cubic} express the coefficients as polynomials in the roots, so they follow closed loops in the complex plane during this exchange. Here is a visualization similar to the one for the quadratic:
Inspired by the quadratic case, one might look at the discriminant of the cubic,
\begin{align} \Delta_3 = a_1^2 a_2^2 - 4 a_0 a_2^3 - 4 a_1^3 + 18 a_0 a_1 a_2 - 27 a_0^2 = (r_0 - r_1)^2 (r_1 - r_2)^2 (r_2 - r_0)^2, \end{align}which again moves along a loop that encloses the origin when two roots are interchanged. Note that the loop depends on which of the two curves are exchanged.
Still remembering the quadratic case, one could naively hope that a function \(F_1(a_0, a_1, a_2)\) with one level of radicals of \(\Delta_3\), for example a cubic root \(\sqrt[3]{\Delta_3}\), might be enough to express the roots. As \(\Delta_3\) loops around the origin, \(F_1(a_0, a_1, a_2)\) moves along a curve in the complex plane, but not necessarily a closed loop. This is a good sign.
Here is a simplified example where \(\Delta_3\) (again in red) moves around the origin along the unit circle. The three three possible values for \(\sqrt[3]{\Delta_3}\) are shown as the points \(\xi_0\), \(\xi_1\) and \(\xi_2\) below. When \(\Delta_3\) completes the loop they are permuted. Note that the permutation (i.e. the value of \(\sqrt[3]{\Delta_3}\) depends on the path of \(\Delta_3\):
However, it turns out this guess is in fact too naive and the function \(F_1\) is not sufficient to express the roots. The reason is that we have so far only considered a swap of two roots, i.e. a transposition \((ij)\), but the roots can be permuted by any element in the symmetric group of three elements \(S_3\) which contains more than that. Let's consider for example the permutation \((123)\) (we are using the usual cycle notation for permutations). This cycle can be decomposed into a product of transpositions as
\begin{align} \label{eq:s3-123-commutator} (1 2 3) = (1 2) (2 3) (1 2)^{-1} (2 3)^{-1}. \end{align}To each transposition \((i j)\) corresponds a pair of paths \(\gamma_{ij}\) and \(\gamma_{ji}\) along which we swap \(r_i\) and \(r_j\), a loop \(\delta_{ij}\) along which the value of a rational function \(F_0(a_0, a_1, a_2)\) moves and finally a path \(\epsilon_{ij}\) along which \(F_1(a_0, a_1, a_2)\) moves. We find the full path taken by \(F_1(a_0, a_1, a_2)\) for the permutation \((1 2 3)\) by concatenating the paths corresponding to the swaps,
\begin{align} \epsilon_{123} = \epsilon_{12} \epsilon_{23} \epsilon_{12}^{-1} \epsilon_{23}^{-1}. \end{align}Here the path \(\epsilon_{ij}^{-1}\) is the same as \(\epsilon_{ij}\), but with its direction reversed. This however means that \(\epsilon_{123}\) is a closed loop!
We can now argue as in the quadratic case: The permutation \((1 2 3)\) permutes the roots, but if we write \((1 2 3)\) as in \eqref{eq:s3-123-commutator} we see that any function \(F_1(a_0, a_1, a_2)\) follows a closed loop. Such a function is therefore insufficient to express the roots in terms of the coefficients.
For the symmetric group on \(n\) elements the formula \eqref{eq:s3-123-commutator} holds more generally,
\begin{align} \label{eq:sn-transposition-commutator} (i j k) = (i j) (j k) (i j)^{-1} (j k)^{-1} \end{align}if \(i\), \(j\) and \(k\) are pairwise distinct. Permuting the roots according to such a permutation will induce a closed loop for any function \(F_1\) with one level of radicals. Note moreover that the right-hand side of equation \eqref{eq:sn-transposition-commutator} is simply the commutator \(\left[(i j), (j k)\right]\) of \((i j)\) and \((j k)\). The path in the complex plane corresponding to a commutator is always a closed loop. This will be important to discuss the solutions of quartic and quintic polynomials.
We should finally also convince ourselves that adding another level of radicals, i.e. a function \(F_2(a_0, a_1, a_2)\), cannot be ruled out by the same argument. First, the additional level of radicals means that \(F_2(a_0, a_1, a_2)\) moves along an open path even for a permutation \((i j k)\). Moreover, another commutator similar to the one in equation \eqref{eq:s3-123-commutator} gives nothing new, for example
\begin{align} \label{eq:s3-commutator-trivial} (1 2 3) (1 3 2) (1 2 3)^{-1} (1 3 2)^{-1} = (). \end{align}The movement of \(F_2(a_0, a_1, a_2)\) obtained by concatenating the corresponding paths gives a closed loop, but because the right hand side is simply the identity permutation this is not a problem.
Quartic
The next step on the way to the quintic is a polynomial of degree four,
\begin{align} P_4 = x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0, \quad a_0, a_1, a_2, a_3 \in \mathbb{C}. \end{align}There are now four roots that we can permute to say something about the level of radical nesting required to express the roots in terms of the coefficients.
Vieta's formulas again allow us to express the coefficients \(a_0, \ldots, a_3\) as symmetric polynomials in the roots \(r_0, \ldots, r_3\). As before, a swap \((i j)\) of two roots \(r_i\) and \(r_j\) along some path causes the value of a rational function \(F_0(a_0, \ldots, a_3)\) of the coefficients to move along a closed loop. Such a function is therefore clearly not sufficient to express the roots. For a permutation \((i j k)\) of three roots, we can use equation \eqref{eq:sn-transposition-commutator} and conclude as in the cubic case that also the value of a function \(F_1(a_1, \ldots, a_3)\) with one level of radicals moves along a closed loop and is therefore not a good candidate.
The next level involves the type of function needed to solve the cubic, i.e. a function \(F_2(a_1, \ldots, a_3)\) with two levels of nested radicals. A permutation \((i j k)\) of three roots generally causes the value of such a function to move along an open path \(\eta_{ijk}\) in the complex plane. There is a formula similar to \eqref{eq:sn-transposition-commutator} for the commutator of cycles of length three,
\begin{align} \label{eq:s4-commutator-nontrivial} (i j) (k \ell) = (i k \ell) (k \ell j) (i k \ell)^{-1} (k \ell j)^{-1}, \end{align}valid for \(i, j, k, \ell\) pairwise distinct. The left hand side of this equation is a non-trivial permutation in the symmetric group \(S_4\), but from the right hand side we can conclude that the value of \(F_2(a_0, \ldots, a_3)\) moves a long the curve \(\eta_{i k \ell} \eta_{k \ell j} \eta_{i k \ell}^{-1} \eta_{k \ell j}^{-1}\). This is again a closed loop, so the function \(F_2\) can also not be sufficient to express the roots of the quartic in terms of the coefficients.
As before, we optimistically add another level of radicals and end up with a function \(F_3(a_0, \ldots, a_3)\) as a candidate for the roots of the quartic. The value of this function now moves along an open path for a permutation of the form \((i j) (k \ell)\). And very similar to the cubic case it turns out that
\begin{align} \label{eq:s4-1324-commutator} \left[(1 3) (2 4)\right] \left[(1 4) (2 3)\right] \left[(1 3) (2 4)\right]^{-1} \left[(1 4) (2 3)\right]^{-1} = () \end{align}and similar for other commutators of permutations of the form \((i j) (k \ell)\), which means that three levels of nested radicals are indeed enough. The actual formula for the roots is very long and can for example be found here.
Quintic
Finally, there is the quintic polynomial,
\begin{align} P_5 = x^5 + a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0, \quad a_0, a_1, a_2, a_3, a_4 \in \mathbb{C}, \end{align}with five complex roots. For the previous degrees we could always consider commutators \(\sigma \tau \sigma^{-1} \tau^{-1}\) of permutations \(\sigma\) and \(\tau\). The path in the complex plane prescribed by a function \(F_k\) corresponding to this combination is a closed loop and if the commutator was not the identity \(()\) this allowed us to argue that \(F_k\) was insufficient and that another level of radicals was needed. We then checked that another commutator only produced the identity permutation and concluded that this one extra level was indeed enough. This does not work for the quintic as we will see now.
We can begin as in the quartic case and see that we need a least three levels of nested radicals, i.e. a function \(F_3(a_0, \ldots, a_4)\), to correctly capture the behaviour of the roots under all possible permutations. The permutations that gave rise to the last level of nesting were of the form \((i j) (k \ell)\), but unlike in the quartic case (c.f. equation \eqref{eq:s4-1324-commutator}) we can now produce a non-trivial commutator from them as follows:
\begin{align} \label{eq:s5-perm1} \left[(i k) (\ell m)\right] \left[(i j) (\ell m)\right] \left[(i k) (\ell m)\right]^{-1} \left[(i j) (\ell m)\right]^{-1} = (i j k). \end{align}This formula holds if \(i\), \(j\), \(k\), \(\ell\) and \(m\) are pairwise distinct and therefore did not exist for lower degrees. Note that the right hand side is not the identity permutation, but the left hand side tell us that the value of the function \(F_3(a_0, \ldots, a_4)\) follows a closed loop. So it seems that we need a function \(F_4(a_0, \ldots, a_4)\) with another level of radicals.
Imagine we have found such a function \(F_4(a_0, \ldots, a_4)\). Proceeding in the usual fashion we consider
\begin{align} \label{eq:s5-perm2} (i j k) (k \ell m) (i j k)^{-1} (k \ell m)^{-1} = (j k m) \end{align}which again holds for pairwise distinct \(i\), \(j\), \(k\), \(\ell\) and \(m\). This formula is a problem. According to the left hand side, the value of \(F_4(a_0, \ldots, a_4)\) follows a closed loop, but the right-hand side is a permutation for which \(F_4(a_0, \ldots, a_4)\) was designed to follow an open path!
Moreover, this problem can never be fixed by adding one or more new levels of nested radicals. For any candidate function \(F_n(a_0, \ldots, a_4)\) with \(n\) levels of nested radicals, we can look at the permutation \((j k m)\). If \(F_n(a_0, \ldots, a_4)\) is a good candidate, it should follow an open path in the complex plane when the roots are permuted. But we can then use the left-hand side of equation \eqref{eq:s5-perm2} to write \((j k m)\) in terms of \((i j k)\) and \((k \ell m)\) which shows that it must follow a closed path. This works for any level \(n\) of nesting and we conclude that for polynomials of degree five and higher no formula \(F_n(a_0, \ldots, a_4)\) for the roots can exist.
Relation to the derived series
The crucial step in the discussion above that forced us to introduce another level of nested radicals was to consider the commutator
\begin{align} \left[\sigma, \tau\right] = \sigma \tau \sigma^{-1} \tau^{-1} \end{align}of permutations \(\sigma\) and \(\tau\). The subgroup of a group \(G\) formed by the commutators of all its elements is called the commutator subgroup \(\left[G, G\right]\).
Starting with a group \(G\) we can continue to take commutators and form a series of subgroups \(G_0, G_1, G_2, \cdots\) where
\begin{align} G_0 = G, \quad G_i = \left[G_{i-1}, G_{i-1}\right]. \end{align}As we take commutators the groups become in a way simpler and for finite groups the series eventually stops. The series of subgroups is called the derived series of \(G\). Each \(G_i\) is a normal subgroup of \(G_{i-1}\), so the quotient \(G_{i-1} / G_{i}\) makes sense and is abelian.
In the cubic and the quartic cases we observed above that after two and three levels of commutators respectively, we always obtained the identity permutation. In terms of the derived series this means that the series terminate with the trivial subgroup.
We can check this with sage
as follows:
SymmetricGroup(3).derived_series()
[Subgroup generated by [(1,2), (1,2,3)] of (Symmetric group of order 3! as a permutation group), Subgroup generated by [(1,2,3)] of (Symmetric group of order 3! as a permutation group), Subgroup generated by [()] of (Symmetric group of order 3! as a permutation group)]
The first group is the entire group \(S_3\). The second one contains all possible commutators which are \(()\), \((1 2 3)\) and \((1 3 2)\) (see equation \eqref{eq:sn-transposition-commutator}). Finally the last group only contains the identity which is what we saw in equation \eqref{eq:s3-commutator-trivial}. Note that the three subgroups precisely correspond to the three levels of roots in the formula \(F_3(a_0, a_1, a_2)\) for the roots of the cubic.
For the quartic we see that we reach the trivial subgroup after four steps:
SymmetricGroup(4).derived_series()
[Subgroup generated by [(1,2), (1,2,3,4)] of (Symmetric group of order 4! as a permutation group), Subgroup generated by [(2,3,4), (1,2,3)] of (Symmetric group of order 4! as a permutation group), Subgroup generated by [(1,2)(3,4), (1,4)(2,3)] of (Symmetric group of order 4! as a permutation group), Subgroup generated by [()] of (Symmetric group of order 4! as a permutation group)]
The third group is the one generated by the commutators in equation \eqref{eq:s4-commutator-nontrivial}. Their commutators always give the identity (c.f. equation \eqref{eq:s4-1324-commutator}.
Finally, the derived series for \(S_5\), the permutation group of the roots of the quintic, does not terminate with the trivial group:
SymmetricGroup(5).derived_series()
[Subgroup generated by [(1,2), (1,2,3,4,5)] of (Symmetric group of order 5! as a permutation group), Subgroup generated by [(3,4,5), (1,2,3,4,5)] of (Symmetric group of order 5! as a permutation group)]
The relation \eqref{eq:s5-perm2} is a manifestation of the fact that the second of these two groups is equal to its commutator subgroup.
A group \(G\) whose derived series ends in the trivial subgroup is called solvable precisely because if \(G\) is the Galois group of some polynomial \(P(x)\) then the equation \(P(x) = 0\) can be solved in terms of radicals. The permutations groups \(S_n\) for \(n \geq 5\) are in general not solvable.
References
- Bruce Bartlett: The Quintic, the Icosahedron, and Elliptic Curves
- Paul Ramond: Abel-Ruffini's Theorem: Complex but Not Complicated!