next up previous
Next: About this document ...

Notes on Circulant Matrices
and
Roots of Small Degree Polynomials


(Prepared by Rob Rostermundt)


Around the year A.D. 825 the Arabian mathematician Mohammed ibn M$ \hat{u}$s$ \hat{a}$ al-Khow$ \hat{a}$rizm$ \hat{i}$ illustrated the solution for the roots of an arbitrary second degree polynomial. Today, we know this formula as the quadratic equation which says that the roots of $ ax^2+bx+c=0$ can be found as

$\displaystyle x=\displaystyle \frac{-b\pm\sqrt{b^2-4ac}}{2a}$

This is an example of what is called a solution ``by radicals.'' In other words, the formula for the roots of a polynomial can be found using an equation (using addition, subtraction, mutliplication, division, and extraction of roots) involving the coefficients of the polynomial. There are of course two other famous formulas for the roots of polynomials. A solution by radicals to the cubic and quartic were published by Cardano in 1545, attributing the former to Tartaglia and the latter to Ferrari. Then in 1824, the Norwegian mathematician Niels Abel demonstrated that no solution by radicals exists for any quintic or higher degree polynomial.

Unfortunately, the formulas for the cubic and quartic are extremely complicated and even more difficult to remember. In these notes we show an alternate method, which connects matrices, their eigenvalues, and roots of unity, giving a unified approach to finding roots of qaurtic or smaller degree polynomials. We start with a definition.

Definition 1   An $ n\times n$ circulant matrix is formed from any $ n$-dimensional row-vector by cyclically permuting the entries. For example, starting with the vector $ (a\;b\;c)$ we can form the $ 3\times 3$ circulant matrix

$\displaystyle \left(\begin{array}{ccc}
a&b&c\\
c&a&b\\
b&c&a\\
\end{array}\right)$

The most basic circulant matrix is known as a circulant generator matrix, which is a circulant matrix having first row $ (0\;1\;0\;\cdots 0)$. For example, the $ 4\times 4$ circulant generator is the matrix

$\displaystyle W=\left(\begin{array}{cccc}
0&1&0&0\\
0&0&1&0\\
0&0&0&1\\
1&0&0&0\\
\end{array}\right)$

Then to generate the $ 4\times 4$ circulant matrix with first row $ (a\;b\;c\;d)$ we evaluate the function $ q(t)=a+bt+ct^2+dt^3$ at $ W$. Notice that the polynomial is cubic and has degree one less than the size of the matrix. For example, suppose we want to generate the $ 4\times 4$ circulant matrix with first row $ (1\;2\;\,0\;-1)$. We evaluate $ q(t)=1+2t-t^3$ at $ W$ as

$\displaystyle q(W)=I+2W-W^3$ $\displaystyle =$ $\displaystyle \left(\begin{array}{cccc}
1&0&0&0\\
0&1&0&0\\
0&0&1&0\\
0&0&0&...
...gin{array}{cccc}
0&1&0&0\\
0&0&1&0\\
0&0&0&1\\
1&0&0&0\\
\end{array}\right)$  
  $\displaystyle =$ $\displaystyle \left(\begin{array}{cccc}
1&0&0&0\\
0&1&0&0\\
0&0&1&0\\
0&0&0&...
...gin{array}{cccc}
0&0&0&1\\
1&0&0&0\\
0&1&0&0\\
0&0&1&0\\
\end{array}\right)$  
  $\displaystyle =$ $\displaystyle \left(\begin{array}{cccc}
1&2&0&-1\\
-1&1&2&0\\
0&-1&1&2\\
2&0&-1&1\\
\end{array}\right)$  

It is these matrices and their eigenvalues that we will use to find the roots of small degree polynomials. You might ask why we want to generate the desired circulant matrix by evaluating the appropriate function $ q(t)$ at $ W$, as this seems like more work than necessary. But this perpective will help us when computing the eigenvalues of a given circulant matrix. One first needs to realize that if for some matrix $ A$ and vector $ \v $ we have $ A\v =\lambda\v $, then it is also true that for any polynomial $ q(t)$ we have $ q(A)\v =q(\lambda)\v $. In other words, if $ \lambda$ is an eigenvalue of the matrix $ A$, then $ q(\lambda)$ is an eigenvalue for the matrix $ q(A)$. Lets convince ourselves with an example. Choose the matrix

$\displaystyle A=\left(\begin{array}{cc}
1&-2\\
0&3\\
\end{array}\right)$

Obviously, the eigenvalues of $ A$ are $ \lambda=1,3$. Now suppose we have $ q(t)=2-t+t^2$ and evaluate
$\displaystyle q(A)=2I-A+A^2$ $\displaystyle =$ $\displaystyle 2\left(\begin{array}{cc}
1&0\\
0&1\\
\end{array}\right)-\left(\...
...
\end{array}\right)+\left(\begin{array}{cc}
1&-2\\
0&3\\
\end{array}\right)^2$  
  $\displaystyle =$ $\displaystyle \left(\begin{array}{cc}
2&0\\
0&2\\
\end{array}\right)-\left(\b...
...\
\end{array}\right)+\left(\begin{array}{cc}
1&-8\\
0&9\\
\end{array}\right)$  
  $\displaystyle =$ $\displaystyle \left(\begin{array}{cc}
2&-6\\
0&8\\
\end{array}\right)$  

The claim is that the eigenvalues of $ q(A)$ are just $ q(\lambda)$ for each eigenvalue of $ A$. Clearly the eigenvalues of $ q(A)$ are $ \lambda_{\;q(A)}=2,8$. Then taking the eigenvalues of $ A$, for $ \lambda=1$ we get $ q(1)=2-1+1=2$, and for $ \lambda=3$ we get $ q(3)=2-3+9=8$.

How does this help us with circulant matrices? If we know the eigenvalues of the circulant generator $ W$, to find the eigenvalues of the circulant matrix $ q(W)$ we simply evaluate the polynomial $ q(t)$ at the eignvalues of $ W$. That is, suppose that $ A$ is the $ 3\times 3$ circulant matrix with first row $ (a\;b\;c)$. That means that $ A=q(W)$, where $ q(t)=a+bt+t^2$. So if $ \lambda$ is an eigenvalue of the circulant generator $ W$, then $ q(\lambda)$ is an eigenvalue of the circulant matrix $ q(W)$. Without computation, we state the eigenvalues of the $ 2\times 2$, $ 3\times 3$, and $ 4\times 4$ circulant generators.


$\displaystyle \left(\begin{array}{cc}
0&1\\
1&0\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=\pm1$  
$\displaystyle \left(\begin{array}{ccc}
0&1&0\\
0&0&1\\
1&0&0\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=1,\displaystyle \frac{-1\pm i\sqrt{3}}{2}$  
$\displaystyle \left(\begin{array}{cccc}
0&1&0&0\\
0&0&1&0\\
0&0&0&1\\
1&0&0&0\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=\pm1,\pm i$  

We can now easily compute the eigenvalues of the following circulant matrix.

$\displaystyle A=\left(\begin{array}{cccc}
1&3&-2&-1\\
-1&1&3&-2\\
-2&-1&1&3\\
3&-2&-1&1\\
\end{array}\right)$

Looking at the first row we see that $ A=q(W)$, where $ q(t)=1+3t-2t^2-t^3$. Since we already know the eigenvalues of $ W$ the eigenvalues of $ A$ are found as $ q(\lambda)$, for each eigenvalue of $ W$.


$\displaystyle q(1)$ $\displaystyle =$ $\displaystyle 1+3-2-1=1$  
$\displaystyle q(-1)$ $\displaystyle =$ $\displaystyle 1-3-2+1=-3$  
$\displaystyle q(i)$ $\displaystyle =$ $\displaystyle 1+3i-2i^2-i^3=3+4i$  
$\displaystyle q(-i)$ $\displaystyle =$ $\displaystyle 1+3(-i)-2(-i)^2-(i)^3=3-4i$  

We can now give a formula for the eigenvalues of an arbitrary circulant matrix.


$\displaystyle \left(\begin{array}{cc}
a&b\\
b&a\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=q(\pm1)=a\pm b$  
$\displaystyle \left(\begin{array}{ccc}
a&b&c\\
c&a&b\\
b&c&a\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=q(1,\pm\omega)=a+b+c,\;
a+b\omega+c\omega^2$  
    where$\displaystyle \;\omega=\displaystyle \frac{-1\pm i\sqrt{3}}{2}$  
$\displaystyle \left(\begin{array}{cccc}
a&b&c&d\\
d&a&b&c\\
c&d&a&b\\
b&c&d&a\\
\end{array}\right)$ $\displaystyle \Longrightarrow$ $\displaystyle \lambda=q(\pm1,\pm i)=
a\pm b+c\pm d,\;a-c\pm(b-d)i$  

We are now ready to find the roots of quadratic, cubic, and quartic polynomials. We start with an easy example. Suppose we have $ f(x)=3+2x^2$. We first divide by through the coefficient on $ x^2$ to get the monic polynomial $ f^*(x)=x^2+3/2$ (which has the same roots as $ f(x)$). To find its roots we simply must find a $ 2\times 2$ circulant matrix $ A$ with coefficients $ a,b$ such that the characteristic polynomial of $ A$ equals the polynomial $ f(x)$. Remember the eigenvalues of a matrix $ A$ are the roots of its characteristic polynomial. Then, for the appropriate choices $ a,b$ we evaluate $ q(t)=a+bt$ at $ t=\pm1$ and these will be the roots of $ f^*(x)$, which are also the roots of the original $ f(x)$.

The characteristic polynomial of the circulant $ 2\times 2$ is


$\displaystyle det(A-xI)$ $\displaystyle =$ $\displaystyle det\left(\begin{array}{cc}
a-x&b\\
b&a-x\\
\end{array}\right)$  
  $\displaystyle =$ $\displaystyle (a-x)^2-b^2$  
  $\displaystyle =$ $\displaystyle x^2-2ax+a^2-b^2$  

We must have $ 2a=0$ and $ a^2-b^2=3/2$. So $ a=0$ and $ b=i=\sqrt{3/2}$. It follows that the correct circulant matrix is

$\displaystyle q(W)=\left(\begin{array}{cc}
0&i\sqrt{3/2}\\
i\sqrt{3/2}&0\\
\end{array}\right)$

So, the polynomial is $ q(t)=0+i\sqrt{3/2}t$, and evaluated at $ t=\pm1$ we get the eigenvalues of $ q(W)$ as $ \lambda=\pm i\sqrt{3/2}$. Or, using the computations from the previous page, the roots of $ f(x)$ are $ a\pm
b=\pm i\sqrt{3/2}$. We can check this with the quadratic formula.

$\displaystyle x=\displaystyle \frac{-b\pm\sqrt{b^2-4ac}}{2}=\displaystyle \frac{0\pm\sqrt{0-6}}{2}=
\pm i\sqrt{3/2}$

Since we already have the formula for the quadratic memorized, we move onto the cubic. We start (for reasons to be described later) with the following general ``reduced'' cubic

$\displaystyle x^3-\beta x+\gamma=0$

Before we work on the cubic we state the following useful theorem.

Theorem 1   Let $ A$ be an $ n\times$ matrix with (not necesarrily distinct) eigenvalues $ \lambda_1,\lambda_2,\dots,\lambda_p$. Then the sum of the diagonal entries of $ A$, called the trace of $ A$, equals the sum of the eigenvalues. That is,

$\displaystyle tr(A)=\displaystyle \sum^{p}_{i=1}\lambda_i$

Next, recall that for any monic polynomial of degree $ n$, the sum of the roots equals the coefficient on the $ n-1$ term. So, if $ f(x)=x^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0$, then for each root $ \alpha_i$ of $ f(x)$ we have

$\displaystyle \displaystyle \sum^{n}_{i=1}\alpha_i=a_{n-1}$

Notice that the reduced cubic does not have a squared term. That is, the sum of the roots of the cubic $ f(x)=x^3+\beta x+\gamma$ equals zero. But, the diagonal entries of the appropriate circulant matrix will sum to zero, and since all its diagonal entries are the same we must have a circulant matrix with the form

$\displaystyle q(W)=\left(\begin{array}{ccc}
0&b&c\\
c&0&b\\
b&c&0\\
\end{array}\right)$

If we can determine the values $ b,c$ such that the characteristic polynomial equals $ f(x)=x^3+\beta x+\gamma$, the the eigenvalues of $ q(W)$, which are the roots of $ f(x)$, are found by evaluating $ q(t)=bt+ct^2$ at the cubic roots of unity.

The characteristic polynomial of $ q(W)$ is $ p(x)=x^3-3bcx-(b^3+c^3)$. So we need to find $ b,c$ such that


$\displaystyle b^3+c^3$ $\displaystyle =$ $\displaystyle -\gamma$  
$\displaystyle b^3c^3$ $\displaystyle =$ $\displaystyle -\displaystyle \frac{\beta^3}{27}$  

To solve this system (which is not linear) we observe that $ b^3$ and $ c^3$ are both roots of $ y^2+\gamma y-\beta^3/27=0$. It follows, using the quadratic equation, that

$\displaystyle b^3,c^3=\displaystyle \frac{-\gamma\pm\sqrt{\gamma^2+4\beta^3/27}}{2}$

At this point we must be careful. If $ \gamma^2+4\beta^3/27\ge0$ then we let


$\displaystyle b$ $\displaystyle =$ $\displaystyle \left[\displaystyle \frac{-\gamma+\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}$  
$\displaystyle c$ $\displaystyle =$ $\displaystyle \left[\displaystyle \frac{-\gamma-\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}$  

However, if these values are complex we need to remember that roots of complex numbers are not necessarily unique. So, for any of the possible cube roots, we let

$\displaystyle b=\left[\displaystyle \frac{-\gamma+\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}$

and then we let $ c=-\beta/(3b)$. To find the roots of the original polynomial $ f(x)=x^3+\beta x+\gamma$ we evaluate

$\displaystyle q(t)=\left(\left[
\displaystyle \frac{-\gamma+\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}\right)t
-\beta t^2/(3b)$

at the cube roots of unity.

To complete the computation we get, when $ \omega=\displaystyle \frac{-1\pm i\sqrt{3/2}}{2}$, the three roots of $ f(x)=x^3+\beta x+\gamma$:


$\displaystyle q(1)$ $\displaystyle =$ $\displaystyle \left[\displaystyle \frac{-\gamma+\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}
-\beta/(3b)$  
$\displaystyle q(\omega)$ $\displaystyle =$ $\displaystyle \left(\left[
\displaystyle \frac{-\gamma+\sqrt{\gamma^2+4\beta^3/27}}{2}\right]^{1/3}\right)\omega
-\beta\omega^2/(3b)$  

Although the solution is not in a ``nice'' form, the road to the solution relies only on basic theory of linear algebra, and especially on the ease of computing the eigenvalues of circulant matrices. We now extend this to the general cubic $ f(x)=x^3+\alpha x^2+\beta
x+\gamma$.

We can ``reduce'' the general cubic by making the substitution $ y=x-\alpha/3$. This substitution eliminates the quadratic term. Then after solving this reduced cubic, we easily find the roots of the original cubic by back-substitution.

Although we do not go into the details here, this same approach can be extended to the general quartic $ f(x)=x^4+\zeta x^3+\alpha x^2+\beta
x+\gamma$. First, make the substitution, $ y=x-\zeta/4$ to eliminate the cubic term and see that the circulant matrix will have the form

$\displaystyle q(W)=\left(\begin{array}{cccc}
0&b&c&d\\
b&0&b&c\\
c&d&0&b\\
b&c&d&0\\
\end{array}\right)$

The steps from here involve solving a cubic, an approach we have already covered.

Unfortunately, for higher degree polynomials, this approach using circulant matrices fails in general. This was proven, somewhat indirectly, by Abel and Galois in the 1800's. Most of the techniques for solving for roots of higher degree polynomials rely on approximation techniques such as Newton's method, although roots of certain classes of polynomials can be found relatively easily. However, what we have seen here is that the theory of circulant matrices gives a fascinating connection between matrix theory and polynomials.




next up previous
Next: About this document ...
Robert Rostermundt 2005-01-17