Condition that a Matrix is Similar to the Companion Matrix of its Characteristic Polynomial


Problem 348

Let $A$ be an $n\times n$ complex matrix.
Let $p(x)=\det(xI-A)$ be the characteristic polynomial of $A$ and write it as
\[p(x)=x^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0,\] where the $a_i$ are complex numbers.

Let $C$ be the companion matrix of the polynomial $p(x)$ given by
\[C=\begin{bmatrix}
0 & 0 & \dots & 0 &-a_0 \\
1 & 0 & \dots & 0 & -a_1 \\
0 & 1 & \dots & 0 & -a_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \dots & 1 & -a_{n-1}
\end{bmatrix}=
[\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}],\] where $\mathbf{e}_i$ is the unit vector in $\C^n$ whose $i$-th entry is $1$ and zero elsewhere, and the vector $\mathbf{a}$ is defined by
\[\mathbf{a}=\begin{bmatrix}
a_0 \\
a_1 \\
\vdots \\
a_{n-1}
\end{bmatrix}.\]

Then prove that the following two statements are equivalent.

  1. There exists a vector $\mathbf{v}\in \C^n$ such that
    \[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$.
  2. There exists an invertible matrix $S$ such that $S^{-1}AS=C$.
    (Namely, $A$ is similar to the companion matrix of its characteristic polynomial.)

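Before turning to the proof, here is a minimal NumPy sketch (not part of the problem) that builds the companion matrix $C$ for a sample cubic and confirms that its characteristic polynomial recovers $p(x)$. The polynomial and all variable names are illustrative choices.

```python
import numpy as np

# Illustrative choice: p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3),
# so [a_0, a_1, a_2] = [-6, 11, -6] in the notation of the problem.
a = np.array([-6.0, 11.0, -6.0])
n = len(a)

# C = [e_2, e_3, ..., e_n, -a]: ones on the subdiagonal, last column -a.
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -a

# np.poly(C) returns the characteristic polynomial coefficients of C,
# leading coefficient first: expect [1, -6, 11, -6], i.e. p(x) itself.
print(np.poly(C))
```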
 


Proof.

Before proving the equivalence of the statements, let us first note a simple but useful fact.
Let $B$ be an $n\times n$ matrix and let $\mathbf{e}_i$ be the unit vector in $\C^n$ defined in the problem. Then the matrix product $B\mathbf{e}_i$ is the $i$-th column vector of $B$.
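This is immediate from the definition of matrix multiplication; a quick NumPy check with an arbitrary sample matrix makes it concrete:

```python
import numpy as np

B = np.arange(9.0).reshape(3, 3)    # any 3x3 matrix
e2 = np.array([0.0, 1.0, 0.0])      # the unit vector e_2
print(B @ e2)                       # [1. 4. 7.]
print(B[:, 1])                      # the 2nd column of B: the same vector
```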

$( 1 \implies 2)$

Suppose that statement 1 is true. That is, there exists a vector $\mathbf{v} \in \C^n$ such that
\[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$. In particular, these vectors are linearly independent.

Form the matrix
\[S:=[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}].\] Then the matrix $S$ is invertible since its column vectors are linearly independent.

We prove that $AS=SC$.
We have
\begin{align*}
AS&=A[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}]\\
&=[A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n}\mathbf{v}].
\end{align*}

On the other hand, we have
\begin{align*}
SC&=S [\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}]\\
&=[S\mathbf{e}_2, S\mathbf{e}_3, \dots, S\mathbf{e}_n, -S\mathbf{a}].
\end{align*}

As noted at the beginning, the vector $S\mathbf{e}_2$ is the second column vector of $S$, which is $A\mathbf{v}$. Similarly, $S\mathbf{e}_i$ is the $i$-th column vector of $S$, and thus we have
\[S\mathbf{e}_i=A^{i-1}\mathbf{v}. \tag{*}\] Hence we have
\begin{align*}
SC&=[A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}, -S\mathbf{a}].
\end{align*}

Thus, it remains to prove that $-S\mathbf{a}=A^n\mathbf{v}$.
Write
\[\mathbf{a}=\begin{bmatrix}
a_0 \\
a_1 \\
\vdots \\
a_{n-1}
\end{bmatrix}=a_0\mathbf{e}_1+a_1\mathbf{e}_2+\cdots+a_{n-1}\mathbf{e}_n.\] Then we have
\begin{align*}
S\mathbf{a}&=S(a_0\mathbf{e}_1+a_1\mathbf{e}_2+\cdots+a_{n-1}\mathbf{e}_n)\\
&=a_0S\mathbf{e}_1+a_1S\mathbf{e}_2+\cdots+a_{n-1}S\mathbf{e}_n\\
&\stackrel{(*)}{=} a_0\mathbf{v}+a_1A\mathbf{v}+\cdots+ a_{n-1}A^{n-1}\mathbf{v}\\
&=(a_0I+a_1 A+\cdots +a_{n-1}A^{n-1})\mathbf{v}\\
&=-A^n\mathbf{v},
\end{align*}
where the last step follows from the Cayley-Hamilton theorem:
\[O=p(A)=A^n+a_{n-1}A^{n-1}+\cdots+a_1A+a_0I.\]

This proves that $-S\mathbf{a}=A^n\mathbf{v}$ and hence $AS=SC$.
Since $S$ is invertible, we obtain
\[S^{-1}AS=C.\]
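The construction in this direction is easy to test numerically. The following sketch (assuming, as holds for a generic random matrix, that a random $\mathbf{v}$ yields a basis) forms $S=[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}]$ and checks both the Cayley-Hamilton identity and $S^{-1}AS=C$; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))     # a generic A; a random v below is
v = rng.standard_normal(n)          # then cyclic with probability 1

# S = [v, Av, ..., A^{n-1} v]
S = np.column_stack([np.linalg.matrix_power(A, k) @ v for k in range(n)])
assert np.linalg.matrix_rank(S) == n   # the Krylov vectors form a basis

# np.poly(A) = [1, a_{n-1}, ..., a_0]; Cayley-Hamilton says p(A) = O.
coeffs = np.poly(A)
pA = sum(c * np.linalg.matrix_power(A, n - i) for i, c in enumerate(coeffs))
print(np.allclose(pA, np.zeros((n, n))))   # True

# Companion matrix C = [e_2, ..., e_n, -a] with a = [a_0, ..., a_{n-1}].
a = coeffs[1:][::-1]
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -a

# S^{-1} A S should equal C up to floating-point error.
print(np.allclose(np.linalg.solve(S, A @ S), C))   # True
```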

 

$( 2 \implies 1)$

Next suppose that statement 2 is true. That is, there exists an invertible matrix $S$ such that $S^{-1}AS=C$. Then we have $AS=SC$.
Define $\mathbf{v}:=S\mathbf{e}_1$ to be the first column vector of $S$.

We prove that with this choice of $\mathbf{v}$, the vectors
\[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$.

We have
\begin{align*}
AS&=A[S\mathbf{e}_1, S\mathbf{e}_2, \dots, S\mathbf{e}_n]\\
&=[AS\mathbf{e}_1, AS\mathbf{e}_2, \dots, AS\mathbf{e}_n] \end{align*}
and
\begin{align*}
SC&=S [\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}]\\
&=[S \mathbf{e}_2, S\mathbf{e}_3, \dots, S\mathbf{e}_n, -S\mathbf{a}].
\end{align*}
Since $AS=SC$, comparing column vectors, we obtain
\begin{align*}
AS\mathbf{e}_1&=S\mathbf{e}_2\\
AS\mathbf{e}_2&=S\mathbf{e}_3\\
\vdots\\
AS\mathbf{e}_{n-1}&=S\mathbf{e}_n\\
AS\mathbf{e}_n&=-S\mathbf{a}.
\end{align*}

It follows inductively that
\begin{align*}
A\mathbf{v}&=S\mathbf{e}_2\\
A^2\mathbf{v}&=A(A\mathbf{v})=AS\mathbf{e}_2=S\mathbf{e}_3\\
\vdots\\
A^{n-1}\mathbf{v}&=A(A^{n-2}\mathbf{v})=AS\mathbf{e}_{n-1}=S\mathbf{e}_n.
\end{align*}
Namely we obtain
\[A^i\mathbf{v}=S\mathbf{e}_{i+1}\] for $i=0, 1, \dots, n-1$.
Now, suppose that we have a linear combination
\[c_1\mathbf{v}+c_2A\mathbf{v}+ \cdots + c_n A^{n-1}\mathbf{v}=\mathbf{0}\] for some scalars $c_1, \dots, c_n \in \C$.

Then, using the relations above, we have
\begin{align*}
c_1S\mathbf{e}_1+c_2S\mathbf{e}_2+\cdots +c_nS\mathbf{e}_n=\mathbf{0}.
\end{align*}
Since $S$ is invertible, the column vectors $S\mathbf{e}_i$ are linearly independent.
Thus, we must have $c_1=c_2=\cdots=c_n=0$.

Therefore the vectors
\[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] are linearly independent, and they form a basis of $\C^n$ since we have $n$ linearly independent vectors in the $n$-dimensional vector space $\C^n$.
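This direction can also be illustrated numerically. One way to produce an $S$ with $AS=SC$ without building it from Krylov vectors is to solve $AS=SC$ as a linear system via the Kronecker-product identity $\operatorname{vec}(AS-SC)=(I\otimes A - C^{T}\otimes I)\operatorname{vec}(S)$. The sketch below (assuming the nullspace vector returned by the SVD happens to be invertible, which holds generically) then checks that $\mathbf{v}=S\mathbf{e}_1$ generates a basis; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))

# Companion matrix of the characteristic polynomial of A.
coeffs = np.poly(A)                  # [1, a_{n-1}, ..., a_0]
a = coeffs[1:][::-1]                 # [a_0, ..., a_{n-1}]
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -a

# Solve AS = SC: with column-major vec, vec(AS - SC) =
# (I kron A - C^T kron I) vec(S), so vec(S) lies in the nullspace of M.
M = np.kron(np.eye(n), A) - np.kron(C.T, np.eye(n))
_, _, Vt = np.linalg.svd(M)
S = Vt[-1].reshape(n, n, order="F")  # one nonzero solution of AS = SC
print(np.allclose(A @ S, S @ C))     # True

# v := S e_1; its Krylov vectors form a basis whenever S is invertible.
v = S[:, 0]
K = np.column_stack([np.linalg.matrix_power(A, k) @ v for k in range(n)])
print(np.linalg.matrix_rank(K) == n) # True (generically)
```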

Related Question.

For a basic question about the companion matrix of a polynomial, check out the post “Companion matrix for a polynomial”.

