Condition that a Matrix is Similar to the Companion Matrix of its Characteristic Polynomial


Problem 348

Let $A$ be an $n\times n$ complex matrix.
Let $p(x)=\det(xI-A)$ be the characteristic polynomial of $A$ and write it as
\[p(x)=x^n+a_{n-1}x^{n-1}+\cdots+a_1x+a_0,\] where the $a_i$ are complex numbers.

Let $C$ be the companion matrix of the polynomial $p(x)$ given by
\[C=\begin{bmatrix}
0 & 0 & \dots & 0 &-a_0 \\
1 & 0 & \dots & 0 & -a_1 \\
0 & 1 & \dots & 0 & -a_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \dots & 1 & -a_{n-1}
\end{bmatrix}=
[\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}],\] where $\mathbf{e}_i$ is the unit vector in $\C^n$ whose $i$-th entry is $1$ and zero elsewhere, and the vector $\mathbf{a}$ is defined by
\[\mathbf{a}=\begin{bmatrix}
a_0 \\
a_1 \\
\vdots \\
a_{n-1}
\end{bmatrix}.\]

Then prove that the following two statements are equivalent.

  1. There exists a vector $\mathbf{v}\in \C^n$ such that
    \[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$.
  2. There exists an invertible matrix $S$ such that $S^{-1}AS=C$.
    (Namely, $A$ is similar to the companion matrix of its characteristic polynomial.)
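
As a concrete illustration of the setup, here is a minimal numerical sketch (assuming NumPy; the size $n$ and the matrix `A` are arbitrary choices) that builds the companion matrix $C=[\mathbf{e}_2, \dots, \mathbf{e}_n, -\mathbf{a}]$ from the coefficients of $p(x)$ and checks that $A$ and $C$ have the same characteristic polynomial:

```python
import numpy as np

n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))  # an arbitrary example matrix

# np.poly(A) returns the characteristic polynomial coefficients
# [1, a_{n-1}, ..., a_1, a_0]; drop the leading 1 and reverse so that
# coeffs[i] = a_i, matching the vector a in the problem.
coeffs = np.poly(A)[1:][::-1]

# Companion matrix C = [e_2, e_3, ..., e_n, -a].
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)  # subdiagonal of ones
C[:, -1] = -coeffs          # last column is -a

# A and C always share the same characteristic polynomial.
print(np.allclose(np.poly(C), np.poly(A)))  # True
```

Note that sharing the characteristic polynomial alone does not make $A$ similar to $C$; the equivalence above says that similarity holds exactly when $A$ admits a vector $\mathbf{v}$ as in statement 1.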

 


Proof.

Before proving the equivalence of the statements, let us first note a simple but useful fact.
Let $B$ be an $n\times n$ matrix and let $\mathbf{e}_i$ be the unit vector in $\C^n$ defined in the problem. Then the matrix product $B\mathbf{e}_i$ is the $i$-th column vector of the matrix $B$.
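
A two-line illustration of this fact (a sketch assuming NumPy; the matrix `B` is an arbitrary example):

```python
import numpy as np

B = np.array([[1, 2],
              [3, 4]])
e2 = np.array([0, 1])  # the unit vector e_2

print(B @ e2)  # [2 4], the second column of B
```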

$( 1 \implies 2)$

Suppose that statement 1 is true. That is, there exists a vector $\mathbf{v} \in \C^n$ such that
\[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$. In particular, these vectors are linearly independent.

Form the matrix
\[S:=[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}].\] Then the matrix $S$ is invertible since its column vectors are linearly independent.

We prove that $AS=SC$.
We have
\begin{align*}
AS&=A[\mathbf{v}, A\mathbf{v}, \dots, A^{n-1}\mathbf{v}]\\
&=[A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n}\mathbf{v}].
\end{align*}

On the other hand, we have
\begin{align*}
SC&=S [\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}]\\
&=[S\mathbf{e}_2, S\mathbf{e}_3, \dots, S\mathbf{e}_n, -S\mathbf{a}].
\end{align*}

As noted at the beginning, the vector $S\mathbf{e}_2$ is the second column vector of $S$, which is $A\mathbf{v}$. Similarly, $S\mathbf{e}_i$ is the $i$-th column vector of $S$, and thus we have
\[S\mathbf{e}_i=A^{i-1}\mathbf{v} \tag{*}.\] Hence we have
\begin{align*}
SC&=[A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}, -S\mathbf{a}].
\end{align*}

Thus, it remains to prove that $-S\mathbf{a}=A^n\mathbf{v}$.
Write
\[\mathbf{a}=\begin{bmatrix}
a_0 \\
a_1 \\
\vdots \\
a_{n-1}
\end{bmatrix}=a_0\mathbf{e}_1+a_1\mathbf{e}_2+\cdots+a_{n-1}\mathbf{e}_n.\] Then we have
\begin{align*}
S\mathbf{a}&=S(a_0\mathbf{e}_1+a_1\mathbf{e}_2+\cdots+a_{n-1}\mathbf{e}_n)\\
&=a_0S\mathbf{e}_1+a_1S\mathbf{e}_2+\cdots+a_{n-1}S\mathbf{e}_n\\
&\stackrel{(*)}{=} a_0\mathbf{v}+a_1A\mathbf{v}+\cdots+ a_{n-1}A^{n-1}\mathbf{v}\\
&=(a_0I+a_1 A+\cdots +a_{n-1}A^{n-1})\mathbf{v}\\
&=-A^n\mathbf{v},
\end{align*}
where the last step follows from the Cayley-Hamilton theorem
\[O=p(A)=A^n+a_{n-1}A^{n-1}+\cdots+a_1A+a_0I.\]

This proves that $-S\mathbf{a}=A^n\mathbf{v}$ and hence $AS=SC$.
Since $S$ is invertible, we obtain
\[S^{-1}AS=C.\]
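
Here is a numerical sanity check of this direction (a sketch assuming NumPy; for a random matrix `A`, a random vector `v` satisfies statement 1 with probability one, so the Krylov vectors below form a basis):

```python
import numpy as np

n = 4
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

# S = [v, Av, ..., A^{n-1} v]
S = np.column_stack([np.linalg.matrix_power(A, i) @ v for i in range(n)])
assert np.linalg.matrix_rank(S) == n  # statement 1 holds for this (A, v)

# Companion matrix of the characteristic polynomial of A.
coeffs = np.poly(A)[1:][::-1]  # [a_0, a_1, ..., a_{n-1}]
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -coeffs

# Verify S^{-1} A S = C, i.e. statement 2.
print(np.allclose(np.linalg.solve(S, A @ S), C))  # True
```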

 

$( 2 \implies 1)$

Next suppose that statement 2 is true. That is, there exists an invertible matrix $S$ such that $S^{-1}AS=C$. Then we have $AS=SC$.
Define $\mathbf{v}:=S\mathbf{e}_1$ to be the first column vector of $S$.

We prove that with this choice of $\mathbf{v}$, the vectors
\[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] form a basis of $\C^n$.

We have
\begin{align*}
AS&=A[S\mathbf{e}_1, S\mathbf{e}_2, \dots, S\mathbf{e}_n]\\
&=[AS\mathbf{e}_1, AS\mathbf{e}_2, \dots, AS\mathbf{e}_n] \end{align*}
and
\begin{align*}
SC&=S [\mathbf{e}_2, \mathbf{e}_3, \dots, \mathbf{e}_n, -\mathbf{a}]\\
&=[S \mathbf{e}_2, S\mathbf{e}_3, \dots, S\mathbf{e}_n, -S\mathbf{a}].
\end{align*}
Since $AS=SC$, comparing column vectors, we obtain
\begin{align*}
AS\mathbf{e}_1&=S\mathbf{e}_2\\
AS\mathbf{e}_2&=S\mathbf{e}_3\\
\vdots\\
AS\mathbf{e}_{n-1}&=S\mathbf{e}_n\\
AS\mathbf{e}_n&=-S\mathbf{a}.
\end{align*}

It follows inductively that
\begin{align*}
A\mathbf{v}&=S\mathbf{e}_2\\
A^2\mathbf{v}&=A(A\mathbf{v})=AS\mathbf{e}_2=S\mathbf{e}_3\\
\vdots\\
A^{n-1}\mathbf{v}&=A(A^{n-2}\mathbf{v})=AS\mathbf{e}_{n-1}=S\mathbf{e}_n.
\end{align*}
Namely we obtain
\[A^i\mathbf{v}=S\mathbf{e}_{i+1}\] for $i=0, 1, \dots, n-1$.
Now, suppose that we have a linear combination
\[c_1\mathbf{v}+c_2A\mathbf{v}+ \cdots + c_n A^{n-1}\mathbf{v}=\mathbf{0}\] for some scalars $c_1, \dots, c_n \in \C$.

Then, using the relations above, we have
\begin{align*}
c_1S\mathbf{e}_1+c_2S\mathbf{e}_2+\cdots +c_nS\mathbf{e}_n=\mathbf{0}.
\end{align*}
Since $S$ is invertible, the column vectors $S\mathbf{e}_i$ are linearly independent.
Thus, we must have $c_1=c_2=\cdots=c_n=0$.

Therefore the vectors
\[\mathbf{v}, A\mathbf{v}, A^2\mathbf{v}, \dots, A^{n-1}\mathbf{v}\] are linearly independent, and they form a basis of $\C^n$ since we have $n$ linearly independent vectors in the $n$-dimensional vector space $\C^n$.
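
The converse direction admits a similar sanity check (again a sketch assuming NumPy): construct $A=SCS^{-1}$ as in statement 2, take $\mathbf{v}=S\mathbf{e}_1$, and verify both that the Krylov vectors form a basis and that $A^i\mathbf{v}=S\mathbf{e}_{i+1}$ as derived above.

```python
import numpy as np

n = 4
rng = np.random.default_rng(2)

# Statement 2: A = S C S^{-1} for some invertible S and companion matrix C.
a = rng.standard_normal(n)  # coefficients a_0, ..., a_{n-1}
C = np.zeros((n, n))
C[1:, :-1] = np.eye(n - 1)
C[:, -1] = -a
S = rng.standard_normal((n, n))  # a random matrix is almost surely invertible
A = S @ C @ np.linalg.inv(S)

# v = S e_1 is the first column of S.
v = S[:, 0]

# K = [v, Av, ..., A^{n-1} v] should equal S column by column: A^i v = S e_{i+1}.
K = np.column_stack([np.linalg.matrix_power(A, i) @ v for i in range(n)])
print(np.allclose(K, S))              # True
print(np.linalg.matrix_rank(K) == n)  # the Krylov vectors form a basis
```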

Related Question.

For a basic question about the companion matrix of a polynomial, check out the post “Companion matrix for a polynomial”.

