Section 8.1 Cayley-Hamilton Theorem
We begin with the following observation.
Definition 8.1.4. (Annihilator of a linear map).
A polynomial \(p(t)\in F[t]\) is said to annihilate a linear map \(T\) if \(p(T)=0\in\End_F(V)\text{.}\)
We now show that an annihilating polynomial exists.
Lemma 8.1.5. (Existence of an annihilating polynomial).
Let \(V\) be a finite-dimensional vector space over a field \(F\) and let \(T\colon V\to V\) be an \(F\)-linear map. Then there exists a nonzero polynomial \(p(t)\in F[t]\) such that \(p(T)=0\in\End_F(V)\text{.}\)
Proof.
Let \(\dim_FV=n\text{.}\) By Corollary 5.2.4, \(\dim_F\End_F(V)=(\dim_FV)^2=n^2<\infty\text{.}\) Hence the \(n^2+1\) elements \(\unit_V,T,T^2,\ldots,T^n,\ldots,T^{n^2}\) of \(\End_F(V)\) are linearly dependent, i.e., there are scalars \(a_0,a_1,\ldots,a_{n^2}\in F\text{,}\) not all zero, such that
\begin{equation*}
a_0\unit_V+a_1T+\cdots + a_{n^2}T^{n^2}=0\in\End_F(V).
\end{equation*}
Hence the nonzero polynomial \(p(t)=a_0+a_1t+\cdots+a_{n^2}t^{n^2}\in F[t]\) is an annihilating polynomial of \(T\text{.}\)
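As an illustration of the lemma (the matrix below is our own choice and does not come from the text), take \(V=F^2\) and let \(T\) be given by the matrix \(A=\begin{pmatrix}1\amp 1\\0\amp 1\end{pmatrix}\text{.}\) Here \(n^2=4\text{,}\) so the proof guarantees a linear dependence among \(I_2,A,\ldots,A^4\text{,}\) where \(I_2\) denotes the \(2\times 2\) identity matrix (our notation). In this case a dependence already appears among \(I_2,A,A^2\text{:}\)
\begin{equation*}
A^2-2A+I_2=\begin{pmatrix}1\amp 2\\0\amp 1\end{pmatrix}-\begin{pmatrix}2\amp 2\\0\amp 2\end{pmatrix}+\begin{pmatrix}1\amp 0\\0\amp 1\end{pmatrix}=\begin{pmatrix}0\amp 0\\0\amp 0\end{pmatrix}.
\end{equation*}
Thus \(p(t)=t^2-2t+1\) annihilates \(A\text{,}\) which suggests that the degree bound \(n^2\) coming from the proof need not be optimal.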
The Cayley-Hamilton theorem stated below asserts that there exists an annihilating polynomial of degree \(\dim_FV\text{,}\) namely, the characteristic polynomial of \(T\text{.}\)
Theorem 8.1.6. (Cayley-Hamilton Theorem).
Let \(V\) be a finite-dimensional vector space over a field \(F\) and let \(T\colon V\to V\) be an \(F\)-linear map. The characteristic polynomial of \(T\) annihilates \(T\text{.}\)
Checkpoint 8.1.7.
Verify the Cayley-Hamilton Theorem for a square matrix of your choice.
We will not give a proof of the Cayley-Hamilton theorem in general. However, we prove the theorem for triangulable linear maps.
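For instance, one possible choice (ours, purely for illustration) is \(A=\begin{pmatrix}1\amp 2\\3\amp 4\end{pmatrix}\) over \(\R\text{.}\) Assuming the convention \(\chi_A(t)=\det(tI_2-A)\text{,}\) where \(I_2\) is the \(2\times 2\) identity matrix, which is consistent with the monic characteristic polynomials used in this section, we get \(\chi_A(t)=(t-1)(t-4)-6=t^2-5t-2\text{,}\) and
\begin{equation*}
A^2-5A-2I_2=\begin{pmatrix}7\amp 10\\15\amp 22\end{pmatrix}-\begin{pmatrix}5\amp 10\\15\amp 20\end{pmatrix}-\begin{pmatrix}2\amp 0\\0\amp 2\end{pmatrix}=\begin{pmatrix}0\amp 0\\0\amp 0\end{pmatrix}.
\end{equation*}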
Proposition 8.1.8. (Cayley-Hamilton theorem for triangulable linear maps).
Let \(V\) be a finite-dimensional vector space over a field \(F\) and let \(T\colon V\to V\) be a triangulable \(F\)-linear map. Then \(\chi_T(T)=0\text{.}\)
Proof.
By Theorem 7.4.4, \(\chi_T=(t-\lambda_1)(t-\lambda_2)\cdots(t-\lambda_n)\in F[t]\) and thus, there exists a basis \(\mathfrak{B}=\{v_1,v_2,\ldots,v_n\}\) of \(V\) such that
\begin{equation*}
[T]_{\mathfrak{B}}=\begin{pmatrix}\lambda_1\amp a_{12}\amp\cdots\amp a_{1n}\\0\amp \lambda_2\amp\cdots\amp a_{2n}\\\vdots\amp\vdots\amp\ddots\amp \vdots\\0\amp 0\amp\cdots\amp \lambda_n\end{pmatrix}.
\end{equation*}
Hence,
\begin{equation}
T(v_1)=\lambda_1v_1\quad\text{and}\quad T(v_k)=\sum_{\ell=1}^{k-1}a_{\ell k}v_\ell+\lambda_kv_k\quad\text{for}\;2\leq k\leq n.\tag{8.1.5}
\end{equation}
For any \(\lambda_i\) and \(\lambda_j\) we have
\begin{equation}
(T-\lambda_i\unit_V)(T-\lambda_j\unit_V)=(T-\lambda_j\unit_V)(T-\lambda_i\unit_V).\tag{8.1.6}
\end{equation}
For \(1\leq k\leq n\) put \(g_{k}(T)=(T-\lambda_{k}\unit_V)(T-\lambda_{k-1}\unit_V)\cdots(T-\lambda_1\unit_V)\text{.}\) In particular,
\begin{equation}
\chi_T(T)=g_{n}(T)=(T-\lambda_n\unit_V)(T-\lambda_{n-1}\unit_V)\cdots(T-\lambda_1\unit_V).\tag{8.1.7}
\end{equation}
We can obtain the following identity by a repeated application of (8.1.6).
\begin{equation}
(T-\lambda_k\unit_V)\cdot g_{k-1}(T)=g_{k-1}(T)\cdot(T-\lambda_k\unit_V)\tag{8.1.8}
\end{equation}
We show that \(\chi_T(T)(v_k)=0\) for every basis vector \(v_k\text{.}\) To do so, we prove by induction on \(k\) that \(g_{k}(T)(v_i)=0\) for every \(1\leq i\leq k\leq n\text{.}\) For \(k=1\text{,}\) (8.1.5) gives \(g_1(T)(v_1)=(T-\lambda_1\unit_V)(v_1)=0\text{.}\) Now assume that, for some \(2\leq r\leq n\text{,}\) we have proved \(g_{r-1}(T)(v_i)=0\) for all \(1\leq i\leq r-1\text{.}\) We show that \(g_{r}(T)(v_i)=0\) for all \(1\leq i\leq r\text{.}\) Since \(g_r(T)=(T-\lambda_r\unit_V)g_{r-1}(T)\text{,}\) the induction hypothesis gives \(g_r(T)(v_i)=0\) for \(1\leq i\leq r-1\text{,}\) so we only need to show that \(g_{r}(T)(v_{r})=0\text{.}\) We have
\begin{align*}
g_{r}(T)(v_{r})\amp=(T-\lambda_{r}\unit_V)g_{r-1}(T)(v_{r})\amp\\
\amp= g_{r-1}(T)(T-\lambda_{r}\unit_V)(v_{r})\amp\text{by (8.1.8)}\\
\amp= g_{r-1}(T)\left(\sum_{\ell=1}^{r-1}a_{\ell r}v_\ell\right)\amp\text{by (8.1.5)}\\
\amp=\sum_{\ell=1}^{r-1}a_{\ell r}g_{r-1}(T)(v_\ell)\amp\text{as}\;g_{r-1}(T)\in\End_F(V)
\end{align*}
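By the induction hypothesis, \(g_{r-1}(T)(v_\ell)=0\) for \(1\leq\ell\leq r-1\text{.}\) Hence \(g_{r}(T)(v_r)=0\text{,}\) which completes the induction. Taking \(k=n\) and using (8.1.7), we get \(\chi_T(T)(v_i)=g_n(T)(v_i)=0\) for every \(1\leq i\leq n\text{,}\) and hence \(\chi_T(T)=0\text{.}\)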
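To see the argument in the smallest nontrivial case, here is a sketch for \(n=2\text{,}\) written in matrix form with \(A=[T]_{\mathfrak{B}}\) and \(I_2\) the \(2\times 2\) identity matrix (this notation is ours, not the text's):
\begin{equation*}
(A-\lambda_2I_2)(A-\lambda_1I_2)=\begin{pmatrix}\lambda_1-\lambda_2\amp a_{12}\\0\amp 0\end{pmatrix}\begin{pmatrix}0\amp a_{12}\\0\amp \lambda_2-\lambda_1\end{pmatrix}=\begin{pmatrix}0\amp (\lambda_1-\lambda_2)a_{12}+a_{12}(\lambda_2-\lambda_1)\\0\amp 0\end{pmatrix}=\begin{pmatrix}0\amp 0\\0\amp 0\end{pmatrix}.
\end{equation*}
The first column vanishes because \((T-\lambda_1\unit_V)(v_1)=0\text{.}\) The second column vanishes in agreement with the inductive step: \((T-\lambda_2\unit_V)(v_2)=a_{12}v_1\text{,}\) and \(a_{12}v_1\) is in turn sent to \(0\) by \(T-\lambda_1\unit_V\text{.}\)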
Using the Cayley-Hamilton theorem, we show that a linear map \(T\colon\R^3\to\R^3\) has either a one-dimensional or a two-dimensional invariant subspace.
Example 8.1.9. (Invariant subspace of dimension \(1\) or \(2\) in \(\R^3\)).
Let \(T\colon\R^3\to\R^3\) be an \(\R\)-linear map. Then \(T\) has either a one-dimensional or a two-dimensional invariant subspace.
Indeed, the characteristic polynomial \(\chi_T\) of \(T\) has degree \(3\text{.}\) Hence \(\chi_T\) has a real root, say \(a\in\R\) (this can be proved, for instance, by using the continuity of the polynomial function \(\chi_T\)). By Lemma A.1.6, there exists a monic (i.e., the coefficient of \(t^2\) is \(1\)) quadratic polynomial \(q(t)\in\R[t]\) such that
\begin{equation}
\chi_T=(t-a)\cdot q(t).\tag{8.1.9}
\end{equation}
By the Cayley-Hamilton Theorem, \(T\) is annihilated by its characteristic polynomial \(\chi_T\in\R[t]\text{.}\) Hence, by (8.1.3),
\begin{equation*}
0=\chi_T(T)(v)=(T-a\unit_{\R^3})\left(q(T)(v)\right)\quad\text{for all}\;v\in\R^3.
\end{equation*}
We now assume that \(v\in\R^3\) is a nonzero vector.
Case 1. Suppose that \(q(T)(v)\neq 0\text{.}\) If we put \(w=q(T)(v)\in\R^3\) then \((T-a\unit_{\R^3})(w)=0\text{,}\) i.e., \(T(w)=aw\text{,}\) so \(\langle w\rangle\) is a one-dimensional subspace invariant under \(T\text{.}\)
Case 2. Suppose that \(q(T)(v)= 0\text{,}\) and write \(q(t)=t^2+bt+c\in\R[t]\text{.}\) Then \((T^2+bT+c\unit_{\R^3})(v)=0\text{.}\) Consider the subspace \(W=\langle v,T(v)\rangle\text{.}\) If \(v\) and \(T(v)\) are linearly dependent then \(v\) is an eigenvector of \(T\text{,}\) so \(\langle v\rangle\) is a one-dimensional invariant subspace and we are done. So we assume that \(\dim_{\R}W=2\text{.}\) Note that \(T(v)\in W\) and \(T(T(v))=T^2(v)=-bT(v)-cv\in W\text{,}\) and hence \(W\) is an invariant subspace of dimension \(2\text{.}\)
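Both cases can occur for the same map, depending on the choice of \(v\text{.}\) As an illustration (this particular map is our choice and is not part of the text), let \(T\) be the rotation of \(\R^3\) by \(90^\circ\) about the third coordinate axis, and let \(e_1,e_2,e_3\) denote the standard basis vectors. With respect to this basis,
\begin{equation*}
[T]=\begin{pmatrix}0\amp -1\amp 0\\1\amp 0\amp 0\\0\amp 0\amp 1\end{pmatrix},\qquad \chi_T=(t-1)(t^2+1),
\end{equation*}
so that \(a=1\) and \(q(t)=t^2+1\text{,}\) i.e., \(b=0\) and \(c=1\text{.}\) For \(v=e_3\) we have \(q(T)(e_3)=T^2(e_3)+e_3=2e_3\neq 0\text{,}\) so we are in Case 1 with \(w=2e_3\text{,}\) and \(\langle e_3\rangle\) is an invariant line. For \(v=e_1\) we have \(T(e_1)=e_2\) and \(T^2(e_1)=-e_1\text{,}\) so \(q(T)(e_1)=0\) and we are in Case 2: the subspace \(W=\langle e_1,e_2\rangle\) is a two-dimensional invariant subspace.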