Section 8.1 Cayley-Hamilton Theorem

We begin with the following observation.

Observation 8.1.1.

Let \(T\colon V\to V\) be an \(F\)-linear map of a finite-dimensional vector space \(V\) over a field \(F\text{.}\) For \(k\geq 1\) we write \(T^k=\underbrace{T\circ T\circ\cdots\circ T}_{k\text{ times}}\) for the \(k\)-fold composite of \(T\) with itself. If \(p(t)=a_0+a_1t+\cdots+a_nt^n\in F[t]\) is a polynomial then
\begin{equation} a_0\unit_V+a_1T+\cdots+a_nT^n\in\End_F(V)\text{.}\tag{8.1.1} \end{equation}

Convention 8.1.2.

We denote by \(p(T)\in\End_F(V)\) the expression obtained in (8.1.1).
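As an illustration of this convention, the following is a minimal Python sketch that evaluates \(p(A)\) for a square matrix \(A\) representing \(T\) in some basis. It uses Horner's rule, which is our choice and not something the text prescribes. The names `mat_mul` and `poly_at_matrix`, the sample matrix and the sample polynomial are all assumptions made for the example.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_n*A^n by Horner's rule.

    coeffs = [a_0, a_1, ..., a_n] lists the coefficients in increasing degree.
    """
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        # result <- result * A + a * I
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# Hypothetical sample data: A is a 2x2 matrix and p(t) = 2 - 3t + t^2.
A = [[Fraction(1), Fraction(2)],
     [Fraction(3), Fraction(4)]]
P = poly_at_matrix([2, -3, 1], A)   # the matrix 2*I - 3*A + A^2
```

Exact rational arithmetic is used so that an equality such as \(p(A)=0\) can be tested without rounding error.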

Observation 8.1.3.

If \(s(t)=f(t)+g(t)\in F[t]\) and \(u(t)=f(t)g(t)\in F[t]\) then we have the following.
\begin{equation} s(T)=f(T)+g(T)\tag{8.1.2} \end{equation}
\begin{equation} u(T)=f(T)g(T)\tag{8.1.3} \end{equation}
\begin{equation} f(T)g(T)=g(T)f(T)\tag{8.1.4} \end{equation}
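The identities (8.1.3) and (8.1.4) can be checked on a small example. The sketch below is only a sanity check under our own assumptions: the matrix `A`, the polynomials `f` and `g`, and the helper names are hypothetical, and a single numerical instance is of course not a proof.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_n*A^n (coefficients in increasing degree)."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

def poly_mul(f, g):
    """Coefficient list of the product polynomial f(t)g(t)."""
    u = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            u[i + j] += fi * gj
    return u

# Hypothetical sample data with integer entries.
A = [[0, 1], [-2, 3]]
f = [1, 1]        # f(t) = 1 + t
g = [-1, 0, 2]    # g(t) = -1 + 2t^2
u = poly_mul(f, g)

fA, gA, uA = (poly_at_matrix(p, A) for p in (f, g, u))
product_rule_holds = (uA == mat_mul(fA, gA))            # instance of (8.1.3)
factors_commute = (mat_mul(fA, gA) == mat_mul(gA, fA))  # instance of (8.1.4)
```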

Definition 8.1.4. (Annihilator of a linear map).

A polynomial \(p(t)\in F[t]\) is said to annihilate a linear map \(T\) if \(p(T)=0\in\End_F(V)\text{.}\)
We show the existence of an annihilating polynomial.
Let \(\dim_FV=n\text{.}\) By Corollary 5.2.4, \(\dim_F\End_F(V)=(\dim_FV)^2=n^2<\infty\text{.}\) Hence the \(n^2+1\) elements \(\unit_V,T,T^2,\ldots,T^{n^2}\) of \(\End_F(V)\) are linearly dependent, i.e., there are scalars \(a_0,a_1,\ldots,a_{n^2}\in F\text{,}\) not all zero, such that
\begin{equation*} a_0\unit_V+a_1T+\cdots + a_{n^2}T^{n^2}=0\in\End_F(V). \end{equation*}
Hence the nonzero polynomial \(p(t)=a_0+a_1t+\cdots+a_{n^2}t^{n^2}\in F[t]\) is an annihilating polynomial of \(T\text{.}\)
The Cayley-Hamilton theorem stated below asserts that there exists an annihilating polynomial of degree \(\dim_FV\text{,}\) namely, the characteristic polynomial of \(T\text{.}\)
Verify Cayley-Hamilton Theorem for any square matrix of your choice.
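One possible way to carry out such a verification by machine is sketched below. It assumes the convention \(\chi_A(t)=\det(t\unit-A)\) and computes the coefficients with the Faddeev-LeVerrier recursion. That recursion is our choice of method and is not taken from the text; it divides by \(1,2,\ldots,n\text{,}\) so it is only valid over fields of characteristic zero such as \(\mathbb{Q}\text{.}\) The sample matrix and all function names are hypothetical.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(A):
    """Coefficients [c_0, ..., c_{n-1}, 1] of det(t*I - A), in increasing degree.

    Faddeev-LeVerrier: M_k = A*M_{k-1} + c_{n-k+1}*I, c_{n-k} = -tr(A*M_k)/k.
    """
    n = len(A)
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        M = mat_mul(A, M)
        for i in range(n):
            M[i][i] += coeffs[n - k + 1]
        AM = mat_mul(A, M)
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs

def poly_at_matrix(coeffs, A):
    """Evaluate a_0*I + a_1*A + ... + a_n*A^n by Horner's rule."""
    n = len(A)
    result = [[Fraction(0)] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += a
    return result

# Hypothetical sample matrix with rational (here integer) entries.
A = [[2, 1, 0],
     [0, 2, 1],
     [1, 0, 3]]
chi = char_poly(A)                 # t^3 - 7t^2 + 16t - 13
chi_at_A = poly_at_matrix(chi, A)  # the zero matrix, as the theorem predicts
```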
We will not give a proof of the Cayley-Hamilton theorem in general. However, we prove the theorem for triangulable linear maps.
Let \(T\) be triangulable. By Theorem 7.4.4, \(\chi_T=(t-\lambda_1)(t-\lambda_2)\cdots(t-\lambda_n)\in F[t]\) for some \(\lambda_1,\lambda_2,\ldots,\lambda_n\in F\text{,}\) and there exists a basis \(\mathfrak{B}=\{v_1,v_2,\ldots,v_n\}\) of \(V\) such that
\begin{equation*} [T]_{\mathfrak{B}}=\begin{pmatrix}\lambda_1\amp a_{12}\amp\cdots\amp a_{1n}\\0\amp \lambda_2\amp\cdots\amp a_{2n}\\\vdots\amp\vdots\amp\ddots\amp \vdots\\0\amp 0\amp\cdots\amp \lambda_n\end{pmatrix}. \end{equation*}
Hence,
\begin{equation} T(v_1)=\lambda_1v_1\quad\text{and}\quad T(v_k)=\sum_{\ell=1}^{k-1}a_{\ell k}v_\ell+\lambda_kv_k\quad\text{for}\;2\leq k\leq n.\tag{8.1.5} \end{equation}
For any \(\lambda_i\) and \(\lambda_j\) we have
\begin{equation} (T-\lambda_i\unit_V)(T-\lambda_j\unit_V)=(T-\lambda_j\unit_V)(T-\lambda_i\unit_V).\tag{8.1.6} \end{equation}
For \(1\leq k\leq n\) put \(g_{k}(T)=(T-\lambda_{k}\unit_V)(T-\lambda_{k-1}\unit_V)\cdots(T-\lambda_1\unit_V)\text{.}\) In particular
\begin{equation} \chi_T(T)=g_{n}(T)=(T-\lambda_n\unit_V)(T-\lambda_{n-1}\unit_V)\cdots(T-\lambda_1\unit_V).\tag{8.1.7} \end{equation}
We can obtain the following identity by a repeated application of (8.1.6).
\begin{equation} (T-\lambda_k\unit_V)\cdot g_{k-1}(T)=g_{k-1}(T)\cdot(T-\lambda_k\unit_V)\tag{8.1.8} \end{equation}
We show that \(\chi_T(T)(v_k)=0\) for every basis vector \(v_k\text{.}\) More precisely, we prove by induction on \(k\) that \(g_{k}(T)(v_i)=0\) for all \(1\leq i\leq k\leq n\text{.}\) We have \((T-\lambda_1\unit_V)(v_1)=0\text{,}\) i.e., \(g_1(T)(v_1)=0\text{.}\) Since \(T-\lambda_1\unit_V\) is the rightmost factor of every \(g_k(T)\text{,}\) we also get \(g_k(T)(v_1)=0\) for all \(k\text{.}\) Now assume that, for some \(2\leq r\leq n\text{,}\) we have proved \(g_{r-1}(T)(v_i)=0\) for all \(1\leq i\leq r-1\text{.}\) We show that \(g_{r}(T)(v_i)=0\) for all \(1\leq i\leq r\text{.}\) Since \(g_r(T)=(T-\lambda_r\unit_V)g_{r-1}(T)\text{,}\) the induction hypothesis gives \(g_r(T)(v_i)=0\) for \(1\leq i\leq r-1\text{,}\) so we only need to show that \(g_{r}(T)(v_{r})=0\text{.}\) We have
\begin{align*} g_{r}(T)(v_{r})\amp=(T-\lambda_{r}\unit_V)g_{r-1}(T)(v_{r})\amp\\ \amp= g_{r-1}(T)(T-\lambda_{r}\unit_V)(v_{r})\amp\text{by (8.1.8)}\\ \amp= g_{r-1}(T)\left(\sum_{\ell=1}^{r-1}a_{\ell r}v_\ell\right)\amp\text{by (8.1.5)}\\ \amp=\sum_{\ell=1}^{r-1}a_{\ell r}g_{r-1}(T)(v_\ell)\amp\text{as}\;g_{r-1}(T)\in\End_F(V). \end{align*}
By the induction hypothesis, \(g_{r-1}(T)(v_\ell)=0\) for \(1\leq\ell\leq r-1\text{.}\) Hence, \(g_{r}(T)(v_r)=0\text{.}\) Taking \(r=n\) we get \(\chi_T(T)(v_i)=g_n(T)(v_i)=0\) for every basis vector \(v_i\text{,}\) and therefore \(\chi_T(T)=0\text{.}\)
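The induction can be observed on a concrete upper triangular matrix. In the sketch below the matrix `A` plays the role of \([T]_{\mathfrak{B}}\text{,}\) so \(v_i\) corresponds to the \(i\)-th standard basis vector and \(g_k(A)(v_i)\) is the \(i\)-th column of \(g_k(A)\text{.}\) The matrix and the helper names `g` and `kills_first_columns` are assumptions made for this example only.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def g(A, k):
    """g_k(A) = (A - l_k*I) ... (A - l_1*I), where l_j is the j-th diagonal entry."""
    n = len(A)
    G = [[int(i == j) for j in range(n)] for i in range(n)]
    for j in range(k):
        lam = A[j][j]
        factor = [[A[r][c] - (lam if r == c else 0) for c in range(n)]
                  for r in range(n)]
        G = mat_mul(factor, G)   # multiply the new factor on the left
    return G

# Hypothetical upper triangular matrix with diagonal entries 2, 3, 7.
A = [[2, 5, -1],
     [0, 3, 4],
     [0, 0, 7]]
n = len(A)

def kills_first_columns(k):
    """True if the first k columns of g_k(A) vanish, i.e. g_k(A)(v_i) = 0 for i <= k."""
    Gk = g(A, k)
    return all(Gk[row][i] == 0 for i in range(k) for row in range(n))

induction_claim = all(kills_first_columns(k) for k in range(1, n + 1))
chi_at_A = g(A, n)   # g_n(A) is chi_A(A), by (8.1.7)
```

For this matrix \(g_2(A)\) still has a nonzero third column, which shows that all \(n\) factors are needed in general.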
Using the Cayley-Hamilton theorem we show that a linear map \(T\colon\R^3\to\R^3\) has either a one-dimensional or a two-dimensional invariant subspace.
Let \(T\colon\R^3\to\R^3\) be an \(\R\)-linear map. Then \(T\) has either a one-dimensional or a two-dimensional invariant subspace.
Indeed, the characteristic polynomial \(\chi_T\) of \(T\) has degree \(3\text{.}\) Hence \(\chi_T\) has a real root, say \(a\in\R\) (this can be proved, for instance, by applying the intermediate value theorem to the continuous function \(\chi_T\text{,}\) which has odd degree). By Lemma A.1.6, there exists a monic (i.e., the coefficient of \(t^2\) is \(1\)) quadratic polynomial \(q(t)\in\R[t]\) such that
\begin{equation} \chi_T=(t-a)\cdot q(t)\tag{8.1.9} \end{equation}
By Cayley-Hamilton Theorem, \(T\) is annihilated by the characteristic polynomial \(\chi_T\in\R[t]\text{.}\) Hence, by (8.1.3),
\begin{equation*} 0=\chi_T(T)(v)=(T-a\unit_{\R^3})\left(q(T)(v)\right)\quad\text{for all}\;v\in\R^3. \end{equation*}
We now assume that \(v\in\R^3\) is a nonzero vector.
Case 1. Suppose that \(q(T)(v)\neq 0\text{.}\) If we put \(w=q(T)(v)\in\R^3\) then \((T-a\unit_{\R^3})(w)=0\text{,}\) i.e., \(T(w)=aw\text{.}\) Since \(w\neq 0\text{,}\) \(\langle w\rangle\) is a one-dimensional subspace invariant under \(T\text{.}\)
Case 2. Suppose that \(q(T)(v)= 0\text{,}\) and write \(q(t)=t^2+bt+c\in\R[t]\text{.}\) Thus \((T^2+bT+c\unit_{\R^3})(v)=0\text{.}\) Consider the subspace \(W=\langle v,T(v)\rangle\text{.}\) If \(v\) and \(T(v)\) are linearly dependent then \(v\) is an eigenvector of \(T\) and \(\langle v\rangle\) is a one-dimensional invariant subspace, so we are done. So we assume that \(\dim_{\R}W=2\text{.}\) Note that \(T(v)\in W\) and \(T(T(v))=T^2(v)=-bT(v)-cv\in W\text{.}\) Hence \(W\) is an invariant subspace of dimension \(2\text{.}\)
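Both cases can occur for the same map. The following sketch uses a hypothetical matrix `A` (a rotation by a quarter turn in the first two coordinates and multiplication by \(2\) in the third), for which \(\chi_A=(t-2)(t^2+1)\text{,}\) so \(a=2\text{,}\) \(b=0\) and \(c=1\) in the notation above. The matrix, the vectors `v1`, `v2` and the helper names are assumptions for illustration.

```python
def mat_vec(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

# Hypothetical sample map with chi_A = (t - 2)(t^2 + 1).
A = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 2]]
a, b, c = 2, 0, 1

def q_of_T(v):
    """q(T)(v) = T^2(v) + b*T(v) + c*v."""
    Tv = mat_vec(A, v)
    T2v = mat_vec(A, Tv)
    return [T2v[i] + b * Tv[i] + c * v[i] for i in range(3)]

# Case 1: q(T)(v1) is nonzero, so w = q(T)(v1) spans an invariant line.
v1 = [0, 0, 1]
w = q_of_T(v1)
case1_invariant_line = any(w) and mat_vec(A, w) == [a * x for x in w]

# Case 2: q(T)(v2) = 0, so W = <v2, T(v2)> is invariant.
v2 = [1, 0, 0]
Tv2 = mat_vec(A, v2)
T2v2 = mat_vec(A, Tv2)
case2_q_vanishes = not any(q_of_T(v2))
# v2 and T(v2) are independent if some 2x2 minor of the pair is nonzero.
case2_independent = any(v2[i] * Tv2[j] - v2[j] * Tv2[i] != 0
                        for i in range(3) for j in range(3))
# T^2(v2) = -b*T(v2) - c*v2 lies in W, so T maps W into W.
case2_closed = T2v2 == [-b * Tv2[i] - c * v2[i] for i in range(3)]
```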