6.5 Properties of eigenvalues
We will argue next that an endomorphism \(g : V \to V\) of a finite dimensional \(\mathbb{K}\)-vector space \(V\) has at most \(\dim(V)\) eigenvalues. We first need:
For a polynomial \(f \in \mathsf{P}(\mathbb{K})\) of degree \(n\geqslant 1\) and \(x_0 \in \mathbb{K},\) there exists a polynomial \(g \in \mathsf{P}(\mathbb{K})\) of degree \(n-1\) such that for all \(x \in \mathbb{K}\) we have \(f(x)=f(x_0)+g(x)(x-x_0).\)
Proof. We will give an explicit expression for the polynomial \(g.\) If one is not interested in such an expression, a proof using induction can also be given. Write \(f(x)=\sum_{k=0}^n a_kx^k\) for coefficients \((a_0,\ldots,a_n) \in \mathbb{K}^{n+1}.\) For \(0\leqslant j\leqslant n-1\) consider \[\tag{6.3} b_j=\sum_{k=0}^{n-j-1}a_{k+j+1}x_0^k\] and the polynomial \[g(x)=\sum_{j=0}^{n-1}b_jx^j\] of degree \(n-1.\) We have \[\begin{aligned} g(x)(x-x_0)&=\sum_{j=0}^{n-1}\sum_{k=0}^{n-j-1}\left(a_{k+j+1}x_0^kx^{j+1}\right)-\sum_{j=0}^{n-1}\sum_{k=0}^{n-j-1}\left(a_{k+j+1}x_0^{k+1}x^{j}\right)\\ &=\sum_{j=1}^n\sum_{k=0}^{n-j}\left(a_{k+j}x_0^kx^j\right)-\sum_{j=0}^{n-1}\sum_{k=1}^{n-j}\left(a_{k+j}x_0^kx^j\right)\\ &=a_nx^n+\sum_{j=1}^{n-1}a_jx^j+a_0-a_0-\sum_{k=1}^na_kx_0^k=f(x)-f(x_0). \end{aligned}\]
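For readers who like to experiment, the splitting is easy to check numerically. The following is a minimal sketch in Python (not part of the notes; the function names are ours) that computes the coefficients \(b_j\) from (6.3) and tests the identity \(f(x)=f(x_0)+g(x)(x-x_0)\) at a few sample points.

```python
# Compute the coefficients b_j of g from (6.3) and verify the splitting
# f(x) = f(x_0) + g(x)(x - x_0) numerically.

def split_off_point(a, x0):
    """Given coefficients a = (a_0, ..., a_n) of f and a point x0,
    return the coefficients (b_0, ..., b_{n-1}) of g from (6.3)."""
    n = len(a) - 1
    return [sum(a[k + j + 1] * x0**k for k in range(n - j)) for j in range(n)]

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x**k for k, c in enumerate(coeffs))

a = [2.0, -3.0, 0.0, 1.0]        # f(x) = x^3 - 3x + 2
x0 = 1.5
b = split_off_point(a, x0)       # coefficients of g, of degree 2
for x in [-2.0, 0.0, 1.0, 3.0]:
    assert abs(evaluate(a, x) - (evaluate(a, x0) + evaluate(b, x) * (x - x0))) < 1e-9
```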
From this we conclude:
Let \(f \in \mathsf{P}(\mathbb{K})\) be a polynomial of degree \(n.\) Then \(f\) has at most \(n\) (distinct) zeros or \(f\) is the zero polynomial.
Proof. We use induction on \(n.\) A polynomial of degree \(n=0\) is constant, so it either has no zeros or is the zero polynomial; this anchors the induction. For the inductive step, let \(f\) have degree \(n\geqslant 1\) and assume the statement holds for polynomials of degree \(n-1.\) If \(f\) has no zeros, we are done. Otherwise, let \(x_0 \in \mathbb{K}\) be a zero of \(f.\) By the preceding result there exists a polynomial \(g \in \mathsf{P}(\mathbb{K})\) of degree \(n-1\) such that \(f(x)=f(x_0)+g(x)(x-x_0)=g(x)(x-x_0)\) for all \(x \in \mathbb{K}.\) Every zero of \(f\) distinct from \(x_0\) is then a zero of \(g.\) By the induction hypothesis, \(g\) has at most \(n-1\) distinct zeros or \(g\) is the zero polynomial; in the latter case \(f\) is the zero polynomial as well. Hence \(f\) is the zero polynomial or has at most \(n\) distinct zeros.
This gives:
Let \(g : V \to V\) be an endomorphism of a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}.\) Then \(g\) has at most \(n\) (distinct) eigenvalues.
Proof. Recall that \(\lambda \in \mathbb{K}\) is an eigenvalue of \(g\) if and only if \(\det\left(\lambda \mathrm{Id}_{V}-g\right)=0,\) that is, if and only if \(\lambda\) is a zero of the characteristic polynomial \[\operatorname{char}_g(x)=x^n-\operatorname{Tr}(g)x^{n-1}+\cdots +(-1)^n\det(g).\] Since \(\operatorname{char}_g\) has degree \(n,\) it is not the zero polynomial, so by the preceding corollary it has at most \(n\) distinct zeros. Hence \(g\) has at most \(n\) distinct eigenvalues.
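The coefficient formula for \(\operatorname{char}_g\) is also easy to test numerically. Here is a brief sketch in Python (using numpy, whose routine np.poly applied to a square matrix returns the coefficients of \(\det(x\mathbf{1}_n-\mathbf{A}),\) leading coefficient first; the example matrix is an arbitrary choice):

```python
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 2.0]])
n = A.shape[0]
coeffs = np.poly(A)   # coefficients of char_A(x) = det(x 1_n - A)
assert np.isclose(coeffs[0], 1.0)                            # monic of degree n
assert np.isclose(coeffs[1], -np.trace(A))                   # coefficient -Tr(A)
assert np.isclose(coeffs[-1], (-1.0)**n * np.linalg.det(A))  # constant term (-1)^n det(A)
```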
Let \(V\) be a finite dimensional \(\mathbb{K}\)-vector space and \(g :V \to V\) an endomorphism. Then the eigenspaces \(\operatorname{Eig}_{g}(\lambda)\) of \(g\) are in direct sum. In particular, if \(v_1,\ldots,v_m\) are eigenvectors corresponding to distinct eigenvalues of \(g,\) then \(\{v_1,\ldots,v_m\}\) are linearly independent.
Proof. We use induction on the number \(m\) of distinct eigenvalues \(\lambda_1,\ldots,\lambda_m\) of \(g.\) For \(m=1\) the statement is trivially true; this anchors the induction.
Inductive step: Assume that any \(m-1\) of the eigenspaces are in direct sum. We want to show that then the \(m\) eigenspaces are also in direct sum. Let \(v_i,v_i^{\prime} \in \operatorname{Eig}_{g}(\lambda_i)\) be such that \[\tag{6.4} v_1+v_2+\cdots+v_m=v_1^{\prime}+v_2^{\prime}+\cdots+v_{m}^{\prime}.\] Applying \(g\) to this last equation gives \[\tag{6.5} \lambda_1v_1+\lambda_2 v_2+\cdots+\lambda_m v_m=\lambda_1 v_1^{\prime}+\lambda_2 v_2^{\prime}+\cdots+\lambda_m v_{m}^{\prime}.\] Subtracting \(\lambda_m\) times (6.4) from (6.5) gives \[(\lambda_1-\lambda_m)v_1+\cdots+(\lambda_{m-1}-\lambda_m)v_{m-1}=(\lambda_1-\lambda_m)v_1^{\prime}+\cdots+(\lambda_{m-1}-\lambda_m)v_{m-1}^{\prime}.\] Since the first \(m-1\) eigenspaces are in direct sum, this implies that \((\lambda_i-\lambda_m)v_i=(\lambda_i-\lambda_m)v_i^{\prime}\) for \(1\leqslant i\leqslant m-1.\) Since the eigenvalues are distinct, we have \(\lambda_i-\lambda_m\neq 0\) for all \(1\leqslant i\leqslant m-1\) and hence \(v_i=v_i^{\prime}\) for all \(1\leqslant i\leqslant m-1.\) Now (6.4) implies that \(v_m=v_m^{\prime}\) as well and the inductive step is complete.
Two subspaces \(U_1,U_2\) of \(V\) are in direct sum if and only if \(U_1\cap U_2=\{0_V\}.\) Indeed, suppose \(U_1\cap U_2=\{0_V\}\) and consider \(w=v_1+v_2=v_1^{\prime}+v_2^{\prime}\) with \(v_i,v_i^{\prime} \in U_i\) for \(i=1,2.\) We then have \(v_1-v_1^{\prime}=v_2^{\prime}-v_2 \in U_2,\) since \(U_2\) is a subspace. Since \(U_1\) is a subspace as well, we also have \(v_1-v_1^{\prime} \in U_1.\) Since \(v_1-v_1^{\prime}\) lies both in \(U_1\) and \(U_2,\) we must have \(v_1-v_1^{\prime}=0_V=v_2^{\prime}-v_2.\) Conversely, suppose \(U_1,U_2\) are in direct sum and let \(w \in (U_1\cap U_2).\) We can write \(w=w+0_V=0_V+w,\) since \(w \in U_1\) and \(w \in U_2.\) Since \(U_1,U_2\) are in direct sum, we must have \(w=0_V,\) hence \(U_1\cap U_2=\{0_V\}.\)
Observe that if the subspaces \(U_1,\ldots,U_n\) are in direct sum and \(v_i \in U_i\) with \(v_i \neq 0_V\) for \(1\leqslant i\leqslant n,\) then the vectors \(\{v_1,\ldots,v_n\}\) are linearly independent. Indeed, if \(s_1,\ldots,s_n\) are scalars such that \[s_1v_1+s_2v_2+\cdots+s_n v_n=0_V=0_V+0_V+\cdots+0_V,\] then \(s_iv_i=0_V\) for all \(1\leqslant i\leqslant n.\) By assumption \(v_i\neq 0_V\) and hence \(s_i=0\) for all \(1\leqslant i\leqslant n.\)
In the case where all the eigenvalues are distinct, we conclude that \(g\) is diagonalisable.
Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V.\) If the characteristic polynomial of \(g\) has \(\dim(V)\) distinct zeros (that is, the algebraic multiplicity of each eigenvalue is \(1\)), then \(g\) is diagonalisable.
Proof. The \(\dim(V)\) distinct zeros of the characteristic polynomial are \(\dim(V)\) distinct eigenvalues of \(g.\) Choosing an eigenvector for each of them yields, by the preceding theorem, \(\dim(V)\) linearly independent vectors, that is, an ordered basis of \(V\) consisting of eigenvectors of \(g.\) Hence \(g\) is diagonalisable.
Every polynomial in \(\mathsf{P}(\mathbb{C})\) of degree at least \(1\) has at least one zero. This fact is known as the fundamental theorem of algebra. The name is well-established, but quite misleading, as there is no purely algebraic proof. You will encounter a proof of this statement in the module M07. As a consequence we obtain the following important existence theorem:
Let \(g : V \to V\) be an endomorphism of a complex vector space \(V\) of dimension \(n\geqslant 1.\) Then \(g\) admits at least one eigenvalue. Moreover, the sum of the algebraic multiplicities of the eigenvalues of \(g\) is equal to \(n.\) In particular, if \(\mathbf{A}\in M_{n,n}(\mathbb{C})\) is a matrix, then there is at least one eigenvalue of \(\mathbf{A}.\)
Proof. The characteristic polynomial \(\operatorname{char}_g\) has degree \(n\geqslant 1,\) so by the fundamental theorem of algebra it has a zero \(\lambda \in \mathbb{C},\) and \(\lambda\) is an eigenvalue of \(g\) since \(\det(\lambda \mathrm{Id}_V-g)=\operatorname{char}_g(\lambda)=0.\) Splitting off the corresponding linear factor and applying the fundamental theorem of algebra repeatedly, we can write \(\operatorname{char}_g(x)=(x-\lambda_1)^{m_1}\cdots(x-\lambda_k)^{m_k}\) for the distinct eigenvalues \(\lambda_1,\ldots,\lambda_k\) of \(g\) with \(m_1+\cdots+m_k=n,\) so the algebraic multiplicities sum to \(n.\)
Recall that the discriminant of a quadratic polynomial \(x\mapsto ax^2+bx+c \in \mathsf{P}(\mathbb{K})\) with \(a\neq 0\) is \(b^2-4ac.\) If \(\mathbb{K}=\mathbb{C}\) and \(b^2-4ac\) is non-zero, then the polynomial \(ax^2+bx+c\) has two distinct zeros. The characteristic polynomial of a \(2\)-by-\(2\) matrix \(\mathbf{A}\) satisfies \(\operatorname{char}_\mathbf{A}(x)=x^2-\operatorname{Tr}(\mathbf{A})x+\det(\mathbf{A}).\) Therefore, if \(\mathbf{A}\) has complex entries and satisfies \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}\neq 0,\) then it is diagonalisable. If \(\mathbf{A}\) has real entries and satisfies \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}\geqslant 0,\) then it has at least one (real) eigenvalue. If \((\operatorname{Tr}\mathbf{A})^2-4\det \mathbf{A}>0,\) then it is diagonalisable.
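A quick numerical illustration (a sketch in Python with numpy; the matrix is an arbitrary real example):

```python
import numpy as np

A = np.array([[3.0, 1.0], [2.0, 0.0]])
tr, det = np.trace(A), np.linalg.det(A)
disc = tr**2 - 4.0 * det               # discriminant of char_A
# the eigenvalues of A are the zeros of x^2 - Tr(A) x + det(A)
assert np.allclose(np.sort(np.linalg.eigvals(A)), np.sort(np.roots([1.0, -tr, det])))
print(disc > 0)   # True: two distinct real eigenvalues, hence A is diagonalisable
```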
Recall that, by Proposition 5.24, an upper triangular matrix \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}\) satisfies \[\det(\mathbf{A})=\prod_{i=1}^n A_{ii}=A_{11}A_{22}\cdots A_{nn}.\] It follows that \[\operatorname{char}_\mathbf{A}(x)=\prod_{i=1}^n(x-A_{ii})=(x-A_{11})(x-A_{22})\cdots(x-A_{nn}).\] Consequently, an upper triangular matrix has spectrum \(\{A_{11},A_{22},\ldots,A_{nn}\}\) and is diagonalisable if all its diagonal entries are distinct. Notice, however, that by Example 6.40 (ii) not every upper triangular matrix is diagonalisable.
We consider \[\mathbf{A}=\begin{pmatrix} 1 & 5 \\ 5 & 1 \end{pmatrix}.\] Then \[\begin{aligned} \operatorname{char}_{\mathbf{A}}(x)&=\operatorname{char}_{f_\mathbf{A}}(x)=\det\left(x\mathbf{1}_{2}-\mathbf{A}\right)=\det\begin{pmatrix} x-1 & -5 \\ -5 & x-1\end{pmatrix}\\ &=(x-1)^2-25=x^2-2x-24=(x+4)(x-6). \end{aligned}\] Hence we have eigenvalues \(\lambda_1=6\) and \(\lambda_2=-4,\) both with algebraic multiplicity \(1.\) By definition we have \[\operatorname{Eig}_{\mathbf{A}}(6)=\operatorname{Eig}_{f_\mathbf{A}}(6)=\left\{\vec{v} \in \mathbb{K}^2 \,\middle|\, \mathbf{A}\vec{v}=6\vec{v}\right\}\] and we compute that \[\operatorname{Eig}_{\mathbf{A}}(6)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ 1 \end{pmatrix}\right\}.\] Since \(\dim \operatorname{Eig}_\mathbf{A}(6)=1,\) the eigenvalue \(6\) has geometric multiplicity \(1.\) Likewise we compute \[\operatorname{Eig}_\mathbf{A}(-4)=\operatorname{span}\left\{\begin{pmatrix} -1 \\ 1 \end{pmatrix}\right\}\] so that the eigenvalue \(-4\) has geometric multiplicity \(1\) as well. Notice that we obtain an ordered basis of \(\mathbb{K}^2\) consisting of eigenvectors of \(\mathbf{A}\) and hence \(\mathbf{A}\) is diagonalisable, cf. Example 3.96.
We consider \[\mathbf{A}=\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}.\] Then \(\operatorname{char}_\mathbf{A}(x)=(x-2)^2,\) so that we have a single eigenvalue \(2\) with algebraic multiplicity \(2.\) We compute \[\operatorname{Eig}_\mathbf{A}(2)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right\},\] so that the eigenvalue \(2\) has geometric multiplicity \(1.\) Since we cannot find an ordered basis consisting of eigenvectors, \(\mathbf{A}\) is not diagonalisable.
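Both examples are easy to reproduce numerically. A sketch in Python with numpy, where the geometric multiplicity of \(\lambda\) is computed as \(\dim\operatorname{Ker}(\lambda\mathbf{1}_2-\mathbf{A})=2-\operatorname{rank}(\lambda\mathbf{1}_2-\mathbf{A}):\)

```python
import numpy as np

A = np.array([[1.0, 5.0], [5.0, 1.0]])
print(np.sort(np.linalg.eigvals(A)))                   # [-4.  6.]
print(2 - np.linalg.matrix_rank(6.0 * np.eye(2) - A))  # geometric multiplicity 1

B = np.array([[2.0, 1.0], [0.0, 2.0]])
print(np.linalg.eigvals(B))                            # [2. 2.]: algebraic multiplicity 2
print(2 - np.linalg.matrix_rank(2.0 * np.eye(2) - B))  # geometric multiplicity 1
```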
We revisit the Fibonacci sequences, now equipped with the theory of endomorphisms. A Fibonacci sequence is a sequence \(\xi : \mathbb{N}\cup\{0\} \to \mathbb{K}\) satisfying the recursive relation \(\xi_{n+2}=\xi_n+\xi_{n+1}.\) We consider the initial conditions \(\xi_0=0\) and \(\xi_1=1\) and the matrix \[\mathbf{A}=\begin{pmatrix} \xi_0 & \xi_1 \\ \xi_1 & \xi_2 \end{pmatrix}=\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}.\] Then, using induction, we can show that \[\mathbf{A}^n=\begin{pmatrix} \xi_{n-1} & \xi_n \\ \xi_n & \xi_{n+1}\end{pmatrix}\] for all \(n \in \mathbb{N}.\) We would like to compute \(\mathbf{A}^n.\) Suppose we can find an invertible matrix \(\mathbf{C}\) so that \(\mathbf{A}=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\) for some diagonal matrix \(\mathbf{D}.\) Then \[\mathbf{A}^n=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\cdots \mathbf{C}\mathbf{D}\mathbf{C}^{-1}=\mathbf{C}\mathbf{D}^n\mathbf{C}^{-1},\] and we can easily compute \(\mathbf{A}^n,\) as the \(n\)-th power of a diagonal matrix \(\mathbf{D}\) is the diagonal matrix whose diagonal entries are the \(n\)-th powers of the diagonal entries of \(\mathbf{D}.\) We thus want to diagonalise \(\mathbf{A}.\) We obtain \(\operatorname{char}_\mathbf{A}(x)=x^2-x-1\) and hence eigenvalues \(\lambda_{1}=(1+\sqrt{5})/2\) and \(\lambda_2=(1-\sqrt{5})/2.\) From this we compute \[\operatorname{Eig}_{\mathbf{A}}(\lambda_1)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix}\right\}\qquad \text{and} \qquad \operatorname{Eig}_{\mathbf{A}}(\lambda_2)=\operatorname{span}\left\{\begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}\right\}.\]
Let \(\mathbf{e}=(\vec{e}_1,\vec{e}_2)\) denote the standard basis of \(\mathbb{R}^2\) and consider the ordered basis \[\mathbf{b}=\left(\begin{pmatrix} 1 \\ \lambda_1 \end{pmatrix},\begin{pmatrix} 1 \\ \lambda_2 \end{pmatrix}\right)\] of eigenvectors of \(f_\mathbf{A}.\) We have \[\mathbf{M}(f_\mathbf{A},\mathbf{b},\mathbf{b})=\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2\end{pmatrix}=\mathbf{D}\] and the change of base matrix is \[\mathbf{C}=\mathbf{C}(\mathbf{b},\mathbf{e})=\begin{pmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{pmatrix}\] and \[\mathbf{C}^{-1} =\mathbf{C}(\mathbf{e},\mathbf{b})=\frac{1}{\lambda_2-\lambda_1}\begin{pmatrix} \lambda_2 & -1 \\ -\lambda_1 & 1 \end{pmatrix}.\] Therefore \(\mathbf{A}=\mathbf{C}\mathbf{D}\mathbf{C}^{-1}\) and hence \(\mathbf{A}^n=\mathbf{C}\mathbf{D}^n\mathbf{C}^{-1}\) so that \[\mathbf{A}^n=\frac{1}{\lambda_2-\lambda_1}\begin{pmatrix} 1 & 1 \\ \lambda_1 & \lambda_2 \end{pmatrix}\begin{pmatrix} \lambda_1^n & 0 \\ 0 & \lambda_2^n \end{pmatrix}\begin{pmatrix} \lambda_2 & -1 \\ -\lambda_1 & 1 \end{pmatrix}=\begin{pmatrix} \xi_{n-1} & \xi_n \\ \xi_n & \xi_{n+1}\end{pmatrix}.\] This yields the formula \[\xi_n=\frac{\lambda_1^n-\lambda_2^n}{\lambda_1-\lambda_2}.\]
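The closed formula can be checked against the recursion; here is a short sketch in Python with numpy (floating-point arithmetic, so the result is rounded before comparing):

```python
import numpy as np

l1 = (1 + np.sqrt(5)) / 2       # lambda_1
l2 = (1 - np.sqrt(5)) / 2       # lambda_2
C = np.array([[1.0, 1.0], [l1, l2]])

def xi_closed(n):
    """xi_n read off as the (1,2)-entry of A^n = C D^n C^{-1}."""
    An = C @ np.diag([l1**n, l2**n]) @ np.linalg.inv(C)
    return An[0, 1]

xi = [0, 1]                     # xi_0 = 0, xi_1 = 1
for _ in range(15):
    xi.append(xi[-2] + xi[-1])
assert all(round(xi_closed(n)) == xi[n] for n in range(1, 16))
```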
Let \(g : V \to V\) be an endomorphism of a finite dimensional \(\mathbb{K}\)-vector space \(V\) of dimension \(n\geqslant 1.\)
(i) Let \(\lambda\) be an eigenvalue of \(g.\) Then its algebraic multiplicity is at least as big as its geometric multiplicity.
(ii) If \(\mathbb{K}=\mathbb{C},\) then \(g\) is diagonalisable if and only if for every eigenvalue of \(g\) the algebraic and geometric multiplicities coincide.
Let \(V\) be a \(\mathbb{K}\)-vector space of dimension \(n \in \mathbb{N}\) and \(g : V \to V\) an endomorphism. Furthermore, let \(U\subset V\) be a subspace of dimension \(1\leqslant m\leqslant n\) and \(\mathbf{b}\) an ordered basis of \(U\) and \(\mathbf{c}=(\mathbf{b},\mathbf{b}^{\prime})\) an ordered basis of \(V.\) Then \(U\) is stable under \(g\) if and only if the matrix \(\mathbf{A}=\mathbf{M}(g,\mathbf{c},\mathbf{c})\) has the form \[\mathbf{A}=\begin{pmatrix} \hat{\mathbf{A}} & \ast \\ \mathbf{0}_{n-m,m} & \ast \end{pmatrix}\] for some matrix \(\hat{\mathbf{A}} \in M_{m,m}(\mathbb{K}).\) In the case where \(U\) is stable under \(g,\) we have \(\hat{\mathbf{A}}=\mathbf{M}(g|_U,\mathbf{b},\mathbf{b}) \in M_{m,m}(\mathbb{K}).\)
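Concretely, for \(V=\mathbb{K}^n,\) \(\mathbf{c}\) the standard basis and \(U=\operatorname{span}\{\vec{e}_1,\ldots,\vec{e}_m\},\) stability of \(U\) is visible as the vanishing lower-left block. A small numerical sketch (the matrix is an arbitrary example):

```python
import numpy as np

m, n = 2, 4
A = np.array([[1.0, 2.0, 5.0, 0.0],
              [3.0, 4.0, 1.0, 2.0],
              [0.0, 0.0, 6.0, 1.0],
              [0.0, 0.0, 0.0, 7.0]])
assert np.allclose(A[m:, :m], 0.0)   # zero (n-m) x m block in the lower left
# hence U = span{e_1, ..., e_m} is stable: A e_j has no components past position m
for j in range(m):
    e = np.zeros(n)
    e[j] = 1.0
    assert np.allclose((A @ e)[m:], 0.0)
A_hat = A[:m, :m]                    # the matrix of the restriction g|_U
```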
Proof. (i) Let \(\lambda\) be an eigenvalue of \(g\) with geometric multiplicity \(m=\dim\operatorname{Eig}_{g}(\lambda).\) The eigenspace \(U=\operatorname{Eig}_{g}(\lambda)\) is stable under \(g\) with \(g|_U=\lambda\,\mathrm{Id}_U.\) Choosing an ordered basis \(\mathbf{b}\) of \(U\) and extending it to an ordered basis \(\mathbf{c}=(\mathbf{b},\mathbf{b}^{\prime})\) of \(V,\) the preceding statement shows that \(\mathbf{A}=\mathbf{M}(g,\mathbf{c},\mathbf{c})\) has the block form above with \(\hat{\mathbf{A}}=\lambda\mathbf{1}_m.\) Expanding the determinant along the first columns, we obtain \(\operatorname{char}_g(x)=\det\left(x\mathbf{1}_n-\mathbf{A}\right)=(x-\lambda)^m q(x)\) for some polynomial \(q,\) hence the algebraic multiplicity of \(\lambda\) is at least \(m.\)

(ii) Suppose \(\mathbb{K}=\mathbb{C}\) and that \(g :V \to V\) is diagonalisable. Hence we have an ordered basis \((v_1,\ldots,v_n)\) of \(V\) consisting of eigenvectors of \(g.\) Therefore, \[\operatorname{char}_g(x)=\prod_{i=1}^n(x-\lambda_i)\] where \(\lambda_i\) is the eigenvalue of the eigenvector \(v_i,\) \(1\leqslant i \leqslant n.\) For any eigenvalue \(\lambda_j,\) its algebraic multiplicity is the number of indices \(i\) with \(\lambda_i=\lambda_j.\) For each such index \(i,\) the eigenvector \(v_i\) satisfies \(g(v_i)=\lambda_iv_i=\lambda_j v_i\) and hence is an element of the eigenspace \(\operatorname{Eig}_{g}(\lambda_j).\) These \(v_i\) are linearly independent, so the geometric multiplicity of \(\lambda_j\) is at least as big as its algebraic multiplicity; by (i) it is also at most as big, hence the two multiplicities are equal.
Conversely, suppose that for every eigenvalue of \(g\) the algebraic and geometric multiplicities coincide. Since \(\mathbb{K}=\mathbb{C},\) the sum of the algebraic multiplicities of the eigenvalues of \(g\) equals \(n,\) hence so does the sum of the dimensions of the eigenspaces. Since the eigenspaces of \(g\) are in direct sum, concatenating ordered bases of the eigenspaces yields \(n\) linearly independent vectors, that is, an ordered basis of \(V\) consisting of eigenvectors of \(g.\) Hence \(g\) is diagonalisable.
6.6 Special endomorphisms
6.6.1 Involutions
A mapping \(\iota : \mathcal{X} \to \mathcal{X}\) from a set \(\mathcal{X}\) into itself is called an involution if \(\iota \circ \iota=\mathrm{Id}_\mathcal{X}.\) In the case where \(\mathcal{X}\) is a vector space and \(\iota\) is linear, \(\iota\) is called a linear involution.
Let \(V\) be a \(\mathbb{K}\)-vector space. Then the identity mapping \(\mathrm{Id}_V : V \to V\) is a linear involution.
For all \(n \in \mathbb{N},\) the transpose mapping \(M_{n,n}(\mathbb{K}) \to M_{n,n}(\mathbb{K})\) is a linear involution.
For \(n \in \mathbb{N},\) let \(\mathcal{X}\) denote the set of invertible \(n\times n\) matrices. Then matrix inversion \(\mathcal{X} \to \mathcal{X},\ \mathbf{A}\mapsto \mathbf{A}^{-1},\) is an involution. Notice that \(\mathcal{X}\) is not a vector space.
For any \(\mathbb{K}\)-vector space \(V,\) the mapping \(\iota : V \to V,\ v \mapsto -v\) is a linear involution. Considering \(\mathsf{F}(I,\mathbb{K}),\) the \(\mathbb{K}\)-vector space of functions on an interval \(I\subset \mathbb{R}\) that is symmetric about the origin, we obtain a linear involution of \(\mathsf{F}(I,\mathbb{K})\) by sending a function \(f\) to \(f\circ \iota,\) where now \(\iota : I \to I,\ x \mapsto -x.\)
If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) satisfies \(\mathbf{A}^2=\mathbf{1}_{n},\) then \(f_\mathbf{A}: \mathbb{K}^n \to \mathbb{K}^n\) is a linear involution.
The spectrum of a linear involution is a subset of \(\{-1,1\}\); more precisely:
Let \(V\) be a \(\mathbb{K}\)-vector space and \(\iota : V \to V\) a linear involution. Then the spectrum of \(\iota\) is contained in \(\{-1,1\}.\) Moreover \(V=\operatorname{Eig}_{\iota}(1)\oplus \operatorname{Eig}_{\iota}(-1)\) and \(\iota\) is diagonalisable.
Proof. Let \(v \in V\) be an eigenvector of \(\iota\) with eigenvalue \(\lambda.\) Then \(v=\iota(\iota(v))=\lambda^2 v,\) so \(\lambda^2=1\) and hence \(\lambda \in \{-1,1\}.\) For \(v \in V\) we write \[v=\underbrace{\frac{1}{2}(v+\iota(v))}_{\in \operatorname{Eig}_{\iota}(1)}+\underbrace{\frac{1}{2}(v-\iota(v))}_{\in \operatorname{Eig}_{\iota}(-1)}.\] Since eigenspaces for distinct eigenvalues are in direct sum, it follows that \(V=\operatorname{Eig}_{\iota}(1)\oplus \operatorname{Eig}_\iota(-1).\) Take an ordered basis \(\mathbf{b}_+\) of \(\operatorname{Eig}_\iota(1)\) and an ordered basis \(\mathbf{b}_-\) of \(\operatorname{Eig}_\iota(-1).\) Then \((\mathbf{b}_+,\mathbf{b}_-)\) is an ordered basis of \(V\) consisting of eigenvectors of \(\iota,\) hence \(\iota\) is diagonalisable.
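For the transpose involution on \(M_{n,n}(\mathbb{R})\) from the examples above, the decomposition in the proof is the familiar splitting of a matrix into its symmetric and antisymmetric parts. A brief sketch in Python with numpy (the random matrix is just a test case):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
S = (A + A.T) / 2    # component in Eig(1): symmetric matrices
N = (A - A.T) / 2    # component in Eig(-1): antisymmetric matrices
assert np.allclose(S, S.T)
assert np.allclose(N, -N.T)
assert np.allclose(A, S + N)
```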
6.6.2 Projections
A linear mapping \(\Pi : V \to V\) satisfying \(\Pi\circ \Pi=\Pi\) is called a projection.
Consider \(V=\mathbb{R}^3\) and \[\mathbf{A}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.\] Clearly, \(\mathbf{A}^2=\mathbf{A}\) and \(f_\mathbf{A}: \mathbb{R}^3 \to \mathbb{R}^3\) projects a vector \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\) onto the plane \(\{\vec{x} \in \mathbb{R}^3 | x_3=0\}.\)
In a sense there is only one type of projection. Recall from the exercises that for a projection \(\Pi : V \to V,\) we have \(V=\operatorname{Ker}\Pi \oplus \operatorname{Im}\Pi.\) Given two subspaces \(U_1,U_2\) of \(V\) such that \(V=U_1 \oplus U_2,\) there is a projection \(\Pi : V \to V\) whose kernel is \(U_1\) and whose image is \(U_2.\) Indeed, every vector \(v\in V\) can be written as \(v=u_1+u_2\) for unique vectors \(u_i \in U_i\) for \(i=1,2.\) Hence we obtain a projection by defining \(\Pi(v)=u_2\) for all \(v \in V.\)
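In a basis adapted to the splitting \(V=U_1\oplus U_2,\) this projection is simply diagonal with entries \(0\) on \(U_1\) and \(1\) on \(U_2.\) A numerical sketch (the two subspaces of \(\mathbb{R}^3\) are an arbitrary choice):

```python
import numpy as np

u1 = np.array([[1.0], [1.0], [0.0]])                   # basis of U_1 (the kernel)
U2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])    # basis of U_2 (the image)
C = np.hstack([u1, U2])         # the columns form a basis of R^3
D = np.diag([0.0, 1.0, 1.0])    # 0 on U_1, 1 on U_2
P = C @ D @ np.linalg.inv(C)
assert np.allclose(P @ P, P)      # P is a projection
assert np.allclose(P @ u1, 0.0)   # U_1 lies in the kernel
assert np.allclose(P @ U2, U2)    # U_2 is fixed pointwise
```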
Denote by \(\mathcal{X}\) the set of projections from \(V\) to \(V\) and by \(\mathcal{Y}\) the set of pairs \((U_1,U_2)\) of subspaces of \(V\) that are in direct sum and satisfy \(V=U_1\oplus U_2.\) Then we obtain a mapping \(\Lambda : \mathcal{X} \to \mathcal{Y}\) defined by \(\Pi \mapsto (\operatorname{Ker}\Pi,\operatorname{Im}\Pi).\)
Let \(V\) be a \(\mathbb{K}\)-vector space and \(\Pi : V \to V\) a projection. Then the spectrum of \(\Pi\) is contained in \(\{0,1\}.\) Moreover \(V=\operatorname{Eig}_{\Pi}(0)\oplus \operatorname{Eig}_{\Pi}(1),\) \(\Pi\) is diagonalisable and \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\)
Proof. Let \(v \in V\) be an eigenvector of the projection \(\Pi\) with eigenvalue \(\lambda.\) Then \(\Pi(\Pi(v))=\lambda^2 v\) and \(\Pi(\Pi(v))=\Pi(v)=\lambda v,\) or equivalently, \(\lambda(\lambda-1)v=0_V.\) Since \(v\) is non-zero, it follows that \(\lambda=0\) or \(\lambda=1.\) Since \(\Pi\) is a projection, we have \(V=\operatorname{Ker}\Pi\oplus \operatorname{Im}\Pi.\) Since \(\operatorname{Ker}\Pi=\operatorname{Eig}_\Pi(0),\) we thus only need to show that \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\) Let \(v \in \operatorname{Im}\Pi\) so that \(v=\Pi(\hat{v})\) for some vector \(\hat{v} \in V.\) Hence \(\Pi(v)=\Pi(\Pi(\hat{v}))=\Pi(\hat{v})=v\) and \(v\) is an eigenvector with eigenvalue \(1.\) Conversely, suppose \(v \in V\) is an eigenvector of \(\Pi\) with eigenvalue \(1.\) Then \(\Pi(v)=v\) and hence \(v \in \operatorname{Im}\Pi.\) We thus conclude that \(\operatorname{Im}\Pi=\operatorname{Eig}_{\Pi}(1).\) Choosing an ordered basis of \(\operatorname{Ker}\Pi\) and an ordered basis of \(\operatorname{Im}\Pi\) gives a basis of \(V\) consisting of eigenvectors, hence \(\Pi\) is diagonalisable.
Exercises
Derive the formula (6.3) for the coefficients \(b_j.\)
Solution
We first note that for \(k\geqslant 1\) we have \[x^k-x_0^k = (x-x_0)\sum_{j=0}^{k-1}x^jx_0^{k-1-j},\] which can be checked by a direct computation. Using this expression we find \[\begin{aligned}f(x)-f(x_0) & = \sum_{k=0}^na_k(x^k-x_0^k)\\ & = (x-x_0)\sum_{k=0}^n\sum_{j=0}^{k-1}a_kx^jx_0^{k-1-j}\\ & = (x-x_0)\sum_{j=0}^{n-1}\sum_{k=j+1}^na_kx_0^{k-1-j}x^j\\ & = (x-x_0)\sum_{j=0}^{n-1}\bigg(\underbrace{\sum_{k=0}^{n-j-1}a_{k+j+1}x_0^{k}}_{=b_j}\bigg)x^j \end{aligned}\] and the claim follows.
Show that \(\Lambda\) is a bijection.
Solution
We need to show that \(\Lambda:\mathcal X\to\mathcal Y\) is both injective and surjective. Injectivity: Let \(\Pi_1,\Pi_2\in \mathcal X\) be two distinct projections, so there is a vector \(v\in V\) with \(\Pi_1(v)\ne \Pi_2(v).\) Writing \(v=u_1+u_2\) with \(u_1\in\operatorname{Ker}\Pi_1\) and \(u_2\in\operatorname{Im}\Pi_1,\) we must have \(\Pi_1(u_1)\ne\Pi_2(u_1)\) or \(\Pi_1(u_2)\ne\Pi_2(u_2).\) Hence we may assume that \(v\in\operatorname{Ker}\Pi_1\) or \(v\in\operatorname{Im}\Pi_1\):
If \(v\in \operatorname{Ker}\Pi_1,\) then \(\Pi_2(v)\ne\Pi_1(v)=0_V,\) so \(v\notin \operatorname{Ker}\Pi_2.\) Therefore \(\operatorname{Ker}\Pi_1\ne\operatorname{Ker}\Pi_2\) and hence \(\Lambda(\Pi_1)\ne\Lambda(\Pi_2).\)
If \(v\in\operatorname{Im}\Pi_1,\) then \(\Pi_1(v)=v\ne \Pi_2(v).\) If we had \(v\in\operatorname{Im}\Pi_2,\) then \(\Pi_2(v)=v,\) a contradiction. Therefore \(v\notin\operatorname{Im}\Pi_2,\) so \(\operatorname{Im}\Pi_1\ne\operatorname{Im}\Pi_2\) and \(\Lambda(\Pi_1)\ne\Lambda(\Pi_2).\)
Surjectivity: This is exactly the construction given above. If \((U_1,U_2)\in\mathcal Y,\) then every \(v\in V\) can be written uniquely as \(v=u_1+u_2,\) where \(u_1\in U_1\) and \(u_2\in U_2.\) Let \(\Pi:V\to V\) be defined by \(\Pi(v)=u_2.\) Then \(\Pi\) is linear, \(\Pi(\Pi(v)) = \Pi(u_2) = \Pi(0_V+u_2) = u_2 = \Pi(v),\) and \(\operatorname{Ker}\Pi=U_1,\) \(\operatorname{Im}\Pi=U_2,\) so \(\Lambda(\Pi)=(U_1,U_2).\)
Show that if \(\Pi : V \to V\) is a projection then \(\mathrm{Id}_V - \Pi : V \to V\) is a projection with kernel equal to the image of \(\Pi\) and image equal to the kernel of \(\Pi.\)
Solution
We first show that \(\mathrm{Id}_V-\Pi\) is a projection: for all \(v \in V\) we have \[\begin{aligned}((\mathrm{Id}_V-\Pi)\circ(\mathrm{Id}_V-\Pi))(v) & = (\mathrm{Id}_V-\Pi)(v-\Pi(v))\\ & = v-\Pi(v)-\Pi(v-\Pi(v))\\ & = v-\Pi(v)-\Pi(v)+\Pi(v)\\ & = (\mathrm{Id}_V-\Pi)(v). \end{aligned}\] We show \(\operatorname{Ker}(\mathrm{Id}_V-\Pi)=\operatorname{Im}\Pi\) by showing both inclusions. Let \(v\in \operatorname{Ker}(\mathrm{Id}_V-\Pi).\) Then \(v-\Pi(v)=0_V,\) that is, \(v=\Pi(v),\) which shows that \(v\in \operatorname{Im}\Pi.\) For the reverse inclusion, let \(w\in \operatorname{Im}\Pi.\) Then there exists \(v\in V\) such that \(\Pi(v) = w,\) and \[(\mathrm{Id}_V-\Pi)(w) = w-\Pi(w) = \Pi(v)-\Pi(\Pi(v))= 0_V,\] hence \(w\in \operatorname{Ker}(\mathrm{Id}_V-\Pi).\)
In order to show \(\operatorname{Im}(\mathrm{Id}_V-\Pi)=\operatorname{Ker}\Pi,\) we apply the above argument to the projection \(\widetilde \Pi = \mathrm{Id}_V-\Pi\) to conclude that \(\operatorname{Ker}(\mathrm{Id}_V-\widetilde \Pi) = \operatorname{Im}\widetilde \Pi\) and thus \[\operatorname{Im}(\mathrm{Id}_V-\Pi) = \operatorname{Im}\widetilde \Pi = \operatorname{Ker}(\mathrm{Id}_V-\widetilde\Pi) = \operatorname{Ker}\Pi.\]
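The statement is easy to confirm numerically, e.g. for the projection onto the plane \(x_3=0\) from the earlier example (a sketch):

```python
import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])   # projection onto the plane x_3 = 0
Q = np.eye(3) - P
assert np.allclose(Q @ Q, Q)      # Id - P is again a projection
e3 = np.array([0.0, 0.0, 1.0])
assert np.allclose(Q @ e3, e3)    # Ker P (the x_3-axis) lies in Im(Id - P)
assert np.allclose(P @ (Q @ np.ones(3)), 0.0)   # Im(Id - P) lies in Ker P
```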