9.2 Symmetric bilinear forms

We now restrict to the case \(\mathbb{K}=\mathbb{R}.\) The familiar notion of perpendicularity is generalised by the following definition of orthogonality:

Definition 9.12 • Orthogonal vectors

Let \(V\) be an \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle.\) Two vectors \(v_1,v_2 \in V\) are called orthogonal with respect to \(\langle\cdot{,}\cdot\rangle\) if \(\langle v_1,v_2\rangle=0.\) We write \(v_1\perp v_2\) if the vectors \(v_1,v_2 \in V\) are orthogonal. A subset \(S\subset V\) is called orthogonal with respect to \(\langle\cdot{,}\cdot\rangle\) if all pairs of distinct vectors of \(S\) are orthogonal with respect to \(\langle\cdot{,}\cdot\rangle.\) A basis of \(V\) which is also an orthogonal subset is called an orthogonal basis.

Example 9.13

  1. Perpendicular vectors in \(\mathbb{R}^n\) are orthogonal with respect to the standard scalar product defined by the rule (9.1).

  2. Example 9.7 continued: As we computed above, the vectors \(\vec{v}_1=\vec{e}_1+\vec{e}_2\) and \(\vec{v}_2=\vec{e}_2-\vec{e}_1\) satisfy \(\langle \vec{v}_1,\vec{v}_2\rangle_\mathbf{A}=0\) and hence are orthogonal with respect to \(\langle\cdot{,}\cdot\rangle_\mathbf{A}.\)
  3. Example 9.2 (vi) continued: Let \(f_1 \in V\) be the function \(x \mapsto x\) and \(f_3 \in V\) be the function \(x\mapsto \frac{1}{2}(5x^3-3x).\) Then \[\langle f_1,f_3\rangle=\int_{-1}^1 x\,\frac{1}{2}(5x^3-3x)\mathrm{d}x=\left.\frac{1}{2}\left(x^5-x^3\right)\right|_{-1}^1=0,\] so that \(f_1\) and \(f_3\) are orthogonal with respect to \(\langle\cdot{,}\cdot\rangle.\)
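This computation in part 3 is easy to confirm numerically. A minimal sketch in Python, assuming scipy is available (the helper name `form` is ours, not from the text):

```python
from scipy.integrate import quad

# The bilinear form of Example 9.2 (vi): <f, g> = integral of f*g over [-1, 1].
def form(f, g):
    value, _ = quad(lambda x: f(x) * g(x), -1.0, 1.0)
    return value

f1 = lambda x: x
f3 = lambda x: 0.5 * (5 * x**3 - 3 * x)

print(form(f1, f3))  # ~0.0 up to quadrature error, so f1 and f3 are orthogonal
```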
Definition 9.14 • Orthonormal vectors

Let \(V\) be an \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle.\) A subset \(\mathcal{S}\subset V\) is called orthonormal with respect to \(\langle\cdot{,}\cdot\rangle\) if \(\mathcal{S}\) is orthogonal with respect to \(\langle\cdot{,}\cdot\rangle\) and if for all vectors \(v \in \mathcal{S}\) we have \(\langle v,v\rangle=1.\) A basis of \(V\) which is also an orthonormal subset is called an orthonormal basis.

Remark 9.15

  • Often when \(\langle\cdot{,}\cdot\rangle\) is clear from the context we will simply speak of orthogonal or orthonormal vectors without explicitly mentioning \(\langle\cdot{,}\cdot\rangle.\)

  • Notice that an ordered basis \(\mathbf{b}\) of \(V\) is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle\) if and only if \[\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})=\mathbf{1}_{n},\] where \(n=\dim V.\)

Example 9.16

  1. The standard basis \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) of \(\mathbb{R}^n\) satisfies \[\vec{e}_i\cdot \vec{e}_j=\delta_{ij}\] and hence is an orthonormal basis with respect to the standard scalar product on \(\mathbb{R}^n.\)

  2. Example 9.2 (vi) continued: Let \(\mathcal{S}=\{f_1,f_2,f_3\}\subset \mathsf{C}([-1,1],\mathbb{R})\) be the subset defined by the functions \[f_1 : x \mapsto \sqrt{\frac{3}{2}}x, \qquad f_2 : x \mapsto \frac{1}{2}\sqrt{\frac{5}{2}}(3x^2-1),\qquad f_3 : x \mapsto \frac{1}{2}\sqrt{\frac{7}{2}}(5x^3-3x).\] Then \(\mathcal{S}\) is orthonormal with respect to \(\langle\cdot{,}\cdot\rangle\) as can be verified by direct computation.
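The direct computation amounts to checking that the Gram matrix of \(\mathcal{S}\) is the \(3\times 3\) identity matrix (cf. Remark 9.15). A numerical sketch, under the same assumptions as the previous one:

```python
import numpy as np
from scipy.integrate import quad

# <f, g> = integral of f*g over [-1, 1], as in Example 9.2 (vi).
form = lambda f, g: quad(lambda x: f(x) * g(x), -1.0, 1.0)[0]

fs = [
    lambda x: np.sqrt(3 / 2) * x,
    lambda x: 0.5 * np.sqrt(5 / 2) * (3 * x**2 - 1),
    lambda x: 0.5 * np.sqrt(7 / 2) * (5 * x**3 - 3 * x),
]

# An orthonormal set has Gram matrix equal to the identity (Remark 9.15).
gram = np.array([[form(f, g) for g in fs] for f in fs])
print(np.round(gram, 10))  # the 3x3 identity matrix, up to quadrature error
```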

Given a subspace \(U\subset V,\) its orthogonal subspace consists of all vectors in \(V\) that are orthogonal to all vectors of \(U.\)

Definition 9.17 • Orthogonal subspace

Let \(V\) be an \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle\) and \(U\subset V\) a subspace. The set \[U^{\perp}=\left\{v \in V| \langle v,u\rangle=0\;\;\forall u \in U\right\}\] is called the orthogonal subspace to \(U\).

Remark 9.18

  • It is common to write \(\langle v,U\rangle=0\) instead of \(\langle v,u\rangle=0\;\forall u \in U.\)

  • Notice that the orthogonal subspace is indeed a subspace. The bilinearity of \(\langle\cdot{,}\cdot\rangle\) implies that \(\langle 0_V,u\rangle=0\) for all \(u \in U,\) hence \(0_V \in U^{\perp}\) and \(U^{\perp}\) is non-empty. Moreover, if \(v_1,v_2 \in U^{\perp},\) then we have for all \(u \in U\) and all \(s_1,s_2 \in \mathbb{R}\) \[\langle s_1v_1+s_2 v_2,u\rangle=s_1\langle v_1,u\rangle+s_2\langle v_2,u\rangle=0\] where we use the bilinearity of \(\langle\cdot{,}\cdot\rangle\) and that \(v_1,v_2 \in U^{\perp}.\) By Definition 3.21 it follows that \(U^{\perp}\) is indeed a subspace.
  • Notice also that a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle\) on \(V\) is non-degenerate if and only if \(V^{\perp}=\{0_V\}.\)

Example 9.19

  1. Let \(\mathbb{R}^3\) be equipped with the standard scalar product. If \(U\) is a line through the origin in \(\mathbb{R}^3,\) then \(U^{\perp}\) is the plane through the origin that is perpendicular to \(U,\) see Figure 9.1.

  2. Example 9.2 (iv) continued: Let \(U=\left\{s\mathbf{1}_{n} | s\in \mathbb{R}\right\}.\) Then \[U^{\perp}=\left\{ \mathbf{A}\in M_{n,n}(\mathbb{R}) | \operatorname{Tr}(\mathbf{A}s\mathbf{1}_{n})=0\;\;\forall s\in \mathbb{R}\right\}.\] Since \(\operatorname{Tr}(\mathbf{A}s\mathbf{1}_{n})=s\operatorname{Tr}(\mathbf{A}\mathbf{1}_{n})=s\operatorname{Tr}(\mathbf{A}),\) we conclude that the orthogonal subspace to \(U\) consists of the matrices whose trace is zero: \[U^{\perp}=\left\{\mathbf{A}\in M_{n,n}(\mathbb{R}) | \operatorname{Tr}(\mathbf{A})=0\right\}.\]
Figure 9.1: The orthogonal complement of a line through the origin.
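The decomposition behind part 2 of the example can be illustrated numerically: every matrix splits into a multiple of \(\mathbf{1}_{n}\) (in \(U\)) plus a traceless part (in \(U^{\perp}\)). A sketch (the example matrix and all names are our own):

```python
import numpy as np

# <X, Y> = Tr(XY), the bilinear form of Example 9.2 (iv).
form = lambda X, Y: np.trace(X @ Y)

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))

# Split A as a multiple of the identity (in U) plus a traceless part (in U^perp).
u = (np.trace(A) / n) * np.eye(n)
a_perp = A - u

print(round(np.trace(a_perp), 12))         # 0.0: the trace of a_perp vanishes
print(round(form(a_perp, np.eye(n)), 12))  # 0.0: a_perp is orthogonal to 1_n
```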
Previously in Corollary 3.65 we saw that every finite dimensional vector space \(V\) admits a basis. We can now upgrade this fact in the case where \(V\) is equipped with a symmetric bilinear form:
Theorem 9.20 • Existence of an orthogonal basis

Let \(V\) be a finite dimensional \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle.\) Then \(V\) admits an orthogonal basis with respect to \(\langle\cdot{,}\cdot\rangle.\)

For the proof of Theorem 9.20 we need two lemmas.
Lemma 9.21

Let \(V\) be an \(\mathbb{R}\)-vector space and \(\langle\cdot{,}\cdot\rangle\) a symmetric bilinear form on \(V.\) Suppose there exist vectors \(v_1,v_2 \in V\) such that \(\langle v_1,v_2\rangle \neq 0.\) Then there exists a vector \(v \in V\) with \(\langle v,v\rangle \neq 0.\)

Proof. If \(\langle v_1,v_1\rangle\neq 0\) or \(\langle v_2,v_2\rangle\neq 0\) we are done, hence assume \(\langle v_1,v_1\rangle=\langle v_2,v_2\rangle=0.\) Let \(v=v_1+v_2.\) Using the bilinearity and symmetry of \(\langle\cdot{,}\cdot\rangle\) we obtain \[\langle v,v\rangle=\langle v_1+v_2,v_1+v_2\rangle=\langle v_1,v_1\rangle +2 \langle v_1,v_2\rangle +\langle v_2,v_2\rangle=2\langle v_1,v_2\rangle.\] By assumption we have \(\langle v_1,v_2\rangle \neq 0\) and hence also \(\langle v,v\rangle \neq 0.\)
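A concrete instance of this argument, for the form \(\langle\vec{x},\vec{y}\rangle_{\mathbf{A}}=\vec{x}^T\mathbf{A}\vec{y}\) with a matrix \(\mathbf{A}\) of our own choosing:

```python
import numpy as np

A = np.array([[0.0, 1.0], [1.0, 0.0]])  # symmetric, but <e_i, e_i> = 0
form = lambda x, y: x @ A @ y

e1, e2 = np.eye(2)
print(form(e1, e1), form(e2, e2))  # 0.0 0.0: neither basis vector works
print(form(e1, e2))                # 1.0: but the pairing is non-zero
v = e1 + e2
print(form(v, v))                  # 2.0 = 2 <e1, e2>, as in the proof
```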

Lemma 9.22

Let \(V\) be a finite dimensional \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle.\) Suppose \(v \in V\) satisfies \(\langle v,v\rangle\neq 0,\) then \(V=U\oplus U^{\perp}\) where \(U=\{sv | s\in \mathbb{R}\}.\)

Proof. Applying Remark 6.7, we need to show that \(U\cap U^{\perp}=\{0_V\}\) and that \(U+U^{\perp}=V.\)

We first show that \(U\cap U^{\perp}=\{0_V\}.\) Suppose \(u \in U\) and \(u \in U^{\perp}.\) Since \(u \in U\) we have \(u=sv\) for some scalar \(s.\) Since \(u \in U^{\perp}\) we must also have \(0=\langle u,v\rangle=s\langle v,v\rangle.\) Since \(\langle v,v\rangle \neq 0,\) this implies \(s=0\) and hence \(u=0_V.\)

We next show that \(U+U^{\perp}=V.\) Let \(w \in V.\) We want to write \(w=sv+\hat{v}\) for some \(\hat{v}\) satisfying \(\langle\hat{v},v\rangle=0.\) Since \(\hat{v}=w-sv,\) this condition becomes \[0=\langle v,w-sv\rangle=\langle v,w\rangle-s\langle v,v\rangle,\] and since \(\langle v,v\rangle \neq 0,\) this gives \(s=\frac{\langle v,w\rangle}{\langle v,v\rangle}.\) Taking \[\hat{v}=w-\frac{\langle v,w\rangle}{\langle v,v\rangle} v\] thus gives \(w=sv+\hat{v}\) with \(\hat{v}\in U^{\perp},\) as required.
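The proof is constructive, so \(s\) and \(\hat{v}\) can be computed explicitly. A sketch with a randomly generated symmetric form (all names and choices are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
B = rng.standard_normal((n, n))
A = B + B.T                           # symmetric matrix defining <x, y> = x^T A y
form = lambda x, y: x @ A @ y

v = rng.standard_normal(n)
assert abs(form(v, v)) > 1e-12        # the lemma's hypothesis: <v, v> != 0

w = rng.standard_normal(n)
s = form(v, w) / form(v, v)           # the coefficient from the proof
v_hat = w - s * v                     # candidate U^perp component

print(round(form(v_hat, v), 12))      # 0.0: v_hat lies in U^perp
print(np.allclose(w, s * v + v_hat))  # True: w = s*v + v_hat
```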

Proof of Theorem 9.20. Let \(n=\dim V.\) Suppose \(\langle\cdot{,}\cdot\rangle\) is degenerate and consider \(V^{\perp}.\) By Corollary 6.11 there exists a subspace \(V^{\prime}\subset V\) such that \(V=V^{\perp}\oplus V^{\prime}.\) By construction, the restriction of \(\langle\cdot{,}\cdot\rangle\) onto \(V^{\prime}\) is non-degenerate: if \(v^{\prime} \in V^{\prime}\) were orthogonal to all vectors of \(V^{\prime},\) then, since every vector of \(V^{\perp}\) is orthogonal to all of \(V,\) the vector \(v^{\prime}\) would be orthogonal to all of \(V=V^{\perp}\oplus V^{\prime}\) and hence lie in \(V^{\perp}\cap V^{\prime}=\{0_V\}.\) If \(v_1,\ldots,v_m\) is an orthogonal basis of \(V^{\prime}\) and \(v_{m+1},\ldots,v_n\) a basis of \(V^{\perp},\) then \(\{v_1,\ldots,v_n\}\) is an orthogonal basis of \(V,\) since the vectors \(v_{m+1},\ldots,v_{n}\) are orthogonal to all vectors of \(V.\)

It is thus sufficient to prove the existence of an orthogonal basis for the case when \(\langle\cdot{,}\cdot\rangle\) is non-degenerate.

Let us therefore assume that \(\langle\cdot{,}\cdot\rangle\) is non-degenerate on \(V.\) We are going to prove the statement by using induction on the dimension of the vector space. If \(\dim V=0\) there is nothing to show, which settles the base case of the induction. We will argue next that if every \((n-1)\)-dimensional \(\mathbb{R}\)-vector space equipped with a non-degenerate symmetric bilinear form admits an orthogonal basis, then so does every \(n\)-dimensional \(\mathbb{R}\)-vector space equipped with a non-degenerate symmetric bilinear form.

Let \(v_1\in V\) be any non-zero vector. Since \(\langle\cdot{,}\cdot\rangle\) is non-degenerate, \(v_1\) cannot be orthogonal to all vectors of \(V\) and hence there exists a vector \(v_2 \in V\) such that \(\langle v_1,v_2\rangle \neq 0.\) Therefore, by Lemma 9.21 there exists a non-zero vector \(v \in V\) with \(\langle v,v\rangle \neq 0.\) Writing \(U=\left\{sv | s\in \mathbb{R}\right\},\) we have that \(V=U\oplus U^{\perp}\) by Lemma 9.22. Since \(\dim U=1,\) we must have \(\dim U^{\perp}=n-1\) by Proposition 6.12.

The restriction of \(\langle\cdot{,}\cdot\rangle\) onto \(U^{\perp}\) is non-degenerate. Indeed, if there were a vector in \(U^{\perp}\) which is orthogonal to all vectors in \(U^{\perp},\) then – since it lies in \(U^{\perp}\) – it is also orthogonal to all vectors of \(U\) and hence to all vectors of \(V.\) This contradicts the assumption that \(\langle\cdot{,}\cdot\rangle\) is non-degenerate on \(V.\) Since the restriction of \(\langle\cdot{,}\cdot\rangle\) on \(U^{\perp}\) is non-degenerate and \(\dim U^{\perp}=n-1,\) the induction hypothesis implies that there exists a basis \(\{w_2,\ldots,w_{n}\}\) of \(U^{\perp}\) which is orthogonal with respect to \(\langle\cdot{,}\cdot\rangle.\) Setting \(w_1=v\) gives an orthogonal basis \(\{w_1,w_2,\ldots,w_n\}\) of \(V.\)
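The induction above is effectively an algorithm. The following sketch (our own code, assuming for simplicity that the form is non-degenerate, so the reduction step is not needed) mirrors the proof: find \(v\) with \(\langle v,v\rangle\neq 0\) via Lemma 9.21, split off \(U^{\perp}\) via Lemma 9.22, and recurse:

```python
import numpy as np

def orthogonal_basis(A, tol=1e-10):
    """Orthogonal basis of R^n for <x, y> = x^T A y, with A symmetric and
    assumed non-degenerate. A sketch of the recursion in the proof of
    Theorem 9.20, not a numerically robust routine."""
    form = lambda x, y: x @ A @ y

    def recurse(vecs):
        if not vecs:
            return []
        # Find v in the current subspace with <v, v> != 0: try the spanning
        # vectors first; otherwise combine two of them as in Lemma 9.21.
        v = next((b for b in vecs if abs(form(b, b)) > tol), None)
        if v is None:
            v1 = vecs[0]
            v2 = next(b for b in vecs[1:] if abs(form(v1, b)) > tol)
            v = v1 + v2
        # Project the spanning vectors into U^perp (Lemma 9.22) ...
        proj = [b - (form(v, b) / form(v, v)) * v for b in vecs]
        # ... keep a linearly independent subset, and recurse on it.
        keep = []
        for w in proj:
            if np.linalg.matrix_rank(np.array(keep + [w]), tol=tol) == len(keep) + 1:
                keep.append(w)
        return [v] + recurse(keep)

    return recurse(list(np.eye(A.shape[0])))

A = np.array([[0.0, 1.0], [1.0, 0.0]])
basis = orthogonal_basis(A)
gram = np.array([[b1 @ A @ b2 for b2 in basis] for b1 in basis])
print(np.round(gram, 8))  # diagonal: the basis is orthogonal w.r.t. <.,.>_A
```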

We also have:

Lemma 9.23

Let \(V\) be a finite dimensional \(\mathbb{R}\)-vector space equipped with a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle.\) Furthermore, let \(U\subset V\) be a subspace and \(\{u_1,\ldots,u_k\}\) be a basis of \(U.\) Then the following two statements are equivalent:

  1. a vector \(v \in V\) is an element of \(U^{\perp}\);

  2. for \(1\leqslant i \leqslant k\) we have \(\langle v,u_i\rangle=0.\)

Proof. Exercise.

As a corollary to Theorem 9.20 we obtain a generalisation of Lemma 9.22.
Corollary 9.24

Let \(V\) be a finite dimensional \(\mathbb{R}\)-vector space and \(\langle\cdot{,}\cdot\rangle\) a symmetric bilinear form on \(V.\) Suppose \(U\subset V\) is a subspace such that the restriction of \(\langle\cdot{,}\cdot\rangle\) to \(U\) is non-degenerate. Then \(U\) and \(U^{\perp}\) are in direct sum and we have \[V=U\oplus U^{\perp}.\]

Proof. The proof is similar to that of Lemma 9.22. We first show that \(U\cap U^{\perp}=\{0_V\}.\) Suppose \(u_0 \in U \cap U^{\perp}.\) Recall that \[U^{\perp}=\{v \in V| \langle v,u\rangle=0 \; \forall u \in U\}.\] Since \(u_0 \in U^{\perp}\) we thus have for all \(u \in U\) \[\langle u_0,u\rangle=0.\] Since the restriction of \(\langle\cdot{,}\cdot\rangle\) to \(U\) is non-degenerate, this implies that \(u_0=0_V,\) hence \(U\cap U^{\perp}=\{0_V\}.\)

We next show that \(U+U^{\perp}=V.\) By Theorem 9.20, the subspace \(U\) admits an ordered basis \(\mathbf{b}=(v_1,\ldots,v_k)\) that is orthogonal with respect to \(\langle\cdot{,}\cdot\rangle,\) that is, \(\langle v_i,v_j\rangle =0\) for \(i\neq j.\) In particular, the matrix representation of \(\langle\cdot{,}\cdot\rangle\) with respect to \(\mathbf{b}\) is diagonal and the diagonal entries are given by \(\langle v_i,v_i\rangle\) for \(1\leqslant i\leqslant k.\) By Proposition 5.24 we have \[\det\mathbf{M}(\langle\cdot{,}\cdot\rangle|_U,\mathbf{b})=\prod_{i=1}^k \langle v_i,v_i\rangle,\] where \(\langle\cdot{,}\cdot\rangle|_U\) denotes the restriction of \(\langle\cdot{,}\cdot\rangle\) onto \(U\times U.\) Since \(\langle\cdot{,}\cdot\rangle|_U\) is non-degenerate, we have \(\det\mathbf{M}(\langle\cdot{,}\cdot\rangle|_U,\mathbf{b}) \neq 0\) by Proposition 9.10, hence \(\langle v_i,v_i\rangle \neq 0\) for \(1\leqslant i\leqslant k.\)

Finally, we argue that any vector \(w \in V\) can be written as \(w=\hat{v}+\sum_{i=1}^k s_i v_i\) for a suitable vector \(\hat{v} \in U^{\perp}\) and scalars \(s_i.\) As in the proof of Lemma 9.22, we define \[s_i=\frac{\langle v_i,w\rangle}{\langle v_i,v_i\rangle}\] and \(\hat{v}=w-\sum_{i=1}^ks_i v_i.\) Then \(w=\hat{v}+\sum_{i=1}^ks_i v_i\) and moreover \(\langle\hat{v},v_i\rangle=0\) for \(1\leqslant i\leqslant k,\) since \(\langle v_i,v_j\rangle=0\) for \(i\neq j.\) Since \(\mathbf{b}\) is a basis of \(U,\) Lemma 9.23 implies that \(\hat{v}\) is an element of \(U^{\perp}.\)
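The coefficients \(s_i\) give an explicit projection of \(w\) onto \(U\) along \(U^{\perp}\). A small sketch (the form and the subspace are our own choices; note that \(\langle v_2,v_2\rangle<0\) is perfectly allowed):

```python
import numpy as np

A = np.diag([1.0, 1.0, -1.0, 1.0])  # a non-degenerate symmetric form
form = lambda x, y: x @ A @ y

# An orthogonal basis of U with <v_i, v_i> != 0 (here <v2, v2> = -3).
v1 = np.array([1.0, 0.0, 0.0, 0.0])
v2 = np.array([0.0, 1.0, 2.0, 0.0])

w = np.array([1.0, 2.0, 3.0, 4.0])
s = [form(v, w) / form(v, v) for v in (v1, v2)]
v_hat = w - s[0] * v1 - s[1] * v2

# By Lemma 9.23, testing against the basis of U suffices:
print(round(form(v_hat, v1), 12), round(form(v_hat, v2), 12))  # 0.0 0.0
```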
Remark 9.25

In the case where the restriction of a symmetric bilinear form to a subspace \(U\) is non-degenerate, we have seen that \(U^{\perp}\) is a complement to \(U.\) The subspace \(U^{\perp}\) is called the orthogonal complement of \(U\).

The process of scaling a vector \(v\) so that \(\langle v,v\rangle\) equals some specific value – typically \(1\) – is known as normalising the vector.

Remark 9.26 • Normalisation

By definition, the matrix representation of a symmetric bilinear form \(\langle\cdot{,}\cdot\rangle\) with respect to an ordered orthogonal basis \(\mathbf{b}=(v_1,\ldots,v_n)\) of \(V\) is diagonal. Notice that if we define \[v^{\prime}_i=\left\{\begin{array}{cc} v_i, & \langle v_i,v_i\rangle =0\\ \frac{v_i}{\sqrt{|\langle v_i,v_i\rangle|}}, & \langle v_i,v_i\rangle \neq 0\end{array}\right.\] for \(1\leqslant i\leqslant n,\) then \(\mathbf{b}^{\prime}=(v^{\prime}_1,\ldots,v^{\prime}_n)\) is also an ordered basis of \(V\) and either \(\langle v^{\prime}_i,v^{\prime}_i\rangle=0\) or \[\langle v^{\prime}_i,v^{\prime}_i\rangle=\left\langle \frac{v_i}{\sqrt{|\langle v_i,v_i\rangle|}},\frac{v_i}{\sqrt{|\langle v_i,v_i\rangle|}}\right\rangle=\frac{\langle v_i,v_i\rangle}{|\langle v_i,v_i\rangle|}=\pm 1.\] Therefore, the matrix representation of \(\langle\cdot{,}\cdot\rangle\) with respect to \(\mathbf{b}^{\prime}\) is diagonal as well and the diagonal entries are elements of the set \(\{-1,0,1\}.\)

This observation allows us to reformulate Theorem 9.20:
Theorem 9.27 • Matrix version of Theorem 9.20

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric \(n\times n\)-matrix. Then there exist an invertible matrix \(\mathbf{C}\in \mathrm{GL}(n,\mathbb{R})\) and non-negative integers \(p,q,s\) with \(p+q+s=n\) such that \[\tag{9.5} \mathbf{C}^T\mathbf{A}\mathbf{C}=\begin{pmatrix} \mathbf{1}_{p} && \\ & -\mathbf{1}_{q} & \\ && \mathbf{0}_{s} \end{pmatrix}.\]
Proof. Let \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric \(n\times n\)-matrix and let \(\langle\cdot{,}\cdot\rangle\) denote the symmetric bilinear form on \(\mathbb{R}^n\) defined by the rule \(\langle \vec{x}_1,\vec{x}_2\rangle=\vec{x}_1^T\mathbf{A}\vec{x}_2\) for all \(\vec{x}_1,\vec{x}_2 \in \mathbb{R}^n.\) By Example 9.5 (ii), we have that \(\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{e})=\mathbf{A},\) where \(\mathbf{e}\) denotes the standard ordered basis of \(\mathbb{R}^n.\) Theorem 9.20 implies that \(\mathbb{R}^n\) admits an orthogonal basis with respect to \(\langle\cdot{,}\cdot\rangle.\) After carrying out the normalisation procedure described in Remark 9.26 and possibly renumbering the basis vectors, we thus obtain an ordered basis \(\mathbf{b}\) of \(\mathbb{R}^n\) such that \[\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})=\begin{pmatrix} \mathbf{1}_{p} && \\ & -\mathbf{1}_{q} & \\ && \mathbf{0}_{s} \end{pmatrix}.\] Defining \(\mathbf{C}=\mathbf{C}(\mathbf{b},\mathbf{e}),\) Proposition 9.6 thus implies that \(\mathbf{C}^T\mathbf{A}\mathbf{C}=\mathbf{M}(\langle\cdot{,}\cdot\rangle,\mathbf{b})\) as claimed. Finally, the matrix \(\mathbf{C}\) is invertible by Remark 3.104.
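Combining the `orthogonal_basis` sketch given after the proof of Theorem 9.20 with the normalisation of Remark 9.26 produces a concrete \(\mathbf{C}\). A sketch for a non-degenerate example, where \(s=0\):

```python
import numpy as np

# Assumes orthogonal_basis() from the sketch after the proof of Theorem 9.20.
A = np.array([[2.0, 1.0], [1.0, -1.0]])  # symmetric; det < 0, so (p, q) = (1, 1)
cols = orthogonal_basis(A)
# Normalise as in Remark 9.26, then renumber so the +1 vectors come first.
cols = [v / np.sqrt(abs(v @ A @ v)) for v in cols]
cols.sort(key=lambda v: -np.sign(v @ A @ v))
C = np.column_stack(cols)
print(np.round(C.T @ A @ C, 8))          # diag(1, -1), as in (9.5)
```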
Remark 9.28 • Sylvester’s law of inertia

  • Sylvester’s law of inertia states that the numbers \(p\) and \(q\) in (9.5) (and hence also \(s\)) are uniquely determined by the bilinear form \(\langle\cdot{,}\cdot\rangle.\) That is, they do not depend on the choice of matrix \(\mathbf{C}\in \mathrm{GL}(n,\mathbb{R})\) such that \(\mathbf{C}^T\mathbf{A}\mathbf{C}\) is diagonal with diagonal entries from the set \(\{-1,0,1\}.\) We will not prove this fact, but a proof can be found in most textbooks on linear algebra.

  • The pair \((p,q)\) is known as the signature of the bilinear form \(\langle\cdot{,}\cdot\rangle\) (see the sketch below).
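Sylvester's law is consistent with the following shortcut, which we state without proof: for a symmetric matrix \(\mathbf{A},\) the numbers \(p\) and \(q\) equal the numbers of positive and negative eigenvalues of \(\mathbf{A}\) (this uses the spectral theorem rather than the construction of Theorem 9.27). A sketch:

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, -1.0]])
eig = np.linalg.eigvalsh(A)   # real eigenvalues of a symmetric matrix
tol = 1e-10
p = int(np.sum(eig > tol))    # number of positive eigenvalues
q = int(np.sum(eig < -tol))   # number of negative eigenvalues
s = len(eig) - p - q          # dimension of the radical V^perp
print(p, q, s)                # 1 1 0 for this A, matching (9.5)
```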
