5.4 Properties of the determinant

Proposition 5.21 • Product rule

For matrices \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{K})\) we have \[\det(\mathbf{A}\mathbf{B})=\det(\mathbf{A})\det(\mathbf{B}).\]

Proof. We first consider the case where \(\mathbf{A}\) is not invertible; then \(\det(\mathbf{A})=0\) (see the proof of Proposition 5.14). If \(\mathbf{A}\) is not invertible, then neither is \(\mathbf{A}\mathbf{B}.\) Indeed, if \(\mathbf{A}\mathbf{B}\) were invertible, there would exist a matrix \(\mathbf{C}\) such that \((\mathbf{A}\mathbf{B})\mathbf{C}=\mathbf{1}_{n}.\) But since, by Corollary 2.22, the matrix product is associative, this also gives \(\mathbf{A}(\mathbf{B}\mathbf{C})=\mathbf{1}_{n},\) so that \(\mathbf{B}\mathbf{C}\) is the inverse of \(\mathbf{A},\) a contradiction. Hence if \(\mathbf{A}\) is not invertible, we must also have \(\det(\mathbf{A}\mathbf{B})=0,\) which verifies that \(\det(\mathbf{A}\mathbf{B})=0=\det(\mathbf{A})\det(\mathbf{B})\) in this case. If \(\mathbf{A}\) is invertible, we can write it as a product of elementary matrices and, applying the second part of Lemma 5.13 repeatedly, we conclude that \(\det(\mathbf{A}\mathbf{B})=\det(\mathbf{A})\det(\mathbf{B}).\)
Corollary 5.22

A matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is invertible if and only if \(\det(\mathbf{A})\neq 0.\) Moreover, in the case where \(\mathbf{A}\) is invertible, we have \[\det\left(\mathbf{A}^{-1}\right)=\frac{1}{\det \mathbf{A}}.\]

Proof. We have already seen that if \(\mathbf{A}\) is not invertible, then \(\det(\mathbf{A})=0.\) By contraposition, if \(\det(\mathbf{A})\neq 0,\) then \(\mathbf{A}\) is invertible. It thus remains to show that if \(\mathbf{A}\) is invertible, then \(\det(\mathbf{A})\neq 0.\) Suppose \(\mathbf{A}\) is invertible; then applying Proposition 5.21 gives \[1=\det(\mathbf{1}_{n})=\det\left(\mathbf{A}\mathbf{A}^{-1}\right)=\det(\mathbf{A})\det\left(\mathbf{A}^{-1}\right)\] so that \(\det(\mathbf{A})\neq 0\) and \(\det\left(\mathbf{A}^{-1}\right)=1/\det(\mathbf{A}).\)
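
The product rule and its corollary are easy to experiment with numerically. The following sketch (illustrative only, not part of the formal development; it assumes NumPy and works over \(\mathbb{K}=\mathbb{R}\)) checks both statements on random matrices, up to floating-point rounding.

```python
# Numerical sanity check of Proposition 5.21 and Corollary 5.22
# (illustration only, not a proof; assumes NumPy, K = R).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))

# Product rule: det(AB) = det(A) det(B), up to rounding.
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

# A random Gaussian matrix is invertible with probability 1, and then
# det(A^{-1}) = 1 / det(A).
assert np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A))
```
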
Remark 5.23 • Product symbol

Recall that for scalars \(x_1,\ldots,x_n \in \mathbb{K},\) we write \[\prod_{i=1}^n x_i=x_1x_2\cdots x_n.\]

Proposition 5.24

Let \(n \in \mathbb{N}\) and \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) be an upper triangular matrix so that \(A_{ij}=0\) for \(i>j.\) Then \[\tag{5.6} \det(\mathbf{A})=\prod_{i=1}^n A_{ii}=A_{11}A_{22}\cdots A_{nn}.\]

Proof. We use induction. For \(n=1\) the condition \(A_{ij}=0\) for \(i>j\) is vacuous and (5.6) reduces to \(\det(\mathbf{A})=A_{11},\) which is trivially satisfied; this establishes the base case.

Inductive step: Assume \(n \in \mathbb{N}\) and \(n\geqslant 2.\) We want to show that if (5.6) holds for upper triangular \((n-1)\times(n-1)\)-matrices, then it also holds for upper triangular \(n\times n\)-matrices. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) be an upper triangular matrix. Choosing \(l=1\) in the formula for \(\det(\mathbf{A}),\) we obtain \[\begin{aligned} \det(\mathbf{A})&=\sum_{k=1}^n(-1)^{k+1}A_{k1}\det\left(\mathbf{A}^{(k,1)}\right)=A_{11}\det\left(\mathbf{A}^{(1,1)}\right)+\sum_{k=2}^n(-1)^{k+1}A_{k1}\det\left(\mathbf{A}^{(k,1)}\right)\\ &=A_{11}\det\left(\mathbf{A}^{(1,1)}\right), \end{aligned}\] where the last equality uses that \(A_{k1}=0\) for \(k>1.\) We have \(\mathbf{A}^{(1,1)}=(A_{ij})_{2\leqslant i,j\leqslant n}\) and since \(\mathbf{A}\) is an upper triangular matrix, it follows that \(\mathbf{A}^{(1,1)}\) is an \((n-1)\times(n-1)\) upper triangular matrix as well. Hence by the induction hypothesis, we obtain \[\det(\mathbf{A}^{(1,1)})=\prod_{i=2}^nA_{ii}.\] We conclude that \(\det(\mathbf{A})=\prod_{i=1}^n A_{ii},\) as claimed.
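
A quick numerical illustration of Proposition 5.24 (a sketch, assuming NumPy): `np.triu` zeroes out the entries below the diagonal, and the determinant of the resulting upper triangular matrix agrees with the product of its diagonal entries.

```python
# det of an upper triangular matrix = product of its diagonal entries
# (numerical illustration of Proposition 5.24; assumes NumPy, K = R).
import numpy as np

rng = np.random.default_rng(1)
A = np.triu(rng.standard_normal((5, 5)))  # upper triangular: A_ij = 0 for i > j

assert np.isclose(np.linalg.det(A), np.prod(np.diag(A)))
```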

5.5 Permutations

A rearrangement of the natural numbers from \(1\) up to \(n\) is called a permutation:

Definition 5.25 • Permutation

Let \(n \in \mathbb{N}\) and \(\mathcal{X}_n=\{1,2,3,\ldots,n\}.\) A permutation is a bijective mapping \(\sigma : \mathcal{X}_n \to \mathcal{X}_n.\) The set of all permutations of \(\mathcal{X}_n\) is denoted by \(S_n.\)

Remark 5.26

If \(\tau,\sigma : \mathcal{X}_n \to \mathcal{X}_n\) are permutations, it is customary to write \(\tau\sigma\) or \(\tau\cdot \sigma\) instead of \(\tau \circ \sigma.\) Furthermore, the identity mapping \(\mathrm{Id}_{\mathcal{X}_n}\) is often simply denoted by \(1.\) A convenient way to describe a permutation \(\sigma \in S_n\) is in terms of a \(2\times n\) matrix \[\begin{pmatrix} i \\ \sigma(i) \end{pmatrix}_{1\leqslant i \leqslant n},\] which we denote by \(\boldsymbol{\sigma}.\) For instance, for \(n=4,\) the matrix \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}\] corresponds to the permutation \(\sigma\) satisfying \(\sigma(1)=2,\sigma(2)=3,\sigma(3)=1,\sigma(4)=4.\)

Permutations which only swap two natural numbers and leave all remaining numbers unchanged are known as transpositions:

Definition 5.27 • Transposition

Let \(n \in \mathbb{N}\) and \(1\leqslant k,l\leqslant n\) with \(k\neq l.\) The transposition \(\tau_{k,l} \in S_n\) is the permutation satisfying \[\tau_{k,l}(k)=l, \qquad \tau_{k,l}(l)=k, \qquad \tau_{k,l}(i)=i \text{ if } i \notin \{k,l\}.\]

Every permutation \(\sigma \in S_n\) defines a linear map \(g : \mathbb{K}^n \to \mathbb{K}^n\) satisfying \(g(\vec{e}_i)=\vec{e}_{\sigma(i)},\) where \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denotes the standard basis of \(\mathbb{K}^n.\) Since \(g\) is linear, there exists a unique matrix \(\mathbf{P}_{\sigma} \in M_{n,n}(\mathbb{K})\) so that \(g=f_{\mathbf{P}_{\sigma}}.\) Observe that the column vectors of the matrix \(\mathbf{P}_{\sigma}\) are given by \(\vec{e}_{\sigma(1)},\vec{e}_{\sigma(2)},\ldots,\vec{e}_{\sigma(n)}.\)

Definition 5.28 • Permutation matrix

We call \(\mathbf{P}_{\sigma} \in M_{n,n}(\mathbb{K})\) the permutation matrix associated to \(\sigma \in S_n.\)

Example 5.29

Let \(n=4.\) For instance, we have \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix} \qquad \mathbf{P}_{\sigma}=\begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\] and \[\boldsymbol{\tau}_{2,4}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}\qquad \mathbf{P}_{\tau_{2,4}}=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.\]
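
The passage from \(\sigma\) to \(\mathbf{P}_{\sigma}\) is mechanical: column \(i\) of \(\mathbf{P}_{\sigma}\) is \(\vec{e}_{\sigma(i)}.\) A small Python helper (an illustrative sketch; permutations are encoded by the bottom row of the \(2\times n\) scheme) reproduces the two matrices above.

```python
# Build the permutation matrix P_sigma of Definition 5.28: its i-th
# column is the standard basis vector e_{sigma(i)} (illustrative sketch).
import numpy as np

def permutation_matrix(sigma):
    """sigma is the bottom row of the 2 x n scheme, e.g. [2, 3, 1, 4]
    encodes sigma(1)=2, sigma(2)=3, sigma(3)=1, sigma(4)=4."""
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    for i, s in enumerate(sigma):  # column i gets its 1 in row sigma(i)
        P[s - 1, i] = 1
    return P

print(permutation_matrix([2, 3, 1, 4]))  # P_sigma from Example 5.29
print(permutation_matrix([1, 4, 3, 2]))  # P_{tau_{2,4}} from Example 5.29
```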

Remark 5.30 Notice that \(\mathbf{P}_{\tau_{k,l}}=\mathbf{P}_{k,l},\) where \(\mathbf{P}_{k,l}\) is one of the elementary matrices of size \(n;\) cf. Definition 4.1.

Assigning to a permutation its permutation matrix turns composition of permutations into matrix multiplication:

Proposition 5.31

Let \(n \in \mathbb{N}.\) Then \(\mathbf{P}_1=\mathbf{1}_{n}\) and for all \(\sigma,\pi \in S_n\) we have \[\mathbf{P}_{\pi \cdot \sigma}=\mathbf{P}_\pi \mathbf{P}_\sigma.\] In particular, for all \(\sigma \in S_n,\) the permutation matrix \(\mathbf{P}_\sigma\) is invertible with \((\mathbf{P}_{\sigma})^{-1}=\mathbf{P}_{\sigma^{-1}}.\)

Example 5.32 Consider \(n=3.\) For \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \quad \text{and}\qquad \boldsymbol{\pi}=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}\qquad \text{we have} \quad \boldsymbol{\pi}\cdot\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3\end{pmatrix},\] as well as \[\mathbf{P}_\sigma=\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \mathbf{P}_\pi=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad \mathbf{P}_{\pi\cdot \sigma}=\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.\] Thus we obtain \[\mathbf{P}_{\pi\cdot\sigma}=\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}=\mathbf{P}_\pi \mathbf{P}_\sigma,\] as claimed by Proposition 5.31.
Proof of Proposition 5.31. The matrix \(\mathbf{P}_1\) has column vectors given by \(\vec{e}_{1},\ldots,\vec{e}_n,\) hence \(\mathbf{P}_1=\mathbf{1}_{n}.\) Using Proposition 2.20 and Theorem 2.21 it is sufficient to show that for all \(\pi,\sigma \in S_n\) we have \(f_{\mathbf{P}_{\pi\cdot \sigma}}=f_{\mathbf{P}_\pi}\circ f_{\mathbf{P}_\sigma}.\) For all \(1\leqslant i\leqslant n,\) we obtain \[f_{\mathbf{P}_\pi}\left(f_{\mathbf{P}_\sigma}(\vec{e}_i)\right)=f_{\mathbf{P}_\pi}\left(\vec{e}_{\sigma(i)}\right)=\vec{e}_{\pi(\sigma(i))}=\vec{e}_{(\pi\cdot\sigma)(i)}=f_{\mathbf{P}_{\pi\cdot \sigma}}(\vec{e}_i).\] The two maps thus agree on the ordered basis \(\mathbf{e}=(\vec{e}_1,\ldots,\vec{e}_n)\) of \(\mathbb{K}^n,\) so that the product formula follows by applying Lemma 3.87.

We have \[\mathbf{P}_{\sigma \cdot \sigma^{-1}}=\mathbf{P}_{1}=\mathbf{1}_{n}=\mathbf{P}_{\sigma}\mathbf{P}_{\sigma^{-1}}\] showing that \(\mathbf{P}_{\sigma}\) is invertible with inverse \((\mathbf{P}_{\sigma})^{-1}=\mathbf{P}_{\sigma^{-1}}.\)
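
The same encoding lets one check Proposition 5.31 on the permutations of Example 5.32 (again an illustrative sketch; the helper below is the same construction as in the previous sketch).

```python
# Check P_{pi.sigma} = P_pi P_sigma and (P_sigma)^{-1} = P_{sigma^{-1}}
# on Example 5.32 (illustrative sketch).
import numpy as np

def P(sigma):
    M = np.zeros((len(sigma), len(sigma)), dtype=int)
    for i, s in enumerate(sigma):  # column i is e_{sigma(i)}
        M[s - 1, i] = 1
    return M

sigma, pi = [3, 1, 2], [1, 3, 2]
pi_sigma = [pi[s - 1] for s in sigma]  # (pi.sigma)(i) = pi(sigma(i))
assert pi_sigma == [2, 1, 3]           # as computed in Example 5.32
assert np.array_equal(P(pi_sigma), P(pi) @ P(sigma))

sigma_inv = [sigma.index(j) + 1 for j in [1, 2, 3]]  # sigma^{-1}
assert np.array_equal(P(sigma) @ P(sigma_inv), np.eye(3, dtype=int))
```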

Definition 5.33 • Signature of a permutation

For \(\sigma \in S_n\) we call \(\operatorname{sgn}(\sigma)=\det(\mathbf{P}_{\sigma})\) its signature.

Remark 5.34

  1. Combining Proposition 5.21 and Proposition 5.31, we conclude that \[\operatorname{sgn}(\pi\cdot \sigma)=\operatorname{sgn}(\pi)\operatorname{sgn}(\sigma)\] for all \(\pi,\sigma \in S_n.\)
  2. Since \(\mathbf{P}_{\tau_{k,l}}=\mathbf{P}_{k,l}\) and \(\det \mathbf{P}_{k,l}=-1\) by Lemma 5.13, we conclude that \[\operatorname{sgn}(\tau_{k,l})=-1\] for all transpositions \(\tau_{k,l} \in S_n.\)

Similarly to elementary matrices being the building blocks of invertible matrices, transpositions are the building blocks of permutations:

Proposition 5.35

Let \(n\in \mathbb{N}\) and \(\sigma \in S_n.\) Then there exists \(m \geqslant 0\) and \(m\) transpositions \(\tau_{k_1,l_1},\ldots, \tau_{k_m,l_m} \in S_n\) such that \(\sigma=\tau_{k_m,l_m}\cdots \tau_{k_1,l_1},\) where we use the convention that \(0\) transpositions corresponds to the identity permutation.

Example 5.36

Let \(n=6\) and \(\sigma\) the permutation defined by the matrix \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 5 & 2 & 4 & 6 & 1\end{pmatrix}.\] To express it as a product of transpositions, we write \[\begin{array}{cccccc|c} 3 & 5 & 2 & 4 & 6 & 1 & \\ 3 & 2 & 5 & 4 & 6 & 1 & \tau_{2,3} \\ 1 & 2 & 5 & 4 & 6 & 3 & \tau_{1,6} \\ 1 & 2 & 5 & 4 & 3 & 6 & \tau_{5,6} \\ 1 & 2 & 3 & 4 & 5 & 6 & \tau_{3,5}\end{array}\] so that \(\sigma=\tau_{3,5}\tau_{5,6}\tau_{1,6}\tau_{2,3}.\)

Proof of Proposition 5.35. We use induction. For \(n=1\) we have \(\mathcal{X}_n=\{1\}\) and the only permutation is the identity permutation \(1,\) so the statement holds trivially with \(m=0;\) this establishes the base case.

Inductive step: Assume \(n \in \mathbb{N}\) and \(n\geqslant 2.\) We want to show that if the claim holds for \(S_{n-1},\) then it also holds for \(S_n.\) Let \(\sigma \in S_n\) and define \(k=\sigma(n).\) If \(k=n,\) we set \(\sigma_1=\sigma;\) otherwise we set \(\sigma_1=\tau_{n,k}\sigma,\) which satisfies \(\sigma_1(n)=\tau_{n,k}(\sigma(n))=\tau_{n,k}(k)=n.\) In either case \(\sigma_1\) fixes \(n.\) Restricting \(\sigma_1\) to the first \(n-1\) elements, we obtain a permutation of \(\{1,\ldots,n-1\}.\) By the induction hypothesis, there exist \({\tilde{m}}\geqslant 0\) and transpositions \(\tau_{k_1,l_1},\ldots, \tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}} \in S_n\) such that \[\sigma_1=\tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}}\cdots \tau_{k_1,l_1}.\] If \(k=n,\) this is already the desired decomposition of \(\sigma\) with \(m={\tilde{m}}.\) If \(k\neq n,\) then since \(\tau_{n,k}^2=1,\) multiplying \(\sigma_1=\tau_{n,k}\sigma\) from the left with \(\tau_{n,k}\) gives \(\sigma=\tau_{n,k}\tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}}\cdots \tau_{k_1,l_1},\) so the claim follows with \(m={\tilde{m}}+1.\)
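
The proof is effectively an algorithm: repeatedly move the largest non-fixed element into place. A short Python sketch of it follows (illustrative only; the decomposition it produces generally differs from the one found in Example 5.36, since such decompositions are not unique). The number \(m\) of factors it returns also illustrates Proposition 5.37 below: \(\operatorname{sgn}(\sigma)=(-1)^m.\)

```python
# Decompose a permutation into transpositions, following the inductive
# proof of Proposition 5.35 (illustrative sketch; the answer is not unique).
def transposition_decomposition(sigma):
    """Return pairs (k, l) so that sigma = taus[0] taus[1] ... taus[-1],
    the leftmost factor listed first."""
    values = list(sigma)          # values[i-1] = sigma(i), e.g. [3,5,2,4,6,1]
    taus = []
    for n in range(len(values), 1, -1):
        k = values[n - 1]         # k = current value at position n
        if k != n:                # if k = n, position n is already fixed
            taus.append((n, k))
            i = values.index(n)   # multiply from the left by tau_{n,k}:
            values[i] = k         # ...the value n becomes k, and
            values[n - 1] = n     # ...position n now holds n
    return taus

sigma = [3, 5, 2, 4, 6, 1]        # the permutation of Example 5.36
taus = transposition_decomposition(sigma)

# Recompose sigma by applying the factors right to left to the identity.
composed = list(range(1, len(sigma) + 1))
for (k, l) in reversed(taus):
    composed = [k if v == l else l if v == k else v for v in composed]
assert composed == sigma
print(len(taus), (-1) ** len(taus))  # m = 4 factors, so sgn(sigma) = +1
```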

Combining Definition 5.33, Remark 5.34 and Proposition 5.35, we conclude:
Proposition 5.37

Let \(n \in \mathbb{N}\) and \(\sigma \in S_n.\) Then \(\operatorname{sgn}(\sigma)=\pm 1.\) If \(\sigma\) is a product of \(m\) transpositions, then \(\operatorname{sgn}(\sigma)=(-1)^m.\)

Remark 5.38

Permutations with \(\operatorname{sgn}(\sigma)=1\) are called even and permutations with \(\operatorname{sgn}(\sigma)=-1\) are called odd, since they arise from the composition of an even or odd number of transpositions, respectively.

5.6 The Leibniz formula

Besides the Laplace expansion, there is also a formula for the determinant which relies on permutations. As a warm-up, we first consider the case \(n=2.\) Using the linearity of the determinant in the first row, we obtain \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\det\begin{pmatrix} a & 0 \\ c & d \end{pmatrix} +\det \begin{pmatrix} 0 & b \\ c & d \end{pmatrix},\] where \(a,b,c,d \in \mathbb{K}.\) Using the linearity of the determinant in the second row, we can further decompose the two above summands \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\underbrace{\det \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}+\det\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}}_{=\det \begin{pmatrix} a & 0 \\ c & d\end{pmatrix}}+\underbrace{\det \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix}+\det\begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix}}_{=\det \begin{pmatrix} 0 & b \\ c & d\end{pmatrix}}.\] The first and fourth summands are always zero due to the occurrence of a zero column. The second and third summands are possibly nonzero (it might still happen that they are zero in the case where some of \(a,b,c,d\) are zero). In any case, we get \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\det \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}+\det \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix}.\] We can proceed analogously in general. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K}).\) We denote the rows of \(\mathbf{A}\) by \(\{\vec{\alpha}_1,\ldots,\vec{\alpha}_n\}.\) Using the linearity of \(\det\) in the first row, we can write \[\begin{gathered} \det \mathbf{A}=\det\begin{pmatrix} A_{11} & 0 & 0 & \cdots & 0 \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}+\det\begin{pmatrix} 0 & A_{12} & 0 & \cdots & 0 \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}+\cdots\\ \cdots +\det\begin{pmatrix} 0 & 0 & 0 & \cdots & A_{1n} \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}. \end{gathered}\] We can now use the linearity in the second row and proceed in the same fashion with each of the above summands. We continue this procedure until the \(n\)-th row. As a result, we can write \[\tag{5.7} \det \mathbf{A}= \sum_{k=1}^{n^n} \det \mathbf{M}_k\] where each of the matrices \(\mathbf{M}_k\) has exactly one possibly nonzero entry in each row. As above, some of the matrices \(\mathbf{M}_k\) will have a zero column so that their determinant vanishes. The matrices \(\mathbf{M}_k\) without a zero column must have exactly one possibly nonzero entry in each row and each column. We can thus write the matrices \(\mathbf{M}_k\) with possibly nonzero determinant as \[\mathbf{M}_k=\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\] for some permutation \(\sigma \in S_n.\) Every permutation of \(\{1,\ldots,n\}\) occurs precisely once in the expansion (5.7), hence we can write \[\det \mathbf{A}=\sum_{\sigma \in S_n}\det\left(\sum_{i=1}^nA_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\right),\] where the notation \(\sum_{\sigma \in S_n}\) means that we sum over all possible permutations of \(\{1,\ldots,n\}.\) We will next write the matrix \(\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\) differently.
To this end, notice that for all \(\sigma \in S_n,\) the permutation matrix \(\mathbf{P}_{\sigma}\) can be written as \(\mathbf{P}_{\sigma}=\sum_{i=1}^n \mathbf{E}_{\sigma(i),i}.\) Furthermore, the diagonal matrix \[\mathbf{D}_{\sigma}=\begin{pmatrix} A_{\sigma(1)1}& & & \\ & A_{\sigma(2)2} & & \\ && \ddots & & \\ &&& A_{\sigma(n)n}\end{pmatrix}\] can be written as \(\mathbf{D}_{\sigma}=\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{j,j}.\) Therefore, using Lemma 4.4, we obtain \[\mathbf{P}_{\sigma}\mathbf{D}_{\sigma}=\sum_{i=1}^n \mathbf{E}_{\sigma(i),i}\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{j,j}=\sum_{i=1}^n\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{\sigma(i),i}\mathbf{E}_{j,j}=\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}.\] We thus have the formula \[\det \mathbf{A}=\sum_{\sigma \in S_n} \det\left(\mathbf{P}_{\sigma}\mathbf{D}_{\sigma}\right)=\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\det(\mathbf{D}_\sigma),\] where we use the product rule (Proposition 5.21) and the definition of the signature of a permutation. By Proposition 5.24 (a diagonal matrix is in particular upper triangular), the determinant of a diagonal matrix is the product of its diagonal entries, hence we obtain \[\det \mathbf{A}=\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}.\] Finally, writing \(\pi=\sigma^{-1},\) we have \[\prod_{i=1}^n A_{\sigma(i)i}=\prod_{j=1}^n A_{j\pi(j)}.\] We have thus shown:
Proposition 5.39 • Leibniz formula for the determinant

Let \(n \in \mathbb{N}\) and \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K}).\) Then we have \[\tag{5.8} \det(\mathbf{A}) =\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}=\sum_{\pi \in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n A_{j\pi(j)}.\]
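
For experimentation, the Leibniz formula translates directly into code. The sketch below (illustrative only; it computes \(\operatorname{sgn}\) by counting inversions, a standard equivalent characterization of the signature not proved in these notes) compares (5.8) against `numpy.linalg.det`; note the \(\mathcal{O}(n\cdot n!)\) cost, which makes the formula impractical for computation.

```python
# Direct transcription of the Leibniz formula (5.8) -- illustrative only,
# with O(n * n!) cost; numpy.linalg.det is the practical alternative.
import numpy as np
from itertools import permutations

def sgn(pi):
    """Signature via inversion count: sgn(pi) = (-1)^{#inversions}."""
    inv = sum(1 for i in range(len(pi)) for j in range(i + 1, len(pi))
              if pi[i] > pi[j])
    return -1 if inv % 2 else 1

def leibniz_det(A):
    n = A.shape[0]
    return sum(sgn(pi) * np.prod([A[j, pi[j]] for j in range(n)])
               for pi in permutations(range(n)))

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
assert np.isclose(leibniz_det(A), np.linalg.det(A))
assert np.isclose(leibniz_det(A.T), leibniz_det(A))  # cf. Exercise 5.49
```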

Example 5.40 For \(n=3\) we have six permutations \[\begin{aligned} \boldsymbol{\sigma}_1&=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \boldsymbol{\sigma}_2=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \boldsymbol{\sigma}_3=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}\\ \boldsymbol{\sigma}_4&=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \boldsymbol{\sigma}_5=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \quad \boldsymbol{\sigma}_6=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \end{aligned}\] For \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant 3} \in M_{3,3}(\mathbb{K}),\) the Leibniz formula gives \[\begin{gathered} \det(\mathbf{A})=\operatorname{sgn}(\sigma_1)A_{11}A_{22}A_{33}+\operatorname{sgn}(\sigma_2)A_{11}A_{23}A_{32}+\operatorname{sgn}(\sigma_3)A_{12}A_{21}A_{33}\\ +\operatorname{sgn}(\sigma_4)A_{12}A_{23}A_{31}+\operatorname{sgn}(\sigma_5)A_{13}A_{21}A_{32}+\operatorname{sgn}(\sigma_6)A_{13}A_{22}A_{31}, \end{gathered}\] so that in agreement with Example 5.19, we obtain \[\begin{gathered} \det \mathbf{A}=A_{11}A_{22}A_{33}-A_{11}A_{23}A_{32}-A_{12}A_{21}A_{33}\\+A_{12}A_{23}A_{31}+A_{13}A_{21}A_{32}-A_{13}A_{22}A_{31}. \end{gathered}\]
Remark 5.41 Exercise 5.49 has two important consequences. Since the transpose turns the rows of a matrix into columns and vice versa, we conclude:
  1. the determinant is also multilinear and alternating, when thought of as a map \((\mathbb{K}^n)^n \to \mathbb{K},\) that is, when taking \(n\) column vectors as an input. In particular, the determinant is also linear in each column;

  2. the Laplace expansion is also valid if we expand with respect to a row, that is, for \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and \(1\leqslant l \leqslant n,\) we have \[\det(\mathbf{A})=\sum_{k=1}^n(-1)^{k+l}[\mathbf{A}]_{lk}\det\left(\mathbf{A}^{(l,k)}\right).\]

Example 5.42

(\(\heartsuit\) – not examinable). For \(n \in \mathbb{N}\) and a vector \(\vec{x}=(x_i)_{1\leqslant i\leqslant n}\in \mathbb{K}^n\) we can form a matrix \(\mathbf{V}_{\vec{x}}=(V_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) with \(V_{ij}=x_i^{j-1},\) that is, \[\mathbf{V}_{\vec{x}}=\begin{pmatrix} 1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\ 1 & x_2 & (x_2)^2 & \cdots & (x_2)^{n-1} \\ 1 & x_3 & (x_3)^2 & \cdots & (x_3)^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & (x_n)^2 & \cdots & (x_n)^{n-1}\end{pmatrix}.\] Such matrices are known as Vandermonde matrices, and the determinant of a Vandermonde matrix is known as a Vandermonde determinant; these determinants satisfy \[\det(\mathbf{V}_{\vec{x}})=\prod_{1\leqslant i<j \leqslant n}(x_j-x_i).\]

Sketch of a proof. We can define a function \(f : \mathbb{K}^n \to \mathbb{K},\) \(\vec{x} \mapsto \det(\mathbf{V}_{\vec{x}}).\) By the Leibniz formula, the function \(f\) is a polynomial in the variables \(x_i\) with integer coefficients. If we freeze all variables of \(f\) except the \(\ell\)-th variable, then we obtain a function \(g_{\ell} : \mathbb{K}\to \mathbb{K}\) of one variable \(x_{\ell}.\) For \(1\leqslant i\leqslant n\) with \(i\neq \ell\) we have \(g_{\ell}(x_i)=0,\) since we compute the determinant of a matrix with two identical rows, the \(\ell\)-th row and the \(i\)-th row. Factoring the zeros, we can thus write \(g_{\ell}(x_{\ell})=q_{\ell}(x_{\ell})\prod_{1\leqslant i\leqslant n, i\neq \ell}(x_{\ell}-x_i)\) for some polynomial \(q_{\ell}.\) We can repeat this argument for all \(\ell\) and hence can write \(\det(\mathbf{V}_{\vec{x}})=q(\vec{x})\prod_{1\leqslant i<j\leqslant n}(x_j-x_i)\) for some polynomial \(q(\vec{x}).\) The Leibniz formula implies that each summand of \(\det(\mathbf{V}_{\vec{x}})\) is a monomial of total degree \(0+1+\cdots+(n-1)=\frac{1}{2}n(n-1)\) in the variables \(x_i.\) The same holds true for \(\prod_{1\leqslant i<j \leqslant n}(x_j-x_i).\) It follows that \(q\) must be a constant. Using the Leibniz formula again, we see that the summand of \(\det(\mathbf{V}_{\vec{x}})\) corresponding to the identity permutation is the product of the diagonal entries of \(\mathbf{V}_{\vec{x}},\) that is, \(x_2(x_3)^2\cdots(x_n)^{n-1}.\) Taking the first term \(x_j\) in each factor of \(\prod_{1\leqslant i<j\leqslant n}(x_j-x_i),\) we also obtain \(x_2(x_3)^2\cdots(x_n)^{n-1},\) hence \(\det(\mathbf{V}_{\vec{x}})=\prod_{1\leqslant i<j\leqslant n}(x_j-x_i),\) as claimed.
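
A quick numerical check of the Vandermonde formula (an illustrative sketch; `np.vander` with `increasing=True` matches the column convention \(V_{ij}=x_i^{j-1}\) used here):

```python
# Check det(V_x) = prod_{i<j} (x_j - x_i) on a sample vector
# (numerical illustration; assumes NumPy).
import numpy as np

x = np.array([0.5, 1.0, 2.0, 3.5])
V = np.vander(x, increasing=True)  # V[i, j] = x_i ** j, as in Example 5.42
n = len(x)
rhs = np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])
assert np.isclose(np.linalg.det(V), rhs)
```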

5.7 Cramer’s rule

The determinant can be used to give a formula for the solution of a linear system of equations of the form \(\mathbf{A}\vec{x}=\vec{b}\) for an invertible matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) \(\vec{b} \in \mathbb{K}^n\) and unknowns \(\vec{x} \in \mathbb{K}^n.\) This formula is often referred to as Cramer’s rule. In order to derive it, we first need a definition:

Definition 5.43 • Adjugate matrix

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) be a square matrix. The adjugate matrix of \(\mathbf{A}\) is the \(n\times n\)-matrix \(\operatorname{Adj}(\mathbf{A})\) whose entries are given by (notice the reverse order of \(i\) and \(j\) on the right hand side) \[[\operatorname{Adj}(\mathbf{A})]_{ij}=(-1)^{i+j}\det\left(\mathbf{A}^{(j,i)}\right), \qquad 1\leqslant i,j\leqslant n.\]

Example 5.44

\[\operatorname{Adj}\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \quad \operatorname{Adj}\left(\begin{pmatrix} 1 & 1 & 2 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix}\right)=\begin{pmatrix} 4 & -2 & -3 \\ 1 & 0 & -1 \\ -2 & 1 & 2 \end{pmatrix}\]

The determinant and the adjugate matrix provide a formula for the inverse of a matrix:

Theorem 5.45

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\) Then we have \[\operatorname{Adj}(\mathbf{A})\mathbf{A}=\mathbf{A}\operatorname{Adj}(\mathbf{A})=\det(\mathbf{A})\mathbf{1}_{n}.\] In particular, if \(\mathbf{A}\) is invertible then \[\mathbf{A}^{-1}=\frac{1}{\det \mathbf{A}}\operatorname{Adj}(\mathbf{A}).\]

Proof. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}.\) For \(1\leqslant i\leqslant n\) we obtain for the \(i\)-th diagonal entry \[[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ii}=\sum_{k=1}^n(-1)^{i+k}\det\left(\mathbf{A}^{(k,i)}\right)A_{ki}=\det(\mathbf{A}),\] where we use the Laplace expansion (5.5) of the determinant. The diagonal entries of \(\operatorname{Adj}(\mathbf{A})\mathbf{A}\) are thus all equal to \(\det \mathbf{A}.\) For \(1\leqslant i,j\leqslant n\) with \(i\neq j\) we have \[[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}=\sum_{k=1}^n(-1)^{i+k}\left(\det \mathbf{A}^{(k,i)}\right)A_{kj}.\] We would like to interpret this last expression as a Laplace expansion. We consider a new matrix \(\hat{\mathbf{A}}=(\hat{A}_{kl})_{1\leqslant k,l\leqslant n}\) which is identical to \(\mathbf{A},\) except that the \(i\)-th column of \(\mathbf{A}\) is replaced with the \(j\)-th column of \(\mathbf{A},\) that is, for \(1\leqslant k,l\leqslant n,\) we have \[\tag{5.9} \hat{A}_{kl}=\left\{\begin{array}{cc} A_{kj}, & l=i,\\ A_{kl}, & l\neq i.\end{array}\right.\] Then, for all \(1\leqslant k \leqslant n\) we have \(\hat{\mathbf{A}}^{(k,i)}=\mathbf{A}^{(k,i)},\) since the only column in which \(\mathbf{A}\) and \(\hat{\mathbf{A}}\) differ is removed in \(\mathbf{A}^{(k,i)}.\) Using (5.9), the Laplace expansion of \(\hat{\mathbf{A}}\) with respect to the \(i\)-th column gives \[\begin{aligned} \det\hat{\mathbf{A}}&=\sum_{k=1}^n(-1)^{i+k}\hat{A}_{ki}\det\left(\hat{\mathbf{A}}^{(k,i)}\right)=\sum_{k=1}^n(-1)^{i+k}\left(\det \mathbf{A}^{(k,i)}\right)A_{kj}\\ &=[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}. \end{aligned}\] The \(i\)-th and \(j\)-th columns of \(\hat{\mathbf{A}}\) coincide, hence its column vectors are linearly dependent. Therefore \(\hat{\mathbf{A}}\) is not invertible by Proposition 4.7 and so \(\det\hat{\mathbf{A}}=[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}=0\) by Corollary 5.22. The off-diagonal entries of \(\operatorname{Adj}(\mathbf{A})\mathbf{A}\) are thus all zero and we conclude \(\operatorname{Adj}(\mathbf{A})\mathbf{A}=\det(\mathbf{A})\mathbf{1}_{n}.\) Using the row version of the Laplace expansion we can conclude analogously that \(\mathbf{A}\operatorname{Adj}(\mathbf{A})=\det(\mathbf{A})\mathbf{1}_{n}.\) Finally, if \(\mathbf{A}\) is invertible, then \(\det \mathbf{A}\neq 0\) by Corollary 5.22, so that \(\mathbf{A}^{-1}=\operatorname{Adj}(\mathbf{A})/\det(\mathbf{A}),\) as claimed.
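
Definition 5.43 and Theorem 5.45 are also easy to test in code. The sketch below (illustrative only; note the swapped indices \(\mathbf{A}^{(j,i)}\) inside the loop) recomputes the \(3\times 3\) adjugate of Example 5.44 and verifies \(\operatorname{Adj}(\mathbf{A})\mathbf{A}=\det(\mathbf{A})\mathbf{1}_{n}.\)

```python
# Adjugate following Definition 5.43, checked against Theorem 5.45
# on the 3 x 3 matrix of Example 5.44 (illustrative sketch).
import numpy as np

def adjugate(A):
    n = A.shape[0]
    Adj = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # A^{(j,i)}: delete row j and column i (note the reversed order)
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            Adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return Adj

A = np.array([[1.0, 1, 2], [0, 2, 1], [1, 0, 2]])
print(adjugate(A))  # reproduces the adjugate from Example 5.44
assert np.allclose(adjugate(A) @ A, np.linalg.det(A) * np.eye(3))
assert np.allclose(np.linalg.inv(A), adjugate(A) / np.linalg.det(A))
```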

As a corollary we obtain:

Corollary 5.46

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) be an invertible upper triangular matrix. Then \(\mathbf{A}^{-1}\) is also an upper triangular matrix.

Remark 5.47

Taking the transpose also implies: Let \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) be an invertible lower triangular matrix. Then \(\mathbf{A}^{-1}\) is also a lower triangular matrix.

Proof of Corollary 5.46. Write \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}.\) Using Theorem 5.45 it suffices to show that \(\operatorname{Adj}(\mathbf{A})\) is an upper triangular matrix. If \(\mathbf{A}\) is an upper triangular matrix, then \(A_{ij}=0\) for all \(i>j.\) By definition we have \[[\operatorname{Adj}(\mathbf{A})]_{ij}=(-1)^{i+j}\det\left(\mathbf{A}^{(j,i)}\right), \qquad 1\leqslant i,j\leqslant n.\] Notice that for \(i>j\) every element below the diagonal of \(\mathbf{A}^{(j,i)}\) is also below the diagonal of \(\mathbf{A}\) and hence must be zero. It follows that \(\mathbf{A}^{(j,i)}\) is an upper triangular matrix as well. Proposition 5.24 implies that the determinant of \(\mathbf{A}^{(j,i)}\) is the product of its diagonal entries. Since \(\mathbf{A}^{(j,i)}\) arises from the upper triangular matrix \(\mathbf{A}\) by removing a row and a column, at least one of the diagonal entries of \(\mathbf{A}^{(j,i)}\) must be zero and thus \(\det \mathbf{A}^{(j,i)}=0\) for \(i>j.\) Hence \([\operatorname{Adj}(\mathbf{A})]_{ij}=0\) for \(i>j,\) so \(\operatorname{Adj}(\mathbf{A})\) is upper triangular, and therefore \(\mathbf{A}^{-1}=\operatorname{Adj}(\mathbf{A})/\det(\mathbf{A})\) is an upper triangular matrix as well.
We now use Theorem 5.45 to obtain a formula for the solution of the linear system \(\mathbf{A}\vec{x}=\vec{b}\) for an invertible matrix \(\mathbf{A}.\) Multiplying from the left with \(\mathbf{A}^{-1},\) we get \[\vec{x}=\mathbf{A}^{-1}\vec{b}=\frac{1}{\det \mathbf{A}}\operatorname{Adj}(\mathbf{A})\vec{b}.\] Writing \(\vec{x}=(x_i)_{1\leqslant i\leqslant n},\) multiplication with \(\det \mathbf{A}\) gives for \(1\leqslant i \leqslant n\) \[x_i\det \mathbf{A}=\sum_{k=1}^n[\operatorname{Adj}(\mathbf{A})]_{ik}b_k=\sum_{k=1}^n(-1)^{i+k}\det\left(\mathbf{A}^{(k,i)}\right)b_k.\] We can again interpret the right hand side as a Laplace expansion of the matrix \(\hat{\mathbf{A}}_i\) obtained by replacing the \(i\)-th column of \(\mathbf{A}\) with \(\vec{b}\) and leaving \(\mathbf{A}\) unchanged otherwise. Hence, we have for all \(1\leqslant i\leqslant n\) \[x_i=\frac{\det \hat{\mathbf{A}}_i}{\det \mathbf{A}}.\] This formula is known as Cramer’s rule. While this is a neat formula, it is rarely used in computing solutions to linear systems of equations due to the complexity of computing determinants.
Example 5.48 • Cramer’s rule

We consider the system \(\mathbf{A}\vec{x}=\vec{b}\) for \[\mathbf{A}=\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}\qquad \text{and} \qquad \vec{b}=\begin{pmatrix} -2 \\ 2 \\ 4 \end{pmatrix}.\] Here we obtain \[\hat{\mathbf{A}}_1=\begin{pmatrix} -2 & 1 & 1 \\ 2 & 2 & 1 \\ 4 & 1 & 2 \end{pmatrix}, \quad \hat{\mathbf{A}}_2=\begin{pmatrix} 2 & -2 & 1 \\ 1 & 2 & 1 \\ 1 & 4 & 2 \end{pmatrix}, \quad \hat{\mathbf{A}}_3=\begin{pmatrix} 2 & 1 & -2 \\ 1 & 2 & 2 \\ 1 & 1 & 4 \end{pmatrix}.\] We compute \(\det\mathbf{A}=4,\) \(\det\hat{\mathbf{A}}_1=-12,\) \(\det\hat{\mathbf{A}}_2=4\) and \(\det\hat{\mathbf{A}}_3=12\) so that Cramer’s rule gives indeed the correct solution \[\vec{x}=\frac{1}{4}\begin{pmatrix} -12 \\ 4 \\ 12 \end{pmatrix}=\begin{pmatrix} -3 \\ 1 \\ 3\end{pmatrix}.\]
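
In code, Cramer's rule is only a few lines. The sketch below (illustrative; as noted above, Gaussian elimination is preferable in practice because each determinant is itself expensive to compute) reproduces Example 5.48.

```python
# Cramer's rule x_i = det(A_hat_i) / det(A), reproducing Example 5.48
# (illustrative sketch; not an efficient solver).
import numpy as np

def cramer(A, b):
    d = np.linalg.det(A)  # nonzero since A is assumed invertible
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b      # replace the i-th column of A by b
        x[i] = np.linalg.det(Ai) / d
    return x

A = np.array([[2.0, 1, 1], [1, 2, 1], [1, 1, 2]])
b = np.array([-2.0, 2, 4])
print(cramer(A, b))       # [-3.  1.  3.]
assert np.allclose(A @ cramer(A, b), b)
```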

Exercises

Exercise 5.49

Use the Leibniz formula to show that \[\det(\mathbf{A})=\det(\mathbf{A}^T)\] for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\)

Solution

Let \(\mathbf{A}= (A_{ij})_{1\leqslant i,j\leqslant n}.\) We first note that \(\operatorname{sgn}(\sigma) = \operatorname{sgn}(\sigma^{-1})\) for all \(\sigma\in S_n,\) which immediately follows from \(\operatorname{sgn}(\sigma)\operatorname{sgn}(\sigma^{-1})=\operatorname{sgn}(\sigma\circ\sigma^{-1}) = \operatorname{sgn}(1) = 1.\) The Leibniz formula reads \[\det(\mathbf{A}) = \sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}.\] We write \(\pi=\sigma^{-1}\) and let \(j=\sigma(i).\) Summing over all permutations is the same as summing over all inverse permutations and hence we find \[\begin{aligned} \det(\mathbf{A}) & = \sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{j=1}^n A_{j\pi(j)}\\ & = \sum_{\pi\in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n A_{j\pi(j)}\\ & = \sum_{\pi\in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n [\mathbf{A}^T]_{\pi(j)j}\\ & = \det(\mathbf{A}^T). \end{aligned}\]
