10 The determinant, II

10.1 Properties of the determinant

Proposition 10.1 • Product rule

For matrices \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{K})\) we have \[\det(\mathbf{A}\mathbf{B})=\det(\mathbf{A})\det(\mathbf{B}).\]

Proof. We first consider the case where \(\mathbf{A}\) is not invertible; then \(\det(\mathbf{A})=0\) (see the proof of Proposition 9.15). If \(\mathbf{A}\) is not invertible, then neither is \(\mathbf{A}\mathbf{B}.\) Indeed, if \(\mathbf{A}\mathbf{B}\) were invertible, there would exist a matrix \(\mathbf{C}\) such that \((\mathbf{A}\mathbf{B})\mathbf{C}=\mathbf{1}_{n}.\) But since the matrix product is associative, this also gives \(\mathbf{A}(\mathbf{B}\mathbf{C})=\mathbf{1}_{n},\) so that \(\mathbf{B}\mathbf{C}\) is a right inverse of \(\mathbf{A}.\) By Proposition 7.7, \(\mathbf{A}\) would then be invertible, a contradiction. Hence if \(\mathbf{A}\) is not invertible, we must also have \(\det(\mathbf{A}\mathbf{B})=0,\) which verifies that \(\det(\mathbf{A}\mathbf{B})=0=\det(\mathbf{A})\det(\mathbf{B})\) in this case.

If \(\mathbf{A}\) is invertible, we can write it as a product of elementary matrices, and applying the second part of Lemma 9.14 to each elementary factor in turn, we conclude that \(\det(\mathbf{A}\mathbf{B})=\det(\mathbf{A})\det(\mathbf{B}).\)

Corollary 10.2

A matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is invertible if and only if \(\det(\mathbf{A})\neq 0.\) Moreover, in the case where \(\mathbf{A}\) is invertible, we have \[\det\left(\mathbf{A}^{-1}\right)=\frac{1}{\det \mathbf{A}}.\]

Proof. We have already seen that if \(\mathbf{A}\) is not invertible, then \(\det(\mathbf{A})=0.\) This is equivalent to saying that if \(\det(\mathbf{A})\neq 0,\) then \(\mathbf{A}\) is invertible. It thus remains to show that if \(\mathbf{A}\) is invertible, then \(\det(\mathbf{A})\neq 0.\) Suppose \(\mathbf{A}\) is invertible, then applying Proposition 10.1 gives \[\det(\mathbf{1}_{n})=\det\left(\mathbf{A}\mathbf{A}^{-1}\right)=\det(\mathbf{A})\det\left(\mathbf{A}^{-1}\right)=1\] so that \(\det(\mathbf{A})\neq 0\) and \(\det\left(\mathbf{A}^{-1}\right)=1/\det(\mathbf{A}).\)
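The product rule is easy to test numerically. The following is a minimal sketch (the helper names are ours, not part of the notes): a naive determinant via the Laplace expansion along the first column from Chapter 9, used to spot-check Proposition 10.1 on two small integer matrices.

```python
# Sketch (our own helpers): naive determinant via Laplace expansion along
# the first column, used to spot-check the product rule on integer matrices.
def det_laplace(A):
    """det(A) by expansion along the first column."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for k in range(n):  # (-1)**k here is (-1)**(k+1) in the 1-based notation
        minor = [row[1:] for i, row in enumerate(A) if i != k]  # A^{(k,1)}
        total += (-1) ** k * A[k][0] * det_laplace(minor)
    return total

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]   # det = 4
B = [[1, 1, 2], [0, 2, 1], [1, 0, 2]]   # det = 1
assert det_laplace(matmul(A, B)) == det_laplace(A) * det_laplace(B)
```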

Remark 10.3 • Product symbol

Recall that for scalars \(x_1,\ldots,x_n \in \mathbb{K},\) we write \[\prod_{i=1}^n x_i=x_1x_2\cdots x_n.\]

Proposition 10.4

Let \(n \in \mathbb{N}\) and \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) be an upper triangular matrix so that \(A_{ij}=0\) for \(i>j.\) Then \[\tag{10.1} \det(\mathbf{A})=\prod_{i=1}^n A_{ii}=A_{11}A_{22}\cdots A_{nn}.\]

Proof. We use induction on \(n.\) For \(n=1\) the condition \(A_{ij}=0\) for \(i>j\) is vacuous and (10.1) reduces to \(\det(\mathbf{A})=A_{11},\) which holds trivially; this anchors the induction.

Inductive step: Assume \(n \in \mathbb{N}\) and \(n\geqslant 2.\) We want to show that if (10.1) holds for upper triangular \((n-1)\times(n-1)\)-matrices, then it also holds for upper triangular \(n\times n\)-matrices. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) be an upper triangular matrix. Choosing \(l=1\) in the Laplace expansion of \(\det(\mathbf{A}),\) we obtain \[\begin{aligned} \det(\mathbf{A})&=\sum_{k=1}^n(-1)^{k+1}A_{k1}\det\left(\mathbf{A}^{(k,1)}\right)=A_{11}\det\left(\mathbf{A}^{(1,1)}\right)+\sum_{k=2}^n(-1)^{k+1}A_{k1}\det\left(\mathbf{A}^{(k,1)}\right)\\ &=A_{11}\det\left(\mathbf{A}^{(1,1)}\right), \end{aligned}\] where the last equality uses that \(A_{k1}=0\) for \(k>1.\) We have \(\mathbf{A}^{(1,1)}=(A_{ij})_{2\leqslant i,j\leqslant n}\) and since \(\mathbf{A}\) is an upper triangular matrix, it follows that \(\mathbf{A}^{(1,1)}\) is an \((n-1)\times(n-1)\) upper triangular matrix as well. Hence by the induction hypothesis, we obtain \[\det(\mathbf{A}^{(1,1)})=\prod_{i=2}^nA_{ii}.\] We conclude that \(\det(\mathbf{A})=\prod_{i=1}^n A_{ii},\) as claimed.
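As a quick numerical illustration (a sketch; the concrete matrix is our own example):

```python
# Spot-check of Proposition 10.4: for an upper triangular matrix the
# determinant equals the product of the diagonal entries.
import numpy as np

A = np.array([[2.0, 7.0, -1.0],
              [0.0, 3.0,  5.0],
              [0.0, 0.0, -4.0]])
print(np.prod(np.diag(A)))  # -24.0
print(np.linalg.det(A))     # -24.0 (up to floating-point rounding)
```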

10.2 Permutations

A rearrangement of the natural numbers from \(1\) up to \(n\) is called a permutation:

Definition 10.5 • Permutation

Let \(n \in \mathbb{N}\) and \(\mathcal{X}_n=\{1,2,3,\ldots,n\}.\) A permutation is a bijective mapping \(\sigma : \mathcal{X}_n \to \mathcal{X}_n.\) The set of all permutations of \(\mathcal{X}_n\) is denoted by \(S_n.\)

Remark 10.6

If \(\tau,\sigma : \mathcal{X}_n \to \mathcal{X}_n\) are permutations, it is customary to write \(\tau\sigma\) or \(\tau\cdot \sigma\) instead of \(\tau \circ \sigma.\) Furthermore, the identity mapping \(\mathrm{Id}_{\mathcal{X}_n}\) is often simply denoted by \(1.\) A convenient way to describe a permutation \(\sigma \in S_n\) is in terms of a \(2\times n\) matrix \[\begin{pmatrix} i \\ \sigma(i) \end{pmatrix}_{1\leqslant i \leqslant n},\] which we denote by \(\boldsymbol{\sigma}.\) For instance, for \(n=4,\) the matrix \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}\] corresponds to the permutation \(\sigma\) satisfying \(\sigma(1)=2,\sigma(2)=3,\sigma(3)=1,\sigma(4)=4.\)
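In code, a permutation is conveniently stored as the second row of the matrix above. A minimal sketch (the names are ours), assuming this tuple representation:

```python
# A permutation sigma in S_n stored as the tuple (sigma(1), ..., sigma(n)).
def compose(tau, sigma):
    """The product tau . sigma, i.e. i |-> tau(sigma(i))."""
    return tuple(tau[s - 1] for s in sigma)  # -1 since tuples are 0-indexed

sigma = (2, 3, 1, 4)            # the permutation from the remark above
identity = (1, 2, 3, 4)
assert compose(sigma, identity) == sigma == compose(identity, sigma)
```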

Permutations which only swap two natural numbers and leave all remaining numbers unchanged are known as transpositions:

Definition 10.7 • Transposition

Let \(n \in \mathbb{N}\) and \(1\leqslant k,l\leqslant n\) with \(k\neq l.\) The transposition \(\tau_{k,l} \in S_n\) is the permutation satisfying \[\tau_{k,l}(k)=l, \qquad \tau_{k,l}(l)=k, \qquad \tau_{k,l}(i)=i \text{ if } i \notin \{k,l\}.\]

Every permutation \(\sigma \in S_n\) defines a linear map \(g : \mathbb{K}^n \to \mathbb{K}^n\) satisfying \(g(\vec{e}_i)=\vec{e}_{\sigma(i)},\) where \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denotes the standard basis of \(\mathbb{K}^n.\) Since \(g\) is linear, there exists a unique matrix \(\mathbf{P}_{\sigma} \in M_{n,n}(\mathbb{K})\) so that \(g=f_{\mathbf{P}_{\sigma}}.\) Observe that the column vectors of the matrix \(\mathbf{P}_{\sigma}\) are given by \(\vec{e}_{\sigma(1)},\vec{e}_{\sigma(2)},\ldots,\vec{e}_{\sigma(n)}.\)

Definition 10.8 • Permutation matrix

We call \(\mathbf{P}_{\sigma} \in M_{n,n}(\mathbb{K})\) the permutation matrix associated to \(\sigma \in S_n.\)

Example 10.9

Let \(n=4.\) For instance, we have \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix} \qquad \mathbf{P}_{\sigma}=\begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}\] and \[\boldsymbol{\tau}_{2,4}=\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}\qquad \mathbf{P}_{\tau_{2,4}}=\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.\]
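Permutation matrices are easy to build from the tuple representation: column \(i\) of \(\mathbf{P}_{\sigma}\) is \(\vec{e}_{\sigma(i)},\) i.e. the entry in row \(\sigma(i)\) of column \(i\) equals \(1.\) A sketch (names ours):

```python
import numpy as np

def perm_matrix(sigma):
    """P_sigma: column i is the standard basis vector e_{sigma(i)}."""
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    for i, s in enumerate(sigma):  # i is 0-based, s = sigma(i+1)
        P[s - 1, i] = 1
    return P

print(perm_matrix((2, 3, 1, 4)))  # reproduces P_sigma from Example 10.9
```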

Remark 10.10

Notice that \(\mathbf{P}_{\tau_{k,l}}=\mathbf{P}_{k,l},\) where \(\mathbf{P}_{k,l}\) is one of the elementary matrices of size \(n\) (see M01 Algorithmics).

Assigning to a permutation its permutation matrix turns composition of permutations into matrix multiplication:

Proposition 10.11

Let \(n \in \mathbb{N}.\) Then \(\mathbf{P}_1=\mathbf{1}_{n}\) and for all \(\sigma,\pi \in S_n\) we have \[\mathbf{P}_{\pi \cdot \sigma}=\mathbf{P}_\pi \mathbf{P}_\sigma.\] In particular, for all \(\sigma \in S_n,\) the permutation matrix \(\mathbf{P}_\sigma\) is invertible with \((\mathbf{P}_{\sigma})^{-1}=\mathbf{P}_{\sigma^{-1}}.\)

Example 10.12

Consider \(n=3.\) For \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \quad \text{and}\qquad \boldsymbol{\pi}=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}\qquad \text{we have} \quad \boldsymbol{\pi}\cdot\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3\end{pmatrix},\] as well as \[\mathbf{P}_\sigma=\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad \mathbf{P}_\pi=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad \mathbf{P}_{\pi\cdot \sigma}=\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.\] Thus we obtain \[\mathbf{P}_{\pi\cdot\sigma}=\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}=\mathbf{P}_\pi \mathbf{P}_\sigma,\] as claimed by Proposition 10.11.
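Example 10.12 can be replayed numerically (a sketch reusing `perm_matrix` and `compose` from the snippets above):

```python
import numpy as np

def perm_matrix(sigma):
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    for i, s in enumerate(sigma):
        P[s - 1, i] = 1
    return P

def compose(tau, sigma):
    return tuple(tau[s - 1] for s in sigma)

sigma, pi = (3, 1, 2), (1, 3, 2)
assert compose(pi, sigma) == (2, 1, 3)
# Proposition 10.11: the matrix of the product is the product of the matrices
assert (perm_matrix(compose(pi, sigma))
        == perm_matrix(pi) @ perm_matrix(sigma)).all()
```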

Proof of Proposition 10.11. The matrix \(\mathbf{P}_1\) has column vectors given by \(\vec{e}_{1},\ldots,\vec{e}_n,\) hence \(\mathbf{P}_1=\mathbf{1}_{n}.\)

Using Proposition 7.2 and Theorem 7.6 it is sufficient to show that for all \(\pi,\sigma \in S_n\) we have \(f_{\mathbf{P}_{\pi\cdot \sigma}}=f_{\mathbf{P}_\pi}\circ f_{\mathbf{P}_\sigma}.\) For all \(1\leqslant i\leqslant n,\) we obtain \[f_{\mathbf{P}_\pi}\left(f_{\mathbf{P}_\sigma}(\vec{e}_i)\right)=f_{\mathbf{P}_\pi}\left(\vec{e}_{\sigma(i)}\right)=\vec{e}_{\pi(\sigma(i))}=\vec{e}_{(\pi\cdot\sigma)(i)}=f_{\mathbf{P}_{\pi\cdot \sigma}}(\vec{e}_i).\] The two maps thus agree on the ordered basis \(\mathbf{e}=(\vec{e}_1,\ldots,\vec{e}_n)\) of \(\mathbb{K}^n,\) so that the second claim follows by applying Lemma 8.6.

We have \[\mathbf{P}_{\sigma \cdot \sigma^{-1}}=\mathbf{P}_{1}=\mathbf{1}_{n}=\mathbf{P}_{\sigma}\mathbf{P}_{\sigma^{-1}}\] showing that \(\mathbf{P}_{\sigma}\) is invertible with inverse \((\mathbf{P}_{\sigma})^{-1}=\mathbf{P}_{\sigma^{-1}}.\)

Definition 10.13 • Signature of a permutation

For \(\sigma \in S_n\) we call \(\operatorname{sgn}(\sigma)=\det(\mathbf{P}_{\sigma})\) its signature.

Remark 10.14

  1. Combining Proposition 10.1 and Proposition 10.11, we conclude that \[\operatorname{sgn}(\pi\cdot \sigma)=\operatorname{sgn}(\pi)\operatorname{sgn}(\sigma)\] for all \(\pi,\sigma \in S_n.\)

  2. Since \(\mathbf{P}_{\tau_{k,l}}=\mathbf{P}_{k,l}\) and \(\det \mathbf{P}_{k,l}=-1\) by Lemma 9.14, we conclude that \[\operatorname{sgn}(\tau_{k,l})=-1\] for all transpositions \(\tau_{k,l} \in S_n.\)

Similarly to elementary matrices being the building blocks of invertible matrices, transpositions are the building blocks of permutations:

Proposition 10.15

Let \(n\in \mathbb{N}\) and \(\sigma \in S_n.\) Then there exist \(m \geqslant 0\) and transpositions \(\tau_{k_1,l_1},\ldots, \tau_{k_m,l_m} \in S_n\) such that \(\sigma=\tau_{k_m,l_m}\cdots \tau_{k_1,l_1},\) where we use the convention that a product of \(0\) transpositions is the identity permutation.

Example 10.16

Let \(n=6\) and \(\sigma\) the permutation defined by the matrix \[\boldsymbol{\sigma}=\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 3 & 5 & 2 & 4 & 6 & 1\end{pmatrix}.\] To express it as a product of transpositions, we successively swap two entries of the bottom row until we reach the identity; swapping the entries in positions \(k\) and \(l\) amounts to multiplying from the right by \(\tau_{k,l}:\) \[\begin{array}{cccccc|c} 3 & 5 & 2 & 4 & 6 & 1 & \\ 3 & 2 & 5 & 4 & 6 & 1 & \tau_{2,3} \\ 1 & 2 & 5 & 4 & 6 & 3 & \tau_{1,6} \\ 1 & 2 & 5 & 4 & 3 & 6 & \tau_{5,6} \\ 1 & 2 & 3 & 4 & 5 & 6 & \tau_{3,5}\end{array}\] This gives \(\sigma\tau_{2,3}\tau_{1,6}\tau_{5,6}\tau_{3,5}=1,\) and since every transposition is its own inverse, we obtain \(\sigma=\tau_{3,5}\tau_{5,6}\tau_{1,6}\tau_{2,3}.\)

Proof of Proposition 10.15. We use induction on \(n.\) For \(n=1\) we have \(\mathcal{X}_n=\{1\}\) and the only permutation is the identity permutation \(1,\) so the statement holds with \(m=0;\) this anchors the induction.

Inductive step: Assume \(n \in \mathbb{N}\) and \(n\geqslant 2.\) We want to show that if the claim holds for \(S_{n-1},\) then it also holds for \(S_n.\) Let \(\sigma \in S_n\) and define \(k=\sigma(n).\) If \(k=n,\) set \(\sigma_1=\sigma;\) otherwise set \(\sigma_1=\tau_{n,k}\sigma,\) which satisfies \(\sigma_1(n)=\tau_{n,k}(\sigma(n))=\tau_{n,k}(k)=n.\) In either case \(\sigma_1\) does not permute \(n.\) Restricting \(\sigma_1\) to the first \(n-1\) elements, we obtain a permutation of \(\{1,\ldots,n-1\}.\) By the induction hypothesis, there thus exist \({\tilde{m}} \geqslant 0\) and transpositions \(\tau_{k_1,l_1},\ldots,\tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}} \in S_n\) such that \[\sigma_1=\tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}}\cdots \tau_{k_1,l_1}.\] If \(k=n,\) this is already the desired decomposition of \(\sigma=\sigma_1\) with \(m={\tilde{m}}.\) Otherwise, since \(\tau_{n,k}^2=1,\) multiplying \(\sigma_1=\tau_{n,k}\sigma\) from the left with \(\tau_{n,k}\) gives \(\sigma=\tau_{n,k}\tau_{k_{{\tilde{m}}},l_{{\tilde{m}}}}\cdots \tau_{k_1,l_1},\) and the claim follows with \(m={\tilde{m}}+1.\)
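The proof is constructive and translates directly into a procedure (a sketch with our own names; it may return a different, equally valid decomposition than the one found in Example 10.16):

```python
def to_transpositions(sigma):
    """Transpositions (k, l) with sigma = t_1 t_2 ... t_m (t_m acts first)."""
    values = list(sigma)                 # values[i-1] = sigma(i)
    swaps = []
    for n in range(len(values), 0, -1):  # the induction of Prop. 10.15
        k = values[n - 1]
        if k != n:
            p = values.index(n)              # left-multiply by tau_{n,k}:
            values[n - 1], values[p] = n, k  # swap the values n and k
            swaps.append((n, k))
    return swaps

def apply_product(transpositions, n):
    """One-line form of the product, rightmost transposition acting first."""
    def image(i):
        for k, l in reversed(transpositions):
            i = l if i == k else k if i == l else i
        return i
    return tuple(image(i) for i in range(1, n + 1))

sigma = (3, 5, 2, 4, 6, 1)               # the permutation of Example 10.16
assert apply_product(to_transpositions(sigma), 6) == sigma
```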

Combining Definition 10.13, Remark 10.14 and Proposition 10.15, we conclude:

Proposition 10.17

Let \(n \in \mathbb{N}\) and \(\sigma \in S_n.\) Then \(\operatorname{sgn}(\sigma)=\pm 1.\) If \(\sigma\) is a product of \(m\) transpositions, then \(\operatorname{sgn}(\sigma)=(-1)^m.\)

Remark 10.18

Permutations with \(\operatorname{sgn}(\sigma)=1\) are called even and permutations with \(\operatorname{sgn}(\sigma)=-1\) are called odd, since they arise from the composition of an even or odd number of transpositions, respectively.
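Definition 10.13 and Proposition 10.17 give two independent ways to compute a signature, compared below in a short sketch (helper names ours):

```python
import numpy as np

def perm_matrix(sigma):
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    for i, s in enumerate(sigma):
        P[s - 1, i] = 1
    return P

def sgn_via_transpositions(sigma):
    """(-1)**m, where m swaps sort sigma; each swap is one transposition."""
    values, m = list(sigma), 0
    for i in range(len(values)):
        while values[i] != i + 1:        # move values[i] to its home slot
            j = values[i] - 1
            values[i], values[j] = values[j], values[i]
            m += 1
    return (-1) ** m

sigma = (3, 5, 2, 4, 6, 1)
# Definition 10.13 (det of the permutation matrix) agrees with Prop. 10.17:
assert round(np.linalg.det(perm_matrix(sigma))) == sgn_via_transpositions(sigma) == 1
```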

10.3 The Leibniz formula

Besides the Laplace expansion, there is also a formula for the determinant which relies on permutations. As a warm-up, we first consider the case \(n=2.\) Using the linearity of the determinant in the first row, we obtain \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\det\begin{pmatrix} a & 0 \\ c & d \end{pmatrix} +\det \begin{pmatrix} 0 & b \\ c & d \end{pmatrix},\] where \(a,b,c,d \in \mathbb{K}.\) Using the linearity of the determinant in the second row, we can further decompose the two above summands \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\underbrace{\det \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}+\det\begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}}_{=\det \begin{pmatrix} a & 0 \\ c & d\end{pmatrix}}+\underbrace{\det \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix}+\det\begin{pmatrix} 0 & b \\ 0 & d \end{pmatrix}}_{=\det \begin{pmatrix} 0 & b \\ c & d\end{pmatrix}}\] The first and fourth summand are always zero due to the occurrence of a zero column. The second and third summand are possibly nonzero (it might still happen that they are zero in the case where some of \(a,b,c,d\) are zero). In any case, we get \[\det\begin{pmatrix} a & b \\ c & d \end{pmatrix}=\det \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}+\det \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix}.\] We can proceed analogously in general. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K}).\) We denote the rows of \(\mathbf{A}\) by \(\{\vec{\alpha}_1,\ldots,\vec{\alpha}_n\}.\) Using the linearity of \(\det\) in the first row, we can write \[\begin{gathered} \det \mathbf{A}=\det\begin{pmatrix} A_{11} & 0 & 0 & \cdots & 0 \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}+\det\begin{pmatrix} 0 & A_{12} & 0 & \cdots & 0 \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}+\cdots\\ \cdots +\det\begin{pmatrix} 0 & 0 & 0 & \cdots & A_{1n} \\ && \vec{\alpha}_2 \\ && \vdots \\ && \vec{\alpha}_n&&\end{pmatrix}. \end{gathered}\] We can now use the linearity in the second row and proceed in the same fashion with each of the above summands. We continue this procedure until the \(n\)-th row. As a result, we can write \[\tag{10.2} \det \mathbf{A}= \sum_{k=1}^{n^n} \det \mathbf{M}_k\] where each of the matrices \(\mathbf{M}_k\) has exactly one possibly nonzero entry in each row. As above, some of the matrices \(\mathbf{M}_k\) will have a zero column so that their determinant vanishes. The matrices \(\mathbf{M}_k\) without a zero column must have exactly one possibly nonzero entry in each row and each column. We can thus write the matrices \(\mathbf{M}_k\) with possibly nonzero determinant as \[\mathbf{M}_k=\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\] for some permutation \(\sigma \in S_n.\) Every permutation of \(\{1,\ldots,n\}\) occurs precisely once in the expansion (10.2), hence we can write \[\det \mathbf{A}=\sum_{\sigma \in S_n}\det\left(\sum_{i=1}^nA_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\right),\] where the notation \(\sum_{\sigma \in S_n}\) means that we sum over all possible permutations of \(\{1,\ldots,n\}.\) We will next write the matrix \(\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}\) differently. 
To this end notice that for all \(\sigma \in S_n,\) the permutation matrix \(\mathbf{P}_{\sigma}\) can be written as \(\mathbf{P}_{\sigma}=\sum_{i=1}^n \mathbf{E}_{\sigma(i),i}.\) Furthermore, the diagonal matrix \[\mathbf{D}_{\sigma}=\begin{pmatrix} A_{\sigma(1)1}& & & \\ & A_{\sigma(2)2} & & \\ && \ddots & & \\ &&& A_{\sigma(n)n}\end{pmatrix}\] can be written as \(\mathbf{D}_{\sigma}=\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{j,j}.\) Therefore, we obtain \[\mathbf{P}_{\sigma}\mathbf{D}_{\sigma}=\sum_{i=1}^n \mathbf{E}_{\sigma(i),i}\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{j,j}=\sum_{i=1}^n\sum_{j=1}^n A_{\sigma(j)j}\mathbf{E}_{\sigma(i),i}\mathbf{E}_{j,j}=\sum_{i=1}^n A_{\sigma(i)i}\mathbf{E}_{\sigma(i),i}.\] We thus have the formula \[\det \mathbf{A}=\sum_{\sigma \in S_n} \det\left(\mathbf{P}_{\sigma}\mathbf{D}_{\sigma}\right)=\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\det(\mathbf{D}_\sigma),\] where we used the product rule (Proposition 10.1) and the definition of the signature of a permutation (Definition 10.13). By Proposition 10.4 (a diagonal matrix being in particular upper triangular), the determinant of a diagonal matrix is the product of its diagonal entries, hence we obtain \[\det \mathbf{A}=\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}.\] Finally, writing \(\pi=\sigma^{-1},\) we have \[\prod_{i=1}^n A_{\sigma(i)i}=\prod_{j=1}^n A_{j\pi(j)}.\] We have thus shown:

Proposition 10.19 • Leibniz formula for the determinant

Let \(n \in \mathbb{N}\) and \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K}).\) Then we have \[\tag{10.3} \det(\mathbf{A}) =\sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}=\sum_{\pi \in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n A_{j\pi(j)}.\]

Example 10.20

For \(n=3\) we have six permutations \[\begin{aligned} \boldsymbol{\sigma}_1&=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad \boldsymbol{\sigma}_2=\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad \boldsymbol{\sigma}_3=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}\\ \boldsymbol{\sigma}_4&=\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad \boldsymbol{\sigma}_5=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \quad \boldsymbol{\sigma}_6=\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \end{aligned}\] For \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant 3} \in M_{3,3}(\mathbb{K}),\) the Leibniz formula gives \[\begin{gathered} \det(\mathbf{A})=\operatorname{sgn}(\sigma_1)A_{11}A_{22}A_{33}+\operatorname{sgn}(\sigma_2)A_{11}A_{23}A_{32}+\operatorname{sgn}(\sigma_3)A_{12}A_{21}A_{33}\\ +\operatorname{sgn}(\sigma_4)A_{12}A_{23}A_{31}+\operatorname{sgn}(\sigma_5)A_{13}A_{21}A_{32}+\operatorname{sgn}(\sigma_6)A_{13}A_{22}A_{31}, \end{gathered}\] so that in agreement with Example 9.20, we obtain \[\begin{gathered} \det \mathbf{A}=A_{11}A_{22}A_{33}-A_{11}A_{23}A_{32}-A_{12}A_{21}A_{33}\\+A_{12}A_{23}A_{31}+A_{13}A_{21}A_{32}-A_{13}A_{22}A_{31}. \end{gathered}\]
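The Leibniz formula translates directly into code. In the sketch below (our own names) the signature is obtained by counting inversions: sorting the bottom row by adjacent swaps uses one transposition per inversion, so \(\operatorname{sgn}(\sigma)=(-1)^{\#\text{inversions}}\) by Proposition 10.17. With \(n!\) summands this is only practical for very small \(n.\)

```python
from itertools import permutations
from math import prod

def sgn(sigma):
    """Signature as (-1)**(number of inversions), cf. Proposition 10.17."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if sigma[i] > sigma[j])
    return (-1) ** inversions

def det_leibniz(A):
    """Second form of (10.3): sum over pi of sgn(pi) * prod_j A[j][pi(j)]."""
    n = len(A)
    return sum(sgn(p) * prod(A[j][p[j] - 1] for j in range(n))
               for p in permutations(range(1, n + 1)))

A = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
print(det_leibniz(A))  # 4
```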

Remark 10.21

It follows from Leibniz’ formula that \(\det(\mathbf{A}) = \det(\mathbf{A}^T)\) (see Exercise 10.27 below). This has the following important consequences:

  1. the determinant is also multilinear and alternating when thought of as a map \((\mathbb{K}^n)^n \to \mathbb{K},\) that is, when taking \(n\) column vectors as input. In particular, the determinant is also linear in each column;

  2. the Laplace expansion is also valid if we expand with respect to a row, that is, for \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and \(1\leqslant l \leqslant n,\) we have \[\det(\mathbf{A})=\sum_{k=1}^n(-1)^{k+l}[\mathbf{A}]_{lk}\det\left(\mathbf{A}^{(l,k)}\right).\]

Example 10.22

(\(\heartsuit\) – not examinable). For \(n \in \mathbb{N}\) and a vector \(\vec{x}=(x_i)_{1\leqslant i\leqslant n}\in \mathbb{K}^n\) we can form a matrix \(\mathbf{V}_{\vec{x}}=(V_{ij})_{1\leqslant i,j\leqslant n} \in M_{n,n}(\mathbb{K})\) with \(V_{ij}=x_i^{j-1},\) that is, \[\mathbf{V}_{\vec{x}}=\begin{pmatrix} 1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1} \\ 1 & x_2 & (x_2)^2 & \cdots & (x_2)^{n-1} \\ 1 & x_3 & (x_3)^2 & \cdots & (x_3)^{n-1} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & (x_n)^2 & \cdots & (x_n)^{n-1}\end{pmatrix}.\] Such matrices are known as Vandermonde matrices, and their determinants as Vandermonde determinants; they satisfy \[\det(\mathbf{V}_{\vec{x}})=\prod_{1\leqslant i<j \leqslant n}(x_j-x_i).\]

Sketch of a proof. We can define a function \(f : \mathbb{K}^n \to \mathbb{K},\) \(\vec{x} \mapsto \det(\mathbf{V}_{\vec{x}}).\) By the Leibniz formula, \(f\) is a polynomial in the variables \(x_i\) with integer coefficients. If we freeze all variables of \(f\) except the \(\ell\)-th one, we obtain a function \(g_{\ell} : \mathbb{K}\to \mathbb{K}\) of the single variable \(x_{\ell}.\) For \(1\leqslant i\leqslant n\) with \(i\neq \ell\) we have \(g_{\ell}(x_i)=0,\) since we then compute the determinant of a matrix with two identical rows, the \(\ell\)-th row and the \(i\)-th row. Factoring out these roots, we can write \(g_{\ell}(x_{\ell})=q_{\ell}(x_{\ell})\prod_{1\leqslant i\leqslant n, i\neq \ell}(x_{\ell}-x_i)\) for some polynomial \(q_{\ell}.\) Repeating this argument for all \(\ell,\) we can write \(\det(\mathbf{V}_{\vec{x}})=q(\vec{x})\prod_{1\leqslant i<j\leqslant n}(x_j-x_i)\) for some polynomial \(q(\vec{x}).\)

On the other hand, if we multiply all the \(x_i\) by a constant \(s \in \mathbb{K},\) the determinant multiplies by \(s^{1 + 2 + \dots + (n-1)} = s^{n(n-1)/2}.\) The product \(\prod_{1\leqslant i<j\leqslant n}(x_j-x_i)\) has the same scaling behaviour (since it has \(n(n-1)/2\) linear factors); so \(q\) must be invariant under scaling the variables. This implies that \(q\) has to be constant.

Using the Leibniz formula, we see that the summand of \(\det(\mathbf{V}_{\vec{x}})\) corresponding to the identity permutation is the product of the diagonal entries of \(\mathbf{V}_{\vec{x}},\) that is, \(x_2(x_3)^2\cdots(x_n)^{n-1}\) (and no other term in the sum has this combination of exponents). Taking the first term in all factors of \(\prod_{1\leqslant i<j\leqslant n}(x_j-x_i),\) we also obtain \(x_2(x_3)^2\cdots(x_n)^{n-1}\) (and no other term in the product has these exponents). Hence the constant \(q\) must be 1, and so \(\det(\mathbf{V}_{\vec{x}})=\prod_{1\leqslant i<j\leqslant n}(x_j-x_i),\) as claimed.
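A quick numerical check of the Vandermonde formula (a sketch; the sample points are our own):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
n = len(x)
V = np.vander(x, increasing=True)  # V[i, j] = x[i] ** j, as in Example 10.22
product = np.prod([x[j] - x[i] for i in range(n) for j in range(i + 1, n)])
print(np.linalg.det(V), product)   # both 540.0 (up to rounding)
```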

10.4 Cramer’s rule

The determinant can be used to give a formula for the solution of a linear system of equations of the form \(\mathbf{A}\vec{x}=\vec{b}\) for an invertible matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) \(\vec{b} \in \mathbb{K}^n\) and unknowns \(\vec{x} \in \mathbb{K}^n.\) This formula is often referred to as Cramer’s rule. In order to derive it we start with a definition:

Definition 10.23 • Adjugate matrix

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) be a square matrix. The adjugate matrix of \(\mathbf{A}\) is the \(n\times n\)-matrix \(\operatorname{Adj}(\mathbf{A})\) whose entries are given by (notice the reverse order of \(i\) and \(j\) on the right hand side) \[[\operatorname{Adj}(\mathbf{A})]_{ij}=(-1)^{i+j}\det\left(\mathbf{A}^{(j,i)}\right), \qquad 1\leqslant i,j\leqslant n.\]

Example 10.24

\[\operatorname{Adj}\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \quad \operatorname{Adj}\left(\begin{pmatrix} 1 & 1 & 2 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix}\right)=\begin{pmatrix} 4 & -2 & -3 \\ 1 & 0 & -1 \\ -2 & 1 & 2 \end{pmatrix}\]

The determinant and the adjugate matrix provide a formula for the inverse of a matrix:

Theorem 10.25

Let \(n \in \mathbb{N}\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\) Then we have \[\operatorname{Adj}(\mathbf{A})\mathbf{A}=\mathbf{A}\operatorname{Adj}(\mathbf{A})=\det(\mathbf{A})\mathbf{1}_{n}.\] In particular, if \(\mathbf{A}\) is invertible then \[\mathbf{A}^{-1}=\frac{1}{\det \mathbf{A}}\operatorname{Adj}(\mathbf{A}).\]

Proof. Let \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant n}.\) For \(1\leqslant i\leqslant n\) we obtain for the \(i\)-th diagonal entry \[[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ii}=\sum_{k=1}^n(-1)^{i+k}\det\left(\mathbf{A}^{(k,i)}\right)A_{ki}=\det(\mathbf{A}),\] where we use the Laplace expansion (9.5) of the determinant. The diagonal entries of \(\operatorname{Adj}(\mathbf{A})\mathbf{A}\) are thus all equal to \(\det \mathbf{A}.\) For \(1\leqslant i,j\leqslant n\) with \(i\neq j\) we have \[[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}=\sum_{k=1}^n(-1)^{i+k}\left(\det \mathbf{A}^{(k,i)}\right)A_{kj}.\] We would like to interpret this last expression as a Laplace expansion. We consider a new matrix \(\hat{\mathbf{A}}=(\hat{A}_{kl})_{1\leqslant k,l\leqslant n}\) which is identical to \(\mathbf{A},\) except that the \(i\)-th column of \(\mathbf{A}\) is replaced with the \(j\)-th column of \(\mathbf{A},\) that is, for \(1\leqslant k,l\leqslant n,\) we have \[\tag{10.4} \hat{A}_{kl}=\left\{\begin{array}{cc} A_{kj}, & l=i,\\ A_{kl}, & l\neq i.\end{array}\right.\] Then, for all \(1\leqslant k \leqslant n\) we have \(\hat{\mathbf{A}}^{(k,i)}=\mathbf{A}^{(k,i)},\) since the only column in which \(\mathbf{A}\) and \(\hat{\mathbf{A}}\) differ is removed in \(\mathbf{A}^{(k,i)}.\) Using (10.4), the Laplace expansion of \(\hat{\mathbf{A}}\) with respect to the \(i\)-th column gives \[\begin{aligned} \det\hat{\mathbf{A}}&=\sum_{k=1}^n(-1)^{i+k}\hat{A}_{ki}\det\left(\hat{\mathbf{A}}^{(k,i)}\right)=\sum_{k=1}^n(-1)^{i+k}\left(\det \mathbf{A}^{(k,i)}\right)A_{kj}\\ &=[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}. \end{aligned}\] The \(j\)-th column of \(\mathbf{A}\) occurs twice in \(\hat{\mathbf{A}}\) (as its \(i\)-th and as its \(j\)-th column), hence the column vectors of \(\hat{\mathbf{A}}\) are linearly dependent. Therefore \(\hat{\mathbf{A}}\) is not invertible by Proposition 3.21 and so \(\det\hat{\mathbf{A}}=[\operatorname{Adj}(\mathbf{A})\mathbf{A}]_{ij}=0\) by Corollary 10.2. The off-diagonal entries of \(\operatorname{Adj}(\mathbf{A})\mathbf{A}\) are thus all zero and we conclude \(\operatorname{Adj}(\mathbf{A})\mathbf{A}=\det(\mathbf{A})\mathbf{1}_{n}.\) Using the row version of the Laplace expansion we conclude analogously that \(\mathbf{A}\operatorname{Adj}(\mathbf{A})=\det(\mathbf{A})\mathbf{1}_{n}.\)

Finally, if \(\mathbf{A}\) is invertible, then \(\det \mathbf{A}\neq 0\) by Corollary 10.2, so that \(\mathbf{A}^{-1}=\operatorname{Adj}(\mathbf{A})/\det(\mathbf{A}),\) as claimed.
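Definition 10.23 and Theorem 10.25 combine into a few lines of code (a sketch with our own helper names), checked here on the \(3\times 3\) matrix of Example 10.24:

```python
import numpy as np

def minor(A, k, l):
    """The submatrix A^{(k,l)}: delete row k and column l (1-based)."""
    return np.delete(np.delete(A, k - 1, axis=0), l - 1, axis=1)

def adjugate(A):
    n = A.shape[0]
    Adj = np.empty((n, n))
    for i in range(1, n + 1):
        for j in range(1, n + 1):  # note the reversed order (j, i):
            Adj[i - 1, j - 1] = (-1) ** (i + j) * np.linalg.det(minor(A, j, i))
    return np.round(Adj)           # entries are integers for this matrix

A = np.array([[1.0, 1, 2], [0, 2, 1], [1, 0, 2]])
print(adjugate(A))       # matches Example 10.24
print(adjugate(A) @ A)   # equals det(A) * 1_3 = identity, per Theorem 10.25
```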

We now use Theorem 10.25 to obtain a formula for the solution of the linear system \(\mathbf{A}\vec{x}=\vec{b}\) for an invertible matrix \(\mathbf{A}.\) Multiplying from the left with \(\mathbf{A}^{-1},\) we get \[\vec{x}=\mathbf{A}^{-1}\vec{b}=\frac{1}{\det \mathbf{A}}\operatorname{Adj}(\mathbf{A})\vec{b}.\] Writing \(\vec{x}=(x_i)_{1\leqslant i\leqslant n},\) multiplication with \(\det \mathbf{A}\) gives for \(1\leqslant i \leqslant n\) \[x_i\det \mathbf{A}=\sum_{k=1}^n[\operatorname{Adj}(\mathbf{A})]_{ik}b_k=\sum_{k=1}^n(-1)^{i+k}\det\left(\mathbf{A}^{(k,i)}\right)b_k.\] We can again interpret the right hand side as a Laplace expansion of the matrix \(\hat{\mathbf{A}}_i\) obtained by replacing the \(i\)-th column of \(\mathbf{A}\) with \(\vec{b}\) and leaving \(\mathbf{A}\) unchanged otherwise. Hence, we have for all \(1\leqslant i\leqslant n\) \[x_i=\frac{\det \hat{\mathbf{A}}_i}{\det \mathbf{A}}.\] This formula is known as Cramer’s rule. While this is a neat formula, it is rarely used in computing solutions to linear systems of equations due to the complexity of computing determinants.

Example 10.26 • Cramer’s rule

We consider the system \(\mathbf{A}\vec{x}=\vec{b}\) for \[\mathbf{A}=\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}\qquad \text{and} \qquad \vec{b}=\begin{pmatrix} -2 \\ 2 \\ 4 \end{pmatrix}.\] Here we obtain \[\hat{\mathbf{A}}_1=\begin{pmatrix} -2 & 1 & 1 \\ 2 & 2 & 1 \\ 4 & 1 & 2 \end{pmatrix}, \quad \hat{\mathbf{A}}_2=\begin{pmatrix} 2 & -2 & 1 \\ 1 & 2 & 1 \\ 1 & 4 & 2 \end{pmatrix}, \quad \hat{\mathbf{A}}_3=\begin{pmatrix} 2 & 1 & -2 \\ 1 & 2 & 2 \\ 1 & 1 & 4 \end{pmatrix}.\] We compute \(\det\mathbf{A}=4,\) \(\det\hat{\mathbf{A}}_1=-12,\) \(\det\hat{\mathbf{A}}_2=4\) and \(\det\hat{\mathbf{A}}_3=12\) so that Cramer’s rule gives indeed the correct solution \[\vec{x}=\frac{1}{4}\begin{pmatrix} -12 \\ 4 \\ 12 \end{pmatrix}=\begin{pmatrix} -3 \\ 1 \\ 3\end{pmatrix}.\]
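Cramer’s rule fits in a short function (a sketch, names ours; it assumes \(\det\mathbf{A}\neq 0\)), reproducing Example 10.26:

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b for invertible A via x_i = det(A_i) / det(A)."""
    n = A.shape[0]
    det_A = np.linalg.det(A)           # assumed nonzero
    x = np.empty(n)
    for i in range(n):
        A_i = A.astype(float)          # a fresh copy of A
        A_i[:, i] = b                  # replace the i-th column with b
        x[i] = np.linalg.det(A_i) / det_A
    return x

A = np.array([[2.0, 1, 1], [1, 2, 1], [1, 1, 2]])
b = np.array([-2.0, 2, 4])
print(cramer(A, b))                    # [-3.  1.  3.]
```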


Exercises

Exercise 10.27

Use the Leibniz formula to show that \[\det(\mathbf{A})=\det(\mathbf{A}^T)\] for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\)

Solution

Let \(\mathbf{A}= (A_{ij})_{1\leqslant i,j\leqslant n}.\) We first note that \(\operatorname{sgn}(\sigma) = \operatorname{sgn}(\sigma^{-1})\) for all \(\sigma\in S_n:\) this follows from \(\operatorname{sgn}(\sigma)\operatorname{sgn}(\sigma^{-1})=\operatorname{sgn}(\sigma\circ\sigma^{-1}) = \operatorname{sgn}(1) = 1\) together with \(\operatorname{sgn}(\sigma)=\pm 1.\) The Leibniz formula reads \[\det(\mathbf{A}) = \sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{i=1}^n A_{\sigma(i)i}.\] For fixed \(\sigma\) we write \(\pi=\sigma^{-1}\) and substitute \(j=\sigma(i),\) i.e. \(i=\pi(j),\) in the product. Summing over all permutations is the same as summing over all inverse permutations, hence we find \[\begin{aligned} \det(\mathbf{A}) & = \sum_{\sigma \in S_n}\operatorname{sgn}(\sigma)\prod_{j=1}^n A_{j\pi(j)}\\ & = \sum_{\pi\in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n A_{j\pi(j)}\\ & = \sum_{\pi\in S_n}\operatorname{sgn}(\pi)\prod_{j=1}^n [\mathbf{A}^T]_{\pi(j)j}\\ & = \det(\mathbf{A}^T). \end{aligned}\]



MCQ 10.1

If \(\sigma\) is a permutation, then \(\mathbf P_\sigma^{-1}=\mathbf P_{\sigma}.\)

  • True
  • False
MCQ 10.2

If \(\sigma\in S_n\) then \(\sigma^n = 1.\)

  • True
  • False
MCQ 10.3

If \(\sigma,\tau\in S_n\) then \(\sigma^n = \tau^n\) implies \(\sigma=\tau.\)

  • True
  • False
MCQ 10.4

If \(\sigma,\tau\in S_n\) then \(\sigma\circ\tau=1\) implies \(\tau\circ\sigma=1.\)

  • True
  • False
MCQ 10.5

It holds that \(|\det(\mathbf{A})|=\sqrt{\det(\mathbf{A}^T\mathbf{A})}\) for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\)

  • True
  • False
MCQ 10.6

Let \(\mathbf{A}\in M_{2n+1,2n+1}(\mathbb{K})\) be anti-symmetric. Then \(\det(\mathbf{A})=0.\)

  • True
  • False
MCQ 10.7

No anti-symmetric matrix \(\mathbf{A}\) is invertible.

  • True
  • False
MCQ 10.8

If \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is such that \(\mathbf{A}^T\mathbf{A}=\mathbf{1}_{n},\) then \(\det(\mathbf{A})=1.\)

  • True
  • False
MCQ 10.9

A matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) is invertible if and only if \(\operatorname{Adj}(\mathbf{A})\) is.

  • True
  • False
MCQ 10.10

Given \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{K}),\) then \(\det(\mathbf{A})\det(\mathbf{B})=\det(\mathbf{B})\det(\mathbf{A}).\)

  • True
  • False
MCQ 10.11

Given \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{K}),\) then \(\det(\mathbf{A}\mathbf{B})=\det(\mathbf{A})+\det(\mathbf{B}).\)

  • True
  • False
MCQ 10.12

A square matrix is non-invertible if and only if its transpose is non-invertible.

  • True
  • False
MCQ 10.13

The matrix \(\begin{pmatrix}a & b\\ 5 & b\end{pmatrix}\) is invertible if and only if \(a\ne5\) and \(b\ne0.\)

  • True
  • False
MCQ 10.14

The matrix \(\begin{pmatrix}x+\mathrm i & 0\\ 0 & x-\mathrm i\end{pmatrix}\) is invertible for all \(x\in\mathbb{R}.\)

  • True
  • False
MCQ 10.15

If \(\mathbf{A}\in M_{m,n}(\mathbb{K}),\) where \(m\ne n,\) then \(\det(\mathbf{A}^T\mathbf{A})=\det(\mathbf{A}\mathbf{A}^T).\)

  • True
  • False
MCQ 10.16

It holds that \(\operatorname{Adj}(\mathbf{A}^T)=\operatorname{Adj}(\mathbf{A})^T\) for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}).\)

  • True
  • False
MCQ 10.17

If \(\mathbf{B}\in M_{n,n}(\mathbb{K})\) is the reduced row echelon form of \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) then \(\det(\mathbf{A})=\det(\mathbf{B}).\)

  • True
  • False
MCQ 10.18

If \(\mathbf{A},\mathbf{B}\in M_{n,n}(\mathbb{K})\) and \(\mathbf{B}\) is invertible, then \(\det(\mathbf{B}\mathbf{A}\mathbf{B}^{-1})=\det(\mathbf{A}).\)

  • True
  • False
