5 The determinant

5.1 Axiomatic characterisation

Surprisingly, whether or not a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) admits an inverse is captured by a single scalar, called the determinant of \(\mathbf{A}\) and denoted by \(\det \mathbf{A}\) or \(\det(\mathbf{A}).\) That is, the matrix \(\mathbf{A}\) admits an inverse if and only if \(\det \mathbf{A}\) is nonzero. In practice, however, it is often quicker to use Gauss–Jordan elimination to decide whether the matrix admits an inverse. The determinant is nevertheless a useful tool in linear algebra.

The determinant is an object of multilinear algebra, which – for \(\ell \in \mathbb{N}\) – considers mappings from the \(\ell\)-fold Cartesian product of a \(\mathbb{K}\)-vector space into another \(\mathbb{K}\)-vector space. Such a mapping \(f\) is required to be linear in each variable. This simply means that if we freeze all variables of \(f,\) except for the \(k\)-th variable, \(1\leqslant k\leqslant \ell,\) then the resulting mapping \(g_{k}\) of one variable is required to be linear. More precisely:

Definition 5.1 • Multilinear map

Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) A mapping \(f : V^\ell \to W\) is called \(\ell\)-multilinear (or simply multilinear) if the mapping \(g_{k} : V \to W,\) \(v \mapsto f(v_1,\ldots,v_{k-1},v,v_{k+1},\ldots,v_{\ell})\) is linear for all \(1 \leqslant k \leqslant \ell\) and for all \(\ell\)-tuples \((v_1,\ldots,v_{\ell}) \in V^\ell.\)

We only need an \((\ell-1)\)-tuple of vectors to define the map \(g_{k},\) but the above definition is more convenient to write down.

Two types of multilinear maps are of particular interest:

Definition 5.2 • Symmetric and alternating multilinear maps

Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(f : V^\ell \to W\) an \(\ell\)-multilinear map.

  • The map \(f\) is called symmetric if exchanging two arguments does not change the value of \(f.\) That is, we have \[f(v_1,\ldots,v_{\ell})=f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \(1\leqslant i<j\leqslant \ell\) and all \((v_1,\ldots,v_{\ell}) \in V^\ell.\)

  • The map \(f\) is called alternating if \(f(v_1,\ldots,v_{\ell})=0_W\) whenever at least two arguments agree, that is, there exist \(i\neq j\) with \(v_i=v_j.\) Alternating \(\ell\)-multilinear maps are also called \(W\)-valued \(\ell\)-forms or simply \(\ell\)-forms when \(W=\mathbb{K}.\)

\(1\)-multilinear maps are simply linear maps. \(2\)-multilinear maps are called bilinear and \(3\)-multilinear maps are called trilinear. Most likely, you are already familiar with two examples of bilinear maps:

Example 5.3 • Bilinear maps

  1. The first one is the scalar product of two vectors in \(\mathbb{R}^3\) (or more generally \(\mathbb{R}^n\)). So \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Recall that the scalar product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}, \quad (\vec{x},\vec{y})\mapsto \vec{x}\cdot \vec{y}=x_1y_1+x_2y_2+x_3y_3,\] where we write \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\) and \(\vec{y}=(y_i)_{1\leqslant i\leqslant 3}.\) Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\cdot \vec{y}=s_1(\vec{x}_1\cdot\vec{y})+s_2(\vec{x}_2\cdot\vec{y}),\] so that the scalar product is linear in the first variable. Furthermore, the scalar product is symmetric, \(\vec{x}\cdot \vec{y}=\vec{y}\cdot \vec{x}.\) It follows that the scalar product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear.

  2. The second one is the cross product of two vectors in \(\mathbb{R}^3.\) Here \(V=\mathbb{R}^3\) and \(W=\mathbb{R}^3.\) Recall that the cross product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3, \quad (\vec{x},\vec{y})\mapsto \vec{x}\times \vec{y}=\begin{pmatrix} x_2y_3-x_3y_2 \\ x_3y_1-x_1y_3 \\ x_1y_2-x_2y_1 \end{pmatrix}.\] Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\times \vec{y}=s_1(\vec{x}_1\times\vec{y})+s_2(\vec{x}_2\times\vec{y}),\] so that the cross product is linear in the first variable. Likewise, we can check that the cross product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear. Observe that the cross product is alternating.
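The two bilinearity computations above can also be spot-checked numerically. The following Python sketch (the function names `dot`, `cross` and `axpy` are ad hoc choices for this illustration, not part of the notes) verifies linearity in the first argument on sample vectors and the alternating property of the cross product:

```python
# Numerical spot-check of Example 5.3 (illustrative, not a proof).
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def axpy(s, x, t, y):  # the linear combination s*x + t*y, componentwise
    return tuple(s * a + t * b for a, b in zip(x, y))

x1, x2, y = (1, 2, 3), (4, -1, 0), (2, 0, 5)
s1, s2 = 3, -2

# linearity of the scalar product in the first argument
assert dot(axpy(s1, x1, s2, x2), y) == s1 * dot(x1, y) + s2 * dot(x2, y)
# linearity of the cross product in the first argument
assert cross(axpy(s1, x1, s2, x2), y) == axpy(s1, cross(x1, y), s2, cross(x2, y))
# the cross product is alternating: y x y = 0
assert cross(y, y) == (0, 0, 0)
```

A check on sample vectors is of course no substitute for the algebraic verification above.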

Example 5.4 • Multilinear map

Let \(V=\mathbb{K}\) and consider \(f : V^\ell \to \mathbb{K},\) \((x_1,\ldots,x_\ell)\mapsto x_1x_2\cdots x_\ell.\) Then \(f\) is \(\ell\)-multilinear and symmetric.

Example 5.5

Let \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric matrix, \(\mathbf{A}^T=\mathbf{A}.\) Notice that we obtain a symmetric bilinear map \[f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad (\vec{x},\vec{y}) \mapsto \vec{x}^T\mathbf{A}\vec{y},\] where on the right-hand side all products are defined by matrix multiplication.
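As a quick illustration of Example 5.5, the following ad hoc Python sketch (sample matrix and vectors chosen arbitrarily) checks symmetry and linearity in the first argument for one concrete symmetric matrix:

```python
# Spot-check of Example 5.5: (x, y) -> x^T A y for symmetric A (illustrative).
def bil(A, x, y):
    # expand x^T A y as a double sum over the entries of A
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

A = [[2, 1], [1, 3]]          # A^T = A
x, y = (1, -2), (4, 5)
assert bil(A, x, y) == bil(A, y, x)          # symmetry

s, t = 2, -3
x2 = (0, 7)
sx = tuple(s * a + t * b for a, b in zip(x, x2))
assert bil(A, sx, y) == s * bil(A, x, y) + t * bil(A, x2, y)   # linear in x
```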

Example 5.5 gives us a wealth of symmetric bilinear maps on \(\mathbb{R}^n.\) As we will see shortly, the situation is quite different if we consider alternating \(n\)-multilinear maps on \(\mathbb{K}_n\) (notice that the number \(n\) of arguments equals the dimension of \(\mathbb{K}_n\)).

Let \(\{\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n\}\) denote the standard basis of \(\mathbb{K}_n.\)

Theorem 5.6

Let \(n \in \mathbb{N}.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : (\mathbb{K}_n)^n \to \mathbb{K}\) satisfying \(f_n(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=1.\)

Recall that we have a bijective mapping \(\Omega : (\mathbb{K}_n)^n \to M_{n,n}(\mathbb{K})\) which forms an \(n\times n\)-matrix from \(n\) row vectors of length \(n.\) For the choice \(V=\mathbb{K}_n,\) the notion of \(n\)-multilinearity thus also makes sense for a mapping \(f : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) which takes an \(n\times n\) matrix as an input. Here multilinearity means that the mapping is linear in each row of the matrix. Since \(\Omega(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=\mathbf{1}_{n},\) we may phrase the above theorem equivalently as:

Theorem 5.7 • Existence and uniqueness of the determinant

Let \(n \in \mathbb{N}.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K})\to \mathbb{K}\) satisfying \(f_n(\mathbf{1}_{n})=1.\)

Definition 5.8 • Determinant
The mapping \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) provided by Theorem 5.7 is called the determinant and denoted by \(\det.\) For \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) we say \(\det(\mathbf{A})\) is the determinant of the matrix \(\mathbf{A}.\)
Remark 5.9 • Abuse of notation

It would be more precise to write \(\det_n\) since the determinant is a family of mappings, one mapping \(\det_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) for each \(n \in \mathbb{N}.\) It is however common to simply write \(\det.\)

Example 5.10 For \(n=1\) the condition that a \(1\)-multilinear (i.e. linear) map \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K}\) is alternating is vacuous. So Theorem 5.7 states that there is a unique linear map \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K}\) that satisfies \(f_1((1))=1.\) Of course, this is just the map defined by the rule \(f_1((a))=a,\) where \((a) \in M_{1,1}(\mathbb{K})\) is any \(1\)-by-\(1\) matrix.
Example 5.11 For \(n=2\) and \(a,b,c,d \in \mathbb{K}\) we consider the mapping \(f_2 : M_{2,2}(\mathbb{K}) \to \mathbb{K}\) defined by the rule \[\tag{5.1} f_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=ad-cb.\] We claim that \(f_2\) is bilinear in the rows and alternating. The condition that \(f_2\) is alternating simplifies to \(f_2(\mathbf{A})=0\) whenever the two rows of \(\mathbf{A}\in M_{2,2}(\mathbb{K})\) agree. Clearly, \(f_2\) is alternating, since \[f_2\left(\begin{pmatrix} a & b \\ a & b \end{pmatrix}\right)=ab-ab=0.\] Furthermore, \(f_2\) needs to be linear in each row. The additivity condition applied to the first row gives that we must have \[f_2\left(\begin{pmatrix} a_1+a_2 & b_1+b_2 \\ c & d \end{pmatrix}\right)=f_2\left(\begin{pmatrix} a_1 & b_1 \\ c & d\end{pmatrix}\right)+f_2\left(\begin{pmatrix} a_2 & b_2 \\ c & d\end{pmatrix}\right)\] for all \(a_1,a_2,b_1,b_2,c,d \in \mathbb{K}.\) Using the definition (5.1), we obtain \[\begin{aligned} f_2\left(\begin{pmatrix} a_1+a_2 & b_1+b_2 \\ c & d \end{pmatrix}\right)&=(a_1+a_2)d-c(b_1+b_2)\\ &=a_1d-cb_1+a_2d-cb_2\\ &=f_2\left(\begin{pmatrix} a_1 & b_1 \\ c & d\end{pmatrix}\right)+f_2\left(\begin{pmatrix} a_2 & b_2 \\ c & d\end{pmatrix}\right), \end{aligned}\] so that \(f_2\) is indeed additive in the first row. The \(1\)-homogeneity condition applied to the first row gives that we must have \[f_2\left(\begin{pmatrix}sa & sb \\ c & d\end{pmatrix}\right)=sf_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)\] for all \(a,b,c,d \in \mathbb{K}\) and \(s\in \mathbb{K}.\) Indeed, using the definition (5.1), we obtain \[f_2\left(\begin{pmatrix}sa & sb \\ c & d\end{pmatrix}\right)=sad-csb=s(ad-cb)=sf_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right),\] so that \(f_2\) is also \(1\)-homogeneous in the first row. We conclude that \(f_2\) is linear in the first row. Likewise, the reader is invited to check that \(f_2\) is also linear in the second row. 
Furthermore, we can easily compute that \(f_2(\mathbf{1}_{2})=1.\) The mapping \(f_2\) thus satisfies all the properties of Theorem 5.7, hence by the uniqueness statement we must have \(f_2=\det\) and we obtain the formula \[\tag{5.2} \det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=ad-cb\] for all \(a,b,c,d \in\mathbb{K}.\)
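The properties just verified algebraically can also be spot-checked in code. In the Python sketch below (the name `f2` mirrors the text; the sample matrices are arbitrary and everything else is illustrative), we test normalisation, the alternating property, additivity and \(1\)-homogeneity in the first row:

```python
# Spot-check of Example 5.11 (illustrative, not a proof).
def f2(A):
    (a, b), (c, d) = A        # unpack the rows of the 2x2 matrix
    return a * d - c * b

assert f2(((1, 0), (0, 1))) == 1                      # f2(1_2) = 1
assert f2(((3, 7), (3, 7))) == 0                      # equal rows give 0

# additivity in the first row: C has first row = first row of A + first row of B
A = ((1, 2), (5, 6))
B = ((3, 4), (5, 6))
C = ((4, 6), (5, 6))
assert f2(C) == f2(A) + f2(B)

# 1-homogeneity in the first row
assert f2(((2 * 1, 2 * 2), (5, 6))) == 2 * f2(((1, 2), (5, 6)))
```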

5.2 Uniqueness of the determinant

So far we have only shown that the determinant exists for \(n=1\) and \(n=2.\) However, we need to show the existence and uniqueness part of Theorem 5.7 in general. We first show the uniqueness part. We start by deducing some consequences from the alternating property:
Lemma 5.12

Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) An alternating \(\ell\)-multilinear map \(f : V^\ell \to W\) satisfies:

  1. interchanging two arguments of \(f\) leads to a minus sign. That is, for \(1\leqslant i,j\leqslant \ell\) and \(i\neq j\) we obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1,\ldots,v_{\ell}) \in V^\ell\);

  2. if the vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) are linearly dependent, then \(f(v_1,\ldots,v_{\ell})=0_W\);

  3. for all \(1\leqslant i\leqslant \ell,\) for all \(\ell\)-tuples of vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) and scalars \(s_1,\ldots,s_\ell \in \mathbb{K},\) we have \[f(v_1,\ldots,v_{i-1},v_i+w,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell})\] where \(w=\sum_{j=1,j\neq i}^\ell s_jv_j.\) That is, adding a linear combination of vectors to some argument of \(f\) does not change the output, provided the linear combination consists of the remaining arguments.

Proof. (i) Since \(f\) is alternating, we have for all \((v_1,\ldots,v_{\ell}) \in V^\ell\) \[f(v_1,\ldots,v_{i-1},v_i+v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})=0_W.\] Using the linearity in the \(i\)-th argument, this gives \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] Using the linearity in the \(j\)-th argument, we obtain \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] The first summand has a double occurrence of \(v_i\) and hence vanishes by the alternating property. Likewise, the fourth summand has a double occurrence of \(v_j\) and hence vanishes as well. Since the second summand equals \(f(v_1,\ldots,v_{\ell}),\) we thus obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] as claimed.

(ii) Suppose \(\{v_1,\ldots,v_{\ell}\}\) are linearly dependent so that we have scalars \(s_j \in \mathbb{K}\) not all zero, \(1\leqslant j\leqslant \ell,\) so that \(s_1v_1+\cdots+s_\ell v_{\ell}=0_V.\) Suppose \(s_i\neq 0\) for some index \(1\leqslant i\leqslant \ell.\) Then \[v_i=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j\] and hence by the linearity in the \(i\)-th argument, we obtain \[\begin{gathered} f\left(v_1,\ldots,v_{i-1},-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j,v_{i+1},\ldots,v_{\ell}\right)\\=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)f\left(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell}\right)=0_W, \end{gathered}\] where we use that for each \(1\leqslant j\leqslant \ell\) with \(j\neq i,\) the expression \[f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})\] has a double occurrence of \(v_j\) and thus vanishes by the alternating property.

(iii) Let \((v_1,\ldots,v_{\ell})\in V^\ell\) and \((s_1,\ldots,s_\ell) \in \mathbb{K}^\ell.\) Then, using the linearity in the \(i\)-th argument, we compute \[\begin{gathered} f(v_1,\ldots,v_{i-1},v_i+\sum_{j=1,j\neq i}^\ell s_jv_j,v_{i+1},\ldots,v_{\ell})\\ =f(v_1,\ldots,v_{\ell})+\sum_{j=1,j\neq i}^\ell s_jf(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell}), \end{gathered}\] where the last equality follows exactly as in the proof of (ii).

The alternating property of an \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) together with the condition \(f_n(\mathbf{1}_{n})=1\) uniquely determines the value of \(f_n\) on the elementary matrices:

Lemma 5.13

Let \(n \in \mathbb{N}\) and \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) an alternating \(n\)-multilinear map satisfying \(f_n(\mathbf{1}_{n})=1.\) Then for all \(1\leqslant k,l\leqslant n\) with \(k\neq l\) and all \(s\in \mathbb{K},\) we have \[\tag{5.3} f_n(\mathbf{D}_k(s))=s,\qquad f_n(\mathbf{L}_{k,l}(s))=1, \qquad f_n(\mathbf{P}_{k,l})=-1.\] Moreover, for \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and an elementary matrix \(\mathbf{B}\) of size \(n,\) we have \[\tag{5.4} f_n(\mathbf{B}\mathbf{A})=f_n(\mathbf{B})f_n(\mathbf{A}).\]

Proof. Recall that \(\mathbf{D}_k(s)\) applied to a square matrix \(\mathbf{A}\) multiplies the \(k\)-th row of \(\mathbf{A}\) with \(s\) and leaves \(\mathbf{A}\) unchanged otherwise. We write \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) as \(\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_n)\) for \(\vec{\alpha}_{i} \in \mathbb{K}_n,\) \(1\leqslant i\leqslant n.\) Hence we obtain \[\mathbf{D}_k(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},s\vec{\alpha}_k,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n).\] The linearity of \(f_n\) in the \(k\)-th row thus gives \(f_n(\mathbf{D}_k(s)\mathbf{A})=sf_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies that \(f_n(\mathbf{D}_k(s))=f_n(\mathbf{D}_k(s)\mathbf{1}_{n})=sf_n(\mathbf{1}_{n})=s.\) Therefore, we have \[f_n(\mathbf{D}_k(s)\mathbf{A})=f_n(\mathbf{D}_k(s))f_n(\mathbf{A}).\]

Likewise we obtain \[\mathbf{L}_{k,l}(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_k+s\vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n)\] and we can apply property (iii) of Lemma 5.12 for the choice \(w=s\vec{\alpha}_l\) to conclude that \(f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies \(f_n(\mathbf{L}_{k,l}(s))=f_n(\mathbf{L}_{k,l}(s)\mathbf{1}_{n})=f_n(\mathbf{1}_{n})=1.\)

Therefore, we have \[f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{L}_{k,l}(s))f_n(\mathbf{A}).\]

Finally, we have \[\mathbf{P}_{k,l}\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_{l-1},\vec{\alpha}_k,\vec{\alpha}_{l+1},\ldots,\vec{\alpha}_n)\] so that property (i) of Lemma 5.12 immediately gives that \[f_n(\mathbf{P}_{k,l}\mathbf{A})=-f_n(\mathbf{A}).\] In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies \(f_n(\mathbf{P}_{k,l})=f_n(\mathbf{P}_{k,l}\mathbf{1}_{n})=-f_n(\mathbf{1}_{n})=-1.\)

Therefore, we have \(f_n(\mathbf{P}_{k,l}\mathbf{A})=f_n(\mathbf{P}_{k,l})f_n(\mathbf{A}),\) as claimed.

We now obtain the uniqueness part of Theorem 5.7.
Proposition 5.14

Let \(n\in \mathbb{N}\) and \(f_n,\hat{f}_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) be alternating \(n\)-multilinear maps satisfying \(f_n(\mathbf{1}_{n})=\hat{f}_n(\mathbf{1}_{n})=1.\) Then \(f_n=\hat{f}_n.\)

Proof. We need to show that for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) we have \(f_n(\mathbf{A})=\hat{f}_n(\mathbf{A}).\) Suppose first that \(\mathbf{A}\) is not invertible. Then, by Proposition 4.7, the row vectors of \(\mathbf{A}\) are linearly dependent and hence property (ii) of Lemma 5.12 implies that \(f_n(\mathbf{A})=\hat{f}_n(\mathbf{A})=0.\)

Now suppose that \(\mathbf{A}\) is invertible. Using Gauss–Jordan elimination, we obtain \(N \in \mathbb{N}\) and a sequence of elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N\) so that \(\mathbf{B}_N\cdots \mathbf{B}_1=\mathbf{A}.\) We obtain \[\begin{aligned} f_n(\mathbf{A})&=f_n(\mathbf{B}_N\cdots \mathbf{B}_1)=f_n(\mathbf{B}_N)f_n(\mathbf{B}_{N-1}\cdots \mathbf{B}_1)=\hat{f}_n(\mathbf{B}_N)f_n(\mathbf{B}_{N-1}\cdots \mathbf{B}_1), \end{aligned}\] where the second equality uses (5.4) and the third equality uses that (5.3) implies that \(\hat{f}_n(\mathbf{B})=f_n(\mathbf{B})\) for all elementary matrices \(\mathbf{B}.\) Proceeding in this fashion we get \[\begin{aligned} f_n(\mathbf{A})&=\hat{f}_n(\mathbf{B}_N)\hat{f}_n(\mathbf{B}_{N-1})\cdots\hat{f}_n(\mathbf{B}_1)=\hat{f}_n(\mathbf{B}_N)\hat{f}_n(\mathbf{B}_{N-1})\cdots\hat{f}_n(\mathbf{B}_2\mathbf{B}_1)=\cdots \\ &=\hat{f}_n(\mathbf{B}_N\mathbf{B}_{N-1}\cdots \mathbf{B}_1)=\hat{f}_n(\mathbf{A}). \end{aligned}\]
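The uniqueness proof doubles as a computational recipe: row-reduce \(\mathbf{A}\) and multiply the values assigned to the elementary operations by (5.3). The Python sketch below (an illustration, not part of the notes; the function name is ad hoc) uses only \(\mathbf{P}\)-type swaps (each contributing \(-1\)) and \(\mathbf{L}\)-type row additions (each contributing \(1\)), together with the standard fact that the determinant of an upper-triangular matrix is the product of its diagonal entries:

```python
from fractions import Fraction  # exact arithmetic, to avoid rounding issues

def det_by_elimination(A):
    """Determinant via Gaussian elimination, mirroring the uniqueness proof."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for col in range(n):
        # find a pivot row; an all-zero column means the rows are dependent
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)                 # Lemma 5.12 (ii)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]  # P_{k,l}: factor -1
            sign = -sign
        for r in range(col + 1, n):              # L_{k,l}(s): factor 1
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    prod = Fraction(1)
    for i in range(n):
        prod *= M[i][i]
    return sign * prod

assert det_by_elimination([[1, 2], [3, 4]]) == -2
```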

5.3 Existence of the determinant

It turns out that we can define the determinant recursively in terms of the determinants of certain submatrices. Determinants of submatrices are called minors. To this end we first define:

Definition 5.15

Let \(n \in \mathbb{N}.\) For a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and \(1\leqslant k,l\leqslant n\) we denote by \(\mathbf{A}^{(k,l)}\) the \((n-1)\times(n-1)\) submatrix obtained by removing the \(k\)-th row and \(l\)-th column from \(\mathbf{A}.\)

Example 5.16

\[\mathbf{A}=\begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad \mathbf{A}^{(1,1)}=(d), \qquad \mathbf{A}^{(2,1)}=(b).\] \[\mathbf{A}=\begin{pmatrix} 1 & -2 & 0 & 4 \\ 3 & 1 & 1 & 0 \\ -1 & -5 & -1 & 8 \\ 3 & 8 & 2 & -12 \end{pmatrix}, \qquad \mathbf{A}^{(3,2)}=\begin{pmatrix} 1 & 0 & 4 \\ 3 & 1 & 0 \\ 3 & 2 & -12\end{pmatrix}.\]
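In code, forming \(\mathbf{A}^{(k,l)}\) amounts to dropping one row and one column. The sketch below (illustrative; the name `submatrix` is an ad hoc choice, with \(1\)-based indices matching the notes) reproduces the second part of Example 5.16:

```python
def submatrix(A, k, l):
    """Return A^(k,l): remove row k and column l (1-based, as in the notes)."""
    return [[x for j, x in enumerate(row, start=1) if j != l]
            for i, row in enumerate(A, start=1) if i != k]

A = [[1, -2, 0, 4],
     [3, 1, 1, 0],
     [-1, -5, -1, 8],
     [3, 8, 2, -12]]
assert submatrix(A, 3, 2) == [[1, 0, 4], [3, 1, 0], [3, 2, -12]]
```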

We use induction to prove the existence of the determinant:

Lemma 5.17

Let \(n \in \mathbb{N}\) with \(n\geqslant 2\) and \(f_{n-1} : M_{n-1,n-1}(\mathbb{K}) \to \mathbb{K}\) an alternating \((n-1)\)-multilinear mapping satisfying \(f_{n-1}(\mathbf{1}_{n-1})=1.\) Then, for any fixed integer \(l\) with \(1\leqslant l\leqslant n,\) the mapping \[f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}, \quad \mathbf{A}\mapsto \sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)\] is alternating, \(n\)-multilinear and satisfies \(f_n(\mathbf{1}_{n})=1.\)

Proof of Theorems 5.6 and 5.7. For \(n=1\) we have seen that \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K},\) \((a) \mapsto a\) is \(1\)-multilinear, alternating and satisfies \(f_1(\mathbf{1}_{1})=1.\) Hence, by induction on \(n,\) Lemma 5.17 implies that for all \(n \in \mathbb{N}\) there exists an \(n\)-multilinear and alternating map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) satisfying \(f_n(\mathbf{1}_{n})=1.\) By Proposition 5.14 there is only one such mapping for each \(n \in \mathbb{N},\) and via the identification \(\Omega\) the same statement holds for maps on \((\mathbb{K}_n)^n.\)
Proof of Lemma 5.17. Fix an arbitrary integer \(l\) with \(1\leqslant l\leqslant n.\)

Step 1. We first show that \(f_n(\mathbf{1}_{n})=1.\) Since \([\mathbf{1}_{n}]_{kl}=\delta_{kl},\) we obtain \[f_n(\mathbf{1}_{n})=\sum_{k=1}^n(-1)^{l+k}[\mathbf{1}_{n}]_{kl}f_{n-1}\left(\mathbf{1}_{n}^{(k,l)}\right)=(-1)^{2l}f_{n-1}\left(\mathbf{1}_{n}^{(l,l)}\right)=f_{n-1}\left(\mathbf{1}_{n-1}\right)=1,\] where we use that \(\mathbf{1}_{n}^{(l,l)}=\mathbf{1}_{n-1}\) and \(f_{n-1}(\mathbf{1}_{n-1})=1.\)

Step 2. We show that \(f_n\) is multilinear. Let \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and write \(\mathbf{A}=(A_{kj})_{1\leqslant k,j\leqslant n}.\) We first show that \(f_n\) is \(1\)-homogeneous in each row. Say we multiply the \(i\)-th row of \(\mathbf{A}\) with \(s\) so that we obtain a new matrix \(\hat{\mathbf{A}}=(\hat{A}_{kj})_{1\leqslant k,j\leqslant n}\) with \[\hat{A}_{kj}=\left\{\begin{array}{cc} A_{kj}, & k\neq i,\\ sA_{kj}, & k=i.\end{array}\right.\] We need to show that \(f_n(\hat{\mathbf{A}})=sf_n(\mathbf{A}).\) We compute \[\begin{aligned} f_n(\hat{\mathbf{A}})&=\sum_{k=1}^n(-1)^{l+k}\hat{A}_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)})\\ &=(-1)^{l+i}sA_{il}f_{n-1}(\hat{\mathbf{A}}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)}). \end{aligned}\] Now notice that \(\hat{\mathbf{A}}^{(i,l)}=\mathbf{A}^{(i,l)},\) since \(\mathbf{A}\) and \(\hat{\mathbf{A}}\) only differ in the \(i\)-th row, but this is the row that is removed. Since \(f_{n-1}\) is \(1\)-homogeneous in each row, we obtain that \(f_{n-1}(\hat{\mathbf{A}}^{(k,l)})=sf_{n-1}(\mathbf{A}^{(k,l)})\) whenever \(k \neq i.\) Thus we have \[\begin{aligned} f_n(\hat{\mathbf{A}})&=s(-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+s\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})\\ &=s\sum_{k=1}^n(-1)^{l+k}A_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)=sf_n(\mathbf{A}). \end{aligned}\] We now show that \(f_n\) is additive in each row. 
Say the matrix \(\mathbf{B}=(B_{kj})_{1\leqslant k,j\leqslant n}\) is identical to the matrix \(\mathbf{A},\) except for the \(i\)-th row, so that \[B_{kj}=\left\{\begin{array}{cc} A_{kj} & k\neq i\\ B_{j} & k=i\end{array}\right.\] for some scalars \(B_j\) with \(1\leqslant j\leqslant n.\) We need to show that \(f_n(\mathbf{C})=f_n(\mathbf{A})+f_n(\mathbf{B}),\) where \(\mathbf{C}=(C_{kj})_{1\leqslant k,j\leqslant n}\) with \[C_{kj}=\left\{\begin{array}{cc} A_{kj} & k\neq i\\ A_{ij}+B_{j} & k=i\end{array}\right.\] We compute \[f_n(\mathbf{C})=(-1)^{l+i}(A_{il}+B_l)f_{n-1}(\mathbf{C}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{C}^{(k,l)}).\] As before, since \(\mathbf{A},\mathbf{B},\mathbf{C}\) only differ in the \(i\)-th row, we have \(\mathbf{A}^{(i,l)}=\mathbf{B}^{(i,l)}=\mathbf{C}^{(i,l)}.\) Using that \(f_{n-1}\) is linear in each row, we thus obtain \[\begin{gathered} f_n(\mathbf{C})=(-1)^{l+i}B_lf_{n-1}(\mathbf{B}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{B}^{(k,l)})\\ +(-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})=f_n(\mathbf{A})+f_n(\mathbf{B}). \end{gathered}\]

Step 3. We show that \(f_n\) is alternating. Suppose we have \(1\leqslant i<j\leqslant n\) such that the \(i\)-th and \(j\)-th rows of the matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) are the same. Then, unless \(k=i\) or \(k=j,\) the submatrix \(\mathbf{A}^{(k,l)}\) also contains two identical rows, and since \(f_{n-1}\) is alternating, all summands vanish except the ones for \(k=i\) and \(k=j.\) This gives \[\begin{aligned} f_n(\mathbf{A})&=(-1)^{i+l}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{j+l}A_{jl}f_{n-1}(\mathbf{A}^{(j,l)})\\ &=A_{il}(-1)^l\left((-1)^{i}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{j}f_{n-1}(\mathbf{A}^{(j,l)})\right), \end{aligned}\] where the second equality follows because \(A_{il}=A_{jl}\) (the \(i\)-th and \(j\)-th rows agree). The mapping \(f_{n-1}\) is alternating, hence by property (i) of Lemma 5.12, swapping two rows of \(\mathbf{A}^{(j,l)}\) introduces a minus sign in \(f_{n-1}(\mathbf{A}^{(j,l)}).\) Moving the \(i\)-th row of \(\mathbf{A}^{(j,l)}\) down by \(j-i-1\) positions (which corresponds to \(j-i-1\) swaps of adjacent rows), we obtain \(\mathbf{A}^{(i,l)},\) hence \[f_{n-1}(\mathbf{A}^{(j,l)})=(-1)^{j-i-1}f_{n-1}(\mathbf{A}^{(i,l)}).\] This gives \[f_n(\mathbf{A})=A_{il}(-1)^l\left((-1)^{i}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{2j-i-1}f_{n-1}(\mathbf{A}^{(i,l)})\right)=0.\]
Remark 5.18 • Laplace expansion
As a by-product of the proof of Lemma 5.17 we obtain the formula \[\tag{5.5} \det(\mathbf{A})=\sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}\det\left(\mathbf{A}^{(k,l)}\right),\] known as the Laplace expansion of the determinant. The uniqueness statement of Theorem 5.7 thus guarantees that for every \(n\times n\) matrix \(\mathbf{A},\) the scalar \(\sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}\det\left(\mathbf{A}^{(k,l)}\right)\) is independent of the choice of \(l\) with \(1\leqslant l \leqslant n.\) In practice, when computing the determinant, it is thus advisable to choose \(l\) such that the corresponding column contains the maximal number of zeros.
Example 5.19

For \(n=2\) and choosing \(l=1,\) we obtain \[\det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=a\det\left(\mathbf{A}^{(1,1)}\right)-c\det\left(\mathbf{A}^{(2,1)}\right)=ad-cb,\] in agreement with (5.2). For \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant 3} \in M_{3,3}(\mathbb{K})\) and choosing \(l=3\) we obtain \[\begin{gathered} \det\left(\begin{pmatrix}A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}\right)=A_{13} \det\left(\begin{pmatrix} A_{21} & A_{22} \\ A_{31} & A_{32} \end{pmatrix}\right)\\-A_{23}\det\left(\begin{pmatrix} A_{11} & A_{12} \\ A_{31} & A_{32}\end{pmatrix}\right)+A_{33} \det\left(\begin{pmatrix}A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\right) \end{gathered}\] so that \[\begin{aligned} \det \mathbf{A}&=A_{13}(A_{21}A_{32}-A_{31}A_{22})-A_{23}(A_{11}A_{32}-A_{31}A_{12})\\ &\phantom{=}+A_{33}(A_{11}A_{22}-A_{21}A_{12})\\ &=A_{11}A_{22}A_{33}-A_{11}A_{23}A_{32}-A_{12}A_{21}A_{33}\\ &\phantom{=}+A_{12}A_{23}A_{31}+A_{13}A_{21}A_{32}-A_{13}A_{22}A_{31}. \end{aligned}\]
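The Laplace expansion (5.5) translates directly into a recursive procedure. The Python sketch below (illustrative; `det` here is a plain function with \(1\)-based indices \(k,l\) matching the notes, and recursive calls always expand along the first column) reproduces the \(2\times 2\) formula and checks on a \(3\times 3\) sample that the result is independent of the chosen column \(l\):

```python
def det(A, l=1):
    """Determinant by Laplace expansion along column l (1-based), as in (5.5)."""
    n = len(A)
    if n == 1:
        return A[0][0]                      # the 1x1 case f_1((a)) = a
    total = 0
    for k in range(1, n + 1):
        # A^(k,l): drop row k and column l
        minor = [row[:l - 1] + row[l:] for i, row in enumerate(A, 1) if i != k]
        total += (-1) ** (l + k) * A[k - 1][l - 1] * det(minor)
    return total

# the 2x2 formula det = ad - cb
assert det([[1, 2], [3, 4]]) == -2
# independence of the chosen column on a 3x3 sample matrix
A3 = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
assert det(A3, l=1) == det(A3, l=2) == det(A3, l=3) == -3
```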

Exercises

Exercise 5.20 • Trilinear map

Let \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Show that the map \[f : V^3 \to W, \quad (\vec{x},\vec{y},\vec{z}) \mapsto (\vec{x}\times\vec{y})\cdot \vec{z}\] is alternating and trilinear.

Solution

We first show that \(f\) is trilinear. Let \(s,t\in\mathbb{R}.\) Since \(\vec y\times \vec x = -(\vec x\times\vec y),\) we have \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z)\) for all \(\vec x,\vec y, \vec z\in \mathbb{R}^3,\) so linearity in the second slot follows from linearity in the first; hence we show linearity in the first and third slots only. Let \(\vec x,\vec x_1,\vec x_2,\vec y,\vec z,\vec z_1,\vec z_2\in\mathbb{R}^3.\) \[\begin{aligned} f(s\vec x_1+t\vec x_2,\vec y,\vec z) & = \left((s\vec x_1 + t\vec x_2)\times \vec y\right)\cdot \vec z \\ & = \left(s(\vec x_1\times \vec y)+t(\vec x_2\times \vec y)\right)\cdot \vec z \\ & = s(\vec x_1\times \vec y)\cdot \vec z + t(\vec x_2\times \vec y)\cdot \vec z\\ & = sf(\vec x_1,\vec y,\vec z) + tf(\vec x_2,\vec y,\vec z),\end{aligned}\] where we use distributivity of \(\times\) and \(\cdot\) over \(+.\)

\[\begin{aligned} f(\vec x,\vec y,s\vec z_1+t\vec z_2) & = (\vec x\times \vec y)\cdot(s\vec z_1+t\vec z_2) \\ & = s(\vec x\times \vec y)\cdot \vec z_1 + t(\vec x\times \vec y)\cdot \vec z_2, \end{aligned}\] where we use the linearity of \(\cdot\) in the second slot.

We are left to show that \(f\) is alternating. Note that \(f(\vec x,\vec x, \vec z)=0\) by definition of the cross product. Since \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z),\) it is enough to show that \(f(\vec x,\vec y,\vec y)= 0\) (the remaining case then follows as \(f(\vec x,\vec y,\vec x)=-f(\vec y,\vec x,\vec x)=0\)): \[\begin{aligned} \left(\begin{pmatrix}x_1\\ x_2 \\ x_3\end{pmatrix}\times \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}\right) \cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix} & = \begin{pmatrix}x_2y_3-x_3y_2\\ x_3y_1-x_1y_3\\ x_1y_2-x_2y_1\end{pmatrix}\cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}. \end{aligned}\] Evaluating the dot product yields \[y_1x_2y_3-y_1y_2x_3 +y_1y_2x_3 - x_1y_2y_3 + x_1y_2y_3 - y_1x_2y_3,\] and this expression equals zero since the terms cancel in pairs.
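The alternating and trilinear behaviour of the triple product can again be spot-checked numerically; the following ad hoc Python sketch (arbitrary sample vectors, not a proof) mirrors the solution above:

```python
# Spot-check of Exercise 5.20: f(x, y, z) = (x x y) . z (illustrative).
def cross(x, y):
    return (x[1] * y[2] - x[2] * y[1],
            x[2] * y[0] - x[0] * y[2],
            x[0] * y[1] - x[1] * y[0])

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def f(x, y, z):
    return dot(cross(x, y), z)

x, y, z = (1, 2, 3), (4, 5, 6), (7, 8, 10)
# alternating: any two equal arguments give 0
assert f(x, x, z) == 0 and f(x, y, y) == 0 and f(x, y, x) == 0
# swapping the first two arguments flips the sign
assert f(y, x, z) == -f(x, y, z)
# additivity in the third slot
w = (1, 1, 1)
zw = tuple(a + b for a, b in zip(z, w))
assert f(x, y, zw) == f(x, y, z) + f(x, y, w)
```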
