9 The determinant, I
9.1 Axiomatic characterisation
Surprisingly, whether or not a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) admits an inverse is captured by a single scalar, called the determinant of \(\mathbf{A}\) and denoted by \(\det \mathbf{A}\) or \(\det(\mathbf{A}).\) That is, the matrix \(\mathbf{A}\) admits an inverse if and only if \(\det \mathbf{A}\) is nonzero. In practice, however, it is often quicker to use Gauss–Jordan elimination to decide whether the matrix admits an inverse. The determinant is nevertheless a useful tool in linear algebra.
9.1.1 Multilinear maps
The determinant is an object of multilinear algebra, which – for \(\ell \in \mathbb{N}\) – considers mappings from the \(\ell\)-fold Cartesian product of a \(\mathbb{K}\)-vector space into another \(\mathbb{K}\)-vector space. Such a mapping \(f\) is required to be linear in each variable. This simply means that if we freeze all variables of \(f,\) except for the \(k\)-th variable, \(1\leqslant k\leqslant \ell,\) then the resulting mapping \(g_{k}\) of one variable is required to be linear. More precisely:
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) A mapping \(f : V^\ell \to W\) is called \(\ell\)-multilinear (or simply multilinear) if the mapping \(g_{k} : V \to W,\) \(v \mapsto f(v_1,\ldots,v_{k-1},v,v_{k+1},\ldots,v_{\ell})\) is linear for all \(1 \leqslant k \leqslant \ell\) and for all \(\ell\)-tuples \((v_1,\ldots,v_{\ell}) \in V^\ell.\)
We only need an \((\ell-1)\)-tuple of vectors to define the map \(g_{k},\) but the above definition is more convenient to write down.
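To see the definition in action, here is a minimal numerical sketch in Python (the helper name `linear_in_slot` and the use of numpy are our own illustrative choices): it spot-checks linearity of a map in a single slot on random vectors, using the familiar dot product on \(\mathbb{R}^3\) as the test case. Passing such a check is evidence, not a proof.

```python
import numpy as np

def linear_in_slot(f, args, k, trials=5, tol=1e-9):
    """Spot-check that f is linear in slot k (0-indexed) when the
    remaining arguments are frozen at the values in `args`."""
    rng = np.random.default_rng(0)
    n = len(args[k])
    for _ in range(trials):
        u, v = rng.normal(size=n), rng.normal(size=n)
        s, t = rng.normal(), rng.normal()
        a, b, c = list(args), list(args), list(args)
        a[k], b[k], c[k] = s * u + t * v, u, v
        if abs(f(*a) - (s * f(*b) + t * f(*c))) > tol:
            return False
    return True

# The scalar product on R^3 is linear in each of its two slots:
dot = lambda x, y: float(np.dot(x, y))
args = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])]
print(all(linear_in_slot(dot, args, k) for k in (0, 1)))  # True
```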
Two types of multilinear maps are of particular interest:
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(f : V^\ell \to W\) an \(\ell\)-multilinear map.
The map \(f\) is called symmetric if exchanging two arguments does not change the value of \(f.\) That is, we have \[f(v_1,\ldots,v_{\ell})=f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1,\ldots,v_{\ell}) \in V^\ell\) and all \(1\leqslant i<j\leqslant \ell.\)
The map \(f\) is called alternating if \(f(v_1,\ldots,v_{\ell})=0_W\) whenever at least two arguments agree, that is, there exist \(i\neq j\) with \(v_i=v_j.\) Alternating \(\ell\)-multilinear maps are also called \(W\)-valued \(\ell\)-forms or simply \(\ell\)-forms when \(W=\mathbb{K}.\)
\(1\)-multilinear maps are simply linear maps. \(2\)-multilinear maps are called bilinear and \(3\)-multilinear maps are called trilinear. Most likely, you are already familiar with two examples of bilinear maps:
The first one is the scalar product of two vectors in \(\mathbb{R}^3\) (or more generally \(\mathbb{R}^n\)). So \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Recall that the scalar product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}, \quad (\vec{x},\vec{y})\mapsto \vec{x}\cdot \vec{y}=x_1y_1+x_2y_2+x_3y_3,\] where we write \(\vec{x}=(x_i)_{1\leqslant i\leqslant 3}\) and \(\vec{y}=(y_i)_{1\leqslant i\leqslant 3}.\) Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\cdot \vec{y}=s_1(\vec{x}_1\cdot\vec{y})+s_2(\vec{x}_2\cdot\vec{y}),\] so that the scalar product is linear in the first variable. Furthermore, the scalar product is symmetric, \(\vec{x}\cdot \vec{y}=\vec{y}\cdot \vec{x}.\) It follows that the scalar product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear.
The second one is the cross product of two vectors in \(\mathbb{R}^3.\) Here \(V=\mathbb{R}^3\) and \(W=\mathbb{R}^3.\) Recall that the cross product is the mapping \[V^2=\mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R}^3, \quad (\vec{x},\vec{y})\mapsto \vec{x}\times \vec{y}=\begin{pmatrix} x_2y_3-x_3y_2 \\ x_3y_1-x_1y_3 \\ x_1y_2-x_2y_1 \end{pmatrix}.\] Notice that for all \(s_1,s_2 \in \mathbb{R}\) and all \(\vec{x}_1,\vec{x}_2,\vec{y}\in \mathbb{R}^3\) we have \[(s_1\vec{x}_1+s_2\vec{x}_2)\times \vec{y}=s_1(\vec{x}_1\times\vec{y})+s_2(\vec{x}_2\times\vec{y}),\] so that the cross product is linear in the first variable. Likewise, we can check that the cross product is also linear in the second variable, hence it is bilinear or \(2\)-multilinear. Observe that the cross product is alternating.
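As a numerical sanity check (a sketch, not a proof), one can test both properties for numpy's built-in cross product on random vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
x1, x2, y = rng.normal(size=(3, 3))
s1, s2 = 2.0, -3.0

# Linearity in the first slot:
lhs = np.cross(s1 * x1 + s2 * x2, y)
rhs = s1 * np.cross(x1, y) + s2 * np.cross(x2, y)
print(np.allclose(lhs, rhs))                 # True

# Alternating: x times x = 0.
print(np.allclose(np.cross(x1, x1), 0.0))    # True
```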
Let \(V=\mathbb{K}\) and consider \(f : V^\ell \to \mathbb{K},\) \((x_1,\ldots,x_\ell)\mapsto x_1x_2\cdots x_\ell.\) Then \(f\) is \(\ell\)-multilinear and symmetric.
Let \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) be a symmetric matrix, \(\mathbf{A}^T=\mathbf{A}.\) Notice that we obtain a symmetric bilinear map \[f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}, \quad (\vec{x},\vec{y}) \mapsto \vec{x}^T\mathbf{A}\vec{y},\] where on the right-hand side all products are defined by matrix multiplication.
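A short sketch of this example in code, with an arbitrarily chosen \(2\times 2\) symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                # symmetric: A.T == A
f = lambda x, y: float(x @ A @ y)         # the form (x, y) -> x^T A y

rng = np.random.default_rng(2)
x, y = rng.normal(size=(2, 2))
print(np.isclose(f(x, y), f(y, x)))       # True: (x^T A y)^T = y^T A^T x = y^T A x
```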
Example 9.5 gives us a wealth of symmetric bilinear maps on \(\mathbb{R}^n.\) As we will see shortly, the situation is quite different if we consider alternating \(n\)-multilinear maps on \(\mathbb{K}_n\) (notice that the number \(n\) of arguments equals the dimension of \(\mathbb{K}_n\)).
An \(\ell\)-multilinear map \(f : V^\ell \to W\) is said to be antisymmetric if interchanging any two of its inputs results in a minus sign, i.e. \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1, \ldots, v_{\ell}) \in V^\ell\) and \(1 \leqslant i < j \leqslant \ell.\)
One can check that any alternating map is antisymmetric. Let’s show this for \(\ell = 2.\) Assuming \(f\) is alternating, for any \(v_1, v_2 \in V\) we have \[\begin{aligned} 0 &= f(v_1 + v_2, v_1 + v_2) \\ &= f(v_1, v_1) + f(v_1, v_2) + f(v_2, v_1) + f(v_2, v_2)\\ &= 0 + f(v_1, v_2) + f(v_2, v_1) + 0, \end{aligned}\] so \(f(v_2, v_1) = -f(v_1, v_2).\)
On the other hand, if \(f\) is antisymmetric, then we have \(f(v, v) = -f(v, v)\) for all \(v\) (since we can swap \(v\) with itself); so \(2 f(v, v) = 0.\) But this does not imply that \(f\) is alternating, since this ‘2’ means \(1_\mathbb{K}+ 1_\mathbb{K},\) and there exist fields such that \(1_\mathbb{K}+ 1_\mathbb{K}= 0_\mathbb{K}\)!
Of course, if \(\mathbb{K}\) is one of the familiar fields like \(\mathbb{R}\) or \(\mathbb{C},\) where \(2 \ne 0,\) then “alternating” and “antisymmetric” are the same. But in general being alternating is a more restrictive condition.
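A concrete instance, as a sketch over the two-element field \(\mathbb{F}_2=\{0,1\}\) with arithmetic mod \(2\): the bilinear map \(f(x,y)=xy\) on \(\mathbb{K}^1\) is antisymmetric, since \(-1=1\) in \(\mathbb{F}_2,\) yet \(f(1,1)=1\neq 0,\) so it is not alternating.

```python
# Arithmetic in the two-element field F_2 = {0, 1} (mod 2).
f = lambda x, y: (x * y) % 2      # a bilinear map K x K -> K for K = F_2
neg = lambda a: (-a) % 2          # negation mod 2; note neg(a) == a here

# f is antisymmetric: f(x, y) == -f(y, x) for all x, y in F_2 ...
print(all(f(x, y) == neg(f(y, x)) for x in (0, 1) for y in (0, 1)))  # True
# ... but f is not alternating, since f(1, 1) != 0:
print(f(1, 1))                    # 1
```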
9.1.2 Existence and uniqueness
Let \(n \in \mathbb{N},\) and let \(\{\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n\}\) denote the standard basis of \(\mathbb{K}_n.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : (\mathbb{K}_n)^n \to \mathbb{K}\) satisfying \(f_n(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=1.\)
It’s helpful to rephrase this statement in terms of matrices. Let us write \[\Omega : (\mathbb{K}_n)^n \to M_{n,n}(\mathbb{K})\] for the map sending \(n\) row vectors of length \(n\) to the \(n \times n\) matrix with those vectors as its rows. This map is clearly a bijection, so it makes sense to define a mapping \(f : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) to be “multilinear” if \(f \circ \Omega\) is multilinear, i.e. if \(f\) is linear in each row of the matrix. Similarly, we define \(f\) to be “alternating” if \(f \circ \Omega\) is, so \(f(\mathbf{A}) = 0\) whenever two of the rows of \(\mathbf{A}\) are equal.
Since \(\Omega(\vec{\varepsilon}_1,\ldots,\vec{\varepsilon}_n)=\mathbf{1}_{n},\) we may phrase the above theorem equivalently as:
Let \(n \in \mathbb{N}.\) Then there exists a unique alternating \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K})\to \mathbb{K}\) satisfying \(f_n(\mathbf{1}_{n})=1.\)
The mapping \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) provided by Theorem 9.8 is called the determinant and denoted by \(\det.\) For \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) we say \(\det(\mathbf{A})\) is the determinant of the matrix \(\mathbf{A}.\)
It would be more precise to write \(\det_n\) since the determinant is a family of mappings, one mapping \(\det_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) for each \(n \in \mathbb{N}.\) It is however common to simply write \(\det.\)
For \(n=1\) the condition that a \(1\)-multilinear (i.e. linear) map \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K}\) is alternating is vacuous. So Theorem 9.8 states that there is a unique linear map \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K}\) that satisfies \(f_1((1))=1.\) Of course, this is just the map defined by the rule \(f_1((a))=a,\) where \((a) \in M_{1,1}(\mathbb{K})\) is any \(1\)-by-\(1\) matrix.
For \(n=2\) and \(a,b,c,d \in \mathbb{K}\) we consider the mapping \(f_2 : M_{2,2}(\mathbb{K}) \to \mathbb{K}\) defined by the rule \[\tag{9.1} f_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=ad-bc.\] We claim that \(f_2\) is bilinear in the rows and alternating. The condition that \(f_2\) is alternating simplifies to \(f_2(\mathbf{A})=0\) whenever the two rows of \(\mathbf{A}\in M_{2,2}(\mathbb{K})\) agree. Clearly, \(f_2\) is alternating, since \[f_2\left(\begin{pmatrix} a & b \\ a & b \end{pmatrix}\right)=ab-ab=0.\] Furthermore, \(f_2\) needs to be linear in each row. The additivity condition applied to the first row gives that we must have \[f_2\left(\begin{pmatrix} a_1+a_2 & b_1+b_2 \\ c & d \end{pmatrix}\right)=f_2\left(\begin{pmatrix} a_1 & b_1 \\ c & d\end{pmatrix}\right)+f_2\left(\begin{pmatrix} a_2 & b_2 \\ c & d\end{pmatrix}\right)\] for all \(a_1,a_2,b_1,b_2,c,d \in \mathbb{K}.\) Using the definition (9.1), we obtain \[\begin{aligned} f_2\left(\begin{pmatrix} a_1+a_2 & b_1+b_2 \\ c & d \end{pmatrix}\right)&=(a_1+a_2)d-c(b_1+b_2)\\ &=a_1d-cb_1+a_2d-cb_2\\ &=f_2\left(\begin{pmatrix} a_1 & b_1 \\ c & d\end{pmatrix}\right)+f_2\left(\begin{pmatrix} a_2 & b_2 \\ c & d\end{pmatrix}\right), \end{aligned}\] so that \(f_2\) is indeed additive in the first row. The \(1\)-homogeneity condition applied to the first row gives that we must have \[f_2\left(\begin{pmatrix}s a & s b \\ c & d\end{pmatrix}\right)=s f_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)\] for all \(a,b,c,d \in \mathbb{K}\) and \(s \in \mathbb{K}.\) Indeed, using the definition (9.1), we obtain \[f_2\left(\begin{pmatrix}s a & s b \\ c & d\end{pmatrix}\right)=s ad-cs b=s(ad-cb)=s f_2\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right),\] so that \(f_2\) is also \(1\)-homogeneous in the first row. We conclude that \(f_2\) is linear in the first row. Likewise, the reader is invited to check that \(f_2\) is also linear in the second row. Furthermore, we can easily compute that \(f_2(\mathbf{1}_{2})=1.\) The mapping \(f_2\) thus satisfies all the properties of Theorem 9.8, hence by the uniqueness statement we must have \(f_2=\det\) and we obtain the formula \[\tag{9.2} \det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=ad-cb\] for all \(a,b,c,d \in\mathbb{K}.\)
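The three defining properties can also be spot-checked numerically for \(f_2\); a minimal sketch:

```python
import numpy as np

f2 = lambda M: M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]   # the rule (9.1)

print(f2(np.eye(2)))                            # 1.0, i.e. f2(1_2) = 1
print(f2(np.array([[3.0, 7.0],
                   [3.0, 7.0]])))               # 0.0: equal rows give 0

# Additivity in the first row:
A = np.array([[1.0, 2.0], [5.0, 6.0]])
B = np.array([[3.0, 4.0], [5.0, 6.0]])
C = np.array([[4.0, 6.0], [5.0, 6.0]])          # first rows of A and B added
print(np.isclose(f2(C), f2(A) + f2(B)))         # True
```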
9.2 Uniqueness of the determinant
So far we have only shown that the determinant exists for \(n=1\) and \(n=2.\) However, we need to show the existence and uniqueness part of Theorem 9.8 in general. We first show the uniqueness part. We start by deducing some consequences from the alternating property:
Let \(V,W\) be \(\mathbb{K}\)-vector spaces and \(\ell \in \mathbb{N}.\) An alternating \(\ell\)-multilinear map \(f : V^\ell \to W\) satisfies:
(i) interchanging two arguments of \(f\) leads to a minus sign. That is, for \(1\leqslant i,j\leqslant \ell\) and \(i\neq j\) we obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] for all \((v_1,\ldots,v_{\ell}) \in V^\ell\);
(ii) if the vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) are linearly dependent, then \(f(v_1,\ldots,v_{\ell})=0_W\);
(iii) for all \(1\leqslant i\leqslant \ell,\) for all \(\ell\)-tuples of vectors \((v_1,\ldots,v_{\ell}) \in V^\ell\) and scalars \(s_1,\ldots,s_\ell \in \mathbb{K},\) we have \[f(v_1,\ldots,v_{i-1},v_i+w,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell})\] where \(w=\sum_{j=1,j\neq i}^\ell s_j v_j.\) That is, adding a linear combination of vectors to some argument of \(f\) does not change the output, provided the linear combination consists of the remaining arguments.
Proof. (i) Since \(f\) is alternating, we have for all \((v_1,\ldots,v_{\ell}) \in V^\ell\) \[f(v_1,\ldots,v_{i-1},v_i+v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})=0_W.\] Using the linearity in the \(i\)-th argument, this gives \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i+v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] Using the linearity in the \(j\)-th argument, we obtain \[\begin{aligned} 0_W&=f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_i,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\\ &\phantom{=}+f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_j,v_{j+1},\ldots,v_{\ell}). \end{aligned}\] The first summand has a double occurrence of \(v_i\) and hence vanishes by the alternating property. Likewise, the fourth summand has a double occurrence of \(v_j\) and hence vanishes as well. Since the second summand equals \(f(v_1,\ldots,v_{\ell}),\) we thus obtain \[f(v_1,\ldots,v_{\ell})=-f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{j-1},v_i,v_{j+1},\ldots,v_{\ell})\] as claimed.
(ii) Suppose the vectors \(v_1,\ldots,v_{\ell}\) are linearly dependent, so that there are scalars \(s_1,\ldots,s_\ell \in \mathbb{K},\) not all zero, with \(s_1v_1+\cdots+s_\ell v_{\ell}=0_V.\) Pick an index \(1\leqslant i\leqslant \ell\) with \(s_i\neq 0.\) Then \[v_i=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j\] and hence by the linearity in the \(i\)-th argument, we obtain \[\begin{gathered} f(v_1,\ldots,v_{\ell})=f\left(v_1,\ldots,v_{i-1},-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)v_j,v_{i+1},\ldots,v_{\ell}\right)\\=-\sum_{j=1,j\neq i}^\ell\left(\frac{s_j}{s_i}\right)f\left(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell}\right)=0_W, \end{gathered}\] where we use that for each \(1\leqslant j\leqslant \ell\) with \(j\neq i,\) the expression \[f(v_1,\ldots,v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})\] has a double occurrence of \(v_j\) and thus vanishes by the alternating property.
(iii) Let \((v_1,\ldots,v_{\ell})\in V^\ell\) and \((s_1, \ldots, s_\ell) \in \mathbb{K}^\ell.\) Then, using the linearity in the \(i\)-th argument, we compute \[\begin{gathered} f(v_1,\ldots, v_{i-1},v_i+\sum_{j=1,j\neq i}^\ell s_j v_j,v_{i+1}, \ldots,v_{\ell})\\ =f(v_1,\ldots, v_{\ell})+\sum_{j=1,j\neq i}^\ell s_jf(v_1,\ldots, v_{i-1},v_j,v_{i+1},\ldots,v_{\ell})=f(v_1,\ldots,v_{\ell}), \end{gathered}\] where the last equality follows exactly as in the proof of (ii).
The alternating property of an \(n\)-multilinear map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) together with the condition \(f_n(\mathbf{1}_{n})=1\) uniquely determines the value of \(f_n\) on the elementary matrices:
Let \(n \in \mathbb{N}\) and \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) an alternating \(n\)-multilinear map satisfying \(f_n(\mathbf{1}_{n})=1.\) Then for all \(1\leqslant k,l\leqslant n\) with \(k\neq l\) and all \(s \in \mathbb{K},\) we have \[\tag{9.3} f_n(\mathbf{D}_k(s))=s,\qquad f_n(\mathbf{L}_{k,l}(s))=1, \qquad f_n(\mathbf{P}_{k,l})=-1.\] Moreover, for \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and an elementary matrix \(\mathbf{B}\) of size \(n,\) we have \[\tag{9.4} f_n(\mathbf{B}\mathbf{A})=f_n(\mathbf{B})f_n(\mathbf{A}).\]
Proof. Recall that multiplying a square matrix \(\mathbf{A}\) by \(\mathbf{D}_k(s)\) from the left multiplies the \(k\)-th row of \(\mathbf{A}\) by \(s\) and leaves the remaining rows unchanged. We write \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) as \(\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_n)\) for \(\vec{\alpha}_{i} \in \mathbb{K}_n,\) \(1\leqslant i\leqslant n.\) Hence we obtain \[\mathbf{D}_k(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},s \vec{\alpha}_k,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n).\] The linearity of \(f_n\) in the \(k\)-th row thus gives \(f_n(\mathbf{D}_k(s)\mathbf{A})=s f_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies that \(f_n(\mathbf{D}_k(s))=f_n(\mathbf{D}_k(s)\mathbf{1}_{n})=s f_n(\mathbf{1}_{n})=s.\) Therefore, we have \[f_n(\mathbf{D}_k(s)\mathbf{A})=f_n(\mathbf{D}_k(s))f_n(\mathbf{A}).\]
Likewise we obtain \[\mathbf{L}_{k,l}(s)\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_k+s \vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_n)\] and we can apply property (iii) of Lemma 9.13 for the choice \(w=s\vec{\alpha}_l\) to conclude that \(f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{A}).\) In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies \(f_n(\mathbf{L}_{k,l}(s))=f_n(\mathbf{L}_{k,l}(s)\mathbf{1}_{n})=f_n(\mathbf{1}_{n})=1.\)
Therefore, we have \[f_n(\mathbf{L}_{k,l}(s)\mathbf{A})=f_n(\mathbf{L}_{k,l}(s))f_n(\mathbf{A}).\]
Finally, we have \[\mathbf{P}_{k,l}\mathbf{A}=\Omega(\vec{\alpha}_1,\ldots,\vec{\alpha}_{k-1},\vec{\alpha}_l,\vec{\alpha}_{k+1},\ldots,\vec{\alpha}_{l-1},\vec{\alpha}_k,\vec{\alpha}_{l+1},\ldots,\vec{\alpha}_n)\] so that property (i) of Lemma 9.13 immediately gives that \[f_n(\mathbf{P}_{k,l}\mathbf{A})=-f_n(\mathbf{A}).\] In particular, the choice \(\mathbf{A}=\mathbf{1}_{n}\) together with \(f_n(\mathbf{1}_{n})=1\) implies \(f_n(\mathbf{P}_{k,l})=f_n(\mathbf{P}_{k,l}\mathbf{1}_{n})=-f_n(\mathbf{1}_{n})=-1.\)
Therefore, we have \(f_n(\mathbf{P}_{k,l}\mathbf{A})=f_n(\mathbf{P}_{k,l})f_n(\mathbf{A}),\) as claimed.
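The values (9.3) can be confirmed numerically with numpy's determinant (a sketch for \(n=3,\) \(k=1,\) \(l=2,\) \(s=5\); rows are \(0\)-indexed in the code, and floating-point results are only approximate):

```python
import numpy as np

n, k, l, s = 3, 0, 1, 5.0                 # k=0, l=1 correspond to k=1, l=2 in the text

D = np.eye(n); D[k, k] = s                # D_k(s): scale row k by s
L = np.eye(n); L[k, l] = s                # L_{k,l}(s): add s times row l to row k
P = np.eye(n); P[[k, l]] = P[[l, k]]      # P_{k,l}: swap rows k and l

print(np.linalg.det(D))                   # 5.0  (approximately)
print(np.linalg.det(L))                   # 1.0
print(np.linalg.det(P))                   # -1.0
```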
We now obtain the uniqueness part of Theorem 9.8.
Let \(n\in \mathbb{N}\) and \(f_n,\hat{f}_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) be alternating \(n\)-multilinear maps satisfying \(f_n(\mathbf{1}_{n})=\hat{f}_n(\mathbf{1}_{n})=1.\) Then \(f_n=\hat{f}_n.\)
Proof. We need to show that for all \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) we have \(f_n(\mathbf{A})=\hat{f}_n(\mathbf{A}).\) Suppose first that \(\mathbf{A}\) is not invertible. Then, by Proposition 3.18, the row vectors of \(\mathbf{A}\) are linearly dependent and hence property (ii) of Lemma 9.13 implies that \(f_n(\mathbf{A})=\hat{f}_n(\mathbf{A})=0.\)
Now suppose that \(\mathbf{A}\) is invertible. Using Gauss–Jordan elimination, we obtain \(N \in \mathbb{N}\) and a sequence of elementary matrices \(\mathbf{B}_1,\ldots,\mathbf{B}_N\) so that \(\mathbf{B}_1 \mathbf{B}_2 \cdots \mathbf{B}_N=\mathbf{A}.\)
Applying (9.4) repeatedly, we have \[f_n(\mathbf{A}) = f_n(\mathbf{B}_1\cdots \mathbf{B}_N) = f_n(\mathbf{B}_1) \cdots f_n(\mathbf{B}_N)\] and similarly \[\hat{f}_n(\mathbf{A}) = \hat{f}_n(\mathbf{B}_1\cdots \mathbf{B}_N) = \hat{f}_n(\mathbf{B}_1) \cdots \hat{f}_n(\mathbf{B}_N).\] But (9.3) implies that \(\hat{f}_n(\mathbf{B}_j)=f_n(\mathbf{B}_j)\) for all \(j,\) so these two products are equal.
9.3 Existence of the determinant
It turns out that we can define the determinant recursively in terms of the determinants of certain submatrices. Determinants of submatrices are called minors. To this end we first define:
Let \(n \in \mathbb{N}.\) For a square matrix \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and \(1\leqslant k,l\leqslant n\) we denote by \(\mathbf{A}^{(k,l)}\) the \((n-1)\times(n-1)\) submatrix obtained by removing the \(k\)-th row and \(l\)-th column from \(\mathbf{A}.\)
\[\text{If } \mathbf{A}=\begin{pmatrix} a & b \\ c & d \end{pmatrix}, \text{ then} \qquad \mathbf{A}^{(1,1)}=(d), \qquad \mathbf{A}^{(2,1)}=(b).\] \[\text{If } \mathbf{A}=\begin{pmatrix} 1 & -2 & 0 & 4 \\ 3 & 1 & 1 & 0 \\ -1 & -5 & -1 & 8 \\ 3 & 8 & 2 & -12 \end{pmatrix}, \text{ then} \qquad \mathbf{A}^{(3,2)}=\begin{pmatrix} 1 & 0 & 4 \\ 3 & 1 & 0 \\ 3 & 2 & -12\end{pmatrix}.\]
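In code, the submatrix \(\mathbf{A}^{(k,l)}\) is conveniently expressed with numpy's np.delete; a sketch (the helper name minor_matrix is our own):

```python
import numpy as np

def minor_matrix(A, k, l):
    """Return A^{(k,l)}: A with its k-th row and l-th column removed
    (k and l are 1-based, matching the text)."""
    return np.delete(np.delete(A, k - 1, axis=0), l - 1, axis=1)

A = np.array([[ 1, -2,  0,   4],
              [ 3,  1,  1,   0],
              [-1, -5, -1,   8],
              [ 3,  8,  2, -12]])
print(minor_matrix(A, 3, 2))
# [[  1   0   4]
#  [  3   1   0]
#  [  3   2 -12]]
```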
We use induction to prove the existence of the determinant:
Let \(n \in \mathbb{N}\) with \(n\geqslant 2\) and \(f_{n-1} : M_{n-1,n-1}(\mathbb{K}) \to \mathbb{K}\) an alternating \((n-1)\)-multilinear mapping satisfying \(f_{n-1}(\mathbf{1}_{n-1})=1.\) Then, for any fixed integer \(l\) with \(1\leqslant l\leqslant n,\) the mapping \[f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}, \quad \mathbf{A}\mapsto \sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)\] is alternating, \(n\)-multilinear and satisfies \(f_n(\mathbf{1}_{n})=1.\)
Proof of Theorem 9.8. For \(n=1\) we have seen that \(f_1 : M_{1,1}(\mathbb{K}) \to \mathbb{K},\) \((a) \mapsto a\) is \(1\)-multilinear, alternating and satisfies \(f_1(\mathbf{1}_{1})=1.\) Hence Lemma 9.18 implies that for all \(n \in \mathbb{N}\) there exists an \(n\)-multilinear and alternating map \(f_n : M_{n,n}(\mathbb{K}) \to \mathbb{K}\) satisfying \(f_n(\mathbf{1}_{n})=1.\) By Proposition 9.15 there is only one such mapping for each \(n \in \mathbb{N}.\)
Proof of Lemma 9.18. Fix an arbitrary integer \(l\) with \(1\leqslant l\leqslant n.\)
Step 1. We first show that \(f_n(\mathbf{1}_{n})=1.\) Since \([\mathbf{1}_{n}]_{kl}=\delta_{kl},\) we obtain \[f_n(\mathbf{1}_{n})=\sum_{k=1}^n(-1)^{l+k}[\mathbf{1}_{n}]_{kl}f_{n-1}\left(\mathbf{1}_{n}^{(k,l)}\right)=(-1)^{2l}f_{n-1}\left(\mathbf{1}_{n}^{(l,l)}\right)=f_{n-1}\left(\mathbf{1}_{n-1}\right)=1,\] where we use that \(\mathbf{1}_{n}^{(l,l)}=\mathbf{1}_{n-1}\) and \(f_{n-1}(\mathbf{1}_{n-1})=1.\)
Step 2. We show that \(f_n\) is multilinear. Let \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) and write \(\mathbf{A}=(A_{kj})_{1\leqslant k,j\leqslant n}.\) We first show that \(f_n\) is \(1\)-homogeneous in each row. Say we multiply the \(i\)-th row of \(\mathbf{A}\) with \(s\) so that we obtain a new matrix \(\hat{\mathbf{A}}=(\hat{A}_{kj})_{1\leqslant k,j\leqslant n}\) with \[\hat{A}_{kj}=\left\{\begin{array}{cc} A_{kj}, & k\neq i,\\ s A_{kj}, & k=i.\end{array}\right.\] We need to show that \(f_n(\hat{\mathbf{A}})=s f_n(\mathbf{A}).\) We compute \[\begin{aligned} f_n(\hat{\mathbf{A}})&=\sum_{k=1}^n(-1)^{l+k}\hat{A}_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)})\\ &=(-1)^{l+i}s A_{il}f_{n-1}(\hat{\mathbf{A}}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\hat{\mathbf{A}}^{(k,l)}). \end{aligned}\] Now notice that \(\hat{\mathbf{A}}^{(i,l)}=\mathbf{A}^{(i,l)},\) since \(\mathbf{A}\) and \(\hat{\mathbf{A}}\) only differ in the \(i\)-th row, but this is the row that is removed. Since \(f_{n-1}\) is \(1\)-homogeneous in each row, we obtain that \(f_{n-1}(\hat{\mathbf{A}}^{(k,l)})=s f_{n-1}(\mathbf{A}^{(k,l)})\) whenever \(k \neq i.\) Thus we have \[\begin{aligned} f_n(\hat{\mathbf{A}})&=s (-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+s\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})\\ &=s \sum_{k=1}^n(-1)^{l+k}A_{kl}f_{n-1}\left(\mathbf{A}^{(k,l)}\right)=s f_n(\mathbf{A}). \end{aligned}\] We now show that \(f_n\) is additive in each row. Say the matrix \(\mathbf{B}=(B_{kj})_{1\leqslant k,j\leqslant n}\) is identical to the matrix \(\mathbf{A},\) except for the \(i\)-th row, so that \[B_{kj}=\left\{\begin{array}{cc} A_{kj} & k\neq i\\ B_{j} & k=i\end{array}\right.\] for some scalars \(B_j\) with \(1\leqslant j\leqslant n.\) We need to show that \(f_n(\mathbf{C})=f_n(\mathbf{A})+f_n(\mathbf{B}),\) where \(\mathbf{C}=(C_{kj})_{1\leqslant k,j\leqslant n}\) with \[C_{kj}=\left\{\begin{array}{cc} A_{kj} & k\neq i\\ A_{ij}+B_{j} & k=i\end{array}\right.\] We compute \[f_n(\mathbf{C})=(-1)^{l+i}(A_{il}+B_l)f_{n-1}(\mathbf{C}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{C}^{(k,l)}).\] As before, since \(\mathbf{A},\mathbf{B},\mathbf{C}\) only differ in the \(i\)-th row, we have \(\mathbf{A}^{(i,l)}=\mathbf{B}^{(i,l)}=\mathbf{C}^{(i,l)}.\) Using that \(f_{n-1}\) is linear in each row, we thus obtain \[\begin{gathered} f_n(\mathbf{C})=(-1)^{l+i}B_lf_{n-1}(\mathbf{B}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{B}^{(k,l)})\\ +(-1)^{l+i}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+\sum_{k=1,k\neq i}^n(-1)^{l+k}A_{kl}f_{n-1}(\mathbf{A}^{(k,l)})=f_n(\mathbf{A})+f_n(\mathbf{B}). \end{gathered}\]
Step 3. We show that \(f_n\) is alternating. Suppose that the \(i\)-th and \(j\)-th row of the matrix \(\mathbf{A}=(A_{kj})_{1\leqslant k,j\leqslant n}\) agree for some \(1\leqslant i<j\leqslant n.\) Then, unless \(k=i\) or \(k=j,\) the submatrix \(\mathbf{A}^{(k,l)}\) also contains two identical rows and, since \(f_{n-1}\) is alternating, all summands vanish except those for \(k=i\) and \(k=j.\) This gives \[\begin{aligned} f_n(\mathbf{A})&=(-1)^{i+l}A_{il}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{j+l}A_{jl}f_{n-1}(\mathbf{A}^{(j,l)})\\ &=A_{il}(-1)^l\left((-1)^{i}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{j}f_{n-1}(\mathbf{A}^{(j,l)})\right), \end{aligned}\] where the second equality follows because \(A_{il}=A_{jl}\) (the \(i\)-th and \(j\)-th row agree). The mapping \(f_{n-1}\) is alternating, hence by property (i) of Lemma 9.13, swapping two rows in the matrix \(\mathbf{A}^{(j,l)}\) leads to a minus sign in \(f_{n-1}(\mathbf{A}^{(j,l)}).\) Moving the \(i\)-th row of \(\mathbf{A}^{(j,l)}\) down by \(j-i-1\) rows (which corresponds to swapping \(j-i-1\) times), we obtain \(\mathbf{A}^{(i,l)},\) hence \[f_{n-1}(\mathbf{A}^{(j,l)})=(-1)^{j-i-1}f_{n-1}(\mathbf{A}^{(i,l)}).\] This gives \[f_n(\mathbf{A})=A_{il}(-1)^l\left((-1)^{i}f_{n-1}(\mathbf{A}^{(i,l)})+(-1)^{2j-i-1}f_{n-1}(\mathbf{A}^{(i,l)})\right)=0,\] since \((-1)^{2j-i-1}=-(-1)^{i}.\)
As a by-product of the proof of Lemma 9.18 we obtain the formula \[\tag{9.5} \det(\mathbf{A})=\sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}\det\left(\mathbf{A}^{(k,l)}\right),\] known as the Laplace expansion of the determinant (along the \(l\)-th column). The uniqueness statement of Theorem 9.8 thus guarantees that for every \(n\times n\) matrix \(\mathbf{A},\) the scalar \(\sum_{k=1}^n(-1)^{l+k}[\mathbf{A}]_{kl}\det\left(\mathbf{A}^{(k,l)}\right)\) is independent of the choice of \(l\) with \(1\leqslant l \leqslant n.\) In practice, when computing the determinant, it is thus advisable to choose \(l\) such that the corresponding column contains the most zeros.
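Formula (9.5) translates directly into a recursive procedure. The following is a self-contained sketch (the function name det_laplace is our own); note that this recursion is exponential in \(n\) and only suitable for small matrices:

```python
import numpy as np

def det_laplace(A, l=1):
    """Determinant via the Laplace expansion (9.5) along column l (1-based).
    Exponential in n, so for illustration on small matrices only."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for k in range(1, n + 1):
        sub = np.delete(np.delete(A, k - 1, axis=0), l - 1, axis=1)  # A^{(k,l)}
        total += (-1) ** (l + k) * A[k - 1, l - 1] * det_laplace(sub)
    return total

A = np.array([[2.0, 0.0, 1.0],
              [1.0, 3.0, 0.0],
              [0.0, 1.0, 4.0]])
# By uniqueness, every column gives the same value:
print([det_laplace(A, l) for l in (1, 2, 3)])        # [25.0, 25.0, 25.0]
print(np.isclose(det_laplace(A), np.linalg.det(A)))  # True
```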
For \(n=2\) and choosing \(l=1,\) we obtain \[\det\left(\begin{pmatrix} a & b \\ c & d \end{pmatrix}\right)=a\det\left(\mathbf{A}^{(1,1)}\right)-c\det\left(\mathbf{A}^{(2,1)}\right)=ad-cb,\] in agreement with (9.1). For \(\mathbf{A}=(A_{ij})_{1\leqslant i,j\leqslant 3} \in M_{3,3}(\mathbb{K})\) and choosing \(l=3\) we obtain \[\begin{gathered} \det\left(\begin{pmatrix}A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}\right)=A_{13} \det\left(\begin{pmatrix} A_{21} & A_{22} \\ A_{31} & A_{32} \end{pmatrix}\right)\\-A_{23}\det\left(\begin{pmatrix} A_{11} & A_{12} \\ A_{31} & A_{32}\end{pmatrix}\right)+A_{33} \det\left(\begin{pmatrix}A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}\right)\\ \end{gathered}\] so that \[\begin{aligned} \det \mathbf{A}&=A_{13}(A_{21}A_{32}-A_{31}A_{22})-A_{23}(A_{11}A_{32}-A_{31}A_{12})\\ &\phantom{=}+A_{33}(A_{11}A_{22}-A_{21}A_{12})\\ &=A_{11}A_{22}A_{33}-A_{11}A_{23}A_{32}-A_{12}A_{21}A_{33}\\ &\phantom{=}+A_{12}A_{23}A_{31}+A_{13}A_{21}A_{32}-A_{13}A_{22}A_{31}. \end{aligned}\]
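The explicit \(3\times 3\) formula just derived can be checked against numpy's determinant on random matrices (a sketch):

```python
import numpy as np

def det3(A):
    """The explicit 3x3 determinant formula derived above."""
    return (A[0, 0] * A[1, 1] * A[2, 2] - A[0, 0] * A[1, 2] * A[2, 1]
            - A[0, 1] * A[1, 0] * A[2, 2] + A[0, 1] * A[1, 2] * A[2, 0]
            + A[0, 2] * A[1, 0] * A[2, 1] - A[0, 2] * A[1, 1] * A[2, 0])

A = np.random.default_rng(5).normal(size=(3, 3))
print(np.isclose(det3(A), np.linalg.det(A)))   # True
```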
Exercises
Let \(V=\mathbb{R}^3\) and \(W=\mathbb{R}.\) Show that the map \[f : V^3 \to W, \quad (\vec{x},\vec{y},\vec{z}) \mapsto (\vec{x}\times\vec{y})\cdot \vec{z}\] is alternating and trilinear.
Solution
We first show that \(f\) is trilinear. Since \(\vec y\times \vec x = -(\vec x\times \vec y),\) we have \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z)\) for all \(\vec x,\vec y, \vec z\in \mathbb{R}^3,\) and hence it suffices to show linearity in the first and third slots. Let \(s,t\in\mathbb{R}\) and \(\vec x,\vec x_1,\vec x_2,\vec y,\vec z,\vec z_1,\vec z_2\in\mathbb{R}^3.\) \[\begin{aligned} f(s\vec x_1+t\vec x_2,\vec y,\vec z) & = \left((s\vec x_1 + t \vec x_2)\times \vec y\right)\cdot \vec z \\ & = \left(s(\vec x_1\times \vec y)+t(\vec x_2\times \vec y)\right)\cdot \vec z \\ & = s(\vec x_1\times \vec y)\cdot \vec z + t(\vec x_2\times \vec y)\cdot \vec z\\ & = s f(\vec x_1,\vec y,\vec z) + t f(\vec x_2,\vec y,\vec z),\end{aligned}\] where we use distributivity of \(\times\) and \(\cdot\) over \(+.\)
\[\begin{aligned} f(\vec x,\vec y,s\vec z_1+t\vec z_2) & = (\vec x\times \vec y)\cdot(s\vec z_1+t\vec z_2) \\ & = s (\vec x\times \vec y)\cdot \vec z_1 + t (\vec x\times \vec y)\cdot \vec z_2, \end{aligned}\] where we use the linearity of \(\cdot\) in the second slot.
We are left to show that \(f\) is alternating: Note that \(f(\vec x,\vec x, \vec z)=0\) by definition of the cross product. Since \(f(\vec y,\vec x,\vec z) = -f(\vec x,\vec y,\vec z),\) it is enough to show that \(f(\vec x,\vec y,\vec y)= 0\) (the case where the first and third argument agree then follows from \(f(\vec x,\vec y,\vec x) = -f(\vec y,\vec x,\vec x)=0\)): \[\begin{aligned} \left(\begin{pmatrix}x_1\\ x_2 \\ x_3\end{pmatrix}\times \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}\right) \cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix} & = \begin{pmatrix}x_2y_3-x_3y_2\\ x_3y_1-x_1y_3\\ x_1y_2-x_2y_1\end{pmatrix}\cdot \begin{pmatrix}y_1\\ y_2 \\ y_3\end{pmatrix}. \end{aligned}\] Evaluating the dot product yields \[\color{blue}y_1x_2y_3\color{black}-\color{red}y_1y_2x_3\color{black} +\color{red}y_1y_2x_3\color{black} - x_1y_2y_3 + x_1y_2y_3 - \color{blue}y_1x_2y_3\] and this expression equals zero since the terms with the same color cancel each other, as do the two uncolored terms.
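For a quick numerical cross-check of the solution (a sketch on random vectors):

```python
import numpy as np

f = lambda x, y, z: float(np.dot(np.cross(x, y), z))   # the map of the exercise

rng = np.random.default_rng(3)
x, y, z = rng.normal(size=(3, 3))
print(np.isclose(f(x, x, z), 0.0))   # True: first two arguments agree
print(np.isclose(f(x, y, y), 0.0))   # True: last two arguments agree
print(np.isclose(f(x, y, x), 0.0))   # True: first and third agree
```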
Define the matrix \[\mathbf{A}=\begin{pmatrix} 4 & 2 & 0 \\ 0 & 5 & -1\\ 1 & 0 & 2 \end{pmatrix}.\] Compute the determinant of \(\mathbf{A}\) by Laplace expansion with respect to column \(\ell\) for each \(\ell\in\{1,2,3\}.\) Conclude that all choices for \(\ell\) give the same answer.
Use the explicit formulae in Example 9.20 to show that we have \(\det(\mathbf{A}^T) = \det(\mathbf{A})\) for all \(n \times n\) square matrices with \(n \leqslant 3.\)
Let \(\mathbf{A}\in M_{m,m}(\mathbb{K})\) and \(\mathbf{B}\in M_{n,n}(\mathbb{K}).\) Show that the determinant of the \((m + n) \times (m + n)\) matrix \(\begin{pmatrix}\mathbf{A}& 0 \\ 0 & \mathbf{B}\end{pmatrix}\) is given by \(\det(\mathbf{A}) \det(\mathbf{B}).\)
(Hint: Use induction on \(m,\) and do a Laplace expansion on the first column.)
Let \(V=\mathbb{R}^2.\) The map \(f:V^2\to\mathbb{R}\) defined by \[f(\vec x,\vec y) = \vec x\cdot \begin{pmatrix}0 & -1 \\ 1 & 0\end{pmatrix}\vec y\] is bilinear and alternating.
- True
- False
\(\det(\mathbf{1}_{4})=\det(-\mathbf{1}_{4})\)
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) then \(\det(\mathbf{A})=1\) implies \(\mathbf{A}= \mathbf{1}_{n}.\)
- True
- False
\(\det(\mathbf{1}_{5})=\det(-\mathbf{1}_{5})\)
- True
- False
It holds that \[\det\begin{pmatrix}0 & 0 & 1 \\ 1 & 2 & 5 \\ 2 & 1 & 3\end{pmatrix}=-3.\]
- True
- False
If \(\mathbf{A}\in M_{n,n}(\mathbb{R})\) has only rational entries, then \(\det(\mathbf{A})\in\mathbb{Q}.\)
- True
- False
The set \(\{\mathbf{A}\in M_{n,n}(\mathbb{R}) : \det(\mathbf{A})=0\}\) is a subspace of \(M_{n,n}(\mathbb{R}).\)
- True
- False
If \(f:\mathbb{R}^n\times\mathbb{R}^n\to \mathbb{R}\) is bilinear and alternating, then \(f(\vec x,\vec y)=-f(\vec y,\vec x)\) for all \(\vec x,\vec y\in \mathbb{R}^n.\)
- True
- False