4 Vector spaces

4.1 Abstract vector spaces

Let \(\mathbb{K}\) be any field, and \(n \ge 1.\) We’ve seen that the space \(\mathbb{K}^n\) of column vectors has two fundamental operations, \[\begin{aligned} + &: \mathbb{K}^n \times \mathbb{K}^n \to \mathbb{K}^n,& (\vec{x},\vec{y}) &\mapsto \vec{x}+ \vec{y},& &\text{(vector addition),}\\ \cdot &: \mathbb{K}\times \mathbb{K}^n \to \mathbb{K}^n,& (s,\vec{x}) &\mapsto s\cdot\vec{x},& &\text{(scalar multiplication).} \end{aligned}\]

It turns out that there are lots of other mathematical structures where these two operations also make sense.

Example 4.1

Consider \(\mathsf{P}(\mathbb{R}),\) the set of polynomial functions in one real variable, which we denote by \(x,\) with real coefficients. That is, an element \(p \in \mathsf{P}(\mathbb{R})\) is a function \[p : \mathbb{R}\to \mathbb{R}, \qquad x \mapsto a_n x^n+a_{n-1}x^{n-1}+\cdots + a_1 x+a_0=\sum_{k=0}^na_kx^k,\] where \(n \in \mathbb{N}\) and the coefficients \(a_k \in \mathbb{R}\) for \(k=0,1,\ldots,n.\) The largest \(m \in \mathbb{N}\) such that \(a_m \neq 0\) is called the degree of \(p.\) Notice that we consider polynomials of arbitrary but finite degree. A power series \(x \mapsto \sum_{k=0}^{\infty} a_k x^k,\) which you encounter in the Calculus module, is not a polynomial unless only finitely many of its coefficients are different from zero.

Clearly, we can multiply \(p\) by a real number \(s \in \mathbb{R}\) to obtain a new polynomial \(s\cdot_{\mathsf{P}(\mathbb{R})} p\) \[\tag{4.1} s\cdot_{\mathsf{P}(\mathbb{R})} p : \mathbb{R}\to \mathbb{R}, \qquad x \mapsto s\cdot p(x)\] so that \((s \cdot_{\mathsf{P}(\mathbb{R})}p)(x)=\sum_{k=0}^n s a_k x^k\) for all \(x \in \mathbb{R}.\) Here \(s \cdot p(x)\) is the usual multiplication of the real numbers \(s\) and \(p(x).\) If we consider another polynomial \[q : \mathbb{R}\to \mathbb{R}, \qquad x\mapsto \sum_{k=0}^n b_k x^k\] with \(b_k \in \mathbb{R}\) for \(k=0,1,\ldots,n,\) the sum of the polynomials \(p\) and \(q\) is the polynomial \[\tag{4.2} p+_{\mathsf{P}(\mathbb{R})}q : \mathbb{R}\to \mathbb{R}, \qquad x \mapsto p(x)+q(x)\] so that \((p+_{\mathsf{P}(\mathbb{R})}q)(x)=\sum_{k=0}^n(a_k+b_k)x^k\) for all \(x \in \mathbb{R}.\) Here \(p(x)+q(x)\) is the usual addition of the real numbers \(p(x)\) and \(q(x).\) We will henceforth omit writing \(+_{\mathsf{P}(\mathbb{R})}\) and \(\cdot_{\mathsf{P}(\mathbb{R})}\) and simply write \(+\) and \(\cdot.\)
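These coefficient-wise formulas are easy to make concrete. The following Python sketch (our own illustration, not part of the notes) represents a polynomial by its coefficient list \([a_0,a_1,\ldots,a_n]\); the names `poly_scale` and `poly_add` are invented for the example.

```python
def poly_scale(s, p):
    """The scalar multiple s*p of (4.1), computed coefficient-wise."""
    return [s * a for a in p]

def poly_add(p, q):
    """The sum p+q of (4.2): pad the shorter coefficient list with
    zeros, then add coefficient-wise."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

p = [1.0, 2.0]        # p(x) = 1 + 2x
q = [0.0, 0.0, 3.0]   # q(x) = 3x^2
print(poly_scale(2.0, p))   # [2.0, 4.0],      i.e. 2 + 4x
print(poly_add(p, q))       # [1.0, 2.0, 3.0], i.e. 1 + 2x + 3x^2
```

The padding step mirrors the convention used above: to add polynomials of different degrees, we regard the missing coefficients as zero.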

This suggests that just working with the explicit spaces of column vectors \(\mathbb{K}^n\) is too limiting. Instead, we’re going to make a list of rules which capture how addition and scalar multiplication behave on \(\mathbb{K}^n\); and we’ll allow ourselves to work with any structure which has “addition” and “scalar multiplication” operations that play by these rules (much as we did in Chapter 1 with the axiomatic definition of a field). We’ll think of the elements of these structures as “abstract vectors”.

Definition 4.2 • Vector space

A \(\mathbb{K}\)-vector space, or vector space over \(\mathbb{K}\) is a set \(V,\) with a distinguished element \(0_V\) (called the zero vector) and two operations \[\begin{aligned} +_V : V \times V \to V& &(v_1,v_2) \mapsto v_1+_Vv_2& &(\text{vector addition}) \end{aligned}\] and \[\begin{aligned} \cdot_V : \mathbb{K}\times V \to V& &(s,v) \mapsto s \cdot_V v& &(\text{scalar multiplication}), \end{aligned}\] so that the following properties hold:

  • Commutativity of vector addition \[v_1+_Vv_2=v_2+_Vv_1\quad (\text{for all}\; v_1,v_2 \in V);\]

  • Associativity of vector addition \[v_1+_V(v_2+_Vv_3)=(v_1+_Vv_2)+_Vv_3 \quad (\text{for all}\; v_1,v_2,v_3 \in V);\]

  • Identity element of vector addition \[\tag{4.3} 0_V+_Vv=v+_V0_V=v\quad (\text{for all}\; v \in V);\]

  • Identity element of scalar multiplication \[1\cdot_V v=v\quad (\text{for all}\; v \in V);\]

  • Scalar multiplication by zero \[\tag{4.4} 0\cdot_{V}v=0_V \quad (\text{for all}\; v \in V);\]

  • Compatibility of scalar multiplication with field multiplication \[(s_1s_2)\cdot_V v=s_1\cdot_V(s_2\cdot_V v) \quad (\text{for all}\; s_1,s_2 \in \mathbb{K}, v \in V);\]

  • Distributivity of scalar multiplication with respect to vector addition \[s\cdot_V(v_1+_Vv_2)=s\cdot_Vv_1+_Vs\cdot_V v_2\quad (\text{for all}\; s \in \mathbb{K}, v_1,v_2 \in V);\]

  • Distributivity of scalar multiplication with respect to field addition \[(s_1+s_2)\cdot_Vv=s_1\cdot_Vv+_Vs_2\cdot_Vv \quad (\text{for all}\; s_1,s_2 \in \mathbb{K}, v \in V).\] The elements of \(V\) are called vectors.
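The axioms can of course only be established by proof, but it is instructive to spot-check them on a concrete example. Here is a small Python sketch (ours, not part of the notes) testing three of the axioms for \(V=\mathbb{R}^3\) on random vectors; we use integer entries so that the equality tests are exact rather than approximate.

```python
import random

def add(v, w):    # vector addition in R^n, componentwise
    return [a + b for a, b in zip(v, w)]

def smul(s, v):   # scalar multiplication in R^n, componentwise
    return [s * a for a in v]

random.seed(0)
v1 = [random.randint(-9, 9) for _ in range(3)]
v2 = [random.randint(-9, 9) for _ in range(3)]
s1, s2 = random.randint(-9, 9), random.randint(-9, 9)

# commutativity of vector addition
assert add(v1, v2) == add(v2, v1)
# compatibility of scalar multiplication with field multiplication
assert smul(s1 * s2, v1) == smul(s1, smul(s2, v1))
# distributivity over vector addition
assert smul(s1, add(v1, v2)) == add(smul(s1, v1), smul(s1, v2))
print("all spot checks passed")
```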

Examples of vector spaces

Example 4.3 • Field

A field \(\mathbb{K}\) is a \(\mathbb{K}\)-vector space. We may take \(V=\mathbb{K},\) \(0_V=0_{\mathbb{K}}\) and equip \(V\) with addition \(+_V=+_{\mathbb{K}}\) and scalar multiplication \(\cdot_V=\cdot_{\mathbb{K}}.\) Then the properties of a field imply that \(V=\mathbb{K}\) is a \(\mathbb{K}\)-vector space.

Example 4.4 • Vector space of matrices

Let \(V=M_{m,n}(\mathbb{K})\) denote the set of \(m\times n\)-matrices with entries in \(\mathbb{K}\) and \(0_V=\mathbf{0}_{m,n}\) denote the zero vector. It follows from Proposition 2.16 that \(V\) equipped with addition \(+_V : V \times V \to V\) defined by (2.2) and scalar multiplication \(\cdot_V : \mathbb{K}\times V \to V\) defined by (2.3) is a \(\mathbb{K}\)-vector space. In particular, the set of column vectors \(\mathbb{K}^n=M_{n,1}(\mathbb{K})\) is a \(\mathbb{K}\)-vector space as well.

Example 4.5 • Vector space of polynomials

The set \(\mathsf{P}(\mathbb{R})\) of polynomials in one real variable and with real coefficients is an \(\mathbb{R}\)-vector space, when equipped with scalar multiplication and addition as defined in (4.1) and (4.2) and when the zero vector \(0_{\mathsf{P}(\mathbb{R})}\) is defined to be the zero polynomial \(o : \mathbb{R}\to \mathbb{R},\) that is, the polynomial satisfying \(o(x)=0\) for all \(x \in \mathbb{R}.\)

More generally, functions form a vector space:

Example 4.6 • Vector space of functions

We follow the convention of calling a mapping with values in \(\mathbb{K}\) a function. Let \(I\subset \mathbb{R}\) be an interval and let \(o : I \to \mathbb{K}\) denote the zero function defined by \(o(x)=0\) for all \(x \in I.\) We consider \(V=\mathsf{F}(I,\mathbb{K}),\) the set of functions from \(I\) to \(\mathbb{K}\) with zero vector \(0_V=o\) given by the zero function and define addition \(+_V : V \times V \to V\) as in (4.2) and scalar multiplication \(\cdot_V : \mathbb{K}\times V \to V\) as in (4.1). It is now a consequence of the properties of addition and multiplication of scalars that \(\mathsf{F}(I,\mathbb{K})\) is a \(\mathbb{K}\)-vector space. (The reader is invited to check this assertion!)

Example 4.7 • Vector space of sequences

A mapping \(x : \mathbb{N} \to \mathbb{K},\) from the natural numbers into a field \(\mathbb{K},\) is called a sequence in \(\mathbb{K}\) (or simply a sequence, when \(\mathbb{K}\) is clear from the context). It is common to write \(x_n\) instead of \(x(n)\) for \(n \in \mathbb{N}\) and to denote a sequence by \((x_n)_{n \in \mathbb{N}}=(x_0,x_1,x_2,\ldots).\) We write \(\mathbb{K}^{\infty}\) for the set of sequences in \(\mathbb{K}.\) For instance, taking \(\mathbb{K}=\mathbb{R},\) we may consider the sequence \[\left(\frac{1}{n + 1}\right)_{n \in \mathbb{N}}=\left(1,\frac{1}{2},\frac{1}{3},\frac{1}{4},\frac{1}{5},\ldots\right)\] or the sequence \[\left(\sqrt{n+1}\right)_{n \in \mathbb{N}}=\left(1,\sqrt{2},\sqrt{3},2,\sqrt{5},\ldots\right).\] If we equip \(\mathbb{K}^{\infty}\) with the zero vector given by the zero sequence \((0,0,0,0,0,\ldots),\) addition given by \((x_n)_{n \in \mathbb{N}}+(y_n)_{n\in \mathbb{N}}=(x_n+y_n)_{n \in \mathbb{N}}\) and scalar multiplication given by \(s\cdot(x_n)_{n \in \mathbb{N}}=(s x_n)_{n \in \mathbb{N}}\) for \(s \in \mathbb{K},\) then \(\mathbb{K}^{\infty}\) is a \(\mathbb{K}\)-vector space.
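Since a sequence is by definition a mapping \(\mathbb{N}\to\mathbb{K},\) we can model an element of \(\mathbb{K}^{\infty}\) directly as a function. A brief Python sketch (our own; the names `seq_add` and `seq_scale` are invented):

```python
import math

def seq_add(x, y):        # (x_n) + (y_n) = (x_n + y_n)
    return lambda n: x(n) + y(n)

def seq_scale(s, x):      # s * (x_n) = (s * x_n)
    return lambda n: s * x(n)

x = lambda n: 1 / (n + 1)        # the sequence (1, 1/2, 1/3, ...)
y = lambda n: math.sqrt(n + 1)   # the sequence (1, sqrt(2), sqrt(3), ...)

z = seq_add(seq_scale(2.0, x), y)   # the sequence 2*(x_n) + (y_n)
print([z(n) for n in range(4)])     # its first four terms
```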

Example 4.8 • Zero vector space

Consider a set \(V\) whose only element is a formal symbol \(\star.\) We define \(0_V = \star,\) addition by \(\star +_V \star = \star\) and scalar multiplication by \(s\cdot_V \star=\star.\) Then all the properties of Definition 4.2 are satisfied. We write \(V=\{\star\},\) or simply \(V=\{0\},\) and call \(V\) the zero vector space (over \(\mathbb{K}\)).

Example 4.9 • Field embeddings

If \(\mathbb{F}\) and \(\mathbb{K}\) are fields, and \(\iota: \mathbb{F}\to \mathbb{K}\) is a field embedding, then \(\mathbb{K}\) is an \(\mathbb{F}\)-vector space in a natural way: the addition is the native field addition of \(\mathbb{K},\) and the scalar multiplication is given by \(s \cdot x = \iota(s) \cdot_\mathbb{K}x.\) (Exercise: check that the axioms are satisfied!) In effect, we are throwing away some of the structure from \(\mathbb{K}\) – we are “forgetting” how to multiply elements of \(\mathbb{K},\) except when one of them is in the image of \(\iota.\)

In particular, \(\mathbb{R}\) is a \(\mathbb{Q}\)-vector space. This is a really strange and puzzling concept, and shows that sometimes we have to live with our definitions having unexpected consequences!

The notion of a vector space is an example of an abstract space. Later in your studies you will encounter further examples, like topological spaces, metric spaces and manifolds.

Remark 4.10 • Notation & Definition

Let \(V\) be a \(\mathbb{K}\)-vector space.

  • For \(v \in V\) we write \(-v=(-1)\cdot_V v\) and for \(v_1,v_2 \in V\) we write \(v_1-v_2=v_1+_V(-v_2).\) In particular, using the properties from Definition 4.2 we have (check which properties we use!) \[v-v=v+_V(-v)=1\cdot_Vv+_V(-1)\cdot_V v=(1-1)\cdot_Vv=0\cdot_Vv=0_V.\] For this reason we call \(-v\) the additive inverse of \(v\).

  • Again, it is too cumbersome to always write \(+_V,\) for this reason we often write \(v_1+v_2\) instead of \(v_1+_Vv_2.\)

  • Likewise, we will often write \(s \cdot v\) or \(s v\) instead of \(s\cdot_Vv.\)

  • It is also customary to write \(0\) instead of \(0_V.\)

Lemma 4.11 • Elementary properties of vector spaces

Let \(V\) be a \(\mathbb{K}\)-vector space. Then we have:

  1. The zero vector is unique, that is, if \(0_V^{\prime}\) is another vector such that \(0_V^{\prime}+v=v+0_V^{\prime}=v\) for all \(v \in V,\) then \(0_V^{\prime}=0_V.\)

  2. The additive inverse of every \(v \in V\) is unique, that is, if \(w \in V\) satisfies \(v+w=0_V,\) then \(w=-v.\)

  3. For all \(s \in \mathbb{K}\) we have \(s 0_V=0_V.\)

  4. For \(s \in \mathbb{K}\) and \(v \in V\) we have \(s v=0_V\) if and only if either \(s=0\) or \(v=0_V.\)

Proof. (The reader is invited to check which property of Definition 4.2 is used at each of the equality signs below.)

  1. We have \(0_V^{\prime}=0_V^{\prime}+0_V=0_V.\)

  2. Since \(v+w=0_V,\) adding \(-v\) we obtain \(w=0_V+w=((-v)+v)+w=(-v)+(v+w)=(-v)+0_V=-v.\)

  3. We compute \(s 0_V=s(0_V+0_V)=s 0_V+s 0_V;\) subtracting \(s 0_V\) from both sides gives \(s 0_V=0_V.\)

  4. \(\Leftarrow\) If \(v=0_V,\) then \(s v=0_V\) by part 3. If \(s=0,\) then \(s v=0_V\) by (4.4).

    \(\Rightarrow\) Let \(s \in \mathbb{K}\) and \(v \in V\) such that \(s v=0_V.\) It is sufficient to show that if \(s \neq 0,\) then \(v=0_V.\) Since \(s \neq 0\) we can multiply \(s v=0_V\) by \(1/s,\) so that \[v=\left(\frac{1}{s}s\right)v=\frac{1}{s}\left(s v\right)=\frac{1}{s}0_V=0_V.\]

4.2 Linear combinations

Definition 4.12 • Linear combination

Let \(V\) be a \(\mathbb{K}\)-vector space and \(\mathcal{S}\) a set of vectors from \(V.\) A linear combination of the vectors in \(\mathcal{S}\) is a vector \(w \in V\) which can be written in the form \[w=s_1v_1+\cdots+s_kv_k=\sum_{i=1}^ks_i v_i\] for some \(k \in \mathbb{N},\) scalars \(s_1,\ldots,s_k \in \mathbb{K},\) and vectors \(v_1, \dots, v_k \in \mathcal{S}.\)

Note that we don’t have to use all of the elements in \(\mathcal{S}\); indeed \(\mathcal{S}\) doesn’t have to be finite, and a linear combination is only allowed to mention finitely many elements (the definition of a vector space doesn’t give us any way of making sense of infinite sums). On the other hand, if \(\mathcal{S}\) is finite, then it suffices to consider linear combinations which do involve all the elements of \(\mathcal{S},\) simply by introducing extra terms with \(s_i = 0.\)

Remark 4.13

When \(k = 0,\) we understand the empty sum—the sum of no elements of \(V\)—to mean \(0_V.\) In particular, \(0_V\) is a linear combination of vectors in \(\mathcal{S}\) for any \(\mathcal{S}\) (even if \(\mathcal{S}\) is the empty set).

Example 4.14

For \(n \in \mathbb{N}\) with \(n\geqslant 2\) consider \(V=\mathsf{P}_n(\mathbb{R})\) and the polynomials \(p_1,p_2,p_3 \in \mathsf{P}_n(\mathbb{R})\) defined by the rules \(p_1(x)=1,\) \(p_2(x)=x,\) \(p_3(x)=x^2\) for all \(x \in \mathbb{R}.\) A linear combination of \(\{p_1,p_2,p_3\}\) is a polynomial of the form \(p(x)=ax^2+bx+c\) where \(a,b,c \in \mathbb{R}.\)

For vectors in the column-vector space \(\mathbb{K}^n,\) “being a linear combination” is expressible by a linear system of equations:

Example 4.15

“Is \(\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \in \mathbb{R}^3\) a linear combination of \(\left\{\left(\begin{smallmatrix} 3 \\ -1 \\ 0\end{smallmatrix}\right), \left(\begin{smallmatrix} 0 \\ 1 \\-1 \end{smallmatrix}\right)\right\}\)?”

We’re asking if there are \(s_1, s_2\) such that \(s_1 \begin{pmatrix} 3 \\ -1 \\ 0\end{pmatrix} + s_2 \begin{pmatrix} 0 \\1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix},\) which is the same as the equations \[\begin{array}{rc} 3s_1 + 0 s_2 &= 1 \\ -s_1 + s_2 &= 2 \\ 0s_1 - s_2 &= 1 \end{array}, \qquad \text{i.e.} \qquad \left( \begin{array}{cc|c} 3 & 0 & 1 \\ -1 & 1 & 2 \\ 0 & -1 & 1 \end{array}\right).\] Since the RREF of this matrix is \(\left(\begin{array}{rr|r} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right),\) which has a leading entry in the rightmost column (the last row corresponds to the inconsistent equation \(0=1\)), there are no solutions. So the answer to the original question is no.
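The same computation can be delegated to a computer algebra system. A sketch, assuming the SymPy library is available:

```python
from sympy import Matrix, linsolve, symbols

s1, s2 = symbols("s1 s2")
A = Matrix([[3, 0], [-1, 1], [0, -1]])   # columns: the two given vectors
b = Matrix([1, 2, 1])

print(linsolve((A, b), [s1, s2]))   # EmptySet: no solution, as found above
print(A.row_join(b).rref()[0])      # RREF of the augmented matrix: identity
```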

Exercise 4.16

Use the method of 3.19 to show that a vector \(\vec{b} \in \mathbb{R}^3\) is a linear combination of \(\left\{\left(\begin{smallmatrix} 3 \\ -1 \\ 0\end{smallmatrix}\right), \left(\begin{smallmatrix} 0 \\ 1 \\-1 \end{smallmatrix}\right)\right\}\) if and only if \(\left(1\quad 3 \quad 3\right) \cdot \vec{b} = 0.\)

(*) Can you relate this to the RREF of the matrix \[\left( \begin{array}{rr|rrr} 3 & 0 & \phantom{-}1 & 0 & 0\\ -1 & 1 & 0 & \phantom{-}1 & 0 \\ 0 & -1 & 0 & 0 & \phantom{-}1 \end{array}\right) \qquad?\]

4.3 Vector subspaces

A vector subspace of a vector space is a subset that is itself a vector space. More precisely:

Definition 4.17 • Vector subspace

Let \(V\) be a \(\mathbb{K}\)-vector space. A subset \(U\subset V\) is called a vector subspace of \(V\) if the restriction to \(U\) of the addition and scalar-multiplication operations of \(V\) make \(U\) into a vector space; that is,

  • \(0_V \in U,\)

  • \(v_1 +_V v_2 \in U\) for all \(v_1, v_2 \in U,\)

  • \(s \cdot_V v \in U\) for all \(s \in \mathbb{K}\) and \(v \in U.\)

One can check that these conditions are equivalent to the following easier-to-check condition: \(U\) is non-empty, and \[\tag{4.5} s_1\cdot_Vv_1+_Vs_2\cdot_Vv_2 \in U \quad (\text{for all}\; s_1,s_2 \in \mathbb{K},\; v_1,v_2 \in U).\]

Remark 4.18

  1. Let’s check that this simpler condition implies the ones in Definition 4.17. Since \(U\) is non-empty, it contains an element, say \(u.\) Taking \(s_1 = s_2 = 0\) and \(v_1 = v_2 = u\) in (4.5), we see that \(0\cdot_V u + 0 \cdot_V u =0_V \in U.\) Thus the zero vector \(0_V\) lies in \(U.\) Taking \(s_1 = s_2 = 1\) and \(v_1, v_2\) arbitrary, we get that \(U\) is closed under sums; and taking \(s_2 = 0_{\mathbb{K}}\) we see that \(U\) is closed under scalar multiplication.

    (On the other hand, we can’t drop the condition that \(U\) be non-empty, since the empty set vacuously satisfies (4.5) but is not a subspace.)

  2. We’ll see in the exercises that a subspace is automatically closed under linear combinations; that is, if \(U\) is a vector subspace, then any linear combination of elements of \(U\) is in \(U.\)

  3. A vector subspace is also called a linear subspace or simply a subspace.

The prototypical examples of vector subspaces are lines and planes through the origin in \(\mathbb{R}^3\):

Example 4.19 • Lines through the origin

Let \(\vec{w} \in \mathbb{R}^3\) with \(\vec{w}\neq 0_{\mathbb{R}^3}.\) Then the line \[U=\{s \vec{w}\, |\, s \in \mathbb{R}\} \subset \mathbb{R}^3\] is a vector subspace. Indeed, taking \(s=0\) it follows that \(0_{\mathbb{R}^3} \in U\) so that \(U\) is non-empty. Let \(\vec{u}_1,\vec{u}_2\) be vectors in \(U\) so that \(\vec{u}_1=t_1\vec{w}\) and \(\vec{u}_2=t_2\vec{w}\) for scalars \(t_1,t_2 \in \mathbb{R}.\) For \(s_1,s_2 \in \mathbb{R}\) we then have \[s_1\vec{u}_1+s_2\vec{u}_2=s_1t_1\vec{w}+s_2t_2\vec{w}=\left(s_1t_1+s_2t_2\right)\vec{w} \in U\] so that \(U\) satisfies (4.5) and is therefore a subspace of \(\mathbb{R}^3.\)

Example 4.20 • Zero subspace

Let \(V\) be a \(\mathbb{K}\)-vector space and \(U=\{0_V\}\) the set consisting of the zero vector of \(V.\) Then, by Definition 4.17 and the properties of Definition 4.2, it follows that \(U\) is a vector subspace of \(V\): the zero subspace \(\{0_V\}.\)

If \(V\) and \(W\) are different vector spaces over \(\mathbb{K},\) then \(\{0_V\}\) and \(\{0_W\}\) may or may not be exactly the same – it depends on what \(V\) and \(W\) are – but they have the same vector-space structure (they are isomorphic, a concept we’ll see later).

Example 4.21 • Periodic functions

Taking \(I=\mathbb{R}\) and \(\mathbb{K}=\mathbb{R}\) in Example 4.6, we see that the functions \(f : \mathbb{R}\to \mathbb{R}\) form an \(\mathbb{R}\)-vector space \(V=\mathsf{F}(\mathbb{R},\mathbb{R}).\) Consider the subset \[U=\left\{f \in \mathsf{F}(\mathbb{R},\mathbb{R})\,|\, f\;\text{is periodic with period}\; 2\pi\right\}\] consisting of \(2\pi\)-periodic functions, that is, an element \(f \in U\) satisfies \(f(x+2\pi)=f(x)\) for all \(x \in \mathbb{R}.\) Notice that \(U\) is not empty, as \(\cos : \mathbb{R}\to \mathbb{R}\) and \(\sin : \mathbb{R}\to \mathbb{R}\) are elements of \(U.\) Suppose \(f_1,f_2 \in U\) and \(s_1,s_2 \in \mathbb{R}.\) Then, we have for all \(x \in \mathbb{R}\) \[\begin{aligned} (s_1f_1+s_2f_2)(x+2\pi)&=s_1f_1(x+2\pi)+s_2f_2(x+2\pi)=s_1f_1(x)+s_2f_2(x)\\ &=(s_1f_1+s_2f_2)(x) \end{aligned}\] showing that \(s_1f_1+s_2f_2\) is periodic with period \(2\pi.\) By Definition 4.17, it follows that \(U\) is a vector subspace of \(\mathsf{F}(\mathbb{R},\mathbb{R}).\)
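A quick numerical spot-check of this closure property (illustrative only, of course not a substitute for the proof above):

```python
import math

# s1*cos + s2*sin should again be 2*pi-periodic
f = lambda x: 2.0 * math.cos(x) + 3.0 * math.sin(x)
for x in (0.0, 1.0, -2.5):
    assert abs(f(x + 2 * math.pi) - f(x)) < 1e-12
print("f(x + 2*pi) == f(x) at all sampled points")
```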

Operations on subspaces

Vector subspaces are stable under intersection in the following sense:

Proposition 4.22

Let \(V\) be a \(\mathbb{K}\)-vector space, \(n\geqslant 1\) a natural number and \(U_1,\ldots,U_n\) vector subspaces of \(V.\) Then the intersection \[U^{\prime}=\bigcap_{j=1}^n U_j=\left\{v \in V\,|\, v \in U_j\;\text{for all}\; j=1,\ldots,n\right\}\] is a vector subspace of \(V\) as well.

Proof. Since \(U_j\) is a vector subspace, \(0_V \in U_j\) for all \(j=1,\ldots,n.\) Therefore, \(0_V \in U^{\prime},\) hence \(U^{\prime}\) is not empty. Let \(u_1,u_2 \in U^{\prime}\) and \(s_1,s_2 \in \mathbb{K}.\) By assumption, \(u_1,u_2 \in U_j\) for all \(j=1,\ldots,n.\) Since \(U_j\) is a vector subspace for all \(j=1,\ldots,n\) it follows that \(s_1u_1+s_2 u_2 \in U_j\) for all \(j=1,\ldots,n\) and hence \(s_1u_1+s_2u_2 \in U^{\prime}.\) By Definition 4.17, it follows that \(U^{\prime}\) is a vector subspace of \(V.\)

Remark 4.23

The last proposition is also true for \(n = 0,\) if we understand “the intersection of no subspaces of \(V\)” to mean the whole of \(V.\) Infinite intersections work too: if \(I\) is any set (can be finite, can be infinite, can be empty) and we have a mapping \(I \to \{ \text{vector subspaces of $V$}\},\) \(i \mapsto V_i,\) then \(\bigcap_{i \in I} V_i = \{ v \in V : v \in V_i\ \forall i \in I\}\) is a well-defined subset of \(V\) and it is a subspace.

Remark 4.24

Notice that the union of subspaces need not be a subspace. Let \(V=\mathbb{R}^2,\) \(\{\vec{e}_1,\vec{e}_2\}\) its standard basis and \[U_1=\left\{s \vec{e}_1\,|\,s \in \mathbb{R}\right\} \quad \text{and} \quad U_2=\left\{s \vec{e}_2\,|\,s \in \mathbb{R}\right\}.\] Then \(\vec{e}_1 \in U_1\cup U_2\) and \(\vec{e}_2\in U_1\cup U_2,\) but \(\vec{e}_1+\vec{e}_2 \notin U_1\cup U_2.\)

Exercise 4.25

Does there exist a field \(\mathbb{F}\) and a vector space \(V\) over \(\mathbb{F}\) which can be written as the union of two proper subspaces? What about three proper subspaces?

4.4 Subspaces generated by sets

Definition 4.26 • Subspace generated by a set

Let \(V\) be a \(\mathbb{K}\)-vector space and \(\mathcal{S}\subset V\) be a subset. The subspace generated by \(\mathcal{S}\), or the span of \(\mathcal{S},\) is the set \(\operatorname{span}(\mathcal{S})\) whose elements are linear combinations of vectors in \(\mathcal{S}.\) Formally, we have \[\operatorname{span}(\mathcal{S})=\left\{v \in V\,\Big|\, v=\sum_{i=1}^ks_iv_i \text{ for some } k \in \mathbb{N}, s_1,\ldots,s_k \in \mathbb{K},v_1,\ldots,v_k \in \mathcal{S}\right\}.\]

Remark 4.27

The notation \(\langle \mathcal{S}\rangle\) for the span of \(\mathcal{S}\) is also in use.

Proposition 4.28

Let \(V\) be a \(\mathbb{K}\)-vector space and \(\mathcal{S}\subset V\) be a non-empty subset. Then \(\operatorname{span}(\mathcal{S})\) is a vector subspace of \(V.\)

Proof. Clearly \(\operatorname{span}(\mathcal{S})\) cannot be empty, since it always contains \(0_V.\) So it suffices to show that if \(v_1, v_2 \in \operatorname{span}(\mathcal{S})\) and \(s_1, s_2 \in \mathbb{K},\) then \(s_1 v_1 + s_2 v_2 \in \operatorname{span}(\mathcal{S}).\) By assumption, we can write \(v_1=t_1w_1+\cdots+t_kw_k\) for some \(k\in \mathbb{N},\) \(t_1,\ldots t_k \in \mathbb{K}\) and \(w_1,\ldots, w_k \in \mathcal{S}\); and similarly \(v_2=\hat{t}_1\hat{w}_1+\cdots+\hat{t}_j\hat{w}_j\) for some \(j,\) scalars \(\hat{t}_1,\ldots,\hat{t}_j\) and \(\hat{w}_1,\ldots,\hat{w}_j \in \mathcal{S}.\)

But then \[\begin{aligned} s_1v_1+s_2v_2&=s_1(t_1w_1+\cdots+t_kw_k)+s_2(\hat{t}_1\hat{w}_1+\cdots+\hat{t}_j\hat{w}_j)\\ &=s_1t_1w_1+\cdots+s_1t_kw_k+s_2\hat{t}_1\hat{w}_1+\cdots+s_2\hat{t}_j\hat{w}_j \end{aligned}\] is also a linear combination of vectors in \(\mathcal{S},\) and the claim follows.

Remark 4.29

For a subset \(\mathcal{S}\subset V,\) we may alternatively define \(\operatorname{span}(\mathcal{S})\) to be “the smallest vector subspace of \(V\) that contains \(\mathcal{S}\)” (but then we have to justify the claim that such a subspace exists). Another alternative is “the intersection of all subspaces of \(V\) that contain \(\mathcal{S}\)” (but then it’s not so clear what its elements are).

4.5 Generating sets and finite-dimensionality

Definition 4.30

Let \(V\) be a \(\mathbb{K}\)-vector space. A subset \(\mathcal{S}\subset V\) is called a generating set (or spanning set) of \(V\) if \(\operatorname{span}(\mathcal{S}) =V.\)

Exercise 4.31

Show that if \(\mathcal{S}\subset \mathcal{T} \subset V,\) and \(\mathcal{S}\) is a generating set of \(V,\) then \(\mathcal{T}\) is also a generating set.

Example 4.32

Thinking of a field \(\mathbb{K}\) as a \(\mathbb{K}\)-vector space, the set \(\mathcal{S}=\{1_{\mathbb{K}}\}\) consisting of the identity element of multiplication is a generating set for \(V=\mathbb{K}.\) Indeed, for every \(x \in \mathbb{K}\) we have \(x=x\cdot_{V}1_{\mathbb{K}}.\)

Definition 4.33

The vector space \(V\) is called finite dimensional if \(V\) admits a generating set with finitely many elements (also called a finite set). A vector space that is not finite dimensional will be called infinite dimensional.

Remark 4.34

Notice that we’re playing a devious notational trick: the definition of “finite-dimensional” is not “the dimension is finite” – we haven’t defined the notion of “dimension” yet! It would be logically better to call these “finitely generated vector spaces”, but this is not the convention, so we won’t do it.

Example 4.35

The standard basis \(\mathcal{S}=\{\vec{e}_1,\ldots,\vec{e}_n\}\) is a generating set for \(\mathbb{K}^n,\) since for all \(\vec{x}=(x_i)_{1\leqslant i\leqslant n} \in \mathbb{K}^n,\) we can write \(\vec{x}=x_1\vec{e}_1+\cdots+x_n\vec{e}_n\) so that \(\vec{x}\) is a linear combination of elements of \(\mathcal{S}.\) Thus \(\mathbb{K}^n\) is finite-dimensional.

Example 4.36

Let \(\mathbf{E}_{k,l} \in M_{m,n}(\mathbb{K})\) for \(1\leqslant k \leqslant m\) and \(1\leqslant l \leqslant n\) denote the \(m\)-by-\(n\) matrix satisfying \(\mathbf{E}_{k,l}=(\delta_{ki}\delta_{lj})_{1\leqslant i\leqslant m, 1\leqslant j \leqslant n}.\) For example, for \(m=2\) and \(n=3\) we have \[\mathbf{E}_{1,1}=\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\qquad \mathbf{E}_{1,2}=\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},\qquad \mathbf{E}_{1,3}=\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}\] and \[\mathbf{E}_{2,1}=\begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},\qquad \mathbf{E}_{2,2}=\begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix},\qquad \mathbf{E}_{2,3}=\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}.\] Then \(\mathcal{S}=\{\mathbf{E}_{k,l}\}_{1\leqslant k\leqslant m, 1\leqslant l \leqslant n}\) is a generating set for \(M_{m,n}(\mathbb{K}),\) since a matrix \(\mathbf{A}\in M_{m,n}(\mathbb{K})\) can be written as \[\mathbf{A}=\sum_{k=1}^m\sum_{l=1}^n A_{kl}\mathbf{E}_{k,l}\] so that \(\mathbf{A}\) is a linear combination of the elements of \(\mathcal{S}.\)
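A short NumPy sketch (our own illustration; note that Python indices run from \(0\) where the text counts from \(1\)) reconstructing a matrix from the matrix units exactly as in the displayed formula:

```python
import numpy as np

def E(k, l, m, n):
    """The m-by-n matrix unit with a 1 in entry (k, l), zeros elsewhere."""
    M = np.zeros((m, n))
    M[k, l] = 1.0
    return M

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
m, n = A.shape
B = sum(A[k, l] * E(k, l, m, n) for k in range(m) for l in range(n))
assert np.array_equal(A, B)   # A is the linear combination sum A_kl * E_kl
```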

Example 4.37

The vector space \(\mathsf{P}(\mathbb{R})\) of polynomials is infinite dimensional. In order to see this, consider a finite set of polynomials \(\{p_1,\ldots,p_n\},\) \(n \in \mathbb{N}\) and let \(d_i\) denote the degree of the polynomial \(p_i\) for \(i=1,\ldots,n.\) We set \(D =\max\{d_1,\ldots,d_n\}.\) Since a linear combination of the polynomials \(\{p_1,\ldots,p_n\}\) has degree at most \(D ,\) any polynomial \(q\) whose degree is strictly larger than \(D\) will satisfy \(q \notin \operatorname{span}\{p_1,\ldots,p_n\}.\) It follows that \(\mathsf{P}(\mathbb{R})\) cannot be generated by a finite set of polynomials.

4.6 Linear independence

Recall that spanning was about existence of linear combinations: which elements of \(V\) we can “hit” with linear combinations of \(\mathcal{S}.\) We have a complementary notion which is about uniqueness of linear combinations – “how many” ways we can hit a given element:

Definition 4.38 • Linear independence

Let \(\mathcal{S}\subset V\) be a subset. We say \(\mathcal{S}\) is linearly independent if there is no non-trivial way of writing \(0_V\) as a linear combination of vectors in \(\mathcal{S}.\) That is, for all natural numbers \(k \ge 1,\) all \(s_1, \dots, s_k \in \mathbb{K}\) and all distinct \(v_1, \dots, v_k \in \mathcal{S},\) we have the implication \[s_1v_1+\cdots+s_kv_k=0_V \qquad \implies \qquad s_1=\cdots=s_k=0.\] If \(\mathcal{S}\) is not linearly independent (so there exists a non-trivial linear combination of elements of \(\mathcal{S}\) equal to \(0_V\)) then \(\mathcal{S}\) is called linearly dependent.

Remark 4.39

Observe that a subset \(\mathcal{T}\) of a linearly independent set \(\mathcal{S}\) is itself linearly independent (exercise); and the empty set is linearly independent.

Example 4.40

We consider the polynomials \(p_1,p_2,p_3 \in \mathsf{P}(\mathbb{R})\) defined by the rules \(p_1(x)=1,p_2(x)=x,p_3(x)=x^2\) for all \(x \in \mathbb{R}.\) Then \(\{p_1,p_2,p_3\}\) is linearly independent. In order to see this, consider the condition \[\tag{4.6} s_1p_1+s_2p_2+s_3p_3=0_{\mathsf{P}(\mathbb{R})}=o\] where \(o : \mathbb{R}\to \mathbb{R}\) denotes the zero polynomial. Since (4.6) means that \[s_1p_1(x)+s_2p_2(x)+s_3p_3(x)=o(x),\] for all \(x \in \mathbb{R},\) we can evaluate this condition for any choice of real number \(x.\) Taking \(x=0\) gives \[s_1=s_1p_1(0)+s_2p_2(0)+s_3p_3(0)=o(0)=0.\] Taking \(x=1\) and \(x=-1\) gives \[\begin{aligned} 0&=s_2p_2(1)+s_3p_3(1)=s_2+s_3,\\ 0&=s_2p_2(-1)+s_3p_3(-1)=-s_2+s_3, \end{aligned}\] so that \(s_2=s_3=0\) as well. It follows that \(\{p_1,p_2,p_3\}\) is linearly independent.
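The evaluation argument can be replayed symbolically; a sketch, again assuming SymPy is available:

```python
from sympy import solve, symbols

s1, s2, s3, x = symbols("s1 s2 s3 x")
combo = s1 * 1 + s2 * x + s3 * x**2          # s1*p1 + s2*p2 + s3*p3

# require the combination to vanish at x = 0, 1, -1, as in the text
equations = [combo.subs(x, v) for v in (0, 1, -1)]
print(solve(equations, [s1, s2, s3]))        # {s1: 0, s2: 0, s3: 0}
```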

For vectors in \(\mathbb{K}^n,\) linear independence can be checked using row-echelon form.

Example 4.41

Consider the vectors in \(\mathbb{R}^3\) given by \[v_1 = \left(\begin{smallmatrix} 0 \\ 1 \\ 2\end{smallmatrix}\right), v_2 = \left(\begin{smallmatrix} -9 \\ 4 \\ 6\end{smallmatrix}\right), v_3 = \left(\begin{smallmatrix} 3 \\ 0 \\ -1\end{smallmatrix}\right), v_4 = \left(\begin{smallmatrix} 4 \\ -1 \\ 5\end{smallmatrix}\right).\] These are linearly independent if and only if the system of equations \[\left(\begin{smallmatrix} 0 & -9 & 3 & 4 \\ 1 & 4 & 0 & -1 \\ 2 & 6 & -1 & 5 \end{smallmatrix}\right) \left(\begin{smallmatrix} s_1 \\ s_2 \\ s_3 \\ s_4 \end{smallmatrix}\right) = \left(\begin{smallmatrix} 0 \\ 0 \\ 0 \end{smallmatrix}\right)\] has only the zero solution. We already calculated the RREF of this matrix in a previous example; so we know that this system is equivalent to \[\left(\begin{array}{rrrr|r} 1 & 0 & 0 & \frac{17}{3} & 0 \\ 0 & 1 & 0 & -\frac{5}{3} & 0 \\ 0 & 0 & 1 & -\frac{11}{3} & 0 \end{array}\right).\] Obviously this system is consistent (because the rightmost column is entirely zeroes), but we care about uniqueness of solutions; and since none of the rows has its leading entry in the fourth column, \(s_4\) is a free variable, and hence the solution is not unique.
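Here is the corresponding computation, once more as a SymPy sketch:

```python
from sympy import Matrix

M = Matrix([[0, -9, 3, 4],
            [1, 4, 0, -1],
            [2, 6, -1, 5]])   # columns: v1, v2, v3, v4
R, pivots = M.rref()
print(R)        # matches the RREF quoted above
print(pivots)   # (0, 1, 2): no pivot in the fourth column, so s4 is free
```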

Remark 4.42

Of course, this was bound to happen, since there are four columns to the left of the dividing line, but only three rows. So there can’t possibly be a leading entry in every column. What this proves is: any set of vectors in \(\mathbb{K}^n\) containing more than \(n\) elements must be linearly dependent. This is an instance of the Fundamental Inequality, one of the main theorems of this module, which we’ll prove in the next section.


Exercises

Exercise 4.43

Let \(U\subset V\) be a vector subspace and \(k \in \mathbb{N}\) with \(k\geqslant 2.\) Show that for \(u_1,\ldots,u_k \in U\) and \(s_1,\ldots,s_k \in \mathbb{K},\) we have \(s_1u_1+\cdots+s_ku_k \in U.\)

Solution

We will prove the claim by induction on \(k.\) If \(u_1,u_2\in U\) and \(s_1,s_2\in\mathbb{K},\) then \(s_1u_1+s_2u_2\in U\) according to Definition 4.17, which establishes the base case \(k=2.\)

Inductive step: Assume \(k\geqslant 2\) and let \(u_1,\ldots,u_{k+1}\in U\) and \(s_1,\ldots,s_{k+1}\in\mathbb{K}.\) According to the induction hypothesis, \[u=\sum_{j=1}^k s_ju_j\in U\] and hence \[\sum_{j=1}^{k+1}s_ju_j = u + s_{k+1}u_{k+1}\in U\] again by Definition 4.17.

Exercise 4.44 • Planes through the origin

Let \(\vec{w}_1,\vec{w}_2\neq 0_{\mathbb{R}^3}\) and \(\vec{w}_1\neq s \vec{w}_2\) for all \(s \in \mathbb{R}.\) Show that the plane \[U=\{s_1\vec{w}_1+s_2\vec{w}_2\,|\, s_1,s_2 \in \mathbb{R}\}\] is a vector subspace of \(\mathbb{R}^3.\)

Solution

According to Definition 4.17 we first need to show that \(U\ne\emptyset\): by choosing \(s_1=s_2=0,\) we see that \(0_{\mathbb{R}^3}\in U\) and hence \(U\ne\emptyset.\) Let \(\vec u,\vec v\in U.\) We need to show that \(t_1\vec u+t_2\vec v\in U\) for all \(t_1,t_2\in\mathbb{R}.\) There exist scalars \(s_1,\hat s_1, s_2,\hat s_2\) such that \(\vec u = s_1 \vec w_1 + s_2\vec w_2\) and \(\vec v = \hat s_1 \vec w_1+\hat s_2 \vec w_2\) and hence \[\begin{aligned}t_1\vec u + t_2\vec v & = t_1(s_1 \vec w_1 + s_2\vec w_2)+t_2(\hat s_1 \vec w_1+\hat s_2 \vec w_2)\\ & = (t_1s_1+t_2\hat s_1)\vec w_1 + (t_1s_2+t_2\hat s_2)\vec w_2\in U. \end{aligned}\] Note that the assumption \(\vec w_1\ne s\vec w_2\) for all \(s\in\mathbb{R}\) is not needed anywhere in the argument. Indeed, if \(\vec w_1 = s\vec w_2\) for some \(s\ne 0,\) then \(U\) is still a subspace of \(\mathbb{R}^3,\) but in this case it would be called a line instead of a plane.

Exercise 4.45 • Polynomials

Let \(n \in \mathbb{N}\) and \(\mathsf{P}_n(\mathbb{R})\) denote the subset of \(\mathsf{P}(\mathbb{R})\) consisting of polynomials of degree at most \(n\). Show that \(\mathsf{P}_n(\mathbb{R})\) is a subspace of \(\mathsf{P}(\mathbb{R})\) for all \(n \in \mathbb{N}.\)

Solution

Let \(n\in \mathbb{N}\) be arbitrary. The constant polynomial \(p:\mathbb{R}\to\mathbb{R}, x\mapsto 0_\mathbb{R}\) is an element of \(\mathsf{P}_n(\mathbb{R})\) and hence \(\mathsf{P}_n(\mathbb{R})\ne\emptyset.\) Let now \(p,q\in \mathsf{P}_n(\mathbb{R})\) be defined by \[\begin{aligned} p&: x\mapsto \sum_{k=0}^n a_kx^k\\ q&: x\mapsto \sum_{k=0}^n b_kx^k, \end{aligned}\] where \(a_0,b_0,a_1,b_1,\ldots,a_n,b_n\in\mathbb{R}.\) For \(s_1,s_2\in \mathbb{R},\) the polynomial \(s_1p+s_2q\) is given by the function \[x\mapsto \sum_{k=0}^n (s_1a_k+s_2b_k)x^k\] and hence is an element of \(\mathsf{P}_n(\mathbb{R}).\) Therefore \(\mathsf{P}_n(\mathbb{R})\) is a subspace of \(\mathsf{P}(\mathbb{R})\) according to Definition 4.17.

Exercise 4.46

Show that for a non-empty subset \(\mathcal{S}\) of a \(\mathbb{K}\)-vector space \(V,\) the set \(\operatorname{span}(\mathcal{S})\) as defined in Definition 4.26 is the same as either of the two definitions given in Remark 4.29.

Solution

Let \(\mathcal S\subset V\) be a non-empty set. We need to show that \(\operatorname{span}(\mathcal S)\) as defined in Definition 4.26 is the smallest subspace of \(V\) that contains \(\mathcal S.\)

To this end, first note that by Proposition 4.28 the set \(\operatorname{span}(\mathcal S)\) is itself a subspace of \(V,\) and it contains \(\mathcal S\) (every \(v\in\mathcal S\) is the linear combination \(1\cdot v\)). Now let \(U\) be any subspace which contains \(\mathcal S.\) We will show that \(\operatorname{span}(\mathcal S)\subset U.\) Since \(\mathcal S\subset U\) and \(U\) is a subspace, \(U\) contains every finite linear combination of elements in \(\mathcal S\) (see Exercise 4.43). The set \(\operatorname{span}(\mathcal S)\) is by Definition 4.26 the set of all finite linear combinations of elements of \(\mathcal S\) and therefore \(\operatorname{span}(\mathcal S)\subset U.\) Hence \(\operatorname{span}(\mathcal S)\) is the smallest subspace of \(V\) that contains \(\mathcal S.\) Finally, since \(\operatorname{span}(\mathcal S)\) is contained in every subspace of \(V\) that contains \(\mathcal S,\) and is one such subspace itself, it also equals the intersection of all subspaces of \(V\) containing \(\mathcal S,\) which settles the second alternative definition.

Exercise 4.47

Show that a subset \(\{v\}\) consisting of a single vector \(v \in V\) is linearly independent if and only if \(v\neq 0_V.\)

Solution

The subset \(\{v\}\subset V\) is linearly independent if and only if \(s v = 0_V\) implies \(s = 0.\) If \(v=0_V,\) then \(sv=0_V\) for any \(s\in\mathbb{K}\) and therefore the set is linearly dependent. If \(v\ne 0_V,\) then \(sv=0_V\) implies \(s=0\) by the last item of Lemma 4.11, and hence \(\{v\}\) is linearly independent.



MCQ 4.1

If \(\mathbb K\) is a field, then it is a vector space over itself.

  • True
  • False
MCQ 4.2

The set of polynomial functions of degree \(2\) in one variable with coefficients in \(\mathbb R\) equipped with \(+_{\mathsf P(\mathbb R)}\) and \(\cdot_{\mathsf P(\mathbb R)}\) forms a vector space.

  • True
  • False
MCQ 4.3

The set \(\{f:\mathbb R\to \mathbb R : f(3)=2\}\) equipped with the usual addition and scalar multiplication for functions forms a vector space.

  • True
  • False
MCQ 4.4

The set \(\{f:\mathbb R\to \mathbb R : f(2)=0\}\) equipped with the usual addition and scalar multiplication for functions forms a vector space.

  • True
  • False
MCQ 4.5

Let \(U=\operatorname{span}(\{f_1,f_2\})\) be a subspace of the space \(\mathsf F(\mathbb R,\mathbb R)\) of functions \(\mathbb{R} \to \mathbb{R},\) where \[f_1(x) = \cos^2(x), f_2(x)=\sin^2(x).\] Then \(f(x) = \cos(2x)\) is an element of \(U.\)

  • True
  • False
MCQ 4.6

Let \(U=\operatorname{span}(\{f_1,f_2\})\) be a subspace of \(\mathsf F(\mathbb R,\mathbb R),\) where \[f_1(x) = \cos^2(x), f_2(x)=\sin^2(x).\] \(f(x) = 2\) is an element of \(U.\)

  • True
  • False
MCQ 4.7

The subset \[\left\{p(x)=\sum_{k=0}^na_kx^k : \sum_{k=1}^na_k=0\right\}\] of \(\mathsf P_n(\mathbb{R})\) equipped with the usual addition and scalar multiplication is a subspace.

  • True
  • False
MCQ 4.8

The subset \[\left\{p(x)=\sum_{k=0}^na_kx^k : a_k\in\mathbb Q, k=0,1,\ldots,n\right\}\] of \(\mathsf P_n(\mathbb{R})\) equipped with the usual addition and scalar multiplication is a subspace.

  • True
  • False
MCQ 4.9

If \(\operatorname{span}(\mathcal S)=\operatorname{span}(\mathcal T),\) then \(\mathcal S = \mathcal T.\)

  • True
  • False
MCQ 4.10

If \(\operatorname{span}(\mathcal S)\subset\operatorname{span}(\mathcal T),\) then \(\mathcal S\subset \mathcal T.\)

  • True
  • False
MCQ 4.11

If \(\mathcal S\subset \mathcal T,\) then \(\operatorname{span}(\mathcal S)\subset\operatorname{span}(\mathcal T).\)

  • True
  • False
MCQ 4.12

\(\operatorname{span}(\mathcal S)\cup\operatorname{span}(\mathcal T)=\operatorname{span}(\mathcal S\cup \mathcal T)\)

  • True
  • False
MCQ 4.13

\(\operatorname{span}(\mathcal S)\cap\operatorname{span}(\mathcal T)=\operatorname{span}(\mathcal S\cap \mathcal T)\)

  • True
  • False
MCQ 4.14

The set \(W = \{\vec x \in \mathbb{K}^4 : x_1 - x_2 = x_3 + x_4 = 1\}\) is a subspace of \(\mathbb{K}^4.\)

  • True
  • False
MCQ 4.15

The set \(W = \{\vec x \in \mathbb{K}^4 : x_1 - x_2 = x_3 + x_4 = 0\}\) is a subspace of \(\mathbb{K}^4.\)

  • True
  • False
MCQ 4.16

If \(\vec a,\vec b\in\mathbb{K}^n,\) then the set \(W =\{\vec x \in \mathbb{K}^n : \vec a^T \vec x = \vec b^T \vec x = 0\}\) is a subspace of \(\mathbb{K}^n.\)

  • True
  • False
MCQ 4.17

If \(\vec a,\vec b\in\mathbb{K}^n,\) then the set \(W =\{\vec x \in \mathbb{K}^n : \vec a^T \vec x = \vec b^T \vec x = 1\}\) is a subspace of \(\mathbb{K}^n.\)

  • True
  • False
MCQ 4.18

If \(\vec a,\vec b\in\mathbb{K}^n,\) then the set \(W =\{\vec x \in \mathbb{K}^n : \vec a^T \vec x = \vec b^T\vec x = 1\}\cup \{0\}\) is a subspace of \(\mathbb{K}^n.\)

  • True
  • False
