8 Symmetry and groups
8.1 Symmetry
The notion of a group arose by trying to formalise the concept of symmetry. Roughly speaking, given a non-empty set \(\mathcal{X}\) with some extra structure, a symmetry or symmetry transformation of \(\mathcal{X}\) is a bijective transformation \(\sigma : \mathcal{X} \to \mathcal{X}\) that respects the extra structure. For simplicity, we ignore any extra structure, so for us a symmetry of a set \(\mathcal{X}\) is simply a bijective mapping from \(\mathcal{X}\) to itself.
Let \(n \in \mathbb{N}.\) A permutation is a symmetry of the set \(\mathcal{X}=\{1,2,\ldots,n\}.\)
Let \(V\) be a \(\mathbb{K}\)-vector space and \(v_0 \in V.\) The translation \(T_{v_0} : V \to V,\) \(v\mapsto v+v_0\) by the vector \(v_0\) is a symmetry of \(V.\)
Let \(\mathcal{X}\) be any non-empty set. The identity transformation \(\mathrm{Id}_\mathcal{X} : \mathcal{X} \to \mathcal{X}\) defined by \(\mathrm{Id}_\mathcal{X}(x)=x\) for all \(x \in \mathcal{X}\) is a symmetry of \(\mathcal{X}.\)
Often the set \(\mathcal{X}\) is a subset of some larger set \(\mathcal{Z}\) and the symmetries of \(\mathcal{X}\) arise as bijective mappings \(\sigma : \mathcal{Z} \to \mathcal{Z}\) that leave \(\mathcal{X}\) invariant, that is, \(\sigma(x) \in \mathcal{X}\) for all \(x \in \mathcal{X}.\) We illustrate this with two examples:
Consider \(\mathcal{Z}=\mathbb{R}^2\) and \(\mathcal{X}\) to be the circle of radius \(r>0\) centred at the origin \(0_{\mathbb{R}^2},\) that is, \(\mathcal{X}=\{\vec{x}=(x_i)_{1\leqslant i\leqslant 2} \in \mathbb{R}^2 | (x_1)^2+(x_2)^2=r^2\}.\) Let \(\theta \in \mathbb{R}\) and \[\mathbf{R}_{\theta}=\begin{pmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{pmatrix}\] so that \(f_{\mathbf{R}_{\theta}} : \mathbb{R}^2 \to \mathbb{R}^2\) is the counter-clockwise rotation around the origin \(0_{\mathbb{R}^2}\) with angle \(\theta.\) A rotation does not change the length of a vector and hence \(f_{\mathbf{R}_{\theta}}(\vec{x}) \in \mathcal{X}\) for each element \(\vec{x} \in \mathcal{X}.\) The restriction \(\sigma=f_{\mathbf{R}_{\theta}}|_{\mathcal{X}} : \mathcal{X} \to \mathcal{X}\) of the rotation \(f_{\mathbf{R}_{\theta}}\) to the circle \(\mathcal{X}\) is thus a symmetry of the circle. Notice that not all symmetries of the circle are restrictions of rotations. The linear mapping \[f : \mathbb{R}^2 \to \mathbb{R}^2, \quad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \mapsto \begin{pmatrix} x_1 \\ -x_2 \end{pmatrix}\] is the reflection along the \(x_1\)-axis and hence restricts to be a bijective mapping from the circle \(\mathcal{X}\) onto itself. It is thus also a symmetry of the circle.
Let \(n \in \mathbb{N}\) with \(n\geqslant 3.\) We consider a regular polygon \(\mathcal{X}\) with \(n\) sides centred at the origin in \(\mathcal{Z}=\mathbb{R}^2\) and so that \((1,0) \in \mathcal{X}.\) Clearly, not every rotation of \(\mathbb{R}^2\) restricts to be a symmetry of \(\mathcal{X},\) but only rotations with angle \(2\pi k/n\) where \(k \in \{0,1,2,\ldots,n-1\}.\) We thus have \(n\) rotation symmetries arising from the matrices \[\begin{pmatrix} \cos\frac{2\pi k}{n} & - \sin\frac{2\pi k}{n} \\ \sin\frac{2\pi k}{n} & \cos\frac{2\pi k}{n} \end{pmatrix}.\] In addition, the reflection along the \(x_1\)-axis is a symmetry of \(\mathcal{X}.\)
The composition of two symmetries of a set \(\mathcal{X}\) is again a symmetry of \(\mathcal{X}\) and composing symmetries satisfies the following fundamental properties:
If \(\sigma,\pi,\tau : \mathcal{X} \to \mathcal{X}\) are symmetries, then \[(\sigma\circ \pi)\circ \tau=\sigma\circ(\pi\circ \tau)\]
The identity transformation \(\mathrm{Id}_\mathcal{X}\) is a symmetry of \(\mathcal{X}\) and for all symmetries \(\sigma : \mathcal{X} \to \mathcal{X},\) we have \[\sigma \circ \mathrm{Id}_\mathcal{X}=\sigma=\mathrm{Id}_\mathcal{X} \circ \sigma\]
For each symmetry \(\sigma : \mathcal{X} \to \mathcal{X}\) there exists an inverse symmetry \(\sigma^{-1} : \mathcal{X} \to \mathcal{X}\) so that \[\sigma \circ \sigma^{-1}=\mathrm{Id}_\mathcal{X}=\sigma^{-1}\circ \sigma.\]
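The three properties above can be checked by brute force on a small set. The following is a minimal sketch, assuming we store a symmetry of the three-element set \(\{0,1,2\}\) (indexed from \(0\) rather than \(1\) purely for programming convenience) as a tuple `s` with `s[x]` the image of `x`; the helper names `compose` and `inverse` are our own.

```python
from itertools import permutations

# A symmetry of X = {0, 1, 2} is stored as a tuple s, meaning x -> s[x].
X = range(3)
symmetries = list(permutations(X))
identity = tuple(X)

def compose(s, p):
    """Return the composition 's after p', i.e. the mapping x -> s[p[x]]."""
    return tuple(s[p[x]] for x in X)

def inverse(s):
    """Return the inverse mapping of the bijection s."""
    inv = [0] * len(s)
    for x, y in enumerate(s):
        inv[y] = x
    return tuple(inv)

# Associativity of composition.
associative = all(
    compose(compose(s, p), t) == compose(s, compose(p, t))
    for s in symmetries for p in symmetries for t in symmetries
)
# The identity mapping is neutral on both sides.
has_identity = all(
    compose(s, identity) == s == compose(identity, s) for s in symmetries
)
# Every symmetry has a two-sided inverse.
has_inverses = all(
    compose(s, inverse(s)) == identity == compose(inverse(s), s)
    for s in symmetries
)
print(associative, has_identity, has_inverses)
```

Such a check is of course no substitute for the general argument; it only illustrates the statements on one finite example.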
8.2 Groups
We have defined the permutations \(S_n\) to be the bijective mappings of the set \(\mathcal{X}_n=\{1,2,\ldots,n\},\) hence by definition, they are symmetries of \(\mathcal{X}_n.\) Recall that in addition, every permutation \(\sigma \in S_n\) also gives rise to a bijective (linear) mapping from \(\mathbb{K}^n \to \mathbb{K}^n\) defined by \(\vec{e}_i \mapsto \vec{e}_{\sigma(i)},\) where \(\{\vec{e}_1,\ldots,\vec{e}_n\}\) denotes the standard basis of \(\mathbb{K}^n.\) Hence, every permutation also gives a symmetry of \(\mathbb{K}^n.\) The permutations thus make an appearance as symmetries of two different sets, \(\mathcal{X}_n\) and \(\mathbb{K}^n.\) This suggests that a more detailed picture of a symmetry is needed. It turns out that a symmetry is the interplay of two mathematical notions, the notion of a group and the action of a group on a set \(\mathcal{X}\). We start with the definition of a group, cf. Remark 8.3:
A group is a pair \((G,*_G)\) consisting of a set \(G\) together with a binary operation \(*_G : G \times G \to G,\) called group operation, so that the following properties hold:
The group operation \(*_G\) is associative, that is, \[(a*_G b)*_G c=a*_G(b*_Gc)\quad \text{for all }a,b,c \in G.\]
There exists an element \(e_G \in G\) such that \[e_G *_G a=a=a*_G e_G \quad \text{for all } a \in G.\] The element \(e_G\) is unique (see below) and is called the identity element of \(G\).
For each \(a \in G\) there exists an element \(b \in G\) such that \[a*_Gb=e_G=b*_G a.\] The element \(b\) is unique (see below) and called the inverse of \(a\) and is commonly denoted by \(a^{-1}.\)
The symmetries of a set \(\mathcal{X}\) form a group \(G,\) often denoted by \(\mathrm{Sym}(\mathcal{X}),\) where \(*_G=\circ\) is the composition of mappings. The identity element is the identity mapping \(e_G=\mathrm{Id}_\mathcal{X}\) and the inverse of each symmetry \(\sigma\) is the mapping inverse \(\sigma^{-1}.\) In particular, for \(n \in \mathbb{N},\) the permutations of \(\mathcal{X}_n=\left\{1,2,\ldots,n\right\}\) form a group \(G=S_n\) with \(*_G=\circ\) and \(e_G=1,\) the identity permutation.
A field \(\mathbb{K}\) gives rise to two groups. The additive group of the field where \(G=\mathbb{K}\) and \(*_G=+_\mathbb{K}\) and the multiplicative group of the field where \(G=\mathbb{K}^*\) and \(*_G=\cdot_{\mathbb{K}}.\) For the additive group we have \(e_G=0_{\mathbb{K}}\) and the inverse of \(x\in \mathbb{K}\) is \(-x.\) For the multiplicative group we have \(e_G=1_{\mathbb{K}}\) and the inverse of \(x \in \mathbb{K}^*\) is \(\frac{1}{x}.\)
A \(\mathbb{K}\)-vector space \(V\) gives rise to a group where \(G=V\) and \(*_G=+_V.\) Here the identity element is the zero vector \(e_G=0_V\) and the inverse of \(v \in V\) is \(-v.\)
Let \(n \in \mathbb{N}.\) The invertible \(n\times n\) matrices with entries in \(\mathbb{K}\) form a group \(G\) commonly denoted by \(\mathrm{GL}_{n}(\mathbb{K})\) or \(\mathrm{GL}(n,\mathbb{K}).\) Here \(*_{G}\) is matrix multiplication, \(e_G=\mathbf{1}_{n},\) the identity matrix of size \(n\) and the inverse of a group element is the matrix inverse. \(\mathrm{GL}\) is an abbreviation of general linear.
A group with finitely many elements is called finite. The group of permutations \(S_n\) is an example of a finite group. A finite field gives rise to two finite groups.
Notice that we do not require the group operation \(*_G : G \times G \to G\) to be commutative, so in general \(a*_G b\neq b*_G a.\) As an example consider \(G=\mathrm{GL}(n,\mathbb{K})\) where \(*_G\) is matrix multiplication. If the group operation \(*_G\) is commutative, then the group is called Abelian or commutative. The examples (ii) and (iii) above are examples of Abelian groups. The permutation group \(S_n\) is Abelian only for \(n=1,2.\)
Often we write \(+_G\) instead of \(*_G\) and \(0_G\) instead of \(e_G\) when the group is Abelian.
Some authors write \(1_G\) instead of \(e_G\) and/or \(\cdot_G\) instead of \(*_G.\)
As always, the subscript \(G\) is often omitted so that we write \(*\) instead of \(*_G\) and \(e\) or \(1\) instead of \(e_G.\) As for fields, \(*\) or \(*_G\) is often omitted entirely so that we write \(ab\) instead of \(a*_G b.\)
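The remark above that \(S_n\) is Abelian only for \(n=1,2\) can be spot-checked for small \(n.\) The sketch below assumes permutations of \(\{0,\ldots,n-1\}\) stored as tuples (indexed from \(0\) for convenience); the function name `is_abelian` is our own and it simply compares \(\sigma\circ\pi\) with \(\pi\circ\sigma\) for all pairs.

```python
from itertools import permutations

def compose(s, p):
    """Return the composition 's after p' of two permutations given as tuples."""
    return tuple(s[p[x]] for x in range(len(s)))

def is_abelian(n):
    """Check by brute force whether composition commutes on all of S_n."""
    group = list(permutations(range(n)))
    return all(compose(s, p) == compose(p, s) for s in group for p in group)

# Only n = 1, 2, 3, 4 are tested here; this is an illustration, not a proof.
results = {n: is_abelian(n) for n in range(1, 5)}
print(results)
```

For \(n=3\) a concrete non-commuting pair is given by the two transpositions stored as `(1, 0, 2)` and `(0, 2, 1)`.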
Similar to fields, the definition of a group implies some basic properties:
Let \((G,*_G)\) be a group. Then
the identity element \(e_G\) is unique;
for all \(a \in G,\) the inverse \(a^{-1}\) is unique.
Proof.
Suppose \(e_G\) and \(\hat{e}_G\) are identity elements for \(G.\) Then \[e_G=e_G*_G\hat{e}_G=\hat{e}_G.\]
Suppose \(a \in G\) and both \(b\) and \(c\) are inverse elements for \(a.\) Then \[b=b*_G e_G=b*_G(a*_G c)=(b*_G a)*_G c=e_G*_G c=c.\]
Similar to vector spaces and fields, groups allow for the notion of a subgroup.
A non-empty subset \(H\) of a group \(G\) is called a subgroup if for all \(a,b \in H,\) we have \(a*_G b \in H\) and for all \(a\in H,\) we have \(a^{-1} \in H.\)
Notice that if \(H\subset G\) is a subgroup, the non-emptiness condition implies that there exists \(a \in H.\) Therefore, \(a^{-1} \in H\) and hence \(a*_G a^{-1}=e_G \in H.\) We can thus equip \(H\) with the structure of a group as well by defining \(e_H=e_G\) and \(a*_H b=a*_G b\) for all \(a,b \in H.\)
The set of integers \(\mathbb{Z}\) is a subgroup of the Abelian group \((\mathbb{Q},+),\) where \(+\) denotes usual addition of rational numbers. Indeed \(0 \in \mathbb{Z}\) and the sum of two integers is again an integer. Recall that for \(m \in \mathbb{Z},\) the notation \(m^{-1}\) refers to the inverse element of \(m\) with respect to the group operation. So here \(m^{-1}\) is the additive inverse of \(m \in \mathbb{Z},\) that is \(-m.\) Since \(-m \in \mathbb{Z}\) for all \(m \in \mathbb{Z},\) we conclude that \(\mathbb{Z}\) is an (Abelian) subgroup of \((\mathbb{Q},+).\)
A subspace \(U\subset V\) of a \(\mathbb{K}\)-vector space \(V\) is a subgroup of the Abelian group \((V,+_V).\)
Let \(\mathrm{SL}(n,\mathbb{K})\) denote the subset of \(\mathrm{GL}(n,\mathbb{K})\) consisting of matrices of determinant \(1.\) The set \(\mathrm{SL}(n,\mathbb{K})\) is non-empty since it contains \(\mathbf{1}_{n}.\) Furthermore, the product rule for the determinant (Proposition 5.21) implies that if \(\mathbf{A}, \mathbf{B}\in \mathrm{SL}(n,\mathbb{K}),\) then so is the matrix product \(\mathbf{A}\mathbf{B}.\) Corollary 5.22 furthermore implies that if \(\mathbf{A}\in \mathrm{SL}(n,\mathbb{K}),\) then so is \(\mathbf{A}^{-1}.\) It follows that \(\mathrm{SL}(n,\mathbb{K})\) – commonly also denoted by \(\mathrm{SL}_n(\mathbb{K})\) – is a subgroup of \(\mathrm{GL}(n,\mathbb{K})\) called the special linear group.
The trigonometric identities for \(\sin\) and \(\cos\) imply that \(\mathbf{R}_{\theta}\mathbf{R}_{\vartheta}=\mathbf{R}_{\theta+\vartheta},\) where \(\theta,\vartheta \in \mathbb{R}.\) In particular, \(\mathbf{R}_{\theta}\mathbf{R}_{-\theta}=\mathbf{R}_0=\mathbf{1}_{2},\) so that \((\mathbf{R}_{\theta})^{-1}=\mathbf{R}_{-\theta}\) is again a rotation. Since \(\mathbf{R}_0=\mathbf{1}_{2} \in \mathrm{SL}(2,\mathbb{R})\) and \(\det \mathbf{R}_\theta=\cos^2\theta+\sin^2\theta=1\) for all \(\theta \in \mathbb{R},\) we conclude that the rotations \(\{\mathbf{R}_{\theta} | \theta \in \mathbb{R}\}\) around the origin \(0_{\mathbb{R}^2}\) form a subgroup of \(\mathrm{SL}(2,\mathbb{R}).\) The group of rotations in \(\mathbb{R}^2\) is denoted by \(\mathrm{SO}(2).\) Later on we will encounter the orthogonal group \(\mathrm{O}(n)\) and the special orthogonal group \(\mathrm{SO}(n),\) the latter of which generalises \(\mathrm{SO}(2)\) to higher dimensions.
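The subgroup properties of the rotations can be illustrated numerically. The following sketch assumes \(2\times 2\) matrices stored as nested tuples and compares floating point results up to a small tolerance; the helper names `rotation`, `matmul`, `det` and `close` as well as the sample angles are our own choices.

```python
import math

def rotation(theta):
    """Return the rotation matrix R_theta as a nested tuple."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a tolerance (floating point is inexact)."""
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))

theta, vartheta = 0.7, 2.1  # arbitrary sample angles

# Closure under products: R_theta R_vartheta = R_(theta + vartheta).
product_ok = close(matmul(rotation(theta), rotation(vartheta)),
                   rotation(theta + vartheta))
# Closure under inverses: R_theta R_(-theta) = identity matrix.
inverse_ok = close(matmul(rotation(theta), rotation(-theta)), rotation(0.0))
# Rotations have determinant 1, so they lie in SL(2, R).
det_ok = abs(det(rotation(theta)) - 1.0) <= 1e-12
print(product_ok, inverse_ok, det_ok)
```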
8.3 Group actions
In order to tie the notion of a group more closely to the notion of a symmetry, we need the concept of a group \(G\) acting on a set \(\mathcal{X}.\) This section – which we include for the interested reader – goes beyond the usual material in a Linear Algebra course and is not examinable.
Let \(G\) be a group and \(\mathcal{X}\) a non-empty set. A (left) group action of \(G\) on \(\mathcal{X}\) is a mapping \(\phi : G \times \mathcal{X} \to \mathcal{X}\) such that for all \(x \in \mathcal{X}\) \[\phi(e_G,x)=x\] and \[\phi(a*_Gb,x)=\phi(a,\phi(b,x))\] for all \(a,b \in G\) and \(x \in \mathcal{X}.\)
The first condition simply requests that the identity element \(e_G\) of \(G\) acts trivially, that is, nothing happens to the elements of \(\mathcal{X}\) when acting with \(e_G.\)
The second condition requests that acting with \(a*_G b\) corresponds to first acting with \(b\) and then acting with \(a.\)
Notice that for each fixed \(a \in G\) we obtain a mapping \(\phi_a : \mathcal{X} \to \mathcal{X}\) defined by \(\phi_a(x)=\phi(a,x).\) The above properties imply that for all \(a\in G\) we have \(\phi_a \circ \phi_{a^{-1}}=\phi_{a^{-1}}\circ \phi_a=\mathrm{Id}_\mathcal{X},\) hence \(\phi_a : \mathcal{X} \to \mathcal{X}\) is bijective and hence a symmetry of \(\mathcal{X}.\)
Every group \(G\) acts on itself. We take \(\mathcal{X}=G\) and define \[\phi : G \times G \to G, \quad (a,b)\mapsto \phi(a,b)=a*_G b.\] Then for all \(a \in G\) we have \(\phi(e_G,a)=e_G*_G a=a.\) Furthermore, for all \(a,b,c \in G\) we have \[\phi(a*_G b,c)=(a*_G b)*_G c=a*_G(b*_G c)=a*_G\phi(b,c)=\phi(a,\phi(b,c))\] so that \(\phi\) does indeed define an action of \(G\) on itself.
Consider \(\mathcal{X}=\mathbb{R}^2\) and \(G=\mathrm{SO}(2).\) We define an action \[\phi : G \times \mathcal{X} \to \mathcal{X}, \quad (\mathbf{R}_{\theta},\vec{x}) \mapsto \phi(\mathbf{R}_{\theta},\vec{x})=\mathbf{R}_{\theta}\vec{x}\] which rotates a vector \(\vec{x} \in \mathbb{R}^2\) counter-clockwise around the origin \(0_{\mathbb{R}^2}\) by the angle \(\theta.\) Here \(*_G\) is just matrix multiplication, so we have for all \(\vec{x}\in \mathbb{R}^2\) and \(\mathbf{R}_{\theta},\mathbf{R}_{\vartheta} \in \mathrm{SO}(2)\) \[\phi(\mathbf{R}_{\theta}\mathbf{R}_{\vartheta},\vec{x})=\mathbf{R}_{\theta}\mathbf{R}_{\vartheta}\vec{x}=\mathbf{R}_{\theta}\phi(\mathbf{R}_{\vartheta},\vec{x})=\phi(\mathbf{R}_{\theta},\phi(\mathbf{R}_{\vartheta},\vec{x})).\] Furthermore, since \(e_{\mathrm{SO}(2)}=\mathbf{R}_0=\mathbf{1}_{2},\) we have for all \(\vec{x} \in \mathbb{R}^2\) \[\phi(e_{\mathrm{SO}(2)},\vec{x})=\mathbf{1}_{2}\vec{x}=\vec{x}.\] It follows that \(\phi\) does indeed define an action of \(\mathrm{SO}(2)\) on \(\mathbb{R}^2.\)
Let \(n \in \mathbb{N}\) and \(\mathcal{X}=M_{n,n}(\mathbb{K}).\) The general linear group \(\mathrm{GL}(n,\mathbb{K})\) acts on \(\mathcal{X}\) by conjugation. We define \[\phi : \mathrm{GL}(n,\mathbb{K}) \times M_{n,n}(\mathbb{K}) \to M_{n,n}(\mathbb{K}), \quad (\mathbf{C},\mathbf{A}) \mapsto \phi(\mathbf{C},\mathbf{A})=\mathbf{C}\mathbf{A}\mathbf{C}^{-1}.\] Then for all \(\mathbf{A}\in M_{n,n}(\mathbb{K})\) we have \[\phi(e_G,\mathbf{A})=\phi(\mathbf{1}_{n},\mathbf{A})=\mathbf{1}_{n}\mathbf{A}(\mathbf{1}_{n})^{-1}=\mathbf{A}\] where we use that \(e_{\mathrm{GL}(n,\mathbb{K})}=\mathbf{1}_{n}.\) Moreover, for \(\mathbf{C},\mathbf{C}^{\prime} \in \mathrm{GL}(n,\mathbb{K})\) and \(\mathbf{A}\in M_{n,n}(\mathbb{K}),\) we have \[\begin{aligned} \phi(\mathbf{C}\mathbf{C}^{\prime} ,\mathbf{A})&=\mathbf{C}\mathbf{C}^{\prime} \mathbf{A}(\mathbf{C}\mathbf{C}^{\prime})^{-1}=\mathbf{C}\mathbf{C}^{\prime}\mathbf{A}(\mathbf{C}^{\prime})^{-1}\mathbf{C}^{-1}\\ &=\mathbf{C}\phi(\mathbf{C}^{\prime},\mathbf{A})\mathbf{C}^{-1}=\phi(\mathbf{C},\phi(\mathbf{C}^{\prime} ,\mathbf{A})), \end{aligned}\] where we use that for all \(\mathbf{C},\mathbf{C}^{\prime} \in \mathrm{GL}(n,\mathbb{K}),\) we have \((\mathbf{C}\mathbf{C}^{\prime} )^{-1}=(\mathbf{C}^{\prime})^{-1}\mathbf{C}^{-1}.\) It follows that \(\phi\) does indeed define an action of \(\mathrm{GL}(n,\mathbb{K})\) on \(M_{n,n}(\mathbb{K}).\)
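The two conditions for the conjugation action can be verified on concrete matrices. The sketch below is restricted to \(n=2\) and \(\mathbb{K}=\mathbb{Q},\) using exact rational arithmetic; the sample matrices and the helper names `matmul`, `inverse` and `act` are our own.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(C):
    """Invert a 2x2 matrix via the adjugate formula, using exact fractions."""
    d = Fraction(C[0][0] * C[1][1] - C[0][1] * C[1][0])
    if d == 0:
        raise ValueError("matrix is not invertible")
    return ((C[1][1] / d, -C[0][1] / d), (-C[1][0] / d, C[0][0] / d))

def act(C, A):
    """The conjugation action phi(C, A) = C A C^(-1)."""
    return matmul(matmul(C, A), inverse(C))

I = ((1, 0), (0, 1))
C = ((1, 2), (3, 5))        # determinant -1, hence invertible
C_prime = ((2, 1), (1, 1))  # determinant 1, hence invertible
A = ((1, 2), (2, 4))        # A itself need not be invertible

# phi(e_G, A) = A
identity_ok = act(I, A) == A
# phi(C C', A) = phi(C, phi(C', A))
compatible_ok = act(matmul(C, C_prime), A) == act(C, act(C_prime, A))
print(identity_ok, compatible_ok)
```

Exact fractions are used so that the equalities can be tested with `==` without rounding issues.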
Let \(V\) be a \(\mathbb{K}\)-vector space and \(U\subset V\) a subspace. Taking \(G=U\) with \(*_G=+_U\) and \(\mathcal{X}=V,\) the group \(G\) acts by translation. We define \[\phi : U \times V \to V, \quad (u,v)\mapsto \phi(u,v)=u+_V v.\] Since \(e_G=0_U=0_V,\) we have for all \(v \in V\) \[\phi(e_G,v)=0_V+_V v=v.\] Moreover, for all \(u_1,u_2 \in U\) and \(v \in V\) we have \[\phi(u_1+_U u_2,v)=(u_1+_U u_2)+_V v=u_1+_V\phi(u_2,v)=\phi(u_1,\phi(u_2,v)),\] where we use that \(+_U : U \times U \to U\) is the restriction of \(+_V : V \times V \to V\) to \(U\times U\subset V\times V.\) We conclude that \(\phi\) defines an action of the subspace \(U\) on \(V.\)
Let \(n \in \mathbb{N}.\) A permutation \(\sigma \in S_n\) acts on \(\mathcal{X}_n=\{1,2,\ldots,n\}\) by \[\phi : S_n \times \mathcal{X}_n \to \mathcal{X}_n, \quad (\sigma,m) \mapsto \phi(\sigma,m)=\sigma(m).\] We leave it to the reader to check that this is indeed an action. In addition, a permutation \(\sigma \in S_n\) also acts on \(\mathbb{K}^n\) by the rule \[\phi(\sigma,\vec{x})=\mathbf{P}_{\sigma}\vec{x},\] where \(\vec{x} \in \mathbb{K}^n\) and \(\mathbf{P}_{\sigma}\) is the permutation matrix associated to \(\sigma \in S_n,\) cf. Definition 5.28.
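Both actions of \(S_n\) can be tested by brute force for \(n=3.\) The sketch below assumes indices run from \(0\) to \(n-1\) and that \(\mathbf{P}_{\sigma}\) sends \(\vec{e}_i\) to \(\vec{e}_{\sigma(i)},\) so that the entry \(x_i\) of \(\vec{x}\) is moved to position \(\sigma(i);\) the function names and the sample vector are our own.

```python
from itertools import permutations

n = 3
S_n = list(permutations(range(n)))
identity = tuple(range(n))

def compose(s, p):
    """Return the composition 's after p'."""
    return tuple(s[p[i]] for i in range(n))

def act_on_point(s, m):
    """Action on {0, ..., n-1}: phi(s, m) = s(m)."""
    return s[m]

def act_on_vector(s, x):
    """Action on K^n: apply P_s, which moves the entry x_i to position s(i)."""
    y = [None] * n
    for i in range(n):
        y[s[i]] = x[i]
    return tuple(y)

x = (10, 20, 30)  # arbitrary sample vector

point_action_ok = all(
    act_on_point(identity, m) == m
    and act_on_point(compose(s, p), m) == act_on_point(s, act_on_point(p, m))
    for s in S_n for p in S_n for m in range(n)
)
vector_action_ok = act_on_vector(identity, x) == x and all(
    act_on_vector(compose(s, p), x) == act_on_vector(s, act_on_vector(p, x))
    for s in S_n for p in S_n
)
print(point_action_ok, vector_action_ok)
```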
A particularly important class of group actions arises when \((G,*_G)\) is the Abelian group \((\mathbb{R},+)\) or its subgroup \((\mathbb{Z},+).\) This case arises for instance when the set \(\mathcal{X}\) is a phase space (roughly speaking, the set of different physical states) of a physical system and the action describes the evolution of the system under the progression of time.
Let \(\mathcal{X}\) be a non-empty set. A time-discrete dynamical system is an action of \((\mathbb{Z},+)\) on \(\mathcal{X}.\) A time-continuous dynamical system is an action of \((\mathbb{R},+)\) on \(\mathcal{X}.\)
Often the term dynamical system is also used when the action is only defined for all non-negative times \(\mathbb{R}^+_0=\{t \in \mathbb{R}| t\geqslant 0\}\) or \(\mathbb{N}_{0}=\{t \in \mathbb{Z}| t\geqslant 0\}.\)
Let \(\mathcal{X}\subset \mathbb{R}^3\) denote the set of all points in our solar system. An asteroid initially at rest at the position \(x_0 \in \mathcal{X}\) will move under the influence of gravity. Let \(x_t\) denote the position of the asteroid after time \(t \in \mathbb{R}^+_{0}\) has passed. The mapping \[\phi : \mathbb{R}^+_{0} \times \mathcal{X} \to \mathcal{X}, \quad (t,x_0) \mapsto \phi(t,x_0)=x_t\] describing the movement of the asteroid is then a time-continuous dynamical system.
Let \(\mathcal{X}=\{0,1\}^N\) denote the carrier status of a contagious disease of each individual of a population of size \(N \in \mathbb{N}.\) So \(x\in \mathcal{X}\) is a list of length \(N\) containing \(0\)s and \(1\)s, where the \(k\)-th entry reflects the carrier status of the \(k\)-th member of the population, \(0\) for non-carriers and \(1\) for carriers. Let \(x_0 \in \mathcal{X}\) denote the carrier status at some initial time \(t=0\) and for \(m \in \mathbb{N}_{0}\) let \(x_m\) denote the carrier status after \(m\) days have passed. The mapping \[\phi : \mathbb{N}_{0} \times \mathcal{X} \to \mathcal{X}, \quad (m,x_0) \mapsto \phi(m,x_0)=x_m\] describing the progression of the disease in the population is then a time-discrete dynamical system.
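A time-discrete dynamical system on \(\{0,1\}^N\) can be sketched by iterating a fixed one-day update rule. The rule `step` below is purely hypothetical (a carrier stays a carrier and infects the next individual in the list) and is not a model taken from the text; the sketch assumes the evolution is deterministic and that the same rule applies every day, which is what makes \(\phi(m+k,x)=\phi(m,\phi(k,x))\) hold.

```python
def step(x):
    """Hypothetical one-day update: carriers stay carriers and infect
    the next individual in the list."""
    return tuple(
        1 if x[k] == 1 or (k > 0 and x[k - 1] == 1) else 0
        for k in range(len(x))
    )

def phi(m, x):
    """State after m days, obtained by applying the update rule m times."""
    for _ in range(m):
        x = step(x)
    return x

x0 = (1, 0, 0, 0, 0)  # only the first individual is a carrier at time 0

# Acting with 0 does nothing.
identity_ok = phi(0, x0) == x0
# Acting with m + k equals acting with k first and then with m.
compatible_ok = all(
    phi(m + k, x0) == phi(m, phi(k, x0))
    for m in range(6) for k in range(6)
)
print(phi(2, x0), identity_ok, compatible_ok)
```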
Given a group action on some set \(\mathcal{X}\) and some point \(x \in \mathcal{X},\) we consider the subset of elements of \(\mathcal{X}\) that can be reached by acting with all the group elements of \(G.\) This subset is known as the orbit of \(x.\) More precisely:
Let \(\mathcal{X}\) be a non-empty set, \(\phi : G \times \mathcal{X} \to \mathcal{X}\) an action of the group \((G,*_G)\) on \(\mathcal{X}\) and \(x \in \mathcal{X}.\) The orbit of \(x \in \mathcal{X}\) under \(G\) (or sometimes \(G\)-orbit of \(x\)) is the subset \[G*_Gx=\left\{\phi(a,x) \in \mathcal{X} | a \in G\right\}.\] The set of all \(G\)-orbits in \(\mathcal{X}\) is denoted by \(\mathcal{X}/G.\)
In the time-continuous dynamical system above, the orbit of \(x_0 \in \mathcal{X}\) consists of the points \(x_t\) where \(t\in \mathbb{R}^+_{0}\) and \(x_t\) is the time \(t\) position of the asteroid with initial position \(x_0.\) The orbit is thus the trajectory of the asteroid as time progresses. Therefore, the mathematical concept of orbit is a generalisation of the standard use of the term orbit.
Consider the action of \(\mathrm{SO}(2)\) on \(\mathbb{R}^2\) from above. The orbit of \(\vec{x}\neq 0_{\mathbb{R}^2}\) consists of all points in \(\mathbb{R}^2\) obtained by rotating \(\vec{x}\) counter-clockwise around the origin. Since the rotation angle can be chosen arbitrarily, the orbit of \(\vec{x}\) is the circle of all points of \(\mathbb{R}^2\) that have the same length as \(\vec{x}.\) On the other hand, the orbit of \(0_{\mathbb{R}^2}\) only consists of \(0_{\mathbb{R}^2},\) that is, we have \[\mathrm{SO}(2)*_{\mathrm{SO}(2)}0_{\mathbb{R}^2}=\{0_{\mathbb{R}^2}\}.\] In this particular case we have a complete picture of all possible orbits: an orbit is either \(\{0_{\mathbb{R}^2}\}\) or else a circle centred at the origin, hence \[\mathcal{X}/G=\mathbb{R}^2/\mathrm{SO}(2)=\{0_{\mathbb{R}^2}\}\cup\{\text{circle of radius }r \text{ centred at }0_{\mathbb{R}^2} | r>0\}.\]
Consider the action of \(\mathrm{GL}(n,\mathbb{K})\) on \(M_{n,n}(\mathbb{K})\) from above. Let \(\mathbf{D}=\mathrm{diag}(\lambda_1,\ldots,\lambda_n)\) be a diagonal matrix with entries \(\lambda_1,\ldots,\lambda_n \in \mathbb{K}.\) The \(\mathrm{GL}(n,\mathbb{K})\)-orbit of \(\mathbf{D}\) then consists of all \(n\times n\)-matrices with entries in \(\mathbb{K}\) that are diagonalisable with eigenvalues \(\lambda_1,\ldots,\lambda_n.\) A complete description of the set of orbits \(M_{n,n}(\mathbb{K})/\mathrm{GL}(n,\mathbb{K})\) is out of reach for us at this point; we will, however, have more to say about this in the Linear Algebra II module.
We consider the action of \(S_2\) on \(\mathbb{R}^2\) as defined above. The orbit of a vector \(\vec{v}=\begin{pmatrix} x \\ y \end{pmatrix}\in\mathbb{R}^2\) with \(x\neq y\) is the subset \[S_2*_{S_2}\vec{v}=\left\{\begin{pmatrix} x \\ y \end{pmatrix},\begin{pmatrix} y \\ x \end{pmatrix}\right\}.\] On the other hand, the orbit of a vector \(\vec{v}=\begin{pmatrix} x \\ x \end{pmatrix}\in \mathbb{R}^2\) is just \(\{\vec{v}\}.\) For a vector of the first type, either \(x>y\) or \(x<y.\) The orbit of each such vector can thus be represented uniquely by a vector \(\vec{v}=\begin{pmatrix} x \\ y \end{pmatrix}\) with \(y>x.\) The vectors of the second type lie on the axis defined by the equation \(y=x.\) We thus have a bijective mapping from \(\mathbb{R}^2/S_2\) to the half plane \(\mathbb{H}=\{(x,y) \in \mathbb{R}^2 | y\geqslant x\}\).
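The orbits of \(S_2\) on \(\mathbb{R}^2\) and their representatives in the half plane \(\mathbb{H}\) can be computed directly. In the sketch below a vector is stored as a pair `(x, y)`; the function names `orbit` and `representative` are our own.

```python
def orbit(v):
    """Orbit of v = (x, y) under S_2: the identity fixes v, the swap gives (y, x)."""
    x, y = v
    return {(x, y), (y, x)}

def representative(v):
    """The unique element (x, y) of the orbit of v with y >= x."""
    x, y = v
    return (min(x, y), max(x, y))

print(orbit((3.0, 1.0)))           # two elements, since 3.0 != 1.0
print(orbit((2.0, 2.0)))           # a single element on the line y = x
print(representative((3.0, 1.0)))  # lies in the half plane y >= x
```

Two vectors lie in the same orbit exactly when `representative` returns the same pair for both, which is the bijection between \(\mathbb{R}^2/S_2\) and \(\mathbb{H}\) described above.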
Given a subspace \(U\subset V,\) we have seen that the Abelian group \((G,*_G)=(U,+_U)\) acts on \(\mathcal{X}=V\) by translation. In this sense \(U+v\) is simply the orbit of \(v\) under this action and \(V/U\) is the set of orbits \(\mathcal{X}/G.\) Furthermore, Lemma 7.8 is a special case of a more general statement about orbits: If a group \((G,*_G)\) acts on a non-empty set \(\mathcal{X},\) then every element \(x \in \mathcal{X}\) belongs to a unique \(G\)-orbit, namely \(G*_G x.\) In particular two orbits \(G*_G x_1\) and \(G*_G x_2\) are either equal or have empty intersection.
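The statement that the orbits are pairwise equal or disjoint, and hence partition \(\mathcal{X},\) can be illustrated on a finite example. The sketch below assumes a small group of our own choosing, namely the four permutations of \(\{0,1,2,3,4\}\) generated by swapping \(0,1\) and swapping \(2,3,\) stored as tuples `g` with `g[x]` the image of `x`.

```python
# The group G: identity, swap of 0 and 1, swap of 2 and 3, and both swaps.
# These four permutations are closed under composition and inverses.
G = [
    (0, 1, 2, 3, 4),
    (1, 0, 2, 3, 4),
    (0, 1, 3, 2, 4),
    (1, 0, 3, 2, 4),
]
X = range(5)

def orbit(group, x):
    """The orbit of x: all points g(x) reached by acting with elements of the group."""
    return frozenset(g[x] for g in group)

# The set of all orbits X/G; equal orbits are stored only once.
orbits = {orbit(G, x) for x in X}

# Every point lies in some orbit, and distinct orbits do not overlap,
# so the orbit sizes add up to the size of X.
covers_X = set().union(*orbits) == set(X)
disjoint = sum(len(o) for o in orbits) == len(X)
print(sorted(sorted(o) for o in orbits), covers_X, disjoint)
```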
Exercises
Show that the mapping \(S_n\times \mathbb{K}^n\to \mathbb{K}^n\) given in Example 8.15 does indeed define an action.
Solution
According to Definition 8.13, we need to show that for all \(\vec x\in\mathbb{K}^n,\) \(\phi(1,\vec x) = \vec x\) and \(\phi(\sigma\circ \pi,\vec x) = \phi(\sigma,\phi(\pi,\vec x)),\) where \(\phi(\sigma,\vec x) = \mathbf{P}_\sigma\vec x.\) By Proposition 5.31, we have \(\mathbf{P}_1=\mathbf{1}_{n}\) and therefore \(\mathbf{1}_{n}\vec x = \vec x\) for all \(\vec x,\) which shows the first part. For the second one, we compute \[\phi(\sigma\circ\pi,\vec x) = \mathbf{P}_{\sigma\circ \pi}\vec x = \mathbf{P}_\sigma(\mathbf{P}_\pi\vec x) = \phi(\sigma,\phi(\pi,\vec x)),\] where the second equality again follows from Proposition 5.31.
Prove the statement about orbits from Remark 8.20.
Solution
Let \((G,*_G)\) be a group acting on a non-empty set \(\mathcal X.\) Since \(x=\phi(e_G,x)\) for every \(x\in \mathcal X,\) we conclude that every \(x\in\mathcal X\) belongs to a \(G\)-orbit. For the second statement, let \(x\in G*_G x_1\cap G*_Gx_2,\) then there exist elements \(g_1,g_2\in G\) such that \(x=\phi(g_1,x_1)=\phi(g_2,x_2).\) Since \[\begin{aligned} x_1 & = \phi(e_G,x_1) = \phi(g_1^{-1}*_Gg_1,x_1) = \phi(g_1^{-1},\phi(g_1,x_1))\\ & = \phi(g_1^{-1},x) = \phi(g_1^{-1},\phi(g_2,x_2)) = \phi(g_1^{-1}*_Gg_2,x_2),\end{aligned}\] we conclude that \[\begin{aligned}G*_Gx_1 & = \{\phi(g,x_1)|g\in G\}=\{\phi(g*_G(g_1^{-1}*_Gg_2),x_2)|g\in G\}\\ &\subset\{\phi(g',x_2)|g'\in G\}=G*_Gx_2.\end{aligned}\] By switching the roles of \(x_1\) and \(x_2,\) we conclude that \(G*_Gx_2 \subset G*_Gx_1\) and hence \(G*_Gx_1 = G*_Gx_2.\)