3 Independence

This chapter is devoted to independence, which is a fundamentally probabilistic notion. Although the basic idea behind independence is very simple, a precise statement general enough for later applications requires some care.

3.1 Independent events

Independence is classically stated on the level of events. In the real world, two events are typically independent if they are causally unrelated. For instance, if I flip a coin twice, whether I get heads on the first flip and whether I get heads on the second flip are independent events. The mathematical definition, of course, goes beyond any causal or mechanical interpretation in the real world.

Definition 3.1

Two events \(A,B \in \mathcal A\) are independent if \[\mathbb{P}(A \cap B) = \mathbb{P}(A) \cdot \mathbb{P}(B)\,.\]

Informally, \(A\) and \(B\) being independent means that knowing that \(B\) happened gives no information about the probability of \(A\) happening. More formally, if \(\mathbb{P}(B) > 0,\) the conditional probability of \(A\) given \(B,\) defined as \(\mathbb{P}(A | B) :=\frac{\mathbb{P}(A \cap B)}{\mathbb{P}(B)},\) is equal to \(\mathbb{P}(A).\)

Example 3.2

  1. (Continuation of Example 2.8.) When throwing a die twice, obtaining a six on the first throw and obtaining a six on the second throw are independent events. More precisely, setting \[A :=\{6\} \times \{1, \dots, 6\}\,, \qquad B :=\{1, \dots, 6\} \times \{6\}\,,\] we find \(\mathbb{P}(A \cap B) = \frac{1}{36} = \mathbb{P}(A) \cdot \mathbb{P}(B).\)
  2. When throwing a single die, the events \[A = \{1,2\} \,, \qquad B = \{1,3,5\}\] are independent: \(\mathbb{P}(A \cap B) = \frac{1}{6} = \mathbb{P}(A) \cdot \mathbb{P}(B).\)
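Both computations, as well as the conditional-probability characterization of independence, can be checked by brute-force enumeration. The following sketch assumes the uniform measure on the finite sample spaces and uses exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

def prob(event, omega):
    """Probability of `event` under the uniform measure on `omega`."""
    return Fraction(len(event & omega), len(omega))

# Part 1: two throws of a die.
omega2 = set(product(range(1, 7), repeat=2))
A = {(6, j) for j in range(1, 7)}   # six on the first throw
B = {(i, 6) for i in range(1, 7)}   # six on the second throw
assert prob(A & B, omega2) == prob(A, omega2) * prob(B, omega2) == Fraction(1, 36)

# Conditional probability: P(A | B) = P(A).
assert prob(A & B, omega2) / prob(B, omega2) == prob(A, omega2)

# Part 2: a single throw.
omega1 = set(range(1, 7))
A1, B1 = {1, 2}, {1, 3, 5}
assert prob(A1 & B1, omega1) == prob(A1, omega1) * prob(B1, omega1) == Fraction(1, 6)
```

Using `Fraction` rather than floating point keeps all equalities exact, so the independence identities can be asserted literally.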

The notion of independence extends from two events to any, possibly infinite, collection of events.

Definition 3.3

A collection of events \(\{A_i\}_{i \in I}\) is independent if for any finite \(J \subset I\) we have \[\mathbb{P}\biggl(\bigcap_{i \in J} A_i\biggr) = \prod_{i \in J} \mathbb{P}(A_i)\,.\]

Remark 3.4

Even if \(I\) is finite, for the collection \(\{A_i\}_{i \in I}\) to be independent, it is not sufficient that \(\mathbb{P}\bigl(\bigcap_{i \in I} A_i\bigr) = \prod_{i \in I} \mathbb{P}(A_i).\)

Moreover, for the collection \(\{A_i\}_{i \in I}\) to be independent, it is not sufficient that all pairs \(A_i\) and \(A_j\) be independent (pairwise independence).

To see this, consider flipping an unbiased coin twice, so that \(\Omega = \{0,1\}^2\) with the uniform probability measure. Define the events \[A = \{1\} \times \{0,1\} \,, \qquad B = \{0,1\} \times \{1\}\,, \qquad C = (\{0\} \times \{0\}) \cup (\{1\} \times \{1\})\,.\] (What is their interpretation?) Then we have \[\begin{aligned} \mathbb{P}(A) = \mathbb{P}(B) = \mathbb{P}(C) &= \frac{1}{2}\,, \\ \mathbb{P}(A \cap B) = \mathbb{P}(A \cap C) = \mathbb{P}(B \cap C) &= \frac{1}{4}\,, \\ \mathbb{P}(A \cap B \cap C) &= \frac{1}{4}\,. \end{aligned}\] Since \(\mathbb{P}(A \cap B \cap C) = \frac{1}{4} \neq \frac{1}{8} = \mathbb{P}(A) \, \mathbb{P}(B) \, \mathbb{P}(C),\) we conclude that the three events are not independent, although they are pairwise independent.
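The probabilities above can be verified by enumerating the four outcomes; a minimal sketch, encoding outcomes as pairs in \(\{0,1\}^2\):

```python
from fractions import Fraction
from itertools import product

omega = set(product({0, 1}, repeat=2))       # two flips of a fair coin
P = lambda E: Fraction(len(E), len(omega))   # uniform measure

A = {(1, 0), (1, 1)}   # first flip is 1
B = {(0, 1), (1, 1)}   # second flip is 1
C = {(0, 0), (1, 1)}   # the two flips agree

# Pairwise independent:
assert all(P(X & Y) == P(X) * P(Y) for X, Y in [(A, B), (A, C), (B, C)])
# ... but not independent as a triple: 1/4 versus 1/8.
assert P(A & B & C) == Fraction(1, 4) != P(A) * P(B) * P(C)
```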

3.2 Intermezzo: monotone class lemma*

In order to extend the notion of independence to random variables, a notion that plays a fundamental role in probability, we shall need a powerful tool from measure theory: the monotone class lemma. It is usually not covered in a measure theory course, so in this section we make a measure-theoretic excursion. The section is marked with an asterisk, meaning that it does not belong to the core material of the course; in particular, if you wish you can skip the proofs, which will not be asked about in the exam. All that you need to know from this section is Definition 3.7 and Corollary 3.9.

Let \(E\) be a set.

Definition 3.5

A collection \(\mathcal M \subset \mathcal P(E)\) is a monotone class if

  1. \(E \in \mathcal M\);
  2. If \(A,B \in \mathcal M\) satisfy \(A \subset B\) then \(B \setminus A \in \mathcal M\);
  3. If \(A_n \in \mathcal M\) and \(A_n \subset A_{n+1}\) for all \(n \in \mathbb{N}\) then \(\bigcup_{n \in \mathbb{N}} A_n \in \mathcal M.\)

The term monotone class comes from the last property, which distinguishes it from a \(\sigma\)-algebra: it states that the union of an increasing family of elements of \(\mathcal M\) is again in \(\mathcal M.\) A priori, this definition has no clear intuitive interpretation. Its usefulness will become apparent a posteriori, through its applications; see for instance Corollary 3.9 and Example 3.10 below.

Similarly to Definition 1.2, any collection of subsets of \(E\) generates a monotone class.

Definition 3.6

The monotone class generated by a collection \(\mathcal C \subset \mathcal P(E)\) is \[\mathcal M(\mathcal C) :=\bigcap_{\substack{\mathcal M \text{ is a monotone class}\\ \mathcal C \subset \mathcal M}} \mathcal M\,.\]

It is left as an exercise to check that the intersection of monotone classes is a monotone class, so that in particular \(\mathcal M(\mathcal C)\) is always a monotone class.

The following result is the main tool proved in this section. To state it, we need the following definition.

Definition 3.7

A collection \(\mathcal C \subset \mathcal P(E)\) is stable under finite intersections if for any \(A, B \in \mathcal C\) we have \(A \cap B \in \mathcal C.\)

Proposition 3.8 • Monotone class lemma

If \(\mathcal C \subset \mathcal P(E)\) is stable under finite intersections, then \(\mathcal M(\mathcal C) = \sigma(\mathcal C).\)

Proof. Note first that a \(\sigma\)-algebra is a monotone class (this is left as an easy exercise). Hence, we trivially have the inclusion \(\mathcal M(\mathcal C) \subset \sigma(\mathcal C).\)

To prove the reverse inclusion, \(\sigma(\mathcal C) \subset \mathcal M(\mathcal C),\) it suffices to show that \(\mathcal M(\mathcal C)\) is a \(\sigma\)-algebra: since \(\mathcal M(\mathcal C)\) contains \(\mathcal C\) and \(\sigma(\mathcal C)\) is the smallest \(\sigma\)-algebra containing \(\mathcal C,\) the inclusion then follows.

We shall proceed in several steps.

Claim. A monotone class \(\mathcal M\) is a \(\sigma\)-algebra if and only if it is stable under finite intersections.

Clearly, a \(\sigma\)-algebra is a monotone class that is stable under finite intersections. For the reverse implication, suppose that \(\mathcal M\) is a monotone class that is stable under finite intersections. First, \(\mathcal M\) is stable under complements, since \(A^c = E \setminus A \in \mathcal M\) by properties (i) and (ii) of Definition 3.5. Hence \(\mathcal M\) is also stable under finite unions, since \[A, B \in \mathcal M \; \Rightarrow\; A^c, B^c \in \mathcal M \; \Rightarrow\; A^c \cap B^c \in \mathcal M \; \Rightarrow\; A \cup B \in \mathcal M\,.\] Let now \(A_1, A_2, \dots \in \mathcal M\) and set \(B_n :=A_1 \cup \cdots \cup A_n.\) Then, by the property we just proved, \(B_n \in \mathcal M.\) Moreover, since \(B_n \subset B_{n+1},\) by Definition 3.5 we conclude that \(\bigcup_{n}A_n = \bigcup_n B_n \in \mathcal M.\) We have therefore verified Definition 1.1, and hence proved the Claim.

By the Claim, it suffices to show that \(\mathcal M(\mathcal C)\) is stable under finite intersections, i.e. \[\tag{3.1} A,B \in \mathcal M(\mathcal C) \quad \Longrightarrow \quad A \cap B \in \mathcal M(\mathcal C)\,.\] To that end, we first fix \(A \in \mathcal C,\) and define \[\mathcal M_1 :=\{B \in \mathcal M(\mathcal C) \,\colon A \cap B \in \mathcal M(\mathcal C)\}\,.\] Since by assumption \(\mathcal C\) is stable under finite intersections, we have \[\tag{3.2} \mathcal C \subset \mathcal M_1\,.\] Moreover, we claim that \[\tag{3.3} \mathcal M_1 \text{ is a monotone class}.\] To verify (3.3), let us verify the three properties (i)–(iii) of Definition 3.5. Property (i) is trivial. To verify (ii), we take \(B, B' \in \mathcal M_1\) satisfying \(B \subset B',\) and note that \[A \cap (B' \setminus B) = (A \cap B') \setminus (A \cap B) \in \mathcal M(\mathcal C)\,,\] where the last step follows from the facts that \(\mathcal M(\mathcal C)\) is a monotone class and that \(A \cap B'\) and \(A \cap B\) are in \(\mathcal M(\mathcal C)\) by definition of \(\mathcal M_1.\) This shows that \(B' \setminus B \in \mathcal M_1,\) and hence yields (ii). Finally, to prove (iii), let us take \(B_n \in \mathcal M_1\) such that \(B_n \subset B_{n+1}.\) Then \[A \cap \biggl(\bigcup_n B_n\biggr) = \bigcup_n (A \cap B_n) \in \mathcal M(\mathcal C)\,,\] since \(A \cap B_n \in \mathcal M(\mathcal C)\) by definition of \(\mathcal M_1,\) and \(\mathcal M(\mathcal C)\) is a monotone class. We conclude that \(\bigcup_n B_n \in \mathcal M_1.\) This concludes the proof of property (iii), and hence of (3.3).

Next, from (3.2) and (3.3), we deduce that \(\mathcal M(\mathcal C) \subset \mathcal M_1.\) This means that \[\tag{3.4} \forall A \in \mathcal C, \forall B \in \mathcal M(\mathcal C), A \cap B \in \mathcal M(\mathcal C)\,.\]

Next, we repeat exactly the same argument with fixed \(B \in \mathcal M(\mathcal C)\) and \[\mathcal M_2 :=\{A \in \mathcal M(\mathcal C) \,\colon A \cap B \in \mathcal M(\mathcal C)\}\,.\] From (3.4) we know that \(\mathcal C \subset \mathcal M_2.\)

We may repeat the proof of (3.3) almost to the letter to show that \(\mathcal M_2\) is a monotone class. Since \(\mathcal C \subset \mathcal M_2,\) we conclude that \(\mathcal M(\mathcal C) \subset \mathcal M_2.\) This immediately implies (3.1), and hence concludes the proof.
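On a finite set the lemma can even be checked mechanically: property (iii) of Definition 3.5 is automatic (any increasing chain of subsets stabilizes), so a monotone class is simply a family containing \(E\) and stable under proper differences. The following sketch computes the generated monotone class and the generated \(\sigma\)-algebra by iterated closure and checks that they coincide for an intersection-stable generator; the choices of \(E\) and \(\mathcal C\) are illustrative:

```python
from itertools import product

E = frozenset(range(4))

def close(seed, rules):
    """Smallest family containing `seed` and E, stable under each binary rule."""
    fam = set(seed) | {E}
    changed = True
    while changed:
        changed = False
        for A, B in list(product(fam, repeat=2)):
            for rule in rules:
                new = rule(A, B)
                if new is not None and new not in fam:
                    fam.add(new)
                    changed = True
    return fam

# Monotone class: contains E, stable under differences B \ A for A subset of B.
diff = lambda A, B: frozenset(B - A) if A <= B else None
# Sigma-algebra: stable under complement and (finite) union.
comp = lambda A, B: frozenset(E - A)
union = lambda A, B: frozenset(A | B)

C = {frozenset({0}), frozenset({0, 1})}   # stable under intersections
M = close(C, [diff])                      # generated monotone class
S = close(C, [comp, union])               # generated sigma-algebra
assert M == S                             # the monotone class lemma in action
```

Dropping the intersection-stability of \(\mathcal C\) can break the equality, which is exactly why the hypothesis appears in Proposition 3.8.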

The monotone class lemma may seem rather abstract, but it is very useful in probability. It allows one to verify equality of two probability measures \(\mu\) and \(\nu\) on a much smaller collection \(\mathcal C\) of events than the full \(\sigma\)-algebra. Typically, verifying the equality \(\mu(A) = \nu(A)\) for all \(A \in \mathcal A\) directly is practically impossible. However, it is often very easy to construct a simple collection of events \(\mathcal C\) (for instance intervals, rectangles, or cylinder sets) on which the equality is trivial. The monotone class lemma then allows one to deduce equality on all sets \(A \in \mathcal A.\) That is its great power. The following result is a typical application of this idea.

Corollary 3.9

Let \(\mu\) and \(\nu\) be two probability measures on \((\Omega, \mathcal A).\) If there exists a collection \(\mathcal C \subset \mathcal A\) that is stable under finite intersections such that \(\sigma(\mathcal C) = \mathcal A\) and \(\mu(A) = \nu(A)\) for all \(A \in \mathcal C,\) then \(\mu = \nu.\)

Proof. Let \(\mathcal G :=\{A \in \mathcal A \,\colon\mu(A) = \nu(A)\}.\) Then \(\mathcal C \subset \mathcal G\) and it is easy to check that \(\mathcal G\) is a monotone class. Moreover, by Proposition 3.8, \[\mathcal M(\mathcal C) = \sigma(\mathcal C) = \mathcal A\,,\] and the claim follows since \(\mathcal M(\mathcal C) \subset \mathcal G.\)
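The mechanism behind Corollary 3.9 is visible already on a finite example: once the values of a probability measure are fixed on an intersection-stable generator, they are forced on the whole \(\sigma\)-algebra. A sketch, where the weights `p` on \(\Omega = \{0,1,2,3\}\) are a hypothetical example and \(\mathcal C = \{\{0\}, \{0,1\}, \{0,2\}\}\) is intersection-stable and generates the power set:

```python
from fractions import Fraction

# Hypothetical weights of a probability measure mu on Omega = {0, 1, 2, 3}.
p = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
mu = lambda A: sum(p[x] for x in A)

# Values of mu on the generator C (stable under intersections).
v = {frozenset({0}): mu({0}),
     frozenset({0, 1}): mu({0, 1}),
     frozenset({0, 2}): mu({0, 2})}

# Any probability measure nu agreeing with mu on C is forced to equal mu:
q = {}
q[0] = v[frozenset({0})]
q[1] = v[frozenset({0, 1})] - q[0]   # nu({1}) = nu({0,1}) - nu({0})
q[2] = v[frozenset({0, 2})] - q[0]   # nu({2}) = nu({0,2}) - nu({0})
q[3] = 1 - q[0] - q[1] - q[2]        # total mass is 1
assert q == p
```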

We shall use Corollary 3.9 throughout this class. Proposition 3.12 below is a typical application. Here is an immediate application that shows its power in proving a famous and nontrivial result.

Example 3.10 • Uniqueness of Lebesgue measure

There exists at most one probability measure \(\lambda\) on \(([0,1], \mathcal B([0,1]))\) such that \(\lambda((a,b]) = b-a\) for all \(0 < a < b \leqslant 1.\) For the proof, simply invoke Corollary 3.9 with \(\mathcal C = \{(a,b] \,\colon 0 < a < b \leqslant 1\} \cup \{\varnothing\},\) the set of half-open intervals together with the empty set (the intersection of two half-open intervals is again a half-open interval or empty, so \(\mathcal C\) is stable under finite intersections).
