Sample space, the axioms of probability, set theory tools, and combinatorics
Probability theory begins by making precise what we mean by "an experiment" and "how likely an outcome is." Everything follows from three simple axioms and a rigorous definition of a sample space. Counting techniques then let us compute probabilities when outcomes are equally likely.
An experiment is any process with an uncertain outcome. The sample space \(\Omega\) is the set of all possible outcomes. A good sample space must be collectively exhaustive (it includes every possible outcome) and its elements mutually exclusive (no two distinct outcomes can occur together).
An event \(A\) is any subset of \(\Omega\). Probability is assigned to events rather than to individual outcomes; in a discrete space an individual outcome \(\omega\) is handled via the singleton event \(\{\omega\}\).
Probability is a function \(\mathbb{P}\) on events satisfying three axioms: nonnegativity (\(\mathbb{P}(A) \geq 0\)), normalisation (\(\mathbb{P}(\Omega) = 1\)), and countable additivity (\(\mathbb{P}\left(\bigcup_n A_n\right) = \sum_n \mathbb{P}(A_n)\) for disjoint events \(A_1, A_2, \ldots\)). Everything else in probability theory follows from these three axioms. Key consequences include \(\mathbb{P}(\varnothing) = 0\), \(\mathbb{P}(A^c) = 1 - \mathbb{P}(A)\), monotonicity (\(A \subseteq B\) implies \(\mathbb{P}(A) \leq \mathbb{P}(B)\)), and the union bound \(\mathbb{P}(A \cup B) \leq \mathbb{P}(A) + \mathbb{P}(B)\).
Bonferroni's Inequality: \(\mathbb{P}(A_1 \cap \cdots \cap A_n) \geq \mathbb{P}(A_1) + \cdots + \mathbb{P}(A_n) - (n-1)\).
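The inequality follows in one line from De Morgan's law and the union bound \(\mathbb{P}\left(\bigcup_i A_i\right) \leq \sum_i \mathbb{P}(A_i)\):

```latex
\mathbb{P}\left(\bigcap_{i=1}^{n} A_i\right)
  = 1 - \mathbb{P}\left(\bigcup_{i=1}^{n} A_i^c\right)
  \geq 1 - \sum_{i=1}^{n} \mathbb{P}(A_i^c)
  = 1 - \sum_{i=1}^{n} \left(1 - \mathbb{P}(A_i)\right)
  = \sum_{i=1}^{n} \mathbb{P}(A_i) - (n-1).
```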
If \(\Omega\) is finite with \(n\) equally likely outcomes, and event \(A\) contains \(k\) outcomes:
\[\mathbb{P}(A) = \frac{k}{n} = \frac{\text{number of elements in } A}{\text{total number of outcomes}}\]

This is the classical probability model. Computing \(\mathbb{P}(A)\) reduces to counting.
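As a quick illustration of counting (the two-dice experiment here is an assumed example, not one from the notes), the probability that two fair dice sum to 7:

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely ordered pairs (die1, die2).
omega = list(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 7.
A = [outcome for outcome in omega if sum(outcome) == 7]

# Classical model: P(A) = |A| / |Omega|.
p = Fraction(len(A), len(omega))
print(p)  # 1/6
```

The `Fraction` type keeps the answer exact, which matches the spirit of counting arguments better than floating point.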
Events are sets, so set operations carry direct probability interpretations:
\(\left(\bigcup_n S_n\right)^c = \bigcap_n S_n^c\) — the complement of a union is the intersection of complements.
\(\left(\bigcap_n S_n\right)^c = \bigcup_n S_n^c\) — the complement of an intersection is the union of complements.
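Both laws are easy to check on small concrete sets (an illustrative check, not a proof); complements are taken relative to a universe playing the role of \(\Omega\):

```python
# A universe and two arbitrary subsets of it.
omega = set(range(10))
S1 = {0, 1, 2, 3}
S2 = {2, 3, 4, 5}

def complement(s):
    """Complement relative to the universe omega."""
    return omega - s

# (S1 ∪ S2)^c == S1^c ∩ S2^c
assert complement(S1 | S2) == complement(S1) & complement(S2)
# (S1 ∩ S2)^c == S1^c ∪ S2^c
assert complement(S1 & S2) == complement(S1) | complement(S2)
print("De Morgan's laws hold on this example")
```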
If a selection is made in \(r\) stages with \(n_i\) choices at stage \(i\), the total number of selections is \(n_1 n_2 \cdots n_r\).
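A small sanity check of the counting principle (the three-stage menu is an assumed example): with \(n_1 = 2\) starters, \(n_2 = 3\) mains, and \(n_3 = 2\) desserts there are \(2 \cdot 3 \cdot 2 = 12\) distinct meals.

```python
from itertools import product

starters = ["soup", "salad"]
mains = ["fish", "beef", "pasta"]
desserts = ["cake", "fruit"]

# Enumerate every stage-by-stage selection explicitly...
meals = list(product(starters, mains, desserts))

# ...and compare against the counting principle n1 * n2 * n3.
assert len(meals) == len(starters) * len(mains) * len(desserts) == 12
print(len(meals))  # 12
```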
The number of ways to order \(n\) distinct elements: \(n! = 1 \cdot 2 \cdot 3 \cdots n\).
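To see the factorial formula in action (a sketch using a 4-element set as an assumed example), enumerate the orderings directly and compare with \(4!\):

```python
import math
from itertools import permutations

# All orderings of 4 distinct elements, enumerated explicitly...
orderings = list(permutations("abcd"))

# ...match the closed form 4! = 1 * 2 * 3 * 4 = 24.
assert len(orderings) == math.factorial(4) == 24
print(len(orderings))  # 24
```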
The number of ways to choose \(k\) elements from \(n\) (order doesn't matter):
\[\binom{n}{k} = \frac{n!}{k!(n-k)!}\]

Given an \(n\)-element set and nonnegative integers \(n_1, \ldots, n_r\) summing to \(n\), the number of ways to partition it into \(r\) disjoint subsets of sizes \(n_1, \ldots, n_r\) is:
\[\frac{n!}{n_1!\, n_2!\, \cdots\, n_r!}\]

The geometric series: \(\sum_{i=0}^{\infty} \alpha^i = \frac{1}{1-\alpha}\) for \(|\alpha| < 1\). Finite version: \(\sum_{i=0}^{n-1}\alpha^i = \frac{1-\alpha^n}{1-\alpha}\).
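The binomial and multinomial formulas above can be checked numerically (a sketch; the sizes \(5\) and \((2, 2, 1)\) are arbitrary assumed values, and `multinomial` is a helper defined here, not a library function):

```python
import math

# Binomial coefficient: ways to choose 2 of 5 when order is irrelevant.
assert math.comb(5, 2) == 10

def multinomial(*sizes):
    """n! / (n1! n2! ... nr!): ways to partition an n-element set
    into disjoint subsets of the given sizes."""
    out = math.factorial(sum(sizes))
    for s in sizes:
        out //= math.factorial(s)
    return out

# Partition a 5-element set into subsets of sizes 2, 2, 1:
# 5! / (2! 2! 1!) = 120 / 4 = 30.
assert multinomial(2, 2, 1) == 30
print(math.comb(5, 2), multinomial(2, 2, 1))  # 10 30
```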
Geometric series appear constantly in probability — summing over a geometric random variable's PMF, computing expected values, and verifying that PMFs normalise to 1.
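A numeric sketch of both uses, assuming the standard geometric PMF \(p(k) = (1-\alpha)\,\alpha^{k-1}\) for \(k = 1, 2, \ldots\) and an arbitrary \(\alpha = 0.3\):

```python
# Check the finite geometric-sum formula against a direct partial sum,
# then check that the geometric PMF normalises to (numerically) 1.
a = 0.3
n = 50

partial = sum(a**i for i in range(n))  # sum_{i=0}^{n-1} a^i
closed = (1 - a**n) / (1 - a)          # (1 - a^n) / (1 - a)
assert abs(partial - closed) < 1e-12

# Truncating at k = 199 leaves a tail of a^199, which is negligible here.
pmf_mass = sum((1 - a) * a**(k - 1) for k in range(1, 200))
assert abs(pmf_mass - 1.0) < 1e-12
print(round(pmf_mass, 6))  # 1.0
```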