Today at Booth’s PhD math camp, we revisited some foundational concepts in probability theory. Two key topics stood out to me:
Frequentist vs. Bayesian Perspectives
A crucial distinction lies in what we treat as random:
Frequentist: Parameters (e.g., population mean $\mu$) are fixed, and randomness comes from the data.
Example:
$$ P(X \mid \mu) $$
We ask: Given a true mean $\mu$, how likely is it to observe the sample mean $X$?
Bayesian: Parameters are treated as random variables, and data informs beliefs.
Example:
$$ P(\mu \mid X) $$
We ask: Given observed data $X$, what is the probability distribution of $\mu$?
Both frameworks are reasonable, but recognizing that these two distinct points of view exist helps avoid confusion.
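To make the contrast concrete, here is a quick sketch in Python (my own illustration, assuming a coin-flip model with a uniform Beta(1, 1) prior; the function names are mine): the frequentist evaluates the likelihood of the data at a fixed parameter, while the Bayesian evaluates a posterior density over the parameter given the data.

```python
import math

def likelihood(k, n, p):
    """Frequentist view: P(data | p), with p fixed -- the binomial likelihood."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def posterior_density(p, k, n):
    """Bayesian view: P(p | data), assuming a uniform Beta(1, 1) prior.
    The conjugate posterior is Beta(k + 1, n - k + 1)."""
    a, b = k + 1, n - k + 1
    beta_const = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return p**(a - 1) * (1 - p)**(b - 1) / beta_const

k, n = 7, 10  # observed 7 heads in 10 flips
print(likelihood(k, n, 0.5))         # how likely is this data if p = 0.5?
print(posterior_density(0.7, k, n))  # how plausible is p = 0.7 given the data?
```

Same data, two different questions: the first holds the parameter fixed and scores the data; the second holds the data fixed and scores the parameter.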
Modes of Convergence
Probability theory distinguishes several types of convergence for random variables $\{X_n\}$ to $X$:
Convergence in distribution:
$$ X_n \xrightarrow{d} X \quad \iff \quad F_{X_n}(x) \to F_X(x) $$
The CDFs converge at every continuity point of $F_X$.
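A quick numerical sketch of this (mine, not from the camp): by the CLT, the standardized mean of $n$ Uniform(0, 1) draws converges in distribution to a standard normal, so its empirical CDF at a fixed point should approach $\Phi$ as $n$ grows.

```python
import math
import random

random.seed(0)

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def standardized_mean(n):
    """Standardize the mean of n Uniform(0,1) draws: mean 1/2, variance 1/(12n)."""
    xbar = sum(random.random() for _ in range(n)) / n
    return (xbar - 0.5) / math.sqrt(1 / (12 * n))

def empirical_cdf_at(x, n, reps=20_000):
    """Fraction of replications with standardized mean <= x: estimates F_{X_n}(x)."""
    return sum(standardized_mean(n) <= x for _ in range(reps)) / reps

for n in (1, 5, 30):
    print(n, empirical_cdf_at(1.0, n), "target:", round(phi(1.0), 4))
```

As `n` grows, the empirical CDF value at $x = 1$ drifts toward $\Phi(1) \approx 0.8413$.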
Convergence in probability:
$$ X_n \xrightarrow{P} X \quad \iff \quad \forall \varepsilon > 0,\ P(|X_n - X| > \varepsilon) \to 0 $$
Intuitively, the probability of deviating from $X$ by more than any fixed margin shrinks to zero.
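This one is easy to see by simulation (a sketch of my own, using the weak law of large numbers with $X_n = \bar{X}_n$ and $X = \mu = 1/2$ for Uniform(0, 1) draws): estimate $P(|\bar{X}_n - \mu| > \varepsilon)$ across many replications for growing $n$.

```python
import random

random.seed(1)

def deviation_prob(n, eps=0.1, reps=5_000):
    """Monte Carlo estimate of P(|X_bar_n - mu| > eps) for Uniform(0,1), mu = 0.5."""
    count = 0
    for _ in range(reps):
        xbar = sum(random.random() for _ in range(n)) / n
        if abs(xbar - 0.5) > eps:
            count += 1
    return count / reps

for n in (10, 100, 1000):
    print(n, deviation_prob(n))
```

The estimated deviation probability drops sharply with $n$, which is exactly the definition above with $\varepsilon = 0.1$.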
Almost sure convergence:
$$ X_n \xrightarrow{a.s.} X \quad \iff \quad P\!\left(\lim_{n \to \infty} X_n = X\right) = 1 $$
This is the strongest form: for almost every outcome, the sequence eventually sticks to $X$.
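The contrast with convergence in probability is that a.s. convergence is a statement about individual sample paths. A sketch (mine, using the strong law of large numbers for Uniform(0, 1) draws): follow one realization of the running mean and check that its tail stays inside a small band around $\mu = 1/2$.

```python
import random

random.seed(2)

# Almost sure convergence is about single sample paths: along one
# realization, the running mean of Uniform(0,1) draws should eventually
# stay near mu = 0.5 (strong law of large numbers).
N = 100_000
total = 0.0
path = []  # running mean X_bar_n along this one path
for n in range(1, N + 1):
    total += random.random()
    path.append(total / n)

# Check the tail of this single path: past some index, it never leaves
# a small band around 0.5 again (for this realization).
eps = 0.02
tail_ok = all(abs(m - 0.5) <= eps for m in path[10_000:])
print("tail of this path stays within", eps, "of 0.5:", tail_ok)
```

Note the difference in what is being averaged: the in-probability demo averages over many replications at each fixed $n$, while this one inspects the tail of a single trajectory.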
Hierarchy:
$$ X_n \xrightarrow{a.s.} X \implies X_n \xrightarrow{P} X \implies X_n \xrightarrow{d} X $$

A math professor who specializes in probability and stochastics once told me that “all it takes to learn math well is decent intelligence and excellent memory.” Huh.
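One footnote on the hierarchy: the reverse implications fail. A classic counterexample, sketched in code (my own choice of distribution): take $X \sim \mathrm{Uniform}(-1, 1)$ and $X_n = -X$ for every $n$. By symmetry $X_n \xrightarrow{d} X$, yet $|X_n - X| = 2|X|$ never shrinks, so $X_n \not\xrightarrow{P} X$.

```python
import random

random.seed(3)

# X ~ Uniform(-1, 1), and X_n = -X for every n.
xs = [random.uniform(-1, 1) for _ in range(50_000)]
xns = [-x for x in xs]  # same distribution as X, by symmetry

def ecdf(sample, t):
    """Empirical CDF of the sample at t."""
    return sum(s <= t for s in sample) / len(sample)

# Convergence in distribution holds trivially: the CDFs already agree.
for t in (-0.5, 0.0, 0.5):
    print(t, ecdf(xs, t), ecdf(xns, t))  # nearly equal at each t

# But |X_n - X| = 2|X| does not shrink as n grows:
eps = 0.1
frac = sum(abs(xn - x) > eps for x, xn in zip(xs, xns)) / len(xs)
print(frac)  # true value: P(2|X| > 0.1) = 0.95, independent of n
```

So matching distributions say nothing about the two variables being close realization by realization, which is exactly the gap between the two weakest modes.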