Today at Booth’s PhD math camp, we revisited some foundational concepts in probability theory. Two key topics stood out to me:

Frequentist vs. Bayesian Perspectives

A crucial distinction lies in what we treat as random:

  • Frequentist: Parameters (e.g., population mean $\mu$) are fixed, and randomness comes from the data.

    Example: $$ P(X \mid \mu) $$ We ask: Given a true mean $\mu$, how likely is it to observe sample mean $X$?

  • Bayesian: Parameters are treated as random variables, and data informs beliefs.

    Example: $$ P(\mu \mid X) $$ We ask: Given observed data $X$, what is the probability distribution of $\mu$?

Both frameworks are reasonable, but recognizing that these distinct points of view exist helps avoid confusion.
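To make the contrast concrete, here is a minimal Python sketch of my own (the model and numbers are assumptions for illustration, not from the camp): i.i.d. normal data with known $\sigma$. The frequentist computation evaluates the likelihood $P(X \mid \mu)$ at fixed candidate values of $\mu$; the Bayesian computation puts a conjugate normal prior on $\mu$ and reports the closed-form posterior $P(\mu \mid X)$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy setup (assumed for illustration): X_1..X_n ~ N(mu, sigma^2), sigma known.
sigma, mu_true = 1.0, 2.0
x = rng.normal(mu_true, sigma, size=50)

# Frequentist view: mu is a fixed unknown; evaluate the likelihood
# P(X | mu) at candidate values of mu.
def log_likelihood(mu):
    return stats.norm.logpdf(x, loc=mu, scale=sigma).sum()

print("log P(X | mu=2):", log_likelihood(2.0))
print("log P(X | mu=0):", log_likelihood(0.0))

# Bayesian view: mu is random. With a conjugate N(m0, s0^2) prior,
# the posterior P(mu | X) is again normal, with closed-form parameters.
m0, s0 = 0.0, 10.0          # prior mean and sd (assumed for illustration)
n = len(x)
post_var = 1.0 / (1.0 / s0**2 + n / sigma**2)
post_mean = post_var * (m0 / s0**2 + x.sum() / sigma**2)
print(f"posterior: mu | X ~ N({post_mean:.3f}, {np.sqrt(post_var):.3f}^2)")
```

Same data, same model; the only difference is which quantity gets treated as random.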

Modes of Convergence

Probability theory distinguishes several modes of convergence of a sequence of random variables $\{X_n\}$ to a limit $X$:

  • Convergence in distribution: $$ X_n \xrightarrow{d} X \quad \iff \quad F_{X_n}(x) \to F_X(x) $$

    The CDFs converge at continuity points of $F_X$; the CLT simulation after this list is the canonical example of this mode.

  • Convergence in probability:

    $$ X_n \xrightarrow{P} X \quad \iff \quad \forall \varepsilon > 0,\ P(|X_n - X| > \varepsilon) \to 0 $$

    Intuitively, the probability that $X_n$ deviates from $X$ by more than any fixed $\varepsilon$ shrinks to zero.

  • Almost sure convergence: $$ X_n \xrightarrow{a.s.} X \quad \iff \quad P\Big[\lim_{n \to \infty} X_n = X\Big] = 1 $$

    Strongest form: for almost every outcome $\omega$, the realized sequence $X_n(\omega)$ converges to $X(\omega)$.
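The classic instance of convergence in distribution is the central limit theorem. The sketch below is my own illustration (not from the camp notes): standardized means of Uniform(0, 1) draws should have an empirical CDF close to the standard normal CDF $\Phi$, with the match tightening as $n$ grows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# CLT illustration (assumed example): X_n = sqrt(n) * (mean of n
# Uniform(0,1) draws - 1/2) / sd, where Var(Uniform(0,1)) = 1/12.
# X_n converges in distribution to N(0, 1): compare the empirical CDF
# of X_n with Phi at a few evaluation points.
def sample_Xn(n, reps=100_000):
    u = rng.random((reps, n))
    return np.sqrt(n) * (u.mean(axis=1) - 0.5) / np.sqrt(1 / 12)

for n in (2, 10, 100):
    xn = sample_Xn(n)
    for x in (-1.0, 0.0, 1.0):
        emp = (xn <= x).mean()
        print(f"n={n:3d}  F_Xn({x:+.0f}) ~ {emp:.3f}   Phi({x:+.0f}) = {stats.norm.cdf(x):.3f}")
```

Note that only CDF values are compared; nothing is claimed about the random variables themselves, which is exactly what makes this the weakest mode.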

Hierarchy:

$$ X_n \xrightarrow{a.s.} X \implies X_n \xrightarrow{P} X \implies X_n \xrightarrow{d} X $$
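Both implications are strict. A standard counterexample (again my own illustrative code, not from the notes): take independent $X_n \sim \mathrm{Bernoulli}(1/n)$. Then $P(|X_n - 0| > \varepsilon) = 1/n \to 0$, so $X_n \xrightarrow{P} 0$, but by the second Borel–Cantelli lemma $X_n = 1$ infinitely often with probability one, so $X_n$ does not converge almost surely. The simulation checks the first fact directly and hints at the second by showing that almost every path keeps hitting 1 beyond any cutoff $m$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Counterexample (assumed illustration): independent X_n ~ Bernoulli(1/n).
# X_n -> 0 in probability, but NOT almost surely (Borel-Cantelli II).
n_paths, horizon = 5_000, 5_000
n = np.arange(1, horizon + 1)
# Each row is one sample path (X_1, ..., X_horizon).
paths = rng.random((n_paths, horizon)) < 1.0 / n

# Convergence in probability: P(|X_n - 0| > eps) = 1/n -> 0.
for k in (10, 100, 1000):
    print(f"P(X_{k} = 1) ~ {paths[:, k - 1].mean():.4f}  (theory: {1/k:.4f})")

# Failure of a.s. convergence: along almost every path, X_n keeps
# returning to 1. The fraction of paths with some X_n = 1 beyond index m
# stays large as m grows (within this horizon, theory: 1 - m/horizon).
for m in (10, 100, 1000):
    frac = paths[:, m:].any(axis=1).mean()
    print(f"P(some X_n = 1 for n > {m}) ~ {frac:.3f}  (theory: {1 - m/horizon:.3f})")
```

A finite simulation can only suggest the "infinitely often" part, of course; it is the Borel–Cantelli argument that actually rules out almost sure convergence.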


A math professor who specializes in probability and stochastic processes once told me, “All it takes to learn math well is decent intelligence and excellent memory.” Huh.