Probability Theory: From Events to Numbers
Why It Matters to the Lab
Cities, data streams, and neural networks all share a common feature: uncertainty. Probability theory gives us the language to quantify it.
For Second Street Labs, this matters whenever we interpret noisy sensor data, design randomized algorithms, or reason about robustness under incomplete information.
The Probability Space
A probability space is a triple $(\Omega, \mathcal{F}, P)$ where:
- $\Omega$: the sample space (all possible outcomes).
- $\mathcal{F}$: a $\sigma$-algebra of subsets of $\Omega$ (the measurable events).
- $P$: a probability measure that assigns numbers to events.
Together, these ensure that every “legal” event gets a consistent probability.
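To make the triple concrete, here is a minimal sketch of a finite probability space in Python. The names `omega`, `weights`, and `P` are our own, chosen for illustration, and the $\sigma$-algebra is simply the full power set, which is always legal on a finite sample space.

```python
from itertools import combinations

# A finite probability space: omega is the sample space, the sigma-algebra
# is the full power set, and P sums the weights of outcomes in an event.
omega = {"HH", "HT", "TH", "TT"}        # two fair coin tosses
weights = {w: 0.25 for w in omega}      # uniform probability measure

def powerset(s):
    """All subsets of s: the largest sigma-algebra on a finite sample space."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def P(event):
    """Probability measure: sum the weights of the outcomes in the event."""
    return sum(weights[w] for w in event)

sigma_algebra = powerset(omega)
at_least_one_head = frozenset(w for w in omega if "H" in w)
print(len(sigma_algebra), P(at_least_one_head))   # 16 events, probability 0.75
```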
Kolmogorov’s Axioms
Probability is built on three axioms:
- Non-negativity: $P(A) \geq 0$ for all $A \in \mathcal{F}$.
- Normalization: $P(\Omega) = 1$.
- Countable Additivity: If $A_1, A_2, \ldots \in \mathcal{F}$ are pairwise disjoint, then $P\!\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)$.
From these, we can derive continuity of measure, the complement rule, and conditional probability.
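As a quick sanity check on two of those derived rules, the sketch below (our own construction, with a single fair die roll) verifies the complement rule and computes a conditional probability by summing outcome weights.

```python
# Checking two consequences of the axioms on a single fair die roll.
omega = {1, 2, 3, 4, 5, 6}
weights = {w: 1 / 6 for w in omega}

def P(event):
    """Probability measure: sum the weights of the outcomes in the event."""
    return sum(weights[w] for w in event)

A = {2, 4, 6}      # "the roll is even"
B = {4, 5, 6}      # "the roll is at least four"

# Complement rule, derived from normalization and additivity: P(A^c) = 1 - P(A).
assert abs(P(omega - A) - (1 - P(A))) < 1e-12

# Conditional probability: P(A | B) = P(A and B) / P(B).
print(P(A & B) / P(B))   # 2/3, since two of the three outcomes in B are even
```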
Examples
- Coin Tosses: $\Omega = \{H, T\}$, $P(\{H\}) = P(\{T\}) = \tfrac{1}{2}$.
- Twins Example: Assign probabilities to the events “identical twins,” “fraternal twins,” and “female twins” using set intersections and unions.
- Continuous Example: $\Omega = [0, \infty)$, with $P$ defined by an exponential distribution (sketched below).
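For the continuous example, here is a sketch using NumPy (our tooling choice, not prescribed above) that compares a Monte Carlo estimate of $P([0, t])$ under an exponential law with rate $\lambda$ against the closed-form value $1 - e^{-\lambda t}$.

```python
import numpy as np

# Continuous example: Omega = [0, infinity), P given by an exponential
# distribution with rate lam. Compare a Monte Carlo estimate of P([0, t])
# against the closed-form CDF 1 - exp(-lam * t).
rng = np.random.default_rng(0)
lam, t = 1.5, 2.0

samples = rng.exponential(scale=1 / lam, size=100_000)
monte_carlo = np.mean(samples <= t)
closed_form = 1 - np.exp(-lam * t)

print(monte_carlo, closed_form)   # both close to 0.9502
```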
Why Machine Learning Cares
- Bayesian inference: Posterior updates are probability measures over parameter sets.
- Generalization bounds: Expressed in terms of probability of error over unseen samples.
- Stochastic optimization: Algorithms like SGD rely on treating gradients as random variables drawn from a probability space.
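To illustrate the stochastic-optimization point, the toy sketch below (a synthetic least-squares problem of our own, not a full training loop) checks that minibatch gradients, viewed as random variables, average out to the full-batch gradient.

```python
import numpy as np

# Stochastic gradients as random variables: on a fixed dataset, the gradient
# computed on a uniformly sampled minibatch is an unbiased estimate of the
# full-batch gradient.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=1000)
w = np.zeros(3)

def grad(Xb, yb, w):
    """Gradient of the mean squared error 0.5 * mean((Xb @ w - yb)**2)."""
    return Xb.T @ (Xb @ w - yb) / len(yb)

full_grad = grad(X, y, w)
minibatch_grads = [
    grad(X[idx], y[idx], w)
    for idx in (rng.choice(len(X), size=32, replace=False) for _ in range(5000))
]

# The average minibatch gradient concentrates around the full gradient.
print(full_grad)
print(np.mean(minibatch_grads, axis=0))
```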
Whether explicit or implicit, probability is the glue that holds machine learning theory together.
Key Takeaways
- A probability space formalizes randomness.
- Kolmogorov’s three axioms make the theory consistent.
- Examples range from coin flips to continuous lifetimes.
- Machine learning applies these foundations at every level.
Further Reading
- Casella & Berger, Statistical Inference (Chapter 1, probability axioms)
- Kolmogorov, Foundations of the Theory of Probability