An interactive journey through the fundamental patterns that govern randomness in data science. Visualize, experiment, and master the distributions that shape our world.
Countable outcomes like coin flips, dice rolls, or success counts.
Measurements across a range like height, temperature, or time.
Capture patterns in sales, weather, and user behavior
Forecast customer conversions and risk probabilities
Choose correct methods for hypothesis testing
The simplest case: every outcome is equally likely. Perfect fairness in probability.
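Discrete case: P(X = x) = 1/n, where n is the number of possible outcomes. Each outcome has exactly the same probability.
Continuous case: f(x) = 1/(b - a) for a ≤ x ≤ b, and 0 elsewhere. Probability density is constant across the entire interval [a, b].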
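A minimal sketch of both uniform cases, assuming the standard formulas 1/n (discrete) and 1/(b - a) (continuous). The function names are illustrative and not part of the original page.

```python
from fractions import Fraction


def discrete_uniform_pmf(n: int) -> Fraction:
    """Probability of any single outcome when all n outcomes are equally likely."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return Fraction(1, n)


def continuous_uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the continuous uniform distribution on [a, b]."""
    if a >= b:
        raise ValueError("a must be less than b")
    # The density is flat inside the interval and zero outside it.
    return 1.0 / (b - a) if a <= x <= b else 0.0


def continuous_uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi): length of the overlap with [a, b] divided by (b - a)."""
    if a >= b:
        raise ValueError("a must be less than b")
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)


print(discrete_uniform_pmf(6))                     # fair six-sided die: 1/6
print(continuous_uniform_pdf(0.5, 0.0, 2.0))       # 0.5
print(continuous_uniform_prob(0.0, 0.5, 0.0, 2.0)) # 0.25
```

Note that in the continuous case the density can exceed 1 when the interval is shorter than 1; only the area under it (the probability) is bounded by 1.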
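Counting successes in a fixed number of trials. The probability of getting exactly k successes in n independent yes/no experiments, each with the same success probability p, is P(X = k) = C(n, k) · p^k · (1 - p)^(n - k).
Flip a fair coin 10 times. Probability of exactly 6 heads: C(10, 6) · 0.5^10 = 210/1024 ≈ 0.205.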
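The coin-flip example can be checked with a few lines of code. This is a sketch assuming the standard binomial formula C(n, k) · p^k · (1 - p)^(n - k); `binomial_pmf` is an illustrative name.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if not 0 <= k <= n:
        return 0.0
    # comb(n, k) counts the ways to place k successes among n trials.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Fair coin, 10 flips, exactly 6 heads: 210 / 1024
prob = binomial_pmf(6, 10, 0.5)
print(f"P(X = 6) = {prob:.4f}")  # P(X = 6) = 0.2051

# Sanity check: the probabilities over k = 0..n sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(f"sum over k = {total:.4f}")
```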
The famous Bell Curve. Nature's favorite pattern for continuous measurements like height, IQ, and measurement errors.
f(x) = 1 / (σ√(2π)) · exp(-(x - μ)² / (2σ²)), where μ is the mean and σ is the standard deviation. The curve is perfectly symmetric around the mean.
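A short sketch of the normal density and its symmetry, assuming the standard formula f(x) = exp(-(x - μ)² / (2σ²)) / (σ√(2π)). The values μ = 100 and σ = 15 are illustrative choices, not taken from the page, and the result is cross-checked against Python's built-in `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


mu, sigma = 100.0, 15.0  # illustrative parameters

# Symmetry: points equally far below and above the mean have the same density.
left = normal_pdf(mu - 15, mu, sigma)
right = normal_pdf(mu + 15, mu, sigma)
print(math.isclose(left, right))  # True

# Cross-check against the standard library implementation.
print(math.isclose(normal_pdf(110, mu, sigma), NormalDist(mu, sigma).pdf(110)))  # True
```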
Why the normal distribution appears everywhere. The mathematical miracle that makes statistics work.
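Take many random samples from a population with finite variance, calculate their means, and those means will be approximately normally distributed, regardless of the original population's shape. The approximation improves as the sample size grows.
Draw many random samples from the population, each of size n (n ≥ 30 is a common rule of thumb)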
Compute the average of each sample
Plot all sample means → Bell curve!
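The three steps above can be sketched as a small simulation. It assumes a deliberately skewed population (exponential with mean 1 and standard deviation 1), a sample size of 30, and 5,000 repeated samples; all of these are illustrative choices, and the seed is fixed only to make the run repeatable.

```python
import math
import random
import statistics


def sample_means(draw, sample_size: int, num_samples: int, rng: random.Random) -> list:
    """Draw num_samples samples of size sample_size and return the mean of each."""
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]


def draw_exponential(rng: random.Random) -> float:
    """One observation from a right-skewed population (mean 1, std dev 1)."""
    return rng.expovariate(1.0)


rng = random.Random(42)
n = 30
means = sample_means(draw_exponential, sample_size=n, num_samples=5000, rng=rng)

# The CLT predicts the sample means centre on the population mean (1.0)
# with standard deviation sigma / sqrt(n), about 0.183 here.
expected_sd = 1.0 / math.sqrt(n)
print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"std dev of sample means: {statistics.stdev(means):.3f} (theory: {expected_sd:.3f})")

# For a bell curve, roughly 68% of values fall within one standard deviation of the centre.
within_one_sd = sum(abs(m - 1.0) <= expected_sd for m in means) / len(means)
print(f"share within one std dev: {within_one_sd:.2f}")
```

Rerunning with a smaller `sample_size` (say 2 or 5) shows the sample means still carrying the skew of the population, which is why the n ≥ 30 guideline is a rule of thumb rather than a guarantee.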
We can draw conclusions about populations using sample data, even without knowing the population's distribution.
Many classical statistical tests assume normality. The CLT justifies using these tests with large samples.
We can estimate how precise our sample statistics are using the normal distribution.
Manufacturing processes use the CLT to monitor product consistency and detect anomalies.
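As one way to make "how precise our sample statistics are" concrete, the sketch below builds a normal-approximation confidence interval for a mean: sample mean ± z × (sample standard deviation / √n). It assumes a large enough sample for the CLT to apply; the function name and the data (the integers 1 to 40) are purely illustrative.

```python
import math
import statistics
from statistics import NormalDist


def mean_confidence_interval(data, confidence: float = 0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    mean = statistics.fmean(data)
    standard_error = statistics.stdev(data) / math.sqrt(n)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


data = list(range(1, 41))  # illustrative sample of 40 observations
low, high = mean_confidence_interval(data)
print(f"sample mean: {statistics.fmean(data):.2f}")
print(f"95% interval: ({low:.2f}, {high:.2f})")
```

For small samples, a t-distribution critical value is the more usual choice than z; the normal version is used here only because it follows directly from the CLT.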
| Distribution | Type | When to Use | Key Parameters |
|---|---|---|---|
| Uniform | Discrete/Continuous | Equally likely outcomes | n (outcomes) if discrete; a (min), b (max) if continuous |
| Binomial | Discrete | Counting successes in n trials | n (trials), p (success probability) |
| Normal | Continuous | Natural measurements, errors | μ (mean), σ (std dev) |
| CLT | Theorem | Sample means from any population with finite variance | n (sample size; n ≥ 30 as a rule of thumb) |