What is Probability?
Probability is a mathematical way to measure how likely something is to happen. Think of it as a number between 0 and 1 that tells us the chance of an event occurring.
0 = Impossible (will never happen)
0.5 = Equally likely (50-50 chance)
1 = Certain (will definitely happen)
Simple Examples
- Tossing a coin: Getting heads has a probability of 0.5 (50%)
- Rolling a die: Getting a 6 has a probability of 1/6 ≈ 0.167 (16.7%)
- Drawing a card: Getting a heart has a probability of 13/52 = 0.25 (25%)
Why Probability Matters in Data Science
Probability is the foundation of data science. It helps us:
- Handle Uncertainty: Real-world data is messy and unpredictable. Probability helps us make sense of it.
- Make Predictions: We can predict customer behavior, stock prices, or weather patterns.
- Evaluate Risk: Companies use probability to assess risks before making decisions.
- Build Machine Learning Models: Many algorithms like Naive Bayes, logistic regression, and neural networks rely on probability.
Real-World Business Scenarios
- Marketing: What's the probability that a customer will click on an ad?
- Manufacturing: What's the probability that a product will fail within one year?
- Insurance: What's the probability of a car accident based on driver history?
- Finance: What's the probability that a given transaction is fraudulent?
Types of Probability
1. Theoretical Probability
Theoretical probability is based on what should happen in theory, using logic and mathematics. We calculate it when we know all possible outcomes.
Example: Fair Coin
A fair coin has 2 sides: Heads and Tails
P(Heads) = 1/2 = 0.5
We don't need to flip the coin 1000 times to know this—it's based on the structure of the coin.
2. Empirical (Experimental) Probability
Empirical probability is based on actual observations and experiments. We collect data and see what actually happens.
Example: Ad Click Rate
You run an online ad campaign and track results:
- Total impressions: 10,000
- Number of clicks: 250
P(Click) = 250/10,000 = 0.025 = 2.5%
This is based on real data, not theory.
| Theoretical Probability | Empirical Probability |
|---|---|
| Based on logic and reasoning | Based on actual experiments |
| Works with fair/ideal conditions | Works with real-world data |
| Example: Rolling a fair die | Example: Customer conversion rates |
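To see how an empirical estimate approaches the theoretical value as more data is collected, here is a minimal sketch that simulates die rolls with Python's standard `random` module. The function name `empirical_probability`, the seed, and the trial counts are illustrative assumptions, not part of the examples above.

```python
import random

def empirical_probability(trials, target=6, seed=42):
    """Estimate P(rolling `target`) on a fair six-sided die by simulation."""
    rng = random.Random(seed)  # fixed seed so reruns give the same numbers (assumption)
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == target)
    return hits / trials

theoretical = 1 / 6
for n in (100, 10_000, 100_000):
    estimate = empirical_probability(n)
    print(f"{n:>7} rolls: empirical = {estimate:.4f}, theoretical = {theoretical:.4f}")
```

With few rolls the estimate can be noticeably off; with many rolls it typically settles close to 1/6.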
Calculating Probability
P(E) = Number of Favorable Outcomes / Total Number of Outcomes
Where:
- P(E) = Probability of event E happening
- Favorable Outcomes = Outcomes that satisfy what we're looking for
- Total Outcomes = All possible outcomes in the sample space
Step-by-Step Example: Rolling a Die
Question: What is the probability of rolling a 3 on a fair six-sided die?
Step 1: Identify total possible outcomes
A die has 6 faces: {1, 2, 3, 4, 5, 6}
Total outcomes = 6
Step 2: Identify favorable outcomes
We want to roll a 3, so there's only 1 favorable outcome
Favorable outcomes = 1
Step 3: Apply the formula
P(rolling a 3) = 1/6 ≈ 0.1667 or 16.67%
More Examples
Rolling an Even Number
Even numbers on a die: {2, 4, 6} = 3 favorable outcomes
Total outcomes = 6
P(even number) = 3/6 = 0.5 or 50%
Drawing a Heart from Cards
A standard deck has 52 cards, 13 are hearts
P(heart) = 13/52 = 1/4 = 0.25 or 25%
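The counting formula can be sketched in a few lines of Python. The helper name `probability` is hypothetical, and the sketch assumes every outcome in the sample space is equally likely; `fractions.Fraction` keeps the results exact.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(E) = favorable outcomes / total outcomes (equally likely outcomes assumed)."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
p_three = probability({3}, die)       # rolling a 3
p_even = probability({2, 4, 6}, die)  # rolling an even number

print(p_three, float(p_three))
print(p_even, float(p_even))
```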
Types of Experiments & Sample Space
Types of Experiments
1. Deterministic Experiments
These experiments always produce the same result under identical conditions. There's no uncertainty.
- Boiling water at 100°C (at sea level) always turns to steam
- 2 + 2 always equals 4
- Dropping a ball always makes it fall down (due to gravity)
2. Probabilistic (Random) Experiments
These experiments have uncertain outcomes that can vary even under identical conditions.
- Tossing a coin (could be heads or tails)
- Rolling a die (could be any number 1-6)
- Drawing a card from a shuffled deck
- Whether it will rain tomorrow
Sample Space
The sample space is the set of all possible outcomes of a probabilistic experiment. We usually denote it with the letter S.
Sample Space Examples
1. Rolling a Single Die
S = {1, 2, 3, 4, 5, 6}
Total outcomes = 6
2. Tossing a Single Coin
S = {Heads, Tails} or {H, T}
Total outcomes = 2
3. Tossing Two Coins
S = {(H,H), (H,T), (T,H), (T,T)}
Total outcomes = 4
For Coins: Number of outcomes = 2ⁿ (where n = number of coins)
Example: 3 coins → 2³ = 8 outcomes
For Dice: Number of outcomes = 6ⁿ (where n = number of dice)
Example: 2 dice → 6² = 36 outcomes
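These counts can be checked by listing the sample spaces directly. The sketch below uses `itertools.product` from the standard library; the helper name `sample_space` is an assumption made for illustration.

```python
from itertools import product

def sample_space(outcomes, n):
    """All ordered results of repeating a single experiment n times."""
    return list(product(outcomes, repeat=n))

two_coins = sample_space("HT", 2)
three_coins = sample_space("HT", 3)
two_dice = sample_space(range(1, 7), 2)

print(two_coins)         # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(three_coins))  # 8
print(len(two_dice))     # 36
```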
Sample Space for Cards
A standard deck contains 52 cards:
| Color | Suit | Cards |
|---|---|---|
| Red (26 cards) | Hearts ♥ | A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K |
| Red (26 cards) | Diamonds ♦ | A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K |
| Black (26 cards) | Spades ♠ | A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K |
| Black (26 cards) | Clubs ♣ | A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K |
Basic Rules of Probability
Important Definitions
- Experiment: An action that produces one or more outcomes (e.g., rolling a die, drawing a card)
- Outcome: A single result from an experiment (e.g., rolling a 3)
- Sample Space (S): All possible outcomes
- Event (E): A subset of the sample space—one or more outcomes we're interested in
Example: Rolling a Die
Sample Space: S = {1, 2, 3, 4, 5, 6}
Event (rolling an even number): E = {2, 4, 6}
P(E) = 3/6 = 0.5
Special Types of Events
1. Mutually Exclusive Events
Two events that cannot happen at the same time.
Example: Rolling a 2 and rolling a 5 on a single die roll. These events cannot both occur—the die can only show one number at a time.
2. Independent Events
Two events where one doesn't affect the other.
Example: Tossing two coins—getting heads on the first coin doesn't change the probability of getting heads on the second coin.
3. Certain Event
An event that will definitely happen.
Example: Rolling a number between 1 and 6 on a standard die.
4. Impossible Event
An event that cannot happen.
Example: Rolling a 7 on a standard six-sided die.
5. Exhaustive Events
A set of events that cover all possible outcomes—at least one must occur.
Example: For a die roll: {1,2,3} and {4,5,6} are exhaustive because together they cover all outcomes.
Probability Notation
- P(A ∪ B): Probability of A OR B occurring (union)
- P(A ∩ B): Probability of A AND B occurring (intersection)
- P(A'): Probability of A NOT occurring (complement)
The Complement Rule
The complement of an event A is simply the event that A does NOT occur.
Notation: We write it as A' or Aᶜ (read as "A complement")
P(A') = 1 - P(A)
This makes sense because:
- Either event A happens, or it doesn't
- These two possibilities cover all outcomes
- Their probabilities must add up to 1 (certainty)
Examples
Example 1: Weather Forecast
If the probability of rain today is 0.3:
P(Rain) = 0.3
P(No Rain) = 1 - 0.3 = 0.7
There's a 70% chance it won't rain.
Example 2: Rolling a Die
What's the probability of NOT rolling a 6?
P(rolling a 6) = 1/6
P(not rolling a 6) = 1 - 1/6 = 5/6 ≈ 0.833
Example 3: Quality Control in Manufacturing
A factory knows that 2% of products are defective:
P(Defective) = 0.02
P(Not Defective) = 1 - 0.02 = 0.98
So 98% of products pass quality control.
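The three examples can be reproduced with a small helper. This is only a sketch: the name `complement` is hypothetical, and probabilities are passed as strings or `Fraction` objects so the arithmetic stays exact.

```python
from fractions import Fraction

def complement(p):
    """Return P(A') = 1 - P(A), rejecting values outside [0, 1]."""
    p = Fraction(p)  # accepts "0.3" or Fraction(1, 6)
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p

no_rain = complement("0.3")            # weather forecast
not_six = complement(Fraction(1, 6))   # die roll
not_defective = complement("0.02")     # quality control

print(float(no_rain), not_six, float(not_defective))
```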
The Addition Rule
The addition rule helps us find the probability of event A OR event B occurring.
P(A or B) = P(A) + P(B) - P(A and B)
Why subtract P(A and B)? Because when we add P(A) + P(B), we count the overlap twice. We need to subtract it once to get the correct answer.
Example with Numbers
P(A) = 0.4
P(B) = 0.5
P(A and B) = 0.2
Find P(A or B)
Solution:
P(A or B) = 0.4 + 0.5 - 0.2 = 0.7
Special Case: Mutually Exclusive Events
When events cannot happen together, P(A and B) = 0, so the rule simplifies to:
P(A or B) = P(A) + P(B)
Example: Rolling a Die
What's the probability of rolling a 2 OR a 5?
These events are mutually exclusive (can't roll both)
P(2) = 1/6
P(5) = 1/6
P(2 or 5) = 1/6 + 1/6 = 2/6 = 1/3
Real-World Example: Card Drawing
What's the probability of drawing a red card OR a queen?
P(Red card) = 26/52 = 1/2
P(Queen) = 4/52 = 1/13
P(Red and Queen) = 2/52 = 1/26 (red queens: hearts and diamonds)
P(Red or Queen) = 1/2 + 1/13 - 1/26 = 13/26 + 2/26 - 1/26 = 14/26 = 7/13 ≈ 0.538
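One way to check the card result is to build the deck and compare the addition rule against a direct count of the union. The sketch below is illustrative; the suit names and the helper `p` are assumptions.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "spades", "clubs"]
deck = set(product(ranks, suits))  # 52 (rank, suit) pairs

red = {card for card in deck if card[1] in ("hearts", "diamonds")}
queens = {card for card in deck if card[0] == "Q"}

def p(event):
    """Probability of an event when every card is equally likely."""
    return Fraction(len(event), len(deck))

by_rule = p(red) + p(queens) - p(red & queens)  # addition rule
by_counting = p(red | queens)                   # count the union directly

print(by_rule, by_counting)  # 7/13 7/13
```

Both routes agree because the set union counts the two red queens only once, which is exactly what subtracting P(A and B) achieves.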
The Multiplication Rule
The multiplication rule helps us find the probability that two events occur together.
For Independent Events
When events don't affect each other:
P(A and B) = P(A) × P(B)
Example: Rolling a Die Twice
Event A: Rolling a 3 on first throw
Event B: Rolling an even number on second throw
P(A) = 1/6
P(B) = 3/6 = 1/2 (can roll 2, 4, or 6)
P(A and B) = 1/6 × 1/2 = 1/12 ≈ 0.083
For Dependent Events
When one event affects the probability of the other:
P(A and B) = P(A) × P(B | A)
Read as: "Probability of A times probability of B given A"
Example: Drawing Without Replacement
A bag contains 5 red and 3 blue balls. Two balls are drawn without replacement.
What's the probability both are red?
Event A: First ball is red
P(A) = 5/8
Event B: Second ball is red
After removing one red ball: 4 red left out of 7 total
P(B | A) = 4/7
P(Both red) = 5/8 × 4/7 = 20/56 = 5/14 ≈ 0.357
| Scenario | Type | Formula |
|---|---|---|
| Drawing WITH replacement | Independent | P(A) × P(B) |
| Drawing WITHOUT replacement | Dependent | P(A) × P(B \| A) |
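The difference between the two scenarios can be verified by enumerating every ordered pair of draws from the bag of 5 red and 3 blue balls. In this sketch, `itertools.permutations` models drawing without replacement and `itertools.product` models drawing with replacement; the helper name `p_both_red` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

balls = ["R"] * 5 + ["B"] * 3  # 5 red and 3 blue balls

def p_both_red(draws):
    """Share of equally likely ordered draws in which both balls are red."""
    draws = list(draws)
    favorable = sum(1 for first, second in draws if first == "R" and second == "R")
    return Fraction(favorable, len(draws))

without_replacement = p_both_red(permutations(balls, 2))  # 8 × 7 ordered pairs
with_replacement = p_both_red(product(balls, repeat=2))   # 8 × 8 ordered pairs

print(without_replacement)  # 5/14
print(with_replacement)     # 25/64
```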
Independent vs Dependent Events
Independent Events
The outcome of one event does not affect the other.
Examples of Independent Events
- Tossing two coins: Getting heads on the first doesn't change the probability of heads on the second
- Rolling two dice: What you roll on die 1 doesn't affect die 2
- Drawing with replacement: You put the card back, so each draw is independent
- Weather in different cities: Rain in New York doesn't directly affect rain in Tokyo
Dependent Events
One event affects the probability of the other.
Examples of Dependent Events
- Drawing without replacement: Once you remove a card, the probabilities change for the next draw
- Traffic: If there's an accident on the highway, the probability of delay increases
- Sports: If a star player is injured, the team's probability of winning decreases
- Medicine: Taking one medication might affect how another medication works
Testing for Independence
Two events A and B are independent if and only if:
P(A and B) = P(A) × P(B)
Example: Are These Independent?
You flip a coin and roll a die.
Event A: Coin shows heads → P(A) = 1/2
Event B: Die shows 6 → P(B) = 1/6
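The experiment has 2 × 6 = 12 equally likely outcomes, and only one of them, (H, 6), satisfies both events, so P(A and B) = 1/12.
P(A) × P(B) = 1/2 × 1/6 = 1/12
The two values match, so yes, they're independent! The coin flip doesn't affect the die roll.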
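Don't confuse these two ideas:
- Mutually exclusive: Events cannot happen together (if A occurs, B cannot)
- Independent: Events don't affect each other (whether A occurs doesn't change P(B))
Mutually exclusive events with nonzero probabilities are never independent: once A occurs, P(B) drops to 0.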
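The independence test can be run mechanically on the coin-and-die experiment. The following sketch enumerates the 12 outcomes and compares P(A and B) with P(A) × P(B); the helper names `p` and `independent` are assumptions.

```python
from fractions import Fraction
from itertools import product

space = set(product("HT", range(1, 7)))  # 12 (coin, die) outcomes

def p(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(space))

def independent(a, b):
    """True when P(A and B) equals P(A) × P(B)."""
    return p(a & b) == p(a) * p(b)

heads = {o for o in space if o[0] == "H"}
six = {o for o in space if o[1] == 6}
one = {o for o in space if o[1] == 1}

print(independent(heads, six))  # True
print(independent(six, one))    # False: mutually exclusive events are dependent
```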
Conditional Probability
Conditional probability is the probability of an event A occurring, given that another event B has already occurred.
P(A | B) = P(A and B) / P(B), where P(B) > 0
Read as: "Probability of A given B"
Why Is This Important?
In real life, we often get new information that changes probabilities. Conditional probability lets us update our beliefs based on what we know.
Simple Example: Ball Selection
You have a box with:
- 5 red balls
- 3 blue balls
- 2 green balls
You're told the ball picked is NOT green. What's the probability it's red?
Solution:
Event A = ball is red
Event B = ball is not green
If not green, we have 8 balls (5 red + 3 blue)
P(Red | Not Green) = 5/8 = 0.625
Example: Card Drawing
A card is drawn from a deck. Given that it's a face card (J, Q, K), what's the probability it's a King?
Total face cards = 12 (3 face cards × 4 suits)
Kings among face cards = 4
P(King | Face Card) = 4/12 = 1/3
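Both counting examples follow the same pattern: restrict attention to the outcomes in B, then count how many of those are also in A. A minimal sketch, assuming equally likely outcomes and using a hypothetical helper named `conditional`:

```python
from fractions import Fraction

def conditional(a, b, space):
    """P(A | B) = P(A and B) / P(B) for equally likely outcomes."""
    if not b:
        raise ValueError("P(B) must be greater than 0")
    p_b = Fraction(len(b), len(space))
    p_a_and_b = Fraction(len(a & b), len(space))
    return p_a_and_b / p_b

# Box with 5 red, 3 blue, and 2 green balls (labels are illustrative)
box = ({f"red{i}" for i in range(5)}
       | {f"blue{i}" for i in range(3)}
       | {f"green{i}" for i in range(2)})
red = {ball for ball in box if ball.startswith("red")}
not_green = {ball for ball in box if not ball.startswith("green")}

p_red_given_not_green = conditional(red, not_green, box)
print(p_red_given_not_green)  # 5/8
```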
Real-World Application: Student Performance
A class has:
- 60% boys, 40% girls
- 30% of boys play football
- 10% of girls play football
Question: If a randomly selected student plays football, what's the probability they're a boy?
Step 1: Find P(Football)
P(F) = P(F|Boy)×P(Boy) + P(F|Girl)×P(Girl)
P(F) = 0.3×0.6 + 0.1×0.4 = 0.18 + 0.04 = 0.22
Step 2: Find P(Boy|Football)
P(Boy|F) = P(F|Boy)×P(Boy) / P(F)
P(Boy|F) = 0.18 / 0.22 ≈ 0.818 or 81.8%
Law of Total Probability
The Law of Total Probability helps us find the probability of an event by breaking it down into simpler pieces.
P(A) = P(A|B₁)×P(B₁) + P(A|B₂)×P(B₂) + ... + P(A|Bₙ)×P(Bₙ)
When Can We Use This Law?
We need three conditions:
- Partition of Sample Space: Events B₁, B₂, ..., Bₙ must be mutually exclusive and collectively exhaustive
- Known Probabilities: We must know P(Bᵢ) for each event
- Known Conditional Probabilities: We must know P(A|Bᵢ) for each event
Manufacturing from Multiple Factories
A company makes products in 3 factories:
- Factory 1: Produces 50% of all products, 2% are defective
- Factory 2: Produces 30% of all products, 3% are defective
- Factory 3: Produces 20% of all products, 5% are defective
Question: What's the overall probability a random product is defective?
Solution:
P(Defective) = P(D|F1)×P(F1) + P(D|F2)×P(F2) + P(D|F3)×P(F3)
P(Defective) = 0.02×0.50 + 0.03×0.30 + 0.05×0.20
P(Defective) = 0.010 + 0.009 + 0.010 = 0.029 or 2.9%
Disease Testing
A disease affects 1% of the population.
A test for the disease:
- Returns positive 99% of the time if person has disease
- Returns positive 5% of the time if person doesn't have disease (false positive)
Question: What's the probability of getting a positive test?
P(Positive) = P(Pos|Disease)×P(Disease) + P(Pos|No Disease)×P(No Disease)
P(Positive) = 0.99×0.01 + 0.05×0.99
P(Positive) = 0.0099 + 0.0495 = 0.0594 or 5.94%
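The factory and disease-testing calculations are both weighted sums, so one small function covers them. This is a sketch under stated assumptions: the name `total_probability` is hypothetical, and the inputs are given as decimal strings so `Fraction` can keep the arithmetic exact.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Sum of P(A | B_i) × P(B_i) over a partition B_1, ..., B_n."""
    priors = [Fraction(p) for p in priors]
    likelihoods = [Fraction(l) for l in likelihoods]
    if len(priors) != len(likelihoods):
        raise ValueError("need one conditional probability per partition event")
    if sum(priors) != 1:
        raise ValueError("the P(B_i) must sum to 1 to form a partition")
    return sum(l * p for p, l in zip(priors, likelihoods))

# Three factories: production shares and defect rates
p_defective = total_probability(["0.50", "0.30", "0.20"], ["0.02", "0.03", "0.05"])
# Disease test: groups are (disease, no disease)
p_positive = total_probability(["0.01", "0.99"], ["0.99", "0.05"])

print(float(p_defective), float(p_positive))
```

The check that the priors sum to 1 only covers the "collectively exhaustive" half of the partition condition; mutual exclusivity still has to be guaranteed by how the events are defined.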
Bayes' Theorem
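Bayes' Theorem is one of the most powerful tools in probability and statistics. It lets us reverse conditional probabilities.
P(A | B) = [P(B | A) × P(A)] / P(B)
Or in expanded form:
P(A | B) = [P(B | A) × P(A)] / [P(B | A) × P(A) + P(B | A') × P(A')]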
What Does It Mean?
- P(A | B): What we want to find (posterior probability)
- P(B | A): What we know (likelihood)
- P(A): Prior probability (what we believed before seeing B)
- P(B): Total probability of observing B
Classic Example: Medical Testing
A disease affects 1% of the population.
A test for the disease:
- If you have the disease, test is positive 99% of the time
- If you don't have the disease, test is positive 5% of the time
You test positive. What's the probability you have the disease?
Given:
P(Disease) = 0.01
P(No Disease) = 0.99
P(Positive | Disease) = 0.99
P(Positive | No Disease) = 0.05
Step 1: Calculate P(Positive)
P(Pos) = 0.99×0.01 + 0.05×0.99 = 0.0099 + 0.0495 = 0.0594
Step 2: Apply Bayes' Theorem
P(Disease | Positive) = 0.0099 / 0.0594 ≈ 0.167 or 16.7%
Surprising Result: Even with a positive test, there's only a 16.7% chance you have the disease! This is because the disease is rare, so false positives from the large healthy group (0.0495) outnumber true positives (0.0099).
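The two steps of the medical example can be wrapped into a single function for the two-hypothesis case (A versus not A). The function name `bayes` and its parameter names are illustrative assumptions; inputs are decimal strings so the result is exact.

```python
from fractions import Fraction

def bayes(prior, p_b_given_a, p_b_given_not_a):
    """P(A | B) via Bayes' Theorem with P(B) expanded by total probability."""
    prior = Fraction(prior)
    p_b_given_a = Fraction(p_b_given_a)
    p_b_given_not_a = Fraction(p_b_given_not_a)
    evidence = p_b_given_a * prior + p_b_given_not_a * (1 - prior)  # P(B)
    return p_b_given_a * prior / evidence

# Medical test: prior 1%, positive rate 99% if diseased, 5% if healthy
p_disease_given_positive = bayes("0.01", "0.99", "0.05")
# Student example: 60% boys, 30% of boys and 10% of girls play football
p_boy_given_football = bayes("0.6", "0.3", "0.1")

print(p_disease_given_positive, float(p_disease_given_positive))
print(p_boy_given_football, float(p_boy_given_football))
```

Trying a larger prior, for example `bayes("0.10", "0.99", "0.05")`, shows how strongly the posterior depends on how common the condition is.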
Proof of Bayes' Theorem
Start with conditional probability definition:
(1) P(A | B) = P(A and B) / P(B)
(2) P(B | A) = P(A and B) / P(A)
From equation (2):
P(A and B) = P(B | A) × P(A)
Substitute into equation (1):
P(A | B) = [P(B | A) × P(A)] / P(B)
This is Bayes' Theorem! ✓
Real-World Applications
- Spam Filtering: Given this email contains certain words, what's the probability it's spam?
- Medical Diagnosis: Given these symptoms, what's the probability of this disease?
- Machine Learning: Naive Bayes classifiers use this for text classification
- Weather Forecasting: Given current conditions, what's the probability of rain?
Practice Questions
Question 1: Die Roll
You roll a fair six-sided die. What is the probability of getting an even number?
P(even) = 3/6 = 1/2 = 0.5 or 50%
Question 2: Two Coins
Two coins are tossed. Find the sample space and the probability of getting at least one head.
Answer: Sample space S = {HH, HT, TH, TT} = 4 outcomes
At least one head = {HH, HT, TH} = 3 outcomes
P(at least one head) = 3/4 = 0.75 or 75%
Alternative: P(at least one head) = 1 - P(no heads) = 1 - 1/4 = 3/4
Question 3: Two Dice Sum
Two six-sided dice are rolled. What is the probability that the sum is 7?
Total outcomes: 6 × 6 = 36
Favorable outcomes: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) = 6 outcomes
P(sum = 7) = 6/36 = 1/6 ≈ 0.167 or 16.7%
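Answers to two-dice questions are easy to check by enumerating all 36 rolls. The sketch below tallies every sum with `collections.Counter` and also counts the rolls where both dice are even; the helper name `p_sum` is an assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 ordered (die 1, die 2) pairs
sum_counts = Counter(a + b for a, b in rolls)  # how many rolls give each sum

def p_sum(total):
    """Probability that two fair dice add up to `total`."""
    return Fraction(sum_counts[total], len(rolls))

both_even = Fraction(
    sum(1 for a, b in rolls if a % 2 == 0 and b % 2 == 0), len(rolls)
)

print(p_sum(7))   # 1/6
print(both_even)  # 1/4
```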
Question 4: Ball Selection
A bag contains 3 red, 2 blue, and 5 green balls. What is the probability of randomly selecting a red ball?
P(red) = 3/10 = 0.3 or 30%
Question 5: Card Drawing
A card is drawn at random from a deck. Find the probability of:
a) Getting a heart
b) Getting a face card
a) Hearts: 13 of the 52 cards
P(heart) = 13/52 = 1/4 = 0.25 or 25%
b) Face cards: J, Q, K in 4 suits = 12 cards
P(face card) = 12/52 = 3/13 ≈ 0.231 or 23.1%
Question 6: Not Green
A box contains 4 red, 3 green, and 2 blue balls. One ball is picked at random. What is the probability that the ball is not green?
P(green) = 3/9 = 1/3
P(not green) = 1 - 1/3 = 2/3 ≈ 0.667 or 66.7%
Question 7: Multiple of 3 or 5
A number is chosen at random from {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. What is the probability that the number is a multiple of 3 or 5?
Multiples of 3: {3, 6, 9} = 3 numbers
Multiples of 5: {5, 10} = 2 numbers
Common: None
P(3 or 5) = 5/10 = 1/2 = 0.5 or 50%
Question 8: With Replacement
A bag contains 6 white and 4 black balls. Two balls are drawn with replacement. What is the probability that both are white?
P(first white) = 6/10 = 3/5
P(second white) = 6/10 = 3/5 (replaced, so same)
P(both white) = 3/5 × 3/5 = 9/25 = 0.36 or 36%
Question 9: Red Card or Queen
A card is drawn from a standard deck. What is the probability of drawing a red card or a queen?
P(red) = 26/52 = 1/2
P(queen) = 4/52 = 1/13
P(red and queen) = 2/52 = 1/26 (2 red queens)
P(red or queen) = 1/2 + 1/13 - 1/26 = 13/26 + 2/26 - 1/26 = 14/26 = 7/13 ≈ 0.538 or 53.8%
Question 10: Two Dice - Both Even
If two dice are rolled, what is the probability that both show even numbers?
P(first even) = 3/6 = 1/2
P(second even) = 3/6 = 1/2
P(both even) = 1/2 × 1/2 = 1/4 = 0.25 or 25%
Congratulations! You've learned:
✓ What probability is and why it matters in data science
✓ Theoretical vs empirical probability
✓ How to calculate basic probabilities
✓ Sample spaces for coins, dice, and cards
✓ Complement rule: P(A') = 1 - P(A)
✓ Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
✓ Multiplication rule for independent and dependent events
✓ Conditional probability: P(A|B) = P(A and B) / P(B)
✓ Law of Total Probability
✓ Bayes' Theorem