Probability in Data Science

What is Probability?

Probability is a mathematical way to measure how likely something is to happen. Think of it as a number between 0 and 1 that tells us the chance of an event occurring.

Probability Scale:
0 = Impossible (will never happen)
0.5 = Equally likely (50-50 chance)
1 = Certain (will definitely happen)

Simple Examples

Tossing a coin: Getting heads has a probability of 0.5 (50%)
Rolling a die: Getting a 6 has a probability of 1/6 ≈ 0.167 (16.7%)
Drawing a card: Getting a heart has a probability of 13/52 = 0.25 (25%)

Why Probability Matters in Data Science

Probability is the foundation of data science. It helps us:

Handle Uncertainty: Real-world data is messy and unpredictable. Probability helps us make sense of it.
Make Predictions: We can estimate how likely different outcomes are for customer behavior, stock price movements, or weather patterns.
Evaluate Risk: Companies use probability to assess risks before making decisions.
Build Machine Learning Models: Many algorithms like Naive Bayes, logistic regression, and neural networks rely on probability.

Real-World Business Scenarios

Marketing: What's the probability that a customer will click on an ad?
Manufacturing: What's the probability that a product will fail within one year?
Insurance: What's the probability of a car accident based on driver history?
Finance: What's the probability that a given transaction is fraudulent?

Types of Probability

1. Theoretical Probability

Theoretical probability is based on what should happen in theory, using logic and mathematics. We calculate it when we know all possible outcomes and each one is equally likely.

Theoretical Probability = Favorable Outcomes / Total Possible Outcomes

Example: Fair Coin

A fair coin has 2 sides: Heads and Tails

P(Heads) = 1/2 = 0.5

We don't need to flip the coin 1000 times to know this—it's based on the structure of the coin.

2. Empirical (Experimental) Probability

Empirical probability is based on actual observations and experiments. We collect data and see what actually happens.

Empirical Probability = Number of Times Event Occurred / Total Number of Trials

Example: Ad Click Rate

You run an online ad campaign and track results:

Total impressions: 10,000
Number of clicks: 250

P(Click) = 250/10,000 = 0.025 = 2.5%

This is based on real data, not theory.

Theoretical Probability	Empirical Probability
Based on logic and reasoning	Based on actual experiments
Works with fair/ideal conditions	Works with real-world data
Example: Rolling a fair die	Example: Customer conversion rates
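
The contrast in the table can be seen in a small simulation. The following is a minimal Python sketch, assuming a simulated fair die; the function name empirical_probability, the trial count, and the seed are illustrative choices, not part of the text above.

```python
import random

def empirical_probability(target, trials, seed=0):
    """Estimate P(die shows `target`) by simulating `trials` rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = sum(1 for _ in range(trials) if rng.randint(1, 6) == target)
    return hits / trials

# Theoretical value: one favorable face out of six equally likely faces.
theoretical = 1 / 6

# Empirical value: the fraction of simulated rolls that showed a 6.
estimate = empirical_probability(target=6, trials=100_000)

print(f"theoretical: {theoretical:.4f}")
print(f"empirical:   {estimate:.4f}")
```

The empirical estimate should land close to the theoretical value, and it tends to get closer as the number of trials grows.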

Calculating Probability

P(E) = Number of Favorable Outcomes / Total Number of Possible Outcomes

Where:

P(E) = Probability of event E happening
Favorable Outcomes = Outcomes that satisfy what we're looking for
Total Outcomes = All possible outcomes in the sample space (assumed equally likely)

Step-by-Step Example: Rolling a Die

Question: What is the probability of rolling a 3 on a fair six-sided die?

Step 1: Identify total possible outcomes

A die has 6 faces: {1, 2, 3, 4, 5, 6}
Total outcomes = 6

Step 2: Identify favorable outcomes

We want to roll a 3, so there's only 1 favorable outcome
Favorable outcomes = 1

Step 3: Apply the formula

P(rolling a 3) = 1/6 ≈ 0.1667 or 16.67%

More Examples

Rolling an Even Number

Even numbers on a die: {2, 4, 6} = 3 favorable outcomes

Total outcomes = 6

P(even number) = 3/6 = 0.5 or 50%

Drawing a Heart from Cards

A standard deck has 52 cards, of which 13 are hearts

P(heart) = 13/52 = 1/4 = 0.25 or 25%
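
The counting formula can be sketched in Python as follows. This is a minimal illustration that assumes every outcome is equally likely; the helper name probability is hypothetical, and Fraction is used only to keep the results exact.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(E) = favorable outcomes / total outcomes, assuming equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}

p_three = probability({3}, die)       # rolling a 3
p_even = probability({2, 4, 6}, die)  # rolling an even number

print(p_three, float(p_three))
print(p_even, float(p_even))
```

Here p_three is 1/6 and p_even is 1/2, matching the worked examples above.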

Types of Experiments & Sample Space

Types of Experiments

1. Deterministic Experiments

These experiments always produce the same result under identical conditions. There's no uncertainty.

Heating pure water to 100°C at sea-level pressure always makes it boil
2 + 2 always equals 4
Dropping a ball always makes it fall down (due to gravity)

2. Probabilistic (Random) Experiments

These experiments have uncertain outcomes that can vary even under identical conditions.

Tossing a coin (could be heads or tails)
Rolling a die (could be any number 1-6)
Drawing a card from a shuffled deck
Whether it will rain tomorrow

Sample Space

The sample space is the set of all possible outcomes of a probabilistic experiment. We usually denote it with the letter S.

Sample Space Examples

1. Rolling a Single Die

S = {1, 2, 3, 4, 5, 6}
Total outcomes = 6

2. Tossing a Single Coin

S = {Heads, Tails} or {H, T}
Total outcomes = 2

3. Tossing Two Coins

S = {(H,H), (H,T), (T,H), (T,T)}
Total outcomes = 4

Formulas for Quick Calculation:

For Coins: Number of outcomes = 2ⁿ (where n = number of coins)
Example: 3 coins → 2³ = 8 outcomes

For Dice: Number of outcomes = 6ⁿ (where n = number of dice)
Example: 2 dice → 6² = 36 outcomes
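
These sample spaces can also be enumerated directly instead of only counted. A short sketch using itertools.product from the Python standard library (the variable names are illustrative):

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# product(..., repeat=n) lists every ordered combination of n tosses or rolls.
two_coins = list(product(coin, repeat=2))
three_coins = list(product(coin, repeat=3))
two_dice = list(product(die, repeat=2))

print(two_coins)
print(len(three_coins))  # size predicted by 2**3
print(len(two_dice))     # size predicted by 6**2
```

The lengths agree with the 2ⁿ and 6ⁿ formulas: 4, 8, and 36 outcomes respectively.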

Sample Space for Cards

A standard deck contains 52 cards:

Color	Suit	Cards
Red (26 cards)	Hearts ♥	A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K
Red (26 cards)	Diamonds ♦	A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K
Black (26 cards)	Spades ♠	A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K
Black (26 cards)	Clubs ♣	A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K
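
The deck in the table can be built as a list of (rank, suit) pairs. This is one possible representation, chosen here for illustration; the names ranks, suit_colors, and deck are assumptions.

```python
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suit_colors = {"Hearts": "Red", "Diamonds": "Red", "Spades": "Black", "Clubs": "Black"}

# Every (rank, suit) pair is one card, so the deck is the full sample space.
deck = list(product(ranks, suit_colors))

red_cards = [card for card in deck if suit_colors[card[1]] == "Red"]
hearts = [card for card in deck if card[1] == "Hearts"]

print(len(deck), len(red_cards), len(hearts))
```

The counts come out to 52 cards, 26 red cards, and 13 hearts, as in the table.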

Basic Rules of Probability

Important Definitions

Experiment: An action whose result is one of several possible outcomes (e.g., rolling a die, drawing a card)
Outcome: A single result from an experiment (e.g., rolling a 3)
Sample Space (S): All possible outcomes
Event (E): A subset of the sample space—one or more outcomes we're interested in

Example: Rolling a Die

Sample Space: S = {1, 2, 3, 4, 5, 6}

Event (rolling an even number): E = {2, 4, 6}

P(E) = 3/6 = 0.5

Special Types of Events

1. Mutually Exclusive Events

Two events that cannot happen at the same time.

P(A ∩ B) = 0

Example: Rolling a 2 and rolling a 5 on a single die roll. These events cannot both occur—the die can only show one number at a time.

2. Independent Events

Two events where the occurrence of one doesn't change the probability of the other.

P(A ∩ B) = P(A) × P(B)

Example: Tossing two coins—getting heads on the first coin doesn't change the probability of getting heads on the second coin.

3. Certain Event

An event that will definitely happen.

P(Certain Event) = 1

Example: Rolling a number between 1 and 6 on a standard die.

4. Impossible Event

An event that cannot happen.

P(Impossible Event) = 0

Example: Rolling a 7 on a standard six-sided die.

5. Exhaustive Events

A set of events that cover all possible outcomes—at least one must occur.

Example: For a die roll: {1,2,3} and {4,5,6} are exhaustive because together they cover all outcomes.
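
Because events are subsets of the sample space, Python sets are a natural way to check these properties. A minimal sketch, with hypothetical helper names mutually_exclusive and exhaustive:

```python
sample_space = {1, 2, 3, 4, 5, 6}

def mutually_exclusive(a, b):
    """Two events are mutually exclusive when they share no outcome."""
    return a.isdisjoint(b)

def exhaustive(events, space):
    """A collection of events is exhaustive when its union is the whole sample space."""
    return set().union(*events) == space

low, high = {1, 2, 3}, {4, 5, 6}
even = {2, 4, 6}

print(mutually_exclusive({2}, {5}))           # rolling a 2 vs rolling a 5
print(mutually_exclusive(low, even))          # both contain the outcome 2
print(exhaustive([low, high], sample_space))  # together they cover 1 through 6
print(exhaustive([low, even], sample_space))  # the outcome 5 is missing
```

The four checks print True, False, True, and False in that order.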

Probability Notation

P(A ∪ B): Probability of A OR B occurring (union)
P(A ∩ B): Probability of A AND B occurring (intersection)
P(A'): Probability of A NOT occurring (complement)

The Complement Rule

The complement of an event A is simply the event that A does NOT occur.

Notation: We write it as A' or A^c (read as "A complement")

P(A') = 1 - P(A)

This makes sense because:

Either event A happens, or it doesn't
These two possibilities cover all outcomes
Their probabilities must add up to 1 (certainty)

Examples

Example 1: Weather Forecast

If the probability of rain today is 0.3:

P(Rain) = 0.3

P(No Rain) = 1 - 0.3 = 0.7

There's a 70% chance it won't rain.

Example 2: Rolling a Die

What's the probability of NOT rolling a 6?

P(rolling a 6) = 1/6

P(not rolling a 6) = 1 - 1/6 = 5/6 ≈ 0.833

Example 3: Quality Control in Manufacturing

A factory knows that 2% of products are defective:

P(Defective) = 0.02

P(Not Defective) = 1 - 0.02 = 0.98

So 98% of products are not defective.
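
The three examples share one line of arithmetic, sketched below. The function name complement and the range check are illustrative additions.

```python
from fractions import Fraction

def complement(p):
    """Return P(A') = 1 - P(A), after checking that p is a valid probability."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p

p_no_rain = complement(0.3)             # weather forecast
p_not_six = complement(Fraction(1, 6))  # die roll, kept exact with Fraction
p_not_defective = complement(0.02)      # quality control

print(p_no_rain, p_not_six, p_not_defective)
```

Using Fraction for the die keeps the result as the exact value 5/6 rather than a rounded decimal.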

The Addition Rule

The addition rule helps us find the probability of event A OR event B occurring.

P(A or B) = P(A) + P(B) - P(A and B)

Why subtract P(A and B)? Because when we add P(A) + P(B), we count the overlap twice. We need to subtract it once to get the correct answer.

Example with Numbers

P(A) = 0.4

P(B) = 0.5

P(A and B) = 0.2

Find P(A or B)

Solution:
P(A or B) = 0.4 + 0.5 - 0.2 = 0.7

Special Case: Mutually Exclusive Events

When events cannot happen together, P(A and B) = 0

P(A or B) = P(A) + P(B)

Example: Rolling a Die

What's the probability of rolling a 2 OR a 5?

These events are mutually exclusive (can't roll both)

P(2) = 1/6

P(5) = 1/6

P(2 or 5) = 1/6 + 1/6 = 2/6 = 1/3

Real-World Example: Card Drawing

What's the probability of drawing a red card OR a queen?

P(Red card) = 26/52 = 1/2

P(Queen) = 4/52 = 1/13

P(Red and Queen) = 2/52 = 1/26 (red queens: hearts and diamonds)

P(Red or Queen) = 1/2 + 1/13 - 1/26 = 13/26 + 2/26 - 1/26 = 14/26 = 7/13 ≈ 0.538
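
The card result can be cross-checked by counting. The sketch below builds a 52-card deck (the representation as (rank, suit) pairs is an assumption made for illustration) and compares the addition rule with a direct count of the union.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Spades", "Clubs"]
deck = set(product(ranks, suits))

red = {card for card in deck if card[1] in ("Hearts", "Diamonds")}
queens = {card for card in deck if card[0] == "Q"}

def p(event):
    """Probability of an event when every card is equally likely to be drawn."""
    return Fraction(len(event), len(deck))

# Addition rule: subtract the overlap (the two red queens) so it is counted once.
by_rule = p(red) + p(queens) - p(red & queens)

# Direct count of the cards that are red, queens, or both.
by_count = p(red | queens)

print(by_rule, by_count)
```

Both routes give 7/13, which confirms that subtracting P(A and B) removes the double-counted overlap.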

The Multiplication Rule

The multiplication rule helps us find the probability that two events occur together.

For Independent Events

When events don't affect each other:

P(A and B) = P(A) × P(B)

Example: Rolling a Die Twice

Event A: Rolling a 3 on first throw

Event B: Rolling an even number on second throw

P(A) = 1/6

P(B) = 3/6 = 1/2 (can roll 2, 4, or 6)

P(A and B) = 1/6 × 1/2 = 1/12 ≈ 0.083

For Dependent Events

When one event affects the probability of the other:

P(A and B) = P(A) × P(B | A)

Read as: "Probability of A times probability of B given A"

Example: Drawing Without Replacement

A bag contains 5 red and 3 blue balls. Two balls are drawn without replacement.

What's the probability both are red?

Event A: First ball is red
P(A) = 5/8

Event B: Second ball is red
After removing one red ball: 4 red left out of 7 total
P(B | A) = 4/7

P(Both red) = 5/8 × 4/7 = 20/56 = 5/14 ≈ 0.357

Scenario	Type	Formula
Drawing WITH replacement	Independent	P(A) × P(B)
Drawing WITHOUT replacement	Dependent	P(A) × P(B | A)
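
A short sketch of the bag example, comparing the multiplication rule with a brute-force count of every ordered pair of draws. The variable names are illustrative, and the with-replacement value is included only to show the contrast from the table.

```python
from fractions import Fraction
from itertools import permutations

bag = ["R"] * 5 + ["B"] * 3  # 5 red and 3 blue balls

# Multiplication rule for dependent events: P(A) * P(B | A).
p_first_red = Fraction(5, 8)
p_second_red_given_first = Fraction(4, 7)
by_rule = p_first_red * p_second_red_given_first

# Cross-check: list every ordered draw of two different balls (by position).
draws = list(permutations(range(len(bag)), 2))
both_red = [d for d in draws if bag[d[0]] == "R" and bag[d[1]] == "R"]
by_count = Fraction(len(both_red), len(draws))

# With replacement the draws are independent, so the second factor stays 5/8.
with_replacement = Fraction(5, 8) * Fraction(5, 8)

print(by_rule, by_count, with_replacement)
```

Without replacement both methods give 5/14; with replacement the result would be 25/64.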

Independent vs Dependent Events

Independent Events

The outcome of one event does not affect the other.

Examples of Independent Events

Tossing two coins: Getting heads on the first doesn't change the probability of heads on the second
Rolling two dice: What you roll on die 1 doesn't affect die 2
Drawing with replacement: You put the card back, so each draw is independent
Weather in different cities: Rain in New York doesn't directly affect rain in Tokyo

Dependent Events

One event affects the probability of the other.

Examples of Dependent Events

Drawing without replacement: Once you remove a card, the probabilities change for the next draw
Traffic: If there's an accident on the highway, the probability of delay increases
Sports: If a star player is injured, the team's probability of winning decreases
Medicine: Taking one medication might affect how another medication works

Testing for Independence

Two events A and B are independent if and only if:

P(A and B) = P(A) × P(B)

Example: Are These Independent?

You flip a coin and roll a die.

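Event A: Coin shows heads → P(A) = 1/2

Event B: Die shows 6 → P(B) = 1/6

The combined experiment has 2 × 6 = 12 equally likely outcomes, and only one of them is (Heads, 6), so P(A and B) = 1/12.

P(A) × P(B) = 1/2 × 1/6 = 1/12

The two values match, so the events are independent: the coin flip doesn't affect the die roll.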

Warning: Don't confuse "mutually exclusive" with "independent"!

Mutually exclusive: Events cannot happen together (if A occurs, B cannot), so when both have nonzero probability they are dependent, not independent
Independent: Events don't affect each other (whether A occurs doesn't change P(B))
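
The independence test can be run by enumerating the coin-and-die sample space. This is a minimal sketch with hypothetical helper names p and independent; it also checks a pair of mutually exclusive events to show that the two ideas differ.

```python
from fractions import Fraction
from itertools import product

# All 12 equally likely (coin, die) outcomes.
space = set(product(["H", "T"], range(1, 7)))

def p(event):
    """Probability of an event by counting outcomes in the sample space."""
    return Fraction(len(event), len(space))

def independent(a, b):
    """Events are independent exactly when P(A and B) = P(A) * P(B)."""
    return p(a & b) == p(a) * p(b)

heads = {o for o in space if o[0] == "H"}
six = {o for o in space if o[1] == 6}
two = {o for o in space if o[1] == 2}

print(independent(heads, six))  # coin result vs die result
print(independent(two, six))    # mutually exclusive die results
print(p(two & six), p(two) * p(six))
```

Heads and "die shows 6" pass the test. "Die shows 2" and "die shows 6" fail it: their intersection has probability 0, while the product of their probabilities is 1/36.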

Conditional Probability

Conditional probability is the probability of an event A occurring, given that another event B has already occurred.

P(A | B) = P(A and B) / P(B), where P(B) > 0

Read as: "Probability of A given B"

Why Is This Important?

In real life, we often get new information that changes probabilities. Conditional probability lets us update our beliefs based on what we know.

Simple Example: Ball Selection

You have a box with:

5 red balls
3 blue balls
2 green balls

You're told the ball picked is NOT green. What's the probability it's red?

Solution:

Event A = ball is red

Event B = ball is not green

If not green, we have 8 balls (5 red + 3 blue)

P(Red | Not Green) = 5/8 = 0.625

Example: Card Drawing

A card is drawn from a deck. Given that it's a face card (J, Q, K), what's the probability it's a King?

Total face cards = 12 (3 face ranks × 4 suits)

Kings among face cards = 4

P(King | Face Card) = 4/12 = 1/3
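
Both examples restrict attention to the outcomes inside B and then count how many of them are also in A. A minimal sketch under the equally-likely assumption; the function name conditional and the way balls and cards are labeled are illustrative.

```python
from fractions import Fraction

def conditional(a, b):
    """P(A | B) by counting outcomes, assuming they are equally likely."""
    if not b:
        raise ValueError("P(A | B) is undefined when B has no outcomes")
    return Fraction(len(a & b), len(b))

# Box of balls: number each ball so same-colored balls stay distinct outcomes.
box = (
    [("red", i) for i in range(5)]
    + [("blue", i) for i in range(3)]
    + [("green", i) for i in range(2)]
)
red = {ball for ball in box if ball[0] == "red"}
not_green = {ball for ball in box if ball[0] != "green"}
p_red_given_not_green = conditional(red, not_green)

# Face cards: 3 ranks (J, Q, K) in each of 4 suits.
face_cards = {(rank, suit) for rank in "JQK" for suit in "HDSC"}
kings = {card for card in face_cards if card[0] == "K"}
p_king_given_face = conditional(kings, face_cards)

print(p_red_given_not_green, p_king_given_face)
```

The results are 5/8 and 1/3, the same values obtained by hand above.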

Real-World Application: Student Performance

A class has:

60% boys, 40% girls
30% of boys play football
10% of girls play football

Question: If a randomly selected student plays football, what's the probability they're a boy?

Step 1: Find P(Football)

P(F) = P(F|Boy)×P(Boy) + P(F|Girl)×P(Girl)

P(F) = 0.3×0.6 + 0.1×0.4 = 0.18 + 0.04 = 0.22

Step 2: Find P(Boy|Football)

P(Boy|F) = P(F|Boy)×P(Boy) / P(F)

P(Boy|F) = 0.18 / 0.22 ≈ 0.818 or 81.8%

Law of Total Probability

The Law of Total Probability helps us find the probability of an event by breaking it down into simpler pieces.

P(A) = P(A|B₁)×P(B₁) + P(A|B₂)×P(B₂) + ... + P(A|Bₙ)×P(Bₙ)

When Can We Use This Law?

We need three conditions:

Partition of Sample Space: Events B₁, B₂, ..., Bₙ must be mutually exclusive and collectively exhaustive
Known Probabilities: We must know P(Bᵢ) for each event
Known Conditional Probabilities: We must know P(A|Bᵢ) for each event

Manufacturing from Multiple Factories

A company makes products in 3 factories:

Factory 1: Produces 50% of all products, 2% are defective
Factory 2: Produces 30% of all products, 3% are defective
Factory 3: Produces 20% of all products, 5% are defective

Question: What's the overall probability a random product is defective?

Solution:

P(Defective) = P(D|F1)×P(F1) + P(D|F2)×P(F2) + P(D|F3)×P(F3)

P(Defective) = 0.02×0.50 + 0.03×0.30 + 0.05×0.20

P(Defective) = 0.010 + 0.009 + 0.010 = 0.029 or 2.9%
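
The factory calculation is a weighted sum, which can be sketched as a small function. The name total_probability and its input checks are illustrative assumptions.

```python
import math

def total_probability(priors, likelihoods):
    """Return sum of P(A | B_i) * P(B_i) over a partition B_1, ..., B_n."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one conditional probability per partition event")
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("P(B_i) must sum to 1 for the B_i to partition the sample space")
    return sum(p_b * p_a_given_b for p_b, p_a_given_b in zip(priors, likelihoods))

factory_share = [0.50, 0.30, 0.20]  # P(F1), P(F2), P(F3)
defect_rate = [0.02, 0.03, 0.05]    # P(D | F1), P(D | F2), P(D | F3)

p_defective = total_probability(factory_share, defect_rate)
print(round(p_defective, 4))
```

The check that the shares sum to 1 mirrors the partition condition listed above; the result agrees with the 2.9% computed by hand, up to floating-point rounding.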

Disease Testing

A disease affects 1% of the population.

A test for the disease:

Returns positive 99% of the time if the person has the disease
Returns positive 5% of the time if the person doesn't have the disease (false positive)

Question: What's the probability of getting a positive test?

P(Positive) = P(Pos|Disease)×P(Disease) + P(Pos|No Disease)×P(No Disease)

P(Positive) = 0.99×0.01 + 0.05×0.99

P(Positive) = 0.0099 + 0.0495 = 0.0594 or 5.94%

Bayes' Theorem

Bayes' Theorem is one of the most powerful tools in probability and statistics. It lets us reverse conditional probabilities.

P(A | B) = [P(B | A) × P(A)] / P(B)

Or in expanded form:

P(A | B) = [P(B | A) × P(A)] / [P(B | A) × P(A) + P(B | Not A) × P(Not A)]

What Does It Mean?

P(A | B): What we want to find (posterior probability)
P(B | A): What we know (likelihood)
P(A): Prior probability (what we believed before seeing B)
P(B): Total probability of observing B

Classic Example: Medical Testing

A disease affects 1% of the population.

A test for the disease:

If you have the disease, the test is positive 99% of the time
If you don't have the disease, the test is positive 5% of the time

You test positive. What's the probability you have the disease?

Given:

P(Disease) = 0.01

P(No Disease) = 0.99

P(Positive | Disease) = 0.99

P(Positive | No Disease) = 0.05

Step 1: Calculate P(Positive)

P(Pos) = 0.99×0.01 + 0.05×0.99 = 0.0099 + 0.0495 = 0.0594

Step 2: Apply Bayes' Theorem

P(Disease | Positive) = 0.0099 / 0.0594 ≈ 0.167 or 16.7%

Surprising Result: Even with a positive test, there's only a 16.7% chance you have the disease! This is because the disease is rare: out of every 10,000 people, about 99 of the 100 who have the disease test positive, but so do about 495 of the 9,900 who don't.
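
The expanded form of Bayes' Theorem can be sketched as a small function. The name bayes_posterior and its parameter names are hypothetical; the numbers come from the medical test above and from the earlier football example.

```python
def bayes_posterior(prior, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) using the expanded form of Bayes' Theorem."""
    # Denominator: total probability of observing B.
    evidence = p_b_given_a * prior + p_b_given_not_a * (1 - prior)
    return p_b_given_a * prior / evidence

# Medical test: A = has the disease, B = tests positive.
p_disease_given_positive = bayes_posterior(
    prior=0.01, p_b_given_a=0.99, p_b_given_not_a=0.05
)

# Football example: A = student is a boy, B = student plays football.
p_boy_given_football = bayes_posterior(
    prior=0.6, p_b_given_a=0.3, p_b_given_not_a=0.1
)

print(round(p_disease_given_positive, 3))
print(round(p_boy_given_football, 3))
```

The two calls reproduce the hand calculations of roughly 16.7% and 81.8%.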

Proof of Bayes' Theorem

Derivation:

Start with conditional probability definition:
(1) P(A | B) = P(A and B) / P(B)
(2) P(B | A) = P(A and B) / P(A)

From equation (2):
P(A and B) = P(B | A) × P(A)

Substitute into equation (1):
P(A | B) = [P(B | A) × P(A)] / P(B)

This is Bayes' Theorem! ✓

Real-World Applications

Spam Filtering: Given this email contains certain words, what's the probability it's spam?
Medical Diagnosis: Given these symptoms, what's the probability of this disease?
Machine Learning: Naive Bayes classifiers use this for text classification
Weather Forecasting: Given current conditions, what's the probability of rain?

Practice Questions

Question 1: Die Roll

You roll a fair six-sided die. What is the probability of getting an even number?

Answer: Even numbers are {2, 4, 6} = 3 outcomes
P(even) = 3/6 = 1/2 = 0.5 or 50%

Question 2: Two Coins

Two coins are tossed. Find the sample space and the probability of getting at least one head.

Sample Space: {HH, HT, TH, TT}
Answer: At least one head = {HH, HT, TH} = 3 outcomes
P(at least one head) = 3/4 = 0.75 or 75%
Alternative: P(at least one head) = 1 - P(no heads) = 1 - 1/4 = 3/4

Question 3: Two Dice Sum

Two six-sided dice are rolled. What is the probability that the sum is 7?

Answer: Ways to get 7: (1,6), (2,5), (3,4), (4,3), (5,2), (6,1) = 6 ways
Total outcomes: 6 × 6 = 36
P(sum = 7) = 6/36 = 1/6 ≈ 0.167 or 16.7%
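
This answer can be verified by listing all 36 rolls and tallying the sums. A minimal sketch using the standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All ordered results of rolling two six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# Count how many rolls produce each possible sum.
sum_counts = Counter(first + second for first, second in rolls)

p_seven = Fraction(sum_counts[7], len(rolls))
print(sum_counts[7], len(rolls), p_seven)
```

Six of the 36 rolls sum to 7, giving 1/6.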

Question 4: Ball Selection

A bag contains 3 red, 2 blue, and 5 green balls. What is the probability of randomly selecting a red ball?

Answer: Total balls: 3 + 2 + 5 = 10
P(red) = 3/10 = 0.3 or 30%

Question 5: Card Drawing

A card is drawn at random from a standard 52-card deck. Find the probability of:

a) Getting a heart

b) Getting a face card

a) P(heart) = 13/52 = 1/4 = 0.25 or 25%
b) Face cards: J, Q, K in 4 suits = 12 cards
P(face card) = 12/52 = 3/13 ≈ 0.231 or 23.1%

Question 6: Not Green

A box contains 4 red, 3 green, and 2 blue balls. One ball is picked at random. What is the probability that the ball is not green?

Answer: Total balls: 4 + 3 + 2 = 9
P(green) = 3/9 = 1/3
P(not green) = 1 - 1/3 = 2/3 ≈ 0.667 or 66.7%

Question 7: Multiple of 3 or 5

A number is chosen at random from {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. What is the probability that the number is a multiple of 3 or 5?

Answer: Multiples of 3: {3, 6, 9} = 3 numbers
Multiples of 5: {5, 10} = 2 numbers
Common: None (no number from 1 to 10 is a multiple of both 3 and 5)
P(3 or 5) = 3/10 + 2/10 - 0 = 5/10 = 1/2 = 0.5 or 50%

Question 8: With Replacement

A bag contains 6 white and 4 black balls. Two balls are drawn with replacement. What is the probability that both are white?

Answer: Total balls: 10
P(first white) = 6/10 = 3/5
P(second white) = 6/10 = 3/5 (replaced, so same)
P(both white) = 3/5 × 3/5 = 9/25 = 0.36 or 36%

Question 9: Red Card or Queen

A card is drawn from a standard deck. What is the probability of drawing a red card or a queen?

Answer: P(red) = 26/52 = 1/2
P(queen) = 4/52 = 1/13
P(red and queen) = 2/52 = 1/26 (2 red queens)
P(red or queen) = 1/2 + 1/13 - 1/26 = 13/26 + 2/26 - 1/26 = 14/26 = 7/13 ≈ 0.538 or 53.8%

Question 10: Two Dice - Both Even

If two dice are rolled, what is the probability that both show even numbers?

Answer: P(first even) = 3/6 = 1/2
P(second even) = 3/6 = 1/2
P(both even) = 1/2 × 1/2 = 1/4 = 0.25 or 25%

Summary of Key Concepts

Congratulations! You've learned:
✓ What probability is and why it matters in data science
✓ Theoretical vs empirical probability
✓ How to calculate basic probabilities
✓ Sample spaces for coins, dice, and cards
✓ Complement rule: P(A') = 1 - P(A)
✓ Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
✓ Multiplication rule for independent and dependent events
✓ Conditional probability: P(A|B) = P(A and B) / P(B)
✓ Law of Total Probability
✓ Bayes' Theorem