PSTAT 5A: Conditional Probability Continued & Bayes’ Theorem

Lecture 6

Author

Narjes Mathlouthi

Published

July 29, 2025

Today’s Learning Objectives

By the end of this lecture, you will be able to:

Define probability and understand its basic properties
Identify sample spaces and events
Apply fundamental probability rules
Calculate conditional probabilities
Determine when events are independent
Use Bayes’ theorem in simple applications

Mutually Exclusive vs. Independent

Mutually Exclusive (left): the circles A and B do not overlap, so \(P(A\cap B)=0\).
Independent (right): the circles overlap, and we’ve sized the intersection so that \(P(A\cap B)=P(A)\,P(B)\).

Mutually Exclusive vs. Independent Example

Draw a single card from a 52-card deck:

Let A={“draw an Ace”}, so P(A)=4/52.
Let B={“draw a King”}, so P(B)=4/52.

Q: What is \(P(A\cap B)\) ?

Solution. They’re disjoint (you can’t draw an Ace and a King), so \(P(A\cap B) = 0\).

But \(P(A)\,P(B) = \frac{4}{52}\times\frac{4}{52} = \frac{16}{2704} \neq 0\).

Hence, \(P(A\cap B)\neq P(A)P(B)\), so they’re not independent.
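The comparison above can be checked by brute force. The following is a minimal sketch, assuming a standard deck modeled as (rank, suit) pairs; the helper names `prob` and `independent` are illustrative and not part of the lecture. For contrast it also tests "Ace" against "Heart", a pair that is not in the example but does satisfy the product rule.

```python
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["hearts", "diamonds", "clubs", "spades"]
DECK = [(rank, suit) for rank in RANKS for suit in SUITS]  # 52 equally likely cards


def prob(event):
    """Exact probability that one uniformly drawn card satisfies `event`."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))


def is_ace(card):
    return card[0] == "A"


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


def independent(event_a, event_b):
    """Check the product rule P(A and B) == P(A) * P(B)."""
    p_joint = prob(lambda card: event_a(card) and event_b(card))
    return p_joint == prob(event_a) * prob(event_b)


print(prob(lambda c: is_ace(c) and is_king(c)))  # 0
print(prob(is_ace) * prob(is_king))              # 1/169
print(independent(is_ace, is_king))              # False
print(independent(is_ace, is_heart))             # True
```

Mutually exclusive events with nonzero probabilities can never pass this check, because the joint probability is 0 while the product is positive.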

Multiplication Rule

General case: \(P(A \cap B) = P(A) \times P(B|A)\)

Independent events: \(P(A \cap B) = P(A) \times P(B)\)

Tree Diagrams

🎯 Definition Tree diagrams help visualize sequential events and calculate probabilities.

Tree Diagram Examples

Practice Problem 2

A jar contains 5 red balls and 3 blue balls. Two balls are drawn without replacement.

What’s the probability both balls are red?
What’s the probability the first is red and second is blue?

Solution.

\(P(\text{both red}) = \frac{5}{8} \times \frac{4}{7} = \frac{20}{56} = \frac{5}{14}\)
\(P(\text{red then blue}) = \frac{5}{8} \times \frac{3}{7} = \frac{15}{56}\)
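As a sanity check on the multiplication rule, the two answers can be reproduced by listing every ordered pair of distinct balls. This is only a sketch: the labels `"R"`/`"B"` and the helper `prob` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

balls = ["R"] * 5 + ["B"] * 3  # 5 red, 3 blue

# Ordered pairs of distinct ball positions: drawing twice without replacement.
draws = list(permutations(range(len(balls)), 2))  # 8 * 7 = 56 equally likely outcomes


def prob(event):
    """Exact probability that (first color, second color) satisfies `event`."""
    favorable = sum(1 for i, j in draws if event(balls[i], balls[j]))
    return Fraction(favorable, len(draws))


p_both_red = prob(lambda first, second: first == "R" and second == "R")
p_red_then_blue = prob(lambda first, second: first == "R" and second == "B")

print(p_both_red)       # 5/14
print(p_red_then_blue)  # 15/56
```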

Law of Total Probability

🎯 Definition

If events \(B_1, B_2, \ldots, B_n\) form a partition of the sample space, then:

\[P(A) = P(A|B_1)P(B_1) + P(A|B_2)P(B_2) + \cdots + P(A|B_n)P(B_n)\]

Law of Total Probability Example

A factory has two machines:

Machine 1: Produces 60% of items, 5% defective
Machine 2: Produces 40% of items, 3% defective

Q: What’s the overall probability an item is defective?

Solution. \(P(\text{defective}) = P(D|M_1)P(M_1) + P(D|M_2)P(M_2)\)

\(= 0.05 \times 0.6 + 0.03 \times 0.4 = 0.03 + 0.012 = 0.042\)
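The same weighted sum can be written as a short computation. This is a sketch under the assumption that the two machines form a partition (their shares sum to 1); the dictionary layout and key names are illustrative.

```python
# Each machine: share of production P(M_i) and defect rate P(D | M_i).
machines = {
    "M1": {"share": 0.60, "defect_rate": 0.05},
    "M2": {"share": 0.40, "defect_rate": 0.03},
}

# The law of total probability requires the shares to sum to 1.
total_share = sum(m["share"] for m in machines.values())
assert abs(total_share - 1.0) < 1e-12

# P(D) = sum over machines of P(D | M_i) * P(M_i)
p_defective = sum(m["defect_rate"] * m["share"] for m in machines.values())

print(round(p_defective, 3))  # 0.042
```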

Bayes’ Theorem

🎯 Definition \[P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}\]

This allows us to “reverse” conditional probabilities

Named after Thomas Bayes (1701-1761)

Bayes’ Theorem Components

\(A,B\): Events
\(P(A|B)\): Posterior probability - what we want to find
\(P(B|A)\): Likelihood - given \(A\), probability of observing \(B\)
\(P(A)\): Prior probability - initial probability of \(A\)
\(P(B)\): Marginal probability - total probability of \(B\)

Bayes’ Theorem Example

Medical test for a disease: ¹

Disease affects 1% of population
Test is 95% accurate for sick people: it correctly returns a positive for 95% of them
Test is 90% accurate for healthy people: it correctly returns a negative for 90% of them

Q: If someone tests positive, what’s the probability they have the disease?

Bayes’ Theorem Solution

Let:

\(D\): Person has disease
\(T^+\): Test is positive

Given:

\(P(D) = 0.01\)
\(P(T^+|D) = 0.95\)
\(P(T^-|D^c) = 0.90\), so \(P(T^+|D^c) = 0.10\)

Solution. \(P(T^+) = P(T^+|D)P(D) + P(T^+|D^c)P(D^c)\)

\(= 0.95 \times 0.01 + 0.10 \times 0.99 = 0.1085\)

Bayes’ Theorem Solution (cont.)

\[P(D|T^+) = \frac{P(T^+|D) \times P(D)}{P(T^+)} = \frac{0.95 \times 0.01}{0.1085} \approx 0.088\]

Surprising result: Even with a positive test, there’s only an 8.8% chance of having the disease!

This is due to the low base rate of the disease
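To see where the 8.8% comes from, the calculation can be sketched as a small function and cross-checked with whole-number counts. The function name `posterior_positive`, its parameter names, and the hypothetical population of 10,000 people are assumptions for illustration only.

```python
def posterior_positive(prior, p_pos_given_disease, p_neg_given_healthy):
    """P(D | T+) via Bayes' theorem with the law of total probability."""
    p_pos_given_healthy = 1 - p_neg_given_healthy  # false positive rate
    p_positive = (p_pos_given_disease * prior
                  + p_pos_given_healthy * (1 - prior))
    return p_pos_given_disease * prior / p_positive


p = posterior_positive(prior=0.01,
                       p_pos_given_disease=0.95,
                       p_neg_given_healthy=0.90)
print(round(p, 3))  # 0.088

# Cross-check with counts in a hypothetical population of 10,000 people.
sick = 100                  # 1% of 10,000
healthy = 9_900
true_positives = 95         # 95% of the 100 sick people
false_positives = 990       # 10% of the 9,900 healthy people
p_from_counts = true_positives / (true_positives + false_positives)
print(round(p_from_counts, 3))  # 0.088
```

The counts make the base-rate effect visible: the 990 false positives from the large healthy group far outnumber the 95 true positives.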

Common Probability Mistakes

Confusing \(P(A|B)\) with \(P(B|A)\)

The prosecutor’s fallacy is a specific error in interpreting conditional probabilities: confusing

\(P(\text{Evidence}\mid\text{Innocent}) \quad\text{with}\quad P(\text{Innocent}\mid\text{Evidence})\).

Example: the O. J. Simpson case ²

Common Probability Mistakes (cont.)

Assuming independence when events are dependent
Ignoring base rates (as in the medical test example)

Base rate fallacy is when you ignore or underweight the prior probability \(P(H)\) of a hypothesis, focusing only on the new evidence \(E\).

Double counting in union calculations

Practice Problem 3

Two fair dice are rolled. Find:

\(P(\text{sum} = 7)\)
\(P(\text{sum} = 7 | \text{first die shows 3})\)
Are these events independent?

Solution.

6 ways out of 36: \(P(\text{sum} = 7) = \frac{6}{36} = \frac{1}{6}\)
Given first die is 3, need second die to be 4: \(P(\text{sum} = 7 | \text{first} = 3) = \frac{1}{6}\)
Yes, they’re independent since \(P(A|B) = P(A)\)
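The three answers can be confirmed by enumerating all 36 outcomes. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 equally likely (first, second) rolls

sum_is_7 = [o for o in outcomes if sum(o) == 7]
first_is_3 = [o for o in outcomes if o[0] == 3]
both = [o for o in outcomes if sum(o) == 7 and o[0] == 3]

p_sum7 = Fraction(len(sum_is_7), len(outcomes))
# Conditioning on "first die shows 3" restricts the sample space to those 6 outcomes.
p_sum7_given_first3 = Fraction(len(both), len(first_is_3))

print(p_sum7)                         # 1/6
print(p_sum7_given_first3)            # 1/6
print(p_sum7_given_first3 == p_sum7)  # True, so the events are independent
```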

Counting and Probability

Sometimes we need to count outcomes:

Multiplication Principle: If task 1 can be done in \(m\) ways and task 2 in \(n\) ways, both can be done in \(m \times n\) ways

Permutations: Arrangements where order matters \[P(n,r) = \frac{n!}{(n-r)!}\]

Combinations: Selections where order doesn’t matter \[C(n,r) = \binom{n}{r} = \frac{n!}{r!(n-r)!}\]

Counting Example

Q: How many ways can you arrange 5 people in a row?

Solution. This is a permutation: \(P(5,5) = 5! = 120\) ways

Q: How many ways can you choose 3 people from 5 for a committee?

Solution. This is a combination: \(C(5,3) = \binom{5}{3} = \frac{5!}{3!2!} = 10\) ways
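Both counts are available in Python's standard library as `math.perm` and `math.comb` (Python 3.8 or later). The sketch below also lists the arrangements and committees explicitly with `itertools`; the names of the five people are made up for illustration.

```python
import math
from itertools import combinations, permutations

people = ["Ana", "Ben", "Chloe", "Dev", "Eli"]  # hypothetical names

# Order matters: arrangements of all 5 people in a row.
print(math.perm(5, 5))                   # 120
print(len(list(permutations(people))))   # 120

# Order does not matter: committees of 3 chosen from 5.
print(math.comb(5, 3))                       # 10
print(len(list(combinations(people, 3))))    # 10
```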

Probability with Counting

Example: A committee of 3 people is chosen from 8 people (5 women, 3 men). What’s the probability all 3 are women?

Solution. Total ways to choose 3 from 8: \(\binom{8}{3} = 56\)

Ways to choose 3 women from 5: \(\binom{5}{3} = 10\)

Probability: \(\frac{10}{56} = \frac{5}{28}\)
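The counting answer can be cross-checked against the multiplication rule for draws without replacement. This is a sketch with illustrative variable names; both routes should give the same fraction.

```python
import math
from fractions import Fraction

women, men, committee_size = 5, 3, 3
total_people = women + men

# Counting approach: favorable committees over all committees.
p_counting = Fraction(math.comb(women, committee_size),
                      math.comb(total_people, committee_size))

# Sequential approach: pick one person at a time without replacement.
p_sequential = Fraction(5, 8) * Fraction(4, 7) * Fraction(3, 6)

print(p_counting)                  # 5/28
print(p_sequential)                # 5/28
print(p_counting == p_sequential)  # True
```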

Real-World Applications

Medical Diagnosis: Using Bayes’ theorem for test interpretation

Quality Control: Probability of defective items

Finance: Risk assessment and portfolio theory

Sports: Probability of wins, fantasy sports

Insurance: Calculating premiums based on risk

Key Formulas Summary

Basic probability (equally likely outcomes): \(P(A) = \frac{\text{favorable outcomes}}{\text{total outcomes}}\)
Complement: \(P(A^c) = 1 - P(A)\)
Addition: \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
Conditional: \(P(A|B) = \frac{P(A \cap B)}{P(B)}\)
Independence: \(P(A \cap B) = P(A) \times P(B)\)
Bayes’: \(P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}\)

Problem-Solving Strategy

Identify the sample space and events
Determine if events are independent or mutually exclusive
Choose the appropriate rule or formula
Calculate step by step
Check if your answer makes sense

Practice Problem 4

A bag contains 4 red, 3 blue, and 2 green marbles. Three marbles are drawn without replacement.

Find the probability that:

All three are red
No two are the same color
At least one is blue

Practice Problem 4 Solutions

Solution.

All red: \(\frac{4}{9} \times \frac{3}{8} \times \frac{2}{7} = \frac{24}{504} = \frac{1}{21}\)
Different colors: \(\frac{4 \times 3 \times 2}{9 \times 8 \times 7} \times 3! = \frac{24 \times 6}{504} = \frac{144}{504} = \frac{2}{7}\)
At least one blue: \(1 - P(\text{no blue}) = 1 - \frac{6 \times 5 \times 4}{9 \times 8 \times 7} = 1 - \frac{120}{504} = \frac{384}{504} = \frac{16}{21}\)
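All three answers can be verified by enumerating the \(\binom{9}{3} = 84\) unordered draws. This is a minimal sketch; the color labels and the helper `prob` are assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations

marbles = ["red"] * 4 + ["blue"] * 3 + ["green"] * 2

# Every unordered set of 3 distinct marbles is equally likely.
hands = [tuple(marbles[i] for i in idx)
         for idx in combinations(range(len(marbles)), 3)]


def prob(event):
    """Exact probability that a 3-marble draw satisfies `event`."""
    return Fraction(sum(1 for hand in hands if event(hand)), len(hands))


p_all_red = prob(lambda hand: all(color == "red" for color in hand))
p_all_different = prob(lambda hand: len(set(hand)) == 3)
p_at_least_one_blue = prob(lambda hand: "blue" in hand)

print(p_all_red)            # 1/21
print(p_all_different)      # 2/7
print(p_at_least_one_blue)  # 16/21
```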

Common Questions

Q1: “Why isn’t \(P(A \cup B) = P(A) + P(B)\) always?”

A: We’d double-count outcomes in both events
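A small illustration of the double counting, using an example that is not from the lecture: roll one fair die, with \(A\) = "even" and \(B\) = "greater than 3". The outcomes 4 and 6 belong to both events, so adding \(P(A)\) and \(P(B)\) counts them twice.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one fair die
A = {2, 4, 6}                      # even
B = {4, 5, 6}                      # greater than 3


def prob(event):
    """Exact probability of an event (a subset of the sample space)."""
    return Fraction(len(event), len(sample_space))


naive_sum = prob(A) + prob(B)                 # counts 4 and 6 twice
corrected = prob(A) + prob(B) - prob(A & B)   # addition rule

print(naive_sum)    # 1
print(corrected)    # 2/3
print(prob(A | B))  # 2/3
```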

Q2: “How do I know if events are independent?”

A: Check if \(P(A|B) = P(A)\) or if \(P(A \cap B) = P(A) \times P(B)\)

Q3: “When do I use Bayes’ theorem?”

A: When you want to “reverse” a conditional probability

Q3 note (Bayes Example)

Forward: I know my test picks up disease 95% of the time ⇒ \(P(+\mid D)=0.95\).
Reverse: I want the chance I really have the disease when the test is positive ⇒ \(P(D\mid +)\).

Looking Ahead

Next lecture:

Random Variables and Probability Distributions
Discrete vs. continuous random variables
Expected value and variance

Final Thoughts

Probability is the foundation of statistics:

Helps us quantify uncertainty
Provides tools for making decisions with incomplete information
Essential for understanding statistical inference

Practice: The key to mastering probability is working through many problems!

Questions?

Office Hours: Thursdays at 11 AM on Zoom (link on Canvas)

Email: nmathlouthi@ucsb.edu

Next Class: Random Variables and Distributions

Footnotes

¹ Joint probability of being disease-free and testing negative: \(P(D^c \cap T^-) = P(D^c) \times P(T^-|D^c) = 0.99 \times 0.90 = 0.891\) (89.1%).
² The DNA evidence in the O. J. Simpson trial is a classic example of the prosecutor’s fallacy. Prosecutors highlighted that the chance of a random person matching the crime-scene DNA was “one in 170 million,” then implied (or let the jury infer) that Simpson therefore had a 1 in 170 million chance of being innocent. The quoted figure is \(P(\text{Evidence}\mid\text{Innocent})\), not \(P(\text{Innocent}\mid\text{Evidence})\).
