Lab 4: Introduction to Probability with Python

PSTAT 5A - Summer Session A 2025

Author

Instructor: Narjes Mathlouthi

Published

July 29, 2025

Welcome to Lab 4! Today we’re going to learn about probability using Python. Don’t worry if you’re new to this - we’ll go step by step and you’ll be guided through everything!

Getting Started

⏱️ Estimated time: 2 minutes

Navigate to our class Jupyterhub Instance. Create a new notebook and rename it “lab4”. For instructions on how to create a notebook, review detailed instructions in lab1.

First, let’s load the tools we need. Just copy this code block and run it as is - no changes needed!

# Load our tools (libraries)
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import pandas as pd

# Make our graphs look nice
%matplotlib inline
plt.style.use('seaborn-v0_8-whitegrid')

print("✅ All tools loaded successfully!")
✅ All tools loaded successfully!
Note

Think of these import statements like getting tools out of a toolbox. We’re getting:

  • numpy (for math)

  • matplotlib (for making graphs)

  • scipy.stats (for probability calculations)

Part 1: Understanding Probability with Coins

⏱️ Estimated time: 10 minutes

Let’s start with something you know - flipping coins!

Step 1: A Fair Coin

When we flip a fair coin:

  • Heads happens 50% of the time

  • Tails happens 50% of the time

Important

What “fair” means: the coin has no bias toward heads or tails, so each side is exactly as likely.
In other words, \(P(\text{Heads}) = P(\text{Tails}) = 0.5\).

Let’s tell Python about this:

# A fair coin - just copy and run this code!
outcomes = ["Tails", "Heads"]
probabilities = [0.5, 0.5]  # 50% each

print("Possible outcomes:", outcomes)
print("Probabilities:", probabilities)
print("Total probability:", sum(probabilities))  # Should be 1.0

Let’s make a plot of this:

# Make a bar chart - just copy and run this code!
plt.figure(figsize=(8, 5))
plt.bar(outcomes, probabilities, color=['lightcoral', 'lightblue'], alpha=0.7, edgecolor='black')
plt.title('Fair Coin: 50% Heads, 50% Tails')
plt.ylabel('Probability')
plt.ylim(0, 0.6)
plt.show()

Step 2: Your Turn - A Biased Coin

Now let’s try a biased coin (not fair). This coin lands on Heads 70% of the time!

Task 1: Biased Coin Probability

⏱️ Estimated time: 3 minutes

Instructions: Replace the ___ with the correct numbers.

# A biased coin that lands Heads 70% of the time
biased_outcomes = ["Tails", "Heads"]
biased_probabilities = [___, ___]  # Hint: If Heads is 70%, what is Tails?

print("Biased coin outcomes:", biased_outcomes)
print("Biased coin probabilities:", biased_probabilities)
print("Total probability:", sum(biased_probabilities))

# Check your answer: this should print 1.0

Now make a graph of your biased coin:

# Copy the graphing code from above and modify the title
plt.figure(figsize=(8, 5))
plt.bar(___, ___, color=['lightcoral', 'lightblue'], alpha=0.7, edgecolor='black')
plt.title('___')  # Change this title to describe your biased coin
plt.ylabel('Probability')
plt.ylim(0, 0.8)
plt.show()

Part 2: Calculating Expected Values (Averages)

⏱️ Estimated time: 8 minutes

The expected value tells us what we’d expect “on average” if we did the experiment many times.

Example: Expected Value of a Fair Coin

Let’s say Heads = 1 point and Tails = 0 points.

# Expected value calculation for fair coin
# Formula: Expected Value = (Value1 × Probability1) + (Value2 × Probability2)

coin_values = [0, 1]  # Tails = 0, Heads = 1
coin_probs = [0.5, 0.5]

expected_value = 0 * 0.5 + 1 * 0.5
print(f"Expected value of fair coin: {expected_value}")

# This means: if we flip many times, we expect about 0.5 points per flip on average
Expected value of fair coin: 0.5

Task 2: Expected Value of A Biased Coin

⏱️ Estimated time: 4 minutes

Calculate the expected value for your biased coin (Heads 70%, Tails 30%).

Step 1: Fill in the calculation

# Expected value for biased coin
# Values: Tails = 0, Heads = 1
# Probabilities: Tails = 30%, Heads = 70%

expected_biased = 0 * ___ + 1 * ___
print(f"Expected value of biased coin: {expected_biased}")

Step 2: What does this mean?

print(f"This means: if we flip the biased coin many times,")
print(f"we expect about {expected_biased} points per flip on average.")
print(f"Since this is closer to 1 than 0.5, the coin favors ___")  
# Fill in: Heads or Tails?

Part 3: Using Python’s Built-in Probability Tools

⏱️ Estimated time: 10 minutes

Instead of doing all calculations by hand, Python has built-in tools! Let’s learn to use scipy.stats.

Bernoulli Distribution (Single Coin Flip)

A Bernoulli distribution models any single “yes/no” or “success/failure” experiment. It has just one parameter, \(p\), which is the probability of success.

  • We record a 1 if the outcome is a “success”

  • We record a 0 if the outcome is a “failure”

The probabilities are \(P(X=1)=p,\quad P(X=0)=1-p.\)

Note

Coin‐flip example: If we let “Heads” be a success, then a fair coin flip is
\(X \sim \mathrm{Bernoulli}(p=0.5)\), so \(P(X=1)=0.5\) (Heads) and \(P(X=0)=0.5\) (Tails).

Flipping a coin once is therefore exactly a Bernoulli trial.

# Create a Bernoulli distribution for a fair coin
fair_coin = stats.bernoulli(0.5)  # 0.5 = 50% chance of success (Heads)

# Ask for probabilities
prob_tails = fair_coin.pmf(0)  # pmf = "probability mass function"
prob_heads = fair_coin.pmf(1)

print(f"Probability of Tails (0): {prob_tails}")
print(f"Probability of Heads (1): {prob_heads}")

# Get expected value automatically!
print(f"Expected value: {fair_coin.mean()}")
Probability of Tails (0): 0.4999999999999999
Probability of Heads (1): 0.5
Expected value: 0.5

Task 3: Biased Coin with Scipy

⏱️ Estimated time: 4 minutes

Create a Bernoulli distribution for your biased coin (70% Heads).

# Create Bernoulli distribution for biased coin
biased_coin = stats.bernoulli(___)  # Fill in: what's the probability of Heads?

# Calculate probabilities
prob_tails_biased = biased_coin.pmf(0)
prob_heads_biased = biased_coin.pmf(1)

print(f"Biased coin - Probability of Tails: {prob_tails_biased}")
print(f"Biased coin - Probability of Heads: {prob_heads_biased}")
print(f"Expected value: {biased_coin.mean()}")

Question: Does this match what you calculated by hand in Task 2? ___

Visualizing Different Coins

Let’s compare three different coins:

# Compare three coins with different bias
fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Three different coins
coin_types = [
    {"prob": 0.2, "name": "Mostly Tails"},
    {"prob": 0.5, "name": "Fair Coin"}, 
    {"prob": 0.8, "name": "Mostly Heads"}
]

for i, coin in enumerate(coin_types):
    # Create the distribution
    distribution = stats.bernoulli(coin["prob"])
    
    # Get probabilities
    probs = [distribution.pmf(0), distribution.pmf(1)]
    
    # Make bar chart
    axes[i].bar([0, 1], probs, color=['lightcoral', 'lightblue'], alpha=0.7, edgecolor='black')
    axes[i].set_title(f'{coin["name"]}\n(p = {coin["prob"]})')
    axes[i].set_xlabel('Outcome')
    axes[i].set_ylabel('Probability')
    axes[i].set_xticks([0, 1])
    axes[i].set_xticklabels(['Tails', 'Heads'])
    axes[i].set_ylim(0, 1)

plt.tight_layout()
plt.show()

Part 4: Multiple Coin Flips (Binomial Distribution)

⏱️ Estimated time: 12 minutes

What if we flip a coin multiple times? How many Heads will we get?

Example: 5 Fair Coin Flips

# Binomial distribution: 5 fair coin flips
n_flips = 5        # number of flips
p_heads = 0.5      # probability of heads each time

five_flips = stats.binom(n_flips, p_heads)

print("Flipping 5 fair coins...")
print(f"Probability of 0 heads: {five_flips.pmf(0):.4f}")
print(f"Probability of 1 head:  {five_flips.pmf(1):.4f}")
print(f"Probability of 2 heads: {five_flips.pmf(2):.4f}")
print(f"Probability of 3 heads: {five_flips.pmf(3):.4f}")
print(f"Probability of 4 heads: {five_flips.pmf(4):.4f}")
print(f"Probability of 5 heads: {five_flips.pmf(5):.4f}")

print(f"\nExpected number of heads: {five_flips.mean()}")
Flipping 5 fair coins...
Probability of 0 heads: 0.0312
Probability of 1 head:  0.1562
Probability of 2 heads: 0.3125
Probability of 3 heads: 0.3125
Probability of 4 heads: 0.1562
Probability of 5 heads: 0.0312

Expected number of heads: 2.5

Let’s make a graph:

# Graph showing all possibilities
possible_heads = [0, 1, 2, 3, 4, 5]
probabilities = [five_flips.pmf(k) for k in possible_heads]

plt.figure(figsize=(10, 6))
plt.bar(possible_heads, probabilities, alpha=0.7, color='lightgreen', edgecolor='black')
plt.xlabel('Number of Heads')
plt.ylabel('Probability')
plt.title('5 Fair Coin Flips: How Many Heads?')
plt.axvline(x=five_flips.mean(), color='red', linestyle='--', linewidth=2, 
            label=f'Expected = {five_flips.mean()}')
plt.legend()
plt.show()

Task 4: Basketball Free Throws

⏱️ Estimated time: 8 minutes

A basketball player makes 60% of their free throws. They shoot 10 free throws. How many will they make?

Step 1: Set up the problem

# Basketball free throws
n_shots = ___      # How many shots? (Fill in)
p_make = ___       # Probability of making each shot? (Fill in as decimal)

basketball = stats.binom(___, ___)  # Create the distribution

Step 2: Answer these questions

# a) What's the probability of making exactly 6 shots?
prob_exactly_6 = basketball.pmf(___)
print(f"Probability of exactly 6 makes: {prob_exactly_6:.4f}")

# b) How many shots do we expect them to make?
expected_makes = basketball.mean()
print(f"Expected number of makes: {expected_makes}")

# c) What's the probability of making 8 or more shots?
prob_8_or_more = (basketball.pmf(8) + basketball.pmf(9) + basketball.pmf(10))
print(f"Probability of 8+ makes: {prob_8_or_more:.4f}")

Step 3: Make a graph (fill in the blanks)

# Create bar chart
possible_makes = list(range(0, ___))  # 0 to 10 makes
probabilities = [basketball.pmf(k) for k in possible_makes]

plt.figure(figsize=(10, 6))
plt.bar(possible_makes, probabilities, alpha=0.7, color='___', edgecolor='black')
plt.xlabel('___')
plt.ylabel('___')
plt.title('___')  # Give it a descriptive title
plt.axvline(x=expected_makes, color='red', linestyle='--', linewidth=2, 
            label=f'Expected = {expected_makes}')
plt.legend()
plt.show()

Part 5: Events Happening Over Time (Poisson Distribution)

⏱️ Estimated time: 10 minutes

Sometimes we want to count events that happen over time, like:

  • Customers entering a store per hour

  • Emails arriving per day

  • Phone calls per minute

Example: Store Customers

# On average, 4 customers enter the store per hour
average_customers = 4

store_customers = stats.poisson(average_customers)

print("Store customer arrivals per hour:")
print(f"Probability of 0 customers: {store_customers.pmf(0):.4f}")
print(f"Probability of 2 customers: {store_customers.pmf(2):.4f}")
print(f"Probability of 4 customers: {store_customers.pmf(4):.4f}")
print(f"Probability of 6 customers: {store_customers.pmf(6):.4f}")

print(f"\nExpected customers per hour: {store_customers.mean()}")
Store customer arrivals per hour:
Probability of 0 customers: 0.0183
Probability of 2 customers: 0.1465
Probability of 4 customers: 0.1954
Probability of 6 customers: 0.1042

Expected customers per hour: 4.0

Let’s visualize this:

# Graph customer arrivals
possible_customers = list(range(0, 12))  # 0 to 11 customers
probabilities = [store_customers.pmf(k) for k in possible_customers]

plt.figure(figsize=(10, 6))
plt.bar(possible_customers, probabilities, alpha=0.7, color='purple', edgecolor='black')
plt.xlabel('Number of Customers per Hour')
plt.ylabel('Probability')
plt.title('Store Customer Arrivals (Average = 4 per hour)')
plt.axvline(x=average_customers, color='red', linestyle='--', linewidth=2, 
            label=f'Average = {average_customers}')
plt.legend()
plt.show()

Task 5: Your Email Inbox

⏱️ Estimated time: 6 minutes

Your email inbox receives an average of 3 emails per hour.

Step 1: Set up the Poisson distribution

# Email arrivals
average_emails = ___  # Fill in the average
email_arrivals = stats.poisson(___)

Step 2: Answer these questions

# a) Probability of getting exactly 3 emails in an hour
prob_exactly_3 = email_arrivals.pmf(___)
print(f"Probability of exactly 3 emails: {prob_exactly_3:.4f}")

# b) Probability of getting no emails (quiet hour!)
prob_no_emails = email_arrivals.pmf(___)
print(f"Probability of no emails: {prob_no_emails:.4f}")

# c) Expected number of emails per hour
expected_emails = email_arrivals.___()  # Fill in the method
print(f"Expected emails per hour: {expected_emails}")

Step 3: Make a graph (copy and modify the code from the store example)

# Your graph code here (copy from above and modify)

Part 6: Practice - Which Distribution Should I Use?

⏱️ Estimated time: 8 minutes

Task 6: Distribution Detective

⏱️ Estimated time: 8 minutes

For each scenario, I’ll tell you which distribution to use, then you solve it!

Scenario A: You flip a coin 8 times. What’s the probability of getting exactly 5 heads? This is BINOMIAL because: fixed number of trials, same probability each time

# Scenario A - Fill in the blanks
n = ___        # number of flips
p = ___        # probability of heads (assuming fair coin)
scenario_a = stats.binom(___, ___)

answer_a = scenario_a.pmf(___)  # exactly 5 heads
print(f"Scenario A answer: {answer_a:.4f}")

Scenario B: A coffee shop serves an average of 6 customers per hour. What’s the probability of serving exactly 4 customers in an hour? This is POISSON because: events happening over time at a constant rate

# Scenario B - Fill in the blanks
average_rate = ___  # customers per hour
scenario_b = stats.poisson(___)

answer_b = scenario_b.pmf(___)  # exactly 4 customers
print(f"Scenario B answer: {answer_b:.4f}")

Scenario C: A student has a 80% chance of getting a homework problem correct. What’s the probability they get their next problem right? This is BERNOULLI because: single trial, two outcomes

# Scenario C - Fill in the blanks
p_correct = ___  # probability of getting it right
scenario_c = stats.bernoulli(___)

answer_c = scenario_c.pmf(___)  # gets it right (value = 1)
print(f"Scenario C answer: {answer_c:.4f}")

Part 7: Simple Simulation (Optional)

⏱️ Estimated time: 5 minutes

Let’s check our math by having the computer “flip coins” for us!

Note

A simulation is when we use a computer to mimic a real-world or theoretical process—in this case, coin flips—over and over. By running many virtual flips, we can observe how often Heads or Tails actually occur and see these frequencies settle around the probabilities we calculated. Simulations are a powerful tool for validating our theory and building intuition about randomness.

# Simulate flipping 10 fair coins many times
np.random.seed(42)  # This makes everyone get the same "random" results

# Our setup
n_flips = 10
p_heads = 0.5
n_experiments = 1000  # We'll do this 1000 times

# What does theory say?
theory = stats.binom(n_flips, p_heads)
theoretical_prob = theory.pmf(5)  # exactly 5 heads
print(f"Theory says P(exactly 5 heads) = {theoretical_prob:.4f}")

# Now let's simulate it
count_5_heads = 0

for experiment in range(n_experiments):
    # Flip 10 coins
    flips = np.random.binomial(1, p_heads, n_flips)
    num_heads = sum(flips)
    
    if num_heads == 5:
        count_5_heads += 1

# Calculate what we observed
observed_prob = count_5_heads / n_experiments
print(f"Simulation says P(exactly 5 heads) = {observed_prob:.4f}")
print(f"Difference: {abs(theoretical_prob - observed_prob):.4f}")
print("Theory and simulation should be close!")
Theory says P(exactly 5 heads) = 0.2461
Simulation says P(exactly 5 heads) = 0.2680
Difference: 0.0219
Theory and simulation should be close!

Summary: What We Learned

⏱️ Estimated time: 2 minutes

🎉 Congratulations! You’ve learned:

  1. Basic Probability: How to calculate chances of events
  2. Expected Values: What to expect “on average”
  3. Three Important Distributions:
    • Bernoulli: One coin flip (single trial)
    • Binomial: Multiple coin flips (multiple trials)
    • Poisson: Events over time (like customers arriving)
  4. Python Tools: Using scipy.stats to do calculations
  5. Visualization: Making graphs to see probability distributions
Quick Reference Card

When to use each distribution:

  • Bernoulli: “Will the next customer buy something?” (single yes/no)

  • Binomial: “How many customers will buy something out of 20?” (counting successes)

  • Poisson: “How many customers will arrive in an hour?” (events over time)

Key Python commands: - stats.bernoulli(p) - single trial - stats.binom(n, p) - multiple trials - stats.poisson(rate) - events over time - .pmf(k) - probability of exactly k - .mean() - expected value

What’s Next?

In future labs, we’ll learn about:

  • Continuous probability distributions (like heights, weights)

  • Hypothesis testing (is a coin really fair?)

  • Confidence intervals (estimating with uncertainty)

Great job completing your first probability lab! 🎯