Review of Concepts, Applications & Confidence Intervals Intro
By the end of this session, you will be able to:
Key Point: A random variable is not actually “random”; it is a deterministic function applied to random outcomes!
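A minimal sketch of this idea, assuming a two-dice experiment chosen purely for illustration (the function name `total` is hypothetical): the randomness lives in the outcome, while the random variable is an ordinary function of that outcome.

```python
import random

def total(outcome):
    """The random variable X: a deterministic function of the outcome."""
    return outcome[0] + outcome[1]

random.seed(1)  # fixed seed so the run is reproducible

# The random part: rolling two dice produces an outcome
outcome = (random.randint(1, 6), random.randint(1, 6))

# The deterministic part: the same outcome always maps to the same value
x = total(outcome)
print(f"outcome = {outcome}, X = {x}")
```

Calling `total` twice on the same outcome always returns the same number; only the roll itself varies from run to run.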
Think About Your Major/Research Area
Take 2 minutes to brainstorm:
Examples by Field
Psychology: Reaction times, survey responses
Biology: Species counts, gene expression levels
Economics: Stock prices, unemployment rates
Engineering: System failures, signal strength
Note: If X is discrete, then X can take values x_1, x_2, x_3, \ldots, where we can list all possible values.
Note: If X is continuous, then X can take any value in an interval [a,b] or (-\infty, \infty).
These properties are:
Fixed number of trials (n)
Each trial has two outcomes
Constant probability of success
Trials are independent
Example: Number of successful research grants out of 10 applications
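As a rough illustration of the grant example, the sketch below uses `scipy.stats.binom`. The success probability of 0.2 per application is an assumed value, not a figure from the session.

```python
from scipy import stats

# Assumed for illustration: 10 applications, each funded with probability 0.2
n, p = 10, 0.2
grants = stats.binom(n, p)

prob_exactly_2 = grants.pmf(2)        # P(X = 2)
prob_at_least_1 = 1 - grants.pmf(0)   # P(X >= 1)
expected = grants.mean()              # equals n * p

print(f"P(X = 2)  = {prob_exactly_2:.4f}")
print(f"P(X >= 1) = {prob_at_least_1:.4f}")
print(f"E[X]      = {expected:.1f}")
```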
Models rare events
Events occur independently
Constant average rate
Useful for counts over time/space
Example: Number of emails received per hour, number of mutations in DNA sequences
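A short sketch of the email example, assuming an average of 4 emails per hour (an illustrative value). It also checks a well-known Poisson property: the mean and variance are both equal to the rate.

```python
from scipy import stats

rate = 4  # assumed average number of emails per hour
emails = stats.poisson(mu=rate)

p_none = emails.pmf(0)         # P(X = 0), a quiet hour
p_more_than_6 = emails.sf(6)   # P(X > 6), a busy hour

print(f"P(no emails)   = {p_none:.4f}")
print(f"P(more than 6) = {p_more_than_6:.4f}")
print(f"mean = {emails.mean():.1f}, variance = {emails.var():.1f}")
```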
Bell-shaped curve
Symmetric around mean
Parameters: \mu (mean), \sigma (standard deviation)
Many natural phenomena follow this pattern
Example: Heights, test scores, measurement errors
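To make the bell-curve shape concrete, the sketch below computes how much probability falls within one and two standard deviations of the mean. The height parameters (mean 170 cm, standard deviation 8 cm) are assumed values for illustration only.

```python
from scipy import stats

mu, sigma = 170, 8  # assumed height distribution in cm
heights = stats.norm(loc=mu, scale=sigma)

# Probability of falling within 1 and 2 standard deviations of the mean
within_1sd = heights.cdf(mu + sigma) - heights.cdf(mu - sigma)
within_2sd = heights.cdf(mu + 2 * sigma) - heights.cdf(mu - 2 * sigma)

print(f"P(within 1 sd) = {within_1sd:.4f}")
print(f"P(within 2 sd) = {within_2sd:.4f}")
```

These proportions do not depend on the particular \mu and \sigma chosen; any normal distribution gives the same values.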
Models waiting times
Memoryless property
Parameter: \lambda (rate)
Right-skewed
Example: Time between arrivals, equipment lifespan, time to next earthquake
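The memoryless property says that P(X > s + t | X > s) = P(X > t): having already waited s units does not change the distribution of the remaining wait. A minimal numerical check, assuming a rate of \lambda = 1.5 and arbitrary illustrative times s and t:

```python
from scipy import stats

lam = 1.5                          # assumed rate (events per unit time)
wait = stats.expon(scale=1 / lam)  # SciPy parameterizes by scale = 1 / rate

s, t = 1.0, 0.5  # illustrative times

# P(X > s + t | X > s) using the survival function sf(x) = P(X > x)
p_conditional = wait.sf(s + t) / wait.sf(s)
# P(X > t) for a fresh wait
p_fresh = wait.sf(t)

print(f"P(X > s+t | X > s) = {p_conditional:.6f}")
print(f"P(X > t)           = {p_fresh:.6f}")
print(f"mean waiting time  = {wait.mean():.4f}")
```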
Group Discussion (5 minutes)
For each scenario, identify:
1. Is the random variable discrete or continuous?
2. What distribution might it follow?
3. What are the parameters?
Scenarios:
- Number of students attending office hours per week
- Time spent studying for an exam
- Number of typos in a research paper
- Body temperature of patients in a hospital
Consider your research question:
Discrete: Probability Mass Function (PMF)
P(X = x) for specific values
Sums to 1 over all possible values
Can find exact probabilities
Example: P(X = 3) = 0.2
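A small sketch of a PMF as a lookup table. The probabilities below are made up for illustration, chosen only so that P(X = 3) = 0.2 as in the example and so that the values sum to 1.

```python
import math

# Hypothetical PMF: each possible value maps to its probability
pmf = {1: 0.10, 2: 0.30, 3: 0.20, 4: 0.25, 5: 0.15}

total = sum(pmf.values())   # should be 1 for a valid PMF
p_three = pmf[3]            # exact probability of a single value
p_at_most_3 = sum(prob for x, prob in pmf.items() if x <= 3)  # P(X <= 3)

print(f"sum of PMF = {total:.2f}")
print(f"P(X = 3)   = {p_three}")
print(f"P(X <= 3)  = {p_at_most_3:.2f}")
```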
Continuous: Probability Density Function (PDF)
f(x) represents density
Area under curve = 1
P(X = x) = 0 for any specific value
Find probabilities over intervals
Example: P(a < X < b) = \int_{a}^{b} f(x)dx
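For a continuous variable the same ideas become integrals. The sketch below uses a standard normal density and arbitrary illustrative endpoints a = -1 and b = 2, integrating numerically with `scipy.integrate.quad` and comparing against the CDF difference.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

a, b = -1.0, 2.0  # illustrative interval endpoints

# P(a < X < b) as the area under the density between a and b
area, _ = quad(stats.norm.pdf, a, b)

# The same probability from the cumulative distribution function
via_cdf = stats.norm.cdf(b) - stats.norm.cdf(a)

# Total area under the density over the whole real line
total_area, _ = quad(stats.norm.pdf, -np.inf, np.inf)

print(f"integral of f(x) on [a, b] = {area:.6f}")
print(f"F(b) - F(a)                = {via_cdf:.6f}")
print(f"total area                 = {total_area:.6f}")
```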
95% Confidence Interval Formula: \bar{x} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}
Interpretation: “We are 95% confident that the true population mean lies within this interval”
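A minimal sketch of applying the formula, assuming the population standard deviation \sigma is known (as the formula requires). The data are simulated, and the values of the mean, \sigma, and n are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility

sigma = 10.0  # assumed known population standard deviation
sample = rng.normal(loc=100.0, scale=sigma, size=50)  # simulated data

x_bar = sample.mean()
margin = 1.96 * sigma / np.sqrt(sample.size)  # margin of error
ci = (x_bar - margin, x_bar + margin)

print(f"sample mean = {x_bar:.2f}")
print(f"95% CI      = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Because the margin of error shrinks with \sqrt{n}, quadrupling the sample size only halves the interval width.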
Common Misconceptions
❌ WRONG: “There’s a 95% probability that μ is in this specific interval”
✅ CORRECT: “If we repeated this process many times, 95% of the intervals we construct would contain the true μ”
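The repeated-sampling interpretation can be checked by simulation. The sketch below draws many samples from a population with a known mean, builds an interval from each using the formula with known \sigma, and counts how often the interval contains the true mean. All parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)  # seeded for reproducibility

mu, sigma = 50.0, 10.0  # assumed true population parameters
n, reps = 30, 10_000    # sample size and number of repeated samples

# Each row is one sample of size n
samples = rng.normal(mu, sigma, size=(reps, n))
means = samples.mean(axis=1)

margin = 1.96 * sigma / np.sqrt(n)

# For each interval, check whether it contains the true mean
covered = (means - margin <= mu) & (mu <= means + margin)
coverage = covered.mean()

print(f"Proportion of intervals containing mu: {coverage:.3f}")
```

The proportion should land close to 0.95. Any single interval either contains μ or it does not; the 95% describes the procedure, not one particular interval.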
For Your Research/Interests
Share with the class:
What random variables are important in your field of study/major?
Which distributions might be most relevant?
What challenges do you anticipate in data collection?
Thank you for your participation!
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import seaborn as sns
# Generate random samples from different distributions
# Binomial
binom_data = np.random.binomial(n=10, p=0.3, size=100)
# Poisson
poisson_data = np.random.poisson(lam=3, size=100)
# Normal
normal_data = np.random.normal(loc=0, scale=1, size=100)
# Exponential
exp_data = np.random.exponential(scale=1/1.5, size=100)
# Create histograms
fig, axes = plt.subplots(2, 2, figsize=(12, 8))
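# Discrete data: center one bin on each integer value
axes[0,0].hist(binom_data, bins=np.arange(-0.5, 11.5, 1), alpha=0.7, color='steelblue')
axes[0,0].set_title('Binomial Sample')
axes[0,1].hist(poisson_data, bins=np.arange(-0.5, poisson_data.max() + 1.5, 1), alpha=0.7, color='coral')
axes[0,1].set_title('Poisson Sample')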
axes[1,0].hist(normal_data, bins=20, alpha=0.7, color='lightblue')
axes[1,0].set_title('Normal Sample')
axes[1,1].hist(exp_data, bins=20, alpha=0.7, color='lightgreen')
axes[1,1].set_title('Exponential Sample')
plt.tight_layout()
plt.show()
# Useful Python libraries for statistics and probability
import numpy as np # Numerical computing
import scipy.stats as stats # Statistical functions
import matplotlib.pyplot as plt # Plotting
import seaborn as sns # Statistical visualization
import pandas as pd # Data manipulation
# Quick reference for common distributions:
# stats.binom.pmf(k, n, p) # Binomial PMF
# stats.poisson.pmf(k, lam) # Poisson PMF
# stats.norm.pdf(x, mu, sigma) # Normal PDF
# stats.expon.pdf(x, scale=1/lam) # Exponential PDF (pass scale by keyword; the second positional argument is loc)
# Generate random samples:
# np.random.binomial(n, p, size)
# np.random.poisson(lam, size)
# np.random.normal(mu, sigma, size)
# np.random.exponential(scale, size)
Understanding Data – Random Variables © 2025 Narjes Mathlouthi