Lecture 8
2025-07-10
Discrete Random Variables
From outcomes to numbers: quantifying randomness
By the end of this lecture, you will be able to:
• Use Python to compute probabilities and distribution parameters

Definition: A random variable is a function that assigns a numerical value to each outcome of a random experiment.
Notation: Usually denoted by capital letters \(X\), \(Y\), \(Z\)
Key insight: Random variables transform outcomes into numbers, making statistical analysis possible
Random Variable X maps each die face to its numerical value.
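In code, a random variable is simply a function from outcomes to numbers. A minimal Python sketch (the functions `X` and `Y` below are illustrative, not standard library code):

```python
# A random variable assigns a number to each outcome of an experiment.

def X(face: int) -> int:
    """Die roll: the numerical value is the face itself."""
    return face

def Y(flips: str) -> int:
    """Two coin flips, e.g. 'HT': the value is the number of heads."""
    return flips.count("H")

print(X(4))     # 4
print(Y("HT"))  # 1
```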
Random variables allow us to apply numerical summaries and statistical tools to random outcomes.
Examples: Height, test scores, number of defects, wait times, stock prices
Today we focus on discrete random variables - notice there are gaps between possible values!
Discrete Random Variables
Takes on a countable number of values
Can list all possible values
Examples:
• Dice rolls: {1, 2, 3, 4, 5, 6}
• Number of emails: {0, 1, 2, 3, …}
• Quiz scores: {0, 1, 2, …, 10}

Continuous Random Variables
Takes on uncountably many values
Cannot list all possible values
Examples:
• Height: Any value in \([0, \infty)\)
• Time: Any positive real number
• Temperature: Any real number
Definition: The Probability Mass Function (PMF) of a discrete random variable \(X\) is:
\[P(X = x) = \text{probability that } X \text{ takes the value } x\]
Properties of PMF:
1. \(P(X = x) \geq 0\) for all \(x\)
2. \(\sum_{\text{all } x} P(X = x) = 1\)
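A quick Python sketch checking both properties for a fair six-sided die (the `pmf` dictionary here is illustrative):

```python
# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {x: 1 / 6 for x in range(1, 7)}

# Property 1: every probability is non-negative.
assert all(p >= 0 for p in pmf.values())

# Property 2: the probabilities sum to 1 (up to floating-point rounding).
assert abs(sum(pmf.values()) - 1.0) < 1e-12
```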
Interactive demo: theoretical vs. observed frequencies from repeated die rolls.
Let \(X\) = number of heads in two coin flips
Sample Space: \(\{HH, HT, TH, TT\}\)
| \(x\) (heads) | Outcomes | \(P(X = x)\) |
|---|---|---|
| 0 | TT | 0.25 |
| 1 | HT, TH | 0.50 |
| 2 | HH | 0.25 |
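A small simulation sketch in Python (the seed and number of trials are arbitrary) that estimates this PMF empirically and compares it with the theoretical values:

```python
import random
from collections import Counter

random.seed(0)
trials = 100_000

# For each trial, flip two fair coins and count the heads (True counts as 1).
counts = Counter(sum(random.random() < 0.5 for _ in range(2)) for _ in range(trials))

theoretical = {0: 0.25, 1: 0.50, 2: 0.25}
for x in sorted(theoretical):
    print(x, theoretical[x], round(counts[x] / trials, 4))
```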
The cumulative distribution function of a random variable \(X\) is:
\[F(x) = P(X \leq x)\]
Properties of CDF:
1. \(F(x)\) is non-decreasing
2. \(\lim_{x \to -\infty} F(x) = 0\)
3. \(\lim_{x \to \infty} F(x) = 1\)
4. \(F(x)\) is right-continuous
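A minimal Python sketch that builds \(F(x)\) for the two-coin-flip example by accumulating PMF values (the `cdf` helper is illustrative):

```python
# PMF of X = number of heads in two fair coin flips.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def cdf(x, pmf=pmf):
    """F(x) = P(X <= x): add up the PMF over all values <= x."""
    return sum(p for value, p in pmf.items() if value <= x)

print(cdf(-1))  # 0     (below the support)
print(cdf(1))   # 0.75  (P(X <= 1) = 0.25 + 0.50)
print(cdf(5))   # 1.0   (above the support)
```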
The expected value of a discrete random variable \(X\) is:
\[E[X] = \mu = \sum_{\text{all } x} x \cdot P(X = x)\]
The variance of a random variable \(X\) measures spread around the mean:
\[\text{Var}(X) = \sigma^2 = E[(X - \mu)^2] = E[X^2] - (E[X])^2\]
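A short Python sketch computing both quantities directly from a PMF, using a fair six-sided die as an illustrative example:

```python
# PMF of a fair six-sided die.
pmf = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())              # E[X] = 3.5
second_moment = sum(x**2 * p for x, p in pmf.items())  # E[X^2] = 91/6
variance = second_moment - mean**2                     # Var(X) = 35/12

print(round(mean, 4), round(variance, 4))              # 3.5 2.9167
```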
Expected value represents the long-run average if we repeat the experiment many times.
Watch how the sample mean converges to the expected value!
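A minimal simulation sketch in Python (the seed and checkpoints are arbitrary) showing the running sample mean of die rolls drifting toward \(E[X] = 3.5\):

```python
import random

random.seed(1)
rolls = [random.randint(1, 6) for _ in range(100_000)]

# Print the sample mean at a few checkpoints; it approaches E[X] = 3.5.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(sum(rolls[:n]) / n, 4))
```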
Bernoulli Distribution
Single trial, two outcomes
Parameters: \(p\) (success probability)
PMF: \(P(X = 1) = p\), \(P(X = 0) = 1-p\)
Mean: \(p\)
Variance: \(p(1-p)\)

Binomial Distribution
\(n\) independent Bernoulli trials
Parameters: \(n\) (trials), \(p\) (success prob.)
PMF: \(P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}\)
Mean: \(np\)
Variance: \(np(1-p)\)

Geometric Distribution
Trials until first success
Parameters: \(p\) (success probability)
PMF: \(P(X = k) = (1-p)^{k-1} p\)
Mean: \(1/p\)
Variance: \((1-p)/p^2\)

Poisson Distribution
Events in fixed interval
Parameters: \(\lambda\) (average rate)
PMF: \(P(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}\)
Mean: \(\lambda\)
Variance: \(\lambda\)
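A short sketch using `scipy.stats` (assuming SciPy is installed; the parameter values below are arbitrary) to evaluate each PMF and confirm the mean and variance formulas:

```python
from scipy import stats

p, n, lam = 0.3, 10, 2.0

# Each line prints: one PMF value, then the distribution's mean and variance.
print(stats.bernoulli.pmf(1, p), stats.bernoulli.mean(p), stats.bernoulli.var(p))   # p, p(1-p)
print(stats.binom.pmf(3, n, p), stats.binom.mean(n, p), stats.binom.var(n, p))      # np, np(1-p)
print(stats.geom.pmf(4, p), stats.geom.mean(p), stats.geom.var(p))                  # 1/p, (1-p)/p^2
print(stats.poisson.pmf(2, lam), stats.poisson.mean(lam), stats.poisson.var(lam))   # lambda, lambda
```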
A box contains 3 red balls and 2 blue balls. Two balls are drawn without replacement. Let \(X\) = number of red balls drawn. Find the PMF of \(X\).
Solution. \(X\) can take values 0, 1, or 2.
\(P(X = 0) = \frac{\binom{3}{0}\binom{2}{2}}{\binom{5}{2}} = \frac{1 \times 1}{10} = \frac{1}{10}\)
\(P(X = 1) = \frac{\binom{3}{1}\binom{2}{1}}{\binom{5}{2}} = \frac{3 \times 2}{10} = \frac{6}{10}\)
\(P(X = 2) = \frac{\binom{3}{2}\binom{2}{0}}{\binom{5}{2}} = \frac{3 \times 1}{10} = \frac{3}{10}\)
Check: \(\frac{1}{10} + \frac{6}{10} + \frac{3}{10} = 1\) ✓
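A quick Python check of this PMF with `math.comb` (illustrative sketch):

```python
from math import comb

# X = number of red balls in 2 draws (without replacement) from 3 red and 2 blue.
total = comb(5, 2)  # 10 equally likely pairs
pmf = {k: comb(3, k) * comb(2, 2 - k) / total for k in range(3)}

print(pmf)                           # {0: 0.1, 1: 0.6, 2: 0.3}
print(round(sum(pmf.values()), 12))  # 1.0
```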
Using the red balls example from Problem 1, find \(E[X]\) and \(\text{Var}(X)\).
Solution. Expected Value: \[E[X] = 0 \times \frac{1}{10} + 1 \times \frac{6}{10} + 2 \times \frac{3}{10} = 0 + \frac{6}{10} + \frac{6}{10} = 1.2\]
Variance: \[E[X^2] = 0^2 \times \frac{1}{10} + 1^2 \times \frac{6}{10} + 2^2 \times \frac{3}{10} = 0 + \frac{6}{10} + \frac{12}{10} = 1.8\]
\[\text{Var}(X) = E[X^2] - (E[X])^2 = 1.8 - (1.2)^2 = 1.8 - 1.44 = 0.36\]
Standard Deviation: \(\sigma = \sqrt{0.36} = 0.6\)
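A short Python sketch confirming these values from the PMF found in Problem 1:

```python
# PMF of X from Problem 1.
pmf = {0: 0.1, 1: 0.6, 2: 0.3}

mean = sum(x * p for x, p in pmf.items())                   # E[X] = 1.2
variance = sum(x**2 * p for x, p in pmf.items()) - mean**2  # Var(X) = 0.36

print(round(mean, 2), round(variance, 2), round(variance**0.5, 2))  # 1.2 0.36 0.6
```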
A student takes a 10-question multiple choice quiz with 4 options per question. If the student guesses randomly, what’s the probability of getting exactly 3 correct?
Solution. This is a binomial distribution with \(n = 10\), \(p = 1/4 = 0.25\)
\[P(X = 3) = \binom{10}{3} \times (0.25)^3 \times (0.75)^7\]
\[P(X = 3) = 120 \times 0.015625 \times 0.1335 \approx 0.2503\]
So there’s about a 25% chance of getting exactly 3 correct by guessing.
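A one-line check in Python with `math.comb` (illustrative sketch):

```python
from math import comb

n, k, p = 10, 3, 0.25
prob = comb(n, k) * p**k * (1 - p) ** (n - k)  # binomial PMF at k = 3
print(round(prob, 4))                          # 0.2503
```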
Important: Linearity of expectation, \(E[X + Y] = E[X] + E[Y]\), holds even if \(X\) and \(Y\) are dependent!
Choose a distribution based on the underlying process:
• Bernoulli for a single trial
• Binomial for a fixed number of trials
• Geometric for waiting times (trials until the first success)
• Poisson for rates (events in a fixed interval)
Topics we’ll cover:
Probability density functions (PDFs)
Normal distribution
Exponential distribution
Central Limit Theorem applications
Connection: Discrete distributions often approximate continuous ones, and vice versa
Office Hours: 11AM on Thursday (link on Canvas)
Email: nmathlouthi@ucsb.edu
Next Class: Continuous Random Variables
Understanding Data – Discrete Random Variables © 2025 Narjes Mathlouthi