PSTAT 5A: Sampling and Confidence Intervals

Lecture 9

Author

Narjes Mathlouthi

Published

July 29, 2025


Welcome to Lecture 9

Sampling and Confidence Intervals

From samples to populations: making inferences with uncertainty

Today’s Learning Objectives

By the end of this lecture, you will be able to:

Understand sampling distributions and their properties (Section 1.2)
Apply the Central Limit Theorem to sampling (Section 1.4)
Construct confidence intervals for population means (Section 1.6)
Construct confidence intervals for population proportions (Section 1.8)
Interpret confidence intervals correctly (Section 1.5)
Determine appropriate sample sizes for desired precision
Use Python to calculate confidence intervals
Distinguish between different types of sampling methods

The Big Picture: Statistical Inference

Population vs Sample

Population: All individuals of interest
Sample: Subset we actually observe
Parameter: Population characteristic ($\mu$, $p$)
Statistic: Sample characteristic ($\bar{x}$, $\hat{p}$)

Goal: Use sample statistics to estimate population parameters

Why Confidence Intervals?

Point estimates are rarely exactly correct
Interval estimates capture uncertainty
Confidence level quantifies our certainty
Margin of error shows precision

Key Insight: We trade precision for confidence

Sampling Distributions

A sampling distribution is the distribution of a statistic (like $\bar{x}$) across all possible samples of size $n$.

Key Properties:

Center:
$E[\bar{X}] = \mu$ (unbiased)

Spread:
$SE(\bar{X}) = \frac{\sigma}{\sqrt{n}}$

Shape:
Approaches normal as $n$ increases (Central Limit Theorem)

Standard Error vs Standard Deviation:

$\sigma$: spread of individual observations
$SE = \frac{\sigma}{\sqrt{n}}$: spread of sample means

Central Limit Theorem in Action

Interactive demo: a sample-size slider shows how sample size affects the sampling distribution, reporting the population μ, the mean of the sample means, and the standard error.
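The behavior of the sampling distribution can also be checked with a small simulation. The sketch below is an illustration only: the exponential population (mean 1, standard deviation 1), the function name `simulate_sample_means`, and the seed are assumptions chosen for the example, not part of the lecture's demo. It compares the spread of simulated sample means with $\frac{\sigma}{\sqrt{n}}$.

```python
import random
import statistics


def simulate_sample_means(n, num_samples=5000, seed=0):
    """Draw num_samples samples of size n from a skewed population
    (exponential with mean 1 and sd 1) and return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


mu, sigma = 1.0, 1.0  # population mean and sd of exponential(rate=1)
for n in (5, 30, 100):
    means = simulate_sample_means(n)
    print(
        f"n={n:>3}  mean of sample means={statistics.fmean(means):.3f}  "
        f"sd of sample means={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={sigma / n ** 0.5:.3f}"
    )
```

Even though the population is strongly skewed, the mean of the sample means should stay close to $\mu$ and their standard deviation should track $\frac{\sigma}{\sqrt{n}}$ as $n$ grows.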

Confidence Intervals: The Concept

What is a Confidence Interval?

A confidence interval provides a range of plausible values for a population parameter.

95% Confidence Interval: If we repeated our sampling process many times, about $95\%$ of the intervals we construct would contain the true population parameter.
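The long-run reading of "95%" can be illustrated by simulation. This is a minimal sketch under stated assumptions: a normal population with hypothetical values $\mu = 50$ and $\sigma = 10$, a known $\sigma$ (so a $z$-based interval is used), and an illustrative function name `coverage_rate`.

```python
import random
import statistics


def coverage_rate(mu=50.0, sigma=10.0, n=40, confidence=0.95,
                  num_intervals=2000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu."""
    rng = random.Random(seed)
    z_star = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * sigma / n ** 0.5
    hits = 0
    for _ in range(num_intervals):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / num_intervals


print(f"Fraction of intervals capturing mu: {coverage_rate():.3f}")
```

The printed fraction should land near 0.95. Any single interval either contains $\mu$ or it does not; the 95% describes the procedure across many samples.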


Confidence Intervals for Population Means

🎯 When σ is Known:

\[\bar{x} \pm z^* \cdot \frac{\sigma}{\sqrt{n}}\]

When $\sigma$ is Unknown (more common):

\[\bar{x} \pm t^* \cdot \frac{s}{\sqrt{n}}\]

Key Components:

$\bar{x}$: sample mean
$t^*$: critical value (df = n-1)
$\frac{s}{\sqrt{n}}$: standard error

Common Confidence Levels:

90%: $z^* = 1.645$, narrower interval (more precise)
95%: $z^* = 1.96$, most common
99%: $z^* = 2.576$, wider interval (more confident)
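These critical values do not have to be memorized. As a small sketch, they can be recovered from the standard normal distribution with Python's standard library (the helper name `z_critical` is an assumption for this example):

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return z* such that P(-z* <= Z <= z*) = confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_critical(level):.3f}")
```

The argument `0.5 + confidence / 2` is the upper-tail cutoff: a 95% interval leaves 2.5% in each tail, so $z^*$ is the 97.5th percentile.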

Conditions Required:

Random sampling
Nearly normal population OR n ≥ 30
Independent observations
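The $t$-based formula can be sketched in Python as follows. To keep the example dependency-free, it finds $t^*$ numerically (Simpson's rule for the $t$ CDF plus bisection); this is an illustrative approach rather than the course's prescribed method, and in practice a statistics library such as SciPy (`scipy.stats.t.ppf`) returns the same critical value. The function names and the sample data ($n = 16$, $\bar{x} = 72.5$, $s = 8$) are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, df):
    """Find t* with P(-t* <= T <= t*) = confidence by bisection."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, confidence=0.95):
    """Confidence interval for a mean when sigma is unknown."""
    margin = t_critical(confidence, n - 1) * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical data: 16 exam scores with mean 72.5 and sd 8.
low, high = t_interval(72.5, 8.0, 16)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Because $t^*$ depends on $df = n - 1$, the same confidence level gives a wider interval for small samples than the $z$-based formula would.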

Interactive CI Demo: Confidence Intervals for Means

Interactive demo: choose the sample size, confidence level, population μ, and population σ, then generate a sample to see the resulting interval, whether it captures μ, and its margin of error.

Confidence Intervals for Population Proportions

🎯 Formula:

\[\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

Key Components:

$\hat{p} = \frac{x}{n}$: sample proportion
$z^*$: critical value
$\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$: standard error

Conditions Required:

Random sampling
$n\hat{p} \geq 10$ and $n(1-\hat{p}) \geq 10$
Independent observations
Population at least 10× sample size

Conservative Approach:

Use $\hat{p} = 0.5$ for planning when true proportion unknown (maximizes margin of error)
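The proportion formula and its success-failure check can be sketched as below. The function name `proportion_ci` and the survey counts (60 successes out of 150) are hypothetical, chosen only to illustrate the calculation.

```python
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """z-based confidence interval for a population proportion."""
    p_hat = successes / n
    # Success-failure condition from the list above.
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


# Hypothetical survey: 60 of 150 respondents answer "yes".
low, high = proportion_ci(60, 150)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

The code can only check the numeric condition. Random sampling, independence, and the 10× population rule still have to be judged from how the data were collected.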

Interactive CI Demo: Confidence Intervals for Proportions

Interactive demo: choose the sample size, confidence level, and population p, then generate a sample to see the resulting interval, whether it captures p, and the sample proportion.

Practice Problem 1: CI for Mean

A random sample of 25 college students shows a mean daily screen time of 6.2 hours with a standard deviation of 1.8 hours.
(a) Construct a 95% confidence interval for the mean daily screen time.
(b) Interpret the confidence interval in context.
(c) What would happen to the interval width if we used 99% confidence instead?

Solution. (a)
Given: $n = 25$, $\bar{x} = 6.2$, $s = 1.8$, 95% confidence

For $df = 24$, $t^* = 2.064$

$SE = \frac{s}{\sqrt{n}} = \frac{1.8}{\sqrt{25}} = 0.36$

$CI = 6.2 \pm 2.064 \times 0.36 = 6.2 \pm 0.743 = (5.46, 6.94)$ hours

(b)
We are 95% confident that the true mean daily screen time for all college students is between $5.46$ and $6.94$ hours.

(c)
For 99% confidence, we use $t^* = 2.797$, giving a wider interval: $(5.19, 7.21)$ hours.
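The arithmetic in parts (a) and (c) can be double-checked with a few lines of Python. This sketch takes the $t^*$ values from the solution as given (as if read from a $t$-table); the helper name `interval_from_table` is an assumption for illustration.

```python
def interval_from_table(xbar, s, n, t_star):
    """Interval xbar +/- t* * s / sqrt(n), with t* supplied by the user."""
    margin = t_star * s / n ** 0.5
    return xbar - margin, xbar + margin


n, xbar, s = 25, 6.2, 1.8
for label, t_star in (("95%", 2.064), ("99%", 2.797)):
    low, high = interval_from_table(xbar, s, n, t_star)
    print(f"{label}: ({low:.2f}, {high:.2f}), width = {high - low:.2f}")
```

Moving from 95% to 99% confidence keeps the center at 6.2 hours but increases the width from about 1.49 to about 2.01 hours.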

Practice Problem 2: CI for Proportion

In a survey of 400 voters, 240 support a particular candidate.
(a) Construct a 90% confidence interval for the true proportion of supporters.
(b) Check if the conditions for inference are met.
(c) How large a sample would be needed for a margin of error of 0.03 with 95% confidence?

Solution. (a)
$\hat{p} = \frac{240}{400} = 0.6$, $n = 400$, 90% confidence, $z^* = 1.645$

$SE = \sqrt{\frac{0.6 \times 0.4}{400}} = \sqrt{\frac{0.24}{400}} = 0.0245$

$CI = 0.6 \pm 1.645 \times 0.0245 = 0.6 \pm 0.0403 = (0.560, 0.640)$

(b)

Check conditions: $n\hat{p} = 400 \times 0.6 = 240 \geq 10$ ✓
$n(1-\hat{p}) = 400 \times 0.4 = 160 \geq 10$ ✓

(c)

Sample size calculation:

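$n = \frac{(z^*)^2 \hat{p}(1-\hat{p})}{ME^2} = \frac{(1.96)^2 \times 0.6 \times 0.4}{(0.03)^2} = \frac{0.9220}{0.0009} \approx 1024.4$, so we round up to $n = 1025$ people. Always round a required sample size up; rounding down would leave the margin of error slightly above the target.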
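A sample-size calculation of this kind can be sketched in Python. The function name `sample_size_for_proportion` and its parameter names are assumptions for this example; `math.ceil` rounds up so that the achieved margin of error does not exceed the target.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin`.

    p_guess = 0.5 is the conservative default when no estimate exists.
    """
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2)


print(sample_size_for_proportion(0.03, 0.95, p_guess=0.6))  # uses p-hat from the survey
print(sample_size_for_proportion(0.03, 0.95))               # conservative p = 0.5
```

Using the survey estimate $\hat{p} = 0.6$ gives 1025, while the conservative choice $\hat{p} = 0.5$ gives 1068, since $\hat{p}(1-\hat{p})$ is largest at 0.5.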

Practice Problem 3: Sample Size Planning

A market researcher wants to estimate the average amount spent on coffee per week by college students.
(a) How large a sample is needed for a 95% CI with a margin of error of \$2 if $\sigma = \$8$?
(b) If the budget only allows for 100 students, what confidence level gives a \$2 margin of error?
(c) What’s the trade-off between sample size, confidence level, and precision?

Solution. (a)
For means:
$n = \frac{(z^*)^2 \sigma^2}{ME^2} = \frac{(1.96)^2 \times 8^2}{2^2} = \frac{245.86}{4} \approx 61.5$, so we round up to $n = 62$ students

(b)
With $n = 100$:
$ME = z^* \frac{\sigma}{\sqrt{n}} = z^* \frac{8}{\sqrt{100}} = 0.8 z^*$

For $ME = 2$:
$z^* = \frac{2}{0.8} = 2.5$,
which corresponds to about 98.8% confidence
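Parts (a) and (b) can be reproduced with a short sketch. Both helper names are hypothetical, and the calculation assumes $\sigma$ is known, as in the problem statement.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z_star * sigma / margin) ** 2)


def confidence_for_margin(sigma, margin, n):
    """Confidence level whose z* produces the given margin at sample size n."""
    z_star = margin / (sigma / n ** 0.5)
    return 2 * NormalDist().cdf(z_star) - 1


print(sample_size_for_mean(sigma=8, margin=2))                    # part (a)
print(f"{confidence_for_margin(sigma=8, margin=2, n=100):.1%}")   # part (b)
```

Part (b) inverts the usual direction: instead of choosing a confidence level and looking up $z^*$, it solves for $z^*$ and converts it back to a central probability with $2\Phi(z^*) - 1$.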

(c) Trade-offs:

Higher confidence $\rightarrow$ wider intervals (less precision)
Larger sample $\rightarrow$ narrower intervals (more precision)
Lower margin of error $\rightarrow$ need larger sample or lower confidence

Common Mistakes and Misconceptions

Interpretation Errors

❌ Wrong: “$95\%$ of the data falls in this interval”

✅ Right: “We’re $95\%$ confident the parameter is in this interval”

❌ Wrong: “There’s a $95\%$ chance $\mu$ is in this interval”

✅ Right: “$95\%$ of such intervals contain $\mu$”

Technical Errors

Using $z^*$ when $\sigma$ is unknown and $n < 30$
Forgetting to check conditions
Confusing standard error with standard deviation
Using wrong degrees of freedom for t-distribution

Remember: The confidence level refers to the long-run proportion of intervals that capture the parameter!

Sample Size and Margin of Error Relationships

Interactive demo: choose the population σ, confidence level, and desired margin of error to see a plot of sample size vs margin of error, the required sample size, and the resulting margin of error.
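The relationship between sample size, confidence level, and margin of error can be tabulated directly. The sketch below assumes a known, hypothetical $\sigma = 8$ and uses the $z$-based margin $z^* \frac{\sigma}{\sqrt{n}}$; the function name `margin_of_error` is illustrative.

```python
from statistics import NormalDist


def margin_of_error(sigma, n, confidence=0.95):
    """z-based margin of error for a mean with known sigma."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z_star * sigma / n ** 0.5


sigma = 8.0  # hypothetical population standard deviation
for n in (25, 100, 400):
    row = "  ".join(
        f"{level:.0%}: {margin_of_error(sigma, n, level):.2f}"
        for level in (0.90, 0.95, 0.99)
    )
    print(f"n={n:>3}  {row}")
```

Reading down a column shows that quadrupling $n$ halves the margin of error, because the margin scales with $\frac{1}{\sqrt{n}}$. Reading across a row shows the margin growing with the confidence level.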

Types of Sampling Methods

| Method | Description | Advantages | Disadvantages |
|---|---|---|---|
| Simple Random | Every individual has an equal chance of selection | Unbiased, simple | May not represent small subgroups |
| Stratified | Random sample from each subgroup (stratum) | Ensures representation | More complex |
| Cluster | Randomly select groups and sample them entirely | Cost-effective for spread-out populations | Higher variability |
| Systematic | Every $k$-th individual from a list | Simple to implement | Can be biased if the list has a periodic pattern |
| Convenience | Easily accessible individuals | Quick and cheap | Highly biased |

Note

Sampling Method Matters: Only probability sampling methods allow for valid statistical inference!
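The four probability sampling methods in the table can be sketched with Python's `random` module. The population (100 student IDs spread evenly over four dorms), the function names, and the seed are all hypothetical and exist only to make the example runnable.

```python
import random

rng = random.Random(42)

# Hypothetical population: 100 students assigned to dorms A-D.
population = [{"id": i, "dorm": "ABCD"[i % 4]} for i in range(100)]


def simple_random_sample(pop, k):
    """Every individual has the same chance of selection."""
    return rng.sample(pop, k)


def stratified_sample(pop, key, per_stratum):
    """Draw a simple random sample of equal size from every stratum."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    return [u for group in strata.values() for u in rng.sample(group, per_stratum)]


def cluster_sample(pop, key, num_clusters):
    """Randomly choose whole clusters and keep everyone in them."""
    clusters = sorted({unit[key] for unit in pop})
    chosen = set(rng.sample(clusters, num_clusters))
    return [u for u in pop if u[key] in chosen]


def systematic_sample(pop, k):
    """Random start, then every step-th individual from the list."""
    step = len(pop) // k
    start = rng.randrange(step)
    return pop[start::step][:k]


print(len(simple_random_sample(population, 20)))
print(len(stratified_sample(population, "dorm", 5)))
print(len(cluster_sample(population, "dorm", 2)))
print(len(systematic_sample(population, 20)))
```

Convenience sampling is left out on purpose: it involves no random mechanism, so there is nothing to simulate and no basis for the inference methods in this lecture.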

Confidence Intervals in Practice

When to Use Each Type

Means: Continuous data (height, income, test scores)

Proportions: Categorical data (yes/no, success/failure)

Choosing Confidence Level

90%: Quick estimates, less critical decisions
95%: Standard in most research
99%: High-stakes decisions, medical trials

Real-World Applications

Political polls: Proportion confidence intervals
Quality control: Mean confidence intervals
Medical research: Both types with high confidence
Business analytics: Varies by decision importance

Communication Tips

Always include the confidence level
State what the interval estimates
Acknowledge the uncertainty
Consider practical significance

Key Takeaways

Main Concepts

Sampling distributions follow predictable patterns
Confidence intervals quantify uncertainty
Central Limit Theorem makes normal-based inference possible
Sample size directly affects precision

Practical Guidelines

Choose appropriate methods based on:

Data type (continuous vs categorical)
Sample size (use t when σ unknown)
Desired precision (affects sample size)
Confidence level (affects interval width)

Key Principle

Statistical inference allows us to make informed decisions about populations using sample data, while properly accounting for uncertainty.

Looking Ahead

Next Lecture: Hypothesis Testing

Topics we’ll cover:

Null and alternative hypotheses
Test statistics and p-values
Type I and Type II errors

Connection: Confidence intervals and hypothesis tests are two sides of the same statistical inference coin

Questions?

Office Hours: 11AM on Thursday (link on Canvas)
Email: nmathlouthi@ucsb.edu
Next Class: Hypothesis Testing and Statistical Significance

Resources

Read OpenIntro Statistics Chapter 5 sections 5.1-5.3
Khan Academy - Confidence Intervals
Seeing Theory - Frequentist Inference
Confidence Intervals - Wikipedia
Understanding Different Types of Intervals
