Course Resources
Your comprehensive guide to learning materials and references
Week 1: Foundations of Data Science
Getting Started with Data - Data types, basic statistics, and Python tools
📚 Core Materials
Required Reading
Essential foundations covering data types, variables, and descriptive statistics. Provides the theoretical foundation for understanding how data is structured and analyzed.
Python for Statistics
Perfect introduction to Python for statistics students. Covers probability, descriptive statistics, and statistical inference using Python with real datasets.
🎓 UCSB Access: Library Database → Search “O’Reilly” → Login with NetID → Search “Think Stats”
Supplementary Reading
Deep dive into pandas operations including describe(), groupby(), and essential aggregation functions for real-world datasets.
🎓 UCSB Access: Library Database → Search “O’Reilly” → Login with NetID → Search “Python for Data Analysis”
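As a quick orientation to the describe() and groupby() operations mentioned above, here is a minimal sketch. The DataFrame, its column names, and its values are made up for illustration and do not come from the book.

```python
import pandas as pd

# Hypothetical toy dataset; column names and values are illustrative only.
df = pd.DataFrame({
    "major": ["stats", "stats", "cs", "cs", "cs"],
    "score": [82.0, 90.0, 75.0, 88.0, 95.0],
})

# describe() reports count, mean, std, min, quartiles, and max for a column.
summary = df["score"].describe()

# groupby() splits the rows by a key; agg() then summarizes each group.
by_major = df.groupby("major")["score"].agg(["mean", "count"])

print(summary)
print(by_major)
```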
Library Access
Access thousands of technology and programming books including Python, statistics, and data science titles. Essential for supplementary reading and advanced topics.
💻 Interactive Tools & Practice
Hands-on Workshop
Comprehensive hands-on workshop covering Python data types, pandas DataFrame structures, and input/output operations with downloadable datasets.
API Documentation
Official documentation for descriptive statistics in pandas. Essential reference for understanding central tendency, dispersion, and shape analysis.
Statistical Functions
Complete reference for NumPy’s statistical toolkit including mean(), median(), std(), percentile(), and advanced statistical measures.
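The functions named above can be tried on a small array. This is only a sketch with made-up values. Note that np.std() computes the population standard deviation by default, so pass ddof=1 when the sample standard deviation is wanted.

```python
import numpy as np

# Illustrative values only.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = np.mean(data)
median = np.median(data)
pop_std = np.std(data)              # population SD (ddof=0 is NumPy's default)
sample_std = np.std(data, ddof=1)   # sample SD divides by n - 1
q1, q3 = np.percentile(data, [25, 75])

print(mean, median, pop_std, sample_std, q1, q3)
```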
Advanced Reference
Comprehensive statistical analysis toolkit covering probability distributions, hypothesis testing, and advanced descriptive statistics for research-grade analysis.
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 1 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Core Reading (OpenIntro Chapters 1-2)
Day 3: Python for Statistics (Think Stats)
Day 4: Supplementary Reading (Python for Data Analysis)
Day 5: Interactive tutorials and hands-on practice
Day 6: Get familiar with the Python documentation
Day 7: Review and concept integration
🔍 Additional Resources
Extended Learning
Supplementary Materials by Topic
Python Basics: Python.org Tutorial • Codecademy Python • Python for Everybody
Data Science Foundations: Kaggle Learn • DataCamp Intro to Python • Coursera Python for Data Science
Statistical Computing: Think Python • Automate the Boring Stuff • Real Python Tutorials
Week 2: Introduction to Probability
Understanding Uncertainty - Sample spaces, conditional probability, and Bayes’ theorem
📚 Core Materials
Required Reading
Essential foundations covering probability definitions, sample spaces, events, conditional probability, and independence.
Supplementary Reading
Interactive guide to set theory operations with Venn diagrams and visual explanations of probability concepts.
Video Lectures
High-quality video lectures covering probability axioms, sample spaces, and basic probability rules with problem sets.
Quick Reference
Comprehensive reference sheet covering all major probability formulas including conditional probability and independence.
💻 Interactive Tools & Practice
Visualizations
Interactive visual introduction with animations for conditional probability, Bayes’ theorem, and independence.
Simulations
Interactive probability simulation tool with tree diagrams, conditional probability, and Bayes’ theorem calculators.
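Conditional probability and independence can also be checked by enumerating a small sample space directly. The sketch below uses two fair dice. The events A and B are chosen for illustration and are not taken from the tool.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for o in space if event(o)), len(space))

def A(o):
    return o[0] + o[1] == 8    # the sum is 8

def B(o):
    return o[0] % 2 == 0       # the first die is even

p_a = prob(A)
p_b = prob(B)
p_ab = prob(lambda o: A(o) and B(o))

# Definition of conditional probability: P(A | B) = P(A and B) / P(B).
p_a_given_b = p_ab / p_b

# A and B are independent exactly when P(A and B) = P(A) * P(B).
independent = p_ab == p_a * p_b

print(p_a, p_b, p_a_given_b, independent)
```

Here P(A | B) differs from P(A), so knowing the first die is even changes the probability that the sum is 8, and the two events are not independent.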
Practice Problems
Comprehensive practice problems covering basic probability, conditional probability, and independence with instant feedback.
Python Code
Gallery of statistical visualizations including probability distributions, Venn diagrams, and tree diagrams.
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 2 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Core Reading (OpenIntro Chapter 3)
Day 3: Interactive tutorials and visualizations
Day 4: Video lectures and supplementary readings
Day 5: Practice problems and exercises
Day 6: Review and concept integration
Day 7: Assessment preparation
🔍 Additional Resources
Extended Learning
Supplementary Materials by Learning Style
Visual Learners: Khan Academy Videos • Treena Notes • Math is Fun Stats
Practical Applications: Medical Diagnosis Examples • Real-world Problems
Advanced Study: MIT 6.041 Full Course • Probability Fallacies Guide
Week 3: Conditional Probability, Counting & Discrete Random Variables
Advanced Probability & Discrete Distributions
📚 Core Materials
Required Reading
Essential coverage of conditional probability, Bayes’ theorem, counting principles, and discrete random variables. Includes probability mass functions and expected values.
Bayes’ Theorem
Interactive visualization of Bayes’ theorem with medical testing examples, false positives/negatives, and real-world applications. Essential for understanding conditional probability.
Combinatorics
Comprehensive coverage of counting principles, permutations, combinations, and their applications to probability with step-by-step examples and practice problems.
Discrete Distributions
Interactive exploration of discrete random variables, probability mass functions, and common distributions (Bernoulli, Binomial, Geometric, Poisson) with parameter adjustments.
💻 Interactive Tools & Practice
Bayes Calculator
Step-by-step Bayes’ theorem calculator with medical testing examples, tree diagrams, and visual representations of prior and posterior probabilities.
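The medical testing calculation can be sketched in a few lines. The function name and the prevalence, sensitivity, and specificity values below are hypothetical, chosen only to show how a low prior keeps the posterior small even with a fairly accurate test.

```python
def posterior_positive(prior: float, sensitivity: float, specificity: float) -> float:
    """Return P(disease | positive test) using Bayes' theorem."""
    true_pos = sensitivity * prior                    # P(positive and disease)
    false_pos = (1.0 - specificity) * (1.0 - prior)   # P(positive and no disease)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs: 1% prevalence, 95% sensitivity, 90% specificity.
p = posterior_positive(prior=0.01, sensitivity=0.95, specificity=0.90)
print(round(p, 4))
```

With these assumed numbers the posterior is under 9%, because false positives from the large healthy group outnumber true positives.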
Combinatorics Tool
Online calculator for permutations, combinations, and factorial calculations with explanations and step-by-step solutions for complex counting problems.
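The same counts are available from Python's standard math module (math.perm and math.comb need Python 3.8 or later). This is a minimal sketch. The card-hand question is an illustrative example, not one taken from the calculator.

```python
import math

n, k = 5, 3
factorial_n = math.factorial(n)    # n! = 120
permutations = math.perm(n, k)     # ordered selections: n! / (n - k)!
combinations = math.comb(n, k)     # unordered selections: n! / (k! (n - k)!)

# Counting applied to probability: the chance that a 5-card hand
# contains exactly 2 aces (choose 2 of 4 aces and 3 of 48 non-aces).
p_two_aces = math.comb(4, 2) * math.comb(48, 3) / math.comb(52, 5)

print(factorial_n, permutations, combinations, p_two_aces)
```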
Python Documentation
Complete reference for discrete probability distributions in Python including Bernoulli, Binomial, Geometric, and Poisson with PMF, CDF, and random generation functions.
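Assuming the reference in question is scipy.stats, the four distributions listed above can be used as sketched below. The parameter values are arbitrary. Note that SciPy's geometric distribution counts the trial on which the first success occurs, so its support starts at 1.

```python
from scipy import stats

# Binomial: number of successes in n = 10 trials with p = 0.5.
binom = stats.binom(n=10, p=0.5)
binom_pmf_3 = binom.pmf(3)    # P(X = 3)
binom_cdf_3 = binom.cdf(3)    # P(X <= 3)

# Bernoulli: a single trial with success probability 0.3.
bern_pmf_1 = stats.bernoulli(p=0.3).pmf(1)

# Geometric: trial number of the first success, p = 0.25.
geom_pmf_3 = stats.geom(p=0.25).pmf(3)    # (0.75)^2 * 0.25

# Poisson: event count with mean 2.
pois_pmf_0 = stats.poisson(mu=2).pmf(0)   # e^(-2)

# Random generation: five draws from the binomial above.
samples = binom.rvs(size=5, random_state=0)

print(binom_pmf_3, binom_cdf_3, bern_pmf_1, geom_pmf_3, pois_pmf_0, samples)
```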
Interactive Simulations
Interactive probability distribution notes and overview for discrete random variables.
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 3 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Conditional probability and Bayes’ theorem review
Day 3: Counting principles: permutations and combinations
Day 4: Introduction to discrete random variables and PMFs
Day 5: Expected values, variance, and common distributions
Day 6: Python implementation and interactive practice
Day 7: Real-world applications and problem integration
🔍 Additional Resources
Extended Learning
Supplementary Materials by Topic
Bayes’ Theorem Applications: Medical Diagnosis Examples • Spam Filtering Kaggle Python Example • Spam Filtering • Legal Evidence
Combinatorics: Art of Problem Solving • Brilliant Combinatorics • Pascal’s Triangle
Discrete Distributions: Wolfram MathWorld • NIST Engineering Statistics • Real-world Examples
Week 4: Continuous Random Variables & Intro to Confidence Intervals
From Discrete to Continuous: Understanding Density and Intervals
📚 Core Materials
Required Reading
Essential coverage of continuous random variables, probability density functions, normal distribution, and Central Limit Theorem. Foundation for understanding statistical inference.
Required Reading
Introduction to confidence intervals, interpretation, and construction. Essential for understanding how sample statistics relate to population parameters.
Central Limit Theorem
Interactive visualization of the Central Limit Theorem with adjustable sample sizes and population distributions. See how the distribution of sample means approaches a normal distribution as the sample size grows, whatever the shape of the original population (provided its variance is finite).
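The same idea can be reproduced with a short simulation. This sketch assumes an exponential population with mean 1 and standard deviation 1, which is strongly skewed. The sample size, number of samples, and seed are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n = 50                 # observations per sample
num_samples = 10_000   # number of repeated samples

# Each row is one sample of size n from a skewed Exponential(mean=1) population.
draws = rng.exponential(scale=1.0, size=(num_samples, n))
sample_means = draws.mean(axis=1)

# The CLT predicts the means cluster around the population mean (1.0)
# with standard error sigma / sqrt(n) = 1 / sqrt(50), about 0.141.
observed_center = sample_means.mean()
observed_spread = sample_means.std(ddof=1)
predicted_spread = 1.0 / np.sqrt(n)

print(observed_center, observed_spread, predicted_spread)
```

Plotting a histogram of sample_means should show a roughly bell-shaped curve even though the population itself is skewed.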
Continuous Distributions
Python-focused introduction to continuous distributions including normal, exponential, and Pareto distributions with real data examples and implementation.
🎓 UCSB Access: Library Database → Search “O’Reilly” → Login with NetID → Search “Think Stats”
💻 Interactive Tools & Practice
Distribution Explorer
Interactive exploration of continuous distributions including Normal, Exponential, and Uniform distributions. Adjust parameters and see real-time changes in PDFs and CDFs.
Confidence Intervals
Interactive confidence interval construction tool. Upload data or use built-in datasets to create and interpret confidence intervals with different confidence levels.
Normal Distribution
Interactive normal distribution calculator with Z-score calculations, area under curve, and probability computations. Essential for understanding standardization.
Python Documentation
Complete reference for continuous probability distributions in Python including Normal, Exponential, Uniform, and T-distributions with PDF, CDF, and random generation functions.
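Assuming the reference in question is scipy.stats, the sketch below shows a Z-score calculation along with PDF, CDF, and quantile calls for the distributions listed above. The score, mean, and standard deviation are hypothetical.

```python
from scipy import stats

# Standardization: a hypothetical score of 85 when the mean is 70 and the SD is 10.
z = (85 - 70) / 10

p_below = stats.norm.cdf(z)          # area under the standard normal curve left of z
z_crit = stats.norm.ppf(0.975)       # critical value for a 95% two-sided interval

# Exponential with mean 2: P(X <= 2) = 1 - e^(-1).
expon_cdf = stats.expon(scale=2).cdf(2)

# Uniform on [0, 10]: the density is constant at 1/10.
unif_pdf = stats.uniform(loc=0, scale=10).pdf(3)

# t-distribution with 9 degrees of freedom: heavier tails than the normal,
# so the 97.5th percentile is larger than z_crit.
t_crit = stats.t(df=9).ppf(0.975)

print(z, p_below, z_crit, expon_cdf, unif_pdf, t_crit)
```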
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 4 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Continuous random variables and PDFs (OpenIntro Ch. 4)
Day 3: Normal distribution and standardization
Day 4: Central Limit Theorem and sampling distributions
Day 5: Introduction to confidence intervals (OpenIntro Ch. 5.1-5.2)
Day 6: Python implementation and interactive practice
Day 7: Real-world applications and interpretation practice
🔍 Additional Resources
Extended Learning
Supplementary Materials by Topic
Normal Distribution: Standard Normal Table • 68-95-99.7 Rule • Z-score Calculator
Central Limit Theorem: Khan Academy CLT • Interactive CLT Demo • Rice University Simulations
Confidence Intervals: Interpretation Guide • Common Misconceptions • Sample Size Calculator
Week 5: Statistical Methods & Testing
Confidence Intervals, Hypothesis Testing, and Statistical Inference
📚 Core Materials
Required Reading
Comprehensive coverage of confidence intervals for means and proportions, including t-distribution applications when population standard deviation is unknown.
Required Reading
Introduction to hypothesis testing fundamentals including null and alternative hypotheses, p-values, Type I and Type II errors, and statistical significance.
T-Distribution
Video series explaining the t-distribution, degrees of freedom, and when to use t vs z distributions in confidence intervals and hypothesis testing.
Hypothesis Testing
Interactive exploration of hypothesis testing concepts including point estimation, confidence intervals, and the bootstrap method with real-time visualizations.
💻 Interactive Tools & Practice
T-Distribution Calculator
Interactive t-distribution calculator with adjustable degrees of freedom, tail probability calculations, and critical value finder for confidence intervals.
Hypothesis Testing Simulator
Step-by-step hypothesis testing tool with built-in datasets or data upload capability. Automatically calculates test statistics, p-values, and conclusions.
P-Value Visualization
Interactive visualization of p-values, showing the relationship between test statistics, distributions, and probability calculations for different hypothesis tests.
Python Implementation
Complete reference for statistical tests in Python including t-tests, z-tests, and chi-square tests with confidence interval functions and effect size calculations.
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 5 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Advanced confidence intervals with t-distribution (OpenIntro Ch. 5.3-5.4)
Day 3: Confidence intervals for proportions and sample size calculations
Day 4: Hypothesis testing fundamentals (OpenIntro Ch. 6.1-6.2)
Day 5: P-values, significance levels, and Type I/II errors
Day 6: One-sample tests implementation and practice
Day 7: Real-world applications and result interpretation
🔍 Additional Resources
Extended Learning
Real-World Applications & Case Studies
Medical Research: Clinical Trial Analysis Notebook • Drug Effectiveness Testing • Medical Device Validation
Business Analytics: A/B Testing Guide • Marketing Campaign Analysis • Customer Satisfaction Testing
Quality Control: Manufacturing Process Control • Product Testing Analysis • Six Sigma Applications
Educational Research: Student Performance Analysis • Teaching Method Effectiveness • Standardized Test Scores
Essential Formulas for Week 5
Confidence Intervals:
- For Means (σ unknown): \(\bar{x} \pm t_{\alpha/2,df} \cdot \frac{s}{\sqrt{n}}\)
- For Proportions: \(\hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)
- Degrees of Freedom: \(df = n - 1\)
Hypothesis Testing:
- Test Statistic (means): \(t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}\)
- Test Statistic (proportions): \(z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\)
- P-value: Probability, assuming H₀ is true, of observing a test statistic at least as extreme as the one actually observed
Error Types:
- Type I Error (α): Reject true H₀
- Type II Error (β): Fail to reject false H₀
- Power: \(1 - \beta\) = Probability of correctly rejecting false H₀
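The Week 5 formulas can be applied step by step as in the sketch below. The sample values, the null value mu0, and the proportion counts are hypothetical. Only the critical values and tail probabilities come from scipy.stats.

```python
import math
from scipy import stats

# Hypothetical sample of n = 8 measurements.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2, 5.5, 5.3]
n = len(sample)
xbar = sum(sample) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # sample SD
se = s / math.sqrt(n)
df = n - 1

# 95% confidence interval for the mean (sigma unknown): xbar +/- t * s / sqrt(n).
t_crit = stats.t.ppf(0.975, df)
ci_mean = (xbar - t_crit * se, xbar + t_crit * se)

# One-sample t-test of H0: mu = mu0 against a two-sided alternative.
mu0 = 5.0
t_stat = (xbar - mu0) / se
p_value = 2 * stats.t.sf(abs(t_stat), df)

# 95% confidence interval for a proportion: hypothetical 56 successes in 100 trials.
successes, trials = 56, 100
p_hat = successes / trials
z_crit = stats.norm.ppf(0.975)
margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / trials)
ci_prop = (p_hat - margin, p_hat + margin)

print(ci_mean, t_stat, p_value, ci_prop)
```

The hand-computed t_stat and p_value should agree with stats.ttest_1samp(sample, mu0), which is a convenient way to check the arithmetic.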
Week 6: Linear Regression Basics
Statistical Modeling and Relationship Analysis
📚 Core Materials
Required Reading
Comprehensive introduction to linear regression including correlation, least squares method, regression equations, and interpretation of slope and intercept.
Python for Regression
Python-focused approach to linear regression with real data examples, including correlation analysis, fitting regression lines, and residual analysis.
🎓 UCSB Access: Library Database → Search “O’Reilly” → Login with NetID → Search “Think Stats”
Correlation Analysis
Comprehensive video series covering correlation coefficients, scatterplots, regression lines, and interpretation of relationships between quantitative variables.
Regression Assumptions
Detailed coverage of linear regression assumptions including linearity, independence, normality, and equal variance with diagnostic methods and remedies.
💻 Interactive Tools & Practice
Regression Visualization
Interactive exploration of linear regression with adjustable data points, real-time least squares fitting, and visualization of residuals and R-squared values.
Correlation Calculator
Interactive tool for calculating correlation coefficients and fitting regression lines with built-in datasets or data upload capability.
Residual Analysis
Interactive regression analysis tools including scatterplot creation, line fitting, residual plots, and regression diagnostics for assumption checking.
Python Libraries
Complete documentation for linear regression in scikit-learn including model fitting, prediction, coefficient interpretation, and performance metrics.
🎯 Learning Objectives & Study Plan
Mastery Checklist
Week 6 Learning Goals
After completing this week, you should be able to:
Study Schedule
Recommended Learning Path
Days 1-2: Correlation analysis and scatterplots (OpenIntro Ch. 7.1)
Day 3: Linear regression theory and least squares method (Ch. 7.2)
Day 4: Regression equations, interpretation, and R-squared (Ch. 7.3)
Day 5: Residual analysis and assumption checking
Day 6: Python implementation with real datasets
Day 7: Advanced topics and course integration review
🔍 Additional Resources & Real-World Applications
Business Analytics
Sales and Marketing Analysis
Kaggle Notebooks: Marketing Campaign ROI Analysis • Sales Forecasting with Regression • Customer Lifetime Value Prediction
Applications: Advertising spend vs. sales revenue, price elasticity analysis, customer acquisition cost modeling
Datasets: Marketing Data • Sales Data • E-commerce Data
Health & Medicine
Medical Research Applications
Research Examples: BMI vs Health Outcomes • Drug Dosage Effectiveness • Treatment Response Prediction
Applications: Dose-response relationships, biomarker analysis, treatment outcome prediction, epidemiological studies
Datasets: Heart Disease Data • Diabetes Prediction • Cancer Research Data
Environmental Science
Climate and Environmental Modeling
Climate Analysis: Temperature vs Time Trends • Air Quality Prediction • Renewable Energy Analysis
Applications: Climate change modeling, pollution correlation analysis, energy consumption prediction, environmental impact assessment
Datasets: Global Temperature • Air Quality • Energy Consumption
Economics & Finance
Economic Analysis & Financial Modeling
Financial Applications: Stock Price Prediction • Economic Indicators Analysis • Housing Price Modeling
Applications: Portfolio optimization, risk assessment, economic forecasting, market trend analysis, real estate valuation
Datasets: Stock Market Data • Housing Prices • Economic Indicators
📊 Advanced Regression Topics Preview
Beyond Simple Regression
Topics for Further Study
Multiple Regression: Adding more predictor variables, interpreting coefficients in multivariate models, dealing with multicollinearity
Polynomial Regression: Modeling non-linear relationships, choosing appropriate degree, overfitting concerns
Logistic Regression: Binary outcome variables, odds ratios, classification problems
Model Diagnostics: Advanced residual analysis, influence measures, model selection criteria (AIC, BIC)
Regularization: Ridge regression, Lasso regression, dealing with high-dimensional data
Time Series: Regression with temporal data, autocorrelation, trend analysis
🎓 Recommended Next Courses: PSTAT 109 (Statistics for Economics), PSTAT 126 (Regression Analysis), PSTAT 131 (Statistical Machine Learning)
Essential Regression Formulas
Correlation:
- Pearson’s r: \(r = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}\)
Simple Linear Regression:
- Slope: \(b_1 = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sum(x_i - \bar{x})^2}\)
- Intercept: \(b_0 = \bar{y} - b_1\bar{x}\)
- Regression Line: \(\hat{y} = b_0 + b_1 x\)
Model Evaluation:
- R-squared: \(R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}\)
- Residual: \(e_i = y_i - \hat{y}_i\)
- Standard Error: \(s_e = \sqrt{\frac{\sum e_i^2}{n-2}}\)
Key Relationships:
- SST = SSR + SSE (Total = Regression + Error)
- Correlation and R²: \(R^2 = r^2\) (for simple linear regression)
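The regression formulas can be checked numerically with NumPy. The sketch below uses a small made-up dataset and computes each quantity directly from its definition, then confirms the two key relationships.

```python
import numpy as np

# Hypothetical data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

sxy = np.sum((x - x.mean()) * (y - y.mean()))
sxx = np.sum((x - x.mean()) ** 2)
syy = np.sum((y - y.mean()) ** 2)

r = sxy / np.sqrt(sxx * syy)       # Pearson's r
b1 = sxy / sxx                     # slope
b0 = y.mean() - b1 * x.mean()      # intercept
y_hat = b0 + b1 * x                # fitted values
residuals = y - y_hat

sst = syy                                  # total sum of squares
sse = np.sum(residuals ** 2)               # error sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)      # regression sum of squares
r_squared = 1 - sse / sst
s_e = np.sqrt(sse / (n - 2))               # residual standard error

print(r, b1, b0, r_squared, s_e)
print(np.isclose(sst, ssr + sse), np.isclose(r_squared, r ** 2))
```

As a cross-check, np.polyfit(x, y, 1) returns the same slope and intercept.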