Linear Transformation Properties

Proving Relationships Between X and Y = aX

Worksheet 1 Problem 3 Solution

2025-06-26

Problem Statement

Given: Let \(X = \{x_i\}_{i=1}^{n}\) and define \(Y = \{y_i\}_{i=1}^{n}\) by \(y_i = a x_i\) for some fixed constant \(a \neq 0\).

Prove the following relationships:

\[\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i, \quad \bar{Y} = a \bar{X}, \quad S_Y^2 = a^2 S_X^2\]

Understanding the Notation

Set Notation

\(X = \{x_i\}_{i=1}^{n}\) means:

  • \(X\) is a dataset containing \(n\) observations

  • The observations are labeled \(x_1, x_2, x_3, \ldots, x_n\)

  • \(i\) is an index that runs from 1 to \(n\)

  • This is read as: “X is the set of \(x_i\) for \(i\) from 1 to \(n\)

Examples:

  • If \(n = 5\): \(X = \{x_1, x_2, x_3, x_4, x_5\}\)
  • If \(X = \{2, 4, 6, 8, 10\}\), then \(x_1 = 2, x_2 = 4, x_3 = 6, x_4 = 8, x_5 = 10\)

Summation Notation

\(\sum_{i=1}^{n} x_i\) means:

  • Add up all the \(x_i\) values

  • Start with \(i = 1\) and go up to \(i = n\)

  • This equals: \(x_1 + x_2 + x_3 + \cdots + x_n\)

Example:

If \(X = \{2, 4, 6, 8, 10\}\), then: \[\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 2 + 4 + 6 + 8 + 10 = 30\]

Linear Transformation

\(Y = \{y_i\}_{i=1}^{n}\) where \(y_i = a x_i\) means:

  • Each element of \(Y\) is obtained by multiplying the corresponding element of \(X\) by the constant \(a\)

  • This is called a linear transformation

  • \(a\) is called the scaling factor

Example:

If \(X = \{2, 4, 6, 8, 10\}\) and \(a = 3\), then:

  • \(y_1 = 3 \times 2 = 6\)

  • \(y_2 = 3 \times 4 = 12\)

  • \(y_3 = 3 \times 6 = 18\)

  • \(y_4 = 3 \times 8 = 24\)

  • \(y_5 = 3 \times 10 = 30\)

So \(Y = \{6, 12, 18, 24, 30\}\)

Sample Mean Notation

Sample Mean: \(\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i\)

This means:

  • Add up all observations: \(\sum_{i=1}^{n} x_i\)

  • Divide by the number of observations: \(n\)

  • The “bar” over \(X\) indicates the mean

Example:

For \(X = \{2, 4, 6, 8, 10\}\): \[\bar{X} = \frac{1}{5}(2 + 4 + 6 + 8 + 10) = \frac{30}{5} = 6\]

Sample Variance Notation

Sample Variance: \(S_X^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2\)

This means:

  • For each observation, find its deviation from the mean: \((x_i - \bar{X})\)

  • Square each deviation: \((x_i - \bar{X})^2\)

  • Add up all squared deviations: \(\sum_{i=1}^{n} (x_i - \bar{X})^2\)

  • Divide by \((n-1)\): This gives the sample variance

Proof 1: Sum Relationship

Prove: \(\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i\)

Proof 1: Sum Relationship

Step 1: Start with the definition of \(y_i\) \[y_i = a x_i \text{ for all } i = 1, 2, \ldots, n\]

Step 2: Write out the sum of all \(y_i\) \[\sum_{i=1}^{n} y_i = y_1 + y_2 + y_3 + \cdots + y_n\]

Step 3: Substitute the definition \(y_i = a x_i\) \[\sum_{i=1}^{n} y_i = a x_1 + a x_2 + a x_3 + \cdots + a x_n\]

Proof 1: Sum Relationship (continued)

Step 4: Factor out the constant \(a\) \[\sum_{i=1}^{n} y_i = a(x_1 + x_2 + x_3 + \cdots + x_n)\]

Step 5: Recognize the sum notation \[\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i\]

Therefore: \(\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i\)

Key Property Used: Constants can be factored out of sums \[\sum_{i=1}^{n} (c \cdot x_i) = c \sum_{i=1}^{n} x_i\]

Example: Sum Relationship

Given: \(X = \{2, 4, 6\}\) and \(a = 5\)

Then: \(Y = \{10, 20, 30\}\) (since \(y_i = 5x_i\))

Check our formula:

  • \(\sum_{i=1}^{3} x_i = 2 + 4 + 6 = 12\)

  • \(\sum_{i=1}^{3} y_i = 10 + 20 + 30 = 60\)

  • \(a \sum_{i=1}^{3} x_i = 5 \times 12 = 60\)

Verification: \(60 = 60\)

Proof 2: Sample Mean Relationship

Prove: \(\bar{Y} = a \bar{X}\)

Proof 2: Sample Mean Relationship

Step 1: Start with the definition of sample mean \[\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} y_i\]

Step 2: Use our result from Proof 1 We proved that \(\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i\)

Step 3: Substitute this result \[\bar{Y} = \frac{1}{n} \cdot a \sum_{i=1}^{n} x_i\]

Proof 2: Sample Mean Relationship (continued)

Step 4: Rearrange the constants \[\bar{Y} = a \cdot \frac{1}{n} \sum_{i=1}^{n} x_i\]

Step 5: Recognize the definition of \(\bar{X}\) \[\bar{Y} = a \bar{X}\]

Therefore: \(\bar{Y} = a \bar{X}\)

Key Property Used: Constants can be moved outside of fractions \[\frac{1}{n} \cdot a \sum_{i=1}^{n} x_i = a \cdot \frac{1}{n} \sum_{i=1}^{n} x_i\]

Example: Sample Mean Relationship

Given: \(X = \{2, 4, 6\}\) and \(a = 5\)

Then: \(Y = \{10, 20, 30\}\)

Check our formula:

  • \(\bar{X} = \frac{1}{3}(2 + 4 + 6) = \frac{12}{3} = 4\)

  • \(\bar{Y} = \frac{1}{3}(10 + 20 + 30) = \frac{60}{3} = 20\)

  • \(a \bar{X} = 5 \times 4 = 20\)

Verification: \(20 = 20\)

Proof 3: Sample Variance Relationship

Prove: \(S_Y^2 = a^2 S_X^2\)

Proof 3: Sample Variance Relationship

Step 1: Start with the definition of sample variance \[S_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{Y})^2\]

Step 2: Substitute \(y_i = a x_i\) and \(\bar{Y} = a \bar{X}\) \[S_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (a x_i - a \bar{X})^2\]

Step 3: Factor out \(a\) from the parentheses \[S_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} [a(x_i - \bar{X})]^2\]

Proof 3: Sample Variance Relationship (continued)

Step 4: Use the property \((ab)^2 = a^2 b^2\) \[S_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} a^2 (x_i - \bar{X})^2\]

Step 5: Factor out the constant \(a^2\) from the sum \[S_Y^2 = \frac{a^2}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2\]

Step 6: Recognize the definition of \(S_X^2\) \[S_Y^2 = a^2 \cdot \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2 = a^2 S_X^2\]

Therefore: \(S_Y^2 = a^2 S_X^2\)

Key Properties Used in Variance Proof

Properties Used:

  1. Factoring: \(ax_i - a\bar{X} = a(x_i - \bar{X})\)

  2. Squaring: \([a(x_i - \bar{X})]^2 = a^2(x_i - \bar{X})^2\)

  3. Constants in sums: \(\sum_{i=1}^{n} a^2 f(x_i) = a^2 \sum_{i=1}^{n} f(x_i)\)

Example: Sample Variance Relationship

Given: \(X = \{1, 3, 5\}\) and \(a = 2\)

Calculate \(S_X^2\):

  • \(\bar{X} = \frac{1+3+5}{3} = 3\)

  • \(S_X^2 = \frac{1}{2}[(1-3)^2 + (3-3)^2 + (5-3)^2] = \frac{1}{2}[4 + 0 + 4] = 4\)

For \(Y = \{2, 6, 10\}\): - \(\bar{Y} = \frac{2+6+10}{3} = 6 = 2 \times 3\)

  • \(S_Y^2 = \frac{1}{2}[(2-6)^2 + (6-6)^2 + (10-6)^2] = \frac{1}{2}[16 + 0 + 16] = 16\)

Check: \(a^2 S_X^2 = 2^2 \times 4 = 16\)

Complete Summary of Results

For the linear transformation \(Y = \{y_i\}_{i=1}^{n}\) where \(y_i = a x_i\):

  1. Sum: \(\sum_{i=1}^{n} y_i = a \sum_{i=1}^{n} x_i\)

  2. Mean: \(\bar{Y} = a \bar{X}\)

  3. Variance: \(S_Y^2 = a^2 S_X^2\)

Note: The standard deviation relationship is \(S_Y = |a| S_X\)

Practical Implications

Scaling Up (a > 1):

  • Sums and means increase by factor \(a\)

  • Variance increases by factor \(a^2\)

  • Data becomes more spread out

Example: Converting inches to feet

  • If \(a = 12\), variance increases by \(12^2 = 144\)

Scaling Down (0 < a < 1):

  • Sums and means decrease by factor \(a\)

  • Variance decreases by factor \(a^2\)

  • Data becomes less spread out

Example: Converting dollars to cents

  • If \(a = 0.01\), variance decreases by \((0.01)^2 = 0.0001\)

Real-World Applications

Temperature Conversion:

  • Celsius to Fahrenheit: \(F = \frac{9}{5}C + 32\) (not linear!)

  • Celsius to Kelvin: \(K = C + 273.15\) (not linear!)

  • But scaling: \(C_{doubled} = 2C\) is linear with \(a = 2\)

Unit Conversions: - Meters to centimeters: \(a = 100\)

  • Dollars to cents: \(a = 100\)

  • Hours to minutes: \(a = 60\)

Why These Properties Matter

Statistical Significance:

  1. Standardization: Converting to z-scores uses linear transformations
  2. Unit Changes: Results remain proportionally correct
  3. Data Analysis: Understanding how transformations affect summary statistics
  4. Modeling: Linear regression relies on these properties

Key Insight: Linear transformations preserve the shape of the distribution while changing location and scale.

Practice Problem

Try This: A dataset has mean \(\bar{X} = 15\) and variance \(S_X^2 = 9\).

If we transform the data using \(y_i = 3x_i - 2\), what are the new mean and variance?

Tip

Hint: This is not a pure linear transformation! You need \(y_i = 3x_i - 2 = 3(x_i - \frac{2}{3})\)

Answer:

  • New mean: \(\bar{Y} = 3(15) - 2 = 43\)

  • New variance: \(S_Y^2 = 3^2 \times 9 = 81\)

  • (The constant \(-2\) doesn’t affect variance!)

Common Mistakes to Avoid

❌ Wrong: \(S_Y^2 = a S_X^2\)

✅ Correct: \(S_Y^2 = a^2 S_X^2\)

Why: Variance involves squared deviations, so the scaling factor gets squared too.

❌ Wrong: Adding constants affects variance

✅ Correct: Only multiplication affects variance; addition only shifts the mean

Conclusion

What We Proved:

For the linear transformation \(y_i = ax_i\) with constant \(a \neq 0\):

  1. Sums scale linearly: Factor of \(a\)
  2. Means scale linearly: Factor of \(a\)
  3. Variances scale quadratically: Factor of \(a^2\)

Key Takeaway: Understanding these relationships is fundamental for:

  • Data transformations

  • Statistical modeling

  • Unit conversions

  • Standardization procedures

Questions?

Key Concepts Covered:

  • Summation notation and indexing

  • Linear transformations

  • Properties of means and variances

  • Step-by-step mathematical proofs

Next Steps:

  • Apply to real datasets

  • Explore non-linear transformations

  • Practice with different scaling factors

Additional Practice

Exercise 1: If \(X = \{10, 20, 30, 40\}\) and \(Y = \{-5, -10, -15, -20\}\), what is the value of \(a\)?

Exercise 2: A dataset has \(\bar{X} = 50\) and \(S_X = 10\). After transformation \(y_i = 0.5x_i\), find \(\bar{Y}\) and \(S_Y\).

Exercise 3: Prove that if \(y_i = ax_i + b\) (adding a constant), then \(S_Y^2 = a^2 S_X^2\) (the constant doesn’t affect variance).