Lesson 7 — Tracing how much of a parent ends up in their child
BIO 202, Spring 2026, draft v2. The same regression machinery you have been using since Lesson 3, applied to parents and their children. By Stage C, the slope has a familiar identity.
What you'll do
Four stages. Build a parent-offspring simulator, then run the regression on 934 real children from Galton's 1885 paper.
A — Two independent traits, no relationship
Simulate a population where two traits are generated independently. Read three summary numbers; reseed and read them again.
Scenario
Two simulated traits, each drawn independently from Normal(0, 1) for n individuals. Read the three summary numbers (covariance, Pearson r, Spearman ρ) and watch how they move as you change the seed.
Figure: scatter of two independent traits, with readouts for n (default 200), cov(x, y), Pearson r, and Spearman ρ.
Prediction (required before sliders unlock)
Q1. You simulate two traits that are truly independent (no real connection) and measure 200 individuals. What do you predict the sample Pearson r will be?
Q2. You keep simulating the same two independent traits but increase n from 200 → 2,000 → 20,000. What happens to the typical sample r?
Try at least 5 seed/n combinations to unlock Stage B.
R code — independent traits, three summaries
set.seed(42)
n <- 200
x <- rnorm(n); y <- rnorm(n)
cov(x, y)
cor(x, y)                       # Pearson
cor(x, y, method = "spearman")  # Spearman rank correlation
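To repeat the reseeding exercise in bulk, here is a minimal sketch that draws many pairs of independent traits at each n and summarizes how far the sample Pearson r wanders. The helper name `r_spread`, the number of replicates, and the seed are assumptions made for illustration; they are not part of the lesson's simulator.

```r
# Sketch: spread of the sample Pearson r when the two traits are truly independent.
# `r_spread` is a hypothetical helper; reps and seed are arbitrary choices.
r_spread <- function(n, reps = 200, seed = 1) {
  set.seed(seed)
  # One sample r per replicate; each replicate is a fresh pair of independent traits
  r <- replicate(reps, cor(rnorm(n), rnorm(n)))
  c(n = n, mean_r = mean(r), sd_r = sd(r))
}

# One row per sample size used in Q2
spread <- t(sapply(c(200, 2000, 20000), r_spread))
print(round(spread, 4))
```

Commit to your Q1 and Q2 answers first, then compare the mean_r and sd_r columns across the three rows.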
B — Couple the traits: introduce heritability h²
Now the offspring trait depends on the parents' trait. h² is a slider. Move it.
Q3. (Reflection — not scored.) Suppose β̂ came out as 1.0 in some other population instead of 0.65. In one sentence, what would that say about the heritability of height in that population? Write your answer below.
Drag h² through at least 8 distinct values to unlock Stage D.
R code — read h² off the regression slope
set.seed(42)
n <- 500
h2 <- 0.50
mu <- 68; sigma <- 3
f <- rnorm(n, mu, sigma); m <- rnorm(n, mu, sigma)
midparent <- (f + m) / 2
offspring <- mu + h2 * (midparent - mu) + rnorm(n, 0, sigma * sqrt(1 - h2))
fit <- lm(offspring ~ midparent)
coef(fit)[2]                     # the slope IS h-hat-squared
summary(fit)$coefficients[2, 2]  # SE on h²-hat
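One way to connect this slope to the Stage A summaries: for a simple linear regression, the least-squares slope equals cov(x, y) / var(x). The sketch below reuses the same simulator and checks that identity numerically. The variable names `slope_lm` and `slope_cov` are introduced here for illustration only.

```r
# Sketch: the least-squares slope is cov(x, y) / var(x).
# Same simulator as the listing above; h2 = 0.50 is just an example value.
set.seed(42)
n <- 500
h2 <- 0.50
mu <- 68; sigma <- 3
f <- rnorm(n, mu, sigma)
m <- rnorm(n, mu, sigma)
midparent <- (f + m) / 2
offspring <- mu + h2 * (midparent - mu) + rnorm(n, 0, sigma * sqrt(1 - h2))

# Slope from lm() versus slope built from the covariance and variance
slope_lm <- unname(coef(lm(offspring ~ midparent))[2])
slope_cov <- cov(midparent, offspring) / var(midparent)

print(c(lm = slope_lm, cov_over_var = slope_cov))
print(all.equal(slope_lm, slope_cov))
```

Change h2 and rerun: the two numbers should track each other at every value, since they are the same quantity computed two ways.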
D — Galton's 1885 data: 934 children, 197 families
Real father–mother–child records. Same regression you just built. Read the slope, then bootstrap it.
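If you want to see the bootstrap step written out before running it on the real records, here is a minimal sketch of a case-resampling bootstrap for a regression slope. The data are simulated stand-ins, not Galton's records, and the column names `midparent` and `child`, the generating slope, and the number of resamples `B` are all assumptions.

```r
# Sketch: case-resampling bootstrap for a regression slope.
# SIMULATED stand-in data, not Galton's records; column names are assumptions.
set.seed(42)
n <- 934
midparent <- rnorm(n, 68, 3 / sqrt(2))
child <- 68 + 0.5 * (midparent - 68) + rnorm(n, 0, 2)  # arbitrary generating model
d <- data.frame(midparent = midparent, child = child)

# Slope of child on midparent for any data frame with those two columns
slope <- function(data) unname(coef(lm(child ~ midparent, data = data))[2])

B <- 1000
boot_slopes <- replicate(B, {
  idx <- sample.int(nrow(d), replace = TRUE)  # resample rows with replacement
  slope(d[idx, ])
})

est <- slope(d)
se_boot <- sd(boot_slopes)                    # bootstrap standard error
ci <- quantile(boot_slopes, c(0.025, 0.975))  # percentile interval
print(c(estimate = est, se = se_boot, ci))
```

This version resamples individual children. Siblings share the same parents, so resampling whole families instead of rows is a variation worth trying on the real data.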
Galton's mother coefficient of 1.08, the factor applied to the mother's height when forming the midparent, is rough. Refit using an uncorrected mother coefficient of 1.0 and report how the slope changes. Then run a per-sex fit: male offspring on midparent (no correction), then female offspring on midparent. Which approach gives the cleanest ĥ²? Hit "I tried it" after you have an answer.
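A possible starting point for the refits, sketched on simulated stand-in data rather than Galton's records. The column names (`father`, `mother`, `child`, `sex`), the helper `slope_for`, and the generating model are all assumptions. The code shows how the four fits could be organized, not what the real data will show.

```r
# Sketch: compare mother coefficients and per-sex fits.
# SIMULATED stand-in data; column names and the generating model are assumptions.
set.seed(42)
n <- 934
father <- rnorm(n, 69, 2.5)
mother <- rnorm(n, 64, 2.3)
sex <- sample(c("male", "female"), n, replace = TRUE)
# Arbitrary generating model, chosen only so the code has something to fit
mid_true <- (father + 1.08 * mother) / 2
child_m <- 69 + 0.6 * (mid_true - mean(mid_true)) + rnorm(n, 0, 2)
child <- ifelse(sex == "male", child_m, child_m / 1.08)
d <- data.frame(father, mother, child, sex)

# Slope of child on midparent, with midparent built from a given mother coefficient
slope_for <- function(data, mother_coef) {
  data$midparent <- (data$father + mother_coef * data$mother) / 2
  unname(coef(lm(child ~ midparent, data = data))[2])
}

slopes <- c(
  pooled_1.08 = slope_for(d, 1.08),
  pooled_1.00 = slope_for(d, 1.00),
  male_1.00   = slope_for(d[d$sex == "male", ], 1.00),
  female_1.00 = slope_for(d[d$sex == "female", ], 1.00)
)
print(round(slopes, 3))
```

Child heights are used as recorded here. Whether to rescale daughters' heights in the pooled fit is a separate choice you should state alongside your answer.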