BIO 202 — Lesson 7: Tracing how much of a parent ends up in their child

A — Two independent traits, no relationship

Simulate a population where two traits are generated independently. Read three summary numbers; reseed and read them again.

Scenario

Two simulated traits, each drawn independently from Normal(0, 1), for n individuals. Read the three summary numbers (covariance, Pearson r, Spearman ρ) and watch how they change as you change the seed.

Scatter of two independent traits

n: 200 | cov(x, y): — | Pearson r: — | Spearman ρ: —

Prediction (required before sliders unlock)

Q1. You simulate two traits that are truly independent (no real connection) and measure 200 individuals. What do you predict the sample Pearson r will be?
  (a) Exactly 0: true independence means zero correlation in the sample.
  (b) Near 0 but generally nonzero: sample r wobbles around the true population value (0) because of finite sampling.
  (c) Equally likely anywhere in (−1, 1): independent traits give random r.
Q2. You keep simulating the same two independent traits but increase n from 200 → 2,000 → 20,000. What happens to the typical sample r?
  (a) The wobble around 0 shrinks: sample r converges to the true value (0).
  (b) The wobble stays the same size: sample size doesn't affect r's spread.
  (c) Sample r grows away from 0: more data magnifies the apparent correlation.

Try at least 5 seed/n combinations to unlock Stage B.

Controls

n: 200

seed: 42

R code — independent traits, three summaries

set.seed(42)
n <- 200
x <- rnorm(n); y <- rnorm(n)
cov(x, y)
cor(x, y)                        # Pearson
cor(x, y, method = "spearman")   # Spearman rank correlation
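A single seed shows one value of r. To see how much r wobbles at each sample size, you can repeat the simulation many times per n and measure the spread. The sketch below is one way to do that. The number of replicates (`reps`) and the names `sizes` and `r_spread` are choices made for this illustration, not part of the lesson's simulator.

```r
# Spread of the sample Pearson r for two independent Normal(0, 1) traits.
# Assumption: 200 replicate datasets per n is enough to see the trend.
set.seed(42)
reps  <- 200
sizes <- c(200, 2000, 20000)

r_spread <- sapply(sizes, function(n) {
  r <- replicate(reps, cor(rnorm(n), rnorm(n)))
  sd(r)                          # typical wobble of sample r around 0
})

# Compare the simulated spread with the reference value 1 / sqrt(n)
data.frame(n = sizes,
           sd_of_r = r_spread,
           one_over_sqrt_n = 1 / sqrt(sizes))
```

For independent normal traits the standard deviation of sample r is close to 1/√n, so each tenfold increase in n should shrink the wobble by roughly a factor of 3.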

B — Couple the traits: introduce heritability h²

Now the offspring trait depends on the parents' trait. h² is a slider. Move it.

Scenario

Simulated parent–offspring pairs. Midparent m = (father + mother) / 2. Offspring trait generated as:

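y = μ + h²·(m − μ) + Normal(0, σ·√(1 − h²))

Here μ is the population mean, σ is the standard deviation of each parent's trait, and the second argument of Normal is a standard deviation (as in R's rnorm).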

Move h². Watch the cloud.

Midparent vs. offspring, with cloud tilt

true h²: 0.50 | cov: — | Pearson r: —

Prediction (required before sliders unlock)

Q1. As h² moves from 0 to 1, the covariance between midparent and offspring will:
  (a) Grow with h²
  (b) Stay near zero
  (c) Drop toward negative
Q2. At h² = 1, the cloud of (midparent, offspring) points looks like:
  (a) A round cloud
  (b) A tight line
  (c) A flat horizontal band

Try at least 6 h² values to unlock Stage C.

Controls

h² (heritability): 0.50

n pairs: 300

seed: 42

R code — couple two traits via h²

set.seed(42)
n <- 300
h2 <- 0.50
mu <- 68; sigma <- 3
father <- rnorm(n, mu, sigma); mother <- rnorm(n, mu, sigma)
midparent <- (father + mother) / 2
offspring <- mu + h2 * (midparent - mu) +
             rnorm(n, 0, sigma * sqrt(1 - h2))
cov(midparent, offspring); cor(midparent, offspring)
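Instead of moving the slider by hand, you can sweep h² over a grid in code. The sketch below reuses the generating model above. Because father and mother are drawn independently with standard deviation σ, the midparent has variance σ²/2, so under this model the covariance should sit near h²·σ²/2. The grid `h2_grid`, the large `n`, and the result name `sweep` are assumptions made for this illustration.

```r
# Sweep h2 and compare the sample covariance with h2 * sigma^2 / 2.
set.seed(42)
n <- 20000                       # large n so sampling noise is small (assumption)
mu <- 68; sigma <- 3
h2_grid <- c(0, 0.25, 0.5, 0.75, 1)

sweep <- t(sapply(h2_grid, function(h2) {
  father    <- rnorm(n, mu, sigma)
  mother    <- rnorm(n, mu, sigma)
  midparent <- (father + mother) / 2
  offspring <- mu + h2 * (midparent - mu) +
    rnorm(n, 0, sigma * sqrt(1 - h2))
  c(h2         = h2,
    cov_sample = cov(midparent, offspring),
    cov_theory = h2 * sigma^2 / 2,   # Var(midparent) = sigma^2 / 2
    r_sample   = cor(midparent, offspring))
}))
round(sweep, 3)
```

At h² = 1 the noise term has standard deviation 0, so offspring equals midparent and the points fall on a line.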

C — Fit offspring on midparent

Same simulator. Fit lm(offspring ~ midparent). Compare β̂ to the true h² slider.

Scenario

Same simulator as Stage B. Now we explicitly fit lm(offspring ~ midparent) and display the slope β̂ next to the true h² slider.

Drag h². Watch β̂.
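Why should the fitted slope track the slider? A short derivation under the Stage B generating model, writing ε for the noise term and assuming it is independent of the midparent m:

```latex
\begin{aligned}
y &= \mu + h^2 (m - \mu) + \varepsilon,
  \qquad \varepsilon \text{ independent of } m,\ \mathbb{E}[\varepsilon] = 0 \\
\operatorname{Cov}(m, y) &= h^2 \operatorname{Var}(m) + \operatorname{Cov}(m, \varepsilon)
  = h^2 \operatorname{Var}(m) \\
\beta &= \frac{\operatorname{Cov}(m, y)}{\operatorname{Var}(m)} = h^2
\end{aligned}
```

The least-squares slope β̂ estimates β, so in this simulator it should land near the true h², with sampling error that shrinks as the number of pairs grows.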

Midparent × offspring with fitted line

true h²: 0.50 | fitted slope β̂: — | |β̂ − h²|: —

R²: — | SE(β̂): —

Prediction (required before slider unlocks)

Q1. β̂ is the slope of the regression of offspring on midparent. Pick the description that captures what β̂ does for you:
  (a) For every unit a couple's midparent trait exceeds the population mean, the predicted child exceeds the mean by β̂ units; that is, β̂ estimates the heritability h².
  (b) The Pearson r between offspring and midparent: a dimensionless "are-they-correlated?" answer.
  (c) 1, no matter how strong inheritance is.
Q2. Galton observed slope ≈ 0.65 for child-on-midparent height. Biological reading?
  (a) For every 1 inch a couple's midparent height is above (or below) average, expect their adult child to land about 0.65 inches above (or below). h² ≈ 0.65; "regression toward the mean" is the geometric consequence of h² < 1, not a separate force.
  (b) Heredity in his sample was broken.
  (c) The data are noise: no signal.
Q3. (Reflection, not scored.) Suppose β̂ came out as 1.0 in some other population instead of 0.65. In one sentence, what would that say about the heritability of height in that population? Write your answer below.

Drag h² through at least 8 distinct values to unlock Stage D.

Controls

h² (truth): 0.50

n pairs: 500

seed: 42

R code — read h² off the regression slope

set.seed(42)
n <- 500
h2 <- 0.50
mu <- 68; sigma <- 3
f <- rnorm(n, mu, sigma); m <- rnorm(n, mu, sigma)
midparent <- (f + m) / 2
offspring <- mu + h2 * (midparent - mu) + rnorm(n, 0, sigma * sqrt(1 - h2))
fit <- lm(offspring ~ midparent)
coef(fit)[2]                       # the slope is ĥ²
summary(fit)$coefficients[2, 2]    # SE of ĥ²
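The slope and Pearson r are easy to confuse, so it can help to compute both from the same simulated pairs. The sketch below repeats the Stage C simulation and checks the standard simple-regression identity slope = r · sd(y) / sd(x). The names `slope`, `r`, and `slope_from_r` are introduced here for illustration.

```r
set.seed(42)
n <- 500; h2 <- 0.50
mu <- 68; sigma <- 3
f <- rnorm(n, mu, sigma); m <- rnorm(n, mu, sigma)
midparent <- (f + m) / 2
offspring <- mu + h2 * (midparent - mu) + rnorm(n, 0, sigma * sqrt(1 - h2))

fit   <- lm(offspring ~ midparent)
slope <- unname(coef(fit)[2])          # estimate of h2, in trait units per trait unit
r     <- cor(midparent, offspring)     # dimensionless strength of association

# Identity for simple linear regression: slope = r * sd(y) / sd(x)
slope_from_r <- r * sd(offspring) / sd(midparent)

c(slope = slope, slope_from_r = slope_from_r, pearson_r = r)
confint(fit, "midparent", level = 0.95)   # t-based interval for the slope
```

The two numbers agree only when offspring and midparent have the same standard deviation. Under this generating model at h² = 0.5 the population r works out to about 0.45, so r slightly understates the slope.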

D — Galton's 1885 data: 934 children, 197 families

Real father–mother–child records. Same regression you just built. Read the slope, then bootstrap it.

Scenario

934 child-records, 197 Victorian-era families. Midparent height = (father + 1.08·mother) / 2 (Galton's sex correction).

Fit lm(childHeight ~ midparentHeight). Bootstrap it. The slope is what you came here for.

Galton family data: midparent height vs child height

N children: — | families: — | slope (= ĥ²): —

intercept: — | R²: — | 95% bootstrap on ĥ²: —

Prediction (required before bootstrap unlocks)

Q1. The slope of childHeight on midparentHeight in Galton's data will be:
  (a) Exactly 1: children inherit their parents' full deviation from the population mean.
  (b) About 0.65: children inherit roughly two-thirds of their parents' deviation.
  (c) Near 0: parental height tells you almost nothing about the child.
Q2. Split the data by child sex. The two sex-specific slopes will:
  (a) Be roughly equal
  (b) Have opposite signs
  (c) Both be near zero

Run the bootstrap and use the split-by-sex toggle to wrap up.

Controls

color by child sex

bootstrap reps: 200

seed: 42

R code — Galton in 5 lines

g <- read.csv("data/clean/galton_families.csv")
g$midparent <- (g$father + 1.08 * g$mother) / 2
fit <- lm(childHeight ~ midparent, data = g)
coef(fit)[2]   # ĥ² ≈ 0.65
set.seed(42)   # seed from Controls, so the bootstrap interval is reproducible
B <- 200
replicate(B, {
  k <- sample(nrow(g), nrow(g), replace = TRUE)   # resample rows with replacement
  coef(lm(childHeight ~ midparent, data = g[k, ]))[2]
}) |> quantile(c(.025, .975))   # 95% percentile interval
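The split-by-sex comparison can be sketched in the same style. The block below shows only the mechanics, on simulated stand-in data rather than Galton's records. The column name `sex`, the data frame `g_sim`, and every numeric setting are hypothetical choices for this illustration. The stand-in gives both sexes the same slope by construction, so it says nothing about what the real data will show.

```r
# Stand-in data: NOT Galton's records. `sex` is an assumed column name;
# `midparent` and `childHeight` mirror the names used in the lesson's code.
set.seed(42)
n <- 900
g_sim <- data.frame(
  midparent = rnorm(n, 69, 1.8),
  sex       = sample(c("female", "male"), n, replace = TRUE)
)
# Same slope (0.65) for both sexes by construction; males shifted up 5 units.
g_sim$childHeight <- 64 + 0.65 * (g_sim$midparent - 69) +
  ifelse(g_sim$sex == "male", 5, 0) + rnorm(n, 0, 2.2)

# Fit one regression per sex and collect the two slopes
slope_by_sex <- sapply(split(g_sim, g_sim$sex), function(d) {
  unname(coef(lm(childHeight ~ midparent, data = d))[2])
})
slope_by_sex
```

To run this on the real file, replace `g_sim` with `g` and use whatever the sex column is actually called in the CSV.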

Stretch challenge (optional)

Galton's mother coefficient of 1.08 is rough. Refit using an uncorrected mother coefficient of 1.0 and report how the slope changes. Then run a per-sex fit: male offspring on midparent (no correction), then female offspring on midparent. Which approach gives the cleanest ĥ²? Hit "I tried it" after you have an answer.

