Lesson 2 — Resampling to ask if new data still belongs

BIO 202, Spring 2026, draft v1. Measurements arriving one at a time. You will not always be told when (or whether) the population doing the generating has changed.

What you'll do

Stage A replays the end of Lesson 1 as a movie. Stages B, C, and D break it. Predict before you play each stage.

Why "does this still belong"? Albino baby alligators get born at the same rate in a Florida swamp and in a captive enclosure — but the wild ones get eaten within days, because a bright-white reptile in a green swamp is a beacon. Sample the "newly hatched" population: identical distributions in both places. Sample the "still alive a week later" population: the swamp sample has shifted, the enclosure one hasn't. Today's question is the question a biologist faces every time the data arrives in pieces: did the population that produced the new measurements change, or did I just get unlucky?

A — Running mean of a stable population

Draws stream in one at a time. Watch the running mean. Watch the interval around it.

Scenario

Adults walk out one at a time. You measure each one's height. Running mean ȳ (blue) and a 95% bootstrap interval (ribbon) update after every draw. The dashed red line is the true μ.

Click "Step the stream" and watch.

Draws + running mean with bootstrap interval

Readouts: draws so far, running ȳ, true μ = 168.0, CI width.

Prediction (required before the stream starts)

  Q1. You watch 10 draws come in, then watch another 90. Which statement is more accurate?
  Q2. The bootstrap CI on ȳ is your sense of "how wrong could ȳ still be?" After 200 draws it is roughly 2.8 cm wide (about ±1.4 cm, since σ = 10). After 800 draws (4× more data) it should be roughly:

Stream at least 80 draws to unlock Stage B.

Controls

42
5

R code — running mean + bootstrap CI on a stationary stream

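# Stage A: draws stream from N(168, sd = 10). Running mean + bootstrap CI.
set.seed(42)
mu_pop <- 168
sigma  <- 10
y <- rnorm(400, mu_pop, sigma)
running <- cumsum(y) / seq_along(y)

# bootstrap CI on the running mean at each step
# (resample indices, not values: sample(y[1:k], ...) misbehaves at k = 1,
#  because sample() treats a single number x as 1:x)
ci <- t(sapply(seq_along(y), function(k) {
  draws <- replicate(200, mean(y[sample.int(k, k, replace = TRUE)]))
  quantile(draws, c(0.025, 0.975))
}))

plot(running, type = "l", col = "#2f6b8f", lwd = 2,
     xlab = "draw index", ylab = "running mean")
lines(ci[, 1], col = "#2f6b8f", lty = 3)   # lower CI bound
lines(ci[, 2], col = "#2f6b8f", lty = 3)   # upper CI bound
abline(h = mu_pop, col = "#b23a48", lwd = 2, lty = 2)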
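Once you have committed to your Q2 prediction, one way to check it is to measure the interval width directly at a few sample sizes. The sketch below is a minimal version under assumptions: the helper name boot_ci_width and the choice of 1000 resamples are illustrative and not part of the lesson's code. It prints the bootstrap width next to the normal-theory width 2 × 1.96 × s / √n so you can compare the two.

```r
# Sketch: how wide is the 95% bootstrap interval for the mean at a given n?
# boot_ci_width() is a hypothetical helper, not part of the lesson's code.
boot_ci_width <- function(x, B = 1000) {
  n <- length(x)
  # Resample indices rather than values so a length-1 x cannot trip sample().
  means <- replicate(B, mean(x[sample.int(n, n, replace = TRUE)]))
  q <- quantile(means, c(0.025, 0.975))
  unname(q[2] - q[1])
}

set.seed(42)
y <- rnorm(800, mean = 168, sd = 10)

for (n in c(50, 200, 800)) {
  boot_w   <- boot_ci_width(y[1:n])
  normal_w <- 2 * 1.96 * sd(y[1:n]) / sqrt(n)   # normal-theory width
  cat(sprintf("n = %3d  bootstrap width = %.2f  normal-theory width = %.2f\n",
              n, boot_w, normal_w))
}
```

Run it only after predicting. The two width columns should track each other closely for a stable population like this one.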

B — The population switches under you, without warning

Same kind of stream. Same running mean. Somewhere in the middle, the building changes.

Complete Stage A (submit prediction, stream 80 draws) to unlock this section.

Scenario

The stream starts the same way Stage A did. At some draw index you set, the door switches. The next adult comes from a different population (NBA players, μ ≈ 199 cm).

Your job: spot when the switch happens, before clicking "Reveal switch".

The interval no longer covers 168. The model "all draws come from N(168, 10)" has stopped fitting.

Running mean + CI, with a hidden switch

Readouts: draws so far, running ȳ, μ_old = 168.0, μ_new = 199.0, switch at draw (hidden until revealed), whether the CI covers μ_old.

Prediction (required before the stream starts)

  Q1. After the population switches, the running mean ȳ:
  Q2. The bootstrap CI on ȳ stops covering μ_old = 168 cm. What you can honestly say to a colleague:

Stream the data and click "Reveal switch" once you spot it (or once 250 draws have passed).

Controls

switch at draw: 80
seed: 42

R code — stream with a hidden population switch

# Stage B: stream with a hidden switch from N(168, sd = 10) to N(199, sd = 9).
set.seed(42)
switch_at <- 80   # the analyst is not told this
N <- 300
y <- c(rnorm(switch_at, 168, 10),
       rnorm(N - switch_at, 199, 9))
running <- cumsum(y) / seq_along(y)

# at each step from k_min on, ask: does my bootstrap CI still cover mu_old = 168?
# (start at k_min = 10, as in Stage C: with only one or two draws the interval
#  is degenerate and would "exclude" 168 immediately)
k_min <- 10
covers <- sapply(k_min:N, function(k) {
  bs <- replicate(200, mean(y[sample.int(k, k, replace = TRUE)]))
  q  <- quantile(bs, c(0.025, 0.975))
  168 >= q[1] && 168 <= q[2]
})
which(!covers)[1] + k_min - 1   # first draw at which CI excludes mu_old

C — How fast does the alarm fire?

Same setup as Stage B, run 100 times. Sliders for shift size and CI level. Predict before you run.

Complete Stage B (submit prediction, reveal the switch) to unlock this section.

Scenario

100 replicates. Each runs 300 draws: first 100 from N(168, 10), then a switch to N(168 + Δ, 10). Δ and the CI level are sliders. We record the first draw at which the CI excludes μ_old.

Gray bars: true alarms (after the switch). Orange bars: false alarms (CI broke before the switch).

Histogram of "first alarm draw index" across 100 replicates

Readouts: median alarm, false-alarm rate, missed (no alarm by draw 300).

Prediction (required before sliders unlock)

  Q1. You raise the shift magnitude Δ from 5 cm to 15 cm. The median first-alarm draw will:
  Q2. You widen the CI from 95% to 99%. The false-alarm rate (alarms that fire before the real switch) will:

Run 100-replicate batches across at least 3 (Δ, CI) combinations to unlock Stage D.

Controls

Δ (cm): 10.0
CI level (%): 95
seed: 42

R code — replicate the detection experiment

# Stage C: repeat the detection experiment 100 times, record the first alarm.
set.seed(42)
delta <- 10.0
ci_level <- 95 / 100
alpha <- 1 - ci_level

first_alarm <- replicate(100, {
  # 100 draws before the switch, 200 after
  y <- c(rnorm(100, 168, 10), rnorm(200, 168 + delta, 10))
  # start checking at draw 10 so the interval is not degenerate
  covers <- sapply(10:length(y), function(k) {
    bs <- replicate(100, mean(sample(y[1:k], k, replace = TRUE)))
    q <- quantile(bs, c(alpha/2, 1 - alpha/2))
    168 >= q[1] & 168 <= q[2]
  })
  # + 9 maps the position in covers back to a draw index; NA if no alarm fired
  which(!covers)[1] + 9
})

hist(first_alarm, breaks = 30, col = "gray70")
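The histogram shows the shape, but the three readouts (median alarm, false-alarm rate, missed) need one more step. A minimal sketch of that step follows. The helper name summarise_alarms and its arguments are hypothetical, and two choices are assumptions: the median is taken over every alarm that fired, and "missed" is reported as a fraction of replicates. With the Stage C code above, switch_at would be 100, since the first 100 draws come from the old population.

```r
# Sketch: summarise first-alarm indices from a detection experiment.
# summarise_alarms() and its argument names are hypothetical helpers.
summarise_alarms <- function(first_alarm, switch_at) {
  fired <- !is.na(first_alarm)
  false_alarm <- fired & first_alarm <= switch_at   # fired on pre-switch data
  list(
    median_alarm     = median(first_alarm[fired]),  # over all alarms that fired
    false_alarm_rate = mean(false_alarm),
    missed_rate      = mean(!fired)                 # no alarm by the last draw
  )
}

# Toy input: six replicates, switch after draw 100, one replicate never fires.
toy <- c(12, 105, 110, NA, 130, 98)
str(summarise_alarms(toy, switch_at = 100))
```

To use it on a real batch, pass the first_alarm vector from the experiment in place of toy.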

D — A real shifted population: NHANES vs NBA

Two real datasets. Draw one adult from each. Look at where the bars overlap.

Complete Stage C (submit prediction, run 3 batches) to unlock this section.

Scenario

Gray: 7,414 NHANES adults. Blue: 4,768 NBA careers (males). One random individual from each, on every draw.

NHANES (gray) and NBA (blue) — heights, with random draws on top

Readouts: NHANES μ, NBA μ, Δμ, % NHANES taller than the NBA median, draws so far.

Prediction (required before the draw button unlocks)

  Q1. The mean NBA height is about 30 cm greater than the NHANES mean. What fraction of NHANES adults do you expect to be taller than the median NBA player?
  Q2. You draw 10 random NBA players and 10 random NHANES adults. Among these 20 individuals, the tallest one is most likely:

Make at least 20 paired draws to wrap up.

Controls

seed: 42

R code — two real populations side by side

nh  <- read.csv("data/clean/nhanes_adults.csv")
nba <- read.csv("data/clean/nba_players.csv")
nba_h_cm <- nba$height_in * 2.54

mean(nh$Height); sd(nh$Height)
mean(nba_h_cm); sd(nba_h_cm)
mean(nh$Height > median(nba_h_cm))

set.seed(42)
hist(nh$Height, breaks = 40, freq = FALSE,
     col = rgb(.5, .5, .5, .4), border = NA,
     xlim = c(140, 220), xlab = "height (cm)", main = "")
hist(nba_h_cm, breaks = 40, freq = FALSE,
     col = rgb(.18, .42, .56, .5), border = NA, add = TRUE)

Stretch challenge (optional)

NBA heights come in feet-inches, NHANES heights in cm. Convert the NBA roster yourself (1 inch = 2.54 cm) and reproduce the overlap fraction on real data. Then ask: are the two σ's the same? If not, which population is wider, and why might that be?
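If you get stuck on the conversion, here is one possible starting point. It is a sketch under an assumption: that raw heights arrive as strings like "6-7" (feet, hyphen, inches). Check your file, because the actual format may differ. The helper name ft_in_to_cm is hypothetical.

```r
# Sketch: convert "feet-inches" strings to centimetres.
# The "6-7" input format is an ASSUMPTION about the raw roster file.
ft_in_to_cm <- function(x) {
  parts  <- strsplit(as.character(x), "-", fixed = TRUE)
  feet   <- as.numeric(vapply(parts, `[`, character(1), 1))
  inches <- as.numeric(vapply(parts, `[`, character(1), 2))
  (feet * 12 + inches) * 2.54   # 12 inches per foot, 2.54 cm per inch
}

ft_in_to_cm(c("6-7", "5-11", "7-0"))
```

From there, sd() on the converted vector and on the NHANES heights gives you the two spreads to compare.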

Showcase — when two populations stop being one

Same machine you just ran on NHANES vs. NBA. Different islands.

What you're looking at

A few million years ago, a small number of tortoises washed up on Galápagos and founded every population on every island. One source pool, one species.

Today their shells look different on each island. Tortoises on dry, sparse islands like Pinta carry saddleback shells with a tall front opening — necks have to reach up to graze cactus pads. Tortoises on lush volcanic islands like Isabela carry domed shells with a low front opening — they graze grasses near the ground. x here is the height of the front shell opening in cm.

The question is the same one you asked of NHANES vs. NBA: could these two samples have come from a single underlying population?

Top: each island's distribution of front-opening heights (n=20 each). Bottom: bootstrap of the difference of means.

Readouts: Pinta mean (cm), Isabela mean (cm), observed Δ (cm), bootstrap 95% CI on Δ, whether the CI includes 0.

What the bootstrap is doing

Both islands trace back to one ancestor population. What kept the shapes from re-merging? Vertical transmission within each island — Pinta parents had Pinta-shape babies on Pinta, generation after generation. And only weak transmission between islands — tortoises don't swim across kilometres of ocean. So the two channels of inheritance ran in parallel for long enough to drift apart and stay apart.

The bootstrap distribution below is the diagnostic for that breakdown: if the 95% CI of the difference does not include zero, the two samples are no longer consistent with a single common pool.

By the end of the course you will ask this same question — did transmission between two things stop? — about cells in a body, workers in a colony, lineages in a clade. Same engine. Different scale. (Unit 5 names the algebra: it is the Price equation.)

R code — bootstrap the difference of means

t <- read.csv("data/clean/galapagos_tortoises.csv")
p <- subset(t, island == "pinta")$front_opening_cm
i <- subset(t, island == "isabela")$front_opening_cm
delta_obs <- mean(p) - mean(i)

# bootstrap: resample each island with replacement, recompute the difference
deltas <- replicate(2000, {
  pb <- sample(p, length(p), replace = TRUE)
  ib <- sample(i, length(i), replace = TRUE)
  mean(pb) - mean(ib)
})
quantile(deltas, c(0.025, 0.975))   # 95% CI on the difference
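If you do not have the tortoise file to hand, the same resampling recipe can be tried on simulated stand-ins. The sketch below wraps it in a function and adds the explicit "does the CI include 0?" check. Everything named here is an assumption: boot_diff_ci is a hypothetical helper, and the means and SDs are made-up numbers chosen only to give two separated samples. They are not the tortoise measurements.

```r
# Sketch: bootstrap CI for a difference of means, with an explicit zero check.
# boot_diff_ci() is a hypothetical helper; the data below are SIMULATED
# stand-ins, not the tortoise measurements.
boot_diff_ci <- function(x, y, B = 2000, level = 0.95) {
  alpha <- 1 - level
  deltas <- replicate(B, {
    xb <- x[sample.int(length(x), replace = TRUE)]   # resample group 1
    yb <- y[sample.int(length(y), replace = TRUE)]   # resample group 2
    mean(xb) - mean(yb)
  })
  ci <- unname(quantile(deltas, c(alpha / 2, 1 - alpha / 2)))
  list(observed      = mean(x) - mean(y),
       ci            = ci,
       includes_zero = ci[1] <= 0 && ci[2] >= 0)
}

set.seed(42)
# Made-up means and SDs (cm), n = 20 per group as in the figure.
island_a  <- rnorm(20, mean = 30, sd = 4)
island_b  <- rnorm(20, mean = 22, sd = 4)
same_pool <- rnorm(20, mean = 30, sd = 4)   # second sample from island_a's pool

str(boot_diff_ci(island_a, island_b))    # two separated populations
str(boot_diff_ci(island_a, same_pool))   # two samples from one population
```

The second comparison will usually give an interval that includes zero, but not always. A 95% interval still misses about one time in twenty, which is the false-alarm idea from Stage C.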