BIO 202, Spring 2026, draft v1. Measurements arriving one at a time. You will not always be told when (or whether) the population doing the generating has changed.
Stage A replays the end of Lesson 1 as a movie. Stages B, C, and D break it. Predict before you play each stage.
Draws stream in one at a time. Watch the running mean. Watch the interval around it.
Adults walk out one at a time. You measure each one's height. Running mean ȳ (blue) and a 95% bootstrap interval (ribbon) update after every draw. The dashed red line is the true μ.
Click "Step the stream" and watch.
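# Stage A: draws stream from N(168, 10). Running mean + bootstrap CI.
set.seed(42)
mu_pop <- 168
sigma <- 10
y <- rnorm(400, mu_pop, sigma)
running <- cumsum(y) / seq_along(y)

# bootstrap CI on the running mean at each step
# (resample indices with sample.int(): sample(y[1:k], ...) misbehaves at k = 1,
#  where R treats a single number x as the range 1:x)
ci <- t(sapply(seq_along(y), function(k) {
  draws <- replicate(200, mean(y[sample.int(k, k, replace = TRUE)]))
  quantile(draws, c(0.025, 0.975))
}))

plot(running, type = "l", col = "#2f6b8f", lwd = 2,
     xlab = "draw index", ylab = "running mean")
lines(ci[, 1], col = "#2f6b8f", lty = 3)  # lower edge of the 95% interval
lines(ci[, 2], col = "#2f6b8f", lty = 3)  # upper edge of the 95% interval
abline(h = mu_pop, col = "#b23a48", lwd = 2, lty = 2)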
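A cross-check on the ribbon, as a rough sketch: the textbook normal-theory interval ȳ ± 1.96·s/√k, computed from the same stream. This is not the lesson's bootstrap; it swaps in the normal approximation (with qnorm rather than a t quantile) so you can see the same narrowing without any resampling. The names running_var, running_se, and half_width are introduced here for illustration.

```r
# Sketch: normal-theory 95% interval around the running mean,
# as a cross-check on the bootstrap ribbon (not the lesson's own method).
set.seed(42)
mu_pop <- 168
sigma  <- 10
y <- rnorm(400, mu_pop, sigma)

k <- seq_along(y)
running <- cumsum(y) / k

# running sample variance from cumulative sums; undefined (NaN) at k = 1
running_var <- (cumsum(y^2) - k * running^2) / (k - 1)
running_se  <- sqrt(running_var / k)

lower <- running - qnorm(0.975) * running_se
upper <- running + qnorm(0.975) * running_se

# the half-width shrinks roughly like 1 / sqrt(k)
half_width <- (upper - lower) / 2
round(half_width[c(25, 100, 400)], 2)
```

Quadrupling the number of draws should roughly halve the half-width. Compare these numbers with the width of the bootstrap ribbon at the same draw indices.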
Same kind of stream. Same running mean. Somewhere in the middle, the building changes.
The stream starts the same way Stage A did. At some draw index you set, the door switches. The next adult comes from a different population (NBA players, μ ≈ 199 cm).
Your job: notice when the switch happens, before you click "Reveal switch".
# Stage B: stream with a hidden switch from N(168, 10) to N(199, 9).
set.seed(42)
switch_at <- 80  # the analyst is not told this
N <- 300
min_k <- 10      # skip the first few draws (as Stage C does): at k = 1 the bootstrap CI is a single point
y <- c(rnorm(switch_at, 168, 10), rnorm(N - switch_at, 199, 9))
running <- cumsum(y) / seq_along(y)

# at each step, ask: does my bootstrap CI still cover mu_old = 168?
covers <- sapply(min_k:N, function(k) {
  bs <- replicate(200, mean(y[sample.int(k, k, replace = TRUE)]))
  q <- quantile(bs, c(0.025, 0.975))
  168 >= q[1] && 168 <= q[2]
})
which(!covers)[1] + min_k - 1  # first draw at which CI excludes mu_old
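Why the running mean is slow to react: after the switch it still averages in every pre-switch draw. A minimal sketch of the expected running mean at draw k, using the Stage B values (switch after draw 80, 168 cm to 199 cm). The function name expected_running_mean is made up for this illustration, and the calculation involves no randomness.

```r
# Expected running mean at draw k when the stream switches after draw `switch_at`.
# Defaults are the Stage B values; this is arithmetic, not a simulation.
expected_running_mean <- function(k, switch_at = 80, mu_old = 168, mu_new = 199) {
  n_old <- pmin(k, switch_at)      # draws from the old population so far
  n_new <- pmax(k - switch_at, 0)  # draws from the new population so far
  (n_old * mu_old + n_new * mu_new) / k
}

k <- c(80, 90, 160, 300)
data.frame(k = k, expected_mean = expected_running_mean(k))
```

At draw 160 the stream is half old, half new, so the expected running mean sits exactly halfway, at 183.5. Even at draw 300 it is only about 190.7, still well short of 199.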
Same setup as Stage B, run 100 times. Sliders for shift size and CI level. Predict before you run.
100 replicates. Each runs 300 draws: first 100 from N(168, 10), then a switch to N(168 + Δ, 10). Δ and the CI level are sliders. We record the first draw at which the CI excludes μ_old.
Gray bars: true alarms (after the switch). Orange bars: false alarms (CI broke before the switch).
set.seed(42)
delta <- 10.0
ci_level <- 95 / 100
alpha <- 1 - ci_level
first_alarm <- replicate(100, {
  y <- c(rnorm(100, 168, 10), rnorm(200, 168 + delta, 10))
  # start at k = 10 so the bootstrap has a few draws to work with
  covers <- sapply(10:length(y), function(k) {
    bs <- replicate(100, mean(sample(y[1:k], k, replace = TRUE)))
    q <- quantile(bs, c(alpha / 2, 1 - alpha / 2))
    168 >= q[1] & 168 <= q[2]
  })
  # shift back to the draw index; stays NA if the CI never excludes mu_old
  which(!covers)[1] + 9
})
hist(first_alarm, breaks = 30, col = "gray70")
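The histogram above draws every alarm in one color. One way to separate true alarms from false ones, sketched under the assumption that the switch comes after draw 100 (as in this stage). The helper classify_alarms and the vector toy_alarms are hypothetical: the indices are made up for illustration and are not simulation output.

```r
# Split first-alarm indices into false alarms (at or before the switch),
# true alarms (after the switch), and runs that never alarmed (NA).
classify_alarms <- function(first_alarm, switch_at = 100) {
  kind <- ifelse(is.na(first_alarm), "no alarm",
          ifelse(first_alarm <= switch_at, "false alarm", "true alarm"))
  factor(kind, levels = c("false alarm", "true alarm", "no alarm"))
}

# Hypothetical first-alarm indices, made up for illustration only
toy_alarms <- c(12, 57, 104, 118, 131, NA)
kind <- classify_alarms(toy_alarms)
table(kind)

# Detection delay for the true alarms: draws elapsed since the switch
delay <- toy_alarms[kind == "true alarm"] - 100
delay
```

To use it on the real run, pass first_alarm in place of toy_alarms. The share of false alarms and the typical delay are the two numbers the sliders trade against each other.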
Two real datasets. Draw one adult from each. Look at where the bars overlap.
Gray: 7,414 NHANES adults. Blue: 4,768 NBA careers (males). One random individual from each, on every draw.
nh <- read.csv("data/clean/nhanes_adults.csv")
nba <- read.csv("data/clean/nba_players.csv")
nba_h_cm <- nba$height_in * 2.54  # inches to cm

mean(nh$Height); sd(nh$Height)
mean(nba_h_cm); sd(nba_h_cm)

# overlap: fraction of NHANES adults taller than the median NBA player
mean(nh$Height > median(nba_h_cm))

set.seed(42)
hist(nh$Height, breaks = 40, freq = FALSE, col = rgb(.5, .5, .5, .4), border = NA,
     xlim = c(140, 220), xlab = "height (cm)", main = "")
hist(nba_h_cm, breaks = 40, freq = FALSE, col = rgb(.18, .42, .56, .5), border = NA, add = TRUE)
NBA heights come in feet-inches, NHANES heights in cm. Convert the NBA roster yourself (1 inch = 2.54 cm) and reproduce the overlap fraction on real data. Then ask: are the two σ's the same? If not, which population is wider, and why might that be?
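A possible starting point for the conversion, as a sketch only. It assumes the roster stores heights as text like "6-7" (feet, hyphen, inches). That layout is a guess, so check the real file first. The function name ft_in_to_cm is made up for this example.

```r
# Convert feet-inches strings to centimetres.
# ASSUMPTION: heights are stored as text like "6-7" (feet, hyphen, inches);
# the real roster file may use a different layout.
ft_in_to_cm <- function(x) {
  parts  <- strsplit(x, "-", fixed = TRUE)
  feet   <- as.numeric(vapply(parts, `[`, character(1), 1))
  inches <- as.numeric(vapply(parts, `[`, character(1), 2))
  (feet * 12 + inches) * 2.54  # total inches, then 1 inch = 2.54 cm
}

ft_in_to_cm(c("6-7", "5-11", "7-0"))
```

A 6-7 player is 79 inches, which is 200.66 cm. Once the whole column is converted, sd() on each dataset answers the question about the two σ's.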