BIO 202, Spring 2026, draft v1. Mendel's peas. Monohybrid 3:1, dihybrid 9:3:3:1, and what to do when the chi-squared on real data fits the expected ratio too well.
Four stages. Run Mendelian crosses in simulation. Predict each ratio before you sample. End on Mendel's own data — and the chi-squared score that has been bothering geneticists since R. A. Fisher noticed it in 1936.
Simulate Mendel's monohybrid cross. Slide the offspring count. Watch how close the observed ratio is to 3:1.
Two heterozygous pea plants (Aa × Aa). Each parent contributes one allele at random (Mendel's law of segregation). Three quarters of the offspring should show the dominant phenotype (AA or Aa); one quarter should be recessive (aa). The expected ratio is 3:1. The observed ratio is whatever you sample.
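The 3:1 expectation can be checked by enumerating the Punnett square directly. The following is a minimal sketch, assuming each parent passes on `A` or `a` with equal probability; the names `grid` and `pheno_prob` are illustrative and not part of the lab code.

```r
# Enumerate the four equally likely allele pairings for Aa x Aa
alleles <- c("A", "a")
grid <- expand.grid(maternal = alleles, paternal = alleles,
                    stringsAsFactors = FALSE)

# An offspring is recessive only when both inherited alleles are "a"
grid$phenotype <- ifelse(grid$maternal == "a" & grid$paternal == "a",
                         "recessive", "dominant")

# Each of the four cells has probability 1/4
pheno_prob <- table(grid$phenotype) / nrow(grid)
print(pheno_prob)
```

Three of the four cells carry at least one `A`, which is where the 0.75 used in the sampling code comes from.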
| Phenotype | Observed | Expected (3:1) | (O−E)²/E |
|---|---|---|---|
| Dominant | — | — | — |
| Recessive | — | — | — |
| χ² (1 df) | — | P = — | |
    set.seed(42)
    n <- 40
    # Each offspring is dominant with prob 3/4 (Aa × Aa)
    offspring <- rbinom(1, n, 0.75)  # count dominant phenotype
    obs <- c(dominant = offspring, recessive = n - offspring)
    # Named `expected` rather than `exp` so base::exp() is not masked
    expected <- c(dominant = n * 0.75, recessive = n * 0.25)
    chisq.test(obs, p = c(0.75, 0.25))
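The columns of the table above can also be filled in by hand, which makes it clearer what `chisq.test` is doing. This is a sketch under the same assumptions as the simulation (n = 40, dominant probability 0.75); the names `contrib`, `chi2` and `p_value` are illustrative.

```r
set.seed(42)
n <- 40
dominant <- rbinom(1, n, 0.75)
observed <- c(dominant = dominant, recessive = n - dominant)
expected <- n * c(dominant = 0.75, recessive = 0.25)

# Per-class contribution (O - E)^2 / E, then the chi-squared total
contrib <- (observed - expected)^2 / expected
chi2 <- sum(contrib)

# Upper-tail probability on 1 degree of freedom (2 classes - 1)
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

print(data.frame(observed, expected, contrib))
cat("chi-squared =", chi2, " P =", p_value, "\n")
```

The total and P-value should agree with the output of `chisq.test(observed, p = c(0.75, 0.25))`, since no continuity correction is applied to a goodness-of-fit test.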
Add a second locus. Four categories. Watch the chi-squared approximation get shakier as n falls, because the rarest class expects only n/16 offspring.
Two heterozygous loci, segregating independently. The four phenotype categories are expected at 9:3:3:1 — round-yellow : round-green : wrinkled-yellow : wrinkled-green. Same chi-squared machinery as Stage A, three degrees of freedom instead of one.
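The 9:3:3:1 ratio follows from multiplying the two 3:1 ratios, which is what independent segregation means. A minimal sketch, with the vector names `shape` and `color` chosen here for illustration:

```r
# Marginal phenotype probabilities at each locus (the 3:1 monohybrid ratio)
shape <- c(round = 3, wrinkled = 1) / 4
color <- c(yellow = 3, green = 1) / 4

# Independent assortment: joint probability = product of the marginals
joint <- outer(shape, color)

# Expressed in sixteenths, the cells are 9, 3, 3 and 1
print(joint * 16)
```

Flattening `joint` row by row gives the probability vector `c(9, 3, 3, 1) / 16` in the order round-yellow, round-green, wrinkled-yellow, wrinkled-green.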
| Phenotype | Observed | Expected | (O−E)²/E |
|---|---|---|---|
| Round, yellow | — | — | — |
| Round, green | — | — | — |
| Wrinkled, yellow | — | — | — |
| Wrinkled, green | — | — | — |
| χ² (3 df) | — | P = — | |
    set.seed(42)
    n <- 80
    p <- c(9, 3, 3, 1) / 16  # 9:3:3:1 expected
    obs <- rmultinom(1, n, p)[, 1]
    chisq.test(obs, p = p)
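One way to see why small n is a problem is to tabulate the expected counts. The sketch below assumes the common rule of thumb that every expected count should be at least 5 (a convention, not something stated in this lab), and uses `chisq.test`'s `simulate.p.value` option as one possible fallback for small crosses. The sample sizes and class labels are illustrative.

```r
p <- c(9, 3, 3, 1) / 16
sizes <- c(16, 40, 80, 160)  # illustrative offspring counts

# Expected count in each class at each n; the last column is the rarest class
expected <- outer(sizes, p)
dimnames(expected) <- list(paste0("n=", sizes), c("RY", "Rg", "wY", "wg"))
print(expected)

# Assumed rule of thumb: flag any n with an expected count below 5
print(apply(expected, 1, min) < 5)

# For a small cross, simulate the null instead of using the chi-squared curve
set.seed(42)
obs_small <- rmultinom(1, 40, p)[, 1]
res <- chisq.test(obs_small, p = p, simulate.p.value = TRUE, B = 2000)
print(res$p.value)
```

At n = 80 the wrinkled-green class expects exactly 5 offspring, so the default in the code above sits right at the edge of that rule of thumb.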
Run many simulated 3:1 crosses. Plot the distribution of χ² statistics. The P-value from a single experiment is the fraction of this distribution at or beyond the observed χ².
Generate 1,000 simulated monohybrid crosses, each at n offspring. Compute χ² for each. Because every cross is generated under the 3:1 hypothesis, the resulting distribution is the null distribution of χ², approximately a χ² distribution with 1 df. Its upper 5% tail (χ² > 3.84) is the rejection region. Your one observed experiment is one draw from this distribution.
    set.seed(42)
    n <- 80
    reps <- 1000
    chi2 <- replicate(reps, {
      d <- rbinom(1, n, 0.75)
      o <- c(d, n - d); e <- c(0.75*n, 0.25*n)
      sum((o - e)^2 / e)
    })
    mean(chi2 > 3.84)  # empirical rejection rate
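Reading a P-value as a tail fraction can be sketched as follows. The observed cross of 55 dominant out of 80 is a made-up count used only for illustration, and the helper `chi2_31` is a hypothetical name, not part of the lab code.

```r
set.seed(42)
n <- 80
reps <- 1000

# Chi-squared against 3:1 for d dominant offspring out of n
chi2_31 <- function(d, n) {
  o <- c(d, n - d)
  e <- n * c(0.75, 0.25)
  sum((o - e)^2 / e)
}

# Null distribution: chi-squared values from crosses that really are 3:1
null_chi2 <- vapply(rbinom(reps, n, 0.75), chi2_31, numeric(1), n = n)

# Hypothetical observed cross: 55 dominant, 25 recessive
obs_chi2 <- chi2_31(55, n)

p_sim <- mean(null_chi2 >= obs_chi2)                      # simulated tail fraction
p_theory <- pchisq(obs_chi2, df = 1, lower.tail = FALSE)  # chi-squared curve
crit <- qchisq(0.95, df = 1)                              # about 3.84

cat("simulated P =", p_sim, " theoretical P =", p_theory, " critical value =", crit, "\n")
```

The two P-values will not match exactly, because offspring counts are discrete while the χ² curve is continuous, but they should be close at this sample size.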
Mendel's published pea ratios fit 3:1 unusually well. So well, in fact, that the combined χ² across his crosses is in the lower 1% tail. Either his lab tech was lying, or he tossed crosses that didn't look right, or the universe was unusually kind. You decide.
Mendel's paper (read in 1865, published in 1866) reports several monohybrid F2 ratios, including round/wrinkled seed and yellow/green seed. Load his counts. Compute χ² for each trait. Then combine across traits by summing the per-trait values. Where does Mendel's combined χ² sit in the distribution of combined χ² values from 1,000 simulated Mendel-style experimenters?
| Trait | n | Dominant | Recessive | χ² (1 df) | P |
|---|---|---|---|---|---|
    peas <- read.csv("data/clean/mendel_pea.csv")

    # Chi-squared against 3:1 for one trait, from its two phenotype counts
    chi2_31 <- function(dom, rec) {
      n <- dom + rec
      o <- c(dom, rec); e <- c(0.75*n, 0.25*n)
      sum((o - e)^2 / e)
    }

    # Per-trait chi-squared. mapply() is used instead of apply(peas, 1, ...),
    # which would coerce each row to character if the CSV has a text column.
    chi_obs <- mapply(chi2_31, peas$dominant, peas$recessive)
    total_obs <- sum(chi_obs)

    # Simulate 1000 honest experimenters using the same sample sizes
    set.seed(42)
    n_trait <- peas$dominant + peas$recessive
    null_totals <- replicate(1000, {
      d <- rbinom(length(n_trait), n_trait, 0.75)
      sum(mapply(chi2_31, d, n_trait - d))
    })
    mean(null_totals <= total_obs)  # Fisher's complaint
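The simulated lower-tail fraction can be cross-checked analytically: the sum of k independent 1-df χ² values is approximately χ² with k df. The sketch below does not use Mendel's counts. The sample sizes in `n_per_trait` and the total of 2.0 are hypothetical placeholders chosen only to show the comparison.

```r
set.seed(42)

# Hypothetical cross sizes, NOT Mendel's data
n_per_trait <- c(500, 800, 1000, 600, 900, 700, 1200)
k <- length(n_per_trait)

# Chi-squared against 3:1 for d dominant offspring out of n
chi2_31 <- function(d, n) {
  e <- n * c(0.75, 0.25)
  sum((c(d, n - d) - e)^2 / e)
}

# Combined chi-squared for one honest experimenter running all k crosses
one_total <- function() {
  d <- rbinom(k, n_per_trait, 0.75)
  sum(mapply(chi2_31, d, n_per_trait))
}
null_totals <- replicate(2000, one_total())

# Hypothetical "too good" combined chi-squared
total_obs <- 2.0
sim_lower <- mean(null_totals <= total_obs)   # simulated lower tail
theory_lower <- pchisq(total_obs, df = k)     # chi-squared with k df

cat("simulated =", sim_lower, " theoretical =", theory_lower, "\n")
```

Note that the test here is on the lower tail: a very small combined χ² is the suspicious outcome, which is the reverse of the usual rejection region.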