Lesson 8 — Counting the ratios that breed true

BIO 202, Spring 2026, draft v1. Mendel's peas. Monohybrid 3:1, dihybrid 9:3:3:1, and what to do when the chi-squared on real data fits the expected ratio too well.

What you'll do

Four stages. Run Mendelian crosses in simulation. Predict each ratio before you sample. End on Mendel's own data — and the chi-squared score that has been bothering geneticists since R. A. Fisher noticed it in 1936.

Did Mendel have any idea about genes? Did he even know what part of the cell was inheritance? When he came up with dominant and recessive — these words you all are learning now in his rules — what did he know about molecular biology? Nothing. Literally nothing. He probably knows more than Mendel did about genetics. Now, Mendel was very smart. He set up really careful experiments. — 202_lec01_06

A — One gene, two alleles, a 3:1 ratio

Simulate Mendel's monohybrid cross. Slide the offspring count. Watch how close the observed ratio is to 3:1.

Scenario

Two heterozygous pea plants (Aa × Aa). Each parent contributes one allele at random: Mendel's law of segregation. Three quarters of the offspring should show the dominant phenotype (AA or Aa); one quarter should show the recessive phenotype (aa). The expected ratio is 3:1. The observed ratio is whatever you sample.
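The 3:1 expectation can be checked by enumerating the Punnett square directly. A minimal sketch, assuming only the Aa × Aa setup described above; the variable names (`gametes`, `square`) are illustrative and not part of the lesson's own code.

```r
# Gametes from each Aa parent: A or a, equally likely (law of segregation)
gametes <- c("A", "a")

# All four equally likely egg x pollen combinations (the Punnett square)
square <- expand.grid(egg = gametes, pollen = gametes, stringsAsFactors = FALSE)

# Write heterozygotes as "Aa" regardless of which parent gave which allele
square$genotype <- ifelse(square$egg == square$pollen,
                          paste0(square$egg, square$pollen),
                          "Aa")

# The phenotype is recessive only when both alleles are a
square$phenotype <- ifelse(square$genotype == "aa", "recessive", "dominant")

genotype_counts  <- table(square$genotype)    # one AA, two Aa, one aa
phenotype_counts <- table(square$phenotype)   # three dominant, one recessive
print(genotype_counts)
print(phenotype_counts)
```

The 1:2:1 genotype split collapses to 3:1 at the phenotype level because AA and Aa look the same.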

Observed counts vs expected 3:1

Phenotype  |  Observed  |  Expected (3:1)  |  (O−E)²/E
Dominant   |            |                  |
Recessive  |            |                  |
χ² (1 df):  |  P:

Prediction

Q1. With n = 40 offspring, the χ² test against 3:1 will:

Try at least 5 (n, seed) combinations to unlock Stage B.

Controls

Offspring (n): 40
Seed: 42

R code — monohybrid cross

set.seed(42)
n <- 40
# Each offspring is dominant with prob 3/4 (Aa × Aa)
offspring <- rbinom(1, n, 0.75)  # count dominant phenotype
obs <- c(dominant = offspring, recessive = n - offspring)
exp <- c(dominant = n * 0.75, recessive = n * 0.25)
chisq.test(obs, p = c(0.75, 0.25))
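To see where each cell of the table comes from, the same statistic can be built term by term. This is a sketch on hypothetical counts (27 dominant, 13 recessive), chosen only because the arithmetic is clean; they are not data from the lesson.

```r
# Hypothetical observed counts for n = 40 (illustrative, not real data)
obs <- c(dominant = 27, recessive = 13)
n   <- sum(obs)

# Expected counts under 3:1
expected <- n * c(dominant = 0.75, recessive = 0.25)   # 30 and 10

# One (O - E)^2 / E term per phenotype, as in the table's last column
terms <- (obs - expected)^2 / expected                 # 0.3 and 0.9
chi2  <- sum(terms)                                    # 1.2

# Upper-tail P-value on 1 degree of freedom (2 categories minus 1)
p_value <- pchisq(chi2, df = 1, lower.tail = FALSE)

print(data.frame(observed = obs, expected = expected, term = terms))
cat("chi-squared =", chi2, " P =", round(p_value, 3), "\n")

# Cross-check against the built-in goodness-of-fit test
builtin <- chisq.test(obs, p = c(0.75, 0.25))
```

The recessive class contributes three times as much to χ² as the dominant class for the same absolute deviation, because its expected count is three times smaller.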

B — Two genes, independent assortment, a 9:3:3:1

Add a second locus. Four categories. Watch the chi-squared approximation get shakier as n falls and the rarest class's expected count (n/16) drops below about 5.

Scenario

Two heterozygous loci, segregating independently. The four phenotype categories are expected at 9:3:3:1 — round-yellow : round-green : wrinkled-yellow : wrinkled-green. Same chi-squared machinery as Stage A, three degrees of freedom instead of one.
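The 9:3:3:1 ratio is just two independent 3:1 ratios multiplied together. A short sketch of that step, assuming round and yellow are the dominant phenotypes as in the category list above; `shape`, `color`, and `joint` are illustrative names.

```r
# Phenotype probabilities at each locus (each is a 3:1 monohybrid cross)
shape <- c(round = 3/4, wrinkled = 1/4)
color <- c(yellow = 3/4, green = 1/4)

# Independent assortment: joint probability = product of single-locus ones
joint <- outer(shape, color)      # 2 x 2 matrix, rows = shape, cols = color

# Express the joint probabilities as parts out of 16
print(joint * 16)

# Flatten in the order round-yellow, round-green, wrinkled-yellow, wrinkled-green
p <- c(joint["round", "yellow"], joint["round", "green"],
       joint["wrinkled", "yellow"], joint["wrinkled", "green"])
print(p * 16)
```

Four categories with probabilities that must sum to 1 leave three free quantities, which is where the three degrees of freedom come from.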

Observed counts vs expected 9:3:3:1

Phenotype         |  Observed  |  Expected  |  (O−E)²/E
Round, yellow     |            |            |
Round, green      |            |            |
Wrinkled, yellow  |            |            |
Wrinkled, green   |            |            |
χ² (3 df):  |  P:

Prediction

Q1. With n = 50 offspring across four categories, how often will the χ² test reject 9:3:3:1 if the cross really is dihybrid?

Try at least 5 (n, seed) combinations to unlock Stage C.

Controls

Offspring (n): 80
Seed: 42

R code — dihybrid cross

set.seed(42)
n <- 80
p <- c(9, 3, 3, 1) / 16          # 9:3:3:1 expected
obs <- rmultinom(1, n, p)[,1]
chisq.test(obs, p = p)
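One way to approach the prediction question is to repeat the dihybrid cross many times at n = 50 and count rejections. This is a sketch, not the lesson's own simulation: the replicate count (2000) and the seed are arbitrary assumptions, and the statistic is computed by hand so that no approximation warnings are printed.

```r
set.seed(42)                      # arbitrary seed for reproducibility
p    <- c(9, 3, 3, 1) / 16
n    <- 50
reps <- 2000                      # assumed replicate count for this sketch

# Expected count in each class; the rarest class is n/16
expected <- n * p
print(expected)                   # rarest class: 50/16 = 3.125

# Critical value for alpha = 0.05 on 3 degrees of freedom
crit <- qchisq(0.95, df = 3)

# Simulate true dihybrid crosses and compute chi-squared for each
chi2 <- replicate(reps, {
  obs <- rmultinom(1, n, p)[, 1]
  sum((obs - expected)^2 / expected)
})

reject_rate <- mean(chi2 > crit)  # fraction of true 9:3:3:1 crosses rejected
print(reject_rate)
```

Changing `n` and rerunning shows how far the rate drifts from the nominal 5% when the rarest expected count gets small.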

C — When the chi-squared rejects, and when it shouldn't

Run many simulated 3:1 crosses. Plot the distribution of χ² statistics. The P-value from a single experiment is the fraction of this distribution at or above that experiment's χ².

Scenario

Generate 1,000 simulated monohybrid crosses, each with n offspring. Compute χ² for each. Their distribution approximates the χ² null distribution on 1 df. The upper 5% tail is the rejection region. If your cross really is 3:1, your one observed experiment is one draw from this distribution.

The null hypothesis is set up to be wrong. It is an incorrect hypothesis. The question is, how incorrect is it? Which means that for any null-hypothesis test, if your sample size is sufficiently large, you will reject the null — because the null is set up to be wrong. — 145_lec01_07
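The point in the quote can be made concrete with a cross whose true dominant fraction is slightly off 3:1. A minimal sketch, assuming a hypothetical true fraction of 0.74; `reject_prob` is an illustrative helper that sums exact binomial probabilities rather than simulating.

```r
# Exact probability that the 3:1 chi-squared test rejects at alpha = 0.05
# when the true dominant fraction is p_true and there are n offspring
reject_prob <- function(n, p_true) {
  d    <- 0:n                               # every possible dominant count
  e    <- c(0.75, 0.25) * n                 # expected counts under 3:1
  chi2 <- (d - e[1])^2 / e[1] + ((n - d) - e[2])^2 / e[2]
  sum(dbinom(d, n, p_true)[chi2 > qchisq(0.95, df = 1)])
}

n_values <- c(100, 1000, 10000, 100000)
p_true   <- 0.74                            # hypothetical: slightly off 3:1

power <- sapply(n_values, reject_prob, p_true = p_true)
print(data.frame(n = n_values, reject_prob = round(power, 3)))
```

The deviation from 3:1 never changes, yet the chance of rejecting climbs toward 1 as n grows.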

χ² null distribution under perfect 3:1

n / cross: 80  |  replicates: 1000  |  % χ² > 3.84: (theoretical: 5.0%)

Prediction

Q1. Across 1,000 simulated 3:1 crosses, the fraction with χ² > 3.84 (the 0.05 critical value) will be:

Try at least 3 different n values to unlock Stage D.

Controls

Offspring per cross (n): 80
Seed: 42

R code — χ² null distribution

set.seed(42)
n <- 80
reps <- 1000
chi2 <- replicate(reps, {
  d <- rbinom(1, n, 0.75)
  o <- c(d, n - d); e <- c(0.75*n, 0.25*n)
  sum((o - e)^2 / e)
})
mean(chi2 > 3.84)            # empirical rejection rate
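The simulated rejection rate differs from 5% partly through Monte Carlo noise and partly because the dominant count is a whole number, so χ² can only take certain values. A sketch that removes the noise by summing binomial probabilities exactly; `exact_rate` and the chosen n values are illustrative, not part of the lesson's code.

```r
crit <- qchisq(0.95, df = 1)     # about 3.8415, the source of the 3.84 cutoff

# Exact chance that a true 3:1 cross of size n gives chi-squared above crit
exact_rate <- function(n) {
  d    <- 0:n                    # every possible dominant count
  chi2 <- (d - 0.75 * n)^2 / (0.75 * n) +
          ((n - d) - 0.25 * n)^2 / (0.25 * n)
  sum(dbinom(d, n, 0.75)[chi2 > crit])
}

n_values <- c(20, 40, 80, 400)
rates <- sapply(n_values, exact_rate)
print(data.frame(n = n_values, exact_reject_rate = round(rates, 4)))
```

Whatever gap remains between these exact rates and 5% is due to discreteness alone, which matters most at small n.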

D — Mendel's actual data — and Fisher's complaint

Mendel's published pea ratios fit 3:1 unusually well. So well, in fact, that Fisher's 1936 analysis put the combined χ² across Mendel's experiments in the lower 1% tail of what honest sampling should produce. Either his lab tech was lying, or he tossed crosses that didn't look right, or the universe was unusually kind. You decide.

Scenario

Mendel's paper (read in 1865, published in 1866) reports several monohybrid F2 ratios: round/wrinkled seed, yellow/green seed, and so on. Load his counts. Compute χ² for each trait. Then sum across traits. Where does Mendel's combined χ² sit among the combined χ² totals of 1,000 simulated honest experimenters running the same crosses?
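For a version that runs without the course data file, the per-trait and combined calculation can be sketched on the commonly cited F2 counts for Mendel's seven traits. Two assumptions to flag: the counts are typed in here rather than read from `data/clean/mendel_pea.csv`, whose rows and layout may differ, and the combined statistic is referred to a theoretical χ² distribution instead of simulated experimenters.

```r
# Commonly cited F2 counts from Mendel's paper (dominant, recessive).
# Assumption: the course CSV may contain different rows or column names.
mendel <- data.frame(
  trait     = c("seed shape", "seed color", "flower color", "pod shape",
                "pod color", "flower position", "stem length"),
  dominant  = c(5474, 6022, 705, 882, 428, 651, 787),
  recessive = c(1850, 2001, 224, 299, 152, 207, 277)
)
mendel$n <- mendel$dominant + mendel$recessive

# Per-trait chi-squared against 3:1, 1 df each
e_dom <- 0.75 * mendel$n
e_rec <- 0.25 * mendel$n
mendel$chi2 <- (mendel$dominant - e_dom)^2 / e_dom +
               (mendel$recessive - e_rec)^2 / e_rec
mendel$p <- pchisq(mendel$chi2, df = 1, lower.tail = FALSE)

# Independent 1-df statistics add, so the total is referred to 7 df
total_chi2 <- sum(mendel$chi2)
lower_tail <- pchisq(total_chi2, df = nrow(mendel))   # P(chi2 <= total)

print(mendel)
cat("combined chi-squared =", round(total_chi2, 3),
    " lower-tail P =", round(lower_tail, 3), "\n")
```

Fisher's analysis pooled more of Mendel's experiments than these seven F2 ratios, so the lower-tail probability printed here need not match the figure he reported.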

Mendel's combined χ² vs simulated honest experimenters

Mendel's combined χ²:  |  P(χ² ≤ Mendel) under honest 3:1:

Per-trait breakdown

Trait  |  n  |  Dominant  |  Recessive  |  χ² (1 df)  |  P

Prediction

Q1. Combined across his published F2 ratios, Mendel's χ² sits where in the distribution of honest experimenters' χ² totals?

Run the simulated experimenter test at least 2 times to wrap up.

Controls

Seed: 42

R code — Mendel vs honest experimenters

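peas <- read.csv("data/clean/mendel_pea.csv")
# Chi-squared against 3:1 for dominant count d out of n, vectorized over traits.
# (Row-wise apply() on a data frame coerces each row to character if the file
# has a text column such as the trait name, so work on the columns directly.)
chi2_31 <- function(d, n) {
  e_dom <- 0.75 * n; e_rec <- 0.25 * n
  (d - e_dom)^2 / e_dom + ((n - d) - e_rec)^2 / e_rec
}
# Per-trait chi-squared
n_trait <- peas$dominant + peas$recessive
chi_obs <- chi2_31(peas$dominant, n_trait)
total_obs <- sum(chi_obs)
# Simulate 1000 honest experimenters, each repeating every cross at Mendel's n
set.seed(42)
null_totals <- replicate(1000, {
  d <- rbinom(length(n_trait), n_trait, 0.75)
  sum(chi2_31(d, n_trait))
})
mean(null_totals <= total_obs)   # Fisher's complaint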