BIO 202, Spring 2026, draft v1. Mutation-selection balance. The reason bad alleles don't go to zero is that mutation keeps recreating them.
Mutation-only model. Mutation + selection equilibrium at q ≈ μ/(hs). Back-calculate s. Cystic fibrosis as the case where μ/(hs) breaks.
Without selection, a new mutation arises at rate μ per allele per generation. In a finite population it is usually lost to drift; only rarely does it drift all the way to fixation.
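The claim that a new neutral mutation is usually lost can be checked with a small simulation. The sketch below is an illustration, not part of the course code: each replicate starts with a single mutant copy (q = 1/(2N)), applies binomial Wright-Fisher sampling with no further mutation and no selection, and records whether the allele ends up lost or fixed. The helper name `neutral_fate`, the population size N = 50, and the replicate count are assumptions chosen to keep the run short. Standard neutral theory gives a fixation probability of 1/(2N), which is 0.01 here.

```r
# Fate of a single new neutral mutation under Wright-Fisher drift.
# neutral_fate, N = 50 and n_rep = 2000 are illustrative choices.
neutral_fate <- function(N) {
  count <- 1                      # one mutant copy among 2N alleles
  while (count > 0 && count < 2 * N) {
    count <- rbinom(1, 2 * N, count / (2 * N))
  }
  count == 2 * N                  # TRUE if fixed, FALSE if lost
}

set.seed(1)
N <- 50
n_rep <- 2000
fixed <- replicate(n_rep, neutral_fate(N))
fix_frac <- mean(fixed)
c(simulated = fix_frac, expected = 1 / (2 * N))
```

The simulated fraction should land close to the expected value, with some replicate-to-replicate noise; nearly all runs end in loss.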
Start with q = 0. Each generation, each non-mutant allele mutates to "a" with probability μ (the code also lets "a" mutate back at the same rate, to keep the step symmetric). Plot q over time. Without selection, this is a Wright-Fisher process plus a mutational input of μ per allele per generation.
set.seed(42)
mu <- 1e-4
N <- 1000
q <- 0; traj <- q
for (g in 1:2000) {
  q_mut <- (1-q)*mu + q*(1-mu)  # symmetric for simplicity
  q <- rbinom(1, 2*N, q_mut)/(2*N)
  traj <- c(traj, q)
}
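For comparison with the stochastic trajectory, the expected frequency under the same symmetric mutation step has a closed form. The recursion q' = q(1 − μ) + (1 − q)μ = μ + q(1 − 2μ) has fixed point 1/2, so starting from q = 0 the expectation is q_t = ½[1 − (1 − 2μ)^t]. Because the mutation step is linear in q, binomial sampling does not change this expectation. The sketch below checks the closed form against direct iteration; the names `q_iter`, `q_closed`, and `gen_idx` are mine.

```r
mu <- 1e-4
gens <- 2000

# Direct iteration of the expected frequency (no drift).
q_iter <- numeric(gens + 1)       # q_iter[1] is generation 0, where q = 0
for (g in 1:gens) {
  q_iter[g + 1] <- q_iter[g] * (1 - mu) + (1 - q_iter[g]) * mu
}

# Closed form: q_t = 0.5 * (1 - (1 - 2*mu)^t)
gen_idx <- 0:gens
q_closed <- 0.5 * (1 - (1 - 2 * mu)^gen_idx)

max(abs(q_iter - q_closed))       # should be near machine precision
q_closed[gens + 1]                # expected q after 2000 generations
```

With μ = 10⁻⁴ the expected frequency after 2000 generations is about 0.165. A single simulated run can sit far from that value, because drift is strong at N = 1000 over this many generations.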
Add selection: AA has fitness 1, Aa has fitness 1 − hs (h is the dominance coefficient), and aa has fitness 1 − s. The deleterious allele settles at q ≈ √(μ/s) when it is fully recessive (h = 0) and at q ≈ μ/(hs) when it is at least partially dominant (h > 0).
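q stabilizes when mutation influx (μ × frequency of A, ≈ μ when q is small) equals selection efflux. Heterozygotes occur at frequency 2pq but carry only one copy of a each, so for a partially dominant allele the efflux in allele-frequency terms is ≈ hs × pq ≈ hs·q; for a fully recessive allele it is ≈ s × q². For partially dominant (h > 0): q ≈ μ/(hs). For recessive (h = 0): q ≈ √(μ/s).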
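The two approximations can be written out as a short derivation. This is the standard textbook argument, sketched under the assumptions that q is small (so p ≈ 1 and mean fitness w̄ ≈ 1) and that back mutation is negligible. Counting alleles rather than genotypes, heterozygotes carry one copy of a each, which is why the leading selection term is hs·q rather than 2hs·q.

```latex
\begin{aligned}
\Delta q_{\text{mut}} &\approx \mu p \approx \mu \\
\Delta q_{\text{sel}} &= -\frac{pq\,\bigl[\,hs\,p + s(1-h)\,q\,\bigr]}{\bar w}
  \approx -hs\,q - s(1-h)\,q^{2} \\[4pt]
h > 0:&\quad \mu \approx hs\,\hat q
  \;\Rightarrow\; \hat q \approx \frac{\mu}{hs} \\
h = 0:&\quad \mu \approx s\,\hat q^{2}
  \;\Rightarrow\; \hat q \approx \sqrt{\frac{\mu}{s}}
\end{aligned}
```

For h > 0 the linear term dominates at small q, so the quadratic term is dropped; for h = 0 the linear term vanishes and only the quadratic term remains.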
mu <- 1e-5; s <- 0.1; h <- 0.5
q <- 0; gens <- 5000; traj <- numeric(gens+1)  # traj[1] is the starting q = 0
for (g in 1:gens) {
  p <- 1-q; w_AA <- 1; w_Aa <- 1-h*s; w_aa <- 1-s
  wbar <- p^2*w_AA + 2*p*q*w_Aa + q^2*w_aa      # mean fitness
  q <- (p*q*w_Aa + q^2*w_aa) / wbar             # selection
  q <- q*(1-mu) + (1-q)*mu                      # symmetric mutation
  traj[g+1] <- q
}
tail(traj, 1)
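To see how well the two approximations track the deterministic recursion, it helps to wrap the loop in a function and run it for both dominance cases. This is a sketch, not the course's reference code: the helper name `msb_equilibrium` and the 20,000-generation run length are assumptions, and the recursion is the same selection-then-symmetric-mutation step used above.

```r
# Iterate selection followed by symmetric mutation and return the final q.
# msb_equilibrium is a hypothetical helper name, not part of the course code.
msb_equilibrium <- function(mu, s, h, gens = 20000) {
  q <- 0
  for (g in seq_len(gens)) {
    p <- 1 - q
    w_Aa <- 1 - h * s
    w_aa <- 1 - s
    wbar <- p^2 + 2 * p * q * w_Aa + q^2 * w_aa
    q <- (p * q * w_Aa + q^2 * w_aa) / wbar   # selection
    q <- q * (1 - mu) + (1 - q) * mu          # mutation
  }
  q
}

mu <- 1e-5
s <- 0.1
q_partial   <- msb_equilibrium(mu, s, h = 0.5)
q_recessive <- msb_equilibrium(mu, s, h = 0)

rbind(
  partial   = c(iterated = q_partial,   approx = mu / (0.5 * s)),
  recessive = c(iterated = q_recessive, approx = sqrt(mu / s))
)
```

The recessive case approaches equilibrium more slowly than the partially dominant one, because selection only acts on aa homozygotes, which are rare when q is small. That is why the sketch uses a generous run length.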
If you can measure q (population frequency of the allele), μ (per-generation rate), and h (dominance), you can solve for s: s ≈ μ/(h·q). The pink-katydid back-calculation.
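Pink katydid example. Population frequency q ≈ 5 × 10⁻⁴. Mutation rate μ ≈ 5 × 10⁻⁵. Dominance h ≈ 1 (visible heterozygote). Solve: s ≈ μ/(h·q) = (5 × 10⁻⁵)/(1 × 5 × 10⁻⁴) = 0.1. The selection coefficient is 10%, a large effect: with h ≈ 1, each carrier has about 10% lower fitness than a non-carrier.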
mu <- 5e-5; h <- 1; q <- 5e-4
# Mutation-selection balance: q = mu/(h*s) for h > 0
s_inferred <- mu / (h * q)
s_inferred
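One way to sanity-check the inferred value is to feed it back into the deterministic recursion and confirm that the equilibrium lands near the observed q. The sketch below does that under the same assumptions as the back-calculation (h = 1, no drift). The loop mirrors the selection-plus-mutation recursion earlier in these notes, and the variable names `s_hat`, `q_sim`, and `q_obs` are mine.

```r
mu <- 5e-5; h <- 1; q_obs <- 5e-4
s_hat <- mu / (h * q_obs)          # back-calculated selection coefficient

# Run the recursion forward from q = 0 with the inferred s.
q_sim <- 0
for (g in 1:5000) {
  p <- 1 - q_sim
  w_Aa <- 1 - h * s_hat
  w_aa <- 1 - s_hat
  wbar <- p^2 + 2 * p * q_sim * w_Aa + q_sim^2 * w_aa
  q_sim <- (p * q_sim * w_Aa + q_sim^2 * w_aa) / wbar   # selection
  q_sim <- q_sim * (1 - mu) + (1 - q_sim) * mu          # mutation
}

c(s_hat = s_hat, q_sim = q_sim, q_obs = q_obs)
```

If the simulated equilibrium had come out far from the observed frequency, that would suggest the approximation q ≈ μ/(hs) is being used outside the range where it holds.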
CF allele frequency in Northern European populations is ~0.022. If h = 0 and s = 1 (recessive lethal), μ/(hs) is undefined (division by zero), so the recessive formula applies: q ≈ √(μ/s). With μ ≈ 10⁻⁶ and s ≈ 1, q ≈ 10⁻³. The observed frequency is 22× higher, so something beyond simple mutation-selection balance must be at work.
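Running the recessive formula in both directions makes the size of the gap concrete. The sketch below computes the predicted q from the assumed μ, then inverts q ≈ √(μ/s) to ask what mutation rate would be needed to reach the observed frequency. The values are the ones quoted above; the variable names are mine.

```r
q_obs <- 0.022
mu <- 1e-6
s <- 1                          # recessive lethal

q_pred <- sqrt(mu / s)          # predicted equilibrium frequency
fold_gap <- q_obs / q_pred      # how far observed exceeds predicted

mu_needed <- s * q_obs^2        # invert q = sqrt(mu / s)
c(q_pred = q_pred, fold_gap = fold_gap,
  mu_needed = mu_needed, mu_ratio = mu_needed / mu)
```

Because q scales with the square root of μ, a 22-fold excess in q would require a mutation rate about 484 times higher (roughly 4.8 × 10⁻⁴) under these assumptions.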
Test three models against q_observed = 0.022: (1) recessive-lethal mutation-selection balance q = √(μ/s); (2) partially dominant model q = μ/(hs); (3) heterozygote advantage with AA fitness 1 − s_AA, Aa fitness 1, aa fitness 1 − s_aa. Only the third comes close to the observed q, and it does so with a modest s_AA of about 0.02.
The basic mutation-selection balance model assumes two causal arrows: μ → q and s_aa → q. It predicts q ≈ 0.001. The observed q is 22× higher. Which additional arrows in the causal model could be doing the work? Check every plausible one.
# Recessive lethal balance
mu <- 1e-6; s_aa <- 1; q_rec <- sqrt(mu/s_aa)
# Partial dominance balance
h <- 0.1; q_pd <- mu/(h*s_aa)
# Heterozygote advantage equilibrium: q_eq = s_AA / (s_AA + s_aa)
s_AA <- 0.02; q_ha <- s_AA/(s_AA + s_aa)
c(rec = q_rec, partial = q_pd, hetadv = q_ha, observed = 0.022)
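The heterozygote-advantage formula can be checked by iterating the selection recursion directly, and inverted to ask what s_AA would match the observed frequency exactly. This is a sketch under the fitness scheme stated above (AA: 1 − s_AA, Aa: 1, aa: 1 − s_aa) with mutation ignored. The starting frequency of 0.5, the run length, and the helper name `het_adv_equilibrium` are assumptions for illustration.

```r
# Iterate selection under heterozygote advantage and return the final q.
# het_adv_equilibrium is a hypothetical helper name.
het_adv_equilibrium <- function(s_AA, s_aa, q0 = 0.5, gens = 5000) {
  q <- q0
  for (g in seq_len(gens)) {
    p <- 1 - q
    w_AA <- 1 - s_AA
    w_aa <- 1 - s_aa
    wbar <- p^2 * w_AA + 2 * p * q + q^2 * w_aa   # Aa fitness is 1
    q <- (p * q + q^2 * w_aa) / wbar
  }
  q
}

s_aa <- 1
q_obs <- 0.022

# Check the equilibrium formula at s_AA = 0.02.
q_iter <- het_adv_equilibrium(s_AA = 0.02, s_aa = s_aa)
q_formula <- 0.02 / (0.02 + s_aa)

# Invert q_eq = s_AA / (s_AA + s_aa) for the s_AA that gives q_obs.
s_AA_needed <- q_obs * s_aa / (1 - q_obs)
q_check <- het_adv_equilibrium(s_AA = s_AA_needed, s_aa = s_aa)

c(q_iter = q_iter, q_formula = q_formula,
  s_AA_needed = s_AA_needed, q_check = q_check)
```

With s_aa = 1, matching q = 0.022 exactly takes s_AA ≈ 0.0225, that is, a fitness cost of roughly 2% for AA homozygotes relative to carriers. Whether a cost of that size is biologically plausible is a separate question the model does not answer.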