# The distribution of deleterious mutations at the mutation-selection balance

If we sample a random individual from an asexual population that has had a lot of time to adapt to its environment, how many deleterious mutations can we expect it to have? This distribution of deleterious mutations is the starting point of many population genetics models. In an earlier post we calculated this distribution - not only at the mutation-selection balance, but also on the way towards the balance from a mutation-free population.

Here, we calculate the distribution of deleterious mutations at the mutation-selection balance by following the derivation of John Haigh in his ultra-classic paper "The Accumulation of Deleterious Genes in a Population - Muller's Ratchet" (Haigh, 1978).

We focus on a finite asexual population undergoing selection and mutation. All mutations are deleterious, each with an identical and independent (multiplicative) effect on fitness. The number of mutations per individual per generation is Poisson distributed (see note 1). The master equation is:

$$p_k(t) = \frac{1}{T_1(t)} \sum_{j=0}^{k} X_{k-j}(t) (1-s)^{k-j} e^{-\lambda} \frac{\lambda^j}{j!}$$

$X_i(t)$ is a multinomial random variable representing the number of individuals with $i$ deleterious mutations at time $t$, $p_k(t)$ is the multinomial probability for $X_k(t+1)$ (or the expected value of $X_k(t+1)/\sum_{j\ge 0} X_j(t+1)$), $s$ is the selection coefficient (see note 2), $\lambda$ is the mutation rate (usually denoted on this blog by $\mu$ or $U$) in mutations per individual per generation, and $T_r(t)$ is the sum of the $r$-th powers of the fitness of all individuals in the population at time $t$:

$$T_r(t) = \sum_{i\ge 0} X_i(t) (1-s)^{ir}$$

So $\bar{\omega} = T_1(t) / \sum_{j\ge 0} X_j(t)$ is the population mean fitness.
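To make the master equation concrete, here is a minimal Python sketch of a single generation. It is an illustration rather than code from Haigh's paper: the function names, the truncation to a fixed number of mutation classes, the constant population size, and the example values are all assumptions made here.

```python
import math
import random


def next_generation_probs(X, s, lam):
    """Return [p_0(t), ..., p_K(t)] for class counts X = [X_0(t), ..., X_K(t)].

    Classes above K are truncated, so the result may sum to slightly less than 1.
    """
    # T_1(t): total fitness of the population
    T1 = sum(x * (1 - s) ** i for i, x in enumerate(X))
    probs = []
    for k in range(len(X)):
        total = 0.0
        for j in range(k + 1):
            # Probability of acquiring exactly j new mutations
            poisson_j = math.exp(-lam) * lam ** j / math.factorial(j)
            total += X[k - j] * (1 - s) ** (k - j) * poisson_j
        probs.append(total / T1)
    return probs


def sample_next_generation(X, s, lam, rng):
    """Draw X(t+1) as a multinomial sample of size N = sum(X)."""
    probs = next_generation_probs(X, s, lam)
    # Assumption of this sketch: lump the truncated tail into the last class
    probs[-1] += max(0.0, 1.0 - sum(probs))
    counts = [0] * len(probs)
    for k in rng.choices(range(len(probs)), weights=probs, k=sum(X)):
        counts[k] += 1
    return counts


# Hypothetical example: 1000 mutation-free individuals, 11 tracked classes
X0 = [1000] + [0] * 10
X1 = sample_next_generation(X0, s=0.03, lam=0.005, rng=random.Random(0))
```

Starting from a mutation-free population, selection has nothing to act on, so `next_generation_probs` reduces to the Poisson probabilities of acquiring $k$ new mutations.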

Now we are looking for the stable distribution of mutations in the population $n=(n_0, n_1, ...)$ for which $E[X(t+1) | X(t) = n] = n$. This is:

$$n_k = \frac{N}{T} \sum_{j=0}^{k} n_{k-j} (1-s)^{k-j} \frac{\lambda^j e^{-\lambda}}{j!}$$

where $N=\sum_{i \ge 0} n_i$ is the stable population size and $T/N = \sum_{i \ge 0} \frac{n_i}{N} (1-s)^i$ is the stable population mean fitness (usually denoted by $\bar{\omega}$).

Looking at $n_0$ and *assuming it is positive* (because this is the stable number of individuals without deleterious mutations), the sum has only one term:

$$n_0 = \frac{N}{T} n_0 (1-s)^0 \frac{\lambda^0 e^{-\lambda}}{0!} \Rightarrow T = N e^{-\lambda}$$

Since $T/N$ is the population mean fitness, this is another $\bar{\omega} = e^{-U}$ result (here with $U=\lambda$), as we have seen before.

Using this identity, we get

$$\begin{aligned}
n_k &= \frac{N}{N e^{-\lambda}} \sum_{j=0}^{k} n_{k-j} (1-s)^{k-j} \frac{\lambda^j e^{-\lambda}}{j!} \\
\Rightarrow n_k &= \sum_{j=0}^{k} n_{k-j} (1-s)^{k-j} \frac{\lambda^j}{j!}
\end{aligned}$$

We find the solution by induction: assume that $n_m = n_0 \frac{\lambda^m}{s^m m!}$ for all $m<k$ and check that the same form satisfies the equation for $k$ (the induction base is trivial, $n_0=n_0$):

$$\begin{aligned}
n_k &= \sum_{j=0}^{k} n_{k-j} (1-s)^{k-j} \frac{\lambda^j}{j!} \\
&= \sum_{j=0}^{k} n_0 \frac{\lambda^{k-j}}{s^{k-j} (k-j)!} (1-s)^{k-j} \frac{\lambda^j}{j!} \\
&= n_0 \lambda^k \sum_{j=0}^{k} \frac{(1-s)^{k-j}}{s^{k-j} (k-j)! \, j!} \\
&= n_0 \frac{\lambda^k}{s^k k!} \sum_{j=0}^{k} \frac{k!}{(k-j)! \, j!} s^j (1-s)^{k-j} \\
&= n_0 \frac{\lambda^k}{s^k k!} \sum_{j=0}^{k} {k \choose j} s^j (1-s)^{k-j} \\
&= n_0 \frac{\lambda^k}{s^k k!} \sum_{j=0}^{k} P(Bin(k,s)=j) \\
&= n_0 \frac{\lambda^k}{s^k k!}
\end{aligned}$$

where $P(Bin(k,s)=j)$ is the probability that a Binomial random variable with $k$ trials and success probability $s$ succeeds $j$ times and fails $k-j$ times. These probabilities sum to one over $j=0,\ldots,k$, which gives the last step.
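The algebra can also be checked numerically. The following sketch plugs $n_k = n_0 \frac{\lambda^k}{s^k k!}$ into the right-hand side of the stationarity equation and compares the two sides. The function names and the parameter values are hypothetical, chosen only for illustration.

```python
import math


def haigh_counts(n0, lam, s, K):
    """n_k = n_0 * (lam/s)^k / k! for k = 0..K."""
    theta = lam / s
    return [n0 * theta ** k / math.factorial(k) for k in range(K + 1)]


def recursion_rhs(n, lam, s, k):
    """Right-hand side of the equation: sum_j n_{k-j} (1-s)^(k-j) lam^j / j!."""
    return sum(
        n[k - j] * (1 - s) ** (k - j) * lam ** j / math.factorial(j)
        for j in range(k + 1)
    )


# Hypothetical parameter values, chosen only for illustration
lam, s, K = 0.5, 0.1, 30
n = haigh_counts(1.0, lam, s, K)
consistent = all(
    math.isclose(n[k], recursion_rhs(n, lam, s, k), rel_tol=1e-9)
    for k in range(K + 1)
)
print(consistent)
```

Each class only depends on classes with fewer mutations, so truncating at `K` does not affect the comparison.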

Haigh wrote this as:

$$n_k = n_0 \frac{\theta^k}{k!}, \; \theta=\lambda/s$$

From this we can find the frequency of the fittest class of individuals, $f_0 = n_0/N$, because we know that the population size is $\sum_{k \ge 0} n_k = N$:

$$\begin{aligned}
N &= \sum_{k \ge 0} n_k = \sum_{k \ge 0} n_0 \frac{\theta^k}{k!} = n_0 \sum_{k \ge 0} \frac{\theta^k}{k!} = n_0 e^{\theta} \\
\Rightarrow n_0 &= N e^{-\theta} \\
\Rightarrow f_0 &= e^{-\theta}
\end{aligned}$$
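Dividing $n_k$ by $N$ makes the shape of the whole distribution explicit. This follows directly from the two results above; the symbol $f_k$ for the frequency of the class with $k$ mutations is introduced here by analogy with $f_0$:

$$f_k = \frac{n_k}{N} = \frac{n_0}{N} \frac{\theta^k}{k!} = e^{-\theta} \frac{\theta^k}{k!}$$

That is, at the mutation-selection balance the number of deleterious mutations per individual is Poisson distributed with mean $\theta = \lambda/s$.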

For example, Trindade et al. (2010) studied a mutator strain of E. coli and found that it has a mutation rate of $\lambda=0.005$ and a selection coefficient of $s=0.03$. For these values, $\theta = \lambda/s \approx 0.167$ and the frequency of the fittest individuals is $f_0 = e^{-\theta}$, roughly 85%.
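The arithmetic is easy to reproduce. The sketch below combines $n_k = n_0 \theta^k / k!$ with $n_0 = N e^{-\theta}$ to get the class frequencies, and also checks that the implied mean fitness agrees with $e^{-\lambda}$. The function name and the cutoff `K` are assumptions of this sketch.

```python
import math


def load_frequencies(lam, s, K):
    """Frequencies exp(-theta) * theta^k / k! for k = 0..K, with theta = lam/s."""
    theta = lam / s
    return [math.exp(-theta) * theta ** k / math.factorial(k) for k in range(K + 1)]


lam, s = 0.005, 0.03  # values quoted in the text
f = load_frequencies(lam, s, K=20)
print(round(f[0], 3))  # 0.846

# Mean fitness implied by the distribution, to compare with exp(-lam)
mean_fitness = sum(fk * (1 - s) ** k for k, fk in enumerate(f))
print(math.isclose(mean_fitness, math.exp(-lam)))
```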

The main assumption made here is $n_0 > 0$ (the assumption made above when solving for $n_0$). Of course, in an infinite population this is always true, but in real populations the frequency of unloaded individuals $f_0$ can be smaller than $1/N$, in which case we can expect $n_0=0$ and the population will have a different distribution. According to Gessler (1995), the number of deleterious mutations per individual will then assume a shifted negative binomial distribution.
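As a rough rule of thumb, one can compare the expected number of unloaded individuals, $N e^{-\lambda/s}$, with a single individual. The sketch below does this; the function names, the threshold of exactly one individual, and the population sizes and second mutation rate are assumptions for illustration, not results from Haigh or Gessler.

```python
import math


def expected_unloaded(N, lam, s):
    """Expected number of mutation-free individuals, n_0 = N * exp(-lam/s)."""
    return N * math.exp(-lam / s)


def balance_assumption_holds(N, lam, s):
    """Rule of thumb: the derivation needs n_0 to be at least one individual."""
    return expected_unloaded(N, lam, s) >= 1.0


# Hypothetical population size; lam and s as in the E. coli example
print(balance_assumption_holds(1000, 0.005, 0.03))
# Hypothetical, much higher mutation rate
print(balance_assumption_holds(1000, 0.5, 0.03))
```

With the E. coli values the fittest class is expected to hold hundreds of individuals, while with the much higher mutation rate the expected $n_0$ falls far below one and the assumption fails.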

1. This is reasonable because the mutation rate per locus is very low, but there are many loci.

2. The effect of a single deleterious mutation on fitness: each mutation multiplies fitness by $(1-s)$.