Pop. Gen. III: Selection at one locus
\[ \def\mathbi#1{\textbf{\em #1}} \def\mbf#1{\mathbf{#1}} \def\mbb#1{\mathbb{#1}} \def\mcal#1{\mathcal{#1}} \newcommand{\bo}[1]{{\bf #1}} \newcommand{\tr}{{\mbox{\tiny \sf T}}} \newcommand{\bm}[1]{\mbox{\boldmath $#1$}} \newcommand{\norm}[1]{\left\lVert#1\right\rVert} \DeclareMathOperator{\E}{\mathbb{E}} \DeclareMathOperator{\Var}{\text{Var}} \]
The law of constancy of allele frequencies
then nothing happens!”
- Bengt-Olle Bengtson
Before discussing selection, it is worth taking a moment to notice a simple but deeply important truth in evolutionary theory, which we refer to as “the law of constancy of allele frequencies”. In short, if there is no force acting to change allele frequencies, then the expectation is that they will remain the same from one generation to the next. Formally, we write:
\[ \E [p_{t+1}] = p_t \]
Although this may seem self-evident, there are two implications worth pointing out:
- Genetics alone is not an evolutionary force.
- This constancy of allele frequencies is what is usually meant by “Hardy-Weinberg Equilibrium”.
As we discussed in our first lecture, “H-W equilibrium” can be a somewhat misleading term (as no equilibrium is actually calculated). The constancy of allele frequencies is really just a calculation of H-W proportions, which emerges implicitly from the concept of independent sampling of alleles and binomial probabilities.
So, if H-W proportions represent our expectation for allele frequencies under random mating, and we expect the frequencies to remain the same from one generation to the next in the absence of any evolutionary force… then what evolutionary processes can alter allele frequencies?
Four Evolutionary processes
In this course, we will discuss four main evolutionary processes that can alter allele frequencies over time. They include:
- Selection
- Mutation
- Migration
- Genetic drift
As we will outline below, there are crucial similarities and differences between these different processes. But before we can get into the details for each process, we need to make a brief digression into how geneticists formally describe changes in allele frequency.
Fundamentally, what we are interested in quantifying is the per-generation change in allele frequencies. This requires a mathematical expression which takes as input the current frequencies and gives as output the predicted frequencies in the future. There are several ways to do this, but in essence we have two choices.
- Differential equations are the tool of choice if we wish to describe allele frequency changes in ‘continuous time’ (i.e., where we treat time as an infinitely divisible continuous variable).
- Recursion equations are the appropriate tool if we are describing allele frequency changes in ‘discrete time’ (i.e., from one discrete generation to the next)
In this course, we will work exclusively with recursion equations, and therefore will be treating generations as discrete time steps. Below is a very brief primer on what recursion equations are, and how they work. Take a moment to read through, as we will be using recursion equations throughout this course to describe the four evolutionary processes.
Recursion equations are mathematical expressions describing the absolute amount of, or change in, some quantity at time \(t+1\) as a function of its current value at time \(t\). You can think of them as “lovely little calculating machines”. In brief, we iterate these equations over and over again to describe per-generation change in allele frequency. Recursion equations take the following basic form:
\[ \begin{aligned} x_{t + 1} &= f(x_t) \\ x^{\prime} &= f(x), \end{aligned} \]
where the variable \(x\) is the quantity we are interested in tracking over time, \(f(x_t)\) is some function of the current value of \(x\) at time \(t\), and the \(x_{t + 1}\) is the predicted value of \(x\) at the next time point. The second equation shows the exact same thing, but with slightly different notation; here we drop the \(t\) subscripts, and denote the current value of our quantity of interest as \(x\), and the predicted value in the next time-step as \(x^{\prime}\). For our purposes, where we are generally interested in expressing changes in allele frequency, we will use the conventional \(p\) or \(q\) variables to represent our allele frequencies of interest, and for the most part we will use the ‘prime’ notation as follows:
\[ p^{\prime} = f(p). \]
One last thing: As you can see, our recursion equations describe the absolute value of the allele frequency we are describing from one generation to the next. However, sometimes it is useful to calculate the ‘difference’ in allele frequency - that is, the per-generation change in allele frequency rather than the absolute value. We can easily calculate this per-generation difference using our recursion equation to give what we call a difference equation as follows:
\[ \begin{aligned} \Delta p &= p_{t + 1} - p_t \\ &= p^{\prime} - p, \end{aligned} \]
where the \(\Delta\) notation indicates that we are describing a difference in frequencies, and I have again shown both the \(t\) subscript and ‘prime’ notations, which are entirely equivalent.
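As a minimal sketch of the "calculating machine" idea, the Python snippet below iterates a recursion \(x^{\prime} = f(x)\) and then computes the per-generation differences. The helper name `iterate` and the function `toy_recursion` are illustrative assumptions and do not correspond to any model in these notes.

```python
def iterate(f, x0, generations):
    """Apply the recursion x' = f(x) repeatedly, returning [x_0, x_1, ..., x_T]."""
    trajectory = [x0]
    for _ in range(generations):
        trajectory.append(f(trajectory[-1]))
    return trajectory


def toy_recursion(x):
    # Arbitrary toy function chosen only for illustration (an assumption).
    return x + 0.1 * x * (1.0 - x)


traj = iterate(toy_recursion, x0=0.1, generations=50)

# Difference equation: delta[t] = x_{t+1} - x_t
delta = [nxt - cur for cur, nxt in zip(traj[:-1], traj[1:])]

print(traj[:3])
print(delta[:3])
```

The same `iterate` helper could be reused with any recursion of the form \(p^{\prime} = f(p)\).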
Now that we’re comfortable with what recursion equations are, let’s take a look at what these equations look like for each of our four evolutionary processes:
| \(\text{4 Evolutionary Processes}\) | ||
|---|---|---|
| \(\text{Selection}\) | \(p^{\prime} = f(p,s,h,\ldots)\) | \(f=\text{function of selection parameters}\) |
| \(\text{Mutation}\) | \(p^{\prime} = p(1 - \mu) + \mu(1 - p)\) | \(\mu = \text{mutation rate } (A_1 \rightleftharpoons A_2, \text{ same rate in each direction})\) |
| \(\text{Migration}\) | \(p_{local}^{\prime} = p_{local}(1 - m) + m p_{immig.}\) | \(\text{m = migration rate}\) |
| \(\text{Genetic Drift}\) | \(\E [p^{\prime}] = p\) | \(\text{but generally } p^{\prime} \neq p\) |
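The mutation and migration recursions in the table can be sketched in a few lines of Python. This is only an illustration: the function names and all parameter values below are assumptions chosen to make the changes visible.

```python
def mutation_step(p, mu):
    """p' = p(1 - mu) + mu(1 - p), as written in the table (same rate in both directions)."""
    return p * (1.0 - mu) + mu * (1.0 - p)


def migration_step(p_local, p_immig, m):
    """p_local' = p_local(1 - m) + m * p_immig."""
    return p_local * (1.0 - m) + m * p_immig


# Hypothetical parameter values, for illustration only.
p = 0.9
for _ in range(10):
    p = mutation_step(p, mu=1e-3)
print(p)  # moves slowly toward 1/2 under this symmetric form

p_local = 0.2
for _ in range(200):
    p_local = migration_step(p_local, p_immig=0.7, m=0.05)
print(p_local)  # approaches the immigrant frequency
```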
We will focus first on models of Selection at a single locus, but will also include multiple processes - in particular, a combination of selection and mutation.
Selection - a general model
To build a model of selection we first make several simplifying assumptions.
- We will model selection in a large (effectively infinite) population with random mating.
- We will focus on selection at a single locus (\(\mbb{A}\)) with two alleles (\(A_1\), \(A_2\)).
- Although many factors contribute to the fitness of a given genotype, we will first focus only on viability selection, i.e., the probability of survival to reproduction of individuals of each genotype.
- We will assign constant viability values to each genotype.
The order of events in the life-cycle proceeds as follows:
fertilization \(\rightarrow\) selection \(\rightarrow\) meiosis \(\rightarrow\) random mating and fertilization again.
Neutral genetic variation: Different genotypes with identical relative fitness values.
The following table summarizes the important steps during a single generation in the model:
| \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) | |
|---|---|---|---|
| \(\text{Frequency at birth}\) | \(p^2\) | \(2pq\) | \(q^2\) |
| \(\text{Viability}\) | \(w_{11}\) | \(w_{12}\) | \(w_{22}\) |
| \(\text{Frequency after selection}\) | \(p^2 \frac{w_{11}}{\overline{w}}\) | \(2pq \frac{w_{12}}{\overline{w}}\) | \(q^2 \frac{w_{22}}{\overline{w}}\) |
where \(\overline{w}\) is the population average viability, which is the sum of the numerators of the expressions for the genotypic frequencies after selection:
\[ \overline{w} = p^2 w_{11} + 2pq w_{12} + q^2 w_{22}. \]
We are about ready to write a recursion equation! But first notice that since we assume there are only two alternative alleles at the \(\mbb{A}\) locus, and we know that their frequencies must sum to 1 (i.e., \(p + q = 1\)), then we only need to track the frequency of one allele (we can always calculate the frequency of the other by subtracting from 1). So, we can write the recursion equation describing the frequency of the \(A_1\) allele in the next generation as1:
1 Recall that each heterozygote carries only one \(A_1\) allele, so drop the \(2\) in the \(2pq\)! The frequency of the \(A_2\) allele will be: \[ q^{\prime} = 1 - p^{\prime} = \frac{q^2 w_{22} + pq w_{12}}{\overline{w}} \]
\[ p^{\prime} = \frac{p^2 w_{11} + pq w_{12}}{\overline{w}} \]
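The steps in the table and the recursion for \(p^{\prime}\) can be sketched as a single Python function. The function name `selection_step` and the viability values are assumptions used only for illustration.

```python
def selection_step(p, w11, w12, w22):
    """One generation of viability selection.

    Returns (p_next, genotype frequencies after selection).
    """
    q = 1.0 - p
    # Population mean viability
    w_bar = p**2 * w11 + 2 * p * q * w12 + q**2 * w22
    after = (p**2 * w11 / w_bar, 2 * p * q * w12 / w_bar, q**2 * w22 / w_bar)
    # Each heterozygote carries one A1 allele, so it contributes half its frequency.
    p_next = after[0] + 0.5 * after[1]
    return p_next, after


# Hypothetical viabilities, for illustration only.
p_next, after = selection_step(0.3, w11=0.9, w12=0.8, w22=0.6)
print(p_next, sum(after))
```

Dividing by \(\overline{w}\) is what makes the three post-selection genotype frequencies sum to 1.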
As mentioned earlier, it is often convenient to quantify the per-generation change in allele frequency due to selection using a difference equation:
\[ \begin{aligned} \Delta p &= p^{\prime} - p = \frac{p^2 w_{11} + pq w_{12} - p \overline{w}}{\overline{w}} \\ &= \frac{p(p q w_{11} + q (1 - 2p)w_{12} - q^2 w_{22})}{\overline{w}} \\ &= \frac{pq\left[ p(\textcolor{DarkRed}{w_{11} - w_{12}}) + q(\textcolor{DarkRed}{w_{12} - w_{22}})\right]}{\overline{w}} \end{aligned} \]
This is probably the single most important equation in all of population genetics and evolution!
Notice that the key terms in the numerator involve differences between homozygote and heterozygote fitnesses!
- If \(w_{11} > w_{12}\), the first term is positive and pushes \(p\) upward; the opposite occurs if \(w_{11} < w_{12}\).
- Likewise, if \(w_{22} > w_{12}\), the second term is negative and pushes \(q\) upward; the opposite occurs if \(w_{22} < w_{12}\).
It all boils down to the relative fitness of heterozygotes! This offers deep insight into how evolution by natural selection works!
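As a quick numerical sanity check, the sketch below compares \(\Delta p\) computed directly as \(p^{\prime} - p\) with the factored form \(pq\left[p(w_{11} - w_{12}) + q(w_{12} - w_{22})\right]/\overline{w}\). The function names and fitness values are assumptions for illustration.

```python
def mean_fitness(p, w11, w12, w22):
    q = 1.0 - p
    return p**2 * w11 + 2 * p * q * w12 + q**2 * w22


def delta_p_direct(p, w11, w12, w22):
    """Delta p computed as p' - p from the recursion."""
    q = 1.0 - p
    p_next = (p**2 * w11 + p * q * w12) / mean_fitness(p, w11, w12, w22)
    return p_next - p


def delta_p_factored(p, w11, w12, w22):
    """Delta p from the factored form pq[p(w11 - w12) + q(w12 - w22)] / w_bar."""
    q = 1.0 - p
    bracket = p * (w11 - w12) + q * (w12 - w22)
    return p * q * bracket / mean_fitness(p, w11, w12, w22)


# Hypothetical fitness values, used only to compare the two expressions.
fitnesses = (1.0, 0.9, 0.7)
max_gap = max(
    abs(delta_p_direct(p, *fitnesses) - delta_p_factored(p, *fitnesses))
    for p in (0.05, 0.25, 0.5, 0.75, 0.95)
)
print(max_gap)  # tiny, limited only by floating-point rounding
```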
Relative fitness
Until now the \(w_{ij}\) terms were called ‘viability’, or the absolute Darwinian fitness. But notice that all terms in our selection equation have viability as a factor. If we divide both the numerator and the denominator (\(\overline{w}\)) of the expressions for the frequency after selection by any one of the viability terms, say \(w_{11}\), then each viability term in the selection equation becomes a ratio relative to \(w_{11}\). However, the actual value of \(\Delta p\) stays the same!
Using these ratios allows us to express our selection model in terms of relative fitness instead of absolute viabilities. In nature, the fitness of a given genotype has many components, including viability, fertility, mating success, etc. When using relative fitness, the dynamics of the selection equation stay the same, but our definition of fitness can be much broader. So let’s update our table to include relative fitness:
| \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) | |
|---|---|---|---|
| \(\text{Frequency at birth}\) | \(p^2\) | \(2pq\) | \(q^2\) |
| \(\text{Viability}\) | \(w_{11}\) | \(w_{12}\) | \(w_{22}\) |
| \(\text{Relative fitness}\) | \(w_{11} = \frac{w_{11}}{w_{11}} = 1\) | \(w_{12} = \frac{w_{12}}{w_{11}}\) | \(w_{22} = \frac{w_{22}}{w_{11}}\) |
| \(\text{Frequency after selection}\) | \(p^2 \frac{w_{11}}{\overline{w}}\) | \(2pq \frac{w_{12}}{\overline{w}}\) | \(q^2 \frac{w_{22}}{\overline{w}}\) |
Notice that we are now using the terms \(w_{ij}\) to express relative fitness instead of absolute viability. This is the convention in most models of selection, and one we will follow for the remainder of the course.
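The claim that rescaling by \(w_{11}\) leaves \(\Delta p\) unchanged can be checked numerically. The sketch below is illustrative only; the function name and the viability values are assumptions.

```python
def delta_p(p, w11, w12, w22):
    """General selection difference equation."""
    q = 1.0 - p
    w_bar = p**2 * w11 + 2 * p * q * w12 + q**2 * w22
    return p * q * (p * (w11 - w12) + q * (w12 - w22)) / w_bar


viabilities = (0.8, 0.6, 0.4)  # hypothetical absolute viabilities
relative = tuple(w / viabilities[0] for w in viabilities)  # scaled so that w11 = 1

dp_absolute = delta_p(0.3, *viabilities)
dp_relative = delta_p(0.3, *relative)
print(dp_absolute, dp_relative)  # the two values agree up to rounding
```

The common factor cancels because it appears in both the numerator and \(\overline{w}\).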
Selection and Dominance coefficients
This brings us to another common convention in population genetics: using what are called selection and dominance coefficients to express relative fitnesses:
| \(A_1A_1\) | \(A_1A_2\) | \(A_2A_2\) | |
|---|---|---|---|
| \(\text{Relative fitness}\) | \(w_{11} = \frac{w_{11}}{w_{11}} = 1\) | \(w_{12} = \frac{w_{12}}{w_{11}} = 1 - h s\) | \(w_{22} = \frac{w_{22}}{w_{11}} = 1 - s\) |
We have very specific interpretations for these two coefficients. Specifically:
- The selection coefficient, \(s\), describes the fitness of the \(A_2 A_2\) homozygote relative to the \(A_1 A_1\) homozygote.
- The dominance coefficient, \(h\), describes the fitness of the heterozygote relative to the selective difference between the two homozygotes.
Hopefully, you can see that the dominance coefficient plays a critical role when using these relative fitness expressions. Depending on the value of \(h\), we can identify several important selection scenarios:
| Dominance | Selection scenario |
|---|---|
| \(h = 0\) | \(A_1\) dominant, \(A_2\) recessive |
| \(h = 1\) | \(A_1\) recessive, \(A_2\) dominant |
| \(h = 1/2\) | Codominance |
| \(0 < h < 1\) | Incomplete dominance |
| \(h < 0\) | Overdominance (heterozygote advantage) |
| \(h > 1\) | Underdominance (heterozygote disadvantage) |
Although we will largely use the above expressions for relative fitness it is important to note that other fitness expressions are possible!
For example, we could write \(w_{11} = 1\), \(w_{12} = 1 + h s\), and \(w_{22} = 1 + s\) instead. In this case, the natural interpretation is that \(A_2\) is a beneficial allele rather than a deleterious one (and vice versa for \(A_1\)). The choice doesn’t really matter, so long as you are consistent. BUT, one reason we often use the first set of expressions is that \(s\) is bounded between \(0\) and \(1\) (i.e., \(s \in [0, 1]\)).
Now, if we substitute our relative fitness expressions, \(w_{11} = 1\), \(w_{12} = 1 - h s\), and \(w_{22} = 1 - s\) into the selection difference equation and simplify, we get:
\[ \Delta p = \frac{p q s \left( ph + q (1 - h)\right)}{\overline{w}} \]
where now \(\overline{w} = 1 - 2 p q h s - q^2 s\). This simplified version of our selection difference equation suggests a few important points about the expected dynamics:
- \(h\) determines where the allele frequency ends up.
- \(s\) determines how quickly it gets there.
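These two points can be explored with a small sketch of the \(h\), \(s\) form of the difference equation. All parameter values and helper names here are assumptions chosen for illustration.

```python
def delta_p_hs(p, s, h):
    """Delta p with w11 = 1, w12 = 1 - h*s, w22 = 1 - s."""
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    return p * q * s * (p * h + q * (1.0 - h)) / w_bar


def generations_to_reach(p0, target, s, h, max_generations=100_000):
    """Count generations until p first reaches the target frequency."""
    p, t = p0, 0
    while p < target and t < max_generations:
        p += delta_p_hs(p, s, h)
        t += 1
    return t


def run(p0, s, h, generations):
    """Iterate the difference equation and return the final frequency."""
    p = p0
    for _ in range(generations):
        p += delta_p_hs(p, s, h)
    return p


# s sets the pace: doubling s roughly halves the waiting time (h fixed at 1/2).
slow = generations_to_reach(0.01, 0.99, s=0.05, h=0.5)
fast = generations_to_reach(0.01, 0.99, s=0.10, h=0.5)
print(slow, fast)

# h sets the destination: same s, different end points.
end_h_half = run(0.01, s=0.05, h=0.5, generations=5000)
end_h_negative = run(0.01, s=0.05, h=-0.5, generations=5000)
print(end_h_half, end_h_negative)
```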
3 selection scenarios
Directional selection
Simply put, ‘directional selection’ involves selection favouring one allele over the other. In the long term, the ‘positively’ selected allele will deterministically increase in frequency until it reaches a frequency of \(p = 1\) and replaces the alternative allele. When this happens, we say that the positively selected allele has ‘gone to fixation’ in the population. But notice that ‘positive selection’ for one allele implies ‘negative selection’ for the other. Moreover, any decrease in frequency of the ‘negatively selected’ allele will mirror the increase in frequency of the ‘positively selected’ allele. They are two sides of the same coin, and it depends only on which allele’s frequency you choose to focus on.
Keeping this in mind, and based on what we learned from our selection equations, we can draw the following conclusions:
- Directional selection occurs when the following relative fitness relations hold: \(w_{11} > w_{12} > w_{22}\) OR \(w_{11} < w_{12} < w_{22}\).
- Directional/positive selection occurs with incomplete dominance (when \(0 < h < 1\))
- Not surprisingly, under directional selection, we expect that the frequency of the selectively favoured allele approaches 1. That is, if \(A_1\) is favored, \(p \rightarrow 1\).
- Equivalently, \(\Delta p > 0\) for \(0 < p < 1\).
- The reverse is true if \(A_2\) is favoured.
- Note also that under our standard relative fitness parameterization, \(s > 0\), and therefore the sign of \(\Delta p\) is determined by the expression \(p h + q(1 - h)\), which is always positive for incomplete dominance.
Let’s take a look at what the deterministic allele frequency trajectory looks like when we have positive selection for the \(A_1\) allele:
Above, we have plotted the allele frequency trajectories for \(300\) generations for a variety of different initial frequencies of \(A_1\), and with a selection coefficient of \(s = 0.05\) and dominance coefficient of \(h = 1/2\).
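Trajectories of this kind can be generated with a short sketch like the one below. The values \(s = 0.05\), \(h = 1/2\), and 300 generations come from the text; the particular initial frequencies and the function names are assumptions.

```python
def next_p(p, s, h):
    """Recursion with w11 = 1, w12 = 1 - h*s, w22 = 1 - s."""
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    return (p**2 + p * q * (1.0 - h * s)) / w_bar


def trajectory(p0, s, h, generations):
    """Return the list [p_0, p_1, ..., p_T]."""
    traj = [p0]
    for _ in range(generations):
        traj.append(next_p(traj[-1], s, h))
    return traj


initial_frequencies = [0.01, 0.1, 0.25, 0.5, 0.75]  # hypothetical starting points
trajectories = {
    p0: trajectory(p0, s=0.05, h=0.5, generations=300)
    for p0 in initial_frequencies
}

for p0, traj in trajectories.items():
    print(f"p0 = {p0:.2f} -> p after 300 generations = {traj[-1]:.4f}")
```

Each list in `trajectories` can be handed to any plotting library to draw \(p\) against generation number.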
We can also look at a plot of \(\Delta p\) as a function of \(p\):
This shows us that, as expected, for all frequencies \(0 < p < 1\), the per-generation change in \(p\) is always positive2.
2 Except, of course, for what are called the ‘boundary’ cases where \(p = 0\) and \(p = 1\). In these cases, \(\Delta p\) evaluates to 0. Can you explain why?
\(\Delta p\) is strongly dependent on the value of \(p\).
- Evolution by natural selection proceeds slowly when there is little genetic variation (i.e., when \(p\) is close to \(0\) or \(1\))!
- The maximum rate of change occurs at intermediate frequencies (close to \(p = 1/2\) for the parameters plotted here).
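The shape of \(\Delta p\) as a function of \(p\) can be tabulated on a grid. The sketch below uses \(s = 0.05\) and \(h = 1/2\) as in the text; the grid resolution and variable names are assumptions.

```python
def delta_p(p, s, h):
    """Delta p with w11 = 1, w12 = 1 - h*s, w22 = 1 - s."""
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    return p * q * s * (p * h + q * (1.0 - h)) / w_bar


grid = [i / 1000 for i in range(1001)]  # p = 0, 0.001, ..., 1
changes = [delta_p(p, s=0.05, h=0.5) for p in grid]

# Frequency at which the per-generation change is largest
p_at_max = grid[changes.index(max(changes))]
print(p_at_max, max(changes))
print(changes[0], changes[-1])  # the boundary values
```

With these parameters the maximum sits close to, but not exactly at, \(p = 1/2\), because \(\overline{w}\) in the denominator also changes with \(p\).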
Heterozygote Advantage (overdominance)
We now turn to a critically important form of selection which goes right to the heart of THE central question in evolutionary biology: How is genetic variation maintained?
Heterozygote advantage occurs when \(h < 0\) in our standard fitness expressions of \(w_{11} = 1\), \(w_{12} = 1 - h s\), and \(w_{22} = 1 - s\). This has several important implications for the behaviour of our 1-locus selection model:
- Unlike directional selection, under heterozygote advantage allele frequencies can approach an internal equilibrium3, \(p \rightarrow \hat{p}\).
- We can easily see that this internal equilibrium exists by examining a plot of \(\Delta p\) against \(p\).
- The equilibrium occurs where \(\Delta p = 0\)
3 An equilibrium represents a steady state for a recursion equation. Formally, an equilibrium is a solution to the equation \(\Delta p = 0\). For our selection equations, there are always two ‘trivial’ equilibria at the boundaries \(p = 0\) and \(p = 1\), because if the frequency of one allele is 0 then selection cannot act to change the frequency in the next generation. An internal equilibrium occurs when there is a 3rd solution to \(\Delta p = 0\) for intermediate allele frequencies (when \(0 < p < 1\)).
In the above plot of \(\Delta p\) against \(p\), we have set \(w_{11} = 1\) , \(w_{12} = 1.025\), and \(w_{22} = 1.005\). This is an example of heterozygote advantage, and the internal equilibrium, \(\hat{p}\), is indicated by the red dashed line. Take some time to think through this plot!
- If the frequency of \(A_1\) is below \(\hat{p}\), then \(\Delta p\) is positive, which means that \(p\) should increase in the next generation.
- But if the frequency of \(A_1\) is above \(\hat{p}\), then \(\Delta p\) is negative, which means that \(p\) should decrease in the next generation.
- This means that, in the long run, \(p\) should converge on \(\hat{p}\), provided that \(p \notin \{0,1\}\)!
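For readers who want the location of the internal equilibrium explicitly, it can be found by setting the bracketed term of the general selection difference equation to zero. This short worked step is an addition, derived only from the \(\Delta p\) expression given earlier:

\[ \begin{aligned} 0 &= \hat{p}(w_{11} - w_{12}) + (1 - \hat{p})(w_{12} - w_{22}) \\ \hat{p} &= \frac{w_{12} - w_{22}}{2 w_{12} - w_{11} - w_{22}} \end{aligned} \]

With \(w_{11} = 1\), \(w_{12} = 1.025\), and \(w_{22} = 1.005\), this gives \(\hat{p} = 0.02 / 0.045 \approx 0.444\). In terms of the standard coefficients, the same result reads \(\hat{p} = (1 - h)/(1 - 2h)\), which lies between \(0\) and \(1\) when \(h < 0\).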
We can visualize this convergence by plotting \(p\) against time as we did earlier for the same parameter values, where \(\hat{p}\) is again indicated by the red dashed line:
Under heterozygote advantage, both alleles can be maintained in the population in a stable equilibrium. This is a form of BALANCING SELECTION4. This is the only way to maintain genetic variation at a single locus by selection without also invoking one of the other three evolutionary processes.
4 Balancing selection is selection that actively maintains genetic variation. Formally, it is defined as selection for which there exists a stable internal equilibrium. MANY different kinds of selection models can give rise to balancing selection. In our simple model with one locus and only viability selection, however, heterozygote advantage is the only way to get balancing selection without invoking other forms of selection or other evolutionary processes.
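Convergence on the internal equilibrium can be checked by iterating the recursion from either side of it. The fitness values below are those used in the text; the starting frequencies, the number of generations, and the function name are assumptions.

```python
def next_p(p, w11, w12, w22):
    """General one-locus viability selection recursion."""
    q = 1.0 - p
    w_bar = p**2 * w11 + 2 * p * q * w12 + q**2 * w22
    return (p**2 * w11 + p * q * w12) / w_bar


w11, w12, w22 = 1.0, 1.025, 1.005

# Internal solution of Delta p = 0, i.e. p(w11 - w12) + q(w12 - w22) = 0
p_hat = (w12 - w22) / (2 * w12 - w11 - w22)

finals = {}
for p0 in (0.05, 0.95):  # hypothetical starting points below and above p_hat
    p = p0
    for _ in range(2000):
        p = next_p(p, w11, w12, w22)
    finals[p0] = p

print(p_hat, finals)
```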
Disruptive selection (underdominance)
The last form of selection we will discuss is disruptive selection. As you can probably imagine, this is a form of selection that occurs when the fitness of heterozygotes is lower than either homozygote.
- Underdominance occurs when \(h > 1\) in our standard one locus selection model.
- Here the outcome of selection depends on the initial frequency, \(p\).
- \(p\) will increase when close to \(1\), and decrease when close to \(0\).
You will meet this form of selection again in your Problem Sets, where you will also encounter a new kind of equilibrium!
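As a small preview, the dependence on the initial frequency can be seen by iterating the recursion with \(h > 1\) from two different starting points. The parameter values (\(s = 0.05\), \(h = 1.5\)) and starting frequencies are assumptions chosen only for illustration.

```python
def next_p(p, s, h):
    """Recursion with w11 = 1, w12 = 1 - h*s, w22 = 1 - s."""
    q = 1.0 - p
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    return (p**2 + p * q * (1.0 - h * s)) / w_bar


def final_frequency(p0, s, h, generations=3000):
    p = p0
    for _ in range(generations):
        p = next_p(p, s, h)
    return p


# Hypothetical underdominant fitnesses: w11 = 1, w12 = 0.925, w22 = 0.95
low_start = final_frequency(0.2, s=0.05, h=1.5)
high_start = final_frequency(0.3, s=0.05, h=1.5)
print(low_start, high_start)  # the two runs end at opposite boundaries
```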
Mutation-Selection Balance
We will finish our exploration of selection at a single locus by studying a particularly common and important form of balancing selection with major importance for human health: Mutation-Selection Balance. Not surprisingly, mutation-selection balance involves 2 of our evolutionary processes. Specifically, mutation generates new genetic variants (usually deleterious), and selection acts on those variants. Let’s start with a few important points:
- The vast majority of new mutations with reasonably large fitness effects are deleterious (decrease fitness), and partially recessive (\(h < 1/2\), usually pretty close to \(0\))
- These deleterious variants enter the population by mutation, and are continually removed by purifying selection5.
- With both evolutionary processes acting at the same time, eventually a balance is reached where the rate of introduction of deleterious variants by mutation is perfectly counterbalanced by their removal from the population by selection.
- The equilibrium number of deleterious mutations is a critical genetic feature of populations, affecting many evolutionary processes (e.g., evolution of sex, recombination, disease genetics).
5 Purifying selection is just another term for directional selection which is reserved for this situation of selection against deleterious mutations.
Now, let’s step through a simple model of mutation-selection balance at a single locus. We will do this in three steps:
STEP 1: Change due to mutation
Let’s assume that the \(A_1\) allele is the ‘wild-type’, and that it mutates to a deleterious variant, \(A_2\), at a rate \(\mu\):
\[ A_1 \xrightarrow{\mu} A_2 \]
We also assume that selection against \(A_2\) is sufficiently strong that it will always be rare, so we ignore back-mutation for now (i.e., \(A_1 \leftarrow A_2\) will be exceedingly rare). The per-generation change in \(p\) due to mutation will be:
\[ \Delta_{\mu} p = -\mu p + \mu (1 - p) \]
Single-locus mutation rates are usually quite small, on the order of \(10^{-5}\) or \(10^{-6}\). Thus, the frequency of \(A_2\) increases very slowly. If selection against \(A_2\) is strong, as we have assumed, it allows the following approximation (recall our table from earlier):
\[ \Delta_{\mu} p = -\mu p + \mu (1 - p) \approx -\mu \]
because \(p \approx 1\), and therefore \(q = 1 - p \approx 0\).
STEP 2: Change due to selection
From our general selection equation, introduced earlier, we can approximate the per-generation change in frequency of \(A_1\) when \(q \approx 0\) as follows:
\[ \Delta_{s} p = \left. \frac{p q s \left( ph + q (1 - h)\right)}{1 - 2 p q h s - q^2 s} \right|_{p \rightarrow 1} \approx q h s \]
Can you work through the logic in this approximation?
Think about it in the following terms: if \(q\) is small, then terms involving \(q^2\) will be so small as to be negligible. Similarly, if the denominator is of the order \(1 - q\), and \(q\) is small, the denominator can, with reasonable accuracy, be approximated by \(1\).
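The quality of this approximation can be checked numerically by comparing the full selection expression with \(q h s\) for progressively rarer \(A_2\). The values of \(s\) and \(h\) and the function name below are assumptions for illustration.

```python
def delta_s_exact(q, s, h):
    """Full expression for the change in p due to selection."""
    p = 1.0 - q
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    return p * q * s * (p * h + q * (1.0 - h)) / w_bar


s, h = 0.1, 0.3  # hypothetical selection and dominance coefficients
relative_errors = []
for q in (1e-2, 1e-3, 1e-4):
    exact = delta_s_exact(q, s, h)
    approx = q * h * s
    relative_errors.append(abs(exact - approx) / exact)
    print(f"q = {q:g}: exact = {exact:.3e}, approx = {approx:.3e}")

print(relative_errors)  # shrinks as q gets smaller
```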
STEP 3: Solve for equilibrium frequency of \(A_2\)
We have now defined two difference equations:
- \(\Delta_{\mu} p\) is the per-generation change in frequency of \(A_1\) due to mutation.
- \(\Delta_{s} p\) is the per-generation change in frequency of \(A_1\) due to selection.
The equation describing the overall per-generation difference in frequency is \[ \Delta p = \Delta_{\mu} p + \Delta_{s} p \]
To find the equilibrium frequency of the deleterious allele, we need to solve the following equation for \(q\)6: \[ \begin{aligned} 0 &= \Delta_{\mu} p + \Delta_{s} p \\ &\approx -\mu + q h s \end{aligned} \]
6 Notice that we’ve sneakily switched which allele we are tracking. Although we set the equations up as changes in \(p\), we are solving the difference equation in terms of \(q\), since it makes sense to focus on the overall frequency of the deleterious allele.
Solving for \(q\) gives the equilibrium frequency of our deleterious \(A_2\) allele: \[ \hat{q} = \frac{\mu}{h s} \]
Looking at the equation for \(\hat{q}\) offers some immediate insight into how the fitness effects of deleterious mutations influence their overall frequency in the population!
- Both \(h\) and \(s\) are in the denominator, meaning that the equilibrium frequency of the deleterious \(A_2\) allele will increase if selection is weak (i.e., \(s\) is small) or if the allele is strongly recessive (i.e., \(h\) is small).
However, at a certain point, the approximations we made to calculate \(\hat{q}\) will break down. Specifically, the above result works best when \(h\) is not too small. But what happens when \(A_2\) is completely recessive?
If \(A_2\) is completely recessive, we can no longer ignore terms involving \(p\) and \(q^2\) in our equations for \(\Delta_s p\). Substituting \(p = 1 - q\) and keeping terms up to order \(q^2\) gives: \[ \begin{aligned} \Delta_{\mu} p + \Delta_{s} p &\approx -\mu + p q s \left[ ph + q(1 - h) \right] \\ &\approx -\mu + \left. \left( q s h - 3 q^2 s h + q^2 s \right) \right|_{h = 0} \\ &= -\mu + q^2 s \end{aligned} \]
Now, setting this result equal to \(0\) and solving for \(q\) gives the equilibrium frequency for recessive deleterious mutations: \[ \hat{q} = \sqrt{\frac{\mu}{s}} \]
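One can quickly see that \(\hat{q}\) is much larger when deleterious mutations are completely recessive: because \(\mu / s \ll 1\), its square root is far larger than \(\mu / (h s)\) unless \(h\) is very small. For example, with \(\mu = 10^{-5}\) and \(s = 0.1\), \(\hat{q} = 0.01\) for a completely recessive allele but only \(2 \times 10^{-4}\) when \(h = 1/2\). This offers immediate insight into why deleterious mutations persist in populations despite strong selection against them, and also into why many common congenital diseases are recessive!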
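Both equilibrium predictions can be compared against the full recursion by iterating selection and mutation together. This is a sketch under stated assumptions: selection is applied before one-way mutation (\(A_1 \rightarrow A_2\)) within each generation, and the parameter values and function names are hypothetical.

```python
import math


def next_q(q, s, h, mu):
    """Viability selection followed by one-way mutation A1 -> A2 (assumed order)."""
    p = 1.0 - q
    w_bar = 1.0 - 2.0 * p * q * h * s - q**2 * s
    q_sel = (q**2 * (1.0 - s) + p * q * (1.0 - h * s)) / w_bar
    return q_sel + mu * (1.0 - q_sel)


def equilibrium_q(s, h, mu, generations=20_000):
    """Iterate from q = 0 until the frequency has effectively stopped changing."""
    q = 0.0
    for _ in range(generations):
        q = next_q(q, s, h, mu)
    return q


mu, s = 1e-5, 0.1  # hypothetical mutation rate and selection coefficient

q_partial = equilibrium_q(s, h=0.5, mu=mu)
q_recessive = equilibrium_q(s, h=0.0, mu=mu)

print(q_partial, mu / (0.5 * s))        # simulated vs. mu / (h s)
print(q_recessive, math.sqrt(mu / s))   # simulated vs. sqrt(mu / s)
```

The recessive case needs many more generations to settle, because selection against a rare recessive allele is very weak.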
Our examples of different kinds of selection may seem like disconnected cases. In fact, they are all connected. Sewall Wright (from Pop. Gen. II) provided a unifying principle when he showed that you can write our general selection difference equation \[ \Delta p = \frac{p q s \left( ph + q (1 - h)\right)}{\overline{w}} \]
in an alternate form: \[ \Delta p = \frac{p q}{2 \overline{w}} \frac{d \overline{w}}{d p} \]
That is, \(\Delta p\) is proportional to the derivative or slope of the mean fitness with respect to the allele frequency \(p\). If \(d \overline{w}/d p\) is positive, selection increases \(p\), and population mean fitness goes up. If \(d \overline{w}/d p\) is negative, \(p\) decreases, and population mean fitness goes up again. This alternative form of the difference equation due to selection offers two other important insights:
- \(\Delta p\) is proportional to the genetic variance, \(pq\).
- So is the rate of change of the mean fitness.
- Thus, selection increases population mean fitness at a rate that is proportional to the genetic variation!!!
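To see where the alternate form comes from, one can differentiate the mean fitness directly, treating \(q = 1 - p\) as a function of \(p\). This worked step is an addition and uses the general \(w_{ij}\) notation from earlier in these notes:

\[ \begin{aligned} \overline{w} &= p^2 w_{11} + 2p(1 - p) w_{12} + (1 - p)^2 w_{22} \\ \frac{d \overline{w}}{d p} &= 2p w_{11} + 2(1 - 2p) w_{12} - 2(1 - p) w_{22} \\ &= 2\left[ p(w_{11} - w_{12}) + q(w_{12} - w_{22}) \right] \end{aligned} \]

Multiplying by \(pq / (2\overline{w})\) recovers \(\Delta p = pq\left[ p(w_{11} - w_{12}) + q(w_{12} - w_{22}) \right] / \overline{w}\), which is the general selection difference equation.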