Quant. Gen. II: The basic model

R.A. Fisher

As we discussed last lecture, R.A. Fisher played a central role in unifying Mendelian inheritance with population genetics under a formal mathematical framework. For all his faults (and he had many), his scientific contributions were immense. Fisher was a brilliant mathematician, even at a young age. However, he suffered from very poor eyesight his whole life, and wore his trademark high-magnification coke-bottle glasses. Perhaps due to his poor eyesight, he formed the habit early in his life of doing mathematical derivations in his head, relying heavily on geometrical arguments, which were presumably easier to ‘see’ in his head. He was notorious for not writing out his derivations step by step, but then expecting his readers/students to figure out for themselves how he arrived at his answer!

Fisher was also a cantankerous character, which led to several conflicts with contemporaries during his career… most notably with Sewall Wright, with whom he had a bitter and long-lasting disagreement about the evolution of dominance. In 1919, he turned down a position at UCL offered by Karl Pearson, and instead went to the Rothamsted Experimental Station (an agricultural research station). During his time there, and the years after, he developed a tremendous amount of statistical theory. He invented ANOVA, generalized linear modeling for Gaussian error distributions, and developed/popularized Maximum Likelihood Methods for parameter estimation.

Figure 1: Ronald Aylmer Fisher (1890 – 1962) (Image from Adelaide Digital Library).
Insight

If you ever think of biology as being less quantitative than other ‘hard’ sciences, just remember that the foundations of modern statistics were originally developed to study evolution and inheritance!!! A geneticist developed the statistical methods that are taught in mathematics and science departments around the world!

What is phenotypic variance composed of?

Recall from the last lecture some of the consequences of relaxing the simplifying assumption that an individual’s phenotype is determined entirely by the genes they carry (i.e., of allowing environmental effects to influence the phenotype):

  • We can no longer definitively identify an individual’s genotype based on their phenotype.
    • Our focus therefore shifts from the study of genotypes to phenotypes.
  • This has an important implication: we now need to quantify variation in the phenotype, and ideally, to determine what part of that variation is due to the genes an individual carries, and what part is due to the environment.
Figure 2: Cartoon sketch of the decomposition of Phenotypic variance into genetic variance (\(V_G\)) and environmental variance (\(V_E\)).
Additivity of variances
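If the variables \(X_1, X_2, \ldots, X_n\) are independent, \[ \Var[X_1] + \Var[X_2] + \ldots + \Var[X_n] = \Var[X_1 + X_2 + \ldots + X_n] \]

so that, treating the phenotype as the sum of independent genetic and environmental contributions, \(V_P = V_G + V_E\).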
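As a quick numerical check of this additivity, here is a minimal sketch that simulates independent genetic and environmental deviations and compares the variance of their sum to the sum of their variances. The normal distributions, the standard deviations (2 and 1), and the sample size are all arbitrary assumptions chosen for illustration; they are not part of the lecture.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

n_individuals = 20_000

# Hypothetical, independently drawn genetic and environmental deviations.
genetic = [random.gauss(0.0, 2.0) for _ in range(n_individuals)]        # true V_G = 4
environmental = [random.gauss(0.0, 1.0) for _ in range(n_individuals)]  # true V_E = 1

# Each phenotype is the sum of the two independent contributions.
phenotype = [g + e for g, e in zip(genetic, environmental)]

v_g = statistics.pvariance(genetic)
v_e = statistics.pvariance(environmental)
v_p = statistics.pvariance(phenotype)

print(f"V_G + V_E = {v_g + v_e:.3f}")
print(f"V_P       = {v_p:.3f}")
```

The two printed values should agree closely but not exactly: in a finite sample the covariance between the two sets of deviations is small but not exactly zero. If the genetic and environmental deviations were correlated, the equality would fail.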

Example

Consider a common garden experiment with some number of clonal lines.

  1. Plant them in a common garden.

  2. Collect phenotypic data, calculate group-specific means and variances for each line.

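  3. The total variance in the measured phenotype, calculated across all \(N\) individuals pooled over lines, is an estimate of the total phenotypic variance: \[ s_{\text{total}}^2 = \frac{1}{N - 1} \sum_{j=1}^{N} (x_j - \overline{x})^2 = \hat{V}_P \]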

  4. The mean of the group-level variances is an estimator for the environmental variance: \[ \overline{s}_{\text{groups}}^2 = \hat{V}_E \]

  5. Now, think through the logic… remember that individuals in each group are genetically identical because they are planted from single lines, so any variation within a line must be environmental.

  6. We know that \(\hat{V}_P = \hat{V}_G + \hat{V}_E\), so we can rearrange to calculate \(\hat{V}_G = \hat{V}_P - \hat{V}_E\).

Figure 3: Sketched example of calculating \(\hat{V}_G\) from clonal lines.
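The steps of the common garden example can be sketched in a few lines of Python. The three clonal lines and their trait values are made-up numbers, and the variable names (`v_p_hat`, `v_e_hat`, `v_g_hat`) are just labels for this illustration. This is a simple plug-in calculation following the steps above, not a formal ANOVA.

```python
import statistics

# Hypothetical trait measurements for three clonal lines (made-up numbers).
lines = {
    "line_1": [10.0, 11.0, 12.0],
    "line_2": [14.0, 15.0, 16.0],
    "line_3": [18.0, 19.0, 20.0],
}

# Step 3: total phenotypic variance, pooling every measured individual.
all_values = [x for values in lines.values() for x in values]
v_p_hat = statistics.variance(all_values)

# Step 4: individuals within a line are genetically identical, so the
# mean of the within-line variances estimates the environmental variance.
within_line_variances = [statistics.variance(values) for values in lines.values()]
v_e_hat = statistics.fmean(within_line_variances)

# Step 6: rearrange V_P = V_G + V_E to obtain the genetic variance.
v_g_hat = v_p_hat - v_e_hat

print(f"V_P = {v_p_hat:.2f}, V_E = {v_e_hat:.2f}, V_G = {v_g_hat:.2f}")
# prints: V_P = 12.75, V_E = 1.00, V_G = 11.75
```

In this toy data set the lines differ a lot from each other but vary little internally, so most of the phenotypic variance is assigned to \(\hat{V}_G\).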
Heritability (broad sense)
If genetic and environmental effects are independent, \[ \frac{\hat{V}_G}{\hat{V}_P} = \hat{H}^2 \Rightarrow \text{Heritability, in the broad sense} \]

Heritability: The proportion of all phenotypic variation that is attributable to variation in genotypes.
Insight

Notice that \(\hat{H}^2\) depends on \(\hat{V}_P\), which in turn includes \(\hat{V}_E\).

Estimates of heritability depend on the environment in which they are measured!!! (Must be careful with interpretation!)
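To make this dependence concrete, here is a small sketch that holds the genetic variance fixed and changes only the environmental variance. The function name `broad_sense_heritability` and all numerical values are hypothetical, chosen only to illustrate the point.

```python
def broad_sense_heritability(v_g: float, v_e: float) -> float:
    """Return H^2 = V_G / V_P, assuming V_P = V_G + V_E."""
    v_p = v_g + v_e
    return v_g / v_p


v_g = 4.0  # hypothetical genetic variance, identical in both settings

# Same genotypes measured in a uniform vs. a heterogeneous environment.
broad_h2_uniform = broad_sense_heritability(v_g, v_e=1.0)
broad_h2_variable = broad_sense_heritability(v_g, v_e=12.0)

print(f"H^2 (uniform environment):  {broad_h2_uniform:.2f}")   # prints 0.80
print(f"H^2 (variable environment): {broad_h2_variable:.2f}")  # prints 0.25
```

Nothing about the genes has changed between the two cases; only \(V_E\) differs, yet the heritability estimate drops from 0.80 to 0.25.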

The ‘Basic Model’ ca. R.A. Fisher (1918)

Now, let’s return to our ‘Basic Model’ of quantitative variation due to R.A. Fisher.

Recall that we have:

  • Several loci, \(\mbf{A}\), \(\mbf{B}\), \(\mbf{C}\), … \(\mbf{n}\).
  • Each with alternative alleles: \(A^+\), \(A^-\), \(B^+\), \(B^-\), \(C^+\), \(C^-\), … that contribute additively to a given trait: one allele increases the trait’s value, the other decreases it.
  • Each locus has limited effect, and all contribute additively to the phenotypic value.
  • We will temporarily ignore the environmental effect (don’t worry, we’ll come back to this).

We can illustrate the contribution of a single locus to the phenotype as follows:

Notice that since we have many loci contributing to the overall phenotype, the dominance of, say, the ‘\(+\)’ allele won’t necessarily be the same at each locus. However, with this framework, we can decompose the genetic variation into different components by first considering what the expected phenotype would be with perfectly additive allelic effects at each locus.

Figure 4: Decomposition of \(V_G\) into additive and dominance effects.

We can see that \(V_G = V_A + \text{... something}\).

The average of all squared deviations from the grand mean is \(V_G\) (by definition). The average of the squared deviations from additivity (the blue regression line) gives the variance explained by genetic dominance, \(V_D\). In the simple example above, with one locus and three genotypes, \[ V_D = \frac{\Delta_1^2 + \Delta_2^2 + \Delta_3^2}{3}. \]
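The single-locus calculation can be sketched as follows. The genotypic values (0, 3, 4) are hypothetical. The three genotypes are weighted equally, as in the formula above; in a real population they would be weighted by their genotype frequencies. The additive expectation is taken to be the least-squares regression of genotypic value on the number of ‘\(+\)’ alleles.

```python
import statistics

# Number of '+' alleles carried by each genotype: A-A-, A+A-, A+A+.
plus_alleles = [0, 1, 2]
# Hypothetical genotypic values; the heterozygote (3.0) lies above the
# midpoint of the two homozygotes (2.0), so '+' is partially dominant.
genotypic_values = [0.0, 3.0, 4.0]

mean_x = statistics.fmean(plus_alleles)
mean_y = statistics.fmean(genotypic_values)

# Least-squares regression of genotypic value on allele count:
# this line is the expectation under purely additive allelic effects.
sxx = sum((x - mean_x) ** 2 for x in plus_alleles)
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(plus_alleles, genotypic_values))
slope = sxy / sxx
intercept = mean_y - slope * mean_x
additive_fit = [intercept + slope * x for x in plus_alleles]

# Deviations from additivity (the Delta terms in the formula above).
deltas = [y - fit for y, fit in zip(genotypic_values, additive_fit)]

v_g = statistics.pvariance(genotypic_values)     # total genetic variance
v_a = statistics.pvariance(additive_fit)         # variance along the line
v_d = statistics.fmean([d ** 2 for d in deltas])  # mean squared deviation

print(f"V_G = {v_g:.4f}")
print(f"V_A = {v_a:.4f}, V_D = {v_d:.4f}, V_A + V_D = {v_a + v_d:.4f}")
```

With these numbers \(V_G = 26/9\), \(V_A = 8/3\), and \(V_D = 2/9\), so the variance along the regression line and the variance around it add up to the total. Setting the heterozygote value to 2.0 (exactly intermediate) makes every \(\Delta\) zero, and all of \(V_G\) is then additive.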

But what does \(V_D\) tell us? Consider the following examples:

Figure 5: \(V_D\) captures the additional variation in the phenotype due to different dominance relations among the contributing alleles
Insight

Genetic variance, \(V_G\), can be decomposed into Additive and Dominance variance: \[ V_G = V_A + V_D \]

Plugging this into our earlier equation for the total phenotypic variance, \(V_P\), we have: \[ V_P = V_A + V_D + V_E \]

and we can define heritability in a ‘narrower’ sense to be that component of phenotypic variation which is attributable only to additive genetic variance: \[ h^2 = \frac{V_A}{V_P} \Rightarrow \text{heritability in the 'narrow sense'} \]
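A short numerical sketch shows how the two heritabilities relate. The variance components below are hypothetical values picked for round numbers.

```python
# Hypothetical variance components (arbitrary units).
v_a = 3.0  # additive genetic variance
v_d = 1.0  # dominance variance
v_e = 4.0  # environmental variance

v_g = v_a + v_d  # total genetic variance
v_p = v_g + v_e  # total phenotypic variance

broad_sense = v_g / v_p   # H^2 = V_G / V_P
narrow_sense = v_a / v_p  # h^2 = V_A / V_P

print(f"H^2 = {broad_sense:.3f}")   # prints 0.500
print(f"h^2 = {narrow_sense:.3f}")  # prints 0.375
```

Because \(V_A\) is one part of \(V_G\), \(h^2\) can never exceed \(H^2\); the two are equal only when \(V_D = 0\).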

So far we have been considering a single locus… but as we know, Fisher’s model generalizes to many loci. When dealing with the multilocus version of the model, the key difference is that instead of simply averaging these \(\Delta^2\) terms for a single locus (as we did above), we model the phenotype for the population, and partition the observed variance into the components that can be explained by additive gene action vs. dominance.