Field of Science

Genotype-phenotype maps and mathy biology

I'm reading a book chapter by Peter Stadler from 2002 called Landscapes and Effective Fitness [1]. It has this absolutely gorgeous figure:

 I love it. But just before this figure he has this equation:

I hate it. I hate it because all it says is that each type, x, is at a frequency Px of the total population, so those Px sum to one. But of course. I just don't think this kind of writing is conducive to discourse, because in biology there is already a huge gap between the majority who don't read (and cite) papers with equations, and those who write them. So why muddy the waters with an equation like this, which says next to nothing?

However, I reiterate (and this is why I'm reading the chapter) that this figure of a genotype-phenotype-fitness map is super cool. There are many more different genotypes (the genetic make-up of an organism) than there are different phenotypes (the combined physical attributes of the organism). This must be so, because we now know that each trait is affected by many genes; it takes more than one gene to make a trait (there may be exceptions where only one gene encodes a trait).

The figure is a conceptual map, but real g-p mapping is sort of the holy grail in evolutionary biology at the moment. With a real map like that in hand, evolutionary dynamics can be predicted, and we will be able to say which genetic changes are required to change the phenotype. However, realistically we can only map a very small portion of the genotype onto the phenotype, and there even seems to be some confusion about what the proper answer is to the question of what the genotype-phenotype map looks like. Hopefully the answer won't be too mathy...

[1] Peter F. Stadler, & Christopher R. Stephens (2003). Landscapes and Effective Fitness. Comm. Theor. Biol. DOI: 10.1080/08948550302439
[20] A testable genotype-phenotype map: Modeling evolution of RNA molecules. In: Lässig, M. and Valleriani, A., editors, Biological Evolution and Statistical Physics, pp. 56–83. Springer-Verlag, Berlin, 2002.

How to be a good speaker

Bjørn's two rules of being a good speaker:
  1. Love the words that you speak
  2. Always have something to say
An engaged speaker is more enjoyable to listen to than a bored one. If you love the words as they leave your mouth, you are more likely to engage the audience. Caveat: we all hate someone who loves to speak - too much. I am here talking about giving a presentation, where you are expected to deliver a monologue. In dialogue, be a good listener.

If you don't have something to say, don't give a talk. As a scientist, this is the same as not having done anything, in which case you are not doing your job. But I also mean this in a more general sense: live life learning, and have your lessons to share. If not, then it's a waste, in my opinion.

I'm at the 16th Evolutionary Biology Meeting in Marseille, and I trust I don't need to say that some of the presentations don't measure up to the science behind them. And that's a shame; people are bored listening to your talk when they really should be excited about the science. It's a total myth that all one needs to do is do good science, and people will be interested in your talk. Rather, unless it is something you are supremely interested in (which is probably only a small fraction of what you hear at conferences and seminars), people tend to lose interest, tune out, and sometimes even feel antipathy for the speaker.

There are other things a speaker can do, but those are not my rules.

I am speaking tomorrow evening on the Impact of Epistasis and Pleiotropy on Adaptation.

Epistasis in evolution

[The following is a post written for BEACON.]

What is epistasis?
Epistasis is a measure of the strength of epistatic interactions. Epistatic interactions are non-additive interactions between alleles, loci, or mutations. That is, if the combined effect of a pair of mutations is not what we expect from their individual effects, we then say there is epistasis between those two mutations.

Two mutations that are both detrimental on their own can be beneficial when they occur together. An example comes from Joe Thornton’s lab: the reduced sensitivity to hormone that is the present function of the vertebrate glucocorticoid receptor. Two mutations both reduced sensitivity and destabilized the newly duplicated gene shortly after its birth 450 million years ago. A third mutation – neutral without the first two mutations – buffered the destabilization, and allowed the gene to go to fixation (Carroll et al., 2011).

Epistasis is mostly measured in terms of fitness, as the deviation from additivity, but in principle any trait-value can be used*. If mutation A increases fitness by 5% and B increases fitness by 10%, then we might expect that an organism with both mutations gets a fitness of 1.05 × 1.10 = 1.155, i.e., an increase of 15.5%. This would be the case if the two mutations do not interact, so that their effects on fitness are independent of each other. The deviation can be measured in various ways, but the proper way of doing it would be like this:

ε = log10[(WAB × W0) / (WA × WB)],

where W0 is the fitness of the organisms with neither mutation. This is the best definition (!), because we assumed above that the effects of the mutations are to increase fitness by a fraction of the current fitness, rather than by adding a number. If mutations did increase fitness by an absolute number, we might measure epistasis as

ε = WAB + W0 – (WA + WB).

Both of these measures are then zero when there is no epistasis, and both can be extended to deal with more than two mutations interacting. When ε>0 we call it positive epistasis, and negative epistasis when ε<0 (Fig. 1).

So, if an organism with both mutations has a fitness of 1.20, then (taking W0 = 1) the amount of epistasis is ε = log10[1.20 / (1.05 × 1.10)] = 0.01660. If two deleterious mutations together have a beneficial effect, the sign of the joint effect is reversed, and this is called reciprocal sign epistasis (e.g., WA = 0.95, WB = 0.90, WAB = 1.20, giving ε = 0.1472). A trivial case of negative epistasis is when both mutations are independently neutral, but their joint effect is deleterious (e.g., WA = 1.0, WB = 1.0, WAB = 0.90, ε = -0.04576). I say this is a trivial case, because this type of interaction could be one where two genes carry out the same function, thereby exhibiting robustness by being redundant; the organism then only suffers a fitness decrease when both genes are not working properly.
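To make the arithmetic above easy to check, here is a minimal Python sketch of the two measures. The function names are my own invention for illustration, and I assume the wild-type fitness W0 is 1 unless stated otherwise, as in the numbers above.

```python
import math


def epistasis_multiplicative(w_ab, w_a, w_b, w_0=1.0):
    """Log-scale epistasis: zero when fitness effects multiply independently."""
    return math.log10((w_ab * w_0) / (w_a * w_b))


def epistasis_additive(w_ab, w_a, w_b, w_0=1.0):
    """Additive-scale epistasis: zero when fitness effects simply add."""
    return w_ab + w_0 - (w_a + w_b)


# The three worked examples from the text, as (W_A, W_B, W_AB) with W_0 = 1.
examples = {
    "positive epistasis": (1.05, 1.10, 1.20),
    "reciprocal sign epistasis": (0.95, 0.90, 1.20),
    "redundant genes (negative)": (1.00, 1.00, 0.90),
}

for name, (w_a, w_b, w_ab) in examples.items():
    eps = epistasis_multiplicative(w_ab, w_a, w_b)
    print(f"{name}: epsilon = {eps:.4g}")
```

Note that the two measures need not agree on the amount (or, in principle, even the sign) of epistasis for the same four fitness values, which is why it matters which model of mutational effects you assume.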

Fig. 1: Schematic illustration of epistasis. Two mutations A and B can interact epistatically in different ways with varying effects on fitness. The fitness of the wild-type is represented by the black baselines, and the heights of arrows represent the fitness after one mutation (WA or WB) and after both mutations (WAB). Green, positive epistasis, red, negative epistasis, black, no epistasis. In (a), two independently beneficial mutations may have their joint effect increased or diminished (WAB larger or smaller), while in (b) the independent effect of the two mutations is deleterious and beneficial, respectively, and the combined expected effect on fitness is deleterious. In (c), each mutation by itself is deleterious, but when they interact, the result can be reciprocal sign epistasis (green arrow). These sketches illustrate an additive model, where the sum of WA and WB is equal to WAB without epistasis. In our model, using the geometric mean this corresponds to taking the logarithms of the fitness. From Østman et al. (2012).

Epistasis is a feature of the genotype-phenotype map, and of genetic architecture. The genes that together are responsible for a trait (e.g., eyes, lungs, blood-clotting) are likely to interact and have non-zero epistasis. Many genes are also pleiotropic, i.e. part of gene-networks of more than one trait (Fig. 2), as they are expressed in different contexts (tissues, cell-types, in response to different environmental cues, etc.).
Fig. 2: Epistatic modules. (A) Hypothetical genotype-phenotype map with three modules of groups of genes affecting three traits: eyes, lungs, and blood-clotting. The genes within each module interact epistatically, while some genes exhibit pleiotropy (black arrows). Not all pairs of genes affecting the same trait necessarily have non-zero epistasis. (B) Human liver coexpression network and corresponding gene modules. The gene coexpression network consists of the top 12.5% most differentially expressed genes (5,012 expression traits). The colors of the nodes represent their module assignments. Each of the colors corresponds to a trait, and most genes are only expressed in that trait, though some are expressed in more than one (pleiotropy), as indicated by lines signifying coexpression. From Friend (2010).

Why is epistasis important in evolution?
One reason why epistasis is so important in evolutionary biology is that it affects the fitness landscape. The structure of the fitness landscape in large part determines many important things in evolution, such as evolvability, robustness, repeatability, contingency, and speciation. If the environment dictates that one set of genes/loci has a particular combination of alleles that optimizes fitness, then without epistasis each gene can be optimized individually until the optimal combination is reached (i.e., there is one peak in the local fitness landscape, aka a smooth landscape). Deterministically, the population will end up on the peak. However, if the genes/loci interact, then fitness values are modified, and the fitness landscape will no longer be smooth, but contain multiple local peaks with valleys in between. Evolution in such a rugged fitness landscape will not be predictable, and multiple outcomes are now possible. Because there are multiple peaks, the population might get stuck on a local peak with lower fitness than the highest peak in the landscape. Another possibility is that more than one peak is climbed at the same time, and if such a situation can be sustained, it can lead to evolutionary branching and even speciation.
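The smallest possible illustration of smooth versus rugged is a two-locus landscape. The sketch below is a toy of my own (all names are hypothetical, W0 is assumed to be 1, and ε is on the log10 scale used earlier): it builds the four genotypes, then counts local peaks, i.e., genotypes that are fitter than every one-mutation neighbor.

```python
def neighbors(genotype):
    """Yield every genotype that differs from `genotype` at exactly one locus."""
    for i in range(len(genotype)):
        yield genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]


def local_peaks(fitness):
    """Return genotypes that are strictly fitter than all one-mutation neighbors."""
    return [
        g for g, w in fitness.items()
        if all(w > fitness[n] for n in neighbors(g))
    ]


def two_locus_landscape(w_a, w_b, epsilon=0.0):
    """Two biallelic loci; the double mutant deviates from W_A * W_B by 10**epsilon."""
    return {
        (0, 0): 1.0,                          # wild type, W_0 = 1 by assumption
        (1, 0): w_a,                          # mutation A only
        (0, 1): w_b,                          # mutation B only
        (1, 1): w_a * w_b * 10 ** epsilon,    # both mutations
    }


# No epistasis: two deleterious mutations, and the wild type is the only peak.
smooth = two_locus_landscape(0.95, 0.90)
# Reciprocal sign epistasis: the double mutant becomes a second peak.
rugged = two_locus_landscape(0.95, 0.90, epsilon=0.1472)

print("peaks without epistasis:", local_peaks(smooth))
print("peaks with epistasis:   ", local_peaks(rugged))
```

With epistasis, a population sitting on the wild type has to cross a fitness valley (either single mutant) to reach the higher double-mutant peak, which is the getting-stuck scenario described above in miniature.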

Another reason why epistasis is so important is that interactions between genes mean that much more complex traits can be made. If genes did not interact, then no trait would be affected by more than one gene (is this necessarily always true?). It is of course not possible to make a complex structure with only one kind of protein. Conversely, the more genes interact within a module, the more complex the trait can be, which in turn can translate into higher fitness. With only a handful of genes available, only a simple eye can develop, while many genes together can make a more complex structure, which can increase the organism’s fitness. The fact that genes interact epistatically is why complex multicellular organisms with abundant cellular differentiation are possible at all.

How prevalent is epistasis?
Very. Basically, when people measure it, pretty much all pairs of mutations are epistatic. That’s hard to believe, and it probably isn’t true. Measuring fitness is generally difficult; you have to measure the fitness of four genotypes (neither mutation, each single mutant, and the double mutant), and just a little bit of error will give ε different from zero. Therefore it is reasonable to attribute lots of non-zero measures below some limit to no epistasis. And then still, it turns out lots of pairs of mutations have significant epistasis between them.
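Here is a rough simulation of that measurement problem. Everything in it is an assumption of mine for illustration: the true fitness values (5% and 10% effects with no epistasis at all), Gaussian measurement noise with a standard deviation of 0.01, and a cutoff of 0.02 below which we would call ε zero.

```python
import math
import random


def measured_epsilon(rng, w_a=1.05, w_b=1.10, noise_sd=0.01):
    """One noisy estimate of epsilon when the true epistasis is exactly zero."""
    true_fitness = {"0": 1.0, "A": w_a, "B": w_b, "AB": w_a * w_b}
    # Each of the four genotypes is measured with independent Gaussian error.
    m = {k: w + rng.gauss(0.0, noise_sd) for k, w in true_fitness.items()}
    return math.log10((m["AB"] * m["0"]) / (m["A"] * m["B"]))


rng = random.Random(1)  # fixed seed so the run is repeatable
samples = [measured_epsilon(rng) for _ in range(10_000)]

cutoff = 0.02  # hypothetical limit below which we attribute epsilon to noise
mean_eps = sum(samples) / len(samples)
mean_abs_eps = sum(abs(e) for e in samples) / len(samples)
frac_above = sum(abs(e) > cutoff for e in samples) / len(samples)

print(f"mean epsilon:            {mean_eps:.5f}")
print(f"mean |epsilon|:          {mean_abs_eps:.5f}")
print(f"fraction with |eps| > {cutoff}: {frac_above:.3f}")
```

The point is only qualitative: even with zero true epistasis, no single measurement comes out as exactly zero, so some cutoff (or a proper statistical test) is needed before calling a pair of mutations epistatic.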

For example, Costanzo et al. (2010), using data from a genome-wide, quantitative analysis of genetic interactions in yeast, showed that even when including only high values of epistasis (|ε|>0.08), a large fraction of gene pairs are epistatic (Fig. 3A). Another example is Drosophila melanogaster, where 15 insertions in the genes involved in startle-induced locomotion show extensive genetic interactions (Fig. 3B).

Fig. 3: Prevalence of epistasis. (A) The distribution of genetic interaction network degree for negative (red) and positive (green) interactions involving query genes. From Costanzo et al. (2010). (B) Epistatic interactions for startle-induced locomotion among 15 P[GT1] insertion lines in double heterozygous genotypes. From Yamamoto et al. (2008).

What is the current research focus?
Two major areas of research in evolution are adaptation and speciation. This has been so for a long time, and while we do know a lot about both, there is little doubt that this will not change in the foreseeable future. Adaptation is particularly affected by epistasis and pleiotropy, and it is an outstanding question to what extent adaptation is enhanced or mitigated by epistasis. Empirical data suggest that epistasis causes diminishing returns (e.g., Khan et al., 2011), but this probably just means that fitness peaks become shallower the closer you get to the apex, so that the biggest returns on fitness come with the first beneficial mutations (which are more likely to go to fixation in the first place). How much does epistasis affect evolvability? Fitness landscape ruggedness can limit a population’s ability to evolve, and ruggedness depends on the amount of epistasis among and within genes. But are these epistatic interactions set in stone, or are they malleable? In other words, how easy is it to create epistatic interactions, and once formed, can they be broken and allow for new advances in adaptation?

Speciation is also a much studied area of evolutionary biology, but the impact of genetic architecture is only recently coming into focus. Epistasis can cause Dobzhansky-Muller incompatibilities, which can lead to reproductive isolation (which is cool if your gold standard of speciation is the Biological Species Concept). But more generally, the epistatic nature of the genetic architecture causing multiple fitness peaks implies that evolutionary branching can occur. It remains an open question how much this is governed by epistasis, and particularly whether epistasis is a prerequisite for speciation of microbes.

* Not that I am thereby saying that fitness is just another trait. I hold the view that fitness – reproductive success – is a function of other traits, such that a network would point from genes to traits, and traits to fitness.

Carroll SM, Ortlund EA, and Thornton JW (2011). Mechanisms for the evolution of a derived function in the ancestral glucocorticoid receptor. PLoS Genetics, 7 (6) PMID: 21698144
Costanzo M, et al. (2010). The Genetic Landscape of a Cell. Science, 327 DOI: 10.1126/science.1180823
Friend SH (2010). The need for precompetitive integrative bionetwork disease model building. Clinical pharmacology and therapeutics, 87 (5), 536-9 PMID: 20407459
Khan AI, Dinh DM, Schneider D, Lenski RE, and Cooper TF (2011). Negative epistasis between beneficial mutations in an evolving bacterial population. Science (New York, N.Y.), 332 (6034), 1193-6 PMID: 21636772
Yamamoto A, Zwarts L, Callaerts P, Norga K, Mackay TF, and Anholt RR (2008). Neurogenetic networks for startle-induced locomotion in Drosophila melanogaster. Proceedings of the National Academy of Sciences of the United States of America, 105 (34), 12393-8 PMID: 18713854
Østman B, Hintze A, and Adami C (2012). Impact of epistasis and pleiotropy on evolutionary adaptation. Proceedings of The Royal Society Biological sciences, 279 (1727), 247-56 PMID: 21697174

What would surprise you?

How often do you go "shiiiiiiiiiiit, so that's how it is!?!" What would really shock you? "FUCK! I never thought that would be the case..."

Probably not that often. But those moments are so great, and as a scientist, I'd say we sort of live for them.

I was thinking about this in terms of working in evolution. What would be a really big moment that I could say advanced my understanding of how living things evolve? Most papers I read anymore are incremental advances. Actually, all of them are. When I first started learning about evolution, I was in near-constant shock/revelational mode. It was pure delight to discover what we know about evolution. But now that I know most of it, nothing much surprises me anymore. Which is a shame.

So it got me thinking about where I could search for such moments. Something akin to learning that the Earth is not the center of the universe, or that everything is made of atoms. Or that there were dinosaurs, and that we evolved. The rest seems to be details. Important details, but not revelational.

I do various things in evolution, but my overarching focus is the origin of evolutionary novelty (but I like speciation, too). How do new things come into existence? The first eyes, first brain, first blood. People will then say that those things are derived from previous structures. Eyes from simpler photoreceptors, brains from simple nervous systems, blood cells from other cells. And these systems derived from yet simpler cells, but along the way, something new happened at least at some points that enabled these new systems/structures to form. New proteins were added to the mix, encoded by new genes. So where did these new genes come from? Well, they were derived from other genes, by duplication and neofunctionalization: a new gene is a copy and a refashioning of an old gene. So far so good. Then where did the first gene come from? Sorry, I don't work on origin-of-life stuff.

Is that it? Not quite. There are some major transitions in evolution to be explained. Unicellularity to multicellularity, cellular differentiation, asexual to sexual reproduction, and stuff like that.

But then, I am still left with this feeling at times that there is really nothing that would really upset my world-view (of evolution) much anymore. Still nothing revelational in sight. I hope I'm wrong.

Titles in evolutionary biology

These are the new papers for the last couple of weeks that I would like to read but will probably never get to. Gone are the days of the polymaths already, and now this!

  • Systematic underestimation of the age of selected alleles
  • Predatory Fish Select for Coordinated Collective Motion in Virtual Prey
  • Rapid evolution of Wolbachia incompatibility types
  • Avoidance of roads and selection for recent cutovers by threatened caribou: fitness-rewarding or maladaptive behaviour?
  • Weak Selection and Protein Evolution
  • Patterns of Neutral Diversity Under General Models of Selective Sweeps
  • Selective Sweeps in Multilocus Models of Quantitative Traits
  • Distinct evolutionary patterns of morphometric sperm traits in passerine birds
  • A selective force favoring increased G+C content in bacterial genes
  • Evolutionary Dynamics of Strategic Behavior in a Collective-Risk Dilemma
  • Evolution of Stress Response in the Face of Unreliable Environmental Signals
  • Network Context and Selection in the Evolution to Enzyme Specificity
  • Clade Age and Species Richness Are Decoupled Across the Eukaryotic Tree of Life
  • Evolutionary medicine: its scope, interest and potential*
  • The role of ‘soaking’ in spiteful toxin production in Pseudomonas aeruginosa
  • On the evolutionary origins of the egalitarian syndrome

* Because I am meeting with Stephen Stearns when he visits MSU this Thursday, I will take an actual look at this review article. Paul Ewald was here last week, and we had a good talk about selection in pathogens and human disease. I also met with Randolph Nesse last semester, so evolutionary medicine has been in focus a lot lately.

ENCODE: What defines genomic function?

A new wealth of articles by the ENCODE (the ENCyclopedia Of DNA Elements) consortium suggests that far more of the human genome carries out some function or other than previously thought, and one might conclude that very little DNA is junk:

From an introduction to the new ENCODE papers:
Collectively, the papers describe 1,640 data sets generated across 147 different cell types. Among the many important results there is one that stands out above them all: more than 80% of the human genome's components have now been assigned at least one biochemical function.
[Emphasis added.]
80%? That is a lot (see [2] for details). It doesn't throw out the idea of junk-DNA, i.e., that there is DNA that has no function - but it puts the number much closer to zero than the 90% that I have heard before. But I seriously wonder what is meant by "function". Take a look at this image[1]:

 Gene regulation is a very spatial thing, which means that if you were to move a gene (i.e., the protein-coding DNA, or exons) somewhere else, then it would probably not be transcribed at the right time. So, if you were to cut out a length of DNA that doesn't have any function, then other DNA will be shifted spatially, and this might screw up proper transcription. So, DNA without function might be important as a filler. On the other hand, ENCODE includes in the 80% everything that is transcribed (i.e., DNA is used to produce RNA), but that doesn't mean that it has a function, as defined in my book. RNA may be floating around in the cell, and may never be translated (into protein), and may not have any other (e.g. regulatory) function either. On top of that, even if it is translated and a protein is created based on that DNA, it doesn't necessarily follow that the protein does anything (it could even be detrimental to the organism), and then that surely isn't functional.

To me, this is one of those moments where my understanding of how things work is challenged. If it really is true that no more than 20% of the human genome is junk (and it apparently could be a lot less than that), then I am happy to update my understanding, but I am super-skeptical that there is that little junk in the human genome. But I am not too happy with the usage of the words junk and non-functional here.

[1] Joseph R. Ecker, Wendy A. Bickmore, Inês Barroso, Jonathan K. Pritchard, Yoav Gilad & Eran Segal (2012). Genomics: ENCODE explained. Nature, 489 DOI: 10.1038/489052a
[2] The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome Nature, 489 DOI: 10.1038/nature11247

Darwin's Restaurant (CoE #51)

The 51st edition of Carnival of Evolution is up at The Stochastic Scientist: Darwin's Restaurant. There's something on the menu for everyone.

Next edition will be hosted by The Genealogical World of Phylogenetic Networks.