Author post: An amino acid polymorphism in the Drosophila insulin receptor demonstrates pleiotropic and adaptive function in life history traits

This next guest post is by Annalise Paaby on her paper: Paaby et al. “An amino acid polymorphism in the Drosophila insulin receptor demonstrates pleiotropic and adaptive function in life history traits” bioRxived here.

Find the alleles!
Organisms vary, even within populations, in ways that appear adaptive. We would very much like to identify the genetic elements that encode these phenotypic differences, but this is a challenging task. For polygenic traits, the tiny contributions of single loci can be near-impossible to detect in an experimental setting. In contrast, natural selection operates on a grand scale, with power to discriminate between alleles. We took advantage of the fact that Drosophila melanogaster are distributed across an extreme environmental gradient in order to identify a specific polymorphism that contributes to adaptive variation.

D. melanogaster live along the east coasts of North America and Australia. On both continents, flies in low-latitude, warm environments develop faster and are more fecund, while flies in high-latitude, cold environments live longer and are more resistant to most stresses.

Knocking out insulin signaling genes extends lifespan, increases stress tolerance, and reduces reproduction. Given these phenotypes, we wondered whether insulin signaling genes might vary in natural populations and influence life history. In a paper published a few years ago, we showed that alleles of a polymorphism in the Insulin-like Receptor (InR) showed clines in frequency in both North America and Australia. Since the populations were founded at different times from different source populations, the replicated pattern on separate continents is good evidence that the polymorphism is a target of selection.

What is this polymorphism?
The polymorphism we discovered is a complex indel that disrupts a region of glutamines and histidines in the first exon of InR. In our original survey, we found many segregating alleles, all differing in length by multiples of three nucleotides. However, two alleles comprise the majority. An allele we call InRshort is common at high latitudes, and InRlong, which is six nucleotides longer, is common at low latitudes. The alleles differ in four amino acids across a span of 16 residues.

The alleles affect signaling
In our current study, we show that InRshort and InRlong affect levels of insulin signaling. We took InRshort and InRlong flies from a single population in New York, replaced the X and second chromosomes, and randomized the genetic backgrounds of the third chromosome, on which InR resides. We measured levels of insulin signaling in test lines by performing qPCR on seven transcriptional targets in the pathway, all downstream of the receptor.

We found that for five of the seven targets (four of which were significant), signaling was highest in InRlong, lowest in InRshort, and intermediate in the heterozygote, suggesting that InRshort and InRlong act additively on signaling levels. The directionality of these results makes sense: reduction of insulin signaling is known to extend lifespan, increase stress tolerance and reduce reproductive success, and these are the phenotypes we see at high latitudes where InRshort is common.
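
To make "additive" concrete: in the standard quantitative-genetics decomposition, an allele pair acts additively when the heterozygote sits at the midpoint of the two homozygotes, so the dominance deviation is near zero. The sketch below is only an illustration of that decomposition. The function name and the expression values are hypothetical and are not measurements from the study.

```python
def additive_and_dominance(long_hom, het, short_hom):
    """Decompose three genotype means into an additive effect (a)
    and a dominance deviation (d).

    a: half the difference between the two homozygotes
    d: how far the heterozygote lies from the homozygote midpoint
    """
    midpoint = (long_hom + short_hom) / 2
    a = (long_hom - short_hom) / 2
    d = het - midpoint
    return a, d


# Hypothetical signaling levels in arbitrary units (NOT data from the paper).
a, d = additive_and_dominance(long_hom=3.0, het=2.0, short_hom=1.0)
print(f"additive effect a = {a}, dominance deviation d = {d}")
```

With real data, d would be judged against its sampling error rather than compared to exactly zero.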

Fluctuations over time
In our new study, we returned to the North American populations we evaluated five years prior. However, this time around we mapped 100-bp paired-end reads from pooled population samples. (These data relate to Alan Bergland’s larger exploration of spatial and temporal variation in D. melanogaster, described here on arXiv.) We called each of the discrete polymorphisms within the complex indel polymorphism (SNPs or small indels) individually. Some of those discrete polymorphisms distinguish between the InRshort and InRlong alleles, and they confirm that the clines persist in North America.

We reasoned that alleles prevalent in high-latitude, cold climates might be selected for in the winter, and alleles prevalent in low-latitude, warm climates might be selected for in the summer. We examined a Pennsylvania population at multiple timepoints over three years and saw dramatic fluctuations in allele frequency (changes of approximately 20%) for discrete polymorphisms associated with InRshort and InRlong. As predicted, the “winter” and “summer” alleles were those common at high and low latitudes, respectively.

However, the polymorphisms that showed the most dramatic fluctuations over seasonal time were not necessarily those with the strongest clines in frequency across geographical space. We suggest that aspects of demography and selection probably vary between seasonal and geographical environments, even in the face of apparently similar climatic pressures.
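
For readers unfamiliar with pooled sequencing, here is a minimal sketch of how an allele frequency and its change between two timepoints can be estimated from read counts at a single polymorphism. It is not the pipeline used in the study. The function names, timepoint labels, and read counts are all made up for illustration, and the standard error covers read sampling only (it ignores the sampling of flies into the pool).

```python
from math import sqrt


def pooled_allele_frequency(allele_reads, total_reads):
    """Estimate allele frequency as the fraction of reads carrying the allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return allele_reads / total_reads


def read_sampling_se(freq, total_reads):
    """Binomial standard error from read sampling alone."""
    return sqrt(freq * (1 - freq) / total_reads)


# Hypothetical (allele_reads, total_reads) at two timepoints; NOT real data.
samples = {"timepoint_1": (30, 100), "timepoint_2": (50, 100)}

freqs = {
    label: pooled_allele_frequency(allele, total)
    for label, (allele, total) in samples.items()
}
change = freqs["timepoint_2"] - freqs["timepoint_1"]

for label, (allele, total) in samples.items():
    se = read_sampling_se(freqs[label], total)
    print(f"{label}: frequency {freqs[label]:.2f} (SE {se:.3f})")
print(f"change between timepoints: {change:+.2f}")
```

With these invented counts the frequency shifts by 0.20, the same order as the seasonal changes described above.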

A question of pleiotropy
A longstanding question in the field of life history evolution is whether single alleles affect multiple traits at once (pleiotropy) or affect traits individually but reside near each other (linkage). The question itself arises from the observation, made many times over, that life history traits are typically correlated. For example, long-lived individuals often show reduced reproductive fitness. Longevity is also often positively correlated with the ability to tolerate stress. Do the same genetic variants encode multiple trait phenotypes?

We assayed our InRshort and InRlong test lines for multiple phenotypes: fecundity, development time, body size and allometry, body weight and lipid content, tolerance for multiple stresses, and lifespan. We used the test lines described above, a replicate set of InRshort and InRlong lines derived from a second population, and lines in which we measured the effects of InRshort and InRlong in an InRhypomorph mutant background.

Our full report can be found in the manuscript, but the take-home message is that InRshort and InRlong are significantly associated with all of the tested traits, in directions predicted by a selection regime favoring fast development time, rapid egg-laying, and high heat tolerance in warm climates, and resistance to cold and starvation stresses in cold climates. The InRshort allele was also associated with increased lifespan in males, though we do not necessarily expect that lifespan itself is associated with fitness.

In conclusion, our results implicate insulin signaling as a major mediator of life history adaptation in D. melanogaster, and suggest that tradeoffs can be explained by extensive pleiotropy at a single locus.

Some other things I would like to mention
I value this study for its functional tests: phenotypic effects of candidate polymorphisms are often missing from evolutionary studies. However, and this is a major caveat: the InRshort and InRlong alleles were embedded in genotypic backgrounds that extended well beyond the locus in the test lines. On their own, I do not consider the functional tests definitive. But D. melanogaster have low linkage disequilibrium, which we know decays rapidly just outside our candidate polymorphism. In my opinion, the segregation of InRshort and InRlong in large, recombining wild populations pinpoints the functional alleles, while the experimental assays confirm our hypotheses about the selection regime.

When we first measured fecundity, we counted every single egg laid by every single female over every single one of their lives. And the InRlong females, which we knew were more fecund (their culture bottles grew like gangbusters), laid only five more eggs on average than InRshort females! Highly non-significant. But, it looked like the InRlong flies laid eggs faster. We set up a different assay to measure eggs laid in the first day, and InRlong was six times more fecund. I think this provides an important lesson. We can easily imagine big fitness consequences for egg laying rate, but we might not think to measure it in the lab. Many studies, especially those from a molecular genetics point of view, have been keen to emphasize decoupling of lifespan and reproduction for so-called longevity genes. For conclusions drawn about natural genetic variants (which are the ones of utmost relevance, in my opinion), the question of tradeoffs must consider those fitness axes that are relevant to the wild organism. And these are often unknowable.

We found that InRshort and InRlong were associated with smaller and larger body sizes, respectively. This makes sense in terms of levels of insulin signaling, but not in terms of body sizes in wild populations. High-latitude flies are typically larger, not smaller. So, if InRshort and InRlong alleles affect body size, they either do so epistatically with other body size loci or they suffer antagonistic selection pressures along multiple fitness axes. Interesting!

MIPSTR: a method for multiplex genotyping of germ-line and somatic STR variation across many individuals

Keisha Dawn Carlson, Peter H Sudmant, Maximilian Oliver Press, Evan E Eichler, Jay Shendure, Christine Queitsch

Short tandem repeats (STRs) are highly mutable genetic elements that often reside in functional genomic regions. The cumulative evidence of genetic studies on individual STRs suggests that STR variation profoundly affects phenotype and contributes to trait heritability. Despite recent advances in sequencing technology, STR variation has remained largely inaccessible across many individuals compared to single nucleotide variation or copy number variation. STR genotyping with short-read sequence data is confounded by (1) the difficulty of uniquely mapping short, low-complexity reads and (2) the high rate of STR amplification stutter. Here, we present MIPSTR, a robust, scalable, and affordable method that addresses these challenges. MIPSTR uses targeted capture of STR loci by single-molecule Molecular Inversion Probes (smMIPs) and a unique mapping strategy. Targeted capture and mapping strategy resolve the first challenge; the use of single molecule information resolves the second challenge. Unlike previous methods, MIPSTR is capable of distinguishing technical error due to amplification stutter from somatic STR mutations. In proof-of-principle experiments, we use MIPSTR to determine germ-line STR genotypes for 102 STR loci with high accuracy across diverse populations of the plant A. thaliana. We show that putatively functional STRs may be identified by deviation from predicted STR variation and by association with quantitative phenotypes. Employing DNA mixing experiments and a mutant deficient in DNA repair, we demonstrate that MIPSTR can detect low-frequency somatic STR variants. MIPSTR is applicable to any organism with a high-quality reference genome and is scalable to genotyping many thousands of STR loci in thousands of individuals.

An estimate of the average number of recessive lethal mutations carried by humans

Ziyue Gao, Darrel Waggoner, Matthew Stephens, Carole Ober, Molly Przeworski
(Submitted on 28 Jul 2014)

The effects of inbreeding on human health depend critically on the number and severity of recessive, deleterious mutations carried by individuals. In humans, existing estimates of these quantities are based on comparisons between consanguineous and non-consanguineous couples, an approach that confounds socioeconomic and genetic effects of inbreeding. To circumvent this limitation, we focused on a founder population with almost complete Mendelian disease ascertainment and a known pedigree. By considering all recessive lethal diseases reported in the pedigree and simulating allele transmissions, we estimated that each haploid set of human autosomes carries on average 0.29 (95% credible interval [0.10, 0.83]) autosomal, recessive alleles that lead to complete sterility or severe disorders at birth or before reproductive age when homozygous. Comparison to existing estimates of the deleterious effects of all recessive alleles suggests that a substantial fraction of the burden of autosomal, recessive variants is due to single mutations that lead to death between birth and reproductive age. In turn, the comparison to estimates from other eukaryotes points to a surprising constancy of the average number of recessive lethal mutations across organisms with markedly different genome sizes.
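
The abstract reports its estimate per haploid set of autosomes, while the title refers to the number carried by a person. Assuming only that a person carries two haploid sets, so that the expected counts add, the per-individual figure follows by doubling. The short calculation below does just that arithmetic. The variable names are illustrative, and scaling the interval by the same factor is this sketch's own simplification, not a number quoted from the abstract.

```python
# Values quoted in the abstract, per haploid set of autosomes.
per_haploid_mean = 0.29
per_haploid_interval = (0.10, 0.83)

# Assumption of this sketch: a diploid individual carries two haploid sets,
# so the expected count per individual is twice the per-haploid mean.
per_diploid_mean = 2 * per_haploid_mean
per_diploid_interval = tuple(2 * bound for bound in per_haploid_interval)

low, high = per_diploid_interval
print(f"per diploid individual: {per_diploid_mean:.2f} [{low:.2f}, {high:.2f}]")
```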

Comparative Performance of Two Whole Genome Capture Methodologies on Ancient DNA Illumina Libraries

Maria Avila-Arcos, Marcela Sandoval-Velasco, Hannes Schroeder, Meredith L Carpenter, Anna-Sapfo Malaspinas, Nathan Wales, Fernando Peñaloza, Carlos D Bustamante, M. Thomas P Gilbert

1. The application of whole genome capture (WGC) methods to ancient DNA (aDNA) promises to increase the efficiency of ancient genome sequencing. 2. We compared the performance of two recently developed WGC methods in enriching human aDNA within Illumina libraries built using both double-stranded (DSL) and single-stranded (SSL) build protocols. Although both methods effectively enriched aDNA, one consistently produced marginally better results, giving us the opportunity to further explore the parameters influencing WGC experiments. 3. Our results suggest that bait length has an important influence on library enrichment. Moreover, we show that WGC biases against the shorter molecules that are enriched in SSL preparation protocols. Therefore application of WGC to such samples is not recommended without future optimization. Lastly, we document the effect of WGC on other features including clonality, GC composition and repetitive DNA content of captured libraries. 4. Our findings provide insights for researchers planning to perform WGC on aDNA, and suggest future tests and optimization to improve WGC efficiency.

Butter: High-precision genomic alignment of small RNA-seq data

Michael J Axtell

Eukaryotes produce large numbers of small non-coding RNAs that act as specificity determinants for various gene-regulatory complexes. These include microRNAs (miRNAs), endogenous short interfering RNAs (siRNAs), and Piwi-associated RNAs (piRNAs). These RNAs can be discovered, annotated, and quantified using small RNA-seq, a variant RNA-seq method based on highly parallel sequencing. Alignment to a reference genome is a critical step in analysis of small RNA-seq data. Because of their small size (20-30 nts depending on the organism and sub-type) and tendency to originate from multi-gene families or repetitive regions, reads that align equally well to more than one genomic location are very common. Typical methods to deal with multi-mapped small RNA-seq reads sacrifice either precision or sensitivity. The tool ‘butter’ balances precision and sensitivity by placing multi-mapped reads using an iterative approach, where the decision between possible locations is dictated by the local densities of more confidently aligned reads. Butter displays superior performance relative to other small RNA-seq aligners. Treatment of multi-mapped small RNA-seq reads has substantial impacts on downstream analyses, including quantification of MIRNA paralogs, and discovery of endogenous siRNA loci. Butter is freely available under a GNU general public license.
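
The density-guided placement idea in the abstract can be illustrated with a toy example. This is not Butter's implementation: it makes a single pass rather than iterating, treats positions as integers on one chromosome, and the function name, the `window` parameter, and all inputs are hypothetical.

```python
from collections import Counter


def place_multimappers(unique_positions, multi_reads, window=50):
    """Toy placement of multi-mapped reads.

    unique_positions: start positions of confidently (uniquely) aligned reads
    multi_reads: dict of read id -> list of candidate start positions
    Each multi-mapped read goes to the candidate with the most uniquely
    aligned reads within +/- window bases; ties go to the first candidate.
    """
    counts = Counter(unique_positions)

    def local_density(pos):
        return sum(n for p, n in counts.items() if abs(p - pos) <= window)

    return {
        read_id: max(candidates, key=local_density)
        for read_id, candidates in multi_reads.items()
    }


# Hypothetical alignments: a cluster near position 100 and a lone read at 5000.
unique_positions = [100, 105, 110, 5000]
multi_reads = {"read_a": [120, 5010], "read_b": [9000, 5005]}

placements = place_multimappers(unique_positions, multi_reads)
print(placements)
```

Here `read_a` lands next to the dense cluster, and `read_b` lands near the only candidate with any supporting reads. An iterative scheme, as the abstract describes, would presumably let already-placed reads inform later decisions, which this sketch omits.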

Clonal interference and Muller’s ratchet in spatial habitats

Jakub Otwinowski, Joachim Krug
(Submitted on 18 Feb 2013 (v1), last revised 23 Jul 2014 (this version, v3))

Competition between independently arising beneficial mutations is enhanced in spatial populations due to the linear rather than exponential growth of clones. Recent theoretical studies have pointed out that the resulting fitness dynamics is analogous to a surface growth process, where new layers nucleate and spread stochastically, leading to the build up of scale-invariant roughness. This scenario differs qualitatively from the standard view of adaptation in that the speed of adaptation becomes independent of population size while the fitness variance does not. Here we exploit recent progress in the understanding of surface growth processes to obtain precise predictions for the universal, non-Gaussian shape of the fitness distribution for one-dimensional habitats, which are verified by simulations. When the mutations are deleterious rather than beneficial the problem becomes a spatial version of Muller’s ratchet. In contrast to the case of well-mixed populations, the rate of fitness decline remains finite even in the limit of an infinite habitat, provided the ratio U_d/s^2 between the deleterious mutation rate and the square of the (negative) selection coefficient is sufficiently large. Using again an analogy to surface growth models we show that the transition between the stationary and the moving state of the ratchet is governed by directed percolation.

Concerning RNA-Guided Gene Drives for the Alteration of Wild Populations

Kevin M Esvelt, Andrea L Smidler, Flaminia Catteruccia, George M Church

Gene drives may be capable of addressing ecological problems by altering entire populations of wild organisms, but their use has remained largely theoretical due to technical constraints. Here we consider the potential for RNA-guided gene drives based on the CRISPR nuclease Cas9 to serve as a general method for spreading altered traits through wild populations over many generations. We detail likely capabilities, discuss limitations, and provide novel precautionary strategies to control the spread of gene drives and reverse genomic changes. The ability to edit populations of sexual species would offer substantial benefits to humanity and the environment. For example, RNA-guided gene drives could potentially prevent the spread of disease, support agriculture by reversing pesticide and herbicide resistance in insects and weeds, and control damaging invasive species. However, the possibility of unwanted ecological effects and near-certainty of spread across political borders demand careful assessment of each potential application. We call for thoughtful, inclusive, and well-informed public discussions to explore the responsible use of this currently theoretical technology.