Secondary contact and local adaptation contribute to genome-wide patterns of clinal variation in Drosophila melanogaster

Alan O. Bergland, Ray Tobler, Josefa Gonzalez, Paul Schmidt, Dmitri Petrov
doi: http://dx.doi.org/10.1101/009084

Populations arrayed along broad latitudinal gradients often show patterns of clinal variation in phenotype and genotype. Such population differentiation can be generated and maintained by a combination of demographic events and adaptive evolutionary processes. Here, we investigate the evolutionary forces that generated and maintain clinal variation genome-wide among populations of Drosophila melanogaster sampled in North America and Australia. We contrast patterns of clinal variation on these continents with patterns of differentiation among ancestral European and African populations. We show that recently derived North American and Australian populations were likely founded by both European and African lineages and that this admixture event generated genome-wide patterns of parallel clinal variation. The pervasive effects of admixture meant that only a handful of loci could be attributed to the operation of spatially varying selection using an FST outlier approach. Our results provide novel insight into a well-studied system of clinal differentiation and provide a context for future studies seeking to identify loci contributing to local adaptation in D. melanogaster.
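
For readers unfamiliar with the FST outlier approach mentioned in the abstract, the following is a minimal sketch of the general idea, not the authors' pipeline. It assumes biallelic sites, two populations of equal weight, and known allele frequencies; the function names and the `top_fraction` cutoff are illustrative choices. As the abstract notes, admixture can inflate differentiation genome-wide, so a simple empirical cutoff like this one is only a starting point.

```python
def fst(p1, p2):
    """Wright's F_ST at one biallelic site, from the allele frequency
    in each of two equally weighted populations."""
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                              # site is monomorphic overall
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def fst_outliers(freq_pairs, top_fraction=0.01):
    """Return indices of the sites with the highest F_ST.

    freq_pairs   : list of (p1, p2) allele-frequency pairs, one per site
    top_fraction : hypothetical empirical cutoff (top 1% by default)
    """
    values = [fst(p1, p2) for p1, p2 in freq_pairs]
    ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    n_keep = max(1, int(len(values) * top_fraction))
    return ranked[:n_keep]
```

Calling `fst_outliers` on a genome-wide list of frequency pairs returns the positions in that list with the most extreme differentiation, which would then need to be interpreted against a demographic null.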

Estimating the temporal and spatial extent of gene flow among sympatric lizard populations (genus Sceloporus) in the southern Mexican highlands

Jared A Grummer, Martha L. Calderón, Adrián Nieto Montes-de Oca, Eric N Smith, Fausto Mendez-de la Cruz, Adam Leache
doi: http://dx.doi.org/10.1101/008623

Interspecific gene flow is pervasive throughout the tree of life. Although detecting gene flow between populations has been facilitated by new analytical approaches, determining the timing and geography of hybridization has remained difficult, particularly for historical gene flow. A geographically explicit phylogenetic approach is needed to determine the ancestral population overlap. In this study, we performed population genetic analyses, species delimitation, simulations, and a recently developed approach of species tree diffusion to infer the phylogeographic history, timing and geographic extent of gene flow in the Sceloporus spinosus group. The two species in this group, S. spinosus and S. horridus, are distributed in eastern and western portions of Mexico, respectively, but populations of these species are sympatric in the southern Mexican highlands. We generated data consisting of three mitochondrial genes and eight nuclear loci for 148 and 68 individuals, respectively. We delimited six lineages in this group, but found strong evidence of mito-nuclear discordance in sympatric populations of S. spinosus and S. horridus owing to mitochondrial introgression. We used coalescent simulations to differentiate ancestral gene flow from secondary contact, but found mixed support for these two models. Bayesian phylogeography indicated more than 60% range overlap between ancestral S. spinosus and S. horridus populations since the time of their divergence. Isolation-migration analyses, however, revealed near-zero levels of gene flow between these ancestral populations. Taken together, results from both simulations and empirical data indicate that, despite a long history of sympatry between these two species, gene flow in this group has occurred only recently.

Rate and cost of adaptation in the Drosophila genome

Stephan Schiffels, Michael Lässig, Ville Mustonen
doi: http://dx.doi.org/10.1101/008680

Recent studies have consistently inferred high rates of adaptive molecular evolution between Drosophila species. At the same time, the Drosophila genome evolves under different rates of recombination, which results in partial genetic linkage between alleles at neighboring genomic loci. Here we analyze how linkage correlations affect adaptive evolution. We develop a new inference method for adaptation that takes into account the effect on an allele at a focal site caused by neighboring deleterious alleles (background selection) and by neighboring adaptive substitutions (hitchhiking). Using complete genome sequence data and fine-scale recombination maps, we infer a highly heterogeneous scenario of adaptation in Drosophila. In high-recombining regions, about 50% of all amino acid substitutions are adaptive, together with about 20% of all substitutions in proximal intergenic regions. In low-recombining regions, only a small fraction of the amino acid substitutions are adaptive, while hitchhiking accounts for the majority of these changes. Hitchhiking of deleterious alleles generates a substantial collateral cost of adaptation, leading to a fitness decline of about 30/2N per gene and per million years in the lowest-recombining regions. Our results show how recombination shapes rate and efficacy of the adaptive dynamics in eukaryotic genomes.

Segregation distorters are not a primary source of Dobzhansky-Muller incompatibilities in house mouse hybrids

Russ Corbett-Detig, Emily Jacobs-Palmer, Daniel Hartl, Hopi Hoekstra
doi: http://dx.doi.org/10.1101/008672

Understanding the molecular basis of species formation is an important goal in evolutionary genetics, and Dobzhansky-Muller incompatibilities are thought to be a common source of postzygotic reproductive isolation between closely related lineages. However, the evolutionary forces that lead to the accumulation of such incompatibilities between diverging taxa are poorly understood. Segregation distorters are an important source of Dobzhansky-Muller incompatibilities between Drosophila species and crop plants, but it remains unclear if the contribution of these selfish genetic elements to reproductive isolation is prevalent in other species. Here, we genotype millions of single nucleotide polymorphisms across the genome from viable sperm of first-generation hybrid male progeny in a cross between Mus musculus castaneus and M. m. domesticus, two subspecies of rodent in the earliest stages of speciation. We then search for a skew in the allele frequencies of the gametes and show that segregation distorters are not measurable contributors to observed infertility in these hybrid males, despite sufficient statistical power to detect even weak segregation distortion with our novel method. Thus, reduced hybrid male fertility in crosses between these nascent species is attributable to other evolutionary forces.
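
The abstract does not describe the authors' statistical method, so the following is only a generic illustration of what "searching for a skew in allele frequencies" can mean at a single site. It is a minimal sketch that assumes independent allele counts from gametes and a Mendelian expectation of 0.5; the function name and the counts in the usage note are hypothetical.

```python
from math import comb


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    k : number of observations carrying one parental allele
    n : total number of observations at the site
    p : expected frequency under fair (Mendelian) segregation

    Sums the probabilities of all outcomes that are no more likely
    than the observed one.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so that ties are counted despite rounding.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)
```

For example, `binomial_two_sided_p(60, 100)` asks how surprising 60 of 100 observations of one allele would be under fair segregation. A genome-wide scan would additionally need to handle linkage between neighbouring sites and multiple testing, which this sketch ignores.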

A genomic map of the effects of linked selection in Drosophila

Eyal Elyashiv, Shmuel Sattath, Tina T. Hu, Alon Strustovsky, Graham McVicker, Peter Andolfatto, Graham Coop, Guy Sella
(Submitted on 23 Aug 2014)

Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of ‘linked selection’ on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTRs). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in UTRs. Our inference further suggests a substantial effect of linked selection from non-classic sweeps. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.

Escape from crossover interference increases with maternal age

Christopher L. Campbell, Nicholas A. Furlotte, Nick Eriksson, David Hinds, Adam Auton
(Submitted on 23 Aug 2014)

Recombination plays a fundamental role in meiosis, ensuring the proper segregation of chromosomes and contributing to genetic diversity by generating novel combinations of alleles. Using data derived from direct-to-consumer genetic testing, we investigated patterns of recombination in over 4,200 families. Our analysis revealed a number of sex differences in the distribution of recombination. We find the fraction of male events occurring within hotspots to be 4.6% higher than for females. We confirm that the recombination rate increases with maternal age, while hotspot usage decreases, with no such effects observed in males. Finally, we show that the placement of female recombination events becomes increasingly deregulated with maternal age, with an increasing fraction of events appearing to escape crossover interference.

Population split time estimation and X to autosome effective population size differences inferred using physically phased genomes

Shiya Song, Elzbieta Sliwerska, Jeffrey M Kidd
doi: http://dx.doi.org/10.1101/008367

Haplotype-resolved genome sequence information is of growing interest due to its applications in both population genetics and medical genetics. Here, we assess the ability to correctly reconstruct haplotype sequences using fosmid pooled sequencing and apply the sequences to explore historical population relationships. We resolved phased haplotypes of sample NA19240, a trio child from the Yoruba HapMap collection, using pools of a total of 521,783 fosmid clones. We phased 93% of heterozygous SNPs into haplotype-resolved blocks, with an N50 size of 318kb. Using trio information from HapMap, we linked adjacent blocks together to form paternal and maternal alleles, producing near-to-complete haplotypes. Comparison with 33 individual fosmids sequenced using capillary sequencing shows that our reconstructed sequence haplotypes have a sequence error rate of 0.005%. Utilizing fosmid-phased haplotypes from a Yoruba, a European and a Gujarati sample, we analyzed population history and inferred population split times. We date the initial split between Yoruba and out-of-Africa populations to 90,000-100,000 years ago with substantial gene flow occurring until nearly 50,000 years ago, and obtain congruent results with the autosomes and the X chromosome. We estimate that the initial split between European and Gujarati populations occurred around 45,000 years ago and gene flow ended around 28,000 years ago. Analysis of X vs autosome inferred effective population sizes reveals distinct epochs in which the ratio of the effective number of males to females changes. We find a period of female bias during the ancestral human lineage up to 1 million years ago and a short period of male bias in the Yoruba lineage from 160-400 thousand years ago. We demonstrate the construction of haplotype sequences of sufficient completeness and accuracy for population genetic analysis. As experimental and analytic methods improve, these approaches will continue to shed new light on the history of populations.
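
The abstract reports an N50 of 318kb for the phased blocks. As a reminder of what that statistic measures, here is a minimal sketch using the common definition (the length L such that blocks of length at least L contain at least half of the total phased sequence). The block lengths in the usage note are invented for illustration and are not from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of block lengths.

    Blocks are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the block that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must contain at least one block")
```

With hypothetical block lengths `[2, 3, 4, 5, 6]` the total is 20, the two longest blocks sum to 11, and the function returns 5.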

Variation of nonsynonymous/synonymous rate ratios at HLA genes over time and phylogenetic context

Bárbara D Bitarello, Rodrigo dos Santos Francisco, Diogo Meyer
doi: http://dx.doi.org/10.1101/008342

Many HLA loci show an excess of nonsynonymous (dN) with respect to synonymous (dS) substitutions at codons of the antigen recognition site (ARS), a hallmark of adaptive evolution. However, it remains unclear how these changes are distributed over time and across branches of the HLA phylogeny. In particular, although HLA alleles can be assigned to functionally and phylogenetically defined groups (“lineages”), a test for differences in ω (ω = dN/dS) within and between lineages is lacking. We analysed variation of ω across divergence times and phylogenetic contexts (placement of branches in the phylogeny). We found a significant positive correlation between ω at ARS codons and divergence time, and that branches between lineages have higher ω than those within lineages. The excess of nonsynonymous changes between lineages attained significance when we used non-ARS codons to account for the fact that, even under purifying selection, ω is inflated for recently diverged alleles. Although less intensely selected, within-lineage variation at ARS codons bears evidence of selection, in the form of higher ω than those of non-ARS codons. Our results show that ω ratios of class I HLA genes vary over time, and are higher in branches connecting alleles from distinct lineages. These results suggest that although within-lineage variation bears evidence of balancing selection, the between-lineage changes have been more intensely selected. Our findings indicate the importance of considering the effect of timescale when analysing ω values over a wide spectrum of divergences, and the value of using additional markers (in our case the tightly linked non-ARS codons) to account for the temporal dynamics of ω.
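
The abstract does not say which estimator of ω the authors used, so the following is only a minimal sketch of one classical counting-style calculation: proportions of nonsynonymous and synonymous differences are each corrected for multiple hits with the Jukes-Cantor formula and then divided. The site and difference counts are assumed to be supplied by the caller, and the function names are illustrative.

```python
from math import log


def jc_distance(p):
    """Jukes-Cantor corrected distance from a proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must be in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)


def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Estimate omega = dN/dS for a pair of sequences.

    nonsyn_diffs, syn_diffs : counted nonsynonymous / synonymous differences
    nonsyn_sites, syn_sites : counted nonsynonymous / synonymous sites
    """
    d_n = jc_distance(nonsyn_diffs / nonsyn_sites)
    d_s = jc_distance(syn_diffs / syn_sites)
    if d_s == 0:
        raise ValueError("dS is zero, so omega is undefined")
    return d_n / d_s
```

When dS is small, as for recently diverged alleles, the ratio becomes noisy and can be undefined, which is one practical reason pairwise estimates need care at short divergence times.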

Dead or just asleep? Variance of microsatellite allele distributions in the human Y-chromosome.

Joe Flood
doi: http://dx.doi.org/10.1101/008227

Several different methods confirm that a number of microsatellites on the human Y-chromosome have allele distributions with different variances in different haplogroups, after adjusting for coalescent times. This can be demonstrated both through heteroscedasticity tests and by poor correlation of the variance vectors in different subclades. The most convincing demonstration, however, is the complete inactivity of some markers in certain subclades (“microsatellite death”) while they are still active in companion subclades. Many microsatellites have declined in activity as they proceed down through descendant subclades. This appears to confirm the theory of microsatellite life cycles, in which point mutations cause a steady decay in activity. However, the changes are too fast to be caused by point mutations alone, and slippage events may be implicated. The rich microsatellite terrain exposed in our large single-haplotype samples provides new opportunities for genotyping and analysis.

The impact of macroscopic epistasis on long-term evolutionary dynamics

Benjamin H. Good, Michael M. Desai
(Submitted on 18 Aug 2014)

Genetic interactions can strongly influence the fitness effects of individual mutations, yet the impact of these epistatic interactions on evolutionary dynamics remains poorly understood. Here we investigate the evolutionary role of epistasis over 50,000 generations in a well-studied laboratory evolution experiment in E. coli. The extensive duration of this experiment provides a unique window into the effects of epistasis during long-term adaptation to a constant environment. Guided by analytical results in the weak-mutation limit, we develop a computational framework to assess the compatibility of a given epistatic model with the observed patterns of fitness gain and mutation accumulation through time. We find that the average fitness trajectory alone provides little power to distinguish between competing models, including those that lack any direct epistatic interactions between mutations. However, when combined with the mutation trajectory, these observables place strong constraints on the set of possible models of epistasis, ruling out most existing explanations of the data. Instead, we find the strongest support for a “two-epoch” model of adaptation, in which an initial burst of diminishing returns epistasis is followed by a steady accumulation of mutations under a constant distribution of fitness effects. Our results highlight the need for additional DNA sequencing of these populations, as well as for more sophisticated models of epistasis that are compatible with all of the experimental data.