Diversity of Mycobacterium tuberculosis across evolutionary scales
Mary B O’Neill, Tatum D Mortimer, Caitlin S Pepperell
Tuberculosis (TB) is a global public health emergency. Increasingly drug-resistant strains of Mycobacterium tuberculosis (M.tb) continue to emerge and spread, highlighting the adaptability of this pathogen. Most studies of M.tb evolution have relied on ‘between-host’ samples, in which each person with TB is represented by a single M.tb isolate. However, individuals with TB commonly harbor populations of M.tb numbering in the billions. Here, we use analyses of M.tb diversity found within and between hosts to gain insight into the adaptation of this pathogen. We find that the amount of M.tb genetic diversity harbored by individuals with TB is similar to that of global between-host surveys of TB patients. This suggests that M.tb genetic diversity is generated within hosts and then lost as the infection is transmitted. In examining genomic data from M.tb samples within and between hosts with TB, we find that genes involved in the regulation, synthesis, and transport of immunomodulatory cell envelope lipids appear repeatedly in the extremes of various statistical measures of diversity. Polyketide synthase and Mycobacterial membrane protein Large (mmpL) genes are particularly notable in this regard. In addition, we observe identical mutations emerging across samples from different TB patients. Taken together, our observations suggest that M.tb cell envelope lipids are targets of selection within hosts. These lipids are specific to pathogenic mycobacteria and, in some cases, human-pathogenic mycobacteria. We speculate that rapid adaptation of cell envelope lipids is facilitated by functional redundancy, flexibility in their metabolism, and their roles mediating interactions with the host.
RNA-guided gene drives can efficiently bias inheritance in wild yeast
James E DiCarlo, Alejandro Chavez, Sven L Dietz, Kevin M Esvelt, George M Church
Inheritance-biasing elements known as “gene drives” may be capable of spreading genomic alterations made in laboratory organisms through wild populations. We previously considered the potential for RNA-guided gene drives based on the versatile CRISPR/Cas9 genome editing system to serve as a general method of altering populations. Here we report molecularly contained gene drive constructs in the yeast Saccharomyces cerevisiae that are typically copied at rates above 99% when mated to wild yeast. We successfully targeted both non-essential and essential genes and showed that the inheritance of an unrelated “cargo” gene could be biased by an adjacent drive. Our results demonstrate that RNA-guided gene drives are capable of efficiently biasing inheritance when mated to wild-type organisms over successive generations.
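To give a feel for what a copy rate above 99% implies, the following is a minimal sketch of a textbook-style allele-frequency recursion. It is not the authors' experimental design: it assumes an idealized, infinitely large, randomly mating population and a drive with no fitness cost, and the function names and starting frequency are hypothetical.

```python
def next_drive_frequency(q: float, copy_rate: float) -> float:
    """Drive allele frequency after one generation of random mating.

    Toy assumptions: infinite population, no fitness cost of the drive.
    copy_rate is the fraction of heterozygotes in which the drive
    copies itself onto the wild-type chromosome.
    """
    heterozygotes = 2.0 * q * (1.0 - q)
    # Heterozygotes pass on the drive with probability (1 + copy_rate) / 2
    # rather than the Mendelian 1/2; drive homozygotes always pass it on.
    return q * q + heterozygotes * (1.0 + copy_rate) / 2.0


def simulate(q0: float, copy_rate: float, generations: int) -> list[float]:
    """Iterate the recursion and return the frequency trajectory."""
    freqs = [q0]
    for _ in range(generations):
        freqs.append(next_drive_frequency(freqs[-1], copy_rate))
    return freqs


# Start both alleles at 1% and compare Mendelian inheritance with a 99% drive.
mendelian = simulate(0.01, copy_rate=0.0, generations=10)
drive = simulate(0.01, copy_rate=0.99, generations=10)

for generation, (m, d) in enumerate(zip(mendelian, drive)):
    print(f"gen {generation:2d}  mendelian={m:.4f}  drive={d:.4f}")
```

Under these assumptions the Mendelian allele stays at its starting frequency, while the drive allele nearly doubles each generation when rare and approaches fixation within about ten generations.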
Dissecting phylogenetic signal and accounting for bias in whole-genome data sets: a case study of the Metazoa
Marek L Borowiec, Ernest K Lee, Joanna C Chiu, David C Plachetzki
Transcriptome-enabled phylogenetic analyses have dramatically improved our understanding of metazoan phylogeny in recent years, although several important questions remain. The branching order near the base of the tree is one such outstanding issue. To address this question we assembled a novel data set comprising 1,080 orthologous loci derived from 36 publicly available genomes and dissected the phylogenetic signal present in each individual partition. The size of this data set allows for a closer look at the potential biases and sources of non-phylogenetic signal. We assessed a range of measures for each data partition including information content, saturation, rate of evolution, long-branch score, and taxon occupancy and explored how each of these characteristics impacts phylogeny estimation. We then used these data to prepare a reduced set of partitions that fit an optimal set of criteria and are amenable to the most appropriate and computationally intensive analyses using site-heterogeneous models of sequence evolution. We also employed several strategies to examine the potential for long-branch attraction to bias our inferences. All of our analyses support Ctenophora as the sister lineage to other Metazoa, although support for this relationship varies among analyses. We find no support for the traditional view uniting the ctenophores and Cnidaria (jellies, anemones, corals, and kin). We also examine phylogenetic placement of myriapods (centipedes and millipedes) and find it more sensitive to the type of analysis and data used. Our study provides a workflow for minimizing systematic bias in whole genome-based phylogenetic analyses.
An annotated consensus genetic map for Pinus taeda L. and extent of linkage disequilibrium in three genotype-phenotype discovery populations
Jared W. Westbrook, Vikram E. Chhatre, Le-Shin Wu, Srikar Chamala, Leandro Gomide Neves, Patricio Muñoz, Pedro J Martínez-García, David B. Neale, Matias Kirst, Keithanne Mockaitis, C. Dana Nelson, Gary F. Peter, John M. Davis, Craig S. Echt
A consensus genetic map for Pinus taeda (loblolly pine) was constructed by merging three previously published maps with a map from a pseudo-backcross between P. taeda and P. elliottii (slash pine). The consensus map positioned 4981 markers via genotyping of 1251 individuals from four pedigrees. It is the densest linkage map for a conifer to date. Average marker spacing was 0.48 centiMorgans and total map length was 2372 centiMorgans. Functional predictions for 4762 markers for expressed sequence tags were improved by alignment to full-length P. taeda transcripts. Alignments to the P. taeda genome mapped 4225 scaffold sequences onto linkage groups. The consensus genetic map was used to compare the extent of genome-wide linkage disequilibrium in an association population of distantly related P. taeda individuals (ADEPT2), a multiple-family pedigree used for genomic selection studies (CCLONES), and a full-sib quantitative trait locus mapping population (BC1). Weak linkage disequilibrium was observed in CCLONES and ADEPT2. The average squared correlation, R², between genotypes at SNPs less than one centiMorgan apart was less than 0.05 in both populations, and R² did not decay substantially with genetic distance. By contrast, strong and extended linkage disequilibrium was observed among BC1 full-sibs, where average R² decayed from 0.8 to less than 0.1 over 53 centiMorgans. The consensus map and analysis of linkage disequilibrium establish a foundation for comparative association and quantitative trait locus mapping between genotype-phenotype discovery populations.
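The two summary quantities in this abstract can be illustrated with a short sketch. It assumes that average spacing is simply total map length divided by marker count, and that R² is the squared Pearson correlation between genotypes coded as 0, 1, or 2 copies of an allele; the authors' exact procedures may differ, and the genotype vectors below are made-up illustration data.

```python
import math

TOTAL_LENGTH_CM = 2372  # total map length reported in the abstract
N_MARKERS = 4981        # number of positioned markers reported in the abstract

# Simple ratio of length to marker count (an assumed definition of spacing).
average_spacing_cm = TOTAL_LENGTH_CM / N_MARKERS
print(f"average spacing: {average_spacing_cm:.2f} cM")


def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype vectors.

    Genotypes are assumed to be coded as allele counts (0, 1, 2).
    Returns NaN if either marker is monomorphic in the sample.
    """
    n = len(g1)
    mean1 = sum(g1) / n
    mean2 = sum(g2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(g1, g2))
    var1 = sum((a - mean1) ** 2 for a in g1)
    var2 = sum((b - mean2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return math.nan
    return cov * cov / (var1 * var2)


# Hypothetical genotypes at three SNPs in four individuals.
snp_a = [0, 0, 2, 2]
snp_b = [0, 0, 2, 2]  # perfectly correlated with snp_a
snp_c = [0, 2, 0, 2]  # uncorrelated with snp_a

print("R2(a, b) =", r_squared(snp_a, snp_b))
print("R2(a, c) =", r_squared(snp_a, snp_c))
```

Averaging such pairwise values within bins of genetic distance gives the kind of decay curve the abstract summarizes for the three populations.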
Bet-hedging, seasons and the evolution of behavioral diversity in Drosophila
Jamey Kain, Sarah Zhang, Mason Klein, Aravinthan Samuel, Benjamin de Bivort
Organisms use various strategies to cope with fluctuating environmental conditions. In diversified bet-hedging, a single genotype exhibits phenotypic heterogeneity with the expectation that some individuals will survive transient selective pressures. To date, empirical evidence for bet-hedging is scarce. Here, we observe that individual Drosophila melanogaster flies exhibit striking variation in light- and temperature-preference behaviors. With a modeling approach that combines real-world weather and climate data to simulate temperature preference-dependent survival and reproduction, we find that a bet-hedging strategy may underlie the observed inter-individual behavioral diversity. Specifically, bet-hedging outcompetes strategies in which individual thermal preferences are heritable. Animals employing bet-hedging refrain from adapting to the coolness of spring with increased warm-seeking that inevitably becomes counterproductive in the hot summer. This strategy is particularly valuable when mean seasonal temperatures are typical, or when there is considerable fluctuation in temperature within the season. The model predicts, and we experimentally verify, that the behaviors of individual flies are not heritable. Finally, we model the effects of historical weather data, climate change, and geographic seasonal variation on the optimal strategies underlying behavioral variation between individuals, characterizing the regimes in which bet-hedging is advantageous.
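The general logic of diversified bet-hedging can be shown with a deliberately simple calculation. This is not the authors' weather-driven model: it assumes only two alternating seasons, two behavioral phenotypes, and made-up fitness values, and it compares long-run (geometric mean) growth of a genotype that always produces warm-seeking offspring with one that produces a fixed mix.

```python
import math

# Hypothetical per-season multiplicative fitness values (assumptions).
W_MATCH = 2.0     # phenotype suits the season
W_MISMATCH = 0.1  # phenotype does not suit the season


def mean_log_growth(fraction_warm_seeking, seasons):
    """Average log growth rate per season for one genotype.

    The genotype produces warm-seeking offspring with the given fraction;
    warm-seeking is assumed to be favored in cool seasons and disfavored
    in hot seasons.
    """
    total = 0.0
    for season in seasons:
        if season == "cool":
            matched = fraction_warm_seeking
        else:
            matched = 1.0 - fraction_warm_seeking
        growth = matched * W_MATCH + (1.0 - matched) * W_MISMATCH
        total += math.log(growth)
    return total / len(seasons)


seasons = ["cool", "hot"] * 10  # strictly alternating seasons (toy assumption)

specialist = mean_log_growth(1.0, seasons)  # all offspring warm-seeking
hedger = mean_log_growth(0.5, seasons)      # half warm-seeking, half not

print(f"specialist mean log growth: {specialist:.3f}")
print(f"bet-hedger mean log growth: {hedger:.3f}")
```

With these numbers the specialist has a negative long-run growth rate because one bad season cancels several good ones, whereas the mixed strategy grows slowly but steadily. The abstract's comparison against heritable preferences involves additional dynamics that this sketch does not capture.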
STACEY: species delimitation and phylogeny estimation under the multispecies coalescent
Graham R Jones
This article describes a new package called STACEY for BEAST2 which is capable of both species delimitation and species tree estimation using DNA sequences from multiple loci. The focus in this article is on species delimitation. STACEY is based on the multispecies coalescent model, and builds on earlier software (DISSECT), which uses a ‘birth-death-collapse’ prior to deal with delimitations without the need for reversible-jump Markov chain Monte Carlo moves. Like DISSECT, it requires no a priori assignment of individuals to species or populations, and no guide tree. This paper introduces two innovations. The first is a new model for the populations along the branches of the species tree, and the second is a new MCMC move for exploring the posterior when the multispecies coalescent model is assumed. The main benefit of STACEY over DISSECT is much better convergence. Current practice, using a pipeline approach to species delimitation under the multispecies coalescent, has been shown to have major problems on simulated data. The same simulated data set is used to demonstrate the accuracy and efficiency of STACEY.
The role of standing variation in geographic convergent adaptation
Peter L. Ralph, Graham Coop
The extent to which populations experiencing shared selective pressures adapt through a shared genetic response is relevant to many questions in evolutionary biology. In a number of well-studied traits and species, it appears that convergent evolution within species is common. In this paper, we explore how standing, deleterious genetic variation contributes to convergent genetic responses in a geographically spread population, extending our previous work on the topic. Geographically limited dispersal slows the spread of each selected allele, allowing other alleles (newly arisen mutants or alleles present as standing variation) to spread before any one comes to dominate the population. When such alleles meet, their progress is substantially slowed: if the alleles are selectively equivalent, they mix slowly, dividing the species range into a random tessellation, which can be well understood by analogy to a Poisson process model of crystallization. In this framework, we derive the geographic scale over which a typical allele is expected to dominate, the time it takes the species to adapt as a whole, and the proportion of adaptive alleles that arise from standing variation. Finally, we explore how negative pleiotropic effects of alleles before an environmental change can bias the subset of alleles that contribute to a species' adaptive response. We apply the results to the many geographically localized G6PD deficiency alleles thought to confer resistance to malaria, whose large mutational target size and deleterious effects make them likely candidates to have been present as deleterious standing variation. We find that the numbers and geographic spread of these alleles match our predictions reasonably well, which suggests that they arose both from standing variation and from new mutations since the advent of malaria. Our results suggest that much of adaptation may be geographically local even when selection pressures are widespread. We close by discussing the implications of these results for arguments about species coherence and the nature of divergence between species.
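The crystallization analogy can be sketched in a few lines. The code below is a generic illustration of that kind of model and is not the authors' derivation: it assumes a unit-square range, new alleles arising as a Poisson process in space and time, and each allele spreading outward at the same constant speed, so that every location is claimed by whichever allele reaches it first. The rate, speed, grid size, and function names are all hypothetical; standing variation could be represented by adding origins at time zero.

```python
import math
import random


def sample_origins(rate, t_max, rng):
    """Space-time Poisson process on the unit square.

    Returns a list of (time, x, y) origins, with exponential waiting
    times between events and uniformly random locations.
    """
    origins = []
    t = rng.expovariate(rate)
    while t < t_max:
        origins.append((t, rng.random(), rng.random()))
        t += rng.expovariate(rate)
    return origins


def tessellate(origins, speed, grid=50):
    """Assign each grid cell to the allele that arrives there first."""
    if not origins:
        raise ValueError("no alleles arose before t_max")
    owners = []
    for i in range(grid):
        row = []
        for j in range(grid):
            x, y = (i + 0.5) / grid, (j + 0.5) / grid
            # Arrival time = origin time + travel time at constant speed.
            arrival = [t + math.hypot(x - ox, y - oy) / speed
                       for t, ox, oy in origins]
            row.append(min(range(len(origins)), key=arrival.__getitem__))
        owners.append(row)
    return owners


rng = random.Random(1)  # fixed seed so the sketch is reproducible
origins = sample_origins(rate=20.0, t_max=1.0, rng=rng)
owners = tessellate(origins, speed=0.5)

surviving = {owner for row in owners for owner in row}
print(f"{len(origins)} alleles arose; {len(surviving)} hold territory")
```

Alleles that arise inside territory already claimed by an earlier wave never win any cell under this rule, so the number of alleles holding territory is typically smaller than the number that arose.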