Genome of octoploid plant maca (Lepidium meyenii) illuminates genomic basis for high altitude adaptation in the central Andes
Jun Sheng, Wei Chen, Yang Dong, Liangsheng Zhang, Jing Zhang, Yang Tian, Liang Yan, Guanghui Zhang, Xiao Wang, Yan Zeng, Jiajin Zhang, Xiao Ma, Yuntao Tan, Ni Long, Yangzi Wang, Yujin Ma, Yu Xue, Shumei Hao, Shengchao Yang, Wen Wang
Maca (Lepidium meyenii Walp, 2n = 8x = 64), a member of the Brassicaceae family, is an Andean economic plant cultivated at 4000-4500 meters in the central sierra of Peru. Considering that the rapid uplift of the central Andes occurred 5 to 10 million years ago (Mya), an evolutionary question arises as to how plants like maca acquired high altitude adaptation within a short geological period. Here, we report the high-quality genome assembly of maca, in which two closely spaced maca-specific whole genome duplications (WGDs, ~6.7 Mya) were identified. Comparative genomics between maca and closely related Brassicaceae species revealed expansions of maca genes and gene families involved in abiotic stress response, hormone signaling pathways and secondary metabolite biosynthesis via WGDs. Retention and subsequent evolution of many duplicated genes may account for the morphological and physiological changes (e.g. small leaf shape and loss of vernalization) in maca for the high altitude environment. Additionally, some duplicated maca genes under positive selection were identified with functions in morphological adaptation (e.g. MYB59) and development (e.g. GDPD5 and HDA9). Collectively, the octoploid maca genome sheds light on the important roles of WGDs in plant high altitude adaptation in the Andes.
Too packed to change: site-specific substitution rates and side-chain packing in protein evolution
María Laura Marcos, Julian Echave
In protein evolution, due to functional and biophysical constraints, the rates of amino acid substitution differ from site to site. Among the best predictors of site-specific rates is packing density. The packing density measure that best correlates with rates is the weighted contact number (WCN), the sum of inverse square distances between the site’s Cα and the other Cαs. According to a mechanistic stress model proposed recently, rates are determined by packing because mutating packed sites stresses and destabilizes the protein’s active conformation. While WCN is a measure of Cα packing, mutations replace side chains, which prompted us to consider whether a site’s evolutionary divergence is constrained by main-chain packing or side-chain packing. To address this issue, we extended the stress theory to model side chains explicitly. The theory predicts that rates should depend solely on side-chain packing. We tested these predictions on a data set of structurally and functionally diverse monomeric enzymes. We found that, on average, side-chain contact density (WCNρ) explains 39.1% of among-sites rate variation, larger than main-chain contact density (WCNα), which explains 32.1%. More importantly, the independent contribution of WCNα is only 0.7%. Thus, as predicted by the stress theory, site-specific evolutionary rates are determined by side-chain packing.
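As an illustration of the WCN definition in the abstract above (the sum of inverse square distances from one site to every other site), here is a minimal Python sketch. The function name `weighted_contact_number` and the toy coordinates are hypothetical and not taken from the paper. The same function could be applied to Cα coordinates or to side-chain reference points; how side-chain positions are defined is not stated in the abstract, so that choice is left to the caller.

```python
def weighted_contact_number(coords):
    """Return one WCN value per site: the sum over all other sites of 1 / d**2.

    coords is a list of (x, y, z) tuples, one per site (hypothetical input format).
    """
    wcn = []
    for i, ci in enumerate(coords):
        total = 0.0
        for j, cj in enumerate(coords):
            if i == j:
                continue  # a site does not contribute to its own contact number
            squared_distance = sum((a - b) ** 2 for a, b in zip(ci, cj))
            total += 1.0 / squared_distance
        wcn.append(total)
    return wcn


# Toy example (made-up coordinates): three collinear sites spaced 2 units apart.
toy_sites = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(weighted_contact_number(toy_sites))  # [0.3125, 0.5, 0.3125]
```

The middle site has the highest value because it is closest to both neighbours, which matches the intuition that buried, densely packed sites have higher WCN.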
The rate and molecular spectrum of spontaneous mutations in the GC-rich multi-chromosome genome of Burkholderia cenocepacia
Marcus M Dillon, Way Sung, Michael Lynch, Vaughn S Cooper
Spontaneous mutations are ultimately essential for evolutionary change and are also the root cause of nearly all disease. However, until recently, both biological and technical barriers have prevented detailed analyses of mutation profiles, constraining our understanding of the mutation process to a few model organisms and leaving major gaps in our understanding of the role of genome content and structure on mutation. Here, we present a genome-wide view of the molecular mutation spectrum in Burkholderia cenocepacia, a clinically relevant pathogen with high %GC content and multiple chromosomes. We find that B. cenocepacia has low genome-wide mutation rates with insertion-deletion mutations biased towards deletions, consistent with the idea that deletion pressure reduces prokaryotic genome sizes. Unlike previously assayed organisms, B. cenocepacia exhibits a GC-mutation bias, which suggests that at least some genomes with high GC content may be driven to this point by unusual base-substitution mutation pressure. Notably, we also observed variation in both the rates and spectra of mutations among chromosomes, and a significant elevation of G:C>T:A transversions in late-replicating regions. Thus, although some patterns of mutation appear to be highly conserved across cellular life, others vary between species and even between chromosomes of the same species, potentially influencing the evolution of nucleotide composition and genome architecture.
Tissue-specific evolution of protein coding genes in human and mouse
Nadezda Kryuchkova, Marc Robinson-Rechavi
Protein-coding genes evolve at different rates, and the influence of different parameters, from gene size to expression level, has been extensively studied. While in yeast gene expression level is the major causal factor of gene evolutionary rate, the situation is more complex in animals. Here we investigate these relations further, especially taking into account gene expression in different organs as well as indirect correlations between parameters. We used RNA-seq data from two large datasets, covering 22 mouse tissues and 27 human tissues. Over all tissues, evolutionary rate only correlates weakly with levels and breadth of expression. The strongest explanatory factors of strong purifying selection are GC content, expression in many developmental stages, and expression in brain tissues. While the main component of evolutionary rate is purifying selection, we also find tissue-specific patterns for sites under neutral evolution and for positive selection. We observe fast evolution of genes expressed in testis, but also in other tissues, notably liver, which is explained by weak purifying selection rather than by positive selection.
GC-content evolution in bacterial genomes: the biased gene conversion hypothesis expands
Florent Lassalle, Séverine Périan, Thomas Bataillon, Xavier Nesme, Laurent Duret, Vincent Daubin
The characterization of functional elements in genomes relies on the identification of the footprints of natural selection. In this quest, taking into account neutral evolutionary processes such as mutation and genetic drift is crucial because these forces can generate patterns that may obscure or mimic signatures of selection. In mammals, and probably in many eukaryotes, another such confounding factor called GC-Biased Gene Conversion (gBGC) has been documented. This mechanism generates patterns identical to what is expected under selection for higher GC-content, specifically in highly recombining genomic regions. Recent results have suggested that a mysterious selective force favouring higher GC-content exists in Bacteria but the possibility that it could be gBGC has been excluded. Here, we show that gBGC is probably at work in most if not all bacterial species. First we find a consistent positive relationship between the GC-content of a gene and evidence of intra-genic recombination throughout a broad spectrum of bacterial clades. Second, we show that the evolutionary force responsible for this pattern is acting independently from selection on codon usage, and could potentially interfere with selection in favor of optimal AU-ending codons. A comparison with data from human populations shows that the intensity of gBGC in Bacteria is comparable to what has been reported in mammals. We propose that gBGC is not restricted to sexual Eukaryotes but also widespread among Bacteria and could therefore be an ancestral feature of cellular organisms. We argue that if gBGC occurs in bacteria, it can account for previously unexplained observations, such as the apparent non-equilibrium of base substitution patterns and the heterogeneity of gene composition within bacterial genomes. Because gBGC produces patterns similar to positive selection, it is essential to take this process into account when studying the evolutionary forces at work in bacterial genomes.
Tackling drug resistant infection outbreaks of global pandemic Escherichia coli ST131 using evolutionary and epidemiological genomics
(Submitted on 4 Nov 2014)
High-throughput molecular approaches are required to investigate the origin and diffusion of antimicrobial resistance in rapidly radiating pathogen outbreaks. The most frequent cause of human infection is Escherichia coli, which is dominated by ST131, a single pandemic clone. This epidemic subtype possesses an extensive array of virulence elements and tolerates many drugs. Frequent global sweeps of new dominant ST131 varieties necessitate deep genomic scrutiny of their spread, evolution and lateral transfer of drug resistance genes. Phylogenetic methods that decipher past events can predict future patterns of virulence and transmission based on genetic signatures of adaptation and recombination. Antibiotic tolerance is controlled by natural variation in gene expression levels, which can initiate delayed cell growth. This dormancy allows survival despite drug exposure, and yet may only be present in part of the infecting cell population. Consequently, genomic epidemiology needs to explore the scale of phenotypic regulatory control acting on RNA. A multi-faceted approach can comprehensively assess antimicrobial resistance in E. coli ST131 in terms of within-host genetic heterogeneity, regulation of gene expression, and transmission dynamics between hosts to achieve a goal of pre-empting resistance before it emerges by optimising drug treatment protocols.
Introns structure patterns of variation in nucleotide composition in Arabidopsis thaliana and rice protein-coding genes
Adrienne Ressayre, Sylvain Glemin, Pierre Montalent, Laurana Serres-Giardi, Christine Dillmann, Johann Joets
Plant genomes are large, intron-rich and present a wide range of variation in coding region G+C content. Concerning coding regions, a sort of syndrome can be described in plants: the increase in G+C content is associated with both the increase in heterogeneity among genes within a genome and the increase in variation across genes. Taking advantage of the large number of genes composing plant genomes and the wide range of variation in gene intron number, we performed a comprehensive survey of the patterns of variation in G+C content at different scales, from the nucleotide level to the genome scale, in two species, Arabidopsis thaliana and Oryza sativa, comparing the patterns in genes with different intron numbers. In both species, we observed a pervasive effect of gene intron number and location along genes on G+C content, codon and amino acid frequencies, suggesting that in both species introns have a barrier effect structuring G+C content along genes. In external gene regions (located upstream of the first or downstream of the last intron), species-specific factors shape G+C content, while in internal gene regions (surrounded by introns), G+C content is constrained to remain within a range common to both species. In rice, introns appear as a major determinant of gene G+C content, while in A. thaliana introns have a weaker but significant effect. The structuring effect of introns in both species may explain the G+C content syndrome observed in plants.
Similar efficacies of selection shape mitochondrial and nuclear genes in Drosophila melanogaster and Homo sapiens
Brandon S. Cooper, Chad Burrus, Chao Ji, Matthew W. Hahn, Kristi L. Montooth
Deleterious mutations contribute to polymorphism even when selection effectively prevents their fixation. The efficacy of selection in removing deleterious mitochondrial mutations from populations depends on the effective population size (Ne) of the mtDNA, and the degree to which a lack of recombination magnifies the effects of linked selection. Using complete mitochondrial genomes from Drosophila melanogaster and nuclear data available from the same samples, we re-examine the hypothesis that non-recombining animal mtDNA harbors an excess of deleterious polymorphisms relative to the nuclear genome. We find no evidence of recombination in the mitochondrial genome, and the much-reduced level of mitochondrial synonymous polymorphism relative to nuclear genes is consistent with a reduction in Ne. Nevertheless, we find that the neutrality index (NI), a measure of the excess of nonsynonymous polymorphism relative to the neutral expectation, is not significantly different between mitochondrial and nuclear loci. Reanalysis of published data from Homo sapiens reveals the same lack of a difference between the two genomes, though small samples in previous studies had suggested a strong difference in both species. Thus, despite a smaller Ne, mitochondrial loci of both flies and humans appear to experience similar efficacies of selection as do loci in the recombining nuclear genome.
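For readers unfamiliar with the neutrality index mentioned above, the following minimal Python sketch uses its standard definition, NI = (Pn/Ps) / (Dn/Ds), where Pn and Ps are counts of nonsynonymous and synonymous polymorphisms and Dn and Ds are counts of nonsynonymous and synonymous fixed differences. The function name and the counts in the example are hypothetical and are not data from the study. How the authors summarized NI across loci is not described in the abstract.

```python
def neutrality_index(pn, ps, dn, ds):
    """Return NI = (Pn / Ps) / (Dn / Ds), computed as (Pn * Ds) / (Ps * Dn).

    NI > 1 suggests an excess of nonsynonymous polymorphism;
    NI < 1 suggests an excess of nonsynonymous divergence.
    """
    if ps == 0 or dn == 0:
        raise ValueError("NI is undefined when Ps or Dn is zero")
    return (pn * ds) / (ps * dn)


# Made-up counts for a single locus, for illustration only.
print(neutrality_index(pn=10, ps=20, dn=5, ds=20))  # 2.0
```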
Recent evolution of the mutation rate and spectrum in Europeans
As humans dispersed out of Africa, they adapted to new environmental challenges including changes in exposure to mutagenic solar radiation. This raises the possibility that different populations experienced different selective pressures affecting genome integrity. Prior work has uncovered divergent selection in tropical versus temperate latitudes on eQTLs that regulate the DNA damage response, as well as evidence that the human mutation rate per year has changed at least 2-fold since we shared a common ancestor with chimpanzees. Here, I present evidence that the rate of a particular mutation type has recently increased in the European lineage, rising in frequency by 50% during the 30,000–50,000 years since Europeans diverged from Asians. A comparison of single nucleotide polymorphisms (SNPs) private to Africa, Asia, and Europe in the 1000 Genomes data reveals that private European variation is enriched for the transition 5’-TCC-3’→5’-TTC-3’. Although it is not clear whether UV played a causal role in changing the European mutational spectrum, 5’-TCC-3’→5’-TTC-3’ is known to be the most common somatic mutation present in melanoma skin cancers, as well as the mutation most frequently induced in vitro by UV. Regardless of its causality, this change indicates that DNA replication fidelity has not remained stable even since the origin of modern humans and might have changed numerous times during our recent evolutionary history.
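The comparison described above rests on classifying each SNP by its trinucleotide context. The abstract does not describe the actual pipeline, so the Python sketch below only shows one plausible way to count a mutation type such as TCC→TTC. It assumes each SNP is given as a (three-base reference context, alternate middle base) pair, and it folds each SNP onto the strand where the reference base is a pyrimidine so that GGA→GAA is counted together with TCC→TTC. All names and the toy input are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical(context, alt):
    """Fold a SNP onto the strand where the reference (middle) base is C or T."""
    if context[1] in "AG":
        return reverse_complement(context), COMPLEMENT[alt]
    return context, alt


def fraction_of_type(snps, context="TCC", alt="T"):
    """Return the fraction of SNPs matching the given context and alternate base."""
    if not snps:
        return 0.0
    hits = sum(1 for snp in snps if canonical(*snp) == (context, alt))
    return hits / len(snps)


# Made-up SNPs: the first two are TCC->TTC (the second seen on the opposite strand).
toy_snps = [("TCC", "T"), ("GGA", "A"), ("ACG", "T"), ("TTC", "C")]
print(fraction_of_type(toy_snps))  # 0.5
```

Computing this fraction separately for SNPs private to each population would give the kind of per-population comparison the abstract refers to.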
Different tastes for different individuals
Individual taste differences were first reported in the first half of the 20th century, but the primary reasons for these differences have remained uncertain. Much of the taste variation among different mammalian species can be explained by pseudogenization of taste receptors. In this study, by analyzing 14 ethnically diverse populations, we investigated whether the most recent disruptions of taste receptor genes segregate with their intact forms. Our results revealed an unprecedented prevalence of segregating loss-of-function (LoF) taste receptor variants, identifying one of the most pronounced cases of functional population diversity in the human genome. LoF variant frequency was considerably higher than the overall mutation rate, and many humans harbored varying numbers of critical mutations. In particular, molecular evolutionary rates of sour and bitter receptors were far higher in humans than those of sweet, salty, and umami receptors compared with other carnivorous mammals, although not all of the taste receptor genes were identified. Many LoF variants are population-specific, some of which arose even after population differentiation, but not before the divergence of modern and archaic (Neanderthal and Denisovan) humans. Based on these findings, we conclude that modern humans might have been losing their taste receptor genes because of high-frequency LoF taste receptor variants. Finally, I demonstrated genetic testing of taste receptors from a personal exome sequence.