Natural variation in teosinte at the domestication locus teosinte branched1 (tb1)

Laura Vann, Thomas Kono, Tanja Pyhäjärvi, Matthew B Hufford, Jeffrey Ross-Ibarra

Premise of the study: The teosinte branched1 (tb1) gene is a major QTL controlling branching differences between maize and its wild progenitor, teosinte. The insertion of a transposable element (Hopscotch) upstream of tb1 is known to enhance the gene’s expression, causing reduced tillering in maize. Observations of the maize tb1 allele in teosinte and estimates of an insertion age of the Hopscotch that predates domestication led us to investigate its prevalence and potential role in teosinte. Methods: Prevalence of the Hopscotch element was assessed across an Americas-wide sample of 1110 maize and teosinte individuals using a co-dominant PCR assay. Population genetic summaries were calculated for a subset of individuals from four teosinte populations in central Mexico. Phenotypic data were also collected from a single teosinte population where Hopscotch was found segregating. Key results: Genotyping results suggest the Hopscotch element is at higher than expected frequency in teosinte. Analysis of linkage disequilibrium near tb1 does not support recent introgression of the Hopscotch allele from maize into teosinte. Population genetic signatures are consistent with selection on this locus revealing a potential ecological role for Hopscotch in teosinte. Finally, two greenhouse experiments with teosinte do not suggest tb1 controls tillering in natural populations. Conclusions: Our findings suggest the role of Hopscotch differs between maize and teosinte. Future work should assess tb1 expression levels in teosinte with and without the Hopscotch and more comprehensively phenotype teosinte to assess the ecological significance of the Hopscotch insertion and, more broadly, the tb1 locus in teosinte. Key words: domestication; maize; teosinte; teosinte branched1; transposable element
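Because the PCR assay is co-dominant, heterozygotes can be distinguished from both homozygous classes, so the frequency of the Hopscotch allele follows directly from genotype counts. The following is a minimal Python sketch of that arithmetic, with Hardy-Weinberg expectations for comparison. The function names and the genotype counts are hypothetical illustrations and are not taken from the study.

```python
def allele_frequency(n_ins_hom, n_het, n_abs_hom):
    """Frequency of the insertion allele from co-dominant genotype counts.

    n_ins_hom: individuals homozygous for the insertion
    n_het:     heterozygous individuals
    n_abs_hom: individuals homozygous for the absence of the insertion
    """
    n_individuals = n_ins_hom + n_het + n_abs_hom
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    # Each diploid individual carries two alleles.
    return (2 * n_ins_hom + n_het) / (2 * n_individuals)


def hardy_weinberg_expected(p, n_individuals):
    """Expected genotype counts (ins/ins, ins/abs, abs/abs) under Hardy-Weinberg."""
    q = 1.0 - p
    return (p * p * n_individuals, 2 * p * q * n_individuals, q * q * n_individuals)


# Hypothetical counts for one population (illustration only).
counts = (10, 20, 70)
p = allele_frequency(*counts)
expected = hardy_weinberg_expected(p, sum(counts))
print(f"insertion allele frequency: {p:.3f}")
print("expected genotype counts:", [round(x, 1) for x in expected])
```

Comparing the observed counts with the expected ones gives a quick check for departures from random mating in a sample before any further population genetic analysis.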

Reconstructing Austronesian population history in Island Southeast Asia

Mark Lipson, Po-Ru Loh, Nick Patterson, Priya Moorjani, Ying-Chin Ko, Mark Stoneking, Bonnie Berger, David Reich

Austronesian languages are spread across half the globe, from Easter Island to Madagascar. Evidence from linguistics and archaeology indicates that the “Austronesian expansion,” which began 4-5 thousand years ago, likely had roots in Taiwan, but the ancestry of present-day Austronesian-speaking populations remains controversial. Here, focusing primarily on Island Southeast Asia, we analyze genome-wide data from 56 populations using new methods for tracing ancestral gene flow. We show that all sampled Austronesian groups harbor ancestry that is more closely related to aboriginal Taiwanese than to any present-day mainland population. Surprisingly, western Island Southeast Asian populations have also inherited ancestry from a source nested within the variation of present-day populations speaking Austro-Asiatic languages, which have historically been nearly exclusive to the mainland. Thus, either there was once a substantial Austro-Asiatic presence in Island Southeast Asia, or Austronesian speakers migrated to and through the mainland, admixing there before continuing to western Indonesia.

Genomic variation in a widespread Neotropical bird (Xenops minutus) reveals divergence, population expansion, and gene flow

Michael G. Harvey, Robb T. Brumfield
(Submitted on 26 May 2014)

Elucidating the demographic and phylogeographic histories of species provides insight into the processes responsible for generating biological diversity, and genomic datasets are now permitting the estimation of histories and demographic parameters with unprecedented accuracy. We used a genomic single nucleotide polymorphism (SNP) dataset generated using a RAD-Seq method to investigate the historical demography and phylogeography of a widespread lowland Neotropical bird (Xenops minutus). As expected, we found that prominent landscape features that act as dispersal barriers, such as Amazonian rivers and the Andes Mountains, are associated with the deepest phylogeographic breaks, and also that isolation by distance is limited in areas between these barriers. In addition, we inferred positive population growth for most populations and detected evidence of historical gene flow between populations that are now physically isolated. Even with genomic estimates of historical demographic parameters, we found the prominent diversification hypotheses to be untestable. We conclude that investigations into the multifarious processes shaping species histories, aided by genomic datasets, will provide greater resolution of diversification in the Neotropics, but that future efforts should focus on understanding the processes shaping the histories of lineages rather than trying to reconcile these histories with landscape and climatic events in Earth history.

Human genomic regions with exceptionally high or low levels of population differentiation identified from 911 whole-genome sequences

Vincenza Colonna, Qasim Ayub, Yuan Chen, Luca Pagani, Pierre Luisi, Marc Pybus, Erik Garrison, Yali Xue, Chris Tyler-Smith

Background: Population differentiation has proved to be effective for identifying loci under geographically-localized positive selection, and has the potential to identify loci subject to balancing selection. We have previously investigated the pattern of genetic differentiation among human populations at 36.8 million genomic variants to identify sites in the genome showing high frequency differences. Here, we extend this dataset to include additional variants, survey sites with low levels of differentiation, and evaluate the extent to which highly differentiated sites are likely to result from selective or other processes. Results: We demonstrate that while sites of low differentiation represent sampling effects rather than balancing selection, sites showing extremely high population differentiation are enriched for positive selection events and that one half may be the result of classic selective sweeps. Among these, we rediscover known examples, where we actually identify the established functional SNP, and discover novel examples including the genes ABCA12, CALD1 and ZNF804, which we speculate may be linked to adaptations in skin, calcium metabolism and defense, respectively. Conclusions: We have identified known and many novel candidate regions for geographically restricted positive selection, and suggest several directions for further research.
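The abstract does not name the statistic used to measure differentiation. As one common choice, the sketch below computes Hudson's FST estimator for a single biallelic site from the allele frequencies and sample sizes of two populations. This is an assumption made for illustration, not the authors' pipeline; the function name and the example values are hypothetical.

```python
import math


def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST estimator for one biallelic site.

    p1, p2: sample frequencies of the same allele in populations 1 and 2
    n1, n2: number of sampled chromosomes in each population (each > 1)
    Returns NaN when the site is fixed for the same allele in both samples.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two sampled chromosomes per population")
    # Squared frequency difference, corrected for sampling noise in each population.
    numerator = (
        (p1 - p2) ** 2
        - p1 * (1.0 - p1) / (n1 - 1)
        - p2 * (1.0 - p2) / (n2 - 1)
    )
    # Probability that two alleles drawn from different populations differ.
    denominator = p1 * (1.0 - p2) + p2 * (1.0 - p1)
    if denominator == 0.0:
        return math.nan
    return numerator / denominator


# Hypothetical sites (illustration only).
print(hudson_fst(1.0, 0.0, 10, 10))   # alternative alleles fixed in each population
print(hudson_fst(0.5, 0.5, 10, 10))   # identical frequencies in both populations
```

Sites with identical frequencies in both samples give slightly negative values because of the sampling correction; in practice such estimates are usually averaged over many sites as a ratio of summed numerators to summed denominators.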

The distribution of deleterious genetic variation in human populations

Kirk E Lohmueller

Population genetic studies suggest that most amino-acid changing mutations are deleterious. Such mutations are of tremendous interest in human population genetics as they are important for the evolutionary process and may contribute risk to common disease. Genomic studies over the past 5 years have documented differences across populations in the number of heterozygous deleterious genotypes, numbers of homozygous derived deleterious genotypes, number of deleterious segregating sites and proportion of sites that are potentially deleterious. These differences have been attributed to population history affecting the ability of natural selection to remove deleterious variants from the population. However, recent studies have suggested that the genetic load may not differ across populations, and that the efficacy of natural selection has not differed across human populations. Here I show that these observations are not incompatible with each other and that the apparent differences are due to examining different features of the genetic data and differing definitions of terms.

Quantifying evolutionary dynamics of the basic genome of E. coli

Purushottam Dixit, Tin Yau Pang, F. William Studier, Sergei Maslov
(Submitted on 11 May 2014)

The ~4-Mbp basic genome shared by 32 independent isolates of E. coli representing considerable population diversity has been approximated by whole-genome multiple-alignment and computational filtering designed to remove mobile elements and highly variable regions. Single nucleotide polymorphisms (SNPs) in the 496 basic-genome pairs are identified and clonally inherited stretches are distinguished from those acquired by horizontal transfer (HT) by sharp discontinuities in SNP density. The six least diverged genome-pairs each have only one or two HT stretches, each occupying 42-115 kbp of basic genome and containing at least one gene cluster known to confer selective advantage. At higher divergences, the typical mosaic pattern of interspersed clonal and HT stretches across the entire basic genome is observed, including likely fragmented integrations across a restriction barrier. A simple model suggests that individual HT events are of the order of 10 kbp and are the chief contributor to genome divergence, bringing in almost 12 times more SNPs than point mutations. As a result of continuing horizontal transfer of such large segments, 400 out of the 496 strain-pairs beyond genomic divergence of share virtually no genomic material with their common ancestor. We conclude that the active and continuing horizontal transfer of moderately large genomic fragments is likely to be mediated primarily by a co-evolving population of phages that distribute random genome fragments throughout the population by generalized transduction, allowing efficient adaptation to environmental changes.
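The figure of 496 genome pairs is the number of unordered pairs that can be formed from 32 isolates. A short Python check of that count, and of the share of pairs represented by the 400 mentioned in the abstract, is sketched below; the variable names are illustrative only.

```python
from itertools import combinations
from math import comb

n_isolates = 32

# Enumerate every unordered pair of isolates (i, j) with i < j.
pairs = list(combinations(range(n_isolates), 2))
n_pairs = len(pairs)

# The closed form n * (n - 1) / 2 gives the same count.
assert n_pairs == comb(n_isolates, 2)

# Share of all pairs accounted for by the 400 highly diverged strain-pairs.
fraction_diverged = 400 / n_pairs

print(n_pairs)
print(f"{fraction_diverged:.1%}")
```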

Background selection as baseline for nucleotide variation across the Drosophila genome

Josep M Comeron

The constant removal of deleterious mutations by natural selection causes a reduction in neutral diversity and efficacy of selection at genetically linked sites (a process called Background Selection, BGS). Population genetic studies, however, often ignore BGS effects when investigating demographic events or the presence of other types of selection. To obtain a more realistic evolutionary expectation that incorporates the unavoidable consequences of deleterious mutations, we generated high-resolution landscapes of variation across the Drosophila melanogaster genome under a BGS scenario independent of polymorphism data. We find that BGS plays a significant role in shaping levels of variation across the entire genome, including long introns and intergenic regions distant from annotated genes. We also find that a very large percentage of the observed variation in diversity across autosomes can be explained by BGS alone, up to 70% across individual chromosome arms, thus indicating that BGS predictions can be used as baseline to infer additional types of selection and demographic events. This approach allows detecting several outlier regions with signal of recent adaptive events and selective sweeps. The use of a BGS baseline, however, is particularly appropriate to investigate the presence of balancing selection and our study exposes numerous genomic regions with the predicted signature of higher polymorphism than expected when a BGS context is taken into account. Importantly, we show that these conclusions are robust to the mutation and selection parameters of the BGS model. Finally, analyses of protein evolution together with previous comparisons of genetic maps between Drosophila species, suggest temporally variable recombination landscapes and thus, local BGS effects that may differ between extant and past phases. 
Because genome-wide BGS and temporal changes in linkage effects can skew approaches to estimate demographic and selective events, future analyses should incorporate BGS predictions and capture local recombination variation across genomes and along lineages.

Tandem duplications and the limits of natural selection in Drosophila yakuba and Drosophila simulans

Rebekah L Rogers, Julie M Cridland, Ling Shao, Tina T Hu, Peter Andolfatto, Kevin R Thornton
Subjects: Populations and Evolution (q-bio.PE)

Tandem duplications are an essential source of genetic novelty, and their prevalence in natural populations is expected to influence the trajectory of adaptive walks. Here, we describe evolutionary impacts of recently-derived, segregating tandem duplications in Drosophila yakuba and Drosophila simulans. We observe an excess of duplicated genes involved in defense against pathogens, chorion development, cuticular peptides, and lipases or endopeptidases associated with the accessory glands, as well as insecticide metabolism, suggesting that duplications function in Red Queen dynamics and rapid evolution. We observe evidence of widespread selection on the D. simulans X, suggesting adaptation through duplication is common on the X. Though we find many high frequency variants, duplicates display an excess of low frequency variants consistent with largely detrimental impacts, limiting the variation that can effectively facilitate adaptation. Although we observe hundreds of gene duplications, we show that segregating variation is insufficient to provide duplicate copies of the entire genome, and the number of duplications in the population spans 13.4% of major chromosome arms in D. yakuba and 9.7% in D. simulans. Whole gene duplication rates are low at $1.1 \times 10^{-9}$ in D. yakuba and $6.1 \times 10^{-9}$ in D. simulans, suggesting long wait times for new mutations. Hence, if adaptive processes are dependent on individual duplications, evolution will be severely limited by mutation. Hence, parallel recruitment of the same duplicated gene in different species will be rare and standing variation will define evolutionary outcomes, in spite of convergence across rapidly evolving phenotypes.

Comparison of Y-chromosomal lineage dating using either evolutionary or genealogical Y-STR mutation rates

Chuan-Chao Wang, Li Hui

We have compared the Y chromosomal lineage dating between sequence data and commonly used Y-SNP plus Y-STR data. The coalescent times estimated using evolutionary Y-STR mutation rates correspond best with sequence-based dating when the lineages include the most ancient haplogroup A individuals. However, the times using slow mutated STR markers with genealogical rates fit well with sequence-based estimates in main lineages, such as haplogroup CT, DE, K, NO, IJ, P, E, C, I, J, N, O, and R. In addition, genealogical rates lead to more plausible time estimates for Neolithic coalescent sublineages compared with sequence-based dating.

The Landscape of Human STR Variation

Thomas F. Willems, Melissa Gymrek, Gareth Highnam, The 1000 Genomes Project, David Mittelman, Yaniv Erlich

Short Tandem Repeats are among the most polymorphic loci in the human genome. These loci play a role in the etiology of a range of genetic diseases and have been frequently utilized in forensics, population genetics, and genetic genealogy. Despite this plethora of applications, little is known about the variation of most STRs in the human population. Here, we report the largest-scale analysis of human STR variation to date. We collected information for nearly 700,000 STR loci across over 1,000 individuals in phase 1 of the 1000 Genomes Project. This process nearly saturated common STR variations. After employing a series of quality controls, we utilize this call set to analyze determinants of STR variation, assess the human reference genome’s representation of STR alleles, find STR loci with common loss-of-function alleles, and obtain initial estimates of the linkage disequilibrium between STRs and common SNPs. Overall, these analyses further elucidate the scale of genetic variation beyond classical point mutations. The resource is publicly available at http://strcat.teamerlich.org/ both in raw format and via a graphical interface.