Neighbourhoods of phylogenetic trees: exact and asymptotic counts

Jamie V. De Jong, Jeanette C McLeod, Mike Steel
(Submitted on 15 Aug 2015)

A central theme in phylogenetics is the reconstruction and analysis of evolutionary trees from a given set of data. To determine the optimal search methods for reconstructing trees, it is crucial to understand the size and structure of the neighbourhoods of trees under tree rearrangement operations. The diameter and size of the immediate neighbourhood of a tree have been well studied; however, little is known about the number of trees at distance two, three or (more generally) k from a given tree. In this paper we provide a number of exact and asymptotic results concerning these quantities, and identify some key aspects of tree shape that play a role in determining these quantities. We obtain several new results for two of the main tree rearrangement operations, Nearest Neighbour Interchange and Subtree Prune and Regraft, as well as for the Robinson-Foulds metric on trees.
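
For orientation, the "immediate neighbourhood" sizes mentioned in the abstract can be sketched with standard closed forms. The formulas below are the classical counts for unrooted binary trees on n labelled leaves and are not taken from the abstract itself: 2(n - 3) NNI neighbours, 2(n - 3)(2n - 7) SPR neighbours, and (2n - 5)!! trees in total. The function names are illustrative only.

```python
def num_unrooted_binary_trees(n):
    """Number of unrooted binary trees on n labelled leaves: (2n - 5)!!."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for odd in range(3, 2 * n - 4, 2):
        count *= odd
    return count


def nni_neighbours(n):
    """Trees one NNI move away: each of the n - 3 interior edges gives 2."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return 2 * (n - 3)


def spr_neighbours(n):
    """Trees one SPR move away (classical count, assumed here): 2(n-3)(2n-7)."""
    if n < 3:
        raise ValueError("need at least 3 leaves")
    return 2 * (n - 3) * (2 * n - 7)


if __name__ == "__main__":
    for leaves in (4, 5, 10, 20):
        print(
            leaves,
            nni_neighbours(leaves),
            spr_neighbours(leaves),
            num_unrooted_binary_trees(leaves),
        )
```

The first neighbourhood grows only polynomially in n while the tree space grows super-exponentially, which is why counts at distance two, three or k matter for search design.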

Sex-dependent dominance at a single locus maintains variation in age at maturity in Atlantic salmon

Nicola Barson, Tutku Aykanat, Kjetil Hindar, Matthew Baranski, Geir Bolstad, Peder Fiske, Celeste Jacq, Arne Jensen, Susan E Johnston, Sten Karlsson, Matthew Kent, Eero Niemelä, Torfinn Nome, Tor Naesje, Panu Orell, Atso Romakkaniemi, Harald Saegrov, Kurt Urdal, Jaakko Erkinaro, Sigbjorn Lien, Craig Primmer

Males and females share many traits that have a common genetic basis; however, selection on these traits often differs between the sexes, leading to sexual conflict. Under such sexual antagonism, theory predicts the evolution of genetic architectures that resolve this sexual conflict. Yet, despite intense theoretical and empirical interest, the specific genetic loci behind sexually antagonistic phenotypes have rarely been identified, limiting our understanding of how sexual conflict impacts genome evolution and the maintenance of genetic diversity. Here, we identify a large-effect locus controlling age at maturity in 57 salmon populations, an important fitness trait in which selection favours earlier maturation in males than females, and show it is a clear example of sex-dependent dominance reducing intralocus sexual conflict and maintaining adaptive variation in wild populations. Using high-density SNP data and whole genome re-sequencing, we found that vestigial-like family member 3 (VGLL3) exhibits sex-dependent dominance in salmon, promoting earlier and later maturation in males and females, respectively. VGLL3, an adiposity regulator associated with size and age at maturity in humans, explained 39.4% of phenotypic variation, an unexpectedly high effect size for what is usually considered a highly polygenic trait. Such large effects are predicted under balancing selection from either sexually antagonistic or spatially varying selection. Our results provide the first empirical example of dominance reversal permitting greater optimisation of phenotypes within each sex, contributing to the resolution of sexual conflict in a major and widespread evolutionary trade-off between age and size at maturity. They also provide key empirical evidence for how variation in reproductive strategies can be maintained over large geographical scales. We further anticipate these findings will have a substantial impact on population management in a range of harvested species where trends towards earlier maturation have been observed.

A genomic region containing RNF212 is associated with sexually-dimorphic recombination rate variation in wild Soay sheep (Ovis aries).

Susan E Johnston, Jon Slate, Josephine M Pemberton

Meiotic recombination breaks down linkage disequilibrium and forms new haplotypes, meaning that it is an important driver of diversity in eukaryotic genomes. Understanding the causes of variation in recombination rate is important not only for interpreting and predicting evolutionary phenomena, but also for understanding the potential of a population to respond to selection. Yet there remain few data on whether, how and why recombination rate varies in natural populations. Here, we used extensive pedigree and high-density SNP information in a wild population of Soay sheep (Ovis aries) to determine individual crossovers in 3330 gametes from 813 individuals. Using these data, we investigated the recombination landscape and the genetic architecture of individual autosomal recombination rate. The population was strongly heterochiasmic (male to female linkage map ratio = 1.31), driven by significantly elevated levels of male recombination in sub-telomeric regions. Autosomal recombination rate was heritable in both sexes (h² = 0.16 and 0.12 in females and males, respectively), but with different genetic architectures. In females, 46.7% of heritable variation was explained by a sub-telomeric region on chromosome 6; a genome-wide association study showed the strongest associations at RNF212, with further associations observed at a nearby ~374 kb region of complete linkage disequilibrium containing three additional candidate loci, CPLX1, GAK and PCGF3. This region did not affect male recombination rate. A second region on chromosome 7 containing REC8 and RNF212B explained 26.2% of heritable variation in recombination rate in both sexes, with further single locus associations identified on chromosome 3. Our findings provide a key empirical example of the genetic architecture of recombination rate in a wild mammal population with male-biased crossover frequency.

GWAS identifies a single selective sweep for age of maturation in wild and cultivated Atlantic salmon males.

Fernando Ayllon, Erik Kjærner-Semb, Tomasz Furmanek, Vidar Wennevik, Monica Solberg, Harald Sægrov, Kurt Urdal, Geir Dahle, Geir Lasse Taranger, Kevin A Glover, Markus S Almén, Carl J Rubin, Rolf B Edvardsen, Anna Wargelius

Background: Sea age at sexual maturation displays large plasticity for wild Atlantic salmon males and varies between 1 and 5 years. This flexibility can also be observed in domesticated salmon. Previous studies have uncovered a genetic predisposition for age at maturity with moderate heritability, thus suggesting a polygenic nature of this trait. The aim of this study was to identify genomic regions and associated SNPs and genes conferring age at maturity in salmon. Results: We performed a GWAS using a pool sequencing approach (n=20 per river and trait) of salmon returning as sexually mature either after one sea winter (2009) or after three sea winters (2011) in six rivers in Norway. The study revealed one major selective sweep, which covered 76 significant SNPs in a 230 kb region of Chr 25. A SNP assay of other year classes of wild salmon and of cultivated fish supported this finding. The assay in cultivated fish narrowed the haplotype conferring the trait to 4 SNPs within a 2386 bp region containing the vgll3 gene. Two of these SNPs caused missense mutations in vgll3. Conclusions: This study presents a single selective region in the genome for age at maturation in male Atlantic salmon. The SNPs identified may be used as QTLs to prevent early maturity in aquaculture and in monitoring programs of wild salmon. Interestingly, the identified vgll3 gene has previously been linked to time of puberty in humans, suggesting a conserved mechanism for time of puberty in vertebrates.

Fast and efficient QTL mapper for thousands of molecular phenotypes

Halit Ongen, Alfonso Buil, Andrew Brown, Emmanouil Dermitzakis, Olivier Delaneau

Motivation: In order to discover quantitative trait loci (QTLs), multi-dimensional genomic data sets combining DNA-seq and ChIP-/RNA-seq require methods that rapidly correlate tens of thousands of molecular phenotypes with millions of genetic variants while appropriately controlling for multiple testing. Results: We have developed FastQTL, a method that implements a popular cis-QTL mapping strategy in a user- and cluster-friendly tool. FastQTL also proposes an efficient permutation procedure to control for multiple testing. The outcome of permutations is modeled using beta distributions trained from a few permutations and from which adjusted p-values can be estimated at any level of significance with little computational cost. The Geuvadis and GTEx pilot data sets can now be easily analyzed an order of magnitude faster than previous approaches. Availability: Source code, binaries and comprehensive documentation of FastQTL are freely available to download at
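
The beta approximation of permutation results can be illustrated with a small sketch. This is not FastQTL's code: it assumes the null statistic is the smallest nominal p-value per permutation, fits the two beta parameters by simple moment matching (a stand-in chosen for brevity; the abstract does not say how the fit is done), and reads the adjusted p-value off the fitted beta CDF. All function names are hypothetical, and the incomplete beta function is implemented by hand so that only the standard library is needed.

```python
import math
import statistics


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction for step m.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b), i.e. the Beta(a, b) CDF."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def fit_beta_moments(values):
    """Moment-matching estimates of the two beta shape parameters."""
    mean = statistics.mean(values)
    var = statistics.variance(values)
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common


def adjusted_pvalue(nominal_p, null_min_pvalues):
    """Adjusted p-value: probability that a null minimum is <= nominal_p."""
    shape1, shape2 = fit_beta_moments(null_min_pvalues)
    return beta_cdf(nominal_p, shape1, shape2)
```

Because the fitted CDF can be evaluated at any nominal p-value, a modest number of permutations can stand in for the very large number that direct counting would need to resolve small adjusted p-values.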

Impact of the X chromosome and sex on regulatory variation

Kimberly R Kukurba, Princy Parsana, Kevin S Smith, Zachary Zappala, David A Knowles, Marie-Julie Favé, Xin Li, Xiaowei Zhu, James B Potash, Myrna M Weissman, Jianxin Shi, Anshul Kundaje, Douglas F Levinson, Philip Awadalla, Sara Mostafavi, Alexis Battle, Stephen B Montgomery

The X chromosome, with its unique mode of inheritance, contributes to differences between the sexes at a molecular level, including sex-specific gene expression and sex-specific impact of genetic variation. We have conducted an analysis of the impact of both sex and the X chromosome on patterns of gene expression identified through transcriptome sequencing of whole blood from 922 individuals. We found that genes on the X chromosome are more likely to have sex-specific expression than autosomal genes. Furthermore, we identified a depletion of regulatory variants on the X chromosome, especially among genes under high selective constraint. In contrast, we discovered an enrichment of sex-specific regulatory variants on the X chromosome. To resolve the molecular mechanisms underlying such effects, we generated and connected sex-specific chromatin accessibility to sex-specific expression and regulatory variation. As sex-specific regulatory variants can inform sex differences in genetic disease prevalence, we have integrated our data with genome-wide association study data for multiple immune traits to identify traits with significant sex biases. Together, our study provides genome-wide insight into how the X chromosome and sex shape human gene regulation and disease.

Genome variation and meiotic recombination in Plasmodium falciparum: insights from deep sequencing of genetic crosses

Alistair Miles, Zamin Iqbal, Paul Vauterin, Richard Pearson, Susana Campino, Michel Theron, Kelda Gould, Daniel Mead, Eleanor Drury, John O’Brien, Valentin Ruano Rubio, Bronwyn MacInnis, Jonathan Mwangi, Upeka Samarakoon, Lisa Ranford-Cartwright, Michael Ferdig, Karen Hayton, Xinzhuan Su, Thomas Wellems, Julian Rayner, Gil McVean, Dominic Kwiatkowski

The malaria parasite Plasmodium falciparum has a great capacity for evolutionary adaptation to evade host immunity and develop drug resistance. Current understanding of parasite evolution is impeded by the fact that a large fraction of the genome is either highly repetitive or highly variable, and thus difficult to analyse using short read technologies. Here we describe a resource of deep sequencing data on parents and progeny from genetic crosses, which has enabled us to perform the first integrated analysis of SNP, INDEL and complex polymorphisms, using Mendelian error rates as an indicator of genotypic accuracy. These data reveal that INDELs are exceptionally abundant and the dominant mode of polymorphism within the core genome. We analyse patterns of meiotic recombination, including the relative contribution of crossover and non-crossover events, and we observe several instances of recombination that modify copy number variants associated with drug resistance. We describe a novel web application that allows these data to be explored in detail.
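
The idea of using Mendelian error rates as an accuracy indicator can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: progeny are treated as haploid clones, genotypes are given as one allele per site, and a Mendelian error is counted whenever a progeny call matches neither parent. The function name and the toy data are hypothetical.

```python
def mendelian_error_rate(parent1, parent2, progeny):
    """Fraction of called progeny genotypes that match neither parent.

    parent1, parent2: lists with one allele per site (None = missing call).
    progeny: list of such lists, one per progeny clone.
    """
    errors = 0
    calls = 0
    for site, (allele1, allele2) in enumerate(zip(parent1, parent2)):
        if allele1 is None or allele2 is None:
            continue  # cannot judge inheritance without both parental calls
        for clone in progeny:
            allele = clone[site]
            if allele is None:
                continue  # skip missing progeny calls
            calls += 1
            if allele != allele1 and allele != allele2:
                errors += 1
    return errors / calls if calls else 0.0


if __name__ == "__main__":
    # Toy data: 4 sites, 3 progeny clones; one call ("G" at site 2) is an error.
    p1 = ["A", "C", "T", "G"]
    p2 = ["A", "T", "T", None]
    kids = [
        ["A", "C", "T", "G"],
        ["A", "T", "G", "G"],
        ["A", None, "T", "A"],
    ]
    print(mendelian_error_rate(p1, p2, kids))
```

A caller that produces fewer such impossible genotypes at the same sites is, under these assumptions, the more accurate one, which is what makes the rate usable for comparing SNP, INDEL and complex variant calls.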