Phylogenetic tree shapes resolve disease transmission patterns
Jennifer Gardy, Caroline Colijn
Whole genome sequencing is becoming popular as a tool for understanding outbreaks of communicable diseases, with phylogenetic trees being used to identify individual transmission events or to characterize outbreak-level overall transmission dynamics. Existing methods to infer transmission dynamics from sequence data rely on well-characterised infectious periods, epidemiological and clinical meta-data which may not always be available, and typically require computationally intensive analysis focussing on the branch lengths in phylogenetic trees. We sought to determine whether the topological structures of phylogenetic trees contain signatures of the overall transmission patterns underlying an outbreak. Here we use simulated outbreaks to train and then test computational classifiers. We test the method on data from two real-world outbreaks. We find that different transmission patterns result in quantitatively different phylogenetic tree shapes. We describe five topological features that summarize a phylogeny’s structure and find that computational classifiers based on these are capable of predicting an outbreak’s transmission dynamics. The method is robust to variations in the transmission parameters and network types, and recapitulates known epidemiology of previously characterized real-world outbreaks. We conclude that there are simple structural properties of phylogenetic trees which, when combined, can distinguish communicable disease outbreaks with a super-spreader, homogeneous transmission, and chains of transmission. This is possible using genome data alone, and can be done during an outbreak. We discuss the implications for management of outbreaks.
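As an illustration of what a topological summary of a phylogeny can look like, the sketch below computes two standard tree-shape statistics, the Sackin index (sum of leaf depths) and the Colless imbalance, for a rooted binary tree. The abstract does not name its five features, so these two are assumptions chosen for illustration, as are the nested-tuple tree encoding and the function name `tree_shape`.

```python
def tree_shape(tree):
    """Return (n_leaves, sackin_index, colless_imbalance) for a rooted binary tree.

    Hypothetical encoding: a leaf is any non-tuple value, an internal node
    is a 2-tuple (left_subtree, right_subtree).
    """
    if not isinstance(tree, tuple):
        return 1, 0, 0  # a single leaf has depth 0 and no imbalance
    left, right = tree
    n_left, sackin_left, colless_left = tree_shape(left)
    n_right, sackin_right, colless_right = tree_shape(right)
    n = n_left + n_right
    # Every leaf below this node sits one level deeper than in its subtree.
    sackin = sackin_left + sackin_right + n
    # Colless sums |left leaves - right leaves| over all internal nodes.
    colless = colless_left + colless_right + abs(n_left - n_right)
    return n, sackin, colless


balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(tree_shape(balanced))     # (4, 8, 0)
print(tree_shape(caterpillar))  # (4, 9, 3)
```

The ladder-like (caterpillar) tree scores higher on both statistics than the balanced tree with the same number of tips, which is the kind of quantitative difference in shape a classifier could use as input.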
Reassortment between influenza B lineages and the emergence of a co-adapted PB1-PB2-HA gene complex
Gytis Dudas, Trevor Bedford, Samantha Lycett, Andrew Rambaut
Comments: 33 pages, 21 figures
Subjects: Populations and Evolution (q-bio.PE)
Influenza B viruses are increasingly being recognized as major contributors to morbidity attributed to seasonal influenza. Currently circulating influenza B isolates are known to belong to two antigenically distinct lineages referred to as B/Victoria and B/Yamagata. Frequent exchange of genomic segments of these two lineages has been noted in the past, but the observed patterns of reassortment have not been formalized in detail. We investigate inter-lineage reassortments by comparing phylogenetic trees across genomic segments. Our analyses indicate that of the 8 segments of influenza B viruses only PB1, PB2 and HA segments maintained separate Victoria and Yamagata lineages and that currently circulating strains possess PB1, PB2 and HA segments derived entirely from one or the other lineage; other segments have repeatedly reassorted between lineages thereby reducing genetic diversity. We argue that this difference between segments is due to selection against reassortant viruses with mixed lineage PB1, PB2 and HA segments. Given sufficient time and continued recruitment to the reassortment-isolated PB1-PB2-HA gene complex, we expect influenza B viruses to eventually undergo sympatric speciation.
High Genetic Diversity and Adaptive Potential of Two Simian Hemorrhagic Fever Viruses in a Wild Primate Population
Adam L. Bailey, Michael Lauck, Andrea Weiler, Samuel D. Sibley, Jorge M. Dinis, Zachary Bergman, Chase W. Nelson, Michael Correll, Michael Gleicher, David Hyeroba, Alex Tumukunde, Geoffrey Weny, Colin Chapman, Jens Kuhn, Austin Hughes, Thomas C. Friedrich, Tony L. Goldberg, David H. O’Connor
Key biological properties such as high genetic diversity and high evolutionary rate enhance the potential of certain RNA viruses to adapt and emerge. Identifying viruses with these properties in their natural hosts could dramatically improve disease forecasting and surveillance. Recently, we discovered two novel members of the viral family Arteriviridae: simian hemorrhagic fever virus (SHFV)-krc1 and SHFV-krc2, infecting a single wild red colobus (Procolobus rufomitratus tephrosceles) in Kibale National Park, Uganda. Nearly nothing is known about the biological properties of SHFVs in nature, although the SHFV type strain, SHFV-LVR, has caused devastating outbreaks of viral hemorrhagic fever in captive macaques. Here we detected SHFV-krc1 and SHFV-krc2 in 40% and 47% of 60 wild red colobus tested, respectively. We found viral loads in excess of 1×10^6-1×10^7 RNA copies per milliliter of blood plasma for each of these viruses. SHFV-krc1 and SHFV-krc2 also showed high genetic diversity at both the inter- and intra-host levels. Analyses of synonymous and non-synonymous nucleotide diversity across viral genomes revealed patterns suggestive of positive selection in SHFV open reading frames (ORF) 5 (SHFV-krc2 only) and 7 (SHFV-krc1 and SHFV-krc2). Thus, these viruses share several important properties with some of the most rapidly evolving, emergent RNA viruses.
A stochastic microscopic model for the dynamics of antigenic variation
Gustavo Guerberoff, Fernando Alvarez-Valin
(Submitted on 8 Nov 2013)
We present a novel model that describes the within-host evolutionary dynamics of parasites undergoing antigenic variation. The approach uses a multi-type branching process with two types of entities defined according to their relationship with the immune system: clans of resistant parasitic cells (i.e. groups of cells sharing the same antigen not yet recognized by the immune system) that may become sensitive, and individual sensitive cells that can acquire a new resistance, thus giving rise to a new clan. The simplicity of the model allows analytical treatment to determine the subcritical and supercritical regimes in the space of parameters. By incorporating a density-dependent mechanism, the model is able to capture additional relevant features observed in experimental data, such as the characteristic parasitemia waves. In summary, our approach provides a new general framework to address the dynamics of antigenic variation, which can be easily adapted to cope with broader and more complex situations.
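For a generic two-type Galton-Watson branching process, the subcritical and supercritical regimes are separated by whether the largest eigenvalue of the mean offspring matrix is below or above 1. The sketch below applies that textbook criterion. It is not the authors' parametrization: the matrix entries, the type ordering (clans first, sensitive cells second) and the function names are all hypothetical.

```python
import math


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4.0 * det
    if disc >= 0:
        root = math.sqrt(disc)
        return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))
    # Complex-conjugate eigenvalues share the modulus sqrt(det).
    return math.sqrt(det)


def regime(mean_matrix, tol=1e-12):
    """Classify a two-type branching process from its mean offspring matrix.

    mean_matrix[i][j] is the mean number of type-j offspring left by one
    type-i individual per generation (hypothetical values, for illustration).
    """
    rho = spectral_radius_2x2(mean_matrix)
    if rho > 1.0 + tol:
        return "supercritical"
    if rho < 1.0 - tol:
        return "subcritical"
    return "critical"


# Hypothetical mean matrices; type 0 = resistant clan, type 1 = sensitive cell.
growing = ((0.5, 0.8), (0.3, 0.6))
dying = ((0.5, 0.2), (0.3, 0.4))

print(regime(growing))  # supercritical
print(regime(dying))    # subcritical
```

In the supercritical regime the population survives with positive probability, while in the subcritical regime it goes extinct almost surely.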
Mapping of the Influenza-A Hemagglutinin Serotypes Evolution by the ISSCOR Method
Jan P. Radomski, Piotr P. Slonimski, Włodzimierz Zagórski-Ostoja, Piotr Borowicz
(Submitted on 8 Nov 2013)
ISSCOR analyses and visualizations of influenza virus hemagglutinin genes from different A subtypes revealed striking temporal relationships between groups of individual gene subsets. Based on these findings, we consider the ISSCOR-PCA method for analyzing large sets of homologous genes to be a worthwhile addition to the genomics toolbox, allowing rapid diagnosis of trends and ultimately even aiding early warning of newly emerging epidemiological threats.
The hemagglutinin mutation E391K of pandemic 2009 influenza revisited
Jan P. Radomski, Piotr Płoński, Włodzimierz Zagórski-Ostoja
(Submitted on 8 Nov 2013)
Phylogenetic analyses based on small to moderately sized sets of sequence data lead to overestimating mutation rates in influenza hemagglutinin (HA) by at least an order of magnitude. Two major underlying reasons are incomplete lineage sorting and the possible absence of key ancestors from the analyzed sequence set. Additionally, during neighbor-joining tree reconstruction each mutation is considered equally important, regardless of its nature. Here we have implemented a heuristic method that optimizes site-dependent factors weighting 1st, 2nd, and 3rd codon position mutations differently, making it possible to extricate incorrectly attributed sub-clades. Least-squares regression analysis of the distribution of mutation frequencies along all nucleotide positions, observed on a partially disentangled tree for a large set of 3243 unique HA sequences, was performed for all mutations as well as for non-equivalent amino acid mutations; in both cases it showed almost flat gradients, with a very slight downward slope towards the 3′-end positions. The mean mutation rates per sequence per year were 3.83*10^-4 for all mutations and 9.64*10^-5 for the non-equivalent ones.
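The idea of weighting mutations by codon position can be sketched as a pairwise distance in which each mismatch contributes a position-dependent weight instead of a constant 1. This is only a minimal illustration, not the authors' heuristic: the weights below are hypothetical placeholders rather than optimized values, and the function assumes gap-free, in-frame aligned sequences that start at the first codon position.

```python
def weighted_distance(seq_a, seq_b, weights=(1.0, 1.0, 0.5)):
    """Sum codon-position weights over mismatching sites of two aligned sequences.

    weights[k] is the (hypothetical) weight of a mismatch at codon
    position k + 1; index 0 of each sequence is assumed to be codon position 1.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    total = 0.0
    for index, (base_a, base_b) in enumerate(zip(seq_a, seq_b)):
        if base_a != base_b:
            total += weights[index % 3]
    return total


# A mismatch at a 3rd codon position counts less than one at a 1st position.
print(weighted_distance("ATGAAA", "ATGAAG"))  # 0.5
print(weighted_distance("ATGAAA", "CTGAAA"))  # 1.0
```

A matrix of such distances could then be passed to a neighbor-joining implementation in place of unweighted mismatch counts.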
Reliable reconstruction of HIV-1 whole genome haplotypes reveals clonal interference and genetic hitchhiking among immune escape variants
Aridaman Pandit, Rob J de Boer
(Submitted on 26 Sep 2013)
Following transmission, HIV-1 evolves into a diverse population, and next generation sequencing enables us to detect variants occurring at low frequencies. Studying viral evolution at the level of whole genomes was hitherto not possible because next generation sequencing delivers relatively short reads. We here provide a proof of principle that whole HIV-1 genomes can be reliably reconstructed from short reads, and use this to study the selection of immune escape mutations at the level of whole genome haplotypes. Using realistically simulated HIV-1 populations, we demonstrate that reconstruction of complete genome haplotypes is feasible with high fidelity. We do not reconstruct all genetically distinct genomes, but each reconstructed haplotype represents one or more of the quasispecies in the HIV-1 population. We then reconstruct 30 whole genome haplotypes from published short sequence reads sampled longitudinally from a single HIV-1-infected patient. We confirm the reliability of the reconstruction by validating our predicted haplotype genes with single genome amplification sequences, and by comparing haplotype frequencies with observed epitope escape frequencies. Phylogenetic analysis shows that the HIV-1 population undergoes selection-driven evolution, with successive replacement of the viral population by novel dominant strains. We demonstrate that immune escape mutants evolve in a dependent manner, with various mutations hitchhiking along with others. As a consequence of this clonal interference, selection coefficients have to be estimated for complete haplotypes and not for individual immune escapes.