Saulo A. Aflitos, Elio Schijlen, Richard Finkers, Sandra Smit, Jun Wang, Gengyun Zhang, Ning Li, Likai Mao, Hans de Jong, Freek Bakker, Barbara Gravendeel, Timo Breit, Rob Dirks, Henk Huits, Darush Struss, Ruth Wagner, Hans van Leeuwen, Roeland van Ham, Laia Fito, Laëtitia Guigner, Myrna Sevilla, Philippe Ellul, Eric W. Ganko, Arvind Kapur, Emmanuel Reclus, Bernard de Geus, Henri van de Geest, Bas te Lintel Hekkert, Jan C. Van Haarst, Lars Smits, Andries Koops, Gabino Sanchez Perez, Dick de Ridder, Sjaak van Heusden, Richard Visser, Zhiwu Quan, Jiumeng Min, Li Liao, Xiaoli Wang, Guangbiao Wang, Zhen Yue, Xinhua Yang, Na Xu, Eric Schranz, Eric F. Smets, Rutger A. Vos, Han Rauwerda, Remco Ursem, Cees Schuit, Mike Kerns, Jan van den Berg, Wim H. Vriezen, Antoine Janssen, Torben Jahrman, Frederic Moquet, Julien Bonnet, Sander A. Peters
(Submitted on 21 Apr 2015)
Genetic variation in the tomato clade was explored by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon, and Neolycopersicon groups. We present a reconstruction of three new reference genomes in support of our comparative genome analyses. Sequence diversity in commercial breeding lines appears extremely low, indicating dramatic genetic erosion of crop tomatoes. This is reflected in the SNP count, which in wild species can exceed 10 million, i.e., 20-fold higher than in crop accessions. Comparative sequence alignment reveals group-, species-, and accession-specific polymorphisms, which explain characteristic fruit traits and growth habits in tomato accessions. Using gene models from the annotated Heinz reference genome, we observe a bias in dN/dS ratio in fruit and growth diversification genes compared to a random set of genes, which is probably the result of positive selection. We detected highly divergent segments in wild S. lycopersicum species, and footprints of introgressions in crop accessions originating from a common donor accession. Phylogenetic relationships of fruit diversification and growth-specific genes from crop accessions show incomplete resolution and are dependent on the introgression donor. In contrast, whole-genome SNP information has sufficient power to resolve the phylogenetic placement of each accession in the four main groups in the Lycopersicon clade using Maximum Likelihood analyses. Phylogenetic relationships appear correlated with habitat and mating type and point to the occurrence of geographical races within these groups, and are thus of practical importance for introgressive hybridization breeding. Our study illustrates the need for multiple reference genomes in support of tomato comparative genomics and Solanum genome evolution studies.
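
For readers unfamiliar with the dN/dS ratio mentioned above, the following is a minimal sketch of how such a ratio can be computed for a single gene. It is not the pipeline used in this study, which the abstract does not describe. It assumes that the counts of nonsynonymous and synonymous differences and sites have already been obtained from a codon alignment, and it applies the standard Jukes-Cantor correction to each proportion. The function names `jukes_cantor` and `dn_ds` and all input values are hypothetical and chosen for illustration only.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for an observed proportion of differences p."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Ratio of nonsynonymous to synonymous substitution rates for one gene.

    All four inputs are assumed to come from an upstream codon-alignment step
    (hypothetical here); this function only performs the final arithmetic.
    """
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0:
        raise ValueError("dS is zero; the ratio is undefined")
    return d_n / d_s


# Illustrative, made-up counts for a single gene (not data from the study).
ratio = dn_ds(nonsyn_diffs=20, nonsyn_sites=700, syn_diffs=5, syn_sites=300)
print(f"dN/dS = {ratio:.2f}")
```

A ratio above 1 is conventionally read as a sign of positive selection, a ratio near 1 as neutral evolution, and a ratio below 1 as purifying selection. Comparing the distribution of such ratios between a gene set of interest and a random gene set is one way a bias of the kind reported here could be assessed.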