Our paper: Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture

[This author post is by John Pool on his paper: Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture arXived here.]

We are in the process of publishing this analysis of >100 sequenced Drosophila melanogaster genomes (largely haploid genomes at >25X depth).  These genomes come from more than 20 geographic locations, largely within sub-Saharan Africa, where the species is thought to have originated.  Truth be told, this sampling scheme was somewhat accidental – we wanted to identify a population representing a “center of genetic diversity” for the species, which for us involved sequencing small numbers of genomes from many different population samples (some from previous lab stocks, others from newly collected lines).  Ultimately we did find the sample we were looking for, and we are in the process of sequencing ~300 genomes from this Zambian population.  Still, it seemed more than worthwhile to analyze the “geographic scatter” of genomes we had obtained from across sub-Saharan Africa (as well as one small sample from Europe).

Our ambitions for this paper were largely descriptive – a preliminary analysis of genetic variation within and among the sampled populations.  We envisioned being able to compare diversity levels and genetic structure across Africa (much as I once did with a dramatically smaller data set), and to identify specific loci with signatures of selection.  And we were able to do that.  We found the highest levels of genetic diversity in and around Zambia, raising the prospect of a southern-central African origin for D. melanogaster.  We found low-to-moderate levels of genetic structure across most of sub-Saharan Africa, with only Ethiopian populations showing stronger genetic differentiation (along with some morphological differentiation, but that’s another story).  Analyses of allele frequencies within and between populations revealed a substantial number of loci with evidence of recent natural selection – many GO categories enriched for such outliers pertained to gene regulation, much as we had observed in another recent population genomic analysis.
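For readers less familiar with the summary statistics behind "diversity" and "genetic structure", here is a minimal Python sketch. It is an illustration only, not the pipeline used in the paper: the function names are hypothetical, the input is assumed to be aligned haploid sequences of equal length with N marking missing calls, and FST is computed in one common ratio-of-diversities form (1 minus within-population diversity over between-population diversity), which may differ from the estimator we actually used.

```python
from itertools import combinations, product


def pairwise_diff(a, b):
    """Count differing and comparable sites between two aligned haploid sequences.

    Sites where either sequence has 'N' (assumed missing-data code) are skipped.
    """
    diffs = sites = 0
    for x, y in zip(a, b):
        if x == "N" or y == "N":
            continue
        sites += 1
        if x != y:
            diffs += 1
    return diffs, sites


def mean_diff(pairs):
    """Average per-site difference over an iterable of sequence pairs."""
    total_diffs = total_sites = 0
    for a, b in pairs:
        d, n = pairwise_diff(a, b)
        total_diffs += d
        total_sites += n
    return total_diffs / total_sites if total_sites else 0.0


def nucleotide_diversity(pop):
    """Within-population diversity: mean per-site difference over all pairs."""
    return mean_diff(combinations(pop, 2))


def between_diversity(pop1, pop2):
    """Between-population diversity: mean per-site difference across populations."""
    return mean_diff(product(pop1, pop2))


def fst(pop1, pop2):
    """Ratio-of-diversities FST: 1 - (mean within diversity) / (between diversity)."""
    within = (nucleotide_diversity(pop1) + nucleotide_diversity(pop2)) / 2
    between = between_diversity(pop1, pop2)
    return 1 - within / between if between else 0.0


# Toy usage with made-up sequences (not real data).
pop_a = ["ACGTACGT", "ACGTACGA", "ACGTACGT"]
pop_b = ["ACGAACGT", "ACGAACGT"]
print(nucleotide_diversity(pop_a), fst(pop_a, pop_b))
```

In practice these quantities would be computed in windows along each chromosome arm, so that diversity levels and differentiation can be compared across the genome as well as across populations.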

Of course that’s how we normally think of natural selection’s influence on genetic variation – specific beneficial mutations leading to selective sweeps (whether hard or soft, partial or complete), each one influencing diversity on a limited genomic scale.  And at least in species with large outbreeding populations like Drosophila, recurrent hitchhiking may be common enough to affect diversity at random sites in the genome (e.g. 1, 2, 3).  So we weren’t surprised to find sweep signals.  The bigger surprise to us was finding evidence that specific episodes of natural selection had affected genetic variation on the scale of whole chromosome arms or the entire genome.

The first major surprise concerned genomic patterns of non-African admixture in African D. melanogaster populations.  The occurrence of such introgression had been documented before, and there were previous findings that non-African genotypes were associated with urban environments in Africa, and that admixture levels could vary within the genome. We developed a hidden Markov model approach to detect admixed chromosomal regions (based simply on the reduced diversity found in populations outside sub-Saharan Africa).  Whereas we tend to think of admixture as a selectively neutral force, the genomic patterns of admixture we observed did not seem consistent with passive gene flow.  Non-African genotypes had displaced large portions of the gene pool of presumably quite large African populations, and this had occurred within a very short time (judging by the megabase scale of admixture tracts).  Levels of admixture across the genome showed both broad-scale heterogeneity (chromosomal differences) and relatively narrow “spikes” of admixture.  These peaks of admixture quite often overlapped with outliers for high FST between Africa and Europe, as would be expected if these regions contained functional differences between populations for which introgressing non-African alleles may now be favored in some African environments (e.g. modernizing cities).  
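To give a feel for how a hidden Markov model can pick out admixed tracts, here is a toy two-state sketch in Python. It is not the method described in the manuscript. Everything in it is an assumption made for illustration: the per-window statistic (a count of differences between a focal African genome and a non-African reference), the Poisson emission model, the two rates, and the switch probability are all hypothetical placeholders. The only idea carried over from the text is that windows of non-African ancestry should show reduced diversity.

```python
import math


def poisson_logpmf(k, lam):
    """Log probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_ancestry(diffs, rate_african=20.0, rate_admixed=5.0, switch_prob=0.01):
    """Most likely ancestry state per window via Viterbi decoding.

    diffs: per-window difference counts (hypothetical input statistic).
    rate_african / rate_admixed: assumed Poisson means for each hidden state.
    switch_prob: assumed probability of changing state between adjacent windows.
    """
    if not diffs:
        return []
    states = ("african", "admixed")
    rates = {"african": rate_african, "admixed": rate_admixed}
    log_stay = math.log(1 - switch_prob)
    log_switch = math.log(switch_prob)

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # Initialise with equal prior probability for both states.
    score = {s: math.log(0.5) + poisson_logpmf(diffs[0], rates[s]) for s in states}
    backpointers = []
    for k in diffs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: score[p] + trans(p, s))
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + trans(best_prev, s) + poisson_logpmf(k, rates[s])
        backpointers.append(pointers)
        score = new_score

    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy usage with made-up window counts (not real data).
window_diffs = [22, 18, 25, 4, 3, 6, 5, 21, 19]
print(viterbi_ancestry(window_diffs))
```

The small switch probability is what makes the model favor long contiguous tracts over window-by-window flips, which is the property that matters when admixture tracts are on the megabase scale.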

The second surprise came as we documented population genetic patterns associated with polymorphic inversions (as further analyzed in a forthcoming paper by Russ Corbett-Detig and Dan Hartl).  It was already known that inversions tend to differ in frequency between D. melanogaster populations, but theory and most empirical data suggested that only diversity around the inversion breakpoints should be affected.  Instead, we observed some African populations in which elevated inversion frequencies were associated with notable reductions in diversity for entire chromosome arms (ultimately affecting genome-wide average diversity), consistent with directional selection on rearrangements or linked loci.  Perhaps more surprisingly, most inversions found in the non-African sample (France) served to substantially increase diversity across whole chromosome arms (by up to 29% in the case of inversions on arm 3R), and by 12% genome-wide.  Here, we can only suggest that selection may have acted to favor inverted chromosomes that recently originated from a more genetically diverse (e.g. African or African-admixed) population.  Accounting for these inversions substantially alters chromosomal diversity ratios between African and European populations.

Hence, we may have the curious situation of natural selection driving introgression in both directions across the sub-Saharan/cosmopolitan population genetic divide in D. melanogaster.

You can find our draft manuscript here, supplemental items here, and the data here.

I’m definitely glad we were able to post a draft at arXiv – it was time to communicate our findings to the research community (especially to facilitate our colleagues’ analysis and publication plans for this data set), and there’s really no downside to us as authors.  I also appreciate the chance to post here at Haldane’s Sieve, and it would be great to discuss any aspect of our draft.

John Pool


Population genomics of sub-Saharan Drosophila melanogaster: African diversity and non-African admixture
John E. Pool, Russell B. Corbett-Detig, Ryuichi P. Sugino, Kristian A. Stevens, Charis M. Cardeno, Marc W. Crepeau, Pablo Duchen, J. J. Emerson, Perot Saelao, David J. Begun, Charles H. Langley
(Submitted on 23 Aug 2012)

(ABRIDGED) We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. Outliers for Europe-Africa FST were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.