John D O’Brien, Lucas Amenga-Etego, Ruiqi Li
A recent genomic characterization of more than 200 Plasmodium falciparum samples isolated from the bloodstreams of clinical patients across three continents further supports the presence of substantial strain mixture within infections. Consistent with previous studies, these data suggest that the degree of genetic strain admixture within infections varies substantially both within and across populations. The life cycle of the parasite implies that the mixture of multiple genotypes within an infected individual controls the outcrossing rate across populations, making methods for measuring this process in situ central to understanding the genetic epidemiology of the disease. Peculiar features of the P. falciparum genome mean that standard methods for assessing structure within a population, namely inbreeding coefficients and related $F$-statistics, cannot be used directly. Here we review an initial effort to estimate the degree of mixture within clinical isolates of P. falciparum using these statistics, and provide several generalizations using both frequentist and Bayesian approaches. Using the Bayesian approach, which is based on the Balding-Nichols model, we estimate inbreeding coefficients for 168 samples from northern Ghana, find significant admixture in more than 70% of samples, and characterize the model fit using posterior predictive checks. We also compare this approach to a recently introduced mixture model and find that, for a sizable minority of samples, the $F$-statistic-based approach provides a significantly better explanation of the data. Finally, we show how to extend this model to a multi-level testing framework that can integrate other data types, and use it to demonstrate that transmission intensity is significantly associated with the degree of structure of within-sample mixture in northern Ghana.