Site-specific amino-acid preferences are mostly conserved in two closely related protein homologs
Michael B Doud, Orr Ashenberg, Jesse Bloom
Evolution drives changes in a protein’s sequence over time. The extent to which these changes in sequence affect the underlying preferences for each amino acid at each site is an important question with implications for comparative sequence-analysis methods such as molecular phylogenetics. To quantify the extent to which site-specific amino-acid preferences change during evolution, we performed deep mutational scanning on two homologs of human influenza nucleoprotein with 94% amino-acid identity. We found that only a small fraction of sites (14 out of 497) exhibited changes in their amino-acid preferences that exceeded the noise in our experiments. Given the limited change in amino-acid preferences between these close homologs, we tested whether our measurements could be used to build site-specific substitution models that describe the evolution of nucleoproteins from more diverse influenza viruses. We found that site-specific evolutionary models informed by our experiments greatly outperformed non-site-specific alternatives in fitting the phylogenies of nucleoproteins from human, swine, equine, and avian influenza. Combining the experimental data from both nucleoprotein homologs improved phylogenetic fit, in part because measurements in multiple genetic contexts better captured the evolutionary average of the amino-acid preferences for sites with changing preferences. Overall, our results show that site-specific amino-acid preferences are sufficiently conserved during evolution that measuring mutational effects in one protein provides information that can improve quantitative evolutionary modeling of nearby homologs.
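As a rough illustration of the comparison described above, the sketch below scores how far the amino-acid preferences at one site differ between two homologs, and averages the two sets of measurements. It is a minimal sketch under stated assumptions, not the analysis used in this work: the Jensen-Shannon distance is an assumed choice of metric (the abstract does not name one), the function names are hypothetical, and no correction for experimental noise is attempted.

```python
import math

# The 20 standard amino acids; a preference vector has one entry per residue.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def normalize(prefs):
    """Rescale non-negative preferences so that they sum to 1."""
    total = sum(prefs)
    if total <= 0:
        raise ValueError("preferences must have a positive sum")
    return [p / total for p in prefs]


def js_distance(prefs_a, prefs_b):
    """Jensen-Shannon distance (log base 2) between two preference vectors.

    Assumed metric, chosen for illustration. It ranges from 0 (identical
    preferences) to 1 (no amino acid preferred in both homologs).
    """
    p = normalize(prefs_a)
    q = normalize(prefs_b)
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i == 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def average_preferences(prefs_a, prefs_b):
    """Average the normalized preferences measured in two homologs."""
    p = normalize(prefs_a)
    q = normalize(prefs_b)
    return [(a + b) / 2 for a, b in zip(p, q)]
```

In practice, a per-site distance like this would only be meaningful when compared against the distance between replicate measurements of the same homolog, since only differences that exceed experimental noise indicate a real shift in preferences.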