3D RNA from evolutionary couplings
Caleb Weinreb, Torsten Gross, Chris Sander, Debora S. Marks
RNA genes are ubiquitous in cell physiology, with a diverse repertoire of known functions. In fact, the majority of the eukaryotic genome does not code for proteins, and thousands of conserved RNAs of currently unknown function have been identified. Knowledge of 3D structure could help elucidate the function of these RNAs, but despite outstanding work using X-ray crystallography, NMR and cryo-EM, structure determination remains low-throughput. RNA structure prediction in silico is a promising alternative. However, 3D structure prediction for large RNAs requires tertiary contacts between distant secondary structural elements, and these are difficult to infer with existing methods. Here, based only on sequences, we use a global statistical probability model of co-variation to detect 3D contacts, in analogy to recently developed breakthrough methods for computational protein folding. In blinded tests on 22 known RNA structures ranging in size from 65 to 1800 nucleotides, the predicted contacts matched physical interactions with 65-95% prediction accuracy. Importantly, we infer many long-range tertiary contacts, including non-Watson-Crick interactions. When used as restraints in molecular dynamics simulations, the inferred contacts improve RNA 3D structure prediction to a coordinate error as low as 6 to 10 Angstrom RMSD, with potential for use with other constraints. These contacts include functionally important interactions, such as those that distinguish the active and inactive conformations of four riboswitches. In blind prediction mode, we present evolutionary couplings for 180 RNAs of unknown structure (available at this https URL). We anticipate that this approach will shed light on the structure and function of as-yet poorly characterized RNA genes.
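To make the idea of a global co-variation model concrete, the following is a minimal sketch only, not the authors' implementation. The abstract does not state the inference procedure, so the sketch swaps in a simple and widely known stand-in: a mean-field approximation to a Potts model, in which couplings are read off the inverse of the regularized covariance matrix of an RNA alignment. The function names (`one_hot`, `mean_field_couplings`, `top_pair`), the alphabet ordering, the pseudocount value, and the omission of sequence reweighting are all assumptions made for illustration.

```python
import numpy as np

ALPHABET = "ACGU-"  # assumed encoding: four nucleotides plus a gap state


def one_hot(msa):
    """Encode a list of equal-length aligned sequences as an (N, L, q) array."""
    idx = np.array([[ALPHABET.index(c) for c in seq] for seq in msa])
    return np.eye(len(ALPHABET))[idx]


def mean_field_couplings(msa, pseudocount=0.5):
    """Score column pairs with a mean-field Potts approximation (illustrative).

    Returns an (L, L) symmetric matrix of coupling scores after an
    average-product correction; higher means stronger inferred coupling.
    """
    x = one_hot(msa)
    n, L, q = x.shape

    # Single-site and pairwise frequencies, mixed with a uniform pseudocount.
    fi = (1 - pseudocount) * x.mean(axis=0) + pseudocount / q
    fij = (1 - pseudocount) * np.einsum("nia,njb->iajb", x, x) / n
    fij += pseudocount / q**2
    for i in range(L):
        fij[i, :, i, :] = np.diag(fi[i])  # a site is perfectly self-consistent

    # Covariance, dropping the last state per site to fix the gauge.
    cov = fij - fi[:, :, None, None] * fi[None, None, :, :]
    cov = cov[:, : q - 1, :, : q - 1].reshape(L * (q - 1), L * (q - 1))

    # Mean-field couplings are minus the inverse covariance: a global
    # quantity, since every column influences every entry of the inverse.
    J = -np.linalg.inv(cov).reshape(L, q - 1, L, q - 1)

    # Collapse each (q-1) x (q-1) coupling block to its Frobenius norm.
    score = np.sqrt((J**2).sum(axis=(1, 3)))
    np.fill_diagonal(score, 0.0)

    # Average-product correction to damp per-column background signal.
    row_mean = score.sum(axis=0) / (L - 1)
    total_mean = score.sum() / (L * (L - 1))
    corrected = score - np.outer(row_mean, row_mean) / total_mean
    np.fill_diagonal(corrected, 0.0)
    return corrected


def top_pair(score):
    """Return the (i, j) column pair, i < j, with the highest score."""
    upper = np.triu(score, k=1)
    i, j = np.unravel_index(np.argmax(upper), upper.shape)
    return int(i), int(j)
```

In this sketch, ranking column pairs by score and keeping the top-scoring ones would give candidate contacts. Because the couplings come from inverting the full covariance matrix rather than scoring each pair in isolation, indirect correlations passed through intermediate positions are discounted, which is the motivation for a global model over pairwise statistics.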