Combining exome and gene expression datasets in one graphical model of disease to empower the discovery of disease mechanisms
Aziz M. Mezlini, Fabio Fuligni, Adam Shlien, Anna Goldenberg
Identifying genes associated with complex human diseases is one of the main challenges of human genetics and computational medicine. To address this challenge, millions of genetic variants are screened to identify the few that matter. To increase the power to identify disease-associated genes and to account for other potential sources of protein function aberrations, we propose a novel factor-graph-based model in which much of the biological knowledge is incorporated through factors and priors. Our extensive simulations show that our method has superior sensitivity and precision compared to variant-aggregating and differential expression methods. Applied to breast cancer, our integrative approach identified important genes, including genes that had coding aberrations in some patients and regulatory abnormalities in others, emphasizing the importance of data integration for explaining the disease in a larger number of patients.