Scott M. Lundberg, William B. Tu, Brian Raught, Linda Z. Penn, Michael M. Hoffman, Su-In Lee
Introduction: A cell’s epigenome arises from interactions among chromatin factors (transcription factors, histones, and other DNA-associated proteins) co-localized at particular genomic regions. Identifying the network of interactions among chromatin factors, the chromatin network, is of paramount importance in understanding epigenome regulation.

Methods: We developed a novel computational approach, ChromNet, to infer the chromatin network from a set of ChIP-seq datasets. ChromNet has three key features that enable its use on large collections of ChIP-seq data. First, rather than using pairwise co-localization of factors along the genome, ChromNet identifies conditional dependence relationships that better discriminate direct and indirect interactions. Second, our novel statistical technique, the group graphical model, improves inference of conditional dependence on tightly correlated datasets, such as those for transcription factors that form a complex or for the same transcription factor assayed in different laboratories. Third, ChromNet’s computationally efficient method allows network learning among thousands of factors, and efficient relearning as new data are added.

Results: We applied ChromNet to all available ChIP-seq data from the ENCODE Project, consisting of 1,415 ChIP-seq datasets. The resulting network revealed previously known chromatin factor interactions better than alternative approaches. ChromNet also identified previously unreported chromatin factor interactions. We experimentally validated one of these interactions, between the MYC and HCFC1 transcription factors.

Discussion: ChromNet provides a useful tool for understanding the interactions among chromatin factors and identifying novel interactions. We have provided an interactive web-based visualization of the full ENCODE chromatin network and the ability to incorporate custom datasets at http://chromnet.cs.washington.edu.
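
The contrast drawn in Methods between pairwise co-localization and conditional dependence can be illustrated with a small simulation. The sketch below is not ChromNet and does not implement the group graphical model; it uses the standard partial correlation derived from an inverse correlation matrix, and the three synthetic signals `a`, `b`, and `c` and the helper `partial_correlations` are hypothetical names introduced only for this example. Signal `a` influences `c` only through `b`, so `a` and `c` are strongly correlated pairwise yet nearly independent once `b` is accounted for.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of each column pair given all other columns.

    Illustrative helper (an assumption of this sketch, not ChromNet's API):
    it inverts the sample correlation matrix and rescales the result.
    """
    corr = np.corrcoef(data, rowvar=False)
    precision = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


# Synthetic chain a -> b -> c; these are made-up signals, not ChIP-seq data.
rng = np.random.default_rng(0)
n = 20000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
signals = np.column_stack([a, b, c])

pairwise = np.corrcoef(signals, rowvar=False)
partial = partial_correlations(signals)

# Pairwise correlation links a and c strongly (an indirect association),
# while their partial correlation given b is close to zero.
print("pairwise corr(a, c):", round(float(pairwise[0, 2]), 3))
print("partial corr(a, c | b):", round(float(partial[0, 2]), 3))
print("partial corr(a, b | c):", round(float(partial[0, 1]), 3))
```

In this toy setting the direct edges (`a` with `b`, `b` with `c`) keep large partial correlations while the indirect one (`a` with `c`) is suppressed, which is the behavior the abstract attributes to conditional dependence. Plain matrix inversion of this kind becomes unstable when columns are nearly collinear, which is the situation the abstract says the group graphical model is meant to address.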