Molecular phenotypes that are causal to complex traits can have low heritability and are expected to have small influence.
Work on the genetic makeup of complex traits has led to some unexpected findings. Molecular trait heritability estimates have consistently been lower than those of common diseases, even though it is intuitively expected that the genotype signal weakens as it becomes more dissociated from DNA. Further, results from very large studies have not been sufficient to explain most of the heritable signal, and suggest hundreds if not thousands of responsible alleles. Here, I demonstrate how trait heritability depends crucially on the definition of the phenotype, and is influenced by the variability of the assay, the measurement strategy, and the quantification approach used. For a phenotype downstream of many molecular traits, it is possible that its heritability is larger than that of any of its upstream determinants. I also rearticulate via models and data that if a phenotype has many dependencies, a large number of small-effect alleles is expected. However, even if these alleles do drive highly heritable causal intermediates that can be modulated, it does not follow that large changes in phenotype can be obtained.
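The dependence of heritability on assay variability described above can be illustrated with a minimal sketch. It assumes a simple additive variance decomposition (genetic, environmental, and measurement variance) that the abstract does not itself spell out; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
def observed_heritability(var_genetic, var_environment, var_measurement):
    """Fraction of observed phenotypic variance attributable to genotype.

    Assumes the three variance components are independent and additive
    (an illustrative assumption, not a model taken from the abstract).
    """
    total = var_genetic + var_environment + var_measurement
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total


# Same underlying trait, measured without and with assay noise
# (made-up variance values).
h2_clean = observed_heritability(0.4, 0.6, 0.0)
h2_noisy = observed_heritability(0.4, 0.6, 1.0)
```

Under these assumptions, adding measurement variance lowers the estimated heritability even though the genetic contribution is unchanged, which is one way a noisily assayed molecular trait could appear less heritable than a phenotype downstream of it.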
An Approximate Bayesian Computation Approach to Examining the Phylogenetic Relationships among the Four Gibbon Genera using Whole Genome Sequence Data
Krishna Veeramah, August E Woerner, Laurel Johnstone, Ivo Gut, Marta Gut, Tomas Marques-Bonet, Lucia Carbone, Jeff D Wall, Michael F Hammer
Gibbons are believed to have diverged from the larger great apes ~16.8 Mya and today reside in the rainforests of Southeast Asia. Based on their diploid chromosome number, the family Hylobatidae is divided into four genera: Nomascus, Symphalangus, Hoolock and Hylobates. Genetic studies attempting to elucidate the phylogenetic relationships among gibbons using karyotypes, mtDNA, the Y chromosome, and short autosomal sequences have been inconclusive. To examine the relationships among gibbon genera in more depth, we performed second-generation whole genome sequencing to a mean of ~15X coverage in two individuals from each genus. We developed a coalescent-based Approximate Bayesian Computation method incorporating a model of sequencing error generated by high coverage exome validation to infer the branching order, divergence times, and effective population sizes of gibbon taxa. Although Hoolock and Symphalangus are likely sister taxa, we could not confidently resolve a single bifurcating tree despite the large amount of data analyzed. Our combined results support the hypothesis that all four gibbon genera diverged at approximately the same time. Assuming an autosomal mutation rate of 1×10^-9/site/year, this speciation process occurred ~5 Mya during a period in the Early Pliocene characterized by climatic shifts and fragmentation of the Sunda shelf forests. Whole genome sequencing of additional individuals will be vital for inferring the extent of gene flow among species after the separation of the gibbon genera.
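For readers unfamiliar with Approximate Bayesian Computation, the general idea can be sketched as a rejection sampler. This is a toy example and not the authors' coalescent-based method: the simulator, the prior, the summary statistic, and every name and number below are hypothetical stand-ins chosen to keep the sketch self-contained.

```python
import random


def abc_rejection(observed_stat, simulate, sample_prior, tolerance, n_draws, rng):
    """Keep prior draws whose simulated summary statistic lies within
    `tolerance` of the observed statistic (basic rejection ABC)."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        if abs(simulate(theta, rng) - observed_stat) <= tolerance:
            accepted.append(theta)
    return accepted


def sample_prior(rng):
    # Hypothetical flat prior on a single parameter.
    return rng.uniform(0.0, 10.0)


def simulate(theta, rng):
    # Toy simulator: summary statistic is the mean of 20 noisy observations.
    return sum(rng.gauss(theta, 1.0) for _ in range(20)) / 20


rng = random.Random(0)  # fixed seed so the run is repeatable
posterior = abc_rejection(5.0, simulate, sample_prior, 0.25, 5000, rng)
posterior_mean = sum(posterior) / len(posterior)
```

The accepted draws approximate the posterior of the parameter given the observed statistic. In a phylogenetic setting the simulator would instead be a coalescent model and the parameter vector would include branching order, divergence times, and effective population sizes, as the abstract describes.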
Virulence genes are a signature of the microbiome in the colorectal tumor microenvironment
Michael B Burns, Joshua Lynch, Timothy K Starr, Dan Knights, Ran Blekhman
Background: The human gut microbiome is associated with the development of colon cancer, and recent studies have found changes in the composition of the microbial communities in cancer patients compared to healthy controls. However, host-bacteria interactions are mainly expected to occur in the cancer microenvironment, whereas current studies primarily use stool samples to survey the microbiome. Here, we highlight the major shifts in the colorectal tumor microbiome relative to that of matched normal colon tissue from the same individual. This design allows us to survey the microbial communities in the tumor microenvironment and provides an intrinsic control for environmental and host genetic effects on the microbiome. Results: We characterized the microbiome in 44 primary tumor and 44 patient-matched normal colon tissues. We find that tumors harbor distinct microbial communities compared to nearby healthy tissue. Our results show increased microbial diversity in the tumor microenvironment, with changes in the abundances of commensal and pathogenic bacterial taxa, including Fusobacterium and Providencia. While Fusobacterium has previously been implicated in CRC, Providencia is a novel tumor-associated agent, and has several features that make it a potential cancer driver, including a strongly immunogenic LPS and an ability to damage colorectal tissue. Additionally, we identified a significant enrichment of virulence-associated genes in the colorectal cancer microenvironment. Conclusions: This work identifies bacterial taxa significantly correlated with colorectal cancer, including a novel finding of an elevated abundance of Providencia in the tumor microenvironment. We also describe several metabolic pathways and enzymes differentially present in the tumor-associated microbiome, and show that the bacterial genes in the tumor microenvironment are enriched for virulence-associated genes from the aggregate microbial community.
This virulence enrichment indicates that the microbiome likely plays an active role in colorectal cancer development and/or progression. These results provide a starting point for future prognostic and therapeutic research with the potential to improve patient outcomes.
Accounting for eXentricities: Analysis of the X chromosome in GWAS reveals X-linked genes implicated in autoimmune diseases
Diana Chang, Feng Gao, Li Ma, Aaron Sams, Andrea Slavney, Yedael Waldman, Paul Billing-Ross, Aviv Madar, Richard Spritz, Alon Keinan
Many complex human diseases are highly sexually dimorphic, which suggests a potential contribution of the X chromosome. However, the X chromosome has been neglected in most genome-wide association studies (GWAS). We present tailored analytical methods and software that facilitate X-wide association studies (XWAS), which we applied to reanalyze data from 16 GWAS of different autoimmune diseases (AID). We associated several X-linked genes with disease risk, among which ARHGEF6 is associated with Crohn’s disease and replicated in a study of ulcerative colitis, another inflammatory bowel disease (IBD). Indeed, ARHGEF6 interacts with a gastric bacterium that has been implicated in IBD. Additionally, we found that the centromere protein CENPI is associated with three different AID; replicated a previously investigated association of FOXP3, which regulates genes involved in T-cell function, in vitiligo; and discovered that C1GALT1C1 exhibits a sex-specific effect on disease risk in both IBDs. These and other X-linked genes that we associated with AID tend to be highly expressed in tissues related to immune response, display differential gene expression between males and females, and participate in major immune pathways. Combined, the results demonstrate the importance of the X chromosome in autoimmunity, reveal the potential of XWAS, even based on existing data, and provide the tools and incentive to appropriately include the X chromosome in future studies.
Accounting for experimental noise reveals that transcription dominates control of steady-state protein levels in yeast
Gábor Csárdi, Alexander Franks, David S. Choi, Eduardo M. Airoldi, D. Allan Drummond
Cells respond to their environment by modulating protein levels through mRNA transcription and post-transcriptional control. Modest correlations between global steady-state mRNA and protein measurements have been interpreted as evidence that transcript levels determine roughly 40% of the variation in protein levels, indicating dominant post-transcriptional effects. However, the techniques underlying these conclusions, such as correlation and regression, yield biased results when data are noisy, missing systematically, and collinear—properties of mRNA and protein measurements—which motivated us to revisit this subject. Noise-robust analyses of 25 studies of budding yeast reveal that mRNA levels explain roughly 80% of the variation in steady-state protein levels. Post-transcriptional regulation amplifies rather than competes with the transcriptional signal. Measurements are highly reproducible within but not between studies, and are distorted in part by between-study differences in gene expression. These results substantially revise current models of protein-level regulation and introduce multiple noise-aware approaches essential for proper analysis of many biological phenomena.
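The way measurement noise biases a correlation downward can be made concrete with Spearman's classical correction for attenuation. This is a generic textbook correction offered as a sketch; it is not claimed here to be the specific noise-robust analysis used in the study, and the reliabilities and correlation below are invented numbers for illustration.

```python
import math


def disattenuated_correlation(r_observed, reliability_x, reliability_y):
    """Spearman's correction: estimate the correlation between true values
    from the observed correlation and each measurement's reliability
    (for example, the correlation between replicate measurements)."""
    if not (0 < reliability_x <= 1 and 0 < reliability_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# Illustrative values only: an observed mRNA-protein correlation of 0.6
# with reliabilities of 0.8 (mRNA) and 0.5 (protein).
r_true = disattenuated_correlation(0.6, 0.8, 0.5)
variance_explained_observed = 0.6 ** 2
variance_explained_corrected = r_true ** 2
```

With these made-up inputs, the observed correlation implies that 36% of the variance is explained, while the corrected estimate implies about 90%. The correction can exceed 1 when reliabilities are underestimated, so it should be read as a rough guide rather than a bounded estimator.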
The genetic architecture of neurodevelopmental disorders
Kevin J Mitchell
Neurodevelopmental disorders include rare conditions caused by identified single mutations, such as Fragile X, Down and Angelman syndromes, and much more common clinical categories such as autism, epilepsy and schizophrenia. These common conditions are all highly heritable but their genetics is considered to be “complex”. In fact, this sharp dichotomy in genetic architecture between rare and common disorders may be largely artificial. On the one hand, much of the apparent complexity in the genetics of common disorders may derive from underlying genetic heterogeneity, which has remained obscure until recently. On the other hand, even for supposedly Mendelian conditions, the relationship between single mutations and clinical phenotypes is rarely simple. The categories of monogenic and complex disorders may therefore merge across a continuum, with some mutations being strongly associated with specific syndromes and others having a more variable outcome, modified by the presence of additional genetic variants.
MUSiCC: Towards an accurate estimation of average genomic copy-numbers in the human microbiome
Ohad Manor, Elhanan Borenstein
Functional metagenomic analyses commonly involve a normalization step, where measured levels of genes or pathways are converted into relative abundances. Here, we demonstrate that this normalization scheme introduces marked biases both across and within human microbiome samples and systematically identify various sample- and gene-specific properties that contribute to these biases. We introduce an alternative normalization paradigm, MUSiCC, which combines universal single-copy genes with machine learning methods to correct these biases and to obtain a more accurate and biologically meaningful measure of gene abundances. Finally, we demonstrate that MUSiCC significantly improves downstream discovery of functional shifts in the microbiome. MUSiCC is available at http://elbo.gs.washington.edu/software.html.
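The core idea of normalizing by universal single-copy genes, rather than converting to relative abundances, can be sketched as follows. This is a simplified illustration and not MUSiCC's actual implementation or interface: it omits the machine-learning correction step entirely, and the function name, gene identifiers, and abundance values are hypothetical.

```python
from statistics import median


def normalize_by_single_copy_genes(abundances, single_copy_genes):
    """Scale one sample's gene abundances so that the median abundance of
    the universal single-copy genes equals 1. Scaled values then
    approximate the average copy number per genome in that sample."""
    reference = [abundances[g] for g in single_copy_genes if g in abundances]
    if not reference:
        raise ValueError("no single-copy genes found in this sample")
    scale = median(reference)
    if scale <= 0:
        raise ValueError("single-copy gene abundance must be positive")
    return {gene: value / scale for gene, value in abundances.items()}


# Hypothetical sample: three single-copy marker genes and one gene of interest.
sample = {"uscg_a": 10.0, "uscg_b": 12.0, "uscg_c": 11.0, "gene_x": 22.0}
normalized = normalize_by_single_copy_genes(sample, ["uscg_a", "uscg_b", "uscg_c"])
```

In this toy sample, gene_x is scaled to 2.0, which reads as roughly two copies per genome on average. Unlike relative abundance, this scaling does not change when unrelated genes in the sample become more or less abundant.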