Proteins linked to autosomal dominant and autosomal recessive disorders harbor characteristic rare missense mutation distribution patterns
Tychele Turner, Christopher Douville, Dewey Kim, Peter D Stenson, David N Cooper, Aravinda Chakravarti, Rachel Karchin
The role of rare missense variants in disease causation remains difficult to interpret. We explore whether the clustering pattern of rare missense variants (minor allele frequency, MAF < 0.01) in a protein is associated with mode of inheritance. Mutations in genes associated with autosomal dominant (AD) conditions are known to result in either loss or gain of function, whereas mutations in genes associated with autosomal recessive (AR) conditions invariably result in loss of function. Loss-of-function mutations tend to be distributed uniformly along the protein sequence, while gain-of-function mutations tend to localize to key regions. It has not previously been ascertained whether these patterns hold in general for rare missense mutations. We consider the extent to which rare missense variants are located within annotated protein domains and whether they form clusters, using a new unbiased method called CLUstering by Mutation Position (CLUMP). Both approaches revealed a significant difference in clustering between AD and AR diseases. Proteins linked to AD diseases exhibited more clustering of rare missense mutations than those linked to AR diseases (Wilcoxon P=5.7×10⁻⁴, permutation P=8.4×10⁻⁴). Rare missense mutations in proteins linked to either AD or AR diseases were more clustered than those in controls from the 1000 Genomes Project (1000G) (Wilcoxon P=2.8×10⁻¹⁵ for AD and P=4.5×10⁻⁴ for AR; permutation P=3.1×10⁻¹² for AD and P=0.03 for AR). Differences in clustering patterns persisted even after removal of the most prominent genes. Testing for such non-random patterns may reveal novel aspects of disease etiology in large sample studies.
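The group comparisons reported above rest on Wilcoxon and permutation tests. As an illustration only, the following Python sketch shows how a one-sided permutation test on per-protein clustering scores could be set up. It is not the CLUMP method or the authors' code: the function name `permutation_p_value`, the use of the difference in means as the test statistic, the convention that a higher score means more clustering, and the toy scores are all assumptions made for this example.

```python
import random
from statistics import mean


def permutation_p_value(scores_a, scores_b, n_permutations=10000, seed=0):
    """One-sided permutation test for mean(scores_a) > mean(scores_b).

    Hypothetical helper: scores are per-protein clustering values, and the
    test statistic (difference in group means) is an illustrative choice.
    """
    rng = random.Random(seed)
    observed = mean(scores_a) - mean(scores_b)
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)

    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random by shuffling the pooled scores.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            hits += 1

    # Add-one correction so the estimated P value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Toy, made-up scores; not data from the study.
    ad_like = [5, 6, 7, 8, 9, 10]
    ar_like = [0, 1, 2, 3, 4, 5]
    print(permutation_p_value(ad_like, ar_like, n_permutations=2000))
```

Shuffling the group labels approximates the null distribution of the statistic without distributional assumptions, which is why a permutation test is a natural companion to a rank-based test such as the Wilcoxon test.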