Extended Data Fig. 7: Characterization of genetic variants in C-terminal IDRs.
From: Aberrant phase separation and nucleolar dysfunction in rare genetic diseases

(a) Scheme of the IDR catalog identification algorithm. (b) Summary of all variants identified in C-terminal IDRs. (c) Frameshift variants are enriched for pathogenic variants. (d) Gene Ontology (GO) term enrichment analysis of (left) genes that contain at least one ‘stop gained’ (i.e., truncating) mutation in the catalog; (middle) genes that contain at least one frameshift (≥20 amino acids) in the catalog; (right) genes that contain at least one frameshift (≥20 amino acids) that creates a sequence consisting of at least 15% arginine. (e) pLI score distributions for the indicated gene sets. Disease genes: genes that have at least one “pathogenic”, “likely pathogenic”, or “conflicting interpretations” entry in ClinVar. (f) Word cloud of diseases associated with ‘stop gained’ (i.e., truncating) variants. (g) Word cloud of diseases associated with frameshift variants that create a sequence at least 20 amino acids long. (h) Word cloud of diseases associated with frameshift variants that create a sequence at least 20 amino acids long and consisting of at least 15% arginine. In panels f to h, the font size of each word correlates with its frequency of occurrence.
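
The two frameshift thresholds used in panels (d), (g), and (h), a new sequence of at least 20 amino acids and an arginine content of at least 15%, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the category labels, and the assumption that the input is the novel peptide sequence created downstream of the frameshift (one-letter amino acid codes) are all hypothetical and chosen only for illustration.

```python
def arginine_percent_at_least(seq: str, min_percent: int) -> bool:
    """Return True if at least `min_percent` percent of residues are arginine ('R').

    Integer arithmetic is used so that a sequence sitting exactly on the
    threshold (e.g. 3 of 20 residues = 15%) is not lost to rounding.
    """
    seq = seq.upper()
    if not seq:
        return False
    return seq.count("R") * 100 >= min_percent * len(seq)


def classify_frameshift(new_seq: str, min_len: int = 20, min_arg_percent: int = 15) -> str:
    """Assign a frameshift-derived sequence to one of three hypothetical labels.

    new_seq is assumed to be the novel amino acid sequence created by the
    frameshift. The defaults mirror the thresholds stated in the caption.
    """
    if len(new_seq) < min_len:
        return "short"            # does not meet the >=20 amino acid criterion
    if arginine_percent_at_least(new_seq, min_arg_percent):
        return "long_arg_rich"    # >=20 amino acids and >=15% arginine
    return "long"                 # >=20 amino acids, below the arginine threshold


if __name__ == "__main__":
    # Toy sequences, not real variant data.
    for peptide in ["A" * 10, "A" * 20, "RRR" + "A" * 17]:
        print(len(peptide), classify_frameshift(peptide))
```

Under these assumptions, every sequence labelled `long_arg_rich` also satisfies the length criterion, so the arginine-rich gene set is a subset of the ≥20 amino acid frameshift set.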