Fig. 2: ClinVar classification and AlphaMissense pathogenicity scores for variants in individuals with inflammatory bowel diseases in a rare disease cohort.

a Number and proportion of patients with inflammatory bowel diseases (IBD) in the cohort carrying loss-of-function (LoF) variants, ClinVar pathogenic (ClinVar_P) or likely pathogenic (ClinVar_LP) variants, or variants predicted as likely pathogenic by AlphaMissense (AM) (left panel). Composition of patients according to the combination of these variant types found in each individual (right panel). b Distribution of AlphaMissense pathogenicity scores for ClinVar variants and other missense variants in candidate genes. Vertical lines indicate the AlphaMissense thresholds for likely benign (0.34) and likely pathogenic (0.564) classifications. c–f Each panel corresponds to a different prediction method: c AlphaMissense, d REVEL, e ESM1b, and f BayesDel. The experimentally validated protein domain structure of COL7A1 (UniProt accession number: Q02388) is shown, with intrinsically disordered regions (IDRs) indicated by orange boxes at the top. Pathogenicity scores are plotted against amino acid position as colored dots, with color scales reflecting each method's scoring system. ClinVar pathogenic and likely pathogenic variants are marked with red and salmon circles, respectively. Notably, many of these variants are misclassified as likely benign, especially in IDRs (gray highlight), where the AlphaFold2 model confidence score (pLDDT) is below 50. A black arrow points to a pathogenic variant found in our cohort that AM misclassified.
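As a minimal sketch of how the two thresholds in panel b partition the AlphaMissense score range, the following Python function assigns a class to a single score. The function name, the label strings, and the handling of scores exactly at a threshold (treated here as ambiguous) are assumptions made for illustration, not details taken from the figure.

```python
# Thresholds quoted in the caption for AlphaMissense classifications.
LIKELY_BENIGN_THRESHOLD = 0.34
LIKELY_PATHOGENIC_THRESHOLD = 0.564


def classify_am_score(score: float) -> str:
    """Map an AlphaMissense pathogenicity score (0 to 1) to a class label.

    Assumption: scores exactly on a threshold fall into the ambiguous class.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score}")
    if score < LIKELY_BENIGN_THRESHOLD:
        return "likely_benign"
    if score > LIKELY_PATHOGENIC_THRESHOLD:
        return "likely_pathogenic"
    return "ambiguous"


if __name__ == "__main__":
    # Hypothetical scores, chosen only to exercise each branch.
    for s in (0.10, 0.45, 0.90):
        print(s, classify_am_score(s))
```

Under this reading, a ClinVar pathogenic variant would count as misclassified by AlphaMissense when its score falls below 0.34.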