Extended Data Fig. 9: Random forest model predictions for FIX variants in the EAHAD FIX Variant Database associated with hemophilia B.
From: Multiplex and multimodal mapping of variant effects in secreted proteins via MultiSTEP

a, Spearman correlation of MultiSTEP functional scores with the EVE, AlphaMissense, REVEL, and CADD variant effect predictors. b, Histograms of scores from the four variant effect predictors for F9 missense variants of known effect curated from ClinVar, gnomAD, and MLOF. Color indicates clinical variant interpretation. Black dashed vertical lines indicate the classification thresholds for each predictor. For AlphaMissense, we used the thresholds recommended in the original publication for 90% precision on existing ClinVar-annotated variants (≤0.34: benign, 0.34–0.564: uncertain, ≥0.564: pathogenic). For REVEL, we used the thresholds used in the initial publication to assess REVEL’s precision in ClinVar (<0.5: benign, 0.5: uncertain, >0.5: pathogenic). For EVE, we used the thresholds recommended in the original publication for the 75% most confident classifications (≤0.359: benign, 0.359–0.641: uncertain, ≥0.641: pathogenic). For CADD, we used the same thresholds as the MLOF clinical laboratory (<10: benign, 10–20: uncertain, >20: pathogenic). The number of variants scored by each predictor is annotated. c, Classification accuracy for F9 missense variants of known effect curated from ClinVar, gnomAD, and MLOF in our test set (benign/likely benign, n = 4 variants; pathogenic/likely pathogenic, n = 34 variants) by the MultiSTEP variant function classifier and the four variant effect predictors, using the thresholds defined in b. True benign/likely benign and pathogenic/likely pathogenic labels are denoted on the x-axis, and columns are colored according to the classification made by each method. Solid colors indicate correct classification, whereas striped colors indicate incorrect classification. For variant effect predictors, missing variants are colored gray with stripes and uncertain predictions are colored yellow with stripes. PPV, positive predictive value; NPV, negative predictive value; Spec, specificity; Sens, sensitivity.
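
The thresholding in b and the summary statistics in c can be illustrated with a minimal Python sketch. This is not the authors' code. The cutoffs are those quoted in the caption, while the names `THRESHOLDS`, `classify`, and `summarize` are hypothetical. The caption does not state how uncertain or missing predictions enter the statistics, so the sketch assumes they count against sensitivity and specificity but are left out of the PPV and NPV denominators.

```python
from typing import Dict, List, Optional, Tuple

# (benign cutoff, pathogenic cutoff, cutoffs inclusive?) as quoted in the caption.
THRESHOLDS: Dict[str, Tuple[float, float, bool]] = {
    "AlphaMissense": (0.34, 0.564, True),   # <=0.34 benign, >=0.564 pathogenic
    "REVEL": (0.5, 0.5, False),             # <0.5 benign, >0.5 pathogenic
    "EVE": (0.359, 0.641, True),            # <=0.359 benign, >=0.641 pathogenic
    "CADD": (10.0, 20.0, False),            # <10 benign, >20 pathogenic
}


def classify(predictor: str, score: Optional[float]) -> str:
    """Map a predictor score to benign / uncertain / pathogenic / missing."""
    if score is None:
        return "missing"  # variant not scored by this predictor
    low, high, inclusive = THRESHOLDS[predictor]
    if inclusive:
        if score <= low:
            return "benign"
        if score >= high:
            return "pathogenic"
    else:
        if score < low:
            return "benign"
        if score > high:
            return "pathogenic"
    return "uncertain"


def summarize(truth: List[str], calls: List[str]) -> Dict[str, Optional[float]]:
    """Sens, Spec, PPV, NPV with pathogenic as the positive class.

    Assumption (not from the caption): uncertain and missing calls count
    against Sens/Spec and are excluded from the PPV/NPV denominators.
    """
    pairs = list(zip(truth, calls))
    tp = sum(t == "pathogenic" and c == "pathogenic" for t, c in pairs)
    tn = sum(t == "benign" and c == "benign" for t, c in pairs)

    def ratio(num: int, den: int) -> Optional[float]:
        return num / den if den else None  # undefined when denominator is 0

    return {
        "Sens": ratio(tp, truth.count("pathogenic")),
        "Spec": ratio(tn, truth.count("benign")),
        "PPV": ratio(tp, calls.count("pathogenic")),
        "NPV": ratio(tn, calls.count("benign")),
    }
```

For example, `classify("REVEL", 0.5)` returns `"uncertain"`, because REVEL's uncertain band in the caption is the single value 0.5, and `classify("CADD", None)` returns `"missing"`.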