Fig. 5: JARVIS performance on validation sets.

ROC curves for JARVIS predictions on different sets of noncoding variants (falling into intergenic regions, UTRs, lincRNAs, UCNEs, and VISTA enhancers) that were not used during JARVIS training. In each case, a benign set of equal size was randomly subsampled from the denovo-db control variants, avoiding any overlap with the pathogenic variants in the corresponding validation set. a GWAS hit SNVs (n = 1262). b Noncoding variants associated with Mendelian traits (n = 118). c Generalization test set (ncRNA; n = 70). d Generalization test set (other; n = 34). In each plot, n is the total size of the validation set, comprising the pathogenic variants and an equal-sized sample of control variants.
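
The balanced sampling described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names `balanced_validation_set` and `roc_auc`, the `(chrom, pos, ref, alt)` tuple representation of a variant, and all toy variants and scores are assumptions made for illustration only. The AUC is computed here as the probability that a randomly chosen pathogenic variant scores higher than a randomly chosen control, which equals the area under the ROC curve.

```python
import random


def balanced_validation_set(pathogenic, controls, seed=0):
    """Pair pathogenic variants with an equal-sized random sample of
    controls, excluding any control that overlaps a pathogenic variant.
    (Hypothetical helper; variants are assumed to be hashable tuples.)"""
    pathogenic_set = set(pathogenic)
    # Drop controls that coincide with a pathogenic variant.
    eligible = sorted(v for v in set(controls) if v not in pathogenic_set)
    if len(eligible) < len(pathogenic_set):
        raise ValueError("not enough non-overlapping control variants")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    benign = rng.sample(eligible, len(pathogenic_set))
    return sorted(pathogenic_set), benign


def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy data (assumed): two pathogenic variants, four candidate controls,
# one of which overlaps a pathogenic variant and must be excluded.
pathogenic = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")]
controls = [
    ("chr1", 100, "A", "G"),
    ("chr3", 300, "G", "A"),
    ("chr4", 400, "T", "C"),
    ("chr5", 500, "A", "C"),
]

pos, neg = balanced_validation_set(pathogenic, controls)
n_total = len(pos) + len(neg)  # corresponds to n in the caption: both classes
auc = roc_auc([0.9, 0.8], [0.1, 0.8])  # toy scores, not JARVIS output
```

Under this convention, a panel reporting n = 118 would contain 59 pathogenic variants and 59 sampled controls.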