Fig. 3: Functional validation of transcript-expression based annotation.
From: Transcript expression-aware annotation improves rare variant interpretation

a, We define highly conserved and unconserved regions as phyloCSF > 1,000 (n = 9,817) and phyloCSF < −100 (n = 11,860), respectively, and compare the expression status of these regions across GTEx. Regions with high phyloCSF scores are enriched for near-constitutive expression, whereas unconserved regions are enriched for little to no usage across GTEx. This difference is significant after correcting for gene length (logistic regression P < 1 × 10^−100). We note that unconserved regions with high levels of expression (pext > 0.9) are enriched for immune-related genes, which are selected for diversity and thus have low conservation, but represent true coding regions. b, Transcript-expression based annotation recapitulates, and adds information to, existing interpretation tools. High-confidence pLoF LOFTEE variants in gnomAD with no flags (n = 458,880) are enriched for higher pext values, whereas high-confidence pLoF variants falling on low phyloCSF (n = 44,373) or unlikely open-reading frame regions (n = 2,437) are enriched for low expression. However, high-confidence pLoF variants can also have a low pext score. Variants flagged as falling on regions that are unlikely open-reading frames or have weak conservation are enriched for lower pext values. Red dots denote the median pext value across GTEx. c, Non-synonymous variants found on near-constitutive regions tend to be more deleterious. We compared the MAPS score for variants with low (pext < 0.1), medium (0.1 ≤ pext ≤ 0.9) and high (pext > 0.9) expression. Variants with near-constitutive expression have a higher MAPS score, which indicates higher deleteriousness than those with little to no evidence of expression. Points represent MAPS values and error bars denote the 95% confidence interval. Dashed grey and orange lines represent MAPS values for all gnomAD missense and synonymous variants, respectively.
The number of variants evaluated per category and unadjusted proportion singleton values can be found in Supplementary Table 5a.
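
The expression bins used in panel c can be illustrated with a minimal Python sketch. It assigns each variant to the low, medium or high pext category using the thresholds stated above and computes the unadjusted proportion of singletons per category. The function names, the `(pext, allele_count)` input format and the toy data are assumptions made for illustration only; the sketch does not compute MAPS itself, which additionally adjusts the singleton proportion for mutability.

```python
from collections import defaultdict


def pext_bin(pext):
    """Assign a pext value to the low / medium / high bins from the caption."""
    if pext < 0.1:
        return "low"
    if pext <= 0.9:
        return "medium"
    return "high"


def proportion_singleton_by_bin(variants):
    """Unadjusted proportion of singletons per pext bin.

    `variants` is an iterable of (pext, allele_count) pairs; this input
    format is hypothetical and chosen only for illustration.
    """
    totals = defaultdict(int)
    singletons = defaultdict(int)
    for pext, allele_count in variants:
        category = pext_bin(pext)
        totals[category] += 1
        if allele_count == 1:  # a singleton is seen exactly once
            singletons[category] += 1
    return {category: singletons[category] / totals[category] for category in totals}


# Toy data, not taken from gnomAD or GTEx.
toy_variants = [
    (0.05, 3), (0.02, 1),             # low expression
    (0.50, 1), (0.50, 7),             # medium expression
    (0.95, 1), (0.99, 1), (0.93, 2),  # high expression
]
result = proportion_singleton_by_bin(toy_variants)
```

Note that the boundary values 0.1 and 0.9 both fall in the medium bin, matching the inequality 0.1 ≤ pext ≤ 0.9 given for panel c.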