Fig. 3: NVEs tend to impact more mutationally constrained genes and provide additional functional interpretations in GWAS.
From: Widespread naturally variable human exons aid genetic interpretation

A Distribution of LOEUF scores for genes with NVEs (purple) and genes without any NVEs (black), in bins of width 0.1. The distributions differ significantly (KS test, p = 10^−212). B Percentage of genes with NVEs among constrained and unconstrained genes, matched on gene expression in Whole Blood. For each gene, the median TPM across all individuals was taken as its average expression in blood. Genes were classified as either constrained (bottom quintile of LOEUF scores in blood) or unconstrained (top quintile of LOEUF scores in blood). Constrained genes were then filtered for those with a median expression above 1 TPM and matched with unconstrained genes of similar expression, n = 178. Significant by a two-sided normal-approximation z-test of proportions in scipy, p = 10^−11. C Number of genes in each LOEUF quintile, stratified by NVE EF bin; lower LOEUF values indicate more constraint. Genes that contain low-EF NVEs are enriched among the most constrained genes, whereas genes with high-EF NVEs are largely unconstrained. Here, each gene's EF is that of its most frequent NVE. D GWAS results from three global biobanks (UKBB, FG release 9, and BBJ) were pooled. We included unannotated canonically alternative exons found in the NVE discovery process to boost power. For the unannotated splice site set, pLOF, annotated splice region, nonsense, and missense variants were filtered out to ensure that enrichment was not driven by well-explained variants. The n values give the number of variants categorized as falling within an unannotated splice site. Estimates are shown as mean ± 95% CIs of a binomial estimate given n.
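
As a rough illustration of the statistics named in panels B and D, the sketch below implements a two-sided pooled z-test for two proportions and a 95% interval for a binomial proportion. It is not the authors' code: it uses only the Python standard library rather than scipy, the function names are hypothetical, the counts in the usage lines are invented for illustration, and the Wald (normal-approximation) interval is an assumption, since the caption does not say which binomial interval was used.

```python
import math


def two_proportion_ztest(hits_a, n_a, hits_b, n_b):
    """Two-sided pooled z-test for a difference between two proportions.

    Returns (z, p_value). Hypothetical helper, not taken from the paper.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; z is undefined")
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def binomial_ci95(hits, n):
    """Wald (normal-approximation) 95% CI for a binomial proportion.

    Assumed method; returns (estimate, lower, upper), clipped to [0, 1].
    """
    p = hits / n
    half_width = 1.959964 * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Invented counts: 178 matched genes per group, as in panel B's n,
# but the numbers of genes with NVEs are made up for this example.
z, p_value = two_proportion_ztest(120, 178, 60, 178)
estimate, lower, upper = binomial_ci95(120, 178)
print(f"z = {z:.2f}, p = {p_value:.2e}")
print(f"estimate = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The Wald interval is the simplest choice but behaves poorly when n is small or the proportion is near 0 or 1; a Wilson or exact interval would be a reasonable substitute in those cases.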