Fig. 2 | Nature Communications

Fig. 2

From: Large-scale genome-wide enrichment analyses identify new trait-associated genes and pathways across 31 human phenotypes

Comparison of RSS-E to other methods for identifying enrichments from GWAS summary statistics. We used real genotypes (ref. 12) to simulate individual-level data under two genetic architectures (“sparse” and “polygenic”) with four baseline-enrichment patterns: (a) baseline and enrichment datasets followed the baseline (M0) and enrichment (M1) models of RSS-E, respectively; (b) baseline datasets assumed that a random set of near-gene SNPs was enriched for genetic associations, and enrichment datasets followed M1; (c) baseline datasets assumed that a random set of coding SNPs was enriched for genetic associations, and enrichment datasets followed M1; (d) baseline datasets followed M0, and enrichment datasets assumed that trait-associated SNPs were both more frequent and had larger effects inside than outside the target gene set. We computed the corresponding single-SNP summary statistics and, on these summary data, compared RSS-E with Pascal (ref. 13) and LDSC (ref. 14) using their default setups. Pascal includes two gene scoring options, maximum-of-χ² (-max) and sum-of-χ² (-sum), and two pathway scoring options, χ² approximation (-chi) and empirical sampling (-emp). For each simulated dataset, both Pascal and LDSC produced enrichment p values, whereas RSS-E produced an enrichment Bayes factor (BF); these statistics were used to rank the significance of enrichments. Each panel displays the trade-off between false and true enrichment discoveries for all methods in 200 baseline and 200 enrichment datasets of a given simulation scenario and reports the corresponding areas under the curve (AUCs), where a higher value indicates better performance. Simulation details and additional results are provided in Supplementary Figures 1–4.
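The ranking-based comparison described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function names `auc` and `p_to_score` are hypothetical, the toy p values are invented for illustration only, and the AUC is computed here through the standard pairwise (Mann-Whitney) identity rather than by tracing the curve, which may differ from how the published AUCs were obtained. The only requirement is that each method's statistic be converted to a score where larger means stronger evidence of enrichment (for example, -log10 of a p value, or the log of a BF).

```python
import math


def p_to_score(p):
    """Convert a p value to a score where larger means stronger evidence."""
    return -math.log10(p)


def auc(baseline_scores, enrichment_scores):
    """Area under the true-vs-false discovery curve.

    Computed as the fraction of (enrichment, baseline) dataset pairs in
    which the enrichment dataset receives the higher score; ties count
    as one half. This equals the area under the ROC curve.
    """
    wins = 0.0
    for e in enrichment_scores:
        for b in baseline_scores:
            if e > b:
                wins += 1.0
            elif e == b:
                wins += 0.5
    return wins / (len(baseline_scores) * len(enrichment_scores))


# Toy p values, invented for illustration (NOT results from the paper).
toy_baseline_p = [0.8, 0.3, 0.05, 0.6]
toy_enrichment_p = [0.01, 0.2, 0.001, 0.4]

toy_auc = auc(
    [p_to_score(p) for p in toy_baseline_p],
    [p_to_score(p) for p in toy_enrichment_p],
)
print(f"toy AUC = {toy_auc:.4f}")
```

Under this convention an AUC of 0.5 corresponds to a method that cannot separate enrichment from baseline datasets, and an AUC of 1 to perfect separation. Because only the ordering of scores matters, p values and BFs can be compared on the same footing even though they sit on different scales.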
