Fig. 3

Comparison of RSS-E to other methods for identifying gene-level associations from GWAS summary statistics. We used real genotypes12 to simulate individual-level data with and without enrichment in the target gene set (a “baseline”; b “enrichment”), each under two genetic architectures (“sparse” and “polygenic”), and then computed the corresponding single-SNP summary statistics. On these summary data, we compared RSS-E with four other methods: SimpleM17, VEGAS18, GATES19, and COMBAT20. We applied VEGAS to the full set of SNPs (-sum), to a specified percentage of the most significant SNPs (-10% and -20%), and to the single most significant SNP (-max), within 100 kb of the transcribed region of each gene. All four methods are available in the package COMBAT (Methods). For each simulated dataset, we defined a gene as “trait-associated” if at least one SNP within 100 kb of the transcribed region of this gene had a nonzero effect. For each gene in each dataset, RSS-E produced the posterior probability that the gene was trait-associated, whereas the other methods produced association p values; these statistics were used to rank the significance of gene-level associations. Each panel displays the trade-off between false and true gene-level associations for all methods in 100 datasets of a given simulation scenario, and reports the corresponding AUCs. Simulation details and additional results are provided in Supplementary Figures 6 and 7.
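
The ranking-based evaluation described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function names (`gene_is_associated`, `auc`), the gene labels, and all numbers are hypothetical toy values. It assumes that genes are ranked by posterior probability (larger means stronger evidence) or by p value (smaller means stronger evidence, so the sign is flipped), and it computes the AUC for a single toy dataset with the rank-sum identity rather than over 100 simulated datasets.

```python
def gene_is_associated(snp_effects):
    """A gene is 'trait-associated' if any SNP in its window has a nonzero effect."""
    return any(effect != 0 for effect in snp_effects)


def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    Higher score means stronger evidence; tied scores receive average ranks.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative gene")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Hypothetical simulated SNP effects within 100 kb of each gene.
effects_by_gene = {"G1": [0.0, 0.3], "G2": [0.0, 0.0], "G3": [0.1], "G4": [0.0]}
genes = list(effects_by_gene)
labels = [gene_is_associated(effects_by_gene[g]) for g in genes]

# Hypothetical outputs: posterior probabilities versus association p values.
posterior = [0.9, 0.2, 0.6, 0.4]
p_values = [0.001, 0.5, 0.2, 0.04]

auc_posterior = auc(posterior, labels)
auc_p_value = auc([-p for p in p_values], labels)  # smaller p ranks higher
print(auc_posterior, auc_p_value)
```

Negating the p values puts both kinds of statistic on a common "larger is more significant" scale, so one ranking routine serves every method being compared.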