Abstract
Genome-wide association studies (GWAS) yield large numbers of genetic loci associated with traits and diseases. Predicting the effector genes that mediate these locus-trait associations remains challenging. Here we present the FLAMES (fine-mapped locus assessment model of effector genes) framework, which predicts the most likely effector gene in a locus. FLAMES creates machine learning predictions from biological data linking single-nucleotide polymorphisms to genes, and then evaluates these scores together with gene-centric evidence of convergence of the GWAS signal in functional networks. We benchmark FLAMES on gene-locus pairs derived by expert curation, rare variant implication and domain knowledge of molecular traits. We demonstrate that combining single-nucleotide-polymorphism-based and convergence-based modalities outperforms prioritization strategies using a single line of evidence. Applying FLAMES, we resolve the FSHB locus in the GWAS for dizygotic twinning and further leverage this framework to find schizophrenia risk genes that converge with rare coding evidence and are relevant in different stages of life.
Data availability
The reference data needed to run FLAMES and/or (pathway-naive) PoPS are available via Zenodo at https://zenodo.org/records/12635505 (ref. 50). All annotations used are publicly available, and the source of each annotation is listed in Supplementary Table 1. The BrainSpan expression data are available at https://www.brainspan.org/.
Code availability
The code for installing and running FLAMES can be accessed via https://github.com/Marijn-Schipper/FLAMES. A static version of the code used to perform the analyses in this paper is available via Zenodo at https://zenodo.org/records/14050681 (ref. 51).
Change history
27 October 2025
A Correction to this paper has been published: https://doi.org/10.1038/s41588-025-02416-7
References
Gazal, S. et al. Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nat. Genet. 54, 827–836 (2022).
Forgetta, V. et al. An effector index to predict target genes at GWAS loci. Hum. Genet. 141, 1431–1447 (2022).
Mountjoy, E. et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 53, 1527–1533 (2021).
Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023).
Liang, K. Y. H. et al. Predicting ExWAS findings from GWAS data: a shorter path to causal genes. Hum. Genet. 142, 749–758 (2023).
Sinnott-Armstrong, N., Naqvi, S., Rivas, M. & Pritchard, J. K. GWAS of three molecular traits highlights core genes and pathways alongside a highly polygenic background. eLife 10, e58615 (2021).
Chen, T. & Guestrin, C. XGBoost: a scalable tree boosting system. In Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Krishnapuram, B. & Shah, M.) 785–794 (Association for Computing Machinery, 2016).
Sudlow, C. et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12, e1001779 (2015).
Karczewski, K. J. et al. Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genomics 2, 100168 (2022).
Benner, C. et al. FINEMAP: efficient variable selection using summary data from genome-wide association studies. Bioinformatics 32, 1493–1501 (2016).
Lundberg, S. & Lee, S.-I. A unified approach to interpreting model predictions. Preprint at https://arxiv.org/abs/1705.07874 (2017).
Zhou, W. et al. Global Biobank meta-analysis initiative: powering genetic discovery across human disease. Cell Genomics 2, 100192 (2022).
Watanabe, K. et al. A global overview of pleiotropy and genetic architecture in complex traits. Nat. Genet. 51, 1339–1348 (2019).
Mbarek, H. et al. Genome-wide association study meta-analysis of dizygotic twinning illuminates genetic regulation of female fecundity. Hum. Reprod. 39, 240–257 (2024).
Watanabe, K., Taskesen, E., van Bochoven, A. & Posthuma, D. Functional mapping and annotation of genetic associations with FUMA. Nat. Commun. 8, 1826 (2017).
Bliss, S. P., Navratil, A. M., Xie, J. & Roberson, M. S. GnRH signaling, the gonadotrope and endocrine control of fertility. Front. Neuroendocrinol. 31, 322–340 (2010).
Gadea, G. & Blangy, A. Dock-family exchange factors in cell migration and disease. Eur. J. Cell Biol. 93, 466–477 (2014).
Boomsma, D. I. The genetics of human DZ twinning. Twin Res. Hum. Genet. 23, 74–76 (2020).
Trubetskoy, V. et al. Mapping genomic loci implicates genes and synaptic biology in schizophrenia. Nature 604, 502–508 (2022).
Koopmans, F. et al. SynGO: an evidence-based, expert-curated knowledge base for the synapse. Neuron 103, 217–234.e4 (2019).
Liberzon, A. et al. Molecular signatures database (MSigDB) 3.0. Bioinformatics 27, 1739–1740 (2011).
de Leeuw, C. A., Mooij, J. M., Heskes, T. & Posthuma, D. MAGMA: generalized gene-set analysis of GWAS data. PLoS Comput. Biol. 11, e1004219 (2015).
van der Meer, D. et al. Clustering schizophrenia genes by their temporal expression patterns aids functional interpretation. Schizophr. Bull. 50, 327–338 (2024).
Hartigan, J. A. & Wong, M. A. Algorithm AS 136: a k-means clustering algorithm. J. R. Stat. Soc. Ser. C Appl. Stat. 28, 100–108 (1979).
Rousseeuw, P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Pijnenburg, R. et al. Myelo- and cytoarchitectonic microstructural and functional human cortical atlases reconstructed in common MRI space. NeuroImage 239, 118274 (2021).
Singh, T. et al. Rare coding variants in ten genes confer substantial risk for schizophrenia. Nature 604, 509–516 (2022).
Lips, E. S. et al. Functional gene group analysis identifies synaptic gene groups as risk factor for schizophrenia. Mol. Psychiatry 17, 996–1006 (2012).
Kirov, G. et al. Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Hum. Mol. Genet. 17, 458–465 (2008).
Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).
Benner, C. et al. Prospects of fine-mapping trait-associated genomic regions by using summary statistics from genome-wide association studies. Am. J. Hum. Genet. 101, 539–551 (2017).
Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. Nucleic Acids Res. 47, D886–D894 (2019).
McLaren, W. et al. The Ensembl Variant Effect Predictor. Genome Biol. 17, 122 (2016).
Javierre, B. M. et al. Lineage-specific genome architecture links enhancers and non-coding disease variants to target gene promoters. Cell 167, 1369–1384.e19 (2016).
Jung, I. et al. A compendium of promoter-centered long-range chromatin interactions in the human genome. Nat. Genet. 51, 1442–1449 (2019).
Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).
Fulco, C. P. et al. Activity-by-contact model of enhancer–promoter regulation from thousands of CRISPR perturbations. Nat. Genet. 51, 1664–1669 (2019).
Pliner, H. A. et al. Cicero predicts cis-regulatory DNA interactions from single-cell chromatin accessibility data. Mol. Cell 71, 858–871.e8 (2018).
Hormozdiari, F., Kostem, E., Kang, E. Y., Pasaniuc, B. & Eskin, E. Identifying causal variants at loci with multiple signals of association. Genetics 198, 497–508 (2014).
Lonsdale, J. et al. The genotype-tissue expression (GTEx) project. Nat. Genet. 45, 580–585 (2013).
Kerimov, N. et al. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat. Genet. 53, 1290–1299 (2021).
Wang, J. et al. HACER: an atlas of human active enhancers to interpret regulatory variants. Nucleic Acids Res. 47, D106–D112 (2019).
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Hoon, D. S. B., Rahimzadeh, N. & Bustos, M. A. EpiMap: Fine-tuning integrative epigenomics maps to understand complex human regulatory genomic circuitry. Signal Transduct. Target. Ther. 6, 1–3 (2021).
Fishilevich, S. et al. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database (Oxford) 2017, bax028 (2017).
Kundaje, A. et al. Integrative analysis of 111 reference human epigenomes. Nature 518, 317–330 (2015).
Yates, A. D. et al. Ensembl 2020. Nucleic Acids Res. 48, D682–D688 (2020).
Miller, J. A. et al. Transcriptional landscape of the prenatal human brain. Nature 508, 199–206 (2014).
Hawrylycz, M. J. et al. An anatomically comprehensive atlas of the adult human brain transcriptome. Nature 489, 391–399 (2012).
Schipper, M. Annotation data needed for FLAMES. Zenodo https://doi.org/10.5281/zenodo.10409675 (2024).
Schipper, M. Static version FLAMES code and data. Zenodo https://doi.org/10.5281/zenodo.14050636 (2024).
Acknowledgements
We thank M. de Hemptinne, K. Heilbron and the members of the Complex Trait Genetics laboratory at VU University for their input and discussions. This project was supported by NWO Gravitation: BRAINSCAPES: a roadmap from neurogenetics to neurobiology (grant no. 024.004.012), and European Research Council advanced grant ERC-2018-ADG 834057. This research has been conducted using the UK Biobank Resource under Application no. 16406.
Author information
Contributions
M.S. and D.P. conceived the study. M.S. developed the FLAMES framework and implemented the software. M.S. performed the analyses and wrote the manuscript with contributions from C.A.d.L., B.A.P.C.M., D.P.W. and D.P. M.S., D.P., N.H., D.I.B. and M.C.O’D. interpreted the results. All authors provided meaningful contributions at each stage of the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Full feature impact by SHAP values.
Impact of each feature, denoted by its SHAP value. Blue denotes low feature values; red denotes high feature values.
Extended Data Fig. 2 Ratio of annotation scores in L2G expert-curated causal genes.
Ratio of the average annotation score, per annotation, in the ExWAS-implicated gene in a GWAS locus versus the rest of the genes in the locus. Error bars represent 95% confidence intervals, calculated by bootstrapping 1,000 times.
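The bootstrapped error bars described in this caption can be illustrated with a minimal sketch, assuming a simple percentile bootstrap over per-gene annotation scores. The function name, the example scores and the resampling scheme are hypothetical illustrations and are not taken from the FLAMES code; only the 1,000 resamples and the 95% interval come from the caption.

```python
import random
from statistics import mean


def bootstrap_ratio_ci(effector_scores, other_scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for mean(effector_scores) / mean(other_scores).

    Hypothetical helper: each group is resampled independently with
    replacement, which is an assumption rather than the authors' procedure.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ratios = []
    for _ in range(n_boot):
        eff = rng.choices(effector_scores, k=len(effector_scores))
        oth = rng.choices(other_scores, k=len(other_scores))
        denom = mean(oth)
        if denom == 0:
            continue  # skip resamples where the ratio is undefined
        ratios.append(mean(eff) / denom)
    ratios.sort()
    # Take the alpha/2 and 1 - alpha/2 empirical quantiles as interval bounds.
    lower = ratios[int((alpha / 2) * len(ratios))]
    upper = ratios[max(int((1 - alpha / 2) * len(ratios)) - 1, 0)]
    point = mean(effector_scores) / mean(other_scores)
    return point, lower, upper


# Made-up annotation scores for one annotation type.
point, lower, upper = bootstrap_ratio_ci([0.8, 0.6, 1.0], [0.4, 0.3, 0.5])
print(point, lower, upper)
```

A ratio above 1 with an interval that excludes 1 would suggest that the annotation tends to score higher in the implicated gene than in the other genes in the locus.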
Extended Data Fig. 3 Odds ratio of expert-curated causal gene having the highest annotation scores in the locus.
Odds ratio of the highest annotation score in the GWAS locus belonging to the ExWAS-implicated gene versus the rest of the genes in the locus. Error bars represent 95% confidence intervals, calculated by bootstrapping 1,000 times.
Extended Data Fig. 4 Benchmark of FLAMES versus Ei.
Benchmarking results of FLAMES versus Ei on the ExWAS-implicated benchmarking set. Precision-recall curves of FLAMES (red line), our XGBoost model (purple line) and Ei (blue line) are shown, with the corresponding area under the precision-recall curve (AUPRC) given in the legend. For methods, see the Supplementary Note.
Extended Data Fig. 5 Benchmark of FLAMES on three molecular traits with within-sample-LD fine-mapping.
Benchmark of FLAMES in loci of three molecular traits, using results from within-sample fine-mapping (ref. 6). The precision-recall curve of the raw FLAMES score (green line) is shown, with the corresponding area under the precision-recall curve (AUPRC) given in the legend. FLAMES recommended (orange diamond) denotes FLAMES prioritizations at our recommended threshold. Highest FLAMES (green diamond), MAGMA (cyan pentagon) and PoPS (white triangle) represent prioritizing the gene with the highest corresponding score in the locus. Closest gene (bold black cross) denotes taking the closest gene(s) to the fine-mapped credible sets. Closest + PoPS (grey triangle) denotes prioritizing a gene if it is the closest gene and has the highest PoPS score in the locus. Random predictor (black cross) represents prioritizing a random gene in the locus. FLAMES is scored using the more conservative raw FLAMES score because multiple signals are present per locus (see Supplementary Note).
Extended Data Fig. 6 Raw FLAMES score of prioritized vs non-prioritized genes.
Scores were derived from the ExWAS-implicated benchmarking set. Prioritized genes are shown in blue and non-prioritized genes in red. Genes were prioritized using the recommended FLAMES threshold.
Extended Data Fig. 7 Decision plot of DOCK5 locus.
Decision plot of FLAMES XGB scores for the DOCK5 locus, highlighting that DOCK5 is prioritized mostly because of high MAGMA Z-scores, distance and eQTL evidence.
Extended Data Fig. 8 Benchmarking results when including pathway-naive FLAMES.
Precision and recall of state-of-the-art gene prioritization methods in different benchmarking sets. Precision-recall curves of FLAMES (green line), FLAMES using PoPS scores generated without pathway information (red line), our XGBoost model (purple line), L2G (blue line) and cS2G (yellow line) are shown, with the corresponding area under the precision-recall curve (AUPRC) given in the legend. FLAMES recommended (orange diamond) denotes FLAMES prioritizations at our recommended threshold. Highest FLAMES (green diamond), L2G (blue circle), MAGMA (cyan pentagon) and PoPS (white triangle) represent prioritizing the gene with the highest corresponding score in the locus. Closest gene (bold black cross) denotes taking the closest gene(s) to the fine-mapped credible sets. Closest + PoPS (grey triangle) denotes prioritizing a gene if it is the closest gene and has the highest PoPS score in the locus. Random predictor (black cross) represents prioritizing a random gene in the locus. a, Benchmarking on all locus-gene pairs with available GWAS summary statistics in the expert-curated dataset (ref. 3). b, Benchmarking on interpretable loci for GWAS of urate, IGF-1 levels and testosterone levels in blood (ref. 6). c, Benchmarking on high-confidence ExWAS-implicated genes in nine traits (ref. 5).
Extended Data Fig. 9 Expression profile of brain-expressed genes in BrainSpan.
Expression profile of BrainSpan brain-expressed genes after k-means clustering. Gene expression is mean-scaled per gene and averaged across all genes in the cluster per timepoint. The expression of the two clusters is shown in orange and blue, with 95% confidence intervals in grey.
Supplementary information
Supplementary Information
Supplementary Note.
Supplementary Tables 1–25
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Schipper, M., de Leeuw, C.A., Maciel, B.A.P.C. et al. Prioritizing effector genes at trait-associated loci using multimodal evidence. Nat Genet 57, 323–333 (2025). https://doi.org/10.1038/s41588-025-02084-7
DOI: https://doi.org/10.1038/s41588-025-02084-7
This article is cited by
- Protein–protein interactions shape trans-regulatory impact of genetic variation on protein expression and complex traits. Nature Genetics (2026)
- Realizing the promise of genome-wide association studies for effector gene prediction. Nature Genetics (2025)
- Genomics of drug target prioritization for complex diseases. Nature Reviews Genetics (2025)
- Clustering of lymphoid neoplasms by cell of origin, somatic mutation and drug usage profiles: a multi-trait genome-wide association study. Blood Cancer Journal (2025)
- Local genetic sex differences in quantitative traits. Nature Communications (2025)