Abstract
Drug development faces persistent challenges, with high attrition rates and unexpected adverse effects contributing to clinical trial failures. The recent convergence of large-scale biobanks, multi-omics data and computational methods, including machine learning, has led to advances in genetics-driven drug discovery, offering new opportunities to refine target selection and reduce late-stage risk. Integrating multiple lines of evidence centred on human genetics within a probabilistic framework enables the systematic prioritization of drug targets, prediction of adverse effects, and identification of drug repurposing opportunities. In this Review, we explore how these integrative approaches can address unmet clinical needs in diverse disease contexts, focusing on complex diseases.
Acknowledgements
R.D. is supported by the National Institute of General Medical Sciences of the NIH (R35-GM124836).
Author information
Contributions
The authors contributed equally to all aspects of the article.
Ethics declarations
Competing interests
R.D. is a scientific co-founder, consultant and equity holder for Pensieve Health (pending) and is a consultant for Variant Bio and Character Bio. A.D. is currently a full-time employee of GlaxoSmithKline.
Peer review
Peer review information
Nature Reviews Genetics thanks David Ochoa, Murray Cairns and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
About this article
Cite this article
Chen, R., Duffy, Á. & Do, R. Genomics of drug target prioritization for complex diseases. Nat Rev Genet 27, 231–245 (2026). https://doi.org/10.1038/s41576-025-00904-4
DOI: https://doi.org/10.1038/s41576-025-00904-4


