  • Review Article
  • Published:

Genomics of drug target prioritization for complex diseases

Abstract

Drug development faces persistent challenges, with high attrition rates and unexpected adverse effects contributing to clinical trial failures. The recent convergence of large-scale biobanks, multi-omics data and computational methods, including machine learning, has led to advances in genetics-driven drug discovery, offering new opportunities to refine target selection and reduce late-stage risk. Integrating multiple lines of evidence centred on human genetics within a probabilistic framework enables the systematic prioritization of drug targets, prediction of adverse effects, and identification of drug repurposing opportunities. In this Review, we explore how these integrative approaches can address unmet clinical needs in diverse disease contexts, focusing on complex diseases.
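
As a rough illustration of what combining several lines of evidence within a probabilistic framework can look like, the sketch below pools per-gene evidence as likelihood ratios in log-odds space. It is not the framework described in this Review. It assumes the evidence lines are conditionally independent (a naive Bayes simplification), and the function names, gene labels, evidence labels, prior and likelihood ratios are all hypothetical values chosen for illustration.

```python
import math


def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with independent likelihood ratios.

    Assumption: evidence lines are conditionally independent, so their
    log likelihood ratios simply add to the prior log-odds.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for ratio in likelihood_ratios:
        if ratio <= 0.0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(ratio)
    # Convert the combined log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


def rank_targets(prior, evidence_by_gene):
    """Return (gene, posterior) pairs sorted from highest to lowest posterior."""
    scored = {
        gene: posterior_probability(prior, ratios.values())
        for gene, ratios in evidence_by_gene.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical genes and made-up likelihood ratios, for illustration only.
EXAMPLE_EVIDENCE = {
    "GENE_A": {"gwas_colocalization": 3.0, "rare_variant_burden": 4.0},
    "GENE_B": {"gwas_colocalization": 1.5},
    "GENE_C": {"gwas_colocalization": 0.5, "network_proximity": 1.2},
}

ranking = rank_targets(0.05, EXAMPLE_EVIDENCE)
for gene, probability in ranking:
    print(f"{gene}: {probability:.3f}")
```

Working in log-odds keeps each evidence line additive, so the contribution of any single source is easy to inspect. In practice, genetic evidence sources are often correlated, and the independence assumption would need to be relaxed or calibrated against known outcomes.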

Fig. 1: Allelic series to infer direction and amount of therapeutic modulation.
Fig. 2: Machine learning approaches for phenotype extraction, imputation and augmentation.
Fig. 3: A unified probabilistic framework for drug target prioritization.

References

  1. Wong, C. H., Siah, K. W. & Lo, A. W. Estimation of clinical trial success rates and related parameters. Biostatistics 20, 273 (2018).

    Article  Google Scholar 

  2. Sun, D., Gao, W., Hu, H. & Zhou, S. Why 90% of clinical drug development fails and how to improve it? Acta Pharm. Sin. B 12, 3049–3062 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  3. Dowden, H. & Munro, J. Trends in clinical success rates and therapeutic focus. Nat. Rev. Drug Discov. 18, 495–496 (2019).

    Article  PubMed  CAS  Google Scholar 

  4. Cook, D. et al. Lessons learned from the fate of AstraZeneca’s drug pipeline: a five-dimensional framework. Nat. Rev. Drug Discov. 13, 419–431 (2014).

    Article  PubMed  CAS  Google Scholar 

  5. Plenge, R. M., Scolnick, E. M. & Altshuler, D. Validating therapeutic targets through human genetics. Nat. Rev. Drug Discov. 12, 581–594 (2013).

    Article  PubMed  CAS  Google Scholar 

  6. Minikel, E. V., Painter, J. L., Dong, C. C. & Nelson, M. R. Refining the impact of genetic evidence on clinical success. Nature 629, 624–629 (2024). This study demonstrates that drug mechanisms supported by human genetic evidence are 2.6 times more likely to reach approval, with success rates increasing as confidence in effector gene assignment improves.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  7. Razuvayevskaya, O., Lopez, I., Dunham, I. & Ochoa, D. Genetic factors associated with reasons for clinical trial stoppage. Nat. Genet. 56, 1862–1867 (2024). This study uses NLP-based classification of 28,561 stopped clinical trials to demonstrate that lack of genetic support for drug targets is associated with efficacy-related trial failures, whereas safety-related stoppages are enriched for targets that are highly constrained or broadly expressed, underscoring the role of human genetics in de-risking drug development.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  8. Falaguera, M. J. et al. Temporal trends in novel drug target discovery reveal the increasing importance of human genetic data. Preprint at Research Square https://doi.org/10.21203/rs.3.rs-5669559/v1 (2024).

  9. Morgan, P. et al. Impact of a five-dimensional framework on R&D productivity at AstraZeneca. Nat. Rev. Drug Discov. 17, 167–181 (2018).

    Article  PubMed  CAS  Google Scholar 

  10. Dugger, S. A., Platt, A. & Goldstein, D. B. Drug development in the era of precision medicine. Nat. Rev. Drug Discov. 17, 183–196 (2018).

    Article  PubMed  CAS  Google Scholar 

  11. Loewa, A., Feng, J. J. & Hedtrich, S. Human disease models in drug development. Nat. Rev. Bioeng. 1, 545–559 (2023).

    Article  CAS  Google Scholar 

  12. Ingber, D. E. Human organs-on-chips for disease modelling, drug development and personalized medicine. Nat. Rev. Genet. 23, 467–491 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  13. Carss, K. J. et al. Using human genetics to improve safety assessment of therapeutics. Nat. Rev. Drug Discov. 22, 145–162 (2023).

    Article  PubMed  CAS  Google Scholar 

  14. Duncan, L. E., Ostacher, M. & Ballon, J. How genome-wide association studies (GWAS) made traditional candidate gene studies obsolete. Neuropsychopharmacology 44, 1518–1523 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  15. Gallagher, C. S., Ginsburg, G. S. & Musick, A. Biobanking with genetics shapes precision medicine and global health. Nat. Rev. Genet. 26, 191–202 (2025).

    Article  PubMed  CAS  Google Scholar 

  16. Szustakowski, J. D. et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 53, 942–948 (2021).

    Article  PubMed  CAS  Google Scholar 

  17. Backman, J. D. et al. Exome sequencing and analysis of 454,787 UK Biobank participants. Nature 599, 628–634 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  18. Namba, S., Konuma, T., Wu, K.-H., Zhou, W. & Okada, Y. A practical guideline of genomics-driven drug discovery in the era of global biobank meta-analysis. Cell Genom. 2, 100190 (2022). This paper demonstrates a practical framework for genomics-driven drug discovery using cross-population GWAS meta-analyses, integrating gene prioritization, Mendelian randomization and gene expression correlation analyses that identifies 266 drug repositioning candidates across 13 diseases.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  19. Qi, T., Song, L., Guo, Y., Chen, C. & Yang, J. From genetic associations to genes: methods, applications, and challenges. Trends Genet. 40, 642–667 (2024).

    Article  PubMed  CAS  Google Scholar 

  20. Horn, R. & Merchant, J. Ethical and social implications of public–private partnerships in the context of genomic/big health data collection. Eur. J. Hum. Genet. 32, 736–741 (2024).

    Article  PubMed  PubMed Central  Google Scholar 

  21. Buniello, A. et al. Open targets platform: facilitating therapeutic hypotheses building in drug discovery. Nucleic Acids Res. 53, D1467–D1475 (2025). This paper describes how enhancements to the Open Targets Platform, including a revamped target prioritization framework and direction of effect assessment, significantly improve the ability to identify, annotate and rank drug targets.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  22. Zhang, X. et al. Drug development advances in human genetics-based targets. MedComm 5, e481 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  23. Rusina, P. V. et al. Genetic support for FDA-approved drugs over the past decade. Nat. Rev. Drug Discov. 22, 864 (2023).

    Article  PubMed  CAS  Google Scholar 

  24. Trajanoska, K. et al. From target discovery to clinical drug development with human genetics. Nature 620, 737–745 (2023).

    Article  PubMed  CAS  Google Scholar 

  25. Sadler, M. C., Auwerx, C., Deelen, P. & Kutalik, Z. Multi-layered genetic approaches to identify approved drug targets. Cell Genom. 3, 100341 (2023). This paper demonstrates that integrating GWAS, exome sequencing and molecular QTL data improves drug target gene prioritization, with network diffusion further improving performance.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  26. Duffy, Á et al. Development of a human genetics-guided priority score for 19,365 genes and 399 drug indications. Nat. Genet. 56, 51–59 (2024). This paper demonstrates that integrating multiple genetic features into a genetic priority score identifies drug targets with higher probabilities of clinical success, whereas a directional extension (GPS-DOE) informs modulation strategies, providing a scalable framework to prioritize therapeutic targets across 19,365 genes and 399 drug indications.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  27. Chen, R. et al. Expanding drug targets for 112 chronic diseases using a machine learning-assisted genetic priority score. Nat. Commun. 15, 8891 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  28. Nguyen, P. A., Born, D. A., Deaton, A. M., Nioi, P. & Ward, L. D. Phenotypes associated with genes encoding drug targets are predictive of clinical trial side effects. Nat. Commun. 10, 1579 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  29. Duffy, Á. et al. Tissue-specific genetic features inform prediction of drug side effects in clinical trials. Sci. Adv. 6, eabb6242 (2020).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  30. Minikel, E. V. & Nelson, M. R. Human genetic evidence enriched for side effects of approved drugs. PLoS Genet. 21, e1011638 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  31. Plenge, R. M. Disciplined approach to drug discovery and early development. Sci. Transl. Med. 8, 349ps15 (2016).

    Article  PubMed  Google Scholar 

  32. Wu, S. S. et al. Reviving an R&D pipeline: a step change in the phase II success rate. Drug Discov. Today 26, 308–314 (2021).

    Article  PubMed  CAS  Google Scholar 

  33. Fernando, K. et al. Achieving end-to-end success in the clinic: Pfizer’s learnings on R&D productivity. Drug Discov. Today 27, 697–704 (2022).

    Article  PubMed  CAS  Google Scholar 

  34. Emmerich, C. H. et al. Improving target assessment in biomedical research: the GOT-IT recommendations. Nat. Rev. Drug Discov. 20, 64–81 (2021).

    Article  PubMed  CAS  Google Scholar 

  35. McDonagh, E. M. et al. Human genetics and genomics for drug target identification and prioritization: Open Targets’ perspective. Annu. Rev. Biomed. Data Sci. 7, 59–81 (2024).

    Article  PubMed  Google Scholar 

  36. Cerezo, M. et al. The NHGRI-EBI GWAS catalog: standards for reusability, sustainability and diversity. Nucleic Acids Res. 53, D998–D1005 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  37. Costanzo, M. C. et al. Realizing the promise of genome-wide association studies for effector gene prediction. Nat. Genet. 57, 1578–1587 (2025).

    Article  PubMed  PubMed Central  Google Scholar 

  38. Zhou, W. et al. Global Biobank Meta-analysis Initiative: powering genetic discovery across human disease. Cell Genom. 2, 100192 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  39. Gaulton, K. J., Preissl, S. & Ren, B. Interpreting non-coding disease-associated human variants using single-cell epigenomics. Nat. Rev. Genet. 24, 516–534 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  40. Nasser, J. et al. Genome-wide enhancer maps link risk variants to disease genes. Nature 593, 238–243 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  41. Hormozdiari, F. et al. Colocalization of GWAS and eQTL signals detects target genes. Am. J. Hum. Genet. 99, 1245–1260 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  42. Gusev, A. et al. Integrative approaches for large-scale transcriptome-wide association studies. Nat. Genet. 48, 245–252 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  43. Shringarpure, S. S. et al. Large language models identify causal genes in complex trait GWAS. Preprint at medRxiv https://doi.org/10.1101/2024.05.30.24308179 (2025).

  44. Mountjoy, E. et al. An open approach to systematically prioritize causal variants and genes at all published human GWAS trait-associated loci. Nat. Genet. 53, 1527–1533 (2021). This paper demonstrates that systematically integrating fine-mapping, colocalization and functional genomics data across 133,441 GWAS loci using machine learning improves causal gene prioritization compared to distance-based methods, with prioritized genes showing 8.1-fold enrichment for approved drug targets.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  45. Gazal, S. et al. Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity. Nat. Genet. 54, 827–836 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  46. Pers, T. H. et al. Biological interpretation of genome-wide association studies using predicted gene functions. Nat. Commun. 6, 5890 (2015).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  47. Weeks, E. M. et al. Leveraging polygenic enrichments of gene features to predict genes underlying complex traits and diseases. Nat. Genet. 55, 1267–1276 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  48. Schipper, M. et al. Prioritizing effector genes at trait-associated loci using multimodal evidence. Nat. Genet. 57, 323–333 (2025).

    Article  PubMed  CAS  Google Scholar 

  49. Hemerich, D. et al. An integrative framework to prioritize genes in more than 500 loci associated with body mass index. Am. J. Hum. Genet. 111, 1035–1046 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  50. Schaid, D. J., Chen, W. & Larson, N. B. From genome-wide associations to candidate causal variants by statistical fine-mapping. Nat. Rev. Genet. 19, 491–504 (2018).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  51. Gamazon, E. R. et al. A gene-based association method for mapping traits using reference transcriptome data. Nat. Genet. 47, 1091–1098 (2015).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  52. Evangelista, J. E. et al. SigCom LINCS: data and metadata search engine for a million gene expression signatures. Nucleic Acids Res. 50, W697–W709 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  53. Pilarczyk, M. et al. Connecting omics signatures and revealing biological mechanisms with iLINCS. Nat. Commun. 13, 4678 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  54. Wu, P. et al. Integrating gene expression and clinical data to identify drug repurposing candidates for hyperlipidemia and hypertension. Nat. Commun. 13, 46 (2022). This paper demonstrates that integrating disease gene expression signatures from GWAS data, drug perturbation databases and electronic health records enables high-throughput identification and clinical validation of drug repurposing candidates.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  55. Zhao, H. et al. Proteome-wide Mendelian randomization in global biobank meta-analysis reveals multi-ancestry drug targets for common diseases. Cell Genom. 2, 100195 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  56. Schmidt, A. F. et al. Genetic drug target validation using Mendelian randomisation. Nat. Commun. 11, 3255 (2020).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  57. Zheng, X. et al. DMRdb: a disease-centric Mendelian randomization database for systematically assessing causal relationships of diseases with genes, proteins, CpG sites, metabolites and other diseases. Nucleic Acids Res. 53, D1363–D1371 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  58. Hemani, G. et al. The MR-base platform supports systematic causal inference across the human phenome. eLife 7, e34408 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  59. Ferolito, B. R. et al. Leveraging large-scale biobanks for therapeutic target discovery. Preprint at medRxiv https://doi.org/10.1101/2025.02.10.25321487 (2025).

  60. Zhao, S. et al. Adjusting for genetic confounders in transcriptome-wide association studies improves discovery of risk genes of complex traits. Nat. Genet. 56, 336–347 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  61. Giambartolomei, C. et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 10, e1004383 (2014).

    Article  PubMed  PubMed Central  Google Scholar 

  62. Zuber, V. et al. Combining evidence from Mendelian randomization and colocalization: review and comparison of approaches. Am. J. Hum. Genet. 109, 767–782 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  63. Hukku, A., Sampson, M. G., Luca, F., Pique-Regi, R. & Wen, X. Analyzing and reconciling colocalization and transcriptome-wide association studies from the perspective of inferential reproducibility. Am. J. Hum. Genet. 109, 825–837 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  64. Lin, Z. & Pan, W. A robust cis-Mendelian randomization method with application to drug target discovery. Nat. Commun. 15, 6072 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  65. Kim, M. S. et al. Prioritization of therapeutic targets for dyslipidemia using integrative multi-omics and multi-trait analysis. Cell Rep. Med. 4, 101112 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  66. Okamoto, J. et al. Multi-INTACT: integrative analysis of the genome, transcriptome, and proteome identifies causal mechanisms of complex traits. Genome Biol. 26, 19 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  67. Jensen, L. T., Attfield, K. E., Feldmann, M. & Fugger, L. Allosteric TYK2 inhibition: redefining autoimmune disease therapy beyond JAK1-3 inhibitors. eBioMedicine 97, 104840 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  68. Yuan, S. et al. Mendelian randomization and clinical trial evidence supports TYK2 inhibition as a therapeutic target for autoimmune diseases. eBioMedicine 89, 104488 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  69. Jurgens, S. J. et al. Rare coding variant analysis for human diseases across biobanks and ancestries. Nat. Genet. 56, 1811–1820 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  70. Wang, X. et al. The impact on clinical success from the 23andMe cohort. Preprint at medRxiv https://doi.org/10.1101/2024.06.17.24309059 (2024).

  71. Gaynor, S. M. et al. Yield of genetic association signals from genomes, exomes and imputation in the UK Biobank. Nat. Genet. 56, 2345–2351 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  72. Hawkes, G. et al. Whole-genome sequencing in 333,100 individuals reveals rare non-coding single variant and aggregate associations with height. Nat. Commun. 15, 8549 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  73. Ribeiro, D. M., Hofmeister, R. J., Rubinacci, S. & Delaneau, O. Noncoding rare variant associations with blood traits in 166,740 UK Biobank genomes. Nat. Genet. 57, 2146–2155 (2025).

    Article  PubMed  CAS  Google Scholar 

  74. Wainschtein, P. et al. Assessing the contribution of rare variants to complex trait heritability from whole-genome sequence data. Nat. Genet. 54, 263–273 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  75. Gusarova, V. et al. Genetic inactivation of ANGPTL4 improves glucose homeostasis and is associated with reduced risk of diabetes. Nat. Commun. 9, 2252 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  76. Akbari, P. et al. Sequencing of 640,000 exomes identifies GPR75 variants associated with protection from obesity. Science 373, eabf8683 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  77. Petrazzini, B. O. et al. Exome sequence analysis identifies rare coding variants associated with a machine learning-based marker for coronary artery disease. Nat. Genet. 56, 1412–1419 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  78. Verma, A. et al. Diversity and scale: genetic architecture of 2068 traits in the VA Million Veteran Program. Science 385, eadj1182 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  79. Karczewski, K. J. et al. Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects. Nat. Genet. 57, 2408–2417 (2025).

    Article  PubMed  CAS  Google Scholar 

  80. Levin, M. G. et al. Genome-wide assessment of pleiotropy across >1000 traits from global biobanks. Preprint at medRxiv https://doi.org/10.1101/2025.04.18.25326074 (2025).

  81. Turley, P. et al. Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat. Genet. 50, 229–237 (2018).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  82. Han, X. et al. Large-scale multitrait genome-wide association analyses identify hundreds of glaucoma risk loci. Nat. Genet. 55, 1116–1125 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  83. Grotzinger, A. D. et al. Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat. Hum. Behav. 3, 513–525 (2019).

    Article  PubMed  PubMed Central  Google Scholar 

  84. Park, S. et al. Multivariate genomic analysis of 5 million people elucidates the genetic architecture of shared components of the metabolic syndrome. Nat. Genet. 56, 2380–2391 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  85. Julienne, H. et al. Multitrait GWAS to connect disease variants and biological mechanisms. PLoS Genet. 17, e1009713 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  86. Luo, L. et al. Multi-trait analysis of rare-variant association summary statistics using MTAR. Nat. Commun. 11, 2850 (2020).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  87. Li, X. et al. A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies. Nat. Comput. Sci. 5, 125–143 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  88. Povysil, G. et al. Rare-variant collapsing analyses for complex traits: guidelines and applications. Nat. Rev. Genet. 20, 747–759 (2019).

    Article  PubMed  CAS  Google Scholar 

  89. Lee, S., Abecasis, G. R., Boehnke, M. & Lin, X. Rare-variant association analysis: study designs and statistical tests. Am. J. Hum. Genet. 95, 5–23 (2014).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  90. Wu, M. C. et al. Rare-variant association testing for sequencing data with the sequence kernel association test. Am. J. Hum. Genet. 89, 82–93 (2011).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  91. Liu, Y. et al. ACAT: a fast and powerful p value combination method for rare-variant analysis in sequencing studies. Am. J. Hum. Genet. 104, 410–421 (2019).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  92. Lee, S. et al. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am. J. Hum. Genet. 91, 224–237 (2012).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  93. Ziyatdinov, A. et al. Joint testing of rare variant burden scores using non-negative least squares. Am. J. Hum. Genet. 111, 2139–2149 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  94. Aggarwal, R. et al. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit. Med. 4, 1–23 (2021).

    Article  Google Scholar 

  95. Khurshid, S. et al. Deep learning to predict cardiac magnetic resonance-derived left ventricular mass and hypertrophy from 12-lead ECGs. Circ. Cardiovasc. Imaging 14, e012281 (2021).

    Article  PubMed  PubMed Central  Google Scholar 

  96. Haas, M. E. et al. Machine learning enables new insights into genetic contributions to liver fat accumulation. Cell Genom. 1, 100066 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  97. Pirruccello, J. P. et al. Genetic analysis of right heart structure and function in 40,000 people. Nat. Genet. 54, 792–803 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  98. Yun, T. et al. Unsupervised representation learning on high-dimensional clinical data improves genomic discovery and prediction. Nat. Genet. 56, 1604–1613 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  99. Flynn, B. I. et al. Deep learning based phenotyping of medical images improves power for gene discovery of complex disease. NPJ Digit. Med. 6, 1–12 (2023).

    Article  Google Scholar 

  100. Tadros, R. et al. Large-scale genome-wide association analyses identify novel genetic loci and mechanisms in hypertrophic cardiomyopathy. Nat. Genet. 57, 530–538 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  101. Huang, J. et al. A critical assessment of using ChatGPT for extracting structured data from clinical notes. NPJ Digit. Med. 7, 1–13 (2024).

    Article  CAS  Google Scholar 

  102. Wei, W.-Q. et al. Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance. J. Am. Med. Inform. Assoc. 23, e20–e27 (2016).

    Article  PubMed  Google Scholar 

  103. Verma, A. et al. The Penn Medicine BioBank: towards a genomics-enabled learning healthcare system to accelerate precision medicine in a diverse population. J. Pers. Med. 12, 1974 (2022).

    Article  PubMed  PubMed Central  Google Scholar 

  104. Schneider, C. V. et al. Large-scale identification of undiagnosed hepatic steatosis using natural language processing. eClinicalMedicine 62, 102149 (2023).

    Article  PubMed  PubMed Central  Google Scholar 

  105. Somineni, H. et al. Machine learning across multiple imaging and biomarker modalities in the UK Biobank improves genetic discovery for liver fat accumulation. Preprint at medRxiv https://doi.org/10.1101/2024.01.06.24300923 (2024).

  106. Chen, R. et al. Trans-ancestral rare variant association study with machine learning-based phenotyping for metabolic dysfunction-associated steatotic liver disease. Genome Biol. 26, 50 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  107. An, U. et al. Deep learning-based phenotype imputation on population-scale biobank data increases genetic discoveries. Nat. Genet. 55, 2269–2276 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  108. Miao, J. et al. Valid inference for machine learning-assisted genome-wide association studies. Nat. Genet. 56, 2361–2369 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  109. Garg, M. et al. Disease prediction with multi-omics and biomarkers empowers case–control genetic discoveries in the UK Biobank. Nat. Genet. 56, 1821–1831 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  110. Burstein, D. et al. Genome-wide analysis of a model-derived binge eating disorder phenotype identifies risk loci and implicates iron metabolism. Nat. Genet. 55, 1462–1470 (2023).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  111. Chen, R. et al. Genetic analyses of eight complex diseases using predicted continuous representations of disease. Cell Rep. Methods 5, 101115 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  112. Yang, L., Sadler, M. C. & Altman, R. B. Genetic association studies using disease liabilities from deep neural networks. Am. J. Hum. Genet. 112, 675–692 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  113. Carrasco-Zanini, J. et al. Proteomic signatures improve risk prediction for common and rare diseases. Nat. Med. 30, 2489–2498 (2024).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  114. Buergel, T. et al. Metabolomic profiles predict individual multidisease outcomes. Nat. Med. 28, 2309–2320 (2022).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  115. Justesen, J. M. et al. Genetics of cardiometabolic disease progression. Preprint at medRxiv https://doi.org/10.1101/2025.02.01.25321518 (2025).

  116. Hanson, M. A., Barreiro, P. G., Crosetto, P. & Brockington, D. The strain on scientific publishing. Quant. Sci. Stud. 5, 823–843 (2024).

    Article  Google Scholar 

  117. Tirunagari, S. et al. Lit-OTAR framework for extracting biological evidences from literature. Bioinformatics 41, btaf113 (2025).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  118. Birgmeier, J. et al. AMELIE speeds Mendelian diagnosis by matching patient phenotype and genotype to primary literature. Sci. Transl. Med. 12, eaau9113 (2020).

    Article  PubMed  PubMed Central  Google Scholar 

  119. Li, P.-H. et al. A large language model framework for literature-based disease–gene association prediction. Brief. Bioinform. 26, bbaf070 (2025).

    Article  Google Scholar 

  120. Piñero, J. et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 48, D845–D855 (2020).

    PubMed  PubMed Central  Google Scholar 

  121. Papatheodorou, I. et al. Expression Atlas update: from tissues to single cells. Nucleic Acids Res. 48, D77–D83 (2020).

    PubMed  PubMed Central  CAS  Google Scholar 

  122. Tian, R. et al. Genome-wide CRISPRi/a screens in human neurons link lysosomal failure to ferroptosis. Nat. Neurosci. 24, 1020–1034 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  123. Chen, Z., Boehnke, M., Wen, X. & Mukherjee, B. Revisiting the genome-wide significance threshold for common variant GWAS. Genes Genomes Genet. 11, jkaa056 (2021).

    Article  Google Scholar 

  124. Wang, X. et al. Discovery and validation of sub-threshold genome-wide association study loci using epigenomic signatures. eLife 5, e10557 (2016).

    Article  PubMed  PubMed Central  Google Scholar 

  125. Koutsandreas, T., Tsafou, K., Horn, H., Barrett, I. & Petsalaki, E. Network-based approaches for drug target identification. Annu. Rev. 8, 423–446 (2025).

    Google Scholar 

  126. Cheng, F. et al. Network-based approach to prediction and population-based validation of in silico drug repurposing. Nat. Commun. 9, 2691 (2018).

    Article  PubMed  PubMed Central  Google Scholar 

  127. Ruiz, C., Zitnik, M. & Leskovec, J. Identification of disease treatment mechanisms through the multiscale interactome. Nat. Commun. 12, 1796 (2021).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  128. Middleton, L. et al. Phenome-wide identification of therapeutic genetic targets, leveraging knowledge graphs, graph neural networks, and UK Biobank data. Sci. Adv. 10, eadj1424 (2024). This study demonstrates that Mantis-ML 2.0, an automated machine learning framework integrating knowledge graphs, graph neural networks and natural language processing, enhances phenome-wide gene–disease association prediction across 5,220 diseases.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  129. Guney, E., Menche, J., Vidal, M. & Barábasi, A.-L. Network-based in silico drug efficacy screening. Nat. Commun. 7, 10331 (2016).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  130. MacNamara, A. et al. Network and pathway expansion of genetic disease associations identifies successful drug targets. Sci. Rep. 10, 20970 (2020).

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  131. Barrio-Hernandez, I. & Beltrao, P. Network analysis of genome-wide association studies for drug target prioritisation. Curr. Opin. Chem. Biol. 71, 102206 (2022).

    Article  PubMed  CAS  Google Scholar 

  132. Hamosh, A., Scott, A. F., Amberger, J. S., Bocchini, C. A. & McKusick, V. A. Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 33, D514–D517 (2005).

  133. Gene Ontology Consortium. The Gene Ontology resource: 20 years and still GOing strong. Nucleic Acids Res. 47, D330–D338 (2019).

  134. Mastropietro, A., De Carlo, G. & Anagnostopoulos, A. XGDAG: explainable gene–disease associations via graph neural networks. Bioinformatics 39, btad482 (2023).

  135. Huang, K. et al. A foundation model for clinician-centered drug repurposing. Nat. Med. 30, 3601–3613 (2024). This paper introduces TxGNN, a graph foundation model for zero-shot drug repurposing that predicts indications and contraindications across 17,080 diseases, including those without approved therapies.

  136. Tan, D. et al. Caution when using network partners for target identification in drug discovery. Hum. Genet. Genom. Adv. 6, 100409 (2025).

  137. Zhang, J. et al. Identifying therapeutic targets for rheumatoid arthritis by genomics-driven integrative approaches. Preprint at medRxiv https://doi.org/10.1101/2024.03.19.24304536 (2024).

  138. Xu, J. et al. Interpretable deep learning translation of GWAS and multi-omics findings to identify pathobiology and drug repurposing in Alzheimer’s disease. Cell Rep. 41, 111717 (2022).

  139. Fang, H. & Knight, J. C. Priority index: database of genetic targets in immune-mediated disease. Nucleic Acids Res. 50, D1358–D1367 (2021).

  140. Cunningham, M. et al. PINNED: identifying characteristics of druggable human proteins using an interpretable neural network. J. Cheminform. 15, 64 (2023).

  141. Raies, A. et al. DrugnomeAI is an ensemble machine-learning framework for predicting druggability of candidate drug targets. Commun. Biol. 5, 1–16 (2022).

  142. Rask-Andersen, M., Masuram, S. & Schiöth, H. B. The druggable genome: evaluation of drug targets in clinical trials suggests major shifts in molecular class and indication. Annu. Rev. Pharmacol. Toxicol. 54, 9–26 (2014).

  143. Ghoussaini, M., Nelson, M. R. & Dunham, I. Future prospects for human genetics and genomics in drug discovery. Curr. Opin. Struct. Biol. 80, 102568 (2023).

  144. Stein, D. et al. Genome-wide prediction of pathogenic gain- and loss-of-function variants from ensemble learning of a diverse feature set. Genome Med. 15, 103 (2023).

  145. Adesuyan, M. et al. Phosphodiesterase type 5 inhibitors in men with erectile dysfunction and the risk of Alzheimer disease. Neurology 102, e209131 (2024).

  146. Shameer, K. et al. Pharmacological risk factors associated with hospital readmission rates in a psychiatric cohort identified using prescriptome data mining. BMC Med. Inform. Decis. Mak. 18, 79 (2018).

  147. Wang, L., Babushkin, N., Liu, Z. & Liu, X. Trans-eQTL mapping in gene sets identifies network effects of genetic variants. Cell Genom. 4, 100538 (2024).

  148. Kerimov, N. et al. A compendium of uniformly processed human gene expression and splicing quantitative trait loci. Nat. Genet. 53, 1290–1299 (2021).

  149. Dewey, F. E. et al. Distribution and clinical impact of functional variants in 50,726 whole-exome sequences from the DiscovEHR study. Science 354, aaf6814 (2016).

  150. Halldorsson, B. V. et al. The sequences of 150,119 genomes in the UK Biobank. Nature 607, 732–740 (2022).

  151. Cook, M. B. et al. Our Future Health: a unique global resource for discovery and translational research. Nat. Med. 31, 728–730 (2025).

  152. Seruga, B. et al. Under-reporting of harm in clinical trials. Lancet Oncol. 17, e209–e219 (2016).

  153. Cannon, M. et al. DGIdb 5.0: rebuilding the drug–gene interaction database for precision medicine and drug discovery platforms. Nucleic Acids Res. 52, D1227–D1235 (2024).

  154. insitro and Lilly enter strategic agreements to advance novel treatments for metabolic diseases. businesswire https://www.businesswire.com/news/home/20241009485564/en/insitro-and-Lilly-Enter-Strategic-Agreements-to-Advance-Novel-Treatments-for-Metabolic-Diseases (2024).

  155. Kamya, P. et al. Pandaomics: an AI-driven platform for therapeutic target and biomarker discovery. J. Chem. Inf. Model. 64, 3961–3969 (2024).

  156. Fu, Y. et al. Intestinal mucosal barrier repair and immune regulation with an AI-developed gut-restricted PHD inhibitor. Nat. Biotechnol. https://doi.org/10.1038/s41587-024-02503-w (2024).

  157. Ren, F. et al. A small-molecule TNIK inhibitor targets fibrosis in preclinical and clinical models. Nat. Biotechnol. 43, 63–75 (2025).

  158. BenevolentAI and AstraZeneca collaboration yields continued success as further novel target progressed into portfolio. BenevolentAI https://www.benevolent.com/news-and-media/press-releases-and-in-media/benevolentai-and-astrazeneca-collaboration-yields-continued-success-further-novel-target-progressed-portfolio/ (2024).

  159. Verge Genomics and Ferrer announce agreement to co-develop clinical-stage ALS therapy VRG50635. Verge Genomics https://www.vergegenomics.com/news-blog/verge-genomics-and-ferrer-announce-agreement-to-co-develop-clinical-stage-als-therapy-vrg50635 (2024).

  160. Xu, Z. et al. A generative AI-discovered TNIK inhibitor for idiopathic pulmonary fibrosis: a randomized phase 2a trial. Nat. Med. 31, 2602–2610 (2025).

  161. Gao, S. et al. TxAgent: an AI agent for therapeutic reasoning across a universe of tools. Preprint at https://doi.org/10.48550/arXiv.2503.10970 (2025).

  162. Gottweis, J. et al. Towards an AI co-scientist. Preprint at https://doi.org/10.48550/arXiv.2502.18864 (2025).

  163. Ferreira, C. R. The burden of rare diseases. Am. J. Med. Genet. A 179, 885–892 (2019).

  164. Fermaglich, L. J. & Miller, K. L. A comprehensive study of the rare diseases and conditions targeted by orphan drug designations and approvals over the forty years of the Orphan Drug Act. Orphanet J. Rare Dis. 18, 163 (2023).

  165. Cipriani, V. et al. Rare disease gene association discovery in the 100,000 Genomes Project. Nature https://doi.org/10.1038/s41586-025-08623-w (2025).

  166. Greene, D. et al. Genetic association analysis of 77,539 genomes reveals rare disease etiologies. Nat. Med. 29, 679–688 (2023).

  167. Mullard, A. Parsing clinical success rates. Nat. Rev. Drug Discov. 15, 447 (2016).

  168. Rees, E. & Owen, M. J. Translating insights from neuropsychiatric genetics and genomics for precision psychiatry. Genome Med. 12, 43 (2020).

  169. Kim, C. K. et al. Alzheimer’s disease: key insights from two decades of clinical trial failures. J. Alzheimers Dis. 87, 83–100 (2022).

  170. Dong, X., Liu, C. & Dozmorov, M. Review of multi-omics data resources and integrative analysis for human brain disorders. Brief. Funct. Genom. 20, 223–234 (2021).

  171. Yao, S. et al. Connecting genomic results for psychiatric disorders to human brain cell types and regions reveals convergence with functional connectivity. Nat. Commun. 16, 395 (2025).

  172. Elliott, L. T. et al. Genome-wide association studies of brain imaging phenotypes in UK Biobank. Nature 562, 210–216 (2018).

  173. Tanaka, Y. et al. OnSIDES database: Extracting adverse drug events from drug labels using natural language processing models. Med. 6, 100642 (2025).

  174. Finan, C. et al. The druggable genome and support for target identification and validation in drug development. Sci. Transl. Med. 9, eaag1166 (2017).

  175. Pollin, T. I. et al. A null mutation in human APOC3 confers a favorable plasma lipid profile and apparent cardioprotection. Science 322, 1702–1705 (2008).

  176. Duarte Lau, F. & Giugliano, R. P. Lipoprotein(a) and its significance in cardiovascular disease: a review. JAMA Cardiol. 7, 760–769 (2022).

  177. Abul-Husn, N. S. et al. A protein-truncating HSD17B13 variant and protection from chronic liver disease. N. Engl. J. Med. 378, 1096–1106 (2018).

  178. Romeo, S. et al. Genetic variation in PNPLA3 confers susceptibility to nonalcoholic fatty liver disease. Nat. Genet. 40, 1461–1465 (2008).

  179. Friedman, D. J. & Pollak, M. R. Genetics of kidney failure and the evolving story of APOL1. J. Clin. Invest. 121, 3367–3374 (2011).

  180. Sankaran, V. G. et al. Human fetal hemoglobin expression is regulated by the developmental stage-specific repressor BCL11A. Science 322, 1839–1842 (2008).

  181. Klein, R. J. et al. Complement factor H polymorphism in age-related macular degeneration. Science 308, 385–389 (2005).

  182. Gudbjartsson, D. F. et al. Sequence variants affecting eosinophil numbers associate with asthma and myocardial infarction. Nat. Genet. 41, 342–347 (2009).

  183. Duerr, R. H. et al. A genome-wide association study identifies IL23R as an inflammatory bowel disease gene. Science 314, 1461–1463 (2006).

  184. Feagan, B. G. et al. Ustekinumab as induction and maintenance therapy for Crohn’s disease. N. Engl. J. Med. 375, 1946–1960 (2016).

  185. Guerreiro, R. et al. TREM2 variants in Alzheimer’s disease. N. Engl. J. Med. 368, 117–127 (2013).

  186. Ban, M. et al. Replication analysis identifies TYK2 as a multiple sclerosis susceptibility factor. Eur. J. Hum. Genet. 17, 1309–1313 (2009).

  187. Acton, E. K., Willis, A. W. & Hennessy, S. Core concepts in pharmacoepidemiology: key biases arising in pharmacoepidemiologic studies. Pharmacoepidemiol. Drug Saf. 32, 9–18 (2023).

  188. Austin, P. C. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivar. Behav. Res. 46, 399–424 (2011).

  189. Lévesque, L. E., Hanley, J. A., Kezouh, A. & Suissa, S. Problem of immortal time bias in cohort studies: example using statins for preventing progression of diabetes. BMJ 340, b5087 (2010).

  190. Funk, M. J. & Landi, S. N. Misclassification in administrative claims data: quantifying the impact on treatment effect estimates. Curr. Epidemiol. Rep. 1, 175–185 (2014).

Acknowledgements

R.D. is supported by the National Institute of General Medical Sciences of the NIH (R35-GM124836).

Author information

Contributions

The authors contributed equally to all aspects of the article.

Corresponding author

Correspondence to Ron Do.

Ethics declarations

Competing interests

R.D. is a scientific co-founder, consultant and equity holder for Pensieve Health (pending) and is a consultant for Variant Bio and Character Bio. A.D. is currently a full-time employee of GlaxoSmithKline.

Peer review

Peer review information

Nature Reviews Genetics thanks David Ochoa, Murray Cairns and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Chen, R., Duffy, Á. & Do, R. Genomics of drug target prioritization for complex diseases. Nat Rev Genet 27, 231–245 (2026). https://doi.org/10.1038/s41576-025-00904-4

  • DOI: https://doi.org/10.1038/s41576-025-00904-4
