Abstract
Genomic prediction has become central to human, animal and plant biology, enabling quantitative inference of how genetic variation shapes complex traits. Although these domains share statistical foundations, such as linear mixed models, Bayesian regression and deep-learning frameworks, they have advanced largely in parallel. Here we synthesize their methodological evolution and highlight opportunities for integration and deeper collaborations. Agricultural genetics contributed to the mixed-model and Bayesian frameworks underlying modern polygenic scores, while human genomics has driven advances in nonlinear modeling, federated learning and biology-informed artificial intelligence. We propose a roadmap centered on interoperable data standards, shared benchmarks and cross-disciplinary training to unify predictive genomics across species. Together, these efforts establish genomic prediction as a comparative science capable of explaining how genetic information drives form and function across the diversity of life. We emphasize that shared biological architectures and knowledge transfer across species can directly improve the robustness, interpretability and generalizability of predictive models.
Acknowledgements
This work was supported by the Purdue University Office of Research Life and Health Sciences Seed Program (to M. Tegtmeyer, R.W., M. Tuinstra and L.F.B.).
Author information
Contributions
S.A., L.F.d.O., M. Tuinstra, R.W., L.F.B. and M. Tegtmeyer conceived the work. S.A., L.F.d.O. and M. Tegtmeyer wrote the manuscript with input from all authors. S.A., L.F.d.O., M.N.H., A.B.S., R.W., L.F.B. and M. Tegtmeyer edited and approved the final manuscript.
Ethics declarations
Competing interests
All authors declare no competing interests.
Peer review
Peer review information
Nature Genetics thanks Julius van der Werf, Jinliang Yang, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Table 1
Phenotypic and genotypic data structure across species.
Supplementary Table 2
Prediction performance improvements of nonlinear models over statistical approaches.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Arirangan, S., de Oliveira, L.F., Hasan, M.N. et al. Sharing approaches in predictive genomics across animals, plants and humans. Nat Genet (2026). https://doi.org/10.1038/s41588-025-02491-w
DOI: https://doi.org/10.1038/s41588-025-02491-w