  • Review Article

Harnessing artificial intelligence to fill global shortfalls in biodiversity knowledge

Abstract

Large, well-described gaps exist in both what we know and what we need to know to address the biodiversity crisis. Artificial intelligence (AI) offers new potential for filling these knowledge gaps, but where the biggest and most influential gains could be made remains unclear. To date, biodiversity-related uses of AI have largely focused on tracking and monitoring of wildlife populations. Rapid progress is being made in the use of AI to build phylogenetic trees and species distribution models. However, AI also has considerable unrealized potential in the re-evaluation of important ecological questions, especially those that require the integration of disparate and inherently complex data types, such as images, video, text, audio and DNA. This Review describes the current and potential future use of AI to address seven clearly defined shortfalls in biodiversity knowledge. Recommended steps for AI-based improvements include the re-use of existing image data and the development of novel paradigms, including the collaborative generation of new testable hypotheses. The resulting expansion of biodiversity knowledge could lead to science spanning from genes to ecosystems, advances that might represent our best hope for meeting the rapidly approaching 2030 targets of the Global Biodiversity Framework.

Fig. 1: Potential roles of artificial intelligence in filling biodiversity knowledge gaps and downstream applications.
Fig. 2: The seven shortfalls in biodiversity knowledge.


  173. O’Connor, L. M. J. et al. Vulnerability of terrestrial vertebrate food webs to anthropogenic threats in Europe. Glob. Change Biol. 30, e17253 (2024).

    Google Scholar 

  174. Fricke, E. et al. Collapse of terrestrial mammal food webs since the Late Pleistocene. Science 377, 1008–1011 (2022).

    CAS  Google Scholar 

  175. Elliott, M. J. & Fortes, J. A. B. Toward reliable biodiversity information extraction from large language models. In 2024 IEEE 20th Int. Conf. on e-Science (IEEE, 2024).

  176. Bledsoe, E. K. et al. Data rescue: saving environmental data from extinction. Proc. R. Soc. B 289, 20220938 (2022).

    Google Scholar 

  177. Lewis, P. et al. Retrieval-augmented generation for knowledge-intensive NLP tasks. In Proc. 34th Int. Conf. on Neural Information Processing Systems (NIPS 2020) article 793, 9459–9474 (NeurIPS, 2020).

  178. Berger-Tal, O. et al. Leveraging AI to improve evidence synthesis in conservation. Trends Ecol. Evol. 39, 548–557 (2024).

    Google Scholar 

  179. Ryo, M. et al. Explainable artificial intelligence enhances the ecological interpretability of black-box species distribution models. Ecography 44, 199–205 (2021).

    Google Scholar 

  180. Runge, J. et al. Inferring causation from time series in Earth system sciences. Nat. Commun. 10, 2553 (2019).

    Google Scholar 

  181. Camps-Valls, G. et al. Discovering causal relations and equations from data. Phys. Rep. 1044, 1–68 (2023).

    Google Scholar 

  182. Willard, J., Jia, X., Xu, S., Steinbach, M. & Kumar, V. Integrating scientific knowledge with machine learning for engineering and environmental systems. ACM Comput. Surv. 55, 66 (2022).

    Google Scholar 

  183. Liu, L. et al. Knowledge-guided machine learning can improve carbon cycle quantification in agroecosystems. Nat. Commun. 15, 357 (2024).

    Google Scholar 

  184. Hartig, F. et al. Novel community data in ecology-properties and prospects. Trends Ecol. Evol. 39, 280–293 (2024).

    CAS  Google Scholar 

  185. Niazi, S. K. & Mariam, Z. Computer-aided drug design and drug discovery: a prospective analysis. Pharmaceuticals 17, 22 (2024).

    CAS  Google Scholar 

  186. Guo, K., Yang, Z., Yu, C.-H. & Buehler, M. J. Artificial intelligence and machine learning in design of mechanical materials. Mater. Horiz. 8, 1153–1172 (2021).

    CAS  Google Scholar 

  187. Han, Z., Zhang, L., Jiang, Y., Wang, H. & Jiguet, F. Unravelling species co-occurrence in a steppe bird community of Inner Mongolia: insights for the conservation of the endangered Jankowski’s bunting. Divers. Distrib. 26, 843–852 (2020).

    Google Scholar 

  188. Zérah, Y., Valero, S. & Inglada, J. Physics-driven probabilistic deep learning for the inversion of physical models with application to phenological parameter retrieval from satellite times series. IEEE Trans. Geosci. Remote. Sens. 61, 4404723 (2023).

    Google Scholar 

  189. Karpatne, A., Kannan, R. & Kumar, V. (eds) Knowledge Guided Machine Learning: Accelerating Discovery Using Scientific Knowledge and Data: Data Mining and Knowledge Discovery Series (CRC Press, 2023).

  190. Nguyen, T., Brandstetter, J., Kapoor, A., Gupta, J. K. & Grover, A. ClimaX: a foundation model for weather and climate. In Proc. 40th Int. Conf. on Machine Learning (ICML 2023) 25904–25938 (PMLR, 2023).

  191. Lam, R. et al. Learning skillful medium-range global weather forecasting. Science 382, 1416–1412 (2023).

    CAS  Google Scholar 

  192. de Bézenac, E., Pajot, A. & Gallinari, P. Deep learning for physical processes: incorporating prior scientific knowledge. J. Stat. Mech. 2019, 124009 (2019).

    Google Scholar 

  193. Rolnick, D. et al. Position: application-driven innovation in machine learning. In Proc. 41st Int. Conf. on Machine Learning article 235, 42707–42718 (PMLR, 2024).

  194. Harvey, E., Gounand, I., Ward, C. L. & Altermatt, F. Bridging ecology and conservation: from ecological networks to ecosystem function. J. Appl. Ecol. 54, 371–379 (2017).

    Google Scholar 

  195. Hengl, T. et al. SoilGrids250m: global gridded soil information based on machine learning. PLoS ONE 12, e0169748 (2017).

    Google Scholar 

  196. Mahecha, M. D. et al. Earth system data cubes unravel global multivariate dynamics. Earth Syst. Dyn. 11, 201–234 (2020).

    Google Scholar 

  197. Bush, A. et al. Connecting Earth observation to high-throughput biodiversity data. Nat. Ecol. Evol. 1, 0176 (2017).

    Google Scholar 

  198. Bojinski, S. et al. The concept of essential climate variables in support of climate research, applications, and policy. Bull. Am. Meteor. Soc. 95, 1431–1443 (2014).

    Google Scholar 

  199. Skidmore, A. K. & Pettorelli, N. Agree on biodiversity metrics to track from space: ecologists and space agencies must forge a global monitoring strategy. Nature 523, 403–406 (2015).

    CAS  Google Scholar 

  200. Lang, N., Jetz, W., Schindler, K. & Wegner, J. D. A high-resolution canopy height model of the Earth. Nat. Ecol. Evol. 7, 1778–1789 (2023).

    Google Scholar 

  201. Reiche, J. et al. Combining satellite data for better tropical forest monitoring. Nat. Clim. Change 6, 120–122 (2016).

    Google Scholar 

  202. Moreno-Martínez, Á. et al. A methodology to derive global maps of leaf traits using remote sensing and climate data. Remote. Sens. Environ. 218, 69–88 (2018).

    Google Scholar 

  203. Yang, H. et al. Global patterns of tree wood density. Glob. Change Biol. 30, e17224 (2024).

    CAS  Google Scholar 

  204. Wolf, S. et al. Citizen science plant observations encode global trait patterns. Nat. Ecol. Evol. 6, 1850–1859 (2022).

    Google Scholar 

  205. Gómez-Chova, L., Tuia, D., Moser, G. & Camps-Valls, G. Multimodal classification of remote sensing images: a review and future directions. Proc. IEEE 103, 1560–1584 (2015).

    Google Scholar 

  206. Bachmann, R., Mizrahi, D., Atanov, A. & Zamir, A. MultiMAE: multi-modal multi-task masked autoencoders. Computer Vision—ECCV 2022 17th Eur. Conf. Proc. XXXVII 348–367 (Springer, 2022).

  207. Hoban, S. et al. Genetic diversity targets and indicators in the CBD post-2020 Global Biodiversity Framework must be improved. Biol. Conserv. 248, 108654 (2020).

    Google Scholar 

  208. Des Roches, S. et al. The ecological importance of intraspecific variation. Nat. Ecol. Evol. 2, 57–64 (2018).

    Google Scholar 

  209. Des Roches, S., Pendleton, L. H., Shapiro, B. & Palkovacs, E. P. Conserving intraspecific variation for nature’s contributions to people. Nat. Ecol. Evol. 5, 574–582 (2021).

    Google Scholar 

  210. Exposito-Alonso, M. et al. Genetic diversity loss in the Anthropocene. Science 377, 1431–1435 (2022).

    CAS  Google Scholar 

  211. Schmidt, C., Hoban, S., Hunter, M., Paz-Vinas, I. & Garroway, C. J. Genetic diversity and IUCN Red List status. Conserv. Biol. 37, e14064 (2023).

    Google Scholar 

  212. Yates, M. C., Derry, A. M. & Cristescu, M. E. Environmental RNA: a revolution in ecological resolution? Trends Ecol. Evol. 36, 601–609 (2021).

    CAS  Google Scholar 

  213. Cristescu, M. E. Can environmental RNA revolutionize biodiversity science? Trends Ecol. Evol. 34, 694–697 (2019).

    Google Scholar 

  214. Hobern, D. BIOSCAN: DNA barcoding to accelerate taxonomy and biogeography for conservation and sustainability. Genome 64, 161–164 (2021).

    Google Scholar 

  215. Li, Z., Cranganore, S. S., Youngblut, N. & Kilbertus, N. Whole genome transformer for gene interaction effects in microbiome habitat specificity. Preprint at arXiv https://doi.org/10.48550/arXiv.2405.05998 (2024).

  216. Srivathsan, A. & Meier, R. Scalable, cost-effective, and decentralized DNA barcoding with Oxford nanopore sequencing. Meth. Mol. Biol. 2744, 223–238 (2024).

    Google Scholar 

  217. Meier, R., Hartop, E., Pylatiuk, C. & Srivathsan, A. Towards holistic insect monitoring: species discovery, description, identification and traits for all insects. Phil. Trans. R. Soc. B 379, 20230120 (2024).

    Google Scholar 

  218. Pei, W. et al. Megabarcoding reveals a tale of two very different dark taxa along the same elevational gradient. Preprint at bioRxiv https://doi.org/10.1101/2024.04.29.591578 (2024).

  219. Dalla-Torre, H. et al. Nucleotide transformer: building and evaluating robust foundation models for human genomics. Nat. Meth. https://doi.org/10.1038/s41592-024-02523-z (2024).

  220. Zhou, Z. et al. DNABERT-S: learning species-aware DNA embedding with genome foundation models. Preprint at arXiv https://doi.org/10.48550/arXiv.2402.08777 (2024).

  221. Richard, G. et al. ChatNT: a multimodal conversational agent for DNA, RNA and protein tasks. Preprint at bioRxiv https://doi.org/10.1101/2024.04.30.591835 (2024).

  222. Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Meth. 21, 1470–1480 (2024).

    CAS  Google Scholar 

  223. Luccioni, S., Jernite, Y. & Strubell, E. Power hungry processing: watts driving the cost of AI deployment? In Proc. 2024 ACM Conf. on Fairness, Accountability, and Transparency (FAccT) 85–99 (Association for Computing Machinery, 2024).

Download references

Acknowledgements

The authors’ research work is supported by the AI and Biodiversity Change (ABC) Global Center, which is funded by the US National Science Foundation under award 2330423 (to J.K., S.B., M.A.J. and T.B.-W.) and the Natural Sciences and Engineering Research Council of Canada under award 585136 (to L.J.P., K.G., D.R. and G.W.T.). This paper also draws on research supported in part by the Social Sciences and Humanities Research Council of Canada.

Author information

Contributions

All authors researched data for the article and wrote the first draft of the manuscript. L.J.P., J.K., S.B., K.G., O.M.A., B.M., D.R., G.W.T., D.T. and T.B.-W. reviewed or edited the manuscript before submission. L.J.P., J.K., K.G., M.A.J., O.M.A. and T.B.-W. made substantial contributions to discussions of its content. In addition, T.B.-W. initiated the review paper; L.J.P., J.K., M.A.J. and K.G. conceived the original idea; and L.J.P. and J.K. led the organization of paper sections and refined the drafts.

Corresponding author

Correspondence to Laura J. Pollock.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Reviews Biodiversity thanks Sarab Sethi, who co-reviewed with Peggy Bevan; Russell Dinnage; and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Glossary

Active learning

Sets of methodologies that aim to optimize data-collection strategies or to select the most informative samples among a large quantity of redundant items using iteration and the uncertainty of the AI model as guiding principles. Such models both increase confidence and reduce the amount of data required by adding examples to the training data in successive improvement cycles, in which the algorithm prompts the user for further information.
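As a minimal sketch of one such selection rule (uncertainty sampling, which is only one of several active learning strategies), the unlabelled items whose predicted class distribution has the highest entropy can be sent to the user for labelling. The item identifiers, probabilities and function names below are hypothetical and chosen for illustration only.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(predictions, batch_size):
    """Return the ids of the `batch_size` most uncertain unlabelled items."""
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,
    )
    return ranked[:batch_size]

# Hypothetical model outputs: item id -> predicted probabilities over 3 species.
predictions = {
    "img_001": [0.98, 0.01, 0.01],  # confident, little to gain from a label
    "img_002": [0.40, 0.35, 0.25],
    "img_003": [0.70, 0.20, 0.10],
    "img_004": [0.34, 0.33, 0.33],  # nearly uniform, most informative to label
}

# Items the user would be prompted to label in this improvement cycle.
to_label = select_queries(predictions, batch_size=2)
```

In a full cycle, the newly labelled items would be added to the training data, the model retrained, and the selection repeated.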

Benchmarks

Standardized, structured challenges posed to the AI community that often take the form of a fixed dataset, split into training, validation and testing subsets, alongside a carefully designed metric or set of metrics used to evaluate success.
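The structure described above can be sketched as a fixed, seeded split plus a single agreed metric. This is an illustrative assumption of how a benchmark might be laid out, not a description of any specific benchmark; the split fractions, the seed and the function names are hypothetical.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle once with a fixed seed and cut into train/validation/test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical record identifiers standing in for benchmark examples.
records = [f"record_{i:02d}" for i in range(20)]
train, val, test = split_dataset(records)
```

Because every participant evaluates on the same held-out test subset with the same metric, results from different models remain directly comparable.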

Bioinformatics

A field of biology focused on the methods, tools, software and infrastructure needed to store, manage and analyse large, complex biological datasets.

Category discovery

A challenge in machine learning that aims to identify and group previously unknown or unlabelled categories within a dataset, thus allowing the model to autonomously discover and define new classes based on patterns or similarities in the data, in some cases alongside existing classes or domains.

Computer vision

A field of AI that enables machines to analyse and interpret information in images and videos to perform tasks such as image classification, object detection, semantic or instance segmentation, 3D image reconstruction, depth estimation, visual question answering, image retrieval and scene understanding.

Edge computational approaches

Computing systems that process the data at the device or sensor level and transmit only the desired results. Used when sensor behaviour is controlled by the results, in sensors that acquire a lot of unusable data, or under low-bandwidth conditions.

Embeddings

In machine learning, a representation of an object (such as an image, audio recording or word) as a numerical vector, such that some measure of discrepancy between vectors, generally called a distance, corresponds in a meaningful way to the relatedness of the objects they represent.
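A small sketch of the idea, assuming cosine distance as the measure of discrepancy (other distances are equally valid): items with nearby vectors are treated as related. The three-dimensional vectors below are hand-written for illustration and are not produced by any real model.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Hypothetical toy embeddings; real embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "robin_photo": [0.9, 0.1, 0.0],
    "sparrow_photo": [0.8, 0.2, 0.1],
    "oak_leaf_photo": [0.0, 0.2, 0.9],
}

def nearest(name, embeddings):
    """Name of the other item whose embedding is closest to that of `name`."""
    query = embeddings[name]
    others = (k for k in embeddings if k != name)
    return min(others, key=lambda k: cosine_distance(query, embeddings[k]))

closest_to_robin = nearest("robin_photo", toy_embeddings)
```

Here the two bird photographs sit close together in the vector space, whereas the leaf photograph lies far from both.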

Foundation models

Machine learning models that are trained on a wide variety of data with the goal of being useful across a variety of different problems; broadly applicable foundation models require extremely large parameter spaces.

Fundamental niche

The role or position of an organism within an ecosystem, including its diet, behaviour and interactions with predator, prey or competitor species and its effect on its environment (habitat conditions and resources).

Gaussian processes

A type of statistical model based on the assumption that underlying random variables are normally distributed. Often used for continuous value prediction tasks that can naturally represent uncertainty in the modelled data.

Generative AI

Unlike discriminative AI, generative models are designed to generate novel content, often images or text, as opposed to providing information about existing data.

Imageomics

An emerging field, in which machine learning tools built around biological knowledge are used to analyse image data to characterize patterns and gain insights into traits and relationships at individual, population and species scales.

Machine learning

A subcategory of artificial intelligence in which models use an algorithm to pick out patterns in a training dataset that are relevant for solving the problem at hand.

Multimodal datasets

Datasets that observe the same entity with a variety of sensors. A modality could be an on-animal sensor, a drone image or a microphone, for example.

Natural language processing

A mechanism that enables human users to interact with artificial systems using natural language (that is, via text or speech).

Open world classification

Also termed open set classification. A machine learning approach designed both to classify items into known classes and to recognize whether items belong to unknown classes that were absent during training.
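One simple baseline for this behaviour, shown here only as a sketch, is to reject any prediction whose highest class probability falls below a threshold and label it as unknown. The threshold value, class names and function name are assumptions for illustration; practical systems often use more elaborate rejection rules.

```python
def classify_open_set(probs, class_names, threshold=0.6):
    """Return the predicted class, or "unknown" if the model is not confident enough."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] < threshold:
        # No known class is a convincing match: flag as a possible unseen class.
        return "unknown"
    return class_names[best]

# Hypothetical known classes and model outputs.
known_species = ["Bombus terrestris", "Apis mellifera", "Osmia bicornis"]

confident_case = classify_open_set([0.90, 0.05, 0.05], known_species)
ambiguous_case = classify_open_set([0.40, 0.35, 0.25], known_species)
```

Items flagged as unknown could then be routed to an expert or to a category discovery step rather than being forced into a known class.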

Phenology

The study of how recurrent phenomena such as seasonal and climate variations affect events in the life cycle of an organism.

Realized niche

The subset of conditions within the fundamental niche actually used by a species, after interactions with other species (in particular predation and competition) and dispersal limitations have been taken into account.

Species distribution models

Also known as environmental niche models, these key tools predict species occurrence or abundance as a function of abiotic or biotic environmental variables; used for ecological inference of species responses to the environment and for mapping present and projected species distributions in response to climate or land-use change.

Tokenization

The process by which raw data (such as a text) are converted into smaller units (such as individual words) that can be used by models such as transformers.
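As a minimal sketch of word-level tokenization, a text can be split into lower-case words and each distinct word mapped to an integer id that a model can consume. Production tokenizers usually work on sub-word units; the splitting rule and function names here are simplifying assumptions.

```python
import re

def tokenize(text):
    """Split text into lower-case alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocab(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Convert text to the list of token ids defined by `vocab`."""
    return [vocab[tok] for tok in tokenize(text)]

sentence = "Bees visit flowers; flowers feed bees."
tokens = tokenize(sentence)
vocab = build_vocab(tokens)
token_ids = encode(sentence, vocab)
```

The resulting sequence of ids, rather than the raw characters, is what a model such as a transformer would receive as input.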

Traits

Phenotypic attributes that affect an organism’s fitness and/or influence its ecosystem functions and can also provide insights into the consequences of biodiversity loss for ecosystem functioning and human well-being.

Transformers

A neural network architecture that processes sequences of tokens (often text or images) in parallel rather than sequentially and is therefore highly effective at capturing long-range dependencies in data but computationally very expensive, requiring very large models and amounts of training data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Pollock, L.J., Kitzes, J., Beery, S. et al. Harnessing artificial intelligence to fill global shortfalls in biodiversity knowledge. Nat. Rev. Biodivers. 1, 166–182 (2025). https://doi.org/10.1038/s44358-025-00022-3

  • DOI: https://doi.org/10.1038/s44358-025-00022-3
