Scientific Data
The missing link in FAIR data policy: biodata resources in life sciences
  • Comment
  • Open access
  • Published: 06 February 2026

  • Lucy Poveda (affiliation 1; ORCID: orcid.org/0000-0002-5291-5582)
  • Gavin Farrell (affiliation 2; ORCID: orcid.org/0000-0001-5166-8551)
  • Silvio C. E. Tosatto (affiliations 2, 3; ORCID: orcid.org/0000-0003-4525-7793)
  • Monique Zahn-Zabal (affiliation 1; ORCID: orcid.org/0000-0001-7961-6091)
  • Patrick Ruch (affiliations 1, 4; ORCID: orcid.org/0000-0002-3374-2962)
  • Julien Gobeill (affiliations 1, 4; ORCID: orcid.org/0000-0001-9809-7741)
  • Robert M. Waterhouse (affiliation 1; ORCID: orcid.org/0000-0003-4199-9052)
  • Christophe Dessimoz (affiliations 1, 5; ORCID: orcid.org/0000-0002-2170-853X)

Scientific Data (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Data integration
  • Databases
  • Funding
  • Research management

In the life sciences, FAIR principles have reshaped research policy, but their implementation still relies largely on individual researchers – many of whom lack the expertise or support needed to make data truly reusable. Realising FAIR’s promise requires sustained investment in the infrastructures that organise, standardise, and curate data: deposition databases and knowledgebases. These biodata resources are especially critical for AI, which depends on large, high-quality, and consistent data. Landmark advances such as AlphaFold and the COVID-19 response illustrate how sustained curation and standardisation in expert resources such as UniProt and the Protein Data Bank have enabled rapid innovation. Yet biodata resources remain precariously funded, jeopardising their long-term sustainability and the expert workforce they require. To support ambitious, data-driven science, funders must align policy and budgets by establishing dedicated mechanisms that allocate a small (e.g., 1%) but strategic and stable share of research funding to core data infrastructures. This would maximise the value of public investment, strengthen open science and international collaboration, and unlock the full potential of FAIR.

Acknowledgements

This study received funding from ELIXIR, the research infrastructure for life-science data. In addition, we acknowledge SNSF grant #205085 to C.D. and SNSF/CHIST-ERA grant #217525 to P.R.

Author information

Authors and Affiliations

  1. SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland

    Lucy Poveda, Monique Zahn-Zabal, Patrick Ruch, Julien Gobeill, Robert M. Waterhouse & Christophe Dessimoz

  2. Department of Biomedical Sciences, University of Padova, Padova, Italy

    Gavin Farrell & Silvio C. E. Tosatto

  3. Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council (CNR-IBIOM), 70126, Bari, Italy

    Silvio C. E. Tosatto

  4. Information Sciences, HES-SO/HEG Geneva, 1227, Carouge, Switzerland

    Patrick Ruch & Julien Gobeill

  5. University of Lausanne, Lausanne, Switzerland

    Christophe Dessimoz

Contributions

This article was developed within the framework of Work Package 5 of the ELIXIR Data Platform Workplan (2024–2028). L.P. and C.D. initiated and coordinated the manuscript drafting based on the group’s prior work. L.P. and C.D. wrote the first draft. All authors contributed to the conceptual development of the arguments and provided comments, suggestions, and revisions. All authors approved the final version of the manuscript.

Corresponding author

Correspondence to Christophe Dessimoz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Poveda, L., Farrell, G., Tosatto, S.C.E. et al. The missing link in FAIR data policy: biodata resources in life sciences. Sci Data (2026). https://doi.org/10.1038/s41597-026-06690-w

  • Received: 01 July 2025

  • Accepted: 23 January 2026

  • Published: 06 February 2026

  • DOI: https://doi.org/10.1038/s41597-026-06690-w

Scientific Data (Sci Data)

ISSN 2052-4463 (online)
