  • Correspondence

Defining and benchmarking open problems in single-cell analysis

Fig. 1: The Open Problems in Single-cell Analysis living benchmarking platform.
Fig. 2: Task overview, setup and results.

Code availability

All Open Problems code is publicly available at https://www.github.com/openproblems-bio/openproblems. This code includes data loaders for all datasets used, with associated metadata on the source of each dataset. Code to reproduce the figures is publicly available at https://github.com/openproblems-bio/nbt2025-manuscript. Detailed information on all datasets is available at https://openproblems.bio/datasets. Documentation for the platform and contribution guides can be found at https://openproblems.bio/documentation.


Acknowledgements

We received continual support in many ways from Jonah Cool, Ivana Williams and Fiona Griffin from the Chan Zuckerberg Initiative for this project, without whom we would not have come this far. We would also like to thank Mohammad Lotfollahi for early discussions on Open Problems. E.V.B. would like to thank the Caltech Bioengineering Graduate program and Paul W. Sternberg for support. This work was supported by the Chan Zuckerberg Initiative Foundation (grant CZIF2022-007488, Human Cell Atlas Data Ecosystem) and the Chan Zuckerberg Initiative DAF, an advised fund of the Silicon Valley Community Foundation (grant number 2021-235155) awarded to M.D.L., D.B.B., S.G., F.J.T. and S.K. This work was co-funded by the European Union (ERC, DeepCell -101054957, to A.S. and F.J.T.). Views and opinions expressed are, however, those of the authors only and do not necessarily reflect those of the European Union or the European Research Council. Neither the European Union nor the granting authority can be held responsible for them. G.P. is supported by the Helmholtz Association under the joint research school Munich School for Data Science and by the Joachim Herz Foundation. Throughout this work, W.L. was supported by the US National Institutes of Health under Continuing Education Training Grants (T15). D.D. was supported by the European Union’s Horizon 2020 Research and Innovation Program (860329 Marie-Curie ITN “STRATEGY-CKD”). M.E.V. is supported by the US National Institutes of Health under a Ruth L. Kirschstein National Research Service Award (1F31CA257625) from the National Cancer Institute. E.D. is supported by Wellcome Sanger core funding (WT206194). This work was supported by the Research Foundation Flanders (FWO) (1SF3822N to L.D.). B.R. is supported by the Bavarian state government with funds from the Hightech Agenda Bavaria. This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiele Intelligentie (AI) Vlaanderen” programme. 
C.B.G.-B. was supported by a PhD fellowship from Fonds Wetenschappelijk Onderzoek (FWO, 11F1519N). V.K. was supported by Wellcome Sanger core funding. G.L.M. received support from Swiss National Science Foundation grant PZ00P3_193445 and Chan Zuckerberg Initiative grants number 2022-249212 and 2019-002427. D.R. was supported by the National Cancer Institute of the US National Institutes of Health (2U24CA180996).

Author information

Contributions

M.D.L., S.G., and D.B.B. conceived the idea. M.D.L., S.G., D.B.B., R.C., and O.B.B. developed the infrastructure. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., M.F.M., A.A., E.D., Q.Q., A.S., A.B., and Z.L. formalized a benchmarking task. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., D.S.M., M.F.M., A.A., E.D., Q.Q., D.J.O., M.K., O.B.B., K.W., S.N.Y., A.S., A.B., Z.L., C.A-E., E.d.V.B., A.T.C., B.D., C.E., V.K., H.S., V.S. and A.T. contributed to the codebase. M.D.L., S.G., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., L.D. and K.W. analyzed the results. M.D.L., S.G., D.B.B., J.M.B., A.O.P., J.S.-R., D.W., L.P., Y.S., F.J.T. and S.K. provided resources and supervised the work. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L. and D.D. coordinated the research. M.D.L., S.G., D.B.B., F.J.T. and S.K. acquired funding for the work. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., M.F.M., A.A., E.D., Q.Q., D.J.O., M.K., O.B.B., A.S., A.B., Z.L., B.R., J.M.B., A.O.P., C.A-E., E.d.V.B., A.B., C.B.G-B., A.T.C., B.D., C.E., S.F., A.G., S.H., Y.J., V.K., G.L.M., M.G.L., R.L., D.R., H.S., V.S., A.T., G.X. and C.X. contributed to benchmarking task definition. M.D.L., S.G., D.B.B., R.C., D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V. and D.S.M. prepared the manuscript. D.C.S., N.S.M., L.Z., G.P., W.L., D.D., M.E.V., D.S.M. and M.F.M. contributed equally as second authors. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Fabian J. Theis or Smita Krishnaswamy.

Ethics declarations

Competing interests

M.D.L. consults for CatalYm GmbH, contracted for the Chan Zuckerberg Initiative and received speaker fees from Pfizer and Janssen Pharmaceuticals. S.G. has equity interest in Immunai Inc. D.B.B. is a paid employee of and has equity interest in NVIDIA. R.C. has equity interest in Data Intuitive BV. L.Z. has consulted for Lamin Labs GmbH. W.L. contracted for Protein Evolution Incorporated. From 2019 to 2022, A.A. was a consultant for 10x Genomics. From October 2023, E.D. has been a consultant for EnsoCell Therapeutics. O.B.B. is currently an employee of Bridge Bio Pharma. A.S. consults for Cellarity Inc. and Exvivo Labs Inc. A.B. is a paid employee of and has equity interest in Cellarity, Inc. J.B. has equity interest in Cellarity, Inc. J.S.-R. reports funding from GSK, Pfizer and Sanofi and fees or honoraria from Travere Therapeutics, Stadapharm, Astex, Owkin, Pfizer and Grunenthal. D.W. has equity interest in Immunai Inc. F.J.T. consults for Immunai Inc., Singularity Bio B.V., CytoReason Ltd and Cellarity, and has ownership interest in Dermagnostix GmbH and Cellarity. S.K. is a visiting professor at Meta and scientific advisor at Ascent Bio, Inc. E.d.V.B. has ownership interest in Retro Biosciences and ImYoo Inc and is employed by ImYoo Inc. A.T.C. is an employee of Orion Medicines. B.D. is a paid employee of and has equity interest in Cellarity Inc. A.G. is currently an employee of Google DeepMind. Google DeepMind has not directed any aspect of this study nor exerts any commercial rights over the results. R.L. is an employee of Genentech. V.S. has ownership interest in Altos Labs and Vesalius Therapeutics. A.T. has an ownership interest in Dreamfold.

Supplementary information

Supplementary Information

Supplementary Methods, Note 1 and Figs. 1–11

Source data

Source Data Fig. 1

Methods and metrics used per existing benchmarking repository, including dates of first and last commit.

Source Data Fig. 2

Table of metric results for the cell–cell communication task with metric explanations.

About this article

Cite this article

Luecken, M.D., Gigante, S., Burkhardt, D.B. et al. Defining and benchmarking open problems in single-cell analysis. Nat Biotechnol 43, 1035–1040 (2025). https://doi.org/10.1038/s41587-025-02694-w

  • DOI: https://doi.org/10.1038/s41587-025-02694-w
