  • Comment

Synthetic multimodal data modelling for data imputation

Foundation models can be advantageously harnessed to estimate missing data in multimodal biomedical datasets and to generate realistic synthetic samples.

Fig. 1: From traditional data imputation to synthetic multimodal data modelling for hypothesis testing.

References

  1. Reed, S. et al. Preprint at https://arxiv.org/abs/2205.06175 (2022).

  2. Girdhar, R. et al. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 15180–15190 (IEEE, 2023).

  3. Zhang, S. et al. Preprint at https://arxiv.org/abs/2303.00915 (2023).

  4. Stein, G. et al. Preprint at https://arxiv.org/abs/2306.04675 (2023).

  5. Chambon, P., Bluethgen, C., Langlotz, C. P. & Chaudhari, A. Preprint at https://arxiv.org/abs/2210.04133 (2022).

  6. Gambardella, G. et al. Nat. Commun. 13, 1714 (2022).

  7. Adam, G. et al. npj Precis. Oncol. 4, 19 (2020).

  8. Liu, S., Abdellaoui, A., Verweij, K. & Wingen, G. Adv. Sci. 10, 2205486 (2023).

  9. Sun, S., Torok, J., Mezias, C., Ma, D. & Raj, A. Cell Rep. 42, 10 (2023).

  10. Biswal, S., Xiao, C., Westover, M. B. & Sun, J. In Machine Learning for Healthcare Conference 513–531 (2019).

  11. Wang, H. et al. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 15878–15887 (IEEE, 2023).

  12. Zhang, C. et al. In Proc. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (eds Zhang, A. & Rangwala, H.) 2418–2428 (ACM, 2022).

  13. Roohani, Y., Huang, K. & Leskovec, J. Nat. Biotechnol. 42, 927–935 (2023).

  14. Nitski, O. et al. Lancet Digit. Health 3, 295–305 (2021).

  15. Durairaj, J. et al. Nature 622, 646–653 (2023).

  16. Heusel, M. et al. In Proc. 31st International Conference on Neural Information Processing Systems (eds von Luxburg, U. et al.) 6629–6640 (Curran Associates, 2017).

  17. Papineni, K. et al. In Proc. 40th Annual Meeting of the Association for Computational Linguistics (eds Isabelle, P. et al.) 311–318 (Association for Computational Linguistics, 2002).

  18. The All of Us Research Program Genomics Investigators. Nature 627, 340–346 (2024).

Acknowledgements

The research reported here was supported by the National Cancer Institute (NCI; grant number R01 CA260271). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. M.P. was supported by a grant from Fonds Wetenschappelijk Onderzoek-Vlaanderen (FWO; 1161223N and V467423N). The work was further supported by grants from FWO (3G045620, 3G046318) and UGent BOF (BOF01J06219, BOF/IOP/2022/045BOF).

Author information

Contributions

F.C.-P. and M.P. prepared the figure. F.C.-P., M.P. and O.G. conceived the idea and wrote the paper. O.G. and K.M. reviewed the paper and obtained the funding. All authors reviewed and accepted the submitted version of the paper.

Corresponding author

Correspondence to Olivier Gevaert.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Biomedical Engineering thanks the anonymous reviewers for their contribution to the peer review of this work.

Supplementary information

Supplementary Information

Supplementary references.

About this article

Cite this article

Carrillo-Perez, F., Pizurica, M., Marchal, K. et al. Synthetic multimodal data modelling for data imputation. Nat. Biomed. Eng 9, 421–425 (2025). https://doi.org/10.1038/s41551-024-01324-1

  • DOI: https://doi.org/10.1038/s41551-024-01324-1
