  • Article

End-to-end cryo-EM complex structure determination with high accuracy and ultra-fast speed

An Author Correction to this article was published on 11 July 2025

This article has been updated

Abstract

While cryogenic-electron microscopy yields high-resolution density maps for complex structures, accurate determination of the corresponding atomic structures still necessitates significant expertise and labour-intensive manual interpretation. Recently, artificial intelligence-based methods have emerged to streamline this process; however, several challenges persist. First, existing methods typically require multi-stage training and inference, causing inefficiencies and inconsistency. Second, these approaches often encounter bias and incur substantial computational costs in aligning predicted atomic coordinates with sequence. Last, due to the limitations of available datasets, previous studies struggle to generalize effectively to complicated and unseen test data. Here, in response to these challenges, we introduce end-to-end and efficient CryoFold (E3-CryoFold), a deep learning method that enables end-to-end training and one-shot inference. E3-CryoFold uses three-dimensional and sequence transformers to extract features from density maps and sequences, using cross-attention modules to integrate the two modalities. Additionally, it uses an SE(3) graph neural network to construct atomic structures based on extracted features. E3-CryoFold incorporates a pretraining stage, during which models are trained on simulated density maps derived from Protein Data Bank structures. Empirical results demonstrate that E3-CryoFold improves the average template modelling score of the generated structures by 400% as compared to Cryo2Struct and significantly outperforms ModelAngelo, while achieving this huge improvement using merely one-thousandth of the inference time required by these methods. Thus, E3-CryoFold represents a robust, streamlined and cohesive framework for cryogenic-electron microscopy structure determination.
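To make the cross-attention step mentioned above concrete, the following is a minimal single-head sketch in NumPy. It is not the authors' implementation. The token counts, the feature width, the random projection matrices and the choice of sequence tokens as queries over density-map tokens are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(seq_tokens, map_tokens, w_q, w_k, w_v):
    """Single-head cross-attention.

    Assumption: sequence tokens act as queries and density-map tokens
    supply keys and values. The direction is hypothetical.
    """
    q = seq_tokens @ w_q                      # (L, d) queries from the sequence
    k = map_tokens @ w_k                      # (P, d) keys from the map patches
    v = map_tokens @ w_v                      # (P, d) values from the map patches
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (L, P) scaled dot products
    weights = softmax(scores, axis=-1)        # each row sums to 1 over patches
    return weights @ v, weights


rng = np.random.default_rng(0)
# Hypothetical sizes: 8 residues, 27 map patches (3 x 3 x 3), feature width 16.
L, P, d = 8, 27, 16
seq_tokens = rng.normal(size=(L, d))
map_tokens = rng.normal(size=(P, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

fused, weights = cross_attention(seq_tokens, map_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

Each output row of `fused` is a weighted average of map-patch features, so every residue token ends up carrying density information from the patches it attends to most strongly.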

Fig. 1: The architecture and pipeline of E3-CryoFold.
Fig. 2: The analysis results of atomic structures on 150 test experimental density maps for E3-CryoFold against Cryo2Struct and Phenix in four metrics.
Fig. 3: The analysis results in various metrics for E3-CryoFold, Cryo2Struct and Phenix on 150 test experimental density maps.
Fig. 4: The analysis results of atomic models built on 109 test experimental cryo-EM maps for E3-CryoFold against ModelAngelo in eight metrics.
Fig. 5: The analysis results of atomic models built for 428 test experimental cryo-EM maps.
Fig. 6: The analysis results of atomic models built for 428 test cryo-EM maps.

Data availability

The experimental dataset can be downloaded at https://doi.org/10.7910/DVN/FCDG0W (ref. 46), and the standard test dataset can be downloaded at https://doi.org/10.7910/DVN/2GSSC9 (ref. 47). The low-resolution and simulated datasets are accessible at https://zhanggroup.org/CR-I-TASSER/. All source data are accessible from ref. 48 (standard_test_data.xlsx for standard test dataset, novel_test_data.xlsx for novel established test dataset, low_resolution_experimental_data.xlsx for low-resolution density maps, simulated_data.xlsx for simulated density maps). Source data are provided with this paper.

Code availability

The source code of E3-CryoFold is available via GitHub at https://github.com/A4Bio/CryoFold/ (ref. 49). The repository also contains instructions and a tutorial for applying E3-CryoFold to an example cryo-EM map to generate a complex structure.

Change history

  • 22 August 2025

    In the version of the article initially published, in the Acknowledgements, the National Science and Technology Major Project grant number was incorrect and has now been amended to 2022ZD0115102 in the HTML and PDF versions of the article.

  • 11 July 2025

    A Correction to this paper has been published: https://doi.org/10.1038/s42256-025-01094-8

References

  1. Boadu, F., Cao, H. & Cheng, J. Combining protein sequences and structures with transformers and equivariant graph neural networks to predict protein function. Bioinformatics 39, i318–i325 (2023).

  2. Bai, X.-C., McMullan, G. & Scheres, S. H. How cryo-EM is revolutionizing structural biology. Trends Biochem. Sci. 40, 49–57 (2015).

  3. Lawson, C. L. et al. Outcomes of the EMDataResource cryo-EM ligand modeling challenge. Nat. Methods 21, 1340–1348 (2024).

  4. Dhakal, A., Gyawali, R., Wang, L. & Cheng, J. A large expert-curated cryo-EM image dataset for machine learning protein particle picking. Sci. Data 10, 392 (2023).

  5. Dhakal, A., Gyawali, R., Wang, L. & Cheng, J. CryoTransformer: a transformer model for picking protein particles from cryo-EM micrographs. Bioinformatics 40, btae109 (2024).

  6. Giri, N., Roy, R. S. & Cheng, J. Deep learning for reconstructing protein structures from cryo-EM density maps: recent advances and future directions. Curr. Opin. Struct. Biol. 79, 102536 (2023).

  7. Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. Sect. D Biol. Crystallogr. 66, 486–501 (2010).

  8. Croll, T. I. ISOLDE: a physically realistic environment for model building into low-resolution electron-density maps. Acta Crystallogr. Sect. D Struct. Biol. 74, 519–530 (2018).

  9. Gao, Y., Thorn, V. & Thorn, A. Errors in structural biology are not the exception. Acta Crystallogr. Sect. D Struct. Biol. 79, 206–211 (2023).

  10. Croll, T. I. et al. Making the invisible enemy visible. Nat. Struct. Mol. Biol. 28, 404–408 (2021).

  11. Zhong, E. D., Bepler, T., Berger, B. & Davis, J. H. CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks. Nat. Methods 18, 176–185 (2021).

  12. Rangan, R. et al. CryoDRGN-ET: deep reconstructing generative networks for visualizing dynamic biomolecules inside cells. Nat. Methods 21, 1537–1545 (2024).

  13. Levy, A., Wetzstein, G., Martel, J. N., Poitevin, F. & Zhong, E. Amortized inference for heterogeneous reconstruction in cryo-EM. Adv. Neural Inf. Process. Syst. 35, 13038–13049 (2022).

  14. Pfab, J., Phan, N. M. & Si, D. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proc. Natl Acad. Sci. USA 118, e2017525118 (2021).

  15. Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In Proc. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Part III Vol. 18, 234–241 (Springer, 2015).

  16. Hoffman, K. L. & Padberg, M. et al. Traveling salesman problem. Encycl. Oper. Res. Manag. Sci. 1, 1573–1578 (2013).

  17. Jamali, K. et al. Automated model building and protein identification in cryo-EM maps. Nature 628, 450–457 (2024).

  18. Rabiner, L. & Juang, B. An introduction to hidden Markov models. IEEE ASSP Mag. 3, 4–16 (1986).

  19. Terashi, G., Wang, X., Prasad, D., Nakamura, T. & Kihara, D. DeepMainmast: integrated protocol of protein structure modeling for cryo-EM with deep learning and structure prediction. Nat. Methods 21, 122–131 (2024).

  20. Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).

  21. Giri, N. & Cheng, J. De novo atomic protein structure modeling for cryoEM density maps using 3D transformer and HMM. Nat. Commun. 15, 5511 (2024).

  22. Vaswani, A. et al. Attention is all you need. In Proc. Advances in Neural Information Processing Systems Vol. 30 (eds Guyon, I. et al.) 6000–6010 (Curran Associates, 2017).

  23. Shekhar, M. et al. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps. Matter 4, 3195–3216 (2021).

  24. Lawson, C. L. et al. EMDataBank unified data resource for 3DEM. Nucleic Acids Res. 44, D396–D403 (2016).

  25. Zhang, Y. & Skolnick, J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 33, 2302–2309 (2005).

  26. Chen, C.-F. R., Fan, Q. & Panda, R. CrossViT: cross-attention multi-scale vision transformer for image classification. In Proc. IEEE/CVF International Conference on Computer Vision 357–366 (IEEE, 2021).

  27. Satorras, V. G., Hoogeboom, E. & Welling, M. E(n) equivariant graph neural networks. In Proc. International Conference on Machine Learning 9323–9332 (PMLR, 2021).

  28. Han, J., Rong, Y., Xu, T. & Huang, W. Geometrically equivariant graph neural networks: a survey. Preprint at https://arxiv.org/abs/2202.07230 (2022).

  29. Giri, N., Wang, L. & Cheng, J. Cryo2StructData: a large labeled cryo-EM density map dataset for AI-based modeling of protein structures. Sci. Data 11, 458 (2024).

  30. Terwilliger, T. C., Adams, P. D., Afonine, P. V. & Sobolev, O. V. A fully automatic method yielding initial models from high-resolution cryo-electron microscopy maps. Nat. Methods 15, 905–908 (2018).

  31. Mariani, V., Biasini, M., Barbato, A. & Schwede, T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 29, 2722–2728 (2013).

  32. Basu, S. & Wallner, B. DockQ: a quality measure for protein-protein docking models. PLoS ONE 11, e0161879 (2016).

  33. Pintilie, G. et al. Measurement of atom resolvability in cryo-EM maps with Q-scores. Nat. Methods 17, 328–334 (2020).

  34. Van Heel, M. & Schatz, M. Fourier shell correlation threshold criteria. J. Struct. Biol. 151, 250–262 (2005).

  35. Yamashita, K., Palmer, C. M., Burnley, T. & Murshudov, G. N. Cryo-EM single-particle structure refinement and map calculation using Servalcat. Acta Crystallogr. Sect. D Struct. Biol. 77, 1282–1291 (2021).

  36. Allegretti, M., Mills, D. J., McMullan, G., Kühlbrandt, W. & Vonck, J. Atomic model of the F420-reducing [NiFe] hydrogenase by electron cryo-microscopy using a direct electron detector. eLife 3, e01963 (2014).

  37. Bartesaghi, A., Matthies, D., Banerjee, S., Merk, A. & Subramaniam, S. Structure of β-galactosidase at 3.2-Å resolution obtained by cryo-electron microscopy. Proc. Natl Acad. Sci. USA 111, 11709–11714 (2014).

  38. Hattne, J. et al. Analysis of global and site-specific radiation damage in cryo-EM. Structure 26, 759–766 (2018).

  39. Lin, Z. et al. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379, 1123–1130 (2023).

  40. Young, I. T. & Van Vliet, L. J. Recursive implementation of the Gaussian filter. Signal Process. 44, 139–151 (1995).

  41. Loshchilov, I. et al. Fixing weight decay regularization in Adam. Preprint at https://arxiv.org/abs/1711.05101 (2017).

  42. Smith, L. N. Cyclical learning rates for training neural networks. In Proc. 2017 IEEE Winter Conference on Applications of Computer Vision (WACV) 464–472 (IEEE, 2017).

  43. Gao, Z., Tan, C. & Li, S. Z. FoldToken4: consistent & hierarchical fold language. Preprint at bioRxiv https://doi.org/10.1101/2024.08.04.606514 (2024).

  44. Ingraham, J. B. et al. Illuminating protein space with a programmable generative model. Nature 623, 1070–1078 (2023).

  45. Chomboon, K., Chujai, P., Teerarassamee, P., Kerdprasop, K. & Kerdprasop, N. An empirical study of distance metrics for k-nearest neighbour algorithm. In Proc. 3rd International Conference on Industrial Application Engineering Vol. 2, p. 4 (Institute of Industrial Applications Engineers, 2015).

  46. Giri, N., Wang, L. & Cheng, J. Cryo2StructData: full dataset. Harvard Dataverse https://doi.org/10.7910/DVN/FCDG0W (2023).

  47. Giri, N., Wang, L. & Cheng, J. Cryo2StructData: test dataset. Harvard Dataverse https://doi.org/10.7910/DVN/2GSSC9 (2023).

  48. Wang, J. cryofold_source_data.zip. figshare https://doi.org/10.6084/m9.figshare.28530359.v1 (2025).

  49. Wang, J. & Tan, C. End-to-end Cryo-EM complex structure determination with high accuracy and ultra-fast speed. Zenodo https://doi.org/10.5281/zenodo.14970359 (2025).

  50. Hodson, T. O. Root mean square error (RMSE) or mean absolute error (MAE): when to use them or not. Geosci. Model Dev. 15, 5481–5487 (2022).

Acknowledgements

This work was supported by the National Science and Technology Major Project (grant no. 2022ZD0115102), the National Natural Science Foundation of China Project (grant nos. 624B2115 and U21A20427), a project (grant no. WU2022A009) from the Center of Synthetic Biology and Integrated Bioengineering of Westlake University and a project (grant no. WU2023C019) from the Westlake University Industries of the Future Research Funding.

Author information

Contributions

J.W. conceived the idea and developed the framework. Z.G. provided the crucial technology for the SE(3) GNN and structure generation, as well as the pretraining dataset. J.W. drafted the paper and C.T. helped write the Methods. C.T., Z.G., Y.Z. and G.Z. helped edit the paper. J.W. and C.T. prepared the code and released it on GitHub. S.Z.L. supervised the project and helped revise the paper.

Corresponding authors

Correspondence to Yang Zhang or Stan Z. Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary information for E3-CryoFold.

Supplementary Data 1

Statistical data for Supplementary Fig. 2.

Supplementary Data 2

Statistical data for Supplementary Fig. 3.

Source data

Source Data Fig. 2

Statistical source data for Fig. 2.

Source Data Fig. 4

Statistical source data for Fig. 4.

Source Data Figs. 5 and 6

Statistical source data for Figs. 5 and 6.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, J., Tan, C., Gao, Z. et al. End-to-end cryo-EM complex structure determination with high accuracy and ultra-fast speed. Nat Mach Intell 7, 1091–1103 (2025). https://doi.org/10.1038/s42256-025-01056-0

  • DOI: https://doi.org/10.1038/s42256-025-01056-0
