Abstract
Cryogenic electron microscopy (cryo-EM) has become a premier technique for determining high-resolution structures of biological macromolecules. However, its broad application is constrained by the demand for specialized expertise. Here, to address this limitation, we introduce the Cryo-EM Image Evaluation Foundation (Cryo-IEF) model, a versatile tool pre-trained on ~65 million cryo-EM particle images through unsupervised learning. Cryo-IEF performs diverse cryo-EM processing tasks, including particle classification by structure, pose-based clustering and image quality assessment. Building on this foundation, we developed CryoWizard, a fully automated single-particle cryo-EM processing pipeline enabled by fine-tuned Cryo-IEF for efficient particle quality ranking. CryoWizard resolves high-resolution structures across samples of varied properties and effectively mitigates the prevalent challenge of preferred orientation in cryo-EM.
Data availability
Cryo-EM micrograph data and particle image data from EMPIAR used in training are available at https://www.ebi.ac.uk/empiar/; accession codes are given in Supplementary Tables 1 and 2. Density maps used to generate simulated cryo-EM particle images were downloaded from the EMDB, which is available at https://www.ebi.ac.uk/emdb/. Particle data from cryoPPP are available at https://calla.rnet.missouri.edu/cryoppp/. The raw data for 12 simulated and four genuine particle datasets can be found at https://zenodo.org/records/17066236 (ref. 59) and https://zenodo.org/uploads/17066297 (ref. 60), respectively. The resampled version of the CryoBench Ribosembly datasets is available on Zenodo at https://zenodo.org/records/17066704 (ref. 61). The reconstructed results obtained from CryoSolver and CryoWizard can be accessed on Zenodo at https://zenodo.org/records/17062718 (ref. 62).
Code availability
The code and accompanying documentation are available at https://github.com/westlake-repl/Cryo-IEF; the implementation is based on PyTorch.
References
Nogales, E. The development of cryo-EM into a mainstream structural biology technique. Nat. Methods 13, 24–27 (2016).
Cheng, Y. Single-particle cryo-EM—how did it get here and where will it go. Science 361, 876–880 (2018).
Bai, X.-C., McMullan, G. & Scheres, S. H. How cryo-EM is revolutionizing structural biology. Trends Biochem. Sci. 40, 49–57 (2015).
Frank, J. Advances in the field of single-particle cryo-electron microscopy over the last decade. Nat. Protoc. 12, 209–212 (2017).
Holcomb, J. et al. Protein crystallization: eluding the bottleneck of X-ray crystallography. AIMS Biophys. 4, 557–575 (2017).
Scheiner, G. The resolution revolution. Diabetes Self Manag. 32, 28–29 (2015).
Amunts, A. et al. Structure of the yeast mitochondrial large ribosomal subunit. Science 343, 1485–1489 (2014).
Liao, M., Cao, E., Julius, D. & Cheng, Y. Structure of the TRPV1 ion channel determined by electron cryo-microscopy. Nature 504, 107–112 (2013).
McMullan, G., Faruqi, A. & Henderson, R. Direct electron detectors. Methods Enzymol. 579, 1–17 (2016).
Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
Li, X. et al. Electron counting and beam-induced motion correction enable near-atomic-resolution single-particle cryo-EM. Nat. Methods 10, 584–590 (2013).
Bai, X.-C., Fernandez, I. S., McMullan, G. & Scheres, S. H. Ribosome structures to near-atomic resolution from thirty thousand cryo-EM particles. Elife 2, e00461 (2013).
Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
Punjani, A., Rubinstein, J. L., Fleet, D. J. & Brubaker, M. A. cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination. Nat. Methods 14, 290–296 (2017).
Zhou, Y., Moscovich, A., Bendory, T. & Bartesaghi, A. Unsupervised particle sorting for high-resolution single-particle cryo-EM. Inverse Probl. 36, 044002 (2020).
Zivanov, J. et al. New tools for automated high-resolution cryo-EM structure determination in RELION-3. Elife 7, e42166 (2018).
Bepler, T. et al. Positive-unlabeled convolutional neural networks for particle picking in cryo-electron micrographs. Nat. Methods 16, 1153–1160 (2019).
Wagner, T. et al. SPHIRE-crYOLO is a fast and accurate fully automated particle picker for cryo-EM. Commun. Biol. 2, 218 (2019).
Kimanius, D., Dong, L., Sharov, G., Nakane, T. & Scheres, S. H. W. New tools for automated cryo-EM single-particle analysis in RELION-4.0. Biochem. J. 478, 4169–4185 (2021).
Li, Y., Cash, J. N., Tesmer, J. J. G. & Cianfrocco, M. A. High-throughput cryo-EM enabled by user-free preprocessing routines. Structure 28, 858–869 (2020).
Zhong, E. D., Bepler, T., Berger, B. & Davis, J. H. CryoDRGN: reconstruction of heterogeneous cryo-EM structures using neural networks. Nat. Methods 18, 176–185 (2021).
Pfab, J., Phan, N. M. & Si, D. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proc. Natl Acad. Sci. USA 118, e2017525118 (2021).
Jamali, K. et al. Automated model building and protein identification in cryo-EM maps. Nature 628, 450–457 (2024).
Scheres, S. H. Processing of structurally heterogeneous cryo-EM data in RELION. Methods Enzymol. 579, 125–157 (2016).
Zhu, D. et al. Correction of preferred orientation-induced distortion in cryo-electron microscopy maps. Sci. Adv. 10, eadn0092 (2024).
Zhang, H. et al. CryoPROS: Correcting misalignment caused by preferred orientation using AI-generated auxiliary particles. Nat. Commun. 16, 4565 (2025).
Liu, Y. et al. Overcoming the preferred-orientation problem in cryo-EM with self-supervised deep learning. Nat. Methods 22, 113–123 (2025).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning 1597–1607 (PMLR, 2020).
He, K., Fan, H., Wu, Y., Xie, S. & Girshick, R. Momentum contrast for unsupervised visual representation learning. In Proc. IEEE/CVF International Conference on Computer Vision 9726–9735 (IEEE, 2020).
Chen, X., Xie, S. & He, K. An empirical study of training self-supervised vision transformers. In Proc. IEEE/CVF International Conference on Computer Vision 9640–9649 (IEEE, 2021).
Oquab, M. et al. Dinov2: learning robust visual features without supervision. In Transactions on Machine Learning Research https://openreview.net/pdf?id=a68SUt6zFt (2024).
Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).
Pai, S. et al. Foundation model for cancer imaging biomarkers. Nat. Mach. Intell. 6, 354–367 (2024).
Wang, X. et al. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 634, 970–978 (2024).
Xu, H. et al. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188 (2024).
Ma, C., Tan, W., He, R. & Yan, B. Pretraining a foundation model for generalizable fluorescence microscopy-based image restoration. Nat. Methods 21, 1558–1567 (2024).
Iudin, A. et al. EMPIAR: the Electron Microscopy Public Image Archive. Nucleic Acids Res. 51, D1503–D1511 (2023).
Dhakal, A., Gyawali, R., Wang, L. & Cheng, J. A large expert-curated cryo-EM image dataset for machine learning protein particle picking. Sci. Data 10, 392 (2023).
El Banani, M. et al. Probing the 3d awareness of visual foundation models. In Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition 21795–21806 (IEEE, 2024).
Zhong, E. D., Lerer, A., Davis, J. H. & Berger, B. CryoDRGN2: ab initio neural reconstruction of 3D protein structures from real cryo-EM images. In Proc. IEEE/CVF International Conference on Computer Vision 4066–4075 (IEEE, 2021).
Luo, Z., Ni, F., Wang, Q. & Ma, J. OPUS-DSD: deep structural disentanglement for cryo-EM single-particle analysis. Nat. Methods 20, 1729–1738 (2023).
Levy, A. et al. CryoDRGN-AI: neural ab initio reconstruction of challenging cryo-EM and cryo-ET datasets. Nat. Methods 22, 1486–1494 (2025).
Qin, B. et al. Cryo-EM captures early ribosome assembly in action. Nat. Commun. 14, 898 (2023).
Jeon, M. et al. CryoBench: diverse and challenging datasets for the heterogeneity problem in cryo-EM. In Proc. 38th Conference on Neural Information Processing Systems 89468–89512 (Curran, 2024).
Hu, M. et al. A particle-filter framework for robust cryo-EM 3D reconstruction. Nat. Methods 15, 1083–1089 (2018).
Tan, Y. Z. et al. Addressing preferred specimen orientation in single-particle cryo-EM through tilting. Nat. Methods 14, 793–796 (2017).
Liu, Z. et al. Determination of the ribosome structure to a resolution of 2.5 Å by single-particle cryo-EM. Protein Sci. 26, 82–92 (2017).
Fan, X. et al. Single particle cryo-EM reconstruction of 52 kDa streptavidin at 3.2 Angstrom resolution. Nat. Commun. 10, 2386 (2019).
Baxter, W. T., Grassucci, R. A., Gao, H. & Frank, J. Determination of signal-to-noise ratios and spectral SNRs in cryo-EM low-dose imaging of molecules. J. Struct. Biol. 166, 126–132 (2009).
Palovcak, E., Asarnow, D., Campbell, M. G., Yu, Z. & Cheng, Y. Enhancing the signal-to-noise ratio and generating contrast for cryo-EM images with convolutional neural networks. IUCrJ 7, 1142–1150 (2020).
Liu, Y.-T., Hu, J. & Zhou, Z. H. Resolving the Preferred Orientation Problem in CryoEM Reconstruction with Self-Supervised Deep Learning (Oxford Univ. Press, 2023).
McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).
Pettersen, E. F. et al. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Sci. 30, 70–82 (2021).
Rohou, A. & Grigorieff, N. CTFFIND4: fast and accurate defocus estimation from electron micrographs. J. Struct. Biol. 192, 216–221 (2015).
Lawson, C. L. et al. EMDataBank unified data resource for 3DEM. Nucleic Acids Res. 44, D396–D403 (2016).
Dosovitskiy, A. et al. An image is worth 16 × 16 words: transformers for image recognition at scale. In International Conference on Learning Representations https://openreview.net/pdf?id=YicbFdNTTy (2021).
Grill, J.-B. et al. Bootstrap your own latent—a new approach to self-supervised learning. In 34th Conference on Neural Information Processing Systems (NeurIPS 2020) https://papers.nips.cc/paper/2020/file/f3ada80d5c4ee70142b17b8192b2958e-Paper.pdf (2020).
Punjani, A., Zhang, H. & Fleet, D. J. Non-uniform refinement: adaptive regularization improves single-particle cryo-EM reconstruction. Nat. Methods 17, 1214–1221 (2020).
Yan, Y., Fan, S., Yuan, F. & Shen, H. Simulated cryo-EM particle datasets from paper ‘A comprehensive foundation model for cryo-EM image processing’. Zenodo https://zenodo.org/records/17066236 (2025).
Yan, Y., Fan, S., Yuan, F. & Shen, H. Genuine particle datasets for paper ‘A comprehensive foundation model for cryo-EM image processing’. Zenodo https://zenodo.org/uploads/17066297 (2025).
Yan, Y., Fan, S., Yuan, F. & Shen, H. The resampled CryoBench Ribosembly dataset used in paper ‘A comprehensive foundation model for cryo-EM image processing’. Zenodo https://zenodo.org/records/17066704 (2025).
Yan, Y., Fan, S., Yuan, F. & Shen, H. The reconstruction results of CryoSolver and CryoWizard. Zenodo https://zenodo.org/records/17062718 (2025).
Acknowledgements
We thank the HPC Center of Westlake University for providing computational facility support and technical assistance. This work was supported by the Ministry of Science and Technology of the People’s Republic of China (2024YFA0916903 to H.S. and 2022ZD0115100 to F.Y.), the National Science Foundation of China (32122042 and 32071208 to H.S. and U21A20427 to F.Y.), the Zhejiang Provincial Natural Science Foundation (DQ24C050001 to H.S.), the Research Center for Industries of the Future, Westlake University and the Westlake Education Foundation (to H.S. and F.Y.). We thank our colleagues Y. Shi, H. Yu, P. Lu, D. Ma, Q. Hu, Q. Zhou, J. Wu, Z. Yan, Z. Shi and J. Chai for generously sharing their in-house cryo-EM data. We also acknowledge the use of data from EMDB and EMPIAR for training our models.
Author information
Contributions
The project was conceived and supervised by F.Y. and H.S. Y.Y. was primarily responsible for training the AI models, while S.F. mainly handled the preparation and processing of cryo-EM data as well as the construction of the automated data-processing pipeline. The initial draft of the manuscript was written by Y.Y. and S.F. and subsequently revised and finalized by F.Y. and H.S. All authors reviewed and provided feedback on the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Wah Chiu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Allison Doerr, in collaboration with the Nature Methods team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Detailed architectures and scale and ablation tests of the AI models.
(a-c) Detailed architectures of the prediction head (a), the projection head (b) and the classifier head (c) in the AI models. (d,e) Effect of dataset size (d) and backbone size (e) on the performance of the Cryo-IEF model. (f,g) Effect of different training parameters (f) and loss functions (g) on the performance of the fine-tuned CryoRanker.
Extended Data Fig. 2 Pipeline for preparing training datasets.
The pre-training dataset contains particles from various sources (EMPIAR, CryoPPP and in-house datasets). The fine-tuning dataset is a subset of the pre-training dataset, in which particles were processed by 2D classification in CryoSPARC and assigned quality scores that were adjusted based on Class Ranker evaluation.
Extended Data Fig. 3 Pose-dependent feature clustering in Cryo-IEF.
(a) Cryo-IEF feature distributions for particles from four distinct 2D classes in EMPIAR-10217 (n = 3,000 particles per class; see Supplementary Fig. 2 for 2D classification results). (b) Corresponding feature analysis for three particle clusters in EMPIAR-10096 (n = 3,000 particles per cluster; Supplementary Fig. 3). In both datasets, Cryo-IEF demonstrates consistent ability to separate particles by orientation, as evidenced by distinct clustering patterns corresponding to different projection views.
Extended Data Fig. 4 Evaluation of CryoRanker by precision and recall metrics.
The precision scores in relation to the predicted particle scores (left panel) and the recall values in relation to the labeled particle scores (right panel) of the four genuine particle datasets are displayed, indicating the performance of the CryoRanker model.
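As a rough illustration of the two metrics named in this caption, the sketch below computes precision and recall for a set of scored particles. The caption does not define how the curves were computed, so the binary good/bad split at a cutoff of 0.5, the function name precision_recall and the toy scores are all assumptions made for illustration only.

```python
from typing import Sequence, Tuple


def precision_recall(
    predicted: Sequence[float],
    labeled: Sequence[float],
    pred_cutoff: float = 0.5,   # assumed cutoff on predicted scores
    label_cutoff: float = 0.5,  # assumed cutoff on labeled scores
) -> Tuple[float, float]:
    """Precision and recall when particles are kept if predicted >= pred_cutoff."""
    tp = fp = fn = 0
    for p, l in zip(predicted, labeled):
        selected = p >= pred_cutoff   # particle kept by the ranker
        good = l >= label_cutoff      # particle labeled as good
        if selected and good:
            tp += 1
        elif selected:
            fp += 1
        elif good:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy scores, not taken from the paper.
predicted = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labeled = [1.0, 1.0, 0.0, 1.0, 0.0, 0.0]
precision, recall = precision_recall(predicted, labeled)
```

Sweeping pred_cutoff over a range of values would give a precision curve as a function of predicted score, which is one plausible reading of the left panel.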
Extended Data Fig. 5 Correlation between CryoRanker scores and reconstruction resolutions.
Particles from six genuine datasets were divided into five equal-sized stacks based on CryoRanker scores (highest to lowest). Each stack was independently processed through CryoSPARC’s non-uniform refinement pipeline. Higher CryoRanker scores consistently yielded improved reconstruction resolutions, demonstrating the metric’s effectiveness for particle quality assessment.
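The stack-splitting step described in this caption can be sketched as follows. This is a minimal illustration, assuming the scores are available as a plain list of floats; the function name split_by_score and the toy scores are hypothetical, and the refinement of each stack in CryoSPARC is not shown.

```python
from typing import List, Sequence


def split_by_score(scores: Sequence[float], n_stacks: int = 5) -> List[List[int]]:
    """Group particle indices into n_stacks stacks, highest scores first."""
    if n_stacks <= 0:
        raise ValueError("n_stacks must be positive")
    # Sort particle indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Spread any remainder over the first stacks so sizes differ by at most one.
    base, extra = divmod(len(order), n_stacks)
    stacks, start = [], 0
    for k in range(n_stacks):
        size = base + (1 if k < extra else 0)
        stacks.append(order[start:start + size])
        start += size
    return stacks


# Toy scores, not taken from the paper.
scores = [0.91, 0.12, 0.55, 0.78, 0.33, 0.67, 0.05, 0.99, 0.41, 0.20]
stacks = split_by_score(scores)
```

Each inner list holds the particle indices of one stack; the first stack contains the highest-scoring particles and the last the lowest.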
Extended Data Fig. 6 Detailed flowchart for the fully automated cryo-EM data processing pipeline, CryoWizard.
The default pipeline is marked by blue arrows, while the pipeline for addressing the preferred orientation problem is marked by orange arrows. The preferred orientation problem is diagnosed by calculating the cFAR value of the refined structure during the initial model search step. The current pipeline is implemented in Python and interfaces with CryoSPARC-tools, an open-source Python library that enables scripted access to the CryoSPARC software package.
Extended Data Fig. 7 Automated structure resolution using CryoWizard with RELION.
CryoWizard, using RELION as the data processing platform, resolved a 3.3-Å cryo-EM map from the EMPIAR-10556 dataset. See Methods for details.
Extended Data Fig. 8 CryoWizard overcomes preferred orientation challenges.
(a) For datasets with severe preferred orientation (for example, EMPIAR-10217), CryoWizard’s clustering module groups Cryo-IEF-extracted features (UMAP visualization) using k-means++. Among eight resulting classes, Class 5 particles produce an isotropic template (cFAR score: 0.75) for final refinement, while other classes show varying degrees of orientation bias (cFAR scores shown). (b-e) Manually selected particles from EMPIAR-10217 (b) and EMPIAR-10096 (d) yielded severely anisotropic maps (cFAR=0.01 and 0.03, respectively), whereas CryoWizard processing (c,e) achieved dramatically improved isotropy (cFAR=0.74 for EMPIAR-10217 at 2.37 Å; cFAR=0.34 for EMPIAR-10096 at 2.78 Å). Manual selection details are given in Supplementary Figs. 2 and 3.
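The clustering and class-selection idea in panel (a) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the CryoWizard implementation: the 2D points stand in for Cryo-IEF features, two clusters are used instead of eight, and the cFAR values are hypothetical placeholders for scores that would come from refining each class.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points: List[Point], k: int, rng: random.Random) -> List[List[float]]:
    """k-means++ seeding: draw each new centre with probability proportional to D^2."""
    centres = [list(rng.choice(points))]
    while len(centres) < k:
        d2 = [min(sq_dist(p, c) for c in centres) for p in points]
        total = sum(d2)
        if total == 0:  # every point coincides with an existing centre
            centres.append(list(rng.choice(points)))
            continue
        r = rng.random() * total
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centres.append(list(p))
                break
        else:  # guard against floating-point round-off in the running sum
            centres.append(list(points[-1]))
    return centres


def kmeans(points: List[Point], k: int, n_iter: int = 50, seed: int = 0) -> List[int]:
    """Lloyd iterations started from k-means++ centres; returns one label per point."""
    rng = random.Random(seed)
    centres = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centre.
        labels = [min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points]
        # Move every non-empty centre to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy 2D features standing in for Cryo-IEF features (assumption).
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
labels = kmeans(features, k=2)

# Hypothetical cFAR value per class; real values would come from refinement.
cfar_by_class = {labels[0]: 0.10, labels[3]: 0.75}
best_class = max(cfar_by_class, key=cfar_by_class.get)
template_particles = [i for i, l in enumerate(labels) if l == best_class]
```

Here template_particles lists the particles of the class with the highest cFAR, mirroring how the caption describes the most isotropic class being used as the template for final refinement.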
Extended Data Fig. 9 Comparative performance of CryoWizard and spIsoNet in addressing preferred orientation.
The figure compares maps from conventional processing (a,g) and CryoWizard (d,j), before and after spIsoNet processing, for the EMPIAR-10096 (a-f) and EMPIAR-10217 (g-l) datasets. For each dataset, the initial maps from manual particle selection (a,g) and from CryoWizard (d,j) were processed through spIsoNet’s Anisotropy Correction module (b,e,h,k) or Misalignment Correction module (c,f,i,l). CryoWizard-generated maps served as superior inputs for spIsoNet processing compared to conventional maps, demonstrating the complementary strengths of both approaches. Manual selection protocols are detailed in Supplementary Figs. 2 and 3, with spIsoNet parameters described in Methods.
Supplementary information
Supplementary Information
Supplementary Figs. 1–4 and Tables 1–5.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yan, Y., Fan, S., Yuan, F. et al. A comprehensive foundation model for cryo-EM image processing. Nat Methods 23, 88–95 (2026). https://doi.org/10.1038/s41592-025-02916-8
DOI: https://doi.org/10.1038/s41592-025-02916-8