Table 1 Overview of recent deep learning methods for cryo-EM image processing tasks.

From: A large-scale curated and filterable dataset for cryo-EM foundation model pre-training

| Downstream Task | Method Name | Publication | Year | Real Data | Datasets | Images |
| --- | --- | --- | --- | --- | --- | --- |
| Motion Correction | NoiseFlow [25] | ICPR | 2022 | No | 2 synthetic | Not mentioned |
| Motion Enhancement | DST-net [26] | IEEE/ACM TCBB | 2024 | No | 7 synthetic | 168 |
| Micrograph Denoising | Topaz-denoise [73] | Nat. Commun. | 2020 | Yes | 18 real | 3,439 |
| Micrograph Denoising | Restore [63] | IUCrJ | 2020 | Yes | 4 real | Not mentioned |
| Micrograph Denoising | NT2C [65] | BMC Bioinform. | 2022 | Yes | 3 real + 4 synthetic | 1,472 + 1,750 |
| Micrograph Filtering | Miffi [28] | J. Struct. Biol. | 2024 | Yes | 9 real | 45,768 |
| Micrograph Filtering | MicrographCleaner [27] | J. Struct. Biol. | 2020 | Yes | 16 real | 539 |
| Particle Picking | DeepPicker [29] | J. Struct. Biol. | 2016 | Yes | 5 real | 1,804 |
| Particle Picking | FastParticlePicker [30] | ICAMCS | 2017 | Yes | 3 real | 300 |
| Particle Picking | APPLEPicker [31] | J. Struct. Biol. | 2018 | Yes | 4 real | 324+ |
| Particle Picking | crYOLO [32] | Commun. Biol. | 2019 | Yes | 45 real | 840 |
| Particle Picking | Topaz [33] | Nat. Methods | 2019 | Yes | 6 real | 2,296 |
| Particle Picking | PIXER [34] | BMC Bioinform. | 2019 | Yes | 5 real + 1 synthetic | 476 + 496 |
| Particle Picking | PARSED [35] | BMC Bioinform. | 2020 | No | Not mentioned | Not mentioned |
| Particle Picking | DeepCryoPicker [36] | BMC Bioinform. | 2020 | Yes | 3 real | 350 |
| Particle Picking | CASSPER [37] | Commun. Biol. | 2021 | Yes | 4 real | 1,875 |
| Particle Picking | DRPnet [38] | BMC Bioinform. | 2021 | Yes | 1 real | 50 |
| Particle Picking | EPicker [39] | Nat. Commun. | 2022 | Yes | 10 real | 100 |
| Particle Picking | SynergyNet [40] | BIBM | 2022 | Yes | 3 real | Not mentioned |
| Particle Picking | CryoTransformer [41] | BMC Bioinform. | 2024 | Yes | 22 real | 5,172 |
| Particle Picking | CryoMAE [42] | WACV | 2025 | Yes | 5 real | 1,561 |
| Particle Picking | CryoSegNet [43] | Brief. Bioinform. | 2024 | Yes | 22 real | 4,948 |

Methods are grouped by task and ordered hierarchically from the movie level through the micrograph level to the particle level, covering motion correction, micrograph denoising, micrograph filtering, and particle picking in cryo-EM. The table highlights the scale of the training datasets used by these deep learning methods. Although most methods are trained on real data, the limited availability of annotated cryo-EM data restricts training to relatively small datasets, which constrains generalizability. CryoCRAB includes 746 proteins, comprising 152,385 sets of raw movie frames, which can be used to train cryo-EM foundation models that provide general-purpose feature extraction for downstream tasks.