Fig. 6 | Scientific Data

Fig. 6

From: A large-scale curated and filterable dataset for cryo-EM foundation model pre-training

Validation details of CryoCRAB. The left column shows the micrograph images and their corresponding frequency domain representations, while the right column displays the intensity histograms of the images along with their minimum and maximum values. The preprocessing pipeline includes the following steps: (a) Input raw images: Displays the original unprocessed images and their frequency domain characteristics. (b) Background subtraction: Eliminates background variations caused by ice layer inhomogeneity using Gaussian blur, enhancing the contrast between signals and the background. (c) Band-limit to 3 Å: Applies band-limiting to the frequency domain of the images, reducing the impact of low signal-to-noise ratio regions and improving image quality. (d) CTF Filtered: Applies CTF filtering to further optimize the signal-to-noise ratio of the images. (e) Normalization and Standardization: Adjusts the pixel value range through contrast normalization, focusing on protein particle regions, followed by Z-score standardization to transform pixel values into a distribution with zero mean and unit variance, eliminating brightness bias and ensuring data consistency.
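Steps (b), (c) and (e) of the caption can be illustrated with a minimal NumPy sketch. This is not the CryoCRAB implementation: the function names, the blur width `sigma_px`, the hard circular frequency mask, and the 1st/99th percentile clipping used for contrast normalization are all assumptions chosen for illustration. CTF filtering, step (d), is omitted because it needs per-micrograph CTF parameters that the caption does not give.

```python
import numpy as np


def radial_frequency(shape, pixel_size):
    """Spatial frequency magnitude (cycles per Angstrom) of each FFT coefficient."""
    fy = np.fft.fftfreq(shape[0], d=pixel_size)
    fx = np.fft.fftfreq(shape[1], d=pixel_size)
    return np.hypot(fy[:, None], fx[None, :])


def subtract_background(img, sigma_px):
    """Step (b): subtract a Gaussian-blurred copy of the image.

    The blur is applied in Fourier space (periodic boundaries), using the
    Fourier transform of a Gaussian with standard deviation sigma_px pixels.
    """
    fy = np.fft.fftfreq(img.shape[0])
    fx = np.fft.fftfreq(img.shape[1])
    f2 = fy[:, None] ** 2 + fx[None, :] ** 2
    kernel = np.exp(-2.0 * (np.pi * sigma_px) ** 2 * f2)
    background = np.fft.ifft2(np.fft.fft2(img) * kernel).real
    return img - background


def band_limit(img, pixel_size, resolution=3.0):
    """Step (c): zero every Fourier coefficient finer than `resolution` Angstrom."""
    freq = radial_frequency(img.shape, pixel_size)
    mask = freq <= 1.0 / resolution  # hard cutoff; an assumption, not the paper's filter
    return np.fft.ifft2(np.fft.fft2(img) * mask).real


def normalize_and_standardize(img, low_pct=1.0, high_pct=99.0):
    """Step (e): clip to a percentile range (assumed), then Z-score standardize."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    clipped = np.clip(img, lo, hi)
    return (clipped - clipped.mean()) / clipped.std()


def preprocess(img, pixel_size, sigma_px=8.0, resolution=3.0):
    """Chain steps (b), (c) and (e); step (d), CTF filtering, is not included."""
    img = subtract_background(img.astype(np.float64), sigma_px)
    img = band_limit(img, pixel_size, resolution)
    return normalize_and_standardize(img)
```

For a micrograph stored as a 2D array `mic` with a pixel size of 1.0 Å, `preprocess(mic, pixel_size=1.0)` returns an array of the same shape with zero mean and unit variance. Note that a 3 Å band limit only removes information when the pixel size is below 1.5 Å, since the Nyquist limit is twice the pixel size.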
