Noise2Void for denoising atomic resolution scanning transmission electron microscopy images

Thornley, William; Sullivan-Allsop, Sam; Cai, Rongsheng; Clark, Nick; Gorbachev, Roman; Haigh, Sarah J.

doi:10.1038/s41524-025-01939-1

Download PDF

Article
Open access
Published: 13 January 2026

Noise2Void for denoising atomic resolution scanning transmission electron microscopy images

William Thornley¹,
Sam Sullivan-Allsop^1,2,
Rongsheng Cai¹,
Nick Clark^1,2,
Roman Gorbachev^2,3 &
…
Sarah J. Haigh^1,2

npj Computational Materials volume 12, Article number: 68 (2026) Cite this article

2837 Accesses
1 Altmetric
Metrics details

Subjects

Abstract

The Noise2Void technique is demonstrated for successful denoising of atomic resolution scanning transmission electron microscopy (STEM) images. The technique is applied to denoising atomic resolution images and videos of gold adatoms on a graphene surface within a graphene liquid-cell, with the denoised experimental data qualitatively demonstrating improved visibility of both the Au adatoms and the graphene lattice. The denoising performance is quantified by comparison to similar simulated data and the approach is found to significantly outperform both total variation and simple Gaussian blurring. Compared to other denoising methods, the Noise2Void technique has the combined advantages that it requires no manual intervention during training or denoising, no prior knowledge of the sample and is compatible with real-time data acquisition rates of at least 45 frames per second.

Stripe noise removal in conductive atomic force microscopy

Article Open access 16 February 2024

Unsupervised deep denoising for four-dimensional scanning transmission electron microscopy

Article Open access 13 October 2024

AI-driven image processing framework for high-accuracy detection and characterization of vacancies in 2D materials

Article Open access 25 February 2026

Introduction

The latest (scanning) transmission electron microscope ((S)TEM) instruments are capable of spatial resolutions of better than 50 pm, allowing atomic structure to be resolved for many different crystal orientations^1,2,3,4. Yet imaging at such high magnifications requires high electron fluence to provide sufficient signal-to-noise ratio (SNR) in the resulting images so that information is transferred^5,6. Thus, the limiting factor to achieving atomic resolution (S)TEM imaging of a particular material is often it’s stability under a high-energy electron beam^7,8,9. Image denoising techniques, which aim to improve the SNR of images after acquisition, have been studied for many years, with a particular focus on removing additive Gaussian white noise (AGWN)¹⁰. By improving the SNR, denoising enables information transfer to be retained while lowering the electron fluence (electrons incident per unit area)^11,12. Effective denoising therefore has the potential to unlock improved atomic resolution (S)TEM imaging of systems such as metal organic frameworks and pharmaceutical crystals, where spatial information transfer is currently limited by the material’s electron beam sensitivity⁷.

For in situ (S)TEM imaging of dynamic processes, the requirements for excellent electron stability are further increased since the material must survive for multiple image frames¹³. Here, successful image denoising provides opportunities for fractionating the material’s critical electron dose across more images: either increasing the number of frames in an image series before damage is observed or increasing frame acquisition rate while retaining information transfer within the individual images.

In this work, we focus on denoising for a particular experimental challenge, one that combines demanding requirements for both spatial and temporal resolution in the STEM. This is the investigation of local atomic motion at solid-liquid interfaces; behaviour that underpins many physical processes such as wetting, adhesion, chemical etching, dissolution and solidification^14,15,16. Such studies have recently become achievable in both TEM and STEM imaging modes using advanced liquid cells^17,18. Imaging of volatile liquid-phase samples inside the vacuum of the TEM requires confining a thin volume between two electron transparent windows that are impermeable to liquid (Fig. 1). In these environmental cells, both the liquid and the cell windows introduce undesired distortions to the electron beam as it transmits through the sample lowering the SNR^16,19. Graphene windows are thinner and less dense than commercial silicon nitride windows so introduce less unwanted scattering^12,19,20. However, even the presence of the liquid alone will lower the SNR compared to the same sample imaged in vacuum¹¹. To maintain spatial resolution in liquid-cell (S)TEM images therefore requires higher electron fluence than achieving the same resolution ex situ in the absence of liquid.

**Fig. 1: Schematic illustration of the experimental setup and acquired data.**

A further challenge associated with liquid-cell TEM and STEM is the potential for electron beam-induced changes to the liquid-cell chemistry. For aqueous solutions, these radiolytic changes to the local liquid environment occur at comparatively low electron fluence²¹. The effects include increasing pH and the creation of chemically active free radicals, which may be only indirectly visible in the (S)TEM images as behavioural artefacts within the system under investigation²². Various methods exist to mitigate radiolytic effects by altering the solution chemistry to inhibit the concentration of radiolytic species²¹ yet ultimately the coupling of SNR and required image resolution determines the minimum electron dose in the liquid-cell¹¹. Denoising offers the potential for improving the SNR of liquid-cell images post-acquisition, enabling reduction in the electron fluence and consequently reducing radiolytic changes in the liquid environment.

Deep learning techniques have advanced the field of image denoising over the last decade with improved performance, computational efficiency (in application) and by not being reliant on manual parameter tuning²³. Deep learning techniques typically require training data, in the form of input and ground-truth/target pairs, to train models. Application to experimental denoising tasks like improving the SNR of (S)TEM images, therefore presents a challenge because often the ground-truth input data (noise-free equivalent of the experimental data) is not known.

There are several examples of supervised deep learning denoising techniques being successfully applied to (S)TEM images^24,25,26. These have been generally found to outperform more traditional approaches such as total-variation (TV) regularisation²⁷, non-local low-rank/sparse-representation techniques^28,29 and simple filters³⁰. However, they required extensive training from image simulations or artificially noised images, which can be computationally expensive, and once trained demonstrate limited transferability to different samples. Various examples of the application of denoising to environmental cells can be found in literature^31,32,33 but the available improvement in SNR was relatively modest despite high computational complexity.

The Noise2Noise family of deep learning techniques, which includes Noise2Void, present a powerful alternative solution to the problem of lacking ground-truth (noise-free) equivalents of noisy experimental images but which has not been rigorously investigated for application to denoising of (S)TEM data. Broadly, the Noise2Noise approaches use specialised training regimes in order to utilise the noisy experimental images as both their input and ground-truth training data, thereby providing opportunities for image denoising without requiring manual intervention or extensive simulated data to train a model. The Noise2Void technique involves applying these specialised training regimes to a UNet³⁴, which is a type of convolutional neural network (CNN) that would otherwise typically require supervised training (involving paired/labelled training data)³⁵.

In this paper we test application of the Noise2Void denoising approach to the exemplar system of atomically resolved STEM imaging of gold adatoms in graphene liquid-cells (Fig. 1). The sample is atomically dispersed gold on few-layer graphene, imaged in situ within a graphene liquid-cell to visualise the dynamic behaviour at the solid-liquid interface (Fig. 1b). We present denoising of experimental and simulated data, and quantitatively compare denoising performance to simple Gaussian filtering and the more computationally demanding total variation approach. We demonstrate how this denoising approach can be used as an initial step to enhance the effectiveness of feature identification from atomic resolution images, as illustrated in Fig. 1b.

Results

Noise2Void implementation

To train a suitable denoising model for a particular set of images, the Noise2Void algorithm assumes that the image-noise is random and pixel-wise uncorrelated³⁴. For each training datum, the algorithm uses the same noisy image as both input and ground-truth, thereby training the network to replicate the input at the output. Taken alone, this would simply train the model to learn unity, making it useless. The key insight is that a ‘blind-spot’ is introduced to the model, preventing it from using the value of pixel A at the input to predict the value of pixel A at the output. Each pixel value at the output must be predicted using that pixel’s neighbours (at the input), but not that pixel itself. Since pixel-wise uncorrelated information cannot, by definition, be predicted with knowledge of a pixel’s neighbourhood only, it cannot be reproduced at the model’s output. Pixel-wise correlated information in the image can however be reproduced at the output. Therefore, in attempting to reproduce the input at the output, the network reproduces the input image without pixel-wise uncorrelated noise³⁴. The ‘blind-spot’ effect is not enforced in the network architecture, but is implemented in training. During training, the input image has a grid of pixels masked, with their intensity replaced by some other value.

The family of Noise2Noise-inspired algorithms includes the original Noise2Noise, as well as Noise2Self, Noise2Void and N2V2^34,36,37,38. The Noise2Void technique uses a UNet architecture³⁵ and the ‘blind-spot’ masked pixel’s intensity is replaced with one of its neighbour’s³⁴. However, the key novelty of the Noise2Void technique itself does not place a strict/specific constraint on the exact network architecture. N2V2³⁸ improves on Noise2Void, in part by using a modified UNet, which is the approach we adapt in this work. In N2V2, the masked pixel’s intensity is replaced not by a neighbour but with a local average³⁸. For both Noise2Void and N2V2, the model’s training loss is calculated by determining the model’s ability to reconstruct the input image only at the masked pixels. Since each masked pixel is modified at the input, their prediction at the output can only be made based on the intensities in their local neighbourhoods, thus implementing the ‘blind-spot’ without needing to add constraints to the network architecture. To successfully apply Noise2Void to denoising of the STEM data considered in this work, here several adaptations were made to the Noise2Void techniques described previously (in the original Noise2Void and the N2V2 paper^34,38). This included modifications to the input format, the network architecture and the training regime to allow application of the approach to experimental STEM images.

The Noise2Void approach is limited to only statistically uncorrelated image noise³⁴. For in situ (S)TEM experiments, the liquid within a liquid-cell always imparts statistically uncorrelated noise onto the (S)TEM images^11,39 and this can often be the dominant source of image noise. Also, as discussed earlier, it is often desirable to minimise beam-dose during in situ experiments and this also introduces uncorrelated image noise²⁸. Therefore despite being limited to uncorrelated image noise, Noise2Void is particularly suited to in situ liquid-cell experiments.

Modification to support multiple channels

One of the advantages of the STEM imaging approach is the ability to simultaneously acquire data from different detectors. Information in these images from different detectors is often physically correlated. To illustrate the application of Noise2Void to such data sets we have acquired STEM image pairs, collecting data from the bright field (BF) and annular dark field (ADF) detectors simultaneously (see Fig. 1a). These images are generated from detector measurements acquired simultaneously during a single raster scan and contain physically correlated, yet distinct, information due to the different angular sampling of the transmitted electrons, so should be considered together to gain the best understanding of the material under investigation. Images produced from the ADF detector with high collection solid angle are often referred to as Z-contrast, being dominated by Rutherford scattering with the intensity scaling with the sample’s thickness and atomic number⁴⁰. The BF detector collects small angle scattering to produce a phase contrast image or interference pattern where regular crystalline features tend to give strong contrast, although it is not directly interpretable in terms of the crystal structure of the material⁴¹. Previous implementations of Noise2Void and N2V2 in literature have only applied the technique to monochrome images (i.e. a single colour channel)^34,38. As our experimental data-sets consists of time-series of image-pairs, we have modified the network to operate on both ADF and BF channels in unison by implementing the input frames as dual-channel images. This has the advantage that the network can use BF channel information to enhance its denoising of the ADF channel and vice versa.

Modification to network architecture

In this work, the UNet architecture was modified compared to the original UNet paper³⁵ and to all previous Noise2Void papers^34,38. The original UNet paper uses an encoder-decoder architecture with a depth of four, max-pooling between layers and 64 feature channels at the output of its first layer³⁵. The original Noise2Void paper changed the UNet to have a depth of two and 32 feature channels at the output of its first layer (so a much leaner network)³⁴. The effect of the number of layers in the UNet architecture in dictating the denoising success was tested in the range two to four layers, with four layers found to give the best performance. Therefore for all the results presented in this work, the UNet has a depth of four, with 64 feature channels at the output of the first layer.

The pooling technique was also experimented with, inspired by modifications to pooling reported in the N2V2 paper³⁸, with the aim of optimising denoising performance. Replacing the max pooling layers of UNet with average pooling was found to improve denoising performance, likely due to the greater linearity of information transfer down the network.

A further architecture modification made here is the increase of the receptive field compared to the original Noise2Void implementation³⁴ and the N2V2³⁸ architecture. With this change the network has more information on which to make pixel-predictions. Although increasing the number of layers in the network would also achieve this, increasing the number of layers beyond four significantly increases GPU VRAM requirements so the approach becomes harder to practically implement. Increasing the receptive-field in only the first layer was sufficient to achieve improved denoising performance. The size of the convolutional kernel in the first layer was therefore increased to 5 × 5, with 3 × 3 kernels elsewhere, whereas other networks use 3 × 3 kernels in all layers. Our modified Noise2Void architecture is illustrated schematically in Fig. 1a.

Modification to training

Our Noise2Void model was first trained on 8475 dual-channel frames from a full data set of 16,750 frames with the training data extracted by sampling every other frame. Initial results with Noise2Void produced grid-like artefacts in the denoised frames (see Fig. 2b), unwanted features which are also reported in the original Noise2Void paper³⁴. These artefacts are reduced with the addition of a jitter to the positions of the masked pixels (px) in the training images, specifically random displacements of up to 2 px translations, and modification to the network’s upsampling approach⁴². This differs from the N2V2 approach, where grid-like artefacts were reduced by removing the outermost skip connection, which in this case could have limited spatial resolution.

**Fig. 2: Modified approach to remove chequerboard artefacts found when denoising.**

Other modifications to the Noise2Void architecture were also tested while optimising the Noise2Void denoising performance, including replacing the convolution blocks with Xception⁴³ style separable convolutions (to improve network efficiency), adding an Atrous Spatial Pyramid Pooling (ASPP) layer⁴⁴ at the bottleneck and replacing every convolution block with an ASPP block, but all resulted in a poorer denoising result. Removing the outermost skip-connection compared to the original UNet was also tested as reported in N2V2³⁸, but this also resulted in worse performance.

Denoising performance—experimental data

Figure 3a shows a representative frame from the experimental dataset, consisting of a dual-channel ADF-BF image pair with the ADF image shown on the top row and the simultaneously acquired BF image on the bottom row. The aim of the denoising is to improve confidence in identification of the positions of the Au adatoms at the solid liquid interface (bright features in the ADF images and dark features in the BF images) even at low electron fluence. A secondary aim is the identification of the location of the Au adatoms with respect to the underlying graphene lattice, which requires resolving the graphene’s atomic lattice. We first consider the latter imaging challenge. Figure 3b-d compares the same experimental image pair after Gaussian denoising, TV denoising and after the Noise2Void approach. Unlike our modified dual-channel Noise2Void, Gaussian and TV denoising are not dual-channel techniques, and so are applied to each image channel separately. To more clearly highlight the different SNRs, intensity profiles are extracted from identical locations on all images demonstrating the intensity modulation resulting from the graphene lattice sampled along the $[10\bar{1}0]$ direction as shown in Fig. 3a–d. Examination of Fig. 3d shows that the UNet Noise2Void approach has revealed the underlying graphene lattice in both the BF and the ADF channel, while the periodicity is not clear in the ADF images subjected to either Gaussian blurring or TV denoising (Fig. 3b,c).

**Fig. 3: Demonstration of Noise2Void denoising performance on the experimental graphene liquid-cell data compared to standard methods.**

Once the graphene lattice is resolved, it is necessary to identify the number and position of Au adatoms on the graphene surface. Fig. 4 provides similar data to Fig. 3 but with the line profile sectioning through an Au adatom on the graphene inside the liquid-cell.

**Fig. 4: Demonstration of Noise2Void denoising performance on the experimental graphene liquid-cell data (ADF and BF image pairs) and comparison to standard denoising methods.**

The TV and Noise2Void denoisers produce flatter ‘background’ (Au-free graphene region) intensities in the ADF image than Gaussian blurring, which has the benefit of increasing the number of Au adatoms that are correctly identified (as shown in Fig. 1b).

A rigorous quantification of SNR in TEM images requires the underlying ‘ground-truth’ image signal be known, to accurately measure the noise. This is achievable by comparison to image simulations as demonstrated in section ‘Denoising Performance—Simulated Data’. Nonetheless, useful insights can still be gained from consideration of experimental SNR values.

We first consider Noise2Void denoising of the experimental ADF images. The background ‘noise’ includes a ‘real’ periodic signal from the graphene lattice as well as unwanted intensity variations resulting from both the presence of the liquid and background noise from the imaging system. The Z-contrast dependence of the ADF signal means the graphene lattice is only a weak modulation compared to the intensity of Au adatoms, so when seeking to identify only the Au adatom peak locations it can be considered small compared to the unwanted signals we wish to remove. In practice, Noise2Void denoising can reveal the periodic modulations of the graphene lattice in the ADF signal (as shown in Fig. 3d). However, we ignore this in the following analysis, recognising this lattice signal will result in underestimation of the SNR improvement when gold-free regions are considered as the noise reference.

The SNR for the Au adatoms measured from the ADF intensity line profile in Fig. 4 gives a value of 2 for the raw input image. This is improved to 3.3, 7.2 and 12 for Gaussian blurring, TV denoising and Noise2Void denoising, respectively (see ‘Methods’ for details of SNR quantification). Arguably more challenging than identification of the positions of well isolated individual adatoms, is the ability to resolve separate adatoms that are in close proximity. This competes with achieving a high SNR for single adatoms when using Gaussian blur so may be a limiting factor for conventional denoising approaches. Fig. S1 compares intensity line profiles where two Au adatoms are separated by 0.34 nm (~7.4 px). Quantification of the drop in the intensity of the ADF image between the adatoms, relative to the height of the least intense adatom peak reveals a drop of 57% for the original raw input image, compared to 28%, 14% and 29% for the Gaussian blur, TV and Noise2Void, respectively. Together, the data in Figs. 4 and S1 suggest that Noise2Void is the better denoiser for the ADF channel because it improves the SNR significantly from 2 to 12, while also retaining the ability to separately distinguish nearby features, which can be suppressed by other denoising methods. Examples of denoised Au nanoclusters can be seen in Fig. S4.

Compared to the ADF image channel, the denoising performance for the BF channel in Fig. 3 is harder to assess because the feature of interest (the graphene lattice) covers the whole image and there is no region that can reasonably be approximated as purely background. Considering Figs. 4 and S1, in all denoisers, but especially TV and Noise2Void, higher-frequency components have been diminished, yet it is not clear if this significantly improves the image for the extraction of the crystallographic lattice in later analyses.

The transfer of periodic crystallographic information after denoising is more easily assessed in Fourier space, as shown in Fig. 5, where the upper row is the magnitude of the fast Fourier transform (FFT) of the ADF image and the second row is the FFT-magnitude of the corresponding BF image. The input frame is shown in Fig. 5a, while Fig. 5b-d shows each denoiser’s output frame. The Gaussian denoised FFT-magnitude shows that only high spatial frequency noise has been removed from both the ADF and the BF images, as expected with this simple approach. Encouragingly, the Noise2Void approach has both suppressed the high-frequency noise and also enhanced the transfer of low spatial frequencies relative to the high spatial frequencies in both ADF and BF images. The graphene lattice appears in the FFT-magnitudes as sets of hexagonal lattice spots. These are visible in the FFT-magnitudes of the BF input image and in all the denoised BF images. However, in agreement with the real space analysis in Fig. 3, only denoising with Noise2Void reveals all six of the first-order graphene spots (corresponding to the $\{01\bar{1}0\}$ lattice spacings) for the FFT of the ADF image.

**Fig. 5: Fourier space demonstration of Noise2Void denoising performance on the experimental graphene liquid-cell data and comparison to standard denoising methods.**

The third and fourth rows of Fig. 5 show six-fold reduced polar plots of the FFT-magnitudes, where these have been warped by a polar transformation (such that the horizontal axis now corresponds to the radial direction in the above FFT-magnitudes) and summed periodically over 60^∘ (azimuthal) sections/wedges. Line profiles have then been overlain, where they display the totally azimuthally integrated intensity as a function of radius (the radial profile). In these radial profiles, each set of six-fold symmetric hexagonal graphene lattice spots reduces to a single spot. While the lowest order graphene lattice ($01\bar{1}0$ spot) is clearly visible in the BF radial profile for all data presented, careful inspection will also reveal these spots in the ADF radial profiles. The higher order ($2\bar{1}\bar{1}0$ type) spots can also be seen in both channels which, for the BF channel have SNR values of 9.0, 8.2, 10.7 and 16.0 for the input, Gaussian blur denoised, TV denoised and Noise2Void denoised images, respectively.

The Noise2Void ADF FFT-magnitude shows an unusual hexagonal symmetry in the information transfer, approximately corresponding to the orientation of the graphene lattice. It appears that Noise2Void has selectively suppressed regions in the FFT more distant from the lattice-points of the graphene lattice, in a manner akin to a process that is often performed manually by selective FFT filtering to enhance the visibility of a crystal lattice. Nonetheless, no component of the underlying UNet architecture is six-fold rotationally symmetric, meaning the Noise2Void model has learned this symmetry during training. While denoising of a perfect lattice can be effectively achieved with manual FFT filtering, this utilises prior knowledge of the sample’s structure to choose how and where to modify the FFT (i.e. a physics-based model). In contrast, Noise2Void has no initial knowledge of the sample and must therefore have learned the intrinsic six-fold symmetry by identifying correlations within the videos (i.e. a data-driven model). The hexagonal symmetry introduced by Noise2Void denoising accommodates the various orientations of the crystal lattice across the experimental dataset. The resemblance of the FFTs of Noise2Void’s denoised videos to what one might expect from an expert’s FFT filtering can be seen as endorsement of the ability of Noise2Void to identify and patterns in the training data and exploit that correlation to enhance denoising performance. Expert FFT filtering is often laborious (requiring adaptation whenever the sample’s orientation changes), whereas Noise2Void’s results in similar success but without manual intervention.

More generally, when applied to the whole experimental dataset, we find the Noise2Void denoised images result in 89% more successfully tracked adatoms in comparison to the raw images. This practical measure of performance demonstrates the real-world enhancement that this technique can bring to a wider data-analysis pipeline, which involved tracking 5266 adatoms (improved from 2771) across a large dataset consisting of over a hundred experimental videos.

Denoising performance—simulated data

Despite the apparent success of the Noise2Void method presented in Figs. 3, 4, 5 and S1, the lack of ground truth data for experimental images limits quantitative comparison of the different approaches. We therefore apply our Noise2Void approach to denoising of simulated STEM image pairs that closely resemble the experimental data and compare the results to noise-free simulations (see methods for further information). Three metrics are used to allow quantitative comparison; peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and normalised root-mean-square error (N-RMSE). Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) indicate the similarity of the image to the ground-truth (higher is better), while normalised root-mean-square error (N-RMSE) indicate the difference from the ground-truth (lower is better). While PSNR is the most widely used metric, SSIM is more closely correlated with human perception and N-RMSE is comparatively simpler to understand⁴⁵. The implementation for each can be found in scikit-image’s ‘metrics’ module⁴⁶ (see ‘Methods’ section). The denoising results for simulated data with an input PSNR value of 7 dB and 6 dB for the ADF and BF channels respectively, are presented in Fig. 6 where our Noise2Void approach is again compared to Gaussian and TV denoising. We also compare to BM3D denoising⁴⁷ (Fig. S6). The PSNR, SSIM and N-RMSE values are summarised in Fig. 7. This quantitative comparison shows that the Noise2Void approach gives the best denoising performance for both BF and ADF images, even for very noisy simulated input data, with all metrics significantly outperforming Gaussian blurring and the TV approach for our simulated data. Similar behaviour is seen when considering the SSIM and N-RMSE values with the noisy input data showing an order of magnitude improvement in all metrics for the ADF channel. The SSIM and N-RMSE improvements for the BF channels are also similarly significant, corresponding to 9 and 6 fold improvement factors, respectively. Qualitatively, our comparisons suggest that our Noise2Void approach can result in less significant image artefacts than TV and BM3D denoising (see Figs. 6 and S6).

**Fig. 6: Demonstration of the denoising performance of the three denoising techniques applied to a simulated liquid-cell frame.**

**Fig. 7: Evaluation of the denoising performances of a simple Gaussian blur, TV denoising and Noise2Void applied to a simulated liquid-cell frame.**

Denoising at higher PSNR

The experimental input images used in section ‘Denoising performance – experimental data’ have relatively high SNR (~2 in the ADF images, qualitatively corresponding to a PSNR of ~16 in the simulated ADF images, an order of magnitude higher than the input simulations in Fig. 6). This high SNR has the advantage that it provides guidance for feature identification from the raw data, yet it is highly desirable to collect raw experimental data with much poorer SNR to minimise the electron dose applied to the sample. To enable optimisation of imaging conditions and infer the minimum electron fluence for data acquisition, we now consider denoising of data for a range of different input PSNR values. The PSNR scale (when measured in dB) is logarithmic so our trained Noise2Void algorithm was applied to simulations with input PSNRs in the range from 7 dB to 20 dB. Figure 8 demonstrates denoising of input simulated ADF images with PSNR values of 16.0 and 9.9 dB, showing improvements to 30.9 and 27.5 dB, respectively, in the denoised ADF outputs. A quantitative comparison of the denoising performance for PSNR values from 7 dB to 20 dB and a comparison to TV denoising is shown in Fig. 9 (and Fig. S3). Our Noise2Void approach is shown to outperform TV denoising in the ADF channel for all input PSNRs, with the most effective denoising achieved for the ADF channel over the BF. Furthermore, the output PSNR in the denoised image remains exceptionally high across all input PSNRs, giving an improvement ratio of the logarithmic PSNR (PSNR output dB / PSNR input dB) of ~3 at the lowest input PSNR, compared to an improvement ratio of ~1.6 for the highest input PSNR, both for the ADF channel.

**Fig. 8: Demonstration of the denoising performance for Noise2Void for simulated liquid-cell data with different PSNR values.**

**Fig. 9: Noise2Void and TV denoising performance as a function of input noise, for the ADF channel.**

As discussed earlier, it is experimentally desirable to minimise the electron fluence, however, this reduces the SNR. Since Fig. 9 shows that denoising performance is not a simple function of input SNR, an exact factor cannot be given for the reduction in electron fluence that our trained Noise2Void can compensate for, it depends on the input SNR. However, assuming a reasonable SNR value for the input images, such as the PSNR of 14dB shown in Fig. 9, our trained Noise2Void denoiser improves the SNR by approximately an order of magnitude and therefore reduces the necessary electron fluence by approximately an order of magnitude, for a given sample. An experimental demonstration of our denoiser’s performance at electron fluences differing by approximately an order of magnitude can be seen in Fig. S5, where two images with electron fluences of 3.17 × 10⁵e⁻nm⁻²s⁻¹ and 1.67 × 10⁶e⁻nm⁻²s⁻¹ are shown.

Artefact analysis

Denoising techniques risk introducing image artefacts into the denoised images. It is therefore necessary to understand the potential image artefacts from our trained Noise2Void model.

An image containing only Gaussian noise was denoised using the trained model. The output images and their FFTs (see Figs. S7, S8) show patterns that lack well defined periodicities. Another image, containing a bright point in the middle of the ADF channel, was also denoised (see Fig. S9). Together, these show that, while the denoiser does impart artefacts on the image, these do not have well-defined periodicities and are of low intensity compared to the intensities of atom-like features present in the training and simulation datasets.

Finally, holes in the graphene liquid-cell’s windows, which can be seen in the bright field channel of images after increasing electron dose by an order of magnitude to damage the graphene, were checked to ensure adatoms are not present within the defective region. This would be unphysical and therefore guaranteed to be an image artefact. The lack of adatoms in these regions (see Fig. S10) suggests that the denoiser is not generating adatoms artificially.

It is important to note that the risk of artefacts may be higher where there is not a sufficiently large number of input images for model training. Unlike TV denoising or simple Gaussian blurring therefore, our modified Noise2Void approach would not be suitable where there is only a small number of experimental images.

Comparison of denoising speed

The speed of a particular denoising method is important if it is to be applied for real time denoising as well as for efficient data processing post-acquisition. A significant speed milestone for TEM denoising is therefore matching the rate at which experimental data is acquired, allowing the method to be used to direct or assist with the experimental data collection.

The precise speed of a denoiser will depend on several factors, including disk read-write speed, processor performance, hardware utilisation etc. Nevertheless, it is informative to compare the relative speed of each denoising technique applied to the same video series. The time required for Gaussian blurring, TV and our Noise2Void technique was compared for the denoising of a single video (containing about ~100 frames) on the same hardware. The mean time-per-frame of the modified Noise2Void approach was found to be 22 ms (45 frames-per-second), only twice the 11ms required by simple Gaussian blurring, and an order of magnitude less than the 300ms required by TV. The test was repeated three times to assess the variability of the time required for each denoising approach and the results were found to vary by approximately 10%. We also found our modified Noise2Void approach to be approximately 13 times faster than BM3D denoising (see Fig. S6). All approaches are faster than the relatively slow data acquisition rates used for these liquid-cell STEM experiments (~1.9 frames-per-second). Nevertheless faster acquisition rates of 10–20 frames-per-second (50–100 ms per acquisition) are routinely used in TEM and STEM imaging, for example during instrument alignment and for analysing electron-beam sensitive samples. Noise2Void, therefore, has the advantage compared to TV denoising of being fully compatible with real-time STEM imaging, suggesting it could provide a valuable tool to assist with alignments, especially when imaging electron beam sensitive samples or where low-dose conditions are required.

Transfer training

Training of the Noise2Void model is the slowest and most computationally intensive part of the Noise2Void denoising. We find that training requires a high-performance workstation while the denoising can be done on a standard laptop. The time required to train a Noise2Void model is highly dependent on the compute resources available and amount of training data. Approximately an hour and a half was required to train each Noise2Void model, with more training details available in the ‘Methods’ section. The ability of a trained model to be used for a new experimental data set is therefore important for the wider usability and applicability of this denoising approach. Fig. 10 shows the success of the trained Noise2Void approach when applied to a new set of experimental data. The new experimental dataset was 1230 experimental frames, which were again dual-channel images, 512 × 512 px in size, acquired on a similar but not identical sample. The data was obtained by a different microscopist, and during a different experimental session so instrumental parameters will have changed including lens aberrations, which dictate information transfer to the image. Figure 10a shows an example frame from the new experimental data. Figure 10b shows the result from our modified Noise2Void model (presented in Figs. 3 and 5 and trained on previous experimental data) when applied to denoise the new data. Fig. 10c compares this to where the original model was used as a starting point, but then further trained using a subset of the new experimental data (so overall the model had been trained on both old and new datasets). We refer to this as a ‘transfer trained’ model. Finally, we consider the model performance relative to where completely fresh models were trained only on half the new data set (Fig. 10d) and only on the complete new data set (Fig. 10e). Examination of the images and line profiles demonstrates that all approaches denoise the data effectively, with the ADF SNR ratio improved for all the denoised data. The model trained on fresh data gives a slightly improved transfer of the graphene lattice in the ADF channel after denoising, but the improvement is small demonstrating that retraining is not required as part of a workflow for similar samples.

**Fig. 10: Demonstration of the effectiveness of denoising a new set of experimental data.**

The similarity of the denoiser outputs in Fig. 10 suggests that transfer training and training from scratch provided little improvement over the model trained only on the original dataset. This may be because the original model was generalised to an extent such that the difference between our old and new experimental data was not significant. It therefore cannot be said, for our two experimental data sets, that transfer training provided an advantage over training from scratch. However, this does not imply that the same would be true for two different datasets and where the two datasets differ sufficiently, there may well be a significant advantage to transfer training over training from scratch.

Comparison with mono-channel networks

A significant difference of this work over earlier implementations of Noise2Void is the application to dual-channel images. It is therefore interesting to consider whether this dual-channel UNet provides improved performance compared to two single-channel UNets for denoising dual-channel data. To test this, a Noise2Void denoiser was created using two separate mono-channel UNets, each trained on a single channel (ADF or BF images). Both networks are identical, except for the number of channels at the input and output, and each mono-channel model was trained for the same number of training epochs as the dual-channel model used elsewhere in this work. Comparison of the training losses for both the dual and the mono-channel networks (Fig. S11) shows that the dual-channel and ADF mono-channel plateaux after 32 epochs, but the BF mono-channel is continuing to decrease, suggesting further training is required.

While the ADF mono-channel showed similar results to the dual-channel Noise2Void denoiser, image artefacts were present in the output of the BF mono-channel denoiser, which is to be expected when the model’s training has not converged. While further training of the BF mono-channel network may remove these artefacts, it’s presence suggests that the mono-channel denoising technique requires more training compared to the dual-channel model. This would require more computing resources and is therefore disadvantageous.

The speed of the dual-channel denoiser was also compared to the mono-channel, and found to be around a factor of two faster, with frames taking on average 43 ms per frame for the mono-channel compared to 23 ms per frame for the dual-channel denoiser (both times are averages over 100 frames). This is unsurprising since for the mono-channel case there are two UNets rather than one. The combination of longer training and slower denoising speed demonstrates that the dual-channel UNet network architecture is preferred for denoising this dual-channel video dataset compared to two mono-channel UNets. Given the increasing capabilities of STEM instruments to collect large multiple-channel data sets, the demonstrated denoising approach for efficient data analysis is likely to be increasingly important.

Discussion

In this study, we have demonstrated the successful application of a modified Noise2Void architecture for denoising atomic resolution STEM dual-channel images. Neither the training nor the application of the method requires manual intervention, with no prior sample knowledge or image simulations required. The method achieved a factor of 3 improvement in PSNR and an order of magnitude reduction in N-RMSE values, as measured by application to image simulation. The success when applied to experimental data is demonstrated by the increased SNR for visibility of both isolated metal adatoms and the graphene lattice, outperforming Gaussian blurring and the TV approach. The Noise2Void methodology is found to have a mean processing time-per-frame of just 23ms (or 45fps) making it fully compatible with real time denoising of experimental STEM images, even when performing live search and alignment procedures. Denoising is highly desirable to improve the SNR of experimental data, providing opportunities for greater information recovery, for lowering the electron fluence or for taking experimental data sets at higher frame rates. These new imaging capabilities can be used to reduce the potential for electron beam-induced changes to the system or for investigating atomic-scale dynamic behaviour with improved temporal resolution. We demonstrate the potential of the technique for atomic resolution imaging of Au adatoms on graphene surrounded by liquid, a material system with a strongly crystalline component. We expect that the technique may be applicable wherever uncorrelated image noise is a significant factor, such as environmental cell or low-dose TEM/STEM.

Methods

Sample preparation

The cells were constructed using layer-by-layer transfer of 2D crystals onto a SiN support grid as described previously^17,48. The cell was filled with a solution of HAuCl₄ salt dissolved in organic solvent to generate atomically dispersed species of Au and Au clusters on a graphene surface surrounded by liquid. The upper and lower graphene windows are few layer graphene (3-4 layers or ~1 nm thick) and the liquid thickness is controlled by the thickness of the boron nitride spacer crystal to be 30–50 nm (Fig. 1). The samples were imaged using a double corrected GrandARM JEOL ARM300CF STEM with a convergence angle of 32 mrad, an accelerating voltage of 80 kV. Simultaneous bright field and annular dark field STEM images were acquired with acceptance angles of up to 18.7 mrad and between 46.8 mrad and 169.3 mrad, respectively. Frame sizes were 512 × 512 pixels. In total 16,950 frames were acquired as video sequences with ~ 100 frames per video.

Multislice simulations

Multislice image simulations were performed using the abTEM software⁴⁹. Atomic coordinates were for a 4-layer sheet of Bernal stacked graphene, with gold atoms randomly deposited on the upper surface. Simulation parameters were matched to the JEOL GrandARM experimental data.

White noise (Gaussian) noise was added to the simulated images in Fig. 6 as additive Gaussian white noise (AGWN). PSNRs were simulated in the range of ~7–20 dB, with values of 16.4 and 16.2 for the ADF and BF channels, respectively, found to qualitatively provide the best match to the SNR of the experimental images.

Denoising

All denoising processing was performed on a Nvidia A5000 equipped workstation. Gaussian blurring was performed frame-wise and channel-independently with a standard deviation of the Gaussian kernel of σ = 1 px throughout the dataset. Kernels of σ = 2 px and σ = 0.5 px were also tested but found to give poorer performance, so are not included in the results.

Total variation denoising was performed using the Chambolle algorithm⁵⁰ as implemented by scikit-image v0.20.0. This implementation takes as a parameter $\frac{1}{\lambda }$, which was chosen as 1.0 for the ADF channel and 0.75 for the BF channel. These values were chosen as optimal after performing denoising calculations for $\frac{1}{\lambda }$ = 0.1–2.0 in increments of 0.1.

BM3D⁴⁷ denoising was performed using the Python library bm3d from the Python Package Index. Optimum sigma PSD parameters for the simulated images were found by testing values between 0.4 and 1.2 separately for each image channel. For the scaling of our simulations, values of 0.8 and 0.4 were found to be optimal for the ADF and BF image channels, respectively.

For the Noise2Void denoising, the UNet has a depth of four, with 64 feature channels at the output of the first layer. Average pooling is used, and the size of the convolutional kernel in the first layer is increased to 5 × 5 kernels with 3 × 3 kernels used elsewhere. Our modified Noise2Void model was initially trained on 8475 frames from a full data set of 16,750 frames with the training data extracted by sampling every other frame. All frames are dual-channel ADF and BF STEM images. No data augmentation was used and the Adam optimiser with a constant learning rate of 10⁻⁵ was used to train the model. Default initialisation was used for model weights. Training lasted for 32 epochs with a batch size of 12 dual-channel images. Noise2Void pixel masks were selected in a grid of pixels, with grid spacing equal to 24 px. These grid-points were then modified/jittered by randomly translating them by up to two pixels in each direction (see Fig. 2).

We also tested the performance of our modified Noise2Void model on new experimental data. We compared this to retraining of the model (transfer training), where the previously trained model was used as a starting point and this was retrained on a subset of 615 images from the new experimental data. The original and retrained model were also compared to a completely new (randomly initialised) model trained exclusively on this new data (both using 615 frames from the total data set of 1230 frames and using all the new data) (section ‘Transfer training’).

Noise metrics

Values of PSNR, SSIM and N-RMSE were calculated using scikit-image’s metrics module. SNR values were calculated by dividing the peak intensity (above the mean background) by the root-mean-square background noise.

The peak signal-to-noise ratio (PSNR), measured in decibels (dB) is calculated as

$${\rm{PSNR}}=10\times {\log }_{10}\left(\frac{{{\rm{R}}}^{2}}{{\rm{MSE}}}\right)$$

(1)

for data range R and mean-squared error MSE.

The calculation of SSIM is more involved, incorporating the luminance, contrast and structure of the images with

$${\rm{SSIM}}({\bf{x}},{\bf{y}})=\frac{(2{\mu }_{x}{\mu }_{y}+{C}_{1})(2{\sigma }_{xy}+{C}_{2})}{({\mu }_{x}^{2}+{\mu }_{y}^{2}+{C}_{1})({\sigma }_{x}^{2}+{\sigma }_{y}^{2}+{C}_{2})}$$

(2)

for images x, y where μ and σ represent the mean and standard deviations and C are small constants for numerical stability. This is fully described and justified in ref. ⁵¹.

Normalised root mean square error (N-RMSE) was calculated as

$${\text{N-RMSE}}\,=\frac{\sqrt{\frac{1}{N}\mathop{\sum }\nolimits_{n}^{N}{({I}_{n}-{\mu }_{I})}^{2}}}{{I}_{\max }-{I}_{\min }}$$

(3)

for each pixel n out of a total of N pixels, each with intensity I_n and where ${I}_{\min }$, ${I}_{\max }$ are the minimum and maximum intensities in the image and μ_I is the mean pixel intensity.

Data availability

Data supporting these findings, including experimental videos, simulated images, model weights and denoised output is made available upon reasonable request. Code for training and applying the Noise2Void models, and for generating the simulations, is made available at https://github.com/wilot/N2V-for-Au-GLC.

Code availability

Code for training and applying the Noise2Void models, and for generating the simulations, is made available at https://github.com/wilot/N2V-for-Au-GLC.

References

Erni, R., Rossell, M. D., Kisielowski, C. & Dahmen, U. Atomic-resolution imaging with a sub-50-pm electron probe. Phys. Rev. Lett. 102, 096101 (2009).
Article PubMed Google Scholar
Liu, J. J. Advances and applications of atomic-resolution scanning transmission electron microscopy. Microsc. Microanal. 27, 943–995 (2021).
Article CAS Google Scholar
Morishita, S. et al. Attainment of 40.5 pm spatial resolution using 300 kV scanning transmission electron microscope equipped with fifth-order aberration corrector. Microscopy 67, 46–50 (2017).
Article Google Scholar
Sawada, H. et al. STEM imaging of 47-pm-separated atomic columns by a spherical aberration-corrected electron microscope with a 300-kV cold field emission gun. J. Electron Microsc. 58, 357–361 (2009).
Article CAS Google Scholar
Rose, H. Information transfer in transmission electron microscopy. Ultramicroscopy 15, 173–191 (1984).
Article Google Scholar
Lee, Z., Rose, H., Lehtinen, O., Biskupek, J. & Kaiser, U. Electron dose dependence of signal-to-noise ratio, atom contrast and resolution in transmission electron microscope images. Ultramicroscopy 145, 3–12 (2014).
Article CAS PubMed Google Scholar
Tien, E.-P. et al. Electron beam and thermal stabilities of MFM-300 (M) metal–organic frameworks. J. Mater. Chem. A 12, 24165–24174 (2024).
Article CAS Google Scholar
Ghosh, S., Kumar, P., Conrad, S., Tsapatsis, M. & Mkhoyan, K. A. Electron-beam-damage in metal organic frameworks in the TEM. Microsc. Microanal. 25, 1704–1705 (2019).
Article Google Scholar
Chen, Q. et al. Imaging beam-sensitive materials by electron microscopy. Adv. Mater. 32, 1907619 (2020).
Article CAS Google Scholar
Fan, L., Zhang, F., Fan, H. & Zhang, C. Brief review of image denoising techniques. Vis. Comput. Ind. Biomed. Art. 2, 7 (2019).
Article PubMed PubMed Central Google Scholar
de Jonge, N. Theory of the spatial resolution of (scanning) transmission electron microscopy in liquid water or ice layers. Ultramicroscopy 187, 113–125 (2018).
Article PubMed Google Scholar
de Jonge, N., Houben, L., Dunin-Borkowski, R. E. & Ross, F. M. Resolution and aberration correction in liquid cell transmission electron microscopy. Nat. Rev. Mater. 4, 61–78 (2019).
Article Google Scholar
Ilett, M. et al. Analysis of complex, beam-sensitive materials by transmission electron microscopy and associated techniques. Philos. Trans. R. Soc. A: Math., Phys. Eng. Sci. 378, 20190601 (2020).
Article Google Scholar
Howe, J. M. & Saka, H. In situ transmission electron microscopy studies of the solid-liquid interface. MRS Bull. 29, 951–957 (2004).
Article CAS Google Scholar
Zhang, Q. et al. Atomic dynamics of electrified solid–liquid interfaces in liquid-cell TEM. Nature 630, 643–647 (2024).
Article CAS PubMed Google Scholar
Pu, S., Gong, C. & Robertson, A. W. Liquid cell transmission electron microscopy and its applications. R. Soc. Open Sci. 7, 191204 (2020).
Article CAS PubMed PubMed Central Google Scholar
Clark, N. et al. Tracking single adatoms in liquid in a transmission electron microscope. Nature 609, 942–947 (2022).
Article CAS PubMed Google Scholar
Wang, C., Qiao, Q., Shokuhfar, T. & Klie, R. F. High-resolution electron microscopy and spectroscopy of ferritin in biocompatible graphene liquid cells and graphene sandwiches. Adv. Mater. 26, 3410–3414 (2014).
Article CAS PubMed Google Scholar
Ross, F. M. Opportunities and challenges in liquid cell electron microscopy. Science 350, aaa9886 (2015).
Article PubMed Google Scholar
Park, J. et al. Graphene liquid cell electron microscopy: progress, applications, and perspectives. ACS Nano 15, 288–308 (2021).
Article CAS PubMed Google Scholar
Abellan, P. et al. Factors influencing quantitative liquid (scanning) transmission electron microscopy. Chem. Commun. 50, 4873–4880 (2014).
Article CAS Google Scholar
Lee, J., Nicholls, D., Browning, N. D. & Mehdi, B. L. Controlling radiolysis chemistry on the nanoscale in liquid cell scanning transmission electron microscopy. Phys. Chem. Chem. Phys. 23, 17766–17773 (2021).
Article CAS PubMed Google Scholar
Tian, C. et al. Deep learning on image denoising: an overview. Neural Netw. 131, 251–275 (2020).
Article PubMed Google Scholar
Wang, F., Henninen, T. R., Keller, D. & Erni, R. Noise2atom: unsupervised denoising for scanning transmission electron microscopy images. Appl. Microsc. 50, 23 (2020).
Article PubMed PubMed Central Google Scholar
Khan, A., Lee, C.-H., Huang, P. Y. & Clark, B. K. Leveraging generative adversarial networks to create realistic scanning transmission electron microscopy images. npj Comput. Mater. 9, 85 (2023).
Article Google Scholar
Ede, J. M. & Beanland, R. Improving electron micrograph signal-to-noise with an atrous convolutional encoder-decoder. Ultramicroscopy 202, 18–25 (2019).
Article CAS PubMed Google Scholar
Kawahara, K., Ishikawa, R., Sasano, S., Shibata, N. & Ikuhara, Y. Atomic-resolution STEM image denoising by total variation regularization. Microscopy 71, 302–310 (2022).
Article CAS PubMed Google Scholar
Mevenkamp, N. et al. Poisson noise removal from high-resolution stem images based on periodic block matching. Adv. Struct. Chem. Imaging 1, 3 (2015).
Article Google Scholar
Yankovich, A. B. et al. Non-rigid registration and non-local principle component analysis to improve electron microscopy spectrum images. Nanotechnology 27, 364001 (2016).
Article PubMed Google Scholar
Roels, J. et al. An overview of state-of-the-art image restoration in electron microscopy. J. Microsc. 271, 239–254 (2018).
Article CAS PubMed Google Scholar
Marchello, G., De Pace, C., Duro-Castano, A., Battaglia, G. & Ruiz-Pérez, L. End-to-end image analysis pipeline for liquid-phase electron microscopy. J. Microsc. 279, 242–248 (2020).
Article CAS PubMed PubMed Central Google Scholar
Reboul, C. F. et al. Single: Atomic-resolution structure identification of nanocrystals by graphene liquid cell em. Sci. Adv. 7, eabe6679 (2021).
Article CAS PubMed PubMed Central Google Scholar
Frangakis, A. S. It’s noisy out there! a review of denoising techniques in cryo-electron tomography. J. Struct. Biol. 213, 107804 (2021).
Article PubMed Google Scholar
Krull, A., Buchholz, T.-O. & Jug, F. Noise2Void - Learning Denoising From Single Noisy Images. In: Proc. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2124–2132 (2019).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Navab, N., Hornegger, J., Wells, W. M. & Frangi, A. F. (eds.) Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, 234–241 (Springer International Publishing, Cham, 2015).
Lehtinen, J. et al. Noise2Noise: Learning image restoration without clean data. In Dy, J. & Krause, A. (eds.) Proc. 35th International Conference on Machine Learning, Vol. 80 of Proceedings of Machine Learning Research, 2965–2974 (PMLR, 2018).
Batson, J. & Royer, L. Noise2Self: Blind denoising by self-supervision. In Chaudhuri, K. & Salakhutdinov, R. (eds.) Proc. 36th International Conference on Machine Learning, Vol. 97 of Proceedings of Machine Learning Research, 524–533 (PMLR, 2019).
Höck, E., Buchholz, T.-O., Brachmann, A., Jug, F. & Freytag, A. N2V2 - Fixing Noise2Void checkerboard artifacts with modified sampling strategies and a tweaked network architecture. In Karlinsky, L., Michaeli, T. & Nishino, K. (eds.) Computer Vision – ECCV 2022 Workshops, 503–518 (Springer Nature Switzerland, 2023).
Welch, D. A., Faller, R., Evans, J. E. & Browning, N. D. Simulating realistic imaging conditions for in situ liquid microscopy. Ultramicroscopy 135, 36–42 (2013).
Article CAS PubMed Google Scholar
Krivanek, O. L. et al. Atom-by-atom structural and chemical analysis by annular dark-field electron microscopy. Nature 464, 571–574 (2010).
Article CAS PubMed Google Scholar
Cowley, J. Scanning transmission electron microscopy of thin specimens. Ultramicroscopy 2, 3–16 (1976).
Article CAS PubMed Google Scholar
Odena, A., Dumoulin, V. & Olah, C. Deconvolution and checkerboard artifacts. Distill https://doi.org/10.23915/distill.00003 (2016).
Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2017).
Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K. & Yuille, A. L. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 40, 834–848 (2018).
Article PubMed Google Scholar
Wang, Z. & Bovik, A. C. Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Process. Mag. 26, 98–117 (2009).
Article Google Scholar
van der Walt, S. et al. scikit-image: image processing in Python. PeerJ 2, e453 (2014).
Article PubMed PubMed Central Google Scholar
Dabov, K., Foi, A., Katkovnik, V. & Egiazarian, K. Image denoising by sparse 3-d transform-domain collaborative filtering. IEEE Trans. Image Process. 16, 2080–2095 (2007).
Article PubMed Google Scholar
Kelly, D. J. et al. Nanometer resolution elemental mapping in graphene-based TEM liquid cells. Nano Lett. 18, 1168–1174 (2018).
Article CAS PubMed PubMed Central Google Scholar
Madsen, J. & Susi, T. abTEM: Transmission electron microscopy from first principles. Open Res. Eur. 1, 13015 (2021).
Google Scholar
Chambolle, A. An algorithm for total variation minimization and applications. J. Math. Imaging Vis. 20, 89–97 (2004).
Article Google Scholar
Wang, Z., Bovik, A., Sheikh, H. & Simoncelli, E. Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13, 600–612 (2004).
Article PubMed Google Scholar

Download references

Acknowledgements

The authors acknowledge funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant ERC-2016-STG-EvoluTEM-715502 and QTWIST (no.101001515)). We also thank the Engineering and Physical Sciences Research Council (EPSRC) for funding under grants EP/Y024303, EP/S021531/1, EP/M010619/1, EP/V007033/1, EP/S030719/1, EP/V001914/1, EP/V036343/1 and EP/P009050/1 and also for the EPSRC Centre for Doctoral Training(CDT) Graphene-NOWNANO. TEM access was supported by the Henry Royce Institute for Advanced Materials, funded through EPSRC grants EP/R00661X/1, EP/S019367/1, EP/P025021/1 and EP/P025498/1. RVG. acknowledges funding from the European Quantum Flagship Project 2DSIPC (no. 820378). We thank Diamond Light Source for access and support in use of the electron Physical Science Imaging Centre (Instrument E02 and proposal numbers MG33252 and MG35552) that contributed to the results presented here.

Author information

Authors and Affiliations

Department of Materials, University of Manchester, Manchester, UK
William Thornley, Sam Sullivan-Allsop, Rongsheng Cai, Nick Clark & Sarah J. Haigh
National Graphene Institute, University of Manchester, Manchester, UK
Sam Sullivan-Allsop, Nick Clark, Roman Gorbachev & Sarah J. Haigh
Department of Physics and Astronomy, University of Manchester, Manchester, UK
Roman Gorbachev

Authors

William Thornley
View author publications
Search author on:PubMed Google Scholar
Sam Sullivan-Allsop
View author publications
Search author on:PubMed Google Scholar
Rongsheng Cai
View author publications
Search author on:PubMed Google Scholar
Nick Clark
View author publications
Search author on:PubMed Google Scholar
Roman Gorbachev
View author publications
Search author on:PubMed Google Scholar
Sarah J. Haigh
View author publications
Search author on:PubMed Google Scholar

Contributions

W.T. implemented the Noise2Void architecture, its modifications, performed the training, simulations, comparisons and timings. S.S.-A. and N.C. fabricated the graphene liquid cell samples. R.C. and S.S.-A. performed the TEM imaging. S.J.H. and R.G. supervised this research. W.T. wrote this manuscript and S.S.-A., N.C. and S.J.H. contributed to writing and editing this manuscript.

Corresponding author

Correspondence to Sarah J. Haigh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information (download PDF )

Supplementary Video 1 (download MP4 )

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Thornley, W., Sullivan-Allsop, S., Cai, R. et al. Noise2Void for denoising atomic resolution scanning transmission electron microscopy images. npj Comput Mater 12, 68 (2026). https://doi.org/10.1038/s41524-025-01939-1

Download citation

Received: 06 August 2025
Accepted: 16 December 2025
Published: 13 January 2026
Version of record: 03 February 2026
DOI: https://doi.org/10.1038/s41524-025-01939-1