Abstract
Functional magnetic resonance imaging (fMRI) allows noninvasive measurement of neural activity with high spatial resolution. However, fMRI data are affected by noise. Here we introduce and evaluate a denoising method (DeepCor) that utilizes deep generative models to disentangle and remove noise. The method is applicable to data from single participants. DeepCor outperforms other state-of-the-art denoising approaches on a variety of simulated datasets. In real fMRI data, DeepCor enhances BOLD signal responses to face stimuli, outperforming CompCor by 215%.
Data availability
The fMRI data used to extract noise parameters for realistic data simulations are available via DataSpace at http://arks.princeton.edu/ark:/88435/dsp01dn39x4181, and the StudyForrest fMRI data are available via the StudyForrest repository at https://www.studyforrest.org/. The ABIDE I dataset was used for functional connectivity analysis (https://fcon_1000.projects.nitrc.org/), and the resulting seven networks are available via GitHub at https://github.com/ThomasYeoLab/CBIG/blob/master/stable_projects/brain_parcellation/Yeo2011_fcMRI_clustering/1000subjects_reference/Yeo_JNeurophysiol11_SplitLabels/MNI152/Centroid_coordinates/Yeo2011_7Networks_N1000.split_components.FSL_MNI152_2mm.Centroid_RAS.csv. Source data are provided with this paper.
References
Poldrack, R. A. The role of fMRI in cognitive neuroscience: where do we stand? Curr. Opin. Neurobiol. 18, 223–227 (2008).
Canario, E., Chen, D. & Biswal, B. A review of resting-state fMRI and its use to examine psychiatric disorders. Psychoradiology 1, 42–53 (2021).
Liu, T. T. Noise contributions to the fMRI signal: an overview. NeuroImage 143, 141–151 (2016).
Glover, G. H. Overview of functional magnetic resonance imaging. Neurosurg. Clin. 22, 133–139 (2011).
Abid, A. & Zou, J. Contrastive variational autoencoder enhances salient features. Preprint at https://arxiv.org/abs/1902.04601 (2019).
Aglinskas, A., Hartshorne, J. K. & Anzellotti, S. Contrastive machine learning reveals the structure of neuroanatomical variation within autism. Science 376, 1070–1074 (2022).
Behzadi, Y., Restom, K., Liau, J. & Liu, T. T. A component based noise correction method (CompCor) for BOLD and perfusion based fMRI. NeuroImage 37, 90–101 (2007).
Poskanzer, C., Fang, M., Aglinskas, A. & Anzellotti, S. Controlling for spurious nonlinear dependence in connectivity analyses. Neuroinformatics https://doi.org/10.1007/s12021-021-09540-9 (2022).
Satterthwaite, T. D. et al. Motion artifact in studies of functional connectivity: characteristics and mitigation strategies. Hum. Brain Mapp. 40, 2033–2051 (2019).
Yang, Z. et al. A robust deep neural network for denoising task-based fMRI data: an application to working memory and episodic memory. Med. Image Anal. 60, 101622 (2020).
Yang, Z. et al. Disentangling time series between brain tissues improves fMRI data quality using a time-dependent deep neural network. NeuroImage 223, 117340 (2020).
Zhao, C., Li, H., Jiao, Z., Du, T. & Fan, Y. A 3D convolutional encapsulated long short-term memory (3DConv-LSTM) model for denoising fMRI data. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020: 23rd International Conference, Proc., Part VII 23 (eds Martel, A. L. et al.) 479–488 (Springer, 2020).
Chen, Z. et al. Deep learning for image enhancement and correction in magnetic resonance imaging-state-of-the-art and challenges. J. Digit. Imag. 36, 204–230 (2023).
Kumar, M. et al. BrainIAK: the brain imaging analysis kit. Aperture Neuro. https://doi.org/10.52294/31bb5b68-2184-411b-8c00-a1dacb61e1da (2021).
Power, J. D., Plitt, M., Laumann, T. O. & Martin, A. Sources and implications of whole-brain fMRI signals in humans. NeuroImage 146, 609–625 (2017).
Kanwisher, N., McDermott, J. & Chun, M. M. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J. Neurosci. 17, 4302–4311 (1997).
Hanke, M. et al. A studyforrest extension, simultaneous fMRI and eye gaze recordings during prolonged natural stimulation. Sci. Data 3, 1–15 (2016).
Kingma, D. P. & Welling, M. Auto-encoding variational Bayes. Preprint at https://arxiv.org/abs/1312.6114 (2014).
Aglinskas, A., Bergeron, A. & Anzellotti, S. Understanding heterogeneity in psychiatric disorders: a method for identifying subtypes and parsing comorbidity. Psychiatry Clin. Neurosci. https://doi.org/10.1111/pcn.13829 (2025).
Ellis, C. T., Baldassano, C., Schapiro, A. C., Cai, M. B. & Cohen, J. D. Facilitating open-science with realistic fMRI simulation: validation and application. PeerJ 8, e8564 (2020).
Bejjanki, V. R., Da Silveira, R. A., Cohen, J. D. & Turk-Browne, N. B. Noise correlations in the human brain and their impact on pattern classification. PLoS Comput. Biol. 13, e1005674 (2017).
Fang, M., Aglinskas, A., Li, Y. & Anzellotti, S. Angular gyrus responses show joint statistical dependence with brain regions selective for different categories. J. Neurosci. 43, 2756–2766 (2023).
Zhu, Y., Aglinskas, A. & SCCN Lab. Aglinskas/DeepCor: Publication code 1.1. Zenodo https://doi.org/10.5281/zenodo.17392231 (2025).
Acknowledgements
This work was supported by an NSF CAREER grant (1943862) awarded to S.A. and by a startup grant from Boston College awarded to S.A. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript. We thank M. Gregas, F. Pari and W. Qiu for support with high-performance computing and A. Anzellotti for suggesting the inclusion of voxel coordinate information as input to the denoising model.
Author information
Contributions
S.A. conceived the project. Y.Z. and S.A. designed the model. Y.Z. developed the software, and A.A. and S.A. adapted it to task data. Y.Z. generated the simulated data. A.A. preprocessed the real data. Y.Z. performed the analysis on simulated data and functional connectivity analysis. A.A. performed the analysis on task fMRI. Y.Z. and A.A. prepared the figures. Y.Z., A.A. and S.A. wrote the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Erica Busch and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Nina Vogt, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Comparison of denoising performance: CompCor vs DeepCor.
Box plots showing mean R-squared values (n=10) for DeepCor and CompCor denoising methods under six simulation conditions, categorized by Linear and Nonlinear noise types with standard deviations (SD) of 0.5, 1.0, and 2.0. Each box plot represents 10 independent simulation runs per condition and method. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers. Significance bars indicate statistical differences between methods, with p = 9.13 × 10⁻⁵ < 0.001 (***) for all conditions except Nonlinear (SD=2.0), where p = 0.07 (*).
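The box plot convention described in this caption (median, Q1, Q3, whiskers at the most extreme values within 1.5 times the IQR, outliers beyond) can be illustrated with a minimal sketch. The function name `box_plot_stats` is hypothetical, and the quartile rule used here (linear interpolation, via `statistics.quantiles` with `method="inclusive"`) is an assumption; the caption does not say which quartile definition the figures use.

```python
import statistics


def box_plot_stats(values):
    """Summarize values the way a standard box plot draws them.

    Assumption: quartiles use inclusive linear interpolation; other
    quartile definitions give slightly different boxes.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Toy numbers only, not values from the paper.
    print(box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

With the toy input above, the single extreme value falls outside the upper fence and is reported as an outlier, while the upper whisker stops at the largest remaining value.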
Extended Data Fig. 2 Robustness of DeepCor to random seed variability.
(a) Box plots showing the R-squared values (n=1500) of individual voxels for the simple simulation scenario, where the signal and noise are combined linearly with standard deviation (SD) = 1. (b) Box plots illustrating the R-squared distribution (n=2323) under the realistic simulation setting. Each box plot represents a different random seed. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
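A per-voxel R-squared of the kind summarized in these box plots could be computed as in the sketch below. It assumes R-squared means the coefficient of determination between each denoised voxel time course and its simulated ground truth; the caption does not state the exact definition (a squared Pearson correlation is another common choice), and the function names are hypothetical.

```python
def r_squared(truth, estimate):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - e) ** 2 for t, e in zip(truth, estimate))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def voxelwise_r_squared(truth_voxels, denoised_voxels):
    """One R-squared value per voxel; each voxel is a list of time points."""
    return [r_squared(t, d) for t, d in zip(truth_voxels, denoised_voxels)]


if __name__ == "__main__":
    # Toy time courses only, not data from the paper.
    truth = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
    denoised = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
    print(voxelwise_r_squared(truth, denoised))
```

Under this definition a perfect reconstruction scores 1, a constant prediction at the ground-truth mean scores 0, and worse reconstructions can score below 0.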
Extended Data Fig. 3 Robustness of DeepCor and CompCor to the number of time points in input data.
(a) Box plots showing the mean R-squared values (n=10) for DeepCor and CompCor in relation to the number of time points in the simple simulation scenario, where the signal and noise are combined linearly with a standard deviation of 1. Each box plot represents results across 10 independent seeds, with significance bars indicating statistical differences between methods (p = 9.13 × 10⁻⁵ < 0.001, *** for all time points). (b) Box plots illustrating the mean R-squared values (n=10) for CompCor and DeepCor across different numbers of time points under the BrainIAK realistic simulation setting. Significance bars denote statistical differences between methods, with p = 9.13 × 10⁻⁵ < 0.001 (***) for all conditions. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 4 Robustness of DeepCor and CompCor to the number of training voxels in input data.
(a) Box plots showing the mean R-squared values (n=10) for CompCor and DeepCor in relation to the number of training voxels in the simple simulation scenario, where the signal and noise are combined linearly with a standard deviation of 1. Each box plot represents results across 10 independent seeds, with significance bars indicating statistical differences between methods (p = 9.13 × 10⁻⁵ < 0.001, *** for all conditions). (b) Box plots illustrating the mean R-squared values (n=10) for CompCor and DeepCor across different numbers of training voxels under the BrainIAK realistic simulation setting. Significance bars denote statistical differences between methods, with p = 9.13 × 10⁻⁵ < 0.001 (***) for all conditions. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 5 Impact of latent dimension on DeepCor’s performance and final validation loss.
(a-b) Box plots illustrating the relationship between the latent dimension size and mean R-squared values (n=10) for DeepCor in both (a) the simple simulation scenario and (b) the realistic BrainIAK simulation scenario. The latent dimension indicates the output size of each encoder; the total latent dimension is twice this size, as it results from concatenating the outputs of two encoders. Each box plot represents results across 10 independent seeds. (c-d) Box plots depicting the validation loss (n=10) at the final training epoch as a function of latent dimension size in (c) the simple simulation scenario and (d) the realistic BrainIAK simulation scenario. Each box plot represents results across 10 independent seeds. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 6 Impact of number of principal components (PCs) on CompCor’s performance.
(a) Box plots of the mean R-squared score across 10 runs with different random seeds (n=10), comparing the denoised testing voxels with their ground truth. The data come from the simple simulation dataset, a linear combination of signal and noise with the standard deviation of the noise proportion set to 1, and were evaluated using different numbers of principal components (PCs) in the CompCor method. The horizontal line represents the mean R-squared value achieved by DeepCor on the same simulated dataset. (b) Box plots (n=10), similar to (a), but evaluated on the BrainIAK simulation dataset. Simulation parameters were the same as the ones reported in the main text. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 7 Impact of latent dimension on DeNN’s performance.
Box plots showing DeNN’s mean R-squared scores across 10 runs with different random seeds (n=10), evaluating the denoised testing voxels against their ground truth from the realistic BrainIAK simulation dataset. The latent dimension is set to the default value of 16 (labeled as “Default” on the left) and doubled to 32 (labeled as “Double” on the right). Simulation and other model parameters were the same as the ones reported in the main text. The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 8 Robustness of CompCor and DeepCor to the number of ROI voxels mixed into the RONI.
In real fMRI data, RONI voxels can contain some proportion of signal of neural origin. We tested DeepCor’s robustness to this by including different proportions of ROI voxels in the RONI. Box plots show the mean R-squared values (n=10) for DeepCor and CompCor in relation to the percentage of ROI voxels included in the RONI. Each box plot represents results across 10 independent seeds, with significance bars indicating statistical differences between methods (p = 9.13 × 10⁻⁵ < 0.001, *** for all conditions). The central line within each box represents the median, while the upper and lower bounds correspond to the first (Q1) and third quartiles (Q3), capturing the interquartile range (IQR). Whiskers extend to the smallest and largest values within 1.5 times the IQR, while individual points beyond this range are plotted as outliers.
Extended Data Fig. 9 Functional connectivity analysis.
(a) Overview of the functional connectivity evaluation pipeline, incorporating a Yeo et al.-defined 51-region brain mask. The pipeline processes whole-brain fMRI data, applies denoising methods, extracts regional mean signals, and computes interregional Pearson correlations to generate functional connectivity matrices. (b) Functional connectivity matrices from a single participant, derived from raw data, as well as CompCor-, DeNN-, and DeepCor-denoised fMRI signals. (c) Bar plot comparing within-network and between-network correlation strengths (n=200) across the four methods (raw data, CompCor, DeNN, and DeepCor). Error bars represent standard errors. Statistical significance was assessed using a one-sided Wilcoxon signed-rank test, comparing each denoising method (CompCor, DeNN, and DeepCor) with Raw data for Within-Network and Between-Network correlations, and comparing DeepCor with Raw, CompCor, and DeNN for the Difference (Within - Between) measure, with significance levels denoted as p < 0.001 (***). Exact p-values: Within-Network (vs Raw) - CompCor p = 2.8 × 10⁻³³, DeNN p = 7.2 × 10⁻³⁵, DeepCor p = 9.0 × 10⁻³²; Between-Network (vs Raw) - CompCor p = 4.5 × 10⁻³⁴, DeNN p = 7.2 × 10⁻³⁵, DeepCor p = 7.2 × 10⁻³⁵; Difference (Within-Between; DeepCor vs others) - vs Raw p = 1.0 × 10⁻³⁴, vs CompCor p = 2.9 × 10⁻³⁴, vs DeNN p = 9.6 × 10⁻³⁴. (d) Mean correlation ± SEM between the FEF and PCC, two regions exhibiting anticorrelated activity, across denoising methods. Background shading indicates the correlation values, scaled consistently with the connectivity matrices. FEF: Frontal Eye Field, PCC: Posterior Cingulate Cortex.
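The core steps named in panel (a), extracting regional mean signals and computing interregional Pearson correlations, together with the within-network versus between-network summary in panel (c), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the dictionary-based inputs, and the toy region and network labels are all hypothetical, and denoising is assumed to have been applied already.

```python
import math
from itertools import combinations


def mean_timecourse(voxels):
    """Average a region's voxel time courses into one regional signal."""
    return [sum(column) / len(column) for column in zip(*voxels)]


def pearson(x, y):
    """Pearson correlation between two equal-length time courses."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def within_between(region_voxels, region_network):
    """Mean correlation for region pairs in the same vs different networks."""
    signals = {name: mean_timecourse(v) for name, v in region_voxels.items()}
    within, between = [], []
    for a, b in combinations(signals, 2):
        r = pearson(signals[a], signals[b])
        if region_network[a] == region_network[b]:
            within.append(r)
        else:
            between.append(r)
    return sum(within) / len(within), sum(between) / len(between)


if __name__ == "__main__":
    # Toy data: regions A and B share a network, C belongs to another.
    voxels = {
        "A": [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]],
        "B": [[2.0, 4.0, 6.0, 8.0]],
        "C": [[4.0, 3.0, 2.0, 1.0]],
    }
    networks = {"A": "net1", "B": "net1", "C": "net2"}
    w, b = within_between(voxels, networks)
    print(f"within={w:.2f} between={b:.2f} difference={w - b:.2f}")
```

A larger within-minus-between difference indicates that correlations are concentrated inside networks, which is the quantity the caption's Difference measure compares across methods.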
Extended Data Fig. 10 Algorithm for computing the padding size.
The algorithm takes the number of time points as input and outputs the padding arguments for each layer of the encoder and decoder.
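The algorithm itself is shown only in the figure, so the following is a generic sketch of one way such a computation can work, not a reproduction of it. It assumes a stack of strided 1D convolutions mirrored by transposed convolutions, uses the standard length formulas for those layers (dilation 1), and picks an output padding per decoder layer so that the reconstruction has the same number of time points as the input. The layer count, kernel size, stride and padding defaults are hypothetical.

```python
def conv_out_len(length, kernel, stride, padding):
    """Output length of a 1D convolution (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def deconv_out_len(length, kernel, stride, padding, output_padding):
    """Output length of a 1D transposed convolution (dilation 1)."""
    return (length - 1) * stride - 2 * padding + kernel + output_padding


def padding_plan(n_timepoints, n_layers=3, kernel=3, stride=2, padding=1):
    """Return layer lengths plus encoder and decoder padding arguments.

    Hypothetical architecture: n_layers identical strided convolutions,
    mirrored by transposed convolutions in the decoder.
    """
    lengths = [n_timepoints]
    for _ in range(n_layers):
        lengths.append(conv_out_len(lengths[-1], kernel, stride, padding))
    encoder = [{"padding": padding} for _ in range(n_layers)]
    decoder = []
    # Walk back up the encoder, restoring each intermediate length exactly.
    for i in reversed(range(n_layers)):
        base = deconv_out_len(lengths[i + 1], kernel, stride, padding, 0)
        decoder.append(
            {"padding": padding, "output_padding": lengths[i] - base}
        )
    return lengths, encoder, decoder


if __name__ == "__main__":
    # 156 time points is an arbitrary example value.
    lengths, encoder, decoder = padding_plan(156)
    print(lengths)
    print(decoder)
```

Because the remainder discarded by the floor division in each convolution is always smaller than the stride, the output padding computed this way stays between 0 and stride minus 1, which is the range transposed-convolution layers typically accept.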
Source data
Source Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 1
Statistical source data.
Source Data Extended Data Fig. 2
Statistical source data.
Source Data Extended Data Fig. 3
Statistical source data.
Source Data Extended Data Fig. 4
Statistical source data.
Source Data Extended Data Fig. 5
Statistical source data.
Source Data Extended Data Fig. 6
Statistical source data.
Source Data Extended Data Fig. 7
Statistical source data.
Source Data Extended Data Fig. 8
Statistical source data.
Source Data Extended Data Fig. 9
Statistical source data.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhu, Y., Aglinskas, A. & Anzellotti, S. DeepCor: denoising fMRI data with contrastive autoencoders. Nat Methods 23, 334–337 (2026). https://doi.org/10.1038/s41592-025-02967-x
DOI: https://doi.org/10.1038/s41592-025-02967-x