Abstract
Chemical exposures may affect human metabolism and contribute to the etiology of neurodegenerative disorders such as Alzheimer’s disease. Identifying these small molecules involves matching experimental spectra to reference spectra in databases. However, environmental chemicals and physiologically active metabolites are usually present at low concentrations in human specimens. The presence of noise ions can substantially degrade spectral quality, leading to false negatives and reduced identification rates. In response to this challenge, the Spectral Denoising algorithm removes both chemical and electronic noise. Spectral Denoising outperformed alternative methods in benchmarking studies on 240 tested metabolites. It enabled high-confidence compound identifications at concentrations that were, on average, 35-fold lower than previously achievable. Spectral Denoising proved highly robust against varying levels of both chemical and electronic noise, even when noise ions were more than 150-fold more intense than true fragment ions. For human plasma samples from patients with Alzheimer’s disease that were analyzed on the Orbitrap Astral mass spectrometer, Denoising Search detected 2.5-fold more annotated compounds than on the Exploris 240 Orbitrap instrument, including drug metabolites, household and industrial chemicals, and pesticides.
Data availability
NIST Tandem Mass Spectral Library, 2023 release (NIST23) spectra are commercially available and can be purchased from multiple vendors. Spectra from the MassBank of North America database (MassBank.us) can be freely downloaded from https://massbank.us/. The metabolome dataset of Alzheimer’s disease samples and the experimental data from the chemical dilution series are available via Zenodo at https://zenodo.org/records/14920689 (ref. 43). Source data are provided with this paper.
Code availability
The code for using spectral denoising and denoising search is available via GitHub at https://github.com/FanzhouKong/spectral_denoising and via Zenodo at https://zenodo.org/records/14920689 (ref. 43).
References
Virolainen, S. J., VonHandorf, A., Viel, K., Weirauch, M. T. & Kottyan, L. C. Gene-environment interactions and their impact on human health. Genes Immun. 24, 1–11 (2023).
Rappaport, S. M., Barupal, D. K., Wishart, D., Vineis, P. & Scalbert, A. The blood exposome and its role in discovering causes of disease. Environ. Health Perspect. 122, 769–774 (2014).
Quinn, R. A. et al. Global chemical effects of the microbiome include new bile-acid conjugations. Nature 579, 123–129 (2020).
Di Minno, A., Gelzo, M., Stornaiuolo, M., Ruoppolo, M. & Castaldo, G. The evolving landscape of untargeted metabolomics. Nutr. Metab. Cardiovasc. Dis. 31, 1645–1652 (2021).
Petras, D. et al. GNPS Dashboard: collaborative exploration of mass spectrometry data in the web browser. Nat. Methods 19, 134–136 (2022).
Choi, M. et al. MassIVE.quant: a community resource of quantitative mass spectrometry–based proteomics datasets. Nat. Methods 17, 981–984 (2020).
Shin, H., Sampat, M. P., Bish, S. F., Koomen, J. M. & Markey, M. K. Statistical characterization of chemical noise in MALDI TOF MS by wavelet analysis of multiple noise realizations. AMIA Annu. Symp. Proc. 2006, 1092 (2006).
Du, P. et al. A noise model for mass spectrometry based proteomics. Bioinformatics 24, 1070–1077 (2008).
Busch, K. L. Chemical noise in mass spectrometry. Part II—effects of choices in ionization methods on chemical noise. Spectroscopy 17, 56–62 (2003).
Kaufmann, A. & Walker, S. Accuracy of relative isotopic abundance and mass measurements in a single-stage orbitrap mass spectrometer. Rapid Commun. Mass Spectrom. 26, 1081–1090 (2012).
Houel, S. et al. Quantifying the impact of chimera MS/MS spectra on peptide identification in large-scale proteomics studies. J. Proteome Res. 9, 4152–4160 (2010).
da Silva, R. R., Dorrestein, P. C. & Quinn, R. A. Illuminating the dark matter in metabolomics. Proc. Natl Acad. Sci. USA 112, 12549–12550 (2015).
Awan, M. G. & Saeed, F. MS-REDUCE: an ultrafast technique for reduction of big mass spectrometry data for high-throughput processing. Bioinformatics 32, 1518–1526 (2016).
Xu, H. & Freitas, M. A. A dynamic noise level algorithm for spectral screening of peptide MS/MS spectra. BMC Bioinf. 11, 436 (2010).
Li, H. et al. A novel spectral library workflow to enhance protein identifications. J. Proteomics 81, 173–184 (2013).
Xing, S. et al. Recognizing contamination fragment ions in liquid chromatography-tandem mass spectrometry data. J. Am. Soc. Mass. Spectrom. 32, 2296–2305 (2021).
Zhao, T., Xing, S., Yu, H. & Huan, T. De novo cleaning of chimeric MS/MS spectra for LC-MS/MS-based metabolomics. Anal. Chem. 95, 13018–13028 (2023).
Stancliffe, E., Schwaiger-Haber, M., Sindelar, M. & Patti, G. J. DecoID improves identification rates in metabolomics through database-assisted MS/MS deconvolution. Nat. Methods 18, 779–787 (2021).
Metabolomics Workbench. UCSD/NIH www.metabolomicsworkbench.org/ (2025).
Yang, X., Neta, P. & Stein, S. E. Quality control for building libraries from electrospray ionization tandem mass spectra. Anal. Chem. 86, 6393–6400 (2014).
Li, Y. et al. Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nat. Methods 18, 1524–1531 (2021).
Schiffman, C. et al. Filtering procedures for untargeted LC-MS metabolomics data. BMC Bioinf. 20, 334 (2019).
MZMine: how do I determine the noise level in my data? (MIT, 2025); https://mzmine.github.io/mzmine_documentation/module_docs/featdet_mass_detection/mass-detection.html
Miller, P. E. & Denton, M. B. The quadrupole mass filter: basic operating concepts. J. Chem. Educ. 63, 617 (1986).
Broeckling, C. D. et al. Current practices in LC-MS untargeted metabolomics: a scoping review on the use of pooled quality control samples. Anal. Chem. 95, 18645–18654 (2023).
Stravs, M. A., Schymanski, E. L., Singer, H. P. & Hollender, J. Automatic recalibration and processing of tandem mass spectra using formula annotation. J. Mass Spectrom. 48, 89–99 (2013).
Li, Y. & Fiehn, O. Flash entropy search to query all mass spectral libraries in real time. Nat. Methods 20, 1475–1478 (2023).
Xing, S., Shen, S., Xu, B., Li, X. & Huan, T. BUDDY: molecular formula discovery via bottom-up MS/MS interrogation. Nat. Methods 20, 881–890 (2023).
Wang, F. et al. CFM-ID 4.0—a web server for accurate MS-based metabolite identification. Nucleic Acids Res. 50, W165–W174 (2022).
Tsugawa, H. et al. Hydrogen rearrangement rules: computational MS/MS fragmentation and structure elucidation using MS-FINDER software. Anal. Chem. 88, 7946–7958 (2016).
Dührkop, K. et al. SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nat. Methods 16, 299–302 (2019).
Kind, T. & Fiehn, O. Seven golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinf. 8, 105 (2007).
Dührkop, K., Scheubert, K. & Böcker, S. Molecular formula identification with SIRIUS. Metabolites 3, 506–516 (2013).
Xing, S. & Huan, T. Radical fragment ions in collision-induced dissociation-based tandem mass spectrometry. Anal. Chim. Acta 1200, 339613 (2022).
Shaffer, C. J., Schröder, D., Alcaraz, C., Žabka, J. & Zins, E.-L. Reactions of doubly ionized benzene with nitrogen and water: a nitrogen-mediated entry into superacid chemistry. Chem. Phys. Chem. 13, 2688–2698 (2012).
Chen, Y.-W. & Lin, C.-J. in Feature Extraction: Foundations and Applications (eds Guyon, I. et al.) 315–324 (Springer, 2006).
Zhu, C. et al. Massive emissions of a broad range of emerging hindered phenol antioxidants and sulfur antioxidants from E-waste recycling in urban mining: new insights into an environmental source. Environ. Sci. Technol. Lett. 9, 42–49 (2022).
Bonini, P., Kind, T., Tsugawa, H., Barupal, D. K. & Fiehn, O. Retip: retention time prediction for compound annotation in untargeted metabolomics. Anal. Chem. 92, 7515–7522 (2020).
Nichols, C. M. et al. Untargeted molecular discovery in primary metabolism: collision cross section as a molecular descriptor in ion mobility-mass spectrometry. Anal. Chem. 90, 14484–14492 (2018).
Barupal, D. K. & Fiehn, O. Generating the blood exposome database using a comprehensive text mining and database fusion approach. Environ. Health Perspect. 127, 097008 (2019).
Metz, T. O. et al. Introducing ‘identification probability’ for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies. Anal. Chem. 97, 1–11 (2025).
Kong, F., Keshet, U., Shen, T., Rodriguez, E. & Fiehn, O. LibGen: generating high quality spectral libraries of natural products for EAD-, UVPD-, and HCD-high resolution mass spectrometers. Anal. Chem. 95, 16810–16818 (2023).
Kong, F. Data for denoising search paper. Zenodo https://zenodo.org/records/14920689 (2025).
Matyash, V., Liebisch, G., Kurzchalia, T. V., Shevchenko, A. & Schwudke, D. Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics. J. Lipid Res. 49, 1137–1146 (2008).
Djoumbou Feunang, Y. et al. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J. Cheminform. 8, 61 (2016).
Acknowledgements
Samples were provided by the Alzheimer’s Disease Metabolomics Consortium (ADMC) funded wholly or in part by the following grants and supplements thereto. However, recipients of these awards were not authors of the report presented here: grant nos. NIA R01AG046171, RF1AG051550, RF1AG057452, R01AG059093, RF1AG058942, U01AG061359 and U19AG063744 and FNIH grant no. DAOU16AMPA were awarded to R. Kaddurah-Daouk at Duke University in partnership with a large number of academic institutions. A complete listing of ADMC investigators can be found at https://sites.duke.edu/adnimetab/team/. T.S., Y.L., F.K. and O.F. were supported by grant nos. R01 GM155383 (to O.F.), R01 HL157535 (to O.F.) and U01 AG08862 (to O.F.).
Author information
Contributions
O.F. supervised and directed this project. F.K. developed and performed the analysis. T.S. acquired experimental data. Y.L. curated validation data. A.B. and S.S.B. provided instruments and collected experimental data. F.K. and O.F. wrote the manuscript with comments from all the other authors.
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team.
Extended data
Extended Data Fig. 1 Distribution of ion counts in NIST 23 Orbitrap MS/MS spectra.
For NIST 23 Orbitrap MS/MS spectra, the plot shows the distribution of the number of ions per spectrum that share identical intensity values within a tolerance of ±0.1%. The vertical dashed line at 4 counts marks the 99.5th percentile. We therefore used this threshold as the rule for electronic denoising.
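The rule described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the GitHub repository listed under Code availability). The function name `electronic_denoise`, the parameter names, and the choice to remove a group once it reaches 4 ions (rather than only when it exceeds 4) are assumptions made for illustration.

```python
import numpy as np

def electronic_denoise(peaks, rel_tol=0.001, max_count=4):
    """Drop groups of ions that share (nearly) identical intensity.

    peaks: sequence of (m/z, intensity) pairs.
    rel_tol: relative intensity tolerance (0.001 = 0.1%, from the caption).
    max_count: group size at which ions are treated as electronic noise
               (assumption: groups of max_count or more are removed).
    """
    peaks = np.asarray(peaks, dtype=float)
    if len(peaks) == 0:
        return peaks
    order = np.argsort(peaks[:, 1])        # indices sorted by intensity
    sorted_int = peaks[order, 1]
    keep = np.ones(len(peaks), dtype=bool)
    start = 0                              # first index of the current group
    for i in range(1, len(peaks) + 1):
        # Close the group when the next intensity leaves the tolerance window
        # anchored at the group's lowest intensity, or at the end of the list.
        if i == len(peaks) or sorted_int[i] > sorted_int[start] * (1 + rel_tol):
            if i - start >= max_count:
                keep[order[start:i]] = False
            start = i
    return peaks[keep]                     # original m/z order is preserved

spectrum = [[50.0, 100.0], [60.0, 100.0], [70.0, 500.0],
            [80.0, 100.05], [90.0, 100.0], [120.0, 1000.0]]
cleaned = electronic_denoise(spectrum)
```

In this toy spectrum four ions sit at an intensity of about 100, so they are removed together, while the two ions with unique intensities are kept.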
Extended Data Fig. 2 Entropy similarity distribution between denoised and raw NIST 23 Orbitrap MS/MS spectra.
This histogram illustrates the distribution of entropy similarities between 10,000 randomly sampled denoised NIST 23 Orbitrap MS/MS spectra and their corresponding raw spectra. The mode, mean and median spectral entropy similarity are all above 0.99. This means that spectra that are already of high quality (such as NIST23 spectra) are not significantly affected by electronic and chemical denoising.
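For readers unfamiliar with the metric, the following is a minimal sketch of an unweighted entropy similarity in the sense of Li et al. (Nat. Methods 18, 1524–1531), computed as 1 − (2·S_AB − S_A − S_B)/ln 4, where S denotes the Shannon entropy of a sum-normalized spectrum and AB is the merged spectrum. The greedy peak matching, the `mz_tol` default and the function names are assumptions for illustration; the published method additionally re-weights intensities of low-entropy spectra, which is omitted here.

```python
import numpy as np

def spectral_entropy(intensities):
    """Shannon entropy (natural log) of a sum-normalized intensity vector."""
    p = np.asarray(intensities, dtype=float)
    p = p[p > 0]
    p = p / p.sum()
    return float(-(p * np.log(p)).sum())

def entropy_similarity(spec_a, spec_b, mz_tol=0.01):
    """Unweighted entropy similarity of two non-empty (m/z, intensity) spectra."""
    a = np.asarray(spec_a, dtype=float)
    b = np.asarray(spec_b, dtype=float)
    ia = a[:, 1] / a[:, 1].sum()           # normalize each spectrum to sum 1
    ib = b[:, 1] / b[:, 1].sum()
    merged = []
    used_b = np.zeros(len(b), dtype=bool)
    for mz, inten in zip(a[:, 0], ia):
        diff = np.abs(b[:, 0] - mz)
        diff[used_b] = np.inf              # each peak in b is matched at most once
        j = int(np.argmin(diff))
        if diff[j] <= mz_tol:
            merged.append(inten + ib[j])   # matched peaks are summed
            used_b[j] = True
        else:
            merged.append(inten)
    merged.extend(ib[~used_b])             # unmatched peaks from b
    s_ab = spectral_entropy(np.array(merged) / 2.0)
    s_a, s_b = spectral_entropy(ia), spectral_entropy(ib)
    return 1.0 - (2.0 * s_ab - s_a - s_b) / np.log(4.0)

raw = [[100.0, 10.0], [150.0, 30.0], [200.0, 60.0]]
same = entropy_similarity(raw, raw)                                  # identical spectra
disjoint = entropy_similarity(raw, [[300.0, 1.0], [400.0, 1.0]])     # no shared peaks
```

Identical spectra give a similarity of 1 and spectra with no shared peaks give 0 (up to floating-point error), which is why values above 0.99 indicate that denoising left the spectrum essentially unchanged.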
Extended Data Fig. 3 Count of experimental MS/MS spectra generated in dilution series for 240 chemical standards for positive and negative electrospray ionization (ESI) modes for an Orbitrap mass analyzer.
Because compounds can form several different adducts, injecting reference standards at high amounts generated more than 240 MS/MS spectra. However, at lower amounts injected on the column, a decreasing number of MS/MS spectra were recorded, reflecting the ionization efficiencies of different compounds at different concentrations (0.02–500 pmol injected on column) for both positive (blue) and negative (orange) electrospray ionization (ESI) modes.
Extended Data Fig. 4 Performance benchmarking of different denoising methods on experimental spectra.
Probability density estimates of entropy similarities comparing raw spectra and spectra denoised by various denoising methods: DNL denoising, MS Reduce, 1% base peak (bp) thresholding, and Spectral Denoising. Each denoising method was applied to spectra acquired at 0.02–200 pmol, using the 500 pmol spectra as reference.
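Of the baseline methods named in this caption, 1% base peak thresholding is the simplest and can be sketched in a few lines. The function name and the choice to keep ions exactly at the cutoff are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def base_peak_threshold(peaks, fraction=0.01):
    """Keep ions whose intensity is at least `fraction` of the base peak.

    peaks: sequence of (m/z, intensity) pairs.
    fraction: 0.01 corresponds to the 1% base peak rule.
    """
    peaks = np.asarray(peaks, dtype=float)
    if len(peaks) == 0:
        return peaks
    cutoff = fraction * peaks[:, 1].max()   # base peak = most intense ion
    return peaks[peaks[:, 1] >= cutoff]

spectrum = [[80.0, 5.0], [120.0, 10.0], [150.0, 200.0], [210.0, 1000.0]]
filtered = base_peak_threshold(spectrum)    # cutoff is 10.0 for this spectrum
```

A fixed relative cutoff of this kind cannot distinguish a weak true fragment from a noise ion of similar intensity, which is the limitation the benchmarking in this figure examines.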
Extended Data Fig. 5 MS/MS spectral similarity improvement from Spectral Denoising across different levels of noise addition and spectral entropy.
Probability density estimates of entropy similarity improvements after applying Spectral Denoising to MS/MS spectra from 240 injected standards (0.02–200 pmol), using spectra acquired at 500 pmol as references. The analysis includes nine conditions with varying levels of artificially introduced electronic and chemical noise. Spectra are further stratified into five groups based on the spectral entropy of reference spectra.
Extended Data Fig. 6 Benchmarking false discovery rates between Denoising Search and Entropy Search.
False-discovery rate (FDR) benchmarking of 1,247 positive ESI-mode Orbitrap MS/MS spectra from human plasma metabolites, manually annotated using MassBank.us, GNPS, and NIST23 libraries. The plots compare FDR values across spectral similarity scores between Denoising Search and Entropy Search using experimental spectra. (a) FDR calculated based on the top hit only. (b) FDR calculated using the top 3 hits.
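One plausible way to compute such an FDR curve from manually annotated spectra is sketched below. The caption does not give the exact definition used by the authors, so the data layout, the function name `fdr_at_threshold` and the rule that a query counts as correct when its true compound appears among the top-k hits at or above the score threshold are all assumptions.

```python
def fdr_at_threshold(queries, threshold, top_k=1):
    """Fraction of reported queries whose top-k hits miss the true compound.

    queries: list of (true_id, hits), where hits is a list of
             (candidate_id, score) tuples sorted by descending score.
    threshold: minimum similarity score for a hit to be reported.
    top_k: number of ranked hits considered per query (1 or 3 in the caption).
    """
    reported = 0
    false_hits = 0
    for true_id, hits in queries:
        top = [(cid, score) for cid, score in hits[:top_k] if score >= threshold]
        if not top:
            continue                        # nothing reported for this query
        reported += 1
        if all(cid != true_id for cid, _ in top):
            false_hits += 1
    return false_hits / reported if reported else 0.0

# Hypothetical search results for three annotated spectra.
results = [
    ("A", [("A", 0.90), ("B", 0.80)]),
    ("C", [("D", 0.85), ("C", 0.80)]),
    ("E", [("F", 0.50)]),
]
fdr_top1 = fdr_at_threshold(results, threshold=0.75, top_k=1)
fdr_top3 = fdr_at_threshold(results, threshold=0.75, top_k=3)
```

Evaluating the function over a grid of thresholds yields a curve of FDR against similarity score for each search method, analogous to panels (a) and (b).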
About this article
Cite this article
Kong, F., Shen, T., Li, Y. et al. Denoising Search doubles the number of metabolite and exposome annotations in human plasma using an Orbitrap Astral mass spectrometer. Nat Methods 22, 1008–1016 (2025). https://doi.org/10.1038/s41592-025-02646-x
DOI: https://doi.org/10.1038/s41592-025-02646-x