Denoising Search doubles the number of metabolite and exposome annotations in human plasma using an Orbitrap Astral mass spectrometer

Abstract

Chemical exposures may affect human metabolism and contribute to the etiology of neurodegenerative disorders such as Alzheimer’s disease. Identifying such small molecules involves matching experimental spectra to reference spectra in databases. However, environmental chemicals and physiologically active metabolites are usually present at low concentrations in human specimens. Noise ions can substantially degrade spectral quality, leading to false negatives and reduced identification rates. To address this challenge, the Spectral Denoising algorithm removes both chemical and electronic noise. Spectral Denoising outperformed alternative methods in benchmarking studies on 240 tested metabolites. It enabled high-confidence compound identifications at, on average, 35-fold lower concentrations than previously achievable. Spectral Denoising proved highly robust against varying levels of both chemical and electronic noise, even when noise ions were more than 150-fold more intense than true fragment ions. For human plasma samples from patients with Alzheimer’s disease analyzed on the Orbitrap Astral mass spectrometer, Denoising Search detected 2.5-fold more annotated compounds than on the Exploris 240 Orbitrap instrument, including drug metabolites, household and industrial chemicals, and pesticides.

Fig. 1: Flowchart for Spectral Denoising.
Fig. 2: Developing, validating and benchmarking the Spectral Denoising algorithm.
Fig. 3: Probability distributions of MS/MS entropy similarities before (‘raw’) and after applying three benchmarking methods against the Spectral Denoising algorithm, under varying levels of artificially added chemical and electronic noise.
Fig. 4: Density distributions for MS/MS similarity improvements after Spectral Denoising for all MS/MS spectra from 240 injected standards between 0.02 and 200 pmol, using the 500-pmol spectra as reference.
Fig. 5: Denoising Search results for positive ESI mode hydrophilic interaction LC–MS/MS data acquired on an Exploris 240 Orbitrap instrument and the Astral mass spectrometer, using 20 plasma samples of patients with Alzheimer’s disease.

Data availability

NIST Tandem Mass Spectral Library, 2023 release (NIST23) spectra are commercially available and can be purchased from multiple vendors. MassBank of North America (MassBank.us) spectra can be freely downloaded from https://massbank.us/. The metabolome dataset of Alzheimer’s disease samples and the experimental data from the chemical dilution series are available via Zenodo at https://zenodo.org/records/14920689 (ref. 43). Source data are provided with this paper.

Code availability

The code for Spectral Denoising and Denoising Search is available via GitHub at https://github.com/FanzhouKong/spectral_denoising and via Zenodo at https://zenodo.org/records/14920689 (ref. 43).

References

  1. Virolainen, S. J., VonHandorf, A., Viel, K., Weirauch, M. T. & Kottyan, L. C. Gene-environment interactions and their impact on human health. Genes Immun. 24, 1–11 (2023).

  2. Rappaport, S. M., Barupal, D. K., Wishart, D., Vineis, P. & Scalbert, A. The blood exposome and its role in discovering causes of disease. Environ. Health Perspect. 122, 769–774 (2014).

  3. Quinn, R. A. et al. Global chemical effects of the microbiome include new bile-acid conjugations. Nature 579, 123–129 (2020).

  4. Di Minno, A., Gelzo, M., Stornaiuolo, M., Ruoppolo, M. & Castaldo, G. The evolving landscape of untargeted metabolomics. Nutr. Metab. Cardiovasc. Dis. 31, 1645–1652 (2021).

  5. Petras, D. et al. GNPS Dashboard: collaborative exploration of mass spectrometry data in the web browser. Nat. Methods 19, 134–136 (2022).

  6. Choi, M. et al. MassIVE.quant: a community resource of quantitative mass spectrometry–based proteomics datasets. Nat. Methods 17, 981–984 (2020).

  7. Shin, H., Sampat, M. P., Bish, S. F., Koomen, J. M. & Markey, M. K. Statistical characterization of chemical noise in MALDI TOF MS by wavelet analysis of multiple noise realizations. AMIA Annu. Symp. Proc. 2006, 1092 (2006).

  8. Du, P. et al. A noise model for mass spectrometry based proteomics. Bioinformatics 24, 1070–1077 (2008).

  9. Busch, K. L. Chemical noise in mass spectrometry. Part II—effects of choices in ionization methods on chemical noise. Spectroscopy 17, 56–62 (2003).

  10. Kaufmann, A. & Walker, S. Accuracy of relative isotopic abundance and mass measurements in a single-stage orbitrap mass spectrometer. Rapid Commun. Mass Spectrom. 26, 1081–1090 (2012).

  11. Houel, S. et al. Quantifying the impact of chimera MS/MS spectra on peptide identification in large-scale proteomics studies. J. Proteome Res. 9, 4152–4160 (2010).

  12. da Silva, R. R., Dorrestein, P. C. & Quinn, R. A. Illuminating the dark matter in metabolomics. Proc. Natl Acad. Sci. USA 112, 12549–12550 (2015).

  13. Awan, M. G. & Saeed, F. MS-REDUCE: an ultrafast technique for reduction of big mass spectrometry data for high-throughput processing. Bioinformatics 32, 1518–1526 (2016).

  14. Xu, H. & Freitas, M. A. A dynamic noise level algorithm for spectral screening of peptide MS/MS spectra. BMC Bioinf. 11, 436 (2010).

  15. Li, H. et al. A novel spectral library workflow to enhance protein identifications. J. Proteomics 81, 173–184 (2013).

  16. Xing, S. et al. Recognizing contamination fragment ions in liquid chromatography-tandem mass spectrometry data. J. Am. Soc. Mass. Spectrom. 32, 2296–2305 (2021).

  17. Zhao, T., Xing, S., Yu, H. & Huan, T. De novo cleaning of chimeric MS/MS spectra for LC-MS/MS-based metabolomics. Anal. Chem. 95, 13018–13028 (2023).

  18. Stancliffe, E., Schwaiger-Haber, M., Sindelar, M. & Patti, G. J. DecoID improves identification rates in metabolomics through database-assisted MS/MS deconvolution. Nat. Methods 18, 779–787 (2021).

  19. Metabolomics Workbench. UCSD/NIH www.metabolomicsworkbench.org/ (2025).

  20. Yang, X., Neta, P. & Stein, S. E. Quality control for building libraries from electrospray ionization tandem mass spectra. Anal. Chem. 86, 6393–6400 (2014).

  21. Li, Y. et al. Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nat. Methods 18, 1524–1531 (2021).

  22. Schiffman, C. et al. Filtering procedures for untargeted LC-MS metabolomics data. BMC Bioinf. 20, 334 (2019).

  23. MZMine: how do I determine the noise level in my data? (MIT, 2025); https://mzmine.github.io/mzmine_documentation/module_docs/featdet_mass_detection/mass-detection.html

  24. Miller, P. E. & Denton, M. B. The quadrupole mass filter: basic operating concepts. J. Chem. Educ. 63, 617 (1986).

  25. Broeckling, C. D. et al. Current practices in LC-MS untargeted metabolomics: a scoping review on the use of pooled quality control samples. Anal. Chem. 95, 18645–18654 (2023).

  26. Stravs, M. A., Schymanski, E. L., Singer, H. P. & Hollender, J. Automatic recalibration and processing of tandem mass spectra using formula annotation. J. Mass Spectrom. 48, 89–99 (2013).

  27. Li, Y. & Fiehn, O. Flash entropy search to query all mass spectral libraries in real time. Nat. Methods 20, 1475–1478 (2023).

  28. Xing, S., Shen, S., Xu, B., Li, X. & Huan, T. BUDDY: molecular formula discovery via bottom-up MS/MS interrogation. Nat. Methods 20, 881–890 (2023).

  29. Wang, F. et al. CFM-ID 4.0—a web server for accurate MS-based metabolite identification. Nucleic Acids Res. 50, W165–W174 (2022).

  30. Tsugawa, H. et al. Hydrogen rearrangement rules: computational MS/MS fragmentation and structure elucidation using MS-FINDER software. Anal. Chem. 88, 7946–7958 (2016).

  31. Dührkop, K. et al. SIRIUS 4: a rapid tool for turning tandem mass spectra into metabolite structure information. Nat. Methods 16, 299–302 (2019).

  32. Kind, T. & Fiehn, O. Seven golden rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry. BMC Bioinf. 8, 105 (2007).

  33. Dührkop, K., Scheubert, K. & Böcker, S. Molecular formula identification with SIRIUS. Metabolites 3, 506–516 (2013).

  34. Xing, S. & Huan, T. Radical fragment ions in collision-induced dissociation-based tandem mass spectrometry. Anal. Chim. Acta 1200, 339613 (2022).

  35. Shaffer, C. J., Schröder, D., Alcaraz, C., Žabka, J. & Zins, E.-L. Reactions of doubly ionized benzene with nitrogen and water: a nitrogen-mediated entry into superacid chemistry. Chem. Phys. Chem. 13, 2688–2698 (2012).

  36. Chen, Y.-W. & Lin, C.-J. in Feature Extraction: Foundations and Applications (eds Guyon, I. et al.) 315–324 (Springer, 2006).

  37. Zhu, C. et al. Massive emissions of a broad range of emerging hindered phenol antioxidants and sulfur antioxidants from E-waste recycling in urban mining: new insights into an environmental source. Environ. Sci. Technol. Lett. 9, 42–49 (2022).

  38. Bonini, P., Kind, T., Tsugawa, H., Barupal, D. K. & Fiehn, O. Retip: retention time prediction for compound annotation in untargeted metabolomics. Anal. Chem. 92, 7515–7522 (2020).

  39. Nichols, C. M. et al. Untargeted molecular discovery in primary metabolism: collision cross section as a molecular descriptor in ion mobility-mass spectrometry. Anal. Chem. 90, 14484–14492 (2018).

  40. Barupal, D. K. & Fiehn, O. Generating the blood exposome database using a comprehensive text mining and database fusion approach. Environ. Health Perspect. 127, 097008 (2019).

  41. Metz, T. O. et al. Introducing ‘identification probability’ for automated and transferable assessment of metabolite identification confidence in metabolomics and related studies. Anal. Chem. 97, 1–11 (2025).

  42. Kong, F., Keshet, U., Shen, T., Rodriguez, E. & Fiehn, O. LibGen: generating high quality spectral libraries of natural products for EAD-, UVPD-, and HCD-high resolution mass spectrometers. Anal. Chem. 95, 16810–16818 (2023).

  43. Kong, F. Data for denoising search paper. Zenodo https://zenodo.org/records/14920689 (2025).

  44. Matyash, V., Liebisch, G., Kurzchalia, T. V., Shevchenko, A. & Schwudke, D. Lipid extraction by methyl-tert-butyl ether for high-throughput lipidomics. J. Lipid Res. 49, 1137–1146 (2008).

  45. Djoumbou Feunang, Y. et al. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J. Cheminform. 8, 61 (2016).

Acknowledgements

Samples were provided by the Alzheimer’s Disease Metabolomics Consortium (ADMC), which is funded wholly or in part by the following grants and supplements thereto: grant nos. NIA R01AG046171, RF1AG051550, RF1AG057452, R01AG059093, RF1AG058942, U01AG061359 and U19AG063744 and FNIH grant no. DAOU16AMPA, awarded to R. Kaddurah-Daouk at Duke University in partnership with a large number of academic institutions. Recipients of these awards were not authors of the report presented here. A complete listing of ADMC investigators can be found at https://sites.duke.edu/adnimetab/team/. T.S., Y.L., F.K. and O.F. were supported by grant nos. R01 GM155383 (to O.F.), R01 HL157535 (to O.F.) and U01 AG08862 (to O.F.).

Author information

Contributions

O.F. supervised and directed this project. F.K. developed and performed the analysis. T.S. acquired experimental data. Y.L. curated validation data. A.B. and S.S.B. provided instruments and collected experimental data. F.K. and O.F. wrote the manuscript with comments from all the other authors.

Corresponding author

Correspondence to Oliver Fiehn.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Primary Handling Editor: Arunima Singh, in collaboration with the Nature Methods team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Distribution of ion counts in NIST 23 Orbitrap MS/MS spectra.

For NIST 23 Orbitrap MS/MS spectra, the plot shows the distribution of the number of ions per spectrum that share identical intensity values within a tolerance of ±0.1%. The vertical dashed line at 4 counts marks the 99.5th percentile. We therefore used this threshold as the rule for electronic denoising.
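
The rule described in this caption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the `(m/z, intensity)` tuple layout, the way ions are grouped, and the choice to discard groups with more than 4 members (rather than 4 or more) are all assumptions made for illustration.

```python
def drop_repeated_intensities(peaks, max_repeats=4, rel_tol=0.001):
    """Remove ions whose intensity is shared by too many other ions.

    peaks       -- list of (mz, intensity) tuples with positive intensities
    max_repeats -- largest group size that is still kept (assumed: 4)
    rel_tol     -- relative intensity tolerance (0.001 corresponds to 0.1%)
    """
    # Sort by intensity so ions with near-identical intensities are adjacent.
    ordered = sorted(peaks, key=lambda p: p[1])
    kept = []
    i = 0
    while i < len(ordered):
        anchor = ordered[i][1]
        j = i
        # Grow the group while intensities stay within tolerance of the anchor.
        while j < len(ordered) and ordered[j][1] <= anchor * (1 + rel_tol):
            j += 1
        group = ordered[i:j]
        # Large groups of equal-intensity ions are treated as electronic noise.
        if len(group) <= max_repeats:
            kept.extend(group)
        i = j
    # Return the surviving ions in m/z order.
    return sorted(kept)


spectrum = [
    (85.03, 300.0), (127.04, 820.0), (169.05, 1500.0),   # plausible fragments
    (60.1, 100.0), (73.2, 100.0), (91.7, 100.0),         # equal-intensity ions
    (110.4, 100.0), (142.9, 100.0),
]
cleaned = drop_repeated_intensities(spectrum)
```

In this toy spectrum the five ions at intensity 100.0 form a group larger than the threshold and are removed, while the three ions with distinct intensities are kept.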

Extended Data Fig. 2 Entropy similarity distribution between denoised and raw NIST 23 Orbitrap MS/MS spectra.

This histogram illustrates the distribution of entropy similarities between 10,000 randomly sampled denoised NIST 23 Orbitrap MS/MS spectra and their corresponding raw spectra. The mode, mean and median spectral entropy similarity are all above 0.99. This means that spectra that are already of high quality (such as NIST23 spectra) are not significantly affected by electronic and chemical denoising.
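
For readers unfamiliar with the metric, the following is a minimal sketch of unweighted spectral entropy similarity as defined in ref. 21: one minus the entropy increase on mixing the two normalized spectra 1:1, divided by ln 4. The function names, the 0.01 m/z matching tolerance and the greedy peak matching are assumptions for illustration. The sketch assumes peaks within a spectrum are separated by more than the tolerance, and it omits any intensity reweighting that published implementations may apply.

```python
import math


def _normalize(peaks):
    # Scale intensities so they sum to 1 and sort by m/z.
    total = sum(intensity for _, intensity in peaks)
    return sorted((mz, intensity / total) for mz, intensity in peaks)


def _entropy(probabilities):
    # Shannon entropy (natural log) of a normalized intensity distribution.
    return -sum(p * math.log(p) for p in probabilities if p > 0)


def entropy_similarity(spec_a, spec_b, mz_tol=0.01):
    a, b = _normalize(spec_a), _normalize(spec_b)
    mixed = []  # intensities of the 1:1 mixture of both spectra
    i = j = 0
    while i < len(a) and j < len(b):
        if abs(a[i][0] - b[j][0]) <= mz_tol:
            # Matched peaks are merged into a single mixture peak.
            mixed.append((a[i][1] + b[j][1]) / 2)
            i += 1
            j += 1
        elif a[i][0] < b[j][0]:
            mixed.append(a[i][1] / 2)
            i += 1
        else:
            mixed.append(b[j][1] / 2)
            j += 1
    mixed.extend(p / 2 for _, p in a[i:])
    mixed.extend(p / 2 for _, p in b[j:])
    s_a = _entropy(p for _, p in a)
    s_b = _entropy(p for _, p in b)
    s_ab = _entropy(mixed)
    return 1 - (2 * s_ab - s_a - s_b) / math.log(4)


raw = [(85.03, 300.0), (127.04, 820.0), (169.05, 1500.0)]
same = entropy_similarity(raw, raw)
disjoint = entropy_similarity(raw, [(200.0, 1.0), (250.0, 3.0)])
```

Identical spectra give a similarity of 1, and spectra with no shared peaks give 0, which is why values above 0.99 indicate that denoising left the library spectra essentially unchanged.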

Extended Data Fig. 3 Count of experimental MS/MS spectra generated in dilution series for 240 chemical standards for positive and negative electrospray ionization (ESI) modes for an Orbitrap mass analyzer.

Because compounds can form several different adducts, more than 240 MS/MS spectra were generated when reference standards were injected at high amounts. At lower amounts injected on the column, fewer MS/MS spectra were recorded, reflecting the ionization efficiencies of different compounds at different concentrations (0.02–500 pmol injected on column) for both positive (blue) and negative (orange) electrospray ionization (ESI) modes.

Extended Data Fig. 4 Performance benchmarking of different denoising methods on experimental spectra.

Probability density estimates of entropy similarities comparing raw spectra and spectra denoised by various denoising methods: DNL denoising, MS-REDUCE, 1% base peak (bp) thresholding, and Spectral Denoising. Each denoising method was applied to spectra acquired at 0.02–200 pmol, using the 500 pmol spectra as references.
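
Of the benchmarked methods, 1% base peak thresholding is the simplest, and a minimal sketch is given here for orientation. The function name and the choice to keep ions exactly at the cutoff are assumptions, not details taken from the paper.

```python
def base_peak_threshold(peaks, fraction=0.01):
    """Keep ions whose intensity is at least `fraction` of the base peak.

    peaks -- list of (mz, intensity) tuples
    """
    if not peaks:
        return []
    # The base peak is the most intense ion in the spectrum.
    cutoff = fraction * max(intensity for _, intensity in peaks)
    return [(mz, intensity) for mz, intensity in peaks if intensity >= cutoff]


noisy = [(60.1, 5.0), (85.03, 300.0), (127.04, 820.0), (169.05, 1500.0)]
filtered = base_peak_threshold(noisy)
```

With a base peak of 1500, the cutoff is 15, so only the ion at intensity 5 is dropped. A fixed relative cutoff of this kind cannot remove noise ions that are more intense than the cutoff.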

Extended Data Fig. 5 MS/MS spectral similarity improvement from Spectral Denoising across different levels of noise addition and spectral entropy.

Probability density estimates of entropy similarity improvements after applying Spectral Denoising to MS/MS spectra from 240 injected standards (0.02–200 pmol), using spectra acquired at 500 pmol as references. The analysis includes nine conditions with varying levels of artificially introduced electronic and chemical noise. Spectra are further stratified into five groups based on the spectral entropy of reference spectra.

Extended Data Fig. 6 Benchmarking false discovery rates between Denoising Search and Entropy Search.

False-discovery rate (FDR) benchmarking of 1,247 positive ESI-mode Orbitrap MS/MS spectra from human plasma metabolites, manually annotated using MassBank.us, GNPS, and NIST23 libraries. The plots compare FDR values across spectral similarity scores between Denoising Search and Entropy Search using experimental spectra. (a) FDR calculated based on the top hit only. (b) FDR calculated using the top 3 hits.
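
One common way to compute an FDR curve against manually annotated spectra is sketched below. This is an assumption about the general approach, not the paper's exact procedure: the function name and the `(score, is_correct)` input layout are hypothetical, and how top-3 hits are scored is not specified here.

```python
def fdr_at_threshold(hits, threshold):
    """Fraction of accepted annotations that are wrong.

    hits      -- list of (similarity_score, is_correct) pairs, one per query,
                 where is_correct comes from the manual annotation
    threshold -- minimum similarity score for an annotation to be accepted
    """
    accepted = [is_correct for score, is_correct in hits if score >= threshold]
    if not accepted:
        return 0.0  # no accepted annotations, so no false discoveries
    return accepted.count(False) / len(accepted)


# Hypothetical top-hit results for five query spectra.
top_hits = [(0.95, True), (0.90, True), (0.80, False), (0.60, True), (0.50, False)]
curve = [(t, fdr_at_threshold(top_hits, t)) for t in (0.5, 0.75, 0.9)]
```

Evaluating the function over a range of thresholds yields a curve of FDR against similarity score. For a top-3 variant, `is_correct` could instead record whether the annotated compound appears among the three best hits, which is again an assumption.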

Supplementary information

Source data

Source Data Fig. 2

Statistical source data.

Source Data Fig. 3

Statistical source data.

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 1

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kong, F., Shen, T., Li, Y. et al. Denoising Search doubles the number of metabolite and exposome annotations in human plasma using an Orbitrap Astral mass spectrometer. Nat Methods 22, 1008–1016 (2025). https://doi.org/10.1038/s41592-025-02646-x

  • DOI: https://doi.org/10.1038/s41592-025-02646-x
