Abstract
Human untargeted metabolomics studies annotate only ~10% of molecular features. We introduce reference-data-driven analysis to match metabolomics tandem mass spectrometry (MS/MS) data against metadata-annotated source data as a pseudo-MS/MS reference library. Applying this approach to food source data, we show that it increases MS/MS spectral usage 5.1-fold over conventional structural MS/MS library matches and allows empirical assessment of dietary patterns from untargeted data.
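The matching idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline (the study runs molecular networking jobs on GNPS); it only shows the principle of assigning source metadata to a query spectrum when it matches a metadata-annotated reference spectrum. The binned cosine score, the 0.02 bin width, the 0.7 threshold, the toy peak lists and the names `binned`, `cosine` and `annotate` are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def binned(spectrum, bin_width=0.02):
    """Collapse a peak list [(mz, intensity), ...] into {bin_index: intensity}."""
    vec = {}
    for mz, intensity in spectrum:
        key = round(mz / bin_width)
        vec[key] = vec.get(key, 0.0) + intensity
    return vec


def cosine(spec_a, spec_b):
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    va, vb = binned(spec_a), binned(spec_b)
    dot = sum(v * vb.get(k, 0.0) for k, v in va.items())
    norm_a = sqrt(sum(v * v for v in va.values()))
    norm_b = sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def annotate(query_spectra, reference, threshold=0.7):
    """Count query spectra that match any reference spectrum, per source label.

    reference is a list of (spectrum, source_label) pairs; the label stands in
    for the metadata attached to each reference sample (hypothetical values).
    """
    matched = 0
    counts = Counter()
    for query in query_spectra:
        labels = {label for ref, label in reference if cosine(query, ref) >= threshold}
        if labels:
            matched += 1
            counts.update(labels)
    return matched, counts


# Toy data (invented): two food reference spectra and three query spectra.
reference = [
    ([(100.0, 1.0), (150.0, 0.5)], "apple"),
    ([(200.0, 1.0), (250.0, 0.8)], "cheese"),
]
queries = [
    [(100.0, 0.9), (150.0, 0.6)],
    [(200.0, 1.0), (250.0, 0.7)],
    [(300.0, 1.0)],
]

matched, counts = annotate(queries, reference)
print(matched, dict(counts))
```

In this toy case two of the three query spectra receive a source label even though no chemical structure is assigned to them, which is the sense in which source matching can use more of the MS/MS data than structural library matching alone.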
Data availability
The following files are available in addition to the Global FoodOmics mzXML files on https://massive.ucsd.edu under MSV000084900: metadata as a .txt file; an image repository with between one and six images per food item that was sampled; table of FDR-based parameters; full size PDF of sleep restriction and circadian misalignment study; food reference data molecular network (excerpts found in Fig. 1). A metadata dictionary can also be accessed here: https://docs.google.com/spreadsheets/d/1Ebn-TgMWEkd_7KOw9TCRvHGPsE7dGjVCr7dg28pwbmM/edit#gid=727944641. The accession numbers for the raw metabolomics data files are available in Supplementary Table 2. The GNPS-based molecular networking analysis jobs used in this study can be accessed online at the following links: sleep and circadian study (MSV000083759, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=e0bf255bcb2e492bb0be3be1a691b5fb, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=6fe434761daf4f9da540cf1fd90b3985, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=9a90bd12f51e453e968656e6458e0da4); centenarian (MSV000084591, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=8895b6e3445546c4a5bc3a726a920227, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=981c9a7d39f742bda296d52f856981e5); impact of diet on rheumatoid arthritis (MSV000084556, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=0794151fce2c4c18a7a0aa3a09140169); LP infant (MSV000083462, MSV000083463, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a7b222466ef844e69cdbd9835d2f6c39, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=c756a9dfb5c34a2a8655f88114edf0a8, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=4a322e640bb644068030949267fb4ea9); children with medical complexity (MSV000084610, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=df24423835a341969342c2086b46275a); American Gut (MSV000081981, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=4884483bcffe4f269819858c3fd4faef); fermented food consumption (MSV000081171, 
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=5cca39e0ebab4066a56e41ded48b4466); Malawi legume supplement (MSV000081486, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=93ba727aa9234727a73ae7860b2af3ca); Rotarix vaccine response (MSV000084218, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=08e9b9e048f04ac4b416e574a073e8e6); IBD_1 (MSV000082431; https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=ec08eed8f186430d893c63111409baf4); IBD_individual (MSV000079115, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=fad746939afd4184975a296436aebfb7); IBD_seed (MSV000082221, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=907f2e0b7878417dbdb4c83f0df0e83a); IBD_biobank (MSV000079777, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a79fbd4c96124209adfd0ef84cb56dec); IBD_2 (MSV000084775, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=07f855658c5342458045032ea70fc526); IBD_200 (MSV000084908, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=55bef02250d744eb97c6040c379cbfb4); Alzheimer’s disease (MSV000085256, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=aac78e9d23b84194ab2f768cb685c636); Alzheimer’s disease serum (MSV000086270, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=570aacf2244948c7afa590631de5d345); omnivore versus vegan (MSV000086989, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=74089e95b8df41b2af7c289869dc866f); COVID-19 (MSV000085505, MSV000085537, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=9cbcb6b46fe24826bc56c9e893d0bd2b); IBD_biopsy (MSV000082220, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=a83a279dad154f9ca7b549d40ce117ba); gout (MSV000084908, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=55bef02250d744eb97c6040c379cbfb4); adult saliva (MSV000083049, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=6dd6e5b1cf454d67b8a2b3c151c18f4a); legume supplementation (MSV000084663, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=93ba727aa9234727a73ae7860b2af3ca); tomato seedling (MSV000083353, 
https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=3b6020d7034045c39969631894ae4c22); food only (MSV000084900, https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=d5adba7f67cc402396e9ba7cd85ce52b). Networking parameters were set on the basis of the MOLECULAR-LIBRARYSEARCH-FDR workflow on GNPS with the following task IDs: GFOP3500, a7bf6cc3f91d466bab923f2268d6f4fc; sleep deprivation, b55ab4004ed342d7b4ed1c488e935998; sleep study, 78bbfed8574748d1a77dc7c2f1a44d39; sleep study_SSF_test, b55ab4004ed342d7b4ed1c488e935998; centenarian, 265a9553c69e47499cca3de056b43178; centenarian_SSF_test, 265a9553c69e47499cca3de056b43178; American gut, aee5dde3b2f84079a264e68ec981487e; fermented food consumption, a44d1b2e1b9d4612974d0b85021675a7; Malawi legume supplement, de7b55f8adaa4ad9b2a8430e30435bf3; children with medical complexity, f27243af071b43ab90d846bda959fc1c; Rotarix vaccine response, a2e02e3f97a54ca08e3866cc60f8d42b; impact of diet on rheumatoid arthritis, 62b8754e761549f3b94ffae83d7ab95a; LP infant, 532aba2ad3644fadba0e6e7ea063c7ee; IBD_1, bb10b1ce90a24f3a9cef1e85e88c3882; IBD_biopsy, c4cfda90933b4842a7154f5f2def139d; IBD_individual, 3ce8cc636ae944848b4ada322aaf12fe; IBD_seed, ebbb715fc605457ba5f7e910b79d6177; IBD_biobank, 9465c34cf5444e12b89318b1fb363714; IBD_2, 983fa9271136404fb5743b44a6a109f0; IBD_200, e5acf5726722486caa897b2b07d402e8; Alzheimer’s disease, 658103164325425981c097cecba840b0; Alzheimer’s disease serum, 67516099b37647f2a9c91f890366bef3; omnivore versus vegan, ba974d08cab04f77aaacdb7828baada6; gout, a478f419ae824378aa02e5e1b310cad2; adult saliva, 32980f95dbd5437aaa9e15d05c7246bb; LP infant, 8bfbdc1bf38c418fb223306cd42af897; LP infant, 3e414e13a4394bb78c07f7ca7f4d1be3; legume supplementation, 2ca007303b9c4bb3820f392b996eba27; COVID-19 Brazil, d16eb32276c84bdb9c35c5872e97a986; Tomato seedling, f1c9cd79e0e94c66a367b6816b149750.
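The FDR workflow jobs above are listed as bare task IDs rather than links. As a convenience, the sketch below builds status-page links for them using the same URL pattern as the molecular networking links in this statement. That the FDR tasks resolve under this same `status.jsp` endpoint is an assumption, and `task_url` and `fdr_tasks` are illustrative names; only two of the listed IDs are included.

```python
from urllib.parse import urlencode

# Status endpoint taken from the networking job links in this statement.
GNPS_STATUS = "https://gnps.ucsd.edu/ProteoSAFe/status.jsp"


def task_url(task_id):
    """Return the GNPS status-page URL for a task ID (assumed endpoint)."""
    return f"{GNPS_STATUS}?{urlencode({'task': task_id})}"


# A subset of the task IDs listed above, keyed by study name.
fdr_tasks = {
    "GFOP3500": "a7bf6cc3f91d466bab923f2268d6f4fc",
    "American gut": "aee5dde3b2f84079a264e68ec981487e",
}

for study, task_id in fdr_tasks.items():
    print(f"{study}: {task_url(task_id)}")
```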
Code availability
The code generated during this study is available at https://github.com/DorresteinLaboratory/GlobalFoodomics.
Change history
18 October 2023
A Correction to this paper has been published: https://doi.org/10.1038/s41587-023-02025-x
Acknowledgements
Funding sources: we thank the Crohn’s & Colitis foundation #675191, U19 AG063744 01, R01AG061066, 1 DP1 AT010885, P30 DK120515, Office of Naval Research MURI grant N00014-15-1-2809 and NIH/NCATS Colorado CTSA Grant UL1TR002535, the Emch Fund and C&D Fund. This work was also supported in part by the Chancellor’s Initiative in the Microbiome and Microbial Sciences and by Illumina through reagent donation and by Danone Nutricia Research in partnership with the Center for Microbiome Innovation at UCSD. We would like to thank E. Sayyari, D. S. Nguyen, E. Wolfe and K. Sanders for sample processing, and J. DeReus for data handling, processing, and maintaining the computational infrastructure. J.P.S. was supported by SD IRACDA (5K12GM068524-17), and in part by USDA-NIFA (2019-67013-29137) and the Einstein Institute GOLD project (R01MD011389). R.C. and M.G. were supported by the Krupp Endowed Fund; R.C. was also supported by a UCSD Rheumatic Diseases Research Training Grant from the NIH/NIAMS (T32AR064194). VA Research Service, NIH/NIAMS AR060772 and AR075990 to R.T., R.H.M. was supported through a UCSD training grant from the NIH/NIDDK Gastroenterology Training Program (T32 DK007202). The Brazilian National Council for Scientific and Technological Development (CNPq)-Brazil (245954/2012) to M.F.O. and FAPESP (2014/50265-3) to N.P.L. D.W. was supported by NIH/NHLBI Training Grant (NIH T32 HL149646). K.S. was supported by a PROMOS fund (DAAD). W.B. is a postdoctoral researcher of the Research Foundation–Flanders (FWO). R.J.D. was supported by NIH DP2 AT010401-01. We thank R. da Silva for his feedback and early bioinformatics analysis for the Global FoodOmics project. We further acknowledge all the individuals that contributed samples as well as companies and organizations that have donated samples: D. 
Vargas, Townshend’s Tea Company, BDK Kombucha, Oregonian Tonic, Squirrel & Crow, Venissimo cheese, Fermenter’s Club San Diego, Good Neighbor Gardens, Sprouts Farmers Market, Ralphs, Whole Foods, Julian Ciderworks and San Diego Zoo and Safari Park. Specifically thank you to A. Durant for coordinating sampling at Fermentation Festivals and the wonderful staff at San Diego Zoo Wildlife Alliance for coordinating and helping with sample collection: M. Gaffney, E. Galindo, K. Kerr, A. Fidgett, J. Stuart, D. Tanciatco, and L. Pospychala. NIST would like to acknowledge The Institute for the Advancement of Food and Nutrition Sciences (IAFNS) microbiome committee for providing support for the development of standardized fecal materials. Funding for the ADMC (Alzheimer’s Disease Metabolomics Consortium, led by Dr R.K.-D. at Duke University) was provided by the National Institute on Aging grants 1U01AG061359-01 and R01AG046171, a component of the Accelerating Medicines Partnership for AD (AMP-AD) Target Discovery and Preclinical Validation Project (https://www.nia.nih.gov/research/dn/ampad-target-discovery-and-preclinical-validation-project) and the National Institute on Aging grant RF1 AG0151550, a component of the M2OVE-AD Consortium (Molecular Mechanisms of the Vascular Etiology of AD Consortium, https://www.nia.nih.gov/news/decoding-molecular-tiesbetween-vascular-disease-and-alzheimer). Additional support was provided by the following NIA grants: (1RF1AG058942-01 and 3U01 AG024904-09S4). Data collection and sharing for the ADNI was supported by National Institutes of Health Grant U01 AG024904. 
ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie, Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private sector contributions are facilitated by the Foundation for the National Institutes of Health (www.fnih.org). The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for Neuro Imaging at the University of Southern California. UCSD Academic Senate Research/Bridge Grant. Eunice Kennedy Shriver National Institute of Child Health and Human Development K12-HD000850.
Author information
Contributions
P.C.D., R.K.D., R.J.D., and J.M.G. conceptualized the idea. M.J.M., M.B., M.P., F.D.O., K.C.W., C.M.A., E.B., K.S., P.C.D., R.J.D., R.K.D., N.C.S., A.D.S., K.D., G.A., D.M.D., N.P.L., M.B., and J.M.G. collected FoodOmics samples and performed metadata curation. M.J.M., M.P., F.D.O., F.V., C.M.A., E.B., N.C.S., and J.M.G. performed FoodOmics sample processing and MS data acquisition. A.J.J., P.B.F., E.D., Q.Z., D.N., D.M., J.P.S., and J.M.G. curated Global FoodOmics metadata to match FNDDS. K.E.R., J.B.W., B.S.B., B.J.B., R.C., M.G.D.B., M.M.D., E.O.E., D.G., L.H., J.H.K., M.M., C.M., R.K., K.E.S., D.V.R., T.I.K., C.W., K.P.W.J., M.F.O., R.H.M., D.W., R.T., J.G.A., P.S.D., M.G., D.J.G., A.K.J., B.J.B., R.M.S., K.C.W., A.D.S., F.V., N.P.L., P.K.P., S.M.D.S., S.L.S., C.M.J., N.J.L., K.A.L., S.A.J., R.K.D. and J.M.G. provided samples, comparative datasets, and/or detailed metadata. L.M.M.M. and T.M.C. performed COVID-19 patient and/or food sample preparation and analysis. P.L.J. was the physician responsible for the COVID-19 patients. R.D.R.O. was the physician responsible for collecting the plasma from COVID-19 patients. F.P.V. was responsible for tabulation of COVID-19 patient data. M.P., J.M.G., T.S., M.G.D.B., L.D.R.G., and G.H. prepared food samples. M.W. supported the GNPS computational infrastructure used in the study. C.L.W., W.B., A.K.J., K.A.W., E.S., A.T., N.P.L. and J.M.G. analyzed MS data. C.L.W., W.B., A.K.J., K.A.W., C.M., and J.M.G. generated figures. P.C.D., R.K., R.J.D., A.D.S., and J.M.G. supervised the work. P.C.D., R.K., C.L.W., K.A.W., W.B., and J.M.G. wrote the paper. All authors contributed feedback and edits to the manuscript.
Ethics declarations
Competing interests
B.S.B. has a research grant from Prometheus Biosciences and has received consulting fees from Pfizer. P.C.D. is on the scientific advisory board of Sirenas, Cybele Microbiome, Galileo, and founder and scientific advisor of Ometa Labs LLC and Enveda (with approval by UC San Diego). J.H.K. is a consultant for Medela and on the Board for Innara Health; he owns shares in Astarte Medical and Nicolette. M.G. has research grants from Pfizer and Novartis. P.S.D. has received research support and/or consulting from Takeda, Pfizer, Abbvie, Janssen, Prometheus, Buhlmann, Polymedco. R.J.D. is a consultant for and owns shares in Impossible Foods Inc., and is on the Scientific Advisory Panel of Boost Biomes. A.J.J. has received consulting fees from Abbott Nutrition and Corebiome. D.M. is a consultant for BiomeSense, Inc., has equity and receives income. The terms of these arrangements have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. D.G. is a consultant for Biogen, Fujirebio, vTv Therapeutics, Esai and Amprion and serves on a DSMB for Cognition Therapeutics. K.P.W. reports during the conduct of the study receiving research support from SomaLogic, Inc., consulting fees from or served as a paid member of scientific advisory boards for the Sleep Disorders Research Advisory Board–National Heart, Lung and Blood Institute, CurAegis Technologies, Philips, Inc., Circadian Therapeutics, Ltd. and Circadian Biotherapies Ltd. R.T. received a research grant from AstraZeneca Consulting, SOBI, Selecta, Horizon, Allena, AstraZeneca. A.D.S. and R.K. are directors at the Center for Microbiome Innovation at UC San Diego, which receives industry research funding for multiple microbiome initiatives, but no industry funding was provided for this project. R.K. is a scientific advisory board member, and consultant for BiomeSense, Inc., has equity and receives income. 
He is a scientific advisory board member and has equity in GenCirq. He is a consultant and scientific advisory board member for DayTwo, and receives income. He has equity in and acts as a consultant for Cybele. He is a co-founder of Biota, Inc., and has equity. He is a cofounder of Micronoma, and has equity and is a scientific advisory board member. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. M.W. is a co-founder of Ometa Labs LLC. K.D. is an inventor on a series of patents on the use of metabolomics for the diagnosis and treatment of central nervous system diseases and holds equity in Metabolon Inc., Chymia LLC and PsyProtix. The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Biotechnology thanks Elaine Holmes and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Figs. 1−3.
Supplementary Table 1
The metadata table for the foodomics project
Supplementary Table 2
Table of the study details of each of the 28 public projects
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Gauglitz, J.M., West, K.A., Bittremieux, W. et al. Enhancing untargeted metabolomics using metadata-based source annotation. Nat Biotechnol 40, 1774–1779 (2022). https://doi.org/10.1038/s41587-022-01368-1
DOI: https://doi.org/10.1038/s41587-022-01368-1