Abstract
Dietary intake is tightly coupled to gut microbiota composition, human metabolism and the incidence of virtually all major chronic diseases. Dietary and nutrient intake are usually assessed using self-reporting methods, including dietary questionnaires and food records, which suffer from reporting biases and require strong compliance from study participants. Here, we present Metagenomic Estimation of Dietary Intake (MEDI): a method for quantifying food-derived DNA in human faecal metagenomes. We show that DNA-containing food components can be reliably detected in stool-derived metagenomic data, even when present at low abundances (more than ten reads). We show how MEDI dietary intake profiles can be converted into detailed metabolic representations of nutrient intake. MEDI identifies the onset of solid food consumption in infants, shows significant agreement with food frequency questionnaire responses in an adult population and shows agreement with food and nutrient intake in two controlled-feeding studies. Finally, we identify specific dietary features associated with metabolic syndrome in a large clinical cohort without dietary records, providing a proof-of-concept for detailed tracking of individual-specific, health-relevant dietary patterns without the need for questionnaires.
Data availability
Data for specific food items are available at https://foodb.ca. Individual matched genomic assemblies can be downloaded from GenBank or the Nucleotide Database and are listed at https://github.com/Gibbons-Lab/medi-paper/blob/main/db/data/manifest.csv. Metagenomic sequencing data for the studied cohorts are available on the NCBI SRA under accession numbers PRJNA473126 (infants), PRJNA398089 (iHMP), PRJEB37249 (METACARDIS), PRJNA947193 (MBD) and PRJNA1198318 (PATH). Source data are provided with this paper.
Code availability
All intermediate data files, metadata and analysis code have been uploaded to GitHub (https://github.com/Gibbons-Lab/medi-paper). The MEDI software package is available on GitHub (https://github.com/Gibbons-Lab/medi).
Change history
24 March 2025
A Correction to this paper has been published: https://doi.org/10.1038/s42255-025-01284-z
Acknowledgements
Research reported in this publication was supported by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) of the NIH under award number R01DK133468 (to S.M.G.) and by a Global Grants for Gut Health Award from Nature Portfolio and Yakult (to S.M.G.). This research was funded in part by the Austrian Science Fund (FWF): grant Cluster of Excellence CoE7 (to C.D. and C.M.-E.) and SFB ImmunoMetabolism 10.55776/F8300 (to C.M.-E.). Computational resources for this work were provided by the MedBioNode High-Performance Computing cluster at the Medical University of Graz. H.D.H. acknowledges funding for the PATH study from the Foundation for Food and Agriculture Research (FFAR) New Innovator Award and Hass Avocado Board.
Author information
Contributions
C.D. and S.M.G. conceived of the study. C.D. wrote and tested the software. C.D., K.F. and S.M.G. performed analyses. H.D.H., K.D.C., C.M.-E. and S.M.G. provided datasets and resources. C.D. wrote the initial draft of the paper. C.D. and S.M.G. provided supervision. All authors contributed to writing and revising the paper.
Ethics declarations
Competing interests
The authors report no financial or non-financial competing interests relevant to the work presented in this paper. S.M.G. received funding from a Global Grants for Gut Health Award from Nature Portfolio and Yakult. However, the funders were not involved in conducting the research, drafting the paper or reviewing the work.
Peer review
Peer review information
Nature Metabolism thanks Lars Dragsted and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editor: Yanina-Yasmin Pesch, in collaboration with the Nature Metabolism team.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 MEDI benchmarks.
(a) Genomic distance (1 - ANI) vs. macronutrient distance (Euclidean, in g/100 g). The blue line denotes a smooth spline regression and the shaded area denotes the 95% confidence interval of the mean spline regression. (b) Benchmark of cached and batched processing using MEDI (6 CPUs per process, see Methods). 888 samples were divided into two batches of 500 and 388 FASTQ files and processed separately in parallel. Each point denotes a single FASTQ file and colors denote the batch. The vertical line denotes the median classification rate. (c) Relationship between (haploid) genome/assembly size and food abundance in the iHMP data set. Only genomes/assemblies with at least 1 million base pairs are shown.
Extended Data Fig. 2 Foods and nutrients in controlled feeding studies.
(a) Food abundances in the MBD cohort by diet group (n = 30). Boxplots show 25%, 50%, and 75% quantiles. The center denotes the median and whiskers extend to the smallest and largest data points within 1.5 interquartile ranges. (b) Correlation between MEDI estimates and ground truth for varying offsets between fecal samples and food diary entries. (c) MEDI predictions of total fiber content from fecal DNA (y-axis) and nutrient consumption of sugars, fibers and grains obtained from food diaries (x-axis) in a controlled-feeding study (PATH), where the dietary intake recorded in the daily food record precedes the stool sample by at least 48 h. Each point denotes a single individual. For the food diaries, points represent means over all measured intake amounts and error bars denote the standard error of the mean (sd/sqrt(n)), normalized to a 100 g portion (all samples within the offset, 38 individuals with 124 food record diary entries). For the MEDI data, the x-coordinate of each point represents a point estimate of intake obtained by weighting the nutrient profiles of food items by food item relative abundance and assuming a 100 g portion. Blue lines denote regression slopes and gray areas represent 95% confidence intervals. Annotations denote the correlation coefficient (r) and p-value (p) from a Pearson product-moment correlation test.
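The abundance-weighted point estimate described in this caption can be sketched roughly as follows. This is a minimal illustration and not the MEDI implementation: the function name, the dictionary layout and the example food and nutrient values are all hypothetical, and the only assumption taken from the caption is that per-100 g nutrient profiles are averaged with weights given by food relative abundance.

```python
def weighted_nutrient_estimate(abundances, nutrient_profiles):
    """Estimate nutrient content per 100 g portion (hypothetical helper).

    abundances: dict mapping food name -> relative abundance (any positive scale).
    nutrient_profiles: dict mapping food name -> {nutrient: amount per 100 g}.
    """
    total = sum(abundances.values())
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")

    estimate = {}
    for food, abundance in abundances.items():
        weight = abundance / total  # renormalize so that weights sum to 1
        for nutrient, amount in nutrient_profiles[food].items():
            estimate[nutrient] = estimate.get(nutrient, 0.0) + weight * amount
    return estimate


# Illustrative values only (not taken from the paper or from FOODB).
example_abundances = {"wheat": 0.6, "chicken": 0.4}
example_profiles = {
    "wheat": {"fiber_g": 10.0, "protein_g": 12.0},
    "chicken": {"fiber_g": 0.0, "protein_g": 25.0},
}
example_estimate = weighted_nutrient_estimate(example_abundances, example_profiles)
```

Renormalizing the weights keeps the estimate on a per-100 g scale even when the supplied abundances do not sum to one, for example after filtering out low-abundance foods.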
Extended Data Fig. 3 Non-food reads in infant samples.
Relative abundance of bacterial and human reads across infant time series, colored by delivery route. Lines denote a smooth spline regression and shaded areas denote the 95% confidence interval of the spline regression.
Extended Data Fig. 4 MEDI dietary intake estimates were associated with metabolic health.
Abundances per 100 g portion for 1703 compounds across a cohort of 533 metabolically healthy and unhealthy individuals from the METACARDIS cohort. Fill colors denote abundance per standard portion (mg/100 g). Column annotations denote metabolic health status from the original METACARDIS cohort (HC - healthy cohort, MMC - IHD metabolically matched cohort, UMMC - untreated metabolically matched cohort). Here, MMC and UMMC denote disease-free but metabolically unhealthy groups. Row annotations denote the monomer mass of the compound (in g/mol).
Extended Data Fig. 5 Curation of FOODB data.
(a) Original energy content (x-axis) vs. energy content calculated by the Atwater method based on macronutrient content (Pearson r = 0.94, two-sided product-moment correlation test p < 2.2e-16). Colors denote detailed unique preparation types in the FOODB. (b) Cholesterol abundances across foods in the FOODB before adjustment.
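For readers unfamiliar with the energy calculation in panel (a), the following sketch shows the general Atwater approach of converting macronutrient content into energy. It assumes the widely used general factors of 4 kcal/g for protein, 4 kcal/g for carbohydrate and 9 kcal/g for fat; whether the authors used exactly these factors, or additional terms such as fiber or alcohol, is not stated in this caption, and the function and key names are hypothetical.

```python
# General Atwater factors in kcal per gram (assumed here, not confirmed by the caption).
ATWATER_KCAL_PER_G = {"protein": 4.0, "carbohydrate": 4.0, "fat": 9.0}


def atwater_energy_kcal(macros_g_per_100g):
    """Return energy in kcal per 100 g from macronutrient grams per 100 g.

    Macronutrients missing from the input are treated as 0 g.
    """
    return sum(
        factor * macros_g_per_100g.get(name, 0.0)
        for name, factor in ATWATER_KCAL_PER_G.items()
    )


# Illustrative food with 10 g protein, 20 g carbohydrate and 5 g fat per 100 g.
example_energy = atwater_energy_kcal({"protein": 10.0, "carbohydrate": 20.0, "fat": 5.0})
```

Comparing such a calculated value with the energy content reported in a database is one way to flag entries whose macronutrient and energy fields are inconsistent.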
Extended Data Fig. 6 Hibiscus associations.
Significant associations between food frequency questionnaires (FFQs) and Hibiscus genus abundance in the iHMP cohort (see Methods, n = 361). Associations were run for all 19 FFQ questions. Circles denote the mean and error bars denote the standard deviation. p[lm] indicates the ANOVA p-value of a regression of log-transformed relative abundances and p[logit] denotes the p-value of a logistic regression of food occurrence against food frequency strata. Axis labels are common across all plots within this panel. Only food groups with a Bonferroni-adjusted p[lm] < 0.05 are shown.
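The Bonferroni adjustment mentioned in this caption multiplies each p-value by the number of tests and caps the result at 1. A minimal sketch, assuming one test per FFQ question (19 in total) and using made-up p-values rather than results from the paper:

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical p-values for 19 tests: one small value and 18 unremarkable ones.
example_p_values = [0.001] + [0.5] * 18
example_adjusted = bonferroni_adjust(example_p_values)

# Indices of the tests that remain below 0.05 after adjustment.
example_significant = [i for i, p in enumerate(example_adjusted) if p < 0.05]
```

With 19 tests, a raw p-value must fall below roughly 0.05 / 19 ≈ 0.0026 to pass the adjusted threshold.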
Supplementary information
Tables 1 and 2
Table 1. Summary of metabolites in FOODB. Includes source type (nutrient or compound), monomer mass, abundance statistics and the number of foods in which the metabolite was measured. Table 2. Cohort characteristics.
Source data
Source Data Fig. 1
Raw data points.
Source Data Fig. 2
Raw data points.
Source Data Fig. 3
Raw data points.
Source Data Fig. 4
Raw data points.
Source Data Fig. 5
Raw data points.
Source Data Extended Data Fig. 1
Raw data points.
Source Data Extended Data Fig. 2
Raw data points.
Source Data Extended Data Fig. 3
Raw data points.
Source Data Extended Data Fig. 4
Raw data points.
Source Data Extended Data Fig. 5
Raw data points.
Source Data Extended Data Fig. 6
Raw data points.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Diener, C., Holscher, H.D., Filek, K. et al. Metagenomic estimation of dietary intake from human stool. Nat Metab 7, 617–630 (2025). https://doi.org/10.1038/s42255-025-01220-1
DOI: https://doi.org/10.1038/s42255-025-01220-1