npj Science of Food
A unified knowledge graph linking foodomics to chemical-disease networks and flavor profiles
  • Article
  • Open access
  • Published: 20 January 2026

  • Fangzhou Li1,2,3,
  • Jason Youn1,2,3,
  • Kaichi Xie1,3,
  • Trevor Chan1,2,3,
  • Pranav Gupta1,2,3,
  • Arielle Yoo2,3,
  • Michael Gunning1,2,3,
  • Keer Ni1,2,3 &
  • …
  • Ilias Tagkopoulos1,2,3 

npj Science of Food (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Biochemistry
  • Chemistry
  • Computational biology and bioinformatics

Abstract

Modern nutrition science still lacks a comprehensive, machine-readable map linking diet to molecular composition and biological effects. Here we present FoodAtlas, a large-scale knowledge graph that links 1430 foods to 3610 chemicals, 2181 diseases, and 958 flavor descriptors through 96,981 provenance-tracked edges. A transformer-based text-mining pipeline extracted 48,474 quantitative food–chemical associations from 125,723 literature sentences (F1 = 0.67) and integrated them with 23,211 chemical–disease assertions from the Comparative Toxicogenomics Database, 15,222 chemical-bioactivity records from ChEMBL, 3645 flavor annotations from FlavorDB and PubChem, and 6429 taxonomic relationships. Graph embeddings revealed six dietary modules whose signature metabolites delineate distinct, multisystem disease-risk trajectories. Models built on FoodAtlas demonstrate practical utility: a bioactivity predictor achieved strong correlation with antioxidant assays (R² = 0.52; ρ = 0.72), and a substitution engine reduced simulated total disease risk by 11.9%.
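The edge counts listed in the abstract can be cross-checked against the stated total. The sketch below is only an illustration and is not part of FoodAtlas: the dictionary keys are hypothetical labels chosen here, and only the numbers come from the abstract.

```python
# Edge counts by source, as reported in the abstract.
# The key names are illustrative labels, not FoodAtlas identifiers.
EDGE_COUNTS = {
    "food_chemical_literature": 48_474,
    "chemical_disease_ctd": 23_211,
    "chemical_bioactivity_chembl": 15_222,
    "flavor_flavordb_pubchem": 3_645,
    "taxonomic": 6_429,
}

# Total number of provenance-tracked edges stated in the abstract.
REPORTED_TOTAL = 96_981


def total_edges(counts: dict[str, int]) -> int:
    """Sum the per-source edge counts."""
    return sum(counts.values())


if __name__ == "__main__":
    total = total_edges(EDGE_COUNTS)
    print(total, total == REPORTED_TOTAL)
```

The five components add up to the reported 96,981 edges, so the abstract's breakdown is internally consistent.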

Data availability

All data are available at https://github.com/IBPA/FoodAtlas-KGv2 and https://foodatlas.ai. FoodAtlas may use data that are subject to restrictive licenses for different uses; please consult the individual license terms of each data source. CTD compliance: FoodAtlas integrates curated knowledge from the Comparative Toxicogenomics Database (CTD) by linking to CTD chemical/disease identifiers and PubMed references; we do not redistribute CTD-curated interaction tables in our public releases. Users can reproduce the CTD integration locally using our scripts, which download CTD data from the source and join via CTD IDs and PMIDs. Non-commercial use is free; commercial reuse requires a license from CTD’s licensing agent. Please consult CTD’s terms before downloading or using CTD content.
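A local join on CTD identifiers and PMIDs, of the kind described above, might look like the following sketch. It is not the authors' script: the field names (`chemical_id`, `disease_id`, `pmid`, `evidence`), the placeholder identifiers, and the function name `join_on_id_and_pmid` are all assumptions made for illustration, and the actual column names in CTD downloads and in the FoodAtlas repository may differ.

```python
from collections import defaultdict


def join_on_id_and_pmid(links, ctd_rows):
    """Attach locally downloaded CTD rows to link records that share the
    same (chemical_id, disease_id, pmid) key. Unmatched links are dropped."""
    # Index the CTD rows by the composite key for constant-time lookup.
    index = defaultdict(list)
    for row in ctd_rows:
        index[(row["chemical_id"], row["disease_id"], row["pmid"])].append(row)

    joined = []
    for link in links:
        key = (link["chemical_id"], link["disease_id"], link["pmid"])
        for row in index.get(key, []):
            # Merge the link record with the matching CTD row.
            joined.append({**link, **row})
    return joined


# Hypothetical toy records; the identifiers are placeholders, not real CTD data.
links = [
    {"chemical_id": "CHEM:1", "disease_id": "DIS:9", "pmid": "111"},
    {"chemical_id": "CHEM:2", "disease_id": "DIS:9", "pmid": "222"},
]
ctd_rows = [
    {"chemical_id": "CHEM:1", "disease_id": "DIS:9", "pmid": "111",
     "evidence": "placeholder"},
]

if __name__ == "__main__":
    for record in join_on_id_and_pmid(links, ctd_rows):
        print(record)
```

Keeping only identifiers and PMIDs in the public release and re-attaching the curated fields locally is what allows the integration to be reproduced without redistributing CTD tables.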

Code availability

All code and instructions on how to reproduce the results for knowledge graph construction and information extraction can be found at https://github.com/IBPA/FoodAtlas-KGv2 and https://github.com/IBPA/Lit2KG, respectively.

Acknowledgements

This work was supported by the USDA-NIFA AI Institute for Next Generation Food Systems (AIFS), USDA-NIFA award number 2020-67021-32855.

Author information

Authors and Affiliations

  1. Department of Computer Science, the University of California at Davis, Davis, CA, USA

    Fangzhou Li, Jason Youn, Kaichi Xie, Trevor Chan, Pranav Gupta, Michael Gunning, Keer Ni & Ilias Tagkopoulos

  2. Genome Center, the University of California at Davis, Davis, CA, USA

    Fangzhou Li, Jason Youn, Trevor Chan, Pranav Gupta, Arielle Yoo, Michael Gunning, Keer Ni & Ilias Tagkopoulos

  3. USDA/NSF AI Institute for Next Generation Food Systems (AIFS), Davis, CA, USA

    Fangzhou Li, Jason Youn, Kaichi Xie, Trevor Chan, Pranav Gupta, Arielle Yoo, Michael Gunning, Keer Ni & Ilias Tagkopoulos

Contributions

F.L. and J.Y. created the knowledge graph and built the sentence filtration and sentence extraction pipeline. K.X. annotated the information extraction dataset and fine-tuned GPT-3.5 on the data. T.C. performed analysis for results, generated figures, and wrote the paper. P.G. built the bioactivity prediction model. A.Y. integrated disease data into FoodAtlas. M.G. integrated flavor data into FoodAtlas. K.N. helped K.X. with annotation. I.T. conceived and supervised all aspects of the project, including project management, framework design, pipeline architecture, predictive model, data analysis, and hypothesis generation and testing. All authors contributed to writing the manuscript.

Corresponding author

Correspondence to Ilias Tagkopoulos.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Li, F., Youn, J., Xie, K. et al. A unified knowledge graph linking foodomics to chemical-disease networks and flavor profiles. npj Sci Food (2026). https://doi.org/10.1038/s41538-025-00680-9

  • Received: 17 July 2025

  • Accepted: 17 December 2025

  • Published: 20 January 2026

  • DOI: https://doi.org/10.1038/s41538-025-00680-9
