npj Science of Food
Tracing origin and cultivation practice of Lithocarpus litseifolius via multi-data fusion and machine learning approaches
  • Article
  • Open access
  • Published: 13 February 2026

  • Yifan Tang1,2,
  • Ping Yu3,4,5,
  • Feng Xiong3,5,
  • Zhilai Zhan3,5,
  • Kai Xie6,
  • Shuyan Yu1,
  • Yifan Ning1,
  • Zhanhan Zhou1,
  • Chun Wang1,
  • Weisen Qian1,
  • Xiwen Zhang1,
  • Yike Liang1,
  • Ruijiao Wang1,
  • Guoxia Han1 &
  • Jian Yang3,5 

npj Science of Food (2026)

We are providing an unedited version of this manuscript to give early access to its findings. Before final publication, the manuscript will undergo further editing. Please note there may be errors present which affect the content, and all legal disclaimers apply.

Subjects

  • Biochemistry
  • Plant sciences

Abstract

Lithocarpus litseifolius (sweet tea) is a medicinal and edible plant rich in flavonoids and essential nutrients, with potential as a hepatoprotective beverage and natural sweetener. Although it is widely cultivated across several provinces in China, the quality and consistency of its raw material remain poorly regulated. To address this, 163 samples (n ≥ 18) from 7 main producing regions were analyzed for 22 functional compounds, 4 stable isotope ratios, and 49 elements to discriminate cultivation practices and geographical origins. Orthogonal partial least squares discriminant analysis (OPLS-DA) successfully generated prediction models across two cultivation regions. Integrating 8 machine-learning algorithms with multi-level data fusion identified 6 key variables: caffeine, Rb, Ce, δ¹⁵N, Sr, and 3″-O-acetylphlorizin. Five base learners built on these variables were then combined via soft-voting ensemble learning, yielding an optimal origin classifier with 100.00% accuracy. Additionally, the study delivered the first comprehensive analysis of quality variation in sweet tea and identified seven primary environmental factors that influence it, offering insights into cultivation strategies and quality formation mechanisms.
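
The soft-voting step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not name the five base learners or the software used, so the functions `soft_vote` and `predict`, the class labels "A" and "B", and all probabilities below are hypothetical. The sketch only shows the general idea of soft voting, in which each base learner outputs class probabilities, the probabilities are averaged (optionally with weights), and the class with the highest mean is chosen.

```python
from typing import Dict, Optional, Sequence


def soft_vote(
    learner_probs: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Return the weighted mean class probabilities across base learners."""
    if not learner_probs:
        raise ValueError("need at least one base learner")
    if weights is None:
        weights = [1.0] * len(learner_probs)  # equal weights by default
    if len(weights) != len(learner_probs):
        raise ValueError("one weight per base learner is required")
    total = float(sum(weights))
    # Collect every class label that any learner reports.
    classes = sorted(set().union(*(p.keys() for p in learner_probs)))
    return {
        c: sum(w * p.get(c, 0.0) for w, p in zip(weights, learner_probs)) / total
        for c in classes
    }


def predict(
    learner_probs: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the class with the highest averaged probability."""
    averaged = soft_vote(learner_probs, weights)
    return max(averaged, key=averaged.get)


# Hypothetical outputs of five base learners for a single sample.
# "A" and "B" are placeholder origin labels, not regions from the paper.
sample = [
    {"A": 0.90, "B": 0.10},
    {"A": 0.90, "B": 0.10},
    {"A": 0.45, "B": 0.55},
    {"A": 0.45, "B": 0.55},
    {"A": 0.45, "B": 0.55},
]

print(soft_vote(sample))
print(predict(sample))
```

In this made-up sample, three of the five learners narrowly prefer "B", so a majority (hard) vote would return "B", while the averaged probabilities favour "A" because two learners are highly confident. That sensitivity to confidence is the usual motivation for soft voting over hard voting.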

Data availability

Data will be made available on request.

Code availability

The code used to generate the results in this study is available on reasonable request from the corresponding author.

Acknowledgements

The authors would like to thank all colleagues and institutions that contributed to this work. This study was supported by the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (Grant No. JBKY2025A06), the Innovation Team and Talents Cultivation Program of National Administration of Traditional Chinese Medicine (Grant No. ZYYZDXK-2023244), the Key Project at the Central Government Level: Capacity Building for the Sustainable Utilization of Valuable Chinese Medicine Resources (Grant No. 2060302), the China Agriculture Research System of MOF and MARA (Grant No. CARS-21), the National Key R&D Program of China (Grant No. 2020YFC1712700), and the Co-development Agreement on the Bioactivity Research of Flavonoids from Lithocarpus litseifolius (RDS10120230053).

Author information

Author notes
  1. These authors contributed equally: Yifan Tang, Ping Yu.

Authors and Affiliations

  1. Academy of Pharmacy, Xi’an Jiaotong-Liverpool University, Suzhou, China

    Yifan Tang, Shuyan Yu, Yifan Ning, Zhanhan Zhou, Chun Wang, Weisen Qian, Xiwen Zhang, Yike Liang, Ruijiao Wang & Guoxia Han

  2. Laboratory for Breeding and Processing of Lithocarpus litseifolius, Liangtian Biotech Jiangsu Co. Ltd, Changzhou, China

    Yifan Tang

  3. State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, National Resource Center for Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, PR China

    Ping Yu, Feng Xiong, Zhilai Zhan & Jian Yang

  4. State Key Laboratory for Quality Ensurance and Sustainable Use of Dao-di Herbs, Institute of Chinese Materia Medica, China Academy of Chinese Medical Sciences, Beijing, PR China

    Ping Yu

  5. Key Laboratory of Biology and Cultivation of Herb Medicine, Ministry of Agriculture and Rural Affairs, Beijing, PR China

    Ping Yu, Feng Xiong, Zhilai Zhan & Jian Yang

  6. Department of Thoracic and Cardiovascular Surgery, The Fourth Affiliated Hospital of Soochow University, Suzhou Dushu Lake Hospital, Medical Center of Soochow University, Suzhou, China

    Kai Xie

Contributions

Y.T.: Conceptualization, data curation, formal analysis, project administration, software, visualization, writing original draft, and writing—review & editing. P.Y.: Conceptualization, formal analysis, project administration, methodology, supervision, software, visualization, writing—review & editing. F.X.: formal analysis, supervision. Z.Z.: formal analysis, supervision, writing—review & editing. K.X.: Investigation, resources. S.Y.: Investigation, software. Y.N.: Investigation, data curation. Z.Z.: Investigation, resources. C.W.: Investigation, resources. W.Q.: Investigation. X.Z.: Investigation. Y.L.: Investigation. R.W.: Investigation. G.H.: Supervision, writing—review & editing. J.Y.: Supervision, funding acquisition, project administration, and writing—review & editing.

Corresponding authors

Correspondence to Yifan Tang, Guoxia Han or Jian Yang.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Tang, Y., Yu, P., Xiong, F. et al. Tracing origin and cultivation practice of Lithocarpus litseifolius via multi-data fusion and machine learning approaches. npj Sci Food (2026). https://doi.org/10.1038/s41538-026-00748-0

  • Received: 05 November 2025

  • Accepted: 02 February 2026

  • Published: 13 February 2026

  • DOI: https://doi.org/10.1038/s41538-026-00748-0

npj Science of Food (npj Sci Food)

ISSN 2396-8370 (online)
