Abstract
Lithocarpus litseifolius (sweet tea) is a medicinal and edible plant rich in flavonoids and essential nutrients, with potential as a hepatoprotective beverage and natural sweetener. Although it is widely cultivated across several provinces in China, the quality and consistency of its raw material remain poorly regulated. To address this, 163 samples (n ≥ 18) from 7 main producing regions were analyzed for 22 functional compounds, 4 stable isotope ratios, and 49 elements to discriminate cultivation practices and geographical origins. Orthogonal partial least squares discriminant analysis (OPLS-DA) successfully generated prediction models across two cultivation regions. Integrating 8 machine-learning algorithms with multi-level data fusion identified 6 key variables: caffeine, Rb, Ce, δ¹⁵N, Sr, and 3″-O-acetylphlorizin. Five base learners built on these variables were then combined via soft-voting ensemble learning, yielding an optimal origin classifier with 100.00% accuracy. Additionally, the study delivered the first comprehensive analysis of quality variations in sweet tea and identified seven primary influencing environmental factors, offering insights into cultivation strategies and quality formation mechanisms.
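For readers unfamiliar with soft voting, the combination step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `soft_vote` and `predict`, the region labels, and the probability values are all hypothetical and chosen only for illustration. Soft voting averages the class probabilities reported by each base learner and selects the class with the highest mean, rather than counting one vote per learner.

```python
from typing import Dict, Optional, Sequence


def soft_vote(
    probabilities: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Return the (optionally weighted) mean class probability over base learners."""
    if not probabilities:
        raise ValueError("at least one base learner is required")
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weight per learner
    if len(weights) != len(probabilities):
        raise ValueError("one weight per base learner is required")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Collect every class label seen by any learner.
    classes = sorted(set().union(*probabilities))
    return {
        c: sum(w * p.get(c, 0.0) for w, p in zip(weights, probabilities)) / total
        for c in classes
    }


def predict(
    probabilities: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> str:
    """Pick the class with the highest averaged probability."""
    averaged = soft_vote(probabilities, weights)
    return max(averaged, key=averaged.get)


# Hypothetical outputs of five base learners for one sample (illustrative values only).
learner_outputs = [
    {"region_A": 0.95, "region_B": 0.05},
    {"region_A": 0.90, "region_B": 0.10},
    {"region_A": 0.45, "region_B": 0.55},
    {"region_A": 0.45, "region_B": 0.55},
    {"region_A": 0.45, "region_B": 0.55},
]

label = predict(learner_outputs)
```

In this made-up case three of the five learners narrowly favour `region_B`, so a majority (hard) vote would choose it, whereas the averaged probabilities favour `region_A` because two learners are far more confident. That sensitivity to confidence is the usual reason for preferring soft voting when the base learners produce usable probability estimates.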
Data availability
Data will be made available on request.
Code availability
The code used to generate the results in this study is available on reasonable request from the corresponding author.
Acknowledgements
The authors would like to thank all colleagues and institutions that contributed to this work. This study was supported by the Scientific and Technological Innovation Project of China Academy of Chinese Medical Sciences (Grant No. JBKY2025A06), the Innovation Team and Talents Cultivation Program of National Administration of Traditional Chinese Medicine (Grant No. ZYYZDXK-2023244), the Key Project at the Central Government Level: Capacity Building for the Sustainable Utilization of Valuable Chinese Medicine Resources (Grant No. 2060302), the China Agriculture Research System of MOF and MARA (Grant No. CARS-21), the National Key R&D Program of China (Grant No. 2020YFC1712700), and the Co-development Agreement on the Bioactivity Research of Flavonoids from Lithocarpus litseifolius (RDS10120230053).
Author information
Contributions
Y.T.: Conceptualization, data curation, formal analysis, project administration, software, visualization, writing – original draft, and writing – review & editing. P.Y.: Conceptualization, formal analysis, project administration, methodology, supervision, software, visualization, writing – review & editing. F.X.: Formal analysis, supervision. Z.Z.: Formal analysis, supervision, writing – review & editing. K.X.: Investigation, resources. S.Y.: Investigation, software. Y.N.: Investigation, data curation. Z.Z.: Investigation, resources. C.W.: Investigation, resources. W.Q.: Investigation. X.Z.: Investigation. Y.L.: Investigation. R.W.: Investigation. G.H.: Supervision, writing – review & editing. J.Y.: Supervision, funding acquisition, project administration, and writing – review & editing.
Ethics declarations
Competing interests
The authors declare no competing interests.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Tang, Y., Yu, P., Xiong, F. et al. Tracing origin and cultivation practice of Lithocarpus litseifolius via multi-data fusion and machine learning approaches. npj Sci Food (2026). https://doi.org/10.1038/s41538-026-00748-0